Outside-in computations go beyond the "normal" ADP concept.
Unfortunately, ADP requires a single contiguous sub-word, for which intermediate results
are computed. To make the two sub-words (0, i) and (j, n) contiguous (0 < i < j < n), we apply the following "trick": we
duplicate the input sequence and separate both copies with a special character, e.g.
original input ccaaagg
becomes ccaaagg+ccaaagg
Since everything is duplicated, sub-word (0, i) of the leading half is identical to sub-word (n + 1, n + 1 + i) of the second copy.
Concatenated with the trailing half (j, n) and the separator character (n, n + 1), we get
the contiguous sub-word (j, n)+(n + 1, n + 1 + i), i.e. (j, n + 1 + i), to which we can apply any ADP
computation.
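The index arithmetic above can be sketched in a few lines of C++. This is only an illustration under stated assumptions, not the code from "outside.hh": the helper names duplicateInput, outsideToContiguous and subword are hypothetical, and a sub-word (a, b) is assumed to cover the characters from position a up to, but not including, position b.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper: duplicate the input and join both copies with a separator.
std::string duplicateInput(const std::string& input, char separator = '+') {
    return input + separator + input;
}

// Hypothetical helper: map the outside pair (0, i), (j, n) of an input of
// length n to the contiguous sub-word (j, n + 1 + i) of the duplicated string.
std::pair<std::size_t, std::size_t> outsideToContiguous(std::size_t i,
                                                        std::size_t j,
                                                        std::size_t n) {
    assert(0 < i && i < j && j < n);  // the text assumes 0 < i < j < n
    return {j, n + 1 + i};
}

// Extract sub-word (begin, end), read as the half-open range [begin, end).
std::string subword(const std::string& s,
                    std::pair<std::size_t, std::size_t> borders) {
    return s.substr(borders.first, borders.second - borders.first);
}
```

For the input ccaaagg (n = 7) with i = 2 and j = 5, the outside parts are cc and gg, and the contiguous sub-word (5, 10) of ccaaagg+ccaaagg is gg+cc.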
The file "outside.hh" contains functions to realize the outside "trick":
containsBase checks whether a symbol is the input delimiter character.
collfilter2 requires a sub-word of the duplicated string to have the size of the original input sequence.
shiftIndex maps the border of a sub-word of the duplicated string back to the original input sequence.
rules is a special data type for TDM generation. It holds all rules of a grammar in a unique way, i.e. even if the generator adds the rules for a simple hairpin twice, they will appear only once in the generated grammar. This is accomplished by using a two-level hash: the first level is for non-terminals, the second level for the alternatives of a non-terminal.
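The two-level hash idea can be sketched as follows. This is a minimal illustration, not the actual data type: the struct RuleStore, its members, and the placeholder rule strings used with it are assumptions made for the example.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <unordered_set>

// Hypothetical stand-in for the rule store: the first level is keyed by
// non-terminal, the second level holds the alternatives of that non-terminal.
struct RuleStore {
    std::unordered_map<std::string, std::unordered_set<std::string>> productions;

    // Adding the same alternative twice leaves the store unchanged,
    // because the inner set keeps each element only once.
    void add(const std::string& nonterminal, const std::string& alternative) {
        productions[nonterminal].insert(alternative);
    }

    // Number of distinct alternatives stored for a non-terminal (0 if unknown).
    std::size_t alternatives(const std::string& nonterminal) const {
        auto it = productions.find(nonterminal);
        return it == productions.end() ? 0 : it->second.size();
    }
};
```

Calling add twice with the same non-terminal and the same alternative yields a single stored alternative, which mirrors the behaviour described for the simple hairpin rules.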
For TDMs the rules must be indexed. We do so by using sub-shape strings, but have to convert [ to L and ] to J characters to fit GAP-L's requirements for valid non-terminal identifiers. Besides the two-level hash, the sub-shape string is the second component of the rules data type. A special situation occurs when concatenating two level 1 sub-shapes: their interface might hold unpaired bases (_) on both sides, but of course the concatenation must contain _ and not __ at this position. Ensuring this is the task of the function appendShape.
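Both string operations can be sketched in C++ as below. The functions toIdentifier and appendShapeSketch are hypothetical illustrations of the described behaviour and are not the real appendShape; in particular, the sketch assumes that only a single _ can occur at each end of a sub-shape string.

```cpp
#include <string>

// Hypothetical helper: turn a sub-shape string into an identifier-safe name
// by replacing '[' with 'L' and ']' with 'J'; other characters are kept.
std::string toIdentifier(std::string shape) {
    for (char& c : shape) {
        if (c == '[') c = 'L';
        else if (c == ']') c = 'J';
    }
    return shape;
}

// Hypothetical sketch of the concatenation rule: if the left shape ends with
// '_' and the right shape starts with '_', keep only one of the two.
std::string appendShapeSketch(const std::string& left, const std::string& right) {
    if (!left.empty() && !right.empty() &&
        left.back() == '_' && right.front() == '_') {
        return left + right.substr(1);
    }
    return left + right;
}
```

With these helpers, concatenating []_ and _[] gives []_[] instead of []__[], and the sub-shape string [_[]] is converted to L_LJJ.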
Please have a look at the application RapidShapes to see TDMs in action.