Sequence Alignment
The ScleraSQL extension ALIGN enables you to align rows in two sequences, say two stock tickers, so as to minimize the total "distance" between the aligned rows. The distance measure is arbitrary and is specified as part of the ALIGN clause.
We can think of ALIGN as a type of JOIN for ordered sequences. In a relational join, any row can join with any other row as long as the join condition is met; ALIGN adds an ordering constraint, in that the rows must be aligned in order.
We can further constrain ALIGN by specifying a margin: the maximum number of skipped rows between two aligning rows. A margin of m forces the row at position i in one input to be aligned only with rows between positions i-m and i+m in the other input.
The alignment is done using a technique called Dynamic Time Warping.
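To make the idea concrete, here is a minimal Python sketch of Dynamic Time Warping with an optional margin. It is an illustration of the general technique under stated assumptions, not Sclera's implementation: the function name dtw_align, its parameters, and the choice to return index pairs are all hypothetical, and positions are 0-based in the result.

```python
import math


def dtw_align(left, right, distance, margin=None):
    """Align two ordered sequences with classic Dynamic Time Warping.

    Hypothetical helper for illustration only.
    Returns (total_distance, pairs), where pairs is an ordered list of
    0-based (i, j) positions: left[i] is aligned with right[j].
    If margin is given, only pairs with |i - j| <= margin are allowed.
    """
    n, m = len(left), len(right)
    # cost[i][j] = minimal total distance aligning left[:i] with right[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if margin is not None and abs(i - j) > margin:
                continue  # outside the margin: this pair is not allowed
            best_prev = min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])
            cost[i][j] = distance(left[i - 1], right[j - 1]) + best_prev
    if math.isinf(cost[n][m]):
        raise ValueError("no alignment satisfies the margin")

    # Walk back from the end, always moving to the cheapest predecessor
    # (the diagonal move wins ties because it is listed first).
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        pairs.append((i - 1, j - 1))
        steps = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(steps, key=lambda step: step[0])
    pairs.reverse()
    return cost[n][m], pairs


# Example: two short price series, distance = absolute difference.
total, pairs = dtw_align([1, 2, 3], [1, 2, 2, 3], lambda a, b: abs(a - b), margin=1)
print(total, pairs)
```

In this sketch the pairs always move forward in both inputs, which reflects the ordering constraint described above; the margin check simply rules out pairs whose positions are too far apart.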
The syntax is similar to the SQL JOIN syntax, and is as follows:
table_expression ALIGN table_expression [ ON distance [ MARGIN margin ] ]
where:
table_expression is an arbitrary table expression
distance is a numeric scalar expression that gives the distance between a row in the left input and a row in the right input
margin specifies the maximum number of skipped rows between two aligned rows, as explained earlier
If the distance is not specified or is a constant, or if the margin is zero, the operator simply aligns the row at position i in the left input with the row at position i in the right input.
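The zero-margin case follows directly from the margin rule: the window of permitted positions i-m to i+m shrinks to position i alone. A small Python sketch, assuming 0-based positions and a hypothetical helper named allowed_positions (not part of ScleraSQL), shows the window for a few margins:

```python
def allowed_positions(i, margin, right_len):
    """Positions in the right input that row i of the left input may align with.

    Hypothetical helper for illustration; positions are 0-based and the
    window i-margin .. i+margin is clipped to the bounds of the right input.
    """
    lo = max(0, i - margin)
    hi = min(right_len - 1, i + margin)
    return list(range(lo, hi + 1))


print(allowed_positions(3, 2, 10))  # [1, 2, 3, 4, 5]
print(allowed_positions(0, 2, 10))  # [0, 1, 2]
print(allowed_positions(3, 0, 10))  # [3]: only the same position remains
```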