AVA SQL

From avawiki
Revision as of 22:01, 14 May 2013 by Jxiaowei (Talk | contribs)

Jump to: navigation, search

AVA Database follows SQL '92 standard. In addition, AVA's unique SQL adds new features that facilitate high speed analytics and data mining.

Contents

Some AVA SQL conventions

Keyword cases

Keywords are case-insensitive. Both the low case or upper case, or mixed case will point to the same keyword. However, the column names and table names are case-sensitive.

Column assignment

A column can be created from an SQL update. There are two ways to assign a resulting column to a named column:

SQL convention

The "AS" clause to assign the results to a column.

ava> update low * 123.1 as new_low from a
0.024 seconds elapsed.

Non-SQL convention

A shortcut to create a column is by the syntax: "COLUMN_NAME: ...":

ava> update new_low: low * 123.1 from a
0.029 seconds elapsed.

Both will generate the same result:

ava> select * from a
TABLE NAME: a | NUMBER OF RECORDS: 3,252,871
---------------------------------------------------
      date |     sec_id |       open |       high |        low |      close |     volume |  adj_close |    new_low |
---------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- |
2000/10/16 |   ss600008 |      20.40 |      20.58 |      20.18 |      20.58 |     797800 |      16.87 |   2,484.16 |
2000/10/16 |   ss600018 |      20.16 |      20.50 |      20.15 |      20.20 |     346400 |      17.60 |   2,480.46 |
2000/10/16 |   ss600130 |      28.08 |      28.29 |      27.16 |      27.23 |     438300 |      24.89 |   3,343.40 |
0.001 seconds elapsed.


Analytics Functions

Mathematical Functions

Modulus ( % )

Take the modulus of the two numbers:

ava> select date, sec_id, ol: open % low from a
TABLE NAME: a | NUMBER OF RECORDS: 3,252,871
---------------------------------------------------
      date |     sec_id |         ol |
---------- | ---------- | ---------- |
2000/10/16 |   ss600008 |       0.00 |
2000/10/16 |   ss600018 |       0.00 |
2000/10/16 |   ss600130 |       1.00 |
0.036 seconds elapsed.

Power ( ^ )

Take the power of ( a real number ):

ava> select date, sec_id, o: open ^ 2 from a
TABLE NAME: a | NUMBER OF RECORDS: 3,252,871
---------------------------------------------------
      date |     sec_id |          o |
---------- | ---------- | ---------- |
2000/10/16 |   ss600008 |     416.16 |
2000/10/16 |   ss600018 |     406.43 |
2000/10/16 |   ss600130 |     788.49 |
0.081 seconds elapsed.

Factorial ( factorial )

Take factorial of an integer:

ava> select date, sec_id, f: factorial(5) * open  from a
TABLE NAME: a | NUMBER OF RECORDS: 3,252,871
---------------------------------------------------
      date |     sec_id |          f |
---------- | ---------- | ---------- |
2000/10/16 |   ss600008 |   2,448.00 |
2000/10/16 |   ss600018 |   2,419.20 |
2000/10/16 |   ss600130 |   3,369.60 |
0.02 seconds elapsed.

Other mathematical functions

  1. ABS(): Absolute Value
  2. EXP(): Exponential, e ( 2.72 ) to a power ( real number )
  3. SIGN(): Sign of a number
  4. LOG(): Log of e (2.72)
  5. LOG10(): LOG of 10
  6. SIN(): SIN
  7. COS(): COS
  8. TAN(): Tangent
  9. ASIN(): Arc Sin
  10. ACOS(): Arc Cos
  11. ATAN(): Arc Tangent

Statistical Functions

SUM

Aggregate function. Sum over a group ( option )

ava> select s: sum(volume) from a group by sec_id
TABLE NAME: a | NUMBER OF RECORDS: 1,718
---------------------------------------------------
    sec_id |          s |
---------- | ---------- |
  ss000001 |  407718300 |
  ss600000 |  424768664 |
  ss600001 |  481888440 |
0.29 seconds elapsed.

AVG

Aggregate function. Average over a group ( option )

ava> select s: avg(close) from a group by sec_id
TABLE NAME: a | NUMBER OF RECORDS: 1,718
---------------------------------------------------
    sec_id |          s |
---------- | ---------- |
  ss000001 |   2,214.42 |
  ss600000 |      17.70 |
  ss600001 |       5.31 |
0.288 seconds elapsed.

STD

Aggregate function. Standard Deviation over a group ( option )

ava> select s: std(close) from a group by sec_id
TABLE NAME: a | NUMBER OF RECORDS: 1,718
---------------------------------------------------
    sec_id |          s |
---------- | ---------- |
  ss000001 |   1,034.01 |
  ss600000 |      11.83 |
  ss600001 |       1.53 |
0.291 seconds elapsed.

MIN

Aggregate function. Minimum value over a group ( option )

ava> select s: min(close) from a group by sec_id
TABLE NAME: a | NUMBER OF RECORDS: 1,718
---------------------------------------------------
    sec_id |          s |
---------- | ---------- |
  ss000001 |   1,011.50 |
  ss600000 |       6.48 |
  ss600001 |       2.72 |
0.292 seconds elapsed.

MAX

Aggregate function. Maximum value over a group ( option )

ava> select s: max(close) from a group by sec_id
TABLE NAME: a | NUMBER OF RECORDS: 1,718
---------------------------------------------------
    sec_id |          s |
---------- | ---------- |
  ss000001 |   6,092.06 |
  ss600000 |      61.59 |
  ss600001 |      10.92 |
0.293 seconds elapsed.

Time Series Functions

CUMSUM

Cumulative sum. Sums over a pre-ordered data column.

ava> update cs: cumsum(open) from a group by sec_id
0.339 seconds elapsed.
ava> select date, sec_id, cs from a where sec_id = "ss600001"
TABLE NAME: a | NUMBER OF RECORDS: 1,709
---------------------------------------------------
      date |     sec_id |         cs |
---------- | ---------- | ---------- |
2003/01/01 |   ss600001 |       4.99 |
2003/01/02 |   ss600001 |       9.96 |
2003/01/03 |   ss600001 |      14.91 |
2003/01/06 |   ss600001 |      19.91 |
2003/01/07 |   ss600001 |      24.94 |
0.076 seconds elapsed.

To sort the table to a particular order:

ava> .sort a order by date desc
0.052 seconds elapsed.
ava> select * from a
TABLE NAME: a | NUMBER OF RECORDS: 3,252,871
---------------------------------------------------
      date |     sec_id |       open |       high |        low |      close |     volume |  adj_close |
---------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- |
2011/04/04 |   ss900946 |       0.47 |       0.47 |       0.47 |       0.47 |          0 |       0.47 |
2011/04/04 |   ss900940 |       0.23 |       0.23 |       0.23 |       0.23 |          0 |       0.23 |
2011/04/04 |   sz000863 |       3.50 |       3.50 |       3.50 |       3.50 |          0 |       3.50 |
0 seconds elapsed.

CUMPROD

Cumulative product. Product over a pre-ordered data column.

ava> update cp: cumprod(open) from a group by sec_id
0.588 seconds elapsed.

RANK

ava> update rk: rank(volume) from a group by sec_id
0.701 seconds elapsed.
ava> select date, sec_id, volume, rk from a
TABLE NAME: a | NUMBER OF RECORDS: 3,252,871
---------------------------------------------------
      date |     sec_id |     volume |         rk |
---------- | ---------- | ---------- | ---------- |
2000/10/16 |   ss600008 |     797800 |        590 |
2000/10/16 |   ss600018 |     346400 |        265 |
2000/10/16 |   ss600130 |     438300 |        864 |
0 seconds elapsed.


WRANK

SYNTAX: wrank(COLUMN_NAME,BINS) Rank a column and assign the ranks to bins.

ava> update rk: wrank(volume,10) from a group by sec_id
0.714 seconds elapsed.
ava> select date, sec_id, volume, rk from a
TABLE NAME: a | NUMBER OF RECORDS: 3,252,871
---------------------------------------------------
      date |     sec_id |     volume |         rk |
---------- | ---------- | ---------- | ---------- |
2000/10/16 |   ss600008 |     797800 |          3 |
2000/10/16 |   ss600018 |     346400 |          2 |
2000/10/16 |   ss600130 |     438300 |          4 |
0 seconds elapsed.

Window Functions

FIRST

ava> select date: first(date), open: first(open) from a group by sec_id TABLE NAME: a | NUMBER OF RECORDS: 1,718


   sec_id |       date |       open |

| ---------- | ---------- |
 ss000001 | 2000/10/16 |   1,917.45 |
 ss600000 | 2003/01/01 |       9.86 |
 ss600001 | 2003/01/01 |       4.99 |

0.025 seconds elapsed.

LAST

ava> select date: first(date), open: last(open) from a group by sec_id TABLE NAME: a | NUMBER OF RECORDS: 1,718


   sec_id |       date |       open |

| ---------- | ---------- |
 ss000001 | 2000/10/16 |   2,932.48 |
 ss600000 | 2003/01/01 |      13.60 |
 ss600001 | 2003/01/01 |       5.29 |

0.025 seconds elapsed.

LAG

ava> update lag_open: lag(open,2) from a group by sec_id
0.347 seconds elapsed.

LEAD

ava> update lead_open: lead(open,2) from a group by sec_id
0.336 seconds elapsed.
ava> select date, sec_id, open, lag_open, lead_open from a where sec_id = 'ss000001'
TABLE NAME: a | NUMBER OF RECORDS: 2,678
---------------------------------------------------
      date |     sec_id |       open |   lag_open |  lead_open |
---------- | ---------- | ---------- | ---------- | ---------- |
2000/10/16 |   ss000001 |   1,917.45 |        nan |   1,911.71 |
2000/10/17 |   ss000001 |   1,902.07 |        nan |   1,925.21 |
2000/10/18 |   ss000001 |   1,911.71 |   1,917.45 |   1,913.16 |
0.063 seconds elapsed.

MAVG

ava> update mavg_open: mavg(open,2)  from a group by sec_id
0.364 seconds elapsed.

MSUM

ava> update msum_open: msum(open,2)  from a group by sec_id
0.361 seconds elapsed.

MSTD

ava> update mstd_open: mstd(open,2)  from a group by sec_id
0.377 seconds elapsed.
ava> select date, sec_id, open, mavg_open, mstd_open, msum_open from a where sec_id = 'ss000001'
TABLE NAME: a | NUMBER OF RECORDS: 2,678
---------------------------------------------------
      date |     sec_id |       open |  mavg_open |  mstd_open |  msum_open |
---------- | ---------- | ---------- | ---------- | ---------- | ---------- |
2000/10/16 |   ss000001 |   1,917.45 |   1,917.45 |        nan |   1,917.45 |
2000/10/17 |   ss000001 |   1,902.07 |   1,909.76 |      10.88 |   3,819.52 |
2000/10/18 |   ss000001 |   1,911.71 |   1,906.89 |       6.82 |   3,813.78 |

EMA

Exponential moving average. SYNTAX: ema(COLUMN_NAME, HALF_LIFE)

ava> update ema_open: ema(open, 0.1) from a group by sec_id
0.034 seconds elapsed.

Row-wise Aggregation Functions

WSUM

ava> update mopen: wsum(open) from a group by sec_id
0.406 seconds elapsed.
ava> select date, sec_id, open, mopen from a
TABLE NAME: a | NUMBER OF RECORDS: 3,252,871
---------------------------------------------------
      date |     sec_id |       open |      mopen |
---------- | ---------- | ---------- | ---------- |
2000/10/16 |   ss600008 |      20.40 |  26,741.47 |
2000/10/16 |   ss600018 |      20.16 |  31,974.72 |
2000/10/16 |   ss600130 |      28.08 |  30,661.59 |
0.001 seconds elapsed.
ava> update open_demean: open - wsum(open) from a group by sec_id
0.428 seconds elapsed.


WAVG

ava> select date, sec_id, mopen: wavg(open) from a group by sec_id
TABLE NAME: a | NUMBER OF RECORDS: 3,252,871
---------------------------------------------------
      date |     sec_id |      mopen |
---------- | ---------- | ---------- |
2000/10/16 |   ss600008 |      10.19 |
2000/10/16 |   ss600018 |      12.18 |
2000/10/16 |   ss600130 |      11.65 |
0.377 seconds elapsed.

WSTD

ava> select date, sec_id, mopen: wstd(open) from a group by sec_id
TABLE NAME: a | NUMBER OF RECORDS: 3,252,871
---------------------------------------------------
      date |     sec_id |      mopen |
---------- | ---------- | ---------- |
2000/10/16 |   ss600008 |       4.72 |
2000/10/16 |   ss600018 |       5.91 |
2000/10/16 |   ss600130 |       8.86 |
0.379 seconds elapsed.

WMIN

ava> select date, sec_id, mopen: wmin(open) from a group by sec_id
TABLE NAME: a | NUMBER OF RECORDS: 3,252,871
---------------------------------------------------
      date |     sec_id |      mopen |
---------- | ---------- | ---------- |
2000/10/16 |   ss600008 |       3.70 |
2000/10/16 |   ss600018 |       3.19 |
2000/10/16 |   ss600130 |       1.70 |
0.38 seconds elapsed.

WMAX

ava> select date, sec_id, mopen: wmax(open) from a group by sec_id
TABLE NAME: a | NUMBER OF RECORDS: 3,252,871
---------------------------------------------------
      date |     sec_id |      mopen |
---------- | ---------- | ---------- |
2000/10/16 |   ss600008 |      23.58 |
2000/10/16 |   ss600018 |      25.30 |
2000/10/16 |   ss600130 |      33.35 |
0.378 seconds elapsed.

WCOUNT

ava> select date, sec_id, mopen: wcount(open) from a group by sec_id
TABLE NAME: a | NUMBER OF RECORDS: 3,252,871
---------------------------------------------------
      date |     sec_id |      mopen |
---------- | ---------- | ---------- |
2000/10/16 |   ss600008 |       2624 |
2000/10/16 |   ss600018 |       2625 |
2000/10/16 |   ss600130 |       2633 |
0.286 seconds elapsed.

Table Join Functions

Data Conversion Functions

to_date

ava> select date, sec_id from a where date = to_date("2000-10-16")
TABLE NAME: a | NUMBER OF RECORDS: 100
---------------------------------------------------
      date |     sec_id |
---------- | ---------- |
2000/10/16 |   ss600008 |
2000/10/16 |   ss600018 |
2000/10/16 |   ss600130 |
0.019 seconds elapsed.

to_string

ava> select date, sec_id, sec_id2: to_string(sec_id, 5) from a
TABLE NAME: a | NUMBER OF RECORDS: 3,252,871
---------------------------------------------------
      date |     sec_id |    sec_id2 |
---------- | ---------- | ---------- |
2000/10/16 |   ss600008 |      ss600 |
2000/10/16 |   ss600018 |      ss600 |
2000/10/16 |   ss600130 |      ss600 |
0.072 seconds elapsed.

to_time

ava> update time: to_time("10:10:10.999") from a
0.032 seconds elapsed.
ava> select date, time, sec_id from a
TABLE NAME: a | NUMBER OF RECORDS: 3,252,871
---------------------------------------------------
      date |       time   |     sec_id |
---------- | ----------   | ---------- |
2000/10/16 | 10:10:10.999 |   ss600008 |
2000/10/16 | 10:10:10.999 |   ss600018 |
2000/10/16 | 10:10:10.999 |   ss600130 |
0.001 seconds elapsed.

to_long

ava> select date, sec_id, open, lopen: to_long(open) from a
TABLE NAME: a | NUMBER OF RECORDS: 3,252,871
---------------------------------------------------
      date |     sec_id |       open |      lopen |
---------- | ---------- | ---------- | ---------- |
2000/10/16 |   ss600008 |      20.40 |         20 |
2000/10/16 |   ss600018 |      20.16 |         20 |
2000/10/16 |   ss600130 |      28.08 |         28 |
0.012 seconds elapsed.

to_float

ava> select date, sec_id, volume, fvolume: to_float(volume) from a
TABLE NAME: a | NUMBER OF RECORDS: 3,252,871
---------------------------------------------------
      date |     sec_id |     volume |    fvolume |
---------- | ---------- | ---------- | ---------- |
2000/10/16 |   ss600008 |     797800 | 797,800.00 |
2000/10/16 |   ss600018 |     346400 | 346,400.00 |
2000/10/16 |   ss600130 |     438300 | 438,300.00 |
0.012 seconds elapsed.

to_integer

ava> select date, sec_id, open, lopen: to_integer(open) from a

to_unsigned_integer

ava> select date, sec_id, open, lopen: to_unsigned_integer(open) from a