Source

htsql / doc / ref / functions.rst

Full commit

Functions and Operators

This document describes built-in functions and operators.

Logical Functions and Operators

Function Description Example Input Output
boolean(x) cast x to Boolean boolean('true') true
boolean('false') false
boolean(string('HTSQL')) true
boolean(string('')) false
true() logical TRUE value true()  
false() logical FALSE value false()  
p & q logical AND operator true()&true() true
true()&false() false
false()&false() false
p | q logical OR operator true()|true() true
true()|false() true
false()|false() false
!p logical NOT operator !true() false
!false() true
null(x) NULL value null()  
is_null(x) x is null is_null(null()) true
if_null(x,y) x if x is not null; y otherwise if_null(1,0) 1
if_null(null(),0) 0
null_if(x,y) x if x is not equal to y; null otherwise null_if(1,0) 1
null_if(0,0) null
x = y x is equal to y 'HTSQL'='QUEL' false
x != y x is not equal to y 'HTSQL'!='QUEL' true
x == y x is equal to y; treats nulls as regular values 'HTSQL'=='QUEL' false
2==null() false
x !== y x is not equal to y; treats nulls as regular values 'HTSQL'!=='QUEL' true
2!==null() true
x = {a,b,c,...} x is among a, b, c, ... 5={2,3,5,7}' true
x != {a,b,c,...} x is not among a, b, c, ... 5!={2,3,5,7}' false
x < y x is less than y 1<10 true
'omega'<'alpha' false
x <= y x is less than or equal to y 1<=10 true
'omega'<='alpha' false
x > y x is greater than y 1>10 false
'omega'>'alpha' true
x >= y x is greater than or equal to y 1>=10 false
'omega'>='alpha' true
if(p1,c1,...,pn,cn) first ck such that pk is TRUE; o or null otherwise if(true(),'up','down') 'up'
if(p1,c1,...,pn,cn,o) if(false(),'up','down') 'down'
switch(x,y1,c1,...,yn,cn) first ck such that x is equal to yk; o or null otherwise switch(1,1,'up',0,'down') 'up'
switch(x,y1,c1,...,yn,cn,o) switch(0,1,'up',0,'down') 'down'

Boolean Cast

boolean(x)
Convert x to Boolean.

The result of the conversion depends on the type of the argument:

untyped
The literal 'false' is converted to FALSE, the literal 'true' is converted to TRUE, any other literals generate an error.
boolean
The value is unchanged.
string
NULL and an empty string are converted to FALSE, other values are converted to TRUE.
other data types
null values are converted to FALSE, other values are converted to TRUE.

Logical Values

true()
Logical TRUE value.
false()
Logical FALSE value.

Logical Operators

p | q
Logical OR operator.
p & q
Logical AND operator.
! p
Logical NOT operator.

Arguments of a logical operators that are not of a Boolean type automatically converted to Boolean (see boolean() function).

NULL Checking

null()
Untyped NULL value.
is_null(x)
TRUE if x is NULL, FALSE otherwise.
if_null(x,y)
x if x is not NULL, y otherwise.
null_if(x,y)
x if x is not equal to y, NULL otherwise.

The arguments of if_null() and null_if() should be of the same type; if not, the arguments are coerced to the most general type.

Equality Operators

x = y
TRUE if x is equal to y, FALSE otherwise. Returns NULL if any of the operands is NULL.
x != y
TRUE if x is not equal to y, FALSE otherwise. Returns NULL if any of the operands is NULL.
x == y
TRUE if x is equal to y, FALSE otherwise. Treats NULL as a regular value.
x !== y
TRUE if x is not equal to y, FALSE otherwise. Treats NULL as a regular value.
x = {a,b,c,...}
TRUE if x is equal to some value among a,b,c,..., FALSE otherwise.
x != {a,b,c,...}
TRUE if x is not equal to all values among a,b,c,..., FALSE otherwise.

The form x = {a,b,c,...} is a short-cut syntax for x=a|x=b|x=c|.... Similarly, the form x != {a,b,c,...} is a short-cut syntax for x!=a|x!=b|x!=c|....

The operands of equality operators are expected to be of the same time. If the types of the operands are different, the operands are coerced to the most general type; it is an error if the operand types are not compatible to each other.

Comparison Operators

x < y
TRUE if x is less than y, FALSE otherwise.
x <= y
TRUE if x is less than or equal to y, FALSE otherwise.
x > y
TRUE if x is greater than y, FALSE otherwise.
x >= y
TRUE if x is greater than or equal to y, FALSE otherwise.

The result is NULL if any of the operands is NULL.

An operand of a comparison operator must be of a string, numeric, enumeration, or date/time type. Both operands are expected to be of the same type; if not, the operands are coerced to the most general type.

Branching Functions

if(p1,c1,p2,c2,...,pn,cn[,o])
This function takes N logical expressions p1,p2,...,pN interleaved with N values c1,c2,...,cN, followed by an optional value o. The function returns the value ck corresponding to the first predicate pk evaluated to TRUE. If none of the predicates are evaluated to TRUE, the value of o is returned, or NULL if o is not specified.
switch(x,y1,c1,y2,c2,...,yn,cn[,o])
This function takes a control expression x followed by N variant values y1,y2,...,yN interleaved with N resulting values c1,c2,...,cN, and concluded with an optional default value o. The function returns the value ck corresponding to the first variant yk equal to x. If none of the variants are equal to the control value, o is returned, or NULL if o is not specified.

These functions expect all the resulting values c1,c2,...,cN as well as the default value o to be of the same type. If the value types are different, all values are coerced to the most general type. Same is true for the control expression x and variant values y1,y2,...,yN of the function switch().

Numeric Functions

Function Description Example Input Output
integer(x) cast x to integer integer('60') 60
integer(17.25) 17
integer(string('60')) 60
decimal(x) cast x to decimal decimal('17.25') 17.25
decimal(223607e-5) 2.23607
decimal(string('17.25')) 17.25
float(x) cast x to float float('223607e-5') 223607e-5
float(60) 6e1
float(string('223607e-5')) 223607e-5
+ x x +60  
- x negate x -7  
x + y add x to y 13+7 20
x - y subtract y from x 13-7 6
x * y multiply x by y 13*7 91
x / y divide x by y 13/7 1.85714285714286
round(x) round x to the nearest integer round(17.25) 17
round(x,n) round x to n decimal places round(17.25,1) 17.3
trunc(x) round x to an integer, towards zero trunc(17.25) 17
trunc(x,n) round x to n decimal places, towards zero trunc(17.25,1) 17.2

Numeric Cast

integer(x)
Convert x to integer.
decimal(x)
Convert x to decimal.
float(x)
Convert x to float.

The argument of a conversion function can be of one of the following types:

untyped
An untyped literal must be a valid number. The integer() function accepts only integer literals, decimal() and float() accepts untyped literals written in integer, decimal or scientific notation.
numeric
Numeric cast functions convert numbers between different storage forms. Behavior on range overflow and rounding rules are backend-dependent.
string
A string value must contain a valid number. The set of allowed input values depends on the backend.

Arithmetic Expressions

+ x
Return x.
- x
Negate x.
x + y
Add x to y.
x - y
Subtract y from x.
x * y
Multiply x by y.
x / y
Divide x by y.

Arithmetic operators expect operands of a numeric type. If the operands are of different types, they are coerced to the most general type, in the order: integer, decimal, float. For instance, adding an integer value to a decimal value converts the integer operand to decimal; multiplying a decimal value to a float value converts the decimal operand to float.

In general, the type of the result coincides with the type of the operands. The only exception is the division operator: when applied to integer operands, division produces a decimal value.

The behavior of arithmetic expressions on range overflow or division by zero is backend-dependent: different backends may raise an error, return a NULL value or generate an incorrect result.

Note that some arithmetic operators are also defined for string and date values; they are described in respective sections.

Rounding Functions

round(x)
Round x to the nearest integer value.
round(x,n)
Round x to n decimal places.
trunc(x)
Round x to an integer, towards zero.
trunc(x,n)
Round x to n decimal places, towards zero.

If called with one argument, the functions accept values of decimal or float types and return a value of the same type.

When called with two arguments, the functions expects a decimal argument and produces a decimal value. The second argument should be an integer; negative values are permitted.

String Functions

By convention, string functions take a string as its first parameter. When an untyped literal, such as 'value' is used and a string is expected, it is automatically cast. Hence, for convenience, we write string typed values using single quotes in the output column.

Function Description Example Input Output
string(x) cast x to string string('Hello') 'Hello'
string(1.0) '1.0'
string(date('2010-04-15')) '2010-04-15'
length(s) number of characters in s length('HTSQL') 5
s + t concatenate s and t 'HT' + 'SQL' 'HTSQL'
s ~ t s contains t; case-insensitive 'HTSQL' ~ 'sql' true
s !~ t s does not contain t; case-insensitive 'HTSQL' !~ 'sql' false
head(s) first character of s head('HTSQL') 'H'
head(s,n) first n characters of s head('HTSQL',2) 'HT'
head('HTSQL',-3) 'HT'
tail(s) last character of s tail('HTSQL') 'L'
tail(s,n) last n characters of s tail('HTSQL',3) 'SQL'
tail('HTSQL',-2) 'SQL'
slice(s,i,j) i-th to j-th characters of s; null or missing index means the beginning or the end of the string slice('HTSQL',1,4) 'TSQ'
slice('HTSQL',-4,-1) 'TSQ'
slice('HTSQL',null(),2) 'HT'
slice('HTSQL',2,null()) 'SQL'
at(s,k) k-th character of s at('HTSQL',2) 'S'
at(s,k,n) n characters of s starting with k-th character at('HTSQL',1,3) 'TSQ'
at('HTSQL,-4,3) 'TSQ'
at('HTSQL,4,-3) 'TSQ'
upper(s) upper case of s upper('htsql') 'HTSQL'
lower(s) lower case of s lower('HTSQL') 'htsql'
trim(s) strip leading and trailing spaces from s trim('  HTSQL  ') 'HTSQL'
ltrim(s) strip leading spaces from s ltrim('  HTSQL  ') 'HTSQL  '
rtrim(s) strips trailing spaces from s rtrim('  HTSQL  ') '  HTSQL'
replace(s,t,r) replace all occurences of t in s with r replace('HTSQL','SQL','RAF') 'HTRAF'

String Cast

string(x)
Convert x to a string.

HTSQL permits any value to be converted to a string; the conversion respects the format for literals of the original type.

String Length

length(s)
Number of characters in s.

The exact meaning of a string length depends on the backend and the underlying SQL type. The function returns 0 if the argument is NULL.

Concatenation

s + t
Concatenate s and t.

The concatenation operator treats a NULL operand as an empty string.

Substring Extraction

head(s)
The first character of s.
head(s,n)
The first n characters of s.
tail(s)
The last character of s.
tail(s,n)
The last n characters of s.
slice(s,i,j)
The i-th to j-th (exclusive) characters of s.
at(s,k)
The k-th character of s.
at(s,k,n)
n characters of s starting from the k-th.

In HTSQL, characters of a string are indexed from 0.

Extraction functions permit negative or NULL indexes. head() (tail()), when given a negative n, produces all but the last (first) -n characters of s; if n is NULL, it is assumed to be 1.

For slice(), a negative index i or j indicates to count (-i-1)-th ((-j-1)-th) character from the end of s. NULL value for i or j indicates the beginning (the end) of the string.

For at(), a negative n produces -n characters of s ending at the k-th character; if n is NULL, it is assumed to be 1.

Case Conversion

upper(s)
Convert s to upper case.
lower(s)
Convert s to lower case.

The conversion semantics is backend-dependent.

String Trimming

trim(s)
Strip leading and trailing spaces from s.
ltrim(s)
Strip leading spaces from s.
rtrim(s)
Strip trailing spaces from s.

Search and Replace

replace(s,t,r)
Replace all occurences of substring t in s with r.

Case-sensitivity of the search depends on the backend; NULL values for t and r are interpreted as an empty string.

Date/Time Functions

Function Description Example Input Output
date(x) cast x to date date('2010-04-15')  
time(x) cast x to time time('20:13')  
datetime(x) cast x to datetime datetime('2010-04-15T20:13')  
date(yyyy,mm,dd) date yyyy-mm-dd date(2010,4,15) date('2010-04-15')
datetime(yyyy,mm,dd [,HH,MM,SS]) datetime yyyy-mm-dd HH:MM:SS datetime(2010,4,15,20,13) datetime('2010-04-15T20:13')
datetime(d,t) datetime from date and time datetime( date('2010-04-15'), time('20:13') ) datetime('2010-04-15T20:13')
today() current date today()  
now() current date and time now()  
date(dt) date of dt date( datetime('2010-04-15T20:13') ) date('2010-04-15')
time(dt) time of dt time( datetime('2010-04-15T20:13') ) time('20:13')
year(d) year of d year(date('2010-04-15')) 2010
month(d) month of d month(date('2010-04-15')) 4
day(d) day of d day(date('2010-04-15')) 15
hour(t) hours of t hour(time('20:13')) 20
minute(t) minutes of t minute(time('20:13')) 13
second(t) seconds of t second(time('20:13')) 0.0
d + n increment d by n days date('1991-08-20')+6813 date('2010-04-15')
d - n decrement d by n days date('2028-12-09')-6813 date('2010-04-15')
d1 - d2 number of days between d1 and d2 date('2028-12-09') - date('1991-08-20') 13626

Date/Time Cast

date(x)
Convert x to a date value.
time(x)
Convert x to a time value.
datetime(x)
Convert x to a datetime value.

Conversion functions accept untyped literals and string expressions. An untyped literal must obey the literal format of the respective target type. Conversion from a string value is backend-specific.

Date/Time Construction

date(yyyy,mm,dd)
Construct a date from the given year, month and day values.
datetime(yyyy,mm,dd[,HH,MM,SS])
Construct a datetime from the given year, month, day, hour, minute and second values.
datetime(d,t)
Construct a datetime from the given date and time.

Construction functions accept and normalize component values outside the regular range.

Component Extraction

date(dt)
Date of a datetime value.
time(dt)
Time of a datetime value.
year(d)
Year of a date or a datetime value.
month(d)
Month of a date or a datetime value.
day(d)
Day of a date or a datetime value.
hour(t)
Hours of a time or a datetime value.
minute(t)
Minutes of a time or a datetime value.
second(t)
Seconds of a time or a datetime value.

The extracted values are integers except for second(), where the extracted value is a float number.

Date/Time Arithmetics

d + n
Increment a date or a datetime value by n days.
d - n
Decrement a date or a datetime value by n days.
d1 - d2
Number of days between two date values.

Aggregate Functions

Function Description Example Input
exists(ps) TRUE if ps contains at least one TRUE value; FALSE otherwise exists(course.credits>5)
every(ps) TRUE if ps contains only TRUE values; FALSE otherwise every(course.credits>5)
count(ps) number of TRUE values in ps count(course.credits>5)
min(xs) smallest element in xs min(course.credits)
max(xs) largest element in xs max(course.credits)
sum(xs) sum of elements in xs sum(course.credits)
avg(xs) average value of elements in xs avg(course.credits)

Aggregate functions accept a plural argument, which, when evaluated, produces a flow of values, and generates a single aggregating value from it.

Boolean Aggregates

exists(xs)
Produce TRUE if xs contains at least one TRUE value, FALSE otherwise. The aggregate returns FALSE on an empty flow.
every(xs)
Produce FALSE if xs contains only TRUE values, FALSE otherwise. The aggregate returns TRUE on an empty flow.
count(xs)
The number of TRUE values in xs; 0 if xs is empty.

Boolean aggregates expect a Boolean argument; a non-Boolean argument is converted to Boolean first (see function boolean()).

Extrema

min(xs)
The smallest value in xs.
max(xs)
The largest value in xs.

The functions accept numeric, string, enumeration and date/time arguments. NULL values in the flow are ignored; if the flow is empty, NULL is returned.

Sum and Average

sum(xs)
The sum of values in xs; returns 0 if xs is empty.
avg(xs)
The average of values in xs.

The functions accept a numeric argument. sum() returns a result of the same type as the argument, avg() returns a decimal result for an integer or a decimal argument, and float result for a float argument.

Flow Operations

Function Description Example Input
flow ? p records from flow satisfying condition p school?code='edu'
filter(p) school.filter(code='edu')
flow ^ x unique values of x as it runs over flow school^campus
distinct(flow{x}) distinct(school{campus})
flow {x,...} select output columns x, ... for flow school{code,name}
select(x,...) school.select(code,name)
sort(x,...) reorder records in flow by x, ... course.sort(credits-)
limit(n) first n records from flow course.limit(10)
limit(n,k) n records from flow starting from k-th course.limit(10,20)
x -> xs traverse an ad-hoc link school.(campus -> school)
fork([x]) traverse a self-referential link course.fork(credits)

Sieving

flow ? p
Emit records from flow that satisfy condition p.
filter(p)
Emit records from the input flow that satisfy condition p.

The condition is expected to be of Boolean type. If the argument p is not Boolean, it is implicitly converted to Boolean (see boolean()).

Projection

flow ^ x
Emit all unique values of x as it ranges over flow. NULL values are ignored.
flow ^ {x,...}
Emit all unique values of the expressions x,.... NULL values are ignored.
distinct(flow{x,...})
Emit all unique values of the output columns of flow{x,...}. NULL values are ignored.

The projection operation flow ^ x creates a new naming scope, which may contain the following names:

flow
If flow is an identifier, then it is used to denote the plural link associating each value of x with respective records from the original flow. It is called the complement link of the projection. The symbol ^ is an alias for a complement link and could be used when flow is not an identifier and so cannot be used as a name.
x
If x is an identifier, then it refers to the value of x. It is called the kernel of the projection. When x is not an identifier, but an arbitrary expression, one may assign it a name using in-place selector assignment syntax.

Selection

{x,...}
Define output columns in the input flow.
flow{x,...}
Define output columns in the given flow.
select(x,...)
Define output columns in the input flow.

The selector expression admits two forms of short-cut syntax:

in-place assignment
If an element of a selector is an assignment expression, the name defined by the assignment is added to the current scope. Only unqualified attribute and reference assignments are allowed.
sorting decorators
If an element of a selector contains a sort order indicators, the expression is used to reorder elements in the input flow.

Scope Operations

Function Description Example Input
define(x:=...) add names to the current scope define(num_prog:=count(program))
where(expr,x:=...) evaluate an expression with extra names in the current scope count(course?credits>$c) :where $c:=avg(course.credits)
@ x evaluate an expression in the initial scope  
root() root scope  
this() current scope  

Calculated Attributes

define(x:=...)
Add a calculated attribute to the current scope.
where(expr,x:=...)
Evaluate an expression in a current scope with a calculated attribute.

These functions add calculated attributes and references to the current scope.

Scopes

@ x
Reset to the initial scope.
root()
The root scope.
this()
The current scope.

Decorators

Function Description Example Input
as(x,title) set the column title count(program) :as '# of programs'
x + indicate ascending order credits+
x - indicate descending order credits-

Title

as(x,title)
Specifies the title of the output column.

The title could be either an identifier or a quoted literal. This function should be used only when specifying output columns using a selection operator.

Direction Decorators

x +
Specifies ascending direction, NULL first.
x -
Specifies descending direction, NULL last.

This decorators should be used only on arguments of sort() or in a selection operator.

Formatters

Function Description
/:html HTML presentation output
/:txt plain text presentation output
/:csv CSV (comma-separated values) output
/:tsv TSV (tab-separated values) output
/:raw JSON-serialized generic output
/:json JSON-serialized object output
/:xml XML-serialized object output
/:sql prints corresponding SQL queries

These functions specify the format of the output data.

Presentation Output

We support two presentation outputs, /:txt output format is supported for text clients like our command line shell, and the /:html provides single-page static output. The /:html format has a customizable template that can be used to provide headers and footers.

Object Output

The /:xml and /:json formatters produce a customary "object" structure which lack any header information. These object formats are perfect for post-processing tools that already know about the query output structure, such as a Javascript program or XSLT stylesheet.

Tabular Output

Tabular formatters (/:csv, /:tsv) are intended for use in data processing tools. These formats include only column headers and the output data on subsequent rows.

Generic Output

Our primary default output, /:raw, is meant for generic tools that must handle arbitrary queries. This format reflects our native product structure and has two corresponding sections: meta details type information and data has the corresponding output data.

Query Debug

The special /:sql designator isn't actually a formatter at all, since it shortcuts the whole query execution part of the engine and produces the SQL queries used as an output. It's useful for understanding what's going on under the hood.