Kirill Simonov avatar Kirill Simonov committed 27bb7d8

Added draft sections on data flows, naming scopes and data types.

Comments (0)

Files changed (1)

doc/reference.rst

 a singular link from class `department` to class `school` and a
 reverse plural link from class `school` to class `department`.
 
+Data Flow
+---------
+
+A central concept in HTSQL is *data flow*, a sequence of homogeneous
+values.
+
+HTSQL is a data flow transformation language.  Every HTSQL expression
+operates on flows; that is, it accepts an *input flow* and transforms it
+into an *output flow*.
+
+(diagram: input flow -> expression -> output flow)
+
+The initial input flow consists of a single empty record.  HTSQL
+processor interprets the given HTSQL query as a sequence (or rather a
+directed graph) of flow operations, which it applies one by one.  The
+resulting flow is then displayed to the user in a tabular form.
+
+Different operations affect the flow in various ways: multiply it, or
+remove elements from it, apply a scalar function to each element, etc.
+In the next sections, we discuss different types flow operations.
+
+Scalar Expressions
+------------------
+
+A simplest example of a flow operation is an application of some
+expression to each value in a flow.  The output flow consists of a
+results of the expression.
+
+That kind of expression does not change the number of elements in the
+flow; we call such expressions *scalar*.
+
+.. htsql:: /(3+4)*6
+
+In this example, a scalar expression ``(3+4)*6`` is applied to the
+initial flow; the value of this expression forms the resulting flow.
+
+(diagram: [] -> (3+4)*6 -> 42)
+
+.. htsql:: /school{code, count(department)}
+   :cut: 4
+
+In this example, two scalar expressions ``code`` and
+``count(department)`` are applied to the flow consisting of *school*
+records.  For each school entity, they extract the value of the
+attribute ``code`` and the number of associated departments.
+
+(diagram: [art], [bus], [edu], [eng], ...
+    -> {'art',2}, {'bus',3}, {'edu',2}, {'eng',4}, ...)
+
+A scalar expression is an example of a *singular* expression; one which
+does not increase the number of elements in the flow, as opposed to a
+*plural* expression, which may produce more output elements than in the
+input flow.
+
+Navigation
+----------
+
+Navigation is an operation of selecting the initial record class or
+traversing a link.
+
+When used in the root scope, a class name produces a flow of all records
+from the class.
+
+.. htsql:: /school
+   :cut: 4
+
+(diagram: [] -> (school) -> [art], ...)
+
+In a class scope, the name of a link produces the flow consisting of
+associated records from the target class.
+
+.. htsql:: /school.department
+   :cut: 4
+
+(diagram: [] -> (school) -> [art], ... -> (department) -> [arthis], ...)
+
+As in the previous example, ``school`` generates a flow of *school*
+records.  Then we traverse a link ``school.department``.  That each, for
+each school record in the input flow, we find the associated
+*department* records, the output flow consists of all *department*
+records combined.
+
+A traversal operation is singular or plural depending on whether the
+respective link is singular or plural.
+
+Filtering
+---------
+
+A *sieve* expression filters the input flow leaving only those elements
+which satisfy the given condition.
+
+A sieve expression takes one argument: a scalar logical expression
+called the *filter*.  It applies the filter to each element of the input
+flow.  The output flow consists of those elements of the input flow for
+which the filter is evaluated to *TRUE* value.
+
+.. htsql:: /school?count(department)>3
+
+(diagram)
+
+In this example, the sieve expression evaluates a filter condition
+``count(department)>3`` for each record from the *school* class; those
+records for which this condition is valid generate the output of the
+query.
+
+A *sort* expression reorders elements in the flow according to a given
+argument.
+
+.. htsql:: /school.sort(name+)
+   :cut: 4
+
+(diagram)
+
+In this example, the *school* records are ordered in the ascending order
+with respect to the value of ``name`` attribute.
+
+A *truncation* operation makes a slice of the input flow.
+
+.. htsql:: /school.limit(3)
+
+(diagram)
+
+In this case, we take the top 3 records from the *school* class.
+
+Aggregates
+----------
+
+An aggregate function converts a plural expression into a scalar.
+
+The argument of an aggregate function must be a plural expression.  Then
+for each element of the input flow, the aggregate evaluates the
+respective sub-flow and applies a set function to the result to generate
+a scalar value.
+
+.. htsql:: /count(school)
+
+(diagram)
+
+In this example, ``count()`` aggregates produces the number of elements
+in the flow generated by expression ``school``.
+
+.. htsql:: /department{code, max(course.credits)}
+   :cut: 4
+
+(diagram)
+
+In this example, ``max(course.credits)`` starts with evaluating the flow
+``department.course.credits``.  Then for each *department* record of the
+input flow, ``max()`` finds the maximum value in the respective
+sub-flow.
+
+Projection
+----------
+
+A projection expression takes a scalar argument called the *kernel*.
+The output flow of projection consists of all unique values of the
+kernel as it runs over the input flow.
+
+.. htsql:: /school^campus
+
+(diagram)
+
+The output of this query consists of all distinct values of
+`school.campus` attribute.
+
+Naming Scope
+------------
+
+In HTSQL, identifiers are used to refer to class names, attributes,
+links as so on.  A collection of available names and associated objects
+is called a naming *scope*.
+
+Root Scope
+----------
+
+The root scope is the top level scope in the scope stack -- it is the
+scope where the query is evaluated.  This scope contains the names of
+all classes (tables) in the database.
+
+.. htsql:: /{count(school), count(department)}
+
+In this example, identifiers ``school`` and ``department`` belong to the
+root scope and are associated with the respective classes.
+
+Class Scope
+-----------
+
+The class scope is associated with some class (table) of the database.
+The scope contains names of all class attributes and links to other
+classes.
+
+.. htsql:: /school{code, count(department)}?exists(program)
+   :cut: 4
+
+In this example, ``school`` belongs to the root scope while identifiers
+``code``, ``department`` and ``program`` belong to the scope of `school`
+class.  ``school.code`` is the attribute of `school`,
+``school.department`` and ``school.program`` are links to the respective
+classes.
+
+Projection Scope
+----------------
+
+The projection scope is associated with a projection expression.
+
+Projection is an example of a derived class: its records are composed
+from unique values of the kernel as it runs over the base class.  A
+projection class has a natural link back to the base class: it relates
+the value of the kernel to every record of the base class that produced
+this value.
+
+(diagram)
+
+This link is called a *complement* link.  HTSQL assigns the name for the
+link that coincides with the name of the base class.  In cases when
+HTSQL is unable to deduce a link name, one may use a special
+*complement* expression: `^`.
+
+Attributes of the projection class are values of the kernel expression.
+When possible, HTSQL automatically assigns names for attributes,
+otherwise, the user may define custom attribute names.
+
+.. htsql:: /(school^campus){campus, count(school)}
+
+In this example, the projection scope ``(school^campus)`` has two names:
+the attribute name ``campus`` and the kernel link ``school``.
+
+.. htsql::
+
+    /(school^{num_dept := count(department)})
+        {num_dept, count(school)}
+
+In this example, we assign the name ``num_dept`` to the projection
+attribute.
+
+Modifying Scope
+---------------
+
+HTSQL allows adding new attributes to an existing scope, see
+functions ``define()`` and ``where()``.
+
+References
+----------
+
+Traversing a link changes the scope; any names defined in the previous
+scope are no longer available.  To pass values between different scopes,
+use references.
 
 Data Types
 ==========
 
-+----------------------+---------------------------+---------------------------+----------------------+
-| Type                 | Description               | Example Input             | Output               |
-+======================+===========================+===========================+======================+
-| `boolean`            | logical data type, with   | ``true()``                |                      |
-|                      | two values: *TRUE* and    +---------------------------+----------------------+
-|                      | *FALSE*                   | ``false()``               |                      |
-+----------------------+---------------------------+---------------------------+----------------------+
-| `integer`            | binary integer type       | ``4096``                  |                      |
-+----------------------+---------------------------+---------------------------+----------------------+
-| `decimal`            | arbitrary-precision       | ``124.49``                |                      |
-|                      | exact numeric type        |                           |                      |
-+----------------------+---------------------------+---------------------------+----------------------+
-| `float`              | IEEE 754 floating-point   | ``271828e-5``             |                      |
-|                      | inexact numeric type      |                           |                      |
-+----------------------+---------------------------+---------------------------+----------------------+
-| `string`             | text data type            | ``string('HTSQL')``       |                      |
-+----------------------+---------------------------+---------------------------+----------------------+
-| `enum`               | enumeration data type,    |                           |                      |
-|                      | with predefined set of    |                           |                      |
-|                      | valid string values       |                           |                      |
-+----------------------+---------------------------+---------------------------+----------------------+
-| `date`               | date data type            | ``date('2010-04-15')``    |                      |
-+----------------------+---------------------------+---------------------------+----------------------+
-| `opaque`             | unrecognized data type    |                           |                      |
-+----------------------+---------------------------+---------------------------+----------------------+
+Every HTSQL expression has an associated type.  The type defines the set
+of valid values produced by the expression and permitted operations over
+the expression.  The type also indicates how returned values are
+formatted in the output.
+
+Since HTSQL wraps an SQL database, HTSQL data types are related to SQL
+data type.  Although HTSQL does not expose SQL data types directly to
+the user, each SQL data type corresponds to some HTSQL data type and
+vice versa.
+
+In this section we describe what data types HTSQL supports, how HTSQL
+types are mapped to SQL types, the format of input literals for each
+data type, etc.
+
+Regular and Special Types
+-------------------------
+
+HTSQL demands that every expression has an associated type.  For
+example, in the query:
+
+.. htsql:: /{2+2=4, count(school), date('2010-04-15')-6813}
+
+the expressions ``2+2=4``, ``count(school)``,
+``date('2010-04-15')-6813`` have the types `boolean`, `integer` and
+`date` respectively.  These are *regular* types.
+
+The following table lists the default set of supported regular data
+types in HTSQL; more data types could be added by *HTSQL extensions*.
+
++----------------------+---------------------------+---------------------------+
+| Type                 | Description               | Example Input             |
++======================+===========================+===========================+
+| `boolean`            | logical data type, with   | ``true()``                |
+|                      | two values: *TRUE* and    +---------------------------+
+|                      | *FALSE*                   | ``false()``               |
++----------------------+---------------------------+---------------------------+
+| `integer`            | binary integer type       | ``4096``                  |
++----------------------+---------------------------+---------------------------+
+| `decimal`            | arbitrary-precision       | ``124.49``                |
+|                      | exact numeric type        |                           |
++----------------------+---------------------------+---------------------------+
+| `float`              | IEEE 754 floating-point   | ``271828e-5``             |
+|                      | inexact numeric type      |                           |
++----------------------+---------------------------+---------------------------+
+| `string`             | text data type            | ``string('HTSQL')``       |
++----------------------+---------------------------+---------------------------+
+| `enum`               | enumeration data type,    |                           |
+|                      | with predefined set of    |                           |
+|                      | valid string values       |                           |
++----------------------+---------------------------+---------------------------+
+| `date`               | calendar date             | ``date('2010-04-15')``    |
++----------------------+---------------------------+---------------------------+
+| `time`               | time of day               | ``time('20:13:04.5')``    |
++----------------------+---------------------------+---------------------------+
+| `datetime`           | date and time combined    | |datetime-in|             |
++----------------------+---------------------------+---------------------------+
+| `opaque`             | unrecognized data type    |                           |
++----------------------+---------------------------+---------------------------+
+
+.. |datetime-in| replace:: ``datetime('2010-04-15 20:13:04.5')``
+
+Some HTSQL expressions do not produce a proper value and therefore
+cannot be assigned a regular data type.  In this case, the expression is
+assigned one of the *special* data types: `record`, `untyped` or `void`.
+
+Record entities are assigned the `record` type.  This type is special
+since values of this type are never displayed directly and it has no
+corresponding SQL data type.
+
+Quoted HTSQL literals have no intrinsic data type; their actual type is
+determined from the context.  Until it is determined, HTSQL translator
+assign them a temporary `untyped` type.
+
+Some expressions, such as assignments, produce no values and therefore
+have no meaningful data type.   In this case, the assigned type is
+`void`.
+
+The following table lists supported special data types.
+
++----------------------+---------------------------+---------------------------+
+| Type                 | Description               | Example Input             |
++======================+===========================+===========================+
+| `record`             | type of record entities   | ``school``                |
++----------------------+---------------------------+---------------------------+
+| `untyped`            | initial type of quoted    | ``'HTSQL'``               |
+|                      | literals                  |                           |
++----------------------+---------------------------+---------------------------+
+| `void`               | type without any valid    |                           |
+|                      | values                    |                           |
++----------------------+---------------------------+---------------------------+
+
+Literal Expressions
+-------------------
+
+A literal expression is an atomic expression that represents a fixed
+value.  HTSQL supports two types of literals: *numeric* (or unquoted) and
+*quoted*.
+
+An unquoted literal is a number written in one of the following forms:
+
+* an integer number
+* a number with a decimal point
+* a number in exponential notation
+
+.. htsql:: /{60, 2.125, 271828e-5}
+
+Literals in these forms are assigned `integer`, `decimal` and `float`
+types respectively.
+
+A quoted literal is an arbitrary string value enclosed in single quotes.
+
+.. htsql:: /{'', 'HTSQL', 'O''Reilly'}
+
+In this example, three literal expressions represent an empty string,
+*HTSQL* and *O'Reilly* respectively.  Note that to represent a single
+quote in the value, we must duplicate it.
+
+As opposed to numeric literals, quoted literals have no intrinsic type,
+their type is determined from the context.  Specifically, the type of
+a quoted literal is inferred from the innermost expression that contains
+the literal.  Until the actual data type of a quoted literal is
+determined, the literal is assigned an `untyped` type.
+
+Consider a query:
+
+.. htsql:: /2+2='4'
+
+Here, a quoted literal ``'4'`` is a right operand of an equality
+expression, and its left counterpart ``2+2`` has the type `integer`.
+Therefore, HTSQL processor is able to infer `integer` for the literal
+``'4'``.
+
+There is no generic rule how to determine the type of a quoted literal;
+every operator and function have different rules how to treat untyped
+values.  However the content of the literal is never examined when
+determining its data type.  It is possible to explicitly specify the
+type of an unquoted literal by applying a *cast* operator.
+
+.. htsql:: /{string('2010-04-15'), date('2010-04-15')}
+
+Here, the same quoted literal is converted to `string` and `date` data
+types respectively.  Each data type has a set of quoted literals it
+accepts; it is an error when the quoted literal does not obey the format
+expected by a particular type.
+
+.. htsql:: /{integer('HTSQL')}
+   :error:
+
+Note the error generated because ``'HTSQL'`` is not a valid format for
+an integer literal.
+
+Type Conversion
+---------------
+
+Expressions of one type could be explicitly converted to another type
+using a *cast* function.  A cast function is a regular function with one
+argument; the name of the function coincides with the name of the target
+type.
+
+Not every conversion is permitted; for instance, an integer value could
+be converted to a string, but not to a date:
+
+.. htsql:: /string(60)
+
+.. htsql:: /date(60)
+   :error:
+
+Implicit type conversion is called *coercion*.  In an arithmetic
+formulas and other expressions that require homogeneous arguments, when
+the operands are of different types, values of less generic types are
+converted to the most generic type.  The order of conversion is as
+follows:
+
+* `integer`
+* `decimal`
+* `float`
+
+
+
+Boolean
+-------
+
+Type `boolean` is a logical data type with two values: *TRUE*
+and *FALSE*.
+
+.. htsql:: /{boolean('true'), boolean('false')}
+
+.. htsql:: /{true(), false()}
+
+The following table maps the `boolean` type to respective
+native data types.
+
++----------------------+---------------------------+
+| Backend              | Native types              |
++======================+===========================+
+| *sqlite*             | ``BOOL``, ``BOOLEAN`` or  |
+|                      | any type containing       |
+|                      | ``BOOL`` in its name      |
++----------------------+---------------------------+
+| *pgsql*              | ``BOOLEAN``               |
++----------------------+---------------------------+
+| *mysql*              | ``BOOL`` aka ``BOOLEAN``  |
+|                      | aka ``TINYINT(1)``        |
++----------------------+---------------------------+
+| *oracle*             | |oracle-native|           |
++----------------------+---------------------------+
+| *mssql*              | ``BIT``                   |
++----------------------+---------------------------+
+
+.. |oracle-native| replace:: ``NUMBER(1) CHECK (X IN (0, 1))``
+
+
+
+
+
 
 Special Data Types
 ==================
 
-+----------------------+---------------------------+---------------------------+----------------------+
-| Type                 | Description               | Example Input             | Output               |
-+======================+===========================+===========================+======================+
-| `untyped`            | initially assigned type   | ``'HTSQL'``               |                      |
-|                      | of quoted literals        |                           |                      |
-+----------------------+---------------------------+---------------------------+----------------------+
-| `tuple`              | type of chain expressions |                           |                      |
-+----------------------+---------------------------+---------------------------+----------------------+
-| `void`               | type without any valid    |                           |                      |
-|                      | values                    |                           |                      |
-+----------------------+---------------------------+---------------------------+----------------------+
-
 
 Function Syntax
 ===============
Tip: Filter by directory path e.g. /media app.js to search for public/media/app.js.
Tip: Use camelCasing e.g. ProjME to search for ProjectModifiedEvent.java.
Tip: Filter by extension type e.g. /repo .js to search for all .js files in the /repo directory.
Tip: Separate your search with spaces e.g. /ssh pom.xml to search for src/ssh/pom.xml.
Tip: Use ↑ and ↓ arrow keys to navigate and return to view the file.
Tip: You can also navigate files with Ctrl+j (next) and Ctrl+k (previous) and view the file with Ctrl+o.
Tip: You can also navigate files with Alt+j (next) and Alt+k (previous) and view the file with Alt+o.