# Using and Configuring HTSQL

## Usage

### Invoking htsql-ctl

Installing HTSQL creates a command-line application htsql-ctl:

$htsql-ctl  The htsql-ctl script is a collection of subcommands called routines. The command-line syntax of htsql-ctl is $ htsql-ctl <routine> [options] [arguments]

• <routine> is the routine name;
• options are any routine options in short (-X) or long (--option-name) form;
• arguments are routine arguments.

To get a list of routines, run:

$htsql-ctl help  To describe a specific routine, run: $ htsql-ctl help <routine>


### Database Connection

Many routines require a connection URI parameter, which specifies how to connect to a database. The connection URI has the form:

<engine>://<user>:<pass>@<host>:<port>/<database>

• <engine> is the type of the database server; sqlite for SQLite, pgsql for PostgreSQL, mysql for MySQL, mssql for MS SQL Server, oracle for Oracle.
• <user>:<pass> are authentication parameters;
• <host>:<port> is the address of the database server;
• <database> is the name of the database.

For SQLite, <user>:<pass> and <host>:<port> are omitted, and <database> specifies the path to the database file. Thus, to connect to SQLite database htsql_demo.db located in the current directory, use the URI:

sqlite:htsql_demo.db


For PostgreSQL, if user:pass is omitted, the credentials of the current user are used; if host:port is omitted, the server is assumed to run on the local machine. Thus, to connect to a database htsql_demo running on the same host under credentials of the current user, use the URI:

pgsql:htsql_demo


Other database servers use similar conventions.

You can use option -p to prompt for a password if you do not want to specify the database password in a command line.

### Command-line Shell

To start a command-line HTSQL shell, run:

$htsql-ctl shell <DBURI>  That starts an interactive HTSQL shell, where you could type and execute HTSQL queries against the specified database. For example, to start the shell on a PostgreSQL database htsql_demo, run: $ htsql-ctl shell pgsql:htsql_demo

Interactive HTSQL Shell
htsql_demo$ For more details on the shell routine, run: $ htsql-ctl help shell


### HTTP Server

To start a demonstration web server running HTSQL, run:

$htsql-ctl server <DBURI> [<HOST> [<PORT>]]  That starts an HTTP server on the address <HOST>:<PORT>. If <HOST> and <PORT> are omitted, the server is started on *:8080. For example, to start the HTSQL web server against PostgreSQL database htsql_demo on localhost:3128, run: $ htsql-ctl server pgsql:htsql_demo localhost 3128

Starting an HTSQL server on localhost:3128 over htsql_demo


If database connection :ref:configuration <configuration> is provided by -C, you could use - as a place holder for the mandatory database URI parameter so that you could provide a HOST and PORT. For example, to run the server on localhost:80 you would write:

# htsql-ctl serve -C demo-config.yaml - localhost 80


For more details on the server routine, run:

$htsql-ctl help server  ## Extension Mechanism HTSQL has an extensive addon system that can be used to override almost every aspect of server operation or query construction with an adapter. Extensions can live in third party modules or be included in the HTSQL distribution as part of our supported "tweaks". To list supported extensions, you could type: $ htsql-ctl extension


To find out more about an extension, such as :ref:tweak.autolimit, write:

$htsql-ctl extension tweak.autolimit  ### Using Extensions An extension can be enabled using -E parameter on the htsql-ctl command line. For example, to enable the :ref:tweak.meta addon on the HTSQL demo database, you'd write: $ htsql-ctl shell pgsql:htsql_demo -E tweak.meta


Then, you could use the /meta() command registered by this addon:

Interactive HTSQL Shell
htsql_demo$/meta(/table)  Some addons have parameters which can be added to the command line. For example, the :ref:tweak.autolimit extension truncates output at limit number of rows. The default is 10k, but this value can be changed: $ htsql-ctl shell pgsql:htsql_demo -E tweak.autolimit:limit=10


If more than one parameter is possible, use "," to separate them:



## Extension Reference

The HTSQL distribution ships with several built-in extensions which we describe here.

### htsql

The HTSQL core that provides the SQL translator and an HTTP server is implemented in form of an addon htsql. This addon is always included.

Parameters:

db
The connection URI.

The parameter db specifies how HTSQL connects to the database. It could be written in a short or an expanded form. In the short form, the parameter is an URI:

htsql:
db: pgsql://htsql_demo:secret@localhost:5432/htsql_demo


In the expanded form, a mapping notation is used:

htsql:
db:
engine: pgsql
host: localhost
port: 5432
database: htsql_demo


Every component except engine and database is optional.

The parameter password allows you to override the database password keeping the other connection parameters intact.

debug
Enable debugging output.

When debug option is set, text output, as well as some error messages will contain a respective SQL query.

### engine.*

Extensions in the engine namespace implement database backends. An appropriate extension is loaded automatically based on the engine parameter of the connection URI.

HTSQL supports the following database servers:

• SQLite 3+ (engine.sqlite)
• PostgreSQL 8.3+ (engine.pgsql)
• MySQL 5.1+ (engine.mysql)
• Oracle 10g+ (engine.oracle)
• Microsoft SQL Server 2005+ (engine.mssql)

### tweak.autolimit

This addon truncates query output to a given number of rows (10,000 by default). It could be used to prevent accidental denial of service caused by queries returning a large number of rows.

Parameters:

limit
Truncation threshold (default: 10,000).
tweak.autolimit:
limit: 1000


### tweak.cors

This addon adds CORS support to permit AJAX requests to the HTSQL server by web pages hosted on a different domain.

To prevent data leaks, web browsers do not allow AJAX requests to cross domain boundaries. The CORS (Cross-Origin Resource Sharing) specification defines a way for a server to provide a list of domains which are permitted to make AJAX requests.

CORS relies on browser support and may not work with older web browsers.

Parameters:

origin
Domains allowed to access the server (default: *).

The origin parameter is a list of domains which are allowed to access the server. The value must either be * (means any, which is the default) or a space-separated list of host names:

http[s]://domain[:port]


Warning

The default settings permit HTSQL queries from any domain. Do not use the default settings with non-public data!

For example, to enable AJAX requests to the HTSQL demo server (located at http://demo.htsql.org/) from domains http://htsql.org/ and http://htsql.com/, we could write:

tweak.cors:
origin: http://htsql.org http://htsql.com


### tweak.csrf

This extension provides protection against cross-site request forgery (CSRF) attacks.

A CSRF attack tricks the user to visit the attacker's website, which then submits database queries to the HTSQL service from the user's account. Even though the browser would not permit the malicious website to read the output of the queries, this form of attack can be used for denial of service or changing the data in the database. For background on CSRF, see http://en.wikipedia.org/wiki/Cross-site_request_forgery.

This addon requires all HTSQL requests to submit a secret token in two forms:

If the token is not submitted, the addon prevents the request from reading or updating any data in the database.

Parameters:

If set, a request is permitted to read data from the database even when the secret token is not provided.
allow_cs_write
If set, a request is permitted to update data in the database even if the secret token is not provided.
tweak.csrf:
allow_cs_write: false


### tweak.django

This extension provides integration with Django web framework. It replaces built-in database introspection and connection handling with facilities provided by Django.

When using tweak.django addon, do not specify the connection URI as it is determined from the Django project database settings.

Parameters:

settings
Path to the settings module (default: use DJANGO_SETTINGS_MODULE).
tweak.django:
settings: mysite.settings


### tweak.etl

This extension provides the following commands:

insert(feed)
update(feed)
merge(feed)
delete(feed)
Deletes records from a table.
truncate(name)
Truncates a table.
do(command, ...)
Performs a series of command in a single transaction.
tweak.etl:


Warning

This extension is work-in-progress. The interface and implementation of commands may change in the future. Currently only PostgreSQL is supported.

### tweak.filedb

This extension makes a database from a set of CSV files. Each source CSV file becomes a table in the database. The name of the table is derived from the file name; the column names are taken from the first row of the CSV file. The remaining rows become the records in the table.

The database is realized as an in-memory SQLite database. Use optional parameter cache-file to specify a persistent storage for the database.

Parameters:

sources

List of entries describing the source files; each entry has the following fields:

file
The path to the CSV file.
cache-file
Persistent storage for the database.
tweak.filedb:
sources:
- file: school.csv
- file: department.csv
- file: program.csv
- file: course.csv
cache-file: cache.sqlite


### tweak.gateway

This extensions allows you to create a gateway to another database and execute HTSQL queries against it.

Parameters:

gateways

A mapping of names to datebase configuration.

Each mapping entry creates a function which takes a query as a parameter and execute it against the gateway database.

Database configuration could be either connection URI or nested HTSQL configuration.

tweak.gateway:
gateways:
sqlite_gw: sqlite:database.sqlite
remote_gw:
htsql:
db:
engine: pgsql
database: remote_db


The gateway name becomes a command which executes a query against the gateway database:

/table/:sqlite_gw


### tweak.meta

This extension provides a meta database describing tables, columns and links of the primary database.

The tweak.meta addon has no parameters.

tweak.meta:


The meta database is composed of the following tables:

table
all available tables
field
columns and links for a given table
column
all columns for a given table
all links from one table to another

Use function meta() to make a query against the meta database.

To get a list of tables:

Mapping call notation is also permitted:

To list all columns of a given table in the output order:

To get all links to and from a specific table:

To describe the meta database itself, apply meta() twice:

tweak.meta:


### tweak.override

This extension provides several ways to adjust database metadata. It allows the user to restrict access to specific tables and columns, specify additional database constraints, change the generated names for tables, columns and links, and define calculated attributes.

Parameters:

included-tables
Tables permitted to use.
excluded-tables
Tables forbidden to use.
included-columns
Columns permitted to use.
excluded-columns
Columns forbidden to use.
not-nulls
unique-keys
Additional PRIMARY KEY and UNIQUE constraints.
foreign-keys
class-labels
Labels for tables and top-level calculations.
field-labels
Labels for columns, links and calculated fields.
field-orders
Default table fields.
unlabeled-tables
Tables to hide.
unlabeled-columns
Columns to hide.
globals
Global definitions.

To restrict access to a specific set of tables, use parameters included-tables and excluded-tables. Parameter included-tables is a list of tables allowed to be used by HTSQL. If this parameter is provided, any table not in this list is completely hidden from the HTSQL processor. Parameter excluded-tables allows you to forbid access to a set of tables.

To forbid use of table confidential:

tweak.override:
excluded-tables: [confidential]


To allow access only to tables in ad and ed schemas:

tweak.override:


We could also use block form of a sequence:

tweak.override:
included-tables:
- ed.*


In general, the table name may have the form <table> or <schema>.<table> and could include * meta-character to indicate any number of characters. Table names are case-insensitive and normalized: any non-alphanumeric character is replaced with _.

Similarly, to restrict access to a specific set of columns, use parameters included-columns and excluded-columns. Parameter exclude-columns is a list of column forbidden for use by the HTSQL processor.

To exclude column SSN of table confidential, write:

tweak.override:
excluded-columns: [confidential.ssn]


The column name may have the form <column>, <table>.<column>, or <schema>.<table>.<column> and could include * meta-character.

Note that columns listed in excluded-columns are removed together with all associated key constraints. If you want to hide a column from output, but keep associated primary and foreign keys, use the unlabeled-columns parameter.

HTSQL discovers database constraints from the schema definition. If some constraints are not explicitly defined in the schema, you may provide them using parameters not-nulls, unique-keys and foreign-keys.

Warning

When specifying additional constraints, make sure they are respected by the data; otherwise, the output produced by HTSQL may be invalid.

Parameter not-nulls is a list of columns with NOT NULL constraints.

To indicate that all columns named code and id, as well as column student.full_name do not contain NULL value, write:

tweak.override:
not-nulls: ["*.code", "*.id", student.full_name]


Note that we need to put the column patterns into quotes since YAML syntax does not permit * character at the beginning of a scalar value.

Parameter unique-keys is a list of key specifications of the form <table>(<column>,...)[!]. The trailing symbol ! indicates a PRIMARY KEY constraint. All columns in a primary key are marked as NOT NULL.

To indicate that school.code is a primary key and school.name is unique, write:

tweak.override:
unique-keys:
- school(code)!
- school(name)


Parameter foreign-keys is a list of foreign key specifications, which have the form <origin>(<column>,...) -> <target>(<column>,...). Target columns could be omitted when they coincide with the target primary key.

To define two foreign keys on table program, write:

tweak.override:
foreign-keys:
- program(school_code) -> school(code)
- program(school_code, part_of_code) -> program


In HTSQL, database tables, columns and links have a label, an identifier by which they are referred in HTSQL queries. Normally, entity labels coincide with their names, but parameters class-labels and field-labels allow you to assign them arbitrary labels. In addition, these parameters allow you to assign a label to an arbitrary HTSQL expression.

Use parameter class-labels to assign custom labels to tables and top-level HTSQL expressions. Parameter class-labels is a mapping; each key is a label, the corresponding value is either a table name or an HTSQL expression enclosed in parentheses.

To rename table classification to c14n and to assign a label to expression school^campus, write:

tweak.override:
class-labels:
c14n: classification
campus: (school^campus)


Calculated classes defined by class-labels may accept parameters. For instance, to add a class students_by_year() which takes the year of admission as an argument, write:

tweak.override:
class-labels:
students_by_year($year): (student?year(start_date)=$year)


Use parameter field-labels to assign custom labels to table fields. This parameter is a mapping; each key has a form <table>.<field>, where <table> is the table label, <field> is the field label to define. The corresponding value is one of:

• a column name;
• an HTSQL expression enclosed in parentheses.

A link specification is a comma-separated list of patterns <origin>(<column>,...) -> <target>(<column>,...). Each pattern must match a foreign key or a reverse foreign key. Column lists could be omitted if the foreign key could be determined uniquely.

To rename a column student.name to full_name, write:

tweak.override:
field-labels:
student.full_name: name


To add a many-to-many link between student and class via enrollment table, write:

tweak.override:
field-labels:
student.class: student -> enrollment, enrollment -> class
class.student: class -> enrollment, enrollment -> student


Note that link specifier student -> enrollment uniquely matches foreign key enrollment(student_id) -> student(id) while enrollment -> class matches foreign key enrollment(class_seq) -> class(class_seq) so we do not need to provide column lists.

The self-referential link from program to all included programs is called, by default, program.program_via_part_of. To assign a different label to this link, write:

tweak.override:
field-labels:
program.includes:
program(school_code, code) -> program(school_code, part_of_code)


To define a calculated field student.avg_grade, write:

tweak.override:
field-labels:


Calculated fields may accept a parameter. To define a calculated field department.students_by_year() accepting the year of admission as a parameter, write:

tweak.override:
field-labels:
department.students_by_year($year): (student?year(start_date)=$year)


By default, when an HTSQL query does not contain a selector expression, all table columns are displayed. To set a custom list of fields for this case, use parameter field-orders.

tweak.override:
field-orders:
program: [code, title, degree]


Parameter unlabeled-tables is a list of tables without an assigned labels, which effectively hides the tables from the users. The tables could still be used in SQL generated by the HTSQL translator.

tweak.override:
unlabeled-tables: [enrollment]


Parameter unlabeled-columns is a list of columns without an assigned tables. Unlabeled columns are hidden from the users, but could be used in SQL generated by the HTSQL translator.

To hide all id columns, write:

tweak.override:
unlabeled-columns: [id, "*_id"]


Use parameter globals define global attributes and functions. This parameter is a mapping: each key is the attribute name with an optional list of parameters, the value is an HTSQL expression.

tweak.override:
globals:
num_school: (count(@school))
trunc_month($d): (date(year($d), month(\$d), 1))


### tweak.pool

This addons caches open database connections so that the same connection could be reused to execute more than one query. Use this addon with backends where opening a database connection is an expensive operation.

tweak.pool:


### tweak.resource

This extension adds a mechanism for serving static files via HTTP. This mechanism is used by other extensions to provide access to static resources such as Javascript and CSS files.

Parameters:

indicator
HTTP root for static files, excluding leading and trailing / (default: -)

Usually you don't need to enable this extension explicitly as it is done by extensions with static resources.

### tweak.shell

This extension adds an in-browser HTSQL editor called the HTSQL shell. The shell provides a visual query editor (based on CodeMirror) with support for syntax highlighting and code completion.

The shell is invoked by command /shell(), which takes an optional query to edit.

Parameters:

server-root
The root URL of the HTSQL server (default: guess)
limit
Truncation threshold for shell output (default: 1000)
tweak.shell:
server-root: http://demo.htsql.org
limit: 100


### tweak.shell.default

Enable addon tweak.shell.default to make the shell the default output format.

tweak.shell.default:


### tweak.sqlalchemy

This extension provides integration with SQLAlchemy toolkit. It replaces built-in HTSQL database introspection and connection handling with SQLAlchemy facilities.

When using tweak.sqlalchemy addon, do not specify the connection URI as it is determined from the SQLAlchemy engine settings.

Parameters:

engine
The SQLAlchemy engine object.

The value must have the form <module>.<attr> or <package>.<module>.<attr>.

tweak.sqlalchemy:


### tweak.system

Currently, only PostgreSQL backend is supported.

### tweak.timeout

This extension limits query execution to a given amount of time (1 minute by default). Use it to ensure against accidental denial of service caused by complex queries.

Parameters:

timeout
The timeout value, in seconds (default: 60).
tweak.timeout:
timeout: 300


Currently, this addon is only supported with PostgreSQL.