htsql / doc / ref / model.rst

Data Model

HTSQL is not a full-fledged database system. As opposed to regular data stores, it does not include a storage layer, but relies on a SQL database server to physically store and retrieve data.

HTSQL is designed to work on top of existing relational databases and does not impose any restrictions on how information is modeled and stored there. At the same time, HTSQL works best when the data in the database is highly normalized.

Even though HTSQL wraps a relational database, it does not expose the relational model directly to the users. Instead it derives HTSQL data model from the underlying database and uses this model when presenting data to the users and interpreting user queries. HTSQL data model is very close to traditional network data model utilized by CODASYL and various OODBMS and ORM systems.

In the next sections, we describe HTSQL data model and how it is inferred from the relational model of the underlying database.

Model and Instances

When describing how information is represented by HTSQL, we differentiate between a database model and a database instance.

A database model specifies the structure of the database: what types of business entities are represented and how the entities may relate to each other. A database instance is the actual data in the database and must satisfy the constraints imposed by the model. The difference between a model and an instance is the difference between the shape of data and data itself.

Let's consider the model of a student enrollment system in a fictional university. This model may contain schools, programs administered by a school, departments associated with a school, and courses offered by a department. A concrete instance of this model may contain a school of Engineering with an associated department of Computer Science, which offers a Database Theory course, etc.:

Entities and Relations

As we focus from the database model to a specific instance, classes are populated with values and entities, and each link splits into connections between individual class elements.

A value class is populated with all values of the respective type. Thus, boolean class acquires two values: true and false, integer class is filled with all integer numbers, and so on.

An entity class becomes a set of homogeneous business entities; e.g. school class becomes a set of university schools, department a set of departments, etc.

In HTSQL, individual entities are not observable, only entity attributes are. When we need to refer to a specific entity in writing, we use the value of some entity attribute that can uniquely identify it, enclosed in brackets. For example, attribute school.code uniquely identifies school entities, therefore we may say that [eng], [la], [ns] are respectively entities representing schools of Engineering, of Arts and Humanities, and of Natural Sciences.

The unit class contains a single value, which is called unit and denoted by @.

A link between two classes splits into a binary relation between elements of these classes:

  • A link from the unit class to an entity class connects the unit to every entity in the entity class.
  • A link between two entity classes connects each entity of the origin class to all related entities from the target class.
  • A link from an entity class to a value class connects each entity with the respective attribute value.

The following diagram demonstrates how the path looks for some specific database instance.

Correspondence to the Relational Model

In this section, we explain how underlying relation database model is translated to HTSQL data model.

For the most part, translation of relational structure to HTSQL model is straightforward. SQL data types become value classes, SQL tables become entity classes, table columns become class attributes. Links between entity classes are inferred from FOREIGN KEY constraints.

HTSQL allows the administrator to restrict access to specific tables and columns, configure additional database constraints, and rename link names. In the following sections we describe how HTSQL describes database model in the absence of any configuration.

A name in HTSQL is a sequence of letters, digits and _ characters which doesn't start with a digit. When an HTSQL name is generated from a SQL name which contains non-alphanumeric characters, those are replaced with an underscore (_).

Entity Names

Each SQL table induces an entity class, which, in general, borrows its name from the table.

Some SQL database servers support a notion of schemas, namespaces for tables, which may cause a naming conflict when two or more different schemas have tables with the same name. This conflict is resolved as follows:

  • If one of the schemas is marked as "default" for the purposes of name resolution, the name of the respective table is borrowed unadorned.
  • For the remaining tables, the assigned name has the form <schema>_<name>.

Attribute Names

Each table column induces an entity attribute with the same name.

When the column is a FOREIGN KEY constraint, the column name is also used to refer to the respective entity link. The usage is determined from the context; compare


Link Constraints

Column constraints are trivially translated to properties of the respective attribute links.

  • A NOT NULL constraint on a column means, in HTSQL term, that the respective attribute is total.
  • A UNIQUE constraint indicates that the attribute is unique.
  • A PRIMARY KEY constraint indicates that the attribute is both total and unique. The columns that form a primary key are also used for default ordering on the entity class.
  • A direct link induced by a FOREIGN KEY constraint is always singular. The reverse link is plural in general, but could be singular when the key column is UNIQUE.

An Example

Consider, for example, the following fragment of an SQL schema:


    code                VARCHAR(16) NOT NULL,
    name                VARCHAR(64) NOT NULL,
    campus              VARCHAR(5),
    CONSTRAINT school_pk
      PRIMARY KEY (code),
    CONSTRAINT school_name_uk
      UNIQUE (name),
    CONSTRAINT school_campus_ck
      CHECK (campus IN ('old', 'north', 'south'))

CREATE TABLE ad.department (
    code                VARCHAR(16) NOT NULL,
    name                VARCHAR(64) NOT NULL,
    school_code         VARCHAR(16),
    CONSTRAINT department_pk
      PRIMARY KEY (code),
    CONSTRAINT department_name_uk
      UNIQUE (name),
    CONSTRAINT department_school_fk
      FOREIGN KEY (school_code)

CREATE TABLE ad.program (
    school_code         VARCHAR(16) NOT NULL,
    code                VARCHAR(16) NOT NULL,
    title               VARCHAR(64) NOT NULL,
    degree              CHAR(2),
    part_of_code        VARCHAR(16),
    CONSTRAINT program_pk
      PRIMARY KEY (school_code, code),
    CONSTRAINT program_title_uk
      UNIQUE (title),
    CONSTRAINT program_degree_ck
      CHECK (degree IN ('bs', 'pb', 'ma', 'ba', 'ct', 'ms', 'ph')),
    CONSTRAINT program_school_fk
      FOREIGN KEY (school_code)
   CONSTRAINT program_part_of_fk
      FOREIGN KEY (school_code, part_of_code)
      REFERENCES ad.program(school_code, code)

CREATE TABLE ad.course (
    department_code     VARCHAR(16) NOT NULL,
    no                  INTEGER NOT NULL,
    title               VARCHAR(64) NOT NULL,
    credits             INTEGER,
    description         TEXT,
    CONSTRAINT course_pk
      PRIMARY KEY (department_code, no),
    CONSTRAINT course_title_uk
      UNIQUE (title),
    CONSTRAINT course_dept_fk
      FOREIGN KEY (department_code)
      REFERENCES ad.department(code)

In this schema, four tables, ad.department, ad.program, ad.course generate four entity classes:

Foreign key constraints department_school_fk, program_school_fk, course_dept_fk generate three direct and three reverse links:

A foreign key program_part_of_fk induces two self-referential links on program: