Commits

Mike Bayer committed 7893356

content

Comments (0)

Files changed (2)

handout/source/glossary.rst

 
     candidate key
 
-        A relational algebra term referring to an attribute or set
+        A :term:`relational algebra` term referring to an attribute or set
         of attributes that form a uniquely identifying key for a
         row.  A row may have more than one candidate key, each of which
         is suitable for use as the primary key of that row.
 
         .. seealso::
 
+            :ref:`normalization`
+
             http://en.wikipedia.org/wiki/Database_normalization
 
     relational model
 .. glossary::
     :sorted:
 
-    orphan
-        A mapped instance with a severed link to a collection or parent object.
-
     threadlocal
         A shared data structure whose data members are visible only to
         the thread which set them. The concept of "thread local" in
             http://docs.python.org/2/library/threading.html#threading.local
 
     reflection
-        The process of constructing SQLAlchemy Table objects
-        programatically at runtime by querying a live database's
-        system tables for column and key definitions.
+        The process of constructing SQLAlchemy :class:`~sqlalchemy.schema.Table`
+        objects in an automated or semi-automated fashion, where information about
+        tables, columns and constraints are loaded from an existing
+        database's internal catalogs in order to compose in-memory
+        structures representing a schema.
+
+        .. seealso::
 
+            :ref:`metadata_reflection`
 
     engine
         An object that provides a source of database connectivity.  The
         track of changes to values, and also :term:`lazy load` unloaded values
         and collections from the database when the attribute is accessed.
 
-    instrumentation
-    instrumented
-        Instrumentation refers to the process of augmenting the functionality
-        and attribute set of a particular class.   Ideally, the
-        behavior of the class should remain close to a regular
-        class, except that additional behviors and features are
-        made available.  The SQLAlchemy :term:`mapping` process,
-        among other things, adds database-enabled :term:`descriptors`
-        to a mapped
-        class which each represent a particular database column
-        or relationship to a related class.
-
-
     lazy load
     lazy loads
     lazy loading
 
             :ref:`sqla:session_object_states`
 
+    one to many
+        A style of :func:`~sqlalchemy.orm.relationship` which links
+        the primary key of the parent mapper's table to the foreign
+        key of a related table.   Each unique parent object can
+        then refer to zero or more unique related objects.
+
+        The related objects in turn will have an implicit or
+        explicit :term:`many to one` relationship to their parent
+        object.
+
+        An example one to many schema (which note is identical
+        to the :term:`many to one` schema):
+
+        .. sourcecode:: sql
+
+            CREATE TABLE department (
+                id INTEGER PRIMARY KEY,
+                name VARCHAR(30)
+            )
+
+            CREATE TABLE employee (
+                id INTEGER PRIMARY KEY,
+                name VARCHAR(30),
+                dep_id INTEGER REFERENCES department(id)
+            )
+
+        The relationship from ``department`` to ``employee`` is
+        one to many, since many employee records can be associated with a
+        single department.  A SQLAlchemy mapping might look like::
+
+            class Department(Base):
+                __tablename__ = 'department'
+                id = Column(Integer, primary_key=True)
+                name = Column(String(30))
+                employees = relationship("Employee")
+
+            class Employee(Base):
+                __tablename__ = 'employee'
+                id = Column(Integer, primary_key=True)
+                name = Column(String(30))
+                dep_id = Column(Integer, ForeignKey('department.id'))
+
+        .. seealso::
+
+            :term:`relationship`
+
+            :term:`many to one`
+
+            :term:`backref`
+
+    many to one
+        A style of :func:`~sqlalchemy.orm.relationship` which links
+        a foreign key in the parent mapper's table to the primary
+        key of a related table.   Each parent object can
+        then refer to exactly zero or one related object.
+
+        The related objects in turn will have an implicit or
+        explicit :term:`one to many` relationship to any number
+        of parent objects that refer to them.
+
+        An example many to one schema (which note is identical
+        to the :term:`one to many` schema):
+
+        .. sourcecode:: sql
+
+            CREATE TABLE department (
+                id INTEGER PRIMARY KEY,
+                name VARCHAR(30)
+            )
+
+            CREATE TABLE employee (
+                id INTEGER PRIMARY KEY,
+                name VARCHAR(30),
+                dep_id INTEGER REFERENCES department(id)
+            )
+
+
+        The relationship from ``employee`` to ``department`` is
+        many to one, since many employee records can be associated with a
+        single department.  A SQLAlchemy mapping might look like::
+
+            class Department(Base):
+                __tablename__ = 'department'
+                id = Column(Integer, primary_key=True)
+                name = Column(String(30))
+
+            class Employee(Base):
+                __tablename__ = 'employee'
+                id = Column(Integer, primary_key=True)
+                name = Column(String(30))
+                dep_id = Column(Integer, ForeignKey('department.id'))
+                department = relationship("Department")
+
+        .. seealso::
+
+            :term:`relationship`
+
+            :term:`one to many`
+
+            :term:`backref`
+
+    backref
+        An extension to the :term:`relationship` system whereby two
+        distinct :func:`~sqlalchemy.orm.relationship` objects can be
+        mutually associated with each other, such that they coordinate
+        in memory as changes occur to either side.   The most common
+        way these two relationships are constructed is by using
+        the :func:`~sqlalchemy.orm.relationship` function explicitly
+        for one side, specifying the ``backref`` keyword to it so that
+        the other :func:`~sqlalchemy.orm.relationship` is created
+        automatically.  We can illustrate this against the example we've
+        used in :term:`one to many` as follows::
+
+            class Department(Base):
+                __tablename__ = 'department'
+                id = Column(Integer, primary_key=True)
+                name = Column(String(30))
+                employees = relationship("Employee", backref="department")
+
+            class Employee(Base):
+                __tablename__ = 'employee'
+                id = Column(Integer, primary_key=True)
+                name = Column(String(30))
+                dep_id = Column(Integer, ForeignKey('department.id'))
+
+        A backref can be applied to any relationship, including one to many,
+        many to one, and :term:`many to many`.
+
+        .. seealso::
+
+            :term:`relationship`
+
+            :term:`one to many`
+
+            :term:`many to one`
+
+            :term:`many to many`
 
     many to many
         A style of :func:`sqlalchemy.orm.relationship` which links two tables together
 
     relationship
     relationships
-        In SQLAlchemy, the junction of two mapped classes, or of a
-        mapped class to itself.  The relationship usually corresponds
-        to a foreign key relationship between two tables or
-        selectables.
+        A connecting unit between two mapped classes, corresponding
+        to some relationship between the two tables in the database.
+
+        The relationship is defined using the SQLAlchemy function
+        :func:`~sqlalchemy.orm.relationship`.   Once created, SQLAlchemy
+        inspects the arguments and underlying mappings involved
+        in order to classify the relationship as one of three types:
+        :term:`one to many`, :term:`many to one`, or :term:`many to many`.
+        With this classification, the relationship construct
+        handles the task of persisting the appropriate linkages
+        in the database in response to in-memory object associations,
+        as well as the job of loading object references and collections
+        into memory based on the current linkages in the
+        database.
 
         .. seealso::
 
             :ref:`sqla:relationship_config_toplevel`
 
-    scoped_session
-        A front end for sessionmaker which provides a "global"
-        registry of sessions, each mapped to the current thread.
+    scoped session
+        A helper object intended to provide a *registry* of
+        :term:`session` objects, allowing an application to refer
+        to the registry as a global variable which provides
+        access to a contextually appropriate session object.
+
+        The scoped session object is an optional construct
+        often used with web applications.
 
         .. seealso::
 
+            :term:`Session`
+
             :ref:`sqla:unitofwork_contextual` - an in-depth
             introduction to the :class:`sqlalchemy.orm.scoped_session` object.
 
+
     selectable
-        What relational algebra refers to as a relation, SQLAlchemy
-        refers to as a selectable. A table, subquery, or any other
-        table-valued SQL expression.
+        Refers to the SQLAlchemy analogue for a "relation" in relational
+        algebra, which is any object that represents a series of
+        rows in a database.   "Selectable"
+        actually refers in the API to objects that extend from the
+        :class:`sqlalchemy.sql.expression.Selectable` class, and
+        refers to such row-representing constructs as the
+        :class:`~sqlalchemy.schema.Table`, the :class:`~sqlalchemy.sql.expression.Join`,
+        and the :class:`~sqlalchemy.sql.expression.Select`
+        construct.
 
     Session
         The container or scope for ORM database operations. Sessions
 
             :ref:`session_toplevel`
 
-    session transaction
-        ORM-level transaction. Session activity may span multiple
-        databases, and the session transaction coordinates a
-        connection-level transaction for each. Database features such
-        as save points and two-phase transactions are also supported.
-
     sessionmaker
-        An optional, configurable factory object used to create new
-        Session instances using a chosen set of construction
-        arguments.
+        A *factory* for :term:`session` objects.   The :class:`~sqlalchemy.orm.session.sessionmaker`
+        construct basically allows a series of parameters to be associated
+        with a :class:`~sqlalchemy.orm.session.Session` constructor.
+
+        In reality, the sessionmaker is just slightly more elaborate
+        than a simple function, that is an expression like this::
+
+            from sqlalchemy.orm import sessionmaker
+            my_session = sessionmaker(bind=engine, autoflush=False)
+
+        is conceptually very similar to the following::
+
+            from sqlalchemy.orm import Session
+            my_session = lambda: Session(bind=engine, autoflush=False)
+
+        .. seealso::
+
+            :term:`Session`
 
+            :term:`scoped session`

handout/source/relational.rst

 
 .. _ddl:
 
-Data Definition Language (DDL)
-==============================
+Relational Schemas
+==================
+
+The :term:`schema` refers to a fixed structure configured within a database
+that defines how data will be represented.   The most fundamental unit
+of data within a schema is known as the :term:`table`.
 
 Table
 -----
            REFERENCES department(dep_id)
     )
 
+
+.. _normalization:
+
+Normalization
+-------------
+
+The structure of a relational schema is based on a system known as :term:`relational
+algebra`.  The central philosophy that drives the design of a relational schema
+is a process known as :term:`normalization`, which like most fundamental computer
+science concepts is an entire field of study onto itself.   In practice however,
+normalization usually boils down to a few simple practices that become second
+nature in not too much time.
+
+The general idea of normalization is to eliminate the repetition of data, so that
+any one particular piece of information is represented in exactly one place.
+By doing so, that piece of information becomes one of many atomic units by which
+data can be searched and operated upon.  For example, if hundreds of records
+all refer to a particular date record, we can correlate all those records on this
+single date record strictly based on the association of those identities.
+
+A typical example of denormalized data looks like::
+
+  Employee Language
+  ------------------
+  name        language   department
+  -------     --------   -------------
+  Dilbert     C++        Systems
+  Dilbert     Java       Systems
+  Wally       Python     Engineering
+  Wendy       Scala      Engineering
+  Wendy       Java       Engineering
+
+Normalization theory would claim the above table violates "second normal form"
+because the "non prime" attribute "department" is dependent
+only on the "name" column, and not "language", which means it
+is dependent on a subset
+of the table's  :term:`candidate key`, which would necessarily be the
+composite of "name" and "language"
+(Note that the author is carefully parsing the Wikipedia page for normalization
+here in order to state this correctly).   A proper normalization would use two
+tables along the lines of the following::
+
+  Employee Department
+  -------------------
+  name        department
+  --------    -----------
+  Dilbert     Systems
+  Wally       Engineering
+  Wendy       Engineering
+
+  Employee Language
+  ------------------
+  name        language
+  --------    --------
+  Dilbert     C++
+  Dilbert     Java
+  Wally       Python
+  Wendy       Scala
+  Wendy       Java
+
+While the formal reasoning behind the above change may be difficult to
+parse, a visual inspection of the data reveals more obviously
+how the second form is an improvement; the original version repeats
+duplicate associations between "name" and "department" many times
+according to how many distinct "language" values correspond to a name;
+whereas the second version uses distinct tables so that each "name/department"
+and "name/language" association can be expressed distinctly.
+
+The concept of data constraints, particularly the primary key constraint
+and the foreign key constraint, are designed to work naturally with
+the concept of normalization.   The above schema would be applied to
+constraints by establishing "Employee Department->name" as a primary key,
+establishing "Employee Language->name, language" as a composite primary key,
+and then creating a foreign key such that "Employee Language->name" must
+refer to "Employee Department->name".  When a schema resists being
+organized into simple primary and foreign key relationships, that's often
+a sign that it isn't strongly normalized.
+
+The Wikipedia page on normalization is a great place to learn more about
+normalization, at http://en.wikipedia.org/wiki/Database_normalization.
+
+
 .. _dml:
 
 Data Manipulation Language (DML)
     INSERT INTO employee (emp_id, emp_name, dep_id)
                 VALUES (2, 'wally', 1);
 
+.. topic:: Auto Incrementing Integer Keys
+
+  Most modern databases feature a built-in system of generating incrementing integer
+  values, which are in particular usually used for tables that have surrogate integer
+  primary keys, such as our ``employee`` and ``department`` tables.   For example, when using
+  SQLite, the above ``emp_id`` column will generate an integer value automatically; when
+  using MySQL, an integer primary key declared with ``AUTO INCREMENT`` will do so as well;
+  and on Postgresql, declaring a primary key with the datatype ``SERIAL`` will have
+  the same end effect.  When using these so-called "auto incrementing" primary key
+  generators, we *omit* the column from the INSERT statement:
+
+  .. sourcecode:: sql
+
+      INSERT INTO employee (emp_name, dep_id)
+                  VALUES ('dilbert', 1);
+
+      INSERT INTO employee (emp_name, dep_id)
+                  VALUES ('wally', 1);
+
+  Databases that feature primary key generation systems will also
+  feature some means of acquiring the "generated" integer identifier
+  after the fact, using non-standard SQL extensions and/or functions.
+  When using Postgresql, one such way of reading these generated identifiers
+  is to use ``RETURNING``:
+
+  .. sourcecode:: sql
+
+      INSERT INTO employee (emp_name, dep_id)
+                  VALUES ('dilbert', 1) RETURNING emp_id;
+
+      emp_id
+      ------
+        1
+
+  While every database features a different system of generating and retrieving
+  these keys, we'll generally refer to the style above where the integer primary key
+  can be omitted from an INSERT.   When using SQLAlchemy, one of the most fundamental
+  features it provides is a consistent and transparent system
+  of utilizing the wide variety of usage key generation and retrieval schemes.
+
 
 Updates
 --------
 
 .. sourcecode:: sql
 
-    UPDATE employee SET dep_id=7
-                  WHERE emp_name='dilbert'
+    UPDATE employee SET dep_id=7 WHERE emp_name='dilbert'
 
+When an UPDATE statement like the above one executes, it may match any number of
+rows, including none at all.  An UPDATE statement typically has a "row count"
+value associated with a particular execution, which indicates the number of
+rows that matched the WHERE criteria, and therefore represents that number
+of rows that were subject to the SET clause.
 
 Deletes
 -------
 
     DELETE FROM employee WHERE dep_id=1
 
+Above, all employee records within department id 1 will be deleted.
+
 .. _queries:
 
 Queries
 
        18
 
+Another aggregate expression might return to us the average number
+of employees within departments.   To accomplish this, we also make use of
+the ``GROUP BY`` clause, described below, as well as a :term:`subquery`:
+
+.. sourcecode:: sql
+
+    SELECT avg(emp_count) FROM
+      (SELECT count(*) AS emp_count
+        FROM employee GROUP BY dep_id) AS emp_counts
+
+    ?avg?
+    -----
+      2
+
+Note the above query only takes into account non-empty departments.  To
+include empty departments would require a more complex sub-query that
+takes into account rows from ``department`` as well.
+
 Grouping
 --------
 
Tip: Filter by directory path e.g. /media app.js to search for public/media/app.js.
Tip: Use camelCasing e.g. ProjME to search for ProjectModifiedEvent.java.
Tip: Filter by extension type e.g. /repo .js to search for all .js files in the /repo directory.
Tip: Separate your search with spaces e.g. /ssh pom.xml to search for src/ssh/pom.xml.
Tip: Use ↑ and ↓ arrow keys to navigate and return to view the file.
Tip: You can also navigate files with Ctrl+j (next) and Ctrl+k (previous) and view the file with Ctrl+o.
Tip: You can also navigate files with Alt+j (next) and Alt+k (previous) and view the file with Alt+o.