Enhanced warehouse rest interface

Issue #665 resolved
Brian Lewis repo owner created an issue

Provide some additional flexibility when reading enrolment data from warehouse via the REST API

Comments (8)

  1. Ghislain Hachey

    @Brian Lewis at some point very soon we need to lock our REST API as it will increasingly be used by external parties, in particular the Pacific Open Education App on Android and iOS devices. So once locked we should have something like this /api/v1/… and any changes to the API will then be in /api/v2/… leaving the version 1 untouched and time for external parties to upgrade.

  2. Brian Lewis reporter

    Enrolments endpoints:

    enrol/district - disaggregated by district

    enrol/nation - nation totals

    Next step is to indicate whether Gender is denormalised (i.e. there are fields EnrolM, EnrolF and Enrol on each row) or whether GenderCode is on a separate row, with the data value in Enrol.

    The former (denormalised) format is called Report Format, because it is easier for reporting tools. The latter format is better for ‘cube’ processors: pivot tables, Tableau etc.

    To get the report format version, add ‘r’:

    enrol/district/r

    enrol/nation/r

    Next, you can control some filtering and aggregation by query parameters

    enrol/district parameters:

    district - single district code to filter on. If not supplied, all districts are returned.

    byAge - disaggregate by age. Can be true or false; if the query parameter is present with no value, true is assumed. If the parameter is not present, the value is false.

    byClassLevel - disaggregate by class level. As for byAge, can be true, false, or present with no value. If the parameter is not present, the value is TRUE.

    e.g.

    enrol/district?district=CHK

    enrol/district?byAge

    enrol/district?byAge&byClassLevel=false

    enrol/district?district=KSA&byAge&byClassLevel

    enrol/district/r?district=CHK

    enrol/district/r?byAge

    enrol/district/r?byAge&byClassLevel=false

    enrol/district/r?district=KSA&byAge&byClassLevel

    enrol/nation parameters:

    byAge and byClassLevel as above

    e.g.

    enrol/nation?byAge

    enrol/nation?byAge&byClassLevel=false

    enrol/nation/r?byAge

    enrol/nation/r?byAge&byClassLevel=false

    enrol/nation/r?byAge&byClassLevel
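
    The true / false / present-with-no-value rules above can be sketched as a small helper, which is handy for predicting what a given URL will return. This is only an illustration of the rules as described, not the server's code. The function name is hypothetical, and two points are assumptions: parameter names are matched case-insensitively, and any value other than "true" is read as false.

```typescript
// Resolve a flag such as byAge or byClassLevel from a raw query string.
//   name=true / name=false -> that value
//   name (no value)        -> true
//   name absent            -> the default for that parameter
function resolveFlag(query: string, name: string, defaultValue: boolean): boolean {
  const pairs = query.replace(/^\?/, "").split("&").filter((p) => p.length > 0);
  for (const pair of pairs) {
    const eq = pair.indexOf("=");
    const key = eq === -1 ? pair : pair.slice(0, eq);
    // Assumption: parameter names are case-insensitive.
    if (key.toLowerCase() !== name.toLowerCase()) continue;
    if (eq === -1) return true; // present with no value
    const value = pair.slice(eq + 1).toLowerCase();
    // Assumption: an empty value counts as true, anything but "true" as false.
    return value === "" ? true : value === "true";
  }
  return defaultValue; // parameter not present
}

// Defaults from the notes above: byAge is false, byClassLevel is true.
const wantsAge = resolveFlag("?byAge&byClassLevel=false", "byAge", false);
const wantsClassLevel = resolveFlag("?byAge&byClassLevel=false", "byClassLevel", true);
```

    Here wantsAge is true and wantsClassLevel is false, matching enrol/district?byAge&byClassLevel=false.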

  3. Brian Lewis reporter

    Using the Pacific EMIS Warehouse

    Throughout these notes, we'll use <emis_root> to designate the base URL of the EMIS system.
    i.e. this will be fedemis.doe.fm for FSM

    Logging In

    Login endpoint: <emis_root>/api/token

    Pass the user account name and password as form-urlencoded data, like so:

    POST https://localhost:44301/api/token HTTP/1.1
    Host: localhost:44301
    Accept: application/json, text/plain, */*
    Origin: https://localhost:44301
    Content-Type: application/x-www-form-urlencoded
    Accept-Encoding: gzip, deflate, br
    Accept-Language: en-AU,en-GB;q=0.9,en-US;q=0.8,en;q=0.7

    ....(other headers may be present by default)

    grant_type=password&username=yourusername&password=yourpassword

    On successful login (status 200) you will receive a JSON package in reply. Most of this information is relevant to the Pacific EMIS front end (configuration of menu, permissions etc.), but the most important thing here is the access_token.

    In all subsequent calls to the EMIS, add an Authorization header of type "Bearer" using this token:

    Host: localhost:44301
    Connection: keep-alive
    Content-Length: 56
    Accept: application/json, text/plain, */*
    Origin: https://localhost:44301
    Authorization: Bearer j7Yqqk4vVOZT8qGZ3Nt09Nv9XXYCk6hKWtoLra7CBJZZ0YZqq17PrlO0c0TCc2knuUzKLz4H96BHID33uqrJvkUJ7V16qNA5iZbSgAry6_Om3WL-fUELm5AOpYpdiAdvo8qNWBe (....etc)

    If the token times out, you will get a 401 Unauthorized status. You should go through the login again to get a fresh token.
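
    As a rough sketch of the login flow above, the request body and the follow-up headers could be assembled like this. The function names are hypothetical; only grant_type, username, password, access_token and the Bearer header come from the notes. Sending the request (fetch, XMLHttpRequest, etc.) is left to the caller.

```typescript
// Build the form-urlencoded body for POST <emis_root>/api/token.
function buildTokenRequestBody(username: string, password: string): string {
  const fields: [string, string][] = [
    ["grant_type", "password"],
    ["username", username],
    ["password", password],
  ];
  return fields
    .map(([key, value]) => encodeURIComponent(key) + "=" + encodeURIComponent(value))
    .join("&");
}

// Headers to attach to every later call, given the access_token from the login reply.
function authHeaders(accessToken: string): { [name: string]: string } {
  return {
    Accept: "application/json",
    Authorization: "Bearer " + accessToken,
  };
}

// Hypothetical helper: a 401 means the token has timed out and a fresh login is needed.
function needsFreshToken(status: number): boolean {
  return status === 401;
}
```

    The body is sent with Content-Type: application/x-www-form-urlencoded, as in the example request. Special characters in the user name or password are percent-encoded by encodeURIComponent.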

    Lookup Lists

    You may find it useful to have lookup lists available for interactive selection of data.

    End point: <emis_root>/api/lookups/collection/core

    This returns JSON data: a single JSON object where each property is a lookup list. Each item in the list has properties C (code) and N (name).
    Where there is a hierarchy of lookups, a list may include an additional property. For example, authorities has a property T, which points to authorityTypes; authorityTypes in turn has property G, which points to authorityGovt.
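
    To illustrate the hierarchy, here is a minimal sketch that walks from an authority up to its type and government grouping. The list names and the C, N, T and G properties are as described above; the type names, function names and all sample data are made up for the example.

```typescript
interface LookupItem {
  C: string;  // code
  N: string;  // name
  T?: string; // on authorities: code of the parent authorityTypes entry
  G?: string; // on authorityTypes: code of the parent authorityGovt entry
}
type LookupCollection = { [listName: string]: LookupItem[] };

function findByCode(list: LookupItem[], code: string | undefined): LookupItem | undefined {
  if (code === undefined) return undefined;
  for (const item of list) {
    if (item.C === code) return item;
  }
  return undefined;
}

// Walk authority -> authorityType -> authorityGovt and return the names found.
function describeAuthority(lookups: LookupCollection, authorityCode: string): string[] {
  const authority = findByCode(lookups["authorities"] || [], authorityCode);
  const type = findByCode(lookups["authorityTypes"] || [], authority ? authority.T : undefined);
  const govt = findByCode(lookups["authorityGovt"] || [], type ? type.G : undefined);
  const names: string[] = [];
  for (const item of [authority, type, govt]) {
    if (item) names.push(item.N);
  }
  return names;
}

// Made-up sample data in the shape described above.
const sampleLookups: LookupCollection = {
  authorities: [{ C: "CDE", N: "Example Department of Education", T: "ST" }],
  authorityTypes: [{ C: "ST", N: "Example State Type", G: "G" }],
  authorityGovt: [{ C: "G", N: "Government" }],
};
const chain = describeAuthority(sampleLookups, "CDE");
```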

    Calling the warehouse

    Warehouse endpoints are at <emis_root>/api/warehouse.

    While still in development, the basic structure of the endpoints will look like:

    <contents>/<aggregation>/[selection]?[report]&....<configuration options>

    Broadly, the endpoint defines the contents of the data, while the query parameters configure how it is presented. The specific parameters supported may vary according to the contents.

    The following endpoints are examples and demonstrate this general format:

    enrol/school
    enrol/district
    enrol/electoratel/
    enrol/district?report
    enrol/authority/CDE
    enrol/district/CHK?report
    enrol/school/CHKK001
    enrol/district/PNI?ByAge
    enrol/schooltype?byClassLevel=false
    flow/school
    flow/school/CHK002?asPerc
    flow/nation?report
    flow/district?report&asPerc
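
    The general format above can be captured in a small path builder. This is a sketch only: the interface and function names are hypothetical, and it simply reproduces the shape of the example endpoints (relative to <emis_root>/api/warehouse).

```typescript
interface WarehouseRequest {
  contents: string;    // e.g. "enrol" or "flow"
  aggregation: string; // e.g. "district", "school", "nation"
  selection?: string;  // optional code to filter on, e.g. "CHK"
  report?: boolean;    // adds the "report" flag
  options?: { [name: string]: boolean | undefined }; // e.g. { byAge: true }
}

function buildWarehousePath(req: WarehouseRequest): string {
  let path = req.contents + "/" + req.aggregation;
  if (req.selection) path += "/" + encodeURIComponent(req.selection);
  const query: string[] = [];
  if (req.report) query.push("report");
  const options = req.options || {};
  for (const name of Object.keys(options)) {
    const value = options[name];
    if (value === undefined) continue; // leave the server default in place
    // A bare name means true; false has to be spelled out.
    query.push(value ? name : name + "=false");
  }
  return query.length > 0 ? path + "?" + query.join("&") : path;
}

const examplePath = buildWarehousePath({
  contents: "enrol",
  aggregation: "district",
  selection: "CHK",
  report: true,
});
```

    examplePath is "enrol/district/CHK?report", the same as one of the endpoints listed above.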

    Common query parameters

    "report" indicates "report format".
    Usually, the data is disaggregated by Gender; i.e. there is a field GenderCode on each row, and the Enrol value is the enrolment for that Gender.
    Adding report to the query string means that the data does not have separate rows for gender; instead there are fields EnrolM, EnrolF and Enrol on each record. This "denormalisation" can be easier to work with in Jasper reports, hence the name "report format".
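
    To make the difference between the two formats concrete, the sketch below converts gender-disaggregated rows into report format on the client. GenderCode, Enrol, EnrolM and EnrolF are the fields named above; SurveyYear and DistrictCode are hypothetical key fields, and the gender codes "M" and "F" are assumed.

```typescript
interface EnrolRow {
  SurveyYear: number;   // hypothetical key field
  DistrictCode: string; // hypothetical key field
  GenderCode: string;   // assumed to be "M" or "F"
  Enrol: number;
}

interface EnrolReportRow {
  SurveyYear: number;
  DistrictCode: string;
  EnrolM: number;
  EnrolF: number;
  Enrol: number; // total across genders
}

function toReportFormat(rows: EnrolRow[]): EnrolReportRow[] {
  const index: { [key: string]: EnrolReportRow } = {};
  const result: EnrolReportRow[] = [];
  for (const row of rows) {
    const key = row.SurveyYear + "|" + row.DistrictCode;
    let out = index[key];
    if (!out) {
      // First row seen for this year and district: start a new report row.
      out = {
        SurveyYear: row.SurveyYear,
        DistrictCode: row.DistrictCode,
        EnrolM: 0,
        EnrolF: 0,
        Enrol: 0,
      };
      index[key] = out;
      result.push(out);
    }
    if (row.GenderCode === "M") out.EnrolM += row.Enrol;
    else if (row.GenderCode === "F") out.EnrolF += row.Enrol;
    out.Enrol += row.Enrol;
  }
  return result;
}
```

    In practice there is no need to do this yourself, since the report option asks the warehouse for this shape directly.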

    enrol endpoints

    Contents 'enrol' returns enrolment data.
    Aggregation can be by

    • school
    • district
    • electoratel (local electorate)
    • electoraten (national electorate)
    • island
    • region
    • authority
    • schooltype
    • nation

    Depending on the aggregation, you can filter for a specific value. (In practice, you may prefer just to get all records and filter at the client.) 'nation' cannot be filtered.

    Other configurations are available through option values.
    By default, enrolment data is grouped by ClassLevel, but not by Age.
    The following options allow you to control this:

    enrol/district?byAge=true
    enrol/district?byAge=false -- the default
    enrol/district?byAge -- same as byAge=true

    enrol/district?byClassLevel=true -- the default
    enrol/district?byClassLevel=false -- no disaggregation by class level
    enrol/district?byClassLevel -- same as byClassLevel=true

    Using the options, the smallest return set for enrolments you can get is:

    enrol/nation?report&byClassLevel=false

    which returns just one row for each year.

    The most detailed enrolments data is

    enrol/school?byAge

    flow endpoints

    flow endpoints return flow rate indicators (repeat rate, promote rate, dropout rate, survival rate) as well as the data from which these are calculated; i.e. enrolments and repeaters for 2 years.

    Flow can be reported by

    • school
    • district
    • nation

    Flow endpoints support the 'report' option.

    The rates returned by flow are ratios; i.e. fractional values typically between 0 and 1. Add the query parameter
    ?asPerc
    to have these returned as percentages; that is, multiplied by 100.

    e.g.

    flow/district/PNI?report&asperc

    Future extensions

    Other content endpoints will be added for

    • teachers
    • schools ( ie school counts)
    • leveler (enrolment ratios by education level)
    • classleveler (enrolment ratios for individual year of education)
    • budget (high level budget data)
      ...
  4. Brian Lewis reporter

    Advanced Usage

    Users and servers in the Pacific may not have very fast internet connections, so the warehouse gives a few ways to control the size of downloads:

    GZip:

    Add the header
    Accept-Encoding: gzip

    to get back gzipped data:
    Content-Encoding: gzip

    ETag:

    Warehouse calls will return an ETag header which indicates the version of the warehouse data.
    ETag: "5FF37255-5AE8-4A10-9230-A00E15B63C7E"

    When you call a specific URL for a second time, you can add this ETag as an If-None-Match header:
    If-None-Match: "5FF37255-5AE8-4A10-9230-A00E15B63C7E"

    If the warehouse has not changed (i.e. there has not been a warehouse rebuild since the data was first acquired)
    you will get a 304 - Not Modified response.
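
    A client-side cache built on this could look roughly like the following. It is a sketch under assumptions: the class and interface names are hypothetical, response bodies are kept as strings, and the actual HTTP call is left to the caller, so only the ETag bookkeeping is shown.

```typescript
interface CachedEntry {
  etag: string;
  body: string;
}

// The parts of an HTTP response this sketch cares about.
interface WarehouseResponse {
  status: number;
  etag?: string; // value of the ETag response header, if any
  body?: string;
}

class ETagCache {
  private entries: { [url: string]: CachedEntry } = {};

  // Headers to add when requesting this URL: If-None-Match if we have seen it before.
  requestHeaders(url: string): { [name: string]: string } {
    const entry = this.entries[url];
    return entry ? { "If-None-Match": entry.etag } : {};
  }

  // Returns the body to use for this URL, updating the cache as needed.
  resolve(url: string, response: WarehouseResponse): string | undefined {
    if (response.status === 304) {
      // Warehouse not rebuilt since last time: reuse the stored copy.
      const entry = this.entries[url];
      return entry ? entry.body : undefined;
    }
    if (response.status === 200 && response.body !== undefined) {
      if (response.etag) {
        this.entries[url] = { etag: response.etag, body: response.body };
      }
      return response.body;
    }
    return undefined; // errors (e.g. 401) are for the caller to handle
  }
}
```

    Call requestHeaders before each request and pass the status, ETag header and body of the reply to resolve. A 304 reply carries no body, so the saving is the whole download.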

    Custom Json Deflation

    The warehouse can supply data in a customised format that eliminates the repetition of property names from the JSON collection that is returned.

    If you add to your warehouse request the header:

    X-Json-Deflate: ON

    then the returned JSON is a single object with these two properties:
    columns: an array of strings that is the names of the columns in the data
    rows: an array of arrays - each element in rows is an array of the values of the data in that row. The order is the same as the order of columns.

    You can process this object to "reflate" a normal JavaScript collection.
    For example, the EMIS client does this in TypeScript along these lines:

    export interface IDeflatedTable {
        columns: string[];
        rows: any[][];
    }

    // Rebuild an array of objects, pairing each cell with its column name.
    export function reflate(table: IDeflatedTable): { [column: string]: any }[] {
        const reflateRow = (row: any[]) => {
            const out: { [column: string]: any } = {};
            row.forEach((cell, index) => {
                out[table.columns[index]] = cell;
            });
            return out;
        };
        return table.rows.map(reflateRow);
    }

    If the data is gzipped, Json-Deflate can decrease the size of the data by 15% to 20%. If not gzipped, Json-Deflate decreases the size by around 70%.
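
    To show what the deflated shape looks like, here is a sketch of the reverse operation, turning a normal collection into columns and rows. This is not the server's implementation. The names are hypothetical, and it assumes every record has the same properties as the first one.

```typescript
interface DeflatedTable {
  columns: string[];
  rows: unknown[][];
}

type PlainRecord = { [name: string]: unknown };

// Deflate: list the property names once, then keep only the values per row.
function deflate(records: PlainRecord[]): DeflatedTable {
  const first = records[0];
  // Assumption: all records share the first record's properties.
  const columns: string[] = first ? Object.keys(first) : [];
  const rows = records.map((record) => columns.map((column) => record[column]));
  return { columns, rows };
}

// Made-up sample: two records with properties a and b.
const deflatedSample = JSON.stringify(deflate([
  { a: 1, b: 2 },
  { a: 3, b: 4 },
]));
```

    deflatedSample is {"columns":["a","b"],"rows":[[1,2],[3,4]]}. With only two rows there is nothing to gain; the saving comes from long collections, where each property name would otherwise be repeated on every row.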
