Wiki

Clone wiki

ENMA / data_model / hbase

EMPOWERING Data Model - Long term database

'Return home'

This data model documentation describes how the data is stored in the long term database. The technology used is HBase, which is a distributed database from Hadoop.

All the data is stored by tables. They represent two different typologies: the data input tables or the result tables. The former is used to calculate the analytics and the latter creates an historical backup for all the results calculated by the system.

This tables have a unique name and are composed by unique key - values pairs. The values for each key can contain multiple columns and columns families.

In the following example we represent the electricity consumption of contractId CHBB of companyId 1111111111 at timestamp 1505210023, which have a value of 25 kWh. The 'm:v' represents the column 'v' of the column family 'm'.

electricityConsumption_1111111111       4~1505210023~CHBB       {'m:v':25,'m:calc':0}

For more information about this database, visit http://hbase.apache.org/book.html

Data model by tables

Data input tables

All the examples (in cursive) consider type = "electricityConsumption", companyId = 1111111111 and weather_type = "temperatureAir". The bucket is considered as ((timestamp / 100) % 100).

Description Table name schema Key schema Value schema
Table for residential standard energy types
<type>_<companyId>

Example: electricityConsumption_1111111111

<bucket>~<timestamp>~<contractId>
{
    'm:v': <instant consumption>,
    'm:va': <cumulative consumption>,
    'm:calc': <1 if the instant consumption is calculated by the platform, 0 if not>,
    'm:timeSlotName': <time slot related with this measure>,
    'm:energyCost': <energy cost related with this measure>,
    'm:flatRate': <flat rate cost related with this measure>,
    'm:periodKey': <tariff period description related with this measure>
}
Table for residential energy types with time of use tariffs
tou<type>_<companyId>

Example: touElectricityConsumption_1111111111

<bucket>~<timestamp>~<contractId>
{
    'm:p1': <instant consumption in period 1>,
    'm:p2': <instant consumption in period 2>,
    'm:p3': <instant consumption in period 3>,
    'm:p1a': <cumulative consumption in period 1>,
    'm:p2a': <cumulative consumption in period 2>,
    'm:p3a': <cumulative consumption in period 3>,
    'm:v': <total instant consumption>,
    'm:calc': <1 if the instant consumption is calculated by the platform, 0 if not>,
    'm:timeSlotName': <time slot related with this measure>,
    'm:energyCost': <energy cost related with this measure>,
    'm:flatRate': <flat rate cost related with this measure>,
    'm:periodKey': <tariff period description related with this measure>
}
Table for tertiary energy types
tertiary<type>_<companyId>

Example: tertiaryElectricityConsumption_1111111111

<bucket>~<timestamp>~<contractId>
{
    'm:p1': <instant consumption in period 1>,
    'm:p2': <instant consumption in period 2>,
    'm:p3': <instant consumption in period 3>,
    'm:pX': <instant consumption in period X>,
    'm:p1a': <cumulative consumption in period 1>,
    'm:p2a': <cumulative consumption in period 2>,
    'm:p3a': <cumulative consumption in period 3>,
    'm:pXa': <cumulative consumption in period X>,
    'm:v': <total instant consumption>,
    'm:calc': <1 if the instant consumption is calculated by the platform, 0 if not>,
    'm:timeSlotName': <time slot related with this measure>,
    'm:energyCost': <energy cost related with this measure>,
    'm:flatRate': <flat rate cost related with this measure>,
    'm:periodKey': <tariff period description related with this measure>
}
Table for customers
customers
<companyId>~<contractId>
{
    'info:payerId': <>, 'info:ownerId': <>, 'info:signerId': <>,
    'info:power': <>, 'info:dateStart': <>, 'info:dateEnd': <>,
    'info:customerId': <>, 'info:meteringPointId': <>, 'info:activityCode': <>,
    'info:version': <>, 'info:tariffId': <>, 'info:weatherStationId': <>,
    'info:criteria_1': <>, 'info:criteria_2': <>, 'info:criteria_3': <>,
    'info:OT701': <>, 'info:last_access': <>, 'info:dwellingAirConditioning': <>,
    'info:dwellingElectricCooking': <>, 'info:dwellingOtherKitchen': <>,
    'info:dwellingOtherAppliances': <>, 'info:dwellingAllHeatingSources': <>,
    'info:experimentalGroupUser': <>, 'info:activeUser': <>,
    'info:experimentalGroupUserTest': <>, 'info:activeUserDate': <>,
    'address:buildingId': <>, 'address:city': <>, 'address:cityCode': <>,
    'address:country': <>, 'address:countryCode': <>, 'address:street': <>,
    'address:postalCode': <>, 'address:province': <>, 'address:provinceCode': <>,
    'address:parcelNumber': <>, 'profile:totalPersonsNumber': <>,
    'profile:minorsPersonsNumber': <>, 'profile:workingAgePersonsNumber': <>,
    'profile:retiredAgePersonsNumber': <>, 'profile:malePersonsNumber': <>,
    'profile:femalePersonsNumber': <>, 'profile:buildingConstructionYear': <>,
    'profile:dwellingArea': <>, 'profile:buildingType': <>,
    'profile:dwellingPositionInBuilding': <>, 'profile:dwellingOrientation': <>,
    'profile:buildingWindowsType': <>, 'profile:buildingWindowsFrame': <>,
    'profile:buildingHeatingSource': <>, 'profile:buildingHeatingSourceDhw': <>,
    'profile:buildingSolarSystem': <>
}
Table for tariffs
tariffs
<companyId>~<tariffId>~<dateStart_flatRate>~
<dateEnd_flatRate>~<dateStart_energyCost>~
<dateEnd_energyCost>~<periodKey>
{
    'm:timeSlotName': <Time slot name>,
    'm:flatRate': <Flat rate cost of this time slot (€) >,
    'm:energyCost': <Energy cost of this time slot (cts €)>,
    'm:periodStart': <Date, in format [mmdd], from which this time slot is valid>,
    'm:periodEnd': <Date, in format [mmdd], until this time slot is valid>,
    'm:hourStart': <Hour, in format [HHMMSS], from which this time slot is valid>,
    'm:hourEnd': <Hour, in format [HHMMSS], until this time slot is valid>,
    'm:days': <Concatenation string with the valid days [Mon,Tue,Wed,Thu,Fri,Sat,Sun]>
}
Table for meteorological measures
<weather_type>

Example: temperatureAir

<bucket>~<timestamp>~<stationId>
{
    'm:v': <Value of the measure>
}
Table for weather stations
stations
<companyId>~<stationId>
{
    'address:altitude': <Absolute altitude [m]>,
    'address:city': <City name>,
    'address:cityCode': <Citycode of the city>,
    'address:country': <Country name>,
    'address:lat': <Latitude in [Degrees Minutes Seconds Orientation] format. Example: 40 57 31 N>,
    'address:long': <Longitude in [Degrees Minutes Seconds Orientation] format. Example: 00 52 18 E>,
    'address:parcelNumber': <Parcel number>,
    'address:postalCode': <Postal Code of the country>,
    'address:province': <Province name>,
    'address:provinceCode': <Province code>,
    'address:street': <Street>
}

Result tables

The result tables have exactly the same outputs as their associated 'data model in the short term database'. All the output fields are grouped in the 'results' column family or in the defined key.

Example for OT101

This is a result document in the collection OT101Results of the short term database:

{
  "contractId": "123457ABC",
  "companyId": 1111111111,
  "month": 201307,
  "setupId": 52234e386cb9fea66d5b2511,
  "consumption": 216.65,
  "averageEffConsumption": 65.46,
  "averageConsumption": 151.75,
  "diffAverageEffConsumption": 230.93,
  "diffAverageConsumption": 42.76,
  "numberCustomersEff": 141,
  "numberCustomers": 602,
  "criteria": "postalCode + tariffId"
}

In the long term database, results will be stored like this:

Description Table name schema Key schema Value schema
Table for OT101 results ot101
<month>~<companyId>~<contractId>~<setupId>

201307~1111111111~123457ABC~52234e386cb9fea66d5b2511

{
   "results:consumption": 216.65,
   "results:averageEffConsumption": 65.46,
   "results:averageConsumption": 151.75,
   "results:diffAverageEffConsumption": 230.93,
   "results:diffAverageConsumption": 42.76,
   "results:numberCustomersEff": 141,
   "results:numberCustomers": 602,
   "results:criteria": "postalCode + tariffId"
}

Updated