Wiki
Clone wikiENMA / data_model / hbase
EMPOWERING Data Model - Long term database
This data model documentation describes how the data is stored in the long term database. The technology used is HBase, which is a distributed database from Hadoop.
All the data is stored by tables. They represent two different typologies: the data input tables or the result tables. The former is used to calculate the analytics and the latter creates an historical backup for all the results calculated by the system.
This tables have a unique name and are composed by unique key - values pairs. The values for each key can contain multiple columns and columns families.
In the following example we represent the electricity consumption of contractId CHBB of companyId 1111111111 at timestamp 1505210023, which have a value of 25 kWh. The 'm:v' represents the column 'v' of the column family 'm'.
electricityConsumption_1111111111 4~1505210023~CHBB {'m:v':25,'m:calc':0}
For more information about this database, visit http://hbase.apache.org/book.html
Data model by tables
Data input tables
All the examples (in cursive) consider type = "electricityConsumption", companyId = 1111111111 and weather_type = "temperatureAir". The bucket is considered as ((timestamp / 100) % 100).
Description | Table name schema | Key schema | Value schema |
---|---|---|---|
Table for residential standard energy types | <type>_<companyId> Example: electricityConsumption_1111111111 |
<bucket>~<timestamp>~<contractId> |
{ 'm:v': <instant consumption>, 'm:va': <cumulative consumption>, 'm:calc': <1 if the instant consumption is calculated by the platform, 0 if not>, 'm:timeSlotName': <time slot related with this measure>, 'm:energyCost': <energy cost related with this measure>, 'm:flatRate': <flat rate cost related with this measure>, 'm:periodKey': <tariff period description related with this measure> } |
Table for residential energy types with time of use tariffs | tou<type>_<companyId> Example: touElectricityConsumption_1111111111 |
<bucket>~<timestamp>~<contractId> |
{ 'm:p1': <instant consumption in period 1>, 'm:p2': <instant consumption in period 2>, 'm:p3': <instant consumption in period 3>, 'm:p1a': <cumulative consumption in period 1>, 'm:p2a': <cumulative consumption in period 2>, 'm:p3a': <cumulative consumption in period 3>, 'm:v': <total instant consumption>, 'm:calc': <1 if the instant consumption is calculated by the platform, 0 if not>, 'm:timeSlotName': <time slot related with this measure>, 'm:energyCost': <energy cost related with this measure>, 'm:flatRate': <flat rate cost related with this measure>, 'm:periodKey': <tariff period description related with this measure> } |
Table for tertiary energy types | tertiary<type>_<companyId> Example: tertiaryElectricityConsumption_1111111111 |
<bucket>~<timestamp>~<contractId> |
{ 'm:p1': <instant consumption in period 1>, 'm:p2': <instant consumption in period 2>, 'm:p3': <instant consumption in period 3>, 'm:pX': <instant consumption in period X>, 'm:p1a': <cumulative consumption in period 1>, 'm:p2a': <cumulative consumption in period 2>, 'm:p3a': <cumulative consumption in period 3>, 'm:pXa': <cumulative consumption in period X>, 'm:v': <total instant consumption>, 'm:calc': <1 if the instant consumption is calculated by the platform, 0 if not>, 'm:timeSlotName': <time slot related with this measure>, 'm:energyCost': <energy cost related with this measure>, 'm:flatRate': <flat rate cost related with this measure>, 'm:periodKey': <tariff period description related with this measure> } |
Table for customers | customers |
<companyId>~<contractId> |
{ 'info:payerId': <>, 'info:ownerId': <>, 'info:signerId': <>, 'info:power': <>, 'info:dateStart': <>, 'info:dateEnd': <>, 'info:customerId': <>, 'info:meteringPointId': <>, 'info:activityCode': <>, 'info:version': <>, 'info:tariffId': <>, 'info:weatherStationId': <>, 'info:criteria_1': <>, 'info:criteria_2': <>, 'info:criteria_3': <>, 'info:OT701': <>, 'info:last_access': <>, 'info:dwellingAirConditioning': <>, 'info:dwellingElectricCooking': <>, 'info:dwellingOtherKitchen': <>, 'info:dwellingOtherAppliances': <>, 'info:dwellingAllHeatingSources': <>, 'info:experimentalGroupUser': <>, 'info:activeUser': <>, 'info:experimentalGroupUserTest': <>, 'info:activeUserDate': <>, 'address:buildingId': <>, 'address:city': <>, 'address:cityCode': <>, 'address:country': <>, 'address:countryCode': <>, 'address:street': <>, 'address:postalCode': <>, 'address:province': <>, 'address:provinceCode': <>, 'address:parcelNumber': <>, 'profile:totalPersonsNumber': <>, 'profile:minorsPersonsNumber': <>, 'profile:workingAgePersonsNumber': <>, 'profile:retiredAgePersonsNumber': <>, 'profile:malePersonsNumber': <>, 'profile:femalePersonsNumber': <>, 'profile:buildingConstructionYear': <>, 'profile:dwellingArea': <>, 'profile:buildingType': <>, 'profile:dwellingPositionInBuilding': <>, 'profile:dwellingOrientation': <>, 'profile:buildingWindowsType': <>, 'profile:buildingWindowsFrame': <>, 'profile:buildingHeatingSource': <>, 'profile:buildingHeatingSourceDhw': <>, 'profile:buildingSolarSystem': <> } |
Table for tariffs | tariffs |
<companyId>~<tariffId>~<dateStart_flatRate>~ <dateEnd_flatRate>~<dateStart_energyCost>~ <dateEnd_energyCost>~<periodKey> |
{ 'm:timeSlotName': <Time slot name>, 'm:flatRate': <Flat rate cost of this time slot (€) >, 'm:energyCost': <Energy cost of this time slot (cts €)>, 'm:periodStart': <Date, in format [mmdd], from which this time slot is valid>, 'm:periodEnd': <Date, in format [mmdd], until this time slot is valid>, 'm:hourStart': <Hour, in format [HHMMSS], from which this time slot is valid>, 'm:hourEnd': <Hour, in format [HHMMSS], until this time slot is valid>, 'm:days': <Concatenation string with the valid days [Mon,Tue,Wed,Thu,Fri,Sat,Sun]> } |
Table for meteorological measures | <weather_type> Example: temperatureAir |
<bucket>~<timestamp>~<stationId> |
{ 'm:v': <Value of the measure> } |
Table for weather stations | stations |
<companyId>~<stationId> |
{ 'address:altitude': <Absolute altitude [m]>, 'address:city': <City name>, 'address:cityCode': <Citycode of the city>, 'address:country': <Country name>, 'address:lat': <Latitude in [Degrees Minutes Seconds Orientation] format. Example: 40 57 31 N>, 'address:long': <Longitude in [Degrees Minutes Seconds Orientation] format. Example: 00 52 18 E>, 'address:parcelNumber': <Parcel number>, 'address:postalCode': <Postal Code of the country>, 'address:province': <Province name>, 'address:provinceCode': <Province code>, 'address:street': <Street> } |
Result tables
The result tables have exactly the same outputs as their associated 'data model in the short term database'. All the output fields are grouped in the 'results' column family or in the defined key.
Example for OT101
This is a result document in the collection OT101Results of the short term database:
{ "contractId": "123457ABC", "companyId": 1111111111, "month": 201307, "setupId": 52234e386cb9fea66d5b2511, "consumption": 216.65, "averageEffConsumption": 65.46, "averageConsumption": 151.75, "diffAverageEffConsumption": 230.93, "diffAverageConsumption": 42.76, "numberCustomersEff": 141, "numberCustomers": 602, "criteria": "postalCode + tariffId" }
In the long term database, results will be stored like this:
Description | Table name schema | Key schema | Value schema |
---|---|---|---|
Table for OT101 results | ot101 | <month>~<companyId>~<contractId>~<setupId> 201307~1111111111~123457ABC~52234e386cb9fea66d5b2511 |
{ "results:consumption": 216.65, "results:averageEffConsumption": 65.46, "results:averageConsumption": 151.75, "results:diffAverageEffConsumption": 230.93, "results:diffAverageConsumption": 42.76, "results:numberCustomersEff": 141, "results:numberCustomers": 602, "results:criteria": "postalCode + tariffId" } |
Updated