Workbook upload gets wrong student record, leading to incorrect DoB => age

Issue #575 closed
Brian Lewis repo owner created an issue

Circumstance: there is an earlier Student_ record for a given student, matching on Student Card ID and family name, but DoB different. That earlier record has an enrolment record at the same school.

In this circumstance the pre-existing record is identified, and the old DoB is used - causing conflicting result.

Reported in RMI today 16 4 2019

Comments (2)

  1. Ghislain Hachey

    An email from Brian worth documenting here for excellent clarify in the issue:

    The issue here as we saw yesterday comes down to the fact that when we load the student data from the workbook, we ultimately want to build up a history of the enrolment for each student. That is, we have a table of Students, and for each student, a StudentEnrolment record in each school year.

    The Student record contains some “constant” values for each student – their National Id no, their name, date of birth and gender. But the problem is that sometimes the worksheets get these values wrong (particularly date of birth, but also gender) ; so that from year to year there is an inconsistency.

    So if we have a student record with particular values , and a new workbook comes in – with similar, but not identical values – how do we decide if it is the same person?

    The rules for deciding, until now, were allowing a workbook student record to be matched against an existing student that may have used a different date of birth. Since the date of birth returns the age, this caused some ages to be determined by older data, not the data on the workbook. On top of that, if the student had their gender incorrectly recorded, then using the old data could cause them to be put in the wrong gender column.

    So – I have carefully reworked all this.

    The key goal is to be sure that when you look in the enrollments table in MIEMIS, say, with a pivot table like this:

    image002.jpg

    That this pivot matches the values in the pivot on the student summary page in the workbook:

    2018:

    image003.jpg

    2017:

    image004.jpg

    You can see the grand total, and class level totals are fine – but the totals by age are a bit different because of the reason explained above. Even within grades, there are some minor variations because of incorrectly recorded gender – ie while the grade total is right, the split between M and F varies.

    So – the strategy of the uploaded now is : -

    --- ensure that the enrollments written into the enrolment table exactly matches the pivot table in the workbook ( assuming we are looking at the same school or set of schools)

    --- this means that if does not match the date of birth and gender of an existing student record in MIEMIS, it creates a new record….. EXCEPT::::

    ----- if everything matches except Date of birth, and the date of birth was previously recorded as an estimate, and is NOT an estimate on the incoming workbook, then it’s a match – AND the Date of Birth of the Student is updated.

    So this approach is trying to strike the balance between making the data fully verifiable with the workbook, but as well, to start to give us the student level history that we want to maintain going forward.

    After reprocessing 2017 and 2018 file we see Enrollments:

    2018

    image005.jpg

    2017

    image006.jpg

    And here’s 2018 split by gender: the Enrollments table after reuploading matches exactly the workbook pivot:

    image007.jpg

    Note that the uploader now puts more “breadcrumbs” on the student and student enrolment records, so that it is easier to fully audit back, all the way from the warehouse to the individual student records.

    From Workbook to Warehouse

    The full path of the data from the workbook to the warehouse is:

    Workbook – data is collected in the workbook.

    FileLibrary – when the workbook is uploaded, it is put in the filelibrary. (\fileDb folder) It is given a unique identifier - a “guid” – 9336123E-BFBE-4F71-9FD9-9ACEB571B9C2.xslm

    When the workbook is processed, it writes the tables Student_ and StudentEnrolment_

    Then the workbook upload aggregates the enrolments– as well as repeaters etc – by school, grade, gender and class; to generate records in Enrollments, and Pupil Tables. It also creates a schoolSurvey record for each school in the workbook. Note also that the warehouse rebuild will insert estimated data for any school that does not have a survey in a given year. I don’t think this affects you in RMI? , because you are getting complete data returns. (?)

    The warehouse rebuild is the final step which then generates the warehouse tables – in various aggregations. These higher level aggregations make the data faster to retrieve and smaller to download.

  2. Log in to comment