- changed status to resolved
Smarter handling of data when loading student exams
I'm open to the idea of creating a student profile for the missing students and relying on the student de-duplication tool to clean up. But we need something a bit better to avoid constantly loading this bad data. Perhaps a process like this: load the student exam data; if a student has an exact match, the data is "joinable" already; if there is no match, create a student profile. When done, return to the user the list of students that had an exact match (name and student ID) and the list of students with no match, for which a new student profile had to be created. The de-duper can then be used on the latter.
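A minimal sketch of that flow, assuming a match means an exact (name, student ID) pair. `ExamRow`, `load_exams`, and the in-memory `profiles` set are hypothetical names standing in for the real tables and loader.

```python
from typing import NamedTuple


class ExamRow(NamedTuple):
    """Hypothetical shape of one loaded exam record."""
    name: str
    student_id: str
    score: float


def load_exams(rows, profiles):
    """Split exam rows into exact matches and newly created profiles.

    `profiles` is a set of (name, student_id) pairs standing in for the
    student table. It is updated in place when a profile is created.
    """
    matched, created = [], []
    for row in rows:
        key = (row.name, row.student_id)
        if key in profiles:
            # Exact match: the exam data is joinable as is.
            matched.append(row)
        else:
            # No match: create a profile and flag it for the de-duper.
            profiles.add(key)
            created.append(row)
    return matched, created
```

The `created` list is what would be handed to the de-duplication tool afterwards.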
We discussed something like the following process:
1 Student exam data loading
2 Look in Candidates/Students_ first for a match (may find the student misspelled again). Candidates is where non-canonical students are kept as is, so that a repeated student with a misspelling can be identified. Handle the following possible cases:
- Match in Candidate but different student ID (possibly from a de-dup cleanup): presumably there is already the canonical student in Students_. Load the data associated with the
- Match in Candidate with same ID. Match in Students_. Just load the data.
- Match in Candidate with same ID. No match in Students_. Create Student_ profile and load the data.
- No match in Candidate (was never loaded, at least not with that spelling). Match in Students_. Create the Candidate with the Students_ profile ID.
- No match in Candidate and no match in Students_. Create the same record in both.
Obviously this would need to be refined or accepted with the following caveats:
- What if it matches more than one in Students_ (not yet de-duplicated duplicates in Students_,
- What if it matches a different person with the exact same name? How to tell? From school? From expected grade?
3 Feedback to user (optional, if not too much work):
- Include list of loaded students with perfect match in the Students_ table (nothing to clean up for those)
- Include list of students that were loaded with no match in Students_ but match in Candidate (as long as these records go through a dedup they will be linkable to their student profile)
- Include list of students with no match in Students_ and Candidate (those will have a new Student_ profile created along with a new Candidate and further dedup
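The cases in step 2 can be sketched as a small decision function. This is an illustration only, assuming that Candidates maps a spelling to a student ID and that a match in Students_ means an exact name match; `Case`, `classify`, and `group_by_case` are hypothetical names. The caveats above (several matches, same-name strangers) are not handled, and the action for the first case is left open because its description is cut off.

```python
from enum import Enum


class Case(Enum):
    CANDIDATE_OTHER_ID = "candidate match, different student ID"
    LOAD = "candidate and student match: just load"
    CREATE_STUDENT = "candidate match only: create Student_ profile"
    CREATE_CANDIDATE = "student match only: create Candidate"
    CREATE_BOTH = "no match: create record in both"


def classify(name, student_id, candidates, students):
    """Return the case for one exam row.

    candidates: dict of spelling -> student ID (assumed layout)
    students:   dict of canonical name -> student ID (assumed layout)
    """
    in_students = name in students
    if name in candidates:
        if candidates[name] != student_id:
            # Action intentionally unspecified in this sketch.
            return Case.CANDIDATE_OTHER_ID
        return Case.LOAD if in_students else Case.CREATE_STUDENT
    return Case.CREATE_CANDIDATE if in_students else Case.CREATE_BOTH


def group_by_case(rows, candidates, students):
    """Group (name, student_id) rows by case, as raw material for feedback."""
    report = {case: [] for case in Case}
    for name, student_id in rows:
        case = classify(name, student_id, candidates, students)
        report[case].append((name, student_id))
    return report
```

Under one reading of step 3, the three feedback lists would come from the `LOAD`, `CREATE_STUDENT`, and `CREATE_BOTH` groups respectively.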
Comments (1)
- reporter
While the implementation is a little simpler than what this issue first described, it will remain like this for the time being. Improvements to the data should come through improved user processes.