Demo 4 - IdentityUpdate

Identity Update is a hybrid of the Identity Capture and Identity Resolution architectures. It accepts a set of input references along with a predefined set of managed identities (the knowledge base). It resolves the input references against the knowledge base and then updates the knowledge base with any new information those references carry, in essence “updating” the knowledge base with new references.
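The resolve-then-update flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not OYSTER's implementation: the exact-match rule in `matches`, the random identifier scheme in `new_identifier`, the `RefID` key, and the dictionary layout of the knowledge base are all hypothetical stand-ins for what the run script, attributes file, and .idty file define in a real run.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase + string.digits


def new_identifier(length=16):
    """Return a random 16-character identifier (hypothetical scheme, not OYSTER's)."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def matches(ref, member):
    """Hypothetical match rule: case-insensitive equality on every attribute.

    Real OYSTER match rules are configured in XML and may be much richer.
    """
    keys = ("FirstName", "LastName", "SchoolCode", "DOB")
    return all(ref[k].strip().lower() == member[k].strip().lower() for k in keys)


def identity_update(knowledge_base, references, id_factory=new_identifier):
    """Resolve references against the knowledge base and return the updated copy.

    knowledge_base maps an identifier to the list of references in that identity.
    Returns (updated_knowledge_base, link_index), where link_index maps each
    reference's RefID to the identifier it was assigned.
    """
    # Work on a copy so the input knowledge base is left untouched.
    updated = {ident: list(members) for ident, members in knowledge_base.items()}
    link_index = {}
    for ref in references:
        # Find the first identity containing a member that matches this reference.
        target = next(
            (ident for ident, members in updated.items()
             if any(matches(ref, m) for m in members)),
            None,
        )
        if target is None:
            # No match: the reference starts a new identity of its own.
            target = id_factory()
            updated[target] = []
        updated[target].append(ref)
        link_index[ref["RefID"]] = target
    return updated, link_index
```

Calling `identity_update` with three existing identities and two references, one of which matches an existing member, would yield four identities: two unchanged, one extended, and one new.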

This run uses the test source reference file named ‘IdentityUpdateTest.txt’, illustrated in Figure 1. The data consists of two references, each composed of five attributes. The first attribute is the IdentityID, a unique identifier associated with each record. The other attributes are FirstName, LastName, SchoolCode, and DOB. Combined as they are in the source file, these attributes define a set of sample student references. The run also reads and updates the identity file IdentityUpdateInputIdentities.idty, located in the Identity Update input folder.

Figure 1: Identity Update Source Input

The Match Rules defined for this run are identical to those used in the Merge-purge run. They are shown in Figure 2.

Figure 2: Identity Update Match Rules

1. Type ‘IdentityUpdateRunScript.xml’ and press Enter to perform the run, as shown in Figure 3.

Figure 3: Running IdentityUpdate Run Script

2. Information about the run is displayed in the Command Prompt. For this run, 2 references are processed and the output contains 4 identities (3 of which came from the input .idty file). The OYSTER run statistics for this run are shown in Figures 4 to 7.

Figures 4-7: OYSTER Run Statistics for IdentityUpdate

3. After the run finishes, the Output folder will contain the IdentityUpdateIndex.link, IdentityUpdateOutput.idty, Identity Change Report.txt, Identity Merge Map.csv, IdentityUpdateOutput.emap, IdentityUpdate_0.log, and IdentityUpdateOutput.indx files, as shown in Figure 8. The .emap and .indx files are generated because the Explanation and Debug attributes in the RunScript are set to “On”.

Figure 8: IdentityUpdate Output folder

4. OYSTER creates or assigns the persistent identifiers for identities and stores them in the IdentityUpdateIndex.link file, shown in Figure 9. Reference 1 did not match any identity in the input .idty file, so it was placed in its own EIS and given its own OysterID, 2I3Y0EUXN8TXWM3O. Reference 2 matched the identity with OysterID MW9AGFLZ2A1ENXZ5 and was assigned that same OysterID.

Figure 9: IdentityUpdateIndex.link File

5. Because this is an IdentityUpdate run, OYSTER updated the input identity file (IdentityUpdateInputIdentities.idty, in the Identity Update input folder) and stored the result in the IdentityUpdateOutput.idty file. This file is the identity knowledge base, which can be updated and maintained further in future runs. Its contents are shown in Figures 10-11. References with the same OysterID are grouped together in the .idty output file, and the new identity has been added to it. Note also that the ID assigned in the Modification log corresponds directly to the RunID in the Trace, which makes it easy to track a record's origin and to see which references were added in the current run.

Figures 10-11: IdentityUpdateOutput.idty File

The Identity Change Report, shown in Figure 12, shows that this run read three previously identified identities (EIS) from the input knowledge base (IdentityUpdateInputIdentities.idty, in the Identity Update input folder) and that four output identities were written to the updated knowledge base file. These four EIS consist of two unchanged original EIS, one updated EIS, and one newly created EIS. In an Identity Update run, the output identities represent the previous, updated, and newly created EIS stored in the new knowledge base (the output .idty file).

Figure 12: Identity Change Report for IdentityUpdate Run
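The counts discussed for the change report can be reproduced with a small comparison of the input and output knowledge bases. The sketch below is an assumption-laden illustration: it does not read or write the actual Identity Change Report format, the function name `summarize_changes` is hypothetical, and each knowledge base is modelled simply as a dictionary from identifier to a list of references.

```python
def summarize_changes(input_kb, output_kb):
    """Count unchanged, updated, and new identities between two knowledge bases.

    Both arguments map an identifier to the list of references it holds.
    An identity counts as "updated" when it gained references during the run;
    this size-based test is a simplification chosen for the sketch.
    """
    counts = {
        "input": len(input_kb),
        "output": len(output_kb),
        "unchanged": 0,
        "updated": 0,
        "new": 0,
    }
    for ident, members in output_kb.items():
        if ident not in input_kb:
            counts["new"] += 1        # identity created in this run
        elif len(members) != len(input_kb[ident]):
            counts["updated"] += 1    # existing identity that gained references
        else:
            counts["unchanged"] += 1  # carried over as-is
    return counts
```

For a run shaped like this demo (three input identities, one of them extended, one identity added), the function would report 3 input identities, 4 output identities, 2 unchanged, 1 updated, and 1 new.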

You may replace the input data in the IdentityUpdateTest.txt file with your data, and edit the IdentityUpdateSourceDescriptor.xml, IdentityUpdateAttributes.xml, and IdentityUpdateRunScript.xml files to correspond to your new data. Detailed information for each of the XML configurations can be found in the OYSTER Reference Guide.

Identity Update runs are the standard configuration for integrating new references into an existing identity knowledge base. In this scenario, the run inserted two new references into the existing knowledge base by merging one reference into an existing EIS and creating a new EIS for the reference that had no match. This is the most common type of run once the knowledge base has been created through an Identity Capture run or a Ref to Ref Assertion run.

Back to OYSTER Demonstration Run page
