Reference_to_Structure_Configuration

When OYSTER is configured for Reference to Structure Assertions, input identities are required since the purpose of this type of assertion is to force a new reference to match an existing identity. A link file and output identity file must also be specified. This is shown in Figure 67.

Figure 67.PNG

OYSTER can use the identity capture architecture to build a set of identities from a set of assertions. Reference to Structure Assertions represent knowledge about one or more known entity identities. The identities updated through this process can be used as an input when performing Identity Resolution or Identity Update to force a match based on the previous knowledge represented by the assertions. This type of assertion is used to force a reference to match an existing identity that would not match based on any defined rule.

Lastly, the RunMode should be set to “AssertRefToStr” as shown in Figure 68.

Figure 68.PNG

Example

Running OYSTER in the Reference to Structure Assertion configuration allows reference information to be injected into an existing identity, preserved, and input into later processes (OYSTER runs) that run in the Identity Resolution or Identity Update Configuration. These identities can be built from a set of assertion sources that run against a previous set of identities and represent knowledge about the entities.

For this example, the test source file is named ‘AssertionsSource.txt’, shown in Figure 69. This data consists of one reference; the reference is constructed from the following attributes:

  • RefID
  • FirstName
  • LastName
  • DOB
  • SchCode
  • @AssertRefToStr

Figure 69.PNG
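As a rough illustration of the layout described above, the following Python sketch builds a delimited source file with the six listed attributes. Only the column names come from this page. The tab delimiter, the header line, the sample values, and the OYSTER ID placeholder are assumptions made for illustration and may differ from the actual file shown in Figure 69.

```python
import csv
import io

# Column names come from the attribute list above. The tab delimiter,
# the sample values, and the OYSTER ID placeholder are made up.
FIELDS = ["RefID", "FirstName", "LastName", "DOB", "SchCode", "@AssertRefToStr"]


def build_assertion_source(rows, delimiter="\t"):
    """Return delimited text with a header line and one line per reference."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=FIELDS, delimiter=delimiter, lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# One hypothetical reference; the last field names the identity that the
# reference should be asserted into.
sample_rows = [
    {
        "RefID": "1",
        "FirstName": "JANE",
        "LastName": "DOE",
        "DOB": "1990-01-01",
        "SchCode": "S001",
        "@AssertRefToStr": "HYPOTHETICAL_OYSTER_ID",
    }
]

source_text = build_assertion_source(sample_rows)
print(source_text)
```

In a real run, the value in the last column would be copied from an OYSTER ID that already exists in the input idty file.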

Note that, based on previous knowledge, an @AssertRefToStr attribute has been added to the source records. The @AssertRefToStr attribute should contain the OYSTER ID of the identity in the input idty file into which the source reference is to be inserted. Because a Reference to Structure assertion run is based on previous knowledge of the references, there is no need to analyze the source data. Using the source file information, the source descriptor file, named “AssertionsSourceDescriptor.xml”, can be created. This file is shown in Figure 70.

Figure 70.PNG

As mentioned earlier, an @AssertRefToStr attribute is added to each source record to represent the previous knowledge. To tell OYSTER that this is a RefToStr assertion run, the source descriptor must assign a predefined keyword, @AssertRefToStr, as the value of the Attribute attribute in the entry for the @AssertRefToStr field. Figure 70 shows the @AssertRefToStr keyword in use. The keyword forces OYSTER to apply RefToStr assertion logic to the source input and to ignore any user-defined matching rules. A reference is matched only if its @AssertRefToStr value equals the OYSTER ID of an existing identity in the idty input file.
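The matching behavior described above can be sketched in a few lines of Python. This is a conceptual model only, not OYSTER's implementation: the function name, the data structures, and the sample IDs are all hypothetical. How OYSTER itself handles a reference whose asserted ID is not found is not stated here, so the sketch simply reports such references.

```python
def assert_refs_to_structures(identities, references):
    """Attach each reference to the identity named by its @AssertRefToStr value.

    identities: dict mapping an OYSTER ID to a list of reference IDs.
    references: list of dicts with at least "RefID" and "@AssertRefToStr".
    Returns the updated identities, the (RefID, OYSTER ID) links that were
    made, and the RefIDs whose asserted ID was not found.
    """
    # Copy the input so the caller's identities are left untouched.
    updated = {oid: list(refs) for oid, refs in identities.items()}
    links = []
    unmatched = []
    for ref in references:
        target = ref.get("@AssertRefToStr", "").strip()
        if target in updated:
            # Only the asserted ID decides the match; other attributes
            # (names, dates, codes) are never compared.
            updated[target].append(ref["RefID"])
            links.append((ref["RefID"], target))
        else:
            unmatched.append(ref["RefID"])
    return updated, links, unmatched


# Hypothetical existing identities and incoming assertion references.
existing = {"ID_A": ["X.1", "X.2"], "ID_B": ["X.3"]}
incoming = [
    {"RefID": "AS1.1", "FirstName": "JANE", "@AssertRefToStr": "ID_A"},
    {"RefID": "AS1.2", "FirstName": "JOHN", "@AssertRefToStr": "ID_Z"},
]

updated, links, unmatched = assert_refs_to_structures(existing, incoming)
print(links)      # [('AS1.1', 'ID_A')]
print(unmatched)  # ['AS1.2']
```

The sketch never looks at FirstName or any other descriptive attribute, which mirrors the point that user-defined matching rules play no part in this kind of run.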

Following the same process as in the previous two examples, once the source descriptor is defined the source attributes file must also be defined. This file is stored in the Source folder along with the source descriptor file. The attributes file defines the attributes in the source, the algorithms used to compare the attributes, and the matching (identity) rules used when performing ER. For this example run, no matching rules will be defined. Instead, as mentioned earlier, the matching will depend solely on the values of the @AssertRefToStr attribute.

The source attribute file is named ‘AssertionsAttributes.xml’ and is depicted in Figure 71.

Figure 71.PNG

The attributes defined here correspond to the distinct values assigned to Attribute in the source descriptor. Note also that, as mentioned earlier, no rule is defined for this run, but the Rule tag must still be included or the OYSTER run will fail.

As with the previous two examples, the last file that needs to be created is the RunScript for this example. For the RefToStr example, the input identity file, output identity file, and the link files should be specified. The Run Script should again be stored in the root OYSTER folder as this is where the OYSTER program is expecting the file to reside. The file for this sample is named ‘RefToRefAttributesRunScript.xml’ and is shown in Figure 72.

Figure 72.PNG

Now that all the scripts for the RefToStr Assertions example have been created we can run OYSTER. This process is depicted in Figure 37, Figure 38, and Figure 39 and described in their surrounding text in the Example section.

Once the run is complete the output for the run will be written to the command box by OYSTER. This output is shown in Figure 73 and Figure 74.

Figure 73.PNG

Figure 74.png

Figure 74: Output to Command Box Generated by OYSTER Run - 2

The statistics for this run may be slightly confusing. According to the statistics, OYSTER processed 0 records and found that they belong to 3 real-world identities, as shown in Figure 74. This is because the run is an assertion run: the references were asserted into equivalence, not matched. Figure 75 shows the link file, which shows that the reference AS1.1 was added to the specified identity. It also shows that no rules were used for matching; all matching was done through assertion.

Figure 75.PNG

The entire point of this RefToStr assertion run is to update a set of identities that can be used as input when performing Identity Resolution or Identity Update. These identities are updated through the use of previous knowledge about the references. Figure 76 shows that the reference was asserted into the correct identity. Assigning the @AssertRefToStr keyword as the Attribute value in the source descriptor forced OYSTER to match the record with no regard to its other attribute values.

Figure 76.png

Figure 76: Identity files created for Identity Build from RefToStrAssertions.

As with the previous examples, this sample run was done using a delimited text file. Examples of how to connect to a fixed-width text file, a Microsoft Access database, MySQL, and Microsoft SQL Server can be found in the OYSTER Reference Guide.

Previous to Reference to Reference Configuration Page

Next to Structure to Structure Configuration Page

Back to OYSTER User Guide Page
