Oyster Demonstration Run

Demo Setup

Download “Oyster3.6.0withDemos.zip” from the Downloads section via the Bitbucket link in the left menu. Extract the files and save them to a location of your choice; this demonstration guide uses C:\OYSTERDemo. The extracted files and folders should look like Figure 1.

DemoFigure1.JPG

Figure 1: C:\OYSTERDemo\Oyster3.6.0withDemos Folder and Extracted Files

Each demonstration run folder contains an Input, Output, and Scripts folder, which organize and contain all the files required to perform that demonstration run. Each of the eight runs covered in this demonstration guide starts with the following method.
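On macOS or Linux, the folder layout described above can be checked before a run. The following is a minimal sketch: `check_demo_folder` is a hypothetical helper, not part of OYSTER, and it assumes only that each demonstration run folder contains Input, Output, and Scripts subfolders as stated above.

```shell
#!/bin/sh
# Hypothetical helper (not part of OYSTER): verify that a demonstration
# run folder contains the Input, Output, and Scripts subfolders.
check_demo_folder() {
    demo_dir=$1
    status=0
    for sub in Input Output Scripts; do
        if [ ! -d "$demo_dir/$sub" ]; then
            # Report each missing subfolder and remember the failure.
            echo "missing: $demo_dir/$sub"
            status=1
        fi
    done
    return $status
}
```

For example, `check_demo_folder MergePurge`, run from the Oyster3.6.0withDemos folder, prints nothing and returns 0 when all three subfolders are present.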

For Windows

Method:

Step 1: Open the Command Prompt: click Start -> All Programs -> Accessories -> Command Prompt.

Step 2: Change the working directory to C:\OYSTERDemo\Oyster3.6.0withDemos by using the command ‘cd C:\OYSTERDemo\Oyster3.6.0withDemos’.

Step 3: Enter ‘java -jar oyster-3.6.0.jar’ and press Enter to execute the jar file. The screen will display “Please input the name of the runScript:” as shown in Figure 2.

Step 4: Enter the name of the runScript, for example 'MergePurgeRunScript.xml', and press Enter to perform the run.

DemoFigure2.JPG

Figure 2: OYSTER Prompt

NOTE: All demonstration runs are performed using a NULL Index. This means every reference is compared to every other reference during matching. This is typically not desirable for runs of any significant size, for which a user-defined index (UDI) should be defined. For information on configuring a UDI, please refer to the OYSTER User Guide for an explanation and to the OYSTER Reference Guide for the syntax and an example.
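To see why comparing every reference to every other reference does not scale, the following sketch counts the pairs involved. It assumes each pair of references is compared exactly once, which gives n * (n - 1) / 2 comparisons for n references; `pair_count` is a hypothetical helper for illustration, and the actual work OYSTER performs may differ.

```shell
#!/bin/sh
# Hypothetical helper: number of pairwise comparisons when each of n
# references is compared with every other reference exactly once.
pair_count() {
    n=$1
    echo $(( n * (n - 1) / 2 ))
}

pair_count 1000      # prints 499500
pair_count 1000000   # prints 499999500000
```

A thousand references already mean roughly half a million comparisons, and a million references mean roughly half a trillion, which is why an index is recommended for larger runs.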

For Mac

Method:

Step 1: After downloading and moving the zip file to the desired location as specified above, open the unzipped folder ('Oyster3.6.0withDemos').

Step 2: Locate the runScript you want to use (e.g., 'MergePurgeRunScript.xml'). Open this script with a text editor of your choice.

USEFUL TIP: Windows uses backslashes (\) as directory separators, whereas macOS uses forward slashes (/). For this reason, you have to update the file paths in MergePurgeRunScript.xml (or whichever RunScript you select). Four tags in the RunScript need to be updated: <LogFile>, <AttributePath>, <LinkOutput>, and <Source>.

Step 3: Change these four lines in your RunScript as follows. (NOTE: Only the backslashes change to forward slashes; leave everything else as it is in your file. Take care not to change the slashes in your closing tags or your specific attributes (Size, Num, etc.).)

#!xml
<LogFile Num="5" Size="100000000">.\MergePurge\Output\MergePurgeLog_%g.log</LogFile>
<AttributePath>.\MergePurge\Scripts\MergePurgeAttributes.xml</AttributePath> 
<LinkOutput Type="TextFile">.\MergePurge\Output\MergePurgeIndex.link</LinkOutput>
<Source>.\MergePurge\Scripts\MergePurgeSourceDescriptor.xml</Source>
to
#!xml
<LogFile Num="5" Size="100000000">./MergePurge/Output/MergePurgeLog_%g.log</LogFile>
<AttributePath>./MergePurge/Scripts/MergePurgeAttributes.xml</AttributePath> 
<LinkOutput Type="TextFile">./MergePurge/Output/MergePurgeIndex.link</LinkOutput>
<Source>./MergePurge/Scripts/MergePurgeSourceDescriptor.xml</Source>

Step 4: Save the file.

Step 5: Open the folder for the run script (in this case, the MergePurge folder), then open its 'Scripts' subfolder. You should now be in /Oyster3.6.0withDemos/MergePurge/Scripts.

Step 6: Open 'MergePurgeSourceDescriptor.xml' in your text editor (or the SourceDescriptor.xml script of whichever algorithm you are working with).

Step 7: For the same reason as above, update the <Source> tag as follows. Again, do not change your other attributes to those shown here; only update the slashes in the file path.

#!xml
<Source Type="FileDelim" Char="|" Qual="" Labels="Y">.\MergePurge\Input\MergePurgeTest.txt</Source>
to

#!xml
<Source Type="FileDelim" Char="|" Qual="" Labels="Y">./MergePurge/Input/MergePurgeTest.txt</Source>
Step 8: Save the file.

Step 9: Using Terminal (the command-line tool on macOS), change to the extracted folder with the following command:

#!bash
cd <the/rest/of/your/path/here>/Oyster3.6.0withDemos
Step 10: From this directory in Terminal, run the OYSTER JAR using the following command:

#!bash
java -jar oyster-3.6.0.jar
You should be prompted to enter the name of the run script, as shown below:

Screen Shot 2019-03-08 at 12.09.00 PM.png

REMEMBER: Scripts vary, and the scripts for the algorithm you chose may have fewer or more tags containing file paths. If you are still having problems, review the -RunScript.xml, -Attributes.xml, and -SourceDescriptor.xml scripts to ensure your file paths are correct everywhere.
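The manual edits in Steps 2 through 8 can also be sketched as a small shell function. This is an illustration rather than an official OYSTER tool: `fix_paths` is a hypothetical helper, and it assumes that each path sits on a single line between its opening and closing tag and that no other tag content contains backslashes you need to keep. It changes backslashes only in the text that follows a `>` and precedes the next `<`, so attributes and closing tags are left alone, and it keeps a `.bak` copy of the original file.

```shell
#!/bin/sh
# Hypothetical helper (not part of OYSTER): convert backslashes to forward
# slashes inside tag content only, keeping a backup of the original file.
fix_paths() {
    file=$1
    # Keep an untouched copy so the change can be reverted.
    cp "$file" "$file.bak" || return 1
    # Replace one backslash found between '>' and the next '<', then loop
    # (label a / branch ta) until no such backslash is left on the line.
    sed -e ':a' -e 's#\(>[^<]*\)\\#\1/#' -e 'ta' "$file.bak" > "$file"
}
```

For the MergePurge run, this would be called from the Oyster3.6.0withDemos folder as `fix_paths MergePurgeRunScript.xml` and `fix_paths MergePurge/Scripts/MergePurgeSourceDescriptor.xml`. Compare each file with its `.bak` copy afterwards to confirm that only the paths changed.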

COMMON ERRORS AND SOLUTIONS FOR MAC: If you run the JAR file and get the following errors:

#!java
SEVERE: ##JAVA: .\MergePurge\Input\MergePurgeTest.txt (No such file or directory)
java.io.FileNotFoundException: .\MergePurge\Input\MergePurgeTest.txt (No such file or directory)

/Oyster3.6.0withDemos/.\MergePurge\Scripts\MergePurgeSourceDescriptor.xml (No such file or directory)

java.io.FileNotFoundException: /Oyster3.6.0withDemos/.\MergePurge\Scripts\MergePurgeAttributes.xml (No such file or directory)
or you get the output

#!java

Process ended with Errors! Please check log file: .\MergePurge\Output\MergePurgeLog_0.log
but your log directory is empty or your log file is never updated, then the file paths were not completely updated. Review Steps 1 - 8 above, taking special care to change every backslash to a forward slash.
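One way to track down a path that was missed is to search the scripts for any remaining backslash. The following is a minimal sketch, assuming the scripts are the `.xml` files under the folder you pass in; `find_backslashes` is a hypothetical helper, not part of OYSTER.

```shell
#!/bin/sh
# Hypothetical helper (not part of OYSTER): print every line in the .xml
# files under a folder that still contains a backslash, as file:line:text.
find_backslashes() {
    dir=${1:-.}
    # /dev/null makes grep print the file name even when only one file is searched.
    find "$dir" -name '*.xml' -exec grep -n '\\' /dev/null {} +
}
```

Running `find_backslashes .` from the Oyster3.6.0withDemos folder lists the file, line number, and text of each line that still needs attention. No output means no backslashes remain in the `.xml` files.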

There are 22 demonstration runs in OYSTER. The demonstration runs are ordered by special purpose, followed by the logic of Entity Resolution. A detailed explanation of each demonstration run can be found by clicking its link listed below.
