There are two ways to obtain the datasets:
- recreate them with the BSBM tools (PHASE 1)
- or download Store.zip, unzip it into the Store/ directory in the main folder of the project, and go to PHASE 2
### PHASE 1: Create the datasets ###
1. Download the latest version of the Berlin SPARQL Benchmark tools at
2. Extract the zip into a chosen directory and build it following the instructions at http://bsbmtools.sourceforge.net/ (use build.xml)
3. Set the environment variable BSBMROOT to the directory where you built the BSBM tools. On Linux, for example, run this command from that directory: export BSBMROOT=`pwd`
4. Return to the SparqlRank directory and run the script generateDatasets, which generates the 100K, 250K, 500K, 1M and 5M datasets and loads them into TDB. The TDB BSBM datasets are then available in the Store folder.
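Because step 4 depends on BSBMROOT being set in step 3, a small sanity check before running generateDatasets can save a failed run. The following is a minimal sketch; `check_bsbmroot` is a hypothetical helper name, not part of SparqlRank or the BSBM tools.

```shell
# Hypothetical helper: verify BSBMROOT before running generateDatasets.
check_bsbmroot() {
  # Fail if the variable is unset or empty.
  if [ -z "${BSBMROOT:-}" ]; then
    echo "BSBMROOT is not set" >&2
    return 1
  fi
  # Fail if it does not point to an existing directory.
  if [ ! -d "$BSBMROOT" ]; then
    echo "BSBMROOT is not a directory: $BSBMROOT" >&2
    return 1
  fi
  echo "BSBMROOT=$BSBMROOT"
}
```

A typical use would be to run `export BSBMROOT="$(pwd)"` from the BSBM tools directory, return to the SparqlRank directory, and call `check_bsbmroot` before starting the generation script.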
### PHASE 2: Run the test ###
Run sparqlrank.CompleteExperiment with appropriate main memory settings. For example, our experiments ran on a machine with 4 GB of RAM and the maximum heap size set to 2 GB (by adding -Xmx2G to the JVM settings).
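For readers launching from a terminal rather than from an IDE, the sketch below shows where the heap option sits on a java command line. It only prints the command (a dry run). The helper name `build_run_command` and the classpath value `bin` are assumptions, not taken from the project; since the program relies on AspectJ, a real run may need further classpath entries.

```shell
# Hypothetical dry-run helper: prints the java command instead of executing it.
# $1 = maximum heap size (default 2G, as in the experiments described above)
# $2 = classpath (default "bin" is an ASSUMED location of the compiled classes)
build_run_command() {
  heap="${1:-2G}"
  classpath="${2:-bin}"
  echo "java -Xmx${heap} -cp ${classpath} sparqlrank.CompleteExperiment"
}

build_run_command 2G bin
```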
WARNING1: The program was developed and tested in the Eclipse environment with the AspectJ plugin.
WARNING2: The experiments will probably take a long time; Experiment1 takes one hour.
### PHASE 3: Get the results ###
For each run of the CompleteExperiment, a new Experiments_XXXX folder is created.
Inside this folder there are two subfolders:
- Experiment_1 contains the measurements for all the different strategies on a single query (the one in "queryExperiment2")
- Experiment_2 contains the evaluation of the strategies on a benchmark of queries (the ones contained in the "queries" folder)
The results are CSV files named "measures_X_X_X.txt", in which the last value is the execution time of a single query execution.
Further details about the query execution, such as query execution trees or the number of mappings passed through each operator, are in the "treeDirectory" folder.
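To pull the execution times out of a measures file, something like the following sketch may help. It assumes that the files are comma-separated and that each line holds one query execution with the time as its last field; neither detail is confirmed above, so check a file first. `last_values` is a hypothetical helper name.

```shell
# Hypothetical helper: print the last comma-separated field of every non-empty line.
# ASSUMES comma delimiters and one query execution per line.
last_values() {
  awk -F',' 'NF > 0 {
    v = $NF
    gsub(/^[ \t]+|[ \t\r]+$/, "", v)   # trim surrounding whitespace and a trailing CR
    print v
  }' "$1"
}
```

For example, `last_values Experiments_XXXX/Experiment_1/measures_X_X_X.txt` would list one value per line, ready to be piped into another tool for averaging or plotting.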