Congruence Closure for owl:sameAs

A good test dataset is http://vmlion25.deri.ie/btc-2009-small.nq.gz

First, preprocess the dataset so that it contains only owl:sameAs links. That is enough for a first attempt.

gzip -dc btc-2009-small.nq.gz | \
grep 'http://www.w3.org/2002/07/owl#sameAs' | \
gzip -9c - > btc-2009-small-sameAs.nq.gz
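
The same filtering step can also be sketched in Python, which is handy when grep is not available or when the match should be restricted to the predicate position. This is only an illustration, not part of the closure tool: the function names are made up for this example, and it assumes one statement per line with whitespace-separated terms. Unlike the grep above, it does not keep lines where the owl:sameAs URI appears only as an object or graph name.

```python
import gzip

OWL_SAME_AS = "<http://www.w3.org/2002/07/owl#sameAs>"


def is_same_as_quad(line):
    """Return True if the second term (the predicate) of the line is owl:sameAs."""
    # Subjects are IRIs or blank nodes and contain no whitespace,
    # so the first two whitespace-separated fields are subject and predicate.
    parts = line.split(None, 2)
    return len(parts) == 3 and parts[1] == OWL_SAME_AS


def filter_same_as(src_path, dst_path):
    """Copy only owl:sameAs statements between gzipped N-Quads files.

    Returns the number of statements kept.
    """
    kept = 0
    with gzip.open(src_path, "rt", encoding="utf-8") as src, \
            gzip.open(dst_path, "wt", encoding="utf-8", compresslevel=9) as dst:
        for line in src:
            if is_same_as_quad(line):
                dst.write(line)
                kept += 1
    return kept


if __name__ == "__main__":
    count = filter_same_as("btc-2009-small.nq.gz", "btc-2009-small-sameAs.nq.gz")
    print("kept", count, "owl:sameAs statements")
```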


Then load the resulting file into Virtuoso. The file must be somewhere on the allowed paths (the DirsAllowed setting in virtuoso.ini) so that the database can find it:

-- the flags are:
--
SELECT DB.DBA.TTLP (
gz_file_open('btc-2009-small-sameAs.nq.gz'),
'http://eris.okfn.org/ww/2011/01/closure/btc-2009-small-sameAs',
'http://eris.okfn.org/ww/2011/01/closure/btc-2009-small-sameAs',
512
);


Configuration

Now we need to make a configuration file for the closure tool. To speak to Virtuoso we need something like this:

{
"rdflib.store": "Virtuoso",
"rdflib.args": "DSN=VOS;UID=dba;PWD=dba;WideAsUTF16=Y",
"sqlalchemy.config": "sqlite://",
#"sqlalchemy.config": "virtuoso://dba:dba@VOS"
}
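
Note that the snippet above contains a # comment and a trailing comma, so a strict JSON parser would reject it, although it is a valid Python dict literal. How the closure tool itself reads this file is not described here. The following is only a sketch, assuming the file is treated as a Python literal; load_config and check_config are hypothetical helpers, not functions of the tool.

```python
import ast

CONFIG_TEXT = """{
    "rdflib.store": "Virtuoso",
    "rdflib.args": "DSN=VOS;UID=dba;PWD=dba;WideAsUTF16=Y",
    "sqlalchemy.config": "sqlite://",
    #"sqlalchemy.config": "virtuoso://dba:dba@VOS"
}"""

REQUIRED_KEYS = ("rdflib.store", "rdflib.args", "sqlalchemy.config")


def load_config(text):
    """Parse the configuration as a Python literal (assumption, see above)."""
    config = ast.literal_eval(text.strip())
    if not isinstance(config, dict):
        raise ValueError("configuration must be a dict")
    return config


def check_config(config):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = ["missing key: " + key for key in REQUIRED_KEYS if key not in config]
    # The temporary table is reported not to work well on Virtuoso itself.
    if str(config.get("sqlalchemy.config", "")).startswith("virtuoso://"):
        problems.append("sqlalchemy.config points at Virtuoso; use another database")
    return problems


if __name__ == "__main__":
    cfg = load_config(CONFIG_TEXT)
    for problem in check_config(cfg):
        print("warning:", problem)
```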


You also need to install the Virtuoso back end and the patched pyodbc, as described at http://packages.python.org/virtuoso/installation.html

NOTE: the temporary SQLAlchemy table used for intermediate processing does not seem to work properly with Virtuoso because of Unicode problems, so point sqlalchemy.config at another database. For this test dataset an in-memory SQLite database is sufficient, because the data is small (22571 rows).
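
To get a feel for why an in-memory database is enough at this size, here is a rough sketch using only the standard library sqlite3 module. The SQLAlchemy URL sqlite:// likewise refers to an in-memory SQLite database. The table name, column names and rows below are invented for illustration and are not the schema the closure tool uses.

```python
import sqlite3


def build_scratch_table(pairs):
    """Load (subject, object) pairs into a throwaway in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE same_as (s TEXT NOT NULL, o TEXT NOT NULL)")
    conn.executemany("INSERT INTO same_as (s, o) VALUES (?, ?)", pairs)
    conn.commit()
    return conn


if __name__ == "__main__":
    # Synthetic rows with non-ASCII characters, same count as the test dataset.
    rows = [
        ("http://example.org/é/%d" % i, "http://example.org/ü/%d" % i)
        for i in range(22571)
    ]
    conn = build_scratch_table(rows)
    print(conn.execute("SELECT COUNT(*) FROM same_as").fetchone()[0])
```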

Assuming the