Test and Import different results

Issue #32 open
Murray Bryant created an issue

Hi

I have a simple setup. Running test completes with no errors and correctly reports the number of nodes and edges to be created. Running import afterwards creates no nodes and reports an incorrect number of edges (still with no errors).

Please see the output below.

I can provide the data to replicate the issue if requested.

neo4j-databridge-shell$ test hallscreek
2017-07-28 14:30:47,360 INFO .databridge.notification.ProgressMonitor: 87 - Starting profiling phase.
2017-07-28 14:30:47,410 INFO .databridge.notification.ProgressMonitor: 87 - Profiling company.json
2017-07-28 14:30:47,817 INFO .databridge.notification.ProgressMonitor: 87 - Profiling person.json
2017-07-28 14:30:47,892 INFO aware.neo4j.databridge.profiler.Profiler: 118 - Profiling finished. Rows processed: 94, time: 0 hrs, 0 mins, 0 secs, rate: 178 rows/sec
2017-07-28 14:30:47,892 INFO .databridge.notification.ProgressMonitor: 87 - Profiling phase complete. Time: 0 hrs, 0 mins, 0 secs
2017-07-28 14:30:47,892 INFO .databridge.notification.ProgressMonitor: 87 - Starting loading phase
2017-07-28 14:30:47,896 INFO .databridge.notification.ProgressMonitor: 87 - Importing company.json
2017-07-28 14:30:48,112 INFO aware.neo4j.databridge.DatabridgePlugins: 73 - Loading plugins:
2017-07-28 14:30:48,429 INFO aware.neo4j.databridge.DatabridgePlugins: 118 - com.graphaware.neo4j.databridge.plugins.ceil:ceil.groovy
2017-07-28 14:30:48,451 INFO aware.neo4j.databridge.DatabridgePlugins: 118 - com.graphaware.neo4j.databridge.plugins.days:days.groovy
2017-07-28 14:30:48,463 INFO aware.neo4j.databridge.DatabridgePlugins: 118 - com.graphaware.neo4j.databridge.plugins.floor:floor.groovy
2017-07-28 14:30:48,474 INFO aware.neo4j.databridge.DatabridgePlugins: 118 - com.graphaware.neo4j.databridge.plugins.integer:integer.groovy
2017-07-28 14:30:48,490 INFO aware.neo4j.databridge.DatabridgePlugins: 118 - com.graphaware.neo4j.databridge.plugins.iso8601_date:iso8601_date.groovy
2017-07-28 14:30:48,506 INFO aware.neo4j.databridge.DatabridgePlugins: 118 - com.graphaware.neo4j.databridge.plugins.julian:julian.groovy
2017-07-28 14:30:48,529 INFO aware.neo4j.databridge.DatabridgePlugins: 118 - com.graphaware.neo4j.databridge.plugins.long_date:long_date.groovy
2017-07-28 14:30:48,539 INFO aware.neo4j.databridge.DatabridgePlugins: 118 - com.graphaware.neo4j.databridge.plugins.real:real.groovy
2017-07-28 14:30:48,553 INFO aware.neo4j.databridge.DatabridgePlugins: 118 - com.graphaware.neo4j.databridge.plugins.round:round.groovy
2017-07-28 14:30:48,565 INFO aware.neo4j.databridge.DatabridgePlugins: 118 - com.graphaware.neo4j.databridge.plugins.string:string.groovy
2017-07-28 14:30:48,586 INFO aware.neo4j.databridge.DatabridgePlugins: 118 - com.graphaware.neo4j.databridge.plugins.unix_date:unix_date.groovy
2017-07-28 14:30:48,597 INFO aware.neo4j.databridge.DatabridgePlugins: 118 - plugins.orbit_location:orbit_location.groovy
2017-07-28 14:30:49,554 INFO .databridge.notification.ProgressMonitor: 87 - Total: rows: 39, nodes: 44, edges: 39, rate: 50 objects/sec, 23 rows/sec
2017-07-28 14:30:49,554 INFO .databridge.notification.ProgressMonitor: 87 - Importing person.json
2017-07-28 14:30:49,630 INFO .databridge.notification.ProgressMonitor: 87 - Total: rows: 94, nodes: 99, edges: 39, rate: 79 objects/sec, 54 rows/sec
2017-07-28 14:30:49,632 INFO .databridge.notification.ProgressMonitor: 87 - Loading phase complete. Time: 0 hrs, 0 mins, 1 secs
2017-07-28 14:30:49,632 INFO .databridge.notification.ProgressMonitor: 87 - Nodes created: 99
2017-07-28 14:30:49,632 INFO .databridge.notification.ProgressMonitor: 87 - Nodes updated: 0
2017-07-28 14:30:49,632 INFO .databridge.notification.ProgressMonitor: 87 - Edges created: 39
2017-07-28 14:30:49,632 INFO .databridge.notification.ProgressMonitor: 87 - Starting index creation phase
2017-07-28 14:30:49,632 INFO .databridge.notification.ProgressMonitor: 87 - Requested: create index :CompanyType(identity)
2017-07-28 14:30:49,632 INFO .databridge.notification.ProgressMonitor: 87 - Requested: create index :Person(identity)
2017-07-28 14:30:49,632 INFO .databridge.notification.ProgressMonitor: 87 - Requested: create index :Company(identity)
2017-07-28 14:30:49,633 INFO .databridge.notification.ProgressMonitor: 87 - Index creation phase finished. Time: 0 hrs, 0 mins, 0 secs
2017-07-28 14:30:49,633 INFO .databridge.notification.ProgressMonitor: 87 - Import finished. Time: 0 hrs, 0 mins, 2 secs

neo4j-databridge-shell$ import hallscreek
2017-07-28 14:30:56,384 INFO .databridge.notification.ProgressMonitor: 87 - Starting profiling phase.
2017-07-28 14:30:56,427 INFO .databridge.notification.ProgressMonitor: 87 - Profiling company.json
2017-07-28 14:30:56,773 INFO .databridge.notification.ProgressMonitor: 87 - Profiling person.json
2017-07-28 14:30:56,828 INFO aware.neo4j.databridge.profiler.Profiler: 118 - Profiling finished. Rows processed: 94, time: 0 hrs, 0 mins, 0 secs, rate: 213 rows/sec
2017-07-28 14:30:56,828 INFO .databridge.notification.ProgressMonitor: 87 - Profiling phase complete. Time: 0 hrs, 0 mins, 0 secs
2017-07-28 14:30:56,828 INFO .databridge.notification.ProgressMonitor: 87 - Starting loading phase
2017-07-28 14:30:56,832 INFO .databridge.notification.ProgressMonitor: 87 - Importing company.json
2017-07-28 14:30:57,043 INFO aware.neo4j.databridge.DatabridgePlugins: 73 - Loading plugins:
2017-07-28 14:30:57,337 INFO aware.neo4j.databridge.DatabridgePlugins: 118 - com.graphaware.neo4j.databridge.plugins.ceil:ceil.groovy
2017-07-28 14:30:57,357 INFO aware.neo4j.databridge.DatabridgePlugins: 118 - com.graphaware.neo4j.databridge.plugins.days:days.groovy
2017-07-28 14:30:57,369 INFO aware.neo4j.databridge.DatabridgePlugins: 118 - com.graphaware.neo4j.databridge.plugins.floor:floor.groovy
2017-07-28 14:30:57,379 INFO aware.neo4j.databridge.DatabridgePlugins: 118 - com.graphaware.neo4j.databridge.plugins.integer:integer.groovy
2017-07-28 14:30:57,397 INFO aware.neo4j.databridge.DatabridgePlugins: 118 - com.graphaware.neo4j.databridge.plugins.iso8601_date:iso8601_date.groovy
2017-07-28 14:30:57,410 INFO aware.neo4j.databridge.DatabridgePlugins: 118 - com.graphaware.neo4j.databridge.plugins.julian:julian.groovy
2017-07-28 14:30:57,432 INFO aware.neo4j.databridge.DatabridgePlugins: 118 - com.graphaware.neo4j.databridge.plugins.long_date:long_date.groovy
2017-07-28 14:30:57,441 INFO aware.neo4j.databridge.DatabridgePlugins: 118 - com.graphaware.neo4j.databridge.plugins.real:real.groovy
2017-07-28 14:30:57,455 INFO aware.neo4j.databridge.DatabridgePlugins: 118 - com.graphaware.neo4j.databridge.plugins.round:round.groovy
2017-07-28 14:30:57,463 INFO aware.neo4j.databridge.DatabridgePlugins: 118 - com.graphaware.neo4j.databridge.plugins.string:string.groovy
2017-07-28 14:30:57,481 INFO aware.neo4j.databridge.DatabridgePlugins: 118 - com.graphaware.neo4j.databridge.plugins.unix_date:unix_date.groovy
2017-07-28 14:30:57,492 INFO aware.neo4j.databridge.DatabridgePlugins: 118 - plugins.orbit_location:orbit_location.groovy
2017-07-28 14:30:58,503 INFO .databridge.notification.ProgressMonitor: 87 - Total: rows: 39, nodes: 44, edges: 1,370, rate: 846 objects/sec, 23 rows/sec
2017-07-28 14:30:58,503 INFO .databridge.notification.ProgressMonitor: 87 - Importing person.json
2017-07-28 14:30:58,585 INFO .databridge.notification.ProgressMonitor: 87 - Total: rows: 94, nodes: 99, edges: 1,370, rate: 838 objects/sec, 53 rows/sec
2017-07-28 14:30:58,588 INFO .databridge.notification.ProgressMonitor: 87 - Loading phase complete. Time: 0 hrs, 0 mins, 1 secs
2017-07-28 14:30:58,588 INFO .databridge.notification.ProgressMonitor: 87 - Nodes created: 0
2017-07-28 14:30:58,588 INFO .databridge.notification.ProgressMonitor: 87 - Nodes updated: 0
2017-07-28 14:30:58,588 INFO .databridge.notification.ProgressMonitor: 87 - Edges created: 0
2017-07-28 14:30:58,588 INFO .databridge.notification.ProgressMonitor: 87 - Starting index creation phase
2017-07-28 14:30:58,588 INFO .databridge.notification.ProgressMonitor: 87 - Requested: create index :CompanyType(identity)
2017-07-28 14:30:58,590 INFO .databridge.notification.ProgressMonitor: 87 - Requested: create index :Person(identity)
2017-07-28 14:30:58,590 INFO .databridge.notification.ProgressMonitor: 87 - Requested: create index :Company(identity)
2017-07-28 14:30:58,674 INFO .databridge.notification.ProgressMonitor: 87 - Index creation phase finished. Time: 0 hrs, 0 mins, 0 secs
2017-07-28 14:30:58,675 INFO .databridge.notification.ProgressMonitor: 87 - Import finished. Time: 0 hrs, 0 mins, 2 secs
neo4j-databridge-shell$
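
For anyone comparing two runs like these, a small script can pull the final summary counters out of each log and show where they disagree. This is only a sketch: the function names `summary_counts` and `diff_runs` are made up for illustration, and the regular expression assumes the counters always appear in the "Nodes created: N" form seen in the output above.

```python
import re

# Matches the end-of-loading summary counters as they appear in the log above.
# The number may contain thousands separators (e.g. "1,370").
SUMMARY = re.compile(r"(Nodes created|Nodes updated|Edges created): (\d+(?:,\d{3})*)")


def summary_counts(log_text):
    """Return the summary counters found in one run's log output."""
    counts = {}
    for name, value in SUMMARY.findall(log_text):
        counts[name] = int(value.replace(",", ""))
    return counts


def diff_runs(test_log, import_log):
    """Return {counter: (test_value, import_value)} for counters that differ."""
    a = summary_counts(test_log)
    b = summary_counts(import_log)
    names = sorted(set(a) | set(b))
    return {n: (a.get(n), b.get(n)) for n in names if a.get(n) != b.get(n)}


# Summary lines copied from the two runs in this issue.
test_log = (
    "2017-07-28 14:30:49,632 INFO .databridge.notification.ProgressMonitor: 87 - Nodes created: 99 "
    "2017-07-28 14:30:49,632 INFO .databridge.notification.ProgressMonitor: 87 - Nodes updated: 0 "
    "2017-07-28 14:30:49,632 INFO .databridge.notification.ProgressMonitor: 87 - Edges created: 39"
)
import_log = (
    "2017-07-28 14:30:58,588 INFO .databridge.notification.ProgressMonitor: 87 - Nodes created: 0 "
    "2017-07-28 14:30:58,588 INFO .databridge.notification.ProgressMonitor: 87 - Nodes updated: 0 "
    "2017-07-28 14:30:58,588 INFO .databridge.notification.ProgressMonitor: 87 - Edges created: 0"
)

print(diff_runs(test_log, import_log))
# {'Edges created': (39, 0), 'Nodes created': (99, 0)}
```

The script searches the whole text rather than going line by line, so it works whether the log has one entry per line or has been pasted as a single run of text.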

Comments (7)

  1. Vince Bickers repo owner

    Hi Murray, I'm looking at the issues you raised. Could you please zip up the import for this issue and mail it to me?

  2. Murray Bryant reporter

    Hi Vince

    This project is importing from a Microsoft SQL Server 2014 database.

    I am happy to send you the entire project plus the database for testing if you want.

    Would putting it into Dropbox work?

    thanks

    Murray

  3. Murray Bryant reporter

    I have uploaded the config files and the database backup to the Dropbox folder we shared last year.

    Please let me know if you need anything more or different.

    regards

    Murray

  4. Vince Bickers repo owner

    Thanks Murray

    Databridge keeps track of which nodes and edges already exist in the graph, and the log indicates that the data has already been imported once. Using one of the existing demos as an example of what I mean:

    bin/databridge import demo/satellites
    ...
    INFO .databridge.notification.ProgressMonitor:  87 - Nodes created: 21
    INFO .databridge.notification.ProgressMonitor:  87 - Nodes updated: 0
    INFO .databridge.notification.ProgressMonitor:  87 - Edges created: 35
    

    If I now run this command a second time:

    bin/databridge import demo/satellites
    
    INFO .databridge.notification.ProgressMonitor:  87 - Nodes created: 0
    INFO .databridge.notification.ProgressMonitor:  87 - Nodes updated: 0
    INFO .databridge.notification.ProgressMonitor:  87 - Edges created: 10
    

    The reason is that the data has already been loaded. (The 10 edges reported as created on the second run are a bug, by the way, which will be fixed shortly.)

    There are a couple of options:

    If you want to create a copy of the same data that is already loaded, use the -c flag:

    bin/databridge import -c ...

    Alternatively, if you want to overwrite (delete) the existing data, use the -d flag:

    bin/databridge import -d ...

    Please let me know if using one of these options resolves your issue.

    Regards, Vince

  5. Murray Bryant reporter

    Hi Vince

    I don't think that is my issue.

    I have been completely removing the graph.db directory in the folder and then running test/import.

    So there should be no existing data. Am I correct in thinking that it is only using the graph.db directory that it creates in the import directory?

    It is probably something I have done wrong in my configuration, but I cannot see what it is. Are you able to look at the company.json files and see if I have made a mistake that is causing this?

    thanks

    Murray

  6. Vince Bickers repo owner

    Actually, the graph.db isn't all you need to delete, because Databridge maintains external information about what has been loaded and what hasn't, and this must be deleted if you want to reload the same set of data.

    Using the -d flag on import ensures that all information about any previous import of a particular dataset is forgotten, and it also deletes the graph.db folder. There is no need to use it with test, because no databases are created and no permanent information about what has been loaded is maintained.
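
    The behaviour described here can be pictured with a toy model. This is a conceptual sketch only, not Databridge's actual code: the class `TrackedImporter`, its two sets, and the `delete` argument are all hypothetical stand-ins for the graph.db folder, the external record of what has been loaded, and the -d flag.

```python
class TrackedImporter:
    """Toy importer that keeps its bookkeeping outside the 'database'."""

    def __init__(self):
        self.graph = set()   # hypothetical stand-in for graph.db
        self.loaded = set()  # hypothetical stand-in for the external load record

    def run(self, rows, delete=False):
        """Import rows and return how many were newly created."""
        if delete:
            # Similar in spirit to import -d: forget previous imports
            # and remove the existing database before loading.
            self.graph.clear()
            self.loaded.clear()
        created = 0
        for row in rows:
            if row in self.loaded:
                # Recorded as already loaded, so nothing is created,
                # even if the database itself no longer contains it.
                continue
            self.graph.add(row)
            self.loaded.add(row)
            created += 1
        return created


importer = TrackedImporter()
rows = ["company-1", "company-2", "person-1"]

print(importer.run(rows))               # 3: first import creates everything
importer.graph.clear()                  # like removing only the graph.db folder
print(importer.run(rows))               # 0: the load record still says "done"
print(importer.run(rows, delete=True))  # 3: record and database both reset
```

    In this model, clearing only the database leaves the load record in place, which is why the second run creates nothing.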

    I think this probably needs to be made clearer in the documentation, and additionally, we should handle invalid/inconsistent flags a bit better.

    Please try with import -d. If it still doesn't work, I'll start to dig a little deeper into your configuration.

    Thanks, Vince
