Created by mason.malone
- Script calls `create_item` on a corrupted CrossrefItem (`ci_item`). Item 1939454 created.
- Script calls `ci_item.create_others`.
- `ProviderItem.create_others` iterates over other provider classes. When it gets to `WosItem`, it calls `WosItem.find_or_create_by_doi(10.1007/s004240050236)`.
- `WosItem.find_or_create_by_doi()` creates a WosItem for that doi, which invokes the `WosItem.complete` callback.
- `WosItem.complete` calls `WosItem.pubmed_item`.
- `WosItem.pubmed_item` calls `PubmedItem.find_or_create_by_pubmed_id(8781202)`.
- `PubmedItem.find_or_create_by_pubmed_id()` creates a PubmedItem for that pubmed_id, which invokes the `PubmedItem.complete` callback.
- `PubmedItem.complete` calls `ProviderItem.complete`.
- `ProviderItem.complete` calls `ProviderItem.create_item`.
- `ProviderItem.create_item` calls `Item.find_or_create_by_provider_item(self)`.
- The doi column of PubmedItem 8781202 is blank, so `Item.find_or_create_by_provider_item(...)` calls `where(...).first_or_create` using the `PubmedItem` record.
- `Item.first_or_create` creates Item 1939455 because of slight differences in the author list: `Item.find(1939454).author_list => "L Kornet, JR Jansen, EJ Gussenhoven, A Versprille"` versus `Item.find(1939455).author_list => "L Kornet, J R Jansen, E J Gussenhoven, A Versprille"`.
- Control returns to `ProviderItem.complete`, which calls `item.complete`.
- `Item.complete` calls `Item.merge_similar`.
- `Item.merge_similar` calls `Item.similar_items(false)`.
- `Item.similar_items` searches ElasticSearch with the title and author list of Item 1939455. If Item 1939454 has already been indexed by ElasticSearch (which is likely, since the timespan between when it was created and this step is about 2 seconds), then it will match.
- If Item 1939454 was matched, it will be returned to `Item.merge_similar`, which will call `Item.replace_with(self)`. `Item.replace_with` will delete Item 1939454 and update everything to use Item 1939455.
- Control returns to `ProviderItem.create_others`, which will see that `WosItem.item.nil?` is true. This is because `WosItem.complete` does not call `ProviderItem.complete`, which is what normally sets the item. Thus, all newly-created `WosItem`s will have a nil item.
- `ProviderItem.create_others` calls `WosItem.instance.update_column(:item_id, 1939454)`. This results in corruption, since Item 1939454 has been deleted.
- Control returns to the script, which calls `ci_item.item.complete`. `ci_item.item` will correspond to Item 1939454. Although its row was deleted, the Ruby object still exists in memory.
- `Item.complete` calls `Item.merge_similar`.
- `Item.merge_similar` calls `Item.similar_items(false)`. If Item 1939455 has been indexed by then, it will match. Unlike before, the timing is much more sensitive. The last time I caused this to happen, the time between when Item 1939455 was created and this step was 60 milliseconds. The index refresh rate for ElasticSearch is set to 1 second, so there's roughly a 6% chance (60 ms out of 1000 ms) that the item has been indexed at this point.
- If Item 1939455 was matched, it will be returned to `Item.merge_similar`, which will call `Item.replace_with(self)`. `Item.replace_with` will delete Item 1939455 and update everything to use itself (Item 1939454). Since 1939454 was deleted, this will corrupt everything that was updated in the previous call to `Item.replace_with`, which can number in the thousands of rows.
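
The two author lists quoted in the trace differ only in how initials are spaced (`JR` versus `J R`), which is enough for an exact-match `first_or_create` lookup to miss. As a minimal sketch, assuming initials always appear as short runs of capital letters, a hypothetical helper `normalize_author_list` (not part of the application described here) could canonicalize both forms before they are compared:

```ruby
# Hypothetical helper, for illustration only: splits runs of 1-3 capital
# letters (assumed to be initials) into single letters so that "JR Jansen"
# and "J R Jansen" compare as equal.
def normalize_author_list(author_list)
  authors = author_list.split(",").map do |author|
    tokens = author.strip.split(/\s+/).flat_map do |token|
      # Assumption: an all-caps token of up to three letters is a run of initials.
      token.match?(/\A[A-Z]{1,3}\z/) ? token.chars : [token]
    end
    tokens.join(" ")
  end
  authors.join(", ")
end

crossref_authors = "L Kornet, JR Jansen, EJ Gussenhoven, A Versprille"
pubmed_authors   = "L Kornet, J R Jansen, E J Gussenhoven, A Versprille"

same = normalize_author_list(crossref_authors) == normalize_author_list(pubmed_authors)
puts "author lists match after normalization: #{same}"
```

The heuristic would also split a short all-caps surname, so it is only a starting point for whatever matching rule the real lookup should use.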
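
The second merge is destructive because it starts from an in-memory object whose row is already gone. The following self-contained simulation illustrates that sequence with a hypothetical `FakeItem` class and in-memory hashes standing in for the database. The class, the `guarded:` option, and `StaleItemError` are all assumptions made for illustration, and `replace_with` follows one reading of the trace (the receiver is deleted, the argument survives); it is not the application's actual code.

```ruby
# Hypothetical stand-in for the Item model; hashes play the role of tables.
class FakeItem
  STORE = {}       # id => FakeItem, stands in for the items table
  REFERENCES = {}  # row name => item_id, stands in for foreign keys

  class StaleItemError < StandardError; end

  attr_reader :id

  def initialize(id)
    @id = id
    STORE[id] = self
  end

  # True while the "row" for this object still exists.
  def persisted?
    STORE.key?(id)
  end

  # Deletes the receiver and repoints every reference to `survivor`.
  # With guarded: true, refuses to merge into an already-deleted survivor.
  def replace_with(survivor, guarded: false)
    if guarded && !survivor.persisted?
      raise StaleItemError, "Item #{survivor.id} has already been deleted"
    end
    REFERENCES.keys.each do |row|
      REFERENCES[row] = survivor.id if REFERENCES[row] == id
    end
    STORE.delete(id)
  end
end

# Replays the two merges and returns the references left pointing at
# deleted items.
def simulate_double_merge(guarded:)
  FakeItem::STORE.clear
  FakeItem::REFERENCES.clear
  first  = FakeItem.new(1939454)
  second = FakeItem.new(1939455)
  FakeItem::REFERENCES[:crossref_item] = first.id

  first.replace_with(second, guarded: guarded)  # first merge: 1939454 deleted
  second.replace_with(first, guarded: guarded)  # second merge: `first` is stale
  FakeItem::REFERENCES.reject { |_row, item_id| FakeItem::STORE.key?(item_id) }
end

dangling = simulate_double_merge(guarded: false)
puts "dangling references without a guard: #{dangling.inspect}"
```

Calling `simulate_double_merge(guarded: true)` raises `StaleItemError` on the second merge instead, leaving the reference on the surviving item. A real fix might check the record's existence in the database before merging, which this sketch does not attempt to model.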