Remove unwanted entries from DB
Due to regular crashes of qrscu.py (issues #476, #478) some data is only partially present in the DB (e.g. with empty station name, empty study description), resulting in entries "Error has occurred - import probably unsuccessful" in the summary table.
When trying to remove these from the remapp_generalstudymoduleattr table (using phppgadmin), I'm running against foreign key constraints. Apparenty Django doesn't support "delete on cascade", but instead recursively deletes all related entries itself (for instance when deleting data via the web interface).
After fixing a particular bug I would like to reprocess the missing data, but I can't remove it, and also can't reprocess because the studyinstanceuid's are already present.
There's no way of deleting this data from the web interface as filtering on displayname=None doesn't work. Is there any other way?
Comments (43)
-
-
reporter phppgadmin: did you do anything special for that? Django by default doesn't create tables/triggers with delete on cascade enabled...
Thanks for the links; I'll have a look at it. -
It was pgAdmin III, which was a client GUI that used to be shipped with postgres when installed on Windows or Ubuntu desktop. There was a menu item 'drop cascaded', and I think I remember it working! However, I could be wrong. And I can't remember if I ever used it to delete individual records.
I see that pgAdmin III has now been replaced by a web service installed pgAdmin 4.
-
reporter pgadmin3 results in the same error as phppgadmin, so I think I'll have to use Django. Never used Django and/or filtering though, so I'll have to brush up on that first :)
ERROR: update or delete on table "remapp_generalstudymoduleattr" violates foreign key constraint "ce1c45a50ae89eb00ce7e9096fdae11e" on table "remapp_projectionxrayradiationdose" SQL state: 23503 Detail: Key (id)=(14880) is still referenced from table "remapp_projectionxrayradiationdose".
-
reporter Works like a charm!
python manage.py shell from remapp.models import GeneralStudyModuleAttr, GeneralEquipmentModuleAttr # delete entries with study instance uid = NULL a = GeneralStudyModuleAttr.objects.filter(study_instance_uid=None) a.count() a.delete() a.count() # delete entries without corresponding entry in the GeneralEquipmentModuleAttr table a = GeneralStudyModuleAttr.objects.exclude(id__in = GeneralEquipmentModuleAttr.objects.values_list('general_study_module_attributes_id', flat=True)) a.count() a.delete() a.count() exit()
-
You might do something like this. Make sure you are in your virtualenv if you are using one, and navigate to the openrem folder that contains
manage.py
.python manage.py shell
>>> from remapp.models import GeneralStudyModuleAttr >>> a = GeneralStudyModuleAttr.objects.all() >>> a.count() >>> b = a[123] >>> b. # press tab b.DoesNotExist( b._perform_date_checks( b.MultipleObjectsReturned( b._perform_unique_checks( b.__class__( b._save_parents( b.__delattr__( b._save_table( b.__dict__ b._set_pk_val( b.__doc__ b._state b.__eq__( b.accession_hashed b.__format__( b.accession_number b.__getattribute__( b.check( b.__hash__( b.clean( b.__init__( b.clean_fields( b.__module__ b.ctradiationdose_set( b.__ne__( b.date_error_message( b.__new__( b.delete( b.__reduce__( b.from_db( b.__reduce_ex__( b.full_clean( b.__repr__( b.generalequipmentmoduleattr_set( b.__setattr__( b.get_deferred_fields( b.__setstate__( b.id b.__sizeof__( b.modality_type b.__str__( b.name_of_physician_reading_study b.__subclasshook__( b.operator_name b.__unicode__( b.patientmoduleattr_set( b.__weakref__ b.patientstudymoduleattr_set( b._base_manager b.performing_physician_name b._check_column_name_clashes( b.physician_of_record b._check_field_name_clashes( b.pk b._check_fields( b.prepare_database_save( b._check_id_field( b.procedure_code_meaning b._check_index_together( b.procedure_code_value b._check_local_fields( b.projectionxrayradiationdose_set( b._check_long_column_names( b.referring_physician_identification b._check_m2m_through_same_relationship( b.referring_physician_name b._check_managers( b.refresh_from_db( b._check_model( b.requested_procedure_code_meaning b._check_ordering( b.requested_procedure_code_value b._check_swappable( b.save( b._check_unique_together( b.save_base( b._default_manager b.serializable_value( b._deferred b.study_date b._do_insert( b.study_description b._do_update( b.study_id b._get_FIELD_display( b.study_instance_uid b._get_next_or_previous_by_FIELD( b.study_time b._get_next_or_previous_in_order( b.study_workload_chart_time b._get_pk_val( b.unique_error_message( b._get_unique_checks( b.validate_unique( b._meta
Now if you have an accession number or study instance uid or date or modality that you want to delete, it's easy!
>>> c = a.filter(study_instance_uid__exact = '1.3.51.0.1.1.192.134.90.77.1000234153.16601234') >>> c.count() 1 >>> c.delete() >>> c.count() 0
If you want to filter on field in other tables, then it gets more interesting.
You can see what fields are available using the same trick as before:
>>> c = b.generalequipmentmoduleattr_set.get() >>> c. # press tab c.DoesNotExist( c._get_next_or_previous_in_order( c.MultipleObjectsReturned( c._get_pk_val( c.__class__( c._get_unique_checks( c.__delattr__( c._meta c.__dict__ c._perform_date_checks( c.__doc__ c._perform_unique_checks( c.__eq__( c._save_parents( c.__format__( c._save_table( c.__getattribute__( c._set_pk_val( c.__hash__( c._state c.__init__( c._unique_equipment_name_cache c.__module__ c.check( c.__ne__( c.clean( c.__new__( c.clean_fields( c.__reduce__( c.date_error_message( c.__reduce_ex__( c.date_of_last_calibration c.__repr__( c.delete( c.__setattr__( c.device_serial_number c.__setstate__( c.from_db( c.__sizeof__( c.full_clean( c.__str__( c.gantry_id c.__subclasshook__( c.general_study_module_attributes c.__unicode__( c.general_study_module_attributes_id c.__weakref__ c.get_deferred_fields( c._base_manager c.id c._check_column_name_clashes( c.institution_address c._check_field_name_clashes( c.institution_name c._check_fields( c.institutional_department_name c._check_id_field( c.manufacturer c._check_index_together( c.manufacturer_model_name c._check_local_fields( c.pk c._check_long_column_names( c.prepare_database_save( c._check_m2m_through_same_relationship( c.refresh_from_db( c._check_managers( c.save( c._check_model( c.save_base( c._check_ordering( c.serializable_value( c._check_swappable( c.software_versions c._check_unique_together( c.spatial_resolution c._default_manager c.station_name c._deferred c.time_of_last_calibration c._do_insert( c.unique_equipment_name c._do_update( c.unique_equipment_name_id c._general_study_module_attributes_cache c.unique_error_message( c._get_FIELD_display( c.validate_unique( c._get_next_or_previous_by_FIELD( >>> d = a.filter(generalequipmentmoduleattr__station_name__exact = 'CTAWP34582') >>> d.count() 2342 >>> d.delete() >>> d.count() 0
You can see that to get between tables, you use a
__
(double underscore), and then you use one of the filtering terms, such asexact
,contains
,icontains
for a case insensitive version etc. See the Django docs for more. -
Gah! You got there first whilst I was writing my response!
Looks like a very useful query you have formed there though. I'll hang on to that one! We should probably incorporate them into the interface if we thing they are safe...
-
- changed milestone to 0.8.0
-
assigned issue to
-
Added more of the columns. Sort of refs
#580, and is working towards ref#479→ <<cset 73869d101879>>
-
Added accumulated columns for CR and DR. Sort of refs
#580, and is working towards ref#479→ <<cset e686edfb3e40>>
-
Finished all the columns Sort of refs
#580, and is working towards ref#479→ <<cset e12932afaeef>>
-
Modified column headings, added line breaks, added detail to CT events, added not-patient indicator Sort of refs
#580, and is working towards ref#479→ <<cset 8cacea292dfb>>
-
More column titles adjustment, and adding in non-breaking spaces Sort of refs
#580, and is working towards ref#479→ <<cset a4d1cb5930c3>>
-
Changed URL as this is no-longer just about the odd studies Sort of refs
#580, and is working towards ref#479→ <<cset c65c96f43ec6>>
-
Changed name of view. Accumulated fluoro and Acquisition only display if present. Sort of refs
#580, and is working towards ref#479→ <<cset a4bb12d0a00b>>
-
Reduced duplication by making ad-hoc reprocess use the initial function used when dual is selected. Also moved the review_summary_list function. Sort of refs
#580, and is working towards ref#479→ <<cset 33e000999f5c>>
-
Some DX were being incorrectly not identified in reprocess - need to check initial DX imports... Sort of refs
#580, and is working towards ref#479→ <<cset 20034536975f>>
-
Renamed template Sort of refs
#580, and is working towards ref#479→ <<cset 5e8bdd0b1c83>>
-
Starting to add the delete function Refs
#479→ <<cset 36ea1cb5c1f9>>
-
Reformat Refs
#479→ <<cset bbca4ce20632>>
-
Rudimentary button added which does ask for confirmation but doesn't delete yet Refs
#479→ <<cset 3d647ad1a39a>>
-
Renamed button Refs
#479→ <<cset ca2ecb68192c>>
-
Delete of studies without deleting table entry now working, needs polish! Refs
#479→ <<cset e61fa68086f6>>
-
All working now I think, but needs pagination to be usable for kit with lots of studies. Refs
#479→ <<cset 69f167294f66>>
-
Added pagination. Fixed a bug where POST data False is True... Refs
#479→ <<cset 6d5f0587f1ec>>
-
Allowing no break spaces to be displayed Refs
#479→ <<cset f763f58f43a3>>
-
The studies that can't be managed in the interface currently are those with no GeneralEquipmentModuleAttr table, which leads to the 'import probably unsuccessful' message originally commented on in this issue.
In my database, with hundreds of thousands of studies, the queries proposed by TIm do not return (at least they don't return in a timeframe I am willing to wait!).
However, the same can be achieved in no time at all with, for example:
>>> from remapp.models import GeneralStudyModuleAttr >>> studies = GeneralStudyModuleAttr.objects.filter(modality_type__exact='CT') >>> studies.count() 153537 >>> broken = studies.filter(generalequipmentmoduleattr__unique_equipment_name__display_name__isnull=True) >>> broken.count() 4 >>> broken.delete() # or whatever you want to do with them
At this stage, I'm going to put management of this to a later version.
-
Hiding review column if not an administrator Refs
#479→ <<cset 69e87f228d53>>
-
- changed milestone to Future
Management of the 'import probably unsuccessful' can wait till the next release. See https://bitbucket.org/openrem/openrem/issues/479/remove-unwanted-entries-from-db#comment-42792761 for instructions on how to do this manually.
-
Initial view, URL and template for issue ref
#479→ <<cset 1de8aaf9b5f4>>
-
Now with own template - currently just added accession number. Links from display names. Refs
#479→ <<cset 9719bfe85b58>>
-
Added equipment columns. Refs
#479→ <<cset 85ed7c3e596b>>
-
Removed attempt to check and display display name as it isn't there by definition! Refs
#479→ <<cset 4fd3f937f06d>>
-
Delete studies now works - tested with one modality so far. Message didn't seem to be raised. Need to link to page from front page instead of 'error has occurred'. Refs
#479→ <<cset 0b8ccb60e867>>
-
Link from homepage sorted. Now need to sort out table on review names page and review attribute error for utf-8 encoding in update_latest_studies. Refs
#479→ <<cset 8cfa499a36dd>>
-
Table on review display names page done with AJAX update. Leaves review of attribute error for utf-8 encoding in update_latest_studies. Refs
#479→ <<cset 9946830be31e>>
-
Normalised the order of modalities on display name page. Added wait comments. Modified display_name utf-8 error message. Not expected to see this in the wild. Refs
#479→ <<cset c999e361e1ca>>
-
Added ref
#479to changes and release notes→ <<cset 697930fd08f4>>
-
Modified so a non-admingroup user sees a different message and can't click through from the home page. refs
#479→ <<cset 6f8f59dd43c1>>
-
Attempt at brief docs for failed import review feature. refs
#479→ <<cset 5c618d7268f0>>
-
Minor changes to i_deletestudies.rst, added short text to i_displaynames.rst. Refs
#479. [skip ci] docs only→ <<cset d28954ffa06d>>
-
Removing static URLs. Refs
#479,#667→ <<cset 0d3a7161ddf8>>
-
- changed status to resolved
- Log in to comment
For you, yes, for others, not so much.
phppgadmin should have worked - I've successfully done this with pgAdmin before - I just had to make sure that delete on cascade was enabled.
I usually use a variation on the instructions to delete all the records, but do some filtering first. Always remember to backup the database before doing this!.
I find it helpful to assign one of the records to a variable in the shell to enable 'dot-tab' completion (
b=a[0]
). Also there are a couple of out-of-date png files in thestuff
folder that show the database with and without CT. The with CT one is horrible to review!We should make it easier to filter for these studies though, and to delete them from the web interface.