New table of study summary information
As suggested by @dplatten it would simplify things enormously when doing filtering in the web interface if as many database calls as possible were to just one table.
There are various things we need to think about:
- If the data already has a place in one of the other tables, as determined by relevant DICOM template, that would remain and a copy of that data would be added to the summary table. Agree/Disagree?
- Some summary data is not currently held, rather it is computed each time it is needed. For example the number of Axial events in a study. This could be computed on import (or update) and stored in the table.
- Every study starts with a
GeneralStudyModuleAttr
table. The new table could have a one-to-one relationship with that table, or it could simply be extra fields in that table. Thoughts? - When do we do it? Do it now, and fix the Specific Event CT filtering vs charts bug by simplifying the event filtering code, or put it aside for the next major release and battle through to find the current bug?
- What goes in the table… some ideas:
number_of_events = models.IntegerField(blank=True, null=True) number_of_axial = models.IntegerField(blank=True, null=True) number_of_spiral = models.IntegerField(blank=True, null=True) number_of_stationary = models.IntegerField(blank=True, null=True) number_of_const_angle = models.IntegerField(blank=True, null=True) total_dlp = models.DecimalField(max_digits=16, decimal_places=8, blank=True, null=True) total_dap = models.DecimalField(max_digits=16, decimal_places=8, blank=True, null=True) total_rp_dose = models.DecimalField(max_digits=16, decimal_places=8, blank=True, null=True) total_agd_left = models.DecimalField(max_digits=16, decimal_places=8, blank=True, null=True) total_agd_right = models.DecimalField(max_digits=16, decimal_places=8, blank=True, null=True) total_agd_both = models.DecimalField(max_digits=16, decimal_places=8, blank=True, null=True) # for legacy
We could also consider adding in some event level data such as the acquisition protocol, but that would have to be some sort of a list as a string, so probably not worth it.
Feedback please @dplatten , @LuukO , anyone else reading this!
Comments (126)
-
-
reporter I am not a database expert either, and I wish we had one to help us!
I think there probably is quite a lot we could put in fewer tables. But the problem is always when there is, or can be, multiplicity. For example with the kVp example, it is almost always just the one kVp per exposure, but some devices detail the kVp of every single pulse in a fluoroscopy exposure.
A more common example is where there can be two planes in fluoro (and x-ray it turns out) and left and right are separate in mammo. For these, if we can assume it is only ever one or two, then maybe we can handle that by duplicating the fields in the same table. Not sure.
From the little I know about databases, having a flatter structure with fewer one-to-many lookups would certainly be faster, as well as much easier to code!
-
These two articles make interesting reading on the subject of flattening a database (denormalization):
https://www.itprotoday.com/sql-server/responsible-denormalization
https://www.itprotoday.com/analytics-and-reporting/performance-tuning-data-model-thinner-better
-
reporter Not sure what conclusion to draw from those articles… flatter better to program with but leading to compromise and trouble down the lane; normalised better to load each table (thinner), and more flexibility.
I don’t think it is simple! I am sure we can do better than what we currently have.
-
reporter Added summary fields as proposed to GeneralStudyModuleAttr, modified rdsr to populate number_of_events, tested with CT and RF. Refs
#759. As per branch name, this is an experiment to see how to best proceed.→ <<cset b82543434830>>
-
reporter Added ct event type count to summary fields. Refs
#759→ <<cset 234ffed5487d>>
-
reporter Massively simplified event filters using summary data. Refs
#759→ <<cset 0a1b4d6b0d36>>
-
reporter Moved CT event count function to extract_common and added to ct_philips (to make tests work!). Refs
#759→ <<cset 1f7a8fb01fa2>>
-
reporter Adding DLP to rdsr and ct_philips, number of events to ct_philips. Refs
#759→ <<cset d8d42d660ed1>>
-
reporter Starting to create data migration to populate summary table. Refs
#759→ <<cset 890141e8e654>>
-
reporter Modified model to recognise dual plane planar systems. Refs
#759→ <<cset 297026fde5c1>>
-
reporter Starting to populate planar. Refs
#759→ <<cset 6db29c739076>>
-
reporter Limited testing for MG, working. Limited testing for DX, some questions. Refs
#759→ <<cset 725413f2fdc6>>
-
reporter I had used the wrong table. Need to test with dual plane, and update dx amd mam extractors. Refs
#759→ <<cset 36aa42545bc8>>
-
reporter Moved mammo to extract_common. Refs
#759→ <<cset ffb53614f94a>>
-
reporter Added and tested with mam.py. Refs
#759→ <<cset 892f8f8aab68>>
-
reporter Implemented for DX, not tested. Refs
#759→ <<cset e1e37c62f713>>
-
reporter Adapted to cater for additional DX images. Need to check mammo situation... Refs
#759→ <<cset e634071c67ae>>
-
reporter Ammending mam.py to update summary when processing subsequent images. Refs
#759→ <<cset 1c57e1689875>>
-
reporter Adding number of events count to dx and mam. Refs
#759→ <<cset b32f3b78a434>>
-
reporter Mocked up a view to populate an existing database with summary information as a more friendly alternative to trying to do a data migration. Refs
#759→ <<cset d807d04d3e51>>
-
reporter Corrections to make populate view work. Refs
#759→ <<cset c37ab0960a6a>>
-
reporter Removing commented code. Refs
#759→ <<cset 4ab3da826aeb>>
-
reporter Adding number of planes to model and extract. Refs
#759→ <<cset 1bd76c854ee0>>
-
reporter DAP conversions required for display. Refs
#759→ <<cset ded839215405>>
-
reporter Started switching rffiltered to use new summary values. Refs
#759→ <<cset 4740b4c5c9f8>>
-
reporter Started switching rffiltered to use new summary values. Refs
#759→ <<cset 70ad741efeae>>
-
reporter Code reformat for readability. Refs
#759→ <<cset 0ea1ecf417b4>>
-
reporter Corrected and completed RP dose column. Refs
#759→ <<cset a4298564205b>>
-
reporter Added and populated rp dose delta fields. Refs
#759→ <<cset 6b6bbe45f40c>>
-
reporter @David Platten - looking at the display of DAP and RP doses over delta weeks, do you only consider the A plane in a bi-plane system?
-
reporter Normalizing the formatting to fix test failures. Refs
#759→ <<cset b7024da7ad8e>>
-
reporter @David Platten I think that in this line in rdsr.py: rdsr.py#lines-1521 we look at both planes of a bi-plane system separately and get the same answer, so we record the pk of each study in the time period twice.
In the detail view, the count ignores that table and therefore gets the right answer (one study): views.py#lines-853
In the filtered view we use rffiltered.html#lines-146 to get that table then rffiltered.html#lines-233 to count the number of studies, hence the wrong answer (two studies) for bi-plane systems.
-
@Ed McDonagh I thought that the code worked OK for bi-plane systems, as per issue
#665? -
reporter @David Platten It adds up both planes of DAP and RP Dose as desired. Just the number of studies reported in the filtered view is wrong. I think.
-
reporter Adding 'both' breast AGD code, plus the other possible codes for left and right'. Refs
#759→ <<cset 1e9bc8eb4e2d>>
-
reporter Addresses ref
#763. Will separate out into PR. Refs#759→ <<cset d4c4cdc995eb>>
-
reporter Formatting. Refs
#759→ <<cset 3b2819d38cbf>>
-
reporter Tidied up delta weeks fields, using them in template, populating in rdsr. TODO: populate them in recalc. Refs
#759→ <<cset 7efb738149aa>>
-
reporter Added populating delta weeks fields on update. To discuss: store delta weeks data only in new fields. Refs
#759→ <<cset f428f6b84979>>
-
reporter Refactored delta weeks summaries copying code into function. Added it to migration code. Refs
#759→ <<cset 17aa60b8fb63>>
-
reporter Added fields and code to count events per plane for radiographic and potentially fluoro. Refs
#759→ <<cset 051ecaa89a4f>>
-
reporter Implementing new fields in dxfiltered. Refs
#759→ <<cset e89c998f326a>>
-
reporter Moving migration function into an asynchronous task instead of a view. Refs
#759→ <<cset c6ebf7deae80>>
-
reporter Split migration into four tasks, to be run in parallel. Changed db model to have a row per modality. Refs
#759→ <<cset c36010b5d5cb>>
-
reporter Reformat of homepage code before editing. Refs
#759→ <<cset dcd1dcdcf740>>
-
reporter Getting Attribute on NoneType not having Attribute code_value, not ObjectDoesNotExist. Refs
#759→ <<cset 8bf2dd731642>>
-
reporter Basic AJAX status update of parallel running migrations. Corrected syntax error for CT. Refs
#759→ <<cset 40832b3aaba2>>
-
reporter Added some progress bars... Refs
#759→ <<cset cb2ab0158159>>
-
reporter Added completed status to home page. Refs
#759→ <<cset 180de882a34c>>
-
reporter Correcting my template logic syntax. Refs
#759→ <<cset e93cbd0acd85>>
-
reporter Adding summary tags to CT fitered. Refs
#759→ <<cset 183d3d2d79c0>>
-
reporter Adding check to see if data is already present to make reruns quicker. Refs
#759→ <<cset 283bea31bcce>>
-
reporter Implemented ordering and keeping track so I can restart after laptop dies... Also removed full stops. Refs
#759→ <<cset 0a7673db01a5>>
-
reporter Decimal places inappropriate for DLP. Refs
#759→ <<cset 35577471bef6>>
-
reporter Hopefully a more sensible and robust method of picking up when process interrupted. Refs
#759→ <<cset 2f63c7efb84a>>
-
reporter Removed redundant ct db lookups for ctfiltered.html. Refs
#759→ <<cset 36fc8065d8cc>>
-
reporter Replacing mgfiltered.html deep lookups with GeneralStudModuleAttr summaries. Refs
#759→ <<cset 93e348c8bd60>>
-
reporter @David Platten @Luuk I've migrated a copy of my production database using the new fields in
GeneralStudyModuleAttr
and created identical servers with the pre and post migrated databases to time the differences in rendering each of the modality filtered pages.The migration itself, the process and user interface needs a little more work, and a decision as to whether to run the migration as one task per modality or one task per study - on my limited VM on my laptop that keeps crashing it took days to do the CT…
In terms of number of studies in the database:
CT Fluoroscopy Mammography Radiography 217114 2012 83584 55787 I’ve loaded CT then RF etc for both servers at the same time, and used the Chrome Network tools to record the wait and load times for each. The view is set to 200 studies. I have then repeated this four times, and below you can see the mean times and the proportion of one over the other.
I still don’t like how slow the system is even with the summary fields, but I’m pleased with the relative increase in speed. I’d like to push on with this to have it in the next release.
What do you think of the approach? I’ll need to tidy up a bit before I can suggest that you guys try it!
-
@Ed McDonagh it’s great to see the speed increase in the filtered views. However, I’m more interested in whether the summary fields help to speed up chart data calculations…
-
reporter For CT
I am confident that the following will be faster if we use the summary data:
- DLP per study
- Events per study
- Study DLP over time
I think the others (except maybe DLP per requested procedure?) make use of event level data or have nothing to gain from new summary data?
For Fluoro
- DAP per study
For Mammography
- None of the current charts
For Radiography
- DAP per study
Is that hopeful or depressing @David Platten ? From reading the links you sent about normalised vs de-normalised (flatter) databases, I am not convinced that trying to record acquisition information at study level can work in a maintainable way.
It might be worth though having ‘average’ summary values at event level? That way we can bring kV etc into that table which would reduce the need to to handle units that record kV per pulse within an exposure for example.
-
@Ed McDonagh that’s good with me. I’d like to try out the summary table with one of the charts. Should I wait for you to tidy up the migration process before I try?
-
reporter Do you want to test it with summary data just at study level (as implemented), or possible event level summary data as mentioned in the final sentence?
-
Just at study level as implemented as a starting point
-
reporter Great. Let me tidy up a bit. I’ll try not to let it drag on. I think I’m going to go with the ‘one task per study’ option rather than the ‘one task per modality type’ option.
-
reporter Switched so that each exam update is a task (CT only). Refs
#759→ <<cset 84e65f9f3b1a>>
-
reporter Removed redundant ct db lookups for ctfiltered.html. Refs
#759→ <<cset e9ac7f6c4757>>
-
reporter Added box around migration information. Added restart link. Added error page. CT: Changed to using null on topogram numbers. Refs
#759→ <<cset 55d9114a16a2>>
-
reporter Added upgrade status singleton to db model, set automatically when each modality complete, then doesn't update on homepage anymore. Refs
#759→ <<cset 46b840200774>>
-
reporter Fix copying error CT->MG Refs
#759→ <<cset 71b34c4f825e>>
-
reporter Factoring out the mammo summary population code to enable task per study. Not tested. Refs
#759→ <<cset 92afdae69d41>>
-
reporter Refactored study level function to work for all modalities. Not tested. Refs
#759→ <<cset 9e8716323f3a>>
-
reporter Attempt to update populate_summary_progress view so RF, MG and DX act the same as CT. Refs
#759→ <<cset 2e3bc1df62b9>>
-
reporter Reorganised populate_summary, corrected CT call of study_level function. Refs
#759→ <<cset 8544cddbbbea>>
-
reporter Adding missing tasks Refs
#759→ <<cset 73769105c71e>>
-
reporter Correcting percent complete for mammo. Refs
#759→ <<cset 09d7c4d838c2>>
-
reporter Display study level completion rather than task generation for RF/MG/DX Refs
#759→ <<cset a04ddb7df94c>>
-
reporter Should now provide a way to start the migration. Refs
#759→ <<cset 9816838f00b4>>
-
reporter Update method of selecting initial list of studies to process Refs
#759→ <<cset 5b014ad15e0e>>
-
reporter Preventing messages appearing for fresh installs. Refs
#759→ <<cset 9862e789c813>>
-
reporter Adding information to not-logged in or not-started. Refs
#759→ <<cset 948eec274fb5>>
-
reporter Hi @David Platten - can you try this branch out now please? A migration is required of course. Other than that, it should be self-explanatory. You will need to restart Celery to pick up the new tasks.
-
reporter Adding ref
#759to changes and initial stab at documenting it. [skip ci] docs only→ <<cset 7578ae0546b6>>
-
reporter Updating link to release notes. Refs
#759[skip ci] docs only→ <<cset 795e88a3f97a>>
-
Will try it out later on today
-
Running the migration now. 44000 tasks and climbing… Is it a task per study?
-
I see that it is. I really like the progress bars on the home page - very neat.
-
The RF progress bar has stopped moving. The Celery log contains the following:
[2019-07-30 11:06:24,005: ERROR/MainProcess] Task remapp.tools.populate_summary.populate_summary_rf[bb5877d4-17ca-48c9-95b1-f7ac6776bfa7] raised unexpected: IndexError('list index out of range',) Traceback (most recent call last): File "c:\pythonvirtualenvs\openrem-dev\lib\site-packages\celery\app\trace.py", line 240, in trace_task R = retval = fun(*args, **kwargs) File "c:\pythonvirtualenvs\openrem-dev\lib\site-packages\celery\app\trace.py", line 438, in __protected_call__ return self.run(*args, **kwargs) File "c:\pythonvirtualenvs\openrem-dev\lib\site-packages\openrem\remapp\tools\populate_summary.py", line 173, in populate_summary_rf populate_summary_study_level('RF', study.pk) File "c:\pythonvirtualenvs\openrem-dev\lib\site-packages\celery\local.py", line 188, in __call__ return self._get_current_object()(*a, **kw) File "c:\pythonvirtualenvs\openrem-dev\lib\site-packages\celery\app\trace.py", line 439, in __protected_call__ return orig(self, *args, **kwargs) File "c:\pythonvirtualenvs\openrem-dev\lib\site-packages\celery\app\task.py", line 428, in __call__ return self.run(*args, **kwargs) File "c:\pythonvirtualenvs\openrem-dev\lib\site-packages\openrem\remapp\tools\populate_summary.py", line 29, in populate_summary_study_level populate_dx_rf_summary(study) File "c:\pythonvirtualenvs\openrem-dev\lib\site-packages\openrem\remapp\extractors\extract_common.py", line 172, in populate_dx_rf_summary accum_int_a = planes[0].accumintegratedprojradiogdose_set.get() File "c:\pythonvirtualenvs\openrem-dev\lib\site-packages\django\db\models\query.py", line 201, in __getitem__ return list(qs)[0] IndexError: list index out of range
The MG progress bar has also stopped progressing (almost at the end - 49 studies to go). The Celery log contains a series of lines like this, each with a different Study UID:
[2019-07-30 11:06:34,315: WARNING/Worker-1] Study UID 1.2.840.113619.redacted. Unable to set summary total_agd values
The number of entries of this type corresponds to the number of studies left on the progress bar.
-
reporter Added IndexError catch for studies with no planes/accumxraydose_set data. Refs
#759→ <<cset 343305299faa>>
-
reporter Blast. Had an answer written then pressed buttons without this box in focus and lost it all!
First issue indicates you have an RF study with no planes in it, or no
accumxraydose_set
. I had an IndexError catch for the second one (as most are single plane), but not the first. I have now added such a catch if you’d like to check out, stop all the tasks and run it again.The second error either indicates that you have mammo studies with no
projectionxrayradiationdose_set.get().accumxraydose_set.get().accummammographyxraydose_set
or something else is tripping up. Are you able to find some of the studies and have a look at what is there? -
reporter By the way, it should be fairly safe to kill and restart the tasks (and purge the queue) - they should restart where they have left off and redo any that were half done - the check for being done is the last field that is populated.
-
I just tried to commit the “IndexError” catch, and have just realised you’ve already updated it. I’ll stop my updates and try again with the fix.
-
Added an IndexError exception to catch the error thrown when plane[0] data does not exist in an RF study. I haven't tested this yet as I am waiting for my DX migration to complete (220000 studies to go...). References issue
#759.→ <<cset d968b7b869dc>>
-
Added further IndexError exception to catch error thrown if there are no accumulated x-ray dose sets present in a study. My RF migrations are now progressing. References issue
#759→ <<cset cddb6475bc8d>>
-
Some of my mammo studies have empty accumulated AGD. They are “QA-raw uniformity” studies. Not sure how to make the code skip over these and still increment the progress.
-
I think my RF migration is now complete, but the progress bar is 10 studies short. I think these are the ones with the missing accumulated data. So, as for the mammo, how do we make the progress bar increment for these?
-
reporter Populate the agd_both field for mammo studies with no content so that the migration can complete. Refs
#759→ <<cset 949e53ad29ed>>
-
reporter Set
number_of_events_a
to 0 for rf/dx if unable to set values. Refs#759→ <<cset 0de71769cb25>>
-
reporter Can you try again with the last two commits?
-
That did the trick - thanks. All migrated now.
-
I’m now looking at using some of the study summary data in the charts. The CT study DLP chart was initially broken due to None type appearing in the series name list. I’ve fixed this.
Some timings:
CT DLP per study: 3.0 s vs. 5.5 s
CT # events per study: 2.7 s vs. 5.4 s
So around a 50 % reduction.
Can
ctradiationdose__ctirradiationeventdata__mean_ctdivol
be put into the summary table? -
Scrap my request earlier - I now realise that the mean_ctdivol value is per event, not per study. Sorry - being dim.
-
Now on to the DX charts.
Using the summary total_dap_a instead of
projectionxrayradiationdose__accumxraydose__accumintegratedprojradiogdose__dose_area_product_total
in the DAP per study type reduces the time from 22.5 s to 6 s.
-
Hmm. What should a chart of median DAP per study for a bi-plane fluoro system show? Is it total DAP from the whole study? I think so, in which case I’d like a “total_dap” field in the summary table that is the sum of total_dap_a and total_dap_b. It is likely to be the same argument for a dual-plane DX system - at the moment I’ve used total_dap_a for the DX plots, but I suspect that this is incorrect if you have a dual-plane system.
-
Changed chart code to take advantage of the summary data available at study level. Replaced None with empty string in series names for bar chart of average data. Added some elapased time print statements (to be removed) later. Used
total_dap_a
for total DAP in DX charts (this isn't really correct for the bi-plane system we know of). Removed most of the prefetching from the CT chart routine as I believe this is pointless if the data is in the main table. Still to do: RF charts, as there is currently nototal_dap
field. References issue#759→ <<cset 758bd45adc12>>
-
reporter That sounds reasonable (to add a
total_dap
field). We’d need to add a few lines at https://bitbucket.org/openrem/openrem/src/949e53ad29edd2eb0fbec4bcee0b142bf6a59322/openrem/remapp/extractors/extract_common.py#lines-204 to populate it, something like:try: total_dap = total_dap_a + total_dap_b except TypeError: if total_dap_a is not None: total_dap = total_dap_a elif total_dap_b is not None: total_dap = total_dap_b else: total_dap = 0
-
reporter I should add, my laptop is currently out of use - repeated BSODs, attempted Windows reinstall today, no change, so Dell should be coming out to change the motherboard and SSD tomorrow. Until then (and until I have rebuilt everything), I won’t be able to easily do any coding!
-
reporter If you made the changes to
models.py
and toextract_common.py
, you might like to run a little function in amanage.py shell
to reset all the RF and DX studies so that you can run the migration again:from django.db.models import Q from remapp.models import SummaryFields, UpgradeStatus, GeneralStudyModuleAttr dxrf = GeneralStudyModuleAttr.objects.filter(Q(modality_type__exact='DX') | Q(modality_type__exact='CR') | Q(modality_type__exact='RF')) for study in dxrf: study.number_of_events_a = None study.save() upgrade_status = UpgradeStatus.get_solo() upgrade_status.from_0_9_1_summary_fields = False upgrade_status.save() rf_status = SummaryFields.objects.get(modality_type__exact='RF') rf_status.complete = False rf_status.save() dx_status = SummaryFields.objects.get(modality_type__exact='DX') dx_status.complete = False dx_status.save()
-
@Ed McDonagh I’ll give your suggestions a try at some point - not able to do much coding for the next week or so either.
-
reporter Adding total_dap field, attempt to populate it. Needs testing. Refs
#759→ <<cset 3bab9c4f1e69>>
-
reporter @David Platten - changes made, works with the Tosh bi-plane study and doesn’t cause errors with the single plane studies.
-
reporter Bumping version to 0.10 due to ref
#759[skip ci] docs only→ <<cset 506474952661>>
-
reporter Bumping version to 0.10 due to ref
#759[skip ci] interface words only→ <<cset e09a9ae747dc>>
-
reporter Adding some screenhots to upgrade docs for ref
#759[skip ci] docs only→ <<cset d850d0a8aa45>>
-
reporter Attempt to clean up release docs by extracting the previous release ntoes. Refs
#759[skip ci] docs only→ <<cset fbff5751853f>>
-
reporter Very minor change. Refs
#759[skip ci] docs only→ <<cset 70b44378eb8d>>
-
reporter Updating upgrade-offline with new abstraction of pre-0.9.1 doc. Refs
#759[skip ci] docs only→ <<cset f47353c82f86>>
-
reporter - changed milestone to 0.10.0
-
assigned issue to
-
reporter Removing aborted idea of using a migration file for the data migration. Refs
#759→ <<cset adb56cdce7be>>
-
reporter Changed variable name from str to val as str shadows built-in, from codacy. Refs
#759→ <<cset 00ffde8dde30>>
-
reporter Tidying up and addressing other codacy issues. Refs
#759→ <<cset c0a26c13d4b4>>
-
Switched DX study- and request-level charts to use total_dap. References issue
#759and#768→ <<cset e78fe1731540>>
-
Switched RF study- and request-level charts to use total_dap from the study summary. Speed-up of a factor of 4 on my test system. References issue
#759and#768→ <<cset a3752bf10fe9>>
-
Altered chart calculation timing code so that it is logged when DEBUG is True and logs are DEBUG. References issue
#759and#768→ <<cset afc5890af9d2>>
-
Fixing merge conflict in changes files between pull request and develop [skip ci]. References issue
#759→ <<cset c5b851c4ac09>>
-
reporter - changed status to resolved
Merged in issue759AsManyTasks (pull request #315)
Fixes
#759Summary fields in GeneralStudyModuleAttr, Refs#768Update charts to use new study level summary fieldsApproved-by: David Platten
Thanks @dplatten and @LuukO for reviewing (and adding to!) I'm leaving the branch open as you are working in it David - not sure if you want to start a new branch to finish ref
#768off, from develop?→ <<cset 890828d285b3>>
- Log in to comment
I appreciate that the current database is set up to reflect the DICOM templates, but it is far from ideal when it comes to accessing and processing the data in OpenREM, particularly for charts. I think there is a good argument for migrating to a flatter database arrangement that is designed to provide high performance, rather than the database structure reflecting the DICOM templates.
I’m thinking of a flat study-level table that contains everything that is study-level including patient-related data like ID, age, gender (if we collect that), height and weight. I’d also like a flat acquisition-level table that contains everything to do with each acquisition, avoiding the need to find things like kVp values elsewhere. We could have an acquisition-level table per modality.
One of the problems with my above suggestion is that I am not a database expert, and am not certain that the above would result in faster queries. What I do know is that producing the data for the current charts is challenging with the current database structure, and would be greatly simplified with a flatter database structure.