Home page is slow to populate
The code that populates each modality table on the home page has a for loop which runs a query for each display name. If we had the study date and time available in a single datetime field then this loop would be unnecessary, and the loop could be replaced with a single query using annotations.
Comments (20)
-
-
And I doubt we deal with time zone information properly, though I don’t know how many modalities use the time zone functionality in DICOM.
-
reporter In
models.py
we can add this new class which makes a calculated field calledstudy_date_time
available in QuerySets fromGeneralStudyModuleAttr
:class GeneralStudyModuleAttrManager(models.Manager): def get_queryset(self): qs = super(GeneralStudyModuleAttrManager, self).get_queryset().annotate( study_date_time=models.ExpressionWrapper( models.F("study_date") + models.F("study_time"), output_field=models.DateTimeField() ) ) return qs
We also need to add the following line to the end of the
GeneralStudyModuleAttr
definition inmodels.py
to make the above manager get used:objects = GeneralStudyModuleAttrManager()
You can then use
study_date_time
as a field inGeneralStudyModuleAttr
QuerySets, so the for loop inupdate_latest_studies
withinviews.py
can be replaced with this single query:test_data_new = studies.values("generalequipmentmoduleattr__unique_equipment_name__display_name").annotate( num_studies=Count("pk"), latest_study_date_time=Max("study_date_time"), timedelta=ExpressionWrapper(now - F("latest_entry_date_time"), output_field=DurationField()), ).order_by("-latest_entry_date_time")
test_data_new
can be passed to the template, wherenum_studies
,latest_study_date_time
andtimedelta
can all be accessed. What is missing with the method above is thepk
of the display name, which is present in the current code, and I assume is needed when displaying study data for the two time deltas, which is added using Ajax. The solution to that is probably to calculate the studies within the time deltas in the above code, if needed, and dispense with the Ajax call. -
Is it efficient to make that calculation (adding date and time) every time?
-
reporter I’ll check on my slow laptop.
Just worked out that we don’t need to use the pk of the display name for HTML id: can use the display name itself with the whitespace stripped.
-
reporter Trying new method of populating home page that avoids looping through each display name. Note: I've removed a chunk of the template file, and don't understand what the bit that I have removed does. Refs issue
#984→ <<cset 8e87d4be75e0>>
-
reporter Some timings on my laptop (almost 2,000,000 studies in total). Having done this I have no concern about using the new datetime field in
GeneralStudyModuleAttr
:Develop (s) This branch (s) Home page (no time deltas) 62 3 (!) Home page (inc time deltas) 119 49 (but broken…) Looks like I have somehow broken the Ajax updating of studies within the time deltas.
I will aim to calculate the studies within the time deltas without the Ajax calls.
Leaving this now for the time being.
-
reporter Removed need for Ajax calls to calculate studies since delta a and delta b. Should be much faster - will check on laptop. Refs issue
#984→ <<cset b1e26fd7fcfa>>
-
reporter Some timings from my laptop (almost 2,000,000 studies) after my changes to remove Ajax calls for workload since delta a and delta b:
Develop (s) This branch (s) Home page (no time deltas) 62 3 Home page (inc time deltas) 119 3 (no longer broken) That’s very satisfying.
-
reporter Put code to check for no display name back in. Refs issue
#984→ <<cset 0be7f59e0450>>
-
Blimey
-
reporter Fixed table sorting on home page. I don't know how long this has been broken for, but it doesn't work on demo.openrem.org. A side-effect of this fix is that I no longer need the timedelta annotation. However, I've not deleted it, just commented it out, because it took me a while to work out how to do it, and it may be useful in the future. Refs issue
#984→ <<cset 890f7a11ac54>>
-
reporter Updated changes file. Refs issue
#984→ <<cset 4ad64b42acc2>>
-
reporter - changed milestone to 1.0.0
-
reporter Addressing some Codacy things: removing unused imports and adding a space before a comment symbol. Refs issue
#984→ <<cset a80e1e66fc64>>
-
reporter For date_a and date_b comparing with today.date() rather than a datetime so that the results exactly match the develop branch. Refs issue
#984→ <<cset b96aa4f186c3>>
-
reporter Removed unnecessary Ajax update of total number of studies and number of studies in each modality - this data was already being populated when the home page is first loaded. Also replaced multiple calls to .count() with a single aggregate query. I am definitely done with updating the home page code now. Refs issue
#984→ <<cset 8ee483afe7c1>>
-
reporter What I’ve learned from doing this is that querying the database in a loop is a bad idea which results in many more hits to the database than necessary. I've also learned that using
annotate
andaggregate
on DjangoQuerySets
are good for reducing the number of database queries required.For a visit to the homepage on my laptop (1.9 million studies) the develop branch makes 722 database queries in a total query time of 240.8 seconds. This
#984branch code makes 66 database queries in a total time of 6.6 seconds. -
reporter - changed status to resolved
Fixed by commit 0ab2fb0 (pull request #564)
-
Timings comparing fresh server installs of the b1 release (without this branch) against b2 release (includes this branch). Both servers have a migrated database of 650,000 studies. Three Shift-Ctrl-R refreshes, timed using the Chrome dev tools Network tab using the “Finish” value at the bottom.
1.0.0b1 1.0.0b2 Home page, deltas on, logged in 23.85 s 23.29 s 23.41 s 2.11 s 2.17 s 2.04 s
- Log in to comment
I’ve long thought that we probably aren’t doing the right thing regarding time and date fields.