s2i:combine-milestone-chunk slow on large texts
Page next on a large text like Gerard's Herbal or Purchas can take many seconds. The main culprit is s2i:combine-milestone-chunk in standoff2inline.xqm in the annotation service (or possibly something that it calls). It can take 3-5 seconds just to do the combine operation. Given that we already have the starting and ending milestone nodes before we get here, it should not take any longer to do this for a large text than for a small one, but it does, by at least an order of magnitude.
Comments (4)
-
reporter There is no reason to believe there is an eXist bug here. It is just something in the merging of annotations into the main text that is usually so fast that it's unnoticeable but gets slow with a large text. Or maybe the cause is a large number of annotations, and the recent addition of all the autocorrect annotations is what I'm noticing.
-
reporter Page next speeds are reasonable (a few hundred milliseconds) early in one of these large texts but much slower (5 seconds or more) once you get a few hundred pages in. With indexes, later milestones should not be slower to retrieve than earlier ones. This needs more study.
-
reporter - changed status to resolved
We have stopped using this slow function and sped up page turning by a factor of 3 or 4 as of:
commit 1a3d6511db638887f2efcc072f3ccc555e34098f
Author: Craig A. Berry craigberry@mac.com
Date: Sat Aug 15 15:12:56 2020 -0500

Ditch s2i:merge-milestone-chunk for faster page turning

It's very slow and buggy, and it turns out not to be necessary at all. The annotation client has all the annotation information, so with minor modifications it can make any appropriate display changes in the browser in a few milliseconds. It was already doing highlighting, but it can easily modify content on-the-fly as well. We go back to simply fetching the XML fragment containing the page or div of interest as the original TEI Simple application did.

In the case of Hakluyt's Principal Voyages, paging back and forth between page 750-a and 750-b was taking over 12 seconds per page turn on average. Now it's under three seconds.

N.B. There is still a bug in eXist where accessing any node in a long document is much slower than accessing a node in a shorter document, and the nearer that node is to the end of the long document, the worse it gets. Performance will never be great on long texts unless and until that gets fixed.

N.B. #2. The Review page still uses s2i:wrap-recursive and may not really need to, but it doesn't seem to be the rate limiting component there.
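
For illustration only, the client-side approach described in the commit message (fetch the plain page fragment, then let the client apply the annotations it already holds) might look roughly like the sketch below. This is not the annotation client's code, which runs in the browser. The `Annotation` class, its fields, and `apply_annotations` are hypothetical names, and the sketch assumes annotations are non-overlapping character-offset replacements in the page text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """Hypothetical standoff annotation: replace page_text[start:end] with `text`."""
    start: int
    end: int
    text: str


def apply_annotations(page_text: str, annotations: list[Annotation]) -> str:
    """Return page_text with every annotation applied.

    Assumes annotations do not overlap. They are applied from the highest
    offset down, so offsets of annotations not yet applied stay valid.
    """
    result = page_text
    for ann in sorted(annotations, key=lambda a: a.start, reverse=True):
        if not 0 <= ann.start <= ann.end <= len(page_text):
            raise ValueError(f"annotation out of range: {ann}")
        result = result[:ann.start] + ann.text + result[ann.end:]
    return result


# Made-up sample page text and corrections, for demonstration only.
page = "The vertues of the Rofe are manie."
corrections = [
    Annotation(start=19, end=23, text="Rose"),
    Annotation(start=28, end=33, text="many"),
]
print(apply_annotations(page, corrections))
```

The cost of this step depends only on the size of the fetched page and the number of annotations on it, not on the length of the whole text, which is consistent with the "few milliseconds" figure in the commit message.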
-
Is there a way of getting this to the attention of the wider eXist community?