No search results on the subsequent pages of large PDF documents.
Hello Janos.
I have noticed that large PDF attachments are not fully indexed, or at least the search stops finding results after a certain number of pages (for example, no search results after page 120 of 300). The page at which this happens is not fixed; it varies from document to document.
Do you know what could be the cause of this? I am currently using version 1.3.12 build 1001 of piler.
Thank you and best regards, Valerio
Comments (6)
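repo owner The parser uses a buffer to store the text of the message body, including the attachments, and that buffer has a finite size of about 132 kB. So if you have a large PDF file containing a lot of text, only the first ~130 kB of text is indexed. This is also why the cut-off page varies: the limit is on the amount of extracted text, not on the number of pages.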
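To illustrate the effect described above, here is a minimal sketch of a fixed-size text buffer that silently stops accepting input once it is full. It is not piler's parser code: `DEMO_BUFSIZE`, `struct demo_buf` and `demo_append` are hypothetical names, and the tiny capacity is chosen only to make the truncation visible.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical capacity for illustration only; not piler's actual value. */
#define DEMO_BUFSIZE 16

/* Hypothetical stand-in for a parser's text buffer. */
struct demo_buf {
    char data[DEMO_BUFSIZE];
    size_t len;              /* bytes used, excluding the trailing '\0' */
};

/*
 * Append text until the buffer is full.
 * Returns the number of bytes that did not fit and were dropped.
 */
static size_t demo_append(struct demo_buf *b, const char *text)
{
    assert(b->len < DEMO_BUFSIZE);

    size_t n = strlen(text);
    size_t room = DEMO_BUFSIZE - 1 - b->len;   /* keep one byte for '\0' */
    size_t take = n < room ? n : room;

    memcpy(b->data + b->len, text, take);
    b->len += take;
    b->data[b->len] = '\0';

    return n - take;
}
```

Appending "page1 ", "page2 " and "page3 " to an empty buffer leaves "page1 page2 pag" in `data`: the third call returns 3 because the last three bytes no longer fit. Anything that is dropped this way never reaches the indexer, so it cannot be found by a search.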
-
reporter Thank you for your explanation. Are there any reasons not to increase the buffer size? Would it be feasible to raise it to, e.g., 512 kB?
-
repo owner I think 132 kB should be fine; however, you may increase it. To do that, edit src/config.h, change the value of #define BIGBUFSIZE, and then recompile piler.
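As a sketch, the edited line might look like the fragment below. The numeric value is an assumption: it is simply 512 * 1024 bytes, matching the 512 kB asked about in this thread. Check the existing definition in your own source tree before changing it.

```c
/* src/config.h (excerpt).
 * Assumed value for illustration: 512 * 1024 = 524288 bytes.
 * The original value in your checkout may be written differently. */
#define BIGBUFSIZE 524288
```

Since the limit is a compile-time constant, the new size only takes effect after a rebuild. Messages that were already archived were parsed with the old limit, so their truncated text is presumably not extended automatically.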
-
reporter OK, I will consider it. Many thanks again!
-
reporter - changed status to resolved