Sphinx Indexer cron jobs wipe out sphinx index in /var/piler/Sphinx (Piler 1.3.11)

Issue #1231 resolved
Jurie Botha created an issue

Hi there,

I’m having a strange issue after upgrading from Piler 1.3.4 (Ubuntu 16.04).

My procedure was:

  • Backed up data + DB.
  • Did a fresh install of Ubuntu 20.04 & Piler.
  • Restored the data folder (/var/piler/store).
  • Imported the DB .sql from the backup.
  • Restored the encryption keys from the previous installation (piler.key/pub).
  • Restarted services.
  • Disabled cron jobs.
  • Initialized the index (indexer --all).
  • Ran a reindex -a since the Sphinx index files from 1.3.4 weren’t compatible with Sphinx 3.3.1.
  • Then manually ran the index.delta & index.main scripts. All good - could search etc.

What’s happening is that after re-enabling the cron jobs, the index files under /var/piler/sphinx get zeroed / recreated and all the index data is wiped out after a while. I then need to re-run reindex -a to get it all back.
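To pin down when the files get zeroed, it can help to record file sizes at intervals and compare them. The following is a minimal sketch, not part of Piler or Sphinx: the function names and the 50% shrink threshold are assumptions made for illustration, and the directory is whatever holds the index files (here /var/piler/sphinx).

```python
from pathlib import Path


def snapshot_sizes(index_dir):
    """Return {file name: size in bytes} for regular files directly in index_dir."""
    return {
        p.name: p.stat().st_size
        for p in Path(index_dir).iterdir()
        if p.is_file()
    }


def find_shrunk(before, after, ratio=0.5):
    """Return names of files that vanished or fell below ratio * their old size.

    The ratio is an arbitrary illustrative threshold, not a Sphinx value.
    """
    shrunk = []
    for name, old_size in sorted(before.items()):
        new_size = after.get(name, 0)  # a missing file counts as size 0
        if old_size > 0 and new_size < old_size * ratio:
            shrunk.append(name)
    return shrunk
```

Taking a snapshot every few minutes and logging the result of find_shrunk with a timestamp narrows the wipe down to a time window, which can then be matched against cron and service logs.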

Is my procedure for re-creating the index correct?
Where does sphinx log to? I’m trying to figure out what’s causing this, but don’t have much to go on.

Any help/advice would be greatly appreciated.

And MailPiler’s awesome by the way.

Comments (10)

  1. Janos SUTO repo owner

    Your reindexing procedure is fine. However, I doubt that the indexer scripts reset the sphinx files. You need to find out what runs indexer --all, and disable it.
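One way to look for whatever runs indexer --all is to scan the usual scheduler and service locations for uncommented lines that mention both strings. This is a rough sketch only: the list of paths is an assumption about a typical Ubuntu system, and find_indexer_all is a hypothetical helper, not a Piler tool.

```python
from pathlib import Path

# Assumed locations on a typical Ubuntu system; adjust as needed.
DEFAULT_PATHS = (
    "/etc/crontab",
    "/etc/cron.d",
    "/etc/cron.daily",
    "/etc/cron.hourly",
    "/var/spool/cron/crontabs",
    "/etc/systemd/system",
    "/etc/init.d",
)


def find_indexer_all(paths=DEFAULT_PATHS):
    """Return (path, line number, line) for uncommented lines containing
    both 'indexer' and '--all'. Each entry in paths may be a file or a directory."""
    hits = []
    for entry in paths:
        root = Path(entry)
        if root.is_file():
            files = [root]
        elif root.is_dir():
            files = sorted(p for p in root.rglob("*") if p.is_file())
        else:
            continue  # path does not exist on this system
        for path in files:
            try:
                text = path.read_text(errors="replace")
            except OSError:
                continue  # unreadable, e.g. missing permissions
            for number, line in enumerate(text.splitlines(), 1):
                stripped = line.strip()
                if stripped.startswith("#"):
                    continue  # commented-out entry
                if "indexer" in stripped and "--all" in stripped:
                    hits.append((str(path), number, stripped))
    return hits
```

Running it as root covers per-user crontabs as well. An empty result only means nothing in these particular locations matched; scripts called indirectly from them would need a separate look.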

  2. Jurie Botha reporter

    OK, I’ll see if I can track it down.

    Quick Q though: after running both the Delta & Main indexing scripts, should the Delta still have data in it? Or should all of it have been merged into main and the delta cleared?

  3. Janos SUTO repo owner

    The delta index merges to the dailydelta, the dailydelta merges to main1 (by default). However, based on the screenshot, the delta1 index files are behind the dailydelta by 3 hours. Something is not correct. Also, the main index data should be updated once a day; the cron job runs at dawn.
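The kind of lag described here can be checked from file modification times rather than a screenshot. The sketch below assumes each index is stored as files named after the index (for example delta1.*, dailydelta1.*, main1.*) in one directory; the helper names are made up for illustration.

```python
from pathlib import Path


def newest_mtime(index_dir, index_name):
    """Newest modification time (Unix seconds) among index_name.* files, or None."""
    mtimes = [
        p.stat().st_mtime
        for p in Path(index_dir).glob(index_name + ".*")
        if p.is_file()
    ]
    return max(mtimes) if mtimes else None


def lag_hours(index_dir, behind, ahead):
    """Hours by which the `behind` index trails the `ahead` index.

    Returns None if either index has no files in index_dir.
    """
    t_behind = newest_mtime(index_dir, behind)
    t_ahead = newest_mtime(index_dir, ahead)
    if t_behind is None or t_ahead is None:
        return None
    return (t_ahead - t_behind) / 3600.0
```

For example, lag_hours("/var/piler/sphinx", "delta1", "dailydelta1") returning a clearly positive value would correspond to the situation described above, where the delta files are older than the dailydelta files.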

  4. Jurie Botha reporter

    OK, the times are off because I ran them manually.

    I’ve been testing and have run them manually multiple times, and it’s definitely not the index.main or index.delta scripts causing it. The cron jobs are still commented out while I’m trying to track down the issue, and since I commented them out, the issue hasn’t recurred.

    I did reboot the server just after running the indexing scripts, and it was after the reboot that I noticed the index was cleared. So something may have happened during boot or shutdown. I’m busy importing a large amount of mail from PSTs at the moment, but once that’s done I’ll run index.delta & index.main again, back up the sphinx directory and reboot again to see if that’s the case.

  5. Janos SUTO repo owner

    You may contact the sphinx developers with a bug report as suggested. Unfortunately I can’t help with it.

  6. Jurie Botha reporter

    Thanks for the assistance thus far.

    I cleared the index (indexer --config /usr/local/etc/piler/sphinx.conf --all) and ran again and no error this time.

    Could corruption have made it into the data that index.delta & index.main read?

    I have a backup of the index from before my PST import. How do I re-index only the mails added after the backup? (How do I determine where to -f from?)
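A possible way to find the starting id, sketched under several assumptions that should be verified before use: that the metadata table has an id column and an arrived column holding the Unix time at which a message was archived, that ids grow with arrival order, and that reindex accepts -t for the upper bound alongside -f (check the usage output of reindex). The function names are hypothetical.

```python
# Assumed query against Piler's metadata table (schema not confirmed here):
#   SELECT id, arrived FROM metadata WHERE arrived > <backup time as Unix seconds>;


def id_range_since(rows, cutoff):
    """Given (id, arrived) rows, return (min id, max id) of rows that
    arrived after cutoff (Unix seconds), or None if there are none."""
    ids = [msg_id for msg_id, arrived in rows if arrived > cutoff]
    if not ids:
        return None
    return min(ids), max(ids)


def reindex_args(id_range):
    """Build an argument list for reindex over an id range.

    -f appears in this thread; -t is an assumption to check against
    the usage output of reindex.
    """
    first, last = id_range
    return ["reindex", "-f", str(first), "-t", str(last)]
```

The rows would come from the query in the comment, with the cutoff set to the time the index backup was taken. If ids are not strictly ordered by arrival, starting a little earlier than the computed minimum is the cautious choice.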

    I may have found the issue - it looks like the sph_index table in the DB may be corrupted - it reads as fine but is 963MB in size with no data in it. I recreated it and am running another reindex -a.

    Will see what the result is.

    Just for my own peace of mind - this was my installation procedure - minus the RAID stuff at the end:

    https://monklinux.blogspot.com/2021/08/installing-mailpiler-on-ubuntu-2004.html

    The reason I’m posting that is that there should be nothing else running indexer --all automatically, as sphinx is installed per the manual installation instructions.

    I’ll see how this reindex goes, but if it fails I’m going to start from scratch: clean piler and reimport all mails from PST & backups.

    Thank you for taking the time to try and assist me.
