Index Builds Fail

Issue #413 on hold
Travis Edgar created an issue

These were the steps taken before it died on us.

/etc/init.d/rc.searchd stop
rm /var/piler/sphinx/*
indexer --all
/etc/init.d/rc.searchd start
mkdir /tmp/reindex
chown piler. -R /tmp/reindex
cd /tmp/reindex
reindex -a
/usr/local/libexec/piler/indexer.delta.sh

At this point it died with the following error:

root@awsemailarchive01:/tmp/reindex# /usr/local/libexec/piler/indexer.delta.sh
WARNING: sql_query_post_index: The total number of locks exceeds the lock table size (DSN=mysql://piler:***@localhost:3306/piler)
FATAL: no merge destination index 'dailydelta1'

I cannot find a reference to dailydelta1 in any of my configs.

We currently have ~11288221 messages in piler, so any indexing can take some time to complete.

Any help would be greatly appreciated.

Comments (8)

  1. Janos SUTO repo owner

    I occasionally change the sphinx configuration and layout to get a better, more efficient index of the stored emails. The dailydelta1 index is the result of this effort. However, even if the latest piler version ships with a modified sphinx scheme, you may still use the old sphinx indices. In that case you only have to put the following in config-site.php, and leave the piler user's crontab entries related to indexing unchanged:

    $config['SPHINX_MAIN_INDEX'] = 'main1,delta1';
    

    However, if you have removed the sphinx index files, then be sure to update the sphinx config file and the piler crontab entries. Then make sure searchd can start after running indexer --all (i.e. resetting or zeroing the sphinx indices). At this point "only" one task is left: reindexing your 11 million emails. I'd do it in batches, e.g. reindex 1000 or so emails, then index them, then move on to the next block of 1000:

    cd /tmp
    
    reindex -f 1 -t 1000
    /usr/local/libexec/piler/indexer.delta.sh
    reindex -f 1001 -t 2000
    /usr/local/libexec/piler/indexer.delta.sh
    ...
    

    A simple for loop would probably do the trick.
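
    The batched approach above could be sketched as a small shell loop. This is only a sketch under assumptions: the function names, the DRY_RUN switch and the batch size are illustrative and not part of piler; only the reindex -f/-t options and the indexer.delta.sh path come from the commands above. By default it merely prints the commands it would run.

    ```shell
    # Run a command, or just print it when DRY_RUN is 1 (the default in this sketch).
    run() {
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "$*"
        else
            "$@"
        fi
    }

    # Reindex ids 1..TOTAL in blocks of BATCH, running the delta indexer after each block.
    # Stops at the first command that fails.
    reindex_in_batches() {
        total=$1
        batch=$2
        from=1
        while [ "$from" -le "$total" ]; do
            to=$((from + batch - 1))
            if [ "$to" -gt "$total" ]; then
                to=$total
            fi
            run reindex -f "$from" -t "$to" || return 1
            run /usr/local/libexec/piler/indexer.delta.sh || return 1
            from=$((to + 1))
        done
    }
    ```

    Calling reindex_in_batches 11288221 1000 would list every step for review; setting DRY_RUN=0 beforehand would execute them instead.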

    If you have to reindex everything from scratch, then it's worth thinking about upgrading sphinx to 2.2.4, since it has some improvements and some changes that are incompatible with older versions (even though the 2.0.x and 2.1.x settings are still supported at the moment). Note that this requires some fixes to the sphinx.conf shipped with piler. If you choose this path, let me know, and I'll help with adjusting sphinx.conf.

    Perhaps it's my fault for not emphasizing enough what to do with sphinx when upgrading piler; I apologize for that.

  2. Travis Edgar reporter

    So I added...

    $config['SPHINX_MAIN_INDEX'] = 'main1,delta1';
    

    I also removed all cron entries, and then ran the following:

    indexer --all
    Sphinx 2.1.5-id64-release (rel21-r4508)
    Copyright (c) 2001-2014, Andrew Aksyonoff
    Copyright (c) 2008-2014, Sphinx Technologies Inc (http://sphinxsearch.com)
    
    using config file '/etc/sphinxsearch/sphinx.conf'...
    indexing index 'main1'...
    collected 0 docs, 0.0 MB
    total 0 docs, 0 bytes
    total 0.037 sec, 0 bytes/sec, 0.00 docs/sec
    indexing index 'main2'...
    collected 0 docs, 0.0 MB
    total 0 docs, 0 bytes
    total 0.001 sec, 0 bytes/sec, 0.00 docs/sec
    indexing index 'main3'...
    collected 0 docs, 0.0 MB
    total 0 docs, 0 bytes
    total 0.001 sec, 0 bytes/sec, 0.00 docs/sec
    indexing index 'main4'...
    collected 0 docs, 0.0 MB
    total 0 docs, 0 bytes
    total 0.001 sec, 0 bytes/sec, 0.00 docs/sec
    indexing index 'delta1'...
    collected 10047249 docs, 28754.1 MB
    sorted 13830.0 Mhits, 100.0% done
    WARNING: sql_query_post_index: The total number of locks exceeds the lock table size (DSN=mysql://piler:***@localhost:3306/piler)
    total 10047249 docs, 28754097161 bytes
    total 11762.444 sec, 2444568 bytes/sec, 854.18 docs/sec
    indexing index 'tag1'...
    collected 0 docs, 0.0 MB
    total 0 docs, 0 bytes
    total 0.001 sec, 0 bytes/sec, 0.00 docs/sec
    indexing index 'note1'...
    collected 0 docs, 0.0 MB
    total 0 docs, 0 bytes
    total 0.001 sec, 0 bytes/sec, 0.00 docs/sec
    total 15789 reads, 1703.477 sec, 2967.4 kb/call avg, 107.8 msec/call avg
    total 98683 writes, 115.280 sec, 987.0 kb/call avg, 1.1 msec/call avg
    

    As you can see, the job failed with a WARNING. However, according to http://sphinxsearch.com/docs/archives/1.10/conf-sql-query-post-index.html it shouldn't be a problem?

    I will now continue with looping for reindexing.

  3. Janos SUTO repo owner

    Well, the sql warning is definitely a concern to me; it shouldn't occur, although if you have everything archived, then it's fine.

    However, I suggested upgrading the sphinx.conf file to support the new main - dailydelta - delta scheme, given that you would be resetting with "indexer --all" anyway. The older main - delta scheme also works, but with larger files it takes more and more time to merge the delta index into the main index every 30 minutes. Perhaps it's not an issue for you, and sphinx is pretty fast. Anyway, I suggest using the locking feature of the newer (util/)indexer.*.sh scripts to prevent two consecutive indexing jobs from overlapping.
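
    The idea behind such a lock can be sketched as follows. This is not the code from piler's indexer.*.sh scripts; the lock path and the function names are assumptions made purely for illustration. It relies on mkdir being atomic, so only one job at a time can create the lock directory.

    ```shell
    # Hypothetical lock location; piler's own scripts may use a different path and mechanism.
    LOCKDIR="${LOCKDIR:-/tmp/indexer.lock.d}"

    # mkdir either creates the directory or fails, so it succeeds for exactly one caller.
    acquire_lock() {
        mkdir "$LOCKDIR" 2>/dev/null
    }

    release_lock() {
        rmdir "$LOCKDIR" 2>/dev/null
    }

    # Run the given command only if no other job holds the lock.
    run_locked() {
        if ! acquire_lock; then
            echo "another indexing job is still running, skipping" >&2
            return 1
        fi
        "$@"
        status=$?
        release_lock
        return $status
    }
    ```

    A wrapper script could then call run_locked /usr/local/libexec/piler/indexer.delta.sh. Note that a job killed mid-run would leave a stale lock directory behind, which this sketch does not handle.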

  4. Travis Edgar reporter

    Sorry for not following up sooner. I will upgrade to the latest versions of Sphinx and Piler. I will then report back after taking the same steps mentioned above.

    Thanks for the patience.
