Index Builds Fail

Issue #413 on hold

Travis Edgar created an issue 2014-09-26

These were the steps taken before it died on us.

/etc/init.d/rc.searchd stop
rm /var/piler/sphinx/*
indexer --all
/etc/init.d/rc.searchd start
mkdir /tmp/reindex
chown piler. -R /tmp/reindex
cd /tmp/reindex
reindex -a
/usr/local/libexec/piler/indexer.delta.sh

At this point it died with the following error:

root@awsemailarchive01:/tmp/reindex# /usr/local/libexec/piler/indexer.delta.sh
WARNING: sql_query_post_index: The total number of locks exceeds the lock table size (DSN=mysql://piler:***@localhost:3306/piler)
FATAL: no merge destination index 'dailydelta1'

I cannot find a reference to dailydelta1 in any of my configs.

We currently have ~11288221 messages in piler, so any indexing run can take some time to complete.

Any help would be greatly appreciated.

Comments (8)

Janos SUTO repo owner
I occasionally change the sphinx configuration and layout to get a better, more efficient index of the stored emails. The dailydelta1 index is the result of this effort. However, even though the latest piler version ships with a tweaked, modified sphinx scheme, you may still use the old sphinx indices. In that case you only have to put the following in config-site.php, and leave the piler user's crontab entries related to indexing unchanged:
```
$config['SPHINX_MAIN_INDEX'] = 'main1,delta1';
```
However, if you have removed the sphinx index files, then be sure to update the sphinx config file and the piler crontab entries. Then make sure searchd can start after running indexer --all (i.e. resetting or zeroing the sphinx indices). At this point "only" one task is left: reindexing your 11 million emails. I'd do it in batches, e.g. reindex 1000 or so emails, index them, then move on to the next block of 1000:
```
cd /tmp

reindex -f 1 -t 1000
/usr/local/libexec/piler/indexer.delta.sh
reindex -f 1001 -t 2000
/usr/local/libexec/piler/indexer.delta.sh
...
```
A simple for loop would probably do the trick.
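The batching above could be sketched roughly as follows. This is only an illustration, not a script shipped with piler: `batch_ranges`, `reindex_in_batches` and `DRY_RUN` are made-up names, and it assumes that `reindex -f` / `-t` take the first and last message id of a batch, as in the commands above. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: reindex in fixed-size batches, running the delta indexer after each.
# batch_ranges, reindex_in_batches and DRY_RUN are hypothetical names.

INDEXER=/usr/local/libexec/piler/indexer.delta.sh

# Print one "from to" pair per line, covering first..last in steps of size.
batch_ranges() {
    first=$1; last=$2; size=$3
    from=$first
    while [ "$from" -le "$last" ]; do
        to=$((from + size - 1))
        if [ "$to" -gt "$last" ]; then
            to=$last
        fi
        echo "$from $to"
        from=$((to + 1))
    done
}

# Run (or, in dry-run mode, just print) reindex + delta indexing per batch.
reindex_in_batches() {
    batch_ranges "$1" "$2" "$3" | while read -r from to; do
        if [ "${DRY_RUN:-1}" = 1 ]; then
            echo "reindex -f $from -t $to && $INDEXER"
        else
            # </dev/null keeps the commands from eating the loop's input.
            reindex -f "$from" -t "$to" </dev/null || exit 1
            "$INDEXER" </dev/null || exit 1
        fi
    done
}
```

Something like `DRY_RUN=0 reindex_in_batches 1 11288221 1000`, run as the piler user from a scratch directory, would then walk the whole archive and stop at the first failing batch. The upper bound here is just the message count mentioned above; the highest message id in your database may differ.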

If you have to reindex everything from scratch, then it's worth thinking about upgrading sphinx to 2.2.4, since it brings some improvements as well as changes that are incompatible with older versions (even though the 2.0.x or 2.1.x settings are still supported at the moment). Note that this requires some fixes in the sphinx.conf shipped with piler. If you choose this path, let me know, and help with tweaking sphinx.conf.

And perhaps it's my fault for not emphasizing enough what to do with sphinx when upgrading piler; I apologize for that.
- 2014-09-27T18:23:12+00:00
Janos SUTO repo owner
- assigned issue to Janos SUTO
- 2014-09-27T18:23:29+00:00

Travis Edgar reporter

So I added...

$config['SPHINX_MAIN_INDEX'] = 'main1,delta1';

I also removed all cron entries, and then ran the following:

indexer --all
Sphinx 2.1.5-id64-release (rel21-r4508)
Copyright (c) 2001-2014, Andrew Aksyonoff
Copyright (c) 2008-2014, Sphinx Technologies Inc (http://sphinxsearch.com)

using config file '/etc/sphinxsearch/sphinx.conf'...
indexing index 'main1'...
collected 0 docs, 0.0 MB
total 0 docs, 0 bytes
total 0.037 sec, 0 bytes/sec, 0.00 docs/sec
indexing index 'main2'...
collected 0 docs, 0.0 MB
total 0 docs, 0 bytes
total 0.001 sec, 0 bytes/sec, 0.00 docs/sec
indexing index 'main3'...
collected 0 docs, 0.0 MB
total 0 docs, 0 bytes
total 0.001 sec, 0 bytes/sec, 0.00 docs/sec
indexing index 'main4'...
collected 0 docs, 0.0 MB
total 0 docs, 0 bytes
total 0.001 sec, 0 bytes/sec, 0.00 docs/sec
indexing index 'delta1'...
collected 10047249 docs, 28754.1 MB
sorted 13830.0 Mhits, 100.0% done
WARNING: sql_query_post_index: The total number of locks exceeds the lock table size (DSN=mysql://piler:***@localhost:3306/piler)
total 10047249 docs, 28754097161 bytes
total 11762.444 sec, 2444568 bytes/sec, 854.18 docs/sec
indexing index 'tag1'...
collected 0 docs, 0.0 MB
total 0 docs, 0 bytes
total 0.001 sec, 0 bytes/sec, 0.00 docs/sec
indexing index 'note1'...
collected 0 docs, 0.0 MB
total 0 docs, 0 bytes
total 0.001 sec, 0 bytes/sec, 0.00 docs/sec
total 15789 reads, 1703.477 sec, 2967.4 kb/call avg, 107.8 msec/call avg
total 98683 writes, 115.280 sec, 987.0 kb/call avg, 1.1 msec/call avg

As you can see, the job failed with a WARNING. However, according to http://sphinxsearch.com/docs/archives/1.10/conf-sql-query-post-index.html it shouldn't be a problem?

I will now continue with looping for reindexing.

2014-09-29T22:35:02+00:00

Janos SUTO repo owner
Well, the SQL warning is definitely a concern to me; it shouldn't occur, although if you have everything archived, then it's fine.

However, what I suggested was to upgrade the sphinx.conf file to support the new main - dailydelta - delta scheme, provided that you reset the indices with "indexer --all". The older main - delta scheme also works, but as the index files grow it takes more and more time to merge the delta index into the main index every 30 minutes. Perhaps that's not an issue for you, as sphinx is pretty fast. In any case, I suggest using the locking feature of the newer (util/)indexer.*.sh scripts to prevent two consecutive indexing jobs from overlapping.
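To illustrate the idea of such a lock (this is not the code from the piler scripts, just a minimal sketch of the general technique), a wrapper can refuse to start a job while a previous one still holds the lock. `LOCKDIR` and `run_locked` are hypothetical names, and the lock path is an arbitrary choice.

```shell
#!/bin/sh
# Sketch of a lock guard for cron-driven indexing jobs.
# LOCKDIR and run_locked are hypothetical names, not part of piler.

LOCKDIR=${LOCKDIR:-/tmp/indexer.lock.d}

# Run the given command only if no other run_locked job holds the lock.
run_locked() {
    # mkdir is atomic: only one caller can create the directory.
    if ! mkdir "$LOCKDIR" 2>/dev/null; then
        echo "another indexing job is still running, skipping" >&2
        return 75   # distinct status so callers can tell "skipped" apart
    fi
    "$@"
    status=$?
    rmdir "$LOCKDIR"
    return "$status"
}
```

A crontab entry could then call `run_locked /usr/local/libexec/piler/indexer.delta.sh` from a small wrapper script. Note that in this simple form a job that gets killed leaves the lock directory behind, and it has to be removed by hand.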
- 2014-09-30T09:09:55+00:00
Janos SUTO repo owner
Is it working properly at the moment?
- 2014-10-09T08:07:03+00:00
Travis Edgar reporter
Sorry for not following up sooner. I will upgrade to the latest versions of Sphinx and Piler. I will then report back after taking the same steps mentioned above.

Thanks for the patience.
- 2014-10-09T21:16:46+00:00
Janos SUTO repo owner
OK, fine.
- 2014-10-10T11:23:33+00:00
Janos SUTO repo owner
- changed status to on hold
OK, be sure to reopen the issue when you come back.
- 2014-10-24T13:44:03+00:00
Assignee: Janos SUTO

Type: bug

Priority: major

Status: on hold