pilerpurge causes a segfault in libmysqlclient.so.18.0.0

Issue #1228 closed
supporthq created an issue

Hi Janos,

It has been a very long time since I last posted an issue, as your solution has been working great for many years.

I have enabled pilerpurge now, as it has been 7+ years of collecting emails and there are finally emails that it can delete.

Running pilerpurge as user piler faults at roughly the same spot each time, give or take a few records:

removing attachment: /var/piler/store/00/538/91/1a/40000000538b62db354f02c400a5deb0911a.a3
removing attachment: /var/piler/store/00/538/91/1a/40000000538b62db354f02c400a5deb0911a.a4
removing attachment: /var/piler/store/00/538/91/1a/40000000538b62db354f02c400a5deb0911a.a5
removing attachment: /var/piler/store/00/538/91/1a/40000000538b62db354f02c400a5deb0911a.a6
removing attachment: /var/piler/store/00/538/91/1a/40000000538b62db354f02c400a5deb0911a.a7
Segmentation fault

In the logs, I can see this entry:

Jan 30 11:41:10 piler kernel: [ 2857.729531] pilerpurge[3730]: segfault at 2c02cc3711 ip 00007f2e22a0c79a sp 00007fffee763370 error 4 in libmysqlclient.so.18.0.0[7f2e229d7000+2bf000]

I have tried lots of things, like stopping piler, searchd, and cron while running pilerpurge, and also gave the machine a lot more RAM and CPU, but there was no improvement.

My version of piler is very old; I haven't had a need to update and am a little nervous about upgrading. If you think that is the solution, are there any instructions for jumping this many versions, or other suggestions? Maybe just export into a new OVA?

root@piler:~# piler -v
piler 1.1.1, build 904, Janos SUTO <sj@acts.hu>

Build Date: Sun Sep 27 15:05:42 EST 2015
ldd version: ldd (Debian EGLIBC 2.13-38) 2.13
gcc version: gcc version 4.7.2 (Debian 4.7.2-5)
Configure command: ./configure --localstatedir=/var --with-database=mysql --enable-starttls --enable-tcpwrappers

Comments (17)

  1. supporthq reporter

    I may have worked around the issue: I set the system date back to 2013 and ran pilerpurge, which seemed to complete without error, discarding a few thousand emails. Then I set the date to 2014 and ran pilerpurge again, and so on. So far it seems to work.

    Out of interest, how often should you run it? Daily? Weekly? Monthly?

  2. Janos SUTO repo owner

    Well, I usually suggest upgrading for both new features and bugfixes. Since your version is indeed pretty old, it's worth upgrading; however, Debian 7 is outdated as well, and recent versions of piler require recent OS packages.

    Anyway, if you managed to find a solution, that's great. I'd run pilerpurge daily, because then it needs to remove fewer messages and it finishes faster.
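
    For example, a crontab entry for the piler user along these lines would do (the pilerpurge path is an assumption; check where your build installed it):

    30 1 * * * /usr/local/sbin/pilerpurge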

  3. supporthq reporter

    Thanks for your reply.

    I have been rolling through the purge. I suspect it is faulting on a particular record rather than on the volume of records to purge; I will try to narrow it down.

  4. supporthq reporter

    I stand corrected. It was segfaulting when I purged the whole month between 2020-04 and 2020-05, so I made up a script that increments the clock by one day and runs the purge (a sketch of the idea is below), and it completed without issue; each day takes around 8 to 9 minutes. So maybe it is the volume after all, if not the number of emails then the total number of attachments?
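
    Roughly, the script does something like this (a sketch only: it assumes GNU date, that ntp is stopped so the clock is not corrected mid-run, and that pilerpurge is on the PATH):

    #!/bin/bash
    # Walk the clock forward one day at a time, purging after each step.
    d="2020-04-01"
    while [ "$d" != "2020-05-01" ]; do
        date -s "$d"                   # jump the system clock to the next day
        su -c pilerpurge piler         # run the purge as the piler user
        d=$(date -I -d "$d + 1 day")   # compute the next day
    done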

    Here is when the fault occurred for a one-month batch:

    May 1 01:07:03 piler kernel: [ 8720.063675] pilerpurge[4307]: segfault at 2c019d380a ip 00007fec07c7c79a sp 00007fffdb7b2c20 error 4 in libmysqlclient.so.18.0.0[7fec07c47000+2bf000]

    These are the day-by-day batches:

    Apr 1 01:02:16 piler pilerpurge[4809]: purged 0 messages, 0 bytes
    Apr 2 01:09:57 piler pilerpurge[4827]: purged 123 messages, 25842526 bytes
    Apr 3 01:09:04 piler pilerpurge[4895]: purged 619 messages, 160913947 bytes
    Apr 4 01:08:18 piler pilerpurge[4959]: purged 690 messages, 238340349 bytes
    Apr 5 01:08:32 piler pilerpurge[5007]: purged 597 messages, 256511351 bytes
    Apr 6 01:07:56 piler pilerpurge[5049]: purged 454 messages, 111199477 bytes
    Apr 7 01:06:54 piler pilerpurge[5100]: purged 92 messages, 35165971 bytes
    Apr 8 01:08:16 piler pilerpurge[5144]: purged 121 messages, 40104701 bytes
    Apr 9 01:08:18 piler pilerpurge[5184]: purged 571 messages, 132024614 bytes
    Apr 10 01:08:45 piler pilerpurge[5234]: purged 576 messages, 186112442 bytes
    Apr 11 01:08:15 piler pilerpurge[5286]: purged 555 messages, 151551863 bytes
    Apr 12 01:08:14 piler pilerpurge[5334]: purged 470 messages, 168594676 bytes
    Apr 13 01:08:15 piler pilerpurge[5374]: purged 424 messages, 166376689 bytes
    Apr 14 01:06:56 piler pilerpurge[5429]: purged 86 messages, 16708693 bytes
    Apr 15 01:07:15 piler pilerpurge[5473]: purged 156 messages, 73282748 bytes
    Apr 16 01:09:13 piler pilerpurge[5508]: purged 601 messages, 139985711 bytes
    Apr 17 01:08:02 piler pilerpurge[5573]: purged 414 messages, 62453444 bytes
    Apr 18 01:09:04 piler pilerpurge[5622]: purged 449 messages, 238614441 bytes
    Apr 19 01:08:12 piler pilerpurge[5679]: purged 527 messages, 108773786 bytes
    Apr 20 01:08:27 piler pilerpurge[5727]: purged 387 messages, 104392763 bytes
    Apr 21 01:06:52 piler pilerpurge[5778]: purged 88 messages, 10268590 bytes
    Apr 22 01:07:20 piler pilerpurge[5822]: purged 100 messages, 66370951 bytes
    Apr 23 01:09:30 piler pilerpurge[5856]: purged 667 messages, 214714134 bytes
    Apr 24 01:08:56 piler pilerpurge[5927]: purged 643 messages, 157633252 bytes
    Apr 25 01:08:30 piler pilerpurge[5979]: purged 547 messages, 137847128 bytes
    Apr 26 01:07:20 piler pilerpurge[6029]: purged 118 messages, 23254623 bytes
    Apr 27 01:08:10 piler pilerpurge[6064]: purged 441 messages, 117676670 bytes
    Apr 28 01:07:27 piler pilerpurge[6114]: purged 82 messages, 41702033 bytes
    Apr 29 01:07:19 piler pilerpurge[6160]: purged 180 messages, 169735874 bytes
    Apr 30 01:09:07 piler pilerpurge[6195]: purged 607 messages, 168567069 bytes

  5. supporthq reporter

    It seems to be faulting quite randomly now; I might just let it keep running and hope it eventually finishes.

    I am not sure what state the database and filesystem are left in when the purge segfaults. Is there a query/script I can run to check the integrity? I am worried that it is leaving orphaned files or records. (A rough check is sketched after the log below.)

    30-04-2020
    Thu Apr 30 00:00:00 EST 2020
    07-05-2020
    Thu May 7 00:00:00 EST 2020
    14-05-2020
    Thu May 14 00:00:00 EST 2020
    21-05-2020
    Thu May 21 00:00:00 EST 2020
    Segmentation fault
    28-05-2020
    Thu May 28 00:00:00 EST 2020
    Segmentation fault
    04-06-2020
    Thu Jun 4 00:00:00 EST 2020
    11-06-2020
    Thu Jun 11 00:00:00 EST 2020
    18-06-2020
    Thu Jun 18 00:00:00 EST 2020
    Segmentation fault
    25-06-2020
    Thu Jun 25 00:00:00 EST 2020
    Segmentation fault
    02-07-2020
    Thu Jul 2 00:00:00 EST 2020
    09-07-2020
    Thu Jul 9 00:00:00 EST 2020
    16-07-2020
    Thu Jul 16 00:00:00 EST 2020
    23-07-2020
    Thu Jul 23 00:00:00 EST 2020
    30-07-2020
    Thu Jul 30 00:00:00 EST 2020
    06-08-2020
    Thu Aug 6 00:00:00 EST 2020
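
    Something like this might spot mismatches between the store and the database (a sketch only: it assumes the stock schema where metadata.piler_id matches a store file's base name, that the database is named piler, and that mysql credentials are in ~/.my.cnf):

    #!/bin/bash
    # Flag .m files whose metadata row is missing or already marked deleted.
    find /var/piler/store -name '*.m' | while read -r f; do
        id=$(basename "$f" .m)
        deleted=$(mysql -N piler -e "SELECT deleted FROM metadata WHERE piler_id='$id'")
        if [ -z "$deleted" ]; then
            echo "orphan file, no metadata row: $f"
        elif [ "$deleted" = "1" ]; then
            echo "file still on disk but marked deleted: $f"
        fi
    done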

  6. supporthq reporter

    I managed to get through a full year by incrementing one day at a time. It is a slow process which takes about 7 minutes to run the purge for each day. I think I can manage through it, but it definitely seems like a daily task to keep the database in tip-top shape.

    I do need help with one thing, however. I had three emails that were imported with a weird future date in the year 2034, and I followed some advice in another thread that said to update the metadata record with an old retention date. Interestingly, the three emails are still appearing at the top of the search but are marked as 'message not verified', and if I try to download one I get a 'pilerget cannot open' error. So it seems that the files have been purged, but the metadata records are still in the database.

    What steps can I take to manually remove this entry?

  7. Janos SUTO repo owner

    In order to keep the messaging history, the purge utility doesn't remove anything from the piler mysql database.

    If you insist on removing those messages (I personally believe that it's not a good idea), then delete the affected id from both the metadata and rcpt tables.
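
    For example (a sketch only: 12345 is a placeholder for the affected id, which you should look up in the metadata table first; the database name piler is an assumption, and back up the database before deleting):

    mysql piler -e "DELETE FROM rcpt WHERE id=12345"
    mysql piler -e "DELETE FROM metadata WHERE id=12345"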

  8. supporthq reporter

    Hello again. I think I am done with the purge and am putting it into the daily routine, so it should be OK from now on. One thing I did notice is that all the purged messages are still appearing in the auditor search with a 'message not verified' message. It indeed looks like the files have been purged and the database has deleted = 1 in the metadata. Is this expected behavior, or does something need to run to remove these purged messages from appearing in the search?

  9. Janos SUTO repo owner

    Yes, the deleted=1 column setting is intentional. Run the delta indexer to get rid of the deleted messages from the GUI.
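
    For example, invoking by hand the same script the stock crontab runs:

    /usr/local/libexec/piler/indexer.delta.sh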

  10. supporthq reporter

    As far as I can tell, the messages have deleted=1 in their records, the .m file has been deleted, the .a files are still present, and the messages still appear in the search. I have these two items in crontab (below); am I missing something?

    5,35 4-23 * * * /usr/local/libexec/piler/indexer.delta.sh
    30   2    * * * /usr/local/libexec/piler/indexer.main.sh
    

  11. Janos SUTO repo owner

    No, the crontab entries look fine. I'm not sure why the kill-list doesn't kick in. By the way, what sphinx version do you have?
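
    For context, sphinx builds the kill-list from the sql_query_killlist directive in the delta source definition; an illustrative example (not necessarily the stock piler query; check your own sphinx.conf) would be:

    sql_query_killlist      = SELECT id FROM metadata WHERE deleted=1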

  12. supporthq reporter

    Here is the sphinx version

    Sphinx 2.0.4-release (r3135)
    Copyright (c) 2001-2012, Andrew Aksyonoff
    Copyright (c) 2008-2012, Sphinx Technologies Inc (http://sphinxsearch.com)
    

  13. supporthq reporter

    Would it have anything to do with being up to main4?

    /etc/sphinxsearch/sphinx.conf
    
    index main1
    {
            source                  = main1
            path                    = /var/piler/sphinx/main1
            docinfo                 = extern
            charset_type            = utf-8
            enable_star             = 1
            min_prefix_len          = 6
            min_word_len            = 1
    
    }
    
    index main2
    {
            source                  = main2
            path                    = /var/piler/sphinx/main2
            docinfo                 = extern
            charset_type            = utf-8
            enable_star             = 1
            min_prefix_len          = 6
            min_word_len            = 1
    }
    
    index main3
    {
            source                  = main3
            path                    = /var/piler/sphinx/main3
            docinfo                 = extern
            charset_type            = utf-8
            enable_star             = 1
            min_prefix_len          = 6
            min_word_len            = 1
    }
    
    index main4
    {
            source                  = main4
            path                    = /var/piler/sphinx/main4
            docinfo                 = extern
            charset_type            = utf-8
            enable_star             = 1
            min_prefix_len          = 6
            min_word_len            = 1
    }

  14. Janos SUTO repo owner

    No, I don't think so. I suspect that the 'no subject' email is caused by a phantom or bogus sphinx entry. Check the mail log when you click on it to see whether it syslogs any error.

  15. Log in to comment