A couple of questions

Hi, firstly, thank you ever so much for this excellent mail archive solution. I have been experimenting with it for the last few days, after installing it on a CentOS 6.4 VM. It seems to work very well; however, I have a couple of questions I wonder if you could help with.

I am not an expert Linux or DB admin, more novice/intermediate, so apologies if some of my issues are of a very basic nature!

There are two things I am a little confused by. The first is the search indexing; I don't quite understand how it works. I had thought that the regular indexer.delta.sh script would only need to index mails added since the last time it ran (I assumed it used the email index IDs to know which mails it had not seen). However, I found today that even if I run indexer.delta.sh and then immediately run it again, it takes the same length of time (around 2.5 minutes). Is this normal?

I can see from 'select count(*) from sph_index' that newly added mails are there (e.g. if I add 100 mails then this value is 100) and that the value goes back to 0 after running indexer.delta.sh.

Secondly, today I added a bunch of mail from an mbox file which is created by postfix on my mail server using always_bcc. Basically, I copy this file off to another location at the end of each day and date/timestamp it. I then keep these copies as 'original' versions of all email, which I can use to recreate my archive if required in the future (I was using a hypermail archive previously). So I added yesterday's mbox-format 'bcc' file, which should have all of yesterday's mail in it, with pilerimport. That was fine. I then manually copied off the bcc mbox file for today so far and again used pilerimport to import it. I was surprised to see that out of 50 emails, there were 11 duplicates (I could see this in the output of pilerimport).

How can there be duplicate mails in this situation? Is this some form of deduplication relating to a conversation thread?
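
To narrow this down on my side, one check would be to look for repeated Message-ID headers in the mbox file itself. Below is a minimal sketch of that, using Python's standard mailbox module. I don't know what pilerimport actually compares when it reports a duplicate, so treating Message-ID as the key is purely my assumption, and the function name duplicate_message_ids is just something I made up for illustration.

```python
import mailbox
import sys
from collections import Counter


def duplicate_message_ids(mbox_path):
    """Return {message_id: count} for each Message-ID seen more than once.

    Assumption: Message-ID is a reasonable first guess at a duplicate key.
    The archive itself may well use a different criterion.
    """
    counts = Counter()
    # create=False: raise an error instead of creating a missing file
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for msg in box:
            msg_id = msg.get("Message-ID")
            if msg_id:
                counts[str(msg_id).strip()] += 1
    finally:
        box.close()
    return {mid: n for mid, n in counts.items() if n > 1}


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python this_script.py /path/to/copy-of-bcc.mbox
    for mid, n in sorted(duplicate_message_ids(sys.argv[1]).items()):
        print(n, mid)
```

If this prints nothing for today's file, then the 11 duplicates are presumably not simple repeats of the same Message-ID within that file, and the match must be against something already in the archive or based on some other property of the message.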

Many thanks in advance for your reply. Regards, Craig
