pilerimport IMAP import gets stuck
Hi Janos,
I am running Piler 1.3.0 build 955 on a Debian server.
I am using pilerimport to archive emails from a remote IMAP server which holds roughly 480K emails. The import goes well until about halfway through, when the process gets stuck and imports nothing further.
I restarted the pilerimport process to see if it would do any better, but it got stuck at the same point, after processing around 262K emails.
Can you please suggest any workaround for this? I really need to archive all these emails into my mail archival server.
These are the last few lines from /var/log/messages:
kernel: [2855234.979740] catdoc[15409]: segfault at a0c4238 ip 0000000000404195 sp 00007ffd5c05ecc8 error 4 in catdoc[400000+9000]
kernel: [2855833.174712] catdoc[16176]: segfault at 8d8f238 ip 0000000000404195 sp 00007ffdbd72d7a8 error 4 in catdoc[400000+9000]
Comments (9)
-
repo owner -
reporter Did exactly what you said. Upon doing pilerimport for the last email before it failed:
pilerimport -e 28516-imap-262700.txt
duplicate: 28516-imap-262700.txt (duplicate id: 2936075)
So the email message itself doesn't seem to be the problem? The other thing I noticed across my failed pilerimport attempts is that the number of processed emails before it got stuck was the same each time (exactly 262700).
First attempt:
duplicate: 15480-imap-262611.txt (duplicate id: 2935990)
duplicate: 15480-imap-262657.txt (duplicate id: 2936036)
duplicate: 15480-imap-262658.txt (duplicate id: 2936036)
duplicate: 15480-imap-262676.txt (duplicate id: 2936053)
duplicate: 15480-imap-262677.txt (duplicate id: 2936053)
duplicate: 15480-imap-262678.txt (duplicate id: 2936053)
processed: 262700 [ 54%]
Second attempt:
duplicate: 28516-imap-262697.txt (duplicate id: 2936072)
duplicate: 28516-imap-262698.txt (duplicate id: 2936073)
duplicate: 28516-imap-262699.txt (duplicate id: 2936074)
processed: 262700 [ 64%]
Between first and second attempts, emails were being moved across folders so they don't exactly show the same completion percentage.
Could pilerimport have a limit on the number of emails it can process?
-
repo owner There shouldn't be such a limit. Btw. please check whether the email file looks like a valid and complete message. Also, if you have POP3 enabled, try importing via POP3 as well to see whether it's an IMAP protocol problem.
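A rough way to check whether a saved message file "looks valid and complete" is to run it through a generic parser. The sketch below uses Python's standard email package, not Piler's own parser, so it can only flag obvious damage. The function names and the choice of required headers are assumptions made for illustration.

```python
from email import message_from_bytes
from pathlib import Path

# Assumption: a message lacking these headers is treated as suspicious.
REQUIRED_HEADERS = ("From", "Date")

def message_problems(raw):
    """Return a list of human-readable problems found in raw message bytes."""
    msg = message_from_bytes(raw)
    problems = [f"missing header: {h}" for h in REQUIRED_HEADERS if msg[h] is None]
    # The parser records structural defects (e.g. a multipart body whose
    # boundaries are broken) on the affected part.
    for part in msg.walk():
        problems.extend(type(d).__name__ for d in part.defects)
    if not raw.strip():
        problems.append("empty file")
    return problems

def check_file(path):
    """Convenience wrapper: read a saved message file and report problems."""
    return message_problems(Path(path).read_bytes())
```

An empty list from check_file("28516-imap-262700.txt") only means this parser found nothing wrong; it says nothing about how pilerimport or catdoc handle the same file.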
-
reporter I was able to locate that message in the piler GUI, so it looks like there wasn't any problem with that particular email (it has already been imported and archived!). However, I have moved emails around on the IMAP server so they are spread out across multiple folders (around 50K each). I have restarted pilerimport with one folder at a time and will see what happens from there.
Thank you for your help!
-
reporter Pilerimport got stuck again and I saved the email file which it was trying to import. Did a pilerimport -e on that single .txt file and it did not complain.
I have sent the text file to your email address. Please take a look at it!
Thanks!
-
repo owner I got it, and will check it later. I'll keep you posted.
-
repo owner Well, it's spam. The parser can process it, though it doesn't recognize the ending </style> tag. I'll try to fix that.
-
repo owner The problem might be in the IMAP protocol implementation, so I've written a Python-based tool to fetch emails via IMAP. Check out https://bitbucket.org/jsuto/piler/raw/1ec38590e62edab2b2a8d693f8ec2eca1e3c267c/util/imapfetch.py for the script. After getting the emails with imapfetch.py, you can process them with pilerimport.
Btw. sorry that it took a bit longer than usual.
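For readers who want to see the general idea of fetching over IMAP first and importing afterwards, here is a minimal sketch using Python's standard imaplib. It is not the imapfetch.py linked above. The function names, the one-file-per-UID layout, and the .eml suffix are all assumptions chosen for illustration.

```python
import imaplib
from pathlib import Path

def connect(host, user, password):
    """Open an SSL IMAP connection and log in (host and credentials are yours)."""
    conn = imaplib.IMAP4_SSL(host)
    conn.login(user, password)
    return conn

def save_folder(conn, folder, out_dir):
    """Write every message in `folder` to out_dir/<uid>.eml; return the count."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    # Read-only select so fetching does not change flags on the server.
    status, _ = conn.select(folder, readonly=True)
    if status != "OK":
        raise RuntimeError(f"cannot select folder {folder!r}")
    status, data = conn.uid("search", None, "ALL")
    if status != "OK":
        raise RuntimeError("UID SEARCH failed")
    saved = 0
    for uid in data[0].split():
        status, parts = conn.uid("fetch", uid.decode(), "(RFC822)")
        if status != "OK" or not parts or not isinstance(parts[0], tuple):
            continue  # skip anything the server did not return in full
        (out / f"{uid.decode()}.eml").write_bytes(parts[0][1])
        saved += 1
    return saved
```

Fetching one folder at a time keeps each run small, which matches the reporter's approach of splitting the mailbox into folders. Folder names containing spaces would need quoting, which this sketch does not handle. The saved files can then be imported one by one with pilerimport -e <filename>, as used earlier in this thread.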
-
repo owner - changed status to resolved
Hello. It seems like catdoc has a problem. You should see a file like <pid>-imap-<message number>.txt in the current directory. It's supposed to be the last email it's trying to process at the moment. Copy that file to somewhere else to save it. Then kill pilerimport, and try to import that specific message only using -e <filename>, and let's see if pilerimport can handle it. If not, then I'd like to see that email, and check what's wrong with it.
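The "find the file and copy it somewhere safe" step described above can be sketched as a small Python helper. The file name pattern is taken from the names shown in this thread (for example 28516-imap-262700.txt). The function names and the use of modification time to pick the current file are assumptions, not part of Piler.

```python
import re
import shutil
from pathlib import Path

# Pattern based on the file names seen in this thread: <pid>-imap-<number>.txt
SPOOL_NAME = re.compile(r"^(\d+)-imap-(\d+)\.txt$")

def find_in_progress(directory="."):
    """Return the most recently modified <pid>-imap-<n>.txt file, or None."""
    candidates = [p for p in Path(directory).iterdir()
                  if p.is_file() and SPOOL_NAME.match(p.name)]
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)

def save_copy(path, dest_dir):
    """Copy the message aside so it is still available after pilerimport is killed."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(path, dest / Path(path).name))
```

Run it from the directory where pilerimport was started, before killing the process. The saved copy can then be tested on its own with pilerimport -e <filename>.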