Query about the pilerimport option
Dear jsuto, I have a few questions about piler:
- I am going to import terabytes of mail data from Zimbra 8.x into mailpiler 1.1.1 using pilerimport, but from what I have observed in practice it would take about a month. How can I import this mail data into piler in less time?
- My mail data already goes back 4 years, as I have been using Zimbra since 2011, and I am importing it into piler now. If I set the retention policy to delete mails after 6 years, when will these mails get deleted in piler? Any advice would be appreciated. Thanks, Arun
Comments (14)
-
-
reporter Dear extremeshok,
Thanks for the reply. I am not importing over the network, though; it is all from my local HDD, as I copied all the Zimbra mail data to the piler machine, and even so it still looks like it will take a month. Regarding my second question: if I import four-year-old data into piler now, what timestamp will those mails have in piler? If piler treats those mails as having the current time when I import them, then those four-year-old mails will only be purged once they are 10 (6+4) years old. Can you please clarify this?
-
Date of the email. Not the import date.
You'll need custom-written scripts to speed up the import. Note that the speed is limited by CPU and memory.
-
reporter Dear extremeshok,
Thanks for the suggestion! I hope it helps.
-
reporter - changed status to closed
-
reporter Dear extremeshok, no luck for me. Could you please share the script with me? Thanks in advance.
-
Hi extremeshok,
Can you give some tips on how to create this multithreaded script? It would be very useful, since we have a similar requirement.
Thanks in advance
-
reporter - changed status to open
-
repo owner Pilerimport is indeed a single-threaded utility that imports one email at a time. However, you can run several instances in parallel, e.g. pilerimport -d dir1, pilerimport -d dir2, pilerimport -d dir3, etc.
To speed up the import process, make sure you have enough CPU power, a fine-tuned MySQL server with big enough buffers, and a fast disk.
Another option is to turn off attachment processing, provided that indexing attached files is not important. I'll add an option allowing you to disable attachment processing.
-
Thanks Janos for the reply.
Can we temporarily stop indexing, since we have not released the piler server to users yet, and then start indexing once the import is complete? Would that help overall?
-
reporter Dear Janos,
I tried running pilerimport on multiple directories, but the import only worked for the last directory. This is the command I ran: su - piler pilerimport -d dir1, pilerimport -d dir2, pilerimport -d dir3
I separated the mail data into three directories, dir1, dir2 and dir3, but only dir3 was imported. Any advice on this, please?
-
repo owner You didn't think you should type "pilerimport -d dir1, pilerimport -d dir2, pilerimport -d dir3", did you? I meant that you should issue these 3 commands in 3 terminals, one command per terminal, e.g.
pilerimport -d dir1
pilerimport -d dir2
pilerimport -d dir3
Btw, the option I mentioned before already exists: extract_attachments. Set it to 0 in piler.conf to disable attachment processing, if that's a viable option for you.
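For anyone who would rather not keep three terminals open, the same idea can be sketched in a single shell session using background jobs. This is only a minimal sketch: the function name run_parallel_imports, the IMPORT_CMD variable and the import-<dir>.log file names are made up for illustration, and only the pilerimport -d <dir> invocation comes from this thread.

```shell
#!/bin/sh
# IMPORT_CMD is a hypothetical override hook; it defaults to pilerimport.
IMPORT_CMD="${IMPORT_CMD:-pilerimport}"

# run_parallel_imports DIR...
# Starts one importer instance per directory in the background,
# then waits for all of them. Returns non-zero if any instance failed.
run_parallel_imports() {
    pids=""
    for dir in "$@"; do
        # Each instance writes to its own log file (name is illustrative).
        "$IMPORT_CMD" -d "$dir" > "import-$(basename "$dir").log" 2>&1 &
        pids="$pids $!"
    done

    status=0
    for pid in $pids; do
        # wait PID returns the exit status of that background job.
        wait "$pid" || status=1
    done
    return "$status"
}
```

Run as the piler user, a call such as run_parallel_imports dir1 dir2 dir3 would start three instances at once and return only when all of them have finished.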
-
repo owner - changed status to resolved
I hope there's enough information for you to improve processing time.
-
reporter Dear jsuto, I tried it in my environment and it works for me. Thanks a lot.
-
I wrote a multi-threaded shell script for piler import, which we used to migrate terabytes of data from an Atmail archive to mailpiler; it took around 20 hours instead of 4 months.
First we exported the entire archive of individually compressed emails to /datastore/import and ran the script against this directory, mainly to remove network and protocol latencies.
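The script itself was not posted in this thread. As a rough sketch of one preparatory step such a script might need, the function below spreads the files in one directory across N subdirectories, so that each pilerimport -d instance can be pointed at its own subdirectory. The function name split_into_buckets and the bucket-N directory names are hypothetical, and it assumes the messages sit as regular files directly under the source directory.

```shell
#!/bin/sh
# split_into_buckets SRC_DIR N
# Moves the regular files in SRC_DIR round-robin into
# SRC_DIR/bucket-0 ... SRC_DIR/bucket-(N-1). Names are illustrative only.
split_into_buckets() {
    src=$1
    n=$2

    # Create the N target directories.
    i=0
    while [ "$i" -lt "$n" ]; do
        mkdir -p "$src/bucket-$i"
        i=$((i + 1))
    done

    # Distribute files; directories (including the buckets) are skipped.
    i=0
    for f in "$src"/*; do
        [ -f "$f" ] || continue
        mv "$f" "$src/bucket-$((i % n))/"
        i=$((i + 1))
    done
}
```

For example, split_into_buckets /datastore/import 4 would leave four subdirectories of roughly equal file count, each of which could then be handed to its own pilerimport -d instance.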