retention days can overflow (Y2038 Unixtime issue?)
I'm trying to figure out how the retention policy works. To prevent any expiry, even one caused by running the purge scripts by accident, I've set the retention days to 36500 (100 years). This caused my messages to get a 'retained' value somewhere in 1978, and they were deleted immediately when the purge script ran.
I haven't looked at this in great detail, but I would think this is caused by the Unix time overflow issue that we will run into around 2038, described at https://en.wikipedia.org/wiki/Year_2038_problem
I think this won't be a 'big' issue until any sane retention policy nears 2038, so given a policy of 10 years, around 2028. Still, let's make sure this is cleared up by then ;-)
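To get a feel for when a given policy would hit the limit, here is a minimal sketch that computes the largest retention period (in whole days) that still fits in a 32-bit timestamp. The function name and the two limits are my own assumptions for illustration; I have not checked which integer type piler or its schema actually uses.

```python
SECONDS_PER_DAY = 86400

# Assumed limits, not taken from the piler source or schema:
INT32_MAX = 2**31 - 1    # signed 32-bit, the classic Y2038 limit
UINT32_MAX = 2**32 - 1   # unsigned 32-bit, runs out in 2106


def max_retention_days(sent, limit):
    """Largest whole number of days that keeps sent + days*86400 <= limit."""
    return (limit - sent) // SECONDS_PER_DAY


# Hypothetical message sent at Unix time 1433147056 (mid-2015)
sent = 1433147056
print(max_retention_days(sent, INT32_MAX))   # signed 32-bit limit
print(max_retention_days(sent, UINT32_MAX))  # unsigned 32-bit limit
```

For that sent time, the signed limit allows 8267 days (a bit over 22 years) and the unsigned limit 33122 days (a bit under 91 years), so 36500 days overflows either way.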
Obligatory install details: fresh install:
root@dms:/usr/local/etc# piler -V
piler 1.1.1, build 904, Janos SUTO sj@acts.hu
Build Date: Fri Jul 22 14:43:57 CEST 2016
ldd version: ldd (Debian GLIBC 2.19-18+deb8u4) 2.19
gcc version: gcc version 4.9.2 (Debian 4.9.2-10)
Configure command: ./configure --localstatedir=/var --with-database=mysql --enable-starttls --enable-tcpwrappers
Comments (7)
-
repo owner -
reporter It's a 64-bit system, and the retained column value was a Unix timestamp somewhere in 1978 (270 million something; I had cleaned the database after this experiment, sorry).
-
repo owner I'm interested in the exact value of the retained column. Yesterday when I tested it, I got a timestamp translating to 2100 or something.
-
reporter Ok, I recreated this scenario, setting default_retention_days=36500. Importing a new message will now create a retained value in the past. Examples:
arrived: 1469432281 sent: 1433147056 retained: 291779760
arrived: 1469432281 sent: 1432646818 retained: 291279522
-
repo owner Hmm, it looks like an overflowed value. The correct timestamp should be 1433147056 + 86400*36500 = 4586747056; however, taken modulo 2^32 it yields 291779760. Let me double check it.
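The arithmetic above can be reproduced with a short sketch. It assumes the value is truncated to an unsigned 32-bit integer somewhere along the way; the helper names are hypothetical and are not piler functions.

```python
from datetime import datetime, timezone

SECONDS_PER_DAY = 86400


def retained_timestamp(sent, retention_days):
    """Intended expiry time: sent time plus the retention period in seconds."""
    return sent + retention_days * SECONDS_PER_DAY


def wrap_uint32(value):
    """Simulate storing a value in an unsigned 32-bit integer (assumption)."""
    return value % 2**32


# The two 'sent' values reported in the thread, with default_retention_days=36500
for sent in (1433147056, 1432646818):
    intended = retained_timestamp(sent, 36500)
    stored = wrap_uint32(intended)
    when = datetime.fromtimestamp(stored, tz=timezone.utc)
    print(sent, intended, stored, when.isoformat())
```

Both wrapped results match the retained values reported above (291779760 and 291279522), which supports the 32-bit truncation theory.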
-
repo owner It seems that the retained column is 4 bytes wide, and a 100-year retention won't fit. The fix seems to be trivial: change the retained column to bigint, i.e.
alter table metadata change column retained retained bigint default NULL;
Also make sure to run pilertest against a problematic message, and verify that the retention period value is correct.
-
repo owner - changed status to resolved
The purge utility reads the options or settings, and it aborts immediately if purging is not enabled. 100 years of retention is a nice safety margin anyway. What's the value stored in the retained column for the given message? And is it a 32-bit or 64-bit Linux?