Very long time to alter MySQL tables.

Issue #296 resolved
Thomas Lindner created an issue

I am using the MySQL history logger for an experiment. The experiment has been running for several years, logging lots of history data, and some of the history tables are getting quite large. For instance, this table has 21 million rows and 91 columns:

mysql> select count(*) from beamlineepics_demand;
+----------+
| count(*) |
+----------+
| 21405157 |
+----------+
1 row in set (8.76 sec)

My problem occurs when adding new variables to the existing MIDAS history events. The history logger has a scheme for going through and altering the tables when new variables are added. The ALTER TABLE commands it executes look something like this:

mysql> ALTER TABLE `beamlineepics_demand` ADD COLUMN `b1u_tpmtop_rdvol_demand` float;
Query OK, 0 rows affected (4 min 0.11 sec)
Records: 0  Duplicates: 0  Warnings: 0

The problem is that with 21M rows, each ALTER command takes ~4 minutes (I think this is because the command must write a value, even NULL, into each row for the new column).

In my particular case, I added 10 new variables to each of 4 big tables, so the operation was going to take ~2.6 hours to complete.

One problem is that mlogger was not sufficiently clear about what it was doing. In particular, it printed a MIDAS message like

Adding column "b1u_bpm2b_rdcur_dmnd" to SQL table "beamlineepics_dmnd",

but it only did so after the 4-minute MySQL operation had completed. So, from the user's point of view, mlogger seemed to be hanging. The risk is that the user would then ctrl-c mlogger in the middle of the MySQL operation, resulting in a series of partly-created MySQL columns; this is what I did until I realized what was going on.

So my minimal suggestion is that the cm_msg message about altering the table should be printed before the ALTER operation, and that it should explicitly state that the operation might take a long time.

There may also be better/faster ways of doing these sorts of intensive MySQL operations, but I don't know MySQL well enough to propose a particular solution.
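
For what it's worth, one thing that might help (untested here, and dependent on the MySQL version and storage engine) is to batch all the new columns for a table into a single ALTER TABLE statement, so that the expensive table rebuild happens once per table instead of once per column. A sketch, using the table and one real column name from above; the other column names are made up for illustration:

-- Add several columns in one ALTER so the table copy happens only once.
-- (Only the first column name is from the original report; the others are hypothetical.)
ALTER TABLE `beamlineepics_demand`
  ADD COLUMN `b1u_tpmtop_rdvol_demand` float,
  ADD COLUMN `b1u_tpmtop_rdcur_demand` float,
  ADD COLUMN `b1u_tpmtop_rdtmp_demand` float;

-- On newer servers (MySQL 8.0.12+ or MariaDB 10.3+), ADD COLUMN can often be
-- done without rebuilding the table at all. Requesting it explicitly makes the
-- statement fail rather than silently fall back to a slow rebuild:
ALTER TABLE `beamlineepics_demand`
  ADD COLUMN `b1u_tpmtop_rdvol_demand` float,
  ALGORITHM=INSTANT;

Batching like this would presumably require mlogger to collect all pending column additions for a table before issuing the ALTER, so it may not be a small change.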

It is also probable that my MySQL server itself is not optimally configured. MySQL runs on an SSD, but there is probably more tuning that could be done. In any case, it would be good if MIDAS dealt gracefully with an imperfectly optimized MySQL server.

Comments (2)

  1. lee pool

    Hi Thomas

    A few years back I had a similar issue with large numbers of rows in a history table, and I found that indexing improved performance. I subsequently moved to a Postgres DB.

  2. Thomas Lindner reporter

    I added the extra logging output. The MySQL logger still seems pretty fragile for big tables; this might need to be revisited.