inconsistency between db_get_record_size() and db_get_record()
for an ODB record with a double and an int (total data size 8+4 = 12 bytes), db_get_record_size() returns 16, but db_get_record() only reads 12 bytes, leaving the last 4 bytes of the data buffer with their previous values.
this is because db_get_record_size() pads the record size to the data alignment.
I think this is a mistake and the extra padding should be removed. (I will look at all users of db_get_record_size() to check that this is safe.)
(specifically, this causes problems in the history system).
K.O.
Comments (10)
-
reporter If db_get_record_size() adds extra padding to the record size to account for the C++ compiler padding the data structure (i.e. sizeof(struct { double; int; }) is 16, not 12), then the padding should come from ss_get_struct_padding(), not ss_get_struct_align().
The test case for this:
struct rec { double d; int i; };  // sizeof could be 12 or 16
int size = sizeof(struct rec);    // could be 12 or 16
db_get_record(&rec, &size);       // will read 12 bytes from ODB and bomb
                                  // if we gave it the wrong value of "size"
K.O.
-
reporter One more thing to check: does db_get_record() zero out the pad bytes? I think we should zero them out - both the pad bytes between data items and the pad bytes at the end of the structure. K.O.
-
On “normal” 64-bit systems (and even on most 32-bit systems), sizeof(rec) is 16, and db_get_record_size(rec) correctly returns 16.
I didn’t bother zeroing out the padded bytes, since a normal user will never see them, but if we like we can simply do a
memset(rec, 0, size);
at the beginning of db_get_record().
-
reporter Re: zero-out of padding bytes. I think it only matters when we compare data for equality (without zeroing the padding, db_get_record() will return different byte sequences for the same ODB record), and when we write records to disk it is good to have the pad bytes set to zero (or to some marker, like 0xFF) instead of changing randomly.
I do not know if the overhead of memset() is significant in the places where we use db_get_record(). In the history system we are making syscalls, writing to disk, and communicating with an SQL database, all expensive operations, so the overhead of memset() is probably negligible.
K.O.
-
reporter For posterity, explanation of padding from Stefan:
Hi Konstantin, yes, it all makes sense. Padding was always a nightmare. In the old days (the ’90s) there was no #pragma pack(1), so each compiler packed structures differently. Now, I wanted to match ODB trees to structures. This means that db_get_record() has to pull all values and push them into a structure AT THE RIGHT place. If you have

struct { char c; int i; } s;

then I cannot simply write the value of “i” one byte after the structure starts, but have to obey the padding. In fact, most likely “i” starts four bytes after the beginning of the structure, so I need

char *p = (char *)&s; *((int *)(p+4)) = value;

If the compiler decided to do the packing differently, like with pack(1), I would instead need

char *p = (char *)&s; *((int *)(p+1)) = value;

Since one could not easily figure out (at least in the ’90s) what the compiler decides, I invented ss_get_struct_align(), which “measures” the structure alignment, and ss_get_struct_padding(), which measures the padding using two test structures. The padding value shows how many padding bytes are put behind a one-byte variable at the end of a structure.

Now, db_get_record_size() uses the current alignment to determine the proper structure size. Then it adds potential padding in odb.cxx:11164:

*buf_size = VALIGN(*buf_size, max_align);

As you can see, this code, actually from 1998, is rather tricky. The max_align takes into account the largest structure member, which defines the amount of padding. This code is more than 20 years old, and I cannot guarantee that it will work today without any flaw. Indeed, the best would be to just remove db_get_record() and do everything with the odb++ API, but I guess we are not there yet.

Looking at db_get_record(), I see that the alignment is evaluated in the same way. Padding is not necessary there, since db_get_record() does not have to pad zeros at the end of a structure. But I do need the record size with padding: if someone does db_get_record(&s, sizeof(s)), then the db_get_record_size() derived from the ODB subtree has to match sizeof(s).

I don’t remember now how db_get_record() is used in the history system, but I hope that with my explanations above you get a better idea of what’s going on there. Yes, a per-variable history would solve all this, but I have no idea what side effects it would have. I guess we cannot simply remove the per-event history, since many experiments still use it and want to read back their old data for the next ten years or so.

Stefan
-
I believe any memset() overhead is small compared with pulling the variables out of the ODB. So shall I put it in?
-
reporter Users of db_get_record() and db_get_record_size():
- lazylogger (settings, statistics)
- mhttpd (“custom gif” and sequencer code, to be removed)
- mlogger (history, channel settings and statistics)
- msequencer (state)
- alarm.cxx (alarm record, alarm class, program info)
- mana.cxx
- mfe.cxx (equipment common)
K.O.
-
reporter Added memset() to zero. I think there is nothing more to do here. Closing this bug. K.O.
-
reporter - changed status to resolved
leaving everything as-is. K.O.
-
More. db_get_record(&size) returns in "size" not the number of bytes read from the record (as computed by db_recurse_record()), but the result of db_get_record_size(). Per this bug report, the two are not the same.
Note: if db_get_record() is called on a simple variable (an array), it does return the number of bytes read from ODB, and there is a check in the code to verify that both numbers are the same.
K.O.