inconsistency between db_get_record_size() and db_get_record()
for an ODB record with a double and an int (total data size 8+4 = 12 bytes), db_get_record_size() returns 16, but db_get_record() only reads 12 bytes, leaving the last 4 bytes of the data buffer with their previous values.
this is because db_get_record_size() pads the record size to the data alignment.
I think this is a mistake and the extra padding should be removed. (I will look at all users of db_get_record_size() to check that this is safe.)
(specifically, this causes problems in the history system).
K.O.
Comments (10)
-
reporter If db_get_record_size() adds extra padding to the record size to account for the C++ compiler padding the data structure (i.e. sizeof(struct { double; int; }) is 16, not 12), then the padding should come from ss_get_struct_padding(), not ss_get_struct_align().
The test case for this:
struct rec { double d; int i; };  // sizeof could be 12 or 16
int size = sizeof(struct rec);    // could be 12 or 16
db_get_record(&rec, &size);       // will read 12 bytes from ODB and bomb
                                  // if we gave it the wrong value of "size"
K.O.
-
reporter One more thing to check: does db_get_record() zero out the pad bytes? I think we should zero them out - both the pad bytes between data items and the pad bytes at the end of the structure. K.O.
-
On “normal” 64-bit systems (and even on most 32-bit systems), sizeof(rec) is 16, and db_get_record_size(rec) correctly returns 16.
I didn’t bother zeroing out the padded bytes, since a normal user will never see them, but if we like we can simply do a
memset(rec, 0, size);
at the beginning of db_get_record().
-
reporter Re: zero-out of padding bytes. I think it only matters when we compare data for equality (without zeroing the padding, db_get_record() will return different byte sequences for the same ODB record), and when we write records to disk it is good to have the pad bytes set to zero (or to some marker, like 0xFF) instead of changing randomly.
I do not know if the overhead of memset() is significant in the places where we use db_get_record(). In the history system we are making syscalls, writing to disk, and communicating with an SQL database, all expensive operations, so the overhead of memset() is probably negligible.
K.O.
-
reporter For posterity, explanation of padding from Stefan:
Hi Konstantin, yes, it all makes sense. Padding was always a nightmare. In the old days (the ’90s) there was no #pragma pack(1), so each compiler packed structures differently. Now, I wanted to match ODB trees to structures. This means that db_get_record() has to pull all values and push them into a structure AT THE RIGHT place. If you have

struct { char c; int i; } s;

then I cannot simply write the value of “i” one byte after the structure starts, but have to obey the padding. In fact, most likely “i” starts four bytes after the beginning of the structure, so I need

char *p = (char *)&s; *((int *)(p+4)) = value;

If the compiler decided to do the packing differently, like with pack(1), I would instead need

char *p = (char *)&s; *((int *)(p+1)) = value;

Since one could not easily figure out (at least in the ’90s) what the compiler decides, I invented ss_get_struct_align(), which “measures” the structure alignment, and ss_get_struct_padding(), which measures the padding using two test structures. The padding value shows how many padding bytes are put behind a one-byte variable at the end of a structure.

Now, db_get_record_size() uses the current alignment to determine the proper structure size. Then it adds potential padding in odb.cxx:11164:

*buf_size = VALIGN(*buf_size, max_align);

As you can see, this code, actually from 1998, is rather tricky. The max_align takes into account the largest structure member, which defines the amount of padding. This code is more than 20 years old, and I cannot guarantee that it will work today without any flaw. Indeed, the best would be to just remove db_get_record() and do everything with the odb++ API, but I guess we are not there yet.

Looking at db_get_record(), I see that the alignment is evaluated in the same way. Padding is not necessary there, since db_get_record() does not have to pad zeros at the end of a structure. But I do need the record size with padding: if someone does db_get_record(&s, sizeof(s)), then the db_get_record_size() derived from the ODB subtree has to match sizeof(s).

I don’t remember now how db_get_record() is used in the history system, but I hope that with my explanations above you get a better idea of what’s going on there. Yes, a per-variable history would solve all this, but I have no idea what side effects it would have. I guess we cannot simply remove the per-event history, since many experiments still use it and want to read back their old data for the next ten years or so.

Stefan
-
I believe any memset() overhead is small compared with pulling the variables out of the ODB. So shall I put it in?
-
reporter Users of db_get_record() and db_get_record_size():
- lazylogger (settings, statistics)
- mhttpd (“custom gif” and sequencer code, to be removed)
- mlogger (history, channel settings and statistics)
- msequencer (state)
- alarm.cxx (alarm record, alarm class, program info)
- mana.cxx
- mfe.cxx (equipment common)
K.O.
-
reporter Added memset() to zero. I think there is nothing more to do here. Closing this bug. K.O.
-
reporter - changed status to resolved
leaving everything as-is. K.O.
-
More. db_get_record(&size) returns in "size" not the number of bytes read from the record (as computed by db_recurse_record()), but the result of db_get_record_size(). Per this bug report, the two are not the same.
Note: if db_get_record() is called on a simple variable (an array), it does return the number of bytes read from ODB, and there is a check in the code to verify that both numbers are the same.
K.O.