1. Michael Granger
  2. ruby-pg
  3. Issues


Issue #47 resolved

There might be hidden GC problem in ext/pg.c

Anonymous created an issue

In totally different project (rocaml) I tracked down and identified (and fixed) this GC bug relating to rb_ary_new2. Then I went looking similar issues in the net and found this http://redmine.ruby-lang.org/issues/show/3174 which relates to pg and segfault.

This is kind of bug, which only happens when GC is triggered while code is executing in C code. And rb_ary_new2 has already created. And initialization of it is not complete yet! And code is using rb_newobj (maybe thru some non-visible way; like rb_tainted_str_new or rb_tainted_str_new2 and ending up in str_alloc/NEWOBJ).

So, in normal "toy" examples, you cannot reproduce it, because GC is not hit in trivial, short toy examples. You need memory shortage and big return array (created in foreign code side, not Ruby side; that means long row in ruby-pg case) to trigger this one.

I have identified two possible places where this could happen in pg.c. Search for rb_ary_new2 there. Ignore ones with rb_ary_new2(0) calls. You should find make_column_result_array and pgresult_fields methods (lines 3593 and 3652). In those methods, array is created and code starts to fill it. Problem is, code uses method which end up calling NEWOBJ macro in Ruby code, which might GC. And in that case, GC will find uninitialized memory pointer, tries to follow it and then it segfaults. Oops.

How to fix it? Simply initializing all those array fields before calling any rb__new methods. Qnil is enough. Something like following code (adjust to match your coding standards):

{{{ ary = rb_ary_new2(n); for(i = 0; i < n; i++) { RARRAY(ary)->ptr[i] = Qnil; } }}}

At least this worked with rocaml project and stopped it segfaulting while passing large arrays from OCaml to Ruby via rocaml generated C code.

Use your own judgement to decide if this actually affects ruby-pg code. And please note, I'm not expect on Ruby GC. This report might be total bs.


Comments (4)

  1. Anonymous

    Perhaps rb_ary_new2() is calling malloc() for RARRAY(ary)->ptr. malloc() does not bzero() the buffer on most platforms. I'd say it would be a bug in MRI rb_ary_new2(). When you call Array.new(100) you get an Array with 100 nils.

  2. Michael Granger repo owner

    Avoid use of uninitialized Arrays (fixes #47).

    Instead of declaring an Array at the beginning of fetching results and pushing values onto it to return at the end (which could result in a segfault if the GC runs during a NEWOBJ), collect an array of VALUEs on the stack and create the Array on return with rb_ary_new4().


  3. Log in to comment