Issue #138 (duplicate)

Segfault: SELECT COUNT(*) FROM t GROUP BY x

Martin Velez
created an issue

I have a table t with two integer columns (x, y) that holds many-to-many relationships.

It contains 43,631,753 (x, y) records.

|Domain of x| = 1,630,598; |Domain of y| = 2,899,261

My program crashes when I run this statement:

    counts = conn.exec("SELECT COUNT(*) FROM t GROUP BY x").column_values(0)

I have an identical database in SQLite3, and the same query does not crash there.

Trace attached.

Comments

  1. Martin Velez reporter

    I get the same seg fault if I change the statement to:

    counts = conn.exec("SELECT COUNT(*) FROM t GROUP BY x").values
    

    And:

    res = conn.exec("SELECT COUNT(*) FROM t GROUP BY x") # OKAY
    counts = res.values
    

    But this works!

    counts = []
    res = conn.exec("SELECT COUNT(*) FROM t GROUP BY x") # OKAY
    res.each do |row|
        counts << row[0]
    end
    
  2. Martin Velez reporter

    I initially reported that the column_values method was triggering this bug. Then I showed that the values method also triggered it.

    I browsed the column_values C extension function and it doesn't seem to share code with the values function.

    Can you confirm that the bug affects both #values and #column_values?

  3. Michael Granger repo owner

    Oops, you're right. I missed #column_values. Good, maybe it's something they have in common? What's the size of the result set when it segfaults, i.e., how many rows?

  4. Lars Kanis

    Yes, both methods share the same root cause: a classic stack overflow caused by the line "VALUE rows[ num_rows ];".

    The stack is typically limited to 8 MB; you can use "ulimit -Ss 81920" to raise it to 80 MB. We need to use heap memory instead of variable-length arrays. The variable-length array was introduced with the performance improvements in 6cea3cea3b2b.
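
    A minimal sketch of the fix pattern (illustrative only, with a hypothetical function name, not the gem's actual code): replace the stack-based variable-length array with a heap buffer obtained via Ruby's ALLOC_N and released with xfree.

    #include <ruby.h>

    /* Illustrative sketch, not pg's actual code. */
    static VALUE
    collect_rows(long num_rows)
    {
        /* Unsafe: "VALUE rows[ num_rows ];" places 8 * num_rows bytes
         * on the stack and overflows the default 8 MB limit around one
         * million rows. */

        /* Safe: allocate the same buffer on the heap instead. */
        VALUE *rows = ALLOC_N(VALUE, num_rows);
        VALUE ary;
        long i;

        for (i = 0; i < num_rows; i++)
            rows[i] = Qnil; /* placeholder for each converted row */

        ary = rb_ary_new4(num_rows, rows); /* copy into a Ruby array */
        xfree(rows);
        return ary;
    }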

    I can reproduce the issue with:

    require 'pg'
    c = PG::Connection.new(dbname: 'mydb')
    # 2,000,000 rows * 8 bytes per VALUE is ~16 MB, double the default 8 MB stack
    p c.exec('select generate_series(0,1999999)').values.last
    
  5. Michael Granger repo owner

    Nice call, Lars.

    Do you think 4d5e457062e8 is an okay fix, or should we build the row the same way? I suspect result sets in which a single row overflows the stack are much rarer, but maybe better safe than sorry?

  6. Lars Kanis

    Hi Michael,

    Really fast fix! But "VALUE new_row[ num_fields ]" isn't safe either, since the PostgreSQL limit of 8192 columns was removed... so pgresult_fields() and make_column_result_array() may need to use the heap, too.

  7. Lars Kanis

    Oops, I was wrong: http://www.postgresql.org/about/ states "Maximum Columns per Table: 250-1600, depending on column types". So with VALUE taking 8 bytes, we will use at most 12,800 bytes of stack, which is far below the stack limit. Even if a query result can have somewhat more columns than a table, there should be enough space left.

  8. Michael Granger repo owner

    Oh well. It can't hurt to make it consistent everywhere. I was in the middle of building a 35M-row table to try to track this down when you replied, so I'll test whether #values and #column_values trigger the problem with yesterday's code, then make sure the tip fixes them both.

  9. Michael Granger repo owner

    I've just pushed pg-0.14.1.pre.363.gem, which should fix all the known segfaults. Please test it in your applications and let me know if you still see the problem.
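
    (A prerelease like this can typically be installed with "gem install pg --pre" if it was pushed to rubygems.org, or from the downloaded file with "gem install ./pg-0.14.1.pre.363.gem".)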
