Issue #5 new

Core dump arising from svmlight.so

Jagat Sastry
created an issue

I've written a simple classifier program in Python that uses nltk and PySVMLight. On execution, it runs to the end and then dumps core. The error appears to be somewhere in svmlight.c (the free_model function?).

Here is the whole dump and execution sequence: http://pastebin.com/b3RFp4tH

Please let me know if you need more information to resolve this issue, or if I need to file the bug somewhere else.

Comments (5)

  1. Jagat Sastry reporter

    OK, so I ran the program as a superuser and it didn't crash. So I suspect the crash could be caused by the large number of features (5654) I used in the program.

    Edit1: The crash has started appearing again, even when run as superuser. I noticed that all the compilations (by setup.py) are done using the -g flag, yet gdb isn't able to pinpoint the problem.

    Edit2: There's most probably a bad free() going on. I added an early return at the beginning of free_model_and_docs and free_just_model (in svmlight.c) and the error no longer occurs. That's after I found that my attempt to add NULL checks and to assign NULL to pointers after freeing them (in svm_common.c) was of no use. This needs to be investigated.

    Edit3: Nope, even that doesn't work. Enough.

    Edit4: I shifted my focus from the svmlight code to my own code. It looks like PySVMLight itself has some scalability issues: the crash can't be reproduced when the number of feature vectors used for training and classifying is reduced. Perhaps it's because of the number of threads created as a result? I then thought of pickling the model (returned by svm_learn) after training and using it separately for classifying later. However, it turns out I can't pickle it; the error is "Can't pickle 'PyCObject' object". I will try using write_model and read_model and find out if that helps.

    Final update: For my purposes, I'm generating the feature vectors myself and using svmlight directly (not using PySVMLight). I will try to find out what the deal with PySVMLight is in my free time.
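
    For anyone taking the same route, here is a minimal sketch of writing feature vectors in SVMlight's sparse text format (one example per line: a label followed by index:value pairs with increasing, 1-based indices). The function names and the dict-based input are assumptions made for illustration; they are not part of pysvmlight or of the reporter's program.

```python
def to_svmlight_line(label, features):
    """Format one example as '<label> <index>:<value> ...'.

    `features` is a dict mapping 1-based integer feature indices to
    numeric values (a hypothetical input shape chosen for this sketch).
    """
    parts = ["%d" % label]
    # SVMlight expects feature indices in increasing order.
    for index in sorted(features):
        if index < 1:
            raise ValueError("feature indices must start at 1")
        value = features[index]
        if value != 0:  # the format is sparse, so zero values are skipped
            parts.append("%d:%.8g" % (index, value))
    return " ".join(parts)


def write_svmlight_file(path, examples):
    """Write an iterable of (label, features) pairs, one example per line."""
    with open(path, "w") as handle:
        for label, features in examples:
            handle.write(to_svmlight_line(label, features) + "\n")
```

    A file written this way can be passed to the stock command-line tools, for example `svm_learn train.dat model` followed by `svm_classify test.dat model predictions`.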

  2. Bill Cauchois repo owner

    Heya, have you tried just commenting out the bodies of free_model_and_docs and free_just_model? That would tell us whether the issue is with that specifically.

    Also, would be nice to know how other people can repro this. Can anyone who's experienced this issue attach some training data to the issue?

    Vasily, were you dealing with a large dataset as well?
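
    If real training data can't be shared, a synthetic set of similar size might be enough to attempt a repro. The sketch below is only an assumption about what such a script could look like: the helper name and its parameters are made up, and the output shape, a list of (label, [(feature_index, value), ...]) tuples, is assumed to match what pysvmlight accepts. Only the feature count of 5654 comes from this report.

```python
import random


def make_synthetic_data(num_examples, num_features, active_per_example=20, seed=0):
    """Build random sparse examples as (label, [(feature_index, value), ...]).

    Hypothetical helper for repro purposes; the tuple layout is an assumption
    about pysvmlight's input format and should be checked against its README.
    """
    rng = random.Random(seed)  # fixed seed so every run produces the same data
    active = min(active_per_example, num_features)
    data = []
    for i in range(num_examples):
        label = 1 if i % 2 == 0 else -1  # alternate classes for a balanced set
        indices = sorted(rng.sample(range(1, num_features + 1), active))
        features = [(index, rng.random()) for index in indices]
        data.append((label, features))
    return data


# 5654 is the feature count mentioned in the report; the example count is arbitrary.
training_data = make_synthetic_data(num_examples=2000, num_features=5654)
```

    Growing num_examples step by step while feeding training_data to the library should show whether the crash depends on the size of the data set.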
