Allow a variable number of column/row headers in input file

Issue #189 closed
Robert Leach created an issue

John Matese was unable to load a number of TreeView files that load fine in TreeView 2.0. The cause was the number of header columns/rows. We should either infer the header rows/cols implicitly or present the user with an interface similar to Excel's text file import, where the user specifies where the data starts.

I searched for an existing issue covering this, but could not find one. If this is a duplicate, please mark it as such.

Comments (10)

  1. Robert Leach reporter

    @TreeView3Dev and I talked about this over chat. Documenting that convo in a summary: I haven't debugged the issue, but if I remember correctly, I think it could be a problem when the number of col label rows and row label cols differs. It's good to know that it should allow variable numbers of label rows/cols already. Should make the fix simple.

  2. Robert Leach reporter

    John reported that the error he was getting was:

    No data matrix could be set.

    I'll upload and link the files he got the error from...

  3. Robert Leach reporter

    There's also an error that is displayed in the main window after the window containing the error I reported above is dismissed:

    Oh oh! Looks like we ran into the following issue: Data in file unusable.

    He tried files with these extensions:

    .pcl (containing header columns: UID, NAME, & GWEIGHT and header rows beginning with: UID & EWEIGHT)
    .cdt (containing header columns: GID, UID, NAME, & GWEIGHT and a single header row beginning with: GID)
    .gtr (file containing the tree to be displayed; not opened directly)

    His NAME column has very long strings such as: "S phase ZNF217 zinc finger protein 217 Hs.155040 R81830 NM_006526 obsolete| transcription| | | | | | | | | | | | "

    I was able to open both the .cdt and .pcl files in TreeView 2, but got the aforementioned errors when attempting to open them in TreeView 3.

    Rob

  4. Robert Leach reporter

    I see the problem with the load. John had numeric labels, so once the loader got past GID/UID/NAME/GWEIGHT, it treated those labels as the start of the data. I changed the code to not load any data from a line where one of the strings GID/GWEIGHT/etc. is encountered, and his .cdt file then loaded. Is that technically correct? Can there ever be data on the same line as (e.g.) GID or GWEIGHT?

    After getting it to load, I encountered a weird issue that may be separate from this issue. I will create a separate ticket for it...

  5. Robert Leach reporter

    Numeric array IDs being identified as the start of the data was the problem. Resolved by preventing lines containing GID/GWEIGHT/NAME/etc from being parsed for data.
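
The resolution above can be illustrated with a minimal sketch. This is not the TreeView 3 source: the function name `find_data_start`, the `RESERVED_LABELS` set (limited to the labels named in this thread), and the rule that the first line without a reserved label is the first data line are all assumptions made for illustration.

```python
# Hypothetical sketch of the fix described above; not the actual TreeView 3 code.
# Assumption: the reserved labels are only the ones named in this thread.
RESERVED_LABELS = {"GID", "UID", "NAME", "GWEIGHT", "EWEIGHT"}


def find_data_start(rows):
    """Return (row_index, col_index) of the first data cell, or None.

    rows is a list of lines, each already split into a list of cell strings.
    """
    n_label_cols = 0
    for r, row in enumerate(rows):
        # Positions of cells on this line that hold a reserved label.
        hits = [i for i, cell in enumerate(row)
                if cell.strip().upper() in RESERVED_LABELS]
        if hits:
            # A line containing a reserved label is never parsed for data,
            # even if its other cells (e.g. numeric array IDs) look numeric.
            n_label_cols = max(n_label_cols, hits[-1] + 1)
            continue
        # Assumption: the first line without a reserved label starts the data,
        # and the data begins right after the label columns seen so far.
        return r, n_label_cols
    return None


# A .pcl-like layout with numeric array IDs in the first header line.
pcl_rows = [
    ["UID", "NAME", "GWEIGHT", "1", "2"],
    ["EWEIGHT", "", "", "1", "1"],
    ["g1", "gene one", "1", "0.5", "-0.2"],
]
data_start = find_data_start(pcl_rows)
```

With this input, the numeric array IDs `1` and `2` on the first line are ignored because that line also contains `UID`, so the data is located on the third line after the three label columns. A gene whose label happens to equal a reserved word would defeat this rule, which is one reason it is only a sketch.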
