Add the ability to load data from a list instead of a matrix
USE CASE: WHAT DO YOU WANT TO DO?
Provide a more flexible experience for the user when loading a custom dataset
STEPS TO REPRODUCE AN ISSUE (OR TRIGGER A NEW FEATURE)
N/A
CURRENT BEHAVIOR
Currently, Treeview can only load matrices. One of the test users asked me if it is possible to load a dataset in a list format (label A - label B - value).
EXPECTED BEHAVIOR
Treeview should be able to automatically recognize the format (just as it recognizes the number of row/col headers now) and then present a confirmation window to the user to make sure the parsing was done successfully.
DEVELOPERS ONLY SECTION
SUGGESTED CHANGE (Pseudocode optional)
e.g. Add a color selection class
FILES AFFECTED (where the changes will be implemented) - developers only
e.g. selectColor.java & settingsPanel.java
LEVEL OF EFFORT - developers only
trivial/minor/medium/major/overhaul (choose one)
COMMENTS
Comments (23)
-
-
Aha, I figured it out. It's the case where there are no headers. If you attempt to load a list that has no headers, you get this error:
and the data will not load.
-
-
I was talking to Lance about this and he says that there exists a "list file format" that has 3 columns consisting of row label, column label, and value. Of course, labels would be highly repetitive in this format. Now that I think about it, I believe I have encountered something like this in the past.
Is this what you meant?
-
reporter Yeah, that's exactly the format I had in mind.
-
I am working on this issue and have one question: will the first line of a list-format file contain headers? If so, could you give an example?
-
reporter -
assigned issue to
-
assigned issue to
-
reporter @srikanthbezawada -- I think the rules should be:
- assume by default that the first row contains column headers
- in the preview, the user will adjust this setting; if the user deselects the first row as the header, auto-generate a header (e.g., "ROW_LABELS" and "COLUMN_LABELS")
- include a checkbox to enable the user to make the matrix symmetric (e.g., if only the upper triangle of an adjacency matrix is loaded, fill in the rest of the matrix with the same values)
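The first two rules above could be sketched roughly as follows. This is only an illustration: the class `ListHeaderSketch` and its methods are hypothetical and not part of the TreeView3 code base, and it assumes each line has already been split into at least two fields (row label, column label, value).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not an existing TreeView3 class. */
final class ListHeaderSketch {
    // Placeholder header names taken from the suggestion above.
    static final String DEFAULT_ROW_HEADER = "ROW_LABELS";
    static final String DEFAULT_COLUMN_HEADER = "COLUMN_LABELS";

    /**
     * Returns {rowHeader, columnHeader}. If the first line is treated as a
     * header, its first two fields are used; otherwise the placeholders are.
     */
    static String[] headers(List<String[]> lines, boolean firstRowIsHeader) {
        if (firstRowIsHeader && !lines.isEmpty()) {
            String[] first = lines.get(0);
            return new String[] {first[0], first[1]};
        }
        return new String[] {DEFAULT_ROW_HEADER, DEFAULT_COLUMN_HEADER};
    }

    /** Returns the data records, skipping the first line when it is a header. */
    static List<String[]> records(List<String[]> lines, boolean firstRowIsHeader) {
        int start = (firstRowIsHeader && !lines.isEmpty()) ? 1 : 0;
        return new ArrayList<>(lines.subList(start, lines.size()));
    }
}
```

The `firstRowIsHeader` flag would default to true and be toggled from the preview, so deselecting the header turns the first line back into an ordinary data record.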
-
I think if there are only 3 columns, it's a good bet to default to list format parsing, but we might want some manual way to set the format type. Also note that each row has a single value for a single cell, thus missing cells need to be accounted for and treated as empty. There's no way to know what the final dimensions of the matrix will be in this format, so you will have to look for the longest row and longest column.
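A rough sketch of what this could look like, under some assumptions: fields are tab-separated, the header line (if any) has already been removed before building the matrix, and the final dimensions are taken to be the number of distinct row labels and distinct column labels. `ListToMatrixSketch` and its methods are hypothetical names, not existing TreeView3 code. Missing cells are left as NaN, and a repeated pair simply overwrites the earlier value in this sketch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration; not an existing TreeView3 class. */
final class ListToMatrixSketch {
    /** Heuristic: every non-blank line has exactly three tab-separated fields. */
    static boolean looksLikeList(List<String> lines) {
        boolean sawData = false;
        for (String line : lines) {
            if (line.trim().isEmpty()) continue;          // ignore blank lines
            if (line.split("\t", -1).length != 3) return false;
            sawData = true;
        }
        return sawData;
    }

    /** Gives each distinct label the next free index, in order of first appearance. */
    static int indexOf(Map<String, Integer> index, String label) {
        Integer i = index.get(label);
        if (i == null) {
            i = index.size();
            index.put(label, i);
        }
        return i;
    }

    /**
     * Builds a dense matrix from {rowLabel, columnLabel, value} records.
     * The two maps are filled with the label-to-index assignments.
     */
    static double[][] toMatrix(List<String[]> records,
                               Map<String, Integer> rows,
                               Map<String, Integer> cols) {
        // First pass: collect labels so the dimensions are known.
        for (String[] r : records) {
            indexOf(rows, r[0]);
            indexOf(cols, r[1]);
        }
        double[][] m = new double[rows.size()][cols.size()];
        for (double[] row : m) Arrays.fill(row, Double.NaN); // missing cells
        // Second pass: place each value in its cell.
        for (String[] r : records) {
            m[rows.get(r[0])][cols.get(r[1])] = Double.parseDouble(r[2]);
        }
        return m;
    }
}
```

Passing `LinkedHashMap` instances for `rows` and `cols` keeps the labels in file order, which is handy when writing them back out as row and column labels.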
-
- marked as enhancement
-
- changed component to Import/export
-
- removed milestone
Removing milestone: Import/export data (automated comment)
-
- changed milestone to 01
-
- changed milestone to Faizaan/Srikanth - 01
-
- changed milestone to F/S - 01
-
@abarysh, thanks for the pointers.
I have one question, described below.
- Consider the following example.
a b 3
c d 5
If the user selects the symmetric checkbox,
b a 3
d c 5
are also implicitly added to the view.
- Consider the following list example.
a b 3
b a 2
c d 5
If the user selects the symmetric checkbox, how should this case be handled? (a-b and b-a have different values.)
-
reporter That's a great question. I think the most reasonable thing to do (without adding extra interface elements) is to average the ab and ba values. The checkbox that the user checks should say "Make symmetric" (rather than just "Symmetric" -- this way we emphasize that it is an action). If there's space next to the checkbox, we can also add a note "Values for reciprocal pairs A-B and B-A will be averaged". If there's no space, we might need to show it as a pop-up warning when the user clicks on the checkbox.
-
@abarysh I am going with averaging the reciprocal pairs.
Are there any test files in this format? Also, can you please explain how the headers of list data are used after loading into TreeView3?
-
reporter There are no test files for this format yet, but you can create your own for the time being, and I can make another one when I test the feature. The headers of the first 2 columns in the data file should be used as row/column labels. For example:
Gene Condition Data
abd1 drugA 0.5
... ... ...
Gene and Condition will be row and column labels, respectively. In this particular case, averaging doesn't make sense (because rows and columns are different things). In general, I think the "Make symmetric" option should be unchecked by default.
-
The following changes have been implemented.
If a given pair of labels has multiple values, all the values are averaged. For example,
a b 3
a b 2
The value of a-b comes out to be 2.5.
Another example,
a b 3
a b 2
b a 5
If the user selected symmetric, the values of a-b and b-a both come out to 3.33.
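For reference, here is a minimal sketch of averaging that reproduces the numbers above. It is not the committed code: `PairAverageSketch` and its methods are hypothetical names, and it assumes labels never contain a tab character, since a tab is used to join the two labels into a map key. Note that 3.33 is the pooled mean (3 + 2 + 5) / 3; averaging the a-b mean of 2.5 with the b-a value of 5 would give 3.75 instead.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration; not the code committed for this issue. */
final class PairAverageSketch {
    /** Map key for a label pair; in symmetric mode a-b and b-a share one key. */
    static String key(String row, String col, boolean symmetric) {
        if (symmetric && row.compareTo(col) > 0) {
            String tmp = row;
            row = col;
            col = tmp;
        }
        return row + "\t" + col;
    }

    /** Pools all values recorded for each pair and returns their mean. */
    static Map<String, Double> average(List<String[]> records, boolean symmetric) {
        Map<String, double[]> acc = new HashMap<>();      // key -> {sum, count}
        for (String[] r : records) {
            double[] sumCount =
                acc.computeIfAbsent(key(r[0], r[1], symmetric), k -> new double[2]);
            sumCount[0] += Double.parseDouble(r[2]);
            sumCount[1] += 1;
        }
        Map<String, Double> means = new HashMap<>();
        for (Map.Entry<String, double[]> e : acc.entrySet()) {
            means.put(e.getKey(), e.getValue()[0] / e.getValue()[1]);
        }
        return means;
    }

    /** Looks up the averaged value for a cell, or NaN if the pair never occurred. */
    static double lookup(Map<String, Double> means, String row, String col,
                         boolean symmetric) {
        Double v = means.get(key(row, col, symmetric));
        return v == null ? Double.NaN : v;
    }
}
```

With the records `a b 3`, `a b 2`, `b a 5`, calling `average(records, true)` yields about 3.33 for both `lookup(..., "a", "b", true)` and `lookup(..., "b", "a", true)`, while `average(records, false)` keeps a-b at 2.5 and b-a at 5.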
-
Would this also come out to 3.33 when not symmetric?
a b 3
a b 2
a b 5
-
Yes Rob, all the values are averaged, so it comes out to 3.33.
-
- changed version to beta2
Can you enter a procedure that shows that list data cannot be entered? I was able to successfully load list data using this file:
https://bitbucket.org/TreeView3Dev/treeview3/downloads/mylabeltest5.txt
The data is not real; the values are just sequential numbers. The default data detection seemed to sort of work, and I could change it:
Here's what it looks like when it's loaded and the interface appears to be fully functional:
Actually, the first time I tried to open it, the "Choose a file" dialog kept popping up after loading, but I couldn't make that happen again after the first time and I don't know which jar I was running when I tested it. There could be a real problem there, but without the ability to reproduce it, I just have to write it off as a fluke.