PDF Export Requirements Gathering

Issue #292 closed
Robert Leach created an issue

I have implemented a first draft of the PDF export requirements. Please comment so we can refine it. @TreeView3Dev @abarysh @lance_parsons

These requirements apply to issue #23. This issue could be resolved by compiling the agreed consensus requirements from the proposed list below. In the absence of that, it should be resolved automatically once issue #23 is resolved.

TreeView PDF Export Requirements

Basic Requirements

  1. Accessible via menu: File->"Export to PDF..."
  2. There should be an export dialog window
  3. Default PDF composition:
    1. Full matrix with all data in a 1:1 aspect ratio
    2. Data is shrunk proportionally until neither edge extends beyond the area of the page reserved for the data.
    3. Trees (if present) that run up to within a small margin of the data
    4. Optional labels (if the row/col size can accommodate them)
    5. File name header
    6. Footer with treeview link or reference
    7. Caption
  4. The height/width of the data plus trees (and/or labels) will be centered on the page (so if the height or width is narrow, the content will not appear shifted to one side, or create a gap anywhere on the page other than to the page edge)
  5. The exported data will be clustered or unclustered, depending on the in-app view
  6. Colors:
    1. Trees: black
    2. Data: as set by user
    3. labels: black
    4. caption: black
  7. Label font is as-set by user in-app
  8. Header font will be helvetica, 14 point
  9. Footer font will be helvetica, 10 point
  10. Header will be bold
  11. Footer will be plain style (i.e. not bold)
  12. Long header will be wrapped
  13. Label justification is as-set by user in-app
  14. Labels that are too long to fit will run off the non-justified edge
  15. The label type used will be whatever is shown in the app
  16. No borders around any elements
  17. No spaces other than margins between data, trees, & header
  18. Export dialog should include a preview of the page layout but:
    1. Actual contents may be represented by a stand-in image
  19. Options in the export dialog:
    1. Page dimensions menu: 8.5x11 (portrait: default), 11x8.5 (landscape), custom (opens a separate dialog)
    2. Cell Aspect ratio menu: 1:1 (default), as seen on screen, fit to page
    3. Include row labels checkbox (grayed out when impossible) (default: when possible)
    4. Include column labels checkbox (grayed out when impossible) (default: when possible)
    5. Include row trees checkbox (grayed out when none exist) (default: when exist)
    6. Include column trees checkbox (grayed out when none exist) (default: when exist)
    7. Left margin text box
    8. Right margin text box
    9. Top margin text box
    10. Bottom margin text box
    11. Caption text box
    12. Caption font menu
    13. Caption font size text box
  20. Selections in the export dialog will live-update the preview

Extra (Optional) Requirements

  21. Show selection box(es)
  22. Allow user to click & drag sides & corners of data to essentially set aspect ratio & the space that the data takes up
  23. Allow selection to define an inset that is drawn as a pop-out zoom
  24. Allow user to edit labels (for length)
  25. Allow labels of a selection to be bolded
  26. Allow user to select specific labels to be drawn
  27. Allow user to label a region or regions defined by a label range and drawn as a region label with brackets
  28. Allow user to resize the export dialog and grow the preview with it
  29. Advanced button
  30. Include date in header checkbox
  31. Title text box
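
Basic requirements 3.2 and 4 (shrink the content proportionally, then center it) could be sketched roughly as follows. This is only an illustration: the class name `PageFit`, the method name, and the use of points as the unit are assumptions, not existing TreeView code.

```java
import java.awt.geom.Rectangle2D;

/** Hypothetical helper; names and units (points) are assumptions, not TreeView code. */
final class PageFit {
    private PageFit() {}

    /**
     * Shrinks a content box proportionally until it fits inside the area left
     * by the margins, then centers it in that area. Content that already fits
     * is not enlarged.
     */
    static Rectangle2D.Double fitAndCenter(double contentW, double contentH,
            double pageW, double pageH,
            double left, double right, double top, double bottom) {
        double availW = pageW - left - right;
        double availH = pageH - top - bottom;
        if (contentW <= 0 || contentH <= 0 || availW <= 0 || availH <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        // One scale factor for both axes keeps the aspect ratio; cap at 1 so we only shrink.
        double scale = Math.min(1.0, Math.min(availW / contentW, availH / contentH));
        double w = contentW * scale;
        double h = contentH * scale;
        // Split the leftover space evenly so the content is not shifted to one side.
        double x = left + (availW - w) / 2.0;
        double y = top + (availH - h) / 2.0;
        return new Rectangle2D.Double(x, y, w, h);
    }
}
```

For an 8.5x11 page (612x792 points) with 36-point margins, `PageFit.fitAndCenter(1080, 720, 612, 792, 36, 36, 36, 36)` halves the content so its width fills the printable area and centers it vertically.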

Here is a mock-up of what I would envision for the export dialog interface:

exportdialogmock1.png

Comments (72)

  1. Robert Leach reporter

    Note, the tree and label percentages are for the percentage of the "page" those things should take up - and it only applies to the limiting edge length - in this case, the column height.

    I also included a footer with a TreeView logo as a shameless plug. I didn't include a checkbox for it in the mock-up.

  2. Robert Leach reporter

    In the mock-up, I just realized that some fonts do not have an italic version or other styles, so we may have to disable some checkboxes when particular fonts are selected.

  3. Robert Leach reporter

    I just had a cool idea. We could provide an option to not include the logo in the lower right corner, but have it grayed out. Next to that, we could provide a hotlink that says "enable", which would bring up a dialog that asks the user to do something that we want them to do, like let us track their usage or sign up for our email list, or whatever. So basically, if they want an image without the treeview logo (which serves an advertising purpose for us), they would have to contribute in some way. We could even provide a list of options and automatically enable the checkbox if we detect that they've already done the requested action.

  4. Anastasia Baryshnikova

    I think this proposal is great. I would modify the following:

    - "Stretch to fill page" -> "Fill page" or "Fit to the page" (based on what I have seen in other software)
    - header/footer should be the same size (i.e., Helvetica 10 pt) and same style (not bold)

    I would also add:

    - Format: drop down menu to choose PDF, SVG, PNG, JPG, ... (the standard formats that we can export to).

    These things I think are unnecessary (at least, for the standard export form) but may be later included in the "Advanced" mode:

    - caption
    - % for trees and labels
    - points 22-28 (all cool ideas, but a little overkill maybe)

    In general, my expectation is that people will use Export to PDF (or JPG/PNG etc.) to get the visualization in a format that they can then modify and make into a figure for a manuscript or a presentation. That's why, for example, the caption is not necessary (both in the manuscript and the presentation, it would be replaced/rewritten multiple times).

  5. Robert Leach reporter

    Do those libraries you mentioned in #23 have their own interface? That would make the mockup useless.

  6. Christopher Keil repo owner

    I don't think they do. They likely provide some classes from which you can create objects that you can use a bunch of methods with specific parameters on.

    Our GUI can be used to collect the parameters and options and this data will then be passed to the library which would do its job. If they had a GUI they'd be a full fledged application, not just a library, no?

  7. Robert Leach reporter

    OK. Sounds good.

    BTW, different topic, but regarding the nature of this ticket... Normally, I wouldn't create an issue for requirements gathering or design like this issue is for, but we don't have a formalized process for either of those things (yet) and so I thought getting these requirements recorded was important enough to try and get something into the system about them, but I don't know how something like this gets "resolved". I suppose once a consensus is reached on the particulars, this issue is resolved and it's issue #23 that is resolved once it's all implemented. I will edit this ticket to more prominently associate it with issue #23, but I think in the future, we can probably put this kind of information in the change request form's expected behavior and/or suggested change sections.

    If someone were to take this ticket, I think all it would involve is striking the requirements we decide we don't want/need and confirming the ones we do. If we don't explicitly address that, then when issue #23 is resolved, we should just resolve this at the same time.

  8. Robert Leach reporter

    Should there be options to draw?:

    1. Current zoomed area (or entire matrix)
    2. Selection box(es)
    3. Overview (when printing a zoomed area)
    4. User specified row/column labels
  9. Anastasia Baryshnikova
    1. I think we should always export the current view (whatever is in the main panel + the global overview in the top left corner).
    2. Yes
    3. I think it should be included by default and no option to remove it. If someone doesn't like it, they can crop it out or delete it in Illustrator. But no need to add an extra option.
    4. Yes
  10. Robert Leach reporter

    Well, I've found a bunch of things to do the PDF (and other format) exports. I wanted to make sure that what I was finding could actually work, so I wrote some "quick" proof of concept code and managed to output a vector-based graphic in a PDF with minimal code changes.

    I tried a project called freehep. Freehep required maven to compile (I took some notes for that - I'm not sure what limitations that might add to our project), and I had to add these jar files to the build path:

    freehep_compile_additions.png

    The vector image it outputs would need some tweaking because there are gaps between the tiles, some tiles are different dimensions, and larger matrices end up getting truncated. However, the output is definitely vector, and that's the main hurdle. Everything else can be worked out. Here's an example PDF I generated:

    https://bitbucket.org/TreeView3Dev/treeview3/downloads/SamplePDFOutputFreehep.pdf

    Freehep may have some downsides. I think it's not actively maintained. I tried it first because people said it was easy and worked well. It wasn't quite that easy, but it was minimal code (other than the added jar files).

    The main addition was to DendroController:

    import java.awt.Dimension;
    import java.io.File;
    import java.util.Properties;
    import org.freehep.graphics2d.VectorGraphics;
    import org.freehep.graphicsio.pdf.PDFGraphics2D;
    import org.freehep.graphicsio.svg.SVGGraphics2D;
    
    ...
    
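    } else if (e.getSource() == dendroView.getExportButton()) {
        try {
            // Ask freehep for an A5 page.
            Properties p = new Properties();
            p.setProperty("PageSize", "A5");
            // Vector PDF target; 400x300 is the size of the drawing area.
            VectorGraphics g = new PDFGraphics2D(new File("Output.pdf"), new Dimension(400, 300));
            g.setProperties(p);
            g.startExport();
            // Paint the whole matrix straight into the PDF graphics context.
            getInteractiveMatrixView().exportPixels(g, 400, 300);
            g.endExport();
        } catch (Exception exc) {
            // Proof of concept only: just report the failure.
            exc.printStackTrace();
        }
    }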
    

    Then there was an extra public method in MatrixView to circumvent the bufferedImage stuff:

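        // Paints the full matrix directly onto g (e.g. a freehep PDF context),
        // bypassing the BufferedImage used for on-screen drawing.
        public void exportPixels(Graphics g, int w, int h) {
            // Destination: the w x h drawing area of the export.
            final Rectangle destRect = new Rectangle(0, 0, w, h);
            // Source: every column and row of the matrix.
            final Rectangle sourceRect = new Rectangle(0, 0,
                    xmap.getMaxIndex() + 1, ymap.getMaxIndex() + 1);
            if (drawer != null) {
                drawer.paint(g, sourceRect, destRect, null);
            }
        }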
    

    Other than the little button I added to trigger the export (for testing purposes), that was all the code changes.

    I guess the main point here is that exporting a PDF vector image appears to be fairly simple. I think we could put together something simple (interface-wise - not code-wise) and then improve it for a later version.

    Take a look @TreeView3Dev, @lance_parsons, & @abarysh.

  11. Robert Leach reporter

    Looks like drawRect does a much nicer job drawing the matrix in the PDF than fillRect, but there are still some weird edge issues.

  12. Anastasia Baryshnikova

    I checked the example you linked and it looks pretty good. If we think we can fix the general appearance (e.g., gaps between tiles), add the row/col labels and the selection box (if any), then I think this would be a very good start.

  13. Robert Leach reporter

    Yeah, that's what I meant when I said that the method "drawRect" does better than "fillRect": it doesn't leave the gaps. It has its own problems though, namely some weird colors between some tiles. I think that's not due to the method I used but rather to the rendering idiosyncrasies of the PDF reader, because it changes as you zoom in. Also, when you convert from PDF to PNG (the latest version I have), the image looks perfect:

    Output.png

  14. Robert Leach reporter

    I pushed a branch called "pdf_export_experiments" that contains a working copy of this test version. I added a temporary button named simply "X" next to the zoom buttons. Clicking it exports a file named "Output.pdf". I only wanted to be able to test the export method, so it only exports the whole matrix.

    I put a test jar on the downloads page:

    https://bitbucket.org/TreeView3Dev/treeview3/downloads/tv3_pdfexporttest.jar

    Could you guys run it and click the "X" button to see if the export works on your machine? I'm not sure whether it will work given the rigamarole I had to go through to compile the dependencies.

    What should happen when you click the button is that it will darken for a few seconds as it produces the file. I'm not sure where it will put the file. It put it in the ~/git/treeview3/LinkedView directory on my system. You might have to search your system for the output file.

    All of this will of course be fixed before it goes into the master branch, but just for testing, it should be sufficient.

    So, @TreeView3Dev, @abarysh, and @lance_parsons; could you run the linked jar above and see if you can generate a PDF?

  15. Robert Leach reporter

    So Lance confirmed it worked on his system (mac 10.10) yesterday. It works for me on mac 10.9. I'm curious about either linux or windows. Do you guys have access to either?

    Also, it will put the output file wherever the jar is apparently.

  16. Anastasia Baryshnikova

    Works for me too, I'm on mac 10.10 as well. However, I realized that we should have the option of exporting as PNG directly as well, in addition to PDF... PDF files tend to get very big very fast (~50 Mb for one of our test datasets) and become tough to even scroll through. PDF is a required option for smaller matrices and/or zoom-ins of larger matrices (when you want to be able to see the labels and maybe edit them in Illustrator etc.). But for larger matrices, it becomes a bit impractical.

    What do you guys think? How difficult would it be to enable both raster and vector export?

  17. Robert Leach reporter

    Shouldn't be difficult. In fact, I think Chris said in our doc review meeting today that the code existed once in an old version.
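
As a rough sketch of why supporting both could be cheap: if the matrix is painted through a `Graphics2D`, the same paint routine can target an off-screen `BufferedImage` for PNG output, while a vector context such as the freehep `PDFGraphics2D` from the snippet in comment 10 is handed to the paint code in the same way. The `Painter` interface and the class name here are hypothetical stand-ins, not TreeView code; only standard Java classes are used.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

/** Hypothetical sketch; Painter stands in for whatever paints the matrix. */
final class RasterExportSketch {
    private RasterExportSketch() {}

    /** Anything that can draw itself into a w x h area of a Graphics2D. */
    interface Painter {
        void paint(Graphics2D g, int w, int h);
    }

    /** Renders the painter into an off-screen image and writes it as a PNG. */
    static BufferedImage exportPng(Painter painter, int w, int h, File out)
            throws IOException {
        BufferedImage img = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = img.createGraphics();
        try {
            // Start from a white background, then let the painter draw on top.
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, w, h);
            painter.paint(g, w, h);
        } finally {
            g.dispose();
        }
        // ImageIO.write returns false when no writer exists for the format.
        if (!ImageIO.write(img, "png", out)) {
            throw new IOException("no PNG writer available");
        }
        return img;
    }
}
```

The raster path keeps file size tied to pixel dimensions rather than to the number of tiles, which is the concern raised above for large matrices.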

  18. Robert Leach reporter

    I have tweaked the export interface mock-up to both reflect some of the suggestions, and also to propose some new ideas and further define how some of the other ideas should work.

    1. I thought that when a format is selected, the resultant estimated file size could be shown.
    2. For the labels section, I think that the select "All" button could change to "None" if all are already selected.
    3. For the labels section, I think that the select "That Fit" button could select every 3rd or 4th or whatever that fit. Clicking it a subsequent time would offset the selections by 1 (e.g. if it's every third, clicking a second time, would select every third plus 1).
    4. I propagated the "Show Selection" checkbox for each section that has a way of depicting a selection. Note that it's unchecked for row labels and thus, there's no yellow in the row label area.
    5. I added the disabled logo checkbox in the footer and have included the window you would get from clicking the "enable" link.
    6. I decided to provide label checkboxes for labels to include in the image and wanted to show whether labels would overlap by showing them in gray. They can still be selected, but if selected labels will overlap, I color the background red and show an "Overlap" message in the lower right corner.
    7. I put bare-bone options in the simple version and added a "Show Details" button in the lower left that reveals all the other options.
    8. We could probably put a note or something that says that label font, size, style, etc are determined by the labels menu.
    9. Note, I grayed out the portions of long labels that would not display in the given area and changing the percentage would adjust that.
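
The "That Fit" behavior in item 3 could work roughly like this. The sketch assumes uniform row heights and a known label height in the same units; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the "That Fit" label selection; not TreeView code. */
final class LabelFitSketch {
    private LabelFitSketch() {}

    /** Smallest n such that drawing every n-th label leaves no overlap. */
    static int stride(double labelHeight, double rowHeight) {
        if (labelHeight <= 0 || rowHeight <= 0) {
            throw new IllegalArgumentException("heights must be positive");
        }
        return Math.max(1, (int) Math.ceil(labelHeight / rowHeight));
    }

    /**
     * Row indices selected after the button has been clicked 'clicks' times.
     * Each extra click shifts the selection by one row, wrapping around
     * after 'stride' clicks.
     */
    static List<Integer> selectThatFit(int rowCount, int stride, int clicks) {
        if (stride < 1 || clicks < 1) {
            throw new IllegalArgumentException("stride and clicks must be >= 1");
        }
        int offset = (clicks - 1) % stride;
        List<Integer> picked = new ArrayList<>();
        for (int i = offset; i < rowCount; i += stride) {
            picked.add(i);
        }
        return picked;
    }
}
```

With 12-point labels on 4-point rows the stride is 3, so the first click selects rows 0, 3, 6, ... and the second selects rows 1, 4, 7, ...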

    These are all bells and whistles that we don't necessarily need to have. I was just designing my druthers.

    exportdialogmock2-simple.png

    exportdialogmock2.png

    enable_no_logo.png

  19. Anastasia Baryshnikova

    I think this is way too many options. So much control over the exact appearance of the output presumes that most people will be using this PDF or PNG as their final figure (either for a paper or for a presentation) -- I don't think that will be the case. This file will be cut & edited outside of Treeview to fit the specific need of the user. As a result, things like the header, the footer, or their font options don't need to be here -- they won't be kept anyway. Same for the logo (easier to just remove it for now). The margins can be default & unchangeable. I think we should focus on the essentials: the matrix, the labels and the trees.

  20. Robert Leach reporter

    OK, so nix the header, footer, logo, and margins. Got it. I generally agree. I just wanted to explore options in the mockup and think it through a bit. And for a first pass at the export functionality, I wouldn't include the labels or "show selection" options either. We could also go with a static percentage for the tree sizes. At some point, we could eventually add them if we think it's worth it.

  21. Robert Leach reporter

    Here it is with most adjustments. I took a few liberties, but only for the graphic. Incidentally, I'd suggest we start with the simple version. The "Show Details" version can be implemented later. Of course, the plan right now is to export things separately and have the user piece them together, though I have experimented a bit with trying to weave them together.

    exportdialogmock3-simple.png

    exportdialogmock3.png

    enable_no_logo.png

  22. Christopher Keil repo owner

    I personally don't like forcing people to subscribe to an email list to remove the logo.

  23. Robert Leach reporter

    I can understand that. Seems a bit pushy perhaps? I thought it was a cool idea, but I see where you're coming from. What if we made it a suggestion, and disable it even if they don't opt-in to either? Would it still be too much?

  24. Anastasia Baryshnikova

    I just don't think it makes a lot of sense here.

    1) Eliminating the logo is not a great incentive for signing up -- the logo is nice, it doesn't interfere with the figure and, most likely, will be lost anyway in the process of finalizing the figure (e.g., editing the PDF in Illustrator or cropping in Photoshop).

    2) We have no real interest in getting everyone to sign up for the mailing list. We only want people who really want to be updated, no?

  25. Robert Leach reporter

    Good points. Usage stats and errors would be good to track though.

    I'm fine with not including it. I just have a sense that I gained from running a small business in Buffalo, that there's a chance to capitalize on things (whatever they may be) with every user interaction. I just think that even if a user is not opposed to, say, providing usage stats, they won't likely do it if it's an opt-in sort of mechanism that is left to the user to find. This provides them an opportunity to get something from opting in and introduces an incentive to make that decision. Again, I'm fine with not including this logo hook, but I just think that there's an opportunity here to get something back from the user - whether it's an email list, usage stats, or anything...

    Oh, and one other thing occurred to me: we could have a message on how to cite treeview at the bottom of this export window... What do you think about that?

  26. Anastasia Baryshnikova

    I think we should get back to the mailing list later, once all the basic features are implemented and we know exactly how to use a large mailing list on a regular basis.

    For the citation -- yes, good idea. I suggest this plan: alpha3 -> beta1 -> write manuscript -> post on bioRxiv -> add citation to the footer of every exported file (not export window) -> submit manuscript to journal -> manuscript published -> update citation with new journal/issue info

  27. Anastasia Baryshnikova

    I think it should be in the exported PDF, not in the window. Easier to store/retrieve/copy.

  28. Robert Leach reporter

    For alpha03, I suggest we exclude the matrix options section (only the whole matrix) and the show selections thing. In fact, I don't know if there's an option for page orientation - AND, with the format, if the user selects PNG, there's no "page" options at all...

    Also, what I have working doesn't include an overview (it's the whole matrix anyway), nor is there a logo or citation. I think all that stuff can come in later versions...

  29. Anastasia Baryshnikova

    I think that, if we're going to have just 1 option, that should be "export what you see on the screen". That way, it could be either the whole matrix or a subset, depending on the zoom level. I agree that the other options (show selections, page orientation, overview etc.) are lower priority.

  30. Robert Leach reporter

    Exporting a portion is harder, especially given the partial tree drawing. The whole thing is an easy first step.

  31. Anastasia Baryshnikova

    I understand, but we have to think about our goal. I think the goal is to have a useful export function, so that people can make figures for their presentations/papers. If the matrix is small to begin with, people wouldn't even use Treeview to explore it. If the matrix is large, they would be using Treeview to find interesting trends and not being able to save these trends will be very frustrating to them. So, I would be hesitant to say "we introduced an export function" when in reality it's in a form that's not very useful to the users.

    I know one could argue that "export the entire matrix" would make an overview figure, but that just doesn't sound good enough to me. We have to do it right. This is where the power of Treeview is: exploration of complex datasets & identification of interesting trends among all the chaos. Let's not worry about the release deadline and just figure this part out. And then, I agree, all the other bells and whistles can be added later.

  32. Christopher Keil repo owner

    I am not 100% sure what the final components to be included for #354 are. Also, please tell me which formats are expected (PDF, PNG...?) and which paper sizes (US Letter,..?).

    I will likely commit and resolve #354 soon. The rest of the options can quickly be added during the PR review process :)

  33. Robert Leach reporter

    I'm using these ones supported by freehep:

    pdf, ps, & svg

    to export document-style files and these already in the code-base:

    png, jpg, and ppm

    (since they were there) to output images that are not "part of a page".

    For the document style ones, I'm not sure what "PageSize" values it accepts. I've been testing with "A5", which was in an example.

    I created a class called "ExportHandler", as you suggested with 2 methods thus far:

    exportDocument(String format, String pageSize, String fileName)
    exportImage(String format, String fileName)

    The constructor (so far) takes:

    ExportHandler(final DendroView dendroView,final MapContainer interactiveXmap,final MapContainer interactiveYmap)

    Though I will need to have some sort of options for whether the region being exported is the visible, selected, or entire matrix. Other things are likely imminent as well, but for now, those are the basics.

    BTW, having a dendroView reference gets me both the IMV and the trees. Not sure if that's the best way to access their paint/"export" methods or not.

  34. Robert Leach reporter

    OK, I created a holistic export method:

    ExportHandler eh = new ExportHandler(dendroView, interactiveXmap,
            interactiveYmap);
    eh.export("png", "Output.png");
    

    It uses a default PageSize, but you can set the PageSize like this:

    eh.setDefaultPageSize("A5");
    
  35. Christopher Keil repo owner

    Yea, looks great. I am using an Enum right now to populate the combo boxes for type/ paper choices. In the future, maybe export() can use the Enum class instead of a String for "png" which would make switch cases (or if-else) considerably easier. I wanted to do that for most combo boxes in TreeView, so maybe we can make this part of a future issue.

    For the freehep ones, could you show me a list of available paper sizes so I can add them? For the image style ones it can just be ignored for now (disable paper choice when PNG is selected etc.)
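
A minimal sketch of what such an Enum might look like, using the six formats listed in comment 33. The enum name, its methods, and the document/image flag are assumptions for illustration; this is not the Enum currently used for the combo boxes.

```java
import java.util.Locale;

/** Hypothetical format enum; constants follow the formats named in comment 33. */
enum ExportFormat {
    PDF(true), PS(true), SVG(true),      // document-style formats (freehep)
    PNG(false), JPG(false), PPM(false);  // image formats with no page

    private final boolean document;

    ExportFormat(boolean document) {
        this.document = document;
    }

    /** True when a paper size applies, so the paper choice can stay enabled. */
    boolean isDocument() {
        return document;
    }

    /** File extension without the dot, e.g. "pdf". */
    String extension() {
        return name().toLowerCase(Locale.ROOT);
    }

    /** Bridges existing String-based calls such as export("png", ...). */
    static ExportFormat fromString(String s) {
        return valueOf(s.trim().toUpperCase(Locale.ROOT));
    }
}
```

A combo box listener could then disable the paper choice whenever `isDocument()` is false instead of comparing strings.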

  36. Robert Leach reporter

    I'm looking at aspect ratio now and I think that "Fit to Page" isn't straightforward at all and would likely yield unreliable results. So I think we should either remove that option or make it like a slider or something independent of page-size. The reason is that freehep's manipulation of the image for page size appears to be a black box. There's no way to grab the page dimensions from it.

    Also - do you think that aspect ratio would be intuitively interpreted as relating to the tiles? Or do you think that people will think of it as the aspect ratio of the matrix as a whole? I.e., if they have more rows of data than columns and they select 1:1, do you think they would expect the overall image to be a tall rectangle or a square? My intuition is that they would expect a tall rectangle.

  37. Anastasia Baryshnikova

    I'm ok with removing the "fit to page" option for the time being. Also, I agree with Rob: intuitively, 1:1 to me means "tall rectangle".
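
To make the tile reading of "1:1" concrete, here is a tiny sketch (hypothetical class and method names) that derives the overall matrix size from the tile size:

```java
import java.awt.Dimension;

/** Hypothetical sketch: the aspect ratio applies to each tile, not the whole matrix. */
final class TileAspectSketch {
    private TileAspectSketch() {}

    /** Overall drawing size for a rows x cols matrix of tileW x tileH tiles. */
    static Dimension matrixSize(int rows, int cols, int tileW, int tileH) {
        return new Dimension(cols * tileW, rows * tileH);
    }
}
```

With 300 rows, 100 columns and square 2x2 tiles, the result is 200 wide by 600 tall, i.e. the tall rectangle described above rather than a square.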

  38. Robert Leach reporter

    Well, I was working on connecting the interface to the export code that generates the image and I realized that we're missing one crucial thing: the file we're going to export to.

    What seems more intuitive?

    1. Pop up a file save dialog after they click export here.
    2. provide a file name field and file system browser in this export window

    I've seen it done both ways. I looked at a couple of "Save as..." examples. They tend to have a file system browser at the top and options at the bottom:

    firefox_save_as.png

    excell_save_as.png

  39. Robert Leach reporter

    Sorry, I guess my wording was somewhat ambiguous.

    1. a popup after clicking export
    2. no popup - instead provide the file name, location and export options (file type, aspect ratio, show selections, etc) all in 1 dialog window.
  40. Robert Leach reporter

    There's also the way Excel does it, which is the other way around: a file name with location browser, a format selection, and an options button to access things like aspect ratio, etc.

  41. Anastasia Baryshnikova

    To me, the simplest option seems to be: a single window with this option at the very top

    Export as: [ <edit box for full path to file name> ] [Browse...]
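
One way the [Browse...] button could be wired up, as a sketch using the standard Swing `JFileChooser`. The class name and the extension handling are assumptions, not part of the current implementation.

```java
import java.awt.Component;
import java.io.File;
import java.util.Locale;
import javax.swing.JFileChooser;

/** Hypothetical sketch of the "Export as: [path] [Browse...]" row. */
final class ExportPathSketch {
    private ExportPathSketch() {}

    /** Appends ".ext" unless the name already ends with it (case-insensitive). */
    static File withExtension(File f, String ext) {
        String suffix = "." + ext.toLowerCase(Locale.ROOT);
        if (f.getName().toLowerCase(Locale.ROOT).endsWith(suffix)) {
            return f;
        }
        return new File(f.getParentFile(), f.getName() + suffix);
    }

    /** What "Browse..." might do; returns null if the user cancels. */
    static File browse(Component parent, File current, String ext) {
        JFileChooser chooser = new JFileChooser();
        if (current != null) {
            chooser.setSelectedFile(current);
        }
        if (chooser.showSaveDialog(parent) != JFileChooser.APPROVE_OPTION) {
            return null;
        }
        return withExtension(chooser.getSelectedFile(), ext);
    }
}
```

The edit box would then simply show the path returned by `browse`, and the user could still type a path directly.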

  42. Robert Leach reporter

    Option 1 would pretty much be that a new popup would appear similar to the firefox example above without the format menu.

    Here's a mockup of option 2:

    exportdialogmock5-simple.png

  43. Robert Leach reporter

    OK, a browse button. That's similar to option 1 with the popup, only it's just for the save location. That's another option.

    Oops, I left out the new folder button in my mockup...

    exportdialogmock5-simple.png

  44. Robert Leach reporter

    The requirements seem to be mature enough. Note, the file name and location selection could be improved. The requirements did not cover this well, but what's currently implemented is sufficient.
