dataliberation / README.rst

Tools for liberating a project's data from different sites.


issues/ - export data from various issue trackers


[ ] Get issue export working
    . this is needed for the SCons project
    . and probably for Subversion

[ ] Make sure converted data is complete
[ ] Define target format for conversions
[ ] Define common intermediate format for issues
[ ] Write exporter from intermediate format into target

[ ] Figure out manual mapping requirements, like
    usernames, issue numbers etc.
[ ] Describe the conversion process, because it probably
    can not be 100% automated (at the very least, redirection
    links need to be posted for older issues)
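
As an illustration of the intermediate format and the manual username mapping from the list above, here is a minimal sketch. Nothing in it is decided yet: the ``Issue`` fields, the ``remap_users`` helper and the sample usernames are all hypothetical. The point is only that a mapping step can report what it could not map instead of guessing.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Issue:
    """Hypothetical intermediate record for one issue (fields are assumptions)."""
    number: int
    title: str
    reporter: str
    body: str = ""
    labels: tuple = ()


def remap_users(issues, user_map):
    """Rewrite reporters using a hand-written map; collect names with no mapping."""
    converted, unmapped = [], set()
    for issue in issues:
        target = user_map.get(issue.reporter)
        if target is None:
            # Keep the original name and flag it for manual review.
            unmapped.add(issue.reporter)
            converted.append(issue)
        else:
            converted.append(replace(issue, reporter=target))
    return converted, sorted(unmapped)


if __name__ == "__main__":
    issues = [Issue(1, "Crash", "bob"), Issue(2, "Typo", "alice")]
    converted, unmapped = remap_users(issues, {"bob": "bob-on-target"})
    print(converted)
    print("needs manual mapping:", unmapped)
```

The list of unmapped names is exactly the kind of input the "describe conversion process" item would have to explain how to handle by hand.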

Final goal

The major part is to build a "structure converter tool" that converts XML or other formats into a tree. The tree can be dumped, validated, compared to another tree, or converted. It is also very important to get full information about a conversion: is it complete, is it reversible, or is data lost on the way? And if data is lost, what is missing to complete the conversion or to make it reversible?
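
A rough sketch of the dump and compare operations, assuming XML input and Python's standard ``xml.etree.ElementTree``. The ``Node`` class and the ``from_xml``, ``dump`` and ``diff`` names are made up for this example and are not part of any existing tool. Note that this sketch is itself lossy: it drops comments, tail text and surrounding whitespace, which is the kind of loss the tool should be able to report.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical format-independent tree node."""
    tag: str
    attrs: dict = field(default_factory=dict)
    text: str = ""
    children: list = field(default_factory=list)


def from_xml(source):
    """Convert an XML string into a Node tree (comments and tail text are dropped)."""
    def build(elem):
        return Node(elem.tag, dict(elem.attrib), (elem.text or "").strip(),
                    [build(child) for child in elem])
    return build(ET.fromstring(source))


def dump(node, depth=0):
    """Return the tree as a list of indented text lines."""
    attrs = "".join(f" {k}={v!r}" for k, v in sorted(node.attrs.items()))
    text = f" {node.text!r}" if node.text else ""
    lines = ["  " * depth + node.tag + attrs + text]
    for child in node.children:
        lines.extend(dump(child, depth + 1))
    return lines


def diff(a, b, path=""):
    """Return human-readable differences between two trees, matched by position."""
    here = f"{path}/{a.tag}"
    if a.tag != b.tag:
        return [f"{here}: tag {a.tag!r} != {b.tag!r}"]
    found = []
    if a.attrs != b.attrs:
        found.append(f"{here}: attrs {a.attrs!r} != {b.attrs!r}")
    if a.text != b.text:
        found.append(f"{here}: text {a.text!r} != {b.text!r}")
    if len(a.children) != len(b.children):
        found.append(f"{here}: {len(a.children)} children != {len(b.children)}")
    for left, right in zip(a.children, b.children):
        found.extend(diff(left, right, here))
    return found


if __name__ == "__main__":
    old = from_xml('<issue id="1"><title>Crash</title></issue>')
    new = from_xml('<issue id="1"><title>Crash on start</title></issue>')
    print("\n".join(dump(old)))
    print(diff(old, new))
```

Children are compared by position here, which is the simplest choice; a real comparison would probably need to match children by some key so that a reordered list is not reported as a change in every element.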

There is a need for a tool that allows easy analysis and debugging of the conversion process, to walk through it step by step and see the outcome of every operation before it occurs. This imposes certain requirements for good visualization in the user interface, and it should be cross-platform. But the underlying scripts should be independent of the UI.

It may be worth starting such a visual interface in PySide. Start with displaying the initial source data file, then add line numbers, then convert it to a tree, keeping line numbers linked to tree elements. Then the window can be split vertically to show the scaled tree and the file contents simultaneously. Then try to highlight lines and the corresponding elements in the tree on selection. Do the same with mouseover - scrolling the main window while selecting tree items. After that the UI can be taught to compare trees, highlighting modified parts. And finally it should visualize the conversion process and walk through it step by step.
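
The "line numbers linked to tree elements" step does not depend on the UI, so it can be sketched without PySide. The example below assumes XML input and uses the standard ``xml.parsers.expat`` module, whose parser exposes ``CurrentLineNumber`` during callbacks. The dict-based node layout and the ``tree_with_lines`` and ``index_by_line`` names are only assumptions for illustration.

```python
import xml.parsers.expat


def tree_with_lines(source):
    """Parse XML into nested dicts, recording the line of each start tag."""
    parser = xml.parsers.expat.ParserCreate()
    roots, stack = [], []

    def start(name, attrs):
        node = {"tag": name, "attrs": attrs,
                "line": parser.CurrentLineNumber, "children": []}
        # Attach to the enclosing element, or record as the document root.
        (stack[-1]["children"] if stack else roots).append(node)
        stack.append(node)

    def end(name):
        stack.pop()

    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.Parse(source, True)
    return roots[0]


def index_by_line(node, index=None):
    """Map line number -> list of nodes starting there, for highlighting."""
    if index is None:
        index = {}
    index.setdefault(node["line"], []).append(node)
    for child in node["children"]:
        index_by_line(child, index)
    return index


if __name__ == "__main__":
    sample = "<issues>\n  <issue id=\"1\">\n    <title>Crash</title>\n  </issue>\n</issues>"
    root = tree_with_lines(sample)
    for line, nodes in sorted(index_by_line(root).items()):
        print(line, [n["tag"] for n in nodes])
```

With such an index, selecting a line in the file view can look up the tree elements to highlight, and each node's ``line`` value gives the reverse direction.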

If everything above sounds too complicated, we may end up using Google Refine. =)