Source

wiki2csv /

Filename Size Date modified Message
34.3 KB
20 B
4.6 KB
1.6 KB
6.8 KB
README wiki2csv
===============

Copyright © 2012 Jan Kanis

This program is licenced under terms of the GNU General Public Licence as published by the Free Software Foundataion,  either version 3 of the licence or (at your option) any later version. This program is distributed without any warranty whatsoever. 


Purpose
-------

This program was designed to help edit large tables on wikipedia in the wikitable format. It works by converting the wikitable syntax to comma separated value syntax, that can be edited with Excel or LibreOffice. The result can then be transformed back to wikitable syntax. The program tries to preserve the wikitable formatting and metadata as good as is practically possible. 

The wikitable format is documented at http://en.wikipedia.org/wiki/Help:Wikitable. 


Requirements
------------

wiki2csv requires Python 2.7, available at http://www.python.org. The core conversion routines will probably also work on older Python versions, but wiki2csv uses the argparse library to parse command line arguments, that is only available since Python 2.7. 


Usage
-----

wiki2csv was designed to be used as follows:

1) Be sure wiki2csv is installed somewhere where you can find it, and als make sure that the right version of Python is available on your system. Running "python wiki2csv.py -h" should produce the program's help output. 
2) Copy the wikitable from the edit view on wikipedia into a new file on your computer. The wikitable is everything starting from '{|' up to and including the '|}'. Be sure to save this into a text file: on Windows use Notepad, or if you use Word be sure to save the file as 'plain text' and not as a .doc file. On Linux you could use for example Gedit. For these instructions I will assume you named your file "table.wiki". 
3) Open a command window and move to the directory where you saved your file. 
4) Run "python wiki2csv.py table.wiki table.csv". (Or change table.wiki to the location of your file, and table.csv to the location of your file with the extension changed to .csv)
5) Start Excel or LibreOffice Calc and open the newly created file table.csv. 
6) Excel/LibreOffice will come up with a dialogue asking how it should read the file. Choose 'separated by' and check 'comma', and make sure all other possible separators are unchecked. Also make sure that 'merge delimiters' is unckecked and 'text delimiter' is set to the double quote (") character. Any other options should probably be disabled. 
7) You should now be able to edit the table in Excel/LibreOffice. As you can see, all formatting directives are preserved as text rather than by formatting the content of the cells in the editor. 
8) When you are done editing, save the file under the same name. 
9) In the command window, run "python wiki2csv.py --towiki table.csv table.wiki". This will overwrite 'table.wiki' with the changes you've made. 
10) Reload or reopen table.wiki in the text editor, and copy the changed wikitable to the edit box on the wikipedia website. 


Invocation
----------

wiki2csv will by default convert SOURCE from wikitable format to CSV format. However, if it is invoked as csv2wiki (possibly with an extension) it will default to working in the other direction. The --towiki or --tocsv flags take precedence when supplied. See the output of "wiki2csv.py --help" for further documentation. 


Limitations
-----------

- wiki2csv does not preserve the wiki '||' sytax to place multiple table cells on one source line. When converted back from csv, the cells will appear each on a separate line. This limitation does not hold for header cells. 
- The optional |- before the first table row is not preserved, and neither are multiple |-'s without any data inbetween. A final |- before the end of the table is however produced. 
- wiki2csv does not deal very well with nested tables. 
- Correct wikitable syntax should be handled correctly, but if wiki2csv encounters bad wikitable syntax it can generate garbage output or crash. 


Todo
----

The program is doing what I need it to do, so I have no significant improvements planned (though I do intend to fix bugs if they are found). However, the user interface is not yet very great. It would be very nice if someone was interested in either creating a GUI interface, or implementing this as a web service. (Using Google Appengine that shouldn't be too difficult.) Even better would be a plug-in for LibreOffice to allow it to directly edit wikitables, but that is probably a lot more work. 


Contact & Bugs
--------------

This program is maintained on bitbucket at https://bitbucket.org/JanKanis/wiki2csv. That page also contains a bugtracker. 
Tip: Filter by directory path e.g. /media app.js to search for public/media/app.js.
Tip: Use camelCasing e.g. ProjME to search for ProjectModifiedEvent.java.
Tip: Filter by extension type e.g. /repo .js to search for all .js files in the /repo directory.
Tip: Separate your search with spaces e.g. /ssh pom.xml to search for src/ssh/pom.xml.
Tip: Use ↑ and ↓ arrow keys to navigate and return to view the file.
Tip: You can also navigate files with Ctrl+j (next) and Ctrl+k (previous) and view the file with Ctrl+o.
Tip: You can also navigate files with Alt+j (next) and Alt+k (previous) and view the file with Alt+o.