Unable to get the HTML file after conversion
Hi, I am not able to get the HTML file after converting the MediaWiki XML file. Only index.html is generated.
Please find attached the source file and the index.html file.
Comments (10)
-
repo owner -
reporter - attached test.xml
PFA input file
-
repo owner - changed status to open
will look into it
-
repo owner Hi, I just tried a convert job myself. The result isn't very nice, but at least I get output. Can you tell me where this file comes from? Which MediaWiki version is this?
-
repo owner Hi, can I get some more information on your source file? I just tested my tool with the current MediaWiki version 1.22.6 and this works fine as well.
-
reporter I took the sample text from http://biowikifarm.net/meta/Mediawiki_XML_page_importing
and changed the value in the <text></text> tag:
<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.4/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.mediawiki.org/xml/export-0.4/ http://www.mediawiki.org/xml/export-0.4.xsd" version="0.4" xml:lang="en">
  <siteinfo>
    <!-- … the XML header from an arbitrary wiki page export -->
  </siteinfo>
  <page>
    <title>GBIF:cultivar</title>
    <revision>
      <contributor><username>User name</username><id>123</id></contributor>
      <text xml:space="preserve">{|
|Orange
|Apple
|-
|Bread
|Pie
|-
|Butter
|Ice cream
|}</text>
    </revision>
  </page>
</mediawiki>
-
repo owner Ah well, I think I see the issue: the converter basically just looks at everything within the <page> element and renders this information. Due to a special syntax my example database uses, anything to the right of a pipe symbol is ignored. Thus you get empty output in the file that is created. My wiki uses * to mark <li> items. Does this work?
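To illustrate the behaviour described above, here is a minimal Python sketch. It is not the converter's actual code: the function name render_wikitext and the exact rules (cut each line at the first pipe, turn lines starting with * into <li> items) are assumptions made only to show why table markup could end up as empty output.

```python
import html


def render_wikitext(text: str) -> str:
    """Hypothetical renderer, NOT the tool's real implementation.

    Assumed rules: everything from the first '|' on a line is discarded,
    and lines starting with '*' become <li> items inside a <ul>.
    """
    out = []
    in_list = False
    for raw in text.splitlines():
        # Keep only the part left of the first pipe symbol.
        line = raw.split("|", 1)[0].strip()
        if not line:
            continue  # nothing left to render on this line
        if line.startswith("*"):
            if not in_list:
                out.append("<ul>")
                in_list = True
            out.append("<li>" + html.escape(line.lstrip("*").strip()) + "</li>")
        else:
            if in_list:
                out.append("</ul>")
                in_list = False
            out.append("<p>" + html.escape(line) + "</p>")
    if in_list:
        out.append("</ul>")
    return "\n".join(out)
```

Under these assumed rules, table rows such as "|Orange" or "|-" start with a pipe, so nothing is left to render, while "* label = cultivar" would become a list item.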
-
reporter I just tried the same input, replacing | with *. I am still not getting any output.
<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.4/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.mediawiki.org/xml/export-0.4/ http://www.mediawiki.org/xml/export-0.4.xsd" version="0.4" xml:lang="en">
  <siteinfo>
    <!-- … the XML header from an arbitrary wiki page export -->
  </siteinfo>
  <page>
    <title>GBIF:cultivar</title>
    <revision>
      <contributor><username>User name</username><id>123</id></contributor>
      <text xml:space="preserve">
*collection = GBIF
*short URI = cultivar
*full URI = http://vocabularies.gbif.org/services/gbif/taxon_rank/cultivar
*label = cultivar
*code = cultivar
*see also = http://rs.gbif.org/vocabulary/gbif/rank.xml</text>
    </revision>
  </page>
</mediawiki>
Am I doing anything wrong?
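One way to rule out a problem with the file itself is to check that it parses and that the page title and text can be read. The following is a rough sketch using only the Python standard library; list_pages is a made-up helper, and the namespace is taken from the export-0.4 sample above. It says nothing about how the converter reads the file.

```python
import xml.etree.ElementTree as ET

# Namespace copied from the sample export; other export versions use a different URI.
NS = {"mw": "http://www.mediawiki.org/xml/export-0.4/"}


def list_pages(xml_source: str):
    """Return (title, text) pairs for every <page> in a MediaWiki export string."""
    root = ET.fromstring(xml_source)
    pages = []
    for page in root.findall("mw:page", NS):
        title = page.findtext("mw:title", default="", namespaces=NS)
        text = page.findtext("mw:revision/mw:text", default="", namespaces=NS)
        pages.append((title, text))
    return pages
```

If this returns the expected title and a non-empty text for the page, the XML is well-formed and the missing output more likely comes from how the markup is interpreted.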
-
repo owner Hi, I am afraid the tool is quite picky because it has to handle quite a lot of special markup for the case I built it for. Could you try adding a whitespace before the *?
<text xml:space="preserve"> * collection = GBIF * short URI = cultivar * full URI = http://vocabularies.gbif.org/services/gbif/taxon_rank/cultivar * label = cultivar * code = cultivar * see also = http://rs.gbif.org/vocabulary/gbif/rank.xml</text>
Over the last few days I wrote a version two that enhances the capability for my specific case. I will merge these changes into this branch ASAP as well.
-
reporter I am still not able to get the output.
-
Hi, can you provide the source file as well? I need to analyse what happened because your target file GBIF is completely empty.