Issue #1 invalid

dom->toFile(file,2) doesn't put newlines after newly added nodes.

Ted Toal
created an issue

I don't know if I'm doing something wrong that is causing this problem, or if it is a bug. I parse an XML string, insert a node before an existing node, then write the XML to a file using Document->toFile() with $format=2. The problem is that the output file does not have a line break after the newly inserted node. I'm using version 2.0002.

Input XML:

{{{
#!text
<?xml version="1.0" encoding="UTF-8"?>
<pfam xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://pfam.sanger.ac.uk/" xsi:schemaLocation="http://pfam.sanger.ac.uk/ http://pfam.sanger.ac.uk/static/documents/schemas/results.xsd" release="26.0" release_date="2011-11-17">
  <protein length="368" name="Solyc07g008440.2.1">
    <database>
    </database>
  </protein>
</pfam>
}}}

Output XML:

{{{
#!text
<?xml version="1.0" encoding="UTF-8"?>
<pfam xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://pfam.sanger.ac.uk/" xsi:schemaLocation="http://pfam.sanger.ac.uk/ http://pfam.sanger.ac.uk/static/documents/schemas/results.xsd" release="26.0" release_date="2011-11-17">
  <protein length="368" name="Solyc07g008440.2.1">
    <seq>Hello world</seq><database>
    </database>
  </protein>
</pfam>
}}}

Code:

{{{
#!perl
use strict;
use warnings;
use XML::LibXML;

open(DATAFILE, "<" . "test.xml");
my @entireFile = <DATAFILE>;
close(DATAFILE);
my $xml = join('', @entireFile);

my $xml_parser = XML::LibXML->new();
my $dom = $xml_parser->parse_string($xml);
my $pfam = $dom->documentElement();
my ($proteinTag) = $pfam->getElementsByTagName('protein');

my $seqText = $dom->createTextNode("Hello world");
my $seqTag = $dom->createElement('seq');
$seqTag->appendChild($seqText);

my ($databaseTag) = $proteinTag->getElementsByTagName('database');
$proteinTag->insertBefore($seqTag, $databaseTag);

my $state = $dom->toFile("test2.xml", 2);
}}}

Thank you for looking into this.

Comments (6)

  1. Toby Inkster

    Why would there be a line break after the newly inserted node? You didn't insert a line break after it. If you want a line break before the <database> tag, then:

    $proteinTag->insertBefore($dom->createTextNode("\n"), $databaseTag);
    

    In the general case, whitespace is considered significant in XML, so XML::LibXML won't try to guess where you want line breaks and insert them without being instructed, as that would risk breaking some XML documents.

    If you want pretty-printed XML, take a look at XML::LibXML::PrettyPrint on CPAN.
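
    For reference, the whitespace-node suggestion above can be sketched end to end roughly as follows. This is only an illustration: the input is a simplified stand-in for the reporter's file (namespace attributes omitted), and the four-space indentation in the inserted text node is an assumption chosen to match that stand-in.

    ```perl
    use strict;
    use warnings;
    use XML::LibXML;

    # Simplified stand-in for the reporter's input (namespaces omitted).
    my $xml = join "\n",
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<pfam>',
        '  <protein length="368" name="Solyc07g008440.2.1">',
        '    <database>',
        '    </database>',
        '  </protein>',
        '</pfam>',
        '';

    my $dom = XML::LibXML->new->parse_string($xml);
    my ($proteinTag)  = $dom->documentElement->getElementsByTagName('protein');
    my ($databaseTag) = $proteinTag->getElementsByTagName('database');

    # Build <seq>Hello world</seq> and place it before <database>.
    my $seqTag = $dom->createElement('seq');
    $seqTag->appendChild($dom->createTextNode('Hello world'));
    $proteinTag->insertBefore($seqTag, $databaseTag);

    # Explicitly add the line break (plus indentation matching the input)
    # between the new <seq> element and <database>.
    $proteinTag->insertBefore($dom->createTextNode("\n    "), $databaseTag);

    my $out = $dom->toString;
    print $out;
    ```

    Since the whitespace is an ordinary text node, the same call can be repeated after every inserted element, with whatever indentation the surrounding document uses.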

  2. Ted Toal reporter

    (Reply via twt...@ucdavis.edu):

    Thanks. I thought that the $format=2 argument was equivalent to asking that the document be pretty-printed.

    ted

  3. Toby Inkster

    Kinda, but even with $format=2, XML::LibXML is very conservative about its reformatting. It will add leading and trailing whitespace to text nodes, but there's no text node between your two elements, so nothing to add whitespace to.
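
    One commonly used alternative, not suggested in this thread and shown here only as a hedged sketch, is to parse with the `no_blanks` option so that whitespace-only text nodes the parser considers ignorable are dropped, and then serialize with $format=1. With no text nodes left between sibling elements, libxml2 can add its own indentation. The input below is again a simplified stand-in for the reporter's file, and whether a given whitespace node is dropped depends on libxml2's heuristics, so this may not be safe for documents with mixed content.

    ```perl
    use strict;
    use warnings;
    use XML::LibXML;

    # Simplified stand-in for the reporter's input (namespaces omitted).
    my $xml = join "\n",
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<pfam>',
        '  <protein length="368" name="Solyc07g008440.2.1">',
        '    <database>',
        '    </database>',
        '  </protein>',
        '</pfam>',
        '';

    # no_blanks asks the parser to discard ignorable whitespace-only text nodes.
    my $dom = XML::LibXML->load_xml(string => $xml, no_blanks => 1);
    my ($proteinTag)  = $dom->documentElement->getElementsByTagName('protein');
    my ($databaseTag) = $proteinTag->getElementsByTagName('database');

    # Insert <seq>Hello world</seq> before <database>, with no manual whitespace.
    my $seqTag = $dom->createElement('seq');
    $seqTag->appendChild($dom->createTextNode('Hello world'));
    $proteinTag->insertBefore($seqTag, $databaseTag);

    # $format=1 lets libxml2 indent elements that have no text-node children.
    my $formatted = $dom->toString(1);
    print $formatted;
    ```

    The trade-off is that the original whitespace of the file is discarded and replaced by libxml2's fixed indentation, which is why the explicit text-node approach is the more conservative choice.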
