Issue #29 new

Zero padded numbers ending in 8 or 9 dump incorrectly (or everything else does!)

jtor14
created an issue

When dumping zero-padded numeric strings, yaml.dump quotes them:

yaml.dump(['01', '02', '0043'])
# "['01', '02', '0043']\n"

However, if your zero-padded number ends in 8 or 9, the quotes are dropped:

yaml.dump(['08', '029', '0002418'])
# '[08, 029, 0002418]\n'

Yet the string versions of 8 and 9 behave as expected:

yaml.dump(['8', '9'])
# "['8', '9']\n"

As do strings of numbers ending in 8 or 9:

yaml.dump(['349', '2308'])
# "['349', '2308']\n"

Comments (2)

  1. Florent Xicluna

    I am afraid that it is by design.

    • example 1: without quotes, each value would be read as an octal integer (starts with 0 and all digits < 8), so it has to be quoted
    • examples 3 and 4: without quotes, each value would be read as a decimal integer (starts with a non-zero digit and contains digits only), so it has to be quoted
    • example 2: the values can be written without quotes because they cannot be confused with a decimal or an octal: they start with 0 and contain the digit 8 or 9
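
The three cases above can be checked by loading each string as a bare, unquoted scalar and looking at the type that comes back. This is a minimal sketch, assuming a PyYAML version whose default resolver follows the YAML 1.1 integer rules; the `samples` and `resolved` names are illustrative only.

```python
import yaml

# Strings taken from the examples in this issue.
samples = ['01', '0043', '08', '029', '349', '8']

# Parse each sample as a bare (unquoted) YAML scalar and keep the result.
resolved = {text: yaml.safe_load(text) for text in samples}

for text, value in resolved.items():
    # Values that come back as int would change type if dumped unquoted;
    # values that come back as str are safe to write without quotes.
    print(f"{text!r:>8} unquoted loads as {type(value).__name__}: {value!r}")
```

Under that assumption, `'0043'` loads as the integer 35 (octal), `'349'` loads as the integer 349, and `'08'` stays the string `'08'`, which is why only the last kind is dumped without quotes.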

    I had a similar annoyance in a project and fixed it by subclassing yaml.Dumper with custom representers (the subclass also fixes two other annoyances).

    import yaml
    
    
    def represent_unicode(dumper, data, style=None):
        data = unicode(data)
        if not dumper.default_style and data.isdigit():
            style = "'"
        return dumper.represent_scalar(u'tag:yaml.org,2002:str', data, style=style)
    
    
    class UDumper(yaml.Dumper):
        pass
    
    # Represent longs (42L) the same as ints
    UDumper.add_representer(long, yaml.Dumper.yaml_representers[int])
    # Always use quotes when the string contains only digits
    UDumper.add_representer(str, represent_unicode)
    # Represent <unicode> the same as <str>
    UDumper.add_representer(unicode, represent_unicode)
    
    
    # Usage:
    # yaml.dump(['01', '02', '0043'], Dumper=UDumper)
    # yaml.dump(['08', '029', '0002418'], Dumper=UDumper)
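
The snippet above targets Python 2 (`unicode`, `long`). On Python 3 those types no longer exist, so a rough equivalent only needs the `str` representer. The following is a sketch under that assumption, not the commenter's own code; `QuotedDigitsDumper` and `represent_digit_str` are hypothetical names chosen for illustration.

```python
import yaml


def represent_digit_str(dumper, data):
    """Represent a str, forcing single quotes when it contains only digits."""
    style = None
    # Leave an explicitly requested default_style alone; otherwise quote
    # digit-only strings so they always look like strings in the output.
    if not dumper.default_style and data.isdigit():
        style = "'"
    return dumper.represent_scalar('tag:yaml.org,2002:str', data, style=style)


class QuotedDigitsDumper(yaml.Dumper):
    """Hypothetical Dumper subclass so the stock yaml.Dumper is left untouched."""


# Always use quotes when the string contains only digits.
QuotedDigitsDumper.add_representer(str, represent_digit_str)

values = ['08', '029', '0002418']
text = yaml.dump(values, Dumper=QuotedDigitsDumper, default_flow_style=True)
print(text)
```

Registering the representer on a subclass rather than on `yaml.Dumper` itself keeps the change local to calls that pass `Dumper=QuotedDigitsDumper`. Note that `str.isdigit()` on Python 3 also accepts some non-ASCII digit characters, which may or may not be what a given project wants.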
    
  2. jtor14 reporter

    I understand the need for octal representation; however, I'm specifically interested in the string representation of those numbers. What seems incorrect is the apparent coercion to some type of number even though the value is quoted.

    It seems to me that the yaml.dump (and representer) should remain ignorant of the notion that the element contained within the quotes can be coerced into a number (of any kind).
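
One way to see what the dumper is actually doing with these strings is to load its output back and compare. This is a minimal sketch, assuming the default PyYAML resolver; `examples` and `round_trips` are illustrative names.

```python
import yaml

# The four lists from the original report.
examples = [
    ['01', '02', '0043'],
    ['08', '029', '0002418'],
    ['8', '9'],
    ['349', '2308'],
]

# Dump each list, load the result back, and compare with the input.
round_trips = [
    yaml.safe_load(yaml.safe_dump(item)) == item
    for item in examples
]
print(round_trips)
```

If every entry is `True`, the values survive as strings in all four cases, quoted or not; the quoting decision then only reflects whether the bare text would be read back as a number.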
