unquoted string "0x_" causes NumberFormatException
e.g. try parsing string_that_almost_looks_like_hex: 0x_
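For illustration, here is a minimal sketch of how this kind of failure can arise, using only the JDK. It assumes the parser has already classified the scalar as a hex integer, then strips the prefix and the underscore separators before converting. The class and method names are hypothetical and this is not SnakeYAML's actual code.

```java
public class HexUnderscoreDemo {
    // Hypothetical helper: assumes the scalar was already resolved as a hex int.
    static int parseHex(String scalar) {
        // Drop the "0x" prefix and the "_" separators.
        String digits = scalar.substring(2).replace("_", "");
        // With no digits left, parseInt throws NumberFormatException.
        return Integer.parseInt(digits, 16);
    }

    public static void main(String[] args) {
        // The hex example from http://yaml.org/type/int.html parses fine.
        System.out.println(parseHex("0x_0A_74_AE"));
        try {
            parseHex("0x_");
        } catch (NumberFormatException e) {
            System.out.println("NumberFormatException for 0x_");
        }
    }
}
```

The point is only that a pattern accepting `0x_` as an int leaves nothing to convert once the underscores are removed.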
Comments (13)
-
reporter -
reporter well, i just realized that the ruby and python parsers also blow up on this test case, so perhaps it isn’t unreasonable for snakeyaml to do the same. also found the spec shows the regex to use for integers, and that seems to match what’s on master.
that said, i still think treating
0x_
as a string is a sensible decision. the spec is somewhat ambiguous about how it should be treated. but whatever y’all feel like is fine with me – i can work around it in my app. -
We spent too much time trying to fix deviations of the parsers. I am not in favour of adding yet another one.
By the way, you can specify your own Resolver at runtime to achieve the goal. Then you do not have to use quotes.
-
reporter yep, we’re currently working around it with a custom constructor, actually. anyway, it isn’t a goal of mine to be able to not use quotes in the yaml i produce, this is a matter of parsing user-provided yaml, i.e. that i haven’t constructed. since the spec is ambiguous, i guess it’s up to us to decide what
0x_
means (or 0b_
or 0_
– other cases that i’m just realizing also fail). the solution in my PR is a bad one – it causes
0x_0A_74_AE
to be parsed as a string, but this is a case that is specifically called out as an example of a hexadecimal int in http://yaml.org/type/int.html. sounds like you don’t want to update the resolver, but nonetheless i’ll update that PR with an even more complicated regex to handle this issue, just so there’s a record of it for posterity. -
reporter k, updated the PR on github to handle stuff like
0x_0A_74_AE
as a number, and to handle stuff like 0b_
and 0_
and -_
as a string. -
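The idea described in the comment above can be sketched with plain java.util.regex. This is not the regex from the PR. It is an illustrative assumption that starts from the hex pattern on http://yaml.org/type/int.html and requires at least one real digit after the prefix. It covers only the prefixed binary, hex and octal forms, and the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

public class IntShapeSketch {
    // Hex pattern as given on yaml.org/type/int.html: underscores alone are enough.
    static final Pattern LOOSE_HEX = Pattern.compile("[-+]?0x[0-9a-fA-F_]+");

    // Illustrative stricter variants: at least one digit must follow the prefix.
    static final Pattern STRICT = Pattern.compile(
            "[-+]?0b_*[01][01_]*"                   // base 2
          + "|[-+]?0x_*[0-9a-fA-F][0-9a-fA-F_]*"    // base 16
          + "|[-+]?0_*[0-7][0-7_]*");               // base 8

    // True when the scalar has the shape of a prefixed int with real digits.
    static boolean looksLikePrefixedInt(String scalar) {
        return STRICT.matcher(scalar).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"0x_", "0b_", "0_", "0x_0A_74_AE"}) {
            System.out.println(s + " -> " + looksLikePrefixedInt(s));
        }
    }
}
```

Under these assumptions `0x_`, `0b_` and `0_` fall through to plain strings, while `0x_0A_74_AE` still has the shape of an integer.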
I am afraid it goes too far.
This is a number not a string:
0123456789
-
reporter you probably saw my comment on https://bitbucket.org/asomov/snakeyaml/issues/449, but I think that’s debatable. for what it’s worth, ruby 2.6 and 2.7 consider it a string, as does python 3.9.5. if it is a number, what base is it, and what is the resulting decimal value?
also, i know you said early on you don’t want to fix parser deviations, and i respect that decision. if you don’t want handling for these problematic strings to be merged in, i’ll drop the issue. if you do think we’re on track and just need a little more tweaking, i’m happy to stick around and work out the details with you, or hand it off for you to finish up, whatever you prefer.
-
I am not sure about Python 3.9, but Python 3.6 which I have raises an exception:
ValueError: invalid literal for int() with base 16: ''
-
reporter are you talking about for
0x_
? that’s what 3.9 does for that input, but for 0123456789
it loads it as a string. -
I appreciate your clarification and contribution.
I think it should be taken. The only question I have is that the tests fail now.
8e-06 parsed as String instead of Double.
-
Please run
./mvnw clean install site
-
- changed status to resolved
Thank you. I have changed the pattern for Float. It will be delivered in version 1.30 https://bitbucket.org/asomov/snakeyaml/wiki/Changes
-
reporter sorry, this fell off my radar for a bit, but thanks for accepting the fix and taking care of the affected tests!
guess i can’t create pull requests on bitbucket, but i put one up on github for your review: https://github.com/asomov/snakeyaml/pull/16