Jid.of() doesn’t parse unicode domains

Issue #128 closed

Daniel Gultsch created an issue 2018-03-08

I started using xmpp-addr in Conversations (after using the full library in various other projects) and I noticed a weird behaviour with unicode vs punycode domains which I'm not sure is intentional.

If I create a Jid using it‘s punycode representation everything works as expected. If I output this jid using Jid.toString() I get the (normalized) unicode representation of that jid. However when I then put that string representation into Jid.of() again I get an IllegalArgumentException with invalid jid.

Consider the following code. That goes throws an exception in the third line.

Jid fromPunyCode = Jid.of("user@xn--xample-2of.com");
System.out.println("jid: "+fromPunyCode.toString());
Jid fromUnicode = Jid.of(fromPunyCode.toString());
System.out.println("jid: "+fromUnicode.toString());

I'm honestly not really sure how unicode domains are supposed to work. If I'm reading the RFC correctly they should go in their unicode representation over the wire. Therefor Jid needs to be able to parse the unicode form and sort of only accept punycode as a bonus for easier user input.

Also since I'm unable to get the unicode representation out of the Jid class again I can't store the Jid object in a database.

Edit: I'm using 0.7.5 and briefly looked through the commit history but couldn't find anything. I apologize if this has already been addressed.

Comments (10)

Daniel Gultsch reporter
- edited description
- 2018-03-08T23:08:38+00:00
Christian Schudt repo owner
Hi Daniel.

thanks for the bug report. I am sorry, there were issues with that in 0.7.x, which are already fixed for 0.8.0.

For 0.8.0 I've overhauled the Jid implementation, making Jid an interface, fixing the punycode/IDN issues and updating to the new PRECIS spec (RFC 8264/RFC 8265) which is used by RFC 7622 (XMPP Address Format). Since the PunyCode issue was never critical to me or anybody else, I didn't backport it.

I also think Jids should go over the wire in Unicode and parsing from PunyCode is a only a bonus.

Unfortunately I don't plan to release 0.8.0 in the near future. How critical is that issue for you? For testing purposes I've uploaded a new snapshot and added a new unit test (2b64c12) with your sample.
- 2018-03-09T19:51:38+00:00
Daniel Gultsch reporter
Hi,

thanks for getting back to me. Not sure how I missed the fact (again) that this is already fixed. (I looked through recent commit in the master branch… Anyway…)

Well let’s say it is important enough for me that I would have considered fixing this myself and temporarily pulling the full source into my project until this is fixed upstream. While there might only be one or two Conversations users with unicode domains out there I don’t want to break it for them. Glad I don't have to. It's working fine with the snapshot. Assuming that there won’t be too many changes on xmpp-addr I might just stick with the SNAPSHOT for now (or see if I at least can point it to a specific date/commit)
- 2018-03-09T20:08:05+00:00
Christian Schudt repo owner
Another question: Does it run on Android? I always thought it doesn't due to JAXB. xmpp-addr at least has a XmlJavaAdapter in it and 0.8.0 will make use of Java 8 language features (static interface methods).
- 2018-03-09T20:14:07+00:00
Daniel Gultsch reporter
It’s complicated(TM) Short answer yes. The Android compiler toolkit can compile Java 8 source code. The actual language features work. It is just missing a lot of the standard library stuff like the entire stream API or other standard library features like HashMap.getOrDefault() - those are only available if you target at least Android 7 (which you don't want to do in practice)

proguard will complain that it can't find the Xml annotations but just using the Jid class works fine (as it doesn't expose Java 8 classes to the outside) I suspect we would have a problem if it for example would be returning an Optional<> somewhere. (as this isn't available on Android either)
- 2018-03-09T20:21:22+00:00
Christian Schudt repo owner
Oh ok... good to know. Indeed I was thinking about an Optional<String> getResource(), dumb Android. But I saw they added it in API Level 24.
- 2018-03-09T21:07:30+00:00
Daniel Gultsch reporter
Makes sense; Maybe also for getLocal().

On a somewhat related note note i missed isDomain() - I had this in my own Jid implementation. Worked around that with a simple getLocal() == null, but maybe having this method would be nice.

In any case if you do want to use Optional<> I would in fact appreciate a 0.7.6 with the unicode fixes back ported that I could just use indefinitely. Or until Android >= 7 (API 24) does have a significant market share.
- 2018-03-09T21:18:19+00:00
Christian Schudt repo owner
I never missed a isDomain(), will think about it.
- 2018-03-09T21:26:43+00:00
Christian Schudt repo owner
- changed status to resolved
Add Jid#isDomainJid()

Fixes issue ~~#128~~

→ <<cset 99f6888d5efc>>
- 2018-07-12T12:00:19+00:00
Christian Schudt repo owner
- changed status to closed
Fix version 0.8.0
- 2018-07-17T08:33:49+00:00
Log in to comment

Assignee: –

Type: bug

Priority: major

Status: closed

Version: –

Votes: 0

Watchers: 1