Unicode dash in MUC JID causes NullPointerException

Issue #75 closed
Andrew Potter created an issue

Someone in my HipChat organization managed to create a chat room with a Unicode dash in its name, and HipChat copied that character into the room's JID as-is. Discovering chat rooms now throws the exception below, apparently because JAXB sets the JID of that Item to null and the ChatRoom constructor requires a non-null JID.

2016-05-17 10:23:58.804  WARN 8613 --- [Listener Thread] rocks.xmpp.util.XmppUtils                : java.lang.NullPointerException
java.lang.NullPointerException: null
    at java.util.Objects.requireNonNull(Objects.java:203) ~[na:1.8.0_73]
    at rocks.xmpp.extensions.muc.ChatRoom.<init>(ChatRoom.java:110) ~[classes/:na]
    at rocks.xmpp.extensions.muc.ChatService.lambda$discoverRooms$3(ChatService.java:75) ~[classes/:na]
    at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602) ~[na:1.8.0_73]
    at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577) ~[na:1.8.0_73]
    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) ~[na:1.8.0_73]
    at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962) ~[na:1.8.0_73]
    at rocks.xmpp.core.session.XmppSession.lambda$sendAndAwait$4(XmppSession.java:711) ~[classes/:na]
    at rocks.xmpp.util.XmppUtils.lambda$notifyEventListeners$2(XmppUtils.java:192) ~[classes/:na]
    at java.util.concurrent.CopyOnWriteArrayList.forEach(CopyOnWriteArrayList.java:890) ~[na:1.8.0_73]
    at java.util.concurrent.CopyOnWriteArraySet.forEach(CopyOnWriteArraySet.java:404) ~[na:1.8.0_73]
    at rocks.xmpp.util.XmppUtils.notifyEventListeners(XmppUtils.java:190) ~[classes/:na]
    at rocks.xmpp.core.session.XmppSession.lambda$handleElement$11(XmppSession.java:998) ~[classes/:na]
    ... 3 common frames omitted
Wrapped by: java.util.concurrent.ExecutionException: java.lang.NullPointerException
    at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) ~[na:1.8.0_73]
    at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895) ~[na:1.8.0_73]
    at rocks.xmpp.util.concurrent.AsyncResult.get(AsyncResult.java:286) ~[classes/:na]
    at com.dealer.discobot.xmpp.BabblerXmppAdapter$3.accept(BabblerXmppAdapter.groovy:72) ~[classes/:na]
    at com.dealer.discobot.xmpp.BabblerXmppAdapter$3.accept(BabblerXmppAdapter.groovy) ~[classes/:na]
    at rocks.xmpp.util.XmppUtils.lambda$notifyEventListeners$2(XmppUtils.java:192) ~[classes/:na]
    at java.util.concurrent.CopyOnWriteArrayList.forEach(CopyOnWriteArrayList.java:890) ~[na:1.8.0_73]
    at java.util.concurrent.CopyOnWriteArraySet.forEach(CopyOnWriteArraySet.java:404) ~[na:1.8.0_73]
    at rocks.xmpp.util.XmppUtils.notifyEventListeners(XmppUtils.java:190) ~[classes/:na]
    at rocks.xmpp.core.session.XmppSession.lambda$handleElement$12(XmppSession.java:1003) ~[classes/:na]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_73]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[na:1.8.0_73]
    at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_73]

I see two different ways forward:

  1. Is there an easy way to get JAXB to accept Unicode characters instead of leaving the whole string null, or is this not a JAXB issue at all?

  2. If (1) isn't an option, I can ask the room owner to delete the room and remake it - AFAIK there's no way to change the JID without doing that. This is not ideal, as they'd lose all room history as well, and it seems there's nothing stopping this from happening again in the future.

Babbler version: 0.6.0/0.7.0-SNAPSHOT

Offending JID example: 99999_contains_both_-_dash_and_–_emdash@conf.hipchat.com
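One way to confirm which character is at fault is to list the non-ASCII code points in the part of the JID before the `@`. The sketch below uses only the JDK; the class and method names are made up for illustration and are not part of Babbler, and the dash in the example JID is written as the escape `\u2013` on the assumption that it is the EN DASH identified later in this thread.

```java
import java.util.ArrayList;
import java.util.List;

public class JidCodePoints {

    /** Describes every non-ASCII code point in the local part (the text before '@'). */
    static List<String> describeNonAscii(String jid) {
        int at = jid.indexOf('@');
        String local = at >= 0 ? jid.substring(0, at) : jid;
        List<String> found = new ArrayList<>();
        local.codePoints()
             .filter(cp -> cp > 0x7F) // keep only code points outside ASCII
             .forEach(cp -> found.add(String.format("U+%04X %s", cp, Character.getName(cp))));
        return found;
    }

    public static void main(String[] args) {
        // Example JID from this issue; the dash is assumed to be U+2013.
        String jid = "99999_contains_both_-_dash_and_\u2013_emdash@conf.hipchat.com";
        describeNonAscii(jid).forEach(System.out::println);
    }
}
```

For the example JID this reports a single entry, `U+2013 EN DASH`; the ASCII hyphen is not listed.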

Comments (8)

  1. Christian Schudt repo owner

    The offending code point is 0x2013 (EN DASH), which is a punctuation character.

    Punctuation characters outside the ASCII range are disallowed in the local part of a JID.

    The other "dash" is actually 0x002D (HYPHEN-MINUS), an ASCII character, which is allowed.

    The underlying specification for this is: https://tools.ietf.org/html/rfc7564#section-4.2

    Parsing the invalid JID string throws an exception, and JAXB then sets the value to null. The correct solution is to fix the JID on the server (your option 2), but I could also imagine some "loose" parsing in the JID class.

  2. Andrew Potter reporter

    That makes sense, thanks for the explanation. It would be nice to have a config option to disregard punctuation in the JID - I don't really care about being able to reference a room with a broken JID, but I would rather not have that break all room discovery.

    I'll ask the room owner to remake the room.

  3. Andrew Potter reporter

    It doesn't look like I have access to the XML in the listener, so that won't work either.

  4. Christian Schudt repo owner

    Parsing JIDs from the XMPP stream should not perform validation and enforcement.

    There are two reasons:

    • A server could use a newer Unicode table, so a JID could be valid on the server while validation fails on our side.
    • A server could simply have a poor or outdated JID implementation, in which case we could not communicate with it at all.

    Therefore simply accept JIDs as-is when unmarshalling.

    Fixes issue #75.

    → <<cset 76072f843a5a>>
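The difference between enforcing and accepting JIDs during unmarshalling can be sketched as follows. This is not Babbler's code and does not reproduce the changeset above; `SimpleJid`, `parse` and `unmarshal` are hypothetical names, and the validity check is a deliberately rough stand-in for the PRECIS rules (it only rejects non-ASCII punctuation in the local part). The `unmarshal` helper imitates the behaviour described in this thread, where a parse failure ends up as null.

```java
import java.util.Objects;

public class LenientJidSketch {

    /** Hypothetical stand-in for a JID; not Babbler's Jid class. */
    static final class SimpleJid {
        final String local;
        final String domain;

        SimpleJid(String local, String domain) {
            this.local = local;
            this.domain = Objects.requireNonNull(domain);
        }

        @Override
        public String toString() {
            return local == null ? domain : local + "@" + domain;
        }
    }

    /** Rough approximation: punctuation outside ASCII is treated as disallowed. */
    static boolean isNonAsciiPunctuation(int cp) {
        if (cp <= 0x7F) {
            return false;
        }
        switch (Character.getType(cp)) {
            case Character.CONNECTOR_PUNCTUATION:
            case Character.DASH_PUNCTUATION:
            case Character.START_PUNCTUATION:
            case Character.END_PUNCTUATION:
            case Character.INITIAL_QUOTE_PUNCTUATION:
            case Character.FINAL_QUOTE_PUNCTUATION:
            case Character.OTHER_PUNCTUATION:
                return true;
            default:
                return false;
        }
    }

    /** Parses "local@domain"; validates the local part only when enforce is true. */
    static SimpleJid parse(String s, boolean enforce) {
        int at = s.indexOf('@');
        String local = at >= 0 ? s.substring(0, at) : null;
        String domain = s.substring(at + 1);
        if (enforce && local != null
                && local.codePoints().anyMatch(LenientJidSketch::isNonAsciiPunctuation)) {
            throw new IllegalArgumentException("Disallowed code point in local part: " + s);
        }
        return new SimpleJid(local, domain);
    }

    /** Imitates an unmarshaller that swallows a parse failure and yields null. */
    static SimpleJid unmarshal(String s, boolean enforce) {
        try {
            return parse(s, enforce);
        } catch (IllegalArgumentException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        String bad = "99999_contains_both_-_dash_and_\u2013_emdash@conf.hipchat.com";
        System.out.println("enforcing: " + unmarshal(bad, true));
        System.out.println("as-is:     " + unmarshal(bad, false));
    }
}
```

With enforcement the room's JID comes back as null, which is what a later `Objects.requireNonNull` would trip over; accepting the string as-is keeps the JID and lets room discovery continue.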
