MQTT not handling binary data very well

Issue #263 resolved
Simon Walters created an issue

Because the library uses string-handling code, binary payloads don’t seem to be read correctly: any 0 byte in the payload is treated as an end-of-payload indicator.

Looking at the GitHub source of the library that I think is being used, I think that if it were switched to the advanced callback, it might then work OK.

I’m going to try again to set myself up to be able to experiment with microblocks source but @Wenjie Wu might be able/willing to try it out?

A test topic is cymplecy/cheerlights/rgb on mqtt.eclipseprojects.io, which carries a 3-byte RGB payload.

Comments (35)

  1. John Maloney repo owner

    It might be easier to experiment with the library in straight Arduino code. If someone can verify that the advanced callback works correctly with binary data then Wenjie or I can modify the primitives to use it. The last MQTT event payload will also need to be a byte array object rather than a string. At the library level we can have two versions of payload for MQTT event, one that returns a byte array and one that returns a string.

    I have two questions for you MQTT experts:

    1. Is it safe to assume that the topic is always a string?
    2. Is there a maximum length for the topic string?

    The primitive code currently allocates a lastMQTTTopic buffer the same length as the payload buffer. However, for large payload sizes (e.g. >256 bytes) that might be a waste of RAM. If there isn’t a limit defined in the spec, is there a practical limit for the topic length?

  2. Wenjie Wu

    @Simon Walser I reproduced the problem you mentioned, and for testing purposes, I posted the code for MQTT pub:

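    import paho.mqtt.client as mqtt

    # paho-mqtt 1.x style constructor; 2.x also expects a callback_api_version argument.
    client = mqtt.Client()
    broker = 'mqtt.eclipseprojects.io'
    client.connect(broker)
    client.loop_start()  # run the network loop in a background thread

    R, G, B = (100, 0, 100)
    payload = bytearray([R, G, B])  # 3 raw bytes; the middle one is 0
    info = client.publish('bytesTest/rgb', payload)
    info.wait_for_publish()  # block until the message has actually been sent

    client.disconnect()
    client.loop_stop()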
    

    Only d (ASCII 100) is received in MicroBlocks. (As @Simon said, a 0 in the payload is treated as an end-of-payload indicator.)
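
    A minimal sketch of why only d arrives, assuming the receive path treats the payload as a NUL-terminated C string. The helper names are hypothetical; they are not MicroBlocks or MQTT library functions. Reading up to the first 0 byte loses everything after it, while passing the length explicitly, as a length-aware callback would, keeps all three bytes.

```python
def read_as_c_string(buffer: bytes) -> bytes:
    """Mimic C string handling: stop at the first 0 byte (hypothetical helper)."""
    end = buffer.find(b"\x00")
    return buffer if end == -1 else buffer[:end]


def read_with_length(buffer: bytes, length: int) -> bytes:
    """Mimic a length-aware callback: trust the reported length, not a terminator."""
    return buffer[:length]


payload = bytes([100, 0, 100])  # R, G, B from the test above

print(read_as_c_string(payload))            # only the first byte survives
print(list(read_with_length(payload, 3)))   # all three bytes survive
```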

    I like the simplicity of string, and the Radio library also doesn't support bytes.

    Without providing bytes blocks, I don't know how to solve this problem in vm. 😥 Sorry @Simon Walser. If you have a good solution, I look forward to your sharing it here; I am happy to participate in testing!

  3. Simon Walters reporter

    “I have two questions for you MQTT experts:”

    I’m not an expert but I have a Twitter friend (Andy Stanford-Clark, who co-invented MQTT) so I’m asking him 🙂

    But I’ve never seen a binary type topic

    Re 2 - not sure about that

  4. John Maloney repo owner

    Thanks for doing that test Wenjie.

    Simon, it is pretty clear that the advanced callback is meant to handle binary data so no need to test unless you want to.

    I will convert both the primitives and the library to handle both byte arrays and strings. Users who only use strings can continue to use strings (i.e. the current API). I will add a new library function to return a binary payload.

    I answered my own questions. Topics are UTF-8 strings and there is a topic length limit: 65,535 bytes, the largest value that fits in the two-byte length field.
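
    As an illustration of where the topic length limit comes from: MQTT frames strings such as topic names as UTF-8 bytes preceded by a two-byte big-endian length, so the largest length that can be encoded is 65,535 bytes. The function name below is made up for this sketch.

```python
import struct

MAX_MQTT_STRING_BYTES = 0xFFFF  # largest value a two-byte length field can hold


def encode_mqtt_string(text: str) -> bytes:
    """Frame text as an MQTT UTF-8 string: 2-byte length, then the bytes."""
    data = text.encode("utf-8")
    if len(data) > MAX_MQTT_STRING_BYTES:
        raise ValueError(
            f"topic is {len(data)} bytes; limit is {MAX_MQTT_STRING_BYTES}"
        )
    return struct.pack("!H", len(data)) + data


encoded = encode_mqtt_string("bytesTest/rgb")
print(encoded[:2].hex(), encoded[2:])
```

    For RAM planning on a microcontroller the practical limit is far smaller than the protocol limit, which is why a separate, modest topic buffer size is a reasonable choice.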

  5. Simon Walters reporter

    @Wenjie Wu

    Could I ask that you don’t publish tests to cymplecy/cheerlights/rgb as I use it in other MQTT applications 🙂

    Can you confirm which MQTT library you have used as the source for Microblocks?

    “Without providing bytes blocks, I don't know how to solve this problem in vm.”

    Hopefully, if you can provide the payload as a byte array, John could provide a method of getting it to the VM

    @John Maloney Re length of topic - maybe we need to provide that as separate parameter in the connect?

  6. John Maloney repo owner

    The Radio functions are compatible with the ones in Makecode, which specifies strings. For super advanced users, MicroBlocks offers the packetSend and packetReceive primitives which allow one to define one’s own packet format, including the use of binary data.

    For all but the most advanced users, the simplicity of the Makecode radio API is fine. I’ve done many workshops with the radio feature and even people who have never coded before are able to understand and experiment with the radio blocks right away.

    Similarly, when introducing MQTT I would stick with string payloads since that makes debugging easier. For example, I’d encode colors as three decimal numbers separated by commas or as six digit hexadecimal strings. Both of those representations use more bytes than necessary, but for most projects that doesn’t matter.

    That said, the ability to handle binary data has a few important use-cases, such as sending hundreds of RGB values to update an animated NeoPixel display, which might be what Simon has in mind. In that case, encoding efficiency is important. Binary data is also needed to efficiently transmit sound and image data.

    Fortunately, in this case we can just add one new library function – perhaps only available in “advanced blocks” mode – to make it possible to work with binary data. Most people can continue to use strings using the current library functions.
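
    To put numbers on the encodings discussed above, here is a small sketch comparing the two string formats with the raw binary payload for one colour. The function names are illustrative only and do not correspond to MicroBlocks blocks.

```python
def encode_decimal(r: int, g: int, b: int) -> bytes:
    """Three decimal numbers separated by commas, e.g. b'100,0,100'."""
    return f"{r},{g},{b}".encode("ascii")


def encode_hex(r: int, g: int, b: int) -> bytes:
    """Six hexadecimal digits, e.g. b'640064'."""
    return f"{r:02x}{g:02x}{b:02x}".encode("ascii")


def encode_binary(r: int, g: int, b: int) -> bytes:
    """Three raw bytes; may contain 0, so it needs binary-safe handling."""
    return bytes([r, g, b])


def decode_decimal(payload: bytes) -> tuple:
    return tuple(int(part) for part in payload.decode("ascii").split(","))


def decode_hex(payload: bytes) -> tuple:
    return tuple(bytes.fromhex(payload.decode("ascii")))


def decode_binary(payload: bytes) -> tuple:
    return tuple(payload)


color = (100, 0, 100)
encoders = [("decimal", encode_decimal), ("hex", encode_hex), ("binary", encode_binary)]
for name, encode in encoders:
    payload = encode(*color)
    print(f"{name:8} {len(payload)} bytes  {payload!r}")
```

    For this colour the payloads are 9, 6, and 3 bytes. The string forms are easy to read in a broker log, while the binary form is the compact one that matters when sending hundreds of pixels.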

  7. John Maloney repo owner

    Re length of topic - maybe we need to provide that as separate parameter in the connect?

    Good idea.

    Many people will be fine with the default topic and payload length of 128 bytes, so they won’t need to provide the optional length parameters.

  8. Simon Walters reporter

    “encoding efficiency is important.”

    That is the reason for having the cheerlights colour in its most compact format

    I use Node-RED to take in the current cheerlights colour and then echo it back out in a lot of different formats to make it easy for client devices to use the data

    I added in the decimal payload last week as the easiest way of getting the colour into Microblocks :)

  9. Wenjie Wu

    John's thinking is incredibly clear and admirable. John can handle this problem better than I can. This problem is currently difficult for me and I look forward to learning from John's solution!

    Could I ask that you don’t publish tests to cymplecy/cheerlights/rgb as I use it in other MQTT applications

    Sure!

  10. John Maloney repo owner

    I’ve modified the MQTT primitives and library to handle binary data. It will be in the next pilot release. I’ve done only very light testing so let me know if you find any problems.

    Has anyone used the AdaFruit MQTT service? (https://learn.adafruit.com/mqtt-adafruit-io-and-you/getting-started-on-adafruit-io) It’s free, provides a private namespace, and has support for viewing and maybe interacting with your MQTT devices through a web browser UI. Looks like a cool way to get started with MQTT and IoT…

  11. Simon Walters reporter

    Last time I looked at adafruit mqtt (year or 2 ago) it wasn’t free but will now check it out again

  12. Simon Walters reporter

    After a quick play around, it’s got quite a lot of limitations (all to do with making sure their broker isn’t overwhelmed)

    So no QoS2 and no retained data and quite a lot of restrictions on connection attempts/data rate

    And a few non-standard Adafruit add-on features (as with all their software products 🙂 )

    But useful if you need a cloud broker that only people with username/key can access

  13. John Maloney repo owner

    Thanks for the info about both the Adafruit and HiveMQ services. Within a given setting (e.g. a classroom or home) I’m guessing it is easy to set up an MQTT server on a Raspberry Pi. The Adafruit or Hive options are useful if you want a server you can access from the wider internet (e.g. from a phone while traveling). Of course, you could also set up your own server on a cloud hosting service such as Digital Ocean or Linode, but that’s more overhead.

  14. Simon Walters reporter

    As long as the data isn’t important, then there are these public ones:

    broker.emqx.io

    mqtt.eclipseprojects.io

    test.mosquitto.org

    broker.xmqtt.net

    “Of course, you could also set up your own server on a cloud hosting service”

    Got one of those as well 🙂

    simplesi.cloud :)

  15. John Maloney repo owner

    I’m thinking about the best ordering for the optional parameters in the connect block.

    How often do you think people use the client ID? It seems best just to leave that parameter empty and let the system assign a unique client ID. How often are the name/password used?

    Finally, how often do you think people will need to change the default payload size?

    I’m thinking a good order would be: name/password, payload size, client ID. Most people probably don’t need any of those optional parameters. But it would be nice if those who needed them did not need to expand the block to show any more parameters than they actually need. People have different needs, so there isn’t an ordering that will work for everyone, but it would be nice to make things convenient for most users.

  16. Simon Walters reporter

    One thing I’ve found missing in the MQTT world is a “pop-up” LAN broker

    And you’ve got me thinking that I should write a broker in GP (to match with the client I wrote back in '17) 🙂

    That would keep me busy for a while

  17. John Maloney repo owner

    Today's pilot release (1.0.35) includes some MQTT improvements, including:

    - binary data support
    - disconnect function (although it does not appear to disconnect from some MQTT brokers)
    - the ability to set the retain flag and QoS level when publishing
    - the ability to set QoS level when subscribing
    - use "say" to report failures but stay silent when operation succeeds
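
    For readers unfamiliar with the retain and QoS options: in MQTT 3.1.1 both travel in the first byte of each PUBLISH packet, alongside the packet type and the DUP bit. The sketch below shows that bit layout only; it is not MicroBlocks code and the function name is made up.

```python
PUBLISH = 3  # MQTT control packet type number for PUBLISH


def publish_header_byte(qos: int = 0, retain: bool = False, dup: bool = False) -> int:
    """Build the first fixed-header byte of a PUBLISH packet.

    Bits 7-4: packet type, bit 3: DUP, bits 2-1: QoS, bit 0: RETAIN.
    """
    if qos not in (0, 1, 2):
        raise ValueError("QoS must be 0, 1, or 2")
    return (PUBLISH << 4) | (int(dup) << 3) | (qos << 1) | int(retain)


print(f"{publish_header_byte():#04x}")                     # QoS 0, not retained
print(f"{publish_header_byte(qos=1, retain=True):#04x}")   # QoS 1, retained
```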

    I did some testing but it could use some more careful testing by a regular user of MQTT...

  18. John Maloney repo owner

    @Simon: thanks for the list of public MQTT brokers. I found both mqtt.eclipseprojects.io and broker.emqx.io fast and reliable. I sometimes could not connect to broker.hivemq.com, broker.xmqtt.net, and test.mosquitto.org, and test.mosquitto.org often drops the connection, presumably when it is heavily loaded.

    I tried the Adafruit broker but I quickly ran into their rate limits. I like their dashboards and I think Adafruit could be useful for slowly changing data, such as room temperature or light levels, but it wouldn't be great for something like animating a string of Neopixels.

  19. Simon Walters reporter

    I did some testing but it could use some more careful testing by a regular user of MQTT...

    The binary works for me :)

    Slight issue is that payload for MQTT event assumes the payload is a string, so it doesn’t deliver the byte array - my workaround is to directly access item 2 of the message

    Also, I couldn’t find any simple method of using say to display a byte array so made a small function up.

    Maybe some extra byte array blocks (or make join list block work on byte arrays) would be nice 🙂

    Retain flag publishing works for me

    One missing option is to be able to specify a Keepalive time in the connection block as some brokers will disconnect clients that use the default 60 secs so we need some method of specifying a different value

  20. Simon Walters reporter

    “ I sometimes could not connect to …”

    All the public brokers are, to some extent, unreliable and none of them have any guarantee of service.

    Over past year - emqx has been most reliable except for a 2 day period when it was down completely.

    eclipse is next in reliability for me but suffered a bad period of reliability for the past two months as it was overloaded with retained topics - the maintainers have altered its behaviour to try and mitigate issues but long term retained topics are liable to be lost

    Test.mosquitto is just completely overwhelmed and dodgy 🙂 but is the most well known one

    Like you, I did like having the simple Adafruit dashboard but the very non-standard behaviour of its broker probably means I’m not going to use it much

  21. John Maloney repo owner

    "Slight issue is that payload for MQTT event assuming payload is a string so doesn’t deliver the byte array - my workaround is to directly access item 2 of the message'

    If you reload the MQTT library you'll get the option to expand the last MQTT event block to specify that you want binary:

    [Screenshot: Screen Shot 2022-03-14 at 8.31.58 AM.png]

    "Maybe some extra byte array blocks (or make join list block work on byte arrays) would be nice"

    We're starting to use byte arrays more (e.g. for serial data) so it would be nice to have a quick way to display them. Your idea is clever. Another option would be a "to list" block that would convert a byte array to a list, which you could then use "join list" on. We could also make the built-in printer show the first N values in the byte array, with ellipses if it was longer than N; that would handle your RGB values nicely.
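
    A rough sketch of the two display ideas above, written in Python rather than MicroBlocks blocks. The function names and the default of 8 values are assumptions made for illustration.

```python
def join_as_list(data: bytes, separator: str = " ") -> str:
    """Convert the byte array to a list of numbers, then join them into text."""
    return separator.join(str(value) for value in list(data))


def preview_bytes(data: bytes, limit: int = 8) -> str:
    """Show the first `limit` values, with an ellipsis if the array is longer."""
    shown = ", ".join(str(value) for value in data[:limit])
    if len(data) > limit:
        return f"({shown}, ...)"
    return f"({shown})"


rgb = bytes([100, 0, 100])
print(join_as_list(rgb))
print(preview_bytes(rgb))
print(preview_bytes(bytes(range(20)), limit=4))
```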

    "One missing option is to be able to specify a Keepalive time in the connection block as some brokers will disconnect clients that use the default 60 secs so we need some method of specifying a different value"

    I figured the client could just include a loop that calls last MQTT event every ten seconds to keep the connection alive, even if they don't subscribe to any events.

    I'm guessing that the public MQTT servers will time out clients after some period of inactivity regardless of the keep-alive requested, but maybe I'm wrong about that.

  22. John Maloney repo owner

    I responded too fast -- I see from your code that you already know about last MQTT event binary.

    When you set the binary flag of that block to true, then payload for MQTT event will return the binary payload (i.e. the second item). But it's fine to manually extract item 2 of the message as you did.

  23. Simon Walters reporter

    “When you set the binary flag of that block to true, then payload for MQTT event will return the binary payload”

    It doesn’t for me in my testing - will test more

    “I figured the client could just include a loop that calls last MQTT event every ten seconds to keep the connection alive, even if they don't subscribe to any events.

    I'm guessing that the public MQTT servers will time out clients after some period of inactivity regardless of the keep-alive requested, but maybe I'm wrong about that.”

    So, my understanding is that broker determines its keepalive time and if it doesn’t detect any activity from a client within that time - it will assume that client has disconnected and will stop sending any further messages. If the client maintains the connection, then the broker should continue to send it any messages on its subscribed topics

    It's one of the fundamental parts of MQTT that makes the whole protocol work

  24. Simon Walters reporter

    The payload reporter looks like this on my setup

    This is my proposal for a modified version

    The join may be superfluous but I just left it in anyway

  25. John Maloney repo owner

    Yes, I think that's superfluous. The "join" would convert a byte array to a string but since the payload is not a byte array it must be a string and the "join" just makes a copy of that string.

    "So, my understanding is that broker determines its keepalive time and if it doesn’t detect any activity from a client within that time - it will assume that client has disconnected and will stop sending any further messages. If the client maintains the connection, then the broker should continue to send it any messages on its subscribed topics"

    Good to know. Calling last MQTT event will call pmqtt_client->loop(), which I believe will keep the connection alive. So it's up to your MicroBlocks code to call last MQTT event often enough to keep the connection alive, even if it doesn't subscribe to any messages. I'm guessing that publishing updates will also keep the connection alive.

    In my testing, some brokers timed out after 16 to 30 seconds, others did not seem to time out the connection for much longer, if at all. They may have a policy of only closing idle connections when necessary based on load.

  26. Simon Walters reporter

    Did some monitoring with a LAN broker on PINGREQ from microblocks - it’s sending them out every 10 seconds so that should keep connections alive.

    Hopefully, it doesn’t cheese off some of the public ones 🙂

  27. John Maloney repo owner

    Good to know.

    Does the MicroBlocks code need to do something to ensure that ping requests are sent or are they sent even when MicroBlocks is idle?

    A ping every ten seconds doesn't seem too aggressive so hopefully that won't be seen as misbehavior by public brokers.

  28. Simon Walters reporter

    Just connecting but not subscribing - client is disconnected within a minute

    Subscribed but nothing else - same behaviour

    Add a loop and PINGREQ are sent every 10 secs
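
    The behaviour observed above can be modelled with a toy simulation, assuming the client only sends PINGREQ from inside its loop call and the broker drops a client after a fixed period of silence. The class, the function, and the 60-second broker timeout are assumptions for illustration; the 10-second ping interval is the one seen in the monitoring above.

```python
class SimulatedClient:
    """Toy model: PINGREQ is sent only from loop(), never in the background."""

    def __init__(self, ping_interval: float):
        self.ping_interval = ping_interval
        self.last_sent = 0.0
        self.pings = []  # times at which a PINGREQ was sent

    def loop(self, now: float) -> None:
        if now - self.last_sent >= self.ping_interval:
            self.pings.append(now)
            self.last_sent = now


def broker_drops_at(activity_times, timeout: float, end: float):
    """Return when the broker would give up on the client, or None if it stays alive."""
    last_seen = 0.0
    for t in sorted(activity_times):
        if t - last_seen > timeout:
            return last_seen + timeout
        last_seen = t
    return last_seen + timeout if end - last_seen > timeout else None


# Connected but never calling loop(): no traffic, so the broker times out.
print(broker_drops_at([], timeout=60, end=120))

# Calling loop() once per simulated second: a ping goes out every 10 seconds.
client = SimulatedClient(ping_interval=10)
for second in range(0, 121):
    client.loop(float(second))
print(client.pings[:3], broker_drops_at(client.pings, timeout=60, end=120))
```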

  29. Simon Walters reporter

    “People have different needs, so there isn’t an ordering that will work for everyone,”

    I missed reading this until now

    You are right, no order works for everyone - luckily in Microblocks - you can name the variadic parameters - I wish we had that in Snap! as well

  30. Simon Walters reporter

    I’d just like to complain that my MQTT driven Cheerlights neopixel strip failed this morning - I’d like a refund please!

    🙂

    Probably a loose connection

    🙂

    Up until then - been working great for at least a week without any issues - excellent work everyone 🙂

  31. John Maloney repo owner

    Ha! The problem is that MicroBlocks MQTT is working too well. You've used your Cheerlights so much they've worn out. :-)

    Glad MQTT is working well.

    I'm going to mark this issue resolved.
