Possible serial port issue with v2.2 micro:bits on Mac OS

Issue #279 resolved
Wenjie Wu created an issue

I encountered this problem when using a micro:bit (v2). It cannot be reproduced reliably, but it can be triggered by plugging and unplugging the micro:bit several times or by refreshing the web page. The block that fails also varies; I have seen the problem in multiple LED Display blocks.

Comments (30)

  1. John Maloney repo owner

    Since those error messages don't make sense, it sounds as though the code on your board is getting corrupted in some way. It is remotely possible that the Flash memory on your micro:bit is faulty, although I have only encountered that problem once in about five years of using micro:bits, and that was on a board I got from Sue Sentence, an early micro:bit educator who had used the board extensively for demos.

    More likely there is a bug in the MicroBlocks code that compacts Flash memory. Since Flash memory is "write once", MicroBlocks always appends new code to the end of the unused portion of Flash memory. When there is not enough unused Flash left, it erases the entire portion of Flash used for MicroBlocks code and starts over. It's a bit more complicated since there are actually two "half spaces" that are used alternately, but that's the basic idea. The symptom you describe, failing only after several program reloads, is consistent with a failure connected to Flash compaction.
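
    As a rough illustration of the append-and-compact idea described above, here is a toy Python model. It is only a sketch: the class name, the sizes, and the choice to copy the latest version of each chunk into the other half-space during compaction are assumptions for illustration, not the actual MicroBlocks VM code.

```python
HALF_SPACE_SIZE = 64  # hypothetical capacity in bytes; real Flash regions are far larger


class CodeStore:
    """Toy append-only code store with two alternating half-spaces."""

    def __init__(self, size=HALF_SPACE_SIZE):
        self.size = size
        self.spaces = [[], []]  # each half-space holds (chunk_id, data) records in append order
        self.active = 0         # index of the half-space currently being appended to
        self.compactions = 0

    def used(self):
        """Bytes consumed in the active half-space, including superseded records."""
        return sum(len(data) for _, data in self.spaces[self.active])

    def live_records(self):
        """Latest version of each chunk; later records supersede earlier ones."""
        latest = {}
        for chunk_id, data in self.spaces[self.active]:
            latest[chunk_id] = data
        return latest

    def store(self, chunk_id, data):
        """Append a new version of a chunk, compacting first if it does not fit."""
        if self.used() + len(data) > self.size:
            self.compact()
        if self.used() + len(data) > self.size:
            raise MemoryError("live code does not fit in one half-space")
        self.spaces[self.active].append((chunk_id, data))

    def compact(self):
        """Copy only the live records to the other half-space, then erase the old one."""
        live = self.live_records()
        other = 1 - self.active
        self.spaces[other] = list(live.items())
        self.spaces[self.active] = []
        self.active = other
        self.compactions += 1
```

    In this model, re-saving the same script repeatedly eventually fills the active half-space and forces a compaction, which matches the "fails only after several reloads" pattern that a compaction bug would produce.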

    If you open the developer's console, you may see messages printed when compaction occurs (if I didn't disable those messages for production). If the failures always occur immediately after a compaction, that would be strong evidence of a problem with compaction.

    Are you working with a large program or one that includes a lot of libraries? Could you send me the .ubp file?

  2. Wenjie Wu reporter

    Thank you for your analysis.

    In order to locate the problem more clearly, I just opened a brand new micro:bit without any program for it, and after flashing in the firmware in MicroBlocks IDE and using LED Display blocks, I encountered the problem:

  3. John Maloney repo owner

    Good test. I think that rules out "defective micro:bit".

    I cannot reproduce these errors in either the browser or Mac OS stand-alone MicroBlocks v1.1.61.

    What OS and version are you running? (It looks like MacOS.)

    Is it possible that you have more than one instance of MicroBlocks running (possibly in a different browser tab) and connected to the board? That can lead to strange behavior. Another possibility is that some other program is attempting to connect to the micro:bit's serial port. If the stand-alone MicroBlocks app is running, it attempts to auto-connect when a board is plugged in, which can conflict with MicroBlocks running in the browser.

    A quick test would be to see if you see the same problem on a different computer, if you have one handy. If the problem is specific to that computer, you could try rebooting, starting Chrome, and making sure that the only window and tab open is MicroBlocks.

    That won't necessarily turn off a background program that is connecting to the serial port, but it will narrow down the possibilities.

  4. Wenjie Wu reporter

    This is my work environment:

    • MacOS 12.3.1 (MacBook Air M1)
    • Chrome 103.0.5060.114

    I closed all the software and restarted the computer. This is how I reproduced the problem:

    https://scratch3-files.just4fun.site/1658369626740111.mp4

    A quick test would be to see if you see the same problem on a different computer, if you have one handy. If the problem is specific to that computer, you could try rebooting, starting Chrome, and making sure that the only window and tab open is MicroBlocks.

    I don't have any other computers around at the moment, but as soon as I get a chance, I'll do so, and give you more information about the test.

    But one piece of information that might be helpful is that this problem was originally brought to my attention by an Elite user who also seems to be using MacOS, and I'm the second person to reproduce this problem on a computer, so it may not be a problem with my computer.

  5. Wenjie Wu reporter

    Since this problem was only discovered recently, I guessed it might have been introduced by recent commits. I rolled back the web IDE to v1.1.27 (with the latest firmware), but the problem is still there. This test may help narrow down the scope of the problem.

    commit fed33863f6769813c8218ef01e9823aedf790703 (HEAD)
    Author: John Maloney <jhmaloney@gmail.com>
    Date:   Sun Feb 6 18:11:16 2022 -0500
    

    The micro:bit boards in question are all from the same recent batch, and I'm looking for older boards to test.

  6. Wenjie Wu reporter

    Hi John, I think I may have located the problem.
    The boards that are currently having problems are from the latest batch (v2.20).
    On v2.00 it is fine.

  7. John Maloney repo owner

    Great detective work in tracking that down! The micro:bit ed foundation changed the serial interface chip between v2.0 and v2.2:

    https://support.microbit.org/support/solutions/articles/19000132336-bbc-micro-bit-v2-interface-processor-change-v2-2-

    It sounds as though that change has broken something. For example, the new chip may not be able to handle large bursts of data at 115200 baud. I don't have a v2.2 micro:bit to test but I will check with the folks at micro:bit to see if this is a known problem and if there is a fix.
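
    For a sense of scale, the arithmetic behind "bursts of data at 115200 baud" can be sketched as below. This assumes standard 8N1 framing (10 bits on the wire per data byte), which is an assumption and not something confirmed in this thread; the function names are illustrative only.

```python
BAUD = 115200       # line rate mentioned above
BITS_PER_BYTE = 10  # assumption: 8N1 framing (1 start + 8 data + 1 stop bit)


def bytes_per_second(baud=BAUD, bits_per_byte=BITS_PER_BYTE):
    """Maximum payload bytes per second at the given line rate."""
    return baud / bits_per_byte


def transfer_time_ms(n_bytes, baud=BAUD):
    """Minimum time in milliseconds to move n_bytes over the serial link."""
    return 1000 * n_bytes / bytes_per_second(baud)


print(bytes_per_second())                # 11520.0
print(round(transfer_time_ms(1000), 1))  # 86.8
```

    Under that assumption, a 1000-byte script takes at least about 87 ms to transmit, so the receiving side has to keep up with roughly 11.5 bytes per millisecond for the whole transfer.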

  8. John Maloney repo owner

    I've reported this problem to the micro:bit ed foundation.

    Another experiment to try would be to revert the firmware to an earlier version:

    https://microblocks.fun/downloads/v1.1.41/vm/

    or even:

    https://microblocks.fun/downloads/v1.1.16/vm/

    To install the firmware, just drag and drop the file "vm_microbit-universal.hex" from a given release onto the MICROBIT drive.

    I will try to find a v2.2 micro:bit to test.

    How urgent is this problem? How many users have reported this problem? Are all the reports from MacOS users? If so, what version of MacOS were they using (if you know)?

  9. Wenjie Wu reporter

    How urgent is this problem? How many users have reported this problem? Are all the reports from MacOS users? If so, what version of MacOS were they using (if you know)?

    The problem is not urgent and currently only affects MacOS users. The computers that have been tested with the problem so far are:

    • MacOS 10.15.7
    • MacOS 12.3.1

    I just asked the Elite staff and they said that most users are using Windows, where v2.2 seems to have no problems.

    I will keep updating here if there is further information

  10. John Maloney repo owner

    Glad it is not impacting too many users. If it is only happening on MacOS this could be due to a bug in the MacOS driver for the new serial support chip.

    I will try to get a micro:bit v2.2.

    What is the simplest way to reproduce this problem? Using the webapp on MacOS, does it fail if you just connect a v2.2 board and click on one of the "LED Display" blocks in the palette?

    If you've got a way to semi-consistently reproduce the problem, you might try to find a way to work around it that we could document and share with Mac users. Here are a few things to try:

    1. Get the system into a state where it is giving those error messages, then click the stop button (which also checks that the CRCs of the scripts on the board match those in the IDE) and try again. If that also fails, select "new" in the file menu and try again. (Hypothesis: The script sync process will fix missing or corrupted scripts on the board.)

    2. Plug in a v2.2 board, but do not connect to it immediately. Wait until the yellow light stops flashing and the MICROBIT drive becomes visible, then wait an additional 20 seconds. Finally, connect MicroBlocks to the board and see if it fails. (Hypothesis: The logic that mounts the board as a USB flash drive is interfering with serial communications.)

    Assuming you do these tests in the browser, it would be good to have the JavaScript console open in case the IDE is printing any warning messages that might provide a clue about the problem.

    There might be some other things we can try, such as updating the micro:bit firmware for the support chip or manually installing an updated Mac OS driver for that USB-serial chip. If so, I will let you know.

    Thanks for reporting this and for isolating the problem.

  11. John Maloney repo owner

    Thanks! Wenjie, I'd expect the 2.2 board to come with the v257 firmware already installed but it couldn't hurt to check the firmware version on your board and install the latest DAPLink firmware if necessary.

    I should have a v2.21 micro:bit in about a week. When I get it, I will try to reproduce and pinpoint this problem.

  12. Wenjie Wu reporter

    @Dariusz Dorożalski Thanks!

    @John Maloney I flashed in the new firmware and it seems that the problem has not been solved

  13. Wenjie Wu reporter

    What is the simplest way to reproduce this problem? Using the webapp on MacOS, does it fail if you just connect a v2.2 board and click on one of the "LED Display" blocks in the palette?

    Yes, it does not require any special steps: plug and unplug the board several times, then connect normally, and the problem appears more than half of the time.

    Thanks for your guidance! If I have further debugging information, I will update it here immediately.

  14. John Maloney repo owner

    I'm sorry (but not surprised) that updating the DAPLink firmware did not solve the problem.

    When the problem occurs -- that is, the board is plugged in and connected, and you are getting those error messages when clicking on blocks -- does it help to click the stop button and retry? Does it help to select "new" from the File menu and retry? (If scripts are getting corrupted those steps may replace the corrupted scripts with clean versions.)

    I think you said earlier that you get errors when clicking on certain blocks in the palette, but you do not get errors if you drag the blocks into the scripting area and click them. Is that consistent? (I'm not sure why those cases would be different, but sometimes behavior that doesn't seem to make sense can be a valuable clue...)

    I'll be teaching a class in a different city all of next week, so I probably won't get the v2.21 micro:bit until Friday or Saturday. I'm glad this doesn't seem to be impacting many users so far. But I'm worried that it could become a problem as more micro:bit v2.20 and v2.21 boards get distributed.

    It's really great that you noticed and reported the problem and discovered that the problem only occurs on 2.20 boards. With luck, we'll find a fix before very many people have v2.2x boards.

  15. Wenjie Wu reporter

    When the problem occurs -- that is, the board is plugged in and connected, and you are getting those error messages when clicking on blocks, does it help to click the stop button and retry?

    Yes, it often works, but soon there will be other blocks that go wrong.

    Does it help to select "new" from the File menu and retry?

    Same as “click the stop button and retry”

    I think you said earlier that you get errors when clicking on certain blocks in the palette, but you do not get errors if you drag the blocks into the scripting area and click them. Is that consistent? (I'm not sure why those cases would be different, but sometimes behavior that doesn't seem to make sense can be a valuable clue...)

    It worked in several experiments, so that I mistakenly thought it was stable, but a counterexample soon appeared, so I removed the conjecture.

    I'll be teaching a class in a different city all of next week

    What is the course about? I have learned so much from you; your course must be wonderful and I am very interested in it!

    It's really great that you noticed and reported the problem and discovered that the problem only occurs on 2.20 boards. With luck, we'll find a fix before very many people have v2.2x boards.

    There was a more worrying situation this morning, one user had this problem on a v2.0 board (macOS 10.15.7) but it hasn't reappeared since, I'm still following up and will sync it here if he can reproduce it.

    Anyway, at the moment it seems to affect only MacOS users, the scope is very small (mostly Elite employees), not serious.

  16. John Maloney repo owner

    What is the course about?

    It's a hands-on micro:bit class for teachers. You wouldn't learn anything new from me in the class, but I expect to learn a lot from the teachers. :-)

    There was a more worrying situation this morning, one user had this problem on a v2.0 board (macOS 10.15.7) but it hasn't reappeared since, I'm still following up and will sync it here if he can reproduce it. Anyway, at the moment it seems to affect only MacOS users, the scope is very small (mostly Elite employees), not serious.

    Yes, that is worrying. Again, when you get error messages that make no sense I suspect the code on the board is either corrupted or not in sync with the code in the IDE. If that happens, the first thing to try is clicking the stop button, since that checks the CRCs of all scripts and updates any that don't match. If that fails, it may help to power-cycle the board (i.e. unplug it) and reconnect. That should reload the entire project onto the board.

    I've sometimes seen corrupted script issues even with a board and project that normally works. If the problem is super rare and can't be reproduced, it might just be a serial data transmission glitch.

    Although rare, I have sometimes seen a bad USB cable cause serial errors. So it can be worth swapping USB cables.

    On Mac laptops, it is also possible that a USB A to C adaptor is unreliable. Until recently I had an old Mac with USB-A ports, so I haven't actually seen a problem with a USB adaptor. In any case, if you are in communication with a user reporting such problems, it's worth asking about the hardware and perhaps changing the USB cable/adaptor to see if that helps.

  17. Wenjie Wu reporter

    I expect to learn a lot from the teachers.

    Users are good teachers :-)

    There was a more worrying situation this morning, one user had this problem on a v2.0 board (macOS 10.15.7)

    I just followed up on the problem and the user said he didn't encounter the problem again all afternoon, and he confirmed that the problem was with v2.0 this morning.

    If the problem is super rare and can't be reproduced, it might just be a serial data transmission glitch.

    I also think it may be an occasional glitch.

    On Mac laptops, it is also possible that a USB A to C adaptor is unreliable. Until recently I had an old Mac with USB-A ports, so I haven't actually seen a problem with a USB adaptor. In any case, if you are in communication with a user reporting such problems, it's worth asking about the hardware and perhaps changing the USB cable/adaptor to see if that helps.

    ok, this could also be a clue, my own computer that had problems used a USB adaptor

  18. John Maloney repo owner

    ok, this could also be a clue, my own computer that had problems used a USB adaptor

    Definitely a clue. During last week's workshop, 2 out of 15 teachers had serial issues on Mac laptops with USB-C adaptors/hubs. In both cases, the problems were solved by switching to a simple USB-C-to-A adaptor (not a hub) or using a direct USB-C to USB-micro cable. After that switch, they had no further serial problems for the remaining 4.5 days of the workshop.

    I'm teaching another workshop next week to a smaller group, 7-8 teachers. If anyone has serial problems I will ask them if I can do some tests with their USB adapter and the micro:bit v2.21 board that I just got from Katie Henry.

    I will also try to reproduce the problem on my laptop today. If I cannot, the problem could be associated with certain Mac USB hubs/expansion ports.

    Meanwhile, if you can find a USB C to USB micro cable you might see if using that solves the problem on your computer.

  19. John Maloney repo owner

    I've been able to reproduce the bug on my MacBook Pro with the v2.21 micro:bit. The problem happens more often with a USB/Ethernet port adapter that I have than with a plain USB-C to USB-micro cable, but I have seen the failure with both.

    I did the same test with my USB/Ethernet port adapter and a micro:bit v2.0 board. I never saw the error in 12 unplug-plug cycles. With the v2.21 board I never went more than five unplug-plug cycles without getting the error. I would tentatively agree with your hypothesis that this problem is specific to the v2.2 and v2.21 micro:bits.
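
    A quick back-of-the-envelope check of how strong that evidence is can be sketched as follows. It assumes each unplug-plug cycle fails independently with a fixed probability, which is a simplification, and the per-cycle failure rates used are illustrative guesses drawn loosely from the observations in this thread, not measured values.

```python
def prob_all_clean(p_fail, cycles):
    """Probability of seeing no failure in `cycles` independent attempts."""
    return (1 - p_fail) ** cycles


# If a board failed on about half of the cycles ("more than half the probability"),
# 12 clean cycles in a row would be very unlikely.
print(prob_all_clean(0.5, 12))            # 0.000244140625

# Even at a milder rate of one failure per five cycles, 12 clean cycles are uncommon.
print(round(prob_all_clean(0.2, 12), 3))  # 0.069
```

    So if the v2.0 board failed at anything like the rate seen on the v2.21 board, 12 clean cycles in a row would be unlikely, which supports (but does not prove) the view that the v2.0 board behaves differently.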

    It is not yet clear exactly what is happening at the low level. Is data sent to the board being corrupted (bytes changed) or truncated (bytes lost)? I will need to do some experiments with simple Arduino programs to try to determine that, but I won't have time to do that until after my next week of teaching.
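
    One way such an experiment could tell the two cases apart is to send a known counting pattern and compare what arrives. The sketch below shows only the comparison logic on the host side; the helper names are hypothetical, and actually sending and receiving the bytes over the serial port is left out.

```python
def make_pattern(n):
    """Test pattern of n bytes counting 0..255 repeatedly, so gaps are easy to spot."""
    return bytes(i % 256 for i in range(n))


def first_difference(sent, received):
    """Index of the first position where the two byte strings diverge, or None."""
    for i, (a, b) in enumerate(zip(sent, received)):
        if a != b:
            return i
    if len(sent) != len(received):
        return min(len(sent), len(received))
    return None


def diagnose(sent, received):
    """Classify a transfer as ok, truncated (bytes lost), corrupted (bytes changed), or extra."""
    where = first_difference(sent, received)
    if where is None:
        return ("ok", None)
    if len(received) < len(sent):
        return ("truncated", where)
    if len(received) == len(sent):
        return ("corrupted", where)
    return ("extra bytes", where)
```

    With a counting pattern, a dropped run shows up as a shorter result whose values jump ahead at the reported index, while a bit error keeps the length and changes a single value.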

    I did stumble across a small change that seems to lower the probability of errors which I will include in the next Pilot release. But it doesn't entirely fix the problem.

    For now, the workaround if you have a v2.2 or v2.21 micro:bit and get strange error messages is to click the stop button and rerun the script until the error goes away. The stop button does a CRC check and resends any scripts with bad CRCs to the board. But I have seen even the resend fail, which is why several clicks of the stop button may be needed. If you open the JavaScript console you'll see warning messages when there are CRC mismatches.
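
    The check-and-resend loop described above can be sketched roughly as follows. This is a minimal illustration, not the IDE's actual code: the function names and callbacks are hypothetical, and `zlib.crc32` stands in for whatever CRC MicroBlocks really uses.

```python
import zlib


def crc(code: bytes) -> int:
    """CRC of a script's bytes (zlib.crc32 is a stand-in for the real checksum)."""
    return zlib.crc32(code) & 0xFFFFFFFF


def mismatched_scripts(ide_scripts, board_crcs):
    """Ids of scripts whose CRC on the board is missing or differs from the IDE's copy."""
    return [sid for sid, code in ide_scripts.items() if board_crcs.get(sid) != crc(code)]


def sync(ide_scripts, read_board_crcs, resend, max_rounds=5):
    """Resend mismatched scripts until all CRCs agree; return the number of resend rounds."""
    rounds = 0
    while True:
        bad = mismatched_scripts(ide_scripts, read_board_crcs())
        if not bad:
            return rounds
        if rounds == max_rounds:
            raise RuntimeError(f"scripts still out of sync: {bad}")
        for sid in bad:
            resend(sid, ide_scripts[sid])  # the resend itself may be corrupted in transit
        rounds += 1
```

    Each pass of the loop corresponds to one click of the stop button: because a resend can itself be damaged, the CRCs are re-read after every round instead of assuming the resend succeeded.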

    I think we'll eventually find a solution, but it will take time...

  20. Wenjie Wu reporter

    The problem happens more often with a USB/Ethernet port adapter that I have than with a plain USB-C to USB-micro cable, but I have seen the failure with both.

    Yes, the problem seems to have several compounding factors. I tried earlier to isolate a single influencing factor, without success. There are many factors that can affect the error probability.

    I did the same test with my USB/Ethernet port adapter and a micro:bit v2.0 board. I never saw the error in 12 unplug-plug cycles.

    I previously thought micro:bit v2.0 board was stable and reliable until a few days ago there was a counter-example (but that user said it only happened once in a day)

  21. Wenjie Wu reporter

    Meanwhile, if you can find a USB C to USB micro cable you might see if using that solves the problem on your computer.

    I don't have such a cable at the moment, once I get one I will test it and report the results

  22. John Maloney repo owner

    I have a theory about this problem.

    MicroBlocks is built on the Arduino framework for the micro:bit and other nRF5x chips. I dug into that code and the serial buffer is only 64 bytes. It's possible that that buffer is sometimes getting overrun by bursts of incoming data. The new KL27 serial support chip introduced in the micro:bit v2.2 may provide incoming serial data in larger bursts than the previous chip did.
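
    To illustrate why burst size could matter even when the total amount of data is the same, here is a toy simulation of a fixed-size receive buffer. It is a sketch under stated assumptions: the drain rate per tick is made up, and the model assumes bytes that arrive when the buffer is full are simply dropped, which may not match the real Arduino serial driver.

```python
def simulate(bursts, capacity=64, drain_per_tick=32):
    """Feed bursts of bytes into a fixed-size buffer; return (delivered, dropped).

    Each tick, one burst arrives first, then the consumer removes up to
    drain_per_tick bytes. Bytes that do not fit in the buffer are lost.
    """
    fill = delivered = dropped = 0
    for burst in bursts:
        accepted = min(burst, capacity - fill)
        dropped += burst - accepted
        fill += accepted
        drained = min(drain_per_tick, fill)
        fill -= drained
        delivered += drained
    return delivered + fill, dropped  # bytes still buffered are read out afterwards


# Same 256 bytes in total, delivered in small steady bursts versus two large bursts.
print(simulate([32] * 8))             # (256, 0)
print(simulate([128, 0, 0, 0] * 2))   # (128, 128)
```

    In this model the average data rate is identical in both runs; only the burst size changes, and that alone is enough to overrun a 64-byte buffer.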

    I'm about to enter another intense week of teaching but when I get a chance I will try increasing the buffer size to see if that makes a difference.

    Meanwhile, the latest Pilot may mitigate the problem slightly (i.e. lower the probability of errors). But I could be just imagining that. :-)

  23. John Maloney repo owner

    Good news! Although I don't fully understand the problem, I think I've found a workaround.

    I've pushed it to the webapp but not yet as a full Pilot release, so the stand-alone apps do not yet have this fix. If things look good with the webapp then I'll update the Pilot release.

    After adding the workaround I have not been able to make it fail and I've done over two dozen unplug-plug cycles. Before the fix it always failed within 5-6 cycles.

    Don't worry about getting a direct USB C-micro cable; just test with your usual USB adaptor and let me know if you see any failures. Fingers crossed!

  24. Wenjie Wu reporter

    After adding the workaround I have not been able to make it fail and I've done over two dozen unplug-plug cycles. Before the fix it always failed within 5-6 cycles.

    It works perfectly 🎉🎉!

    I have plugged and unplugged many times and everything is working fine.

    The bug has been cornered; you will catch it soon.

  25. John Maloney repo owner

    Hooray! Thanks for testing it. I'm glad the fix is working for you.

    When I get a chance, I hope to reproduce the problem with a simple Arduino program so that I can report the issue to micro:bit or the Arduino framework maintainer (or both). Something must have changed between the v2.0 and v2.2 micro:bit, and the change might impact other users of serial communications.
