"Not enough room even after compaction" issue

Issue #513 resolved
Sean shao created an issue

While participating in a hardware programming competition using the xgo robot dog, we wrote a program that was too large to be executed correctly.

“Downloaded 116 functions to board (9286 msecs) 
emModule.js:18 Compacted RAM code store (0 msecs)
emModule.js:18 45832 bytes used (99%) of 46080 
emModule.js:18 Not enough room even after compaction 
emModule.js:18 Memory cleared ”

What optimisations can solve this problem, either by optimising the code or by allocating more space when compiling the VM?
The xgo libraries and the HuskyLens library will be used in this program. The chip is ESP32.

Another problem: when the program is too large, it doesn't execute correctly, but the IDE gives no hint unless the log is viewed in the browser console.

Comments (30)

  1. John Maloney repo owner

    On ESP32, the MicroBlocks interpreter executes code from RAM, which is a very limited resource. The code space is only 45k and you are hitting that limit.

    We are working on a way to pack more code into the available space. That will enable users to work with larger programs.

    Unfortunately, the solution we're working on involves a complete redesign of the byte-codes (and the interpreter, compiler, and decompiler). That's a big effort, so we do not plan to launch it until some time this summer at the earliest.

    Meanwhile, things you can do to avoid this issue are: (1) remove any libraries you are not using, (2) consider creating more streamlined versions of large libraries, and (3) remove unused project code and/or refactor the code to be smaller.

    Which libraries is the project using (besides the xgo and HuskyLens)? If you share the project, either here or on Discord, I can take a look to see what I'd recommend.

  2. John Maloney repo owner

    Yes, PSRAM could be helpful -- with some work.

    Unfortunately, to use PSRAM you need to configure it at compile time (see https://thingpulse.com/esp32-how-to-use-psram/), which means we'd need to distribute a different version of MicroBlocks specifically for ESP32 boards with PSRAM. (There are also some other things that would need to be changed to make use of PSRAM for the code store.) At the moment, it won't help to run MicroBlocks on an ESP32 board that has PSRAM, since the PSRAM won't be usable without those code changes and compile-time configuration.

    The ESP32-S3 and ESP32-C3 both have more RAM. MicroBlocks allocates 60k for the code store on those machines.

    A short-term option that might be worth considering is a "remove unused functions" menu command. I suspect that many of the library functions are not actually used by a given program, so removing unused functions could free enough space for the code that is being used. In fact, it might be possible to do that pruning automatically. I'd love to analyze the program that failed to load to better understand how much it would help to prune unused functions.

  3. John Maloney repo owner

    Here's a quick analysis of where the space is going.

    Original project:
      HuskyLens: 9644 bytes 
      XGO-nano: 19300 bytes 
      project scripts: 14952 bytes 
        Total: 43896 bytes
    
    Project with Pilot libraries:
      HuskyLens: 10016 bytes 
      XGO-nano: 17108 bytes 
      project scripts: 14952 bytes 
        Total: 42076 bytes
    
    Original project without decompiler metadata:
      HuskyLens: 7304 bytes 
      XGO-nano: 14140 bytes 
      project scripts: 12296 bytes 
        Total: 33740 bytes (~80.2% of the 42076-byte Pilot-library total; ~76.9% of the original 43896 bytes)
    

    So the version of the XGO-nano library in the project is a bit bigger than the one in the pilot release, and the Husky Lens library is a bit smaller.

    Eliminating the metadata would save 20%, which would buy some breathing room, but you would not be able to decompile the project (i.e. use the "open project from board" feature). It would also prevent using the "call" block to call functions by name (since the function name is part of the metadata).

    I'd also like to run an analysis to see how many library functions are unreachable code -- i.e. not called by any script in the main project. But that analysis will take a bit of work since it needs to walk the call graph to figure out what's not reachable.
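
    A minimal sketch of what such a call-graph walk could look like, written in Python purely for illustration (the actual MicroBlocks tooling is not shown here). The function names and the dictionary-based call graph are hypothetical. Note that functions invoked by name through the "call" block would not appear in a static call graph, so a real pruning tool would have to treat those conservatively.

```python
from collections import deque

def reachable_functions(call_graph, roots):
    """Breadth-first walk of the call graph starting from the project scripts."""
    seen = set(roots)
    queue = deque(roots)
    while queue:
        caller = queue.popleft()
        for callee in call_graph.get(caller, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen

def unused_functions(call_graph, library_functions, roots):
    """Library functions that no project script can reach, in sorted order."""
    return sorted(set(library_functions) - reachable_functions(call_graph, roots))

# Hypothetical project: one script calling into two libraries.
call_graph = {
    "main script": ["xgo_walk", "husky_request"],
    "xgo_walk": ["_xgo_send"],
    "xgo_dance": ["_xgo_send"],
    "husky_request": [],
    "_xgo_send": [],
}
library = ["xgo_walk", "xgo_dance", "_xgo_send", "husky_request", "husky_learn"]

# The functions listed here are candidates for removal.
print(unused_functions(call_graph, library, ["main script"]))
```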

    I'm heading to Spain tomorrow for a robotics meeting and conference so I won't have much time to work on this until I return in mid-May.

    Meanwhile, you might try making subsets of the XGO and Husky Lens libraries with only the features you need for the project you are working on. Hopefully that will allow you to make some progress.

  4. Pengfei Liu

    Indeed, XGO requires a large amount of code when executing complex tasks, and currently our contestants are troubled by this.

  5. John Maloney repo owner

    I compiled an ESP32 VM with a 50k code store:

    https://microblocks.fun/mbtest/tmp3/vm_esp32_50k_code.bin

    This has enough room to run the project that you shared on Discord. You can install it using the "Install ESP firmware from URL" command in the web app running in a Chrome or Edge browser.

    Unfortunately, I can't increase all ESP32 boards to this size since that does not leave enough RAM for some ESP32 boards with displays (e.g. M5Stack).

    This is a temporary workaround to allow you to continue work on your project. I'm leaving for a 10-day trip today and I won't have time to explore long term solutions until I return.

  6. Pengfei Liu

    Thank you very much, we look forward to the XGO robotic dog driven by microblocks achieving good results in national competitions for elementary and middle school students.

  7. John Maloney repo owner

    I saw that the XGO robotic dog with MicroBlocks won all three categories in the national competition. Wonderful!

    I've streamlined the implementation of the XGO-nano library. The old version was about 17 kbytes of compiled code; the new version is 12616 bytes of compiled code. The main change was to create two functions, _XGO_sendCmd and _XGO_sendRequest, and to use those everywhere that commands were being sent.
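
    The idea behind that change can be sketched as follows, in Python for illustration only. The frame layout, command type, and register addresses below are assumptions made up for the example, not taken from the XGO-nano library; the point is that each command shrinks to a one-line call once the frame-building logic lives in a single shared helper.

```python
sent_frames = []  # stands in for the serial port in this sketch

def build_frame(command_type, address, payload):
    """Build one command frame (hypothetical layout, for illustration only)."""
    body = bytes([len(payload) + 8, command_type, address]) + bytes(payload)
    checksum = (~sum(body)) & 0xFF
    return b"\x55\x00" + body + bytes([checksum]) + b"\x00\xAA"

def send_cmd(address, *values):
    """Shared helper: every command goes through this one function."""
    sent_frames.append(build_frame(0x01, address, values))

# With the helper in place, each library command is a single call.
def set_speed(speed):
    send_cmd(0x30, speed)      # hypothetical register address

def set_height(height):
    send_cmd(0x35, height)     # hypothetical register address

set_speed(128)
set_height(100)
print(len(sent_frames), "frames queued")
```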

    I also created an XGO Lite library that includes only the functions I believe will be most useful to beginners. That library also changes some command names and arguments. It uses a different prefix from the XGO-nano library (lower-case "xgo_") so you can load both libraries to compare them. The XGO Lite library is only 5820 bytes of compiled code so it loads fast.

    These changes will be in the next pilot release. I'm traveling now but hope to create that pilot release in the next week.

    The XGO dog made a big hit at the Robolot conference in Olot, Spain. There was a "live coding" performance and Kathy made the XGO dance in time to the music (triggered by microphone input).

  8. John Maloney repo owner

    Unfortunately, I've already increased the code space as much as possible on the ESP32.

    If you are using the XGO library, try the XGO Lite version of that library, which uses only about one third of the RAM space of the XGO-nano library. In fact, I plan to remove the XGO-nano library soon since XGO Lite supports the same features more efficiently. Let me know if you have questions about porting code from the XGO-nano to the XGO Lite library.

  9. John Maloney repo owner

    For performance, MicroBlocks keeps the user's code instructions in RAM on the ESP32. That's done because the ESP32 does not have on-chip Flash memory, and reading from the external SPI Flash chip is slower than reading from RAM.

    But RAM is a limited resource, which limits the maximum program size.

    As an experiment, I created a modified version of the MicroBlocks firmware that executes code directly from Flash. Although the performance is better than I expected, it is still only about half the speed of executing code from RAM.

    This design would allow us to increase the maximum program size by a factor of two or more, but I'm not sure it is worth giving up nearly a factor of two in performance. I think it is better to make our libraries as memory efficient as possible. As an example, the XGO Lite library is only about one third the size of the XGO-nano library with the same functionality.

    Another thing that will help: we are reworking the instruction set design to use 16-bit (vs. 32-bit) instructions. That will allow much larger programs to fit into the same amount of RAM.

  10. Pengfei Liu

    Can I test the firmware that executes code from Flash? The national-level competition is about to start, and many teams are facing the issue of program size again.

  11. John Maloney repo owner

    That code is NOT well tested. I stopped working on it after seeing how slow it was. However, if you would like to try it you can build the firmware with:

    pio run -e esp32-flashCodeStore

    If teams are using the XGO nano library I strongly recommend trying the XGO Lite library. It is about one third the size which leaves much more room for other libraries and user code.

  12. John Maloney repo owner

    By "slow", I mean that program execution is about 5x slower. That may be a worse issue than the limited code space.

  13. Pengfei Liu

    Since the library has been continuously used, we have not yet allowed users to use the lite library. We will start the library merger work after the competition at the end of this month. Before that, we want to use the firmware running on flash to solve this problem.

  14. John Maloney repo owner

    It may be worth trying the firmware that executes code from Flash. However, keep in mind that there are risks associated with it.

    First, that code is not well tested and there may be bugs or unexpected interactions with other functions (e.g. WiFi and Bluetooth). I'm attending a conference at MIT then teaching a week-long class for teachers in the next month, so I won't be able to help debug any problems that might arise.

    Second, the MicroBlocks code executes much more slowly when running from Flash due to the much higher cost of reading data from the external SPI Flash chip. The decrease in speed may cause code that has already been written by the teams to behave differently or perhaps fail.

    Of course, running out of program memory is also a failure!

    How many teams are hitting the program memory limit?

    There might be another way to increase the code store size that is lower risk. Let me give that some thought...

  15. Pengfei Liu

    I am aware of the potential risks, but compared to the lack of space, the teams participating in the competition may be willing to try. It's okay, if possible, I would like to give them more options. Currently, about 30 teams are using the XGO+microblocks solution in this competition.

  16. John Maloney repo owner

    I found a way to expand the RAM code storage space by 8k, from 48k to 56k. That may not sound like a lot, but because the libraries consume about 30k, adding 8k is the equivalent of going from 18k to 26k of room for user code.
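
    The arithmetic above, as a short Python sketch. The 30k library footprint is the rough figure quoted in this comment, not a measured value.

```python
def user_code_room(code_store_kb, library_kb=30):
    """Space left for user code after the libraries are loaded, in kbytes."""
    return code_store_kb - library_kb

before = user_code_room(48)   # current code store
after = user_code_room(56)    # code store expanded by 8k
gain = (after - before) / before

print(f"{before}k -> {after}k of user code space (+{gain:.0%})")
```

    So an 8k increase in the code store is roughly a 44% increase in the space available to user code.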

    If you are interested, I could include this modification in a pilot release by the end of this week. As with any change, there could be some unforeseen complications, but this change is lower risk than the experimental Flash memory version.

    What do you think?

  17. John Maloney repo owner

    One more point -- if you don't need WiFi you could increase the code store to 80k. BLE still works with that memory allocation.

  18. Pengfei Liu

    I understand that a slight increase in space could solve their problem. If the risk is low, I think we can allow the contestants to try it. Otherwise, compressing code during the competition would be a huge headache.

  19. Pengfei Liu

    Wow, this is really exciting news. 80k of space should definitely solve the problem, unless they want the robot dogs to go execute missions on Mars.

  20. John Maloney repo owner

    I have even better news.

    After spending most of today working on increasing the size of the RAM code store, I went back and tested the experimental Flash code store code.

    It turns out to be slower than the RAM code store ONLY when WiFi is running! Even then, the slowdown is only about 1.6x, not the 5x that I remembered from earlier tests.

    I need to do more testing, but I am inclined to make the Flash code store the default in the next pilot release. That will provide an 80k code store (vs. the current 48k one). It appears that there is no speed penalty if you never use WiFi. However, as soon as you start WiFi, it becomes 1.6 times slower, even if you are not actively using WiFi.

    More good news: in my tests so far I have not seen any slowdown of execution speed connecting to the board with BLE.

    Part of me fears that this is too good to be true. Maybe I just haven't encountered the worst case performance in my testing. I'll do some more testing tomorrow...

    Hoping this will work!

  21. Pengfei Liu

    Thank you very much, we are very much looking forward to the 80k space allowing the robot dog to complete tasks freely and effortlessly.

  22. John Maloney repo owner

    I just pushed this change to the pilot release (v1.2.85). You will need to update the firmware on your ESP board to test it, of course. With "advanced blocks" enabled you can use the "compact code store" command in the gear menu to see how much space is used of the 80k available.

    It seems to perform well -- both execution speed and code download speed are similar to previous versions.

    I've tested this on several ESP32 boards, but let me know if you find any problems. I have not tested it on an ESP32-S3 (which is not a fully supported board), only original ESP32 boards.

  23. Pengfei Liu

    That's fantastic; I've already passed on the news to all the teams participating in the national competition.
