An intro for the Atari 2600, released at Revision 2017. 4k ROM, no compression, 128 bytes RAM, no framebuffer.
Code and direction: Kylearan (firstname.lastname@example.org) Music: Glafouk (email@example.com)
"How do you get to Carnegie Hall? Practice, practice, practice. How do you become a very angry person? The answer is the same. Practice, practice, practice." - B.J. Bushman (2001)
After focusing on transitions and flow (TIM1T), hi-res graphics against all limitations (Ascend), and sync (Derivative 2600), this time I wanted to do a size-limited intro. Partly because many people in the 2600 scene think you're not a "true" 2600 programmer unless you have made a 4k production, but also because I wanted to show that an oldschool 4k demo doesn't mean you have to ignore concept and direction.
Whether you like the concept is for you to decide. What follows is a technical write-up about each effect and the size-coding challenges involved.
I use players, missiles and the ball to display the lightning, to get enough colors but also to be able to have branches. A pseudo random number generator is used in the kernel to determine if the direction of the lightning should be reversed, if a new branch should be started, and for each branch in which direction it should go and how long it should be. Since there are a lot of conditionals involved, this kernel is not a strict n-line kernel. Instead, it simply takes as long as it needs to decide and process all things before doing a WSYNC and HMOVE, resulting in a variable 2-4 scanlines before objects are moved. This is no problem as lightning should look chaotic anyway, and the kernel uses INTIM to determine when to stop.
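As a rough illustration of the decision logic described above (in Python, not the demo's actual 6502 kernel), the sketch below lets a small PRNG decide per step whether the bolt reverses, whether a branch starts, and how long and in which direction the branch runs. The 8-bit LFSR, the bit tests, the probabilities and the names `lfsr_step` and `lightning` are all assumptions made for this example.

```python
def lfsr_step(state):
    """Advance an 8-bit Galois LFSR (mask 0xB8, maximal length) by one step."""
    lsb = state & 1
    state >>= 1
    if lsb:
        state ^= 0xB8
    return state

def lightning(rows, seed=0x5A, width=160):
    """Return one (main_x, branch_x or None) pair per step of the bolt."""
    rng = seed
    x, dx = width // 2, 1
    branch_x, branch_dx, branch_left = None, 0, 0
    path = []
    for _ in range(rows):
        rng = lfsr_step(rng)
        if (rng & 0x03) == 0:            # hypothetical odds: reverse the main bolt
            dx = -dx
        x = max(0, min(width - 1, x + dx))
        if branch_left == 0:
            branch_x = None              # branch over: missile is back on the main bolt
            rng = lfsr_step(rng)
            if (rng & 0x07) == 0:        # hypothetical odds: start a new branch
                branch_x = x
                branch_dx = 1 if rng & 0x80 else -1
                branch_left = 2 + ((rng >> 4) & 0x07)   # random branch length
        else:
            branch_x = max(0, min(width - 1, branch_x + 2 * branch_dx))
            branch_left -= 1
        path.append((x, branch_x))
    return path
```

Because everything is relative movement driven by the PRNG, the same seed always yields the same bolt, and a different seed per frame yields a different one.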
One interesting challenge was what to do when a branch (using a missile) ends. Since movement of all objects is determined randomly and happens relative to each object's last position (as per usual on the Atari 2600), I have no idea where everything is for a given line during the kernel. So how should I "recall" a missile object back from a branch to the main lightning if I don't know its position? Computing the new position after each change would be too expensive, not to mention wasting a scanline for repositioning via HMOVE.
That was when my strange habit of reading specifications from front to back, instead of only using them to look things up when needed, paid off. I remembered these odd registers I had never used before, RESMP0/1, which will force the missile objects back to their corresponding player objects - exactly what I needed here. Originally built into the console for supporting the Tank game, they came in handy here as well! It was fun to make use of such an obscure feature.
For displaying the word below the lightning, there is a table of all characters needed and the words are constructed on demand. There's a small "framebuffer" where I scroll in each character from right to left until centered, checking for kerning while doing so. This routine combined with the tables for the words and the characters is actually smaller than storing the playfield graphics for all words directly.
Since I wanted an aggressive, energetic look, I used the famous Cosmic Ark star effect expanded to all five objects as an overlay. Credit for how to do it with five objects goes to Thomas Jentzsch. When he discovered how to do it, I immediately dismissed it as I couldn't imagine a use case (using it as a star field looks very bad in my opinion, as the pattern is way too regular). Turns out my initial reaction was misplaced, as I ended up using it not only here, but in the white noise part as well. :-)
Yes yes I know, a tunnel effect - the most overused effect of the demoscene after the rotating cube. I can almost see your eyes rolling, but hear me out first!
To support the concept of the demo, I wanted to have effects related to the notions of descent and falling - and a tunnel is perfect for that. And on a technical level, as far as I know no tunnel that big (16x16), with such high precision and with multiple colors, has been done on the Atari 2600 before. What makes it so hard is the lack of a framebuffer combined with the weird playfield behavior. For example, the tunnel in KK's Ataventure demo is smaller and only single-colored, so I wanted to beat that - and doing it in a multi-part 4k demo was an even more interesting challenge.
The tunnel is displayed using a mirrored playfield where the right side gets updated at the only cycle where this is possible (45), a quad-sized player overlay and some AND masks applied via the SAX opcode for creative dithering. The atan and distance tables (using 8.2 precision) as well as the two textures are packed into only 512 bytes, and it took a lot of optimization to use these packed tables to construct those weird playfield registers at a minimum of 25fps. Computation is spread over overscan, vblank and the kernel, which is also the reason I have been unable to center the tunnel vertically on the screen - I need the screen real estate for calling the music routine and computing 1/8th of the next frame!
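For readers unfamiliar with table-driven tunnels, here is a generic sketch of the idea in Python. It is not the demo's packed 512-byte layout or its playfield construction: the texture size, the checkerboard texture, the depth formula and the names `build_tables`, `texture` and `render` are assumptions for illustration. Only the 16x16 resolution and the use of precomputed atan and distance tables come from the text.

```python
import math

SIZE = 16  # tunnel resolution from the text (16x16 cells)
TEX = 16   # hypothetical texture size; must be a power of two here

def build_tables(size=SIZE, tex=TEX):
    """Precompute per-cell angle and distance texture coordinates."""
    centre = (size - 1) / 2.0  # lies between cells, so the distance is never 0
    angle, dist = [], []
    for y in range(size):
        arow, drow = [], []
        for x in range(size):
            dx, dy = x - centre, y - centre
            turns = math.atan2(dy, dx) / (2 * math.pi)
            arow.append(math.floor(turns * tex) % tex)
            drow.append(math.floor(2 * tex / math.hypot(dx, dy)) % tex)
        angle.append(arow)
        dist.append(drow)
    return angle, dist

def texture(u, v):
    """Hypothetical 1-bit checkerboard texture."""
    return ((u >> 2) ^ (v >> 2)) & 1

def render(angle, dist, rot, depth, tex=TEX):
    """One frame: per cell, two additions, two masks and a texture lookup."""
    mask = tex - 1
    return [[texture((angle[y][x] + rot) & mask, (dist[y][x] + depth) & mask)
             for x in range(len(angle[y]))]
            for y in range(len(angle))]
```

Animating the tunnel then only means changing `rot` and `depth` per frame; all trigonometry lives in the tables, which is why their size and packing matter so much in a 4k ROM.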
Not much to say here. The original plan was to make a triangle "hole" oscillating back and forth, but due to some brain fart the original concept didn't work (or at least not in 4k). But since I discovered an acceptable background effect by accident (applying a sine-moving color gradient over a scrolling playfield background), I kept the part - it doesn't look very good, but I think it fits the theme and thus works okay here.
The easy way to implement such a two-color-bar parallax effect would have been to make each color 8 pixels wide and then simply use player and missile objects for the scrolling, as that would have been exactly in the range that HMOVE can handle. But I wanted to have wider colors (16 pixels), which made some more complex computation necessary if I didn't want to have black lines between the bars used for repositioning. So what I have to do now as well is to determine if I have to swap colors when a new bar begins, as that allows me to remain in the HMOVE range again for the objects. That took a lot of RAM and some fiddling around.
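One way to read the color-swap trick, sketched in Python under stated assumptions: with 16-pixel bars the pattern repeats every 32 pixels, but shifting it by exactly 16 pixels is the same as swapping the two colors. Any scroll offset can therefore be split into a fine shift inside a 16-step window (written here as -8..+7) plus a swap flag. The function names and the exact window are hypothetical; the real kernel works with relative per-line movement rather than absolute offsets.

```python
BAR = 16  # bar width in pixels, from the text

def color_at(x, offset):
    """Reference: which of the two colors (0 or 1) pixel x shows at a scroll offset."""
    return ((x - offset) // BAR) & 1

def split_offset(offset):
    """Split a scroll offset into a fine shift in -8..+7 and a color-swap flag."""
    k = (offset + 8) // BAR
    fine = offset - BAR * k
    return fine, k & 1

def color_at_split(x, offset):
    """Same picture as color_at, but using only the fine shift plus the swap flag."""
    fine, swap = split_offset(offset)
    return color_at(x, fine) ^ swap
```

For example, an offset of 20 pixels becomes a fine shift of 4 with the colors swapped, so the objects never have to move further than a single HMOVE allows.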
There's actually a small bug in the code that causes some jitter from time to time, but I decided to leave it in there as it fits the theme nicely. :-D
Trivial, even when size-optimized as far as possible.
The display kernel uses SkipDraw for showing the characters and a mirrored playfield for the parallax rays. The challenge here was how many counters I could deploy for determining when to set or clear playfield bits in a two-line kernel in addition to the SkipDraw routine. Turns out it's six, with zero cycles left. :-)
For the distortions increasing in duration over time, I thought about how to use random numbers to do it - but in the end, a simple table with timestamps was smaller and easier to control.
With each playfield pixel being 4 pixels wide, it is surprisingly difficult to implement a good-looking white noise. You only have 5 hi-res objects which cannot be repositioned arbitrarily during the kernel, and besides you wouldn't have time for that when you also want to constantly write new values into the playfield registers in the kernel.
That was when I remembered the Cosmic Ark star effect, which not only allows for more copies of one object per scanline, but which also does the repositioning each scanline for you for free. So all I do is write random values into the playfield registers using an inlined prng algorithm, and the overlaying Cosmic Ark star effect with all five objects hides the low resolution a bit. In the chaos the regular pattern of the effect is also less visible, so it works nicely.
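To show why playfield noise alone looks coarse, the following Python sketch fills the three playfield registers with PRNG output on every scanline and expands them to pixels. The LFSR, its seed and the function name `noise_scanlines` are assumptions, as is the choice of a repeated (non-mirrored) right half; the bit order of PF0, PF1 and PF2 follows the TIA's documented behavior.

```python
def noise_scanlines(lines, seed=0xA7):
    """Return `lines` rows of 160 pixels (0/1) of playfield-only noise."""
    state = seed
    rows = []
    for _ in range(lines):
        regs = []
        for _ in range(3):               # one PRNG step each for PF0, PF1, PF2
            # Inlined 8-bit Galois LFSR step (mask 0xB8), a hypothetical choice.
            lsb = state & 1
            state >>= 1
            if lsb:
                state ^= 0xB8
            regs.append(state)
        pf0, pf1, pf2 = regs
        bits = [(pf0 >> b) & 1 for b in range(4, 8)]        # PF0: D4..D7
        bits += [(pf1 >> b) & 1 for b in range(7, -1, -1)]  # PF1: D7..D0
        bits += [(pf2 >> b) & 1 for b in range(8)]          # PF2: D0..D7
        half = [bit for bit in bits for _ in range(4)]      # each bit is 4 pixels wide
        rows.append(half + half)         # assumption: right half repeats the left
    return rows
```

Every row consists of 4-pixel blocks, which is the low resolution that the overlaid Cosmic Ark objects help to hide.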
When I had all parts roughly working (already programmed with size in mind), I only had 236 bytes left - and that was without any scripting yet, without the music routine and without music data! I had jokingly posted a screenshot of my compiler free ROM output on Facebook and was surprised and happy to receive multiple offers from musicians to try and make music even for this little space! <3 Glafouk had been the first one, and since he has been using my TIATracker several times already, I happily accepted his offer.
After adding scripting and several serious and painful optimization passes, I figured Glafouk could have about 320-350 bytes for the music (including the player routine!), which is ridiculously small - especially considering that I needed two tunes, one short "happy" loop for the Relief part, and a longer, dissonant, aggressive tune for the rest of the demo. Poor Glafouk!
After Derivative 2600, I had spent a very long time developing TIATracker which now paid off, as it allowed quick experiments and turn-around times. Glafouk went totally crazy and sent me no less than 17(!) small "happy" loops and 3 main tunes - you can find his experiments in my public bitbucket repository for this demo, in the "whiteboard/music" folder.
Don't blame Glafouk if you find the music too repetitive and simple - with these limitations forced upon him by me, I'm very happy with the music he delivered. The problem was that I couldn't cut out a full part of the demo to make more room for the music, as all parts were needed for the underlying concept to work.
Optimizing for Size
With so much content in a 4k demo, almost half my development time went into optimizing for size. I don't know how often I read my code front to back, each time scraping off another 5-10 bytes from somewhere... Here are some things I did:
- First, the obvious and most painful measures: I had to cut out features. Originally, the tunnel had a decorative border, the game part had an ammo display and the crosshair moved for each shot, the parallax part showed different symbols each time, the colors of the lightnings and the parallax bars changed, etc.
- I searched for the subroutine I called the most (fine-position an object, called 13 times) and used BRK instructions instead of JSR to call it, saving 13 bytes.
- In a spreadsheet, I listed all data tables I used (around 40) and their beginning and ending bytes. Then I looked for tables that start and end with the same byte sequences, and moved them in the code so they could overlap, taking into account that some of them must not cross a page boundary.
- If possible, I would inline subroutines at least once.
- Where possible, tables were packed together. For example, the characters used for the lightning words are only 5 bits wide, so I used the upper 3 bits for the duration values for each part (the 6 bits of a 9 bit value split into two 3 bit values). Using AND to mask out unwanted bits and combining the two 3 bit values is smaller than a stand-alone table of 32 duration values.
- Similarly, the part indexes which specify which part to call when (3 bits) are encoded in another table with unused bits.
- Turns out the distance table for the tunnel effect also doesn't use the upper two bits, so the two textures went into those.
- The music only uses one percussion instrument, simplifying the player routine. In addition, all start indexes of the patterns are <128, so I could do away with having to construct a pointer to the current pattern and use a simple byte index instead.
- And lots of traditional stuff like manually mirroring the sin table; identifying actions that I do several times (like setting most graphics registers to 0 at the end of a kernel) and making a subroutine out of them; checking if SEC/CLC instructions are really needed, sometimes even accepting occasional errors in the operation as long as they are not really noticeable; re-using immediates (if a color stored into COLUPF has bit #1 set, use it immediately to enable a missile, for example) etc.
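Two of the measures in this list, overlapping tables and sharing bytes between tables, can be sketched in a few lines of Python. This is only an illustration of the idea, not the spreadsheet workflow or the ROM layout actually used; the helper names and the example bytes are made up, and page-boundary constraints are ignored.

```python
def overlap(a, b):
    """Longest k such that the last k bytes of a equal the first k bytes of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a[-k:] == b[:k]:
            return k
    return 0

def place(a, b):
    """Store b directly after a, sharing the overlapping bytes.

    Returns the merged bytes and the offset at which table b starts.
    """
    k = overlap(a, b)
    return a + b[k:], len(a) - k

def pack(glyph5, extra3):
    """Put a 5-bit glyph row and an unrelated 3-bit value into one byte."""
    return (extra3 << 5) | glyph5

def unpack(byte):
    """Recover both values with a mask and a shift (AND / LSR on the 6502)."""
    return byte & 0x1F, byte >> 5
```

With the made-up tables `bytes([1, 2, 3, 0, 0])` and `bytes([0, 0, 7, 8])`, `place` produces a 7-byte block instead of 9 bytes, and the second table starts at offset 3 inside it.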
Several months of optimizing code for size nearly drove me insane, and now I need a break and have a huge itch to unroll loops, waste ROM with look-up tables and such. :-)