Download: rambop.v2.zip
As a child, Rambo was one of my favourite games. It was a very difficult to finish RPG,
when you didn't know the exact sequence of screens to follow. I liked the game so much
that I bought an original japanese cartridge, and even translated some bits of
the manual. But there was a drawback with this game... the
opening screen!
If the game is all colorful, then why the opening screen was monochrome and ugly?
Certainly the people of Pack-In-Video could have made it better. So
I accepted the challenge to improve it, while still mantaining the
original requirements of the game: MSX-1 with 16kb of RAM, and the entire
game fitting in a 32kb ROM cartridge.
Redrawing the opening screen was a weekend's worth of work, done
with the help of Cyberknight,
and using the Graphos III made by Renato Degiovani.
The real challenge was putting the screen back into the ROM. The original
screen, compressed with RLE, had 2.6kb. So the new screen had to fit into
this same space, and that was very difficult to achieve! In the end I
managed to compress it, using state-of-the-art algorithms.
The final result is available in this page to download. Have fun with it!
The first challenge in Rambo Plus was, of course, to create the new opening screen. Since
the process used to create was very similar to N-Sub, I won't go
into details, focusing instead on the second, and more difficult challenge: the compression.
The new opening screen, compressed by GIF, has 4328 bytes. The original screen used address
AC10h to B768h in the ROM, and there was also some empty space in the end of the ROM after
BD82h (filled with the text 王様の耳はろばの耳, "the king has donkey's ears"). This means
only 3542 bytes could be used, so any compression created must be better than GIF.
That's when trouble begins.
GIF uses LZW, and this algorithm can be proven to
be optimal (when the file size goes to infinity). So how can someone beat an optimal
algorithm? The answer is the mantra of compression: know your data. The only
way to beat a statistical compression scheme is to model your data, extracting
redundancies. If you know a way to throw data away, even better!
So, let's review
what we know about our data. We know the data represents an image, so not only
it has redundancy on the horizontal axis, but it also has it on the vertical.
We know it's a SCREEN 2 image, so each consecutive eight pixels can have only
two colors (out of 15). This means we can split our complex image in three simpler
ones, the first being a monochrome one for the "pattern" data, the other two
images containing, respectively, the colors of the bits "1" and the colors of the bits "0",
these two images having only 1/8 of the horizontal resolution. These three images
can be seen below, I enlarged the color images by 8 on the horizontal axis in order
to have the same aspect ratio for all images.
The "pattern" image is very different from the "color" ones, so they will need different
kinds of compression. Let's review the pattern image. The first feature we note is that the image
has lots of empty spaces, so we should choose a compression method that can handle empty spaces
well. However, before we choose a method, can we improve our data? Can we extract redundancy of
the image? Can we throw data away? The answer, surprisingly, is yes, we actually can use a
lossy data scheme! The trick with lossy schemes is that the final user shouldn't be able to
tell apart the compressed image from the original one. In mp3 this is done by removing
frequencies the human ear can't hear, here in our context, we must change the image in
a way the user can't see the difference.
Then, how to do it? Since we are going to choose a compression focused on efficient storage
of empty spaces, any transformation that increase the number of empty spaces is good for us.
In SCREEN 2, we have two colors for each 8 pixels, if the pixel is "1", we choose the first
color (INK); if the pixel is "0", we choose the second one (PAPER). But what happens if the eight pixels
have the same color? Then either we use all ones and the PAPER color is irrelevant, or we
use all zeros and INK color is irrelevant. But, since we are trying to increase empty spaces,
we can always choose all zeros! If the original image had all ones, we just need swap
INK with PAPER and then paint the pattern with zeros, this is a lossy operation, but
the user won't notice it at all.
There are more lossy operations we can do with the pattern. Suppose a pattern octet is
not all zeros or all ones. It should have two colors then. However, what if the artist
(by mistake or otherwise) used the same color for INK and PAPER? Then it doesn't matter
if the bits in pattern are one or zero! In this case, we can safely throw away the pattern,
and replace it with all zeros. After these operations, the transformed images can be
seen below.
As we can see, the Rambo itself was pretty much already formatted, but we did
increase the empty spaces on the helicopter. Finished the data modeling, now we go
for the compression itself. Just by looking at the image we can see RLE would not
perform very well here. We do have lots of empty space that RLE can compress,
however RLE operates on one axis only, and we need to explore the two-dimensional
connectivity of the empty space. This usually is a task for space-frequency analysis
such as wavelet transforms, but on the Z80 wavelets would be very slow. We can,
however, use quad-trees!
The idea behind the quad-tree is simple. This recursive algorithm start by looking
at the current rectangle (initially, the entire screen). Is the rectangle empty?
If it is, print a "0". If it is not, print a "1", break the rectangle in four,
and repeat the procedure for each smaller rectangle until it's empty, or if it consist
of a single pixel, in this case, print its value. This algorithm can compress
entire regions of empty space to a single bit, so it's quite suited for our needs.
We need some small tweaking however, since our screen isn't a square, we can
simplify the decoder if we break our 256x192 screen into three regions 256x96 each,
and stop the recursion when the rectangle is a 4x1 strip, this way the dimension of
each recursion step can be easily evaluated as x/2 and y/2. In the image below, I painted
each empty rectangle with a lighter shade of red, so you can see the quad-tree in action
as the algorithm partition the space:
The result seems to be good, large areas have dark red, so this means these regions
were compressed by just a few bits. The complete bitstream for this quad-tree has only
2334 bytes, very good. Now we must compress the color images. The empty spaces approach
from the pattern would not work here, instead we can notice that colors in a same
vertical column are usually equal. Can we improve it, by throwing data away or
modifying it in a way the user doesn't notice? Sure! Suppose the artist
painted an octet with blue over red, and the next one with red over blue.
If we invert the pattern, switching ones with zeros, we can also swap
INK with PAPER, and the two octets now have the colors. And exactly which one
we choose, red over blue or blue over red? An easy way to choose is
to pick up a lexicographic order, we can for example impose that INK must
always be greater than PAPER. After sorting the colors in all octets, the
length of the vertical color stripes should be larger.
There's another transformation we can make. Previously we saw that
patterns with eight equal pixels were changed to all zeros and their
colors were set in the PAPER attribute. But what about the INK? We
can choose any color we want for the INK, since it won't be used anyway.
In order to enlarge even more our color stripes, we can choose the INK
to be the same color as the latest octet. Performing the two operations
in the images give us these images:
This is much better! And we must notice that, even after these extensive
changes in the images, the final user won't see any difference. Now we must
choose the encoding method. The large stripes can benefit from RLE, but
there are some patterns where it won't perform well: the alternating colors
in Rambo's hair, for example. These alternating colors would perform very well
in a LZ-variant, however LZ-variants would not perform well for the large
stripes.
The solution is using both RLE and LZ at the same time! We could, for
instance, output a "0" if the next string is RLE-encoded, and output an "1"
if next string is LZ-encoded. Now the trick to make this succeed is to
choose the right amount of bits for each RLE and LZ. Remember RLE has
a number to represent how many times we should repeat the color. If this
number is represented with few bits, very large stripes will need
lots of strings to be represented, and this is not good. If we choose
too many bits, a not so large color stripe will have lots of dummy zeros
on the high bits of the number. In the LZ the same reasoning applies,
but we have two numbers to worry about: how much should we go back on the
string, and how much should we copy from that position.
How can we choose the optimum bits distribution? If we were going
to make a generic compressor, this could be a problem. However, we are
going to compress the Rambo opening screen just one time. In this case,
nothing will be better than a brute-force search on number space! Even
if the brute-force search take some hours running, it doesn't matter,
since this work will be done only one time. After implementing it,
turns out only some minutes were needed to give the result. The magical
numbers were 7 bits for the run-length, 8 bits for the LZ-backpointer,
and 5 bits for the LZ-repetition. The final bitstream, summing both
the pattern and the color, was 3189 bytes, not only small enough to
fit in the original ROM, but also smaller than Info-ZIP in maximum
compression mode, which is very neat!
The final touch is writing a decoder for these bitstreams,
fast enough the user doesn't notice it is being decoded at all,
and small enough to fit in the small area left. This was done
using the Super Optimizer. Whew! Lots
of trouble just to get a color opening screen, but the final
result was worth it.