Friday, February 7, 2020

Disassembling Supercross 3D

I've been playing a bit with my disassembler, mostly fixing bugs, and I've been using Supercross 3D for testing. Looking at the code, I can understand why it runs so slowly. OK, the Jaguar is very slow at texture mapping, but the code could be better.

So far I've seen the following things.
  • The code is about 117 KB (120,016 bytes to be precise) and it's stored at the end of the cartridge.
  • The game is locked to a minimum of 4 vbls per frame on PAL systems and 5 vbls on NTSC ones. This means it will run at a maximum of 12.5 fps (50 Hz / 4) and 12 fps (60 Hz / 5) respectively.
  • There is one block of DSP code; I suppose it's the sound engine.
  • There are eleven blocks of GPU code (maybe one or two more, I haven't finished the disassembly).
  • One of the GPU blocks is used just to set the Object Processor List Pointer. This one is never loaded into the GPU's internal RAM; it runs from ROM.
  • There are about 20 KB (21,184 bytes) of dead code or unused data. It's spread around the code and most of it ends with $4E75 (the rts opcode), but it's never referenced or called.
  • Short branches are almost never used.
  • It waits for the blitter to be idle in several places, but IMO if you are using the 68000 you don't need to wait because it has lower priority (68000 < blitter), so if the blitter is busy the 68000 will be stopped. The only advantage of not having a cache.
  • There are some link/unlk opcodes. Some routines also push values onto the stack, jump somewhere, load the values from the stack into registers and jump again to do the actual work. I think some parts are written in C and others in assembler, and these routines are used to jump from C to ASM.
  • Some parts of the game depend on whether the system is PAL or NTSC, but it reads the hardware register every time it needs to know instead of using a flag.
  • The game runs in 8-bit mode with colors in CrY format (not 100% sure).
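
On the PAL/NTSC point, the flag approach could look like the sketch below. It assumes the video type is bit 4 of the register at $F14002 (CONFIG and VIDTYPE in Atari's jaguar.inc, set on NTSC and clear on PAL), so check that against your own headers. The names is_ntsc, init_video_flag and pal_path are hypothetical, they are not taken from the game.

```
CONFIG          equ     $F14002         ; assumed: joystick/config register
VIDTYPE         equ     $10             ; assumed: bit 4, 1 = NTSC, 0 = PAL

is_ntsc:        ds.b    1               ; hypothetical flag byte, must live in RAM
                even

; Run once at startup: read the hardware a single time and keep the result.
init_video_flag:
                move.w  CONFIG,d0
                andi.w  #VIDTYPE,d0     ; Z set on PAL, clear on NTSC
                sne     is_ntsc         ; is_ntsc = $FF on NTSC, $00 on PAL
                rts

; Fragment for any later check: test the cached byte, not the register.
                tst.b   is_ntsc
                beq     pal_path        ; hypothetical label for the PAL case
```

The later checks then cost one tst.b instead of a register read plus a mask.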

And now some code snippets. All of them are actual code from the game (it's full of things like these).
move.w (a0),d0
addq.w #1,d0
move.w d0,a3
move.w a3,-(sp)
jsr l01e3e0e
At least it uses a quick add. I think this is used to increment the lap count and print it.
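
For comparison, here is a shorter sequence that pushes the same word. It is only a sketch and it assumes a3 is not read afterwards, either by the caller or by l01e3e0e, which can't be told from this snippet alone.

```
        move.w  (a0),d0         ; load the current value
        addq.w  #1,d0           ; increment it
        move.w  d0,-(sp)        ; push it directly, no detour through a3
        jsr     l01e3e0e
```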

move.w #0,l01b72d8
move.w #0,l01b72da
move.w #0,l01b72dc
move.w #0,l01b72de
move.w #0,l01b72e0
...

What about using a data register and post-increment addressing?
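
Something like the sketch below, assuming the words really are consecutive in memory (the addresses above go up by 2 each time) and that d0, d1 and a0 are free at that point. COUNT is a placeholder for the number of words, since the run above is cut short, and the local label syntax depends on the assembler.

```
        lea     l01b72d8,a0     ; first word of the block
        moveq   #0,d0           ; keep the zero in a data register
        moveq   #COUNT-1,d1     ; COUNT is hypothetical; dbra loops d1+1 times
.clear: move.w  d0,(a0)+        ; store zero, then a0 advances by 2
        dbra    d1,.clear
```

For a short run, a handful of unrolled move.w d0,(a0)+ lines would also do and would avoid the loop overhead.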

move.l a1,-(sp)
move.l #l01ece80,d3
move.l d3,a1
jsr (a1)
Because jsr l01ece80 would be too easy.


By the way, I found two bugs in my assembler while looking at the disassembled code to write this post.