Monday, October 17, 2011

Part Two - In the beginning there was a binary data blob

If you haven't guessed yet, it's going to become more technical. You've been warned.

So there I was on June 19th 2009. A printed copy of Jordan Mechner's PoP source code documentation in one hand, and Virtual II (an excellent Apple II emulator for Mac OS X) running the original Prince of Persia game.

Running the Apple II version of Prince of Persia in the Virtual II emulator


I managed to save a snapshot of the game state at the very beginning of level 1 into a .vii file, and using the memory inspector in Virtual II and a hex editor in tandem I was able to isolate the two 64KB banks of memory in the snapshot file. I saved them as two binary files bank1_0000.bin and bank2_0000.bin.

Isn't it a pretty blob?

Then I took my favorite 6502 disassembler (dxa) and let it process the data. I somehow ran it on bank 2 first, and that was a lucky coincidence, as it turned out that bank 2 contained most of the game's code. So let's just focus on that bank for now. To Apple II programmers, it is known as "Auxiliary Memory".
The unlabeled output of the disassembler can be found here. It's all just unknown code with generic labels. Disassemblers try to identify data regions within the code, but typically they only have limited success. I've already correctly assigned the data regions in this output file, but originally it looked much worse.

Here's the disassembled code of the hex editor window above, starting from $2000:

;----------------------------------
l2000:
            jmp l2015
;----------------------------------
l2003:
            jmp l20e2
;----------------------------------
l2006:
            jmp l201e
;----------------------------------
l2009:
            jmp l2029
;----------------------------------
l200c:
            jmp lfc00
;----------------------------------
l200f:
            jmp lfc00
;----------------------------------
l2012:
            jmp l25aa
;----------------------------------
l2015:
            sta lc009
            jsr l2046
            jmp l20e2
;----------------------------------

As you can see, this makes no sense. Yes, it's a jump table of sorts, but what is the purpose of these subroutines. What do they do? How can we find out?

One approach is to identify memory-mapped IO registers (soft switches in Apple II lingo). There's one right there: $c009. It's used to switch the zero page and stack areas of the 6502 ($0000-$01ff) to auxiliary memory.
Looking at all the code that's using soft switches allows you to find the parts where joysticks, keyboard and other hardware is accessed. This is one way of entering the terra incognito of the game's code.

However, I used a different approach. I started with the immediate goal of finding the data describing the animation frames, because that's what I needed to improve my existing program, which was still based on FreePrince.
So I consulted the source code documentation to see what it has to say about animation data.

One of the most interesting parts I noticed were the two pages titled "Character animation comments" (starting at page 17 of the PDF). They describe 16 bytes of data holding most of the character's state, its position, velocity, health, a pointer into the "sequence table" and its current entry in the "frame list". It mentioned frame #15 as being standing still. Guess what, the 15th frame extracted from the DOS version is the Prince "standing still". I was onto something.

In a different part of the document (on page 15) the sequence table is mentioned again, together with the "frame def list". It was clearly obvious that the frame def list contained information about each of the images used to animate the character. Index into image table, x and y shift, and more.
It seemed like the sequence table was a list of frames with embedded commands. Positive values are frames, negative values are commands. Simple. The commands were listed right there: goto, aboutface, up, down, etc.
It was clear to me that if I would find the frame def list or the sequence table in memory, I could use the Virtual II debugger to find the code accessing it, and then I would be right in the heart of the animation code. But where could they be?

It felt like searching for a needle in a haystack. Surely they can't be that small (frame def list is described to be 1200 bytes in size), but I just couldn't pinpoint them with enough confidence. I thought if the PC version is based on the same animation code, then maybe the same sequence table is used there. I looked at the data files that come with the Prince of Persia for DOS and only found level data. The data file format had been reverse engineered and was clearly documented. The files mostly just contained bitmap data, wav files, midi files and palettes. No sign of a frame list or animation sequence table.
Then it struck me. The data must be in the PRINCE.EXE file itself, not in an external data file. But that meant it was probably just as hard to find. Except if it really was similar to the Apple II table. From reading Jordan's journals it was clear to me that Lance Groody (who programmed the DOS version) was working off of Jordan's code and data. Why would he change the animation sequence table? He would have had no reason to do that.
So in a desperate attempt, I started my copy of Araxis Merge in binary comparison mode, dragged in bank2_0000.bin onto the left pane and PRINCE.EXE onto the right pane and was hoping for a comparison match. Nothing. Too many differences, no direct matches. I removed everything that looked like code and text from the .EXE file to make it easier for Araxis to find a match. Still nothing. I scrolled aimlessly through the files and then I saw it:

Similar data in the Apple II memory (left) and in the PC executable (right). Found it!


I traced the data pattern to its beginning in the Apple II memory dump, and saw that it starts at $2800. It seemed to be larger than 1200 bytes though. I scrolled on to find its end.
The memory layout clearly changed at the $3000 point, so that must be it. At that location I noticed something that looked like a table of addresses:

E9 30 16 31 09 32 43 32 AE 32 CF 32 50 33 C9 33 16 34 79 34 9A 34 C0 34 C6 34 DF 34 6D 35 F9 33 A2 37 91 33 76 33 D5 37 AD 33 17 38 AD 34 DE 33 46 34 7A 32 90 32 F4 34 17 37 0B 37 FB 36 EB 36 D6 36 C1 36 AC 36 92 36 8C 36 70 36 53 36 34 36 15 36 F6 35 C8 32 D2 35 7F 35 B4 35 75 35 12 35 2C 37 20 37 39 38 4C 38 4F 38 44 38 3A 31 89 31 A4 31 B2 31 F2 31 F9 31 D9 31 D4 31 DE 31 E4 31 EB 31 D0 31 6F 31 63 34 CA 31 58 38 29 38 F1 35 53 34 52 31 B8 31 7D 31 1E 31 82 37 90 32 01 32 FA 32 16 33 34 33 E4 30 24 38 98 31 1C 31 22 31 A2 32 37 31 43 37 53 37 74 37 14 39 B5 38 CE 38 E5 38 18 39 27 39 EC 38 ED 39 B9 38 3B 39 6F 33 C4 39 D2 39 DA 39 43 39 3F 39 65 39 69 39 7B 39 BF 39 D6 39

In little-endian format, the addresses $30e9, $3116, $3209, $3243, and so on. They ended at $30e4, suspiciously close to the first entry of $30e9. So a pointer table into a data structure.
I checked the data of the first pointer:

$30e9 : F9 01 01 02 03 04 FB 08 05 FB 03 06 FB 03 07 FB 05 08 FB 01 F2 01 09 FB 02

Negative number, some positive numbers, then a negative number, and so on. Could it be? Did I just look at the sequence table?

What if the sequence table instructions were listed starting at $FF?

My annotations of the sequence instructions in Jordan's document

So lets look at the sequence above. Given my assumption it would translate into:

F9 01    act 01 ; change action code to "running, jumping, ..."
01              ; show frame 1
02              ; show frame 2
03              ; show frame 3
04              ; show frame 4
FB 08    chx 08 ; KIDX := KIDX + 8
05              ; show frame 5
FB 03    chx 08 ; KIDX := KIDX + 3
06              ; show frame 6
FB 03    chx 08 ; KIDX := KIDX + 3
07              ; show frame 7
FB 05    chx 08 ; KIDX := KIDX + 5
08              ; show frame 8
FB 01    chx 08 ; KIDX := KIDX + 1
F2 01           ; was unknown at first, turned out the play the footstep sound
09              ; show frame 9
FB 02    chx 08 ; KIDX := KIDX + 2

and so on...

It made perfect sense. The first few frames in my list of images were part of the running animation. I was looking at the sequence for making the kid run. Although it seemed there were more than the 8 documented instructions used in the sequence table. I had to find out what the others did.
Now I was on fire. I quickly debugged through the code and set strategic memory watch points to find the code using these two tables. I found the sequence table parser first:

;----------------------------------
l4945:
            jsr ld003   ; read a byte from the seq table
            cmp #$fb    ; is it CHX?
            bne l4957   ; no, then go to l4957
l494c:
            jsr ld003   ; read parameter byte of CHX instruction
            jsr ld015   ; perform CHX instruction
            sta l41
            jmp l4945   ; loop back to read next byte
;----------------------------------
l4957:
            cmp #$fa    ; is it CHY?
            bne l4966   ; no, then go to l4966
l495b:
            jsr ld003   ; read parameter byte of CHY instruction
            clc         ; perform CHY instruction
            adc l42
            sta l42
            jmp l4945   ; loop back to read next byte
;----------------------------------
l4966:
            cmp #$fe    ; is it ABOUTFACE?
            bne l4973   ; no, then go to l4973
l496a:
            lda l43     ; perform ABOUTFACE instruction
            eor #$ff
            sta l43
            jmp l4945   ; loop back to read next byte
[and so on for all other commands]

I had the animation system by the balls. Can you guess which part I found next, just by looking at this parsing code? I'll tell you in the next part.

15 comments:

  1. Thank you for delivering this morning's nerdgasm.

    And I'm going to guess sprite positioning and drawing, by following 141-143.

    ReplyDelete
  2. Most excellent disassembly and dissecting (reverse engineering sounds boring anyway).

    Yes me too I am guessing actual drawing code and its gotchas.

    ReplyDelete
  3. I am amazed you can remember all that so clearly after all these months! Wondering if there was any sprite compression mechanism used to save the memory. Probably not, I'd try to interpret a data blob as a hires/multicolour bitmap to look for character looking pixels ;) Like ICU64 and live memory dump like http://play.blog2t.net/flash-as3-c64-intro-previewer

    ReplyDelete
  4. It's not so much that I can remember it, but that I kept detailed notes and a daily work log.

    ReplyDelete
  5. This comment has been removed by the author.

    ReplyDelete
  6. Wow, superb conversion, and the articles are a joy to read so far.

    It's fascinating to take a look at PoP's animation system, as I'm employed as a game engine programmer and I did a lot of animation stuff over the last months, but in a current-gen 3D engine. Looking at the animation system of PoP, there are some interesting similarities with some modern games (mostly 3rd person games) where you don't control your character directly, but instead you control an animation state machine, which in turn controls the movement of your character. That's exactly what PoP does with it's instruction codes in the sequence table.

    That seems way ahead of it's time for me. Was PoP the first game to let the animation data actually drive the character's movement? Or does anybody know a game which did it earlier?

    ReplyDelete
  7. I'll just repeat what MagerValp said "Thank you for delivering this morning's nerdgasm." :)

    This is one of the best decoding stories I read in decades... Looking forward to the rest of story... Thanks man!!!

    ReplyDelete
  8. Very inspiring effort, i'm impressed.

    ReplyDelete
  9. Dude - Kiitos!

    This is, well, quite similar to how I've always started reversing; I'll have a specific thing in mind that I need to isolate; I'll use external references (graphics, mostly - also your copy of Jordan's documentation is really helpful here in removing 'unknowns' and 'false leads') and then I will narrow down until suddenly -BAM- there it is, and then yes, full understanding and quite often awe of how clever it all is put together.

    And yes, this is VERY much convincing me that I need to work on a project that I've wanted to do for some time - mind you, it's not a port, but a conversion; but still. Inspiring.

    ReplyDelete
  10. Thank you for documenting all of this info. I am trying to recreate the game in Javascript and this information was very useful.

    Quick question. Would you happen to have a mapping of the action codes to the offsets?
    for e.g. 'RUNNING' => $30e9 ?

    It would help a lot.
    Thanks,

    ReplyDelete
  11. @Aditya: I only have a very incomplete list. Mostly because I couldn't come up with good names for them. Some of the sequences are very specific to the part of the code that uses them. It's probably easy enough to identify them by searching for all the places in the code that call setSequence.

    ReplyDelete
  12. @mrsid - I've been wading through the code over the last couple of weeks. I've identified a big chunk of the images and animations. I pretty much ended up writing a parser that reads the sequence table and processes the commands.

    1. My running, walking, jumping animations look exactly like the original game.
    However I notice that my climbing related animations look very off. Jordan mentioned something about CTRL & CHx/y for setting image offsets. I know where the CHx/y are. Where do I find the CTRL instructions? I assume they are the reason everything looks off....

    2. I extracted the images from the PC version using PR. However I am using the apple II disassembled code as a reference implement my code. Do the images map the same way on both versions?

    Even knowing which images map to which code would save me a lot of effort... eg.
    // 1 to 14 (x1 to x0E) is running
    // 15 (x0F) is standing
    /16 (x10) to 33 (x21) is jumping

    I still haven't managed to map the images for fighting/swords and the non-kid guard sprites.

    Would appreciate any info/notes that you could share so I don't end up replicating your effort. I've read and reread everything you've shared so far but I'm feeling very stuck now.

    If you could show me a way that I could figure it out on my own, that would really help. Wading through the code with a 6502 reference manual is very painfully slow for me since I've never touched an Apple 2 before. ;)

    Thanks again. :)

    PS: This is one of my other projects. http://www.adityaravishankar.com/projects/games/command-and-conquer/

    ReplyDelete
  13. 1. CTRL is a bad name for one of the main update functions. I'm not sure which one it is exactly. But the math he mentions (KIDX+Fdx, KIDY+Fdy) can be found in updateCharPositionImpl. It first reads from the frame def list to get Fdx, Fdy, etc., then does the calculation:

    lda Fdx ; loads Fdx
    jsr addForwardXOffsetToCharX ; adds to or subtracts from CharX (depending on wether kid faces left or right)
    sec
    sbc #$3a ; subtracts the X coordinate of the left edge of the screen. Check diagram on page 7 of PDF, visible X coordinates start at $3a
    sta CurrentObjImageXLo ; store low-byte of X coordinate
    asl CurrentObjImageXLo ; multiply X coordinate by 2 (convert from 140-pixel based to 280-pixel based)
    rol CurrentObjImageXHi ; does the multiply for the high-byte.

    So CharX is not actually changed here. It adds the Fdx offset for this frame and calculates the final X coordinate in screen pixels.
    You've read the chapter on the coordinate system on page 2, right?

    The same is done for Y coordinates afterwards.
    Depending on your canvas size (probably 320x200 or a multiple of that), you need to convert the coordinate into something more appropriate. On the C64 I used a lookup table for that. Multiply by 320 and divide by 280.

    2. Yes, the order of the images is the same, but on the Apple II they are stored in multiple different image tables (see encoding in Fcheck and Fsword of the frame def list). On the PC it's just one big list, so it's easier to use the frame def list from the PC and not bother with the bit shifting and masking to decode the image number. That code is in getImageAndChTableForCurrentFrame and can be ignored. On the PC Fimage is just the image number.

    I didn't bother with doing separate handling of swords. I've put the sword directly into my kid and guard images. It's mostly a memory optimization and you don't have to do it.

    Guards are done just like the Kid, except that there's a separate frame def list (at $2cb0). See overrideFrameDefListForCharacter. It takes the normal frame number and subtracts $95 to get an index into the guard's frame def list. The logic is a bit tricky there.

    It's not important to know exactly what each frame of the animations is, as long as your code is correct it'll just work. Make sure you know the flow of data. Sequences define the frame, CharX and CharY, then you lookup the frame in the frame def list to get the image number and the Fdx/Fdy offsets. CharX+Fdx defines the coordinate in 140-pixels which then get converted to screen coordinates and used to draw the image.

    If you have any more specific questions, please let me know. It shouldn't be too hard to find my email address on CSDb.

    ReplyDelete
  14. @MrSid ... Thanks so much. This should help a lot... I'll go over the code again with this new information and see how much further I can get :)

    I'll contact you if I get stuck again.

    Wasn't able to find your email on the CSDb site.

    ReplyDelete