http://www.visual6502.org/images/pages/Motorola_68000.html
by Javantea Oct 2-Nov 8, 2017 Batman's Kitchen Meeting Nov 8, 2017 https://www.altsci.com/old_non-x86/ Slides || Talk video Software: old_non-x86-0.6.tar.xz [sig]
Note: This paper was meant to be too verbose to read in one sitting. Watch the videos, read a few bullets, and then read a section when you're ready.
In the realm of computing, there is a clear front-runner for desktop, laptop, and server processors: the x86 architecture. In battery-powered mobile devices such as phones, and in small boards such as the Raspberry Pi, ARM is the clear front-runner because x86 is comparatively inefficient and bloated. ARM supports System-on-Chip (SoC) designs and powerful systems with limited power use. Because of this bias, you can go a long way on just one architecture. If students focus their attention on x86 or ARM reverse engineering and exploitation, they will have a blind spot when they come up against challenges that involve other architectures. In this paper, we try to understand why. Qemu supports the aarch64 alpha arm cris i386 lm32 m68k microblaze microblazeel mips mips64 mips64el mipsel moxie nios2 or1k ppc ppc64 ppcemb s390x sh4 sh4eb sparc sparc64 tricore unicore32 x86_64 xtensa xtensaeb architectures, and MAME supports many more. Some of these architectures are old, and some are so different that it would benefit a person to spend just one hour learning a bit about how these systems work. Another major benefit of learning to program and reverse engineer a different architecture is that you get to interact with hardware in a way that on x86 and ARM only the bootloader, the kernel, and the ring zero hacker are allowed to. In 10 lines of code, 1 second of compiling, and 1 second of emulation, a student can write their first kernel-level code. This will come in handy when a challenge requires them to understand the inner workings of their system. In the environment of a CTF, challenges can use your bias against you!*
A common misconception is that old non-x86 architectures are too difficult to work with. They use different tools and they have significant limitations, but this should not make them significantly harder. In fact, if you know x86, you should be well on your way to becoming architecture agnostic. M68k Assembly was taught at the University of Washington to Physics majors with no computer programming background in 2002. How did I pass that class before I was a hacker if M68k is too difficult to work with?*
* It may mean that I'm talented in assembly level programming, but I think it means that everyone has a chance at learning M68k assembly.
Computers are an easy excuse to not socialize; don't use it! If you're having trouble getting your code to work, ask someone for help. If you can't understand something and you've tried Google, try asking someone you don't know. A person I was talking to at a local bar learned to program 6502 assembly as a teenager.
We want to be able to understand the fundamentals of working with old systems and how they can be used in contemporary systems. This will teach us about the nature of computers and kernel-level programming.
While we're going to be emulating these systems, we intend to model the behavior of an actual system that serves some purpose. If you want a real game or a console, I recommend Pink Gorilla in the U-district or in the Intl-district. They have knowledgeable staff and excellent selection and prices.
Please download these tools and my paper in case you're playing a CTF and need to hack a ROM.
Permalinks: Paper
Software: old_non-x86-0.6.tar.xz [sig]
Pitfall was written in 6507 Assembly for the Atari 2600. The 6507 is a 6502 architecture processor. The system was very limited in graphics and compute, which made the platform far less successful than its competitors, but the amazing games made for it surprised players and set the stage for improvements. A good example of a modern game written for Atari 2600 is Ultra SCSIcide by Joe Grand. It has binaries and source code if you'd like to try reversing it. The source code will tell you how well you did.
The 6502 is often hailed as being one of the easiest architectures to program in assembly. The phenomenon of NES, C64, and Apple II programming in the 21st century provides some evidence for this. I am more apt to think that this is because programming a limited system suits people better than programming a complex system like x86.
Systems that feature 6502 architecture:
                       6502
                         |
      +------+--------+--+--+-------+-------+
      |      |        |     |       |       |
    6510  deco16    6504  6509    n2a03   65c02
      |                                     |
+-----+-----+                            r65c02
|     |     |                               |
6510t 7501 8502                         +---+---+
                                        |       |
                                     65ce02  65sc02
                                        |
                                      4510
https://wiki.nesdev.com/w/index.php/CPU_memory_map
Before we work on how to understand assembly, let's focus on the compiler. C code is a far more human language than assembly in that it can be understood as a set of functions, statements, and variables. Those functions, statements, and variables are widely used in C, C++, C#, Java, Python, JavaScript, PHP, and other easier procedural languages.
CC65 is a C compiler and assembler targeting multiple systems that use 6502 architecture. You can use it to compile code for any 6502 system, supported or not. What is a C compiler? A simple C compiler needs to take any valid C program (see listing 1) and turn it into a valid assembly program for a certain architecture.
Listing 1: C sample
int main() {
    int i;
    char buf[10];
    for(i = 0; i < 10; ++i) {
        buf[i] = i;
    }
    return 0;
}
With compilers it became clear that without a valid C library, you'd be up a creek trying to write your own, so each compiler should either be paired with a libc or it will be difficult to work with. But you don't necessarily need libc to write a cool program, as we'll see later. CC65 comes with a libc for each system. In order to deal with text on a platform that does tiles and sprites and is very limited, they wrote a library that doesn't do graphics perfectly, but does them well enough for debugging and a little bit of showing off. You can use it, but after a while you'll want to load sprites, create a nice scrolling background and so on, or write your own engine. Yes, the NES can do some pretty amazing things. And CC65 does not limit what you can do with it.
How does an 8-bit system handle 32-bit ints? 64-bit ints? Floats?
int i;
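One way to picture the answer for integers: a wide value is split into bytes and the 8-bit additions are chained through the carry, much like the 6502's ADC instruction does. The sketch below is a minimal illustration in portable C, not CC65's actual code generation; the function name add32_via_bytes is made up for this example.

```c
#include <stdint.h>

/* Hypothetical helper: add two 32-bit values using only 8-bit additions
 * plus a carry, the way an 8-bit CPU has to do it. */
uint32_t add32_via_bytes(uint32_t x, uint32_t y)
{
    uint32_t sum = 0;
    unsigned carry = 0;
    int i;

    for (i = 0; i < 4; ++i) {
        unsigned a = (unsigned)((x >> (8 * i)) & 0xFF); /* byte i of x */
        unsigned b = (unsigned)((y >> (8 * i)) & 0xFF); /* byte i of y */
        unsigned t = a + b + carry;       /* one 8-bit add with carry in */

        sum |= (uint32_t)(t & 0xFF) << (8 * i);
        carry = t >> 8;                   /* carry out feeds the next byte */
    }
    return sum;                           /* the final carry is dropped */
}
```

A 64-bit add is the same loop run over eight bytes, which is why wide integers cost so many more instructions on an 8-bit CPU.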
* Use volatile and learn what it does. Trust me.
How does volatile work? The way that you talk to audio or visual hardware in the system is the same way you talk to RAM. That is, you can read and write to actual hardware by address. In 6502 assembly that looks like:
LDA $4000
STA $4000
How does the compiler know what to optimize out and what not to optimize out? A C compiler can optimize anything it wants, as long as the visible behavior of the program stays the same.
a = 43;
a = 42;
What is the value of a? It's 42, so why should the compiler write 43 to a? If a isn't volatile, the compiler won't (assuming it's optimizing correctly). If a is volatile, the compiler will. This makes it possible for you to talk to hardware in a way that uses the time dimension to give complicated sequences of data to a single address. We'll make that very clear as time goes forward. This is why it's much more reasonable to put a sleep into an embedded program than it is to put a sleep in a desktop program.
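To make the time dimension concrete, here is a minimal sketch that runs on a desktop. fake_port is a stand-in variable invented for this example; on real hardware the pointer would hold a fixed address from the system's memory map, and the names send_two and demo_last_value are hypothetical too.

```c
#include <stdint.h>

/* Stand-in for a hardware register so the sketch runs on a desktop.
 * This is an assumption for illustration, not a real device. */
static volatile uint8_t fake_port;

/* Send two bytes to the same address, one after the other. Because the
 * target is volatile, the compiler has to emit both stores, in order. */
void send_two(volatile uint8_t *port, uint8_t first, uint8_t second)
{
    *port = first;   /* a real device would see this value first */
    *port = second;  /* ...and then this one */
}

/* Write 43 then 42 and report what the "register" holds afterwards. */
uint8_t demo_last_value(void)
{
    send_two(&fake_port, 43, 42);
    return fake_port;
}
```

In plain RAM only the last value survives, but a device on the other end of the address would have received both, which is the whole point of marking the pointer volatile.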
What would happen when you decide to turn off libc?
The first thing that would go wrong is printf. If you don't need to print or draw anything, lucky you. The second thing is file access. If you have no files, great. The third thing is user input. If you need to take in input, how do you get it? getch? Joystick? Where is the Joystick? In NES, the joystick can be read using memory-mapped IO. It was nice to have a joystick driver because I couldn't figure out how to read from the joystick in the time allotted. cc65 gives you a simple joy_read function to call. With a C compiler and an assembler, you can write your own libc. The reason that people don't is that it takes time. This becomes a theme. Given weeks of time, you could hack quite a lot of things you've never seen or heard of before, but in the time span of a CTF only the prepared will prevail.
All programming has pitfalls; in writing this paper I ran into many. Here are some:
Why didn't I just copy someone else's code? If you copy something, there is no guarantee that you understand what's going on. This paper is about understanding what's going on.
How do we know what is going on? The first option is to guess what will happen and then test. If your guess is correct, then you are either lucky or you understand the process at some level. In order to prove that you understand the process, you should be able to predict many things and have a significant number turn out to be true. This type of trial and error programming is very widely used from web development to embedded systems. In the heart of programming, there must be some point of trial and error when you break new ground that you and other people have never done before.
It's worthwhile to take a look at where our guesses do not hold up to reality.
Mednafen has a debugger and is nice to use. It appears to be designed to fill in the gap where a piece of code isn't working or you don't know how a piece of code works. The key bindings are quick and you can skip to a memory address quickly. An obvious downside of Mednafen's debugger is its lack of features. You can't set a breakpoint without visiting the address. You can't set a watchpoint. The sub-byte disassembly is clumsy.
MAME has a featureful debugger with file output and a scripting language (Lua). While its interface is not very good, it's possible that the Qt debugger could be coming to Linux, and improvements to MAME are frequent.
MAME's scripting language made significant improvements in a recent version with regard to automated debugging.
Both Mednafen and MAME give you step 2 (emulate it) and 3 (debug it).
Mednafen doesn't allow you to export disassembly, so it's a lot weaker than MAME. MAME is less easy to work with.
Some things you'll probably want to learn about a new architecture:
Radare2 supports a large set of non-x86 architectures. But Radare2 has bugs. It can help you reverse a lot, but only if it works. There are bugs in many old versions, so I heartily recommend compiling from git master or the most recent release.
A list of very useful commands:
# Start radare disassembling a 6502 architecture file.
r2 -e asm.arch=6502 file.nes
# Start radare disassembling a M68k architecture file.
r2 -e asm.arch=m68k file.md
# Start radare disassembling a 8051 architecture file that is encoded in Intel hex format.
r2 -e asm.arch=8051 ihex://harvard1.ihx
# Start radare disassembling a z80 architecture file.
r2 -e asm.arch=z80 file.bin
# Analyze all functions (it sometimes helps to run this twice)
aaa
# Print disassembly of the current function.
pdf
# write disassembly to a file
pd >harvard1.dis
# Visual mode
V
# Switch to the next visual mode
p
# Graph mode
V
# If it complains, define a function
df
# go to the top of anything
g
# go to the bottom of anything
G
# exit
q
You might notice that it's trying to be vim. That means if you're comfortable moving around using hjkl, you can use that.
Examples of 6502 in CTF:
Pwn Adventure Z from CSAW
Compromising a Linux desktop using... 6502 processor opcodes on the NES?!
Hacking Time from CSAW CTF 2015 [.kr]
Juniors CTF 2016 - Joy500 Oldschool NES Rom Write Up
If you look at the source code included with this paper, you'll find a C program compilable with CC65. It uses libraries specific to CC65 but should be portable to other compilers and libc implementations because CC65 is not too far from the C specification. In nes/src/nsf7l.c you see a simple main function where we call init() and then we play music by setting values in the APU in a while loop. Also in the while loop is a call to cprintf which prints characters to the screen. Also found in nes/src/ is hello.s, which is an assembly program that has a similar structure to nsf7l.c. music.s is just a stub, so hello.s will not actually play any music if you get it to compile and run.
Before writing nsf7l.c I wrote a few programs that worked at the 6502 machine code level. By concatenating bytes (using assembly and guides) I was able to create a working nsf file, which is an NES rom that executes 6502 instructions but only plays music. By writing the bytes I was able to bypass both assembler and compiler, providing myself a truly bootstrap experience, though instead of using a keypad to enter hex into memory or something like that, I used a fully functioning desktop computer with Python and gigabytes of RAM -- most of which was unnecessary except for convenience.
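As a rough idea of what writing the bytes means, the sketch below hand-assembles three 6502 instructions into a buffer. It is written in C rather than Python, it is not the script used for the paper, and emit_store is a name invented here. The opcodes are the standard 6502 encodings for LDA immediate (A9), STA absolute (8D), and RTS (60).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: emit "LDA #value / STA addr / RTS" as raw 6502
 * machine code into out, which must have room for 6 bytes.
 * Returns the number of bytes written. */
size_t emit_store(uint8_t *out, uint8_t value, uint16_t addr)
{
    size_t n = 0;

    out[n++] = 0xA9;                    /* LDA #imm */
    out[n++] = value;
    out[n++] = 0x8D;                    /* STA abs */
    out[n++] = (uint8_t)(addr & 0xFF);  /* low byte first: little-endian */
    out[n++] = (uint8_t)(addr >> 8);    /* then the high byte */
    out[n++] = 0x60;                    /* RTS */
    return n;
}
```

Calling emit_store(buf, 0x0F, 0x4015) yields the six bytes A9 0F 8D 15 40 60. A real nsf file also needs a header and player routines around bytes like these, which this sketch leaves out.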
nsf7l.c plays music in 105 lines of C with a tiny libc. Can you do that in Linux, Windows, or OSX? By making x86 and ARM systems more powerful, we have also made it more difficult to write a proof of concept.
The bootstrap experience:
Once you write an assembler in machine code, the next step would be to write a compiler in assembly. You can then use your machine code assembler to turn assembly programs into machine code. You might also consider porting your machine code assembler to assembly, since there might be bugs in your machine code assembler that you can't see because it's just bytes in memory. Once you had a C compiler that could compile even the simplest subset of C, you could write your compiler and assembler in C and gain the benefits from there on. Then of course, you would endeavor to make your C compiler more complete, since having a compiler still wouldn't give you an operating system or a robust file system.
If you look at the function init() in nsf7l.c you can see
APU.status = 0xf;
What does this do? In nes.h, there is a line that describes APU as a struct at address 0x4000.
#define APU (*(struct __apu*)0x4000)
This means that the write to APU.status will write the value 0xf to address 0x4015. At address 0x4015 is the status register of the APU, the chip that controls the audio. By writing to that address, you are controlling a chip. While you may be writing to a memory address, don't expect that every memory address is memory.
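The step from 0x4000 to 0x4015 is just struct layout: the base address plus the offset of the status member. The sketch below uses a simplified, hypothetical struct apu_regs that lumps the 21 registers before status into one array; the real struct __apu in cc65's nes.h names each channel register, so treat the member names here as assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for cc65's struct __apu (layout is an assumption). */
struct apu_regs {
    uint8_t before_status[0x15]; /* 0x4000-0x4014: channel and DMA registers */
    uint8_t status;              /* 0x4015: channel enable bits */
};

#define APU_BASE 0x4000u

/* Address the CPU writes to for APU.status: base plus member offset. */
unsigned apu_status_address(void)
{
    return APU_BASE + (unsigned)offsetof(struct apu_regs, status);
}
```

When reversing, this works in the other direction: a store to 0x4015 in the disassembly is a write to the member at offset 0x15 of whatever lives at 0x4000.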
Here are a handful of answers to the questions we asked at the beginning of this section.
In nes.cfg you find the hardware vectors, which are at the end of the 2nd 8K ROM.
# Hardware Vectors at End of 2nd 8K ROM
ROMV: file = %O, start = $FFFA, size = $0006, fill = yes;
...
VECTORS: load = ROMV, type = rw;
In reset.s, you can see that these values are set to functions:
.segment "VECTORS"
.word nmi ;$fffa vblank nmi
.word start ;$fffc reset
.word irq ;$fffe irq / brk
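When you only have the ROM and not the source, the same three vectors tell you where execution starts. The sketch below reads one vector from a buffer of program ROM, assuming the buffer's last byte is the one mapped at $FFFF. read_vector and the enum names are invented for this example, and stripping any file header (iNES or otherwise) is left to the caller.

```c
#include <stddef.h>
#include <stdint.h>

/* The three 6502 hardware vectors, as laid out in the VECTORS segment. */
enum { VEC_NMI = 0xFFFA, VEC_RESET = 0xFFFC, VEC_IRQ = 0xFFFE };

/* Hypothetical helper. prg holds program ROM whose last byte is mapped
 * at $FFFF (an assumption about how the ROM was loaded). */
uint16_t read_vector(const uint8_t *prg, size_t prg_size, uint16_t vector_addr)
{
    /* Distance from the vector to the end of the address space. */
    size_t off = prg_size - (size_t)(0x10000u - vector_addr);

    /* Vectors are stored little-endian: low byte, then high byte. */
    return (uint16_t)(prg[off] | (prg[off + 1] << 8));
}
```

read_vector(rom, size, VEC_RESET) gives the address of the start label, which is a good first place to point the disassembler.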
Motorola's 68k or 68000 is one of the best CPUs in my opinion. The Genesis alone was enough to convince me to work on this paper. Since it was my favorite, I spent a lot of time on it.
The M68k has a ton of history. I won't get into it but if you take a look at the systems that feature it below and you search the web for current uses of the architecture, you'll see that it plays an important part in the historical computing market.
Systems that feature M68k architecture:
As of the writing of this paper, I found no examples of M68k in CTF.
In src/md/loop2mustestcm.c you find my finest creation of this talk: an almost functional music system with 16 steps that allows you to change the registers of the YM2612. It also draws to the screen with a custom puts function that I had to write in assembly. It has functions like play_note and sleep. If you look closely, these building blocks are what you need to turn C without a good libc into a musical graphical demo! But how?! Let's take a very close look at our code and try to understand the patterns in use.
Features:
puts(const char *data) and sleep(void) in 68k assembly (loop2mustestc.s)
music_init(void), play_note(uint16_t freq), stop_music(void) in C
clearScreen(void), WaitVBlankStart(void), WaitVBlankEnd(void), btohex(char *dst, uint8_t src) in C
volatile uint32_t * const VDP_DATA = (volatile uint32_t *)0x00C00000;
volatile uint32_t * const VDP_CTRL = (volatile uint32_t *)0x00C00004;
volatile uint16_t * const VDP_STATUS = (volatile uint16_t *)0x00C00004;
Remember volatile? Yes, these are 16-bit and 32-bit memory-mapped I/O (MMIO) registers. That means writing to these addresses causes a state change in the video processing unit. Don't think of it like putting characters into VRAM just yet, but instead think of it like writing to a serial port, on the other end of which is a CPU. This will guide you on your journey to M68k hacking. That means when you reverse your MegaDrive (Sega Genesis) code, you're looking for 0xc00000 and 0xc00004. These can be computed or statically shown in assembly, so it's up to your reverse engineering to uncover these addresses! If you're reverse engineering an arcade system, you will probably need to find out what addresses are the video processing unit addresses.
// from https://bigevilcorporation.co.uk/2012/03/23/sega-megadrive-4-hello-world/
//mov.l #0x40000003, VDP_CTRL
*VDP_CTRL = 0x40000003;
//mov.w #0x8F02, VDP_CTRL // Set autoincrement to 2 bytes
*VDP_STATUS = 0x8F02;
//mov.l #0xC0000003, VDP_CTRL // Set up VDP to write to CRAM address 0x0000
*VDP_CTRL = 0xC0000003;
//lea Palette.l, %a0 // Load address of Palette into a0
//mov.l #0x07, %d0 // 32 bytes of data (8 longwords, minus 1 for counter) in palette
for(i = 0; i < 16; ++i) {
    //VDPLoop:
    // mov.l (%a0)+, VDP_DATA // Move data to VDP data port, and increment source address
    *VDP_DATA = Palette32[i];
    // dbra %d0, VDPLoop
}
//mov.w #0x8705, VDP_CTRL // Set background colour to palette 0, colour 0
*VDP_STATUS = 0x8705;
I hope that you can appreciate how incredibly beautiful this code is. Note how the original at big evil corporation is actually written in assembly, the original source code being essentially equal to the disassembly that you get in the first step of reverse engineering. Then you see the equivalent C code. But does it make sense to think of writing to memory mapped I/O as *VDP_CTRL = 0x40000003;? To me, it makes as much sense as mov.l #0x40000003, VDP_CTRL, which is what we're shooting for. What's cool about this is that we're able to take a weird idea in assembly and put that into C with a little bit of insisting.
When you look at the for loop at the bottom of this code snippet, I hope that you understand that this is just sequentially writing data to the data port. Again think of it like a serial port and you're just sending dword after dword over (a dword is a 32-bit unsigned int, which is what VDP_DATA points to -- uint32_t).
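The magic numbers written to the control port are not arbitrary. As far as I understand the Mega Drive VDP's command format, a 32-bit control word carries an access type plus a 16-bit address whose low 14 bits sit in the upper word and whose top 2 bits sit at the bottom, and a 16-bit register write is 0x8000 plus the register number and value. The sketch below rebuilds the constants seen in this paper under that assumption; the macro and function names are invented for illustration.

```c
#include <stdint.h>

/* Access types, as assumed from the constants used in the paper. */
#define VDP_VRAM_WRITE 0x40000000UL
#define VDP_CRAM_WRITE 0xC0000000UL

/* Hypothetical helper: build a 32-bit control word for an address. */
uint32_t vdp_addr_cmd(uint32_t type, uint16_t addr)
{
    return type
         | ((uint32_t)(addr & 0x3FFF) << 16)  /* address bits 13..0 */
         | (uint32_t)(addr >> 14);            /* address bits 15..14 */
}

/* Hypothetical helper: build a 16-bit "set register" word. */
uint16_t vdp_reg_cmd(uint8_t reg, uint8_t value)
{
    return (uint16_t)(0x8000u | ((unsigned)reg << 8) | value);
}
```

Under this reading, 0x40000003 is a VRAM write to address 0xC000, 0x8F02 sets register 0x0F (autoincrement) to 2, and 0x8705 sets register 7 to 5. Decoding constants this way is usually quicker than guessing what a block of VDP writes does.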
Think for a moment how this for loop should look in assembly: mov, dbra. While most people think of M68k as being a very intelligent, low instruction count CISC, dbra does in one instruction what x86 does in 3. Wow M68k, what are you trying to do? Well, because decrement-and-branch is so common, they made it a single operation: dbra (decrement and branch always) decrements the low word of the counter register and branches back unless the result is -1. That is why the counter is loaded with the count minus 1, and why all your for loops can be reversed by decompiling a single instruction. In x86, you have to decompile three instructions. But why do we care? Well, M68k has a lot of these. In order to understand M68k, you'll want to chill out for a bit and accept that there will be a few instructions that have inexplicable additions. You'll look them up in the M68k reference, and then you'll annotate.
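To see why a dbra counter is loaded with one less than the number of iterations, here is a minimal C model of the instruction's loop behavior: run the body, decrement the 16-bit counter, and loop again unless the counter has wrapped to -1 (0xFFFF). The function name dbra_iterations is made up for this sketch.

```c
#include <stdint.h>

/* Hypothetical model: count how many times the body of a
 * "body; dbra d0, loop" sequence runs for a given starting d0. */
unsigned long dbra_iterations(uint16_t start)
{
    uint16_t d0 = start;
    unsigned long runs = 0;

    do {
        ++runs;                   /* the loop body executes */
        d0 = (uint16_t)(d0 - 1);  /* dbra: decrement the low word */
    } while (d0 != 0xFFFF);       /* ...and branch unless it became -1 */

    return runs;
}
```

So a counter of 7 means 8 passes through the body. When you decompile a dbra loop, add one to the immediate to get the real trip count.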
At this point, we've figured out most of what's going on, but let's cement the understanding with a practical example of M68k.
puts("loop2 by Javantea");
newline();
puts("Sept 23 - Oct 28, 2017");
newline();
As you can see, this code is comprised of 4 function calls. A C compiler is allowed to inline as much as it wants, so you might not see this in assembly. So let's look at the disassembly. Objdump works on this, but let's go for Radare2 today.
r2 -e asm.arch=m68k loop2mustestcm.bin
Copyright: SEGA MEGA DRIVE (C)---- 2017.SEP
DomesticName: loop2
OverseasName: l00p2
ProductCode: GM 31337B17-01
Checksum: 0xb481
Peripherials: JD
SramCode: JUE
ModemCode: JUE
CountryCode: JUE
[0x00000200]> aaa
At this point you'll probably go into visual mode with V and then switch through the different modes with p, looking for assembly.
| 0x0000073a 487900001903 pea.l 0x1903.l
| 0x00000740 45f9000005d0 lea.l fcn.000005d0, a2
| 0x00000746 4e92 jsr (a2) ;[2]
| 0x00000748 47f90000047e lea.l 0x47e.l, a3
| 0x0000074e 4e93 jsr (a3) ;[3]
| 0x00000750 487900001915 pea.l 0x1915.l
| 0x00000756 4e92 jsr (a2) ;[2]
| 0x00000758 4e93 jsr (a3) ;[3]
The first question is how did I know that this is the actual code? I noticed the pattern of jsr after I found a few constants nearby. The first constant was 0x40020003 and the second constant was 0x100. With these two constants, it was clear that this is the correct code. That said, we should decompile this code to make certain. 0x1903 is "loop2 by Javantea" and 0x1915 is "Sept 23", so this is correct. Note that Radare2 couldn't automatically detect the strings.
Epilepsy warning! This video contains flashing at 30 Hz.
8051 (aka Intel MCS-51) is the most different architecture in this list. Like the others, it's still in use in certain applications. One of the most interesting is a USB Bluray drive which was reversed and hacked by Scanlime with full video discussion. 8051 is an 8-bit Harvard architecture microcontroller. Harvard is different from Von Neumann (which is in use in all other processors in this list as well as x86 and ARM) in a very interesting way: there are two address spaces, one for executable program code and another for data. Because the address spaces are separate, you can't jump to data like you can on Von Neumann architecture CPUs*. To create an 8051 program, I took a look at MAME's supported systems, focusing on systems that have just one CPU which is in the 8051 family. I found re900, which is a roulette wheel (like in a gambling casino) with a video screen. I found documents and the machine code in MAME to be very instructive. It took a while to actually understand how to initialize the video system, but once I had, it was showing all kinds of colors.
* Does this mean it's impossible to exploit a buffer overflow on Harvard architecture? Of course not. If you understand modern x86 and ARM memory protection mitigations against exploitation, you know that you can set program memory to be read only and that is still exploitable using ROP and other techniques where data is used to control program flow. This is true of Harvard architecture CPU programs as well. Many software design patterns result in function pointers being popped off the writable stack and then used as the instruction pointer. The difficulty of exploitation is significantly higher on Harvard architectures, but if you have trouble exploiting, you might understand why if you know that it uses two address spaces.
Systems that feature 8051 architecture:
While it is popular in systems, it is not easy to get ahold of an 8051 that isn't attached to another system. Why are 8051s used at all? If you look at Digikey's list of microcontrollers with 8051 core you can see that it is cheap, fast, competitive, and partially compatible with code written in 1980. It's not an ARM, but it is a viable architecture. How do you program an 8051? You'll find that information in the documentation.
volatile __xdata __at(0xe000) unsigned char vreg;
volatile __xdata __at(0xe001) unsigned char vram;
volatile __xdata __at(0xe002) unsigned char watchdog;
The above shows how simple the interface to the I/O is. Note that we're using a different C compiler -- SDCC to convert our C code to a binary.
You can see how our game loop works below.
// Game loop. Write random stuff to video memory?
while(1) {
    vreg = 0;
    vreg = 0x40;
    for(i = 0; i < 0x1000; ++i) {
        vram = fair_roll[i % 500];
    }
    /*while(1) { }*/
}
It's pretty simple. In this case we aren't using the *v = x; mechanism, but instead we're using v = x; because of the way the compiler handles memory. The mechanism is the same. Why is this rom so large if the code is so small? This question remains unanswered. Can you answer it?
Zilog's Z80 was very popular because it was compatible with Intel's famed 8080 and was pretty well designed for the time. It was used on many arcade systems. Just grepping MAME for z80 will give you some insight.
Systems that feature Z80 architecture:
Examples of Z80 in CTF:
Let's Disassemble from SECCON CTF 2014
My experiment: Writing a Pacman ROM in 10 hours.
What I spent that time learning:
Video RAM picks tiles. (1024 bytes)
Color RAM picks two palettes per tile. (1024 bytes)
Tile ROM is a weird packing of bits. (4096 bytes == 256 tiles)
Palette ROM is a list of colors. (32 bytes)
Program ROM is your code (4096 bytes * 4 == 16kB)
Sound ROM is where you put your music and sfx (?)
ASCII tiles are inefficient, but useful.
In this part I want to introduce a few new concepts. The first is a watchdog. A watchdog is a mechanism on the computer which will count down (when activated) toward resetting the computer. You have to poke it or it will reset. Once you poke it, you're fine until the next interval. The way this helps is if you have a bug in your code that causes an infinite loop, you don't want to have to unplug the computer and plug it back in. Instead of executing undefined behavior, it prefers to reboot. A good reason that important computers should have watchdogs (or similar mechanisms which can cause a computer to exit a while loop if it goes too long) is the occurrence of bit flipping in hardware. Whether it's caused by overheating, bad hardware, or cosmic rays, your code could do things that you have proven it cannot do.
You can see a watchdog in use in the code snippet below:
watchdog = 0;
// Writing msg to the entire contents of video memory starting at 0.
video_mem[k & 0x3ff] = msg_trans[(((unsigned int)k) & 0x7fff) % sizeof(msg)];
//watchdog = 0;
if((k & 0x3ff) == 0)
Yes, all you have to do is set watchdog = 0 every once in a while.
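To get a feel for the countdown, here is a small desktop simulation of a watchdog. The timeout of 4 ticks, the struct, and the function names are all invented for this sketch; real hardware defines its own interval and its own poke mechanism (on this ROM, a write to the watchdog address).

```c
#define WATCHDOG_TIMEOUT 4u  /* hypothetical: ticks allowed between pokes */

struct watchdog {
    unsigned ticks_left;  /* countdown until the next forced reset */
    unsigned resets;      /* how many times the "machine" rebooted */
};

/* One unit of time passes. An expired timer reboots and reloads. */
static void watchdog_tick(struct watchdog *w)
{
    if (w->ticks_left == 0) {
        ++w->resets;
        w->ticks_left = WATCHDOG_TIMEOUT;
    } else {
        --w->ticks_left;
    }
}

/* Run for n ticks, poking every poke_interval ticks (0 = never poke).
 * Returns how many resets happened. */
unsigned watchdog_resets(unsigned n, unsigned poke_interval)
{
    struct watchdog w = { WATCHDOG_TIMEOUT, 0 };
    unsigned t;

    for (t = 0; t < n; ++t) {
        if (poke_interval != 0 && t % poke_interval == 0)
            w.ticks_left = WATCHDOG_TIMEOUT;  /* the "watchdog = 0" write */
        watchdog_tick(&w);
    }
    return w.resets;
}
```

A program that pokes often enough never resets, while one stuck in a loop that forgot to poke reboots over and over. That second case is what you see in a debugger when a patched ROM mysteriously restarts.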
The second part that I want to draw your attention to is the fact that I'm using unsigned ints on z80. How big is an unsigned int on z80? I'm compiling with sdcc. Unsigned ints are guaranteed to be at least 16 bits even if your processor is 8 bit. See how C compilers are useful? You can use 16-bit ints on an 8-bit cpu.
The third part that I want to draw your attention to is the fact that I'm using if((k & 0x3ff) == 0) to delay some branch of the code. Simply put, a person writing a rom could delay printing the flag until the player has beaten the kill screen. A person writing a rom could prevent the player from accessing the switch that turns on god mode by using a sequence of button presses. This is the concept of how a game master gets to choose how the players interact with the world. But as a hacker, you don't have to play the game. You can modify the game so that instead of branches occurring when the game master wants them to, they occur when you want them to. How do you do this? Learn assembly, overwrite a branch. If a game master is really smart, they can make it so that if you try to overwrite a branch, the system will fail. In my Pacman ROM, there is a simple anti-cheat mechanism. See if you can find it and defeat it. It's not obvious from the ROM, so take a look at the source if you don't see it in the ROM. If you can modify the ROM substantially (say hook the start so that it says your name across the top) and there is no change in the gameplay, you win.
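The mask test is a cheap way to make something happen once every 1024 passes through a loop: the low ten bits of k are all zero only when k is a multiple of 0x400. A minimal sketch, with count_triggers being a name made up for this example:

```c
/* Hypothetical helper: count how often the delayed branch is taken
 * when the loop counter k runs from 0 up to iterations - 1. */
unsigned count_triggers(unsigned iterations)
{
    unsigned k;
    unsigned hits = 0;

    for (k = 0; k < iterations; ++k) {
        if ((k & 0x3ff) == 0)   /* true for k = 0, 1024, 2048, ... */
            ++hits;
    }
    return hits;
}
```

In a disassembly this shows up as an AND with 0x3ff (or with 0x03 on the high byte) followed by a conditional branch, which is exactly the kind of branch you would patch to make the delayed code run every time.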
I didn't have much trouble with Z80 because it is pretty similar to other hardware and x86. I hope you also enjoy Z80.
Systems that feature 6809 architecture:
Examples of Robotron in CTF:
Church of Robotron at Toorcamp!
How do we go about actually using a fully reversed ROM? In this section, we'll take a look at the excellent reverse engineering effort of Robotron by Scott Tunstall. If you search for KILL in the reverse engineered source, you can see that there's a function called KILL_PLAYER at position 30EF. That function is called by PLAYER_COLLISION_DETECTION. But notice how the code is organized. It doesn't return (RTS) and the function that calls it jumps to it (BNE) instead of calling it (BSR). So that's not a function so much as a labeled branch of the code.
In 6809, you have accumulators A and B (the 6502 has only A), and you read and write using LDA, LDB, STA, STB, and so on.
So now that we're clear on what's going on, let's try to use this information to subvert the activity of this ROM. First, we can break on the kill branch. Once we touch something, execution will stop instead of running the kill branch.
The first thing we can do in mame and many other debuggers is memdump. For Robotron, you will get RAM, which is valuable in hacking processes. Not only can you understand the usage of memory, but you can take a look at what you can control. As you write your exploit, you will probably want to check as time goes on, breaking closer and closer toward your final goal. In this case (getting killed), you might not actually gain anything from a memdump.
Another incredibly useful command for hacking and reverse engineering is statesave which will make it possible for you to recreate sometimes incredibly nuanced behavior in ROMs. In this case, you'd end up back at the kill state, which is probably not useful unless you want to save where you have N lives and try again and again until you have made it to the level where the flag is printed. That is a valuable hack, but there are better ways to hack Robotron.
An incredibly useful command that you will almost immediately need is source. Because the Qt debug gui is in progress, it doesn't work on all systems. Since that is true, you'll possibly be working with the imgui interface which doesn't have cut, copy, and paste. A way to fix this is to use a text editor of your choice to create a set of commands you want to run, then use source to execute them. This allows you to run incredibly complex commands once you've got a hang of the debugger.
A very cool way to fix our death state is to save data (saved) and then modify it with a hex editor and then load it (loadd). While this won't stop you from dying, you can give yourself as many lives as you need to complete the game, causing the main problem with the game (difficulty) to decrease considerably. That isn't a good way to hack this if you simply aren't coordinated enough to finish the game with many lives (see Pacman), but should be sufficient if you have someone who just needs 20 more lives to get to the flag.
To exploit this situation, you want to use the do command. The command do is very special because it modifies the emulated system -- down to any value you wish to change. What kind of power can you wield with this command? Let's put a stopper in death.
bpset 30ef,1,do pc=30ec
That should do it. But when we do that, it crashes and reboots. No fair. Let's find out what triggers that reboot. My guess is a watchdog. Yes, if you quickly press F5, you can get into a state where you can't be killed and can still continue. Let's try to beat the game.
So the watchdog is at address CBFF according to williams.cpp. Let's watch that address. But more important of course is to be able to control this system with more precision. Can we get it to continue after it has modified pc?
bpset 30ef,1,{do pc=30ec;g}
As you might tell, this command changes the pc and then continues as fast as it can. But this just gives us an unwinnable state. We want a winnable state.
Back to the source. Let's look for someone calling this function. We search for 30B3, the start of PLAYER_COLLISION_DETECTION, and we find nothing. So we assume that control flows from above. Above PLAYER_COLLISION_DETECTION we see CHECK_IF_ANOTHER_OBJECT_PRESENT at 3085. Searching for 3085 returns two results. One is a comment of a JSR and the other is a JMP. You can see the JSR goes to 26C6 which is the JMP. So why would they JSR to a JMP? This convention might be necessary for short jumps, but that's just my guess. Remember that the 6809, like the 6502, is an 8-bit CPU, so it has parts that are 8-bit. Looking at the bytes of the instruction JSR, we see that it could indeed JSR to 3085 instead of 26C6. What other reason could explain this behavior? Since it won't win the game for us, we can push this onto our learning stack.
Now we're at 028B. We see that this call is occurring in a function called ANIMATE_FAMILY_MEMBER which is actually not called (so it isn't a function either). This might become a theme. If you understand assembly and the common themes of programming in 1982, you might understand that procedural programming was not in the state it is today. If we look upward, we see that GET_FAMILY_MEMBER_FROM_LIST is called, so that is the function we're in. The naming of this is not great, but let's see how it works.
When you're just starting out without any reverse engineering information, take a look at the command hotspot.
To get information about these commands without wasting CPU and electricity, open up src/emu/debug/debughlp.cpp in a text editor. You're welcome.
To hack Robotron, we look at cheats that have been produced for it. We see in robotron.xml there is a straightforward invincibility cheat. It might look difficult, so let's just take a quick look at what is actually going on.
<script state="run">
  <action>maincpu.mb@130C2=00</action>
  <action>maincpu.mb@130CE=00</action>
  <action>maincpu.mb@130DA=00</action>
</script>
The first action writes the value 00 to memory address 130c2. If we look at the assembly, we can quickly understand what this does:
30C1: 26 2C BNE $30EF ; if collision, goto $30EF, KILL_PLAYER
So instead of branching to $30EF, which kills the player, it continues on. There are many ways to do this, but their solution is elegant.
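To see why writing 00 works, here is a minimal Python sketch of how a 6809 short branch computes its target. It assumes that the cheat's address 130C2 corresponds to the byte at $30C2 in the listing, which is the offset byte following the BNE opcode at $30C1; the function name branch_target is made up for illustration.

```python
def branch_target(instr_addr: int, offset_byte: int) -> int:
    """Target of a 2-byte 6809 short branch (opcode + signed 8-bit offset)."""
    # Interpret the offset byte as a signed two's-complement value.
    signed = offset_byte - 0x100 if offset_byte & 0x80 else offset_byte
    # The offset is relative to the address of the *next* instruction,
    # which is 2 bytes past the start of the branch.
    return (instr_addr + 2 + signed) & 0xFFFF


# Original bytes at $30C1: 26 2C (BNE with offset $2C)
original = branch_target(0x30C1, 0x2C)
# After the cheat, the offset byte is 00 (assumed to be the byte at $30C2)
patched = branch_target(0x30C1, 0x00)

print(f"original BNE target: ${original:04X}")  # $30EF, KILL_PLAYER
print(f"patched  BNE target: ${patched:04X}")   # $30C3, the next instruction
```

With an offset of 00, the branch lands on the next instruction whether or not it is taken, so the BNE effectively becomes a no-op and KILL_PLAYER is never reached from here.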
Computers are able to do things you didn't expect.
At this point, you might be thinking: computers are wonderful! They are. Let's dig into this for a quick moment. Computers can be as simple as a puts("Hello world") with no RAM; computers can be mainframes, tablets, phones, or even refrigerators. But that doesn't mean much. A small computer is often used as a replacement for transistors, resistors, and such. So when you replace a transistor with a chip, you are not just creating a glorified transistor, you're enabling complex functionality that can be unlocked with a sequence of actions.
https://exploitee.rs/
So you want to implement a new architecture in MAME? http://arcadehacker.blogspot.com.au/2014/11/capcom-kabuki-cpu-intro.html
Is 8086 an x86 architecture? Technically, yes. But it's antiquated and pre-32-bit x86!
When you run Qemu, what is actually occurring? Let's take the case of our 8086 code in the boot sector. Qemu emulates an x86 machine in real mode. It executes the BIOS, SeaBIOS, which is an open source BIOS for the PC that can correctly interface with Qemu's virtual peripherals. What type of peripherals exist in an IBM PC clone? Keyboard, mouse, VGA, APIC, IDE, ISA, PCI, and in newer systems: USB, SATA, and other special interfaces. To print characters to the screen, the BIOS needs to be able to completely interact with the VGA on the system. That might be as easy as writing to video RAM, or it might be as complicated as what we found in other systems. That is why a BIOS is assumed on every IBM PC. If you want to write a BIOS for the PC, you have to go one level deeper. But today, we're going to assume a BIOS is supplied.
In our boot sector we can start executing 8086 instructions. The very first that you want to execute is mov ax, 0x7c0. This is Intel syntax, so the destination is first. This instruction sets the register ax to 0x7c0. It sets up the second instruction, mov ds, ax, which sets the data segment equal to ax (0x7c0). The data segment register tells the system what segment we should use for data. This affects how many of the following instructions work. Data is the most commonly used segment in x86, so be very careful to understand the data segment before you try to hack a boot sector. To get the memory address of a data segment, multiply by 16. If you are smart, you know that in hex multiplying by 16 is just a left shift of 4 bits (one hex digit), so the memory address that is associated with data segment equal to 0x7c0 is 0x7c00. Make sense?
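The segment arithmetic above can be sketched in a few lines of Python. This is only an illustration of real-mode address translation; the function name physical_address is made up, and the 20-bit wraparound models the original 8086 address bus.

```python
def physical_address(segment: int, offset: int = 0) -> int:
    """Real-mode address: segment * 16 + offset, wrapped to 20 bits."""
    # Shifting left by 4 bits is the same as multiplying by 16.
    return ((segment << 4) + offset) & 0xFFFFF


# ds = 0x7c0 points at the start of the boot sector in memory.
print(hex(physical_address(0x7C0)))          # 0x7c00
# The same byte can be reached through a different segment:offset pair.
print(hex(physical_address(0x0000, 0x7C00))) # 0x7c00
```

The second call shows why the data segment matters so much: 0x07c0:0x0000 and 0x0000:0x7c00 name the same byte, so any offset in your code is only meaningful once you know what ds holds.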
Now let's discuss the really important part of this boot sector: the music!
    mov si, music
    mov cx, 0x1a9
beep:
    ; Choose a frequency for the PIT
    ; cx is the period
    mov al, 0xb6
    out 0x43, al
    mov al, cl
    out 0x42, al
    mov al, ch
    out 0x42, al
    ; connect the pit to the pc speaker
    in al, 0x61
    or al, 3
    out 0x61, al
    ; TODO: stored duration of note
    mov ax, 0xfff
    call sleep
You can see in the above "function" beep that we use mov and another instruction, out. You might wonder what out is. The out instruction is how we communicate with the PIT (Programmable Interval Timer). It is Port I/O, which is the other way to do I/O on systems besides Memory Mapped I/O (MMIO). Intel created the 8086 with Port I/O in the expectation that it would be worthwhile. It turns out that Memory Mapped I/O is the way to go and Port I/O is actually just a vestige of the IBM PC.
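To make the values written by out less magical, here is a small Python sketch that decodes the command byte 0xb6 sent to port 0x43 and works out the tone produced by the period 0x1a9. It assumes the standard PC PIT input clock of roughly 1.193182 MHz; the function names are invented for this example and are not part of the boot sector.

```python
PIT_INPUT_HZ = 1_193_182  # assumed standard PC PIT input clock (~1.193182 MHz)


def decode_pit_command(cmd: int) -> dict:
    """Split a PIT mode/command byte (written to port 0x43) into its fields."""
    return {
        "channel": (cmd >> 6) & 0b11,  # 2 = the channel wired to the PC speaker
        "access": (cmd >> 4) & 0b11,   # 3 = low byte, then high byte
        "mode": (cmd >> 1) & 0b111,    # 3 = square wave generator
        "bcd": cmd & 1,                # 0 = binary counting
    }


def divisor_bytes(divisor: int) -> tuple:
    """Low and high bytes, in the order they are written to port 0x42 (cl, ch)."""
    return divisor & 0xFF, (divisor >> 8) & 0xFF


def tone_hz(divisor: int) -> float:
    """Frequency of the square wave for a given 16-bit divisor (the 'period')."""
    return PIT_INPUT_HZ / divisor


print(decode_pit_command(0xB6))
print([hex(b) for b in divisor_bytes(0x1A9)])
print(round(tone_hz(0x1A9), 2), "Hz")
```

Under that clock assumption, a period of 0x1a9 (425) gives a tone of about 2807 Hz. A larger value in cx gives a lower note, which is how a table of periods like music can encode a melody.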
Check out osdev.org wiki for a new insight into your computer's boot process.
Qemu Advent Calendar
DOS viruses
Malware
Boot sector viruses
SECT CTF 2017 PWN300 The gibson
[b64]
ForbiddenBITS CTF 2013 – Old 50
DEF CON CTF Qualifier 2014 - dosfun4u
DEF CON CTF Qualifier 2014 - dosfun4u round 2
Honorable mentions:
Ghost in the Shellcode 2014 - DOS attack
My paper with downloads and links
https://www.altsci.com/old_non-x86/
https://sono.us/mame
Radare2 is open source, free, and supported.
You might have missed Portland Retro Gaming Expo, but remember it for next year.
JRSFuzz is open source, free, and supported.
jvoss@altsci.com
We have the opportunity to do things that I originally thought were fantasy. This will repeat, let us be clear, more times than you will wish. Don't let that be your excuse to not write code. Find something that benefits you or someone else and take a look from the perspective of: this is possible by means of effort.