Writing a disassembler/monitor
A disassembler is little more than a glorified look-up table. Opcodes corresponding to instructions can be found in databooks, or by using an assembler and looking at the resulting memory map. The only other major requirement is an array giving the length of each command, in bytes. Decoding each opcode individually is not necessary, though it simplifies coding at the expense of a larger data table. Some of the bits in the instruction byte will give the type of the command (eg 'ADD', 'MOVE') and the remaining bits will tell the addressing mode (eg 'absolute', 'indexed'). Using this technique, your code may only need to handle around 20% of the individual cases, depending on the type of microprocessor.
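The table-driven approach above can be sketched in a few lines of C. The opcodes, mnemonics and lengths below are invented for illustration, not taken from any real databook; a real table would have all 256 entries filled in from the processor's documentation.

```c
#include <stdio.h>
#include <stdint.h>
#include <assert.h>

/* One entry per opcode: the mnemonic and the command length in bytes
   (including the opcode itself), exactly the array described above. */
typedef struct {
    const char *mnemonic;
    uint8_t     length;
} OpInfo;

/* Hypothetical opcodes for a small 8-bit processor. Entries left out
   default to a NULL mnemonic, which we treat as "unknown opcode". */
static const OpInfo optable[256] = {
    [0x00] = { "NOP",   1 },
    [0xA9] = { "LDA #", 2 },   /* immediate: opcode + 1 operand byte  */
    [0x4C] = { "JMP",   3 },   /* absolute: opcode + 2 address bytes  */
};

/* Disassemble `len` bytes starting at `mem` (loaded at address `org`),
   one line per instruction; return the number of instructions decoded. */
int disassemble(const uint8_t *mem, int len, uint16_t org)
{
    int pc = 0, count = 0;
    while (pc < len) {
        const OpInfo *op = &optable[mem[pc]];
        if (op->mnemonic == NULL) {
            /* Unknown opcode: emit it as a data byte and move on. */
            printf("%04X: .byte $%02X\n", org + pc, mem[pc]);
            pc++;
        } else {
            printf("%04X: %s", org + pc, op->mnemonic);
            for (int i = 1; i < op->length; i++)
                printf(" $%02X", mem[pc + i]);
            printf("\n");
            pc += op->length;   /* step by the command length, in bytes */
        }
        count++;
    }
    return count;
}
```

Note that the loop simply walks forward through memory by instruction length, which is precisely why it lists commands in storage order rather than execution order, the drawback discussed next.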
The main drawback of this disassembler is that it lists commands in the order in which they reside in memory, not the actual sequence of execution. The real flow of instructions is modified by subroutine calls, loops and conditional branching instructions. A monitor (or debugger, in the lineage of 'C') will present commands as they are run, display register contents and prompt for modified register values.
Writing a monitor for a microprocessor is not as horrendous as it may seem: the processor is allowed to execute most commands, such as load, store and arithmetic/logic operations, but any branching instructions are intercepted and simulated by the debugger. After all, flow control must stay with the simulator, not the program under simulation. The main steps in the monitor are as follows:
- If the command to be simulated does not (potentially or certainly) branch, it is copied to a buffer, possibly padded with NOP instructions, since not all commands have the same length. The (simulator) program counter is increased by the length of the command.
- If a branching instruction is met, it is simulated by the monitor, which reads any flags involved, possibly affecting the (simulator) program counter. Calls to and returns from subroutines also modify the stack. Control is then passed back to the previous step.
- All registers are restored, bringing the machine to the exact state it was in after the command previously emulated.
- The command in the buffer is executed.
- All registers are saved to a trusted place in memory, so that the next command simulated does not find corrupted data.
- The command name and register values are sent to the screen (or to a buffer in memory, to be stored to disc later) and, in single-step mode, the user is optionally asked for modified register contents.
- The whole sequence is repeated for each instruction.
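One iteration of that loop can be modelled in C. This is a toy sketch under invented assumptions: the register set, the single branching opcode (0x4C, an absolute jump) and the one-byte non-branching commands are all hypothetical, and the step that would actually jump into the buffer with real registers loaded is left as a comment, since that part must be machine code.

```c
#include <stdint.h>
#include <string.h>
#include <assert.h>

/* A minimal, invented register set; real monitors save the full set
   to a trusted place in memory between steps. */
typedef struct {
    uint16_t pc;   /* the simulator's program counter */
    uint8_t  a;    /* an accumulator, for illustration */
} Regs;

static uint8_t buffer[4];   /* execution buffer for non-branching commands */

/* One iteration of the monitor's main loop; returns the opcode handled. */
uint8_t step(Regs *r, const uint8_t *mem)
{
    uint8_t op = mem[r->pc];
    if (op == 0x4C) {
        /* Branching instruction: simulated by the monitor itself, never
           executed. Here it simply loads a new (simulator) pc. */
        r->pc = (uint16_t)(mem[r->pc + 1] | (mem[r->pc + 2] << 8));
    } else {
        /* Non-branching: copy to the buffer, pad with NOPs (0x00 in this
           model), and advance the pc by the command's length (1 here). */
        memset(buffer, 0x00, sizeof buffer);
        buffer[0] = op;
        r->pc += 1;
        /* ...restore registers, execute the buffer, then save the
           registers back to trusted memory (machine code on real
           hardware, omitted in this sketch)... */
    }
    return op;
}
```

The key property to notice is that the program counter never leaves the monitor's control: branches only ever update the simulator's copy of it.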
In the first instance, most of the simulator can be written in a high-level language, apart from the part that saves/restores register values and executes a non-branching command (this will minimise the need to emulate the entire processor). The first task of the machine-code part will be to pop the return address off the stack and store it (it will be put back immediately before exiting the routine), so the program under simulation does not find its stack space disturbed. However, this version of the simulator cannot be used to trace the high-level language interpreter itself.
Writing the entire simulator in assembly is quite feasible, and the problem above is then removed. My code for the old Apple(TM) computer's microprocessor was under 380 bytes; that excluded the routine to show register values and optionally ask for modified register values, as well as the length array detailed above. (There is some room for optimisation, if you are obsessed with it.) Admittedly, a modern processor requires more.
The simulator and the program being simulated must not share any resources: that includes data space as well as subroutines. Chaos will ensue, for example, if the monitor uses the official routine to send output to the screen, and then attempts to trace that same routine, unless it happens to be re-entrant, which is most unlikely.
(A re-entrant routine can be interrupted by another instance of itself without crashing. It saves, on the stack, any memory it modifies when it starts running. Under certain conditions, it could instead reserve a separate frame in memory for the workspace of each of its instances; a frame is not deallocated until the instance associated with it has run to completion. Routines which expect to be called from an interrupt, and do not disable interrupts in their body, must be re-entrant.)
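The distinction can be shown in a few lines of C. Both routines below are invented for illustration: the first keeps its workspace in static storage, so a second instance (started, say, by an interrupt) silently overwrites the first instance's result; the second keeps all its state in the callers' frames, so every instance is independent.

```c
#include <string.h>
#include <assert.h>

static char shared_buf[16];   /* one workspace shared by every caller */

/* NOT re-entrant: a second instance clobbers any instance in flight. */
const char *format_not_reentrant(const char *name)
{
    strcpy(shared_buf, name);
    return shared_buf;
}

/* Re-entrant: all state lives in the caller's frame (`out`), so each
   instance effectively has its own workspace, as described above. */
void format_reentrant(const char *name, char *out, int outlen)
{
    strncpy(out, name, (size_t)outlen - 1);
    out[outlen - 1] = '\0';
}
```

This is exactly why the monitor must not trace the screen-output routine it is itself using: the traced instance and the monitor's instance would be fighting over the same workspace.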
The facility to modify register values, or to skip the next instruction (if the registers which can be modified include the instruction pointer and status register, it amounts to much the same thing), is quite invaluable: a loop or subroutine may need its exit condition tested, or to be tried with different parameters. But once the main part is debugged, you will hardly want to waste time on useless iterations; much better to concentrate on the salient parts of the program.
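Since the monitor owns the simulated program counter, "skip the next instruction" is nothing more than adding the command's length (from the length array mentioned earlier) to it. A sketch, with an invented length table:

```c
#include <stdint.h>
#include <assert.h>

/* Hypothetical length array; a real one has all 256 opcodes filled in. */
static const uint8_t length_of[256] = {
    [0x00] = 1, [0xA9] = 2, [0x4C] = 3,
};

/* Advance the simulator's pc past the next instruction without ever
   executing it. */
void skip_next(uint16_t *pc, const uint8_t *mem)
{
    *pc += length_of[mem[*pc]];
}
```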
All the above assumes you are not willing to make a single hardware modification to your microprocessor board; if you are, the main debugger loop is even simpler.
Tracing routines which access peripherals will only be of limited use: under the monitor the machine is effectively running at a much slower speed, so any timing-dependent input/output will misbehave.