So you want to learn assembly?


Then you've come to the right place. Whether it's because you want to crack shareware, improve your programming skills, or you just don't want to be one of those people who sit in front of a computer with no idea what it's ACTUALLY doing, you've made the right choice. I was seventeen when I learned assembly, and aside from the lack of tutorials on the web (especially those dealing with Windows programming), the hardest steps were the first few. Look at the following examples:
Basic
Print "Hello World"
Pascal
Write('Hello World');
C
printf("Hello World");
Java
System.out.print("Hello World");


All these examples write the string "Hello World" (without the quotes) to the screen. They are all easy to understand: a one-line command does the job. Below is the DOS assembly code for the same task:


	.data
	hello db 'Hello World',0dh,0ah,'$'

	.code
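	mov ah, 9h		; DOS function 9: print a '$'-terminated string
	mov dx, offset hello	; dx = address of the first character
	int 21h			; hand control to DOS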


See, that wasn't so bad. There is a lot to learn from this example. First, it shows you that strings don't just magically appear on the screen. You have to store the string somewhere, and yes, the size of the string actually changes the size of your program! I will now explain the basic syntax of this code:
Your source code in assembly is always made up of two different things: data (variables) and code (instructions). You let the assembler know which is which by writing .data to signify that everything up to the next .code is data. When a .code is reached, the assembler knows that what follows is the actual code. At any place in the program you can throw in .data to define a new variable, then return to your code with .code and continue programming. In the above code, hello is a variable. You might call it a string, but it's actually nothing more than the address of the first character of the string. When you call int 21h with 9 in the ah register, DOS prints all the characters starting from hello until it reaches a $, which it interprets as the end of the string. Confused? Don't worry, you might not know what ah is or what an interrupt (int 21h) is, but it doesn't really matter yet. Just remember the following steps for printing Hello World on the screen.
  1. Define the string "Hello World"
  2. Put 9 in the high order end of the ax register (in other words the ah register)
  3. Put the address of the string in the dx register
  4. Then just call DOS interrupt 21h and let it do the rest of the work.
Note: The above code will not work as written because of a couple of things you still have to consider. Just read on.
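
If you know C better than assembly, the following rough sketch may help picture what function 9 does with the address in dx. It is only an illustration in C, not DOS code: dos_print_sim is a made-up helper, and it copies into a buffer instead of writing to the screen so that the result can be inspected.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a DOS or C library call): copies characters
   from `start` into `out` until a '$' is met, the way function 9 walks
   memory from the address in dx. Returns how many characters it copied. */
size_t dos_print_sim(const char *start, char *out, size_t out_size)
{
    size_t n = 0;
    assert(out_size > 0);
    while (start[n] != '$' && n + 1 < out_size) {
        out[n] = start[n];
        n++;
    }
    out[n] = '\0';  /* C-style terminator, added only so the copy is a valid C string */
    return n;
}

/* Same bytes as: hello db 'Hello World',0dh,0ah,'$'
   (C appends one extra zero byte after the '$'). */
const char hello[] = "Hello World\r\n$";
```

Calling dos_print_sim(hello, buf, sizeof buf) with a 32-byte buf copies the 13 characters that come before the $. The $ itself is never printed; it only tells the routine where to stop.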
 
Okay, that's enough DOS. Let's get to what you really came to learn: 32-bit Windows assembly. It works very much the same way: set up all your parameters, and then just call the appropriate function from the Windows API (this is what DLLs are for: they store all these functions). Let's build our first Windows program: a simple message box.
	.386
	.model	flat

	public _start

.data
title1 db 'Welcome',0
message db 'Hey!  What do you know?  Your first message box in assembly!',0

	.code

MB_OK equ 0
MessageBox equ MessageBoxA

	extrn	MessageBox:near
_start:

	push	MB_OK		;-----
	push	offset title1	;  |
	push	offset message	;the code
	push	large 0		;  |
	call	MessageBox	;-----

end	_start
This code will assemble exactly the way it is. Just like in the DOS example, all you have to do is supply the right parameters (but now they must be in the right order), then call the MessageBox function, and voila! A message box appears! The parameters pushed are specific to each function. I chose MessageBox because it is simple and doesn't need that many parameters. You might be thinking, what the hell are all those parameters? It's quite simple:
push MB_OK
There are many types of message boxes. You can have different combinations of buttons (e.g. Ok, Ok/Cancel, Yes/No, Yes/No/Cancel), and you can also add icons to the message box. MB_OK is the default; it gives you a message box with just an Ok button. From the line that says MB_OK equ 0, you know that MB_OK equals 0! Therefore, push large 0 would work just as well. However, no one in the world would even consider trying to memorize all the numerical constants, so each one is given a name that is easier to remember. All these names are put in an include file that defines all the necessary constants. Try using different constants instead of MB_OK. (Please download my win32.inc and copy it into your include folder.)


push offset title1
This is the address of the string you want to put in the title bar of the message box. The variable title1 stores this string.
push offset message
This pushes the address of the string you want in the body of the message box; message is the string variable storing your message. Try swapping these two commands, i.e. put push offset message before push offset title1. The title will now be your message, while the message will be displayed in the title bar. Hence, the order determines which parameter each value represents.
push large 0
This is the handle of the window the message box belongs to. When you create a window, the function that creates it returns a handle (an arbitrary number that the computer uses to determine that you're talking about that particular window). Since we don't have a window, we use null (or 0). If you do specify a window handle, the program will not let you access that window until the message box is closed.
call MessageBox
This calls the actual MessageBox function (named MessageBoxA) which interprets these parameters and creates the appropriate message box on the screen. You don't actually have to draw the message box on your own!
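
To see why the push order matters, here is a toy model in C, offered as a sketch only. The stack array, push, pop and fake_message_box are all made up for illustration. The real MessageBoxA reads its parameters by offset from the stack pointer rather than popping them, but the effect is the same: the value pushed last is treated as the first parameter.

```c
#include <assert.h>
#include <stdint.h>

/* A toy stack standing in for the CPU stack. push() and pop() are
   illustrative helpers, not part of any real API. */
static uintptr_t stack[8];
static int top = 0;

static void push(uintptr_t value) { assert(top < 8); stack[top++] = value; }
static uintptr_t pop(void)        { assert(top > 0); return stack[--top]; }

/* What the stand-in "message box" received, in the order it read them. */
struct box_args {
    uintptr_t   owner;    /* window handle, 0 for none */
    const char *text;     /* body of the box */
    const char *caption;  /* title bar */
    uintptr_t   type;     /* button/icon constant such as MB_OK */
};

/* Stand-in for the callee: it takes its parameters off the stack,
   starting with the one that was pushed last. */
static struct box_args fake_message_box(void)
{
    struct box_args a;
    a.owner   = pop();
    a.text    = (const char *)pop();
    a.caption = (const char *)pop();
    a.type    = pop();
    return a;
}
```

Pushing 0, the title address, the message address and 0, in that order, and then calling fake_message_box() yields the title as caption and the message as text. Swap the two middle pushes and the two strings trade places, just as in the experiment described above.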

 
Before proceeding, I want to explain include files a bit more. The command MB_OK equ 0 is something that will never be seen outside of your source code. It is merely a message to the assembler, which replaces each such name with the value it stands for. This works for any name and value. The actual name of the function that creates a message box is MessageBoxA, but because it may be easier to remember just MessageBox, many programmers include the line MessageBox equ MessageBoxA just to avoid typing the 'A' after MessageBox. Personally I just type the actual name. The point, however, is that it doesn't matter what you call it. Go ahead and use Messb or Messbox or George! It doesn't matter (just remember it's case sensitive). As soon as the assembler sees MessageBox (or whatever you called it), it replaces it with MessageBoxA before assembling any further. You can do the same thing with large. Instead of always typing push large 0, you could just put L equ large and then type push L 0. So in short, your include file(s) do not represent any sort of code that the machine will ever execute. The statement include win32.inc just means: replace all the constants that I use in my code with the numerical values in this file.
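
C programmers may recognize this idea as #define. The sketch below is an analogy under that assumption, not TASM syntax. MessageBoxA here is a local stub written for the example so that it builds without any Windows headers, and show_welcome is a hypothetical caller.

```c
#include <assert.h>
#include <stddef.h>

#define MB_OK      0            /* plays the role of: MB_OK equ 0 */
#define MessageBox MessageBoxA  /* plays the role of: MessageBox equ MessageBoxA */

/* Local stub, NOT the Windows function: it just hands back the type
   value so a caller can see which constant arrived. */
int MessageBoxA(void *owner, const char *text, const char *caption, unsigned type)
{
    (void)owner;
    assert(text != NULL && caption != NULL);
    return (int)type;
}

/* Hypothetical caller. After preprocessing, the call below reads
   MessageBoxA(NULL, "Hey!", "Welcome", 0): the friendly names are gone. */
int show_welcome(void)
{
    return MessageBox(NULL, "Hey!", "Welcome", MB_OK);
}
```

Neither MB_OK nor MessageBox survives into the compiled program; only the value 0 and the name MessageBoxA do, which is the same point made above about equ.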

 
Explanations of some other lines (I don't know about other assemblers, but this is for TASM):
.386
Starting with the 386, processors have 32-bit registers (one of many changes). Windows needs to use 32-bit data, so this command tells the assembler that 386 instructions and registers will be used.
.model flat
This selects the flat memory model: code and data live in one 32-bit address space, so you never have to juggle segments. It was never anything I played with. DOS programs like the one above typically use .model small instead. Just keep writing .model flat for as long as you're writing 32-bit applications.
public _start
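This command makes the label _start visible outside this file, so that the linker can find it. Note the difference with the colon: _start: is the label itself, and program execution will start there.
end _start
This tells the assembler that the source file ends here. The name after end marks _start as the program's entry point.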
extrn MessageBox:near
This declares MessageBox (remember, this will be replaced with MessageBoxA) as the name of a function defined outside this file, so that it can be called. You must do this for EVERY API function you call.

 
 
Okay, message boxes are all fine and dandy, let's get to some real programming . . .
