Assembly
From MRL Wiki
Challenge
New challenge 4/9 (reading challenge 1)
This is another challenge designed to introduce you to assembly language. Instead of trying to write a simple program, we will read a simple program.
Assembly language looks different depending on which assembler you are using and which processor it targets. This can be frustrating, but it is an unavoidable fact of being a reverser. You must learn to read a variety of assembler dialects if you want flexibility.
Here are two simple routines which do the same thing. The first is written for the GNU assembler (AT&T syntax), the second for NASM, MASM, or another typical x86 assembler (Intel syntax).
pushl %ebp
movl %esp, %ebp
movl 8(%ebp), %eax
movl 12(%ebp), %ebx
addl %ebx, %eax
mov %ebp, %esp
popl %ebp
ret
push ebp
mov ebp, esp
mov eax, [ebp + 8]
mov ebx, [ebp + 12]
add eax, ebx
mov esp, ebp
pop ebp
ret
On the x86, many instructions take a source and a destination operand. (The operands are the things after the instruction name.) Notice that the order of the source and destination operands differs between the two flavors. In AT&T syntax, the order is source, destination. In Intel syntax, it is destination, source. It is easy to get confused when you are learning both syntaxes at the same time, so be careful.
Notice the mov instructions on lines 3 and 4 of each routine. They do the same thing, but they look very different. In AT&T syntax, we write a memory operand as offset(pointer), and in Intel syntax, we write it as [pointer + offset].
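Both notations describe the same address computation: take the value in the pointer register, add the offset, and read 32 bits from that address. A minimal C sketch of that idea follows. The helper name load32 is hypothetical and not part of the challenge, and the frame it reads from is assumed to be an ordinary array standing in for the stack.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 32-bit value stored at base + offset bytes.
   This mirrors what 8(%ebp) in AT&T syntax or [ebp + 8] in Intel syntax
   means when used as a source operand of mov. */
uint32_t load32(const void *base, size_t offset)
{
    uint32_t value;
    /* memcpy avoids alignment and aliasing problems when reading raw bytes */
    memcpy(&value, (const unsigned char *)base + offset, sizeof value);
    return value;
}
```

If base points at an array of four 32-bit slots, load32(base, 8) reads the third slot and load32(base, 12) reads the fourth, just as the two mov instructions read two consecutive 32-bit values relative to ebp.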
Questions:
- The add instruction on line 5 adds the contents of the eax and ebx registers. Where does the answer go? Explain why this makes sense given what you know about the operand names.
- The offset in the AT&T version is in decimal, not hexadecimal. What happens when you replace the offset with a hexadecimal version?
- What does this routine do?
- What do you think will happen when you give the function the arguments 0xffffffff and 0xffffffff ?
- (optional) You can call this function from C, if you want. What would the C prototype for the function look like?
- (optional) If you called this function from C, the function would return an answer. Where is the C routine getting the answer from?
Solutions
1. The add instruction on line 5 adds the contents of the eax and ebx registers. Where does the answer go? Explain why this makes sense given what you know about the operand names.
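After the eax and ebx registers are added, the answer is stored in the eax register. In AT&T syntax the operand order is add source, destination, so addl %ebx, %eax adds ebx to eax and leaves the sum in eax. In Intel syntax the order is reversed, add destination, source, so add eax, ebx does exactly the same thing. In both cases the destination operand is also one of the two values being added.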
2. The offset in the AT&T version is in decimal, not hexadecimal. What happens when you replace the offset with a hexadecimal version?
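Nothing changes, as long as the hexadecimal offset has the same value as the decimal one: 0x8 for 8 and 0xc for 12. The assembler accepts either notation, so movl 0xc(%ebp), %ebx assembles to the same instruction as movl 12(%ebp), %ebx.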
3. What does this routine do?
Short Answer: This routine adds two integers and leaves the sum in eax.
Long Answer:
# standard function prologue
# preserving register values
pushl %ebp
movl %esp, %ebp
# load the two parameters from the stack
# into the eax and ebx registers
movl 8(%ebp), %eax
movl 12(%ebp), %ebx
# add the two registers and store the
# result in eax
addl %ebx, %eax
# standard function epilogue
# restoring register values
mov %ebp, %esp
popl %ebp
# return to the calling program
ret
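For comparison, here is a rough C equivalent of the routine. It is only a sketch: the name add_two is hypothetical (the routine above has no label), and the choice of unsigned 32-bit types is an assumption, since the assembly itself does not say whether the values are signed.

```c
#include <stdint.h>

/* Hypothetical C counterpart of the assembly routine.
   The two parameters correspond to the values loaded from
   8(%ebp) and 12(%ebp); the return value corresponds to eax. */
uint32_t add_two(uint32_t a, uint32_t b)
{
    /* unsigned 32-bit addition wraps modulo 2^32, like addl */
    return a + b;
}
```

Compiling a function like this for 32-bit x86 and reading the generated assembly is a good way to check your reading of the listing above, although the compiler's output will usually not match it line for line.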
4. What do you think will happen when you give the function the arguments 0xffffffff and 0xffffffff ?
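A carry occurs. The true sum, 0x1fffffffe, does not fit in 32 bits, so eax receives only the low 32 bits, 0xfffffffe, and the carry flag (CF) in EFLAGS is set. If the arguments are read as signed integers, the operation is -1 + -1 = -2, which does not overflow, so the overflow flag (OF) stays clear.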
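C has no direct access to EFLAGS, but the carry out of a 32-bit addition can be reconstructed from the result. The sketch below assumes unsigned 32-bit arguments; the helper name add_with_carry is hypothetical and only illustrates the idea.

```c
#include <stdint.h>

/* Hypothetical helper: add two 32-bit values and report the carry out,
   which is the information the CPU records in the carry flag. */
uint32_t add_with_carry(uint32_t a, uint32_t b, int *carry)
{
    uint32_t sum = a + b;   /* wraps modulo 2^32 */
    /* the wrapped sum is smaller than an operand exactly when a carry occurred */
    *carry = sum < a;
    return sum;
}
```

With both arguments set to 0xffffffff, this returns 0xfffffffe and sets *carry to 1.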
Old Challenges
Other Challenges
Tips
The best way to learn assembly is by disassembling an existing program. Better still, if you have written a C program and want to see what it looks like in assembly, try this:
gcc -S myprogram.c
This will create another file, myprogram.s, which contains the assembly code generated by gcc. Note the capital S: the lowercase -s option strips symbols from the output instead.
In case you are stuck with Intel syntax, you can convert it to AT&T with Intel2Gas: http://www.niksula.cs.hut.fi/~mtiihone/intel2gas/