Recognizing Code in Assembly Language
Notice that x is changed in memory when eax is moved into dword_40CF60
This can be broken down and translated back to C
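As a rough sketch of what such a translation back to C might look like, assume that dword_40CF60 is a global int named x, and that eax holds the sum of x and a second global. The names y and add_globals are hypothetical and only serve the illustration.

```c
/* Globals live at fixed addresses, so a disassembler labels them by
   address (for example dword_40CF60) rather than by stack offset. */
int x = 1;   /* assumed to be the variable stored at dword_40CF60 */
int y = 2;   /* hypothetical second global */

/* Hypothetical function name. The body corresponds roughly to:
       mov eax, x
       add eax, y
       mov x, eax      ; x is changed in memory here            */
int add_globals(void)
{
    x = x + y;
    return x;
}
```

Had x and y been local variables, the same statement would typically reference stack locations relative to ebp instead of fixed memory addresses.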
The for loop can be recognized by locating its four components
Initialization
Comparison
Execution instructions
Increment/decrement
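The four components can be pointed out in a small C loop. This is a minimal sketch; the function name count_iterations and the loop body are illustrative assumptions, not taken from the original listing.

```c
#include <stdio.h>

/* Hypothetical helper: runs the loop and reports how many times the
   body executed. */
int count_iterations(int limit)
{
    int executed = 0;
    int i;

    for (i = 0;        /* 1. initialization                */
         i < limit;    /* 2. comparison                    */
         i++)          /* 4. increment/decrement           */
    {
        printf("i equals %d\n", i);   /* 3. execution instructions */
        executed++;
    }
    return executed;
}
```

In unoptimized disassembly the initialization usually appears once, followed by a jump to the comparison; the increment tends to sit in its own block that leads back into the comparison.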
The calling convention used depends on the compiler
Three most common calling conventions
cdecl
parameters are pushed onto the stack from right to left
caller cleans up the stack when the function is complete
return value is stored in EAX
stdcall
requires the callee (the function being called) to clean up the stack when the function is complete
standard calling convention for the Windows API
Code calling these functions does not need to clean up the stack; that responsibility falls to the DLLs that implement the API functions
fastcall
first few arguments (typically two) are passed in registers, most commonly ECX and EDX
responsibility for cleaning up the stack varies across compilers (under Microsoft's __fastcall, for example, the callee cleans up)
more efficient because the code doesn't need to involve the stack as much
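The three conventions can be requested explicitly in C on 32-bit x86. The sketch below is an assumption-laden illustration: the CC_* macros and the sum_* function names are hypothetical, and on targets other than 32-bit x86 the macros expand to nothing, so the code still compiles but all three functions use the platform default.

```c
/* Map each convention to the compiler's own spelling. */
#if defined(_MSC_VER) && defined(_M_IX86)
#  define CC_CDECL    __cdecl
#  define CC_STDCALL  __stdcall
#  define CC_FASTCALL __fastcall
#elif defined(__GNUC__) && defined(__i386__)
#  define CC_CDECL    __attribute__((cdecl))
#  define CC_STDCALL  __attribute__((stdcall))
#  define CC_FASTCALL __attribute__((fastcall))
#else
#  define CC_CDECL
#  define CC_STDCALL
#  define CC_FASTCALL
#endif

/* cdecl: arguments pushed right to left; the caller removes them after
   the call (for two ints, typically "add esp, 8"). */
int CC_CDECL sum_cdecl(int a, int b) { return a + b; }

/* stdcall: same push order, but the callee removes the arguments
   (for two ints, typically "ret 8"). */
int CC_STDCALL sum_stdcall(int a, int b) { return a + b; }

/* fastcall (Microsoft and GCC flavors): the first two arguments travel
   in ECX and EDX, the rest on the stack. */
int CC_FASTCALL sum_fastcall(int a, int b) { return a + b; }
```

All three return their result in EAX, so at the C level they behave identically; only the instructions around each call site differ.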
Push vs. Move
adder function adds two arguments and returns the result
main function calls adder and prints the result using printf
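A minimal sketch of the program described above might look as follows. It is an approximation, not the original listing: call_adder is a hypothetical stand-in for main so the snippet can be dropped into a larger program, and the argument values and message text are assumptions.

```c
#include <stdio.h>

/* Adds two arguments and returns the result. */
int adder(int a, int b)
{
    return a + b;
}

/* Hypothetical stand-in for main: calls adder and prints the result. */
int call_adder(void)
{
    int x = 1;   /* assumed argument values */
    int y = 2;
    int result = adder(x, y);

    printf("the function returned the number %d\n", result);
    return result;
}
```

The same source can disassemble differently depending on the compiler: one may push each argument onto the stack before the call, while another may reserve stack space up front and mov each argument into place. Both forms hand the same values to adder.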
A switch statement is compiled in one of two common ways
using the if style
using jump tables
A compiled switch statement looks like a group of if statements
There may be multiple ways to represent the same code constructs in assembly
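To illustrate why the if style can be hard to tell apart from real if statements, here is a sketch of a switch and an if chain that behave the same. The function names and return values are hypothetical.

```c
/* A small switch with a few cases. */
int classify_switch(int i)
{
    switch (i) {
    case 1:  return 10;
    case 2:  return 20;
    case 3:  return 30;
    default: return -1;
    }
}

/* The same logic written as a chain of if statements. Compiled in the
   if style, the switch above becomes a similar series of compare and
   conditional-jump pairs, one per case. */
int classify_if(int i)
{
    if (i == 1) {
        return 10;
    } else if (i == 2) {
        return 20;
    } else if (i == 3) {
        return 30;
    }
    return -1;
}
```

Because both forms can produce near-identical compare-and-jump sequences, the disassembly alone does not always reveal which construct the original source used.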
Using a jump table
defines offsets to additional memory locations
switch variable is used as an index into the jump table
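The jump table idea can be imitated in C with an array of function pointers. This is only an analogy under stated assumptions: a compiler's table holds addresses of code blocks inside one function rather than separate functions, and every name and return value below is hypothetical.

```c
typedef int (*case_handler)(void);

static int handle_case0(void)   { return 100; }
static int handle_case1(void)   { return 101; }
static int handle_case2(void)   { return 102; }
static int handle_case3(void)   { return 103; }
static int handle_default(void) { return -1; }

/* The table defines where execution continues for each case value. */
static const case_handler jump_table[] = {
    handle_case0, handle_case1, handle_case2, handle_case3
};

int dispatch(int i)
{
    /* Out-of-range values go to the default case; compiled code
       usually performs a similar bounds check before the jump. */
    if (i < 0 || i > 3) {
        return handle_default();
    }
    /* The switch variable indexes the table, similar in spirit to an
       indirect jump through table[index * 4] on 32-bit x86. */
    return jump_table[i]();
}
```

Compilers tend to prefer this form when the case values are numerous and contiguous, since one indexed jump replaces a long series of comparisons.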
Arrays are used by programmers to define an ordered set of similar data items
Malware sometimes uses an array of pointers to strings containing multiple hostnames that serve as options for connections
In assembly, arrays are accessed using a base address as a starting point
ecx is used as the index, which is multiplied by 4 to account for the size of the elements
The resulting value is added to the base address of the array to access the proper array element
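The address arithmetic can be spelled out in C. The sketch below assumes 4-byte int elements (hence the multiplication by 4 in the disassembly); the sample data, the placeholder hostnames, and all function names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical sample data. */
static const int demo_array[5] = { 0, 10, 20, 30, 40 };

/* Ordinary indexing, as written in source code. */
int read_by_index(const int *base, size_t index)
{
    return base[index];
}

/* The same access spelled out the way the disassembly computes it:
   element address = base address + index * element size. */
int read_by_address(const int *base, size_t index)
{
    const unsigned char *address =
        (const unsigned char *)base + index * sizeof(int);
    return *(const int *)address;
}

/* An array of pointers to strings, similar in shape to a hostname
   list; these names are placeholders. */
static const char *const demo_hostnames[] = {
    "host-a.example", "host-b.example", "host-c.example"
};

/* Picks a hostname for a given connection attempt, wrapping around. */
const char *hostname_for_attempt(size_t attempt)
{
    size_t count = sizeof demo_hostnames / sizeof demo_hostnames[0];
    return demo_hostnames[attempt % count];
}
```

On 32-bit x86 a pointer is also 4 bytes, so an array of string pointers is indexed with the same multiply-by-4 pattern as an array of ints.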
Structures (structs) are similar to arrays, but comprise elements of different types
Commonly used by malware authors to group information
Accessed with a base address used as a starting pointer
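Access through a base address plus a fixed offset can be sketched with offsetof. The struct layout, the sample values, and the helper name are assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical struct grouping elements of different types. */
struct record {
    int    x[5];
    char   y;
    double z;
};

/* Hypothetical sample instance. */
static const struct record sample_record = { { 1, 2, 3, 4, 5 }, 'A', 2.5 };

/* Reads member y the way compiled code does: start from the base
   address of the struct and add the fixed offset of the member. */
char read_y_via_offset(const struct record *r)
{
    const unsigned char *base = (const unsigned char *)r;
    return *(const char *)(base + offsetof(struct record, y));
}
```

In disassembly it can be hard to tell whether nearby data items belong to one struct or merely sit next to each other; mixed element sizes accessed from a single base address are one hint that a struct is involved.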
Linked list
A data structure that consists of a sequence of data records
Each record includes a field that contains a reference (link) to the next record in the sequence
Benefit over arrays: the order of linked items can differ from the order in which the data items are stored in memory or on disk
To recognize a linked list, first recognize that some object contains a pointer to another object of the same type
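The self-referential pointer is easy to see in a C definition. The following is a minimal sketch of a singly linked list; the node layout and all function names are hypothetical.

```c
#include <stdlib.h>

/* Each record holds data plus a link to another record of the same type. */
struct node {
    int          x;
    struct node *next;
};

/* Allocates a node in front of head; returns NULL if allocation fails. */
struct node *push_front(struct node *head, int value)
{
    struct node *fresh = malloc(sizeof *fresh);
    if (fresh == NULL) {
        return NULL;
    }
    fresh->x = value;
    fresh->next = head;
    return fresh;
}

/* Walks the list by repeatedly loading the next pointer. */
int sum_list(const struct node *head)
{
    int total = 0;
    while (head != NULL) {
        total += head->x;
        head = head->next;
    }
    return total;
}

void free_list(struct node *head)
{
    while (head != NULL) {
        struct node *next = head->next;
        free(head);
        head = next;
    }
}

/* Builds a list holding 1..n, sums it, and frees it; -1 on failure. */
int sum_first_n(int n)
{
    struct node *head = NULL;
    int i, total;

    for (i = 1; i <= n; i++) {
        struct node *fresh = push_front(head, i);
        if (fresh == NULL) {
            free_list(head);
            return -1;
        }
        head = fresh;
    }
    total = sum_list(head);
    free_list(head);
    return total;
}
```

In disassembly the traversal typically shows up as a loop that loads a pointer from a fixed offset inside the current object and then uses that pointer as the new base for the next iteration.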