Recognizing Code in Assembly Language

Notice that the global variable x is changed in memory when eax is moved into dword_40CF60, the address label the disassembler uses for x
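A minimal sketch of C code that would produce this kind of write to a global is shown below. The function names add_globals and add_locals are hypothetical and only serve to contrast the two cases. Globals are usually referenced by a fixed memory address, while locals are usually referenced relative to the stack, though an optimizing compiler may keep either in registers.

```c
/* Globals live at fixed addresses in the data section; a disassembler
   labels them by address (for example, dword_40CF60). */
int x = 1;
int y = 2;

/* Adds the global y to the global x and stores the result back in x. */
int add_globals(void) {
    x = x + y;
    return x;
}

/* The same arithmetic with locals, which are normally kept on the stack. */
int add_locals(void) {
    int lx = 1;
    int ly = 2;
    lx = lx + ly;
    return lx;
}
```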

Disassembling Arithmetic Operations

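int a = 0;
int b = 1;
a = a + 11;   /* addition: a is now 11 */
a = a - b;    /* subtraction: a is now 10 */
a--;          /* decrement: a is now 9 */
b++;          /* increment: b is now 2 */
b = a % 3;    /* modulo: b is now 0 (on x86, idiv leaves the remainder in edx) */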

The resulting disassembly can be broken down instruction by instruction and translated back to C

Recognizing if Statements

Recognizing nested if statements
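As a minimal sketch of the nested if pattern, the hypothetical function classify below takes four distinct paths. In unoptimized x86 output, each test typically appears as a cmp followed by a conditional jump, so nested ifs show up as several such pairs; the exact instructions depend on the compiler and its settings.

```c
/* Returns a different code for each of the four paths through the nested ifs. */
int classify(int x, int y, int z) {
    if (x == y) {          /* outer test */
        if (z == 0) {      /* inner test on the "x equals y" path */
            return 1;      /* x == y and z is zero */
        } else {
            return 2;      /* x == y and z is non-zero */
        }
    } else {
        if (z == 0) {      /* inner test on the "x differs from y" path */
            return 3;      /* x != y and z is zero */
        } else {
            return 4;      /* x != y and z is non-zero */
        }
    }
}
```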

Recognizing Loops

  • The for loop can be recognized by locating the four components

    • Initialization

    • Comparison

    • Execution instructions

    • Increment/decrement
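The four components can be labeled on a small C loop. This is only an illustrative sketch; sum_below is a hypothetical function, not one from the original notes.

```c
/* Sums the integers 0 .. n-1. */
int sum_below(int n) {
    int total = 0;
    int i;
    for (i = 0;       /* 1. initialization */
         i < n;       /* 2. comparison */
         i++) {       /* 4. increment/decrement */
        total += i;   /* 3. execution instructions */
    }
    return total;
}
```

In a disassembly, the comparison is usually the instruction pair that decides whether to leave the loop, and the increment is usually reached by a jump from the end of the body.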

Finding while loops
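A while loop has a comparison and a body but no separate increment section. The sketch below uses a hypothetical function, halvings, to show the shape; in unoptimized output the comparison is typically followed by a conditional jump out of the loop, and the body typically ends with an unconditional jump back to the comparison.

```c
/* Counts how many times n can be halved before it reaches zero. */
int halvings(int n) {
    int steps = 0;
    while (n > 0) {   /* comparison: the loop is left when this is false */
        n /= 2;       /* body */
        steps++;
    }                 /* control returns to the comparison */
    return steps;
}
```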

Understanding Function Call Conventions

  • The calling convention used depends on the compiler

  • Three most common calling conventions

    • cdecl

      • parameters are pushed onto the stack from right to left

      • caller cleans up the stack when the function is complete

      • return value is stored in EAX

    • stdcall

      • requires the callee (the function being called) to clean up the stack when the function is complete

      • standard calling convention for the Windows API

      • Any code calling these functions does not need to clean up the stack; that responsibility falls to the DLLs that implement the code for the API function

    • fastcall

      • first few arguments are passed in registers, with the most commonly used registers being EDX and ECX

      • the calling function is usually responsible for cleaning up the stack, though this varies by compiler (Microsoft's __fastcall has the callee clean up)

      • more efficient because the code doesn't need to involve the stack as much

  • Push vs. Move

    • adder function adds two arguments and returns the result

    • main function calls adder and prints the result using printf
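A minimal sketch of the adder example follows, with variants that request each of the three conventions. The CONV_* macros, the adder_stdcall and adder_fastcall variants, and print_sum (which stands in for the main function described above) are assumptions made for illustration. The __cdecl, __stdcall, and __fastcall keywords are Microsoft-specific and only meaningful on 32-bit x86, so the macros expand to nothing elsewhere.

```c
#include <stdio.h>

/* Hypothetical macros: select a calling convention on 32-bit MSVC only. */
#if defined(_MSC_VER) && defined(_M_IX86)
#define CONV_CDECL    __cdecl
#define CONV_STDCALL  __stdcall
#define CONV_FASTCALL __fastcall
#else
#define CONV_CDECL
#define CONV_STDCALL
#define CONV_FASTCALL
#endif

/* Adds two arguments and returns the result (the return value goes in EAX on x86). */
int CONV_CDECL adder(int a, int b) { return a + b; }

/* Same body, but the callee is asked to clean up the stack. */
int CONV_STDCALL adder_stdcall(int a, int b) { return a + b; }

/* Same body, but the first arguments are requested in registers. */
int CONV_FASTCALL adder_fastcall(int a, int b) { return a + b; }

/* Calls adder and prints the result using printf. */
void print_sum(void) {
    int x = 1;
    int y = 2;
    printf("the function returned the number %d\n", adder(x, y));
}
```

Compiling the same source with different compilers is a useful exercise: one may place the arguments with push instructions while another moves them onto the stack with mov, even though the C code is identical.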

Analyzing switch statements

  • Compiled in two common ways

    • using the if style

    • using jump tables

  • In the if style, a compiled switch statement looks like a group of if statements

  • There may be multiple ways to represent the same code constructs in assembly
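The sketch below shows why the two are hard to tell apart: a small switch and an equivalent chain of if statements, which a compiler may turn into very similar sequences of comparisons and conditional jumps. Both function names are hypothetical.

```c
/* A small switch; with few cases a compiler often emits a series of compares. */
int with_switch(int i) {
    switch (i) {
    case 1:
        return i + 1;
    case 2:
        return i + 2;
    case 3:
        return i + 3;
    default:
        return 0;
    }
}

/* The same logic written as a chain of if statements. */
int with_ifs(int i) {
    if (i == 1) {
        return i + 1;
    } else if (i == 2) {
        return i + 2;
    } else if (i == 3) {
        return i + 3;
    }
    return 0;
}
```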

Using a jump table

  • defines offsets to additional memory locations

  • switch variable is used as an index into the jump table
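The idea can be imitated in C with an array of function pointers. This is a sketch of the concept, not what a compiler literally emits: the handler names and the dispatch function are hypothetical, and a real jump table holds code addresses inside the same function rather than separate functions.

```c
typedef int (*case_handler)(int);

static int case_1(int i) { return i + 1; }
static int case_2(int i) { return i + 2; }
static int case_3(int i) { return i + 3; }
static int case_4(int i) { return i + 4; }

/* The table: each slot holds the location of the code for one case. */
static const case_handler jump_table[] = { case_1, case_2, case_3, case_4 };

int dispatch(int i) {
    /* Normalize so that case 1 maps to index 0. Unsigned arithmetic makes
       values below 1 wrap to large numbers, so one comparison rejects
       everything outside 1..4. */
    unsigned int index = (unsigned int)i - 1u;
    if (index > 3u) {
        return 0;                      /* default case */
    }
    return jump_table[index](i);       /* the switch variable indexes the table */
}
```

Compiled jump tables commonly follow the same outline: adjust the switch variable, check it against the highest case, then jump through the table.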

Disassembling Arrays

  • Used by programmers to define an ordered set of similar data items

  • Malware sometimes uses an array of pointers to strings containing multiple hostnames that serve as options for connections

  • In assembly, arrays are accessed using a base address as a starting point

  • ecx is used as the index, which is multiplied by 4 to account for the size of the elements

  • The resulting value is added to the base address of the array to access the proper array element
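The address calculation can be written out by hand in C. In this sketch, values, fill_values, and element_at are hypothetical names, and sizeof(int) is assumed to be 4, as it is on common x86 compilers.

```c
#include <stddef.h>

int values[5];

/* Stores i in element i, the way a simple initialization loop would. */
void fill_values(void) {
    size_t i;
    for (i = 0; i < 5; i++) {
        values[i] = (int)i;
    }
}

/* Computes base + index * element size explicitly, which is what
   an indexed access such as base[index] does behind the scenes. */
int *element_at(int *base, size_t index) {
    return (int *)((char *)base + index * sizeof(int));
}
```

Calling element_at(values, 3) yields the same address as &values[3].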

Identifying Structs

  • Similar to arrays

  • Comprise elements of different types

  • Commonly used by malware authors to group information

  • Accessed with a base address used as a starting pointer
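A minimal sketch of a struct with members of different types is shown below. The names my_structure and init_structure and the stored values are assumptions for illustration. Each member sits at a fixed offset from the base address of the struct, which is how the accesses appear in a disassembly.

```c
#include <stddef.h>

struct my_structure {
    int    x[5];   /* starts at offset 0 from the base address */
    char   y;      /* follows the array */
    double z;      /* placed after y, usually with alignment padding in between */
};

/* Fills every member through a pointer to the base of the struct. */
void init_structure(struct my_structure *q) {
    int i;
    for (i = 0; i < 5; i++) {
        q->x[i] = i;
    }
    q->y = 'a';
    q->z = 1.5;
}
```

The offsetof macro from stddef.h reports each member's offset, for example offsetof(struct my_structure, z), which can be compared against the constants seen in the disassembly.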

Analyzing Linked List Traversal

  • Linked list

    • A data structure that consists of a sequence of data records

    • Each record includes a field that contains a reference (link) to the next record in the sequence

    • Benefit over arrays: the order of linked items can differ from the order in which the data items are stored in memory or on disk

    • To recognize a linked list, first recognize that some object contains a pointer to another object of the same type
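The sketch below shows the telltale shape: a record type that contains a pointer to its own type, and a traversal loop that repeatedly replaces the current pointer with the next one. The names node, push_front, sum_list, and free_list are hypothetical.

```c
#include <stdlib.h>

struct node {
    int x;               /* the data carried by this record */
    struct node *next;   /* link to another object of the same type */
};

/* Adds a record at the front and returns the new head.
   If allocation fails, the list is returned unchanged. */
struct node *push_front(struct node *head, int value) {
    struct node *n = malloc(sizeof *n);
    if (n == NULL) {
        return head;
    }
    n->x = value;
    n->next = head;
    return n;
}

/* Walks the list by following next until it reaches NULL. */
int sum_list(const struct node *head) {
    int total = 0;
    const struct node *curr;
    for (curr = head; curr != NULL; curr = curr->next) {
        total += curr->x;
    }
    return total;
}

/* Releases every record in the list. */
void free_list(struct node *head) {
    while (head != NULL) {
        struct node *next = head->next;
        free(head);
        head = next;
    }
}
```

In a disassembly, the line curr = curr->next typically appears as a load from a fixed offset of the current pointer back into the same variable, followed by a comparison against zero.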
