Analyzing Malicious Windows Programs

The Windows API

What is the Windows API?
- A broad set of functionality that governs the way that malware interacts with the Microsoft libraries
- Uses its own names to represent C types
- Hungarian Notation
  - used for API function identifiers
  - Uses a prefix naming scheme that makes it easy to identify a variable's type
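
As a rough illustration of how the prefix scheme can be read, the sketch below maps a few common prefixes to the type they hint at. The table is a small, hand-picked subset chosen for this example, and `guess_type` is a hypothetical helper, not part of any Windows tooling.

```python
# A few Hungarian-notation prefixes seen in Windows API parameter names.
# Illustrative subset only, not an exhaustive list.
PREFIXES = {
    "dw": "DWORD (32-bit unsigned integer)",
    "lp": "long pointer",
    "cb": "count of bytes",
    "h": "handle",
    "w": "WORD (16-bit unsigned integer)",
}

def guess_type(name: str) -> str:
    """Return the type hinted at by the longest matching prefix."""
    for prefix in sorted(PREFIXES, key=len, reverse=True):
        rest = name[len(prefix):]
        # The prefix must be followed by an uppercase letter (e.g. dwSize).
        if name.startswith(prefix) and rest[:1].isupper():
            return PREFIXES[prefix]
    return "unknown"
```

For example, `guess_type("dwCreationDisposition")` reports a DWORD, while a name with no recognized prefix falls through to "unknown".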

Handles

What are handles?
- Items that have been opened or created in the OS:
  - Window
  - Process
  - Module
  - Menu
  - File
- Cannot be used in arithmetic operations
- Do not always represent the object's address
- The only thing you can do with a handle is store it and pass it to a later function call to refer to the same object
- Example:
  - CreateWindowEx function - returns an HWND, which is a handle to a window

File System Functions

One of the most common ways that malware interacts with the system is by creating or modifying files
- Distinct filenames, or changes to existing filenames, can make good host-based indicators
Functions for accessing the file system
- CreateFile
  - Used to create and open files
  - Can open existing files, pipes, streams, and I/O devices
  - Can also create new files
  - dwCreationDisposition parameter controls whether the function creates a new file or opens an existing one
- ReadFile and WriteFile
  - Used for reading and writing to files
  - Operate on files as a stream
- CreateFileMapping and MapViewOfFile
  - File mappings are commonly used by malware writers because they allow a file to be loaded into memory and manipulated easily
  - CreateFileMapping - loads a file from disk into memory
  - MapViewOfFile - returns a pointer to the base address of the mapping, which can be used to access the file in memory
  - Malware calling these functions could use the pointer returned from MapViewOfFile to read and write anywhere in the file
  - Handy when parsing a file format
  - After obtaining a map of a PE file, malware can make changes in memory and execute the file as if it had been loaded by the OS loader
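
The idea of reading and writing anywhere in a mapped file can be sketched with Python's `mmap` module, which to my understanding is built on CreateFileMapping and MapViewOfFile when run on Windows. The scratch file contents and the helper name `patch_via_mapping` are assumptions made up for this example.

```python
import mmap
import os
import tempfile

def patch_via_mapping(path: str, offset: int, new_bytes: bytes) -> None:
    """Map a file into memory and overwrite bytes at an arbitrary offset."""
    with open(path, "r+b") as f:
        # Length 0 maps the whole file.
        with mmap.mmap(f.fileno(), 0) as view:
            view[offset:offset + len(new_bytes)] = new_bytes
            view.flush()  # push the change back to the file on disk

# Usage: create a scratch file, patch two bytes in the middle, read it back.
fd, scratch = tempfile.mkstemp()
os.write(fd, b"HEADER....PAYLOAD")
os.close(fd)
patch_via_mapping(scratch, 6, b"XX")
with open(scratch, "rb") as f:
    patched = f.read()
os.remove(scratch)
```

No seek or stream position is involved: the mapped view is indexed like an in-memory buffer, which is why this style is convenient when parsing a file format.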

Special Files

Not accessed by their drive letter and folder
Stealthier than regular ones because they don't show up in directory listings
Provide greater access to system hardware and internal data
Can be passed as strings to any of the file-manipulation functions and operate on a file as if it were a normal file

Shared Files

Special files with names that start with \\serverName\share or \\?\serverName\share
Access directories or files in a shared folder stored on a network
The \\?\ prefix tells the OS to disable all string parsing and allows access to longer filenames

Files Accessible via Namespaces

Namespaces
- Thought of as a fixed number of folders, each storing different types of objects.
- NT Namespace
  - Lowest level namespace is the NT namespace with the \ prefix
  - The NT namespace has access to all devices and all other namespaces exist within the NT namespace
- Win32 device namespace
  - Prefix \\.\
  - Often used by malware to access physical devices directly, and read and write to them like a file
  - Example: \\.\PhysicalDisk1 to directly access Disk1 (ignoring the file system) allowing it to modify it in ways not possible using the API
  - Malware might read and write data to an unallocated sector without creating or accessing files, which allows it to avoid detection by AV and security programs
  - Example: Witty worm
    - Accessed \Device\PhysicalDisk1 via the NT namespace to corrupt its victim's file system
    - Opened it and wrote to a random space on the drive at regular intervals, eventually corrupting the victim's OS and rendering it unable to boot
  - Malware can also access physical memory directly (via \Device\PhysicalMemory), which allows user-space programs to write to kernel space
    - This technique is used by malware to modify the kernel and hide programs in user space
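
When triaging strings pulled from a sample, it can help to label paths by the prefixes described in this section. The following is a minimal sketch; `classify_path` and its labels are hypothetical names chosen for this example.

```python
def classify_path(path: str) -> str:
    """Label a path by its namespace prefix."""
    if path.startswith("\\\\.\\"):
        return "Win32 device namespace"
    if path.startswith("\\\\?\\"):
        return "string parsing disabled"
    if path.startswith("\\\\"):
        return "shared file"
    if path.startswith("\\Device\\"):
        return "NT namespace device"
    return "ordinary path"
```

The order of the checks matters: the two four-character prefixes must be tested before the generic double backslash that marks a network share.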

Alternate Data Streams

Allows additional data to be added to an existing file within NTFS, essentially adding one file to another
Extra data does not show up in a directory listing and it is not shown when displaying the contents of the file; only visible when you access the stream
Named according to the convention normalFile.txt:Stream:$DATA
- Allows a program to read and write to a stream
Malware authors like ADS because it can be used to hide data
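
The naming convention can be taken apart mechanically, which is useful when a path found in a sample might refer to a stream rather than a plain file. This is a small sketch; `split_stream_name` is a hypothetical helper and only handles backslash-separated paths.

```python
def split_stream_name(path: str):
    """Split 'file:stream:type' into (file, stream, type).

    Returns (path, None, None) when no stream is named.
    """
    # Only inspect the last path component so a drive letter such as
    # 'C:' is not mistaken for a stream separator.
    head, sep, leaf = path.rpartition("\\")
    parts = leaf.split(":")
    file_part = head + sep + parts[0]
    stream = parts[1] if len(parts) > 1 and parts[1] else None
    stream_type = parts[2] if len(parts) > 2 else None
    return file_part, stream, stream_type
```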

The Windows Registry

Malware often uses the Registry for persistence or configuration data
Malware adds entries into the registry that will allow it to run automatically when the computer boots
Writing entries to the Run subkey sets up software to run automatically, so malware often uses it to launch itself

Common Registry Functions

Malware uses registry functions that are part of the Windows API to modify the registry to run automatically when the system boots
Common Functions:
- RegOpenKeyEx - opens a registry key for editing and querying
- RegSetValueEx - adds a new value to the registry and sets its data
- RegGetValue - returns the data for a value entry in the registry
If you see these in malware, you need to identify the registry keys they are accessing

Registry Scripting with .reg Files

They are like scripts for changing the registry
Files with a .reg extension contain human-readable registry data.
When a user double-clicks a .reg file, it automatically modifies the registry by merging the information the file contains into the registry
Malware uses .reg files to modify the registry
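
Because .reg files are plain text, an analyst can list what a file would write before anyone double-clicks it. The sketch below handles only simple string values; the sample content (the value name "Updater", its path, and the ExampleVendor key) is invented for illustration.

```python
import re

SAMPLE_REG = r'''Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Run]
"Updater"="C:\\Users\\Public\\updater.exe"

[HKEY_CURRENT_USER\Software\ExampleVendor]
"Config"="abc"
'''

def parse_reg(text: str):
    """Yield (key, value_name, data) for each string value in .reg text."""
    key = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("[") and line.endswith("]"):
            key = line[1:-1]
            continue
        match = re.fullmatch(r'"([^"]+)"="(.*)"', line)
        if key and match:
            yield key, match.group(1), match.group(2)

def autorun_entries(text: str):
    """Return the entries written under a CurrentVersion Run key."""
    return [e for e in parse_reg(text)
            if e[0].lower().endswith("\\currentversion\\run")]
```

Calling `autorun_entries(SAMPLE_REG)` returns only the entry under the Run key, which is the kind of persistence write worth flagging first.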

Networking APIs

Malware relies on network functions to do its dirty work
Malware most commonly uses Berkeley-compatible sockets, implemented primarily in ws2_32.dll

WSAStartup function has to be called before any other networking functions to allocate resources for the networking libraries.
- While debugging code, set a breakpoint on WSAStartup

Server and Client Sides

Server side - maintains an open socket waiting for incoming connections
- Steps:
  - socket
  - bind
  - listen
  - accept
  - send/recv
Client side - connects to a waiting socket
- Steps:
  - socket call
  - connect call
  - send/recv calls
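
The two call sequences can be seen side by side in a loopback exchange. This sketch uses Python's `socket` module, whose calls follow the same Berkeley naming; the port is chosen by the OS and the upper-casing "protocol" is made up for the example.

```python
import socket
import threading

def run_echo_once() -> bytes:
    """Run one server/client exchange on the loopback interface."""
    # Server side: socket, bind, listen, accept, then recv/send.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))      # port 0 lets the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    def serve():
        conn, _addr = server.accept()
        with conn:
            data = conn.recv(1024)
            conn.sendall(data.upper())

    worker = threading.Thread(target=serve)
    worker.start()

    # Client side: socket, connect, then send/recv.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
        client.connect(("127.0.0.1", port))
        client.sendall(b"hello")
        reply = client.recv(1024)

    worker.join()
    server.close()
    return reply
```

Seeing bind, listen, and accept in a sample suggests it waits for incoming connections, while a lone connect suggests it reaches out to a remote host.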

The WinINet API

A higher-level API
Functions are stored in Wininet.dll
Implements protocols like HTTP and FTP at the application layer
You can gain an understanding of what malware is doing based on connections it opens
Functions
- InternetOpen - used to initialize a connection to the Internet
- InternetOpenUrl - used to connect to a URL
- InternetReadFile - allows the program to read the data from a file downloaded from the Internet
Malware can use this to connect to a remote server and get further instructions for execution

Following Running Malware

First and most common way to access code outside a single file is through the use of DLLs

DLLs

Dynamic Link Libraries (DLLs)
- Windows' way to use libraries to share code among multiple applications
- An executable file that does not run alone, but exports functions that can be used by other applications.
- Main advantages
  - Memory used by the DLLs can be shared among running processes
  - When distributing an executable, you can use DLLs that are known to be on the host Windows system without needing to redistribute them
  - DLLs are a useful code-reuse mechanism
  - Maintain a single library of common code and distribute it only when needed.

How Malware Authors use DLLs

To store malicious code
- Store malicious code in a DLL rather than in an .exe file
- Malware sometimes uses DLLs to load itself into another process
By using Windows DLLs
- Functionality needed to interact with the OS
By using third-party DLLs
- Malware can use third-party DLLs to interact with other programs
- Example - use the Mozilla Firefox DLL to connect back to a server, rather than connecting directly through the Windows API

Basic DLL Structure

DLLs use the PE file format
Only a single flag indicates that the file is a DLL
Often have more exports and fewer imports
Other than these there is no real difference between a DLL and an .exe
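
The single flag mentioned above is the IMAGE_FILE_DLL bit (0x2000) in the Characteristics field of the COFF file header. The sketch below reads it from raw bytes; `fake_pe_header` builds synthetic test data for this example and does not produce a loadable file.

```python
IMAGE_FILE_DLL = 0x2000  # Characteristics bit that marks a DLL

def _read_le(image: bytes, offset: int, size: int) -> int:
    """Read a little-endian unsigned integer from the image."""
    return int.from_bytes(image[offset:offset + size], "little")

def is_dll(image: bytes) -> bool:
    """Check the IMAGE_FILE_DLL bit in a PE file's COFF header."""
    if image[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ signature")
    # e_lfanew (offset 0x3C) holds the file offset of the PE signature.
    pe_offset = _read_le(image, 0x3C, 4)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE file: missing PE signature")
    # Characteristics is the last 2-byte field of the 20-byte COFF header,
    # which starts right after the 4-byte signature.
    characteristics = _read_le(image, pe_offset + 4 + 18, 2)
    return bool(characteristics & IMAGE_FILE_DLL)

def fake_pe_header(characteristics: int) -> bytes:
    """Build a minimal synthetic header (illustration only)."""
    dos = bytearray(0x40)
    dos[:2] = b"MZ"
    dos[0x3C:0x40] = (0x40).to_bytes(4, "little")
    coff = bytearray(20)
    coff[0:2] = (0x14C).to_bytes(2, "little")            # Machine: x86
    coff[18:20] = characteristics.to_bytes(2, "little")  # Characteristics
    return bytes(dos) + b"PE\0\0" + bytes(coff)
```
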
DllMain
- Main DLL function
- It has no label
- Is not an export in the DLL, but it is specified in the PE header as the file's entry point
- Function is called to notify the DLL whenever a process
  - Loads or unloads the library
  - Creates a new thread
  - Finishes an existing thread
- This notification allows the DLL to manage any per-process or per-thread resources

Processes

Malware can execute code outside the current program by creating a new process or modifying an existing one
Windows uses processes as containers to manage resources and keep separate programs from interfering with each other
Each process is given a memory space that is separate from all other processes: the set of memory addresses that the process can use
When the process requires memory, the OS allocates it and gives the process an address that it can use to access the memory
Processes can share memory addresses
- Addresses are the same, but the physical memory that stores the data is not the same
A malicious program that accesses a memory address will affect only what is stored at that address for the process that contains the malicious code

Creating a New Process

CreateProcess - most commonly used function by malware to create a new process
- Malware could call this function to create a process that executes its malicious code, bypassing host-based firewalls and other security mechanisms
- Commonly used by malware to create a simple remote shell with just a single function call
- STARTUPINFO parameter
  - includes a handle to the standard input, standard output and standard error streams for a process
  - malicious programs could set these values to a socket, so that when the program writes to standard output, it is really writing to the socket, allowing an attacker to execute a shell remotely without running anything other than the call to CreateProcess
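
The stream redirection described above can be observed safely with a harmless stand-in. The sketch below is a POSIX analog, not the Windows call itself: it uses `subprocess.Popen` instead of CreateProcess, a local socket pair instead of a network socket, and a child that upper-cases one line instead of a shell. It is not expected to work this way under Python on Windows.

```python
import socket
import subprocess
import sys

def run_child_on_socket(line: bytes) -> bytes:
    """Start a child whose stdin and stdout are one end of a socket pair."""
    ours, theirs = socket.socketpair()
    # Harmless stand-in for a shell: read one line, print it upper-cased.
    child_code = "print(input().upper())"
    proc = subprocess.Popen(
        [sys.executable, "-c", child_code],
        stdin=theirs.fileno(),
        stdout=theirs.fileno(),
    )
    theirs.close()            # the child holds its own copy now
    ours.sendall(line + b"\n")
    proc.wait(timeout=30)
    reply = ours.recv(1024)
    ours.close()
    return reply
```

The child never knows it is talking to a socket: it reads standard input and writes standard output as usual, which is exactly why the technique needs no extra code in the spawned program.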

Call to CreateProcess
- creates a new process so that all input and output are redirected to a socket
- Malware often creates a new process by storing one program inside another in the resource section
- When the program runs
  - Extracts the additional executable from the resource section, writes it to disk, and then calls CreateProcess to run the program

Threads

Processes contain threads
Threads are what the Windows OS executes
Threads are independent sequences of instructions that are executed by the CPU without waiting for other threads
Threads within a process all share the same memory space, but each has its own processor registers and stack

Thread Context

Running threads have complete control of the CPU
When an OS switches between threads, all values in the CPU are saved in a structure (thread context)

Creating a Thread

CreateThread function
- Used to create new threads
- Caller specifies a start address, often called the start function
- Execution begins at the start address and continues until the function returns
- Caller of CreateThread can specify the function where the thread starts and a single parameter to be passed to the start function
Malware can use CreateThread in multiple ways
- Load a new malicious library into a process
  - The address of LoadLibrary is specified as the start address
  - The argument passed to CreateThread is the name of the library to be loaded
  - The new DLL is loaded into memory in the process and DllMain is called
- Create two new threads for input and output
  - One listens on a socket or pipe and passes what it receives to the standard input of a process
  - The other reads from standard output and sends that to a socket or pipe
  - The goal is to send all information to a single socket or pipe in order to communicate seamlessly with the running application
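
The start-function-plus-one-parameter shape of CreateThread has a close analog in Python's `threading` module, sketched below. The start function, the `results` list, and the value "example.dll" are hypothetical and exist only for illustration; nothing is actually loaded.

```python
import threading

results = []

def start_function(parameter):
    """Thread entry point: the thread runs until this function returns."""
    results.append(f"loaded {parameter}")

# The caller names the start function and passes it a single parameter.
thread = threading.Thread(target=start_function, args=("example.dll",))
thread.start()
thread.join()   # wait for the start function to return
```

When the start function is LoadLibrary, the single parameter is all that is needed to name the library, which is why this pattern is enough to load a DLL.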

Fibers are like threads, but are managed by a thread, rather than by the OS

Interprocess Coordination with Mutexes

Mutexes
- Also called mutants when in the kernel
- Are global objects that coordinate multiple processes and threads
- Mainly used to control access to shared resources
- Example
  - If two threads must access a memory structure, but only one can safely access it at a time, a mutex can be used to control access
- Only one thread can own a mutex at a time
- Important to malware analysis because they often use hard-coded names, making them good host-based indicators
  - Hard-coded names are common because a mutex's name must be consistent if it is used by two processes
A thread gains access to the mutex with a call to WaitForSingleObject
When a thread is done using a mutex, it calls ReleaseMutex
CreateMutex function
- Creates a mutex
Malware will commonly create a mutex and try to open an existing mutex with the same name to make sure that only one copy of the malware is running at a time
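
The acquire and release pattern can be sketched with a Python lock. This is an analogy only: `threading.Lock` is private to one process and has no name, whereas a Windows mutex is a named global object. The counter and iteration counts are arbitrary values for the example.

```python
import threading

counter = 0
lock = threading.Lock()   # stands in for an unnamed mutex

def worker(iterations: int) -> None:
    global counter
    for _ in range(iterations):
        lock.acquire()      # analogous to WaitForSingleObject on the mutex
        try:
            counter += 1    # only one thread is inside this section at a time
        finally:
            lock.release()  # analogous to ReleaseMutex

threads = [threading.Thread(target=worker, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With the lock held around every update, the final count equals the total number of increments across all four threads.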

Services

Service
- Another way for malware to execute additional code
- Services run as background applications
- Scheduled and run by the Windows service manager without user input
- Advantages for malware writers
  - Services are normally run as SYSTEM or another privileged account
  - SYSTEM account has more access than administrator or user accounts
  - Provide another way to maintain persistence on a system
  - Users wouldn't find anything suspicious, because malware is not running in a separate process
Key Windows API functions related to services:
- OpenSCManager
  - Returns a handle to the service control manager
  - Used for all subsequent service-related function calls
  - Any code that interacts with services will call this function
- CreateService
  - Adds a new service to the service control manager
  - The caller can specify whether the service will start automatically at boot time or has to be started manually
- StartService
  - Starts a service
  - Used only if the service is set to be started manually
Most common service types used by malware
- WIN32_SHARE_PROCESS
  - Stores the code for the service in a DLL
  - Combines several different services in a single, shared process
- WIN32_OWN_PROCESS
  - Stores the code in an .exe file and runs as an independent process
- KERNEL_DRIVER
  - Used for loading code into the kernel
Information about services is stored in the registry under HKLM\SYSTEM\CurrentControlSet\Services
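
Each service subkey in the registry carries numeric Type and Start values. The sketch below decodes the ones relevant to the service types listed above; `describe_service` is a hypothetical helper, and the tables cover only the commonly documented values.

```python
SERVICE_TYPES = {
    0x01: "KERNEL_DRIVER",
    0x10: "WIN32_OWN_PROCESS",
    0x20: "WIN32_SHARE_PROCESS",
}
START_TYPES = {
    0: "boot",
    1: "system",
    2: "automatic",
    3: "manual (on demand)",
    4: "disabled",
}

def describe_service(type_value: int, start_value: int) -> str:
    """Turn raw Type and Start registry values into a readable summary."""
    kinds = [name for bit, name in SERVICE_TYPES.items() if type_value & bit]
    kind = "|".join(kinds) or f"other (0x{type_value:X})"
    start = START_TYPES.get(start_value, f"unknown ({start_value})")
    return f"{kind}, start={start}"
```

A service with Type 0x20 and Start 2, for instance, is a shared-process service that starts automatically, a combination worth a closer look when the DLL behind it is unfamiliar.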

SC Program

Used to investigate and manipulate services
Commands for adding, deleting, starting, stopping and querying services

The Component Object Model

An interface standard that makes it possible for different software components to call each other's code without knowledge of specifics about each other
Works with any programming language
Designed to support reusable software components
Implemented as a client/server framework
Each thread that uses COM has to call the OleInitialize or CoInitializeEx function at least once prior to calling any other COM library functions

CLSIDs, IIDs, and the Use of COM Objects

COM objects are accessed via
- Globally Unique Identifiers (GUIDs)
- Class Identifiers (CLSIDs)
- Interface Identifiers (IIDs)
CoCreateInstance function
- Used to get access to COM functionality
Navigate function
- Common function used by malware
- Allows a program to launch Internet Explorer and access a web address
Interfaces are identified with a GUID called an IID, and classes are identified with a GUID called a CLSID

The OS uses information in the registry to determine which file contains the requested COM code when a program calls CoCreateInstance
To identify what a malicious program is doing when it calls a COM function, malware analysts have to determine which offset a function is stored at
One strategy for identifying the function called by a COM client is to check the header files for the interface specified in the call to CoCreateInstance
Some COM objects are implemented as DLLs - loaded into the process space of the COM client executable
When a COM object is set up to be loaded as a DLL, the registry entry for the CLSID includes the subkey InprocServer32 rather than LocalServer32
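
Two small conversions come up repeatedly when following a CoCreateInstance call: turning the 16 raw GUID bytes found in a binary into the string form used in the registry, and turning a byte offset into a function-table slot. The helpers below are a sketch with hypothetical names; the default pointer size of 4 assumes a 32-bit sample.

```python
import uuid

def guid_from_binary(raw: bytes) -> str:
    """Format 16 bytes laid out as a GUID structure as a registry-style string."""
    # The first three GUID fields are stored little-endian in memory,
    # which is what the bytes_le form expects.
    return "{" + str(uuid.UUID(bytes_le=raw)).upper() + "}"

def clsid_registry_path(clsid: str) -> str:
    """Registry location that describes where a COM class's code lives."""
    return "HKEY_CLASSES_ROOT\\CLSID\\" + clsid

def vtable_index(offset: int, pointer_size: int = 4) -> int:
    """Convert a byte offset into an interface's function table to a slot index."""
    return offset // pointer_size
```

As a usage example, the bytes `01 DF 02 00 00 00 00 00 C0 00 00 00 00 00 00 46` decode to the CLSID documented for Internet Explorer, and an offset of 0x2C in a 32-bit interface table is slot 11, which can then be counted off in the interface's header file.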

COM Server Malware

Malware can implement a malicious COM server that can then be used by other applications
Browser Helper Objects (BHOs)
- provide common COM server functionality for malware
- Third-party plug-ins for Internet Explorer
- No restrictions, so malware authors use them to run code inside the IE process
- This allows them to monitor Internet traffic, track browser usage, and communicate with the Internet without running their own process
Malware that implements a COM server is usually easy to detect because it exports several functions
- DllCanUnloadNow
- DllGetClassObject
- DllInstall
- DllRegisterServer
- DllUnregisterServer

Exceptions: When Things Go Wrong

Exceptions
- Allow a program to handle events outside the flow of normal execution
- Caused by errors
- When they happen, execution transfers to a special routine that resolves the exception
- When an exception occurs, Windows looks in fs:0 for the stack location that stores the exception information and then the exception handler is called
- After the exception is handled, execution returns to the main thread
Structured Exception Handling (SEH)
- Windows mechanism for handling exceptions
- SEH information is stored on the stack

If the exception handler for the current frame does not handle an exception, it's passed to the exception handler for the caller's frame
If none of the exception handlers responds to an exception, the top-level exception handler crashes the application
Exception handlers can be used in exploit code to gain execution
- A pointer to exception-handling information is stored on the stack
- During a stack overflow, an attacker can overwrite the pointer
- By specifying a new exception handler, the attacker gains execution when an exception happens
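
The frame-by-frame hand-off between handlers can be modeled in a few lines. This is a toy model in Python, not real SEH records: the frame names and exception labels are invented, and each "handler" is just the set of exceptions a frame is willing to handle.

```python
def dispatch(exception: str, handler_chain) -> str:
    """Walk handlers from the current frame outward until one accepts the exception."""
    for frame_name, handles in handler_chain:
        if exception in handles:
            return f"handled by {frame_name}"
    # No frame claimed it: the top-level handler ends the program.
    return "unhandled: application terminated"

# Innermost frame first, then its callers.
chain = [
    ("parse_record", {"divide_by_zero"}),
    ("load_file", {"access_violation"}),
    ("main", set()),
]
```

The model also hints at why overwriting a handler pointer is attractive to an attacker: whichever entry sits first in the chain is consulted first when an exception occurs.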

Kernel vs User Mode

User Mode
- Each process has its own memory, security permissions, and resources
- When a program executes an invalid instruction and crashes, Windows can reclaim all the resources and terminate the program.
- Cannot access hardware directly
- Restricted to only a subset of all the registers and instructions available on the CPU
- Relies on the Windows API to manipulate hardware or change the state in the kernel
  - Presence of SYSENTER, SYSCALL, INT 0x2E instructions in disassembly indicates that a call is being made into the kernel
Kernel Mode
- All processes running in the kernel share resources and memory addresses
- Kernel code has fewer security checks
  - If the code contains invalid instructions, then the OS cannot continue running, resulting in the famous Windows BSoD
- Code running in kernel can manipulate code running in user space, but code running in user space can affect the kernel only through well-defined interfaces
- Most security programs (AV and Firewalls) run in kernel mode
- Malware running in kernel mode can more easily interfere with security programs or bypass firewalls
- OS's auditing features don't apply to the kernel
- Nearly all rootkits use code running in the kernel
  - Only sophisticated malware runs in the kernel
  - Most malware has no kernel component

The Native API

Lower-level interface for interacting with Windows that is rarely used by non-malicious programs
Bypasses the normal Windows API

User applications get access to user APIs like kernel32.dll and other DLLs which call ntdll.dll
ntdll.dll
- a special DLL that manages interactions between user space and the kernel
- ntdll functions use APIs and structures just like the ones used in the kernel
- functions make up the Native API
- Programs are not supposed to call the Native API but nothing in the OS prevents them from doing so
Calling the Native API is attractive for malware because
- it allows them to do things that might not otherwise be possible
- Additional functionality that is not exposed in the regular Windows API
- Stealthier

Native API calls that provide information about the system, processes, threads, handles and other items
- NtQuerySystemInformation
- NtQueryInformationProcess
- NtQueryInformationThread
- NtQueryInformationFile
- NtQueryInformationKey
- NtContinue
  - Native API function popular with malware authors
  - Meant to transfer execution back to the main thread of a program after an exception has been handled
  - Location to return to is specified in the exception context and it can be changed
  - Malware often uses this function to transfer execution in complicated ways to confuse an analyst and make a program more difficult to debug
Native applications
- Applications that do not use the Win32 subsystem
- Issue calls to the Native API only
- Rare for malware but almost nonexistent for non-malicious software, so native applications are likely malicious
- Subsystem in the PE header indicates if a program is a native application
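
The Subsystem field can be read directly from the optional header, where a value of 1 marks a native application. The sketch below parses raw bytes; `fake_pe` builds a synthetic header for this example only and is not a loadable executable.

```python
SUBSYSTEMS = {1: "native", 2: "Windows GUI", 3: "Windows console"}

def _read_le(image: bytes, offset: int, size: int) -> int:
    """Read a little-endian unsigned integer from the image."""
    return int.from_bytes(image[offset:offset + size], "little")

def pe_subsystem(image: bytes) -> str:
    """Return a readable name for the Subsystem field of a PE image."""
    pe_offset = _read_le(image, 0x3C, 4)
    if image[:2] != b"MZ" or image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE file")
    optional_header = pe_offset + 4 + 20   # signature + COFF header
    # Subsystem sits at offset 68 of the optional header in PE32 and PE32+.
    value = _read_le(image, optional_header + 68, 2)
    return SUBSYSTEMS.get(value, f"other ({value})")

def fake_pe(subsystem: int) -> bytes:
    """Synthetic header for illustration only."""
    dos = bytearray(0x40)
    dos[:2] = b"MZ"
    dos[0x3C:0x40] = (0x40).to_bytes(4, "little")
    coff = bytearray(20)
    coff[16:18] = (0xE0).to_bytes(2, "little")       # SizeOfOptionalHeader
    optional = bytearray(0xE0)
    optional[0:2] = (0x10B).to_bytes(2, "little")    # PE32 magic
    optional[68:70] = subsystem.to_bytes(2, "little")
    return bytes(dos) + b"PE\0\0" + bytes(coff) + bytes(optional)
```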
