Static Analysis

Analysis of a sample without executing it.

Sample: the file under investigation, analyzed to determine whether it is malicious or benign.

Approaches:
  1. Basic Static Analysis: does not examine the sample at code level (source code or assembly). It looks at metadata and embedded data.
  2. Advanced Static Analysis: examines the code itself, typically the disassembled assembly instructions.

Basic Static Analysis Methodology

  1. File Identification and Classification
  2. Scanning
  3. File Format Analysis
  4. Identifying Obfuscation

File Identification and Classification

What is the use? It helps to uniquely identify the sample and to understand how it works. A sample can be identified and classified:
  - Based on file type: PE, PDF, DOCX.
  - Based on file hashes: MD5, SHA-1, fuzzy hash, ImpHash, section hash.
  - Based on strings embedded within the file (strings can be encoded).

File Categories:
  1. ASCII (plain text): HTML, XML.
  2. Structured files (binary): a hex view reveals the file headers (magic bytes) that help identify the type.

Online reference for file signatures: https://www.garykessler.net/library/file_sigs.html
Linux command: file <filename>

Masquerading techniques can be identified with Resource Hacker, for example an exe that uses a PDF icon to appear harmless.
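The header check described above can be sketched in a few lines of Python. This is a minimal illustration, not a replacement for the `file` command: the signature table is a small hand-picked subset, and the names `MAGIC_SIGNATURES`, `identify_bytes` and `identify_file` are hypothetical.

```python
# Hypothetical, deliberately tiny signature table (magic bytes -> label).
MAGIC_SIGNATURES = [
    (b"MZ", "PE executable (MZ header)"),
    (b"%PDF-", "PDF document"),
    (b"PK\x03\x04", "ZIP container (also used by DOCX)"),
    (b"\x7fELF", "ELF executable"),
]


def identify_bytes(data: bytes) -> str:
    """Return a coarse file type based on the leading magic bytes."""
    for magic, label in MAGIC_SIGNATURES:
        if data.startswith(magic):
            return label
    return "unknown"


def identify_file(path: str) -> str:
    """Read only the first bytes of a file and classify them."""
    with open(path, "rb") as handle:
        return identify_bytes(handle.read(16))


print(identify_bytes(b"MZ\x90\x00"))   # header of a typical exe
print(identify_bytes(b"%PDF-1.7\n"))   # header of a PDF
```

Because the check reads the content and ignores the file name, an exe renamed to `invoice.pdf` would still be reported as a PE executable.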

Hashes

Hashes make it possible to check whether the sample is already known malware and whether it belongs to a known family or group. Hashing produces a hash signature for the sample, which is compared against a database of hashes of other samples to see if it has been seen before. If so, the malware is identified and the previous investigation guides remediation. If not, and further analysis finds the sample malicious, the hash is used as an Indicator of Compromise (IOC).

Tools:
  - CLI: md5sum, sha1sum, sha256sum
  - GUI: HashMyFiles, Hasher
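The same digests can also be computed programmatically. A minimal sketch using Python's standard `hashlib` module follows; the helper names `hash_bytes` and `hash_file` are illustrative, not part of any tool listed above.

```python
import hashlib

ALGORITHMS = ("md5", "sha1", "sha256")


def hash_bytes(data: bytes) -> dict:
    """Return hex digests of an in-memory buffer for each algorithm."""
    return {name: hashlib.new(name, data).hexdigest() for name in ALGORITHMS}


def hash_file(path: str, chunk_size: int = 65536) -> dict:
    """Hash a file in chunks so large samples need not fit in memory."""
    hashers = {name: hashlib.new(name) for name in ALGORITHMS}
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


for name, digest in hash_bytes(b"example sample bytes").items():
    print(name, digest)
```

The resulting values are what would be searched for in a hash database or recorded as an IOC.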

Drawback: changing even a single bit in the binary changes the hash, so it is harder to identify the same malware after minimal changes. Malware that mutates this way while keeping its original algorithm intact is called polymorphic malware.

How to handle polymorphic malware? Fuzzy hashing: mainly (but not only) used to compare samples. The sample is divided into segments and a hash is calculated for each one. The more segment hashes two files share, the more similar they are.

Fuzzy hashing basics

Tools for fuzzy hashing: ssdeep, a CLI tool for fuzzy hash checks. https://ssdeep-project.github.io/ssdeep/index.html
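To make the segment idea concrete, here is a toy sketch in Python. It is a simplification and not ssdeep's algorithm: it uses fixed-size blocks and a plain set comparison, whereas ssdeep chooses segment boundaries from the content itself. The function names and the 64-byte block size are assumptions made for illustration.

```python
import hashlib


def segment_hashes(data: bytes, block_size: int = 64) -> set:
    """Hash each fixed-size block and return the set of truncated digests."""
    return {
        hashlib.sha256(data[i:i + block_size]).hexdigest()[:16]
        for i in range(0, len(data), block_size)
    }


def similarity(a: bytes, b: bytes, block_size: int = 64) -> float:
    """Share of segment hashes the two inputs have in common (0.0 to 1.0)."""
    hashes_a = segment_hashes(a, block_size)
    hashes_b = segment_hashes(b, block_size)
    if not hashes_a and not hashes_b:
        return 1.0
    return len(hashes_a & hashes_b) / len(hashes_a | hashes_b)


# 1024 bytes made of 16 distinct 64-byte blocks.
original = b"".join(i.to_bytes(4, "big") for i in range(256))
# Same data with a single byte changed in the first block.
mutated = b"\xff" + original[1:]

print(hashlib.sha256(original).hexdigest() == hashlib.sha256(mutated).hexdigest())
print(similarity(original, mutated))
```

The full-file SHA-256 values differ completely, while the segment comparison still reports a high similarity because only one block changed. Fixed-size blocks stop working as soon as bytes are inserted or removed, since every later block shifts; that limitation is one reason real fuzzy hashing tools pick boundaries differently.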

Other hashes:
  - Import Hash (ImpHash)
  - Section Hash

Strings

A string is a data type consisting of a sequence of characters. It is mostly stored as an array of characters, but can also be defined in other structures such as a record. In C-style strings, the last character of the array is the null terminator, written with the escape sequence "\0" in character representation and "00" in hexadecimal representation.

Main encoding of strings: ASCII. It defines 128 characters using 7 bits, but is stored in 1 byte; 8-bit extensions of ASCII cover 256 characters.

Every byte in a file can be interpreted as a character, but two types are distinguished:
  1. Printable: characters that a human can read directly.
  2. Non-printable: for example Line Feed (LF) "\n" and Carriage Return (CR) "\r". They have no visible glyph in character representation, but in hexadecimal representation they have the values "\x0A" and "\x0D".

Unicode strings:
  - Use the same NUL value for termination ("\x00"); with 2-byte characters the terminator is two zero bytes.
  - Each character is typically represented by 2 bytes. These are known as "wide chars" and are encoded as UTF-16.
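The difference between the two encodings is easiest to see in the raw bytes. A short Python illustration, using an arbitrary example string:

```python
text = "cmd.exe"

# ASCII: one byte per character, followed by a single NUL terminator.
ascii_bytes = text.encode("ascii") + b"\x00"

# UTF-16 little-endian ("wide chars"): two bytes per character here,
# followed by a two-byte NUL terminator.
wide_bytes = text.encode("utf-16-le") + b"\x00\x00"

print(ascii_bytes.hex(" "))  # 63 6d 64 2e 65 78 65 00
print(wide_bytes.hex(" "))   # 63 00 6d 00 64 00 2e 00 65 00 78 00 65 00 00 00

# Non-printable control characters only show up in the hex view.
print("\n".encode("ascii").hex(), "\r".encode("ascii").hex())  # 0a 0d
```

For ASCII-range text the wide form is the same bytes interleaved with zeros, which is why a tool that only looks for runs of printable bytes misses it.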

Extract both ASCII and Unicode strings so that strings stored in either encoding are captured.

Why strings? They can reveal:
  - Internal/external messages used
  - Functions referenced/invoked
  - Sections used by the sample
  - IPs/domains used
  - Error messages
  - Others

Tools:
  - CLI: strings, bstrings, FLOSS, StringSifter
  - GUI: BinText
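What these tools do at their core can be approximated with two regular expressions. The sketch below is a rough stand-in for `strings`, assuming a minimum length of 4 characters and handling only ASCII and UTF-16LE text in the printable ASCII range; `extract_strings` is a hypothetical helper name.

```python
import re


def extract_strings(data: bytes, min_len: int = 4):
    """Return (ascii_strings, wide_strings) found in a byte buffer."""
    # Runs of printable ASCII bytes (space through tilde).
    ascii_re = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    # Runs of printable ASCII bytes each followed by a zero byte (UTF-16LE).
    wide_re = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    ascii_hits = [m.group().decode("ascii") for m in ascii_re.finditer(data)]
    wide_hits = [m.group().decode("utf-16-le") for m in wide_re.finditer(data)]
    return ascii_hits, wide_hits


# Synthetic buffer: binary noise, an ASCII URL, and a wide-char DLL name.
blob = (
    b"\x01\x02http://evil.example\x00\xff\xfe"
    + "kernel32.dll".encode("utf-16-le")
    + b"\x00\x00abc\x00"
)
ascii_found, wide_found = extract_strings(blob)
print(ascii_found)  # ['http://evil.example']
print(wide_found)   # ['kernel32.dll']
```

The short run `abc` is dropped by the length threshold, which is the usual trade-off: a lower threshold finds more real strings but also more random byte runs.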

Scanners and Sandboxes

A scanner scans a file and decides whether it is benign or malicious (anti-virus). Scanners can be offline or online; offline scanners rely on a local signature database. Online examples: VirusTotal, Hybrid-Analysis.

Cons of online scanners: the file and its scan results become accessible to the public. Scanning new malware creates a public record of its hash, so if researchers use online scanners during an ongoing investigation, the attacker can learn that the malware is being investigated.

Offline scanners utilize a sandbox: an isolated, controlled environment for running programs that mitigates the risks involved with malicious programs, for example a virtual machine.

Sandbox for Malware Analysis: Cuckoo

Open source offline scanner. Download VM: https://github.com/ashemery/CuckooVM

Other scanners and sandboxes:
  - https://sandbox.anlyz.io/
  - https://app.any.run/
  - https://valkyrie.comodo.com/
  - http://www.hybrid-analysis.com/
  - https://www.intezer.com/
  - https://www.secondwrite.com/
  - https://vicheck.ca/
  - AMIRA
  - https://github.com/rshipp/awesome-malware-analysis#online-scanners-and-sandboxes
  - https://linuxsecurity.expert/tools/cuckoo-sandbox/alternatives/
  - DeepFreeze

Portable Executable File Format Analysis

PE File Structure: the PE file tells the OS what is required to run the program, including (not limited to):
  - Memory required
  - Memory permissions
  - Program location in memory
  - Libraries/functions required
  - Execution start in the loaded address space
All these requirements are defined in the PE structure.

Most malware and programs are not self-contained and use libraries available on the target to run functions. This makes the libraries and functions used by the sample an important part of malware analysis.

Libraries: contain functions (code) that an application can reference and execute. They are useful because code can be shared between multiple programs instead of being rewritten for every program. For example, printing to stdout uses the standard library rather than newly written print code in every program.

Microsoft libraries: the shared library implementation is called a Dynamic-Link Library (DLL). DLLs use the PE file format, like exe files.

Compiling a program:

  1. Compiler compiles the source file (code) into an object file (binary file) (compiled code)

  2. The Linker links the library files mentioned in the source file with the object file to produce the executable file.

Types of Linking:

  1. Static Linking: Resolves library requirements and copies the required library code into the executable.

  2. Dynamic Linking:

    1. Implicit: Libraries referenced in the program are linked but not added to the executable. At execution time, the OS loads the linked libraries and resolves their memory addresses for the program. A PE section holds the imported libraries and referenced functions, e.g. .rdata

    2. Explicit: The library is not linked to the program and does not appear in its imports. Linking is done within the code by the developer, so the OS does not need to load the library when execution starts. Whenever a library is required, the developer's code loads the DLL and then gets the address of the required function. For example: LoadLibrary() loads the DLL, GetProcAddress() returns the function address.
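Explicit linking can be imitated from Python with the standard `ctypes` module, which loads a library at run time and looks functions up by name. This is only an analogy to the LoadLibrary()/GetProcAddress() pattern in native code, sketched under two assumptions: on Windows it resolves `GetCurrentProcessId` from kernel32, and on other platforms it falls back to `getpid` from the already loaded C library. The function name `pid_via_runtime_lookup` is made up for this example.

```python
import ctypes
import os
import sys


def pid_via_runtime_lookup() -> int:
    """Resolve a 'get process id' function by name at run time and call it."""
    if sys.platform == "win32":
        # Load the DLL at run time, then look the function up by name.
        kernel32 = ctypes.WinDLL("kernel32")
        get_pid = kernel32.GetCurrentProcessId
        get_pid.restype = ctypes.c_ulong  # DWORD
    else:
        # POSIX analogue: open the running process image, look up getpid.
        libc = ctypes.CDLL(None)
        get_pid = libc.getpid
        get_pid.restype = ctypes.c_int
    return get_pid()


# The dynamically resolved function agrees with Python's own os.getpid().
print(pid_via_runtime_lookup() == os.getpid())
```

For the analyst, the point is that the function name exists only as a string until run time, so it may show up in string extraction even though it is absent from the import list.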

The PE file format is derived from the COFF (Common Object File Format) specification used in Unix executables. Some file types using the PE file format:
  - .exe, .scr: executed via explorer.exe (Windows Shell).
  - .dll: requires a program/service to load and run it.

The type of file is defined at compile time.

Headers and Sections: a PE file is defined by these two. They can be inspected with PEView. Main file headers:

  1. MS-DOS Header: Located at the beginning of the PE file (offset zero) and starts with the magic (signature) value "MZ", which reads as 0x5A4D as a little-endian 16-bit value.

  2. Signature (Image Only)

  3. COFF File Header

  4. Optional Header

MS-DOS Header

MS-DOS header in PEView
MS-DOS header definition.
If the executable is started under MS-DOS, where it cannot run, the MS-DOS stub displays an error message. The stub can be replaced at link time using the /STUB linker option.
As per the DOS header, the e_lfanew field is defined at offset 0x3C. Checking that offset in this sample shows that e_lfanew holds the value 0x108, which is the file offset of the PE header.
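The same lookup can be done by hand with Python's `struct` module. The sketch below validates the "MZ" magic, reads the 4-byte little-endian value at offset 0x3C, and checks that the "PE\0\0" signature sits at the offset it points to. No real sample is bundled here, so `build_fake_pe` creates a synthetic buffer with the PE offset set to 0x108 as in the example above; both function names are hypothetical.

```python
import struct

E_LFANEW_OFFSET = 0x3C   # position of the e_lfanew field in the DOS header
DOS_HEADER_SIZE = 0x40   # the MS-DOS header is 64 bytes long


def parse_dos_header(data: bytes) -> int:
    """Validate the MZ and PE signatures and return the e_lfanew value."""
    if len(data) < DOS_HEADER_SIZE or data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (e_lfanew,) = struct.unpack_from("<I", data, E_LFANEW_OFFSET)
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature at e_lfanew")
    return e_lfanew


def build_fake_pe(pe_offset: int = 0x108) -> bytes:
    """Build a zero-filled buffer carrying only the fields checked above."""
    buf = bytearray(pe_offset + 4)
    buf[0:2] = b"MZ"
    struct.pack_into("<I", buf, E_LFANEW_OFFSET, pe_offset)
    buf[pe_offset:pe_offset + 4] = b"PE\x00\x00"
    return bytes(buf)


sample = build_fake_pe()
print(hex(struct.unpack_from("<H", sample, 0)[0]))  # 0x5a4d
print(hex(parse_dos_header(sample)))                # 0x108
```

On a real file the same function would be called with the bytes read from disk, for example `parse_dos_header(open(path, "rb").read())`.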
