Funky malware format found in Ocean Lotus sample
Credit to Author: hasherezade| Date: Fri, 19 Apr 2019 18:37:54 +0000
Recently, at the SAS conference, I talked about “Funky malware formats”: atypical executable formats used by malware that can only be loaded by proprietary loaders. Because such a custom format is not recognized as an executable by AV scanners, malware authors use it to make static detection more difficult.
Using atypical formats may also slow down the analysis process because the file can’t be parsed out of the box by typical tools. Instead, we need to write custom loaders in order to analyze them freely.
Last year, we described one such format in a post about Hidden Bee. This time, we want to introduce you to a malware we discussed at the SAS conference: Ocean Lotus, also known as APT 32, a threat group associated with Vietnam.
Sample
49a2505d54c83a65bb4d716a27438ed8f065c709 – the main executable
Special thanks to Minh-Triet Pham Tran for providing the material.
Overview
The sample comes with two elements, BLOB and CAB, that are both executables in the same unknown format. The custom format was produced by converting a PE file: some artifacts typical of PE files are still visible. However, the header is fully custom, and the way it is loaded bears no resemblance to PE loading. Some of the information from a typical PE (for example, the layout of sections) is not preserved: the sections are shuffled.
Origin
This sample is from June 10, 2017, from the following email:
The title “Sổ tay vấn đề pháp lý cho các nhà hoạt động nhân quyền” translates to: “Handbook of legal issues for human rights activists.” It’s a subject line for a spear phishing campaign targeting Vietnamese activists.
The malicious sample was delivered as an attachment to the email: a zipped executable. The icon tried to imitate a PDF (FoxitPDF reader).
Behavioral analysis
After being run, the sample copies itself into %TEMP%, unpacks, and launches the decoy PDF.
While the user is busy reading the launched document, the dropper unpacks the real payload. It is dropped into C:\ProgramData\Microsoft Help:
The dropper executable is deleted afterwards.
The malware manages to bypass UAC at default level. We can see the application sporder.exe running with elevated privileges.
Persistence is provided by a simple Run key, leading to the dropped script:
The interesting factor is that the sample has an “expiry date” after which the installer no longer runs.
Internals
The main executable sporder.exe is packed with UPX. It imports the DLL SPORDER.dll:
SPORDER.dll imports another of the dropped DLLs, hp6000.dll:
The key malware functionality is, however, not provided by any of the dropped PE files. They are just used as loaders.
As it turns out, the core is hidden in two unknown files: BLOB and CAB.
Custom formats
The files with extensions BLOB and CAB are obfuscated with XOR. After decoding them, we notice some readable strings of code. However, none of them are valid PE files, and we cannot find any of the typical headers.
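The decoding step can be sketched as a plain repeating-key XOR. This is a minimal illustration, not the sample's own routine: the function name `xor_decode` is ours, and the key bytes are deliberately left to the caller, because the actual keys are not reproduced in this post.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode `len` bytes in place with a repeating XOR key.
 * The key is an input: it has to be recovered from the repeating
 * pattern visible in the obfuscated file. */
void xor_decode(uint8_t *buf, size_t len, const uint8_t *key, size_t key_len)
{
    assert(key_len > 0); /* an empty key would cause a division by zero */
    for (size_t i = 0; i < len; i++) {
        buf[i] ^= key[i % key_len];
    }
}
```

Since XOR is its own inverse, running the same function twice with the same key restores the original bytes.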
BLOB
The BLOB file is obfuscated by XOR. We can see the repeating pattern and use it as an XOR key:
As a result, we get the following clear version: 2e68afae82c1c299e886ab0b6b185658
BLOB’s header:
The BLOB file looks like a processed PE file; however, its sections appear to be in swapped order. The first section seems to be .data instead of .text.
We can see visible artifacts from the BZIP library and C++ standard library.
CAB
The CAB file is obfuscated with XOR in a similar way, but with a different key:
When we apply the key, we get an analogous clear version: b3f9a8adf0929b2a37db7b396d231110
This sample also has a custom header, which does not resemble the PE header. However, we found sections inside that are typical for PE files, for example, a manifest.
Loader
As it turned out, both files are loaded by hp6000.dll: 67b8d21e79018f1ab1b31e1aba16d201
The loading function is executed in an obfuscated way: when the DllMain is executed, it patches the main executable that loaded the DLL.
First, the file name of the current module is retrieved. Then, the file is read and the address of the entry point is fetched. Then, the in-memory copy of the module is marked as executable:
Finally, the bytes are patched so that the entry point will redirect back to the appropriate function in the loading DLL:
This is how the entry point of the main module looks after the patch is applied:
We see that the Virtual Address (RVA 0x1210 + DLL loading base) of the function within the DLL is moved to EAX, and then EAX is used as a jump target.
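To make the patch concrete, here is a small sketch that builds such a redirection stub. The opcodes are the standard 32-bit x86 encodings of `mov eax, imm32` (B8) and `jmp eax` (FF E0); whether the sample emits exactly these seven bytes is our assumption. The function names and the DLL base used in the usage note are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EP_PATCH_SIZE 7
#define LOADER_RVA    0x1210u /* RVA of the loader function, as given above */

/* Virtual Address of the loader function for a given DLL loading base. */
uint32_t loader_va(uint32_t dll_base)
{
    return dll_base + LOADER_RVA;
}

/* Write "mov eax, target_va; jmp eax" into out[0..6]. */
void emit_ep_redirect(uint8_t out[EP_PATCH_SIZE], uint32_t target_va)
{
    assert(out != NULL);
    out[0] = 0xB8;                                 /* mov eax, imm32 */
    out[1] = (uint8_t)(target_va & 0xFF);          /* imm32, little-endian */
    out[2] = (uint8_t)((target_va >> 8) & 0xFF);
    out[3] = (uint8_t)((target_va >> 16) & 0xFF);
    out[4] = (uint8_t)((target_va >> 24) & 0xFF);
    out[5] = 0xFF;                                 /* jmp eax */
    out[6] = 0xE0;
}
```

For a DLL that happened to load at the (made-up) base 0x10000000, the stub would be the bytes B8 10 12 00 10 FF E0.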
The function that starts at RVA 0x1210 is a loader for BLOB and CAB:
This redirection works because, when the executable is loaded into memory, all the DLLs in its Import Table are loaded and the DllMain of each is called before the Entry Point of the main module is reached. Execution of the main executable starts only after the DLLs are loaded, and in our case the patched entry point redirects back to the DLL.
Inside the function loading BLOB and CAB:
As you can see, the CAB file is loaded first:
Further, we see this function retrieving some environmental variable. This variable is used to store the state of the application, and is shared between consecutive executions. Depending on this state, one of multiple execution paths can be taken.
The name of the variable is created by concatenating:
- hardcoded string: L"Local\{076B1DB0-2C01-45A5-BD0A-0CF5D6410DCB}"
- the name of the executable
- a local username
The content of the variable may be one of the following: ‘@’, ‘*’, ‘:’. If it is empty, the first value ‘@’ is set. These values are translated to particular states that control the flow.
- ‘@’ -> state 1
- ‘*’ -> state 2
- ‘:’ -> state 3
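The mapping above is small enough to write down directly. The sketch below is only an illustration: the function name is ours, and returning 0 for any other content is our own convention, since the post does not describe how unexpected values are handled.

```c
#include <stddef.h>

/* Translate the stored content into a state number.
 * Empty content is treated as '@', i.e. the initial state. */
int state_from_content(const char *content)
{
    char marker = (content == NULL || content[0] == '\0') ? '@' : content[0];

    switch (marker) {
    case '@': return 1;
    case '*': return 2;
    case ':': return 3;
    default:  return 0; /* not covered by the description above */
    }
}
```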
The main process is restarted on each state change. Finally, in state 3, the loader creates a mutex and loads the file with the BLOB extension.
The mutex name is the same as the variable name, but with a suffix “_M” added:
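A rough sketch of how the two names could be assembled is shown below. It assumes plain concatenation with no separators and in the order listed earlier; the exact form of the executable name (with or without path or extension) is not stated here, so the helper names and the sample arguments in the usage note are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

#define STATE_NAME_PREFIX L"Local\\{076B1DB0-2C01-45A5-BD0A-0CF5D6410DCB}"

/* Build prefix + executable name + user name.
 * Returns the number of wide characters written, or -1 if `out` is too small. */
int build_state_name(wchar_t *out, size_t cap,
                     const wchar_t *exe_name, const wchar_t *user_name)
{
    assert(out != NULL && exe_name != NULL && user_name != NULL);
    int n = swprintf(out, cap, L"%ls%ls%ls",
                     STATE_NAME_PREFIX, exe_name, user_name);
    return (n < 0) ? -1 : n;
}

/* The mutex name is the variable name with "_M" appended. */
int build_mutex_name(wchar_t *out, size_t cap, const wchar_t *state_name)
{
    assert(out != NULL && state_name != NULL);
    int n = swprintf(out, cap, L"%ls_M", state_name);
    return (n < 0) ? -1 : n;
}
```

For example, `build_state_name(buf, 128, L"sporder.exe", L"tester")` followed by `build_mutex_name` would give a name ending in `sporder.exetester_M` under these assumptions.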
While the application runs, we can see the BLOB being loaded in executable form inside the main module’s memory:
By comparing the format that is loaded in memory with the format that is stored on disk, we can see that the beginning and the end of the BLOB are skipped in the loading process. So, we can guess that those parts are headers that contain the information necessary for loading, but not for execution. The header at the beginning of the file will be referred to as Header1, and the one at the end (the footer) as Header2.
Header2 in memory vs. its equivalent on disk:
We also found that some of the addresses were relocated (the new Image Base was added).
Reversing the reversed PE
The files with both extensions CAB and BLOB are loaded by the same function:
The core of the loader is in the following function:
This is the function that we need to analyze in order to make sense out of the custom format.
Let’s take a look at the loading process itself.
First, the first DWORD of Header1 is skipped. Then, we have two DWORDs that are used as an XOR key. Once they are fetched, the rest of the header is decoded.
After applying the key, we get the content of the file in its clear form. The next value from the headers is used in the formula calculating the size for loading the executable part of the module. In the currently analyzed case (the CAB file), it is 0x17000:
So, 0x17000 + 0x2000 is the size of the memory that will be allocated for the payload.
Example (from CAB file):
Then, 0x17000 bytes of the payload are copied, but the beginning containing Header1 (the first 16 bytes) is skipped.
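The allocation and copy steps can be sketched as follows. The 0x2000 padding and the 16-byte Header1 size are the values observed for the CAB file; treating them as constants for every file in this format is our assumption, and the function names are ours.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HEADER1_SIZE  16u     /* bytes skipped at the start of the file */
#define ALLOC_PADDING 0x2000u /* extra space added in the CAB example */

/* Size of the memory region allocated for the payload. */
uint32_t payload_alloc_size(uint32_t payload_size)
{
    return payload_size + ALLOC_PADDING;
}

/* Copy `payload_size` bytes that follow Header1 into `dst`.
 * Returns 0 on success, -1 if the file is too short. */
int copy_payload(uint8_t *dst, const uint8_t *file, size_t file_len,
                 uint32_t payload_size)
{
    assert(dst != NULL && file != NULL);
    if (file_len < HEADER1_SIZE || file_len - HEADER1_SIZE < payload_size) {
        return -1;
    }
    memcpy(dst, file + HEADER1_SIZE, payload_size);
    return 0;
}
```

With the value from the CAB file, `payload_alloc_size(0x17000)` gives 0x19000.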
After the module content is copied, Header2 is used to continue loading.
Looking at Header2, we can see some similarities with Header1. Again, the initial DWORD is skipped, and then we have a value that is used in a formula calculating the size of the memory to be allocated. The new memory region that is being allocated this time is used for the imports that are going to be loaded (the full process will be explained further).
Conceptually, we can divide Header2 into two parts.
First comes a prolog that contains two DWORD values. Example from the currently-analyzed CAB file:
- val[0] = 0x21A0 -> skipped
- val[1] = 0x013D -> val[1]*8+0x400 -> size of the next area to allocate
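As a quick check of the formula, the prolog can be parsed and the size computed like this. It is a sketch that assumes little-endian DWORDs (as on x86); the struct and function names are ours, and the fields are named val0/val1 because the post does not give them names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t val0; /* skipped by the loader */
    uint32_t val1; /* used to size the next allocation */
} header2_prolog_t;

/* Read a little-endian DWORD from a byte buffer. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

header2_prolog_t parse_header2_prolog(const uint8_t *p)
{
    assert(p != NULL);
    header2_prolog_t prolog;
    prolog.val0 = read_le32(p);
    prolog.val1 = read_le32(p + 4);
    return prolog;
}

/* val[1] * 8 + 0x400, as in the formula above. */
uint32_t import_area_size(uint32_t val1)
{
    return val1 * 8u + 0x400u;
}
```

For the CAB file, 0x13D * 8 + 0x400 comes to 0xDE8 bytes.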
Then there is a list of records of a custom type. Each record represents a different piece of information that is necessary for loading the module. They are identified by the type ID that is represented by a DWORD at the beginning of the record.
Relocations
Type 1 stands for relocation. It has one DWORD as an argument. It is an address that needs to be relocated.
typedef struct { DWORD reloc_field; } reloc_t;
We can see how the field is used to relocate the address. Example: filling the address at 0x8590:
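In code, applying one relocation record could look like the sketch below. It assumes that `reloc_field` is an offset into the loaded image and that the fix-up simply adds the new image base to the DWORD stored there; the function names and the base value in the usage note are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static void wr32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)((v >> 8) & 0xFF);
    p[2] = (uint8_t)((v >> 16) & 0xFF);
    p[3] = (uint8_t)((v >> 24) & 0xFF);
}

/* Add `image_base` to the DWORD at offset `reloc_field` of the image.
 * Returns 0 on success, -1 if the offset is out of bounds. */
int apply_reloc(uint8_t *image, size_t image_size,
                uint32_t reloc_field, uint32_t image_base)
{
    assert(image != NULL);
    if (image_size < 4 || reloc_field > image_size - 4) {
        return -1;
    }
    wr32(image + reloc_field, rd32(image + reloc_field) + image_base);
    return 0;
}
```

For instance, a stored value of 0x00001030 relocated against a made-up base of 0x00400000 becomes 0x00401030.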
Entry point
Type 2 stands for an entry point or an exported function. The address it points to is stored on a list in order to be called later, after loading has finished. This record has three DWORD parameters.
typedef struct {
    DWORD count;
    DWORD entry_rva; /* address to be stored (params[1]) */
    DWORD name_rva;  /* name, for exported functions */
} entry_point_t;
Example of the record of type 2:
Address to be stored: params[1] = 0x00001030
By observing the execution flow, we can confirm that indeed the stored entry point of the module is being called later:
Exported functions are stored in the same way, along with their names.
Imports
Type 3 stands for imports. It has four DWORD parameters.
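typedef struct {
    DWORD type;     /* 1-4, see "Meaning of the type field" */
    DWORD dll_rva;  /* address of the DLL name */
    DWORD func_rva; /* address of the import */
    DWORD iat_rva;  /* address in the main module to be filled */
} import_t;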
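Putting the three record types together, a walker over the record list might look like the sketch below. It assumes the records are tightly packed little-endian DWORDs, each starting with its type ID, and that the list ends at the first unknown type ID or at the end of the buffer. The actual termination rule is not described in this post, so treat that part as a guess; all names are ours.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { REC_RELOC = 1, REC_ENTRY = 2, REC_IMPORT = 3 };

typedef struct {
    size_t relocs;
    size_t entries;
    size_t imports;
} record_counts_t;

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Number of DWORD parameters that follow the type ID; 0 if unknown. */
size_t record_param_count(uint32_t type_id)
{
    switch (type_id) {
    case REC_RELOC:  return 1;
    case REC_ENTRY:  return 3;
    case REC_IMPORT: return 4;
    default:         return 0;
    }
}

/* Walk the record list and count each record type.
 * Returns the number of bytes consumed. */
size_t walk_records(const uint8_t *buf, size_t len, record_counts_t *out)
{
    assert(buf != NULL && out != NULL);
    size_t pos = 0;
    out->relocs = out->entries = out->imports = 0;

    while (len - pos >= 4) {
        uint32_t type_id = rd32(buf + pos);
        size_t nparams = record_param_count(type_id);
        if (nparams == 0) {
            break; /* unknown type: assumed end of list */
        }
        size_t rec_size = 4 * (1 + nparams);
        if (len - pos < rec_size) {
            break; /* truncated record */
        }
        if (type_id == REC_RELOC) {
            out->relocs++;
        } else if (type_id == REC_ENTRY) {
            out->entries++;
        } else {
            out->imports++;
        }
        pos += rec_size;
    }
    return pos;
}
```

A real parser would dispatch on the type ID at the same point where this sketch only counts.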
Example of a chunk responsible for encoding imports:
Type: params[0] = 0x00000002, meaning that the function will be imported by name (all the possible values of this field are listed under “Meaning of the type field”).
Address of the DLL: params[1] = 0x0107DA
Address of the import: params[2] = 0x010774
In contrast to PE format, the address of the imported function is not loaded into the main module. Instead, it is written into the separate executable area (in the given example it is written at VA: 0x00240001):
Then, the address at which the import was stored is written back into the main module. The location in the main module that needs to be filled is specified by the last parameter of this record. In the given example, the address params[3] = 0x0000E014 is filled with 0x00240001:
Atypical IAT
The functions from the embedded list are resolved by the loader; however, as mentioned earlier, their addresses are not filled into a normal IAT, as is typical for the PE format. Instead, they are all filled in as a list of jumps stored in a newly allocated memory page.
The import loading function not only fills the address, but also emits the necessary code for the jump:
Meaning of the type field
The import record has a type field that can take one of the following values: 1, 2, 3, 4.
Values 1 and 2 are the most important: they are used for loading the imports. 1 stands for loading by ordinal, 2 for loading by name. The remaining values, 3 and 4, are used to clean up fields that are no longer needed: 3 erases the import name, and 4 erases the DLL name.
When a record of type 3 or 4 occurs, the pointer in the IAT area is still incremented, so as a result we can see some gaps between the function records:
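The four values and the origin of the gaps can be summarized in a few lines. This is only an illustration of the behavior described above; the enum and function names are ours.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum {
    IMPORT_BY_ORDINAL = 1,
    IMPORT_BY_NAME    = 2,
    ERASE_IMPORT_NAME = 3,
    ERASE_DLL_NAME    = 4
} import_kind_t;

/* Returns 1 if a record of this kind resolves a function, 0 otherwise. */
int import_kind_resolves(uint32_t kind)
{
    return kind == (uint32_t)IMPORT_BY_ORDINAL ||
           kind == (uint32_t)IMPORT_BY_NAME;
}

/* Every record advances the pointer in the IAT area, but only kinds 1 and 2
 * store a function there. Count the slots that are left as gaps. */
size_t count_gap_slots(const uint32_t *kinds, size_t n)
{
    assert(kinds != NULL || n == 0);
    size_t gaps = 0;
    for (size_t i = 0; i < n; i++) {
        if (kinds[i] == (uint32_t)ERASE_IMPORT_NAME ||
            kinds[i] == (uint32_t)ERASE_DLL_NAME) {
            gaps++;
        }
    }
    return gaps;
}
```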
Functionality of the custom files
The CAB file is another installer that provides persistence to the whole package by creating a service:
"C:\Windows\system32\wscript.exe" /B /nologo "C:\Users\tester\Desktop\mod\sporder.vbs"
It also generates the VBS script that is dropped:
The CAB file is loaded first, just to install the malware, and then deleted.
All the espionage-related features are performed by the BLOB that is loaded later and kept persistent in the memory of the loader.
In addition to being in a custom format, BLOB is also heavily obfuscated.
We can observe its attempts to connect to one of the CnCs:
- png.eirahrlichmann.com:443
- engine.lanaurmi.com:3389
- movies.onaldest.com:44818
- images.andychroeder.com:80
- png.eirahrlichmann.com:44818
- engine.lanaurmi.com:44818
- movies.onaldest.com:9091
- images.andychroeder.com:9091
- png.eirahrlichmann.com:3389
Some of those domains are known from previous reports on Ocean Lotus, for example the Cylance white paper.
Ocean Lotus: a creative APT
Ocean Lotus often surprises researchers with its creative obfuscation techniques. Recently, a different Ocean Lotus sample was found using steganography to hide its executables (you can read more about it in the ThreatVector report). The format that we described is just one of many unusual forms that their implants can take.
Appendix
Parser for the described format: https://github.com/hasherezade/funky_malware_formats/tree/master/lotus_parser
Presentation from the SAS conference:
The post Funky malware format found in Ocean Lotus sample appeared first on Malwarebytes Labs.