Keeping it real: Sophos and the 2024 MITRE ATT&CK Evaluations: Enterprise

Credit to Author: Michael Wood | Date: Wed, 11 Dec 2024 15:35:22 +0000

Each year, several security solution providers – including Sophos – sign up for MITRE’s ATT&CK Evaluations: Enterprise, a full-scale cyber attack emulation covering one or more scenarios based on real-world threat actors and their tactics, techniques, and procedures (TTPs).

The evaluation is designed to provide a realistic (and transparent – the results are publicly available) appraisal of security solutions’ performances, based on end-to-end attack chains which include initial access, persistence, lateral movement, and impact. Emulations typically include a multi-device ‘customer’ environment, complete with endpoints, servers, domain-joined devices, and Active Directory-managed users.

2024 marked the fourth year of Sophos participating, and to celebrate we wanted to provide some insight into what this year’s assessment entailed, and to show how true to life it actually is. In particular, we’ll dive into the realism of the tooling, nuances in the testing methodology, and Sophos’ protection and detection capabilities. While we can’t cover everything (each scenario has 20-40 steps!), we’ll discuss a selection, highlighting the depth and accuracy of the emulations.

The 2024 threat categories

For the 2024 evaluation, MITRE selected two threat categories, Ransomware and the Democratic People’s Republic of Korea (DPRK). The former, as has been the case for a long time, is one of the biggest cyber security threats in the industry, and continues to evolve (for example, the increase in remote encryption). The latter is also very relevant, given the proliferation of state-sponsored espionage attacks associated with the region.

MITRE built three scenarios around these categories: an attack by a DPRK-affiliated threat actor focused on macOS (threat actors have targeted macOS in several campaigns, a trend that looks set to continue), and attacks by affiliates of two ransomware groups (Cl0p and LockBit).

DPRK

The DPRK scenario was simple but realistic, based on the flow of the JumpCloud supply chain compromise: an attacker compromises a device, establishes a persistent agent, and steals credentials. Threat actors affiliated with the DPRK are known to break their attacks into discrete stages and maintain backdoors for launching future attacks.

Initial access

While the evaluation presumes a supply chain attack, the scenario itself involved a user downloading and executing a malicious Ruby script (our analysis showed a user execution path of Ruby). In a real-world supply chain attack, pre-installed software would likely automatically execute the script. Nevertheless, this is still a plausible and meaningful approach – DPRK-affiliated attackers will use social engineering to convince users to run a script, as recent incidents show.

Just as in the JumpCloud attack, MITRE’s Ruby script (called start.rb, thematically similar to the name of the real script: init.rb) downloads and executes a first-stage C2 agent (a Mach-O binary), masquerading as a docker-related component. It’s worth noting that reverse-engineering genuine JumpCloud samples is not possible; to our knowledge, the real-world samples are not publicly available. As with all MITRE ATT&CK Evaluations, the malware used was custom-built for the assessment.

Persistence

The first-stage C2 agent then downloaded a second-stage backdoor (known as ‘STRATOFEAR’ in the real-world JumpCloud attack), which established persistence in much the same way as the genuine article, via LaunchDaemons (/Library/LaunchDaemons/us.zoom.ZoomHelperTool.plist).

Figure 1: Establishing persistence via ZoomHelperTool.plist

As with the Ruby script in the Initial Access phase, MITRE designed the backdoor to closely emulate the real thing. The backdoor was dropped in the same location (/Library/Fonts), and had a very similar name (the real version was named ArialUnicode.ttf.md5, whereas the evaluation version was pingfang.ttf.md5; both ‘Arial’ and ‘pingfang’ are names of genuine fonts).
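As a rough illustration of how a defender might hunt for this pattern, the sketch below parses a LaunchDaemon property list and flags entries whose program sits in a directory that normally holds no executables. The directory list, the function names, and the sample plist contents (including the label and the idea that the plist points directly at the font-named file) are assumptions made for illustration; they are not taken from MITRE's emulation or from any vendor's detection logic.

```python
import plistlib
from typing import Optional

# Assumed, illustrative list of directories where executables are unexpected.
SUSPICIOUS_DIRS = ("/Library/Fonts/", "/private/tmp/", "/tmp/")


def launchd_program(plist_bytes: bytes) -> Optional[str]:
    """Return the executable path a launchd plist would start, if any."""
    data = plistlib.loads(plist_bytes)
    if "Program" in data:
        return data["Program"]
    args = data.get("ProgramArguments") or []
    return args[0] if args else None


def is_suspicious(plist_bytes: bytes) -> bool:
    """True when the launchd job runs something from an unexpected directory."""
    program = launchd_program(plist_bytes)
    return bool(program) and program.startswith(SUSPICIOUS_DIRS)


# Hypothetical plists, built here only to exercise the check.
sample = plistlib.dumps({
    "Label": "us.zoom.ZoomHelperTool",
    "ProgramArguments": ["/Library/Fonts/pingfang.ttf.md5"],
    "RunAtLoad": True,
})
benign = plistlib.dumps({
    "Label": "com.example.updater",
    "Program": "/usr/local/bin/updater",
})

print(is_suspicious(sample), is_suspicious(benign))
```

A check like this is only a starting point for triage: a legitimate-looking label does not matter here, because the decision is based on where the launched binary lives.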

As in the real JumpCloud attack, the ‘threat actor’ was stealthy and evasive, removing the first-stage implant files from the system very quickly. In the emulation, they achieved this with an rm -f <FILE> command, as our execution path analysis showed. This is noisier than a direct API method, since a process execution is more likely to be logged. We don’t know whether the JumpCloud threat actor used this exact method; as noted previously, the real-world samples are not available, so we can’t confirm it.

Like the genuine STRATOFEAR, the MITRE backdoor used encrypted configuration files, with a shell-out openssl enc -d command and a hardcoded password. Again, using a direct API-based method would be stealthier, but we don’t know if the JumpCloud threat actor took that approach.

A quick note on test safety: For its C2 infrastructure, MITRE uses domains that work within the confines of the test environment, but are not publicly resolvable via DNS. However, they do resolve to public IP addresses. This means that the network traffic looks like genuine C2 activity, but the domains are not reachable outside the test environment.

Impact

As in the JumpCloud attack, the threat actor’s goal is to collect data, including system information, credentials, and sensitive information held in the Keychain. MITRE’s STRATOFEAR backdoor was faithful to the original, in that it downloaded and executed additional modules from the C2 server to carry out the theft. Like the modules downloaded by the real STRATOFEAR, these were written to a .tmp file in the /tmp directory, each named with a string of six random alphanumeric characters.

In the evaluation, MITRE’s STRATOFEAR downloaded /private/tmp/rhkA2f.tmp, a module with the ability to read macOS keychain files.
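The naming convention described above (six alphanumeric characters plus a .tmp extension, written directly under /tmp) lends itself to a simple hunting pattern. The following is a minimal sketch, assuming file-creation paths are available as strings; the function name is hypothetical, and a pattern this broad will also match legitimate temporary files, so it is better treated as a lead than as a verdict.

```python
import re

# Six alphanumeric characters followed by ".tmp", directly under /tmp
# (or /private/tmp, which /tmp points to on macOS).
MODULE_PATH = re.compile(r"(?:/private)?/tmp/[A-Za-z0-9]{6}\.tmp")


def looks_like_module_drop(path: str) -> bool:
    """True when the whole path matches the module naming pattern."""
    return MODULE_PATH.fullmatch(path) is not None


for candidate in ("/private/tmp/rhkA2f.tmp", "/tmp/installer-cache.tmp"):
    print(candidate, looks_like_module_drop(candidate))
```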

Figure 2: The ExecuteModule function in MITRE’s STRATOFEAR sample, using dlopen/dlsym to call an ‘Initialize’ function

This scenario ended with the backdoor collecting the data; the evaluation did not involve any actual exfiltration. While some might call this out as an issue with the methodology – credentials are often only useful if exfiltrated – we would argue that it’s a minor one. If you, as an incident responder, can observe credential theft, you’ll be aware of the potential impact and the associated malicious activity.

Cl0p

The second scenario involved an emulation of an attack by the Cl0p ransomware group (also known as TA505), a prolific threat actor. Here, the flow of the attack closely mimicked – for the most part – that of a 2019 incident, involving a downloader, a persistent RAT, sophisticated process injection, and abuse of a trusted process – ultimately leading to a ransomware payload.

Initial access

While most of the scenario was faithful to the 2019 real-world campaign, the initial access stage was slightly different. As in 2019, the threat actor used a DLL to install a persistent RAT. But whereas the real-world attack involved malicious Office documents containing an embedded DLL, which was loaded dynamically into the Office process, the MITRE scenario involved a user interactively running cmd.exe and executing the DLL via rundll32.exe.

This DLL was already present on the host, having been downloaded via a curl command from a separate interactive cmd.exe (this step was not included in the scenario) following initial access over RDP. It’s worth noting that this method of initial access is very common amongst ransomware groups and other threat actors, particularly when purchasing stolen credentials/access via initial access brokers (IABs). In one very prominent case, however, Cl0p also abused a zero-day vulnerability in the MOVEit file transfer application (CVE-2023-34362).

While it’s very plausible that an attacker would gain direct remote access to the compromised host, the scenario could perhaps have included the ingress of the DLL tooling for a more complete emulation.

Persistence

As in the 2019 campaign, the MITRE ‘threat actor’ loaded the persistent RAT SDBbot by compromising the trusted winlogon.exe process, using Image File Execution Options (IFEO) injection with a ‘VerifierDLLs’ key.

SDBbot uses encrypted strings and a mutex to guard its start-up. As with the DPRK scenario, the MITRE sample used a similar-but-different name for the mutex (‘windows_7_windows_10_check_running_once_mutex’ in the real-world attack, ‘win10x64_check_running_once’ for the evaluation).

Figure 3: Disassembly of MITRE’s SDBbot sample. Note the mutex name and the decryption function

In MITRE’s implementation of SDBbot, the key material is a repeat of the same 16 incrementing bytes from 0 to 15. This is not as secure as a genuinely random 128-byte string – but it’s sufficient to obfuscate the strings used to reference API names and data fields beyond trivial static analysis methods. MITRE used this method of string obfuscation throughout the Cl0p scenario, as well as in the LockBit scenario discussed below.
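To make the weakness of such a key concrete, here is a minimal sketch of repeating-key string obfuscation. It assumes two things the text does not state outright: that the cipher is a plain byte-wise XOR, and that the 16-byte sequence is repeated to fill 128 bytes. The function name and the sample string are illustrative only.

```python
from itertools import cycle

# Key as described in the text: bytes 0..15, here assumed to repeat to 128 bytes.
KEY = bytes(range(16)) * 8


def xor_with_key(data: bytes, key: bytes = KEY) -> bytes:
    """XOR each byte with the repeating key (assumed cipher, see lead-in)."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


obfuscated = xor_with_key(b"GetProcAddress")
recovered = xor_with_key(obfuscated)  # XOR is its own inverse

print(obfuscated)
print(recovered)
```

Under these assumptions, the first byte of every string survives unchanged because the key starts with zero, and the predictable key makes recovery trivial once the scheme is recognised. Even so, the plaintext API names no longer appear in a simple strings dump.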

MITRE’s sample was loaded via a reflective loader, overwriting image memory in setupapi.dll. Since the RAT exists in standard ‘image’ memory, it’s harder to detect than if it were in dynamically-allocated heap memory. This is a sophisticated injection method, designed to evade modern defenses. MITRE’s approach presented another challenge when it came to detecting the activity of the installer (the rundll32 process) dropping the SDBbot loader component. The installer dropped the loader to a %TEMP% location, but created a symbolic link to that path in the SYSTEM folder, and the IFEO registry key was set up to point to the SYSTEM folder path – thereby creating an additional layer of abstraction between the dropper and the persistent RAT.

Figure 4: The symbolic link for the msverload.dll loader

The use of the ‘VerifierDLLs’ method added further complexity to the execution flow, as the loader (msverload.dll) was loaded into the winlogon.exe process space prior to the process’s entry point. It then used VirtualAlloc to inject and execute embedded shellcode, and VirtualProtect to make the otherwise RX image memory of setupapi.dll writeable, before overwriting its contents with the SDBbot RAT. The memory permissions were later reset to RX, in order to make the code look like ‘regular’ image memory – as a DLL would appear when loaded directly from disk.

Figure 5: MITRE’s SDBbot is loaded and overwrites the IMAGE memory of the otherwise legitimate setupapi.dll module, with memory protections reset to PAGE_EXECUTE_READ

Our detection strategy here involved several aspects: it’s suspicious to have C2 activity originating from a winlogon process, and C2 activity in itself is a common memory scan trigger (as we discussed in a blog on this topic in 2023). Memory scans also detected a shellcode pattern. The suspicious C2 event enabled Sophos Detection to capture the data exfiltration behavior, and we noted that the exfiltration method – using SDBbot and sending data over the C2 channel – was adopted by Cl0p in 2020.

Figure 6: Detecting exfiltration during the Cl0p scenario

Impact

MITRE’s implementation of the Cl0p ransomware sample (sysmonitor.exe, downloaded via SDBbot) was modelled very closely on a real-world sample from 2019. Just like the real thing, MITRE’s sample used GetKeyboardLayout to check for layouts used in Russia, Georgia, and Azerbaijan (to avoid targeting any systems using them). It also employed an identical comparison for the GetDC/GetTextCharset APIs, used to achieve the same objective.

Figure 7: MITRE’s Cl0p sample calling GetDC and GetTextCharset to check for infected hosts in Russia, Georgia, or Azerbaijan

We also noted other near-exact matches in behavior and methodology, particularly when it came to how the ransomware dealt with shadow volumes and attempting to kill various services on compromised hosts.

Many ransomware families will attempt to delete shadow volumes, to prevent their targets from restoring data, and then resize the shadow storage, so that no further shadow volumes can be created. However, the 2019 Cl0p ransomware performed the latter step in a specific way, cycling through a hardcoded list of drives (from C to H). MITRE’s sample emulated this behavior exactly.

Figure 8: MITRE’s implementation of Cl0p cycling through various drives to resize the shadow storage
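The drive-cycling behavior is distinctive enough to hunt for in process command-line telemetry. The sketch below is one possible approach, assuming command lines are available as strings and that the resize is issued through vssadmin; the threshold, the function names, and the sample size value are assumptions for illustration, not details confirmed by the evaluation.

```python
import re

# Matches "vssadmin resize shadowstorage ... /for=X:" and captures the drive.
RESIZE = re.compile(
    r"vssadmin(?:\.exe)?\s+resize\s+shadowstorage\b.*?/for=([a-z]):",
    re.IGNORECASE,
)


def resized_drives(command_lines):
    """Collect drive letters targeted by shadow storage resize commands."""
    drives = []
    for line in command_lines:
        match = RESIZE.search(line)
        if match:
            drives.append(match.group(1).upper())
    return drives


def cycles_through_drives(command_lines, threshold=3):
    """Flag activity that resizes shadow storage on several distinct drives."""
    return len(set(resized_drives(command_lines))) >= threshold


# Illustrative command lines; the size value is an assumption.
observed = [
    f"vssadmin resize shadowstorage /for={d}: /on={d}: /maxsize=401MB"
    for d in "cdefgh"
]

print(resized_drives(observed), cycles_through_drives(observed))
```

A single resize on one drive can be routine administration; it is the rapid sweep across a fixed list of drive letters, including ones that may not exist on the host, that stands out.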

Moreover, like many ransomware variants, Cl0p ransomware iterates through a list of various services – including security services and services that may contain key data to be encrypted – and attempts to terminate them via net stop.

MITRE’s sample employed the same list used by the genuine Cl0p ransomware, in the same order, although it excluded security services, presumably to prevent any disruption to the test.

Figure 9: Sophos detection, showing the net stop commands used in MITRE’s Cl0p sample

For its file encryption, the MITRE malware used AES, appending a special marker (“Cl1pCl0p!?”) to the data within the encrypted files. This was a similar approach to the real malware, which used a marker of “Clop^ ”. However, whereas the 2019 samples used the advapi32.dll CryptAcquireContextW API for cryptographic algorithm support, the MITRE version employed the open-source CryptoPP library – a more modern approach used by many ransomware families today.

LockBit

LockBit, like Cl0p, is a prolific ransomware group, albeit one significantly disrupted by law enforcement agencies in February 2024. Nevertheless, due to a LockBit builder leaked in 2022, threat actors continue to deploy its ransomware. MITRE’s LockBit scenario included TTPs known to be used by some LockBit affiliates (as with the Cl0p scenario, it’s worth noting that while the behavior of ransomware binaries will generally be consistent across attacks, since these are developed and distributed centrally, affiliates may have more flexibility in their approaches, and so their playbooks – and subsequent TTPs and IOCs – may differ). These TTPs included the initial access method, the use of ThunderShell and PsExec, and various evasion strategies.

Initial access

The MITRE ‘threat actor’ began their attack by authenticating over an externally-facing TightVNC service (a legitimate remote administration tool), using credentials that had previously been compromised. Ransomware-as-a-Service (RaaS) affiliates commonly obtain initial access in this way, using previously-compromised services and credentials that are sold on cybercrime forums by IABs, as noted earlier with the Cl0p scenario.

Once the attacker gained access, they executed various discovery commands, which aligned with commands that we often observe early on in a RaaS attack, including:

nltest /dclist:<domain>
cmdkey /list
net group "Domain Admins" /domain
net group "Enterprise Admins" /domain
net localgroup Administrators /domain
powershell /c "get-wmiobject Win32_Service |where-object { $_.PathName -notmatch "C:\Windows" -and $_.State -eq "Running"} | select-object name, displayname, state, pathname"

These commands are almost identical to those observed during a 2022 LockBit attack.
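Individually, most of these commands are routine administration; it is the cluster of them in a single session that is telling. As a rough sketch of that idea, the code below counts how many distinct discovery patterns appear in a list of command lines. The patterns are drawn from the commands listed above, while the category names, the threshold, and the sample session (including the domain name) are assumptions for illustration.

```python
import re

# Patterns drawn from the commands listed above; names are illustrative.
DISCOVERY_PATTERNS = {
    "nltest_dclist": re.compile(r"\bnltest\b.*?/dclist:", re.IGNORECASE),
    "cmdkey_list": re.compile(r"\bcmdkey\b\s+/list\b", re.IGNORECASE),
    "net_group": re.compile(r"\bnet\s+(?:local)?group\b", re.IGNORECASE),
    "wmi_services": re.compile(r"get-wmiobject\s+win32_service", re.IGNORECASE),
}


def discovery_hits(command_lines):
    """Return the set of discovery categories seen in the command lines."""
    hits = set()
    for line in command_lines:
        for name, pattern in DISCOVERY_PATTERNS.items():
            if pattern.search(line):
                hits.add(name)
    return hits


def is_discovery_burst(command_lines, threshold=3):
    """Flag a session that touches several discovery categories (assumed threshold)."""
    return len(discovery_hits(command_lines)) >= threshold


# Hypothetical session for demonstration.
session = [
    "nltest /dclist:corp.example",
    "cmdkey /list",
    'net group "Domain Admins" /domain',
    "ipconfig",
]

print(sorted(discovery_hits(session)), is_discovery_burst(session))
```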

The execution of cmd.exe during a remote interactive session was a key indicator of attack here, as was a TightVNC connection and remote interactive logon from a suspicious IP address.

Figure 10: Investigating suspicious activity during the initial access stage

Persistence

To maintain a foothold in the environment, the threat actor then deployed a PowerShell remote access shell known as ThunderShell. As CISA notes, this is a tool known to be used by LockBit affiliates, enabling them to maintain persistence if the initial access method is lost. Here, we were able to monitor recurring network connections to identify ‘beaconing’ behavior, and flag processes and connections deemed suspicious.
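Beaconing tends to show up as connections to the same destination at nearly constant intervals. The following is a minimal sketch of that heuristic, not a description of how any particular product implements it: it measures how much the gaps between connections vary relative to their average. The thresholds, the function name, and the sample timestamps are all assumptions.

```python
from statistics import mean, pstdev


def looks_like_beaconing(timestamps, min_events=5, max_jitter=0.1):
    """Flag connections whose intervals are nearly constant.

    timestamps: connection start times in seconds, in ascending order.
    max_jitter: allowed ratio of interval standard deviation to mean interval.
    """
    if len(timestamps) < min_events:
        return False
    intervals = [b - a for a, b in zip(timestamps, timestamps[1:])]
    average = mean(intervals)
    if average <= 0:
        return False
    return pstdev(intervals) / average <= max_jitter


regular = [0, 60, 121, 180, 241, 300]   # roughly one connection a minute
irregular = [0, 5, 140, 150, 600, 610]  # bursty, human-like traffic

print(looks_like_beaconing(regular), looks_like_beaconing(irregular))
```

Real implants often add deliberate jitter to their check-in interval, so a fixed ratio like this would need tuning and is best combined with other signals, such as the process making the connection.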

The MITRE ‘attacker’ established further persistence through the winlogon automatic logon registry key. This action did deviate slightly from what we would expect in a real-world scenario; in our experience, threat actors typically enumerate those keys to potentially identify plaintext credentials.

Impact

MITRE opted to emulate the bespoke LockBit exfiltration tool StealBit, which RaaS affiliates use to perform double extortion (a technique used by many other ransomware groups) – allowing them to exfiltrate sensitive data to a remote server before it is encrypted.

MITRE’s version of StealBit (named connhost.exe), just like the real thing, checked the PEB “BeingDebugged” flag to detect attached debuggers, and performed dynamic API resolution using LoadLibraryExA and GetProcAddress – with the names of the DLLs to be resolved stored as XOR-obfuscated strings.

After exfiltration, the MITRE ‘threat actor’ deployed an emulated version of the main LockBit executable to encrypt data and self-replicate across the environment.

As with the real-world version, MITRE’s LockBit sample used several evasive techniques, including dynamic API resolution using an in-memory API hashing algorithm (to keep API names hidden from static analysis), and anti-debugging via NtSetInformationThread. We documented both of these methods in our analysis of LockBit 3.0 in 2022, although it’s worth noting that MITRE’s implementation used DJB2 hashing. This differs from the original LockBit approach (a custom implementation using a ROR-based hashing method with a seed key), but the end result is the same, while also preventing the introduction of a known IOC which we and other vendors may have previously detected.

Figure 11: MITRE’s version of LockBit used an implementation of the DJB2 hashing algorithm. This was a complex implementation, and we noted that MITRE seemed to have gone to great lengths to replicate the functionality of the genuine LockBit binary
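For readers unfamiliar with API hashing, the sketch below shows the textbook DJB2 function and how a lookup by hash can stand in for a lookup by name. It illustrates the general technique only: the seed, the 32-bit truncation, and the short export list are assumptions, and MITRE's implementation may differ in details such as case normalization or how module names are handled.

```python
def djb2(name: bytes) -> int:
    """Textbook DJB2: hash = hash * 33 + byte, truncated to 32 bits."""
    value = 5381
    for byte in name:
        value = (value * 33 + byte) & 0xFFFFFFFF
    return value


def resolve_by_hash(target: int, export_names):
    """Return the export whose DJB2 hash matches, mimicking hash-based lookup."""
    for name in export_names:
        if djb2(name.encode("ascii")) == target:
            return name
    return None


# Illustrative export list; a real loader would walk a module's export table.
exports = ["CreateFileW", "NtSetInformationThread", "WriteFile"]
wanted = djb2(b"NtSetInformationThread")

print(hex(wanted), resolve_by_hash(wanted, exports))
```

Because only the hash is stored in the binary, the API name never appears as a string. Analysts can reverse the process by hashing the export names of common system DLLs and matching the results against the constants found in the sample.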

Sophos detected this activity using CryptoGuard, although we should note that as this particular test was running in monitor-only mode, CryptoGuard did not roll back the encryption. In another, separate test, focused on protections, encryption activity resulted in the encrypted files being rolled back to their original state, even during remote encryption emulations.

Figure 12: CryptoGuard thumbprint information showing the detection of ransomware activity and the creation of a ransom note

Conclusion

2024 marked the fourth year that Sophos has participated in MITRE’s ATT&CK Evaluations: Enterprise. As in previous years, the focus on end-to-end attack chains and realism has made the evaluation an extremely worthwhile exercise in assessing our capabilities and those of other vendors. We also welcome MITRE’s emphasis on transparency.

Like any kind of emulation, much of the value of these evaluations comes from how accurate and realistic their scenarios are. While we did note that MITRE’s tests deviated from real-world attacks in a few, minor instances – often due to unavoidable constraints – the overall resemblance to known campaigns and threat actors was very strong.

Transparent, realistic evaluations, in which multiple vendors participate, benefit not only vendors themselves, but also customers, and, as a result, wider society. We look forward to continuing to participate in these evaluations in the future, and to reporting our thoughts and findings wherever possible.
