Content updates and product architecture: Sophos Endpoint

Credit to Author: Matt Wixey | Date: Thu, 15 Aug 2024 16:37:18 +0000

Following on from our recent article on the kernel drivers in Sophos Intercept X, in which we discussed how they’re tested and what they do, we’re providing further transparency into the inner workings of Intercept X – this time with a look at content updates, which are either configuration changes that alter code execution paths or are code themselves.

Intercept X uses a combination of real-time Cloud lookups and on-device content updates. Because the threat landscape is constantly evolving and shifting, it’s crucial that on-device content updates are delivered frequently (some on-device data changes less frequently, but may require updates at short notice). However, this comes with its own risks; if content updates are corrupt or invalid, this can result in disruption.

Sophos uses a common mechanism to distribute on-device content updates, which are loaded into low-privileged Sophos user-space processes (rather than being loaded into or interpreted by Sophos kernel drivers) from Sophos’s Content Distribution Network (CDN). Content updates form one of the three main components of Intercept X, along with software from the CDN, and policy and configuration from Sophos Central.

In this article, we’ll explore the various types of content updates we use, how we verify and validate them, and how the ecosystem is architected to avoid issues caused by corrupt or defective content. (As we noted in our previous article, Intercept X (and all its components) has also been part of an external bug bounty program since December 14, 2017.)

It’s worth noting that the details within this article are correct as of this writing (August 2024) but may change in the future as we continue to update and develop solutions.

Staged content update rollouts

Sophos delivers new content updates to customers in ‘release groups.’ Each Sophos Central tenant is assigned to a release group.

The first release group is for internal engineering testing; we don’t assign any production customers to it. This allows our engineering teams to test new content updates on production infrastructure, without requiring any manual steps. If testing fails, we abort the release without proceeding to any further release groups.

If engineering qualification succeeds, we manually promote the release to the ‘Sophos internal’ release group (‘dogfooding’). This includes Sophos employees’ production devices, as well as employees’ personal accounts. Again, if problems are detected or reported, we abort the release and don’t proceed any further.

All being well, we then manually promote the release to public release groups. From this point, the Sophos release systems automatically publish the new content update to all the release groups over a period of several hours or days by default (see Figure 1 below).

A blue timeline graphic depicting how content updates are released in phases

Figure 1: Phases of release, with verification checks at each phase
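The staged flow described above can be sketched as a loop that stops at the first failed check. This is a minimal illustration only: the group names, the roll_out function, and the healthy callback are hypothetical, and the manual promotion steps between the early groups are not modeled.

```python
from typing import Callable, List

# Hypothetical group names; the article describes the stages, not their identifiers.
RELEASE_GROUPS = ["engineering", "sophos-internal", "public-1", "public-2", "public-3"]


def roll_out(update_id: str, healthy: Callable[[str, str], bool]) -> List[str]:
    """Publish to each release group in order, aborting at the first failed check.

    Returns the groups that received the update.
    """
    published: List[str] = []
    for group in RELEASE_GROUPS:
        published.append(group)
        if not healthy(update_id, group):
            break  # abort: later groups never receive this update
    return published


# A release that fails during dogfooding never reaches a public group.
reached = roll_out("example-update", lambda _uid, group: group != "sophos-internal")
print(reached)
```

The point of the ordering is that production customers only appear in the later groups, so a defect caught earlier has no customer impact.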

Downloading data: Sophos AutoUpdate

Sophos AutoUpdate – part of Intercept X – checks for new content updates every hour, although in practice updates are less frequent than this (see table below).

Sophos AutoUpdate downloads each content update from the CDN and checks to see if new content update packages are available for the appropriate release group.

Content updates are time-stamped and signed using SHA-384 and a private Sophos certificate chain. Sophos AutoUpdate verifies the updates it downloads. If it detects corrupt or untrusted updates, it discards them and warns both Sophos and the Sophos Central administrator. In addition, to protect against stale CDN caches or malicious replay attacks, Sophos AutoUpdate rejects any otherwise-valid update whose signature timestamp is older than the already-downloaded update.
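As a rough sketch of the integrity and anti-replay checks described above, the following compares a SHA-384 digest and then the signature timestamp. It assumes the certificate chain has already been validated elsewhere; the Update type and the accept and make_update functions are hypothetical names, not Sophos AutoUpdate APIs.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Update:
    payload: bytes
    sha384_hex: str  # digest carried in the signature (chain assumed already verified)
    signed_at: int   # signature timestamp, seconds since the epoch


def accept(candidate: Update, current: Optional[Update]) -> bool:
    """Return True only if the candidate is intact and not older than the installed update."""
    if hashlib.sha384(candidate.payload).hexdigest() != candidate.sha384_hex:
        return False  # corrupt or tampered payload
    if current is not None and candidate.signed_at < current.signed_at:
        return False  # stale CDN cache or replayed update
    return True


def make_update(payload: bytes, signed_at: int) -> Update:
    """Test helper that builds an update with a matching digest."""
    return Update(payload, hashlib.sha384(payload).hexdigest(), signed_at)


installed = make_update(b"rules v2", signed_at=2000)
print(accept(make_update(b"rules v3", signed_at=3000), installed))  # newer update
print(accept(make_update(b"rules v1", signed_at=1000), installed))  # replayed older update
```

The timestamp comparison matters because an old update can still carry a perfectly valid signature; validity alone does not prove freshness.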

If a new content update package is available, Sophos AutoUpdate downloads and installs it using the relevant package installer. Different updates are handled by different components of Intercept X.

Overview of content updates

The following content updates are part of the latest Intercept X release (2024.2).

Table 1: An overview of the content updates that are part of the latest Intercept X release (2024.2)

A graphic showing which content updates relate to which processes and kernel drivers

Figure 2: A diagram illustrating which Sophos processes (shown in navy blue) load which content updates (shown in purple)

DatasetA

DatasetA is loaded by SophosFileScanner.exe, a low-privilege process with no filesystem access (other than its log folder and a temporary directory used for scanning large objects). It loads the Sophos Anti-Virus Interface (SAVI).

SophosFileScanner.exe scans content following scan requests from other Sophos processes. Although it’s called “SophosFileScanner.exe”, the name is somewhat historical: it is the primary content scanner in Intercept X, scanning files, process memory, network traffic, and so on.

LocalRepData

LocalRepData contains two reputation lists:

  1. Reputation by SHA-256
  2. Reputation by signer

When a Windows executable begins execution, Intercept X looks it up in the LocalRepData by its SHA-256 hash and its signature (assuming it is validly signed). If the reputation is provided by LocalRepData, Intercept X ‘tags’ the process with the reputation (Sophos rules treat high-reputation files and processes differently – for instance, exempting them from cleanup).

SSPService.exe uses LocalRepData to assign reputation as processes launch.

SophosFileScanner.exe also loads LocalRepData, so that it can assign reputation to embedded executable streams it discovers in content other than executed files.
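A lookup against the two lists might look like the sketch below. The dictionaries, the reputation labels, and the choice to consult the hash list before the signer list are all assumptions made for illustration; the article does not describe the on-disk format or the lookup order.

```python
import hashlib
from typing import Dict, Optional

# Hypothetical in-memory stand-ins for the two reputation lists.
REPUTATION_BY_SHA256: Dict[str, str] = {
    hashlib.sha256(b"known good binary").hexdigest(): "high",
}
REPUTATION_BY_SIGNER: Dict[str, str] = {
    "Example Trusted Publisher": "high",
}


def local_reputation(image: bytes, signer: Optional[str]) -> Optional[str]:
    """Return a reputation tag, or None if LocalRepData has no entry.

    signer is None when the file is unsigned or its signature is not valid.
    """
    by_hash = REPUTATION_BY_SHA256.get(hashlib.sha256(image).hexdigest())
    if by_hash is not None:
        return by_hash
    if signer is not None:
        return REPUTATION_BY_SIGNER.get(signer)
    return None


print(local_reputation(b"known good binary", None))
print(local_reputation(b"some other binary", "Example Trusted Publisher"))
print(local_reputation(b"some other binary", None))
```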

Behavior

Behavior rules are loaded by SSPService.exe. Rules files contain signed and encrypted Lua code. SSPService.exe verifies, decrypts and loads the rules into a sandboxed LuaJIT interpreter with access only to Sophos-internal APIs.

Lua is a fast, embedded scripting language. Sophos uses Lua for behavior rules because it provides a flexible way to deliver new behavior detections without needing a new software release, while still maintaining safety. The rules are loaded in user-space, so they cannot cause a critical system failure if they misbehave. In addition, Sophos builds its rules engine without the Lua base libraries – the only access to the system is via Sophos’s internal API, which is hardened against accidental misuse by the behavior rules. Sophos collects extensive telemetry about rule runtimes, and continuously tunes and reduces runtime overhead.

Rules are reactors: Intercept X provides various events, and rules register handlers for those events. Rules can also configure various aggregation parameters for some high-volume events, allowing the sensor to coalesce or discard certain events.
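The reactor pattern can be illustrated with a small dispatcher. The real rules are Lua running inside SSPService.exe; this Python sketch only shows the shape of the idea, and the RuleEngine class, the event names, and the example handler are hypothetical. Event aggregation is not modeled.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

Event = Dict[str, str]
Handler = Callable[[Event], None]


class RuleEngine:
    """Dispatches events to the handlers that rules have registered for them."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def on(self, event_type: str, handler: Handler) -> None:
        self._handlers[event_type].append(handler)

    def emit(self, event_type: str, event: Event) -> int:
        """Run every handler for this event type and return how many ran."""
        handlers = self._handlers.get(event_type, [])
        for handler in handlers:
            handler(event)
        return len(handlers)


engine = RuleEngine()
alerts: List[str] = []


def flag_temp_executables(event: Event) -> None:
    # Hypothetical rule: note any process launched from a Temp directory.
    if "\\temp\\" in event["path"].lower():
        alerts.append(event["path"])


engine.on("process_start", flag_temp_executables)
engine.emit("process_start", {"path": "C:\\Users\\a\\AppData\\Local\\Temp\\x.exe"})
engine.emit("file_write", {"path": "C:\\Temp\\y.txt"})  # no handler registered
print(alerts)
```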

Flags

Flags are the means by which Sophos gradually enables new features in Intercept X. Flags are delivered in two ways:

  1. The Flags Supplement contains a baseline set of flags corresponding to the available features in the software
  2. The Flags Service is a Sophos Central microservice that allows Sophos Release Engineers to configure flags across multiple tenants

The Flags Supplement for a given software release contains a set of feature flags and specifies how each feature should be enabled.

This mechanism gives Sophos multiple avenues to enable and disable features.

  • Sophos can introduce new features with the flag “Available” (but not enabled in the Flags Service)
  • Sophos can gradually enable new features using the Flags Service to enable flags across tenants
  • Sophos can disable a problematic feature by disabling the flag in the Flags Service
  • Sophos can disable a problematic feature in a specific software release by changing the release’s Flags Supplement.
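One way to read the interaction between the two sources is that a feature is on only when the release's Flags Supplement marks it "Available" and the Flags Service has enabled it. The sketch below encodes that reading; the feature_enabled function, the flag names, and the "Off" value are hypothetical, and only "Available" comes from the text above.

```python
from typing import Dict


def feature_enabled(flag: str, supplement: Dict[str, str], service: Dict[str, bool]) -> bool:
    """A feature runs only if the release allows it and the service switches it on."""
    if supplement.get(flag) != "Available":
        return False  # this software release does not offer the feature
    return service.get(flag, False)


# Hypothetical flag names and values.
supplement = {"new-scanner": "Available", "retired-feature": "Off"}
service = {"new-scanner": True, "retired-feature": True}

print(feature_enabled("new-scanner", supplement, service))
print(feature_enabled("retired-feature", supplement, service))
```

Under this reading, each of the four avenues in the list maps to changing one of the two inputs: the service entry for tenant-level control, or the supplement entry for release-level control.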

CRT

The Competitor Removal Tool (CRT) contains a set of rules for removing known-incompatible software during the installation. It is automatically downloaded by the installer, and is removed after installation.

Normally the CRT is not used by Intercept X; however, if a customer installs a non-protection component like Sophos Device Encryption, and later opts to deploy Intercept X, the existing agent downloads and installs the CRT and runs it prior to installation. Once Intercept X is installed, the CRT is automatically removed.

Endpoint Self Help Ruleset

The Endpoint Self Help (ESH) rules are a set of regular expressions for certain log files. If Sophos engineers have identified a common root cause or misconfiguration, they can publish a new rule and link back to the Knowledge Base Article (KBA) describing the problem and the suggested solution(s).
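A ruleset of this kind can be pictured as a list of regular expressions, each paired with an article reference. The patterns, log lines, and KBA identifiers below are invented for illustration and do not come from the real ESH content.

```python
import re
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class SelfHelpRule:
    pattern: re.Pattern  # compiled regular expression applied to each log line
    kba: str             # hypothetical Knowledge Base Article identifier


RULES = [
    SelfHelpRule(re.compile(r"certificate verification failed", re.IGNORECASE), "KBA-EXAMPLE-1"),
    SelfHelpRule(re.compile(r"proxy .* refused connection", re.IGNORECASE), "KBA-EXAMPLE-2"),
]


def diagnose(log_lines: Iterable[str]) -> List[str]:
    """Return the KBA identifiers whose patterns match any log line, without duplicates."""
    hits: List[str] = []
    for line in log_lines:
        for rule in RULES:
            if rule.pattern.search(line) and rule.kba not in hits:
                hits.append(rule.kba)
    return hits


sample_log = [
    "2024-08-15 10:00:01 ERROR Certificate verification failed for update",
    "2024-08-15 10:00:02 INFO Retrying in 60 seconds",
]
print(diagnose(sample_log))
```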

ScheduledQueryPack

The scheduled query pack content update contains a list of scheduled queries and their execution frequency. The rules are loaded by SophosOsquery.exe; the output is delivered by McsClient.exe for ingestion by the Sophos Central Data Lake.

SophosOsquery.exe has a built-in watchdog that prevents ‘runaway’ queries from consuming excessive CPU or memory. Sophos collects telemetry on scheduled query performance, and regularly optimizes and tunes scheduled queries to avoid triggering the watchdog.

RemapperRules

The remapper rules are loaded by McsAgent.exe and used to ‘remap’ Sophos Central policy settings into the Endpoint configuration, stored in the Windows registry under HKLM\SOFTWARE\Sophos\Management\Policy.

The policy is supplied from Central as a set of XML documents. The rules are also a set of XML documents that describe the structure of the data stored in the registry and provide XPath queries and a few conversion functions to extract content from the policy XML and generate registry data.

If a rule file is corrupt, or if processing it fails for some other reason, none of the registry values defined by that file are updated and any previous settings are left intact. Processing of other, valid, rule files is similarly unaffected.
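The all-or-nothing behavior per rule file can be sketched by staging values and committing them only if every rule succeeds. This is a simplified model: real rule files are XML documents, whereas here each rule is a plain tuple, the registry is a dictionary, and the policy document, value names, and apply_rule_file function are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, List, Tuple

# (registry value name, XPath into the policy XML, conversion function)
Rule = Tuple[str, str, Callable[[str], object]]

# Hypothetical policy document.
POLICY_XML = "<policy><scanning><onAccess>true</onAccess><maxDepth>16</maxDepth></scanning></policy>"


def to_dword_bool(text: str) -> int:
    return 1 if text.strip().lower() == "true" else 0


def apply_rule_file(policy: ET.Element, rules: List[Rule], registry: Dict[str, object]) -> bool:
    """Stage every value first; touch the registry only if the whole file succeeds."""
    staged: Dict[str, object] = {}
    try:
        for name, xpath, convert in rules:
            node = policy.find(xpath)
            if node is None or node.text is None:
                raise ValueError(f"no policy value at {xpath}")
            staged[name] = convert(node.text)
    except (ValueError, SyntaxError):  # bad data, or an XPath ElementTree cannot parse
        return False  # previous settings stay intact
    registry.update(staged)
    return True


policy = ET.fromstring(POLICY_XML)
registry: Dict[str, object] = {"OnAccess": 0}

good = [("OnAccess", "./scanning/onAccess", to_dword_bool), ("MaxDepth", "./scanning/maxDepth", int)]
bad = [("OnAccess", "./scanning/maxDepth", to_dword_bool), ("Missing", "./scanning/nope", int)]

print(apply_rule_file(policy, bad, registry), registry)
print(apply_rule_file(policy, good, registry), registry)
```

Staging before committing is what keeps a half-processed file from leaving the configuration in a mixed state.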

EPIPS_data

The EPIPS_data content update contains intrusion prevention system (IPS) signatures loaded by SophosIPS.exe. SophosIPS.exe contains a Sophos-built IPS product; the signatures are IPS signatures published by SophosLabs.

SophosIPS.exe runs as a low-privilege process. When IPS is enabled, the sntp.sys driver sends packets to SophosIPS.exe for filtering; SophosIPS.exe responds to the driver with commands to accept or reject the packets.

Interacting with network flows packet-by-packet deep in the network stack requires extreme care. The Windows Filtering Platform (WFP) callouts at L2 are very sensitive to the underlying drivers, often from third parties, that service the physical and media access layers. Because of the high risk to system stability, the IPS feature monitors itself for BSODs or network disruptions that are likely caused by third-party driver interactions. If detected, the IPS feature automatically disables itself and sets the endpoint’s health status to red as an alert to the incompatibility.

NTP_OVERRIDES

The Windows Filtering Platform (WFP) is designed to let multiple drivers interact with the filtering stack at the same time. Even so, Sophos has identified certain third-party software packages that are not compatible with the IPS feature, which requires the ability to intercept and manipulate L2 packets.

The NTP_OVERRIDES content update contains a list of known-incompatible drivers. If IPS is enabled in policy but deployed on a device with an incompatible driver, SophosNtpService.exe disables IPS, overriding the policy.

This is delivered as a content update so that as new incompatible drivers are discovered, Sophos can react dynamically to protect other customers with the same configuration. In addition, if Sophos or third parties update drivers to address the incompatibility, Sophos can remove the driver from the list as of a certain version.
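The override decision, including the "as of a certain version" case, could be modeled as below. The driver names, version tuples, and the ips_effective function are hypothetical; the actual format of NTP_OVERRIDES is not described in this article.

```python
from typing import Dict, Optional, Tuple

Version = Tuple[int, ...]

# Hypothetical entries: driver name -> first compatible version (None = no fixed version yet).
INCOMPATIBLE_DRIVERS: Dict[str, Optional[Version]] = {
    "examplefilter.sys": (2, 5, 0),
    "legacyvpn.sys": None,
}


def ips_effective(policy_enabled: bool, installed: Dict[str, Version]) -> bool:
    """IPS runs only if policy enables it and no installed driver is known-incompatible."""
    if not policy_enabled:
        return False
    for name, version in installed.items():
        key = name.lower()
        if key not in INCOMPATIBLE_DRIVERS:
            continue
        fixed_in = INCOMPATIBLE_DRIVERS[key]
        if fixed_in is None or version < fixed_in:
            return False  # override the policy: this driver is incompatible
    return True


print(ips_effective(True, {"examplefilter.sys": (2, 4, 9)}))  # older than the fixed version
print(ips_effective(True, {"examplefilter.sys": (2, 5, 0)}))  # at the fixed version
```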

RepairKit

During each hourly update, Sophos AutoUpdate executes a self-repair program (su-repair.exe) to detect and correct any repairable known issues. The RepairKit was originally built to detect and repair file corruption caused by unclean shutdowns that could corrupt the Sophos installation. Over time, the Sophos engineering team has used this facility to correct many issues that historically would have required a Sophos support engagement with the customer, or potentially gone unnoticed until a future software update flagged the issue.

RepairKit rules are written in Lua and loaded by su-repair.exe. The rules are encrypted and signed. If su-repair.exe fails to load the RepairKit rules, it loads a baked-in ‘last resort’ ruleset which only focuses on repairing Sophos AutoUpdate itself.

RepairKit rules have broad access to the machine and run as SYSTEM, since they need the ability to correct privileged keys and files.

TELEMSUP

This telemetry content update contains a JSON document describing how often and where to submit telemetry:

{  "additionalHeaders": "x-amz-acl:bucket-owner-full-control",  "port": 0,  "resourceRoot": "prod",  "server": "t1.sophosupd.com",  "verb": "PUT",  "interval": 86400  }

The telemetry content update has not changed since it was introduced in 2016.
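For readers who want to inspect the document above programmatically, here is a minimal parsing sketch. It assumes the interval field is expressed in seconds, which the article does not state; the meaning of the port value of 0 is also not stated and is left uninterpreted.

```python
import json

# The JSON document exactly as shown in the article.
TELEMSUP = """
{
  "additionalHeaders": "x-amz-acl:bucket-owner-full-control",
  "port": 0,
  "resourceRoot": "prod",
  "server": "t1.sophosupd.com",
  "verb": "PUT",
  "interval": 86400
}
"""

config = json.loads(TELEMSUP)

# Split the single "name:value" header string into its two parts.
header_name, header_value = config["additionalHeaders"].split(":", 1)

# Assumption: the interval is in seconds, which would make it once per day.
interval_hours = config["interval"] / 3600

print(config["verb"], config["server"], config["resourceRoot"])
print(header_name, header_value)
print(interval_hours)
```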

APPFEED, USERAPPFEED

The APPFEED content updates contain signed and encrypted Lua snippets for detecting installed applications and dynamically generating exclusions for them.

If an application is detected for which the APPFEED contains exclusion rules, the rules dynamically generate machine-specific exclusions based on the installed application. These exclusions are reported back to Sophos Central for informational display to the Sophos Central administrator.

The rules have read-only access to the registry and filesystem, and generally operate by looking for known apps in the Add/Remove Programs registry keys. Some applications, like Microsoft SQL Server, require executing a PowerShell script to detect optional OS components.

APPFEED and USERAPPFEED are loaded by an instance of SEDService.exe.

ProductRulesFeed

Product rules are loaded by SSPService.exe. They are in the same format as Behavior rules, with the same access and privileges. They are loaded into the same LuaJIT interpreter and provide core functionality required by the Behavior rules.

ML models

The ML models content update contains several machine learning models loaded by SophosFileScanner.exe. Unlike most content updates, ML models contain Windows DLLs that contain the core ML model logic, as well as the ‘weights’ – the result of training and tuning models in the SophosLabs Cloud.

The models run in the same low-privilege environment as SophosFileScanner.exe itself. SophosFileScanner.exe supports loading two versions of each model: ‘telemetry’ and ‘live.’ Sophos uses this capability to deliver candidate ML models in telemetry mode. When SophosFileScanner.exe has an ML model in telemetry mode, it selects a sample of data for telemetry analysis, and runs it through the telemetry model (in addition to normal activities). The output from the telemetry model, alongside the data collected by the normal models, provides telemetry to Sophos for analysis and training.

Sophos delivers ML models as content updates so that a new ML model can get several iterations of telemetry, retraining, and fine-tuning before being promoted to the live model.
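The telemetry and live arrangement resembles what is often called shadow evaluation: the candidate model scores a sample of inputs, but only the live model decides the verdict. The sketch below assumes hash-based sampling and uses toy models; the function names, the sampling method, and the log format are hypothetical.

```python
import hashlib
from typing import Callable, Dict, List, Optional

Model = Callable[[bytes], float]


def in_telemetry_sample(data: bytes, percent: int) -> bool:
    """Deterministically select roughly `percent` of inputs, based on their hash."""
    bucket = int.from_bytes(hashlib.sha256(data).digest()[:4], "big") % 100
    return bucket < percent


def score(data: bytes, live: Model, telemetry: Optional[Model],
          percent: int, telemetry_log: List[Dict[str, float]]) -> float:
    """Return the live model's score; record the candidate's score for sampled inputs."""
    verdict = live(data)
    if telemetry is not None and in_telemetry_sample(data, percent):
        telemetry_log.append({"live": verdict, "candidate": telemetry(data)})
    return verdict  # the candidate never influences the result


def live_model(_data: bytes) -> float:
    return 0.1  # toy stand-in


def candidate_model(_data: bytes) -> float:
    return 0.9  # toy stand-in


log: List[Dict[str, float]] = []
print(score(b"sample file", live_model, candidate_model, 100, log))
print(log)
```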

Since the ML model update contains executable code, Sophos releases it more gradually and with more gates:

  • It spends more time in the early release groups (engineering testing and Sophos Internal)
  • It is released over several weeks, not hours.

Hmpa_data

The Hmpa_data content update contains a global allowlist of HitmanPro.Alert thumbprints. Every HitmanPro.Alert detection creates a unique thumbprint for the relevant mitigation and the detection-specific information. For example, a thumbprint for a StackPivot mitigation might include the process and the last few stack frames.

Hmpa_data contains a compact list of globally allowed thumbprints. The HitmanPro.Alert service hmpalertsvc.exe uses this database to quickly and quietly suppress detections, reduce false positives, and avoid performance or stability issues.

  • The HitmanPro.Alert driver, hmpalert.sys, generates thumbprints and sends them to the service for any driver-based mitigation: CryptoGuard, CiGuard, PrivGuard, etc.
  • The HitmanPro.Alert hook DLL, hmpalert.dll, which is injected into user processes, generates thumbprints for each detection and sends them to the service for reporting.
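To make the thumbprint idea concrete, the sketch below hashes the mitigation name, the process name, and the last few stack frames, then checks the result against an allowlist. The real thumbprint format is not described in this article, so the hashing scheme, the frame depth of three, and all names here are assumptions.

```python
import hashlib
from typing import Iterable, Set


def thumbprint(mitigation: str, process: str, frames: Iterable[str], depth: int = 3) -> str:
    """Hash the mitigation, the process name, and the last `depth` stack frames."""
    last_frames = list(frames)[-depth:]
    material = "\n".join([mitigation, process.lower(), *last_frames])
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


# Hypothetical allowlist entry for a known false positive.
ALLOWED: Set[str] = {
    thumbprint("StackPivot", "example.exe", ["a!f1", "b!f2", "c!f3"]),
}


def should_report(tp: str) -> bool:
    """Detections whose thumbprint is globally allowed are suppressed."""
    return tp not in ALLOWED


same_site = thumbprint("StackPivot", "EXAMPLE.EXE", ["x!f0", "a!f1", "b!f2", "c!f3"])
other_site = thumbprint("StackPivot", "example.exe", ["a!f1", "b!f2", "d!f9"])
print(should_report(same_site))
print(should_report(other_site))
```

Because only the last few frames contribute, the same detection reached through a different outer call path still maps to the same allowlist entry in this sketch.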

Conclusion

In order to keep pace with the ever-evolving threat landscape, and to protect against emerging threats, it’s vitally important to regularly update security products with new data. However, corrupt or defective content updates can cause disruptions, so it’s also essential that there are mechanisms in place to help ensure that they are valid, signed, and verified.

In this article, we’ve provided a high-level overview of the content updates we use in Intercept X – exploring what they are, how often they’re delivered, how they’re validated and verified, the specific low-privileged processes they’re loaded into, and the methods we use to roll them out in a staged and controlled manner.

As we alluded to in our previous article on Intercept X kernel drivers, balancing protection and safety is risky – but we’re committed to managing that risk, as transparently as possible.
