GPU.zip attack in simple terms | Kaspersky official blog
Credit to Author: Enoch Root| Date: Mon, 09 Oct 2023 15:53:31 +0000
Researchers from four U.S. universities have published a study detailing an intriguing potential attack on computer graphics subsystems, specifically targeting common integrated GPUs manufactured by both AMD and Intel. The attack was named GPU.zip, alluding to its two main features: (i) stealing secrets from the graphics system, and (ii) exploiting weaknesses in data compression algorithms. In this post, we try as usual to explain the new research as simply as possible. But mainly we’ll just marvel at how elegant and complex it is, while also cringing at how completely impractical it ultimately is (in its current form).
About compression algorithms
Before delving into the GPU.zip attack itself, let’s discuss some aspects of compression algorithms. These algorithms can be broadly categorized into lossy compression algorithms (like MP3) and lossless compression algorithms (like RAR or ZIP). The latter compress data in such a way that it can be completely restored. The simplest method of compression is to store repeating data only once, and then indicate where specific sets of characters or numbers should be placed. For example, the length of this post could be significantly reduced by recording all the places where the word “data” appears and storing the word itself only once.
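To make the “store it once” idea concrete, here is a minimal sketch in Python. It is only a toy illustration, not how RAR or ZIP actually work: each distinct word is kept once in a vocabulary, and the text itself becomes a list of positions in that vocabulary. The function names and the sample sentence are made up for this example.

```python
def compress(text):
    """Store each distinct word once; describe the text as vocabulary indices."""
    vocabulary = []   # every distinct word, stored a single time
    positions = {}    # word -> its index in the vocabulary
    indices = []      # the original text expressed as vocabulary indices
    for word in text.split(" "):
        if word not in positions:
            positions[word] = len(vocabulary)
            vocabulary.append(word)
        indices.append(positions[word])
    return vocabulary, indices


def decompress(vocabulary, indices):
    """Rebuild the original text exactly (lossless)."""
    return " ".join(vocabulary[i] for i in indices)


sample = "data in data out data everywhere"
vocabulary, indices = compress(sample)
print(vocabulary)  # the word "data" appears here only once
print(indices)     # where each stored word belongs in the text
print(decompress(vocabulary, indices) == sample)
```

The more often a word repeats, the more this representation saves, which is exactly why the size of compressed data says something about its content.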
From an information security perspective, compression algorithms have a vulnerability of sorts. Let’s imagine that we’re transmitting some data over the internet using compression. The volume of information depends on how effective the compression algorithm is — the better the compression, the smaller the data size. Back in 2002, it was shown that this feature could be exploited to steal secrets even if the data is encrypted. One of the relatively practical attacks confirming this possibility was demonstrated in 2012.
It was found that in some cases, if information between a browser and a server is both compressed and encrypted, the compression algorithm could reveal secret information even if the encryption algorithm isn’t broken. If attackers can send numerous requests to the server, they can observe how the size of compressed data changes based on the content. And from this they can calculate the secret information character by character. What remains to be seen is whether unchecked compression of graphics subsystem data can also lead to leakage of secrets.
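The principle can be sketched in a few lines of Python. This is a simplified model, not a reproduction of the 2012 attack: the secret, the message layout and the function name are all hypothetical, and zlib stands in for whatever compression the real connection uses. The attacker controls part of the message and sees only the length of the compressed result (encryption, omitted here, would hide the bytes but not how many there are).

```python
import zlib

SECRET = "sessionid=7f3a9c2e"  # hypothetical secret the attacker cannot read


def observed_size(attacker_text: str) -> int:
    """Return the compressed length, the only thing the attacker observes."""
    # Attacker-controlled text travels in the same message as the secret.
    message = f"query={attacker_text}&{SECRET}".encode()
    return len(zlib.compress(message))


right = observed_size("sessionid=7f3a9c2e")    # guess matches the whole secret
partial = observed_size("sessionid=7f3akvxy")  # only the first four characters match
wrong = observed_size("sessionid=qwzjkvxy")    # no secret characters match

print("right:", right, "partial:", partial, "wrong:", wrong)
```

The closer the guess is to the secret, the longer the repeated run the compressor can replace with a back-reference, and the smaller the output. A real attack narrows the guess down step by step and has to cope with ties and noise that this sketch ignores.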
About the features of computer graphics
Today, we’re discussing the graphics subsystem — or, simply put, video cards, although they’re often integrated directly into the processor. Discrete GPUs are separate computational modules, usually with their own RAM. Computer gamers are familiar with the situation when the latest cool game struggles to run on a not-so-powerful video card: the frame refresh rate drops below optimal, the image is no longer smooth, and sometimes it even freezes for a fraction of a second. There can be two reasons for such behavior. Most often, the video card can’t handle the calculations required to create 3D images quickly enough. Sometimes, however, the required data is transmitted too slowly from the main RAM to the graphics subsystem memory.
This problem can be solved by using data compression algorithms. Games use lossy compression algorithms to compress textures. The authors of the paper found that, at least in Intel and AMD integrated GPUs, lossless compression algorithms are used as well — to transmit any graphic information that needs to be displayed on the screen (desktop, browser windows, and so on). These algorithms cannot be disabled and, moreover, are proprietary – no one but the manufacturer knows how they work. The researchers studied them in “black box” mode: the very existence of the compression algorithm was determined based on indirect signs, such as the amount of data transferred from RAM to video memory, which varied depending on the image. Transmitting graphic patterns made entirely of black pixels, black and white pixels in a specific order, and random patterns, showed that when easily compressible data is sent to the video system, less information is transferred between the main RAM and video memory: exactly the way data compression should work.
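The black-box experiment can be imitated in Python. Since the GPU algorithms are proprietary, the sketch below uses zlib purely as a stand-in, so the numbers it prints say nothing about real hardware. The image size and pattern names are assumptions made for this example.

```python
import random
import zlib

WIDTH, HEIGHT = 64, 64  # hypothetical test image, one byte per pixel


def make_patterns():
    """Build the three kinds of test images described in the text."""
    n = WIDTH * HEIGHT
    rng = random.Random(0)  # fixed seed so the run is repeatable
    return {
        "all black": bytes(n),
        "black/white stripes": bytes([0x00, 0xFF]) * (n // 2),
        "random": bytes(rng.randrange(256) for _ in range(n)),
    }


def transferred_bytes(pixels: bytes) -> int:
    # zlib is an assumption standing in for the undocumented GPU compressor.
    return len(zlib.compress(pixels))


sizes = {name: transferred_bytes(p) for name, p in make_patterns().items()}
for name, size in sizes.items():
    print(f"{name:>20}: {size} bytes")
```

The regular patterns shrink to a small fraction of their original size, while the random image barely shrinks at all. A difference of this kind in the amount of transferred data is what gave away the presence of compression.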
Most of the study is dedicated to reverse engineering these proprietary data compression algorithms. This research was deemed necessary to understand exactly how such algorithms work — for example, how graphics information is divided into blocks before compression. The researchers found that different algorithms are applied depending on the manufacturer or even the model of the graphics subsystem.
The problem is that the time it takes to compress data also depends on the data itself. If we have a poorly compressible set of information (random data without any repeating elements), processing it will take a different amount of time than processing “simple”, easily compressible data. Meanwhile, an attacker can measure this time, for instance, by creating a special webpage.
The beauty… and uselessness of the GPU.zip attack
Imagine someone creating a “malicious” webpage that embeds another page from which they want to steal data. This person has the ability to measure the time it takes to render their page in the browser, but nothing more. If, for example, a window with the target’s work email is embedded in the page, the attacker won’t gain access to the content of that window. Why? Such access is strictly prohibited by the same-origin policy: you can place code on your own site to track user actions, for example, but that code won’t work on the embedded “foreign” webpage. There is one exception, however: styling rules can be applied to the embedded page.
The authors of the GPU.zip attack took advantage of this and began applying specific graphics patterns to the target page. This led to changes in the time required to process compressed graphic data, thereby slightly altering the duration of page rendering, which can be measured.
We’ve finally reached the practical implementation of this attack. Here’s how it works: the attacker somehow lures the user to the malicious webpage. The page contains code embedding another page from a completely different site — in this case, Wikipedia’s main page. Let’s assume the browser user has a Wikipedia account and is logged in. Their username will be displayed on the embedded page. By applying effects to this page and measuring the time it takes to render, the attacker can reconstruct the content of the target page from this single parameter alone. More specifically, the attackers can obtain the username. In this way, they can identify the visitor of their malicious site — even if the visitor tries to remain anonymous, for example.
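Here is a heavily simplified simulation of that reconstruction, one pixel at a time, in Python. Everything in it is an assumption made for illustration: the class and function names are invented, the compressed size from zlib plays the role of the rendering time, and the styling effect is modeled as “a black target pixel produces an easily compressible texture, a white one produces a noisy texture”. A real attack has to deal with noisy timers and repeated measurements, which this sketch leaves out.

```python
import random
import zlib

TEXTURE_BYTES = 4096  # hypothetical size of the texture produced by the effect


def render_cost(pixel: int) -> int:
    """Simulated rendering cost of one probe (proxy for measured time)."""
    if pixel == 0:
        texture = bytes(TEXTURE_BYTES)  # all black: compresses very well
    else:
        rng = random.Random(1234)       # noisy texture: barely compresses
        texture = bytes(rng.randrange(256) for _ in range(TEXTURE_BYTES))
    return len(zlib.compress(texture))


class EmbeddedPage:
    """Simulated cross-origin page: pixels are hidden, only probe() is exposed."""

    def __init__(self, pixels):
        self._pixels = list(pixels)

    def __len__(self):
        return len(self._pixels)

    def probe(self, index: int) -> int:
        # The attacker learns a cost, never the pixel value itself.
        return render_cost(self._pixels[index])


def steal(page: EmbeddedPage):
    """Recover the hidden pixels from probe costs alone."""
    # Calibrate on content the attacker controls, then threshold each probe.
    threshold = (render_cost(0) + render_cost(1)) / 2
    return [1 if page.probe(i) > threshold else 0 for i in range(len(page))]


secret_pixels = [1, 0, 0, 1, 1, 0, 1, 0]
recovered = steal(EmbeddedPage(secret_pixels))
print(recovered)
```

The attacker’s code never reads the hidden pixels directly. It only compares each probe against a threshold calibrated on its own content, which is the essence of recovering data from a single measurable parameter.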
This is a typical side-channel attack: the attacker uses an indirect parameter that they can measure (the time it takes to render a web page) to steal data they don’t have access to. But now, let’s discuss the impracticality of this attack…
The content of the target web page is reconstructed pixel by pixel. The attacker has a timer and the ability to slightly modify the appearance of the page in the browser. As a result, it takes half an hour on an AMD Ryzen processor with integrated graphics to reconstruct not the entire page but only a small piece, as shown in the screenshot above. On an Intel processor, the algorithm works even slower — the reconstruction takes more than three hours!
This means the potential victim has to open the page and forget about it for quite a while, without closing it. During all this time, the page will be refreshing, which puts a heavy load on the system. However, the accuracy of the data reconstruction is quite high (97-98%) and, most importantly, the method works even when a large volume of other data is transmitted through the video card. The researchers had a YouTube video playing in the background. Unlike previous studies, this attack works reliably even with a significant amount of such “background noise”.
The final argument against the feasibility of this particular attack is that most websites cannot be embedded into other web pages if they display confidential content. This means that you can’t sneakily “screenshot” email messages or chat conversations in this way. The example with the Wikipedia page was actually chosen because it’s a rare case where a website with a visible username can be embedded.
To sum up: unlike with other hardware vulnerabilities, it can’t be said with certainty that GPU developers made a mistake in this case. We’re talking about extremely complex interactions among different components, the properties of which can be exploited to steal data. The theft itself is not mind-blowing yet, but further research may well discover a more effective method. We hope that GPU developers will take this study into account and adapt their algorithms so that they don’t leak sensitive information.
The quality of this study should not be underestimated either. Forgetting all the practical difficulties for a moment, the researchers essentially demonstrated a method of remote data theft and took screenshots of secret information. All this was achieved through a detailed examination of a minor feature in the operation of GPUs, even though manufacturers tend not to publicize anything about how their GPUs operate. It’s an impressive piece of research, even if it has no practical consequences… for now.