The Elite Intel Team Still Fighting Meltdown and Spectre
Credit to Author: Lily Hay Newman | Date: Thu, 03 Jan 2019 17:33:54 +0000
A year ago today, Intel coordinated with a web of academic and independent researchers to disclose a pair of security vulnerabilities with unprecedented impact. Since then, a core Intel hacking team has worked to help clean up the mess—by creating attacks of their own.
Known as Spectre and Meltdown, the two original flaws—both related to weaknesses in how processors manage data to maximize efficiency—not only affected generations of products that use chips from leading manufacturers like Intel, AMD, and ARM, but offered no ready fix. The software stopgaps Intel and others did roll out caused a slew of performance issues.
On top of all of this, Meltdown and particularly Spectre revealed fundamental security weaknesses in how chips have been designed for over two decades. Throughout 2018, researchers inside and outside Intel continued to find exploitable weaknesses related to this class of "speculative execution" vulnerabilities. Fixing many of them takes not just software patches, but conceptually rethinking how processors are made.
"In the past no one was aware of these issues, so they weren’t willing to sacrifice any performance for security."
Jon Masters, Red Hat
At the center of these efforts for Intel is STORM, the company's strategic offensive research and mitigation group, a team of hackers from around the world tasked with heading off next-generation security threats. Reacting to speculative execution vulnerabilities in particular has taken extensive collaboration among product development teams, legacy architecture groups, outreach and communications departments to coordinate response, and security-focused research groups at Intel. STORM has been at the heart of the technical side.
"With Meltdown and Spectre we were very aggressive with how we approached this problem," says Dhinesh Manoharan, who heads Intel's offensive security research division, which includes STORM. "The amount of products that we needed to deal with and address and the pace in which we did this—we set a really high bar."
Intel's offensive security research team comprises about 60 people who focus on proactive security testing and in-depth investigations. STORM is a subset, about a dozen people who specifically work on prototyping exploits to show their practical impact. They help shed light on how far a vulnerability really extends, while also pointing to potential mitigations. The strategy helped them catch as many variants as possible of the speculative execution vulnerabilities that emerged in a slow trickle throughout 2018.
"Every time a new state of the art capability or attack is discovered we need to keep tracking it, doing work on it, and making sure that our technologies are still resilient," says Rodrigo Branco, who heads STORM. "It was no different for Spectre and Meltdown. The only difference in that case is the size, because it also affected other companies and the industry as a whole."
Intel received industry criticism—especially early in 2018—for haphazard communication, and for pushing some bad patches as the company attempted to steer the Spectre and Meltdown ship. But researchers who have been heavily involved in speculative execution vulnerability response outside of Intel say that the company has largely earned back goodwill through how relentless it has been in dealing with Spectre and Meltdown.
"New things will be found no matter what," says Jon Masters, an architecture specialist at the open source enterprise IT services group Red Hat, which was recently acquired by IBM. "But in the past no one was aware of these issues, so they weren’t willing to sacrifice any performance for security. Now for Intel security is not just a checkbox, but a key feature, and future machines will be built differently."
By some estimates, the process of adding fundamental speculative execution defense to Intel chips will take four to five years. In the meantime, in addition to releasing patches for legacy processors, Intel added its first physical defenses to its 2019 chips, which it announced in October. But fully reconceptualizing the chips to physically defend against speculative execution attacks by design will take time. "A complete microarchitecture design from scratch is not done that often," Masters says.
While Intel is quick to tout the progress it has made so far rather than focusing on this larger timeline, its offensive security researchers also acknowledge the scale and importance of thorough redevelopment.
"With mitigating vulnerabilities the response time varies depending on the type of product you build," Intel's Manoharan says. "So if it’s a simple application versus something that’s an operating system or something that’s low-level firmware or now silicon, the complexities are vastly different. And our ability to turn around and address things are different as well at each end of that spectrum."
Speculative execution attacks are just one area in a long list of research topics STORM tackles. But the subject largely held the spotlight throughout 2018. Within days of Intel's initial Meltdown and Spectre disclosure, the company's then-CEO Brian Krzanich announced a "Security-First" pledge. "The bottom line is that continued collaboration will create the fastest and most effective approaches to restoring customer confidence in the security of their data," Krzanich wrote. Since then, numerous members of STORM were recruited from independent security consultancies and other outside research groups to help infuse Intel with their less corporate approach.
"We share our insights and expertise with the different product teams, but we're not bound to a specific product," says Marion Marschalek, a STORM researcher who primarily works on security analysis of software compilers and joined the team in 2017. "That is special in the sense that we’re independent of the production flow. We’re able to do more advanced research without being bound to a timeline."
"For me it’s exciting, I need to work on problems that people never needed to work on before. You're defining the impact."
Rodrigo Branco, STORM
The STORM team sits together in an open room at Intel's Hillsboro, Oregon campus. "It's like a movie," Marschalek says, laughing. "There are lots of white boards and lots of people jumping up and drawing ideas on the wall and discussing them with someone else." STORM even has an unofficial team mascot, the alpaca, thanks to gatherings every few months hosted at Branco's farm in rural Oregon. "I got a complaint that I wasn’t cutting my grass and it was a fire hazard," he says. "One thing led to the other and now we have 14 alpacas."
Balancing out that levity and camaraderie, though, is the severity of the threats STORM faces down. A huge swath of computing devices around the world have Intel inside, from embedded and mobile devices to PCs, servers, and supercomputers. To keep individual team members from feeling overwhelmed, Branco works to assign them challenges with a manageable scope.
"For me it’s exciting, I need to work on problems that people never needed to work on before. You're defining the impact and for me that is kind of amazing," Branco says. "For less experienced researchers that may be too much. It can be overwhelming."
This reach and scale have been both a strength and a challenge for Intel, in dealing with speculative execution vulnerabilities and in opening itself up to the idea of new conceptual classes of processor attacks.
"Hopefully one of the messages Intel will take away from this is defense in depth," says Thomas Wenisch, a computer architecture researcher at the University of Michigan who has worked on speculative execution research, including an attack on Intel's secure enclave technology for processors. "If somebody does find a flaw in one piece of a system, how can we make portions of the security still stand up? Hopefully we’ll see chip designs that are less brittle."
Researchers within STORM seem to understand the stakes, but with years until Intel fully implements hardware protections against speculative execution attacks—and with new categories of threats emerging all the time—the company as a whole will need to remain committed to "security first" for the long haul. The same goes for the entire silicon industry, a tough sell when security protections are so often at odds with speed and agility. And pushing for max performance is how Spectre and Meltdown cropped up in the first place.
The economic pressures often don't square with the security needs. "It’s tough because at the end of the day they compete on performance," Wenisch says. Regardless of what's going on around them, though, STORM's mandate is simply to keep delving into whatever risks are on the horizon. It should have no shortage of opportunities.