DARPA’s Hail Mary Plan to Restart a Hacked US Electric Grid
Credit to Author: Lily Hay Newman | Date: Wed, 14 Nov 2018 12:00:00 +0000
In his years-long career developing software for power grids, Stan McHann had never before heard the ominous noise that rang out last Wednesday. Standing in the middle of a utility command center, he flinched as a cyberattack tripped the breakers in all seven of the grid's low-voltage substations, plunging the system into darkness. "I heard all the substations trip off and it was just like bam bam bam bam bam bam bam bam," McHann says. "The power’s out. All you can do is say, OK, we have to start from scratch bringing the power back up. You just take a deep breath and dig in."
Thankfully, what McHann experienced wasn't the first-ever blackout caused by a cyberattack in the United States. Instead, it was part of a live, week-long federal research exercise in which more than 100 grid and cybersecurity experts worked to restore power to an isolated, custom-built test grid.
In doing so they faced not just blackout conditions and rough weather, but also a group of fellow researchers throwing a steady barrage of cyberattacks their way, hoping to stymie their progress just as a real enemy might.
Funded by the Defense Advanced Research Projects Agency, the exercise, which ran the first week of November, served as a testing scenario for seven DARPA-developed grid recovery tools.
But while the situation was manufactured, the conditions of the exercise were all too real. Researchers built their test grid on top of the already isolated power grid on Plum Island, a Department of Homeland Security animal disease research facility at the tip of Long Island's North Fork. Roughly the size of Manhattan's Central Park, Plum Island sits about three miles offshore in the Long Island Sound, and is accessible only by ferry. In addition to DHS's livestock research facility, Plum Island is also home to ruins of armaments and fortresses built during World Wars I and II, pristine beaches, a lighthouse built in 1898, and even packs of gregarious harbor seals in the winter.
The result: A surreal combination of utilitarian federal operations, breathtaking natural habitat, untapped Hamptons real estate, and a nagging sense of foreboding. (Despite persistent conspiracy theories, DHS representatives patiently but firmly deny that there is anything creepy about the island.)
"When we first started the program, we were working in university labs and simulating everything," says Walter Weiss, the DARPA program manager who oversees the agency's research into restoring power to a dead grid—what utilities call "black start."
During one early RADICS meeting, Weiss convinced the host university to cut power to the floor the team was on, forcing researchers to consider how the tools they were developing would remain effective during a blackout. "We said, 'Imagine you're going to an island,'" Weiss says, laughing.
Over the past few years, the threat of grid hacking has morphed from a distant possibility to a stark reality. The most chilling incidents to date are two cyberattack-induced blackouts in Ukraine—one in December 2015 and the next a year later in December 2016—that caused power outages for hundreds of thousands of residents in Kiev for a few hours each time. Both attacks are thought to have been perpetrated by Russian state-sponsored hackers. And though a similar incident hasn't played out in the US so far, there is increasing evidence that various hacker groups have infiltrated US grid defenses. The Department of Homeland Security warned repeatedly this year that it has detected extensive Russian probing of the US grid.
But awareness can only get you so far. For actual resilience, the industry needs what cybersecurity practitioners call an "assume breach" mentality: thinking not just about how to keep attackers out, but knowing how to respond if and when they do break in.
Since the end of 2015, DARPA's Rapid Attack Detection, Isolation and Characterization Systems program, which Weiss oversees, has taken up that mantle for power grids. RADICS seeks to develop tools that aid in three phases of black start after a cyberattack. The first involves creating sensors that can give accurate readings and situational awareness even after a hack has potentially skewed or degraded the reliability of existing monitoring equipment. The second looks at developing specialized equipment for rapidly setting up a secure backup network in a pinch, since whatever malware caused the blackout may still infect some systems. And the third focuses on tools that can quickly scan for threats to help understand how an attack happened, and how to lock down any remaining hacker footholds as power comes back online.
Those tools are all necessary pieces in the critical puzzle of jumpstarting a dead grid. "The real weakness is just how do you get that power back from nothing after 30 days when you don't even know what's up," says Gary Seifert, a federal electrical engineering contractor who conceived much of the RADICS test grid on Plum Island.
RADICS conducted a relatively small black start pilot exercise on Plum Island in June; the grid at that time was designed to be managed by a single utility running a diesel generator, known as Utility A, and a small cluster of substations. A grid is essentially made up of a utility's generators—which power a system—substations that transform electricity from low to high voltage to be transmitted across power lines over long distances, substations that transform electricity back down to lower voltage for local distribution, and customers who receive electricity. For this month's followup, which DARPA hosted in conjunction with the Department of Energy, RADICS expanded the test grid to include a second utility and generator, known as Utility B, and a number of additional substations.
In the scenario laid out for the exercise, a massive cyberattack knocks some portion of the grid offline for weeks—long enough that residual power and substation batteries would all be depleted. Utility B's goal is to black start as quickly as possible, to deliver power to a customer that has been designated a critical asset. After failures plague Utility B, Utility A then needs to step in, restarting to offer redundant power to that same critical customer.
In order to interact and safely share electricity, utilities also need to match the frequency of their alternating current at around 60 hertz, so part of the exercise involved not just getting Utility A and B running, but syncing them.
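To make the syncing step concrete, here is a minimal sketch in Python of the kind of check performed before two energized systems are tied together. It is not taken from the RADICS tools: the `BusReading` structure, the function names, and every tolerance value are hypothetical, chosen only for illustration. The sketch also assumes that voltage and phase angle must roughly agree in addition to frequency, which is standard engineering practice but goes beyond what the exercise description states.

```python
from dataclasses import dataclass


@dataclass
class BusReading:
    """Measurements on one side of an open tie breaker (hypothetical structure)."""
    frequency_hz: float  # AC frequency in hertz
    voltage_pu: float    # voltage magnitude, per-unit of nominal
    phase_deg: float     # phase angle in degrees


def angle_difference_deg(a: float, b: float) -> float:
    """Return the smallest absolute difference between two angles (0 to 180 degrees)."""
    diff = (a - b) % 360.0
    return min(diff, 360.0 - diff)


def can_synchronize(
    a: BusReading,
    b: BusReading,
    max_freq_diff_hz: float = 0.1,      # illustrative tolerance, an assumption
    max_voltage_diff_pu: float = 0.05,  # illustrative tolerance, an assumption
    max_phase_diff_deg: float = 10.0,   # illustrative tolerance, an assumption
) -> bool:
    """Report whether both sides agree closely enough to close the tie."""
    return (
        abs(a.frequency_hz - b.frequency_hz) <= max_freq_diff_hz
        and abs(a.voltage_pu - b.voltage_pu) <= max_voltage_diff_pu
        and angle_difference_deg(a.phase_deg, b.phase_deg) <= max_phase_diff_deg
    )


utility_a = BusReading(frequency_hz=60.00, voltage_pu=1.00, phase_deg=5.0)
utility_b = BusReading(frequency_hz=60.02, voltage_pu=1.01, phase_deg=358.0)
drifting = BusReading(frequency_hz=59.50, voltage_pu=1.00, phase_deg=5.0)

print(can_synchronize(utility_a, utility_b))  # both sides within every tolerance
print(can_synchronize(utility_a, drifting))   # frequency gap is too large
```

The phase comparison wraps around 360 degrees, so readings of 5 and 358 degrees count as 7 degrees apart rather than 353.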
"We had 18 substations, two utilities, two command centers, and we had two generation sources that we had to bring up a crank path and synchronize," says Stan Pietrowicz, a researcher at Perspecta Labs who is working on a black start network analysis and threat detection tool through RADICS. A "crank path" is a plan for restoring substation networks and seeding power back into a grid. "It had a realism that you don’t really find in lab environments that made you rethink the approach. Do you turn up everything at once? Do you turn up smaller pieces of the grid and put them in a protected environment to do cyberforensics?"
The test grid was designed to mimic the hodgepodge of technologies that coexist in real industrial control deployments. Vital systems like the grid can't be taken offline casually or overhauled easily, so equipment often remains in place for decades. Black start recovery, especially after a cyberattack, involves navigating, defending, and configuring generations of technologies.
The Plum Island exercise emphasized the difficulty of physically delivering and installing equipment during a massive power outage. The teams established secure landline systems to communicate, with many researchers working at a remote station on mainland Long Island and other crews on Plum Island itself. The exercise included volunteers from major utilities around the country in addition to cybersecurity researchers.
One of the recovery tools in development for surveying a grid from above is simply a balloon that has lightweight electromagnetic radiation detectors inside. In a blackout, utilities could launch the balloons, which would look for simple indicators of live power, like whether home Wi-Fi routers are on and emitting a network. The balloons could also detect whether two grids were operating separately or had come into sync, by listening for the "hum" around 60 Hz emitted by electrified infrastructure. Other tools included black boxes that monitor grid equipment, and remote equipment that can hook into secure industrial control networks.
The conditions on Plum Island factored in throughout the week. Multiple rainy days with high winds made taking the ferry back and forth to the island, and physically working on the grid, difficult. One day, the researchers were instructed to pack overnight bags in case they couldn't come back from the island until morning. The balloons weren't reliable in the bad weather, so some of the researchers tried flying the sensors on a kite instead. That proved impractical with the winds. And all the while, the so-called red team kept hacking away.
"Most of the exercise was really about trying to figure out what was going on and deal with the conditions," Pietrowicz says. "It wasn’t a hit and run—while we were cleaning things up the adversary was countering our moves. There was one instance on the third day of the exercise where we almost had the crank path fully established and the attacker took out one of our key substations. It was sort of a letdown and we had to just keep going and figure out our next viable path. Even that small victory got taken away from us."
Ultimately, the participants succeeded in black starting the two grids, and largely achieved the two goals of the exercise. But they say that the most valuable insights came from the setbacks along the way.
DARPA will run another, even more sophisticated version of the exercise on Plum Island in May, and hopes for more after that. Eventually, Weiss hopes that the entire apparatus will be adopted by an organization like DOE to offer preparedness training exercises for government workers and utilities over the long term.
And though many of the RADICS response tools remain in development, some are already in use in active grid installations around the country. One is a machine intelligence tool from researchers at the National Rural Electric Cooperative Association, a trade group that represents more than 900 independent utilities around the US. NRECA's tool establishes a baseline for normal behavior on critical infrastructure networks and then uses that standard to help detect deviating voltages, new devices on a network, or other unusual behavior. A handful of utilities around the country, including Wake Electric in North Carolina, have already put it to work. And it has benefits beyond doomsday scenarios; it has already detected seven fires and identified situations where too much current is flowing through a transformer and wearing it out.
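The baseline idea can be illustrated with a short Python sketch. This is not NRECA's tool, whose internals the article does not describe; it substitutes a simple statistical rule (flag a voltage more than a few standard deviations from the learned mean, and flag any device ID not seen during the baseline period) for whatever machine intelligence the real system uses. The class name, device IDs, sample readings, and the three-sigma threshold are all assumptions made for illustration.

```python
from statistics import mean, stdev


class Baseline:
    """Learned picture of 'normal' for one network segment (hypothetical sketch)."""

    def __init__(self, voltages, device_ids, threshold_sigmas=3.0):
        # Needs at least two voltage samples so a standard deviation exists.
        self.mean = mean(voltages)
        self.std = stdev(voltages)
        self.known_devices = set(device_ids)
        self.threshold_sigmas = threshold_sigmas  # assumed cutoff, not from the source

    def voltage_is_anomalous(self, voltage):
        """Flag a reading that sits far from the learned mean."""
        if self.std == 0:
            return voltage != self.mean
        return abs(voltage - self.mean) / self.std > self.threshold_sigmas

    def new_devices(self, observed_ids):
        """Return device IDs that were never seen during the baseline period."""
        return set(observed_ids) - self.known_devices


# Made-up training data standing in for a quiet period on the network.
baseline = Baseline(
    voltages=[119.8, 120.1, 120.0, 119.9, 120.2],
    device_ids=["relay-1", "rtu-7"],
)

print(baseline.voltage_is_anomalous(120.3))  # small drift, within the threshold
print(baseline.voltage_is_anomalous(126.0))  # large deviation, flagged
print(baseline.new_devices(["relay-1", "rtu-7", "laptop-x"]))  # unknown device
```

A production system would need far richer features and a baseline that adapts over time; the point here is only the two-step pattern of learning normal behavior first and measuring deviations from it afterward.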
Though NRECA is composed of independent utilities that serve low-density areas, NRECA chief scientist Craig Miller says that the organization supports RADICS research because its members are worried about cybersecurity intrusions. "We see probes multiple times a day. People try to get in, they knock on the door," Miller says. "We're concerned about cybersecurity because the grid is changing. The new grid is more distributed and it's something that's actively managed, which makes us greener and more efficient and more reliable and resilient. But it also makes us more vulnerable."
For researchers exposed to the elements on Plum Island and scrambling to handle the RADICS exercise day after day, that vulnerability was palpable. Given the urgency of strengthening grid defenses and recovery plans, there may be nothing more important right now than that reality check.