The Next Big Privacy Hurdle? Teaching AI to Forget

By Darren Shou | June 12, 2019

When the European Union enacted the General Data Protection Regulation (GDPR) a year ago, one of the most revolutionary aspects of the regulation was the “right to be forgotten”—an often-hyped and debated right, sometimes perceived as empowering individuals to request the erasure of their information on the internet, most commonly from search engines or social networks.

Darren Shou is vice president of research at Symantec.

Since then, the issue of digital privacy has rarely been far from the spotlight. There is widespread debate in governments, boardrooms, and the media on how data is collected, stored, and used, and what ownership the public should have over their own information. But as we continue to grapple with this crucial issue, we’ve largely failed to address one of the most important aspects—how do we control our data once it’s been fed into the artificial intelligence (AI) and machine-learning algorithms that are becoming omnipresent in our lives?

Virtually every modern enterprise is in some way or another collecting data on its customers or users, and that data is stored, sold, brokered, analyzed, and used to train AI systems. For instance, this is how recommendation engines work—the next video we should watch online, the next purchase, and so on, are all driven by this process.
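To make that process concrete, here is a minimal, hypothetical sketch (not any vendor's actual system) of how logged interactions become recommendations: every click or purchase lands in a matrix, and similarity computed over that matrix decides "what's next." The data and function names are invented for illustration.

```python
# Toy item-to-item recommender: collected interactions in, rankings out.
import numpy as np

# Rows are users, columns are items; 1 means the user watched/bought the item.
interactions = np.array([
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [1, 0, 1, 1],
], dtype=float)

def recommend(user_index, k=2):
    """Score unseen items by cosine similarity to the items this user touched."""
    norms = np.linalg.norm(interactions, axis=0) + 1e-9
    item_sim = (interactions.T @ interactions) / np.outer(norms, norms)
    scores = item_sim @ interactions[user_index]
    scores[interactions[user_index] > 0] = -np.inf  # don't re-recommend seen items
    return np.argsort(scores)[::-1][:k]

print(recommend(0))  # items ranked for user 0, derived purely from collected data
```

The point of the sketch is that the model is nothing but the accumulated data; there is no separate place where "your" records sit, waiting to be handed back.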

At present, when data is sucked into this complex machinery, there’s no efficient way to reclaim it and its influence on the resulting output. When we think about exerting the right to be forgotten, we recognize that reclaiming specific data from a vast number of private businesses and data brokers offers its own unique challenge. However, we need to realize that even if we can succeed there, we’ll still be left with a difficult question—how do we teach a machine to “forget” something?
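One way to see why this is hard is to look at the only approach that is guaranteed to work today: exact unlearning, which means retraining the model from scratch on everything except the record to be forgotten. The sketch below is a hypothetical illustration with invented data, assuming a simple scikit-learn classifier stands in for a far larger production system.

```python
# Why "forgetting" is expensive: removing one person's influence exactly
# requires a full retrain without their row.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))                    # features gathered about 1,000 people
y = (X[:, 0] + rng.normal(size=1000) > 0).astype(int)

model = LogisticRegression().fit(X, y)            # the model has "seen" everyone

def forget(person_index):
    """Exact unlearning: drop one row and pay the full cost of retraining."""
    keep = np.arange(len(X)) != person_index
    return LogisticRegression().fit(X[keep], y[keep])

model_without_me = forget(42)
# Nothing short of this full retrain guarantees that row 42's influence is
# gone from the learned parameters.
```

At the scale of real systems, retraining for every erasure request is impractical, which is exactly why efficient machine "forgetting" remains an open problem.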

This question is even more pressing for the children and adolescents coming of age in this world—the “AI Generation.” They have gone through the largest “beta test” of all time, one that did not account for the fact that children make mistakes and make choices, and that society gives them space to learn from those mistakes and grow. Algorithms may not offer this leniency, meaning that data collected on a youthful transgression may be given the same weight (and retained just as long) as any other data—potentially resulting in the reinforcement of bad behavior, or limited opportunities down the line as this data becomes more embedded into our lives.

For instance, today a college admissions counselor may be able to stumble upon incriminating photos of an applicant on a social media platform—in the future, they may be able to hear recordings of that applicant as a 12-year-old taken by a voice assistant in the child’s home.

The AI Generation needs a right to be forgiven.

Historically, we have worked hard to create protections for children—whether that’s laws about advertising, the expunging of juvenile criminal records, the Children's Online Privacy Protection Act, or other initiatives. All of these align with a common belief in our society that there’s a dividing line between adulthood and childhood, and that standards and accountability need to be separate and more forgiving for youth.

Children coming of age today do not always enjoy that privilege. The prolific collection of their data and the infusion of AI into their daily lives have happened with minimal oversight, and seemingly little serious thought has been given to what the consequences could be. Society engaged in far more rigorous debate over advancements that would seem trivial today—the introduction of car radios, for example, drew far more concern from the United States government. The moral panics of the mid-20th century seem quaint in comparison to today’s digital free-for-all.

The lack of debate on what data collection and analysis will mean for kids coming of age in an AI-driven world leaves us to imagine its implications for the future. Mistakes, accidents, teachable moments—this is how children learn in the physical world. But in the digital world, when every click, view, interaction, engagement, and purchase is recorded, collected, shared, and analyzed through the AI behemoth, can algorithms recognize a mistake and understand remorse? Or will bad behavior be compounded by algorithms that are nudging our every action and decision for their own purposes?

What makes this even more serious is that the massive amount of data we’re feeding these algorithms has enabled them to make decisions experientially or intuitively, much as humans do. This is a huge break from the past, in which computers simply executed human-written instructions. Now, advanced AI systems can analyze the data they’ve internalized in order to arrive at a solution that humans may not even be able to understand—meaning that many AI systems have become “black boxes,” even to the developers who built them, and it may be impossible to reason about how an algorithm arrived at a given decision.

On a basic level, people understand that there are trade-offs when they use digital services, but many are oblivious to the amount of information captured, how it is used, and whom it is shared with. It’s easy to view an email address or a birth date as a single, discrete puzzle piece, but when small bits of information are continually given to an ever-consuming, ever-calculating algorithm, they add up to a shockingly complete picture.

One of the starkest examples of this dates back to 2012, when The New York Times published the story of how a major retailer’s customer prediction model ended up informing a father that his teenage daughter was pregnant through the targeted advertisements she received in the mail. That was seven years ago—not only has technology made great progress since then, but the meter has been running.

In 2019, data profiles of everyone who has gone through the system are seven years richer. The teenager in this example is now an adult, and the data surrounding her pregnancy is forever attached to her. Who has the right to know that? And who—or “what,” when we consider AI systems—has the right to make judgments based on that?

This is where the problem lies—all this data collection and personalization seems benign, even beneficial, until it isn’t. The fault line between the two is time. Looking further into the future raises more questions. What rights do human beings have to their data after they die? Should AI be able to train on an individual’s choices or behaviors once that person is dead?

When a person dies, their organs can be donated only if they consented to be an organ donor. If they pass away with a safe-deposit box at a bank, they can specify who gains ownership of its contents after their death. In the physical world, we’re given choices and have control over our own possessions. The reverse would be preposterous. Imagine the outrage if, upon dying, our bodies, thoughts, and possessions could be taken and used in perpetuity by private enterprises. But that’s essentially what we’ve allowed the digital world to do.

Lacking readily applicable laws, rules that set boundaries, or technology that changes the “art of the possible,” we’re left with a decentralized system without a human at the controls. The algorithms can’t choose what to unlearn, and those in charge of them may have no reason, ability, or desire to address the problem.

AI began in academia, and those behind its development had altruistic purposes. The advancements made by AI were going to cure the sick and feed the hungry. As businesses have deployed AI, it’s been used to make products and services better, often through learning what the customer wants. The combination of cheap storage and AI’s seemingly endless capacity has made it an incredibly attractive tool, but it has also resulted in mass data collection with no easy way to “forget” data.

While AI systems may have the memory of an elephant, they are not infallible—researchers have recently discovered that AI can be “tortured” into giving up secrets and data. This discovery means that the inability to forget doesn’t only impact personal privacy—it could also lead to real problems for our global security.

It’s not too late to address this crucial issue, but the time to act is now. People, not artificial intelligence, constructed this problem, and it’s time for them to take ownership of solving it. There are no simple answers when it comes to privacy, but there are guardrails, safety nets, and limits that can be put into place to restore order and give the public power over their own information.

While initial research has already begun investigating potential solutions, a true shift will require partnership among the private entities at the cutting edge of AI development, as well as technologists, ethicists, researchers, academics, sociologists, policymakers, and governments. Together, these groups must work to create safeguards and frameworks to guide the development of AI systems for decades to come. As artificial intelligence becomes increasingly prevalent, the need for governance grows more and more dire.

To fall short in this effort would be unforgivable.

WIRED Opinion publishes essays by outside contributors, representing a wide range of viewpoints. Submit an op-ed at opinion@wired.com