French philosopher Pierre Bourdieu insisted that naming is an exertion of power. Powerful elites have long used naming and classifying not only to legitimize knowledge, but also to perpetuate and reproduce harmful and oppressive historical practices. Language can liberate, but it can also harm. The way we talk and the words we use are intimately connected to social inequalities and discrimination. And while technology has unquestionably propelled us forward at startling speeds, the epicenter of knowledge in Silicon Valley is not altogether different from the ruling elite that has governed Washington, D.C., and state capitols for centuries, in that it is largely dominated by wealthy white males. Everything else is a deviation. We are only beginning to understand the dangers we face when historic inequalities are reinforced by invisible technologies that appear neutral.
Algorithms are everywhere. They decide how we dress, what we eat, and where we travel, and they essentially rule over our daily lives. We are told that privacy is dead, but we are not told how the systems that have reduced our agency function. Instead, we hear claims that technology is neutral—that when a computer says no, no should be the answer because, come on, machines are infallible. We even use the word “smart” to describe objects that have integrated digital sensors or internet connectivity, despite the fact that adding the internet to a toaster is actually pretty stupid.
One of the most widely used algorithms today is the one that animates Google Translate. For years, language translators were useless because, despite the fact that they had access to enormous datasets of words in different languages, they could not, for all the hype about artificial intelligence, harness the essence of language: how words join to create sentences and convey meaning. Google Translate has come close to solving this problem. That’s because the algorithm has learned from the sentences and corrections millions of people enter into the system, year after year. Thanks to all that human labor, Google’s algorithm now does a fairly good job translating complex sentences and grammatical constructions.
But the millions of people who helped train Google Translate didn’t only help Google’s algorithm get better at its core job; they also saddled Google’s system with all the implicit and explicit biases that human beings exhibit when they talk and write.
These biases are especially tricky and problematic when Google Translate is asked to translate between languages that handle gendered nouns differently. Unlike in English, nouns in Spanish and most other Romance languages are gendered. In Spanish, because of patriarchal cultural norms and the historic dominance of men, the masculine noun became the generic, “non-gendered” way of referring to most things. In some cases, a noun’s gender directly reflects societal norms, as with professions and roles that have historically been reserved for men or women. Language evolved accordingly. And because human beings have trained Google’s algorithms, these gendered linguistic norms now appear in Google Translate outputs—with often frustrating results.
Last week, a Twitter user tweeted a screenshot revealing sexist bias in the system. The person had used Google Translate to translate an English letter into Spanish. In English, the letter was addressed to “Professor.” Google Translate, informed by its users, produced an output indicating that the letter was directed to a male professor. That got me thinking about what other biases might exist in Google Translate, so I looked up some other translations from English to Spanish.
The result? A whole lot of gendered bias. “Dear Doctor” translated to “Querido Doctor,” masculine. “Hairdresser” translated to “peluquero” because, of course, only men are hairdressers. “Housekeeper” translated to “ama de casa,” which basically means “housewife.” “Dear Kindergarten teacher” and “dear preschool teacher” both translated to “querida maestra de kinder” and “querida maestra de preescolar,” suggesting that only women teach young children. (If you remove youth from the equation, the translator gives you the opposite output: “dear teacher” translated to “querido maestro,” a man.) More examples: “dear lawyer” translated to “querido abogado.” “Scientist” returned “científico.” All men. No women scientists or lawyers; no Marie Curie and no Ruth Bader Ginsburg.
Then I entered the words “boss” and “supervisor” into the translator. Not only did Google Translate return “jefe” and “supervisor,” both masculine nouns, but a deeper function of the translator offered four types of leadership positions (jefe, patrón, cacique, and mayor) for “boss” and three types of positions (supervisor, inspector, and controlador) for “supervisor.” Every single one of those words is gendered masculine.
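For readers who want to reproduce this kind of probe, here is a minimal sketch in Python, assuming access to Google’s Cloud Translation API through the google-cloud-translate client library and an authenticated Google Cloud project; the phrase list and the crude ending-based gender check are my own illustrative assumptions, not features of the API.

```python
# A sketch of the bias probe described above. Assumes the
# google-cloud-translate package is installed and Google Cloud
# credentials are configured in the environment.
from google.cloud import translate_v2 as translate

client = translate.Client()

phrases = [
    "Dear Doctor",
    "Dear Kindergarten teacher",
    "Dear lawyer",
    "Scientist",
    "Boss",
    "Supervisor",
]

for phrase in phrases:
    result = client.translate(phrase, source_language="en", target_language="es")
    text = result["translatedText"]
    # Crude heuristic: Spanish masculine forms often end in -o, feminine
    # forms in -a. Real morphology is messier; this only flags candidates
    # for a human to inspect.
    last_word = text.split()[-1].lower().strip(".,")
    if last_word.endswith("o"):
        leaning = "likely masculine"
    elif last_word.endswith("a"):
        leaning = "likely feminine"
    else:
        leaning = "unclear"
    print(f"{phrase!r} -> {text!r} ({leaning})")
```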
Politicians and public officials are not safe from the gender bias either. “Dear President” returned “querido presidente,” “dear senator” returned “querido senador,” and “head of state” returned “jefe de estado.” All masculine nouns, despite the fact that Latin America, the biggest Spanish-speaking community in the world, has had more female presidents and heads of state than any other part of the world.
We’ve been told that technology will liberate us, but as this small experiment shows, technology can also codify historic biases and inequalities, all the while making the bias appear neutral and natural. Technology is not neutral, and language algorithms, like others programmed by human beings and trained with human inputs, are just as likely to reproduce bias as any human being.
But once these biases in the machine are made visible, tech companies have the opportunity—indeed, the responsibility—to stop the reproduction of age-old inequalities in their own systems.
In this case, one simple thing Google could do to address the sexism in its translator is to keep up with the latest developments in language studies. In Spanish, for example, there is a new trend of “inclusive language” that uses the letter “e” or the character “@” in nouns describing a group of people that includes more than one gender. In this way, the masculine form that ends in “o” does not override the feminine “a,” and space is created in the language for gender non-conforming and non-binary people. Another option would be to provide users with both the feminine and masculine translations of words and phrases, to alert users who may be unfamiliar with the ways foreign languages deploy gender. This would be a very simple change in the user interface, but it could make an enormous impact.
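As a rough illustration of that second option, here is a minimal sketch of what surfacing both gendered forms might look like; the lookup table of noun pairs is hypothetical, standing in for whatever gender-disambiguation data a real translation system would derive from its models.

```python
# A sketch of the dual-output idea. The GENDERED_PAIRS table is a
# hypothetical stand-in; a production system would derive these pairs
# from its translation model rather than a hard-coded dictionary.
GENDERED_PAIRS = {
    "professor": ("profesor", "profesora"),
    "doctor": ("doctor", "doctora"),
    "lawyer": ("abogado", "abogada"),
    "scientist": ("científico", "científica"),
}

def translate_title(term: str) -> str:
    """Return both gendered Spanish forms when a term is ambiguous."""
    pair = GENDERED_PAIRS.get(term.lower())
    if pair is None:
        return term  # fall through to the normal translation path
    masculine, feminine = pair
    return f"{feminine} (feminine) / {masculine} (masculine)"

print(translate_title("lawyer"))
# abogada (feminine) / abogado (masculine)
```

Listing the feminine form first (or randomizing the order) is a deliberate choice in this sketch, so that the interface itself does not quietly re-encode the masculine-as-default convention described above.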
Tweaks like these may seem like small interventions, but as Bourdieu observed, power and language are inextricably linked. Tech companies may not have historically worried much about the relationship between power and language, but now that they are offering translator tools to the world, and mediating so much of human culture, thought, and expression, they must.
This post was written by Emiliano Falcon, Technology and Civil Liberties Policy Counsel.
This month marks a year since the Supreme Court issued its landmark privacy decision in Carpenter v. United States, ruling that the government must get a warrant before accessing a person’s sensitive cellphone location data.
Carpenter, which the ACLU argued before the Supreme Court, concerned information revealing where Timothy Carpenter had traveled with his phone. The police, searching for evidence to connect Carpenter to the scenes of various robberies, obtained months’ worth of Carpenter’s detailed location data from his cellphone company without a warrant. That data exposed Carpenter’s daily routines, including where he slept and attended church.
The court held that government access to such detailed location data provides a method of “near-perfect surveillance,” and recognized that the Fourth Amendment must protect such sensitive information. It added that old-world legal rules don’t automatically apply in the digital age.
The Supreme Court’s decision stands as one of the most consequential rulings regarding privacy in the digital age, providing a roadmap for lower courts to protect many other kinds of sensitive data from warrantless government intrusion. One year in, we’re working to ensure that lower courts heed the high court’s call and extend the lessons of Carpenter to other contexts.
For instance, we were in the Georgia Supreme Court last week arguing that Carpenter made clear courts cannot “mechanically apply” older legal doctrines that allow warrantless searches to new, complex digital-age contexts. Instead, courts should carefully assess what protections are necessary in light of rapidly advancing technology and increasingly accessible data.
In that case, the state of Georgia is arguing that a legal doctrine dating back to the early 20th century should give police the authority to obtain — without a warrant — the vast and detailed data modern cars collect on us. This data can include everything from a car’s speed and braking data, to call and text records, to music preferences and GPS coordinates. Under the dated doctrine, known as the “automobile exception,” police do not need a warrant to search a car for physical items due to the “ready mobility of vehicles,” which might drive away before a warrant is obtained. But, as we argued in court last week, that old rule shouldn’t be extended to override people’s unprecedented privacy interest in new kinds of sensitive digital data.
Similarly, in our lawsuit challenging the government’s warrantless searches of electronic devices at the U.S. border, the federal government has been invoking a centuries-old rule that allows border agents to search travelers’ physical luggage for contraband or import violations without individualized suspicion or a warrant. We argue that old-world rules can’t be twisted into unfettered authority to search the incredible volumes of data on people’s phones and laptops when they return from a trip abroad.
In both cases, Carpenter (along with a predecessor Supreme Court case, Riley v. California) provides a powerful rebuke to the government’s arguments. The quantities and types of information that might be discovered by a manual search of a car’s trunk and glove compartment — or a traveler’s luggage — pale in comparison to the kinds of comprehensive data stored on our electronic devices today. That difference demands greater protection under the Fourth Amendment.
Carpenter also holds that, in the digital age, our sensitive information does not lose Fourth Amendment protection merely because we store that information on a “third party” server, such as with Google or Dropbox. This is a game-changer.
In the digital age, it is virtually impossible to avoid leaving a trail of highly sensitive data. Our information is saved not only on our personal laptops and phones, but also on the servers of the companies with which we interact. As we argued in a case now before the First Circuit Court of Appeals, the government can no longer get away with warrantless searches of our personal information by relying on the “third party” doctrine.
That case concerns the Drug Enforcement Administration’s efforts to access — without a warrant — people’s prescription records stored in the New Hampshire Prescription Drug Monitoring Program, a secure state-run database set up for public health purposes. The DEA is arguing that when people reveal their symptoms to their doctor and bring the doctor’s prescription to their pharmacist, they have given up their Fourth Amendment privacy rights in that sensitive health information. That can’t be right when the result is unfettered police access to deeply private information about our health and medical history.
In other cases, we have similarly argued that people’s location history stored in gargantuan automated license plate reader databases should be protected by a warrant requirement because of the intense privacy interest in digitized location data recognized in Carpenter.
The Supreme Court rightfully understood in Carpenter that courts have an essential role in ensuring that privacy protections remain vital in the digital age. While the government advocates for unfettered access to the personal information companies are sweeping up on us, it’s crucial the courts make clear, as Carpenter does, that we do not forfeit our Fourth Amendment rights simply for owning a laptop, driving a car, or having a cellphone.
Blog by Nathan Freed Wessler, Staff Attorney, ACLU Speech, Privacy, and Technology Project.