What kinds of texts are AI systems trained on? This artwork explores one unexpected source.
Sam Lavigne and Tega Brain, The Good Life, 2016 (screen capture). Website, public domain text, and mail server. Courtesy the artists
Machine learning is widely used in contemporary digital culture for tasks that range from analyzing consumer habits to suggesting auto-replies on email threads. Before systems can analyze or generate new text, they must be given an existing body of text to study. Very often, the text they are given is the Enron email corpus.
At the beginning of the George W. Bush administration, the Enron Corporation was one of America’s largest companies, a darling of the stock market in the wake of the dot-com crash. Its growth was driven by fraudulent accounting practices, and as its business ventures ran operating losses and fueled rolling blackouts in California, its opaque earnings statements began to draw scrutiny. At the start of 2001, Enron was trading for $83 a share; by the end of the year, the share price was just sixty cents, and the company was under investigation and had filed for bankruptcy. Thousands of people lost their jobs and savings.
In March 2003, the Federal Energy Regulatory Commission released Enron’s emails into the public domain as part of the evidence of the company’s crimes. It was the first time an email archive of this size had been made public, and it remains one of the largest freely available. As a result, this record of the online communications of the participants in a fraud of historic proportions has proven unexpectedly useful long after the court cases closed. Social scientists and linguists pore over it to draw conclusions about the use of language in the American workplace, as recently described in the New Yorker:
“Only six per cent of the e-mails she examined had any greeting at all; most began in medias res. The employees most likely to use a friendly greeting were women not in positions of authority, followed by men in subservient positions. Powerful men were the most likely just to open an e-mail window and start typing. In some cases, an e-mail would simply be addressed ‘Guys.’”
Alongside this interest from researchers, countless machine learning systems have been trained on the Enron emails, and they go on to reproduce the patterns and biases found in them. As artists Sam Lavigne and Tega Brain note, “This dataset, which was generated by a group of mostly white male corporate criminals, is therefore in our lives in ways we don’t understand and haven’t fully considered.”
Produced with the aid of a 2016 Rhizome Net Art Microgrant, The Good Life gives users the opportunity to reflect on the specific qualities of this now-influential text by receiving the emails one at a time. In the artists’ own words:
“The Good Life invites you to experience a nightmarish simulation of living through the death throes of a corporation in the 2000s. Sign up at http://enron.email to receive 225,000 Enron emails over the course of seven years. You will receive the emails in chronological order at the frequency at which they were sent, relatively adjusted to the seven-year timeline.”
There are many ways to enjoy the Enron corpus, but by far the most pleasurable is to read all 225,000 emails in the order they were sent.
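The delivery schedule the artists describe, with emails sent in chronological order and their original spacing scaled onto a seven-year timeline, works out to roughly 88 emails a day on average. One way such a schedule could be computed is sketched below. This is an illustration under stated assumptions, not the artists' implementation: the function name `rescale_send_times` and its parameters are hypothetical, and a simple linear rescaling of timestamps is assumed.

```python
from datetime import datetime, timedelta


def rescale_send_times(original_times, start, duration):
    """Map original send times onto a new window (hypothetical helper).

    Assumes a linear rescaling: chronological order and the relative
    spacing between emails are preserved, while the whole span is
    stretched or compressed to fit `duration`, beginning at `start`.
    """
    if not original_times:
        return []
    ordered = sorted(original_times)
    first, last = ordered[0], ordered[-1]
    span = (last - first).total_seconds()
    scheduled = []
    for sent_at in ordered:
        # Fraction of the original timeline that had elapsed at this email.
        fraction = (sent_at - first).total_seconds() / span if span else 0.0
        scheduled.append(start + fraction * duration)
    return scheduled
```

Under this assumption, a subscriber's inbox would be quiet where the original correspondence was sparse and busy where it was dense, which matches the experience the artists describe.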
Many of the former Enron employees whose correspondence appears in the corpus are highly privileged, possibly criminal, and generally unsympathetic, but their emails serve as a reminder that this dataset (like many others) is ultimately a record of people’s lives and labor. Discussions of oil-and-gas skullduggery are interspersed with details of meals and loves, families and pets.
The city of Houston, where Enron was headquartered, is also a regular presence; the company employed thousands of people there. Today, as Houston residents recover from the tragic aftermath of Hurricane Harvey, our thoughts are with all those affected. We hope viewers and subscribers to The Good Life will consider supporting relief efforts.