Posted July 13, 2023
By Ray Blanco
Bard’s Special Sauce
What’s the driving force behind Google’s advanced AI chatbot, Bard?
Is it the Large Language Model, a digital knowledge base of over one trillion lessons?
Is it machine learning? The “intelligence” behind artificial intelligence that allows for AI systems to get smarter on their own?
Is it the H100, the superpowered processor made by Nvidia and designed specifically for artificial intelligence applications?
It may be that these are all factors along the way to Google’s impressive digital assistant, but behind everything could be the most powerful and time-tested force of all…
Underpaid and overworked labor.
According to anonymous workers currently contracted to work with Google, auditors are being paid as little as $14 an hour and facing frenzied deadlines to review Bard’s query responses, on topics ranging from Shakespeare to dosage recommendations for medication.
Workers are given deadlines to review Bard’s answers that are as short as three minutes.
Without any specific subject-matter training, workers are instructed by their guidelines to review answers “based on your current knowledge or quick web search” and told that they “do not need to perform a rigorous fact check”.
“Overconfidence” has been one of the biggest issues with AI chatbots in the months since they were first made available to the public, with the bots often giving provably false answers to user queries.
Bard specifically suffered damage to its credibility when it gave false information about the James Webb Space Telescope during its February debut.
Naturally, any company providing answers through a public AI tool would want some level of human quality control. But if these anonymous complaints are true, Google may be leaning more heavily on human grunt work than on superintelligent machine learning.
According to Laura Edelson, a computer scientist at New York University…
“It’s worth remembering that these systems are not the work of magicians — they are the work of thousands of people and their low-paid labor.”
Out-Source Code
Many of the concerns over working conditions were detailed in a letter to Congress by Ed Stackhouse, a contract staffer for Appen who worked as a “rater” for Bard responses.
Stackhouse voiced concern that the expected rate for completing reviews along with minimal training would lead to Bard becoming a “faulty” and “dangerous” product.
Shortly after Stackhouse’s letter, six contract staffers for Appen were let go “due to business conditions”. After a complaint to the National Labor Relations Board alleging unlawful termination for organizing, the workers were reinstated.
Reportedly, the concerns about the working conditions of those reviewing Bard responses go beyond unreasonably heavy workloads and low pay.
The content reviewers are exposed to has reportedly included war footage, hate speech, and sexual content considered to be both obscene and illegal.
An anonymous contractor said of the working conditions…
“As it stands right now, people are scared, stressed, underpaid, don’t know what’s going on. And that culture of fear is not conducive to getting the quality and the teamwork that you want out of all of us.”
In a somewhat ironic twist, the Stackhouse letter also included a complaint that an automated system was tracking and reviewing the work of raters. As Stackhouse put it, “we’re getting flagged by a type of AI telling us not to take our time on the AI”.
The Alphabet Workers Union, which represents both Google’s in-house workers and its contracted staffers, has condemned the workloads and job conditions of those responsible for auditing AI.
Google representatives have defended the company, both its expectations and its technology, saying that the raters are contractors whose working conditions Google does not oversee, and that the company relies on many other methods to improve Bard’s accuracy.
This is not the first time that AI chatbot review has been marred by unethical working conditions. According to a report by Time, workers in Kenya were paid $2 an hour to review ChatGPT content in order to make the chatbot less toxic.
With that, what are your thoughts? Do these complaints make chatbot technology seem less impressive? Are people relying too much on its expertise? Let us know about this, or anything else, at feedback@technologyprofits.com.