An AI wrote this sentence. Or did it? OpenAI’s new chatbot, ChatGPT, presents us with a problem: How will we know whether what we read online is written by a human or a machine?
Since its release in late November, ChatGPT has been used by over a million people, captivating the AI community, and the internet is becoming flooded with AI-generated text. People are using the chatbot for all sorts of purposes, from crafting jokes and writing children’s stories to improving email correspondence.
ChatGPT, based on OpenAI’s large language model GPT-3, produces remarkably human-like answers. That is fascinating but also risky, because the text merely looks correct: the words appear in a plausible order, yet the model does not understand what it is saying. It simply predicts a likely next word, which sometimes happens to yield a true statement and sometimes does not.
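The “predict the next word” idea can be illustrated with a toy sketch. This is nothing like the scale of GPT-3, and the tiny training text is invented for illustration: the program simply picks whichever word most often followed the previous one, with no notion of truth.

```python
from collections import Counter, defaultdict

# Toy illustration only: a bigram lookup table built from a few
# sentences. A real large language model has billions of parameters,
# but the core move is the same -- predict a statistically likely
# next word, without understanding any of it.
corpus = (
    "the cat sat on the mat . the dog sat on the rug . "
    "the cat chased the cat ."
).split()

follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict_next(word: str) -> str:
    """Return the word that most often followed `word` in training."""
    return follows[word].most_common(1)[0][0]

print(predict_next("the"))  # statistically likely, not "understood"
```

The prediction can be fluent and still wrong, which is exactly the problem the article describes.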
The Challenge of Regulating AI-Generated Content in a Politically Charged Online Environment
In today’s online world, opinions are already divided, and AI tools could make the information we see even less reliable. If these tools are built into real products, the consequences could be serious. Irene Solaiman of Hugging Face, who used to work on AI at OpenAI, says that finding ways to tell whether a text was written by a human or an AI is important to prevent misuse of the technology.
New tools will be crucial to enforcing bans on AI-generated text and code. Stack Overflow, a website where coders ask for help, recently banned answers generated by ChatGPT. The chatbot can confidently regurgitate answers to software problems, but they are not always correct, and wrong code can lead to buggy, broken software that is expensive and chaotic to fix.
A Stack Overflow spokesperson says the company’s moderation team is reviewing a large number of user reports and uses several tools to spot AI-written text, but declined to share further details.
Detecting such text is difficult, and the ban will likely be close to impossible to enforce.
Today’s Detection Tool Kit
Researchers identify AI-generated text by using software to analyze its features. Automated systems are good at detecting patterns in word choice, such as the overuse of common words. Ironically, typos and messy phrasing are a signal of human writing, because language models rarely make such mistakes. Large language models themselves can also be trained to distinguish human-written text from AI-generated text. Technical fixes such as watermarks have limitations, however, and most detection methods perform best when analyzing large amounts of text. And it takes ongoing work to build detection models that keep pace with new generators.
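As a rough sketch of the first signal above, the overuse of common words, consider a hypothetical detector that scores the fraction of very frequent words in a passage. The word list and the threshold here are invented for illustration; a real detector would be a trained classifier, not a hand-tuned ratio.

```python
# Illustrative heuristic only: real detectors learn from data, but
# the signal -- generated text leaning heavily on very common words --
# can be sketched as a simple frequency score.
COMMON = {"the", "of", "and", "a", "to", "in", "is", "it", "that"}

def common_word_ratio(text: str) -> float:
    """Fraction of words that belong to a small high-frequency list."""
    words = text.lower().split()
    if not words:
        return 0.0
    return sum(w in COMMON for w in words) / len(words)

def looks_generated(text: str, threshold: float = 0.4) -> bool:
    # Hypothetical threshold; a production system would learn it.
    return common_word_ratio(text) > threshold
```

A passage built almost entirely from high-frequency words would trip this check, while text full of rare words would not, which mirrors the pattern researchers describe.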
Training the Human Eye
Solaiman says there is no easy fix for detecting AI-written text. “A detection model will not be your answer for detecting synthetic text. A safety filter will not be your answer for mitigating biases,” she says.
Solving the problem will take improved technical fixes, but also more transparency when humans interact with AI, and people will need to learn to spot the signs of AI-written sentences.
“It would be nice to have a plug-in for Chrome or whatever web browser you use that lets you know if any text on your web page is machine generated,” says Daphne Ippolito, a research scientist at Google Brain.
Some help is already out there. Researchers at Harvard and the MIT-IBM Watson AI Lab created GLTR, a tool that helps humans identify passages generated by computers.
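GLTR’s core idea is to rank each word by how likely a language model considers it in context, since generated text tends to stay in the highly likely region. A much-simplified sketch of that idea, in which a fixed frequency list stands in for the model and every name is hypothetical:

```python
# Simplified GLTR-style sketch: the real tool gets each token's rank
# from a language model's probabilities; here a fixed frequency-
# ordered word list stands in for that model.
FREQ_ORDER = ["the", "of", "and", "to", "a", "in", "is",
              "cat", "sat", "marmalade", "quixotic"]
RANK = {w: i for i, w in enumerate(FREQ_ORDER)}

def mean_token_rank(text: str) -> float:
    """Average rank of known words; lower means more predictable,
    machine-like word choices."""
    ranks = [RANK[w] for w in text.lower().split() if w in RANK]
    return sum(ranks) / len(ranks) if ranks else float("nan")
```

Text that dips into rarer, more surprising words scores a higher average rank, which in this toy framing is the more human-looking signal.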
But AI is already fooling us. Researchers at Cornell University found that people found fake news articles generated by GPT-2 credible about 66% of the time.
Another study found that untrained humans could only spot text generated by GPT-3 at a level consistent with random chance.
Ippolito’s research also shows that people can be trained to spot AI-generated text. She built a game to test how many sentences a player can recognize as machine-made and found that people got better over time.
“If you look at a lot of generative texts and try to figure out what doesn’t make sense, you can get better at this task,” she says. One giveaway is an implausible statement, such as the AI claiming that it takes 60 minutes to make a cup of coffee.
ChatGPT’s predecessor, GPT-3, was introduced in 2020. OpenAI says ChatGPT is a demo, but it is only a matter of time before similarly powerful models are developed and rolled out into products such as chatbots for customer service or health care. And that is the crux of the problem: the speed of development in this sector means that every way to spot AI-generated text quickly becomes outdated. “It’s like an arms race, but unfortunately, we are losing.”