Large language models like ChatGPT are purveyors of plausibility. Many chatbots are based on so-called generative artificial intelligence, trained on text gathered from the internet to answer users' questions with coherent, convincing prose: student essays, authoritative-sounding legal documents, credible news reports.
However, because publicly available data contains misinformation and disinformation, some machine-generated text may be inaccurate or untrue. This has sparked a rush to develop tools that can identify whether a piece of text was drafted by a human or a machine. Science is also struggling to adapt to this new era, with lively debate over whether chatbots should be allowed to write scientific papers or even generate new hypotheses.
It is becoming increasingly important to distinguish artificial intelligence from human intelligence. This month, UBS analysts revealed that ChatGPT is the fastest-growing web app in history, reaching 100 million monthly active users in January. Some institutions have decided there is no point bolting the stable door: on Monday, the International Baccalaureate said students would be allowed to use material from ChatGPT in their essays, provided they cite it.
To be fair, the creators of this technology are upfront about its limitations. Sam Altman, OpenAI's chief executive, warned in December that ChatGPT was "good enough at some things to create a misleading impression of greatness. We have a lot of work to do on robustness and truthfulness." The company is developing cryptographic watermarks for its output, machine-readable secret sequences of punctuation, spelling and word order; it is also honing a "classifier" to distinguish between synthetic and human-generated text, trained on examples of both.
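For a rough sense of what "training a classifier on examples of both" involves, here is a toy sketch, not OpenAI's system: it uses an off-the-shelf scikit-learn pipeline, and the handful of example passages are invented for illustration. A real detector would need large, diverse corpora of human and machine text and far richer features.

```python
# Toy illustration of a human-vs-machine text classifier (not OpenAI's system).
# The tiny inline "dataset" is invented for this sketch; a serious classifier
# would be trained on large labelled corpora of both kinds of text.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

human_texts = [
    "Honestly, the seminar ran long and half of us left before questions.",
    "My grandmother's recipe never measures anything; you just know by feel.",
]
machine_texts = [
    "In conclusion, it is important to note that there are many factors to consider.",
    "Overall, this topic is significant and has both advantages and disadvantages.",
]

texts = human_texts + machine_texts
labels = [0] * len(human_texts) + [1] * len(machine_texts)  # 1 = machine-generated

# TF-IDF word features feeding a logistic regression classifier.
clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(texts, labels)

new_passage = "It is important to note that, in conclusion, many factors matter."
prob_machine = clf.predict_proba([new_passage])[0][1]
print(f"Estimated probability the passage is machine-generated: {prob_machine:.2f}")
```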
Eric Mitchell, a graduate student at Stanford University, reckoned that such classifiers need a lot of training data. With colleagues, he came up with DetectGPT, a "zero-shot" approach to spotting the difference, meaning the method requires no prior training. Instead, it turns a chatbot on itself to sniff out its own output.
Here's how it works: DetectGPT asks a chatbot how much it "likes" a sample text, where "likes" is shorthand for how similar the sample is to its own creations. DetectGPT then goes a step further: it "perturbs" the text, altering the wording slightly. The hypothesis is that a chatbot's liking of machine-generated text drops consistently once the text is perturbed, whereas its liking of perturbed human-written text varies far more. In early tests, the researchers claim, the method correctly distinguished human from machine authors 95 percent of the time.
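To make the "perturb and compare" idea concrete, here is a minimal sketch of the logic, not the authors' implementation: it assumes GPT-2 via the Hugging Face transformers library as the scoring model, and substitutes crude random word-dropping for the mask-filling perturbations used in the DetectGPT paper.

```python
# Minimal sketch of perturbation-based detection in the spirit of DetectGPT.
# Assumes the Hugging Face `transformers` library and the public GPT-2 checkpoint.
import random

import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()


def avg_log_likelihood(text: str) -> float:
    """Average per-token log-probability the model assigns to `text`,
    i.e. how much the model "likes" the passage."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(enc.input_ids, labels=enc.input_ids)
    return -out.loss.item()  # loss is the mean negative log-likelihood


def perturb(text: str, drop_rate: float = 0.15) -> str:
    """Lightly rewrite the passage by randomly dropping a fraction of words.
    (DetectGPT itself uses a mask-filling model for higher-quality rewrites.)"""
    words = text.split()
    kept = [w for w in words if random.random() > drop_rate]
    return " ".join(kept) if kept else text


def detection_score(text: str, n_perturbations: int = 20) -> float:
    """Perturbation discrepancy: how much the model's liking falls after small
    edits. Machine-generated text tends to sit near a peak of the model's own
    probability, so its liking drops more consistently, giving a larger score."""
    original = avg_log_likelihood(text)
    perturbed = [avg_log_likelihood(perturb(text)) for _ in range(n_perturbations)]
    return original - sum(perturbed) / len(perturbed)


sample = "The quick brown fox jumps over the lazy dog near the riverbank."
print(f"Perturbation discrepancy: {detection_score(sample):.3f}")
```

In this framing, a larger discrepancy between the liking of the original passage and the average liking of its perturbed versions points toward machine authorship; where to set the threshold would have to be tuned on real examples.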
One caveat: these results have not yet been peer-reviewed. And while the method beats random guessing, it does not work equally reliably across all generative AI models. Human tweaks to synthetic text can also fool DetectGPT.
What does all this mean for science? Scientific publishing is the lifeblood of research, injecting ideas, hypotheses, arguments and evidence into the global scientific canon. Some were quick to enlist ChatGPT as a research assistant, and several controversial papers have listed the AI as a co-author.
Meta even launched a science-specific text generator called Galactica; it was withdrawn three days later, after users prompted it to produce, among other things, a fictitious history of bears in space.
Professor Michael Black, of the Max Planck Institute for Intelligent Systems in Tübingen, tweeted at the time that he was "troubled" by Galactica's answers to multiple queries about his own field of research, including its attribution of fake papers to real researchers: "In all cases, [Galactica] was wrong or biased but sounded correct and authoritative. I think it's dangerous."
The danger is that plausible but false text slips into real scientific submissions, flooding the literature with bogus citations and forever distorting the canon. Science now bans generated text outright; Nature permits its use as long as it is declared, but bars chatbots from being listed as co-authors.
Then again, most people don't turn to high-end journals to guide their scientific thinking. If the devious are so inclined, these chatbots can spout, on demand, citation-laden pseudoscience on why vaccines don't work or why global warming is a hoax. That misleading material, posted online, can then be devoured by future generative AI, producing new iterations of falsehood that further contaminate public discourse.
The dealers in doubt will be rubbing their hands.