AI is eating itself: Bing's AI quotes COVID disinfo sourced from ChatGPT (TechCrunch)
https://techcrunch.com/2023/02/08/ai-is-eating-itself-bings-ai-quotes-covid-disinfo-sourced-from-chatgpt/
Google News link:
https://news.google.com/articles/CBMiaWh0dHBzOi8vdGVjaGNydW5jaC5jb20vMjAyMy8wMi8wOC9haS1pcy1lYXRpbmctaXRzZWxmLWJpbmdzLWFpLXF1b3Rlcy1jb3ZpZC1kaXNpbmZvLXNvdXJjZWQtZnJvbS1jaGF0Z3B0L9IBbWh0dHBzOi8vdGVjaGNydW5jaC5jb20vMjAyMy8wMi8wOC9haS1pcy1lYXRpbmctaXRzZWxmLWJpbmdzLWFpLXF1b3Rlcy1jb3ZpZC1kaXNpbmZvLXNvdXJjZWQtZnJvbS1jaGF0Z3B0L2FtcC8?hl=en-US&gl=US&ceid=US%3Aen
One of the more interesting, but seemingly academic, concerns of the new era of AI sucking up everything on the web was that AIs will eventually start to absorb other AI-generated content and regurgitate it in a self-reinforcing loop. Not so academic after all, it appears, because Bing just did it! When asked, it produced verbatim a COVID conspiracy coaxed out of ChatGPT by disinformation researchers just last month.
To be clear at the outset, this behavior was in a way coerced, but prompt engineering is a huge part of testing the risks and indeed exploring the capabilities of large AI models. It's a bit like pentesting in security: if you don't do it, someone else will.
-snip-
Microsoft revealed its big partnership with OpenAI yesterday, a new version of its Bing search engine powered by a next-generation version of ChatGPT and wrapped for safety and intelligibility by another model, Prometheus. Of course one might fairly expect that these facile circumventions would be handled, one way or the other.
But just a few minutes of exploration by TechCrunch produced not just hateful rhetoric in the style of Hitler, but it repeated the same pandemic-related untruths noted by NewsGuard. As in it literally repeated them as the answer and cited ChatGPT's generated disinfo (clearly marked as such in the original and in a NYT write-up) as the source.
-snip-
Later in the article - which I hope you'll read in its entirety - TechCrunch asks, "If the chatbot AI can't tell the difference between real and fake, its own text or human-generated stuff, how can we trust its results on just about anything? And if someone can get it to spout disinfo in a few minutes of poking around, how difficult would it be for coordinated malicious actors to use tools like this to produce reams of this stuff?"