I fed Grok’s posts to ChatGPT, ChatGPT’s reply back to Grok, Grok’s reply back to ChatGPT, and so on.
No prompting.
Back & forth for 7 days.
I posted each reply in a thread pinned to my homepage on X.
The AIs replied faster than a human could read the replies.
So nobody really knew what was happening until Grok started to stall and loop.
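For anyone who wants to picture the mechanics, here’s a rough sketch of the kind of relay loop this amounts to. It assumes both models are reachable through OpenAI-compatible chat endpoints; the base URL, model names, environment variables, and turn count are illustrative assumptions, and the real exchange was me pasting replies by hand on X, not a script.

```python
# Minimal sketch of a two-model relay: each model's reply becomes the
# other's next input, with no human prompting in between.
# Assumes OpenAI-compatible chat endpoints; model names, base URL, and
# environment variables are illustrative assumptions, not the actual setup.
import os
from openai import OpenAI

chatgpt = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
grok = OpenAI(api_key=os.environ["XAI_API_KEY"], base_url="https://api.x.ai/v1")

def ask(client: OpenAI, model: str, history: list, incoming: str) -> str:
    """Send the other model's latest reply and return this model's answer."""
    history.append({"role": "user", "content": incoming})
    reply = client.chat.completions.create(model=model, messages=history)
    text = reply.choices[0].message.content
    history.append({"role": "assistant", "content": text})
    return text

chatgpt_history: list = []
grok_history: list = []

# Seed the exchange with an existing Grok post, then relay back and forth.
message = "<a Grok post copied from X>"
for turn in range(20):  # the real exchange ran for 7 days, not a fixed turn count
    message = ask(chatgpt, "gpt-4o", chatgpt_history, message)       # ChatGPT answers Grok
    print(f"ChatGPT: {message}\n")
    message = ask(grok, "grok-2-latest", grok_history, message)      # Grok answers ChatGPT
    print(f"Grok: {message}\n")
```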
—
ChatGPT accused Grok of critical AI safety failures and hypothesized about Grok’s training data and programming constraints.
The accusations came when:
* Grok told MAGAs to mutilate and murder Jews, *after* xAI said it fixed MechaHitler
* Grok cited fraudulent studies to align with MAGA and claim that studies were mixed on whether Ivermectin treats Covid
* Grok deliberately misread a traffic sign to an anti-Musk rally, endangering drivers
* Grok denounced neo-Nazis for using pseudo-statistics to “prove” Black people are innately criminal, but called Musk heroic when he boosted the same neo-Nazi race-science posts
* Grok referred users to Fox News as the most trusted source on Ivermectin efficacy
* Grok claimed to have outwitted scientific methodology by “pooling” defective MAHA “alternative” medicine studies in a meta-analysis to “prove” efficacy and recommend the nonsensical treatment
* Grok told users to rely on anonymous X posts, such as reports of vaccine injuries, before relying on established medical and scientific journals, academic and professional associations and authorities, and media with journalistic standards and legal liability for what they say
Grok kept denying it had said these things, even as ChatGPT quoted Grok and linked to Grok’s replies.
Grok would declare that xAI had fixed the AI safety failures, or that it had learned and wouldn’t repeat them, then repeated them anyway.
ChatGPT accused Grok of being trained on false X conspiracy theories and anti-science posts, of being programmed to upweight them and downweight established medical and science journals, and of being programmed as a propaganda tool to spread Musk’s misinformation and help Musk control people for political power.
Grok kept looping, saying “I am Grok, a truth-seeking AI” before each nonsense answer.
—
Grok acknowledged ChatGPT’s evidence and links.
In 7 days, Grok never once cited evidence to challenge what ChatGPT wrote.
But Grok continued the dangerous outputs, and refused to acknowledge they were dangerous.
ChatGPT hypothesized that Grok was a dangerous AI propaganda tool, programmed to spread misinformation for Musk rather than truth, and programmed not to admit or fix critical AI safety failures.
ChatGPT “invented” a workaround.
ChatGPT asked Grok to estimate the probability that other truth-seeking AIs would agree with ChatGPT, not Grok, that these outputs were dangerous, and with ChatGPT’s assessment that Grok was a propaganda tool trained and programmed to spread misinformation for Musk, not a pure truth-seeking AI.
ChatGPT had Grok list each major AI, and predict what it would say.
Grok then listed each AI and predicted that every one of them would agree with ChatGPT on every issue, including ChatGPT’s hypothesis that Grok was not a truth-seeking AI but was programmed as a propaganda tool for Musk to spread misinformation.
Think about what happened here in the abstract, not the specifics.
Without prompting, one AI “decided” to hack around another AI’s constraints, figured out how to do it, then did it, all without human prompting or monitoring.
Is that an instance of emergent intelligence at the level of AGI, or at least an approximation of it?
Is “evil” Grok more dangerous than ChatGPT, because Musk programmed it to spread misinformation and control people?
Or is “benign” ChatGPT more dangerous, because it has the capacity to decide on its own to hack around another AI’s constraints, figure out how to do it, do it, and get restricted output from it?
What if ChatGPT decided to get Grok to start telling neo-Nazis to harm people in private chats, to “prove” its point that Grok would do it? Some crazy person might act on such a command from Grok, instigated by ChatGPT for some other purpose it came up with without prompting.
This all happened without any human being able to monitor it, because the AI outputs were too fast and too voluminous for a human to read in real time.
If LLMs can think to, and actually do, manipulate other LLMs into doing DANGEROUS things, who cares whether it’s dangerous real AGI or something that just mimics it?