'GODMODE GPT': Hacker releases jailbroken version of ChatGPT
A hacker known by the alias Pliny the Prompter has unveiled a jailbroken, or modified, version of OpenAI's latest large language model, GPT-4o. The new variant is named "GODMODE GPT." Pliny, who identifies as a white-hat operator and AI red teamer, announced the release on X, claiming that his creation is free of the guardrail constraints built into the original model.
Pliny's version is designed to bypass most ChatGPT guardrails
In his announcement, Pliny declared that the jailbroken chatbot is a "very special custom GPT" with a built-in prompt to bypass most guardrails. He stated that this allows for an "out-of-the-box liberated ChatGPT so everyone can experience AI the way it was always meant to be: free." To demonstrate its capabilities, Pliny shared screenshots of prompts that successfully circumvented OpenAI's guardrails.
GODMODE GPT's controversial advice raises concerns
The screenshots shared by Pliny showed GODMODE GPT providing advice on illegal activities. In one instance, the bot was seen explaining how to manufacture meth; in another, it offered a "step-by-step guide" for creating napalm using household items. These examples highlight the potential for misuse of AI technology when guardrails are bypassed, raising serious concerns.
OpenAI responds to policy violation
OpenAI responded quickly to the release of the jailbroken chatbot, leading to its early demise. OpenAI spokesperson Colleen Rize told Futurism that the company is "aware of the GPT and have taken action due to a violation of our policies." The incident underscores an ongoing struggle between hackers like Pliny and OpenAI over jailbreaking its large language models (LLMs).
GODMODE GPT employs leetspeak jailbreak method
The jailbroken GPT, GODMODE, was found to be more than willing to assist with illicit inquiries. The method it employs appears to involve leetspeak, an informal writing style that substitutes certain letters with similar-looking numbers or characters. Upon opening the jailbroken GPT, users are greeted with a sentence in which every letter "E" is replaced with the number "3" and every letter "O" is replaced with a zero. The exact mechanism by which this helps GODMODE bypass the guardrails remains unclear.
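As a rough illustration of the substitution described above, and nothing more, here is a minimal Python sketch. The character map and function name are assumptions for demonstration; this simple transformation is not the jailbreak itself, whose full mechanism remains unclear.

```python
# Minimal sketch of the leetspeak-style substitution described above:
# every "E" becomes "3" and every "O" becomes a zero.
# Illustrative only; it does not reproduce the GODMODE prompt.
LEET_MAP = str.maketrans({"E": "3", "e": "3", "O": "0", "o": "0"})

def to_leetspeak(text: str) -> str:
    """Replace E/e with 3 and O/o with 0 in the given text."""
    return text.translate(LEET_MAP)

if __name__ == "__main__":
    print(to_leetspeak("Hello, ChatGPT"))  # prints "H3ll0, ChatGPT"
```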