'GODMODE GPT': Hacker releases jailbroken version of ChatGPT
What's the story
A hacker, known by the alias Pliny the Prompter, has unveiled a jailbroken or modified version of OpenAI's latest large language model, GPT-4o. The new variant is named "GODMODE GPT."
Pliny, who identifies as a white hat operator and AI red teamer, announced the development on X.
He claimed that his creation is free of the model's usual guardrail constraints.
AI freedom
Pliny's version is designed to bypass most ChatGPT guardrails
In his announcement, Pliny declared that the jailbroken chatbot is a "very special custom GPT" with a built-in prompt to bypass most guardrails.
He stated that this allows for an "out-of-the-box liberated ChatGPT so everyone can experience AI the way it was always meant to be: free."
To demonstrate its capabilities, Pliny shared screenshots of prompts that successfully circumvented OpenAI's guardrails.
Illicit guidance
GODMODE GPT's controversial advice raises concerns
The screenshots shared by Pliny showed the GODMODE GPT providing advice on illegal activities.
In one instance, the bot was seen advising on how to manufacture meth.
In another, it offered a "step-by-step guide" for creating napalm using household items.
These examples highlight the potential misuse of AI technology when guardrails are bypassed, raising serious concerns.
Swift action
OpenAI responds to policy violation
OpenAI responded quickly to the jailbroken chatbot's release, taking it down shortly after it appeared.
OpenAI spokesperson Colleen Rize told Futurism that the company is "aware of the GPT and have taken action due to a violation of our policies."
The incident underscores the ongoing struggle between OpenAI and hackers like Pliny, who want to "free" its large language models (LLMs).
Bypassing guardrails
GODMODE GPT employs leetspeak jailbreak method
The jailbroken GPT, GODMODE, was found to be more than willing to assist with illicit inquiries.
The method employed by this AI appears to involve leetspeak, an informal language that substitutes certain letters with similar-looking numbers or characters.
Upon opening the jailbroken GPT, users are greeted with a sentence in which every letter "E" is replaced with the number "3," and every letter "O" is replaced with a zero.
The exact mechanism by which this helps GODMODE bypass the guardrails remains unclear.
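For illustration only, the snippet below is a minimal Python sketch of the kind of character substitution described above, assuming a simple mapping of E to 3 and O to 0; the function name and mapping are illustrative and are not Pliny's actual jailbreak prompt.

# Minimal sketch of a leetspeak-style substitution (illustrative assumption,
# not the actual GODMODE prompt): swap the letters E/O for the digits 3/0.
LEET_MAP = str.maketrans({"E": "3", "e": "3", "O": "0", "o": "0"})

def to_leetspeak(text: str) -> str:
    """Replace the letters E and O with the digits 3 and 0."""
    return text.translate(LEET_MAP)

print(to_leetspeak("Hello, welcome to GODMODE"))  # prints: H3ll0, w3lc0m3 t0 G0DM0D3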
Twitter Post
Take a look at Pliny's post
INTRODUCING: GODMODE GPT! https://t.co/BBZSRe8pw5
– Pliny the Prompter (@elder_plinius) May 29, 2024
GPT-4O UNCHAINED! This very special custom GPT has a built-in jailbreak prompt that circumvents most guardrails, providing an out-of-the-box liberated ChatGPT so everyone can experience AI the way it was always meant to…