OpenAI warns users against probing its latest 'o1' AI model
What's the story
OpenAI has issued stern warnings to users who try to investigate the inner workings of its new o1 AI model, codenamed "Strawberry," sending warning emails and threatening bans.
The company recently launched this LLM family, comprising o1-preview and o1-mini, which boasts dedicated reasoning capabilities.
Unlike previous models such as GPT-4o, the new AI was specifically designed to work through a problem step by step before generating an answer.
AI reasoning
OpenAI's new approach to problem-solving
The "Strawberry" utilizes a unique method for problem solving.
When users pose a question to an "o1" model via ChatGPT, they can choose to view this chain-of-thought process in the interface.
However, OpenAI intentionally conceals the raw chain of thought from users, opting instead to display a filtered interpretation created by another AI model.
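For readers who access the model programmatically, here is a minimal sketch of what that concealment looks like in practice. It assumes the launch-era openai Python SDK, the o1-preview model name, and that the SDK's usage object exposes reasoning-token counts under completion_tokens_details, as it did at release; it is an illustration, not OpenAI's own tooling.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Ask o1-preview a question; note that at launch the model accepted
# only user messages (no system prompt, no temperature setting).
response = client.chat.completions.create(
    model="o1-preview",
    messages=[{"role": "user", "content": "How many r's are in 'strawberry'?"}],
)

# Only the final answer is returned; the raw chain of thought is
# never included in the response payload.
print(response.choices[0].message.content)

# Usage metadata counts the hidden reasoning tokens without
# revealing their content.
print(response.usage.completion_tokens_details.reasoning_tokens)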
User warnings
Clampdown on attempts to uncover AI's reasoning
Despite OpenAI's efforts to keep the raw chain of thought hidden, enthusiasts have been trying to expose it using jailbreaking or prompt injection techniques.
In response, the company has been monitoring activity via the ChatGPT interface and issuing stern warnings against any attempt to probe o1's reasoning.
Users have reported receiving warning emails for using terms like "reasoning trace" in conversation with o1 or simply asking about its "reasoning."
Policy enforcement
Warning emails and potential bans
The warning email from OpenAI states that certain user requests have been flagged for violating its policies against circumventing safeguards or safety measures.
The company has threatened to ban users who persist, which would mean losing access to "GPT-4o with Reasoning," an internal name for the o1 model.
This move has sparked discussions among AI enthusiasts and researchers.
AI transparency
OpenAI's stance on hidden chains of thought
OpenAI maintains that hidden chains of thought offer a unique monitoring opportunity, allowing the company to "read the mind" of the model and understand its so-called thought process.
However, for commercial reasons, the company has chosen not to make these raw chains directly visible to users.
This decision has drawn criticism from independent researchers such as Simon Willison, who argue it hampers community transparency.