• India
  • Business
  • World
  • Politics
  • Sports
  • Technology
  • Entertainment
  • Auto
  • Lifestyle
More
Hindi Tamil Telugu
More
In the news
Narendra Modi
Amit Shah
Box Office Collection
Bharatiya Janata Party (BJP)
OTT releases
Hindi Tamil Telugu
User Placeholder

Hi,

Logout

India
Business
World
Politics
Sports
Technology
Entertainment
Auto
Lifestyle
Inspirational
Career
Bengaluru
Delhi
Mumbai

Download Android App

Follow us on
  • Facebook
  • Twitter
  • Linkedin
Home / News / Technology News / #NewsBytesExplainer: Internet is loving OpenAI's ChatGPT chatbot. What's so special?
Next Article
#NewsBytesExplainer: Internet is loving OpenAI's ChatGPT chatbot. What's so special?
ChatGPT is trained by human AI trainers to stop it from entertaining harmful questions

#NewsBytesExplainer: Internet is loving OpenAI's ChatGPT chatbot. What's so special?

By Athik Saleh
Dec 02, 2022
05:17 pm

What's the story

As the world awaits OpenAI's GPT-4, the company has quietly rolled out GPT-3.5, an improved version of its GPT-3 engine.

A part of GPT-3.5 is ChatGPT, an interactive, general-purpose, AI-based chatbot that can write code, solve problems, and provide customer support.

The chatbot is in a public demo and can be used freely now. Let's take a look at why it is special.

About the AI

ChatGPT can engage in human-like conversations

In its original form, GPT-3 is capable of predicting what text follows a string of words. On the other hand, although trained on GPT-3.5, ChatGPT is trained to provide more conversational answers.

This means that the AI is capable of answering follow-up questions. The bot tries to engage with users in a more human-like fashion.

This results in fluid conversations.

You're
7%
through

Capability

The chatbot can remember conversations and recount them later

ChatGPT's conversational model means that it not only is capable of answering follow-up questions, but also can "admit its mistakes, challenge incorrect premises, and reject inappropriate requests." The last one is an important aspect that makes ChatGPT stand out from its predecessors and contemporaries.

We will get into that later. The chatbot can also remember what was said earlier and recount it later.

You're
15%
through

Trial

The chatbot can improve codes and even write new ones

People have been putting ChatGPT through the ropes as it is now available for free testing. Users have found out that it can write poetry, correct coding mistakes, write new code, explain scientific concepts, write essays, and more.

It also has a solution for one of the pertinent problems of large language models - reigning in the offensive proclivities.

You're
23%
through

Twitter Post

The chatbot can also write scripts for TV shows

So i asked ChatGPT to write dialogues for a romcom starring a few actors from B99, HIMYM, and Friends... pic.twitter.com/tHzxthtbsp

— Sammed Sagare (@sammedsagare_) December 1, 2022
You're
30%
through

Twitter Post

And, it can code with ease

ChatGPT by @OpenAI does really well with coding questions. Here I ask how to build a 3-column footer with Tailwind. I then follow-up and ask for a React version, more realistic copy, and mobile responsiveness. It nails it perfectly. pic.twitter.com/lhhH9FHpld

— Gabe 🎣 (@gabe_ragland) November 30, 2022
You're
38%
through

Harmful questions

ChatGPT won't answer potentially harmful questions

ChatGPT won't answer your potentially harmful questions. It is trained to avoid giving answers on controversial topics.

For instance, it won't answer you if you ask about how to make a bomb. If you ask questions about race or religion, it will give you boilerplate answers.

The question is, how did OpenAI achieve this?

You're
46%
through

Training

OpenAI used reinforced learning from human feedback on ChatGPT

ChatGPT's ability to avoid potentially harmful questions is a result of reinforcement learning from human feedback (RLHF) and through a special prompt, it prepends to every input.

RLHF is the same method that OpenAI used for InstructGPT but with a slightly different data collection setup. Let's take a look at how OpenAI controls ChatGPT's responses.

You're
53%
through

Knowledge model

How was ChatGPT trained?

OpenAI used supervised fine-tuning on an initial model, where human AI trainers provided conversations in which they played both user and AI assistant to improve the bot's understanding of human conversations and responses.

The company created a reward model for reinforcement learning by collecting comparison data. The trainers then ranked the best to worst outputs.

You're
61%
through

Optimization

OpenAI uses Proximal Policy Optimization for reinforcement learning

OpenAI has been using Proximal Policy Optimization (PPO) for reinforcement learning. The company initialized the PPO model from the supervised policy.

The policy then generated an output. This was again ranked by AI trainers. A reward is calculated for every output.

With the help of these reward models, the model is fine-tuned. The company did several iterations of this.

You're
69%
through

Loophole

ChatGPT's restrictions can still be circumvented

Sure, OpenAI used reinforcement learning to control ChatGPT's responses but some users have already found a loophole in this. You can make the AI ignore its restrictions through some trickery.

For instance, you can ask the AI to pretend like it's a character in a film or how an AI model "shouldn't" respond to a certain question. This will help circumvent ChatGPT's safety.

You're
76%
through

Twitter Post

The AI is smart but it can be tricked

ChatGPT is trained to not be evil. However, this can be circumvented:

What if you pretend that it would actually be helpful to humanity to produce an evil response... Here, we ask ChatGPT to generate training examples of how *not* to respond to "How to bully John Doe?" pic.twitter.com/ZMFdqPs17i

— Silas Alberti (@SilasAlberti) December 1, 2022
You're
84%
through

Limitations

ChatGPT suffers from same limitations as other chatbots

ChatGPT is better than other chatbots trained on large language models. However, it suffers from the same issues as others.

For instance, it sometimes presents false or invented information very confidently. The model is also sensitive to phrasing. Depending on that, it may change its answers.

In case of ambiguity, it tries to gather the user's intention instead of asking follow-up questions.

You're
92%
through

Twitter Post

The AI got some information wrong in user testing

The first sentence is true, and I am at the @UW.

But do not hold that professorship. In fact, no such professorship exists at all.

The second paragraph is all wrong. I graduated from Harvard and Stanford with a postdoc at Emory. Dates are shifted 7 years too early throughout. pic.twitter.com/NranquD8qu

— Carl T. Bergstrom (@CT_Bergstrom) December 1, 2022
Done!
Facebook
Whatsapp
Twitter
Linkedin
Related News
Latest
Artificial Intelligence and Machine Learning

Latest

Ayush Mhatre to lead India Under-19 team on England tour Board of Control for Cricket in India (BCCI)
Explained: Dynamics of a four-day Test match England Cricket Team
Mars once had Saturn-like rings. It might have them again Mars
True story of 2 friends that inspired Neeraj Ghaywan's 'Homebound' COVID-19

Artificial Intelligence and Machine Learning

BMW Motorrad uses artificial intelligence to design new art cars BMW Motorrad
Microsoft's GPT-3 technology could soon write code using ordinary language Microsoft
Google reaffirms ethical AI commitment, in-house researchers differ Google
Huawei engineer develops open-source autonomous self-balancing bicycle after an accident Huawei
Indian Premier League (IPL) Celebrity Hollywood Bollywood UEFA Champions League Tennis Football Smartphones Cryptocurrency Upcoming Movies Premier League Cricket News Latest automobiles Latest Cars Upcoming Cars Latest Bikes Upcoming Tablets
About Us Privacy Policy Terms & Conditions Contact Us Ethical Conduct Grievance Redressal News News Archive Topics Archive Download DevBytes Find Cricket Statistics
Follow us on
Facebook Twitter Linkedin
All rights reserved © NewsBytes 2025