Why ChatGPT can't crack coding problems after 2021
A study published in IEEE Transactions on Software Engineering has evaluated code generated by OpenAI's ChatGPT. It found that ChatGPT's performance dropped on coding problems released after 2021, likely because such problems were absent from its training data. Notably, the study revealed a wide range of success rates for producing functional code, from as low as 0.66% to as high as 89%, with factors such as task difficulty and the programming language used also influencing the results.
Limitations in code generation
The research team tested GPT-3.5's ability to solve 728 coding problems from the LeetCode testing platform in five programming languages. Yutian Tang, a lecturer at the University of Glasgow involved in the study, emphasized the importance of understanding ChatGPT's strengths and limitations in order to improve generation techniques. He noted that ChatGPT demonstrated proficiency, especially with problems that existed on LeetCode before 2021.
ChatGPT's efficiency and error correction capabilities
Interestingly, the study found that ChatGPT generated code with lower runtime and memory overheads than at least 50% of human solutions to the same problems. However, Tang noted that ChatGPT was less successful when it came to correcting its own mistakes. He explained that "ChatGPT may generate incorrect code because it does not understand the meaning of algorithm problems, thus, this simple error feedback information is not enough."
Security concerns and complexity in AI-generated code
The study also highlighted certain security concerns with AI-generated code. The researchers found that ChatGPT-generated code contained a fair number of vulnerabilities, such as a missing null test, but many of these were easily fixable. Tang noted that the generated code in C was the most complex, followed by C++ and Python, the last of which had complexity similar to human-written code.
Recommendations for developers using ChatGPT
Tang suggested that developers using ChatGPT should supply additional information to help the AI better understand problems and avoid potential vulnerabilities. He advised, "When encountering more complex programming problems, developers can provide relevant knowledge as much as possible, and tell ChatGPT in the prompt which potential vulnerabilities to be aware of." This guidance aims to improve both the functionality and the security of AI-generated code.