Code Generation Gets a Boost from PanGu-Coder2

A new AI system called PanGu-Coder2 is poised to advance the state of the art in automatic code generation, according to a new paper from researchers at Huawei Cloud and collaborators. PanGu-Coder2 builds on recent advances in large language models to generate code from natural language prompts, outperforming previous models on standard benchmarks.

The researchers propose a novel training framework called RRTF (Rank Responses to align Test & Teacher Feedback) to enhance an existing 15-billion-parameter model called StarCoder. The framework guides the model by ranking candidate code solutions generated from prompts, using both automated test cases and human expert preferences as feedback signals. Solutions that pass more test cases, or that human judgment favors, are ranked higher. The model is then trained on (prompt, higher-ranked solution, lower-ranked solution) triples to reinforce generating solutions preferred by the test-plus-teacher feedback. This aligns the model's behavior with code-quality signals in a data-efficient manner.
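To make the ranking step concrete, here is a minimal sketch in Python of how candidates might be scored and paired for training. The names (`Candidate`, `rank_candidates`, `build_training_triples`) and the simple two-key scoring scheme are illustrative assumptions, not the paper's actual implementation:

```python
# Illustrative sketch of an RRTF-style ranking step (names and scoring are assumptions).
from dataclasses import dataclass


@dataclass
class Candidate:
    code: str
    tests_passed: int      # number of unit tests this candidate passes
    teacher_score: float   # preference score in [0, 1] from expert/teacher feedback


def rank_candidates(candidates: list[Candidate], total_tests: int) -> list[Candidate]:
    """Rank candidates by test pass rate first, teacher preference second."""
    return sorted(
        candidates,
        key=lambda c: (c.tests_passed / total_tests, c.teacher_score),
        reverse=True,
    )


def build_training_triples(prompt: str, candidates: list[Candidate], total_tests: int):
    """Pair the prompt with a higher- and a lower-ranked solution for training."""
    ranked = rank_candidates(candidates, total_tests)
    return [
        (prompt, hi.code, lo.code)
        for i, hi in enumerate(ranked)
        for lo in ranked[i + 1:]
    ]
```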

On the HumanEval benchmark, PanGu-Coder2 achieved 61.64% pass@1 accuracy in generating working code from prompts, surpassing WizardCoder, the previous best open-source model. It also outperformed larger proprietary models such as Google's LaMDA. PanGu-Coder2 similarly led on benchmarks that test code generation in more realistic, context-dependent programming scenarios.
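For readers unfamiliar with how such benchmarks are scored: HumanEval counts a completion as correct only if it passes a set of hidden unit tests. The sketch below is a simplified illustration of that scoring loop (real harnesses sandbox execution for safety, and the helper names here are assumptions):

```python
# Simplified sketch of HumanEval-style functional scoring.
def passes_tests(program: str, test_code: str, entry_point: str) -> bool:
    """Return True if the candidate program passes its unit tests.

    `program` is the full candidate source (prompt + model completion);
    `test_code` defines a `check(candidate)` function that asserts correctness.
    """
    env: dict = {}
    try:
        exec(program, env)              # define the candidate function
        exec(test_code, env)            # define the hidden `check` function
        env["check"](env[entry_point])  # run the hidden tests; raises on failure
        return True
    except Exception:
        return False


def pass_at_1(results: list[bool]) -> float:
    """Fraction of problems solved on the first (greedy) attempt."""
    return sum(results) / len(results)
```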

The success highlights the potential of ranking-based training paradigms for aligning large language model behavior with human preferences. While reinforcement learning has shown promise for code generation, the simpler and more efficient RRTF approach proved effective here, and the combined human-plus-test feedback signal is well suited to guiding code quality.
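RRTF belongs to the same family as ranking-based objectives such as RRHF. As a rough illustration of the general technique (an assumption on my part, not the paper's exact loss), a pairwise ranking term in PyTorch might look like this:

```python
import torch


def pairwise_rank_loss(logp_hi: torch.Tensor, logp_lo: torch.Tensor) -> torch.Tensor:
    """Hinge-style ranking loss over (length-normalized) sequence log-probabilities.

    Penalizes the model whenever a lower-ranked solution is assigned a
    higher log-probability than a higher-ranked one.
    """
    return torch.clamp(logp_lo - logp_hi, min=0).mean()
```

In practice, a term like this is typically combined with an ordinary cross-entropy loss on the top-ranked solution, so the model still learns to generate the preferred code token by token.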

The researchers emphasize the importance of training data quality and compute in achieving the boost provided by RRTF. More training epochs over diverse code examples steadily increased performance. PanGu-Coder2 reaches near human-level proficiency on simpler coding prompts, though complex requirements still challenge it.

Looking forward, combining RRTF with other techniques such as instruction tuning may unlock further gains. Code generation has applications spanning software development, computer science education, and pervasive AI assistance. With models like PanGu-Coder2 growing more capable, the vision of multifunctional AI adept at writing software is coming closer to reality.
