AI Could Soon Write Code Based on Ordinary Language

Microsoft reveals plans to bring GPT-3, best known for generating text, to programming. “The code writes itself,” CEO Satya Nadella says. 
Microsoft VP Charles Lamanna says GPT-3 can help empower people with little coding experience. Courtesy of Microsoft

In recent years, researchers have used artificial intelligence to improve translation between programming languages or automatically fix problems. The AI system DrRepair, for example, has been shown to solve most issues that spawn error messages. But some researchers dream of the day when AI can write programs based on simple descriptions from non-experts.

On Tuesday, Microsoft and OpenAI shared plans to bring GPT-3, one of the world’s most advanced models for generating text, to programming based on natural language descriptions. This is the first commercial application of GPT-3 undertaken since Microsoft invested $1 billion in OpenAI in 2019 and gained exclusive licensing rights to the model last year.

“If you can describe what you want to do in natural language, GPT-3 will generate a list of the most relevant formulas for you to choose from,” said Microsoft CEO Satya Nadella in a keynote address at the company’s Build developer conference. “The code writes itself.”


Microsoft VP Charles Lamanna told WIRED the sophistication offered by GPT-3 can help people tackle complex challenges and empower those with little coding experience. GPT-3 will translate natural language into PowerFx, a fairly simple programming language similar to Excel formulas that Microsoft introduced in March.
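To give a rough sense of what that translation could look like, consider a sketch: a user might type a plain-English request and be offered a candidate formula to accept or edit. The examples below are illustrative only, not Microsoft's actual output, and the table name Customers and column name Revenue are hypothetical.

    // "Show me customers whose revenue is above 100,000"
    Filter(Customers, Revenue > 100000)

    // "Show me those customers with the highest revenue first"
    SortByColumns(Filter(Customers, Revenue > 100000), "Revenue", SortOrder.Descending)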

This is the latest demonstration of applying AI to coding. Last year at Microsoft’s Build, OpenAI CEO Sam Altman demoed a language model fine-tuned with code from GitHub that automatically generates lines of Python code. As WIRED detailed last month, startups like SourceAI are also using GPT-3 to generate code. IBM last month showed how its Project CodeNet, with 14 million code samples from more than 50 programming languages, could reduce the time needed to update a program with millions of lines of Java code for an automotive company from one year to one month.

Microsoft's new feature is based on a neural network architecture known as Transformer, used by big tech companies including Baidu, Google, Microsoft, Nvidia, and Salesforce to create large language models using text training data scraped from the web. These language models continually grow larger. The largest version of Google’s BERT, a language model released in 2018, had 340 million parameters, the basic building blocks of neural networks. GPT-3, which was released one year ago, has 175 billion parameters.

Such efforts have a long way to go, however. In one recent test, the best model succeeded only 14 percent of the time on introductory programming challenges compiled by a group of AI researchers.

Still, the researchers who conducted that study concluded that the results show “machine learning models are beginning to learn how to code.”

To challenge the machine learning community and measure how well large language models can program, a group of AI researchers last week introduced a benchmark for automated coding with Python. In that test, GPT-Neo, an open-source language model with an architecture similar to OpenAI’s flagship models, outperformed GPT-3. Dan Hendrycks, the lead author of the paper, says that’s because GPT-Neo was fine-tuned using data gathered from GitHub, a popular platform for collaborative coding projects.

Hendrycks believes there will be opportunities for big advances as researchers and programmers learn more about how language models can simplify coding.

Hendrycks thinks applications of large language models based on the Transformer architecture may begin to change programmers’ jobs. Initially, he says, applications of such models will focus on specific tasks before branching out into more generalized forms of coding. For example, if a programmer pulls together a large number of test cases for a problem, a language model can generate several candidate solutions and let a human decide the best course of action. That changes the way people code “because we don’t just keep spamming until something passes,” he says.
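A minimal sketch of that workflow in Python, assuming some model call proposes candidate implementations and the test cases winnow them down before a human reviews what remains. Here generate_candidates is a hard-coded placeholder standing in for a code-generating model, not a real API.

    # Hypothetical workflow: a model proposes candidates, test cases filter them,
    # and a human picks from the survivors. generate_candidates is a placeholder
    # for a call to a code-generating language model.
    def generate_candidates(description, n=2):
        return [
            lambda xs: sorted(xs),           # a plausible solution
            lambda xs: list(reversed(xs)),   # an incorrect one, filtered out below
        ]

    def passes_all(candidate, test_cases):
        return all(candidate(inputs) == expected for inputs, expected in test_cases)

    test_cases = [([3, 1, 2], [1, 2, 3]), ([5, 4], [4, 5])]
    survivors = [c for c in generate_candidates("sort a list ascending")
                 if passes_all(c, test_cases)]
    # A human would then review the surviving candidates and choose the best one.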

Hendrycks thinks AI that suggests your next line of code could improve the productivity of human programmers, potentially reducing demand for programmers or allowing smaller teams to accomplish their goals.

OpenAI currently provides private beta access to GPT-3, which has demonstrated an ability to accomplish tasks ranging from completing SAT analogies to answering questions and generating text. It has also generated text describing sexual acts with children and offensive text about Black people, women, and Muslims. OpenAI has shared little about the filtering methods it uses to try to address such toxicity; if the company can’t figure out how to eliminate offensive or toxic output from GPT-3, that could limit its use.

Exactly how Microsoft, OpenAI, and GitHub will work together on AI for coding is still unclear. In 2018, soon after Microsoft acquired GitHub, the company detailed efforts to use language models to power semantic code search, the first in a series of applied research initiatives involving AI. Such a capability could make it easier for a programmer to find and use code with natural language queries. A GitHub spokesperson declined to comment on the status of that project.

