Microsoft on Tuesday announced an artificial intelligence system that can suggest code to software developers as they write.
Microsoft is looking to simplify the process of programming, the area where the company got its start in 1975. That could keep programmers who already use the company’s tools satisfied and also attract new ones.
The system, called GitHub Copilot, draws on source code uploaded to code-sharing service GitHub, which Microsoft acquired in 2018, as well as other websites. Microsoft and GitHub developed it with help from OpenAI, an AI research start-up that Microsoft backed in 2019.
Researchers at Microsoft and other institutions have been trying to teach computers to write code for decades. The concept has yet to go mainstream, at times because programs to write programs have not been versatile enough. The GitHub Copilot effort is a notable attempt in the field, relying as it does on a large volume of code in many programming languages and vast Azure cloud computing power.
Nat Friedman, CEO of GitHub, describes GitHub Copilot as a virtual version of what software creators call a pair programmer, a reference to pair programming, in which two developers work side by side on the same project. The tool looks at existing code and comments in the current file and the location of the cursor, and it offers up one or more lines to add. As programmers accept or reject suggestions, the model learns and becomes more sophisticated over time.
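The inputs Friedman describes (the current file's code and comments, plus the cursor position) could be gathered roughly as in the Python sketch below. It is an illustration only, not GitHub's implementation: the `CompletionContext` class, the `build_context` function and the `max_prefix_chars` limit are hypothetical names introduced here, and the step that would send this context to a model is left out.

```python
from dataclasses import dataclass


@dataclass
class CompletionContext:
    """Text a completion tool might hand to a model (hypothetical structure)."""
    prefix: str  # code and comments before the cursor
    suffix: str  # code after the cursor


def build_context(source: str, cursor: int, max_prefix_chars: int = 2000) -> CompletionContext:
    """Split the current file at the cursor, keeping only the most recent prefix text."""
    if not 0 <= cursor <= len(source):
        raise ValueError("cursor is outside the file")
    start = max(0, cursor - max_prefix_chars)
    return CompletionContext(prefix=source[start:cursor], suffix=source[cursor:])


# A small file with a comment, a function header and an empty, indented body line.
file_text = (
    "# return the larger of two numbers\n"
    "def larger(a, b):\n"
    "    \n"
    "\n"
    "print(larger(2, 3))\n"
)
# Place the cursor at the end of the indented blank line inside the function.
cursor = file_text.index("    \n") + 4
ctx = build_context(file_text, cursor)
```

In this sketch the comment and the function header end up in `ctx.prefix`, which is the kind of context a suggestion such as `return a if a > b else b` would have to be based on.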
The new software makes coding faster, Friedman said in an interview last week. Hundreds of developers at GitHub have been using the Copilot feature all day while coding, and the majority of them are accepting suggestions and not turning the feature off, Friedman said.
Programming involves coming up with an idea about how to do something and then implementing it, and GitHub Copilot is good at the second part, said Greg Brockman, a co-founder of OpenAI and its chief technology officer.
“You don’t want to go read Twilio’s API documentation. It knows all that stuff. It’s actually quite reliable at it,” he said. Brockman calls this work last-mile programming, and he said that having computers take care of it leads to speed improvements.
Microsoft’s chief technology officer, Kevin Scott, has seen that happen firsthand.
“It can save me from having to dive through a whole bunch of documentation to get a tool to do a thing that I know it’s capable of doing, and that is so good for productivity,” he said. “I can’t even tell you the number of hours I’ve wasted trying to figure out the right way to do a relatively prosaic thing, just navigating the complexity of these tools.”
GitHub Copilot isn’t just for software veterans like him, though.
“It may very well be one of those things that makes programming itself more approachable,” Scott said.
It supports almost every programming language, but it’s been designed to work best with JavaScript, Python and TypeScript, Friedman said.
GitHub Copilot will first appear in Microsoft’s Visual Studio Code, a free open-source product, and Microsoft plans to incorporate it into the commercial Visual Studio product in the future.
A descendant of OpenAI’s GPT-3
The model at the core of GitHub Copilot, called Codex, is a descendant of GPT-3, a powerful model that OpenAI trained on large volumes of text, Brockman said. Engineers fed the model “many, many terabytes of public source code out there,” Friedman said.
This isn’t the first time Microsoft has leaned on OpenAI to deliver smart software. Last month Microsoft showed how it would update the Power Apps Studio application, which nontechnical people use to write apps, so that users could type in words describing the elements they’d like to add and have GPT-3 show options for the necessary code.
OpenAI recognizes that GPT-3, which it introduced last year, has the potential to come up with code. The start-up says on its website that an online service delivering GPT-3 can handle “code completion.” But back when OpenAI was first training the model, the start-up had no intention of teaching it how to help write code, Brockman said. It was meant more as a general-purpose language model that could, for instance, generate articles, fix incorrect grammar and translate from one language into another.
Over the next few months, people experimented with the model to see what it could do, both useful and silly. One engineer, for instance, made a website that could design a button that looked like a watermelon. Brockman reached out to Friedman, who runs a key destination where millions of programmers work on code, and things proceeded from there.
GitHub employees have tried to ensure that GitHub Copilot will generate secure, high-quality code. “We’ve built a number of safety mechanisms into Copilot that we think are cutting-edge in terms of reducing the chances of mistakes in various areas here, but they’re definitely not perfect,” Friedman said.
The underlying technology won’t be only Microsoft’s to use. OpenAI will release the Codex model this summer for third-party developers to weave into their own applications, Brockman said.
Microsoft could someday release a version of the product that enterprises could train to understand their own programming styles, Scott said. For now, Microsoft is offering only a service that knows about code stored in public repositories.