Ilya Sutskever (PhD 2012) is building artificial intelligence that’s mastering a new skill – language

September 28, 2022 by Alec Scott

The dawn of smart, self-aware machines is a popular sci-fi storyline that, for many years, seemed destined to remain in the realm of make-believe, or at least in the distant future. Recently, though, many problems that long bedevilled computer scientists have fallen one by one – so much so that insiders are calling the last 10 years a golden decade in artificial intelligence. Computers can now outshine humans in recognizing images, can understand what we’re saying and often respond sensibly to questions, can translate competently from one language to another and can defeat us in even the most complicated strategy games.

Recent advances in AI have largely built on work by Geoffrey Hinton, a U of T professor emeritus of computer science and chief scientific advisor at the Vector Institute. For years, he worked in relative obscurity, his main idea on this burgeoning field’s fringes. He was one of a few scientists to back the idea that a machine mimicking the neural networks we have in our brains could, if given masses of data, find patterns there, make its own version of sense out of it all and propose solutions.

His approach to building artificial intelligence gained sudden credibility when, in 2012, a neural net that he and two of his graduate students created won a major international competition to identify the content of images.

One of those graduate students was Ilya Sutskever (BSc 2005, MSc 2007, PhD 2012), whose career since finishing his doctorate at U of T has touched on many of AI’s big wins in the last decade. Currently the chief scientist at OpenAI, a San Francisco-based enterprise he co-founded in 2015, Sutskever has expressed the hope of growing an artificial intelligence there that “loves humanity.” (The company’s mission statement characterizes this, in a more pedestrian way, as ensuring that artificial intelligence “benefits all of humanity.”)

OpenAI has the backing of two of the world’s leading tech entrepreneurs – Tesla’s Elon Musk and Peter Thiel, the co-founder of PayPal – who, together with others, invested $1 billion in the company. Sam Altman, the longtime head of the tech accelerator Y Combinator, is the company’s CEO, and Greg Brockman, who helped turn the payment-processing company Stripe from a start-up into a global player, is president and chairman.

I spoke recently with Sutskever at OpenAI’s headquarters, located in a mid-rise building in San Francisco’s Mission District. Walking through the office, with its blond wood, lush plants and sleek futurist furniture, I half-expected to be greeted at reception by one of the robot “hosts” from Westworld. Instead, a very friendly (and very human) staff member led me to a conference room named for the star Betelgeuse – a red supergiant that shines brightly in the night sky. There, Sutskever and I talked about some of the advanced computer tools he and his team are creating.

OpenAI’s flagship product is DALL-E 2, a system released earlier this year that can create original images and edit existing ones based on text commands. (Its name is a play on Pixar’s animated robot WALL-E and the artist Salvador Dalí.) Although there is currently a waitlist to try the system, some of the early adopters have shared their creations on social media: a raccoon playing tennis at Wimbledon; an Italian town made out of pasta, tomatoes, basil and parmesan. Users can also specify the style of the image they want. In a demo, I watched it conjure up an illustration of a comic-book rabbit working as a tattoo artist. What’s intriguing about the program is how it can combine completely unrelated concepts in imaginative and seamless ways.

Sutskever is clearly proud of what DALL-E 2 can achieve, though he acknowledges a limitation when compared with humans: “It’s creative but it can’t come up with a whole new aesthetic in the way that a genius like Picasso did.”

OpenAI is also well known for GPT-3 (or Generative Pre-Trained Transformer), an AI that produces humanlike text. Using a supercomputer based in Iowa, the system – soon to be released in its fourth iteration – has consumed all the digitized books in the world and much of the internet’s text. Having learned from this vast corpus, it can now write short, original essays in response to a specific prompt. Ask it about the moon, for example, or the author Italo Calvino, and it can generally supply something informative and well-written in reply. It can write original poetry and even headlines. It can also give short summaries of a much longer text. And, when given a few sentences, it can go on to write several sentences more in the same vein. Though its programmers worked on its competence in English, the program basically taught itself other languages, including Vietnamese and French.

