Microsoft’s new bot can draw a photo-realistic bird based on text descriptions

Microsoft’s research labs have created a new artificial intelligence, or bot, that can draw images from simple descriptions. The company says the bot can render anything in pixel form from the caption-like text you provide. Text-to-image creation isn’t new, but Microsoft’s “drawing bot” focuses on captions as image descriptors and produces an image quality that Microsoft claims is three times better than other state-of-the-art technologies.

“The technology, which the researchers simply call the drawing bot, can generate images of everything from ordinary pastoral scenes, such as grazing livestock, to the absurd, such as a floating double-decker bus,” Microsoft states. “Each image contains details that are absent from the text descriptions, indicating that this artificial intelligence contains an artificial imagination.” 

Microsoft’s drawing bot merges two components of artificial intelligence: Natural-language processing and computer vision. The research project started with a bot that could generate text captions from photos. The researchers then advanced the project to answer human-generated questions about images, such as identifying a location, the object in focus, and so on. 

But actually drawing an image is a huge step. The bot can generate the components named in the text, but it must “imagine” all the missing pieces of the picture. If you tell the bot to draw a yellow bird with black wings, it has four descriptors to work with, and it must pull everything else from what it learned from previous drawings, photos, and more. In other words, it fills the gaps with knowledge obtained through machine learning.

Microsoft’s bot relies on a generative adversarial network (GAN). Just imagine two teams of computers: The first must render an image convincing enough to fool the second into believing it’s an actual photograph. The teams go back and forth, with the first claiming the image is real and the second saying “nuh-uh” and rejecting the claim. The goal is to render an image that finally fools the second team.

In this case, the first team renders an image derived from text-based descriptions, and the second team keeps rejecting its “authenticity” as an actual photograph until the first team produces an image convincing enough to pass. Microsoft first fed its GAN paired images and captions so that it could learn, for example, to draw a bird when given only the word “bird.”
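The back-and-forth between the two teams can be sketched with a pair of scores. In standard GAN terminology the first team is the generator and the second is the discriminator; the function names and example probabilities below are hypothetical and are not taken from Microsoft's code. This is a minimal sketch, assuming the discriminator outputs a probability that an image is a real photograph.

```python
import math

def discriminator_loss(p_real, p_fake):
    """Loss for the second team (discriminator).

    p_real: probability it assigns to a genuine photo being real.
    p_fake: probability it assigns to a generated image being real.
    Low when it accepts real photos and rejects generated ones.
    """
    return -(math.log(p_real) + math.log(1.0 - p_fake))

def generator_loss(p_fake):
    """Loss for the first team (generator).

    Low when the discriminator believes the generated image is real.
    """
    return -math.log(p_fake)

# Hypothetical scores: early in training the fake is easy to spot,
# later the discriminator is mostly fooled.
early = generator_loss(0.1)
late = generator_loss(0.9)
print(f"generator loss early: {early:.3f}, late: {late:.3f}")
```

Training alternates between lowering each loss in turn, which is the "nuh-uh" exchange described above: every improvement by one side raises the bar for the other.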

From there, Microsoft continued to build the knowledge base with paired images and captions consisting of multiple traits, such as black wings and a red belly. Microsoft says it isn’t using just any GAN, however, but one that targets tiny details so the bot can produce photo-realistic results. Microsoft calls it an attentional GAN, or AttnGAN.

“As humans draw, we repeatedly refer to the text and pay close attention to the words that describe the region of the image we are drawing,” the company says. “[AttnGAN] does this by breaking up the input text into individual words and matching those words to specific regions of the image.” 
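The word-to-region matching in that quote can be illustrated with a toy attention calculation. This is only a sketch under stated assumptions: the three-number word vectors, the region vector, and the `attend` helper are invented for illustration, and the scoring used here (dot product followed by softmax) is a common attention recipe rather than a confirmed detail of AttnGAN.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(region, words):
    """Weight each word by how well it matches one image region."""
    scores = [sum(r * w for r, w in zip(region, vec)) for vec in words.values()]
    return dict(zip(words.keys(), softmax(scores)))

# Hypothetical embeddings for the caption "yellow bird black wings".
words = {
    "yellow": [1.0, 0.0, 0.0],
    "bird":   [0.0, 1.0, 0.0],
    "black":  [0.0, 0.0, 1.0],
    "wings":  [0.0, 0.5, 1.0],
}
# Hypothetical feature vector for the image region where a wing is drawn.
wing_region = [0.0, 0.2, 2.0]

weights = attend(wing_region, words)
for word, weight in weights.items():
    print(f"{word}: {weight:.2f}")
```

With these made-up numbers, the wing region puts most of its weight on "wings" and "black" and very little on "yellow", which is the kind of focus the researchers describe.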

You can read Microsoft’s research paper describing its AttnGAN here. 

Kevin Parrish
Former Digital Trends Contributor
Kevin started taking PCs apart in the 90s when Quake was on the way and his PC lacked the required components. Since then…