Microsoft’s New AI Bot can Draw Images Based on Captions

Pranav Dar Last Updated : 20 Jan, 2018


Microsoft has built an AI-powered bot that can draw images based on the text it is given. The image below, published by Microsoft, depicts a yellow and black bird that was generated entirely by the bot.

Source: Microsoft

Microsoft is simply calling this new technology the “drawing bot” for now. It can generate images of anything from animals to scenic hillsides, and even outlandish things like flying cars and twisted street lamps. It’s basically the AI version of Pictionary, where you’re supposed to draw something based on cue cards. The only difference is that you type something for the bot, and it runs its algorithm and gives you the image.

The most exciting part about the technology is that the images generated might not depict real things at all. The bird in the image above? It might not even exist; it is just a rendering of the machine’s imagination of what a bird looks like. Further, each image that is created contains details that are not provided in the text description.

As for where this bot will be used once it’s made available, Microsoft sees it being used by painters and interior decorators. It could also be used as a voice-activated tool for creating or refining photos (maybe there’s a role for Cortana in there).

To make the AI understand which words go with which pictures, the drawing bot was trained on pairs of images and captions. The algorithm is a Generative Adversarial Network (GAN), which is divided into two parts:

Generator – this generates images based on the text
Discriminator – this judges the quality of the generated image
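
The interplay between the part that generates images and the discriminator can be sketched in a few lines of Python. This is only a toy illustration of the general GAN idea, not Microsoft's model: the networks are single untrained linear layers, the "image" is a list of six numbers, and every name here (embed_caption, Generator, Discriminator, adversarial_losses) and every dimension is an assumption made for the example.

```python
import math
import random

EMBED_DIM = 4   # size of the toy caption vector (assumption)
NOISE_DIM = 2   # size of the random noise fed to the generator (assumption)
IMAGE_DIM = 6   # a flat "image" of six pixel values (assumption)


def embed_caption(caption, dim=EMBED_DIM):
    """Hypothetical stand-in for a text encoder: bucket words into a vector."""
    vec = [0.0] * dim
    for word in caption.lower().split():
        vec[sum(ord(c) for c in word) % dim] += 1.0
    return vec


def linear(weights, inputs):
    """Multiply a weight matrix (list of rows) by an input vector."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class Generator:
    """Maps a caption vector plus noise to pixel values in (0, 1)."""

    def __init__(self, rng):
        self.weights = random_matrix(IMAGE_DIM, EMBED_DIM + NOISE_DIM, rng)

    def generate(self, text_vec, noise):
        return [sigmoid(v) for v in linear(self.weights, text_vec + noise)]


class Discriminator:
    """Scores how plausible an (image, caption) pair looks, in (0, 1)."""

    def __init__(self, rng):
        self.weights = random_matrix(1, IMAGE_DIM + EMBED_DIM, rng)

    def score(self, image, text_vec):
        return sigmoid(linear(self.weights, image + text_vec)[0])


def adversarial_losses(disc, real_image, fake_image, text_vec):
    """Standard GAN losses: the two parts pull in opposite directions."""
    eps = 1e-9
    d_real = disc.score(real_image, text_vec)
    d_fake = disc.score(fake_image, text_vec)
    # The discriminator wants real pairs scored high and generated pairs low.
    d_loss = -math.log(d_real + eps) - math.log(1.0 - d_fake + eps)
    # The generator wants its output scored high by the discriminator.
    g_loss = -math.log(d_fake + eps)
    return d_loss, g_loss


rng = random.Random(0)
gen, disc = Generator(rng), Discriminator(rng)

text_vec = embed_caption("a yellow and black bird")
noise = [rng.gauss(0.0, 1.0) for _ in range(NOISE_DIM)]
fake_image = gen.generate(text_vec, noise)
real_image = [0.9, 0.8, 0.1, 0.1, 0.7, 0.6]  # made-up pixels for a "real" photo

d_loss, g_loss = adversarial_losses(disc, real_image, fake_image, text_vec)
print("discriminator loss:", round(d_loss, 3))
print("generator loss:", round(g_loss, 3))
```

In a real system both parts would be deep networks, and training would alternate between lowering the discriminator loss and lowering the generator loss on many image and caption pairs. The sketch only computes the two losses once to show what each side is optimizing.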

Microsoft has previously released CaptionBot, which takes images as input and writes captions for them. It followed this up with the Seeing AI tool, which also takes images as input and describes what’s in them. That tool is aimed especially at blind and low-vision users.

Our take on this

While Google launched a similar AI last year that could create doodles, Microsoft’s version is in a different league altogether. It’s not perfect yet, but one can imagine the future uses for such technology. The principal researcher on the project, Xiaodong He, thinks it might even be used to create animated movies (using pre-written scripts). Following Google’s AutoML Vision launch yesterday, 2018 is already promising to be a big year in the image recognition field.

Pranav Dar

Senior Editor at Analytics Vidhya. Data visualization practitioner who loves reading and delving deeper into the data science and machine learning arts. Always looking for new ways to improve processes using ML and AI.
