Rosetta – How Facebook uses Machine Learning to Process Text in Billions of Images

Pranav Dar Last Updated : 07 May, 2019

Overview

Rosetta is Facebook’s own large-scale machine learning system to deal with text in images
The system extracts text from more than a billion Facebook and Instagram images each day in real time, and recognises multiple languages
The text extraction works in two steps: detection and recognition

Introduction

Facebook processes a ludicrous number of images per day. Even with the recent controversies it has had to face, the number of people using the platform has not diminished a whole lot. And the uploading and sharing of photos continues unabated.

Now Facebook faces a challenge every day. Quite a lot of these images have text in them (could be a meme, quote, street sign, menu, business card, etc.). How can the big tech giant make use of this text? How can they extract it and use it to improve the user experience?

Given the sheer number of images Facebook has to process, traditional optical character recognition (OCR) software won’t cut it. OCR might be able to recognize the characters, but it won’t understand the context.

Step up Rosetta, Facebook’s own large-scale machine learning system.

Rosetta extracts text from more than a billion public Facebook and Instagram images (and even videos) on a daily basis. The text isn’t limited to English: Rosetta is able to recognize multiple languages in real time. The extracted text is then fed to a text recognition model that has been trained on classifiers with the singular aim of understanding the context of the text in each image.

Text extraction is performed in two steps, independent of each other:

Detection: The system detects rectangular regions that could potentially contain text. Facebook’s approach is based on Faster R-CNN, a state-of-the-art object detection framework
Recognition: Once the regions have been detected, a CNN recognizes and transcribes the word present in each region
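
To make the two-step flow concrete, here is a minimal Python sketch of a detect-then-recognize pipeline. It is not Rosetta's code: the names Box, crop, extract_text, toy_detect, toy_recognize and make_image are all hypothetical, and the two toy functions are simple stand-ins for the Faster R-CNN detector and the CNN recognizer described above. The toy "image" is a grid of character codes rather than real pixels, so the example runs without any ML library. What it does show is how the two stages stay independent: the detector only returns boxes, and the recognizer only sees one cropped region at a time.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Toy stand-in for an image: rows of integers (0 = background).
Image = List[List[int]]


@dataclass(frozen=True)
class Box:
    """Rectangular region; right and bottom are exclusive."""
    left: int
    top: int
    right: int
    bottom: int
    score: float  # detector confidence that the region contains text


def crop(image: Image, box: Box) -> Image:
    """Cut the rectangular region out of the image."""
    return [row[box.left:box.right] for row in image[box.top:box.bottom]]


def extract_text(
    image: Image,
    detect: Callable[[Image], List[Box]],
    recognize: Callable[[Image], str],
    min_score: float = 0.5,  # assumed threshold, not a value from the article
) -> List[Tuple[Box, str]]:
    """Step 1: detect candidate regions. Step 2: transcribe each one."""
    results = []
    for box in detect(image):
        if box.score < min_score:
            continue  # skip low-confidence regions
        results.append((box, recognize(crop(image, box))))
    return results


def toy_detect(image: Image) -> List[Box]:
    """Stand-in detector: every horizontal run of non-zero cells is a box."""
    boxes = []
    for y, row in enumerate(image):
        x = 0
        while x < len(row):
            if row[x] != 0:
                start = x
                while x < len(row) and row[x] != 0:
                    x += 1
                boxes.append(Box(start, y, x, y + 1, score=1.0))
            else:
                x += 1
    return boxes


def toy_recognize(region: Image) -> str:
    """Stand-in recognizer: read the character codes back as text."""
    return "".join(chr(value) for row in region for value in row)


def make_image(rows: List[str]) -> Image:
    """Build a toy image where each non-space character is one 'pixel'."""
    return [[0 if ch == " " else ord(ch) for ch in row] for row in rows]


image = make_image(["  SALE    ", "          ", " 50 OFF   "])
words = [text for _, text in extract_text(image, toy_detect, toy_recognize)]
print(words)
```

Because extract_text only depends on the two callables, either stage could be swapped for a real model without touching the other, which is the practical benefit of keeping detection and recognition separate.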

The below image is a nice illustration of Rosetta’s architecture:

I strongly recommend reading the entire blog post on Facebook’s Code site. It is a marvellous explanation of how Rosetta works, and especially of how the detection and recognition models were designed from scratch. Alternatively, you can watch the video below from KDD2018, which summarises the inner workings of Rosetta in under two and a half minutes:

Our take on this

It’s always a pleasure to read Facebook’s and Google’s AI research posts. There’s so much knowledge to be gained from each breakthrough or service they write about. Most of us in the data science domain must have wondered how a behemoth like Facebook uses machine learning in real-world cases (apart from their news feed, of course), and bit by bit, the curtains are being drawn back.

If you’re an NLP enthusiast, text detection using the Faster R-CNN approach sounds pretty intriguing, doesn’t it? Rosetta is already being heavily used by Facebook and Instagram. There’s a lot more work to be done since text comes in all forms and structures, and Facebook’s research team is just getting started.

Pranav Dar

Senior Editor at Analytics Vidhya. Data visualization practitioner who loves reading and delving deeper into the data science and machine learning arts. Always looking for new ways to improve processes using ML and AI.
