Microsoft’s Language Translation AI has Reached Human Levels of Accuracy

Pranav Dar Last Updated : 17 Mar, 2018


Overview

Microsoft’s system can translate sentences in Chinese to English with human accuracy
The model was evaluated on around 2,000 sentences from a test set of news stories
Two methods, Dual Learning and Deliberation Networks, were used to improve the accuracy and quality
Yet to be tested on real-time news so expectations should be tempered

Introduction

Even with the advances in the Natural Language Processing field, there have always been nagging doubts about the quality and accuracy of translations from one language to another. Take Google’s translation service, for example. While it has steadily improved over the years, it still makes grammatical errors in complex sentences.


To bridge that gap, Microsoft claims it has developed a system that can translate from Chinese to English with the quality and accuracy of humans. The researchers behind this system evaluated it on newstest2017, a commonly used test set of news stories.

To ensure that the results of the translations were precise, Microsoft hired external bilingual evaluators to compare the machine’s translations with two independently produced human translations.

The researchers used two methods to develop the AI:

Dual Learning: Each time they ran a sentence through the system to translate it from Chinese to English, the team also translated it back from English to Chinese. This allowed the system to train and learn from its own mistakes.
Deliberation Networks: The system was taught to repeat the process of translating the same sentence again and again, refining and improving the responses each time.
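The round-trip idea behind Dual Learning can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the lookup tables stand in for real translation models, and the names translate and round_trip_score are hypothetical, not taken from Microsoft's system. The sketch only shows how translating forward and then back yields a signal about mistakes.

```python
# Toy illustration of the round-trip signal used in dual learning.
# The lookup tables are hypothetical stand-ins for real translation models.

ZH_TO_EN = {"我": "I", "喜欢": "like", "猫": "cats"}
# Deliberate error for the demo: "cats" is mapped back to "狗" (dog).
EN_TO_ZH = {"I": "我", "like": "喜欢", "cats": "狗"}


def translate(tokens, table):
    """Word-by-word lookup; unknown tokens pass through unchanged."""
    return [table.get(tok, tok) for tok in tokens]


def round_trip_score(source_tokens, forward, backward):
    """Fraction of source tokens recovered after source -> target -> source."""
    if not source_tokens:
        return 0.0
    target = translate(source_tokens, forward)
    reconstructed = translate(target, backward)
    matches = sum(s == r for s, r in zip(source_tokens, reconstructed))
    return matches / len(source_tokens)


source = ["我", "喜欢", "猫"]
score = round_trip_score(source, ZH_TO_EN, EN_TO_ZH)
print(f"round-trip score: {score:.2f}")  # 2 of 3 tokens survive the round trip
```

In a real system the score would come from model probabilities rather than exact token matches, and it would feed back into training both translation directions.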

Two new techniques were also developed during the training phase to further improve the accuracy of the model.

Joint Training: This was used to iteratively boost both the Chinese-to-English and English-to-Chinese translations.
Agreement Regularization: According to Microsoft, “with this method, the translation can be generated by having the system read from left to right or from right to left. If these two translation techniques generate the same translation, the result is considered more trustworthy than if they don’t get the same results”.
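The agreement check described in the quote can be sketched as a simple comparison of two candidate translations. This follows the article's description only; it is not the training objective from the research paper. The function name agreement, the sample sentences, and the 0.9 threshold are all hypothetical choices for illustration. The similarity measure is Python's standard difflib.SequenceMatcher.

```python
from difflib import SequenceMatcher


def agreement(l2r_tokens, r2l_tokens):
    """Similarity in [0, 1] between a left-to-right and a right-to-left output."""
    return SequenceMatcher(None, l2r_tokens, r2l_tokens).ratio()


# Hypothetical outputs from two decoders reading in opposite directions.
l2r = ["the", "cat", "sat", "on", "the", "mat"]
r2l_same = ["the", "cat", "sat", "on", "the", "mat"]
r2l_diff = ["the", "cat", "is", "on", "a", "mat"]

THRESHOLD = 0.9  # arbitrary cut-off chosen for this example

for candidate in (r2l_same, r2l_diff):
    score = agreement(l2r, candidate)
    label = "more trustworthy" if score >= THRESHOLD else "less trustworthy"
    print(f"{score:.2f} -> {label}")
```

Identical outputs score 1.0, while outputs that diverge score lower and would be treated with less confidence.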

To understand the mathematics behind the system, you can view Microsoft’s official research paper here.

Our take on this

This is a major breakthrough in NLP, but caution is warranted at this stage. The research was conducted on a set of old news stories and, as of today, has not been tested on real-time news stories. Its applications could go beyond just translation.

If you’re interested in NLP, I encourage you to check out the research paper which lists how the team went about developing the deep neural network behind this system.

Subscribe to AVBytes here to get regular data science, machine learning and AI updates in your inbox!

Also, go ahead and participate in our Hackathons, including the DataHack Premier League and Lord of the Machines!

Pranav Dar

Senior Editor at Analytics Vidhya. Data visualization practitioner who loves reading and delving deeper into the data science and machine learning arts. Always looking for new ways to improve processes using ML and AI.
