Performing Unsupervised Machine Translation to and from Rare Languages

In Deep Learning, applications are constrained by computing resources (GPUs and CPUs), datasets, and algorithms. These problems are even more pronounced in the language translation domain, because parallel corpora between common and rare languages are often unavailable. In this Hack Session, we will look at the current state of the art in Unsupervised Language Translation, which does not require a parallel corpus of translations and builds on earlier work in Machine Translation, Statistical Machine Translation, and Unsupervised Embeddings to achieve its results.

We will also see how the performance of Unsupervised Machine Translation can be significantly increased, and brought close to that of current state-of-the-art Supervised Machine Translation, with only a few thousand parallel translations.

Structure of the Hack Session

Then and Now – History of Machine Translation from SMT to NMT
Types of Machine Translation – Supervised and Unsupervised
Brief on GNMT – State-of-the-Art on Supervised MT
Detailed discussion on how Unsupervised Machine Translation works
Brief on Evaluation Metrics for Machine Translation
Implementing Unsupervised MT for Language Translation without a Parallel Corpus
Implementing Unsupervised MT for Language Translation with a small Parallel Corpus
Conclusion – The next possible steps for research in Unsupervised MT

Tools Used

Python 3
PyTorch

HACKERS

Neeraj Singh Sarwan

He is a perpetual, quick learner who is keen to explore the realm of data analytics and data science. He is deeply excited about the times we live in and the rate at which data is being generated and transformed into an asset. He is well versed in a few tools for dealing with data and is in the process of learning other tools and the knowledge required to exploit data.

Duration of Hack-Session: 1 hour

