Posted by Isaac Caswell & Bowen Liang, Software Engineers, Google Research

Advances in machine learning (ML) have driven improvements to automated translation, including the GNMT neural translation model introduced in Translate in 2016, that have enabled great improvements to the quality of translation for over 100 languages. Nevertheless, state-of-the-art systems lag significantly behind human performance in all but the most specific translation tasks. And while the research community has developed techniques that are successful for high-resource languages like Spanish and German, for which there exist copious amounts of training data, performance on low-resource languages, like Yoruba or Malayalam, still leaves much to be desired. Many techniques have demonstrated significant gains for low-resource languages in controlled research settings (e.g., the WMT Evaluation Campaign); however, these results on smaller, publicly available datasets may not easily transition to large, web-crawled datasets.


In this post, we share some recent progress we have made in translation quality for supported languages, especially for those that are low-resource, by synthesizing and expanding a variety of recent advances, and demonstrate how they can be applied at scale to noisy, web-mined data. These techniques span improvements to model architecture and training, improved treatment of noise in datasets, increased multilingual transfer learning through M4 modeling, and use of monolingual data. The quality improvements, which averaged +5 BLEU score over all 100+ languages, are visualized below.

[Figure] BLEU score of Google Translate models since shortly after its inception in 2006. The improvements since the implementation of the new techniques over the last year are highlighted at the end of the animation.

Advances for Both High- and Low-Resource Languages


Hybrid Model Architecture

Four years ago we introduced the RNN-based GNMT model, which yielded large quality improvements and enabled Translate to cover many more languages. Following our work decoupling different aspects of model performance, we have replaced the original GNMT system, instead training models with a transformer encoder and an RNN decoder, implemented in Lingvo (a TensorFlow framework). Transformer models have been demonstrated to be generally more effective at machine translation than RNN models, but our work suggested that most of these quality gains were from the transformer encoder, and that the transformer decoder was not significantly better than the RNN decoder. Since the RNN decoder is much faster at inference time, we applied a variety of optimizations before coupling it with the transformer encoder. The resulting hybrid models are higher-quality, more stable in training, and exhibit lower latency.
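
As a rough illustration of this layout, here is a minimal sketch in TensorFlow/Keras: a single self-attention encoder block feeding a recurrent decoder. The layer sizes, the choice of a GRU cell, and the one-block depth are simplifying assumptions for the sketch, not the production configuration.

```python
import tensorflow as tf

d_model, vocab_size = 512, 32000

# Transformer-style encoder block: self-attention plus feed-forward,
# each with a residual connection and layer normalization.
# (Positional encodings are omitted for brevity.)
src_tokens = tf.keras.Input(shape=(None,), dtype=tf.int32)
x = tf.keras.layers.Embedding(vocab_size, d_model)(src_tokens)
attn = tf.keras.layers.MultiHeadAttention(num_heads=8, key_dim=64)(x, x)
x = tf.keras.layers.LayerNormalization()(x + attn)
ff = tf.keras.layers.Dense(2048, activation="relu")(x)
encoded = tf.keras.layers.LayerNormalization()(x + tf.keras.layers.Dense(d_model)(ff))

# Recurrent decoder: each step carries its own state and attends to the
# fixed encoder output, which keeps per-step inference cost low.
tgt_tokens = tf.keras.Input(shape=(None,), dtype=tf.int32)
y = tf.keras.layers.Embedding(vocab_size, d_model)(tgt_tokens)
y = tf.keras.layers.GRU(d_model, return_sequences=True)(y)
context = tf.keras.layers.Attention()([y, encoded])  # decoder-to-encoder attention
logits = tf.keras.layers.Dense(vocab_size)(tf.concat([y, context], axis=-1))

hybrid_model = tf.keras.Model([src_tokens, tgt_tokens], logits)
```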


Web Crawl

Neural Machine Translation (NMT) models are trained using examples of translated sentences and documents, which are typically collected from the public web. Compared to phrase-based machine translation, NMT has been found to be more sensitive to data quality. As such, we replaced the previous data collection system with a new data miner that focuses more on precision than recall, which allows the collection of higher-quality training data from the public web. Additionally, we switched the web crawler from a dictionary-based model to an embedding-based model for 14 large language pairs, which increased the number of sentences collected by an average of 29 percent, without loss of precision.
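
A sketch of what precision-focused, embedding-based mining can look like. The encoder below is a stand-in (in practice this would be a multilingual sentence encoder mapping both languages into a shared vector space), and the threshold value is illustrative; the post does not describe the actual miner's internals.

```python
import numpy as np

def embed(sentences):
    # Stand-in encoder, used only to keep the sketch runnable: real mining
    # would use a multilingual sentence encoder shared across languages.
    rng = np.random.default_rng(0)
    return rng.normal(size=(len(sentences), 512))

def mine_pairs(src_sents, tgt_sents, threshold=0.8):
    """Keep only candidate pairs whose cosine similarity clears a high bar.

    A high threshold favors precision over recall: fewer pairs survive,
    but the survivors are more likely to be true translations.
    """
    src_vecs = embed(src_sents)
    tgt_vecs = embed(tgt_sents)
    # Normalize so the dot product below is cosine similarity.
    src_vecs /= np.linalg.norm(src_vecs, axis=1, keepdims=True)
    tgt_vecs /= np.linalg.norm(tgt_vecs, axis=1, keepdims=True)
    sims = src_vecs @ tgt_vecs.T
    return [(src_sents[i], tgt_sents[j], sims[i, j])
            for i, j in zip(*np.where(sims > threshold))]
```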


Modeling Data Noise

Data with significant noise is not only redundant but also lowers the quality of models trained on it. In order to address data noise, we used our results on denoising NMT training to assign a score to every training example using preliminary models trained on noisy data and fine-tuned on clean data. We then treat training as a curriculum learning problem: the models start out training on all data, and then gradually train on smaller and cleaner subsets.
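
A sketch of this curriculum, under the assumption that the two preliminary models expose per-example likelihood scoring and a training interface; the `log_prob` and `fit` methods below are hypothetical names, not a real library API.

```python
# Assumed inputs: `noisy_model` (trained on all noisy data) and
# `clean_model` (the same model fine-tuned on clean data).

def noise_score(example, noisy_model, clean_model):
    # An example looks clean if the clean-tuned model assigns it a higher
    # likelihood than the noisy model does. (Hypothetical interface.)
    return clean_model.log_prob(example) - noisy_model.log_prob(example)

def curriculum_train(model, data, noisy_model, clean_model,
                     keep_fractions=(1.0, 0.5, 0.25)):
    # Rank examples from cleanest to noisiest, then train first on all
    # data and gradually restrict training to smaller, cleaner subsets.
    ranked = sorted(data,
                    key=lambda ex: noise_score(ex, noisy_model, clean_model),
                    reverse=True)
    for fraction in keep_fractions:
        subset = ranked[:max(1, int(fraction * len(ranked)))]
        model.fit(subset)  # hypothetical training interface
    return model
```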


Back-Translation

Widely adopted in state-of-the-art machine translation systems, back-translation is especially helpful for low-resource languages, where parallel data is scarce. This technique augments parallel training data (where each sentence in one language is paired with its translation) with synthetic parallel data, where the sentences in one language are written by a human, but their translations have been generated by a neural translation model. By incorporating back-translation into Google Translate, we can make use of the more abundant monolingual text data for low-resource languages on the web for training our models. This is especially helpful in increasing fluency of model output, which is an area in which low-resource translation models underperform.
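
A sketch of the data-augmentation step. The reverse model (trained in the target-to-source direction) and its `translate` method are hypothetical stand-ins for whatever reverse system is available.

```python
def back_translate(monolingual_target_sentences, reverse_model):
    # The target side stays human-written; only the source side is
    # machine-generated. This is why back-translation helps fluency:
    # the model learns to produce authentic target-language text.
    return [(reverse_model.translate(tgt), tgt)  # hypothetical interface
            for tgt in monolingual_target_sentences]

# Real and synthetic pairs are then mixed for training, e.g., for an
# English-to-Yoruba model (illustrative names):
# training_data = real_pairs + back_translate(monolingual_yoruba, yo_to_en_model)
```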


M4 Modeling

A technique that has been especially helpful for low-resource languages has been M4, which uses a single, giant model to translate between all languages and English. This allows for transfer learning at a massive scale. As an example, a lower-resource language like Yiddish has the benefit of co-training with a wide array of other related Germanic languages (e.g., German, Dutch, Danish, etc.), as well as almost a hundred other languages that may not share a known linguistic connection, but may provide useful signal to the model.
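
A sketch of the standard mechanism that makes a single model multilingual, following the target-language-token approach of Johnson et al. (2017): a token naming the desired output language is prepended to the source sentence, and examples from all language pairs are mixed into one training set. The exact token spelling below is illustrative.

```python
def make_multilingual_example(src_sentence, tgt_sentence, tgt_lang):
    # One shared model is trained on examples from every language pair;
    # the tag tells it which language to produce, so related languages
    # (e.g., Yiddish and German) share parameters and transfer to each other.
    return (f"<2{tgt_lang}> {src_sentence}", tgt_sentence)

# Examples from many language pairs go into a single training set:
training_set = [
    make_multilingual_example("Wie geht es dir?", "How are you?", "en"),
    make_multilingual_example("ווי גייט עס?", "How are you?", "en"),
]
```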


Judging Translation Quality

A popular metric for automatic quality evaluation of machine translation systems is the BLEU score, which is based on the similarity between a system’s translation and reference translations that were generated by people. With these latest updates, we see an average BLEU gain of +5 points over the previous GNMT models, with the 50 lowest-resource languages seeing an average gain of +7 BLEU. This improvement is comparable to the gain observed four years ago when transitioning from phrase-based translation to NMT.
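
As a concrete illustration, BLEU can be computed with the open-source sacrebleu package; this is shown for illustration only and is not necessarily the evaluation pipeline used for these models.

```python
import sacrebleu

hypotheses = ["the cat sat on the mat"]          # system outputs
references = [["the cat is sitting on the mat"]] # one set of references

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU: {bleu.score:.1f}")  # corpus-level score on a 0-100 scale
```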


Although BLEU score is a well-known approximate measure, it is known to have various pitfalls for systems that are already high-quality. For instance, several works have demonstrated how the BLEU score can be biased by translationese effects on the source side or target side, a phenomenon where translated text can sound awkward, containing attributes (like word order) from the source language. For this reason, we performed human side-by-side evaluations on all new models, which confirmed the gains in BLEU.

In addition to general quality improvements, the new models show increased robustness to machine translation hallucination, a phenomenon in which models produce strange “translations” when given nonsense input. This is a common problem for models that have been trained on small amounts of data, and affects many low-resource languages. For example, when given the string of Telugu characters “ష ష ష ష ష ష ష ష ష ష ష ష ష ష ష”, the old model produced the nonsensical output “Shenzhen Shenzhen Shaw International Airport (SSH)”, seemingly trying to make sense of the sounds, whereas the new model correctly learns to transliterate this as “Sh sh sh sh sh sh sh sh sh sh sh sh sh sh sh sh sh”.


Conclusion

Although these are impressive strides forward for a machine, one must remember that, especially for low-resource languages, automatic translation quality is far from perfect. These models still fall prey to typical machine translation errors, including poor performance on particular genres of subject matter (“domains”), conflating different dialects of a language, producing overly literal translations, and poor performance on informal and spoken language.

Nonetheless, with this update, we are proud to provide automatic translations that are relatively coherent, even for the lowest-resource of the 108 supported languages. We are grateful for the research that has enabled this from the active community of machine translation researchers in academia and industry.


Acknowledgements

This effort is built on contributions from Tao Yu, Ali Dabirmoghaddam, Klaus Macherey, Pidong Wang, Ye Tian, Jeff Klingner, Jumpei Takeuchi, Yuichiro Sawai, Hideto Kazawa, Apu Shah, Manisha Jain, Keith Stevens, Fangxiaoyu Feng, Chao Tian, John Richardson, Rajat Tibrewal, Orhan Firat, Mia Chen, Ankur Bapna, Naveen Arivazhagan, Dmitry Lepikhin, Wei Wang, Wolfgang Macherey, Katrin Tomanek, Qin Gao, Mengmeng Niu, and Macduff Hughes.