What does Google’s new NMT training functionality mean for the translation and localization industry?
In a blog post from 24 July 2018, Google announced the public availability of the beta version of AutoML Translate, a new addition to its cloud-based automated machine learning offering. With this offering, customers can provide their own content to train custom translation models using Google Translate’s Neural Machine Translation (NMT) system.
What does this mean for the translation and localization industry? Is it, as Google claims in its post, another step in the “democratization” of machine learning, which more industries can now harness? Or is this “Goodbye Language Industry,” as declared by Slator in a recent email newsletter?
Let’s start with a reality check.
Google’s release of AutoML Translate is without a doubt a substantial development in the industry. Google Translate is the most-used MT system on the planet, and any functionality Google adds to it has repercussions for the language services industry. Until now, Google had not allowed customers to create customized versions of Google Translate. But Google certainly isn’t the first to offer cloud-based training of MT systems. Microsoft has offered cloud-based training of Statistical Machine Translation systems for more than six years and in May announced similar functionality for its NMT platform. And it’s safe to expect that Amazon will eventually make a similar announcement. All three players have substantial cloud computing business units, have made substantial investments in the development of Neural Machine Translation, and have open-sourced their Deep Learning/NMT toolkits.
That doesn’t sound like “Goodbye Language Industry” to me. To me, this development actually means one thing: diversity of choice.
If you want to create your own Neural MT systems, you now have another great choice. You can still run any one of the open-source or commercially available NMT toolkits on your own hardware and your own network, or you can choose to leave the machine learning expertise and substantial hardware requirements to others and use one of several cloud services providers that sell the ability to both train and host MT systems. And if you choose the second path, you can now choose Google as a provider.
It’s up to you to decide which of these offerings best suits your needs and provides the best value to you or your organization. At Lionbridge, our customers benefit from our best-of-breed MT strategy, which allows us to choose the most appropriate MT system on a highly granular level: not just for a given customer, but for specific projects and language directions. We draw on more than 16 years of experience creating bespoke MT systems that support high-quality translation at scale, and that experience informs our decisions and best practices for applying MT. And we’ll continue to do so. This announcement just means more options for us and for our customers.
One final thought: it’s interesting to note that training a Google NMT system costs $76 per hour after the first two hours of computing time, and the per-character cost of running translations is $80 per million characters. That is four times the per-character price of translation with Google’s general translation API. Compared to buying your own high-end GPU servers, that doesn’t sound too bad. But if you run hundreds and hundreds of trained MT models and translate tens of millions of words per month, then you should sharpen your pencil and do the math to decide whether this is a worthwhile investment when compared to Google’s baseline systems and other offerings.
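To make that pencil-sharpening concrete, here is a minimal sketch of the arithmetic in Python. The training and translation rates are the ones quoted above, and the general API rate is simply derived from the "four times" comparison. Everything else is an assumption made for illustration: the function names are hypothetical, the figure of six characters per word is a rough guess rather than a Google number, and the two free hours are assumed to apply to each training run.

```python
# Prices quoted in the article (USD, at the time of the beta announcement).
TRAINING_RATE_PER_HOUR = 76.0
FREE_TRAINING_HOURS = 2.0  # assumption: the free hours apply to each training run
CUSTOM_RATE_PER_MILLION_CHARS = 80.0
# Derived from the article's "four times" comparison: 80 / 4 = 20.
GENERAL_RATE_PER_MILLION_CHARS = CUSTOM_RATE_PER_MILLION_CHARS / 4

# Assumption (not from the article): average characters per word, space included.
ASSUMED_CHARS_PER_WORD = 6.0


def training_cost(hours: float) -> float:
    """Cost of one training run; only hours beyond the free allowance are billed."""
    billable_hours = max(0.0, hours - FREE_TRAINING_HOURS)
    return billable_hours * TRAINING_RATE_PER_HOUR


def translation_cost(words: float, rate_per_million_chars: float) -> float:
    """Cost of translating `words` words at a per-million-character rate."""
    chars = words * ASSUMED_CHARS_PER_WORD
    return chars / 1_000_000 * rate_per_million_chars


def compare(words_per_month: float, training_runs: int, hours_per_run: float) -> dict:
    """Compare one month on custom models with the same volume on the general API."""
    custom = (training_runs * training_cost(hours_per_run)
              + translation_cost(words_per_month, CUSTOM_RATE_PER_MILLION_CHARS))
    baseline = translation_cost(words_per_month, GENERAL_RATE_PER_MILLION_CHARS)
    return {"custom": custom, "baseline": baseline, "difference": custom - baseline}


if __name__ == "__main__":
    # Hypothetical workload: 10 million words, 3 training runs of 5 hours each.
    print(compare(10_000_000, training_runs=3, hours_per_run=5))
```

Swap in your own volumes, training hours, and characters-per-word figure. The point is only that the translation charge scales with volume while the training charge scales with the number of models, so the balance shifts quickly as either one grows.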
So, do I think that having trainable Google NMT in my toolbox spells the end of the language industry? Nope, as a matter of fact—just the opposite. This is not a goodbye; it’s a hello.
Jay Marciano is the Director of Machine Translation at Lionbridge. He has 20 years of experience in the MT space and has been with Lionbridge, where he leads a globally distributed team creating best-of-breed automated translation solutions, since 2010.