Creating Effective Termbase Glossaries for Machine Translation

Need to create a termbase glossary for machine translation? A glossary can noticeably improve translation quality, especially when your software inserts your company’s approved terms directly into your machine translations.
In other words, creating a termbase glossary for machine translation is well worth your time. However, there are some key tips and notes to keep in mind during translation glossary development if you want it to properly serve its purpose.
That’s why we compiled this list of 5 tips for building termbase glossaries for machine translation.
5 Tips for Creating Termbase Glossaries for Machine Translation
1. Machine translation glossaries are different from human translation glossaries
If your company is new to using machine translation within a translation management system, this is important to know.
Until the past several years, companies developed glossaries for human translators to use alongside translation memory software. Now that artificial intelligence is continuously improving machine translation, we have to rethink how we build those glossaries, because machine translation requires a different strategy.
Now that you know this key fact, we’ll continue on with more specific glossary development tips.
2. Be precise when creating translation glossaries for machine translation
In machine translation glossary development, terms and their translations need to be exact matches. In other words, they can’t be ambiguous or have more than one meaning.
You can’t include more than one translation for one term, because the machine translation engine won’t be able to discern which translation to use. Only include terms in the glossary that have one meaning, to be used one way.
Example: “assembly”
“Assembly” sometimes refers to an action (putting components together) and sometimes to an object (a unit composed of multiple parts). It can also refer to a gathering of people in one place. Because this one term carries several meanings, each of which may need a different translation, it is a poor fit for a single glossary entry.
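The one-translation-per-term rule can be checked automatically before you upload a glossary. The following is a minimal sketch, assuming a two-column CSV with a header row of language codes. The sample data and the function name `find_ambiguous_terms` are hypothetical and not part of any particular product.

```python
import csv
import io
from collections import defaultdict

# Hypothetical two-column glossary (source term, target term) for illustration only.
SAMPLE_CSV = """en,es
assembly,ensamblaje
assembly,asamblea
torque wrench,llave dinamométrica
"""


def find_ambiguous_terms(csv_text):
    """Return {source term: sorted translations} for terms with more than one translation."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row (assumed to hold language codes)
    translations = defaultdict(set)
    for row in reader:
        if len(row) < 2:
            continue  # ignore blank or malformed rows
        source, target = row[0].strip(), row[1].strip()
        if source and target:
            translations[source].add(target)
    return {
        term: sorted(targets)
        for term, targets in translations.items()
        if len(targets) > 1
    }


print(find_ambiguous_terms(SAMPLE_CSV))
```

Any term this check reports is a candidate for removal, or for replacement with a more specific multi-word term that has only one meaning.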
3. Include various word shapes in your glossary
While the following logic doesn’t currently apply to building glossaries for all translation engines, it applies to building one for Amazon’s machine translation engine. Amazon allows you to use glossaries to create custom translations.
Put simply, “word shape” refers to the different forms a term can take: singular vs. plural, inflected forms such as verb tenses, and initial cap vs. lowercase vs. all caps.
Example: Eats, eat, eaten and ate
When developing a machine translation glossary, you’ll need to be mindful of word shapes. The more word shapes you include in your termbase, the more matches you will get.
To produce custom translations with Amazon’s translation engine, you’ll need to include all the different word shapes for each term. For Amazon’s engine to replace a term in your text, the term needs to exist in that exact word shape in your glossary.
For example, if the term in the glossary is initial caps (“Eat”), but the term in the text is lowercase (“eat”), it won’t pick it up. See tip #5 for more on the subject of replacing terms.
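Case variants are the easiest word shapes to generate automatically. Here is a minimal sketch, assuming that matching is exact on word shape as described above. The function name `expand_case_shapes` is hypothetical, and applying the same casing to the target term is an assumption that will not suit every language or brand name.

```python
# Case transformations to apply to each glossary pair.
SHAPES = (
    str.lower,                                # "eat"
    lambda s: s[:1].upper() + s[1:].lower(),  # "Eat" (capitalizes the first letter only)
    str.upper,                                # "EAT"
)


def expand_case_shapes(source, target):
    """Return (source, target) rows in lowercase, initial-cap and all-caps shapes.

    Duplicate rows are dropped, and the same casing is applied to both sides
    (an assumption; review the target side for languages with their own
    capitalization rules).
    """
    rows = []
    for shape in SHAPES:
        row = (shape(source), shape(target))
        if row not in rows:
            rows.append(row)
    return rows


for row in expand_case_shapes("eat", "comer"):
    print(row)
```

Plurals and inflected forms such as “eats”, “eaten” and “ate” depend on the language, so in this sketch they still need to be added by hand as separate entries.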
4. Understand the difference between glossaries and translation memories
What is a termbase / glossary?
A termbase is a bilingual (source and target) repository in CSV or TBX file format. It contains specific terms and terminology that regularly appear in a company’s content. Termbases typically include product names, company names, brand names, acronyms and other repetitive terminology.
Termbases are also known as glossaries. Once your glossary has been imported into a translation management system, you can look up terms while you are editing and reviewing translations. Some systems will automatically replace terms in the translation for you.
What is a translation memory?
A translation memory is a bilingual (source and target language) repository in TMX file format. It consists of text that has been translated from one language into another, and it is stored on a server for future reuse. Translation memories can deliver nearly instant quality improvements and reduce translation costs and effort.
What’s the difference between translation memory and a glossary?
A translation memory consists of sentences, phrases and segments, whereas a glossary consists of keywords, names and acronyms.
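The difference can be pictured as two lookups of different granularity. The sketch below is a simplified illustration with hypothetical data and function names. It does not reflect the TMX or TBX file formats or any specific product’s behavior.

```python
# Hypothetical data: a glossary maps terms, a translation memory maps whole segments.
glossary = {
    "torque wrench": "llave dinamométrica",
}
translation_memory = {
    "Tighten the bolt with a torque wrench.":
        "Apriete el perno con una llave dinamométrica.",
}


def lookup_segment(segment):
    """Return the stored translation for an entire segment, or None if it is new."""
    return translation_memory.get(segment)


def glossary_terms_in(segment):
    """Return the glossary entries whose source term appears in the segment."""
    return {src: tgt for src, tgt in glossary.items() if src in segment}


new_segment = "Loosen the bolt with a torque wrench."
print(lookup_segment(new_segment))     # no stored segment matches
print(glossary_terms_in(new_segment))  # the term is still recognized
```

In this sketch, a sentence that has never been translated gets no help from the translation memory, while the glossary still supplies the approved translation for the term it contains.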
5. You can now use termbases for more than term lookups
Translation management systems are beginning to enable users to leverage their termbase glossaries for more than just looking up terms.
For example, Pairaphrase now allows users to also insert glossary terms into the text during a term lookup. This enhances productivity within the human translation workflow while improving the quality and consistency of translations. So when you create a translation glossary, take some extra time to include as many terms as possible.
This upfront investment will significantly reduce translation-related costs down the line.
According to Pairaphrase CTO Rick Woyde in an interview with Enterprise Viewpoint, “the next wave of AI-led development will include the ability to use terminology glossaries to interactively improve machine translations and create custom translations.” This is something to keep in mind as you build your glossary, which your company will use for years to come.
Learn more about what’s next in machine translation.
Get More Out Of Your Translation Glossary File with Pairaphrase
Want a translation management system that can insert glossary terms into your text during a term lookup? Pairaphrase delivers a suite of easy-to-use (yet powerful) translation tools that are secure and web-based, all while enhancing translation productivity.
Schedule a Demo or Request a Free Trial
