
How to localize text

This is the most obvious and visible of all localization activities. Because textual content keeps evolving, it also incurs an ongoing translation cost, so it pays to get it right from the start.

1. Make sure you have the basics covered

Do you use Unicode throughout your front-end, service communication, back-end, and database code? Is the locale available (in the URL, in a request context, or by some other means) wherever locale-dependent logic needs to run? Did you set the lang attribute in your HTML?
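As a minimal sketch of the last two questions, the snippet below reads the locale from the first URL path segment and uses it for the lang attribute. The list of supported locales, the default locale, and the function names are assumptions made for illustration, not part of any particular framework.

```typescript
// Hypothetical list of locales the app supports; adjust to your product.
const SUPPORTED_LOCALES = ["en", "fr", "de", "ja"] as const;
type Locale = (typeof SUPPORTED_LOCALES)[number];

// Assumed fallback when the URL carries no recognizable locale.
const DEFAULT_LOCALE: Locale = "en";

// Read the locale from the first path segment, e.g. "/fr/pricing" -> "fr".
function localeFromPath(pathname: string): Locale {
  const firstSegment = pathname.split("/").filter((s) => s.length > 0)[0] ?? "";
  const candidate = firstSegment.toLowerCase();
  const match = SUPPORTED_LOCALES.find((l) => l === candidate);
  return match ?? DEFAULT_LOCALE;
}

// Build the opening <html> tag with the lang attribute set.
function htmlOpenTag(locale: Locale): string {
  return `<html lang="${locale}">`;
}
```

With this in place, a request for "/fr/pricing" resolves to the "fr" locale, and any path without a supported locale prefix falls back to the default.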

2. Identify translatable text

When your app needs to operate in languages other than the source language it was originally built in, the most basic step is translating textual content. It helps to categorize translatable text, because each category may be stored and translated differently.

2.1 Text used directly by front-end code

This category covers strings that the front-end uses directly: for example, navigation headings, registration forms, and any custom-built UI that is not data-driven. The best practice is to externalize these strings rather than hard-code them. They are stored separately, either in source control as resource files (usually in some key-value format) or in a database managed by custom code or an off-the-shelf CMS. The content is then fetched by, or embedded in, the front-end code and referenced by key or ID. Most platforms offer plenty of content retrieval and interpolation libraries. Because front-end code may run on a server, in a browser or other client, or both, there are content delivery considerations to address.
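To make the key-based lookup concrete, here is a small sketch of externalized strings with placeholder interpolation. The resource contents, the `{name}` placeholder syntax, and the `translate` helper are all assumptions for illustration; a real project would normally rely on one of the existing libraries and load the bundles from resource files or a CMS.

```typescript
type Messages = Record<string, string>;

// Hypothetical resource bundles, inlined here to keep the sketch self-contained.
const RESOURCES: Record<string, Messages> = {
  en: { "nav.home": "Home", "greeting": "Hello, {name}" },
  fr: { "nav.home": "Accueil", "greeting": "Bonjour, {name}" },
};

// Assumed source language, used when a locale or key is missing.
const SOURCE_LOCALE = "en";

// Look up a string by key and fill in {placeholders} from params.
// Falls back to the source language, then to the key itself.
function translate(
  locale: string,
  key: string,
  params: Record<string, string> = {},
): string {
  const template =
    RESOURCES[locale]?.[key] ?? RESOURCES[SOURCE_LOCALE]?.[key] ?? key;
  return template.replace(
    /\{(\w+)\}/g,
    (placeholder: string, name: string) => params[name] ?? placeholder,
  );
}
```

The front-end code then refers only to keys such as "nav.home", so translators can change the text without anyone touching the code.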

2.2 Text returned by back-end services

This is a broad category because it may include:

  • Externalized server software strings. Just like the front-end code, server code may need to return interpolated localized strings. Similarly, it may leverage resource files or content repositories.
  • Templatized data-driven pages created internally. This may include long-form content such as press releases, blog posts, and product descriptions. These are normally stored in a DB/CMS, which needs to be designed with multilingual storage in mind.
  • User-generated content. Some products allow users to enter content that may be desirable to translate. This raises several additional issues compared to content owned by the company. For example, there may be legal implications if translations are not correct, source content quality and style may not be consistent, and volumes and translation costs need to be considered.
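One possible shape for multilingual storage of data-driven pages is a record that keeps one text entry per locale and falls back to the source language when a translation is missing. The field names and the fallback rule below are assumptions for illustration, not a recommended schema.

```typescript
interface ArticleText {
  title: string;
  body: string;
}

// Hypothetical stored record: one text entry per locale code.
interface StoredArticle {
  id: string;
  sourceLocale: string;
  texts: Record<string, ArticleText>;
}

// Return the text for the requested locale, or the source-language text
// when no translation exists yet. servedLocale reports which one was used.
function readArticle(
  article: StoredArticle,
  locale: string,
): ArticleText & { servedLocale: string } {
  const requested = article.texts[locale];
  if (requested !== undefined) {
    return { ...requested, servedLocale: locale };
  }
  const source = article.texts[article.sourceLocale];
  if (source === undefined) {
    throw new Error(`Article ${article.id} has no text in its source locale`);
  }
  return { ...source, servedLocale: article.sourceLocale };
}
```

Reporting the served locale lets the page set the correct lang attribute even when it shows untranslated content.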

3. Determining the translation process

At the highest level, the first decision is whether to:

  1. Use a translation vendor
  2. Use machine translation
  3. Outsource or crowdsource the translation directly
  4. Leave content untranslated

In reality, you will very likely use a hybrid approach.
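A hybrid approach can be pictured as a policy that maps each content category from section 2 to one of the four options. The specific assignments below are purely hypothetical and only show the shape of such a policy; the right mapping depends on your product, budget, and quality requirements.

```typescript
type ContentCategory =
  | "front-end-string"
  | "server-string"
  | "internal-page"
  | "user-generated";

type TranslationMethod =
  | "vendor"
  | "machine-translation"
  | "crowdsourced"
  | "untranslated";

// Hypothetical policy, for illustration only.
const EXAMPLE_POLICY: Record<ContentCategory, TranslationMethod> = {
  "front-end-string": "vendor",
  "server-string": "vendor",
  "internal-page": "vendor",
  "user-generated": "machine-translation",
};

// Pick a method for a piece of content; content already in the
// target language needs no translation at all.
function chooseMethod(
  category: ContentCategory,
  targetLocale: string,
  sourceLocale: string,
): TranslationMethod {
  if (targetLocale === sourceLocale) {
    return "untranslated";
  }
  return EXAMPLE_POLICY[category];
}
```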

4. Send content resources to the translation vendor

Manual vs. Continuous Localization

Option 1: email/upload. DON'T DO THIS! You want a repeatable process that is as automated as possible. If you do CI/CD (continuous integration and delivery), you also want CL (continuous localization).

Option 2: build the integration code yourself. THAT'S JOB SECURITY!

Option 3: use a TMS (translation management system) vendor. See, for example, Gengo (https://support.gengo.com/hc/en-us/articles/231437927-How-Can-I-Localize-My-App-or-Site-With-Gengo-), Phrase, or Transifex.
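Whichever option you choose, a repeatable process needs to know which source strings changed since the last hand-off. The sketch below compares two versions of a key-value resource file and reports the delta. It illustrates the idea only; the names are hypothetical, and it does not call any vendor's API.

```typescript
type SourceStrings = Record<string, string>;

interface TranslationDelta {
  added: string[];   // keys that are new and need a first translation
  changed: string[]; // keys whose source text changed and need re-translation
  removed: string[]; // keys that no longer exist in the source
}

function hasKey(strings: SourceStrings, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(strings, key);
}

// Compare the previous and current source strings, key by key.
function diffSourceStrings(
  previous: SourceStrings,
  current: SourceStrings,
): TranslationDelta {
  const added: string[] = [];
  const changed: string[] = [];
  for (const [key, text] of Object.entries(current)) {
    if (!hasKey(previous, key)) {
      added.push(key);
    } else if (previous[key] !== text) {
      changed.push(key);
    }
  }
  const removed = Object.keys(previous).filter((key) => !hasKey(current, key));
  return { added, changed, removed };
}
```

A CI job could run a comparison like this on every commit and send only the added and changed keys for translation.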

Localizability

Translators need the right support to achieve the best possible quality. Here is a list of topics to explore in more depth to prevent localization problems: