Localizing Country Names using CLDR (Common Locale Data Repository)

Developers can localize country names using CLDR (Common Locale Data Repository) and save valuable translation resources, but there are a few important considerations to be aware of. A typical starting point is realizing that the product’s UI displays a lengthy list of country names in English that needs to be localized into specific languages. At that moment, the English strings have to be externalized from the code.

Experienced developers familiar with internationalization understand that country names should be externalized but do not necessarily require human translation. Instead, developers can leverage CLDR with appropriate technical integration.

Ideally, the localized country names do not need to be stored in translation resource files at all. By integrating LDML (Locale Data Markup Language) data in the backend and loading it dynamically based on the active language, the localized country names can be delivered seamlessly to the client side.

The challenge lies in aligning the product’s business requirements with the CLDR data when integrating it successfully.

Personally, I think approximately 90% of the countries covered by a product can have their names localized through CLDR.

CLDR, first released by the Unicode Consortium on 19 December 2003, has been widely adopted by prominent i18n-specific open-source projects and is updated consistently. Moreover, because country names are standardized across industries, localizing them with CLDR is straightforward. However, region names remain limited even in CLDR’s territory data – https://github.com/unicode-org/cldr-json/blob/main/cldr-json/cldr-localenames-full/main/en/territories.json#LL11C10-L11C21

CLDR npm

With the emergence of ES6 (ECMAScript 2015), CLDR can now be installed through npm as well.

https://www.npmjs.com/package/cldr (Unpacked Size: 256 MB)

For country names, we only need `cldr-localenames-full`.

https://www.npmjs.com/package/cldr-localenames-full (Unpacked Size: 23 MB)

https://www.npmjs.com/package/cldr-data (Unpacked Size: 16.8 KB) is a kind of mapping package that lets developers configure which CLDR data to download individually.

https://www.npmjs.com/package/cldr-core (Unpacked Size: 1.63 MB) is the CLDR supplemental data package, which works as a mapper between individual CLDR data groups. For instance, to display specific countries (with their localized names) per region (with the localized region name), you can use `territoryContainment` – https://github.com/unicode-org/cldr-json/blob/main/cldr-json/cldr-core/supplemental/territoryContainment.json

The entry for region `013` looks like this:

 "013": {
        "_contains": [
             "BZ",
             "CR",
             "GT",
             "HN",
             "MX",
             "NI",
             "PA",
             "SV"
        ]
 },

`013` means `Central America` (https://github.com/unicode-org/cldr-json/blob/main/cldr-json/cldr-localenames-full/main/en/territories.json#L18).

`BZ` means `Belize` (https://github.com/unicode-org/cldr-json/blob/main/cldr-json/cldr-localenames-full/main/en/territories.json#L81).
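To make the region-to-country mapping concrete, here is a minimal sketch that joins the containment data with the localized names. The two sample objects are small inline excerpts shaped like the CLDR JSON files, and `countriesInRegion` is a hypothetical helper name, not part of any CLDR package.

```javascript
// Inline excerpts shaped like the CLDR JSON files. In a real app these would
// come from cldr-core/supplemental/territoryContainment.json and
// cldr-localenames-full/main/<lang>/territories.json.
const containmentSample = {
  supplemental: {
    territoryContainment: {
      '013': { _contains: ['BZ', 'CR', 'GT', 'HN', 'MX', 'NI', 'PA', 'SV'] },
    },
  },
};

const territoriesSample = {
  main: {
    en: {
      localeDisplayNames: {
        territories: { '013': 'Central America', BZ: 'Belize', CR: 'Costa Rica', MX: 'Mexico' },
      },
    },
  },
};

// Hypothetical helper: returns the localized region name plus its member
// countries. Codes without a localized name fall back to the code itself.
function countriesInRegion(regionCode, lang, containment, localeNames) {
  const region = containment.supplemental.territoryContainment[regionCode];
  const names = localeNames.main[lang].localeDisplayNames.territories;
  const codes = region ? region._contains : [];
  return {
    code: regionCode,
    name: names[regionCode] || regionCode,
    countries: codes.map((code) => ({ code, name: names[code] || code })),
  };
}

const centralAmerica = countriesInRegion('013', 'en', containmentSample, territoriesSample);
console.log(centralAmerica.name, centralAmerica.countries.slice(0, 2));
```

Passing the data in as arguments keeps the helper independent of how the JSON is loaded, so the same function could serve any language folder.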

Install CLDR localenames via npm

npm i cldr-localenames-full

Although the npm page states that the unpacked size of the package is 23 MB, it was approximately 28 MB when I examined it within my application.

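Once the cldr-localenames-full package is installed, you can load the territory data with `require`, where `cldrLang` holds the language code of the folder to read (for example `en`).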

const cldr = require('cldr-localenames-full/main/'+ cldrLang +'/territories');
const territories = cldr.main[cldrLang].localeDisplayNames.territories;

It is important to note that the CLDR npm package is only a set of static data files and does not include any logic. Therefore, it is recommended to require only the specific file you need, such as territories, for the specific language.
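Because the package contains no logic, picking the right language folder is up to the application. The following is a rough sketch of one possible fallback strategy, not something the package provides: `localeCandidates` and `loadTerritories` are hypothetical helpers, the default language `en` is an assumption, and the loader is injected so the example runs without the package installed.

```javascript
// Hypothetical helper: builds the list of CLDR locale folders to try, from
// most to least specific, e.g. 'pt-BR' -> ['pt-BR', 'pt', 'en'].
function localeCandidates(lang, defaultLang = 'en') {
  const parts = lang.split('-');
  const candidates = [];
  for (let i = parts.length; i > 0; i -= 1) {
    candidates.push(parts.slice(0, i).join('-'));
  }
  if (!candidates.includes(defaultLang)) candidates.push(defaultLang);
  return candidates;
}

// loadJson is injected so the sketch can run anywhere; in a Node application
// it could be (path) => require(path).
function loadTerritories(lang, loadJson) {
  for (const candidate of localeCandidates(lang)) {
    try {
      const data = loadJson(`cldr-localenames-full/main/${candidate}/territories.json`);
      return {
        locale: candidate,
        territories: data.main[candidate].localeDisplayNames.territories,
      };
    } catch (err) {
      // Loading failed for this candidate (for example, the folder does not
      // exist), so try the next, less specific one.
    }
  }
  throw new Error(`No CLDR territory data found for ${lang}`);
}

// Stand-in for the installed package: only a 'pt' folder exists here.
const fakeFiles = {
  'cldr-localenames-full/main/pt/territories.json': {
    main: { pt: { localeDisplayNames: { territories: { BR: 'Brasil' } } } },
  },
};
const fakeLoader = (path) => {
  if (!(path in fakeFiles)) throw new Error(`Cannot find module '${path}'`);
  return fakeFiles[path];
};

const result = loadTerritories('pt-BR', fakeLoader);
console.log(result.locale, result.territories.BR);
```

Walking from the full tag down to the base language means a tag such as `zh-Hant-TW` would try `zh-Hant` before `zh`, rather than jumping straight to the base language.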

As mentioned earlier, the primary purpose of using CLDR here is to meet specific business requirements for the list of countries a product supports. This often involves hard-coding the country dataset into the code. Typically, the hard-coded dataset is represented as constants, either as the ISO 3166-1 alpha-2 country code with a corresponding value, or as the ISO 3166-1 alpha-2 code alone.

const demoCountryCodes = [
  'GLOBAL',
  'US', // United States
  'GB', // United Kingdom
  'CA', // Canada
  'MX', // Mexico
  'AU', // Australia
  'ES', // Spain
  'FR', // France
  'DE', // Germany
  'AD', // Andorra
  // ...
  'CH', // Switzerland
  'TW', // Taiwan
  'TH', // Thailand
  'TR', // Turkey
  'UY', // Uruguay
  'VN', // Vietnam
];

This means that we only need to display the pre-translated CLDR country names for the countries in the list above.
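As a minimal sketch of that lookup, the snippet below maps a hard-coded code list to localized names. The Spanish names are a small inline excerpt shaped like the `territories` object, and `localizeCountries` and `customLabels` are hypothetical names introduced for illustration. Product-specific entries such as `GLOBAL` have no CLDR name, so they need their own label.

```javascript
// A shortened, illustrative version of the product's country list.
const productCountryCodes = ['GLOBAL', 'US', 'GB', 'MX'];

// Excerpt shaped like the `territories` object in
// cldr-localenames-full/main/es/territories.json.
const esTerritories = { US: 'Estados Unidos', GB: 'Reino Unido', MX: 'México' };

// Hypothetical product-specific labels for codes CLDR does not define.
// These would still need translation through the normal resource files.
const customLabels = { GLOBAL: 'Global' };

// Looks up each code in the CLDR data first, then in the custom labels, and
// finally falls back to the raw code so the UI never shows an empty entry.
function localizeCountries(codes, territories, fallbackLabels = {}) {
  return codes.map((code) => ({
    code,
    name: territories[code] || fallbackLabels[code] || code,
  }));
}

const localized = localizeCountries(productCountryCodes, esTerritories, customLabels);
console.log(localized);
```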

The full code example looks like this.

  // Assumes List, ListItem and ListItemText are imported from the UI library
  // and that regionList holds objects with cldr_code, label and value fields.
  listingRegion = () => {
    const { lang } = this.props;
    // Use the base language, e.g. 'en-US' -> 'en'.
    const cldrLang = lang.split('-')[0];
    const cldr = require('cldr-localenames-full/main/' + cldrLang + '/territories');
    const territories = cldr.main[cldrLang].localeDisplayNames.territories;
    const regionListing = regionList.map((obj, idx) => {
      // Prefer the CLDR name and fall back to the hard-coded label.
      const regionName = territories[obj.cldr_code] || obj.label;
      return (
        <ListItem key={idx}>
          <ListItemText primary={obj.value + ' (' + regionName + ')'} />
        </ListItem>
      );
    });

    return <List>{regionListing}</List>;
  }


I am a software engineer who enjoys building things. Please share your interest with me – chrisyoum[at]gmail.com