L10n 2.0

If you write a software package and want it to be usable by as many people as possible, it’s important to translate it into other languages. Like documentation, though, localization (l10n) is one of those chores that programmers don’t want to do. So if it’s a web app, why not ask the users to contribute translations?

Background

A bit of background: if you’re using the gettext library, every time your code prints a string that the end-user will see, you need to mark it as a string that needs to be translated into the end-user’s language. For example, instead of writing

printf("Hello, world!\n");

you need to write

printf(gettext("Hello, world!\n"));

This is the internationalization (i18n) part — writing code that can be adapted to other countries.

The localization (l10n) part is the bit where someone actually sits down and translates all of the strings marked with gettext() into the end-users’ languages.

Fortunately, there are utilities that’ll look for the strings marked gettext(), dump them to a file for the translator, and generate machine-readable versions of the translations. But if you want to help translate a project, you still need to download the list of strings to translate, edit them with a text editor, and upload the translations to the project.
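To make the extraction step concrete, here is a toy sketch in Python of what such a utility does. It is not how the real tools work internally (in GNU gettext that job belongs to xgettext, with msgfmt compiling the finished translations); the regular expression and the function names `extract_msgids` and `to_po_template` are made up for illustration, and it only handles a single double-quoted literal per call.

```python
import re

# Toy pattern: gettext("...") with one double-quoted literal,
# allowing backslash escapes inside the string.
GETTEXT_CALL = re.compile(r'gettext\(\s*"((?:[^"\\]|\\.)*)"\s*\)')


def extract_msgids(source: str) -> list[str]:
    """Return each distinct marked string, in order of first appearance."""
    seen: list[str] = []
    for match in GETTEXT_CALL.finditer(source):
        msgid = match.group(1)
        if msgid not in seen:
            seen.append(msgid)
    return seen


def to_po_template(msgids: list[str]) -> str:
    """Format the strings as PO-style entries with empty translations."""
    return "\n".join(f'msgid "{m}"\nmsgstr ""\n' for m in msgids)


source = r'''
printf(gettext("Hello, world!\n"));
printf(gettext("Goodbye!\n"));
printf(gettext("Hello, world!\n"));
'''

print(to_po_template(extract_msgids(source)))
```

The translator's job is then to fill in each empty msgstr, which is exactly the step that requires a download, a text editor, and an upload.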

Now, it strikes me that there are a lot of people out there who, even if they can’t tell an abstract class from a buffer overflow, know two or more languages, and would be willing to help with the translation, if it were easy and didn’t take a lot of time.

A Proposed Solution

Here’s a possible solution for web apps: let’s say your app currently generates

<p>Hello, world!</p>

Presumably you’ve already internationalized your app, with something along the lines of

echo "<p>", gettext("Hello, world!"), "</p>\n";

where gettext simply looks up the translation and returns it, or returns the original string if no translation is available (because it’s better to print a message in the wrong language than not to print anything).

What if, instead, it returned

<span class="i18n" lang="fr">Bonjour, tout le monde!</span>

if a translation was found, or

<span class="broken-i18n" lang="fr">Hello, world!</span>

if it couldn’t find a translation?
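A minimal sketch of such a wrapper, written in Python rather than PHP to keep it short. The function name `gettext_span` and the in-memory `TRANSLATIONS` table are assumptions for illustration; a real app would sit on top of whatever translation catalog it already uses.

```python
from html import escape

# Hypothetical in-memory translation table:
# (language, original string) -> translated string.
TRANSLATIONS = {
    ("fr", "Hello, world!"): "Bonjour, tout le monde!",
}


def gettext_span(msgid: str, lang: str) -> str:
    """Wrap the translated (or untranslated) string in a marked-up span."""
    translated = TRANSLATIONS.get((lang, msgid))
    if translated is not None:
        css_class, text = "i18n", translated
    else:
        # No translation found: fall back to the original string,
        # but flag it so the page can offer to fix it.
        css_class, text = "broken-i18n", msgid
    # Escape everything that goes into the markup, since translations
    # would eventually come from users.
    return f'<span class="{css_class}" lang="{escape(lang)}">{escape(text)}</span>'


print(gettext_span("Hello, world!", "fr"))
print(gettext_span("Goodbye!", "fr"))
```

The first call finds a translation and produces the "i18n" span; the second falls back to the English text inside a "broken-i18n" span.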

It should be a simple matter, using CSS and JavaScript, to display an icon next to the “broken-i18n” strings. When the user clicks on the icon, a window pops up and prompts the user for a translation. This translation is then added to the translation table, so that the next person sees a properly-translated string. Or a user could start a translation into a new language, or regional dialect.
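On the server side, accepting a submission could look roughly like this Python sketch. Everything here is hypothetical: the `translations` dictionary stands in for the real translation table, and `submit_translation`, `lookup`, and the length cap are names and values picked for illustration.

```python
# Hypothetical in-memory translation table:
# (language, original string) -> translated string.
translations: dict[tuple[str, str], str] = {}

MAX_LENGTH = 1000  # arbitrary cap, chosen only for this sketch


def submit_translation(lang: str, msgid: str, text: str) -> bool:
    """Store a user-submitted translation; return False if it is rejected."""
    text = text.strip()
    if not text or len(text) > MAX_LENGTH:
        return False
    # Stored as plain text; it must be HTML-escaped when it is displayed.
    translations[(lang, msgid)] = text
    return True


def lookup(lang: str, msgid: str) -> str:
    """Return the translation, or the original string if there is none."""
    return translations.get((lang, msgid), msgid)


print(lookup("fr", "Hello, world!"))   # no translation yet
submit_translation("fr", "Hello, world!", "Bonjour, tout le monde!")
print(lookup("fr", "Hello, world!"))   # the next visitor sees the French
```

Starting a new language or dialect needs nothing extra in this scheme: the first submission with a new language code creates it.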

Obviously, this system has various problems. Aside from the usual SQL injections and XSS, there’s vandalism. One possible solution would be to allow multiple translations for a string. Users could then vote on the quality of a given translation, so that the better ones percolate to the top, like at Urban Dictionary.

This way, people who don’t know how to use a text editor but can find their way around a browser can easily contribute as few or as many translations as they like, and the community can decide which ones work best.

Update, Nov. 5, 2013: Thanks to alert reader Johan Sundström, who pointed out that I had “l17n” instead of “l10n” throughout.

2 thoughts on “L10n 2.0”

    1. Yes, I’m pretty sure I meant localization. I guess I should’ve run perl -le 'print length "localization"-2'. Please feel free to write corrections on your monitor as needed, until I have a chance to fix the post.
