Friday, August 1, 2008

Translating Django apps. Good practices

In this article you'll find some tips that can help you avoid problems and extra work when translating your Django application.

1. Setting up the environment

A few small changes to your project structure can save you from translating many strings (the ones that are already translated in Django, or in any external application).

To achieve this, my tip is to copy Django itself and all external applications into your project path, rather than into a PYTHONPATH directory. This can also avoid compatibility problems and version conflicts if you're working on several projects. Your project root will then contain something like:

__init__.py
settings.py
urls.py
django/
transdb/
myapp/


The next step is patching Django (until the change is included in trunk) so that strings from already translated applications are left out of your project catalogs. Here is the patch; you can also see #7050 for further information, or to check its status.

Then, when you run ./manage.py makemessages, your project catalogs will contain only the strings that haven't been translated already.

2. Creating strings

If you don't have a sound policy for creating literals, your translators will have extra work and problems, and your translation won't be as correct as it should be.

The first thing to do is to write literals with reusability in mind (like software reusability, but for translations). I'll show it with some examples:

Using

{% trans 'product' %}
{% trans 'Product' %}
{% trans 'product:' %}

you'll create 3 different strings in your translation. Using

{% trans 'product' %}
{{ _("product")|capfirst }}
{% trans 'product' %}:

will create just one.
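
The same idea can be sketched outside templates in plain Python. This is only an illustration: the one-entry dictionary stands in for a compiled catalog, and translate and capfirst are hypothetical helpers (capfirst imitates the template filter of the same name).

```python
# Hypothetical one-entry catalog standing in for a compiled .po file.
CATALOG_ES = {"product": "producto"}


def translate(msgid, catalog):
    """Return the translation for msgid, or msgid itself when there is none."""
    return catalog.get(msgid, msgid)


def capfirst(text):
    """Upper-case only the first character, like the capfirst template filter."""
    return text[:1].upper() + text[1:]


# One translated string, three presentations built in code.
base = translate("product", CATALOG_ES)
labels = [base, capfirst(base), base + ":"]
print(labels)  # ['producto', 'Producto', 'producto:']
```

The translator sees a single entry, and capitalization and punctuation are decided by the code. That assumes the target language punctuates like the source, which is not always the case.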

Another thing to consider is that sometimes you assume a word has just one meaning, or at least you don't expect that it could be translated with different words. But when it's translated into another language, it may become different words depending on the context. Let's use an example.

Play football
Play the guitar


For most native English speakers, play probably has no more than a subtle difference in meaning between the two sentences, but if I translate it as follows:

_("play") -> jugar


Then you'll find something like

Play football -> Jugar a  futbol (which is correct)
Play the guitar -> Jugar con la guitarra (which means "To have fun with the guitar", probably without producing any sound)


This is avoided most of the time, because we don't usually translate word by word, but there are a few cases where we do, and in those cases you should consider doing something like this (actually, I've never had to do it :)

_("play <!-- an instrument -->")
_("play <!-- a sport -->")


Then when you translate into Spanish:

"play <!-- an instrument -->" -> "tocar"
"play <!-- a sport -->" -> "jugar"


And I would also translate English into English:

"play <!-- an instrument -->" -> "play"
"play <!-- a sport -->" -> "play"
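
As a rough sketch of how these marked strings could be looked up, here is a plain-Python version. The CATALOGS dictionaries and the translate helper are hypothetical stand-ins for real catalogs, and the fallback that strips the marker when a language has no entry is my own assumption, not something gettext does for you.

```python
import re

# Matches the "<!-- ... -->" disambiguation marker and the space before it.
_MARKER = re.compile(r"\s*<!--.*?-->")

# Hypothetical catalogs standing in for compiled .po files.
CATALOGS = {
    "es": {
        "play <!-- an instrument -->": "tocar",
        "play <!-- a sport -->": "jugar",
    },
    "en": {
        "play <!-- an instrument -->": "play",
        "play <!-- a sport -->": "play",
    },
}


def translate(msgid, language):
    """Look up msgid; if it is untranslated, return it without the marker."""
    catalog = CATALOGS.get(language, {})
    if msgid in catalog:
        return catalog[msgid]
    # Assumed fallback so the marker never leaks into the page.
    return _MARKER.sub("", msgid)
```

Later versions of gettext and Django offer a dedicated mechanism for this (message contexts, via pgettext), which may be preferable where it is available.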


There are endless cases that can cause issues when translating, and it's impossible to control every one of them. I just wanted to give some tips focused on Django applications.

3. Translating

This article isn't intended to explain how to translate (I think there's a university degree for that ;). But maybe you should give your translators some tips and explanations to get better results.

The first thing you should explain to them is how to handle some special cases in your strings. Here are the two most common examples they will find:

"This is normal text, <big>and this one is bigger</big>"
"Hello %(name)s"


Unless you explain to them what these mean, you'll probably find something like this in your translated strings (using Spanish in the example):

"Este texto es normal, <grande>y  éste es mayor</grande>"
"Hola %(nombre)s"


Of course, those translations don't produce the expected results, because HTML tags and %(...)s placeholder names must be left untouched. The correct ones are:

"Este texto es normal, <big>y  éste es mayor</big>"
"Hola %(name)s"
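
Mistakes like these can also be caught automatically before a catalog goes live. Below is a minimal sketch of such a check; the function name markup_matches and the two regular expressions are my own assumptions, and they only cover simple tags and %(name)s / %(name)d placeholders.

```python
import re

# Named placeholders such as %(name)s or %(count)d.
PLACEHOLDER = re.compile(r"%\((\w+)\)[sd]")
# Opening or closing HTML tags; captures only the tag name.
TAG = re.compile(r"</?(\w+)[^>]*>")


def markup_matches(source, translation):
    """Return True if both strings use the same placeholders and tags."""
    same_placeholders = sorted(PLACEHOLDER.findall(source)) == sorted(
        PLACEHOLDER.findall(translation)
    )
    same_tags = sorted(TAG.findall(source)) == sorted(TAG.findall(translation))
    return same_placeholders and same_tags


print(markup_matches("Hello %(name)s", "Hola %(nombre)s"))  # False
print(markup_matches("Hello %(name)s", "Hola %(name)s"))    # True
```

Running a check like this over every source/translation pair flags the strings to send back to the translator.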


Another thing worth clarifying, especially if your translator is involved in the Web site being translated (or at least knows the context where every string is used), is not to create translations that are more specific than the original texts.

For example, imagine that your application has a form for personal data, and one of the fields is called "name". Then you translate your application into Catalan, and your translator, knowing that "name" is used for a person's name, translates it as "nom propi" (first name). It looks nicer for now, but to my mind it's incorrect: later you may add a form that asks for corporate information, with a field "name" for the company name. You won't send the string "name" to the translator again, and your translation will be wrong, because "nom propi" (first name) is not valid for a company name.

4. Choosing the main language

Sometimes it isn't so obvious which language your application should be written in (I mean the language you use inside gettext strings, or trans/blocktrans tags).

If you're writing an application that will be used widely around the world and translated into many languages, you probably think that English is the right language for it, but in some cases there are questions to consider.

  • Will your company have an international team (especially of Django developers)? If you have workers from many countries, English will probably be good for letting all of them read and write the code.

  • Will your translation team/company use English as its source language? The .po files will show your main language as the source for translating strings. If you hire a German-to-French translator, writing your strings in English isn't a good idea: your work and theirs will increase a lot, and the reliability of the process will decrease.

  • Are your coders fluent in English? It's more complicated (more work) to change a string in the main language than in a translation, because the main-language string is the key that every translation is attached to. So if your developers can't write correct English, writing literals in their mother tongue will save time and work.


Unfortunately, sometimes the answers to these questions will conflict, and you'll have to choose the lesser evil.

10 comments:

  1. Very cool tips!
    gracias
    SAn

  2. Very good tips, but I think you've got the last one backwards: It's actually _less_ work to change a string from a translation than from the main language. Otherwise, thanks for the round-up.

  3. Thanks for the correction Matthias, already corrected it.

  4. thanks for the demonstration!
    it saves quite a lot of time

  5. Excellent writeup Marc. This is quite helpful.

  6. Your "native language" doesn't need to be "copy-ready" text in any language! You can use a string that is more descriptive of the way the text will be used, then translate it to all your languages, even your native one. In the "play" example, you might have the strings:

    "play [a musical instrument]"
    "play [a sport]"
    "play [intransitive]"

    You translate all of them to "play" in English, but "jugar" and "tocar", etc., as appropriate.

    In that example, you might want to use a string interpolation (e.g., "play %(musical_instrument)s"), in case a target language has unusual rules about how the phrase is formed.

    Still, all this semi-automatic internationalization stuff is really automated code-writing for natural-language generation. We have a few barely-adequate domain-specific languages (gettext and string interpolation, locale-specific currency, etc.), but the problem really requires the full power of programming. More on that later, perhaps.

  7. Thanks for your comment David, I escaped the characters to let users see it as is.

    I didn't know that in French you use a space before the colon. This makes my example useless... I don't want to know how it would be in Chinese... :)

    For dynamic strings, using named parameters (e.g. "My name is %(person_name)s") together with comments should be enough for most cases.

    Of course it's really difficult to have i18n that works in all languages. And I'm not a real expert...

  8. As for TransDB.

    Can I use it with slugfield ?
    As far as I know there should be an sql index created for slugfield,
    which means I can't store it as it's stored as dict.

    Please correct me if I'm wrong.

    Thanks!

  9. Hi Robert,

    there is no TransSlugField included in transdb, so you won't be able to use it directly. I don't know how difficult it would be to implement a slug field using a TransCharField, probably not very.

  10. it is not clear how to keep the translations for the applications separately inside of one project.
