Labs/Ubiquity/Ubiquity 0.5 Command Localization Tutorial
Introduction
Ubiquity 0.5 adds the ability to localize commands bundled with Ubiquity and, in addition, lays the foundation for the future localization of community commands. The "localization of Ubiquity commands" involves the translation of verb names, description and help strings, and message and interface strings in the commands' preview and execute code.
For information on the localization of the Ubiquity Parser—i.e., teaching Ubiquity the grammar of your language—please read this tutorial. In general, localizing the Parser is a prerequisite to the localization of individual commands.
Gettext and the po format
Ubiquity command localization follows the po (portable object) format of the GNU gettext system. po is a de facto standard in the world of localization, particularly in the UNIX world, and is supported by a variety of different tools and editors.
Like most types of localization, Ubiquity command localization works essentially by replacing strings. The original commands are written without regard for other languages, as long as they follow certain guidelines to keep the command localizable. The commands are written with strings in a set source language and, through the process of localization, Ubiquity will go through and replace those source language strings with the target language equivalents. In the case of Ubiquity's built-in commands, the source language is always English.
Here is one translation entry from a Ubiquity command's po file:
msgctxt "twitter.execute"
msgid "direct message sent"
msgstr "ダイレクトメッセージを送信しました。"
Every* Ubiquity translation entry consists of these three parts:
- msgctxt, the message context. This is a structured string which tells you which command the string belongs to and which aspect of the command's behavior it relates to. Here, the translation entry is from the execute code of the twitter command.
- msgid, the message id. This is the original text in the source language, so you know exactly what content needs to be translated. It must exactly match the localizable string in the command code.
- msgstr, the message string. This is the localized string in the target language, here Japanese. As expected, the msgstr above says "direct message was sent" in Japanese.
* almost every - see "shared keys" below.
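The three-part structure can be illustrated with a small Python sketch that splits a simple entry into its fields. The function name parse_entry is hypothetical and not part of Ubiquity. The sketch assumes each field sits on a single line and ignores escapes, comments, and continuation lines.

```python
import re

# Matches one field line such as: msgid "direct message sent"
FIELD_RE = re.compile(r'^(msgctxt|msgid|msgstr)\s+"(.*)"\s*$')

def parse_entry(text):
    """Return a dict of the fields found in one simple po entry."""
    entry = {}
    for line in text.splitlines():
        match = FIELD_RE.match(line.strip())
        if match:
            entry[match.group(1)] = match.group(2)
    return entry

sample = '''msgctxt "twitter.execute"
msgid "direct message sent"
msgstr "ダイレクトメッセージを送信しました。"'''

entry = parse_entry(sample)
print(entry["msgctxt"])  # twitter.execute
```

An entry without a msgctxt line simply yields a dict without that key, which corresponds to the shared-key case mentioned in the footnote.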
Ubiquity's localization files for built-in commands are all stored in a central directory, and are organized by command feed. For example, the localization of the firefox.js command feed must be called firefox.po and is placed in the directory ubiquity/localization/XX/, where XX is your language's language code. For example, the Danish localization file for firefox.js would be stored as ubiquity/localization/da/firefox.po. (For more information, see the "Testing your localizations" section below.)
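The naming rule above can be written down as a short Python sketch. The helper po_path and its root parameter are hypothetical names chosen for illustration; the directory layout is the one described in the paragraph above.

```python
import posixpath

def po_path(feed_filename, lang_code, root="ubiquity"):
    """Build the po file path for a command feed and a language code."""
    base, _ext = posixpath.splitext(posixpath.basename(feed_filename))
    return posixpath.join(root, "localization", lang_code, base + ".po")

print(po_path("firefox.js", "da"))  # ubiquity/localization/da/firefox.po
```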
po files may be translated in a text editor or in specialized localization software. Some free tools include poEdit and Translate Toolkit. All Ubiquity po files are in UTF-8 encoding. Traditional Gettext also uses a binary format called mo (machine object), but Ubiquity only uses po files.
Localization templates
po localizations are often started based on a po template or pot file. These templates are simply po files with all blank msgstr's.
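To make the relationship between a po file and its template concrete, here is a minimal Python sketch that blanks every msgstr in a po text. The helper name to_template is hypothetical, and this is not how Ubiquity's own template tool works. It only shows that a template has the same entries with empty translations.

```python
def to_template(po_text):
    """Return po_text with every msgstr (and its continuation lines) blanked."""
    out = []
    skipping = False  # True while inside the continuation lines of a msgstr
    for line in po_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("msgstr"):
            out.append('msgstr ""')
            skipping = True
        elif skipping and stripped.startswith('"'):
            continue  # drop the rest of a multi-line translation
        else:
            skipping = False
            out.append(line)
    return "\n".join(out)

po_text = 'msgctxt "twitter.execute"\nmsgid "direct message sent"\nmsgstr "translated"'
print(to_template(po_text))
```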
Ubiquity includes a handy localization template tool to create these initial templates for you. Just go to the Ubiquity command list, and click on the "get localization template" link next to a command feed. If no such link is showing up, it means that that feed was not bundled with Ubiquity and thus does not currently support localization.
You can also get many of the pot files pre-generated from the Ubiquity hg repository. If you already get your Ubiquity from the source, you'll find the templates in ubiquity/localization/templates. Otherwise, you can get them from our hg server.
When you start working on a localization, make sure to add your contact and credit information to the header. Each Ubiquity po file begins with a header that looks like this:
# social.po
#
# Localizers:
# Masahiko Imanaka <test@yahoo.co.jp>
This is from social.po, the Japanese po file for social.js. You'll see a line called "Localizers:". Add a new line (or replace the dummy line) below that with your name, followed by your email address in <angle brackets>.
Please note that these automatically generated localization templates are not perfect. For technical reasons, there are often a handful of localizable strings which do not get automatically added into the template. When you notice these, you can simply duplicate one of the msgctxt-msgid-msgstr entries and fill in the appropriate details. This will often require that you take a look at the original command feed's source.
Examples and special cases
Two types of strings
There are, broadly speaking, two types of translation entries in Ubiquity command localization: properties and inline strings.
Properties
Properties are metadata about the command that are stored in individual properties of command objects. Localizable properties include names, help, and description. As each command has only one of each of these properties, there is only one msgctxt of each kind. For example, a command like twitter would never have two different twitter.names localization entries. Even though it is logically somewhat redundant, these property entries still need both their msgctxt and their msgid to function properly.
Note in particular that the names translation may be a plurality of names, not just one. In this case, use the pipe (|) character to delimit the names:
#. twitter command:
#. use | to separate multiple name values:
msgctxt "twitter.names"
msgid "twitter|tweet|share using twitter"
msgstr "呟く|呟いて|呟け|つぶやく|つぶやいて|つぶやけ|twitter|tweet"
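As a quick way to check a names entry before testing it in Ubiquity, the pipe-delimited value can be split with a few lines of Python. The function split_names is a hypothetical helper. How Ubiquity itself splits the value is not shown here, and trimming whitespace and dropping empty names are assumptions of this sketch.

```python
def split_names(value):
    """Split a pipe-delimited names value into a list of individual names."""
    return [name.strip() for name in value.split("|") if name.strip()]

source_names = split_names("twitter|tweet|share using twitter")
target_names = split_names("呟く|呟いて|呟け|つぶやく|つぶやいて|つぶやけ|twitter|tweet")
print(len(source_names), len(target_names))  # 3 8
```

As the example shows, the localized entry does not need to contain the same number of names as the source entry.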
Inline strings
Inline strings are those strings which are used in a command's preview or execute methods. As such, they always have a msgctxt of "command name.preview" or "command name.execute". As there may be many different localizable strings in each of these methods, a feed can contain multiple translation entries with the same msgctxt, but each will have a unique msgid.
Multiline strings
Often the strings to localize, and their localizations, will be multiple lines long. In this case, the po format offers a special syntax to deal with such line breaks:
msgctxt "digg.description"
msgid "If not yet submitted, submits the page to Digg.\n"
"Otherwise, it takes you to the story's Digg page."
msgstr "このページを Digg にたれこみます。\n"
"または該当する Digg ページを開きます。"
The convention here is that if a line begins with a quote ("), it is the continuation of the line before it. Newline characters (encoded \n) are not automatically inserted, so a \n must be written at the end of the line to mark that there is a newline there. Note that the msgid must match what is in the source code exactly, so it is best to leave the msgid's as they are in the templates.
There is a known bug in Ubiquity 0.5 preventing these multi-line keys from being properly dealt with.
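The continuation rule can be sketched in Python as follows. The function join_po_string is a hypothetical helper, not Ubiquity's parser, and it decodes only the \n escape; other escapes (such as an escaped backslash or quote) are not handled.

```python
import re

def join_po_string(lines):
    """Concatenate the quoted pieces of a (possibly multi-line) po string."""
    pieces = []
    for line in lines:
        # Take everything between the first and the last quote on the line.
        match = re.search(r'"(.*)"\s*$', line)
        if match:
            pieces.append(match.group(1))
    # Line breaks appear only where the entry spells out \n explicitly.
    return "".join(pieces).replace("\\n", "\n")

msgid_lines = [
    'msgid "If not yet submitted, submits the page to Digg.\\n"',
    '"Otherwise, it takes you to the story\'s Digg page."',
]
text = join_po_string(msgid_lines)
print(text)
```

Without the explicit \n at the end of the first piece, the two pieces would be joined into one line with no separator.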
Shared keys
In some situations, a command author will write shared code which is executed as part of both a command's preview and its execute, or which is even shared between commands.
In this special case, as it would be redundant to write the exact same translation entry twice for both preview and execute contexts, you can optionally make do without the msgctxt:
# no message context
msgid "original message"
msgstr "translation"
By not specifying a context, this same translation entry can be shared across any instance of the string "original message" in any preview or execute string in any command in that feed. You cannot, however, use shared keys to share translations between command feeds.
Note, however, that in some languages you would actually want to localize the two contexts' strings separately. For example, suppose the source language does not mark tense or aspect, but your target language does. A status message in the source language may then be identical whether the action is about to be done (in the preview) or has been completed (in the execute), and you would want to translate these strings differently in the two (future/past) contexts.
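One way to picture how context-specific and shared entries could coexist within a feed is a lookup with a fallback, sketched here in Python. This is a model for illustration only, not Ubiquity's implementation: the lookup function, the catalog layout, the precedence of context-specific entries over shared ones, and the fallback to the source string are all assumptions.

```python
def lookup(catalog, context, msgid):
    """Prefer a context-specific entry, then a shared (context-free) one."""
    if (context, msgid) in catalog:
        return catalog[(context, msgid)]
    if (None, msgid) in catalog:
        return catalog[(None, msgid)]
    return msgid  # assumed fallback: show the untranslated source string

# Hypothetical catalog for one feed; None stands for "no msgctxt".
catalog = {
    (None, "original message"): "translation",
    ("twitter.execute", "original message"): "translation (completed)",
}

print(lookup(catalog, "twitter.preview", "original message"))  # translation
print(lookup(catalog, "twitter.execute", "original message"))  # translation (completed)
```

Under these assumptions, a localizer could keep one shared entry for most uses and add a context-specific entry only where the target language needs a different form.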
Localizing formatting templates
Localizable strings will often contain variable references and other code in {curly braces}, following the JavaScript Templates format. For example, you may see a string like this:
msgid "${number} results found."
In these cases, the ${number} is going to be replaced by some data, so you will want to leave it alone in your localization, for example (French):
msgstr "Il y a ${number} résultats."
JavaScript Templates can also include some basic control structures, most importantly letting you handle pluralization directly in the localized string. Suppose you want your localized string to display slightly differently depending on whether the number value is singular or plural. The JavaScript Templates format allows for a simple {if} statement to handle these possibilities:
msgstr "Il y a ${number} résultat{if number > 1}s{/if}."
This string will produce "Il y a 3 résultats." when plural and "Il y a 1 résultat." when singular, as we would like. You can also write an {else} condition, using the pattern {if ...} ... {else} ... {/if}
Alternatively, suppose the source language includes such special syntax for pluralization, but your language does not have plural marking. You can simply remove the {if}... control structure from the localized string.
msgid "${number} result{if number > 1}s{/if} found."
msgstr "找到${number}個結果"
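Since variable references such as ${number} must survive translation, a small check can catch ones that were dropped by accident. The Python sketch below uses a hypothetical helper, missing_variables, which compares only ${...} references and deliberately ignores {if} blocks, since those may legitimately differ between msgid and msgstr.

```python
import re

# Matches variable references of the form ${name}
VARIABLE_RE = re.compile(r"\$\{([^}]+)\}")

def missing_variables(msgid, msgstr):
    """Return variables referenced in msgid but absent from msgstr."""
    source = set(VARIABLE_RE.findall(msgid))
    target = set(VARIABLE_RE.findall(msgstr))
    return sorted(source - target)

ok = missing_variables("${number} result{if number > 1}s{/if} found.",
                       "找到${number}個結果")
bad = missing_variables("${number} results found.", "Il y a des résultats.")
print(ok, bad)  # [] ['number']
```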
Testing your localizations
Manual testing is an important step in preparing your localizations. Testing requires you to place your po files in a specific language directory in your Ubiquity source folder. Where this folder is depends on how you obtained Ubiquity:
- for Ubiquity installs via xpi (addons.mozilla.com):
  - If you installed Ubiquity via a packaged xpi, such as from addons.mozilla.com or from the "Add-ons" menu item in Firefox, your Ubiquity folder is in your Firefox profile. This support tutorial will tell you where to find your Firefox profile. Within your profile drill down to extensions/ubiquity@labs.mozilla.com/. This is your Ubiquity source folder.
- for Ubiquity installs from hg:
  - If you pulled the Ubiquity source from our hg repository and installed it using the manage.py utility, you will find the Ubiquity source folder, ubiquity, right in the root level of the repository.
Once you've found your Ubiquity source directory, find your language's directory at localization/XX/, where XX is your language code. If the directory doesn't exist, you can create it. Place your po files there.
Finally, go to the Ubiquity settings page and make sure it's set to use your language (if you haven't already).* Restart to test out your localizations.
* Currently there is no way to test localizations for languages which do not yet have parsers. Read this tutorial to learn more about how to write parser language settings for your language.
Contributing your localizations
Currently the latest versions of Ubiquity built-in command localizations are kept in Ubiquity's main hg repository. You can submit new localizations by posting them on this trac ticket.
You can also ask questions and seek help on Ubiquity localization on the Ubiquity i18n Google Group.
References
- Labs/Ubiquity/i18n, the Ubiquity i18n project wiki page
- The Ubiquity i18n Google Group, a mailing list to discuss Ubiquity internationalization and localization
- Making commands localizable, a guide for command authors
- Parser 2 localization tutorial, how to actually teach Ubiquity the grammar of your language
- The GNU gettext manual
- JS Gettext, the open source JavaScript library chosen to incorporate po file handling in Ubiquity.