Message Catalogs on Darwin
I wanted to localize a shell command to give Danish output, so I decided to look into the message catalog functions described by XPG4.
Opening and Using a Catalog
The basics are simple. Here is a short example:
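#include <cstdio>
#include <nl_types.h>

int main (int argc, char const* argv[])
{
   // catopen returns (nl_catd)-1 on failure, never a null pointer
   nl_catd cat = catopen("test.mo", 0);

   // with an invalid descriptor catgets returns the default string
   char const* str = catgets(cat, 0, 0, "This is a test!");
   puts(str);

   if(cat != (nl_catd)-1)
      catclose(cat);
   return 0;
}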
First we open the message catalog named test.mo, then we ask for string zero in set zero of that catalog (each catalog can contain multiple sets, and each set, multiple strings/messages), finally we print the string and close the catalog.
Generating the Catalog File
The test.mo file is the catalog and needs to be generated using the gencat command. The source is plain text, and for our test, it looks like this:
$set 0
0 Dette er en test!
First we indicate that we are defining strings for set zero, and then we provide a string with index zero. This format also supports comments, but let’s ignore those for now.
To generate our test.mo file from the above (which we save as test.txt) we would run this line:
gencat test.mo test.txt
Finding What Catalog to Load
Okay, that was all very simple. Now comes the tricky part: which catalog should we actually open?
The catopen function can take an absolute path, but that’s no good, as the idea is that the user can change the language without recompiling our program. Instead, if we specify just the name, the system will use the value of the NLSPATH environment variable to figure out where the file is located.
This variable can contain multiple locations (separated with a colon) and it can contain placeholders, such as a placeholder for the current language.
So what is the default value of NLSPATH on Mac OS X? Well, it is unset, and the catopen function has a default value (for when it is unset) which is useless on Mac, since the locations it points to do not exist with a default install of Tiger.
It seems Mac OS X keeps the catalogs under /usr/share/locale/…/LC_MESSAGES. Here the three dots refer to the actual language, e.g. en_US, da_DK, etc. In the NLSPATH we can use %L as placeholder for the current language, so it would seem that this line would be required in our shell startup (e.g. .profile):
export NLSPATH=/usr/share/locale/%L/LC_MESSAGES/%N
The last %N is a placeholder for the actual message catalog name, e.g. test.mo in our code above.
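To make the placeholder substitution concrete, here is a minimal sketch of how one NLSPATH template could be expanded. The function name expand_template is made up for illustration, it only handles %L, %N and %% (a literal percent sign), and it is not the code catopen actually runs.

```cpp
#include <string>

// Hypothetical helper: expand one NLSPATH template.
// Only %L (locale name), %N (catalog name) and %% are handled in this sketch.
std::string expand_template(const std::string& tmpl,
                            const std::string& locale,
                            const std::string& name)
{
    std::string out;
    for (std::string::size_type i = 0; i < tmpl.size(); ++i) {
        if (tmpl[i] != '%' || i + 1 == tmpl.size()) {
            out += tmpl[i];            // ordinary character (or a trailing '%')
            continue;
        }
        switch (tmpl[++i]) {
            case 'L': out += locale; break;
            case 'N': out += name;   break;
            case '%': out += '%';    break;
            default:                   // unknown placeholder: keep it verbatim
                out += '%';
                out += tmpl[i];
        }
    }
    return out;
}
```

Called with the template /usr/share/locale/%L/LC_MESSAGES/%N, the locale da_DK and the name test.mo, this returns /usr/share/locale/da_DK/LC_MESSAGES/test.mo.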
So now that we have established this path, we would need to copy our test.mo to /usr/share/locale/da_DK/LC_MESSAGES.
Changing Language
If we run our command it still gives the English message because we have not changed the language yet. So how to do that? Ideally I think one should be able to set the LC_MESSAGES environment variable to da_DK, but on Darwin the only variable used when resolving NLSPATH is LANG, so this is the variable we need to set to da_DK.
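For comparison, the usual POSIX precedence for the message locale is LC_ALL first, then LC_MESSAGES, then LANG. The sketch below reads the environment in that order. The function name messages_locale_from_env is hypothetical, and this shows the conventional lookup order only, not what catopen on Darwin does.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: pick the locale name for messages from the environment,
// using the conventional order LC_ALL, LC_MESSAGES, LANG.
// Empty values are treated as unset; "C" is the fallback.
std::string messages_locale_from_env()
{
    const char* vars[] = { "LC_ALL", "LC_MESSAGES", "LANG" };
    for (const char* var : vars) {
        const char* value = std::getenv(var);
        if (value && *value)
            return value;
    }
    return "C";
}
```

A program could pass the result to its own lookup code if it wanted LC_MESSAGES to be honored regardless of what the system's catopen looks at.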
And that’s it! So here are all the steps:
# first compile our test command
gcc -o test test.cc
# generate the message catalog
gencat test.mo test.txt
# install it (requires sudo)
sudo cp test.mo /usr/share/locale/da_DK/LC_MESSAGES
# now export the NLSPATH variable
export NLSPATH=/usr/share/locale/%L/LC_MESSAGES/%N
After this, we can run the command with either Danish or default (English) output:
% LANG=da_DK ./test
Dette er en test!
% ./test
This is a test!
Closing Notes
I refer above to da_DK as the language. It really is the language, then an underscore, and then the country/region (territory). One can refer to the language (subpart) in NLSPATH using %l and to the territory using %t.
It is also possible to provide an encoding (codeset), e.g.:
export LANG=da_DK.UTF-8
And this encoding can be referred to in the NLSPATH as %c. If you look in /usr/share/locale you will see that there actually are subdirectories for all of the following:
da
da_DK
da_DK.ISO8859-1
da_DK.ISO8859-15
da_DK.UTF-8
I don’t know what the intended usage of these subdirectories is. Maybe the idea is to set NLSPATH to something like this (I wrapped the line for display purposes):
export NLSPATH=
/usr/share/locale/%L/LC_MESSAGES/%N
:/usr/share/locale/%l_%t/LC_MESSAGES/%N
:/usr/share/locale/%l/LC_MESSAGES/%N
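To see which files such a fallback chain would try, here is a rough sketch that splits a locale name into language, territory and codeset, and expands every colon-separated template in order. The names LocaleParts, split_locale and candidate_paths are invented for this example, empty entries are simply skipped, and the real lookup inside catopen may differ in details.

```cpp
#include <string>
#include <vector>

// Hypothetical container for the parts of a locale name.
struct LocaleParts {
    std::string language, territory, codeset;
};

// Split "da_DK.UTF-8" into language "da", territory "DK", codeset "UTF-8".
// Any "@modifier" suffix is dropped; missing parts stay empty.
LocaleParts split_locale(std::string locale)
{
    LocaleParts parts;
    locale = locale.substr(0, locale.find('@'));
    std::string::size_type dot = locale.find('.');
    if (dot != std::string::npos) {
        parts.codeset = locale.substr(dot + 1);
        locale.erase(dot);
    }
    std::string::size_type underscore = locale.find('_');
    if (underscore != std::string::npos) {
        parts.territory = locale.substr(underscore + 1);
        locale.erase(underscore);
    }
    parts.language = locale;
    return parts;
}

// Expand every colon-separated template in nlspath, in the order given.
std::vector<std::string> candidate_paths(const std::string& nlspath,
                                         const std::string& locale,
                                         const std::string& name)
{
    const LocaleParts parts = split_locale(locale);
    std::vector<std::string> paths;
    std::string::size_type start = 0;
    while (start <= nlspath.size()) {
        std::string::size_type end = nlspath.find(':', start);
        if (end == std::string::npos)
            end = nlspath.size();
        const std::string tmpl = nlspath.substr(start, end - start);
        std::string path;
        for (std::string::size_type i = 0; i < tmpl.size(); ++i) {
            if (tmpl[i] != '%' || i + 1 == tmpl.size()) {
                path += tmpl[i];       // ordinary character (or a trailing '%')
                continue;
            }
            switch (tmpl[++i]) {
                case 'N': path += name;            break;
                case 'L': path += locale;          break;
                case 'l': path += parts.language;  break;
                case 't': path += parts.territory; break;
                case 'c': path += parts.codeset;   break;
                case '%': path += '%';             break;
                default:  path += '%'; path += tmpl[i];
            }
        }
        if (!path.empty())             // empty entries are skipped in this sketch
            paths.push_back(path);
        start = end + 1;
    }
    return paths;
}
```

With the three-entry NLSPATH above (joined into one line), the locale da_DK.UTF-8 and the name test.mo, the candidates come out as the da_DK.UTF-8, da_DK and da directories, in that order, each followed by LC_MESSAGES/test.mo.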
I should also add that, based on the manual for catopen, it sounds like providing NL_CAT_LOCALE as the flag is the right way, as it will then use the LC_MESSAGES locale category (of the current locale). I was, however, unsuccessful in changing the current locale away from C, so using that flag meant I never got the localized messages, no matter what value I gave LANG, LC_ALL, LC_MESSAGES, etc.
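One possible explanation, not verified on Darwin: a C or C++ program always starts in the C locale and only adopts the locale from the environment after calling setlocale(LC_ALL, ""). The sketch below does that before opening the catalog with NL_CAT_LOCALE. The function name lookup_message is hypothetical, and whether this is enough to get localized output on Tiger is an assumption.

```cpp
#include <clocale>
#include <string>
#include <nl_types.h>

// Hypothetical helper: fetch one message, falling back to the built-in text
// when the catalog or the message cannot be found. fallback must not be null.
std::string lookup_message(const char* catalog, int set_id, int msg_id,
                           const char* fallback)
{
    // Adopt the locale from the environment (LANG, LC_ALL, LC_MESSAGES, ...).
    // Without this call the program stays in the "C" locale.
    // A real program would do this once at startup instead.
    std::setlocale(LC_ALL, "");

    nl_catd cat = catopen(catalog, NL_CAT_LOCALE);
    if (cat == (nl_catd)-1)
        return fallback;               // no catalog found

    // Copy the text before closing, since it belongs to the catalog.
    std::string text = catgets(cat, set_id, msg_id, fallback);
    catclose(cat);
    return text;
}
```

If no catalog can be located, the function simply returns the fallback text, which matches what the test command printed when run without LANG set.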
I do get the feeling that either I am missing something, or no-one ever actually made sure that this stuff works as it should for Darwin.