English-language version of Luistxo Fernandez's blog
False, true, symmetrical and asymmetrical multilingual blogs
L10n in blogs can result in different types of non-English sites. A localised blog in a given language is the most obvious, but then there is that curious possibility: the bilingual or multilingual blog, a territory of the net not totally explored to date.
1) A localised blog, monolingual, in a given language.
2) A multilingual blog, where a given language_change button transforms the interface from language a to b, and vice versa.
3) A multilingual blog, where a given language_change button transforms the interface, and also the content visualization, from language a to b, and vice versa.
Some might think that 2) and 3) are the same, but they are not. Not at all.
In the Nuke family of PHP products for blogging and portal management, they have achieved stage 2) quite well.
For example, nukeunited.com is a Nuke site with English content in which you can change the interface language. But that's not multilingualism. It's a fun trick, at most. Probably not even fun. What's the point of a blog with content in English having its interface in Bulgarian? Those are false multilingual blogs.
I have my own false multilingual blog. It's my triblog here, set up as an example of my personal Coreblog l10n effort, not a real posting site. The content is uniform in one language (a non-language, in this case, just lorem ipsum chatter) but if you click on the Language Change options, you can see the interface in Basque, Spanish or English.
----
In a true multilingual blog, however, content should change when you click the language_change button. That's what the user expects, at least. He/she is reading something in English in a given site, but if a "Spanish" button is offered, the user obviously expects to continue reading the content in Spanish, and not just changing a couple of menu-messages.
This kind of multilingual blog has, in turn, different possibilities. Mainly, it can be symmetric or asymmetric.
SYM is a symmetric blog. You are reading an item titled Elvis is alive, click on language change, and, well, you get the Spanish version of it: Elvis está vivo. Of course, this requires that, when posting, you fill in double data in the form: two titles, two text bodies... It cannot work otherwise. Some people already type double entries for their blogs, but within a single entry: check Merdeinfrance. If the software this blogger uses could be adapted to a doubled-input interface, it could be a truly symmetric blog; for now it's a double-reading blog.
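A minimal sketch of how a SYM post could be stored, assuming a hypothetical SymPost class with one title and one body per language. The class, the function name and the placeholder body texts are illustrative assumptions, not taken from any real blogging software.

```python
from dataclasses import dataclass


@dataclass
class SymPost:
    """A symmetric post: every language is filled in when posting."""
    titles: dict  # language code -> title
    bodies: dict  # language code -> body text

    def render(self, lang):
        # A language change just picks a different key on the same post.
        return f"{self.titles[lang]}\n\n{self.bodies[lang]}"


def make_sym_post(languages, titles, bodies):
    # Refuse to save unless every language has both a title and a body.
    for lang in languages:
        if not titles.get(lang) or not bodies.get(lang):
            raise ValueError(f"missing {lang} version")
    return SymPost(titles=dict(titles), bodies=dict(bodies))


# Placeholder content, for illustration only.
post = make_sym_post(
    ["en", "es"],
    {"en": "Elvis is alive", "es": "Elvis está vivo"},
    {"en": "He was seen yesterday.", "es": "Lo vieron ayer."},
)
print(post.render("es"))
```

The check inside make_sym_post is the "doubled-input" constraint: a post with only one language filled in is rejected rather than published half-translated.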
ASYM is an asymmetric blog. Posts are directed, either to an English version, or the Spanish version. We may have 3 messages in English, Elvis is alive, Nostradamus was right, and Mars attacks Earth. And, just two in Spanish, Elvis está vivo, and Carlos Gardel resucita. In this site, when posting, you fill a usual form, but there must be some language choice to be made (or perhaps, two posting forms, one per language).
In ASYM, if you are reading Elvis is alive and click language_change, are you directed to Elvis está vivo? To make that possible, there must be some variables, marked somehow, linking those two posts. However, where are those variables when there is no equivalent to a given posting? You are reading Mars attacks Earth and click on language_change... Then you end up reading Carlos Gardel resucita, or what?
My personal opinion is that, if you are reading Mars attacks Earth on an ASYM-type blog and click on language change, you should simply land on the Spanish index page of the blog.
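The behaviour I'm proposing can be sketched roughly like this. Everything here is an assumption made for illustration: the Post class, the translation_of field (the "variable" linking two posts), the slugs and the URL layout are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Post:
    slug: str
    lang: str
    title: str
    translation_of: Optional[str] = None  # slug of the equivalent post, if any


POSTS = [
    Post("elvis-en", "en", "Elvis is alive", translation_of="elvis-es"),
    Post("nostradamus-en", "en", "Nostradamus was right"),
    Post("mars-en", "en", "Mars attacks Earth"),
    Post("elvis-es", "es", "Elvis está vivo", translation_of="elvis-en"),
    Post("gardel-es", "es", "Carlos Gardel resucita"),
]
BY_SLUG = {p.slug: p for p in POSTS}


def language_change(current_slug, target_lang):
    """Return the URL to show after clicking language_change."""
    current = BY_SLUG[current_slug]
    linked = BY_SLUG.get(current.translation_of) if current.translation_of else None
    if linked is not None and linked.lang == target_lang:
        return f"/{target_lang}/{linked.slug}"
    # No equivalent post: fall back to the index page of that language.
    return f"/{target_lang}/"


print(language_change("elvis-en", "es"))  # the linked Spanish post
print(language_change("mars-en", "es"))   # the Spanish index page
```

With only two languages a single translation_of field is enough; a blog with three or more would need one link per language.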
Are these kinds of blogs possible? Yes. The SYM type is a very closed news-blog type. I think it is fit only for a corporate-like news posting system or so. A good example is ThyssenKrupp's press releases (Is a corporate newsroom a blog? Basically, yes, IMHO. Index page with latest posts, the newer ones up, the others down, click on a given link to read it all...). Some bloggers who opt to translate everything they write could also switch to that kind of interface.
The ASYM type is freer, and more apt for the average bilingual blogger. At my company we have developed several of these, based on Squishdot, but they are in the corporate line, as news services of the sites. No comments allowed, although moderated contributions are possible on some sites.
Check http://www.eizie.org/News, enter any article, and change language to test. The result is different at other levels of that website, as the general content tree is, unlike the News section, truly symmetrical. So language change behaves differently at http://www.eizie.org/Tresnak
Basque Indymedia is also of this kind. And my own blog, The English Cemetery, wants to be of this kind, asymmetrical and multilingual.
Some posts in Basque, others in English...
However, mine is not a very usual multilingual blog. There are bilingual blogs out there, not in the thousands, but a bunch of them, and they are very different... Most (and mine, also) have some unsatisfactory feature, some l10n detail unresolved. That's the subject for another post, the Ten Commandments for Bilingual Blogs.
Localised blogs: Some examples from the real world
As I mentioned in the previous post, whatever the i18n degree that blogging software or services may have reached, users almost always have the option to create their own localised blog in a given language.
I have found several Welsh examples that probably sum up the typology of l10n results that individual bloggers may achieve when localizing their blogs. It is a 3-level typology.
- No interface localization. The blogging machine seems to be of the EnglishBlog type, or, perhaps, the user doesn't have the expertise to hack for l10n. Examples: buchods.blog-city.com (http://buchods.blog-city.com/) & BratiaithBlog. Bloggers on both sites post in Welsh, and all interface messages are in English.
- Partial interface localization. The blogging machine seems to have some degree of i18n, or, perhaps, personalization of the blog permits the user to touch strings and things like that. Unardegg looks very much Welsh, but date formats appear in English, with mixed messages like Postiwyd gan Mr Coch yn Tuesday, April 20 @ 13:05:44 GMT. I assume that the blogger couldn't reach that level of date-format l10n... If the blogger had known how or had been able to do it, I suspect that the dates would also be in proper Welsh.
- Consistent localization. Morfablog looks Welsh, and dates are also in Welsh. Qgil's blog in Catalonia also looks consistently localised.
Those blogs are made with a variety of tools.
The first two, the imperfect ones (that's an unfair way to say it, I know...) are hosted on blogging account web services. They seem to offer limited l10n options to their users...
The 2nd case (Unardegg), with partial interface localization, is a blog made with PHP-Nuke. This software has active i18n and l10n work going on, but it seems that many Nuke sites have the same date-formatting problem as Unardegg. One example, nukeunited.com, is a Nuke site with English content in which you can change the interface language. The results are very poor. Enviado por NukeUnited el Thursday, 08 August a las 11:29:49 doesn't look like a very correct Spanish sentence... However, some Nuke users are well aware of that, and have devised partial solutions: this patch for date formatting in Turkish shows that correct dates may be displayed in Turkish, at least... There's hope for Welsh Nuke sites, after all.
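To show what consistent date l10n means here, a small sketch in Python (not PHP-Nuke code, and not the Turkish patch). The name tables, the sentence templates and the byline function are my own illustrative assumptions, and so is the year 2002, which the NukeUnited example doesn't give.

```python
from datetime import datetime

# Hand-written name tables, so the result does not depend on which
# system locales happen to be installed on the server.
DAYS = {
    "en": ["Monday", "Tuesday", "Wednesday", "Thursday",
           "Friday", "Saturday", "Sunday"],
    "es": ["lunes", "martes", "miércoles", "jueves",
           "viernes", "sábado", "domingo"],
}
MONTHS = {
    "en": ["January", "February", "March", "April", "May", "June", "July",
           "August", "September", "October", "November", "December"],
    "es": ["enero", "febrero", "marzo", "abril", "mayo", "junio", "julio",
           "agosto", "septiembre", "octubre", "noviembre", "diciembre"],
}
# The whole sentence is a per-language template, not just the labels.
TEMPLATES = {
    "en": "Posted by {author} on {day}, {d:02d} {month} at {time}",
    "es": "Enviado por {author} el {day}, {d} de {month} a las {time}",
}


def byline(author, when, lang):
    """Build the 'posted by' line with day and month names in lang."""
    return TEMPLATES[lang].format(
        author=author,
        day=DAYS[lang][when.weekday()],  # weekday(): Monday is 0
        d=when.day,
        month=MONTHS[lang][when.month - 1],
        time=when.strftime("%H:%M:%S"),
    )


when = datetime(2002, 8, 8, 11, 29, 49)  # year is an assumption
print(byline("NukeUnited", when, "es"))
# Enviado por NukeUnited el jueves, 8 de agosto a las 11:29:49
```

The point is that the date words and the surrounding sentence are translated together; translating only "Posted by" and leaving the date to an English formatter gives the mixed output quoted above.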
The consistent cases use Movable Type in the case of Morfablog, and Drupal in the case of QGil's blog.
- Movable Type has an active l10n section at their website.
- Drupal is in development stage, but it seems that there are good options for bloggers or hackers to develop localized versions, as Qgil shows. There is also discussion going on at the site. In this other post, that Catalan user of Drupal, Qgil, offers his approach to the variety of multilingualism we may find on the web.
Just to mention CoreBLOG, the software running behind The English Cemetery.
- CoreBLOG. No i18n attempts so far. Well, the software hasn't reached 1.0. However, this is Zope and l10n can be done, hacking a little bit as I am doing. I also hope to release a l10n skin soon. We'll see.
Besides these localised blogs, then there is the curious world of bilingual or multilingual blogs... A much more limited necessity, probably, but well, there are some of us with that strange obsession. That's a matter for another post.
Blog internationalization (i18n) and localization (l10n)
I will post a series of probably long messages to my blog regarding i18n and l10n of blogs. Many people have written about blog i18n, as for example at Blojsom.com. But I hope to clarify some points, mainly for myself, before going on to introduce a very modest Coreblog l10n project of my own... First of all, definitions.
- Internationalization. The process of planning and developing products so that they can be changed to meet the requirements of specific local languages and cultures.
- Localization is the actual preparing of data (or the software) for a particular language or locale.
For example the Zope product Plone is i18n aware and has localization for several languages. These terms are also spelled internationalisation and localisation, and shortened in the geek terms i18n and l10n, which are formed by the first and last letter of the word and the number of letters in between.
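The abbreviation rule is simple enough to write down. A tiny sketch, where the function name numeronym is just my label for it:

```python
def numeronym(word):
    """Shorten a word to first letter + count of inner letters + last letter."""
    if len(word) < 3:
        return word  # nothing in between to count
    return f"{word[0]}{len(word) - 2}{word[-1]}".lower()


print(numeronym("internationalization"))  # i18n
print(numeronym("localization"))          # l10n
```

Both spellings give the same result, since internationalisation and internationalization have the same length.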
In the realm of blogging software and bloggers, i18n attempts are (probably) restricted to blogging software producers or online blogging account providers. One day or another, all of them will reach that point: let's do i18n (as an example, here is the recent resolve of the creator of WordPress).
In the case of free software blogging machines, bloggers with a technical background can in turn re-arrange the original code to make personal i18n attempts, for personal use or re-release following the original license.
In turn, l10n is a more open process. If a software producer makes an i18n version of their blogging software, they may be able to release different products. Let's suppose the blogging software SuperInternationalBlog by SuperInternational Co. is internationalized. Then they may release:
- SuperInternationalBlog in Spanish
- SuperInternationalBlog in Arabic
- SuperInternationalBlog ...
- SuperInternationalBlog intl' version with a default English skin but with options and instructions for users to create their localised version.
People may download different versions of SuperInternationalBlog, and also develop new ones. A wide range of l10n efforts may result from that.
And then, there is a company, EnglishSoft Co., that has released EnglishBlog with no i18n attempt at all, just as an English version. In this case, it is also possible that localized versions of EnglishBlog may arise. How?
- Users will personalize EnglishBlog as they can, to create a blog in their language.
So, localised blogs may be created both with SuperInternationalBlog and with EnglishBlog. Obviously, SuperInternationalBlog users have more opportunities to create good blogs than EnglishBlog users...
However, the quality of i18n achieved by SuperInternational Co. will affect the output of different attempts.
Not all i18n attempts handle locale-sensitive issues well, such as date-time formatting, character sets or directionality of script. So, having some SuperInternationalBlog out there does not guarantee that l10n can be done correctly.
In turn, even the most simple of EnglishBlog-like machines will permit some personalization in skins or so on, and therefore, a localised simple version of EnglishBlog is probably easy to achieve. And then, there's this option for most users: no matter which language is the one that appears at my blogging software, I will post in my language.
Things that are important to assess
We may say that blog i18n will be all the more accurate if a given effort complies with:
- The sustainability of l10n efforts. When the software updates, what happens to users' particular localized blogs? Will their l10n update at the same pace?
- The possibility to share l10n results is also important. Some software developers may ask their users to contribute their l10n packages (be they string collections, skins, or whatever) to a central repository: this practice has very different results in free or proprietary software, I guess... Other systems may permit a user-to-user interaction where l10n packages may be shared, with no original developer intervening in that. Probably some l10n trials, however, are too hardcoded in one instance and are difficult to share.
- Standardization of l10n procedures. If localized strings are stored in .po files, then translations may be shared between different systems, translation memory may be used... .po files are a standard format in the GNU Gettext i18n/l10n framework.
- The real web effect of Blog software i18n is mostly localized monolingual blogs in a given language. But if i18n also permits to develop multilingual or bilingual blogs, a blog that can be at the same time in English and in Japanese, for instance, that's a step ahead.
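To make the .po point concrete, here is a rough sketch of what such a string collection looks like and how a blog skin might use it. This is deliberately not a full .po parser: it only handles single-line msgid/msgstr pairs and ignores plural forms, multi-line strings and escapes. The sample strings and the function names are assumptions for illustration.

```python
import re

# A tiny .po-style fragment for a hypothetical blog skin.
PO_TEXT = '''
# Spanish strings for a hypothetical blog skin
msgid "Posted by"
msgstr "Enviado por"

msgid "Comments"
msgstr "Comentarios"
'''

LINE = re.compile(r'^(msgid|msgstr)\s+"(.*)"$')


def read_simple_po(text):
    """Read single-line msgid/msgstr pairs into a dict."""
    catalog = {}
    msgid = None
    for raw in text.splitlines():
        match = LINE.match(raw.strip())
        if not match:
            continue  # comments and blank lines
        keyword, value = match.groups()
        if keyword == "msgid":
            msgid = value
        elif msgid is not None:
            catalog[msgid] = value
            msgid = None
    return catalog


def translate(catalog, message):
    # Fall back to the original string when no translation exists.
    return catalog.get(message, message)


catalog = read_simple_po(PO_TEXT)
print(translate(catalog, "Posted by"))  # Enviado por
print(translate(catalog, "Trackback"))  # Trackback (untranslated fallback)
```

Because the catalog is plain text keyed by the original English string, the same file could in principle be reused by another blogging system that shows the same messages.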
Unicode learning, Technorati and Bloglines
One day I'll learn about Unicode. I suppose it's a must to better understand blog i18n. Three resources that may help in the learning process:
BTW, I have a Technorati account now. I don't clearly see the service's advantages. Bloglines, in turn, was a sudden revelation for me. I got instantly fascinated and I'm a daily user of my account now.
How to convert a Coreblog into the archive of a mailing-list
Use the moblog feature of Coreblog to archive a mailing-list. I used Mailman for the following example, and tested it at Zettai Zope hosting.
- Subscribe the moblog email address to a given Mailman mailing list
- In the msg_header variable of Mailman's non-digest options, write the moblog password on one line and a category on the next line: password_word, then category_word
- Add that password_word to the "Password for adding entry" moblog feature of your coreblog settings.
- Leave the "Sender address for moblog" empty (in case it's an open mailing-list, but if it's a bulletin or newsletter-like list with a unique sender all the time, put that sender's address at this setting of your coreblog).
Related problems:
- the category_word is necessary; otherwise the moblog entry will ignore the line right after the password_word
- subscribers of the list will always see their messages beginning with the password_word and the category_word
- if a subscriber sends a message with the password_word to the moblog address, it will appear at the coreblog. Solution: always hide the moblog address, don't make it public or easy to guess.
- maybe some Mailman subscribers will have problems receiving the posts, as the content of the messages after the password_word and the category_word may appear to some wrapped in an attachment, and if they have some email client that blocks attachments, then... (don't know much about this problem, I just suspect that it may happen).
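To picture what the moblog address receives with this setup, here is a rough sketch in Python. It is not Coreblog's actual code, only my assumption of the behaviour described above: line 1 is the password, line 2 is the category, and the rest is the entry. The function name is hypothetical.

```python
def split_moblog_body(body, expected_password):
    """Split a plain-text message into (category, entry_text).

    Assumes line 1 is the password and line 2 is the category, as
    produced by the msg_header setting described above.
    Returns None when the password line does not match.
    """
    lines = body.splitlines()
    if len(lines) < 2 or lines[0].strip() != expected_password:
        return None
    category = lines[1].strip()
    entry_text = "\n".join(lines[2:]).strip()
    return category, entry_text


# What a list message might look like once Mailman has prepended the header.
message = "password_word\ncategory_word\nHello list,\nthis is the actual post."
print(split_moblog_body(message, "password_word"))
```

It also shows why the password problem above matters: anything that reaches the address with the right first line is accepted, whoever sent it.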
So, I can now convert a Coreblog into the archive of a Mailman mailing-list. But why should one want to do such a thing?
Well, email and mailing-list usage is a subject of very personal preferences :-)
Corebloggers of the World, unite and take over!
Some emails exchanged with Klaus-san have led to the creation of Coreblog-en, the English-language mailing list for Corebloggers.
The list is kindly hosted by NBI and it's been set up thanks to a friend of Klaus, Henrik Christian Grove.
Atsushi Shibata, creator of Coreblog, has already joined and invited all of us to suggest features and so on.
Thanks to all. I hope it will be a useful resource.
Larry Trask has died
I've just learned that Larry Trask died last Saturday. I know how much he liked to take part in sci.lang, so I decided to post this here. It seems he had been very ill for some time, but whenever he felt OK he would write here and there, always with interesting points.
I knew Trask by email from the time of Basque-L, probably the first Basque resource on the Internet. He was very much appreciated among Basque internet users. His work, History of Basque, is brilliant.
posted to sci.lang
Last year, The Guardian published "this interview": http://www.guardian.co.uk/life/interview/story/0,12982,984721,00.html
They compared Trask to Pinker and Chomsky there. And they remembered this note from his website
please note: I do not want to hear about the following: Your latest proof that Basque is related to Iberian/Etruscan/Pictish/Sumerian/ Minoan/Tibetan/Isthmus Zapotec/ Martian. Your discovery that Basque is the secret key to understanding the Ogam inscriptions/the Phaistos disc/ the Easter Island carvings/the Egyptian Book of the Dead/the Qabbala/the prophecies of Nostradamus/your PC manual/the movements of the New York Stock Exchange. Your belief that Basque is the ancestral language of all humankind/a remnant of the speech of lost Atlantis/the language of the vanished civilization of Antarctica/ evidence of visitors from Proxima Centauri. I definitely do not want to hear about these scholarly breakthroughs.
It clearly defines his humour, his knowledge and his true love for Basque, my language.
BTW, another Basque linguist died last week: Andolin Eguzkitza, at the age of 50. He was too young... A polyglot, writer and, like Trask, "euskaldunberri": not a native speaker, but an adult learner of our language.
Some notes on Sustatu, a Basque newsblog:
- about Trask
- about Eguzkitza, this and this one
Moblog problems at Zettai
The moblog feature seems to work, but not when attaching a picture. This blog at manterola.org is hosted on Zettai.net. As for the POP email account, I have tried with a manterola.org address, also served by Zettai, as well as with another external POP account.
In both cases, messages without attachments go into the blog fine. But when attaching a picture, this error is raised:
Error Type: IndexError
Error Value: list index out of range
Another difference. To moblog from the zettai-based email account, I have to select the APOP option in the settings. In the non-zettai email account, it doesn't work with APOP. I had to unselect it.
First trials with Coreblog
These are my first trials with Coreblog.
Technical purposes:
- to transform this weblog into a multilingual blog using Localizer, that nice Zope product. One approach may be passing the locales as a category; that's why there are 3 categories on this blog: eu (Basque), es (Spanish) and en (you know). I hope to advance little by little.
- produce a localised Basque version of Coreblog. Or something like that.
Personal purposes:
- publish and collect in a single place things I post for several lists and websites in different languages.
- experiment with mailing-list and blogging integration. There's a veteran Basque mailing list that I love, Eibartarrak, and I feel like it needs to evolve towards blogging and XML Feeds and maybe the semantic web of the future... I wish there were more intl' Coreblog users. The mailing list is all-Japanese.
I have detected a couple of non-Japanese corebloggers out there. I mention them in this inaugural English post just to try the trackback tool. Magnetic Ink surfs in Denmark. I hope to use this post of his to try the notification feature of Coreblog.
Tomster.org is also a coreblogger. His method to validate the RSS output of Coreblog may be useful, so I trackback it as well.
So far the XML link with RDF 1.0 is OK for me. It integrates perfectly with Bloglines.
