Anand

Hello Fellow listers,

I have been thinking of making the tiny browser SCADA put up at www.smbd.org multilingual, and here is why. I have been planning to migrate to Canada, and though the procedure is in its initial stages, I decided to learn French. Then I met an Indian at a railway station who, noticing my French books, started chatting with me and informed me that in the French regions of Canada you need to know French!
I recalled similar things happening in the past in India. I once even wrote an operations manual in Marathi!

Now what if, say, a control room has a French person and un anglais?

Or suppose a giant corporation like Shell or Chevron wanted to integrate its SCADA or other systems working across the globe and ensure some sort of interoperability between them. Then either the entire globe has to become anglais, which again is problematic: the Hindi speakers will stick to Hindi, the Arabs to Arabic, the French to French, the Japanese to Japanese, the Chinese to Chinese, and so on. Or we need a system in which the interpretation into the various languages is automatic.

I believe that having new tags in HTML would greatly help multilingual support.
The new tag <langcode> will carry a twelve-character alphanumeric code and will be closed by </langcode>. When there is no equivalent code, <nolangcode alt="123456789abc" native="french"> will indicate that there is no langcode for this text: alt gives the nearest equivalent code, and native gives the language in which the text was written. This tag is closed by </nolangcode>. Numbers inside the text can be written with <langnumber>, closed by </langnumber>.

Say the alarm tag is "Converter Temperature High 175.25". In French it may be "Temperature du converter haut (or grand) 175,25". {Note that since I could not get the accent signs on my Linux machine, the text is not quite correct French, but it does convey the message.} It would be somewhat different again in German, Spanish, Arabic, Hindi, Marathi and the thousands of other languages across the globe.

Now suppose the server or web PLC sends a message containing the langcode, for example:

<langcode> abc123456789 <langnumber>175.25</langnumber></langcode>

The browser could have the messages for the codes stored in the language used on that particular PC, and the message could then be displayed in the relevant language.
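A minimal Python sketch of that browser-side step, assuming the proposed tags exactly as written in the example message. The phrase tables, the "{n}" placeholder and the decimal-separator table are hypothetical illustrations, not part of any existing standard or browser.

```python
import re

# Hypothetical phrase tables, one per display language, keyed by langcode.
# "{n}" marks where the <langnumber> value is inserted (an assumption).
PHRASES = {
    "en": {"abc123456789": "Converter Temperature High {n}"},
    "fr": {"abc123456789": "Temperature du converter haut {n}"},
}

# Assumed decimal separator per language; real locale data is richer.
DECIMAL_SEP = {"en": ".", "fr": ","}

MESSAGE_RE = re.compile(
    r"<langcode>\s*(?P<code>[0-9a-z]{12})\s*"
    r"(?:<langnumber>(?P<num>[^<]*)</langnumber>)?\s*</langcode>"
)


def render(message: str, lang: str) -> str:
    """Turn one <langcode> message into text in the given language."""
    match = MESSAGE_RE.fullmatch(message.strip())
    if match is None:
        raise ValueError("not a langcode message")
    phrase = PHRASES[lang][match["code"]]
    # Numbers travel with "." on the wire and are localized on display.
    number = (match["num"] or "").strip().replace(".", DECIMAL_SEP[lang])
    return phrase.format(n=number)


wire = "<langcode> abc123456789 <langnumber>175.25</langnumber></langcode>"
print(render(wire, "en"))  # Converter Temperature High 175.25
print(render(wire, "fr"))  # Temperature du converter haut 175,25
```

Only the code and the number cross the network; everything language-specific stays in the local tables.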

Detailed explanation of the working:
A standards committee, say the W3C, creates the langcodes. The body establishes a comprehensive list of codes for common sentences in every language.
The table has the following columns:
A langcode column as the primary key,
A column naming the language which represents the exact meaning of this langcode,
The language columns, which store the meaning of the langcode in each language, like English (US), English (UK), French, German, Hindi, Arabic, Persian and so on.

The browser stores the langcode table on the machine. The web traffic is limited to transmitting langcodes. The browser interprets the langcode in the native language. The user sees the text in the user's own native language.
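The table described above could look roughly like this in SQLite, using Python's standard sqlite3 module. This is only a sketch: the table name, the column names and the three language columns shown are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE langcodes (
        langcode    TEXT PRIMARY KEY,  -- twelve-character code
        source_lang TEXT NOT NULL,     -- language holding the exact meaning
        en_us       TEXT,              -- one column per language
        en_gb       TEXT,
        fr          TEXT
    )
    """
)
conn.execute(
    "INSERT INTO langcodes VALUES (?, ?, ?, ?, ?)",
    (
        "abc123456789",
        "en_us",
        "Converter Temperature High",
        "Converter Temperature High",
        "Temperature du converter haut",
    ),
)

LANGUAGE_COLUMNS = {"en_us", "en_gb", "fr"}


def lookup(code: str, lang: str):
    """Return the phrase stored for a langcode in one language, or None."""
    # Column names cannot be bound as SQL parameters, so check them first.
    if lang not in LANGUAGE_COLUMNS:
        raise ValueError(f"unsupported language column: {lang}")
    row = conn.execute(
        f"SELECT {lang} FROM langcodes WHERE langcode = ?", (code,)
    ).fetchone()
    return row[0] if row else None


print(lookup("abc123456789", "fr"))
```

One column per language mirrors the description; a real table covering thousands of languages would more likely use one row per (langcode, language) pair.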

There are of course limitations, in that not every phrase can be covered, but such messages can be written with the code <nolangcode alt="abc213456789" native="english"> No bhai there is no anglais code for this</nolangcode>
Here the user sees the message in English, "No bhai there is no anglais code for this", followed by "Nearest equivalent: There is no English equivalent" in the native language.
The browser now has to play a critical role: converting whatever the user types into equivalent codes for transmission, and on receipt converting the messages back into text.
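The fallback display could be sketched in Python as follows. The function name, the bracketed language marker and the French strings are illustrative assumptions (the translations have not been checked by a native speaker).

```python
import re

# Hypothetical viewer-side tables for a French-language browser.
NEAREST = {"fr": {"abc213456789": "Il n'y a pas d'équivalent anglais"}}
LABEL = {"fr": "Équivalent le plus proche"}

NOLANG_RE = re.compile(
    r'<nolangcode\s+alt="(?P<alt>[0-9a-z]{12})"\s+native="(?P<native>[^"]+)">'
    r"(?P<text>.*?)</nolangcode>",
    re.DOTALL,
)


def render_nolangcode(message: str, lang: str) -> str:
    """Show the untranslated text, then the nearest equivalent if known."""
    match = NOLANG_RE.fullmatch(message.strip())
    if match is None:
        raise ValueError("not a nolangcode message")
    # Mark the original text with its language so it is clearly untranslated.
    first_line = f"[{match['native']}] {match['text'].strip()}"
    nearest = NEAREST.get(lang, {}).get(match["alt"])
    if nearest is None:
        return first_line
    return f"{first_line}\n{LABEL[lang]}: {nearest}"


wire = (
    '<nolangcode alt="abc213456789" native="english"> '
    "No bhai there is no anglais code for this</nolangcode>"
)
print(render_nolangcode(wire, "fr"))
```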

Anand
--
------------------------------------------------------------------------
Visit www.smbd.org <http://www.smbd.org> for
Free Tutorials, Source Codes and Other stuff.
------------------------------------------------------------------------

Michael Griffin

I believe what you are suggesting is that a committee come up with a collection of stock phrases in various languages, and you simply use a
pre-defined set of codes to call up these stock phrases in whatever language you wish. The problem I see is that if you cannot cover all possible messages with this method (and I don't see how this would be possible), then why bother?

I also suspect this problem may be more difficult than it appears at first. A suitable equivalent expression for a particular language may depend upon the context in which it is used. In other words, language 'A' may use one expression to cover several different situations, while language 'B' requires a different expression for each situation. How would someone who speaks language 'A' but not language 'B' know which "code" to pick?
Another problem is that different languages seem to require different numbers of characters and words to express an idea. Your screen designs would have to accommodate the longest case even if it were never to be used in practice. If you then added another language to the system which required even more space, how would existing systems adapt?

I could, however, see the utility of a phrase book for MMI designers which would help in creating conventional translations. I don't know if such a thing already exists, but it would be a much less ambitious project, as it would not have to cover all possible messages nor all languages to be very useful.

I believe that there are organisations outside of the automation field which are already working on how to integrate multi-lingual support
into the web. I don't know what their approach is, but I would suspect that it would simply be a means of automatically selecting a different set of conventionally created screens depending upon what language you have selected in your browser.

**********************
Michael Griffin
**********************

Johan Bengtsson

As someone answered to nearly the same mail on the linuxPLC mailing list, this won't work in practice.

1. Translation really isn't that easy.
2. You will end up with a lot of phrases not translated, and what do you do then? Write the missing phrase in one fixed language? That will give you a text that is half translated, half untranslated.
3. The browser will have to store all those texts, which is quite a lot of text if you want to cover a lot of phrases. This will not be implemented in all browsers in the near future (for example the browser in a PDA, where there isn't that much memory yet).
4. What if the browser isn't new enough and doesn't have the latest phrases you used? Should it just show the number, or some "phrase missing" text? Or nothing at all?

Nice idea, but I don't think it will work. At some point in the future you might be able to do good machine translation, and in that case you can just write the page in your favorite language and have it translated on the fly in the client's browser (or on the server). If the browser is unable to translate some words, you do at least have the original word to display.

/Johan Bengtsson

----------------------------------------
P&L, Innovation in training
Box 252, S-281 23 Hässleholm SWEDEN
Tel: +46 451 49 460, Fax: +46 451 89 833
E-mail: [email protected]
Internet: http://www.pol.se/
----------------------------------------

Anand

I agree with most of what you say.

Languages have their own way of expressing things.

Translation has several limitations.

For instance, I borrowed two Tintin comics yesterday: Objectif Lune in French and its English edition, Destination Moon. "Blistering barnacles" is "Mille sabords". Though I am at a very preliminary stage of French, I can easily say that these have two entirely different literal meanings, but both convey the message that Captain Haddock is swearing.

The Langcode tables will be easiest for SCADA messages, Company reports and online analysis. It may face problems where we encounter poetry or
phrases that have not been covered by the langcodes. These can be sent in general HTML where the translation falls upon the receiver or normal translation means can be used for translating the messages and putting them in separate places inside the web site.

I believe the W3C works along slightly different lines. I tried sending them some e-mails, but that probably requires some access level; my mails bounced back.

However, i am confident that this concept would become a winner as there is no nearest equivalent solution being proposed at this moment. I am
planning to set up an organization for starting on the tables.

Those who are interested can join.

Anand

Johan Bengtsson

Well yes, that's right of course, but solvable (for example by putting something around the word to mark it as untranslated).

But of course there will be trouble during the translation if whoever or whatever translates doesn't understand all the words. Our products are written in Swedish first (obviously, since that is our native language) and then translated into various languages (first English and then others). We quite often have to correct the translations after the products come back, because some of the more technical words are not understood by the translator. But at least in most (though certainly not all) cases they are marked as "not understood, what is the correct word?"

Well, enough ranting about that. I agree that it is not easily done in a way that covers everything, not even by humans, and a machine is far behind any good human in this case.

/Johan Bengtsson

----------------------------------------
P&L, Innovation in training
Box 252, S-281 23 Hässleholm SWEDEN
Tel: +46 451 49 460, Fax: +46 451 89 833
E-mail: johan [email protected]
Internet: http://www.pol.se/
----------------------------------------

Jiri Baum

Anand,

> The Langcode tables will be easiest for SCADA messages, Company reports
> and online analysis. It may face problems where we encounter poetry or
> phrases that have not been covered by the langcodes.

It will encounter problems long before that. There are just too many words, too many phrases, too many meanings.

> These can be sent in general HTML where the translation falls upon the
> receiver or normal translation means can be used for translating the
> messages and putting them in separate places inside the web site.

This makes the whole concept a lot less useful, though not perhaps completely so.

> However, i am confident that this concept would become a winner as there
> is no nearest equivalent solution being proposed at this moment.

That's because it's not doable. Machine translation has been one of the holy grails of Artificial Intelligence for decades, and progress is meager. Human language simply doesn't lend itself to this kind of processing.

Sorry.

> I am planning to set up an organization for starting on the tables.

I'm not sure if you know a second language, but it would be a good idea to have a polyglot and/or a linguist leading the effort, preferably one with
translation experience. (I'm none of the above, to any significant degree.)

Michael Griffin:
> >I also suspect this problem may be more difficult than it appears at
> >first. A suitable equivalent expression for a particular language may
> >depend upon the context in which it is used.

The classical example would be the phrase "doing X", which needs to be translated differently depending on whether it means "now doing X" or "while doing X".

Another point is the old pluralization thing. You may need to distinguish between:
- (no distinction)
- singular and plural
- singular, dual and plural
- singular, 2-4 and plural
- singular, 2-4 and plural based on number modulo 10, except that 11-14
count as plural (and so do 111-114 etc).
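The plural schemes listed above can be sketched as one Python function that maps a count to a category, so that a phrase table could hold one variant per category. The rule names and the category labels ("one", "two", "few", "other") are illustrative assumptions, and only whole, non-negative counts are handled.

```python
def plural_category(n: int, rule: str) -> str:
    """Return the plural category of count n under one of the listed schemes."""
    if n < 0:
        raise ValueError("count must be non-negative")
    if rule == "none":              # no distinction at all
        return "other"
    if rule == "singular_plural":   # singular and plural
        return "one" if n == 1 else "other"
    if rule == "dual":              # singular, dual and plural
        return {1: "one", 2: "two"}.get(n, "other")
    if rule == "few_2_4":           # singular, 2-4 and plural
        if n == 1:
            return "one"
        return "few" if 2 <= n <= 4 else "other"
    if rule == "few_mod10":         # as above but on n modulo 10; 11-14 are plural
        if n % 10 == 1 and n % 100 != 11:
            return "one"
        if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
            return "few"
        return "other"
    raise ValueError(f"unknown rule: {rule}")


for count in (1, 2, 5, 11, 21, 22, 112):
    print(count, plural_category(count, "few_mod10"))
```

A single stock phrase containing a number would therefore need up to three or four stored variants per language, selected at display time.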

Third point - if there are any substitutions, you may need up to a dozen or so different variants based on gender; and the substituted word may need to be inflected into one of up to about 7-8 cases.

Jiri
--
Jiri Baum <[email protected]> http://www.csse.monash.edu.au/~jirib
MAT LinuxPLC project --- http://mat.sf.net --- Machine Automation Tools

Anthony Kerstens

I have worked with such a thing on a Wonderware application.

Every piece of text in the application was animated with message tags. A recipe file was created with the help of a technical translator at the customer's location. Refinements in the translations were made on site to suit the preferences of the plant staff.

For example,

"Tank Farm" translated to something like "Farma Tanque"

(pardon my incorrect portuguese spelling)

This required, in addition to the message tag animation, a location animation to switch the positions of the two words.

It is possible, but for this particular application, required in the neighbourhood of 800+ tags and hours of work.

Anthony Kerstens P.Eng.

Michael Griffin

At 10:48 23/10/01 -0400, Anand wrote:
<clip>
>The Langcode tables will be easiest for SCADA messages, Company reports
>and online analysis. It may face problems where we encounter poetry or
>phrases that have not been covered by the langcodes. These can be sent
>in general HTML where the translation falls upon the receiver or normal
>translation means can be used for translating the messages and putting
>them in separate places inside the web site.
<clip>
>However, i am confident that this concept would become a winner as there
>is no nearest equivalent solution being proposed at this moment. I am
>planning to set up an organization for starting on the tables.
<clip>

If I were to attempt this problem (multi-lingual web MMI), I think I would take a different approach. The browser can signal the required language to the web server. The web server could then deliver in the required language a web page which was prepared conventionally. This is fairly conventional, and works provided a suitable language version is
available. The main disadvantage of this method is the amount of work involved in adding each additional language.

Alternatively, there is another approach. In this second method, you would prepare only one set of your screen displays, but instead of using
normal text, you would create 'tags' which reference tables of translated text. You would then prepare a set of translation tables for each language to be implemented.
When the client web browser called up a display, the web server would look at the desired language setting, and substitute the correct text
for each 'tag' from the appropriate translation table before sending the web page out to the browser.
Each 'tag' would have a length defined by the MMI programmer, who needs to allow enough space for any of the possible translations. The
development software would allow the programmer to see his screens either as a set of tag names, or the actual text from the translation table of his choice. Development would consist of creating screens in one language (i.e. define one table as you go along). Additional language tables could be added at any time afterwards.
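The second method can be sketched in Python as below. Browsers do send their language preference in the standard HTTP Accept-Language header; everything else here (the tag names, the translation tables, the screen template and the function names) is a hypothetical illustration, and the French strings are unchecked.

```python
from string import Template

# Hypothetical translation tables, one per language, keyed by tag name.
TABLES = {
    "en": {"ALARM_TITLE": "Alarms", "TEMP_HIGH": "Converter Temperature High"},
    "fr": {"ALARM_TITLE": "Alarmes", "TEMP_HIGH": "Temperature du converter haut"},
}

# One screen definition shared by all languages; $NAME marks a text tag.
SCREEN = Template("<h1>$ALARM_TITLE</h1><p>$TEMP_HIGH: $value</p>")


def pick_language(accept_language: str, default: str = "en") -> str:
    """Choose the best supported language from an Accept-Language header."""
    choices = []
    for item in accept_language.split(","):
        parts = item.strip().split(";")
        tag = parts[0].strip().lower()
        if not tag:
            continue
        quality = 1.0
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        # Keep only the primary subtag, so "fr-CA" is served from "fr".
        choices.append((quality, tag.split("-")[0]))
    # sorted() is stable, so equal q-values keep their header order.
    for quality, primary in sorted(choices, key=lambda c: -c[0]):
        if quality > 0 and primary in TABLES:
            return primary
    return default


def render_screen(accept_language: str, value: str) -> str:
    """Substitute every tag from the table of the requested language."""
    table = TABLES[pick_language(accept_language)]
    return SCREEN.substitute(table, value=value)


print(render_screen("fr-CA,fr;q=0.9,en;q=0.5", "175,25"))
```

Because the substitution happens before the page is sent, any ordinary browser can display the result, and a new language only requires a new table.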

What I have just described is similar in concept to what you have spoken of, but with a few key differences which I think make implementation a more manageable task.
1) The translations reside in the server, rather than in the web browser. This keeps more closely to the concept of what a web browser does.
2) Conventional existing web browsers are used, rather than "special" MMI web browsers.
3) The translations can be made as there is a demand for them. There is no need to create a great body of work in advance.
4) Creating this system essentially comes down to someone writing the MMI development software plus some sort of add-in for a web server to do the translation substitutions. Neither of these sounds particularly difficult for someone who can create commercial software.

One of the problems I see with the "langcodes" proposal you have described is the enormous amount of work which must be done (creating all the stock phrases in various languages) before it can be implemented. We need to keep in mind that not only do we have every human language (of which there are thousands), but even within a single language there are different phrases or descriptions used in different regions or industries.
I believe that creating the necessary collection of stock phrases in one go would be a project similar in magnitude to creating a conventional language dictionary. I believe for example, that creating the original Oxford English Dictionary took decades.

If you are looking for a multi-lingual project, might I suggest an MMI designer's phrase book? This would be a book (or rather a set of books, or better still a CD-ROM) which would essentially encompass the information you wanted to define in your "langcodes", but accessed as a reference work instead of being incorporated into a web browser. It would contain sets of common words and phrases used in MMI work, with equivalents in various languages.
A phrase book does not have to contain every possible phrase or language in order for it to be very useful. This makes for a project of manageable size. Languages and phrases can be added one by one. For example, an English-Hindi MMI phrase book would allow someone to create an MMI system (using conventional software) in both languages with a fair chance that only minor corrections would need to be made by a native speaker during the final review.

I don't know if such a thing (an MMI phrase book) already exists or not, but I can see how useful it could be. If you are still convinced
however that the "langcode" idea has merit, such a phrase book would still be the necessary starting point for any such project anyway.

**********************
Michael Griffin
**********************

Anand

Translation may work for simple SCADA messages, Company reports, and other machine generated messages. It can also work for simple communication needs.

Literature or poetic works in one language may entirely lose their charm in another.

I believe that there is vast potential in businesses being able to communicate directly with people who do not understand their language.

I am planning to start up a company for langcodes.
Interested persons can get in touch.

Jiri Baum

Johan Bengtsson:
> If the browser is unable to translate some words you do at least have
> the original word to display.

Which can be a problem if the original word happens to mean something else in the target language, so that it's not obvious it's untranslated.

An obvious and grave example would be the word "not" between English and German, where it means "emergency".

Jiri
--
Jiri Baum <[email protected]> http://www.csse.monash.edu.au/~jirib
MAT LinuxPLC project --- http://mat.sf.net --- Machine Automation Tools
tlhIngan Hol jatlhlaHchugh ghollI' Hov leng ngoDHommey'e' not yISuD
Never bet on Star Trek trivia if your opponent speaks Klingon. --Kung Foole

Anand

Let me correct what you have said. You worked on a Wonderware system where a few messages were translated into Portuguese, and the result probably was not very satisfactory.

In the same application you could not easily switch to English, because if the keyboard mapping is a bit different you will get totally incorrect messages.

The system that I propose will circumvent this need for reprogramming just to change the language. Secondly, the translation will be done by professional translators and the system will be open to future changes and analysis, so making corrections would be limited to updating the offline database.

I have received queries from people who are interested in making this happen. I am planning to involve as many people as possible, as this is a mega-resource project with tremendous utility.

Anand

Yvon Menard

May I suggest you look at a solution that is already implemented? I believe (I hope I remember correctly the book I read) the Linux KDE interface already works using "tags" and a list of ASCII files.

When configuring the interface, you specify which ASCII language file you want to use. Each ASCII language file can be opened with a normal word processor, so you can adjust it to your needs. You can even create a new language file if you like, by renaming an existing file and slowly translating its content from, let's say, English to French. Parts not yet translated would simply be shown in English, and translated parts would appear in French.

So any user wanting to convert his MMI to another language can do so. It might be better to have two language files: one for standard interface features (buttons, displays, etc.) that would come with the MMI software, and one for the user-designed parts of screens (equipment names, for example).
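The language-file scheme with its fall-back to English could be sketched in Python like this. The "key = text" file layout, the key names and the function names are assumptions made for illustration; this is not KDE's actual file format, and the French strings are unchecked.

```python
def parse_language_file(text: str) -> dict:
    """Read "key = text" lines; blank lines, # comments and empty values are skipped."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep and value.strip():
            table[key.strip()] = value.strip()
    return table


# Hypothetical file contents; in practice these would be read from disk.
ENGLISH = parse_language_file("""
btn.start = Start
btn.stop = Stop
alarm.temp_high = Converter Temperature High
""")

FRENCH = parse_language_file("""
# copied from the English file and translated bit by bit
btn.start = Démarrer
btn.stop =
alarm.temp_high = Temperature du converter haut
""")


def text_for(key: str, table: dict, fallback: dict = ENGLISH) -> str:
    """Use the translated text when present, otherwise the English one."""
    if key in table:
        return table[key]
    return fallback[key]


for key in ("btn.start", "btn.stop", "alarm.temp_high"):
    print(key, "->", text_for(key, FRENCH))
```

The two-file idea would fit the same lookup: check the user's file first, then the standard interface file, then English.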

Yvon Menard
Process engineer
Alcan, Grande-Baie plant
[email protected]

Anand

Well, the idea of making it multilingual is this: say an operator who knows Marathi comes to operate the machine. He sees the plant and alarms in Marathi, gives commands in Marathi, and so on. Another operator who relieves him knows Russian; well, he does everything in Russian and sees the actions of the previous operator in Russian. Let's say the top guy is an English-speaking gentleman; then he is able to see actions, reports and alarms in English.

In other words, people switch keyboard settings and language on the fly. Text and messages are translated, and codes are used for ease of translation.

A very big task but certainly possible.

Anand