Question : Non-latin characters in SMS on Mobile Phones?

Hi experts

I am developing an SMS application for different geographical markets, and I need some information about mobile phones in countries with non-Latin scripts.

My Norwegian mobile phone (Ericsson T68i), for example, cannot show Cyrillic characters, but I presume that a Russian can buy a phone in Russia that can. Is that correct?

And will this phone be able to send and receive SMS with Cyrillic letters?

If so, I suspect that KOI8-R is the character set in use. Is that right? Might other character sets also be in use on Russian phones?

I would appreciate a link that lists which character sets are standard or in use in different countries worldwide.

And finally, since an SMS is 7-bit, the upper half of the KOI8-R table (or of any other non-Latin character set) is accessed with the use of two characters (Esc + something). Is that correct?


(If you are not familiar with Russian, you may use any other non-Latin script as an example.)

If possible, include references to websites.

Additional points will be given for overall info and for country-specific info.

kind regards

Øyvind Strøm

Answer : Non-latin characters in SMS on Mobile Phones?

"Note : for asian markets, I think you don't need to worry too much, since all  mobile phones over there should support UCS2"

All phones MUST support UCS-2 and GSM-7. That's the minimum standard. Phones are not, however, required to DISPLAY all Unicode characters. They do seem to display all GSM-7 characters, so that much is guaranteed.

Most new phones handle a variety of 8-bit codes like ISO 8859-5 or KOI8-R. In fact, many old Nokia phones support the old Norwegian PC OEM character set. The real problem is that you have NO idea which character sets a given phone supports beyond the two standard ones.

In fact some older phones won't receive 8-bit messages in ISO-8859-1.

The problem with UCS-2 is the 70-character limit. I started out sending concatenated SMS messages whenever the 7-bit GSM limit of 160 characters was exceeded, and that led to some SPs or phones garbling the messages. So I settled for finding a word break below 160 and sending "unconcatenated" messages.
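The word-break approach can be sketched roughly as follows. This is only a minimal illustration, not my production code: the function name `split_at_word_breaks` is made up for the example, and it simply counts characters, so it assumes every character costs one unit against the limit (pass `limit=70` for UCS-2 messages).

```python
def split_at_word_breaks(text, limit=160):
    """Split text into standalone parts of at most `limit` characters.

    Hypothetical helper for illustration: breaks at the last space that
    fits, and falls back to a hard cut if a single word is too long.
    """
    parts = []
    remaining = text.strip()
    while len(remaining) > limit:
        # Last space at or before position `limit`
        cut = remaining.rfind(" ", 0, limit + 1)
        if cut <= 0:
            cut = limit  # no usable space: cut mid-word
        parts.append(remaining[:cut].rstrip())
        remaining = remaining[cut:].lstrip()
    if remaining:
        parts.append(remaining)
    return parts


if __name__ == "__main__":
    for part in split_at_word_breaks("word " * 50):
        print(len(part), repr(part[:20]))
```

Each part is then sent as an ordinary, independent message, so nothing depends on the handset or SP reassembling anything.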

Although GSM-7 has an escape to subsequent pages, virtually nobody uses the extensions. So you are stuck with GSM-7 or UCS-2 and no concatenation, plus possibly a local code page that you would have to parametrise on installation. If the message contains only GSM-7 characters, send a normal 7-bit PDU; if it contains only local characters, send it as 8-bit and hope; otherwise send UCS-2. And break at word boundaries rather than concatenating.
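That decision can be sketched like this. It is a rough illustration under stated assumptions: `choose_encoding` and the `local_codec` parameter are names invented for the example, the character list is the basic GSM 7-bit default alphabet only (anything from the escape/extension table, such as the euro sign, is deliberately treated as non-GSM-7), and the returned labels are just strings for the caller to act on, not values from any SMS library.

```python
# Basic GSM 7-bit default alphabet, without the escape/extension table.
GSM7_BASIC = set(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)


def choose_encoding(text, local_codec="koi8_r"):
    """Pick 'gsm7', the local 8-bit codec, or 'ucs2' for a message.

    Hypothetical helper: `local_codec` is the installation-time setting,
    given as a Python codec name, or None if no local code page is used.
    """
    if all(ch in GSM7_BASIC for ch in text):
        return "gsm7"
    if local_codec is not None:
        try:
            text.encode(local_codec)
            return local_codec  # 8-bit message; the handset may not cope
        except UnicodeEncodeError:
            pass
    return "ucs2"


if __name__ == "__main__":
    for sample in ("Hello", "Øyvind Strøm", "Привет", "你好"):
        print(sample, "->", choose_encoding(sample))
```

The limit used for splitting then follows from the result: 160 for GSM-7, 140 for an 8-bit message, and 70 for UCS-2.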

I actually have a database full of my users. I have even thought about asking them what phone type they have when they register. But then again, I'd be here till Christmas programming in all the variations!