Local Languages Initiative

Locale information for Sinhala and Tamil, Sri Lanka


A locale is a string identifier that refers to the linguistic and cultural preferences associated with a language, script and country. Companies need locale information to localize their software and to adapt it to the conventions of different languages, such as date formats, time formats, time zones, currency values, collation sequence, sorting order, and the grouping of digits in sets of either threes or twos in currency values.
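As a rough illustration of one such convention, the sketch below groups the digits of a number either in threes throughout, or with a final group of three preceded by groups of two. The function name `group_digits` and its parameters are hypothetical and are not taken from CLDR; which pattern a given locale uses is defined by the locale data itself, not by this code.

```python
def group_digits(n, first=3, rest=3, sep=","):
    """Group the digits of an integer for display.

    first: size of the rightmost group.
    rest:  size of every group to the left of it.
    Hypothetical helper for illustration; real software should read the
    grouping pattern from locale data.
    """
    digits = str(abs(int(n)))
    head, tail = digits[:-first], digits[-first:]
    groups = [tail]
    # Peel groups of size `rest` off the right-hand end of what remains.
    while head:
        groups.insert(0, head[-rest:])
        head = head[:-rest]
    sign = "-" if n < 0 else ""
    return sign + sep.join(groups)


print(group_digits(1234567))          # groups of three
print(group_digits(1234567, rest=2))  # final group of three, then twos
```

Keeping the group sizes as parameters mirrors the idea that the convention belongs to the locale rather than to the formatting code.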
 
“The Unicode CLDR provides key building blocks for software to support the world's languages. CLDR is by far the largest and most extensive standard repository of locale data. This data is used by a wide spectrum of companies for their software internationalization and localization.   The Common Locale Data Repository (CLDR) provides a general XML format for the exchange of locale information for use in application and system software development, combined with a public repository for a common set of locale data generated in that format. The consortium's goal is to enable people around the globe to use computers in their own languages” - www.unicode.org
 
ICTA has carried out some preliminary work in this area with regard to Sinhala and has defined some basic locale information. The Sinhala collation sequence, which is part of the required Sinhala locale information, is now also a national standard. It is now necessary to comprehensively review the information needed, complete the CLDR Sinhala (si_LK) pages on the Unicode site accordingly, and define the relevant information for Tamil, Sri Lanka (ta_LK). Work on these two projects has commenced.
 
 
 
 

Internationalized Domain Names


To implement internationalized domain names (IDNs) in Sinhala and Tamil, ICTA set up a Task Force on IDNs comprising representatives from ICTA, the Telecommunications Regulatory Commission (TRCSL), the LK Domain Registry (LKNIC), the Department of Official Languages, the UCSC, language experts and two ISPs.

It was imperative that the subject of IDNs be addressed as soon as possible. The subject was already being addressed in the international arena, and it was necessary for Sri Lanka to define the equivalent of its top level domain .LK in Sinhala and Tamil.
Implementing Sinhala and Tamil IDNs will affect all local language users and will have a national level impact. The Task Force on IDNs commenced work in May 2008. A public consultation process was followed, and advertisements were placed in national newspapers calling for those interested to attend a workshop held in August 2008. Specific invitations were also sent to stakeholders. At the public consultation workshop, the IDN Task Force was able to ascertain the views of a wider, representative gathering. The views of stakeholders and other interested parties were also collected through a questionnaire.
 
After reviewing the outcome of the public workshop and after further extensive discussion, the following IDN ccTLDs for Sri Lanka were unanimously agreed on:
 
- It was agreed that the IDN top level domain for Sri Lanka in Sinhala would be .ලංකා (.lanka)
- It was agreed that the IDN top level domain for Sri Lanka in Tamil would be .இலங்கை (.ilangei)
 
Consequently, the IDN equivalents of Sri Lanka’s top level domain .LK in Sinhala and Tamil, .ලංකා and .இலங்கை respectively, were announced at the inauguration of the eAsia Conference on 6 December 2009. Sri Lanka’s request for the top level domains .ලංකා and .இலங்கை was approved by the Internet Corporation for Assigned Names and Numbers (ICANN) in March 2010. These two domains were launched by the LK Domain Registry in June 2010.
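For readers wondering how such names travel through the existing DNS, which carries ASCII labels, the following minimal sketch converts a single label to an ASCII-compatible form using Punycode and the "xn--" prefix, and back again. The helper names `to_ace_label` and `from_ace_label` are hypothetical, and the sketch deliberately skips the normalization and validation steps that full IDNA processing requires, so it should be read as an illustration rather than a registry-grade implementation.

```python
def to_ace_label(label: str) -> str:
    """Return an ASCII-compatible form of one domain label.

    Hypothetical helper: applies Punycode only, with no IDNA
    normalization or validation.
    """
    if label.isascii():
        return label  # plain ASCII labels are left unchanged
    return "xn--" + label.encode("punycode").decode("ascii")


def from_ace_label(label: str) -> str:
    """Reverse to_ace_label for labels that carry the xn-- prefix."""
    if label.lower().startswith("xn--"):
        return label[4:].encode("ascii").decode("punycode")
    return label


for tld in ("ලංකා", "இலங்கை"):
    ace = to_ace_label(tld)
    # Print the original label, its ASCII form, and the round trip.
    print(tld, "->", ace, "->", from_ace_label(ace))
```

Production code would normally rely on a complete IDNA implementation instead of calling the Punycode codec directly.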
 

SLS 1134 : 2004 (Sinhala ICT standard)

Sri Lanka Sinhala Character Code for Information Interchange, SLS 1134 : 2004

The second revision of the Sinhala ICT Standard was standardized by the Sri Lanka Standards Institution in 2004 as the Sri Lanka Sinhala Character Code for Information Interchange, SLS 1134 : 2004. The International Organization for Standardization (ISO) has included the Sinhala Character Code for Information Interchange in the standard Information technology - Universal multiple-octet coded character set, ISO/IEC 10646-1.

This second revision of SLS 1134 specifies the coding of the set of Sinhala characters for use in ICT, together with the code sequences and keyboard sequences. It also provides a revised keyboard layout, based on the layout in the original version of this standard, which in turn is based on the Wijesekara typewriter keyboard. This revision retains compliance with ISO/IEC 10646-1.

 SLS 1134, Part 1, Sinhala Collation Sequence

Sinhala collation is based on the order of Indic letters derived from Sanskrit, but has evolved its own conventions over the years. The dictionaries and other reference works published since the 19th century agree on the basic Sinhala collation sequence, but disagree on the details.

 ICTA’s Local Languages Working Group noted that there were a number of issues regarding the collation sequence of Sinhala that needed to be clarified.

Consequently, ICTA requested the University of Colombo School of Computing (UCSC) to research the issue of Sinhala collation and recommend a suitable collation algorithm. UCSC subsequently submitted a report on the issue. ICTA studied the report and recommended that two sequences be defined: a dictionary collation sequence for use in compiling dictionaries and other scholarly works, and a simple collation sequence for use in data processing and other activities involving lists of personal and other names.

 The dictionary collation sequence:

 The dictionary collation is the canonical collation order, and should be used when correct collation, based on the linguistic derivation of Sinhala, is required, e.g. for a dictionary. This is recommended for use in scholarly and academic activities.

 The simple collation sequence:

The simple collation is to be used for preparing lists of names, places, etc. and will produce the same results as the dictionary collation sequence when collating personal names, place names and other common data. This algorithm is easier to implement, which encourages vendors to support Sinhala in their products, and it produces a result that will not confuse a naive user who is not aware of the subtleties of the language.

 The two collations will produce different results only between words with the letters ජ්‍ය or ඣ and the letter ඥ in a given position.
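The relationship between two such sequences can be sketched with table-driven sort keys. In the example below, the two ordering tables are placeholders invented for illustration and are NOT the sequences defined in SLS 1134; they differ only in the relative rank of ඣ and ඥ. The helper name `make_key` is likewise hypothetical. The point is only that two tables which differ in a few ranks sort most data identically and diverge only when the affected letters compete in the same position.

```python
# Placeholder orderings for illustration only. These are NOT the
# dictionary or simple sequences from SLS 1134; they differ solely in
# the relative rank of two letters.
ORDER_A = ["ක", "ග", "ජ", "ඣ", "ඥ", "ත"]
ORDER_B = ["ක", "ග", "ජ", "ඥ", "ඣ", "ත"]


def make_key(order):
    """Build a sort-key function from an ordered list of letters."""
    rank = {ch: i for i, ch in enumerate(order)}

    def key(word):
        # Letters in the table sort by rank; anything else sorts after
        # all table entries, by code point.
        return [(0, rank[ch]) if ch in rank else (1, ord(ch)) for ch in word]

    return key


key_a, key_b = make_key(ORDER_A), make_key(ORDER_B)

common = ["තක", "ගත", "කග"]    # no affected letters
affected = ["කඥ", "කඣ", "කජ"]  # affected letters in the same position

print(sorted(common, key=key_a) == sorted(common, key=key_b))
print(sorted(affected, key=key_a) == sorted(affected, key=key_b))
```

A real implementation would also have to handle vowel signs, conjuncts and other multi-character units, which this sketch ignores.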

 SLS 1134, Part 2, Requirements and Method of Test.

Companies have created Sinhala “packs”, keyboards, keyboard drivers and Unicode compatible fonts, which they claim are compatible with the standard. At the same time, there are numerous requests from organizations for recommendations on such products.

Therefore, a standard was defined for testing whether products conform to SLS 1134 : 2004, so that conforming products can be given SLS certification. This standard, Part 2 of SLS 1134 : 2004, defines the products to be tested, the test criteria and the test method. Its scope is limited to computers.

 

 