The Baheyeldin Dynasty
The journey for wisdom starts with knowledge

Arabic on the Internet: History of Arabic on Computers


By Khalid on 2004/12/19 - 21:18, last updated 2005/02/09 - 22:59

This page provides a historical overview of how computers handled Arabic in the past, and how they handle it at present.

Arabization in the computer world was initially a hardware vendor activity. Vendors either had in-house resources (Systems Engineers), or subcontracted others to do the job for them.

Some vendors established centers of expertise to develop Arabic support, such as the NCR-ALIF (Arabic Language Integration Facility), first in Nicosia, Cyprus, then in Cairo, Egypt. IBM and ICL had similar facilities.

Each major vendor went through several standards of its own as improvements were gradually made.

7-bit In Lower Case

Initially, Arabization was very rudimentary. The lowercase English characters were replaced by Arabic characters, and text was stored as 7 bits. Lowercase English was not used much on computers in the 1960s and 1970s, until UNIX came into being. Some printers were not even capable of printing lowercase English letters. This made the replacement of lowercase somewhat acceptable in those days. Texas Instruments minicomputers used this scheme for Arabization in Egypt in the early 1980s.

Limited Character Set

8-bit character sets were then used. These placed the Arabic characters in the upper half of the code table, above 7-bit ASCII (128 decimal / 80 hexadecimal and above).

Initially, this was somewhat primitive, with just one representation for the several different shapes an Arabic character could have. For example, the second letter of the Arabic alphabet, BA ب, and its like (TA ت, THA ث) were represented only in the beginning-of-word shape, which also served as the end-of-word shape.

An example of this era is NCR-64. It provided limited shaping: the above three letters had only one variant, instead of separate shapes for the beginning, middle and end of a word. The 'Ein ع and Ghein غ letters had only two variants, not the usual four.

Moreover, this character set catered for Farsi (Persian) as well, and had the 4 extra letters for the Gaf, Pa, Ja and Tch sounds. It also had a sorting problem: for some unknown reason, it placed the Waw و before the Ha' ه!

Enhanced Character Set

As improvements were made, richer character sets came into being, which had more shapes and were thus more visually acceptable.

An example is NCR-96. Storage was still in 7-bit most of the time, and terminals used SI/SO (the control characters Shift In and Shift Out) to switch languages. Auto shaping was done in the ROM of the terminal, which was a step forward. For printing, applications had to call special system routines to do the automatic shaping (called Context Analysis at NCR).
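To give a feel for how SI/SO switching works on a 7-bit stream, here is a minimal Python sketch. It is not the NCR-96 layout, which is not documented here. It assumes that Shift Out (0x0E) switches to Arabic and Shift In (0x0F) switches back to Latin, and it borrows the ISO 8859-6 table for the Arabic side by setting the 8th bit of each 7-bit code. The function name `decode_shifted` is made up for the example.

```python
SO, SI = 0x0E, 0x0F  # ASCII Shift Out / Shift In control codes

def decode_shifted(stream: bytes) -> str:
    """Decode a 7-bit stream in which SO/SI toggle between Latin and Arabic.

    Assumption: in Arabic mode, a 7-bit code maps to the ISO 8859-6
    character at (code | 0x80). Real vendor tables differed.
    """
    out = []
    arabic = False
    for b in stream:
        if b == SO:
            arabic = True           # switch to the Arabic set
        elif b == SI:
            arabic = False          # switch back to Latin
        elif arabic and b > 0x20:   # space and control codes stay shared
            out.append(bytes([b | 0x80]).decode("iso8859_6", errors="replace"))
        else:
            out.append(chr(b))
    return "".join(out)

stream = b"ID:" + bytes([SO, 0x53, 0x64, 0x47, 0x65, SI]) + b"!"
print(decode_shifted(stream))
```

The same bytes mean different letters depending on what came before them, which is why such data could not be sorted or searched without first tracking the shift state.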

Still, every vendor had their own character sets, and interchanging data was quite a chore, just like the pre-ASCII days in the 1950s and 1960s in the West.

IBM EBCDIC and its idiosyncrasies

IBM, since it used EBCDIC, had a peculiar character set in which Arabic and English were interspersed. Whether the 8th bit was on or off was no guarantee that a character was an Arabic letter. Sorting was problematic too.

Standardized Character Sets

Finally, the era of standards arrived. These were cross-vendor standards accepted by the industry, and set by standards bodies.

ASMO 449

One of the first vendor-independent standards was ASMO 449 (ASMO being the Arab Organization for Standardization and Metrology). However, it was still 7-bit, and required escape characters (normally the plain text braces { and }, which were rather odd to use as escape codes on the terminal).

ASMO 708

By the time ASMO 708 was introduced, things were getting better. This was a true transparent 8-bit vendor-independent standard.

Terminals and terminal emulators were now sophisticated enough to do the shaping and work in 8 bits, distinguishing between Latin and Arabic automatically by whether the 8th bit was set or not. They also obeyed certain escape sequences to shift the keyboard language and direction, as well as the screen language and direction.
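The 8th-bit test described above is simple enough to sketch in a few lines of Python. This is only an illustration, using the ISO 8859-6 codec from Python's standard library as a stand-in for an ASMO 708 terminal; the helper name `split_runs` is made up for the example.

```python
from itertools import groupby

def split_runs(data: bytes):
    """Split 8-bit text into (is_arabic, bytes) runs using only the high bit."""
    return [(high, bytes(group))
            for high, group in groupby(data, key=lambda b: b >= 0x80)]

sample = "OK ".encode("iso8859_6") + "سلام".encode("iso8859_6")
for is_arabic, chunk in split_runs(sample):
    label = "Arabic" if is_arabic else "Latin"
    print(label, chunk.hex(" "), chunk.decode("iso8859_6"))
```

Note that spaces, digits and punctuation have the 8th bit clear, so they always land in the Latin runs. A real terminal needed further rules to decide their direction.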

Printing was done in the firmware of the printers (e.g. from Alis). Where there were no Alis printers, the UNIX spooler did the work through a custom filter, written in C, that did the context analysis.
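The original filter is not reproduced here, but the idea behind context analysis can be sketched in Python. The sketch below is an assumption-laden illustration, not the NCR or Alis algorithm: it picks the isolated, initial, medial or final shape of each letter by looking at its neighbours, and it uses the Unicode presentation-form names (via the standard `unicodedata` module) as the shape table. A letter is treated as joining to the next one only if Unicode names an INITIAL FORM for it. Diacritics and the Lam-Alef ligature are ignored. The names `has_form` and `shape` are made up for the example.

```python
import unicodedata

def has_form(ch: str, form: str) -> bool:
    """True if Unicode names a presentation form (e.g. 'INITIAL FORM') for ch."""
    try:
        unicodedata.lookup(f"{unicodedata.name(ch, '')} {form}")
        return True
    except KeyError:
        return False

def shape(word: str) -> str:
    """Replace each Arabic letter with its contextual presentation form."""
    out = []
    for i, ch in enumerate(word):
        prev = word[i - 1] if i > 0 else ""
        nxt = word[i + 1] if i + 1 < len(word) else ""
        # A letter connects backwards if the previous letter can join forward.
        joins_prev = (bool(prev) and has_form(prev, "INITIAL FORM")
                      and has_form(ch, "FINAL FORM"))
        # A letter connects forwards if it can join and the next one accepts it.
        joins_next = (bool(nxt) and has_form(ch, "INITIAL FORM")
                      and has_form(nxt, "FINAL FORM"))
        if joins_prev and joins_next:
            form = "MEDIAL FORM"
        elif joins_prev:
            form = "FINAL FORM"
        elif joins_next:
            form = "INITIAL FORM"
        else:
            form = "ISOLATED FORM"
        try:
            out.append(unicodedata.lookup(f"{unicodedata.name(ch, '')} {form}"))
        except KeyError:
            out.append(ch)  # not an Arabic letter: leave unchanged
    return "".join(out)

for ch in shape("باب"):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```

In the word باب the first BA gets its initial form, the Alef its final form, and the last BA its isolated form, because Alef never joins to the letter after it.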

ASMO 708 was adopted by the International Organization for Standardization (ISO) as the ISO 8859-6 standard.

Still, in some cases, vendors did silly things: NCR and ICL each had their own ASMO 708 derivative. For example, in Egypt customers insisted that the Lam-Alef لا be one character, and not stored as a separate Lam ل and Alef ا! Also, the Ya' ي at the end of a word, with two dots under it, was virtually unknown in Egypt, and the version without the dots (actually an Alef sound) was used instead. This led to NCR-ASMO and similar vendor variants.
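As a point of comparison only, Unicode ended up on the "two letters" side of the Lam-Alef question: the ligature exists as a compatibility character, but it decomposes to a separate Lam and Alef. A short Python check with the standard `unicodedata` module shows this. It says nothing about how the NCR or ICL variants were laid out.

```python
import unicodedata

ligature = "\uFEFB"  # single-character Lam-Alef ligature (isolated form)
separate = unicodedata.normalize("NFKC", ligature)

print(unicodedata.name(ligature))
print([unicodedata.name(c) for c in separate])
```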

Other approaches

ICL DRS-80 system

Even in the mid 1980s, some companies did get Arabization right, such as ICL (later bought by Fujitsu). Their DRS-80 system had totally seamless Arabization. One could enter Arabic right in the program code using the editor, be it COBOL or BASIC, a feat which was uncommon in other systems in those days, where development environments, editors and other programmers' tools could not handle 8-bit input.

This system also had other features, such as calling programs written in one language from another (for example, between COBOL and BASIC), and the ability to recover from power failures via a dump facility.

The PC Revolution

Microsoft Code Pages

The advent of the PC in the early 1980s also influenced how Arabization (and other internationalization) was done.

Several code pages were developed by Microsoft, including CP 720 for Arabic DOS. Later, when Windows arrived, another code page was developed, Windows CP 1256.
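As a small illustration of why several coexisting code pages made data interchange awkward, the Python sketch below encodes the same two-letter word with three Arabic encodings that ship with Python's standard codecs (`cp720`, `cp1256` and `iso8859_6`). The sample word is an arbitrary choice for the example.

```python
word = "طب"  # arbitrary sample: the letters TAH and BEH

encoded = {name: word.encode(name) for name in ("cp720", "cp1256", "iso8859_6")}
for name, raw in encoded.items():
    print(f"{name:10} {raw.hex(' ')}")

# Decoding bytes with the wrong code page gives no error, just different text.
mismatch = encoded["cp1256"].decode("iso8859_6", errors="replace")
print("round trip survived wrong code page:", mismatch == word)
```

Nothing in the bytes themselves says which code page produced them, so the receiving side had to know, or guess.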

Unicode

Unicode finally arrived. It was an international standard, and neither vendor specific nor developed by Arabic standards organizations.

In its original form, the scheme used 2 bytes (16 bits) to represent every character, from Arabic to Chinese.

UTF-8

Because of the limitations of a fixed 2-byte encoding, some scientists developed a scheme called UTF-8, which is basically an encoding of Unicode in which existing ASCII is still encoded in just one byte (8 bits).
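The difference is easy to see by encoding a few characters both ways. The Python sketch below uses only the built-in `utf-8` and `utf-16-be` codecs; the three sample characters are arbitrary choices for the example.

```python
samples = {"Latin A": "A", "Arabic BA": "ب", "Chinese zhong": "中"}

for label, ch in samples.items():
    utf8 = ch.encode("utf-8")
    utf16 = ch.encode("utf-16-be")
    print(f"{label:14} U+{ord(ch):04X}  utf-8: {utf8.hex(' '):10} utf-16: {utf16.hex(' ')}")
```

The ASCII letter takes one byte in UTF-8 and is byte-for-byte identical to plain ASCII, while the Arabic letter takes two bytes and the Chinese character three. In the 2-byte form, all three take two bytes each.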

You can read How UTF-8 was born, by Rob Pike.

Also, Joel Spolsky has an article on the use of Unicode in software from a developer's perspective. Remember that this is just a standard for encoding, not presentation. Therefore, it is not Arabic specific, nor does it address the problems with direction, but it is a standard nonetheless, and hence very useful in general.

Still other developers have proposed other encoding systems.

© Copyright 1999-2025 The Baheyeldin Dynasty. All rights reserved.