Arabic alphabet - Presentation of the alphabet
The following table provides all of the Unicode characters for Arabic, and none of the supplementary letters used for other languages. The transliteration given is the widespread DIN 31635 standard, with some common alternatives. See the article Arabic transliteration for details and various other transliteration schemes.
Regarding pronunciation, the phonetic values given are those of the "standard" pronunciation of the fusha language as taught in universities. Actual pronunciation varies widely among the varieties of Arabic. For more details concerning the pronunciation of Arabic, consult the article Arabic phonology.
Arabic alphabet - Primary letters
Letters lacking an initial or medial version are never tied to the following letter, even within a word. As to ﺀ hamza, it has only a single graphic, since it is never tied to a preceding or following letter. However, it is sometimes 'seated' on a waw, ya or alif, and in that case the seat behaves like an ordinary waw, ya or alif.
The only compulsory ligature is lām + ʼalif. All other ligatures (yāʼ + mīm, etc.) are optional.
- lām ʼalif (lā [læː]):
ﻻ (medial ﻼ)
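In encoded text the lām-ʼalif ligature is normally stored as the two ordinary letters, and the rendering engine draws the ligature. As a minimal sketch, assuming Python's standard `unicodedata` module, the following folds the ligature presentation forms back to that two-letter sequence. The helper name `fold_presentation_forms` is illustrative only; note that Unicode labels the joined form U+FEFC as a "final form".

```python
import unicodedata

LAM_ALIF_ISOLATED = "\uFEFB"  # ARABIC LIGATURE LAM WITH ALEF ISOLATED FORM
LAM_ALIF_JOINED = "\uFEFC"    # ARABIC LIGATURE LAM WITH ALEF FINAL FORM


def fold_presentation_forms(text: str) -> str:
    """Replace presentation-form glyphs with their underlying letters (NFKC)."""
    return unicodedata.normalize("NFKC", text)


def code_points(text: str) -> list[str]:
    """List the code points of a string in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


for ligature in (LAM_ALIF_ISOLATED, LAM_ALIF_JOINED):
    folded = fold_presentation_forms(ligature)
    print(unicodedata.name(ligature), "->", code_points(folded))
```

Both forms fold to the same pair, lām (U+0644) followed by ʼalif (U+0627), which is why searching and sorting usually operate on the folded text.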
Unicode has a special glyph for the word llāh, the post-vocalic form of Allah "God."
- U+FDF2 ARABIC LIGATURE ALLAH ISOLATED FORM:
ﷲ
Combined with an initial alif, this becomes full allāh:
The latter is a work-around for the shortcomings of most text processors, which are incapable of displaying the correct vowel marks for the word "Allah". Compare the display of the composed equivalents below (the exact outcome will depend on your browser and font configuration):
- lam-lam-hā':
لله
- alif-lam-lam-hā':
الله
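To make the comparison above concrete, the sketch below lists the code points behind the single-character ligature and the two composed spellings. It assumes Python's standard `unicodedata` module; the constant names and the `describe` helper are illustrative, not part of any standard.

```python
import unicodedata

ALLAH_LIGATURE = "\uFDF2"                # single precomposed character
LAM_LAM_HA = "\u0644\u0644\u0647"        # lam, lam, heh
ALIF_LAM_LAM_HA = "\u0627" + LAM_LAM_HA  # alef, lam, lam, heh


def describe(text: str) -> list[tuple[str, str]]:
    """Return (code point, Unicode name) pairs for each character."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


for label, text in [
    ("ligature", ALLAH_LIGATURE),
    ("lam-lam-ha", LAM_LAM_HA),
    ("alif-lam-lam-ha", ALIF_LAM_LAM_HA),
]:
    print(label)
    for code_point, name in describe(text):
        print("   ", code_point, name)
```

The ligature is one character, whereas the composed spellings are three and four separate letters whose appearance depends on how the font and rendering engine shape them.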
The following are not actual letters, but rather different orthographical shapes for letters.
Notes
The ʼalif maqṣūra, commonly encoded as U+0649 (ى) in Arabic, is sometimes replaced in Persian or Urdu with U+06CC (ی), called "Farsi Yeh". This is appropriate to its pronunciation in those languages. The glyphs are identical in isolated and final form (ﻯ ﻰ), but not in initial and medial form, in which the Farsi Yeh gains two dots below (ﯾ ﯿ) while the ʼalif maqṣūra has neither an initial nor a medial form.
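The relationship between the two letters and their positional glyphs can be checked programmatically. The following is a minimal sketch, assuming Python's standard `unicodedata` module; `base_letter` is an illustrative helper name. It maps each presentation form mentioned in this note back to the letter it belongs to.

```python
import unicodedata

ALEF_MAKSURA = "\u0649"  # ARABIC LETTER ALEF MAKSURA
FARSI_YEH = "\u06CC"     # ARABIC LETTER FARSI YEH

# Presentation forms: isolated/final alef maksura, initial/medial Farsi yeh.
PRESENTATION_FORMS = ["\uFEEF", "\uFEF0", "\uFBFE", "\uFBFF"]


def base_letter(form: str) -> str:
    """Fold a positional presentation form to its underlying letter (NFKC)."""
    return unicodedata.normalize("NFKC", form)


for form in PRESENTATION_FORMS:
    letter = base_letter(form)
    print(f"U+{ord(form):04X} {unicodedata.name(form)} -> U+{ord(letter):04X}")
```

Because the two letters are distinct code points that can look identical at the end of a word, text that mixes them may fail exact string comparison even when it looks the same on screen.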
Arabic alphabet - Hamza
Main article: hamza
Initially, the letter ʼalif indicated an occlusive glottal, or glottal stop, transcribed by [ʔ], confirming the alphabet came from the same Phoenician origin. Now it is used in the same manner as in other abjads, with yāʼ and wāw, as a mater lectionis, that is to say, a consonant standing in for a long vowel (see below). In fact, over the course of time its original consonantal value has been obscured, since ʼalif now serves either as a long vowel or as graphic support for certain diacritics (madda or hamza).
The Arabic alphabet now uses the hamza to indicate a glottal stop, which can appear anywhere in a word. This letter, however, does not function like the others: it can be written alone or on a support, in which case it becomes a diacritic:
- alone: ء ;
- with a support: أ, إ (above and under an ʼalif), ؤ (above a wāw), ئ (above a dotless yāʼ or yāʼ hamza).
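Unicode mirrors this seat-plus-diacritic view: each precomposed seated hamza has a canonical decomposition into a base letter followed by a combining hamza mark. The sketch below assumes Python's standard `unicodedata` module; `split_seat` is an illustrative helper name. Note that Unicode decomposes the yāʼ form to the ordinary letter YEH (U+064A), even though it is drawn without dots.

```python
import unicodedata

SEATED_HAMZAS = ["\u0623", "\u0625", "\u0624", "\u0626"]
STANDALONE_HAMZA = "\u0621"


def split_seat(ch: str) -> tuple[str, str]:
    """Split a hamza letter into (seat, combining marks) using NFD."""
    decomposed = unicodedata.normalize("NFD", ch)
    return decomposed[0], decomposed[1:]


for ch in SEATED_HAMZAS + [STANDALONE_HAMZA]:
    seat, marks = split_seat(ch)
    mark_names = [unicodedata.name(m) for m in marks] or ["(no mark)"]
    print(unicodedata.name(ch), "=", unicodedata.name(seat), "+", mark_names)
```

The standalone hamza has no decomposition, so it comes back unchanged with an empty mark string.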
Arabic alphabet - Diacritics
Main article: Harakat
Arabic short vowels are generally not written, except sometimes in sacred texts (such as the Qurʼan) and in didactic material; such texts are known as vocalised texts. Occasionally short vowels are marked where a word would otherwise be ambiguous and the ambiguity cannot be resolved from context.
Short vowels may be written with diacritics placed above or below the consonant that precedes them in the syllable. (All Arabic vowels, long and short, follow a consonant. Contrary to appearances, there is a consonant at the start of a name like Ali, in Arabic ʻAliyy, or a word like ʼalif.)
Long "a" following a consonant other than hamza is written with a short-"a" mark on the consonant plus an ʼalif after it. Long "i" is a mark for short "i" plus a yāʼ, and long "u" is a mark for short "u" plus a wāw (so aā = ā, iy = ī and uw = ū).
Long "a" following a hamza sound may be represented by an ʼalif madda or by a floating hamza followed by an ʼalif.
In an un-vocalised text (one in which the short vowels are not marked), the long vowels are represented by the consonant in question (ʼalif, yāʼ, wāw). Long vowels written in the middle of a word are treated like consonants taking sukūn (see below) in a text that has full diacritics.
For clarity, vowels will be placed above or below the letter د dāl so it is necessary to read the results [da], [di], [du], etc. Please note, د dāl is one of the six letters that do not connect to the left, and is used in this demonstration for clarity. Most other letters connect to ʼalif, wāw and yāʼ.
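As a rough illustration of how these combinations are encoded, the sketch below builds the short and long syllables on dāl from individual code points. It assumes Python's standard `unicodedata` module; the constant names and the `SYLLABLES` table are illustrative only.

```python
import unicodedata

DAL = "\u062F"
FATHA, KASRA, DAMMA = "\u064E", "\u0650", "\u064F"  # short a, i, u marks
ALIF, YA, WAW = "\u0627", "\u064A", "\u0648"        # long-vowel letters

# The vowel mark follows the consonant it sits on; a long vowel adds a letter.
SYLLABLES = {
    "da": DAL + FATHA,
    "di": DAL + KASRA,
    "du": DAL + DAMMA,
    "dā": DAL + FATHA + ALIF,
    "dī": DAL + KASRA + YA,
    "dū": DAL + DAMMA + WAW,
}

for reading, text in SYLLABLES.items():
    points = " ".join(f"U+{ord(ch):04X}" for ch in text)
    print(reading, text, points)
```

The short-vowel marks are combining characters (Unicode category `Mn`), so they are stored after the consonant even though they are drawn above or below it.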
Main article: shadda
šadda (ّ) marks the gemination (doubling) of a consonant; a kasra, when present, is placed between the shadda and the geminated (doubled) consonant.
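The position described above is a matter of rendering. In encoded text, a separate question is the order in which the shadda and the vowel mark are stored. The sketch below, assuming Python's standard `unicodedata` module, shows that Unicode normalization puts both typing orders into a single canonical order based on each mark's combining class; the constant names are illustrative.

```python
import unicodedata

DAL = "\u062F"
SHADDA = "\u0651"
KASRA = "\u0650"

typed_shadda_first = DAL + SHADDA + KASRA
typed_kasra_first = DAL + KASRA + SHADDA

# Canonical ordering sorts adjacent marks by ascending combining class.
print("kasra class:", unicodedata.combining(KASRA))
print("shadda class:", unicodedata.combining(SHADDA))

normalized_a = unicodedata.normalize("NFC", typed_shadda_first)
normalized_b = unicodedata.normalize("NFC", typed_kasra_first)
print(normalized_a == normalized_b)
```

Normalizing before comparison is one way to make two visually identical spellings compare equal regardless of the order in which the marks were typed.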
An Arabic syllable can be open (ended by a vowel) or closed (ended by a consonant).
- open: CV [consonant-vowel] (long or short vowel)
- closed: CVC (short vowel only)
When the syllable is closed, we can indicate that the consonant that closes it does not carry a vowel by marking it with a sign called sukūn (ْ). This removes ambiguity, especially when the text is not vocalised: a standard text is composed only of a series of consonants, so the word qalb, "heart", is written qlb. Sukūn tells us where not to place a vowel: qlb could, in effect, be read /qVlVbV/, but written with a sukūn over the l and the b, it can only be interpreted as the form /qVlb/; we write this قلْبْ. This is one stage from full vocalization, where the a vowel would also be indicated by a fatḥa: قَلْبْ.
The Qurʼan is traditionally written in full vocalization. Outside of the Qurʼan, putting a sukūn above a yāʼ which indicates [iː], or above a wāw which stands for [uː], is extremely rare, to the point that a yāʼ with sukūn will be unambiguously read as the diphthong [ai] and a wāw with sukūn will be read [au].
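The three stages of the qalb example (bare consonants, sukūn only, full vocalization) differ only in their combining marks. As a minimal sketch, assuming Python's standard `unicodedata` module, the hypothetical helper `strip_marks` below removes every nonspacing mark and recovers the consonant skeleton from either vocalised spelling.

```python
import unicodedata

QAF, LAM, BA = "\u0642", "\u0644", "\u0628"
FATHA, SUKUN = "\u064E", "\u0652"

QALB_BARE = QAF + LAM + BA                          # q-l-b
QALB_SUKUN_ONLY = QAF + LAM + SUKUN + BA + SUKUN    # sukun on l and b
QALB_FULL = QAF + FATHA + LAM + SUKUN + BA + SUKUN  # fatha on q as well


def strip_marks(text: str) -> str:
    """Drop all nonspacing marks (category Mn), leaving only the letters."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")


for spelling in (QALB_BARE, QALB_SUKUN_ONLY, QALB_FULL):
    print(spelling, len(spelling), strip_marks(spelling) == QALB_BARE)
```

Stripping by category is a blunt instrument: it also removes shadda and any other nonspacing mark, which may or may not be what a given application wants.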
The letters m-w-s-y-q-ā (موسيقى with an ʼalif maqṣūra at the end of the word) will be read most naturally as the word mūsīqā ("music"). If you were to write sukūns above the wāw, yāʼ and ʼalif, you'd get موْسيْقىْ, which would be read as *mawsaykay (note that an ʼalif maqṣūra is an ʼalif and never takes sukūn). The word, entirely vocalised, would be written مُوْسِيْقَى in the Qurʼan (if it happened to appear there!), or مُوسِيقَى elsewhere. (The Quranic spelling would have no sign above the final ʼalif maqṣūra, and a miniature ʼalif above the qāf, which is a valid Unicode character but one that most Arabic computer fonts could not in fact display as of 2006.)
A sukūn is not placed on word-final consonants, even if no vowel is pronounced, because fully vocalised texts are always written as if the iʻrāb vowels were in fact pronounced. For example, ʼaḥmad zawǧ šarr, meaning "Ahmed is a bad husband", is treated for the purposes of Arabic grammar and orthography as if it were still pronounced with full iʻrāb: ʼaḥmadu zawǧun šarrun.
Adapted from the Wikipedia article "Presentation of the alphabet", under the GNU Free Documentation License. Please also see http://en.wikipedia.org/wiki