Tocharian — 700 AD, N.W. China

Tocharian "cha"

In the early part of the 20th century, archeologist Aurel Stein discovered manuscripts in northwestern China in a script that had been lost for centuries, in a language that completely stunned the linguistic community.

The first surprise was that it was an Indo-European language.  That wasn’t such a huge surprise: the Greeks had come tromping through almost 1000 years earlier, and northwest China was along the Silk Route, not far from the historically powerful Indo-European-speaking Farsi speakers and the Indo-European-speaking Sanskrit speakers.

The second surprise was that it wasn’t Greek or Greek-derived, nor Farsi or Farsi-derived, nor Sanskrit or Sanskrit-derived.  It appeared to have been in the area for an awfully long time; it wasn’t that, say, some Finns had taken a wrong turn somewhere in 500 BC.  Tocharian was a completely novel language.

The third, and biggest, surprise was that its structure was closer to the -European side of the Indo-European languages than to the Indo- side, despite being geographically closer to the areas that spoke languages on the Indo- side.

Despite the language being clearly European-ish, the writing system is Asian: it is an abugida (with “a” as the implied vowel) clearly derived from Brahmi.  Most of the surviving (and found) documents in Tocharian were translations of Sanskrit Buddhist documents.

Tocharian script had extra decorations to denote aspiration and nasalization.

Links: Wikipedia, Ancient Scripts, Omniglot

Posted in Abugida, Rating: 5 "Whoa!!"

Naxi Geba — 1200 AD? S. China

Naxi Geba character

Like its sibling (parent?) script, Naxi Dongba, Naxi Geba is highly idiosyncratic and used mostly for religious writings.  Unlike Naxi Dongba, Naxi Geba is a syllabary, but different people used different symbols for the same syllable. This makes it less than ideal as an interpersonal communications medium.

This raises another aspect of the question: “What is a writing system?”  Does a writing system need to be shared to be a writing system?  If I create a language and only make notes to myself, is it a writing system?  What if I then forget what it meant — is it still a writing system?  What if I make marks on paper that might be meaningful to someone somewhere in the universe?  I do not know the answer to that, and indeed that question might not have a universal answer.  Its answer might be as idiosyncratic as the Naxi Geba syllabary.

Links: Naxi Pictographic and Syllabographic Scripts, Omniglot, Sinoglot, Wikipedia

Posted in now ceremonial, Rating: 3 "I did not know that", Syllabaries

Naxi Dongba — 600 AD? S. China

Naxi Dongba excerpt

Much like Aztec and Mixtec, Naxi Dongba is a highly pictographic communication system.  Like Aztec and Mixtec, it’s almost not a writing system.  If you look at a picture of the writing, it looks more like what we think of as “drawings” than as “writing”.

Indeed, this writing system is used less for communicating with other people and more for communicating with oneself and with the gods: it is mostly used as a mnemonic system by priests, and the edges of the book are burned as part of ceremonies.  It is highly logographic (though the logograms are wildly idiosyncratic), with pictorial rebuses for words that don’t have a logogram.

After the revolution in China and then the Cultural Revolution, Naxi Dongba nearly became extinct.  The largest collections of Naxi Dongba manuscripts are those that had already left the country.  It is still highly threatened.

Naxi Dongba, like Aztec and Mixtec, raises the question of “What is a writing system?”  Today’s Wikipedia article on Writing Systems says “A writing system is a symbolic system used to represent elements or statements expressible in language.”  But how much abstraction into symbols is required before it moves from drawing to writing?  Does Naxi Dongba qualify?  Do Renaissance religious paintings qualify?  I do not know the answers.

Links: Wikipedia, Omniglot, Naxi Pictographic and Syllabographic Scripts, Sinoglot

Posted in Logograms, now ceremonial, Rating: 5 "Whoa!!"

Modern Yi — 1974 AD, China

Modern Yi "ot"

In 1974, the Chinese government decided to make a syllabary for the Yi language, based on the symbols in Classic Yi.  As with the Zhuang in the 1950s, it isn’t clear to me why if it was such a good idea to come up with a phonetic script for a minority language, they didn’t think it was an equally good idea for Chinese script.

The syllabary they came up with is stupid-huge.  There are (depending on how you count) at least 756 characters, many more than any other syllabary.  Part of the problem is that Yi has a very rich complement of sounds: 43 distinguishable consonant clusters, 8 distinguishable vowels, and four distinguishable tones in the reference dialect.  (There are other dialects of Yi which use five.)

The astute reader will note that 43*8*5 (1720) is much greater than 756.  Two things shrink the count.  One tone’s glyphs are derived from another tone’s by decorating the glyph with a little hat.  There are also some combinations of vowels, consonant clusters, and tones that just don’t appear in the spoken language.

The astute reader will also note that 756 is much, much greater than 43+8+5 (56), the number of symbols they would have needed had they used diacritics to show the vowel and tone.  I cannot find enough information on Classic Yi to tell, but it might be that the characters used in the Modern Yi syllabary are the Classic Yi characters for that syllable (similar to how Manyogana came from Chinese characters), or derived from them (like Hiragana derived from Chinese characters).
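For readers who want to check the arithmetic, here is a minimal Python sketch using only the counts quoted in this post.  The function names are mine, made up for illustration.  Because the post gives four tones for the reference dialect and five for other dialects, the sketch runs the numbers both ways; the conclusion is the same either way.

```python
# Counts quoted in the post. The tone count is left as a parameter because
# the post mentions both 4 (reference dialect) and 5 (other dialects).
CONSONANTS = 43
VOWELS = 8
SYLLABARY_SIZE = 756  # "at least 756", depending on how you count


def possible_syllables(tones: int) -> int:
    """Every consonant x vowel x tone combination, attested or not."""
    return CONSONANTS * VOWELS * tones


def symbols_with_diacritics(tones: int) -> int:
    """One symbol per consonant, one mark per vowel, one mark per tone."""
    return CONSONANTS + VOWELS + tones


if __name__ == "__main__":
    for tones in (4, 5):
        print(
            f"{tones} tones: {possible_syllables(tones)} possible syllables, "
            f"{SYLLABARY_SIZE} syllabary characters, "
            f"{symbols_with_diacritics(tones)} symbols with diacritics"
        )
```

Either way, the syllabary sits between the two extremes: far fewer characters than one per conceivable syllable, far more than a consonant-plus-diacritic design would need.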

Note that it is probably in part because of the existence of the syllabary that I can find so little information on Classic Yi.  In addition to cutting the Yi people off from their written history, creating a syllabary also had the effect of cutting off interested bystanders from learning about the Classic Yi writing system.

Links: Wikipedia, Ancient Scripts, Omniglot, Encoding Yi, Unicode, Babelstone

Posted in Rating: 4 "Huh, interesting!", Syllabaries

Taiwanese kana — ~1900 AD, Taiwan

Taiwanese kana aspirated "ch(a)"

As a result of losing the First Sino-Japanese War, China had to cede Taiwan to Japan in 1895. The Japanese went through phases of let-the-Taiwanese-be-Taiwanese alternating with phases where they tried to assimilate the Taiwanese into Japanese culture.

During one of the assimilationist periods, Japan imposed the use of Katakana to show the Taiwanese pronunciation of Chinese script characters.  Unfortunately, spoken Taiwanese is more complicated than spoken Japanese.  Taiwanese is a tonal language which distinguishes between aspirated (“breathy”) and unaspirated (“not breathy”) consonants; it also has a few more vowels and consonants than Japanese.

To compensate, they decorated the syllables.  They put marks to the right of the words (which were always written top-to-bottom) to show what tone the word should have and whether the vowel was nasal or not.  They put horizontal bars over “s-” syllables to make “ch-” syllables; they put dots under the syllables to show aspiration.

There were also some quirks in the spelling.  If a syllable only had a consonant and one vowel, the vowel was repeated, making the initial characters almost alphabetic.  However, if there were lots of vowels in a syllable, the first vowel would be part of the initial syllable.  Thus, “ki” would be written with “ki”+“i” characters, but “kiau” would be spelled with “ki”+“a”+“u” characters.
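The spelling rule above can be sketched in a few lines of Python.  This is only an illustration under loud assumptions: it works on romanized syllables rather than real kana, it treats a, e, i, o, u as the only vowels, and it ignores tone marks, nasalization, aspiration dots, and final consonants entirely.  The function name `spell` is hypothetical.

```python
from typing import List

# Simplifying assumption for this sketch: romanized input, five plain vowels.
VOWELS = set("aeiou")


def spell(syllable: str) -> List[str]:
    """Split a romanized consonant+vowel(s) syllable into kana-like units.

    One vowel:       the consonant+vowel unit is followed by the vowel again.
    Several vowels:  the first vowel rides on the consonant unit and the
                     remaining vowels each get their own unit.
    """
    # Everything before the first vowel is treated as the onset.
    onset_len = 0
    while onset_len < len(syllable) and syllable[onset_len] not in VOWELS:
        onset_len += 1
    onset, vowels = syllable[:onset_len], syllable[onset_len:]

    # The post only describes consonant + vowel(s); reject anything else.
    if not onset or not vowels or any(ch not in VOWELS for ch in vowels):
        raise ValueError(f"not a consonant + vowel(s) syllable: {syllable!r}")

    first_unit = onset + vowels[0]
    if len(vowels) == 1:
        return [first_unit, vowels[0]]  # repeat the lone vowel
    return [first_unit] + list(vowels[1:])


if __name__ == "__main__":
    print(spell("ki"))    # the post's first example
    print(spell("kiau"))  # the post's second example
```

With the post's two examples, `spell("ki")` gives the units “ki” and “i”, and `spell("kiau")` gives “ki”, “a”, and “u”.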

Pretty much as soon as Japan was relieved of its administration of the island, the Taiwanese elected to discontinue use of Taiwanese kana.

Links: Wikipedia

Posted in government-mandated, Rating: 4 "Huh, interesting!", Syllabaries

Gaiji

Hryvnia

Many writing systems have a finite set of glyphs; you can write down a complete list and there are no others, except for the rare invention of new characters. But some writing systems have an open-ended set of glyphs; no matter how many you write down, it is routine to discover or invent a new one. Chinese script and its variants are prime examples of such open-ended scripts.

This leads to a big issue of writing technology. If you are writing by hand, you don’t care: you just draw any glyph you need. But movable type, and later digital fonts, have essentially finite glyph complements. It is not easy, especially for an end user, to add a new glyph to a font.

To make matters worse, a font may not contain all the glyphs in a writing system. The reference dictionaries of Chinese and Japanese contain some 50,000-70,000 characters; common Chinese and Japanese digital fonts contain some 20,000 characters today, and 15 years ago the standard was more like 8,000. (By comparison, a Japanese high-school graduate is required to know fewer than 2,000 characters, enough for adult communication.)

There were several causes for the limitations on font size. Many computer architectures were developed in the USA and other Latin-script territories, and initially addressed only Latin script needs. Capabilities for Chinese, Japanese, Cyrillic, Arabic, and other scripts were added later (and are still being added). Also, computer storage and processing were hideously expensive, and computer hardware and software developers could only afford to meet the most common needs of their largest customers, not the complete needs of every potential user.

Hence publishers, and recently computer users, sometimes find that they want to use a character which is valid in their writing system but is not in their font. In Japanese these characters are called gaiji (meaning “outside character”).

Gaiji are not just a problem for Japanese and Chinese (and Korean, to a lesser extent): this is a significant problem for any scholars writing about extinct logographic writing systems like Chu nom or Sumerian.

If you are a computer geek, you might say, “ah, but Unicode solves that problem”.  Unfortunately, Unicode doesn’t. Unicode encodes many characters; that is, it defines numbers to represent many characters in a string. It doesn’t make the font stretch to represent those characters.  It also does not encode all characters, past and future.
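To make the distinction between "encoded" and "in the font" concrete, here is a small Python sketch.  The standard-library `unicodedata` module shows that a character has a number and a name in Unicode; whether a font can draw it is a separate question.  `FONT_COVERAGE` is a made-up stand-in for a font's character map (printable ASCII plus the Euro sign), not a real font, and the helper names are hypothetical.

```python
import unicodedata

# Hypothetical font: printable ASCII plus the Euro sign. A real font's
# coverage would come from its character map, which this sketch does not read.
FONT_COVERAGE = {chr(cp) for cp in range(0x20, 0x7F)} | {"\u20ac"}


def describe(ch: str) -> str:
    """Show the number and name Unicode assigns to a character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"


def missing_glyphs(text: str, coverage: set) -> list:
    """Characters in `text` that the font cannot draw, in order, no repeats."""
    missing = []
    for ch in text:
        if ch not in coverage and ch not in missing:
            missing.append(ch)
    return missing


if __name__ == "__main__":
    sample = "\u20ac5 or \u20b4200"  # Euro sign, then Hryvnia sign
    for ch in missing_glyphs(sample, FONT_COVERAGE):
        # Encoded in Unicode, yet a gaiji as far as this font is concerned.
        print("not in font:", describe(ch))
```

In this sketch the Hryvnia sign is perfectly well encoded, but the pretend font still has nothing to draw for it, which is exactly the gaiji situation.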

This problem is less common in European scripts, because glyphs are rarely added to these scripts. But “rarely” does not mean “never”.  The glyph adorning this post is the Hryvnia, the Ukrainian currency symbol, introduced in 2004.  You might scoff at the Hryvnia, and figure that it isn’t that important to support the Hryvnia symbol, but I doubt you would say that about the Euro symbol, introduced in 1996.

NB: This is an issue of great interest to my family; my beloved husband’s team at Adobe Systems came up with a publishers’ tool for the gaiji problem, known as the SING Gaiji Architecture, available in Adobe Systems software between 2005 and 2010. (Disclaimer: I asked my husband to edit this post, and he did.)  Adobe’s developer information page on “Gaiji — Supplemental Characters/Glyphs” has links to good overview papers.  Many technologies for supporting gaiji have been tried, including stroke-based fonts and ideograph decomposition.

Links: Wikipedia, “Gaiji: Characters, Glyphs, Both, or Neither?” paper (2002), “SING: Adobe’s New Gaiji Architecture” paper (2004), Sean Palmer essay, another ideograph decomposition proposal.

Posted in Commentary, Rating: 5 "Whoa!!"

Tangut — 1036, China

Tangut character

Like King Sejong did four hundred years later with Korean, Emperor Li Yuanhao of the Tangut told one of his advisors to make him a new writing system.  Yeli Renrong did, and quickly.  Yuanhao must have been more forceful than Sejong, or his elites less powerful, because his new script was more quickly adopted than Sejong’s: schools were set up to teach it and government documents were written in it.

To those of us who do not read Chinese characters, Tangut looks kind of like Chinese.  To someone who is familiar with Chinese characters, however, Tangut looks unambiguously not Chinese.  Traditional Chinese script is made up of eight basic stroke shapes, and Tangut uses some additional strokes and the components (“radicals”) look very different as well.

Links: Wikipedia, Omniglot, Tangut (Xīxià) Orthography and Unicode

Posted in Logograms, Rating: 5 "Whoa!!"

Classic Yi — 700? 1485? AD, China

Yi "fir"

For a very long time, the Yi people used a logographic script to write their language. Their tradition says that it was created by someone named Aki around 700 AD, but the earliest record is from 1485 AD.

Mostly the Yi priests used it for religious, magical, or medical texts.  The literacy rate was very low (less than 3% in 1956), so kids weren’t using it to write to Grandma.  The Yi didn’t have a strong central government, so they didn’t use it as a bureaucratic language.  They didn’t appear to use it for accounting or records; when they communicated with the outside world, they used Chinese.  As a result, there was zero standardization, and every set of village priests had their own local dialect of the script, resulting in a huge number of characters: 90,000 by one estimate.

Links: Wikipedia, Ancient Scripts, Omniglot, Babelstone, Encoding Yi

Posted in Logograms, Rating: 4 "Huh, interesting!"

Zetian characters — 690 AD, China

Zetian "star"

There was one female ruler of China, Wu Zetian, who, among other things, mandated use of around twenty new characters.  (These characters were presented to her by a junior relative, Zong Qinke, but she went along with it.)

She took one of the new characters as her own name.  At that time, it was taboo to say or write the name of high-status people.  (It was thus a bit tricky to tell people whose name you couldn’t say or write!)  This was a big deal: people (plus their families) got executed for disrespecting the emperor by saying their name or writing their name properly.  One way around the writing problem was to leave out a stroke in the name, but if the name-character was common in other words, it could be difficult to work around the ruler’s name.  Considerate emperors with common characters in their names frequently changed their name to something less common, but Empress Wu went a step beyond in using a hitherto-unused character.

For the other characters, it is not clear why she insisted on their use instead of the perfectly good ones that were available.

Interestingly, one of the characters, the one shown at the top of this post, uses none of the eight basic stroke shapes of Chinese script, and indeed looks quite un-Chinese.

The characters did not catch on, and ceased being used quite promptly after her death.

Links: Wikipedia, Dylwhs

Posted in government-mandated, inventor known, Logograms, Rating: 5 "Whoa!!", significant female influence

Sawndip — <689 AD, China

Sawndip "see" (word)

The Zhuang people of southern China have been using an augmented Chinese script called Sawndip for over 1300 years.  This writing system was used extensively in popular culture (songs, poems, ceremonies, and some literature) and religion, but not in governmental documents.

As with Chu nom, some of the Chinese characters are used “as-is” (about 80-90% for Sawndip), but the Zhuang also developed characters of their own using the same mechanisms for composing characters as Chinese script uses.  Most often, they combined one character for the meaning and one character to give the pronunciation.  This is tricky, however, since Zhuang has sounds that do not occur in Chinese.  Frequently, the phonetic radical’s Chinese sound is only a very rough approximation of the sound in Zhuang.

This, coupled with a lack of government-imposed standardization, meant that there was a huge variation in the characters.  Indeed, it is difficult to read Sawndip manuscripts, and it appears that they served more as a mnemonic device for the manuscript’s owner than as something just anybody could pick up and read.  (I found one source that indicated that priests didn’t always understand what they recited, just how to recite it.)

In the late 1950s, the government of the People’s Republic of China decided to standardize the Zhuang writing system with a phonetic alphabet.  (Why they decided that a minority language should be phonetic, while it was clearly just fine for Chinese to use logograms, was not explained in any of the supporting material that I saw.)  At first, they used this strange bastard love-child of Cyrillic and Latin script, but in 1986, they stripped out the non-Latin characters.

Links: Wikipedia, Omniglot, Google Books excerpt of The Tai-Kadai Languages

Posted in Logograms, Rating: 4 "Huh, interesting!"