Vietnamese is remarkable in that it's the only one of the four major Sinosphere languages (Chinese, Japanese, Korean and Vietnamese, aka CJKV) to fully abandon Chinese characters and adopt the Latin alphabet instead. China and Japan both had serious movements to do the same, Japan during the Meiji Restoration and China during its twentieth-century script reforms, but only Vietnam went ahead and did it. Korea forged its own path with its native Hangul writing system, but even there Chinese characters are still occasionally used.
Chữ Nôm is not Chinese characters. I can't read chữ Nôm, for example (I can read kanji/hanja without any issue); it is more like `Vietnamese` characters, its own set of unique logograms, if that makes sense.
It was complex and difficult to use and write, so abolishing it really didn't take much effort, since most of the population was illiterate at the time.
However, this is very different from the CJK countries, where the Chinese characters are mostly kept consistent. Chữ Nôm, on the other hand, is almost completely unreadable to me.
Try reading some Japanese in man'yougana https://ja.wikisource.org/wiki/%E4%B8%87%E8%91%89%E9%9B%86/%... and compare to the version where all characters used purely for their sound value have been distorted into squiggly hiragana. I find it much easier to pick out coherent words in the latter because they're clearly visually distinct.
> Nom is pretty much Chinese word meaning + Chinese word pronunciation.
Hard disagree, from a Japanese speaker's standpoint. I suspect it's almost deliberate that any mixture of chữ Nôm and Chinese can be handled natively by Vietnamese speakers, and nobody else. Nothing in it makes sense to me, far less than Chinese text does.
Understandable, because it was based on how the word sounded, not on the meaning of the component characters, so it only makes sense to Vietnamese speakers.
The first word, "núi" (mountain), is made up of 山 and 內: 山 gives the meaning of the word, and 內 is (roughly) how it sounds. Without knowing Vietnamese it is impossible to guess the meaning.
Japanese also developed a Chu Nom type system (manyogana) to represent Japanese sounds using Chinese characters, and it was almost as complex, requiring a good understanding of Chinese as a foundation. But manyogana was simplified into the two syllabaries used today, hiragana and katakana.
Hiragana and katakana are not Chinese characters. They don't follow the principles by which Chinese characters were developed (六书), nor do they follow their structures.
At best they are inspired by Chinese characters. No one in China or Japan would classify the kana as Japanese-made Chinese characters; that term means something entirely different, characters like 畑/辻/雫.
イ developed from 伊, ク from 久, め from 女 and れ from 礼 in much the same way that 余 developed from 餘, 区 from 區, 汉 from 漢 and 礼 from 禮.
Wantonly altering character shapes to make them easier to write quickly may not be a blessed way to create new characters, but it has tradition. It's not like the Oracle Bone Script was passed unchanged from generation to generation and only the moderns dared lay hand to it. Lazy scribes have existed for as long as there have been scribes.
That hiragana and katakana are now treated as a completely separate category comes down more to how differently they are used than to how they were derived.
«Wantonly» is perhaps not the most accurate term, at least in the context of hiragana. It evolved as a script used exclusively by women, because Japanese women at the time were denied education and considered too inferior to learn kanji. For many centuries men scoffed at hiragana as something relegated to «crazy» women writing love letters and expressing «lowly» emotions in correspondence with each other; Japanese men would only write using kanji. East Asian misogyny has a rich history that has even found its reflection in a writing system.
Fair. I think the difference here is that simplifying Chinese characters into kana also removes the semantics from the character completely, but that is not completely unheard of for Chinese either (like 的, for example).
It's kind of funny too, since of all those languages (besides Chinese), Vietnamese is the best suited to Chinese characters, being an isolating language.
That said, they probably also had the easiest time abandoning them, because there was never any real formalisation of how to write Vietnamese, as opposed to Literary Chinese, whereas in both Korea and Japan people had been writing their vernacular language in Chinese characters, or derived scripts, for centuries. Vietnam of course had chữ Nôm and other systems for writing the vernacular, but not to the same degree as Korea or Japan.
I think the French colonizers trying to make their empire more legible to them and suppressing the use of Chinese characters in service of that goal had a bigger influence on the decline of chu Nom than a particular lack of formalization.
There is more context than that in the article. The gist of it is that it was originally developed for religious reasons, became official late in the colonisation process, and was further developed by Vietnamese intellectuals.
It out-competed chữ Nôm because it was easier to learn. No need for nefarious master plans.
It was the French, but that doesn't make it nefarious.
Have you seen how annoying it is to type pictographic characters on a computer? Try doing so on a typical typewriter, or using an old style printing press.
Whether right or wrong, it was seen as beneficial.
There's occasional talk in Japan of dumping Kanji in favor of Hiragana and Katakana only, but Kanji makes the written word easier to read:
1) Japanese doesn't use spaces between words (and the language is structured in a way to make it difficult to decide what is a 'word') - you'll sometimes see signs in Japan that use Katakana spellings for some words that aren't normally katakanized.
2) Japanese would have a TON of homographs if it got rid of Kanji, so determining which word is meant becomes harder even with some context.
1) You can introduce the innovation of spaces, as is used for example in older Japanese video games that had no Kanji
2) If that were a problem oral Japanese would be unintelligible. It is not. Most homonyms can be resolved.
Furthermore, they could adopt a Korean-style system, where everyone writes without Kanji but the Kanji are still taught in high school as a kind of "Latin"/"Greek" from which some 55% of the vocabulary is drawn.
Regarding your point (2), it's not that simple, since written / literary language uses a larger vocabulary, making homonyms more of a problem.
In fact if you search a Japanese dictionary in hiragana with a combination of two reasonably common kanji readings (say かん+ちょう), you'll often get a double digit number of results (12 in jmdict in this example).
Most of these are uncommon words unlikely to be used in the spoken language, but could occur in writing.
It could probably still work more or less by relying on context, but it's more of an issue than you make it sound.
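If anyone wants to reproduce that kind of count, here's a quick sketch, assuming a local copy of the JMdict_e XML file from the EDRDG project (the file path is a placeholder, and かんちょう is just the reading from the example above):

```python
# Quick sketch: count JMdict entries that share a single kana reading.
# Assumes a local copy of the JMdict_e XML file (EDRDG); the path is a placeholder.
import xml.etree.ElementTree as ET

def homophones(jmdict_path, reading):
    """Return the headwords of every entry whose kana reading is exactly `reading`."""
    root = ET.parse(jmdict_path).getroot()
    hits = []
    for entry in root.iter("entry"):
        readings = [r.text for r in entry.iter("reb")]
        if reading in readings:
            keb = entry.find(".//keb")  # kanji headword, if the entry has one
            hits.append(keb.text if keb is not None else readings[0])
    return hits

words = homophones("JMdict_e.xml", "かんちょう")  # かん + ちょう
print(len(words), words)  # expect a double-digit count: 官庁, 館長, 艦長, 干潮, ...
```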
Another point is that writing in hiragana with spaces would disconnect the language from its roots. The Kanji add a layer of meaning to the language that isn't there in languages with phonetic alphabets, and they help with guessing the meanings of unknown words.
You could probably argue that making it easier to learn to read and write would outweigh the loss of those benefits, but I'm not so sure. Japan has a high literacy rate, so it does seem to work alright.
I've thought about this a little bit and wonder if the situation CJK readers have ended up with is the product of a few interesting historical quirks.
Assuming that homophones are simply going to exist in the languages, there are probably other alternatives. For example, in the case of modern Korean readers, Hanja can supposedly be used to help clarify the meaning of a homophone where context doesn't provide clarity. But in practice, most Korean readers don't use Hanja enough to remember more than a small percentage of what they learn in school, so Hanja work in effect as an index into a lookup table: readers look the Hanja up in a dictionary, find the definition written in Hangul, and use that to determine the meaning.
Suppose Japanese were reformed to use only one or both kana, the way modern Koreans use mostly Hangul. Then Kanji would be used to help determine meaning in the same way. Since most people wouldn't encounter Kanji with enough frequency and variety to remember the bulk of them, only the most common would be remembered, and those are unlikely to be the hard-to-discern homophones, since contextual clues would clarify them. So the Kanji would again end up as indexes for looking up standardized definitions. In modern Japanese this doesn't happen, because Kanji are still used with enough frequency and variety that most people sustain some level of memory of the system. The question is how well they remember it. [1][2][3][4][5]
So for the non-Sinitic CJK languages, instead of spending so much time in education learning what amounts to a very complex lookup-table indexing scheme, why not just standardize the definitions and use the numbers of the definitions as postfixes to ambiguous homophones of Chinese origin?
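To make that concrete, here's a toy sketch of what such a numbered-definition lookup could look like; the romanization, index numbers and glosses are invented purely for illustration, not drawn from any real standard:

```python
# Toy illustration of the numbered-definition idea above; the index numbers
# and glosses are invented for the example, not any real standard.
STANDARD_SENSES = {
    "kanchou": {
        1: "government office (官庁)",
        2: "ship's captain (艦長)",
        3: "low tide (干潮)",
    },
}

def resolve(token):
    """Resolve a homophone written with a numeric postfix, e.g. 'kanchou:2'."""
    word, _, index = token.partition(":")
    return STANDARD_SENSES[word][int(index)]

print(resolve("kanchou:2"))  # -> "ship's captain (艦長)"
```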
To English speakers, homophones might look like an unfortunate side effect of using an inferior symbol set or something, but I don't think that is how languages end up with a lot of homophones. Rather, I think it's the result of a kind of lossy compression: there is a set of identical tokens and circuits of thought that get addressed from different contexts for different effects.
And I think your lookup-table thinking is halfway to a deep understanding of the matter; it does work like a forward lookup from ideograms to meanings and pronunciation cues. Sure, you can sub:4 idea:5 wit:1 phne:2 Alph, and homo:7 dis:32 with num:2 idx for std:6 def:3, but I wonder if we humans are really good with that data model.
I’m (slowly and painfully) learning to read Thai and find myself wondering how a written language could naturally evolve without spaces between words. It adds a significant overhead since it requires learning the word boundary rules.
Spaces between words are a relatively recent addition to scripts like the Greek or Latin alphabets, roughly 1500 years ago. Early Greek was also written boustrophedon (“like an ox plows a field”) that is, left to right, then the next line is right to left, and so on. Sometimes a point (dot) was used to break words.
Spacing is hardly standardized in languages using the Latin script; French typography, especially in older books, is notably different from English or German, with spacing between sentences or before certain punctuation being different. Then again, phrases or terms which in English or French would be multiple words are written as single “words” in German („Straßenkehrgerät“ == “street sweeper”). And I find Russian spacing rules disturbing.
And look at Arabic which does have spacing but in calligraphy can grossly violate the bounds of what you might consider “running text” coming from a European background.
Boustrophedon has one more rule that you omitted, and it's an extremely natural one: when you write from right to left, you mirror all the letters! My little daughter intuitively writes in this system even though no one taught her that, nor has she seen it anywhere. She explains that that way you know how to read longer passages that need to wrap around.
It was the French who pushed for a Latin based writing system for Vietnamese in an attempt to improve literacy in Vietnam and to also cut the cultural umbilical cord with China.
If China itself had been colonised by Western powers towards the end of the Qing dynasty or afterwards, the fate of Chinese characters in China could have been quite different, although that is a matter of speculation, of course.
Vietnamese chữ Nôm is a bit like Cyrillic among CJKV. I think Vietnamese always shared too little with the Chinese language, and especially with its vocabulary, for the script to matter much: it could have been anything, as long as it was aurally rich enough, unlike with the CJK languages.
The share of Chinese loans in Vietnamese is quite large and doesn't differ much from Japanese or Korean, though estimates vary depending on whether you're looking at everyday speech or the number of dictionary entries. https://en.wikipedia.org/wiki/Sino-Vietnamese_vocabulary
As seen on that page, it uses a lot of invented or re-assigned characters that map to native Vietnamese vocabulary. Sentence structure also differs too much.
As implied in a sibling comment by karmasimida, we CJK speakers have a bit of mutual text-only intelligibility across CJK - the spoken languages are all steelpans to each other unless you have a few years of training plus an auditory cortex power cycle; it's only the writing - and that completely breaks apart with Vietnamese, in a way very reminiscent of looking at Cyrillic script.
I’m not super familiar with the history, but it is worth noting that many of the countries around Vietnam also adopted the Latin alphabet: Malay/Indonesian, Filipino, etc.
I’m in Malaysia at the moment and it’s surprising just how many everyday words are the same as English but “spelt with a Malay accent”. Ones I saw today were “restaurant” => “restoran” and “complex” => “kompleks”.
Written text doesn't carry tones or accents, so everyone understands it as best they can.
I definitely read your comment with an accent on 'Malay accent', missing that the whole phrase was in quotes. I just have enough experience to know it's not limited to Malaysia or Asia.
Chinese has a small inventory of distinct syllables and, as a consequence, a huge number of homophones. I think that's partly the reason why words are usually two characters: yes, there is a character that means 'forest', for example, but when you want to actually say 'forest' it is combined with another character (of similar meaning) so that the spoken word is intelligible.
There is a huge loss of information when writing in pinyin (Latin characters) vs. ideograms.
It might be the same in Vietnamese.
I suspect a lot of the shift away from sinograms in Vietnam was due to nationalism, which is also partly why China is so attached to them.
No, no ambiguity at all. I'm sure you could deliberately construct a sentence with a double meaning playing on the two components of some words, but that would be a hard task for sure.
I have around a B1 level now so I'm not nearly good enough to build such complex puns though.
Very interesting. For someone who knows (some) Chinese but no Vietnamese there seems to be a lot of commonality so it's very interesting that Vietnamese does not have all those homophones.
I guess that explains a lot of the appeal of Latin characters, then.
Yeah, that kind of makes sense. Chinese writing didn't fit Vietnamese very well in the first place.
Vietnamese is a language with a highly complex phonology, so a writing system that maps as closely as possible to that pronunciation is a better fit.
Pronouncing words and recognizing their pronunciation properly is the hardest task in Vietnamese, and you really want the writing to help you with that.
And since more complex words are usually two-syllable compounds of simpler ones, the Latin script actually helps you guess the meaning of a new word as well, much as a Chinese character would, I guess.
Wikipedia and all the linguistic resources that I have come across categorize Vietnamese as an Austroasiatic language (a family which includes Khmer) and not as a "Sinosphere language" (whatever that means). You are confusing the writing system with the language. I would argue that adopting the Latin alphabet was a great mistake. It basically severs the younger generations from all of their historical writings. Also, the Latinized Vietnamese script is one of the most hideous scripts out there.
> I would argue that adopting the Latin alphabet was a great mistake. It basically severs the younger generations from all of their historical writings. Also, the Latinized Vietnamese script is one of the most hideous scripts out there.
No thanks. Most historical texts have been transliterated into modern chữ quốc ngữ without much compromise in meaning or literary eloquence. The writing greatly facilitates learning other Latin-script languages, and most importantly it has perfect phonemic orthography: there exists a bijection between well-formed words and speech, i.e., a spelling bee is not a thing in Vietnamese.
Yes, I have to agree. I was so surprised by that bijection claim; it is so outlandish. I mean, if you restrict and force everyone to use only the exact phonemes the script allows, then of course you will be able to make that claim, but that is not possible.
The ‘Sinosphere’ is the cultural and linguistic area which has been influenced by China. This area includes the Vietnamese language, which has undergone heavy Chinese influence. At the same time, Vietnamese is an Austroasiatic language by descent, related to other Austroasiatic languages like Khmer (as you note), Mon and Santali. Most other Austroasiatic languages are not part of the Sinosphere, but Vietnamese is.
As a speaker (or at least writer) of English which uses the Latin alphabet and that has extensive borrowings from Latin and French, how connected do you consider yourself to the Canterbury Tales, to Beowulf, and to Cicero?
It's definitely cool that Vietnam ditched the overcomplicated ideographic system, but the Latin-based script could still do with a lot of improvement. The need for multiple diacritics on almost every vowel reveals starkly how ill-fitting Latin is for a tonal language. Meanwhile, for the consonants, Vietnamese spelling adopted French/Portuguese-inspired digraphs such as qu and ph when the much more internationally recognized k and f would have been available and clearly preferable.
I feel it's a missed opportunity. Vietnamese writing could have been a lot cooler. The ideal would have been a specially crafted writing system designed for the language, much like Hangeul is tailor-made for Korean; but even if adopting Latin letters were somehow a requirement, it could have been more thought-out than the French-imposed hodgepodge that we got.
> The need for multiple diacritics on almost every vowel reveals starkly how ill-fitting Latin is for a tonal language.
How so? There are some phonetic systems for Mandarin that adjust spelling to indicate tones, and IMO they’re terrible. Fortunately they’re also obsolete. I don’t know much about Vietnamese, but it seems to use 5 diacritics for tones, which seems just fine.
The real issue seems to be that Vietnamese has too many vowels for the Latin alphabet, and that was also 'solved' with diacritics, so there are tone diacritics and vowel diacritics.
English also has too many vowel sounds, and it’s, ahem, solved by being ambiguous. Mandarin has too many vowels and Pinyin uses (somewhat ill-conceived IMO) pairs of vowels. But that means that all the diacritics [0] are tones, which is nice.
[0] As a sort of exception, Pinyin occasionally needs an apostrophe as a word/syllable break, which is a side effect of the aforementioned vowel pair issue and the fact that Mandarin doesn’t actually need spaces to be readable, so spaces are kind of optional except when they aren’t.
True, too many diacritics are difficult to use, although I think they were as easy as anything else when handwritten. As technology progresses any script becomes easy to write, so a nation having a distinct script is part of its identity. Also, it is a mistaken belief that a script's glyphs and clusters need to represent the phonetics accurately. Hangeul is possibly the best phonetic script I can think of, other than the IPA, which is now used for some African languages.
Interestingly, Filipino and Indonesian also had their own scripts for some time (derived from Indic scripts), then converted to the Latin script in the 18th century and later.