Are some languages more "complex" than others?

lpetrich · Nov 13, 2024

Separate roots ("suppletion") are also found in pronouns, like English I/me/my/mine and we/us/our/ours. These ones are inherited from Proto-Indo-European, with *ego-/*me- and *wei/*nos-. In some others, some of the separate nominative (subject) forms dropped out: Latin 1p nôs, 2p vôs, 1s in Celtic (Irish, Welsh me) and Indic (Hindi mai, Sinhalese mama).

But in some languages, the plural ones are plural forms of the singular ones: Turkish 1s ben, 1p biz, 2s sen, 2p siz. BTW, the usual plural suffix in Turkish is -lar/-ler, depending on the preceding vowel.

Some languages also distinguish between inclusive and exclusive we: you and I, we without you.

lpetrich · Nov 13, 2024

Here is another example of complexity: multiclause sentences.

Second clause with a non-finite verb:
I watched Sophie scratch the cat's head.

Second clause a relative clause:
I watched Sophie, who scratched the cat's head.

Second clause introduced with a subordinating conjunction:
I watched Sophie as she scratched the cat's head.

Second clause introduced with a coordinating conjunction:
I watched Sophie, and she scratched the cat's head.

Both clauses separate (apposition):
I watched Sophie. She scratched the cat's head.

Stripped-down grammar:
Me watch Sophie. Sophie scratch cat head.

Swammerdami · Nov 13, 2024

lpetrich said:
Complexity may be measured by how many individual forms one has to learn. For N numbers and C noun cases, a modular sort of noun inflection would have only N + C forms, while a non-modular sort would have N*C forms. In general, N*C > N + C, so modularity is lower in complexity.

I'm sure your point is interesting and well-argued, but I don't understand a word. C=1 is the number of "noun cases" that Thai has, I suppose. I.e. Thai's noun inflection is neither "modular" nor "non-modular" -- it just doesn't exist.
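The N + C versus N*C count in the quote can be tabulated directly. This is only a sketch of that counting argument: the function names and the sample (N, C) pairs are made up for illustration and are not claims about any particular language.

```python
def modular_forms(n_numbers: int, n_cases: int) -> int:
    """Pieces to learn when number and case are marked separately: N + C."""
    return n_numbers + n_cases


def fused_forms(n_numbers: int, n_cases: int) -> int:
    """Endings to learn when every number/case pair has its own ending: N * C."""
    return n_numbers * n_cases


# Illustrative (N, C) pairs only, not data about real languages.
for n, c in [(1, 1), (2, 2), (2, 6), (3, 8)]:
    print(f"N={n} C={c}: modular={modular_forms(n, c)} fused={fused_forms(n, c)}")
```

Note that N*C exceeds N + C only when both counts are at least 2 and not both exactly 2. With C = 1 (or no inflection at all) the comparison says nothing useful, which is the objection raised in the reply.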

lpetrich said:
A further problem is that a language can be simple in one thing and complex in another.

Obviously. The point is: Spoken Thai seems NON-complex in nearly all ways.

lpetrich said:
Chinese, Vietnamese, and Thai have highly modular and thus relatively simple grammar, but they have a kind of complexity that may seem odd to many of us: using classifiers with counts. See Classifier (linguistics) and How do measure words work in Mandarin Chinese?
liǎngwèi lǎoshī - two person teacher
yīpǐ mǎ - one animal horse
sìbǎ yǐzi - four holdable chair
yībǎ huā - one holdable flower (bunch of flowers)
yīshuāng xié - one pair shoe
yītiáo yú - one long-and-narrow fish
...

Korean, Japanese, Khmer, and Burmese also have classifiers. The closest thing in English is "head of cattle", but that's very unusual.

As you go on to admit, English needs lots of words playing the role of classifiers. Three cups of ice-cream, two pieces of pizza, two herds of elephants, that school of fish, etc. etc.

lpetrich said:
English distinguishes between countable and uncountable nouns,

But unlike English, Thai is SIMPLE in that it doesn't distinguish these two types of noun. Some nouns serve as their own classifiers: "Bring glass beer three glass"; "Person come three person"; etc. And superfluous classifiers are sometimes optionally omitted.

Treating ALL nouns as "uncountable" may seem DIFFERENT from European languages, but it hardly seems COMPLEX.

lpetrich said:
like countable "cup" vs. uncountable "water". Thus, "three cups" but not "three waters". One must specify something countable with an uncountable noun, like "three cups of water". But one can use "bottle" or "gallon" or "drop" or several other such words, words that can be used on their own.

lpetrich said:
Another sort of complexity is selection of pronouns by level of formality and social status: T–V distinction and T–V distinction in the world's languages

Most of Thai's pronouns are used in intimate speech rather than formally. Whereas English needs 2 or 3 words to say "You, my dear" Thai makes do with a single word (an affectionate 'You'). English "He", "She", "They" all translate as THE SAME WORD (/khao/). Complex?? (Full disclosure: This same /khao/ can also serve to mean "I" when speaking to an intimate friend!)

Wiktionary said:
เขา • (kǎo)
1. he; she; they.
2. (childish) I; oneself.

lpetrich said:
Slavic languages distinguish verb aspects, imperfective (incomplete, continuing) and perfective (complete, momentary). These are formed in a variety of ways, usually from affixes (prefixes and suffixes), but sometimes from separate roots, like Russian impf govorit' pf skazat' "to speak".

Thai lacks inflections, verb affixes, and obligatory markers. Yes, this makes many utterances ambiguous. Is ambiguity "complexity"? What few markers (or "helping verbs") there are, are multi-purpose: Consider /dai/ which can mean any of

to get; to obtain; to receive.
(auxiliary) used before a verb to indicate that one has done or has an opportunity to do the action.
(auxiliary) used after a verb to express that the action can be or has been done.

Reduplication^* includes "friend friend" to mean "friends" and "red red" to mean "very red." Complex?
* - why the RE in reduplication? The words are duplicated, not triplicated.

I've previously displayed a 13-word sentence actually overheard by a professor of Thai linguistics, that consists of 1 pronoun (/khao/, see above!) followed by TWELVE verbs! Complex? Each of the 12 verbs is used in its ordinary meaning.

Swammerdami · Nov 20, 2024

(I'm going to add to this discussion, partly in hopes that one of our linguist experts will pose the question on a professional linguists' board. And partly because, with my key laptop "on the fritz" this is one way for me to create a permanent record!)

First let me say that Thai has no real plural marker. My brain "needs" a plural marker when speaking Thai and I've fallen into the habit of prefixing พวก - phuak for plural. But this word is actually a noun meaning 'group.' Like many nouns, it is its own classifier so three groups of people would be 'group person three group.'

REDUPLICATION

I think this term includes invented rhyming words, like English "roly-poly" or "easy-peasy." Cockney slang has very complicated patterns. So, although Thai has LOTS of invented rhymes used as slang or colloquially, I don't think that adds much "complexity."

As mentioned, duplicating an adjective is like prefixing "very." Duplicating a noun to make it plural however, is used only in a tiny number of cases I think.

Here's a duplicated word that becomes ambiguous:

ล้านล้าน - lan-lan (million, million)

Conversationally the duplication usually just creates a plural: 'millions.' BUT 'million' is the largest number the language has, so 'thousand million' means 'billion' and 'million million' can also mean 'trillion.'

Instead of rhyming, the initial consonant can be repeated. In each case I show (a) Thai spelling, (b) Romanized rendering per the Royal Institute standard, (c) meaning [in most cases the slangy compound is missing from wiktionary; in some cases it is missing from thai-language.com], (d) definitions for each of the two syllables separately, with "X" denoting absence from both dictionaries.

เยอะแยะ - yoe-yae = many (many, X)

งี่เง่า - ngii-ngao = clumsy, idiot (X, stupid)

เซ่อซ่า - soe-sa = clumsy foolish (foolish, X)

ฟุ่มเฟือย - fum-fueai = extravagant (X, X)

ฟุ้งเฟ้อ - fung-foe = extravagant (to spread, extravagant)

Color words often have a specific suffix applied to mean 'very.' In only the first case ('pee') does the word appear in a dictionary. 'Pee' also means 'very' when prefixed to words for sour, bitter or tight.

ดำปี๋ - dam-pee = very black (black, very*)

แดงแจ๋ - daaeng-jaae = very red (red, X)

ขาวจั๊วะ - khaao-jua = very white (white, X)

'Rueai', which wiktionary translates as 'usually, continuously, always', is often accompanied by a rhyming syllable. (The 'rotten' meaning is not cognate?)

เรื่อยเปื่อย - rueai-pueai = continuously and aimlessly (continuously, rotten)

เรื่อยเบื่อย - rueai-bueai = continuously and aimlessly (continuously, X)

เรื่อยเฉื่อย - rueai-chueai = unhurried (continuously, slow/inert)

เรื่อยเจื้อย - rueai-jueai = on and on without stopping (continuously, X)

เฉย - choei by the way is a near-homonym of เฉื่อย - chueai and has related meaning. Wiktionary shows etymology for neither.

Finally, here are two common words built by adding a nonsense syllable to 'messy' or 'dirty.' The two words are definitely NOT exact synonyms, but the separate connotations may be dialect-dependent!

เละเทะ - le-the = messy (messy, X)

เลอะเทอะ - loe-thoe = sloppy (dirty, X)

Swammerdami · Nov 20, 2024

PRONOUNS

Pronouns were proposed as an example of Thai's alleged complexity. I think its pronouns instead show simplicity!

Many words ("Uncle", "Child", "Sarge", "Doctor" etc. etc.) can play the role of pronouns, and operate as 1st-, 2nd- or 3rd-person. In English we might say "Doctor, what medicine do you recommend?" "Doctor" is a noun here but what role does it play in the sentence? (How do you diagram it?) In Thai ("Doctor recommends medicine which?") the "Doctor" functions as a simple subject, and almost as a pronoun.

One couple speak Thai to each other exclusively, but use English "Babe" as the "You" pronoun. Many couples use เอง (/eng/, 'oneself') as the "You" pronoun for each other. Some pronouns come in pairs: Couples that use /eng/ for "You" will use เขา (/khao/, 'he/she') as the "I" pronoun when speaking to each other.
Cute! But complicated? I don't think so.

เธอ /thoe/ is a pronoun often used as an affectionate "You." Here are the definitions Wiktionary shows:

a second person pronoun, used of a person of equal or lower status.

a third person pronoun, used of a person of equal or lower status.

a third person pronoun, used of a monarch.

It's used as 3rd-person for BOTH a person of equal or lower status and for a monarch?!
Weird? I think so. But is it a significant complexity?

Swammerdami · Dec 30, 2024

pood said:
Swammerdami said:

I suppose the Thai people are a bit frustrated that their simple language seems to be the exception to the rule that all languages are equally complex. Thai has no word for "the", no markers for verb tense or even plurality; "King" and "God" are the same word.

To make up for this simplicity, Thais have no less than seventeen syllables that can be thrown onto the end of a sentence to express the speaker's mood or his attitude toward the listener.

Maybe that's why they don't believe in Jehovah or Allah or Beelzebub or whatever His Name is. Too many new pronouns would be needed.

I believe a number of languages, most prominently perhaps Russian, omit definite and indefinite articles. Also in Russian, the double negative is not just permitted, but required.

From what you write above it sounds like Thai is just as complex as any other language, just in a different way, given the bewildering use of all those syllables.

On the one hand, the sentence-ending particles may seem weird. Is there a name for them? Do other languages have something like this?

On the other hand, English itself has a few words appended to sentences with a role similar to those Thai particles. For example, "please," "hunh?", "duh!", "Sir," and "Ma'am."

(In "Please (to) pass the peas" the please can be parsed as a verb. But that parsing seems wrong in "Pass the peas please.")

Bomb#20 · Dec 30, 2024

Swammerdami said:
On the one hand, the sentence-ending particles may seem weird. Is there a name for them? Do other languages have something like this?

On the other hand, English itself has a few words appended to sentences with a role similar to those Thai particles. For examples, "please," "hunh?", "duh!", "Sir," and "Ma'am."

(In "Please (to) pass the peas" the please can be parsed as a verb. But that parsing seems wrong in "Pass the peas please.")

It's still a verb -- it's short for "if you please".

For I hold that on the seas,
The expression "if you please"
A particularly gentlemanly tone implants.
And so do his sisters and his cousins and his aunts!

Swammerdami · Dec 30, 2024

What you've shown is that the English phrase "If you please" evolved (was "grammaticalized"?) into the particle "please."

Something similar happened in Thai to produce /khrap/ the polite ending particle for males:
khrap is a shortened form of kho-rap which concatenates two very common Thai verbs -- this was new to me until just now! -- it translates as "(I) request (to) receive."

(From time to time one hears a female use /khrap/ as a sentence ender. This is usually when speaking to a young Thai boy, or to a male Farang.)

ETA: I got the above from Wiktionary. It also has an entry for English "please" :--
Etymology 1 is shown as verb

From Middle English plesen, plaisen, borrowed from Old French plaise, conjugated form of plaisir or plaire, from Latin placeō (“to please, to seem good”),[1] from the Proto-Indo-European *pleHk- (“pleasingness, permission”). In this sense, displaced native Old English līcian, whence Modern English like.

A 2nd etymology is for an Adverb:

Short for if you please, an intransitive, ergative form taken from if it please you

Wiktionary calls Thai /khrap/ or /khorap/ a Particle.

Swammerdami · Dec 30, 2024

Swammerdami said:
(From time to time one hears a female use /khrap/ as a sentence ender. This is usually when speaking to a young Thai boy, or to a male Farang.)

Statements like this are often based on my own (small sample-size) observations rather than any linguistic authority.

But I have only one counterexample to the claim: Just a few weeks ago I heard the owner (with her husband/barista) of Somewhere Espresso use the formula when speaking to a female customer she seemed to know. I almost wanted to ask her about it but our relationship lacked any intimacy. (And she'd smiled only grudgingly if at all when I once hummed "Somewhere over the rainbow" to compliment her shop's name. Somewhere Espresso is a very small shop but showed up in Facebook's "Best coffee in Chiang Mai" thread. It's found on "F.F. Road" but I'm reluctant to specify that locale more clearly.)

Swammerdami · Jan 9, 2025

Written Thai

When linguists speak of language complexity I'm pretty sure they're referring only to the spoken language, not the written language. (Perhaps this will change in future with written language increasingly more common than spoken language?)

If we DID include the written language when measuring "language complexity", that might be the source of Thai's alleged complexity.

The well-known U.S. State Dept. language-learning difficulty scale gets confused about this:

U.S. Dept of State said:
...
Category III: 36 weeks (900 hours)
Languages with linguistic and/or cultural differences from English
Indonesian; Malay; Swahili
Category IV: 44 weeks (1100 hours)
Languages with significant linguistic and/or cultural differences from English
...
*Thai
...
* Languages preceded by asterisks are usually more difficult for native English speakers to learn than other languages in the same category.

What distinguishes the Category III languages from the several in Category IV where "cultural differences" are more "significant"? All three of the Category III languages are written with the same Latin script that English uses. The discussions never seem to mention this but obviously it is learning the WRITTEN language which moves simple languages like Thai into "Category IV with *."

In this post I'll mention a few ways in which writing Thai is complex.

1. Silent or Transformed Letters
Half the letters in English "knight" are silent, so Thai isn't much more complicated than English on this point.
* "sand" translates to ทราย which letter-by-letter is T-R-A-Y but is pronounced /saai/ (like "sigh").
Interestingly, wiktionary tells us that replacing an S symbol with the T-R was a mistake in this case!

This word was written ซาย (saai) in documents prior to the 18th century, but shifted to the current spelling (with a cluster ทร initial) during the 18–19th centuries, possibly under the popular misconception that this word was of Khmer origin, and that the s- initial had developed from an earlier *dr- ...

Thai has no less than SIX different consonants, all pronounced as aspirated T (along with two for unaspirated T) but this peculiar 'TR' representation of an 'S' sound always uses the most common /Th/ consonant. This means that when an English word like 'TRAVEL' is transcribed into Thai it might look like it should be pronounced "savel"!

"Ordinary accident" translates to อุบัติเหตุธรรมดา but this writing has THREE distinct weirdnesses of which I'll mention just one: In the doubled 'RR' which I've painted red, neither R is pronounced. Instead the doubled R is just a schwa vowel.

2. Ambiguous division into words or syllables
Spaces are left between sentences, but usually not between words. Of course some other written languages share this problem.
The soft drink "Sprite" is written as สไปรท์ which has TWO oddities. The first oddity is that the word must be written as two syllables since triple consonant blends are not allowed. I've reddened the vowel, so a straight transliteration would be "SIPRT."

The words for "time" and "empty" are เวลา and เปล่า with transliterations of /we-la/ and /plao/ respectively. The words are written almost identically, but two vowels (shown in red) produce two syllables in the first word, but a single "vowel blend" in the second word.
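Why missing word spaces make division ambiguous can be shown with a toy segmenter. This is a minimal sketch that uses an English string and a five-word lexicon invented for the illustration; it is not a Thai tokenizer, and real Thai segmentation needs a full dictionary or a statistical model.

```python
def segmentations(text, lexicon):
    """Return every way to split `text` into a sequence of lexicon words."""
    if not text:
        return [[]]
    results = []
    for end in range(1, len(text) + 1):
        word = text[:end]
        if word in lexicon:
            # Keep this word and try to segment whatever follows it.
            for rest in segmentations(text[end:], lexicon):
                results.append([word] + rest)
    return results


# Hypothetical toy lexicon, chosen so that two readings exist.
LEXICON = {"god", "is", "now", "here", "nowhere"}

for words in segmentations("godisnowhere", LEXICON):
    print(" ".join(words))
```

The same unspaced string yields two valid readings, so a reader (or a program) has to use context to pick one.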

3. Positioning of Symbols
A Thai vowel can appear before, after, above or below the consonant it's associated with. A syllable can have BOTH an above-vowel AND a tone mark, so the tone mark will need to go ABOVE the above vowel. (Example of this in the image below.) Adding further confusion, some consonants extend upward to where above-vowels go; the most sophisticated type-setting software will need to relocate or resize the vowel. Some rarish consonants extend below the base-line: again humans or top-notch type-setting software may relocate below-vowels in those cases; ordinary software just smudges them together. (European scripts sometimes need to use white space above the ŇŐRMĂL tops of letters.)

ออูอึ ฐฐูฐึ ฟฟูฟึ ... พึฟึ
Three different consonants are shown, each with no vowel, a below-vowel and an above vowel. In the second set it looks like the rendering software has gone back in the second case and replaced the consonant. (Notepad does NOT do this.) In the final comparison, it looks like the above-vowel is shifted left to bypass the stroke of an oversize consonant. (Notepad doesn't do this either.)

4. Old Rendering of Thai on Computers
With today's fancy Unicode fonts and "negative kerning" etc., rendering Thai script is easy, but in the 1980's it was difficult. Old PC monitors allowed 25 lines of English text, but Thai implementations sometimes offered just 8 lines of Thai text. Each line of Thai text used a sub-line for the above-vowels and tone marks, another sub-line for the consonants, and a third sub-line for the below-vowels. Eight lines was just too few, so some implementations offered 12 lines of Thai text, with the above-vowels sharing a sub-line with the below-vowels from the previous line. A single character couldn't combine both the above- and below-symbols, so IIRC spaces were inserted as necessary to avoid such collisions.
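The 8-line and 12-line figures are consistent with simple row arithmetic, assuming each sub-line occupies exactly one of the 25 text rows. That assumption and the function names below are mine, not something the old implementations documented.

```python
SCREEN_ROWS = 25  # text rows on the old PC display, per the description above


def lines_with_three_sublines(rows: int) -> int:
    """Each Thai line gets its own above, consonant and below sub-line."""
    return rows // 3


def lines_with_shared_sublines(rows: int) -> int:
    """Neighbouring lines share a mark row: n consonant rows plus n + 1 mark rows."""
    return (rows - 1) // 2


print(lines_with_three_sublines(SCREEN_ROWS))   # 8
print(lines_with_shared_sublines(SCREEN_ROWS))  # 12
```

With sharing, 12 consonant rows plus 13 mark rows use all 25 rows exactly.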

Old mechanical French typewriters have "dead keys" for the accent marks. The same approach is used on Thai typewriters, and Thai computer keyboards. But IIUC French or German keyboard drivers do NOT operate like old typewriters; the accent is pressed first and the keyboard driver generates a character code for the accented vowel. That wouldn't work in Thai -- there are just too many consonant-vowel-tone_mark combinations to have character codes for each.
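The encoding side of this contrast can be checked with Python's standard `unicodedata` module. The sketch below says nothing about how any particular keyboard driver behaves; it only shows that Unicode has a single precomposed code point for an accented French vowel but none for a Thai consonant + vowel + tone-mark stack.

```python
import unicodedata

# French: "e" followed by a combining acute accent composes to one code point.
french = unicodedata.normalize("NFC", "e\u0301")
print(len(french), [unicodedata.name(ch) for ch in french])

# Thai: consonant + above-vowel + tone mark (the start of the word for "buy")
# remains three separate code points even after NFC composition.
thai = unicodedata.normalize("NFC", "\u0e0b\u0e37\u0e49")
print(len(thai), [unicodedata.name(ch) for ch in thai])
```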

5. Hard-to-Distinguish Symbols

Letters in English are almost all easily distinguished. (Although lower-case i and j can be confused; another exception pair is upper-case I and lower-case L.) The serifs in fonts like this one ABTH ... have no effect; sans-serif fonts work as well: ABTH. But in Thai script, what look like tiny serifs can affect which letter it is. And which letters can be easily confused for each other depends on what font-face is being used.

I've attached a photo I took a few days ago.

The sign reads
rap-sue-saak-rot, which translates word-by-word as
"receive purchase debris wheeled-vehicle"
or
"we buy cars for scrap"

I do not want to sell my Honda for scrap YET, but I would like to replace the RGB display now that the G color is missing altogether.
I don't need G for displaying song lists or such, but I'm really afraid that the missing G will confuse me when using the rear camera for backing up. Honda dealer wants $1000 or thereabouts to replace the screen, so I thought I'd see if a scrap dealer has one.

As you see, the sign has nine letters across (which I denote via RBZOZAGRT) two vowels above, and a tone mark above the second above-vowel.
Next I show the same letters in the default font, along with letters denoted DKChZ for discussion.

รับซื้อซากรถ ... ดค ซช
RBZOZAGRT .. DKChZ
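The count of nine letters, two above-vowels and one tone mark can be reproduced from the Unicode properties of the sign text. A small sketch, assuming the sign is encoded as the twelve code points spelled out below; the variable names are illustrative.

```python
import unicodedata

# The sign text รับซื้อซากรถ written as explicit code points.
SIGN = "\u0e23\u0e31\u0e1a\u0e0b\u0e37\u0e49\u0e2d\u0e0b\u0e32\u0e01\u0e23\u0e16"
# The four Thai tone marks: mai ek, mai tho, mai tri, mai chattawa.
TONE_MARKS = {"\u0e48", "\u0e49", "\u0e4a", "\u0e4b"}

# "Lo" letters take up horizontal space; "Mn" marks sit above or below a letter.
base = [ch for ch in SIGN if unicodedata.category(ch) == "Lo"]
marks = [ch for ch in SIGN if unicodedata.category(ch) == "Mn"]
tones = [ch for ch in marks if ch in TONE_MARKS]

print(len(base), "letters across")
print(len(marks) - len(tones), "vowel signs above")
print(len(tones), "tone mark(s)")
```

For this string the counts come out as 9, 2 and 1, matching the description of the sign.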

Note first that the letter I've labeled Z is almost identical to the Ch shown at right. They don't even differ by a single serif; just by a tiny indentation on the "serif"! The difference is almost invisible; I think when Thais are reading they just subconsciously guess the letter because only one of the possibilities forms a valid word.

Note also that the letters I'm calling G and T both have an indentation near the top of the left leg. In the photo, the indentation on the G is too tiny to be visible, but the indentation IS visible on the T. That's because there's no alternate letter that looks like the G, but without the indentation the T could be mistaken for a D.

I show the D above and you're thinking that it doesn't look at all like the T. That's true of THIS font, but in some fonts the D is simplified to have just a tit at the bottom left instead of the pretty curlicue. Next to the D above is a K, just like the D but concave instead of convex. In some fonts these two are easily confused.

And these examples just scratch the surface of hard-to-read Thai letters.

Politesse · Jan 9, 2025

Orthography is an area of special study for linguists. Many do specialize in examining how language is encoded in written form, but orthographies are a phenomenon independent of oral language, and are not tied to it. Note that one can use any orthography to transcribe any language. Popular scripts such as Latin, Cyrillic, and Chinese are used as the written form of hundreds of languages, not just one each. Conversely, some languages commonly use multiple forms of orthography, as one can readily see by standing on any busy Tokyo street corner and looking up.

So generally speaking, if a linguist says "language", unless it is a sign language they mean its oral form. Or to think of it another way, they mean a system which has the structure and properties universal to human languages: recursivity, indexicality, productivity, and so forth. All spoken languages have these features, whereas no written language can or could have all of them independently of a spoken or signed model.

Swammerdami · Jan 9, 2025

Swammerdami said:
...
Next I show the same letters in the default font, along with letters denoted DKChZ for discussion.

รับซื้อซากรถ ... ดค ซช
~~RBZOZAGRT .. DKChZ~~

Note first that the letter I've labeled Z is almost identical to the Ch shown at right. They don't even differ by a single serif; just by a tiny indentation on the "serif"! The difference is almost invisible; I think when Thais are reading they just subconsciously guess the letter because only one of the possibilities forms a valid word.

Ooops! I scrambled the labels. Should be:

รับซื้อซากรถ ... ดค ซช
RBZOZAGRT .. DKZCh

Jokodo · Jan 16, 2025

Bomb#20 said:
Swammerdami said:

Thanks for the lead! It will take me some time to read the whole paper, but the title ("The rise and fall of a consensus") shows the theme. Long ago, it was assumed that "savage" or "barbarian" cultures had primitive languages. Complex societies had complex languages. ...

Anyway, savage people with complex languages also turned up and by the early 20th century, the meme of equal complexity may have arisen as a counterfactual anti-racism, which today might be called "wokeism"! ...

While I don't doubt that the meme arose historically as a long-overdue reaction against the savages-had-primitive-languages meme, it seems to me at this point there's intellectually more to it than that. The negation of "Language complexities are equal." is "There exist languages X and Y such that Complexity(X) > Complexity(Y)."; the point of saying the former is to dispute the latter. And for any language L, Complexity(L) = Summation_{[i in language aspects]} complexity_i(L) * weight_i . So I'd take "Language complexities are equal." to be a TLDR executive summary of the view that:

The claim that "There exist languages X and Y such that Complexity(X) > Complexity(Y)." has burden of proof, but all those weight_i coefficients appear to be subjective. So it can't meet its burden of proof unless you can exhibit either an objective way to choose the weight_i coefficients, or else a pair of languages X, Y such that For All i, complexity_i(X) > complexity_i(Y). Good luck with that.

Thanks for a very good explication!
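The weighted-sum argument can be made concrete with a few lines of Python. Everything in this sketch is hypothetical: the two "languages", the three aspects, the per-aspect scores and both weightings are invented purely to show how the ranking depends on the weights.

```python
def total_complexity(scores, weights):
    """Weighted sum over aspects: sum of complexity_i(L) * weight_i."""
    return sum(scores[aspect] * weights[aspect] for aspect in weights)


def dominates(x, y):
    """True only if x scores strictly higher than y on every single aspect."""
    return all(x[aspect] > y[aspect] for aspect in x)


# Made-up per-aspect scores for two made-up languages.
X = {"morphology": 8, "syntax": 3, "phonology": 4}
Y = {"morphology": 2, "syntax": 7, "phonology": 5}

# Two equally arbitrary choices of weights.
weights_a = {"morphology": 5, "syntax": 1, "phonology": 1}
weights_b = {"morphology": 1, "syntax": 5, "phonology": 1}

for name, w in [("weights_a", weights_a), ("weights_b", weights_b)]:
    print(name, "X:", total_complexity(X, w), "Y:", total_complexity(Y, w))
print("X dominates Y:", dominates(X, Y))
print("Y dominates X:", dominates(Y, X))
```

Under one weighting X comes out "more complex", under the other Y does, and neither dominates the other aspect by aspect. That is the situation in which the claim Complexity(X) > Complexity(Y) cannot meet its burden of proof.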

Jokodo · Jan 16, 2025

Swammerdami said:
PRONOUNS

Pronouns were proposed as an example of Thai's alleged complexity. I think its pronouns instead show simplicity!

Many words ("Uncle", "Child", "Sarge", "Doctor" etc. etc.) can play the role of pronouns, and operate as 1st-, 2nd- or 3rd-person. In English we might say "Doctor, what medicine do you recommend?" "Doctor" is a noun here but what role does it play in the sentence? (How do you diagram it?) In Thai ("Doctor recommends medicine which?") the "Doctor" functions as a simple subject, and almost as a pronoun.

One couple speak Thai to each other exclusively, but use English "Babe" as the "You" pronoun. Many couples use เอง (/eng/, 'oneself') as the "You" pronoun for each other. Some pronouns come in pairs: Couples that use /eng/ for "You" will use เขา (/khao/, 'he/she') as the "I" pronoun when speaking to each other.
Cute! But complicated? I don't think so.

But of course you cannot justify your exclusion of those features from a cross-linguistic complexity metric. It's just a gut feeling that they "shouldn't count". That makes your implicit metric subjective and thus outside of the domain of linguistics as a science.

I could just as easily argue that nominal case actually makes a language simpler, so the Slavic languages for example with 6-7 cases would be, by that measure alone, simpler than English or Thai. Here's my argument: In Serbian, the proposition that Budimir recommended Stojan to Lazar can be expressed by pretty much any order of the words "Budimir(nominative)", "recommended(masculine singular past participle)", "Stojan(accusative)", and "Lazar(dative)"; the only thing that stays put is the auxiliary verb. Some orders are more neutral than others, others might be used to direct the attention to one or the other participant or to answer a specific question, etc., or carry a connotation of something else unsaid additionally being the case, but all of them are unambiguous as to who was recommended, who was doing the recommending, and to whom the recommendation was addressed.

Budimir je Lazaru preporučio Stojana. (I guess this is "the" neutral order; I'm not a native speaker but speak the language at a level comparable to my English. It's actually hard to say, because a lot of orders are quite natural with only subtle context shifts.)
Stojana je Lazaru Budimir preporučio.
Preporučio je Lazaru Budimir Stojana.
Lazaru je Stojana preporučio Budimir.
(and literally all or nearly all the others)

Now that's simple. Just remember to put the endings on the objects and whoever you're most concerned about or think offers the best anchor to the previous dialogue can be the first to pronounce, and so on until everyone has had their mention.

In English and I assume similarly in Thai, you have to stick to a very rigid order that arguably breaks the flow of thought. If you do want to put a different participant in the foreground, you have to say things in a roundabout way, as in "It was to Lazar that Budimir recommended Stojan". All of these are complexities that Slavic languages don't have to deal with, and I assume it only gets worse for Thai because English at least has a way to liberally use sentence prosody for some in situ highlighting, something that's much more restricted in a tonal language. Not impossible, but requiring more subtle, complex, cues!

Jokodo · Jan 17, 2025

Swammerdami said:
It is a dogma of linguistics that "all languages are equally complex," but have different ways of being complex. For example, English lacks the inflections of French, but has complexity in the form of phrasal verbs ("look up" = research, "look out" = beware, "give up" = accept defeat, "come across" = encounter) and other idioms that must be memorized ("by yourself", "in person"). Some languages have weirdnesses like evidential markers (though one of those, Eastern Pomo language, seems complex in other ways as well).

I am NOT a linguist (though some of the topics intrigue me enough that by now I've read dozens of books or papers), and only know three languages (English, French, Thai). But I just do NOT see how Thai can be called "complex." We have 2 or 3 linguists here and I hope they can show me what I'm missing. Or refer my question to a linguists' message-board.

(I know ZERO Chinese but have the impression its similarities to Thai may make it an exception also.)

I do feel that I know Thai well enough to point to what might be considered complexities:

The written language. Thai has about 65 writing symbols compared with English's 26. And it has weirdnesses: 'TR' becomes an 'S' sound, and 'RR' becomes a schwa! But I think linguists and this complexity meme are focused solely on the spoken language, right?

Classifiers. English also has special words to deal with "uncountable" nouns: heads of cattle or lettuce, etc. The fact that ALL nouns in Thai are treated, in a sense, as "uncountable" is almost a simplification.

Pronouns. Thai uses a variety of pronouns depending on social status or personal relationship. But, like English, it also often uses ordinary words ("Sarge," "Doctor," "Dad") when a pronoun might be used.

Consecutive verbs. I've previously posted an example Thai sentence of 13 words: 1 pronoun followed by 12 verbs! Again, this seems like simplification. One could construct such a sentence in English but connectors like "and," "to," etc. would be needed.

Sentence-ending particles. Thais often end a sentence with a syllable or two showing mood or status. English does this also: "Sir", "hmmm."

Tones. Again, English also uses tones, but in a more complex way. "Isn't she pretty" has different meanings depending on where the high-tone is placed.

That last point actually leads us to a prime argument about how ill-defined the question "is language X or Y more complex?" really is.

Not only would comparing them require a weighting of individual aspects, which is arbitrary, but sometimes it is not even clear what is and isn't a feature that should be included. For example and building somewhat on my last post, if a language requires hearers to heavily rely on sentence prosody or what you call "tone" in a language like English to disambiguate examples that are left unresolved by context and syntax and morphology proper, is that a linguistic skill and thus adds to the complexity of the language? Or is it a non-linguistic skill, an extension of general cognitive principles of attention direction? In that case a language that heavily relies on them is more cumbersome to use precisely because it *lacks* complexity elsewhere, in narrow syntax which could have produced unambiguous results in another language. So you have a preference for one or the other choice and, more importantly, can your justify it? I don't and I can't, and I actually dabbled in studying specifically the interface of syntax/pragmatics/prosody, back in another life when linguistics was my day job, so if you do, I'd be very interested!

It only gets worse from here. Let's agree for the moment that sentence prosody is not strictly linguistic. But while you perceive it as a feature of English, surely Thai also employs it, only that it's somewhat constrained by the need to express lexical tone. Surely the way tone and prosody interact is part of linguistics even if sentence prosody as such isn't? So that would mean an entire aspect of complexity in Thai that cannot even be defined in English or Serbian! By only counting aspects that can be defined in both languages you'd be cheating massively on Thai!

And it gets worse still. Even if we do agree that Thai tone+restricted prosody and English prosody (+word accent rules) are essentially two sides of the same coin, I have no idea how you would even start to compare their complexity. So sometimes we don't even get to the question of how we should weight per-aspect scores; we fail at assigning scores!

Anyone who wants to claim that languages differ in complexity in any meaningful sense needs to answer these questions and convince the rest of us that his choice of answers is more useful than many alternatives that would produce different results for understanding the reality we live in.

"Doesn't seem all that complex to me" may be a good starting point, but it only takes you about 0.16% of the way.

Thai has no obligatory markers for tense, number, gender. Even the non-obligatory markers are ambiguous. A word for 'already' and a verb for 'to obtain' are perhaps the most common past-tense markers, but the latter is also used to mean 'to be able to.'

What am I missing? Surely professional linguists are well aware of Thai; in what way can it uphold the meme (which linguists seem to treat almost as an axiom) that "all languages are equally complex"?

To the contrary, the language is so simple that ordinary conversations are often full of ambiguity. I've witnessed examples of this as a bystander. Sometimes what would be a single statement in English necessitates extra turns. For example the English statement "We're going to see a movie" might become "Go see movie"; "Go already?"; "No, we go afternoon."

You might just as well overhear two English speakers say something like "Let's go to the movies!" - "Cool, I'll get my coat!" - "No, I meant sometime later this week!".

In any human language, a ton of sentences are technically ambiguous, but 90% of the time we resolve the ambiguity without even noticing it, because the linguistic and general context of the utterance makes all but one interpretation contradictory, or so unexpected that we'd expect the speaker to explicitly confirm if that's what they meant, or even just not a particularly useful thing to say. When more than one interpretation obviously survives, speakers often anticipate this by adding qualifiers. For the most part, actual misunderstandings or the need for clarification only arise when speakers assume some additional context as shared/given that the hearer isn't aware of.

You probably only noticed this particular ambiguity *because* you felt it could be avoided in English. All the other times you overheard Thais successfully communicate some circumstance that would likely have resulted in an unresolved ambiguity in English, using some subtle linguistic tool unavailable in English, you didn't notice because speakers are generally (across languages) so efficient in ambiguity resolution that it's the failures, not the successes that stick out!

Jokodo · Jan 17, 2025

Swammerdami said:

Another example that has been cited to support the idea of complexity trade-offs is that Chinese and typologically similar languages, which have a simple (isolating) morphosyntax and individual morphemes that are multiply ambiguous, tend to have both constructions situated at the intersection between the lexicon and productive syntax like classifiers, reduplication, compounding, and verb serialization(see Riddle 2008 for Hmong, Mandarin, and Thai) and complex rules of inference and complex rules interfacing form and meaning (see Bisang 2009 for Khmer, Thai, and late Archaic Chinese).

I am interested in the phrase I've reddened and will try to obtain Bisang 2009, “On the Evolution of Complexity: Sometimes less is more in East and mainland Southeast Asia”.

The paper I seek is a chapter in a book whose abstract begins

This book presents a challenge to the widely-held assumption that human languages are both similar and constant in their degree of complexity. For a hundred years or more the universal equality of languages has been a tenet of faith among most anthropologists and linguists. It has been frequently advanced as a corrective to the idea that some languages are at a later stage of evolution than others. It also appears to be an inevitable outcome of one of the central axioms of generative linguistic theory: that the mental architecture of language is fixed and is thus identical in all languages and that whereas genes evolve languages do not. Language Complexity as an Evolving Variable reopens the debate.

Bisang has a page listing his papers but my target is not available for download. Instead I downloaded his "Grammaticalization and the areal factor: The perspective of East and mainland Southeast Asian languages" and make the following brief comments:

Bisang claims that Thai and other languages have a "high degree of grammaticalization" (with classifiers and rigid word order given as examples of grammaticalization). Does this imply "complexity"? Let's not go too far down this rabbit-hole, but here are comments on, and an example of, the use of classifiers:

The process of classification can be used to profile conceptual boundaries of concepts. Due to this function, classification is the basis of the two main functions involved with numeral classifiers, i.e., identification and individuation. On the one hand, classification helps identifying a certain sensory perception by using its conceptual boundaries to highlight that perception against other sensory perceptions. On the other hand, it can establish a sensory perception as an individual item by actualising its salient inherent properties which constitute it as a conceptual unit.

What does that even mean? After a while he gives an example Thai sentence translated into English as

"I like THIS car, I don’t like THAT car."
The words translated as 'THIS' and 'THAT' are preceded by ... classifiers! The complexity! The complexity!

Does rigid word order represent "complexity"? Bisang offers an 8-word Thai example sentence and translates it into English:

"He took the luggage down for me."
The ordering of the 8 words admits no deviation. Does a rigid word order represent complexity??

I played with this example a little, discovering that Google Translate would produce the first 7 words of Bisang's 8-word Thai sentence when presented with

"He brought the bag down for me."
เขาเอากระเป๋าลงมาให้ฉัน [แล้ว]
which translates word-for-word as

He take bag descend come give me [already]
In brackets I've appended 'แล้ว/already', a tense marker Google omits.

SUMMARY: Many top linguists agree that the thread question ("Are some languages more complex than others?") should be answered Yes. Those who say otherwise are largely parroting (out of "political correctness"?) a dogma from decades ago.

Sorry, but that summary clearly didn't follow from anything you've said or quoted in this thread.

Complexity doesn't stop being linguistic complexity because you'd rather not count it. In fact your refusal to count rigid word order is almost comical. Rigid word order is one of the most unambiguous ways in which language can be complex. The vocabulary? That's a very transient property of languages; languages change words like we change shirts. Prosody that helps disambiguate sentences left ambiguous by relatively free word order? We first have to decide if that's fully linguistic. But word order??? You can't really get out of jail free by claiming the order corresponds to the natural order of thoughts, because languages that require rigid word order differ widely in which exact orders they require, and the data to say that some of them are easier to learn than others doesn't exist!

If a language that allows you to let the words tumble out of your mouth in whatever way the concepts happen to enter your mind, and requires speaker and hearer to rely on non-linguistic or para-linguistic tools to negotiate a shared interpretation, isn't (in that respect) *clearly* less complex than one that requires them to share a large set of rules that allow them to construct those and only those sentences the language allows and/or a large set of word order templates, the concept of linguistic complexity loses all the usefulness it ever had!

At this point you're just showing that when you get to ignore the ways some languages are complex, they turn out to be less complex than others. Exactly as predicted by the hypothesis that languages are complex in different ways and there's no meaningful global metric of comparison. You're not hurting the position you are ostensibly arguing against by demonstrating the correctness of its predictions!

Swammerdami · Jan 17, 2025

I did find the multi-author book Language Complexity as an Evolving Variable on-line. It has 19 chapters, including "An interview with Dan Everett" regarding his controversial claim mentioned upthread. (This chapter is also available at daneverettbooks.com.)

Chapter 8 "Linguistic complexity: a comprehensive definition and survey" adopts a specific definition of grammatical complexity, and pursues this quantitatively. She concludes that complexity DOES vary, and ranks 68 languages: Basque is lowest at 13.0, Mandarin at 14.6, German at 16.0, Thai at 16.4, Russian at 17.2, Ingush highest at 27.9. I am suspicious of her strict arithmetic approach.

A definition of grammatical complexity can be based on the usual understanding of a complex system as one consisting of many different elements each with a number of degrees of freedom. In addition, for languages, what is probably relevant is not just the number of elements, etc., but the amount of information needed to describe them, as discussed below.

I think most of the chapters will be more interesting than Chapter 8, but especially important for our thread topic is Chapter 3: Bisang's "On the Evolution of Complexity: Sometimes less is more in East and mainland Southeast Asia". He contrasts "overt complexity" of morphosyntactic patterns with neglected "hidden complexity"; the latter "will be the main topic of this paper."

Bisang said:
Human speech encoding is by far the slowest part of speech production and comprehension – processes like pre-articulation, parsing, and comprehension run at a much higher speed. This bottleneck situation leads to an asymmetry between inference and articulation, which accounts for why linguistic structures and their properties are subject to context-induced enrichment: inference is cheap, articulation expensive, and thus the design requirements are for a system that maximizes inference. (Levinson 2000: 29) Given this situation, linguistic structures somehow need to keep the balance between expensive articulation or explicitness and cheap inference or economy.
...
Overt complexity reflects explicitness: the structure of the language simply forces the speaker to explicitly encode certain grammatical categories even if they could easily be inferred from context. Hidden complexity reflects economy: the structure of the language does not force the speaker to use a certain grammatical category if it can be inferred from context. Linguists dealing with overt complexity see it as the result of a historical development in terms of grammaticalization processes. [For example] In McWhorter’s view, creole grammars are characterized by a comparatively low degree of complexity because these languages did not have the time that is needed to produce more elaborate grammatical systems.
...
East and mainland Southeast Asian languages are characterized by a rather limited degree of coevolution of form and meaning and by almost no obligatory marking of grammatical categories. As a consequence, they are more open to economy and attribute a greater importance to inference. The absence of the overt marking of a grammatical category or of a construction-indicating marker does not imply that the speaker may not want to express it – he may simply have omitted it due to its retrievability from context.
...
If one understands grammatical structures as at least partially motivated by the competing forces of economy and explicitness, overt complexity and hidden complexity are both to be expected.
...
In the languages of East and mainland Southeast Asia, grammaticalization is expressed by phonetic erosion in terms of duration and vowel quality rather than by morphological reduction. [T]his is due to two very strong factors: the discreteness of syllable boundaries, and phonotactic constraints. Phonetic erosion is often limited to the reduction of tonality. McWhorter (2005: 12) describes tonality as “the result of a long-term change”, i.e. as a clear indicator of maturity.

As an example of Thai's "hidden complexity", Bisang cites the word /daai/, a helping verb used with different meanings, with the listener deducing which meaning from context.

He also claims that Thai's classifiers have multiple uses and this is a "complexity." As an example he notes that (the equivalent of) "car this" is ambiguous; car could be singular or plural. But "car <classifier> this" is usually(!) singular.

I am NOT convinced that this "hidden complexity" is really an example of what is meant by "language complexity." Certainly inferences are also often needed in languages with high "overt complexity."

From a review of Chapter 2:

David Gil, "How Much Grammar Does It Take to Sail a Boat?" (pp. 19-33), argues that a minimal language (no morphology, no stem-class oppositions, semantic composition by "association") is generally adequate for human communication, and that some languages such as (semidiglossic colloquial) Riau Indonesian attain such minimality in conversational stretches and approximate it elsewhere. Other languages are much more complex (e.g., have cascading syntactic hierarchies), but such complexity is "hugely dysfunctional" insofar as such a language "forces us to say things that we don't want to say" (p. 32). The implication of the papers by Nichols, Dahl, and Gil is that languages can differ greatly in overall overt morphosyntactic complexity.

One thing ALL of us can agree on is that "complexity" is ambiguous, and different people will have different interpretations of it.

Anyway, I continue to feel that Thai is NOT complex. Weird pronouns? French now uses the 3rd-person pronoun "on" as 2nd- and 1st-person. Sentence-ending particles? I showed that English has similar sentence-enders, though used less frequently. The sentence with 12 consecutive verbs? The verbs are action verbs spoken in chronological order of the events described.

But actual ambiguity in conversations where sentences are TOO simple seems relevant. I do NOT include my own confusions; I refer to conversations between native speakers. (Sometimes I'm confused by an utterance, ask a native speaker to clarify, and am told she doesn't understand the utterance either!)

Two examples I personally overheard were, in word-by-word translation:
(1) "they apply how-many person?" Listener responded "as many as wish to, but only two will be selected" while speaker wondered how many DID apply.
(2) "Date 15 inject medicine where?" The nurse asked about a future vaccination and was confused when the reply was about a past vaccination.

Jokodo · Jan 17, 2025

I'm sure Thai has means to make both of these questions unambiguous with particles, adverbs, or some slight rearrangement. The ambiguity persisted not because Thai doesn't have the means to make it go away, but because the speaker didn't anticipate that it would be necessary, based on making unwarranted assumptions about the hearer's context frame. *Exactly* like misunderstandings arise in English.

English may force you to encode some of that information whether you think it's relevant or not, but so what? There's plenty of information that we don't habitually communicate in English unless we think it's relevant, while other languages, in some cases maybe including Thai, always require it, or force you to go to great lengths to conceal it!

In many languages of the Middle East and to some extent Southeastern Europe, there are no neutral words for "aunt" or "uncle", in some cases even "brother" or "sister". If you want to talk about your mother's brother in Serbian, that's your "ujak", and his wife your "ujna"; your father's brother and his wife are your "stric"/"strina"; and your parents' sisters and their husbands are "tetka"/"tetak". For a speaker of such a language, the inherent ambiguity of English "My aunt gifted my brother a book for Christmas" is every bit as confusing as you find the ambiguity in the above Thai examples. That doesn't mean English is "too simple", does it?

Or for an example closer to your Thai example: a nurse might ask in English "where did you get your last shot?" and be surprised when the answer is "at my local health centre" when they expected something like "in the right shoulder".

Jokodo · Jan 17, 2025

Swammerdami said:
I did find the multi-author book Language Complexity as an Evolving Variable on-line. It has 19 chapters, including "An interview with Dan Everett" regarding his controversial claim mentioned upthread. (This chapter is also available at daneverettbooks.com.)

Chapter 8 "Linguistic complexity: a comprehensive definition and survey" adopts a specific definition of grammatical complexity, and pursues this quantitatively. She concludes that complexity DOES vary, and ranks 68 languages: Basque is lowest at 13.0, Mandarin at 14.6, German at 16.0, Thai at 16.4, Russian at 17.2, Ingush highest at 27.9. I am suspicious of her strict arithmetic approach.

You are what??? The claim that language A has more complexity than language B is intrinsically a quantitative claim. It can either be demonstrated to be true or false with a strictly arithmetic approach, or not at all. Those are the only two possibilities. Pick one!

If you toss out arithmetic, you're leaping into the emptiness of outer space.

(What gave her away though? Was it the fact that your oh-so-simple Thai lands midway between German and Russian that made you suspicious?)

Swammerdami · Jan 17, 2025

Jokodo said:
Swammerdami said:

I did find the multi-author book Language Complexity as an Evolving Variable on-line. It has 19 chapters, including "An interview with Dan Everett" regarding his controversial claim mentioned upthread. (This chapter is also available at daneverettbooks.com.)

Chapter 8 "Linguistic complexity: a comprehensive definition and survey" adopts a specific definition of grammatical complexity, and pursues this quantitatively. She concludes that complexity DOES vary, and ranks 68 languages: Basque is lowest at 13.0, Mandarin at 14.6, German at 16.0, Thai at 16.4, Russian at 17.2, Ingush highest at 27.9. I am suspicious of her strict arithmetic approach.

You are what??? The claim that language A has more complexity than language B is intrinsically a quantitative claim. It can either be demonstrated to be true or false with a strictly arithmetic approach, or not at all. Those are the only two possibilities. Pick one!

If you toss out arithmetic, you're leaping into the emptiness of outer space.

(What gave her away though? Was it the fact that your oh-so-simple Thai lands midway between German and Russian that made you suspicious?)

You've got nothing but banal insults for me. Please just set me to Ignore, or I'll do the vice versa.

Thanks in advance.
