Language as a Clue to Prehistory

lpetrich · Nov 25, 2023

Then Vaclav Blazhek reviewing E.J. Michael Witzel: "The Origins of the World’s Mythologies" - he mainly concerns himself with linguistic features.

The Gondwanan mythological motifs are mostly typical of people speaking Khoisan, Congo-Saharan (Niger-Congo + Nilo-Saharan), Indo-Pacific, and Australian languages. Also of Dravidian and Austronesian speakers, likely as a substratum influence, since Dravidian is in Nostratic and Austronesian in Austric, both Borean, and Borean speakers typically have Laurasian ones.

Indo-Pacific: New Guinea, Andaman Islands, etc.

Finally, Michael Fortescue: "Language Relations across Bering Strait: Reappraising the Archaeological and Linguistic Evidence."

Proposing a relationship between Uralic and Yukaghir, and also Chukchi-Kamchatkan and Eskimo-Aleut, then connects them as Uralo-Siberian. He notes absence of initial r while initial l is common. That is also typical of Tungusic, and is reconstructed for Altaic.

lpetrich · Nov 25, 2023

Mother Tongue 20: Vaclav Blazhek: "Cushitic and Omotic personal pronouns in Afroasiatic perspective" - the pronoun systems of the AA member families are good evidence of shared ancestry, though it's hard to point to much else. Note the rival reconstructions of Christopher Ehret, and of Vladimir Orel and Olga Stolbova.

VB also included some diagrams of family trees of AA and its members.

For AA as a whole, there is a big difference of opinion on when Omotic split off. Either the earliest or from Cushitic after it split off. Omotic's internval divergence: 7,000 - 5,400 BCE.

Turning to the others, Cushitic split off at 10,000 BCE, at the beginning of the Holocene and the invention of agriculture. Its internal divergence: 6,500 BCE. Semitic split off at 9,000 - 7,700 BCE, and its internal divergence is at 4,500 - 3,800 BCE. Egyptian split off at 7,700 - 7,300 BCE, and Berber and Chadic split at 6,000 - 5,900 BCE, with Berber diverging at 1,500 - 680 BCE and Chadic a 5,400 - 5,100 BCE.

Has Alexander Militarev's lexicostatisics-based family tree of Semitic, and also L. Kogan's more traditional tree based on grammatical features. Some other lexicostatistics: Bayesian phylogenetic analysis of Semitic languages identifies an Early Bronze Age origin of Semitic in the Near East - PMC by Andrew Kitchen et al.

AM:

(root) 4300 BCE - S Arabian 500 BCE - N Semitic
N Semitic 3700 BCE - Akkadian - NW Semitic
NW Semitic 2800 BCE - Ethiosemitic 1000 BCE - C Semitic
C Semitic 2500 BCE - Arabic - NW Semitic
NW Semitic 2200 BCE - Ugaritic - NW Semitic A
NW Semitic A 1900 BCE - Aramaic - Canaanite
Canaanite 1300 BCE - Phoenician - Hebrew

AK:

(root) 3750 BCE - Akkadian - W Semitic
W Semitic 3400 BCE - C Semitic - S Semitic
S Semitic 2650 BCE - S Arabian 50 BCE - Ethiosemitic 800 BCE
C Semitic 2450 BCE - Arabic (difficult to pin down) - NW Semitic
NW Semitic 2050 BCE - Ugaritic - NW Semitic A
NW Semitic A 1500 BCE - Hebrew - Aramaic

LK agrees with AK, aside from being more flattened in places.

There is a possible archeological correlate with the NW Semitic divergence: the arrival of the

Hyksos in Egypt. They were Canaanites who settled in the Nile Delta starting in 1800 BCE. So if they spread southward, they may also have spread northward.

Swammerdami · Nov 25, 2023

lpetrich said:
That means that other Neolithic European farmers likely spoke Dene-Caucasian langs.

Why? While Cardial Ware (along with proto-Basque language and Y-haplogroups like G-PF3147) spread to Spain via Eastern Mediterranean adventurers, Linear Ware spread up the Danube via cultural diffusion. Prior to the arrival of Indo-Europeans, West Central Europe had mostly the I2 Y-haplogroup -- the haplogroup of European hunter-gatherers probably associated with Solutrean^* culture. Am I correct that there is little evidence of I-E languages borrowing from proto-Basque?

* - Am I not correct that "Solutrean" refers to specifically a European paleolithic culture? Yet Googling it now, most of the hits are to the obscure "Solutrean hypothesis" associated with North America! Is this yet another example of how Internet information degrades as Google (et al) get "smarter"?

lpetrich · Nov 25, 2023

Vitaly Shevoroshkin: "Notes on Anatolian languages" - more of his work on Milyan, like cognates of its vocabulary.

EJ Michael Witzel: "The Central Asian substrate in Old Iranian" - he'd previously worked on evidence of substrate vocabulary in Sanskrit.

Ilia Peiros: "What is hidden under the “Uralic-Yukaghir” label?"

Are the similarities in vocabulary due to ancestry or to borrowing? Author Ilia Peiros concludes that Yukaghir is closest to Uralic in Nostratic.

Pierre J. Bancel, John D. Bengtson, and Alain Matthey de l'Etang: "A Universal Proto-Interjection System in Modern-Day Humans"

Discussing "hmmm..." and similar utterances. They are cross-linguistic and cross-cultural, suggesting that they are universal for us.

However, several of their features that have been highlighted above tend to push them back to a stage anterior to articulate speech. These features are:
(i) Their lacking supraglottal articulation;
(ii) Their non-symbolic functions, and, correlatively,
(iii) Their being restricted to express feelings and states of mind of the emitter;
(iv) Their specific, continuous way of signifying;
(v) Their functional independence from articulate speech (even though they may interact with it);
(vi) The apparently spontaneous nature of their use to express well-being; and
(vii) Their parallels in other mammals, including chimpanzees.

Something like laughter - something we are wired to do.

lpetrich · Nov 25, 2023

Mother Tongue 20 - Sh. Nafikov, G. Yagafarova, G .Karimova, M. Valieva: "The Bashkir Gloss tänäy ‘baby’ and its Interphyletic Correspondences in Other Languages"

Abstract: The article contains some cross-linguistic and largely interphyletic comparanda concerning some putative cognates or parallels to the Bashkir and other Turkic words denoting *baby, infant'. A superficial search for so-called look-alikes taken from various languages of the Amerindian and Austric macrophyla and a study of sources have revealed many parallels which could be accounted for by chance similarities (which we think highly unlikely), by areal diffusion phenomena. or by a primordial unity dating back to ethnolinguistic prehistory.

I think that the authors underestimate the likelihood of a chance resemblance. If one makes one's phonetic and semantic match criteria broad enough, then one will easily get coincidences.

Mother Tongue 21 - "This issue features a state-of-the-art discussion of the taxonomic structure and history of the native languages of the Caucasus region" - leaving out Indo-European, Turkic, Mongolic, and Semitic ones.

Kartvelian (South Caucasian) - including Eurasian Georgian, Megrelian, Laz, Svan
West Caucasian (Northwest Caucasian, Abkhazo-Adyghean) - Abkhaz, Abaza, Circassian (Adyghe, Kabardian), Ubykh (now extinct)
East Caucasian (Northeast Caucasian, Nakh-Daghestanian) - some 30 to 35 langs, with subfamilies Nakh (Chechen, Ingush, ...), Avar-Andian (Avar, Tindi, ...), Lezgian (Lezgi, Tabasaran, ...), isolates Dargi, Lak, Khinalug

Chechen as in Chechnya, a place that Russia rather forcefully suppressed a rebellion.

They typically have gruesomely complicated phonologies, something that makes comparative work difficult.

Are all three of them genetically related, with common ancestry? The "Ibero-Caucasian" hypothesis. Doubtful at best. But that is much more plausible for the two North Caucasian ones.

Kartvelian is in Nostratic as the closest relative of Eurasiatic, and NC has different closest relatives.

lpetrich · Nov 25, 2023

Viacheslav A. Chirikba: "From North to North West: How North-West Caucasian Evolved from North Caucasian"
He's at the Abkhaz State University, Sukhum, Republic of Abkhazia

He proposes that the NWC langs are descended from one with a structure much like the NEC ones, with agglutinative morphology. But something or other happened that made it isolating, with grammar done with separate function words. But its speakers ran the function words into content words, making the language agglutinative again. The NWC langs have lots of verb prefixes and not many noun cases, while the NEC ones have lots of verb suffixes and plenty of noun cases. Another NWC change was absorption of vowel quality into consonants and reduction in vowel distinctions.

Then making an analogy with present-day colloquial French, which has gone a similar route from Latin. For instance,

English translation: "(the fact) that I don't love you"
que je ne t 'aime pas (written version)
que je t'aime pas (spelling of colloquial version)
ke-S-t-em-pa (transcription)
REL-1SG-NOM-2SG-OBL-love-NEG (analysis)

Vaclav Blazhek appreciated VC's contribution, and he noted that VC is a native speaker of Abkhaz, a NWC language.

John Colarusso then noted that some NEC langs have a large number of noun cases for many of them being compound cases: location + motion: "to the top, away from the top, across the top, at the top, ..." Analogous to this, English has compound prepositions like "into", "onto", "out of", "off of".

lpetrich · Nov 25, 2023

Allan Bomhard: "Prehistoric Language Contact on the Steppes: The Case of Indo-European and Northwest Caucasian"
"Evidence will be presented to demonstrate that Proto-Indo-European is the result of the imposition of a Eurasiatic language — to use Greenberg’s term — on a population speaking one or more primordial Northwest Caucasian languages."

With a lot of comparisons, including one that I find rather semantically weak. IE *bheuH- (*bhû-) "to become, to be (perfective aspect)" with Proto-Circassian *baw(a) "to kiss, breathe". Presumably < "to live" < "to exist".

Then a repeat printing of Sergei Starostin's "Indo-European-North Caucasian Isoglosses" - written in 1988, originally in Russian - a reprint because this issue of MT is on North Caucasian. He finds a sizable amount of vocabulary, enough to work out sound correspondences. He finds that several kinds of NC sounds become reduced to much fewer kinds of PIE sounds, an indication of which way the borrowing went.

This could conceivably help resolve aspects of PIE phonology like the voicing of the stops and the nature of the laryngeals, though SS doesn't do any of that there.

Characteristic is the presence among the lexical coincidences of words that are names of domestic animals and plants, terms connected with the raising of animals and the cultivation of plants {in part, the large number of names of body parts of animals), the many names of objects of everyday use, products for feeding, and trade-and-exchange relations. ... At that time the presence among the PNC-PIE isoglosses of a sufficiently large number of names of wild plants and vegetation as well as of terms for fauna such as 'frog', 'fish', and 'weasel' leads to the notion that we have before us evidence not simply of cultural contacts but of substrate relations.

These borrowings must have taken place before the beginning of the PIE breakup -- they include Anatolian forms, and Anatolian was the first to split off.

"In the second place, the PNC dialect from which the borrowings were assimilated into PIE apparently already differed somewhat from the original common North Caucasian language." then pointing out where it differed.

The date: around 5000 BCE

John Bengtson noted some developments after SS's article, like "North Caucasian Etymological Dictionary" and Dene-Caucasian, and Basque as most closely related to NC.

lpetrich · Nov 25, 2023

In Mother Tongue 21's business section, I found this:

Peer Review Committee (Jonathan Morris): There is a need to raise the standard of contributions to Mother Tongue - i.e., not to publish under-researched or badly-written articles, and my suggestion is to establish a peer review committee, perhaps with a chairperson and with a panel covering various areas of expertise.

I concede that I've seen some clunkers in MT, articles that needed copy editing.

Mother Tongue 22 has an obituary of Eric Hamp, an anti-long-ranger but still respected. When Merritt Ruhlen mentioned global etymologies to him, he responded

Remember that Bopp in 1816, & probably Jones, before him, started with morphology. It’s never enough to look for roots; you have to look at totally accountable words & phrases with their morphologies & syntactic markings. Only then are the semantics justified against all formant increments. – That’s what I urge as a goal for cleaning up (or rejecting) these proposed etymologies. Often disappointing, yes, but terra firma. So for me all 25 fail.

I don't think it impossible - one can use vocabulary alone, but one has to use seldom-borrowed and seldom-replaced vocabulary, and from the looks of it, those global etymologies are for those sorts of words. I have problems of my own with it, like too much phonetic and semantic latitude, and too many possible sources for words.

John Bengtson rebutted Jan Henrik Holst on whether one needs every member of a language family to do a good job of reconstructing its protolanguage. JHH noted that that may take a lifetime of work, while JB noted that some notable reconstructions were based on a small number of members. "The earliest Indo-European reconstructions were based mainly on Sanskrit, Greek, and Latin, and were heavily weighted toward Sanskrit."

That's evident in the first version of "The Sheep and the Horses" - August Schleicher's PIE reconstruction was very Sanskrit-like.

lpetrich · Nov 25, 2023

Vaclav Blazhek on Na-Dene Numerals

Na-Dene is ( ( (Athabaskan, Eyak ), Tlingit ), Haida )

1: A, E, T "whole", H "not many"
2: A, E cognates in Sino-Tibetan, Yeniseian, T, H "they don't agree"
The others are also variable, though there are some patterns.
4: 2*2, 2+2
5: "hand"
6: "attained", 2*3, 5+1
7: "next to thumb", 6+1, 5+2, ...
8: 2*4, 5+3, ...
9: 10-1, 8+1, ...
10: "all fingers down", "both hands". ...
I'm hopelessly confused.

Looking across the Dene-Caucasian langs, only *-n- "2" is preserved across a good set of them. That poor preservation is evident in other macrofamilies, like Nostratic and Afrasian.

Are words for numbers grossly unstable when one does not have much that one has to count?

lpetrich · Nov 26, 2023

Mother Tongue 22 - John Bengtson: "Some Notes about Dene-Caucasian"

Indeed, it is obvious to any linguist that the North Caucasian and Na-Dene phonological systems are very similar in some ways, such as the trinary oppositions of glottal/fortis/lenis consonants and the abundant lateral fricatives and affricates. In the current Dene-Caucasian (DC) analysis, these typological similarities are not interpreted as common innovations pointing to a special relationship between Na-Dene and North Caucasian, but rather as archaic retentions of some features of the original DC phonological system in widely separated areas.

Lateral fricatives - a sort of thl or voiceless l sound. Lateral affricates - a sort of tl sound.

After three decades of study of Nikolaev’s etymologies I now think that a fair number (perhaps about half of them) still seem plausible and are borne out by my research. Perhaps about another quarter of them could be valid or promising, but I have not yet been able to verify the Na-Dene and/or North Caucasian data. Some others (perhaps about one fourth), because of clear errors in the data, or implausible phonetic or semantic changes, seem to me to be improbable or simply erroneous.

I like seeing critical sense.

lpetrich · Nov 26, 2023

Swammerdami said:
lpetrich said:

That means that other Neolithic European farmers likely spoke Dene-Caucasian langs.

Click to expand...

Why? While Cardial Ware (along with proto-Basque language and Y-haplogroups like G-PF3147) spread to Spain via Eastern Mediterranean adventurers, Linear Ware spread up the Danube via cultural diffusion.

Prior to the arrival of Indo-Europeans, West Central Europe had mostly the I2 Y-haplogroup -- the haplogroup of European hunter-gatherers probably associated with Solutrean^* culture. Am I correct that there is little evidence of I-E languages borrowing from proto-Basque?

Was it an expanding population? Or did people learn agriculture from neighboring farmers? Genetics research reveals mostly the former, an expanding population. Such a population would have brought its language with it as it spread, though with increasing distance and time, this population would have ended up speaking separate dialects, and eventually separate languages.

This makes Basque the only survivor of the European Neolithic farmers' languages.

* - Am I not correct that "Solutrean" refers to specifically a European paleolithic culture? Yet Googling it now, most of the hits are to the obscure "Solutrean hypothesis" associated with North America! Is this yet another example of how Internet information degrades as Google (et al) get "smarter"?

Solutrean

The Solutrean /səˈljuːtriən/ industry is a relatively advanced flint tool-making style of the Upper Paleolithic of the Final Gravettian, from around 22,000 to 17,000 BP. Solutrean sites have been found in modern-day France, Spain and Portugal.

Solutrean hypothesis

The Solutrean hypothesis on the peopling of the Americas claims that the earliest human migration to the Americas took place from Europe, with Solutreans traveling along pack ice in the Atlantic Ocean.

lpetrich · Nov 26, 2023

As to

Linear Pottery culture it's Neolithic in central Europe, and an offshoot of Balkan Neolithic (Starčevo, Körös, Criș), in turn an offshoot of Greek Neolithic.

Swammerdami said:
Am I correct that there is little evidence of I-E languages borrowing from proto-Basque?

I'm not sure about that, but I quickly found that there are plenty of Spanish words of Basque origin.

List of Spanish words of Basque origin and Spanish Words of Basque Origin | SpanishDictionary.com

Pre-Greek substrate - oodles of words - Notes on some Pre-Greek words in relation to Euskaro-Caucasian (North Caucasian + Basque)

Like this one: halôs ~ halôê ~ halôâ "threshing floor, disk, halo"

I checked JB's Basque and Burushaski comparisons with the North Caucasian dictionary at Starling Etymological Databases - North Caucasian *-ârtla "to thresh", Basque larrain "threshing floor", Burushaski *daltan- "to thresh" < Western Dene-Caucasian *ratla "to thresh"

Palaeolexicon - Pre-Pre-Greek: Traces of a hunter-gatherer substrate in Greek - doesn't really deliver on pre-pre-Greek, though it does have some likely pre-Greek words.

There are similar hypotheses for elsewhere, like the

Vasconic substrate hypothesis and the

Goidelic substrate hypothesis and the

Germanic substrate hypothesis all rather controversial. Germanic words of non-Indo-European origin - Linguistics - Eupedia - very doubtful list. I may have to research it some time.

Swammerdami · Nov 26, 2023

lpetrich said:
Swammerdami said:

lpetrich said:

That means that other Neolithic European farmers likely spoke Dene-Caucasian langs.

Click to expand...

Why? While Cardial Ware (along with proto-Basque language and Y-haplogroups like G-PF3147) spread to Spain via Eastern Mediterranean adventurers, Linear Ware spread up the Danube via cultural diffusion.

Prior to the arrival of Indo-Europeans, West Central Europe had mostly the I2 Y-haplogroup -- the haplogroup of European hunter-gatherers probably associated with Solutrean^* culture. Am I correct that there is little evidence of I-E languages borrowing from proto-Basque?

Click to expand...

Massive migration from the steppe was a source for Indo-European languages in Europe - PMC

Ancient genomes link early farmers from Atapuerca in Spain to modern-day Basques | PNAS

Ancient human genomes suggest three ancestral populations for present-day Europeans - PubMed

Genomic diversity and admixture differs for Stone-Age Scandinavian foragers and farmers - PubMed

Origins and genetic legacy of Neolithic farmers and hunter-gatherers in Europe - PubMed

Was it an expanding population? Or did people learn agriculture from neighboring farmers? Genetics research reveals mostly the former, an expanding population. Such a population would have brought its language with it as it spread, though with increasing distance and time, this population would have ended up speaking separate dialects, and eventually separate languages.

Apples and oranges. Migration from the steppe was mostly AFTER about 3500 BC. The early (Linear Ware) farmers reached the Rhine River BEFORE 4500 BC. Those "early farmers" were, of course, a combination of "cultural diffusion" and "population movement," with emphasis on the former. As I implied this is confirmed by the relevant Y-chromosome: I2 from the European (Solutrean) Paleolithic.

lpetrich said:
* - Am I not correct that "Solutrean" refers to specifically a European paleolithic culture? Yet Googling it now, most of the hits are to the obscure "Solutrean hypothesis" associated with North America! Is this yet another example of how Internet information degrades as Google (et al) get "smarter"?

Click to expand...

Solutrean

The Solutrean /səˈljuːtriən/ industry is a relatively advanced flint tool-making style of the Upper Paleolithic of the Final Gravettian, from around 22,000 to 17,000 BP. Solutrean sites have been found in modern-day France, Spain and Portugal.

Click to expand...

Solutrean hypothesis

The Solutrean hypothesis on the peopling of the Americas claims that the earliest human migration to the Americas took place from Europe, with Solutreans traveling along pack ice in the Atlantic Ocean.

Click to expand...

??? Of course. We all know that. My complaint was against Google algorithms, which retulrn mostly links to the Solutrean hypothesis when I search for "Solutrean." Do you agree that a searcher for "Solutrean" should be presented results about ... Solutrean ?

Swammerdami · Nov 26, 2023

lpetrich said:
Swammerdami said:

Am I correct that there is little evidence of I-E languages borrowing from proto-Basque?

Click to expand...

I'm not sure about that, but I quickly found that there are plenty of Spanish words of Basque origin. List of Spanish words of Basque origin and Spanish Words of Basque Origin | SpanishDictionary.com

Of course. Spanish borrowed from Basque to which it was geographically adjacent.
Obviously I was addressing the claim that other European languages, e.g. Germanic, were adjacent to speakers of Dene-Caucasian.

lpetrich · Nov 26, 2023

Non-Indo-European root nouns in Germanic: evidence in support of the Agricultural Substrate Hypothesis - sust266_kroonen.pdf by Guus Kroonen

Noting a prefixed a that is only sometimes present: Latin alauda "lark" from Gaulish, Proto-Germanic *laiwarikôn (that bird also) > English "lark" and

Then noting root-noun inflection, an inflection without a suffixed vowel, common in the older IE langs, though it tended to disappear in more recent ones.

For "pea", PGmc *arwît- > German Erbse -- Latin ervum "bitter vetch", Greek orobos "bitter vetch", erebinthos "chickpea"

Then Proto-Germanic *gaid- "goat", noting that the IE langs' words for this animal are very regional. That suggests that the PIE speakers did not raise goats very much, if at all, unlike horses or cows or pigs or sheep.

lpetrich · Nov 26, 2023

Mother Tongue 22: Gregory Haynes "Resonant Variation in Proto-Indo-European"

Resonants: (none), w, y, r, l, n, m, laryngeals (h1,h2,h3) -- inserted inside of the CVC base root structure

*ghebh- E "give"
*gheh2bh- > Latin habere "to have"
*ghrebh- > E "grab"
*ghreyb- > E "grip"

GH has numerous other examples of this effect. This work could be extended to main consonants, like
*keh2p- > Latin capere "to take"

Finally, Pierre J. Bancel, Alain Matthey de L’Etang, John D. Bengtson: "The Proto-Sapiens Prohibitive/Negative Particle *MA"

It seems to be very common, but are there others that might also be common? The authors didn't research that, it seems.

Indo-European *ne, prohibitive *meh1, Uralic negative verb *e-, Turkic, Mongolian verb negation *-me, Eurasian Georgian ar, Dravidian alla, ..., Basque ez, Semitic *lâ, Tamazight (Berber) ur, Nakh *ca, Malayo-Polynesian *diaq, Proto-Tai *mi, Navajo dooda, Miskito apia, ...

I'd earlier mentioned

Jespersen's Cycle - a negator gets worn down, then extended: no + thing -> nothing. It may then get worn down again, but with the "thing" part surviving instead of the "no" part.

Swammerdami · Nov 26, 2023

I think that a linguist researching substrates of languages like Germanic, or Celtic or even Latin or Greek will be keen to find cognates of Basque words. The fact that such papers usually do NOT find likely borrowings from Basque in the Germanic Substrate speaks for itself.

lpetrich · Nov 26, 2023

Mother Tongue 23 - the most recent one that ASLIP has published, as of this writing.

It has some retrospectives about Merritt Ruhlen, and an update that he wrote for his book "The Origin of Language" back in 2009. Like

What I would most like to rewrite today is my inadequate discussion in Chapter 1 of the difference between the origin of Language—the language faculty, the ability to speak—and the origin of those languages which now exist. These are entirely different questions, though even today they are still constantly confused.

He has a section on it on the evolution of word order, specifically the unmarked or default order of the subject, verb, and object of a simple sentence with a transitive verb. Much like what he wrote in this paper: The origin and evolution of word order | PNAS He wrote it with Murray Gell-Mann, a physicist who became a historical linguist. So I decided to check that paper's work.

There are six possible ones, and the most common orders are SVO and SOV, with VSO being next, and then VOS, and very rarely, OSV and OVS.

Subject–verb–object word order and

Subject–object–verb word order and

Verb–subject–object word order and

Verb–object–subject word order and

Object–subject–verb word order and

Object–verb–subject word order

Let's start with Indo-European. English is obviously SVO, and that is common in IE. But many other IE langs are SOV, including the older langs, and it is reconstructed for the protolanguage. Likewise, Uralic has some SVO members, but it is ancestrally SOV. Looking at Altaic, Turkic, Mongolian, Tungusic, Korean, and Japanese are all SOV, making Altaic SOV, Turning to the rest of Eurasiatic, Yukaghir, Chukchi-Kamchatkan, Nivkh/Gilyak, and Eskimo-Aleut are all SOV, making Eurasiatic SOV. The rest of Nostratic, Kartvelian and Dravidian, are both SOV, making Nostratic SOV, at least Nostratic without Afrasian.

Turning to Dene-Caucasian, Basque, North Caucasian, Burushaski, Yeniseian, most of Sino-Tibetan, and Na-Dene are all SOV. A notable ST exception is the Sinitic or Chinese languages, which are SVO. So Dene-Caucasian is SOV.

Afroasiatic is a difficult case. Omotic is SOV, Cushitic is SOV or VSO, Chadic is VSO or SVO, Berber is VSO, Egyptian is VSO, and Semitic is ancestrally VSO, though some members have become SVO or SOV. So unlike MR and MGM, I'll put a big fat question mark in front of AA.

Turning to Austric, Proto-Austronesian was VSO or VOS, Kra-Dai is SVO, Austroasiatic: Mon-Khmer SVO and Munda SOV, Hmong-Mien/Miao-Yao SVO. So Austric is likely SVO or VSO.

Turning to Joseph Greenberg's New-World lumping, Amerind, it has all six orders, though MR concludes that it is ancestrally SOV. That makes Borean SOV, though I consider that conclusion weak.

New Guinean languages (Indo-Pacific) are mostly SOV, but Australian ones have all six orders, and like Amerind, MR concludes that Australian was also originally SOV. So Ex-African, as it might be called, was SOV (MR) or uncertain (me).

Turning to Africa itself, Nilo-Saharan is a motley Greenbergian group that has SOV, SVO, and VSO in it, but MR considers it originally SOV. Niger-Congo is mostly SVO except for Mande, which is SOV, and MR argues that NC was originally SOV. Thus, Congo-Saharan, their combined grouping, is in MR's estimate, SOV, and in mine, doubtful.

Since the first split of humanity was into north-central and southern African populations, and since the former one is the ancestor of the ex-African ones, from MR's arguments, Proto-North-Central-Africa was SOV, while I find that conclusion doubtful.

The southern African population is now represented by Khoisan speakers, and Khoisan is a mixture of SOV and SVO.

So while MR concludes that Proto-World was SOV, I think that we have no way of telling from what its descendants were like. In fact, in some of his argumentation, MR seemed to be trying to force that conclusion, or at least to show that ancestral SOV was possible. But that's not a demonstration of ancestral SOV. It's not like Nostratic and Dene-Caucasian being SOV or like Austronesian being VSO/VOS.

lpetrich · Nov 26, 2023

WALS Online - Feature 81A: Order of Subject, Object and Verb - with a map of present-day langs with these features.

Some psycholinguistic effects that may affect preferred word order.

The semantic origins of word order - ScienceDirect

Recent literature has shown that the pragmatic rule ‘Agent first’ drives the preference for S initial word order, but this rule does not decide between SOV and SVO. ... We focus on the role of the verb, and argue that the preference for SOV word order reported in earlier experiments is due to the use of extensional verbs (e.g. throw). With intensional verbs like think, the object is dependent on the agent’s thought, and our experiment confirms that such verbs lead to a preference for SVO instead.

Universals of word order reflect optimization of grammars for efficient communication | PNAS - considering the Greenberg universals. They looked at correlations of head-modifier vs. modifier-head for 8 types of head-modifier combinations:

head -- modifier -- example
adposition -- noun phrase -- to -- a friend
copula -- noun phrase -- is -- a friend
auxiliary -- verb phrase -- has -- written
noun -- genitive -- friend -- of John
noun -- relative clause -- books -- that you read
complementizer -- sentence -- that -- she has arrived
verb -- adpositional phrase -- went -- to school
"want" -- verb phrase -- wants -- to leave

The authors generated a lot of baseline grammars to test their efficiency hypotheses.

We found that the grammars of natural languages are more efficient than baseline grammars and that a large subset of the Greenberg word-order correlations can be explained in terms of optimization of grammars for efficient communication.

As to why that might be, "This is the idea that word order minimizes the average distance between syntactically related words."

Explaining the origins of word order using information theory | MIT News | Massachusetts Institute of Technology

The tendency even of speakers of a subject-verb-object (SVO) language like English to gesture subject-object-verb (SOV), Gibson says, may be an example of an innate human preference for linguistically recapitulating old information before introducing new information. The “old before new” theory — which, according to the University of Pennsylvania linguist Ellen Price, is also known as the given-new, known-new, and presupposition-focus theory — has a rich history in the linguistic literature, dating back to at least the work of the German philosopher Hermann Paul, in 1880.

...
Assuming a natural preference for the SOV word order, then — at least in cases where the verb is the new piece of information — why would the volunteers in the PNAS experiments mime SVO when both the subject and the object were people? The MIT researchers’ explanation is that the SVO ordering has a better chance of preserving information if the communications channel is noisy.

Consider "the dog chases the cat" and "the cat chases the dog". With SOV order, those sentences become "the dog the cat chases" and "the cat the dog chases" -- not as easy to tell apart.

Crosslinguistic word order variation reflects evolutionary pressures of dependency and information locality | PNAS - "We propose that variation in word order reflects different ways of balancing competing pressures of dependency locality and information locality, whereby languages favor placing elements together when they are syntactically related or contextually informative about each other."

lpetrich · Nov 27, 2023

Pierre J. Bancel and John D. Bengtson: "On The Pronoun Roots N '1sg' And M '2sg' In The Native Languages of the Americas And Their Historical Meaning"

This article is a response to Raoul Zamponi’s “First-person n and second-person m in Native America: A fresh look.” He did a very good job examining 172 language families for presence or absence of those pronoun forms. He counted isolates as single-member families.

First-person n and second-person m in Native America: a fresh look | Raoul Zamponi He found 1sg n in 50 of these and 2sg m in 58 of these, both of them in 14 of these. Correlation between N and M presence and absence I found to be statistically significant -- more likelihood of both being present or absent than what one would expect from lack of correlation. The n-m pattern is far from universal, it must be pointed out. In central South America, another pattern is common: i - a.

RZ concludes "But the question of whether these contacts have the form of a hypothetical proto-language or of a convergence between (perhaps few or very few) formerly distinct language branches is not answerable based on just the present-day knowledge of the history of the linguistic families of these areas."

PB & JB then criticize RZ for saying that m-n is mainly a Western North American thing, when his diagrams show it to be present in Gulf Coast North America and in Central America.

They also concede that 1st and 2nd personal pronouns have a very limited set of sounds. RZ's top five: m 19.0%, n 16.1%, k 11.2%, t 8.6%, s 4.3%, ... Looking at phonetic relatives, one finds that they are much rarer. For labials (m), one finds p ~ m 2.6%, b ~ m, p, b 0.9%. For dental/alveolar (n, t, s), d < 0.9%, while for velar (k), ng is 3.7%, and its other phonetic relatives are < 0.9% each. That is very skewed when one compares the distribution of these sounds more generally - p and b are very common.

Franz Boas, already in 1917, may have remarked these phonological gaps in the most frequent pronoun roots. Unable to find any phonetic (articulatory, acoustic or auditory) rationale – since there is none –, he was probably far off the mark when he ruminated

how far the frequent occurrence of similar sounds for expressing related ideas (like the personal pronouns) may be due to obscure psychological causes rather than to genetic relationship. (p. 5, our italics)

A full century later, these psychological causes remain as obscure as they were in Boas’ time, and no new hint has appeared that they should exist at all.

Also, borrowing of 1st and 2nd person pronouns is *very* rare. One example is urban Thai young people borrowing English pronouns, then Chinese ones. But Thai has numerous such pronouns, with usage dependent on social status, so adding a pronoun would not be difficult.

PB & JB suspect RZ's work of being incomplete, with some forms that he missed, and also for not working with plural forms, and dual ones if present. Also possessive and verb affixes, and archaic forms.

Language as a Clue to Prehistory

Contributor

Contributor

Squadron Leader

Contributor

Contributor

Contributor

Contributor

Contributor

Contributor

Contributor

Contributor

Contributor

Squadron Leader

Squadron Leader

Contributor

Contributor

Squadron Leader

Contributor

Contributor

Contributor