Language as a Clue to Prehistory

Gravettian hand stencils as sign language formatives | Philosophical Transactions of the Royal Society B: Biological Sciences - Gravettian: some 33,000 BP in W Eurasia, before the Last Glacial Maximum.

Pantomimic fossils in modern human communication | Philosophical Transactions of the Royal Society B: Biological Sciences - gestures as alternative language.

Constructing a protolanguage: reconstructing prehistoric languages in a usage-based construction grammar framework | Philosophical Transactions of the Royal Society B: Biological Sciences - couldn't figure that one out

Language evolution: examining the link between cross-modality and aggression through the lens of disorders | Philosophical Transactions of the Royal Society B: Biological Sciences
We demonstrate how two linguistic phenomena, figurative language (implicating cross-modality) and derogatory language (implicating aggression), both demand a precise degree of (dis)inhibition in the same cortico-subcortical brain circuits, in particular cortico-striatal networks, whose connectivity has been significantly enhanced in recent evolution.
A lot of derogatory language is rather obviously metaphorical, so those two categories are not very far apart.

Then noting
schizophrenia (SZ), autism spectrum disorder (ASD), synaesthesia and Tourette's syndrome (TS)
as having unusual patterns of (dis)inhibition.
Our proposal is that enhanced cross-modality (necessary to support language, in particular metaphoricity) was a result, partly a side-effect, of self-domestication (SD). SD targeted the taming of reactive aggression, but reactive impulses are controlled by the same cortico-subcortical networks that are implicated in cross-modality.
 
The prehistory of speech and language is revealed in brain damage | Philosophical Transactions of the Royal Society B: Biological Sciences
The arguments put forward provide insights tending to support the motor-gestural model of speech and language evolution.
I couldn't follow that paper very well.
  •  Aphasia - difficulty in understanding or generating language usually due to damage of parts of one's brain.
  •  Expressive aphasia - difficulty in generating language. It often includes agrammatism: difficulty in using inflections, function words, or syntax, while still being able to use content words.
  •  Receptive aphasia - difficulty in understanding language.
  •  Conduction aphasia - difficulty in repeating what one hears or reads.
  •  Mixed transcortical aphasia - "isolation aphasia" - one can repeat language but one has difficulty understanding or generating it.
  •  Transcortical motor aphasia - one can understand and repeat language but one has difficulty generating it.
  •  Transcortical sensory aphasia - one can generate and repeat language but one has difficulty understanding it.
  •  Global aphasia - difficulty in all of understanding, repeating, and generating language.
  •  Anomic aphasia - difficulty in retrieving words, especially nouns and verbs (content words).
    • Word selection anomia - one can describe something without being able to name it.
    • Semantic anomia - one cannot use the meanings of words.
    • Disconnection anomia, including callosal anomia - one can name something presented with one sense but not with another.
    • Articulatory initiation anomia - difficulty in selecting what words one wants.
    • Phonemic substitution anomia - substituting phonemes or words.
    • Modality-specific anomia - for one sense, like vision or touch.
  •  Paraphasia - production of unwanted phonemes, words, or phrases as one attempts to speak.
Not only a split between understanding and generation of language, but also one between phonemes and words/phrases, and also one between content words (lexicon) and function words (grammar).

Seems like evidence of a language instinct, that we are predisposed to use language. Coupled with the universality of language in documented human societies, and its transmission down the generations, that means that we as a species have had language for as long as we have existed as a species.

We as a species have some distinctive features, notably our protruding chins. That suggests origin in a relatively small population that then expanded and spread -- punctuated equilibrium. From genetic research, the earliest split in our species was between southern Africans and all the rest, something that suggests a location in southern to eastern Africa. Behavioral modernity suggests a date of roughly 100,000 years ago, in rough agreement with genetics research.

This ancestral population was likely small enough to have had only one language: Proto-World or Proto-Human or Proto-Sapiens. But whether any of it is reconstructible is another issue.
 
Back to that special issue.

At the boundaries of syntactic prehistory | Philosophical Transactions of the Royal Society B: Biological Sciences
Can language relatedness be established without cognate words? ... We show that not only does syntax allow for comparison across distinct traditional language families, but that the probability of deeper historical relatedness between such families can be statistically tested through a dedicated algorithm which implements the concept of ‘possible languages’ suggested by a formal syntactic theory.
They attempted to test for borrowing of syntactic features:
(i) Areal effects

The possible impact of geographical proximity on syntactic similarity was tested through a Mantel correlation test (as in [90]) run on the numerically well-represented groups (IE, Altaic and Uralo-Altaic).
They apparently did so with present-day or recent locations, and not with homelands. They also failed to check on earlier-documented langs like Latin or Ancient Greek or Sanskrit.
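
For anyone unfamiliar with it: a Mantel test correlates two distance matrices over the same items and gets a p-value by shuffling the rows and columns of one of them. Here is a minimal sketch in Python. The function names (`pearson`, `upper`, `mantel`) and the toy matrices are my own placeholders, not the paper's code or data, and the paper's implementation may differ in details such as the correlation statistic.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def upper(m, order):
    """Upper-triangle entries of symmetric matrix m, with items relabelled by order."""
    n = len(order)
    return [m[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]

def mantel(a, b, permutations=999, seed=0):
    """One-sided Mantel test: returns (correlation, permutation p-value)."""
    rng = random.Random(seed)
    ident = list(range(len(a)))
    flat_a = upper(a, ident)
    observed = pearson(flat_a, upper(b, ident))
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        perm = ident[:]
        rng.shuffle(perm)
        if pearson(flat_a, upper(b, perm)) >= observed:
            hits += 1
    return observed, hits / (permutations + 1)

# Toy data (hypothetical): five languages on a line, with syntactic
# distance rising linearly with geographic distance.
positions = [0, 1, 3, 6, 10]
geo = [[abs(p - q) for q in positions] for p in positions]
syn = [[0.0 if i == j else 0.2 + 0.1 * geo[i][j] for j in range(5)] for i in range(5)]
r, p = mantel(syn, geo)
```

A high correlation with a low p-value would point to an areal effect. Feeding in homeland coordinates instead of present-day ones only means swapping the `geo` matrix.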

For Indo-European, they found clustering
  • 0.205: Germanic, Slavic
  • 0.244: GeSl, Greek
  • 0.277: GeSlGk, Romance
  • 0.296: GeSlGkRo, Indo-Iranian
  • 0.324: GeSlGkRoII, Celtic
That's a big red flag, because it disagrees rather strongly with other work on Indo-European subdivision.
 
There are a lot of books and YouTubes about the evolution of the English language. I've watched a few YouTubes on the RobWords channel. They probably have little value for serious linguists but I found parts interesting. English has 44 distinct phonemes compared with 25-30 for the average language. English is also an outlier for several "weird" grammatical forms. A lot of Middle English lexicon, including pronouns and forms of 'to be', comes from North Germanic.

This last point reminds me of some common misconceptions. The details of the evolution of Old English to Middle English are controversial but most of what I write below is accepted in disparate views.

The transition from Old English to Early Middle English had almost nothing to do with the Norman language. Bilingualism is the typical prerequisite for borrowing, but there was very little English-Norman bilingualism in the 150+ years after the 1066 Conquest. It was during the reign of King John IIRC that nobles were forced to choose between their French and English domains. (Noblemen with two sons often split their land that way.) And the first English King after the Conquest to speak fluent English was in the 15th century. Middle English began to borrow substantial French lexicon only after the time of King John, and it was largely Parisian French (the Continental prestige language) that was borrowed rather than Norman French.

So the transition from Old English to Early Middle English depended mainly on borrowing from Old Norse (Old Danish). There WAS significant bilingualism in the Danelaw for two centuries before the Conquest. Compared with a smallish number of Norman rulers, the number of Danes in the Danelaw was large. And bilingualism was relatively easy: Old English and Old Norse were cousins genetically. The 1066 Conquest did play a role: Before the Conquest, the Danes and Saxons had been antagonists; but they were happy to ally militarily and to share language after 1066 ("My enemy's enemy is my friend"). The English capital (and the center of the WRITTEN prestige dialect of Old English) had been in Winchester, Wessex but after 1066 the capital was moved to London, just to the south of the Danelaw, and there was a big influx of immigrants speaking a Norse or Norsified language.

But exactly how did the extensive and dramatic Norsification of English happen? Did the prestige dialect borrow from Norse or Norsified speakers in the Danelaw? Or did London's speakers borrow from the prestigious written Old (Wessex) English? Was the Danelaw language in 1066 Norsified English? Or was it Anglicized Norse? This is controversial, but usual scenarios have trouble accounting for the HUGE Norse influence on Middle English.
 
Testing methods of linguistic homeland detection using synthetic data | Philosophical Transactions of the Royal Society B: Biological Sciences
Two families of quantitative methods have been used to infer geographical homelands of language families: Bayesian phylogeography and the ‘diversity method'. Bayesian methods model how populations may have moved using a phylogenetic tree as a backbone, while the diversity method assumes that the geographical area where linguistic diversity is highest likely corresponds to the homeland.

... Here, we carry out performance testing by simulating language families, including branching structures and word lists, along with speaker populations moving in space.

... As a result of the tests, we propose a hierarchy of performance of the different methods. Factors such as geographical idiosyncrasies, incomplete sampling, tree imbalance and small family sizes all have a negative impact on performance, but mostly across the board, the performance hierarchy generally being impervious to such factors.
They used these methods:
  • rand: random selection
  • centr: the center of the enclosing polygon
  • md: minimal distance
  • BTF: BayesTraits with fixed rates
  • BTV: BayesTraits with variable rates
  • BRW: BEAST ("Bayesian Evolutionary Analysis Sampling Trees"), a relaxed random walk
  • RBF: RevBayes with a fixed-rate random walk
  • RBV: RevBayes with a variable-rate random walk
  • Div: diversity center: for each lang, find the linguistic distance to each other lang relative to its geographic distance and take the average. Then find the lang with the highest value of that average and use its location.
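
The Div recipe, as I've paraphrased it, can be sketched in a few lines of Python. This is only my reading of the method, not the authors' code: the function name `diversity_center`, the flat-plane Euclidean distance, and the four toy languages are all assumptions for illustration.

```python
import math

def diversity_center(positions, ling_dist):
    """positions: dict name -> (x, y); ling_dist: dict frozenset({a, b}) -> float.

    For each language, average (linguistic distance / geographic distance)
    over all other languages, then return the language with the highest average.
    """
    best_name, best_score = None, -math.inf
    for a, pos_a in positions.items():
        ratios = []
        for b, pos_b in positions.items():
            if a == b:
                continue
            geo = math.dist(pos_a, pos_b)  # assumption: planar Euclidean distance
            if geo > 0:  # skip co-located languages to avoid dividing by zero
                ratios.append(ling_dist[frozenset((a, b))] / geo)
        if ratios:
            score = sum(ratios) / len(ratios)
            if score > best_score:
                best_name, best_score = a, score
    return best_name, best_score

# Toy family (hypothetical): A and B are neighbours but very different,
# C and D are far away and nearly identical to each other.
positions = {"A": (0, 0), "B": (1, 0), "C": (10, 0), "D": (11, 0)}
ling_dist = {
    frozenset(("A", "B")): 0.9, frozenset(("A", "C")): 0.8,
    frozenset(("A", "D")): 0.8, frozenset(("B", "C")): 0.5,
    frozenset(("B", "D")): 0.5, frozenset(("C", "D")): 0.1,
}
homeland, score = diversity_center(positions, ling_dist)
```

In this toy case the estimate lands on A, in the area where nearby languages differ the most.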

I took Table 1, normalized each row by dividing by its median, then found the median of each column. Sorting, I found, from worst to best, {{"centr", 1.67657}, {"rand", 1.25624}, {"Div", 1.05853}, {"BRW", 1.01759}, {"RBV", 1.}, {"RBF", 0.965247}, {"BTV", 0.956511}, {"md", 0.86498}, {"BTF", 0.855979}}

Using instead the ranks of each type of error, from 1 to 9, I found {{"centr", 1.}, {"rand", 2.}, {"Div", 3.}, {"BRW", 4.}, {"RBV", 5.}, {"BTV", 6.5}, {"RBF", 6.5}, {"md", 8.}, {"BTF", 9.}}

Somewhat different, but with the same overall tendency.
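
For anyone who wants to repeat the two summaries above, here is a minimal Python sketch. The table in it is a made-up placeholder with three methods and three rows, NOT the values from Table 1, and I am assuming rank 1 means the largest error and that there are no ties within a row.

```python
from statistics import median

methods = ["rand", "centr", "md"]
# Hypothetical error table: one row per type of error, one column per method.
table = [
    [4.0, 6.0, 2.0],
    [10.0, 12.0, 8.0],
    [3.0, 9.0, 6.0],
]

def column_medians(rows):
    """Median of each column."""
    return [median(col) for col in zip(*rows)]

def normalize_by_row_median(rows):
    """Divide every entry by the median of its own row."""
    return [[v / median(row) for v in row] for row in rows]

def rank_rows(rows):
    """Rank within each row, 1 = largest error (assumes no ties)."""
    ranked = []
    for row in rows:
        order = sorted(row, reverse=True)
        ranked.append([order.index(v) + 1 for v in row])
    return ranked

norm_score = dict(zip(methods, column_medians(normalize_by_row_median(table))))
rank_score = dict(zip(methods, column_medians(rank_rows(table))))
worst_to_best = sorted(methods, key=norm_score.get, reverse=True)
```

Dividing by the row median keeps error types with large absolute values from dominating, and the rank version drops the magnitudes altogether, which is why the two orderings can differ a little.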

It would be interesting to do these calculations with some barriers, like a big wall with the origin place right next to it, so that spreading outward can only be done in some directions.

For Austronesian, if one naively used the language-bounding polygon center or the language centroid, one would find some horribly wrong results: somewhere near New Guinea rather than Taiwan.

For Russian, one would find central Siberia rather than Moscow.

For the Romance languages, one would find somewhere near Geneva rather than Rome.

There is also the problem of closely-related languages, like Continental Scandinavian (Danish, Norwegian, Swedish) and Eastern Slavic (Russian, Ukrainian, Belarusian). So one will have to bootstrap one's way back in time.
 
From those lists for number-word stability, it is curious that Bantu languages have "hunger" and "elephant" as highly stable. Using Wiktionary, let's see how those words fare in other families.
  • Proto-Germanic *hungruz > English "hunger", other Germanic forms
  • Latin famês > Romance forms like Italian fame, French faim > English famine
  • Middle Irish occoras > Goidelic forms
  • Proto-Celtic *nâuniyâ > Brythonic forms
  • Proto-Slavic *goldu > Slavic forms
  • Greek peina
  • Sanskrit bubhukshâ, kshudhâ > various Modern Indic forms
So it's not very evident what the Proto-Indo-European form was.
  • Proto-Turkic *âtS > Turkic forms
  • Proto-Sino-Tibetan *mwat ~ *ng(w)at > Chinese, Burmese, ...
  • Proto-Malayo-Polynesian *lapaR, *bitil
  • Proto-Oceanic *pitolon
  • Proto-Bantu *njàdà
Many of the words are likely derived from more general words, like words for "to desire".
 
For Austronesian, if one naively used the language-bounding polygon center or the language centroid, one would find some horribly wrong results: somewhere near New Guinea rather than Taiwan....

The P-I-E Homeland issue is a vivid example of this. A few decades ago, the Balkans was a popular guess for PIE Homeland: Greek, Albanian, Phrygian, Thracian, Dacian, proto-Armenian are all located near the Balkans. (Throw in one Romance language and some Slavic dialects if you want.) But that ignores geography: There have been a dozen or so historic migrations from the East European steppes to the Carpathian Basin. But migrations from that Basin to those steppes? I can't think of a single one.

(As an analogy for that one-way movement I think of either a funnel, or a lit candle, hottest above the flame.)
 
I'll now turn to "elephant". Outside of this animal's range, the word for it is a classic wander word.

English < Old French elefant, olifant < Latin elephantus, elephans (-ant-), elephas (-ant-) < Greek elephas (-ant-)

In Modern Greek, it is elefandas -- third-declension nouns were forced into the first two declensions with only a few exceptions: "father" patêr > pateras, "mother" mêtêr > mitera.

Likely from ancient Egyptian 3bw *rûbaw > *yêbh

Likely source of Latin ebur "ivory" ("elephant" > "ivory")

Akkadian pîrum, another likely descendant, had several descendants: Persian pil, Arabic fîl, Georgian spilo, Armenian p'ig, ...

Proto-Slavic had *slonu of obscure origin > Slavic forms

Turning to South Asia, Sanskrit had several forms: gaja, ibha, hastin, nâga, vârana, kunjara

Of these, hastin < hasta "hand" + -in : "having a hand" (trunk) > numerous Modern Indic forms, like Hindi hâthî, also borrowings

Of the rest, gaja has obscure origins, but it was borrowed a lot in Southeast Asia and nearby, like Indonesian gajah

Proto-Dravidian *yÂnay > Tamil yânai, Telugu ênugu, ...

Khmer damrey < "wiggler, swinger" (from an elephant's trunk swinging)

Vietnamese voi

Then this set:
  • Burmese hcang < Proto-Lolo-Burmese *tsang
  • Thai cháang < Proto-Tai *djâng
  • Chinese xiàng < Old Chinese: (Baxter–Sagart) *s-(d)ang, (Zhengzhang) *ljang
Which way did the word go?

Given where elephants live, I'd guess that the word originated in Southeast Asia, in Kra-Dai or Austroasiatic, or even early in Austric or some substrate lang.
 
Turning to Africa south of the Sahara,

Proto-Bantu *njògù "elephant" had numerous descendants.

Turning to West Africa, I find Yoruba erin which does not look much like that. Looking further,
  • Yoruba erin < Proto-Yoruboid *é-lĩ, *é-nĩ
  • Igbo enyi < Proto-Igboid *é-nĩ̀Nĩ̀
  • Ibibio eniin, Tee ni
  • (others) < Proto-Edoid *E-ni, Proto-Lower Cross River *é-nì:n, Proto-Ogoni *ǹnĩ, ...
Nupe dagba < Proto-Nupoid *ɔ-dɔgba

How are they related?
  • Benue-Congo (E BC): Bantu, Cross River: Ogoni (Tee), Lower Cross River (Ibibio)
  • Volta-Niger (W BC): Yoruboid, Igboid, Edoid, Nupoid
subdivisions of Volta-Congo (broad Benue-Congo)

I find Proto-VC *(e)-ni with replacements in Proto-Nupoid and Proto-Bantu.

Some other ones, in Niger-Congo: Atlantic-Congo
  • Gur: S Gur: Turka gbĩɛ̃lw, N Gur: Dagbani wɔbigu
  • Volta-Congo:
    • Volta-Niger *(e)-ni
    • Kwa: Tano: Akan ɛsono, Nkonya srʋfɔ
Nilo-Saharan:
  • E Sudanic: Nilotic: Dinka-Nuer: Dinka akoon, Nuer guɔr
  • Komuz: Koman: Komo gwa, Kwama kwɨ
  • Saharan: Tedaga kumon
Afroasiatic:
  • Chadic: Hausa giwa < Proto-West-Chadic *giw-, Tangale labata
  • Cushitic: Oromo arbaa, Somali maroodi
  • Egyptian 3bw *Rübaw > Akkadian pîrum (> pil, fil), Greek elephas (-ant-), ...
  • Ethiosemitic: Amharic zëhon (borrowing?)
Khoisan: Tuu: ǃXóõ ǂxūa, Khoekhoe ǂkhoab (those odd symbols are for clicks)

Jarawa language (Andaman Islands): ʈʰehuːʈʰu "is big, fat"

Austroasiatic: Mon-Khmer: Jehai antok

Back to Indo-European: Tocharian A oṅkaläm, B oṅkalmo, of unknown origin

I think I've covered just about all of Wiktionary on "elephant".
 
I found the curiosity that the Yoruba word for hippopotamus is erinmi: erin "elephant" + omi "water" -- "water elephant".

It's also an insulting term for a fat person.

There's also a reconstructed Proto-Bantu form: *ngùbʊ́

I also found Proto-Bantu *ngòì "leopard", *ncímbá "big cat", *nɲàmà "animal, meat"
 
Proto-Bantu *njògù "elephant", *ngùbʊ́ "hippopotamus", *ngòì "leopard", *ncímbá "big cat", *nɲàmà "animal, meat", *mpítí "hyena", *ntʊ̀ìgà "giraffe", *mbògó, *njátɪ́ "buffalo", *mpàdá "impala", *ndègè "bird", *njókà "snake", *ngòìnà "crocodile", ...

Notice that all these words start with n- or m-, the latter before p or b.
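
That observation is easy to check mechanically. A small Python sketch over the forms listed above (asterisks dropped); the helper name `nasal_matches` is mine, and the check says nothing about Proto-Bantu words outside this list.

```python
# Reconstructed forms copied from the list above, without the leading asterisk.
words = [
    "njògù", "ngùbʊ́", "ngòì", "ncímbá", "nɲàmà", "mpítí", "ntʊ̀ìgà",
    "mbògó", "njátɪ́", "mpàdá", "ndègè", "njókà", "ngòìnà",
]

def nasal_matches(word):
    """True if the word starts with m before p/b, and with n before anything else."""
    first, second = word[0], word[1]
    expected = "m" if second in "pb" else "n"
    return first == expected

# Any word that breaks the pattern would show up here.
exceptions = [w for w in words if not nasal_matches(w)]
```

An empty `exceptions` list means every form in the list fits the pattern.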
 
On the structure of Proto-Uralic | Juha Janhunen - Academia.edu

Proto-Uralic noun number: singular *-, dual *-k, plural *-t with oblique form *-j- (j is "yet" rather than "jet", like in most Germanic and Roman-alphabet Slavic langs).

Proto-Uralic noun cases: nominative *-, genitive (of) *-n, accusative *-m, locative (at) *-na, ablative (from) *-të, dative (to) ? -kä, -ng

Attested Uralic langs typically have many more noun cases, and they were created after their ancestors split off. This suggests that Proto-Uralic likely had additional ones that did not survive or survived in only one descendant. One might test such a hypothesis by looking at the statistics of surviving noun-case endings. Do some of them have only a few survivals?

But that risks small-number statistics, and one is likely to get more significant results with vocabulary survivals.

 List of grammatical cases - a big list, but it may be easier to learn them as preposition counterparts: genitive = of-case, dative = to-case, ...

 Proto-Indo-European nominals and  Proto-Indo-European pronouns

I'll try to collapse them together: athematic (most nouns), thematic (-os), and pronominal, then singular, dual, plural
  • Anim. nom.: -s | -os | -s, - ||| -h1 | -oyh1 | -h1 ||| -es | -ôs, -oy | -y
  • Inanim. NVA: - | -om | -d ||| -ih1 | -oyh1 | (?) ||| -h2 | -eh2 | -h2
  • Anim. voc.: - | -e | (N) ||| (N) ||| (N)
  • Anim. acc.: -m | -om | -m ||| (N) ||| -ns | -ons | -ns
One reconstructs pre-PIE animate accusative *-m, dual *-h1, animate nominative plural *-es (thematic *-o-es) pronominal *-y, animate accusative plural *-m-es, inanimate plural *-h2 (> *-a)

So one gets dual *-h1, plural *-s -- much like Uralic dual *-k, plural *-t -- seems like another case of Pre-IE or IE *K > *H.

Here's a nice example of that effect: PIE *h3ost- ~ *kost- "bone"

*h3ost- > Hittite hastai, Luwian hassa, Latin os (oss-), Greek osteon, Sanskrit asthi, Welsh asgwrn (< Proto-Celtic *astV-kornV- "bone-horn" <*kerh2-)

*kost- > Proto-Slavic *kosti, Latin costa "rib"
 
Turning to Turkic, there is an interesting complication. Turkic is split into two branches: Common Turkic or z-Turkic, with most attested Turkic langs, including Turkish, Azeri, Bashkir, Tatar, Turkmen, Kazakh, Uzbek, Kyrgyz, Uyghur, ..., and Oghuric or Bulgharic or r-Turkic, with one survivor, Chuvash.

There is a curious sound correspondence: Common l, sh ~ Oghuric l, and Common r, z ~ Oghuric r, reconstructed as l, l', r, r'.

Noun cases include genitive *-n, *-ng and definite accusative (Common) *-i, (Chuvash) -e (no ending for indefinite accusative).

"I saw a dog" vs. "I saw the dog" in Turkish: Köpek gördüm vs. Köpeği gördüm

Plurals are Common *-lar and Chuvash -sem. Some Turkic langs have vowel harmony in their suffixes, changing vowels to be more like the stem vowels. Thus, Turkish at, atlar "horse, horses" and it, itler "dog, dogs". (a > -lar, i > -ler).
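
The Turkish -lar/-ler choice can be written as a tiny rule: look at the last vowel of the stem, and use -lar after a back vowel (a, ı, o, u) or -ler after a front vowel (e, i, ö, ü). A minimal Python sketch; the function name `turkish_plural` is mine, it expects lowercase input, and it ignores loanwords that break harmony.

```python
BACK_VOWELS = set("aıou")
FRONT_VOWELS = set("eiöü")

def turkish_plural(stem):
    """Attach -lar or -ler according to the last vowel of a lowercase stem."""
    for ch in reversed(stem):
        if ch in BACK_VOWELS:
            return stem + "lar"
        if ch in FRONT_VOWELS:
            return stem + "ler"
    raise ValueError("no vowel found in stem: " + stem)

examples = [turkish_plural(w) for w in ("at", "it", "köpek")]
```

This handles at > atlar and it > itler from above, and köpek comes out as köpekler because its last vowel is e.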

Turning to pronouns, the first and second person ones have plural *-r' > Common *-z, Chuvash -r. So one only gets Turkic *-r'

Mongolic has genitive *-n, accusative *-g, and several plural suffixes.

Tungusic has genitive *-n, accusative *-be, and also several plural suffixes.

Plural Suffixes in the Altaic Languages by Nicholas Poppe

Some of these plural suffixes are multiple, something like the whimsical English plural "breasteses".

NP found all five of *-t, *-s, *-l, *-n, and *-r in Turkic, Mongolic, and Tungusic, though sometimes in fossilized form or inside compound forms:
  • Turkic: *-t, *-ghut, (Yakut) *-tSut, *-s, *-si, (Chuvash) -ese, (Common) *-la-r, *-n, *-an, *-r' (> Chuvash -r, Common *-z)
  • Mongolic: *-d, *-ud, *-ghud, *-nad, *-nughud, *-tSud, *-s, *-us, *-l, *-tSul, *-n, *-nar, *-nad
  • Tungusic: *-t, *-tin, *-ta, *-sa, *-l, *-sal, *-nasal, *-r

So one reconstructs all five, *-t, *-s, *-l, *-n, *-r, for Core or Narrow Altaic.
 
This dual -k, plural -t pattern is also found in the Inuit languages, like in Inupiaq:

nanuq "polar bear": dual nannuk, plural nannut

So, in summary,
  • Indo-European: dual *-H, plural *-s, *-i, (neuter) *-H2
  • Uralic: dual *-k, plural *-t (oblique *-y-)
  • Altaic: plural *-t, *-s, *-n, *-l, *-r
  • Eskimo-Aleut: dual *-k, plural *-t
 