• Welcome to the new Internet Infidels Discussion Board, formerly Talk Freethought.

Language as a Clue to Prehistory

In the West Liao basin, agriculture started with the growing of broomcorn millet around 9000 BP (7000 BCE).

Bayesian phylolinguistics reveals the internal structure of the Transeurasian family | Journal of Language Evolution | Oxford Academic

They used 23 out of 27 present-day Turkic languages and also Old Turkic, 10 out of 17 Mongolic languages and also Written Mongolian and Middle Mongolian, 10 out of 13 Tungusic languages and also Manchu and Jurchen, Korean and also Middle Korean, and Japanese and 5 out of 14 Ryukyuan languages and also Old Japanese.

They also used a 200-word version of the Leipzig–Jakarta list, a list of meanings whose word forms are seldom replaced, derived using statistics on etymologies. It is something like the Swadesh lists, with a lot of overlap.
It is highly unlikely that all similarities between the basic items in our dataset are the result of contact instead of genealogical relationship. Traditionally, the strength of basic vocabulary lies in the fact that words with basic meanings tend to resist borrowing more successfully than random lexical items. The very fact that we find 150 Transeurasian etymologies covering 107 distinct basic vocabulary concepts thus is a strong argument against borrowing by itself. In addition, we can advance other arguments against borrowing, such as (1) the misfit with the expected borrowing hierarchy; (2) the misfit with the expected typology of verbal borrowing; (3) the regularity and complexity of sound correspondence; (4) the occurrence of broken contact chains; (5) the multiple setting; and (6) the well-spread distribution of the cognates; see also Robbeets (in press, 2019).
They then go into detail.

"First, among the concepts of the Leipzig-Jakarta list, we find fifty-nine actions, thirty-two property words, twenty-three deictic or grammatical items and eighty-six nominal concepts." and "Empirically, it is observed that languages tend to borrow lexical items more easily than grammatical ones and nouns more easily than verbs (a.o. Wohlgemuth 2009; Matras 2009; Tadmor et al. 2010)." That is, words with standalone meanings tend to be borrowed more easily than words with meanings connected to other words. "In contrast to this tendency, there are more correlations for verbs (65%) and deictic and grammatical items (57%) in the Transeurasian basic vocabulary than for nouns (43%)."

"Second, as far as the mechanisms of loan verb accommodation are concerned, most recipient languages can be categorized into two distinct groups: borrowed verbs either arrive as verbs, needing no formal accommodation, or, they arrive as nonverbs and need formal accommodation. Most Transeurasian languages can be assigned to the second group because they display a clear preference for the nonverbal strategy (Wohlgemuth 2009: 159, 161)." In Japanese, borrowed verbs often take the form <verb> suru ("to do" <verb>).

"Third, the comparative sets for basic vocabulary display regular correspondences for each consonant of the verb root and for each but the root-final vowel, conform to the requirements in Supplementary Data (SI 1)." Usually close to other work, like Palaeolexicon - Table of (Macro-) Altaic Phonology

"Fourth, gaps in the attestation of members of an etymology, whereby a cognate is absent in one or more intermediate contact branches are indicative of borrowing."

"Fifth, most examples of borrowing have a binary setting in common: they typically go from a model language into a recipient language. Especially for verbs and grammatical markers, examples of the same item progressing into a third or fourth language are relatively rare." - but it is grammatical items and verbs that are best-preserved of the cognate list used.

"Finally, the distribution of a certain basic item to a single language or to only few languages of a certain subgroup could serve as an indication of borrowing. However, such cases do not occur among our basic vocabulary etymologies."
 
Here are the ranges of dates that they used for the four families with more than one member.
What | Lower | Upper
Proto-Turkic | 500 BCE | 100 CE
Proto-Mongolic | 1000 CE | 1300 CE
Proto-Tungusic | 600 BCE | 500 CE
Proto-Japonic | 200 BCE | 500 CE

"Linguists and archeologists associate proto-Japonic with the beginnings of Yayoi-culture (900 BCE–300 CE) on the Japanese Islands. ... The ancestor of the languages now spoken in the Ryukyuan Islands is thought to have remained in northeastern Kyushu until around 900 CE, when full-scale agriculture was introduced to the Ryukyus."

"Proto-Mongolic is nearly equivalent with the language spoken by the historical Mongols around the time of the Mongol Empire (1206–1368), which is documented in historical sources, written in several different scripts and collectively termed Middle Mongol."

The first split in Turkic is between Oghuric / Bulgaric and Common Turkic. Chuvash is the only surviving Oghuric language, with Bulgar and Khazar now dead, while Common Turkic contains all other present-day Turkic languages, including Turkish.

Oghuric is sometimes called Lir Turkic and Common Turkic Shaz Turkic from these sound correspondences:
Proto-Turkic | Oghur | Common
*ly | l | sh
*ry | r | z

They considered these hypotheses:
( (Tungus, JK), (Turk, Mongol) )
( Tungus, JK, (Turk, Mongol) )
( JK, ( Turk, (Tungus, Mongol) ) )
( JK, ( Tungus, (Turk, Mongol) ) ) -- their best fit.
-- essentially various placements of Tungusic in (JK, (Turk, Mongol)) with JK = (Japonic, Korean)
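
The bracket notation above maps directly onto nested tuples, which makes it easy to see which subgroupings each hypothesis implies. This is only a minimal sketch: the labels H1 to H4 and the helper names `leaves` and `clades` are made up for illustration and do not come from the paper.

```python
# Each hypothesis as a nested tuple: a leaf is a string, a group is a tuple.
# The keys H1..H4 are hypothetical labels for the four bracketings listed above.
HYPOTHESES = {
    "H1": (("Tungus", "JK"), ("Turk", "Mongol")),
    "H2": ("Tungus", "JK", ("Turk", "Mongol")),
    "H3": ("JK", ("Turk", ("Tungus", "Mongol"))),
    "H4": ("JK", ("Tungus", ("Turk", "Mongol"))),  # reported best fit
}

def leaves(tree):
    """Return the set of leaf labels under a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))

def clades(tree):
    """Return every subgrouping strictly below the root, as sets of leaves."""
    found = set()

    def walk(node, is_root):
        if isinstance(node, str):
            return
        if not is_root:
            found.add(leaves(node))
        for child in node:
            walk(child, False)

    walk(tree, True)
    return found

if __name__ == "__main__":
    for name, tree in HYPOTHESES.items():
        groups = sorted(sorted(group) for group in clades(tree))
        print(name, groups)
```

Comparing the printed groups shows that three of the four hypotheses keep a (Turk, Mongol) group, and that they differ mainly in where Tungusic attaches.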
 
I simply don't buy the Bayesian analysis, which is entirely untested and subject to possible biases inherent in setting up priors. I don't think that there is any way to measure how strongly languages are affected by borrowed traits. For example, the Breton language has been hugely affected by French, which almost completely reworked its system of time and tense reference. English has quite a few very strong effects of borrowing from both Scandinavian and French influence, some of which significantly reworked its syntactic structure and morphology. I don't have any further knowledge of this study than what has been released to the public, but it is being met with more skepticism in the linguistic community than in the popular media, which is to be expected, I suppose. It's fun to speculate, but going back more than a few thousand years is really making a lot of leaps.
 
It sounds like the Transeurasian Hypothesis lpetrich writes about is based on the Swadesh-like list, and NOT on grammatical characters.

Ringe et al did consider grammatical and phonological characters as well as lexical when they reconstructed the tree structure of the Indo-European family. But as discussed in the long linked-to article they tried to exclude all but rare and irreversible characters.

For me, as I mentioned (#92) last year in this thread, the most interesting thing about Ringe's study is the anomalous status of Germanic. His software was unable to attach Germanic consistently. Approaching the I-E expansion from the archaeological direction, dots can be connected to other I-E families (e.g. Globular Amphora ↣ Bell Beaker ↣ Urnfield Bronze ↣ Hallstatt Iron ↣ Celtic) but Germanic is much less clear. Was not Corded Ware supplanted by Comb Ceramic in Sweden; and might its successors have been dominant in the Nordic Bronze Age? I think there were at least THREE languages that contributed to proto-Germanic: Centum, Satem, and a non-IE language spoken by TRB or Comb Ceramic. There was likely a stage of creolization. Germanic has a non-IE word for king/koenig which might suggest an unconquered people adopting I-E voluntarily.

Is the linguistic evidence enough to construct a hypothesis for the early evolution of proto-Germanic?
 
Looking at Don Ringe's paper, he finds a reasonably good tree if he omits Germanic.
  • Root: (Anatolian, (Tocharian, Classic))
  • Classic: (Albanian, (Italo-Celtic, Core))
  • Italo-Celtic: ((Welsh, Old Irish), (Latin, (Oscan, Umbrian)))
  • Core: ((Greek, Armenian), (Balto-Slavic, Indo-Iranian))
  • Balto-Slavic: (Old Church Slavonic, (Old Prussian, (Lithuanian, Latvian)))
  • Indo-Iranian: (Vedic, (Avestan, Old Persian))

"Classic" refers to the conception of Proto-Indo-European before Hittite and Tocharian were discovered.
All the inflectional characters that give any precise information about the position of Germanic - namely M5, M6 and M8 - place it in the large subgroup that also includes Balto-Slavic, Indo-Iranian and Greek; and since those are the characters that are the most reliable indicators of genetic descent, it appears that Germanic should be placed in what we are calling the core of the family - the residue after the departure of Anatolian, Tocharian and Italo-Celtic.
Authors Don Ringe and Ann Taylor conclude that early Germanic was an offshoot of Core IE, but one whose speakers lived near some early Celtic or Italo-Celtic speakers and got a lot of vocabulary from them.

The three Italic languages have the usually-accepted subgrouping, as do the three Baltic ones and the three Indo-Iranian ones.
 
Yes, Germanic's sources include both a "core" language (Satem like Balto-Slavic, or para-Satem like Albanian) and a Western Centum language (Italic or Celtic), but there must have been a THIRD source as well. I think the third source was a sea-faring Baltic people, either the Pitted Ware culture or the Pit–Comb Ware culture.

The sea-faring terms Ship, Sail, Sea, Seal, Keel, Eel possibly Ice and perhaps even Boat are all non-IE words found in both West Germanic and North Germanic. The Finnic (or Fennic) language is often associated with these Scandinavian seal-hunters but I don't think any of the eight words just mentioned has a clear Uralic cognate. Although 'Boat' has a possible PIE etymology (*bheid- "to split"), cognates of Boat in Romance languages are considered borrowings from Germanic. (And Irish bád is borrowed from Old English.)

Basic vocabulary words found in both Western and Northern Germanic but not in other I-E languages include finger, toe, neck, bone, wife, oak, berry and even horse.

Ocean-going ships were in use in the Baltic as early as 2500 BC, about the same time as Corded Ware farmers arrived in Sweden. But some fisher-gatherers of Sweden rejected farming and adopted a rich economy on the shores of the Baltic. They could trade furs and amber for agricultural products; or even use their sea-going skills as pirates to raid and steal what they wanted.

The Nordic Bronze Age was centered in Sweden, not Germany or Denmark. I think the "proto-Vikings" — whose existence isn't even hinted at in Barry Cunliffe's otherwise excellent Europe Between the Oceans — gained control during that Age. (Perhaps their sea-faring skills gave them access to the English tin needed for bronze.) At some point they switched to the I-E (Corded Ware) language of those they conquered but they retained some of their old language, calling their king Kuningaz instead of Rēx, and so on.

The origin of the Germanic people and their language is surely a fascinating story but one we'll never be able to reconstruct. Still, I think linguistics may offer some clues.
 
I simply don't buy the Bayesian analysis, which is entirely untested and subject to possible biases inherent in setting up priors. ...
Bayesian phylolinguistics reveals the internal structure of the Transeurasian family | Journal of Language Evolution | Oxford Academic

I checked on "Methods" and the authors used some existing software that is often used in molecular phylogeny. BEAST: Bayesian Evolutionary Analysis Sampling Trees - BEAST Software - Bayesian Evolutionary Analysis Sampling Trees | BEAST Documentation - BEAST 2

Bayesian inference in phylogeny describes how it works. Since the number of possible family trees grows factorially with the number of end nodes or leaves, one cannot do a completely comprehensive search. For n end nodes, the number of rooted binary trees that one would have to search is:

(2n-3)!! = (2n-3)*(2n-5)*...*5*3*1
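
To get a feel for how fast this count grows, here is a minimal sketch that evaluates the double factorial directly. The function name `rooted_tree_count` is made up for illustration, and the count assumes rooted, strictly binary trees with labelled leaves.

```python
def rooted_tree_count(n):
    """Number of rooted binary trees on n labelled leaves: (2n-3)!!"""
    if n < 2:
        raise ValueError("need at least two leaves")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n-3 (the factor 1 is implicit).
    for k in range(3, 2 * n - 2, 2):
        count *= k
    return count

if __name__ == "__main__":
    for n in (3, 4, 5, 10, 20):
        print(n, rooted_tree_count(n))
```

Five leaves already give 105 trees and ten leaves give 34,459,425, which is why an exhaustive search is out of the question for dozens of languages.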

So one must take some random sample of them, and the way to do that is to compare a tree to randomly generated tweaks of it, then repeat with a good one of these trees: Markov chain Monte Carlo. I say a good one and not the best one, because always taking the best one would be hill climbing. Using a good one means more sampling of the space of possibilities.
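
The accept-a-good-tweak idea can be sketched with a toy Metropolis sampler. This is not BEAST's algorithm: as a simplifying assumption, the "trees" here are just five numbered states on a ring, a "tweak" is a step to a neighbouring state, and the weights are made-up stand-ins for how well each state fits the data.

```python
import random

def metropolis(weights, steps, seed=0):
    """Visit states with frequency proportional to `weights` (all positive).

    Proposal: step to the left or right neighbour on a ring (symmetric).
    Acceptance: always if the proposal is better, otherwise with
    probability w_new / w_old, so worse states still get sampled sometimes.
    """
    rng = random.Random(seed)
    n = len(weights)
    state = 0
    counts = [0] * n
    for _ in range(steps):
        proposal = (state + rng.choice((-1, 1))) % n
        if rng.random() < weights[proposal] / weights[state]:
            state = proposal
        counts[state] += 1
    return [c / steps for c in counts]

if __name__ == "__main__":
    toy_weights = [1, 2, 4, 2, 1]  # hypothetical fit scores
    print(metropolis(toy_weights, 200_000, seed=1))
```

A pure hill climber would park on the best state and stay there; the sampler above keeps visiting the others in proportion to their weights, which is what lets a Bayesian analysis report how much support each alternative has.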

 Computational phylogenetics -  List of phylogenetics software -  List of phylogenetic tree visualization software

Bayes's inverse-probability theorem: for data values D and hypotheses H:

\( \displaystyle{ P(H \text{ if } D) = \frac{ P(D \text{ if } H) P(H) } { P(D) } ,\ P(D) = \sum_H P(D \text{ if } H) P(H) } \)

The P(H) values are the prior values of hypotheses H.
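
As a worked example of the formula above, here is a minimal sketch for a finite set of hypotheses. The two hypotheses "A" and "B" and all the numbers are invented for illustration; they are not taken from the paper.

```python
def posterior(priors, likelihoods):
    """Apply Bayes's theorem over a finite set of hypotheses.

    priors[h]      = P(H)
    likelihoods[h] = P(D if H)
    Returns P(H if D) for every hypothesis h.
    """
    evidence = sum(priors[h] * likelihoods[h] for h in priors)  # P(D)
    return {h: priors[h] * likelihoods[h] / evidence for h in priors}

if __name__ == "__main__":
    # Hypothetical numbers: equal priors, data three times likelier under B.
    print(posterior({"A": 0.5, "B": 0.5}, {"A": 0.2, "B": 0.6}))
    # Same data, but a prior that strongly favours A.
    print(posterior({"A": 0.9, "B": 0.1}, {"A": 0.2, "B": 0.6}))
```

With equal priors the posterior for B comes out at 0.75; with the lopsided prior it drops to 0.25, which illustrates the worry raised earlier in the thread about how much the choice of priors can matter.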
 
The development of negation in the Transeurasian languages by Martine Robbeets

A negative verb is a verb that expresses negation, with the main verb as its object, like the English auxiliary "do not". Negative verbs are common in the Uralic languages, and also in the Transeurasian ones.

Negative verbs often get turned into auxiliary verbs, and then into particles or prefixes or suffixes. English "not" is a negation particle, and Turkish and Japanese both use negation suffixes.

With a negative verb, the main verb is originally a non-finite (non-inflected) form (participle, infinitive). In Uralic, inflection gradually gets transferred from the negative verb to the main verb, in this order: voice, aspect, mood, tense, person/number, imperative.

Martine Robbeets reconstructs these negative verbs: Proto-Transeurasian *ana-, Proto-Altaic *e-, Proto-Turkic *ma-.

Transeurasian basic verbs: copy or cognate?
"Copy" is a MUCH better word than "borrow", because the original words are unchanged.
Empirically it is observed that languages tend to copy nouns more easily than verbs (e.g. Moravcsik 1975, 1978, Muysken 2000, Wichmann & Wohlgemuth 2008, Wohlgemuth 2009, Matras 2009, Tadmor et al. 2010). From the seventeenth to the nineteenth century, for instance, Japanese underwent intensive contact from Dutch leading to the global copying of over 300 words and the selective copying of syntax, but Japanese did not copy a single verb from Dutch (Irwin 2011). The relative stability of verbs is interrelated with a number of factors, such as the fact that verbal semantics tend to be less culturally determined than the meanings of nouns, that verbs are less perceivable as a distinct unit because they need more adaption to the morpho-syntactic frame of the sentence, and that there simply are less verbs than nouns.
The Swadesh Lists are short on verbs, but a recent list of stable meanings, the Leipzig-Jakarta one, adds a lot of verbs.

Loan verbs in a typological perspective

"Direct insertion" - using a root form directly, "indirect insertion" - using some affix, "light-verb strategy" - using some verb like "to do"

The latter is a common strategy in Transeurasian languages like Turkish and Korean and Japanese.
 
Jespersen's Cycle was described by linguist Otto Jespersen a century ago.

Words for negation get used a lot, so they tend to become weakened. The speakers then press some other word into service to make negation more prominent, and the same thing then happens. Like in French:

Jeo ne dis
Je ne dis pas
Je dis pas

"Pas" is literally "step", and English words like "no" and "not" have similar origins: "no" and "none" from "ne one", and "not" from "ne aught" ("aught" meaning "anything", making "not" literally "nothing").
 
A very recent coinage is "to click", meaning to press a mouse button, named for the sound that the button then makes.
  1. Spanish: clicar, cliquear, hacer clic ("make click") (Catalan, Portuguese similar)
  2. Greek: κάνω κλικ, κλικάρω
  3. Turkish: klik-le-, klik et-
  4. Japanese: クリックする kurikku suru ("do click")
Sources:
  • 1, 2: "Loan verbs in a typological perspective"
  • 3: "Transeurasian basic verbs: copy or cognate?"
  • 1, 4: click - Wiktionary
Several languages use direct insertion, like German klicken, Danish klikke, Swedish klicka, French cliquer, Italian cliccare, Polish klikać, Thai klik, etc.

Illustration of the three strategies inside English itself:
  • Direct insertion: to click
  • Indirect insertion: to clickify
  • Light verb: to do click

As to word morphology, I recall reading that a little after 1800, Danish linguist Rasmus Rask showed that the Germanic languages were a well-defined family by describing their grammatical similarities, because vocabulary comparisons seemed too prone to borrowing. (BTW, Otto Jespersen was also Danish).

For verbs, their simple past tenses and past/passive participles are very obviously related. There are several classes of "strong" verbs, those with vowel shifts, and one class of "weak" verbs, those with -ed and its cognates.
 
Even more evident are shared patterns of irregularity, especially suppletion - using different word forms for different parts of a paradigm.

English has some of that, like some verb conjugations and comparison paradigms, like go - went - gone, good - better - best, bad - worse - worst. For "to be", it's a mixture of suppletion and incomplete reduction of ancestral conjugations by the standards of English verbs.

Suppletive comparison paradigms are found in other Germanic languages, in Latin and Romance, in Celtic, in Slavic, etc.

Though other Germanic languages don't have the suppletion that English has for "to go", the Romance languages have plenty of suppletion there.
  • Italian: andare -- pres. vado, vai, va, andiamo, andate, vanno -- impf. andavo, pret. andai, fut. andrò
  • Spanish: ir -- pres. voy, vas, va, vamos, vais, van -- impf. iba, pret. fui, fut. iré
  • French: aller -- pres. vais, vas, va, allons, allez, vont -- impf. allais, pret. allai, fut. irai
The non-present tenses are much more regular than the present tense.

Suppletion can be traced back to Proto-Indo-European for some verbs, like the copula verb ("to be").  Indo-European copula

The two main PIE roots are *es- and *bhuH- (> is, be), joined by some others in some of the dialects, like *wes- in Germanic.

Dawn of Verbal Suppletion in Indo-European Languages -- discussing several examples. Suppletive verbs mostly have very general sorts of meanings, as do adjectives with suppletive comparisons in the dialects.


First and second person pronouns are also suppletive in the dialects, with suppletion in them reconstructed for PIE. English I/me, thou/thee, we/us. The PIE singular pronouns are *egho-/*me- and *tu-/*te-, with the plural ones being more difficult to reconstruct.
 
The Wikipedia article on Suppletion gives many examples, but every single example comes from an Indo-European language. And many textbooks devote little attention to the topic.

Even defining suppletion isn't easy. Igor Mel'cuk gives the English word 'close' (instead of 'unopen') as an example!
And some examples that look like suppletion, such as yeux as the plural of French oeil (or IIUC, mice as the plural of English mouse) are the result of regular sound changes.

If you want a long write-up on Suppletion including examples from several non-IE languages you can get a PDF at
(That pdf is free. If you prefer you can pay $35 for the same file at other sites.)
 
 Proto-Indo-European pronouns - the 1st, 2nd personal pronouns:

Case | 1sg | 2sg | 1pl | 2pl
Nominative | *egoH | *tuH | *wei | *yuH
Oblique Stem | *me- | *te- | *nos-, *ns- | *wos-, *us-
Verb ending | *-m, *-oH | *-s, *-eHi | *-me | *-te

Uralic:  Proto-Uralic language - I looked in the Finnish-language version: Kantaurali – Wikipedia (language name: Suomi) Appendix:Finnish possessive suffixes - Wiktionary  Finnish verb conjugation,  Hungarian grammar

I have included possessive suffixes and personal verb endings for Finnish and Hungarian. The latter has both definite and indefinite verb conjugations, depending on whether the object is definite or indefinite.

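Language | 1sg | 2sg | 1pl | 2pl
Finnish | minä | sinä | me | te
Hungarian | én | te | mi | ti
Proto-Uralic | *mi- | *ti- | *me | *te
Finn poss | -ni | -si | -mme | -tte
Hung poss | -om | -od | -unk | -otok
Finn vb | -n | -t | -mme | -tte
Hung id vb | -ok | -sz | -unk | -tok
Hung df vb | -om | -od | -juk | -játok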

Let's look at Altaic, with the personal pronouns in Core Altaic:

Family | 1sg | 2sg | 1pl | 2pl
Turkic | *bi | *si | *bis | *sis
Mongolian | *bi | *ti | *ba | *ta
Tungusic | *bi | *si | *bö | *sö

There is an m-t pattern in them, though Altaic has m > b, and several members have t > s.
 
Proto-Indo-European, Proto-Uralic, and the Transeurasian languages also share subject-object-verb order and related word orders, like mainverb-auxverb, though such syntactical similarities can be areal effects, from language contact.

The hypothesis of their relationship is part of the Eurasiatic and Nostratic hypotheses -  Eurasiatic languages and  Nostratic languages - along with several other language families and isolates. Joseph Greenberg included in Eurasiatic Indo-European, Uralic-Yukaghir, Altaic, Korean-Japanese-Ainu, Chukchi-Kamchatkan, and Eskimo-Aleut, and Nostratic typically includes Kartvelian, Dravidian, and Afro-Asiatic.

Some similarities are the m-t pattern of personal pronouns, and among some of them, noun dual -k and plural -t.

Going even further is  Borean languages, with Nostratic,  Dené–Caucasian languages,  Austric languages, and  Amerind languages. This covers all of premodern humanity except for sub-Saharan Africa, New Guinea, and Australia.

Though that is *very* speculative, there is a rather entrancing feature of it. It covers essentially all the non-Negroid populations of humanity. That means that some offshoot population in Eurasia in the Upper Paleolithic had spoken Proto-Borean. This population had relatively light skin and straight or lightly-curled hair, though continuing to have black-colored hair and brown eyes. Light skin is an adaptation to low sunlight, but straight hair is less explicable. Was it an adaptation? Or sexual selection to distinguish some group? Or a result of Neanderthal admixture?
 
Returning to much closer to our time, I've found Dated language phylogenies shed light on the ancestry of Sino-Tibetan | PNAS
Given its size and geographical extension, Sino-Tibetan is of the highest importance for understanding the prehistory of East Asia, and of neighboring language families. Based on a dataset of 50 Sino-Tibetan languages, we infer phylogenies that date the origin of the language family to around 7200 B.P., linking the origin of the language family with the late Cishan and the early Yangshao cultures
That's about 5200 BCE. The Sino-Tibetan homeland is located on the lower part of the Yellow River, but inland from the coast and Beijing's location. The people there domesticated broomcorn millet, foxtail millet, pigs, and sheep.
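
For the BP-to-BCE conversions used in this thread, here is a small sketch. It assumes the usual convention that "Present" in BP means 1950 CE; the function name `bp_to_bce` is made up for illustration, and the missing year zero is ignored because these dates are rounded to centuries anyway.

```python
def bp_to_bce(years_bp, reference_year=1950):
    """Approximate BCE year for a date given in years Before Present.

    Assumes BP is counted back from `reference_year` CE (1950 by convention)
    and ignores the one-year offset from the lack of a year zero.
    """
    year_bce = years_bp - reference_year
    if year_bce <= 0:
        raise ValueError("date falls in the CE era")
    return year_bce

if __name__ == "__main__":
    for bp in (9000, 7200, 5250, 4000):
        print(bp, "BP is about", bp_to_bce(bp), "BCE")
```

The posts here simply subtract 2000, which lands within 50 years of this and is well inside the uncertainty of the dates themselves.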

That family's two main branches are the Chinese dialects (Sinitic) and the Tibeto-Burman languages, named for Tibetan and Burmese, which they contain. Proto-Chinese speakers were Sino-Tibetan stay-at-homes, while Proto-Tibeto-Burman speakers moved westward and then southward. They reached Xishanping at the NE end of the Tibetan plateau at 5250 - 4000 BP / 3250 - 2000 BCE. Around this time, the Proto-Tibeto-Burman people and the Proto-Chinese people acquired horses, cattle, and rice: the horses and cattle likely from Indo-European speakers to the west, and the rice from the Baligang people to the south, where it had been domesticated since 8700 - 8300 BP.

Proto-Tibeto-Burman speakers continued southwestward into Tibet and southward into the mountains of Southeast Asia, and Chinese speakers expanded southward much later.

Interesting curiosity: the Chinese word for horse, ma, is likely cognate with English "mare", from Proto-Indo-European *mark-. Checking horse - Wiktionary, I find:
  • Sino-Tibetan: Chinese: ma, Old Chinese *mra:?, Tibetan: rta, rmang, Burmese: rmang
  • Turkic: Turkish: at, Azeri: at, Tatar: at, Kazakh: jılqı, at, Turkmen: at, Kyrgyz: at, jılqı, Chuvash: lasha
  • Mongolian: aduu, mori
  • Tungusic: Manchu: morin, Nanai: morin, Oroqen: murin, Evenki: murin, Jurchen muri, Proto-Tungus *murin
Reconstruction:Proto-Tungusic/murin - Wiktionary notes the similar words for this animal in Mongolian, Chinese, Japanese, and some other Central and Southeast Asian languages. Seems like a Wanderwort, a "wander word", a word that travels with what it names.
 
Chinese is the oldest attested Sino-Tibetan language, going back over 3000 years.
A big difficulty is that Chinese writing is not phonetic but instead logographic, with one symbol for each word or morpheme (word part treated as a unit). But Chinese characters often have a phonetic part and a semantic part, like mother = woman + horse (kind of woman whose word sounds like the word for horse, "ma").

For Middle Chinese, one can look back with the help of rhyming dictionaries, present-day words, words from Chinese in Korean, Japanese, and Vietnamese, and borrowings into Chinese.

Middle Chinese grammar was much like present-day Chinese grammar in being mostly isolating, without inflections. However, Old Chinese had initial and final consonant clusters, something lacking from the present-day dialects. Reduction of these clusters induced the development of tones.
Most researchers trace the core vocabulary of Old Chinese to Sino-Tibetan, with much early borrowing from neighbouring languages. During the Zhou period, the originally monosyllabic vocabulary was augmented with polysyllabic words formed by compounding and reduplication, although monosyllabic vocabulary was still predominant. Unlike Middle Chinese and the modern Chinese dialects, Old Chinese had a significant amount of derivational morphology. Several affixes have been identified, including ones for the verbification of nouns, conversion between transitive and intransitive verbs, and formation of causative verbs.[4] Like modern Chinese, it appears to be uninflected, though a pronoun case and number system seems to have existed during the Shang and early Zhou but was already in the process of disappearing by the Classical period.[5] Likewise, by the Classical period, most morphological derivations had become unproductive or vestigial, and grammatical relationships were primarily indicated using word order and grammatical particles.
 
Turning to other putative members of Dene-Sino-Caucasian, I take another look at Basque. I searched Google Scholar for "Vasco-Caucasian" and "Euskaro-Caucasian", and also more broadly.
Attempts to estimate the time of Basque-NC divergence yield the early Neolithic, about right for Basque-NC to be a language family spread by the European Neolithic farmers. Basque-NC words in Latin, Germanic, Slavic, and Greek imply a spread all over Europe, making it likely to be *the* language family that those long-ago farmers spread.
 
 Pre-Finno-Ugric substrate
Pre-Finno-Ugric substrate refers to substratum loanwords from unidentified non-Indo-European and non-Uralic languages that are found in various Finno-Ugric languages, most notably Sami. The presence of Pre-Finno-Ugric substrate in Sami languages was demonstrated by Ante Aikio.[1] Janne Saarikivi points out that similar substrate words are present in Finnic languages as well, but in much smaller numbers.[2]

The number of substrate words in Sámi likely exceeds one thousand words.[3]

Borrowing to Saami from Paleo-Laplandic probably still took place after the completion of the Great Saami Vowel Shift. Paleo-Laplandic likely became extinct about 1500 years ago.[4]

The Nganasan language also has many substrate words from unknown extinct languages in the Taimyr peninsula.[5]
Lapland is the northern part of the Scandinavian peninsula. The Taimyr Peninsula is on the northern coast of central Siberia.
Vladimir Napolskikh has attempted to link them to the hypothetical Dené–Caucasian language family, but later had to admit that these substrate words have no apparent parallels in any known language on Earth.[9]

Yuri Kuzmenko tried to compare them to the hypothetical Pre-Germanic substrate words, but found no similarities apart from the distinction between central and peripheral accentuation.[10]
So Paleo-Laplandic is related to no known language.

"Irregular correspondences among Uralic languages are frequent among some words, such as 'to milk' and 'hazelnut'. These are presumed to be non-native loanwords by Aikio (2021)"

Such irregular correspondences would be the result of sound shifts in the donor languages, sound shifts not parallel to those in recipient languages. I've seen the same argument about pre-Indo-European substrate vocabulary in Europe, that irregular correspondences mean borrowing from different languages.
 
There are many texts on historical linguistics available for free download on the 'Net. I'll track down some of the URL's if there's interest. Here are just a few:

(1) One interesting book, English: the Language of the Vikings, argues the case that Middle English is descended from Old Norse instead of Old English! Most will reject that as ridiculous but the book has much evidence and interesting discussion. I might conclude that Middle English was a hybrid of the two sources, but the idea of a hybrid is rejected by BOTH sides of the debate! :) Briefly, Danish residents of the Danelaw retained their language. When William the Bastard and the Normans arrived and the natives resisted, the English-vs-Danish political situation flipped completely. Since one's enemy's enemy is one's ally there was suddenly incentive to merge (or form a koine of) two languages which were already close enough to facilitate bilingualism; it was even politically expedient to try for a 50-50 combination! As London — just to the South of the Danelaw — became the new center of England (and underwent a big influx of immigrants from the Danelaw region), Anglicized Norse (or Norsified English in the traditional view) became the new standard of English.

A very large portion of English words have agreed-upon Norse or Danish etymology. In addition many words have cognates in both Old English and Old Norse and, since by default those words are usually assigned an Old English etymology, the true contribution of Norse/Danish to English may be even higher.

(2) Trask's Historical Linguistics is on-line and has been updated since his death. It may have little new to offer, but has an interesting account of how the discovery of Hittite confirmed Saussure’s laryngeal theory which had been thought of as a clever but unimportant conjecture.

Trask's book also gives this famous story:
By about 1500, it is clear that people were often finding it exceedingly difficult to understand English-speakers from other areas. In a famous passage from 1490, the printer William Caxton reports that a merchant from the north of England walked into a tavern to the east of London and asked for eggys, and was told by the tavern-keeper that she could not understand French. The exchange became quite heated before another man stepped in and explained that the merchant was asking for eyren. This little bit of interpretation did the trick, and the merchant got his eggs. Here the merchant was using a northern word with a northern plural ending, while the tavern-keeper only knew the southern forms typical of Essex and Kent.

(3) Language Classification by Lyle Campbell and William Poser is available on-line. Something I learned from that book is that Gottfried Wilhelm von Leibniz — the great polymath sometimes compared to Leonardo da Vinci — wrote several papers on historical linguistics beginning in the 1690's! If anyone tracks down one of these papers, please let me know. Here's an example of one of several mentions.
 
 Substratum in Vedic Sanskrit - notes several substratum-derived features of that language. One of them is "retroflex" consonants, versions of (t,d,n) produced with the tongue at the roof of the mouth instead of at the teeth ("dental"). This is shared across the Indian subcontinent, and is clearly an areal effect, as linguists say. The Central Asian nomads who brought Sanskrit with them also picked up retroflex consonants when they settled down in India.
In 1955 Burrow listed some 500 words in Sanskrit that he considered to be loans from non-Indo-European languages. ...

These loanwords cover local flora and fauna, agriculture and artisanship, terms of toilette, clothing and household. Dancing and music are particularly prominent, and there are some items of religion and beliefs.[15] They only reflect village life, and not the intricate civilization of the Indus cities, befitting a post-Harappan time frame.[17] In particular, Indo-Aryan words for plants stem in large part from other language families, especially from the now-lost substrate languages.[5]
The sorts of words that settlers might want to borrow.
Mayrhofer identified a "prefixing" language as the source of many non-Indo-European words in the Rigveda, based on recurring prefixes like ka- or ki-, that have been compared by Michael Witzel to the Munda prefix k- for designation of persons, and the plural prefix ki seen in Khasi, though he notes that in Vedic, k- also applies to items merely connected with humans and animals.[9]: 12 ...

Witzel remarks that these words span all of local village life. He considers that they were drawn from the lost language of the northern Indus Civilization and its Neolithic predecessors. As they abound in Austroasiatic-like prefixes, he initially chose to call it Para-Munda, but later the Kubhā-Vipāś substrate.[5]
Some of this substrate vocabulary came in early, as Proto-Indo-Iranian nomads went south from their Sintashta homes and overran the Bactria–Margiana Archaeological Complex before splitting up with some going into the Indian subcontinent and some going into Iran.
Terms borrowed from an otherwise unknown language include those relating to cereal-growing and breadmaking (bread, ploughshare, seed, sheaf, yeast), waterworks (canal, well), architecture (brick, house, pillar, wooden peg), tools or weapons (axe, club), textiles and garments (cloak, cloth, coarse garment, hem, needle) and plants (hemp, mustard, soma plant).[3]
Also, "There are an estimated thirty to forty Dravidian loanwords in Vedic" even if not much that can be identified from the Munda languages (east India).
 