
Language as a Clue to Prehistory

Borrowings? Like hörcsög "hamster" from some Slavic source: Serbo-Croatian hrčak, Proto-Slavic *xrъčь > *xruči

Also Turkic borrowings: Hungarian ~ Turkish < Proto-Turkic:
hód ~ kunduz < *kundur' "beaver"
homok ~ kum < *kum "sand"
kapu ~ kapı < *kap-ïg "door, gate" < *kap- "to close"
kút ~ kuyu < *Kut- "well (noun)"

Were they from some Turkic language that already had some of this sound shift? Or did these words come into some ancestor of Hungarian before that sound shift? Opinion nowadays tends toward the latter option.

But words with k + front vowel or h + back vowel can still be borrowings, like kék "blue" from Turkic: Turkish gök, Proto-Turkic *kȫk "sky, blue".
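As a rough illustration of the pattern being discussed, here is a minimal sketch that only checks whether a Hungarian word's initial fits the expected distribution (h before a back vowel, k before a front vowel). It assumes the sound shift meant above is k > h before back vowels, and the function names and the orthographic vowel sets are mine, not from any source. As the kék example shows, fitting the pattern does not prove a word is inherited.

```python
# Hungarian vowels by orthography (assumption: spelling is a good enough proxy).
BACK = set("aáoóuú")
FRONT = set("eéiíöőüű")

def initial_pattern(word):
    """Return (initial, 'front' or 'back') for words starting with k or h, else None."""
    w = word.lower()
    if not w or w[0] not in "kh":
        return None
    for ch in w[1:]:
        if ch in BACK:
            return (w[0], "back")
        if ch in FRONT:
            return (w[0], "front")
    return None  # no vowel found after the initial

def fits_shift(word):
    """True if the word shows h + back vowel or k + front vowel, False if it
    shows k + back vowel or h + front vowel, None if it does not apply."""
    pattern = initial_pattern(word)
    if pattern is None:
        return None
    return pattern in {("h", "back"), ("k", "front")}

for w in ["hód", "homok", "kapu", "kút", "kék"]:
    print(w, initial_pattern(w), fits_shift(w))
```

On the words quoted above, hód and homok fit the shifted pattern, kapu and kút keep k before a back vowel, and kék fits the pattern despite being a borrowing.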
 
Then the wh-word effect: interrogative pronouns often starting with the same sounds or modifications of them: English who, what, whose, which, where, when, why, how, ... German has w-, Danish hv-, Swedish v-, Latin qu-, with c before u, ...

Georgian often has r-, like ra "what", rogor "how", romeli "which" with exceptions like vin "who" and sad "where".

This points to some single interrogative pronoun that was the source of these others, from compounding, derivation, or inflection.

Hungarian has ki "who", kié "whose" but hol "where" and hogy "how", among others. A separate one is mi "what" with miért "why", among others.

This kind of alternation, though between k and q, is evident in Yukaghir: kin "who", qadā "where", in Classical Mongolian: ken "who", qamiγa "where", and in Eskimo-Aleut: Greenlandic kina "who", qanga "when".

These interrogative k's are also found in Turkic (Turkish kim "who"), Chukotko-Kamchatkan (Itelmen k'e "who"), and Indo-European (*kwis "who" > Latin quis, English who, ....). With first-person *m and second-person *t also being common here, that points to a northern Eurasian macrofamily: Nostratic (Illich-Svitych, Dolgopolsky), Eurasiatic (Greenberg), Uralo-Siberian (Fortescue).

That's a good case for North Nostratic, though South Nostratic is weaker:

Kartvelian: 1s *me, 1p *Cwen, 2s *S(w)en, 2p *tkwen -- S is "sh", C is "tsh"

Dravidian: first-person *yâ-, second-person *nî-, interrogative *yâ- (who? what? which?)

The pronouns somewhat match for K but not for D.
 
  • Me watch Sophie. Sophie scratch cat head -- stripped-down grammar.
Tarzan's or Ape language - "Me Tarzan, you Jane"
Hindi, Urdu and related languages: kyun (why), kab (when), kahan (where), kaise (how), kisliye (what for), kisne (who, did it), kya (what), kiska, kiski (whose, masculine, feminine), kaun sa, kaun si (which one, masculine, feminine)
All from Sanskrit < IE < some PIE (probably).
 
In Thai, question words usually use the vowel /ai/
"what" = /a-rai/ < /an-rai/ = "thing which?"
"who" = /khrai/ < /khon-rai/ = "person which?"
"when" = /meua-rai/ = "time which?"
"when" = /ton-nai/ = "time which?"
"why" = /chnai/
"why" = /tham-mai/ < /tham-rai/ = "make which?"
"where" = /tii-nai/ = "place which?"
"how" = /yang-rai/ = "method which?"
question = /mai/ < /rue-mai/ = "or not"
 
Then Roger Blench on "(De)Classifying Arunachal languages: Reconsidering the Evidence" - reassessing the languages of India's northeastern state Arunachal Pradesh. He concludes that some of them are not Sino-Tibetan but in their own separate families or are isolates.

Then Gregory Haynes on Yggdrasil, the Old Norse legendary world-axis ash tree (orig. Yggdrasill). He starts off with discussing "axis mundi", Latin: "axle of the world" or "world axis". That concept is inspired by the (apparent) motion of the celestial bodies around the Earth; they move in circles around a line that stretches through the Universe.

Working from Michael Witzel's work, he then notes the appearance of the Milky Way Galaxy: a glowing belt across the sky that is two parallel belts in parts of the sky. At one branching point are the constellations Cygnus, the swan, and Aquila, the eagle, and at the other branching point Capricornus, the goat. In the middle of the single-belt part is Hydra, the serpent. GH suggests that some of Yggdrasil's features were inspired by the Milky Way, like its whitish color, the snake or dragon at its base, the birds associated with it, and the goat that lives in its top branches.

Yggdrasill = Odin's horse? Ygg is a common title for Odin, "the terrifying one", but drasill? GH proposes:

PIE *terkw- "to turn, spin" > Latin torquêre "to spin, twirl, turn, twist, wind, ..." > E "torque"

From Grimm's law and loss of non-initial h, one would expect Old Norse to have *thra-. But in compounds, /th/ often -> /dh/ often -> /d/. Thus, the d.

So this would mean "spindle", something for spinning wool or cotton. One starts with a pile, pulls a bit out of it, attaches that to the spindle, then turns the spindle to make a thread or string. This technology is at least as old as Neolithic farming, and it continues to be used in highly mechanized form.

The second part, sill, has cognates like E "sill" in "windowsill" and the like, and goes back to PIE *swel-, *sel- "plank, board, wooden post".

So instead of Odin's horse, it could be Odin's spindle post. Why the association with horses? Horses for pulling celestial bodies across the sky? Horses harnessed to a post and made to walk around it to thresh grain?

GH then gets into nature symbolism more generally, citing Leda and the swan. In Euripides's play "Helen", about that famous citizen of Troy, the titular character says "Zeus took the feathered form of a swan, and that being pursued by an eagle, and flying for refuge to the bosom of my mother, Leda, he used this deceit to accomplish his desire upon her."

So Leda is the Milky Way, with a swan near where her legs come out of her torso, and with an eagle on her lower legs. That swan? Zeus himself.

Then Michael Witzel himself went into detail about possible celestial symbolism in the Vedas.
 
Finally, in this issue of "Mother Tongue", a book announcement. Editor Pierre Bancel recently published the book "Pris aux mots – De l'origine du langage à l'origine des langues" -- "Taken at Words - From the Origin of Language to the Origin of Languages" (Google Translate) "Taken from Words – From the Origin of Language to the Origin of Languages" (Bing Translator)
It adopts a definitely evolutionary perspective to explain how a speechless ape species, in a series of steps, conquered first the human voice, then a host of hum interjections, then the first syllables, then a lot of them, and finally assembled them into narrations before syntax evolved.

He has unearthed several striking facts, some already known to a few long-rangers, like the Proto-Sapiens negative/prohibitive particle **ma, some others which had gone unnoticed, like the universality of hum interjections in modern humans, and made some stunning observations, such as his granddaughter Celeste, aged 22 months, telling the complete story of their encounter with a singing cuckoo.
An English translation is in the works, and I'll be wanting to find out what those "striking facts" are.
 
I've seen the theory that singing came before speech. That seems plausible to me, because it establishes both generation and recognition of sequences of sounds without having to have any semantic content. That makes it much like wolf howls, bird songs, and whale songs.

Once that foundation has been built, one can start adding semantics, associating speech sounds with meanings.

Linguists have composed some whimsical names for theories of origin of language:
  • Ding-dong - some connection between sound and meaning
  • Bow-wow - imitation of other sounds
  • Pooh-pooh - exclamations
  • Yo-he-ho - exertion sounds
  • Ta-ta - imitations of gestures
  • La-la - audio doodling
So this is the la-la theory of language origin.

Related to this is the notion of phonesthesia. I wanted to link to a paper on this, but Google's dumbification continues. Bing search did come up with the mediocre paper https://www.researchgate.net/publication/309289841_The_Reality_of_English_Phonaesthemes

Briefly -- and I hope our expert linguists will correct me if I'm wrong -- the theory is that certain sounds acquire connotations, perhaps initially by chance. For example, "slip", "slide", "slippery" have related connotations so "slither" might end up preferred over alternatives because it seems to fit!

"snout", "snot", "snuff", "sniff", "snore", "sneeze" have different etymologies but all relate to the nose. Is it a surprise then that "snicker" connotes laughing with one's nose? ("Smooch", "smile", "smack (one's lips)" all relate to mouth. "Thrash", "thrust", "throw", "threaten" all suggest violence.)

But my impression is that linguists tend to be skeptical of this whole idea!
As someone who used to do linguistics as a day job for part of their life and as someone who still calls themselves a linguist by training, I haven't encountered the term "phonesthesia" (or, equally possible, forgot all about it), but I don't find anything particularly objectionable about the concept as you describe it either.

To the extent that it is by chance, it seems to be a special case of what I know as "contamination", where a word deviates from what would be expected from regular sound changes in ways that make it more similar to semantically related words. In the cases where it isn't, it could be subsumed under onomatopoeia.

Google isn't what it used to be and I'm having a hard time finding results for "contamination in etymology" that aren't about the etymology of contamination, but here's one (paywalled) result: https://oxfordre.com/linguistics/di...9384655.001.0001/acrefore-9780199384655-e-457

Interestingly, from your English examples, I find that I can only replicate a fraction in German. We do have "Schnauze" and (related) "schneuzen" for "snout" and "clear one's nose", and "schniefen" for something like "silently cry through one's nose", but already "sneeze" is "niesen" without a "sch".
 
the initial cluster gl- is found in words referring to light, vision or especially shininess: glitter, glimpse, gleam, glow etc.
fl- suggests lightness and quickness: fly, flee, flow, fluid, and flicker

sl- is often pejorative; but sl- can also denote sliding or slipping.
So "slime" and "slither" (and "slug"?) each combine BOTH of the connotations of sl-

I wikipedia'd a bit and found some more examples:
[David Crystal] shows that English speakers tend to associate unpleasantness with the sound sl- in such words as sleazy, slime, slug, and slush, or they associate repetition lacking any particular shape with -tter in such words as chatter, glitter, flutter, and shatter.
So "glitter", "flitter" and "flutter" (and "snitter") each incorporate TWO phonesthemes. If -ther is accepted as a substitute for -tter then "slither" incorporates THREE phonesthemes.
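The counting above can be sketched as a simple string match. This is only a toy, assuming that a phonestheme can be spotted from spelling alone. The glosses are paraphrased from the posts in this thread, and the table and function name are mine, not from any published inventory.

```python
# Onset and ending phonesthemes mentioned in this thread (spelling-based, illustrative only).
ONSETS = {
    "gl": "light, vision, shininess",
    "fl": "lightness and quickness",
    "sl": "pejorative, or sliding/slipping",
    "sn": "nose",
}
ENDINGS = {
    "tter": "repetition lacking any particular shape",
}

def phonesthemes(word):
    """List the candidate phonesthemes found at the start and end of a word."""
    w = word.lower()
    found = [f"{onset}-" for onset in ONSETS if w.startswith(onset)]
    found += [f"-{ending}" for ending in ENDINGS if w.endswith(ending)]
    return found

for w in ["glitter", "flutter", "snitter", "slime", "chatter"]:
    print(w, phonesthemes(w))
```

A match like this says nothing about whether the association is real. That would need a statistical comparison against the rest of the vocabulary, as in the papers mentioned.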
 
Major Quisling has added a new word to the English language. To writers, the word Quisling is a gift from the gods. If they had been ordered to invent a new word for traitor … they could hardly have hit upon a more brilliant combination of letters. Aurally it contrives to suggest something at once slippery and tortuous. Visually it has the supreme merit of beginning with a Q, which (with one august exception) has long seemed to the British mind to be a crooked, uncertain and slightly disreputable letter, suggestive of the questionable, the querulous, the quavering of quaking quagmires and quivering quicksands, of quibbles and quarrels, of queasiness, quackery, qualms and quilp.
- The Times, April 19th, 1940

(The "august exception" is, of course, "Queen"*).








* Though it seems unlikely that the Editor of The Times in 1940 was aware of the future careers of Freddie Mercury, Brian May, Roger Taylor, or John Deacon.
 
Darn it, I was going to make a Queen joke before I scrolled down.

ETA: plus, quotidian and quiddity are great words.
 
Unde Venisti? The Prehistory of Italic through its Loanword Lexicon - Leiden University - summary
Unde venisti? | Scholarly Publications - full-text links
Unde venisti: Latin: "Where did you come from?"
Word-for-word: From-where you-came?
cui dōnō lepidum novum libellum
āridā modo pūmice expolītum?

namque vōs solēbātis
meās esse aliquid putāre nūgās.
In between: "To my parents"
Following: "(Adapted from Catullus 1)"

Google Translate:
To whom do I give a nice new book
just polished with pumice stone?

For you used to think that my little things were something.
Bing Translator:
To whom do I gift this charming new little book, just polished with dry pumice? For you used to think that my trifles were something.
 
"Es gibt keine Mischsprachen" vs. "Es gibt keine völlig ungemischte Sprache".

German: Google Translate, Bing Translator: "There are no mixed languages" vs. "There is no such thing as a completely unmixed language".

Then assessing various possible sources of borrowings into Latin and its ancestors. About now-lost Indo-European langs,
In sum, these attempts are all based on the perfectly plausible idea that the attested Indo-European languages were not the first Indo-European languages to be spoken in the area where they are attested. The irregularity of the data can a priori be due the intersecting effects of contact with 1) any number of different lost IE languages 2) over a long period of time in which both sides of the contact situation were undergoing sound changes.
This leaves non-IE langs,
Whether or not traces of lost Indo-European languages exist in the attested daughter languages and whether these are visible as such, there are certainly a large number of cases for which an attempt at an Indo-European etymology is futile. Thus the focus of the rest of this chapter, and of this thesis in general is on the research of non-Indo-European sources of Latin vocabulary.

Like Latin plumbum ~ Greek molubdos "lead (metal)" from what Antoine Meillet called "une troisième langue inconnue" (French: a third unknown language). Of the other words,
  • vaccīnium "blueberry" ~ vacca "cow" or Greek huakinthos "blue to red, some flowers" > "hyacinth"
  • cupressus "cypress (tree)" ~ Gk kuparissos ~ kuparittos
  • menta "mint (plant)" ~ Gk minthê ~ minthâ ~ minthos
  • rosa "rose (flower)" ~ Gk rhodon ~ wrodon ~ brodon
  • līlium "lily (flower)" ~ Gk leirion
  • fīcus "fig (tree, fruit)" ~ Gk sukon ~ tukon ~ Armenian t'uz
  • lībra "Roman pound" ~ Gk litra "Sicilian coin, unit of weight" (borrowing from early Italic *lidhra?)
  • vīnum "wine" ~ Greek oinos (ancestral IE?)
About the last one, I checked on Ukrainian and Russian wineries, and they are all near the Black Sea. So it's iffy whether PIE speakers knew about wine.

Also, it's likely that Germanic, Celtic, and Balto-Slavic speakers learned about wine from the Roman Empire; their words are likely borrowings of the Latin word.
 
. . . Related to this is the notion of phonesthesia. I wanted to link to a paper on this, but Google's dumbification continues. Bing search did come up with the mediocre paper https://www.researchgate.net/publication/309289841_The_Reality_of_English_Phonaesthemes

Briefly -- and I hope our expert linguists will correct me if I'm wrong -- the theory is that certain sounds acquire connotations, perhaps initially by chance. For example, "slip", "slide", "slippery" have related connotations so "slither" might end up preferred over alternatives because it seems to fit!

New words arise via borrowing or semantic shift. Since there are many opportunities to borrow or shift, is it an interesting question WHICH borrowings or shifts become permanent? I think that the phonaesthesia hypothesis offers explanation, at least in a few cases, of why a new word takes hold. Some papers (including this one, which mentions Wallis) present statistical evidence that such phonaesthemic connections occur oftener than random.

"snout", "snot", "snuff", "sniff", "snore", "sneeze" have different etymologies but all relate to the nose. Is it a surprise then that "snicker" connotes laughing with one's nose? ("Smooch", "smile", "smack (one's lips)" all relate to mouth. "Thrash", "thrust", "throw", "threaten" all suggest violence.)

But my impression is that linguists tend to be skeptical of this whole idea!
As someone who used to do linguistics as a day job for part of their life and as someone who still calls themselves a linguist by training, I haven't encountered the term "phonesthesia" (or, equally possible, forgot all about it), but I don't find anything particularly objectionable about the concept as you describe it either.

I read a few posts at the sci.ling Usenet group that ridiculed the hypothesis as laymen's ignorance. Indeed IF the hypothesis were widely embraced by linguists THEN one would expect it to get more mention in textbooks, no?

I am intrigued by cases where a genius in one field famously has insight in an unrelated field. Two examples:
  • Georg Cantor, an outstanding 19th century mathematical genius, was (along with other 19th century geniuses, e.g. Mark Twain) one of the earliest to understand that the Shakespeare Authorship was a hoax. (Setting aside of course those in the early 17th century who knew of the hoax and even hinted at it, but could not openly divulge a state secret.)
  • John Wallis, an outstanding 17th century mathematical genius, wrote Grammatica linguae Anglicanae which allegedly introduces the phonesthesia hypothesis.
 
Research into Germanic substrate words has often focused on words without etymologies from known sources, and that is risky. “Wer sagt uns, daß dies nicht morgen der Fall sein werde?” German: "Who says that this won't be the case tomorrow?"

Then discussing Etruscan, and then some methodological issues. I must note that the pages have "Unde vēnistī? The Prehistory of Italic through its Loanword Lexicon" marking the vowel lengths in that Latin phrase.

They called their compared words "comparanda", as opposed to "cognates", to be more non-committal.

That's the Latin future passive participle: "(something) to be compared". Another word with that participle is the name "Amanda" - "(someone) to be loved". Latin also has a future active one, like "futûrus" - "(something) to be" > E future.

Then various criteria for recognizing borrowings, like "s" between vowels. In inherited Latin words, s between vowels usually underwent rhotacism, so a word that keeps it is suspect:
/s/ > /z/ > /r/
like Latin honos, honor- > E "honor"
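The intervocalic-s criterion can be sketched as a one-line pattern match. This is a minimal sketch under my own assumptions: it works on spelling only, the vowel set and function name are mine, and it ignores known exceptions (for example, s that comes from an earlier ss), so a hit is only a flag for closer inspection, not a verdict.

```python
import re

# Latin vowels as spelled in this thread, including macron and circumflex forms.
VOWELS = "aeiouyāēīōūâêîôû"
INTERVOCALIC_S = re.compile(rf"[{VOWELS}]s[{VOWELS}]")

def retains_intervocalic_s(word):
    """True if the spelling has a single s flanked by vowels,
    i.e. the word seems to have escaped /s/ > /z/ > /r/."""
    return bool(INTERVOCALIC_S.search(word.lower()))

for w in ["asinus", "casa", "rosa", "honor", "honos"]:
    print(w, retains_intervocalic_s(w))
```

Here asinus, casa, and rosa are flagged, while honor (already rhotacized) and honos (final s, not between vowels) are not.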
 
Thus, likely borrowings
  • asinus "donkey" ~ Greek onos ~ (?) Sumerian anshu ~ (?) Proto-Semitic *atân- "female donkey"
  • casa "cottage, hut" > most Romance casa "house, home", French preposition chez "at the home/location of"
  • caballus "horse" replaced equus in Latin's descendants ~ Proto-Celtic *kappelos ~ Gk kaballês ~ Proto-Slavic *kobyla "mare"
  • calix "cup, pot" ~ Gk kulix "cup"
  • citrus ~ Gk kedros "cedar"
  • columba "pigeon, dove" ~ PSlav *golombi ~ ...
  • cupressus "cypress (tree)" ~ Gk kuparissos ~ Hebrew gofer "gopher wood"
  • ervum "bitter vetch" ~ Proto-Germanic *arwit- "pea" ~ Gk orobos "bitter vetch" ~ Gk erebinthos "chickpea"
  • faba "bean" ~ PGmc *baunô- ~ PSlav *bobu ~ Proto-Berber âbâw-
  • ferrum "iron, steel" (< *fersum) ~ PGmc *braes- "brass, copper" ~ Luwian *parza- "iron" ~ Svan bereZ "iron" ~ (?) Ingush, Chechen borza "bronze"
  • fungus "mushroom, fungus, sponge" ~ Gk sphonggos "sponge"
  • laurus "laurel, bay tree" ~ Gk daphnê, daukhna
  • nux "nut" ~ PGmc *hnut- ~ PCelt *knû-
  • râpum "turnip" ~ Gk raphus "turnip", raphanos "cabbage, radish" ~ PGmc *rôbjôn- "turnip" ~ PSlav *rêp- "turnip" ~ Proto-Baltic *râp- ~ PCelt *arbîno- "turnip"
  • sabulum "sand" ~ Gk (ps)ammos ~ (ps)amathos ~ PGmc *sanduz
Then possible borrowings. One of them is arâneus "spider" ~ Gk arakhnê. Some purported borrowings author AM Wigman decided were likely inherited, like aqua "water" and vînum "wine", though aqua has cognates only in Germanic, and though much of the PIE homeland may be too cold for growing grapevines in.

Grapevines are perennial plants, meaning that they have to survive winters, unlike annual plants, which only have to have seeds that can survive winters. They do that by going dormant, like a deciduous tree.

Grapevine Cold Hardiness - Viticulture & Enology: The common European grapevine, Vitis vinifera, tolerates a minimum temperature of around -20 C.

The growing season of grapevines is roughly March to September in the Northern Hemisphere (Annual growth cycle of grapevines), and the plant must be above freezing for all this time.

Rome and Athens are both grapevine-safe, so I looked at the PIE homeland. Odesa and Kyiv are at the borderline of grapevine safety, with Kyiv maybe a little bit over. There is also the question of whether nomads would want to grow perennial crop plants like grapevines. Annual ones would not be much of a problem, but perennial ones might only be feasible if they are root vegetables with rhizomes that can be dug up and planted elsewhere.
 
Then a lot of analysis of features, like alternations of sounds like d ~ l, l ~ r, a sometimes-present a-, and some suffixes.
A remarkable pattern emerges wherein the words with a Mediterranean distribution attest to a set of irregular alternations that are also by and large restricted to a Mediterranean distribution.
Some of these words are for plants with a Mediterranean distribution, like the Mediterranean buckthorn, the box tree, the fig tree, and the cypress tree.

The author then gets into contact scenarios, proposing an Italic-Celtic-Germanic substrate or substrates, and also a Mediterranean one, one shared with Greek.

Sorting out by semantics,
It is clear that the non-inherited lexemes of Latin are indeed overwhelmingly plants (40%) and animals (19%), though they are certainly not all economically unimportant. Beyond being unable to say with certainty which animals and plants would have been economically unimportant to ancient peoples, several of the words refer to domesticated or otherwise edible species. There are also several words referring to items of material culture, including tools, vessels, and textiles.
I'd say "largely" rather than "overwhelmingly", but that is indeed correct.
The 11 non-inherited words for domesticated plants as a group are important, in that they seem to confirm that a portion of the non-inherited vocabulary in the Indo-European languages was indeed borrowed from a population practicing intensive agriculture.
Not very many in his list that have North Caucasian cognates, let alone Basque ones.
 
After mentioning again 40% plants and 19% animals, which he states as 20% there, Dr. Wigman continues with "All 3 words for vessels, all 3 culinary terms, and 3 of 4 textile terms are shared with Greek and/or attest to a Mediterranean distribution."

"On the other hand, the 6 Italo-Celto-Germanic isoglosses lack domesticated species and include corbis ‘basket’ and hasta ‘spear’, hinting at the much earlier cultural contexts in which they were borrowed."

Then he gets into when Italic speakers might have arrived in the Italian peninsula, noting the first arrival of steppe-derived populations at around 2000 BCE as Bell Beaker people. A more plausible candidate for Proto-Italic speakers, though, is the Urnfield culture, which spread out from central Europe starting in 1300 BCE and was also the origin of the Hallstatt culture of central Europe, often identified as Proto-Celtic speakers.
 
I've always been most impressed by the methods and results of Ringe and Warnow. I attach their proposed chronology below.
I see that the initial split in Italic between LA (Latin) and OS-UM occurs at 1300 BC, exactly as lpetrich proposes.

While looking for that tree, I happened on a large book about I-E which is on-line and public domain(?) with the name "The Indo-European Language Family: A Phylogenetic Perspective : Edited by Thomas Olander"

lpetrich's discussion of animal and plant names reminds me that 25 years ago or so, there was a big controversy about "beech" < *bhago- . The name of that tree is present in most branches of I-E but the plant itself was NOT found in the alleged Homeland near Volgograd. This seemed to be treated as a big objection to the hypothesis of Pontic–Caspian Steppe origin. (Gimbutas apologists conjectured that the PIE folk had BIRCH trees they called bhago, and that word was adopted by branches who did have beech.)

BUT some expert in unearthing fossilized pollen(?!!?) determined that that region DID have beech trees 5500 years ago. Wow!

- - - - - - - - - - - -
[Attached image: Ringe-Warnow phylogenetic tree of Indo-European]
 
[2502.11688] From Isolates to Families: Using Neural Networks for Automated Language Affiliation

Over the past few decades, linguists have assembled big databases of linguistic data for many languages: corpora (collections of texts), vocabulary lists, and lists of grammatical features.

Grammatical features?
Quantitative methods that exclusively use grammatical data as their primary evidence have less frequently been proposed than their lexical counterparts. Despite this, Dunn et al. (2005) suggest that analyzing grammatical data could lead further back in time than the traditional comparative method. Today, these claims have lost supporters due to other studies indicating that grammatical features alone are less well suited for language classification because they diffuse easily in cases of language contact Gray et al. (2010); Greenhill et al. (2010). The high potential for such diffusion is due to the limited amount of variation that grammatical features exhibit Wichmann (2017). However, these dynamics remain understudied, and we lack further case studies to analyze the behavior of grammatical data in large-scale classification settings.
They use Aharon Dolgopolsky's simplified phonology, and code each entry as a vector of 0's for every consonant class except for the one that it encodes, which gets a 1. They encoded the first two consonants, with all 0's for no consonant. This is a typical way of encoding categorical data, data that can have any of more than two values.
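To make that encoding concrete, here is a minimal sketch of one-hot coding the first two consonants of a word form. The class labels and the letter-to-class table are a simplified stand-in that I made up for illustration. They are not the paper's actual table, and the function names are hypothetical too.

```python
# Hypothetical, simplified consonant classes in the spirit of Dolgopolsky's scheme.
CLASSES = ["P", "T", "S", "K", "M", "N", "R", "W", "J", "H"]
SOUND_TO_CLASS = {
    "p": "P", "b": "P", "f": "P",
    "t": "T", "d": "T",
    "s": "S", "z": "S", "c": "S",
    "k": "K", "g": "K", "q": "K", "x": "K",
    "m": "M",
    "n": "N",
    "r": "R", "l": "R",
    "w": "W", "v": "W",
    "j": "J", "y": "J",
    "h": "H",
}

def one_hot(cls):
    """Vector with a 1 at the position of cls and 0 everywhere else."""
    return [1 if c == cls else 0 for c in CLASSES]

def encode(word, slots=2):
    """Concatenate one-hot vectors for the first `slots` consonants;
    a missing consonant contributes an all-zero block."""
    consonants = [SOUND_TO_CLASS[ch] for ch in word.lower() if ch in SOUND_TO_CLASS]
    vector = []
    for i in range(slots):
        if i < len(consonants):
            vector.extend(one_hot(consonants[i]))
        else:
            vector.extend([0] * len(CLASSES))
    return vector

print(encode("kum"))  # K block followed by M block
print(encode("ma"))   # M block followed by an all-zero block
```

Each word becomes a fixed-length vector (20 numbers with this toy table), which is the kind of input a neural network expects.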

Grammatical features are easier, since many of them are scored as true/false.
 