Language as a Clue to Prehistory

lpetrich · Dec 2, 2019

Words for domestic animals and technologies may be good clues. The presence of those animals and technologies is a necessary but not sufficient condition for the presence of Indo-European speakers. Not sufficient, because they can be present in the absence of speakers of IE languages. But their absence is good evidence of the absence of IE speakers.

Looking at domestic animals, one finds "dog", "cow", "bull", "pig", "sheep", "goat", "horse", and "foal" (baby horse). A word for dog does not tell us much, since dogs are humanity's first domesticated animal. Most of the others were domesticated in the Middle East, and horses were domesticated in the steppe belt between eastern Ukraine and Kazakhstan.

They knew about a variety of wild animals: "wolf", "bear", "deer", "elk", "eagle", "mouse", "snake", and "trout/salmon".

Looking at technologies, one finds "wheel", "axle", "yoke", "wagon", meaning that the PIE speakers had wheeled vehicles. The word for wheel is derived from a word for rolling or turning, making it much like our word "roller".

One also finds words for wool, flax, spinning, and weaving, so they had woven clothing in addition to animal-skin clothing.

One also finds "metal" and "gold", with no evidence of "iron". Words for iron vary across the IE branches.

So we must look for evidence of wheeled vehicles.

Why It Took So Long to Invent the Wheel | Live Science
A wheel in isolation is rather simple-looking, but a wheeled vehicle is not. The body of the vehicle has to have at least one axle, and the wheels have to fit onto the ends of the axles while being loose enough to rotate. The wheels themselves have to be close to circular with their axle holes in their centers.

The wheels, axles, and vehicle bodies were all made of wood, and their manufacture required metal tools. Stone tools are not precise enough. The first metal usable for tools is bronze, a copper-tin alloy. An early form of bronze was copper-arsenic, but it did not last long. Arsenic is well-known for its toxicity, and its users may eventually have decided that it is jinxed.

The first kind of wheel was likely a solid wheel, made from a slice of a log or from fastening some boards together. Spoked wheels were likely a later invention. The first wheel that we used may not have been a vehicle wheel but a potter's wheel, something easier to build.

So the image of a caveman carving a wheel is absolute bullshit. Never mind that most Paleolithic people did not live in caves, because there aren't many to live in. Instead, they made huts for themselves, something like what people with similar levels of technology were discovered doing by European and European-descended explorers. Most of these huts have not survived, but there are a few survivors: mammoth-bone huts in what's now European Russia.

We don't know for sure where and when the wheel was invented, but the first evidence of wheels is in Southeastern Europe and the Middle East. So the invention likely spread quickly once it was made.

Horses and wheeled vehicles point to an identification of the place and time of the Proto-Indo-European homeland as that steppe zone about 5000 years ago - the Yamna or Yamnaya culture.

lpetrich · Dec 2, 2019

Urheimat (German: "original home", "homeland") - homelands of protolanguage speakers. That article discusses homeland hypotheses for several language families, and I will discuss some of them here.

Austronesian is an interesting case. The most divergent languages of this family are spoken in Taiwan, and that is why Taiwan is inferred to be the Austronesian homeland. Taiwan was settled around 3000 BCE from South China, and there are some speculative hypotheses about Austronesian's closest relatives, but that's about it.

There are 9 highest-level branches of Austronesian in Taiwan, and a tenth one outside Taiwan: the Malayo-Polynesian languages, all the rest of the Austronesian ones.

Proto-Austronesian language

Proto-Malayo-Polynesian language

Proto-Oceanic language

Proto-Polynesian language

The northern Philippines were colonized around 2200 BCE, and the rest of the Malayo-Polynesian domain after that. Malayo-Polynesian has one big subfamily, Oceanic, and lots of small ones with disputed relations between them. Those ones are almost all west or northwest of New Guinea, and they include nearly all Philippine and Indonesian languages.

One of these small families is the Barito family, named after some of its speakers living near the Barito River in southern Borneo. I say "some", because one Barito language is spoken far away: Malagasy in Madagascar. So some southern Borneans traveled a *long* way to find some land to colonize, and they found some such land in Madagascar, reaching that island around 500 CE.

The great-circle distance is about 4500 mi / 7300 km, and the distance along the shorelines is around 9500 mi / 13400 km.
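For anyone who wants to check the great-circle figure, here is a minimal sketch using the haversine formula. The two endpoints are my own rough assumptions (a point near the mouth of the Barito River and a point in central Madagascar), not coordinates from any source cited here, so the result only shows the order of magnitude.

```python
import math

def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Haversine distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))

# Assumed, approximate endpoints (illustration only)
BARITO_MOUTH = (-3.3, 114.6)   # southern Borneo, near the Barito River
MADAGASCAR = (-18.9, 47.5)     # central Madagascar

km = great_circle_km(*BARITO_MOUTH, *MADAGASCAR)
miles = km / 1.609344          # international mile
print(f"{km:.0f} km / {miles:.0f} mi")
```

Moving either endpoint shifts the answer by a few hundred km, so this should be read as a sanity check on the "about 7300 km" figure rather than a precise measurement.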

Turning to Oceanic, the speakers of Proto-Oceanic likely lived in the Bismarck Archipelago NE of New Guinea around 1600 BCE. Their remains are likely the Lapita archeological culture. Their descendants spread northward and eastward to much of Micronesia (northward), much of Melanesia, and Polynesia (eastward).

Proto-Polynesian was likely spoken in Samoa and Tonga around 800 - 900 BCE. Polynesians spread eastward and then northeastward, southeastward, and southwestward. They reached Tahiti and the Marquesas Islands around 700 CE, Hawai'i around 900 CE, Rapa Nui (Easter Island) around 1000 - 1200 CE, and Aotearoa (New Zealand) around 1200 CE.

So the archeology agrees fairly well with what one can infer about the higher-level linguistic relationships.

lpetrich · Dec 3, 2019

Back to Indo-European,

Indo-European vocabulary is a big list. Its table of contents:
1 Notes
2 Kinship
3 People
4 Pronouns, particles
5 Numbers
6 Body parts
7 Animals
8 Agriculture
9 Bodily functions and states
10 Mental functions and states
11 Natural features
12 Directions
13 Basic adjectives
14 Construction, fabrication
15 Self-motion, rest
16 Object motion
17 Time
18 References
19 External links

Most of these reconstructed words are for rather commonplace sorts of things. For example, one can reconstruct a word for clothing in PIE, but it is a generic sort of word. Words for kinds of clothing vary widely in the dialects, with similarities often due to borrowing, so it may be difficult to reconstruct PIE words for different kinds of clothing.

An indicator of climate is a shared word for snow. English snow, German Schnee, Swedish snö, Latin nix, niv-, Greek niph-, Russian sneg, etc. have a common ancestor: *sneigwh-

This means temperate or polar and not subtropical or tropical, something consistent with the steppe-zone hypothesis for the PIE homeland. However, it is not a very precise match, and it is consistent with most other PIE homeland hypotheses that have been proposed.

lpetrich · Dec 3, 2019

The Numbers List at zompist.com -- 1 to 10

List of numbers in various languages -- 1 to 10 and sometimes to 20

Proto-Indo-European numerals

The "primary" ones are 1 to 10, 100, and 1000. 20 is listed as 2-10, 30 as 3-10, etc. The word for 10 is *dekm, and the word for 100 is *kmtom, suggesting that the word for 100 was originally *dkmtom, sort of "super 10".

Indo-European migrations - illustrates spread from the PIE homeland in the steppe zone north of the Black and Caspian Seas. What's now South European Russia and nearby.

Wiktionary, the free dictionary - if you want to track down the ancestors of present-day word forms, this is a good place -- it includes plenty of etymologies going as far back as mainstream linguists consider reliable.

lpetrich · Dec 3, 2019

Austronesian Basic Vocabulary Database: Main

Each language in our database has around 210 words associated with it. These words correspond to basic items of vocabulary, such as simple verbs like 'to walk', or 'to fly', the names of body parts like hand or mouth, colors like red, numbers (1, 2, 3, 4) and kinship terms such as Mother, Father and Person. The full list is here.

Language Phylogenies Reveal Expansion Pulses and Pauses in Pacific Settlement | Science

Debates about human prehistory often center on the role that population expansions play in shaping biological and cultural diversity. Hypotheses on the origin of the Austronesian settlers of the Pacific are divided between a recent “pulse-pause” expansion from Taiwan and an older “slow-boat” diffusion from Wallacea. We used lexical data and Bayesian phylogenetic methods to construct a phylogeny of 400 languages. In agreement with the pulse-pause scenario, the language trees place the Austronesian origin in Taiwan approximately 5230 years ago and reveal a series of settlement pauses and expansion pulses linked to technological and social innovations. These results are robust to assumptions about the rooting and calibration of the trees and demonstrate the combined power of linguistic scholarship, database technologies, and computational phylogenetic methods for resolving questions about human prehistory.

From the paper:

Our results place the Formosan languages of Taiwan at the base of the trees immediately after the outgroups (Fig. 1). Following these are the languages of the Philippines, Borneo/Sulawesi, Central Malayo-Polynesia, South Halmahera/West New Guinea, and the Oceanic languages. This chained topology is precisely the structure predicted by the pulse-pause scenario.

As to technological and social innovations,

The first pause between the settlement of Taiwan and the Philippines may have been due to the difficulties in crossing the 350-km Bashi channel between Taiwan and the Philippines (4, 6). The invention of the outrigger canoe and its sail may have enabled the Austronesians to move across this channel before spreading rapidly over the 7000 km from the Philippines to Polynesia (4). This is supported by linguistic reconstructions showing that the terminology associated with the outrigger canoe complex can only be traced back to Proto-Malayo-Polynesian and not Proto-Austronesian (41).

One possible reason for the second long pause in Western Polynesia is that the final pulse into the far-flung islands of Eastern Polynesia required further technological advances. These might have included the ability to estimate latitude from the stars, the ability to sail across the prevailing easterly tradewinds, and the use of double-hulled canoes with greater stability and carrying capacity (4, 42). Alternatively, the vast distances between these islands might have required the development of new social strategies for dealing with the greater isolation found in Eastern Polynesia (42). These technological and social advances in Eastern Polynesia may also underlie the fourth pulse into Micronesia.

Awfully impressive work for people with Neolithic technology. Sort of like the big cities and empires of the pre-Columbian Americas.

lpetrich · Dec 3, 2019

Now to substrate languages. Conquest is often imperfect, with the conquered peoples' place names often surviving and with the conquerors often borrowing words from the people that they conquered.

This is evident in the contiguous United States, whose northeast and southwest parts have clearly different substrate languages.

In the northeast, one finds place names like Massachusetts, Connecticut, Narragansett, Susquehanna, and Rappahannock, and borrowed words like "skunk" and "raccoon" and "opossum".

In the southwest, one finds place names like Los Angeles, San Diego, San Francisco, and Santa Barbara, and borrowed words like "arroyo" and "canyon".

The most discussed substrate may be the Pre-Greek one:

Pre-Greek substrate. We find place names like Korinthos (Corinth) and Knossos, and words like kuparissos "cypress tree", terminthos "terebinth tree", erebínthos "chickpea". After a while, one notices -nthos -ssos suffixes, and one infers that the Pre-Greek language had them.

Popular Controversies in World History: 1. Prehistory and Early Civilizations book page 87 (PDF 106 of 348) has another list.

An interesting additional suffix is -nx for noisemakers: larunx "voice box", pharunx "throat", surinx "flute", salpinx "trumpet", phorminx "lyre".
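The "after a while, one notices" step can be sketched as a simple tally of word endings. This is only an illustration: the word list is the handful of forms quoted above (in the same rough transliteration, accents dropped), and the candidate suffixes are supplied by hand rather than discovered automatically.

```python
from collections import defaultdict

# Forms quoted in this post, lowercased, accents dropped.
WORDS = [
    "korinthos", "knossos", "kuparissos", "terminthos", "erebinthos",
    "larunx", "pharunx", "surinx", "salpinx", "phorminx",
]

# Candidate Pre-Greek endings mentioned in this post.
SUFFIXES = ["nthos", "ssos", "nx"]

def group_by_suffix(words, suffixes):
    """Map each suffix to the words that end in it (first match wins)."""
    groups = defaultdict(list)
    for word in words:
        for suffix in suffixes:
            if word.endswith(suffix):
                groups[suffix].append(word)
                break
    return dict(groups)

groups = group_by_suffix(WORDS, SUFFIXES)
for suffix, members in groups.items():
    print(f"-{suffix}: {', '.join(members)}")
```

A tally like this only shows that the endings recur. Arguing that they are Pre-Greek also needs the linguistic step of showing that they have no good Greek or Indo-European explanation.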

lpetrich · Dec 3, 2019

More Pre-Greek:
The Pre-Greek substrate and its origins - The Pre-Greek substrate and its origins.pdf
Brill Introductions to Indo-European Languages 2 - Robert S. P. Beekes, Stefan Norbruis-Pre-Greek_ Phonology, Morphology, Lexicon-Brill Academic Publishers (2014) | Vowel | Anatolia - paywalled
Pre-Greek: Phonology, Morphology, Lexicon - Robert Beekes - Google Books - Pre-Greek Lexicon snippet: several pages

Some Pre-Greek languages survived into Classical times, and there are some inscriptions in them: Eteocretan, Eteocypriot, Lemnian. These are difficult to interpret, though Lemnian is recognizably related to Etruscan, a pre-Latin language of Italy. An earlier Pre-Greek language was the Minoan Linear A language, and that is also difficult. I've found Minoan language blog where an amateur attempts to interpret it. Looks rather sober and careful by amateur standards.

lpetrich · Dec 3, 2019

Germanic substrate hypothesis is another one.
Talking Neolithic: Linguistic and Archaeological Perspectives on How Indo-European Was Implemented in Southern Scandinavia - AJA121_04_Iversen.pdf
Germanic words of non-Indo-European origin - Linguistics - Eupedia
Non-Indo-European Root Nouns in Germanic: Evidence in Support of the Agricultural Substrate Hypothesis
The Shared Lexicon of Baltic, Slavic, and Germanic - Thesis_vdHeijden_upload.pdf

The "Talking Neolithic" one discusses words in several IE languages that don't fit IE sound correspondences very well, indicating that they were likely borrowed from pre-IE languages. Like words with a sometimes-present a- or e- prefix.

rousseau · Feb 1, 2020

Khoisan word list

fromderinside · Feb 23, 2020

How does the uniqueness of clicks in African languages relate to pre-IE languages? One would think a connection between African languages and pre-Indo-European languages would need to be established first.

rousseau · Feb 23, 2020

fromderinside said:
How does the uniqueness of clicks in African languages relate to pre-IE languages? One would think a connection between African languages and pre-Indo-European languages would need to be established first.

I'm still not clear on what we're trying to accomplish with this thread, maybe lpetrich can clarify. I took 'clue to prehistory' to mean 'what language tells us about our history prior to the written word'.

There should be a link between African languages and modern ones, but I would assume that making a direct link with evidence isn't possible. AFAIK, the only evidence we have of language comes from surviving written word dating back to the past ~ 2 - 5 thousand years, as well as surviving prehistoric languages. This Khoisan language dates back possibly over 100k years, and likely diverged with migrations out of Africa, where evidence is still spotty and incomplete. That leaves quite a wide chasm between the two forms.

But if we're looking for 'clues to prehistory' the syntax of Khoisan does a pretty good job of it. One can imagine what would be relevant to a pre-historic hunter gatherer, and further any other person in any society henceforth.

lpetrich · Feb 23, 2020

fromderinside said:
How does the uniqueness of clicks in African languages relate to pre-IE languages? One would think a connection between African languages and pre-Indo-European languages would need to be established first.

That's right. A list like that 100-word list should be a good place to start - Khoisan word list

Appendix:Swadesh lists - Wiktionary - has lists for numerous languages and reconstructed protolanguages
Swadesh list
Leipzig–Jakarta list
Automated Similarity Judgment Program - has a list
Dolgopolsky list - only 15 words. Should be easy to start with.

Here it is:

I/me
two/pair
you (singular, informal)
who/what
tongue
name
eye
heart
tooth
no/not
nail (finger-nail)
louse/nit
tear/teardrop
water
dead

I'm still not clear on what we're trying to accomplish with this thread, maybe lpetrich can clarify. I took 'clue to prehistory' to mean 'what language tells us about our history prior to the written word'.

Yes, that is what I meant.

fromderinside · Feb 25, 2020

I learned something. Thanks. Maybe I got the thread going again?

Good job rousseau. Thanks lpetrich.

Copernicus · Feb 25, 2020

rousseau said:
fromderinside said:

How does the uniqueness of clicks in African languages relate to pre-IE languages? One would think a connection between African languages and pre-Indo-European languages would need to be established first.

I'm still not clear on what we're trying to accomplish with this thread, maybe lpetrich can clarify. I took 'clue to prehistory' to mean 'what language tells us about our history prior to the written word'.

That is something we can explore by comparing cognate word sets across daughter languages, but the traditional way of establishing relatedness between languages is by establishing regular sound correspondences across potential daughter languages. Since sounds in language change in similar ways across entire vocabularies (not just in individual words), there will be recognizable patterns that support the existence of a protolanguage. However, changes can be so radical that correspondences are increasingly difficult to recognize, the further back you go. There are lots of theories about how we can improve on the traditional sound correspondence method established in the 18th and 19th centuries, but evidence is really hard to come by. Vocabulary lists that purport to show similarities are not of much use, since any two languages in the world are likely to have a number of words that look similar, no matter how unrelated.
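To make "regular sound correspondences" concrete, here is a minimal sketch that tallies word-initial consonant pairings across a few Latin/English cognates. The word pairs are standard textbook examples of Grimm's law that I am supplying for illustration, not data from this thread, and the `onset` helper is a deliberately crude, hypothetical stand-in for real phonological analysis.

```python
from collections import Counter

# Latin / English cognate pairs (textbook examples, illustrative only)
PAIRS = [
    ("pater", "father"), ("pes", "foot"), ("piscis", "fish"),
    ("tres", "three"), ("tu", "thou"), ("tenuis", "thin"),
    ("cornu", "horn"), ("centum", "hundred"), ("cor", "heart"),
]

def onset(word, digraphs=("th",)):
    """Return the initial consonant, treating listed digraphs as one sound."""
    for digraph in digraphs:
        if word.startswith(digraph):
            return digraph
    return word[0]

# Count how often each Latin onset lines up with each English onset.
correspondences = Counter((onset(lat), onset(eng)) for lat, eng in PAIRS)
for (lat, eng), count in correspondences.most_common():
    print(f"Latin {lat}- : English {eng}-  ({count} pairs)")
```

The point is that the same pairing (p : f, t : th, c : h) recurs across many unrelated meanings, which is what a chance lookalike between two random words does not give you.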

In the twentieth century, linguists discovered that language similarities are not just about genetic relations. Languages tend to fall into typological buckets that have nothing to do with a common ancestor. Rather, typological similarities relate to structural similarities. For example, languages in which the verb is normally close to the beginning of the sentence tend to have prepositions and prefixes, whereas those in which the verb is normally at the end of the sentence tend to have postpositions and suffixes. (These are just tendencies, not absolute patterns.) For example, English and Hindi are both Indo-European, but Hindi, unlike English, tends to place verbs at the ends of clauses. Not surprisingly, Hindi has postpositions instead of prepositions. In that respect, it is like Japanese, which is a verb-last language that also has postpositions. There is no proven relationship between Japanese and Indo-European languages, but there seems to be a broad consensus now that Turkic languages and Japanese descend from a common ancestor. What linguists do nowadays is not just to look at patterns of sound correspondences, but also to use knowledge of how languages form patterns of change to discover relationships that go back a little deeper in time than we were able to in the 19th and early 20th centuries.

There should be a link between African languages and modern ones, but I would assume that making a direct link with evidence isn't possible. AFAIK, the only evidence we have of language comes from surviving written word dating back to the past ~ 2 - 5 thousand years, as well as surviving prehistoric languages. This Khoisan language dates back possibly over 100k years, and likely diverged with migrations out of Africa, where evidence is still spotty and incomplete. That leaves quite a wide chasm between the two forms.

But if we're looking for 'clues to prehistory' the syntax of Khoisan does a pretty good job of it. One can imagine what would be relevant to a pre-historic hunter gatherer, and further any other person in any society henceforth.

Remember that the picture is complicated by the existence of typological patterns that cut across ancestral relationships. There are limits to how languages can differ from each other, and the factors that might cause typological similarities are not well-established. See linguistic typology.

rousseau · Feb 26, 2020

Copernicus said:
Remember that the picture is complicated by the existence of typological patterns that cut across ancestral relationships. There are limits to how languages can differ from each other, and the factors that might cause typological similarities are not well-established. See linguistic typology.

I guess my point is less about relationships between and structures of languages, and more about the objects that have been symbolized in each language. In Khoisan, what I find interesting isn't its evolution across time, but rather what people chose to symbolize one hundred thousand years ago. In that sense the scope of the language points to the life experience and concepts contained by those speaking it, or a part of our prehistory.

Beyond that, when you say 'there are limits to how languages can differ', that's interesting to me too, because it points to the fact that, regardless of time period, our fundamental experience as humans is relatively static. In theory there should be a finite set of objects and concepts to symbolize, and an even smaller set that are in every day use.

lpetrich · Feb 26, 2020

rousseau said:
There should be a link between African languages and modern ones, ...

Those African languages are as present-day as the "modern" ones, so they are equally distant from the common ancestor that they likely had.

There is also no reason to believe that they are especially conservative. They may preserve some features lost in other languages, but they may have lost some features that other languages preserve. It's hard to tell without detailed comparative work.

I suggest taking that African-language word list and extracting the Dolgopolsky 15-word list for each language and posting it here.
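If someone wants to do the extraction mechanically, a minimal sketch might look like the following. It assumes each language's word list has already been typed in as a meaning-to-form dictionary. The `sample` data and the `extract_dolgopolsky` helper are hypothetical placeholders of mine, not real Khoisan forms or an existing tool.

```python
# The 15 meanings of the Dolgopolsky list, as given earlier in the thread.
DOLGOPOLSKY = [
    "I/me", "two/pair", "you (singular, informal)", "who/what",
    "tongue", "name", "eye", "heart", "tooth", "no/not",
    "nail (finger-nail)", "louse/nit", "tear/teardrop", "water", "dead",
]

def extract_dolgopolsky(wordlist):
    """Pick the Dolgopolsky meanings out of a {meaning: form} dict.

    Meanings absent from the word list map to None, so gaps stay visible.
    """
    return {meaning: wordlist.get(meaning) for meaning in DOLGOPOLSKY}

# Hypothetical placeholder data; substitute the real forms for a language.
sample = {"water": "FORM_1", "eye": "FORM_2", "fire": "FORM_3"}

subset = extract_dolgopolsky(sample)
missing = [meaning for meaning, form in subset.items() if form is None]
print(f"found {len(subset) - len(missing)} of {len(DOLGOPOLSKY)} meanings")
```

Keeping the gaps as `None` rather than dropping them makes it obvious which meanings a given source list does not cover.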

rousseau · Feb 26, 2020

lpetrich said:
rousseau said:

There should be a link between African languages and modern ones, ...

Those African languages are as present-day as the "modern" ones, so they are equally distant from the common ancestor that they likely had.

There is also no reason to believe that they are especially conservative. They may preserve some features lost in other languages, but they may have lost some features that other languages preserve. It's hard to tell without detailed comparative work.

I suggest taking that African-language word list and extracting the Dolgopolsky 15-word list for each language and posting it here.

Confining the conversation to symbolized objects, that would be true only once they came in contact with modern civilization, and to the degree that their lifestyle changed over that time-frame. So yes and no. If their lifestyle was basically unchanged for 100k years, the modern version would very closely resemble the original, for whenever that originated. At least in its syntax, maybe not in its structure.

That was along the lines of my original point way back when - language emerges to symbolize experience - if the experience is unchanged, so is the world the language represents. But you're right that they wouldn't be completely equivalent.

lpetrich · Feb 26, 2020

Here's a table of English words that demonstrate the range of English stop-consonant sounds, with the same format as for the Proto-Indo-European one earlier:

[TABLE="class: grid"]
[TR]
[TD]pill[/TD]
[TD]till[/TD]
[TD]kill[/TD]
[/TR]
[TR]
[TD]spill[/TD]
[TD]still[/TD]
[TD]skill[/TD]
[/TR]
[TR]
[TD]bill[/TD]
[TD]dill[/TD]
[TD]gill[/TD]
[/TR]
[/TABLE]

Rows: voicing
Columns: articulation point

English fricatives:
[TABLE="class: grid"]
[TR]
[TD]th[/TD]
[TD]s[/TD]
[TD]sh[/TD]
[TD]f[/TD]
[TD]h[/TD]
[/TR]
[TR]
[TD]dh[/TD]
[TD]z[/TD]
[TD]zh[/TD]
[TD]v[/TD]
[TD]-[/TD]
[/TR]
[/TABLE]
th = voiceless th (thin)
dh = voiced th (then)

English affricates:
[TABLE="class: grid"]
[TR]
[TD]ch[/TD]
[/TR]
[TR]
[TD]j[/TD]
[/TR]
[/TABLE]

English semivowels: y, w

English has some 10 vowel phonemes, at least for my dialect of American English. It also has 5 diphthongs (vowel-semivowel combinations), including two "long vowels": /ai/, /au/, /ei/, /oi/, /ou/
[TABLE="class: grid"]
[TR]
[TD][/TD]
[TD]baht[/TD]
[TD][/TD]
[/TR]
[TR]
[TD]bat[/TD]
[TD][/TD]
[TD]bot[/TD]
[/TR]
[TR]
[TD]bet[/TD]
[TD]but[/TD]
[TD]bote[/TD]
[/TR]
[TR]
[TD]bit[/TD]
[TD][/TD]
[TD]foot[/TD]
[/TR]
[TR]
[TD]beet[/TD]
[TD][/TD]
[TD]boot[/TD]
[/TR]
[/TABLE]

Jokodo · Feb 26, 2020

rousseau said:
lpetrich said:

rousseau said:

There should be a link between African languages and modern ones, ...

Those African languages are as present-day as the "modern" ones, so they are equally distant from the common ancestor that they likely had.

There is also no reason to believe that they are especially conservative. They may preserve some features lost in other languages, but they may have lost some features that other languages preserve. It's hard to tell without detailed comparative work.

I suggest taking that African-language word list and extracting the Dolgopolsky 15-word list for each language and posting it here.

Confining the conversation to symbolized objects, that would be true only once they came in contact with modern civilization, and to the degree that their lifestyle changed over that time-frame. So yes and no. If their lifestyle was basically unchanged for 100k years, the modern version would very closely resemble the original, for whenever that originated. At least in its syntax, maybe not in its structure.

And why would that be so? There isn't a good correlation between the syntactic typology of a group's language and its mode of subsistence. Here's the distribution of languages by whether they have morphological case: https://wals.info/feature/49A#4/-11.87/143.57

You will note that there isn't much of a global pattern, and in some regions, e.g. Northern Australia or Southeast Europe, you find everything from languages with no case marking to 10 or more cases right next to each other.

That was along the lines of my original point way back when - language emerges to symbolize experience - if the experience is unchanged, so is the world the language represents. But you're right that they wouldn't be completely equivalent.

Again, even if "experience is unchanged" (which it never is), that's no reason for language structure and syntax to evolve any slower.

rousseau · Feb 26, 2020

Jokodo said:
rousseau said:

Confining the conversation to symbolized objects, that would be true only once they came in contact with modern civilization, and to the degree that their lifestyle changed over that time-frame. So yes and no. If their lifestyle was basically unchanged for 100k years, the modern version would very closely resemble the original, for whenever that originated. At least in its syntax, maybe not in its structure.

And why would that be so? There isn't a good correlation between the syntactic typology of a group's language and its mode of subsistence. Here's the distribution of languages by whether they have morphological case: https://wals.info/feature/49A#4/-11.87/143.57

You will note that there isn't much of a global pattern, and in some regions, e.g. Northern Australia or Southeast Europe, you find everything from languages with no case marking to 10 or more cases right next to each other.

That was along the lines of my original point way back when - language emerges to symbolize experience - if the experience is unchanged, so is the world the language represents. But you're right that they wouldn't be completely equivalent.

Again, even if "experience is unchanged" (which it never is), that's no reason for language structure and syntax to evolve any slower.

I'm not referring to syntactic typology. If there is a term that I am referring to I don't know it because I've never formally studied linguistics. I'm talking about the scope of words that exist in the language, not the structure of how they're expressed.

I'm making the claim that the words which exist in a language come into common usage because they're relevant to the lifestyle of those speaking the language. Therefore, if hunter-gatherers have been living essentially the same static lifestyle for [x] period of time, then the evolution of the language is very slow, and it's likely that the modern one resembles the original one more closely than if more social evolution had occurred. I don't particularly care about structure, I'm discussing what's been symbolized.

Yes, social changes have occurred in that time, but I think it'd still be true that the language San Bushmen were speaking in the 19th century would be a good indicator of what they were speaking many thousands of years ago (in prehistory), if not an exact replica.
