
Language as a Clue to Prehistory

But then why did Proto-Semitic speakers succeed in conquering the Levant? The Proto-Indo-European speakers were helped by their horses and wheeled vehicles and the like, but the Proto-Semitic speakers?

I have a hypothesis. Another equid, the donkey, was what helped them. ...
Very interesting; I've not heard this before. Is there archeological support, e.g. the dating of donkey remains in Mesopotamia?

The link you give connects to a paywall. Do you have a subscription, or some other workaround?
I didn't try to get the full article. But scholar.google.com often returns un-paywalled versions, either preprints or reprints. The original preprint site is ArXiv, and it has provoked numerous imitations, like BioRxiv, MedRxiv, PsyArXiv, SocArXiv, ... Many people host reprints at their sites. If these possibilities fail, there is sci-hub.se

The Importance of the Donkey as a Pack Animal in the Early Bronze Age Southern Levant: A View from Tell es-Sâfî / Gath
In this paper, we review the evidence for the use of the domestic donkey as a mode of transportation in the Early Bronze Age. The study will present the domestic donkey remains (artefactual and zoological) and their archaeological context from the Early Bronze Age III domestic neighborhood at Tell es-Sâfî / Gath. The remains indicate the significant role that donkeys played in the daily life of the inhabitants. This reflects on our understanding of their role in the trade networks and mode of transportation that existed within the emerging urban cultures in the southern Levant during the 3rd mill. B.C.E.

Domestication of the Donkey (Equus asinus) in the Southern Levant: Archaeozoology, Iconography and Economy | SpringerLink
(paywalled)
More specifically, this chapter reviews data concerning the role of these beasts of burden and the possible existence of a dedicated social stratum or group of persons specializing in their use in the Early Bronze Age (ca. 3700/3600–2400 BC).
The starting time is about when the speakers of East Semitic left the Semitic homeland.

Size Matters – Donkeys and Horses in the Prehistory of the Southernmost Levant - Persée - over the 5th, 4th, 3rd millennia BCE
 
Donkeys in Africa.pdf by Roger Blench and Kevin MacDonald

Conjectured range of African wild donkeys around 5000 BCE: Near the coasts of the Mediterranean and Red Seas, and in the Horn of Africa.

Before recent centuries, donkeys were mainly used as pack animals and for riding.

RB then discusses the range of words for that animal. Were those originally for wild donkeys?
 
 Proto-Berber language
... Proto-Berber might be as recent as 3,000 years ago. Louali & Philippson (2003) propose, on the basis of the lexical reconstruction of livestock-herding, a Proto-Berber 1 (PB1) stage around 7,000 years ago and a Proto-Berber 2 (PB2) stage as the direct ancestor of contemporary Berber languages.[3]

...
Roger Blench (2018)[9] suggests that Proto-Berber speakers had spread from the Nile River valley to North Africa 4,000–5,000 years ago due to the spread of pastoralism, and experienced intense language leveling about 2,000 years ago as the Roman Empire was expanding in North Africa. Hence, although Berber had split off from Afroasiatic several thousand years ago, Proto-Berber itself can only be reconstructed to a period as late as 200 A.D. Blench (2018) notes that Berber is considerably different from other Afroasiatic branches, but modern-day Berber languages displays low internal diversity. The presence of Punic borrowings in Proto-Berber points to the diversification of modern Berber language varieties subsequent to the fall of Carthage in 146 B.C.; only Guanche and Zenaga lack Punic loanwords.[9] Additionally, Latin loanwords in Proto-Berber point to the breakup of Proto-Berber between 0–200 A.D. During this time period, Roman innovations including the ox-plough, camel, and orchard management were adopted by Berber communities along the limes, or borders of the Roman Empire. In Blench's view, this resulted in a new trading culture involving the use of a lingua franca which became Proto-Berber.[9]
Seems like there was some small population of Berber speakers in the Roman Empire, a population that spread outward after that empire's fall.
 
 Afroasiatic languages
Estimates of the date at which the Proto-Afroasiatic language was spoken vary widely. They fall within a range between approximately 7500 BC (9,500 years ago), and approximately 16,000 BC (18,000 years ago). According to Igor M. Diakonoff (1988: 33n), Proto-Afroasiatic was spoken c. 10,000 BC. Christopher Ehret (2002: 35–36) asserts that Proto-Afroasiatic was spoken c. 11,000 BC at the latest, and possibly as early as c. 16,000 BC. These dates are older than those associated with other proto-languages.
These dates strike me as rather subjective handwaving, but they are well into the Pleistocene, almost as far back as the end of the  Last Glacial Maximum about 20,000 years ago.

 Eurasiatic languages
mentions
The last common ancestor of the family[vague] was estimated by phylogenetic analysis of ultraconserved words at roughly 15,000 years old, suggesting that these languages spread from a "refuge" area at the Last Glacial Maximum.[12]
from Ultraconserved words point to deep language ancestry across Eurasia | PNAS -- seems less subjective, trying to quantify how long word forms last before they get replaced.
  • Eurasiatic, Kartvelian, Dravidian: 15,000 BP
  • Eurasiatic members: 12,500 - 10,000 BP
That is well before agriculture in northern Eurasia.

The Eurasiatic branching that they find is ( (IE, Uralic), (Altaic, (Chukchi-Kamchatkan, Inuit-Yupik) ) )
 
Indo-European phylogenetics with R in: Indo-European Linguistics Volume 8 Issue 1 (2020) - "Phylogenetic inference is carried out on the Indo-European dataset compiled by Don Ringe and Ann Taylor, which includes phonological, morphological, and lexical characters."

Some earlier work (2005): A Comparison of Phylogenetic Reconstruction Methods on an Indo-European Dataset

Of the family-tree algorithms, UPGMA does the worst, giving results different from several decades of linguists' assessment. That is because UPGMA is "ultrametric", assuming that every language has undergone the same amount of change since the root. That is absurd. Consider Icelandic vs. English. Icelandic is not much different from Old Norse, while present-day English is much more different from Old English.
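
To make the ultrametric point concrete, here is a minimal UPGMA sketch in Python. The three labels and their distances are invented for illustration and are not taken from the Ringe-Taylor dataset: "slow" and "fast" are meant to be true sisters, but "fast" has changed so much that raw distances put "slow" closer to the outgroup.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster by average distance; return (tree, root_height).

    dist maps frozenset({a, b}) to the distance between leaves a and b.
    The tree is returned as nested tuples of leaf names.
    """
    size = {name: 1 for name in names}        # leaves under each cluster
    height = {name: 0.0 for name in names}    # node height above the leaves
    d = {frozenset(p): float(dist[frozenset(p)])
         for p in combinations(names, 2)}
    while len(size) > 1:
        pair = min(d, key=d.get)              # closest two clusters
        a, b = sorted(pair, key=repr)         # fixed order for a stable result
        merged = (a, b)
        new_d = {}
        for c in size:
            if c in pair:
                continue
            # Size-weighted average of the two old distances to c.
            new_d[frozenset((merged, c))] = (
                size[a] * d[frozenset((a, c))]
                + size[b] * d[frozenset((b, c))]
            ) / (size[a] + size[b])
        # Ultrametric step: the new node sits exactly halfway between
        # a and b, so all leaves below it are equally far from it.
        height[merged] = d[pair] / 2.0
        d = {k: v for k, v in d.items() if not (k & pair)}
        d.update(new_d)
        size[merged] = size.pop(a) + size.pop(b)
        del height[a], height[b]
    (tree,) = size
    return tree, height[tree]


# Hypothetical distances: the intended history is ((slow, fast), outgroup),
# with a short branch to "slow" and a long one to "fast".
TOY = {
    frozenset(("slow", "fast")): 6,
    frozenset(("slow", "outgroup")): 4,
    frozenset(("fast", "outgroup")): 8,
}

tree, root_height = upgma(["slow", "fast", "outgroup"], TOY)
print(tree, root_height)
```

With these made-up numbers, UPGMA pairs "slow" with the outgroup and leaves "fast" on its own, which is the kind of mistake one gets from grouping by overall similarity when lineages change at different rates.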

All the other methods found well-established subgroupings, complete with their internal branching.
  • Anatolian (Hittite, Luwian, Lycian) was usually (Hittite, (Luwian, Lycian)) but not always.
  • Tocharian
  • Greek-Armenian always appeared.
  • Indo-Iranian: (Indic: Vedic Sanskrit, Iranian: (Old Persian, Avestan))
  • Balto-Slavic: (Old Church Slavic, Baltic: (Old Prussian, (Lithuanian, Latvian)))
  • (Insular) Celtic: (Welsh, Old Irish)
  • Italic: (Latin, (Umbrian, Oscan))
  • Germanic: (Gothic, (Old Norse, (Old English, Old High German)))
Albanian jumped around quite a lot, often being an early brancher, but sometimes close to Germanic or Indo-Iranian or Tocharian. Aside from Albanian, the earliest branches were

(Anatolian, (Tocharian, other IE))

Leaving out Albanian as problematic and Anatolian and Tocharian as agreed-on, the others branch
  • GA, ( (II, BS), ( (It, Ce), Ge) )
  • (Ce, It), ( Ge, ( (BS, II), GA) )
  • GA, (II, (BS, ( (It, Ce), Ge) ) ) - (3)
  • GA, (II, (BS, (It, (Ce, Ge) ) ) ) - (7)
  • (Ce, It), (Ge, (BS, II) )
  • GA, II, (BS, (It, Ce), Ge)
  • GA, (II, (BS, (It, (Ce, Ge) ) ) ) - (3)
  • GA, (II, (BS, (Ce, (It, Ge) ) ) )
  • GA, II, (BS, (Ce, (It, Ge) ) )
  • II, (GA, (BS, (Ce, (It, Ge) ) ) )
So there is something of a consensus: GA, (II, (BS, (Ge, (It, Ce) ) ) )
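
One rough way to check that impression is to count how often each subgrouping shows up across the trees listed above. The sketch below is my own hand transcription of that list into nested Python tuples; it ignores the parenthesized counts, since the post does not say what they mean, and it keeps the fifth tree without GA, as listed.

```python
from collections import Counter

# The ten trees from the list above, transcribed by hand.
TREES = [
    ("GA", (("II", "BS"), (("It", "Ce"), "Ge"))),
    (("Ce", "It"), ("Ge", (("BS", "II"), "GA"))),
    ("GA", ("II", ("BS", (("It", "Ce"), "Ge")))),
    ("GA", ("II", ("BS", ("It", ("Ce", "Ge"))))),
    (("Ce", "It"), ("Ge", ("BS", "II"))),
    ("GA", "II", ("BS", ("It", "Ce"), "Ge")),
    ("GA", ("II", ("BS", ("It", ("Ce", "Ge"))))),
    ("GA", ("II", ("BS", ("Ce", ("It", "Ge"))))),
    ("GA", "II", ("BS", ("Ce", ("It", "Ge")))),
    ("II", ("GA", ("BS", ("Ce", ("It", "Ge"))))),
]


def collect(node, found):
    """Return the leaf set under node, appending every internal
    node's leaf set to found along the way."""
    if isinstance(node, str):
        return frozenset([node])
    leaves = frozenset().union(*(collect(child, found) for child in node))
    found.append(leaves)
    return leaves


counts = Counter()
for t in TREES:
    found = []
    everything = collect(t, found)
    for clade in set(found):
        if clade != everything:      # the root grouping is uninformative
            counts[clade] += 1

for clade, n in counts.most_common():
    print(sorted(clade), n)
```

This treats each listed tree as rooted exactly as written, so the two trees that start with (Ce, It) contribute different groupings from the rest.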
 
Rather than simple counting of character matches, linguistic knowledge should be used to build phylogenetic trees. For example, suppose that the Q>P sound transition is rare, but much more common than P>Q. This would suggest that proto-Irish and proto-Welsh come from different halves of the main Celtic bifurcation and that their classification together as Insular Celtic is due to similarities resulting from areal contact. This is one reason why I find Ringe's work much more convincing than that of Gray-Atkinson.

(But is the Q>P sound transition rare, yet much more common than P>Q? It would be nice to see a table showing sound transitions with their likelihoods.)

Recall  Ring species in which A may interbreed with B, B with C, C with D, D with E, and E with A, but not A with C or D. Similarly languages might form a  Dialect continuum during their early separation while still mutually intelligible, or with bilinguality common. Greek and Armenian, for example, are on opposite sides of the Centum-Satem divide and may seem closely related due to areal contact.

Archaeological facts can guide us to the correct I-E tree. Set aside Anatolian and equate I-E Proper with Yamnaya. Afanasievo's separation (Tocharian) ca 3500 BC calibrates the change rate. Italo-Celtic is about as divergent from the Satem core as Tocharian is, aligning with Globular Amphora, also ca 3500 BC. Usatovo is another probably-IE culture ca 3500, possibly ancestral to Greek, Illyrian and/or Phrygian.

All these initial separations were circa 3500 BC, explaining why it is difficult to create a single tree with only binary fanouts. The remainder of Yamnaya underwent the transition to Satem, separated into a huge range from the Baltic Sea to Afghanistan, leading to Balto-Slavic in the Northwest and Indo-Iranian in the Southeast. (And now-extinct languages like Thracian emerged in the Southwest of Yamnaya.)

Albanian is attested only very recently, so trying to place it exactly is wasted effort. Germanic resulted from creoles involving Corded Ware, Funnel Beaker and Pitted Ware so is also hard to fit onto a clading diagram.
 
If Protogermanic was a creole in any meaningful sense, it's hard to see why it would retain such a lot of grammatical peculiarities of PIE, like non-null nominative endings (the -us and -os and -as of Latin/Classical Greek/Sanskrit, preserved today only, to the best of my knowledge, in Baltic and - as "-r" - in Icelandic), or one of the best preserved extant ablaut paradigms.
 
I was using "creole" in a very loose sense to include imperfect learning and/or koiné like hybridization.

Some I-E branches underwent rather little change (I've read that Lithuanian and Sanskrit are sometimes very similar despite the vast geographic separation), and some were at least partially attested long ago (Celtic was spoken across the continent when proto-Germanic first emerged). But for Germanic there is a big gap between the interaction of THREE distinct cultures circa 2900 BC and the Jastorf Iron Age circa 500 BC. I can't guess when, where, and how many language shifts occurred during the millennia between proto-IE and Germanic.
 
 Afroasiatic homeland - the two main hypotheses are the Levant and North Africa.

The main scenario for a Levantine origin is spread by early farmers. That would make the AA ancestors the Natufian people of the Levant 15,000 to 11,500 BP, the beginning of the Holocene. They were sedentary, and they were gathering grains before they started growing them, being among the first to do so.

That is how Bantu and Austronesian speakers spread their languages, and that was proposed for Indo-European by archeologist Colin Renfrew. But that is now considered very implausible, meaning that the first farmers of Europe spread some other language family, like Euskaro-Caucasian.

But AA language distribution supports a Northeastern African origin much better, since Semitic is an offshoot that is not especially deep in AA, and since the deepest branch is Omotic, named after the Omo Valley of SW Ethiopia.

There is also some genetic evidence that is consistent with that scenario.

But then why did Proto-Semitic speakers succeed in conquering the Levant? The Proto-Indo-European speakers were helped by their horses and wheeled vehicles and the like, but the Proto-Semitic speakers?

I have a hypothesis. Another equid, the donkey, was what helped them.

Wild donkeys live in NE Africa, near the Red Sea and the Gulf of Aden. That makes them near the AA homeland in the NE-Africa theory. The genomic history and global expansion of domestic donkeys | Science - one domestication of them, in East Africa about 7,000 years ago. That's a little before the time of Proto-Semitic, just right for some NE Africans to invade the Middle East with them.

Not surprisingly, Proto-Semitic had words for the animal, *hhimâr- and for a female one *'atân-
Another quite important
 Proto-Berber language
... Proto-Berber might be as recent as 3,000 years ago. Louali & Philippson (2003) propose, on the basis of the lexical reconstruction of livestock-herding, a Proto-Berber 1 (PB1) stage around 7,000 years ago and a Proto-Berber 2 (PB2) stage as the direct ancestor of contemporary Berber languages.[3]

...
Roger Blench (2018)[9] suggests that Proto-Berber speakers had spread from the Nile River valley to North Africa 4,000–5,000 years ago due to the spread of pastoralism, and experienced intense language leveling about 2,000 years ago as the Roman Empire was expanding in North Africa. Hence, although Berber had split off from Afroasiatic several thousand years ago, Proto-Berber itself can only be reconstructed to a period as late as 200 A.D. Blench (2018) notes that Berber is considerably different from other Afroasiatic branches, but modern-day Berber languages displays low internal diversity. The presence of Punic borrowings in Proto-Berber points to the diversification of modern Berber language varieties subsequent to the fall of Carthage in 146 B.C.; only Guanche and Zenaga lack Punic loanwords.[9] Additionally, Latin loanwords in Proto-Berber point to the breakup of Proto-Berber between 0–200 A.D. During this time period, Roman innovations including the ox-plough, camel, and orchard management were adopted by Berber communities along the limes, or borders of the Roman Empire. In Blench's view, this resulted in a new trading culture involving the use of a lingua franca which became Proto-Berber.[9]
Seems like there was some small population of Berber speakers in the Roman Empire, a population that spread outward after that empire's fall.
That's not what I'm reading here. To me that sounds like something similar to Berber spread out all the way to the Canary Islands and Mauritania (Guanche and Zenaga) before contact with Punic and Latin, but that the languages in the core were heavily influenced by a koine Berber with plenty of Punic and Latin loans during late antiquity and the Middle Ages.
 
Place names are often remarkably stable, some of them known to last for millennia, and many of them being older than written records of them. So place names can give clues about the languages of their namers, even if those namers spoke a different language from their documented descendants or successors.

I will use a simple example: United States place names, comparing the northeast to the southwest.

In New England there are a lot of place names ending in -et: Massachusetts, Connecticut, Nantucket MA, Narragansett RI, Mattapoisett MA, Pawtucket RI, Woonsocket RI, Cohasset MA, Amagansett NY, Namskaket MA, Nauset MA.

But in the southwest there are a lot of names with a completely different pattern. Many are two-word names whose first word is El, La, Los, Las, San, or Santa. Furthermore, these first words are associated with particular second-word endings.
  • El -o, -e, -on (El Toro, El Rio, El Monte, El Cajon)
  • La -a (La Habra, La Cienega)
  • Los -os, -es (Los Alamitos, Los Altos, Los Angeles, Los Padres)
  • Las -as, -es (Las Positas, Las Vegas, Las Cruces)
  • San -o, -os, -e (San Diego, San Bernardino, San Carlos, San Marcos, San Clemente, San Jose)
  • Santa -a, -z (Santa Monica, Santa Clara, Santa Cruz, Santa Ynez)
So one concludes that the earlier inhabitants of the northeast and southwest US spoke very different languages.
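
The article/ending association can be tallied mechanically without knowing any Spanish. Here is a small sketch using only the Southwest names listed above; the `ending` helper and its short suffix list are my own simplification, not a real morphological analysis.

```python
from collections import defaultdict

# The two-word Southwest names from the list above.
SOUTHWEST = [
    "El Toro", "El Rio", "El Monte", "El Cajon",
    "La Habra", "La Cienega",
    "Los Alamitos", "Los Altos", "Los Angeles", "Los Padres",
    "Las Positas", "Las Vegas", "Las Cruces",
    "San Diego", "San Bernardino", "San Carlos", "San Marcos",
    "San Clemente", "San Jose",
    "Santa Monica", "Santa Clara", "Santa Cruz", "Santa Ynez",
]


def ending(word):
    """Crude surface ending: a known two-letter suffix if present,
    otherwise just the final letter."""
    for suffix in ("os", "as", "es", "on"):
        if word.endswith(suffix):
            return "-" + suffix
    return "-" + word[-1]


# Map each first word to the set of endings seen on its second word.
pairs = defaultdict(set)
for name in SOUTHWEST:
    first, last = name.split()
    pairs[first].add(ending(last.lower()))

for first, endings in pairs.items():
    print(first, sorted(endings))
```

Someone who knew nothing but the place names could still notice from such a tally that La and Santa go with -a while El and San do not, which hints at some kind of agreement system in the naming language.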
 
Those Southwestern names tell you more about the earlier inhabitants of the Iberian peninsula than about the SW United States.

The English invaders in the NE had a habit of asking locals what places were called, or naming places after the tribal names of the people they had killed to get them; the Spanish invaders didn't give a crap what the locals called places, and re-named places after elements of Spanish Catholic mythology.
 
One can find plenty of names in the Southwest that better represent its linguistically diverse history, though.
 
The Spanish invaders didn't give a crap what the locals called places, and re-named places after elements of Spanish Catholic mythology.
Even worse, the Spanish Catholic Church enslaved the American natives and, at least in Central America, destroyed ancient writings.

Other Catholic atrocities include the Cathar massacres, Crusades, Inquisition, various hypocrisies and even well-known 20th-century abuses in USA. The Catholic Church is "grandfathered in" in deference to the Virgin Mary and her Holy Son. But if she were required to apply from scratch with her 1000-year biography of crime, she'd not only be denied tax-exemption, but would be under indictments in the Hague.
 
Those Southwestern names tell you more about the earlier inhabitants of the Iberian peninsula than about the SW United States.
That's not getting the point. Yes, the "el language" as it might be called, turns out to be Spanish, and the "et language" turns out to be Algonquian languages like the  Massachusett language. But if one didn't know anything more than those place names, what would one conclude?
 
Well, this one would conclude that he didn't have enough information for speculation to be useful.

But most humans are very happy to wholeheartedly embrace complex and detailed stories whose basis is weak evidence at best.

That language is such a demonstrably poor clue to history should serve as a warning that much of the received wisdom regarding historical (and particularly pre-historical) events is likely false, and mere speculation.

Instead it inspires vast reams of 'based on a true story' style semi-fiction, which itself becomes accepted as The Truth by the small cliques of people who take a deep interest in it, and is then used as the foundation for even more wild speculation.

It may only be useful at all as an excellent object lesson in the human inability to accept "we don't, and probably never will, know" as an answer.

Even when that answer is far superior - i.e. far less distant from reality - to the most popular speculative tales du jour.

I believe it was Steven Baxter who suggested that the immodest species name Homo sapiens would be better replaced by Pan narrans.
 
Well, this one would conclude that he didn't have enough information for speculation to be useful.
That seems very defeatist. One can still learn something.

In this example, it's easy to find out about Spanish, where one quickly finds out about articles and noun-adjective agreement.  Massachusett grammar is much less well-known, but documented enough to enable discovering that -et is a locative suffix. So those place names are "at/in/on (something)".

One does have to be careful, because place-name suffixes can sometimes be borrowed. That occurred a lot with Latin/Romance -ia, short for -ia terra. Another often-borrowed one is -stan from medieval Persian, used a lot by Central Asian Turkic speakers and other neighbors of Iran. Germanic -land hasn't been borrowed much, however.
That language is such a demonstrably poor clue to history should serve as a warning that much of the received wisdom regarding historical (and particularly pre-historical) events is likely false, and mere speculation.
What do you mean by that?
 
I've been following the Russia-Ukraine war, and I'm struck by how flat Ukraine is, and how contrary to something that JP Mallory wrote about in "In Search of the Indo-Europeans" (p. 145):
A peculiar tendency among a number of nineteenth-century linguists was the strange desire to insulate the Indo-European homeland by great natural barriers, the Hindu Kush or the Himalayas proving particularly popular. Here primitive 'Aryans' were believed to have 'perfected' their language before bursting out over the rest of Eurasia.
Aryans?
We must also take a brief glance at that most loaded of Indo-European words - Aryan. As an ethnic designation, the word is most properly limited to the Indo-Iranians, and most justly to the latter where it still gives its name to the country Iran (from the Avestan genitive plural airyanam through later Iranian eran to iran). The great Persian king Darius described himself as Aryan. The term was also used widely in India where it referred to one who was a member of the community (though details of who was included in the community have been the topic of wide and unsettled debate). Whether this ethnic designation was limited to the Indo-Iranians or not is difficult to say.
The word ârya has possible cognates outside of this realm, in Hittite and Irish, but nothing very strong.

So the Indo-European returnees are:
  • Scythians and Sarmatians
  • Celts (Galicia's name)
  • Greeks (Black Sea colonies)
  • Romans (S Crimea and near W border)
  • Goths (Crimea)
  • Eastern Slavs (formed the Kievan Rus)
  • Poles (Polish-Lithuanian Commonwealth)
  • Russians (Muscovites, to be more specific)
  • Nazis (wanted to build a Greater Germany)

A common Semitic-homeland hypothesis is the Levant: Israel/Palestine - Lebanon - W Syria. That makes the Semitic returnees:
  • Assyrians, Babylonians (Akkadian speakers)
  • Arabs

From the Northeast-African hypothesis of the Afro-Asiatic homeland, the returnees are the Ethiosemitic speakers.
 
If you want to get to know US Southwest language families through toponyms, there are many locations and towns whose names hail from each of the major language families. Lpetrich has astounded us with the revelation that the Latin languages have left their mark via the efforts of Castilian Spanish settlers, but there are a great many others, also. The other major language families with some representative place names follow:

-------

Germanic:
The vast majority of place names now in use by the US government hail from the English language and follow that language's naming customs, both morphologically (suffixes like ----field and ----ton and ------ City), and conventionally (the practice of naming towns after their founders, significant historical figures, European "home towns", or Biblical characters and places). Even purely Native names often have heavily Anglicized pronunciations and spellings.

Basque (language isolate):
Durango
Arizona (disputed, this one may actually be a borrow from O'odham)

Athabaskan:
Teec Nos Pos
Kayenta
Kaibito
Shonto
Tsegi
Naschitti
Ya-Ta-Hey
Dennehotso
Lukachukai
Tonalea
Cibecue
Apache Junction
Temecula

Puebloan:
Jemez Springs
Taos
Chama
Tesuque
Nambe
Pojoaque
Chimayo

Zuni (language isolate):
Zuni Pueblo
Cibola
Cebollah

Uto-Aztecan:
Utah
New Mexico
Montezuma Well
Aztec
Saguache
Tonopah
Wikieup
Tuba City
Sahuarita
Sasabe
Pima County
Walpi
Tucson
Wahweap
Pahrump
Tucumcari
Panguitch
Moapa
Waucoba
Truckee
Ouray

Yuman:
Yuma
Gila Bend
Topock
Yavapai
Mojave Desert

----------------------------

One interesting phenomenon is places whose semantics are borrowed into English but not their phonology. Red Lake in Northern NM is an anglicization of a completely different sounding Dine Bizaad name, for instance, but means the exact same thing as its Dine Bizaad counterpart. Many Farms AZ is likewise a direct English translation of a Dine Bizaad name. I suspect this happens because foreigners have a difficult time pronouncing the language, so when settlers asked what something was called by the locals, they often got a polite English translation in response.

Many places in the Southwest are called by different names depending on who you talk to. The national park the government calls "Zion" today, a Hebrew-via-English borrowing, is still "Mukuntuweap" to many locals (Paiute: "Straight-Arrow-Canyon") and indeed was even recognized as such when it was a National Monument - the re-renaming to reflect Mormon history was a political concession and only partially caught on. A certain very noticeable landform in northern Arizona is either "Navajo Mountain" or "Paiute Mountain" depending on who you ask, with very obvious political implications; the US firmly sided with the Navajo on that one. In general, it's rare for Dine Bizaad speakers to use English names when speaking their own language, though they will casually refer to Shiprock, Farmington etc when speaking in English. Likewise, native Spanish speakers will default to the Spanish names of many towns and cities when speaking in their own language, regardless of the officially recognized epithet.

There are some interesting oddities emerging from colonialism. "Yucca" is a plant name that has become a place name throughout the Southwest, but is of neither local nor Spanish origin - the Spanish borrowed the term from the Taino language of the Caribbean and brought it with them. "Hurricane", after which Hurricane UT is named, is also derived from Taino. Likewise, Tule and Coyote are Nahuatl (Aztec) words borrowed as names for a common plant and animal respectively, but spread by the Spanish to places where Nahuatl speaking people never lived. So there are many place names resulting from a borrow of a borrow of a borrow. English settlers sometimes did this too: Tomahawk and Toboggan are both place names carried into the region by English speakers, but originally derived from Algonquian languages spoken on the other side of the continent. One of the strangest cases of name-borrowing that I know of is the town of Elko Nevada, whose name is a faux-Spanish word invented by an English speaker by throwing an -o on the end of "Elk", the German name for larger deer and moose species, which came to be attached to an American animal commonly found around the settlement. "Wapiti", the Siouan name for the same animal, has wandered into usage in several Southwestern place names as well. The actual Spanish cognate term to "elk" is "alce", but I am not aware of any place names in the Southwest that use it!

Truckee and Ouray are named for Native American people, in both cases band chiefs friendly to the American conquest, but neither is the name either of those individuals would have used; naming things after historical figures is a European convention.
 
Those Southwestern names tell you more about the earlier inhabitants of the Iberian peninsula than about the SW United States.

They do tell us the SW US was under the influence (possibly indirect) of a group related to the contemporary inhabitants of the Iberian peninsula by the time English started to play a role in the area, and to develop a need for local toponyms - or had been at some point prior.

That's too vague to base a narrative on it, but it can be enough to give additional weight to one of several possible scenarios suggested by archaeology, or written sources, or population genetics.
 