
Language as a Clue to Prehistory

Looking at the Massachusett language itself, it has some interesting grammatical features.  Massachusett grammar

Possession is indicated with affixes, and some kinds of nouns have an obligatory possessor, like kinship terms, body parts, and "body" itself. There is a prefix for an unspecified possessor, mu- ("someone's").

Possession prefixes and independent pronouns are built from first-person n- and second-person k-, and the language distinguishes between inclusive and exclusive "we" ("you and I" and "we without you").

The language has two grammatical genders, animate and inanimate, though some inanimate objects have animate gender. The animate-plural suffix is -ak and the inanimate-plural one -ash. Some words come in both genders, like mehtog (muhtuq) "tree". Its animate plural mehtogquog (muhtuqak) means "living trees", and its inanimate plural mehtogquosh (muhtuqash) means "dead trees" or more properly "wood".

Along with the locative suffix -et, in all those New England place names, there is an absentative suffix -ay (singular) -uk (plural), an obviative suffix -ah, and a diminutive suffix -îs.

Absentative - for an animate entity, indicates that it has died, and for an inanimate entity, that it is lost or destroyed or irreparably broken.

Obviative - marks the obviative (peripheral or secondary) participant, as opposed to the proximate (main or primary) one. Proximate nouns are unmarked.

Diminutive - "little <noun>"
 
The Wikipedia article doesn't go into much detail about Massachusett verbs, but looking at the grammar of other Eastern Algonquian langs, its possessive affixes would also be used as personal conjugation markers for its verbs.

Massachusett is one of the Eastern Algonquian languages, formerly spoken along the east coast of North America from New Brunswick to North Carolina. From  Eastern Algonquian languages
A complex series of phonological and morphological innovations define Eastern Algonquian as a subgroup. "There is less diversity, by any measure, among [Eastern Algonquian languages] as a group than among the Algonquian languages as a whole or among the non-Eastern languages."

 Algonquian languages - the Central and Plains ones are most likely areal rather than genetic groupings, unlike the Eastern one. The Central one extends from eastern Canada to central Alberta and also to Kentucky, and the Plains ones are in the western US/Canada Plains. A likely branching order:
  • Blackfoot (W MT - S AB)
  • -
    • Arapaho-Gros Ventre (CO-KS-NE-WY and MT-SK), Cree-Montagnais (NL - AB), Menominee (NW MI), and Cheyenne (SD-WY)
    • -
      • Core Great Lakes: Ojibwe–Potawatomi (S QC - S ON - N MN), Shawnee (KY), Sauk–Fox–Kickapoo (MI), and Miami–Illinois (IL)
      • Eastern Algonquian
The Core Great Lakes languages may form their own genetic grouping, comparable to E Alg.

This sequence of splits suggests origin in western North America, roughly around where Montana and Alberta meet.

 Proto-Algonquian language - several features:
  • Animate vs. inanimate noun gender: plural is anim *-aki vs. inanim *-ari
  • Inclusive vs. exclusive "we": you-and-I vs. we-without-you
  • Proximate vs. obviate distinction: central vs. peripheral
  • Possessive prefixes ~ verb subjects (in some at least)
From below, the Proto-Algonquian personal pronouns were 1 n-, 2 k-, making it N-K.

From  Ojibwe grammar
Some words are distinguished purely by their noun class; for example, mitig, if it is animate (plural mitigoog), means "tree;" if it is inanimate (plural mitigoon), it means "stick."
Prefixes for personal pronouns, possessives, and personal verb conjugations:
  • (Eastern) Massachusett: 1 n-, 2 k-, 3 w-
  • (Eastern) Munsee: 1 n-, 2 k-
  • (Great Lakes) Ojibwe: 1 n-, 2 g-, 3 w-
  • (Plains) Menominee: 1 n-, 2 k-, 3 o-
  • (Plains) Cheyenne: 1 n-, 2 n-, 3 h-
Animate and inanimate plurals; obviative:
  • (Eastern) Massachusett: a: -ak, i: -ash, obv: -ah
  • (Eastern) Munsee: a: -ak, i: -al; obv: -al
  • (Great Lakes) Ojibwe: a: -g, i: -n; obv: -n
  • (Plains) Cheyenne: a: -(n)otse, i: -(h)o, -(n)e; obv (anim): -o, -oho
  • Blackfoot: a: sg -wa pl -iksi, i: sg -yi pl -istsi
 
 Algic languages - Algonquian and two languages spoken in northwestern California: Wiyot and Yurok, sometimes grouped as "Ritwan".

Wiyot possessive prefixes 1 d-, 2 kh-, independent pronouns 1 yi(l), 2 khil. This looks something like Algonquian N-K.

 Proto-Algic language estimated age: 7,000 years. Likely spoken in western North America in or near the Columbia Plateau, given where the earliest branchings are.


Sergei L. Nikolayev proposes an even further relationship: Algonquian-Wakashan.
Wakashan is a family of languages of the Pacific Northwest coast, mostly in British Columbia.

SN proposes an additional one: Nivkh or Gilyak, spoken in the Russian Far East: the Amur River and Sakhalin.

Its pronoun pattern: N-K (2nd person: Algic *k, Nivkh *tS, Wakashan *s)

Timeline:
Algonquian-Wakashan: 6500 BCE
  • Chimakuan-Wakashan: 5000 BCE
    • Wakashan: 3000 BCE
    • Chimakuan
  • Nivkh-Algic: 5000 BCE
    • Nivkh:
      • Southern Nivkh: 700 CE
      • Northern Nivkh
    • Algic: 3000 BCE
      • Algonquian: 1500 BCE
        • Plains
        • Central: 700 BCE
          • Central
          • Eastern
      • Ritwan: 1400 BCE
That gives an age for Algic of about 5,000 years instead of 7,000 years.
 
Seems like some dispersal of A-W speakers from western North America eastward and northward, with some of them returning to eastern Siberia as Nivkh speakers.

SN also mentions a possible relationship of the Salishan langs to A-W, though he has not discovered systematic sound correspondences. Salishan's range is N Washington - S British Columbia, close to the inferred Algonquian homeland.

As far as I can tell, the A-W-S pronouns are N-K.

Checking on Google Scholar, I could not find much on Proto-Salishan.
 
 Iroquoian languages - mainly around Lake Ontario, with some spots to the south. From  Proto-Iroquoian language I infer pronouns 1 *k and 2 *hs.

Linguistic Clues to Iroquoian Prehistory | Journal of Anthropological Research: Vol 73, No 3
Our results suggest that Proto-Iroquoian dates to around 2624 bc, and that the Finger Lakes region of west-central New York is the most likely homeland. The results also revealed a strong relationship between linguistic dissimilarity and geographic distance, likely reflecting the isolating effects of spatial separation on the magnitude of linguistic exchange. The timing of language divergences seems to coincide with important events observable in the archaeological record, including the first evidence for the use of corn in New York and Ontario. The development of important Iroquoian cultural attributes such as the longhouse, matrilocal residence, and the intensification of agriculture all coincide with a period which saw most of the internal language divergences.

 Muskogean languages - southeast US - Reconstructing Proto-Muskogean Language and Prehistory.pdf - earliest split around 1000 BCE

 Siouan languages - roughly Montana - Wisconsin - Arkansas with patches in Alberta and Virginia - N Carolina. I've found Rankin_Carter_Jones_Proto-Siouan_Phonology.pdf
with pronouns 1 *wa-, 2 *ya-. Thus, W-Y.

 Caddoan languages - split into northern (NE, IA) and southern (NE TX, S OK, SW AR, NW LA) branches about 3000 years ago.
 
I then looked at  Na-Dene languages - most of Alaska and NW Canada, some bits on the California and Oregon coast, and New Mexico and nearby. It's subdivided (Tlingit, (Eyak, Athabaskan)) - Tlingit: S Alaska, Eyak: N end of S Alaska strip. Athabaskan: Northern, Pacific Coast, Southern.

It includes the US Native language with the most speakers, Navajo (~170,000; NM-AZ-UT-CO):  Navajo language and  Navajo grammar. Its 1s and 2s pronoun prefixes are shi- and ni-, which are also present in other S Athabaskan langs. Of the Pacific Coast Athabaskan langs, Hupa has 1s wh- and 2s ni-. The Northern Athabaskan langs have 1s sh-, se-, si-, s- and 2s ni-, ne-, ni-, n-. This indicates proto-Athabaskan 1s *shi-, 2s *ni-.

Eyak: poss. 1s si-, 2s 'i- / subj 1s khw, 2s y(i) / dir obj 1s khu, 2s 'i.
Tlingit: poss. 1s akh, 2s i, ind. 1s khat, 2s uhaan, subj. 1s kha-, 2s ee-, obj. 1s khat, kha, akh, 2s i

 Proto-Athabaskan language has a section, "First person singular fricative" - " In Athabaskan languages, it usually has a reflex of /š/, the alveolar fricative, but in Eyak it appears as /x/ and in Tlingit as /χ/."

So it's Sh/Kh-N/Y

Looking further:  Dené–Yeniseian languages and  Dené–Caucasian languages
 
 Hokan languages and Kaufman-reconstructing protoHokan-first gropings-revd2015.pdf and Somehypothesesregardingproto-HokangrammarKaufman2015.pdf
1: *nyi ~ *nya, *tshi, *Ha, *lye
2: *mi ~ *ma, *nyi ~ *nya
The first ones in these sets are the most widespread, making N-M.

 Penutian languages - I could not find out much, but  Southern Sierra Miwok has 1s kanni-, 2s mi-, 1p mahhi-, 2p mi-ko- and the  Chinookan languages have 1 n-, 2 m-.

Both Hokan and Penutian langs are distributed along the North American West Coast.

I've found First-person n and second-person m in Native America: a fresh look - Raoul Zamponi
Also
WALS Online - Feature 137A: N-M Pronouns
 
Rounding out North America is  Uto-Aztecan languages - the western-US mountains and deserts, W Texas, W coast of Mexico, S Mexico.

 Proto-Uto-Aztecan language - its homeland is a very contentious issue.

One member is  Classical Nahuatl -  Classical Nahuatl grammar
  • Possession: 1s no-, 2s mo-, 1p to-, 2p amo-
  • Verb subject: 1s ni-, 2s ti-, 1p ti-, 2p an-
  • Verb object: 1s -nêch-, 2s -mitz-, 1p -têch-, 2p -amêch-
Half-fits N-M.

Lack of linguistic support for Proto-Uto-Aztecan at 8900 BP | PNAS - ages are in BP (Before Present)
  • Group - Similarity score - Campbell date - Swadesh date - ASJP date
  • Uto-Aztecan - 6.15 - 5000 - 4900 - 4118
  • Athapaskan-Eyak - 5.70 - 3500 - n/a - 4234
  • Mixtecan - 5.10 - n/a - 4900 - 4402
  • Chibchan - 4.84 - 5600 - 5000 - 4484
  • Caddoan - 4.08 - n/a - 3500 - 4743
  • Eskimo-Aleut - 3.31 - 4000 - 3700 - 5059
  • Algic - 2.47 - n/a - 7200 - 5506
  • Witotoan - 2.15 - n/a - n/a - 5717
  • Iroquoian - 1.53 - n/a - 3400 - 6232
  • Siouan-Catawba - 1.27 - 4000 - n/a - 6523
  • Ge - 1.02 - n/a - 4600 - 6856
  • Otomanguean - 0.70 - 6400 - n/a - 7418
Corn was domesticated some 9,000 years ago, making it much older than Uto-Aztecan.
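
The ASJP dates in that list appear to follow directly from the similarity scores. Here is a minimal sketch, assuming the dating formula as I recall it from the ASJP dating work (Holman et al. 2011): age = (ln s - ln s0) / (2 ln r), in millennia. The constants s0 = 0.92 and r = 0.72 and the function name `asjp_date` are assumptions of mine, not something the PNAS note states.

```python
import math

# Assumed constants, recalled from the ASJP dating method (not given in the list above).
S0 = 0.92  # similarity at time zero
R = 0.72   # retention rate per 1000 years

def asjp_date(similarity_percent: float) -> float:
    """Estimated age in years BP from a similarity score given in percent."""
    s = similarity_percent / 100.0
    return 1000.0 * (math.log(s) - math.log(S0)) / (2.0 * math.log(R))

# Similarity scores copied from the list above.
scores = {"Uto-Aztecan": 6.15, "Algic": 2.47, "Iroquoian": 1.53, "Otomanguean": 0.70}
for group, score in scores.items():
    print(f"{group}: about {asjp_date(score):.0f} BP")
```

With those constants the results land within a few years of the listed ASJP dates; the small differences are what one would expect from the similarity scores being rounded to two decimals.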

 Oto-Manguean languages - S Mexico, W Nicaragua

Has  Mixtec language - 1s formal sana, na, informal ru'u, ru, 1p incl. yo'o yo, 2p formal ni'in ni, informal ro'o ro

 Mayan languages - Yucatan Peninsula MX, Belize, Guatemala

 Kʼicheʼ language - 1s n-, w-, 2s a-, 1p q-, u-, 2p i-

Going into the Andes in South America,  Classical Quechua has
  • Pronouns: 1 ñuqa, 2 qam
  • Possession: 1 -y, 2 -yki
  • Verb subject: 1 -ni, 2 -nki
  • Verb object: 1 -wa-, 2 -su- (3 subj, pre-tense), -nki (3 subj, post-tense), -yki? (1 subj, post-tense)
Plurals are usually clearly related to singulars, but it's otherwise hard to recognize patterns.
 
A recent paper: C. Barbieri et al. 2022, "A global analysis of matches and mismatches between human genetic and linguistic histories", PNAS, tries to quantify the degree of alignment between linguistic and population genetic trees. A relevant quote for our discussions:

"Across the whole dataset, we find that for most populations their closest genetic neighbor belongs to the same language family. However, a nonnegligible proportion (18%) is closest to a linguistically unrelated language (Dataset S1). This suggests that mismatches are a regular outcome of language history and not just rare outliers."

Note that this only includes well-established, uncontroversial linguistic families with estimated divergence times of only a few thousand years at most. If 18%-ish of sampled populations undergo language shift per couple thousand years (and that's a lower bound because shifts to a related language will go undetected), then statistically speaking the chances that the speakers of two languages that diverged in the Pleistocene (if that relation can indeed be shown) otherwise share much of their history are pretty slim.
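
To put a rough number on that, here is a back-of-envelope sketch. Everything in it beyond the 18% figure is an assumption for illustration, not from the paper: that "a couple thousand years" means 2,000, that shifts are independent with a constant 18% chance per 2,000-year window, and that the split lies about 12,000 years back.

```python
SHIFT_PROB = 0.18         # 18% mismatch rate, read here as a per-window shift probability
WINDOW_YEARS = 2_000      # assumed length of "a couple thousand years"
SPLIT_YEARS_AGO = 12_000  # assumed time depth, roughly the end of the Pleistocene

def no_shift_probability(years: float) -> float:
    """Chance that one lineage never shifts language over the given span."""
    windows = years / WINDOW_YEARS
    return (1.0 - SHIFT_PROB) ** windows

p_one = no_shift_probability(SPLIT_YEARS_AGO)
p_both = p_one ** 2  # both descendant lineages stay shift-free, assuming independence
print(f"one lineage: {p_one:.2f}, both lineages: {p_both:.2f}")
```

Under those assumptions about 30% of single lineages get through without a shift, and only about 9% of pairs do. Since the 18% is a lower bound, the real figures would be lower still.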

 
Barbieri's paper "excluded sex chromosomes to not bias the analysis with the female to male ratio." I'm not sure this was wise. There is much information in the Y-chromosome specifically.

While most children learn their mother's language, most societies are patrilocal and wives are much more likely to switch to their husband's language than vice versa.

The one specific gene/language mismatch clear from a cursory reading of the Barbieri paper is Hungarian. This is an Ugric language in the middle of Europe, brought from Siberia in historic times by Magyar conquerors. Some Ugrics had R1a Y-haplogroup, common in Central Europe, but some had the rare N Y-haplogroup. N is found at low levels in Hungary and Serbia, though some may have been brought by Turks or Varangians.

Another issue is that some regions are subject to continual migrations. The Eastern European steppes are a good example: they have been washed over by dozens of migrations, so one must look to ancient skeletons (rather than the living) for their genetic history. Is the Carpathian Basin (Hungary) similar? Could a neighbor like Serbia actually better reflect the gene pool of Hungary ten centuries ago? (Serbia has about 5% N haplogroup IIUC.)

Here is a webpage intended for people with "Magyar surnames." Only a small percentage of these "Magyars" have the N Y-haplogroup but I was intrigued to note that one of them was Gergely Árpád, the only Árpád in the data set and (absent cuckolding!) an agnatic descendant of the Magyar kings. Árpád is the surname eventually adopted by the Magyar Kings of Hungary (it was the personal name of Zsolt's father and Taksony's grandfather).

This paper goes into much detail about Hungarian haplogroups.
 
The one specific gene/language mismatch clear from a cursory reading of the Barbieri paper is Hungarian.
Not so. Hungarian is their prime (only real?) example of a linguistic enclave that is not also a genetic enclave, i.e. a group that genetically clusters with its neighbours but speaks a language that is most closely related to languages from another part of the world. There are plenty of other types of gene/language mismatches: groups that are genetically closest to neighbour A but of the same language group as neighbour B, for example, or groups that have adopted the language(s) of their surroundings while retaining the signatures of a different origin (Sardinians, or various Jewish groups around the world, ...).
 
More pronouns:
 Arawakan languages - scattered over South America and the Caribbean - 1s *n-, *t-, 2s *p-, 1p *w-, 2p *h-
 Jê languages - E, S Brazil - 1s *i-, 2s *a-
 Panoan languages - W Brazil, nearby - 1s *?i-
 Chibchan languages - S Central America, N South America - 1s *da, 2s *ba

Some of these half look like N-M, as if N-M had turned into D-B (Chibchan), but others don't.

 Amerind languages and  Indigenous languages of the Americas - both list several examples of N-M, though they note some examples of its absence. What I'd like to see is a comprehensive collection of pronoun forms.
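
A small start on such a collection could look like the sketch below. It only holds forms already quoted in this thread, takes the first listed form where several are given, and labels each pair by the first letter of the 1st- and 2nd-person forms. That labeling is crude (it turns shi- into S rather than Sh, for instance), and the names `PRONOUNS`, `initial`, and `pattern` are hypothetical.

```python
from collections import Counter

# (family or language, 1st person, 2nd person), as quoted earlier in the thread.
PRONOUNS = [
    ("Proto-Algonquian", "n-", "k-"),
    ("Proto-Siouan", "*wa-", "*ya-"),
    ("Proto-Athabaskan", "*shi-", "*ni-"),
    ("Hokan", "*nyi", "*mi"),
    ("Chinookan", "n-", "m-"),
    ("Classical Nahuatl (possession)", "no-", "mo-"),
    ("Arawakan", "*n-", "*p-"),
    ("Jê", "*i-", "*a-"),
    ("Chibchan", "*da", "*ba"),
]

def initial(form: str) -> str:
    """First letter of a form, ignoring the reconstruction asterisk."""
    return form.lstrip("*")[0].upper()

def pattern(first: str, second: str) -> str:
    """Label such as 'N-M' built from the two initials."""
    return f"{initial(first)}-{initial(second)}"

counts = Counter(pattern(p1, p2) for _, p1, p2 in PRONOUNS)
for family, p1, p2 in PRONOUNS:
    print(f"{family}: {pattern(p1, p2)}")
print(counts.most_common())
```

A fuller version would need one row per language rather than per family, plus a column for the source of each form.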


 Classification of indigenous languages of the Americas - most of the classifications are phylogenetic lawns. However,  Joseph Greenberg claimed to have sorted them out into three main groups: Eskaleut (Eskimo-Aleut, Inuit-Yupik), Na-Dene (Eyak-Athabaskan-Tlingit), and Amerind (all the others).

 Eskaleut languages --  Uralo-Siberian languages - Uralic, Yukaghir, Eskaleut - like  Eurasiatic languages (Core Nostratic, Mitian)

 Na-Dene languages --  Dené–Yeniseian languages and  Dené–Caucasian languages

However, Joseph Greenberg used a method,  Mass comparison, that seems very subjective and impressionistic, and his classification is not generally accepted.
 
Prior to Greenberg, the American languages were being studied by many linguists, while African languages were not well-studied at all.

Greenberg's disciples stress that mass comparison is just a FIRST step. Once a list of possible cognates is available, a linguist can explore for sound change laws, etc.

Greenberg achieved huge respect when he classified the African languages into four large groups with this method. He applied the same method to the American languages and stated that he found these languages MUCH easier to classify than the African languages. Yet his classification of the African languages, despite some controversy, is the standard almost without exception to this day. His classification of Amerindian, on the other hand, has met with astoundingly shrill and angry opposition.

Someone unfamiliar with the facts of this shrill and angry opposition will say that "Debate is appropriate in the sciences." I'm tempted to waste an hour Googling for excerpts of this "debate." One finds examples from the anti-Greenberg crowd that are BY FAR the shrillest, angriest, and least intelligent remarks I have ever seen on any subject by people who call themselves scientists.
 
Joseph Greenberg gave this table as an example.
Meaning  A    B    C    D    E    F    G    H    I
Head     kar  kar  se   kal  tu   tu   to   fi   pi
Eye      min  ku   min  miŋ  min  (?)  min  idi  iri
Nose     tor  tör  ni   tol  was  waš  was  ik   am
One      mit  kan  kan  kaŋ  ha   kan  kεn  he   čak
Two      ni   ta   ne   kil  ne   ni   ne   gum  gun
Blood    kur  sem  sem  šam  i    sem  sem  fik  pix
Let's get to work on it.

I used various clustering and phylogeny algorithms on it, and I found a clear split: ABCDEFG HI.

Trying to find a consensus between the algorithms' results, I found that ABCDEFG split further into A BD CEFG, and CEFG into C E FG.
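
The post doesn't say which algorithms were used, so the following is only one possible way to do it: a sketch that swaps in normalized edit distance between word forms and plain average-linkage (UPGMA) clustering. The "eye" row as it appears here has only eight forms, so one cell is treated as missing; putting that gap at language F is an assumption. All function names are hypothetical.

```python
import unicodedata
from itertools import combinations

# Rows: head, eye, nose, one, two, blood. None marks the cell missing from
# the "eye" row as it appears in this thread (assumed to belong to F).
FORMS = {
    "A": ["kar", "min", "tor", "mit", "ni", "kur"],
    "B": ["kar", "ku", "tör", "kan", "ta", "sem"],
    "C": ["se", "min", "ni", "kan", "ne", "sem"],
    "D": ["kal", "miŋ", "tol", "kaŋ", "kil", "šam"],
    "E": ["tu", "min", "was", "ha", "ne", "i"],
    "F": ["tu", None, "waš", "kan", "ni", "sem"],
    "G": ["to", "min", "was", "kεn", "ne", "sem"],
    "H": ["fi", "idi", "ik", "he", "gum", "fik"],
    "I": ["pi", "iri", "am", "čak", "gun", "pix"],
}

def levenshtein(a: str, b: str) -> int:
    """Classic edit distance (insert, delete, substitute all cost 1)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def language_distance(x: str, y: str) -> float:
    """Mean normalized edit distance over the meanings attested in both languages."""
    total, used = 0.0, 0
    for a, b in zip(FORMS[x], FORMS[y]):
        if a is None or b is None:
            continue  # skip the missing cell
        a, b = unicodedata.normalize("NFC", a), unicodedata.normalize("NFC", b)
        total += levenshtein(a, b) / max(len(a), len(b))
        used += 1
    return total / used

def leaves(tree) -> frozenset:
    """Set of language names under a tree node (a name or a pair of subtrees)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return leaves(tree[0]) | leaves(tree[1])

def average_distance(c1, c2) -> float:
    """Average of all leaf-to-leaf distances between two clusters."""
    l1, l2 = leaves(c1), leaves(c2)
    return sum(language_distance(x, y) for x in l1 for y in l2) / (len(l1) * len(l2))

def upgma(names):
    """Repeatedly merge the two closest clusters; returns a nested tuple tree."""
    clusters = list(names)
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: average_distance(*pair))
        clusters = [c for c in clusters if c not in (c1, c2)] + [(c1, c2)]
    return clusters[0]

tree = upgma(sorted(FORMS))
print(tree)
```

With these particular choices the top-level split also comes out as ABCDEFG versus HI. The finer branching inside ABCDEFG need not match the consensus above, which is a reminder of how much the result depends on the distance measure and the algorithm.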
 
Place names are often remarkably stable, some of them known to last for millennia, and many of them being older than written records of them. So place names can give clues about the languages of their namers, even if those namers spoke a different language from their documented descendants or successors.

I will use a simple example: United States place names, comparing the northeast to the southwest.

In New England there are a lot of place names ending in -et: Massachusetts, Connecticut, Nantucket MA, Narragansett RI, Mattapoisett MA, Pawtucket RI, Woonsocket RI, Cohasset MA, Amagansett NY, Namskaket MA, Nauset MA.

But in the southwest there are a lot of names with a completely different pattern: many are two-word names whose first word is El, La, Los, Las, San, or Santa. Furthermore, these first words are associated with particular second-word endings.
  • El -o, -e, -on (El Toro, El Rio, El Monte, El Cajon)
  • La -a (La Habra, La Cienega)
  • Los -os, -es (Los Alamitos, Los Altos, Los Angeles, Los Padres)
  • Las -as, -es (Las Positas, Las Vegas, Las Cruces)
  • San -o, -os, -e (San Diego, San Bernardino, San Carlos, San Marcos, San Clemente, San Jose)
  • Santa -a, -z (Santa Monica, Santa Clara, Santa Cruz, Santa Ynez)
So one concludes that the earlier inhabitants of the northeast and southwest US spoke very different languages.
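
The two patterns are regular enough to check mechanically. A minimal sketch, using only the names listed above; the regular expression is my own loose reading of "-et" (final -et or -ett, optionally followed by -s), and `has_et_ending` and `agrees` are hypothetical helper names.

```python
import re

NORTHEAST = ["Massachusetts", "Connecticut", "Nantucket", "Narragansett",
             "Mattapoisett", "Pawtucket", "Woonsocket", "Cohasset",
             "Amagansett", "Namskaket", "Nauset"]

SOUTHWEST = ["El Toro", "El Rio", "El Monte", "El Cajon", "La Habra", "La Cienega",
             "Los Alamitos", "Los Altos", "Los Angeles", "Los Padres",
             "Las Positas", "Las Vegas", "Las Cruces",
             "San Diego", "San Bernardino", "San Carlos", "San Marcos",
             "San Clemente", "San Jose",
             "Santa Monica", "Santa Clara", "Santa Cruz", "Santa Ynez"]

# Loose reading of "-et": final -et or -ett, optionally followed by -s.
ET_ENDING = re.compile(r"et{1,2}s?$")

# First word -> second-word endings, as listed in the post.
ENDINGS = {
    "El": ("o", "e", "on"),
    "La": ("a",),
    "Los": ("os", "es"),
    "Las": ("as", "es"),
    "San": ("o", "os", "e"),
    "Santa": ("a", "z"),
}

def has_et_ending(name: str) -> bool:
    return bool(ET_ENDING.search(name))

def agrees(name: str) -> bool:
    """True if the second word ends the way its first word predicts."""
    first, second = name.split(" ", 1)
    return second.endswith(ENDINGS[first])

print([n for n in NORTHEAST if not has_et_ending(n)])
print(all(agrees(n) for n in SOUTHWEST))
```

On spelling alone, Connecticut is the one northeastern name in the list that does not fit the -et pattern, while every southwestern name agrees with its first word.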
Those Southwestern names tell you more about the earlier inhabitants of the Iberian peninsula than about the SW United States.

The English invaders in the NE had a habit of asking locals what places were called, or naming places after the tribal names of the people they had killed to get them; the Spanish invaders didn't give a crap what the locals called places, and re-named places after elements of Spanish Catholic mythology.
It gets even more confusing with personal names. There's a ton of Gothic names in Spanish speaking countries, while Germanic languages have left little other mark in Spanish. For example, there's many more Rodrigos and Orlandos in Spain than there's Roderichs and Rolands in Germany, and almost as many Luíses (=Ludwig) and Hernandos (=Hermann). If, by some accident of history, those traditional Spanish-Gothic names drop out of use on the Iberian peninsula but survive in the New World, the 25th century lpetriches and Swammerdamis might well conclude that the Goths reached Mesoamerica and Mexico.
 
I am not disputing your claim. I DO wonder why you mention me. IIRC I've never mentioned proper names in the thread.
 
(Joseph Greenberg and mass comparison)

I used some more clustering algorithms on his example, and here is their consensus:
(A BD C E FG, HI)
-- an unresolved polychotomy for the first set, because different algs produce different branchings in it.

I selected out the first seven and used clusters BD and FG with protoforms inferred from consensus with the others. I found (A BD) C (E FG), and all of them have the same protoforms except for "head": *kar, *se, *tu.


Another South American language family:  Tupian languages with  Tupi–Guarani languages in it, with the  Guarani language of Paraguay.

Guarani pronouns: (independent, undergoer-conjugation prefix) 1s che, 2s nde, 1p incl ñande, excl ore, 2p peẽ, (active-subject conjugation prefix) 1s a-, 2s re-, 1p incl ja-, 1p excl ro-, 2p pee- (ch = sh, j = dy).

Proto-Tupi-Guarani: 1s *iye, *itshe, 2s *endé, 1p incl *yande, excl *ore, 2p *pẽẽ

Not much resemblance to N-M.

Rodrigues (2007) considers the Proto-Tupian urheimat to be somewhere between the Guaporé and Aripuanã rivers, in the Madeira River basin. Much of this area corresponds to the modern-day state of Rondônia, Brazil. 5 of the 10 Tupian branches are found in this area, as well as some Tupi–Guarani languages (especially Kawahíb), making it the probable urheimat of these languages and maybe of its speaking peoples. Rodrigues believes the Proto-Tupian language dates back to around 3,000 BC.

Going further, we find  Je–Tupi–Carib languages including  Macro-Jê languages proposed by Andrei Nikulin.
  • Macro-Jê: All of Brazil but the north, the part near the Amazon River
  • Tupi: around the Macro-Jê area on all sides
  • Carib: N South America, some bits further south

With  Mataco–Guaicuru languages near Paraguay-Argentina, AN proposes a Macro-Chaco family.

They have overall pronoun pattern 1s *i-, 2s *a- with 1p and 2p being more complicated.
1p Je *ku-, MG *qo-, 2p Je *ka-, MG *qa-, Tupi *pe-

The Tupi Expansion | SpringerLink from "The Handbook of South American Archaeology" -- "The Tupi of Brazil undertook an enormous territorial expansion more than 2,000 years ago."
 
One finds examples from the anti-Greenberg crowd that are BY FAR the shrillest, angriest, and least intelligent remarks I have ever seen on any subject by people who call themselves scientists
Because Greenbergians set the bar so high.... :rolleyes:
:confused: Is this some goose—gander claim? Can you give an example? I'll spend the hour and hunt for an anti-Greenberg rant. We can make direct comparison.
 
I suppose it might be a goose and gander argument, if I thought there were anything wrong with academic debate in the first place, which I don't. This is how we progress over time.
 