• Welcome to the new Internet Infidels Discussion Board, formerly Talk Freethought.

Language as a Clue to Prehistory

Mother Tongue 6 has a review by John Bengtson of Joseph Greenberg's "Indo-European and Its Closest Relatives: The Eurasiatic Language Family (Volume 1. Grammar)" - also covering phonology. The second volume is on the lexicon.

JG noted increasing convergence of versions of Nostratic on Afrasian (Afroasiatic) being alongside the others: (Afrasian, Eurasiatic). Kartvelian he proposed is closer to Afrasian than to his Eurasiatic. He proposed in Eurasiatic: Etruscan, Indo-European, (Uralic, Yukaghir), Narrow Altaic: (Turkic, Mongolic, Tungusic), (Korean, Japanese, Ainu), Nivkh/Gilyak, Chukotian/Chukchi-Kamchatkan, Eskimo-Aleut.


He was not sure whether Etruscan is a separate branch of Eurasiatic or a third branch of Indo-European (the other two branches being Anatolian and all the rest). As for Ainu, spoken in Hokkaido, Japan, JB and Vaclav Blazhek propose instead that it belongs to the Austric family.

He notes some grammatical curiosities like tuk (2nd person dual > plural), ken (interrogative pronoun suffixed with n), and m prefixed with g (1st person singular): Indo-European nominative *eg(h)o(m), Hungarian accusative engem, Kamchadal nominative kim. Also a similar formation for the 2nd person singular.


He also reviewed a book about African langs, noting their classification in order of general acceptance: Afrasian (Afro-Asiatic), Niger-Congo, Khoisan, Nilo-Saharan. The second and fourth are sometimes combined as Congo-Saharan.

One of Joseph Greenberg's classification criteria: “Language classification must be based on linguistic evidence alone and not on racial or cultural criteria.” Another one is that “classification must be based on specific points of resemblance and not on the presence or absence of general features of a typological nature.” Also the rule of transitivity, and the rule that vocabulary and grammar lead to the same results. Transitivity: if A ~ B and B ~ C, then A ~ C.
 
After Vitaly Shevoroshkin's reviews of two books on Nostratic comes John Bengtson's review of Vaclav Blazhek's "Numerals: Comparative - Etymological Analyses of Numeral Systems and Their Implications (Saharan, Nubian, Egyptian, Berber, Kartvelian, Uralic, Altaic and Indo-European Languages)". Some of its contents have been published separately, like IE "seven" in MT.

The IE one goes into detail about 1 to 10, 100, and 1000, like for 1: *oy- "one", *sem- "one", *per-/*pro- "first". For 4, VB discusses 11 hypotheses for *kwetwores.

Then come patterns of naming numbers, like names of fingers. I checked out some of the other ones, and they didn't match what I found in The Numbers List. That list is not very well-sourced, I will concede, though it has a page of background: Language Family Information for the Numbers List. I've also found Numeral Systems of the World (hosted by the Max Planck Institute; comprehensive, often goes up to 2000, but has no reconstructions), Appendix:Numerals in various languages (Wiktionary), and Numeral (linguistics) (Wikipedia).

Base 10 is the most common pattern of the more developed systems. The PIE word for 100 is *kmtóm, more or less "super 10" (from *dekm "10"), but the PIE word for 1000 is more difficult:
  • *tuHsont- > Proto-Germanic *thûsundî, Proto-Balto-Slavic *tûsantis (< *tew-2 "to swell" + *kmtóm "100"?)
  • *gheslom > Greek khilioi (<*ghesr- "hand" + *-lom)
  • *sm-gheslom (*sm-gheslom) > Sanskrit sahásra (*sm "one" + *gheslom)
  • *smih2-gheslih2 (*smî-gheslî) > Latin mille (*sm- "one" + *gheslom + *-ih2)

In Uralic, Proto-Finno-Ugric goes up to 6, while Proto-Samoyedic goes up to 10. The only numeral they have in common is 2: FU *kakte, S *kitä.

Turkic has 1 to 10, 100, 1000, Sino-Tibetan has 1 to 10, 100, Austronesian has 1 to 10, 100, with Malayo-Polynesian having 1000, Dravidian has 1 to 10, 100, Kartvelian 1 to 10, 100, Semitic has 1 to 10, 100, 1000 ("thousand" ~ "cattle": */alp-), Ancient Egyptian had 1 to 10, 100, 1000, ...
 
10 is a favorite number base, most likely because we have 10 fingers. But it is not the only one: the bases that people have used include 4, 5, 6, 10, 12, 20, 24, and 60.

The larger bases are often handled with sub-bases, like 20 = 4*5, 60 = 6*10, ...

Thus, 1, 2, 3, 4, 5, 5+1, 5+2, 5+3, 5+4, 2*5, ...
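
The sub-base pattern in that sequence can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not something taken from any of the sources; the function name `sub_base_form` and the default sub-base of 5 are my own choices.

```python
def sub_base_form(n, sub_base=5):
    """Describe n (1 <= n <= 2 * sub_base) the way a sub-base system names it."""
    if not 1 <= n <= 2 * sub_base:
        raise ValueError("n is outside the range this sketch covers")
    if n <= sub_base:
        return str(n)                      # plain word: 1 .. sub_base
    if n == 2 * sub_base:
        return f"2*{sub_base}"             # the full base as a doubled sub-base
    return f"{sub_base}+{n - sub_base}"    # sub-base plus a remainder

forms = [sub_base_form(n) for n in range(1, 11)]
print(", ".join(forms))
```

With the default sub-base of 5 this prints the same sequence as above, from 1 up to 2*5.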

Why are subdivisions of angle degrees and time hours in sixties? Because a sexagesimal or base-60 system is how Sumerians counted, in SE Iraq some 5,000 years ago. This system was then used there for big numbers for the next three millennia. When Alexander the Great conquered Babylon, Greek astronomers became acquainted with Babylonian astronomy, complete with sexagesimal numbers. They also became acquainted with something less desirable: astrology. That aside, their successors have used sexagesimal numbers for time and angles ever since.

Minute < Latin pars minûta prîma
Second < Latin pars minûta secunda
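
The "first small part" and "second small part" are just two successive divisions by 60. Here is a minimal sketch of that split; the function name `split_sexagesimal` is hypothetical, and the same arithmetic serves for hours or for degrees.

```python
def split_sexagesimal(total_seconds):
    """Split a whole number of seconds into (units, minutes, seconds).

    The unit is an hour for time or a degree for angles; either one is
    divided into 60 minutes of 60 seconds each.
    """
    total_minutes, seconds = divmod(total_seconds, 60)
    units, minutes = divmod(total_minutes, 60)
    return units, minutes, seconds

print(split_sexagesimal(3725))   # (1, 2, 5): 1 hour, 2 minutes, 5 seconds
```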

Compounding is used to extend the reach of number systems, sometimes transparently, sometimes less so. For instance, the Chinese word for 21 is èr shí yī "two ten one" (see Chinese Numbers: How to Count in Chinese).

This is even used in languages with small numbers of number words, like 3 = 2 + 1, 4 = 2 + 2, 6 = 3 + 3 = 5 + 1, ...

Less so? Like English, eleven < PGmc *ainalif "one left over", twelve < PGmc *twalif "two left over", thirteen < PGmc *thritehun "three ten", ..., twenty < PGmc *twaintigiwiz "two groups of ten", ...

Latin has undecim "one-ten", duodecim "two-ten", ... viginti < PIE *widkmti < PIE *dwidkomt "two tens", ...

Latin has some alternates: 18 = duodeviginti "two from twenty", 19 = undeviginti "one from twenty"

Looking at its descendants, for the teen numbers, French, Catalan, and Italian have inherited forms for 11 to 16, and "ten-seven", "ten-eight", "ten-nine" for the rest. Spanish and Portuguese have inherited 11 to 15, and "ten-six" etc. for the rest. French seize, dix-sept, Catalan setze, desset, Italian sedici, diciassette, Spanish quince, dieciséis, Portuguese quinze, dezasseis

French shifts to base-20 after 60, though some dialects keep the base-10 inherited forms. 60 = soixante, 70 = soixante-dix "sixty ten", septante, 80 = quatre-vingts "four twenties", huitante, 90 = quatre-vingt-dix "four twenties ten", nonante
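
The French pattern from 60 to 99 can be written out as arithmetic. This is a rough sketch that covers only the standard (non-dialect) forms glossed above; `french_tens_pattern` is a made-up name, and the output is an arithmetic expression rather than the French words themselves.

```python
def french_tens_pattern(n):
    """Show how standard French composes n (60 <= n <= 99)."""
    if not 60 <= n <= 99:
        raise ValueError("this sketch only covers 60 to 99")
    if n < 80:
        base, rest = "60", n - 60      # soixante + 0..19
    else:
        base, rest = "4*20", n - 80    # quatre-vingt(s) + 0..19
    return base if rest == 0 else f"{base}+{rest}"

for n in (60, 70, 80, 90, 97):
    print(n, "=", french_tens_pattern(n))
```

So 70 comes out as 60+10 and 90 as 4*20+10, matching "sixty ten" and "four twenties ten".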

Romanian, however, uses for 11 unsprezece "one to ten", ..., 20 douăzeci "two tens", ...

The Slavic langs have "one on ten", "two on ten", ... for the teens, and "two tens", "three tens", ..., for 20, 30, ... However, the Eastern Slavic langs have 40 = sorok, a word of uncertain origin.
 
[irrelevant trivia from the math history desk]

One sillyish(?) but well-studied thread in the history of mathematics is the precision with which pi has been calculated. Several great mathematicians have held the precision record at some time. Jamshid al-Kashi is especially notable: Although his method was based on Archimedes' straightforward approach, al-Kashi calculated pi correctly to 16 decimal digits; this held the record for almost two centuries.

Why are subdivisions of angle degrees and time hours in sixties? Because a sexagesimal or base-60 system is how Sumerians counted, in SE Iraq some 5,000 years ago. This system was then used there for big numbers for the next three millennia....
It was in 1424 AD that al-Kashi computed the excellent approximation to pi. He did this using sexagesimal arithmetic.
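
To get a feel for what pi looks like in sexagesimal, here is a small conversion sketch. It is not al-Kashi's method, only a conversion of a modern decimal value. The decimal string is pi truncated to 14 places (my assumption, enough for the five sexagesimal places shown), and `sexagesimal_places` is a hypothetical helper name.

```python
from fractions import Fraction

# pi truncated to 14 decimal places (an assumption of this sketch)
PI_DECIMAL = "3.14159265358979"

def sexagesimal_places(decimal_string, places):
    """Return the whole part and the first `places` base-60 fractional digits."""
    value = Fraction(decimal_string)   # exact arithmetic, no float rounding
    whole = int(value)
    frac = value - whole
    digits = []
    for _ in range(places):
        frac *= 60
        digit = int(frac)
        digits.append(digit)
        frac -= digit
    return whole, digits

whole, digits = sexagesimal_places(PI_DECIMAL, 5)
print(f"{whole};" + ",".join(str(d) for d in digits))   # 3;8,29,44,0,47
```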
 
Interesting curiosity. One has to ask why the sexagesimal system lasted so long, at least in various specialized uses. The French revolutionaries tried to introduce decimal time, with 10 hours per day, 100 minutes per hour, and 100 seconds per minute, but they were not very successful with that (see French Republican calendar). The division of the right angle into 100 parts instead of 90 degrees gives the gradian (also called grade, grad, or gon). Also introduced in the French Revolution, with decimal minutes and seconds of arc like the time ones, it has had some use in some applications in some places.
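
For the curious, converting an ordinary clock time to decimal time is a one-line rescaling. This is a minimal sketch under the definition just given (10 hours of 100 minutes of 100 seconds); the function name `to_decimal_time` is mine, and it truncates to whole decimal seconds.

```python
def to_decimal_time(hours, minutes, seconds):
    """Convert a 24-hour clock time to decimal time (10 h x 100 min x 100 s)."""
    standard_seconds = (hours * 60 + minutes) * 60 + seconds   # out of 86,400
    # A decimal day has 10 * 100 * 100 = 100,000 decimal seconds.
    decimal_seconds = standard_seconds * 100_000 // 86_400
    dec_hours, remainder = divmod(decimal_seconds, 10_000)
    dec_minutes, dec_seconds = divmod(remainder, 100)
    return dec_hours, dec_minutes, dec_seconds

print(to_decimal_time(12, 0, 0))   # (5, 0, 0): noon is 5 o'clock decimal
print(to_decimal_time(6, 0, 0))    # (2, 50, 0)
```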

But decimal-prefix time units like millisecond and kiloyear are common, as are decimal degrees, and decimal length and mass units have become nearly universal: metric units, though the systems they replaced have oodles of different multiplying factors.

English weights and measures: Lengths and areas
1 nail = 9/4 inches, 1 hand = 4 in, 1 foot = 12 in, 1 yard = 3 ft, 1 fathom = 6 ft, 1 rod/pole/perch = 5 1/2 yd, 1 link = 1/25 rod, 1 chain = 4 rods = 100 links = 22 yd, 1 furlong = 10 chains, 1 statute mile = 8 furlongs

1 nautical mile = 1 minute of arc on the Earth's surface
1 acre = 1 furlong * 1 chain
1 cubit = 3/2 ft
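
To show how many different multiplying factors are in play, here is a sketch that reduces the length units above to inches and converts between them. The table name `inches` and the helper `convert` are my own; exact fractions are used because the nail and the rod are not whole numbers of the smaller unit.

```python
from fractions import Fraction

# Each unit is defined from an earlier one, following the list above.
inches = {"inch": Fraction(1)}
inches["nail"] = Fraction(9, 4) * inches["inch"]
inches["hand"] = 4 * inches["inch"]
inches["foot"] = 12 * inches["inch"]
inches["cubit"] = Fraction(3, 2) * inches["foot"]
inches["yard"] = 3 * inches["foot"]
inches["fathom"] = 6 * inches["foot"]
inches["rod"] = Fraction(11, 2) * inches["yard"]   # 5 1/2 yards
inches["link"] = inches["rod"] / 25
inches["chain"] = 4 * inches["rod"]
inches["furlong"] = 10 * inches["chain"]
inches["mile"] = 8 * inches["furlong"]

def convert(amount, unit_from, unit_to):
    """Convert between any two units in the table, exactly."""
    return amount * inches[unit_from] / inches[unit_to]

print(convert(1, "mile", "yard"))    # 1760
print(convert(1, "chain", "link"))   # 100
```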

Avoirdupois (from Old French aveir de peis, "goods of weight") - for anything relatively heavy
1 ounce = 16 drams, 1 pound = 16 ounces, 1 stone = 14 pounds, 1 short quarter = 25 pounds, 1 long quarter = 2 stone = 28 pounds, 1 short/long hundredweight = 4 short/long quarters, 1 short/long ton = 20 short/long hundredweights

US Standard Volume
1 cup = 8 fluid ounces, 1 pint = 2 cups, 1 quart = 2 pints, 1 gallon = 4 quarts

These are fluid US units. Here are dry US units:
1 dry quart = 2 dry pints, 1 dry gallon = 4 dry quarts, 1 peck = 2 dry gallons, 1 bushel = 4 pecks
 
[2207.12102] Sexagesimal Calculations in Ancient Sumer
This article discusses the reasons for the choice of the sexagesimal system by ancient Sumerians. It is shown that Sumerians chose this specific numeral system based on logical and practical reasons which enabled them to deal with big numbers easily and even perform the multiplications and divisions in this system. I shall also discuss how the Sumerians calculated the area of a large field and measured a large quantity of barley according to their seemingly complicated but really systematic methods.
1 = ash, dish, 2 = min, 3 = esh5, 4 = limmu, 5 = ia, 6 = ash (<5 + 1), 7 = imin (<5 + 2), 8 = ussu, 9 = ilimmu (<5 + 4)
10 = u, 20 = nish, 30 = ushu, 40 = nimin (<20 * 2), 50 = ninnu (<40 + 10)
60 = gesh, 60^2 = shar, 60^3 = shargal ("big shar") or shargesh "60^2*60"

Sumerian uses a mixed base: 60 = 6 * 10, 10 = 2 * 5.

 Babylonian cuneiform numerals and The Joy of Sexagesimal Floating-Point Arithmetic - Scientific American Blog Network

They are written in mixed-base fashion: 60 = 6 * 10 with different tens and ones symbols, symbols that are repeated as appropriate.
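
Here is a sketch of that mixed-base layout: each base-60 place is split into a count of tens symbols and a count of ones symbols. The function names are hypothetical, and the sketch ignores the problem of how an empty place was marked.

```python
def sexagesimal_digits(n):
    """Return the base-60 digits of a positive integer, most significant first."""
    if n <= 0:
        raise ValueError("this sketch only handles positive integers")
    digits = []
    while n > 0:
        n, digit = divmod(n, 60)
        digits.append(digit)
    return digits[::-1]

def tens_and_ones(n):
    """For each base-60 place, give (number of tens symbols, number of ones symbols)."""
    return [divmod(digit, 10) for digit in sexagesimal_digits(n)]

print(sexagesimal_digits(4000))   # [1, 6, 40], since 4000 = 1*3600 + 6*60 + 40
print(tens_and_ones(4000))        # [(0, 1), (0, 6), (4, 0)]
```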

The major premodern successors of the Sumerians all use base-10: speakers of Akkadian, Greek, Arabic, Latin (not sure about India, but in that case, it would be Sanskrit).


I'll look at Indo-European numerals more closely. 1 is a bit unstable (*oynos, *oykos, *oywos, *sem-) and is inflected as an adjective in the singular. 2 is *dwoH, inflected in the dual number. 3 is *treyes, inflected in the plural; "protruding finger"? 4 is *kwetwores, inflected in the plural, and atypically long for an Indo-European root. 5 is *penkwe. 6 is *sweks (originally *weks?). 7 is *septm. 8 is *oktôw, with a dual ending, meaning two fours. 9 is *newn, either from *newos "new" or from *h1enu- "lack". 10 is *dekm. 100 is *kmtom, more or less "big ten". 1000 is "big hand" and "big hundred".

Turning to Semitic, I find 1 = *\asht- (East), */ahhad- (West), 2 = *t'in-, 3 = *t'alât'-, 4 = */arba\-, 5 = *hhamsh-, 6 = *shidt'-, 7 = *shab\-, 8 = *t'amâniy-, 9 = *tish\-, 10 = *\asar-, 20 *isrû- ("tens"), 30 ("threes"), 40 ("fours"), 50 ("fives"), 60 ("sixes"), 70 ("sevens"), 80 ("eights"), 90 ("nines"), 100 = *mi/at-, 1000 */alp- ("cattle")

6 and 7 somewhat resemble each other, but that's the extent of their resemblance. In Semitic, the word for 100 doesn't resemble the word for 10 very much, unlike in IE.
 
The earliest known system for measuring silver came from Sumeria, and used pure sexagesimal. One talent of silver weighed as much as 60 mina; one mina weighed 60 shekels; one shekel weighed 180 grains of barley (180 is a multiple of 60, but see below). (The talent was just a unit of account defined in terms of mina or shekels, and varied between 22 and 33 kilograms country to country.)

As this system was adapted by nearby countries, factors other than 60 were used. Canaan had only 50 menah in their talent, so a talent was 3000 shekels instead of 3600. Greece had 60 mine in a talent, but 100 drachma in a mine, so there were 6000 drachma in a talent. Thus the didrachma (2 drachma) became a common coin, with about the same value as a shekel. Although the average weight of a barleycorn grain would seem to be a reproducible measure, in fact there was wide variation in the weight of silver shekel coins. Coins in Canaan were about 11.7 grams, precisely the weight of 180 barleycorn grains. However, Sumerian shekels were considerably smaller. Why the Sumerian shekel didn't match its nominal definition of 180 grains of barley is an unsolved mystery.

This system survives to a degree today. One troy ounce is 480 grains (480 is a multiple of 60). A grain is about 64.8 milligrams which, in fact, is about the average weight of a barley corn taken from the middle of the ear. 4500 years after the barleycorn/shekel/mina system was defined in ancient Sumeria, weighing barleycorns was still the nominal definition for coinage weights in Europe. (A pennyweight is defined as 24 grains.) European countries without barley used grains of wheat and a conversion factor.
 
Borrowing of words for big numbers is common. For example, English "million" is borrowed from Old French, in turn from Italian milione -- mille "thousand" + -one "augmentative suffix" -- "big thousand"

Italian -one itself is cognate with French -on, Spanish -ón, and Portuguese -ão, from Latin -ô (-ôn-), from PIE *-Hon- (PIE had several derivational suffixes)

Smaller numbers can be borrowed, like Proto-Finno-Ugric *seta 100 from early Indo-Iranian.

Perusing Mark Rosenfelder's The Numbers List - Swahili has

1 moja, 2 mbili, 3 tatu, 4 nne, 5 tano, 6 sita, 7 saba, 8 nane, 9 tisa, 10 kumi

Of these, 1 to 5 and 10 are inherited from Proto-Bantu, 6, 7, 9 are borrowed from Arabic, and 8 is likely "two fours" or "big four".

Two fours? Like in Indo-European? Let's look at Old Japanese:
1 pitö, 2 puta, 3 mi, 4 yö, 5 itu, 6 mu, 7 nana, 8 ya, 9 könönö, 10 töwo
Some of them are related by vowel shifts: 2 ~ 1, 6 ~ 3, 8 ~ 4

In Proto-Bantu, 1 to 5 can be reconstructed, and also 10, but not 6 to 9. Proto-Bantu speakers likely did count between 5 and 10, so words for those numbers existed but cannot be reconstructed, because they vary too much among the descendant langs.

Something similar happens in Sumerian, where 6 to 9 are from 1 to 5, and also Proto-Berber:
1 yn, 2 sn, 3 krad, 4 okkoz, 5 fuss, 6 fuss d yn, 7 fuss d sn, 8 fuss d krad, 9 aḍàw meraw, 10 meraw
It's evident that in these words 6 = 5+1, 7 = 5+2, 8 = 5+3, 9 = (less?) 10

From MR's big list and from Wiktionary, it is evident that words for 1 to 10 were independently invented several times. Many of them are likely coinages from other words, though which ones are often obscure. There are some exceptions, like:

From  Numeral (linguistics) "Quaternary systems are based on the number 4. Some Austronesian, Melanesian, Sulawesi, and Papua New Guinea ethnic groups, count with the base number four, using the term asu or aso, the word for dog, as the ubiquitous village dog has four legs."

Austronesian Comparative Dictionary - PAN Index -- *lima "five" and *qalima, *qa-lima "hand" are very clearly related -- each of our hands has five fingers.

Some langs have only a few number words, like up to 2 or 3 or 4 or 5. These are mostly spoken by people who live in warm climates with low technology, being hunter-gatherers (Paleolithic) or the earlier sorts of farmers (Neolithic). Since that is what our ancestors were like, I conclude that having a word for 2 is universal.

Yet that word varies like crazy, like all the others. So it must have been replaced numerous times.
 
From Swammerdami's numbers,
  • Sumer: 1 shekel = 180 grains, 1 mina = 60 shekels, 1 talent = 60 minas
  • Canaan: (above), 1 talent = 50 minas
  • Greece: 1 mina = 100 drachmas, 1 talent = 60 minas
  • Troy units: 1 pennyweight = 24 grains, 1 troy ounce = 20 pennyweights, 1 troy pound = 12 troy ounces
  • Avoirdupois units: 1 pound = 7000 grains (1 troy pound = 5760 grains)
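
The factors in that list multiply out as follows. This is a small sketch using only the numbers quoted above, plus the 60 shekels per mina that Swammerdami's post implies for Canaan; `chain_units` is a made-up helper that expresses every unit in terms of the smallest one.

```python
def chain_units(steps):
    """steps: (unit, factor relative to the previous unit), smallest unit first.

    Returns a dict giving each unit's size in terms of the smallest unit.
    """
    sizes = {}
    size = 1
    for unit, factor in steps:
        size *= factor
        sizes[unit] = size
    return sizes

sumer = chain_units([("grain", 1), ("shekel", 180), ("mina", 60), ("talent", 60)])
canaan = chain_units([("shekel", 1), ("mina", 60), ("talent", 50)])
greece = chain_units([("drachma", 1), ("mina", 100), ("talent", 60)])
troy = chain_units([("grain", 1), ("pennyweight", 24), ("ounce", 20), ("pound", 12)])

print(sumer["talent"] // sumer["shekel"])   # 3600 shekels per Sumerian talent
print(canaan["talent"])                     # 3000 shekels
print(greece["talent"])                     # 6000 drachmas
print(troy["ounce"], troy["pound"])         # 480 5760 grains
```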

The new complete system of arithmetic, composed for the use of the citizens of the United States. By Nicolas Pike, A.M. Member of the American Academy of Arts and Sciences. Abridged for the use of schools. 1798 : Pike, Nicolas. : Free Download, Borrow, and Streaming : Internet Archive - I concede that that is an awful scan - but it's good to see the chaos of measurement units before the last few centuries.

On scan page 38, book page 32, is this for dry measures:

2 pints = 1 quart, 2 quarts = 1 pottle, 2 pottles = 1 gallon, 2 gallons = 1 peck, 4 pecks = 1 bushel, 2 bushels = 1 strike, 2 strikes = 1 coom, 2 cooms = 1 quarter, 4 quarters = 1 chaldron, 4 1/2 quarters = 1 chaldron in London, 5 quarters = 1 wey, 2 weys = 1 last
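
To see how that ladder works in practice, here is a sketch that breaks a number of pints into Pike's units, largest first. I left out the chaldron, since it does not nest into the wey (4 quarters vs. 5 quarters); the names `STEPS`, `pint_sizes`, and `decompose` are my own.

```python
# (unit, how many of the previous unit it contains), starting above the pint
STEPS = [("quart", 2), ("pottle", 2), ("gallon", 2), ("peck", 2), ("bushel", 4),
         ("strike", 2), ("coom", 2), ("quarter", 2), ("wey", 5), ("last", 2)]

def pint_sizes():
    """Return [(unit, size in pints)] from the pint up to the last."""
    sizes = [("pint", 1)]
    size = 1
    for unit, factor in STEPS:
        size *= factor
        sizes.append((unit, size))
    return sizes

def decompose(pints):
    """Greedily express a whole number of pints in the largest units possible."""
    parts = []
    for unit, size in reversed(pint_sizes()):
        count, pints = divmod(pints, size)
        if count:
            parts.append((count, unit))
    return parts

print(dict(pint_sizes())["last"])   # 5120 pints in a last
print(decompose(1000))
```

For 1000 pints this gives 1 quarter, 1 coom, 1 strike, 1 bushel, 2 pecks, and 1 gallon.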

Not bottle, pottle.
 
Mother Tongue 7 is in honor of Joseph Greenberg. Allan Bomhard: "Reflections on Greenberg’s Indo-European and Its Closest Relatives", volume 1 on the grammar.
One of the criticisms often leveled at the Nostratic Hypothesis is the relative dearth of morphological evidence presented by its proponents. Recently, this deficiency has begun to be filled. The late Joseph H. Greenberg has amassed a tremendous amount of morphological evidence in volume 1 of his recent book Indo-European and Its Closest Relatives. On the basis of the morphological evidence alone, I believe that Greenberg has successfully demonstrated that Eurasiatic is a valid linguistic taxon of and by itself. The morphological evidence that Greenberg has gathered for determining which languages may be related to Indo-European is the most complete to date and the most persuasive -- it goes far beyond what Illich-Svitych was able to come up with, and it also surpasses what was presented in the chapter on morphology by John C. Kerns in our joint monograph The Nostratic Macrofamily.
He then tried to connect Eurasiatic grammatical elements to Afroasiatic, Kartvelian, & Dravidian ones.

Then George Starostin on Elamite. David McAlpin proposed mainly grammatical similarities to Dravidian, but Vaclav Blazhek proposes comparisons of the Elamite lexicon to Afroasiatic. GS: "Instead of solving the problem, in fact, all these works seem to raise several additional ones." DMA's work seems convincing at first: "However, a more detailed analysis of McAlpin's comparisons is able to show that the similarities between the two families (branches?) are, in fact, exaggerated." Also, "Turning now to the theory of V. Blazhek on an Afroasiatic-Elamite relationship, it is easy to see that it has its serious drawbacks, as well."

The poor state of AA reconstruction has the consequence that "as among the endless sea of Afroasiatic languages it would be possible to find suitable parallels to just about any particular Elamite morpheme." There are good reconstructions for Semitic, Egyptian, and Berber, but not for Chadic, Cushitic, or Omotic. GS praises VB for being much more cautious than DMA, though he still finds problems in some of VB's comparisons.

He then compared Dravidian to AA, Nostratic (minus AA), and Sino-Caucasian, using the Swadesh list but with Greenberg-style subjective resemblance, finding stronger comparisons to AA and Nostratic than to SC.

"Tracing the Ancestral Kinship System: The Global Etymon KAKA" by Pierre J. Bancel and Alain Matthey de l’Etang

Proposed for Proto-World by Merritt Ruhlen as "uncle, elder brother", alongside AYA, "elder sister, aunt, grandmother, mother".

A difficulty with this work is the baby-talk problem: words for relatives seeming too much like babbling of little children. Like mama, baba, papa, nana, dada, tata, and clipped versions ama, aba, apa, ana, ada, ata. So kaka, yaya, wawa, and clipped aka, aya, awa may be from somewhat later babbling.

The authors use abbreviations mother M, father F, son S, daughter D, brother B, sister Z, wife W, husband H. Also Ch: S, D and Sp: W, H and GdF: FF, MF and GdM: FM, MM and GdP: GdF, GdM. Older and younger siblings are Sb+ and Sb-.

Thus, maternal uncle is MB and paternal aunt is FZ.
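
The abbreviations compose mechanically, which a short sketch makes clear. It handles only the single-letter codes plus a trailing + or - for elder/younger; the composite labels (Ch, Sp, GdF, GdM, GdP, Sb) are left out, and the names `KIN` and `expand` are mine.

```python
KIN = {"M": "mother", "F": "father", "S": "son", "D": "daughter",
       "B": "brother", "Z": "sister", "W": "wife", "H": "husband"}

def expand(code):
    """Expand a kin-type string such as 'MB' or 'B+' into an English phrase."""
    age = ""
    if code.endswith("+"):
        age, code = "elder ", code[:-1]
    elif code.endswith("-"):
        age, code = "younger ", code[:-1]
    words = [KIN[letter] for letter in code]
    # Every relative but the last is a possessor; the age mark applies to the last.
    possessors = "".join(word + "'s " for word in words[:-1])
    return possessors + age + words[-1]

for code in ("MB", "FZ", "FF", "B+"):
    print(code, "=", expand(code))
```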

The authors compared all but Khoisan, because it's rather difficult to compare clicks to consonants.

Chance resemblances? A problem is that the authors used a rather large semantic field: siblings, parents, parents' siblings, and grandparents. Inadequate reconstruction work is also a problem.

Roman Jakobson's sound-symbolism hypothesis? A problem there is words for the two parents. I looked in Wiktionary for that (mother, father):

Indo-European -- *meh2ter- (*mâter-), *amma -- *ph2ter- (*päter-), *atta
Uralic -- *emä -- *itsä -- Turkic -- *ana -- *ata -- Mongolic -- *eke -- *abu -- Eskimo -- *ana-ana -- *ata-ata
(Kartvelian) Georgian -- deda -- mama -- Dravidian -- *amma -- *appa
Semitic -- */imm- -- */abw- -- Egyptian -- mut -- it
Basque -- ama -- aita -- (NWC) Abkhaz-Abaza -- *ana -- *aba
(NEC) Nakh -- *naana -- *dada -- Avar -- ebel -- emen -- Lezgi -- dida -- buba -- Tabasaran -- dada -- adash
Austronesian -- *ina -- *aba, *amax -- Proto-Thai -- *mê -- *bô
Bantu -- *mâma -- *bâba

The starred forms are protolanguage reconstructions. Many of them are well-represented in their attested descendant langs.

Though the words typically include baby-talk sounds, they can be persistent for a long time.
 
More from Long Ranger 17: Merritt Ruhlen reviewed Johanna Nichols's book "Linguistic Diversity in Space and Time."
... JN proposed an alternative to the comparative method: grammatical features: "head/dependent marking, complexity, alignment, word order, PP's, inalienable possession, inclusive/exclusive pronouns, plurality neutralization, noun classes, and numeral classifiers". ...
"Most linguists will wonder about the feasibility of using typological traits at all in the investigation of genetic affinity, after Greenberg's demonstration of their absurd consequences in Africa."
This seems obvious just from the relative information content -- the number of bits it takes to represent typology is minuscule compared to vocabulary. So I'm puzzled as to why it's most linguists and not all linguists. Do Nichols and other advocates of using typology have a theory for how languages change that explains how a protolanguage could plausibly evolve into daughter languages in a way that leaves a signal detectable above the noise background in the bullseye of grammatical features while simultaneously leaving no trace in the Swadesh list barn door?
 
Wikipedia has an article on Linguistic Diversity in Space and Time. She didn't try to do that, instead selecting one language each. She could have tried to research how persistent typological features are, and how resistant to borrowing they are.

A good example of a borrowed feature is classifiers / counter words / count words / measure words, common in eastern and southeastern Asia but rare elsewhere. It's more-or-less treating every noun as uncountable. Classifiers are in Chinese, Korean, Japanese, Vietnamese, Thai, and Burmese, and at least some of them have relatives that have no classifiers. Examples of such relatives: for Chinese and Burmese (Sino-Tibetan), Tibetan; for Vietnamese (Austroasiatic), Khmer; for Thai (Austro-Tai), Austronesian; and for Japanese and Korean (Transeurasian / Broad Altaic), Turkic, Mongolic, and Tungusic. Austro-Tai and Altaic are both speculative, but supported by some Swadeshian statistics. Also notable is that the classifier words themselves don't seem to be cognate across langs, indicating that it was the idea of using classifiers that was borrowed, and not the classifiers themselves.
 
Johanna Nichols's list of features again, with the Wikipedia article including helpful links.
  • Head-marking vs. dependent-marking -- "these flowers are blooming" (these: dependent marked, are: head marked)
  • Morphological complexity
  • Word order
  • Morphosyntactic alignment - a fancy phrase for whether the subject of an intransitive verb is marked like either the agent or the target of a transitive verb
  • Valence-changing operations or voice system -- valence is what number of nouns a verb takes, and voice is whether the subject is the agent (active) or the target (passive) of a transitive verb, or both (reflexive or middle).
  • Inclusive vs. exclusive we
  • Possession: alienable vs. inalienable -- detachable or permanently attached in some way -- dominant or non-dominant (Polynesian a vs. o) -- obligatory for some nouns vs. never obligatory
  • Numerical classifiers
  • Noun classes: gender, animacy, etc.
  • Grammatical number ("plurality neutralization") - some langs don't have it
  • Adpositional (prepositional, postpositional) phrases ("PP's") - some langs have verbs that can act as prepositions
  • Non-finite verb forms (infinitives or verbal nouns)
Valence:
  • 0 - impersonal verb: it is raining
  • 1 - intransitive verb: the rain is falling
  • 2 - transitive verb: the rain is making a pleasant sound
  • 3 - ditransitive verb: the rain is giving me a cold
Morphological complexity - langs vary widely in what they pack into a word:
  • Nouns: number, noun case, definiteness
  • Verbs: tense, aspect, mode/mood, valence, voice, negation, evidentiality
  • Adjectives: nounlike or verblike
  • Adjectives and adverbs: comparison
 
Johanna Nichols found some rather broad geographic patterns.
One pattern is spread zones (geographical areas where a language family has spread widely, often repeated with several language families in sequence, like Indo-European and later Turkic languages in central Eurasia) vs. residual zones (areas, often mountainous, where many languages of various families have been preserved, like the Caucasus or New Guinea). For example, head marking is more common in the residual zones, which Nichols suggests is a result of long-term language contact.
Her regions:
  • Old World (Afro-Eurasia)
  • New World
  • Pacific: New Guinean and Australian langs
The Old World is geographically largest, but has the least typological diversity and lowest density of language families, suggesting that repeated spreads from its center have eliminated much diversity which previously existed, especially at the edges of the Afro-Eurasia supercontinent. Surprisingly, typological statistics for African languages are similar to those for the languages of Eurasia, though there has been little spread of languages between the two areas, other than the Afroasiatic languages that span both areas.

The New World differs considerably from the Old World, with much higher frequencies of head-marking, ergativity and other features. The "Pacific" is intermediate on these features.
 
More on the KAKA paper in Mother Tongue 7. The next hypothesis that the authors consider is borrowing and diffusion. That does happen for the more distant relatives. Consider:

English uncle < Old French oncle, aunt < Old French ante, cousin < Old French cosin(e)

But the authors conclude that that is unlikely. There is still a possibility of similar roots also being widespread.

The authors then do an anthropological study, and they mention another abbreviation: G for brother B and sister Z. German Geschwister "siblings".

They propose an identification with each other of male elder relatives on one's mother's side, from GdF, MB, B+, and likewise of female elder relatives on one's father's side, from GdM, FZ, Z+.

"We have established that KAKA referred primarily to masculine elders on the mother’s side and possibly to feminine elders on the father’s side."
 
"Was the First Language Purposefully Invented?" by John Saul

He doesn't discuss that issue, however, and instead discusses notions of death. A disappointment.

I would have been interested in some discussion of the emergence of language capability. Some research uses children's acquisition of language, but that involves the risky presumption of ontogeny recapitulates phylogeny, or in simpler terms, growth reruns evolution.


"The Numeral System of Jarawa Andamanese" by Michael Witzel

A very curious system, repeating at 32 with a modified 7, 33 - 8, ... essentially 1 to 6, then repeats of the next 25.

Then describing the Telefol system:  Telefol people -- (left) 1 to 5: little finger to thumb, 6 to 10: wrist, lower arm, elbow, upper arm, shoulder, 11 to 13: side of neck, ear, eye, 14: nose, (right) 15 to 27: like left, but in reverse order.
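
The Telefol tally is easy to model as a lookup that runs up one side of the body and back down the other. This is a sketch of the description above only; the names of the three middle fingers are my own fill-in for "little finger to thumb", and `telefol_body_part` is a hypothetical name.

```python
# Left side, counts 1 to 13. The middle three finger names are an assumption.
LEFT_SIDE = ["little finger", "ring finger", "middle finger", "index finger", "thumb",
             "wrist", "lower arm", "elbow", "upper arm", "shoulder",
             "side of neck", "ear", "eye"]

def telefol_body_part(n):
    """Return the body part tallied for count n (1 to 27)."""
    if not 1 <= n <= 27:
        raise ValueError("the tally described here runs from 1 to 27")
    if n <= 13:
        return "left " + LEFT_SIDE[n - 1]
    if n == 14:
        return "nose"
    return "right " + LEFT_SIDE[27 - n]   # the same parts in reverse order

for n in (1, 10, 14, 15, 27):
    print(n, telefol_body_part(n))
```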

MW proposes that the Jarawa system starts with the head, then goes down each arm, and on each arm's hand, the segments of each finger.
 
Mother Tongue 8 - John Bengtson on Basque and N Caucasian phonology, Vitaly Shevoroshkin on Salishan as Dene-Caucasian, and George Starostin attempting to find sound correspondences for the Khoisan langs, including ones for clicks.

Mother Tongue 9 - discusses the odd distribution of Australian langs. Pama-Nyungan over most of the continent, and all the rest in the north, especially the northwest.

Paul Whitehouse: "Perhaps because the Australian data are so uneven, with very homogenous personal pronouns throughout the continent, plus a small number of widespread lexemes, contrasting with an otherwise extremely diverse lexicon, the idea has often arisen that there is something different about linguistic change in Australia."

PW and Geoff O'Grady got into some argument about "antonymic semantic tradeoff" - words with opposite meanings exchanging word forms - among Pama-Nyungan speakers. GOG said "Much more commonly in Pama-Nyungan, one finds just Antonymic Semantic Change, in which a given root descends in some languages meaning ‘short’, for example, and ‘long’ in others. It’s there!"

A curious linguistic effect is name taboo - not mentioning the name of someone who died or any words similar to it, instead using other words - but that is temporary, and it is only done by the friends and relatives of the dear departed, and also by others in their presence. So it does not produce as much language change as one might expect.
 
Alain Matthey de l’Etang and Pierre J. Bancel have another Proto-World article: "The Global Distribution of (P)APA and (T)ATA and their Original Meaning"

The authors conclude that these words are for male relatives on the father's side: GdF, F, FB, B+.

The authors plan to work on (M)AMA, (N)ANA, and (Y)AYA. From John Bengtson's and Merritt Ruhlen's Global Etymologies, (Y)AYA, which they call AJA, refers to one's paternal aunt, FZ, and not one's maternal aunt, MZ, when it refers to an aunt. So (Y)AYA would be for one's female elders on one's father's side: GdM, FZ, Z+. Judging from words for "mother", (M)AMA and (N)ANA would be for one's female elders on one's mother's side: GdM, M, MZ, Z+.

They also hope to eventually work on younger relatives -- any patterns in words for them?

They also note a way for multiple words to coexist: different terms of address and terms of reference, or else different informal and formal terms.

So in summary:
  • (M)AMA, (N)ANA -- Z+, M, MZ, GdM -- female elders on one's mother's side
  • (Y)AYA -- Z+, FZ, GdM -- female elders on one's father's side
  • (P)APA, (T)ATA -- B+, F, FB, GdF -- male elders on one's father's side
  • (K)AKA -- B+, MB, GdF -- male elders on one's mother's side
 Kinship terminology - three of the six basic systems agree in having MZ = M, with FZ different, and FB = F, with MB different.

In the attested langs, there is some crossover from one of these categories to another, like Eurasian Georgian mama "father" deda "mother", but not enough to obscure this system. Or at least that's what these authors claim to have discovered.
 
Pierre J. Bancel and Alain Matthey de l’Etang: "A Study of Kin Nursery Terms in Relation to Language Acquisition With a Historical and Evolutionary Perspective"

About *(K)AKA for male elders, they note that the colonial expansions of the last half-millennium were done by speakers of langs without any such forms in their langs: English, French, Spanish, Portuguese, and Russian.

But it may be represented as this word: Reconstruction: Proto-Indo-European/h₂éwh₂os - Wiktionary, the free dictionary - "maternal uncle" MB, "maternal grandfather" MF.

It has descendants like Hittite huhhas "grandfather", Latin avus "grandfather", avunculus "maternal uncle" ("little grandfather") > English "uncle", ...

From the likely phonetic value of PIE laryngeal h2, the form was *xuxa, where x is the velar fricative, the "kh" sound like German ch. That would require *K > *x -- are there any Eurasiatic/Nostratic cognates without that shift?

In any case, such a shift would explain a doublet that I'd mentioned earlier:

"duck" *h2enh2ts > Latin anas, English annet, ...
"goose" *ghh2ens > Latin anser, Greek khên, Proto-Germanic *gans > English goose, German Gans, ...

Pre-PIE *Kan- ?
 