Language as a Clue to Prehistory

lpetrich · Jul 22, 2022

I'll now demystify noun cases. Though they can look scary, they are functionally much like prepositions. If we named prepositions like cases, we'd get something like

of - genitive preposition
to - dative preposition
at, in, on - locative prepositions
from - ablative preposition
with - instrumental/comitative preposition

As to how they originated, consider prepositions. If they follow their noun phrases instead of preceding those phrases, they are called postpositions, and both together are adpositions.

Noun + postposition, when run together, give noun + case ending.

Looking at

Finnish language: Grammatical cases
Noun cases | a piece of Hungary for you
Learn Turkish - Grammar
Mongolian
Microsoft Word - h_sch_9a.rtf - h_sch_9a.pdf about Tamil noun cases.

The Turkish one contains this example of agglutinative noun inflection: evlerinizin "of your houses" -- ev-ler-iniz-in -- house-(plural)-(your)-(genitive)

Most of the others are similarly agglutinative, with singular and plural forms sharing case endings. That can make it hard to distinguish them from postpositions.
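To make the Turkish breakdown concrete, here is a minimal Python sketch that rebuilds the word from its morphemes and produces an aligned gloss. The morpheme list is copied from the example above. The names `MORPHEMES`, `build_word`, and `gloss` are made up for illustration, and the sketch ignores vowel harmony, which a real Turkish analyzer would have to handle.

```python
# Morphemes of Turkish "evlerinizin", in order, with the glosses given above.
MORPHEMES = [
    ("ev", "house"),
    ("ler", "plural"),
    ("iniz", "your"),
    ("in", "genitive"),
]


def build_word(morphemes):
    """Concatenate the morpheme forms into one agglutinated word."""
    return "".join(form for form, _ in morphemes)


def gloss(morphemes):
    """Return the hyphen-segmented form and its hyphen-separated gloss."""
    forms = "-".join(form for form, _ in morphemes)
    labels = "-".join(label for _, label in morphemes)
    return forms, labels


if __name__ == "__main__":
    print(build_word(MORPHEMES))
    print(*gloss(MORPHEMES), sep="\n")
```

Because each suffix carries one function, dropping the last two entries of the list still leaves a well-formed word ("evler", "houses"), which is the modularity that makes such endings hard to tell apart from postpositions.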

Finnish has some additional complexity, however. Its bare plural ending is -t, but that ending mostly becomes -i- when a case ending is added to it.

Swammerdami · Jul 22, 2022

lpetrich said:
Some people have proposed a cycle: isolating -> agglutinative -> fusional -> isolating again, but that seems too schematic.

As to which is easiest to learn, isolating and agglutinating morphologies are roughly equivalent, because they are both very modular, both easy to decompose into parts. Fusional morphology is more difficult, since it is less modular.

As an interested layman, I have some questions and comments.

First, some write-ups insist on adding additional morphological types (polysynthetic and possibly analytic) to the {isolating, agglutinative, fusional} trio. Would I be out-of-line to focus on the trio and dismiss these additions as unnecessary complications?

I usually think of a word as containing one or more syllables, and in agglutinative or fusional languages a word often contains many syllables. But what about "Qu'est-ce qu'il y a" -- seven words of French using just three syllables? Or English "wun-chal" ("Wouldn't you all") -- four words in just two syllables? Of course, the former is taught as Standard French, while the latter is colloquial; does this make a difference to linguists?

And are those examples "fusion", or something else? I remember the time a Frenchman asked me the single-syllable question "D'où?" and it took me a moment to figure out what he was asking. (My French teacher would have asked the less ambiguous "D'où venez-vous?")

I think Thai might be the most isolating of all languages! (This is, I'll guess, one reason it is so VERY easy to learn.) Chinese (Mandarin?) seems to be the go-to example of a very isolating language, and I know no Chinese. But I have read journal papers that treat Chinese and Thai as examples of isolating languages undergoing grammaticalization, and they show that Chinese is further along in that part of the cycle. (And write "15–25% of lexemes produced by the Thai speakers were complex, with a mean of about 20% as shown. By contrast, 44–57% of lexemes produced by the Chinese speakers are complex, with a mean of about 52%.")

In fact the typical examples offered for grammaticalization in Thai are two words (โดน /don/ "bump") and (ถูก /thuuk/ "touch") which are increasingly used to mark passive voice. However (a) they are usually used only when the recipient has an unfavorable outcome, and (b) the actor's noun is often placed in between the marker and the main verb. These suggest to me that this "grammaticalization" is not particularly ready for agglutination; am I right?

English is, I think, a good example of a language that uses all three structures (isolating, agglutinative, fusional) so shows that simplistic typing may be futile. Nevertheless my readings suggest that type cycling (isolating -> agglutinative -> fusional -> isolating again) may be valid as a general tendency.

Copernicus · Jul 22, 2022

These typological categories are actually very old, and the proposed hierarchy has lots of exceptions. Syntactic typologies and phonological trends have a lot to do with how morphological systems change over time--for example, the way in which rhythmic syllable and foot patterns affect the coordination of articulatory gestures and the perceptual salience of syllables in words and phrases. It really requires a strong background in linguistics to make much sense of what is behind language typologies.

Swammerdami · Jul 23, 2022

Copernicus said:
These typological categories are actually very old, and the proposed hierarchy has lots of exceptions. Syntactic typologies and phonological trends have a lot to do with how morphological systems change over time--for example, the way in which rhythmic syllable and foot patterns affect the coordination of articulatory gestures and the perceptual salience of syllables in words and phrases. It really requires a strong background in linguistics to make much sense of what is behind language typologies.

Your little pinkie finger has forgotten more linguistics than I'll ever know — we get that.

Still it might have been a friendly gesture to actually respond to some of my questions.

Copernicus · Jul 23, 2022

Sorry about not answering your questions. I didn't have a lot of time earlier, but I'll try to see what I can do. However, answering some of your questions would take a basic course in linguistics to introduce you to concepts that would help you understand the answers. How much phonetics, phonology, and morphology have you studied in the past? For most people, the answer would be "none". Linguistics simply isn't taught in most schools, and it is almost never a required subject. Not every university even has a linguistics department.

Swammerdami said:
lpetrich said:

Some people have proposed a cycle: isolating -> agglutinative -> fusional -> isolating again, but that seems too schematic.

As to which is easiest to learn, isolating and agglutinating morphologies are roughly equivalent, because they are both very modular, both easy to decompose into parts. Fusional morphology is more difficult, since it is less modular.

As an interested layman, I have some questions and comments.

First, some write-ups insist on adding additional morphological types (polysynthetic and possibly analytic) to the {isolating, agglutinative, fusional} trio. Would I be out-of-line to focus on the trio and dismiss these additions as unnecessary complications?

No, but think of polysynthetic languages as those that pile lots of affixes (suffixes, infixes, prefixes) on verbs to designate roles that other languages might use separate words (pronouns, noun phrases) to express. Agglutinative languages tend to string affixes together on words, where each affix has a single grammatical function (e.g. plurality or definiteness, but not both). Fusional (or inflectional) languages tend to pile more meanings into an affix (e.g. the -s suffix in "puts" carries both third person and singular designations). Isolating languages tend to require separate words to express grammatical functions rather than affixes. No language really fits perfectly into any of these categories, and linguistic typology these days is much more sophisticated than in the early twentieth century when the morphological types were popular. So it can be misleading to take these categories too seriously.

Swammerdami said:
I usually think of a word as containing one or more syllables, and in agglutinative or fusional languages a word often contains many syllables. But what about "Qu'est-ce qu'il y a" -- seven words of French using just three syllables? Or English "wun-chal" ("Wouldn't you all") -- four words in just two syllables? Of course, the former is taught as Standard French, while the latter is colloquial; does this make a difference to linguists?

It is really tough to answer this question without explaining the difference between phonology and morphophonology. Phonology is basically about coordinating articulatory gestures during speech. Speech is sequential, so think of it in terms of articulating strings of words, where each word consists of a string of phonemes (basic speech sounds). Those phoneme strings are produced in rhythmic groups, which poets create artistic patterns from. Those rhythmic groups are composed of syllables. So you can slow down or speed up, articulate carefully or casually, sing, or whisper those strings of sounds. That's what phonology is about, and knowing that is part of the answer to the question.

Now consider that string of sounds that you try to pronounce using your largely unconscious knowledge of English phonology--rules governing articulatory gestures. You can monkey around with that string of sounds. Children do this in language games all the time, for example, in Pig Latin. When you say the plural of book, you add the suffix -s to the word: books. But if you say the plural of knife, you change the stem to knive and add -s (actually the phoneme /z/). English phonology makes sure of that. So you manipulate the string of phonemes before you try to articulate the word. Phonology is what happens as you try to articulate. Morphophonology involves manipulating strings of phonemes that make up the words.

So you want to know why "wouldn't you all" can be pronounced casually like one word: wunchall. Good question. It involves both changing the string of phonemes and modifying the way you articulate that string--i.e. morphophonology + phonology. "Would not" gets replaced by the contraction "wouldn't". Unstressed "you" gets replaced by "ya" in casual speech. That's morphophonology. Now kick in phonology. A /t/ followed by a /y/ in the same syllable gets pronounced ch by a phonological process that we call "palatalization". Very common in English, doncha think? And, of course, ya + all coalesces into yall in casual English. Hence, wunchall.

Had enough? Am I tiring you out? I could explain what is going on in French. It's the same kind of interplay between morphophonological and phonological processes. The brain arranges a string of words that consists of strings of phonemes. The phoneme string gets packed into rhythmic units (meter and syllables) that then get run through a phonological filter to produce an acoustic output. Listeners use their knowledge and expectations to help them decode the acoustic signal back into words and phrases.

Swammerdami said:
And are those examples "fusion", or something else? I remember the time a Frenchman asked me the single-syllable question "D'où?" and it took me a moment to figure out what he was asking. (My French teacher would have asked the less ambiguous "D'où venez-vous?")

I don't think that the morphological typology is useful in answering your question, but I think I've already addressed it. The morphophonological and phonological systems in French are quite different from those of English. For one thing, English is a stress-timed language rhythmically. So speakers time their articulation to pronounce strings of syllables that are of equal length between stress peaks. French is syllable-timed. That is, unlike in English, each syllable is pronounced with roughly equal length, and stress is almost always on the last syllable. In English, the placement of stress is more complicated. D'où is just a contraction of de + où. In that speech context, you can guess the meaning without needing to say venez-vous.

Swammerdami said:
I think Thai might be the most isolating of all languages! (This is, I'll guess, one reason it is so VERY easy to learn.) Chinese (Mandarin?) seems to be the go-to example of a very isolating language, and I know no Chinese. But I have read journal papers that treat Chinese and Thai as examples of isolating languages undergoing grammaticalization, and they show that Chinese is further along in that part of the cycle. (And write "15–25% of lexemes produced by the Thai speakers were complex, with a mean of about 20% as shown. By contrast, 44–57% of lexemes produced by the Chinese speakers are complex, with a mean of about 52%.")

The evolutionary hierarchy that lpetrich posted is a fairly old one that isn't taken very seriously in modern linguistics. Thai and Chinese are tone languages, and that tends to play a role in their morphophonological and phonological systems. Neither language tends to use affixes (i.e. infixes, suffixes, prefixes), although there can be exceptions to the tendency.

Swammerdami said:
In fact the typical examples offered for grammaticalization in Thai are two words (โดน /don/ "bump") and (ถูก /thuuk/ "touch") which are increasingly used to mark passive voice. However (a) they are usually used only when the recipient has an unfavorable outcome, and (b) the actor's noun is often placed in between the marker and the main verb. These suggest to me that this "grammaticalization" is not particularly ready for agglutination; am I right?

I haven't studied Thai or Chinese well enough to speak about what is going on there. However, I would be hesitant to say that neither language uses affixes. There are degrees of difference between compounding words and attaching affixes to words. Sometimes, compounding resembles affixation.

Swammerdami said:
English is, I think, a good example of a language that uses all three structures (isolating, agglutinative, fusional) so shows that simplistic typing may be futile. Nevertheless my readings suggest that type cycling (isolating -> agglutinative -> fusional -> isolating again) may be valid as a general tendency.

I would not say that, since all languages tend to be a mixed bag of these morphological types. Languages can go pretty much in any direction when they change over time. A lot of it has to do with the way in which phonological processes erode the acoustic signal in fast and casual speech. Infants are born with a capacity to develop a phonological system, and that is essentially what causes "babytalk" in the first few years of life. They are tuning their articulatory systems to mimic adult pronunciations. However, they can only base pronunciations on what they hear. So, if adults are inserting or deleting syllables when they speak carefully or quickly, the child may have a tendency to develop a phonological system that differs from the adult ones internally but sounds roughly the same to adults as their own. Linguists call this phenomenon "rephonologization". Imperfect learning is the primary driver of language change, and that is why resistance is useless. Change is inevitable. And it doesn't always go in a predictable direction, although there are recognizable trends.

lpetrich · Jul 25, 2022

To the previous post's list I add Georgian/Nouns - Wikibooks, open books for an open world - for the language of Eurasian Georgia (not the US state).

The older and more conservative Indo-European languages are another story altogether, with multiple declension types and with case endings for singulars, duals (two of something), and plurals having little resemblance to each other. The dual and plural case endings cannot be analyzed into (dual or plural ending) + (singular case ending).

Indo-European noun-case endings have a complicated history, with plenty of influence on each other, and with the ancestral forms sometimes being difficult to reconstruct.

The Wiktionary article compares reconstructed PIE noun declensions to several attested ones.

PIE is usually reconstructed as having eight cases: nominative (subject), vocative (for addressing someone), accusative (direct object), genitive (of-case), dative (to-case), instrumental (with-case), locative (in-case), ablative (from-case).

But "case syncretism" is common, and is reconstructed for PIE.

The vocative case is identical to the nominative case in the dual and plural, and often also in the singular. When different, it is a sort of bare stem, without nominative singular -s.

In the neuter/inanimate gender, the nominative, vocative, and accusative cases are identical, something well-preserved in the descendant languages.

That is also true of duals, and the other cases are difficult to reconstruct for them. Sanskrit, with all eight cases, has genitive-locative and dative-instrumental-ablative syncretism.

Turning to the ablative, in the singular, it is often the same as the genitive, while in the plural, it is always the same as the dative.

-

Turning to adjectives, they are noun-like in PIE and in most descendant languages, and they have noun-like declensions. In PIE and the older IE languages, they have the same declensions, but some later ones, like Germanic and Slavic ones, developed separate adjective declensions.

Alternatively, adjectives might be verb-like ("to be <adjective>"), like in Japanese.

-

Looking outside of IE, the Bantu languages have noun-class prefixes that are different in singular and plural, where the plural ones cannot be analyzed as (singular noun-class prefix) + (plural prefix) --

Proto-Bantu language

lpetrich · Sep 8, 2022

I note that wiktionary.org has lots of historical-linguistic info in its etymologies. Not just word roots, but also word inflections.

Duals:

Sanskrit: Nom-Voc-Acc -î, -au, Gen-Loc -oh, -yoh, Dat-Inst-Abl -abhyâm, -bhyâm, -âbhyâm
Greek: Nom-Voc-Acc -e, -ô, Gen-Dat -oin
Old Church Slavonic: Nom-Voc-Acc -e, -i, -a, Gen-Loc -u, Dat-Inst -ima, -oma

We see Skt bh ~ OCS m, and that's rather common in IE inflection endings:

*bh -- Italic, Celtic, Greek, Indo-Iranian
*m -- Germanic, Balto-Slavic

About duals, two of the words for numbers have dual endings: 2 *dwô and 8 *oktô -- 8 = 2*4 implying a word for 4 that is now lost. But it is present in the Kartvelian languages of the Caucasus Mountains, like Georgian otxi. The consonants are interchanged, a change called "metathesis", like "ask" becoming "ax" in some English dialects.

There is something similar in Japanese, where multiples of 2 sometimes have vowel changes.

Old Japanese, present-day native Japanese:

1 pitö, hitotsu; 2 puta, futatsu; 3 mi, mittsu; 4 yö, yottsu; 5 itu, itsutsu; 6 mu, muttsu; 7 nana, nanatsu; 8 ya, yattsu; 9 könönö, kokonotsu; 10 töwo, tō

Vowel-shift pairs: 1 - 2, 3 - 6, 4 - 8. The vowel shifts: (1-2) i-u, ö-a, (3-6) i-u, (4-8) ö-a
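The pairing can be checked mechanically. Below is a minimal Python sketch that lines up each Old Japanese numeral with its double and lists the letters that differ. The word forms are the ones given above; the helper names (`nfc`, `differing_letters`, `shifts_by_pair`) are invented for this illustration, and the position-by-position comparison assumes both forms have the same number of letters, which happens to hold for these three pairs.

```python
import unicodedata


def nfc(text):
    """Normalize so that 'ö' is one code point and lines up letter by letter."""
    return unicodedata.normalize("NFC", text)


# Old Japanese forms as listed above; only the numerals in doubling pairs.
OLD_JAPANESE = {
    n: nfc(word)
    for n, word in {1: "pitö", 2: "puta", 3: "mi", 6: "mu", 4: "yö", 8: "ya"}.items()
}
DOUBLING_PAIRS = [(1, 2), (3, 6), (4, 8)]


def differing_letters(base, doubled):
    """Return (base letter, doubled letter) for every position that differs."""
    if len(base) != len(doubled):
        raise ValueError("forms must have the same length for this comparison")
    return [(a, b) for a, b in zip(base, doubled) if a != b]


def shifts_by_pair(numerals=OLD_JAPANESE, pairs=DOUBLING_PAIRS):
    """Map each (n, 2n) pair to the letter changes between the two words."""
    return {(n, m): differing_letters(numerals[n], numerals[m]) for n, m in pairs}


if __name__ == "__main__":
    for pair, shifts in shifts_by_pair().items():
        print(pair, ", ".join(f"{a}-{b}" for a, b in shifts))
```

In every pair the consonants stay put and only vowels change, which is what the list of shifts above summarizes.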

I note that from Old to Modern Japanese, 1 to 9 got the suffix -tsu.

lpetrich · Sep 8, 2022

Trying to untangle Proto-Indo-European noun, adjective, and pronoun declension is a difficult task. From

Proto-Indo-European nominals and

Proto-Indo-European pronouns I have assembled a combined table:

Case	Singular	Plural
Nominative	-s ~ -()	-es
Vocative	-()	-es
Accusative	-m	-ns
Neuter NVA	-m ~ -()	-h2 ~ -()
Genitive	-os	-om
Ablative	-et	-mos
Dative	-ei	-mos
Locative	-i, -()	-su
Instrumental	-eh1	-bhi

I've written 0-slash (nothing present) as ().

It's evident that the plural forms don't look much like the singular forms. PIE's nominal declensions are completely fusional rather than agglutinative like those of Uralic, Turkic, Mongolian, and Dravidian.

It must be noted that some present-day IE languages have agglutinative case endings. For instance, Classical Armenian had fusional ones, but Modern Armenian has agglutinative ones. Some Indic languages also have agglutinative ones, like Sinhalese.

lpetrich · Oct 29, 2022

In an earlier post, I had simplified the Proto-Indo-European noun declension, because there were three types: athematic noun, thematic noun, and pronoun, with some differences between them. Thematic nouns are o-stem ones: *-os, like Old Latin and Greek -os (Classical Latin made it -us), and athematic nouns are all the others. Non-personal pronouns (demonstrative, interrogative, ...) were thematic-like, though some of them also had a separate feminine formed with athematic -eh2 > -â or -ih2 > -î.

For instance the neuter nominative/accusative/vocative singular is athematic -, thematic -m, and pronominal -d. I'm omitting thematic and pronominal -o- ~ -e- to make the relations clearer. The neuter NVA plural for all of them is -h2, however.

The animate nominative singular for all of them is -s ~ -, accusative singular -m, nominative plural athematic, thematic -es, pronominal (some) -i ~ (some) -es, accusative plural -ns.

Athematic genitive-ablative singular: -s ~ -os ~ -es
Thematic, pronominal genitive singular: -s(y)o
Dative singular: -ey

Genitive plural: athematic, thematic -om ~ -ôm, pronominal -ysom
Dative-ablative plural: -bhos ~ -mos

lpetrich · Oct 29, 2022

The personal pronouns are

Case	1s	2s	1p	2p
Nominative	h1egoH	tuH	wei	yuH
Oblique stem	h1me-	te-	ns-, nos-	us-, wos-

The dual pronouns were very similar to the plural ones.

PIE had no third-person pronouns, but instead used demonstrative ones ("this, that"), as some of the descendant languages did, like Latin. The descendant languages have a *lot* of variety in those pronouns, though they originated in various ways, like hillbilly-English "this here" and "that there".

For instance, Romance, from Latin:

French ce < ecce, ceci < ecce hic, cela < ecce illac
Catalan aqueix < eccum ipse, aquest < eccum iste
Spanish ese < ipse, este < iste, aquel < eccum ille
Portuguese esse < ipse, este < iste, aquele < eccum ille
Italian questo < eccum iste, quello < eccum ille
Romanian acest < eccum iste, acela < eccum ille

From Latin ecce "look at...", hic, iste, ille, illic, ipse -- is (ea, id) dropped out, likely from being very short. Like how Latin îre "to go" dropped out. BTW, ille became most Romance languages' definite article.

PIE likely had two demonstratives, *to- and *e- (anaphoric: something earlier mentioned)

For *to- the masculine and feminine nominative singular did t > s: m *so, f *seh2 > *sâ, n *tod, plurals m *toi, f *teh2i > *tai, n *teh2 > *tâ

PIE had a reflexive pronoun, *swe-, a relative pronoun, *(H)yo-, and an interrogative / indefinite pronoun *kwe- *kwi- with adjectival form *kwo-

lpetrich · Nov 5, 2022

A common feature in IE is preposition-case combinations. Here are common meanings of cases with prepositions:

Genitive = origin, starting point
Accusative = destination of motion
Other cases = stationary state

Some prepositions can take more than one case, usually the second and third in this list.

Translations by Google Translate except for the Latin one.

English: The cat runs into the house / The cat sleeps in the house
German: Die Katze läuft ins Haus (acc) / Die Katze schläft im Haus (dat)
Icelandic: Kötturinn hleypur inn í húsið / Kötturinn sefur í húsinu
Latin: Fêlês in domum currit (acc) / Fêlês in domû dormit (abl) -- SOV
Croatian: Mačka utrči u kuću (acc) / Mačka spava u kući (loc)
Russian: Кошка вбегает в дом (acc) / Кошка спит в доме (loc)
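The rule of thumb stated before these examples (genitive = origin, accusative = destination, any other case = stationary) can be written as a small lookup. This is only a sketch of that rule in Python: the function name `preposition_case_meaning` and the `EXAMPLES` list are made up here, the phrases and case labels are taken from the German, Latin, and Russian sentences above, and real languages have exceptions the function does not model.

```python
def preposition_case_meaning(case):
    """Apply the rule of thumb above to the case governed by a preposition."""
    case = case.lower()
    if case == "genitive":
        return "origin"
    if case == "accusative":
        return "destination"
    # Dative, ablative, locative, ... all count as "other cases".
    return "stationary"


# (language, prepositional phrase, case) from the example sentences above.
EXAMPLES = [
    ("German", "ins Haus", "accusative"),
    ("German", "im Haus", "dative"),
    ("Latin", "in domum", "accusative"),
    ("Latin", "in domû", "ablative"),
    ("Russian", "в дом", "accusative"),
    ("Russian", "в доме", "locative"),
]

if __name__ == "__main__":
    for language, phrase, case in EXAMPLES:
        print(f"{language}: {phrase} ({case}) -> {preposition_case_meaning(case)}")
```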

I couldn't find dictionaries good enough for Ancient Greek.

Seems like it was a feature of Proto-Indo-European.

Among IE langs that have lost this noun-case variation, they either use the same preposition (French, Modern Greek) or different prepositions (Spanish, English).

English "into" is rather obviously a compound: "in" + "to" (indicates destination of motion). English "onto" is similar.

Looking outside of IE, I couldn't find anything comparable. Either one preposition for both meanings, two separate prepositions, or two separate noun cases.

Finnish: Kissa juoksee taloon (illative) / Kissa nukkuu talossa (inessive)
Turkish: Kedi eve koşar (dative) / Kedi evde uyuyor (locative) -- SOV

Here is a case of two separate prepositions:

Indonesian: Kucing itu berlari ke dalam rumah (into: ke dalam) / Kucing itu tidur di rumah (in: di)

Here is a case of one preposition for both:

Filipino: Tumatakbo ang pusa sa bahay (into: sa) / Natutulog ang pusa sa bahay (in: sa) -- VSO

I've indicated departures from SVO with -- SOV and -- VSO.

lpetrich · Nov 5, 2022

There is a curious elaboration on the gender system in most Slavic languages. They split their masculine gender into animate and inanimate ones, distinguished by accusative = genitive (animate) or accusative = nominative (inanimate).

Singular only: Serbo-Croatian, Slovenian
Singular, inanimate plural: Czech
Singular and plural: Slovak, Eastern Slavic (Russian, Ukrainian, Belarusian)

Polish has a three-way distinction.

Personal: acc = gen
Animate: singular: acc = gen, plural: acc = nom
Inanimate: acc = nom
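The Polish three-way split amounts to a small decision table. Here is a minimal Python sketch of it, using only the three rules listed above for masculine nouns; the names `POLISH_ACCUSATIVE` and `accusative_looks_like` are hypothetical, and the table says nothing about the actual endings.

```python
# Which case the masculine accusative is identical to, per the list above.
POLISH_ACCUSATIVE = {
    ("personal", "singular"): "genitive",
    ("personal", "plural"): "genitive",
    ("animate", "singular"): "genitive",
    ("animate", "plural"): "nominative",
    ("inanimate", "singular"): "nominative",
    ("inanimate", "plural"): "nominative",
}


def accusative_looks_like(noun_class, number):
    """Return the case whose form the accusative shares for this class and number."""
    return POLISH_ACCUSATIVE[(noun_class, number)]


if __name__ == "__main__":
    for (noun_class, number), case in POLISH_ACCUSATIVE.items():
        print(f"{noun_class:9} {number:8} acc = {case}")
```

Only the non-personal animate class behaves differently in the two numbers, which is what makes the Polish system three-way rather than two-way.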

This feature is an innovation in Proto-Slavic, since it is absent from most other Indo-European languages, including the Slavic languages' closest relatives, the Baltic languages.

lpetrich · Nov 5, 2022

An oddity that the Balto-Slavic languages share is genitive of negation. In some of them the object of a negated verb goes into the genitive case. Genesis of the Genitive of Negation in Balto-Slavic and Its Evidence in Contemporary Slovenian by Žiga Pirnat.

It's mandatory in Old Church Slavonic (Old Bulgarian), Polish, Slovenian, and Lithuanian, optional in Russian, rare in Serbo-Croatian and Slovak, and very rare in Czech and Latvian. It declined over the recorded history of Czech and Serbo-Croatian.

So it likely originated in Proto-Balto-Slavic and was variously preserved and dropped in a very patchy fashion in its descendants.

Alternatively, it could have originated in Proto-Slavic and then been transmitted to Lithuanian as syntax borrowing.

This odd feature is rare outside of the Balto-Slavic languages.

lpetrich · Nov 5, 2022

Syntax borrowing (areal effects) may also explain two odd features of the Russian language that are absent outside of the Eastern Slavic languages: omitting the present tense of "to be" and expressing possession with "at (u) <possessor> is" instead of "<possessor> has". Most other languages are like English, using that present tense and expressing possession with a verb, like English "to have". Ukrainian uses both variations of each feature, and I can't find out much about Belarusian.

"I have" in Ukrainian - У мене є / Я маю - Ukrainian Lessons

I'll translate "I have a book":

Russian: У меня есть книга - U menya est' kniga
Ukrainian 1: У мене є книга - U mene ye knyha
Ukrainian 2: Я маю книгу - Ya mayu knyhu
Polish: Mam książkę
Czech, Slovak: Mám knihu
Slovenian: Imam knjigo
Croatian: Imam knjigu
Serbian: Имам књигу - Imam knjigu
Macedonian: Јас имам книга - Jas imam kniga
Bulgarian: Имам книга - Imam kniga

Ukr 1 is the more usual version, and Ukr 2 is common in W Ukraine. So there are two possibilities:

Ukr inherited a Russian-like construction, and W Ukr was then influenced by Polish
Ukr inherited a word meaning "to have" and was then influenced by Russian

One reconstructs Proto-Slavic

*jьměti - to have
*kъňìga - book

So the word for "to have" must have been lost in Eastern Slavic.

As to how Russian got those two constructions, I've seen the theory that those are syntax borrowings from some Turkic language.

More generally, some linguists divide languages into "have languages" and "be languages" depending on how they express possession, either with a verb of possession ("to have") or else with some construction like "at <possessor> is".

lpetrich · Nov 5, 2022

I must note a very well-preserved feature of Indo-European, the "Neuter Law": in the neuter or inanimate gender, the nominative, vocative, and accusative cases always look the same. I recall from somewhere that it has no exceptions in attested IE langs. Like in English: he/him, she/her, but it/it.

The Slavic split of the masculine gender into animate and inanimate subgenders brings to mind an odd feature of Spanish, the "personal a" -- see "The Personal A of Spanish" and "Spanish Prepositions: The Definitive Guide". From the former:

I saw the tree -- Vi el árbol
I saw Teresa -- Vi a Teresa

From the latter:

“To” or “at” in Spanish is “a”. It is used when the direct object of a verb is an animal or a person or something personified. We also use “a” to introduce an indirect object, to express time, to give an order, to indicate manner and motion.

To use noun-case names, "a" is a dative preposition, and it is also used as an accusative preposition for personal nouns.

The personal a does not exist in other Romance languages, not even in close ones like Portuguese or Catalan.

Another oddity in Spanish is its two words for "to be" - ser and estar.

ser - persistent - date, occupation, characteristic, time, origin, relation
estar - transitory - position, location, action, condition, emotion

Though estar is also used for the locations of anything fixed in place, like trees and buildings.

Ser vs. estar: understanding Spanish “to be” verbs and Ser vs Estar: The Only Guide You’ll Ever Need | BaseLang and When do you use 'ser' and 'estar'? | Learning Spanish Grammar | Collins Education

Portuguese Verbs Ser vs. Estar: How and When to Use Either » Portuguesepedia -- looks very similar to Spanish
Usos de ser i estar – Aula de català - Catalan also has this verb split, but it is somewhat different.
What's the difference between ‘essere’ and ‘stare’ in Italian? | Learning Italian Grammar | Collins Education - Also somewhat similar.

However, French and Romanian do not have that split.

lpetrich · Nov 5, 2022

These Romance words for "to be" are derived from Latin esse "to be", sedêre "to sit, stay in place", and stâre "to stand".

These are in turn derived from PIE *h1es- > *es- "to be (imperfective)", *sed- "to sit", and *steh2- > *stâ- "to stand". These roots may themselves share a common ancestor in some pre-PIE language: **s- ?

Conjugations of "to be": infinitives, present indicative of first form. The conjugation of the second form is much more regular.

Language	Inf (1st)	Inf (2nd)	1s	2s	3s	1p	2p	3p
Latin	esse	stâre	sum	es	est	sumus	estis	sunt
Italian	essere	stare	sóno	sèi	è	siàmo	siète	sóno
Spanish	ser	estar	soy	eres	es	somos	sois	son
Portuguese	ser	estar	sou	és	é	somos	sois	são
Catalan	ser	estar	sóc	ets	és	som	sou	són
French	être /etr/		suis /swi/	es /e/	est /e/	sommes /som/	êtes /et/	sont /soN/

The 1s and 3s forms are related to English "am" and "is".

Note the e- at the beginning of Spanish, Portuguese, Catalan, and French. This is an add-on that makes the pronunciation easier. Consider Latin scrîbere "to write": Italian scrivere, Spanish escribir, Portuguese escrever, French écrire.
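
The add-on e- can be sketched as a one-step rule: put e- in front of a word that begins with s plus a consonant. The function name add_prothetic_e is made up, vowel length marks are left out, and the outputs are intermediate forms that show only this one step, not the attested modern words, which went through further changes.

```python
VOWELS = set("aeiou")


def add_prothetic_e(word):
    """Prefix e- to a word that begins with s followed by a consonant."""
    if len(word) >= 2 and word[0] == "s" and word[1] not in VOWELS:
        return "e" + word
    return word


for latin in ("stare", "scribere", "sedere"):
    print(latin, "->", add_prothetic_e(latin))
```

Latin sedere is left alone because its s is followed by a vowel, so there is no awkward cluster to break up.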

French être has ê (e with a circumflex) instead of es. The circumflex is a French spelling convention marking a dropped s. Consider Latin fenestra "window": Italian finestra, Portuguese fresta ("small opening"), Catalan finestra, Old French fenestre, French fenêtre.
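
The spelling correspondence can be sketched as a substitution: a vowel plus s before another consonant becomes the vowel with a circumflex. This is only meant for Old French spellings that are known to have lost the s, such as fenestre and estre. It would give wrong results for modern words that kept their s, and the function name is made up.

```python
import re

CIRCUMFLEX = {"a": "â", "e": "ê", "i": "î", "o": "ô", "u": "û"}


def drop_s_with_circumflex(old_spelling):
    """Replace vowel + s before a consonant (other than s) with the circumflexed vowel."""
    return re.sub(
        r"([aeiou])s(?=[bcdfghjklmnpqrtvwxz])",
        lambda match: CIRCUMFLEX[match.group(1)],
        old_spelling,
    )


for old in ("fenestre", "estre"):
    print(old, "->", drop_s_with_circumflex(old))
```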

lpetrich · Nov 5, 2022

In Proto-Indo-European, adjectives were noun-like, declined like nouns and agreeing with their nouns in gender, number, and case. A feminine gender was added early in IE history, using stems -eh2 > -â, -ih2 > -î, and -uh2 > -û to make feminine versions of nouns and adjectives, though there were many nouns and adjectives where the feminine form was left the same as the masculine one.

For example, *swekuros "father-in-law" and *swekruh2 "mother-in-law" (Latin socer, socrus, etc.), though that is an unusual case. Feminines were usually formed with -eh2 (-â) or -ih2 (-î).

Some patterns, mostly from Wikipedia and Wiktionary, and also Whitney's Sanskrit Grammar (Wikisource):

(thematic) -os, -om -- (Greek)
(thematic) -os, -eh2 (-â), -om -- Sanskrit, Greek, Latin, Celtic, Germanic, Balto-Slavic
(thematic) -os, -ih2 (-î), -om -- Sanskrit
(athematic) -s, - -- Sanskrit, Greek, Latin, Celtic, Germanic
(athematic) -s, -eh2 (-â), - -- Sanskrit, Greek, Balto-Slavic(?)
(athematic) -s, -ih2 (-î), - -- Sanskrit, Latin, Balto-Slavic(?)
(athematic) -s, -uh2 (-û), - -- Sanskrit

lpetrich · Nov 5, 2022

Proto-Germanic developed an additional adjective declension feature: strong vs. weak.

The strong one was the indefinite form, "a/an (adj)", and the weak one the definite form, "the (adj)".

The strong one continues o/a-stem adjectives as -az, -ô, -an and the weak one continues n-stem adjectives.

In the present languages, "a big dog" vs. "the big dog":

Dutch: een grote hond / de grote hond (the bare form survives only with neuter nouns: een groot huis / het grote huis)
German: ein großer Hund / der große Hund
Swedish: en stor hund / den stora hunden

The German ß is ss with a long vowel before it. Like the other North Germanic languages, Swedish has both a standalone article and a suffixed one, and without an adjective, only the suffixed one is used: "the dog" - hunden.

English lost that distinction, and English adjectives are indeclinable with only two exceptions, the demonstratives: this/these and that/those.

I must note that those words for "dog" have "hound" as their English cognate, and that the Dutch and German adjectives are cognate with English "great".

Proto-Balto-Slavic also developed an indefinite vs. definite distinction. Definite adjectives were made by suffixing the pronoun *ya- to the indefinite ones. This is preserved in the Baltic languages and usually lost in the Slavic languages, with the Slavic forms often being mixtures of the old indefinite and definite forms. For instance, in Russian, the indefinite form became the short form and the definite form the long form -- see Get the lowdown on long and short–form adjectives in Russian – Unlocking Russian

lpetrich · Nov 5, 2022

Turning to verbs, I note these Proto-Indo-European verb personal endings:

Pers,Num	Act Pri	Act Sec	Act Imp	MPs Pri	MPs Sec	MPs Imp	Stative
1s	-mi, -oh2	-m	-	-h2er	-h2e	-	-h2e
2s	-si	-s	-, -dhi	-th2er	-th2e	?	-th2e
3s	-ti	-t	-tu	-or	-o	?	-e
1d	-wos	-we	-	?	?	-	?
2d	-thes	-tom	?	?	?	?	?
3d	-tes	-tâm	?	?	?	?	?
1p	-mos	-me	-	-mosdhh2	-medhh2	-	-me
2p	-te	-te	-te	-dhh2we	-dhh2we	-dhh2we	-te
3p	-nti	-nt	-ntu	-ror, -ntor	-ro, -nto	-nto	-êr

Act = active, MPs = mediopassive
Pri = primary, Sec = secondary, Imp = imperative

Participles:
Active: -ent- ~ -ont- ~ -nt-
Mediopassive: -mh2no- ~ -m(e)no-
Stative: -wos ~ -us-

Thematic verbs had a suffix -o- ~ -e- before the personal endings; athematic verbs did not. For the first person singular active primary, athematic verbs had -mi and thematic verbs -oh2 > -ô.
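
Here is a rough sketch of how the active primary singular forms are put together from a stem and the endings in the table. It makes several assumptions that are not stated above. The thematic vowel is taken to be -e- before -si and -ti. The example stem bher- "carry" is my choice, while h1es- "to be" is the root mentioned earlier. Accent, ablaut, and cluster simplification are ignored, so the sketch gives h1essi where the usual reconstruction has a single s. The function name is made up.

```python
# Active primary singular endings from the table above (athematic set).
ENDINGS = {"1s": "mi", "2s": "si", "3s": "ti"}


def active_primary_singular(stem, thematic):
    """Concatenate stem (+ thematic vowel) + ending for 1s, 2s, and 3s."""
    forms = {}
    for person, ending in ENDINGS.items():
        if not thematic:
            forms[person] = stem + ending        # athematic: ending goes directly on the stem
        elif person == "1s":
            forms[person] = stem + "oh2"         # thematic 1s uses -oh2 instead of -mi
        else:
            forms[person] = stem + "e" + ending  # assumed thematic vowel -e- before -si, -ti
    return forms


print(active_primary_singular("bher", thematic=True))
print(active_primary_singular("h1es", thematic=False))
```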

lpetrich · Nov 5, 2022

Mediopassive voice? That's combined middle (reflexive) and passive.

The PIE mediopassive survived into the older IE languages, though in Germanic, Balto-Slavic, Greek, and Indo-Iranian, -r was replaced by -i. Modern Greek is the only present-day Indo-European language that preserves the original IE mediopassive.

As an aside, Greek is also one of the few to preserve PIE nominative singular -s. Lithuanian, Latvian, and Icelandic also do so, though in Icelandic, it became -r.

But speakers of some of the daughter languages have created new mediopassives.

Using 'Se' with Spanish Verbs To Express the English Passive Voice
Passive voice | WordDive Grammar for Swedish, a North Germanic language
Passive voice of Russian verbs explained - #1 resource guide

Romance, North Germanic, and Slavic languages use descendants of *s(w)e- "self" in reflexives, though North Germanic turned it into a suffix, -s (-st in Icelandic), and Russian also did so, -sya or -s'. In most of these langs, this construction also has an impersonal-subject or a passive-voice meaning, making it a mediopassive.

Like Spanish "se habla español" -- "Spanish is spoken"

Mediopassives sometimes become complete passives, as in Latin and the continental North Germanic languages.

Also, some languages with (medio)passive conjugations use those conjugations with active meaning in some intransitive verbs -

Deponent verb - like Latin, Greek, and the continental N Gmc langs.
