Afroasiatic Comparative Lexica: Implications for Long (and Medium) Range Language Comparison
One way to test the reliability of the comparative method would be to undertake the following experiment. Take a set of languages for which a relationship has been suggested, but for which regular sound correspondences and a reconstructed phonemic system of the proto-language have not yet been established. Furnish two libraries on opposite sides of the world with all the available and relevant information on the languages (dictionaries, grammars, texts). Take two researchers trained in the comparative method, put them in the libraries, keep them in isolation from each other and see what they come up with. If it is a reliable procedure then two trained practitioners of it confronted with the same body of data should come up with broadly similar results -- repeatability of experiments should be expected as in natural science. Unfortunately, as so often in linguistics, ethical considerations prevent us from subjecting real human beings to such an experiment.
The world of comparative linguistics is fortunate, therefore, that something very close to this experiment came to be performed by accident.
This refers to the Ehret (E) and Orel-Stolbova (OS) etymological dictionaries of Afroasiatic, also called Afrasian.
E recognizes Semitic, Egyptian, Cushitic, Omotic, Chadic, and Berber, but ignores Berber.
OS break Cushitic up into Beja, Agaw, “East Cushitic,” Dahalo, Mogogodo, and Rift.
"A further difficulty is that E often gives reconstructed forms only without attestation of actual language data."
E and OS disagree on proto-phonology and on sound correspondences. They also disagree on 2-consonant vs. 3-consonant roots. For E, 3C roots are all 2C roots with fossilized suffixes. For OS, some 2C roots are 3C roots with dropped-out sounds, and some 3C roots have fossilized prefixes.
Even when the two agree on cognate sets, they differ in reconstruction. Take, for instance, "to die" and related words: Proto-Semitic *mawut-, Egyptian mwt, Berber mmt, Hausa (Chadic) mutù. E reconstructs maaw and OS reconstruct mawut. This meaning is one of the most stable ones.
E and OS agree on only a small fraction of entries: 59, which is 6% of E's entries and 2% of OS's. OS propose 2.5 times as many entries as E.
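The three figures above (59 shared entries, 6%, 2%) and the 2.5 ratio can be checked against each other with a little arithmetic. The sketch below assumes the two percentages were rounded to the nearest whole number. That rounding is my assumption, not something the review states, and the function name is only illustrative. Under that assumption it computes the range of dictionary sizes compatible with each percentage and the range of size ratios that follows.

```python
# Figures reported in the text: 59 shared entries, 6% of E, 2% of OS.
SHARED = 59


def implied_total_range(shared, pct, half_width=0.5):
    """Range of dictionary sizes compatible with a rounded percentage.

    Assumes `pct` was rounded to the nearest whole number, so the true
    value lies within `half_width` of it (an assumption, not from the review).
    """
    # A larger true percentage implies a smaller dictionary, and vice versa.
    smallest = shared / ((pct + half_width) / 100)
    largest = shared / ((pct - half_width) / 100)
    return smallest, largest


e_min, e_max = implied_total_range(SHARED, 6)
os_min, os_max = implied_total_range(SHARED, 2)

# Extreme values of the size ratio OS / E under the same assumption.
ratio_min = os_min / e_max
ratio_max = os_max / e_min

print(f"E entries:  between {e_min:.0f} and {e_max:.0f}")
print(f"OS entries: between {os_min:.0f} and {os_max:.0f}")
print(f"OS/E ratio: between {ratio_min:.2f} and {ratio_max:.2f}")
```

Dividing the unrounded percentages directly (6 / 2) would suggest a ratio of 3, but the reported 2.5 falls inside the range the rounded figures allow, so the numbers are mutually consistent.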
Die-hard opponents of long-distance comparison may gleefully leap to the conclusion that the method is 94 to 98% inaccurate even at medium depths, but such a conclusion would be premature. Still the fact remains that two sets of scholars have been able to reconstruct mutually unrecognizable proto-languages, and this demands an explanation.
One or the other of the two must have found lots of spurious correspondences -- or both of them did.
Reviewer Robert R. Ratcliffe noted that if semantic matching is loose enough, coincidences become very likely. He used different kinds of birds as an example. OS have 52 putative cognate sets with the meaning "bird", and most of them group together different kinds of birds, like:
Egyptian "falcon" ~ Central Chadic "vulture", "hen" ~ Eastern Chadic "great bustard" ~ Agaw "kind of bird"
Semitic "parrot" ~ Western Chadic "quail"
Central Chadic "hawk" ~ Eastern Chadic "dove"
Berber "butterfly, small bird" ~ Chadic "guinea fowl" ~ Beja "pelican"
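To see why loose semantics inflates chance resemblances, here is a minimal sketch of the counting argument. It is not a calculation from the review. The number of bird terms per language (`N_TERMS`) and the chance that two unrelated words happen to look alike (`P_FORM_MATCH`) are hypothetical values chosen only for illustration. The point is that allowing any bird term to match any other bird term multiplies the number of comparisons, and so the number of expected accidental matches.

```python
def expected_chance_matches(n_terms, p_form_match, loose):
    """Expected number of purely accidental form matches between two languages.

    n_terms: bird terms per language (hypothetical value).
    p_form_match: chance that two unrelated words look alike (hypothetical value).
    loose: if True, any bird term may be compared with any other bird term.
    """
    # Strict semantics: each term is compared only with its exact counterpart.
    # Loose semantics: every term is compared with every term in the other language.
    comparisons = n_terms * n_terms if loose else n_terms
    return comparisons * p_form_match


N_TERMS = 20         # hypothetical: bird names recorded per language
P_FORM_MATCH = 0.02  # hypothetical: chance of an accidental look-alike

strict_matches = expected_chance_matches(N_TERMS, P_FORM_MATCH, loose=False)
loose_matches = expected_chance_matches(N_TERMS, P_FORM_MATCH, loose=True)

print(f"strict semantics: {strict_matches:.1f} expected chance matches")
print(f"loose semantics:  {loose_matches:.1f} expected chance matches")
```

Whatever values are plugged in, loosening the semantics multiplies the expected chance matches by the number of terms per language. Comparing many languages at once, as in the sets above, increases the number of comparisons further.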
On average, OS reconstruct 10 cognate sets for each member of the Swadesh 100-word list; 52 for "bird" is the largest number. By comparison, E reconstructed nothing for "bird" or for most of the other entries in that list. E preferred to reconstruct verbs, while OS preferred to reconstruct nouns.
A further problem: E used an Arabic etymological dictionary for Semitic, and that gave him numerous derived forms to choose from. Why not a Proto-Semitic one?
Ratcliffe summed up by noting several problems, such as drawing on several languages without attempting to reconstruct subfamily protoforms, and drawing on different periods of languages with long histories, like Egyptian. He also pointed to the atypical nature of the reconstructions, such as numerous synonyms and E's reconstruction of the basic vocabulary as mostly abstract sorts of verbs. "In general the underived, basic vocabulary of a language is specific and concrete, while abstract words are formed by derivation."