
Auditory Illusions: The McGurk Effect and Phonemes

Copernicus

Industrial Grade Linguist
Joined
May 27, 2017
Messages
5,675
Location
Bellevue, WA
Basic Beliefs
Atheist humanist
I am fascinated by illusions. They illustrate a most fundamental fact about our perception of reality--that it is not a passive process. It is an active projection built on what our senses report. The following video lasts a little over 3 minutes.

[YOUTUBE]G-lN8vWm3m0[/YOUTUBE]


The McGurk Effect combines different sensations--visual and auditory--to produce its effect. An auditory sensation can be overridden by a visual sensation. But this is part of a larger phenomenon that has to do with brains that are evolved or "hardwired" for producing and perceiving spoken language. I am referring to a phenomenon that I call "phonemic hearing" but is more widely known as "phonemic awareness" or "phonological awareness" these days. The following 2-minute video briefly introduces the phenomenon of phonemic hearing:

[YOUTUBE]P4TiIAO59ec[/YOUTUBE]


Every language in the world is a system of auditory illusions known as "phonemes". They are completely language-specific illusions that we all learn automatically in the first few months of life. They represent a barrier to learning foreign languages, especially in mature individuals, because we find it difficult to shake our illusions as we get older.

Every word of our language is associated in memory with strings of phonemes--not with the raw auditory sensations, but with the illusions evoked by auditory sensations in combination with other sensations and knowledge about the content of what we are listening to.
 
I say, bake-news! You're just confusing bonemes with phonemes!


Ah, well, yeah, bascinating! And the video is pitch-perfect. Only the Trump supporters still won't felieve it!

Anyway, we're learning something every day around here.

Very impressive, I think...

You call those "illusions". I prefer to call them "impressions" (see my one-man thread on the subject of impressions).

You have the impression it's a "fa" you're hearing even though it's a "ba".

Like you have the impression that your perceptions just are the actual physical world. Think of it, there's actually nothing in your perceptions that says as much! And you can actually lose that impression while keeping all your perceptions intact...

And it's your unconscious brain that concocts those impressions, or "illusions", for the benefit of your conscious brain. You usually don't even pay attention to the fact that it comes all prepackaged, ready for consumption. You just have the impressions that pop up, one after another, just like it was you doing it. Clever!

Jess like bake-news!! :p
EB
 
Every language in the world is a system of auditory illusions known as "phonemes". They are completely language-specific illusions that we all learn automatically in the first few months of life. They represent a barrier to learning foreign languages, especially in mature individuals, because we find it difficult to shake our illusions as we get older.

It's not an illusion.

It is turning visual information into sound. Like a bat.

The visual information is cleaner and clearer. So the brain gives it priority over what is heard when creating the experience of sound.
 
It is interesting that the visual information tends to override auditory information in the McGurk Effect. We don't think we hear a /b/ when we see the lip movement for /f/. I've also noticed that, when linguists and philosophers describe the semantics of physical objects, they almost always focus on visual properties first. Dogs bark, but that isn't the first thing that comes to mind when someone asks you to define "dog". Our sensorium seems to be largely devoted to visual modeling.

ETA: From a purely acoustic perspective, consonants tend to carry very little information relative to vowels. A sound spectrogram depicts harmonic bands built up from the fundamental frequency of vocal-fold vibration. The physical information in the auditory channel that identifies consonants is actually found in the way the formant bands adjacent to the consonant bend. So, for example, /p/, /t/, and /k/ are entirely silent--no energy bands at any frequency. That makes them acoustically identical during their periods of articulation--total silence. They are perceived as different sounds because of the way they bend the formant bands in adjacent vowels.
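The "total silence during closure" point is easy to see numerically. Here is a minimal NumPy sketch (the signal, frequencies, and frame size are all invented for illustration, not real speech data): a crude vowel-gap-vowel signal, where frame-by-frame energy drops to exactly zero during the "stop closure", even though listeners would hear a consonant there.

```python
import numpy as np

fs = 16000  # sample rate in Hz (assumed)

def buzz(dur, f0=120.0):
    """A harmonic stack standing in for a voiced vowel."""
    x = np.arange(int(fs * dur)) / fs
    return sum(np.sin(2 * np.pi * f0 * k * x) / k for k in range(1, 6))

# "Vowel" + silent stop closure + "vowel": all evidence for the
# consonant lives in the surrounding vowels, not in the gap itself.
signal = np.concatenate([
    buzz(0.15),                  # vowel before the stop
    np.zeros(int(fs * 0.08)),    # 80 ms closure: total silence
    buzz(0.15),                  # vowel after the stop
])

# RMS energy per 20 ms frame: the closure frames are empty.
frame = int(fs * 0.02)
rms = [float(np.sqrt(np.mean(signal[i:i + frame] ** 2)))
       for i in range(0, len(signal) - frame + 1, frame)]
```

With these numbers, frames 8-10 fall entirely inside the gap and have zero energy, while the vowel frames do not--which is why stop identity has to be read off the formant transitions on either side.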
 
It is interesting that the visual information tends to override auditory information in the McGurk Effect. We don't think we hear a /b/ when we see the lip movement for /f/. I've also noticed that, when linguists and philosophers describe the semantics of physical objects, they almost always focus on visual properties first. Dogs bark, but that isn't the first thing that comes to mind when someone asks you to define "dog". Our sensorium seems to be largely devoted to visual modeling.

The visual information is usually more reliable and easier to comprehend. And the other sounds in the area are not a distraction.

Understanding the language of other people is a very important survival skill.

And visual information is more easily comprehended. More of our brain is devoted to vision.
 
But note that our evolved communication system is primarily auditory, not visual. In fact, there is speculation that human language evolved from a gesture-based system of communication, which we and other primates still use. Human sign language in deaf communities has a structure and complexity that is the equal of any spoken language. Nevertheless, the survival advantage that seems to have tipped the scales in human evolution is the fact that speech works better in the dark and over very long distances. In fact, some communities of speakers have developed forms of whistled language that work over longer distances than even shouted speech. Another form of long-distance communication uses drums, which may mimic the tonal structure of words in languages that rely heavily on pitch to distinguish words.
 
This is very related to something I ran into about fifteen or twenty years ago. I have a native Chinese friend, and I was attempting to learn a little Chinese from her. There is a sound common in the Chinese language that I absolutely could not hear. I can't tell you what it is, because I never heard it, even though my friend put a lot of effort into trying to get me to pronounce the words containing that sound properly. It turned out that we couldn't find any (non-Chinese) Americans who could hear it either. For a little balance, she had a very difficult time hearing the English "r" sound.

For a little fun, here's an internet meme that is hot right now. What do you hear?

 
There is a long tradition of findings showing one modality dominates another with respect to a perceptible attribute. We generally attribute the appearance of that phenomenon to the principle of common fate.

I once performed such an experiment using apparent motion. That is, I sounded speakers and shone lights in sequence in this or that direction to see whether auditory or visual cues drove appreciation of the other sense perception. Visual cues dominated auditory spatial signals. We concluded it was a demonstration of dominant-sense-driven common fate. The study was published in a refereed journal, and no, it wasn't Wormrunners.

Just realize that observers were not only saying stimuli were moving when the objects were merely sounded or lit in sequence; they were also saying that the direction of apparent visual movement drove the perceived direction of acoustic apparent movement.
 
This is very related to something I ran into about fifteen or twenty years ago. I have a native Chinese friend, and I was attempting to learn a little Chinese from her. There is a sound common in the Chinese language that I absolutely could not hear. I can't tell you what it is, because I never heard it, even though my friend put a lot of effort into trying to get me to pronounce the words containing that sound properly. It turned out that we couldn't find any (non-Chinese) Americans who could hear it either. For a little balance, she had a very difficult time hearing the English "r" sound.

For a little fun, here's an internet meme that is hot right now. What do you hear?

(See "Yanny or Laurel" video above)

Actually, I just saw this one on Facebook, and it inspired me to post the OP. The Yanny/Laurel debate seems to be one involving auditory acoustics--a matter of being able to physically hear higher frequencies. At my age, the ability to hear high-pitched frequencies has declined, so I can only hear "Laurel", unless the signal is run through a high-pass filter. At that point, I hear a squawk that barely sounds like "Yanny". My advice to "Yanny" fans who want to stay in the club is to turn down the music volume whenever you can. Loud music is what leads to permanent hearing loss, and most people, especially young people, don't realize the damage that loud sound can cause. In fact, if you can feel vibrations from sounds in your chest, open your mouth, clap your hands over your ears, and go somewhere else.
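For anyone curious what "run through a high-pass filter" means in practice, here is a minimal SciPy sketch. The cutoff, filter order, and the two test tones are my own illustrative assumptions, not the actual processing applied to the Yanny/Laurel clip: the point is just that such a filter strips the low frequencies ("Laurel" territory) while leaving the highs ("Yanny" territory) nearly untouched.

```python
import numpy as np
from scipy.signal import butter, filtfilt

fs = 44100                                          # sample rate in Hz (assumed)
b, a = butter(4, 1000 / (fs / 2), btype="highpass") # 1 kHz cutoff (assumed)

x = np.arange(fs) / fs                              # one second of time
low = np.sin(2 * np.pi * 200 * x)                   # low tone: "Laurel" territory
high = np.sin(2 * np.pi * 4000 * x)                 # high tone: "Yanny" territory

rms = lambda s: float(np.sqrt(np.mean(s ** 2)))
low_out = rms(filtfilt(b, a, low))                  # heavily attenuated
high_out = rms(filtfilt(b, a, high))                # passes almost untouched
```

A listener with high-frequency hearing loss is, in effect, applying the opposite (low-pass) filter for free, which is one story for why older ears land on "Laurel".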

My OP was about a different aspect of hearing--that connected to language. The interesting thing is that we can usually hear and pronounce all of the speech sounds that exist in every language, but we filter them out only when we construe the acoustic stream as a form of linguistic communication. If we think we are just listening to random vocalisms, or we try to make funny noises, we are not so limited.

I'll try to explain how this works. Consider the final sound in the word "song", pronounced /sɔŋ/. It is formed with the back of your tongue up against your soft palate (or "velum"). It is a nasal consonant, so air is blocked in the mouth but allowed to pass freely through the nasal cavity. We call this sound a "velar nasal". In English, velar nasals can only occur at the end of a syllable, so English speakers have developed the ability to pronounce it only in that position.

Now let's say that you are a native English speaker trying to learn Tagalog, an official language of the Philippines. Speakers of that language tend to pronounce velar nasals at the beginnings of syllables. For example, the word "nga", which means "truly" and is pronounced [ŋa], seems unpronounceable to most English speakers at first. But that is only true when language learners try to pronounce the word in running speech. English speakers have no trouble pronouncing strings of syllables with initial velar nasals when they are just fooling around with nonsense syllables--nga-nga-nga-nga.

IOW, when you pronounce words in running speech, your brain automatically switches you into a program of coordinated articulation that limits or impedes the gestures that your speech organs can make. When you aren't trying to pronounce actual words, your brain yields more control over the movement of articulators. The same is true of phonemic hearing. When you are trying to identify words, your brain matches incoming phonetic input against phonemic expectations, but you become less limited in perception when you think you are listening to nonsense syllables.

This reminds me of a related anecdote from when my wife and I were learning conversational Spanish back in the early 1980s. The teacher was a native speaker of Cuban Spanish, a dialect that does not distinguish between dental and velar nasals at the ends of syllables. In fact, Cubans can use either sound interchangeably in that position. This meant that she would pronounce the words "been" and "bing" identically. Moreover, she absolutely could not hear a distinction between those two words. My wife and I both understood this about her English accent, but nobody else in the class did. Finally, a student became a little impatient with her English accent and asked her: "Which is it: 'been' or 'bing'?" She had been saying "been", but she looked puzzled. She said: "That's what I just said: 'bing'." My wife and I had a good chuckle over that, but the rest of the class did not know what was going on.
 
Hey, I got all that! I'm very pleased you can say things that really make sense to me!

I thought I might have developed a hearing problem... :rolleyes:
EB
 
Hey, I got all that! I'm very pleased you can say things that really make sense to me!

I thought I might have developed a hearing problem... :rolleyes:
EB

My French is not great, since I always forget proper gender and my pronunciation can be awkward. English does not have what are called "front rounded vowels": /ü/ and /ö/. These vowels are pronounced with rounded lips, but the bulk of the tongue is forward in the mouth. The back rounded vowels /u/ and /o/ are pronounced with the tongue high and mid respectively, but in the back of the mouth. English speakers tend to hear and pronounce /ü/ and /ö/ in French and German words as the back vowel /u/. That is, the lip rounding seems to override the vowel quality that makes them closer to /i/ and /e/ in articulation and acoustics. I suspect that the McGurk effect has something to do with that. English speakers see rounded lips, and they automatically categorize the sounds as back vowels.

I actually did train myself to pronounce and hear the difference between /ü/ and /u/ fairly well, but I had trouble distinguishing /ü/ from /ö/ for a number of years when I was younger. That is, I couldn't always hear the difference between "peur" and "pur" or "deux" and "doux". I remember the exact moment when my brain finally snapped the phoneme /ö/ into consciousness for me during French conversation. I was sitting one day with an old Breton informant, trying to figure out if some phrases were grammatical in that language. So I gave her two Breton sentences and asked which one was good Breton. "Tous les deux!" she exclaimed repeatedly. After that, I had no trouble pronouncing /ö/ in French (or Breton) and hearing it in the speech of both languages. It was quite jarring, because I knew the sound on an intellectual level and was used to hearing French, but my mind had not internalized it for use as a speech sound.

ETA: Now that I think of it, I suspect that my experience with those front rounded vowels had something to do with the McGurk effect. I was watching her lips as she said "Tous les deux!", and I heard two very different pronunciations with the same lip configuration. Before that, I probably didn't pay too much attention to lip-reading in sync with listening.
 
I can, with some difficulty, overcome the McGurk effect, and hear 'baa' even when the lip is moving under the front teeth. But if I stop concentrating, I go back to hearing 'faa'. Interesting.

And I'm clearly hearing 'yanny', and can't get 'laurel' at all, at age 62. I'll have to try messing with the bass/treble controls on my speakers to see if I can make that sound like 'laurel'. Even when I turn the volume down to bare audibility it still is 'yanny'.

Just from curiosity, Copernicus, do you know which language uses the most individual phonemes?
 
I can, with some difficulty, overcome the McGurk effect, and hear 'baa' even when the lip is moving under the front teeth. But if I stop concentrating, I go back to hearing 'faa'. Interesting.

And I'm clearly hearing 'yanny', and can't get 'laurel' at all, at age 62. I'll have to try messing with the bass/treble controls on my speakers to see if I can make that sound like 'laurel'. Even when I turn the volume down to bare audibility it still is 'yanny'.

Just from curiosity, Copernicus, do you know which language uses the most individual phonemes?

I've heard of languages with over 100 phonemes, but that is extremely rare. English has roughly 44, but that varies with dialects. Polynesian languages like Hawaiian tend to have small phoneme inventories; Hawaiian has roughly 18 phonemes. It is very difficult to distinguish genuine phonemes from allophonic variants of phonemes, so I'm always a bit skeptical of outlier claims. A phoneme is a sound that can be used to distinguish a word from other words, but a single phoneme can have a range of allophones, i.e. non-distinctive phonetic variants.

Spelling systems can sometimes throw you off. Most languages have relatively few vowel phonemes--perhaps five, like Latin (a, e, i, o, u). English spelling uses the Latin alphabet, so it has the same number of vowel symbols. However, English has around 14 separate vowel phonemes and a host of complicated rules for assigning pronunciation to alphabetic expressions. The worst of it is that there are lots of exceptions to the rules. Ideally, a language should have one-to-one correspondences between phonemes and alphabetic symbols. Historically, alphabetic writing is phonemic. Rhyme, alliteration, and assonance are also largely based on phonemic patterns.
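The standard test for whether two sounds are separate phonemes is the minimal pair--two words whose transcriptions differ in exactly one segment, like "been"/"bing". That test can be sketched in a few lines of Python. The toy lexicon and rough one-character-per-segment transcriptions below are my own invention; real phonemic analysis segments actual sounds, not characters.

```python
# Toy lexicon: word -> rough broad transcription (one character per segment).
words = {
    "been": "bin",
    "bing": "biŋ",
    "pin":  "pin",
    "pan":  "pæn",
}

def minimal_pairs(lexicon):
    """Return word pairs whose transcriptions differ in exactly one segment."""
    items = list(lexicon.items())
    pairs = []
    for i, (w1, p1) in enumerate(items):
        for w2, p2 in items[i + 1:]:
            if len(p1) == len(p2) and sum(a != b for a, b in zip(p1, p2)) == 1:
                pairs.append((w1, w2))
    return pairs

print(minimal_pairs(words))
# → [('been', 'bing'), ('been', 'pin'), ('pin', 'pan')]
```

"been"/"bing" surfacing as a minimal pair is exactly why /n/ and /ŋ/ count as distinct phonemes in English--and why the Cuban teacher's merger of the two was audible to the class.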
 
"Tous les deux!" she exclaimed repeatedly.

Please note, if that can help, that I suspect "tous" in your example may be often pronounced more /ö/ ("teu") than /ü/ ("tou"), while the pronunciation of "deux" would still remain /ö/ in all cases.

And, that may well explain something, like... your avatar being an avatar of that long-gone time talking to a sweet Breton girl? :p
EB
 
"Tous les deux!" she exclaimed repeatedly.

Please note, if that can help, that I suspect "tous" in your example may be often pronounced more /ö/ ("teu") than /ü/ ("tou"), while the pronunciation of "deux" would still remain /ö/ in all cases.

And, that may well explain something, like... your avatar being an avatar of that long-gone time talking to a sweet Breton girl? :p
EB

This was an old woman who spoke French with a Breton accent. In any case, the French spoken in Brittany is far more conservative and easier to understand than that spoken by marble-mouthed Parisians. :p
 
I'm really enjoying reading this thread. Cheers all.
 
I think I understand now why sheep can't tell us to go and fuck ourselves.

And maybe we call a ram a "buck" because we perhaps misunderstand something sheep try to tell us?

Any academic paper on this interesting aspect of the wild West countryside life?
:rolleyes:
EB
 