Well there has certainly been a lot of publicity for the laurel/yanny clip recently. It is great to have so many people discussing speech and speech perception – but also a little disheartening that so much misinformation gets accepted as valid phonetics.
For those who don’t know (where have you been?!?) — it is all about an audio clip of a man saying the word ‘laurel’ (listen below) – and the fact that a surprising number of people claim to hear it not as ‘laurel’ but as ‘yanny’.
The original audio came from vocabulary.com, a site aiming to help people learn the meaning and pronunciation of lots of words. Here’s the relevant clip. Notice that it was pronounced clearly (by an opera singer, no less) and in isolation, and recorded under good conditions.
Let’s get a few of the misconceptions out of the way, before going on to consider some possible explanations and discuss some broader issues, especially how (if at all) this example relates to forensic transcription.
- This is not an ambiguous audio clip. The word ‘laurel’ is not likely to be consistently confused with ‘yanny’ even in a noisy recording – and this is a nice clear recording. In its original form on vocabulary.com, it is quite clearly ‘laurel’ (the many unclear versions floating around the internet have been created by audio geeks trying to explain the phenomenon by manipulating the audio – see more on this below).
- Differences in how people hear this clip do not and could not relate to general interpersonal variation in habitual ‘style of hearing’. Basically someone who genuinely and consistently confused words like ‘laurel’ and ‘yanny’ would be unable to speak English. The differences between these words are the very differences that enable us to differentiate hundreds and thousands of other common words, so if you get confused by ‘laurel’ vs ‘yanny’, you’d be getting confused by numerous words in everything you listen to.
- Differences in how people hear this clip do not and could not relate to age-related hearing loss. Age-related hearing loss is real and important – but it’s not at play here.
  - For one thing, all the relevant acoustic information in this word is below 3 kHz (see spectrogram below), which is well below the high frequencies that age-related hearing loss normally affects.
  - For another thing, even if it were true (which it isn’t) that young people were generally better able than older folks to hear the frequencies of ‘y’-like sounds (like the beginning of ‘yanny’), the age-related-hearing-loss hypothesis requires us to believe that young people are less able to hear ‘l/r’-like frequencies than older people are, which is kind of absurd.
Here’s a spectrogram of the recording in the video above. Notice how nice and clear it is. Notice the very definite /l/ at the beginning, with a nice release burst before the vowel. Notice the dramatic dip in the third formant, classic for /r/. Notice the first and second formants are close together throughout, expected for /l/ and /ɔ/.
So what happened to make this nice clear ‘laurel’ into ‘yanny’?
(This is the story as gleaned from various websites – see links below.)
A high-school student named Katie Hetzel, in Georgia USA, was doing her homework, listening to words from her lesson on vocabulary.com. When she clicked to hear the next word on the list (so she could define it), she was surprised to hear ‘yanny’, which was not part of her homework. Then she was more surprised to find the word was supposed to say ‘laurel’. At this point I have no definite explanation for why she initially heard ‘yanny’ (but see below for a conjecture).
She then asked her classmates what they heard. They all said they heard ‘laurel’ – except one who agreed with Katie that it sounded like ‘yanny’. For fun they put the clip on social media to ask more friends. Then someone put a ‘vote laurel or yanny’ poll over the top. Next thing – with the help of a YouTube ‘influencer’ named Cloe Feldman – the poll was going viral. Hear Cloe’s account of the whole thing here and here (Katie’s own account – which appears between 5 and 9 mins in the second video – is excerpted for you in the clip below).
In no time, votes were coming in showing approx 40-45% voting for each of ‘laurel’ and ‘yanny’, with the remainder saying they alternated (either at will or spontaneously) between the two.
Note the huge shift from Katie’s original ‘experiment’ where only one of a class of, let’s guess, something like 20, heard ‘yanny’ – i.e. 5% – to this online ‘experiment’ where nearly half hear ‘yanny’. Something huge has changed here, and it is not the audio! (See more on psycho-social factors below).
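To put a rough number on that shift, here is a minimal sketch of the arithmetic. It uses only the figures quoted above (one ‘yanny’ in a class guessed at about 20, and the lower end of the 40-45% online vote) and asks how likely the classroom result would be if the online rate had applied there too. The function name `binom_cdf` and the constants are my own labels, and the calculation assumes classmates answered independently, which they almost certainly did not, so treat it as an illustration rather than a proper analysis.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """Probability of at most k successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# Figures taken from the post; the class size is the post's own guess.
CLASS_SIZE = 20         # "let's guess, something like 20"
YANNY_IN_CLASS = 1      # the one classmate counted as hearing 'yanny'
POLL_YANNY_RATE = 0.40  # lower end of the 40-45% online vote

# ASSUMPTION: each classmate answers independently of the others.
p_class_given_poll_rate = binom_cdf(YANNY_IN_CLASS, CLASS_SIZE, POLL_YANNY_RATE)

print(f"Classroom 'yanny' rate: {YANNY_IN_CLASS / CLASS_SIZE:.0%}")
print(
    f"P(at most {YANNY_IN_CLASS} of {CLASS_SIZE} hear 'yanny' "
    f"at a {POLL_YANNY_RATE:.0%} rate) = {p_class_given_poll_rate:.5f}"
)
```

Under those assumptions the probability comes out at well under one in a thousand, so the classroom and the online poll are very unlikely to be two samples of the same listening situation.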
Next, various explanations for the differing perceptions appeared, most (incorrectly) attributing it to factors like those debunked in the three points above.
Soon a variety of alternative versions of the audio appeared, manipulating different frequencies of the audio clip – of course all of these degraded and distorted the quality of the original clip – again affecting the conditions of the ‘experiment’. The New York Times put up a slider that allowed listeners to move through various different (manipulated, degraded) versions, progressively more and more different from the original, and share their experience of where their perception changed from laurel to yanny.
The real explanation
Explaining the illusion
As mentioned above, laurel/yanny is not a normal phonetic confusion. There’s no way an unbiased perception test would yield anything close to 50% of participants consistently hearing ‘yanny’. We need psycho-social reasons to account for that large percentage (see below).
However, it is true that many responsible people report hearing ‘yanny’ some of the time, usually sporadically and uncontrollably. I confess it has happened to me a few times too. It is a weird sensation. How can we explain it? I don’t know for sure but here are some thoughts.
It is notable that the third formant in the spectrogram (the horizontal black bar that starts around 2720 Hz, then dips down and rises again) has, very roughly, a shape similar to that of the second formant of a word like ‘yanny’. It is also notable that the first and second formants of ‘laurel’ are very close together – in this recording and in any normal pronunciation of the word.
If, for whatever reason, a listener is led to interpret the third formant as the second, and the close first and second formants as the first, they might, in the right context, be led to an illusion of hearing something a bit like ‘yanny’. Maybe this is what happened to Katie when she first heard the word, and to the rest of us subsequently, influenced by her suggestion. Or maybe there’s a better explanation (if you have a suggestion, please let me know).
Of course, to be confident of this or any explanation, we’d need to be able to replicate it properly – not by distorting the sound, like the audio geeks do, but by consistently and predictably manipulating unbiased listeners’ perception of this recording and of other clear recordings of this and similar words. Unfortunately, due to the internet frenzy of recent weeks, it will be impossible to find unbiased listeners for this audio for at least a generation or two :-).
Explaining the social phenomenon
Whatever the explanation for the perceptual illusion, it is unusual and sporadic, not a consistent ambiguity. The acoustic cues for ‘laurel’ in this recording are really much stronger than those for ‘yanny’. Even if someone first hears ‘yanny’ (like Katie did), repeated careful listening should, under normal conditions, soon lead to recognition that ‘laurel’ is the right interpretation (even in the absence of external information about the origin of the clip).
Explaining the current phenomenon — a near 50/50 split in interpretations — involves some perceptual reasons but mostly psycho-social reasons, such as: relief at finding something more exciting than vocabulary homework to discuss, team spirit, peer pressure, confusion, reluctance to change from a first impression, desire to get on Ellen (see Cloe and Katie’s account above, from about 7min, and Ellen video below, from start), and so on.
What does all this have to do with forensic transcription?
The laurel/yanny thing itself is unusual and has little to do with forensic transcription. However, some of the phenomena that are being discussed along with laurel/yanny have heaps to do with forensic transcription. Here’s one example from part way through the Ellen video:
The way a sound can go from meaningless static to a clear phrase under the influence of a prime is exactly what happens with lots of indistinct covert recordings (like the one that got a man convicted for murder based on words he didn’t say).
And the fact that priming seems so amazing and unfathomable to so many people is exactly what causes so many problems when police are allowed to present their transcripts of covert recordings to a court.
To solve those problems we really do need to get to a point where priming is considered a normal, non-amazing occurrence, that is understood, at least in general terms, by everyone.
But this laurel/yanny thing is quite different – as discussed a bit above – not least because it is a very clear recording. It’s a shame these different phenomena are all being rolled together into one big incomprehensible ‘thing’.
Now to finish up, I want to take issue with another common view expressed in response to the laurel/yanny debate:
‘There’s no right answer – everyone can just interpret it the way they want to’
That’s a fine philosophy in lots of situations – but not when it comes to interpreting forensic audio (as I discuss also in relation to the Randy Newman example that I go through elsewhere on this site – check it out when you finish this as it is pretty interesting!).
With forensic audio, we need to distinguish reliably between ‘audio given the right interpretation’, ‘audio given the wrong interpretation’, ‘audio whose interpretation I personally don’t know for sure’ and ‘audio whose interpretation, in principle, no one can know for sure’. Of course, only the first of these should be put before a jury.
If the laurel/yanny example teaches us anything about forensic transcription, it is to warn us about the dangers of letting interpretation of forensic audio be a matter for social negotiation (as by a jury, for example), and about the need to not just accept a ‘first impression’.
Some of the more useful links (roughly in date order)
Laurel or Yanny? What you hear could depend on your hearing loss (NB as discussed above I disagree strongly with the concept that this phenomenon relies on hearing loss – but the recount of the Twitter conversation provided by this article is useful)
and see also links embedded above.
The full Twitter thread has a lot of profanity (try viewing in reverse order, as it gets worse as days pass) but the odd glimpse of real humour. Among other gems, I have to admit I laughed at Donald Trump saying he heard neither laurel nor yanny, but rather covfefe!