Contrary to popular belief, English has more than five or ten vowel sounds. The actual number is disputed because of disagreements over when two sounds are sufficiently distinct to be classified as separate sounds. I’ve heard some people say 15, some 17, some over 20.
I ran across a podcast episode recently that mentioned a sentence that demonstrates a different English vowel sound in each word:
Who would know naught of art must learn, act, and then take his ease [1].
The hosts noted that to get all the vowels in, you need to read the sentence with non-rhotic pronunciation, i.e. suppressing the r in art.
I’ll run this sentence through some software that returns the phonetic spelling of each word in IPA symbols to see the distinct vowel sounds that way. First I’ll use Python, then Mathematica.
Python
Let’s run this through some Python code that converts English words to IPA notation so we can look at the vowels.
import eng_to_ipa as ipa

text = "Who would know naught of art must learn, act, and then take his ease."
print(ipa.convert(text))
This gives us
hu wʊd noʊ nɔt əv ɑrt məst lərn, ækt, ənd ðɛn teɪk hɪz iz
Which includes the following vowel symbols:
1. u
2. ʊ
3. oʊ
4. ɔ
5. ə
6. ɑ
7. ə
8. ə
9. æ
10. ə
11. ɛ
12. eɪ
13. ɪ
14. i
This has some duplicates: the 5th, 7th, 8th, and 10th symbols in the list are all schwas (ə).
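The tally can also be done programmatically. What follows is a minimal sketch rather than part of the original analysis: it hard-codes the transcription printed above, and the names `vowels_in`, `DIPHTHONGS`, and `MONOPHTHONGS` are made up for illustration. It assumes the only vowel symbols present are the ones in the list above, and it treats oʊ and eɪ as single units.

```python
# IPA transcription printed by eng_to_ipa in the post (copied verbatim).
transcription = "hu wʊd noʊ nɔt əv ɑrt məst lərn, ækt, ənd ðɛn teɪk hɪz iz"

# Assumption: these are the only vowel symbols we need to recognize.
DIPHTHONGS = ("oʊ", "eɪ")
MONOPHTHONGS = set("uʊɔəɑæɛɪi")


def vowels_in(text):
    """Return the vowel symbols in text, in order, with diphthongs as units."""
    found = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in DIPHTHONGS:
            found.append(pair)
            i += 2
        elif text[i] in MONOPHTHONGS:
            found.append(text[i])
            i += 1
        else:
            i += 1  # consonant, space, or punctuation
    return found


vowels = vowels_in(transcription)
distinct = list(dict.fromkeys(vowels))  # drop repeats, keep first-seen order

print(len(vowels), "vowel symbols,", len(distinct), "distinct")
print(" ".join(distinct))
```

On the transcription above this reports 14 vowel symbols, 11 of them distinct, with ə accounting for four of the 14.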
By default the eng_to_ipa module gives one way to write each word in IPA notation. There is an optional argument, retrieve_all, that defaults to False but may return more alternatives when set to True. However, in our example the only difference is that the second alternative writes “and” as ænd rather than ənd.
It looks like the eng_to_ipa module doesn’t transcribe vowels with sufficient resolution to distinguish some of the sounds in the model sentence. For example, it doesn’t seem to distinguish the stressed sound ʌ from the unstressed ə.
Mathematica
Here’s Mathematica code to split the model sentence into words and show the IPA pronunciation of each word.
text = "who would know naught of art must \
learn, act, and then take his ease"
ipa[w_] := WordData[w, "PhoneticForm"]
Map[ipa, TextWords[text]]
This returns
{"hˈu", "wˈʊd", "nˈoʊ", "nˈɔt", "ˈʌv", "ˈɒrt", "mˈʌst", "lˈɝn", "ˈækt", "ˈænd", "ðˈɛn", "tˈeɪk", "hˈɪz", "ˈiz"}
By the way, I had to write the first word as “who” because WordData won’t lowercase it for me. If you ask for
ipa["Who"]
Mathematica will return
Missing["NotAvailable"]
though it works as expected if you send it “who” rather than “Who.”
Let’s remove the stress marks and join the words together so we can compare the Python and Mathematica output. The top line is from Python and the bottom is from Mathematica.
hu wʊd noʊ nɔt əv ɑrt məst lərn ækt ænd ðɛn teɪk hɪz iz
hu wʊd noʊ nɔt ʌv ɒrt mʌst lɝn ækt ænd ðɛn teɪk hɪz iz
There are a few differences, summarized in the table below. Since the symbols are a little difficult to tell apart, I’ve included their Unicode code points.
|-------+------------+-------------|
| Word  | Python     | Mathematica |
|-------+------------+-------------|
| of    | ə (U+0259) | ʌ (U+028C)  |
| must  | ə (U+0259) | ʌ (U+028C)  |
| art   | ɑ (U+0251) | ɒ (U+0252)  |
| learn | ə (U+0259) | ɝ (U+025D)  |
|-------+------------+-------------|
Mathematica makes some distinctions that Python missed.
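The comparison can be reproduced mechanically. The sketch below is an illustration under stated assumptions, not the code used for the post: it hard-codes both transcriptions as they appear above, strips the stress marks from the Mathematica output, and for each word that differs prints the part left after removing the common prefix and suffix, along with Unicode code points. The helper names `strip_stress`, `differing_part`, and `code_points` are hypothetical.

```python
# Word-by-word transcriptions copied from the post.
words = ("who would know naught of art must learn "
         "act and then take his ease").split()
python_words = "hu wʊd noʊ nɔt əv ɑrt məst lərn ækt ænd ðɛn teɪk hɪz iz".split()
mathematica_raw = ["hˈu", "wˈʊd", "nˈoʊ", "nˈɔt", "ˈʌv", "ˈɒrt", "mˈʌst",
                   "lˈɝn", "ˈækt", "ˈænd", "ðˈɛn", "tˈeɪk", "hˈɪz", "ˈiz"]

# Primary (U+02C8) and secondary (U+02CC) IPA stress marks.
STRESS_MARKS = "\u02c8\u02cc"


def strip_stress(word):
    """Remove IPA stress marks from a transcription."""
    return "".join(c for c in word if c not in STRESS_MARKS)


def differing_part(a, b):
    """Strip the common prefix and suffix; return what is left of each string."""
    shortest = min(len(a), len(b))
    start = 0
    while start < shortest and a[start] == b[start]:
        start += 1
    end = 0
    while end < shortest - start and a[-1 - end] == b[-1 - end]:
        end += 1
    return a[start:len(a) - end], b[start:len(b) - end]


def code_points(s):
    """Format each character of s as a Unicode code point."""
    return " ".join(f"U+{ord(c):04X}" for c in s)


mathematica_words = [strip_stress(w) for w in mathematica_raw]

differences = []
for word, p, m in zip(words, python_words, mathematica_words):
    if p != m:
        dp, dm = differing_part(p, m)
        differences.append((word, dp, dm))
        print(f"{word:6} {dp} ({code_points(dp)})  {dm} ({code_points(dm)})")
```

This flags the same four words as the table. For “learn” the differing part on the Python side is the two-character sequence ər, since Mathematica folds the r-coloring into the single symbol ɝ.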
Update: See the first comment below for variations on how the model sentence can be pronounced and how to get more distinct vowel sounds out of it.
More linguistics posts
- Estimating vocabulary size with Heaps’ law
- Writing down an unwritten language
- Chinese character frequency and entropy
[1] After writing this post I saw the sentence in writing, and the fourth word is “aught” rather than “naught.” This doesn’t change the vowel since the two words rhyme, but Mathematica doesn’t recognize the word “aught.”
The Mathematica pronunciation is entirely rhotic here — there is an /r/ consonant in “art” (presumably broad transcription actually representing /ɹ/), and an r-coloured /ɚ/ vowel in “learn” that only exists in North American and Irish rhotic dialects.
The lack of length marking also points to a specifically NA English dialect being represented: most English dialects have /iː/ or /ɪi̯/ in “ease”.
They’re also using the “weak” pronunciation of “of”, which doesn’t really help the point of the sentence. But I don’t think I’ve ever heard anyone pronounce “of” /ʌv/, strong or weak!
John Wells points out that for this sentence to work in the desired way requires strong “would”, “of”, and “must” but weak “and”, and doing so produces 14 of the 20 vowels of conservative British English: https://www.phon.ucl.ac.uk/home/wells/blog0802b.htm.
The sentence is then: /huː wʊd nəʉ̯ ɔːt ɒv ɑ̟ːt mʌst lɐːn ækt ənd ðen teɪk hɪz iːz/, lacking /aɪ/ /aʊ/ /ɔɪ/ /ɪə/ /eə/ /ʊə/.
In modern Southern Standard British English (retaining the strong pronunciations, as in careful speech), this becomes:
/hʉ̞ʉ̯ wɵd nəʊ̯ oːt ɔv ɑ̟ːt mʌst ləːn akt ənd ðɛn tɛɪ̯k hɪz ɪi̯z/
— though also note that the “must” vowel is highly variable, and can be realized anywhere in the triangle /ə/-/ʌ/-/ɑ/.
In my currently-most-used dialect, this lacks /aɪ̯/ /aʉ̯/ /ɔɪ̯/ /ɪː/ /ɛː/ /ɒʊ̯/. Relative to John Wells’ conservative set, the first five of these correspond directly, but /ʊə/ has merged with /ɔː/ to become /oː/ (represented in “aught”), and /əʊ̯/ and /ɒʊ̯/ have split (latter now not represented).
(In my native dialect, it is /ʔəʉ̯ wəd næ̈ɤ̯̈ ɔ̝əʔ əv ɑ̟ːʔ mɐ̟st lɜːn akt n̩ːen tæɪ̯k ɪz ɪi̯z/)