How well does the spelling rule “i before e except after c” hold? I searched the 5,000 most common English words (from here) to see.
70% of the words containing “ie” or “ei” follow the rule.
If you weight the words by how frequently they are used, the rule holds only 54% of the time.
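For anyone who wants to repeat this kind of count, here is a minimal sketch of how the short version of the rule could be checked. It is not the script behind the numbers above: the sample list, its frequencies, and the function names are made up for illustration, and the check looks only at spelling, so it cannot handle the sound-based versions of the rule.

```python
import re

# Hypothetical sample of (word, relative frequency) pairs.
# These are illustrative values, not the 5,000-word list used in the post.
SAMPLE = [
    ("their", 50),
    ("friend", 20),
    ("receive", 10),
    ("science", 8),
    ("believe", 12),
]

PAIR = re.compile(r"ie|ei")


def follows_short_rule(word):
    """True if every 'ie'/'ei' in the word fits 'i before e except after c'."""
    w = word.lower()
    for m in PAIR.finditer(w):
        after_c = m.start() > 0 and w[m.start() - 1] == "c"
        expected = "ei" if after_c else "ie"
        if m.group() != expected:
            return False
    return True


def rule_rates(words):
    """Return (unweighted, frequency-weighted) share of words that fit the rule."""
    relevant = [(w, f) for w, f in words if PAIR.search(w.lower())]
    if not relevant:
        raise ValueError("no words containing 'ie' or 'ei'")
    ok = [(w, f) for w, f in relevant if follows_short_rule(w)]
    unweighted = len(ok) / len(relevant)
    weighted = sum(f for _, f in ok) / sum(f for _, f in relevant)
    return unweighted, weighted


if __name__ == "__main__":
    print(rule_rates(SAMPLE))
```

In this toy sample a single very common exception ("their") pulls the weighted rate below the unweighted one, which is the same kind of effect as the drop from 70% to 54%.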
There’s a longer version of the rule that adds “or when sounding as ‘a’ as in neighbor or weigh.” This version holds for 79% of the words in my list. And when weighted by frequency, the rule holds 85% of the time.
Update: Here’s an even more accurate version from Merriam-Webster:
i before e,
except after c,
or when sounded as a,
as in ‘neighbor’ and ‘weigh’,
or when it appears in comparatives and superlatives like ‘fancier’,
or when the c sounds as sh as in ‘glacier’,
or when the vowel sounds like ee as in ‘seize’,
or i as in ‘height’,
or when it shows up in compound words such as ‘albeit’,
or when it shows up in –ing inflections of verbs that end in e, like queueing,
or occasionally in technical words that have a strong etymological link to their parent languages such as ‘cuneiform’ and ‘caffeine’,
and in numerous other random exceptions such as ‘science’, ‘forfeit’, and ‘weird.’
Might be harder to calculate, but what if you include the last part of the rule: “or when sounding like ‘ay’ as in ‘neighbor’ and ‘weigh’”?
I went back to check, and updated the post to include the longer version of the rule. I had to do it manually, so I might have made an error.
85% sounds low to me, for the full version of the rule. Could you post a short list of the most common exceptions, so I can see what I’m missing? At the moment, ‘height’ is the most common exception I can think of. The others are words with a syllable break between the e and the i, like ‘reinvigorate’ and ‘reincarnation’ and ‘deicing’, which at the time the rule was formulated were still spelled with a hyphen: ‘re-invigorate’, ‘de-icing’, etc.
Similar point to David… The version I know is “I before E, except after C, when the sound you are making is EE.”
Never failed me.
Dave: One common exception is “science” and variations such as “conscience.”
They aren’t exceptions to the long rule “when the sound is ee”. Weird is a true exception because the sound is ee and it’s not after c.
Seize and variations e.g. seizure, seizing.
I have always used “I before E, except after C, when the sound is ‘ee’, except ‘weird’”. (It makes me happy that ‘weird’ is an exception, i.e. it is a weird word.) Possibly this rule assumes British pronunciation.
That gets me far enough that it’s not the dominant source of my typos.
‘Queuing’ as spelt isn’t an exception. ‘Queueing’ works, and ‘Cooeeing’ is another fun example with lots of consecutive vowels.
A few days before reading this post, I ran across another describing this perplexing situation.
http://www.dailywritingtips.com/the-surfeit-of-weird-exceptions-to-the-i-before-e-rule/
The author’s conclusion was, “Ultimately, it may be wise to forget that such a rule exists and always check spelling of words that may have an ie or an ei combination.”
I prefer the Brian Regan variant of the rule:
I before e except after c and when sounding like a as in neighbor and weigh, and on weekends and holidays and all throughout May, and you’ll always be wrong no matter what you say!
(http://en.wikiquote.org/wiki/Brian_Regan)
Also see http://norvig.com/chomsky.html for similar, with a corpus of a trillion words. (Search for “i before e”. Nah, read the whole thing.)
@John: um, ‘science’ and ‘conscience’ aren’t exceptions, because of the “…except after c” part of the rule.
The main ‘family’ of exceptions (see the link from Christopher Grau above) seems to be words that were compound words ending in -fait in Old French and spelled funny by the Normans:
surfeit <- surfait ("overdone")
counterfeit <- contrefait ("against fact")
forfeit <- forfait ("beyond deed", transgression)
('Foreign' doesn't quite fit that pattern, but it was also spelled with an /ai/ in the original French: forain. For- as a prefix meant outside, beyond, away; originally from the Latin 'foris', out-of-doors, whence also 'forest'.)
'Heifer' is good Old English, but again was spelled with different vowels originally: haeghfore (where the ae represents the single letter 'ash').
'Weird' has also had its vowels changed, from Old English /wyrd/ which in turn derived from /weordhan/. (Dh = the letter edh.)
Dave Tate:
Cases like “science” and “conscience” (and “glacier”) are confusing, because they’re exceptions to the second phrase of the rule (“except after c”) — exceptions to the exception, as it were.
So, yeah, they’re not exceptions to the very first phrase of the rule (“i before e”), but they *are* exceptions to the usual “i before e, except after c” version.
They *are* handled correctly by the British version “i before e, except after c when sounding like ‘ee'”, though that fails for a lot of the other exceptions (weird, counterfeit, heifer, etc.).
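To make the "exception to the exception" distinction concrete, here is a rough sketch that labels each "ie" or "ei" in a word by which clause of the short rule it follows or breaks. The function name and labels are invented for illustration, and the check is purely about spelling, so it says nothing about the sound-based British version.

```python
import re

PAIR = re.compile(r"ie|ei")

# Hypothetical labels for the four possible spelling cases.
FOLLOWS_FIRST = "follows 'i before e'"
BREAKS_FIRST = "breaks 'i before e'"
FOLLOWS_SECOND = "follows 'except after c'"
BREAKS_SECOND = "breaks 'except after c'"


def classify(word):
    """Label every 'ie'/'ei' in the word by the clause it follows or breaks."""
    w = word.lower()
    labels = []
    for m in PAIR.finditer(w):
        after_c = m.start() > 0 and w[m.start() - 1] == "c"
        if m.group() == "ie":
            labels.append(BREAKS_SECOND if after_c else FOLLOWS_FIRST)
        else:
            labels.append(FOLLOWS_SECOND if after_c else BREAKS_FIRST)
    return labels


if __name__ == "__main__":
    for word in ["science", "glacier", "receive", "weird", "believe"]:
        print(word, classify(word))
```

With this labelling, "science" and "glacier" land in the "breaks 'except after c'" group, while "weird" breaks the first clause.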
I before E except in Germany
“…and on weekends and holidays, and all thru out may, and you’ll never be right no matter what you say” –Brian Regan ‘Stupid in school’ bit
“I before E except in Germany”
That may explain why it takes an Einstein to figure this out.
How many exceptions are there to this rule?
i before e except after c when the sound says ‘ee’.
And of course across 1 syllable only
I can only think of weird and seize.
That list of exceptions is very long, so I’m thinking of each exception as a factor in a factor analysis. You should see a long tail.
More variation is included as each factor gets added. I’ve done this with other applications, but I bump into multiple convergences with the x-axis which seems to tell me I have multiple small worlds, or another dimension and another pair of tails.