Last year I wrote a post about saxophone octave keys. I was surprised to discover, after playing saxophone for most of my life, that a saxophone has not one but two octave holes. Modern saxophones have one octave key, but two octave holes. Originally saxophones had a separate octave key for each octave hole; you had to use different octave keys for different notes.
I had not seen one of these old saxophones until Carlo Burkhardt sent me photos today of a Pierret Modele 5 Tenor Sax from around 1912.
Here’s a closeup of the octave keys.
And here’s a closeup of the bell where you can see the branding.
The other day I wrote about the golden angle, a variation on the golden ratio. If φ is the golden ratio, then a golden angle is 1/φ2 of a circle, approximately 137.5°, a little over a third of a circle.
Musical notes go around in a circle. After 12 half steps we’re back where we started. What would it sound like if we played intervals that went around this circle at golden angles? I’ll include audio files and the code that produced them below.
A golden interval, moving around the music circle by a golden angle, is a little more than a major third. And so a chord made of golden intervals is like an augmented major chord but stretched a bit.
An augmented major triad divides the musical circle exactly in thirds. For example, C E G#. Each note is four half steps, a major third, from the previous note. In terms of a circle, each interval is 120°. Here’s what these notes sound like in succession and as a chord.
If we go around the musical circle in golden angles, we get something like an augmented triad but with slightly bigger intervals. In terms of a circle, each note moves 137.5° from the previous note rather than 120°. Whereas an augmented triad goes around the musical circle at 0°, 120°, and 240° degrees, a golden triad goes around 0°, 137.5°, and 275°.
A half step corresponds to 30°, so a golden angle corresponds to a little more than 4.5 half steps. If we start on C, the next note is between E and F, and the next is just a little higher than A.
If we keep going up in golden intervals, we do not return to the note we stared on, unlike a progression of major thirds. In fact, we never get the same note twice because a golden interval is not a rational part of a circle. Four golden angle rotations amount to 412.5°, i.e. 52.5° more than a circle. In terms of music, going up four golden intervals puts us an octave and almost a whole step higher than we stared.
Here’s what a longer progression of golden intervals sounds like. Each note keeps going but decays so you can hear both the sequence of notes and how they sound in harmony. The intention was to create something like playing the notes on a piano with the sustain pedal down.
It sounds a little unusual but pleasant, better than I thought it would.
Here’s the Python code that produced the sound files in case you’d like to create your own. You might, for example, experiment by increasing or decreasing the decay rate. Or you might try using richer tones than just pure sine waves.
from scipy.constants import golden
import numpy as np
from scipy.io.wavfile import write
N = 44100 # samples per second
# frequency in cycles per second
# duration in seconds
# decay = half-life in seconds
def tone(freq, duration, decay=0):
t = np.arange(0, duration, 1.0/N)
y = np.sin(freq*2*np.pi*t)
if decay > 0:
k = np.log(2) / decay
y *= np.exp(-k*t)
# Scale signal to 16-bit integers,
# values between -2^15 and 2^15 - 1.
signal /= max(abs(signal))
return np.int16(signal*(2**15 - 1))
C = 262 # middle C frequency in Hz
# Play notes sequentially then as a chord
def arpeggio(n, interval):
y = np.zeros((n+1)*N)
for i in range(n):
x = tone(C * interval**i, 1)
y[i*N : (i+1)*N] = x
y[n*N : (n+1)*N] += x
# Play notes sequentially with each sustained
def bell(n, interval, decay):
y = np.zeros(n*N)
for i in range(n):
x = tone(C * interval**i, n-i, decay)
y[i*N:] += x
major_third = 2**(1/3)
golden_interval = 2**(golden**-2)
write("augmented.wav", N, to_integer(arpeggio(3, major_third)))
write("golden_triad.wav", N, to_integer(arpeggio(3, golden_interval)))
write("bell.wav", N, to_integer(bell(9, golden_interval, 6)))
When I was in high school, one year I made the Region choir. I had no intention of competing at the next level, Area, because I didn’t think I stood a chance of going all the way to State, and because the music was really hard: Stravinsky’s Symphony of Psalms.
My choir director persuaded me to try anyway, with just a few days before auditions. That wasn’t enough time for me to learn the music with all its strange intervals. But I tried out. I sang the whole thing. As awful as it was, I kept going. It was about as terrible as it could be, just good enough to not be funny. I wanted to walk out, and maybe I should have out of compassion for the judges, but I stuck it out.
I was proud of that audition, not as a musical achievement, but because I powered through something humiliating.
I did better in band than in choir. I made Area in band and tried out for State but didn’t make it. I worked hard for that one and did a fair job, but simply wasn’t good enough.
That turned out well. It was my senior year, and I was debating whether to major in math or music. I’d told myself that if I made State, I’d major in music. I didn’t make State, so I majored in math and took a few music classes for fun. We can never know how alternative paths would have worked out, but it’s hard to imagine that I would have succeeded as a musician. I didn’t have the talent or the temperament for it.
When I was in college I wondered whether I should have done something like acoustical engineering as a sort of compromise between math and music. I could imagine that working out. Years later I got a chance to do some work in acoustics and enjoyed it, but I’m glad I made a career of math. Applied math has given me the chance to work in a lot of different areas—to play in everyone else’s back yard, as John Tukey put it—and I believe it suits me better than music or acoustics would have.
For many years, rivals University of Texas and Texas A&M University played each other in football on Thanksgiving. In 1999, the game fell one week after the collapse of the Aggie Bonfire killed 12 A&M students and injured 27.
The University of Texas band’s half time show that year was a beautiful tribute to the fallen A&M students.
Amplitude modulated signals sound rough to the human ear. The perceived roughness increases with modulation frequency, then decreases, and eventually disappears. The point where roughness reaches is maximum depends on the the carrier signal, but for a 1 kHz tone roughness reaches a maximum for modulation at 70 Hz. Roughness also increases as a function of modulation depth.
Amplitude modulation multiplies a carrier signal by
1 + d sin(2π f t)
where d is the modulation depth, f is the modulation frequency, and t is time.
Here are some examples you can listen to. We use a pure 1000 Hz tone and Gaussian white noise as carriers, and vary modulation depth and frequency continuously over 10 seconds. he modulation depth example varies depth from 0 to 1. Modulation frequency varies from 0 to 120 Hz.
First, here’s a pure tone with increasing modulation depth.
Next we vary the modulation frequency.
Now we switch over to Gaussian white noise, first varying depth.
And finally white noise with varying modulation frequency. This one sounds like a prop-driven airplane taking off.
Fluctuation strength is similar to roughness, though at much lower modulation frequencies. Fluctuation strength is measured in vacils (from vacilare in Latin or vacillate in English). Police sirens are a good example of sounds with high fluctuation strength.
Fluctuation strength reaches its maximum at a modulation frequency of around 4 Hz. For much higher modulation frequencies, one perceives roughness rather than fluctuation. The reference value for one vacil is a 1 kHz tone, fully modulated at 4 Hz, at a sound pressure level of 60 decibels. In other words
(1 + sin(8πt)) sin(2000πt)
where t is time in seconds.
Since the carrier frequency is 250 times greater than the modulation frequency, you can’t see both in the same graph. In this plot, the carrier is solid blue compared to the modulation.
Here’s what the reference for one vacil would sound like:
Acoustic roughness is measured in aspers (from the Latin word for rough). An asper is the roughness of a 1 kHz tone, at 60 dB, 100% modulated at 70 Hz. That is, the signal
(1 + sin(140πt)) sin(2000πt)
where t is time in seconds.
Here’s what that sounds like (if you play this at 60 dB, about the loudness of a typical conversation at one meter):
And here’s the Python code that made the file:
from scipy.io.wavfile import write
from numpy import arange, pi, sin, int16
def f(t, f_c, f_m):
# t = time
# f_c = carrier frequency
# f_m = modulation frequency
return (1 + sin(2*pi*f_m*t))*sin(2*f_c*pi*t)
# Take samples in [-1, 1] and scale to 16-bit integers,
# values between -2^15 and 2^15 - 1.
return int16(signal*(2**15 - 1))
N = 48000 # samples per second
x = arange(3*N) # three seconds of audio
# 1 asper corresponds to a 1 kHz tone, 100% modulated at 70 Hz, at 60 dB
data = f(x/N, 1000, 70)
write("one_asper.wav", N, to_integer(data))
This afternoon I was working on a project involving tonal prominence. I stepped away from the computer to think and was interrupted by the sound of a leaf blower. I was annoyed for a second, then I thought “Hey, a leaf blower!” and went out to record it. A leaf blower is a great example of a broad spectrum noise with strong tonal components. Lawn maintenance men think you’re kinda crazy when you say you want to record the noise of their equipment.
The tuner app on my phone identified the sound as an A3, the A below middle C, or 220 Hz. Apparently leaf blowers are tenors.
Here’s a short audio clip:
And here’s what the spectrum looks like. The dashed grey vertical lines are at multiples of 55 Hz.
The peaks are perfectly spaced at multiples of the fundamental frequency of 55 Hz, A1 in scientific pitch notation. This even spacing of peaks is the fingerprint of a definite tone. There’s also a lot of random fluctuation between peaks. That’s the finger print of noise. So together we hear a pitch and noise.
When using the tone-to-noise ratio algorithm from the ECMA-74, only the spike at 110 Hz is prominent. A limitation of that approach is that it only considers single tones, not how well multiple tones line up in a harmonic sequence.
This post looks at loudness and sharpness, two important psychoacoustic metrics. Because they have to do with human perception, these factors are by definition subjective. And yet they’re not entirely subjective. People tend to agree on when, for example, one sound is twice as loud as another, or when one sound is sharper than another.
Loudness is the psychological counterpart to sound pressure level. Sound pressure level is a physical quantity, but loudness is a psychoacoustic quantity. The former has to do with how a microphone perceives sound, the latter how a human perceives sound. Sound pressure level in dB and loudness in phon are roughly the same for a pure tone of 1 kHz. But loudness depends on the power spectrum of a sound and not just it’s sound pressure level. For example, if a sound’s frequency is too high or too low to hear, it’s not loud at all! See my previous post on loudness for more background.
Let’s take the four guitar sounds from the previous post and scale them so that each has a sound pressure level of 65 dB, about the sound level of an office conversation, then rescale so the sound pressure is 90 dB, fairly loud though not as loud as a rock concert. [Because sound perception is so nonlinear, amplifying a sound does not increase the loudness or sharpness of every component equally.]
While all four sounds have the same sound pressure level, the undistorted sounds have the lowest loudness. The distorted sounds are louder, especially the single note. Increasing the sound pressure level from 65 dB to 90 dB increases the loudness of each sound by roughly the same amount. This will not be true of sharpness.
Sharpness is related how much a sound’s spectrum is in the high end. You can compute sharpness as a particular weighted sum of the specific loudness levels in various bands, typically 1/3-octave bands. This weight function that increases rapidly toward the highest frequency bands. For more details, see Psychoacoustics: Facts and Models.
The table below gives sharpness, measured in acum, for the four guitar sounds at 65 dB and 90 dB.
Although a clean chord sounds a little louder than a single note, the former is a little sharper. Distortion increases sharpness as it does loudness. The single note with distortion is a little louder than the other sounds, but much sharper than the others.
Notice that increasing the sound pressure level increases the sharpness of the sounds by different amounts. The sharpness of the last sound hardly changes.
The other day I asked on Google+ if someone could make an audio clip for me and Dave Jacoby graciously volunteered. I wanted a simple chord on an electric guitar played with varying levels of distortion. Dave describes the process of making the recording as
Kettledrums (a.k.a. tympani) produce a definite pitch, but in theory they should not. At least the simplest mathematical model of a kettledrum would not have a definite pitch. Of course there are more accurate theories that align with reality.
Unlike many things that work in theory but not in practice, kettledrums work in practice but not in theory.
A musical sound has a definite pitch when the first several Fourier components are small integer multiples of the lowest component, the fundamental. A pitch we hear at 100 Hz would have a first overtone at 200 Hz, the second at 300 Hz, etc. It’s the relative strengths of these components give each instrument its characteristic sound.
An ideal string would make a definite pitch when you pluck it. The features of a real string discarded for the theoretical simplicity, such as stiffness, don’t make a huge difference to the tonality of the string.
An ideal circular membrane would vibrate at frequencies that are much closer together than consecutive integer multiples of the fundamental. The first few frequencies would be at 1.594, 2.136, 2.296, 2.653, and 2.918 times the fundamental. Here’s what that would sound like:
This isn’t an accurate simulation of tympani sounds, just something simple but more realistic than the vibrations of an idea membrane.
The real world complications of a kettledrum spread out its Fourier components to make it have a more definite pitch. These include the weight of air on top of the drum, the stiffness of the drum head, the air trapped in the body of the drum, etc.
If you’d like to read more about how kettle drums work, you might start with The Physics of Kettledrums by Thomas Rossing in Scientific American, November 1982.
What do tuning a guitar and tuning a radio have in common? Both are examples of beats or amplitude modulation.
In an earlier post I wrote about how beats come up in vibrating systems, such as a mass and spring combination or an electric circuit. Here I look at examples from music and radio.
When two musical instruments play nearly the same note, they produce beats. The number of beats per second is the difference in the two frequencies. So if two flutes are playing an A, one playing at 440 Hz and one at 442 Hz, you’ll hear a pitch at 441 Hz that beats two times a second. Here’s a wave file of two pure sine waves at 440 Hz and 442 Hz.
As the players come closer to being in tune, the beats slow down. Sometimes you don’t have two instruments but two strings on the same instrument. Guitarists listen for beats to tell when two strings are playing the same note with the same pitch.
The same principle applies to AM radio. A message is transmitted by multiplying a carrier signal by the content you want to broadcast. The beats are the content. As we’ll see below, in some ways the musical example and the AM radio example are opposites. With tuning, we start with two sources and create beats. With AM radio, we start by creating beats, then see that we’ve created two sources, the sidebands of the signal.
Both examples above relate to the following trig identity:
cos(a–b) + cos(a+b) = 2 cos a cos b
And because we’re looking at time-varying signals, slip in a factor of 2πt:
cos(2π(a–b)t) + cos(2π(a+b)t) = 2 cos 2πat cos 2πbt
In the case of two pure tones, slightly out of tune, let a = 441 and b = 1. Playing an A 440 and an A 442 at the same time results in an A 441, twice as loud, with the amplitude going up and down like cos 2πt, i.e. oscillating two times a second. (Why two times and not just once? One beat for the maximum and and one for the minimum of cos 2πt.)
It may be hard to hear beats because of complications we’ve glossed over. Musical instruments are never perfectly in phase, but more importantly they’re not pure tones. An oboe, for example, has strong components above the fundamental frequency. I used a flute in this example because although its tone is not simply a sine wave, it’s closer to a sine wave than other instruments, especially when playing higher notes. Also, guitarists often compare the harmonics of two strings. These are purer tones and so it’s easier to hear beats between them.
For the case of AM radio, read the equation above from right to left. Let a be the frequency of the carrier wave. For example if you’re broadcasting on AM station 700, this means 700 kHz, so a = 700,000. If this station were broadcasting a pure tone at 440 Hz, b would be 440. This would produce sidebands at 700,440 Hz and 699,560 Hz.
In practice, however, the carrier is not multiplied by a signal like cos 2πbt but by 1 + m cos 2πbt where |m| < 1 to avoid over-modulation. Without this extra factor of 1 the signal would be 100% modulated; the envelope of the signal would pinch all the way down to zero. By including the factor of 1 and using a modulation index m less than 1, the signal looks more like the image above, with the envelope not pinching all the way down. (Over-modulation occurs when m > 1. Instead of the envelope pinching to zero, the upper and lower parts of the envelop cross.)
I’ve played saxophone since I was in high school, and I thought I knew how saxophones work, but I learned something new this evening. I was listening to a podcast  on musical acoustics and much of it was old hat. Then the host said that a saxophone has two octave holes. Really?! I only thought there was only one.
When you press the octave key on the back of a saxophone with your left thumb, the pitch goes up an octave. Sometimes this causes a key on the neck to open up and sometimes it doesn’t . I knew that much.
I thought that when this key didn’t open, the octaves work like they do on a flute: no mechanical change to the instrument, but a change in the way you play. And to some extent this is right: You can make the pitch go up an octave without using the octave key. However, when the octave key is pressed there is a second hole that opens up when the more visible one on the neck closes.
According to the podcast, the first saxophones had two octave keys to operate with your thumb. You had to choose the correct octave key for the note you’re playing. Modern saxophones work the same as early saxophones except there is only one octave key controlling two octave holes.
* * *
 Musical Acoustics from The University of Edinburgh, iTunes U.
 On the notes written middle C up to A flat, the octave key raises the little hole I wasn’t aware of. For higher notes the octave key raises the octave hole on the neck.