Uncategorized

Spectral flatness

White noise has a flat power spectrum. So a reasonable way to measure how close a sound is to being pure noise is to measure how flat its spectrum is.

Spectral flatness is defined as the ratio of the geometric mean to the arithmetic mean of a power spectrum.

The arithmetic mean of a sequence of n items is what you usually think of as a mean or average: add up all the items and divide by n.

The geometric mean of a sequence of n items is the nth root of their product. You could calculate this by taking the arithmetic mean of the logarithms of the items, then taking the exponential of the result. What if some items are negative? Since the power spectrum is the squared absolute value of the FFT, it can’t be negative.

So why should the ratio of the geometric mean to the arithmetic mean measure flatness? And why pure tones have “unflat” power spectra?

If a power spectrum were perfectly flat, i.e. constant, then its arithmetic and geometric means would be equal, so their ratio would be 1. Could the ratio ever be more than 1? No, because the geometric mean is always less than or equal to the arithmetic mean, with equality happening only for constant sequences.

In the continuous realm, the Fourier transform of a sine wave is a pair of delta functions. In the discrete realm, the FFT will be large at two points and small everywhere else. Why should a concentrated function have a small flatness score? If one or two of the components are 1’s and the rest are zeros, the geometric mean is zero. So the ratio of geometric and arithmetic means is zero. If you replace the zero entries with some small ε and take the limit as ε goes to zero, you get 0 flatness as well.

Sometimes flatness is measured on a logarithmic scale, so instead of running from 0 to 1, it would run from -∞ to 0.

Let’s compute the flatness of some examples. We’ll take a mixture of a 440 Hz sine wave and Gaussian white noise with varying proportions, from pure sine wave to pure noise. Here’s what the flatness looks like as a function of the proportions.

spectral flatness

The curve is fairly smooth, though there’s some simulation noise at the right end. This is because we’re working with finite samples.

Here’s what a couple of these signals would sound like. First 30% noise and 70% sine wave:

(download)

And now the other way around, 70% noise and 30% sine wave:

(download)

Why does the flatness of white noise max out somewhere between 0.5 and 0.6 rather than 1? White noise only has a flat spectrum in expectation. The expected value of the power spectrum at every frequency is the same, but that won’t be true of any particular sample.

Related posts:

Linear or not, random or not, at different levels

Linear vs nonlinear

I’ve run across a lot of ambiguity lately regarding systems described as “nonlinear.” Systems typically have several layers, and are linear at one level and nonlinear at another, and authors are not always clear about which level they’re talking about.

For example, I recently ran across something called a “nonlinear trapezoid filter.” My first instinct was to parse this as

nonlinear (trapezoid filter)

though I didn’t know what that meant. On closer inspection, I think they meant

(nonlinear trapezoid) filter

which is a linear filter, formed by multiplying a spectrum by a “nonlinear trapezoid,” a function whose graph looks like a trapezoid except one of the sides is slightly curved.

One of the things that prompted this post was a discussion with a client about Kalman filters and particle filters. You can have linear methods for tracking nonlinear processes, nonlinear methods for tracking nonlinear processes, etc. You have to be careful in this context when you call something “linear” or not.

Random vs deterministic

There’s a similar ambiguity around whether a system is random or deterministic. Systems are often deterministic at one level and random at another. For example, a bandit design is a deterministic rule for conducting an experiment with random outcomes. Simulations can be even more confusing because there could be several alternating layers of randomness and determinism. People can talk past each other while thinking they’re being clear. They can say something about variance, for example, and other other person nods their head, though they’re thinking about the variance of two different things.

As simple as it sounds, you can often help a team out by asking them to be more explicit when they say something is linear or nonlinear, random or deterministic. In a sense this is nothing new: it helps to be explicit. But I’m saying a little more than that, suggesting a couple particular areas to watch out for, areas where it’s common for people to be vague when they think they’re being clear.

Related posts

Alphamagic squares in French

In earlier blog posts I give examples of alphamagic squares in English and in Spanish. This post looks at French.

French flag

The Wikipedia article on alphamagic squares, quoting The Universal Book of Mathematics, says

French allows just one 3 × 3 alphamagic square involving numbers up to 200, but a further 255 squares if the size of the entries is increased to 300.

My script did not find a French alphamagic with entries no larger than 200, but it did find 254 squares with entries no larger than 300. (For each of these squares there are 7 more variations you can form by rotation and reflection.)

The French alphamagic square with the smallest maximum entry that I found is the following.

[[3, 204, 102], [202, 103, 4], [104, 2, 203]]

When spelled out in French we get

[[trois, deux cent quatre, cent deux], [deux cent deux, cent trois, quatre], [cent quatre, deux, deux cent trois]]

And when we replace each cell with its number of letters we get the following magic square:

[[ 5, 14, 8], [12, 9, 6], [10, , 13]]

A list of all the French magic squares with maximum element 300 that I found is available here.

Acoustic roughness

When two pure tones are nearly in tune, you hear beats. The perceived pitch is the average of the two pitches, and you hear it fluctuate as many times per second as the difference in frequencies. For example, an A 438 and an A 442 together sound like an A 440 that beats four times per second. (Listen)

As the difference in pitches increases, the combined tone sounds rough and unpleasant. Here are sound files combining two pitches that differ by 16 Hz and 30 Hz.

16 Hz:

30 Hz:

The sound becomes more pleasant as the tones differ more in pitch. Here’s an example of pitches differing by 100 Hz. Now instead of hearing one rough tone, we hear two distinct tones in harmony. The two notes are at frequencies 440-50 Hz and 440+50 Hz, approximately the G and B above middle C.

100 Hz:

If we separate the tones even further, we hear one tone again. Here we separate the tones by 300 Hz. Now instead of hearing harmony, we hear only the lower tone, 440+150 Hz. The upper tone, 440+150 Hz, changes the quality of the lower tone but is barely perceived directly.

300 Hz:

We can make the previous example sound a little better by making the separation a little smaller, 293 Hz. Why? Because now the two tones are an octave apart rather than a little more than an octave. Now we hear the D above middle C.

293 Hz:

Update: Here’s a continuous version of the above examples. The separation of the two pitches at time t is 10t Hz.

Continuous:

Here’s Python code that produced the .wav files. (I’m using Python 3.5.1. There was a comment on an earlier post from someone having trouble using similar code from Python 2.7.)

from scipy.io.wavfile import write
from numpy import arange, pi, sin, int16, iinfo

N = 48000 # sampling rate per second
x = arange(3*N) # 3 seconds of audio

def beats(t, f1, f2):
    return sin(2*pi*f1*t) + sin(2*pi*f2*t)

def to_integer(signal):
    # Take samples in [-1, 1] and scale to 16-bit integers
    m = iinfo(int16).max
    M = max(abs(signal))
    return int16(signal*m/M)

def write_beat_file(center_freq, delta):
    f1 = center_freq - 0.5*delta
    f2 = center_freq + 0.5*delta    
    file_name = "beats_{}Hz_diff.wav".format(delta)
    write(file_name, N, to_integer(beats(x/N, f1, f2)))

write_beat_file(440, 4)
write_beat_file(440, 16)
write_beat_file(440, 30)
write_beat_file(440, 100)
write_beat_file(440, 293)

In my next post on roughness I get a little more quantitative, giving a power law for roughness of an amplitude modulated signal.

Related: Psychoacoustics consulting

Paying for doughnuts with a phone

At a doughnut shop today, I noticed the people at the head of each line were using their phones, either to pay for an order or to use a coupon. I thought how ridiculous it would sound if I were to go back twenty or thirty years and tell my mother about this.

Me: Some day people are going to pay for doughnuts with a phone.

Mom: You mean like calling up a doughnut shop to place an order? We already do that.

Me: No, they’re going to take their phone into the doughnut shop and pay with it.

Mom: Good grief. Why not just use cash?

Me: Well, they could. But it’ll be easy to use their phones since most people will carry them around all the time anyway.

Mom: People will carry around phones?!

Me: Sorta. More like computers, that can also make phone calls.

Mom: People will carry around computers?!!

old PC

Me: Not really. I was just making that up. But they will drive flying cars.

Mom: OK, I could see that. That’ll be nice.

Coffee after Obama

This morning I had coffee and a generous slice of squash bread at Top Pot Doughnuts in Seattle. When I went to pick up my coffee I saw a large photo of President Obama ordering coffee at the same place.

This is the second time I’ve been at a shop that proudly remembered our president stopping by. The first was a diner in San Juan that I cannot remember the name of. They put a plaque on the table President Obama sat at. If Top Pot did too, I missed it.

I enjoyed a few minutes of quiet this morning, something President Obama certainly did not have a chance to do here. One wall of the shop is lined with old books, so naturally that’s where I sat. I pulled down a small book that bound the December 1942 issues of My Garden magazine.

I find the ads in old magazines interesting for what they say between the lines. The ads are more interesting than the articles, the opposite of contemporary magazines, though that depends on the magazine.

One thing that struck me was the small ratio of advertisement to content. I suppose the original subscribers didn’t notice this, or maybe thought the magazine was putting in too many ads. I imagine they would be shocked by magazines that squeeze articles in between the ads.

Another thing that struck me was the somber but hopeful tone of the ads. Ford had a series of ads entitled “Until then …”. The ads seemed to say “We know nobody is going to be buying a car any time soon. But some day this war will be over, and when it is, we hope you’ll remember us.” Other companies tried to make the best of rationing. For example, one ad for shaving cream said that it was a good thing that their product worked so well since there are limits on how much shaving cream you can buy.

It was a pleasant way to start the day, without Secret Service agents buzzing around and without rationing limits on things like coffee and squash bread.

Visualizing search keyword overlap

The other day someone asked me to look at the search data for Debug Pest Control, a pest management company based in Rhode Island. One of the things I wanted to visualize was how the search terms overlapped with each other.

To do this, I created a graph where the nodes are the keywords and edges join nodes that share a word. (A “keyword” is actually a set of words, whatever someone types into a search.) The result was a solid gray blob be cause the the nodes are so densely connected.

I thinned the graph by first only looking at keywords that have occurred at least twice. Then I made my connection criteria a little stronger. I take the union of the individual words in two keywords and create a connection if at least 1/3 of these words appear in both keywords.

(You can click on the image to see a larger version.)

The area of each red dot is proportional to the number of times someone has come to the site using that keyword. You can see that there’s long-tail behavior: while some dots are much larger than others, there are a lot of little dots.

In case you’re curious, these are the top keywords:

  1. pest control
  2. ri
  3. pest control ri
  4. pest control rhode island,
  5. plants that keep bees away
  6. poisonous snakes in ri
  7. plants that keep bees and wasps away
  8. plants to keep bees away
  9. plants that keep wasps away
  10. pest control companies

 

The keywords were filtered a little before making the graph. Some of the top keywords contained “debug.” These may have been people using the term to refer to eliminating pests from their property, but more likely they were people who knew the company name. By removing “debug” from each of the keywords, the results focus on people searching for the company’s services rather than the company itself.

Click to learn more about consulting for complex networks

 

Seven questions a statistician could answer for a lawyer

A statistician could help a lawyer answer the following questions.

  1. Was this data collected in a proper way?
  2. Does common sense apply here, or is there something subtle going on?
  3. What conclusions can we draw from the data?
  4. Is this analysis routine or is there something unusual about it?
  5. How much confidence can we place in the conclusions?
  6. How can you explain this to a jury?
  7. Did opposing counsel analyze their data properly?

Related:

Visualizing the DFT matrix

The discrete Fourier transform (DFT) of length N multiplies a vector by a matrix whose (j, k) entry is ωjk where ω = exp(-2πi/N), with j and k running from 0 to – 1. Each element of the matrix is a rotation, so if N = 12, we can represent each element by an hour on a clock. The angle between the hour hand and minute hand corresponds to the phase of the matrix entry. We could also view each element as a color around a color wheel. The image below does both.

The matrix representing the inverse of the DFT is the conjugate of the DFT matrix (divided by Nf, but we’re only looking at phase here, so we can ignore this rescaling.) The image below displays the DFT matrix on the left and it’s inverse on the right.

Taking the conjugate amounts to making all the clocks run backward.

The DFT is often called the FFT. Strictly speaking, the FFT is an algorithm for computing the DFT. Nobody computes a DFT by multiplying by the DFT matrix, because the FFT is faster. The DFT matrix has a lot of special structure, which the FFT takes advantage of to compute the product faster than using ordinary matrix multiplication.

By the way, there are Unicode characters for clock times on the hour, U+1F550 through U+1F55B. I created the image above by writing a script that put the right characters in a table. The colors have HSL values where H is proportional to the angle and S = L =0.8.

Click to learn more about consulting help with signal processing

Finding 2016 in pi

2016 appears in π starting at the 7173rd decimal place:

You can confirm this with Mathematica or Wolfram Alpha:

        Mod[ Floor[10^7177 pi] , 10000]

I found it using the following Python code:

        >>> from sympy import pi
        >>> digits = str(pi.evalf(10000))[2:]
        >>> digits.find('2016')
        7173

By the way, it’s also true that 2016 = 1 + 2 + 3 + … + 63.

Automate to save mental energy, not time

Automation doesn’t always save as much time or effort as we expect.

The xkcd cartoon above is looking at automation as an investment. Does the work I put in now eventually save more work than I put into it? Automation may be well worth it even if the answer is “no.”

Automation can be like a battery as well as an investment. Putting energy into batteries is a bad investment; you’ll never get out as much energy as you put in. But that’s not why you put energy into batteries. You put energy in while you can so you can use some of that energy later when you need it.

Write automation scripts when you have the time, energy, and motivation to do so and when nothing else is more important. (Or nothing is more interesting, if you’re looking for a way to procrastinate without feeling too guilty. This is called “moral compensation.”) You may indeed save more work than you put into writing the scripts. But you also may save mental energy just when you need it.

Suppose it takes you an hour to write a script that only saves you two minutes later. If that two minutes would have derailed your concentration at a critical moment, but it didn’t because you had the script, writing the script may have paid for itself, even though you invested 60 minutes to save 2 minutes.

If your goal is to save mental energy, not time, you have a different strategy for automation. If a script executes faster than a manual process, but it takes a long time to remember where to find the script and how to run it, it may be a net loss. The less often you run a script, the chattier the interface should be.

The same considerations apply to learning third party software. I suspect the time I’ve put into learning some features of Emacs, for example, will not pay for itself in terms of time invested versus time saved. But I’ve invested leisure time to save time when I’m working hard, not to save keystrokes but to save mental energy for the project at hand.

Related post: How much does typing speed matter?

Click to learn more about math and computing consulting

Restarting @DSP_fact, ending @PerlRegex

I’m making a couple changes to my Twitter accounts.

First, I’m winding down @PerlRegex. I’ll stop tweeting there when my scheduled tweets run out. I suggest that everyone who has been following @PerlRegex start following @RegexTip instead. The latter is more general, but is mostly compatible with Perl.

Second, I’m reviving my @DSP_Fact. I stopped tweeting there a couple years ago, but I’d like to start posting there again. This time it’s going to be a little broader. I intend to include some material on acoustics, Fourier analysis (continuous and discrete), and maybe some other related material.

DSP_fact logo