Frequency shift keying (FSK) spectrum

Posted on 13 March 2022 by John

This post will look at encoding digital data as an analog signal using frequency shift keying (FSK), first directly and then with windowing. We’ll look at the spectrum of the encoded signal and show that basic FSK uses much less bandwidth than direct encoding, but more bandwidth than FSK with windowing.

Square waves

The most natural way to encode binary data as an analog signal would be to represent 0s and 1s by a sequence of pulses that take on the values 0 and 1.

A problem with this approach is that it would require a lot of bandwidth.

In theory a square wave has infinite bandwidth: its Fourier series has an infinite number of non-zero coefficients. In practice, the bandwidth of a signal is determined by how many Fourier coefficients it has above some threshold. The threshold would depend on context, but let’s say we ignore Fourier components with amplitude smaller than 0.001.

As I wrote about here, the nth Fourier sine series coefficient for a square wave is equal to 4/(nπ) for odd n. This means we would need on the order of 1,000 terms before the coefficients drop below our threshold.
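
Here’s a quick sanity check of that count, a minimal sketch that assumes the 4/(nπ) formula and the 0.001 threshold from above. The variable names are just for illustration.

```python
import math

threshold = 0.001  # cutoff chosen above

# Walk through odd n until 4/(n pi) first drops below the threshold.
n = 1
while 4 / (n * math.pi) >= threshold:
    n += 2

print(n)  # first odd harmonic whose coefficient is below the threshold
```

The loop stops a little past n = 4000/π ≈ 1273, consistent with the estimate of roughly 1,000 terms.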

Frequency shift keying

The rate of convergence of the Fourier series for a function f depends on the smoothness of f. Discontinuities, like the jump in a square wave, correspond to slow convergence, i.e. high bandwidth. We can save bandwidth by encoding our data with smoother functions.

So instead of jumping from 0 to 1, we’ll encode a 0 as a signal of one frequency and a 1 as a signal with another frequency. By changing the frequency after some whole number of periods, the resulting function will be continuous, and so will have smaller bandwidth.

Suppose we have a one second signal f(t) that is made of half a second of a 4 Hz signal and half a second of a 6 Hz signal, possibly encoding a 0 followed by a 1.

What would the Fourier coefficients look like? If we just had a 4 Hz sine wave, the Fourier series would have only one component: the signal itself at 4 Hz. If we just had a 6 Hz sine wave, the only Fourier component would again be the signal itself. There would be no sine components at other frequencies, and no cosine components.

But our signal patched together by concatenating 4 Hz and 6 Hz signals has non-zero cosine terms for every odd n, and these coefficients decay like O(1/n²).

Our Fourier series is

f(t) = 0.25 sin 8πt + 0.25 sin 12πt + 0.0303 cos 2πt + 0.1112 cos 6πt − 0.3151 cos 10πt + 0.1083 cos 14πt + …

We need to go out to 141 Hz before the coefficients drop below 0.001. That’s a lot of coefficients, but it’s an order of magnitude fewer coefficients than we’d need for a square wave.
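
The leading coefficients can be checked numerically. The sketch below is not necessarily how the numbers in this post were computed; it assumes a plain midpoint rule, and it takes the integrals of f(t) cos(2πnt) and f(t) sin(2πnt) over the one-second period without the conventional factor of 2, since that is the scaling that reproduces the values quoted above. The function names are made up for illustration.

```python
import math

def f(t):
    """One second of FSK: 4 Hz on [0, 1/2), then 6 Hz on [1/2, 1]."""
    return math.sin(8 * math.pi * t) if t < 0.5 else math.sin(12 * math.pi * t)

def inner(n, trig, m=10000):
    """Midpoint-rule estimate of the integral of f(t) * trig(2 pi n t) over [0, 1]."""
    h = 1.0 / m
    total = 0.0
    for k in range(m):
        t = (k + 0.5) * h
        total += f(t) * trig(2 * math.pi * n * t)
    return total * h

cos_coeffs = {n: inner(n, math.cos) for n in (1, 2, 3, 5, 7)}
sin_coeffs = {n: inner(n, math.sin) for n in (4, 6)}

for n, c in cos_coeffs.items():
    print(f"cos term at {n} Hz: {c:+.4f}")
for n, s in sin_coeffs.items():
    print(f"sin term at {n} Hz: {s:+.4f}")
```

The cosine terms show up only at odd n, and the even cosine term included for comparison comes out as zero to within the accuracy of the integration.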

Pulse shaping

Although our function f is continuous, it is not differentiable. The left-hand derivative at 1/2 is 8π and the right-hand derivative is 12π. If we could replace f with a related function that is differentiable at 1/2, presumably the signal would require less bandwidth.

We could do this by multiplying both halves of our signal by a windowing function. This is called pulse shaping because instead of a simple sine wave, we change the shape of the wave, tapering it at the ends.

Let’s use a cosine window because that’ll be easy; in practice you’d probably use a different window [1].

Now our function is differentiable at 1/2, and its Fourier series converges more quickly. Now we can disregard components above 40 Hz. With a smoother windowing function the windowed function would have more derivatives and we could disregard more of the high frequencies.
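
To see the effect of the window on the high-frequency content, here is a rough numerical comparison. It is a sketch under stated assumptions: the window is taken to be |sin 2πt|, the integrals use a midpoint rule over one period without the factor of 2, and the band from 41 to 50 Hz is an arbitrary choice for illustration.

```python
import math

def f(t):
    """4 Hz for the first half second, 6 Hz for the second half."""
    return math.sin(8 * math.pi * t) if t < 0.5 else math.sin(12 * math.pi * t)

def windowed(t):
    """f tapered by a cosine window on each half, i.e. multiplied by |sin 2 pi t|."""
    return abs(math.sin(2 * math.pi * t)) * f(t)

def magnitude(func, n, m=10000):
    """Size of the nth Fourier component from midpoint-rule integrals over [0, 1]."""
    h = 1.0 / m
    c = s = 0.0
    for k in range(m):
        t = (k + 0.5) * h
        value = func(t)
        c += value * math.cos(2 * math.pi * n * t)
        s += value * math.sin(2 * math.pi * n * t)
    return math.hypot(c * h, s * h)

def peak(func, lo, hi):
    """Largest component magnitude for lo <= n < hi."""
    return max(magnitude(func, n) for n in range(lo, hi))

peak_plain = peak(f, 41, 51)
peak_windowed = peak(windowed, 41, 51)
print(peak_plain, peak_windowed)
```

A band is compared rather than a single frequency because the two signals put their energy at different harmonics: the unwindowed signal at odd n, the windowed one at even n.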

[1] This kind of window is called a cosine window because you multiply your signal by one lobe of a cosine function, with the peak in the middle of the signal. Since we’re doing this over [0, 1/2] and again over [1/2, 1], we’re actually multiplying by |sin 2πt|.

Aliasing in a nutshell

Posted on 9 February 2022 by John

Suppose you have a sine wave with frequency f₀ Hz,

$S(t) = \sin(2\pi f_0 t).$

We’re going to discretize this signal by sampling it f_s times per second. That is, we’re going to evaluate S at integer multiples of

$h = \frac{1}{f_s}.$

The result is the sequence

$\sin(2\pi f_0 hn)$

where n runs through the integers.

Next, let k be an integer and consider the sine wave

$A(t) = \sin(2\pi(f_0 + k f_s)t).$

(Foreshadowing: A is for “alias.”)

Now let’s sample A at the same frequency as S, i.e. f_s Hz. This gives us the sequence

$\sin(2\pi(f_0 + kf_s)hn).$

A short derivation shows

$\begin{align*} \sin(2\pi(f_0 + kf_s)hn) &= \sin(2\pi\left(f_0 + k/h\right)hn) \\ &= \sin(2\pi f_0 hn + 2\pi kn) \\ &= \sin(2\pi f_0 hn) \end{align*}$

which is exactly what we got from sampling S.

To recap, sampling a signal of f₀ Hz at a rate of f_s Hz produces the same samples as sampling a signal of f₀ + kf_s Hz at the same rate for any integer k.

So, for example, if we’re sampling signals at 1000 samples per second, then we’ll get the same samples whether we’re sampling a signal of 440 Hz or 1440 Hz or 2440 Hz etc.
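
This is easy to confirm numerically. The following is a small sketch using the 440 Hz and 1000 samples per second figures from the example; the helper name `sample` and the choice of 50 samples are only for illustration.

```python
import math

f0 = 440.0      # frequency of the original sine wave, in Hz
fs = 1000.0     # sampling rate, in Hz
h = 1.0 / fs    # spacing between samples, in seconds

def sample(freq, count=50):
    """First `count` samples of sin(2 pi freq t), taken every h seconds."""
    return [math.sin(2 * math.pi * freq * h * n) for n in range(count)]

original = sample(f0)
aliases = {k: sample(f0 + k * fs) for k in (1, 2, -1)}

# Largest sample-by-sample difference between each alias and the original.
max_diff = {
    k: max(abs(a - s) for a, s in zip(alias, original))
    for k, alias in aliases.items()
}
print(max_diff)  # differences are at the level of floating point rounding
```

Negative k works too: with k = −1 the alias is at −560 Hz, and the samples still agree.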

Periodic sampling cannot distinguish frequency components that differ by an integer multiple of the sampling frequency.

If a signal has components at 440 Hz and at 1440 Hz, and we sample at 1000 Hz, all the information from the higher frequency component is aliased, added to the samples of the 440 Hz component.

If a signal contains frequencies between –B and B, you can avoid aliasing by sampling at a rate higher than 2B. In practice this may mean that your signal has frequency components outside the interval [-B, B] but these components are small enough to ignore.

A good rule of thumb is to sample at a frequency of at least 2.5B and not just the theoretical minimum of 2B. For more on this, see The Engineer’s Nyquist frequency.

FM signal approximation

Posted on 22 August 2021 by John

FM radio transmits a signal by perturbing (modulating) the frequency of a carrier wave. If the carrier has frequency ω and the signal has frequency q, then the FM signal is

cos(ωt + β cos(qt)).

To understand the modulated signal, it’s useful to write it as a sum of simple sines and cosines with no modulation. I wrote about how to do this exactly using Bessel functions. Today I’ll write about an approximation that’s easier to understand and work with, assuming the modulation index β is small.

Here’s the approximation:

cos(ωt + β cos(qt)) ≈ cos ωt − ½ β ( sin (ω + q)t + sin (ω − q)t ).

This says that to a good approximation, the modulation term adds two sine waves to the carrier, one at the sum of the carrier and signal frequencies and one at their difference. The minus sign comes from the first-order Taylor term, −β cos(qt) sin(ωt), rewritten with the product-to-sum identity.

To establish the approximation and see how the error depends on β, subtract the right side from the left and expand as a Taylor series in β. The first non-zero term in the series is

-½ cos(qt)² cos(ωt) β²

and so if β is small, the approximation error is very small. For example, if β = 0.1, then the approximation error is on the order of 0.005.

As an example, let ω = 10, q = 2, and β = 0.1. Then

cos(10t + 0.1 cos 2t) ≈ cos 10t − 0.05 ( sin 12t + sin 8t )

and the approximation error is plotted below.

As predicted, the amplitude of the error is around 0.005, while the amplitude of the FM signal is 1.
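
The size of the error can also be checked without a plot. This is a minimal sketch with the example values ω = 10, q = 2, β = 0.1; it uses the first-order Taylor approximation, in which the two sideband terms enter with a minus sign, and the grid over [0, 2π] is an arbitrary choice.

```python
import math

omega, q, beta = 10.0, 2.0, 0.1

def fm(t):
    """The FM signal cos(omega t + beta cos(q t))."""
    return math.cos(omega * t + beta * math.cos(q * t))

def approx(t):
    """Carrier plus the two sidebands from the first-order expansion in beta."""
    sidebands = math.sin((omega + q) * t) + math.sin((omega - q) * t)
    return math.cos(omega * t) - 0.5 * beta * sidebands

ts = [2 * math.pi * k / 10000 for k in range(10001)]
max_err = max(abs(fm(t) - approx(t)) for t in ts)
print(max_err)  # compare with beta**2 / 2 = 0.005
```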

Identifying someone from their heart beat

Posted on 11 September 2020 by John

electrocardiogram of a toddler

How feasible would it be to identify someone based on electrocardiogram (EKG, ECG) data? (Apparently the abbreviation “EKG” is more common in America and “ECG” is more common in the UK.)

Electrocardiograms are unique, but unique doesn’t necessarily mean identifiable. Unique data isn’t identifiable without some way to map it to identities. If you shuffle a deck of cards, you will probably produce an arrangement that has never occurred before. But without some sort of registry mapping card deck orders to their shufflers, there’s no chance of identification. (For identification, you’re better off dusting the cards for fingerprints, because there are registries of fingerprints.)

According to one survey [1], researchers have tried a wide variety of methods for identifying people from electrocardiograms. They’ve used time-domain features such as peak amplitudes, slopes, variances, etc., as well as a variety of frequency-domain (DFT) features. It seems that all these methods work moderately well, but none are great, and there’s no consensus regarding which approach is best.

If you have two EKGs on someone, how readily can you tell that they belong to the same person? The answer depends on the size of the set of EKGs you’re comparing it to. The studies surveyed in [1] do some sort of similarity search, comparing a single EKG to tens of candidates. The methods surveyed had an overall success rate of around 95%. But these studies were based on small populations; at least at the time of publication no one had looked at matching a single EKG against thousands of possible matches.

In short, an electrocardiogram can identify someone with high probability once you know that they belong to a relatively small set of people for which you have electrocardiograms.

Adding phase-shifted sine waves

Posted on 17 August 2020 by John

Suppose you have two sinusoidal functions with the same frequency ω but with different phases and different amplitudes:

f(t) = A sin(ωt)

and

g(t) = B sin(ωt + φ).

Then their sum is another sine wave with the same frequency

h(t) = C sin(ωt + ψ).

Note that this includes cosines as a special case since a cosine is a sine with phase shift φ = 90°.

Sum of two phase-shifted sine waves with the same frequency is another sine wave

This post will

prove that the sum of sine waves is another sine wave,
show how to find its amplitude and phase, and
discuss the significance of this result in signal processing.

Finding the amplitude and phase

Note f + g and h both satisfy the second order differential equation

y″ = − ω² y

Therefore if they also satisfy the same initial conditions y(0) and y′(0) then they’re the same function.

The functions f + g and h are equal at 0 if

B sin(φ) = C sin(ψ).

and their derivatives are equal at 0 if

ω A + ω B cos(φ) = ω C cos(ψ).

Taking ratios says that

tan(ψ) = B sin(φ) / (A + B cos(φ))

ψ = arctan( B sin(φ) / (A + B cos(φ)) ).

Once we have ψ, we solve for C and find

C = B sin(φ) / sin(ψ).
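
Here is one way to turn these equations into code. It is a sketch, not the only possible approach: instead of arctan followed by a division by sin(ψ), it uses `atan2` and `hypot` on the pair (A + B cos φ, B sin φ), which satisfies the same two equations, always returns C ≥ 0, and avoids dividing by zero when sin(ψ) = 0. The function name and test values are made up for illustration.

```python
import math

def add_sines(A, B, phi):
    """Return (C, psi) with A sin(wt) + B sin(wt + phi) = C sin(wt + psi)."""
    x = A + B * math.cos(phi)   # equals C cos(psi)
    y = B * math.sin(phi)       # equals C sin(psi)
    return math.hypot(x, y), math.atan2(y, x)

# Arbitrary illustrative values; here A + B cos(phi) is negative.
A, B, phi, omega = 2.0, 3.0, 2.5, 3.0
C, psi = add_sines(A, B, phi)

# Compare f + g with h on a grid of times.
ts = [0.01 * k for k in range(1000)]
max_gap = max(
    abs(A * math.sin(omega * t) + B * math.sin(omega * t + phi)
        - C * math.sin(omega * t + psi))
    for t in ts
)
print(C, psi, max_gap)
```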

Special case of sine and cosine

Let’s look at the special case of φ = 90°, i.e. adding A sin(ωt) and B cos(ωt). Then sin(φ) = 1 and cos(φ) = 0, and the equation for ψ simplifies to

ψ = arctan(B/A).

If an angle has tangent B/A, then its sine is B / √(A² + B²), and so we have

C = √(A² + B²).

Linear time invariant (LTI) systems

A linear, time-invariant system can differentiate or integrate signals. It can change their amplitude or phase. And it can add two signals together.

It’s easy to see that changing the amplitude or phase of a signal doesn’t change its frequency. It’s also easy to see that differentiation and integration of sine waves doesn’t change their frequency. But it’s not as clear that adding two sines with the same frequency doesn’t change their frequency. Here we’ve shown that’s the case.

Bode plots are a way to show how an LTI system responds to sinusoidal inputs of different frequencies. These plots show what happens to the amplitude and to the phase. They don’t need to show what happens to the frequency because the frequency doesn’t change.

Clipped sine waves

Posted on 20 May 2020 by John

One source of distortion in electronic music is clipping. The highest and lowest portions of a wave form are truncated due to limitations of equipment. As the gain is increased, the sound doesn’t simply get louder but also becomes more distorted as more of the signal is clipped off.

Clipping 0.2

For example, here is what a sine wave looks like when clipped 20%, i.e. cut off to be between -0.8 and 0.8.

Sine clipped at 0.8

A simple sine wave has only one Fourier component, itself. But when we clip the sine wave, we move energy into higher frequency components. We can see that in the Fourier components below.

Fourier coefficients of sine clipped at 0.8

You can show by symmetry that the even-numbered coefficients are exactly zero.
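
The coefficients can be estimated numerically as well. The sketch below assumes a unit-period sine clipped at ±0.8 and a midpoint rule for the integrals; the function names and the number of sample points are illustrative choices, not taken from the plots above.

```python
import math

def clipped_sine(t, limit):
    """sin(2 pi t) truncated to the range [-limit, limit]."""
    return max(-limit, min(limit, math.sin(2 * math.pi * t)))

def sine_coefficient(n, limit, m=4096):
    """Midpoint-rule estimate of b_n = 2 * integral over [0, 1] of f(t) sin(2 pi n t)."""
    h = 1.0 / m
    total = 0.0
    for k in range(m):
        t = (k + 0.5) * h
        total += clipped_sine(t, limit) * math.sin(2 * math.pi * n * t)
    return 2 * h * total

coeffs = {n: sine_coefficient(n, 0.8) for n in range(1, 8)}
for n, b in coeffs.items():
    print(n, f"{b:+.6f}")
```

The even-numbered entries vanish because the clipped wave still satisfies f(t + 1/2) = −f(t).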

Clipping 0.6

Here are the corresponding plots for 60% clipping, i.e. the absolute value of the signal is cut off at 0.4. First the signal

Sine clipped at 0.4

and then its Fourier components.

Fourier coefficients of sine clipped at 0.4

Here are the first five sine waves with the amplitudes given by the Fourier coefficients.

Fourier components

And here we see how the sum of the sines above does a pretty good job of reconstructing the original clipped sine. We’d need an infinite number of Fourier components to exactly reconstruct the original signal, but the first five components do most of the work.

Adding up the first five Fourier components

Continuous range of clipping

Next let’s look at the ratio of the energy in the 3rd component to that of the 1st component as we continuously vary the amount of clipping.

Ratio of energy in 3rd harmonic to fundamental

Now for the 5th harmonic. This one is interesting because it’s not strictly increasing but rather has a little bump before it starts increasing.

Ratio of energy in 5th harmonic to fundamental

Finally, here’s the ratio of the energy in all higher frequencies to the energy in the fundamental.

Ratio of energy in all higher frequencies combined to fundamental

Area of sinc and jinc function lobes

Posted on 31 December 2019 by John

Someone left a comment this morning on my blog post on sinc and jinc integrals regarding the area of the lobes.

It would be nice to have the values of integrals of each lobe, i.e. integrals between 0 and multiples of pi. Anyone knows of such a table?

This post will include Python code to address that question. (Update: added asymptotic approximation. See below.)

First, let me back up and explain the context. The sinc function is defined as [1]

sinc(x) = sin(x) / x

and the jinc function is defined analogously as

jinc(x) = J₁(x) / x,

substituting the Bessel function J₁ for the sine function. You could think of Bessel functions as analogs of sines and cosines. Bessel functions often come up when vibrations are described in polar coordinates, just as sines and cosines come up when using rectangular coordinates.

Here’s a plot of the sinc and jinc functions:

The lobes are the regions between crossings of the x-axis. For the sinc function, the lobe in the middle runs from −π to π, and for n > 0 the nth lobe runs from nπ to (n+1)π. The zeros of Bessel functions are not uniformly spaced like the zeros of the sine function, but they come up in applications frequently and so it’s easy to find software to compute their locations.

First of all we’ll need some imports.

    from numpy import sin, pi  # scipy's top-level sin and pi aliases are deprecated
    from scipy.special import jn, jn_zeros
    from scipy.integrate import quad

The sinc and jinc functions are continuous at zero, but the computer doesn’t know that [2]. To prevent division by zero, we return the limiting value of each function for very small arguments.

    def sinc(x):
        return 1 if abs(x) < 1e-8 else sin(x)/x

    def jinc(x):
        return 0.5 if abs(x) < 1e-8 else jn(1,x)/x

You can show via Taylor series that these functions are exact to the limits of floating point precision for |x| < 10⁻⁸.

Here’s code to compute the area of the sinc lobes.

    def sinc_lobe_area(n):
        n = abs(n)
        integral, info = quad(sinc, n*pi, (n+1)*pi)
        return 2*integral if n == 0 else integral

The corresponding code for the jinc function is a little more complicated because we need to compute the zeros for the Bessel function J₁. Our solution is a little clunky because we have an upper bound N on the lobe number. Ideally we’d work out an asymptotic value for the lobe area and compute zeros up to the point where the asymptotic approximation became sufficiently accurate, and switch over to the asymptotic formula for sufficiently large n.

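    N = 200  # upper bound on the lobe number; 200 is an assumed value
    jzeros = jn_zeros(1, N)  # first N positive zeros of J1

    def jinc_lobe_area(n):
        n = abs(n)
        assert(n < N)
        # the central lobe starts at the origin, which jn_zeros omits
        lower = 0 if n == 0 else jzeros[n-1]
        integral, info = quad(jinc, lower, jzeros[n])
        return 2*integral if n == 0 else integral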

Note that the 0th element of the array returned by jn_zeros is the first positive zero of J₁; it doesn’t include the zero at the origin.

For both sinc and jinc, the even numbered lobes have positive area and the odd numbered lobes have negative area. Here’s a plot of the absolute values of the lobe areas.

Asymptotic results

We can approximate the area of the nth lobe of the sinc function by using a midpoint approximation for 1/x. It works out that the area is asymptotically equal to

$(-1)^n \frac{4}{(2n+1)\pi}$

We can do a similar calculation for the area of the nth jinc lobe, starting with the asymptotic approximation for jinc given here. We find that the area of the nth lobe of the jinc function is asymptotically equal to

$\frac{(-1)^n}{\pi^2} \left( \frac{8}{4n+3} \right )^{3/2}$

To get an idea of the accuracy of the asymptotic approximations, here are the results for n=100.

    sinc area:      0.00633455
    asymptotic:     0.00633452
    absolute error: 2.97e-8
    relative error: 4.69e-6

    jinc area:      0.000283391
    asymptotic:     0.000283385
    absolute error: 5.66e-9
    relative error: 2.00e-5
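
The sinc half of that comparison can be reproduced with nothing but the standard library. In this sketch a composite Simpson’s rule stands in for `quad` so that the example has no dependencies; that substitution, the function names, and the number of subintervals are assumptions made for illustration.

```python
import math

def simpson(f, a, b, n=200):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    total += 4 * sum(f(a + (2 * k - 1) * h) for k in range(1, n // 2 + 1))
    total += 2 * sum(f(a + 2 * k * h) for k in range(1, n // 2))
    return total * h / 3

def sinc_lobe_area(n):
    """Area of the nth sinc lobe for n >= 1, by direct integration."""
    return simpson(lambda x: math.sin(x) / x, n * math.pi, (n + 1) * math.pi)

def sinc_lobe_asymptotic(n):
    """Asymptotic approximation (-1)^n 4 / ((2n + 1) pi)."""
    return (-1) ** n * 4 / ((2 * n + 1) * math.pi)

n = 100
exact = sinc_lobe_area(n)
approx = sinc_lobe_asymptotic(n)
rel_err = abs(exact - approx) / abs(exact)
print(exact, approx, rel_err)
```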

The Engineer’s Nyquist frequency and the sampling theorem

Posted on 2 January 2018 by John

The Nyquist sampling theorem says that a band-limited signal can be recovered from evenly-spaced samples. If the highest frequency component of the signal is f_c then the function needs to be sampled at a frequency of at least the Nyquist frequency 2f_c. Or to put it another way, the spacing between samples needs to be no more than Δ = 1/(2f_c).

If the signal is given by a function h(t), then the Nyquist-Shannon sampling theorem says we can recover h(t) by

$h(t) = \sum_{n=-\infty}^\infty h(n\Delta)\, \mathrm{sinc}(2f_c t- n)$

where sinc(x) = sin(πx) / (πx).

In practice, signals may not be entirely band-limited, but beyond some frequency f_c higher frequencies can be ignored. This means that the cutoff frequency f_c is somewhat fuzzy. As we demonstrate below, it’s much better to err on the side of making the cutoff frequency higher than necessary. Sampling at a little less than the necessary frequency can cause the reconstructed signal to be a poor approximation of the original. That is, the sampling theorem is robust to over-sampling but not to under-sampling. There’s no harm from sampling more frequently than necessary. (No harm as far as the accuracy of the equation above. There may be economic costs, for example, that come from using an unnecessarily high sampling rate.)

Let’s look at the function h(t) = cos(18πt) + cos(20πt). The bandwidth of this function is 10 Hz, and so the sampling theorem requires that we sample our function at 20 Hz. If we sample at 20.4 Hz, 2% higher than necessary, the reconstruction lines up with the original function so well that the plots of the two functions agree to the thickness of the plotting line.

Function and reconstruction sampling at 20.4 Hz

But if we sample at 19.6 Hz, 2% less than necessary, the reconstruction is not at all accurate due to problems with aliasing.

Function and reconstruction sampling at 19.6 Hz
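
The two cases can be compared numerically with a truncated version of the reconstruction sum. This is only a sketch: the sampling rate f_s plays the role of 2f_c in the formula, the infinite sum is cut off at |n| ≤ 1000 (which leaves a small truncation error even when oversampling), and the evaluation grid on [0, 1] is an arbitrary choice.

```python
import math

def sinc(x):
    """Normalized sinc, sin(pi x) / (pi x), with the limiting value at 0."""
    return 1.0 if abs(x) < 1e-12 else math.sin(math.pi * x) / (math.pi * x)

def h(t):
    """The example signal: components at 9 Hz and 10 Hz."""
    return math.cos(18 * math.pi * t) + math.cos(20 * math.pi * t)

def reconstruct(t, fs, terms=1000):
    """Truncated sampling-theorem sum using the samples h(n / fs), |n| <= terms."""
    return sum(h(n / fs) * sinc(fs * t - n) for n in range(-terms, terms + 1))

def max_error(fs, points=100):
    """Largest reconstruction error on an evenly spaced grid over [0, 1]."""
    grid = [i / points for i in range(points + 1)]
    return max(abs(h(t) - reconstruct(t, fs)) for t in grid)

err_over = max_error(20.4)    # 2% above the Nyquist frequency
err_under = max_error(19.6)   # 2% below the Nyquist frequency
print(err_over, err_under)
```

When undersampling, the 10 Hz component is indistinguishable from a 9.6 Hz component at the sample points, which is why the second error is on the order of the signal itself.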

One rule of thumb is to use the Engineer’s Nyquist frequency of 2.5 f_c which is 25% more than the exact Nyquist frequency. An engineer’s Nyquist frequency is sorta like a baker’s dozen, a conventional safety margin added to a well-known quantity.

Update: Here’s a plot of the error, the RMS difference between the signal and its reconstruction, as a function of sampling frequency.

RMS error in reconstruction as a function of sampling frequency

By the way, the function in the example demonstrates beats. The sum of a 9 Hz signal and a 10 Hz signal is a 9.5 Hz signal modulated at 0.5 Hz. More details on beats in this post on AM radio and musical instruments.

Paul Klee meets Perry the Platypus

Posted on 25 October 2017 by John

I was playing around with something in Mathematica and one of the images that came out of it surprised me.

filter contour plot

It’s a contour plot for the system function of a low pass filter.

    H[z_] := 0.05634*(1 + 1/z)*(1 - 1.0166/z + 1/z^2) /
            ((1 - 0.683/z)*(1 - 1.4461/z + 0.7957/z^2))
    ContourPlot[ Arg[H[Exp[I (x + I y)]]], 
                 {x, -1, 1}, {y, -1, 1}, 
                 ColorFunction -> "StarryNightColors"]

It looks sorta like a cross between Paul Klee’s painting Senecio

Senecio by Paul Klee, 1922

and Perry the Platypus from Phineas and Ferb.

Perry the Platypus from Phineas and Ferb

Time series analysis vs DSP terminology

Posted on 20 October 2017 by John

Time series analysis and digital signal processing are closely related. Unfortunately, the two fields use different terms to refer to the same things.

Suppose you have a sequence of inputs x[n] and a sequence of outputs y[n] for integers n.

Moving average / FIR

If each output depends on a linear combination of a finite number of previous inputs

y[n] = b₀ x[n] + b₁ x[n−1] + … + b_q x[n − q]

then time series analysis would call this a moving average (MA) model of order q, provided b₀ = 1. Note that this might not really be an average, i.e. the b’s are not necessarily positive and don’t necessarily sum to 1.

Digital signal processing would call this a finite impulse response (FIR) filter of order q.

Autoregressive / IIR

If each output depends on a linear combination of a finite number of previous outputs

y[n] = a₁ y[n −1] + … + a_p y[n − p]

then time series analysis would call this an autoregressive (AR) model of order p.

Digital signal processing would call this an infinite impulse response (IIR) filter of order p.

Sometimes you’ll see the opposite sign convention on the a’s.

ARMA / IIR

If each output depends on a linear combination of a finite number of previous inputs and outputs

y[n] = b₀ x[n] + b₁ x[n-1] + … + b_q x[n − q] + a₁ y[n − 1] + … + a_p y[n − p]

then time series analysis would call this an autoregressive moving average (ARMA) model of order (p, q), i.e. p AR terms and q MA terms.

Digital signal processing would call this an infinite impulse response (IIR) filter with q feedforward coefficients and p feedback coefficients. Also, as above, you may see the opposite sign convention on the a’s.
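
The difference equation translates directly into code. Here is a minimal sketch in plain Python, assuming zero initial conditions and the sign convention used above; the function name `arma_filter` and the example coefficients are made up for illustration.

```python
def arma_filter(b, a, x):
    """Apply y[n] = sum_k b[k] x[n-k] + sum_k a[k-1] y[n-k], zero initial conditions.

    b holds b_0..b_q (feedforward / MA part) and a holds a_1..a_p (feedback / AR part).
    """
    y = []
    for n in range(len(x)):
        acc = sum(bk * x[n - k] for k, bk in enumerate(b) if n - k >= 0)
        acc += sum(ak * y[n - k] for k, ak in enumerate(a, start=1) if n - k >= 0)
        y.append(acc)
    return y

impulse = [1.0] + [0.0] * 9

# Feedforward terms only: the impulse response ends after q + 1 samples.
fir = arma_filter([1.0, 0.5, 0.25], [], impulse)

# One feedback term: the impulse response decays but never reaches zero.
iir = arma_filter([1.0], [0.5], impulse)

print(fir)
print(iir)
```

Feeding in an impulse shows where the DSP names come from: the first response is finite and the second is not.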

ARMA notation

Box and Jenkins use a’s for input and z’s for output. We’ll stick with x’s and y’s to make the comparison to DSP easier.

Using the backward shift operator B that takes a sample at n to the sample at n-1, the ARMA system can be written

φ(B) y[n] = θ(B) x[n]

where φ and θ are polynomials

φ(B) = 1 − φ₁B − φ₂B² − … − φ_pB^p

and

θ(B) = 1 − θ₁B − θ₂B² − … − θ_qB^q.

System function notation

In DSP, filters are described by their system function, the z-transform of the impulse response. In this notation (as in Oppenheim and Schafer, for example) we have

$H(z) = \frac{\sum_{k=0}^q b_k z^{-k}}{1 - \sum_{k=1}^p a_k z^{-k}}$

The φ_k in Box and Jenkins correspond to the a_k in Oppenheim and Schafer. The θ_k correspond to the (negative) b_k.

The system function H(z) corresponds to θ(1/z) / φ(1/z).
