How to hear FM Synthesis

A big reason why I enjoy dabbling in electronic music production is that it’s a way of directly transforming math into physical sensation. It’s math you can literally hear. One of the most incredible examples is FM synthesis.

FM stands for frequency modulation. That means that you change the frequency of a note over time, generally in a cyclical pattern. Our ears interpret frequency as pitch, so this may sound like a pitch that fluctuates over time. However, if the fluctuation occurs fast enough, our ears can no longer track it, and it begins to sound completely different.

Try listening to these demos:

Demo 1

The first demo begins with a pure tone, consisting of a 512 Hz sine wave. I modulate the frequency up and down between 448 Hz and 576 Hz. The modulation itself has an associated frequency, which starts out low and then steadily increases. At the end of the sound clip, the modulation frequency is 1024 Hz.

Demo 2

The second demo breaks it down into steps. The first note has just a pure 512 Hz tone. Then we introduce a modulation at 2 Hz. At each step, we double the modulation frequency, to 4 Hz, 8 Hz, 16 Hz, and so on up to 1024 Hz.

Somewhere in the middle, there’s a point where we stop hearing it as a fluctuating pitch, and start to hear something different. To my ears, the transition occurs between 32 and 64 Hz.

Note: this article uses MathJax to render LaTeX equations. This may not work correctly in RSS feeds, and sorry for whatever it does to screen readers.

The 512 Hz sine wave is called the carrier wave, and it can be expressed as follows:

A \sin(\omega_c t)

where ωc is the carrier wave frequency (512 Hz in our example), A is the amplitude, and t is time.

We want to introduce a modulation to the frequency, which requires two more variables. In the demo, the carrier frequency was modulated by plus or minus 64 Hz, so we’ll call that quantity B = 64 Hz. The speed of the modulation is called the modulation frequency, which we’ll denote ωm. Our new expression is

A \sin(\int_0^t (\omega_c + B \cos(\omega_m \tau)) d\tau )
= A \sin(\omega_c t + B/\omega_m \sin(\omega_m t))

The quantity B/ωm shows up all over FM synthesis, so the common practice is to give it a name, β, or the frequency modulation index.
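To make the expression concrete, here is a minimal Python sketch that evaluates it sample by sample. It is not how the demos were made (those used csound); the function names, the 4 Hz modulation rate, and the 44100 Hz sample rate are assumptions chosen for illustration. Frequencies given in Hz are multiplied by 2π to get the angular frequencies that go inside the sine, and β is the same ratio whether B and ωm are both in Hz or both in radians per second.

```python
import math

def fm_sample(t, carrier_hz=512.0, mod_hz=4.0, deviation_hz=64.0, amplitude=1.0):
    """Return A*sin(w_c*t + beta*sin(w_m*t)) at time t (in seconds)."""
    w_c = 2 * math.pi * carrier_hz   # angular carrier frequency, rad/s
    w_m = 2 * math.pi * mod_hz       # angular modulation frequency, rad/s
    beta = deviation_hz / mod_hz     # modulation index B / w_m
    return amplitude * math.sin(w_c * t + beta * math.sin(w_m * t))

def instantaneous_hz(t, carrier_hz=512.0, mod_hz=4.0, deviation_hz=64.0):
    """Rate of change of the phase, in Hz: f_c + B*cos(w_m*t)."""
    return carrier_hz + deviation_hz * math.cos(2 * math.pi * mod_hz * t)

sample_rate = 44100  # assumed sample rate
samples = [fm_sample(n / sample_rate) for n in range(sample_rate)]        # one second of audio
freqs = [instantaneous_hz(n / sample_rate) for n in range(sample_rate)]   # pitch over that second
print(min(freqs), max(freqs))
```

The printed range should stay within the 448 Hz to 576 Hz band described for the first demo.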

I think it would be instructive to repeat the first demo, but instead of holding B = 64 Hz fixed, we will hold β = 2 fixed while sweeping the modulator frequency. You’ll hear the frequency modulation accelerate and grow at the same time.

Demo 3

In order to understand what we’re hearing, it’s useful to break up the equation into a sum of simple sine waves. If the sound can be expressed as a sum of sine waves, then what we hear might sound like multiple notes played together. Unfortunately the math is quite difficult to derive from scratch, and the easiest way to do it is to use identity equations taken from a reference book. The solution looks like this:

A \sum_{n=-\infty}^\infty J_n(\beta) \sin((\omega_c + n\omega_m)t)
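For readers who want to see where this comes from, one standard route goes through the Jacobi-Anger expansion. I am assuming this is the kind of reference-book identity meant here; other routes exist. The identity is

e^{i \beta \sin\theta} = \sum_{n=-\infty}^\infty J_n(\beta) e^{i n \theta}

Setting θ = ωm t, multiplying both sides by e^{i ωc t}, and taking the imaginary part (Jn(β) is real when β is real) gives

\sin(\omega_c t + \beta \sin(\omega_m t)) = \sum_{n=-\infty}^\infty J_n(\beta) \sin((\omega_c + n\omega_m)t)

which is the sum above, up to the overall amplitude A.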

This is a sum of simple sine waves, each of which has a frequency (ωc + n ωm), where n is an integer. The amplitude of each sine wave is given by the function Jn(β), which denotes a Bessel function of the first kind. The Bessel function is pretty complicated, and it takes some examples to build an intuition for it. But the important thing is, the amplitude is just some number that you can calculate by plugging it into a computer. For example if we take β = 2, we have:

J_{-3}(2) = -0.1289...
J_{-2}(2) = 0.3528...
J_{-1}(2) = -0.5767...
J_0(2) = 0.2239...
J_1(2) = 0.5767...
J_2(2) = 0.3528...
J_3(2) = 0.1289...
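If you want to reproduce these numbers yourself, here is a small sketch using only Python's standard library. It sums the power series for Jn(x) and uses the reflection rule J−n(x) = (−1)^n Jn(x) for negative orders. The function name bessel_j and the choice of 30 terms are mine; the series is fine for small arguments like the ones in this article, while a library routine such as scipy.special.jv would be the more robust choice in general.

```python
import math

def bessel_j(n, x, terms=30):
    """Bessel function of the first kind J_n(x), integer n, via its power series."""
    if n < 0:
        # Reflection rule for integer order: J_{-n}(x) = (-1)^n * J_n(x)
        return (-1) ** (-n) * bessel_j(-n, x, terms)
    total = 0.0
    for k in range(terms):
        # k-th series term: (-1)^k / (k! * (n+k)!) * (x/2)^(2k+n)
        total += (-1) ** k / (math.factorial(k) * math.factorial(n + k)) * (x / 2) ** (2 * k + n)
    return total

# Print the sideband amplitudes for beta = 2, orders -3 through 3
for n in range(-3, 4):
    print(f"J_{n}(2) = {bessel_j(n, 2.0):.4f}")
```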

So FM synthesis will create multiple “satellite” frequencies, at the carrier frequency plus an integer times the modulation frequency. For example, if the modulation frequency is 200 Hz, and the carrier frequency is 512 Hz, then we will have satellite frequencies at 112, 312, 712, and 912 Hz, and so on. There will also be satellites at negative frequencies, at −88 Hz and −288 Hz, but these will sound the same as positive-frequency sine waves at 88 Hz and 288 Hz.
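The arithmetic is simple enough to tabulate. This sketch (the function name satellites is just for illustration) lists each satellite's signed frequency alongside the positive frequency it would be heard at:

```python
def satellites(carrier_hz, mod_hz, max_order):
    """Return (n, signed frequency, heard frequency) for n = -max_order..max_order."""
    rows = []
    for n in range(-max_order, max_order + 1):
        signed = carrier_hz + n * mod_hz        # carrier plus an integer times the modulator
        rows.append((n, signed, abs(signed)))   # a negative frequency is heard as its absolute value
    return rows

for n, signed, heard in satellites(512, 200, 4):
    print(f"n={n:+d}: {signed:+5d} Hz, heard at {heard} Hz")
```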

The satellites, in principle, go on forever, but the Bessel function becomes small for large values of n. The satellite frequencies with the highest amplitudes will be those with n approximately up to β.

So let’s try another demo, where we hold the carrier frequency constant at 512 Hz, and the modulation frequency at 128 Hz, while sweeping β from 0 to 8. At the start there won’t be any satellite frequencies, but as the demo proceeds, more and more satellite frequencies will be audible.
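If you would like to try a sweep like this without csound, here is a rough sketch using only Python's standard library. It is not how the demo was generated; the four-second duration, the linear ramp of β, the sample rate, and the output filename are all assumptions.

```python
import math
import struct
import wave

def beta_sweep(carrier_hz=512.0, mod_hz=128.0, beta_max=8.0,
               seconds=4.0, sample_rate=44100):
    """Samples of sin(w_c*t + beta(t)*sin(w_m*t)), with beta rising linearly from 0."""
    count = int(seconds * sample_rate)
    samples = []
    for i in range(count):
        t = i / sample_rate
        beta = beta_max * i / count  # linear ramp from 0 toward beta_max
        phase = 2 * math.pi * carrier_hz * t + beta * math.sin(2 * math.pi * mod_hz * t)
        samples.append(math.sin(phase))
    return samples

def write_wav(path, samples, sample_rate=44100, gain=0.5):
    """Write samples in [-1, 1] as a mono 16-bit PCM WAV file."""
    frames = b"".join(struct.pack("<h", int(gain * 32767 * s)) for s in samples)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)           # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(frames)

if __name__ == "__main__":
    write_wav("beta_sweep.wav", beta_sweep())
```

The gain of 0.5 is only there to leave some headroom; the carrier and modulation frequencies match the ones described above.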

Demo 4

Once I heard what a beta sweep sounds like, I realized that it’s a fairly common sound in music! I hope that this demo allows you to start recognizing it for yourself.

When the modulation frequency is too small, the satellite frequencies appear too close together, and the human ear can no longer hear them as distinct. Instead you hear a wobbly frequency. Mathematically, there is no distinction between the wobbly frequency and the satellite frequencies. Which one you hear just depends on the human hearing range and frequency resolution.

So what can you do with FM synthesis? You have three different parameters: the carrier frequency, the modulation frequency, and the modulation index, and that’s before introducing any further complexities. These three parameters prove to be quite versatile. Often, when people want a consonant sound, they’ll use a modulation frequency that’s a multiple of the carrier frequency, or at least they keep a nice ratio between the carrier and modulation frequencies. But you can also use other modulation frequencies to make more dissonant drones or buzzing sounds. FM synthesis is also used to emulate percussive instruments like drums or bells.
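One way to think about why simple ratios tend to sound more consonant: every satellite sits at the carrier frequency plus an integer times the modulation frequency, so every satellite is a whole-number multiple of the greatest common divisor of the two. The sketch below, which assumes integer frequencies in Hz and uses names of my own choosing, prints that implied fundamental and which multiples of it are present.

```python
import math

def implied_fundamental(carrier_hz, mod_hz):
    """Largest frequency (integer Hz) that divides both the carrier and the modulator."""
    return math.gcd(carrier_hz, mod_hz)

def partial_numbers(carrier_hz, mod_hz, max_order):
    """Each satellite |f_c + n*f_m| expressed as a multiple of the implied fundamental."""
    base = implied_fundamental(carrier_hz, mod_hz)
    return sorted({abs(carrier_hz + n * mod_hz) // base
                   for n in range(-max_order, max_order + 1)})

for carrier, mod in [(512, 512), (512, 1024), (512, 200)]:
    print(carrier, mod, implied_fundamental(carrier, mod), partial_numbers(carrier, mod, 3))
```

With a 1024 Hz modulator the satellites land on odd multiples of 512 Hz, much like the harmonics of a single note. With a 200 Hz modulator the common divisor is only 8 Hz, below the range of hearing, which may be part of why such combinations come across as buzzier or more bell-like.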

This demo contains some examples of sounds that I made by just playing around with the parameters:

Demo 5

All sounds were generated using csound.


  1. Rob Grigjanis says

    In the second math expression, there’s a sign change which I don’t think should be there.

    From 0 to t,

    ∫ cos(ωτ) dτ = (1/ω)sin(ωt)

  2. says

    @Rob Grigjanis,
    Thanks for pointing that out.

    The sign & phase of the cosines & sines doesn’t really matter for most purposes, but I try to choose a sign convention that makes the math cleaner.

  3. antaresrichard says

    Demo three reminded me of the audio used over the 1951 scene where Gort restores Klaatu to life. Not quite the same, but…


  4. says

    How to hear FM synthesis? Listen to any pop music with lots of synthesizers made between late 1983 and say 1988. There’s probably a 90 percent chance someone will be playing a Yamaha DX7.

  5. says

    In the field of communications electronics, the “satellite frequencies” are known as sideband frequencies. Amplitude modulation also produces sidebands, but it’s a much simpler result than FM: one pair of sidebands for each sine in the modulating signal (unless overdriven), versus the variable number of pairs you get with FM (dependent on mod index).

    As timgueguen noted @5, FM synths were all the rage in the 1980s, the Yamaha DX7 being the progenitor of the class. They were particularly good for making inharmonic sounds, like bells, but fell out of favor as they were notoriously difficult to program (compared to the prior analog synths). For a while, there were some AM synths, but they never seemed to gain the same popularity.

    I am not surprised that your subjective point of “shift” occurred when you went from 32 Hz to 64 Hz, as the modulating tone itself is now within the range of human hearing. A similar transition effect occurs with time delays. In short, as long as the modulating tone is below the lower bound of the human frequency range, AM will sound like a variation in loudness (tremolo) and FM will sound like a variation in pitch (vibrato). The greater the mod index, the more pronounced the effect (on effects boxes this is usually denoted as “depth” or “intensity”).

  6. bubble says

    1 starting from 2^6 hz (and up), my ear started to recognize the wave as note.
    2 so I guess for each piano key, there is a corresponded sin wave at certain frequency?
    3 if 2 is true, any one can do tuning themselves. though piano fine tuning is not only about the pitch.

  7. bubble says

    another realization:
    by adjusting the carrier frequency, the modulation frequency, the modulation index, and other possible indexes, we can get all possible noise/sound.
    For my use case, I do not care what instrument/audio workstation I use – to the nature of my improvisation, I care about if the sound/noise is created exactly the same as the sound in my head…I’ll learn Csound.

  8. says

    On a basic level, a piano, or any other pitched instrument, will produce a note with a fundamental frequency. But they aren’t perfect sine waves, they also contain higher frequencies, and other complexities. These complexities are what make different instruments sound different.

    FM synthesis won’t produce a sound like a piano. I’ve heard that emulating the sound of a piano is particularly complicated.
