Wavetable synthesis is a deceptively simple concept.

- Represent one cycle of a periodic waveform as an array
`W`

of length`L`

containing sample values. - Every sampling cycle, accumulate a phase value
`P`

with a note's pre-calculated frequency increment`F`

. - The value at
`W[P%L]`

yields the sample value`S`

, which is written to the audio buffer.

In practice `P`

and `F`

are usually floats, which means that the phase won't directly track the indices of `W`

. Floating-point linear interpolation reads the value "between" `W[floor(P)%L]`

and `W[floor(P+1)%L]`

, weighted by the fractional part of `P`

. The phase is also modulo'd against π each cycle to prevent pitch drift from floating point error. `S`

itself tends to be a float too; this allows easy arithmetic in modulation operations and prevents integer overflow.

This arrangement is more than just theoretically simple. Floating point DSP procedures on modern architectures may actually be faster than their integer equivalents, since the fixnum registers are often clogged with other logic most of the time. It's doubtful that any meaningful performance boost can be gleaned from handling `P`

, `F`

, and `S`

as integers, but as you might have guessed, I'm going to do it anyway. Why?

- The synthesizer might run on a different architecture one day, such as a hardware microcontroller. Not all of these platforms support floats.
- If
`S`

is not represented as a float, there's no reason to worry about dithering. - FM synthesis is performed in just a few operations, meaning that any math speed up that could be gained from floats might be negligible. The conversions needed to read from
`W`

might make it faster to keep`P`

as a pure int. - Integer overflow can be exploited to simulate the cyclic nature of waves, avoiding modulo operations.
- It's fun.

The following code will be in plain old C. As much as I love FP langs, a topic like this calls for precise bit mangling. Please install sox if you'd like to play along.

Paste the following code into your Vim. I'll dissect it later.

```
#include <limits.h>
#include <math.h>
#include <stdint.h>
#include <stdio.h>
#define SIZE_TABLE 4096
#define WAVE_INC ((M_PI * 2.0) / SIZE_TABLE)
#define RATE 48000
#define FREQ_INC ((uint32_t)(UINT_MAX / (uint32_t)RATE))
#define SHIFT 20
int main() {
uint16_t n = 0;
double wavePhase = 0.0;
int16_t w[SIZE_TABLE] = {0};
uint32_t phase = 0;
uint32_t f = 440 * FREQ_INC;
int16_t s1 = 0;
for (; n < SIZE_TABLE; n++) {
w[n] = (int16_t)(SHRT_MAX * sin(wavePhase));
wavePhase += WAVE_INC;
}
while (1) {
phase += f;
s1 = w[phase >> SHIFT];
putchar(s1 & 255);
putchar(s1 >> 8);
}
return 0;
}
```

run

```
cc filename.c -lm && ./a.out | play -b 16 -e signed -c 1 -r 48000 -t raw -
```

and you should hear a sine wave ringing out a pleasant A.

16 bit signed integers are enough to encode CD quality audio, which is why the wavetable `W`

is represented by the following C code.

```
#define SIZE_TABLE 4096
#define WAVE_INC ((M_PI * 2.0) / SIZE_TABLE)
...
uint16_t n = 0;
double wavePhase = 0.0;
int16_t w[SIZE_TABLE] = {0};
...
for (; n < SIZE_TABLE ; n++) {
w[n] = (int16_t)(SHRT_MAX * sin(wavePhase));
wavePhase += WAVE_INC;
}
```

This code increments a floating point phase value `wavePhase`

between 0 and 2π over the course of `SIZE_TABLE`

samples. For each sample, the float return value of `sin`

[-1.0, 1.0] is multiplied against `SHRT_MAX`

and truncated to 16 bits. The array simulates a single cycle of a sine, but now scaled between [-32767, 32767] instead.

The phase `P`

of a signal is incremented by frequency `F`

and modulo'd against `TABLE_SIZE`

to read from the wavetable. This logic is represented in C as follows.

```
#define RATE 48000
#define FREQ_INC ((uint32_t)(UINT_MAX / (uint32_t)RATE))
...
uint32_t phase = 0;
uint32_t f = 440 * FREQ_INC;
...
phase += f;
```

`RATE`

is the industry standard sampling rate of 48,000 Hz. When `UINT_MAX`

is divided against it, it provides the basic unit of increment that will take place during each sampling period, assuming the signal's frequency is 1Hz. Since this is inaudible, the actual pitch increment `f`

is `440 * FREQ_INC`

.

The bounds of the 32 bit unsigned integer `phase`

are now easily mapped to the bounds of the wavetable `w`

, which in turn is mapped to the cycles of a waveform. `phase += f`

overflows by design, and every time it does, it signals a new cycle of the sine. This avoids the need for any explicit modulo.

The size of `phase`

and `f`

is significant. I chose 32 bit ints over 16 bit ones for reasons that will become apparent later.

Retrieving the sample value at `w`

means truncating `phase`

down to a number between [0, 4095], so that it can address a specific index of the wavetable. Since `TABLE_SIZE`

is a power of 2, this is not a problem.

```
#define SHIFT 20
...
int16_t s1 = 0;
...
s1 = w[phase >> SHIFT];
putchar(s1 & 255);
putchar(s1 >> 8);
```

Right shifting the 32 bit value `phase`

by 20 places turns it into a 12 bit one—less than 4096: the exact index of the desired sample `s1`

. The toy oscillator writes audio to `sox`

by breaking `s1`

into two separate bytes and writing them to stdout, where the program is listening.

This specific sine sounds fine via this truncation method, but it's not advised for wavetable synthesis in general. That's where linear interpolation comes in.

The accuracy of a signal is improved considerably with linear interpolation. This involves sampling the wavetable at two adjacent points, and using them to derive an entirely new sample. The oscillator code is almost identical.

```
#include <limits.h>
#include <math.h>
#include <stdint.h>
#include <stdio.h>
#define SIZE_TABLE 4096
#define WAVE_INC ((M_PI * 2.0) / SIZE_TABLE)
#define RATE 48000
#define FREQ_INC ((uint32_t)(UINT_MAX / (uint32_t)RATE))
#define SHIFT 20
#define TABLE_INC (1 << SHIFT)
#define FRAC_MASK (TABLE_INC - 1)
int main() {
uint16_t n = 0;
double wavePhase = 0.0;
int16_t w[SIZE_TABLE] = {0};
uint32_t phase = 0;
uint32_t f = 440 * FREQ_INC;
int16_t s1 = 0;
int16_t s2 = 0;
uint64_t sf = 0;
for (; n < SIZE_TABLE; n++) {
w[n] = (int16_t)(SHRT_MAX * sin(wavePhase));
wavePhase += WAVE_INC;
}
while (1) {
phase += f;
s1 = w[phase >> SHIFT];
s2 = w[(phase + TABLE_INC) >> SHIFT];
sf = (uint64_t) phase & FRAC_MASK;
sf *= s2 - s1;
s1 += sf >> SHIFT;
putchar(s1 & 255);
putchar(s1 >> 8);
}
return 0;
}
```

Only a few additional values and operations are required.

```
#define TABLE_INC (1 << SHIFT)
...
int16_t s1 = 0;
int16_t s2 = 0;
...
s1 = w[phase >> SHIFT];
s2 = w[(phase + TABLE_INC) >> SHIFT];
```

The first sample `s1`

is retrieved through simple truncation. The second sample `s2`

is the next index of the wavetable. If `phase >> SHIFT`

reduces the phase to an exact index of `w`

called `N`

, then index `N+1`

can be derived by adding `1 << SHIFT`

to `phase`

, then right shifting this sum by `SHIFT`

. This avoids another modulo operation.

`phase`

and `f`

are represented as 32 bit ints, despite only needing 12 bits to represent the indices [0, 4095], so that the 20 remaining bits of `phase`

may represent the fractional "distance" of the phase between wavetable indices `N`

and `N+1`

. These bits are used to derive the more accurate sample reading from `s1`

and `s2`

. They must be isolated from `phase`

.

```
#define FRAC_MASK (TABLE_INC - 1)
...
uint64_t sf = 0;
...
sf = (uint64_t) phase & FRAC_MASK;
```

By subtracting 1 from `TABLE_INC`

, `FRAC_MASK`

is defined to be a series of 20 consecutive set bits. Running a bitwise AND against phase returns only the fractional part. Despite being smaller than `phase`

, this fractional value `sf`

is stored in 64 bits to provide it enough head room to perform the subsequent interpolation.

```
sf *= s2 - s1;
s1 += sf >> SHIFT;
putchar(s1 & 255);
putchar(s1 >> 8);
```

Subtracting `s1`

from `s2`

returns the span between the two samples. Multiplying this by the fractional part (this is where head room matters), then shifting it back down by `SHIFT`

determines how much of this span should be added on to the original truncated sample `s1`

. It's hard to envision, but this multiplication and right shifting works just like multiplying by a value [0.0, 1.0] in a float interpolation.

Something is seriously wrong with you if you can actually hear the difference between these two example programs, but in a modulation-heavy setting with multiple oscillators and envelopes, these theoretical accuracy concerns cease to be theoretical.

As for whether it makes sense to even use fixnums in the first place, time will tell. Worth a go though.

Even though the audio is ultimately 16 bits, it probably makes more sense to store the wavetable values as 32 bit numbers (16.16 fixed point format) to play nicely with the phase accumulators of the same size. I noticed this while attempting basic FM between two oscs. Like everything else on this site, this page documents a learning process. It's not an authoritative article. Sorry if you read to the end only to run into the same flaws I did.