luxcache

NANOSCOPICS IN DIGITAL AUDIO - Part 2: FFT and Granulation with MaxMSP with sv1

Added 2022-02-09 18:57:29 +0000 UTC

In this Lux Cache tutorial series, our resident sound design master sv1 tinkers with microscopic textures to create his rich, nuanced and infinitely detailed compositions. In the second part of Nanoscopics in Digital Audio, we dive into frequency domain processing as he breaks down building a basic granulation device in Max for Live.

This tutorial is available as both a Patreon text post and a .pdf document. We kindly ask you not to share Lux Cache content outside of the Patreon, as our contributors rely on your donations.

Frequency Domain Processing
What is FFT, and why do we care about it?
Gizmo~

Basic Granulators + sv1.granulator_1.1
Buffers
Grains and Poly~

Conclusion

Part 1: Frequency Domain Processing

What is FFT, and why do we care about it?

This topic can get a bit dense. So as not to scare anyone away, and to avoid errors in my explanation, I’ll try to be as brief and simple as possible. If you do find the material covered in this article interesting, by all means go and read more about it yourself; it is useful!

FFT, or Fast Fourier Transform, is an analysis method that produces information about the frequency content of a signal. We can analyze audio in the time domain or the frequency domain. The time domain refers to how the signal changes over time, and the frequency domain refers to the frequency information of a signal. For instance, when looking at audio on the timeline, we are viewing it in the time domain, and when we are looking at audio through an EQ, we are viewing it in the frequency domain. We can perform operations on a signal in the frequency domain by converting it using FFT, and once we are done, we can convert back to the time domain by performing an IFFT (Inverse Fast Fourier Transform). This is very useful!
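To make the round trip concrete, here is a minimal sketch in plain Python (not Max) of going from the time domain to the frequency domain and back. The function names and the 16-sample test signal are made up for illustration; this is a textbook radix-2 FFT, not a description of how Max implements it internally.

```python
import cmath
import math


def fft(samples):
    """Recursive radix-2 FFT; len(samples) must be a power of 2."""
    n = len(samples)
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def ifft(bins):
    """Inverse FFT via the conjugate trick: conj(fft(conj(X))) / N."""
    n = len(bins)
    conjugated = [b.conjugate() for b in bins]
    return [v.conjugate() / n for v in fft(conjugated)]


# Time domain: 16 samples containing exactly 2 cycles of a sine wave.
N = 16
signal = [math.sin(2 * math.pi * 2 * i / N) for i in range(N)]

# Frequency domain: the energy lands in bin 2 (and its mirror image, bin 14).
spectrum = fft(signal)
magnitudes = [abs(b) for b in spectrum]
loudest_bin = max(range(N // 2), key=lambda k: magnitudes[k])

# Back to the time domain: the original samples come back out.
restored = [v.real for v in ifft(spectrum)]
```

Anything done to `spectrum` between the two calls is a frequency domain operation, which is the idea the rest of this part builds on.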

This is all very hard and mathy, how am I supposed to do this?

Max/MSP provides an object that allows us to perform these transforms very easily: pfft~. More on this object can be found in the internal Max help files, which are very detailed and thorough, so consider reading them in case anything I present here is confusing and you would like a more detailed description.

To use this object we must first create an additional Max patch just to perform our frequency domain processing. So create a new patch and save it with a helpful name in the same directory as your main patcher. Once saved, we can call the pfft~ object like so:

pfft~ filename.maxpat

The pfft~ object also takes several other arguments, but the one we are concerned with for now is window size. This refers to the number of samples we perform an FFT on. The Max documentation says that by default the window size is 512. Because we have to collect a section of audio (our window) before we can process it, some latency is introduced when processing in the frequency domain. Note that a larger window size will introduce more latency than a smaller one. Our window size must also be a power of 2, meaning it has to be 2 raised to some power (2, 4, 8, 16, 32, 64, etc.). So now when calling pfft~ we can specify our window size like so:

pfft~ filename.maxpat [window size]
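As a rough feel for the numbers, the sketch below (plain Python, not Max) checks the power-of-2 rule and converts a window size into the time it takes to fill that window. The 44100 Hz sample rate and the function names are assumptions for illustration, and the latency pfft~ actually reports may differ from this simple figure.

```python
def is_power_of_two(n):
    """True for 1, 2, 4, 8, ... using the classic bit trick."""
    return n > 0 and (n & (n - 1)) == 0


def window_duration_ms(window_size, sample_rate=44100):
    """Time needed to collect one window of audio, in milliseconds."""
    if not is_power_of_two(window_size):
        raise ValueError("window size must be a power of 2")
    return 1000.0 * window_size / sample_rate


# Compare a tiny, a default-sized, and a very large window.
for size in (16, 512, 8192):
    print(size, "samples ->", round(window_duration_ms(size), 2), "ms")
```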

Once the pfft~ object has been created, we can open up the subpatcher by cmd+double-clicking it. Inside, we use the fftin~ object to convert our signal to the frequency domain so that we may perform operations on it, and the fftout~ object to convert it back. Make sure to assign the inlet and outlet by providing the inlet/outlet number as an argument. You may notice that the fftin~ and fftout~ objects have two wires connecting them in the screenshot below. When working with a signal in the frequency domain, the signal is represented by a real and an imaginary component, which are carried by the two wires respectively. This is very important! However, we won’t get into what all of that means, as in the remainder of this tutorial we will just be connecting wires wherever the real and imaginary components are required. For now, just pay attention to what each inlet and outlet designates when working in the frequency domain.
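For the curious, here is a small aside in plain Python showing one common way to read a real/imaginary pair: as a magnitude (how loud that frequency bin is) and a phase. The function names are illustrative; within Max, the cartopol~ and poltocar~ objects perform this kind of conversion.

```python
import cmath


def to_polar(real, imag):
    """Convert a real/imaginary pair to (magnitude, phase in radians)."""
    return cmath.polar(complex(real, imag))


def to_cartesian(magnitude, phase):
    """Convert (magnitude, phase) back to a real/imaginary pair."""
    z = cmath.rect(magnitude, phase)
    return z.real, z.imag


# One frequency bin with real part 3 and imaginary part 4.
mag, phase = to_polar(3.0, 4.0)
real, imag = to_cartesian(mag, phase)
```

None of this is needed to follow the rest of the tutorial, where the two wires are simply passed along together.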

More detailed info on this object can be found in the internal help file. Now let’s put this to use!

Gizmo~

My first hands-on introduction to FFT, and one of my first serious looks into Max, came while searching for how pitch-shifting algorithms are implemented. The first thing I found was a pointer to Max’s built-in “gizmo~” object, and since then it has become a big part of the tools that I use daily. This Max object is a “frequency-domain pitch shifter”. I enjoy it because of the glitchy and harsh ‘spectral’ noise it produces.

To set this object up let’s make a new pfft~ instance either with a new designated subpatch or the previous one we just made. I have one on my computer that’s been specifically saved just so I can instance and use gizmo~ more readily in patches I’m working on.

You may notice that I have included another argument following the window size: ‘4’. This refers to our overlap factor. More can be found on this in the pfft~ help file, which recommends a value of 4. This patcher will have the same fftin~ and fftout~ objects as the one we set up previously; however, this time we will create the “gizmo~” object and hook it up to them.
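One way to picture the overlap factor: rather than analyzing back-to-back windows, successive windows start a fraction of a window apart (the hop size). The plain Python sketch below works that out; the function names are made up for illustration, and this is the general overlap-add idea rather than a statement about pfft~ internals.

```python
def hop_size(window_size, overlap):
    """Distance in samples between the starts of successive windows."""
    if window_size % overlap != 0:
        raise ValueError("overlap must divide the window size evenly")
    return window_size // overlap


def frame_starts(num_samples, window_size, overlap):
    """Start index of every full window that fits in num_samples."""
    hop = hop_size(window_size, overlap)
    return list(range(0, num_samples - window_size + 1, hop))


# A 512-sample window with overlap 4 moves forward 128 samples at a time.
starts = frame_starts(1024, 512, 4)
print(hop_size(512, 4), starts)
```

With an overlap of 4, every sample ends up covered by several windows, which helps smooth the transitions between processed frames.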

gizmo~ has a third inlet, which is used to change the pitch of our incoming signal. This inlet receives a pitch ratio, where 1 is the original pitch and each doubling or halving of the ratio raises or lowers the pitch by an octave (2 is one octave up, 0.5 is one octave down). So for now we will make an “in” object with the argument “2” and attach it to the pitch ratio inlet of gizmo~. This gives our FFT patcher an inlet that receives the pitch ratio. Our FFT patcher is now done, and it should look like this:

Outside of the FFT patcher, we will now do some patching that will make this usable. We will need audio in, audio out, and a numbox to provide our pitch ratio. This will look something like this:

I have given our numbox bounds of 0.01 and 1000; however, you can tweak this to better suit your own use. I just prefer crazy bounds like this because at a really high pitch ratio the sounds get harsh and glitchy. Now comes the most interesting part. pfft~ defaults to a window size of 512, which does not produce much latency, and the sounds it produces are already crazy. But experiment with a really high window size such as 8192, or with a much smaller one. As far as I’m aware, there’s no way to automate the window size, so we do have to go in and change it manually. Here are some examples at varying window sizes:

🔊 - dry loop

🔊 - wet - window: 16, pitch ratio: 50

🔊 - wet - window: 512, pitch ratio: 1.50

🔊 - wet - window: 512, pitch ratio: 50

🔊 - wet - window: 8192, pitch ratio: 1.50
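To relate the pitch ratios used in these examples to musical intervals, here is a minimal plain Python sketch of the standard equal-temperament conversion (12 semitones per octave). The function names are illustrative and are not part of Max.

```python
import math


def semitones_to_ratio(semitones):
    """Pitch ratio for a shift of the given number of semitones."""
    return 2.0 ** (semitones / 12.0)


def ratio_to_semitones(ratio):
    """Number of semitones a given pitch ratio corresponds to."""
    return 12.0 * math.log2(ratio)


# A ratio of 1.5 is about a perfect fifth up; 50 is more than five octaves up.
for ratio in (0.5, 1.0, 1.5, 2.0, 50.0):
    print(ratio, "->", round(ratio_to_semitones(ratio), 2), "semitones")
```

Going the other way, `semitones_to_ratio(7)` gives a value close to 1.5 that could be typed into the numbox for a shift of a fifth.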

While this device is good on its own, I like to tack it on the end of other devices I’ve made because I like the sound so much. Next, I want to talk about a device that I’ve been sitting on for over a year now that also has this gizmo~ object attached.

Part 2: Basic Granulators + sv1.granulator_1.1

When I was first starting to learn Max, the first ‘big device’ I worked on was a very basic granulator, and it has become a big part of the tools I use daily. It’s a very simple device: each time it receives a note, it randomizes the grain size, the grain position, and the playback speed. This is very useful for generating glitchy and textural sound design one-shots.

🔊 - granulator example

I found this to be a very helpful and informative way to get my feet wet in max msp, while also making something useful for myself. So I’d like to briefly explain how I made it.

Buffers

Here is what we will first be assembling:

Main objects we will be using for setting up the buffers:

dropfile - used to send the file path of a sample to the buffer

buffer~ - used to store the sample

info~ - used to access data from the sample, such as length

First, we will set up two buffers: one to store the sample that we will be granulating, and one to store a grain window. A grain window is used to shape our grains. In our case we will be building a 16-voice granulator, meaning that each pulse will generate 16 grains. If we were to play all 16 voices at once with no shaping, we might hear a lot of clicking, so to smooth that out, we apply the grain window to each grain as an envelope.

buffer~ requires that you name it so that other objects know which buffer you are referring to. In our case we will be using two, one to store the sample and one to store the grain window, so we will name them accordingly.

In order for the buffer to receive the sample we will need to send it a “replace” message, so we prepend a replace message to the output of the dropfile object, and then connect that to the buffer.

At some point we will need to determine the length of our sample, so we use an info~ object, making sure to give it the same name as the buffer that stores our sample. Connect the rightmost outlet of the buffer to the inlet of the info~ object. This makes sure that info~ outputs the new length whenever a sample is loaded.

We will need to fill our grain window, and we can do this with a message. There exist many different functions to fill a grain window with, but we will be filling ours with a welch curve. We can do this by sending a “fill 1, apply welch” message to the grain window buffer. Also, return to the grain window buffer and provide it an attribute “@samps 1024”, this will change the length of the buffer to 1024 samples.
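For reference, here is what a Welch curve looks like when computed by hand. This is a minimal plain Python sketch using the common textbook definition (an inverted parabola that is 0 at both ends and close to 1 in the middle); the function name is illustrative, and the exact values Max writes into the buffer may differ slightly, especially at the endpoints.

```python
def welch_window(length=1024):
    """Welch window: 1 - ((n - c) / c)^2 with c at the window's center."""
    if length < 2:
        raise ValueError("length must be at least 2")
    center = (length - 1) / 2.0
    return [1.0 - ((n - center) / center) ** 2 for n in range(length)]


# 1024 samples, matching the buffer length set with "@samps 1024".
window = welch_window(1024)
print(window[0], round(window[511], 6), window[-1])
```

Multiplying a grain by a curve like this fades it in and out, which is what removes the clicks at the grain boundaries.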

It will also be useful to attach a loadbang to the inlets of both the fill message and info~ so that whenever you load in an instance of the granulator, that information is automatically triggered.

Next, we will build our grain engine.

Grains and Poly~

Here’s what we will be assembling in this section:

Objects we will be using to do grain operations:

poly~: used for polyphony and voice management

wave~: a wavetable, used to scan through the sample

phasor~: sawtooth wave from 0 to 1, used to drive the sample playback

random: generates a random number

route: sends messages to different outlets based on the prepended input

t: trigger, used to send inputs in a set order

notein: receives MIDI input

Now we will open up a new patch and save it with a useful name. When I was making this, I titled this subpatch “thedate_gran”. Close this, and then create a poly~ object with the arguments “title” and “number of voices”. In this case, we will be giving our poly~ 16 voices. Now that poly~ has been instantiated, cmd+double-click it to reopen the subpatch so that we can edit inside our poly~ environment.

The first and most important thing to set up is the voice management for our grains. The poly~ help file demonstrates a million ways to do this, but for now we are going to stick with just one method. We will need to activate and free up voices as we need them. To do this, we send mute messages to thispoly~.

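The message “mute 0, 1” unmutes a voice and marks it busy, and “mute 1, 0” mutes a voice and frees it. (The comma splits each message box into two messages: a mute message followed by a 1 or 0 that sets the busy state.) We need to trigger these at certain moments, but for now, just set up these messages and attach them to thispoly~.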
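The bookkeeping behind these two messages can be pictured with a toy model in plain Python. This is only a sketch of the idea of busy and free voices, with made-up class and method names; it does not describe how poly~ is implemented.

```python
class Voice:
    """One grain voice with the two flags the mute messages control."""

    def __init__(self):
        self.muted = True
        self.busy = False

    def start(self):
        # Corresponds to "mute 0, 1": unmute, then mark busy.
        self.muted = False
        self.busy = True

    def stop(self):
        # Corresponds to "mute 1, 0": mute, then mark free.
        self.muted = True
        self.busy = False


class VoicePool:
    """Hands each new grain to the first voice that is not busy."""

    def __init__(self, count=16):
        self.voices = [Voice() for _ in range(count)]

    def allocate(self):
        for index, voice in enumerate(self.voices):
            if not voice.busy:
                voice.start()
                return index
        return None  # every voice is busy, so this grain is dropped

    def release(self, index):
        self.voices[index].stop()


pool = VoicePool(16)
first = pool.allocate()   # a grain starts on voice 0
pool.release(first)       # the grain ends and voice 0 is free again
```

Muting idle voices also means they are not processing audio while they wait, which keeps a 16-voice device light.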

First, let’s set up our inlets. The first inlet will receive note input, and the second inlet will be used to route incoming grain messages. Once those have been made, set up a “t b b”, which stands for “trigger, bang, bang”, and connect it to inlet 1. A trigger object fires its outlets from right to left, so it will send a bang from the rightmost outlet, and then a bang from the leftmost outlet. This will send a bang to the “mute 0, 1” message, and then a bang to the random number generators, which we will set up in a moment. From here we will set up a “route” with the arguments “speed”, “length”, and “start_pos”.

Next, instantiate 3 random objects. These will be used to generate the random attributes of the grain: speed, length, and start position. We will initialize the ranges of these random values outside of the patch and then route them in. Length and start position will be initialized with a range of 1000, and speed with a range of 2000. Speed uses 2000 so that we can subtract 1000 from the result, giving speeds that range from roughly -1000 to 1000 and allowing grains to be randomly reversed.
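The arithmetic for the three random values can be sketched in plain Python as follows. The function name and dictionary keys are chosen to mirror the route arguments and are otherwise hypothetical. Max’s random object with argument N outputs integers from 0 to N-1, which `randrange` reproduces. How these raw integers are scaled into an actual playback speed, length, and position happens inside the patch and is not modelled here.

```python
import random


def random_grain(rng, speed_range=2000, length_range=1000, start_range=1000):
    """Draw one set of raw grain parameters, as three random objects would."""
    return {
        # 0..1999 shifted down by 1000 gives -1000..999; negative means reversed.
        "speed": rng.randrange(speed_range) - speed_range // 2,
        "length": rng.randrange(length_range),
        "start_pos": rng.randrange(start_range),
    }


# A seeded generator keeps this example repeatable.
rng = random.Random(0)
for _ in range(3):
    print(random_grain(rng))
```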

As the following picture shows, this next section gets kind of dense, so to keep it brief and informative, I’ve marked out what each section of patching is doing.

This wraps up the poly~ patcher.
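As a rough picture of what a single grain voice does in general, here is a plain Python sketch: read a slice of the sample, optionally backwards, and shape it with a window so that it fades in and out. This is a generic windowed grain under stated assumptions (a Welch-style curve, integer sample indices, no interpolation) with made-up function names; it is not a transcription of the patch in the screenshot.

```python
def welch(phase):
    """Welch-style window value for a phase between 0 and 1."""
    return 1.0 - (2.0 * phase - 1.0) ** 2


def render_grain(sample, start, length, reverse=False):
    """Return `length` windowed samples read from `sample` at `start`."""
    if length < 2 or start < 0 or start + length > len(sample):
        raise ValueError("grain does not fit inside the sample")
    grain = []
    for i in range(length):
        phase = i / (length - 1)                      # ramp from 0 to 1
        offset = (length - 1 - i) if reverse else i   # read backwards if reversed
        grain.append(sample[start + offset] * welch(phase))
    return grain


# A dummy "sample": a rising ramp of 100 values.
sample = [float(n) for n in range(100)]
forward = render_grain(sample, start=10, length=21)
backward = render_grain(sample, start=10, length=21, reverse=True)
```

Because the window is 0 at both ends, every grain starts and stops silently no matter where in the sample it was cut from.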

From here we will return to the top level and finish setting up some messages that will be sent to poly~. We will initialize each of our random grain variables, prepend each with its corresponding title, and send them into the second inlet of the poly~ patcher. Again, attach a loadbang to the top of these so that they are sent automatically when the patch is instanced. We can also set up a notein and a “t b” that will be used to trigger the grain engine.

Finally, we will send the output of the poly~ to a live.gain object so that we can control the volume of the outgoing audio, and then attach the output of the gain to a plugout~. The final patch should look something like this:

Conclusion:

I wrote this article because I found diving into Max this time last year not only very fun and interesting, but also refreshing. I find working on tools for myself extremely rewarding, and I want people to know that none of this stuff is impossible, despite how intimidating it may look at first. I hope you learned something and continue diving in, or at the very least, I hope you find some use for the tools I’ve included!

Here is a link to the M4L devices described, designed by sv1: https://github.com/00ff1a/sv1-m4l

sv1 is a Texas-based musician and sound artist. His recent single, "itallwashedover", is available on Bandcamp.

You can follow him on Twitter @sv1___ and Instagram @sv1.earth

2022 © Whiston Digital / Lux Media | luxcache.com