What I am going to speak about
Sound waves, frequencies, musical notes
Our tool: Fast Fourier Transform
Putting it to the test: Can my computer recognize pitch?
Waves and frequencies
If you whistle, sing or strum a guitar, you create vibrations in the air
If you look over a short interval in time, a vibration seems periodic
The period T is the time after which signal repeats itself
The frequency f is the reciprocal of the period: f = 1/T
Which pitch?
In western music, some frequencies have names:
...,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
...
The subscript denotes the octave
For the rest of the presentation, it is very useful to rename:
...
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
...
...
-1
0
1
2
3
4
5
6
7
8
9
10
11
12
13
...
We will write n for such a number, and we call it a semitone
So ... What is a G 6?
...
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
...
...
-1
0
1
2
3
4
5
6
7
8
9
10
11
12
13
...
Back to frequency names
...
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
...
...
-1
0
1
2
3
4
5
6
7
8
9
10
11
12
13
...
A lot of people remember
(= 9), corresponds to 440 Hz
Write this as
A semitone higher, means frequency increases by factor
:
So,
An octave higher means double the frequency!
We can give a formula for the frequency
f( 9 ) = 440 Hz
If k is a number larger than 0:
But the same formula holds for negative k
Putting everything together,
Hz = 440 ×
Hz
And from the frequency, we can go back to n
We found relationship between frequency f and note n
Hz
So,
= 
= 
n = 9 + 12
Summary
In western music, some frequencies got names, we gave alternative names:
...
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
...
...
-1
0
1
2
3
4
5
6
7
8
9
10
11
12
13
...
Then the frequency of semitone n is f( n ) = 440 ×
Hz
And if we have a frequency f, the corresponding semitone n is
n = 9 + 12
Table of Frequencies
![]()
-1
246.94 Hz
![]()
0
261.63 Hz
![]()
1
277.18 Hz
![]()
2
293.66 Hz
![]()
3
311.13 Hz
![]()
4
329.63 Hz
![]()
5
349.23 Hz
![]()
6
369.99 Hz
![]()
7
392.00 Hz
![]()
8
415.30 Hz
![]()
9
440.00 Hz
![]()
10
466.16 Hz
![]()
11
493.88 Hz
![]()
12
523.25 Hz
Alternative music notation
Besides the classical music notation, there are alternative suggestions
Very interesting ones make use of chromatic scales: in such a case, the distance between a semitone n and a semitone n+1 is always the same
My favorite notation graphs the semitones you hear as a function of time
Alternative music notation example
Let's continue with FFT now...
Let's continue with FFT now...
FFT has many many applications
Music Analysis and Speech Analysis
Image Analysis, Image compression
Simulation of fluids and many other physical systems
Filtering
Making money (Finance):
...
What FFT does...
FFT takes a periodic signal apart into easy building blocks
The building blocks are sine's and cosine's of different frequencies
, k=1,2,3, ...
If for instance, there is a lot of a building block with frequency
in the vibration of the air, you will probably also hear a sound with that frequency
Plot of
Plot of
Plot of
Plot of
Sampling: consider signal at N points in time
At: points t/T = 0/N, 1/N, ..., (N-1)/N
Store values in a list / array / vector
In[1]:=
Out[1]=
If you pick more points, it looks more like real plot
Sample sin(
) and put it in vector
, for k = 1, ..., N/2-1
Sample
and put it in vector
for k=0,...,N/2
Build a signal y from samples of cos and sin
Want to find
and
such that
Dot product
The dot product takes two vectors of same size N
and calculates a real number,
Some properties:
Dot product
Write |x| for the length of x. Then
If x and y are perpendicular,
Building components are perpendicular to each other
Also, for 0 < k < N/2
Finally,
It gives us a great trick
Remember: we want to find
and
such that
Take dot product of both sides with
So ![]()
!
The Transformation
You get the rest of the coefficients in the same way:
A little test
In[5]:=
In[6]:=
Out[6]=
In[2]:=
I'm sorry, this was not FFT
What I told you is true, and it works for our purposes, but it is not yet FAST
Rather than the FFT, I explained the Discrete Fourier Transform
Actually, there is no 'FFT'. There are a lot of ways to make the above algorithm fast, and they are all called FFT
A little FFT
The most famous version of FFT is the Cooley-Tukey algorithm, basic idea: (divide and conquer)
![]()
/ \
,
,
![]()
/ \ / \
FFT(
,
) FFT(
,
) FFT(
,
) FFT(
,
)
Putting it to the test
The setup
Voice or guitar to microphone in
Soundcard samples signal from microphone
A computer program asks soundcard for sample data
Program performs FFT
Program displays
for the different values of k
The computer program
Open source computer program Frequency Analyzer
Modified to the ‘note’ scale rather than the frequency scale
Conclusion
I hope you have some idea of what FFT is
If you are continuing in exact sciences, you will definitely meet FFT later in your life, it is extremely useful
There is a lot more to say and discover on this subject. I put a few references on the next slide
The slides will be available on the Csplash website, and on my website: www.cims.nyu.edu/~jim
On my website you can also find links to more information, and a link to Frequency Analyzer, so you can try it out for yourself
Appendix
For every integer k, k not a multiple of N/2,
For every integer k , k multiple of N/2
For every integers k and l,
And if k ≠ l,