Update links which are missing http:// prefix
jackschaedler committed May 17, 2015
1 parent 0fdad34 commit 605e255
Showing 1 changed file with 4 additions and 4 deletions.
8 changes: 4 additions & 4 deletions 2015-05-11-audio-dog-house.md
@@ -298,7 +298,7 @@ The trick behind this approach is to determine the distance between consecutive
</tr>
</table>

-The autocorrelation is a nifty signal processing trick for pitch estimation, but it has its drawbacks. One obvious problem is that the autocorrelation function tapers off at its left and right edges. The tapering is caused by fewer non-zero samples being used in the calculation of the dot product for extreme lag values. Samples that lie outside the original waveform are simply considered to be zero, causing the overall magnitude of the dot product to be attenuated. This effect is known as <i>biasing</i>, and can be addressed in a number of ways. In his excellent paper, <a href="miracle.otago.ac.nz/tartini/papers/A_Smarter_Way_to_Find_Pitch.pdf">"A Smarter Way to Find Pitch,"</a> Philip McLeod devises a strategy that cleverly removes this biasing from the autocorrelation function in a non-obvious but very robust way. When you've played around a bit with a simple implementation of the autocorrelation, I would suggest reading through this paper to see how the basic method can be refined and improved.
+The autocorrelation is a nifty signal processing trick for pitch estimation, but it has its drawbacks. One obvious problem is that the autocorrelation function tapers off at its left and right edges. The tapering is caused by fewer non-zero samples being used in the calculation of the dot product for extreme lag values. Samples that lie outside the original waveform are simply considered to be zero, causing the overall magnitude of the dot product to be attenuated. This effect is known as <i>biasing</i>, and can be addressed in a number of ways. In his excellent paper, <a href="http://miracle.otago.ac.nz/tartini/papers/A_Smarter_Way_to_Find_Pitch.pdf">"A Smarter Way to Find Pitch,"</a> Philip McLeod devises a strategy that cleverly removes this biasing from the autocorrelation function in a non-obvious but very robust way. When you've played around a bit with a simple implementation of the autocorrelation, I would suggest reading through this paper to see how the basic method can be refined and improved.
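
The tapering described above can be seen in a small sketch. The function names and the test signal here are illustrative assumptions, not taken from the article. The correction shown (dividing each lag by the number of overlapping samples) is just one simple way to address biasing; it is not McLeod's method, which his paper describes in detail. Only non-negative lags are computed, since the autocorrelation of a real signal is symmetric around lag zero.

```python
def autocorrelation(samples):
    """Naive autocorrelation for lags 0..N-1.

    Samples outside the window are treated as zero, so a lag of `lag`
    only sums over the N - lag overlapping sample pairs.
    """
    n = len(samples)
    return [
        sum(samples[i] * samples[i + lag] for i in range(n - lag))
        for lag in range(n)
    ]


def unbiased_autocorrelation(samples):
    """Divide each lag by its overlap count (one simple bias correction)."""
    n = len(samples)
    biased = autocorrelation(samples)
    return [biased[lag] / (n - lag) for lag in range(n)]


# Hypothetical test signal: a square-ish wave with a period of 2 samples.
signal = [1.0, -1.0] * 8

biased = autocorrelation(signal)
unbiased = unbiased_autocorrelation(signal)

print(biased[:4])    # [16.0, -15.0, 14.0, -13.0]  magnitude tapers with lag
print(unbiased[:4])  # [1.0, -1.0, 1.0, -1.0]      taper removed
```

Note that dividing by a small overlap count amplifies noise at large lags, which is one reason more careful approaches exist.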

Autocorrelation as implemented in its naive form is an <i>O(N<sup>2</sup>)</i> operation. This complexity class is less than desirable for an algorithm that we intend to run in real time. Thankfully, there is an efficient way to compute the autocorrelation in <i>O(N log(N))</i> time. The theoretical justification for this algorithmic shortcut is far beyond the scope of this article, but if you're interested, you should know that it's possible to compute the autocorrelation function using two FFT (Fast Fourier Transform) operations. You can read more about this technique in the footnotes.[^1g] I would suggest writing the naive version first, and using this implementation as a ground truth to verify a fancier, FFT-based implementation.
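
As a rough sketch of that suggestion, the following compares a naive autocorrelation against an FFT-based one. Everything here is an assumption made for illustration: the function names, the test signal, and the small recursive radix-2 FFT, which stands in for whatever optimized FFT routine your platform provides. The input is zero-padded to at least twice its length so that the circular correlation computed by the FFT matches the linear one.

```python
import cmath


def naive_autocorrelation(samples):
    """O(N^2) reference: one dot product per lag, zeros outside the window."""
    n = len(samples)
    return [
        sum(samples[i] * samples[i + lag] for i in range(n - lag))
        for lag in range(n)
    ]


def fft(values, inverse=False):
    """Recursive radix-2 FFT; len(values) must be a power of two.

    The inverse transform is left unscaled (the caller divides by N).
    """
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def fft_autocorrelation(samples):
    """Autocorrelation via two FFTs: forward, power spectrum, inverse."""
    n = len(samples)
    size = 1
    while size < 2 * n:  # pad so circular wrap-around cannot alias
        size *= 2
    padded = [complex(s) for s in samples] + [0j] * (size - n)
    spectrum = fft(padded)
    power = [abs(c) ** 2 for c in spectrum]
    result = fft(power, inverse=True)
    return [result[lag].real / size for lag in range(n)]


# Use the naive version as ground truth for the FFT-based one.
signal = [0.5, -1.0, 2.0, 3.0, -0.25]
expected = naive_autocorrelation(signal)
actual = fft_autocorrelation(signal)
matches = all(abs(a - b) < 1e-9 for a, b in zip(expected, actual))
print(matches)
```

In a real-time setting you would swap the toy `fft` for a tuned library routine and keep the naive function around purely as a test oracle.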

@@ -368,11 +368,11 @@ I hope that by now you have a sturdy enough theoretical toehold on the problem o

The approaches to pitch detection outlined in this article have been explored and refined to a great degree of finish by the academic signal processing community over the past few decades. In this article, we've only scratched the surface, and I suggest that you refine your initial implementations and explorations by digging deeper into two exceptional examples of monophonic pitch detectors: the SNAC and YIN algorithms.

-Philip McLeod's SNAC pitch detection algorithm is a clever refinement of the autocorrelation method introduced in this article. McLeod has found a way to work around the inherent biasing of the autocorrelation function. His method is performant and robust. I highly recommend reading McLeod's paper titled <a href="miracle.otago.ac.nz/tartini/papers/A_Smarter_Way_to_Find_Pitch.pdf">"A Smarter Way to Find Pitch"</a> if you want to learn more about monophonic pitch detection. It's one of the most approachable papers on the subject. There is also a wonderful tutorial and evaluation of McLeod's method available <a href="http://www.katjaas.nl/helmholtz/helmholtz.html">here</a>. I <i>highly</i> recommend poking around this author's website.
+Philip McLeod's SNAC pitch detection algorithm is a clever refinement of the autocorrelation method introduced in this article. McLeod has found a way to work around the inherent biasing of the autocorrelation function. His method is performant and robust. I highly recommend reading McLeod's paper titled <a href="http://miracle.otago.ac.nz/tartini/papers/A_Smarter_Way_to_Find_Pitch.pdf">"A Smarter Way to Find Pitch"</a> if you want to learn more about monophonic pitch detection. It's one of the most approachable papers on the subject. There is also a wonderful tutorial and evaluation of McLeod's method available <a href="http://www.katjaas.nl/helmholtz/helmholtz.html">here</a>. I <i>highly</i> recommend poking around this author's website.

-YIN was developed by de Cheveigné and Kawahara in the early 2000s, and remains a classic pitch estimation technique. It's often taught in graduate courses on audio signal processing. I'd definitely recommend reading <a href="audition.ens.fr/adc/pdf/2002_JASA_YIN.pdf">the original paper</a> if you find the topic of pitch estimation interesting. Implementing your own version of YIN is a fun weekend task.
+YIN was developed by de Cheveigné and Kawahara in the early 2000s, and remains a classic pitch estimation technique. It's often taught in graduate courses on audio signal processing. I'd definitely recommend reading <a href="http://audition.ens.fr/adc/pdf/2002_JASA_YIN.pdf">the original paper</a> if you find the topic of pitch estimation interesting. Implementing your own version of YIN is a fun weekend task.

-If you're interested in more advanced techniques for <i>polyphonic</i> fundamental frequency estimation, I suggest that you begin by reading Anssi Klapuri's excellent Ph.D. thesis on <a href="www.cs.tut.fi/sgn/arg/klap/phd/klap_phd.pdf">automatic music transcription</a>. In his thesis, he outlines a number of approaches to multiple fundamental frequency estimation, and gives a great overview of the entire automatic music transcription landscape.
+If you're interested in more advanced techniques for <i>polyphonic</i> fundamental frequency estimation, I suggest that you begin by reading Anssi Klapuri's excellent Ph.D. thesis on <a href="http://www.cs.tut.fi/sgn/arg/klap/phd/klap_phd.pdf">automatic music transcription</a>. In his thesis, he outlines a number of approaches to multiple fundamental frequency estimation, and gives a great overview of the entire automatic music transcription landscape.

If you're feeling inspired enough to start on your own dog house, feel free to <a href="https://twitter.com/JackSchaedler">contact me</a> on Twitter with any questions, complaints, or comments about the content of this article. Happy building!
