Beefy Boxes and Bandwidth Generously Provided by pair Networks
good chemistry is complicated,
and a little bit messy -LW
 
PerlMonks  

comment on

( [id://3333]=superdoc: print w/replies, xml ) Need Help??
Here is another approach that hasn't been mentioned yet. No AI, no feature extraction, no FFT, and it will work great! It is quite slow, though.

Imagine that you have a bunch of sound samples. These samples are either positive or negative. If you take another copy of the samples and slide them across the original samples, they will line up at one particular instant in time. A nice way to get the computer to "see" this moment is to multiply the samples from each waveform together, sample by sample. Then, add up the products. This works because the lined-up samples will all turn into positive numbers (instead of the random mix of positive and negative numbers when they don't line up.) All these positive numbers will add up to a really big positive number, which is called a correlation spike.

This algorithm of sliding the samples across each other, multiplying them sample-by-sample, and adding them together is called convolution. It works great but requires a large amount of computer power. If you have a really hot machine, or you are patient, it should work fine.

A short-cut for this procedure takes advantage of the Fast Fourier Transform (FFT). This amazing algorithm allows you to transform a convolution in the time domain into a multiplication in the frequency domain. To get the benefits of this algorithm, you will need to learn about window functions, the effect of sample rates, and some other gory details. It will be *much* faster, but also much more work to learn how to use.

To solve your particular problem, you will need to convolve each possibly new sample against each song in your collection, looking for a match. For the FFT algorithm, you can compute the FFT of each song only once, and store it. These stored FFT samples are multiplied by the FFT of the new song. You get the same value for the correlation spike when you multiply the FFTs as when you do the convolution. If you get the huge spike in either the time or the frequency domain, the songs are the same.

The sliding FFT that jcwren mentions solves an important problem with the FFT. Imagine that two songs start with identical notes, but they are played in a different order. An ordinary FFT cannot distinguish between these two songs. The sliding FFT will fix this. The sliding FFT is yet more complicated than the ordinary FFT, so I wouldn't recommend it as a first project in signal processing.

For doing this type of number-crunching in perl I use the PDL modules. They are well worth the trouble to install.

Update: See Analyzing WAV Files with Perl for FFT usage.

It should work perfectly the first time! - toma


In reply to Re: How to compare 2 wav files. by toma
in thread How to compare 2 wav files. by shadox

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":



  • Are you posting in the right place? Check out Where do I post X? to know for sure.
  • Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
    <code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
  • Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
  • Want more info? How to link or How to display code and escape characters are good places to start.
Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others wandering the Monastery: (3)
As of 2024-04-19 20:15 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found