How to compare 2 wav files.

by shadox (Priest)
on May 27, 2002 at 20:44 UTC ( #169641=perlquestion )
shadox has asked for the wisdom of the Perl Monks concerning the following question:

Hi guys, I am working on a project and need a little help. One part of the project is to compare 2 wav files: I need to get a match when the 2 files have the same sound or the same music. The files will be small (about 5 seconds). I looked on CPAN but didn't find anything that could help me. Has any of you done this before, or which module can I use for it?
Update: This project is not for speech recognition; it is for comparing 2 small wav files with music.

Optimus magister, bonus liber

Replies are listed 'Best First'.
(jcwren) Re: How to compare 2 wav files.
by jcwren (Prior) on May 27, 2002 at 20:52 UTC

    This is a non-trivial task, although not impossible.

    Basically, you'd need to run a sliding window FFT/DFT (Fast Fourier Transform/Discrete Fourier Transform) and look at the spectral energy density of various frequency groups. If the energy density is the same for both samples at the same time, you could consider them the same.
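A minimal sketch of the sliding-window idea in plain Perl, assuming the samples are already decoded into an array of numbers. It uses a naive O(n^2) DFT rather than a real FFT, and the names `band_energies` and `spectrogram`, the window size, the hop and the band count are all illustrative choices, not anything prescribed in this thread.

```perl
use strict;
use warnings;

use constant PI => 4 * atan2(1, 1);

# Energy in each frequency band for one window of samples (naive DFT).
sub band_energies {
    my ($window, $bands) = @_;
    my $n      = @$window;
    my $half   = int($n / 2);            # bins above n/2 mirror the lower ones
    my @energy = (0) x $bands;
    for my $k (1 .. $half) {             # bin 0 (the DC offset) is skipped
        my ($re, $im) = (0, 0);
        for my $t (0 .. $n - 1) {
            my $angle = 2 * PI * $k * $t / $n;
            $re += $window->[$t] * cos($angle);
            $im -= $window->[$t] * sin($angle);
        }
        my $band = int(($k - 1) * $bands / $half);
        $energy[$band] += $re * $re + $im * $im;
    }
    return \@energy;
}

# Slide a window over the samples and collect one band-energy vector per step.
sub spectrogram {
    my ($samples, $size, $hop, $bands) = @_;
    my @frames;
    for (my $start = 0; $start + $size <= @$samples; $start += $hop) {
        my @window = @{$samples}[ $start .. $start + $size - 1 ];
        push @frames, band_energies(\@window, $bands);
    }
    return \@frames;
}

# Usage: a pure tone, 64-sample windows, 50% overlap, 4 bands.
my @tone   = map { sin(2 * PI * 20 * $_ / 64) } 0 .. 127;
my $frames = spectrogram(\@tone, 64, 32, 4);
printf "%d frames of %d bands\n", scalar @$frames, scalar @{ $frames->[0] };
```

Two clips could then be compared frame by frame on these vectors. How close the energies must be to count as "the same" is a threshold you would have to tune.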

    FFT/DFTs are not difficult to implement, and while I'd tend to do it in C, I think Perl should do it pretty well. There are dozens of FFT/DFT implementations on the 'net. Decoding the .WAV format is pretty simple, also.
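To illustrate the WAV decoding step, here is a rough sketch that walks the RIFF chunks with `unpack`. It assumes uncompressed PCM and that the whole file has been read into a string in binary mode. `parse_wav` and `pcm16_samples` are hypothetical helper names, and the in-memory file at the end is built only to exercise them.

```perl
use strict;
use warnings;

# Return (sample_rate, channels, bits_per_sample, raw_pcm_bytes) from WAV data.
sub parse_wav {
    my ($bytes) = @_;
    die "too short to be a WAV file\n" if length($bytes) < 12;
    my ($riff, undef, $wave) = unpack 'A4 V A4', $bytes;
    die "not a RIFF/WAVE file\n" unless $riff eq 'RIFF' && $wave eq 'WAVE';

    my ($fmt, $pcm);
    my $pos = 12;                                   # chunks follow the 12-byte header
    while ($pos + 8 <= length $bytes) {
        my ($id, $size) = unpack 'a4 V', substr($bytes, $pos, 8);
        my $body = substr($bytes, $pos + 8, $size);
        $fmt = $body if $id eq 'fmt ';
        $pcm = $body if $id eq 'data';
        $pos += 8 + $size + ($size % 2);            # chunks are padded to even length
    }
    die "missing fmt or data chunk\n" unless defined $fmt && defined $pcm;

    # fmt chunk: format tag, channels, sample rate, byte rate, block align, bits
    my ($format, $channels, $rate, undef, undef, $bits) = unpack 'v v V V v v', $fmt;
    die "only uncompressed PCM is handled\n" unless $format == 1;
    return ($rate, $channels, $bits, $pcm);
}

# Decode 16-bit little-endian PCM into signed integers.
sub pcm16_samples { return [ unpack 's<*', $_[0] ] }

# Usage: build a tiny mono 8 kHz file in memory and read it back.
my $data = pack 's<*', 0, 1000, -1000, 32767;
my $fmt  = pack 'v v V V v v', 1, 1, 8000, 16000, 2, 16;
my $wav  = 'RIFF' . pack('V', 4 + 8 + length($fmt) + 8 + length($data)) . 'WAVE'
         . 'fmt ' . pack('V', length $fmt) . $fmt
         . 'data' . pack('V', length $data) . $data;
my ($rate, $channels, $bits, $raw) = parse_wav($wav);
print "$rate Hz, $channels channel(s), $bits bits: @{ pcm16_samples($raw) }\n";
```

For a real file you would open it, call `binmode` on the handle, and slurp the contents before passing them in. Stereo data is interleaved, so the sample list would still need splitting per channel.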

    About the only problem I can really see is if you have two .WAV files sampled at different rates (e.g., one is 44.1 kHz and the other 10 kHz). In that case, you'd have to resample one to match the other. Still not overly difficult, but an added step.
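The resampling step might look roughly like this, using plain linear interpolation. It is only a sketch, and `resample_linear` is a made-up name. When going down in rate, a proper resampler would low-pass filter first to avoid aliasing, which this does not do.

```perl
use strict;
use warnings;

# Resample by linear interpolation between neighbouring input samples.
sub resample_linear {
    my ($samples, $from_rate, $to_rate) = @_;
    return [] unless @$samples;
    my $out_len = int(@$samples * $to_rate / $from_rate);
    my @out;
    for my $i (0 .. $out_len - 1) {
        my $pos  = $i * $from_rate / $to_rate;      # fractional index into the input
        my $lo   = int($pos);
        my $hi   = $lo < $#{$samples} ? $lo + 1 : $lo;
        my $frac = $pos - $lo;
        push @out, $samples->[$lo] * (1 - $frac) + $samples->[$hi] * $frac;
    }
    return \@out;
}

# Usage: double the rate of a short ramp.
my $up = resample_linear([ 0, 10, 20, 30 ], 2, 4);
print "@$up\n";
```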

    Welcome to DSP-101



      UPDATE: late at night and didn't check the date to see that it was a zombie thread... post is still relevant in case someone looks...

      FFT/DFTs are not difficult to implement, and while I'd tend to do it in C, I think Perl should do it pretty well.

      It's not difficult, but I wouldn't bother implementing FFTs myself at all anymore, except as an exercise in implementing FFTs or if I needed to own the code. There are free libraries available, such as FFTW, that are already debugged, documented, and reasonably optimized. FFTW is pretty speedy: about 0.06 seconds to do a 2048x2048 2D-FFT on a new-ish (i7) MacBook.

      when and where to use FFT..??
Re: How to compare 2 wav files.
by arunhorne (Pilgrim) on May 27, 2002 at 23:06 UTC

    Fast Fourier Transform is a good option, but you might also consider Hidden Markov Models (HMMs). These statistical models are commonly used for speech recognition tasks and have the advantage over FFT that they are less likely to be fooled by an off-target sample point. HMMs build a probabilistic model of a pattern (in this case a sound wave) and will provide you with a likelihood that the sound wave it is given matches the training set.

    To apply the idea to your problem, use the first wave file as the training set and the second as the test set. If the HMM returns a probability for the test set greater than, say, 0.9, consider them equal. This probabilistic approach will serve you well in this case. For example, with FFT, two identical wave files recorded at different frequencies may not be matched, whereas an HMM should be able to encapsulate this difference.

    Here are some links to get you started. However, be aware that what you are attempting is non-trivial, as jcwren points out, and many people devote their entire degrees/PhDs to this area... would it be better if you just used a human? There are times when a computer isn't the best solution, and knowing when to recognise this can be key to many Artificial Intelligence tasks...

    HMM Tutorial

    HMMs for Speech

    Hope this helps, or at least touches the tip of the ice-berg.

Re: How to compare 2 wav files.
by graff (Chancellor) on May 28, 2002 at 02:43 UTC
    Personally, I would not view this as a Perl question, nor as a problem best solved using Perl -- except for building any sort of "wrapper" utility that would make it easier to use the existing tools that are available in C and C++.

    For example, has a fairly comprehensive set of signal processing tools (including an HMM toolkit). These tend to prefer raw pcm data so you can use SoX to strip the WAV headers (it also does a lot of other useful stuff -- you need it anyway).

    A lot will depend on the scope and actual nature of your project: how many files to compare, what criteria define "same" vs. "not same", how confusable the samples are on these criteria, what error rate is acceptable. If you're looking for cases of two files that replicate the same portion of a single digital source with little or no alteration, then DSP approaches are likely to succeed quite well -- but any other condition will have a measurable error rate on both "same" and "not-same" decisions.

    Another approach to consider, if the job allows it, would be to build a Perl/Tk interface that makes it very easy, fast and efficient for a human to compare the audio files and make the decisions.

    Update: It's not at all clear to me that HMMs are appropriate for classifying music data. The first thing to try should probably just be comparing DFT vectors, both "narrow band" (long analysis window) and "wide band" (short window). I believe the ISIP toolkit includes a vector quantization process, which will make the statistical assessments easier.

Re: How to compare 2 wav files.
by thor (Priest) on May 27, 2002 at 22:49 UTC
    As jcwren says above, DFT is probably the way to go. Make sure, however, that your clips have the same length. If you have the same sound that in one sample is 1/4 of a second longer than the other, and you take a sample point every 1/2 second, then you will be taking completely different sample points and your algorithm will identify the sounds as different. That may be construed as a feature, however.
Re: How to compare 2 wav files.
by toma (Vicar) on May 29, 2002 at 04:59 UTC
    Here is another approach that hasn't been mentioned yet. No AI, no feature extraction, no FFT, and it will work great! It is quite slow, though.

    Imagine that you have a bunch of sound samples. These samples are either positive or negative. If you take another copy of the samples and slide them across the original samples, they will line up at one particular instant in time. A nice way to get the computer to "see" this moment is to multiply the samples from each waveform together, sample by sample. Then, add up the products. This works because the lined-up samples will all turn into positive numbers (instead of the random mix of positive and negative numbers when they don't line up.) All these positive numbers will add up to a really big positive number, which is called a correlation spike.

    This algorithm of sliding the samples across each other, multiplying them sample-by-sample, and adding them together is called cross-correlation (it is convolution with one of the two signals reversed in time). It works great but requires a large amount of computer power. If you have a really hot machine, or you are patient, it should work fine.
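The slide, multiply and add procedure can be sketched directly in Perl. This is a brute-force illustration, not tuned code, and `cross_correlate` and `normalized_peak` are names invented for the example. Dividing the peak by the signal energies is an added assumption, included so the score does not depend on how loud the clips are.

```perl
use strict;
use warnings;

# Slide @$y across @$x and return the lag with the largest sum of products.
sub cross_correlate {
    my ($x, $y) = @_;
    my ($best_lag, $peak);
    for my $lag (-$#{$y} .. $#{$x}) {
        my $sum = 0;
        for my $j (0 .. $#{$y}) {
            my $i = $j + $lag;                      # sample of x lined up with y[j]
            next if $i < 0 || $i > $#{$x};
            $sum += $x->[$i] * $y->[$j];
        }
        ($best_lag, $peak) = ($lag, $sum) if !defined $peak || $sum > $peak;
    }
    return ($best_lag, $peak);
}

# Peak scaled by the signal energies: 1 means a perfect match at some lag.
sub normalized_peak {
    my ($x, $y) = @_;
    my (undef, $peak) = cross_correlate($x, $y);
    my ($ex, $ey) = (0, 0);
    $ex += $_ ** 2 for @$x;
    $ey += $_ ** 2 for @$y;
    return $ex && $ey ? $peak / sqrt($ex * $ey) : 0;
}

# Usage: the pattern (1, -2, 3) sits two samples into the longer signal.
my ($lag, $peak) = cross_correlate([ 0, 0, 1, -2, 3, 0, 0 ], [ 1, -2, 3 ]);
print "best lag $lag, peak $peak\n";    # best lag 2, peak 14
```

The inner loop runs once per pair of samples, so two 5-second clips at CD rate mean tens of billions of multiplications. That is the cost the paragraph above warns about.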

    A short-cut for this procedure takes advantage of the Fast Fourier Transform (FFT). This amazing algorithm allows you to transform a convolution in the time domain into a multiplication in the frequency domain. To get the benefits of this algorithm, you will need to learn about window functions, the effect of sample rates, and some other gory details. It will be *much* faster, but also much more work to learn how to use.

    To solve your particular problem, you will need to convolve each possibly new sample against each song in your collection, looking for a match. For the FFT algorithm, you can compute the FFT of each song only once, and store it. These stored FFT samples are multiplied by the FFT of the new song. You get the same value for the correlation spike when you multiply the FFTs as when you do the convolution. If you get the huge spike in either the time or the frequency domain, the songs are the same.
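A rough pure-Perl sketch of the FFT shortcut, for illustration only. It assumes real-valued input and zero-pads to a power of two. Strictly speaking, the correlation comes from multiplying one spectrum by the complex conjugate of the other and transforming the product back. The recursive `fft` and the `fft_correlate` helper are written for this example; PDL or FFTW would be the sensible choice for real data.

```perl
use strict;
use warnings;

use constant PI => 4 * atan2(1, 1);

# Recursive radix-2 FFT over [re, im] pairs; the length must be a power of 2.
# $sign is -1 for the forward transform, +1 for the (unscaled) inverse.
sub fft {
    my ($in, $sign) = @_;
    my $n = @$in;
    return [ map { [@$_] } @$in ] if $n == 1;
    my $even = fft([ @{$in}[ grep { !($_ & 1) } 0 .. $n - 1 ] ], $sign);
    my $odd  = fft([ @{$in}[ grep { $_ & 1 } 0 .. $n - 1 ] ], $sign);
    my @out;
    for my $k (0 .. $n / 2 - 1) {
        my $angle = $sign * 2 * PI * $k / $n;
        my ($c, $s)   = (cos($angle), sin($angle));
        my ($or, $oi) = @{ $odd->[$k] };
        my ($tr, $ti) = ($c * $or - $s * $oi, $c * $oi + $s * $or);
        my ($er, $ei) = @{ $even->[$k] };
        $out[$k]          = [ $er + $tr, $ei + $ti ];
        $out[$k + $n / 2] = [ $er - $tr, $ei - $ti ];
    }
    return \@out;
}

# Cross-correlate two real signals via the FFT; returns (best_lag, peak).
sub fft_correlate {
    my ($x, $y) = @_;
    my $n = 1;
    $n *= 2 while $n < @$x + @$y - 1;               # pad so lags cannot wrap around
    my $fx = fft([ map { [ $x->[$_] // 0, 0 ] } 0 .. $n - 1 ], -1);
    my $fy = fft([ map { [ $y->[$_] // 0, 0 ] } 0 .. $n - 1 ], -1);
    my @prod = map {                                # X times conj(Y), bin by bin
        my ($xr, $xi) = @{ $fx->[$_] };
        my ($yr, $yi) = @{ $fy->[$_] };
        [ $xr * $yr + $xi * $yi, $xi * $yr - $xr * $yi ];
    } 0 .. $n - 1;
    my $r = fft(\@prod, +1);
    my ($best, $peak) = (0, $r->[0][0] / $n);       # inverse needs the 1/n scale
    for my $m (1 .. $n - 1) {
        my $v = $r->[$m][0] / $n;
        ($best, $peak) = ($m, $v) if $v > $peak;
    }
    $best -= $n if $best >= @$x;                    # upper indices are negative lags
    return ($best, $peak);
}

my ($lag, $peak) = fft_correlate([ 0, 0, 1, -2, 3, 0, 0 ], [ 1, -2, 3 ]);
printf "best lag %d, peak %.3f\n", $lag, $peak;
```

Storing `$fy` for each known song, as suggested above, would save one of the three transforms per comparison.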

    The sliding FFT that jcwren mentions solves an important problem with the FFT. Imagine that two songs start with identical notes, but they are played in a different order. An ordinary FFT cannot distinguish between these two songs. The sliding FFT will fix this. The sliding FFT is yet more complicated than the ordinary FFT, so I wouldn't recommend it as a first project in signal processing.

    For doing this type of number-crunching in perl I use the PDL modules. They are well worth the trouble to install.

    Update: See Analyzing WAV Files with Perl for FFT usage.

    It should work perfectly the first time! - toma

      I have to do a real FFT on a wav file. How do I give the wav file as input to the realFFT method?
        Hi, It's really not that easy, but such tool exists: It can compare two audio files and give % of similarity whether you want to test your codec quality or just compare two audio files (like original and received at destination of VoIP channel). It's also available for Linux. Hope this helps. Regards, Vallu
Re: How to compare 2 wav files.
by jotti (Scribe) on May 28, 2002 at 20:43 UTC
    Of course there is always the trivial case when we are talking about two wave files that both are originally the same sample. In that case we simply search for similar byte patterns. But, as most comments seem to take for granted, shadox is probably talking about two different samples, say two mikes picking up the same sound source, right?
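For the trivial case, the core module File::Compare already does a byte-for-byte check. A minimal sketch follows; `same_bytes` is just a wrapper name chosen here. Note that two files holding identical audio can still differ in their header chunks, so a negative result does not prove the sound is different.

```perl
use strict;
use warnings;
use File::Compare qw(compare);

# compare() returns 0 when the files are identical, 1 when they differ, -1 on error.
sub same_bytes {
    my ($path_a, $path_b) = @_;
    my $result = compare($path_a, $path_b);
    die "cannot compare $path_a and $path_b: $!\n" if $result < 0;
    return $result == 0;
}
```

Calling `same_bytes('audio1.wav', 'audio2.wav')` with two file names of your own returns true only when every byte matches.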
      Well not totally right :)
      I will have some wav files about 5 seconds in length (each file), and a friend's server will send me a wav file (5 seconds too). Then I will compare his file with the files I have and I will say "That song is ......" or "I don't know that song", and my program will "learn" that song. This is just a learning project (not a college or work project (: ) and I really appreciate all your help, guys.
      Optimus magister, bonus liber
        Okay, if you and your friend agree on which 5-sec portion to compare (e.g. always use the first 5 sec, not counting any initial silence that might be present), then you have a fairly good chance of building a DFT-based discriminator/identifier with a pretty good success rate.
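Picking "the first 5 seconds, not counting initial silence" could be sketched as below. It assumes decoded mono samples, and the amplitude threshold that counts as silence is an arbitrary value you would have to choose. `leading_clip` is a hypothetical helper name.

```perl
use strict;
use warnings;

# Drop leading samples quieter than $threshold, then keep at most $seconds of audio.
sub leading_clip {
    my ($samples, $rate, $seconds, $threshold) = @_;
    my $start = 0;
    $start++ while $start < @$samples && abs($samples->[$start]) < $threshold;
    my $end = $start + int($seconds * $rate) - 1;
    $end = $#{$samples} if $end > $#{$samples};
    return [ @{$samples}[ $start .. $end ] ];
}

# Usage: at a toy rate of 2 samples per second, keep 2 seconds after the silence.
my $clip = leading_clip([ 0, 1, -1, 0, 50, -60, 70, 80, 90 ], 2, 2, 10);
print "@$clip\n";    # 50 -60 70 80
```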

        In this case, Perl could be very handy for driving the DFT/VQ engine on your friend's audio file, doing data reduction on that output, and running or maybe even computing the suitable statistics to identify a "best match" in your local database of first-5-sec snippets.

        Just building your local database of "song signatures" will be a very instructive exercise, and you can use it for both "training" and "testing". I could go on... but it would all be speculative, and you should work it out for yourself.
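One way such a "song signature" lookup could be driven from Perl is sketched here. It assumes each song has already been reduced to a fixed-length feature vector (for example averaged band energies). The signatures, the cosine-similarity measure and the 0.9 cut-off are all made up for illustration.

```perl
use strict;
use warnings;

# Cosine similarity between two equal-length feature vectors (1 = same direction).
sub cosine {
    my ($u, $v) = @_;
    my ($dot, $uu, $vv) = (0, 0, 0);
    for my $i (0 .. $#{$u}) {
        $dot += $u->[$i] * $v->[$i];
        $uu  += $u->[$i] ** 2;
        $vv  += $v->[$i] ** 2;
    }
    return $uu && $vv ? $dot / sqrt($uu * $vv) : 0;
}

# Return the best-matching title, or undef when nothing reaches $min_score.
sub identify {
    my ($db, $signature, $min_score) = @_;
    my ($best_title, $best_score) = (undef, $min_score);
    for my $title (sort keys %$db) {
        my $score = cosine($db->{$title}, $signature);
        ($best_title, $best_score) = ($title, $score) if $score >= $best_score;
    }
    return $best_title;
}

my %db = (
    'song a' => [ 9, 1, 0, 2 ],     # made-up signatures for illustration
    'song b' => [ 0, 3, 8, 1 ],
);
my $incoming = [ 8, 2, 0, 2 ];
my $title    = identify(\%db, $incoming, 0.9);
print defined $title ? "That song is $title\n" : "I don't know that song\n";
$db{'new song'} = $incoming unless defined $title;    # "learn" the unknown song
```

With these numbers the incoming vector scores about 0.99 against 'song a' and about 0.11 against 'song b', so the first branch is taken.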

        If you already have the file on your server, is it the same sample, e.g. extracted from the same music CD with the same sample rate? Or might it be two different samples from two different LP records? Or could it be, e.g., two different recordings of Beethoven's 5th?
Re: How to compare 2 wav files.
by vallu (Initiate) on Dec 29, 2009 at 09:58 UTC
    Hi, have you tried AQuA Wideband? Judging by the technology presentation, comparing audio is not a trivial task at all. Although this software is not primarily intended for finding audio similarity, it does compare files and gives the percentage of their similarity. Besides, it's multiplatform, judging by the software and company blog:
      It is possible using PHP; please try this one:

        <?php
        $audio1 = file_get_contents('audio1.wav');
        $audio2 = file_get_contents('audio2.wav');
        // Byte-for-byte comparison: true only when the two files are identical.
        if ($audio1 === $audio2) echo "true"; else echo "false";
        ?>
Re: How to compare 2 wav files.
by Anonymous Monk on May 24, 2012 at 12:42 UTC

    It is possible in JavaScript, and by the same logic it is possible in any programming language. Here is a JavaScript function for this. If anybody needs it in PHP, please contact me.

    function compireAudio() {
        var audio1 = "";
        var audio2 = "";
        var i, j, d;
        var matching = 0;
        var t = 0;
        // Unpack audio1 into bits, last character first (reverse so it's like a stack).
        var audio1Arr = [];
        var audio1Len = audio1.length;
        for (i = 1; i <= audio1Len; i++) {
            d = audio1.charCodeAt(audio1Len - i);
            for (j = 0; j < 8; j++) {
                audio1Arr.push(d % 2);
                d = Math.floor(d / 2);
            }
        }
        // Walk audio2 the same way and count the bits that agree.
        var audio2Len = audio2.length;
        for (i = 1; i <= audio2Len; i++) {
            d = audio2.charCodeAt(audio2Len - i);
            for (j = 0; j < 8; j++) {
                if (d % 2 == audio1Arr[t]) { matching++; }
                d = Math.floor(d / 2);
                t++;
            }
        }
        // Matching bits as a percentage of the mean bit length of the two strings.
        var avarage = matching / ((t + audio1Arr.length) / 2) * 100;
        alert('The Matching with the two audio is ' + avarage + ' %.');
    }

      This code simply compares the two URLs, nothing to do with the audio they may or may not contain. The files are not retrieved, their contents are never compared. This in no way answers the question asked.

      I need a PHP script that compares more than 2 voices.

        If you need a solution in PHP, you're likely better off asking on another site. (Is there such a thing as Discussion of other languages and solutions written in them is not unheard of in the Monastery, but for the most part this is PerlMonks.

        Furthermore, the Monastery is not a free code-writing service. If you have a problem that you need help with, the monks will be happy to assist; but we will not do your work for you, especially for free. That's what paid consultants are for.

Node Type: perlquestion [id://169641]
Approved by jcwren
Front-paged by cLive ;-)