http://www.perlmonks.org?node_id=800734


in reply to Re: Finding local maxima/minima in noisy, pediodic data
in thread Finding local maxima/minima in noisy, pediodic data

Thanks.

I couldn't run your program because I don't have the GD::Graph module (I couldn't install it and don't have time right now to fix it), but I took the core loop of your program and adapted it. Well, it is not satisfactory.

It labels some spurious extrema as true ones, and it also misses some of the genuine ones.

It turns out that my test data generator was too simple. Here is a real example.

Re^3: Finding local maxima/minima in noisy, pediodic data
by BrowserUk (Patriarch) on Oct 12, 2009 at 16:39 UTC

    In the interim, I plotted the second and third columns against the index on separate graphs using my first algorithm and posted the results here & here respectively.

    Perhaps you could load one or both of these images into a simple graphics editor and annotate the misses and false hits for us. I can see, for example, that the last 5 periods on the first graph above have false hits.

    Update: at first I didn't notice any misses, but now I have. The 12th period on the first graph is a miss.

    But maybe I'm plotting the wrong values? Maybe you are working with the midpoints of the 2nd & 3rd columns?


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      The data file posted on pastebin contains data from two channels recorded during the same measurement. 2nd col. vs. 1st col. is one dataset and 3rd vs. 1st. is the other.

      So you are plotting the right values.

      You are also correct about misses and false hits.
      Anyway, here is your first graph with missed peaks marked with ugly blue arrows and false peaks marked with smaller but equally ugly gray arrows.

        How automated does this have to be? In other words, is it acceptable to your process to supply some initial parameters, derived by inspection, to the processing?

        Your dataset has the signal you are seeking to isolate superimposed upon a much lower-frequency, but still relatively high-amplitude, "carrier wave", with the whole thing overlain by a considerable amount of high-frequency noise. As a result, single-pass filtering (low-pass, high-pass, band-pass or simple smoothing) doesn't achieve the goal. The asymmetric nature of the sawtooth, combined with the variability of the frequency and the strength of both the random noise and the underlying carrier wave, makes isolating the required signal quite complex.

        However, if all the waveforms you wish to analyse fit the same basic structure; or if it is acceptable to supply some initial parameters, derived by inspection, it considerably simplifies things.

        For example, this graph shows your sample data processed using 4 input parameters:

        1. -PERIOD=75;

          The approximate duration of 1 cycle.

          Inspecting a simple plot of the data, you can see that peak 2 coincides with 100, and peak 5 with 400; giving ( 400 - 100 ) / 4 = 75. It's not a perfect fit, and the frequency drifts over the sample, but it's easy to derive and greatly simplifies the process.

        2. -IDIR=0;

          The initial direction (trend) of the first cycle in the data is down (set IDIR=1 for up).

          This can be derived by comparing the relative ordering of the extrema found within the first PERIOD, but it is so easy for the human eye to see that it's better done by inspection.

          It also doesn't make a huge difference if you specify it wrong, as the algorithm syncs after the first cycle, but it does ensure a clean first cycle.

        3. -ATTACK=3;

          This indicates that the attack of the waveform is fast, i.e. the next maximum occurs within 1/8th (1/2**3) of a cycle from the preceding minimum.

          It is used to suppress consideration of a possible maximum until at least (PERIOD >> 3) time units have been collated. It helps exclude single-point high-peak anomalies.

        4. -DECAY=1;

          This indicates that the decay of the waveform is quite slow, i.e. the next minimum occurs not less than 1/2 cycle (1/2**1) after the preceding maximum.

          It is used to delay consideration of local minima until we've got at least half way through the cycle.
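To make the arithmetic behind ATTACK and DECAY concrete, here is a minimal sketch in Python (not the Perl program that produced the graph). The function name `holdoff_windows` is made up for illustration; the only thing taken from the description above is that each window is PERIOD shifted right by the corresponding parameter.

```python
def holdoff_windows(period, attack, decay):
    """Return the two hold-off windows, in time units (samples).

    The first is how long to wait after a minimum before a maximum is
    considered (PERIOD >> ATTACK); the second is how long to wait after
    a maximum before a minimum is considered (PERIOD >> DECAY).
    The function name and return shape are illustrative assumptions.
    """
    return period >> attack, period >> decay


max_wait, min_wait = holdoff_windows(period=75, attack=3, decay=1)
print(max_wait, min_wait)  # -> 9 37
```

With PERIOD=75, a candidate maximum is therefore ignored for the first 9 samples after a minimum, and a candidate minimum for the first 37 samples after a maximum.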

        On the graph linked above, I've displaced the second dataset for clarity. The key is:

        • Raw data, red: first dataset; green: second dataset.
        • Respectively, blue & yellow: moving average (used for crossover detection).
        • Respectively, purple & cyan: resultant square waves.

        The algorithm now requires two passes for each dataset; the additional pass computes the moving average. This can be done in-line with the second pass, but it considerably complicates things.
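The two-pass structure can be sketched roughly as follows. This is a Python illustration, not the Perl program behind the graph, and it rests on several assumptions: the moving-average window is taken to be one PERIOD wide and centred, the square wave is high whenever the raw value is above the average, and a single hypothetical `min_gap` hold-off stands in for the ATTACK/DECAY/IDIR logic, which is not reproduced here.

```python
def moving_average(values, window):
    """Pass 1: centred moving average; the window is clipped at both ends."""
    n = len(values)
    prefix = [0]                      # prefix[i] = sum of values[:i]
    for v in values:
        prefix.append(prefix[-1] + v)
    half = window // 2
    averages = []
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, i - half + window)
        averages.append((prefix[hi] - prefix[lo]) / (hi - lo))
    return averages


def square_wave(values, period, min_gap=None):
    """Pass 2: turn crossings of the raw data over its average into a 0/1 wave.

    `min_gap` (an assumed parameter, default period // 4) is the minimum
    number of samples between two flips; it suppresses flips caused by
    isolated noisy samples right after a crossing.
    """
    if min_gap is None:
        min_gap = period // 4
    averages = moving_average(values, period)
    state = 0                         # assumed starting level: low
    since_flip = 0
    wave = []
    for v, avg in zip(values, averages):
        since_flip += 1
        if state == 0 and v > avg and since_flip >= min_gap:
            state, since_flip = 1, 0
        elif state == 1 and v < avg and since_flip >= min_gap:
            state, since_flip = 0, 0
        wave.append(state)
    return wave


# A clean synthetic wave: 5 low samples, 5 high samples, repeated.
data = ([0] * 5 + [10] * 5) * 6
print(square_wave(data, period=10))
```

The edges of the resulting square wave mark the crossover points; the extrema would then be searched for between consecutive edges.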


Re^3: Finding local maxima/minima in noisy, pediodic data
by BrowserUk (Patriarch) on Oct 12, 2009 at 16:15 UTC

    Could you explain how the 3 columns in this sample relate to the two-column data in your earlier post?

