I couldn't run your program because I don't have the GD::Graph module (I couldn't install it and don't have time right now to fix that), but I took the core loop of your program and adapted it. The results are not satisfactory: it labels some points as true extrema that aren't, and it also misses some of the real ones.
In the interim, I plotted the second and third columns against the index on separate graphs using my first algorithm and posted the results here & here respectively.
Perhaps you could load one or both of these images into a simple graphics editor and annotate the misses and false hits for us. I can see, for example, that the last 5 periods on the first graph above have false hits, but I hadn't noticed any misses.
Update: Now I have: the 12th period on the first graph is a miss.
But maybe I'm plotting the wrong values? Maybe you are working with the midpoints of the 2nd & 3rd columns?
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
How automated does this have to be? In other words, is it acceptable to your process to supply some initial parameters, derived by inspection, to the processing?
Because your dataset has the signal you are seeking to isolate superimposed upon a much lower-frequency, but still relatively high-volume, "carrier wave", with the whole thing overlain by a considerable amount of high-frequency noise, single-pass filtering (low-pass, high-pass, band-pass, or simple smoothing) doesn't achieve the goal. And the asymmetric nature of the saw-tooth, combined with the variability of the frequency and the strength of both the random noise and the underlying carrier wave, makes isolating the required signal quite complex.
However, if all the waveforms you wish to analyse fit the same basic structure; or if it is acceptable to supply some initial parameters, derived by inspection, it considerably simplifies things.
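To make the structure concrete, here is a small Python sketch that synthesises a waveform of the kind described: an asymmetric saw-tooth (fast attack, slow decay) riding on a lower-frequency carrier, with random noise on top. All the numbers here are illustrative assumptions, not values taken from your data:

```python
import math
import random

def synth_sample(n=600, period=75.0, carrier_period=900.0):
    """Synthesise an asymmetric saw-tooth on a low-frequency carrier,
    plus random noise (illustrative parameters, not the real data)."""
    random.seed(42)                 # reproducible noise for the example
    data = []
    for i in range(n):
        phase = (i % period) / period            # 0..1 within one cycle
        if phase < 0.125:                        # fast attack: first 1/8th
            saw = phase / 0.125                  # rise 0 -> 1
        else:                                    # slow decay: rest of cycle
            saw = 1.0 - (phase - 0.125) / 0.875  # fall 1 -> 0
        carrier = 0.5 * math.sin(2 * math.pi * i / carrier_period)
        noise = random.uniform(-0.15, 0.15)
        data.append(saw + carrier + noise)
    return data
```

Plotting the output of this shows why a single filter pass struggles: the carrier shifts the whole saw-tooth up and down, and the noise creates spurious local extrema within each cycle.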
For example, this graph shows your sample data processed using 4 input parameters:
1. PERIOD: the approximate duration of 1 cycle.
Inspecting a simple plot of the data, you can see that peak 2 coincides with 100, and peak 5 with 400, giving ( 400 - 100 ) / 4 = 75. It's not a perfect fit, and the frequency drifts over the sample, but it's easy to derive and greatly simplifies the process.
2. IDIR: the initial direction (trend) of the first cycle in the data, which here is down (set IDIR=1 for up).
This could be derived by comparing the relative ordering of the extrema found within the first PERIOD, but it is so easy for the human eye to see that it's better done by inspection. It also doesn't make a huge difference if you specify it wrong, as the algorithm syncs after the first cycle, but it does ensure a clean first cycle.
3. The attack shift, here 3. This indicates that the attack of the waveform is fast; i.e. the next maximum occurs within 1/8th (1/2**3) of a cycle from the preceding minimum.
It is used to suppress consideration of a possible maximum until at least (PERIOD >> 3) time units have been collated. It helps exclude single-point high-peak anomalies.
4. The decay shift, here 1. This indicates that the decay of the waveform is quite slow; i.e. the next minimum occurs not less than 1/2 a cycle (1/2**1) after the preceding maximum.
It is used to delay consideration of local minima until we've got at least half way through the cycle.
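The way those four parameters might drive the extrema search can be sketched in Python as follows. This is my reconstruction of the idea, not the original code; the function and parameter names are mine:

```python
def find_extrema(data, period, idir=0, attack_shift=3, decay_shift=1):
    """Alternate between seeking a maximum and a minimum, but refuse to
    commit an extremum until enough of the cycle has elapsed since the
    previous one (period is an integer count of time units)."""
    min_gap_max = period >> attack_shift   # suppress maxima before this gap
    min_gap_min = period >> decay_shift    # suppress minima before this gap
    seeking_max = (idir == 1)              # idir=1: first trend is up
    extrema = []                           # list of (index, 'max' | 'min')
    anchor = 0                             # index of the last committed extremum
    best_i, best_v = 0, data[0]            # current candidate extremum
    for i in range(1, len(data)):
        v = data[i]
        if (v > best_v) if seeking_max else (v < best_v):
            best_i, best_v = i, v          # better candidate found
        gap = min_gap_max if seeking_max else min_gap_min
        # commit once we're far enough past the previous extremum AND the
        # data has moved well past the candidate without improving on it
        if i - anchor >= gap and i - best_i >= gap:
            extrema.append((best_i, 'max' if seeking_max else 'min'))
            anchor = best_i
            seeking_max = not seeking_max  # now hunt the opposite extremum
            best_i, best_v = i, v
    return extrema
```

On a clean triangle wave with period 40 and an initial upward trend, `find_extrema(data, 40, idir=1)` reports the maxima at the peaks and the minima at the troughs, while the gap thresholds prevent single-point noise spikes from being committed early.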
On the graph linked above, I've displaced the second dataset for clarity. The keys are:
Raw data: red for the first dataset; green for the second.
Moving average (used for crossover detection): blue and yellow respectively.
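For reference, a trailing moving average and a crossover test of the kind that might be used here can be sketched in Python. This is an assumption about the method (the graph only shows that a moving average is involved), and the function names are mine:

```python
def moving_average(data, window):
    """Trailing moving average; the window grows until it reaches
    `window` points, then slides."""
    out = []
    total = 0.0
    for i, v in enumerate(data):
        total += v
        if i >= window:
            total -= data[i - window]      # drop the value leaving the window
        out.append(total / min(i + 1, window))
    return out

def crossovers(data, window):
    """Indices where the raw data crosses its own moving average,
    detected by a sign change in (data - average)."""
    ma = moving_average(data, window)
    return [i for i in range(1, len(data))
            if (data[i - 1] - ma[i - 1]) * (data[i] - ma[i]) < 0]
```

Each crossover marks the raw curve passing through its smoothed version, which gives one candidate sync point per half-cycle even when the raw data is noisy.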