Byte repetition check

james28909 has asked for the wisdom of the Perl Monks concerning the following question:

Re: Byte repetition check by BrowserUk (Patriarch) on Dec 11, 2014 at 04:02 UTC
> In this code, it should match "1" 2 times, "2" 2 times, and "3" 3 times, which would be a total of 7 repeating bytes. The script only counts 5 repeating bytes. What am I missing? I know it's something simple probably. Any help would be appreciated :)

You are smart matching a byte against an array.

1. The array is initially empty, so you take the else branch and (re)empty it before pushing the character into it, and then print its content.
2. On the second pass, the array will contain one character, so you increment your count, print the array (which will still only have one character in it), and then empty it.
3. The array is now empty (again), so goto step 1.

So, step through the ten iterations of your loop on paper, recording the changes to @array and $count, and it will be very clear to you where you are going wrong.

(You also appear to have a newline character as the first line of your file, which probably isn't meant to be there?)
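The original script is not shown in this thread, so the following is only a guess at its shape, built from the three steps described above. The sub name `count_with_reset` is hypothetical, a plain `eq` against the single stored element stands in for the smartmatch, and the sample input is borrowed from a later reply. It does the "on paper" trace mechanically:

```perl
use strict;
use warnings;

# Hypothetical reconstruction of the reset-on-match logic described above.
# A plain string comparison stands in for the smartmatch against @array.
sub count_with_reset {
    my ($data, $trace) = @_;
    my @array;
    my $count = 0;
    for my $byte (split //, $data) {
        if (@array && $array[0] eq $byte) {
            $count++;            # matched the single stored byte
            @array = ();         # ...and then forgot it
        }
        else {
            @array = ($byte);    # empty the array, then store this byte
        }
        printf "byte=%s array=[%s] count=%d\n", $byte, "@array", $count
            if $trace;
    }
    return $count;
}

print count_with_reset("1112223333", 1), "\n";   # prints 4 after the trace lines
```

Because the array is emptied after every match, the byte that follows a match is never compared against its predecessor, so a run is undercounted. The exact undercount in the real script depends on its data and details that are not visible here.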
Re^2: Byte repetition check by james28909 (Deacon) on Dec 11, 2014 at 04:27 UTC
Nope, the first newline isn't supposed to be there. Thanks for pointing that out.
Re: Byte repetition check by ikegami (Patriarch) on Dec 11, 2014 at 03:56 UTC
```perl
my $count = 0;
my $last  = '';
# Read one byte at a time from the DATA handle.
while (read(DATA, my $byte, 1)) {
    ++$count if $byte eq $last;   # same as the previous byte: one more repeat
    $last = $byte;
}
```
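To try the loop above outside the original script, here is a minimal sketch that wraps it in a sub and feeds it an in-memory filehandle. The name `count_repeats` and the sample string are assumptions for illustration. A real file would also want `binmode` on the handle.

```perl
use strict;
use warnings;

# count_repeats is a hypothetical wrapper name; the loop body is the one above.
sub count_repeats {
    my ($fh) = @_;
    my $count = 0;
    my $last  = '';
    while (read($fh, my $byte, 1)) {
        ++$count if $byte eq $last;
        $last = $byte;
    }
    return $count;
}

my $sample = "1112223333";
open(my $fh, '<', \$sample) or die "Cannot open in-memory handle: $!";
print count_repeats($fh), "\n";   # prints 7
close($fh);
```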
Re: Byte repetition check by Anonymous Monk on Dec 11, 2014 at 05:05 UTC
Just don't use smartmatch... And if files are small, I don't see any reason to read them byte by byte.

```
$ perl -0777 -nE '
    my $count = 0;
    $count += length $2 while /(.)(\1+)/g;
    say $count;
' <<< "1112223333"
7
$ perldoc perlop | perl -0777 -nE '
    my $count = 0;
    $count += length $2 while /(.)(\1+)/g;
    say $count;
'
22788
```

(just don't decode your strings, and you'll have bytes... pretty much)
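For use inside a script, the same regex can be put in a sub. This is a sketch, with `count_repeats_regex` as a hypothetical name. One deliberate difference from the one-liner: it adds `/s`, because without it `.` does not match a newline, so runs of repeated newlines would go uncounted.

```perl
use strict;
use warnings;

sub count_repeats_regex {
    my ($data) = @_;
    my $count = 0;
    # (.) captures the first byte of a run, (\1+) the repeats that follow it.
    # /s lets . match "\n" as well.
    $count += length $2 while $data =~ /(.)(\1+)/gs;
    return $count;
}

print count_repeats_regex("1112223333"), "\n";   # prints 7
print count_repeats_regex("a\n\n\nb"), "\n";     # prints 2
```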
Re: Byte repetition check by thezip (Vicar) on Dec 11, 2014 at 21:17 UTC
Yet another WTDI:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;

my $data = '11122222344456788899';
my @data = split(//, $data);
my $accum = {};   # repeat count per character
my @buffer = ();

# Slide a two-element window over the characters.
push(@buffer, shift(@data));
while (@data) {
    push(@buffer, shift(@data));
    # eq, not ==, so that non-numeric bytes compare correctly
    if ($buffer[0] eq $buffer[1]) {
        $accum->{$buffer[0]}++;
    }
    shift(@buffer);
}
print Dumper($accum);
```
Re: Byte repetition check by james28909 (Deacon) on Dec 11, 2014 at 07:42 UTC
It seems doing it this way will not work out in the end. The issue is that when I move from reading 1 byte at a time to 2 bytes or more, what I think you call alignment becomes a problem. When 011110 is read 2 bytes at a time, $byte becomes 01, then 11, then 10. Common sense tells you there are repeating byte sets in there ('11' and '11'), but the script doesn't catch them when 2 bytes are read at a time, and the same goes for larger reads. Sooo... if I want to read 2 bytes at a time, I will have to read from the beginning of the file, calculate how many reps there are, then seek forward 1 byte and do it again. Would that take care of this "alignment" issue?
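One way to picture the alignment problem, as a rough sketch: compare each chunk with the chunk that immediately follows it, and vary how far the window advances. `count_chunk_repeats` is a hypothetical helper, and "a repeat" here is assumed to mean two equal adjacent chunks, which may not be exactly what the real task needs.

```perl
use strict;
use warnings;

# count_chunk_repeats is a hypothetical helper, not code from the thread.
# $width is the chunk size in bytes, $step is how far the window advances.
sub count_chunk_repeats {
    my ($data, $width, $step) = @_;
    my $count = 0;
    for (my $i = 0; $i + 2 * $width <= length($data); $i += $step) {
        $count++
            if substr($data, $i, $width) eq substr($data, $i + $width, $width);
    }
    return $count;
}

my $data = "011110";
print count_chunk_repeats($data, 2, 2), "\n";   # aligned reads: prints 0
print count_chunk_repeats($data, 2, 1), "\n";   # every byte offset: prints 1
```

Stepping one byte at a time finds the '11' '11' pair at offset 1 that the aligned pass misses. Note that overlapping windows can count the same long run more than once, so the totals are not directly comparable with the aligned count.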
Re^2: Byte repetition check by FloydATC (Deacon) on Dec 11, 2014 at 11:29 UTC
Why not read the file (or blocks of it, if it's too large to conveniently fit into memory) into a buffer and then process that buffer? This would add a couple of lines of code, but it would save you thousands of system calls and improve performance immensely. Looping through a buffer could then be done using `substr()` or by `split()`ting the buffer into an `@array` which you can then `foreach()` through.
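A minimal sketch of that block-buffered idea, using `substr()` to walk each buffer. The sub name `count_repeats_buffered` and the 64 KiB default are assumptions. The one thing to watch is the block boundary: the last byte of one block has to be remembered so it can be compared with the first byte of the next.

```perl
use strict;
use warnings;

# count_repeats_buffered is a hypothetical name; the block size is arbitrary.
sub count_repeats_buffered {
    my ($fh, $block_size) = @_;
    $block_size ||= 64 * 1024;
    my $count = 0;
    my $last;    # previous byte, carried across block boundaries
    while (my $got = read($fh, my $buf, $block_size)) {
        for my $i (0 .. $got - 1) {
            my $byte = substr($buf, $i, 1);
            $count++ if defined $last && $byte eq $last;
            $last = $byte;
        }
    }
    return $count;
}

# Tiny block size on an in-memory handle, to exercise the boundary handling.
my $sample = "1112223333";
open(my $fh, '<', \$sample) or die "Cannot open in-memory handle: $!";
print count_repeats_buffered($fh, 3), "\n";   # prints 7
close($fh);
```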