PerlMonks  

Finding length of line if have any position of any char inside line

by phoenix007 (Sexton)
on Oct 10, 2019 at 11:09 UTC (#11107298)

phoenix007 has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks. I want to find the length of a line, given any position within that line.

What I'm doing currently is building an array of the positions where "\n" occurs, then finding the pair of "\n" positions surrounding my position, and taking the difference between those two newline positions as the length of the line.

I want to optimise it to find the position of the newline before and after my position and compute the length from those, instead of building the array every time.

    my $string = "test\nI want length of this line\n test";
    my $position = 12; # Note: this position can be any position in any line; currently considering it inside the 2nd line

    my @newlines; # for storing \n positions
    push @newlines, 0;
    while ($string =~ /\n/g) {
        push @newlines, pos($string);
    }

    my $itr = scalar @newlines - 1;
    while ($newlines[$itr] > $position) {
        $itr--;
    }

    my $length_of_line = $newlines[$itr + 1] - $newlines[$itr];

    # Better efficient solution to find length of line if we have only any position inside it.

Thanks in advance!!!

Replies are listed 'Best First'.
Re: Finding length of line if have any position of any char inside line
by rjt (Deacon) on Oct 10, 2019 at 11:31 UTC

    Edit: Added Inline::C version and quick'n'dirty Test::More tests.

    Finding the next and previous newlines with index() and rindex() from the current $position will net you a good speedup (~200%), while a pure Inline::C solution will get you around ~600%. At the end of the day, you still have to scan left and right in the string. I included a very basic test framework, which highlights some unexpected cases in your version (op() in my example code).

    This is about as fast as you can go without changing your algorithm, as scanning character by character for the next/previous newlines is going to be necessary no matter what.
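    The full code from the readmore is not reproduced here, so the following is only a minimal sketch of the index()/rindex() idea, not the benchmarked rlindex sub itself. The name line_length_at is hypothetical, and the sketch assumes the trailing "\n" counts toward the length (which is what gives 27 for the OP's example) and that a position sitting on a newline belongs to the line that newline terminates.

```perl
use strict;
use warnings;

# Hypothetical helper (an illustration, not the benchmarked code):
# length of the line containing offset $pos, counting its trailing "\n" if present.
sub line_length_at {
    my ($string, $pos) = @_;

    # Start of line: one past the previous newline.
    # rindex returns -1 when there is none, which yields a start of 0.
    my $start = $pos > 0 ? rindex($string, "\n", $pos - 1) + 1 : 0;

    # End of line: one past the next newline, or end of string if there is none.
    my $nl  = index($string, "\n", $pos);
    my $end = $nl >= 0 ? $nl + 1 : length $string;

    return $end - $start;
}

print line_length_at("test\nI want length of this line\n test", 12), "\n";   # prints 27
```

    Only the two scans outward from $position touch the string, so nothing is stored between calls.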

    Benchmark Results:

        $> BENCHMARK=5 perl subline.pl
                        Rate    op tybalt89 daxim rlindex cur_strlen
        op          536066/s    --      -0%  -22%    -66%       -86%
        tybalt89    538753/s    1%       --  -22%    -66%       -86%
        daxim       688615/s   28%      28%    --    -57%       -82%
        rlindex    1585059/s  196%     194%  130%      --       -58%
        cur_strlen 3749674/s  599%     596%  445%    137%         --

    As your subroutine didn't do any error or out of bounds checking, or define what happens if C<$position> is on a line boundary already, I did not do anything with those edge cases. You probably will, though.

    Full code is in the readmore.

    use strict; use warnings; omitted for brevity.

      Thanks rjt. rlindex version looks good

Re: Finding length of line if have any position of any char inside line
by tybalt89 (Parson) on Oct 10, 2019 at 14:05 UTC

    TMTOWTDI - Why search for things when regex will do that for you? It may not be the fastest, or it may, I don't care, but does that really matter?

        #!/usr/bin/perl
        use strict; # https://perlmonks.org/?node_id=11107298
        use warnings;

        my $string = "test\nI want length of this line\n test";
        my $position = 12;

        pos($string) = $position;
        $string =~ /.*\G.*\n?/;
        my $length_of_line = length $&;
        print "length of line at $position is $length_of_line\n";

    Outputs:

    length of line at 12 is 27
Re: Finding length of line if have any position of any char inside line
by Corion (Pope) on Oct 10, 2019 at 11:18 UTC

    You could simply precalculate the length of the line when you store its position:

        ...
        my $len = pos($string) - $newlines[-1];
        push @newlines, [ pos($string), $len ];
        ...

    Also, you could do a binary search for the starting position instead of starting from the front or the back of the file, or even linearly estimate the position of the line, assuming that you know the average width of a line. Since you need to scan the string for newlines anyway, calculating the average line length is easy and estimating the likely position of the line as a start point for your search is also easy:

        my $avg_len = 0;
        # Parenthesized: "/" binds tighter than "+", so "/ 0+@newlines" would divide by zero.
        $avg_len = length($string) / (0+@newlines);
        my $estimated_line = int( $position / $avg_len );
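    The binary-search suggestion could look roughly like the sketch below, assuming the line-start offsets are collected once and then reused for many lookups. Both sub names are hypothetical, and the trailing "\n" is counted in the length, as in the OP's code.

```perl
use strict;
use warnings;

# Hypothetical helper: offsets at which each line starts (scan the string once).
sub build_line_starts {
    my ($string) = @_;
    my @starts = (0);
    push @starts, pos($string) while $string =~ /\n/g;
    return \@starts;
}

# Hypothetical helper: binary search for the last line start <= $pos,
# then take the distance to the next line start (or to the end of the string).
sub line_length_bsearch {
    my ($starts, $total_len, $pos) = @_;
    my ($lo, $hi) = (0, $#$starts);
    while ($lo < $hi) {
        my $mid = int(($lo + $hi + 1) / 2);   # round up so the loop always makes progress
        if ($starts->[$mid] <= $pos) { $lo = $mid }
        else                         { $hi = $mid - 1 }
    }
    my $end = $lo < $#$starts ? $starts->[$lo + 1] : $total_len;
    return $end - $starts->[$lo];
}

my $string = "test\nI want length of this line\n test";
my $starts = build_line_starts($string);
print line_length_bsearch($starts, length $string, 12), "\n";   # prints 27
```

    This only pays off when the same string is queried repeatedly; for a single lookup, building the array costs more than it saves.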
Re: Finding length of line if have any position of any char inside line
by daxim (Curate) on Oct 10, 2019 at 11:40 UTC
    You don't need to build an array.
        use experimental 'signatures';

        sub length_at_position_between($s, $p, $n = "\n") {
            my ($prev, $next);
            while ($s =~ /$n/g) {
                $prev = $next;
                $next = pos $s;
                last if $next > $p;
            }
            return $next - $prev;
        }

        print length_at_position_between ("test\nI want length of this line\n test", 12); # 27

      As with the OP's version that you based yours on, this fails several of these tests:

          my @tests = (
              # $string,      $pos, $expected, $description
              [ "\nString\n", 2,    6,         'Surrounded'  ],
              [ "String\n",   2,    6,         'No first \n' ],
              [ "\nString",   2,    6,         'No last \n'  ],
              [ "String",     2,    6,         'No \n'       ],
              [ "",           0,    0,         'Blank'       ],
          );

      Since the OP's version failed those as well, this is understandable. Obviously even more tests would be needed, but this was meant as more of a sanity check than anything.

      Performance-wise, here's where your solution fits in:

                          Rate op_ver daxim rlindex cur_strlen
          op_ver      564186/s     --  -23%    -66%       -86%
          daxim       737063/s    31%    --    -56%       -81%  <-- HERE
          rlindex    1677856/s   197%  128%      --       -57%
          cur_strlen 3932672/s   597%  434%    134%         --

      Of course, I am not sure what sort of performance impact (for better or worse) a fixed version of your code would have. Complete code for my test and benchmark suite is in my comment, if you want to have a go with it yourself.

          not ok 1 - daxim: Surrounded
          #   Failed test 'daxim: Surrounded'
          #   at /home/ryan/src/perlmonks/subline.pl line 37.
          #          got: '7'
          #     expected: '6'
          Use of uninitialized value $prev in subtraction (-) at /home/ryan/src/perlmonks/subline.pl line 64.
          not ok 2 - daxim: No first \n
          #   Failed test 'daxim: No first \n'
          #   at /home/ryan/src/perlmonks/subline.pl line 37.
          #          got: '7'
          #     expected: '6'
          Use of uninitialized value $prev in subtraction (-) at /home/ryan/src/perlmonks/subline.pl line 64.
          not ok 3 - daxim: No last \n
          #   Failed test 'daxim: No last \n'
          #   at /home/ryan/src/perlmonks/subline.pl line 37.
          #          got: '1'
          #     expected: '6'
          Use of uninitialized value $prev in subtraction (-) at /home/ryan/src/perlmonks/subline.pl line 64.
          Use of uninitialized value $next in subtraction (-) at /home/ryan/src/perlmonks/subline.pl line 64.
          not ok 4 - daxim: No \n
          #   Failed test 'daxim: No \n'
          #   at /home/ryan/src/perlmonks/subline.pl line 37.
          #          got: '0'
          #     expected: '6'
          Use of uninitialized value $prev in subtraction (-) at /home/ryan/src/perlmonks/subline.pl line 64.
          Use of uninitialized value $next in subtraction (-) at /home/ryan/src/perlmonks/subline.pl line 64.
          ok 5 - daxim: Blank
      use strict; use warnings; omitted for brevity.
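      For anyone who wants to have a go, here is one possible repair of the newline-scanning approach that should satisfy the five test cases listed above. It is an untested-for-speed sketch, not daxim's code: the sub name is hypothetical, it assumes the trailing "\n" is excluded from the length (as the expected values of 6 imply), and it has not been run through the benchmark.

```perl
use strict;
use warnings;

# Hypothetical repaired scan (not the original sub): length of the line
# containing offset $p, excluding the newline that terminates it.
sub length_at_position_fixed {
    my ($s, $p) = @_;
    my $start = 0;            # start of the candidate line
    my $end   = length $s;    # used when no newline follows $p
    while ($s =~ /\n/g) {
        my $nl = pos($s) - 1; # offset of the newline just matched
        if ($nl >= $p) {      # first newline at or after $p closes the line
            $end = $nl;
            last;
        }
        $start = $nl + 1;     # otherwise the next line starts right after it
    }
    return $end - $start;
}

print length_at_position_fixed("\nString\n", 2), "\n";   # prints 6
```

      Initialising $start and $end up front is what removes the "uninitialized value" warnings when a leading or trailing newline is missing.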

        Thanks. I have handled the edge cases. For simplicity I omitted them from my example, because I want to focus on the main problem first.

Re: Finding length of line if have any position of any char inside line
by jwkrahn (Monsignor) on Oct 10, 2019 at 19:58 UTC

    This appears to do what you require:

        my $string = "test\nI want length of this line\n test";
        my $position = 12; # Note: this position can be any position in any line; currently considering it inside the 2nd line

        my $length_of_line;
        if ( $string =~ /\n.*(?=\n)/ && $-[ 0 ] < $position && $+[ 0 ] > $position ) {
            $length_of_line = $+[ 0 ] - $-[ 0 ];
        }

      But the following gives no result even though the
          "\nwant length of this line\n"
      substring containing a character at offset 12 in the overall string matches the  /\n.*(?=\n)/ regex:

          c:\@Work\Perl\monks>perl -wMstrict -le
          "my $string = qq{test\nI\nwant length of this line\n test};
           my $position = 12;
           ;;
           my $length_of_line;
           if ( $string =~ /\n.*(?=\n)/ && $-[ 0 ] < $position && $+[ 0 ] > $position ) {
               $length_of_line = $+[ 0 ] - $-[ 0 ];
           }
           print $length_of_line;
          "
          Use of uninitialized value in print at -e line 1.


      Give a man a fish:  <%-{-{-{-<

Re: Finding length of line if have any position of any char inside line
by TieUpYourCamel (Sexton) on Oct 10, 2019 at 17:28 UTC
    My negligible contribution to this discussion is to point out that you don't mention if the \n should be included in the length count.
    I want length of this line
    is 26 characters without the newline.
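    A quick way to see the two counts side by side (a minimal illustration; the variable names are made up for this example):

```perl
use strict;
use warnings;

my $line = "I want length of this line\n";
chomp(my $bare = $line);        # copy of the line without its trailing newline

print length($line), "\n";      # prints 27 (newline included)
print length($bare), "\n";      # prints 26 (newline excluded)
```

    So the 27 reported elsewhere in this thread includes the newline.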
