comment on

Wow, so many answers already. :-)

The best way to find out if this should be a separate module is probably to let you guys decide, so here's the code...
This code parses some special list markup (similar to Wiki markup).
The scalarref ($text) is directly modified (I don't want to copy it and go $$text = $newtext).
If I wanted to write another parser function which does even more complicated modifications (which cannot be replaced by a regex), I'd have some duplicate code if this loop was not a separate module.
I know this isn't the prettiest piece of code mankind has written, but it seems to work (even with 3 byte UTF-8 characters / note that the line length changes).
I'd be happy if you could help me improve my code (even if there's a one-line alternative).

sub parseLists
{
    my $self = shift;
    my $text = \$self->{_text};

    # So we want to loop through the text line by line
    # and be able to modify some lines,
    # but we don't want to rebuild/copy the whole text.
    my $lf = "\n"; # linebreak
    my $lflen = length $lf; # 1
    my $pos1 = 0; # left line offset
    my $pos2 = 0; # right line offset (1st char after line)
    my $len = 0; # line length
    my $lendif = 0; # line length difference
    my $inlist;
    open my $fh, "<:utf8", $text; # Note how we open in UTF-8 mode
    # while (<$fh>) # Gets confused when line length changes
    # Using seek() is risky, because it reads bytes, not chars!
    # However, substr() always counts chars, not bytes.
    while (<$fh>)
    {
        # Get line string without newline character
        my $line = substr $_, 0, -$lflen;
        my $oldline = $line;

        # Calculate offsets
        $len = length $line;
        $pos1 = $pos2;
        $pos2 += $len + $lflen;

        # Modify line
        # START (not part of loop structure)
        my $isasterisk = $line =~ m/^\* /;
        my $isindented = $line =~ m/^\  /;
        my $isfirst;
        if (!$inlist)
        {
            if ($isasterisk)
            {
                $isfirst = 1;
                $inlist = 1;
            }
        }
        if ($inlist)
        {
            if (!$isindented && !$isasterisk)
            {
                substr $line, 0, 0, "</ul>\n";
                $inlist = 0;
            }
            elsif ($isindented)
            {
                $line =~ s/^\  (.*)/<li class="nobullet">$1<\/li>/;
            }
            elsif ($isasterisk)
            {
                $line =~ s/^\* (.*)/<li>$1<\/li>/;
                substr $line, 0, 0, "<ul>\n" if $isfirst;
            }
        }
        # END

        # Write new line back
        substr $$text, $pos1, $len, $line;

        # Calculate diff
        $lendif = (length $line) - ((length $_) - ($lflen));

        # Adjust our and Perl's (!) position counter
        $pos2 += $lendif; # That's our counter
        seek $fh, $lendif, 1; # That's from Perl / SEEK_CUR
    }
}
[download]

In reply to Re^2: Where to put self-made loop logic (separate module)? by basic6
in thread Where to put self-made loop logic (separate module)? by basic6

Are you posting in the right place? Check out Where do I post X? to know for sure.
Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
<code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
Want more info? How to link or How to display code and escape characters are good places to start.


Think about Loose Coupling
	PerlMonks