
I have a file that lists thousands of lines of interest in a large baseline of code. Each entry gives the module and the line number. When I get an updated list at a later date, because of changes to the baseline, some entries will stay the same, some will be new, some will have gone away, and some will have a different line number.

If a module has a line inserted or deleted between the two listings, the line number of interest will be different. Without doing a diff on the two versions of the module, it is difficult to tell whether a changed entry is due to a line number shift or to a deletion plus an insertion.
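To make the ambiguity concrete, here is a minimal sketch that compares two listings and reports the modules where entries both vanished and appeared. The "module line_number" entry format, the sample data, and the name ambiguous_modules are assumptions made for illustration, not my real list format.

```perl
use strict;
use warnings;

# Hypothetical entry format: "module line_number", one entry per element.
sub ambiguous_modules {
    my ( $old_list, $new_list ) = @_;
    my %in_old = map { $_ => 1 } @$old_list;
    my %in_new = map { $_ => 1 } @$new_list;

    # Count, per module, the entries that vanished and the entries that appeared.
    my ( %gone, %appeared );
    for my $entry ( grep { !$in_new{$_} } @$old_list ) {
        my ($module) = split ' ', $entry;
        $gone{$module}++;
    }
    for my $entry ( grep { !$in_old{$_} } @$new_list ) {
        my ($module) = split ' ', $entry;
        $appeared{$module}++;
    }

    # A module with both kinds of entry cannot be resolved from the lists alone.
    my @both = grep { $appeared{$_} } sort keys %gone;
    return @both;
}

my @old_list = ( "parser.c 10", "parser.c 42", "lexer.c 7" );
my @new_list = ( "parser.c 10", "parser.c 43", "lexer.c 7", "lexer.c 90" );
print "$_\n" for ambiguous_modules( \@old_list, \@new_list );    # prints parser.c
```

Here parser.c lost line 42 and gained line 43, which could be either a shift or a delete plus insert, while the new lexer.c entry is unambiguous because nothing in that module went away.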

Since I have thousands of lines tracked, doing manual diffs is not an option. So I wrote a Perl script to automate the process. I can give it the old module, the updated module, and the line number of interest. It will give me what that line number would be in the new module, or 0 if it is no longer there or changed. That way I can tell if it was a line number change, or an addition plus deletion instead.

It does a diff on the two files, then uses the hunk header lines that diff emits (n1an2, n1,n2cn3 and so on) to build an array that maps new line numbers to old ones. I then have to invert that array into another to get the mapping from old to new line numbers. Only then can I use the line number as an index into that array to get the line number in the updated module.
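For comparison, here is a rough sketch of going from old to new directly, by walking the hunk headers once and keeping a running offset instead of building and inverting an array. It is not the script I am using; the names parse_hunk_header and old_to_new_line are made up for the example, and the diff output is hard-coded rather than captured from diff.

```perl
use strict;
use warnings;

# Parse one normal-format diff header such as "5a6,7", "3,4d2" or "8c8,9".
# Returns (old_first, old_last, op, new_first, new_last), or an empty list
# for the "<", ">" and "---" lines.
sub parse_hunk_header {
    my ($line) = @_;
    return unless $line =~ /^(\d+)(?:,(\d+))?([acd])(\d+)(?:,(\d+))?$/;
    my ( $o1, $o2, $op, $n1, $n2 ) = ( $1, $2, $3, $4, $5 );
    $o2 = $o1 unless defined $o2;
    $n2 = $n1 unless defined $n2;
    return ( $o1, $o2, $op, $n1, $n2 );
}

# Map an old line number to its new line number, or 0 if the line was
# deleted or changed. The diff lines must be in the order diff emits them.
sub old_to_new_line {
    my ( $old_line, $diff_lines ) = @_;
    my $offset = 0;    # new minus old for unchanged lines seen so far
    for my $diff_line (@$diff_lines) {
        my ( $o1, $o2, $op, $n1, $n2 ) = parse_hunk_header($diff_line) or next;

        # An append touches no old lines: treat its old range as empty,
        # starting just after the line it follows.
        my $old_first = $op eq 'a' ? $o1 + 1 : $o1;
        my $old_count = $op eq 'a' ? 0 : $o2 - $o1 + 1;
        my $new_count = $op eq 'd' ? 0 : $n2 - $n1 + 1;

        return $old_line + $offset if $old_line < $old_first;
        return 0 if $old_line < $old_first + $old_count;
        $offset += $new_count - $old_count;
    }
    return $old_line + $offset;
}

# Diff output for an old file (A B C D) against a new one (C D X).
my @diff_output = ( "1,2d0", "< A", "< B", "4a3", "> X" );
print old_to_new_line( $_, \@diff_output ), "\n" for 1 .. 4;    # prints 0, 0, 1, 2
```

The offset only needs the hunks that come before the line of interest, so the loop can return as soon as it passes that line.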

Has anyone had to do something similar? It seems a little convoluted. Is there an easier way to determine what the new line number would be in a modified file? ( I hope I copied my listing correctly. )

#
# old_file revised_file line_number
#
# compares old_file to revised_file and will determine what a line number in old_file will be for revised_file
#
use strict;

my ( $old_file, $new_file, $line_number ) = @ARGV;
my @new_to_old;
my @old_to_new;
my $lines_in_new = 0;
my $lines_in_old = 0;

initialize_arrays();
determine_differences();
print "$old_to_new[$line_number]\n";

sub initialize_arrays{
    # Initialize arrays. Size is old + new to give shrink and grow room during calculations.
    # Just ignore extra when done.
    open my $new_file_handle, "<", $new_file or die "Unable to open $new_file\n";
    $lines_in_new++ while ( <$new_file_handle> );
    open my $old_file_handle, "<", $old_file or die "Unable to open $old_file\n";
    $lines_in_old++ while ( <$old_file_handle> );
    $old_to_new[$_] = 0 for ( 1 .. $lines_in_new + $lines_in_old );
    $new_to_old[$_] = $_ for ( 1 .. $lines_in_new + $lines_in_old );
}

sub determine_differences{
    for my $capture_line ( `diff $old_file $new_file` ) {
        # Only process lines starting with number. Ignore <, >, or --- lines.
        if ( $capture_line =~ /^(?:(\d+),)?(\d+)([acd])(?:(\d+),)?(\d+)$/ ) {
            # Added or changed lines in the new file have no old counterpart.
            $3 ne 'd' and $new_to_old[$_] = 0 for ( ( $4 ? $4 : $5 ) .. $5 );
            # Shift every later new line by the net size of this hunk.
            $new_to_old[$_] += ( $4 ? $4 : $5 ) - $5 + $2 - ( $1 ? $1 : $2 ) + ( $3 cmp 'c' )
                for ( $5 + 1 .. $lines_in_old + $lines_in_new );
        }
    }
    # Now get inverse to go from old to new.
    # There will be complete arrays going old to new and new to old just to get one answer.
    for my $n ( 1 .. $lines_in_old + $lines_in_new ) {
        # Old lines with no match keep the zero set in initialize_arrays.
        # Zeroing $old_to_new[$n] here would overwrite an entry already
        # stored for old line $n whenever earlier lines were deleted.
        if ( $new_to_old[$n] > 0 ) {
            $old_to_new[$new_to_old[$n]] = $n;
        }
    }
}

In reply to Determine new line number in revised file given old line number by ExReg
