OK, I finally found some time to look back at this. After some testing, closer examination, and thinking about what's really happening, here are my thoughts and your fixed code.

   I tried your code, but the output file was empty.

Not sure what's happening there. I have now tested this on two systems and it works for me. Please note that I downloaded your sample data into a file named data.txt and hard-coded that name into my code. If you didn't do the same, you might run into issues. Anyway, I think you can forget about that code. See below.

   ...but does not print up to the ADJ TO TOTALS part of the regex. Is there a reason why this part of the code would truncate the regex?

That forced me to actually try running your code (after, of course, cleaning up the alignment). Then, as I was trying to figure out why the "ADJ TO TOTALS" line was not printed, I realized that the "NAME" line should not have been printed either, but it was. Sooooo, I pulled out one of the best tools for debugging regexes: the print statement. I started tossing in print statements to see what was actually in the variables, so I could tell whether the problem was with the regex or with what was going into the regex. And the answer is... (insert drum roll)... neither. I know. You're thinking "Huh? What? What did he say?". Follow along.

First, look at your code. Where is the only print statement that writes to the output file? It's inside the if ($zero == "0.00") statement. If you look at the "NAME" and "ADJ" lines, $zero is not getting "0.00", so neither line should be printed. So what is actually being sent to the output file? That would be @data. I first changed that to $data and presto! The "NAME" and "ADJ" lines were not printed. Then I realized that you had a line where you were trying to reinitialize @data. The problem was that you didn't do it in the right spot. You were only reinitializing it when the line had "1235114182", so the "NAME" line was still sitting in @data when the first matching line was printed. Changing the print statement back to @data and relocating @data=(); also worked.
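To see that effect in isolation, here is a minimal sketch. The sample lines and the sub names collect_buggy and collect_fixed are made up for illustration and are not from your script. Both subs push each line onto a buffer and "print" the buffer when a line contains the ID; the only difference is where the buffer gets emptied.

```perl
use strict;
use warnings;

# Hypothetical sample lines; only the line containing the ID should be kept.
my @input = ("NAME header\n", "1235114182 record\n", "ADJ TO TOTALS: footer\n");

# Buffer is emptied only after a matching line, so earlier
# non-matching lines are still in it when it gets printed.
sub collect_buggy {
    my @lines = @_;
    my (@buffer, @out);
    for my $line (@lines) {
        push @buffer, $line;
        if ($line =~ /1235114182/) {
            push @out, @buffer;   # stands in for: print OUTPUT "@data \n"
            @buffer = ();         # reset only on a match
        }
    }
    return @out;
}

# Buffer is emptied after every line, match or not.
sub collect_fixed {
    my @lines = @_;
    my (@buffer, @out);
    for my $line (@lines) {
        push @buffer, $line;
        if ($line =~ /1235114182/) {
            push @out, @buffer;
        }
        @buffer = ();             # reset on every pass
    }
    return @out;
}

print "buggy: ", collect_buggy(@input);
print "fixed: ", collect_fixed(@input);
```

With the reset inside the if, the "NAME" line rides along with the first matching record. With the reset moved outside, only the matching record comes out.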

In the end, there are two things that I did to find the issue. First, I cleaned up the indenting so that I could quickly and easily see what's inside which brackets and braces. Second, I debugged with print statements. So why the long, convoluted response? To help illustrate the thought process that goes on in debugging. Sometimes that's more helpful than saying "here's the problem and here's the solution". In other words, I thought that walking you through the debug process would be more useful to you than just handing you the "steps" to "fix" your code.
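If you want to make print-statement debugging a bit easier on the eyes, something like the following might help. It is only a sketch: debug_line is a hypothetical helper name, and the sample value is made up. The idea is to wrap each value in brackets so that stray whitespace or a trailing newline is easy to spot, and to send the output to STDERR so it doesn't mix with real output.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap the value in brackets so that leading or
# trailing whitespace (including a newline) is easy to spot.
sub debug_line {
    my ($label, $value) = @_;
    $value = '(undef)' unless defined $value;
    return "DEBUG $label -- [$value]\n";
}

my $data = "1235114182 record\n";    # made-up value with a trailing newline
print STDERR debug_line('data', $data);
print STDERR debug_line('zero', substr($data, 0, 4));
```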

Two more quick points before I share the modified version of your code that I ran. First, I would agree with jaffy that your logic behind the looping and variable use is somewhat confusing, which made it difficult to understand what's going on and where the problem was. Second, if you really wanted the "NAME" and "ADJ" lines printed, then the real problem is that you've discovered that perl is extremely good at doing what you told it to do instead of what you wanted it to do, which, speaking from personal experience, I admit can be very frustrating. In other words, your corrected code told perl not to print those lines.

Ok, the cleaned up and modified code below is what I ran to debug your code. Try running it and take a look at all of the stuff that gets printed to the screen. You'll see how that was useful in telling me what was going on.

    use strict;
    use warnings;

    print "What file do you want parsed? ";
    my $file=<STDIN>;
    my @data;
    my $data;
    my $lines;

    open (TEST,"$file") or die$!;
    open OUTPUT, "> peptest.txt" or die$!;

    while (<TEST>) {
        if (/NAME /../ADJ TO TOTALS:/) {
            push @data, $_;
            foreach $data (@data) {
                print "data -- $data\n";    ## Added for debugging
                if ($data =~ /1235114182/) {
                    $lines.=$_;
                    my $zero = substr $lines, 118, 5;
                    print "zero -- $zero\n";
                    if ($zero == "0.00") {
                        #Version1 print OUTPUT "$data \n";
                        print OUTPUT "@data \n";
                        print " data sent to output file\n";    ## Added for debugging
                    }
                    else {print " data skipped\n"}    ## Added for debugging
                    $zero="";
                    $lines="";
                    $data="";
                    #Version1 @data=();
                }
                @data=();
                print "--------end of one iteration of foreach loop-----------\n\n";    ## Added for debugging
            }
        }
    }
    close TEST;
    close OUTPUT;
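Since the original question was about the /NAME /../ADJ TO TOTALS:/ part, here is a small sketch of what that range operator does on its own. The sub name lines_in_range and the sample lines are made up for illustration. It shows that the range itself does include the "ADJ TO TOTALS" line; whether that line reaches the output file depends on the code inside the if.

```perl
use strict;
use warnings;

# Return the lines from the first one matching /NAME / through the next one
# matching /ADJ TO TOTALS:/, both endpoints included.
sub lines_in_range {
    my @lines = @_;
    my @kept;
    for (@lines) {
        # In scalar context ".." is the flip-flop operator: false until the
        # left pattern matches, then true up to and including the line where
        # the right pattern matches. Each ".." keeps its own state, even
        # across calls to the sub that contains it.
        push @kept, $_ if /NAME /../ADJ TO TOTALS:/;
    }
    return @kept;
}

# Made-up sample input.
my @sample = (
    "junk before\n",
    "NAME  AMOUNT\n",
    "1235114182 0.00\n",
    "ADJ TO TOTALS: 0.00\n",
    "junk after\n",
);
print lines_in_range(@sample);
```

Run against the sample, this prints the three lines from "NAME" through "ADJ TO TOTALS:" and skips the junk on either side.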

In reply to Re^3: Regex Not Grabbing Everything by dasgar
in thread Regex Not Grabbing Everything by JonDepp
