PerlMonks  

Re: Performing Mathematical Operation on Specific Column of text File

by kcott (Archbishop)
on May 14, 2015 at 22:27 UTC [id://1126704]


in reply to Performing Mathematical Operation on Specific Column of text File

G'day Bama_Perl,

You have two major problems with this line:

my ($name, $time) = (split /\s+/, $file)[1,9];
  • $file is the filehandle! You want to split the line that was read from this filehandle. That line will be in $_.
  • Arrays are zero-based! To get the first and last elements, your array slice needs to specify [0,8] (not [1,9]).
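Both points can be illustrated with a minimal sketch. It reads each line into a named variable (rather than relying on `$_`) and slices the split result with `[0, -1]`, where `-1` always refers to the last element regardless of how many fields there are. The helper name `first_and_last_fields` and the in-memory filehandle are assumptions made for illustration; they are not part of the original script.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: return the first and last whitespace-separated
# fields of one line. Index -1 always means "last element".
sub first_and_last_fields {
    my ($line) = @_;
    # split ' ' (a single-space string) skips leading whitespace and
    # drops the trailing newline along with other trailing whitespace.
    return (split ' ', $line)[0, -1];
}

# An in-memory filehandle keeps the example self-contained.
my $sample = "ZJ.KNYN -0.7065 0.0072 0.9760 0.0133 0 KNYN.BHZ 30 +1.3372 -1.9249\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";

# Split the line read from the filehandle, not the filehandle itself.
while (my $line = <$fh>) {
    my ($name, $time) = first_and_last_fields($line);
    print "$name $time\n";    # prints "ZJ.KNYN -1.9249"
}

close $fh;
```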

-- Ken

Re^2: Performing Mathematical Operation on Specific Column of text File
by Bama_Perl (Acolyte) on May 14, 2015 at 22:32 UTC
    Hi Ken, Thanks for your comment. I'll be sure to split the line that was read from the filehandle. As for the second comment, I understand that in perl, the arrays are zero-based. However, in this case, the $name is in the second column (1), and the last column is 9. The first column is where "MCCCC... " is located.
      "However, in this case, the $name is in the second column (1), and the last column is 9. The first column is where "MCCCC... " is located."

      I suspect there's something here you haven't understood but I don't know what that might be.

      The " MCCC processed: ..." and "station, ..." lines have already been read in the for loop; they're not read again in the while loop. The first line read in the while loop will be:

      ZJ.GRAW -0.7964 0.0051 0.9690 0.0139 0 GRAW.BHZ 301 +.1263 -1.8041

      When split on whitespace, element zero will be "ZJ.GRAW".

      In your next split, element zero is the only element where you're likely to capture "ZJ". Your split pattern is wrong here but I'll assume that's propagated from earlier errors (obviously, you want to split on /[.]/, not /\s+/).
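As a small sketch of that second split, assuming the first field always has the form "PREFIX.NAME": the helper name `split_station_id` and the variable names `$network` and `$station` are labels chosen here for illustration, not names from the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: split an identifier such as "ZJ.GRAW" on the
# literal dot. The character class [.] avoids having to escape the dot.
sub split_station_id {
    my ($id) = @_;
    return split /[.]/, $id, 2;    # at most two parts
}

my ($network, $station) = split_station_id('ZJ.GRAW');
print "$network $station\n";       # prints "ZJ GRAW"
```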

      Beyond all these issues, splitting on whitespace, and then trying to recreate the line, without knowing how much whitespace originally existed will not work: you won't retain the "same formatting" you state you want. So, I suggest you sit back and think of another approach.
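The formatting problem is easy to reproduce. The sketch below splits a line on whitespace and rejoins it with single spaces; the spacing in the sample string is made up for illustration, and `split_and_rejoin` is a hypothetical helper.

```perl
use strict;
use warnings;

# Hypothetical helper: split on whitespace, then rejoin with single spaces.
# Any information about the original run lengths of whitespace is lost.
sub split_and_rejoin {
    my ($line) = @_;
    return join ' ', split ' ', $line;
}

my $original = "ZJ.KNYN   -0.7065  0.0072";    # made-up column spacing
my $rebuilt  = split_and_rejoin($original);

print $rebuilt eq $original ? "same\n" : "different\n";    # prints "different"
```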

      Assuming you've provided representative data, here's how I might have tackled the logic. (Note: I'm assuming "remove the mean" indicates subtracting the mean: if not, modify the calculation in output_recalc_zj_lines() to suit.)

      #!/usr/bin/env perl

      use strict;
      use warnings;

      # Pass the two header lines through unchanged.
      print scalar <DATA> for 1 .. 2;

      my @zj_lines;

      # Buffer the station records until the mean is known.
      while (<DATA>) {
          if (/ \A Mean_arrival_time: \s+ ( \S+ )/x) {
              output_recalc_zj_lines(\@zj_lines, $1);
              print;
              last;
          }
          push @zj_lines, $_;
      }

      # Pass the remaining lines through unchanged.
      print <DATA>;

      sub output_recalc_zj_lines {
          my ($zj_lines, $mean) = @_;

          for (@$zj_lines) {
              # Replace only the last field; keep the trailing whitespace as is.
              s/ ( \S+ ) ( \s+ ) \z / $1 - $mean . $2 /ex;
              print;
          }
      }

      __DATA__
      MCCC processed: unknown event at: Tue, 14 Oct 2014 12:02:26 CST
      station, mccc delay, std, cc coeff, cc std, pol , t0_times + , delay_times
      ZJ.GRAW -0.7964 0.0051 0.9690 0.0139 0 GRAW.BHZ 301 +.1263 -1.8041
      ZJ.KNYN -0.7065 0.0072 0.9760 0.0133 0 KNYN.BHZ 30 +1.3372 -1.9249
      ZJ.LEON 0.9675 0.0072 0.9548 0.0292 0 LEON.BHZ 30 +1.2611 -0.1749
      ZJ.RKST -0.2061 0.0114 0.9404 0.0383 0 RKST.BHZ 30 +1.3500 -1.4374
      ZJ.SHRD 0.4382 0.0051 0.9542 0.0351 0 SHRD.BHZ 30 +1.7360 -1.1791
      ZJ.SPLN 0.3033 0.0051 0.9785 0.0126 0 SPLN.BHZ 30 +1.0760 -0.6541
      Mean_arrival_time: 300.1187
      No weighting of equations.
      Window: 2.23 Inset: 1.17 Shift: 0.25
      Variance: 0.00645 Coefficient: 0.96215 Sample rate: 40.000
      Taper: 0.28
      Phase: P
      PDE 2013 7 15 14 6 58.00 -60.867 -25.143 31.0 0.0 7.3

      Output:

      $ pm_1126698_split_record.pl
      MCCC processed: unknown event at: Tue, 14 Oct 2014 12:02:26 CST
      station, mccc delay, std, cc coeff, cc std, pol , t0_times + , delay_times
      ZJ.GRAW -0.7964 0.0051 0.9690 0.0139 0 GRAW.BHZ 301 +.1263 -301.9228
      ZJ.KNYN -0.7065 0.0072 0.9760 0.0133 0 KNYN.BHZ 30 +1.3372 -302.0436
      ZJ.LEON 0.9675 0.0072 0.9548 0.0292 0 LEON.BHZ 30 +1.2611 -300.2936
      ZJ.RKST -0.2061 0.0114 0.9404 0.0383 0 RKST.BHZ 30 +1.3500 -301.5561
      ZJ.SHRD 0.4382 0.0051 0.9542 0.0351 0 SHRD.BHZ 30 +1.7360 -301.2978
      ZJ.SPLN 0.3033 0.0051 0.9785 0.0126 0 SPLN.BHZ 30 +1.0760 -300.7728
      Mean_arrival_time: 300.1187
      No weighting of equations.
      Window: 2.23 Inset: 1.17 Shift: 0.25
      Variance: 0.00645 Coefficient: 0.96215 Sample rate: 40.000
      Taper: 0.28
      Phase: P
      PDE 2013 7 15 14 6 58.00 -60.867 -25.143 31.0 0.0 7.3

      Note how the original formatting is retained exactly (including the space that is presumably missing between ZJ.GRAW and -0.7964 which, if present, would have aligned that record's format with the other ZJ records).
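The formatting survives because the substitution touches only the last non-whitespace field and puts the trailing whitespace back unchanged. Here is that one step in isolation, as a sketch: `subtract_from_last_field` is a hypothetical wrapper, and the column spacing in the sample line is invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the substitution: subtract $mean from the
# last field of $line and leave every other character untouched.
sub subtract_from_last_field {
    my ($line, $mean) = @_;
    # /x allows whitespace in the pattern; /e evaluates the replacement
    # as Perl code, so "$1 - $mean . $2" is computed, not interpolated.
    # The pattern needs trailing whitespace (here, the newline) to match.
    $line =~ s/ ( \S+ ) ( \s+ ) \z / $1 - $mean . $2 /ex;
    return $line;
}

my $line = "ZJ.SPLN   0.3033  -0.6541\n";        # made-up column spacing
print subtract_from_last_field($line, 300.1187);
# prints "ZJ.SPLN   0.3033  -300.7728" followed by a newline
```

If a fixed number of decimal places matters, the replacement expression could be wrapped in `sprintf`, at the cost of choosing a format up front.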

      -- Ken
