Beefy Boxes and Bandwidth Generously Provided by pair Networks
The stupid question is the question not asked
 
PerlMonks  

copying records from one file to another with filter.

by avanta (Beadle)
on Jan 09, 2010 at 16:11 UTC ( [id://816493]=perlquestion: print w/replies, xml ) Need Help??

avanta has asked for the wisdom of the Perl Monks concerning the following question:

I am new to perl, need some help on perl.

While copying records from one file to another I need to put a filter on one field (date in my case) of the record.

Content of Input file:
Date,Expense,Income
1-Dec-2009,12,87
2-Dec-2009,54,204
3-Dec-2009,75,214
4-Dec-2009,78,198
5-Dec-2009,98,155
6-Dec-2009,10,180
7-Dec-2009,51,91
8-Dec-2009,32,130
9-Dec-2009,29,207
....so on..

Expected content of output file:

Date,Total
4-Dec-2009,276
5-Dec-2009,253
6-Dec-2009,190
7-Dec-2009,142
8-Dec-2009,162
9-Dec-2009,236

For example, I need to lookup for "4-Dec-2009" and copy all the records therafter.

How can this be done? I am able to copy complete file along with summation on two columns (Expense+Income).


Thanks,
AvantA

Replies are listed 'Best First'.
Re: copying records from one file to another with filter.
by BrowserUk (Patriarch) on Jan 09, 2010 at 18:02 UTC
    >perl -ne"next unless /^4-Dec-2009/; print; last }{ print while <>" ju +nk.dat 4-Dec-2009,78,198 5-Dec-2009,98,155 6-Dec-2009,10,180 7-Dec-2009,51,91 8-Dec-2009,32,130 9-Dec-2009,29,207

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      Im sorry but I didnt get how this code works... Im tryin to write this code for an ETL process. here my DB is a flat file. I need to filter our records as per a date range, eg. last 30 days or smthing like that and write it to a new file..
Re: copying records from one file to another with filter.
by apl (Monsignor) on Jan 09, 2010 at 17:32 UTC
    Two suggestions:
    • Convert the dates into a numeric string of the form YYMMDD, and reverse sort.
    • Look at CPAN and find a Date or Time class that will store your information in a form you can also manipulate in reverse sort.
Re: copying records from one file to another with filter.
by Corion (Patriarch) on Jan 09, 2010 at 16:15 UTC

    Maybe you can show us what you've tried so far, and how it failed to work for you?

      #!/usr/bin/perl use strict; use warnings; use Switch; my $srcdir = "D:/source/"; my $srcfile = $srcdir."money.dat"; open (READ, "< $srcfile") || die "Can't find the dat file\n"; my ($date, $expense, $income); my $epochToday = time; my $last30days = $epochToday - 2592000; my ($year, $month, $day) = (localtime($last30days))[5,4,3]; $month++; $year+=1900; my $mon; switch ($month) { case [1] {$mon="Jan"} case [2] {$mon="Feb"} case [3] {$mon="Mar"} case [4] {$mon="Apr"} case [5] {$mon="May"} case [6] {$mon="Jun"} case [7] {$mon="Jul"} case [8] {$mon="Aug"} case [9] {$mon="Sep"} case [10] {$mon="Oct"} case [11] {$mon="Nov"} case [12] {$mon="Dec"} } my $lastdate = $day."-".$mon."-".$year; my $read; my $total; while (<READ>) { chop; ($date, $expense, $income) = split(/,/,$_); $total = $expense+$income; @data_date = ($date); MY PROBLEM STARTS HERE!!!!! if($lastdate =~ /^@data_date/) { } @data = ($total); push @data, [@data_date]; push @recent, [@data]; $read++; } close READ || die "Couldn't close the dat file"; my $desdir = "D:/target/"; my $desfile = $desdir."total.csv"; open (WRITE, "< $desfile") || die "Can't find the csv file.\n"; my $y=0; foreach(@recent) { $printout .= "$recent[$y][0],$recent[$y][1]\n"; $y++; } print WRITE $printout; close WRITE || die "Couldn't close the csv file.\n"; exit ;

        Actually, your problem started when you didn't check your script with perl -c scriptname. Had you done so, you would have seen this:

        >perl -c \816493.pl Possible unintended interpolation of @data_date in string at 816493.pl + line 60. Global symbol "@data_date" requires explicit package name at 816493.pl + line 56. syntax error at 816493.pl line 60, near "if" Global symbol "@data_date" requires explicit package name at 816493.pl + line 60. Global symbol "@data" requires explicit package name at816493.pl line +65. Global symbol "@data" requires explicit package name at 816493.pl line + 67. Global symbol "@data_date" requires explicit package name at 816493.pl + line 67. Global symbol "@recent" requires explicit package name at816493.pl lin +e 68. Global symbol "@data" requires explicit package name at 816493.pl line + 68. syntax error at 816493.pl line 71, near "}" 816493.pl had compilation errors.

        IE, what you've posted won't compile. Suggest you strip this down to the bare essentials. Then, when you have a compilable version that illustrates your problem, you'll be ready to post that.

        Your code has at least one large problem:

        use Switch;

        This module is a bad idea and not recommended for use, even by its author anymore. Remove it and the switch construct. If you use Perl 5.10, use the new given keyword, otherwise use the approach you already got for the switch expression or use a chain of if ... elsif ... elsif statements.

        You could replace

        switch ($month) { case [1] {$mon="Jan"} case [2] {$mon="Feb"} case [3] {$mon="Mar"} case [4] {$mon="Apr"} case [5] {$mon="May"} case [6] {$mon="Jun"} case [7] {$mon="Jul"} case [8] {$mon="Aug"} case [9] {$mon="Sep"} case [10] {$mon="Oct"} case [11] {$mon="Nov"} case [12] {$mon="Dec"} }
        with
        $mon = (qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec))[$month-1] +;
Re: copying records from one file to another with filter.
by mnooning (Beadle) on Jan 09, 2010 at 18:52 UTC
    use strict; my @incoming = ( "1-Dec-2009,12,87", "2-Dec-2009,54,204", "3-Dec-2009,75,214", "4-Dec-2009,78,198", "5-Dec-2009,98,155", "6-Dec-2009,10,180", "7-Dec-2009,51,91", "8-Dec-2009,32,130", ); my $lookedForDate = "4-Dec-2009"; foreach my $line (@incoming) { # If you are getting these lines from a file, you may # first have to # chomp($line); # Get rid of the \n # Look for beginning-of-line,then $lookedForDate if ($line =~ m/^$lookedForDate/) { print ("Found $lookedForDate in the line: $line\n"); } } __END__ The result is pasted below Found 4-Dec-2009 in the line: 4-Dec-2009,78,198
Re: copying records from one file to another with filter.
by thundergnat (Deacon) on Jan 11, 2010 at 15:47 UTC

    If you know the start and end point use a range operator.

    Note: In this context "eof" doesn't really correspond to the eof end-of-file marker, it is just a string that isn't found before the end of the file. It works in this situation, but beware of using it in a general case. If you want the true eof marker, use the hex char at ordinal 4, "\x4"

    use warnings; use strict; my $start = '4-Dec-2009'; my $end = "eof"; print "Date,Total\n"; while (<DATA>) { if ( /^$start/ .. /^$end/ ) { chomp; my (@items) = split /,/; printf "%s,%d\n", $items[0], $items[1] + $items[2]; } } __DATA__ Date,Expense,Income 1-Dec-2009,12,87 2-Dec-2009,54,204 3-Dec-2009,75,214 4-Dec-2009,78,198 5-Dec-2009,98,155 6-Dec-2009,10,180 7-Dec-2009,51,91 8-Dec-2009,32,130 9-Dec-2009,29,207

    produces:

    Date,Total 4-Dec-2009,276 5-Dec-2009,253 6-Dec-2009,190 7-Dec-2009,142 8-Dec-2009,162 9-Dec-2009,236
      your code seems to be more relevant to my problem....I implemented the technique and heres my edited code...
      #!/usr/bin/perl use strict; use warnings; # source file directory my $srcdir = "../source"; # source file name my $srcfile = $srcdir."/vol.dat"; # Open source file. open (READ, "< $srcfile") || die "Can't find the DAT file\n"; my $epochToday = time; $epochToday = $epochToday - 2592000; my ($year, $month, $day) = (localtime($epochToday))[5,4,3]; $month++; $year+=1900; my $startdate = $year."-".$month."-".$day; my $x=0; my $info; my @input; while ($info = <READ>) { chomp $info; my @data = split (/,/, $info); push @input, [@data]; $x++; } close READ || die "Couldn't close the DAT file"; @ordered_input); my $desdir = "../target"; my $desfile = $desdir."/total_volume.csv"; open (WRITE, "> $desfile") || die "Can't find the CSV file.\n"; my @headers = ("Date",",","Total_Volume"); print WRITE @headers,"\n"; my $printout; while (@input) { my $start = $startdate; my $end = "eof"; if ( /^$start/../^$end/ ) { chomp; my (@items) = split /,/; $printout .= "%s,%d\n", $items[0], $items[1] + $items[2]; } } print WRITE $printout; close WRITE || die "Couldn't close the CSV file"; exit 0;
      but im getting and infinite loop error:
      Use of uninitialized value $_ in pattern match (m//) at basic.pl line +63. Use of uninitialized value $_ in pattern match (m//) at basic.pl line +63. Use of uninitialized value $_ in pattern match (m//) at basic.pl line +63. Use of uninitialized value $_ in pattern match (m//) at basic.pl line +63. Use of uninitialized value $_ in pattern match (m//) at basic.pl line +63. Use of uninitialized value $_ in pattern match (m//) at basic.pl line +63. Use of uninitialized value $_ in pattern match (m//) at basic.pl line +63. Use of uninitialized value $_ in pattern match (m//) at basic.pl line +63. Use of uninitialized value $_ in pattern match (m//) at basic.pl line +63. Use of uninitialized value $_ in pattern match (m//) at basic.pl line +63. Terminating on signal SIGINT(2) Terminating on signal SIGINT(2)
      kindly help in this matter..

      Thanks
      AvantA

        A few observations and comments:

        You are reading in the entire file and saving it in an array of arrays, but then don't do anything with it. (Well, other than use the size of the array as a never terminating while() condition.) For that situation your loop probably should be

        for (@input)
        rather than
        while (@input)

        You are searching for a YYYY-MM-DD pattern when your example data is in DD-MMM-YYYY format. You'll never find anything.

        You should probably use the 3 argument form of open and lexical file handles. What you have isn't wrong particularly, but isn't considered best practices.

        The way you are calculating 30 days is easy but fragile. You may be better off using one of the date calculation packages. It may be good enough here though, so take that with a grain of salt.

        Here's a minor reworking of your posted script to be more efficient (and work, for certain values of "work").

        #!/usr/bin/perl use strict; use warnings; my @months = ( qw/Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec/); # source file directory my $srcdir = '../source'; # source file name my $srcfile = $srcdir . '/vol.dat'; my $desdir = '../target'; my $desfile = $desdir . '/total_volume.csv'; # Open source file. open my $read, '<', $srcfile or die "Can't open the DAT file $!\n"; open my $write, '>', $desfile or die "Can't open the CSV file. $!\n"; print $write "Date,Total_Volume\n"; my ( $year, $month, $day ) = ( localtime( time() - 2592000 ) )[ 5, 4, +3 ]; $year += 1900; my $startdate = "$day-".$months[$month]."-$year"; while (<$read>) { if ( /^$startdate/ .. /\x4/ ) { chomp; my (@items) = split /,/; printf $write "%s,%d\n", $items[0], $items[1] + $items[2]; } } # probably not necessary but not a bad idea close $read or die "Couldn't close the DAT file: $!\n"; close $write or die "Couldn't close the CSV file: $!\n";
      if( /^$start/ .. /^$end/ )
      Sir, I am havng trouble interpretting in this line.. there is an error

      Use of uninitialized value $_ in pattern match (m//) at range.pl line 12
      Can you please help me.. its urgent..

      Avanta

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://816493]
Approved by Corion
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others goofing around in the Monastery: (4)
As of 2024-03-29 01:41 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found