avanta has asked for the wisdom of the Perl Monks concerning the following question:
I am new to perl, need some help on perl.
While copying records from one file to another I need to put a filter on one field (date in my case) of the record.
Content of Input file:
Date,Expense,Income
1-Dec-2009,12,87
2-Dec-2009,54,204
3-Dec-2009,75,214
4-Dec-2009,78,198
5-Dec-2009,98,155
6-Dec-2009,10,180
7-Dec-2009,51,91
8-Dec-2009,32,130
9-Dec-2009,29,207
....so on..
Expected content of output file:
Date,Total
4-Dec-2009,276
5-Dec-2009,253
6-Dec-2009,190
7-Dec-2009,142
8-Dec-2009,162
9-Dec-2009,236
For example, I need to lookup for "4-Dec-2009" and copy all the records therafter.
How can this be done? I am able to copy complete file along with summation on two columns (Expense+Income).
Thanks,
AvantA
Re: copying records from one file to another with filter.
by BrowserUk (Patriarch) on Jan 09, 2010 at 18:02 UTC
|
>perl -ne"next unless /^4-Dec-2009/; print; last }{ print while <>" ju
+nk.dat
4-Dec-2009,78,198
5-Dec-2009,98,155
6-Dec-2009,10,180
7-Dec-2009,51,91
8-Dec-2009,32,130
9-Dec-2009,29,207
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
| [reply] [Watch: Dir/Any] [d/l] |
|
Im sorry but I didnt get how this code works...
Im tryin to write this code for an ETL process. here my DB is a flat file. I need to filter our records as per a date range, eg. last 30 days or smthing like that and write it to a new file..
| [reply] [Watch: Dir/Any] |
Re: copying records from one file to another with filter.
by apl (Monsignor) on Jan 09, 2010 at 17:32 UTC
|
Two suggestions:
- Convert the dates into a numeric string of the form YYMMDD, and reverse sort.
- Look at CPAN and find a Date or Time class that will store your information in a form you can also manipulate in reverse sort.
| [reply] [Watch: Dir/Any] |
Re: copying records from one file to another with filter.
by Corion (Patriarch) on Jan 09, 2010 at 16:15 UTC
|
Maybe you can show us what you've tried so far, and how it failed to work for you?
| [reply] [Watch: Dir/Any] |
|
#!/usr/bin/perl
use strict;
use warnings;
use Switch;
my $srcdir = "D:/source/";
my $srcfile = $srcdir."money.dat";
open (READ, "< $srcfile") || die "Can't find the dat file\n";
my ($date, $expense, $income);
my $epochToday = time;
my $last30days = $epochToday - 2592000;
my ($year, $month, $day) = (localtime($last30days))[5,4,3];
$month++;
$year+=1900;
my $mon;
switch ($month)
{
case [1] {$mon="Jan"}
case [2] {$mon="Feb"}
case [3] {$mon="Mar"}
case [4] {$mon="Apr"}
case [5] {$mon="May"}
case [6] {$mon="Jun"}
case [7] {$mon="Jul"}
case [8] {$mon="Aug"}
case [9] {$mon="Sep"}
case [10] {$mon="Oct"}
case [11] {$mon="Nov"}
case [12] {$mon="Dec"}
}
my $lastdate = $day."-".$mon."-".$year;
my $read;
my $total;
while (<READ>)
{
chop;
($date, $expense, $income) = split(/,/,$_);
$total = $expense+$income;
@data_date = ($date);
MY PROBLEM STARTS HERE!!!!!
if($lastdate =~ /^@data_date/)
{
}
@data = ($total);
push @data, [@data_date];
push @recent, [@data];
$read++;
}
close READ || die "Couldn't close the dat file";
my $desdir = "D:/target/";
my $desfile = $desdir."total.csv";
open (WRITE, "< $desfile") || die "Can't find the csv file.\n";
my $y=0;
foreach(@recent)
{
$printout .= "$recent[$y][0],$recent[$y][1]\n";
$y++;
}
print WRITE $printout;
close WRITE || die "Couldn't close the csv file.\n";
exit ;
| [reply] [Watch: Dir/Any] [d/l] |
|
>perl -c \816493.pl
Possible unintended interpolation of @data_date in string at 816493.pl
+ line 60.
Global symbol "@data_date" requires explicit package name at 816493.pl
+ line 56.
syntax error at 816493.pl line 60, near "if"
Global symbol "@data_date" requires explicit package name at 816493.pl
+ line 60.
Global symbol "@data" requires explicit package name at816493.pl line
+65.
Global symbol "@data" requires explicit package name at 816493.pl line
+ 67.
Global symbol "@data_date" requires explicit package name at 816493.pl
+ line 67.
Global symbol "@recent" requires explicit package name at816493.pl lin
+e 68.
Global symbol "@data" requires explicit package name at 816493.pl line
+ 68.
syntax error at 816493.pl line 71, near "}"
816493.pl had compilation errors.
IE, what you've posted won't compile. Suggest you strip this down to the bare essentials. Then, when you have a compilable version that illustrates your problem, you'll be ready to post that.
| [reply] [Watch: Dir/Any] [d/l] [select] |
|
|
|
use Switch;
This module is a bad idea and not recommended for use, even by its author anymore. Remove it and the switch construct. If you use Perl 5.10, use the new given keyword, otherwise use the approach you already got for the switch expression or use a chain of if ... elsif ... elsif statements. | [reply] [Watch: Dir/Any] [d/l] [select] |
|
$mon = (qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec))[$month-1]
+;
| [reply] [Watch: Dir/Any] [d/l] [select] |
|
Re: copying records from one file to another with filter.
by mnooning (Beadle) on Jan 09, 2010 at 18:52 UTC
|
use strict;
my @incoming = (
"1-Dec-2009,12,87",
"2-Dec-2009,54,204",
"3-Dec-2009,75,214",
"4-Dec-2009,78,198",
"5-Dec-2009,98,155",
"6-Dec-2009,10,180",
"7-Dec-2009,51,91",
"8-Dec-2009,32,130",
);
my $lookedForDate = "4-Dec-2009";
foreach my $line (@incoming) {
# If you are getting these lines from a file, you may
# first have to
# chomp($line); # Get rid of the \n
# Look for beginning-of-line,then $lookedForDate
if ($line =~ m/^$lookedForDate/) {
print ("Found $lookedForDate in the line: $line\n");
}
}
__END__
The result is pasted below
Found 4-Dec-2009 in the line: 4-Dec-2009,78,198
| [reply] [Watch: Dir/Any] [d/l] |
Re: copying records from one file to another with filter.
by thundergnat (Deacon) on Jan 11, 2010 at 15:47 UTC
|
If you know the start and end point use a range operator.
Note: In this context "eof" doesn't really correspond to the eof end-of-file marker, it is just a string that isn't found before the end of the file. It works in this situation, but beware of using it in a general case. If you want the true eof marker, use the hex char at ordinal 4, "\x4"
use warnings;
use strict;
my $start = '4-Dec-2009';
my $end = "eof";
print "Date,Total\n";
while (<DATA>) {
if ( /^$start/ .. /^$end/ ) {
chomp;
my (@items) = split /,/;
printf "%s,%d\n", $items[0], $items[1] + $items[2];
}
}
__DATA__
Date,Expense,Income
1-Dec-2009,12,87
2-Dec-2009,54,204
3-Dec-2009,75,214
4-Dec-2009,78,198
5-Dec-2009,98,155
6-Dec-2009,10,180
7-Dec-2009,51,91
8-Dec-2009,32,130
9-Dec-2009,29,207
produces:
Date,Total
4-Dec-2009,276
5-Dec-2009,253
6-Dec-2009,190
7-Dec-2009,142
8-Dec-2009,162
9-Dec-2009,236
| [reply] [Watch: Dir/Any] [d/l] [select] |
|
your code seems to be more relevant to my problem....I implemented the technique and heres my edited code...
#!/usr/bin/perl
use strict;
use warnings;
# source file directory
my $srcdir = "../source";
# source file name
my $srcfile = $srcdir."/vol.dat";
# Open source file.
open (READ, "< $srcfile") || die "Can't find the DAT file\n";
my $epochToday = time;
$epochToday = $epochToday - 2592000;
my ($year, $month, $day) = (localtime($epochToday))[5,4,3];
$month++;
$year+=1900;
my $startdate = $year."-".$month."-".$day;
my $x=0;
my $info;
my @input;
while ($info = <READ>)
{
chomp $info;
my @data = split (/,/, $info);
push @input, [@data];
$x++;
}
close READ || die "Couldn't close the DAT file";
@ordered_input);
my $desdir = "../target";
my $desfile = $desdir."/total_volume.csv";
open (WRITE, "> $desfile") || die "Can't find the CSV file.\n";
my @headers = ("Date",",","Total_Volume");
print WRITE @headers,"\n";
my $printout;
while (@input)
{
my $start = $startdate;
my $end = "eof";
if ( /^$start/../^$end/ )
{
chomp;
my (@items) = split /,/;
$printout .= "%s,%d\n", $items[0], $items[1] + $items[2];
}
}
print WRITE $printout;
close WRITE || die "Couldn't close the CSV file";
exit 0;
but im getting and infinite loop error:
Use of uninitialized value $_ in pattern match (m//) at basic.pl line
+63.
Use of uninitialized value $_ in pattern match (m//) at basic.pl line
+63.
Use of uninitialized value $_ in pattern match (m//) at basic.pl line
+63.
Use of uninitialized value $_ in pattern match (m//) at basic.pl line
+63.
Use of uninitialized value $_ in pattern match (m//) at basic.pl line
+63.
Use of uninitialized value $_ in pattern match (m//) at basic.pl line
+63.
Use of uninitialized value $_ in pattern match (m//) at basic.pl line
+63.
Use of uninitialized value $_ in pattern match (m//) at basic.pl line
+63.
Use of uninitialized value $_ in pattern match (m//) at basic.pl line
+63.
Use of uninitialized value $_ in pattern match (m//) at basic.pl line
+63.
Terminating on signal SIGINT(2)
Terminating on signal SIGINT(2)
kindly help in this matter..
Thanks
AvantA
| [reply] [Watch: Dir/Any] [d/l] [select] |
|
A few observations and comments:
You are reading in the entire file and saving it in an array of arrays, but then don't do anything with it. (Well, other than use the size of the array as a never terminating while() condition.) For that situation your loop probably should be
for (@input) rather than while (@input)
You are searching for a YYYY-MM-DD pattern when your example data is in DD-MMM-YYYY format. You'll never find anything.
You should probably use the 3 argument form of open and lexical file handles. What you have isn't wrong particularly, but isn't considered best practices.
The way you are calculating 30 days is easy but fragile. You may be better off using one of the date calculation packages. It may be good enough here though, so take that with a grain of salt.
Here's a minor reworking of your posted script to be more efficient (and work, for certain values of "work").
#!/usr/bin/perl
use strict;
use warnings;
my @months = ( qw/Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec/);
# source file directory
my $srcdir = '../source';
# source file name
my $srcfile = $srcdir . '/vol.dat';
my $desdir = '../target';
my $desfile = $desdir . '/total_volume.csv';
# Open source file.
open my $read, '<', $srcfile or die "Can't open the DAT file $!\n";
open my $write, '>', $desfile or die "Can't open the CSV file. $!\n";
print $write "Date,Total_Volume\n";
my ( $year, $month, $day ) = ( localtime( time() - 2592000 ) )[ 5, 4,
+3 ];
$year += 1900;
my $startdate = "$day-".$months[$month]."-$year";
while (<$read>) {
if ( /^$startdate/ .. /\x4/ ) {
chomp;
my (@items) = split /,/;
printf $write "%s,%d\n", $items[0], $items[1] + $items[2];
}
}
# probably not necessary but not a bad idea
close $read or die "Couldn't close the DAT file: $!\n";
close $write or die "Couldn't close the CSV file: $!\n";
| [reply] [Watch: Dir/Any] [d/l] |
|
|
|
if( /^$start/ .. /^$end/ )
Sir, I am havng trouble interpretting in this line.. there is an error
Use of uninitialized value $_ in pattern match (m//) at range.pl line 12
Can you please help me.. its urgent..
Avanta
| [reply] [Watch: Dir/Any] |
|
|