PerlMonks  

Read Directory and getlines in .csv

by Anonymous Monk
on Apr 27, 2016 at 14:57 UTC ( [id://1161638]=perlquestion )

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello. I am new to Perl and programming. I have written a Perl program to get the names of the .csv files in a different directory that were uploaded to the server today, based on modified date. These files are 13+ MB, so I don't want to copy or move them to my new directory. I also wrote another Perl program which reads a file, picks out only the lines I want, and places those lines in a .csv with the criteria I need for my final report. My question is: how do I integrate these two programs into one program without moving these large files from the directory they are housed in? Is this possible?

#This is the program to get the .csv file names from the directory:

#!/usr/bin/perl
use strict;
use warnings;
use File::stat;
use Text::CSV_XS;
use IO::File;

my $dirname = "Daily_QoS_CPD_link";
opendir(DIR, $dirname) or die "Not able to open $dirname $!";

# Create an array, Open the directory, get only .csv's modified in the last 24 hours
my @dir = sort { -M "$dirname/$a" <=> -M "$dirname/$b"}
          grep /\.csv/, $dirname ne '.' && $dirname ne '..', readdir DIR;
rewinddir(DIR);

# create a list of csv's for today into a text file.
my $One_day = 86400;
foreach my $list (@dir){
    my $diff = time()-stat("$dirname/$list")->mtime;
    if( $One_day > $diff){
        open FILE, ">>CPD_Files.txt" or die "Not able to open file $!";
        print FILE "$list\n";
    }
}
closedir DIR;
close FILE;

# This is the code for getting the lines from the .csv's that I need

#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV_XS;

my $Finput = "cpd_link_ABC_cpddrops_50_300300000.csv";
my $Foutput = "data0426-2.csv";
open my $FH, "<", $Finput;
open my $out, ">", $Foutput;
my $csv = Text::CSV_XS->new({binary => 1, eol => $/ });
while(my $row = $csv->getline($FH)) {
    my @fields = @$row;
    if ($fields[2] eq "DROPPED-10" || $fields[2] eq "CALL_START" || $fields[2] eq "CALL_END") {
        $csv->print($out, $row);
    }
}
close $FH;
if (not $csv->eof){
    $csv->error_diag();
}

Replies are listed 'Best First'.
Re: Read Directory and getlines in .csv
by Ovid (Cardinal) on Apr 27, 2016 at 17:12 UTC
    I know not everyone is a fan of File::Find::Rule, but I'm so used to it that it's just a quick hack for me. The following returns a list of all files with a .csv extension, in a given directory, modified within the last 24 hours. The rest, of course, should be straightforward after this.
    use File::Find::Rule;
    my $dir = 'lib'; # or wherever they're located
    my $last_24_hours = time - 86_400;
    my @files = File::Find::Rule->file->name('*.csv')->mtime(">$last_24_hours")->in($dir);
      Note that this will recurse into subdirectories of $dir if they exist. His code does not.

      I'm not sure how you could achieve that.

      Update: I believe this addition to the rule will achieve that.

      my @files = File::Find::Rule->file->name('*.csv')->maxdepth(1)->mtime(">$last_24_hours")->in($dir);

      Update: poj's use of glob would probably be the better solution
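For comparison, here is a minimal sketch of the non-recursive glob approach mentioned in the update, using only core modules. The sub name `recent_csv_files` and its two arguments are hypothetical and do not come from the thread; the 24-hour window is passed in as seconds.

```perl
use strict;
use warnings;
use File::stat;

# Hypothetical helper: return the .csv files directly inside $dir
# (no recursion into subdirectories) whose mtime falls within the
# last $max_age seconds.
sub recent_csv_files {
    my ($dir, $max_age) = @_;
    my $cutoff = time - $max_age;
    my @recent;
    for my $path (glob "$dir/*.csv") {
        my $st = stat($path) or next;    # skip entries that cannot be stat'ed
        push @recent, $path if -f $path && $st->mtime > $cutoff;
    }
    return @recent;
}

# Example call; 'Daily_QoS_CPD_link' is the directory name used in the question.
# Prints nothing if the directory does not exist or holds no recent .csv files.
print "$_\n" for recent_csv_files('Daily_QoS_CPD_link', 86_400);
```

One caveat with this form: `glob` splits its pattern on whitespace, so a directory name containing spaces would need quoting inside the pattern.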

      Great idea to use File::Find::Rule. Thank you!

Re: Read Directory and getlines in .csv
by Dipepper (Novice) on Apr 27, 2016 at 15:49 UTC

    I realized I am checking for modified date twice. I have corrected my code.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use File::stat;
    use Text::CSV_XS;
    use IO::File;

    my $dirname = "Daily_QoS_CPD_link";
    opendir(DIR, $dirname) or die "Not able to open $dirname $!";

    # Create an array of files in directory
    my @dir = grep /\.csv/, $dirname ne '.' && $dirname ne '..', readdir DIR;
    rewinddir(DIR);

    # create a list of csv's for today into a text file.
    my $One_day = 86400;
    foreach my $list (@dir){
        my $diff = time()-stat("$dirname/$list")->mtime;
        if( $One_day > $diff){
            open FILE, ">>CPD_Files.txt" or die "Not able to open file $!";
            print FILE "$list\n";
        }
    }
    closedir DIR;
    close FILE;
      You could avoid the use of File::stat by checking for csv files less than a day old with

      my @today = grep /\.csv$/ && -M "$dir/$_" < 1, readdir DIR;

      (See file test ops).

        To avoid confusion: that works, but -M also stats the file, and does not return the same thing as the mtime field.

        I have used this site to learn many things. I thought I read that -mtime was more consistent than -M?
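To make the difference between the two concrete: `-M` reports the file's age in (fractional) days measured from the script's start time `$^T`, while the `mtime` field from `stat` is an absolute timestamp in epoch seconds. Both need a stat of the file. The sketch below puts the two side by side; the sub names are hypothetical.

```perl
use strict;
use warnings;
use File::stat;

# Age in days computed from the mtime field, measured from "now".
sub age_days_from_mtime {
    my ($path) = @_;
    my $st = stat($path) or die "Cannot stat $path: $!";
    return (time - $st->mtime) / 86_400;
}

# Age in days as reported by -M, measured from script start ($^T).
sub age_days_from_M {
    my ($path) = @_;
    my $age = -M $path;
    die "Cannot stat $path: $!" unless defined $age;
    return $age;
}

# Compare the two for the running script itself, if it exists on disk.
printf "-M: %.6f days, mtime: %.6f days\n",
    age_days_from_M($0), age_days_from_mtime($0)
    if -e $0;
```

For a short-lived script the two values are nearly identical. They drift apart only in a long-running process, because `$^T` stays fixed at start-up while `time` keeps advancing.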

Re: Read Directory and getlines in .csv
by Laurent_R (Canon) on Apr 27, 2016 at 16:35 UTC
    You can open a file in any directory where you have access by specifying the path and the name of the file to open, for example:
    my $file = "$path/$bare_file_name";
    open my $IN, "<", $file or die "... $!";
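As a rough illustration of reading a large file in place, without copying or moving it, the sketch below builds the path and reads the file one line at a time. The helper name `count_lines_in` is hypothetical; the directory and file names in the example call are the ones from the question and are only used if that file actually exists.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: count the lines of a file that lives in another
# directory, reading it where it is, one line at a time.
sub count_lines_in {
    my ($dir, $name) = @_;
    my $file = File::Spec->catfile($dir, $name);    # portable form of "$dir/$name"
    open my $in, '<', $file or die "Cannot open $file: $!";
    my $count = 0;
    $count++ while <$in>;
    close $in;
    return $count;
}

# Example call, guarded so it only runs when the file is present.
my @example = ('Daily_QoS_CPD_link', 'cpd_link_ABC_cpddrops_50_300300000.csv');
print count_lines_in(@example), "\n" if -e File::Spec->catfile(@example);
```

Because the file handle is read line by line, memory use stays small even for files of 13+ MB.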

      I apologize, please be patient with me. I do not really understand your response. These files are on a server, and not all of them are uploaded daily; it depends on the activity (no set pattern). Sometimes there are 2 csv files, other times 14. Therefore, I do not know the filenames (or paths) of the files that were modified today until I run the program I created. But I do want to read each of those files, get the lines I need from that .csv, place the result in one document (a .csv), then go on to the next file, looping through the files until the end of today's modified files.

        Not sure what part you are having trouble with but try this

        #!/usr/bin/perl
        use strict;
        use warnings;
        use File::stat;
        use Text::CSV_XS;

        my $csv = Text::CSV_XS->new( { binary => 1, eol => $/ } );
        my $ONE_DAY = 86400;

        my $Foutput = "data0426-2.csv";
        open my $out, ">", $Foutput or die "$!";

        # create array of csv filenames in directory
        my $dirname = "Daily_QoS_CPD_link";
        unless (-d $dirname) {
            die "Not able to open $dirname";
        }
        my @dir = glob "$dirname/*.csv";

        # parse files modified in past 24 hours
        foreach my $infile (@dir){
            my $age = time()-stat($infile)->mtime;
            if( $age < $ONE_DAY ){
                open my $FH, "<", $infile or die "Could not open $infile : $!";
                while (my $row = $csv->getline($FH)) {
                    if ( $row->[2] =~ /^(DROPPED-10|CALL_START|CALL_END)$/ ) {
                        $csv->print($out, $row);
                    }
                }
                close $FH;
            }
        }
        poj
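The row test in the loop above could also be written as a hash lookup, which may be easier to extend if more event names are needed later. This is only a sketch: `%WANTED` and `row_is_wanted` are hypothetical names, and it assumes the third column holds the exact event name. Unlike the regex, whose `$` also matches before a trailing newline, the lookup requires an exact match, and it rejects rows with fewer than three fields without an "uninitialized value" warning.

```perl
use strict;
use warnings;

# Event names taken from the question; the hash name is hypothetical.
my %WANTED = map { $_ => 1 } qw(DROPPED-10 CALL_START CALL_END);

# Return 1 if the third field of a parsed row (an array ref, as returned
# by getline) is one of the wanted events, otherwise 0.
sub row_is_wanted {
    my ($row) = @_;
    my $event = $row->[2];
    return defined $event && exists $WANTED{$event} ? 1 : 0;
}

# Example with a hand-built row.
print row_is_wanted([ 'x', 'y', 'CALL_END' ]) ? "keep\n" : "skip\n";   # prints "keep"
```

Inside the loop, the condition would then read `if ( row_is_wanted($row) ) { ... }`.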
