Re^3: part - split up files according to column value

by Corion (Patriarch)
on Aug 26, 2008 at 13:31 UTC ( [id://706900]=note )


in reply to Re^2: part - split up files according to column value
in thread part - split up files according to column value

You can start by telling us where you encounter problems and what difficulties you have incorporating FileCache into jdporter's code.

Re^4: part - split up files according to column value
by mick2020 (Novice) on Aug 26, 2008 at 14:37 UTC
    I have completed the first part of the task, i.e. sorting the files. Here is my code:
    ///My code
    use FileCache maxOpen => 1000;
    ////////////
    # config:
    my $field = 0;
    my $sep = ",";
    ////MY code
    cacheout $mode, $path;
    $fh = cacheout $mode, $path;
    /////////
    $, = $sep;
    $\ = $/;
    my %file; # { num, name, $fh }
    my $fnum = 1;
    while (<>) {
        chomp;
        my @c = split /$sep/o;
        my( $key, $num ) = defined $c[$field]
            ? ( $c[$field], $fnum++ )
            : ( '(column not present)', 0 );
        unless ( $file{$key}) {
            $nameF = $c[$field];
            $nameF =~ s/"//g;
            $file{$key}{num} = $num;
            $file{$key}{name} = "out/".$nameF.$ARGV[0];
            if(($file{$key}{num}) >1){
                -f $file{$key}{name} and die "Sorry, '$file{$key}{name}' exists; won't clobber.";
                open $file{$key}{fh}, ">", $file{$key}{name} or die "Error opening '$file{$key}{name}' for write - $!";
            }
        }
        print {$file{$key}{fh}} @c;
    }
    The problem is FileCache. I am not familiar with Perl, so I am having problems with this part of the code.
    I am getting this error:
    $ perl split.pl Input.csv .cvs
    Error opening '4444.cvs' for write - Too many open files at split.pl 39, <> line 817961.
    I have marked my additions to jdporter's code.
    I don't know what $path and $mode are.

      I read the FileCache documentation differently than you do. I think that you're basically supposed to replace your calls to open with calls to cacheout; that is, instead of open $file{$key}{fh}, ..., use:

      $file{$key}{fh} = cacheout $file{$key}{name}

      But I haven't tested that. $path is the (path and) name of the output file, and $mode is the file mode (which is irrelevant to your needs).

      Update: kyle spotted a link to the wrong documentation.

      I have tried
      use FileCache maxOpen => 10000;
      ..
      open $file{$key}{fh}, ">", cacheout $file{$key}{name} or die
      But I get the error
      Too many open files at /usr/lib/perl5/5.10/ .... at line 408948
      I have tried changing the value of maxOpen but this does nothing

        I'm not sure why you interpret what I wrote:

        instead of open $file{$key}{fh}, ..., use:

        $file{$key}{fh} = cacheout $file{$key}{name}

        as that you should write:

        open $file{$key}{fh}, ">", cacheout $file{$key}{name} or die

        Maybe you want to reread my node. I'm also not sure how to phrase it differently so that you get what I mean, short of writing the program for you, which I won't do.
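
For readers following the exchange above, the idea behind replacing open with a cache call can be illustrated without FileCache itself: keep only a bounded number of output handles open, close the least recently used one when the limit is hit, and reopen a file in append mode when it is needed again. The sketch below is a hand-rolled stand-in for that idea, not FileCache's implementation and not tested against the original data; the names get_handle, close_all and $MAX_OPEN are hypothetical. Two related points, stated with some caution: the FileCache documentation spells its import option maxopen in lowercase, so a differently cased key may simply be ignored, and a cache call opens the file itself, so it stands in for open rather than being passed to it.

```perl
use strict;
use warnings;

my $MAX_OPEN = 3;   # hypothetical limit, kept tiny for demonstration
my %open;           # path => { fh => handle, used => counter }
my %seen;           # paths already created during this run
my $tick = 0;       # increasing counter used to track recency

# Return a writable handle for $path. If the limit has been reached,
# the least recently used handle is closed first.
sub get_handle {
    my ($path) = @_;
    if ( my $entry = $open{$path} ) {
        $entry->{used} = ++$tick;
        return $entry->{fh};
    }
    if ( keys(%open) >= $MAX_OPEN ) {
        my ($oldest) = sort { $open{$a}{used} <=> $open{$b}{used} } keys %open;
        close $open{$oldest}{fh} or die "Error closing '$oldest' - $!";
        delete $open{$oldest};
    }
    # Truncate on first use, append when reopening after an eviction.
    my $mode = $seen{$path}++ ? '>>' : '>';
    open my $fh, $mode, $path or die "Error opening '$path' for write - $!";
    $open{$path} = { fh => $fh, used => ++$tick };
    return $fh;
}

# Close every handle that is still open, which also flushes buffered output.
sub close_all {
    for my $path ( keys %open ) {
        close $open{$path}{fh} or die "Error closing '$path' - $!";
    }
    %open = ();
}
```

The important usage detail is to ask for the handle again before every print, for example my $fh = get_handle($name); print {$fh} @c; inside the read loop, instead of storing the handle in a hash once. A stored handle may have been closed by an eviction in the meantime.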
