PerlMonks  

File::Copy - move() function corrupting files

by myelinviolin (Novice)
on Aug 22, 2014 at 15:32 UTC ( #1098334=perlquestion )
myelinviolin has asked for the wisdom of the Perl Monks concerning the following question:

I am writing a program that moves all PDF files from one location (in my case the desktop) to a "dump" location. As it moves each PDF, it appends it to a document stored in a third location. That document starts with a generated title page, which is what createpdf.pl produces. The program runs continuously while it is "ON", and the user can turn it off by pressing "q". (It sleeps for 1 second between passes for now.)

It works great when the PDFs are already on the desktop, but if I put a file on the desktop while the program is running, a corrupted file is left on the desktop. The PDF is still moved to the dump location and the main PDF is still appended to, but to really get rid of the file on the desktop I have to move the files from the dump folder back to the desktop. I think this may have something to do with the move() function in File::Copy. I added close calls to close the files after they are moved, thinking unclosed files might be why a corrupted file is left behind, but it didn't fix anything. **I think the problem is in appendpdf.pl. The whole program runs when you run PDF.pl**

The point of the program is to act as an automatic PDF appender when a lot of documents are being printed from Adobe PDF printer at once. There isn't a way for the files to be put on the desktop without having to rename each one. You can turn that option off for the printer, but then all the files that have the same name get overwritten. I want to grab each PDF as it is being made and put it somewhere else. Hopefully this will be fast enough to work, but if there is some kind of timing issue with the PDF files, it might not.

I am trying to use several modules, and the only way I could figure out how to make them all work with each other was to put the pieces in different files.

PDF.pl

#!usr/bin/perl
use Term::ReadKey;
do 'user_options.pl';
do 'createpdf.pl';
print "Program ON\n";
ReadMode 4; # Turn off controls keys
$input=0;
until ($input eq "q"){
    while (not defined ($input = ReadKey(-1))) {
        do 'appendpdf.pl';
        sleep(1);
    }
}
ReadMode 0; # Reset tty mode before exiting
print "\nProgram OFF\n";

appendpdf.pl

use CAM::PDF;
use File::Copy;
#Initialize title page pdf
my $doc1 = CAM::PDF->new("$file1") || die "$CAM::PDF::errstr\n";
#Read each pdf file on the desktop
opendir(DIR,$directory);
my @files = grep{/\.pdf$/}readdir(DIR);
closedir(DIR);
#Get the names of all the files.
foreach(@files[0]){
    $name2=$_;
    $file2="$directory"."$name2";
    $doc2 = CAM::PDF->new("$file2") || die "$CAM::PDF::errstr\n";
    #Append the thermogram file with each document
    $doc1->appendPDF($doc2);
    $doc1->clearAnnotations();
    $doc1->cleanoutput("$file1");
    #Move each file to the destination folder
    $file3="$destination"."$name2";
    move("$file2","$file3") || die "Move failed: $!";
    $file2->close || die "Source file failed to close: $!";
    $file3->close|| die "Target file failed to close: $!";
}

createpdf.pl

$file1="$output"."$filename";
use PDF::Create;
# initialize PDF
my $pdf = PDF::Create->new('filename'=> "$file1",
                           'CreationDate' => [ localtime ],
                          ) || die "Could not initialize PDF.";
# add a A4 sized page
my $a4 = $pdf->new_page('MediaBox' => $pdf->get_page_size('A4'));
# Add a page which inherits its attributes from $a4
my $page1 = $a4->new_page;
# Prepare a font
my $f1 = $pdf->font('BaseFont' => 'Helvetica');
#prompt for user input
print "\n";
print "Name: ";
$name=<>;
print "Method: ";
$method=<>;
print "\n";
#get the date
$date=localtime;
# Write some text
$page1->stringc($f1, 20, 306, 425, "$title");
$page1->stringc($f1, 20, 306, 375, "Author: $name");
$page1->stringc($f1, 20, 306, 350, "Method: $method");
$page1->stringc($f1, 20, 306, 300, "$date\n");
# Close the file and write the PDF
$pdf->close || die "Could not write PDF.";

user_options.pl

#The directory where the files that need to be appended are located:
$directory = 'C:/Users/X/Desktop/';
#This is where the final pdf file will print:
$output = 'M:/perlscripts/';
#This is where the spent files will be collected:
$destination='C:/Users/X/Desktop/Generated Thermograms/';
#This is what your final file name will be:
$filename='Thermograms.pdf';
#The title in the PDF:
$title='Thermograms';

Replies are listed 'Best First'.
Re: File::Copy - move() function corrupting files
by Laurent_R (Canon) on Aug 22, 2014 at 18:02 UTC
    If I understand correctly, you have a problem when your program starts processing files that are not completely there yet (for example, because they are still being copied from somewhere else). You need to prevent your program from grabbing your PDF file prematurely.

    There are several things you could do. When you detect a new PDF, just sleep for some time before starting to process it (that works only if the time for the file to arrive is relatively constant). Another way is to have the process delivering the files copy them under a different name (e.g. with an extension other than ".pdf", such as ".tmp") and rename each file only when its delivery is complete. Or the process delivering the files could put a flag (an empty file with a name similar to the file being delivered) in the directory and remove the flag once delivery is complete. This way, your program knows when it can start processing the file safely.
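
The temporary-name idea above can be sketched roughly as follows. This is only an illustration of the delivering side, assuming that side is a Perl script you control; the function name deliver_file and the ".tmp" suffix are hypothetical and not taken from the original post.

```perl
use strict;
use warnings;

# Write $content to "$final.tmp" first, then rename it to $final.
# A reader that only looks for *.pdf never sees a half-written file.
sub deliver_file {
    my ($final, $content) = @_;
    my $tmp = "$final.tmp";    # hypothetical temporary name

    open(my $fh, '>', $tmp) or die "Cannot open $tmp: $!";
    binmode($fh);
    print {$fh} $content or die "Cannot write $tmp: $!";
    close($fh) or die "Cannot close $tmp: $!";

    # The file shows up under its final name only after it has been
    # completely written and closed.
    rename($tmp, $final) or die "Cannot rename $tmp to $final: $!";
    return $final;
}
```

With this in place, a scanner that greps the directory for /\.pdf$/ ignores the ".tmp" file for as long as it is still being written.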

      The only way I could get my program to work is to wait 5 seconds after it detects a file. I tried really hard to just get it to move the oldest file with File::DirList

       @list = File::DirList::list('$directory','ia',1,1,1);

      and stat, but I could never get the number out of the variables.

      $last_modified = (stat($filename))[9]

      My program works for me for now but who knows the kind of pressure people are going to put on it. Any hints on these in case I really do need to upload the oldest file?

      My new code:

      use CAM::PDF;
      use File::Copy;
      use File::stat;
      use Time::localtime;
      sleep(1);
      if ($o==0){
          print "\nChecking for files...\n";
          $o=1;
      }
      #Initialize title page pdf
      my $doc1 = CAM::PDF->new("$file1") || die "$CAM::PDF::errstr\n";
      #Read each pdf file on the desktop
      opendir(DIR,$directory);
      my @files = grep{/\.pdf$/}readdir(DIR);
      if (@files){
          print "Found file! Waiting 5 seconds before I move it.\n";
          print "Name of file: $files[0]\n";
          $o=0;
          sleep(5);
      }
      else{
          continue;
      }
      closedir(DIR);

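
On the question of picking the oldest file: a minimal sketch, assuming it is acceptable to treat a PDF as finished once it has not been modified for a few seconds. The function name ready_pdfs and the $min_age parameter are made up for this illustration. One thing that may explain why the number never came out of stat: the code above loads File::stat, which replaces the built-in stat with a version that returns an object, so the list form (stat($filename))[9] no longer yields the modification time. The sketch uses the built-in stat and therefore does not load File::stat.

```perl
use strict;
use warnings;

# Return the PDFs in $dir that have not been modified for at least
# $min_age seconds, oldest first. Relies on the built-in stat(), so
# File::stat must not be loaded in the same file.
sub ready_pdfs {
    my ($dir, $min_age) = @_;

    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    my @names = grep { /\.pdf$/i } readdir($dh);
    closedir($dh);

    my $now = time;
    my %mtime;
    for my $name (@names) {
        my $path  = "$dir/$name";
        my $mtime = (stat($path))[9];        # element 9 is the mtime
        next unless defined $mtime;          # file disappeared meanwhile
        next if $now - $mtime < $min_age;    # possibly still being written
        $mtime{$path} = $mtime;
    }

    my @sorted = sort { $mtime{$a} <=> $mtime{$b} } keys %mtime;
    return @sorted;
}
```

A call such as my ($oldest) = ready_pdfs($directory, 5); would then take the place of the unconditional sleep(5). It is still a heuristic: a file that is written very slowly could look idle before it is complete.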
        Well, do you have control over the process producing the files on the desktop? Because the best solution is really to change this process so that it copies the files under a different name (say "*.pd_") and renames them to "*.pdf" only when each file is complete. This way, when you grep the directory content, you only pick up files that are complete.

        Or create the files in a different directory on the same disk, and move each one into the right directory once it is complete.

Re: File::Copy - move() function corrupting files
by roboticus (Chancellor) on Aug 22, 2014 at 18:15 UTC

    brainfiddle:

    I don't know if the problem is due to moving PDF files when they're open or not. But if that's the problem, then I suggest you create the PDF files with a bogus extension (such as .WORK). After the file is successfully completed and closed, you can then rename it from XYZ.WORK to XYZ.PDF. This will prevent your program from detecting and working with PDF files before they're ready to go. (I use the same trick for file generators to prevent an FTP process from sending the file before it's complete.)

    As I said, I don't know if that's your problem, though.

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.

      I can't do it this way because the PDF files are initially created by Adobe PDF Printer, and I don't think there is a way to change the extension of what it prints to something besides .pdf.

        Don't scan for files yourself, make the operating system notify you of changes in the filesystem (inotify or similar APIs). Make sure your code starts processing only after a new file (opened for writing) is closed. For inotify, this seems to be indicated by the IN_CLOSE_WRITE event.

        Alexander

        --
        Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
Re: File::Copy - move() function corrupting files
by 2teez (Vicar) on Aug 22, 2014 at 19:26 UTC

    Hi myelinviolin,
    While other Monks have suggested how your problem could be fixed, I am only wondering why you are using do to load some of your Perl scripts (not that it is impossible or forbidden). I believe you could simply go the way of Perl modules, using either use or require to load code which you can then call.
    Just thinking aloud though.

    If you tell me, I'll forget.
    If you show me, I'll remember.
    if you involve me, I'll understand.
    --- Author unknown to me
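
One way the module suggestion might look, as a rough sketch only. The package name PdfOptions and its settings function are hypothetical; the values are copied from user_options.pl above. In a real program the package would live in its own file (PdfOptions.pm) and be loaded with use or require, but it is inlined here so the example runs as a single file.

```perl
use strict;
use warnings;

# Hypothetical replacement for user_options.pl. Normally this block
# would be the file PdfOptions.pm, loaded with "use PdfOptions;"
# instead of "do 'user_options.pl';".
{
    package PdfOptions;

    # Return the settings as key/value pairs rather than setting
    # global variables as a side effect of running a file.
    sub settings {
        return (
            directory   => 'C:/Users/X/Desktop/',
            output      => 'M:/perlscripts/',
            destination => 'C:/Users/X/Desktop/Generated Thermograms/',
            filename    => 'Thermograms.pdf',
            title       => 'Thermograms',
        );
    }
}

# Main program: fetch the settings once and pass them where needed.
my %opt   = PdfOptions::settings();
my $file1 = $opt{output} . $opt{filename};
print "Title page will be written to $file1\n";
```

Compared with do, this lets the main script run under strict and warnings, and a failure to load the module stops the program instead of being silently ignored.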
