PerlMonks  

Unlink not deleting, Threading issue?

by pimperator (Acolyte)
on Jun 05, 2014 at 02:09 UTC ( #1088763=perlquestion )
pimperator has asked for the wisdom of the Perl Monks concerning the following question:

O'Monks with your infinite wisdom, I prostrate myself at your feet. I am nothing, I am nothing without you.

I am writing a script to run on a Windows machine. The special binary files I'm converting to txt files are on a server, and the user has super-user privileges. My problem is that the txt files, which are also saved on the server, are not deleted after calling unlink(). The steps of the script:

  • Store the paths of all files ending with the '.special_binary_file' extension
  • At this point the script is forked
  • Iterate through the file directory array and use a special program to convert .special_binary to .txt (this is done using a system command)
  • open the txt file and parse out data
  • delete the txt file
  • After doing some reading I see that unlink() will not work if the file is open or if the user doesn't have superuser privileges, which is not the case for both.

    Could threading the script interfere with the unlink() command?

    I is baffled.
    #!/usr/bin/perl
    use warnings;
    use strict;
    use File::Find;
    use File::Basename;
    use Parallel::ForkManager;

    my $masterDirect = '//server/secret/juice/2012/';
    find(\&file_names, $masterDirect);
    my @files;
    sub file_names {
        if( -f && $File::Find::name=~/\.special$/) {
            push @files, $File::Find::name;
        }
    }

    my $maxProcs = 14;
    my $pm = new Parallel::ForkManager($maxProcs);
    foreach(@files){
        my $pid = $pm->start and next;
        my @name = split /\//, $_;
        my $id = pop(@name);
        my @safeID = split / /, $id;
        my $perlOut = join "/", @name;
        my $safeDirect = $perlOut.$safeID[0]; #safe is hipaa compliant
        my $specialCommand = $_;
        $specialCommand =~ s/\//\\/g; # needs to be windows formatted. I tested it
        $specialCommand = "\"".$specialCommand."\"";
        my $specialTxtOut = join "\\", @name;
        $specialTxtOut ="\"".$specialTxtOut."\"";
        my $companyCommand = "apt-sauce-to-txt -o ".$specialTxtOut." ".$specialCommand;
        unless(exists($specialLog{$safeDirect})) { # log file test, not shown in this truncated script
            system($companyCommand); # SEND COMMAND TO THE TERMINAL
        }
        #####
        #
        # BEGIN TXT PARSING
        #
        #####
        my $specialTxtFile = $_.".txt";
        if ( -e $specialTxtFile ) {
            open IN, $_.".txt" or die "CANNOT OPEN special.TXT FILE: INSTANCE 1\n";
            while(my $line=<IN>){
                #####
                #
                # PARSING HAPPENS HERE, NOTHING FANCY HERE LADS
                #
                #####
            }
            close(IN); # SEE!
        }
        unlink $_.".txt"; # I'VE USED $specialTxtFile too, same result
        $pm->finish;
    }
    $pm->wait_all_children;

    Very important: I would like to delete the file in a single pass. That is: create, read, delete. I suppose I could wait until the parsing is finished and then delete them all by storing the paths in an array. But each txt file is ~250 MB and there are over 25,000 files that need to be converted, so I don't think the server can hold all that data at once.

    Re: Unlink not deleting, Threading issue?
    by Anonymous Monk on Jun 05, 2014 at 02:20 UTC
      Not looked at your code except to see you're not using sub Fudge to get a reason why you can't unlink
    Re: Unlink not deleting, Threading issue?
    by taint (Chaplain) on Jun 05, 2014 at 05:28 UTC
      Greetings, pimperator.

      I'm gonna have to go with: the file probably still has a handle open on it. But I don't quite know what type of system you're attempting to do this on -- *NIX, Win*, {...}. Win* ain't got a (native) unlink (that I can recall). But of course, all the *NIXes do have unlink. I'm also not familiar with the peculiarities of Perl on Win and unlink. But the docs say:

      On error, unlink will not tell you which files it could not remove. If you want to know which files you could not remove, try them one at a time:
      foreach my $file ( @goners ) {
          unlink $file or warn "Could not unlink $file: $!";
      }
      You will also need to pass -U flag to Perl.

      If nothing else, now you can at least attempt to track the unlinking of the files, to see why/what's going on.

      Best wishes.

      --Chris

      λɐp ʇɑəɹ⅁ ɐ əʌɐɥ puɐ ʻꜱdləɥ ꜱᴉɥʇ ədoH

        You will also need to pass -U flag to Perl.
        No you won't. Unless the OP is trying (as root) to directly unlink a directory on an ancient UNIX variant, the -U flag is irrelevant.

        Dave.

          Greetings, dave_the_m.

          The OP (pimperator) mentions Super User twice in the post.

          ...the user has super-user privileges...
          and
          After doing some reading I see that unlink() will not work if the file is open or if the user doesn't have superuser privileges, which is not the case for both.
          Which is why I felt it pertinent to mention the -U flag.

          --Chris

          λɐp ʇɑəɹ⅁ ɐ əʌɐɥ puɐ ʻꜱdləɥ ꜱᴉɥʇ ədoH

        ack, no! One uses -U when one wants to unlink directories, which is never! Use rmdir to remove directories. Don't use -U!

          Just for the record, the only reason I even mentioned it was because the Perl docs did:

          Note: unlink will not attempt to delete directories unless you are superuser and the -U flag is supplied to Perl. Even if these conditions are met, be warned that unlinking a directory can inflict damage on your filesystem. Finally, using unlink on directories is not supported on many operating systems. Use rmdir instead.

          If LIST is omitted, unlink uses $_ .

          Which, IMHO, is not the same as recommending the use of it. :)
          i.e., added for completeness.

          --Chris

          λɐp ʇɑəɹ⅁ ɐ əʌɐɥ puɐ ʻꜱdləɥ ꜱᴉɥʇ ədoH

        This is very dangerous advice. If someone did as you suggested and actually managed to unlink a directory, it would probably damage their filesystem.

        The worst part is that you copy/pasted the how-to-do-it parts while leaving out the big red flags that are trying to keep us from doing something stupid and tragic.

    Re: Unlink not deleting, Threading issue?
    by FloydATC (Chaplain) on Jun 05, 2014 at 08:35 UTC

      I would start by checking $! for an error message.
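
      A minimal sketch of that check, assuming a helper named remove_file (a hypothetical name, not from the thread). It reports both $! and $^E, since on Windows $^E can carry a more specific OS error than $!; whether it does so in the OP's case is not known.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): delete one file and report
# why it failed. Returns 1 on success, 0 on failure.
sub remove_file {
    my ($path) = @_;
    return 1 if unlink $path;

    # Copy both error variables right away, before any other call changes them.
    # $! holds the errno text; $^E may hold an extended OS-specific error.
    my ($err, $ext) = ($!, $^E);
    warn sprintf("Could not unlink %s: %s (%s)\n", $path, $err, $ext);
    return 0;
}
```

      In the OP's loop this would replace the bare unlink call, so each child says which file it could not remove and why.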

      -- FloydATC

      Time flies when you don't know what you're doing

    Re: Unlink not deleting, Threading issue? (SHARE_DELETE)
    by tye (Cardinal) on Jun 05, 2014 at 18:16 UTC
    Re: Unlink not deleting, Threading issue?
    by sundialsvc4 (Abbot) on Jun 05, 2014 at 18:29 UTC

      Definitely suggest looking for what error-messages are being reported ... and, generally re-consider your design.   It superficially appears to me that you are attempting to launch one thread per-file.   I will guarantee that you will not succeed in launching 25,000 children.   Your code does not anticipate that the fork won’t succeed.

      May I kindly now refer you, without further ado, to BrowserUK’s most-excellent first reply to the following recent thread: Proper undefine queue with multithreads.   Your situation is exactly the same.   His design is different, but easy to do, and his design works.

      Incidentally ... in a Un*x/Linux environment ... this whole thing just might be already-done for you!   If the xargs command on your system (man xargs) supports the -P maxprocs parameter, then you can bypass all of this nonsense.   Simply write (in Perl) a command that expects to receive a filename as a parameter.   This command converts that one file, processes it, then deletes it, then ends.   Meanwhile, command-line-pipe the output of a find command into xargs which uses that parameter.   Now, the find command is posting the filenames to xargs, which farms-out the work to maxprocs identical, single-purpose (Perl ...) process instances.   The same business requirement has now been solved, in a “very Un*x-ish way,” and the complexity of your (Perl) script has been greatly simplified.   (Perhaps you decide that this alternate approach is “right for you,” or perhaps not, and either choice is up-to-you.   But you will see that it is functionally equivalent in its general approach to the problem.)
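
      A rough sketch of such a single-purpose command, under stated assumptions: the script name worker.pl, the sub name parse_and_delete, and the line counting that stands in for the real parsing are all hypothetical. The conversion step is left out because it depends on the OP's own converter.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical worker: read one already-converted .txt file, then delete it.
# Returns the number of lines read; dies with the OS error on any failure.
sub parse_and_delete {
    my ($txt) = @_;

    # A lexical filehandle is private to this call, so no other code
    # can be holding it open when we reach unlink.
    open my $in, '<', $txt or die "Cannot open $txt: $!\n";
    my $lines = 0;
    while (my $line = <$in>) {
        $lines++;    # placeholder for the real parsing
    }
    close $in or die "Cannot close $txt: $!\n";

    unlink $txt
        or die sprintf("Cannot unlink %s: %s (%s)\n", $txt, $!, $^E);
    return $lines;
}

# Process every file name given on the command line. Illustrative pipeline
# (paths and script name are assumptions):
#   find /some/dir -name '*.txt' -print0 | xargs -0 -P 14 -n 1 perl worker.pl
parse_and_delete($_) for @ARGV;
```

      Because each process handles one file and exits, a failed unlink shows up as a non-zero exit status and an error message for that one file.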

        It superficially appears to me that you are attempting to launch one thread per-file.

        That's what I thought too at first. I didn't get the memo that fork on Win32 was ready for actual use so I got real curious about how Parallel::ForkManager dealt with this. Turns out, when you use $pm->start, if you've already used all your children (14 here), it waits for a child and finally launches a new one and continues the parent program. So yeah, it seems that's how you're "supposed" to do it.
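
        The throttling described above can be sketched with nothing but core fork and wait. This is not Parallel::ForkManager's actual implementation, only an illustration of the same idea; run_throttled is a hypothetical name, and the sketch assumes a Unix-like fork (on Windows, Perl emulates fork with threads, so behavior may differ).

```perl
use strict;
use warnings;

# Run $job->($item) in a child process for every item, keeping at most
# $max children alive at once ($max is assumed to be >= 1).
# Returns a hash of pid => exit status.
sub run_throttled {
    my ($max, $job, @items) = @_;
    my (%running, %status);

    for my $item (@items) {
        # Pool is full: block until one child exits before forking another.
        if (scalar(keys %running) >= $max) {
            my $done = wait();
            $status{$done} = $? >> 8;
            delete $running{$done};
        }

        my $pid = fork();
        die "fork failed: $!\n" unless defined $pid;
        if ($pid == 0) {    # child: do the work, then leave
            $job->($item);
            exit 0;
        }
        $running{$pid} = 1;    # parent: remember the child
    }

    # Reap whatever is still running.
    while (keys %running) {
        my $done = wait();
        last if $done == -1;
        $status{$done} = $? >> 8;
        delete $running{$done};
    }
    return %status;
}
```

        With 25,000 files and a limit of 14, the parent would fork 25,000 times in total, but never have more than 14 children alive at any moment.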

        As good as Parallel::ForkManager seems to be, it still strikes me as odd to reach for forks on a platform that's natively threaded.

    Re: Unlink not deleting, Threading issue?
    by perlfan (Curate) on Jun 05, 2014 at 20:00 UTC
      I would just unlink using a list of files after you've joined back into your parent process and your children have been reaped.
    Re: Unlink not deleting, Threading issue?
    by LanX (Canon) on Jun 07, 2014 at 17:05 UTC
      You are yelling!

      Down voted and ignored!

      Cheers Rolf

      (addicted to the Perl Programming Language)

    Node Type: perlquestion [id://1088763]
    Approved by taint