PerlMonks  

Re^2: Spawning a thread in a CGI Perl file

by willjones (Sexton)
on Jul 30, 2008 at 20:34 UTC (#701264)


in reply to Re: Spawning a thread in a CGI Perl file
in thread Spawning a thread in a CGI Perl file

Well, I am using Crypt::OpenPGP to encrypt files posted to my CGI script. This worked great until larger files started coming through. I can barely get a 12 MB file through, and when I try a 22 MB file, Crypt::OpenPGP either can't handle it or takes a really long time (not sure which). All 22 MB of the upload are read successfully from the input stream after it is posted to my Perl CGI code, but according to my logs the encryption step starts and never finishes. At first I couldn't even handle 12 MB files; after I bumped up the Apache server timeout they began working, though only after a long delay.

So then I thought maybe I could spawn a thread, and thanks to Paul's post I realized that what I really wanted was to fork a new process. I tried forking a child to pull the large file off the input stream and encrypt it, while the parent returned promptly to the user with a message that their file is being processed. That way, I thought, the timeout wouldn't be an issue. I implemented this, and the request still took a long time even though I forked. I suspect the child process is running at a higher priority relative to the parent than I wanted. Eventually the parent did complete before the child and returned HTML to the user with a nice 'Your file is being processed' message.

The child, however, logged that it started the PGP encryption on the 22 MB file after successfully reading all of it from the input stream, and that is the last log statement shown. So either the PGP encryption was taking a really long time or something killed the process. I had logged the child's process ID too, so I logged on to my server and ran ps -f, but did not see that process ID listed. (Not sure whether it makes sense to look for it that way.) So now I'm thinking maybe there is a limit to the size of a file that Crypt::OpenPGP can encrypt for me.
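
A minimal sketch of the fork-and-return pattern described above, assuming a Unix-like server. A forked child inherits the CGI's STDOUT and STDERR, and a web server commonly waits for those handles to be closed before it finishes the response; that is one possible explanation for the delay, not a confirmed diagnosis. spawn_detached is a hypothetical helper name. The sketch also assumes the parent has already saved the upload to disk, because the child gives up the request's STDIN.

```perl
use strict;
use warnings;
use POSIX qw(setsid);

# Hypothetical helper (not from the original post): run $work in a child
# process that is detached from the web server's handles, and return the
# child's pid to the caller immediately.
sub spawn_detached {
    my ($work) = @_;

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    return $pid if $pid;    # parent: carry on and send the HTML response

    # Child: start a new session so it is no longer part of the
    # request's process group.
    setsid();

    # Let go of the handles inherited from the CGI so the server is not
    # kept waiting on them.
    open(STDIN,  '<', '/dev/null') or POSIX::_exit(1);
    open(STDOUT, '>', '/dev/null') or POSIX::_exit(1);
    open(STDERR, '>', '/dev/null') or POSIX::_exit(1);

    # Run the long job; eval stops a die from unwinding into the
    # caller's code inside the child.
    my $ok = eval { $work->(); 1 };

    # _exit skips END blocks and destructors inherited from the parent.
    POSIX::_exit($ok ? 0 : 1);
}
```

In the CGI this would be called after the upload is on disk, for example spawn_detached(sub { encrypt_and_flag($saved_path) }), where encrypt_and_flag and $saved_path are placeholders, and the parent then prints the 'Your file is being processed' page. Since STDERR is redirected, the child should write its own log file.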

Anyway, to answer your question: since I realized large files would need to be uploaded in some cases, I thought it would be good if an asynchronous process could work on digesting them while control returned to the user. When the file encryption is complete, a flag file is written to a directory that a daemon process scans constantly. When the daemon sees the flag file, it picks up the uploaded file and processes it. When processing completes, it updates a status file that the user can check from the UI by navigating to a status.cgi page.
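
The flag-file handoff can be sketched as below. The subtle point is that the daemon might scan the directory while a file is only half written, so this sketch writes to a temporary dot-file and renames it into place; on POSIX filesystems a rename within one directory is atomic. write_atomically is a hypothetical helper, and the daemon is assumed to ignore names that start with a dot.

```perl
use strict;
use warnings;
use File::Basename qw(dirname);
use File::Temp qw(tempfile);

# Hypothetical helper: write $content to $path so that a reader sees
# either no file at all or the complete file, never a partial one.
sub write_atomically {
    my ($path, $content) = @_;

    # The temp file lives in the target directory so that rename()
    # stays on one filesystem.
    my ($fh, $tmp) = tempfile('.tmpXXXXXX', DIR => dirname($path));
    binmode $fh;
    print {$fh} $content or die "write to $tmp failed: $!";
    close $fh            or die "close of $tmp failed: $!";

    # tempfile() creates the file readable by its owner only; loosen the
    # mode here in case the daemon runs as a different user (assumption).
    chmod 0644, $tmp     or die "chmod of $tmp failed: $!";

    rename $tmp, $path   or die "rename to $path failed: $!";
    return $path;
}
```

The same helper could write the status file, so that status.cgi never reads a half-updated status.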

I realize some kind of FTP solution may be better for the large files, but I would still run into problems encrypting a large file, unless maybe I break the file up into small segments and encrypt each segment. The daemon program would have to be changed to know how to decode and reassemble them, though. Also, I'm not entirely sure I'm allowed to use FTP in my situation, due to firewall rules or something; I'll have to check on this.
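
A rough sketch of the segment idea, with the encryption step passed in as a code ref so the splitting logic can be tried without any PGP module. encrypt_in_segments, the .partNNNN naming, and the chunk size are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: read $in_path in pieces of $chunk_size bytes, pass
# each piece to $encrypt (a code ref that returns the encrypted bytes),
# and write each result to a numbered segment file.
# Returns the segment paths in order.
sub encrypt_in_segments {
    my ($in_path, $out_prefix, $chunk_size, $encrypt) = @_;

    open my $in, '<', $in_path or die "cannot open $in_path: $!";
    binmode $in;

    my @segments;
    my $n = 0;
    while (1) {
        my $got = read($in, my $chunk, $chunk_size);
        die "read from $in_path failed: $!" unless defined $got;
        last if $got == 0;    # end of file

        my $cipher = $encrypt->($chunk);
        die "encryption of segment $n failed" unless defined $cipher;

        my $seg_path = sprintf '%s.part%04d', $out_prefix, $n++;
        open my $out, '>', $seg_path or die "cannot open $seg_path: $!";
        binmode $out;
        print {$out} $cipher or die "write to $seg_path failed: $!";
        close $out           or die "close of $seg_path failed: $!";
        push @segments, $seg_path;
    }
    close $in;
    return @segments;
}
```

With Crypt::OpenPGP the callback might wrap something like $pgp->encrypt(Data => $chunk, Recipients => $key_id); that call shape should be checked against the module's documentation. Each segment would then be an independent encrypted message, so the daemon would have to decrypt the segments in order and concatenate the results.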

Any advice on the best way to approach this would be appreciated. Also, is there a maximum file size for PGP encryption?

Thanks,
-Will

Re^3: Spawning a thread in a CGI Perl file
by CountZero (Bishop) on Jul 30, 2008 at 21:34 UTC
    I think you will first have to test Crypt::OpenPGP in a non-server environment with successively larger files until it breaks or until you reach the size you are prepared to handle.
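
Such a size test could be scripted from the command line roughly as follows. This is only a sketch: time_by_size is a hypothetical helper, the encryption step is passed in as a code ref that receives a file path, and the test files are filled from a repeated block of random bytes, which is an assumption about what makes a fair test input.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Time::HiRes qw(time);

# Hypothetical helper: for each size in bytes, create a test file, time
# $encrypt_file->($path), and stop at the first size that fails.
# Returns one hash ref per size attempted.
sub time_by_size {
    my ($encrypt_file, @sizes) = @_;

    # One 64 KB block of random bytes, reused to fill every test file.
    my $block     = join '', map { chr int rand 256 } 1 .. 65536;
    my $block_len = length $block;

    my @results;
    for my $size (@sizes) {
        my ($fh, $path) = tempfile(UNLINK => 1);
        binmode $fh;
        my $left = $size;
        while ($left > 0) {
            my $n = $left < $block_len ? $left : $block_len;
            print {$fh} substr($block, 0, $n) or die "write failed: $!";
            $left -= $n;
        }
        close $fh or die "close failed: $!";

        my $start = time();
        my $ok    = eval { $encrypt_file->($path); 1 };
        my $error = $ok ? '' : "$@";
        push @results, {
            size    => $size,
            seconds => time() - $start,
            ok      => $ok ? 1 : 0,
            error   => $error,
        };
        last unless $ok;
    }
    return @results;
}
```

Calling it with sizes such as 1, 5, 12 and 22 MB, and a callback that runs the real encryption on the given path, would show whether the time grows steadily with size or whether the process fails at some point.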

    Once you are clear on this, indeed the daemon process you describe seems the best way to handle large files which need long processing times.

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James
