PerlMonks  

cron/perl interaction gotchas i am missing?

by schweini (Friar)
on Jan 30, 2004 at 12:39 UTC (#325204=perlquestion)

schweini has asked for the wisdom of the Perl Monks concerning the following question:

beloved monks,

i have the following line in my crontab:
# download syncs
10 04 * * * root /root/crons/downloadsyncs.pl

that script is supposed to connect to various stores, and download a file (using 'wget') from each one.

my problem is that it seems to enter the for-loop (that is iterating over the stores), and it does the first call to wget (via system()) just fine, but after having downloaded the required file, somehow doesn't continue - almost as if cron doesn't give control back to the script after the wget.
if i run my script in a terminal, everything works fine, so i guess it's a cron/perl interaction thing.
anyone know what i am overlooking?

Update: This is on Mandrake 9.1, and (although i couldn't find it in the docs anywhere) all the other crontab lines have the username under which to run the command in front of it, so i just did the same, and the program DOES run...

Replies are listed 'Best First'.
Re: cron/perl interaction gotchas i am missing?
by Corion (Pope) on Jan 30, 2004 at 12:55 UTC

    I'm not sure if your script runs at all. If the shown line is your crontab entry, there must be an additional script root involved, which might be (yet another) cause for the script behaving weird.

    My list of things that I check whenever I write a new script to be started via cron is the following:

    1. Working directory. My versions of cron start the scripts in /, and most likely that is not writable to the user the script runs as. Doing a cd $(dirname $0); in a shell wrapper, or use File::Basename; chdir dirname($0) or die "Couldn't change to $0 : $!"; in Perl, fixes that in my cases.
    2. Environment. cron doesn't set many of the environment variables that my .bashrc, .kshrc or .profile set. tilly wrote a snippet to get the default login environment variables defined in these files, but I normally use a small shell wrapper around my scripts if they require values in %ENV: . ~/.profile; exec ~/my/script.pl
    3. $ENV{PATH} (courtesy of bart). This is a special case of the above: the path is most likely not set, or set to something very restrictive like /bin:/usr/bin. Explicitly setting it in the (Perl) script works well here.
    4. The log files. My interesting scripts log their complete run into logfiles. What do the logfiles tell? If there is a compilation error (or missing modules, see Environment above) etc., the shell wrapper logs at least the start of the Perl script.
    5. Email recipient. cron should mail me the output (if any). Is the email recipient for the crontab set at all? Does that account have a (working) .forward file?
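
A minimal sketch of how points 1, 3 and 4 of the checklist might look at the top of a cron-started Perl script. The PATH value and the log file name are assumptions made for illustration, not anything taken from the original script:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename qw(dirname);
use File::Spec;

# 1. Working directory: change to the directory holding the script
#    instead of relying on wherever cron happens to start it.
my $script_dir = dirname(File::Spec->rel2abs($0));
chdir $script_dir or die "Couldn't change to $script_dir: $!";

# 3. PATH: set it explicitly. This particular list is an assumption;
#    use whatever directories hold the tools your script calls.
$ENV{PATH} = '/usr/local/bin:/usr/bin:/bin';

# 4. Log file: record the start of the run. The file name is hypothetical.
my $logfile = File::Spec->catfile(File::Spec->tmpdir, 'cron-checklist-demo.log');
open my $log, '>>', $logfile or die "Couldn't open $logfile: $!";
print {$log} scalar(localtime), " started in $script_dir, PATH=$ENV{PATH}\n";
close $log or die "Couldn't close $logfile: $!";
```

Resolving the script's path with rel2abs before calling chdir keeps the directory correct even when the script is started through a relative path.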

    Update: Added link to tilly's snippet

    perl -MHTTP::Daemon -MHTTP::Response -MLWP::Simple -e ' ; # The $d = new HTTP::Daemon and fork and getprint $d->url and exit;#spider ($c = $d->accept())->get_request(); $c->send_response( new #in the HTTP::Response(200,$_,$_,qq(Just another Perl hacker\n))); ' # web
      If the shown line is your crontab entry, there must be an additional script root involved, which might be (yet another) cause for the script behaving weird.
      You would be right if you were talking about normal Unix cron. Unfortunately, Linux distros truly enjoy deviating from decades of established practices, especially if they can introduce another maze of little /etc files.

      For decades, cron used files in /var/spool/crontab, one for each user. It was clear, it was simple, everyone understood it, and life was good. But, no, simplicity isn't good enough for your standard Linux distro. Good heavens, someone might want to use a simple grep to find out when a certain program is run from cron (and by whom). No, we can't have that! Nowadays, we must have, next to the cron spool directory, a /etc/cron.hourly/, a /etc/cron.daily/, a /etc/cron.weekly/, a /etc/cron.monthly/, a /etc/crontab, and to top it off, a /etc/cron.d/. With different syntax than the files in the crontab spool directory: an extra column is introduced to indicate which uid to run under.

      Why a good and honest spool directory isn't good enough remains a mystery.

      Abigail

      there must be an additional script root involved

      That /etc/crontab entry is pretty standard for Redhat and Debian-based distros. It means that the cron job is run as root. =) (The user crontabs look different.)

      # RedHat 9.0
      01 * * * * root run-parts /etc/cron.hourly
      02 4 * * * root run-parts /etc/cron.daily
      22 4 * * 0 root run-parts /etc/cron.weekly
      42 4 1 * * root run-parts /etc/cron.monthly

      I think they do things this way so that programs that need to install cron jobs can just copy a script into the appropriate directory.
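
To make the difference between the two crontab syntaxes concrete, here is a rough sketch of a parser that reads the same line both ways. The function name parse_cron_line is made up for this example, and the parser is deliberately simplistic: it skips comments, variable assignments such as MAILTO=root, and @reboot-style entries, and does not validate the time fields.

```perl
use strict;
use warnings;

# Split one crontab line into its fields. $has_user_column is true for
# /etc/crontab and /etc/cron.d style files, false for per-user crontabs.
sub parse_cron_line {
    my ($line, $has_user_column) = @_;
    chomp $line;
    return if $line =~ /^\s*(?:#|$)/;    # comment or blank line
    return if $line =~ /^\s*\w+\s*=/;    # variable assignment
    return if $line =~ /^\s*@/;          # @reboot and friends: not handled here

    my $nfields = $has_user_column ? 7 : 6;
    my @f = split ' ', $line, $nfields;  # last field keeps the whole command
    return if @f < $nfields;

    my %entry;
    @entry{qw(minute hour mday month wday)} = @f[0 .. 4];
    $entry{user}    = $f[5] if $has_user_column;
    $entry{command} = $f[-1];
    return \%entry;
}

my $line = '10 04 * * * root /root/crons/downloadsyncs.pl';
for my $has_user (1, 0) {
    my $e = parse_cron_line($line, $has_user);
    printf "%-14s user=%s command=%s\n",
        $has_user ? 'system format:' : 'user format:',
        $e->{user} // '(crontab owner)',
        $e->{command};
}
```

Read as a per-user crontab, the command becomes "root /root/crons/downloadsyncs.pl", which is where the idea of an additional script called root comes from. Read as a system crontab, root is simply the account the job runs under.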

      --
      Allolex

        I think they do things this way so that programs that need to install cron jobs can just copy a script into the appropriate directory.

        In particular, many .rpm packages include files that go in these directories. Apparently, rpm does not support the possibility that a package might contain individual lines that go into certain files, probably because it would not then be obvious, after other packages had been installed that put lines in there as well, which lines belonged to which packages, which would create a mess when uninstalling or upgrading packages. For this reason, many people have altogether quit using the old /var/spool crontabs; everything goes into the hourly, daily, weekly, or monthly directories, unless it is set up by hand by an unprivileged user. On the desktop, unprivileged users don't usually set up cron jobs by hand, since almost everyone these days has a computer of their own and thus has root access when need be. The user crontabs are still useful for multiuser systems, especially shared servers where various people have shell accounts. I suppose they might get used on shared lab computers too, and that sort of thing, where most of the users cannot get root access to install anything in the global cron directories. (These are the same sorts of systems where people cannot install modules off of CPAN except in their home directories, and other horrid nonsense.)


        $;=sub{$/};@;=map{my($a,$b)=($_,$;);$;=sub{$a.$b->()}} split//,".rekcah lreP rehtona tsuJ";$\=$ ;->();print$/
Re: cron/perl interaction gotchas i am missing?
by Abigail-II (Bishop) on Jan 30, 2004 at 12:58 UTC
    cron doesn't give control back to the script after the wget.
    What do you mean by that? Cron isn't some kind of god that picks apart a program and deals with each piece of code individually. All that cron does is look at its watch, say "it's 10 minutes past 4!", fork, set (e)uid and pipe-open /root/crons/downloadsyncs.pl. Cron doesn't care what kind of external programs the script calls, and it certainly won't call those programs on the script's behalf.

    What you should be aware of when calling programs from cron (any program, not just Perl programs), is that you will have a limited environment. There's no user calling the program, and there's no login-shell involved either. There won't be many environment variables, PATH will likely be different than running something from the command-line, and so will your current working directory.
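
One way to see these differences for yourself is to run the same small report once from a terminal and once from cron, then compare the two outputs. This is only a sketch; the function name environment_report is made up for the example.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);

# Collect the things that typically differ between a login shell and cron:
# working directory, user ids, whether a terminal is attached, and %ENV.
sub environment_report {
    my @lines = (
        'cwd: ' . (getcwd() // '(unknown)'),
        "uid: $<  euid: $>",
        'stdin is a terminal: ' . (-t STDIN ? 'yes' : 'no'),
    );
    push @lines, map { "$_=$ENV{$_}" } sort keys %ENV;
    return join("\n", @lines) . "\n";
}

# cron mails whatever the job prints, so plain print is enough here.
print environment_report();
```

Diffing the terminal output against the mailed cron output usually shows at a glance which variables, PATH entries or directory assumptions the script was silently relying on.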

    Abigail

      sorry - i have a tendency to anthropomorphize daemons...
      what i wanted to say is that it seems to me that cron wait()s for the wget-subprocess to finish, and then calls it a day...almost as if i had called exec() instead of system() in my script...

      and i can't figure out how the environment influences my poor script, since it does run fine...til the wget returns....
        Anyway it might have something to do with the environment the script is running in ...
        - what user's crontab is this? (root?)
        - show us the code of the script called 'root'
        then we might be able to help you ...
        pelagic

        update:
        I just read YOUR update ... no more ideas left on my side ... besides: stay up and watch the script running at 0410 ...
Re: cron/perl interaction gotchas i am missing?
by exussum0 (Vicar) on Jan 30, 2004 at 12:51 UTC
    We would need to see the script. Many people use cron and perl w/o a problem. I use it particularly with wget with cron as well.

    BTW, I've never seen the username put before the perl script.


    Play that funky music white boy..
      i can't post the whole script (passwords, IPs and overall shame), but here're the basics:
      @locations = qw/1 8 9/;
      foreach $loc (@locations) {
          $dir = getDir($store);
          $url = getUrl($store);
          chdir($dir);
          Log("starting retrieval of '$url' to file '$dir/$fn'");
          system("wget --timeout=180 $url");
      }

      ...and as i said, it does finish the first wget perfectly...

        How are you sure that wget has finished and the wget process has terminated? I recommend using the logging options of wget plus adding some logging/syslog output to your Perl script as well - print statements should also be enough, as cron mails those to you.
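
A sketch of what that extra logging could look like: wget appends to its own log file (its -a option), and the script reports how each system() call ended instead of discarding the result. The helper name run_and_report and the file name wget.log are assumptions, and Log() here is only a stand-in for the function of the same name in the question.

```perl
use strict;
use warnings;

# Minimal stand-in for the Log() function in the question: anything
# printed here ends up in the mail cron sends.
sub Log { print STDERR scalar(localtime), " @_\n" }

# Run a command without a shell and describe how it ended.
sub run_and_report {
    my @cmd = @_;
    my $rc = system(@cmd);
    return "could not be started or waited for: $!" if $rc == -1;
    return sprintf('killed by signal %d', $? & 127)  if $? & 127;
    return sprintf('exited with status %d', $? >> 8);
}

# URLs come from the command line here; in the real script they
# would come from getUrl(). wget.log is a hypothetical file name.
for my $url (@ARGV) {
    Log("starting retrieval of '$url'");
    my $result = run_and_report('wget', '--timeout=180', '-a', 'wget.log', $url);
    Log("wget for '$url' $result");
}
Log('loop finished');
```

If the mail from cron contains the first "starting retrieval" line but neither a result line nor "loop finished", the script stopped inside or right after the first system() call; if all lines are there, the loop did continue and the problem lies elsewhere.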

        perl -MHTTP::Daemon -MHTTP::Response -MLWP::Simple -e ' ; # The $d = new HTTP::Daemon and fork and getprint $d->url and exit;#spider ($c = $d->accept())->get_request(); $c->send_response( new #in the HTTP::Response(200,$_,$_,qq(Just another Perl hacker\n))); ' # web
        I'm guessing it's your second url that's the problem. Have you tried simply using the first url more than once to see if it's not a problem between the cron'd machine and the target?

        Play that funky music white boy..
        just wondering...why could this one have gotten a -- ?
        XP isn't (that) important to me, but i find this curious...
Re: cron/perl interaction gotchas i am missing?
by ysth (Canon) on Jan 30, 2004 at 15:41 UTC
    Could SIGCHLD be being ignored under cron?
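
A quick way to test this guess, sketched under the assumption that the script runs on a Unix-like system: log the SIGCHLD disposition the script was started with, and force it back to the default around system(). When SIGCHLD is ignored, system() may return -1 because there is no child status left to collect. The helper name is made up for the example.

```perl
use strict;
use warnings;

# Run a command with SIGCHLD forced back to its default disposition,
# so that system() can wait for the child and collect its status.
sub system_with_default_sigchld {
    my @cmd = @_;
    local $SIG{CHLD} = 'DEFAULT';    # previous setting is restored on return
    my $rc = system(@cmd);
    return $rc == -1 ? -1 : $? >> 8;
}

# Worth logging once from the cron job itself; an inherited "ignore"
# should show up here as IGNORE.
print STDERR 'SIGCHLD disposition at startup: ',
    ($SIG{CHLD} // 'not set (default)'), "\n";

my $status = system_with_default_sigchld($^X, '-e', 'exit 0');
print STDERR "child exit status: $status\n";
```

If the job behaves differently with this wrapper than with a plain system() call, the signal disposition inherited from cron is a likely suspect.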
