PerlMonks  

Forking and sharing variables

by httptech (Chaplain)
on May 20, 2000 at 23:01 UTC ( [id://13928]=perlquestion )

httptech has asked for the wisdom of the Perl Monks concerning the following question:

I'm working on a routine to sort email addresses by MX server, for (hopefully) faster delivery than just sorting by domain. The plan is to have a set of forked processes deliver the mail to each MX server. The MX lookups are done ahead of time, also using forked processes.

My sort routine works, but it's kind of clumsy. I'm using IPC::Shareable to store the sorted MX servers, but I ran into problems when trying to store references as hash keys, because IPC::Shareable tries to tie the references too, and I don't see a way to disable that behavior. So I end up doing some splits and joins to get the job done (amateurishly).

Anyone care to give me some pointers on how I can improve this? I am posting the code as a reply to this message, since it's fairly long.
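The grouping idea described above can be sketched without any DNS at all. Here is a minimal, self-contained illustration: the %fake_mx table and lookup_mx() are stand-ins for Net::DNS's mx() (the domains and hosts are made up), and addresses are bucketed first by domain, then by the domain's first MX host, so domains sharing a relay end up in one delivery batch.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Stub standing in for Net::DNS's mx(); the real program would query
# DNS here. The table below is invented purely for illustration.
my %fake_mx = (
    'example.com' => ['mx1.example.com'],
    'example.org' => ['mx1.example.com'],    # shares a relay with example.com
    'example.net' => ['mx.example.net'],
);
sub lookup_mx { my $dom = shift; return @{ $fake_mx{$dom} || [$dom] } }

my @addresses = ('alice@example.com', 'bob@example.org', 'carol@example.net');

# Group addresses by domain, then domains by their first MX host.
my (%by_domain, %by_mx);
for my $addr (@addresses) {
    my ($user, $domain) = split /\@/, $addr;
    push @{ $by_domain{$domain} }, $addr;
}
for my $domain (keys %by_domain) {
    my ($mx) = lookup_mx($domain);
    push @{ $by_mx{$mx} }, @{ $by_domain{$domain} };
}

for my $mx (sort keys %by_mx) {
    print "$mx: @{ $by_mx{$mx} }\n";
}
```

With the stub table, alice and bob land in the same mx1.example.com batch even though their domains differ, which is the whole point of sorting by MX rather than by domain.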

Replies are listed 'Best First'.
Re: Forking and sharing variables
by perlmonkey (Hermit) on May 21, 2000 at 06:00 UTC
    This was interesting; I had not done IPC stuff yet. I redid some of your logic, since I think some of it was unnecessary. It also seemed to me that the SIG handlers were unnecessary; I always find it more work than it is worth to keep track of the children, and I don't think it is really required here. The first part is unchanged except for the children hash:
    use strict;
    use POSIX;
    use IPC::Shareable;
    use Net::DNS;
    use Data::Dumper;

    $| = 1;

    my $glue    = 'data';
    my %options = (
        create  => 'yes',
        destroy => 'yes',
    );
    my %mxs;
    my $knot = tie %mxs, 'IPC::Shareable', $glue, { %options }
        or die "server: tie failed\n";

    my @urls = <>;
    my %hash;
    for (@urls) {
        chomp;
        my ($username, $domain) = split(/\@/);
        push(@{ $hash{$domain} }, $_);
    }

    my $PREFORK = 25;    # number of children to maintain
    my @doms    = keys %hash;
    I added Data::Dumper just for easily displaying the data of our multidimensional hash %mxs. And I added 'strict', as every good human should :)

    The next part is the most drastic change. It is the rest of the main function (the part the parent will execute):
    # define the anon arrays here instead of in the child
    foreach (@doms) {
        $mxs{$_} = [];
    }

    for (1 .. $PREFORK) {
        last unless @doms > 0;
        my $url = shift @doms;
        child($url) if !fork();
    }

    wait();    # wait till one dies
    my $kid;
    {
        if (@doms > 0) {
            my $url = shift @doms;
            child($url) if !fork();
        }
        $kid = wait();
        redo unless $kid == -1;
    }

    print Data::Dumper->Dump([\%mxs]);
    exit;
    First, since we want each hash key to hold an array (to get rid of the split/join junk), it seems we have to create the anonymous arrays in the parent. I always got fatal errors if I tried to do it from the child. I am not saying it can't be done, because I don't know ... but I couldn't figure out how to do it.
    In the first for loop we call 'child' for each child as long as there are more domains to search for. So if there are more doms than $PREFORK, we have to wait for one to die before we fork off another one. The 'wait' is blocking and will wait for any child to die. Then we go into an anonymous block and keep 'redo'ing until we have forked off a child for every domain and all the children have died. Finally, once all the children are dead, we display the results via Data::Dumper.

    The last bit is the child function:
    sub child {
        my $url     = shift;
        my $glue    = 'data';
        my %options = (
            create  => 'no',
            destroy => 'no',
        );
        my %mxs;
        my $knot = tie %mxs, 'IPC::Shareable', $glue, { %options }
            or die "client: tie failed\n";
        $knot->shlock;
        push(@{ $mxs{$url} }, map { $_->exchange } mx($url));
        $knot->shunlock;
        exit;
    }
    This code is similar to yours, just trimmed a bit. First we open up the shared memory segment, then call the mx function with the passed-in URL. The list result is mapped to extract the data we want, and the list returned from map is pushed onto the end of the anonymous array for our URL in the %mxs hash.

    I have to mention though that I have never done IPC stuff before, so I might be missing some of the subtleties, but this code worked like a champ for me.

    I hope this helps.

    And here is the code all together, just to make it easier to cut and paste:
    use strict;
    use POSIX;
    use IPC::Shareable;
    use Net::DNS;
    use Data::Dumper;

    $| = 1;

    my $glue    = 'data';
    my %options = (
        create  => 'yes',
        destroy => 'yes',
    );
    my %mxs;
    my $knot = tie %mxs, 'IPC::Shareable', $glue, { %options }
        or die "server: tie failed\n";

    my @urls = <>;
    my %hash;
    for (@urls) {
        chomp;
        my ($username, $domain) = split(/\@/);
        push(@{ $hash{$domain} }, $_);
    }

    my $PREFORK = 25;    # number of children to maintain
    my @doms    = keys %hash;

    # define the anon arrays here instead of in the child
    foreach (@doms) {
        $mxs{$_} = [];
    }

    for (1 .. $PREFORK) {
        last unless @doms > 0;
        my $url = shift @doms;
        child($url) if !fork();
    }

    wait();    # wait till one dies
    my $kid;
    {
        if (@doms > 0) {
            my $url = shift @doms;
            child($url) if !fork();
        }
        $kid = wait();
        redo unless $kid == -1;
    }

    print Data::Dumper->Dump([\%mxs]);
    exit;

    sub child {
        my $url     = shift;
        my $glue    = 'data';
        my %options = (
            create  => 'no',
            destroy => 'no',
        );
        my %mxs;
        my $knot = tie %mxs, 'IPC::Shareable', $glue, { %options }
            or die "client: tie failed\n";
        $knot->shlock;
        push(@{ $mxs{$url} }, map { $_->exchange } mx($url));
        $knot->shunlock;
        exit;
    }
Re: Forking and sharing variables
by httptech (Chaplain) on May 21, 2000 at 16:38 UTC
    Ok, a couple of things I ran into. First, you went back to using 'data' as the glue for IPC::Shareable. If I'm not mistaken, this would cause a problem if you had more than one instance of this program running, because they would try to access the same shared memory segment. That's why I was using the pid of the process as the glue.
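    The pid-as-glue trick works only because the value is captured once, in the parent, before any fork. Here is a minimal sketch of just that point, with plain fork() and no IPC::Shareable: $$ changes in each child, but the saved value does not, so every process in the family would tie to the same segment.

```perl
use strict;
use warnings;

# Capture the parent's pid BEFORE forking. This is what makes $$
# usable as an IPC::Shareable glue value: every process passes the
# same saved value to tie, even though $$ differs in each child.
my $glue = $$;

my $pid = fork();
die "fork: $!" unless defined $pid;

if ($pid == 0) {
    # In the child, $$ is new, but $glue still names the parent's value.
    print "child $$ would tie with glue $glue\n";
    exit 0;
}
waitpid($pid, 0);
print "parent $$ tied with glue $glue\n";
```

    The failure mode this avoids: a child that naively used its own $$ as glue would try to attach a segment nobody created, and the tie would fail.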

    Also, from what I understand, at the end of the program you need to call IPC::Shareable->clean_up or the shared memory segment persists after the program ends, which is probably not desirable.
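    One wrinkle with calling clean_up at the end: after fork(), children inherit the parent's exit-time code, so an unguarded cleanup could run once per child and tear the segment down while siblings are still using it. A common defensive pattern, sketched here with plain fork() and a print standing in for the clean_up call, is to guard cleanup on the saved parent pid:

```perl
use strict;
use warnings;

# Save the parent's pid; children inherit the END block below,
# so cleanup must be guarded to run in exactly one process.
my $parent = $$;

END {
    # In the real program, IPC::Shareable->clean_up would go here.
    print "cleanup by $$\n" if $$ == $parent;
}

my $pid = fork();
die "fork: $!" unless defined $pid;
if ($pid == 0) {
    exit 0;    # child's END runs too, but the pid guard skips cleanup
}
waitpid($pid, 0);
print "work done\n";
```

    An END block also fires when the program dies or is interrupted normally, which is handy since a leaked SysV segment otherwise persists until rebooted or removed by hand.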

    The last thing, which is the one I couldn't get past, is that I seem to run out of memory when creating the tied %mxs hash using your method of defining it. It doesn't happen on a small list, but when I tried it on a list of 200 addresses I got:

    IPC::Shareable::SharedMem: shmget: No space left on device
        at /usr/lib/perl5/site_perl/5.005/IPC/Shareable.pm line 446
    Could not create shared memory segment: No space left on device
        at ./mx.pl line 39
    So I added size => 8000000 to %options, and that just led me to a plain old Out of memory!

    I can't see why it should take more than 8 megabytes of memory to store MX servers for 200 domains. It didn't do this in my example code. But I don't see anything radically different about defining the %mxs hash ahead of time that would cause this.
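    One plausible explanation (an assumption on my part, not confirmed in this thread): "No space left on device" from shmget is ENOSPC, which usually means the kernel's limit on the *number* of SysV segments (shmmni), not on bytes. Since IPC::Shareable backs each tied variable with its own segment, predefining one anonymous array per domain needs roughly one segment per domain, so 200 domains can blow past a typical default limit, and raising the 'size' option wouldn't help. A rough back-of-the-envelope model:

```perl
use strict;
use warnings;

# Rough model only: one segment for %mxs itself plus one per tied
# anonymous array. The shmmni value is a commonly cited kernel
# default of the era, assumed here for illustration.
my $domains  = 200;
my $segments = 1 + $domains;    # %mxs + one anon array per domain
my $shmmni   = 128;             # assumed typical segment-count limit

printf "need %d segments, typical limit %d\n", $segments, $shmmni;
print "would fail: segment-count limit exceeded\n" if $segments > $shmmni;
```

    If this is right, the fix is either to raise the kernel limit or, as in the earlier join-based version, to avoid tying one reference per domain in the first place.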

    Any ideas?

RE: Forking and sharing variables
by httptech (Chaplain) on May 20, 2000 at 23:01 UTC
    The code is as follows (the forking code is mostly borrowed from pp. 626-627 of the Perl Cookbook):
    use Symbol;
    use POSIX;
    use IPC::Shareable;
    use Net::DNS;

    $| = 1;

    my $ppid    = $$;
    my %options = (
        destroy => 'yes',
        create  => 'yes',
    );
    my %mxs;
    my $knot = tie %mxs, 'IPC::Shareable', $ppid, { %options }
        or die "server: tie failed\n";

    my @urls = <>;    # Specify a filename on the command line with
                      # a list of email addresses, one per line.
    my %hash;
    for (@urls) {
        chomp;
        my ($username, $domain) = split(/\@/);
        push(@{ $hash{$domain} }, $_);
    }
    print "Sorted addresses by domain\n";

    my $PREFORK  = 25;    # number of children to maintain
    my %children = ();    # keys are current child process IDs
    my $children = 0;     # current number of children
    my @doms     = keys %hash;

    # Fork off our children.
    for (1 .. $PREFORK) {
        my $url = shift(@doms);
        make_new_child($url) if $url;
    }

    # Install signal handlers.
    $SIG{CHLD} = \&REAPER;
    $SIG{INT}  = \&HUNTSMAN;

    # And maintain the population.
    while (1) {
        sleep;    # wait for a signal (i.e., child's death)
        for (my $i = $children; $i < $PREFORK; $i++) {
            my $url = shift(@doms);
            make_new_child($url) if $url;    # top up the child pool
        }
        last unless @doms;
    }
    print "Done with URL list, waiting on remaining threads to finish...\n";

    # Wait for everyone to return
    my $kid;
    do {
        $kid = waitpid(-1, &WNOHANG);
    } until $kid == -1;

    print "Sorted domains by mx server\n";
    open(OUT, ">/tmp/mx.tmp");
    # Make a report about what would have happened
    # if the rest of the program were written
    for (keys %mxs) {
        my @deliver = ();
        my $mx      = $_;
        my @domains = split(/\n/, $mxs{$mx});
        for (@domains) {
            push(@deliver, @{ $hash{$_} });
        }
        print OUT "Sending to mx server $mx :\n";
        print OUT join("\n", @deliver), "\n";
    }
    close(OUT);
    IPC::Shareable->clean_up;
    exit;

    sub make_new_child {
        my $url = shift;
        my $pid;
        my $sigset;

        # block signal for fork
        $sigset = POSIX::SigSet->new(SIGINT);
        sigprocmask(SIG_BLOCK, $sigset)
            or die "Can't block SIGINT for fork: $!\n";

        die "fork: $!" unless defined($pid = fork);

        if ($pid) {
            # Parent records the child's birth and returns.
            sigprocmask(SIG_UNBLOCK, $sigset)
                or die "Can't unblock SIGINT for fork: $!\n";
            $children{$pid} = 1;
            $children++;
            return;
        } else {
            # Child can *not* return from this subroutine.
            $SIG{INT} = 'DEFAULT';    # make SIGINT kill us as it did before

            # unblock signals
            sigprocmask(SIG_UNBLOCK, $sigset)
                or die "Can't unblock SIGINT for fork: $!\n";

            print "Looking up MX server for $url\n";
            my @mx = mx($url);
            my $mx = join("|", map { $_->exchange } @mx);
                # Making a key with join instead of references. yuk.
            $mx ||= $url;
            $knot->shlock;
            # push (@{$mxs{$mx}}, $url);    # This doesn't work due to tie
            #                               # problems with IPC::Shareable
            $mxs{$mx} .= "$url\n";          # So we do this instead (bleah)
            $knot->shunlock;
            print "Found MX servers for $url\n";
            exit;
        }
    }

    sub REAPER {    # takes care of dead children
        $SIG{CHLD} = \&REAPER;
        my $pid = wait;
        $children--;
        delete $children{$pid};
    }

    sub HUNTSMAN {    # signal handler for SIGINT
        local ($SIG{CHLD}) = 'IGNORE';    # we're going to kill our children
        kill 'INT' => keys %children;
        $knot->destroy;
        IPC::Shareable->clean_up;
        exit;                             # clean up with dignity
    }
    For now I am not worrying about storing the preference of the MX servers. And when a domain like foo.bar.com doesn't have an MX record, I just use foo.bar.com itself as the MX server, because I don't know how to use Net::DNS recursively.
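    Using the domain itself when no MX exists is actually the standard fallback (RFC 974 says to deliver to the domain's own address record in that case), so no recursion is needed. A minimal sketch of that rule, with a stubbed lookup_mx() standing in for Net::DNS's mx() and an invented example table:

```perl
use strict;
use warnings;

# Stub for Net::DNS's mx(); returns an empty list for domains with
# no MX record. The table is made up for illustration.
my %fake_mx = ( 'example.com' => ['mail.example.com'] );
sub lookup_mx { @{ $fake_mx{ $_[0] } || [] } }

sub mx_or_self {
    my $domain = shift;
    my @hosts  = lookup_mx($domain);

    # RFC 974 fallback: with no MX records, treat the domain as its
    # own mail host and deliver to its address record directly.
    return @hosts ? @hosts : ($domain);
}

print join(",", mx_or_self('example.com')), "\n";
print join(",", mx_or_self('foo.bar.com')), "\n";
```

    So the "use foo.bar.com as its own MX server" behavior above is not a workaround at all; it is what a conforming mailer is supposed to do.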
