If you've discovered something amazing about Perl that you just need to share with everyone, this is the right place.

This section is also used for non-question discussions about Perl, and for any discussions that are not specifically programming related. For example, if you want to share or discuss opinions on hacker culture, the job market, or Perl 6 development, this is the place. (Note, however, that discussions about the PerlMonks web site belong in PerlMonks Discussion.)

Meditations is sometimes used as a sounding-board — a place to post initial drafts of perl tutorials, code modules, book reviews, articles, quizzes, etc. — so that the author can benefit from the collective insight of the monks before publishing the finished item to its proper place (be it Tutorials, Cool Uses for Perl, Reviews, or whatever). If you do this, it is generally considered appropriate to prefix your node title with "RFC:" (for "request for comments").

User Meditations
The Case for Macros in Perl
5 direct replies — Read more / Contribute
by einhverfr
on Sep 12, 2014 at 23:07

    In some of my work I have started doing a lot more with higher order and functional Perl programming. A good example is PGObject::Util::DBMethod which provides a way to declaratively map stored procedures in Postgres to object methods. I have linked to the source code on github above because it is a good example of where macros would be very helpful.

    Now I will be the first to admit that in these cases, macros are not 100% necessary. The module above can accomplish what it needs to do without them. However, the alternative (effectively creating a highly generalized anonymous coderef, setting up a custom execution environment for that coderef, and then installing the generalized coderef with the specific execution environment as a method) has some significant drawbacks.

    Here's the particular section that does the main work:
    sub dbmethod {
        my $name = shift;
        my %defaultargs = @_;
        my ($target) = caller;

        my $coderef = sub {
            my $self = shift @_;
            my %args;
            if ($defaultargs{arg_list}){
                %args = ( args => _process_args($defaultargs{arg_list}, @_) );
            } else {
                %args = @_;
            }
            for my $key (keys %{$defaultargs{args}}){
                $args{args}->{$key} = $defaultargs{args}->{$key}
                    unless $args{args}->{$key} or $defaultargs{strict_args};
                $args{args}->{$key} = $defaultargs{args}->{$key}
                    if $defaultargs{strict_args};
            }
            for my $key(keys %defaultargs){
                next if grep(/^$key$/, qw(strict_args args returns_objects));
                $args{$key} = $defaultargs{$key} if $defaultargs{$key};
            }
            my @results = $self->call_dbmethod(%args);
            if ($defaultargs{returns_objects}){
                for my $ref(@results){
                    $ref = "$target"->new(%$ref);
                }
            }
            if ($defaultargs{merge_back}){
                _merge($self, shift @results);
                return $self;
            }
            return shift @results unless wantarray;
            return @results;
        };
        no strict 'refs';
        *{"${target}::${name}"} = $coderef;
    }

    Now that is 40 lines of code, and 30 of them go into the coderef that is executed when the method is actually run. That doesn't seem like too much, but it does the work of 5-10 lines of code in an imperative style. In other words, it is 5-6 times as long and intensive as it needs to be.

    With macros, it would be quite possible to generate only the code needed for the specific function rather than creating a generalized case which has to handle many non-applicable inputs, and then create a context where it only gets what it needs.
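Perl has no macro system, so the closest stand-in today is generating source text and compiling it with string eval. The following is only a rough sketch of that idea, not code from PGObject::Util::DBMethod: `generate_method` is a hypothetical name, only a `returns_objects` option is handled, and the class is assumed to provide `call_dbmethod` and `new` as in the excerpt above.

```perl
use strict;
use warnings;

# Hypothetical generator (not part of PGObject::Util::DBMethod): it assembles
# the source of one method, emitting only the statements that the given
# options require, compiles it with string eval, and installs it.
sub generate_method {
    my ($target, $name, %opt) = @_;

    # The names are spliced into source text, so only accept plain identifiers.
    die "bad package name: $target\n" unless $target =~ /\A\w+(?:::\w+)*\z/;
    die "bad method name: $name\n"    unless $name   =~ /\A[A-Za-z_]\w*\z/;

    my @body = ('my ($self, %args) = @_;');
    push @body, 'my @results = $self->call_dbmethod(%args);';

    # This line is only generated when the option asks for it.
    push @body, qq{\$_ = $target->new(%\$_) for \@results;}
        if $opt{returns_objects};

    push @body, 'return wantarray ? @results : $results[0];';

    my $source = "sub {\n" . join('', map { "    $_\n" } @body) . "}";
    my $code = eval($source) or die "could not compile $name: $@";

    no strict 'refs';
    *{"${target}::${name}"} = $code;
    return $source;    # returned so the generated text can be inspected
}
```

Calling `generate_method('My::Class', 'list_all', returns_objects => 1)` would install a four-statement method, while omitting the option yields a three-statement one with no object-wrapping step at all. Real macros would do this at compile time and without building strings by hand.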

Almost 28 new names for 32 old marks
6 direct replies — Read more / Contribute
by tye
on Sep 06, 2014 at 01:42

    We were discussing a software bug and somebody mentioned "vertical pipe" and I thought, "Then it should be called 'bong'". It took several days after that, but I eventually settled on my new names for all of the ASCII punctuation marks:

    ! bang | bong @ bung & dung $ bling ^ sting < bring > brung ( sling ) slung [ cling ] clung { fling } flung : sing ; sung " string ' strong ` strang ~ swing = rung ? rang . ding , dang / slash \ sash - dash _ lash # bash * splash % rash + crash

    Each is mnemonic but I'll leave divining etymologies as an exercise; some of them might be entertaining to realize (some I find entertaining while obvious, YMMV).

    - tye        

RFC Using PERL HEREDOC script within bash
4 direct replies — Read more / Contribute
by dcronin135
on Aug 26, 2014 at 23:29

    This submission is in response to others asking how to embed Perl within a bash or ksh script. Though it may not be a common practice, here are a couple of examples of how this can be accomplished.

    #!/bin/sh
    # If you are not passing bash vars into the Perl heredoc,
    # then single-quote the heredoc tag.
    perl -le "$(cat <<'MYPL'
    # Best to build your output vars rather than writing directly
    # to the pipe until the end.
    my ($STDERRdata, $STDOUTdata) = ("", "");
    while (my $i = <STDIN>) {
        chomp $i;
        $STDOUTdata .= "To stdout\n";
        $STDERRdata .= "Write from within the heredoc\n";
    }
    print $STDOUTdata;   # Doing the pipe write at the end will save you
    warn $STDERRdata;    # a lot of frustration.
    MYPL
    )" <myInputFile 1>prints.txt 2>warns.txt


    #!/bin/sh
    WRITEWHAT="bash vars"
    # If you want to include your bash vars,
    # escape the $'s that are not bash vars.
    perl -le "$(cat <<MYPL
    my (\$STDERRdata, \$STDOUTdata) = ("", "");
    while (my \$i = <STDIN>) {
        chomp \$i;
        \$STDOUTdata .= "To stdout\n";
        \$STDERRdata .= "Write $WRITEWHAT from within the heredoc\n";
    }
    print \$STDOUTdata;   # Doing the pipe write at the end will save you
    warn \$STDERRdata;    # a lot of frustration.
    MYPL
    )" <myInputFile 1>prints.txt 2>warns.txt

    If you want to pass command line arguments, insert them before the < redirect for STDIN.

How realistic is an extended absence?
13 direct replies — Read more / Contribute
by ksublondie
on Aug 15, 2014 at 13:17
    I've been working for the same small, local company since college (12 years -- CS degree) and the sole programmer for the last 7...5 of which have been almost exclusively from home. I love my job, the company is great, can't ask for a better boss, I'm able to work independently and come up with my own projects. But lately, I've been contemplating staying home* to watch the kiddos (currently 3 all <=5). I'm flat out burned out and my priorities have shifted.

    How realistic is it to quit my job for an extended absence (5+ years) and later return to a programming/IT position? Am I going to be pigeonholed into the baby-track? Will I be untouchable & irrelevant?

    * EDIT: "staying at home" = quitting my job/programming. For clarification, I have been working at home full-time with the kiddos from day one. Always in the past, it worked rather well. It was all they ever knew. My parenting style is rather "hands off" (not to say I neglect my children, but I make sure their needs are met while teaching them to be independent and doing things for themselves if it's within their capability). As a result, they have amazing attention spans and are capable of entertaining themselves. Plus a fortune invested in baby gates helps. Toddlers running around are less distracting than my coworkers and all the drama, politics, meetings about the next meeting, etc.

    I don't know if it's the addition of #3, or their ages requiring more mental stimulation, or #2 being a yet-to-be-potty-trained holy terror...or a combination thereof...but it's not working so smoothly anymore. I'm debating about quitting completely. I can tell myself to "stay in the loop" independently, but realistically, I know I won't. I already feel irrelevant since I'm not physically in the office.

RFC: interface for a DBD::Proxy-like module client-side DSN
No replies — Read more | Post response
by MidLifeXis
on Aug 14, 2014 at 09:22

    I made mention of this in the CB the other day, but didn't get many responses, so I thought I would ask it here to perhaps get a wider audience and set of responses.

    I am modifying a copy of DBD::Proxy/DBI::ProxyServer so that instead of specifying the entire server-side DSN on the client side, you instead specify a known name of a handle to a configured DSN on the server side. Using this and implementing the sql section of the configuration to another set of known queries would allow the client to use a DBI compliant data source without needing to have the server-side implementation details available. I am also looking to update the connection / user / query definition sections to make them more able to be isolated from one another.

    • Does a client-side DSN along the lines of dbi:Router:hostname=$hostname;port=$port;dsn=dbi:router:$remotename seem like a reasonable interface? [clarification: $hostname and $port are for connecting to the proxy / routing server, not the database -- that is fully configured on the routing server] Is there something (currently) better to base this on than DBD::Proxy/DBI::ProxyServer?
    • Does the name seem sensible?
    • Should I just try to incorporate this directly into the DBD::Proxy core itself?
    • Any other thoughts / previously invented wheels / ideas?

    The major use case I have for this is to standardize access to all of the little bits of information I have to use for my applications which currently exist in different data stores (CSV/TSV, SQLite, ldap, ...) in order to migrate them into a more manageable setup without impacting the application code. This type of configuration would also allow for the mockup of a testing set of data, migration to different platforms, ...
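To make the proposed DSN shape concrete, here is a minimal sketch of how a driver might take it apart. It is not code from DBD::Proxy or from the module under discussion: `parse_router_dsn` and the returned key names are hypothetical, and it assumes that `dsn=` is always the last attribute so that the remote part may itself contain colons.

```perl
use strict;
use warnings;

# Hypothetical helper: splits a client-side DSN of the proposed form
#   dbi:Router:hostname=HOST;port=PORT;dsn=dbi:router:NAME
# into the routing-server attributes and the server-side handle name.
sub parse_router_dsn {
    my ($dsn) = @_;

    my ($scheme, $driver, $rest) = split /:/, $dsn, 3;
    die "not a DBI DSN: $dsn\n"
        unless defined $rest && lc($scheme) eq 'dbi';

    # Assumption: dsn= comes last, so everything after it is the remote part.
    my ($attr_string, $remote) = split /;?dsn=/, $rest, 2;
    die "missing dsn= part: $dsn\n" unless defined $remote;

    # hostname and port describe the routing server, not the database.
    my %attr;
    for my $pair (grep { length } split /;/, $attr_string) {
        my ($key, $value) = split /=/, $pair, 2;
        $attr{$key} = $value;
    }

    my ($name) = $remote =~ /\Adbi:router:(.+)\z/i
        or die "unexpected remote DSN: $remote\n";

    return { driver => $driver, %attr, remote_name => $name };
}
```

For `dbi:Router:hostname=db.example.com;port=3334;dsn=dbi:router:inventory` this returns the driver `Router`, the two connection attributes, and the remote name `inventory`, which is all the client ever needs to know about the server-side data source.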


    • pgbouncer was mentioned as a similar tool
    • Added description of my use case
    • Added a clarification of what the host/port refer to


RFC: pianobar event example
2 direct replies — Read more / Contribute
by ulterior_modem
on Aug 10, 2014 at 22:02
    Hello monks,

    I know enough perl to be bad; however I live in a unix userland and some things are more universally accepted than others, one of them being perl. I usually play around in php, but writing this sort of script in php seems wrong.

    My goal was for it to be understandable and log songs played via pianobar to csv for use with other things. What ways could this be improved? Any feedback is appreciated.



    use strict;
    use warnings;

    # this holds all of the lines output by pianobar.
    my @input = <STDIN>;
    # lines parsed into hash.
    my %data;
    # file we want to write output to
    my $file = '/home/ulterior/pandora.log';
    # assembled CSV line.
    my $line;
    # last line of logfile.
    my $lastline;

    # remove newlines from end of all values in array.
    chomp @input;

    # build hash from contents of array.
    foreach my $var (@input) {
        (my $key, my $value) = split(/\=/, $var);
        $data{$key} = $value;
    }

    # check to see if all the field we want are defined.
    if (defined($data{title}) && defined($data{artist}) && defined($data{album}) && defined($data{songStationName})) {
        # compose csv line with/without album art.
        if (defined($data{coverArt})) {
            $line = '"'.$data{title}.'","'.$data{album}.'","'.$data{artist}.'","'.$data{songStationName}.'","'.$data{coverArt}.'"'."\n";
        }
        else {
            $line = '"'.$data{title}.'","'.$data{album}.'","'.$data{artist}.'","'.$data{songStationName}.'"'."\n";
        }
    }

    # check to see if log file exists.
    if (-e $file) {
        # check to see if the last line is the same to avoid duplication.
        $lastline = qx/tail -n 1 $file/;
        if ($line eq $lastline) {
            exit(0);
        }
        # write csv line to file.
        else {
            open(HANDLE, ">>", $file);
            print(HANDLE "$line");
            close(HANDLE);
        }
    }

    Sample data.

    artist=Bastille
    title=Pompeii
    album=Pompeii (Remixes)
    coverArt= +00W_500H.jpg
    stationName=QuickMix
    songStationName=Major Tom Radio
    pRet=1
    pRetStr=Everything is fine :)
    wRet=1
    wRetStr=Everything's fine :)
    songDuration=214
    songPlayed=214
    rating=0
    detailUrl= +32&ad=1:23:1:47805::0:msn:0:0:581:307:IN:18167:0:0:0:0:6:0
    stationCount=74
    station0=28 Days Radio
    station1=And so on...
Private & Protected Objects
3 direct replies — Read more / Contribute
by Sixes
on Aug 10, 2014 at 13:46

    Some time ago (nearly 15 years, actually) in this thread, btrott was talking about various ways of protecting a blessed object and quite a lot of discussion came from it.

    I haven't seen anyone suggest using Variable::Magic to achieve this. I'm thinking of writing a base class with a class method on the lines of this.

    sub new {
        my $class = shift;
        my %params = @_;
        my $protected = sub {
            croak qq{Attempt to access protected data "$_[2]"}
                unless caller->isa(__PACKAGE__);
        };
        my $wiz = wizard(
            store  => $protected,
            fetch  => $protected,
            exists => $protected,
            delete => $protected,
        );
        my %self;
        cast %self, $wiz;
        my $self = \%self;
        bless $self, $class;
        $self->$_($params{$_}) foreach keys %params;
        return $self;
    }

    Does anyone have any views on whether this (a) will work correctly and (b) will be useful? The intention is to make the underlying hash inaccessible other than to subclasses of a class using this as a parent.

    The main problem I'm trying to solve is the programmer who accidentally types $obj->{field} when he meant $obj->field, thereby inadvertently bypassing any clever stuff in the getter.

Contemplating some set comparison tasks
8 direct replies — Read more / Contribute
by dwhite20899
on Aug 08, 2014 at 14:32

    I'm stewing on a particular task that is likely to reappear from time to time. I'd like to find an efficient way to do this work so it can scale up in future.

    In summary, I have Keys and Sources. A Key may come from one or many Sources, a Source may generate one or more Keys. What is the minimal list of Sources which cover the Keys?

    I have data in the format "Key|Source", /^[0-9A-F]{40}\|[0-9a-f]{40}$/

    0000002D9D62AEBE1E0E9DB6C4C4C7C16A163D2C|2f214516cdcab089e83f3e5094928fe9611f2f51
    000000A9E47BD385A0A3685AA12C2DB6FD727A20|2adeac692d450c54f8830014ee6cbe3a958c1e60
    00000142988AFA836117B1B572FAE4713F200567|04bb7bbed62376f9aaec15fe6f18b89b27a4c3d8
    00000142988AFA836117B1B572FAE4713F200567|6935a8fc967a6ffc20be0f07f2bb4a46072a397e
    00000142988AFA836117B1B572FAE4713F200567|8c88f4f3c4b1aff760a026759ae807af6c40e015
    00000142988AFA836117B1B572FAE4713F200567|974c820f53aded6d6e57ca8de2c33206e2b5f439
    00000142988AFA836117B1B572FAE4713F200567|b05be3e17bb9987ffb368696ee916dd9b9c2f9b3
    000001BCBC3B7C8C6E5FC59B686D3568132D218C|0d4c09539f42165bb8b1ab890fe6dc3d3ca838b3
    000001BCBC3B7C8C6E5FC59B686D3568132D218C|9fd421d4e020788100c289d21e4b9297acaaff62
    000001BCBC3B7C8C6E5FC59B686D3568132D218C|d09565280ebae0a37ca9385bc39c0a777a446554
    000001E4975FA18878DF5C0989024327FBE1F4DF|55b8ece03f4935f9be667e332d52f7db3e17b809
    000001EF1880189B7DE7C15E971105EB6707DE83|cd15550344b5b9c2785a13ef9583015f267ad667
    000002F2D7CB4D4B548ADC623F559683D6F59258|36bed8bdb6d66fb67f409166f5db64b02199812f
    0000034C9033333F8F58D9C7A64800F509962F3A|3c4b0a3c1acf6e03111805a0d8b4e879df112b7a
    000003682106A4CB4F9D3B1B6E5C08820FCFD1B2|cd15550344b5b9c2785a13ef9583015f267ad667
    00000368B9CFE1B4CF9F3D38F3EFD82840BA280D|50edd315b9217345b1728c38b02657df42043197
    000003A16A5A1C6CCDDBE548E85261422489A458|691845459c0ad35b28cce4dffc0e3ee8912fb0f5
    0000046FD530A338D03422C7D0D16A9EE087ECD9|13e213f346ce624e9be99b356ab9125af563a375
    0000046FD530A338D03422C7D0D16A9EE087ECD9|67c0da2da88a23a803733cea951e84974b34d029
    00000472E2B96A6CD0CBE8614779C5C8197BB42D|0c5e6cdb06c52160ded398d17392246269165e0a

    I am now dealing with a 190,000,000+ set of Key|Source pairs. There are 30,000,000 unique Key values and 20,000 unique Source values. There are 23,800,000 Keys that appear only once, so I know I must have at least their Sources in the final set I want. I need to find the smallest set of Sources that cover the 6,200,000 remaining Keys.

    I can think of a brute-force iteration method to do this, but there should be a more elegant (and hopefully more efficient) way to find the smallest coverage set of Sources over the 6,200,000 Keys.
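For reference, what is being asked for is the classic set cover problem, and finding the true minimum is NP-hard in general. A common compromise is the greedy heuristic: repeatedly pick the Source that covers the most still-uncovered Keys. The sketch below shows only that heuristic, so the result is a small cover but not necessarily the smallest one. `greedy_cover` is a hypothetical name, and the input is assumed to be an array reference of `[key, source]` pairs already split from the `Key|Source` lines.

```perl
use strict;
use warnings;

# Greedy set-cover heuristic (an approximation, not a guaranteed minimum).
# Input: array ref of [key, source] pairs. Output: list of chosen sources.
sub greedy_cover {
    my ($pairs) = @_;

    my %keys_of;      # source => { key => 1, ... }
    my %uncovered;    # key => 1 for every key not yet covered
    for my $pair (@$pairs) {
        my ($key, $source) = @$pair;
        $keys_of{$source}{$key} = 1;
        $uncovered{$key} = 1;
    }

    my @chosen;
    while (%uncovered) {
        # Find the source that covers the most uncovered keys; sorting the
        # source names makes tie-breaking deterministic.
        my ($best, $best_gain) = (undef, 0);
        for my $source (sort keys %keys_of) {
            my $gain = grep { $uncovered{$_} } keys %{ $keys_of{$source} };
            ($best, $best_gain) = ($source, $gain) if $gain > $best_gain;
        }

        push @chosen, $best;
        delete @uncovered{ keys %{ $keys_of{$best} } };
        delete $keys_of{$best};
    }
    return @chosen;
}
```

This naive version rescans every remaining Source on each round, which is fine for a sample but would need a priority queue or incremental gain counts at the scale described here. Sources of Keys that appear only once get picked up along the way because nothing else can cover those Keys.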

    My data is usually sorted by Key value, which is how I'm used to thinking of it. If I sort on Source value, I might have an inspiration.

    So I'm stewing on this...

    UPDATE 2014-08-12

    I have shrunk the problem set by identifying the keys which only come from single sources. I am now left only to consider the set of many-many key-source relationships. That is 9,197,129 relations, between 1,890 sources and 3,692,089 keys. I was able to reduce the key signatures from 40 char to 11 char and source signatures from 40 char to 6 char, to buy myself some space.

    Off to bang on this a while...

    Complete 170 MB data in 50MB zip file :
    format: key|source

and deleted modules
1 direct reply — Read more / Contribute
by marto
on Aug 06, 2014 at 05:16

    Recently I've been thinking about the tickets on associated with modules which no longer exist. For context see CPAN Day - 16th of August. One argument for not deleting these along with the module would be that a user may not know that a module they use has been deleted from CPAN, any bugs, patches and discussions in the associated rt queue would be useful to them. Perhaps there are other reasons I've not yet thought of for keeping them around.

    I wonder if it's worthwhile flagging each ticket associated with a deleted module, so that people could easily filter them out when browsing the queue and/or a way to highlight this, be it via displaying an additional field or some CSS.

    I'd be keen to find out if anyone had any additional thoughts on this issue.

When to Use Object Oriented approach in Perl? (RFC)
8 direct replies — Read more / Contribute
by thanos1983
on Jul 31, 2014 at 16:53

    Dear Monks,

    I do not know if this is actually the correct place to write my observations and ask my questions, so bear with me if I am posting this in the wrong place.


    I am not an expert in scripting or in Perl. I am a relatively new programmer with little experience, so my observations may not be the most accurate ones.


    Why use Perl with an Object Oriented approach?

    I found the following quotation in the book Beginning Perl (Programmer to Programmer) by Simon Cozens and Peter Wainwright, in chapter 11, Object-Oriented Perl, under the heading "Do you need OO?" on page 336:

    Object-oriented programs run slightly slower than equally-written procedural programs that do the same job, because packaging things into objects and passing objects around is expensive, both in terms of time and resources used. If you can get away without using object orientation, you probably should.

    The only reasons that I could come up with are to minimize and simplify the code and maybe, maybe, increase the processing speed in some cases.

    So in order to understand more about it, I created my own experiment with and without an Object Oriented approach. I tried to time the performance with the assistance of the Benchmark tool.


    I have two identical processes in the same script. One of them sends the data to the module, where the data is processed and sent back to the script. The second does exactly the same work, but instead of sending the data to another module it completes the processing in the same script. The purpose of creating the same process twice in the same script is to compare their execution speeds.

    This is the main.pl script.

    This is the module which processes the data.

    I am also including the conf.ini file in case someone wants to replicate the experiment.


    I conducted the experiment 4 times to get different results and observe the output.

    Output straight from my terminal:


    On the first run the result is as expected: the Object Oriented process was slower by 7%.

    The impressive part is the rest of the rounds. They actually show that the Object Oriented process is almost as fast as the normal process, and at some points it is even faster!


    The book was written back in 2000, fourteen years ago. From my point of view, Object Oriented programming is a huge advantage: it makes the code shorter and also possibly faster on some occasions.

    So in conclusion, when should a user choose an Object Oriented programming approach? Only if the code is really long?

    Thank you all for your time and effort to assist me with my question/discussion.

    Seeking for Perl wisdom...on the process...not there...yet!
RFC: Proc::Governor
3 direct replies — Read more / Contribute
by tye
on Jul 28, 2014 at 03:12

    Here is the documentation for a little module I threw together after one of our services did a denial-of-service attack against another of our services. The math for this simple trick works out very neatly.

    I plan to upload this to CPAN very soon. Please let me know what you think.


    Proc::Governor - Automatically prevent over-consumption of resources.


    use Proc::Governor();

    my $gov = Proc::Governor->new();

    while( ... ) {
        $gov->breathe();
        ... # Use resources
    }

    while( ... ) {
        my $res = $gov->work( sub {
            ... # Use Service
        } );
        ...
    }


    If you want to do a batch of processing as fast as possible, then you should probably also worry about overwhelming some resource and causing problems for other tasks that must share that resource. Fortunately, there is a simple trick that allows one to perform a batch of processing as fast as possible while automatically backing off resource consumption when most any involved resource starts to become a bottleneck (or even before it has become much of a bottleneck).

    The simple trick is to pause between steps for a duration equal to how long the prior step took to complete. The one minor down-side to this is that a single strand of execution can only go about 1/2 maximum speed. But if you have 2 or more strands (processes or threads), then throughput is not limited by this simple "universal governor" trick.
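As a bare-bones illustration of the trick itself (this is not Proc::Governor's implementation, and `governed_loop` is a hypothetical name), a loop can time each step with Time::HiRes and then sleep for the same duration:

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);

# Runs $step->($i) for $i in 1..$steps, pausing after each step for as long
# as that step took. Returns (seconds spent working, total wall-clock seconds).
sub governed_loop {
    my ($steps, $step) = @_;

    my $loop_start = time();
    my $worked     = 0;
    for my $i (1 .. $steps) {
        my $start = time();
        $step->($i);                        # the work that uses the resource
        my $elapsed = time() - $start;

        $worked += $elapsed;
        sleep($elapsed) if $elapsed > 0;    # pause as long as the step took
    }
    return ($worked, time() - $loop_start);
}
```

With one strand, the total time comes out at roughly twice the time spent working, which is the "about 1/2 maximum speed" cost mentioned above. When a shared resource slows down, each step takes longer, so the pauses grow with it and the strand backs off without any coordination.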

    It is also easy to slightly modify this trick so that, no matter how many strands you have working, they together (without any coordination or communication between the strands) will never consume more than, say, 60% of any resource (on average).

    A typical pattern for batch processing is a client sending a series of requests to a server over a network. But the universal governor trick also works in lots of other situations such as with 1 or more strands where each is doing a series of calculations and you don't want the collection of strands to use more than X% of the system's CPU.

    Note that the universal governor does not work well for resources that remain consumed while a process is sleep()ing, such as your process using too much memory.

    Proc::Governor provides lots of simple ways to incorporate this trick into your code so that you don't have to worry about your code becoming a "denial-of-service attack", which also frees you to split your processing among many strands of execution in order to get it done as fast as possible.



    my $gov = Proc::Governor->new( {
        working    => 0,
        minSeconds => 0.01,
        maxPercent => 100,
        unsafe     => 0,
    } );

    new() constructs a new Proc::Governor object for tracking how much time has recently been spent potentially consuming resources and how much time has recently been spent not consuming resources.

    new() takes a single, optional argument of a reference to a hash of options. The following option names are currently supported:


    If working is given a true value, then the time spent immediately after the call to new() is counted as "working" (consuming resources). By default, the time spent immediately after the call to new() is counted as "not working" (not consuming).


    minSeconds specifies the shortest duration for which a pause should be done. If a pause is requested but the calculated pause duration is shorter than the number of seconds specified for minSeconds, then no pause happens (and that calculated duration is effectively added to the next pause duration).

    The default for minSeconds is 0.01.


    maxPercent indicates how much of any particular resource the collection of strands should be allowed to consume. The default is 100 (for 100%, or all of any resource, but avoid building up a backlog by trying to over-consume any resource).

    Note that percentages are not simply additive. Having 3 groups of clients where each is set to not consume more than 75% of the same service's resources is the same as having just 1 group. The 3 groups together will not consume more than 75% of the service's resources in total.

    Say you have a group of clients, H, all set to not consume more than 50% of some service's resources and you have another group of clients, Q, all set to not consume more than 25% of that same service's resources. Both H and Q together will not add up to consuming more than 50% of the service's resources.

    If Q is managing to consume 20% of the service's resources when H starts running, then H won't be able to consume more than 30% of the service's resources without (slightly) impacting performance to the point that Q starts consuming less than 20%.

    H     Q     Total
    50%    0%   50%
    40%   10%   50%
    30%   20%   50%
    25%   25%   50%


    You can actually specify a maxPercent value larger than 100, perhaps because you have measured overhead that isn't easily accounted for by the client. But doing so risks overloading a resource (your measured overhead could end up being a much smaller percentage of the request time when the service is near capacity).

    So specifying a maxPercent of more than 100 is fatal unless you also specify a true value for unsafe.


    $gov->beginWork( $breathe );

    Calling beginWork() means that the time spent immediately after the call is counted as "working" (consuming resources). Such time adds to how long the next pause will be.

    If $breathe is a true value, then beginWork() may put the strand to sleep for an appropriate duration.


    $gov->endWork( $breathe );

    Calling endWork() means that the time spent immediately after the call is counted as "not working" (not consuming resources). Such time subtracts from how long the next pause will be.

    If $breathe is a true value, then endWork() may put the strand to sleep for an appropriate duration.


    $gov->work( sub {
        ... # Consume resources
    }, $which );

    work() is a convenient shortcut that is roughly equivalent to:

    $gov->beginWork( $before );
    ... # Consume resources
    $gov->endWork( $after );

    The value of $which can be:

    0   No pause will happen.
    1   A pause may happen before the sub reference is called.
    2   A pause may happen after the sub reference is called.
    3   A pause may happen before and/or after the sub is called.

    If $which is not given or is undefined, then a value of 1 is used.

    You can actually get a return value through work():

    my @a = $gov->work( sub { ...; get_list() }, $which );
    my $s = $gov->work( sub { ...; get_item() }, $which );

    Note that scalar or list (or void) context is preserved.
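For readers wondering how a wrapper can pass the caller's context through to a sub reference, the usual Perl idiom is to branch on `wantarray`. The sketch below shows that idiom only; it is not taken from Proc::Governor, and `call_preserving_context` and its `$before`/`$after` hooks are hypothetical stand-ins for whatever bookkeeping surrounds the call.

```perl
use strict;
use warnings;

# Calls $code in the same context (list, scalar or void) in which this
# wrapper itself was called, running $before and $after around it.
sub call_preserving_context {
    my ($code, $before, $after) = @_;

    $before->();

    my @result;
    if (wantarray) {                 # caller wants a list
        @result = $code->();
    }
    elsif (defined wantarray) {      # caller wants a single value
        $result[0] = $code->();
    }
    else {                           # caller ignores the return value
        $code->();
    }

    $after->();

    return wantarray ? @result : $result[0];
}
```

Because the inner call is made before `$after` runs, an exception thrown by `$code` skips the `$after` hook, which matches the caveat about exceptions noted for work().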

    Currently, if your code throws an exception, then endWork() does not get called. This is the same as would happen with the "equivalent" code shown above.


    $gov->breathe( $begin );

    Calling breathe() requests that the current process/thread pause for an appropriate duration.

    Each of the following:

    $gov->breathe();
    # or
    $gov->breathe( 1 );

    is actually equivalent to:

    $gov->beginWork( 1 );


    $gov->breathe( 0 );

    will just pause but will not change whether $gov is counting time as "working" or as "not working".


    $gov->pulse( $count, $begin );

    pulse() is very much like breathe() except that it is optimized for being called many times before enough "working" time has accumulated to justify doing a pause. The meaning of $begin is the same as with breathe().

    So, if you are making requests of a very fast service or are doing work in small chunks, then you can call pulse() directly in your loop and just pass it a value specifying approximately how many calls to pulse() should be made before one of those calls does the work of calculating how long of a pause is called for.

    For example, a request to our Redis service typically takes a bit under 1ms. So code to perform a large number of such requests back-to-back might be written like:

    my $gov = Proc::Governor->new( {
        maxPercent => 70,
        working    => 1,
    } );
    my $redis = Redis->new(server=>...);
    while( ... ) {
        $gov->pulse( 20 );
        $redis->...;
    }

    That is like calling breathe() every 20th time through the loop and is only the slightest bit less efficient (in run time) than if you had made the extra effort to write:

    ...
    my $count = 0;
    while( ... ) {
        if( 20 < ++$count ) {
            $gov->breathe();
            $count = 0;
        }
        ...


    A single process (or thread) can simultaneously use more than one Proc::Governor object. For example, each process (of a group) that makes a series of requests to a service and does significant local processing of the data from each request might want to both prevent overwhelming the service and prevent overwhelming local resources (such as CPU).

    So you could have two Proc::Governor objects. One throttles use of local resources ($g_cpu below). The other throttles use of service resources ($g_db below).

    my $g_cpu = Proc::Governor->new( { maxPercent => 80 } );
    my $g_db  = Proc::Governor->new( { maxPercent => 30 } );

    $g_db->beginWork();
    my $db = DBI->connect( ... );                   # DB work
    my $rows = $db->selectall_arrayref( ... );
    $g_db->endWork();

    for my $row ( @$rows ) {
        my $upd = $g_cpu->work( sub {
            process_row( $row );                    # Local work
        } );
        $g_db->work( sub {
            $db->update_row( $upd );                # DB work
        } );
    }

    The above code assumes that the local resources required for making requests of the database service are relatively low. It also recognizes that local computations do not use database resources.

    If you set maxPercent to 100 for both Governors and each process spent about the same amount of time waiting for a response from the database as it spent performing local computations, then there might be no need for any pauses.

    Note that only time spent doing "DB work" adds to how long of a pause might be performed by the $g_db Governor. And only time spent doing "Local work" adds to how long of a pause might be performed by the $g_cpu Governor.

    Any pauses executed by either Governor get subtracted from the duration of any pauses of any Governor objects. So the $g_db Governor executing a pause also counts as a pause for the $g_cpu Governor (and thus makes the next pause that it performs either shorter or later or just not needed).

    Time spent inside of Proc::Governor methods may also be subtracted from future pause durations. But the code pays more attention to keeping such overhead small than to providing highly accurate accounting of the overhead and trying to subtract such from every Governor object.


    Say you have a service that is a layer in front of some other service. You want to ensure that your service can't become a denial-of-service attack against the other service. But you want to prevent a Governor pause from impacting clients of your service when possible.

    You could implement such as follows:

    sub handle_request {
        my( $req ) = @_;
        our $Gov ||= Proc::Governor->new();
        my $res = $Gov->work(
            sub { forward_request( $req ); },
            0,                      # Don't pause here.
        );
        give_response( $res );
        $Gov->breathe( 0 );         # Pause here; still idle.
    }

    (Well, so long as your service architecture supports returning a complete response before the request handler subroutine has returned.)

    If the other service is not near capacity, then the added pauses have no impact (other than perhaps preventing the number of active strands for your service from dropping lower). Be sure your service has an appropriate cap on how many strands it is allowed to keep active (as always).


    A future version should have support for asynchronous processing. The shape of that interface is already sketched out, but the initial release was not delayed by the work to implement such.

    - tye        

Speeds vs functionality
7 direct replies — Read more / Contribute
by Tux
on Jul 27, 2014 at 12:48

    So my main question, also to myself, is "How much speed are you willing to sacrifice for a new feature?".

    Really. Let's assume you have a neat module that deals with your data, and it deals with it pretty well and reliably, but extending it with new features - some of them asked for by others - is getting harder and harder.

    We now have git, and making a branch is easy, so you can implement the most requested new feature, or the one that most appeals to you, and when you are done and all old tests and new tests have passed, you notice a speed drop.

    What considerations do you make to decide whether to release the module with the neat new feature and mention the slowdown (specified), or to revert the change and note in the docs that the new feature would cause too big a slowdown?

    Enjoy, Have FUN! H.Merijn
How I spend my day meditating on perl is this the same with you?
6 direct replies — Read more / Contribute
by 5plit_func
on Jul 24, 2014 at 18:11

    Dear Monks, I am pretty new to programming. In the past I spent my time trying to become a pro overnight, but lately I realized my initial approach to learning was wrong. I spent most of my time doing irrelevant things on the computer because I found it hard to focus on one thing, which is practicing programming. Lately I spend the whole day lying on my bed trying to slow down my life and rediscover myself. In doing this I only use the computer for a short period of time, usually in the evenings, and I now visit fewer websites. When I notice I am no longer able to focus, I log out and relax. I do this because I have come to realize that if I want to be a skilled programmer, it is something I have to commit to and let other things go. In doing this I find myself learning more, and though my progress is quite slow, I'm pleased with the pace at which I am going. These are still early days of my approach. I hope I am not doing anything wrong in my approach. I would love to know how others relax and spend their day at home or at work, and what advice you all have for me. Thanks in advance.

, metacpan and PAUSE all broken in different ways?
4 direct replies — Read more / Contribute
by Sixes
on Jul 19, 2014 at 14:25

    Starting with PAUSE, I have uploaded several new modules to PAUSE. Each time, I get a message that says:

    This distribution name can only be used by users with permission for the package <My::Package>, which you do not have.

    The packages are in new namespaces. As I understand it, simply uploading a new module should allocate that namespace to me on a "first-come" basis. But it isn't.

    This doesn't seem to matter to, when it's working: it still indexes the module so that it can be found and downloaded via the cpan utility.

    However that doesn't seem to apply to metacpan. It uses 02packages.details.txt which isn't being updated, presumably because of the PAUSE issue. Thus my modules are not appearing on metacpan in their search. Metacpan's help says:

    MetaCPAN uses the PAUSE generated 02packages.details.txt file. If it's not in there, then the module author will need to fix this,

    Does anyone know if it's fixable? I have mailed a couple of times but no response.

The problem with "The Problem with Threads"
5 direct replies — Read more / Contribute
by BrowserUk
on Jul 18, 2014 at 07:26

    This started life as a reply to Re^2: Which 'Perl6'? (And where?), but it seems too important to bury it down there in a long dead thread as a reply to an author I promised to resist, and who probably will not respond. So I'm putting it here to see what, if any, interest it arouses.

    1. Is concurrency appropriate? There are two basic motivations ... and 2) to speed things up. In the latter case, if the problem being tackled is really IO bound, turning to concurrency probably won't help.

      That is way too simplistic a view. If the problem is IO bound to a single, local, harddisk, and is uncacheable, then concurrency may not help.

      But change any of the four defining elements of those criteria; and it might -- even: probably will -- be helped by well written asynchronicity. Eg.

      1. If the IO data is, or can be, spread across multiple local physical drives; concurrency can speed overall throughput by overlapping requests to different units.
      2. If the disks are remote -- as in SAN, NAS, cloud etc. -- then again, overlapping requests can increase throughput by utilising buffering and waiting time for processing.
      3. If the drives aren't harddisks, but SSDs; or SSD buffered HDs; or PCI connected virtual drives; then overlapping several fast read requests with each slower write request can more fully utilise the available bandwidth and improve throughput.
      4. If the IO involved displays temporal locality of reference -- that is, if the nature of the processing is such that a subset of the data has multiple references over a short period of time, even if that subset changes over the longer term -- then suspending the IO for new references until re-references to existing cached data play out comes about naturally if fine-grained concurrency is used.

      And if some or all of the IO in your IO bound processing is to the network, or network attached devices; or the intranet; or the internet; or the cloud; -- eg. webserving; webcrawling; webscraping; collaborative datasets; email; SMS; customer facing; ....... -- then both:

      • Preventing IO from freezing your processing;
      • And allowing threads of execution whose IO has completed to continue as soon as a core is available -- ie. not also have to wait for any particular core to become available;

      Are mandatory for effective utilisation of modern hardware and networks; even for IO-bound processing.

      Only kernel(OS) threading provides the required combination of facilities. Cooperative multitasking (aka. 'green threads'; aka. Win95 tech) simply does not scale beyond the single core/single thread hardware of the last century.

    2. The Problem with Threads.

      The problem with "The Problem with Threads", is that it is just so much academic hot air divorced from the realities of the real world.

      Only mathematicians and computer scientists demand total determinacy; and throw their arms up in refusal to work if they don't get it.

      The rest of the world -- you, me, mothers and toddlers, doctors, lawyers, spacemen, dustmen, pilots, builders, shippers, movers & shakers, factory workers, engineers, tinkers, tailors, soldiers, sailors, rich & poor men, beggars and thieves; all have to live in the real -- asynchronous -- world, where shit happens.

      Deliveries are late; machines break down; people are sick; power-outs and system-downs occur; the inconvenient realities of life have to be accepted, lived with and dealt with.

      The problem is not that threading is hard; the problem is that people keep on saying that "threading is hard"; and then stopping there.

      Man is very adept at dealing with hard and complex tasks

      Imagine all the places you'd never have been; all the things you'd never have done; if the once wide-spread belief that we would suffocate if we attempted to travel at over 30mph had persisted.

      Too trivial an example for you? Ok. Think about heart transplantation. Think about the problems of disconnecting and reconnecting the (fragile, living) large bore pipes supplying and removing the pumped liquid; the wires carrying electrical control signals; the small bore pipes carrying the lubricants needed to keep the pump alive and removing the waste. Now think about the complexities of doing a pump change whilst keeping the engine running; the passengers comfortable and the 'life force' intact. And all the while contending with all the other problems of compatibility; rejection; infection; compounded diagnosis.

      Circa 5000 coronary transplants occurred last year. Mankind is good at doing difficult things.

      Asynchronicity and non-determinism are 'solved problems' in almost every other walk of life

      From multiple checkouts in supermarkets; to holding patterns in the skies above airport hubs; to off & on ramps on motorways; to holding tanks in petro-chemical plants; to waiting areas in airports and doctors and dentists surgeries; to carousels in baggage claims and production lines; distribution warehouses in supply chains; roundabouts and filter-in-turn; {Add the first 10 things that spring to your mind here! }.

      One day in the near future a non-indoctrinated mathematician is going to invent a symbol for an asynchronous queue.

      She'll give it a nice, technical sounding name like "Temporally Lax Composer", which will quickly become lost behind the cute acronym, and a new era of deterministic, asynchronous composability will ensue.

      And the academic world will rejoice, proclaim her a genius of our time, and no doubt award her a Nobel prize. (That'd be nice!)

      And suddenly the mathematicians will realise that a process or system of processes can be deterministic, without the requirement for every stage of the process (equation) to occur in temporal lockstep.

      'Safety' is the laudable imperative of the modern era.

      As in code-safety and thread-safety, but also every other kind of predictable, potentially preventable danger.

      Like piety, chastity & sobriety from bygone eras, it is hard to argue against; but the world is full (and getting fuller) of sexually promiscuous atheists who enjoy a drink; that hold down jobs, raise kids and perform charitable works. The world didn't fall apart with the wane of the religious, moral and sobriety campaigns of the past.

      In an ideal world, all corners would be rounded; flat surfaces 'soft-touch'; voltages would be low; gases non-toxic; hot water wouldn't scald; radiant elements wouldn't sear; microwaves would be confined to lead-lined bunkers; there'd be no naked flames; and every home would be fire-proof, flood-proof, hurricane-proof, tornado-proof, earthquake-proof, tsunami-proof and pestilence-proof.

      Meanwhile in the real-world, walk around your own house and see all the dangers that lurk for the unsupervised, uneducated, unwary, careless or stupid and ask yourself why do they persist? Practicality and economics.

      Theoreticians love theoretical problems; and eschew practical solutions.

      When considering concurrency, mathematicians love to invent ever so slightly more (theoretically) efficient solutions to the 'classical' problems.

      Eg. The Dining Philosophers. In a nutshell: how can 6 fil..Phillo.. guys eat their dinners using 5 forks without one or more of them starving. They'll combine locks and syncs, barriers and signals, mutexes and spinlocks and semaphores trying to claw back some tiny percentage of a quasilinear factor.

      Why? Buy another bloody fork; or use a spoon; or eat with your damn fingers.

      The problem is said to represent the situation where you have 6 computers that need to concurrently use the scarce resource of 5 tape machines. But that's dumb!

      It's not a resource problem but a capital expenditure problem. Buy another damn tape machine and save yourself 10 times its cost by avoiding having to code and maintain a complex solution. Better still, buy two extra tape machines; cos as sure as eggs is eggs, it'll be the year-end accounting run; or the Black Friday consumer spending peak when one of those tape machines defies the 3 sigma MTBF and breaks.

      Threading can be complex, but there are solutions to all of the problems all around us in the everyday, unreliable, non-deterministic operations of everyday modern life.

      And the simplest solution to many of them is to avoid creating problems in the first place. Don't synchronise (unless you absolutely have to). Don't lock (unless it is absolutely unavoidable). Don't share (unless avoiding doing so creates greater problems).

      But equally, don't throw the baby out with the bath water. Flames are dangerous; but oh so very useful.

    3. Futures et al are the future. There are much simpler, safer, higher level ways to do concurrency. I haven't tried Paul Evans' Futures, but they look the part.

      And therein lies the very crux of the problem. Most of those decrying threads; and those offering alternatives to them; either haven't tried them -- because they read they were hard -- or did try them on the wrong problems, and/or using the wrong techniques; and without taking the time to become familiar with and understand their requirements and limitations.

      Futures neither remove the complexity nor solve the problems; they just bury them under the covers forcing everyone to rely upon the efficacy of their implementation and the competence of the implementors.

      And the people making the decisions are taking advice from those thread-shy novices with silver bullets and employing those with proven track records of being completely useless at implementing threaded solutions.

      The blind taking advice from the dumb and employing the incompetent.

    4. Perl 5 "threads" are very heavy. This sometimes introduces additional complexity.

      The "heaviness" of P5 threading is a misnomer. The threads aren't heavy; the implementation of shared memory is heavy. And that could easily be fixed. If there was any interest. If there wasn't an institutionalised prejudicial barrier preventing anyone even suggesting change to improve the threading support; much less supporting those with the knowledge and ideas to take them forward.

      They've basically stagnated for the past 8 or more years because p5p won't allow change.

    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
