If you've discovered something amazing about Perl that you just need to share with everyone,
this is the right place.
This section is also used for non-question discussions about Perl, and for any discussions that are not specifically programming related. For example, if you want to share or discuss opinions on hacker culture, the job market, or Perl 6 development, this is the place. (Note, however, that discussions about the PerlMonks web site belong in PerlMonks Discussion.)
Meditations is sometimes used as a sounding board (a place to post initial drafts of Perl tutorials, code modules, book reviews, articles, quizzes, etc.) so that the author can benefit from the collective insight of the monks before publishing the finished item to its proper place (be it Tutorials, Cool Uses for Perl, Reviews, or whatever). If you do this, it is generally considered appropriate to prefix your node title with "RFC:" (for "request for comments").
Edit: Not going to edit this node because of replies, but Eily noticed there is an extra single quote at the end of my port :/
This recent post in r/programming poses an interesting question for a bit of golf:
"What is the shortest word in the English Language which contains: a b c d e f?"
Some Junk™ code was posted to figure this out and my Perl version is a bit shorter. I know many of you Perl better than I do, so perhaps you can see how else to do this:
The Junk Code:
sorted([w.strip() for w in open('/usr/share/dict/words', 'r').readlines()
        if set(list('abcdef')).issubset(set(list(w.strip())))],
       key=lambda x: len(x))
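For comparison, here is a minimal Perl sketch of the same search. It is an illustration only, not the Perl version referred to above: the helper name `shortest_with_letters` and the tiny in-line word list are assumptions made so the example runs without a dictionary file.

```perl
use strict;
use warnings;

# Return the shortest word that contains every letter in $letters,
# or undef if no word qualifies. (Hypothetical helper name.)
sub shortest_with_letters {
    my ($letters, @words) = @_;
    my @needed = split //, lc $letters;
    my ($best) =
        sort { length $a <=> length $b }
        grep { my $w = lc; !grep { index($w, $_) < 0 } @needed }
        @words;
    return $best;
}

# Tiny stand-in list; a real run might read /usr/share/dict/words instead.
my @words = qw(fabricated backfield feedback alphabet);
print shortest_with_letters('abcdef', @words), "\n";    # prints "feedback"
```

The inner `grep` counts the required letters that are missing from a word, so a word is kept only when that count is zero.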
Most perls assign floating point values using perl's internal Atof function, and that includes perls that define "Perl_strtod".
But Perl's Atof function is notoriously incorrect, and a far better alternative IMO is to have floats assigned using Perl_strtod, which is just a wrapper around C's strtod() or strtold() or strtoflt128(), whichever is appropriate for the particular perl's nvtype.
First up, I should point out that -Dusequadmath builds (ie builds for which $Config{nvtype} reports "__float128") already use Perl_strtod(), with the result that the __float128 values are assigned correctly, in my experience on Ubuntu-16.04. (By "correctly", I mean rounded to nearest, ties to even.)
But when perl's nvtype is "double" or "long double", then values are being assigned using perl's Atof function and there's a fair chance that values are being assigned incorrectly.
The magnitude of Atof's inaccuracies is not particularly large: mostly it's only 1 unit of least precision (ULP). But it can be as large as 7 ULP when nvtype is "double" and as large as 54 ULP when nvtype is the extended precision "long double".
(The figures of "7" and "54" are the largest I've found, having tested millions of random values, and those 2 numbers turn up often enough.)
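To make "1 ULP" concrete, here is a small sketch that shows two neighbouring doubles and counts the ULPs between them. It assumes IEEE 754 doubles and a perl with 64-bit integer support (needed for the `Q` pack format); the helper names `double_hex` and `ulp_distance` are made up for this illustration.

```perl
use strict;
use warnings;

# Big-endian hex image of a value stored as an IEEE 754 double.
sub double_hex {
    my ($value) = @_;
    return unpack 'H*', pack 'd>', $value;
}

# Distance in ULPs between two finite doubles of the same sign:
# for such values, adjacent doubles have adjacent 64-bit bit patterns.
sub ulp_distance {
    my ($x, $y) = @_;
    my ($ix, $iy) = map { unpack 'Q>', pack 'd>', $_ } $x, $y;
    return $ix > $iy ? $ix - $iy : $iy - $ix;
}

my $one      = 1;
my $next_one = 1 + 2**-52;    # the next representable double above 1
printf "%s\n%s\n", double_hex($one), double_hex($next_one);
printf "ULP distance: %d\n", ulp_distance($one, $next_one);
```

The two hex strings differ only in the last digit, which is what a 1 ULP discrepancy between two assignments of the same string looks like.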
The actual likelihood of striking inaccuracies with Atof depends upon the exponent range that you're working in. If the exponent is in the range (say) -10 to 10 the likelihood of an incorrect assignment is about 10%.
But when I randomly select values across the full exponent range, I'm finding that the chances of an incorrect assignment rise to around 97% for "doubles" and 82% for "long doubles".
When I hack the perl source to use Perl_strtod, the chances of an incorrect assignment become 0. (Ok ... I haven't checked every value ... but I've not yet found a value that has been incorrectly assigned by Perl_strtod on Ubuntu.)
It turns out that using Perl_strtod instead of perl's Atof is very easy to implement. We just need to open up numeric.c in the top level perl source folder, replace (the one occurrence of) "strtoflt128" with "Perl_strtod", replace every occurrence of "USE_QUADMATH" with "Perl_strtod", and rebuild perl.
The actual patch (for perl-5.28.0 source) can be downloaded from my scratchpad.
UPDATE: Better to grab this patch because:
a) it's a portable patch for both mingw-w64-built Windows perl && Linux perl;
b) at some time I'll probably clear my scratchpad.
That's about it. If your perl's nvtype is "__float128" or your build of perl doesn't define "Perl_strtod", then applying the patch will not change anything.
Otherwise, however, if you build perl using the patched numeric.c then perl will assign floating point values using Perl_strtod instead of perl's Atof.
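A quick way to see which case a given perl falls into is to ask the core Config module for its nvtype. This is only a sketch: the helper name `nv_summary` is invented here, and whether "Perl_strtod" is defined is decided in perl's C headers at build time, which this snippet does not check.

```perl
use strict;
use warnings;
use Config;

# Summarise how this perl stores floating point values (NVs).
sub nv_summary {
    return sprintf '%s (%d bytes)', $Config{nvtype}, $Config{nvsize};
}

print "nvtype: ", nv_summary(), "\n";
print $Config{nvtype} eq '__float128'
    ? "quadmath build: the patch would change nothing here\n"
    : "double or long double build: the patch may apply\n";
```

The same information is available from the command line with `perl -V:nvtype`.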
It's very much the same story on MS Windows wrt mingw-w64 builds of perl whose nvtype is "double", where exactly the same patch makes equally dramatic improvements to the assigning of floating point values.
Sadly, however, for "long doubles" on Windows, there's https://sourceforge.net/p/mingw-w64/bugs/711 and https://sourceforge.net/p/mingw-w64/bugs/725 that complicate matters.
And there's also an issue wrt strtold's assigning of some subnormal long double values, for which I've yet to submit a bug report.
(More about Windows at a later date.)
Here's the script I use to check $ARGV[1] randomly selected values within a specified exponent range (-$ARGV[0] to +$ARGV[0]).
# atonv.pl
# Test a range of values for
# correctness of assignment

use strict;
use warnings;
use Math::MPFR qw(:mpfr);

die "Upgrade to Math-MPFR-4.03"
  unless $Math::MPFR::VERSION >= 4.03;

die "Usage: perl atonv.pl maximum_exponent how_many_values"
  unless @ARGV == 2;

$|++; # unbuffer STDOUT

my $display = 0;

while($display !~ /^y/i && $display !~ /^n/i) {
  print("Do you want mismatched values to be displayed ? [y|n]: \n");
  $display = <STDIN>;
}

$display = 0 if $display =~ /^n/i;

my($mant, $exp, $perl_unpacked, $mpfr_unpacked, $str_value);
my($count, $diff, $max_diff, $min_diff) = (0, 0, 0, 0);

my $max_exp = $ARGV[0];
$max_exp++; # so that int(rand($max_exp)) can reach $ARGV[0]

# $workspace is the Math::MPFR object to which
# the value being tested is assigned.
# Here we set the precision of $workspace to the
# same number of bits as perl's NV.
my $workspace = Rmpfr_init2($Math::MPFR::BITS);

my $failed = 0;
my($perl_nv, $mpfr_nv);

for(;;) {
  $count++;

  # Random 9-digit mantissa of the form d.dddddddd
  $mant = int(rand(10))
          . '.'
          . int(rand(10))
          . int(rand(10))
          . int(rand(10))
          . int(rand(10))
          . int(rand(10))
          . int(rand(10))
          . int(rand(10))
          . int(rand(10))
          ;

  # Random exponent, negative on every odd iteration
  $exp = int(rand($max_exp));
  $exp = "-$exp" if $count % 2;

  $str_value = $mant . "e$exp";

  # Assign $str_value to $mpfr_nv using mpfr
  $mpfr_nv = atonv($workspace, $str_value);

  # Assign $str_value to $perl_nv using perl
  $perl_nv = "$str_value" + 0;

  # $mpfr_nv and $perl_nv should be exactly equivalent.
  # Else at least one of mpfr and perl has assigned incorrectly.
  # IME, mpfr does not assign incorrectly.
  unless($perl_nv == $mpfr_nv) {
    $failed++;
    $perl_unpacked = scalar reverse unpack "h*", pack "F<", $perl_nv;
    $mpfr_unpacked = scalar reverse unpack "h*", pack "F<", $mpfr_nv;
    print "$str_value: $mpfr_nv:\n $perl_unpacked vs $mpfr_unpacked\n\n" if $display;

    # Compare the 8 least significant hex digits to get the difference in ULPs
    $diff = hex(substr($perl_unpacked, -8, 8)) - hex(substr($mpfr_unpacked, -8, 8));
    if($diff > $max_diff) {
      $max_diff = $diff;
    }
    elsif($diff < $min_diff) {
      $min_diff = $diff;
    }
  }

  last if $count == $ARGV[1];
}

print "Count: $count\n";
print "Failed: $failed\n";
print "Largest differences were $max_diff ULPs and $min_diff ULPs\n";
It requires Math-MPFR-4.03. If you want to test values in the subnormal range, you should build Math::MPFR against mpfr-4.0.x as earlier versions of mpfr were buggy in their calculation of subnormals.
As a starter, run perl atonv.pl 300 100, opting to display mismatches, and see how that fares.
Whenever I run that command against a patched perl-5.28.0, 0 mismatches are detected, irrespective of perl's nvtype.
Whenever I run that command against an unpatched perl-5.28.0, about 80 failures are detected unless, of course, nvtype is "__float128", in which case no failures occur.
There's probably not many who would bother, but I certainly intend to continue building perl with this hack in place.
UPDATE: For the record, gcc version on my Ubuntu box is 5.4.0, and libc version is 2.23
Me? (Not that I'm an expert.) Because Perl was handy, extremely useful, and didn't require a separate compile phase. Because I could solve other people's problems with it. Because it was general purpose, and not specifically geared for stream picking / editing. Because it was free. Because it was more fun than any other language I knew at the time.
QM

Quantum Mechanics: The dreams stuff is made of
I wonder what will happen the day search.cpan.org goes away, and the day after.
One more week and we will know! Does Perlmonks get flooded with
identical questions? Does FUD gleefully announcing the death of Perl and
CPAN make the rounds?? Do those redirects work???
STOP REINVENTING WHEELS, START BUILDING SPACE ROCKETS! -- CPAN
CGI::Alternatives is a module that "doesn't do anything"^{1} except
vehemently deny and propagandize against the utility of one of the most useful
forms of programming EVER conceived: CGI^{2} programming.
CGI::Alternatives perpetuates all the
common fallacies against CGI that if heeded only disempower independent
developers. This includes advocating the replacement of all of Perl's
wonderful and extremely simple, stable, mature, powerful CGI modules with
vastly more byzantine "frameworks"^{1} which rigidly enforce all sorts of
corporate nonsense like "full separation of concerns"^{1}, total object-oriented
lack of any possible functionality, and the absurd complication of allowing
oneself to be used by something as annoyingly
totalitarian as templates^{1} EVEN when they're not appropriate! All of these
techniques have their place of course, mostly in big projects, with lots
of tiny modules (to confuse management, ensure job stability, in
competitive workplaces, stretching hours into months, for the children),
but not usually in code written by us individuals for fun, prototyping,
and extreme levels of pure: results.^{3}
With all due respect to the author's efforts to change how we Perl into
something his bosses find acceptable, the author of CGI::Alternatives is
actually in charge of CGI.pm! How is this even possible? I realize the
author is a talented programmer who has contributed significantly to CPAN,
but this quote directly reflects his inappropriate state of mind towards the
CGI paradigm (while CGI.pm is derided with that weird novelty-obsessed
bigotry for being "old", as he removes perfectly sensible functions, only to
prove his pointless point):
"You can't just hand a template to the webdesigners and allow them to work their
magic. Don't mix the business logic and the presentation layer. Just don't."
This guy doesn't even know what a CGI programmer does yet he dictates to us? This is CRAZY! We ARE the webdesigners, OF the business logic, AND the presentation
layer, ALL mixed together, like a SWISS ARMY CHAINSAW: this is our TECHNIQUE! THIS IS Perl! Something YOU (Lee) obviously don't understand. Mixing it all up is exactly how some other language(s) seized the web from Perl (along with plenty of well-funded
corporate FUD). Even though we still do it far better.
We have been here from the beginning and we remain no matter how many of our
tools you try to disable or how much FUD you spread about our primordially
awesome technique of producing ONE UNIFIED FILE, USING CORE MODULES, QUITE OFTEN
VASTLY SUPERIOR TO FRAGMENTED TEMPLATES, WRITTEN BY ONE PERL GENIUS, RATHER THAN
A TEAM OF HOPELESSLY ABSTRACTED CORPORATE DRONES: Because Larry wrote and maintains Perl that way; Blessed be.
How dare you tell us to stop doing what we love and what Perl empowers us
to do? How dare you remove the HTML generation functions from CGI.pm? Who do
you think you are anyway? People who come to Perl and say things have got to
change don't appreciate Perl and should be led as far away from Perl as
possible (Python), not in charge of (formerly, unfortunately) core modules!
Can someone who cares please take CGI.pm away from Lee Johnson (LEEJO)? I
would feel far more comfortable with someone we can trust, like ikegami or Merlyn^{3}, in
charge of maintaining CGI.pm. At least we know they would give us what we
want and need, and more, rather than inflicting torture by removing legacy
functionality FOR EMOTIONAL REASONS thereby violating operational stability.
News for you Lee: What worked 20 years ago still works today: UNIX, POSIX, BASH,
PERL, ME, AND MAYBE EVEN YOU. Mature technology never stops working! I appreciate
innovation so don't necessarily stop trying to reinvent the wheel, but please do
stop trying to shove your shiny new wheels in sheep's clothing down our throats
because PERL ALREADY WON.^{4}
If we stopped wasting time and spirit listening to ideologically driven flame
warrior infiltrators who keep trying to change Perl we would already have a
perfectly backward compatible and "fixed" (even though it has never failed me
to this very day thank you Lincoln Stein^{5}) CGI.pm on the corelist joined by
other bits we desperately need and use EVERY SINGLE DAY like CGI::Carp,
Data::Dumper and File::Slurp.
Some examples:
This extremely useful one-line CGI debugger is now broken thanks to LEEJO (thanks!):
If you agree PLEASE respond! If you do NOT agree please DO NOT hijack the thread because
you guys already kinda won and I hope this thread can be for
CGI programmers to chime in and support this seemingly lost cause which is
really not even close to lost in the real non-ideological world of actual
programmers who GET STUFF DONE.
In all of my current 44 CPAN distributions, along with my nearly 80 GitHub open source repositories (Perl, C, C++, C#, Python etc), I (with few exceptions) enforce an 80 char limit on the length of the lines of code.
I also do this even in my POD, Changes and test files (again, with some exceptions).
I know that this practice is based on legacy console line-length reasons, but I still like to stick with it, as it keeps things very consistent, and it allows my IDE to display the project layout, two open files side-by-side, and the overview (structure) of the file I'm currently working on clearly and easily.
Even when I'm using just vi/vim on the CLI outside of my IDE of choice, I can count on my code being consistently wide in all aspects.
What are your thoughts here? Many coders I speak to go as far as 120 chars and they say that is helpful, and at $work (Python and C++), there's a 79 char limit and many hate it. Seems as though newer generations prefer longer line lengths, but here I am curious as to what the Perl community feels.
No matter what $something is, it will safely evaluate to either true or false. (With the exceptions of a few edge cases like blessed objects which are overloaded to throw an error when evaluated as booleans.)
So when a Moose class does something like this, what does it mean?
If absolutely any value could work when $self->something was accessed in boolean context, then what need is there to check what value is passed to the constructor? Should Bool basically be the same as Any, just spelled differently for documentation purposes?
So what does Moose do? The documentation says:
Bool accepts 1 for true, and undef, 0, or the empty string as false.
However, that's not the full story. Blessed objects which overload stringification are accepted, but only if the stringification returns the strings "0", "1", or the empty string at the time the type constraint is checked. If the object stringifies to something else, but also overloads boolification sensibly, then too bad. Of course when you write if ($self->something) it's the boolification overloading which matters, but Moose only checks the stringification overloading.
Moose's support for objects that overload stringification as booleans is not explicitly documented, nor is it covered at all by the Moose test suite.
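The mismatch between boolean context and a string-based check can be reproduced with nothing but core Perl. In the sketch below, the `Flag` class and the `looks_like_documented_bool` helper are invented for illustration; the helper only mirrors the rule quoted from the documentation (undef, "", "0" or "1") and is not Moose's actual implementation.

```perl
use strict;
use warnings;

package Flag {    # hypothetical class that overloads both conversions
    use overload
        'bool'   => sub { $_[0]{on} },
        '""'     => sub { $_[0]{on} ? 'yes' : 'no' },
        fallback => 1;
    sub new { my ($class, $on) = @_; return bless { on => $on }, $class }
}

# Sketch of the documented rule only: undef, '', '0' and '1' pass.
# Because it compares strings, an object is judged by its stringification.
sub looks_like_documented_bool {
    my ($value) = @_;
    return 1 if !defined $value;
    return $value eq '' || $value eq '0' || $value eq '1' ? 1 : 0;
}

my $flag = Flag->new(1);
print "boolean context: ", ($flag ? 'true' : 'false'), "\n";
print "passes the string-based check: ",
    (looks_like_documented_bool($flag) ? 'yes' : 'no'), "\n";
```

Here the object is perfectly usable in `if ($flag)`, yet it fails the string-based check because it stringifies to "yes".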
What does Mouse do? Well, that's even weirder. It mostly follows Moose's documented behaviour. It accepts "1" for true, and "0", undef, and the empty string for false. But also, it accepts objects overloading boolification for false. Yes, that's right — if you overload boolification to return true, it will fail the type check. Overload it to return false, and you're golden!
So where does this leave my module Types::Standard? Well, the pure Perl implementation follows what Moose does, and the (optional) XS implementation is forked from Mouse.
For the latest release of the XS version, I've dropped support for objects which overload boolification to return false, bringing it in line with Moose's documented behaviour. I plan for the pure Perl implementation to also follow suit, dropping support for objects which overload stringification to return a boolean value.
If you need support for objects overloading boolification, a quick workaround is this:
has something => (
    is  => 'ro',
    isa => 'Any', # Bool
);
In the case of read-only attributes, I happen to believe accepting a blessed object as a boolean value could be harmful. The contents of the object could later change, changing the value from true to false, or vice versa, despite its read-only-ness.
As I get sucked deeper and deeper into web scrapers (the Cosmo Kramers of our era), constantly doing so with my faithful companion, LWP::UserAgent, the need arose, primarily out of courtesy to the hosts, to count the number of requests (hits) I made over a certain time interval and to hold the scraper back by sleep()ing for some time.
Eventually, I decided I wanted to know the ratio of active hitting sessions to sleep times, and also to control and tweak the hit rate, and the subsequent burden on the host, for particular traffic situations (late night or noon) with just a few parameters, mainly the sleep() durations between the various phases of scraping and form filling. The latter can be imagined as a complex state machine whose paths are deterministic (most of the time) but highly complex.
And so I have devised two methods/tools to assist me in my endeavours: one is a hit counter for LWP::UserAgent and the other is a counter of sleep() seconds which works across all sleep() calls, even in far and foreign modules.
I will proceed now to lay out a module-based implementation of a so-called UserAgent-with-Stats, including a test script.
The basic idea is to subclass LWP::UserAgent in order to add a handler (via set_handler), when requested by the user, to the "request_send" phase of LWP's request(). The purpose of this handler is to increment our internal hit counter every time a request is sent by LWP (GET/POST/etc.).
Additionally, there are two time counters to assist us in calculating the time interval between when the counter was turned on and either the last hit or when it was turned off. The aim is to be able to know the number of hits that occurred within a time interval. Thinking about it, maybe a time-of-first-hit to time-of-last-hit interval makes more sense.
Now, one may ask why there is a need to subclass rather than create a new class which takes an LWP::UserAgent object, adds a handler to it and keeps the counters. Indeed, that is another possibility.
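As a rough illustration of the subclassing idea (not the module described in this post), the sketch below counts requests in a "request_send" handler installed with LWP::UserAgent's `add_handler`. The class name `UserAgentWithStats`, its methods and the `stats_*` hash keys are all assumptions made for the example; it stores its counters directly in the user agent's hash, which assumes those key names do not clash with LWP's own.

```perl
package UserAgentWithStats;    # hypothetical class name
use strict;
use warnings;
use parent 'LWP::UserAgent';

# Install the counting handler once; count_on/count_off only flip a flag.
sub new {
    my ($class, @args) = @_;
    my $self = $class->SUPER::new(@args);
    @{$self}{qw(stats_on stats_hits)} = (0, 0);
    $self->add_handler(
        request_send => sub {
            my ($request, $ua, $handler) = @_;
            if ($ua->{stats_on}) {
                $ua->{stats_hits}++;
                $ua->{stats_first_hit} //= time;
                $ua->{stats_last_hit} = time;
            }
            return;    # returning nothing lets LWP send the request as usual
        }
    );
    return $self;
}

sub count_on  { $_[0]{stats_on} = 1; return $_[0] }
sub count_off { $_[0]{stats_on} = 0; return $_[0] }
sub hits      { return $_[0]{stats_hits} }

# Seconds between the first and last counted hit (0 if nothing was counted).
sub hit_interval {
    my ($self) = @_;
    return 0 unless defined $self->{stats_first_hit};
    return $self->{stats_last_hit} - $self->{stats_first_hit};
}

package main;

my $ua = UserAgentWithStats->new(timeout => 10);
$ua->count_on;
# A file: URL keeps this demo off the network; the response is an error,
# but the request still passes through the request_send phase.
$ua->get('file:///no/such/file') for 1 .. 3;
print "hits: ", $ua->hits, "\n";
```

The handler uses the `$ua` argument LWP passes to it rather than closing over `$self`, which avoids a reference cycle between the object and its own handler.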
In any event, that's the basic idea. I would like to ask for your comments, corrections and recommendations. I will do the same for the sleep-count module in my next post.
Your mother was X.500 and your father smells of RFCs! Now go away or I shall mock you a second time!
-- from the original draft of Monty Python and the Holy Grail.
I finally buckled down and started to mock the LDAP server in my tests rather than trying to connect
to a live server. Other than the documentation, there aren't many examples out there for Test::Net::LDAP::Mock or Test::Net::LDAP::Util, so here are the results from banging away at it for an afternoon, along with what I think is going on. Please feel free to point out what I've done wrong. I've stopped where it started working for me.
I have a Mojolicious app that authenticates against LDAP, but the tests would fail when using dummy accounts or when I wasn't connected.
Here's the test I wrote
Set up your test environment as usual and use Test::Net::LDAP::Util qw/ldap_mockify/;
The ldap_mockify method intercepts all calls to Net::LDAP->new()
and redirects them to your mocked LDAP directory.
Create a new Net::LDAP object
Use the object to populate your mocked server with data using the add method
If you want to mock the authentication process, use the mock_bind method with a callback that
returns LDAP_SUCCESS or LDAP_INVALID_CREDENTIALS
Now that your LDAP server is all mocked up, run your tests
Don't forget the }; at the end of the method. It's a funny error message when you forget the semicolon at the end.
Notes
- The $basedn that you mock has to be the same as the base DN that you search in your application.
- This is easier if you keep the values in a config file and read the same file in your test (not shown here for brevity).
- When testing authentication, you don't set the password for an entry with mock_password(); instead you supply mock_bind() with a callback.
- If you haven't imported Net::LDAP::Constant, you'll need to use the fully qualified name to report success (Net::LDAP::Constant::LDAP_SUCCESS) or failure (Net::LDAP::Constant::LDAP_INVALID_CREDENTIALS).
- Most of the methods in Test::Net::LDAP::Util seem to want to return success, regardless of the underlying data, which can be frustrating until you work that out and code accordingly.
Edit: while cleaning up tabs used for putting this post together, I found a relevant question on StackOverflow from 5 years ago, but it hasn't been answered so far.
Ea
Sometimes I can think of 6 impossible LDAP attributes before breakfast.
What do you think of the General Data Protection Regulation (GDPR)? I'm interested to know if your companies are behind it or minimally complying, and more interested to know if you think individuals ought to have the rights expressed in that law and if there is really a moral obligation on site owners to comply. Or whether it should be scrapped or changed.
The right of erasure specifically contradicts PM policy, which is defended with the same argument that Wikipedia uses, the "memory hole" argument. If one user decides to revoke the site owner's permission to use their nodes, that creates a hole in the chain, and every user is negatively affected. That is a pretty utilitarian viewpoint. It smells slightly self-serving to me to hear that argument from sites whose success directly rides on user-generated content.
It really only benefits future users, because if you were there, you don't need a tattoo of the conversation to remember it later. I don't see that a site owner, especially if it's not the hoster (i.e. back in time machines), gets a perpetual license after you leave. Take recipe sites: let's say you participate for years honing the craft and eventually decide to write a cookbook. Do you never have the right to pull your recipes off the boards and make the world pay for your stuff? But your dishes have probably benefited from all that recipe sharing, so it seems you would owe something too.
I can't help but think of the social contract put forth in Crito. You have a good idea of what you are getting into when you participate online, seems reasonable that the site architects who built your playground would be able to dictate the terms, but I don't see how they have the right to continue to do so once you leave.
I googled social contract, copyright law and landlord-tenant law, and looked up about 10 web sites that were closing down or blocking EU customers, but I can't make up my mind. There seems to be a lot of data players operating in the shadows without consent that should be addressed, but I can't see how it affects my life at all. I see an ad about something I almost bought on Amazon, big deal.
Well surely we do not live in a perfect world, but does the GDPR move the decimal point either direction? Or just adding more compliance factories to the world? And who are the people who wrote the bill that made me get all this TOS spam. I tried to find the authors' names and I could not. Maybe this is a stepping stone to better "digital rights"?
I have for a time entertained the idea, that if God is the creator, he would have left his signature in the DNA of human species. If I was the creator, I would have encoded the entire Hebrew Bible in DNA, so to let no one doubt that DNA was created by God and that the Bible is the word of God.
I finally took up the challenge and wrote a Perl script to check if the first five verses of the Bible are encoded in DNA. Naturally there is an infinite number of ways to encode information in DNA, but I assumed that God would have used something quite obvious in order for us to be able to find information encoded in DNA. I’m assuming that if the Bible is encoded in DNA, the encoding used would be the same as for protein synthesis, namely that triplets of DNA base pairs would each encode one character. There are 64 possible codons, so there is plenty of redundancy when they are used for encoding the 22 letters of the Hebrew alphabet (plus sofit forms for five characters).
Like so:
AAA -> Y     AAC -> XXX   AAG -> B     AAT -> XXX
ACA -> A     ACC -> M     ACG -> XXX   ACT -> XXX
AGA -> R     AGC -> R     AGG -> W     AGT -> W
ATA -> H     ATC -> XXX   ATG -> A     ATT -> H
CAA -> XXX   CAC -> XXX   CAG -> I     CAT -> XXX
CCA -> XXX   CCC -> XXX   CCG -> XXX   CCT -> V
CGA -> XXX   CGC -> XXX   CGG -> O     CGT -> XXX
CTA -> XXX   CTC -> E     CTG -> V     CTT -> H
GAA -> H     GAC -> I     GAG -> XXX   GAT -> T
GCA -> XXX   GCC -> B     GCG -> XXX   GCT -> H
GGA -> V     GGC -> H     GGG -> Y     GGT -> A
GTA -> Y     GTC -> V     GTG -> A     GTT -> I
TAA -> E     TAC -> XXX   TAG -> XXX   TAT -> O
TCA -> A     TCC -> Y     TCG -> XXX   TCT -> XXX
TGA -> R     TGC -> H     TGG -> A     TGT -> XXX
TTA -> XXX   TTC -> L     TTG -> B     TTT -> A
My dirty little Perl script reads a FASTA file one character at a time, and when a triplet has been read it checks whether that codon is already defined. If it is not, the current character of the target sequence is stored for it in a hash containing all the codons. The algorithm then moves to the next triplet in the DNA, checks whether that triplet is defined, and so on. When a triplet is already defined and the character stored for it does not equal the current character of the target sequence, the script records the maximum length of the sequence found, goes back to the starting point in the DNA, moves forward one base pair and continues the search.
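The search described above can be sketched in a few lines of Perl. This is a reconstruction from the description, not the script itself: the function names are invented, the DNA is assumed to be already loaded into a string of A/C/G/T, and the target is assumed to be a plain string with any spaces removed.

```perl
use strict;
use warnings;

# How many leading characters of $target can be read from $dna starting at
# $offset, building a fresh codon => character map along the way.
sub prefix_length_at {
    my ($dna, $target, $offset) = @_;
    my %codon_to_char;
    my $matched = 0;
    for my $i (0 .. length($target) - 1) {
        my $pos = $offset + 3 * $i;
        last if $pos + 3 > length $dna;          # ran out of DNA
        my $codon = substr $dna, $pos, 3;
        my $char  = substr $target, $i, 1;
        if (exists $codon_to_char{$codon}) {
            last if $codon_to_char{$codon} ne $char;   # conflict: stop here
        }
        else {
            $codon_to_char{$codon} = $char;      # first sighting: define it
        }
        $matched++;
    }
    return $matched;
}

# Try every starting base pair and keep the longest run and where it began.
sub longest_match {
    my ($dna, $target) = @_;
    my ($best_len, $best_offset) = (0, 0);
    for my $offset (0 .. length($dna) - 3) {
        my $len = prefix_length_at($dna, $target, $offset);
        ($best_len, $best_offset) = ($len, $offset) if $len > $best_len;
    }
    return ($best_len, $best_offset);
}

my ($len, $offset) = longest_match('AAACCCAAAGGG', 'ABAC');
print "longest run: $len characters starting at base $offset\n";
```

Because every previously unseen codon is accepted, a short run always matches by chance, which is why the control sequences in the results matter.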
I’m not a computer science expert and I’m sure that my script is dirty and messy, but it does work. It takes 33h to search one target sequence against the 3 billion base pairs of human DNA. The FASTA files are in chunks of roughly 150 million base pairs, so several files need to be checked by hand, but this is not much of a problem. My computer crashes when I try to load more than 10 million base pairs at a time, so the script reads each FASTA file in chunks of 5 million base pairs at a time.
I could not get Hebrew characters to work properly, so I simply transliterated the first five verses of Genesis to ASCII characters. This is a dirty way of going about it, but it works.
Letter    ASCII   Final (sofit) form
Aleph     A
Bet       B
Gimmel    G
Dalet     D
Hey       H
Vav       V
Zayin     Z
Chet      C
Tet       T
Yod       Y
Kaf       K       G
Lamed     L
Mem       M       O
Nun       N       J
Samekh    S
Ayin      X
Pey       P
Tsadi     U       W
Kuf       Q
Resh      R
Shin      E
Tav       I
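If the Hebrew text is decoded into Perl characters first, the table above can be applied with a single `tr///`. The sketch below is one possible way to do that, not the script used for the results: the function name is invented, the Hebrew is written as code points so the source stays ASCII, and final pe (U+05E3), which has no entry in the table, is mapped to "P" as an assumption.

```perl
use strict;
use warnings;

# Transliterate Hebrew consonants (U+05D0..U+05EA) using the table above.
# Final pe (U+05E3) is not in the table; mapping it to 'P' is an assumption.
sub hebrew_to_ascii {
    my ($text) = @_;
    # Drop cantillation marks, vowel points and the punctuation in that range.
    $text =~ tr/\x{0591}-\x{05C7}//d;
    # 27 code points in Unicode order, final forms included.
    $text =~ tr/\x{05D0}-\x{05EA}/ABGDHVZCTYGKLOMJNSXPPWUQREI/;
    return $text;
}

# The first three words of Genesis 1:1, written with code points.
my $hebrew = "\x{05D1}\x{05E8}\x{05D0}\x{05E9}\x{05D9}\x{05EA} "
           . "\x{05D1}\x{05E8}\x{05D0} "
           . "\x{05D0}\x{05DC}\x{05D4}\x{05D9}\x{05DD}";
print hebrew_to_ascii($hebrew), "\n";    # prints "BRAEYI BRA ALHYO"
```

When reading the text from a file, opening it with an encoding layer such as `<:encoding(UTF-8)` is what turns the bytes into characters that `tr///` can match.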
Gen 1:1-5
BRAEYI BRA ALHYO AI HEMYO VAT HARW VHARW HYIH IHV VBHV VCEG XL PNY IHVO
VRVC ALHYO MRCPI AL PNY HMYO VYAMR ALHYO YHY AVR VYHY AVR VYRA ALHYO
AI HAVR KY TVB VYBDL ALHYO BYJ HAVR VBYJ HCEG VYQRA ALHYO LAVR YVO
VLCEG QRA LYLH VYHY XRB BYHY BQR YVO ACD
For control sequences I used Lorem ipsum, War and Peace and a random string. For the control sequences I checked the first one million base pairs only.
The results so far:
Lorem ipsum 42 characters found (250 million searched)
War and peace 35 characters found (one million searched)
Random string 35 characters found (one million searched)
Having checked the Hebrew Bible against 500 million base pairs so far, the maximum sequence found was 45 characters. This is more than the control sequences, but only because many more base pairs were compared. To be sure that the sequence was encoded in DNA by God, I would expect to find a sequence of hundreds of characters, preferably all of the first five verses of Genesis. I’m not a mathematician, so I have not calculated what the maximum sequence length would be if left to chance alone. But the control sequences do give some estimate.
I’m of course assuming that God used the Hebrew Bible, because some say Hebrew is the holy language, but I’ve also checked the King James English for the first verses of Matthew and John. If God is omnipotent, surely he could have encoded the Bible in DNA in any language. In the future I’ll check if New Testament passages are encoded in Greek, but thus far I’m working with the assumption that the most awesome thing for God to do would have been to encode the beginning of Genesis. Will post results when I find anything.
Let me know what you think of my efforts, I know this is nuts.
Edit: Just noticed that PDL 2.019 has been released, this was written against 2.018. There shouldn't be many (if any) changes, but I'll update this comment accordingly once I've checked it over.
Edit2: longlong range fixed.
As there was significant interest in the porting of numpy to PDL documentation, I've been continuing to document my explorations with PDL.
The following document is my own personal PDL 'QuickRef', which I've created as both a reference for myself and as a summary of PDL.
I've tidied it up and now I'm posting it here for others, should they find it useful. Hopefully, it's both useful to new users of PDL (exploring along with perldl shell) and as a reference for experienced users.
Hopefully I've put this in the correct place, but mods feel free to move it if this is the wrong section.
I will continue to update the 100 PDL Exercises offline and will post an updated version incorporating all feedback soon.
PDL QuickRef
Arguably, this is just a rehashing of the existing documentation available via the modules in the PDL::* namespace. However, I found it useful when learning PDL to have everything in a single place.
PDL Creation
Creation of Vectors
The pdl function creates piddles from implicit and explicit scalars and variables. It accepts an optional first argument, $type, which specifies the internal data type of the piddle.
PDL Datatypes
All piddles store matrices of data in the same data type. PDL supports the following datatypes:
Datatype   Internal 'C' type   Valid values
byte       unsigned char       Integer values from 0 to +255
short      short               Integer values from -32,768 to +32,767
ushort     unsigned short      Integer values from 0 to +65,535
long       int                 Integer values from -2,147,483,648 to +2,147,483,647
longlong   long                Integer values from -9,223,372,036,854,775,808 to +9,223,372,036,854,775,807
float      float               Real values from 1.2E-38 to +3.4E+38 with 6 decimal places of precision
double     double              Real values from 2.3E-308 to +1.7E+308 with 15 decimal places of precision
pdl Examples
Row vector from explicit values:
$v = pdl($type, [1,2]);
Column vector from explicit values:
$v = pdl($type, [[1],[2]]); or $v = pdl($type, [1,2])->(*1); (the second form requires PDL::NiceSlice)
Row vector from scalar string:
$v = pdl($type, "1 2 3 4");
Row vector from array of numbers:
$v = pdl($type, @a);
Matrix from explicit values:
$M = pdl($type, [[1,2],[3,4]]);
Matrix from a scalar:
$M = pdl($type, "[1 2] [3 4]");
Piddle Helper Creation Functions
In the following functions, arguments marked as ... are accepted in the following form:
$type - an optional data type (see above)
$x,$y,$z,... - a list of n dimensions for the resulting piddle, OR
$M - another piddle, from which the dimensions will be reused
Sequential integers, starting at zero:
$M = sequence(...);
Sequential Fibonacci values, starting at one:
$M = fibonacci(...);
Of all zeros:
$M = zeros(...);
Of all ones:
$M = ones(...);
Of random values between zero and one:
$M = random(...);
Of Gaussian random values (mean zero, standard deviation one):
$M = grandom(...);
Where each value is its zero-based index along the first dimension:
$M = xvals(...);
Where each value is its zero-based index along the second dimension:
$M = yvals(...);
Where each value is its zero-based index along the third dimension:
$M = zvals(...);
Where each value is its zero-based index along dimension $d:
$M = axisvals(..., $d);
Where each value is its distance from a specified centre:
$M = rvals(..., {Centre=>[x,y,z,...]});
The following functions create piddles with dimensions taken from another piddle, $M, and distribute values between two endpoints ($min and $max) inclusively:
Linearly distributed values along the first dimension:
$N = $M->xlinvals($min, $max);
Linearly distributed values along the second dimension:
$N = $M->ylinvals($min, $max);
Linearly distributed values along the third dimension:
$N = $M->zlinvals($min, $max);
Logarithmically distributed values along the first dimension:
$N = $M->xlogvals($min, $max);
Logarithmically distributed values along the second dimension:
$N = $M->ylogvals($min, $max);
Logarithmically distributed values along the third dimension:
$N = $M->zlogvals($min, $max);
Coordinate Piddles
Finally, the ndcoords utility function creates a piddle of coordinates for the supplied arguments. It may be called in two ways:
$coords = ndcoords($M); - Take dimensions from another piddle
$coords = ndcoords(@dims); - Take dimensions from a Perl list
Piddle Conversion
A piddle can be converted into a different type using the datatype names as a method upon the piddle. This returns the converted piddle as a new piddle. The inplace method does not work with these conversion methods.
Operation
Operator
Convert to byte datatype:
$M->byte; or byte $M;
Convert to short datatype:
$M->short; or short $M;
Convert to ushort datatype:
$M->ushort; or ushort $M;
Convert to long datatype:
$M->long; or long $M;
Convert to longlong datatype:
$M->longlong; or longlong $M;
Convert to float datatype:
$M->float; or float $M;
Convert to double datatype:
$M->double; or double $M;
Obtaining Piddle Information
PDL provides a number of functions to obtain information about piddles:
Description
Code
Return the number of elements:
$M->nelem;
Return the number of dimensions:
$M->ndims;
Return the length of dimension $d:
$M->dim($d);
Return the length of all dimensions as a Perl list:
$M->dims;
Return the length of all dimensions as a piddle:
$M->shape;
Return the datatype of a piddle:
$M->type;
Return general information about a piddle (datatype, dimensions):
$M->info;
Return the memory used by a piddle:
$M->info("%M");
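As a quick illustration of these information methods, here is a minimal sketch on a small piddle built with sequence. The 3x2 shape is just an example chosen for this sketch.

```perl
use strict;
use warnings;
use PDL;

# 3 columns by 2 rows, holding the values 0..5.
my $M = sequence(3, 2);

my $nelem = $M->nelem;   # total number of elements
my $ndims = $M->ndims;   # number of dimensions
my @dims  = $M->dims;    # length of each dimension as a Perl list

print "nelem=$nelem ndims=$ndims dims=@dims\n";
print "type: ", $M->type, "\n";
print "info: ", $M->info, "\n";
```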
Indexing, Slicing and Views
Points To Note
PDL internally stores matrices in column major format. This affects the indexing of piddle elements.
In standard mathematical notation, the element at M_{i,j} will be i elements down and j elements across, with the elements 0 and 3 at M_{1,1} and M_{2,1} respectively.
With PDL indexing, indexes start at zero, and the first two dimensions are 'swapped'. Therefore, the elements 0 and 3 are at PDL indices (0,0) and (0,1) respectively.
Views are References
PDL attempts to do as little work as possible, in that it will try to avoid copying piddle values in memory when it can. The most common operations where this is the case are taking piddle slices or views across a piddle matrix. The piddles returned by these functions are views upon the original data, rather than copies, so modifications to them will affect the original matrix.
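To see the view behaviour in action, here is a small sketch. It uses the plain slice method with a string argument, so it runs without the PDL::NiceSlice source filter; the variable names are only illustrative.

```perl
use strict;
use warnings;
use PDL;

my $M = sequence(3, 3);          # values 0..8

# A view on the first column (dims 1x3); no data is copied.
my $col = $M->slice('0,:');
$col .= 99;                      # writes through to $M

# An explicit copy is independent of the original.
my $copy = $M->slice('1,:')->copy;
$copy .= -1;                     # $M is not affected

print $M;
```

After this runs, the first column of $M holds 99 while the second column is unchanged, because only the view shares memory with $M.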
Slicing
A common operation is to view only a subset of a piddle. This is called slicing.
As slicing is such a common operation, there is a module to implement a shorter syntax for the slice method. This module is PDL::NiceSlice. This document only uses this syntax.
A rectangular slice of a piddle is returned by using the default method on a piddle. This takes up to n arguments, where n is the number of dimensions in the piddle.
Each argument must be one of the following forms:
""
An empty value returns the entire dimension.
n
Return the value at index n into the dimension, keeping the dimension of size one.
(n)
Return the value at index n into the dimension, eliminating the entire dimension.
n:m
Return the range of values from index n to index m inclusive in the dimension. Negative indexes are indexed from the end of the dimension, where -1 is the last element.
n:m:s
Return the range of values from index n to index m with step s inclusive in the dimension. Negative indexes are indexed from the end of the dimension, where -1 is the last element.
Return the first and second column as a 2x3 matrix:
$M->(0:1);
[ [0 1] [3 4] [6 7] ]
Return the first and third row as a 3x2 matrix:
$M->(,0:-1:2);
[ [0 1 2] [6 7 8] ]
Dicing
Occasionally it is required to extract non-contiguous regions along a dimension. This is called dicing. The dice method accepts an array of indices for each dimension, which do not have to be contiguous.
Return the first and third column as a 2x3 matrix:
$M->dice([0,2]);
[ [0 2] [3 5] [6 8] ]
Return the first and third column and the first and third row as a 2x2 matrix:
$M->dice([0,2],[0,2]);
[ [0 2] [6 8] ]
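The slicing and dicing examples can be reproduced with the sketch below. It assumes the same 3x3 matrix of values 0..8 used in the outputs above, and it calls the slice method with string arguments instead of the NiceSlice syntax so that it runs as an ordinary script.

```perl
use strict;
use warnings;
use PDL;

my $M = sequence(3, 3);                  # [[0 1 2] [3 4 5] [6 7 8]]

# First and second column: a 2x3 piddle.
my $cols = $M->slice('0:1,:');

# First and third row: start at 0, stop at the last index, step 2.
my $rows = $M->slice(':,0:-1:2');

# Non-contiguous pick of columns 0 and 2 and rows 0 and 2.
my $corners = $M->dice([0, 2], [0, 2]);

print $cols, $rows, $corners;
```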
Which and Where Clauses
The other common operation to perform over a piddle is to apply a boolean operation over the entire piddle elementwise. This is achieved in PDL with the where method.
The where method accepts a single argument of a boolean operation. The element is referred to within this argument with the same variable name as the piddle. The values in the returned piddle are references to the values in the initial piddle.
In a similar manner to the where clauses outlined above, there is the which function. The difference between the two is that where returns the values, while which returns the indices.
This is best explained with examples over a matrix $M:
Description
Return values
Return indices
Obtain all positive values:
$M->where($M > 0);
which($M > 0);
Obtain all values equal to three:
$M->where($M == 3);
which($M == 3);
Obtain all values which are not zero:
$M->where($M != 0);
which($M != 0);
Note that there is also the which_both function. This function returns an array of two piddles. The first is a list of indices for which the boolean operation was true, the second for which the result was false.
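Here is a minimal sketch of where, which and which_both side by side, using a made-up five-element vector.

```perl
use strict;
use warnings;
use PDL;

my $M = pdl([0, 3, -2, 3, 5]);

# The boolean operation yields a mask piddle of ones and zeros.
my $mask = $M > 0;

my $values  = $M->where($mask);   # the matching values
my $indices = which($mask);       # the positions of those values

# which_both splits the indices into matches and non-matches.
my ($hits, $misses) = which_both($M == 3);

print "values:  $values\n";
print "indices: $indices\n";
print "hits: $hits, misses: $misses\n";
```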
Again, as where clauses are so common, PDL::NiceSlice has syntactic support for them through the default method. This is achieved through an argument modifier, which is appended to the single argument.
The modifiers are separated from the original argument via a ; character, and the following modifiers are supported:
Modifier
Description
?
The argument is no longer a slice, but rather a where clause
_
flatten the piddle to one dimension prior to the operation
-
squeeze the piddle by flattening any dimensions of length one.
|
sever the returned piddle into a copy, rather than a reference
Using this syntax, the following where commands are identical:
PDL contains many functions to modify the view of a piddle. These are outlined below:
Description
Code
Transpose a matrix/vector:
$M->transpose;
Return the multidimensional diagonal over the supplied dimensions:
$M->diagonal(@dims);
Remove any dimensions of length one:
$M->squeeze;
Flatten to one dimension:
$M->flat;
Merge the first $n dimensions into one:
$M->clump($n);
Merge a list of dimensions into one:
$M->clump(@dims);
Exchange the position of zero-indexed dimensions $i and $j:
$M->xchg($i, $j);
Move the position of zero-indexed dimension $d to index $i:
$M->mv($d, $i);
Reorder the index of all dimensions:
$M->reorder(@dims);
Concatenate piddles of the same dimensions into a single piddle of rank n+1:
cat($M, $N, ...);
Split a single piddle into an array of piddles across the last dimension:
($M, $N, ...) = dog($P);
Rotate elements with wrap across the first dimension:
$M->rotate($n);
Given a vector $v, return a matrix, where each column is of length $len, with step $step over the entire vector:
$M->lags($dim, $step, $len);
Normalise a vector to unit length:
$M->norm;
Destructively reshape a matrix to n dimensions, where n is the number of arguments and each argument is the length of each dimension. Any additional values are discarded and any missing values are set to zero:
$M->resize(@dims);
Append piddle $N to piddle $M across the first dimension:
$M->append($N);
Append piddle $N to piddle $M across the dimension with index $dim:
$M->glue($dim, $N);
Matrix Multiplication
PDL supports four main matrix multiplication methods between two piddles of compatible dimensions. These are:
Operation
Code
Matrix product (the x operator):
$M x $N;
Inner product:
$M->inner($N);
Outer product:
$M->outer($N);
Cross product:
$M->crossp($N);
As the x operator is overloaded to perform matrix multiplication, it can also be used to multiply vectors, matrices and scalars.
Operation
Code
Row x matrix = row
$r x $M;
Matrix x column = column
$M x $c;
Matrix x scalar = matrix
$M x 3;
Row x column = scalar
$r x $c;
Column x row = matrix
$c x $r;
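The shapes involved are easier to follow with concrete numbers. The following is just a sketch with a made-up 2x2 matrix, a row vector of dims (2,1) and a column vector of dims (1,2).

```perl
use strict;
use warnings;
use PDL;

my $M = pdl([[1, 2], [3, 4]]);
my $r = pdl([[1, 1]]);           # row vector: 2 columns, 1 row
my $c = pdl([[1], [1]]);         # column vector: 1 column, 2 rows

my $rM = $r x $M;                # row x matrix = row
my $Mc = $M x $c;                # matrix x column = column
my $rc = $r x $c;                # row x column = 1x1 piddle

# Inner product of two plain vectors.
my $dot = pdl([1, 2, 3])->inner(pdl([4, 5, 6]));

print "r x M = $rM\nM x c = $Mc\nr x c = $rc\ninner = $dot\n";
```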
Arithmetic Operations
PDL supports a number of arithmetic operations: elementwise, over an entire piddle, and along the first dimension. Double precision variants are prefixed with d.
Addition:
  Elementwise: $M + $N;
  Over entire PDL: $M->sum; $M->dsum;
  Over 1st dimension: $M->sumover; $M->dsumover;
Subtraction:
  Elementwise: $M - $N;
Product:
  Elementwise: $M * $N;
  Over entire PDL: $M->prod; $M->dprod;
  Over 1st dimension: $M->prodover; $M->dprodover;
Division:
  Elementwise: $M / $N;
Modulo:
  Elementwise: $M % $N;
Raise to the power:
  Elementwise: $M ** $N;
Cumulative addition:
  Over 1st dimension: $M->cumusumover; $M->dcumusumover;
Cumulative product:
  Over 1st dimension: $M->cumuprodover; $M->dcumuprodover;
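The difference between the "entire PDL" and "1st dimension" variants is perhaps clearest on a small example. A minimal sketch, assuming a 3x2 piddle of the values 0..5:

```perl
use strict;
use warnings;
use PDL;

my $M = sequence(3, 2);            # [[0 1 2] [3 4 5]]

my $total    = $M->sum;            # one value for the whole piddle
my $row_sums = $M->sumover;        # one value per row (sums along dim 0)
my $running  = $M->cumusumover;    # running totals along dim 0

print "sum:         $total\n";
print "sumover:     $row_sums\n";
print "cumusumover: $running\n";
```

The "over" variants remove (or, for the cumulative ones, keep) the first dimension and leave any further dimensions in place, which is why sumover returns two values here.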
Comparison Operations:
PDL supports a number of different elementwise comparison functions between matrices of the same shape.
Operation
Elementwise
Equal to:
$M == $N;
Not equal to:
$M != $N;
Greater than:
$M > $N;
Greater than or equal to:
$M >= $N;
Less than:
$M < $N;
Less than or equal to:
$M <= $N;
Compare (spaceship):
$M <=> $N;
Binary Operations
PDL also allows binary operations to occur over piddles. PDL will convert any real number datatype piddles (float, double) to an integer before performing the operation.
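Binary and:
  Elementwise: $M & $N;
  Over entire PDL: $M->band;
  Over 1st dimension: $M->bandover;
Binary or:
  Elementwise: $M | $N;
  Over entire PDL: $M->bor;
  Over 1st dimension: $M->borover;
Binary xor:
  Elementwise: $M ^ $N;
Binary not:
  Elementwise: ~ $M; or $M->bitnot;
Bit shift left:
  Elementwise: $M << $N;
Bit shift right:
  Elementwise: $M >> $N;
Logical and:
  Over entire PDL: $M->and;
  Over 1st dimension: $M->andover;
Logical or:
  Over entire PDL: $M->or;
  Over 1st dimension: $M->orover;
Logical not:
  Elementwise: ! $M; or $M->not;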
Trigonometric Functions
These PDL functions operate in units of radians elementwise over a piddle.
Operation
Elementwise
Sine:
$M->sin;
Cosine:
$M->cos;
Tangent:
$M->tan;
Arcsine:
$M->asin;
Arccosine:
$M->acos;
Arctangent:
$M->atan;
Hyperbolic sine:
$M->sinh;
Hyperbolic cosine:
$M->cosh;
Hyperbolic tangent:
$M->tanh;
Hyperbolic arcsine:
$M->asinh;
Hyperbolic arccosine:
$M->acosh;
Hyperbolic arctangent:
$M->atanh;
Statistical Functions
PDL contains many methods to obtain statistics from piddles. Double precision variants are prefixed with d.
Minimum value:
  Over entire PDL: $M->min;
  Over 1st dimension: $M->minover;
Maximum value:
  Over entire PDL: $M->max;
  Over 1st dimension: $M->maxover;
Minimum and maximum value:
  Over entire PDL: $M->minmax;
  Over 1st dimension: $M->minmaxover;
Minimum value (as indices):
  Over 1st dimension: $M->minover_ind; $M->minover_n_ind;
Maximum value (as indices):
  Over 1st dimension: $M->maxover_ind; $M->maxover_n_ind;
Mean:
  Over entire PDL: $M->avg; $M->davg;
  Over 1st dimension: $M->avgover; $M->davgover;
Median:
  Over entire PDL: $M->median; $M->oddmedian;
  Over 1st dimension: $M->medover; $M->oddmedover;
Mode:
  Over entire PDL: $M->mode;
  Over 1st dimension: $M->modeover;
Percentile:
  Over entire PDL: $M->pct; $M->oddpct;
  Over 1st dimension: $M->pctover; $M->oddpctover;
Elementwise error function:
  $M->erf;
Elementwise complement of the error function:
  $M->erfc;
Elementwise inverse of the error function:
  $M->erfi;
Calculate histogram of $data, with specified $min bin value, bin $step size and $count bins:
  histogram($data, $step, $min, $count);
Calculate weighted histogram of $data with weights $weights, specified $min bin value, bin $step size and $count bins:
  whistogram($data, $weights, $step, $min, $count);
Various statistics:
  Over entire PDL: $M->stats;
  Over 1st dimension: $M->statsover;
The 'various statistics' described above are returned as a Perl array of the following items:
mean
population RMS deviation from the mean
median
minimum
maximum
average absolute deviation
RMS deviation from the mean
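Unpacking that list looks roughly like this. It is a minimal sketch on a made-up five-element vector, and the variable names simply follow the order of the items listed above.

```perl
use strict;
use warnings;
use PDL;

my $v = pdl([1, 2, 3, 4, 10]);

# stats returns seven items in list context, in the order given above.
my ($mean, $prms, $median, $min, $max, $adev, $rms) = $v->stats;

print "mean=$mean median=$median min=$min max=$max\n";
print "prms=$prms adev=$adev rms=$rms\n";
```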
Zero Detection, Sorting, Unique Element Extraction
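Any zero values:
  Over entire PDL: $M->zcheck;
  Over 1st dimension: $M->zcover;
Any non-zero values:
  Over entire PDL: $M->any;
All non-zero values:
  Over entire PDL: $M->all;
Sort (returning values):
  Over entire PDL: $M->qsort;
  Over 1st dimension: $M->qsortvec;
Sort (returning indices):
  Over entire PDL: $M->qsorti;
  Over 1st dimension: $M->qsortveci;
Unique elements:
  Over entire PDL: $M->uniq;
  Over 1st dimension: $M->uniqvec;
Unique elements (returning indices):
  Over entire PDL: $M->uniqind;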
Rounding and Clipping of Values
PDL contains multiple methods to round and clip values. These all operate elementwise over a piddle.
Operation
Elementwise
Round down to the nearest integer:
$M->floor;
Round up to the nearest integer:
$M->ceil;
'Round half to even' to the nearest integer:
$M->rint;
Clamp values to a maximum of $max:
$M->hclip($max);
Clamp values to a minimum of $min:
$M->lclip($min);
Clamp values between a minimum and maximum:
$M->clip($min, $max);
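A short sketch of the rounding and clipping methods on a made-up vector. The values were picked so that the 'round half to even' behaviour of rint is visible.

```perl
use strict;
use warnings;
use PDL;

my $v = pdl([-1.5, 0.5, 2.5, 7]);

my $down    = $v->floor;        # towards negative infinity
my $up      = $v->ceil;         # towards positive infinity
my $nearest = $v->rint;         # halves go to the nearest even integer
my $clipped = $v->clip(0, 3);   # clamp into the range [0, 3]

print "floor: $down\nceil:  $up\nrint:  $nearest\nclip:  $clipped\n";
```

With the default rounding mode, rint sends -1.5 to -2, 0.5 to 0 and 2.5 to 2, so halves do not always round upwards.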
Set Operations
PDL contains methods to treat piddles as sets of values. Mathematically, a set cannot contain the same value twice, but if this happens to be the case with the piddles, PDL takes care of this for you.
Operation
Code
Obtain a mask piddle for values from $M contained within $N:
$M->in($N);
Obtain the values of the intersection of the sets $M and $N:
setops($M, 'AND', $N); or intersect($M, $N);
Obtain the values of the union of the sets $M and $N:
setops($M, 'OR', $N);
Obtain the values which are in sets $M or $N, but not both (union - intersection):
setops($M, 'XOR', $N);
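The set operations can be checked with two small made-up sets, as in this sketch. The mask returned by in has the same shape as the piddle it is called on.

```perl
use strict;
use warnings;
use PDL;

my $M = pdl([1, 2, 3, 4]);
my $N = pdl([3, 4, 5]);

my $mask  = $M->in($N);               # 1 where an element of $M is in $N
my $both  = setops($M, 'AND', $N);    # intersection
my $union = setops($M, 'OR',  $N);    # union
my $xor   = setops($M, 'XOR', $N);    # in one set but not both

print "mask: $mask\nAND: $both\nOR: $union\nXOR: $xor\n";
```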
Kernel Convolution
PDL supports kernel convolution across multiple dimensions:
Description
Code
1-dimensional convolution of matrix $M with kernel $K across first dimension (edges wrap around):
$M->conv1d($K);
1-dimensional convolution of matrix $M with kernel $K across first dimension (edges reflect):
$M->conv1d($K, {Boundary => 'reflect'});
2-dimensional convolution of matrix $M with kernel $K (edges wrap around):
$M->conv2d($K);
2-dimensional convolution of matrix $M with kernel $K (edges reflect):
$M->conv2d($K, {Boundary => 'reflect'});
2-dimensional convolution of matrix $M with kernel $K (edges truncate):
$M->conv2d($K, {Boundary => 'truncate'});
2-dimensional convolution of matrix $M with kernel $K (edges repeat):
$M->conv2d($K, {Boundary => 'replicate'});
Miscellaneous Mathematical Methods
Here is all the other stuff which doesn't fit anywhere else:
I'm curious to know if Perl Monks believe that Perl is still worth learning as a primary programming language. I've used it in the past for some bioinformatics programming and really like the language. I'm interested in a change of career and wonder if it is worth the time investment to really learn Perl in depth, with the aim of becoming a Perl developer some time in the future. I know that there are a lot of 'trendier' programming languages out there like Python, PHP and Ruby. My logic behind learning Perl is that there are fewer people learning it compared to other languages. I assume that the market is awash with programmers using these other languages and that there might be a niche for Perl programmers. Does anyone here work as a professional Perl programmer, either as an employee or a freelancer? Is there a future in Perl programming, or is a lot of the work migrating Perl to another platform? Are any startups still using Perl frameworks like Catalyst?
I've been trying to slowly learn PDL over the last few months. While I'm aware of some available documentation (the PDL::* perldoc, The PDL Book, etc.) I've found the beginner documentation to be lacking. Therefore, I thought it would be a good idea to start 'porting' some numpy documentation to PDL for new users such as myself.
I've started with 100 numpy exercises, and this is the work in progress port to Perl/PDL.
As I'm still learning PDL, some solutions may be less than optimal, while others do not currently have solutions as they are outside of my level of competency. Therefore, I'm posting this WIP to PM to ask for comments and contributions.
As with most of Perl, there is more than one way to do it for most of these. I've decided to keep the $var->function() syntax as much as possible to easily be able to chain operations.
46. Create a structured array with x and y coordinates covering the [0,1]x[0,1] area.
n/a
47. Given two arrays, X and Y, construct the Cauchy matrix C (Cij = 1/(xi - yj)):
TODO
48. Print the minimum and maximum representable value for each data type:
# This cannot be done directly, but you can extract the underlying
# C type used for each PDL type:
print byte->realctype;
print short->realctype;
print ushort->realctype;
print long->realctype;
print longlong->realctype;
print indx->realctype;
print float->realctype;
print double->realctype;
73. Consider a set of 10 triplets describing 10 triangles (with shared vertices), find the set of unique line segments composing all the triangles:
TODO
74. Given an array C that is a bincount, how to produce an array A such that np.bincount(A) == C?
TODO
75. How to compute averages using a sliding window over an array?
TODO
76. Consider a one-dimensional array Z, build a two-dimensional array whose first row is (Z[0],Z[1],Z[2]) and each subsequent row is shifted by 1 (last row should be (Z[-3],Z[-2],Z[-1])):
TODO
77. How to negate a boolean, or to change the sign of a float inplace?
my $z = long 2 * random(10);
$z = not $z;
print $z;
$z = 5 + sequence(10);
$z = -1 * $z;
print $z;
78. Consider 2 sets of points P0,P1 describing lines (2d) and a point p, how to compute distance from p to each line i (P0[i],P1[i])?
TODO
79. Consider 2 sets of points P0,P1 describing lines (2d) and a set of points P, how to compute distance from each point j (P[j]) to each line i (P0[i],P1[i])?
TODO
80. Consider an arbitrary array, write a function that extracts a subpart with a fixed shape and centered on a given element (pad with a fill value when necessary):
TODO
81. Consider an array Z = [1,2,3,4,5,6,7,8,9,10,11,12,13,14], how to generate an array R = [[1,2,3,4], [2,3,4,5], [3,4,5,6], ..., [11,12,13,14]]:
85. Create a 2D array subclass such that Z[i,j] == Z[j,i]:
TODO
86. Consider a set of p matrices with shape (n,n) and a set of p vectors with shape (n,1). How to compute the sum of the p matrix products at once? (result has shape (n,1))
TODO
87. Consider a 16x16 array, how to get the blocksum (block size is 4x4)?
TODO
88. How to implement the Game of Life using PDL arrays?
TODO
89. How to get the n largest values of an array:
my $z = 10 * random(20);
my $n = 3;
print $z->qsort->(-$n:);
93. Consider two arrays A and B of shape (8,3) and (2,2). How to find rows of A that contain elements of each row of B regardless of the order of the elements in B?
TODO
94. Considering a 10x3 matrix, extract rows with unequal values (e.g. [2,2,3]):
TODO
95. Convert a vector of ints into a matrix binary representation:
my $z = pdl [0,1,2,3,15,16,32,64,128];
my $bits = ($z->transpose & (2 ** xvals(9)));
$bits->where($bits > 0) .= 1;
print $bits;
97. Considering 2 vectors A & B, write the einsum equivalent of inner, outer, sum, and mul function:
TODO
98. Considering a path described by two vectors (X,Y), how to sample it using equidistant samples:
TODO
99. Given an integer n and a 2D array X, select from X the rows which can be interpreted as draws from a multinomial distribution with n degrees, i.e., the rows which only contain integers and which sum to n:
TODO
100. Compute bootstrapped 95% confidence intervals for the mean of a 1D array X (i.e., resample the elements of an array with replacement N times, compute the mean of each sample, and then compute percentiles over the means):
TODO
edit: link to PDL for those who do not know what it is