
The Monastery Gates


New Questions
Writing a Perl extension framework for Inkscape
5 direct replies — Read more / Contribute
by fdesar
on Jan 22, 2019 at 08:49

    Hi, Monks,

    Sorry for this long first post, I just want to explain as clearly as possible what I'm attempting to do...


    I'm a newbie at the monastery but not a perl newbie : I started using perl with version 4, beginning of the 90's...

    A few years ago I started doing paper-folding artwork, also known as 'Origami' (折り紙 or おりがみ), and I use a very popular open-source vector-drawing program called Inkscape for drawing Origami diagrams.

    Diagramming Origami requires some very specific drawing tools that solve geometric problems, such as finding where a crease will fall according to the axioms of Origami. This is sometimes tedious so, as Inkscape allows it, I decided to write a dedicated extension to do that. Of course, I wrote it in Perl, since Inkscape allows various scripting languages for extensions, but its developers clearly prefer Python: all the docs and templates for extensions are written for and in Python. So I decided to write my own tools to get my extension working with Perl, and now it works flawlessly.

    As I've always been an open-source enthusiast, I naturally want to share my work with others. On Linux or other Unices, no problem: Perl is always available, and adding a few modules to the standard core installation is sufficient (XML::LibXML and Locale::gettext, as I also added support for I18n). But unfortunately, there exists a very widely used bloated thing called Windows on which many ordinary people get stuck. And here starts the problem: neither Perl nor Python is natively available in those environments.

    I solved the problem by suggesting ActivePerl, and it works quite well, but... it is definitely *not* free software (as in free speech, not free beer) and I don't like that!

    The Inkscape Team solved this issue by optionally integrating a Python 2.7 chain into their Windows distribution, as they already provide a lot of useful extensions written in Python. But for Perl, nothing, nada. Which I don't like either, and it makes me angry (but I keep zen).

    So I had a very strange idea : building an integrated Perl framework for Inkscape on Windows that would install on top of standard Inkscape installation that would allow Perl extension to gracefully integrate with it. And if it works, maybe I'll be able to convince the Inkscape team to integrate it later on as an option within their main distribution, who knows...

    For that, I need to do a number of things, all theoretically feasible I think :

    1. Build a native Perl on Windows relative to the Inkscape installation directory (having an @INC pointing to (?)/Inkscape/lib/perl5)

    2. Build the XML::LibXML module for it (absolutely mandatory as Inkscape works upon SVG)

    3. Build a few other modules such as Locale::gettext, as gettext is their I18n implementation

    4. Write a basic framework (in fact a module) for building extensions (mostly already done as I already wrote one for my own use but it needs to be more generalized/structured), as there is one for Python

    5. Write an installer to setup things easily for dummy end-users (mainly copying the Perl executable and libraries into the installed Inkscape directory)

    If all of this seems weird to you, just tell me and stop reading further...

    Now, the problem:

    I'm stuck on stage 1: building Perl on a windows system !

    I tried many things, read a lot of threads, and used a lot of configurations, from Strawberry Perl and MinGW-w64 (MSYS2) to Visual Studio C++ Community 2017, and everything failed:

    - Strawberry Perl: no success. I don't remember why I decided to give up, probably because, using cygwin, it won't build a native Windows Perl...

    - MinGW-w64 (MSYS2): no success. I get a fatal compile error for a missing poll.h include file

    - VS 2017 community is much better, as nmake completes the compilation correctly, but I get a fatal error running nmake test:

        Test Summary Report
        -------------------
        ../cpan/File-Temp/t/mktemp.t (Wstat: 6400 Tests: 5 Failed: 0)
          Non-zero exit status: 25
          Parse errors: Bad plan. You planned 9 tests but ran 5.
        ../ext/File-Find/t/find.t (Wstat: 3328 Tests: 125 Failed: 0)
          Non-zero exit status: 13
          Parse errors: Bad plan. You planned 137 tests but ran 125.
        ../ext/IPC-Open3/t/IPC-Open3.t (Wstat: 0 Tests: 45 Failed: 0)
          TODO passed: 25
        ../ext/XS-APItest/t/locale.t (Wstat: 2304 Tests: 2 Failed: 0)
          Non-zero exit status: 9
          Parse errors: No plan found in TAP output
        Files=2666, Tests=1088273, 1731 wallclock secs (66.19 usr + 6.66 sys = 72.84 CPU)
        Result: FAIL
        NMAKE : fatal error U1077: '.\perl.exe' : code retour '0x3'
        Stop.

    and, of course, nmake install won't work.

    And finally, the question:

    What to try next ?

    If anyone's interested in helping me mongering Perl on Inkscape, you're very welcome ;-)

    PS : my extension is available to download and test at

Creating tutorial POD that refers to source code lines
2 direct replies — Read more / Contribute
by perlancar
on Jan 22, 2019 at 02:38

    I'm planning to package some of my tutorials on my blog and upload it as CPAN distribution. Here's an example blog post. Some of these posts present a piece of source code and discuss it part by part, mentioning line number or line number ranges (BTW, as you see, WordPress shows line numbers and allows us to highlight some lines.)

    MetaCPAN also has a source code viewer which shows line number and can highlight a single line (example). I'm not sure if the source code viewer can be instructed to highlight multiple lines. In my tutorial POD (converted from the tutorial blog post) I can link to a specific line. The drawback is: 1) the link is specific to MetaCPAN; 2) the source code needs to be in a separate file in the distribution. For source code embedded in the POD as verbatim paragraphs, by default there is no line number or line highlighting.

    Any advice on how to make this kind of tutorial also convenient to read on MetaCPAN?

Error: Attempt to reload module.. while testing failing require
1 direct reply — Read more / Contribute
by Discipulus
on Jan 20, 2019 at 13:53
    Hello monks,

    Ok, in the end I've managed to resolve it on my own (by deleting from %INC), but I still have doubts.

    I'm trying to test a different executable path to be used in a module and specified by setting an %ENV variable.

    I thought I could use die inside a BEGIN block in the module if the specified executable is incorrect:

        # Win32::Backup::Robocopy code
        BEGIN {
            if ( $ENV{PERL_ROBOCOPY_EXE} ) {
                my $robo = File::Spec->rel2abs( $ENV{PERL_ROBOCOPY_EXE} );
                die "$robo does not exists!" unless -e $robo;
                die "$robo is not executable!" unless -x $robo;
            }
        }

    It seemed ok to me on quick test: the module dies if rubbish is passed (probably the check must be improved.. suggestions?)

    But when testing this feature I discovered the module is present in %INC anyway, even if it died (does the require return true? How can I return 0 from the BEGIN block?).

    So with this version of the test (update: the same errors appear even without all the BEGIN blocks):

        #!perl
        use 5.010;
        use strict;
        use warnings;
        use Test::More;
        use Test::Exception;

        BEGIN # 1) this file should not exists
        {
            my $try = 'c:\robocopy.exe';
            note("try using $try");
            $ENV{PERL_ROBOCOPY_EXE} = $try;
            dies_ok { require Win32::Backup::Robocopy }
                'expected to die with not existent executable';
            #delete $INC{'Win32/Backup/'};
            print "--->$_ \n" for grep{/Robocopy/} keys %INC;
        }

        BEGIN # 2) the HOSTS file exists in every win version, but is not executable
        {
            my $try = -e 'C:\Windows\System32\drivers\etc\HOSTS' ?
                'C:\Windows\System32\drivers\etc\HOSTS' :
                # systenative used if a 32bit perl. See
                # filesystem redirection oddities
                'C:\Windows\Sysnative\drivers\etc\HOSTS';
            note("try using $try");
            $ENV{PERL_ROBOCOPY_EXE} = $try;
            dies_ok { require Win32::Backup::Robocopy }
                'expected to die with not executable file';
            #delete $INC{'Win32/Backup/'};
            print "--->$_\n" for grep{/Robocopy/} keys %INC;
        }

        BEGIN # 3) this has to be the standard one
        {
            my $try = -e 'C:\Windows\System32\robocopy.exe' ?
                'C:\Windows\System32\robocopy.exe' :
                'C:\Windows\Sysnative\robocopy.exe';
            note("try using $try");
            $ENV{PERL_ROBOCOPY_EXE} = $try;
            #delete $INC{'Win32::Backup::Robocopy'};
            ok (require Win32::Backup::Robocopy, 'default executable path');
        }

        done_testing();

    I got Attempt to reload Win32/Backup/ aborted.

        # try using c:\robocopy.exe
        ok 1 - expected to die with not existent executable
        --->Win32/Backup/
        # try using C:\Windows\System32\drivers\etc\HOSTS
        ok 2 - expected to die with not executable file
        --->Win32/Backup/
        # try using C:\Windows\System32\robocopy.exe
        Attempt to reload Win32/Backup/ aborted.
        Compilation failed in require at .\t\01-ENV.t line 45.
        BEGIN failed--compilation aborted at .\t\01-ENV.t line 46.

    Uncommenting the delete, the test runs OK, but why is the module in %INC if the require died? Are there better ways to do this?
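
The behaviour described above can be reproduced with core modules only. The following is a minimal sketch, not the code of Win32::Backup::Robocopy: `PickyDemo` and the `PICKY_DEMO_FAIL` variable are hypothetical names invented for the illustration. It relies on the documented behaviour of `require`: when a file fails to compile, its %INC entry is left in place with an undefined value, and a later `require` of the same file aborts with "Attempt to reload" until the entry is deleted.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Hypothetical throwaway module that refuses to compile on request.
my $dir  = tempdir( CLEANUP => 1 );
my $file = File::Spec->catfile( $dir, 'PickyDemo.pm' );
open my $fh, '>', $file or die "Cannot write $file: $!";
print {$fh} <<'EOM';
package PickyDemo;
BEGIN { die "bad executable\n" if $ENV{PICKY_DEMO_FAIL} }
1;
EOM
close $fh or die "Cannot close $file: $!";
unshift @INC, $dir;

# Returns 'loaded' on success, or the error text on failure.
sub try_require {
    my $ok = eval { require PickyDemo; 1 };
    return $ok ? 'loaded' : "failed: $@";
}

# First attempt: the BEGIN block dies, so compilation fails.
$ENV{PICKY_DEMO_FAIL} = 1;
my $first         = try_require();
my $entry_exists  = exists $INC{'PickyDemo.pm'}  ? 1 : 0;
my $entry_defined = defined $INC{'PickyDemo.pm'} ? 1 : 0;

# Second attempt: the environment is fine now, but the stale entry blocks it.
$ENV{PICKY_DEMO_FAIL} = 0;
my $second = try_require();

# Third attempt: after deleting the entry the file is compiled from scratch.
delete $INC{'PickyDemo.pm'};
my $third = try_require();

print "$_\n" for $first, "exists=$entry_exists defined=$entry_defined", $second, $third;
```

So the key exists in %INC but its value is undefined, which is why `keys %INC` lists the module even though it never loaded. Deleting the key with the full file-style name between attempts, as in the commented-out lines of the test, is one way to retry.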


    There are no rules, there are no thumbs..
    Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
taint mode differences between ActiveState Perl and cPanel Perl
1 direct reply — Read more / Contribute
by Dandello
on Jan 19, 2019 at 11:15

    I'm trying to 'detaint' a very large legacy script. It now runs properly with strict and warnings (finally) but ActiveState Perl v5.24.3 (on Win7 on an offline Wamp server) and Perl v5.16.3 (on *nix on a commercially hosted VPS using Apache and cPanel) give me different results when 'use tainting' is active.

    This is the first time I've had issues (aside from when cleaning up deprecated code) with the differences between ActiveState Perl and what c-Panel installs.

    Now, I know there is tainted incoming data - but ActiveState Perl only gave me some of the offenders, not all the offenders.

    Any hints on how to 'detaint' on the off-line server when tainting says everything is fine?
Improving speed match arrays with fuzzy logic
4 direct replies — Read more / Contribute
by Takamoto
on Jan 18, 2019 at 12:57

    Hello monks,

    I have a first long list (array) of n-grams (the array varies in length, typically 10.000 elements) and a second, even bigger reference list of n-grams (~500.000 elements). I need to check if elements of list 1 are present in list 2. The match must be fuzzy, i.e. both lists are words, and I also need to match very similar words (basically to match singulars and plurals, small differences in spelling, and so on). My script (very naive) works, but I'd like to hear some suggestions from you to improve its speed (which is quite crucial in my application). I already had a BIG gain in time switching from Text::Fuzzy::PP to Text::Fuzzy, but surely the script could be further improved. Data looks like this (each line is one element of multi word units):

        IP exchange
        IPX
        Internet Protocol Exchange
        DPM
        Data Point Model

    My code

        use strict;
        use warnings;
        use Text::Fuzzy;
        use Data::Dumper;

        # my first list. I want to check if its elements are in my reference list $referenceNgrams
        my @stringsToMatch = ("extra-articular arthrodesis", "malnutritions");
        my @goodmatches;

        my $referenceNgrams = "EN.txt"; # my huge reference file (second list)
        print "Loading n-grams from $referenceNgrams\n";
        open my $inFH, '<:encoding(UTF-8)', $referenceNgrams or die;
        chomp(my @Corpus = <$inFH>);
        close $inFH;

        # Matching with Fuzzy
        foreach my $stringToMatch (@stringsToMatch){
            print "Working on $stringToMatch\n";
            foreach my $corpusElement (@Corpus){
                # matching only if $stringToMatch has the same amount of elements of $corpusElement (to save time?)
                my $elementsInstringToMatch = 1 + ($stringToMatch =~ tr{ }{ });
                my $elementsIncorpusElement = 1 + ($corpusElement =~ tr{ }{ });
                if ($elementsIncorpusElement eq $elementsInstringToMatch){
                    my $tf = Text::Fuzzy->new ($stringToMatch);
                    my $distance = $tf->distance ($corpusElement);
                    if ($distance < 2){ # sensibility
                        push (@goodmatches, $stringToMatch);
                        last; # go out of loop if match has been found
                    }
                }
            }
        }
        print "Good matches:\n";
        print Dumper @goodmatches;
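
One possible direction for the speed question, offered as a sketch and not as a measured result: index the reference list once by word count and string length, then only compute distances for candidates whose length is within the allowed distance of the query (an edit distance can never be smaller than the difference in length). The pure-Perl `edit_distance` below is only a stand-in so the example runs without extra modules; in the real script the distance call would stay with Text::Fuzzy. The names `build_index` and `fuzzy_matches` are hypothetical.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Plain Levenshtein distance (stand-in for the Text::Fuzzy distance call).
sub edit_distance {
    my ( $s, $t ) = @_;
    my @prev = 0 .. length $t;
    for my $i ( 1 .. length $s ) {
        my @cur = ($i);
        for my $j ( 1 .. length $t ) {
            my $cost = substr( $s, $i - 1, 1 ) eq substr( $t, $j - 1, 1 ) ? 0 : 1;
            push @cur, min( $prev[$j] + 1, $cur[ $j - 1 ] + 1, $prev[ $j - 1 ] + $cost );
        }
        @prev = @cur;
    }
    return $prev[-1];
}

# Group the reference entries once: word count => string length => entries.
sub build_index {
    my @corpus = @_;
    my %index;
    for my $entry (@corpus) {
        my $words = 1 + ( $entry =~ tr/ // );
        my $len   = length $entry;
        push @{ $index{$words}{$len} }, $entry;
    }
    return \%index;
}

# Return the queries that have a reference entry within $max_distance edits.
sub fuzzy_matches {
    my ( $index, $max_distance, @queries ) = @_;
    my @good;
    QUERY: for my $query (@queries) {
        my $words     = 1 + ( $query =~ tr/ // );
        my $by_length = $index->{$words} or next QUERY;
        my $len       = length $query;
        for my $cand_len ( $len - $max_distance .. $len + $max_distance ) {
            for my $candidate ( @{ $by_length->{$cand_len} || [] } ) {
                if ( edit_distance( $query, $candidate ) <= $max_distance ) {
                    push @good, $query;
                    next QUERY;    # stop at the first match, as in the original loop
                }
            }
        }
    }
    return @good;
}

my $index = build_index( 'malnutrition', 'extra-articular arthrodesis', 'data point model' );
my @good  = fuzzy_matches( $index, 1, 'malnutritions', 'extra-articular arthrodesis', 'unrelated' );
print "$_\n" for @good;
```

The word counts of the 500.000 reference entries are computed once here, where the original loop recomputes them for every query. Whether this helps enough in practice depends on how the lengths are distributed in the real data.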
[OT] Accessing python3's print() of floating point values
2 direct replies — Read more / Contribute
by syphilis
on Jan 18, 2019 at 08:08

    With python3 installed, I'm able to access (from within a perl script) the values that python3 prints out for a particular double ($d) by shelling out as follows:
        use strict;
        use warnings;

        my $d = 2 ** -1074;
        my $py = `python3 -c \"print($d)\"`;
        print "$py\n"; # prints 5e-324
    Now, that's about the full extent of my python3 skills, and it works well enough for my purposes even though shelling out is quite an expensive operation.

    But that only works as I intend if $d is a 'double' - ie when $Config{nvtype} is 'double'.
    How do I get access to the output of 'long double' and '__float128' values in python3 - ie when perl's nvtype (and hence $d) is 'long double' or '__float128' ?
    I couldn't quickly google up an answer to that question ... so I'm asking here in the event that some kind soul might be able to help me out.

    <pathetic cringe>
    By way of explanation:
    Python3 has a rather nice approach to the (base 10) presentation of floating point values. It will provide as few digits as are needed. The script above is a good example. Perl will tell you:
        C:\>perl -le "print 2 ** -1074;"
        4.94065645841247e-324
    And python3 will claim:
        $ python3 -c "print(2 ** -1074)"
        5e-324
    The 2 values appear to be different ... but they're not, and perl will even tell you so:
        C:\>perl -le "print 'ok' if 5e-324 == 4.94065645841247e-324;"
        ok
    It therefore begs the question "Why go to the (obfuscating ?) trouble of providing all of those extra digits when they don't provide greater accuracy ?". It's a good question - though there are answers and the question is not necessarily a rhetorical one.

    Anyway, I've written a perl implementation (using Math::MPFR, Math::GMPz, Math::GMPq) of the python3 approach to base 10 output of floating point values, and I need to test it.
    The internal consistency checks are looking fine, but I also want to check against some trustworthy external source - largely to check that I've got the standardised output working as per the .... ummm .... standard.
    That's why I'm referencing python3, and it's checking out well for double precision floating point values.
    Now it's just a matter of checking the 'long double' and '__float128' precision values.
    Hence my request for assistance.
    </pathetic cringe>
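
For double precision, a brute-force cross-check can be written in a few lines of Perl. This is a sketch under stated assumptions and not the algorithm from the book mentioned in the post: it simply tries increasing precision with `sprintf` until the text reads back as the same value. It assumes that nvtype is 'double' (so 17 significant digits always suffice) and that both `sprintf` and perl's string-to-number conversion are correctly rounded on the platform. The output is in `%g` notation, so it can differ cosmetically from what python3 prints. The name `shortest_roundtrip` is hypothetical.

```perl
use strict;
use warnings;

# Return the shortest "%.Ng" rendering of $value that converts back to $value.
# Assumes nvtype 'double'; for wider NV types the upper bound must be raised.
sub shortest_roundtrip {
    my ($value) = @_;
    for my $digits ( 1 .. 17 ) {
        my $text = sprintf "%.${digits}g", $value;
        return $text if $text == $value;
    }
    return sprintf "%.17g", $value;    # fallback, e.g. for NaN
}

print shortest_roundtrip(0.1),  "\n";
print shortest_roundtrip(0.25), "\n";
```

Being a linear search, this is only useful as an independent check on small test sets. It does not help with the 'long double' and '__float128' cases unless the digit bound is adjusted and the same rounding assumptions hold there.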

    Is there a better "trustworthy external source" that I should be using ?
    Is there a C library available that already implements this algorithm ? (Note that the PDF file to which I've linked provides only pages 112-126 of the book. I've implemented the algorithm on page 120.)
    Incidentally, Zefram had indicated an interest in implementing this particular algorithm into the perl source, but I've seen no postings from him anywhere since his perl5 grant ran out last year.


[SOLVED] Test:: fail when output file was changed
1 direct reply — Read more / Contribute
by jahero
on Jan 18, 2019 at 04:05

    Dear fellow monks.

    I think that some time ago I have seen somewhere information about a package on CPAN (in a blog post I think), which could be used in testing, and which:

    • by default failed, when named output file was changed
    • could run in such a mode, that it would "remember" current state of the file (if the change was intentional), which would prevent further fails (until next change)

    Imagine you are programmatically creating complex structured text, and want to be able to "keep it fixed" in your tests - unless you know that the change is "for the better"

    Is my memory playing tricks on me? Are you aware of such library? I hope that what I vaguely described above makes sense.

    Unfortunately, my Google-fu is not advanced enough to yield the answer.

    Regards, Jan


    Update: changed the title, answer is in comments below.
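
The pattern described in this post is often called golden-file or snapshot testing. The sketch below is a hand-rolled illustration using core modules only and is not the CPAN package the poster was trying to remember. `matches_golden` is a hypothetical helper: with a true third argument it records the current output as the new reference, otherwise it fails when the output differs from the stored copy.

```perl
use strict;
use warnings;
use Test::More;
use File::Temp qw(tempdir);
use File::Spec;

# Hypothetical helper: compare $got with the golden file at $path.
# When $update is true, (re)record the golden file and report success.
sub matches_golden {
    my ( $path, $got, $update ) = @_;
    if ($update) {
        open my $out, '>:encoding(UTF-8)', $path or die "Cannot write $path: $!";
        print {$out} $got;
        close $out or die "Cannot close $path: $!";
        return 1;
    }
    return 0 unless -e $path;    # nothing recorded yet counts as a failure
    open my $in, '<:encoding(UTF-8)', $path or die "Cannot read $path: $!";
    my $expected = do { local $/; <$in> };
    close $in;
    $expected = '' unless defined $expected;
    return $expected eq $got ? 1 : 0;
}

my $dir    = tempdir( CLEANUP => 1 );
my $golden = File::Spec->catfile( $dir, 'report.golden' );

ok( matches_golden( $golden, "line 1\nline 2\n", 1 ), 'golden file recorded' );
ok( matches_golden( $golden, "line 1\nline 2\n", 0 ), 'unchanged output passes' );
ok( !matches_golden( $golden, "line 1\nline 2 changed\n", 0 ), 'changed output is detected' );
done_testing();
```

In a real test suite the update flag would typically come from an environment variable or a command-line switch, so that an intentional change is accepted once and checked from then on.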

Building Inline::C from github source
1 direct reply — Read more / Contribute
by syphilis
on Jan 17, 2019 at 20:20

    I can clone the github repo ok:
    git clone inline-c-pm
    However, AFAICT, there's no Makefile.PL included - except for a couple in eg/modules that bear no relation to the building of Inline::C itself.
    In the absence of that file, what is the recommended course of action to be taken in order to build Inline::C from github source ?
    (And where is this procedure documented ?)

Comparison of XML files ignoring ordering of child elements
2 direct replies — Read more / Contribute
by adikan123
on Jan 17, 2019 at 00:09
    I have been using XML::SemanticDiff to compare 2 XML files. This module fails to check the diff if there is a change in the ordering of child elements. I want to know the best way to compare 2 XML files using Perl. One method would be to convert the XML into text and then compare the text files (finding each line from the 1st file in the other file). It would be a great help to get a good response, as I have been struggling with this for the past few weeks.

    The code should do the following:

    1. Take the XML files as input to the Perl script
    2. Compare the XML files for differences while ignoring the order of the child elements
    3. Print each difference with the line number in the Baseline file

    PS: I am new to Perl and this is my first post on PM.

    Thanks
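
One way to approach the order-insensitive part, sketched here with XML::LibXML (a non-core module that must be installed): build a signature string for each element in which attributes and child elements are sorted, then compare the signatures of the two document roots. This is an illustration of the idea only. It answers "same or different" and does not report individual differences or baseline line numbers, and `signature` and `same_ignoring_order` are hypothetical names.

```perl
use strict;
use warnings;
use XML::LibXML;

# Build a string for $element that does not depend on attribute or child order.
sub signature {
    my ($element) = @_;
    my @attrs = sort { $a cmp $b }
        map { $_->nodeName . '="' . $_->value . '"' } $element->findnodes('@*');
    my $text = join '', map { $_->data } $element->findnodes('text()');
    $text =~ s/^\s+|\s+$//g;    # ignore indentation-only differences
    my @children = sort { $a cmp $b } map { signature($_) } $element->findnodes('*');
    my $name = $element->nodeName;
    return '<' . join( ' ', $name, @attrs ) . '>' . $text . join( '', @children ) . "</$name>";
}

# True when both XML strings describe the same tree up to child ordering.
sub same_ignoring_order {
    my ( $xml_a, $xml_b ) = @_;
    my ( $sig_a, $sig_b ) =
        map { signature( XML::LibXML->load_xml( string => $_ )->documentElement ) } $xml_a, $xml_b;
    return $sig_a eq $sig_b ? 1 : 0;
}

print same_ignoring_order( '<r><a x="1"/><b>t</b></r>', '<r><b>t</b><a x="1"/></r>' )
    ? "same\n"
    : "different\n";
```

Reporting each difference would mean walking the two sorted child lists in parallel instead of comparing whole signatures.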
Bookmarking PDF by string
3 direct replies — Read more / Contribute
by ReverendDovie
on Jan 16, 2019 at 15:05
    Hello, first time poster. Hope I get it right.

    I'm trying to do the following:

    1) Convert an HTML page to a PDF
    2) Add the appropriate bookmarks in that PDF
    3) Join it with a pre-fab "title page" PDF

    I have found the answers (I'm pretty sure, haven't fully tested yet) to 1 and 3. Number 2 is getting me a bit. I found the bookmarking ability of PDF::Reuse to be close, but I want to bookmark to a specific string, not a page number since I won't necessarily know the right page number since the PDF was just generated back in step one.

    Is there a way to do the bookmarking thing but to a specific string (which I can preset when building the HTML)?

    Thank you
New Meditations
n-dimensional statistical analysis of DNA sequences (or text, or ...)
1 direct reply — Read more / Contribute
by bliako
on Jan 20, 2019 at 17:22

    The recent node Reduce RAM required by onlyIDleft is asking for efficient shuffling of a DNA sequence. As an answer, hdb suggested (Re: Reduce RAM required) to build the probability distribution of the DNA sequence at hand (simply as a hash of the four DNA bases ATGC each holding the count/probability of each base appearing). Then one can ask the built distribution to output symbols according to their probability. The result will reflect the statistical properties of the input.

    I added that a multi-dimensional probability distribution could create DNA sequences closer to the original, because it would count the occurrence of single bases, ATGC, as well as pairs of bases AT,AG,AC,.., and triplets, and so on. So, I concocted this module, released here, which reads some input data consisting of symbols, optionally separated by a separator, and builds its probability distribution, and some other structures, for a specified number of dimensions, or the ngram-length. That statistical data can then be used to predict output for a given seed.

    For example, for ndim = 3 the cumulative distribution will look like this:

        CT => ["A", 0.7, "G", 1],
        GA => ["T", 0.8, "C", 1],
        GC => ["A", 0.2, "T", 1],
        GT => ["A", 1],

    meaning that CT followed by A appears 7/10 times whereas the only other alternative is CT followed by G which appears 3/10. The above structure (and especially that it has cumulative probs) enables one quite easily to return A or G weighted on their probabilities by just checking if rand() falls below or above 0.7. And so I learned that the likelihood of certain sequences of bases is much larger, even an order of magnitude, than others. For example:

    AAA => 0.0335, CCC => 0.0158, GGG => 0.0158, TTT => 0.0350, GCG => 0.0030, ...

    Another feature of the module is that one can use a starting seed, e.g. AT and ask the system to predict what base follows according to the statistical data built already. And so, a DNA sequence of similar statistical properties as the original can be built.
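
The weighted pick and the seed-based prediction described above can be sketched as follows. This is an illustration written against the example structure from the post, not the module's actual code; `next_symbol` and `extend` are hypothetical names, and the optional `$roll` argument exists only to make the behaviour reproducible.

```perl
use strict;
use warnings;

# Cumulative distribution copied from the ndim = 3 example in the post.
my %cumulative = (
    CT => [ "A", 0.7, "G", 1 ],
    GA => [ "T", 0.8, "C", 1 ],
    GC => [ "A", 0.2, "T", 1 ],
    GT => [ "A", 1 ],
);

# Pick the symbol that follows $context; $roll in [0,1) defaults to rand().
sub next_symbol {
    my ( $dist, $context, $roll ) = @_;
    my $pairs = $dist->{$context} or return;
    $roll = rand() unless defined $roll;
    for ( my $i = 0 ; $i < @$pairs ; $i += 2 ) {
        return $pairs->[$i] if $roll < $pairs->[ $i + 1 ];
    }
    return $pairs->[-2];    # guard in case the last cumulative value is below 1
}

# Extend $seed by up to $count symbols, using its last two symbols as context.
sub extend {
    my ( $dist, $seed, $count ) = @_;
    my $sequence = $seed;
    for ( 1 .. $count ) {
        my $next = next_symbol( $dist, substr( $sequence, -2 ) );
        last unless defined $next;    # context never seen in the input
        $sequence .= $next;
    }
    return $sequence;
}

print next_symbol( \%cumulative, 'CT', 0.5 ), "\n";    # roll below 0.7 selects A
print extend( \%cumulative, 'GT', 5 ), "\n";
```

With the four contexts listed, extending the seed GT stops after one step because TA is not among them; a full distribution built from real data would have an entry for every context seen in the input.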

    The module I am describing can also be used for reading in any other type of sequence, not just DNA sequences (fasta files). Text for example. And the module, then, becomes a random text generator emulating the style (i.e. the statistical distribution in n-dimensions) of the specific corpus or literature opus.

    The internal data structure is a hash of hashes which, even for 4 dimensions and literature, stays reasonably small because it is usually very sparse. For example, the 3-dimensional statistical distribution of Shelley's Frankenstein can be serialised to a 1.5MB file. So, huge data can be compressed to the multi-dimensional probability distribution I am describing if all one wants is to keep creating clones of that data with respect to the statistical properties of the original (and not an exact replica of the original).

    Of course finite input data may not encompass all the details of the process which produced it, and such a probability distribution, even over n dimensions, may prove insufficient for emulating the process.

    There are a few modules for doing something similar in CPAN already but I wanted to be able to read huge datasets without resorting to intermediate arrays for obvious reasons. And I wanted to be able to have access to the internal data representing the probability distribution of the data.

    Also, I wanted to read the data once, build the statistical distribution and save it, serialised to a file. Then I could do as many predictions as I wanted without re-reading huge data files.

    Lastly, I wanted to implement an efficient-to-store n-dimensional histogram so-to-speak using the very simple method of hash-of-hashes with the twist that one can also interrogate the HoH by means of: what follows the phrase I took my?
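
A hash-of-hashes histogram of this kind, together with the "what follows this phrase?" query, can be sketched in a few lines. This is an illustration of the data structure only, not the module's code; `build_counts` and `cumulative_for` are hypothetical names, and the sample sentence is made up.

```perl
use strict;
use warnings;

# For every run of ($n - 1) symbols, count which symbol follows it.
sub build_counts {
    my ( $n, @symbols ) = @_;
    my %counts;
    for my $i ( 0 .. $#symbols - $n + 1 ) {
        my $context = join ' ', @symbols[ $i .. $i + $n - 2 ];
        $counts{$context}{ $symbols[ $i + $n - 1 ] }++;
    }
    return \%counts;
}

# Turn the counts for one context into [symbol, cumulative probability, ...].
sub cumulative_for {
    my ( $counts, $context ) = @_;
    my $followers = $counts->{$context} or return;
    my $total = 0;
    $total += $_ for values %$followers;
    my ( $running, @pairs ) = (0);
    for my $symbol ( sort keys %$followers ) {
        $running += $followers->{$symbol};
        push @pairs, $symbol, $running / $total;
    }
    return \@pairs;
}

my @words  = split ' ', 'I took my hat and I took my coat and I took my hat';
my $counts = build_counts( 4, @words );
my $after  = cumulative_for( $counts, 'I took my' );
print "@$after\n";
```

Only contexts that actually occur get a key, which is why the structure stays sparse. The outer hash can then be serialised once and reloaded for later predictions.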

    And here are three scripts for reading DNA or text sequences and doing a prediction. The available options can be inferred by just looking at the scripts, or from the examples further down.

    Here is some usage:

        $ wget +8.p12_chr20.fa.gz # warning ~100MB
        # this will build the 3-dim probability distribution of the input DNA seq and serialise it to the state file
        $ --input-fasta hs_ref_GRCh38.p12_chr20.fa --ngram-length 3 --output-state hs_ref_GRCh38.p12_chr20.fa.3.state --output-stats stats.txt
        # now work with some text, e.g +txt (easy on the gutenberg servers!!!)
        $ --input-corpus ShelleyFrankenstein.txt --ngram-length 2 --output-state shelley.state
        $ --input-state shelley.state

    I am looking for comments/feature-request before publishing this. Once it is published I will replace the code in this node with links. Please let me know asap if I am abusing resources by posting this code here.

    bw, bliako

New Cool Uses for Perl
Exploring Type::Tiny Part 6: Some Interesting Type Libraries
No replies — Read more | Post response
by tobyink
on Jan 20, 2019 at 08:40

    Type::Tiny is probably best known as a way of having Moose-like type constraints in Moo, but it can be used for so much more. This is the sixth in a series of posts showing other things you can use Type::Tiny for. This article along with the earlier ones in the series can be found on my blog and in the Cool Uses for Perl section of PerlMonks.

    While Types::Standard provides all the type constraints Moose users will be familiar with (and a few more) there are other type libraries you can use instead of or as well as Types::Standard.


    Types::Path::Tiny

    If your attribute or parameter needs to accept a file or directory name, I'd strongly recommend using Types::Path::Tiny. It provides Path, File, and Dir types, plus Abs* versions of them which coerce given filenames into absolute paths. The Path::Tiny objects it coerces strings into provide a bunch of helpful methods for manipulating files.

        package MyApp::Config {
            use Moo;
            use Types::Path::Tiny qw(AbsFile);
            use JSON::MaybeXS qw(decode_json);

            has config_file => (
                is     => 'ro',
                isa    => AbsFile->where(q{ $_->basename =~ q/\.json$/ }),
                coerce => 1,
            );

            sub get_hash {
                my $self = shift;
                decode_json( $self->config_file->slurp_utf8 );
            }
        }

    Nice? Types::Path::Tiny is my personal favourite third-party type library. If you're writing an application that needs to deal with files, use it.

    Types::Common::String and Types::Common::Numeric

    Types::Common::String provides a bunch of type constraints more specific than the standard Str type. If you have indicated that an attribute or parameter should be a string, it's pretty rare that you really want to allow any string. You might want to constrain it more. This type library has types like NonEmptyStr and UpperCaseStr.

    Types::Common::Numeric does the same for numbers, giving you type constraints like PositiveInt and IntRange[1,10].

    Both of these libraries come bundled with Type::Tiny, so if you're already using Types::Standard, they won't add any extra dependencies to your code.
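
As a small illustration of the constraints named above, here is a sketch that checks a few values directly, outside of any Moo class. It assumes a Type::Tiny release recent enough to include IntRange; the sample values are made up.

```perl
use strict;
use warnings;
use Types::Common::String  qw(NonEmptyStr UpperCaseStr);
use Types::Common::Numeric qw(PositiveInt IntRange);

# A parameterized type object can be stored and reused like any other value.
my $one_to_ten = IntRange [ 1, 10 ];

# check() returns a boolean without throwing an exception.
my %result = (
    'NonEmptyStr: empty string' => NonEmptyStr->check('')      ? 'pass' : 'fail',
    'UpperCaseStr: PERL'        => UpperCaseStr->check('PERL') ? 'pass' : 'fail',
    'UpperCaseStr: Perl'        => UpperCaseStr->check('Perl') ? 'pass' : 'fail',
    'PositiveInt: 0'            => PositiveInt->check(0)       ? 'pass' : 'fail',
    'IntRange[1,10]: 10'        => $one_to_ten->check(10)      ? 'pass' : 'fail',
    'IntRange[1,10]: 11'        => $one_to_ten->check(11)      ? 'pass' : 'fail',
);

printf "%-28s %s\n", $_, $result{$_} for sort keys %result;
```

In a class definition the same type objects would go into the isa option of an attribute, as in the config_file example earlier in the post.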


    Types::TypeTiny

    This is a type library created for Type::Tiny's internal use and gives you types like ArrayLike, HashLike, and CodeLike which allow overloaded objects.

    Again it's bundled with Type::Tiny, so it won't add any extra dependencies.


    Types::DateTime

    A type library for DateTime objects, allowing them to be coerced from strings.

        has start_date => (
            is     => 'ro',
            isa    => DateTimeUTC,
            coerce => 1,
        );

    The above will not only coerce the attribute to a DateTime object, but coerce it to the correct timezone.

New Perl Poetry
The wise coder
No replies — Read more | Post response
by hippo
on Jan 22, 2019 at 14:14
    The wise coder will never pipe grep into awk
    For unneeded processes cause neighbours to talk.
    The wise coder will never type cat file | foo
    Unless they want a UUoCA too.
    The wise coder will never discard output from backticks
    As that displays ignorance of better tactics.
    The wise coder won't shell out to grep, awk or sed
    Since no JAPH would be seen doing that, dead.
New Monk Discussion
Solved: forced preview in user settings
1 direct reply — Read more / Contribute
by LanX
on Jan 17, 2019 at 17:13

    I had a conversation with rsFalse about one of his empty posts and he told me that he doesn't get a preview button.

    So I suggested unchecking "No Forced Preview" in User Settings, but he told me that there is no effect whatsoever.

    I got curious and tried it out and it really doesn't seem to change anything. *

    I'm not sure if this is related to his case and probably he already found another solution, but it seemed necessary to report that this feature doesn't seem to work.

    UPDATE Solved

    *) Sorry, pryrt clarified it for me.

    I wrongly expected the preview button to disappear.

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery
    FootballPerl is like chess, only without the dice
