
Seekers of Perl Wisdom


If you have a question on how to do something in Perl, or you need a Perl solution to an actual real-life problem, or you're unsure why something you've tried just isn't working... then this section is the place to ask. Post a new question!

However, you might consider asking in the chatterbox first (if you're a registered user). The response time tends to be quicker, and if it turns out that the problem/solutions are too much for the cb to handle, the kind monks will be sure to direct you here.

User Questions
Improving speed match arrays with fuzzy logic
3 direct replies — Read more / Contribute
by Takamoto
on Jan 18, 2019 at 12:57

    Hello monks,

    I have a first long list (array) of n-grams (arrays of various lengths, typically 10.000 elements) and a second, even bigger reference list of n-grams (~500.000 elements). I need to check whether elements of list 1 are present in list 2. The match must be fuzzy: both lists contain words, and I also need to match very similar words (basically singular/plural forms, small differences in spelling, and so on). My script (very naive) works, but I'd like to hear your suggestions for improving its speed (which is quite crucial in my application). I already had a BIG gain in time switching from Text::Fuzzy::PP to Text::Fuzzy, but surely the script could be further improved. The data looks like this (each line is one element, possibly a multi-word unit):

    IP exchange
    IPX
    Internet Protocol Exchange
    DPM
    Data Point Model

    My code

    use strict;
    use warnings;
    use Text::Fuzzy;
    use Data::Dumper;

    # my first list. I want to check if its elements are in my reference list $referenceNgrams
    my @stringsToMatch=("extra-articular arthrodesis", "malnutritions");
    my @goodmatches;
    my $referenceNgrams="EN.txt";#my huge reference file (second list)

    print "Loading n-grams from $referenceNgrams\n";
    open my $inFH, '<:encoding(UTF-8)', $referenceNgrams or die;
    chomp(my @Corpus = <$inFH>);
    close $inFH;

    #Matching with Fuzzy
    foreach my $stringToMatch (@stringsToMatch){
        print "Working on $stringToMatch\n";
        foreach my $corpusElement (@Corpus){
            #matching only if $stringToMatch has the same amount of elements of $corpusElement (to save time?)
            my $elementsInstringToMatch = 1 + ($stringToMatch =~ tr{ }{ });
            my $elementsIncorpusElement = 1 + ($corpusElement =~ tr{ }{ });
            if ($elementsIncorpusElement eq $elementsInstringToMatch){
                my $tf = Text::Fuzzy->new ($stringToMatch);
                my $distance= $tf->distance ($corpusElement);
                if ($distance < 2){#sensibility
                    push (@goodmatches, $stringToMatch);
                    last;#go out of loop if match has been found
                }
            }
        }
    }
    print "Good matches:\n";
    print Dumper @goodmatches;
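
    One possible direction, offered only as a sketch and not as a measured answer: index the reference list once, so that each string is compared only against candidates with the same word count and a length within the allowed distance. The helper names (edit_distance, build_index, has_fuzzy_match) are made up for this illustration, and the plain Levenshtein routine is a stand-in so that the sketch runs without non-core modules; the distance method of Text::Fuzzy used in the post could take its place.

```perl
use strict;
use warnings;

# Plain Levenshtein distance (stand-in for Text::Fuzzy's distance method).
sub edit_distance {
    my ($s, $t) = @_;
    my @prev = (0 .. length $t);
    for my $i (1 .. length $s) {
        my @cur = ($i);
        for my $j (1 .. length $t) {
            my $cost = substr($s, $i - 1, 1) eq substr($t, $j - 1, 1) ? 0 : 1;
            my $best = $prev[$j - 1] + $cost;                        # substitution
            $best = $prev[$j] + 1    if $prev[$j] + 1 < $best;       # deletion
            $best = $cur[$j - 1] + 1 if $cur[$j - 1] + 1 < $best;    # insertion
            push @cur, $best;
        }
        @prev = @cur;
    }
    return $prev[-1];
}

# Index the reference list once: a hash for exact hits, plus buckets keyed
# by word count and then by string length.
sub build_index {
    my (@corpus) = @_;
    my (%exact, %bucket);
    for my $entry (@corpus) {
        $exact{$entry} = 1;
        my $words = 1 + ($entry =~ tr/ //);    # tr with empty replacement only counts
        push @{ $bucket{$words}{ length $entry } }, $entry;
    }
    return { exact => \%exact, bucket => \%bucket };
}

# True if $string has a reference entry within $max edits.
sub has_fuzzy_match {
    my ($index, $string, $max) = @_;
    return 1 if $index->{exact}{$string};
    my $words  = 1 + ($string =~ tr/ //);
    my $by_len = $index->{bucket}{$words} or return 0;
    # Strings whose lengths differ by more than $max cannot be within $max edits.
    for my $len (length($string) - $max .. length($string) + $max) {
        for my $candidate (@{ $by_len->{$len} || [] }) {
            return 1 if edit_distance($string, $candidate) <= $max;
        }
    }
    return 0;
}

my $index = build_index("extra-articular arthrodesis", "malnutrition", "Data Point Model");
my @good  = grep { has_fuzzy_match($index, $_, 1) } ("malnutritions", "IP exchange");
print "Good matches: @good\n";    # prints "Good matches: malnutritions"
```

    The threshold of 1 mirrors the "distance < 2" test in the post. If Text::Fuzzy is kept, creating the object once per string to match, outside the inner loop, avoids rebuilding it for every reference entry.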
Documentation for die_bug
3 direct replies — Read more / Contribute
by dhvsfan
on Jan 18, 2019 at 12:27

    I see some use of "::die_bug" in Perl code found on the internet. I can't find any documentation for it. Using it under "This is perl 5, version 26, subversion 2 (v5.26.2) built for MSWin32-x64-multi-thread" gives the error: Undefined subroutine &main::die_bug called at ./ line 662.

    It works in "This is perl 5, version 18, subversion 4 (v5.18.4) built for x86_64-linux"

[OT] Accessing python3's print() of floating point values
2 direct replies — Read more / Contribute
by syphilis
on Jan 18, 2019 at 08:08

    With python3 installed, I'm able to access (from within a perl script) the values that python3 prints out for a particular double ($d) by shelling out as follows:
    use strict;
    use warnings;

    my $d = 2 ** -1074;
    my $py = `python3 -c \"print($d)\"`;
    print "$py\n";
    # prints 5e-324
    Now, that's about the full extent of my python3 skills, and it works well enough for my purposes even though shelling out is quite an expensive operation.

    But that only works as I intend if $d is a 'double' - ie when $Config{nvtype} is 'double'.
    How do I get access to the output of 'long double' and '__float128' values in python3 - ie when perl's nvtype (and hence $d) is 'long double' or '__float128' ?
    I couldn't quickly google up an answer to that question ... so I'm asking here in the event that some kind soul might be able to help me out.

    <pathetic cringe>
    By way of explanation:
    Python3 has a rather nice approach to the (base 10) presentation of floating point values. It will provide as few digits as are needed. The script above is a good example. Perl will tell you:
    C:\>perl -le "print 2 ** -1074;"
    4.94065645841247e-324
    And python3 will claim:
    $ python3 -c "print(2 ** -1074)"
    5e-324
    The 2 values appear to be different ... but they're not, and perl will even tell you so:
    C:\>perl -le "print 'ok' if 5e-324 == 4.94065645841247e-324;"
    ok
    It therefore begs the question "Why go to the (obfuscating ?) trouble of providing all of those extra digits when they don't provide greater accuracy ?". It's a good question - though there are answers and the question is not necessarily a rhetorical one.

    Anyway, I've written a perl implementation (using Math::MPFR, Math::GMPz, Math::GMPq) of the python3 approach to base 10 output of floating point values, and I need to test it.
    The internal consistency checks are looking fine, but I also want to check against some trustworthy external source - largely to check that I've got the standardised output working as per the .... ummm .... standard.
    That's why I'm referencing python3, and it's checking out well for double precision floating point values.
    Now it's just a matter of checking the 'long double' and '__float128' precision values.
    Hence my request for assistance.
    </pathetic cringe>
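
    For illustration only, the "as few digits as are needed" idea from the aside above can be cross-checked by brute force: try increasing precisions until the printed string converts back to the same value. This is not the algorithm from the book and not the Math::MPFR implementation mentioned in the post; shortest_repr is a hypothetical helper, and the sketch assumes that perl's sprintf and string-to-number conversion are correctly rounded on the platform in use.

```perl
use strict;
use warnings;

# Return the shortest "%.{n}g" rendering of $x that converts back to exactly $x.
# 17 significant digits always suffice for an IEEE 754 double; 21 are needed
# for an 80-bit long double and 36 for __float128, so pass $max_digits to match
# the perl's nvtype.
sub shortest_repr {
    my ($x, $max_digits) = @_;
    $max_digits ||= 17;
    for my $digits (1 .. $max_digits) {
        my $str = sprintf "%.${digits}g", $x;
        return $str if $str == $x;    # round trip succeeded
    }
    return sprintf "%.${max_digits}g", $x;    # fallback: full precision
}

print shortest_repr(0.1), "\n";      # prints 0.1
print shortest_repr(0.5), "\n";      # prints 0.5
print shortest_repr(1 / 3), "\n";
```

    The output format follows %g, so the exponent style can differ from what python3 prints even when the digits agree.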

    Is there a better "trustworthy external source" that I should be using ?
    Is there a C library available that already implements this algorithm ? (Note that the PDF file to which I've linked provides only pages 112-126 of the book. I've implemented the algorithm on page 120.)
    Incidentally, Zefram had indicated an interest in implementing this particular algorithm into the perl source, but I've seen no postings from him anywhere since his perl5 grant ran out last year.


Perl Script Not working Excel 2016
1 direct reply — Read more / Contribute
by raju1
on Jan 18, 2019 at 05:11
    Hi Friends,

    I had a perl script to convert .txt files to .xls after doing some calculations and filtering. It was working fine in Excel 2010. Recently I had Excel 2016 installed, and since then it's throwing me an error as follows: "No type library matching "Microsoft Excel" found at C:\GMIO Transaction Audit Process\Perl Script\ line 13". Here is the script up to line 13. I request your help with this.

    #!/usr/bin/perl
    # These are the modules the program utilizes.
    # Strict is a module which compiles the code to avoid syntax issues.
    # List::Util was utilized for part of the random selection .
    # The POSIX module was utilized also in the random selection
    #function
    # Win32::OLE is for utilizing Excel within Perl.
    use strict;
    use File::Find;
    use POSIX qw(ceil);
    use List::Util 'shuffle';
    use Win32::OLE::Const "Microsoft Excel";

[SOLVED] Test:: fail when output file was changed
1 direct reply — Read more / Contribute
by jahero
on Jan 18, 2019 at 04:05

    Dear fellow monks.

    I think that some time ago I saw information somewhere (in a blog post, I think) about a package on CPAN that can be used in testing, and which:

    • by default failed, when named output file was changed
    • could run in such a mode, that it would "remember" current state of the file (if the change was intentional), which would prevent further fails (until next change)

    Imagine you are programmatically creating complex structured text, and want to be able to "keep it fixed" in your tests - unless you know that the change is "for the better".

    Is my memory playing tricks on me? Are you aware of such a library? I hope that what I vaguely described above makes sense.

    Unfortunately, my Google-fu is not advanced enough to yield the answer.

    Regards, Jan


    Update: changed the title; the answer is in the comments below.
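
    The behaviour described in the bullets above (fail when a named output file changes, unless told to remember the new state) can be sketched with core modules only. This is not the CPAN package the post is asking about; check_snapshot and its return values are invented for this illustration, and recording on the first run is an assumption of the sketch.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Compare $content with the snapshot stored at $path.
# Returns 'recorded' (snapshot written), 'match' or 'changed'.
sub check_snapshot {
    my ($path, $content, $update) = @_;
    if ($update || !-e $path) {
        open my $out, '>:encoding(UTF-8)', $path or die "Cannot write $path: $!";
        print {$out} $content;
        close $out or die "Cannot close $path: $!";
        return 'recorded';
    }
    open my $in, '<:encoding(UTF-8)', $path or die "Cannot read $path: $!";
    my $stored = do { local $/; <$in> };    # slurp the whole file
    close $in;
    $stored = '' unless defined $stored;
    return $stored eq $content ? 'match' : 'changed';
}

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/report.snapshot";
print check_snapshot($file, "line 1\n", 0), "\n";    # recorded (first run)
print check_snapshot($file, "line 1\n", 0), "\n";    # match
print check_snapshot($file, "line 2\n", 0), "\n";    # changed: a test would fail here
print check_snapshot($file, "line 2\n", 1), "\n";    # recorded (intentional change)
```

    In a test file the result could be fed to Test::More, for example by treating anything other than 'changed' as a pass, with the update flag taken from an environment variable.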

Building Inline::C from github source
1 direct reply — Read more / Contribute
by syphilis
on Jan 17, 2019 at 20:20

    I can clone the github repo ok:
    git clone inline-c-pm
    However, AFAICT, there's no Makefile.PL included - except for a couple in eg/modules that bear no relation to the building of Inline::C itself.
    In the absence of that file, what is the recommended course of action to be taken in order to build Inline::C from github source ?
    (And where is this procedure documented ?)

Comparison of XML files ignoring ordering of child elements
2 direct replies — Read more / Contribute
by adikan123
on Jan 17, 2019 at 00:09

    I am currently using XML::SemanticDiff to compare 2 XML files. This module fails to compare the files correctly when the order of child elements changes. I want to know the best way to compare 2 XML files using Perl. One method would be to convert the XML into text and then compare the text files (looking up each line of the 1st file in the other file).

    It would be a great help to get a good response, as I have been struggling with this for the past few weeks.

    PS: I am new to Perl and this is my first post in PM.

    Thanks
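
    One way to make a comparison independent of child order, sketched here under stated assumptions, is to reduce each document to a canonical string in which the children of every element are sorted. The tree shape used below (hashes with name, attrs, text and children keys) is hypothetical; it stands for whatever structure an XML parser of choice produces, and canonical and same_ignoring_order are names invented for this example.

```perl
use strict;
use warnings;

# Serialize a node with sorted attributes and sorted children, so that two
# trees differing only in child order produce the same string.
sub canonical {
    my ($node) = @_;
    my $attrs    = $node->{attrs} || {};
    my $attr_str = join ' ', map { qq{$_="$attrs->{$_}"} } sort keys %$attrs;
    my @kids     = map { canonical($_) } @{ $node->{children} || [] };
    @kids = sort @kids;
    my $text = defined $node->{text} ? $node->{text} : '';
    return "<$node->{name} $attr_str>$text" . join('', @kids) . "</$node->{name}>";
}

sub same_ignoring_order {
    my ($left, $right) = @_;
    return canonical($left) eq canonical($right);
}

my $doc1 = { name => 'host', children => [
    { name => 'alias', text => 'www' },
    { name => 'ip',    text => '192.0.2.10', attrs => { version => 4 } },
] };
my $doc2 = { name => 'host', children => [
    { name => 'ip',    text => '192.0.2.10', attrs => { version => 4 } },
    { name => 'alias', text => 'www' },
] };

print same_ignoring_order($doc1, $doc2) ? "same\n" : "different\n";    # prints "same"
```

    The sketch ignores whitespace handling, namespaces and mixed content, all of which a real comparison would have to decide on.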
printing an array reference
2 direct replies — Read more / Contribute
by vedas
on Jan 16, 2019 at 22:48

    Apologies if this seems like a simple question; I'm working on learning the more advanced topics in perl.

    I'm having trouble pulling information out of an array reference. I'm trying to print the data that is returned from the aliases method. Would anyone be able to tell me what I'm doing wrong?

    #:Code
    #!/usr/bin/perl
    #
    use strict ;
    #use lib "/var/www/html/monkey/cgi-bin";
    use lib "/var/www/html/production/cgi-bin";
    # load the WebUI common libraries
    use Infoblox::WebUI ;
    use Data::Dumper;

    my $session = Infoblox::Session->new(
        master => $address,
        username => $user,
        password => $passwd,
        timeout => $timeout_num,
        connection_timeout => $conn_timeout
    );
    print "Check:".Infoblox::status_code() . ":" . Infoblox::status_detail()."\n";
    unless ($session) {
        die("Construct session failed: ",
            Infoblox::status_code() . ":" . Infoblox::status_detail());
    }

    #Get DHCP Host Address object through the session
    my @retrieved_objs = $session->get(
        object => "Infoblox::DNS::Host",
        name => "",
    );

    #my @tmp_array = @$_->aliases() for (@retrieved_objs);
    #print "aliases: " . @tmp_array;
    print "aliases: " .$_->aliases(),"\n" for (@retrieved_objs);
    print "dns_name: " . $_->dns_name(),"\n" for (@retrieved_objs);
    print "ipv4addrs: " . $_->ipv4addrs(),"\n" for (@retrieved_objs);
    print "comment: " . $_->comment(),"\n" for (@retrieved_objs);
    print "name: " . $_->name(),"\n" for (@retrieved_objs);
    :
    :
    :
    :OUTPUT
    [user1@host perl]$ ./
    Check:0:Operation succeeded
    aliases: ARRAY(0x55f6da24f8f0)
    dns_name:
    ipv4addrs: ARRAY(0x55f6da243eb0)
    comment: Standard Network: [TAGS: STATIC;]
    name:
    zone:
    ttl: 300

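
    The ARRAY(0x...) values in the output above are what perl prints when an array reference is used as a string; the reference has to be dereferenced before printing. A minimal sketch follows, with a made-up Demo::Host class standing in for the Infoblox object (the real accessors may return objects instead of plain strings, which is not checked here). list_to_string is a helper invented for this example.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an object whose accessors return array references.
package Demo::Host;
sub new       { my ($class, %args) = @_; return bless {%args}, $class }
sub aliases   { return $_[0]{aliases} }
sub ipv4addrs { return $_[0]{ipv4addrs} }

package main;

# Dereference with @{ ... }; "|| []" guards against an undefined return value.
sub list_to_string {
    my ($array_ref) = @_;
    return join ', ', @{ $array_ref || [] };
}

my @retrieved_objs = (
    Demo::Host->new(
        aliases   => [ 'www.example.com', 'web.example.com' ],
        ipv4addrs => [ '192.0.2.10' ],
    ),
);

for my $host (@retrieved_objs) {
    print "aliases: ",   list_to_string($host->aliases()),   "\n";
    print "ipv4addrs: ", list_to_string($host->ipv4addrs()), "\n";
}
```

    If the elements turn out to be objects themselves, Data::Dumper (already loaded in the post) shows their structure: print Dumper($host->aliases()).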
Bookmarking PDF by string
3 direct replies — Read more / Contribute
by ReverendDovie
on Jan 16, 2019 at 15:05
    Hello, first time poster. Hope I get it right.

    I'm trying to do the following:

    1) Convert an HTML page to a PDF
    2) Add the appropriate bookmarks in that PDF
    3) Join it with a pre-fab "title page" PDF

    I have found the answers (I'm pretty sure, haven't fully tested yet) to 1 and 3. Number 2 is getting me a bit. I found the bookmarking ability of PDF::Reuse to be close, but I want to bookmark to a specific string, not a page number since I won't necessarily know the right page number since the PDF was just generated back in step one.

    Is there a way to do the bookmarking thing but to a specific string (which I can preset when building the HTML)?

    Thank you
install_driver(ODBC) mac mojave 10.14.2
2 direct replies — Read more / Contribute
by raventheone
on Jan 16, 2019 at 12:09

    hello all

    i have the problem that i am not able to get a simple perl script doing a database connection and query. here the error i get:

    install_driver(ODBC) failed: Can't locate DBD/ in @INC (you may need to install the DBD::ODBC module) (@INC contains: /Library/Perl/5.18/darwin-thread-multi-2level /Library/Perl/5.18 /Network/Library/Perl/5.18/darwin-thread-multi-2level /Network/Library/Perl/5.18 /Library/Perl/Updates/5.18.2/darwin-thread-multi-2level /Library/Perl/Updates/5.18.2 /System/Library/Perl/5.18/darwin-thread-multi-2level /System/Library/Perl/5.18 /System/Library/Perl/Extras/5.18/darwin-thread-multi-2level /System/Library/Perl/Extras/5.18 .) at (eval 4) line 3. Perhaps the DBD::ODBC perl module hasn't been fully installed, or perhaps the capitalisation of 'ODBC' isn't right. Available drivers: DBM, ExampleP, File, Gofer, Proxy, SQLite, Sponge. at line 10.

    i am running mac os mojave 10.14.2

    the db connection works using "Azure Data Studio"

    any help is deeply appreciated
