PerlMonks  

The Monastery Gates

New Questions
Improving speed match arrays with fuzzy logic
3 direct replies — Read more / Contribute
by Takamoto
on Jan 18, 2019 at 12:57

    Hello monks,

    I have a first long list (array) of n-grams (an array of varying length, typically 10,000 elements) and a second, even bigger reference list of n-grams (~500,000 elements). I need to check whether elements of list 1 are present in list 2. The match must be fuzzy: both lists contain words, and I also need to match very similar words (basically singular/plural forms, small differences in spelling, and so on). My script (very naive) works, but I'd like to hear your suggestions for improving its speed (which is quite crucial in my application). I already had a BIG gain in time switching from Text::Fuzzy::PP to Text::Fuzzy, but surely the script could be further improved. The data looks like this (each line is one element, possibly a multi-word unit):

    IP exchange
    IPX
    Internet Protocol Exchange
    DPM
    Data Point Model

    My code

    use strict;
    use warnings;
    use Text::Fuzzy;
    use Data::Dumper;

    my @stringsToMatch=("extra-articular arthrodesis", "malnutritions");#my first list. I want to check if its elements are in my reference list $referenceNgrams
    my @goodmatches;
    my $referenceNgrams="EN.txt";#my huge reference file (second list)

    print "Loading n-grams from $referenceNgrams\n";
    open my $inFH, '<:encoding(UTF-8)', $referenceNgrams or die;
    chomp(my @Corpus = <$inFH>);
    close $inFH;

    #Matching with Fuzzy
    foreach my $stringToMatch (@stringsToMatch){
        print "Working on $stringToMatch\n";
        foreach my $corpusElement (@Corpus){
            #matching only if $stringToMatch has the same amount of elements of $corpusElement (to save time?)
            my $elementsInstringToMatch = 1 + ($stringToMatch =~ tr{ }{ });
            my $elementsIncorpusElement = 1 + ($corpusElement =~ tr{ }{ });
            if ($elementsIncorpusElement eq $elementsInstringToMatch){
                my $tf = Text::Fuzzy->new ($stringToMatch);
                my $distance= $tf->distance ($corpusElement);
                if ($distance < 2){#sensibility
                    push (@goodmatches, $stringToMatch);
                    last;#go out of loop if match has been found
                }
            }
        }
    }

    print "Good matches:\n";
    print Dumper @goodmatches;
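
    One way the inner loop above might be narrowed, as a minimal sketch only: index the reference list once by word count and string length, try an exact hash lookup first, and run the distance check only on entries whose length differs by at most one character. The helper names (within_one_edit, build_index, has_fuzzy_match) are hypothetical, and the distance test is a core-Perl stand-in for a "distance < 2" check rather than the Text::Fuzzy implementation.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $s and $t differ by at most one
# substitution, insertion, or deletion (edit distance <= 1).
sub within_one_edit {
    my ($s, $t) = @_;
    ($s, $t) = ($t, $s) if length($s) > length($t);    # make $s the shorter one
    my ($ls, $lt) = (length $s, length $t);
    return 0 if $lt - $ls > 1;
    my $i = 0;
    $i++ while $i < $ls && substr($s, $i, 1) eq substr($t, $i, 1);
    return 1 if $i == $ls;    # identical, or $t has one extra trailing character
    # Skip the first mismatch: in both strings for a substitution,
    # only in the longer string for an insertion/deletion.
    my $rest_s = $ls == $lt ? substr($s, $i + 1) : substr($s, $i);
    return $rest_s eq substr($t, $i + 1) ? 1 : 0;
}

# Build the index once: exact-match hash plus buckets keyed by
# word count and string length.
sub build_index {
    my @corpus = @_;
    my (%exact, %bucket);
    for my $entry (@corpus) {
        $exact{$entry} = 1;
        my $words = 1 + ($entry =~ tr/ //);
        push @{ $bucket{$words}{ length $entry } }, $entry;
    }
    return { exact => \%exact, bucket => \%bucket };
}

# Check one query against the index, visiting only plausible candidates.
sub has_fuzzy_match {
    my ($index, $query) = @_;
    return 1 if $index->{exact}{$query};
    my $words = 1 + ($query =~ tr/ //);
    my $len   = length $query;
    for my $l ($len - 1 .. $len + 1) {
        for my $cand (@{ $index->{bucket}{$words}{$l} || [] }) {
            return 1 if within_one_edit($query, $cand);
        }
    }
    return 0;
}

my $index = build_index("malnutrition", "extra-articular arthrodesis", "data point model");
my @good  = grep { has_fuzzy_match($index, $_) }
            ("malnutritions", "data point models", "unrelated term");
print "$_\n" for @good;
```

    The same bucketing idea should carry over if the distance check is kept in Text::Fuzzy; the point is only that most of the 500,000 reference entries never need to be compared at all.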
[OT] Accessing python3's print() of floating point values
2 direct replies — Read more / Contribute
by syphilis
on Jan 18, 2019 at 08:08
    Hi,

    With python3 installed, I'm able to access (from within a perl script) the values that python3 prints out for a particular double ($d) by shelling out as follows:
    use strict;
    use warnings;

    my $d = 2 ** -1074;
    my $py = `python3 -c \"print($d)\"`;
    print "$py\n"; # prints 5e-324
    Now, that's about the full extent of my python3 skills, and it works well enough for my purposes even though shelling out is quite an expensive operation.

    But that only works as I intend if $d is a 'double' - ie when $Config{nvtype} is 'double'.
    How do I get access to the output of 'long double' and '__float128' values in python3 - ie when perl's nvtype (and hence $d) is 'long double' or '__float128' ?
    I couldn't quickly google up an answer to that question ... so I'm asking here in the event that some kind soul might be able to help me out.

    <pathetic cringe>
    By way of explanation:
    Python3 has a rather nice approach to the (base 10) presentation of floating point values. It will provide as few digits as are needed. The script above is a good example. Perl will tell you:
    C:\>perl -le "print 2 ** -1074;"
    4.94065645841247e-324
    And python3 will claim:
    $ python3 -c "print(2 ** -1074)"
    5e-324
    The 2 values appear to be different ... but they're not, and perl will even tell you so:
    C:\>perl -le "print 'ok' if 5e-324 == 4.94065645841247e-324;"
    ok
    It therefore begs the question "Why go to the (obfuscating ?) trouble of providing all of those extra digits when they don't provide greater accuracy ?". It's a good question - though there are answers and the question is not necessarily a rhetorical one.

    Anyway, I've written a perl implementation (using Math::MPFR, Math::GMPz, Math::GMPq) of the python3 approach to base 10 output of floating point values, and I need to test it.
    The internal consistency checks are looking fine, but I also want to check against some trustworthy external source - largely to check that I've got the standardised output working as per the .... ummm .... standard.
    That's why I'm referencing python3, and it's checking out well for double precision floating point values.
    Now it's just a matter of checking the 'long double' and '__float128' precision values.
    Hence my request for assistance.
    </pathetic cringe>

    Is there a better "trustworthy external source" that I should be using ?
    Is there a C library available that already implements this algorithm ? (Note that the PDF file to which I've linked provides only pages 112-126 of the book. I've implemented the algorithm on page 120.)
    Incidentally, Zefram had indicated an interest in implementing this particular algorithm into the perl source, but I've seen no postings from him anywhere since his perl5 grant ran out last year.

    Cheers,
    Rob
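
    As a rough independent cross-check for the double case, one brute-force approach is to try increasing "%.Ng" precisions until the string numifies back to the same value. This is a sketch under assumptions, not the algorithm from the book: the helper name shortest_roundtrip is hypothetical, the 17-digit cap assumes an IEEE 754 double nvtype (long double and __float128 builds would need a larger cap), and the result depends on the platform's sprintf and string-to-number conversion being correctly rounded. The exponent formatting of %g also differs from python3's, so only the digits are comparable.

```perl
use strict;
use warnings;

# Return the shortest "%.{n}g" rendering of $x that converts back to $x.
# $max_digits defaults to 17, which assumes nvtype is 'double'.
sub shortest_roundtrip {
    my ($x, $max_digits) = @_;
    $max_digits //= 17;
    for my $n (1 .. $max_digits) {
        my $s = sprintf "%.${n}g", $x;
        return $s if $s == $x;    # numeric comparison after numifying the string
    }
    # Nothing round-tripped within the cap; fall back to the longest form.
    return sprintf "%.${max_digits}g", $x;
}

print shortest_roundtrip(0.1), "\n";
print shortest_roundtrip(1 / 3), "\n";
```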

[SOLVED] Test:: fail when output file was changed
1 direct reply — Read more / Contribute
by jahero
on Jan 18, 2019 at 04:05

    Dear fellow monks.

    I think that some time ago I saw information somewhere (in a blog post, I think) about a package on CPAN that could be used in testing, and which:

    • by default failed, when named output file was changed
    • could run in such a mode, that it would "remember" current state of the file (if the change was intentional), which would prevent further fails (until next change)

    Imagine you are programmatically creating complex structured text, and want to be able to "keep it fixed" in your tests - unless you know that the change is "for the better".

    Is my memory playing tricks on me? Are you aware of such a library? I hope that what I vaguely described above makes sense.

    Unfortunately, my Google-fu is not advanced enough to yield the answer.

    Regards, Jan

    ---

    Update: changed the title, answer is in comments below.
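
    The behaviour described in the two bullet points can be sketched by hand with core modules. This is only an illustration of the idea, not the CPAN package being asked about: the helper matches_snapshot and the UPDATE_SNAPSHOTS environment variable are hypothetical names.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: compare $text with the snapshot stored at $path.
# When $update is true, or no snapshot exists yet, the snapshot is
# (re)written and the check passes; otherwise any difference fails.
sub matches_snapshot {
    my ($path, $text, $update) = @_;
    if ($update || !-e $path) {
        open my $out, '>:encoding(UTF-8)', $path or die "Cannot write $path: $!";
        print {$out} $text;
        close $out or die "Cannot close $path: $!";
        return 1;
    }
    open my $in, '<:encoding(UTF-8)', $path or die "Cannot read $path: $!";
    my $stored = do { local $/; <$in> };    # slurp the whole file
    close $in;
    return $stored eq $text ? 1 : 0;
}

my $dir    = tempdir(CLEANUP => 1);
my $snap   = "$dir/report.txt";
my $update = $ENV{UPDATE_SNAPSHOTS};        # hypothetical "remember current state" switch

for my $text ("a\nb\n", "a\nb\n", "a\nB\n") {
    print matches_snapshot($snap, $text, $update) ? "ok\n" : "not ok\n";
}
```

    In a test script the return value could be passed straight to Test::More's ok(), and running once with the switch set would accept an intentional change.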

Building Inline::C from github source
1 direct reply — Read more / Contribute
by syphilis
on Jan 17, 2019 at 20:20
    Hi,

    I can clone the github repo ok:
    git clone git@github.com:ingydotnet/inline-c-pm inline-c-pm
    However, AFAICT, there's no Makefile.PL included - except for a couple in eg/modules that bear no relation to the building of Inline::C itself.
    In the absence of that file, what is the recommended course of action to be taken in order to build Inline::C from github source ?
    (And where is this procedure documented ?)

    Cheers,
    Rob
Comparison of XML files ignoring ordering of child elements
2 direct replies — Read more / Contribute
by adikan123
on Jan 17, 2019 at 00:09
    I am currently using XML::SemanticDiff to compare two XML files. This module fails to check the diff if there is a change in the ordering of child elements. I want to know the best way to compare two XML files using Perl. One method would be to convert the XML into text and then compare the text files (looking up each line of the first file in the other file). It would be a great help if I could get a good response, as I have been struggling with this for the past few weeks. PS: I am new to Perl and this is my first post in PM. Thanks
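
    One order-insensitive approach is to reduce each document to a canonical string in which sibling elements are sorted, then compare the strings. The sketch below assumes the XML has already been parsed (for example with XML::LibXML or XML::Twig) into a simple nested hash; that node layout (name, attrs, text, children) and the function name canonical are hypothetical, chosen only to keep the example self-contained.

```perl
use strict;
use warnings;

# Build an order-independent signature for an element tree.
# Each node is assumed to look like:
#   { name => 'tag', attrs => {...}, text => '...', children => [ ... ] }
sub canonical {
    my ($node) = @_;
    my $attrs    = $node->{attrs} || {};
    my $attr_str = join ',', map { $_ . '=' . $attrs->{$_} } sort keys %$attrs;
    my $text     = defined $node->{text} ? $node->{text} : '';
    $text =~ s/^\s+|\s+$//g;                      # ignore surrounding whitespace
    my @kids = map { canonical($_) } @{ $node->{children} || [] };
    @kids = sort @kids;                           # sibling order no longer matters
    return sprintf '<%s [%s] {%s}>%s</%s>',
        $node->{name}, $attr_str, $text, join('', @kids), $node->{name};
}

my $left = {
    name     => 'root',
    children => [ { name => 'a', text => '1' }, { name => 'b', attrs => { id => '7' } } ],
};
my $right = {
    name     => 'root',
    children => [ { name => 'b', attrs => { id => '7' } }, { name => 'a', text => ' 1 ' } ],
};
print canonical($left) eq canonical($right) ? "same\n" : "different\n";
```

    Sorting every level treats order as irrelevant everywhere, which may be too permissive for documents where some sequences are meaningful.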
Bookmarking PDF by string
3 direct replies — Read more / Contribute
by ReverendDovie
on Jan 16, 2019 at 15:05
    Hello, first time poster. Hope I get it right.

    I'm trying to do the following:

    1) Convert an HTML page to a PDF
    2) Add the appropriate bookmarks in that PDF
    3) Join it with a pre-fab "title page" PDF

    I have found the answers (I'm pretty sure, haven't fully tested yet) to 1 and 3. Number 2 is getting me a bit. I found the bookmarking ability of PDF::Reuse to be close, but I want to bookmark to a specific string, not a page number since I won't necessarily know the right page number since the PDF was just generated back in step one.

    Is there a way to do the bookmarking thing but to a specific string (which I can preset when building the HTML)?

    Thank you
Reading a hash structure stored in a file
5 direct replies — Read more / Contribute
by sam1990
on Jan 15, 2019 at 15:08

    Hello, I have a file that has a hash stored in it. I am trying to read that hash as it is using eval, but I am getting the following error:

    Global symbol "%hash1" requires explicit package name (did you forget to declare "my %hash1"?) at tiny.pl line (print Dumper(\%hash1);) 14.
    Execution of tiny.pl aborted due to compilation errors.

    Please help me understand the issue here, thank you

    #file.pl :
    my %hash1 = (hello => 1, hi =>2 );

    #tiny.pl
    #!/home/utils/perl5/perlbrew/perls/5.24.2-021/bin/perl
    use strict;
    use warnings;
    use Path::Tiny qw( path );
    use Data::Dumper;

    my $file = 'file.pl';
    open(my $fh, '<', $file) or die "Could not open file $file";
    eval($fh);
    close $fh;
    print Dumper(\%hash1);
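
    For context: in the code above, eval is handed the filehandle itself rather than the file's text, and a variable declared with my inside evaluated code would not be visible in tiny.pl anyway, so strict rejects %hash1 at compile time. A minimal sketch of one alternative, assuming the data file can be changed so that its last expression is a hash reference, is to load it with do. The helper name load_hash_file is hypothetical, and the temporary file merely stands in for file.pl.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: run a trusted Perl file and return the hash
# reference produced by its last expression.
sub load_hash_file {
    my ($file) = @_;
    my $data = do $file;
    die "Could not parse $file: $@" if $@;
    die "Could not read $file: $!"  unless defined $data;
    die "$file did not return a hash reference" unless ref $data eq 'HASH';
    return $data;
}

# Stand-in for file.pl: the leading + makes the braces an anonymous hash.
my ($fh, $file) = tempfile(SUFFIX => '.pl', UNLINK => 1);
print {$fh} "+{ hello => 1, hi => 2 };\n";
close $fh or die "Cannot close $file: $!";

my %hash1 = %{ load_hash_file($file) };
print "$_ => $hash1{$_}\n" for sort keys %hash1;
```

    Because do executes whatever the file contains, this only suits files you control.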
Invoke the Perl string interpolation engine on a string contained in a scalar variable.
1 direct reply — Read more / Contribute
by ibm1620
on Jan 15, 2019 at 12:49
    I want to be able to take arbitrary lines containing variables that are defined in the program, and interpolate them.
    #!/usr/bin/env perl
    use 5.010;
    use warnings;
    use strict;

    my $var1 = "abel";
    my $var2 = "baker";
    my $var3 = "charlie";

    while (my $line = <DATA>) {
        chomp $line;
        say "Before interpolation: $line";
        say "After interpolation: " . perform_interpolation($line);
        say '';
    }

    sub perform_interpolation {
        my $text = shift;
        # now what?
    }

    __DATA__
    I'd like to see this one: $var1.
    How about \$var3?
    You shouldn't interpolate \$var3 (but it would be nice if you'd remove the backslash)
    Try a concatenation: $var1$var2
    I've seen String::Interpolate mentioned in my searches, but unless I'm misunderstanding something, I'd have to know in advance what variables I'd be interpolating

    What I'm trying to do is create a program template where a comment block right after the shebang line (possibly containing scalar variables like $program or other constants or environment variables) can be rendered to produce a usage statement.

    The perform_interpolation() subroutine would be part of the template and wouldn't know specifically what variables the programmer might want to interpolate for the usage statement.

    Can String::Interpolate do this? Or is there a simpler way?
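
    A simpler route, sketched here under one assumption that differs from the code above: the values live in a hash passed to perform_interpolation() instead of in separate lexical variables. A single substitution then handles plain names, concatenated names, and the escaped \$name form. The helper _expand is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper: decide what one matched "$name" (optionally
# preceded by a backslash) should become.
sub _expand {
    my ($escape, $name, $vars) = @_;
    return '$' . $name if $escape;                      # \$name -> literal $name
    return exists $vars->{$name} ? $vars->{$name}       # known name -> its value
                                 : '$' . $name;         # unknown name left as-is
}

# Interpolate $name placeholders in $text from the hash reference $vars.
sub perform_interpolation {
    my ($text, $vars) = @_;
    $text =~ s/(\\?)\$(\w+)/_expand($1, $2, $vars)/ge;
    return $text;
}

my %vars = (var1 => 'abel', var2 => 'baker', var3 => 'charlie');
for my $line (
    q{I'd like to see this one: $var1.},
    q{You shouldn't interpolate \$var3},
    q{Try a concatenation: $var1$var2},
) {
    print perform_interpolation($line, \%vars), "\n";
}
```

    Keeping the values in a hash avoids running string eval on template text, at the cost of requiring the template author to register each variable.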

Win32::GUI and threads issue
3 direct replies — Read more / Contribute
by Garden Dwarf
on Jan 15, 2019 at 05:20

    Hello Monks!

    I am trying to create a Win32 application (with Strawberry Perl (v5.14.4) on Win10) displaying graphic computations. In order to optimize the process, I want to divide the management of my virtual buffer into small pieces computed by individual threads, then compile the results and copy the virtual buffer on the screen.

    My problem is the combination of Win32::GUI and threads (I have also tried forkmanager without success). Threads without Win32 is ok, Win32 without threads is ok, but using both is not. Here is a simple sample code to illustrate the issue (you can enable/disable the use of Win32 with the variable $use_win or change the amount of threads with the variable $t_amount):

    #!/bin/perl
    use Win32::GUI();
    use threads;
    use strict;
    use warnings;
    use Data::Dumper;

    my $use_win=1;  # Create Win32 GUI interface (1) or not (0)
    my $t_amount=4; # Amount of threads to create
    my $textbox;
    my $win;
    my $draw;

    if($use_win){
      # Initialize window
      $win=new Win32::GUI::Window(
        -left => 0,
        -top => 0,
        -width => 300,
        -height => 300,
        -name => "Window",
        -text => "Test",
      );
      $win->InvalidateRect(1);
      $textbox=$win->AddTextfield(
        -name => "Output",
        -left => 5,
        -top => 5,
        -width => 275,
        -height => 255,
        -text => "");
      # Start application
      $draw=$win->AddTimer('draw',1000);
      $win->Show();
      Win32::GUI::Dialog();
    }else{
      draw_Timer();
    }

    sub Window_Terminate{-1}

    sub draw_Timer{
      my @threads;
      my @ret;
      my $c;
      my $d;
      # Assign range of computation to fork processes
      foreach $c(1..$t_amount){
        $d=$c-1;
        push(@threads,threads->new(\&draw,($d*10),($d*10+10)));
      }
      foreach my $thread(@threads){
        @ret=$thread->join;
        foreach my $data(@ret){
          $use_win?$textbox->Append("|".$data):print"|".$data;
        }
        $use_win?$textbox->Append("\n"):print"\n";
      }
    }

    sub draw{
      my $begin=shift;
      my $end=shift;
      my @tbl;
      my $cpt;
      for($cpt=$begin;$cpt<$end;$cpt++){
        push(@tbl,$cpt);
      }
      return(@tbl);
    }

    Any help would be welcome. I have searched for previous posts without finding a solution. I also googled with no luck. Thanks in advance!

Case where '( shift @_ )[ 0, 0 ]' returns only one value?
5 direct replies — Read more / Contribute
by rsFalse
on Jan 14, 2019 at 15:58
Dbic and inflating Oracle DATE columns - solved
1 direct reply — Read more / Contribute
by Ea
on Jan 14, 2019 at 05:27
    Newbie trying to get DBIx::Class to work with Oracle DATE columns. Managed to get my error to go away, so I'm putting this out there for expert comment and for those newbies searching on the same problem.

    I used dbicdump with components=["InflateColumn::DateTime"] on an Oracle schema which produced a Result class which has

    __PACKAGE__->load_components("InflateColumn::DateTime");
    ...
    __PACKAGE__->add_columns(
      "start_time",
      {
        data_type   => "datetime",
        is_nullable => 1,
        original    => { data_type => "date" },
      },
    );
    but after searching with dbic and getting a resultset
    $resultset = $schema->resultset('Ical')
        ->search({ uid => $id });
    calling $resultset->start_time would give me

    DBIx::Class::InflateColumn::DateTime::catch {...} (): Error while inflating '24-APR-18' for start_time on Timetable::Schema::Result::Ical: Invalid date format: 24-APR-18.

    There are lots of hints in the documentation, but nothing explicit on how to avoid this error.

    I found that the error went away when I added the on_connect_call option to the connect method.

    my $schema = Timetable::Schema->connect("dbi:Oracle:$schema_name",
        $db_username,
        $db_password,
        {on_connect_call => 'datetime_setup'}
    );

    Just thought I'd get it down while it was fresh in my mind and I'll update the post when I know more about what I've done.

    ta!

    Edit

    The more documentation I read, the more I think I did the Right Thing.
  • How to connect
  • How to customize InflateColumn (which I didn't need)

    Ea

    Sometimes I can think of 6 impossible LDAP attributes before breakfast.

    Mojoconf was great!

Broken cpan shell
1 direct reply — Read more / Contribute
by stangoesagain
on Jan 13, 2019 at 23:53
    Respectful obeisances, My Perl installation got updated to 5.28.1 by my OS (openSUSE) and ever since then some of my scripts complain about missing some of my modules and cpan interactive shell quits with complaints about lock files and segmentation faults. I can usually install missing modules using cpanminus and proceed doing whatever I was doing but I still need to fix cpan. As of now, I get this (edit: corrected misspelled command in original message):
    stan@linux-pwfe:~> sudo perl -d -MCPAN -e shell
    [sudo] password for root:

    Loading DB routines from perl5db.pl version 1.53
    Editor support available.

    Enter h or 'h h' for help, or 'man perldebug' for more help.

    main::(-e:1): shell
    *** buffer overflow detected ***: perl terminated
    Aborted
    stan@linux-pwfe:~>
    Cpan itself responds to "sudo cpan -h" and "sudo cpan -v" and it even updated cpan itself via "sudo cpan -i CPAN". It can't find and install missing "CAN.pm" this way, though. Edit - another example:
    stan@linux-pwfe:~> sudo cpan
    [sudo] password for root:
    Loading internal logger. Log::Log4perl recommended for better logging
    There seems to be running another CPAN process (pid 8012). Contacting...
    Other job not responding. Shall I overwrite the lockfile '/root/.cpan/.lock'? (Y/n) [y] y

    cpan shell -- CPAN exploration and modules installation (v2.22)
    Enter 'h' for help.

    Segmentation fault
    stan@linux-pwfe:~> sudo rm /root/.cpan/.lock
    stan@linux-pwfe:~> sudo cpan
    Loading internal logger. Log::Log4perl recommended for better logging

    cpan shell -- CPAN exploration and modules installation (v2.22)
    Enter 'h' for help.

    *** buffer overflow detected ***: /usr/bin/perl terminated
    Aborted
    stan@linux-pwfe:~>
New Monk Discussion
Solved: forced preview in user settings
1 direct reply — Read more / Contribute
by LanX
on Jan 17, 2019 at 17:13
    Hi

    I had a conversation with rsFalse about one of his empty posts and he told me that he doesn't get a preview button.

    So I suggested unchecking "No Forced Preview" in User Settings, but he told me that it has no effect whatsoever.

    I got curious and tried it out and it really doesn't seem to change anything. *

    I'm not sure if this is related to his case and probably he already found another solution, but it seemed necessary to report that this feature doesn't seem to work.

    UPDATE Solved

    *) Sorry, pryrt clarified it for me.

    I wrongly expected the preview button to disappear.

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery
    FootballPerl is like chess, only without the dice

As of 2019-01-19 00:38 GMT