PerlMonks  

The Monastery Gates

  • (Sep 10, 2018 at 22:53 UTC) Welcome new users!
If you're new here please read PerlMonks FAQ
and Create a new user.

New Questions
First Web Crawl Task
No replies — Read more | Post response
by bennierounder
on Sep 20, 2018 at 18:28

    Hi guys,

    I'm very frustrated with this code

    #!/usr/bin/perl -w
    # a simple web crawler
    use strict;
    use LWP::Simple;

    my $url = shift || die 'Please provide an initial url after filename!';
    my $max = 10;

    my $html = get($url);

    my @urls;
    while ($url =~ s/(https:\/\/\S+)[">]//) {
        push @urls, $1;
        print @urls;
    }

    mkdir "web" , 0755;
    open (URLMAP, ">", "web/url.map" ) || die ("can't open web\/url.map\n");

    my $count = 0;
    for (my $i=0; $i<$max; $i++) {
        my $source = $urls[int(rand($#urls+1))];
        getstore($source, 'web/$count.html');
        print URLMAP "$count\n$source\n";
        $count++;
    }
    close URLMAP;

    I run the script, perl web_crawl.pl https://www.money.co.uk and I get this!

    perl web_crawl.pl https://www.google.com
    Use of uninitialized value $source in concatenation (.) or string at web_crawl.pl line 27.
    Use of uninitialized value $source in concatenation (.) or string at web_crawl.pl line 27.
    Use of uninitialized value $source in concatenation (.) or string at web_crawl.pl line 27.
    Use of uninitialized value $source in concatenation (.) or string at web_crawl.pl line 27.
    Use of uninitialized value $source in concatenation (.) or string at web_crawl.pl line 27.
    Use of uninitialized value $source in concatenation (.) or string at web_crawl.pl line 27.
    Use of uninitialized value $source in concatenation (.) or string at web_crawl.pl line 27.
    Use of uninitialized value $source in concatenation (.) or string at web_crawl.pl line 27.
    Use of uninitialized value $source in concatenation (.) or string at web_crawl.pl line 27.
    Use of uninitialized value $source in concatenation (.) or string at web_crawl.pl line 27.

    I'm trying to eventually get the prices and company names, so for example for this part of the site https://www.money.co.uk/travel-money/japanese-yen-exchange-rate.htm I want to get the prices on offer into an array in order (highest first), maybe keeping a note of the company name so may need a hash or array of hashes.
    That's the end goal, but I'm stuck at the first hurdle: saving the site's HTML into files that I can search for the prices and then extract them from. If you can think of a better way and point me in the right direction, I'm all ears! Thanks in advance!

    Please help!
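
    A possible reading of the warnings above: the loop runs its substitution against $url (the start address) rather than $html (the page content), so @urls stays empty and $source is undefined. Separately, 'web/$count.html' is single-quoted, so $count is never interpolated. The sketch below shows both points in isolation. The helper name extract_urls and the sample HTML are made up for illustration, and the network calls are left as comments so that the block runs without LWP::Simple.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original post):
# collect every https:// link found in a chunk of HTML.
sub extract_urls {
    my ($html) = @_;
    my @urls;
    # Match against the page content, and stop each link at
    # whitespace, a quote, or an angle bracket.
    while ($html =~ m{(https://[^\s"'<>]+)}g) {
        push @urls, $1;
    }
    return @urls;
}

# Stand-in for: my $html = get($url);   (LWP::Simple)
my $html = '<a href="https://example.com/a">A</a> '
         . '<a href="https://example.com/b">B</a>';

my @urls = extract_urls($html);
die "no links found\n" unless @urls;

my $count = 0;
for my $source (@urls) {
    # Double quotes, so that $count is interpolated into the name.
    my $file = "web/$count.html";
    print "$source -> $file\n";   # getstore($source, $file) would go here
    $count++;
}
```

    For pulling prices out of real pages, a proper HTML parser from CPAN is usually more robust than a regex; the regex here only mirrors the approach already in the post.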

returning to the outer loop
4 direct replies — Read more / Contribute
by Anonymous Monk
on Sep 20, 2018 at 17:03

    This is greatly simplified - I hope it makes sense. :)

    I am trying to loop over rows of an array doing a test. If the test fails, I want to go all the way back to the outer loop, which the documentation suggests will work only with foreach.

    #!/usr/bin/perl -w
    use strict;

    my @x = (
        ['aaaaa', 'bbbbb', 'ccccc', 'ddddd',],
        ['eeeee', 'fffff', 'ggggg', 'hhhhh',],
        ['iiiii', 'jjjjj', 'kkkkk', 'lllll',],
    );

    for my $i (1 .. 1000) {
        for my $a (@x) {
            my $ifails = 0;
            for my $j (0 .. (scalar @$a) - 1 ) {
                # <get external data string for pattern matching here, put in $c>
                my $c = '';    # placeholder for the external data string
                if ($c =~ /$a->[$j]/) {
                    $ifails++;
                }
            }
            if ($ifails > 1) {
                # want to go to outer loop here, and not process
                # the next (and subsequent) row(s)
            }
        }
    }

    Is there some way of doing that? Note that in the actual application, I do have more processing below the inner loop, so a "last" statement at the test doesn't work.
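
    One common way to do this is a loop label with "next LABEL". In Perl, "for" and "foreach" are synonyms, so the label works with either spelling. The sketch below is a minimal illustration: the @data array stands in for the external data strings, and the @log array exists only to show which rows were processed. Both are assumptions, not part of the original code.

```perl
use strict;
use warnings;

my @x = (
    ['aaaaa', 'bbbbb', 'ccccc', 'ddddd'],
    ['eeeee', 'fffff', 'ggggg', 'hhhhh'],
    ['iiiii', 'jjjjj', 'kkkkk', 'lllll'],
);

# Hypothetical stand-in for the external data strings.
my @data = ('aaaaa bbbbb', 'eeeee only');

my @log;
OUTER:
for my $i (0 .. $#data) {
    my $c = $data[$i];
    for my $row (@x) {
        my $ifails = 0;
        for my $pattern (@$row) {
            $ifails++ if $c =~ /\Q$pattern\E/;
        }
        if ($ifails > 1) {
            push @log, "i=$i: skipped remaining rows";
            next OUTER;    # jump straight to the next $i
        }
        # Further per-row processing would go here.
        push @log, "i=$i: processed row starting with $row->[0]";
    }
}
print "$_\n" for @log;
```

    Here the first data string matches two patterns in the first row, so the remaining rows are skipped for that iteration; the second string matches at most one pattern per row, so all three rows are processed.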

    Thank you in advance.

Replace strings in text file
2 direct replies — Read more / Contribute
by TonyNY
on Sep 20, 2018 at 12:28

    Hi,

    I'm trying to replace strings in a text file but cannot get it to work using the following code:

    system("sed -i -e 's/The action failed./failed_build/g' $lookuptxtfile");

    Any help modifying this code or using a better way to accomplish this would be greatly appreciated.
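
    The post does not say how the sed call fails, so the following is only one alternative: doing the replacement in Perl itself, which avoids shell quoting and differences between sed versions. The helper name replace_in_file is hypothetical, and the temporary file stands in for $lookuptxtfile. Note that \Q...\E makes the trailing "." match a literal full stop, whereas in the sed pattern it matches any character.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: replace every literal occurrence of $from
# with $to in the named file, rewriting the file in place.
sub replace_in_file {
    my ($file, $from, $to) = @_;

    open my $in, '<', $file or die "Cannot read $file: $!";
    my @lines = <$in>;
    close $in;

    s/\Q$from\E/$to/g for @lines;

    open my $out, '>', $file or die "Cannot write $file: $!";
    print {$out} @lines;
    close $out or die "Cannot close $file: $!";
    return;
}

# Demo on a temporary file standing in for $lookuptxtfile.
my ($fh, $lookuptxtfile) = tempfile(UNLINK => 1);
print {$fh} "Step 1: The action failed.\nStep 2: ok\n";
close $fh;

replace_in_file($lookuptxtfile, 'The action failed.', 'failed_build');
```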

which GUI toolkit for this task?
3 direct replies — Read more / Contribute
by albert925
on Sep 20, 2018 at 01:26

    Hi, I am looking for a cross-platform GUI toolkit for Perl that can have animatable 2D textures & buttons (animatable position / rotation) , some thing that can work well with card games and board games and look nice.

    Using Perl/TK at the moment but I don't think it is best suited for this task. Any suggestions are welcome. Thanks

Joining array into new string
3 direct replies — Read more / Contribute
by kris1511
on Sep 19, 2018 at 14:42

    I have an array in this format:

    my @groups = [ [ 23 ], [ 22 ] ];
    my $perms = join ',', @groups;
    print Dumper $perms;

    It gives me an array reference as the result: $VAR1 = 'ARRAY(0x7f82c3792848)';

    How can I flatten this kind of array into a string? Thanks!!
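
    Assigning [ ... ] to @groups creates a one-element array whose only element is a reference to the outer array, which is why join prints ARRAY(0x...). A minimal sketch of one way to flatten it, assuming the data is always exactly two levels deep, is to hold the structure in a scalar reference and dereference each inner array with map:

```perl
use strict;
use warnings;

# Keep the nested structure in a scalar reference.
my $groups = [ [ 23 ], [ 22 ] ];

# Dereference each inner array and join the resulting flat list.
my $perms = join ',', map { @$_ } @$groups;

print "$perms\n";    # prints 23,22
```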

In Net::DNS::RR what is the 'can' method?
1 direct reply — Read more / Contribute
by Lotus1
on Sep 19, 2018 at 14:01

    This is from the examples section of Net::DNS :

    use Net::DNS;
    my $res   = Net::DNS::Resolver->new;
    my $reply = $res->search("www.example.com", "A");

    if ($reply) {
        foreach my $rr ($reply->answer) {
            print $rr->address, "\n" if $rr->can("address");
        }
    } else {
        warn "query failed: ", $res->errorstring, "\n";
    }

    The search method in the resolver returns a packet object. Then the 'answer' method of that (from Net::DNS::Packet) "Returns a list of Net::DNS::RR objects representing the answer section of the packet." I've looked through the documentation for Net::DNS::RR and the module source but can't find a description of what the 'can' method is. The example runs without errors while using strict and warnings. Any suggestions?
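
    For context, can is not defined by Net::DNS at all: every blessed object inherits it from Perl's built-in UNIVERSAL class, and it returns a true value (a code reference) when the object has the named method. The sketch below uses two made-up classes, Demo::RR::A and Demo::RR::CNAME, in place of real Net::DNS records so that it runs without the module installed.

```perl
use strict;
use warnings;

# Two stand-in record classes (hypothetical, not part of Net::DNS).
{
    package Demo::RR::A;
    sub new     { my ($class, %args) = @_; return bless {%args}, $class }
    sub address { return $_[0]{address} }
}
{
    package Demo::RR::CNAME;
    sub new   { my ($class, %args) = @_; return bless {%args}, $class }
    sub cname { return $_[0]{cname} }
}

my @answer = (
    Demo::RR::CNAME->new(cname   => 'host.example.com'),
    Demo::RR::A->new(address => '192.0.2.1'),
);

my @addresses;
for my $rr (@answer) {
    # can() comes from UNIVERSAL; it is true only when the
    # object's class provides the named method.
    push @addresses, $rr->address if $rr->can('address');
}
print "$_\n" for @addresses;
```

    Only the record that has an address method contributes to the output, which mirrors how the Net::DNS example skips answer records of other types.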

Perl not recognizing Chinese
3 direct replies — Read more / Contribute
by grsampson
on Sep 19, 2018 at 10:36

    I am trying to use Perl to excerpt lines of Chinese poetry from web pages where they are embedded in lots of HTML. According to my copy of the "Programming Perl" book, any version from 5.6 on should deal with Unicode happily -- the Perl on my Mac is many versions later than that. But when I run the script I've written over one of these web pages, where Chinese graphs ("characters") should be printed out I just see question marks. Odder still, there seem to be exactly three question marks per Chinese graph; so far as I know, Unicode uses two bytes per character.

    I'm not even sure whether this is a Perl question; I am wondering whether Chinese has been encoded on the web page in some way other than via Unicode. But however it has been encoded, my web browser (Firefox) and my text editor (BBEdit) seem to recognise it fine. I am really at a loss as to how to approach this problem.

    I probably should add that my Perl status is probably "intermediate". I have used the language a fair amount, for real tasks rather than just playing, but have never needed to move beyond the core language -- I have never used "pragmas", for instance.

    Any advice much appreciated!
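
    One plausible explanation for seeing exactly three question marks per character: in UTF-8, the most common encoding on the web, most Chinese characters occupy three bytes, and undecoded input is treated as three separate one-byte characters. The sketch below assumes the page is UTF-8 (the post does not say) and uses one hard-coded character, U+8A69, as sample input.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Raw UTF-8 bytes for the single character U+8A69, as they
# would arrive from a file or socket read without decoding.
my $bytes = "\xE8\xA9\xA9";

# Decode the bytes into a Perl character string.
my $chars = decode('UTF-8', $bytes);

printf "%d bytes, %d character\n", length($bytes), length($chars);

# Encode characters back to UTF-8 on output.
binmode STDOUT, ':encoding(UTF-8)';
print "$chars\n";
```

    When reading from a file, opening it with the layer '<:encoding(UTF-8)' performs the same decoding step as the explicit decode call here.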

Truly Isolated Perl
5 direct replies — Read more / Contribute
by mikkoi
on Sep 19, 2018 at 06:57

    For using on a website, I need to create a completely isolated Perl. Not just local::lib isolated module space. My first thought was using plenv install but it installs new Perl executable only under ${PLENV_ROOT}. What I need is to install to ~/public_html (or similar), e.g. ~/public_html/prog/env/bin/perl. And this new Perl executable would use ~/public_html/prog/env/lib as its @INC, and only that!

    The web server runs on a different machine, and for security reasons home directories are not mounted there, whereas the web page directories are mounted on the system where I use the shell. Besides this, the web server runs a seriously limited OS and may be updated out of sync with the other system, including its Perl version. So I need to have everything the software needs on the disk with the web page.

    Currently this is not possible with plenv install. But could we make it so? Or is there an existing easy way? Instead of actually manually configuring and building Perl.

Blowing smoke tests
1 direct reply — Read more / Contribute
by nysus
on Sep 18, 2018 at 20:16

    Got a module that is failing many of its smoke tests. Most of the failed tests report they can't find the Getopt::Args module. I must have some kind of misconfiguration in my Dist::Zilla setup. Here's my Makefile.PL:

    $PM = "Perl Monk's";
    $MCF = "Most Clueless Friar Abbot Bishop Pontiff Deacon Curate Priest";
    $nysus = $PM . ' ' . $MCF;
    Click here if you love Perl Monks

ithreads, locks, shared data: is that OK?
1 direct reply — Read more / Contribute
by bliako
on Sep 18, 2018 at 19:11

    I'd like some advice on using Perl's ithreads to do parallel processing and return the results.

    For this particular situation I have one or more input words and a huge dictionary. I search the dictionary for words similar to the input word (given some metric) and return these similar words. I repeat the process with these words as input, and so on, until I reach some depth and am able to create a graph of a few hundred nodes.

    Parallelisation: each thread checks for its input word via a Thread::Queue and when it gets one it calls a function ref which will know where the dictionary is and what to do to get the similar words out. I have included a mock function here which just gets random words as the other one is too long and convoluted.

    Additional to the work-queue, I have three more Thread::Queue objs: to save words-currently-being-processed (so as not to re-process them), for words-done-i.e.-results, and for failures-of-the-distance-function.

    I also have a couple of shared scalars : an integer to contain the total number of items processed and a flag to know if processing has to abort.

    Finally, I have the huge dictionary which is absolutely readonly and I do not know whether I have a choice in just distributing a ref to each thread rather than duplicating it to each.

    The test program works OK. (one needs a dictionary file which linux has at specified path or get 1MB worth from https://gist.githubusercontent.com/wchargin/8927565/raw/d9783627c731268fb2935a731a618aa8e95cf465/words) but I am not really sure I am doing the right things sometimes. For example where I am locking a Thread::Queue or a shared variable. And about passing references to scalars to the threads to keep track of the total number of items each thread processes. It's lovely in C but is it ok here?

    Most importantly, is there a way to avoid duplicating that huge dictionary (without explicitly share()ing each and every entry of the array or hash)?

    I have another question, less important: I tried to use my own SIGINT handler to call stop() on Ctrl-C, but it does not propagate all the way to the threads once the threads have started. I would say it is being completely overridden (by what?).

    Here is the program as a test preamble and the package in one file, many thanks bliako:

(solved) combining perlbrew and cpanm
1 direct reply — Read more / Contribute
by LanX
on Sep 18, 2018 at 12:58

    Hi

    I'm using an Ubuntu VM to test some stuff with different Perl versions plus module versions, and I have already installed perlbrew successfully.

    Unfortunately it seems that I installed and (sudo) used cpanminus before.

    which cpanm shows /usr/bin/cpanm, and I already renamed and recreated .cpanm because it belonged to root.

    Using this instance of cpanm has strange side effects; it looks like the old paths from the system Perl were hardcoded.

    So what's the best way to resolve this?

    • apt-get uninstall cpanminus and reinstall it via cpan?
    • cpan install cpanm and hope that perlbrew gets the paths right?
    Yeah I know, probably I shouldn't use cpanm with System Perl in the future... but ... I didn't expect this VM to be used that long.

    So what's the canonical way to combine cpanminus with perlbrew?

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery
    FootballPerl is like chess, only without the dice

    update

    oops ... ((shame-oticon))

    Found:

    apparently I'm too colorblind to read https://perlbrew.pl/Perlbrew-and-Friends.html properly, already the second phrase says:

    > Also when you read the perlbrew usage documentation, there is a command install-cpanm that installs a standalone executable cpanm to to the same bin directory.

    I'll try later, but i think the issue is resolved now ...

    update

    works as described! ++

SOLVED: Predictable repeatable hash order in CORE
3 direct replies — Read more / Contribute
by VinsWorldcom
on Sep 18, 2018 at 09:55

    UPDATE

    I was so focused on an ordered hash, I didn't read the JSON::PP perldoc to see canonical() gives me what I want and due to the names of my hash keys, alphabetical works just fine. Thank you choroba for the quick response in Re: Predictable repeatable hash order in CORE.

    Original node follows:


    I see Tie::IxHash is not in core, but Tie::Hash is - which is the base class Tie::IxHash uses to create ordered hashes. Is there a way to get ordered hashes (predictable, repeatable order when printed for example) in CORE - other than rolling my own Tie::Hash extension (cause if I'm going that route, I might as well just install Tie::IxHash)?

    Background (in case you're interested):

    I've tested my code with Tie::IxHash and it works fine, so I may have to bite the bullet and teach an "even you can install Perl modules" course for my co-workers.
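
    The canonical() option mentioned in the update can be sketched as follows. JSON::PP ships with core Perl, and canonical makes the encoder emit hash keys in sorted order, so the output is the same on every run. The hash contents here are made up for illustration.

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical data; key order in a plain hash is not predictable.
my %config = (zebra => 1, apple => 2, mango => 3);

# canonical() sorts keys, giving repeatable output.
my $json = JSON::PP->new->canonical->encode(\%config);

print "$json\n";    # prints {"apple":2,"mango":3,"zebra":1}
```

    This gives alphabetical order only, so it fits cases like the one described, where sorted key names happen to be acceptable; preserving insertion order would still need something like Tie::IxHash.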

As of 2018-09-21 07:32 GMT