The Monastery Gates

poll ideas quest 2022
Starts at: Jan 01, 2022 at 00:00
Ends at: Dec 31, 2022 at 23:59
Current Status: Active
5 replies by pollsters
    First, read How do I create a Poll?. Then suggest your poll here. Complete ideas are more likely to be used.

    Note that links may be used in choices but not in the title.

Perl News
StackOverflow blog: This is not your grandfather's Perl
on Sep 12, 2022 at 15:57
2 replies by mr_mischief
StackOverflow blog: Why Perl is still relevant in 2022
on Jul 07, 2022 at 11:28
3 replies by NetWallah
    Girish Venkatachalam has blogged "Why Perl is still relevant in 2022" on July 6, 2022.

    No new info there - it is interesting only because it purports to be positive for perl, is published on SO, and showed up on my Google news feed.

    The author seems to have somewhat dated knowledge of perl and no knowledge of raku.

                    "These opinions are my own, though for a small fee they be yours too."

Need to speed up many regex substitutions and somehow make them a here-doc list
8 direct replies — Read more / Contribute
by xnous
on Oct 01, 2022 at 16:22

    Hello, monks. I've got a large number of text files (thousands to millions at a time, up to a couple of MB each) to which I need to make a lot of substitutions (more than 150 to each). For quite some time I've been using a bash script which takes around 1 minute per 2000 files but, having used Perl long ago, I decided to rewrite the script in Perl, hoping it would improve things considerably. However, using the standard (for my poor Perl skills at least) method of "open file; slurp it; loop through 150 substitutions" proved abysmally slower than bash/sed. Splitting the input down to 1 word at a time sped things up, but it was still 60-70% slower than bash. Combining the regexes into one large sequence (s/^[0-9].*\s//m|s/\S*?talk\S*\s/ talk /gi...) didn't help either, as the interpreter probably optimizes them anyway. So, the problem is twofold:

    1. Speed. For context, most substitutions turn gerunds and past tenses of select verbs into infinitives, trim out numbers or convert plural to singular... nothing too fancy, no backreferences or grouping.

    2. I need to change the regex list often and a long string as shown above is hard to maintain. Ideally, I want to use a here-doc to list my substitutions, but I can't find a way to tell Perl how to use the resulting string in both the match and substitution parts of s///. If all else fails, I can split the regex into match/sub pairs as a workaround but I'm pretty sure there's a more elegant way to do it.

    I'd appreciate your wisdom on the matters, the snippet is to show how I'd prefer #2 to be implemented. Thank you.

    #!/usr/bin/perl
    use strict;
    use warnings;

    my @text = split /\n/, << 'TEXT';
    Regular expressions have the undeserved reputation of being abstract and difficult to understand.
    TEXT

    my @regexlist = split /\n/, << 'REGEX';
    s/a/A/g
    s/i/I/g
    s/e/E/g
    REGEX

    my $regex = join '|', @regexlist;

    while (<@text>) {
        // apply $regexes somehow, the fastest way possible;
    }

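    One way to keep the rule list in a here-doc, sketched under assumptions: instead of full s/// statements, each line holds a pattern and a literal replacement separated by "=>" (a made-up format, not something from the post). Each pattern is compiled once with qr// and the rules are applied in order. Flags such as /i or /m would have to be written as inline modifiers, e.g. (?i), and replacements cannot use captures in this form. Whether this beats sed on the real data would need benchmarking.

```perl
use strict;
use warnings;

# Hypothetical rule format: "pattern => replacement", one rule per line.
# Each pattern is compiled once, up front, rather than on every use.
my @rules = map {
    my ($pattern, $replacement) = split /\s*=>\s*/, $_, 2;
    [ qr/$pattern/, $replacement // '' ]
} grep { /\S/ } split /\n/, <<'RULES';
a => A
i => I
e => E
RULES

# Apply every rule, in list order, to a copy of the text.
sub apply_rules {
    my ($text) = @_;
    for my $rule (@rules) {
        my ($re, $replacement) = @$rule;
        $text =~ s/$re/$replacement/g;
    }
    return $text;
}

print apply_rules("Regular expressions"), "\n";   # prints "REgulAr ExprEssIons"
```

    Because the rules run one after another, a later rule sees the output of an earlier one, which matches the behaviour of a chain of sed expressions.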
Modulino to report ip address changes
3 direct replies — Read more / Contribute
by davies
on Oct 01, 2022 at 09:04

    I have written such a modulino. It keeps a log of public facing IP addresses and sends messages when the addresses change. It can also validate IP addresses against public DNS. I don't think it is useful for businesses, but for people who run their own home servers (like me), there is a use case, even for static IP addresses - mine is not guaranteed. I am minded to publish it on CPAN, subject to feedback here. Is this something it would be useful to publish? If so:

  • Is App::ipchange a sane name?
  • The modulino currently has a .pm extension, allowing it to be used by test scripts. Should this be removed? If so, when and how? It's an aspect of CPAN with which I am unfamiliar & can find no docs.

    Regards,

    John Davies

Using Perl for directory listing
6 direct replies — Read more / Contribute
by htmanning
on Sep 30, 2022 at 02:03

    I have a large directory called registrations with a folder for each user who has registered. I need to be able to look into each of these folders via a browser, but I have restricted indexes. I thought I could use Perl to do it. I'm using File::Find::Rule like this:

    use File::Find::Rule;
    my $base_dir = "/usr/home/username/public_html/registrations/$dir";
    my $find_rule = File::Find::Rule->new;
    foreach $item (@files) {
        $item =~ s/\/usr\/home\/username\/public_html\/registrations\/$dir\///g;
        print "<a href=\"$dir/$item\">$item</a><br>";
    }

    I feed it the directory via the URL and it works, but it does not show file sizes or any other info besides the filename. Is there another module I should be using or a better way to do it?
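    A minimal sketch of one way to get sizes and dates, assuming core modules only (opendir plus stat) rather than File::Find::Rule. The sub name list_files and the plain-text output are illustrative choices; HTML escaping and the link markup from the post are left out.

```perl
use strict;
use warnings;
use File::Spec;
use POSIX qw(strftime);

# Return one hashref per plain file in $dir: name, size in bytes, and
# last-modified time. Dot files and subdirectories are skipped.
sub list_files {
    my ($dir) = @_;
    opendir my $dh, $dir or die "Cannot open $dir: $!";
    my @names = grep { !/^\./ } readdir $dh;
    closedir $dh;

    my @entries;
    for my $name (sort @names) {
        my $path = File::Spec->catfile($dir, $name);
        next unless -f $path;
        my @st = stat $path;
        push @entries, {
            name  => $name,
            size  => $st[7],    # element 7 of stat is the size in bytes
            mtime => strftime('%Y-%m-%d %H:%M', localtime $st[9]),
        };
    }
    return @entries;
}

# Directory taken from the command line here; defaults to the current one.
my $dir = @ARGV ? $ARGV[0] : '.';
for my $file (list_files($dir)) {
    printf "%s  %d bytes  %s\n", $file->{name}, $file->{size}, $file->{mtime};
}
```

    In a CGI setting the directory name arriving via the URL would need validating before use, which this sketch does not attempt.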


Strawberry Version
3 direct replies — Read more / Contribute
by ephemeralx
on Sep 29, 2022 at 17:24
    I have never thought too much about Perl versions. New machine? Just install the latest ActivePerl version and old code will be compatible. I recently stopped using ActivePerl and installed Strawberry Perl. I am pretty sure I am running into issues caused by moving to the earlier Perl version that Strawberry ships. Is there any general advice or how-to documentation related to this?
XML Node Values Based On Attributes
2 direct replies — Read more / Contribute
by gpjahn
on Sep 29, 2022 at 14:24


    I've been playing with, and trying to learn, XML::LibXML. I've been trying to get the values from specific nodes based on one of its attributes. I can get and print "all" the node values, and I can get and print just the specific attribute values, but I can't seem to figure out how to print the node values of nodes with specific attributes.

    Here is the xml I'm using. I'd like to grab the values from the <Data> nodes, but only the ones with the "name" attribute that equals "NAME_2".

    <xml>
      <Document>
        <Name>Places To Visit</Name>
        <Folder>
          <Place>
            <Name>Location 1</Name>
            <ExtendedName>
              <Data name="NAME_0">United States</Data>
              <Data name="NAME_1">Utah</Data>
              <Data name="NAME_2">Salt Lake City</Data>
            </ExtendedName>
          </Place>
          <Place>
            <Name>Location 2</Name>
            <ExtendedName>
              <Data name="NAME_0">United States</Data>
              <Data name="NAME_1">Rhode Island</Data>
              <Data name="NAME_2">Providence</Data>
            </ExtendedName>
          </Place>
          <Place>
            <Name>Location 3</Name>
            <ExtendedName>
              <Data name="NAME_0">United States</Data>
              <Data name="NAME_1">Wisconsin</Data>
              <Data name="NAME_2">Green Bay</Data>
            </ExtendedName>
          </Place>
          <Place>
            <Name>Location 4</Name>
            <ExtendedName>
              <Data name="NAME_0">United States</Data>
              <Data name="NAME_1">Wyoming</Data>
              <Data name="NAME_2">Casper</Data>
            </ExtendedName>
          </Place>
        </Folder>
      </Document>
    </xml>

    Here is the small bit of code that I'm using to pull out the contents of the attributes, but I don't know how to get the node values. I hope that makes sense.

    Thank you for your help!

    #!/usr/bin/perl
    use strict;
    use warnings;
    use XML::LibXML;

    my $xml = XML::LibXML->load_xml(location => 'locations.xml');

    foreach my $node2 ($xml->findnodes('//Place')) {
        my $attr;
        foreach( $node2->findnodes('./ExtendedName/Data/@name') ) {
            $attr = $_->textContent();
            if($attr eq "NAME_2" ) {
                print $_->textContent() . "\n";
            }
        }
    }

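    A possible approach, sketched with a shortened copy of the document above: the XPath can select the <Data> elements themselves and filter them with an attribute predicate, so textContent returns the node value rather than the attribute value. The sub name name2_values is made up for illustration.

```perl
use strict;
use warnings;
use XML::LibXML;

# Shortened inline copy of the sample document (two places only).
my $xml = XML::LibXML->load_xml(string => <<'XML');
<xml>
  <Document>
    <Folder>
      <Place>
        <Name>Location 1</Name>
        <ExtendedName>
          <Data name="NAME_1">Utah</Data>
          <Data name="NAME_2">Salt Lake City</Data>
        </ExtendedName>
      </Place>
      <Place>
        <Name>Location 2</Name>
        <ExtendedName>
          <Data name="NAME_1">Rhode Island</Data>
          <Data name="NAME_2">Providence</Data>
        </ExtendedName>
      </Place>
    </Folder>
  </Document>
</xml>
XML

# Select the Data elements whose name attribute equals NAME_2 and
# return their text content.
sub name2_values {
    my ($doc) = @_;
    return map { $_->textContent }
           $doc->findnodes('//Place/ExtendedName/Data[@name="NAME_2"]');
}

print "$_\n" for name2_values($xml);
```

    The original loop selected the attribute nodes (Data/@name), so textContent on those could only ever return the attribute value.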
How to Pass more than one file in perl MY function
5 direct replies — Read more / Contribute
by prad001
on Sep 29, 2022 at 11:04
    Hi Team,

    I am new to Perl and was wondering if you guys can help me with passing more than one file in the code below:

    my @files=<data/j*.*.txt>;
    if (@ARGV) {
        my $test=$ARGV[0];
        $test=lc($test);
        print "Using $test instead\n";
        @files=</data/$test*.*.txt>;
        print "Found @files instead\n";
    }
    my $outfile='/data/w_c.txt';
    my $lotfile='/data/completed.txt';
    if (-e $outfile) {
        unlink $outfile;
    }

    In the above code, (my @files=<data/j*.*.txt>;) currently picks up all the files starting with j*.*, but I would like to pass only the files below:

    How could I pass the list of files?

    Thank you,
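    One hedged sketch, assuming the wanted files are simply named on the command line: use the arguments as the file list when any are given, and fall back to the glob otherwise. The sub name pick_files is illustrative, and since the post's own list of files is not shown, none is assumed here.

```perl
use strict;
use warnings;

# Use the explicitly named files when there are any (keeping only those
# that exist as plain files); otherwise fall back to the default glob.
sub pick_files {
    my ($default_glob, @args) = @_;
    return @args ? (grep { -f $_ } @args) : glob($default_glob);
}

my @files = pick_files('data/j*.*.txt', @ARGV);
print "Using: @files\n";
```

    It could then be called as, for example, perl script.pl file1.txt file2.txt, with hypothetical file names.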

Installing HTML::Tidy
1 direct reply — Read more / Contribute
by bliako
on Sep 27, 2022 at 12:39

    HTML::Tidy depends on a C library, tidyp, which is a fork, by the author, of libtidy. Compiling the source from the repository is not straightforward, as a configure script is missing and, additionally, the provided INSTALL instructions contain this warning:

    If you do NOT have a ./configure program, then you are working from the source repository, not the tarball. Please get a release tarball from

    But the link is dead.

    What worked for me (in Linux) was to bootstrap configure by using the autotools mantra:

    libtoolize --ltdl --copy --force && aclocal && automake --add-missing --copy && autoconf

    Then also shush the beast by adding these missing files: touch AUTHORS NEWS

    And finally ./configure && make all && make install will hopefully install this dependency.

    Warning: serious cargo-culting above.

    Note both packages mentioned above are read-only and I could not find a way to post this comment there where it belongs.

    bw, bliako

file open with variables
3 direct replies — Read more / Contribute
by Anonymous Monk
on Sep 27, 2022 at 12:16
    Hello, I am a newbie in Perl and seeking help with opening a file using variables. In the code below, I'd like to use the variables X_info, Y_info and Z_info in the file open line, so that I only need to change the variable contents to open a different file. Can someone help with this? I can't figure it out. Thanks, Steve

    use strict;
    my X_info = 3;
    my Y_info = -4;
    my Z_info = 5;
    ###### want to replace 3 with X_info, -4 with Y_info, 5 with Z_info ##################################
    open(DLOG, '<' 'D:\PROJ\N123_X3\dataInfo_X-4_Y5_decode.csv') or die "we have a problem: $!";
    print "It Works.\n"
    close (DLOG);

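    A minimal sketch of one way to do this, assuming the three numbers go where the comment in the post says: scalar variables need a $ sigil, and the path can be assembled with sprintf from a single-quoted template so the backslashes stay literal. The sub name data_path is made up, and the sketch warns instead of dying so it can run on a machine without that file.

```perl
use strict;
use warnings;

my $X_info = 3;
my $Y_info = -4;
my $Z_info = 5;

# Build the path from the three values. Single quotes keep the
# backslashes literal; %d is filled in by sprintf.
sub data_path {
    my ($x, $y, $z) = @_;
    return sprintf 'D:\PROJ\N123_X%d\dataInfo_X%d_Y%d_decode.csv', $x, $y, $z;
}

my $path = data_path($X_info, $Y_info, $Z_info);

# Three-argument open with a lexical filehandle.
if (open my $dlog, '<', $path) {
    print "It works.\n";
    close $dlog;
}
else {
    warn "Cannot open $path: $!\n";
}
```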
CSS mods for the new metacpan layout
No replies — Read more | Post response
by hippo
on Sep 30, 2022 at 11:14

    You've probably seen the new styling of MetaCPAN by now. One anonymous monk is less than enthralled. It's not all bad, IMHO and will hopefully improve over time. Meanwhile here is the little snippet of userContent.css which I've put together today to restore a little sanity.

    @-moz-document url-prefix( {
        {
            grid-template-columns: 200px calc(100vw - 200px) !important;
        }
        ul.nav-list {
            padding: 10px !important;
            width: 200px !important
        }
        ul.nav-list>li a, ul.dependencies>li>a {
            color: #337ab7 !important
        }
        div.content { padding: 20px !important }
        #index-container { margin-left: 20px !important }
    }

    This will:

    • Reduce the left nav from 300 to 200 pixels in width and reduce the padding on the main content so there is not so much wasted space (Don't ask me why they've gone with a fixed pixel width here to begin with)
    • Re-enable the different styling (colour) of links in the nav so you can tell what is a link and what is just info once again

    We'll see how much this needs tweaking over the next little while but at least if you are interested in this it saves us all reverse-engineering it independently.

    The new layout proposal and discussion is in the issues here.


LWP::UserAgent Client-Warning 500 against HTTP standards?
3 direct replies — Read more / Contribute
by Discipulus
on Sep 30, 2022 at 03:35
    Hello community,

    Our halls being so quiet these days, I'm lazily inviting you to meditate on the behaviour of LWP::UserAgent, which returns 500 when LWP can't connect to some URL or when other failures in protocol handlers occur.

    Does this break the HTTP specification? Whether or not you have ever glanced at the current RFC, you should know that all 5** status codes are server-side.

    The LWP documentation is very clear on this:

    > There will still be a response object returned when LWP can't connect to the server specified in the URL or when other failures in protocol handlers occur. These internal responses use the standard HTTP status codes, so the responses can't be differentiated by testing the response status code alone. Error responses that LWP generates internally will have the "Client-Warning" header set to the value "Internal response". If you need to differentiate these internal responses from responses that a remote server actually generates, you need to test this header value.


    use strict;
    use warnings;
    use LWP::UserAgent;

    my $ua = LWP::UserAgent->new();

    for my $url ( qw( ){
        print "\nGET $url\n";
        my $res = $ua->get( $url );
        # ..yes you can $res->status_line to have both combined
        print "code :\t", $res->code, "\n";
        print "message :\t", $res->message, "\n";
        print "Client-Warning header:\t", $res->header( "Client-Warning" ), "\n";
    }

    __END__
    GET
    code : 200
    message : OK
    Client-Warning header:

    GET
    code : 500
    message : Can't connect to
    Client-Warning header: Internal response
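    The header test described in the documentation quoted above could be wrapped in a small helper. This is only a sketch: the sub name is_internal_response is made up, and the two responses are built by hand with HTTP::Response so that no network access is needed.

```perl
use strict;
use warnings;
use HTTP::Response;

# True when the response was generated inside LWP rather than received
# from a remote server, going by the Client-Warning header.
sub is_internal_response {
    my ($res) = @_;
    my $warning = $res->header('Client-Warning') // '';
    return $warning eq 'Internal response';
}

# Hand-built stand-ins for the two cases.
my $local = HTTP::Response->new(500, "Can't connect");
$local->header('Client-Warning' => 'Internal response');
my $remote = HTTP::Response->new(500, 'Internal Server Error');

for my $res ($local, $remote) {
    printf "%s: %s\n", $res->status_line,
        is_internal_response($res) ? 'generated by LWP' : 'sent by the server';
}
```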

    The message returned is already very clear: Can't connect.. is obviously client side. So why the choice of an error of the 5** class?

    In the chat LanX suggested 418 I'm a teapot, which is fun and new to me, but not usable: teapots are reserved by IANA :)

    The 4** class defines status codes 401-418 plus 421, 422 and 426, so there is room for something like: 419 - Can't connect

    See also other status numbers used to craft a HTTP::Response

    So (and I don't want to blame the LWP authors) why did they choose to return 500, setting a header internally to disambiguate it?

    What do other frameworks do? Quickly trying Mojo::UserAgent, I see it uses its own Mojo::Message::Response and does not return any status code for nonexistent URLs:

    use strict;
    use warnings;
    use Mojo::UserAgent;

    my $ua = Mojo::UserAgent->new;

    for my $url ( qw( ){
        print "\nGET $url\n";
        my $res = $ua->get( $url )->result;
        print "code :\t", $res->code, "\n";
        print "message :\t", $res->message, "\n";
        #print "Client-Warning header:\t", $res->header( "Client-Warning" ), "\n";
    }

    __END__
    GET
    code : 200
    message : OK

    GET
    Can't connect: Host unknown. at line 10.

    ..and this error is defined in Mojo::IOLoop::Client. It seems to me a better design, but... wait, this is a die behaviour! If you switch the URLs in the above code you never reach the second GET.
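    If the loop should carry on after a failed connection, one option is to trap the exception with eval. The sketch below uses a made-up helper, try_request, that takes any code reference, so it runs without Mojolicious or a network; the dying sub stands in for a call such as $ua->get($url)->result.

```perl
use strict;
use warnings;

# Run a request callback and return (result, undef) on success or
# (undef, error message) if the callback died.
sub try_request {
    my ($fetch) = @_;
    my $res = eval { $fetch->() };
    return (undef, $@ || 'unknown error') unless defined $res;
    return ($res, undef);
}

# Stand-in for a request that cannot connect.
my ($res, $err) = try_request(sub { die "Can't connect: Host unknown.\n" });
print defined $res ? "got a response\n" : "request failed: $err";

# With Mojo::UserAgent the call might look like this (not run here):
#   my ($res, $err) = try_request(sub { $ua->get($url)->result });
```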

    On the other hand, curl tells us it is unable to resolve the URL:

    curl -I
    curl: (6) Could not resolve host:

    ..and it is right.

    What do you think? What other frameworks did I miss?

    It's 200 if you post, 203 too, but no 204 will be accepted! :)


    There are no rules, there are no thumbs..
    Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.