http://www.perlmonks.org?node_id=479

If you have a question on how to do something in Perl, or you need a Perl solution to an actual real-life problem, or you're unsure why something you've tried just isn't working... then this section is the place to ask.

However, you might consider asking in the chatterbox first (if you're a registered user). The response time tends to be quicker, and if it turns out that the problem/solutions are too much for the cb to handle, the kind monks will be sure to direct you here.

Post a new question!

User Questions
The perl source directory structure
1 direct reply — Read more / Contribute
by syphilis
on Aug 16, 2022 at 08:21
    Hi,

    In the perl source, we find various modules in the 'cpan', 'dist' and 'ext' directories.
    What are the rules that determine which of those 3 directories houses which modules ?

    For example, why is it that POSIX is in the 'ext' directory, but threads is in the 'dist' directory ? (Why not the other way round ? Or why aren't they both in the same directory ?)

    Cheers,
    Rob
Nonrepeating characters in an RE
7 direct replies — Read more / Contribute
by BernieC
on Aug 15, 2022 at 19:32
    I have an odd problem that's hurting my head: I'm trying to construct an RE that will only match if the letter in any position does *NOT* match any other character in the string. I'm constructing this RE with a perl program and building the RE from a template. It is the *template* that says "these letters should be distinct" and then I want to run through a few thousand words to pick out the words that "match".

    For example, my "template" might look like this: "abcdefa" and I already have the code that generates (.)?????\1. I can't figure how to make the "?"s say "these guys all have to be distinct".

How to color the regex captured groups?
4 direct replies — Read more / Contribute
by ovedpo15
on Aug 12, 2022 at 12:32
    Hi Monks!
    I have a array of hashes. Each hash contains a rexes rule. Given a path, I'm trying to iterate over the rules and find the first matching rule.
    I'm trying to add a small feature which will help users to debug (since the rexes are user custom) - I want to mark the groups in the given path. For example:
    # Given path: /a/b/c/d # Given regex: ^/a/b/([^/]*)/([^\/]*) # Output: /a/b/\033[1;31mc\033[0m/\033[1;31md\033[0m
    In this case I got: /a/b/[RED]c[/RED]/[RED]d[/RED].
    The current code:
    foreach $regex_href (@rexes) { %regex = %{$regex_href}; if (@captures = ($path =~ /$regex{'regex'}/)) { # Do logic } }
    The @captures contains the group values that were captured (c and d in the example). I came a cross with the Term::ANSIColor module which can help me color the string without writing the color codes myself.
    So, what would be the best way to create a variable $output that is basically $path but colored given the captured groups? You can assume that there are always at least two groups.
How do I reference repeated capture groups?
4 direct replies — Read more / Contribute
by TIOOWTDI
on Aug 12, 2022 at 10:24
    Esteemed monks

    I stumbled over this 2 year old reddit-question from a user "onion" and am not too convinced about the answers given.

    Is using Regexp::Grammars really the only way to do it? Seems like overkill...

    Suppose I have this regular expression:
    my $re = qr{(\w+)(\s*\d+\s*)*}; How do I get every match matched by the second group?
    Using the regular numeric variables only gets me the last value matched, not the whole list:
    my $re = qr{(\w+)(\s*\d+\s*)*}; my $str = 'a 1 2 3 b 4 5 6'; while ($str =~ /$re/g) { say "$&: $1 $2"; } # output: # a 1 2 3 : a 3 # b 4 5 6: b 6

    How do I get every number that follows a letter in this example, and not just the last one?

    EDIT

    Bonus question:

    How do I do it if I have named groups? I.e. my $re = qr{(?<letter>\w+)(?<digit>\s*\d+\s*)*};

Comparing time strings from a list of HH:MM:SS times
2 direct replies — Read more / Contribute
by slugger415
on Aug 10, 2022 at 14:38

    Hello esteemed PerlMonks,

    I have a list of time strings in the HH:MM:SS format. I want to add each to a variable set with localtime and compare it to a timestamp later in the script. I'm having trouble understanding how to add that HH:MM:SS string to the localtime variable.

    #! /usr/bin/perl use Time::Piece; use strict; my(@times) = ("00:05:21","00:08:05","00:10:33"); my $startTime = localtime(); print "Start: ", $startTime, $/; foreach my $t (@times) { sleep 2; my $newTime = localtime(); my $ss = $startTime + $t; ### this is where I need advice if($ss > $newTime){ print "\$ss is greater.\n"; ### execute some functions here } print "Newtime: ", $newTime, $/; my $diff = $newTime - $startTime; print $diff, $/;

    The $diff part works but not the addition of $t, how do I add that time to it? Obviously I need to convert it to something Time::Piece understands.

    Thank you.

Rename/mkdir with File::Fetch
3 direct replies — Read more / Contribute
by justin423
on Aug 10, 2022 at 11:03
    I posted a question on here before about File:Fetch and got a bunch of great responses, so thank you all... It doesn't seem like this is possible, but can file:Fetch either rename the files to a particular filename, or alternatively create a new directory each time with a pre-determined folder name? The URL's are in format www.example.com/document_id/document.pdf where document id is a unique number provided by the publisher. So all the files to fetch are named document.pdf. So that each successive document doesn't overwrite the previous one, I rename them to document0.pdf, document1.pdf using a loop to keep them unique. (see code below) so is there a way to either change the filename to document_id.pdf or make a new directory of data/documents/document_id/ and save the document.pdf to that new folder? I think file:fetch only takes one variable input. and won't work with SELECT DOCUMENT_ID,URL FROM LINKS
    my $query = "select url FROM LINKS"; # << minor edit my $sth = $dbh->prepare($query) or die "prepare: ".$dbh->errstr; $sth-> execute() or die "execute: ".$dbh->errstr; $i=0; while (my $ref = $sth->fetchrow_hashref()) { print "\nurl: $ref->{url}\n"; my $ff = File::Fetch->new(uri=>$ref->{url}); my $where = $ff->fetch( to => '/data/documents/'); my $error= $ff->error(); rename ("C:/data/documents/document.pdf","C:/data/documents/document$i.pdf"); ($i++); }
WWW::Mechanize and SSL
3 direct replies — Read more / Contribute
by Jonathan
on Aug 10, 2022 at 09:09

    Firstly, apologies as this is more a SSL issue than just Perl. I have an HTTPS url that I want to pull some data from (the url is another server on our local network). The url works fine with Chrome, MS Edge etc but I'm getting a certificate error running my test script from a dev server (Ubuntu). Also wget also fails with certificate errors.

    #!/usr/bin/perl use strict; use warnings; use WWW::Mechanize; use Net::SSLeay; my $m = WWW::Mechanize->new( autocheck => 1 ); print "LWP: $LWP::UserAgent::VERSION\n"; print "Mech: $WWW::Mechanize::VERSION\n"; print "Net::SSLeay $Net::SSLeay::VERSION\n"; my $url = $ARGV[0]; $m->get($url); print $m->content();
    Which outputs;
    $ ./testit.pl LWP: 6.43 Mech: 1.96 Net::SSLeay 1.88 Error GETing https://xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx: Can't conn +ect to xxxxxxxxxxx:443 (certificate verify failed) at ./testit.pl lin +e 15. $
    I suspect there is something available on the companies servers that satisfies browsers but isn't available elsewhere. Anyone seen this before and have any idea what I need to do? Thanks
How to use the right timezone in a DBIx::Class ResultSet call?
1 direct reply — Read more / Contribute
by LittleJack
on Aug 08, 2022 at 21:38

    I have a working DBIx::Class resultset which looks roughly like this:

    my $tasks_to_do = $self->search( { -and => [ scheduled_run_time => { '<= +', 'NOW()'}, status => 'pending' ] } );

    But I need that "NOW()" to be in Sydney, Australia time, not the server time which is UTC.

    In the script which calls this ResultSet method I've got both

    BEGIN { $ENV{TZ} = 'Australia/Sydney'; }

    and

    $schema->storage->dbh_do(sub {"SET TIMEZONE='Australia/Sydney'"} );

    But it still returns no records.

    Due diligence: I know there are records because if I go in to the db manually and do this:

    sessions=> SET TIMEZONE='UTC'; SET sessions=> select * from mytable where scheduled_run_time <= NOW() and + status = 'pending';

    I get "no rows", but if I do this:

    sessions=> SET TIMEZONE='Australia/Sydney'; SET sessions=> select * from mytable where scheduled_run_time <= NOW() and + status = 'pending';

    I get the expected number of records.

    TIA. I have had my coffee by the way.

cURL and HTTP::Request
2 direct replies — Read more / Contribute
by jeanbaptiste93
on Aug 08, 2022 at 18:10

    Hi, all!

    I am trying to call a particular API endpoint and am being unsuccessful. I have been able to make calls to dozens of endpoints to the same API, so I was unsure as to what the issue was. I tried a cURL with the relevant params and it worked successfully. But I am being unable to replicate it to HTTP::Request.

    Here's the cURL:

    curl --request PUT --url 'https://[URL]/[ENDPOINT]/item?key=[KEY]&token=[TOKEN]' --header 'Content-Type: application/json' --data '{"value": {"text": "38"}}'

    I have tried:

    my $browser = LWP::UserAgent::JSON->new(); my $response = $browser->put( $url, Content => to_json($data) );

    and:

    my $request = HTTP::Request::JSON->new( 'PUT', $url, [ Accept => 'application/json', 'Content-Type' => 'application/json', ], to_json($data) ); my $browser = LWP::UserAgent::JSON->new(); my $response = $browser->request($request);

    and similar solutions to no avail. I appreciate any help!

Allowing regex entries in web form to search database: Risks or gotchas?
4 direct replies — Read more / Contribute
by Polyglot
on Aug 08, 2022 at 13:10

    I have a research-oriented database, online, accessible via my own web interface and open to public use. The application is set up to allow read-only access to the database, the CGI script is hosted on a linux server, and the script is definitely not set as setuid. I am not allowing any use of nested executable code inside the regex, via the following sort of rules during the parsing of the query:

    return "ERROR: For security and bandwidth reasons, query may not conta +in pure wildcards." if $SR_query =~ m/^[( ]*\.\s*(?:(?:\{\s*\d+\s*,?\ +s*\d*\s*})?|[*+?]*)[) ]*$/; return "ERROR: Regex containing code disallowed." if $SR_query =~ m[\( +\?\??\{];

    Beyond these fundamental/basic protections against potential malicious actors, is there anything I might be blindly walking into by unleashing this capability in my website?

    I have had to run a rather complicated subroutine on the query itself to prevent taint from objecting to it--even though the code is never "executed" other than being inserted into a m// to run against text drawn from the database prior to formatting the results for return to the browser. But this is a small price to pay for the very useful functionality of having regex-capable searches on the database.

    Blessings,

    ~Polyglot~


Add your question
Title:
Your question:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":