PerlMonks  

The Monastery Gates

If you're new here please read PerlMonks FAQ
and Create a new user.

New Questions
subtraction issue
2 direct replies — Read more / Contribute
by Anonymous Monk
on Jan 24, 2017 at 16:41

    I am setting a variable $finalamount = $v_amount - ($plbamt) with $v_amount = -683.12 and $plbamt = -683.12. This should result in $finalamount = 0, but it results in 1.13686837721616e-013. Can anyone tell me why this is?
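A result like this usually comes from binary floating point: a value such as 683.12 has no exact binary representation, so an amount that was computed earlier in the script can differ from the literal in its last bits. The sketch below is a minimal illustration, not the poster's code: it uses 0.1 + 0.2 as a stand-in for a computed amount, and the helper names `nearly_equal` and `to_cents` and the tolerance 1e-9 are assumptions chosen for the example.

```perl
use strict;
use warnings;

# Stand-in values: a computed amount versus a literal that "should" be equal.
my $computed = 0.1 + 0.2;
my $literal  = 0.3;
my $diff     = $computed - $literal;

# Compare with a tolerance instead of testing for exact equality.
sub nearly_equal {
    my ($x, $y, $eps) = @_;
    $eps //= 1e-9;               # assumed tolerance, tune it for your data
    return abs($x - $y) < $eps;
}

# For money, convert to integer cents before doing arithmetic.
sub to_cents {
    my ($amount) = @_;
    return sprintf('%.0f', $amount * 100) + 0;
}

printf "raw difference: %.17g\n", $diff;
print  "nearly equal:   ", (nearly_equal($computed, $literal) ? "yes" : "no"), "\n";
print  "in cents:       ", to_cents(-683.12) - to_cents(-683.12), "\n";
```

Working in integer cents keeps subtraction exact; the tolerance comparison is the fallback when the values must stay fractional.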

XML::Twig parsing poorly structured content
1 direct reply — Read more / Contribute
by slugger415
on Jan 24, 2017 at 10:46

    Hi monks, I'm using XML::Twig to parse an XML document, with dates and events, that is not as nicely structured as I'd like. Here's a pseudo-code example:

    <div id="calendar">
      <h3 class="current-day">Wednesday, February 1</h3>
      <div class="event">Event 1</div>
      <div class="event">Event 2</div>
      <div class="event">Event 3</div>
      <h3 class="current-day">Thursday, February 2</h3>
      <div class="event">Event 1</div>
      <div class="event">Event 2</div>
      <h3 class="current-day">Friday, February 3</h3>
      <div class="event">Event 1</div>
      <div class="event">Event 2</div>
    </div>

    The problem is the div-events are not contained within the h3 elements, so I can't figure out how to associate the events with each date. I can get all the h3 children and all the div-event children with a div event handler at the top level:

    my($twig, $div) = @_;
    if($div->att('id') eq 'calendar'){
        my(@dates) = $div->children('h3');
        my(@events) = $div->children('div');
    }

    But obviously that just gives me two unconnected lists. Is there some clever way I can associate these elements, perhaps in the order they appear? Doesn't seem to be a "next_child" function in XML::Twig.

    Thanks for any advice.

Web monitoring with Perl, comments and suggestions on general approach please
5 direct replies — Read more / Contribute
by predrag
on Jan 23, 2017 at 17:27

    Dear Perl Monks,

    I have had this idea for over 10 years: to add web monitoring to my beekeeping website, showing bee hive weight, temperature and moisture inside and outside the hive, etc. I am a big fan of data acquisition in general, but have never built anything like that yet.

    My conditions on the server: my website will be moved to a cloud server, probably with 1 CPU and 1GB RAM in the beginning and maybe 2GB later. I am offered CentOS (x64), Debian, Fedora… but the application I prefer among those offered is the Cloud LAMP server: CentOS 6.x, Apache, MySQL, PHP… What about the Perl installation and version? I expect Perl 5.10.1 to be installed on CentOS.

    Is 1 or 2GB RAM enough for this application? Until now the maximum was around 1000 visits a day at the peak of the beekeeping season, but with a completely new design, new content and web monitoring I think that could easily increase 10 times or more.

    I plan to move home soon and need to move bees too, but still didn't find the place. Because of that I still don't know the conditions for monitoring device on that place

    My data for monitoring:

    Measuring with several sensors and sending data two times in a day will be enough. Later, next projects may be more ambitious, with more frequent and different data and also, from 2-3 or even more locations.

    There are 3 solutions for the hardware, as I see it:

    1. A PIC microcontroller at the remote place. It will connect to the server two times a day and send data to the MySQL database. A general question: I wonder if there is anything between Perl and PICs? Although this solution is more complex, I would not easily reject it, as PICs can be used in some other projects.

    2. The opposite: create a small web server with PICs at the remote place, with a MySQL database, so that the website server has to connect to it and read data two times a day.

    3. If a power supply and internet connection are present at the remote place, there is a solution where I don't need PICs. Instead I could have a computer there, accept data on the serial port (?) and write Perl scripts that read the data and present it in the form of graphs and tables on HTML pages. In that case I would also need a Perl script that makes the internet connection, or maintains a continuous connection in the case of some other projects with more frequent data.

    Regarding the MySQL database, I don't have any experience with it yet. I know about the modules DBI and DBD::mysql. Any recommendations for these or something else?

    Thanks to perl monk Hauke D, I have one possible solution for further processing:

    To use Chart::Clicker and write a script that reads data (from MySQL or?), plots a graph and saves it on the server as an image file. Then, to set up a cron job that calls that Perl script twice a day to generate the charts and save them as image files. Each image should always be saved under the same file name, and the image file will be inserted in the HTML page. When the data change, a new image will be saved and will automatically become visible in the HTML page after a page refresh or a new visit.

    In order to have the data in a table too, the same principle can be applied to HTML pages as well: in the same script that generates the charts, I can generate an HTML page that includes the data and save it in the web server directory.
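The "generate an HTML page and save it under a fixed name" step can be sketched with core Perl only. Everything specific here is an assumption for illustration: the `@readings` data, the column names, and the helper names `html_escape`, `build_page` and `publish`. In the real setup the readings would come from the database.

```perl
use strict;
use warnings;

# Hypothetical readings; in the real setup these would come from the database.
my @readings = (
    { time => '2017-01-23 06:00', weight_kg => 41.2, temp_c => 3.5 },
    { time => '2017-01-23 18:00', weight_kg => 41.0, temp_c => 6.1 },
);

# Escape the characters that are special in HTML text.
sub html_escape {
    my ($text) = @_;
    my %entity = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;');
    $text =~ s/([&<>"])/$entity{$1}/g;
    return $text;
}

# Build a complete page with one table row per reading.
sub build_page {
    my (@rows) = @_;
    my $html = "<html><body><table>\n"
             . "<tr><th>Time</th><th>Weight (kg)</th><th>Temperature (C)</th></tr>\n";
    for my $r (@rows) {
        $html .= sprintf "<tr><td>%s</td><td>%.1f</td><td>%.1f</td></tr>\n",
            html_escape($r->{time}), $r->{weight_kg}, $r->{temp_c};
    }
    return $html . "</table></body></html>\n";
}

# Write to a temporary file first, then rename it into place,
# so visitors never see a half-written page.
sub publish {
    my ($path, $content) = @_;
    my $tmp = "$path.tmp";
    open my $fh, '>', $tmp or die "Cannot write $tmp: $!";
    print {$fh} $content;
    close $fh or die "Cannot close $tmp: $!";
    rename $tmp, $path or die "Cannot rename $tmp to $path: $!";
}

print build_page(@readings);
```

In the cron job, the last line would become a call such as `publish($file_in_docroot, build_page(@readings))`, where `$file_in_docroot` is whatever fixed file name the web page links to.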

    But if I would like to do monitoring during the whole year, I have to find a solution for that. I believe that there are some Perl modules for that.

    Other ideas for further processing

    I've tried some examples for plotting graphs with GD::Graph, but somehow I am more inclined towards Chart::Clicker, although it has many dependencies.

    Just two days ago, I came across RRDTool and also found out about the RRD-Simple module. I find the tutorial for RRDTool very good. Of course, all that may not be so simple, especially for me, but I believe that I could cope with it myself.

    I see that RRDTool is powerful and maybe it is a good solution for me to easily have graphs for each month or year (I still don't know if it is a solution for tables too?). But again, maybe there is a Perl module that can also help with that?

    I am not asking for any code, but will be really very thankful for any comment or suggestion, whether general or more concrete, on any part of the project, or for some useful links. Comments on compatibility between elements of the project will be very useful too, especially because I feel I will not stop here and I already dream about doing some other things, so I will probably try most of your suggestions.

OT: Storing encryption keys securely
4 direct replies — Read more / Contribute
by Beatnik
on Jan 23, 2017 at 09:42
    Slightly OT but here goes.

    I'm writing some glue code that will store an encrypted password in a database. I'm looking at different approaches to making all this as safe as possible. Hashing the password (for verification) is not really an option, as I will need access to the clear-text (to pass it on to another class). I'll be taking some steps to make the encrypted password hard to break, but what about storing the key used to encrypt? In an ideal world, the encryption key won't be accessible to anyone, but how can I make sure? In some way, the key must be stored somewhere. Even with a keychain of some kind, I will still need to store the keychain key.

    Thoughts?


    Greetz
    Beatnik
    ... I'm belgian but I don't play one on TV.
Properly testing self-compiled character-encodings
2 direct replies — Read more / Contribute
by yulivee07
on Jan 23, 2017 at 07:15
    Hi Perlmonks, I am searching for a proper way to test whether various character encodings work as expected on my platform.

    I have some special IBM character codepages I need to include for my platform (AIX 7.2). I have built these according to the manual delivered with enc2xs http://search.cpan.org/dist/Encode/bin/enc2xs

    However, I do not trust them. I took a look with the strings utility to see whether there are characters in the produced binary file:
    $ strings CP1141.so
    $
    strings just returns nothing. Using another C compiler solved this problem and produced binary files that contain characters.

    So now I want to create a test to see if the encoding is working properly. I have an older machine where all those character encodings are installed and working, so I could generate files containing correct information there and copy them over to the new machine for testing.

    I am not entirely sure about a good testing strategy. I thought of testing characters beyond the 128th character (as below that they would all be equal, as it is ASCII). Does this seem reasonable?

    I looked into the Encode distribution in search for unit-tests for encodings but didn't find much. Are there best practices for cases like this?
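One possible first check is a byte-level round trip through the encoding. The sketch below is only that, a sketch: it uses `cp37`, an EBCDIC code page that ships with Encode, as a stand-in for the self-compiled encoding, and `roundtrip_failures` is a hypothetical helper name. CP1141 is an EBCDIC code page, so bytes below 128 do not match ASCII either, which is why the loop covers all 256 values. A round trip only shows that the encoding is self-consistent; comparing the decoded code points with a table generated on the old machine would be the stronger test.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Return the byte values (0..255) that do not survive decode + encode.
sub roundtrip_failures {
    my ($encoding) = @_;
    my @failed;
    for my $byte (0 .. 255) {
        my $octet = chr $byte;
        my $char  = decode($encoding, $octet);
        my $back  = encode($encoding, $char);
        push @failed, $byte unless $back eq $octet;
    }
    return @failed;
}

# cp37 ships with Encode; replace it with the name of the self-compiled encoding.
my $encoding = 'cp37';
my @failed   = roundtrip_failures($encoding);
printf "%s: %d of 256 bytes failed the round trip\n", $encoding, scalar @failed;
printf "byte 0xC1 decodes to U+%04X\n", ord decode($encoding, "\xC1");
```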

    Regards, Yulivee
Unable to flush stdout: Invalid argument
4 direct replies — Read more / Contribute
by bakiperl
on Jan 22, 2017 at 08:27

    I have recently upgraded perl for windows from version 5.16 to version 5.24 and suddenly two of my cgi scripts started to produce the following error message:

    Unable to flush stdout: Invalid argument?

    I really have no idea where to look. Any help is appreciated.

    Thank you.
Readonly references, replicating data structures
3 direct replies — Read more / Contribute
by nikmit
on Jan 22, 2017 at 06:37

    When I initially hit this it seemed incomprehensible... then I figured out why it's happening but not yet how to fix it - so here I am humbly asking for your wisdom.

    I expected the below code to create a fresh copy of @arr every time gimme() is executed. Instead it creates a new reference for @arr but reuses all of the nested references. The result is that gimme() returns the last box with its content, rather than a fresh empty box which is what I want...

    I considered using Readonly or a configuration file on disk to ensure I get an empty box every time, but both seem clunky and slow. The actual data structure is nested to 5-6 levels and while it is not huge, it will be executed often.

    I suspect turning the data structure into an object may be the right path, but I'm yet to pick that side of perl up... What would be your advice?

    #!/usr/bin/perl -w
    use strict;
    use Data::Dumper;

    BEGIN {
        my @arr = (
            {
                box => {
                    attr => {
                        this => 'that',
                        foo  => 'bar',
                    },
                    content => {},
                },
            },
        );
        sub gimme {
            my @result = @arr;
            return \@result;
        }
    }

    my $box1_ref = gimme();
    $box1_ref->[0]->{white_box}->{content}->{apples} = 5;
    print "tracing: $box1_ref -> $box1_ref->[0] -> $box1_ref->[0]->{white_box} -> $box1_ref->[0]->{white_box}->{content}\n";
    print Dumper $box1_ref->[0]->{white_box}->{content};

    my $box2_ref = gimme();
    print "tracing: $box2_ref -> $box2_ref->[0] -> $box2_ref->[0]->{white_box} -> $box2_ref->[0]->{white_box}->{content}\n";
    print Dumper $box2_ref->[0]->{white_box}->{content};

    Result of that code is:

    tracing: ARRAY(0x20d9e70) -> HASH(0x1fcf638) -> HASH(0x20d9db0) -> HASH(0x20d9de0)
    $VAR1 = {
              'apples' => 5
            };
    tracing: ARRAY(0x1fab0f0) -> HASH(0x1fcf638) -> HASH(0x20d9db0) -> HASH(0x20d9de0)
    $VAR1 = {
              'apples' => 5
            };
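The trace shows that only the outer array is copied, while the nested hashes are shared. One common way to get an independent structure each time is a deep copy with the core module Storable. The sketch below is an assumption about what would fit, not a measured recommendation; it renames the array to `@template` and uses the key `box` consistently.

```perl
use strict;
use warnings;
use Storable qw(dclone);

{
    # Template structure; it is never handed out directly.
    my @template = (
        { box => { attr => { this => 'that', foo => 'bar' }, content => {} } },
    );

    # Return a deep copy so callers cannot modify the template.
    sub gimme {
        return dclone(\@template);
    }
}

my $box1 = gimme();
$box1->[0]{box}{content}{apples} = 5;

my $box2 = gimme();
printf "box1 has %d key(s), box2 has %d key(s)\n",
    scalar keys %{ $box1->[0]{box}{content} },
    scalar keys %{ $box2->[0]{box}{content} };
```

Whether `dclone` is fast enough for a structure nested 5-6 levels deep and copied often would need benchmarking against the real data.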
Dont yell, perl2exe encoding (UTF-8) issue
1 direct reply — Read more / Contribute
by enrgyxprt
on Jan 21, 2017 at 19:51

    So, after reading a bunch on perl2exe here at the monastery, I ask with much trepidation the following.

    I have a basic script of less than 100 lines that has worked amazingly well. I'd like to share it with my peers, but know they would disregard it as soon as I said, "you need perl..."

    Installed strawberry on a spare laptop, downloaded perl2exe, installed a few dependencies from CPAN, created my icons, ran

     perl2exe -icon=My.ico my.pl

    and was greeted by the sight of my.exe. I got good at pressing "prt screen" and found I needed to add

    #perl2exe_include PerlIO
    #perl2exe_include Encode

    What I think I'm seeing is my.exe is choking on

    open(my $remvrfh,'<:encoding(UTF-8)',$remvr) or die("Can't open input file \"$remvr\":$!\n");

    so first, is #perl2exe_include Encode enough or is there anything else I need for the (UTF-8) ?

    Second, my.pl was designed to be run from a batch file. In that batch file I have

     @perl "%~dp0my.pl" %*

    Using the batch file, launches my.pl in a small cmd window for a brief second and life is grand.
    Now, as my idea has evolved, I was thinking I could exclude the batch file completely, and that the parameters or args sent by Windows to my batch file (from double clicking a file.blt mapped to open with my.bat by default) would be sent to my.exe instead. But that's only a hunch. I have in my .pl

    $path=abs_path($0);
    $arg0="$ARGV[0]";

    So, Question 2 is,
    do I need to manage the double click - bat %~dp0 and %* to my.pl differently,
    and if so please respond as little as you like but at least leaving breadcrumbs or letting me know every thing you do !

Converting utf-8 to base64 and back
2 direct replies — Read more / Contribute
by LanX
on Jan 21, 2017 at 18:21
    The following code produces the original utf-8 string after converting to base64.

    But it seems overly complicated. What am I missing ?

    Additionally: I can probably understand that I need to do encode_utf8 step, but why do I need to set the utf8 flag manually after explicitly encoding to utf8?

    use strict;
    use warnings;
    use utf8 ;
    use Encode;
    use Data::Dump qw/dd pp/;
    use Devel::Peek;
    use MIME::Base64 ;

    my $str ='Ä';
    warn "\n*** orig str='$str'";
    Dump $str;

    our $encoded = MIME::Base64::encode_base64($str);
    warn "\n*** encode_b64 encoded='$encoded'";
    Dump $encoded;

    warn "\n*** decode_b64";
    our $decoded = MIME::Base64::decode_base64($encoded);
    Dump $decoded;

    warn "\n*** encode_utf8";
    $decoded = Encode::encode_utf8($decoded);
    Dump $decoded;

    warn "\n*** _utf8_on";
    Encode::_utf8_on($decoded);
    Dump $decoded;

    *** orig str='Ä' at c:/tmp/b64_utf8.pl line 12.
    SV = PV(0x54c9d8) at 0x5a8b00
      REFCNT = 1
      FLAGS = (PADMY,POK,pPOK,UTF8)
      PV = 0x5471d8 "\303\204"\0 [UTF8 "\x{c4}"]
      CUR = 2
      LEN = 16

    *** encode_b64 encoded='xA==
    ' at c:/tmp/b64_utf8.pl line 17.
    SV = PV(0x54caa8) at 0x26dbcb8
      REFCNT = 1
      FLAGS = (POK,pPOK)
      PV = 0x547478 "xA==\n"\0
      CUR = 5
      LEN = 16

    *** decode_b64 at c:/tmp/b64_utf8.pl line 21.
    SV = PV(0x54cb08) at 0x26da130
      REFCNT = 1
      FLAGS = (POK,pPOK)
      PV = 0x28d2b98 "\304"\0
      CUR = 1
      LEN = 16

    *** encode_utf8 at c:/tmp/b64_utf8.pl line 25.
    SV = PV(0x54cb08) at 0x26da130
      REFCNT = 1
      FLAGS = (POK,pPOK)
      PV = 0x28d2c28 "\303\204"\0
      CUR = 2
      LEN = 16

    *** _utf8_on at c:/tmp/b64_utf8.pl line 29.
    SV = PV(0x54cb08) at 0x26da130
      REFCNT = 1
      FLAGS = (POK,pPOK,UTF8)
      PV = 0x28d2c28 "\303\204"\0 [UTF8 "\x{c4}"]
      CUR = 2
      LEN = 16
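The dump shows that encode_base64 was handed the character string itself, so it encoded the single downgraded byte 0xC4 ("xA==") rather than the UTF-8 bytes. A shorter round trip, sketched here under the assumption that the goal is Base64 of the UTF-8 encoded text, encodes to bytes before Base64 and decodes back to characters afterwards:

```perl
use strict;
use warnings;
use Encode qw(encode_utf8 decode_utf8);
use MIME::Base64 qw(encode_base64 decode_base64);

my $str = "\x{C4}";     # the single character U+00C4, i.e. Ä

# Characters -> UTF-8 bytes -> Base64 text.
# The empty second argument suppresses the trailing newline.
my $b64 = encode_base64(encode_utf8($str), '');

# Base64 text -> UTF-8 bytes -> characters.
my $roundtrip = decode_utf8(decode_base64($b64));

print "base64: $b64\n";
print "same string: ", ($roundtrip eq $str ? "yes" : "no"), "\n";
```

With this order there is no need to touch the UTF8 flag by hand; decode_utf8 returns a character string directly.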

    Cheers Rolf
    (addicted to the Perl Programming Language and ☆☆☆☆ :)
    Je suis Charlie!

Is this a sane/safe way to pass an aref into a C function?
4 direct replies — Read more / Contribute
by stevieb
on Jan 21, 2017 at 13:54

    Given that weekends are typically slower than through the week, I thought I'd ask for a quick code review of those who understand the perl internals. This is my real first attempt at doing anything regarding manipulating and managing perl data types in C/XS, and am just wondering if what I've got below is reasonable, sane and safe, or if there are better ways to do it.

    Instead of passing an integer into the C function then doing a bunch of bit shifting to get out the required number of bytes, I wanted to pass in an array reference so that the number of bytes can be dynamic (otherwise with an int, I'm limited to a maximum of four bytes in a call to testing()).

    use warnings;
    use strict;

    use Inline 'C';

    my $channel = 0;
    my @bytes = (0x00, 0x01, 0x02, 0x03);

    testing($channel, \@bytes, 4);

    __END__
    __C__

    int testing(int channel, SV* byte_ref, int len){

        if (channel != 0 && channel != 1){
            croak("channel param must be 0 or 1\n");
        }

        if (! SvROK(byte_ref) || SvTYPE(SvRV(byte_ref)) != SVt_PVAV){
            croak("not an aref\n");
        }

        AV* bytes = (AV*)SvRV(byte_ref);

        int num_bytes = av_len(bytes) + 1;

        if (len != num_bytes){
            croak("$len param != elem count\n");
        }

        unsigned char buf[num_bytes];

        int i;

        for (i=0; i<len; i++){
            SV** elem = av_fetch(bytes, i, 0);
            buf[i] = (int)SvNV(*elem);
        }

        /*
         * here, I'll be passing the char buffer and len
         * to an external C function. For display, I'll
         * just print the elements (the return is from ioctl())
         *
         * if ((spiDataRW(channel, buf, len) < 0){
         *     croak("failed to write to the SPI bus\n");
         * }
         */

        int x;

        for (x=0; x<len; x++){
            printf("%d\n", buf[x]);
        }
    }
Detecting 'our $foo => 1' mistake?
3 direct replies — Read more / Contribute
by perlancar
on Jan 21, 2017 at 04:58

    Dear monks,

    I just wasted half an hour trying to find a cause why a variable is undefined (and chasing the wrong rabbit of trying to figure out why a particular module won't load but require() doesn't raise an error), before finally realizing that I made a typo:

    our $foo => { a=>'blah', b=>'blah', ... };

    instead of:

    our $foo = { a=>'blah', b=>'blah', ... };

    And come to think of it, this is perhaps my third or fourth time being bitten by this. This mistake is not caught by warnings, so any suggestion on how to detect this? Perhaps there's a Perl::Critic policy somewhere and I need to just (re-)plunge and use Perl::Critic (again).

    UPDATE 2017-01-23: my bad, it turns out I didn't use warnings in the original code. our $foo => value indeed produces warning in most cases.

using waitpid() with signals
2 direct replies — Read more / Contribute
by ristov
on Jan 20, 2017 at 14:59

    I am trying to write code which would wait for the child process to exit -- but if the code gets the TERM signal while waiting, the child process should be terminated with the same signal. While waiting for the child can be easily done with waitpid($pid, 0), signals do not interrupt waitpid() with the EINTR error code. To solve this problem, one could use the following code with non-blocking waitpid():

    $term = 0;
    $SIG{TERM} = sub { $term = 1; };

    for (;;) {
        $p = waitpid($pid, WNOHANG);

        # where did the child disappear?
        if ($p == -1) { exit(1); }

        # child has terminated
        if ($p == $pid && (WIFEXITED($?) || WIFSIGNALED($?))) { exit($?>>8); }

        # TERM has arrived, forward it to child and exit
        if ($term) { kill('TERM', $pid); exit(0); }

        # check the child again after 1 second
        sleep(1);
    }

    However, I am interested whether the same task can be accomplished with blocking waitpid() which consumes less CPU time. There is one very interesting recipe which involves the use of eval:

    http://blog.kazuhooku.com/2015/02/writing-signal-aware-waitpid-in-perl.html

    Nevertheless, I am wondering whether it would be OK to use the blocking waitpid() with a different signal handler for the same purpose:

    # waitpid($pid,WNOHANG) returns 0 if the child process
    # exists and has not terminated
    $SIG{TERM} = sub { waitpid($pid,WNOHANG) || kill('TERM', $pid); exit(0); };

    # waitpid loop
    for (;;) {
        $p = waitpid($pid, 0);

        # where did the child disappear?
        if ($p == -1) { exit(1); }

        # child has terminated
        if ($p == $pid && (WIFEXITED($?) || WIFSIGNALED($?))) { exit($?>>8); }
    }

    In the signal handler, waitpid($pid, WNOHANG) is used for checking if the child process exists, in order to avoid sending TERM to a non-existing process. Since I am not too familiar with Perl internals, I am not sure if it is OK if the signal handler is invoked in the middle of blocking waitpid(), in order to call waitpid() again in non-blocking mode from the handler. Can anyone provide some insights? If this approach has flaws, I would go with previous code examples.
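A side note on both loops: when the child is killed by a signal, `$? >> 8` is 0, so `exit($?>>8)` reports success. The sketch below shows one way to decode the wait status using the bit layout documented for `$?` in perlvar. The helper name `run_and_decode` and the shell-style "128 + signal number" convention are assumptions for this example, not part of the original code.

```perl
use strict;
use warnings;

$| = 1;     # unbuffered output, so forked children do not repeat it

# Run $code in a child process and translate its wait status into a
# shell-style exit code: the exit value, or 128 + signal number.
sub run_and_decode {
    my ($code) = @_;
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        $code->();
        exit 0;
    }
    waitpid($pid, 0);
    my $signal = $? & 127;      # low 7 bits: terminating signal, if any
    my $status = $? >> 8;       # high byte: value passed to exit()
    return $signal ? 128 + $signal : $status;
}

print "exited normally: ", run_and_decode(sub { exit 3 }), "\n";
print "killed by TERM:  ", run_and_decode(sub { kill 'TERM', $$; sleep 5 }), "\n";
```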

    regards, risto
New Meditations
Improve readability of Perl code. Naming reference variables.
7 direct replies — Read more / Contribute
by hakonhagland
on Jan 19, 2017 at 15:05
    Hello Monks!

    I've been learning Perl for some years now. At the same time, moving from writing awk scripts to writing Perl scripts, I have found Perl to be an amazing resource for getting things done.

    Still, I have some minor issues with the language design that I have not yet been able to understand/resolve. This is what I want to discuss here.

    Background

    It sometimes bugs me that it is so difficult to write Perl code that is readable (easy to follow) when working with references. For example, if I see a variable $var in the middle of some code, it can be a scalar variable, a scalar reference, an array reference, a hash reference, and so on. Hence, I often end up guessing or having to scan source code nearby in order to determine the type of the variable. I find this workflow less than optimal. Would it not be better if the variable could (optionally) be made self-documenting with respect to reference type?

    In the book Perl Best Practices, the problem is mentioned in another setting, and the solution suggested is to add the suffix _ref to the variable name. So one could write,

    $var_href = { a => 1 };
    to create a hash ref, and
    $var_aref = [ 1, 2, 3];
    to create an array reference.

    However, a problem with this convention could be that the suffix is not optional. You should not be forced to use the more verbose form of the variable name. I think the programmer should be able to decide whether it is advantageous to include the suffix at a given place or not. For example, when declaring the variable as

    $var = [ 1, 2, 3 ];
    it is rather obvious that it is an array reference, and there is no need to write:
    $var_aref = [ 1, 2, 3 ];
    The latter is in my opinion too verbose. However, if the reference is just defined as
    my $var;
    it would often be better to include the suffix. If there is no indication on the next lines or so whether $var will be used as an array reference or not, it would be more readable to define it as
    my $var_aref;

    A new idea for reference variable naming syntax

    So this led me to an idea: Could the postfix dereferencing syntax be extended for this use case?

    The Postfix Dereferencing Syntax (PDS) was introduced as experimental in 5.20. And starting from 5.24 it is included in the Perl language by default.

    Currently PDS is used for dereferencing:

    my @array = $var->@*;
    Notice that the PDS includes a star after the sigil. It is a syntax error not to include the star. But let's say for the moment that if the star was omitted, the dereferencing was to be simply ignored instead. So
    my $var->@;
    would mean the same as
    my $var;
    and produce no syntax error.

    Let's denote this new syntax by Optional Postfix Reference Declaration Syntax (OPRDS). When using OPRDS, should it be entirely up to the user to ensure that the correct sigil is used? For example, if I write

    $var->@ = 12;
    when I really meant
    $var->@ = [ 12 ];
    should it produce a compile time error? I think it would be very helpful if the compiler could use OPRDS to check for consistency. But it might be difficult to implement? I do not know. If it is difficult to implement, some alternatives might be used instead? I don't know much of Perl internals, so this is a point where I need help.

    When I started out with this idea, compile time type-checking was not on my mind at all. But I see now that OPRDS would offer the opportunity for stricter type checking.

    But type checking was not the main issue I wanted to discuss. What I would like to discuss is how to deal with reference variable names. Reading and understanding written Perl code can be difficult since the $ sigil can be used for many data types. How could this situation be improved?
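For comparison, this is what the existing tools look like on Perl 5.24 or later: postfix dereferencing as it works today, plus a run-time check with ref(). It is a minimal sketch of current behaviour only, not of the proposed OPRDS syntax, and `describe` is a hypothetical helper name.

```perl
use v5.24;      # postfix dereferencing is available without a feature flag
use warnings;

my $list = [ 1, 2, 3 ];
my $map  = { a => 1 };

# Postfix dereference: the sigil after the arrow names the expected type.
my @array = $list->@*;
my %hash  = $map->%*;

# ref() reports the reference type at run time.
sub describe {
    my ($var) = @_;
    return ref($var) || 'plain scalar';
}

say join ',', @array;
say describe($list);
say describe($map);
say describe(12);
```

Dereferencing with the wrong sigil, such as `$map->@*`, is only caught at run time, which is the gap the proposal is concerned with.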

RFC: Module for testing asynchronous event series
No replies — Read more | Post response
by Dallaylaen
on Jan 18, 2017 at 16:57

    Let's say we are going to test a module that is supposed to be run asynchronously - using threads, AnyEvent, or Coro, or some other means. And we need to check that certain events happen in certain sequence, because some of them depend on the others.

    Probably the best way to achieve this would be of course to minimize interdependencies and use mathematically correct synchronization for whatever is left. Of course, that is not always achievable, due to limited time.

    So I'm going to propose a primitive that I think should deal with a huge subclass of such tasks.

    The code goes as follows:

    use Test::AsyncSeq;

    my $id = Test::AsyncSeq->get_sequence_id;
    my $id2 = Test::AsyncSeq->get_sequence_id( "frobnicate" ); # would be "frobnicate1" or smth

    # somewhere in threads/callbacks
    is_after( $id, "start" );

    # elsewhere
    is_after( $id, "event2", "start" );

    # more elsewhere
    is_after( $id, "event3", "start" );

    # finally
    is_after( $id, "finish", "event2", "event3" );

    The is_after( $id, $event, @dependencies ); passes if and only if:

    • sequence named $id was created;
    • $event was not seen in the sequence yet;
    • all of the @dependencies have been seen at the moment of the call.
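The three conditions can be modelled in a few lines of plain Perl. This is a hypothetical single-process sketch of the rules only, not the proposed module's implementation: `new_sequence` and `is_after_ok` are invented names, nothing here is thread-safe, and it returns a boolean instead of emitting a test result.

```perl
use strict;
use warnings;

# Hypothetical in-memory model of the proposed rules; not the module's code.
my %seen;       # sequence id => { event name => 1 }

sub new_sequence {
    my ($id) = @_;
    $seen{$id} = {};
    return $id;
}

# True if and only if all three conditions from the list hold.
sub is_after_ok {
    my ($id, $event, @dependencies) = @_;
    my $events = $seen{$id};
    return 0 unless $events;                            # sequence was created
    return 0 if $events->{$event};                      # event not seen yet
    return 0 if grep { !$events->{$_} } @dependencies;  # dependencies all seen
    $events->{$event} = 1;
    return 1;
}

my $id = new_sequence('demo');
print is_after_ok($id, 'start')                     ? "ok\n" : "not ok\n";
print is_after_ok($id, 'finish', 'start', 'event2') ? "ok\n" : "not ok\n";
print is_after_ok($id, 'event2', 'start')           ? "ok\n" : "not ok\n";
print is_after_ok($id, 'finish', 'start', 'event2') ? "ok\n" : "not ok\n";
```

One design choice in this sketch is that a failed call does not record the event, so the same event can be retried later; the post does not say which behaviour the module would have.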

    The id is just a string, and is required since Perl is not very good at passing blessed references across threads. And multiple tests MAY be needed in the same script, say to catch a race condition.

    Does such an interface make sense? Would it be of use to anyone?

[RFC] Building Regex Alternations Dynamically
5 direct replies — Read more / Contribute
by haukex
on Jan 18, 2017 at 07:57

    Dear Monks, this is a suggestion for a tutorial, any comments or suggestions are welcome. Update 1: Fixed up explanation of metacharacters a bit. Update 2: Implemented some things from kcott's comments. Update 3: Added TL;DR, inspired by LanX.

    TL;DR: The two code samples below are working pieces of code that can be copied into your Perl script and adapted for your purposes.

    I thought it might be useful to explain the technique of building regular expressions dynamically from a set of strings. Let's say you have a list of strings, like ("abc", "def", "ghi"), and you want to build a regex that matches any of them, like /(?:abc|def|ghi)/. This also works well with s/search/replacement/ if you have a hash where the keys are the search strings and the values are the replacements, as I'll show below. If you're uncertain on some of the regex concepts used here, like alternations a|b and non-capturing groups (?:...), I recommend perlretut.

    First, the basic code, which I explain below - note the numbering on the lines of code.

    my @values = qw/ a ab. d ef def g|h /;
    my $regex_str = join '|',               # 4.
        map {quotemeta}                     # 3.
        sort { length $b <=> length $a }    # 2.
        @values;                            # 1.
    my $regex = qr/$regex_str/;             # 5.
    print "$regex\n";                       # 6.

    1. We begin with the list of strings stored in the array @values. This could be any list, such as a literal qw/.../, or return values from functions, including keys or values.
    2. We sort the list so that the longer strings appear first. This is necessary because if our regular expression was /foo|foobar/, then applied to the string "foobarfoofoobar", it would only match "foo" three times, and never "foobar". But if the regex is /foobar|foo/, then it would correctly match "foobar", "foo", and again "foobar".
    3. Next, we apply the quotemeta function to each string, which escapes any metacharacters that might have special meaning in a regex, such as . (dot, matches anything) or | (alternation operator). In our example, we want the string "g|h" to be matched literally, and not to mean "match g or h". Unescaped metacharacters can also break the syntax of the regex, like stray opening parentheses or similar. Note that quotemeta is the same as using \Q...\E in a regex. As discussed here, you should only drop \Q...\E or quotemeta in the case that you explicitly want metacharacters in your input strings to be special, they come from a trusted source, and you are certain that your strings don't contain any characters that would break your regular expression or expose security holes!
    4. Then, we join the strings into one long string using the regex alternation operator |. If you want to use this string without the qr// of step 5, note this potential pitfall: For example, if your input is qw/a b c/, then at this point your string will look like $regex_str="a|b|c". Then, saying /^$regex_str$/ will be interpolated to /^a|b|c$/, which means "match a only at the beginning of the string, or b anywhere in the string, or c only at the end of the string", which is probably not what you meant, you probably meant /^(?:a|b|c)$/, that is /^(?:$regex_str)$/!
    5. Finally, we compile the regular expression using qr//. This is not strictly necessary, you could just interpolate the string you've just created into a regex, but I prefer to turn them into regex objects explicitly. It also has the advantages that you can apply modifiers such as /i to the regex in a (IMO) more natural way, and that qr// implicitly adds a non-capturing group (?:...) around the regex, which takes care of the problem described in step 4 above.
    6. When we print the regular expression, we see that it has become this:
      (?^:ab\.|def|g\|h|ef|a|d)
      You can now use this precompiled regular expression anywhere, as explained in Compiling and saving regular expressions and perlop, such as if ($input=~$regex) { ... } or while ($input=~/$regex/g) { ... }.
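The ordering issue from step 2 and the anchoring pitfall from step 4 can both be checked with a few lines. This is a small demonstration using the strings from the explanation above; the variable names are chosen for this example only.

```perl
use strict;
use warnings;

my $text = "foobarfoofoobar";

# Shorter alternative first: "foobar" can never win.
my @short_first = $text =~ /foo|foobar/g;
# Longer alternative first: both strings are found.
my @long_first  = $text =~ /foobar|foo/g;

print "short first: @short_first\n";
print "long first:  @long_first\n";

# Anchors bind to single alternatives unless the alternation is grouped.
my $regex_str = 'a|b|c';
print "ungrouped pattern matches 'xbx'\n" if 'xbx' =~ /^$regex_str$/;
print "grouped pattern rejects 'xbx'\n"   if 'xbx' !~ /^(?:$regex_str)$/;
```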

    Search and Replace Using a Hash

    my %map = ( a=>1, ab=>23, cd=>45 );     # 1.
    my $regex_str = join '|',               # 2.
        map {quotemeta}
        sort { length $b <=> length $a
            or $a cmp $b }                  # 3.
        keys %map;
    my $regex = qr/$regex_str/;
    print "$regex\n";                       # 4.

    # Now, use the regex
    my @strings = qw/ abcd aacd abaab /;    # 5.
    for (@strings) {
        my $before = $_;
        s/($regex)/$map{$1}/g;              # 6.
        print "$before -> $_\n";            # 7.
    }

    1. This is the hash in which the keys are the search strings, and the values are the replacements. As above, this can come from any source.
    2. This code to build the regex is mostly the same as the above, with differences noted here.
    3. Instead of only sorting by length, this sort first sorts by length, and sorts values with the same length with a stringwise sort. While not strictly necessary, I would recommend this because hashes are unordered by default, meaning that your regex would be in a different order across different runs of the program. Sorting the hash keys like this causes the regex to be in the same order in every run of the program.
    4. We print the regex for debugging, and see that it looks like this: (?^:ab|cd|a)
    5. These are the test strings we will apply the regular expression against.
    6. This is the search and replace operation that matches the keys of the hash, and as a replacement value gets the corresponding value from the hash. Note that the /g modifier is not strictly required (s///g will replace all matches in the string, not just the first), and you can adapt this regex any way you like. So for example, to only make one replacement anchored at the beginning of the string, you can say s/^($regex)/$map{$1}/;.
    7. The output of the code is:
      abcd -> 2345
      aacd -> 1145
      abaab -> 23123

    Hope this helps,
    -- Hauke D

New Monk Discussion
Nodelet style broken?
3 direct replies — Read more / Contribute
by Anonymous Monk
on Jan 24, 2017 at 14:24
    The nodelets are displaying without the usual small font and blue border at the moment, just plain text, still aligned at the right of the page, web inspector isn't showing any errors. Did something change?