
Seekers of Perl Wisdom


If you have a question on how to do something in Perl, or you need a Perl solution to an actual real-life problem, or you're unsure why something you've tried just isn't working... then this section is the place to ask. Post a new question!

However, you might consider asking in the chatterbox first (if you're a registered user). The response time tends to be quicker, and if it turns out that the problem/solutions are too much for the cb to handle, the kind monks will be sure to direct you here.

User Questions
generating a microsoft word doc
2 direct replies — Read more / Contribute
by WoodyWeaver
on Jan 22, 2018 at 16:38

    The task is to create a (long) structured document -- it has a bunch of content that is relatively static, and one section that consists of about a thousand question/answer structured blocks that starts off with a table containing a bunch of checkboxes. Contractually, the end result has to be a Microsoft Word document.

    The usual approach is just to start writing using a Microsoft Word template. However, it seems to me that MS Word is not conducive to clear thought, particularly when a thousand points are required. When I look at others' work using a similar template (these are "system security documents") I find that often people are non-responsive, and I'm guessing it's just that the document has so many moving parts. It also strikes me that it's difficult to maintain over time (these are supposed to be 'living documents').

    My approach was to store all the complicated structured part in a database back end, and then render into something close to the desired format; then have a copy of the Microsoft Word template with the static stuff filled in, and then just insert the rendered text. I can come reasonably close by using html and checkboxes, then figured that the insert would carry the thing home.

    (Ok, so I really like the idea of a database for lots of other reasons: to prepare for CDM and work across multiple SSPs, to be able to have decent change control and external analysis, to be able to run statistics on the language, etc. A flat textual document just isn't a good approach, imho.)

    However, it seems like it's not doing so -- perhaps because of size/memory issues, perhaps because of openoffice / libreoffice / MS Word conversions, the checkboxes look awful. So, I'm casting about for an alternative -- rendering directly into MS Word.

    Another complication is that I work with a linux box, not a Windows box, so I'd rather have a native perl approach than Win32::OLE. I've used that many times in the past, and could make that work I suppose, it just seems inelegant.

    Is there something like Excel::Writer::XLSX for microsoft word? Or is there a better way?

Issue with multiple options for a Getopt::Long argument
1 direct reply — Read more / Contribute
by Anonymous Monk
on Jan 22, 2018 at 14:36

    Hi all.

    I just noticed something I can't understand with Getopt::Long yesterday. From the manual:

    Options with multiple values
    Options sometimes take several values. For example, a program could use multiple directories to search for library files:
        --library lib/stdlib --library lib/extlib
    To accomplish this behaviour, simply specify an array reference as the destination for the option:
        GetOptions ("library=s" => \@libfiles);
    Alternatively, you can specify that the option can have multiple values by adding a "@", and pass a scalar reference as the destination:
        GetOptions ("library=s@" => \$libfiles);

    And this is what I have

    use strict;
    use warnings;
    use Getopt::Long;
    use Data::Dumper;

    ##### THREE VARIATIONS #####

    ## Does work:
    my $options_ok = GetOptions ('option=s@' => \(our $option_array_ref));
    my @option_array = @$option_array_ref;

    ## Does work:
    our $option_array;
    my $options_ok = GetOptions ('option=s' => \@option_array);

    ## Doesn't work:
    my $options_ok = GetOptions ('option=s' => \(our @option_array));

    ##### COMMON CODE #####

    if (not $options_ok) {
        print "Problem with options\n";
        exit -1;
    }

    print Dumper(@option_array);

    exit 0;

    Run with something like:

    perl --option OPTION1 --option OPTION2
    The strange thing is that this works for a single argument option:
    my $options_ok = GetOptions ('option=s' => \(our $option_variable));

    So I had a bunch of those (saves lines for declaration of the variables), and when I wanted to switch one into a multi-value version, it didn't behave as I expected...

    Wonder if anybody has an explanation, it must have to do with how references to variables and strings are returned on declaration, but it's not clear in my mind.
    Any insights appreciated.

    Cheers, Mark Collins.
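
    A likely explanation, sketched below: per perlref, a backslash applied to a parenthesized array, as in \(@array), returns one reference per element rather than a reference to the array itself, so an empty array yields an empty list and GetOptions receives no destination. Dropping the parentheses gives a single array reference. The names @demo and @args are made up for this illustration, and GetOptionsFromArray is used only so the example does not depend on the real command line.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# \(@array) yields one reference per element, so an empty array yields nothing.
our @demo = ();
my @refs = \(@demo);
print "number of refs from \\(\@demo): ", scalar(@refs), "\n";

# Without the parentheses, \our @array is a single reference to the array itself.
my @args = ('--option', 'OPTION1', '--option', 'OPTION2');   # simulated command line
my $options_ok = GetOptionsFromArray(\@args, 'option=s' => \our @option_array);

print "ok=$options_ok values=@option_array\n";
```

    This keeps the one-line declare-and-bind style from the question; the only change is removing the parentheses around the array declaration.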

How do I allow my test script to get rsync to archive file ownership?
3 direct replies — Read more / Contribute
by nysus
on Jan 22, 2018 at 12:16

    OK, I've been struggling with this for a while now and getting nowhere.

    I've got a test script t/test.t. My test script loads a Moose object that has a wrapper for Net::OpenSSH which I use to create an rsync archive from a remote server to my local server. I set up Vim so that when I hit <F7>, it will call a custom script I have called which executes my test script with the prove command and displays the test results in a separate tmux pane. It all works great.

    The problem is that since my test script is run under my local user account, it does not have the proper permission to change the uid/gid of the downloaded files. I tried to fix this by editing my bash script to run sudo -HE prove ... with the -HE options to try to preserve my local user's environment so the root user would still have access to my PERL5LIB path. But it didn't work. I still get the Can't locate in @INC error. Apparently, sudo strips out the PERL5LIB path for security reasons.

    The other thing I struggle with is I want to avoid typing in my password every time I run my test. Typing in my password dozens of times is not my idea of fun.

    Is there a way to securely give my script the ability to change permissions on the downloaded files (and preferably without the need to enter my password)?

    $PM = "Perl Monk's";
    $MCF = "Most Clueless Friar Abbot Bishop Pontiff Deacon Curate Priest";
    $nysus = $PM . ' ' . $MCF;
    Click here if you love Perl Monks

Appending single record to CSV (or TDV, not too late to switch) - filling 13 fields from 13 files, one of which is split into array of 2 and I just need half of it...
2 direct replies — Read more / Contribute
by hrholmer
on Jan 22, 2018 at 07:01

    I've got a site or three where I need to track incoming URLs that have been received before, and what data was presented for that URL -- the URLs are specific to certain users/groups/whathaveyou, and they need to be able to link back to a given URL and see about 20 variables that are the same as the last time they visited -- or when they send their friends the links, the friends see the same thing instead of a jumble of too much rotating content. Anywho, a simple CGI Script and Perl .pm run it and I don't use most of it. But I do use modules that let me put out rotating content.

    That's fine for a first visit to a URL by anybody -- to just get rotating content, but once a URL has been visited for the very first time, I want to immediately record its unique URL structure, and store 12 numbers along with the key part of that structure in a database, and I'm thinking a CSV is good enough since I don't want to mess with SQL. Sites, btw, are plain old, out-of-date HTML, and the server is dedicated and new and very fast and I'm about the only user on it, so: 1.) Plenty of horsepower (40-core, 128GB DDR4) and ample fast storage (NVMe drives in RAID1) and cPanel/WHM on CentOS 7.4 (newest) kept current with KernelCare and running CloudLinux with CageFS and mod_lsapi. ...and 2.) I'm running Perl5 and from WHM I can easily add ANY module CPAN offers, so I can be very flexible in solution if somebody is going to end up telling me to use, say, Text::CSV in a certain flavor, for instance.

    So out-of-the-box "total" solutions are great too. On to problem at hand. To make database small I hacked one of my modules that rotates content to hand back to the server an extra value in addition to the content it served out when a URL is called. The module creates a second MYvariable, let's call it $consistent_lineX, (where X is the module it came from -- yes 1 module per rotated content displayed...) and that variable holds the line number of the data that was served out from a .txt file to the viewer when the URL was first called. (obviously the module also puts out the content itself, which gets displayed in a template.) But now I know the line number of where the data came from and I can use it when the same URL is called and make sure that the software goes to the .txt storage file and pulls whatever is in that line number and re-displays it -- so the user has a consistent experience.

    By hacking some existing code I am able to get the URL called into a variable and then I also have 12 more variables $consistent_line1 - $consistent_line12 that have the line numbers stored of what was served out from EACH of these mods' .txt files last time the URL was seen. I already have figured out how to handle a repeat-visit to a URL. I can open my .CSV and find the line that matches on the url-string (first field) and I can fetch the string that matches and I can take those line numbers and go to each module's .txt file and pull out the data that was previously served out last time that URL was called, and put it into the variables that will feed into the template. User sees same 12 things, and all I had to store was line number, not the whole big batch of data. That's the point... to keep the cache small

    And it is small, it's 13 fields per line, with the first field being part of the URL to match on to see if it's been served out before, and the remaining 12 fields of each record are the line numbers presented from modules 1 through 12 in order. I've got the matching figured out and the retrieval figured out and all that when a URL is repeated. BUT, my problem is new URLs. The first time a URL is seen, it's parsed (SPLIT) in two, and the back half (2nd of 2 arrays) is where I've got it now in a file, let's call it: $parsed_url1.

    So, when the value of that $parsed_url1 doesn't match the first field in the cache, I know the URL hasn't been seen before, BUT I want the user to have a consistent experience, so I need to append a new record, where the first field is that back half of the parsed URL, now held in the 2nd array field of the $parsed_url variable ($parsed_url1), and the other 12 fields of this appended (new) record need to come from $consistent_line1, $consistent_line2, $consistent_line3, etc. variables, in order, up to $consistent_line12.

    So I can't figure out the best, neatest way to get those pushed? printed? what? into >> that .csv file as a new record. I've read a lot of "solutions" for other (not terribly similar) issues that come from using Text::CSV and its variants, and I'm happy to install ANY Perl Modules on my system that will let me do this efficiently.

    I've got to append a new record to a .CSV file (it's not too late to change that to TDV or whatever, just nothing complicated, and no SQL) and to fill 13 fields with half of a split variable plus 12 more contents held in variables, in order.

    I started playing with the long way around where I transfer the value of that 2nd array 1 from the first file to a temporary variable, then maybe concatenate, then push, or try to get all the 13 values into 1 variable split into an array of 13 then pushing it in with something like push (@temporary,(join, $whatever, @array),"\n") and then using  foreach (@temporary) print $_; print $cachefile $pray_it_appended Yes, I'm not putting it in right syntax, that's not my question -- this is bigger picture. I can get syntax right once I have a solution route, and I'm SO LOST looking for one -- that's to give you an idea of the different "solutions" I'm seeing -- that particular one takes a lot of fiddling around to get all the field values into a single file split into an array, and then a lot of code to push/print/cajole them eventually into my cache.csv as an appended record without overwriting or worse.

    1. Yes, I'm lost. But I know I'm lost and that's the first step. Need advice on shortest route between my newbie stupidity here and appending those records for first-time seen URLs over there.

    2. Yes, I got the rest of it working -- I know you don't believe that... But since I can't append records, I tested the rest of it by filling the cache.csv file with made-up records and running the script and it does indeed find a matching first field when compared with the same portion of the incoming URL, and yes, it does go to the text files and goes to the line number in each file,  something like while(<QUOTEFILE>) $used_url_line = $_ if /\(some version of $parsed_url[1])\/ and pulls out the full line content from the cache on a first field match and I chomp it to ditch newline, and then parse it into 13 array fields and then do 12 separate operations on 12 separate files (cause I only have the file line where the data lives, not the data, yet) to open the text file and go to the $selectedline for each and use a pipe split to grab from newline back and then use the [0] field of each of those variables arrays (the part with the actual data in it) to convey that data to the actual variables that go to each of the 12 fields in the templates. So I'm sorta freehanding what I did here at 27 hours awake when I show code, but you get the point... the rest of it DOES work.

    3. And it's not even slow, but surely somewhat messy, but I'm learning, and now, NOW I'm just stuck on appending the new record to the .CSV filling 13 fields with the value of 13 variables, the first of which (the one needing to populate the first field) holds the value in the 2nd half of its array... I know this is more reading than you ever wanted to do, and I'll take all the slams in the world, but I'm having fun... I'm just stuck on appending a 13-variable CSV record from 12.5 files... Now I'll shut up and take my medicine in the form of your laughter, and hopefully help.

    Reminder that I can and will install ANY Perl Modules to my server from the thousands at CPAN for a shortcut. Oh, yeah, and I don't use file-locking, oops, forgot that, and sometimes the bots come raiding and ignore my robots.txt instructions to go SLOWLY... Yeah, so there's that... No, I'm not high... maybe tired from cobbling together this little mess here...

    If you're still reading this, my THANKS for your patience in this alone. You are truly serene if you made it to here.
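
    One possible route for the append step, as a minimal sketch using only core Perl: open the cache in append mode, take an exclusive lock (the post mentions that no file locking is in place yet), and print one joined line. The sub name append_record, the sample values, and the temporary directory are assumptions made for the illustration; the quoting covers only double quotes in the URL key, so a CSV module such as Text::CSV would be the safer choice if the fields can contain anything more exotic.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Spec;
use File::Temp qw(tempdir);

# Append one record: the URL key followed by the 12 line numbers.
sub append_record {
    my ($file, $key, @line_numbers) = @_;

    # Minimal CSV quoting for the key; the other fields are plain integers.
    (my $quoted = $key) =~ s/"/""/g;
    my $record = join ',', qq("$quoted"), @line_numbers;

    # '>>' appends, so existing records are never overwritten.
    open(my $fh, '>>', $file) or die "Cannot open $file for append: $!";
    flock($fh, LOCK_EX)       or die "Cannot lock $file: $!";
    print {$fh} "$record\n"   or die "Cannot write to $file: $!";
    close($fh)                or die "Cannot close $file: $!";
    return $record;
}

# Hypothetical stand-ins for the split URL and $consistent_line1 .. $consistent_line12.
my @parsed_url       = ('example.com', 'some/page?id=42');
my @consistent_lines = (3, 17, 5, 9, 21, 4, 8, 15, 16, 23, 42, 1);

# A throwaway directory for the demo; the real script would point at its own cache.csv.
my $cache_file = File::Spec->catfile(tempdir(CLEANUP => 1), 'cache.csv');
print append_record($cache_file, $parsed_url[1], @consistent_lines), "\n";
```

    Keeping the 12 line numbers in an array rather than 12 separate scalars is what makes the join a one-liner; the same sub works unchanged if the separator is switched to a tab.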

perl process slower and slower when loop number increase
7 direct replies — Read more / Contribute
by Anonymous Monk
on Jan 22, 2018 at 03:20
    Hi Monks,

    I have these simple one liner tests in my CentOS 7 vps box

    1a. test perl loop with $i=10
    $ time perl -e 'for($i=0;$i<=10;$i++){}'
    real 0m0.003s
    user 0m0.000s
    sys 0m0.000s

    1b. test PHP loop with $i=10
    $ time php -r 'for($i=0;$i<=10;$i++){}'
    real 0m0.027s
    user 0m0.020s
    sys 0m0.004s

    in average test 1a and 1b, perl is 9x or 10x faster than PHP
    ------------------------------------------------------------
    2a. test perl loop with $i=10000
    $ time perl -e 'for($i=0;$i<=10000;$i++){}'
    real 0m0.005s
    user 0m0.004s
    sys 0m0.000s

    2b. test PHP loop with $i=10000
    $ time php -r 'for($i=0;$i<=10000;$i++){}'
    real 0m0.027s
    user 0m0.024s
    sys 0m0.000s

    in test 2a and 2b, perl still much faster than PHP
    ------------------------------------------------------------
    3a. test perl loop with $i=1000000
    $ time perl -e 'for($i=0;$i<=1000000;$i++){}'
    real 0m0.091s
    user 0m0.088s
    sys 0m0.000s

    3b. test PHP loop with $i=1000000
    $ time php -r 'for($i=0;$i<=1000000;$i++){}'
    real 0m0.045s
    user 0m0.044s
    sys 0m0.004s

    in test 3a and 3b, PHP already faster then perl
    ------------------------------------------------------------
    now, use extreme loop number like all benchmark code in alioth benchmark game:

    4a. test perl loop with $i=100000000
    $ time perl -e 'for($i=0;$i<=100000000;$i++){}'
    real 0m7.310s
    user 0m7.304s
    sys 0m0.000s

    4b. test PHP loop with $i=100000000
    $ time php -r 'for($i=0;$i<=100000000;$i++){}'
    real 0m1.624s
    user 0m1.616s
    sys 0m0.004s

    in test 4a and 4b, PHP is much much faster then perl

    I already tried comparing with Java, Node, and C, and the results are the same as above: a higher loop count makes every loop slower, but none gets as extremely slow as Perl.

    Is there any setting or environment variable to make Perl loops faster for big numbers like the tests above?
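
    Rather than a setting, the loop form itself may be worth measuring. The sketch below uses the core Benchmark module to compare the C-style loop over a package variable (as in the one-liners above), the same loop over a lexical, and a foreach over a range; on many Perl builds the range form tends to come out ahead, but that is an expectation to check on your own box, not a guarantee. The variant names and the iteration count are arbitrary choices for the illustration.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $N = 1_000_000;   # loop bound, same as test 3a above
our $g;              # package variable, like the bare $i in the one-liners

# Each variant counts its iterations so the work is comparable.
my %variants = (
    c_style_global => sub {
        my $n = 0;
        for ($g = 0; $g <= $N; $g++) { $n++ }
        return $n;
    },
    c_style_lexical => sub {
        my $n = 0;
        for (my $i = 0; $i <= $N; $i++) { $n++ }
        return $n;
    },
    range_foreach => sub {
        my $n = 0;
        for my $i (0 .. $N) { $n++ }
        return $n;
    },
);

# Run each variant 5 times and print a relative-speed table.
cmpthese(5, \%variants);
```

    Note that the time command in the original tests also includes interpreter start-up, which matters for the small loop counts but not for the large ones.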

Sorting and ranking
5 direct replies — Read more / Contribute
by Anonymous Monk
on Jan 21, 2018 at 06:32
    Hi Monks
    I would like your wisdom regarding why my code below does not work:
    Task: from the given hash, rank the keys based on the values, also taking into account that if 2 or more keys have the same value, they get the same ranking.
    my %data = (
        'car'        => 180,
        'motorcycle' => 150,
        'skate'      => 150,
        'bird'       => 120,
    );
    my @keys = keys %data;
    my @sorted[ sort { $data{$keys[$b]} <=> $data{$keys[$a]} } 0..$#keys ] = 1..@keys;
    my $rank = 1;
    my @ranks = ();
    foreach $count (0..$#sorted) {
        $rank = $count + 1 if ($count > 0 && $sorted[$count] != $sorted[$count - 1]);
        push @ranks, $rank;
    }
    print join ",", @ranks;

    The code prints 1,2,3,4, while it should print 1,2,2,3 since motorcycle and skate have the same value. Also, could you tell me how I can also print the key next to its ranking (motorcycle, skate etc)?
    Many thanks!
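
    One way to get the 1,2,2,3 ranking, as a sketch: sort the keys by value, then walk the sorted list and bump the rank only when the value changes. Comparing the values (not the positions) is the part the code above misses. The names @by_value, %rank_of and $previous are made up for this illustration, and ties are broken alphabetically only to make the output order stable.

```perl
use strict;
use warnings;

my %data = (
    car        => 180,
    motorcycle => 150,
    skate      => 150,
    bird       => 120,
);

# Highest value first; equal values are ordered by key name.
my @by_value = sort { $data{$b} <=> $data{$a} or $a cmp $b } keys %data;

my %rank_of;
my $rank = 0;
my $previous;
for my $key (@by_value) {
    # Advance the rank only when the value differs from the previous one.
    $rank++ if !defined $previous || $data{$key} != $previous;
    $rank_of{$key} = $rank;
    $previous = $data{$key};
}

# Print each key next to its rank.
print "$_: $rank_of{$_}\n" for @by_value;
```

    This prints car: 1, motorcycle: 2, skate: 2, bird: 3, one per line.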
Set a variable in calling package
6 direct replies — Read more / Contribute
by TerryBerry
on Jan 20, 2018 at 21:25

    I'm trying to figure out how to set a variable in the calling package, that is, the package that uses my package.

    That question might not have been entirely clear, so let me provide an example. Consider the following three files:

    ---
    #!/usr/bin/perl -w
    use strict;
    use lib './';
    use MyPackage;

    ---
    package MyBase;
    use strict;
    use Utils;

    # return true
    1;

    ---
    package MyPackage;
    use strict;

    # export
    use base 'Exporter';
    our @EXPORT = qw{mysub};

    # import
    sub import {
        # what goes here to set $var in the caller package?
        # E.g. $MyPackage::var

        # export symbols
        MyPackage->export_to_level(1, @_);
    }

    # some subroutine
    sub mysub {
        # how do I access $var in the caller package?
        # E.g. $MyPackage::var
    }

    # return true
    1;
    Here's what I want to happen. When MyPackage uses Utils, Utils creates a variable in MyPackage which can be accessed as $MyPackage::var. Furthermore, mysub() needs to be able to get to that variable when it's called. Of course, Utils doesn't know the name of the caller until, y'know, it's called, so I can't just hardcode in a package name.

    I think I could muck out a solution using a lot of evals, but that seems like it would be very inefficient. Something a little more in keeping with how symbol tables work seems like a better solution.
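
    A sketch of the symbol-table route, without string evals: inside import, caller gives the name of the package doing the use, and a symbolic reference (with strict refs switched off locally) reaches that package's $var. Everything is squeezed into one file here, so a BEGIN block calling Utils->import stands in for a real "use Utils;", and the export is done by hand instead of through Exporter; both are simplifications made for the illustration.

```perl
use strict;
use warnings;

package Utils;

sub import {
    my ($class) = @_;
    my $caller = caller;    # the package that said "use Utils"

    no strict 'refs';
    # Set $var in the calling package via the symbol table.
    ${"${caller}::var"} = "set by $class for $caller";
    # Hand-rolled export of mysub into the caller.
    *{"${caller}::mysub"} = \&mysub;
}

sub mysub {
    my $caller = caller;    # the package that called mysub()

    no strict 'refs';
    return ${"${caller}::var"};
}

package MyPackage;

# Stands in for "use Utils;" because Utils lives in this same file.
BEGIN { Utils->import }

our $var;
print mysub(), "\n";    # prints "set by Utils for MyPackage"
print "$var\n";         # same value, read directly from MyPackage
```

    Because mysub also looks up caller at run time, it reads the variable of whichever package calls it, which matches the requirement that the package name is not hardcoded.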

Generate list of possibilities from incomplete input
3 direct replies — Read more / Contribute
by Pascal666
on Jan 20, 2018 at 19:25

    Given a string containing 9 hexadecimal characters, I need to generate a list of all possible 10 character hexadecimal strings that contain the given 9 characters in order. The list need not be in any specific order.

    I think the complete list will have 16x10=160 items, but 9 of those will be duplicates. Don't worry about removing the duplicates, but if your solution doesn't include them that is fine too.

    Obviously, this can be done with a couple nested loops and string concatenation. Somehow, that just doesn't feel like the best way to me though.

    For example, given 0ae4bb830, the list would include:

    00ae4bb830 10ae4bb830 20ae4bb830 30ae4bb830 40ae4bb830 50ae4bb830 60ae4bb830 70ae4bb830 80ae4bb830 90ae4bb830 a0ae4bb830 b0ae4bb830 c0ae4bb830 d0ae4bb830 e0ae4bb830 f0ae4bb830 00ae4bb830* 01ae4bb830 02ae4bb830 03ae4bb830 04ae4bb830 ...
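
    For comparison, here is a sketch of the nested-loop approach using four-argument substr to insert a digit at each position, which avoids manual concatenation. The variable names are made up for the illustration, and the de-duplication step is optional since duplicates are acceptable.

```perl
use strict;
use warnings;

my $input      = '0ae4bb830';
my @hex_digits = (0 .. 9, 'a' .. 'f');

my @candidates;
for my $pos (0 .. length($input)) {        # 10 insertion points, including both ends
    for my $digit (@hex_digits) {
        my $candidate = $input;
        substr($candidate, $pos, 0, $digit);   # insert without replacing anything
        push @candidates, $candidate;
    }
}

# Optional: drop duplicates while keeping first-seen order.
my %seen;
my @unique = grep { !$seen{$_}++ } @candidates;

printf "%d candidates, %d unique\n", scalar @candidates, scalar @unique;
```

    For this input the sketch produces 160 candidates, 151 of them unique, which agrees with the estimate of 9 duplicates.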
What does Test::LeakTrace do?
1 direct reply — Read more / Contribute
by TerryBerry
on Jan 20, 2018 at 16:53

    Could you nice folks clarify what Test::LeakTrace does and what problems it finds? I wonder if I misunderstand because it's giving me errors for very, very simple code. Or maybe there are some very simple things about Perl I've been doing wrong all these years. You decide (then tell me).

    Consider this code:

    #!/usr/bin/perl -w
    use strict;
    use Test::LeakTrace;

    # test for leak
    leaktrace {
        require './';
    };

    is, well, almost empty:
    Here's the output from running the script:
    leaked SCALAR(0x558ad5e37f40) from ./ line 7.

    It seems weird to me to get any kind of memory leak for such a very basic script.

    Here's my perl version:

    This is perl 5, version 26, subversion 0 (v5.26.0) built for x86_64-linux-gnu-thread-multi
    (with 56 registered patches, see perl -V for more detail)

    I'm running on Ubuntu Linux 17.10.

Multiple params
3 direct replies — Read more / Contribute
by htmanning
on Jan 20, 2018 at 15:05

    I'm missing something here. I have a link with 2 params, (cell and target_dir).

    use CGI qw(:standard);
    use CGI 'param';
    $store = new CGI;
    $cell = $store->param('cell');
    $target_dir = $store->param('target_dir');

    The link looks like this:

    All I get from the script is the target_dir. I'm not getting the cell returned. What am I doing wrong?
