Beefy Boxes and Bandwidth Generously Provided by pair Networks
Syntactic Confectionery Delight

Seekers of Perl Wisdom

( #479=superdoc: print w/replies, xml ) Need Help??

If you have a question on how to do something in Perl, or you need a Perl solution to an actual real-life problem, or you're unsure why something you've tried just isn't working... then this section is the place to ask. Post a new question!

However, you might consider asking in the chatterbox first (if you're a registered user). The response time tends to be quicker, and if it turns out that the problem/solutions are too much for the cb to handle, the kind monks will be sure to direct you here.

User Questions
... something in the environment?
2 direct replies — Read more / Contribute
by chexmix
on Feb 27, 2020 at 15:22
    Hi Monks,

    I'm tearing my hair out over a problem, and I'd appreciate any guidance or pointers.

    I worked very hard on an update to a script we use at work. My update involved using stored procedures instead of direct database (Sybase) queries. This way I could call the stored procedures and get back values identified as output params.

    The construct looks like this:

    my ( $thing_id, $action ); my $update_or_insert = $dbh1->prepare( qq/exec stored_procedure / . qq/\@p1 = ?, \@p2 = ?, \@p3 = ?, \@p4 = ?, \@p5 = ?, / . qq/\@p6 = ?, \@p7 = ?, \@p8 = ?, \@p9 = ?, / . qq/\@thing_id = ? output, \@action = ? output/); $update_or_insert->bind_param(1, $p1, SQL_INTEGER); $update_or_insert->bind_param(2, $p2, SQL_INTEGER); $update_or_insert->bind_param(3, $p3, SQL_INTEGER); $update_or_insert->bind_param(4, $p4, SQL_VARCHAR); $update_or_insert->bind_param(5, $p5, SQL_DATETIME); $update_or_insert->bind_param(6, $p6, SQL_VARCHAR); $update_or_insert->bind_param(7, $p7, SQL_VARCHAR); $update_or_insert->bind_param(8, $p8, SQL_VARCHAR); $update_or_insert->bind_param(9, $p9, SQL_VARCHAR); $update_or_insert->bind_param(10, $thing_id, SQL_INTEGER); $update_or_insert->bind_param(11, $action, SQL_VARCHAR ); $update_or_insert->execute(); ( $thing_id, $action ) = $update_or_insert->syb_output_params(); $update_or_insert->finish;

    I got this working flawlessly in testing. But now I am running it in our 'shadow test' environment and am getting uninitialized value complaints for the bind_param statements for $thing_id and $action. Ultimately the script fails.

    These were just as unitialized in my tests, but those runs didn't complain and they ran to completion.

    To be fair, the input files to the program have changed some and have additional lines -- these new lines are skipped by default anyway, but I added a line of code to do it exolicitly.

    I did figure out that our shadow test environment was running a different version of Perl. I fixed this: both are running 5.22.2, which is what I ran in my tests and what we run in production. This didn't help.

    I confirmed that the same version of DBD::Sybase is being run in both places.

    Is there something that could cause this that I am failing to consider? I asked our dba whether it could be Sybase-related but he said no.

    Very tired. I could figure something out after I post this, but ...



Getiing keys or values of a file based hash
2 direct replies — Read more / Contribute
by shabird
on Feb 27, 2020 at 12:56

    I have a file named "gene.txt", which looks like this GeneName GeneType

    APOL4 protein_coding CYP2C8 protein_coding NAALADL2 protein_coding NANOS3 protein_coding C20orf204 protein_coding MIR429 miRNA MIR200A miRNA MIR200B miRNA I have a file named "gene.txt", which looks like this GeneName GeneType APOL4 protein_coding CYP2C8 protein_coding NAALADL2 protein_coding NANOS3 protein_coding C20orf204 protein_coding MIR429 miRNA MIR200A miRNA MIR200B miRNA
    i have read that file in perl and assigned the values to a hash. Now i want to get the keys or values from that file as i have assigned it to hash but i am unable to get those. my point is to i want to get keys by using the keys() method but its not printing the values on the left as keys and its not printing anything i don't know why i will be very thankful if anyone could help me. Thank you in advance.

    #!/usr/bin/perl -w use strict; open FILE1, "/Desktop/gene.txt" or die; my %hash; while (my $line=<FILE1>) { chomp $line; # print $line,"\n"; (my $word1,my $word2) = split /:/, $line; $hash{$word1} = $word2; } # Print hash for testinf purposes while ( my ($k,$v) = each %hash ) { # print "Key $k $v\n"; @keys = keys(%hash); print @keys; }
Where to go to discuss a Perl module, as in "Why do I want to use this?"
5 direct replies — Read more / Contribute
by davebaker
on Feb 27, 2020 at 09:32

    Wondering where is a good place to go to discuss a Perl module, as in "Why do I want to use this? What does it do that's better than existing solutions/modules?"

    The Synopsis often doesn't give me an example that adequately answers that question. It often assumes I know exactly why I'd want to use the module, and why it's better than other modules or hand-coded solutions (if any).

    For example, -- I understand the benefit of static file sites, and I have written some apps in Dancer2, but I am wondering how and why one would convert data in a database (which my Dancer2 app uses to produce HTML displays) into the markdown files that are used as input to the CPAN module. Is that how and why one would use it? Is it better than other static site generating modules?

    Perhaps that particular question is dumb, but it's a sincere one, and yet it's one that I feel uncomfortable lobbing into Seekers of Perl Wisdom. So the generic question here (not so much the App-unbelievable question above, which is just an example) is, do the monks (or do other parts of the Perl ecosystem) have a more appropriate place?

    I've been a Perl enthusiast for 20 years, but I'm not a professional developer. Thanks for your advice!

Close script with open system call
5 direct replies — Read more / Contribute
by Phweda
on Feb 27, 2020 at 07:24

    Script ends when it opens its logfile by a call to 'system'. Even with explicit 'exit' the script keeps running. Assume it is waiting for the system call to terminate? In this case want to force the script to terminate.

    my $logFileName = $rootDIR . "/fmlog.txt"; open $OUTPUTFILE, "> $logFileName" or die "Died trying to open the out +put file $!"; # Processing files and logging results close ($OUTPUTFILE); # Open Log file for user system($logFileName); exit;
    Script stays open until text application is closed or until a ctrl-c Thanks Phweda
Regular Expression to Parse Data from a PDF
5 direct replies — Read more / Contribute
by kevyt
on Feb 27, 2020 at 01:13

    I have not been able to parse a few fields from a pdf file with CAM::PDF or regular expressions. Can someone offer help on how I may accomplish the task?

    I noticed that CAM::PDF changes $100 to $ 100.

    I was not able to split on \n so I split the line on the Id number in the far right column.
    The column AWD is the company that won.

    I would like to capture all of the columns except comments.

    Here are two Example files:



    #!/usr/bin/perl -w use warnings; use strict; use CAM::PDF; use LWP::Simple; my @ns_headers = ( 'User-Agent' => 'Mozilla/4.76 [en] (Win98; U)', 'Accept' => 'image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/png, */*', 'Accept-Charset' => 'iso-8859-1,*,utf-8', 'Accept-Language' => 'en-US', ); my $jacket_id = 746810; my $ua = LWP::UserAgent->new; # $ua->timeout(5); # Is the site available? my $response = $ua->get(' +/'. $jacket_id , @ns_headers); my $pdf = CAM::PDF->new($response->content) || die "$CAM::PDF::errstr\ +n"; # my $pdf = CAM::PDF->new('C:\dev\perl\file.pdf') || die "$CAM::PDF::e +rrstr\n"; # print $pdf->toString(); for my $page (1..$pdf->numPages){ my $text = $pdf->getPageText($page); my @lines = split (/$jacket_id\s+/, $text); # split on Jacket ID a +nd a space foreach (@lines) { print "\n$_\n"; if ( /^A/ ) { # A at the beginning of a line is the Award winn +er print $1; } if (/^(\d+\-)(\d+)/) { # Contractor Code print"Contractor code ". $1,$2 ."\n"; } if (/(\w+)\s+\$/ ) { # Does not work print"Name ". $1 ."\n"; # Name } # if (/\$?([0-9]{1,3},([0-9]{3},)*[0-9]{3}|[0-9]+)(\.[0-9][0-9] +)?$)/) { # Does not work # print"Amount ". $1 ."\n"; # Amount # } # if(1){ # Date # print "Date " . $1; # } } }
count multiple variables in a single array.
3 direct replies — Read more / Contribute
by flieckster
on Feb 26, 2020 at 14:42
    I have a script that goes into a folder on our server, glob's all the files in it, and sends an email to me with exactly what clients files might be in that folder. the files in that folder all have elements in the filenames that i can match to an expression to identify each clients files. i would like to glob the files, then look at the files in the array and match to a list of possible matches and report back to me which client files its found. as you can see with my code below i can match with individual if statements, then list them all out in my email, but that seems wrong, plus i'd rather not have any of the "0" counts show up.
    1 TUMI files 0 BW files 0 maurices files
    ideally i'd list the files in one email like such. any help in modifying my code to be more scaliabe would be helpful.
    1 TUMI file TUMI-1354839054_alt1 1.psd 2 maurices files maur-1111.psd maur-1110.psd
    code below::
    #!/usr/bin/perl -w use Net::FTP; use File::Copy; use File::Basename; use Email::Send::SMTP::Gmail; use POSIX qw(strftime); my $date = strftime("%m-%d-%y",localtime); my $time = strftime("%I:%M:%S",localtime); my $photo_error_folder = "/Volumes/photorepos/Partners/WorkHorse/ERROR +/PhotographyDrop_ERROR"; my $post_error_folder = "/Volumes/photorepos/Partners/WorkHorse/ERROR/ +PostDrop_ERROR"; my $subject = "ICS -- Files in the Error Folder "; my $dllist =''; chdir( $photo_error_folder ) or print "Cant chdir to $photo_error_fold +er $!"; my $count = (@photo_err_list) = glob '"*{jpg,tif,tiff,psd}"'; if ($count > 0) { my ($mail,$error)=Email::Send::SMTP::Gmail->new( -smtp=>' +m', -login=>'kopautomatio', -pass=>'xxx', -layer=> 'ssl', -port=> '465', -debug=> 1, -timeout=> 1000); $tumi_count = grep { /TUMI/ } @photo_err_list; if ($tumi_count > 0) { print "$tumi_count TUMI files\n"; } else { print "$tumi_count TUMI files\n"; } $bw_count = grep { /BW/ } @photo_err_list; if ($bw_count > 0) { print "$bw_count BW files\n"; } else { print "$bw_count BW files\n"; } $maurices_count = grep { /Maur/ } @photo_err_list; if ($maurices_count > 0) { print "$maurices_count Maurices files\n"; } else { print "$maurices_count maurices files\n"; } my $body="please fix and redrop or delete the<br> file://$photo_error_folder/<br><br> "; my $spacer ="<br>"; my $body1= join "<br>\n", @photo_err_list; $mail->send(-to=>"$dllist", -from=>"$dllist", -subject=>"$subject", -b +ody=>"$body $spacer $body1", -contenttype=>"text/html"); $mail->bye; }
Taking control back once Interactive script completes
1 direct reply — Read more / Contribute
by murugu
on Feb 26, 2020 at 11:21

    Greetings Monks,

    Working on a Perl script in which I need to call a third party executable. The executable is interactive and asks lots of random questions to the invoker. I want to invoke the executable from the Perl script and give the control to the script runner(invoker). Once the executable is filled with all the details, I want to have the control back in my script and to continue with the remaining code.

    # Lots of random code # unless (fork()) { exec "inter.exe", "--active"; } # Some remaining code

    The above approach is not working as it is not getting executable in interactive phase. I can not directly use exec here as it will obliterate the existing process.

    Murugesan Kandasamy
    use perl for(;;);

Capturing errors from backtick
3 direct replies — Read more / Contribute
by cormanaz
on Feb 26, 2020 at 10:39
    I am using backticks to run a Python script from the command line. It retrieves the publication date from html pages (if you're interested in that script see here): my $result = `htmldate -u $url`; When htmldate can't access the url, it writes an error to output # ERROR no valid result for url: and I'm wondering if there is a way to trap this error. The FAQ says "Backticks and open() read only the STDOUT of your command." I assume this means it assigns only the stdout output to $result in the above command, and it is writing the error message to whatever output is designated for stderr. Is there some way to detect this, short of redirecting stderr to a file, then opening, reading, and closing this every time I call htmldate?
Behaviour of parsed XML
3 direct replies — Read more / Contribute
by dalgetty
on Feb 26, 2020 at 06:08

    Dear Brethren,

    I know that Perl is not inconsistent, so it must be me.

    For years I have been using XML::Simple to parse several RSS feeds, and have 1500 lines of code running nicely, except for the rare occasion when there is only one item in the RSS feed.

    In this case the script fails, because $data->{channel}->{item}->[0] does not exist. Since there is only one entry, XML::Simple does not create {item}->[0], but puts the hash table straight into $data->{channel}.

    So I adjust the hash table as follows, and I can access the information I need:
    $mydata=$data->{channel}->{item}; my $data->{channel}->{item}->[0]=$mydata; if ($data->{channel}->{item}->[$y]) { while (($data->{channel}->{item}->[$y])&&($y>-1)) { $keyword=$data->{channel}->{item}->[$y]->{epfl_keywords}; ...
    The data is then correctly constructed:
    $VAR1 = { 'channel' => { 'item' => [ { 'epfl_is_internal' => 'False', +'link' => ' +st-cubesat/', 'epfl_organizer' => 'eSpace ', 'pubDate' => 'Mon, 16 Ma +r 2020 14:00:00 +0100', 'description' => "Incl ...
    However, this code needs to run when there are several items in the RSS feed too, so I only want to apply the above operation in cases of one item. In order to test for this I use the following code:
    if (exists($data->{channel}->{title})) { $mydata=$data->{channel}->{item}; my $data->{channel}->{item}->[0]=$mydata; print "Only one item is present"; } if ($data->{channel}->{item}->[$y]) { while (($data->{channel}->{item}->[$y])&&($y>-1)) { $keyword=$data->{channel}->{item}->[$y]->{epfl_keywords}; ...

    "title" is one of many keys that always exists in the RSS feed entries. If it exists directly within "channel" that means that there is only one RSS entry in the feed, and the message prints out "Only one item is present". So far, so good.

    However, I then get an error: "Not an ARRAY reference" for the second "if" statement in line 6, as if the restructuring had not happened.

    This seemed strange to me, since the single entry case had clearly been identified correctly. So I tried the following:

    if (1) { $mydata=$data->{channel}->{item}; my $data->{channel}->{item}->[0]=$mydata; } if ($data->{channel}->{item}->[$y]) { while (($data->{channel}->{item}->[$y])&&($y>-1)) { $keyword=$data->{channel}->{item}->[$y]->{epfl_keywords}; ...

    I fully expected this code to run smoothly, like the first attempt did. Of course, I would never have attempted to make Perl look inconsistent. But I get the "not an array" error again.

    All I am doing is testing for something, not changing anything. But just making an if(1) statement is enough to stop my code working correctly. What is even more confusing is that a Dumper print of the data shows that the data is correctly structured, as in the first statement, whether I apply the if statement or not.

    Can any of you please tell me where my inconsistency lies? Thanks to all

Detecting whether UV fits into an NV
6 direct replies — Read more / Contribute
by syphilis
on Feb 25, 2020 at 20:26

    I was initially going to mark this post as off-topic ... still not entirely sure whether I should have ....
    The question relates specifically to perls for which both $Config{ivsize} and $Config{nvsize} are both 8. That is, perl's integer type (UV/IV) is 64-bit, and perl's floating point type (NV) is either a double or an 8-byte long double.

    The aim is to determine whether a given integer value can be represented exactly as a double.
    Clearly, any integer <= 9007199254740992 can be represented exactly as a double. (9007199254740992 == 2 ** 53.)
    In addition to those values, however, any integer whose highest set bit and lowest set bit are separated by 51 or fewer bits is also exactly representable as a double.

    Here follows my solution. The question is "Is there a better way ?".
    Assume that the given arg is an integer in the range 0 .. 18446744073709551615 (with 18446744073709551615 being the largest possible UV value).
    As an XSub:
    int uv_fits_double(UV arg) { if(arg < 9007199254740993) return 1; while(!(arg & 1)) { arg >>= 1; if(arg < 9007199254740993) return 1; } return 0; }
    And as perl sub:
    sub uv_fits_double { my $arg = shift; return 1 if $arg < 9007199254740993; while(!($arg & 1)) { $arg >>= 1; return 1 if $arg < 9007199254740993; } return 0; }
    It annoys me that I can't find a way to detect and shift all of the trailing zero bits off in one hit - and that I instead have to detect and shift them off one at a time.
    The number of times that the "$arg < 9007199254740993" comparison is evaluated also annoys me. (Doing that evaluation inside the while loop means that the while loop will perform a maximum of 11 cycles. Without that evaluation it could perform up to 63 cycles.)

    As I understand it, the only thing I need to determine is "Are there more than 51 bits between the highest set bit and the lowest set bit ?", and I do that by shifting off all trailing unset bits so that I can then determine (by examining the remaining value) whether it fits into 53 bits or not.
    It feels like there ought to be a quicker, simpler way of doing it ... but I don't see one.
    A perl demo of the uv_fits_double sub:
    use strict; use warnings; use Config; die "This script not meant for this perl configuration" unless $Config{ivsize} == $Config{nvsize}; # The integer value 2251799813685249 is # representable exactly as a double. # Therefore 2251799813685249 * (2 ** 10) # is exactly representable as a double, # since 10 is well within the allowable # exponent range. # 2251799813685249 * (2 ** 10) is also # within the bounds of allowable integer # values. my $fits_d = 2305843009213694976; # 2251799813685249 * (2 ** 10) my $no_fits_d = $fits_d + (2 ** 6); print uv_fits_double(2251799813685249); # fits print uv_fits_double($fits_d); # fits print uv_fits_double($no_fits_d); # doesn't fit print uv_fits_double($fits_d + (2 ** 15)); # fits print "\n"; sub uv_fits_double { my $arg = shift; return 1 if $arg < 9007199254740993; while(!($arg & 1)) { $arg >>= 1; return 1 if $arg < 9007199254740993; } return 0; } __END__ Should output 1101 The 4th value fits, even though it's greater than the 3rd value (which doesn't fit).

Add your question
Your question:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":

  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.
  • Log In?

    What's my password?
    Create A New User
    and the web crawler heard nothing...

    How do I use this? | Other CB clients
    Other Users?
    Others avoiding work at the Monastery: (4)
    As of 2020-02-28 03:24 GMT
    Find Nodes?
      Voting Booth?
      What numbers are you going to focus on primarily in 2020?

      Results (122 votes). Check out past polls.