Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl: the Markov chain saw

The Monastery Gates

( #131=superdoc: print w/ replies, xml ) Need Help??

Donations gladly accepted

If you're new here please read PerlMonks FAQ
and Create a new user.

New Questions
Hash searching
2 direct replies — Read more / Contribute
by Oligo
on Aug 24, 2016 at 07:29

    Hello Monks,

    I'm trying to master the art of searching with hashes. I have a file with a list of search terms (one line = one term):

    J00153:42:HC5NCBBXX:6:1101:10896:14959 J00153:42:HC5NCBBXX:6:1101:10896:14959 J00153:42:HC5NCBBXX:6:1101:26616:20709 J00153:42:HC5NCBBXX:6:1101:27549:19935

    ...and a master file I want to search for those terms in (again, one line per record):

    J00153:42:HC5NCBBXX:6:1101:10896:14959 99 gnl|Btau_4.6.1|chr16 + 72729218 1 12M J00153:42:HC5NCBBXX:6:1101:27549:19935 83 gnl|Btau_4.6.1|chr8 + 49556412 1 7M

    I started by opening the query file and reading each line into an array. Then while-ing through the master file, returning matching lines where the array elements match the relevant part of the master file:

    # Open query file and read into array $queryfile = $ARGV[0]; open (QUERYFILE, $queryfile) or die "Cannot open query file\n"; @queries = <QUERYFILE>; close QUERYFILE; # Open main file $mainfile = $ARGV[1]; open (MAINFILE, $mainfile) or die "Cannot open searchable file\n"; # Search through main file while ($inline = <MAINFILE>) { @split = split /\t/, $inline; $ID = $split[0]; if (grep /$ID/, @queries) { print $inline; } else { } } exit;

    This works fine, but the files are huge and the code takes an age to run. So, I tried converting the array to a hash (array elements = hash keys, values all = 1) but I can't seem to get the pattern matching syntax right; the code runs much faster but nothing comes back. So far I have:

    # Open query file and read into array $queryfile = $ARGV[0]; open (QUERYFILE, $queryfile) or die "Cannot open query file\n"; @queries = <QUERYFILE>; close QUERYFILE; # Convert array to hash %hash = map {$_ => 1} @queries; # Open main file $mainfile = $ARGV[1]; open (MAINFILE, $mainfile) or die "Cannot open searchable file\n"; # Search through main file while ($inline = <MAINFILE>) { @split = split /\t/, $inline; $ID = $split[0]; if (defined $hash{$ID}) { print $inline; } else { } } exit;

    Any Perly wisodom greatly appreciated!

Advice wanted for debugging CPAN Testers failures
3 direct replies — Read more / Contribute
by pryrt
on Aug 23, 2016 at 11:59
    Fellow Monks,

    How do you go about debugging failures from CPAN Testers when your own configurations are not failing? I'd like advice, both in general, and anything you see in my specific examples below.

    For example, this test matrix has a bunch of failures -- but when I test on my machines, I cannot replicate the errors they are getting.

    Before releasing, I tested on a couple of different versions I have access to (strawberry perl 5.24.0_64 on Win7 and an ancient CentOS 4.6 linux 2.6.9-55 with perl 5.8.5), and neither failed my test suite. And since I've seen the CPAN Testers failures, I've started increasing my berrybrew installations to improve version coverage -- but so far, they've all passed, even when they've been on Perl versions that failed in the linux column.

    After I've exhausted available Strawberry installations, I will probably grab one of my linux virtual machines and start increasing perlbrew installations, and run through as many as I can there (I cannot install perlbrew or other local perls on the CentOS machine I noted, due to disk restrictions). But even with trying a new slew of versions, I cannot guarantee that I'll see the same failures that CPAN Testers is showing me.

    I know where I'll be looking for the specific errors: my expected values are wrong; the expected values were being generated by functions I thought were fully tested earlier in my test suite, so I'll have to look into that some more, and also see if maybe I should independently generate the expected values.

    But if I cannot replicate the exact failures from CPAN Testers, it's going to be harder to know I've solved the problem. When doing my last release to add features, I ended up submitting beta versions to CPAN, with extra debug printing, and waiting overnight while the CPAN Testers ran, then basing my fixes on changes in those results. But that's a rather slow debug process... and I noticed that every submission, I was getting fewer results from TESTERS: I think some of those auto-testers have some sort of submission limits, or otherwise remember that a particular module fails and stops testing new versions.

    Any advice, generic or specific, would be welcome.

Devel::Cover for with different command-line input arguments
2 direct replies — Read more / Contribute
by tito80
on Aug 22, 2016 at 15:08
    Hi PerlMonks,

    I am a newbie and pardon my ignorance. I am trying to use Devel::Cover (DC) to get an idea of the code coverage of my ran with arg1 arg2, etc. as command-line input arguments. I am using Perl 5.8.9 with 1.23 DC. The steps I executed successfully to generate the html file are:

    1) perl -MDevel::Cover -arg1 <arg1_value> -arg2 <arg2_value>

    2) Devel-Cover-1.23/bin/cover cover_db --report=html

    Now, I do not have packages/modules to test directly. But I have the file which invokes several packages.

    And I need to test with several different combinations of command line i/p arguments (i.e. <arg1_different_value1>, <arg2_different_value_2>, etc.). All these test combinations are a part of my regression suite (you can think of the options written in a test.txt file sequentially) and I am trying to see if Devel::Cover can assure that I have 100% code coverage with these tests.

    While I can run the above in a for-loop, each time with different arguments, I am not sure if that is the best way since I will end up with 100's of html/report files that need to be merged for any meaningful purpose.

    Could you kindly provide some pointers?

    Thanks, Tito

How to display name of current package
3 direct replies — Read more / Contribute
by sylph001
on Aug 22, 2016 at 06:16

    Dear monks,

    Recently I tried to find a way to display the name of the perl package that the program is in, however didn't get any clue about this.

    Could you let me know if there is any special varaible or any other method that I can get the package name?

    Thanks a lot in advance

    For example, I'm writing a package in "", and I need to display the package name when the program is executing this package, like in below:

    package MyPackage; use warnings; my $pkg_name = <how to know the package name?>; print "I'm currently in the $pkg_name package!"; exit 0;

Master Pages and OpenOffice::OODoc
1 direct reply — Read more / Contribute
by emav
on Aug 20, 2016 at 21:16

    I've been trying to process some XML files to create OpenOffice text documents and have been mostly successful so far. However, I would like to add page numbers in the footer, which should appear on all pages but the first couple of them where the title is supposed to appear.

    So far, I think I've figured out that the way to go about achieving this is through master/layout pages but I have been unable to accomplish the required task. Of course, it would be easier to simply produce the basic document and then edit it through LibreOffice to get what I need but... you know how it is! I've got to know now!

    So, the sample code below produces what I expect: An .odt document with the title on the first page without a page number in the footer and the text appears in the following pages with a page number in the footer. However, the problem seems to be that every page contains one paragraph only preceded by a page break.

    I guess the problem lies somewhere in the way I connect master/layout pages to the paragraph styles... or something... but I couldn't find any concrete examples on line to help me fix my mistake(s). I would appreciate any pointers because this is almost driving me crazy.

    Here's the sample code:

    #!/usr/bin/perl -w use strict; use warnings; use utf8; use Win32; use Encode; use OpenOffice::OODoc; use Data::Dumper::AutoEncode; ooLocalEncoding('utf-8'); my $curfolder = Win32::GetCwd(); my $outfolder = $curfolder . '\\out'; my $outfile = $outfolder . '\\test.odt'; my $doc = odfDocument( file => $outfile, create => 'text', opendocument => 0, ); my $styles = odfDocument( container => $doc, part => "styles", ); my $centerfooter = $styles->createStyle( 'centerfooter', family => "paragraph", properties => { "fo:margin-top" => "0.5cm", "fo:text-align" => 'center', }, replace => 1, ); my $headerstyle = $styles->createStyle( "header", family => "paragraph", parent => "Standard", properties => { "fo:margin-top" => "8cm", "fo:text-align" => "center", "fo:break-after" => "page", }, replace => 1, ); $styles->styleProperties( $headerstyle, -area => "text", "fo:font-size" => "200%", "fo:font-weight" => "bold", ); $styles->setAttributes( $headerstyle, "master-page-name" => 'header', ); my $regularstyle = $styles->createStyle( "regular", family => "paragraph", parent => "Standard", replace => 1, ); $styles->setAttributes( $regularstyle, "master-page-name" => 'pagenumbers', ); my $pagelayout = $styles->pageLayout("Standard"); my $titlepage = $styles->createMasterPage( 'titlepage', layout => $pagelayout, ); my $pnpage = $styles->createMasterPage( 'pagenumbers', layout => $pagelayout, ); my $pn = $styles->createParagraph( '', 'centerfooter' ); my $pg = $styles->textField( 'page-number', style => 'centerfooter' ); $styles->appendElement( $pn, $pg ); $styles->masterPageFooter( 'pagenumbers', $pn ); my $wordlist = $doc->appendParagraph( text => '', style => 'header', ); $doc->extendText( $wordlist, uc 'Main Title', 'header' ); my $regular = $doc->appendParagraph( text => 'Some text.', style => 'regular', ); $doc->appendParagraph( text => 'More text.', style => 'regular', ); $doc->appendParagraph( text => 'And some more.', style => 'regular', ); $doc->save;
Insert images on Mojolicious website
1 direct reply — Read more / Contribute
by pazt
on Aug 20, 2016 at 20:06

    Hi, monks

    I can't insert images on my Mojolicious projects. I'm converting my CGI script to Mojo. I'm Mojo newbie yet.

    My current app structure is:
    myapp |- |- lib | |- | +- MyApp | +- Model | +- |- t | +- login.t +- templates |- layouts | +- default.html.ep |- index.html.ep +- <b>protected.html.ep</b>

    It's an example from Mojolicious::Guides::Growing

    My questions are:

    1. How can I insert an image on protected.html.ep?
    2. Where should I store my image files?
    3. How should be?

    Thanks for help

Do I have to trick Split?
1 direct reply — Read more / Contribute
by $h4X4_&#124;=73}{
on Aug 20, 2016 at 08:12

    I found out something strange about split. I guess it will not parse blank content till there is something at the end...
    I have a string that splits at a comma,but when there are repeated delimiters at the end they will not be added to the array till there is something at the end like so...

    #!perl use warnings; use strict; use Data::Dumper; my $str = "ABC,123,,,,,,"; # just ABC and 123 #my $str = "ABC,123,,,,,,this"; # now it shows the blanks. my @elems = split ',', $str; print Dumper \@elems;
    Any other ways maybe regex or should I trick the array and then pop the last content out?

Unable to read values from Excel file
4 direct replies — Read more / Contribute
by youhaveaBigEgo
on Aug 20, 2016 at 01:33

    Hi everyone,

    This problem is driving me nuts.
    It should be fairly straightforward, but something's wrong.
    My .xls file *(not xlsx) has only few values:

    -35, -39, -39, -60, -35, -39, -39, -36, -40, -40, -59, -36, -40, -40,

    I installed Spreadsheet::Read using Perl Package Manager.
    Not ppm install from command line, I used the GUI tool to install it.
    My perl version is 5.18.2 (from ActiveState Perl)
    (yes i know some people here may crib about it, and I'm on Win 7 OS as well. Chill)

    Here is my script:

    use Spreadsheet::Read; our $workbook = ReadData("tmp.xls", debug => 9); my ($i,$j)=0; my @cell; for($i=1;$i<3;$i++){ for($j=1;$j<7;$j++){ $cell[$j] = cr2cell($j, $i); print $workbook->[1]{$cell[$j]}; print ","; } print "\n"; }

    This should have worked.
    But instead of printing the values above in my sheet, I just get :
    I added the debug flag to understand what's wrong

    $Options = { 'debug' => 9, 'strip' => 0, 'rc' => 1, 'cells' => 1, 'dtfmt' => 'yyyy-mm-dd', 'clip' => 1, 'attr' => 0 }; ,,,,,, ,,,,,,

    What gives?
    What did I do wrong?
    Did the PPM miss a dependency or something?
    Weird thing is, script works fine on another PC (not mine),
    but I don't know what all packages that station has installed in it.
    The perl version is the same as mine.

    Can anyone shine some light?

combining 3 arrays of hashes
1 direct reply — Read more / Contribute
by pearlgirl
on Aug 19, 2016 at 14:21
    Based on the 3 conditions I need to generate 3 arrays of hashes ( long data from DB) and join them into 1, maintaining the same format. The format I get from the DB is this:
    [ {fruit=>apple, vegetable=> pumpkin, }, {fruit=>mango, vegetable=> tomatoes, }, {fruit=>strawberry, vegetable=> pumpkin, }, ]
    So all potential hashes will be generated in this format. I tried the following:
    my $arrayHash1 =(); my $arrayHash2 =(); my $arrayHash3 =(); my $FinalMegaHash =(); if ( condition1==true){ generate arrayHash1; $FinalMegaHash .= $arrayHash1; } if ( condition2==true){ generate arrayHash2; $FinalMegaHash .= $arrayHash2; } if ( condition3==true){ generate arrayHash3; $FinalMegaHash .= $arrayHash3; } print Dumper $FinalMegaHash;
    I know the syntax is wrong, but it illustrates the idea of what I want to do. Have been googling all over the place, can't find anything. :/ Any guidance would be much appreciated. Thank you
Instrumenting a genetic algorithm.
4 direct replies — Read more / Contribute
by BrowserUk
on Aug 19, 2016 at 10:56

    I have a very long running process that is testing randomly generated scenarios. At any given point in time I have two knowns: $T is the total number run so far; and $S is the number of those that have been successful. From those I can easily indicate the current success rate as a percentage.

    I'm looking for some mechanism to indicate whether the processing is nearing -- or tending towards -- some level of completion.

    If I introduce a third variable, target success rate, $TSR (say: 90%), then what further statistics can I derive that would give me a feel for how the process is progressing?

    For example, it would be possible to calculate a number of further tests that would need to be run, that -- assuming they were all successful -- would raise the success rate to the target value. But assuming the current success rate is less than the $TSR -- otherwise I wouldn't be posting -- that would be a naive statistic.

    So then I thought about trying to calculate the number of addition tests required to achieve the $TSR, assuming the current success rate continued; but of course, that will never happen.

    So then I thought about trying to calculate the trend in the current success rate; and if that trend shows a consistent decline, or reaches a point where the $TSR can never be reached, then I abandon processing with the current parameters and move on to the next set.

    I don't have any code because I'm just starting to think about the problem; and so far, I have got a clue what to code.

    If any of this sparks any thoughts, possibilities or suggestions; please post.

    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority". I knew I was on the right track :)
    In the absence of evidence, opinion is indistinguishable from prejudice.
Checking TYPEs after my declaration
2 direct replies — Read more / Contribute
by LanX
on Aug 18, 2016 at 10:37
    Oh All-knowing, All-seeing Trash Heap! :)

    It is possible to use a TYPE when declaring a variable  my TYPE $a; ...

    ... but is it also possible to introspect this "type", kind of a $a->istype("TYPE") or $a->gettype() ?

    my mentions fields but all I understood is that it's somehow deprecated since 5.8. (wasn't it meant for immutable hashes ...?)

    What's confusing me is that neither ref nor reftype nor Devel::Peek show any signs of this typing, so it must be stored in fields somehow...

    Bonus points if you can tell me:

    Is it even possible to create compile time checks of types?


    I remember being capable to create errors after such "typing", but can't reproduce.

    Cheers Rolf
    (addicted to the Perl Programming Language and ☆☆☆☆ :)
    Je suis Charlie!

Please review this: code to extract the season/episode or date from a TV show's title on a torrent site
3 direct replies — Read more / Contribute
by Cody Fendant
on Aug 18, 2016 at 03:17

    I'm trying to create a robust way of detecting which season and episode a TV show is from.

    Common patterns are things like "S5E6", "S5.E6", or "S05xE06" but there's no real consistency.

    Also I want to fall back to detecting a date for shows like The Daily Show, which don't really have seasons/episodes.

    So here's my first attempt, please comment. Also if you're interested, please test on bulk data from TV torrent websites, which is what I'm doing right now.

    One straightforward question: what's the best-practice way to remove leading zeros? Convert to a number with something like $foo = ($foo + 0)? Strip the zeros as characters? sprintf?

    #!/usr/local/bin/perl use strict; use warnings; use Data::Dumper::Simple; use Regexp::Common qw(time); my ( $count, $success, $failure ) = ( 0, 0, 0 ); while (<DATA>) { chomp; $count++; my $data = extract_show_info($_); if ($data) { print "Success? \n"; print Dumper($data); $success++; } else { print "Failure: $_\n"; $failure++; } } print "Processed: $count | Successes: $success | Failures: $failure \n +"; sub extract_show_info { my $input_string = shift(); my $result = undef; if ( $result = extract_episode_data($input_string) ) { $result->{type} = 'se'; } elsif ( my @date = $_ =~ /$RE{time}{ymd}{-keep}/ ) { $result = { type => 'date', year => $date[1], month => $date[2], day => $date[3] }; } return $result; } sub extract_episode_data { my $input_string = shift(); if ( $input_string =~ /s(\d+)\s*e(\d+)/i || $input_string =~ /s(\d+)\.e(\d+)/i || $input_string =~ /(\d+)x(\d+)/i || $input_string =~ /Season\s*(\d+),?\s*Episode\s*(\d+)/i || $input_string =~ /Series\.(\d+)\.(\d+)/ ) { my $episode_data = { season => $1, episode => $2 }; return $episode_data; } else { return; } } __DATA__ The.Walking.Dead.S01E03.FRENCH.LD.BDRip.XviD-JMT.avi 348.55 Mb Gogglebox.AU.s01e08.PDTV.x264.Hector.mp4 266.46 Mb Power S03E01 HDTV x264-FS.mp4 285.38 Mb Wentworth.s03e04.HDTV.x264.Hector.mp4 226.32 Mb Suits.S06E03.HDTV.x264-FUM[eztv].mp4 222.37 Mb Killjoys.S02E07.HDTV.x264-FUM[eztv].mp4 255.05 Mb Superfoods.The.Real.Story.Series.2.4of8.Seaweed.720p.HDTV.x264.AACmp4[ +eztv].mp4 439.43 Mb Keeping.Up.With.The.Kardashians.S12E01.Out.With.The.Old.In.With.The.Ne +w.HDTV-MEGATV.mp4 445.27 Mb Keeping.Up.With.The.Kardashians.S12E04.All.About.Meme.HDTV-MEGATV.mp4 +416.17 Mb Are You the One S04E08 HDTV x264-Nada.mp4 476.85 Mb Superfoods.The.Real.Story.Series.2.8of8.Avocados.720p.HDTV.x264.AAC.MV 430.01 Mb Kingdom 2014 S02E20 No Sharp Objects HDTV x264-TTL.mp4 457.65 Mb The.Big.Bang.Theory.S09E19.HDTV.x264-LOL[eztv].mp4 144.79 Mb Superfoods.The.Real.Story.Series.2.4of8.Seaweed.720p.HDTV.x264.AAC.MVG 439.43 Mb [] Vikings.S02E07.FRENCH.HDTV.x264-DEAL.mp4 309.86 Mb BBC.Inside.Einsteins.Mind.1080p.HDTV.x265.AAC.MVGroup.Forum.mp4 728.65 + Mb
Log In?

What's my password?
Create A New User
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others avoiding work at the Monastery: (6)
As of 2016-08-24 17:42 GMT
Find Nodes?
    Voting Booth?
    The best thing I ever won in a lottery was:

    Results (348 votes). Check out past polls.