Seekers of Perl Wisdom

by gods
on Sep 07, 1999 at 20:28 UTC

If you have a question on how to do something in Perl, or you need a Perl solution to an actual real-life problem, or you're unsure why something you've tried just isn't working... then this section is the place to ask. Post a new question!

However, you might consider asking in the chatterbox first (if you're a registered user). The response time tends to be quicker, and if it turns out that the problem/solutions are too much for the cb to handle, the kind monks will be sure to direct you here.

User Questions
PDF to Text
2 direct replies
by 9mohit2
on Sep 29, 2016 at 01:29
    Hi, I want to convert a PDF to text while keeping the text positions the same as in the PDF, so that I can easily extract the required data. I have seen posts recommending pdftotext and Poppler, but can someone help me set these up on a Windows 10 machine? I can't find any clear discussion of installation or usage for either of them. Examples would be very useful. Do let me know if there is any other way. Thanks in advance.
Is there any way to override -X functions?
1 direct reply
by exilepanda
on Sep 29, 2016 at 00:48
    Dear monks,

    I've read perldoc CORE, but it doesn't say whether I can override the -X operators (i.e. -f, -d, -e, etc.). I tried this:

    BEGIN { no strict 'refs'; *{'CORE::GLOBAL::-f'} = sub { print "-f called"; } }
    but this doesn't work. I am trying to build a set of functions that deal with Unicode dirs and files while keeping the native syntax. I can override chdir, readdir and the like, but not -X. Any clue?
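    One avenue that keeps the -f/-d/-e syntax, sketched here rather than offered as a full answer: since Perl 5.12 the overload pragma accepts a '-X' key, so file tests applied to an object are routed to your own handler, which receives the test letter as its second argument. The class name My::Path and the dispatch table are illustrative assumptions; the Unicode-aware lookups the post needs would replace the plain built-ins inside the table.

```perl
use strict;
use warnings;

package My::Path;   # hypothetical class name, not from the post

# Test letters this sketch supports. Swap the bodies for Unicode-aware
# checks; the plain built-ins are only placeholders.
my %TEST = (
    e => sub { -e $_[0] },
    f => sub { -f $_[0] },
    d => sub { -d $_[0] },
);

use overload
    # Called for every file test on a My::Path object; $letter is 'f', 'd', ...
    '-X' => sub {
        my ($self, $letter) = @_;
        my $test = $TEST{$letter}
            or die "-$letter is not handled in this sketch\n";
        return $test->($self->{path});
    },
    '""' => sub { $_[0]{path} };

sub new {
    my ($class, $path) = @_;
    return bless { path => $path }, $class;
}

package main;

my $here = My::Path->new('.');
print "directory\n"        if -d $here;
print "not a plain file\n" unless -f $here;
```

    This only affects file tests on objects of that class, not on plain strings, so paths have to be wrapped before they are tested.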
Error : can't locate object method "createElement" via package XML::DOM::ELEMENT
1 direct reply
by ankit.tayal560
on Sep 29, 2016 at 00:33

    I've written the code below to modify the contents of my XML file, but when I tried to create new elements named "item" and "data" it gave me an error: can't locate object method "createElement" via package XML::DOM::ELEMENT. How can I get rid of this problem? Any suggestions? It prints Joey, 67890 and 4, and then the error I just mentioned pops up.

    use strict;
    use warnings;
    use Data::Dumper;
    use XML::DOM;

    my $parser=new XML::DOM::Parser;
    my $doc=$parser->parsefile('C:\perl\perl_tests\xmlin.xml') or die$!;
    my $root=$doc->getDocumentElement();
    my @address=$root->getElementsByTagName("address");
    foreach my $address(@address)
    {
        if($address->getAttribute("name") eq "tayal")
        {
            if($address->getAttribute("id")=='70889')
            {
                $address->setAttribute("name","Joey");
                $address->setAttribute("id","67890");
                $address->setAttribute("flags","4");
                my $temp1=$address->getAttribute("name");
                my $temp2=$address->getAttribute("id");
                my $temp3=$address->getAttribute("flags");
                print("$temp1\n\n");
                print("$temp2\n\n");
                print("$temp3\n\n");
                my $temp_item=$root->createElement("item");
                my $temp_data=$root->createElement("data");
                my $child1=$address->appendChild($temp_item);
                $child1->setAttribute("used","1");
                $child1->setAttribute("order","0");
                my $g=$child1->getAttribute("used");
                my $h=$child1->getAttribute("order");
                print("$g\t$h\n");
                my $child2=$child1->appendChild($temp_data);
                $child2->setAttribute("typeid","4");
                my $k=$child2->getAttribute("typeid");
                print("$k\n");
            }
        }
    }
    $doc->setXMLDecl($doc->createXMLDecl('1.0','UTF-8'));
    $doc->printToFile("C:/perl/perl_tests/xmlin2.xml");
    $doc->dispose;

    XML FILE :

    <config logdir="var/log/foo/" debugfile="tmp/foo.debug">
      <server name ="sahara" osname ="solaris" osversion="2.6">
        <address name="ankit" id="70888"/>
        <address name="tayal" id="70889"/>
      </server>
      <server name="gobi" osname="irix" osversion="6.5">
        <address name="anshul" id="70689"/>
      </server>
      <server name="kalahari" osname="linus" osversion="2.0.34">
        <address name="raghu" id="45678"/>
        <address name="lucky" id="67895"/>
      </server>
    </config>
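    The error message points at the likely cause: in XML::DOM, createElement is a method of the document object, not of an element, and $root is an element. A minimal sketch of the same insertion, calling createElement on $doc instead. It assumes XML::DOM is installed and uses a shortened inline copy of the XML so it can run without the files from the post.

```perl
use strict;
use warnings;
use XML::DOM;

# Shortened stand-in for xmlin.xml (assumption: same structure as the post).
my $xml = '<config><server name="sahara">'
        . '<address name="tayal" id="70889"/>'
        . '</server></config>';

my $doc = XML::DOM::Parser->new->parse($xml);

for my $address ($doc->getElementsByTagName('address')) {
    next unless $address->getAttribute('name') eq 'tayal';

    # New nodes are created by the document, then attached to an element.
    my $item = $doc->createElement('item');
    $item->setAttribute('used',  '1');
    $item->setAttribute('order', '0');

    my $data = $doc->createElement('data');
    $data->setAttribute('typeid', '4');

    $item->appendChild($data);
    $address->appendChild($item);
}

my $out = $doc->toString;
print $out, "\n";
$doc->dispose;
```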
My local libs are not being used by my browser
3 direct replies
by Lady_Aleena
on Sep 28, 2016 at 19:50

    Hello all. I am having a problem with my browser not recognizing my local libs. I set them up because I was told installing modules with sudo cpan and putting them in the main libs was not great. So I set up my local libs and installed all kinds of modules into them, got apache set up so I could view my website in my browser, but the browser does not use my local libs.

    Here is the result of perl -e 'print "$_\n" for sort @INC' on the command line:

    .
    /etc/perl
    /home/me/Documents/fantasy/files/lib
    /home/me/perl5/lib/perl5
    /home/me/perl5/lib/perl5/5.20.0
    /home/me/perl5/lib/perl5/5.20.2
    /home/me/perl5/lib/perl5/5.20.2/x86_64-linux-gnu-thread-multi
    /home/me/perl5/lib/perl5/x86_64-linux-gnu-thread-multi
    /usr/lib/x86_64-linux-gnu/perl/5.20
    /usr/lib/x86_64-linux-gnu/perl5/5.20
    /usr/local/lib/site_perl
    /usr/local/lib/x86_64-linux-gnu/perl/5.20.2
    /usr/local/share/perl/5.20.2
    /usr/share/perl/5.20
    /usr/share/perl5

    Here is the list of libs my browser is looking in:

    files/lib
    /etc/perl
    /usr/local/lib/x86_64-linux-gnu/perl/5.20.2
    /usr/local/share/perl/5.20.2
    /usr/lib/x86_64-linux-gnu/perl5/5.20
    /usr/share/perl5
    /usr/lib/x86_64-linux-gnu/perl/5.20
    /usr/share/perl/5.20
    /usr/local/lib/site_perl

    I ran echo 'eval "$(perl -I$HOME/perl5/lib/perl5 -Mlocal::lib)"' >>~/.bashrc at the command line, and it added the following lines to my .bashrc file:

    PATH="/home/me/perl5/bin${PATH:+:${PATH}}"; export PATH;
    PERL5LIB="/home/me/perl5/lib/perl5${PERL5LIB:+:${PERL5LIB}}"; export PERL5LIB;
    PERL_LOCAL_LIB_ROOT="/home/me/perl5${PERL_LOCAL_LIB_ROOT:+:${PERL_LOCAL_LIB_ROOT}}"; export PERL_LOCAL_LIB_ROOT;
    PERL_MB_OPT="--install_base \"/home/me/perl5\""; export PERL_MB_OPT;
    PERL_MM_OPT="INSTALL_BASE=/home/me/perl5"; export PERL_MM_OPT;
    eval "$(perl -I$HOME/perl5/lib/perl5 -Mlocal::lib)"

    I already have several PATH statements above these lines and a PERL5LIB line, so I do not know if there is a conflict. Here are the lines:

    # My changes to things #
    export LC_ALL=C
    export LC_COLLATE=C
    export LESS=-SXi
    export PERL5LIB="$HOME/Documents/fantasy/files/lib"
    PATH="$PATH:$HOME/bin"
    PATH="$PATH:$HOME/Documents/fantasy"
    PATH="$PATH:$HOME/Documents/scripts"
    export PATH
    setterm --linewrap off

    So can anyone tell me what I am missing here?

    Thank you!

    No matter how hysterical I get, my problems are not time sensitive. So, relax, have a cookie, and a very nice day!
    Lady Aleena
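    A likely explanation, offered as an assumption rather than a diagnosis: scripts started by Apache do not read ~/.bashrc, so the PERL5LIB set there never reaches them and @INC keeps only the system directories. One way around it is to name the local lib inside the script with use lib. The sketch below is a throwaway CGI script that prints @INC so the change can be checked in the browser; the path is the local::lib root shown in the post.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumption: the local::lib root from the post. Adjust to your own path.
# "use lib" runs at compile time and puts the directory at the front of @INC.
use lib '/home/me/perl5/lib/perl5';

print "Content-Type: text/plain\n\n";
print "$_\n" for @INC;
```

    If editing every script is impractical, setting PERL5LIB for the web server itself (for example with Apache's SetEnv directive from mod_env) is the usual alternative.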
MCE -- how to know which function to use
1 direct reply
by 1nickt
on Sep 28, 2016 at 17:57

    Hello all,

    I've read through the docs and made some experiments with basic usage of MCE, but I'm not sure if I'm barking up the wrong tree.

    I'm unclear on:

    • When to use mce_loop() vs. mce_map() vs. MCE::Shared
    • How to know how many workers to set as max

    I have an arrayref of hashrefs, and am outputting an arrayref of hashrefs. Processing each hashref is quite slow: takes about 0.1s. There are 7,500 hashes in the arrayref; that could grow to some tens of thousands.

    The code is running on an Ubuntu AWS instance whose lscpu outputs:

    Architecture:          x86_64
    CPU op-mode(s):        32-bit, 64-bit
    Byte Order:            Little Endian
    CPU(s):                2
    On-line CPU(s) list:   0,1
    Thread(s) per core:    2
    Core(s) per socket:    1
    Socket(s):             1

    The MCE manager is splitting the array into 25 chunks using 'auto'.

    I am seeing almost no difference in the time taken to execute using MCE versus a sequential foreach loop. In fact, the sequential loop appears to be about as fast or faster, which I would not have expected.

    Although the CPU usage looks quite different:

    Benchmark: timing 5 iterations of MCE loop , MCE map , Sequential loop...
    MCE loop : 83 wallclock secs ( 3.28 usr 0.19 sys + 30.94 cusr 24.89 csys = 59.30 CPU) @ 0.08/s (n=5)
    MCE map : 75 wallclock secs ( 4.48 usr 0.28 sys + 41.29 cusr 37.93 csys = 83.98 CPU) @ 0.06/s (n=5)
    Sequential loop: 76 wallclock secs (37.79 usr + 28.91 sys = 66.70 CPU) @ 0.07/s (n=5)

    Am I missing something obvious? Or non-obvious? Doing something wrong? Can anyone shed any light please?

    The way forward always starts with a minimal test.
split large CSV file >9.1MB into files of equal size that can be opened in excel
6 direct replies
by Anonymous Monk
on Sep 28, 2016 at 15:59

    Hi Monks, I have a Perl script that almost does what I want, but not quite. I'm trying to divide a CSV file into multiple files of equal size. What I have divides almost all of the output into files of equal size but always leaves one larger file. I'm guessing it's the first or last file written in the series. If you could have a look at the code below and tell me what I'm doing wrong or have missed, I'd greatly appreciate it. Thanks.

    #!/usr/bin/perl -w
    use strict;
    use warnings;

    #split large files
    mkdir "split_files";
    open (FH, "all_tags2.csv") or die "Could not open source file. $!";
    my $i = 0;
    while (1) {
        my $chunk;
        print "process part $i\n";
        open(OUT, ">split_files/all_tags2$i.csv") or die "Could not open destination file";
        $i ++;
        if (!eof(FH)) {
            read(FH, $chunk, 10000);
            print OUT $chunk;
        }
        if (!eof(FH)) {
            $chunk = <FH>;
            print OUT $chunk;
        }
        close(OUT);
        last if eof(FH);
    }
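    An alternative way to cut the file, sketched under stated assumptions rather than as a fix for the script above: read whole lines and start a new part whenever the next line would push the current part over a byte budget. Parts then never exceed the budget unless a single line is longer than it. The function name, the part-file naming and the budget are illustrative; the header row is not repeated in each part, and quoted fields containing newlines are not handled.

```perl
use strict;
use warnings;
use File::Path qw(make_path);

# Split $src into numbered files under $out_dir. Each part holds whole lines
# and at most $max_bytes bytes; a single line longer than the budget still
# gets a part to itself. Returns the list of files written.
sub split_csv_by_size {
    my ($src, $out_dir, $max_bytes) = @_;
    make_path($out_dir);
    open my $in, '<', $src or die "Could not open $src: $!";

    my (@files, $out);
    my $written = 0;
    while (my $line = <$in>) {
        # Open a new part if none is open yet, or if a non-empty part
        # would overflow the budget with this line.
        if (!$out || ($written > 0 && $written + length($line) > $max_bytes)) {
            close $out if $out;
            my $name = sprintf '%s/part%03d.csv', $out_dir, scalar @files;
            open $out, '>', $name or die "Could not open $name: $!";
            push @files, $name;
            $written = 0;
        }
        print {$out} $line;
        $written += length $line;
    }
    close $out if $out;
    close $in;
    return @files;
}

# Usage with the file name from the post, skipped when the file is absent.
if (-e 'all_tags2.csv') {
    my @parts = split_csv_by_size('all_tags2.csv', 'split_files', 10_000);
    print scalar(@parts), " parts written\n";
}
```

    The last part is usually smaller than the others, which is unavoidable unless the total size happens to divide evenly.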
Perl6::Export::Attrs and Perl 5.24
1 direct reply
by mdemoulin
on Sep 28, 2016 at 13:19

    Are there any known issues with Perl6::Export::Attrs and Perl 5.24? I installed Perl 5.24, and functions that used to be exported with no problem now produce the error message "My::Lib does not export: foo" when encountering the line use My::Lib qw{foo};

    Thanks for the help!

Javascript variables access help with WWW::Mechanize::Firefox
1 direct reply
on Sep 28, 2016 at 12:59

    Hello Monks, I'm trying to extract text data from a web page that contains a lot of JavaScript code. I'm able to navigate to the page, but once I'm there I can't get the information, since it is held in JavaScript runtime variables (should those be called DOM? I'm pretty confused). I identified the text I want through Firebug, in the DOM panel. The object that holds it looks like an array called Diary. I'm not able to access it from Perl using the eval() or eval_in_page() methods. I tried this piece of code:

    my ($contest, $type) = $mech2->eval_in_page( 'Diary' ) or warn "$!";
    print Dumper \$contest;
    print Dumper \$type;

    Resulting in:

    MozRepl::RemoteObject: ReferenceError: Diary is not defined at ./ line 144.

    Of course the content() and text() methods return only empty textareas. I'm looking for good suggestions. If possible, I would like to inject JS code to dump every variable that is readable in the current page context. I'm afraid that Diary is not readable or is out of scope. Is there a way to do this? Thanks for any help or suggestions.

Perl: How to convert db2 utf8 czech special character to latex format
2 direct replies
by zimso
on Sep 28, 2016 at 10:57

    Please, maybe someone can help me:

    - I have a DB2 database which is in codeset UTF-8 and codepage 1208

    - I have a field "lastname" which has some names that contain East European special characters (š and á, i.e. s with caron and a with acute)

    - my environment on the shell is LANG=de_DE.utf8

    - I read the DB field with DBI module into Perl and want to convert the name to latex format for printing, but it doesn't work for the "s with caron"

    á (a with acute) (Unicode number: U+00E1, HTML code: á) ----> wanted goal: LaTeX format \'{a}

    š (s with caron) (Unicode number: U+0161, HTML code: š) ----> wanted goal: LaTeX format \v{s}

    I don't manage to convert the "s" character:

    $tmp has 'áš'

    I try a print TeX::Encode::encode('latex',$tmp); It gives : \'a?

    \'a is correct, "?" for the "š" is not

    When I directly save the field to a file and look with a hexeditor on it it says: "e1 1a"

    e1 is correct "1a" isn't (according to latin-2 it should be "b9")

    Hmmm... Can someone please help me to manage to bring these east european names from a utf8 db to a universal latex format for printing ?

    Many Many Thanks !!!!!!
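    For the conversion step on its own, here is a minimal sketch that does not use TeX::Encode: it decomposes each character with the core module Unicode::Normalize and maps the combining mark to a LaTeX accent command. It assumes the name arrives in Perl as a correctly decoded character string; the hex dump in the post ("e1 1a") suggests the š may already be lost before Perl sees it, in which case the database client encoding needs attention first. The mapping table is an illustrative subset and to_latex is a hypothetical helper name.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFD NFC);

# Combining mark => LaTeX accent command (illustrative subset; extend as needed).
my %ACCENT = (
    "\x{0301}" => "\\'",    # acute
    "\x{030C}" => "\\v",    # caron
    "\x{0300}" => "\\`",    # grave
    "\x{0308}" => "\\\"",   # diaeresis
);

# Turn one base letter plus one combining mark into \cmd{letter},
# or leave the pair alone when the mark is not in the table.
sub _accented {
    my ($base, $mark) = @_;
    return exists $ACCENT{$mark}
        ? $ACCENT{$mark} . '{' . $base . '}'
        : $base . $mark;
}

sub to_latex {
    my ($text) = @_;                  # must be decoded characters, not bytes
    my $decomposed = NFD($text);      # e.g. U+0161 becomes "s" + U+030C
    $decomposed =~ s/(\p{L})(\p{Mn})/_accented($1, $2)/ge;
    return NFC($decomposed);          # recompose anything left unmapped
}

print to_latex("\x{E1}\x{161}"), "\n";   # prints \'{a}\v{s}
```

    Letters carrying more than one combining mark are only partly handled by this pattern.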


Finding Nearly Identical Sets
6 direct replies
by Limbic~Region
on Sep 28, 2016 at 10:51
    This isn't so much a perl question as an algorithm question but the solution will be coded in perl (at least partially).

    The program will be processing millions of messages that contain a set of 9 digits where each digit may have a value of 1-9. For instance, it may contain (1, 7, 3, 3, 9, 5, 6, 1, 2). It is critical that all messages that contain the same set need to be stored together.

    Unfortunately, the set may contain an error (think typo). If a set doesn't match any previously seen sets exactly, the program needs to determine if this should be a new set or if it belongs to a previous set. Assume that I have a perfect way of doing this if I have two messages side by side, what I am looking for is a fast/cheap way to identify candidate sets.

    In other words, I can't compare each set to every previous set. What I want to be able to do is very quickly/cheaply identify some candidates that are worth the time to do an expensive message to message compare. Here are the types of things I want to allow for:

    • Any change to the order (1,2,3,4,5,6,7,8,9 instead of 9,8,7,6,5,4,3,2,1) AND/OR 1 of the following:
    • Exactly 1 insertion (10 digits instead of 9) OR
    • Exactly 1 deletion (8 digits instead of 9) OR
    • Exactly 1 transformation (7 instead of a 5)

    If I were just going with the first one, it would be simple. Sort/concatenate the list and perform a hash lookup.

    I am sick with a pretty bad head cold, so I am going to assume I have done a poor job of explaining. I apologize in advance. Here are some of the ideas I have come up with that I don't think will work:

    • Using a range of the product of the set
    • Using a range for the average and deviation of the set
    • Using a range for the "distance" to another 3rd hardcoded set

    Fast and cheap but without too many false positives - ideas?

    Cheers - L~R
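    One cheap candidate filter, sketched under the assumption that each stored set is a correct 9-digit set: index every stored set under its sorted digit string and under every string obtained by deleting one digit from it. A new message is looked up under its own sorted string and its own delete-one strings. Any two sets that differ by a reordering plus at most one insertion, deletion or substitution share at least one key, so each lookup costs about ten hash probes. The names keys_for, add_set and candidates are hypothetical.

```perl
use strict;
use warnings;

my %index;   # key string => { set id => 1 }

# Sorted digit string plus every variant with exactly one digit removed.
# Digits are 1-9, so joining without a separator is unambiguous.
sub keys_for {
    my @digits = sort { $a <=> $b } @_;
    my %keys = (join('', @digits) => 1);
    for my $i (0 .. $#digits) {
        my @copy = @digits;
        splice @copy, $i, 1;
        $keys{ join '', @copy } = 1;
    }
    return keys %keys;
}

# Register a known-good set under all of its keys.
sub add_set {
    my ($id, @digits) = @_;
    $index{$_}{$id} = 1 for keys_for(@digits);
    return;
}

# Ids of stored sets worth the expensive message-to-message comparison.
sub candidates {
    my %seen;
    for my $key (keys_for(@_)) {
        $seen{$_} = 1 for keys %{ $index{$key} || {} };
    }
    return sort keys %seen;
}

add_set('A', 1, 7, 3, 3, 9, 5, 6, 1, 2);
print join(',', candidates(2, 1, 6, 5, 9, 3, 3, 8, 1)), "\n";   # 7 mistyped as 8, reordered
```

    The index holds up to ten keys per stored set, so memory grows with the number of distinct sets rather than with the number of messages.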
