Beefy Boxes and Bandwidth Generously Provided by pair Networks
P is for Practical
 
PerlMonks  

The Monastery Gates

( #131=superdoc: print w/replies, xml ) Need Help??

Donations gladly accepted

If you're new here please read PerlMonks FAQ
and Create a new user.

New Questions
Overloading Weirdness
No replies — Read more | Post response
by pudge
on Jun 23, 2018 at 01:40
    Long story short: we want to mark strings so that later we can do something with them, even if they get embedded in other strings. So we figured, hey, let's try overloading. It is pretty neat. I can do something like:
    my $str = str::new('<encode this later>'); my $html = "<html>$str</html>"; print $html; # <html><encode this later></html> print $html->encode; # <html>&lt;encode this later&gt;</html>
    It does this by overloading the concatenation operator to make a new object array with the plain string "<html>", the object wrapping "<encode this later>", and the plain string "</html>". It can nest these arbitrarily. On encode, it will leave the plain strings, but encode the object strings. But if you stringify the object, it just spits it all out as plain strings. This works well, except that in some cases, it stringifies for no apparent reason. The script below shows the behavior, which I've duplicated in 5.10 through 5.22.
    #!/usr/bin/perl use strict; use warnings; use 5.010; use Data::Dumper; $Data::Dumper::Sortkeys=1; my $str1 = str::new('foo'); my $str2 = str::new('bar'); my $good1 = "$str1 $str2"; my $good2; $good2 = $good1; my($good3, $good4); $good3 = "$str1 a"; $good4 = "a $str1"; my($bad1, $bad2, $bad3); $bad1 = "a $str1 a"; $bad2 = "$str1 $str2"; $bad3 = "a $str1 a $str2 a"; say Dumper { GOOD => [$good1, $good2, $good3], BAD => [$bad1, $bad2, $ +bad3] }; $bad1 = ''."a $str1 a"; $bad2 = ''."$str1 $str2"; $bad3 = ''."a $str1 a $str2 a"; say Dumper { BAD_GOOD => [$bad1, $bad2, $bad3] }; package str; use Data::Dumper; $Data::Dumper::Sortkeys=1; use strict; use warnings; use 5.010; use Scalar::Util 'reftype'; use overload ( '""' => \&stringify, '.' => \&concat, ); sub new { my($value) = @_; bless((ref $value ? $value : \$value), __PACKAGE__); } sub stringify { my($str) = @_; #say Dumper { stringify => \@_ }; if (reftype($str) eq 'ARRAY') { return join '', @$str; } else { $$str; } } sub concat { my($s1, $s2, $inverted) = @_; #say Dumper { concat => \@_ }; return new( $inverted ? [$s2, $s1] : [$s1, $s2] ); } 1;
    I want all of these to be dumped as objects, not strings. But the "BAD" examples are all stringified. All of the "BAD" examples are when I'm assigning a string object I am concatenating at the moment to a variable previously declared. If I declare at the same time, or concatenate the strings previously, or add in an extra concatenation (beyond the interpolated string concat), then it works fine. This is nuts. The result of the script:
    $VAR1 = { 'BAD' => [ 'a foo a', 'foo bar', 'a foo a bar a' ], 'GOOD' => [ bless( [ bless( [ bless( do{\(my $o = 'foo')}, ' +str' ), ' ' ], 'str' ), bless( do{\(my $o = 'bar')}, 'str' ) ], 'str' ), $VAR1->{'GOOD'}[0], bless( [ $VAR1->{'GOOD'}[0][0][0], ' a' ], 'str' ) ] }; $VAR1 = { 'BAD_GOOD' => [ bless( [ '', bless( [ bless( [ 'a ', bless( do{\(my $o + = 'foo')}, 'str' ) ], 'str' ), ' a' ], 'str' ) ], 'str' ), bless( [ '', bless( [ bless( [ $VAR1->{'BAD_GOOD +'}[0][1][0][1], ' ' ], 'str' ), bless( do{\(my $o = 'bar') +}, 'str' ) ], 'str' ) ], 'str' ), bless( [ '', bless( [ bless( [ bless( [ bless( [ + 'a ', + $VAR1->{'BAD_GOOD'}[0][1][0][1] ] +, 'str' ), ' a ' ], 'str' ) +, $VAR1->{'BAD_GOOD +'}[1][1][1] ], 'str' ), ' a' ], 'str' ) ], 'str' ) ] };
    The behavior makes no sense to me. I'd like to understand why it works this way, and I'd like to find a workaround.
Using 'no warnings' to disable own debug warn(ing)s?
1 direct reply — Read more / Contribute
by LanX
on Jun 22, 2018 at 13:13
    According to warnings and perldiag it's possible to disable special warning classes in a lexical scope. (like the uninitialized one in the example)

    Is it also possible to use this mechanism to disable my own debug messages for a scope?

    (I'm aware about the possibility to manipulate $SIG{"__WARN__"} handler, but trying not to reinvent the wheel.)

    use strict; use warnings; my $x; { no warnings; # no warnings 'uninitialized'; print "$x"; warn "inside"; } print "$x"; warn "outside";
    inside at d:/Users/lanx/pm/no_warn.pl line 9. Use of uninitialized value $x in string at d:/Users/lanx/pm/no_warn.pl + line 11. outside at d:/Users/lanx/pm/no_warn.pl line 12.

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery

    ) and overusing Pad::Walker

Any downsides to this slurp idiom?
8 direct replies — Read more / Contribute
by haukex
on Jun 22, 2018 at 07:33

    I sometimes use (and suggest) this to slurp a file in a single line:

    my $data = do { open my $fh, '<', $file or die $!; local $/; <$fh> };

    The lexical filehandle should be automatically closed at the end of the block. Can anyone think of any downsides to the above? Like perhaps some subtle scoping issue, or it's bad style to rely on the implicit close, etc. ... or am I just being paranoid?

    Update: Typo fix, $/ not $\, thanks Corion!

Perl script that will read two pdb files with different line numbers and will replace the chain letter from the first to the second file
3 direct replies — Read more / Contribute
by Nastazia
on Jun 22, 2018 at 05:57

    Hello everyone, I am trying to write a script in perl which will do the following

    it will read a pdb file that contains only Ca atoms as the following

    1 2 3 4 5 6 ATOM 1 CA PRO A 889 84.370 72.820 26.830 1.00 0.00 + ATOM 2 CA THR A 890 87.370 73.900 28.080 1.00 0.00 + ATOM 3 CA VAL A 891 90.920 72.490 27.750 1.00 0.00 + ATOM 4 CA PHE A 892 93.640 74.890 28.970 1.00 0.00 + ATOM 5 CA HIS B 893 97.060 74.200 27.360 1.00 0.00 + ATOM 6 CA LYS B 894 99.880 73.920 29.990 1.00 0.00

    it will read a second pdb that contains every atom

    1 2 3 4 5 6 ATOM 1 N PRO A 889 16.220 12.185 1.804 1.00 71.54 + N ATOM 2 CA PRO A 889 16.101 12.990 3.034 1.00 70.89 + C ATOM 3 C PRO A 889 15.432 14.346 2.803 1.00 72.31 + C ATOM 4 O PRO A 889 14.743 14.852 3.703 1.00 72.20 + O ATOM 5 CB PRO A 889 17.553 13.151 3.502 1.00 72.96 + C ATOM 6 CG PRO A 889 18.315 12.067 2.782 1.00 78.00 + C ATOM 7 CD PRO A 889 17.626 11.907 1.465 1.00 73.35 + C

    (The files refer to the same molecule but have different number of lines)

    So if the residue number (column num 5) is the same it will take the chain letter (column num 4) from the first file and replace all the chain letters that have the same residue number in the second file. So far i've got this disaster :/

    print "\nEnter the network pdb file file: "; $inputFile = <STDIN>; chomp $inputFile; unless (open(INPUTFILE, $inputFile)) { print "Cannot read from '$inputFile'"; <STDIN>; exit; } # load the file into an array chomp(@networkpdb = <INPUTFILE>); # close the file close(INPUTFILE); print "\nEnter the pdb output file: "; $inputFile2 = <STDIN>; chomp $inputFile2; unless (open(INPUTFILE, $inputFile2)) { print "Cannot read from '$inputFile2'"; <STDIN>; exit; } chomp(@pdb = <INPUTFILE>); close(INPUTFILE); for ($line1 = 0; $line1 < scalar @networkpdb; $line1++) { if ($networkpdb[$line1] =~ m/ATOM\s+\d+\s+\w+\s+\w{3}\s*(\w+)\s*(\ +d*)\s+\S+\.\S+\s+\S+\.\S+\s+\S+\.\S+\s+.+\..+\..*/ig) { my $resnum=$2; my $chain=$1; for ($line = 0; $line < scalar @pdb; $line++) { if ($pdb[$line]=~ m/(ATOM\s+\d+\s+\w+\s+\w{3}\s*)(\w+)\s*(\d*)(\s ++\S+\.\S+\s+\S+\.\S+\s+\S+\.\S+\s+.+\..+\..*)/ig) { my $begining=$1; my $resnum1=$3; my $chain1=$2; my $end=$4; if ($resnum1=$resnum) {$chain1=$chain; $parsedData{$line} = $begining.$chain1."\s".$resnum1.$end; }}}}} # create the output file name $outputFile = "WithNetwork_".$inputFile; # open the output file open (OUTFILE, ">$outputFile"); # print the data lines foreach $line (sort {$a <=> $b} keys %parsedData) { print OUTFILE $parsedData{$line}."\n"; } # close the output file close (OUTFILE);

    thank you very much in advance

Will 'when()' be removed or deprecated in upcoming version of Perl?
3 direct replies — Read more / Contribute
by Darkwing
on Jun 22, 2018 at 02:57
    Every now and then i use something this in my code:
    for ($foo) { when (/blah/) {...} when (/xyz/) {...} ... }
    Now this page Beware of the removal of when in Perl v5.28 tells that when() will be removed in perl 5.28, meaning that i would need to change such code. But on this page perldeprecation there is nothing written about removing when() or marking it deprecated. Now, what is correct about the when()?
How to convert the NCBI Gene ID to GenBank ID?
2 direct replies — Read more / Contribute
by supriyoch_2008
on Jun 22, 2018 at 02:50

    Hi Perl Monks,

    I am interested in converting the Gene ID of NCBI to GenBank ID. In NCBI Gene database, when I write 7157 as Gene ID in search box, the page opens with the heading "TP53 tumor protein p53 Homo sapiens (human)". Almost at the bottom of that page the sub-heading appears as "mRNA and Protein(s)" which shows the GenBank ID as "NM_000546.5" (first entry) with a hyperlink. When clicked, the GenBank page opens up and shows the details. This is a cumbersome process when one has to get the GenBank ID of many genes. I searched in the web for a perl script which can convert Gene ID to GenBank ID using internet directly. But I did not get such a script. However, the link http://biodb.jp/ can perform this task of conversion in a very lengthy procedure. Then, I tried to get the sequence of Gene ID 7157 using a script:

    Here goes the script for sequence:

    #!/usr/bin/perl use warnings; use strict; use Bio::DB::GenBank; use Bio::SeqIO; use Text::Wrap; my $gb= new Bio::DB::GenBank; my $id='7157'; my $seq = $gb->get_Seq_by_gi($id); print "\n seq: $seq\n"; exit;

    But I got the wrong result and not the sequence in cmd as follows: Here goes the result in cmd:

    C:\Users\x>cd d* C:\Users\x\Desktop>g2.pl seq: Bio::Seq::RichSeq=HASH(0x780b234) C:\Users\x\Desktop>

    I need suggestions from PerlMonks to solve this problem of ID conversion so that I can get the results of Gene IDs: 7157, 7422 as follows in cmd:

    I expect results in the following format:

    GenBank ID NM_000546.5 NM_001025366.2
PERL-CGI on shared unix server
4 direct replies — Read more / Contribute
by rahu_6697
on Jun 22, 2018 at 01:43

    Hello to all! I am new to perl-cgi, just trying to execute my first program and stored it as mytest.pl in my home directory. I am working on a shared unix based server so I don't have access to var/www/cgi-bin folder of my directory and there is not any webserver package installed. So how should i run this type of files. Please provide step by step procedure for running this file such that on browser Hello World! should display.

    #!/usr/local/bin/perl print "Content-Type: text/html\n\n"; print "<html> <head>\n"; print "<title>Hello, world!</title>"; print "</head>\n"; print "<body>\n"; print "<h1>Hello, world!</h1>\n"; print "</body> </html>\n";
How to print data between tags from file sequentially?
4 direct replies — Read more / Contribute
by TonyNY
on Jun 21, 2018 at 23:15

    Hi, I have a text file that contains the follwoing that I want to print one line at a time. is it possible to print only the data between > < sequentiallly?

    <Answer type="string">ServerName</Answer> <Answer type="string">10.10.10.11</Answer> <Answer type="string">Windows Server 2012</Answer>

    output would be something like:

    Commputer: ServerName IP Address: 10.10.10.11 OS: Windows Server 2012

    This is what I have so far but it only prints out the entire lines at the same time:

    #!/usr/bin/perl # open file open(FILE, "data.txt") or die("Unable to open file"); # read file into an array @data = <FILE>; # close file close(FILE); # print file contents foreach $line (@data) { print $line; }

    Thanks

dealing with cyrillic characters
2 direct replies — Read more / Contribute
by Datz_cozee75
on Jun 21, 2018 at 19:55

    I'm having some issues with rendering the russian captions on my personal website, where I use a perl templating system to populate the content and get everything loaded to the web. Its nominal form is an image, followed by english captions and then russian captions. What I had was working alright, if you want to mark yourself as a hobby coder. The russian captions aren't fitting properly within their html boundary, as they are not getting treated the way english ones do, like this: testimonial text Furthermore, the russian ones don't render as paragraphs.

    The most relevant sections of code that did this are here, within readmore tags. One can contrast how this worked for english versus russian. In the english ones, I put them through the text2html function of HTML::FromText, which preserves urls, e-mails, and paragraphs. In this version, I don't make such a call in the russian caption-reading function. Please do not read if code makes you grumpy. I did most of this coding as a was studying references in perl as an intermediate. I wouldn't say that I've progressed any in the meantime. Any suggestions to improve code are gladly accepted.

    So, future friar me says, "run the russian text through text2html, and see what you get." With a little more russian text added to the headline to show how it doesn't render and the print_script function enabled, This html page shows how the russian goes when it goes wonky for me. It's always a matter of seeing these characters show like this: мой оп&#139;&#130;, сила и надежда The same characters show up when I try to use a hex editor such as okteta to manipulate these texts. I don't seem to get any meaningful conversion to happen, and I'm left with a sea of these deformed D-creatures. Here is the code for this latest attempt:

    My question is how do I get the formatting for the russian characters without having them turn into the D-creatures? What must be happening every time I see an encoding that makes no sense as in the headline or in okteta when I can readily read the cyrillic source text?

    Thanks for your comment.

    2018-06-22 Athanasius moved readmore tags outside of code tags

Best way to store/access large dataset?
5 direct replies — Read more / Contribute
by Speed_Freak
on Jun 21, 2018 at 18:25

    Hoping not to get flogged here, but I wanted to post the question tonight so maybe I would have a starting point when I come back in tomorrow.
    I'm an absolute newb when it comes to programming, but I think PERL will be good for what I'm trying to do...and I have the books, so I'm hoping to fumble through this.

    I'm trying to load in data from two files. One file has the category ID for each item I am interested in. (eg. item.1.ext = square, item.2.ext = circle, etc.)
    The second file contains all the attributes for these items. Each attribute has a binary yes/no represented by 1 or 0. My files can have a couple hundred items, with a million attributes for each item.

    What I am looking to do is find a good way to process through the attributes by category and score them. I was thinking that I would read in the files and attempt to create a count for each group. And then use the number of times the attribute was present in a category set over the number of items in that category to create a series of scoring criteria.(Like which attributes occur in each category more than 75% of the time, but less than 25% of the time in any other category. Basically looking for category unique attributes.)

    But as I've learned with PERL, there are 7000 different ways to skin a cat, so I'm up for any suggestions. I'm trying to make this a fairly quick process because it will be repeated OFTEN.(Datasets will be ~200 items, 4-10 categories, and 1 million attributes.)

    data example in readmore.

Using a thread to provide visual feedback.
3 direct replies — Read more / Contribute
by Anonymous Monk
on Jun 21, 2018 at 10:54

    I have been working on a command line script that makes several system calls, some of which take a few seconds (minutes?) to return. The long pauses give the impression that the script has hung, which isn't great. I figured I could address this by adding some kind of visual "still processing" feedback, like an hourglass or pinwheel. Because it's just a command line script, I do it with dots.
    My code is below. I would appreciate any feedback or suggestions for improvement. Thank you.

Extracting text from MS Word files on a Linux box
3 direct replies — Read more / Contribute
by Laurent_R
on Jun 21, 2018 at 06:30
    Dear fellow monks,

    I need to extract some textual information from a large bunch of MS Word files, mostly at the .doc format (i.e. Office 2003 version or before, if I remember correctly); there also may be some .docx files but let's concentrate on the .doc format for now.

    This is quite new to me, I've never had before to do that, and I am usually not working on Windows platform and Microsoft Office document formats, except for a few cases with Excel documents. So I am not sure how to handle this requirement.

    I also found that there seem to be surprisingly few CPAN solutions for extracting data from MS Word documents (especially in the pre-2003 format).

    The Text::Extract::Word module may be doing what I need, but it has a dependency on the OLE::Storage_Lite module (and perhaps some others). My understanding is the OLE::... domain modules will only work on Windows boxes (because they are presumably using Windows and/or MS Office libraries). Is this a correct assumption? If such is the case, then it would be a problem for us, because, ultimately, the whole much larger process should run on a Linux server.

    Does any monk here have some experience on this type of needs, i.e., to summarize, how to extract textual data from MS Word files (pre-2003 format) on a Linux box, where OLE modules would presumably not run? Any suggestions? Any other module that I overlooked? Thank you very much for any help in this respect.

New Cool Uses for Perl
Why is it so easy to make Perl apps for Apple Mac?
2 direct replies — Read more / Contribute
by Anonymous Monk
on Jun 17, 2018 at 13:40
    Compile your Perl to a portable binary application for Apple Mac (99k):
    echo 'display alert (do shell script "perl -v")' | osacompile -o perl. +app
    Write apps for Apple Mac in Perl: Pt.1, Pt.2, Pt.3
New Perl Poetry
Sex !
1 direct reply — Read more / Contribute
by usemodperl
on Jun 21, 2018 at 13:06
    perl -v|perl -pe's\(?<=r)[l]\x\g;s\(?<=pe)r\\gi;s\p(e)\S$1\gi'
    perl -MO=Deparse -e 'E S I X G S O O D'
Log In?
Username:
Password:

What's my password?
Create A New User
Chatterbox?
and all is quiet...

How do I use this? | Other CB clients
Other Users?
Others drinking their drinks and smoking their pipes about the Monastery: (7)
As of 2018-06-23 06:34 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?
    Should cpanminus be part of the standard Perl release?



    Results (125 votes). Check out past polls.

    Notices?