PerlMonks  

The Monastery Gates


New Questions
search and replace if pattern found in file
2 direct replies — Read more / Contribute
by myfrndjk
on Jul 31, 2014 at 09:57

    Hi, I am here to seek your help. I want to search for and replace a particular line in a text file. The issue is that my text file contains "|" between each sentence. When I try to replace the line, the replacement ends up printed 3 times instead of once. Thanks.

    use strict;
    use warnings;

    my $old = "welcome|to|chennai";
    my $new = "Capital of karnataka|is|bangalore";
    my $base = ();
    my @base = ();
    my $file = "c:/users/jeyakuma/desktop/search.txt";

    open(BASE, $file) || die("Could not open file!");
    @base = <BASE>;
    close (BASE);

    foreach $base (@base) {
        if ($base =~ /$old/) {
            $base =~ s/$old/$new/gi;
            print ("Replaced!\n");
        }
        open (BASE, ">$file");
        print BASE @base;
        close (BASE);
    };

    The input file contains:

    welcome|to|chennai

    I want to replace that with:

    Capital of karnataka|is|bangalore

    Expected output:

    Capital of karnataka|is|bangalore

    Current result:

    Capital of karnataka|is|bangalore|Capital of karnataka|is|bangalore|Capital of karnataka|is|bangalore
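
    One likely reading of the result above: inside a regex, "|" means alternation, so /$old/ matches "welcome", "to" and "chennai" separately, and s///g replaces each of the three. The following is a minimal sketch, assuming a literal match is what is wanted. The helper name replace_literal is hypothetical; \Q...\E is Perl's standard way to escape metacharacters in an interpolated pattern.

```perl
use strict;
use warnings;

# Hypothetical helper: replace every literal occurrence of $old in $line.
# \Q...\E escapes regex metacharacters such as "|", so the pattern is
# matched as plain text rather than as three alternatives.
sub replace_literal {
    my ($line, $old, $new) = @_;
    $line =~ s/\Q$old\E/$new/gi;
    return $line;
}

my $old = "welcome|to|chennai";
my $new = "Capital of karnataka|is|bangalore";

print replace_literal("welcome|to|chennai\n", $old, $new);
```

    In the posted script the file is also rewritten on every pass through the loop; writing it once after the loop has finished would probably be enough.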
String processing
2 direct replies — Read more / Contribute
by dovah
on Jul 31, 2014 at 07:05
    Hi! I'd like to ask for help because my code isn't behaving correctly (and I can't figure out why). So, here's my issue. I have to deduce positions in a "simple string" given positions in a "complete string". Here's a minimal example of my input file:

    #name  complete(cs)    len(cs)  simple(ss)  len(ss)  pos(cs)
    NAME1  A0AAA000AAA00A  14       AAAAAAAA    8        4,6
    NAME2  AAAA0AA00000A   13       AAAAAAA     7        7

    Here's my code:

    $ perl -anle '
        print "$_ position(cs)" and next if /^#/;
        printf "%s",$_;
        for $pos_ss (split ",",$F[5]) {
            $char = substr($F[1],$pos_ss-1,1);
            @cs = split //,$F[3];
            @cs_idx = grep {$cs[$_] eq $char} 0..$#cs;
            push @res,++$cs_idx[$pos_ss-1];
        }
        printf "%14s\n", join ",",@res;
        @res=();
    ' file

    And here's my expected output:

    NAME1  A0AAA000AAA00A  14  AAAAAAAA  8  4,9  3,5
    NAME2  AAAA0AA00000A   13  AAAAAAA   7  7    6

    In the provided example, the 4th character of the complete string (cs), which is the 3rd "A", corresponds to the 3rd character of the simple string (ss), and so on. Could you please help me format and review my code? Thanks in advance for your precious help!
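
    The mapping described above can be sketched by counting the non-"0" characters up to the given position. This assumes that "0" marks the characters missing from the simple string, which the post does not state outright. The helper name cs_to_ss is hypothetical, and the positions 4,9 come from the expected output (the sample input lists 4,6).

```perl
use strict;
use warnings;

# Hypothetical helper: map a 1-based position in the complete string to the
# matching 1-based position in the simple string. Assumes "0" marks the
# characters that are absent from the simple string.
sub cs_to_ss {
    my ($cs, $pos) = @_;
    return if $pos < 1 || $pos > length $cs;
    return if substr($cs, $pos - 1, 1) eq '0';   # position falls on a gap
    my $prefix = substr($cs, 0, $pos);
    return ($prefix =~ tr/0//c);                 # count non-"0" characters up to $pos
}

for my $row (["A0AAA000AAA00A", "4,9"], ["AAAA0AA00000A", "7"]) {
    my ($cs, $positions) = @$row;
    my @ss = map { cs_to_ss($cs, $_) // 'NA' } split /,/, $positions;
    print join(",", @ss), "\n";
}
```

    A position that lands on a "0" has no counterpart in the simple string, so the sketch reports it as NA rather than guessing.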
printing the entire row with particular given last column criteria
3 direct replies — Read more / Contribute
by deepakshyl
on Jul 31, 2014 at 01:31
    INPUT
    orf00007					PHAGE_Prochl_MED4_213_NC_020845-gi|472340344|ref|YP_007673870.1|			  7665 8618	0.210897481636936
    orf00007					PHAGE_Prochl_MED4_213_NC_020845-gi|472340344|ref|YP_007673870.1|			  7665 8618	0.210897481636936
    orf00007					PHAGE_Prochl_P_HM2_NC_015284-gi|326783200|ref|YP_004323597.1|			  7665 8618	0.207761175236097
    orf00015					PHAGE_Megavi_lba_NC_020232-gi|448825467|ref|YP_007418398.1|			  11594 13510	0.278721920668058
    orf00015					PHAGE_Acanth_moumouvirus_NC_020104-gi|441432357|ref|YP_007354399.1|			  11594 13510	0.278721920668058
    
    THE SCRIPT I HAD IMPLEMENTED

    use feature qw/say/;
    use Math::Trig;

    open FILE, "out02.txt";
    my @file = <FILE>;
    close FILE;

    my $aa = 0;
    for (my $i = $aa; $i <= 17822; $i++) {
        if (($file[$i] >= 0.210)) {
            open (OUTFILE, '>>out_t10-t10.txt');
            print OUTFILE $file[$i];
        }
        else {}
    }

    NOTE:
    1) I need to use the last column (the float value, e.g. 0.210897481636936) as the criterion for printing the entire row.
    2) For example, if the user input value is '0.210', we have to print the rows whose last column is >= that value. The expected output is:

    OUTPUT
    orf00007 PHAGE_Prochl_MED4_213_NC_020845-gi|472340344|ref|YP_007673870.1| 7665 8618 0.210897481636936
    orf00007 PHAGE_Prochl_MED4_213_NC_020845-gi|472340344|ref|YP_007673870.1| 7665 8618 0.210897481636936
    orf00015 PHAGE_Megavi_lba_NC_020232-gi|448825467|ref|YP_007418398.1| 11594 13510 0.278721920668058
    orf00015 PHAGE_Acanth_moumouvirus_NC_020104-gi|441432357|ref|YP_007354399.1| 11594 13510 0.278721920668058
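
    The posted script compares the whole line with 0.210, and a line starting with "orf..." is treated as 0 in numeric context. A minimal sketch of the comparison described in the notes, assuming the fields are separated by whitespace and the value of interest is always the last field; the helper name rows_at_or_above is hypothetical and the sample rows are taken from the input above.

```perl
use strict;
use warnings;

# Hypothetical helper: return the lines whose last whitespace-separated
# field is numerically >= $threshold.
sub rows_at_or_above {
    my ($threshold, @lines) = @_;
    return grep {
        my @fields = split ' ', $_;
        @fields && $fields[-1] >= $threshold;
    } @lines;
}

my @sample = (
    "orf00007\tPHAGE_Prochl_MED4_213_NC_020845-gi|472340344|ref|YP_007673870.1|\t7665 8618\t0.210897481636936\n",
    "orf00007\tPHAGE_Prochl_P_HM2_NC_015284-gi|326783200|ref|YP_004323597.1|\t7665 8618\t0.207761175236097\n",
    "orf00015\tPHAGE_Megavi_lba_NC_020232-gi|448825467|ref|YP_007418398.1|\t11594 13510\t0.278721920668058\n",
);

# Prints the first and third sample rows; the second is below the threshold.
print rows_at_or_above(0.210, @sample);
```

    Reading the file line by line with a while loop, rather than indexing up to a fixed 17822, would also avoid depending on the exact number of rows.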
XML::Twig and file encoding
1 direct reply — Read more / Contribute
by slugger415
on Jul 30, 2014 at 13:33

    Hello, another XML::Twig question. My original files are UTF8 encoded, but after I run my script they are ANSI encoded. Special characters such as German small sharp s ( ß when encoded) still appear in the output, but don't display properly in my XML editor.

    I've set keep_encoding, which does save the file in UTF8, but produces some odd character strings and spacing.

    Any thoughts on this? thanks as always, Scott

How can I create MS Word 2013 documents using Perl?
3 direct replies — Read more / Contribute
by shajiindia
on Jul 30, 2014 at 03:54
    Dear Monks,

    I am using Microsoft Word 2013 and I tried to download the module "Win32::Word::Writer" but can't get things right.

    Is there any Perl module to create Word 2013 documents which I can make use of?

    I am using Strawberry Perl Version 5.20 on Windows 8.1

    Please help.
Direferencing problem
6 direct replies — Read more / Contribute
by David92
on Jul 30, 2014 at 03:20
    Hey Monks,

    I've got a problem with my Perl program, more specifically with dereferencing. I think I got confused, because first I have to dereference it into a hash, then it says it contains array elements, then hash elements again. But let me ask what you think.

    $artfList = $ctf->PlanningAp->getArtifactListInPLanningFolder($session,$parent,$filters,$recursive);
    %artfList = %$artfList;
    foreach $element (keys %artfList){
        print "$element\n";            # output is: dataRows
        print "$artfList{dataRows}\n"; # output is: ARRAY(0x2139d)
    }
    @Array = @{$artfList{dataRows}};
    foreach $element (@Array){
        print "$element\n";            # output is: ArtifactsInPlanningFolderSoapRow=HASH(0x3a3acf0)
    }

    And here is where I got stuck. What I notice is that there's a HASH reference again. How do I dereference that array element back to a hash?

    This procedure should return an artifact list (artf000, artf001, etc.). The procedure is taken from online notes, and that's how the return value is described.

    If you find syntax errors in the above code, they are NOT in the real program: I retyped the code rather than copy-pasting it!

    DUMPER OUTPUT:

    There is a lot of information from the database that I am trying to access; it's too much to copy, and some of it is company information I'd rather not display. But I don't know how to access the elements. I'll try to retype the structure:

    bless( {
        'priority' => '1',
        'id'       => 'artf0000',
        'category' => 'V0B'
    }, 'ArtifactsInPlanningFolderSoapRow' ) # and it

    Hope you guys can scramble through this. The above structure repeats itself for each different artifact.

    Please assist me,

    David
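
    A minimal sketch of reaching the fields of each row, assuming the rows really are blessed hash references shaped like the Dumper output above. The data here is mocked (the second row is made up for illustration), and the SOAP call itself is not reproduced.

```perl
use strict;
use warnings;

# Mock data shaped like the Dumper output above; the real structure would
# come from the SOAP call. The second row is invented for illustration.
my $artfList = {
    dataRows => [
        bless( { priority => '1', id => 'artf0000', category => 'V0B' },
               'ArtifactsInPlanningFolderSoapRow' ),
        bless( { priority => '2', id => 'artf0001', category => 'V0B' },
               'ArtifactsInPlanningFolderSoapRow' ),
    ],
};

# Each row is a blessed hash reference, so the arrow operator reaches its keys.
my @ids = map { $_->{id} } @{ $artfList->{dataRows} };
print join(",", @ids), "\n";
```

    Blessing only attaches a class name to the reference; the underlying hash can still be read with ->{key}, although a class may also offer accessor methods that are preferable when they exist.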

Including modules and pragmas in caller's scope via use
3 direct replies — Read more / Contribute
by wanna_code_perl
on Jul 29, 2014 at 16:49

    Hello monks,

    I already have several Local::... modules that do things too specific for general public consumption. Now I have a "smaller" problem:

    Like many programmers, I have an ever-growing repository of useful subroutines that are either too isolated or too simple to (yet...) merit their own module (in the Local:: module namespace or otherwise). I also have a list of modules and pragmas I use in almost every script. However, these are things I'd like to have available in most of my internal-use scripts.

    The subroutines? Easy. Just pile them in a new module (say, Local::Junk) and @EXPORT them by default (or with Exporter::Easy and qw(:all) if I'm feeling extra pedantic...).

    But I don't know how to pull in the other modules (and pragmas) that I'd like to include by default whenever I use Local::Junk, such as List::Util, autodie, etc.

    In other words, I'd like to be able to simply do something like this:

    use Local::Junk qw(:all);

    Instead of:

    use strict;
    use warnings;
    use autodie;
    use List::Util qw(first max maxstr min minstr reduce shuffle sum);
    # etc...
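
    One common approach, sketched here under assumptions rather than as a definitive answer, is to give Local::Junk its own import() that turns on the pragmas and aliases functions into the calling package. Both packages are kept in one file so the example runs as-is, and the list of aliased functions is an arbitrary subset.

```perl
# --- would live in Local/Junk.pm ---
package Local::Junk;
use strict;
use warnings;
use List::Util ();

sub import {
    my $caller = caller;

    # Pragmas are lexically scoped; calling their import() from inside our own
    # import() applies them to the code that is currently being compiled,
    # which is the file that said "use Local::Junk".
    strict->import;
    warnings->import;

    # Alias a few List::Util functions into the caller's package.
    no strict 'refs';
    *{"${caller}::$_"} = \&{"List::Util::$_"} for qw(max min sum);
    return;
}

# --- would live in the script ---
package main;

# Stands in for "use Local::Junk;" because both packages share one file here.
BEGIN { Local::Junk->import }

my $total   = sum(1, 2, 3);
my $largest = max(4, 9, 2);
print "$total $largest\n";
```

    Modules whose import() inspects its caller, which autodie appears to do, need more care than strict and warnings; the CPAN module Import::Into exists for that situation.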
Naming a module that handles SIP2
4 direct replies — Read more / Contribute
by gmcharlt
on Jul 29, 2014 at 12:39

    I'm one of the folks who hacks on a set of modules that handles a protocol used by libraries called SIP2 (not to be confused with the telephony protocol). The modules use Net::Server.

    Currently, the modules are used by one project, and a fork of them is included in another. We're planning on folding the fork back in and making the modules suitable for submission to CPAN. One issue: the module names, "ILS" and "Sip", are clearly unsuitable.

    I'd like advice on what to rename the module to. Net::SIP2::Server? Net::3MSIP2::Server? (The "3M" bit comes from the name of the company that started the protocol). Library::Net::SIP2?

    If you're curious, the code can be found here:

    http://git.evergreen-ils.org/?p=working/SIPServer.git;a=summary

Using Perl to automate GDB
4 direct replies — Read more / Contribute
by eloc
on Jul 28, 2014 at 19:46
    Monks, I humbly come before you seeking your great wisdom. I essentially want to use Perl to send input commands to GDB, read the output GDB produces for those commands, and based on that output send more commands to GDB. I believe open2() or open3() may be of use to me. Is this possible? If so, could you show me a simple example? I am new to Perl, and would deeply appreciate any examples or advice you can provide.
Error with Net::SSLeay
2 direct replies — Read more / Contribute
by grektokomus
on Jul 28, 2014 at 16:38

    Hello Monks,

    I'm working on a simple script to retrieve JSON from an HTTPS web service. The code is being developed on a Windows 2008 server with Active Perl. The script reports an error on compilation, but then successfully retrieves the desired JSON from the Server. I've put substantial effort into eliminating the error and now seek assistance. Here is the declaration and the error:

    #!/usr/bin/perl -w
    use strict;
    #use diagnostics;
    use JSON -support_by_pp;
    use LWP 6.04;
    use LWP::UserAgent;
    use LWP::Protocol::https;
    use Net::SSLeay 1.63;
    use IO::Socket::SSL 1.997;

    Use of uninitialized value in subroutine entry at blib\lib\Net\SSLeay.pm (autosplit into blib\lib\auto\Net\SSLeay\randomize.al) line 912.

    Based on a few similar problems found on the internet, I have verified my LWP modules. I have updated to the latest Net::SSLeay module (which required a forced install with ppm). I have also installed OpenSSL 1.0.1h from slproweb.com (although when I verify using https://gist.github.com/dolmen/10096474, it confirms 1.0.1g).

    I would greatly appreciate guidance to resolve this issue.

Visual Perl/Tk???
7 direct replies — Read more / Contribute
by Anonymous Monk
on Jul 28, 2014 at 13:51

    I am relatively new to Perl. I have used it on and off for a number of years, but never did anything very complicated. I am attempting to create an application for my own use that requires a GUI. I started writing native Perl/Tk, but am becoming a tad overwhelmed, especially with the geometry managers; I came from an environment where we had tools to create the GUI. I have found and attempted to use several, but most of them require almost as much understanding of Perl/Tk as writing it natively. The only two tools I have found that seem to do what I want are visual Camel and Eclipse SWT. Camel is unacceptable because it doesn't support very much, and Eclipse SWT appears to generate only Java code. Does anyone know of a free tool that has the functionality of Eclipse SWT but will generate Perl/Tk code? If Eclipse has this functionality, I could not find it. Documentation for most products of this type leaves much to be desired. Any help in this direction would be appreciated.

Code Interpretation
7 direct replies — Read more / Contribute
by Perl_Ally
on Jul 28, 2014 at 11:32

    I'm hoping somebody can help me interpret what's going on in the following line of code:

     my @refs = @allrefs[ sort {$a <=> $b} values %uni_refs ];

    As far as I understand,  sort {$a <=> $b} values $uni_refs will sort $uni_refs by its values, numerically descending. Is this a correct interpretation?

    Assuming I'm correct so far, what does it then mean to have @allrefs outside of the square brackets containing the sort?

    Help greatly appreciated.
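
    A small sketch of how the line in question evaluates, using made-up data: the values of %uni_refs are treated as indices, sorted in ascending numeric order ($b <=> $a would give descending order), and @allrefs[ LIST ] is an array slice that returns the elements at those indices.

```perl
use strict;
use warnings;

# Made-up sample data; only the shape matters.
my @allrefs  = qw(zero one two three four five);
my %uni_refs = ( a => 4, b => 1, c => 3 );   # values are indices into @allrefs

# values %uni_refs          -> the indices, in no particular order
# sort { $a <=> $b } ...    -> the indices in ascending numeric order
# @allrefs[ LIST ]          -> an array slice: the elements at those indices
my @refs = @allrefs[ sort { $a <=> $b } values %uni_refs ];

print "@refs\n";   # one three four
```

    The hash itself is not sorted or modified; only the list of its values is.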

New Meditations
RFC: Proc::Governor
3 direct replies — Read more / Contribute
by tye
on Jul 28, 2014 at 03:12

    Here is the documentation for a little module I threw together after one of our services did a denial-of-service attack against another of our services. The math for this simple trick works out very neatly.

    I plan to upload this to CPAN very soon. Please let me know what you think.

    NAME

    Proc::Governor - Automatically prevent over-consumption of resources.

    SYNOPSIS

    use Proc::Governor();

    my $gov = Proc::Governor->new();

    while( ... ) {
        $gov->breathe();
        ... # Use resources
    }

    while( ... ) {
        my $res = $gov->work( sub {
            ... # Use Service
        } );
        ...
    }

    DESCRIPTION

    If you want to do a batch of processing as fast as possible, then you should probably also worry about overwhelming some resource and causing problems for other tasks that must share that resource. Fortunately, there is a simple trick that allows one to perform a batch of processing as fast as possible while automatically backing off resource consumption when most any involved resource starts to become a bottleneck (or even before it has become much of a bottleneck).

    The simple trick is to pause between steps for a duration equal to how long the prior step took to complete. The one minor down-side to this is that a single strand of execution can only go about 1/2 maximum speed. But if you have 2 or more strands (processes or threads), then throughput is not limited by this simple "universal governor" trick.
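
    The trick itself can be sketched in a few lines. This is an illustration of the idea only, not Proc::Governor's implementation; the helper name run_governed is hypothetical, and the "work" is simulated with a short sleep.

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);

# Hypothetical helper, not part of Proc::Governor: run $step $count times,
# pausing after each call for as long as that call took.
# Returns the total number of seconds spent pausing.
sub run_governed {
    my ($step, $count) = @_;
    my $paused = 0;
    for (1 .. $count) {
        my $start = time;
        $step->();                  # the work that consumes some resource
        my $took = time - $start;
        if ($took > 0) {
            sleep $took;            # back off in proportion to how slow the step was
            $paused += $took;
        }
    }
    return $paused;
}

# Stand-in for a request to a service: takes roughly 10 ms.
my $paused = run_governed(sub { sleep 0.01 }, 5);
printf "paused for about %.3f s\n", $paused;
```

    When the resource slows down, each step takes longer and so does each pause, which is what lets a strand back off without any explicit coordination.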

    It is also easy to slightly modify this trick so that, no matter how many strands you have working, they together (without any coordination or communication between the strands) will never consume more than, say, 60% of any resource (on average).

    A typical pattern for batch processing is a client sending a series of requests to a server over a network. But the universal governor trick also works in lots of other situations such as with 1 or more strands where each is doing a series of calculations and you don't want the collection of strands to use more than X% of the system's CPU.

    Note that the universal governor does not work well for resources that remain consumed while a process is sleep()ing, such as your process using too much memory.

    Proc::Governor provides lots of simple ways to incorporate this trick into your code so that you don't have to worry about your code becoming a "denial-of-service attack", which also frees you to split your processing among many strands of execution in order to get it done as fast as possible.

    METHODS

    new()

    my $gov = Proc::Governor->new( {
        working     => 0,
        minSeconds  => 0.01,
        maxPercent  => 100,
        unsafe      => 0,
    } );

    new() constructs a new Proc::Governor object for tracking how much time has recently been spent potentially consuming resources and how much time has recently been spent not consuming resources.

    new() takes a single, optional argument of a reference to a hash of options. The following option names are currently supported:

    working

    If given a true value, then the time spent immediately after the call to new() is counted as "working" (consuming resources). By default, the time spent immediately after the call to new() is counted as "not working" (not consuming).

    minSeconds

    minSeconds specifies the shortest duration for which a pause should be done. If a pause is requested but the calculated pause duration is shorter than the number of seconds specified for minSeconds, then no pause happens (and that calculated duration is effectively added to the next pause duration).

    The default for minSeconds is 0.01.

    maxPercent

    maxPercent indicates how much of any particular resource the collection of strands should be allowed to consume. The default is 100 (for 100%, meaning all of any resource, while still avoiding building up a backlog by trying to over-consume any resource).

    Note that percentages are not simply additive. Having 3 groups of clients where each is set to not consume more than 75% of the same service's resources is the same as having just 1 group. The 3 groups together will not consume more than 75% of the service's resources in total.

    Say you have a group of clients, H, all set to not consume more than 50% of some service's resources and you have another group of clients, Q, all set to not consume more than 25% of that same service's resources. Both H and Q together will not add up to consuming more than 50% of the service's resources.

    If Q is managing to consume 20% of the service's resources when H starts running, then H won't be able to consume more than 30% of the service's resources without (slightly) impacting performance to the point that Q starts consuming less than 20%.

      H      Q     Total
     50%     0%     50%
     40%    10%     50%
     30%    20%     50%
     25%    25%     50%

    unsafe

    You can actually specify a maxPercent value larger than 100, perhaps because you have measured overhead that isn't easily accounted for by the client. But doing so risks overloading a resource (your measured overhead could end up being a much smaller percentage of the request time when the service is near capacity).

    So specifying a maxPercent of more than 100 is fatal unless you also specify a true value for unsafe.

    beginWork()

    $gov->beginWork( $breathe );

    Calling beginWork() means that the time spent immediately after the call is counted as "working" (consuming resources). Such time adds to how long the next pause will be.

    If $breathe is a true value, then beginWork() may put the strand to sleep for an appropriate duration.

    endWork()

    $gov->endWork( $breathe );

    Calling endWork() means that the time spent immediately after the call is counted as "not working" (not consuming resources). Such time subtracts from how long the next pause will be.

    If $breathe is a true value, then endWork() may put the strand to sleep for an appropriate duration.

    work()

    $gov->work( sub {
        ... # Consume resources
    }, $which );

    work() is a convenient shortcut that is roughly equivalent to:

    $gov->beginWork( $before );
    ... # Consume resources
    $gov->endWork( $after );

    The value of $which can be:

    0   No pause will happen.
    1   A pause may happen before the sub reference is called.
    2   A pause may happen after the sub reference is called.
    3   A pause may happen before and/or after the sub is called.

    If $which is not given or is undefined, then a value of 1 is used.

    You can actually get a return value through work():

    my @a = $gov->work( sub { ...; get_list() }, $which );
    my $s = $gov->work( sub { ...; get_item() }, $which );

    Note that scalar or list (or void) context is preserved.

    Currently, if your code throws an exception, then endWork() does not get called. This is the same as would happen with the "equivalent" code shown above.

    breathe()

    $gov->breathe( $begin );

    Calling breathe() requests that the current process/thread pause for an appropriate duration.

    Each of the following:

    $gov->breathe();
    # or
    $gov->breathe( 1 );

    is actually equivalent to:

    $gov->beginWork( 1 );

    While

    $gov->breathe( 0 );

    will just pause but will not change whether $gov is counting time as "working" or as "not working".

    pulse()

    $gov->pulse( $count, $begin );

    pulse() is very much like breathe() except that it is optimized for being called many times before enough "working" time has accumulated to justify doing a pause. The meaning of $begin is the same as with breathe().

    So, if you are making requests of a very fast service or are doing work in small chunks, then you can call pulse() directly in your loop and just pass it a value specifying approximately how many calls to pulse() should be made before one of those calls does the work of calculating how long of a pause is called for.

    For example, a request to our Redis service typically takes a bit under 1ms. So code to perform a large number of such requests back-to-back might be written like:

    my $gov = Proc::Governor->new( {
        maxPercent  => 70,
        working     => 1,
    } );
    my $redis = Redis->new(server=>...);
    while( ... ) {
        $gov->pulse( 20 );
        $redis->...;
    }

    That is like calling breathe() every 20th time through the loop and is only the slightest bit less efficient (in run time) than if you had made the extra effort to write:

    ...
    my $count = 0;
    while( ... ) {
        if( 20 < ++$count ) {
            $gov->breathe();
            $count = 0;
        }
        ...

    CROSS-OBJECT INTERACTIONS

    A single process (or thread) can simultaneously use more than one Proc::Governor object. For example, each process (of a group) that makes a series of requests to a service and does significant local processing of the data from each request might want to both prevent overwhelming the service and prevent overwhelming local resources (such as CPU).

    So you could have two Proc::Governor objects. One throttles use of local resources ($g_cpu below). The other throttles use of service resources ($g_db below).

    my $g_cpu = Proc::Governor->new( { maxPercent => 80 } );
    my $g_db  = Proc::Governor->new( { maxPercent => 30 } );

    $g_db->beginWork();
    my $db = DBI->connect( ... );               # DB work
    my $rows = $db->selectall_arrayref( ... );
    $g_db->endWork();

    for my $row ( @$rows ) {
        my $upd = $g_cpu->work( sub {
            process_row( $row );                # Local work
        } );
        $g_db->work( sub {
            $db->update_row( $upd );            # DB work
        } );
    }

    The above code assumes that the local resources required for making requests of the database service are relatively low. It also assumes that doing local computations does not use database resources.

    If you set maxPercent to 100 for both Governors and each process spent about the same amount of time waiting for a response from the database as it spent performing local computations, then there might be no need for any pauses.

    Note that only time spent doing "DB work" adds to how long of a pause might be performed by the $g_db Governor. And only time spent doing "Local work" adds to how long of a pause might be performed by the $g_cpu Governor.

    Any pauses executed by either Governor get subtracted from the duration of any pauses of any Governor objects. So the $g_db Governor executing a pause also counts as a pause for the $g_cpu Governor (and thus makes the next pause that it performs either shorter or later or just not needed).

    Time spent inside of Proc::Governor methods may also be subtracted from future pause durations. But the code pays more attention to keeping such overhead small than to providing highly accurate accounting of the overhead and trying to subtract such from every Governor object.

    WHEN TO PAUSE

    Say you have a service that is a layer in front of some other service. You want to ensure that your service can't become a denial-of-service attack against the other service. But you want to prevent a Governor pause from impacting clients of your service when possible.

    You could implement such as follows:

    sub handle_request {
        my( $req ) = @_;
        our $Gov ||= Proc::Governor->new();
        my $res = $Gov->work( sub {
            forward_request( $req );
        }, 0 );                 # Don't pause here.
        give_response( $res );
        $Gov->breathe( 0 );     # Pause here; still idle.
    }

    (Well, so long as your service architecture supports returning a complete response before the request handler subroutine has returned.)

    If the other service is not near capacity, then the added pauses have no impact (other than perhaps preventing the number of active strands for your service from dropping lower). Be sure your service has an appropriate cap on how many strands it is allowed to keep active (as always).

    TO-DO

    A future version should have support for asynchronous processing. The shape of that interface is already sketched out, but the initial release was not delayed by the work to implement such.

    - tye        

Speeds vs functionality
6 direct replies — Read more / Contribute
by Tux
on Jul 27, 2014 at 12:48

    So my main question, also to myself, is "How much speed are you willing to sacrifice for a new feature?".

    Really. Let's assume you have a neat module that deals with your data, and it deals with it pretty well and reliably, but extending it with new features (some of them asked for by others) is getting harder and harder.

    We now have git, and making a branch is easy, so you can implement the most requested new feature, or the one that most appeals to you, and when you are done and all old and new tests have passed, you notice a speed drop.

    What considerations do you make when deciding whether to release the module with the neat new feature and mention the (specified) slowdown, or to revert the change and note in the docs that the new feature would cause too big a slowdown?


    Enjoy, Have FUN! H.Merijn
New Monk Discussion
Meeting & conference announcements (updated with suggested guidelines)
1 direct reply — Read more / Contribute
by davies
on Jul 26, 2014 at 13:14

    I cannot find any guidelines for announcing Perl related events on this site. I have seen some events announced with requests for talks here, but not others. I am asking now because this year's London Perl Workshop has just been announced, but it is a question that has exercised my mind when other events have come up, such as the London PM tech meet two days ago (Thursday).

    Do such guidelines exist? If not, what should the guidelines contain? The simplest guideline of all is "Don't", but as I said above, it's one I have frequently seen broken. Do people want to know about major conferences only, lesser ones like the tech meet, routine pub events, emergency pub events? Where notices include publicity blurb about sponsors, should this be repeated or edited out?

    For any London mongers reading this, I happily volunteer to post notices about London events here, once I know I'm not doing the wrong thing. If guidelines exist or are created that permit such postings, I also volunteer to put a copy of the LPW 2014 notice in Perl news, unless someone else would prefer to do it.

    Regards,

    John Davies

    Update 2014-07-29

    I have received fewer replies & suggestions than I hoped. I'm therefore putting up my own suggestions for guidelines. Please note that they are my suggestions. If the Gods see things differently, their word goes.

  • It's good to talk. News of any sort of meeting should be posted on this site in the Perl news section (http://www.perlmonks.org/?node=perl+news).
  • If this isn't the primary or first place a notice is posted, the notice should be of the form Title, Explanation (if the title isn't enough), Time, Place, Link to full details.
  • Repeated events should all be in the same thread. Each new case of an event should have a post of its own. I.e. there should be a thread for the London Perl Workshop, with each year's event having a new node in the thread.
  • Fewer threads are better than more, but materially different events should have their own thread. London would therefore have three threads, LPW, tech meets & social meets, with emergency socials being grouped with routine & heretic socials.
  • IMHO, any existing posts should be used as a starting point. If there is more than one that should be in a series, they should be considered and re-parented as necessary. But I'll wait for janitorial or Godly opinions before I start considering such nodes. If the amount of work is a problem, I hereby volunteer, if granted the power by the gods, to do it.
