PerlMonks  


As I said, you have more than one option. If my hypothesis is right, I would do the following:

1) Make sure the hypothesis is right: If possible, call the code from a small test script that runs it a few hundred thousand times, and time that. Then use two test files: one with a few translation strings at the beginning of a long file, the other with the same translation strings at the end of the file. If the first file takes much longer, you can be pretty sure that a runaway regex search is the culprit.
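A minimal sketch of such a timing test, under some assumptions: the pattern in $re is only a stand-in (substitute the regex your program really uses), the two "files" are built in memory instead of being read from disk, and time_matches is a hypothetical helper name. The iteration count is kept small here and can be raised as needed.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Stand-in pattern (assumption): replace with the regex you want to test.
my $re = qr/__\((?:'.*?[^\\']'|".*?[^\\"]")\)/is;

# Hypothetical helper: run a global match over $text $iterations times
# and return the elapsed wall-clock time in seconds.
sub time_matches {
    my ($text, $iterations) = @_;
    my $start = time;
    for (1 .. $iterations) {
        my $count = () = $text =~ /$re/g;   # count all matches
    }
    return time - $start;
}

# Five made-up translation strings and a block of filler text.
my $strings = join ' ', map { "__('message $_')" } 1 .. 5;
my $filler  = "<p>ordinary page text without markers</p>\n" x 2000;

my $early = $strings . $filler;   # translation strings at the beginning
my $late  = $filler . $strings;   # the same strings at the end

printf "strings at start: %.3fs\n", time_matches($early, 100);
printf "strings at end:   %.3fs\n", time_matches($late, 100);
```

A large gap between the two timings would support the hypothesis; similar timings would suggest looking elsewhere.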

Another possibility would be to run the code (either all of it or extracted parts in a test script) with a newer perl version and use debugging features like "use re 'debug';".
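As a rough illustration of the debugging pragma, the following sketch runs a simplified stand-in pattern (not your real regex) against a short made-up string. With the pragma in effect, perl writes a trace of the regex compilation and of each matching step to STDERR, which shows where the engine spends its effort.

```perl
use strict;
use warnings;
use re 'debug';   # trace regex compilation and matching on STDERR

# Made-up sample text and a simplified stand-in pattern (assumptions).
my $text    = "before __('hello') after";
my $matched = $text =~ /__\('.*?[^\\']'\)/s ? $& : undef;

print "matched: $matched\n" if defined $matched;
```

The trace grows quickly with input size, so it is best tried on a short extract first.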

2) Change your program to do the search and replace in a loop. If you call a regex with the /g modifier in scalar context, it finds only one occurrence and stops, but it remembers where it left off (you can find out where with pos(), and change where it continues with pos() as well). What I would propose would be something like this:

    my $result = "";
    # changed to remove the two capture parens; the alternation is grouped
    # with (?:...) so that __\( and \) apply to both quote styles
    while (m/__\((?:'.*?[^\\']'|".*?[^\\"]")(?:,,?.*?[^'"])?\)/gis) {
        my $end   = pos();     # offset just past the match
        my $start = $-[0];     # offset where the match begins
        # copy everything before the match unchanged
        $result .= substr($_, 0, $start);
        my $transtext = substr($_, $start, $end - $start);
        # <here $transtext has your complete translation string. Do the
        #  substitution on $transtext; you can use the code you already
        #  used or even simplify it>
        $result .= $transtext;
        # remove the already translated part from $_
        substr($_, 0, $end) = '';
        # we reset the search to begin at position 0 again
        pos() = 0;
    }
    $_ = $result . $_;

Untested code but this should theoretically work. It has to parse the translation string twice, so it will naturally be twice as slow as your original simple regex. But it should not bring your webserver to its knees.

Clarification update: "twice as slow" only applies to the parsing of the string, not to the complete regex execution. gettr() will still be called only once.
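The pos() behaviour that step 2 relies on can be seen in isolation with a small sketch. The string and pattern here are made up purely for illustration.

```perl
use strict;
use warnings;

my $text = "aXbXc";
my @ends;

# In scalar context, each /g match finds one occurrence and stops.
while ($text =~ /X/g) {
    push @ends, pos($text);   # offset just past the current match
}
print "match ends: @ends\n";                         # match ends: 2 4

# Assigning to pos() moves the point where the next /g match starts.
pos($text) = 2;
$text =~ /X/g;
my $next = pos($text);
print "next match ends at: $next\n";                 # next match ends at: 4
```

Because the search was restarted at offset 2, the first X (at offset 1) is skipped and only the second one is found.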


In reply to Re^3: regex for translation by jethro
in thread regex for translation by klayman
