Re^3: convert tags to punctuation

by Anonymous Monk
on Jan 15, 2021 at 19:01 UTC (#11126971)

in reply to Re^2: convert tags to punctuation
in thread convert tags to punctuation

Bill -- I think your code is more maintainable. The document I am messing with is about 600,000 lines long. Is there a way to speed this up? Is there a way to get a complete list of the <ab> tags?

Replies are listed 'Best First'.
Re^4: convert tags to punctuation
by BillKSmith (Prior) on Jan 16, 2021 at 16:47 UTC
    You should ask the person who prepares your input file whether he can direct you to either a specification of the file format or the documentation of the program that created it. If this fails, I would write a Perl program to list all the tags. The only way I know to get the values is to use an editor to examine the tags in context and make your best guess. (It will usually be obvious.)
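
    A minimal sketch of such a tag-listing program follows. It assumes a tag is a '<', an optional '/', a run of word characters, and a '>'; the real file format may differ, so the pattern is a guess to adjust. The sub name count_tags and the script name are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count every distinct tag in the text read from a filehandle.
# Assumption: a tag looks like <ab> or </ab> (word characters only).
sub count_tags {
    my ($fh) = @_;
    my %count;
    while ( my $line = <$fh> ) {
        # Walk through all matches on the line, tallying each tag.
        $count{$1}++ while $line =~ /(<\/?\w+>)/g;
    }
    return \%count;
}

# Usage: perl list_tags.pl input.txt
if (@ARGV) {
    open my $fh, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    my $count = count_tags($fh);
    close $fh;
    printf "%-12s %d\n", $_, $count->{$_} for sort keys %$count;
}
```

    Reading line by line keeps memory use flat even for a 600,000-line file, and the counts hint at which tags matter most.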

    It is nearly impossible to guess what will or will not make a Perl program faster. The usual advice is to profile your program and work only on the parts that use the most time. Use the Benchmark module to measure any possible improvement. In your case, I/O is probably taking much longer than processing. Slurping the entire file into memory is probably not an option. Reading the file in large blocks may help, but it is not easy to get right. I recommend against any optimization unless it is absolutely necessary.
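
    As a rough illustration of measuring rather than guessing, the sketch below uses cmpthese from the core Benchmark module to compare two ways of replacing tags. The tag-to-punctuation map, both subs, and the sample text are hypothetical stand-ins, not the code from earlier in this thread.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical tag-to-punctuation map; the real tags and values are unknown.
my %punct = ( '<ab>' => ',', '<cd>' => '.', '<ef>' => ';' );

# Candidate 1: one global substitution per tag.
sub one_pass_per_tag {
    my ($text) = @_;
    for my $tag ( keys %punct ) {
        $text =~ s/\Q$tag\E/$punct{$tag}/g;
    }
    return $text;
}

# Candidate 2: a single alternation, replacement looked up in the hash.
my $alternation = join '|', map { quotemeta } sort keys %punct;
my $tag_re      = qr/($alternation)/;

sub single_pass {
    my ($text) = @_;
    $text =~ s/$tag_re/$punct{$1}/g;
    return $text;
}

# Made-up sample text standing in for a chunk of the real file.
my $sample = 'one<ab> two<cd> three<ef> ' x 1_000;

# Run each candidate for at least 1 CPU second and print a comparison table.
cmpthese( -1, {
    per_tag     => sub { one_pass_per_tag($sample) },
    single_pass => sub { single_pass($sample) },
} );
```

    Which candidate wins depends on the data and the Perl version, so the table is only meaningful when run against a realistic sample. For profiling a whole program, Devel::NYTProf from CPAN (run as perl -d:NYTProf yourscript.pl, then nytprofhtml) is one commonly used option.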


      I noticed something interesting about this document: If I view it with the 'more' filter, I see a bunch of black rectangles with the tags inside them. If I view it with gedi or ptked, I see \x{93}, \x{94}, \x{95}, etc. Does it matter what chars go in my s/ ... / line? What does Perl see?

        Could you post a representative sample of your input file, following the advice given here and here?
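
        One way to produce a sample that survives posting is to print the first few lines with every non-ASCII or control byte written as an escape. This is only a sketch: the sub name show_bytes, the script name, and the 20-line limit are arbitrary choices, and the file is read as raw bytes with no assumption about its encoding.

```perl
use strict;
use warnings;

# Return a copy of a byte string in which every byte outside printable
# ASCII (newline excepted) is rewritten as a \x{..} escape.
sub show_bytes {
    my ($bytes) = @_;
    $bytes =~ s/([^\x20-\x7E\n])/sprintf '\\x{%02X}', ord $1/ge;
    return $bytes;
}

# Usage: perl show_bytes.pl input.txt   (prints the first 20 lines)
if (@ARGV) {
    open my $fh, '<:raw', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    while ( my $line = <$fh> ) {
        print show_bytes($line);
        last if $. >= 20;
    }
    close $fh;
}
```

        The output shows the bytes Perl reads when no decoding layer is applied, which is what a pattern has to match unless the file is opened with an encoding layer.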

Re^4: convert tags to punctuation
by LanX (Cardinal) on Jan 16, 2021 at 16:59 UTC
    > Is there a way to speed this up?

    what makes you think it's not fast enough?


    quoting davido from the CB:

    who cares about how fast Perl runs; it's almost always the network or IO that are standing in the way.

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery
