PerlMonks  

Re^3: dos2ux shows cannot open file if it is greater than 2GB size

by shan_emails (Beadle)
on Mar 14, 2013 at 14:45 UTC


in reply to Re^2: dos2ux shows cannot open file if it is greater than 2GB size
in thread dos2ux shows cannot open file if it is greater than 2GB size

Thanks Buddy...It works fine.

But it takes a long time to complete.

Re^4: dos2ux shows cannot open file if it is greater than 2GB size
by kennethk (Abbot) on Mar 14, 2013 at 17:25 UTC

    You can actually optimize the regular expression with an end-of-string anchor, s/\r$//, which keeps it from having to check the entire string.


    #11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.

Re^4: dos2ux shows cannot open file if it is greater than 2GB size
by ikegami (Patriarch) on Mar 14, 2013 at 23:11 UTC
    To speed things up, I'd read more at a time.
    perl -pe'BEGIN { $/ = \(1024*1024); } s/\r//g;' file.in >file.out
    But yeah, it will take time to copy 7GB.

      That will remove *ALL* \r in the text, not just trailing \r. I do not think that is what dos2ux is supposed to do, though I admit that HP's manual page isn't very clear on that:

      DESCRIPTION
      dos2ux and ux2dos read each specified file in sequence and write it to standard output, converting to HP-UX format or to DOS format, respectively. Each file can be either DOS format or HP-UX format for either command.
      A DOS file name is recognized by the presence of an embedded colon (:) delimiter; see dosif(4) for DOS file naming conventions.
      If no input file is given or if the argument - is encountered, dos2ux and ux2dos read from standard input. Standard input can be combined with other files.

      Enjoy, Have FUN! H.Merijn

        Fine, I suppose if you captured the output of a program that does print "\r95%", you'd have a problem. For the rest of the time, \r will only be found before \n.

        Checking for \r\n is more expensive, since you need special handling for the case where a read ends between the \r and the \n.
