Re^4: dos2ux shows cannot open file if it is greater than 2GB size

by ikegami (Pope)
on Mar 14, 2013 at 23:11 UTC (#1023576)

in reply to Re^3: dos2ux shows cannot open file if it is greater than 2GB size
in thread dos2ux shows cannot open file if it is greater than 2GB size

To speed things up, I'd read more at a time.
perl -pe'BEGIN { $/ = \(1024*1024); } s/\r//g;' >file.out
But yeah, it will take time to copy 7GB.

Replies are listed 'Best First'.
Re^5: dos2ux shows cannot open file if it is greater than 2GB size
by Tux (Abbot) on Mar 15, 2013 at 07:34 UTC

    That will change *ALL* \r in text, not just trailing \r. I do not think that is what dos2ux is supposed to do, though I admit that HP's manual page isn't very clear on that:

    DESCRIPTION
      dos2ux and ux2dos read each specified file in sequence and write it to standard output, converting to HP-UX format or to DOS format, respectively. Each file can be either DOS format or HP-UX format for either command.

      A DOS file name is recognized by the presence of an embedded colon (:) delimiter; see dosif(4) for DOS file naming conventions.

      If no input file is given or if the argument - is encountered, dos2ux and ux2dos read from standard input. Standard input can be combined with other files.

    Enjoy, Have FUN! H.Merijn

      Fine, I suppose if you captured the output of a program that does print "\r95%", you'd have a problem. For the rest of the time, \r will only be found before \n.

      Checking for \r\n is more expensive since you need special handling because the read can end between the two.
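
The special handling mentioned above can be sketched roughly as follows. This is only an illustration, not code from the thread: the names `strip_crlf` and `$block_size` are made up here, and it assumes both handles are already open in binary mode. A `\r` at the very end of a block is held back until the next block shows whether a `\n` follows it.

```perl
use strict;
use warnings;

# Hypothetical helper: remove the "\r" of every "\r\n" pair while reading
# fixed-size blocks. A "\r" that is not followed by "\n" is left alone.
sub strip_crlf {
    my ($in, $out, $block_size) = @_;
    $block_size ||= 1024 * 1024;

    local $/ = \$block_size;    # read fixed-size records, not lines
    my $held = '';              # "\r" held back from the previous block

    while (defined(my $buf = <$in>)) {
        $buf = $held . $buf;

        # If the block ends in "\r" we cannot yet know whether "\n" follows,
        # so keep it for the next iteration (\z, not $, to match the true end).
        $held = $buf =~ s/\r\z// ? "\r" : '';

        $buf =~ s/\r\n/\n/g;
        print {$out} $buf;
    }

    # A held "\r" at end of input was a lone "\r"; write it back unchanged.
    print {$out} $held;
}
```

Called as `strip_crlf(\*STDIN, \*STDOUT)` after `binmode` on both handles, this would leave something like `"\r95%"` intact while still converting `\r\n` line endings.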


        Thanks for your reply.
        In my input file, CONTROL_M characters are also found inside the lines.
        When I tried dos2ux, it removed the CONTROL_M characters inside the lines as well.
        My concern is to remove only the CONTROL_M characters and no other special characters.
        Kindly advise how to proceed, and how to speed up the process using the code below.

        $self->remove_controlM($infile);

        sub remove_controlM {
            my $self = shift;
            my $in_file = shift;
            my $out_file = $in_file . "controlM_removed";
            open(my $FH_IN, '<', $in_file) or die "Failed to open $in_file $!\n";
            print "REMOVE CONTROL_M CHARACTER PROCESS STARTING";
            open(my $FH_OUT, '>', $out_file) or die "Failed to write $out_file $!\n";
            while (<$FH_IN>) {
                s/\r//g;
                print($FH_OUT $_);
            }
            close ($FH_IN);
            close ($FH_OUT);
            unlink($in_file);
            `mv $out_file $in_file`;
            print "REMOVE CONTROL_M CHARACTER PROCESS ENDING";
        }

        Shanmugam A.
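
One possible way to combine the posted subroutine with the block-read suggestion from earlier in the thread is sketched below. It is a sketch under assumptions, not a tested replacement: the name `remove_control_m` and the temporary-file suffix are hypothetical, it removes every `\r` (as the question asks), and it uses `rename` in place of `unlink` plus `mv`, which assumes the temporary file is on the same filesystem as the input.

```perl
use strict;
use warnings;

# Hypothetical variant of the posted sub: read 1 MiB blocks instead of lines
# and delete every "\r" (CONTROL_M) byte, leaving all other bytes untouched.
sub remove_control_m {
    my ($in_file) = @_;
    my $out_file = $in_file . '.controlM_removed';    # assumed temp name

    open(my $fh_in,  '<', $in_file)  or die "Failed to open $in_file: $!\n";
    open(my $fh_out, '>', $out_file) or die "Failed to write $out_file: $!\n";
    binmode($fh_in);
    binmode($fh_out);

    local $/ = \(1024 * 1024);    # fixed-size records of 1 MiB
    while (defined(my $block = <$fh_in>)) {
        $block =~ tr/\r//d;       # delete all "\r" characters in the block
        print {$fh_out} $block or die "Write to $out_file failed: $!\n";
    }

    close($fh_in);
    close($fh_out) or die "Failed to close $out_file: $!\n";

    # Replace the original file with the cleaned copy.
    rename($out_file, $in_file)
        or die "Failed to rename $out_file to $in_file: $!\n";
}
```

Because every `\r` is removed, no special handling is needed when a block boundary falls between `\r` and `\n`. Copying several gigabytes will still take time, since the whole file has to be read and rewritten.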
