PerlMonks  

'sort -u' in perl

by sanPerl (Friar)
on Jul 06, 2006 at 18:27 UTC (#559638=perlquestion)

sanPerl has asked for the wisdom of the Perl Monks concerning the following question:

Dear All,
I need to sort the lines of a file (say 'a.txt') and remove duplicates. However, the file is huge and I need to carry out all the operations on Windows. I normally follow these steps:
1) Generate the file on Windows
2) FTP it to Unix and run 'sort -u a.txt > b.txt'
3) FTP 'b.txt' back to Windows
4) Rename it to 'a.txt'.
I wonder whether there is any way to write a Perl program that uses a built-in function to create 'b.txt' on Windows itself.
Regards,
Sandeep

Replies are listed 'Best First'.
Re: 'sort -u' in perl
by runrig (Abbot) on Jul 06, 2006 at 18:37 UTC
    Call sort -u. There are plenty of Win32 ports of Unix utilities. I like UnxUtils.

    Update: Alternatively, I'd use a database, maybe either DB_File or DBI/DBD::SQLite. But sort -u is probably the simplest solution.

    Another update: I now like MSYS for my Unix utilities

Re: 'sort -u' in perl
by JediWizard (Deacon) on Jul 06, 2006 at 18:41 UTC

    perl -e "my %hash; @hash{<>} = 1; print sort keys %hash;"

    They say that time changes things, but you actually have to change them yourself.

    —Andy Warhol

      The OP mentioned this was a huge file. Reading the entire thing into a hash, then making two more copies of the data (sort keys), is probably going to be very painful, depending on how huge "huge" is.
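To make the memory concern concrete: one way to avoid holding the whole file at once is an external merge sort, using only core modules. The following is a rough sketch under stated assumptions, not a tuned implementation. The names `write_chunk` and `sort_unique` and the default chunk size are made up for illustration, and the comparison is Perl's plain string order, which matches `sort -u` only in the C locale.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: write one sorted, de-duplicated chunk to an
# anonymous temp file and return the handle, rewound to the start.
sub write_chunk {
    my ($lines) = @_;
    my $fh = tempfile();    # scalar context: handle only, file cleaned up automatically
    my $prev;
    for my $line (sort @$lines) {
        next if defined $prev && $line eq $prev;
        print {$fh} $line;
        $prev = $line;
    }
    seek $fh, 0, 0 or die "seek failed: $!";
    return $fh;
}

# Hypothetical helper: read lines from $in, write them sorted and unique
# to $out, keeping at most $max_lines lines in memory at a time.
sub sort_unique {
    my ($in, $out, $max_lines) = @_;
    $max_lines ||= 100_000;    # assumed default; tune to available memory

    # Phase 1: split the input into sorted chunks on disk.
    my (@chunks, @buf);
    while (my $line = <$in>) {
        $line .= "\n" unless $line =~ /\n\z/;    # last line may lack a newline
        push @buf, $line;
        if (@buf >= $max_lines) {
            push @chunks, write_chunk(\@buf);
            @buf = ();
        }
    }
    push @chunks, write_chunk(\@buf) if @buf;

    # Phase 2: merge the chunks, always emitting the smallest head line
    # and skipping any line equal to the one emitted just before it.
    my @heads = map { scalar readline($_) } @chunks;
    my $prev;
    while (1) {
        my $min;
        for my $i (0 .. $#heads) {
            next unless defined $heads[$i];
            $min = $i if !defined $min || $heads[$i] lt $heads[$min];
        }
        last unless defined $min;
        my $line = $heads[$min];
        print {$out} $line unless defined $prev && $line eq $prev;
        $prev = $line;
        $heads[$min] = readline($chunks[$min]);
    }
    return;
}
```

It would be called as `sort_unique($in, $out)` with `$in` opened for reading on 'a.txt' and `$out` opened for writing on 'b.txt'. Every chunk keeps a filehandle open during the merge, so a very small chunk size on a very large file could run into the per-process limit on open files.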
Re: 'sort -u' in perl
by planetscape (Chancellor) on Jul 06, 2006 at 19:25 UTC
Re: 'sort -u' in perl
by Hue-Bond (Priest) on Jul 06, 2006 at 18:51 UTC

    We have sort and uniq so it's a matter of combining them:

    use List::MoreUtils qw/uniq/;

    my @a = qw/4 6 8 9 3 1 5 8 9 0 0 2 5 7 9 4 2 5 5 4 7 9 6 4 2/;
    my @s = uniq sort @a;
    print "s: @s\n";

    __END__
    s: 0 1 2 3 4 5 6 7 8 9

    --
    David Serrano

Re: 'sort -u' in perl
by Ieronim (Friar) on Jul 06, 2006 at 19:26 UTC
    There is a lame built-in sort on Win32, so you can use this one-liner:
    perl -ne "next if $seen{$_}++;print" a.txt | sort >b.txt
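The `$seen{$_}++` idiom in that one-liner keeps the first occurrence of each line and preserves input order; the sorting is left to the external command. Here is a minimal sketch of the same idiom as a standalone sub (the name `unique_lines` is made up for illustration). Note that it still holds every distinct line in memory as a hash key.

```perl
use strict;
use warnings;

# Hypothetical helper: return the input lines with later duplicates
# dropped, keeping the order in which each line first appeared.
sub unique_lines {
    my %seen;
    return grep { !$seen{$_}++ } @_;
}
```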
Re: 'sort -u' in perl
by blue_cowdawg (Monsignor) on Jul 06, 2006 at 18:52 UTC
        I wonder whether there is any way to write a Perl program that uses a built-in function to create 'b.txt' on Windows itself.

    In the spirit of TIMTOWTDI!:

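    #!/usr/bin/perl -w
    use strict;
    use Tie::File;

    # Tie the array to the file, so changes to @ry are written back to a.txt.
    my @ry = ();
    tie @ry, "Tie::File", "a.txt" or die $!;

    # Sort the lines, then keep only the first occurrence of each.
    my %een = ();
    @ry = grep !$een{$_}++, sort @ry;
    untie @ry;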
    Sort of an implementation of an inline unique sort.

    HTH


    Peter L. Berghold -- Unix Professional
    Peter -at- Berghold -dot- Net; AOL IM redcowdawg Yahoo IM: blue_cowdawg
Re: 'sort -u' in perl
by creamygoodness (Curate) on Jul 07, 2006 at 21:02 UTC

    Install Sort::External from CPAN. It was written for the purpose of sorting huge files.

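    use Sort::External;

    # HUGEFILE and OUTFILE are assumed to be filehandles that are already open.
    my $sortex = Sort::External->new();
    while (<HUGEFILE>) {
        $sortex->feed($_);
    }
    $sortex->finish;

    # Items come back in sorted order, so duplicates are adjacent.
    my $prev = '';
    while ( defined( $_ = $sortex->fetch ) ) {
        next if $_ eq $prev;
        print OUTFILE $_;
        $prev = $_;
    }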
    --
    Marvin Humphrey
    Rectangular Research ― http://www.rectangular.com

Node Type: perlquestion [id://559638]
Approved by gellyfish