PerlMonks  

Sorting files you Have read

by brusimm (Pilgrim)
on Nov 16, 2006 at 20:59 UTC (#584602=perlquestion)
brusimm has asked for the wisdom of the Perl Monks concerning the following question:

I am new. New to Perl, semi-new to programming. I have a new headache. Here is my dilemma:

I want to open a file with multiple columns. (Redundant statement, seeing as how I want to sort on the first column)

I can open a file, read through it and print it to screen.

But I can't seem to locate the key process on sorting the contents of a file before printing to screen, never mind printing it to a file.

I have procured the O'Reilly book, Learning Perl. I have scoured many websites... but I cannot connect the dot or dots to make my task come to fruition.

Hence, assistance is asked for...

Here is the code I've been using, without the sort routine:

open (SOMELIST, "somelist"); #opening standings, putting in array SOMELIST
while ($record = <SOMELIST>) {
    print $record;
}
close (SOMELIST)

Where / how does one sort somelist?
Any / all help, input, prewritten code (optimal, seeing as how my logic evades the dots connecting.) would be appreciated.

Re: Sorting files you Have read
by GrandFather (Cardinal) on Nov 16, 2006 at 21:16 UTC

    That's a bit like asking "How should I drink?". It depends a great deal on what it is you are drinking!

    First up:

    1. What does your data look like?
    2. How big is the file?
    3. How often does it need to be done?
    4. Is sort time important?
    5. Will the sorted data be reused in some fashion?

    Actually, many of those are related to each other.

    Because you were good and supplied some code, I'll show you some:

    my @contents = sort <SOMELIST>;
    print @contents;

    and because I'm kind I'll give you some hints:


    DWIM is Perl's answer to Gödel
      My file is comma delimited
      about 50 lines of text
      How often - for my initial purposes, to run and make it happen.
      At the moment, due to the small file size, sort time is unimportant.
      At some point, when I get to that stage, the sorted data will be reused.

      I tried your code, and it seems quite simple, hence effective, BUT
      it runs with no errors, yet nothing prints,
      neither to the screen nor to a file.

      Thank you - I have read through various tutorials,
      and sort and its various ways of handling data.
      I saw the Schwartzian Transform, but it made my brain hurt at this point in time...
      remember, newbie here

      and either I did not operate it right, or there is little on the simpler process, but super search did not turn up anything I could make sense of...

      again, newbie, injured brain, etc, etc.
      Thanks.
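One possible reason a script like that runs cleanly but prints nothing (an assumption, since the failing script is not shown above) is that the sort line ran on a handle that had already been read to the end by the earlier while loop, or on a handle that was never opened. A minimal sketch that opens the file, sorts, and returns the lines in one place follows; the name read_sorted is hypothetical, and somelist is the file name from the question.

```perl
use strict;
use warnings;

# Read every line from the named file and return the lines sorted as plain
# strings. read_sorted is a hypothetical name chosen for this sketch.
sub read_sorted {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!\n";
    my @lines = sort <$fh>;    # list context: reads all remaining lines
    close($fh);
    return @lines;
}

# Only prints when a file called 'somelist' exists in the current directory.
print read_sorted('somelist') if -e 'somelist';
```

Because the handle is opened and read inside the same sub, nothing else can consume the lines before sort sees them.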

        Ok, let's give that "simple effective" sample code some data:

        use strict;
        use warnings;

        my @contents = sort <DATA>;
        print @contents;

        __DATA__
        At the moment, due to the small file size, sort time is unimportant.
        At some point, when I get to that stage, the sorted data will be reused.
        I tried your code, and it seems quite simple, hence, efective.
        BUT, It runs with no errors, but nothing prints.

        Prints:

        At some point, when I get to that stage, the sorted data will be reused.
        At the moment, due to the small file size, sort time is unimportant.
        BUT, It runs with no errors, but nothing prints.
        I tried your code, and it seems quite simple, hence, efective.

        which is sorted on the whole line. It works, but ain't what you want. So let's add in some brain hurty code to sort by the "second column": :)

        use strict;
        use warnings;

        my @contents =
            map  { $_->[0] }
            sort { $a->[1] cmp $b->[1] }
            map  { [$_, extractColumn (1, $_)] } <DATA>;
        print @contents;

        sub extractColumn {
            my ($columnIndex, $line) = @_;
            my ($key) = $line =~ /(?:[^,]*,){$columnIndex}([^,]*)/;
            return $key;
        }

        __DATA__
        At the moment, due to the small file size, sort time is unimportant.
        At some point, when I get to that stage, the sorted data will be reused.
        I tried your code, and it seems quite simple, hence, efective.
        BUT, It runs with no errors, but nothing prints.

        Prints:

        BUT, It runs with no errors, but nothing prints.
        I tried your code, and it seems quite simple, hence, efective.
        At the moment, due to the small file size, sort time is unimportant.
        At some point, when I get to that stage, the sorted data will be reused.

        However, if you are dealing with CSV (comma-separated values) data then you really want to be using a module such as Text::CSV to read the file. You may like to check out a few nodes that have asked the "sort CSV" question before (Super Search SoPW, remember): Sorting a CSV file and sorting CSV files may help too.
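As a rough sketch of the Text::CSV suggestion, the code below parses each line with the module so that quoted fields containing commas stay in one piece, then sorts on a chosen column. Text::CSV comes from CPAN rather than core Perl, the helper name sort_csv_lines and the sample rows are made up for illustration, and each record is assumed to fit on a single line.

```perl
use strict;
use warnings;
use Text::CSV;    # CPAN module, not part of core Perl

# Sort CSV lines by one column (0-based index) using a string comparison.
# sort_csv_lines is a hypothetical helper name for this sketch.
sub sort_csv_lines {
    my ($column, @lines) = @_;
    my $csv = Text::CSV->new({ binary => 1 })
        or die "Cannot create Text::CSV object\n";

    my @keyed;
    for my $line (@lines) {
        chomp(my $copy = $line);
        $csv->parse($copy) or die "Cannot parse line: $copy\n";
        my @fields = $csv->fields;
        my $key = defined $fields[$column] ? $fields[$column] : '';
        push @keyed, [$key, $line];    # remember the key beside the raw line
    }
    return map { $_->[1] } sort { $a->[0] cmp $b->[0] } @keyed;
}

# The quoted field "pear, ripe" holds a comma but is still one column.
my @sorted = sort_csv_lines(1, qq{x,"pear, ripe",3\n}, qq{y,apple,1\n});
print @sorted;
```

Here the apple row comes out first, because the comparison is made on the parsed second field rather than on the raw text between commas.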


        DWIM is Perl's answer to Gödel
        I'll modify your example code to help you do some simple sorting. Now, this is likely not the most efficient way to do it, but since you have a small dataset and you are new to Perl, this can help you get started.

        use strict;
        use warnings;

        open (SOMELIST, "somelist") or die "Cannot open file $!\n";
        my %sort_data;
        while (my $record = <SOMELIST>) {
            my @one_line = split(/,/, $record);
            while (exists $sort_data{$one_line[1]}) {
                $one_line[1] = "$one_line[1]" . " "; #add a blank for uniqueness
            }
            $sort_data{$one_line[1]} = $record; #store it by 2nd column
        }
        close (SOMELIST);

        foreach my $line (sort {$a cmp $b} keys %sort_data) {
            print "$sort_data{$line}";
        }


        Like I said, this isn't the most efficient or even best method, but it is simple enough that you can hopefully see what is going on.
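One alternative to padding duplicate keys with blanks, offered only as a sketch, is to let each hash key hold a list of records, so rows that share a second column are all kept without altering the key. The name sort_by_second_field and the sample rows are hypothetical.

```perl
use strict;
use warnings;

# Group records by their second comma-separated field, keeping every record
# that shares a key, then return them ordered by key.
# sort_by_second_field is a hypothetical name used only in this sketch.
sub sort_by_second_field {
    my (@records) = @_;
    my %by_key;
    for my $record (@records) {
        my @fields = split /,/, $record;
        my $key = defined $fields[1] ? $fields[1] : '';
        push @{ $by_key{$key} }, $record;    # duplicates pile up in one list
    }
    return map { @{ $by_key{$_} } } sort keys %by_key;
}

print sort_by_second_field("a,delta,1\n", "b,alpha,2\n", "c,delta,3\n");
```

Records with the same key come back in the order they were read, so the two "delta" rows stay in their original sequence after the "alpha" row.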


        (2006-11-18 17:21 GMT) Edited my Perl code to remove a couple of syntax errors. - Thanks GrandFather for pointing them out.
Re: Sorting files you Have read
by Fletch (Chancellor) on Nov 16, 2006 at 21:22 UTC

    Also consider that TMTOWTDI which don't involve perl. See the manual page for the sort utility if you're on some flavour of *NIX (or can get a hold of Cygwin in Wintendo land).

Re: Sorting files you Have read
by swampyankee (Parson) on Nov 16, 2006 at 21:24 UTC

    First suggestion: use your system's sort routine. While I think the *ix sort is better than the Windows sort, both are reliable, and quite fast.

    As for sorting a list in Perl:

    use strict; # a good idea
    use warnings; # and another

    my $input = 'myinput.txt';
    open(my $in, "<", $input) or die "Could not open $input because $!\n";
    @unsorted = <$in>;
    chomp(@unsorted); # get rid of end-of-record markers
    close($in);

    @sorted = sort @unsorted;

    will sort the @unsorted array, based on the entire string, with the output going to @sorted. Note that this requires a copy of the entire file be in memory, which may be a Bad Idea. There are several tutorials, for example, here and here. And, of course, here.

    emc

    At that time [1909] the chief engineer was almost always the chief test pilot as well. That had the fortunate result of eliminating poor engineering early in aviation.

    —Igor Sikorsky, reported in AOPA Pilot magazine February 2003.
      Hey swampyankee, thank you, but when I run the script, I get the following

      Global symbol "@unsorted" requires explicit package name at one-c.pl line 6.
      Global symbol "@unsorted" requires explicit package name at one-c.pl line 7.
      Global symbol "@sorted" requires explicit package name at one-c.pl line 10.
      Global symbol "@unsorted" requires explicit package name at one-c.pl line 10.
      Execution of one-c.pl aborted due to compilation errors.

      I am on an XP machine, if that has any bearing, but would there be something else amiss with my installation?

      The reason I ask is that my first responder suggested some code that looked like it ran successfully, but did not send any result to the screen or a file.
      Now your code returns errors... hmm

      Any other thoughts?

        I cleverly forgot to declare @unsorted and @sorted. This is why one should (a) test code before posting and (b) use strict; and use warnings;.

        They do wonders to prevent things like:

        @array = (1,2,3,4,5,6,7,8,9);
        $sum += $_ foreach @array;
        $mean = $sum/scalar(@aray);

        (Yes, I know scalar(@aray) is 0.)
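For comparison, here is a sketch of the same calculation written with strict, warnings, and declared variables. Under use strict, a misspelling such as @aray stops compilation with a 'Global symbol "@aray" requires explicit package name' error instead of quietly dividing by zero at run time. The sub name mean is chosen for this sketch.

```perl
use strict;
use warnings;

# Arithmetic mean of a list of numbers; dies on an empty list rather than
# dividing by zero. The name mean is chosen for this sketch.
sub mean {
    my (@values) = @_;
    die "mean() needs at least one value\n" unless @values;
    my $sum = 0;
    $sum += $_ foreach @values;
    return $sum / scalar(@values);
}

my @array = (1, 2, 3, 4, 5, 6, 7, 8, 9);
print mean(@array), "\n";    # prints 5
```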

        emc

        At that time [1909] the chief engineer was almost always the chief test pilot as well. That had the fortunate result of eliminating poor engineering early in aviation.

        —Igor Sikorsky, reported in AOPA Pilot magazine February 2003.

      If you are going to use strict; use warnings; (and it is a really really good idea), then you really really need to declare your variables: ;)

      use strict; # a good idea
      use warnings; # and another

      my $input = 'myinput.txt';
      open(my $in, "<", $input) or die "Could not open $input because $!\n";
      my @unsorted = <$in>;
      chomp (@unsorted); # get rid of end-of-record markers
      close ($in);

      my @sorted = sort @unsorted;
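The snippet above stops once @sorted is filled. As a sketch of the remaining steps from the original question (sort on the first column, then print to a file), something along these lines might do. The sub name sort_file_by_first_column and the output name sorted.txt are hypothetical, the input is assumed to be comma delimited as described earlier in the thread, and lines are left un-chomped so they can be written back unchanged.

```perl
use strict;
use warnings;

# Sort lines by their first comma-separated field and write them to $output.
# sort_file_by_first_column and the file names below are hypothetical.
sub sort_file_by_first_column {
    my ($input, $output) = @_;

    open(my $in, '<', $input) or die "Could not open $input because $!\n";
    my @unsorted = <$in>;
    close($in);

    # Pair each line with its first field, sort on the field, unwrap again.
    my @sorted =
        map  { $_->[1] }
        sort { $a->[0] cmp $b->[0] }
        map  { [ (split /,/, $_, 2)[0], $_ ] } @unsorted;

    open(my $out, '>', $output) or die "Could not open $output because $!\n";
    print {$out} @sorted;
    close($out) or die "Could not close $output because $!\n";
    return scalar @sorted;    # number of lines written
}

# Only runs when myinput.txt exists in the current directory.
sort_file_by_first_column('myinput.txt', 'sorted.txt') if -e 'myinput.txt';
```

The comparison uses cmp, so the first column is ordered as text; a numeric first column would need <=> instead.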

      DWIM is Perl's answer to Gödel
        Thank you, everyone
        This points me in the right direction
        I used to do IDL and Fortran (About 7 years ago..)
        but this seems harder. Sheesh.. is it me being out of the programming loop so long, or is see-nellity setting in?
        Maybe you shouldn't answer that!!
        Take care.
