PerlMonks  

schwartzian transform problem - Solved

by Cristoforo (Curate)
on Feb 27, 2025 at 19:37 UTC ( [id://11164097]=perlquestion )

Cristoforo has asked for the wisdom of the Perl Monks concerning the following question:

I wonder if someone could tell me why this code is not working. It is meant to sort the records from highest percent to lowest.

Update (I edited the regexp in split): I found the errors. The map {$_[0]} should be map {$_->[0]}, and the split pattern needed a newline (\n) at the end: split(/(?<=workspace\/data)\n/, $s). Now I'm getting the correct sorted output.

C:\Old_Data\perlp>perl try3.pl
>>> prd1702
Filesystem Size Used Avail Use% Mounted on
/workspace 3.9T 746G 3.1T 23% /workspace/data
>>> prd1703
Filesystem Size Used Avail Use% Mounted on
/workspace 3.9T 687G 3.2T 18% /workspace/data
>>> prd1701
Filesystem Size Used Avail Use% Mounted on
/workspace 3.9T 887G 3.0T 13% /workspace/data
(Below, the code before fixes noted above)
#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';

#https://stackoverflow.com/questions/79472778/sorting-the-content-of-a-file

my $s = <<EOF;
>>> prd1701
Filesystem Size Used Avail Use% Mounted on
/workspace 3.9T 887G 3.0T 13% /workspace/data
>>> prd1702
Filesystem Size Used Avail Use% Mounted on
/workspace 3.9T 746G 3.1T 23% /workspace/data
>>> prd1703
Filesystem Size Used Avail Use% Mounted on
/workspace 3.9T 687G 3.2T 18% /workspace/data
EOF

my @data = map {$_[0]}
    sort {$b->[1] <=> $a->[1]}
    map {[$_, /\s(\d+)%/]}
    split(/(?<=workspace\/data)/, $s);

It prints these warnings:

C:\Old_Data\perlp>perl try3.pl
Use of uninitialized value in numeric comparison (<=>) at try3.pl line 23.
Use of uninitialized value in numeric comparison (<=>) at try3.pl line 23.
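A possible explanation of the two fixes, sketched with shortened stand-in records (the strings below are illustrative, not the original data). Splitting after "workspace/data" without consuming the newline leaves a final piece that holds only "\n"; that piece has no percentage, so its sort key is undef, which would account for the warnings. Separately, $_[0] reads the first element of @_, whereas $_->[0] dereferences the array reference held in $_.

```perl
use strict;
use warnings;

# Shortened stand-in for the heredoc; each record ends in "workspace/data\n".
my $s = ">>> a\n13% /workspace/data\n>>> b\n23% /workspace/data\n";

# Original pattern: the newline is not consumed, so a last piece "\n" is left over.
my @loose = split /(?<=workspace\/data)/, $s;

# Corrected pattern: the newline is consumed and the trailing empty field is dropped.
my @tight = split /(?<=workspace\/data)\n/, $s;

printf "without \\n: %d pieces, with \\n: %d pieces\n", scalar @loose, scalar @tight;

# Pieces with no percentage get no sort key, hence the "uninitialized" warnings.
my @keyless = grep { !/\s(\d+)%/ } @loose;
printf "pieces without a key: %d\n", scalar @keyless;

# $_[0] is element 0 of @_; $_->[0] dereferences the array ref in $_.
my @unwrapped = map { $_->[0] } ( [ 'record text', 23 ] );
print "unwrapped: @unwrapped\n";
```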

Replies are listed 'Best First'.
Re: schwartzian transform problem - Solved
by johngg (Canon) on Feb 28, 2025 at 11:38 UTC

    An alternative to the ST (Schwartzian Transform) is a GRT (Guttman-Rosler Transform).

    In a do block, the data is read from a filehandle (in this script, one opened on a HEREDOC) with the input record separator set to the empty string; because the data contains no blank lines, a single read returns the whole block. It is then split into records at points that are not at the start of the string (to avoid an empty first record) and that are followed by the ">>>" which starts each record. Each record passes into a map, where the digits preceding the % sign are captured and packed as a 32-bit network-order value (with bitwise NOT applied, as we want descending numerical order), then concatenated with the whole record packed as a string. The result is passed to a simple lexical sort and then into a second map, which recovers each record by skipping the first four bytes, i.e. the packed number used for sorting. The script ...

    use strict;
    use warnings;

    open my $fh, q{<}, \ <<__EOF__ or die qq{open: < HEREDOC: $!\n};
    >>> prd1703
    Filesystem Size Used Avail Use% Mounted on
    /workspace 3.9T 687G 3.2T 18% /workspace/data
    >>> prd1701
    Filesystem Size Used Avail Use% Mounted on
    /workspace 3.9T 887G 3.0T 13% /workspace/data
    >>> prd1702
    Filesystem Size Used Avail Use% Mounted on
    /workspace 3.9T 746G 3.1T 23% /workspace/data
    __EOF__

    print for
        map { unpack q{x4a*}, $_ }
        sort
        map { m{(\d+)(?=%)} && ( ~ pack( q{N}, $1 ) . pack( q{a*}, $_ ) ) }
        do {
            local $/ = q{};
            split m{(?<!\A)(?=>>>)}, <$fh>;
        };

    close $fh or die qq{close: < HEREDOC: $!\n};

    The output ...

    >>> prd1702
    Filesystem Size Used Avail Use% Mounted on
    /workspace 3.9T 746G 3.1T 23% /workspace/data
    >>> prd1703
    Filesystem Size Used Avail Use% Mounted on
    /workspace 3.9T 687G 3.2T 18% /workspace/data
    >>> prd1701
    Filesystem Size Used Avail Use% Mounted on
    /workspace 3.9T 887G 3.0T 13% /workspace/data

    I hope this is of interest.

    Cheers,

    JohnGG
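To make the key trick in the GRT easier to follow, here is a minimal sketch on bare numbers rather than full records. The helper name desc_key is hypothetical, and the sketch assumes the "bitwise" feature is not enabled, so that ~ applied to a packed string inverts the bytes of that string.

```perl
use strict;
use warnings;

# Build a 4-byte big-endian key and invert its bits, so that a plain
# ascending string sort puts larger numbers first.
sub desc_key {
    my ($n) = @_;
    return ~ pack( 'N', $n );
}

my @percents = ( 13, 23, 18 );

# Prefix each value with its key, sort as strings, then strip the 4 key bytes.
my @sorted = map { unpack 'x4 a*', $_ }
             sort
             map { desc_key($_) . $_ } @percents;

print "@sorted\n";
```

Because the keys all have the same width, comparing them byte by byte gives the same order as comparing the numbers, which is why no comparison block is needed in the sort.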

      Hi JohnGG, I'm trying to follow your solution using the GRT sort. I wonder why you have m{(\d+)(?=%)} where I might have used m{(\d+)%} without the positive lookahead for '%'?

        There's no difference since $& and others aren't used, but /(\d+)%/ should be faster.
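A small sketch of that point, using one line of the sample data: both patterns capture the same digits, and only the extent of the overall match differs. The helper whole_match is hypothetical and reads the match offsets from @- and @+ instead of using $&.

```perl
use strict;
use warnings;

my $line = '/workspace 3.9T 746G 3.1T 23% /workspace/data';

# Both forms capture the digits that sit directly before the % sign.
my ($with_lookahead) = $line =~ m{(\d+)(?=%)};
my ($plain)          = $line =~ m{(\d+)%};

print "lookahead: $with_lookahead, plain: $plain\n";

# Return the text covered by the whole match, or undef if there is none.
sub whole_match {
    my ( $str, $re ) = @_;
    return $str =~ $re ? substr( $str, $-[0], $+[0] - $-[0] ) : undef;
}

# The lookahead leaves the % outside the match; the plain form includes it.
print "whole match with lookahead: ", whole_match( $line, qr{(\d+)(?=%)} ), "\n";
print "whole match without:        ", whole_match( $line, qr{(\d+)%} ),     "\n";
```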

Re: schwartzian transform problem
by ikegami (Patriarch) on Feb 28, 2025 at 23:03 UTC

    Instead of using complex code to make sort efficient, use Sort::Key which handles that for you without the messy code.

    use File::Slurper qw( read_text );
    use Sort::Key qw( rikeysort );

    print
        rikeysort { ( /(\d+)%/ )[0] }
        split /^(?=>>> )/m,
        read_text( 'try3.txt' );

    (Also note the more reliable split pattern.)

    It's also a stable sort like the ST solution, but unlike the provided GRT solution.

    It might be the fastest of all provided solutions.
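The stability remark can be illustrated with a sketch on made-up records in which two entries share a percentage. It relies on Perl's built-in sort being a merge sort, which keeps ties in input order in current implementations, although the documentation does not promise this. The GRT compares the whole packed string, so ties fall back to the record text instead.

```perl
use strict;
use warnings;

# Made-up records; "zeta" and "alpha" tie at 20% and "zeta" comes first.
my @records = ( 'zeta 20%', 'alpha 20%', 'mid 50%' );

# ST: only the number is compared, so tied records keep their input order.
my @st = map  { $_->[0] }
         sort { $b->[1] <=> $a->[1] }
         map  { [ $_, /(\d+)%/ ] } @records;

# GRT: the key is followed by the record itself, so ties are ordered by text.
my @grt = map  { unpack 'x4 a*', $_ }
          sort
          map  { ~ pack( 'N', /(\d+)%/ ) . $_ } @records;

print 'ST:  ', join( ' | ', @st ),  "\n";
print 'GRT: ', join( ' | ', @grt ), "\n";
```

If input order among ties matters for the GRT, one common remedy is to append a packed sequence number to the key before the record.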

      This is a situation where there's no substitute for measuring, e.g. with Benchmark.
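As a starting point for such a measurement, here is a rough sketch using the core Benchmark module. The records are synthetic (host names and sizes are made up, percentages are random), the sub names st_sort and grt_sort are hypothetical, and the Sort::Key variant is left out so that the sketch runs with core modules only. Results will vary with data size and Perl version.

```perl
use strict;
use warnings;
use Benchmark qw( cmpthese );

# Synthetic records in the thread's format; only the percentage varies.
my @records = map {
    sprintf ">>> host%04d\nFilesystem Size Used Avail Use%% Mounted on\n"
          . "/workspace 3.9T 746G 3.1T %d%% /workspace/data\n", $_, int rand 100
} 1 .. 500;

# Schwartzian transform: pair each record with its key, sort, unwrap.
sub st_sort {
    return map  { $_->[0] }
           sort { $b->[1] <=> $a->[1] }
           map  { [ $_, /(\d+)%/ ] } @_;
}

# GRT: prefix an inverted packed key, sort as plain strings, strip the key.
sub grt_sort {
    return map  { unpack 'x4 a*', $_ }
           sort
           map  { ~ pack( 'N', /(\d+)%/ ) . $_ } @_;
}

# Run each variant for at least one CPU second and print a comparison table.
cmpthese( -1, {
    ST  => sub { my @sorted = st_sort(@records) },
    GRT => sub { my @sorted = grt_sort(@records) },
} );
```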
Re: schwartzian transform problem
by Cristoforo (Curate) on Feb 27, 2025 at 20:12 UTC
    The program with the corrections (the file try3.txt contains the same data):

    #!/usr/bin/perl
    use strict;
    use warnings;
    use feature 'say';

    open my $fh, '<', 'try3.txt' or die $!; # try3.txt contains the data

    my $s;
    {
        local $/ = undef;
        $s = <$fh>; # slurp file
    }
    close $fh or die "read file close error: $!";

    my @data = map {$_->[0]}
        sort {$b->[1] <=> $a->[1]}
        map {[$_, /(\d+)%/]}
        split(/(?<=workspace\/data)\n/, $s);

    say for @data;

Node Type: perlquestion [id://11164097]
Approved by marto
Front-paged by marto