
A little golf anyone?

by deprecated (Priest)
on Oct 30, 2002 at 20:00 UTC ( #209212=perlmeditation )

I am finding myself irritated by Solaris 8's lack of du and df -h, especially now that Solaris 9 actually has the flag. So I am preparing to write a small shell one-liner, or Perl if I can't do it with awk or sed (the principle being, if you're going to do it a bunch of times, you want it to be fast).

So the challenge is:

Replicate the output of a du or df with the -h flag. Example:

minotaur% df -k | grep c0t1
/dev/dsk/c0t1d0s1 1269615 814811 391324 68% /usr
/dev/dsk/c0t1d0s3 245407 44583 176284 21% /var
/dev/dsk/c0t1d0s0 424519 187200 194868 49% /home
minotaur% df -h | grep c0t1
/dev/dsk/c0t1d0s1 1.2G 796M 382M 68% /usr
/dev/dsk/c0t1d0s3 240M 44M 172M 21% /var
/dev/dsk/c0t1d0s0 415M 183M 190M 49% /home

Replicate it in such a way that it can be passed input from standard in.
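
For reference, here is an ungolfed sketch of the conversion being asked for. It is an illustration only, not an entry: the names human_size and humanize_line are made up, it assumes df -k style input with the sizes in columns 2 to 4, and the rule "one decimal below 10, whole numbers otherwise" is an assumption chosen because it reproduces the sample above. How the real df -h rounds at the boundaries is not verified here.

```perl
use strict;
use warnings;

# Convert a count of 1024-byte blocks (as printed by df -k) into a
# short string with a K, M or G suffix.
sub human_size {
    my ($kb) = @_;
    my @units = qw(K M G);
    my $i = 0;
    while ($kb >= 1024 && $i < $#units) {
        $kb /= 1024;
        $i++;
    }
    # Assumed rule: one decimal place below 10 (1.2G), whole numbers
    # otherwise (796M). Plain K values are always whole numbers.
    return $kb < 10 && $i > 0
        ? sprintf('%.1f%s', $kb, $units[$i])
        : sprintf('%.0f%s', $kb, $units[$i]);
}

# Rewrite the three size columns of one df -k line; anything that is
# not a plain integer (headers, percentages, paths) is left alone.
sub humanize_line {
    my ($line) = @_;
    my @f = split ' ', $line;
    for my $i (1 .. 3) {
        next unless defined $f[$i] && $f[$i] =~ /^\d+$/;
        $f[$i] = human_size($f[$i]);
    }
    return join ' ', @f;
}

# Sample lines taken from the df -k output above.
my @sample = (
    '/dev/dsk/c0t1d0s1 1269615 814811 391324 68% /usr',
    '/dev/dsk/c0t1d0s3 245407 44583 176284 21% /var',
    '/dev/dsk/c0t1d0s0 424519 187200 194868 49% /home',
);
print humanize_line($_), "\n" for @sample;
```

To use it as a filter, the loop over @sample would be replaced by a loop over STDIN that chomps each line before passing it to humanize_line.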

Special note: if you can do it in sed or some other shell command (ideally a common one, let's not be linux or solaris or bsd centric here), that is fine, given the intent is to have a working shell alias. Contestants will be judged from the first character after the pipe:

alias du 'du | perl ... '

So, yeah, that means you get a 1 character bonus for using sed, but I doubt your command would be as compact.

Consider it a Unix "brown bag" golf. Enjoy!

brother deprecated

Laziness, Impatience, Hubris, and Generosity.

Replies are listed 'Best First'.
Re: A little golf anyone?
by jsprat (Curate) on Oct 30, 2002 at 22:23 UTC
    How about 114 (including the 'df -k|'):

    df -k|perl -ape'map{while(1024<$_){$_/=1024;$i++}$i="%6.1f".substr"KMG",$i,1;$_=sprintf$i,$_}@F[1..3];$_=join"\t",@F,$/'

    Output using the sample data looks like this:

    /dev/dsk/c0t1d0s1 1.2G 795.7M 382.2M 68% /usr
    /dev/dsk/c0t1d0s3 239.7M 43.5M 172.2M 21% /var
    /dev/dsk/c0t1d0s0 414.6M 182.8M 190.3M 49% /home

    I didn't come up with a way to cleanly format G to 1 decimal place, leaving K and M at a whole number (ie 1.2G, 240M).
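
Outside of golf, one way to get that mixed precision is to look the sprintf format up per unit instead of using a single "%6.1f". The sketch below is not a golfed entry; the sub name fmt_kb and the %format table are made up for illustration, and the input is assumed to be a df -k block count.

```perl
use strict;
use warnings;

# Per-unit sprintf formats: whole numbers for K and M, one decimal for G.
my @units  = qw(K M G);
my %format = (K => '%d', M => '%.0f', G => '%.1f');

sub fmt_kb {
    my ($n) = @_;
    my $i = 0;
    # Same scaling loop as the one-liner: divide by 1024 until it fits.
    while ($n > 1024 && $i < $#units) {
        $n /= 1024;
        $i++;
    }
    my $unit = $units[$i];
    return sprintf($format{$unit}, $n) . $unit;
}

print join(' ', map { fmt_kb($_) } 1269615, 245407, 512), "\n";
```

The table costs far more characters than the golfed version can afford, which is presumably why nothing like it appears in the one-liner.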

    94 if less precision and un-aligned formatting meets your standards :-P

    df -k|perl -ape'map{while(1024<$_){$_>>=10;my$i++}$_.=substr"KMG",$i,1}@F[1..3];$_=join"\t",@F,$/'

    Output:

    /dev/dsk/c0t1d0s1 1G 795M 382M 68% /usr
    /dev/dsk/c0t1d0s3 239M 43M 172M 21% /var
    /dev/dsk/c0t1d0s0 414M 182M 190M 49% /home

    Update: fixed miscounts

Re: A little golf anyone?
by petral (Curate) on Oct 31, 2002 at 00:00 UTC
    Kinda silly at 143:
    perl -pe's/(^|\s)((\d?\d)?(\d)(\d)\d{5}|(\d?\d)?(\d)(\d)\d\d|\d{1,3})(?=\s|$)/$1.($3?" $3$4G":$4?" $4.$5G":$6?" $6$7M":$7?" $7.$8M":$2)/ge'
    but it does produce:
    /dev/dsk/c0t1d0s1 1.2G 814M 391M 68% /usr
    /dev/dsk/c0t1d0s3 245M 44M 176M 21% /var
    /dev/dsk/c0t1d0s0 424M 187M 194M 49% /home
    from your sample (note truncation instead of rounding).
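
The gap between this output and the df -h sample is larger than rounding alone would explain: dropping the last three digits amounts to dividing by 1000, while df -h divides by 1024. A small sketch of the two calculations side by side, with made-up sub names (chop_digits, scale_1024) and the used/size values from the sample:

```perl
use strict;
use warnings;

# Digit chopping: drop the last three decimal digits, which is the same
# as dividing by 1000 and truncating.
sub chop_digits {
    my ($kb) = @_;
    return int($kb / 1000) . 'M';
}

# Binary scaling: divide by 1024 and round to the nearest whole number.
sub scale_1024 {
    my ($kb) = @_;
    return sprintf('%.0fM', $kb / 1024);
}

# Columns: raw KB, digit-chopped, scaled by 1024.
printf "%-8s %-6s %s\n", $_, chop_digits($_), scale_1024($_)
    for 814811, 245407, 424519;
```

The middle column matches the regex output above (814M, 245M, 424M) and the last column matches the df -h sample (796M, 240M, 415M).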
    Maybe I can think of something more sensible in the morning.

    upday:   Ok, this is simpler (102 chars):
    perl -pe's/(^|\s)((\d*?)\d)(\d)\d\d(\d{3})?(?=\s)/$1.($5&&" ").($3?" $2":"$2.$4").($5?"G":"M")/sge'
    It still preserves numeric alignment with the original text, eg,
    $ echo \ 1234 12 12345 123456 1234567 12345678 123456789 | perl -pe'. . .'
    1.2M 12 12M 123M 1.2G 12G 123G
    and rounding adds 22 (to 124 chars):
    perl -pe's/(^|\s)((\d*?)\d)(\d)(\d)\d(\d{3})?(?=\s)/$1.($6&&" ").($3?" ".($2+($4>4)):$2+($4+($5>4))/10).($6?"G":"M")/ges'
    Update, the second:

    Here's a funny format one at 79 chars:
    perl -pe's/((\d*?)\d)(\d)\d\d(\d{3})? /$1.($2?" ":".$3").($4?" G ":"M ")/ge'
    which produces this:
    $ echo ' 1234 12 123 12345 123456 7654321 12345678 123456789 ' | perl -pe'...
    1.2M 12 123 12 M 123 M 7.6 G 12 G 123 G
    And an extensible one at 114 chars (got to be ready for those terabyte-sized disks):
    perl -pe's!(^|\s)((\d*?)\d)(\d)\d\d((\d{3})*)(?=\s)!$"x($x=length$5).$1.($3?" $2":"$2.$4").qw(K M G T)[$x/3]!ges'
    Which does:
    $ echo \ 1234 1234567 123456789012 1234567890123 123456789012345 | perl ...
    1.2K 1.2M 123G 1.2T 123T
    Ok, here's a real one at 117 chars that does 1024-type m's and g's and t's:
    perl -pe's!\d{4,}(?=\s)!sprintf" "x($x=(-4+length$&)/3).($x*3%3?" %.0f":"%.1f").qw(M G T)[$x],$&/1024**int$x+1!ge'
    Which gets:
    $ echo ' 321 4321 654321 87654321 9876543210 98765432101 987654321012 '|perl...
    321 4.2M 639M 84G 9.2T 92T 920T
    This ignores several problems:  1) Before 5.6, perl expanded   `qw(...)'   into   `split" ",'...''   (so you couldn't   [index]  it );  2) There's no protection against filenames or whatever which end with a string of 4 digits;  and 3) There's no provision for catching a number at the end of the line.

    Adding these back in, yields 132 chars:
    perl -pe's!(^|\s)(\d{4,})(?=\s)!sprintf"$1%s".(($x=-4+length$2)%3?" %.0f":"%.1f").(qw(M G T))[$x/=3]," "x$x,$2/1024**int$x+1!ges'

    (BTW, what comes after Terabytes?)

    Update the last:   Updated the above to save a few strokes (and round properly on the 1024 one (changed %d -> %.0f)).

      Shorter silliness (read: cheating). 54 characters.

      df -k|perl -pe's/\d{3} / M  /g;s/(\d+)(\d)\d\d M/ $1.$2 G/g'

      If we didn't reinvent the wheel, we wouldn't have rollerblades.

Re: A little golf anyone?
by Aristotle (Chancellor) on Oct 31, 2002 at 00:56 UTC
    101 with nearly proper output, besides printing the fraction for K and M as well.

    df -k|perl -pale'map{my$i;$_=sprintf"%6.1f%s",$_/1024,qw(K M G)[++$i]while$_>1024}@F[1..3];$_=join"\t",@F'

    Update: another 4 off, so we're at 97.

    df -k|perl -pale'map{$_=sprintf"%6.1f%s",$_/1024,qw(K M G)[++$"]while$_>1024}@F[1..3];$"="\t";$_="@F"'

    Update: and an exemplary solution using awk (sed is not the tool for this job):

    df -k|awk '{OFS="\t";CONVFMT="%6.1f";for(i=2;i<5;i++){while($i>1024)$i=$i/1024}print}'

    It's missing several things, so I didn't bother counting characters. On the other hand, for this small job, especially if used frequently, I'd prefer something like this over the Perl solution simply for efficiency - awk is pretty nimble compared to perl, which is important when working interactively.

    Makeshifts last the longest.

Re: A little golf anyone?
by robartes (Priest) on Oct 30, 2002 at 20:59 UTC
    Here's one to get people started. It weighs in at 162 characters (down from 164) and is probably disqualified for not pretty-printing the output, but it does what you ask:
    df -k|perl -ne'print join"\t",map{/^\d+$/?($_=$_/1024)<1?p($_*1024)."K":($_=$_/1024)<1?p($_*1024)."M":p($_)."G":$_}split/\s+/;print"\n";sub p{sprintf"%.2f",shift}'


    Update: Killed two spaces.

Re: A little golf anyone?
by sauoq (Abbot) on Oct 31, 2002 at 02:34 UTC

    Why go to all that trouble? The GNU fileutils package compiles neatly on Solaris 8.

    "My two cents aren't worth a dime.";
      Pfff, party pooper. :^) C'mon, you know you want to. ;-)

      Makeshifts last the longest.

      I'm aware that GNU fileutils compiles cleanly on Solaris 8. That, however, implies that one has a compiler, or that one is willing and/or able to add packages to the system.

      The fact of the matter is, I'm trying to apply a global configuration to a lot of machines, most of which do not have more than 540 MB of internal disk space.

      These machines have perl. They do not have gcc or forte, it would take a long time to compile these packages, and it simply isn't reasonable to go and install a package to do that. It is much easier and more practical to insert a

       du | perl
      ism into my .profile.

      I wanted also to say that this reminds me a lot of the typical irc #perl channels.

      <user> how can i accomplish X?
      <chanop> why on earth would you want to accomplish X? you can do Z instead!
      <user> because Z isn't relevant at all to what I want to do.
      this kind of response is just rampant on IRC and on mailing lists, and I even see it here. It bothers me because these answers pollute search engines like Google. When somebody is searching for an answer on how to, say, format a file into six columns with perl, they're going to find ten answers from the chanop/listmaven type saying "just use
      rather than explaining what they need to do." Far more effective than telling somebody something else they can do is to answer the question and point them in another direction that would help.

      I've bitched about this in the past, but it's been a year or so, so I'm inclined to bring it up again.

      brother dep

      Laziness, Impatience, Hubris, and Generosity.

        You forgot the rest of that irc conversation:

        <chanop> Why isn't Z relevant?
        <user> Because Y.
        <chanop> But Y isn't a good reason. Let me explain... [dozens of lines of explanation snipped]
        <user> Oh. Maybe I should try Z...
        At least, that's how it used to be on OPN/#perl when it was still relatively small and friendly.

        So, you were bothered by my reply. I'm left wondering why you were bothered that I asked: "Why go to all that trouble?" and then added "The GNU fileutils package compiles neatly on Solaris 8." as an explanation. Perhaps you feel underestimated. Perhaps you think that when you ask a question I should assume that you know what you are doing and not question your enlightened approach. You might even be right, but it's a rather ungracious attitude to have when you are seeking help.

        You complained about the answer I gave you when, in fact, my response wasn't intended to be an answer at all, but a probe for more information. Since you seem taken aback by it, I'll explain why I questioned you. In your original post you wrote:

        (the principle being, if you're going to do it a bunch of times, you want it to be fast)

        You want it to be fast. Fine, we'll ignore that you want to start up a perl interpreter for this... But why ask for a golfed solution? Short doesn't mean fast and the shortest solution for many problems is far slower than the fastest. Asking for the two things together is kind of, uh, green. Because you want it to be part of a shell alias isn't, by itself, a good justification for a golfed solution either. Shortish may be nice, maybe, but why golfed? Admittedly, it's ugly in tcsh but if you are using tcsh you are probably used to doing ugly things similar to:

        alias blahblah 'du | perl -e '\''\\ ... \\ ... \\ ... \\ '\'

        If, on the other hand, you did decide to use GNU du/df binaries, they weigh in at about 45k (stripped) each. I don't think they rely on special libraries, so you wouldn't have to install a package, you'd probably just have to copy the binaries around. (If you really have a lot of machines, NFS would probably make life easier.) Not only are they fast, they provide additional functionality and unlike shell aliases, they can be reliably called from scripts.

        Shell aliases are useful for many things but I wouldn't use one for this because it seems to me like too much complexity for too little gain. I would either use the GNU versions or get used to reading long numbers. (In fact, I have many gnu tools in my ~/bin which is automounted on most of the machines I work on but I tend to just use -k with df/du even though our arrays have 150Gb+. It comes pretty easily after a bit.)

        If you really insist on using aliases, I recommend against calling them "du" or "df" because you'll get frustrated all over again when you regularly find the need to use the commands with other switches like df -e or du -L.

        Back to the point...

        In your sig, you list "Generosity" along with the three great virtues of a programmer. I don't know that it really takes generosity to be a great programmer but it is certainly a quality worth having. Another one is humility. Humility enables us to truly accept the generosity of others.

        "My two cents aren't worth a dime.";
Wildly over-engineered solution to df formatting
by blssu (Pilgrim) on Oct 31, 2002 at 18:17 UTC

    Ok, this isn't golfing, but it's kinda cool anyways. I thought it would be nice to have a generalized number formatter that automagically recognized columns of numbers and reformatted them to KB, MB, etc. (BTW, the systems that inter-mix unit scales are very unfriendly. Outlook does this for example -- it intermixes bytes, kb and mb -- which makes it almost impossible to see relative sizes by quickly scanning a list.)

    Here are a few examples of the automagical number formatter:

    conch:~% du -sk * | perl -MFoo -e k2m
    0.01M
    6.88M Mail
    0.18M Sent
    0.00M bin
    0.79M public_html

    conch:~% ls -l | perl -MFoo -e b2k
    total 394
    -rw-r--r-- 1 fox guest 6.25K Oct 31 12:00
    drwx------ 2 fox guest 0.50K Oct 31 12:00 Mail
    -rw------- 1 fox guest 169.05K Oct 30 15:33 Sent
    drwx------ 2 fox guest 0.50K Aug 3 2000 bin
    drwxr-xr-x 4 fox guest 0.50K Jul 17 17:38 public_html

    conch:~% df -k | perl -MFoo -e k2g
    Filesystem 1024-blocks Used Avail Capacity Mounted on
    /dev/sd0a 0.06G 0.02G 0.04G 35% /
    /dev/sd0h 1.89G 1.52G 0.28G 85% /usr
    /dev/sd0g 1.89G 0.87G 0.93G 48% /var
    /dev/sd0f 1.89G 0.87G 0.93G 48% /usr/guest1
    /dev/sd0e 0.71G 0.38G 0.29G 56% /usr/msen
    mfs:25 63471 116 60181 0% /tmp

    Here's the module itself. It's too long and messy. Sorry about that. It's also not smart enough to handle columns of numbers that don't quite line up. (OT: Wouldn't it be fun if all our tools could generate XML? And our terminals could automatically format it?)

    package Foo;

    use strict;
    require Exporter;
    use vars qw(@ISA @EXPORT);
    @ISA = qw(Exporter);
    @EXPORT = qw(f b2k k2m m2g b2m k2g b2g);

    use Text::Tabs;

    # Scale a number, add thousands separators and append the unit suffix.
    sub format_number {
        my($n, $scale, $suffix) = @_;

        $n = ($scale > 1) ? sprintf('%.2f', $n / $scale)
                          : sprintf('%d', $n);

        while ($n =~ s/^(\d+)(\d{3})/$1,$2/) { }

        return $n . $suffix;
    }

    sub pad_left_justified {
        my($text, $nums, $i, $format) = @_;

        if (defined $nums->[$i] && $nums->[$i]{-start} == $format->{-pos}) {
            my $padding = $format->{-width} - length($text->[$i]) - 1;
            if ($padding > 0) {
                $text->[$i] = (' ' x $padding) . $text->[$i]
            }
            $text->[$i] .= ' ';
        }
    }

    sub format_left_justified {
        my($text, $nums, $i, $format, $scale, $suffix) = @_;

        if (defined $nums->[$i] && $nums->[$i]{-start} == $format->{-pos}) {
            if ($i < @{$text} - 1) {
                # Strip the filler after the number and widen the column
                # if number plus filler is longer than the current width.
                $text->[$i + 1] =~ s/^(\s*)//;
                if (length($text->[$i]) + length($1) > $format->{-width}) {
                    $format->{-width} = length($text->[$i]) + length($1)
                }
            }
            $text->[$i] = format_number($text->[$i], $scale, $suffix)
        }
    }

    sub pad_right_justified {
        my($text, $nums, $i, $format) = @_;

        if (defined $nums->[$i] && $nums->[$i]{-end} == $format->{-pos}) {
            my $padding = $format->{-width} - length($text->[$i]);
            if ($padding > 0) {
                $text->[$i] = (' ' x $padding) . $text->[$i]
            }
        }
    }

    sub format_right_justified {
        my($text, $nums, $i, $format, $scale, $suffix) = @_;

        if (defined $nums->[$i] && $nums->[$i]{-end} == $format->{-pos}) {
            if ($i > 0) {
                # Strip the filler before the number and widen the column
                # if number plus filler is longer than the current width.
                $text->[$i - 1] =~ s/(\s*)$//;
                if (length($text->[$i]) + length($1) > $format->{-width}) {
                    $format->{-width} = length($text->[$i]) + length($1)
                }
            }
            $text->[$i] = format_number($text->[$i], $scale, $suffix)
        }
    }

    # Return the (key, count) pair with the highest count.
    sub most_popular(\%) {
        my($votes) = @_;

        my @ranked = sort { $b->[1] <=> $a->[1] }
                     map { [ $_ => $votes->{$_} ] } keys %{$votes};

        return (@{$ranked[0]})
    }

    sub filter_columns {
        my($scale, $suffix) = @_;

        my @row_text = ( );
        my @row_nums = ( );
        my %field_count_votes = ( );

        # Find all the numbers in the input and group
        # them into columns. Non-numeric text is treated
        # as filler between the numeric columns.

        while (<>) {
            chomp;
            $_ = expand($_);

            my @text = split(/\b(\d+)\b/);
            my $field_count = 0;
            my $pos = 0;
            my @nums = ( );

            foreach my $text (@text) {
                my $length = length($text);
                my $end = $pos + $length;

                if ($text =~ /^\d+$/) {
                    ++$field_count;
                    push @nums, { -start => $pos,
                                  -end => $end,
                                  -length => $length };
                }
                else {
                    push @nums, undef;
                }

                $pos = $end;
            }

            push @row_text, [ @text ];
            push @row_nums, [ @nums ];

            ++$field_count_votes{$field_count};
        }

        # Reverse engineer the sprintf formats and put the
        # column re-formatting subs into @format.

        my @format = ( );
        my ($popular_field_count) = most_popular(%field_count_votes);

        foreach my $nums (@row_nums) {
            my $field_count = 0;

            foreach my $cell (@{$nums}) {
                if (defined $cell) {
                    ++$field_count;
                }
            }

            next unless ($field_count == $popular_field_count);

            my $i = 0;
            foreach my $cell (@{$nums}) {
                if (defined $cell) {
                    ++$format[$i]{-start}{$cell->{-start}};
                    ++$format[$i]{-end}{$cell->{-end}};
                    if (!$format[$i]{-max_length} ||
                        $cell->{-length} > $format[$i]{-max_length}) {
                        $format[$i]{-max_length} = $cell->{-length}
                    }
                }
                ++$i;
            }
        }

        foreach my $col (@format) {
            next unless (defined $col);

            if ($col->{-max_length} > 3) {
                my ($start, $start_count) = most_popular(%{$col->{-start}});
                my ($end, $end_count) = most_popular(%{$col->{-end}});

                if ($end_count >= $start_count) {
                    $col = { -format => \&format_right_justified,
                             -pad => \&pad_right_justified,
                             -type => 'right',
                             -pos => $end,
                             -count => $end_count,
                             -width => $col->{-max_length} };
                }
                else {
                    $col = { -format => \&format_left_justified,
                             -pad => \&pad_left_justified,
                             -type => 'left',
                             -pos => $start,
                             -count => $start_count,
                             -width => $col->{-max_length} };
                }
            }

            if (!$col->{-format} || $col->{-count} < @row_nums / 2) {
                $col = undef;
            }
        }

        # Scale and format the columns if the row matches
        # the format, otherwise leave it alone. (Avoid formatting
        # things like column headers.)

        for (my $row = 0; $row < @row_text; ++$row) {
            my $i = 0;
            foreach my $col (@format) {
                if (defined $col) {
                    $col->{-format}->($row_text[$row], $row_nums[$row], $i,
                                      $col, $scale, $suffix);
                }
                ++$i;
            }
        }

        # Pad columns and print all the rows.

        for (my $row = 0; $row < @row_text; ++$row) {
            my $i = 0;
            foreach my $col (@format) {
                if (defined $col) {
                    $col->{-pad}->($row_text[$row], $row_nums[$row], $i, $col);
                }
                ++$i;
            }
            print join('', @{$row_text[$row]}), "\n";
        }
    }

    sub f   { filter_columns(1, '') }
    sub b2k { filter_columns(1024, 'K') }
    sub k2m { filter_columns(1024, 'M') }
    sub m2g { filter_columns(1024, 'G') }
    sub b2m { filter_columns(1024*1024, 'M') }
    sub k2g { filter_columns(1024*1024, 'G') }
    sub b2g { filter_columns(1024*1024*1024, 'G') }

    1;

      It may be "over-engineered", but this is a very readable format for presenting wildly varying numbers.

      Thank you.

Re: A little golf anyone?
by petral (Curate) on Nov 07, 2002 at 19:55 UTC
    OK, this is my final answer:
    perl -pe'
      s!\d{4,}(?=\s)!
        sprintf$"x($x=-4+length$&)
          ."%3.*f"
          .qw(M G T)[$x/3],
          $x%3<1,
          $&/1024**int$x/3+1
      !ge
    '
    which is 104 chars not counting the decorative white space.
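
The "%3.*f" in that one-liner takes its precision from the argument list, so a single format string can print either one decimal place or none. A spelled-out sketch of the same arithmetic follows, leaving out the leading padding; the sub name scale_kb is made up, and the input is assumed to be a digit string of 4 to 12 digits counting 1024-byte blocks.

```perl
use strict;
use warnings;

# Scale a KB count using only its digit count to pick the unit and the
# number of decimal places.
sub scale_kb {
    my ($kb) = @_;
    my $x     = length($kb) - 4;        # digits beyond the first four
    my $group = int($x / 3);            # 0 => M, 1 => G, 2 => T
    my $unit  = (qw(M G T))[$group];
    # One decimal place only when a single digit will be left in front
    # of the decimal point (4, 7 or 10 digits of input).
    my $precision = $x % 3 == 0 ? 1 : 0;
    # "%3.*f": minimum width 3, precision taken from the next argument.
    return sprintf('%3.*f%s', $precision, $kb / 1024 ** ($group + 1), $unit);
}

print scale_kb($_), "\n" for qw(4321 654321 87654321 9876543210);
```

Picking the precision from the digit count rather than from the scaled value is what keeps the golfed version short, at the cost of odd results for inputs such as 1000 to 1023, which still scale to less than 1M.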


Node Type: perlmeditation [id://209212]
Approved by Tanalis
Front-paged by robartes