PerlMonks  

Re^2: Algorithm to reduce the weight of a collection of bags

by ibm1620 (Pilgrim)
on Jul 05, 2022 at 15:44 UTC (#11145292)


in reply to Re: Algorithm to reduce the weight of a collection of bags
in thread Algorithm to reduce the weight of a collection of bags

I probably shouldn't have talked as much about table and data formatting. I'm writing a simple, dumb tool to browse/glance at CSV files from the terminal as quickly and effortlessly as possible. The reason I only need a generic solution (aside from its being more fun ;-) is this: My terminal is 158 characters wide; my CSV files have relatively few fields (under 15); the columns with the widest data tend to be unstructured text, where the widest cells are generally much wider than the average cell. Currently I'm browsing bank transaction files, where the Description column is the widest, and is the safest to truncate without chopping off "important" information. I indicate truncation by replacing the final character with a tilde, signaling data loss. The likelihood of my wanting to browse a CSV file having 30 columns, or on an 80-character terminal, where I'd have to start narrowing the date and amount columns, is fairly remote.
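
A minimal sketch of the tilde convention described above. The helper name `fit_cell` and the sample strings are assumptions for illustration, not taken from the actual tool:

```perl
use strict;
use warnings;

# Hypothetical helper: fit $text into $width characters. If the text is
# too long, keep the first $width - 1 characters and end with a tilde
# to signal that data was cut off.
sub fit_cell {
    my ( $text, $width ) = @_;
    return ''    if $width <= 0;               # no room at all
    return $text if length($text) <= $width;   # already fits, leave as-is
    return substr( $text, 0, $width - 1 ) . '~';
}

print fit_cell( 'POS PURCHASE GROCERY STORE #1234', 16 ), "\n";
print fit_cell( '2022-07-05', 16 ), "\n";
```

The cell that already fits is returned unchanged, so a tilde only ever appears where something was actually lost.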

I looked at Text::ANSITable but it did so much more than I needed, and didn't appear to address fitting the table width to the terminal.

Re^3: Algorithm to reduce the weight of a collection of bags
by haukex (Archbishop) on Jul 06, 2022 at 21:14 UTC
    My terminal is 158 characters wide; my CSV files have relatively few fields (under 15); the columns with the widest data tend to be unstructured text, where the widest cells are generally much wider than the average cell. Currently I'm browsing bank transaction files, where the Description column is the widest, and is the safest to truncate without chopping off "important" information.

    Thanks for the context. In that case I personally would probably take the pragmatic route and attempt to identify those text columns based on their width, and truncate those, while not truncating any columns under a certain length to make sure I don't truncate amounts or dates (perhaps even trying to identify such "important" column data types with regexes). But as you said, since this kind of thing is also fun, I understand wanting a more generic solution - at the moment I just don't have good tips for that. As to the existing modules, I didn't have the time to look through all of them to see if maybe there is one that already limits its output width to the terminal width "intelligently" - but perhaps another solution would be to implement the truncation yourself before passing the data off to a module for the output.
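
One possible reading of that pragmatic route, as a rough sketch only: treat a column as truncatable when it is wider than some threshold and does not consist entirely of date-like or amount-like cells. The threshold, both regexes, and the helper name `truncatable_columns` are assumptions made for illustration; real bank exports vary.

```perl
use strict;
use warnings;

# Assumed threshold: columns no wider than this are never truncated.
my $MIN_TRUNCATABLE = 12;

# Illustrative patterns only, not a complete date or amount grammar.
my $DATE_RE   = qr/^\d{4}-\d{2}-\d{2}$|^\d{1,2}\/\d{1,2}\/\d{2,4}$/;
my $AMOUNT_RE = qr/^[-+]?\$?\d[\d,]*(?:\.\d+)?$/;

# Hypothetical helper: return the indexes of columns that look safe to
# truncate. $rows is an array ref of array refs (data rows, no header).
sub truncatable_columns {
    my ($rows) = @_;
    return () unless @$rows;

    my $ncols = 0;
    for my $row (@$rows) { $ncols = @$row if @$row > $ncols }

    my @safe;
    COL: for my $c ( 0 .. $ncols - 1 ) {
        my $max            = 0;
        my $all_structured = 1;
        for my $row (@$rows) {
            my $cell = $row->[$c] // '';
            $max = length $cell if length($cell) > $max;
            $all_structured = 0
                unless $cell =~ $DATE_RE || $cell =~ $AMOUNT_RE;
        }
        next COL if $max <= $MIN_TRUNCATABLE;    # short column: leave alone
        next COL if $all_structured;             # dates/amounts: leave alone
        push @safe, $c;
    }
    return @safe;
}

my @rows = (
    [ '2022-07-01', 'POS PURCHASE GROCERY STORE #1234', '-54.20' ],
    [ '2022-07-02', 'ATM WITHDRAWAL',                   '-100.00' ],
);
print join( ',', truncatable_columns( \@rows ) ), "\n";
```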

      This is the truncation (or weight-reduction) algorithm as it now stands.
      #!/usr/bin/env perl
      use v5.36; # implies warnings
      no warnings q/experimental::for_list/;
      no warnings q/experimental::builtin/;
      use builtin qw/indexed/;
      use List::Util qw/sum/;

      my $target_weight = shift // die 'need target_weight';
      my @weights = ( 20, 3, 25, 10, 3, 24, 25 );

      say "Before:\n" . display( \@weights, $target_weight );
      shrink( \@weights, $target_weight );
      say "After:\n" . display( \@weights, $target_weight );
      die if sum(@weights) != $target_weight;

      sub shrink ( $bags, $target_weight ) {
          my $curr_weight = sum @$bags;
          return if ( $curr_weight <= $target_weight ); # no shrink req'd
          my @refs = sort { ${$b} <=> ${$a} } map \$_, @$bags;
          BAG: for my ($i, $ref) ( indexed @refs ) {
              my $next_wt = $i < $#refs ? ${$refs[$i+1]} : 0;
              my $drop = $$ref - $next_wt;
              my $lowered_weight = $curr_weight - $drop * ( $i + 1 );
              if ( $lowered_weight >= $target_weight ) {
                  for ( 0 .. $i ) {
                      ${$refs[ $_ ]} -= $drop;
                  }
                  $curr_weight = $lowered_weight;
              }
              else {
                  use integer;
                  my $target_loss = $curr_weight - $target_weight;
                  my $div = $target_loss / ( 1 + $i );
                  my $rem = $target_loss % ( 1 + $i );
                  for ( reverse 0 .. $i ) {
                      ${$refs[ $_ ]} -= $div + ( $rem-- > 0 ? 1 : 0 );
                  }
                  last BAG;
              }
          }
      }

      sub display ($aref, $target) {
          my $r = '';
          for my ( $i, $wt ) ( indexed @$aref ) {
              $r .= sprintf " %2s: {%s} (%d)\n", "#$i", ( '=' x $wt ), $wt;
          }
          $r .= sprintf "Weight %d, target=%d\n", sum(@$aref), $target;
          return $r;
      }
      Note that, having brought the four highest weights down to 16 but still needing to trim one more character, it took it from the bag that was originally the lightest of the four (#0), thus never violating the original ranking.
      $ shrink 79
      Before:
       #0: {====================} (20)
       #1: {===} (3)
       #2: {=========================} (25)
       #3: {==========} (10)
       #4: {===} (3)
       #5: {========================} (24)
       #6: {=========================} (25)
      Weight 110, target=79

      After:
       #0: {===============} (15)
       #1: {===} (3)
       #2: {================} (16)
       #3: {==========} (10)
       #4: {===} (3)
       #5: {================} (16)
       #6: {================} (16)
      Weight 79, target=79
      Plugging this into my simple-minded CSV columnizer gave me exactly what I wanted. It remains to be seen if I'll ever want to apply more sophisticated, data-aware methods of narrowing. :-)
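
For readers wondering how the "bags" and the target might be derived in a columnizer, here is a hedged sketch: the weights are the natural (maximum) cell widths per column, and the target is whatever is left of the terminal width after the column separators. The helper name `width_budget`, the `' | '` separator, and the sample rows are assumptions; only the 158-character terminal width comes from the thread.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Hypothetical helper: from parsed rows, derive the bag weights (widest
# cell per column) and the target weight (characters left for data once
# the separators between columns are accounted for).
sub width_budget {
    my ( $rows, $term_width, $sep ) = @_;
    my @widths;
    for my $row (@$rows) {
        for my $c ( 0 .. $#$row ) {
            $widths[$c] = max( $widths[$c] // 0, length( $row->[$c] // '' ) );
        }
    }
    # One separator between each pair of adjacent columns.
    my $n_seps = @widths ? $#widths : 0;
    my $target = $term_width - length($sep) * $n_seps;
    return ( \@widths, $target );
}

my @rows = (
    [ '2022-07-01', 'POS PURCHASE GROCERY STORE #1234', '-54.20' ],
    [ '2022-07-02', 'ATM WITHDRAWAL',                   '-100.00' ],
);
my ( $widths, $target ) = width_budget( \@rows, 158, ' | ' );
print "widths=@$widths target=$target\n";
```

The resulting widths and target could then be handed to a routine like `shrink` above, which only changes anything when the widths sum to more than the target.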
Re^3: Algorithm to reduce the weight of a collection of bags
by Fletch (Bishop) on Jul 06, 2022 at 21:56 UTC

    It's not Perl (unfortunately) but it's pretty spiffy and already works for what you're probably wanting to do: visidata.

    Edit: iOS smart quotes strike again.

    The cake is a lie.
    The cake is a lie.
    The cake is a lie.

      Cool! (Link should be https://www.visidata.org)
      It is quite spiffy, and it truncates in more or less the same way I was trying to do (furthermore, it supports left/right scrolling, and stops narrowing the columns beyond a certain reasonable limit). Thanks for the info!
