PerlMonks  

Re: Algorithm to reduce the weight of a collection of bags

by haukex (Archbishop)
on Jul 04, 2022 at 20:34 UTC


in reply to Algorithm to reduce the weight of a collection of bags

First of all, I personally would not rename "columns" to "bags" and "widths" to "weights". Anyway, could you tell us some more about the constraints, specifically on the number of columns and the possible window widths? Though coming up with a completely generic algorithm always seems more fun, I'm also for the pragmatic approach. For example, are the data types of the columns known? Because I doubt it would be a good idea to cut off the "date" and "amount" columns. There are also a bunch of table formatting modules on the CPAN, have you looked at those? Update: For example, Text::Table::Any lists 27 possible backends, and I'm sure there are plenty more. "Format an ASCII table" seems to be a favorite wheel to reinvent (I'm guilty of that too).

Re^2: Algorithm to reduce the weight of a collection of bags
by ibm1620 (Hermit) on Jul 05, 2022 at 15:44 UTC
    I probably shouldn't have talked as much about table and data formatting. I'm writing a simple, dumb tool to browse/glance at CSV files from the terminal as quickly and effortlessly as possible. The reason I only need a generic solution (aside from its being more fun ;-) is this: My terminal is 158 characters wide; my CSV files have relatively few fields (under 15); the columns with the widest data tend to be unstructured text, where the widest cells are generally much wider than the average cell. Currently I'm browsing bank transaction files, where the Description column is the widest, and is the safest to truncate without chopping off "important" information. I indicate truncation by replacing the final character with a tilde, signaling data loss. The likelihood of my wanting to browse a CSV file having 30 columns, or on a terminal with 80 characters, where I'd have to start narrowing the date and amount columns, is fairly remote.
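The tilde convention described above could be sketched roughly as follows. This is only an illustration: `truncate_cell` is a hypothetical helper name, not something from the poster's tool, and it assumes one terminal column per character (no wide or combining characters).

```perl
use strict;
use warnings;

# Hypothetical helper: cut $text to at most $width characters and, if
# anything was cut, make the last visible character a tilde.
sub truncate_cell {
    my ($text, $width) = @_;
    return $text if length($text) <= $width;    # fits, leave untouched
    return ''    if $width < 1;                 # no room even for the marker
    return substr($text, 0, $width - 1) . '~';
}

print truncate_cell('GROCERY STORE #123', 10), "\n";    # prints "GROCERY S~"
print truncate_cell('PAYROLL', 10), "\n";               # prints "PAYROLL"
```

Keeping the marker inside the allotted width means the column never grows past the width the shrink step assigned to it.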

    I looked at Text::ANSITable but it did so much more than I needed, and didn't appear to address fitting the table width to the terminal.
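Fitting a table to the terminal needs two numbers first: the natural width of each column and the characters actually available for cell text. A minimal sketch, assuming the rows are already parsed into array refs and that columns are joined with a fixed separator (`column_widths`, `width_budget` and the `' | '` separator are illustrative choices, not taken from any module):

```perl
use strict;
use warnings;
use List::Util qw(max sum);

# Widest cell in each column; each row is an array ref of strings.
sub column_widths {
    my (@rows) = @_;
    my @widths;
    for my $row (@rows) {
        for my $i (0 .. $#$row) {
            my $len = length($row->[$i] // '');
            $widths[$i] = max($widths[$i] // 0, $len);
        }
    }
    return @widths;
}

# Characters left for cell text once the separators are accounted for.
sub width_budget {
    my ($terminal_width, $n_columns, $sep) = @_;
    return $terminal_width - length($sep) * ($n_columns - 1);
}

my @rows = (
    [ '2022-07-01', 'COFFEE SHOP 0042 SPRINGFIELD', '-4.50' ],
    [ '2022-07-02', 'PAYROLL',                      '1500.00' ],
);
my @widths = column_widths(@rows);
my $budget = width_budget(158, scalar @widths, ' | ');
printf "widths=%s total=%d budget=%d\n", join(',', @widths), sum(@widths), $budget;
```

Only when the total exceeds the budget does any narrowing have to happen.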

      My terminal is 158 characters wide; my CSV files have relatively few fields (under 15); the columns with the widest data tend to be unstructured text, where the widest cells are generally much wider than the average cell. Currently I'm browsing bank transaction files, where the Description column is the widest, and is the safest to truncate without chopping off "important" information.

      Thanks for the context. In that case I personally would probably take the pragmatic route and attempt to identify those text columns based on their width, and truncate those, while not truncating any columns under a certain length to make sure I don't truncate amounts or dates (perhaps even trying to identify such "important" column data types with regexes). But as you said, since this kind of thing is also fun, I understand wanting a more generic solution - at the moment I just don't have good tips for that. As to the existing modules, I didn't have the time to look through all of them to see if maybe there is one that already limits its output width to the terminal width "intelligently" - but perhaps another solution would be to implement the truncation yourself before passing the data off to a module for the output.
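The pragmatic route suggested above might look something like this. It is a sketch under assumptions: the two regexes are guesses at what "date" and "amount" cells look like, the minimum width is arbitrary, and `truncatable_columns` is a made-up name.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Assumed patterns for cells that should never be cut; adjust to the data.
my $DATE_RE   = qr/^(?:\d{4}-\d{2}-\d{2}|\d{1,2}\/\d{1,2}\/\d{2,4})$/;
my $AMOUNT_RE = qr/^[-+]?\$?\d[\d,]*(?:\.\d+)?$/;

# Return the indices of columns that are wide enough to be worth cutting
# and that contain at least one cell that is neither a date nor an amount.
sub truncatable_columns {
    my ($rows, $min_width) = @_;
    my (@widest, @freeform);
    for my $row (@$rows) {
        for my $i (0 .. $#$row) {
            my $cell = $row->[$i] // '';
            $widest[$i] = max($widest[$i] // 0, length($cell));
            $freeform[$i] ||= $cell ne ''
                && $cell !~ $DATE_RE
                && $cell !~ $AMOUNT_RE;
        }
    }
    return grep { $widest[$_] > $min_width && $freeform[$_] } 0 .. $#widest;
}

my @rows = (
    [ '2022-07-01', 'COFFEE SHOP 0042 SPRINGFIELD', 'DEBIT',  '-4.50' ],
    [ '2022-07-02', 'PAYROLL',                      'CREDIT', '1,500.00' ],
);
print join(',', truncatable_columns(\@rows, 12)), "\n";    # prints "1"
```

Only the widths of the returned columns would then be handed to the narrowing step; the rest keep their full width.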

        This is the truncation (or weight-reduction) algorithm as it now stands.
        #!/usr/bin/env perl
        use v5.36;    # implies warnings
        no warnings q/experimental::for_list/;
        no warnings q/experimental::builtin/;
        use builtin qw/indexed/;
        use List::Util qw/sum/;

        my $target_weight = shift // die 'need target_weight';
        my @weights = ( 20, 3, 25, 10, 3, 24, 25 );
        say "Before:\n" . display( \@weights, $target_weight );
        shrink( \@weights, $target_weight );
        say "After:\n" . display( \@weights, $target_weight );
        die if sum(@weights) != $target_weight;

        sub shrink ( $bags, $target_weight ) {
            my $curr_weight = sum @$bags;
            return if ( $curr_weight <= $target_weight );    # no shrink req'd
            my @refs = sort { ${$b} <=> ${$a} } map \$_, @$bags;
          BAG:
            for my ($i, $ref) ( indexed @refs ) {
                my $next_wt = $i < $#refs ? ${$refs[$i+1]} : 0;
                my $drop = $$ref - $next_wt;
                my $lowered_weight = $curr_weight - $drop * ( $i + 1 );
                if ( $lowered_weight >= $target_weight ) {
                    for ( 0 .. $i ) {
                        ${$refs[ $_ ]} -= $drop;
                    }
                    $curr_weight = $lowered_weight;
                }
                else {
                    use integer;
                    my $target_loss = $curr_weight - $target_weight;
                    my $div = $target_loss / ( 1 + $i );
                    my $rem = $target_loss % ( 1 + $i );
                    for ( reverse 0 .. $i ) {
                        ${$refs[ $_ ]} -= $div + ( $rem-- > 0 ? 1 : 0 );
                    }
                    last BAG;
                }
            }
        }

        sub display ($aref, $target) {
            my $r = '';
            for my ( $i, $wt ) ( indexed @$aref ) {
                $r .= sprintf " %2s: {%s} (%d)\n", "#$i", ( '=' x $wt ), $wt;
            }
            $r .= sprintf "Weight %d, target=%d\n", sum(@$aref), $target;
            return $r;
        }
        Note that, having brought the four highest weights down to 16 but still needing to trim one more character, it took it from the bag that was originally the lightest of the four (#0), thus never violating the original ranking.
        $ shrink 79
        Before:
         #0: {====================} (20)
         #1: {===} (3)
         #2: {=========================} (25)
         #3: {==========} (10)
         #4: {===} (3)
         #5: {========================} (24)
         #6: {=========================} (25)
        Weight 110, target=79

        After:
         #0: {===============} (15)
         #1: {===} (3)
         #2: {================} (16)
         #3: {==========} (10)
         #4: {===} (3)
         #5: {================} (16)
         #6: {================} (16)
        Weight 79, target=79
        Plugging this into my simple-minded CSV columnizer gave me exactly what I wanted. It remains to be seen if I'll ever want to apply more sophisticated, data-aware methods of narrowing. :-)
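One way to read the shrink routine is as a search for a common cap: columns narrower than the cap keep their width, wider ones are cut down to it, and leftover characters go to the columns that were originally widest. The sketch below is an alternative formulation of that idea, not the poster's code. `cap_widths` is a hypothetical name, it assumes non-negative integer widths and target, and it returns a new list rather than modifying in place. For the example weights and a target of 79 it gives the same widths as the run shown above; when several columns tie for widest, it may hand out the leftover characters differently than the original.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical helper: clamp widths so that they sum to $target.
sub cap_widths {
    my ($widths, $target) = @_;
    my @out = @$widths;
    return @out if sum(0, @out) <= $target;    # already fits

    # Column indices from narrowest to widest (ties broken by position).
    my @order = sort { $out[$a] <=> $out[$b] or $a <=> $b } 0 .. $#out;

    # Narrow columns that fit within an equal share keep their full width.
    my $budget = $target;
    while ( $out[ $order[0] ] <= int( $budget / @order ) ) {
        $budget -= $out[ shift @order ];
    }

    # Everything left is wider than the share: cap it there, then hand any
    # leftover characters to the widest columns, one each.
    my $cap   = int( $budget / @order );
    my $extra = $budget % @order;
    $out[$_] = $cap for @order;
    $out[$_]++ for @order[ @order - $extra .. $#order ];
    return @out;
}

my @result = cap_widths( [ 20, 3, 25, 10, 3, 24, 25 ], 79 );
print "@result\n";    # prints "15 3 16 10 3 16 16"
```

Sorting once and peeling off the narrow columns takes O(n log n), which is more than enough for a table with fewer than 15 columns.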

      It’s not Perl (unfortunately) but it’s pretty spiffy and already works for what you’re probably wanting to do: visidata.

      Edit: iOS smart quotes strike again.

      The cake is a lie.
      The cake is a lie.
      The cake is a lie.

        Cool! (Link should be https://www.visidata.org)
        It is quite spiffy, and it truncates in more or less the same way I was trying to do (furthermore, it supports left/right scrolling, and stops narrowing the columns beyond a certain reasonable limit). Thanks for the info!
