Beefy Boxes and Bandwidth Generously Provided by pair Networks
Don't ask to ask, just ask
 
PerlMonks  

Can you make it nicer?

by choroba (Abbot)
on Nov 18, 2013 at 21:18 UTC ( #1063201=perlquestion: print w/ replies, xml ) Need Help??
choroba has asked for the wisdom of the Perl Monks concerning the following question:

In the old unexplored code at work, I discovered a subroutine id2path that converts a numeric id of an object to a path (where some content related to the object is stored). I did not like the way how it was implemented, and adding strictures, underlines, and whitespace did not make it much better:

sub id2path { my $id = shift; return q() unless $id; my $path; if ($id > 999_999) { $path = sprintf '%03d/%03d/%03d', $id / 1_000_000, ($id / 1_000) % 1_000, $id % 1_000; } else { $path = sprintf '%02d/%02d/%02d', $id / 10_000, ($id / 100) % 100, $id % 100; } return $path; }

I do not like strange constants and repeated code. However, the only alternative I was able to write was the following:

sub id2path_new { my $id = shift or return q(); my $chunk_length = length $id > 6 ? 3 : 2; $id = sprintf '%0' . ($chunk_length * 3) . 'd', $id; my $chunk = ".{$chunk_length}"; my $path = join '/', $id =~ / ^ (.*) ($chunk) ($chunk) $ /xg; return $path; }

I do not like it much either. Do you have any ideas?

To make your work easier, here is how I tested it:

use Test::More tests => 23; is(id2path_new($_), id2path($_), "id=$_") for q(), 0, 1, 9, 10, 99, 100, 999, 1000, 9999, 10000, 99999, 100000, 999999, 1e6, 1e7-1, 1e7, 1e8-1, 1e8, 1e9, 1e10, 1e11, 1e12;

لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ

Comment on Can you make it nicer?
Select or Download Code
Re: Can you make it nicer?
by NetWallah (Abbot) on Nov 18, 2013 at 22:03 UTC
    Take a look at How do I add commas to a number?.

    You want Slashes instead of commas - that should be trivial.

                 When in doubt, mumble; when in trouble, delegate; when in charge, ponder. -- James H. Boren

      Beauty is indeed subjective...
      لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
Re: Can you make it nicer?
by BrowserUk (Pope) on Nov 18, 2013 at 22:09 UTC

    It's a bit weird that 0 is treated the same as '':

    sub id2path3 { my $id = shift || return ''; my $path; my( $pat, $bit ) = $id < 1e6 ? ( '%06u', '..' ) : ( '%09u', '...' +); $path = sprintf $pat, $id; $path =~ s[($bit)($bit)$][/$1/$2]; return $path; }

    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Can you make it nicer?
by LanX (Canon) on Nov 18, 2013 at 22:29 UTC
    well I'm not too convinced about the cases >= 1e10 ... (shouldn't bugs be catched???)

    use strict; use warnings; sub id2p_choroba { my $id = shift or return q(); my $chunk_length = length $id > 6 ? 3 : 2; $id = sprintf '%0' . ($chunk_length * 3) . 'd', $id; my $chunk = ".{$chunk_length}"; my $path = join '/', $id =~ / ^ (.*) ($chunk) ($chunk) $ /xg; return $path; } sub id2p_lanx { my $id = shift or return q(); my ($chunk,$len) = (length $id > 6 ) ? (3,9) : (2,6); my $norm = sprintf "%0${len}d", $id; my $path = join "/", ( $norm =~ m/ ^ (-?\d+) (\d{$chunk}) (\d{$chunk}) $ /x ); # $path ||= '-00/000/001'; # this weirdness is needed if '-?' is mi +ssing return $path; } use Test::More tests => 23; is(id2p_lanx($_), id2p_choroba($_), "id=$_") for q(), 0, 1, 9, 10, 99, 100, 999, 1000, 9999, 10000, 99999, 100000, 999999, 1e6, 1e7-1, 1e7, 1e8-1, 1e8, 1e9, 1e10, 1e11, 1e12;

    update

    hmm ... well ... surprisingly now after some modifications my code looks a lot like yours! ;-)

    Cheers Rolf

    ( addicted to the Perl Programming Language)

Re: Can you make it nicer?
by boftx (Chaplain) on Nov 18, 2013 at 22:40 UTC

    I'm not convinced I like any of the above when it comes to readability. Here is what I came up with that might make it easier for a junior-level programmer to at least have half a clue as to what is going on (it being understood that my choice of variable names could be better):

    sub id2path_boftx { my $id = shift or return q(); # my $chunk_length = length $id > 6 ? 3 : 2; my $chunk_length = $id < 1e6 ? 2 : 3; my $path_pat = '%0' . $chunk_length . 'd'; $path_pat = join('/', $path_pat, $path_pat, $path_pat); my $chunk_power = 10 ** $chunk_length; my $path = sprintf $path_pat, $id / ($chunk_power ** 2), ($id / $chunk_power) % $chunk_power, $id % $chunk_power; return $path; }

    FWIW, it passes all 23 tests.

    Update: Changed how $chunk_lentgh is determined. I like the numeric test better after looking at it for a while.

    It helps to remember that the primary goal is to drain the swamp even when you are hip-deep in alligators.
Re: Can you make it nicer?
by rminner (Hermit) on Nov 18, 2013 at 23:50 UTC
    Personally i consider it most important to have readable code. It doesn't need to be fancy, but it should be quick to be understood. Sometimes i would rather write 2 or 3 lines more, if it would improve readability.
    Keeping the same output format of id2path, i would potentially have written it like this:
    sub id2path_new { my ($id) = @_; return '' unless $id; my $path = ''; if ($id >= 1_000_000) { $path = sprintf("%09d", $id); $path =~ s#\A(.*)(...)(...)\z#$1/$2/$3#; } else { $path = sprintf("%06d", $id); $path =~ s#\A(..)(..)(..)\z#$1/$2/$3#; } return $path; }
    Note: I know that i could also write the regex using \d and the {3} quantifier, resulting in:
    \A(\d*)(\d{3})(\d{3})\z
    In this case however, i consider the dots to be visually clearer, while having the same effect. Having two hard coded printfs is in my opinion also easier to read than first programmatically creating the sprintf format string.
Re: Can you make it nicer?
by sundialsvc4 (Monsignor) on Nov 18, 2013 at 23:53 UTC

    I’m afraid that, were I your team-lead, I would, rather decidedly, vote down your change, and not permit it to move forward into production.   And here would be my two three reasons why:

    1. Although the existing code might “offend your personal sensibilities,” it is nonetheless obvious as to what it does.   I can, using nothing more than my two eyeballs and from this (looks to be, rather extreme) geographical distance, be pretty-darned sure that it works.   (Furthermore, you are not reacting to any outstanding bug-report that suggests that it does not work.)
    2. Your “replacement” code, on the other hand, is not “visually obvious.”   At first glance, and the second and the third, I have no idea what this code is supposed to be doing.   Thank-you for providing test cases, but once again, I cannot be sure that these test cases are complete or exhaustive.
    3. There is business risk associated with your “gratuitous” change, but without any apparent business justification.

    Yes, I would make a mental note to be sure that, henceforth, you always had enough assigned tickets to deal with, that you nevermore felt that you felt that you had enough spare time to indulge any temptations to “distract yourself with things that ‘you thought’ needed to be ‘improved about’ stable production code that was in service.

    I am, as usual, quite sure that “the downvote daemons” will jump-on this opinion ... as they have done with so many others ...but in my defense I would simply say that it boiled down to a pure-business decision:   “In some ways, we all work in a sewer.   We can’t be making decisions based on smells.”   Change, itself, is “the bugaboo” here.   The <<business risk | business cost>> intrinsically associated with any change at all (especially to “active” code such as this ...) is so extreme that it must never be undertaken lightly.

      > it is nonetheless obvious as to what it does.

      ORLY?

      The original code is buggy for > 1e10, guess why!

      Cheers Rolf

      ( addicted to the Perl Programming Language)

      PS: Did anyone really expect you would contribute any code?

        How exactly is it buggy for id > 1e10?

        I will have to side with sundialsvc4 here. The original code is immediately obvious and clear for a C programmer.

      The downvote demons jump on your posts because it they look like a unicorn ate an APL keyboard and threw it all up. No one can ever figure out what you're trying to say because no one can ever ever ever read it.

        No one can ever figure out what you're trying to say because no one can ever ever ever read it.

        No figuring out is required; the text effects merely signal the stench volume of the advice

      Let me disagree, even if the original code was not the worst example I could have brought up.
      1. I was staring at the code for a long time before I understood what it does (originally, there was inconsistent indentation, global variables with meaningless names, almost no whitespace). I had to run it to see whether the numeric constants were correct and what the behaviour was for larger numbers.
      2. Here I have to agree. I was not satisfied with my solution, that was why I asked here. To run an exhaustive test, you can modify the list to
        q(), 0 .. 1e12
        Reserve enough time, though.
      3. Business justification: I am the maintainer of the code. If the code changes only when fixing bugs, it will stay in the nineties and become unmaintainable. Without refactoring, the code becomes dead, and accumulates risk: at one point in the future, a bug or feature request will appear that will not be possible.

        Working with always enough assigned tickets to prevent any creativity leads to exhaustion. It will make me leave your company, which means «business cost»: you invested in the hiring process and in training me. Moreover, trust (both to expert opinions and business decisions) makes the working environment better.

      لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
      I would make a mental note to be sure that, henceforth, you always had enough assigned tickets to deal with, that you nevermore felt that you felt that you had enough spare time to indulge any temptations to “distract yourself with things that ‘you thought’ needed to be ‘improved about’ stable production code that was in service.

      I am thankful you are not in a position to exercise such childish mismanagement principles in my company.

      I am, as usual, quite sure that “the downvote daemons” will jump-on this opinion ... as they have done with so many others.

      It really doesn't occur to you at all that the reason you get downvoted so often is that you're both wrong and abrasive, does it?

      I pity you.

Re: Can you make it nicer?
by hdb (Parson) on Nov 19, 2013 at 07:21 UTC

    Similar to BrowserUk's, but using substr instead of regex:

    sub id2path_new { my $id = shift or return q(); my( $pat, @d ) = length $id > 6 ? ('%09d',6,3) : ('%06d',4,2); $id = sprintf $pat, $id; substr $id, -$_, 0, '/' for @d; return $id; }
Re: Can you make it nicer?
by hdb (Parson) on Nov 19, 2013 at 07:46 UTC

    Or this?

    sub id2path_new { my $id = shift or return q(); return sprintf "%03d/$2/$3", $1 if $id =~ /(.+)(...)(...)/; return join "/", "00000$id" =~ /(..)(..)(..)$/; }

    Update: Changed the last line, was return sprintf "%02d/%02d/%02d", "00000$id" =~ /(..)(..)(..)$/;.

    Update: Another variation:

    sub id2path_new { my $id = shift or return q(); $id =~ /(.+)(...)(...)/ or "00000$id" =~ /(..)(..)(..)$/; return sprintf "%0".length($3)."d/$2/$3", $1; }

    Update: ...and close to obfuscation...

    sub id2path_new { my $id = shift or return q(); sprintf "%0".($id=~/(.+)(...)(...)/?3:("00000$id"=~/(..)(..)(..)$/ +,2))."d/$2/$3",$1; }

    I am getting carried away, so here is yet another one:

    sub id2path_new { $_[0]?join"/",grep{$_}"00000$_[0]"=~/0*(..)(..)(..)$|0*(.{3,})(... +)(...)$/:q(); }
Re: Can you make it nicer?
by oiskuu (Pilgrim) on Nov 19, 2013 at 13:45 UTC
    Using sprintf to stringify a number, you may want to check its range to avoid negatives or overflow surprises:
    my $id = shift; # disallow ID 0; $id > 0 && $id < 1e9 or return q(); $id = sprintf '%0*lu', ($id < 1e6 ? 6 : 9), $id; ...
    Or, you could let perl stringify the number, and munge it from there.
    sub id2path { my $id = shift; # allow ID 0; disallow 12.34 $id =~ m/^\d{1,9}$/ or return q(); my $n = length($id) > 6 ? 3 : 2; $id = sprintf '%0*s', 3 * $n, $id; $_ = "/$_/" for substr($id, $n, $n); # or: substr($id, $_, 0, '/') for -2*$n, -$n; return $id; }
    Interestingly, perl sprintf allows zero-padding of %s! Is this a documented feature?
Re: Can you make it nicer?
by hexcoder (Hermit) on Nov 19, 2013 at 14:46 UTC
    I tried to seperate the necessary steps. The code satisfies the tests. $id is treated only as a string.
    sub id2path_new { my $id = shift; return q() unless $id; # group of digits my $bits = length $id > 6 ? 3 : 2; # prepend with 0 digits if necessary my $id = ('0' x (3 * $bits - length $id)) . $id; my @parts = ( # leftmost part (substr $id, 0, -2 * $bits), # subparts (substr $id, -2 * $bits) =~ m{(\d{$bits})}xmsg ); return join q{/}, @parts; }
Re: Can you make it nicer?
by zork42 (Monk) on Nov 19, 2013 at 17:19 UTC
    I'm a little nervous to comment... I hope this won't offend anyone...
    IMVHO I think the clearest code is rminner's code.
    IMVVHO The 2nd clearest is ... oh look over there, a spider ... *whispers* the original code.
    IMVVVHO Both make it very clear what the 2 different cases are, and what to do for each of them.

    I like to line things up to help highlight the similarities and differences, so I might write it like this:
    sub id2path_new { my ($id) = @_; return '' unless $id; my $path = ''; if ($id >= 1_000_000) { $path = sprintf("%09d", $id); $path =~ s +#\A(.*)(...)(...)\z#$1/$2/$3#; } else { $path = sprintf("%06d", $id); $path =~ s +#\A(..) (..) (..)\z#$1/$2/$3#x; } return $path; }

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://1063201]
Front-paged by ww
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others meditating upon the Monastery: (12)
As of 2014-08-27 21:45 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    The best computer themed movie is:











    Results (253 votes), past polls