Re: Cyclomatic Complexity of Perl code

by BrowserUk (Patriarch)
on Dec 08, 2004 at 08:57 UTC


in reply to Cyclomatic Complexity of Perl code

This particular metric, along with most other metrics of this type--ie. static code metrics, including Lines-Of-Code (LOC), Function Points (FP), Interface Complexity (IF), etc.--would be almost impossible to calculate to any degree of accuracy in any language that supports dynamic programming.

That is to say, any language that supports eval and bless; runtime creation of functions (my $func = sub{...};); runtime modification of the symbol table (*{"${module}::${func}"} = sub{...};); introspection (if( ref $arg ){...} and if( $arg->can( ... ) ){...}); and probably others, is almost impossible to evaluate in these static terms.
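
By way of illustration, here is a minimal sketch (the package and sub names are invented for the example) in which the program's structure only comes into existence at runtime:

    use strict;
    use warnings;

    # Code created at runtime:
    my $func = sub { $_[0] + 1 };

    # Symbol table modified at runtime; which package receives which sub
    # is decided by data, not by anything visible in the source:
    my( $module, $name ) = ( 'My::Module', 'helper' );
    {
        no strict 'refs';
        *{"${module}::${name}"} = $func;
    }

    # Introspection: the call is only made if the method exists:
    print My::Module::helper( 41 ), "\n" if My::Module->can( 'helper' );    # prints 42

    # String eval: code conjured out of data:
    my $answer = eval '6 * 7';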

If you try to do it simply by analysing the source code, you will completely miss all of the complexity that only comes into being at runtime. Even in pretty static languages, like C, that support function pointers and dispatch tables, source code analysis is almost meaningless.
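
To see why, consider a dispatch table (a made-up example): the set of code paths that can actually execute is a property of the keys in the hash and of the incoming data, neither of which a source scanner can enumerate:

    use strict;
    use warnings;

    # Which handler runs is decided entirely by data read at runtime:
    my %dispatch = (
        add => sub { $_[0] + $_[1] },
        mul => sub { $_[0] * $_[1] },
    );

    while( my $line = <DATA> ) {
        my( $op, @args ) = split ' ', $line;
        my $handler = $dispatch{ $op } or die "unknown op '$op'";
        print $handler->( @args ), "\n";
    }

    __DATA__
    add 2 3
    mul 4 5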

The only way to render these types of statistics even vaguely meaningful and comparable is to totally hamstring the programmers: forbid any language feature or construct that would defeat static source code analysis.

I once went through the exercise of taking a moderately complex working system that used function pointers and dispatch tables, and rewriting it to remove these and a few other fairly everyday techniques that the 'Metrics guys' decried--mostly because their (very expensive!) source code analysis software couldn't properly handle them.

The result was a system that:

  1. Was almost twice as large.

    Every previously dynamic decision had to have all its possible code paths laid out as static decisions. 60% of these would probably never be exercised, but they had to be there for 'completeness'.

  2. Was much slower.

    Even simple constructs like (some) switch/case statements; functions with multiple (data-conditional) returns; even exception handling (of the setjmp/longjmp variety); were outlawed.

    Because, 'they said', "the data-driven aspect of the control flow meant the complexity and correctness of the running system were dependent upon the make-up and correctness of the data supplied". (That's a from-memory quote.)

    In other words, their software couldn't predict the path(s) of flow, so their metrics were useless.

  3. Involved a much higher degree of cut&paste "code reuse".

    Nearly identical functions were 'xeroxed' with a different name--because it reduced the complexity number!

  4. Took nearly twice as long to refactor as the original did to code.

    And much functionality had to be 'deferred' along the way.

As with all such simple-minded metrics systems, the original idea of providing some means of measuring and comparing code quality, complexity and maintainability is, in principle, a sound one.

The problem is that the result is a metric that becomes self-serving.

  1. Programmers' work starts to be assessed on the basis of 'the numbers'--rather than their inventiveness, productivity and thoroughness.
  2. User requirements start being rejected on the basis that they 'introduce too much complexity'--rather than on the usefulness and productivity gains that the feature would provide the users.
  3. Employers start seeing the measurements as a way of reducing their maintenance and training bills--by employing cheaper maintenance programmers and providing less training.

    This is the most pernicious and debilitating effect.

    • Development team turnover starts to rise, because they become sick of writing 'baby code'.
    • Maintenance programmer turnover rises, because they are paid so little that the promise of a few pence more will take them away.

      Or the provision of some training that may lift them out of the maintenance role into the developer role.

      Or simply because when you pay peanuts.... Even with the best will in the world, not every person that adopts the job-title "programmer" is going to be any good at it.

      When you pay bottom dollar, the least-skilled practitioners will tend to rotate amongst the employers within their area until such time as the pattern becomes obvious, and then they move on to the next area or re-invent themselves in some variation of the role.

I well remember the reactions of a few of the 'independent programmer assessors' who reviewed the "reduced complexity system" that resulted from that exercise. Chief among them were phrases like:

  • Who the **** coded this?
  • Why on earth was this done this way?
  • My eleven year old writes better code than this!
  • This is a joke right? Right??

Don't misread me. I want code metrics. It has always seemed ridiculous to me that a carpenter or metalworker can delegate a piece of work to an apprentice by drawing a sketch on the back of an envelope and writing a few numbers on it. The sketch doesn't need to be accurate, if the numbers (including tolerances) are.

But the equivalent task of delegation in software requires two hours, an overhead projector and several white boards--and still what you get back may be as much related to what you envisaged as a Mack truck is to a tricycle.

But imagine trying to measure the accuracy and productivity of a carpenter by measuring the length of his plane shavings or the volume of his sawdust; and then trying to improve this by restricting him to overlap joints and nails, because dovetails and screws are 'too complicated'.


Examine what is said, not who speaks.
"But you should never overestimate the ingenuity of the sceptics to come up with a counter-argument." -Myles Allen
"Think for yourself!" - Abigail        "Time is a poor substitute for thought"--theorbtwo         "Efficiency is intelligent laziness." -David Dunham
"Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon

Replies are listed 'Best First'.
Re^2: Cyclomatic Complexity of Perl code
by stvn (Monsignor) on Dec 08, 2004 at 15:13 UTC

    I think you have hit upon an important point (which is only slightly related to the topic at hand).

    It has always seemed ridiculous to me that a carpenter or metalworker can delegate a piece of work to an apprentice by drawing a sketch on the back of an envelope and writing a few numbers on it. The sketch doesn't need to be accurate, if the numbers (including tolerances) are.

    The key here is that they are drawing a representation of the "real world". And those numbers can be easily measured in the "real world". A carpenter or metalworker works directly with measurable physical reality.

    There is also a knowledge base of carpentry and metalworking that goes back to pre-historic times, and a common language for all the aspects of those crafts which goes back many thousands of years. I know what you mean when you say a "dovetail joint", and anyone with even basic carpentry skills will too. And even more importantly, the concept is understood well enough that it can be mentally visualized easily, and that mental image can be easily transposed into many unique situations without too much trouble. After all, it's only a dovetail, right?

    But the equivalent task of delegation in software requires two hours, an overhead projector and several white boards--and still what you get back may be as much related to what you envisaged as a Mack truck is to a tricycle.

    This is because it is incredibly hard to describe something which cannot be seen, cannot be heard, cannot be smelled, cannot be touched and cannot be tasted. As a matter of fact, software is nothing really at all but a collection of ideas and thoughts, for which the end product is imperceptible patterns of magnetism on a round piece of metal (or metal-coated plastic). Software works purely in the abstract.

    It should also be noted that we as software craftspeople do not have a common knowledge base to draw on. We instead have a whole lot of common knowledge bases, which we pick and choose from and argue about endlessly ;)

    There is also no common language that we all share. Terminology is a disease, not a cure.

    And lastly, this silliness that we all enjoy so much has really only been happening for about 50 years. Sure, its foundations are a little older than that, but the act of magnetizing small disks and using coordinated electrical pulses to perform something akin to thought is a pretty new thing still.

    Okay, enough of that silliness, I need more coffee!

    -stvn
      This is because it is incredibly hard to describe something which cannot be seen, cannot be heard, cannot be smelled, cannot be touched and cannot be tasted.

      The electronics industry, by definition, has been around just a few years longer than the computer software industry. Sit and observe a CPU, memory chips, or even the venerable 741 IC in operation, and it is equally hard to measure or describe what is going on. However, you can buy memory from any one of a dozen manufacturers and stick it in your PC, or MP3 player, or camera, and it will work just fine.

      By contrast, the software industry continues to re-invent every building block used by every program on every system, over and over. It reminds me very much of the early days of the car and motorcycle industries, where each component was individually crafted for each vehicle.

      I said a bit more on this in Re^2: What do you call yourself? & Re^4: What do you call yourself?, and a lot more in Re: How do you view programming, and even more in Re: (OT) Programming as a craft, Re: Re: Re: (OT) Programming as a craft & Re+: (OT) Programming as a craft (loong post. Even by my standards!).

      As a coder, I enjoy trying to invent rounder wheels (along with hovercraft, monorails, helicopters and rocket ships; not to mention those designs for an 8-legged vehicle that can walk on ceilings, and the snake-car that can traverse land and water without roads using a fuel-economical motion :). As a one-time project leader/manager/architect, I know that until we in the software industry manage to agree on a few basic, re-usable, interchangeable, replaceable software component standards--and agree to use them, universally--software will continue to be one-off works of art! Replicated, as in reproductions, but each individually created for its purpose, platform, etc.

      As I said, I want software metrics. I want to be able to describe a requirement, or a subcomponent of it, in terms of smaller, standardised subcomponents. I want to be able to mix'n'match those subcomponents from multiple sources. I want to be able to assess and compare those components via some consistent and verifiable, industry-standard metrics.

      I should be able to choose a sort implementation based on my criteria: For application A, I choose an implementation that trades RAM for speed, cos speed is paramount. For application B, I'll choose a low-memory implementation, because that's more important than absolute performance.

      I shouldn't have to write these, just download them (with or without fee) as binary components from any standard sort supplier. I could even ship my program without a sort. The customer's machine would interrogate the local system and either use the implementation currently available--if it meets the specification I embed in the application--or go out to the intranet, or the internet, and pull one that does from a preferred supplier, or the first compliant one that Google throws up.

      This utopian vision (IMO) has to happen. Useful, comparable, standardised metrics are the first step to achieving it. I just don't see source code complexity as a useful measure. At least, not until you also have a way of measuring the complexity of the problem the same software solves. If you had both, then you could divide the latter by the former to produce a "complexity of implementation" ratio that might be useful.

      Until then, you may as well compare the goodness of art by the size of the canvas divided by the number of brush-strokes used. Rolf Harris would do well :)


      Examine what is said, not who speaks.
      "But you should never overestimate the ingenuity of the sceptics to come up with a counter-argument." -Myles Allen
      "Think for yourself!" - Abigail        "Time is a poor substitute for thought"--theorbtwo         "Efficiency is intelligent laziness." -David Dunham
      "Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon

        Amen!

        Sing that sweet song of Software Manufacturing Brother BrowserUK!!!!!

        -stvn
        A 741 op-amp is pretty easy to describe: a DC gain of 100,000 and a single-pole roll-off with a corner at 10 Hz. Don't exceed 0.5 W of power.
Re^2: Cyclomatic Complexity of Perl code
by simonm (Vicar) on Dec 08, 2004 at 16:34 UTC
    As stvn pointed out, the most effective way to analyze Perl software is not through static source code analysis, but by taking advantage of the dynamic nature to gather information at run-time.

    Of course, that does mean that the expensive analysis software your company bought wasn't going to be any help, but it doesn't mean the task is impossible.

    Looking at the output of the various profiling and coverage tools, it seems like you could perform a similar code analysis function with reasonable results by combining the source-code and run-time information.
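
    For instance (a sketch only, assuming the stock Devel::Cover module, which instruments the program as it runs and reports statement, branch, condition and subroutine coverage):

        perl -MDevel::Cover=-db,cover_db myprog.pl args...
        cover cover_db    # summarise the branch/condition data gathered at run time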

      I was commenting directly at the static method the OP asked about, and at static methods in general.

      The source code analysis software we were evaluating was for C, back in the early '90s; it was rejected completely, and probably died a death. The best tool I saw for code complexity analysis was an ICE-based tool that operated on the compiled (with probes) executable, and instrumented the entire program, not individual source files.

      The problem with instrumenting individual files is that it is pretty easy to ensure that each file has a very low complexity value. One easy step is to make sure that each file has almost nothing in it. But that is self-defeating if the number of source files trebles. Even if this simple step is avoided, the other sanctioned ways of reducing complexity result in much higher interface complexities, which is again self-defeating.

      As a measure, complexity is a very bad indicator of anything useful. Some (most) systems are complex by their very nature.

      A 747 is hugely complex, but would it be "better" if you removed all the dual and triple redundancy systems to reduce complexity?

      The most reliable car is one without an engine. It's less complex, but is it "better"?

      It's not just that it would be difficult to produce a measure of complexity for Perl source code, even by attempting to combine both static and dynamic measurements; you also have to consider what that measure would mean, if it could be produced.

      Measuring the complexity of the source code, or how hard the runtime code has to branch to do what it does, has much more to do with the complexity of the problem being solved, than of the 'goodness' of the way the solution was written.


      Examine what is said, not who speaks.
      "But you should never overestimate the ingenuity of the sceptics to come up with a counter-argument." -Myles Allen
      "Think for yourself!" - Abigail        "Time is a poor substitute for thought"--theorbtwo         "Efficiency is intelligent laziness." -David Dunham
      "Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon
        I was commenting directly at the static method the OP asked about, and at static methods in general.

        Understood -- but why assume that Cyclomatic Complexity or Interface Complexity *must* be analyzed statically?

        As a measure, complexity is a very bad indicator of anything useful. Some (most) systems are complex by their very nature.

        I understand that metrics are often abused, but that doesn't mean they're useless.

        It would be foolish to try to maximize one metric without considering the tradeoffs -- as shown by your experience of driving down CC while allowing LOC to mushroom... But being able to compare several different implementations on the basis of multiple metrics seems helpful.

        The most reliable car is one without an engine. It's less complex, but is it "better"?

        Of course you should include "do the tests pass" as one of your key metrics.

Re^2: Cyclomatic Complexity of Perl code
by QM (Parson) on Dec 08, 2004 at 18:47 UTC
    Reminds me of a quote I ran across earlier today:
    If you torture data sufficiently, it will confess to almost anything. -Fred Menger, chemistry professor (1937- )

    -QM
    --
    Quantum Mechanics: The dreams stuff is made of

Re^2: Cyclomatic Complexity of Perl code
by EdwardG (Vicar) on Jan 12, 2005 at 11:07 UTC

    I empathise with your tales of woe, but I think you go too far in dismissing the value of metrics as a form of source code instrumentation.

    If one considers a given metric to be part of the whole picture, but at the same time does not confuse that single metric for the whole of the picture, I think metrics can help a great deal--particularly when dealing with large volumes of code and large numbers of coders, and particularly in terms of informing decisions about where to spend code review and unit test effort.

    It sounds like you have suffered from idiotic applications of misguided rules more than you have suffered from the metrics themselves, but perhaps that is the price of their existence; that they will be abused like so much else.

    Out of interest, I generated these metrics for a part of my hobby chess engine:

    Method           Cyc. Compl.   LOC   Blank lines   Comment lines   Comment %   Statements
    GenMoves                  50   437            23             114       26.09          236
    FENToPosition             41   117            18              27       23.08           99
    IsAttacked                26   193            17              74       38.34          102
    PieceToString             16    21             0               0        0.00           32
    ColourOfPiece             14    16             4               0        0.00            4
    PieceToString             10    15             0               0        0.00           20
    HitTheEdge                10    31             9               0        0.00           10
    ASCToFEN                   9    53             6              19       35.85           26
    FirstBit                   8    39            10               0        0.00           21
    MoveToString               8    28             1               0        0.00           17
    Slide                      5    25             6               4       16.00           15
    GetMoveCount               4    21             6               0        0.00           12
    PrintMoveList              2     5             0               0        0.00            2
    LeftEdge                   1     4             0               0        0.00            2
    RightEdge                  1     3             0               0        0.00            1
    TopEdge                    1     3             0               0        0.00            1
    BottomEdge                 1     3             0               0        0.00            1
    SquareToString             1     3             0               0        0.00            1
    SquareToString             1     3             0               0        0.00            1
    SquareToRank               1     3             0               0        0.00            1
    SquareToFile               1     3             0               0        0.00            1
    ASCToPosition              1     3             0               0        0.00            1
    Warn                       1     3             0               0        0.00            1

    As a profile of this part of the engine, I found this very interesting. For instance,

    • Why do four of the complex methods have no comment lines at all?
    • Why is FENToPosition so complex? (this was a surprise)
    • Is Slide() under-commented at 16%?
    • What (if any) is the correlation between lines of code (LOC) and Complexity?
    • Is this consistent with other code I've written?
    • How do all these stats compare to other projects?
    • Does this align with my intuitive sense of their complexity?
    • Is there a code density metric I can derive from LOC and statements?

    Now some of these thoughts I could have had by simply browsing the code, but I hope I'm illustrating the usefulness of these metrics as a high-level view. I find them provocative in a highly beneficial way.

     

      I empathise ... It sounds like you have suffered from idiotic applications of misguided rules

      No need for empathy. It was a long time ago, and we didn't suffer :) The result of the study was that we rejected both the tool and the idea.

      With respect to your statistics: one set thereof does not a case make.

      I would derive two things from my reading of the numbers.

    • Three modules are heavily over-commented.
    • Large modules are more complicated than small ones.

      What I cannot say from those numbers is whether the complexity arises as a result of:

      • the needs of the algorithm required to perform the function of those methods;
      • a bad algorithm having been used;
      • a good algorithm having been badly implemented;
      • the "complex" (read: big) methods doing too much.

      Indeed, without inspecting the source code, I cannot even tell the accuracy of those metrics.

      It could be that "HitTheEdge" contains a recursive algorithm that is extremely complicated to follow and modify, but simple in its code representation.

      Or that "GenMoves" is a huge if/then/else structure that would be better implemented as a dispatch table

      Or that "Slide" uses a string eval to replace itself with an extremely complicated subroutine that the source code analyser sees simply as a big string constant.

      And that's my point. You are already looking to derive further metrics from the generated metrics, but there is no way to validate the efficacy of those metrics you have, beyond inspecting the code and making a value judgement.

      So you are already falling into the trap of allowing the metrics to become self-serving; but the metrics themselves are not reproducible, scientific measurements, they are simply "indicator values".

      When you measure the length, mass, hardness, reflectivity, temperature, elasticity, expansion coefficient etc. of a piece of steel, you are collecting a metric which can be reproduced by anyone, anywhere, any time. Even if the tools used to make the measurement are calibrated to a different scale, it is a matter of a simple piece of math, or a lookup table to convert from that scale to whichever scale is needed or preferred. This is not the case for any of the metrics in your table.

      You don't say what language your program is coded in, but I could (probably) take all of your methods and reduce them to a single line. It would be a very long line, but in most languages it would still run perfectly well. What effect does that have on your metrics?
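
      To make that concrete (a made-up function, not one of your methods): the two versions below behave identically, yet any line-based metric scores them very differently:

          use strict;
          use warnings;

          # Spread over several lines...
          sub clamp {
              my( $n, $lo, $hi ) = @_;
              return $lo if $n < $lo;
              return $hi if $n > $hi;
              return $n;
          }

          # ...or squeezed onto one; the behaviour is identical:
          sub clamp1 { my( $n, $lo, $hi ) = @_; return $lo if $n < $lo; return $hi if $n > $hi; return $n }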

      Equally, we could get half a dozen monks (assuming Perl) to refactor your methods according to their own formatting and coding preferences and skill-levels. And even if they all did a bang-up job of reproducing the function of your originals--bugs and all--and we then used the same program to measure their code as you used for yours, they would all produce different sets of numbers.

      And that is the crux of my distaste for such numbers. They are not metrics. They do not measure anything! They generate a number, according to some heuristic.

      They do not measure anything about the correctness, efficiency or maintainability of the code that gets run; they only make some guesses, based upon the way the coder formatted his source code.

      1. They do not conform to any standards.
      2. They are not transferable between programmers.
      3. They are not transferable between algorithms.
      4. They are not transferable between programs.
      5. They are not transferable between languages.
      6. They are not transferable between sites.
      7. They are not transferable between design or coding methods (OO, procedural, functional, etc.).
      8. They are not transferable between assessment methods or tools.

      In short, they are not comparable, and you cannot perform math with them.

      As proof of this, take a look at your "PieceToString" and "HitTheEdge" methods. They have an equal 'complexity' when measured by the same tool. Is this obvious, or even definable from looking at the source code? If I am given two pieces of steel 10 cms long, even without measuring them with a rule, I can easily tell they are the same length. No such comparison is possible for source code.

      The tool has become the only way of comparing source code, and as it does not (and could not) adhere to any standard, all measurements are relative, not absolute. So, unless everyone agrees on which tool/language/coding standards etc. etc. to use, there is no way to compare two versions of the same thing.

      That means that in order to make comparisons, you have to implement every possible (or interesting) version of the source code, before you can make any inference about whether any one is good or bad.

      And even if you could code every possible implementation of a given algorithm, and could prove that they all produced exactly the same results, and you generated your numbers: What would it tell you?

      Should you pick the version with the lowest complexity rating? The shortest? The longest? The one with the highest ratio of comments?

      Would you make any choice based on the numbers alone? Or would you have to look at the source code?

      If you admit that you would have to look at the source code, then you have just thrown your "metrics" in the bin in favour of your own value judgement.

      And if you didn't, then you should publish the formulae by which you are going to juggle all those numbers in order to make your decision. It should make for interesting reading.


      Examine what is said, not who speaks.
      Silence betokens consent.
      Love the truth but pardon error.

        You seem to be confused about the purpose of metrics. They are NOT an end in themselves, and they are NOT useful in isolation from the code. In demanding such things, and exaggerating the abuses therein, you are setting up straw men and knocking them down in puffs of rhetoric.

         
