Re^4: Debugging a module that's failing under taint mode

by Bod (Parson)
on Jul 02, 2021 at 16:47 UTC


in reply to Re^3: Debugging a module that's failing under taint mode
in thread Debugging a module that's failing under taint mode

my answer was not intended as an April Fools' joke

I didn't think it was :P

As this module is not much more than a wrapper around Template and really cannot be used anywhere other than within the website that it is written for - really...it cannot - I didn't think tests would be needed because there is no end user as such.

Time to revisit that thought...

Re^5: Debugging a module that's failing under taint mode
by kcott (Archbishop) on Jul 02, 2021 at 23:56 UTC
    "... really cannot be used anywhere other than within the website that it is written for - really...it cannot ..."

    Unfortunately, what happens all too often, is a sequence like this:

    1. You write something specifically for "A" (whatever that might be). The code is peppered with hard-coded "A"s throughout.
    2. At some later point you have a similar requirement for "B". You take a copy of the first code and replace all of the "A"s with "B"s. You add some enhancements to this version of the code.
    3. Further down the track, you have another similar requirement, this time for "C". You take a copy of the first code and replace all of the "A"s with "C"s. However, you notice that the enhancements which you added for the "B" code are not in the "C" code. You attempt to retrofit those enhancements but encounter problems. You scrap your new "C" code, take a copy of the "B" code and replace all of the "B"s with "C"s.
    4. And then something generic changes; it could be something minor like certain URLs with an http scheme now need an https scheme. You now sit down with your editor and modify the "A" code, then the "B" code, then the "C" code.
    5. Later you realise that the "A" code needs the enhancements that were originally added to the "B" code. You sit down again with your editor and ...

    I hope you can see where this is going. There is no such thing as "will only be used by" or "final version" or anything else absolute like that. Every time you copy and substitute, or attempt to make identical changes in multiple files, you run the risk of introducing typos, subtle errors, and the like; the more you do this, the more those risks move from a minimal chance to a near certainty. Furthermore, look at the huge rod you're making for your back with all of this extra work (that could've been avoided).

    The A/B/C scenario I provided is, of course, a purely hypothetical example; however, it does mirror the type of thing that I've encountered on numerous occasions in over four decades of software development. Here are some examples.

    • One company had hundreds of scripts, whose core functionality was roughly equivalent, but each had hard-coded values, and were all subtly different from each other (including unprofessional-looking typos and bugs in rarely called routines that had never been tested).
    • Another employed the practice of copying entire modular frameworks and changing hard-coded values. My protestations fell on deaf ears. Then it blew up in their faces when an update, to a client in the Philippines, displayed text in a variety of languages including Vietnamese and Urdu.
    • A third, and classic, example is from over 20 years ago. The company had a number of corporate clients for whom they managed hundreds of static HTML pages. Whenever a new page was required, they just copied some random existing page, then changed the title and replaced the old main content with new. All of these pages were littered with <font> tags. I urged them to strip out the <font> tags and start using CSS: this was greeted with howls of "if it ain't broke, don't fix it" and such like. Then one of their clients changed one of their corporate colours to a slightly different shade: more than 3,000 edits to <font> tags later, CSS was back on the table.

    Never assume your code will only ever be used by one entity. Always abstract your code such that it can be reused. Avoid hard-coded values like the plague.

    Some of this might seem like additional work; however, once you get into the habit of doing it, you should find that it takes little or no extra effort. Moreover, as your software matures, the benefits accrue and you'll avoid the types of problems that I've indicated above.

    You've been here for less than a year with a clear appetite for learning and improvement — this is great. You've taken onboard templates, placeholders, and so on; I hope you pick up on the ideas I've presented here too.

    — Ken

      Well said kcott. The ideal of the DRY principle is that every piece of knowledge should have a single, unambiguous, authoritative representation within a system. Note that DRY is broader than just code, including things like database schemas, configuration files, test plans, build systems and doco. Violations of DRY are sometimes called WET: "write everything twice" or "we enjoy typing". ;-)

      Bod, in your current system are there any places where you find yourself having to make multiple changes in response to a single value changing? If so, you might enjoy thinking about how to eliminate these sort of error-prone chores. In large complex legacy environments, with multiple programming languages and tools in use across multiple platforms, Perl is an ideal glue language for building little tools to eliminate DRY violations.

        Bod, in your current system are there any places where you find yourself having to make multiple changes in response to a single value changing?

        I have always been reasonably good at making code reusable and avoiding having to make multiple changes.

        One glaring exception to this is a blog which exists separately on two different websites. Recently I wanted to add a new image inclusion option and had to separately add it to both instances. This is high on the list of things to refactor and make use of Template. Even more so because the blog on my personal site could do with some functional updating and it would be just silly to create a third instance! Instead I shall write a module that does all the formatting and rendering in a way that can be easily passed to a site-specific Template file.

        In marketing I am a big believer in COPE. We write a blog, shorten it to form a promotional email, shorten that into a Facebook post and shorten that into a tweet.

      Thank you kcott for such a thorough warning of the potential pitfalls...

      I hope you can see where this is going. There is no such thing as "will only be used by" or "final version" or anything else absolute like that

      Oh yes...I see where that is going.
      That's exactly the reason I created the CRM module that I asked for help with back here -> [RFC] Review of module code and POD

      Generally I have not been too bad at making code reusable.

      There is no such thing as "will only be used by"

      Well...I have some boilerplate code (I think that's the right term) which I use on nearly every website. It does the stuff I need for every site but for which the details are site specific. Things like displaying the headers and footers and putting in site wide default titles, descriptions, etc. The framework gets copied to each site and the site-specific values added in. Once that has happened, the code really is of no use anywhere else. If I wanted to recreate it, I'd make another copy of the code without any site values. This is what I mean by a module that will not get reused.

      I would abstract all those functions into a generic module which would update all the functionality of all sites. Except I don't think it can be done...or at least I don't think I can do it. Every site has different levels of user tracking from none at all through just checking if the user is logged on all the way to having persistent session cookies linked to our central CRM so different browsers used by the same person get linked and all conversion activity tracked and recorded. The use cases seem to be too diverse to cover with a generic module.

      My existing code is slowly being improved as I need to make changes. As part of that process I am looking for places to move similar things into a module.

      More so with new code. I have just started working on a new phase of a project where, because of the history of the project, security is likely to be very important. So I am designing abstraction and data hiding into that from the start. Even to the point of only serving Javascript functions to users that have the appropriate privileges.

      You've been here for less than a year with a clear appetite for learning and improvement

      Thank you for your kind words.
      It is certainly my intention to learn and to improve.

        The following provides a few hints regarding some of the points you mentioned. They are based on a current $work project so, while I can't give too much away, I do know that in principle it works and the information is up-to-date (i.e. not like the historical examples I posted earlier).

        Whenever I make a change to a module, that can be quite a bit of work in itself: the version must be changed in the code, POD, and potentially other places; I need to test the changes; I have to rerun the make cycle; and, a new distribution needs to be created and appropriately disseminated. Any modules that depend on the first module also need varying amounts of similar work: version updates; changes to Makefile.PL; and so on. I would much prefer to avoid all of this tedious effort wherever possible.

        I assume by "framework" that you are referring to a basic layout to provide a consistent look-and-feel across different pages. I create a single template for all of these pages. The text for things like the title in the header, copyright date in the footer, and so on, is all held in configuration files. If, say, I find a typo somewhere, I just make a change to a configuration file. Pages reuse a single generic module.
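
        As a minimal sketch of keeping site text in configuration rather than in code: the site names, keys, and the page_vars routine below are all hypothetical, and the configuration is written inline only to keep the example self-contained. In practice it would be read from a configuration file.

```perl
use strict;
use warnings;

# Hypothetical per-site configuration; in a real project this would be
# loaded from a configuration file rather than written inline.
my %config = (
    site_a => { title => 'Site A', copyright_year => 2021 },
    site_b => { title => 'Site B', copyright_year => 2021 },
);

# Generic routine: nothing site-specific is hard-coded here.
# Returns a hashref of variables suitable for handing to a template.
sub page_vars {
    my ($site) = @_;
    my $conf = $config{$site}
        or die "No configuration for site '$site'\n";
    return {
        title  => $conf->{title},
        footer => "Copyright $conf->{copyright_year} $conf->{title}",
    };
}

my $vars = page_vars('site_a');
print "$vars->{title}\n$vars->{footer}\n";
```

        Fixing a typo in a title, or adding another site, then means editing only the configuration; the routine itself stays untouched.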

        I have situations where different JavaScript files need to be provided. My template has code like this:

        <% BLOCK script_js %>
            <script src="<% request.uri_base %>/js/<% js_file %>"></script>
        <% END %>
        <% FOREACH js_file = js_files %>
            <% PROCESS script_js %>
        <% END %>

        My module code provides the list of *.js files (js_files) to the template; that then generates an appropriate list of <script> elements. Your list might include core.js and then, depending on the outcome of authentication, either guest.js, standard.js, or manager.js (of course, I'm completely guessing about your requirements and naming conventions).
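
        The module side of this could look something like the following sketch. It is not the actual $work code: the role names, the %js_for_role mapping, and the js_files_for routine are assumptions made for illustration, using the guessed file names above.

```perl
use strict;
use warnings;

# Hypothetical mapping from an authenticated role to its extra script.
my %js_for_role = (
    guest    => 'guest.js',
    standard => 'standard.js',
    manager  => 'manager.js',
);

# Return an arrayref of scripts for a role. Unknown or missing roles
# fall back to 'guest' so that nothing privileged is served by accident.
sub js_files_for {
    my ($role) = @_;
    $role = 'guest' unless defined $role && exists $js_for_role{$role};
    return [ 'core.js', $js_for_role{$role} ];
}

my $js_files = js_files_for('manager');
print join(', ', @$js_files), "\n";
```

        The returned array reference would be passed to the template as js_files. Adding a new role or renaming a file then only touches the mapping, not the template.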

        I have similar code for generating a list of <link> elements for CSS files.

        I wrote the module code to be generic from the outset. When the project moved from version 1 to version 2, I needed to make various changes or additions to config, template, JS and CSS files; however, the module remains unchanged at version 0.001.

        — Ken

Re^5: Debugging a module that's failing under taint mode
by jo37 (Deacon) on Jul 02, 2021 at 20:00 UTC

    I never wrote a test case for anyone other than me, though others might benefit from it.

    Greetings,
    -jo

    $gryYup$d0ylprbpriprrYpkJl2xyl~rzg??P~5lp2hyl0p$
