Beefy Boxes and Bandwidth Generously Provided by pair Networks
laziness, impatience, and hubris
 
PerlMonks  

[RFC] Review of module code and POD

by Bod (Curate)
on Mar 31, 2021 at 19:45 UTC ( #11130640=perlmeditation: print w/replies, xml ) Need Help??

Esteemed Monks,

For all past time we have connected directly to our CRM from the scripts that need to read or write the data. It has been my intention for a very long time to create some standard subroutines in a require file to do all the things we regularly need to do. Thanks to The Monastery, I now know I need to use a module to do this. So I have set about creating a suitable module. I also thought this would be a good opportunity to do it properly and include some POD. This module will never be used outside of our use case but it seems like good practice to include documentation.

Could you please look over the code and documentation and for me before I go too much further and advise if I am making any horrible mistakes, what I can improve and how clear the documentation is...

package Bod::CRM; use DBI; use Bod::Variables; use strict; use warnings; =head1 Name Bod::CRM =head1 Synopsis Bod::CRM provides simple methods to access the Lets Delight CRM. use Bod::CRM; my $crm = Bod::CRM->new(); my $vars = { 'firstname' => 'John', 'lastname' => 'Smith', 'pri_email' => 'john.smith@example.com', }; $crm->add($vars); my $vars = { 'firstname' => 'John', }; my @contacts = $crm->find($vars); print $crm->error; =head2 Error Handling Except where otherwise documented, methods return zero to indicate an +error. The error message can be found by calling the B<error()> method. =head2 Description =cut =head4 new(environment); =over 1 Creates a new connection to the CRM. If I<environment> is 'prod' or ' +test', that environment is used. If I<environment> is omitted, an at +tempt is made to find the environment from the C<$ENV{'HTTP_HOST'}> v +ariable. Returns a B<Bod::CRM> object or I<undef> if unsuccessful. Unsuccessful means either the environment was wrong or the connection +to the database failed for another reason. =back =cut sub new { my ($class, $env) = @_; if (!$env and $ENV{'HTTP_HOST'} =~ /^(.*?)\./) { $env = $1; } $env = 'prod' if $env eq 'www'; my $dbh = DBI->connect("dbi:mysql:$Bod::Variables::db_prefix$env:l +ocalhost:3306", "$Bod::Variables::db_username", "$Bod::Variables::db_ +password"); if (!$dbh) { warn("Unable to connect to $env database - " . $DBI::errstr); return undef; } my @schema_fields; my $query = $dbh->prepare("SELECT COLUMN_NAME FROM information_sch +ema.`COLUMNS` WHERE TABLE_SCHEMA = '$Bod::Variables::db_prefix$env' A +ND TABLE_NAME = '$Bod::Variables::C_Table'"); $query->execute; while (my ($field) = $query->fetchrow_array) { push @schema_fields, $field; } my $self = bless { 'env' => $env, 'dbh' => $dbh, 'error' => '', 'field' => \@schema_fields }, $class; return $self; } =head4 error(); =over 1 Return a string detailing the last error. =back =cut sub error { my ($self) = @_; return $self->{error}; } =head4 find(fields, blanks); =over 1 Finds a list of people in the CRM matching all the I<fields> passed as + a hash reference. Empty strings are not matched unless the optional + I<blanks> parameter is true. Returns an array of people IDs that match all keys as fields to values + as contents. =back =cut sub find { my ($self, $fields, $blanks) = @_; my $where; my @place; my $flag = 0; foreach my $field(keys(%$fields)) { next unless $fields->{$field} or $blanks; $where .= " AND " if $flag; $where .= "$field = ?"; push @place, $fields->{$field}; $flag = 1; } my @ids; my $query = $self->{'dbh'}->prepare("SELECT idContact FROM $Bod::V +ariables::C_Table WHERE $where"); $query->execute(@place); while (my $id = $query->fetchrow_array) { push @ids, $id; } return @ids; } =head4 get(id); =over 1 Get information about the person with given I<id> Returns a hash reference containing keys with the values of the CRM fi +eld names. =back =cut sub get { my ($self, $id) = @_; return $self->{'dbh'}->selectrow_hashref("SELECT * FROM $Bod::Vari +ables::C_Table WHERE id$Bod::Variables::C_Table = ?", undef, $id); } =head4 add(fields); =over 1 Add a person to the CRM or update existing person Given the information in the passed hash reference I<fields> add or up +date a person. If the B<email> or B<Twitter> fields match then updat +e an existing person, otherwise add a new person. If a new person is + being added then the I<fields> source key must be set to the correct + value to correspond with the B<PersonSource> table. If the source key is not set or insufficient information is provided, +returns zero. Otherwise returns the contact ID of the person created + or updated. =back =cut sub add { my ($self, $fields) = @_; my $id; $id = $self->{'dbh'}->selectrow_array("SELECT id$Bod::Variables::C +_Table FROM $Bod::Variables::C_Table WHERE pri_email= ? OR sec_email += ?", undef, $fields->{'email'}, $fields->{'email'}) if $fields->{'em +ail'}; $fields->{'twitter'} =~ s/^\@//; $id = $self->{'dbh'}->selectrow_array("SELECT id$Bod::Variables::C +_Table FROM $Bod::Variables::C_Table WHERE twitter = ?", undef, $fiel +ds->{'twitter'}) if $fields->{'twitter'} and !$id; $fields->{"id$Bod::Variables::C_Table"} = $fields->{'id'} if $fiel +ds->{'id'} and !$fields->{"id$Bod::Variables::C_Table"}; # These fields should not be set manually $fields->{$Bod::Variables::S_Table} = undef; $fields->{'created'} = undef; $fields->{'updated'} = undef; my @place; my $set; my $flag = 0; foreach my $field(@{$self->{'field'}}) { if ($fields->{$field}) { $set .= ', ' if $flag; $set .= "$field = ?"; push @place, $fields->{$field}; $flag = 1; } } if ($id) { my $query = $self->{'dbh'}->prepare("UPDATE $Bod::Variables::C +_Table SET $set, updated = DATE(NOW()) WHERE id$Bod::Variables::C_Tab +le = $id"); $query->execute(@place); } else { if (!$fields->{'source'}) { $self->error = 'No source code provided for contact'; return 0; } if (!$fields->{'firstname'} and !$fields->{'email'} and !$fiel +ds->{'twitter'}) { $self->error = 'Insufficient information provided for cont +act'; return 0; } my $query = $self->{'dbh'}->prepare("INSERT INTO $Bod::Variabl +es::C_Table SET $set, created = DATE(NOW()), $Bod::Variables::S_Table + = ?"); $query->execute(@place, $fields->{'source'}); $id = $self->{'dbh'}->selectrow_array("SELECT LAST_INSERT_ID() +"); } return $id; } =head4 business(contact, business, subscribe) =over 1 Adds the person to the business unit as definded in the B<BusinessUnit +> table. Has no effect if the person is already added to the busines +s unit. However, if the optional I<subscribe> is true, the person wi +ll be resubscribed to the list if they had previously unsubscribed. Returns true if the person is in the busines unit, zero if it has fail +ed and I<undef> if an error has occurred. =back =cut sub business { my ($self, $contact, $business, $subscribe) = @_; unless ($contact > 0 and $business > 0) { $self->error = 'Invalid Contact or Business parameters'; return undef; } my $update; $update = ' ON DUPLICATE KEY UPDATE subscribe = 1' if $subscribe; my $query = $self->{'dbh'}->prepare("INSERT IGNORE INTO $Bod::Vari +ables::B_Table_has_$Bod::Variables::C_Table SET $Bod::Variables::C_Ta +ble_id$Bod::Variables::C_Table = ?, $Bod::Variables::B_Table_id$Bod:: +Variables::B_Table = ?, subscribe = 1$update"); $query->execute($contact, $business); return $self->{'dbh'}->selectrow_array("SELECT COUNT(*) FROM $Bod: +:Variables::B_Table_has_$Bod::Variables::C_Table WHERE $Bod::Variable +s::C_Table_id$Bod::Variables::C_Table = ?, $Bod::Variables::B_Table_i +d$Bod::Variables::B_Table = ?", undef, $contact, $business); } =head4 marketing(contact, business, permission, grant) =over 1 Set the marketing permissions for a person on the specified business u +nit. The I<permission> parameter is defined in the B<MarketingPermis +sionLookUp> table. Normally, this method will grant the permission b +ut can also revoke the permision by passing 'R' as the optional I<gra +nt> parameter =back =cut sub marketing { my ($self, $contact, $business, $permission, $grant) = @_; unless ($contact > 0 and $business > 0 and $permission > 0) { $self->error = 'Invalid Contact, Business or Permission parame +ters'; return undef; } $grant = 'G' unless $grant eq 'R'; my $query = $self->{'dbh'}->prepare("INSERT INTO $Bod::Variables:: +M_Table SET $Bod::Variables::C_Table_id$Bod::Variables::C_Table = ?, +$Bod::Variables::B_Table_id$Bod::Variables::B_Table = ?, date = NOW() +, giveRem = ?, Permission_idPermission = ?"); $query->execute($contact, $business, $grant, $permission); return $self->{'dbh'}->selectrow_array("SELECT COUNT(*) FROM $Bod: +:Variables::M_Table WHERE $Bod::Variables::C_Table_id$Bod::Variables: +:C_Table = ?, $Bod::Variables::B_Table_id$Bod::Variables::B_Table = ? + AND Permission_idPermission = ? AND date > NOW() - INTERVAL 2 SECOND +", undef, $contact, $business, $permission); } 1;

I have pulled out anything that could pose a security threat to Bod::Variables. Not just database username and password but also schema name and table names.

The documentation as generated by pod2html is here.

Replies are listed 'Best First'.
Re: [RFC] Review of module code and POD
by GrandFather (Saint) on Apr 01, 2021 at 00:05 UTC

    As a general thing lines that are very long, especially if there is important stuff at the end, and even worse of the structure is convoluted, compared with short lines are hard to understand. Keep your lines short, or at least wrap them sensibly. For example:

    return $self->{'dbh'}->selectrow_array("SELECT COUNT(*) FROM $Bod: +:Variables::M_Table WHERE $Bod::Variables::C_Table_id$Bod::Variables: +:C_Table = ?, $Bod::Variables::B_Table_id$Bod::Variables::B_Table = ? + AND Permission_idPermission = ? AND date > NOW() - INTERVAL 2 SECOND +", undef, $contact, $business, $permission);

    is easier to maintain written:

    return $self->{'dbh'}->selectrow_array( "SELECT COUNT(*) FROM $Bod::Variables::M_Table WHERE $Bod::Variables::C_Table_id$Bod::Variables::C_Table += ?, $Bod::Variables::B_Table_id$Bod::Variables::B_Tabl +e = ? AND Permission_idPermission = ? AND date > NOW() - INTERVAL 2 SECOND", undef, $contact, $business, $permission );

    Consider using spaces instead of tabs for indentation.

    Consider having new return an instance of Bod::CRM with its error field set, or throw an exception, for a DB connection failure.

    Consider removing warnings in lower level code. Low level code may not have enough context to be really helpful and low level warnings can end up generating a noisy log. Calling code can emit warnings if appropriate and can generally provide a better semantic context.

    Consider using exceptions for handling all errors so that calling code can chunk error handling into blocks of related code rather than having to deal with handling errors on all function calls. That extends quite well to having DBI code throw exceptions rather than having fine grain error checking at a low level (see RaiseWarn and RaiseError in DBI).

    Consider passing named parameters to find and add. That way your existing hash ref gets passed in as a hash with no other change, but a simple find doesn't need to have a hash ref spun up. That makes it easier to write the code and easier to read and check:

    my %fields = ( firstname => 'John', lastname => 'Smith', pri_email => 'john.smith@example.com', ); $crm->add(%fields); $crm->add(firstname => 'John', lastname => 'Smith', pri_email => ' +john@xxxx.com'); my @contacts = $crm->find(firstname => 'John');
    Optimising for fewest key strokes only makes sense transmitting to Pluto or beyond
      Consider using spaces instead of tabs for indentation.

      8 simple words to start a holy war.


      🦛

        From that node subtitled "My arguement in favor of tabs": "Reputation: -34 (+44 -78)" - says it all really.

        Optimising for fewest key strokes only makes sense transmitting to Pluto or beyond
        Consider using spaces instead of tabs for indentation

        I never cease to be amazed at how The Monastery reveals that things I thought were standard practice...just aren't!

        The only formal education I have ever had relating to writing code was a short C++ course as part of my undergrad degree course and an even shorter Java course at another university as a small part of wider study. In both courses in the 1990's we were told to use tabs as indents and I didn't question why. Since then I have never questioned that anyone might do anything else...

        Is this a case of academia being out of touch with working practices or has the favour of spaces changes in the last quarter of a century?

      Consider passing named parameters to find and add. That way your existing hash ref gets passed in as a hash with no other change, but a simple find doesn't need to have a hash ref spun up.

      Thank you - I don't know why I didn't do it exactly like that because most of the time it will be a hash, not a hashref that gets passed in. Except there are a lot of parameters which could change...hence why I read the table columns out of the information_schema in the new method.

Re: [RFC] Review of module code and POD (Doco)
by eyepopslikeamosquito (Bishop) on Apr 01, 2021 at 00:32 UTC

    Two points from Perl Best Practices (Documentation section) you might want to contemplate:

    • Distinguish user documentation from technical documentation.
    • Place POD as close as possible to the end of the file.

    Personally, I dislike intermingling POD with code. Documentation of code is for programmers, so I use simple Perl comments (not POD) for that, right by the code it is documenting. Makes it easier to update the comments in step with the code.

    Technical documentation of the system architecture (not the code) I place in a single large POD documentation section at the end of the file, or in a separate file.

    User documentation should be kept separate from technical documentation. For that, I suggest either a separate POD file (or a big POD block at the end of the source file) ... or simply a Usage subroutine displayed when the user enters --help at the command line, making the user documentation more discoverable and more likely to be read.

    See also:

Re: [RFC] Review of module code and POD
by kcott (Bishop) on Apr 01, 2021 at 02:58 UTC

    G'day Bod,

    As far as I can see, there's nothing horribly wrong with what you've presented; however, there are a number of things I would have done differently. The following is very much biased towards my own preferences: pick and choose any bits you like; skip the bits you don't want; feel free to ignore all of it. I've also added some useful POD-related documentation links.

    I write a lot of modules. The majority would be for $work; although, I do write quite a few for personal use. Regardless, these are not for CPAN: the $work ones are Commercial in Confidence; the personal ones are too specific to my needs, or in some cases just testing ideas, and are not considered to be of general usefulness.

    Having said that, I still write these modules as if they were going to be uploaded to CPAN, and include: POD with the code; standard files such as Makefile.PL, MANIFEST, README and Changes; tests in a t/ directory; and so on. This allows me to, amongst other things: test changes (make test); easily install the modules (make install); create distributions (make dist) which I can transfer to other machines (e.g. scp to $work). I also find the POD to be exceptionally useful: six months or so after completing a module, I can rarely remember the exact details of every function and method, particularly when it comes to optional arguments and complex return values (e.g. array and hash references) — perldoc ModuleName makes my life so much easier.

    I never sit down and write all the files needed for the type of module described above; instead I use Module::Starter with the Module::Starter::PBP plugin. The latter is mainly for the excellent templates and does not represent a slavish adherence to "Perl Best Practices" guidelines.

    I have multiple config* files for my various uses. As an example, here's the currently active one for personal modules using Perl v5.32:

    ken@titan ~/.module-starter $ cat config author: Ken Cotterill email: kcott@cpan.org builder: ExtUtils::MakeMaker plugins: Module::Starter::PBP template_dir: /home/ken/.module-starter/P532

    I have modified the templates for my own preferences. For personal modules, that can be pretty much whatever I want. For $work modules, there are certain constraints, policies, and so on, that I need to take into account (e.g. specific Perl version, copyright text, etc.).

    There are three test files that I always add to the templates. These use Test::Pod, Test::Pod::Coverage and ExtUtils::Manifest.

    The idea behind all of this is to create templates once, then let module-starter create as many module frameworks as I want, all with the same, consistent look and feel. Obviously, I still need to fill in details, such as the description of functions, but all of the boilerplate has been done for me and I can pretty much forget about that part of the process. Partly DRY; mostly "the first great virtue of a programmer": laziness.

    Here's the POD documentation to which I alluded earlier:

    perlpod
    This is the basic documentation and a handy reference. If you're not familiar with it, check out the use of multiple angle brackets allowing nesting of "Formatting Codes" (it's near the end of that section).
    perlpodspec
    This is the gory details. It's probably more geared to those writing POD parsers and the like, but still has useful information. I'd generally only use this if I couldn't find sufficiently specific information in perlpod.
    perlpodstyle
    This has information on the standard sections of a manual page. You may find this useful.

    Things I would have done differently (in no particular order) and bear in mind, as I said above, these are my personal preferences:

    • I put all the code first; and all the POD last.
    • I always start POD with =encoding utf8.
    • I generally put whitespace before and after code blocks (if, while, etc.).
    • I try to keep lines under 80 characters; I'll go out of my way not to exceed 120 characters. This simply makes reading the code easier. I acknowledge that this is not always possible, but mostly it is. As an example, I'd prefer:
      my $dbh = DBI->connect( "dbi:mysql:$Bod::Variables::db_prefix$env:localhost:3306", "$Bod::Variables::db_username", "$Bod::Variables::db_password" );
    • Your use of $Bod::Variables::varname, locks you into a single class. Consider this which allows subclassing:
      # Uncomment one of these # my $bvar = Bod::Variables::->new(); # my $bvar = Bod::Variables::Special::->new(); my $dbh = DBI->connect( 'dbi:mysql:' . $bvar->db_prefix() . "$env:localhost:3306", $bvar->db_username(), $bvar->db_password() );
      Obviously, as I have no knowledge of Bod::Variables, that's highly contrived with a lot of guesswork; however, I hope you get the idea of the increased flexibility it offers.
    • I'd generally use more meaningful names than find() and add(); perhaps find_people() and add_person() would aid in reading and understanding the code.
    • The get() function could also do with a more meaningful name. If it's part of the API, document it in the SYNOPSIS; if it's not, prefix it with an underscore, to nominally indicate that it's private, and remove it from the POD (if renamed to _get_person_info(), the function is so simple that it wouldn't even need a comment).

    — Ken

Re: [RFC] Review of module code and POD
by Arunbear (Prior) on Mar 31, 2021 at 22:41 UTC
    It looks reasonable enough. Some possible improvements:
    1. Use a configuration file e.g. to store your environment setting as well as database credentials (decouple settings from code)
    2. Use DBIx::Class - you'll get search/insert and more functionality (almost) for free.
    3. Consider named parameters for methods with more than two args (makes code more self documenting).
      Use a configuration file

      Sorry if I'm being a bit thick but isn't that what I'm doing with use Bod::Variables; because all that's in there is a package statement and a list of scalars defined with our.

      Thanks for the suggestions.

        You're still keeping sensitive information in your code base, so that's not a configuration file.

        In the real world they are typically managed separately from the code base and often by people who may not be programmers (I realise your use case is simpler than the typical one).

        See Config::Tiny for an example.
Re: [RFC] Review of module code and POD
by perlfan (Vicar) on Mar 31, 2021 at 21:39 UTC
    Huge misakes? Not that I can see. Automatic A++ for even using POD. The bulk of the feedback you receive may end up having more to do (IMO) with POD's placement inline functions versus at the end. I know the merits of each and have the debates before, even at $WORK (related issues include: maintenance, Lukes using the source to augment module understanding, etc; however I get it, it all looks the same when rendered). My personal preference is all at the bottom as a monolith and *.pod is only useful for cookbooks and tutorials. Minor nits: DESCRIPTION typically goes immediately after SYNOPSIS; and I think there's a standard way the NAME section is written so CPAN gets the description right. Also, looking at other well known module POD is always informative. Good job.

      Thank you

      Yes, I did wonder if I was going to start a major debate about POD being inline or clumped at the bottom. I took the view, without thinking about it too much, that it is probably easier to write it next to the subroutines it relates to whilst developing the module. Then, once that subroutine is proven working and shouldn't need touching again, shift its POD to the bottom out of the way. Whether this moving would ever happen of course is up for debate!

      For this one I don't need to worry about CPAN rendering but I still want to try and get it right for the future.

      Is it worth also trying to write tests for this - again for practice and learning - if so, where would I start?

        Is it worth also trying to write tests for this - again for practice and learning - if so, where would I start?
        Before you write the module. :)

        The main point of TDD is that writing the tests first (or at least early) affects and improves the design of the module! Writing a test first forces you to focus on interface - from the point of view of the user. Hard to test code is often hard to use. Simpler interfaces are easier to test. Functions that are encapsulated and easy to test are easy to reuse. Components that are easy to mock are usually more flexible/extensible. Testing components in isolation ensures they can be understood in isolation and promotes low coupling/high cohesion. Implementing only what is required to pass your tests helps prevent over-engineering.

        To give a clearer picture of where I'm coming from, a quote from Ten Essential Development Practices:

        The most important aspect of any module is not how it implements the facilities it provides, but the way in which it provides those facilities in the first place. If the module’s API is too awkward, or too complex, or too extensive, or too fragmented, or even just poorly named, developers will avoid using it. They’ll write their own code instead. In that way, a poorly designed module can actually reduce the overall maintainability of a system.

        Designing module interfaces requires both experience and creativity. Perhaps the easiest way to work out how an interface should work is to “play test” it: to write examples of code that will use the module before implementing the module itself. These examples will not be wasted when the design is complete. You can usually recycle them into demos, documentation examples, or the core of a test suite.

        The key, however, is to write that code as if the module were already available, and write it the way you’d most like the module to work.

        Once you have some idea of the interface you want to create, convert your “play tests” into actual tests. Then it’s just a Simple Matter Of Programming to make the module work the way that the code examples and the tests want it to.

        See also: Re: Winning people over to better development practises (TDD) and On Interfaces and APIs

        100%, yes, write tests for your module. It's unclear if you meant for the POD. I am not sure I would go that far. Besides it wouldn't go to improving your test coverage :).
      Automatic A++ for even using POD

      What are Monks using to generate readable documents from POD?
      I'm guessing something better than pod2html...

Re: [RFC] Review of module code and POD
by bliako (Monsignor) on Apr 01, 2021 at 11:50 UTC

    *I* *usually* have an init() method called automatically in new() to do any initialisation, e.g. connect and read DB. This has the advantage of being able to re-initialise any time I like after instantiation by just calling $obj->init().

    I am still undecided (or haven't read enough) on how to propagate errors, error messages and error codes to caller all the way up. The method you are using by saving the *last* error message will need some tweaking when you are nesting sub calls (and if you do that it will soon result to a paper-pushing system). Perhaps simply append to $error with obligatory clearing it when starting a-fresh. Alternatively throw an exception which is the "modern" thing to do but that imposes a whole new style which I don't see often in Perl modules (arbitrary judgement), see e.g. Exception::Class. Sometimes I return a complex data structure with error code, error message, data etc. but I only do that for complex subs.

    I appreciate Perl more when I have to work with Java and its super-strict typing.:

    sub work { my $something_wrong = 1; if( $something_wrong ){ #return Error->new(0, "because ..."); return bless [0, "because ..."] => 'Error' } # edit: changed to a hashref to show returning different types # it need not be a ref at all, e.g. can send a hash too #return [1,2,3] return {1=>2, 3=>4} } my $ret = work(); if( ref($ret) eq 'Error' ){ die "error: @$ret" } print "got some results:".Dumper($ret);
      I am still undecided (or haven't read enough) on how to propagate errors, error messages and error codes to caller all the way up

      As you can tell from my code - I am also undecided!
      This was something that I hoped people would comment on and I wasn't disappointed.

      I am fairly sure that I don't need to go as far as implementing full blown Exceptions, especially for modules that will only ever be used internally. This is the first time I have tried setting an 'error' field within the module to provide a verbose message and it seems to work quite well. Having said that, I would like to adopt more consistency both in this module and for other modules going forward.

      I do like the idea of an init(); method although I'm not sure of a situation where I would want to reconnect to the database within the same session.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlmeditation [id://11130640]
Approved by GrandFather
Front-paged by kcott
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others contemplating the Monastery: (3)
As of 2021-12-04 17:38 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?
    R or B?



    Results (30 votes). Check out past polls.

    Notices?