Best practices - if any?

AriSoft has asked for the wisdom of the Perl Monks concerning the following question:

Re: Best practices - if any? by biohisham (Priest) on Feb 20, 2010 at 22:51 UTC
"I am not writing a new module or class. Just putting couple of big subs to a safe place" If these subs are part of the same package, then this is possible: you can spread the package over several files and 'require' each filename.pl that contains part of it (file boundaries are not package boundaries in Perl). However, I can't say whether this is the right path for you, because I am not sure what your code looks like. Following is an example of the same package's subroutines being spread over a couple of files, with each sub reporting which `__PACKAGE__` it belongs to. `#FileOne.pl package ProgramSpread; BEGIN{} sub subroutine1{ print "Hello from the sub 1 in ", __PACKAGE__, "\n"; } return 1; END{}` `#FileTwo.pl package ProgramSpread; #The same package above BEGIN{} sub subroutine2{ print "Hello from the sub 2 in ",__PACKAGE__,"\n"; } return 1; END{}` `#using the package require "FileOne.pl"; #File names containing the package require "FileTwo.pl"; ProgramSpread::subroutine1(); ProgramSpread::subroutine2();` UPDATE: Though this is possible, it is still not recommended for proper design. Excellence is an Endeavor of Persistence. Chance Favors a Prepared Mind.
Re^2: Best practices - if any? by Anonymous Monk on Feb 21, 2010 at 13:47 UTC
Why `BEGIN{} ... END{}` ??
Re^3: Best practices - if any? by biohisham (Priest) on Feb 21, 2010 at 14:25 UTC
That is just to show the package skeleton in general. It is not required in this case. There are situations where you want to initialize some variables at the start or do some cleanup/deallocation at the end; for such cases you might wanna use `BEGIN{}` and `END{}`. `package Constructor_Destructor; BEGIN{ our $text; $text = "Hello from BEGIN\n\n"; } sub subroutine{ print $text; } END{ print "DESTROYING...\n"; $text=0; print "Now \$text is $text\n"; print "Exiting with $?\n" } #return 1; #did not return since I am calling from the same package #Use the package: Constructor_Destructor::subroutine();` You can also use multiple `BEGIN{}` and `END{}` blocks: the `BEGIN{}` ones execute in the order encountered, and the `END{}` ones execute in the reverse of the order in which they were defined, so that they pair up with the `BEGIN{}` blocks.
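A minimal sketch of the ordering described above, assuming a single script. The `@order` array is a name introduced here purely for illustration; it records when each block runs.

```perl
use strict;
use warnings;

our @order;   # illustrative only: records the sequence in which code runs

# BEGIN blocks run at compile time, in the order they appear.
BEGIN { push @order, 'BEGIN 1' }
BEGIN { push @order, 'BEGIN 2' }

# END blocks run at program exit, in reverse order of definition.
END { print "END 1 (defined first, runs last)\n" }
END { print "END 2 (defined last, runs first)\n" }

# Ordinary run-time code executes after every BEGIN block has finished.
push @order, 'main';
print join(', ', @order), "\n";
```

The main body prints `BEGIN 1, BEGIN 2, main`, and the two `END` messages follow at exit with `END 2` first.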
Re^4: Best practices - if any? by Anonymous Monk on Feb 21, 2010 at 14:42 UTC
Re^5: Best practices - if any? by Anonymous Monk on Feb 21, 2010 at 14:51 UTC
Re: Best practices - if any? by desemondo (Hermit) on Feb 20, 2010 at 22:05 UTC
Maybe you need to require second_part.pl in your main script? If that's not it, maybe showing us a little of your code will help clarify what you're trying to do.
Re^2: Best practices - if any? by AriSoft (Sexton) on Feb 20, 2010 at 22:22 UTC
It gives me a long list of errors like these: `Variable "$datalock" is not imported at agent.pl line 277. Variable "$debug" is not imported at agent.pl line 285. Global symbol "$datalock" requires explicit package name at agent.pl line 277. Global symbol "$debug" requires explicit package name at agent.pl line 285.` I tried do and require. I had to copy the use commands from the main part to the second one to get it compiled this far, but now it whines about many variables like: `our $datalock = MyLock::new; our $debug = 1; #Debug messages` I understand that do works in a limited lexical view, but how should I originally declare variables which span to global scope if "our" is not global enough?
Re^3: Best practices - if any? by almut (Canon) on Feb 20, 2010 at 23:22 UTC
"how should I originally declare variables which span to global scope if "our" is not global enough?" `our` is lexically scoped, and files included via `require` and friends have their own implicit lexical scope (so you'd need to redeclare your `our` variables). You might also declare your global package variables using `use vars` in order to share them across files with strictures enabled. As another alternative, just fully qualify every occurrence of a global variable; as long as they're in the main namespace, that would simply be something like `$::foo`. The advantage of the latter approach is that they're immediately evident, and that the required additional typing helps to keep them at a minimum :) That said, think twice before you do so! What is the real idea behind splitting the code? You say "putting couple of big subs to a safe place", but why are they unsafe in their original place? If modularisation/reuse is the idea, why not create proper modules? Also, having to share many variables across different files typically is an indication of bad design in the first place...
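The three options above can be sketched in one runnable script. This is an illustration under assumptions, not the poster's code: the second file is written to a temporary path so the example is self-contained, the helper sub names are hypothetical, and `$datalock` holds a plain string rather than a `MyLock` object.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

our $debug = 1;            # package variable $main::debug
use vars qw($datalock);    # declares $main::datalock for every file in package main
$datalock = 'locked';      # stand-in value; the thread uses a MyLock object

# Stand-in for the hypothetical second_part.pl.
my ($fh, $path) = tempfile(SUFFIX => '.pl', UNLINK => 1);
print {$fh} <<'END_PART';
use strict;
use warnings;

# Option 1: redeclare the lexically scoped alias in this file.
our $debug;
sub debug_via_our { return $debug }

# Option 2: fully qualify; $::debug is shorthand for $main::debug.
sub debug_via_qualified { return $::debug }

# Option 3: no redeclaration needed, because "use vars" in the main
# script already made $datalock acceptable to strict in this package.
sub datalock_via_use_vars { return $datalock }

1;
END_PART
close $fh or die "Cannot close $path: $!";

require $path;   # compiled in its own lexical scope, as described above

print "our:       ", debug_via_our(), "\n";
print "qualified: ", debug_via_qualified(), "\n";
print "use vars:  ", datalock_via_use_vars(), "\n";
```

All three subs read the same symbol-table entries that the main script set, which is why none of them triggers the "requires explicit package name" error.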
Re: Best practices - if any? by BrowserUk (Patriarch) on Feb 21, 2010 at 00:17 UTC
Files constitute scopes, even when "incorporated" via do. Just declare any `our` variables used within the subs at the top of the separate file, or better, within the sub bodies they are used in (as well as in the main file). Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice. "I'd rather go naked than blow up my ass"
Re^2: Best practices - if any? by AriSoft (Sexton) on Feb 21, 2010 at 05:36 UTC
Ok. I was already wondering what the practical difference is between "my" and "our" variable declarations at the main level. I knew that "our" goes to the symbol table, but I did not realize what that means until now. I can declare the same variable in many files and they all point to the same symbol as far as the module is the same one. Right?
Re^3: Best practices - if any? by BrowserUk (Patriarch) on Feb 21, 2010 at 06:01 UTC
"I can declare the same variable in many files and they all point to the same symbol ~~as far as the module is the same one.~~ Right?" With the above correction, yes. The only unfortunate exception is when you use threads. Then each thread inherits a cloned, non-shared copy of those globals already in existence in the thread from which it is cloned. If you want a shared global, you must post-fix it with `:shared` everywhere it is declared. Use it in some places and omit it in others, and things get really messy.
Re^3: Best practices - if any? (our) by shmem (Chancellor) on Feb 21, 2010 at 11:56 UTC
Variables declared with `our` create a package variable and a lexically scoped ~~variable which is an~~ alias to that package variable, visible through the entire scope (file or block) even spanning packages: `# file foo.pl use strict; our $foo; # that's $main::foo { package Foo; our $foo = "foo"; # package variable $Foo::foo created print __FILE__,' ',__LINE__,' ',__PACKAGE__,":: \$foo is '$foo'\n"; package Bar; print __FILE__,' ',__LINE__,' ',__PACKAGE__,":: \$foo is '$foo'\n"; } print __FILE__,' ',__LINE__,' ',__PACKAGE__,":: \$foo is '$foo'\n"; print __FILE__,' ',__LINE__,' ',__PACKAGE__,":: \$Foo::foo is '$Foo::foo'\n";` `# file bar.pl use strict; { package Bar; our $foo; # package variable $Bar::foo created print __FILE__,' ',__LINE__,' ',__PACKAGE__,":: \$foo is '$foo'\n"; } # end of scope package Foo; our $foo; # package variable $Foo::foo initialized in 'foo.pl' print __FILE__,' ',__LINE__,' ',__PACKAGE__,":: \$foo is '$foo'\n"; package Bar; print __FILE__,' ',__LINE__,' ',__PACKAGE__,":: \$foo is '$foo'\n";` `#!/usr/bin/perl use strict; our $foo = 'bar'; # package variable main::foo require 'foo.pl'; require 'bar.pl'; print __FILE__,' ',__LINE__,' ',__PACKAGE__,":: \$foo is '$foo'\n";` Running main.pl yields `foo.pl 8 Foo:: $foo is 'foo' foo.pl 11 Bar:: $foo is 'foo' foo.pl 13 main:: $foo is 'bar' foo.pl 14 main:: $Foo::foo is 'foo' bar.pl 7 Bar:: $foo is '' bar.pl 12 Foo:: $foo is 'foo' bar.pl 15 Bar:: $foo is 'foo' main.pl 8 main:: $foo is 'bar'` Updated as per JavaFan's comment below. Of course there's only one variable and its alias in the current scope.
Re^4: Best practices - if any? (our) by JavaFan (Canon) on Feb 22, 2010 at 11:43 UTC
Re^5: Best practices - if any? (our) by shmem (Chancellor) on Feb 22, 2010 at 12:49 UTC
Re: Best practices - if any? by afoken (Chancellor) on Feb 21, 2010 at 12:27 UTC
If you have so much code that you feel the need to spread it over several files, think about modularising it. In C, you had better NOT simply `#include "second_part.c"`, but instead split your code into smaller parts, compile them separately into object files, and use the linker to create a single executable. You would perhaps end up with something like main.c, inputreader.c, logger.c, smoothify.c, prettyprint.c, and perhaps utils.c and globals.c. For most of the files, there would be a corresponding .h file containing the "public" interface, i.e. those functions that are called by one of the other files (logger.h would perhaps contain something like `extern int initlogger(const char * logfile);` and `extern void log(int level, const char * message);`, while globals.h would instead declare the few needed global variables, e.g. `extern int verbose; extern char frobnicate;`). All functions (and global variables) not needed outside one of the source files would be declared as static, so that the linker does not try to resolve those names. In Perl, you would do pretty much the same: Put groups of functions into modules, have a public interface for each module (i.e. `use Exporter` for non-OOP code), and have a short main program that delegates to the modules. Because Perl already has a lot of modules, use a unique prefix for your module names. If you have no better idea, use the application name and / or your last name or your company's name. You would end up with AriSoft::Frobnicate for the main routines, AriSoft::Frobnicate::InputReader, AriSoft::Frobnicate::Logger, AriSoft::Frobnicate::Smoothify, AriSoft::Frobnicate::PrettyPrint, AriSoft::Frobnicate::Utils, and perhaps AriSoft::Frobnicate::Globals.
Thinking about "big subs": Some big things are a pleasure to the eye, but "big" subs spanning more than one or two screens (i.e. more than 50 lines) are a sure sign of wrong design. You will become confused when you need to change the code, you will pile up status variables and obscure if-then-else constructs, and perhaps you will even abuse goto. Split them into smaller, specialised functions. This is pretty much independent of Perl; you have exactly the same problem in nearly every other language. I'm currently earning my money by refactoring C-like code written by an ungifted amateur, full of bugs, copy-and-paste, gotos and cargo cult, without any proper indenting, with functions spanning literally thousands of lines, with loops and ifs nested more than 10 levels deep, and of course without any useful documentation. I've removed more than 30% of the code without any loss, and I will remove about another 30% before it goes back to the production machines. During the process, the number of functions will at least double. That code is a real nightmare, no one has a clue about what it does, and just deleting the crap and starting from scratch is not an option. All we can do is clean up every piece of code we need to touch, and hope for a slow improvement over time. Learn from that, start writing clean, structured, and documented code NOW. Alexander -- Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
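The Perl half of that advice (a module with a public interface via `Exporter`, plus a short main program) might look roughly like the sketch below. Only the module name AriSoft::Frobnicate::Logger comes from the post; the sub names, the log format, and keeping everything in one file are assumptions made so that the example runs as is.

```perl
use strict;
use warnings;

# In a real project this block would be its own file,
# lib/AriSoft/Frobnicate/Logger.pm, ending in "1;".
{
    package AriSoft::Frobnicate::Logger;
    use Exporter 'import';

    # Public interface: only these names can be imported by callers.
    our @EXPORT_OK = qw(init_logger log_message logged_lines);

    # Lexical visible only inside this block, much like a C static.
    my @lines;

    sub init_logger  { @lines = (); return 1 }
    sub logged_lines { return @lines }

    sub log_message {
        my ($level, $message) = @_;
        push @lines, _format($level, $message);
        return scalar @lines;
    }

    # Leading underscore marks a private helper; it is not exported.
    sub _format {
        my ($level, $message) = @_;
        return "[$level] $message";
    }
}

# Short main program that only delegates to the module.
# With a separate .pm file this line would instead be:
#   use AriSoft::Frobnicate::Logger qw(init_logger log_message logged_lines);
AriSoft::Frobnicate::Logger->import(qw(init_logger log_message logged_lines));

init_logger();
log_message(1, 'starting up');
log_message(2, 'reading input');
print "$_\n" for logged_lines();
```

Here `@EXPORT_OK` plays the part of the .h file, while the lexical `@lines` and the unexported `_format` correspond to what would be declared static in C.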
Re^2: Best practices - if any? by AriSoft (Sexton) on Feb 21, 2010 at 19:24 UTC
Let's test this theory. Here is one authentic sub from my project. Why should I break it into pieces, and how does this help to keep it with other subs in the same file? I still prefer to keep this in a separate file like a module. Read more... (12 kB)
Re^3: Best practices - if any? by mirod (Canon) on Feb 22, 2010 at 10:18 UTC
I am not going to get into the details of this, but just a couple of quick stylistic comments. Instead of including anonymous subs in your hashes, why don't you name them? Instead of `row => sub { # complex code here that AFAIK creates a record }`, write `row => \&create_record,` and then define `create_record` a little further down. This way you're giving a name to that sub, and it becomes easier to see the data structure, without the big, often irrelevant, blob of code right in there. `$$flight{Reg};` can be written `$flight->{Reg};`, which is especially handy when writing `$cols->[0]` instead of `@{$cols}[0]`. The commented-out code does not belong in there: if you have to remove code, just do it; the source-control system will keep the old version.
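A small sketch of the first two suggestions, using made-up data. The `%handlers` table, the body of `create_record`, and the sample flight are assumptions for illustration, since the original sub is not shown here.

```perl
use strict;
use warnings;

# Hypothetical handler; the real one would hold the complex record-building code.
sub create_record {
    my ($flight) = @_;
    return { registration => $flight->{Reg} };   # arrow form of $$flight{Reg}
}

# With named subs, the shape of the dispatch table is visible at a glance.
my %handlers = (
    row => \&create_record,
);

my $flight = { Reg => 'OH-ABC' };
my $record = $handlers{row}->($flight);

my $cols  = [ 'Reg', 'Type' ];
my $first = $cols->[0];   # one element; @{$cols}[0] is a one-element slice

print "$record->{registration} $first\n";
```

Naming the sub also means it can be called and tested on its own, independently of the hash that refers to it.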
Re: Best practices - if any? by cdarke (Prior) on Feb 21, 2010 at 08:39 UTC
I'm going to recommend this node to anyone who questions the advice "don't use global variables".
Re^2: Best practices - if any? by AriSoft (Sexton) on Feb 21, 2010 at 19:01 UTC
"don't use global variables". I have to comment that `our` variables are far from globals. It is impossible to keep all data in function parameters. If you are passing refs you are reinventing globals :-) Passing refs automatically by the compiler is called OOP. I will use "goto" and "our" without any pain, and I also put parentheses on a "wrong" line. With perl you can do it in many ways, but I am still missing inline functions.
Re^3: Best practices - if any? by tfrayner (Curate) on Feb 22, 2010 at 13:13 UTC
I must have been dreaming all these years I've kept all my data in function parameters. As for passing refs reinventing globals, I think you're mistaken. Maybe it makes sense if you're in the habit of addressing your variables by memory location (as in C; pointer arithmetic, for example). The only time a reference is global in Perl is if you've stored it in a global variable (although I'm sure someone is going to pop up with a counter-example). However, disregard that argument for a moment. You're definitely missing one of the key advantages of parameter passing - it effectively documents the flow of a variable through your code, so that when you change a variable you can trace the effects of that change. The problem with globals is that you can change the value of a variable and then find it hard to figure out what's affected downstream; and the bigger the codebase grows, the harder it becomes. Tim
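To make that point concrete, here is a deliberately tiny sketch. The discount calculation and every name in it are invented for illustration and have nothing to do with the original project.

```perl
use strict;
use warnings;

# Global style: nothing at the call site shows that $discount_percent matters.
our $discount_percent = 50;

sub price_with_global {
    my ($price) = @_;
    return $price * (100 - $discount_percent) / 100;
}

# Parameter style: every input is visible wherever the sub is called.
sub price_with_param {
    my ($price, $percent) = @_;
    return $price * (100 - $percent) / 100;
}

my $before = price_with_global(100);
$discount_percent = 10;                 # a change made somewhere far away ...
my $after  = price_with_global(100);    # ... alters an identical-looking call

my $explicit = price_with_param(100, 50);

print "global: $before then $after; explicit: $explicit\n";
```

The two `price_with_global(100)` calls look the same but return 50 and 90, whereas the parameter version can only change when its arguments do.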
Re^4: Best practices - if any? by AriSoft (Sexton) on Feb 22, 2010 at 15:29 UTC
Re: Best practices - if any? by Anonymous Monk on Feb 22, 2010 at 20:55 UTC
To put your functions and package global variables into their own package/module, just create a `Foobar.pm` file (in the same dir as your script) and put your subs in it like so: `package Foobar; use strict; use warnings; our $some_global_var = 8; sub some_sub { print "Hi from Foobar::some_sub().\n"; } 1;` Now, in your script, to use that new module you just created: `#!/usr/bin/env perl use strict; use warnings; use lib '.'; use Foobar; Foobar::some_sub(); print "The global var is $Foobar::some_global_var.\n";` That's it.

