Building a Development Environment on Ubuntu
by Xiong (Hermit) on Mar 17, 2010 at 12:42 UTC
Here is an environment for developing applications in Perl, suitable perhaps for the newcomer. It seems ironic that the rawest newcomer to the Perl Community needs a stable, independent development environment more than the grizzled guru; yet constructing that environment is not an entirely straightforward task (judging from the many blind alleys I have blundered into).
We all have different situations so let me begin by telling mine. If my shoes fit, you are more than welcome to try them on. This discussion may prove quite boring to the experts, of whom I beg pardon.
I'm a solo developer on my own time; my current project is a web application to run on a virtual host to which I have shell access and read-only access to root. My development platform is an old trashy Intel laptop, considerably upgraded, running Ubuntu 9.10. This is a Debian-based Linux distro with the GNOME desktop. While I've written code all my life, it's been to support hardware development and operations. I have about 5 years in, fooling around with Perl; this is my second major project. Also, I'm fairly new to Linux in general.
I have a bias in favor of GUI tools when they are suitable. I use the command line as appropriate but I feel there's some justification in calling GUI an advance in productivity, especially when working with tools infrequently or initially. Also, I prefer when possible not to do things I don't understand, at least at some level. I feel more comfortable with command line invocations when I understand the implications of the funny switches.
The Lesser Issues
When I first began to write Perl scripts, I didn't give much thought to it. A perl executable is installed by default with Ubuntu; any text editor will work; compiling and linking is not required. I was well underway before I realized that I'd made many poor choices. Here are better ones.
How can I manage all my files?
By this, I mean low-level management: starting, editing, and finding them.
The default text editor, gedit, is too basic. I settled on Geany, which is intended for writing code. Geany has symbol completion, plenty of formatting options, syntax highlighting, and many other handy features. Importantly, it has the concept of projects.
A Geany project is nothing more than what you put into it. When you create a new project (Project -> New), Geany will create a file by that name, say myproject.geany. You may then open as many files as you like, each in its own tab. Don't close the tabs; close the project. When next you open the project, all the same files will be there in the same tabs and in the same state. Handy!
You would do best to ignore the items in the Build menu, which make more sense for compiled languages. Note that Geany can be heavily customized. For example, I edited ~/.config/geany/templates/filetype.perl to create a custom template for my .pl scripts. You are well advised to read the HTML documentation available from the Help menu.
I found I needed both the GUI file browser, Nautilus, and a set of command line tools to work with files effectively. Nautilus is good for showing an overview of a folder and has tabs; unfortunately, these cannot be saved as a project group. Some operations are easier to do there as well. However, the Search function is rather slow.
Basic command line tools include ls, cd, cp, mkdir, rm, rmdir, chmod, chown, sudo, ln, locate, find, and the all-important man. I got much education by typing man command for each of these and many more.
A frequent issue is file permissions. A file may be readable, writable, and executable, or any combination of these; and each permission is set separately for the owner (you), the group (usually not important if you're solo), and everyone else. In Nautilus, you can make a Perl script executable quickly with Ctrl-I (the Properties dialog, Permissions tab) but for more advanced permissions work, you'll need chmod and chown. File permissions sometimes change without warning (when you do other stuff inadvertently, like restoring from backup) and can cause no end of trouble if you're not checking for them. Study up.
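A minimal sketch of the chmod side of this, assuming a throwaway folder and a placeholder script named hello.pl (both invented for illustration):

```shell
# Sandbox so nothing outside a throwaway folder is touched.
demo=$(mktemp -d)

# A stand-in script; the file name and contents are hypothetical.
printf '%s\n' '#!/usr/bin/perl' 'print "hello\n";' > "$demo/hello.pl"

chmod 644 "$demo/hello.pl"   # owner read/write; group and everyone else read-only
ls -l "$demo/hello.pl"       # mode column starts with -rw-r--r--

chmod u+x "$demo/hello.pl"   # add execute permission for the owner only
ls -l "$demo/hello.pl"       # mode column now starts with -rwxr--r--
```

Changing who owns a file (chown) normally needs sudo unless you are only reassigning your own files to yourself.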
Two other text tools deserve mention: Bluefish and PodBrowser. Bluefish, I feel, is better for editing HTML than Geany. POD, the standard documentation format for Perl stuff, can be written in any text editor but oddly enough, there don't seem to be any really great applications for displaying it. PodBrowser is okay; you may just use gedit.
How can I track changes to my code?
My old standby was always to copy my project folder, name the copy by number or by date, and move on. Then I discovered git. There are other tools for version control but I feel this is best for our needs -- and a world beyond just copying files.
There is much to learn about git; much more than I can squeeze into a single post. The basic idea is to create a git repository inside every project folder; this will appear as a hidden folder, .git, which you can see in Nautilus with Ctrl-H or on the command line with ls -A (but which you do not want to touch directly). Whenever you make a small number of changes to your project, commit the changes to the "repo". Later, you can see what you did, how you did it, and (if you've written good notes) why you did it. You can even go back to an earlier version of your project. So git is an invaluable tool.
You can work with git from the command line and, as usual, you will have more power there; but I find two tools sufficient in most cases: git-gui and gitk. Use git-gui to commit changes; use gitk to see the history of those changes. You will likely have to study up on git to get the hang of it but I promise you the investment will be more than worth it. Tip: It's better to create a repo for an entire project folder and exclude files you don't want to track.
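For those who want to see the command line side, here is a minimal sketch of the same workflow, assuming git is installed. The folder, file names, identity, and commit message are all invented for illustration. It creates a repo for a whole project folder, excludes one kind of file through .gitignore, and commits the rest:

```shell
# Throwaway project folder; every name below is hypothetical.
proj=$(mktemp -d)
cd "$proj"

git init -q                                   # creates the hidden .git folder
git config user.name  "Example Developer"     # identity recorded in each commit
git config user.email "dev@example.invalid"

printf '%s\n' '#!/usr/bin/perl' 'print "hello\n";' > hello.pl
printf 'scratch notes\n' > notes.tmp

# Exclude files you don't want to track by listing patterns in .gitignore.
printf '%s\n' '*.tmp' > .gitignore

git add .                                     # stage everything that is not ignored
git commit -q -m "Add hello script and ignore rules"

git ls-files                                  # lists tracked files only
git log --oneline                             # one line per commit
```

Running git-gui or gitk from inside the same folder should then show this one commit.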
How do I organize all my files?
This was the topic of my first post to PerlMonks and oddly, I still haven't got the definitive answer; better, I'm no longer worried about it. When you have dozens, eventually hundreds of files, no over-arching scheme is going to last. Be willing to create many folders and try to keep things together.
The key to organizing files is the symbolic link or symlink. By creating links to files and putting them in different places, you fix the problem of wanting a thing to be in two (or more) places at once.
You can create a symlink on the command line with ln -s and view its target with ls -l. You can also make a symlink in Nautilus by Shift-Ctrl-dragging a file or folder.
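A minimal sketch of those two commands, run in a throwaway folder with made-up file names:

```shell
demo=$(mktemp -d)
mkdir -p "$demo/projects/webapp" "$demo/docs"
printf 'design notes\n' > "$demo/projects/webapp/notes.txt"

# ln -s TARGET LINKNAME: the file stays put; the link makes it appear elsewhere.
ln -s "$demo/projects/webapp/notes.txt" "$demo/docs/webapp-notes.txt"

ls -l "$demo/docs"                       # the listing shows link -> target
readlink "$demo/docs/webapp-notes.txt"   # prints only the target path
cat "$demo/docs/webapp-notes.txt"        # reads the original through the link
```

Using an absolute path for the target, as here, keeps the link valid even if the link itself is later moved to another folder.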
Don't agonize over where to put (most) things. Put them where it seems reasonable and symlink them to other places where they also seem reasonable. Note that it can cause some trouble if you navigate a heavily symlinked directory structure, say, within one of your scripts; or you just forget which is the original and which the link. Don't go crazy.
Tip: I like to create a folder ~/sym, throw symlinks into it to any obscure item, and then put symlinks to ~/sym wherever convenient. This is too sloppy for files your scripts will work with but fine for stuff you need to look at or perhaps edit occasionally.
Another tool that helps with organization is File Roller. This is how you can make "tarballs" (compressed archives) of entire folder trees. Although git will keep you from having to make too many manual backups, it's still good to roll up your folders from time to time.
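The same kind of tarball can be made from the command line with tar. This is only a sketch, assuming a made-up project folder called myproject inside a throwaway folder:

```shell
demo=$(mktemp -d)
mkdir -p "$demo/myproject/lib"
printf '%s\n' 'print "hello\n";' > "$demo/myproject/hello.pl"
printf '1;\n' > "$demo/myproject/lib/Demo.pm"

# -c create, -z gzip-compress, -f archive file. -C changes folder first, so
# the paths stored in the archive are relative (myproject/...).
tar -czf "$demo/myproject-backup.tar.gz" -C "$demo" myproject

tar -tzf "$demo/myproject-backup.tar.gz"   # -t lists contents without unpacking

# Unpack into a fresh folder to confirm the archive really restores.
mkdir "$demo/restore"
tar -xzf "$demo/myproject-backup.tar.gz" -C "$demo/restore"
```

Listing and test-unpacking an archive before relying on it is cheap insurance.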
Finally, and most important, back up your files! By this I mean you should copy them to an external hard drive, one you can unplug; or better, burn them to DVD. This can't be said enough: A good set of backups will get you out of more trouble than a week's worth of tears in a chat room. Back up regularly.
How do I get all this stuff?
All the tools I mention in this post (with the exception of perl itself) can be got through the Debian package manager, dpkg. There are several command line front ends, which I encourage you to look at; but the easiest way to install applications of any kind is probably to use the GUI Synaptic front end, found under System -> Administration -> Synaptic Package Manager. This will download and install for you most things that you will want on your Ubuntu system.
You do not want to use Synaptic to install perl itself! Ubuntu comes with perl installed already; but you don't want to use that for your laboratory. (More on this later.)
How do I partition my hard drive?
Installing Ubuntu is beyond the scope of this post but there is one key point that any developer needs to take into consideration and that is partitioning.
If you're a newcomer (and I don't think you'll have read this far if not), then you may well have a rather crude partitioning scheme. I'm going to suggest you back up all your files, clear off your hard drive, repartition it, and restore files from backup. If you mess up badly (as I did, and as is easy to do), you'll be doing this anyway some day. You may as well do it before things get complicated.
You can repartition an existing system "hot" but that's a bit risky and very slow. You're better off doing this at install time.
Each system is different and I can only tell you how I spent my 111 GB of storage:
Why so many partitions? There's a reason for each and no simple explanation. Consider this only a starting point for your own meditations; study up. All these names except one are standard and you can find info on them. I made /ark to store archives and rarely-edited files, of which I have several years' worth. Next time I do this, I will probably also create a /lab, /rad, or /run partition, for reasons which will be clear soon.
Building a Development perl Executable
Since perl is indeed installed by default with every Ubuntu installation, it seems reasonable to use it to run your development scripts. This is not bad of itself but next, you will want to use some module you download from the excellent CPAN. This is a vast collection of useful (and sometimes not-so) stuff and you will want to taste it. That will also usually be okay ...
... until the day comes that you happen to install a not-so-good module that conflicts with something that Ubuntu needs to run properly. You see, perl is not installed by default because Somebody thought you might like to play with it. Various parts of the system rely on Perl to work. You might be able to back out of that faux pas and you might not. I didn't, which led me to a clean install and a resolution to Never Tamper with the System Perl. (Others disagree but then, others are not over to my home to fix my broken system.)
So, must needs build a second, development perl. This safe practice is no doubt quite simple for the bewhiskered Saints but for a relative newcomer, I found it tough. I hope you'll find it easier to follow.
Download the Tarball
Go to CPAN and download the perl source tarball for the version of your choice. You probably want the latest 'maint' version; maybe not.
Unpack the tarball with File Roller and do not throw away the tarball. To be human is to err and the installation process changes files in the unpacked folder. You will want to be able to throw it out and re-unpack.
Create the Base Folder
You can put this anywhere but since you'll be going here a lot, you may not want to put it in your home folder; I didn't. I created a folder in the file system root at /rad/perl. (For "radical", anticipating a third, tamer build elsewhere.)
If you try to put this in your root, you may have some trouble, since as a user, you can only create things in your home folder. This is where the command line comes in handy:
The sudo means you're commanding as superuser, like root. Careful what you do this way. The -p means to create the parent folder, too; the -R means to apply the ownership changes recursively on all contents. You will probably want to reissue the chown a few times, whenever you get a message that something isn't working because of permissions. My username is 'xiong'; I might as well have chown'd to xiong:xiong but I'd created the 'developer' group on my machine previously. (You needn't.)
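The exact commands are not reproduced here, so the following is only a sketch of the two operations just described. It runs inside a throwaway folder standing in for the file system root and therefore needs no sudo; the commented lines at the end are a guess at the shape of the real invocation, not a record of it.

```shell
base=$(mktemp -d)              # stands in for the file system root

# -p creates missing parent folders too: both rad and rad/perl in one step.
mkdir -p "$base/rad/perl"

# -R applies the ownership change to the folder and everything beneath it.
# Here the owner and group are the current user's own numeric IDs.
chown -R "$(id -u):$(id -g)" "$base/rad"

ls -ld "$base/rad" "$base/rad/perl"   # owner and group columns show the result

# On the real system the equivalent needs superuser rights, something like:
#   sudo mkdir -p /rad/perl
#   sudo chown -R xiong:developer /rad
```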
Note that I've chosen to build perl with thread support. This is still considered experimental but -- did you know? -- the system perl shipped with Ubuntu has threads enabled.
Now comes a very long list of questions, which must be answered somehow. In most cases, just hitting return (accepting the default) works. Here are some of the changes I made (and didn't):
Note that I (eventually) accepted the default here. This is for other libraries needed to build perl, not libraries of Perl modules, which I wanted to keep separate from the system.
Again, the default is fine. You will see references to compiling with debugging support but you don't need to do this in order to use the Perl debugger.
Note that the hashbang line that starts your scripts must now be #!/rad/perl/bin/perl, not the #!/usr/bin/perl that you may see elsewhere (or used with success before). Later, I'll show you a way I shortened that.
I prefer a shorter name, easier to type and grok.
At this point, there's a test for setuid. Don't worry about it. Answer 'no'; then you're asked if you want to emulate; say 'yes'.
Again, I don't like excessively long path names and I don't think I'll be installing another version on top. But this may have been a mistake; I think cpanp expects there to be another level of depth. (Anybody?)
After several attempts, I hit on this safety valve. You see, when you build perl you are also setting the default paths for @INC. You can alter this at run time but you cannot change the default without rebuilding. By including a spare, I was later able to symlink to modules that installed in an unexpected location.
Absolutely not. The whole point of the exercise is to leave the system build untouched. As usual, hitting return is almost always best.
At this point, there was a great amount of output, which I might have suppressed had I not an obsession with such stuff. Eventually, it's done and we're back to the command line prompt.
After each such command, there will be much business. Don't be alarmed that it throws all sorts of warnings. So long as it completes with success, it's okay. When it's all done, chown again for good measure.
Now for a little manual test of our own. Write this script and change permissions on it so that it's executable:
Then run it with ./hello.pl.
(Purists will complain that this is a lot of work to avoid "\n" but I like it.)
Create a symlink as /run to /rad/perl and then you can change the hashbang to #!/run/bin/perl.
There are a couple of points to this. By building in /rad/perl instead of just plain /rad, the Configure operation defaults to a shorter path for those module libraries. By symlinking from /run, I can change the link later to point to a different build, leaving my scripts untouched. This makes more sense when you see that perl might be anywhere on my remote webhost, probably /usr/bin/perl.
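A minimal sketch of that repointing trick, run in a throwaway folder that stands in for the file system root. The second build location (tame/perl) and the placeholder files standing in for the perl binaries are assumptions made for illustration:

```shell
demo=$(mktemp -d)              # stands in for the file system root

# Two pretend builds, each with a placeholder where bin/perl would live.
mkdir -p "$demo/rad/perl/bin" "$demo/tame/perl/bin"
printf 'radical build\n' > "$demo/rad/perl/bin/perl"
printf 'tame build\n'    > "$demo/tame/perl/bin/perl"

# Scripts would name the interpreter through the link: #!/run/bin/perl
ln -s "$demo/rad/perl" "$demo/run"
cat "$demo/run/bin/perl"       # prints: radical build

# Repoint the link. -f replaces the old link; -n stops ln from following
# the existing link and creating the new one inside its target folder.
ln -sfn "$demo/tame/perl" "$demo/run"
cat "$demo/run/bin/perl"       # prints: tame build
```

Nothing that refers to run/bin/perl has to change when the link moves, which is the whole appeal.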
Congratulations! The new perl works. Now for the hard part.
There would be no point at all to any of this if I didn't want to be able to grab CPAN modules and install them into my development environment. This was not quite as easy as it might be.
I install modules from CPAN with the cpan script. But there's already such a script in /usr/bin and if I type cpan at the command line, that's what I'll get. I don't want that; it will try to install into my system. So, I renamed the system cpan and created a symlink to the cpan installed during the perl build.
There are actually four related scripts:
Each one has a nasty little gubbins up top:
This is only needed if you tend to invoke your scripts from some shell that doesn't understand the hashbang line. (The default shell, bash, is fine with it.) Obnoxiously, when the gubbins is built, the second hashbang isn't even right. Cut the whole thing down to:
Cut the gubbins, rename, and symlink all four scripts.
Now, both cpan and cpanp require various modules to run with all features and unfortunately, neither ships with all of them. So, you start off with somewhat broken tools, which is all you have to fix them with. I cannot say exactly in what order, but I ran first one, then the other, installing what they would; they need different modules and work differently. There may be a way to get just one to load right up and if Somebody will tell me about it, I'll gladly edit this node.
Configuring cpan and cpanp is a bit tricky. They need to know where to put their files and there's quite a bit of typing required. I suggest you copy and paste. Again, if I don't mention it, a simple return (accept default) is enough.
Most of the choices are a matter of taste. This process is both forgiving and unforgiving. If you make the wrong entry, you can't go back; but you can go around again. Difficulty: The next time through, the default will be whatever you chose last time, not the original default.
No; we want to reconfigure.
You may disagree but it's easy to forget to save preferences changes, which is what we're setting here.
If it's a prerequisite, well, we'd better build it. I don't see any point in downloading something and then not keeping it but you may disagree.
This is the first biggie. Note well that although I've broken the line into parts for readability, this will not work for you when actually doing the configuration! Nor will it help to put backslashes in. It must all be on one line. Suggest you copy and paste the whole mess to a scratch file, make any changes to the paths first, then take out the whitespace, copy again, and paste into cpan.
Same drill but notice that there are important differences. You want the double dashes and you want the lowercase. Keep it all on one line!
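The scratch-file step can be done mechanically. This is only a sketch: the two settings below are placeholders chosen for illustration (PREFIX= and LIB= are ordinary Makefile.PL arguments, but these are not the actual values used in this build), and the point is just the joining of several lines into one.

```shell
scratch=$(mktemp)

# Placeholder settings, broken across lines for readability while editing.
cat > "$scratch" <<'EOF'
PREFIX=/rad/perl
    LIB=/rad/perl/lib
EOF

# With no command given, xargs echoes its input as a single line, collapsing
# every run of spaces and line breaks into one space.
one_line=$(xargs < "$scratch")
printf '%s\n' "$one_line"      # prints: PREFIX=/rad/perl LIB=/rad/perl/lib
```

Note that xargs treats quotes and backslashes specially, so this shortcut suits only plain values like these; copy the printed line and paste it at the prompt.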
I like the colors but beware that the defaults are kinda ugly and hard to read. If you want normal output plain black on white, don't choose black on_white; just say black. Say black on_blue, not black on blue, for black on blue.
You must say yes or everything fails, if this is your first time through. You'll be prompted through several menus of continents, countries, and eventually CPAN sites. Choose several; you never know when one will go down. I googled to see which were geographically close. I favor .edu mirrors but I throw in some commercial ones. You want a mix of ftp and http sites.
At this point, you're done; but cpan will make all kinds of complaints about various modules being unavailable -- loud complaints, if you've enabled bold_red on_white for such things. Install what you can; quit every so often and restart.
Unlike cpan, cpanp gives you some sub-scripts you can choose from a menu, so you don't have to do it all in one shot.
Again, most are a matter of taste and most defaults are okay.
*** Here you enter the whole mess on one line, as above in makepl_arg.
$$$ Here the other whole mess, as mbuildpl_arg.
Again, choose some CPAN mirrors.
Again, install what you can and try cpan again.
I wish I could offer a better process than fudging back and forth between cpan and cpanp until they both work; but that's what I did. No amount of hammering on just one or the other worked.
I ran into a nasty bit along here, where somehow I failed to build perl with all the right paths in @INC; cpan insisted on building into an unreachable folder. That's where the spare folder, /rad/lib/more, came in useful. I never created that folder as such; I just made a symlink with the unreachable folder as its target. Expect the unexpected.
If you go wrong somewhere, never fear. Throw all your builds out, if you must; and go back and untar the perl tarball. I went through and salvaged some configuration files from each attempt as a guide to the next; but I'm not sure to recommend that, since I fear I may have propagated mistakes that way, too. Don't lose heart; you'll get it done eventually. It took me only 13 tries.
Install some more modules. Write some scripts. Have Fun!
Monks of much greater serenity than I may scoff not only at the extreme length of this post but also at the weasel words and general fudgery-doo.
First, I apologize to anyone who felt he had to read it all. I offer the defense that I would have loved dearly to have seen the whole gross tome six months ago. I've spent literally hundreds of hours studying up to get this far and no matter what I read, it contained only a small piece of the puzzle. Although I know much more now about building a development environment than I did, I would have been content to have it all handed to me and been free to concentrate entirely on learning Perl.
Second, my regrets to anyone offended due to the time spent explaining not-Perl stuff. I found that stuff essential to doing anything with Perl and it took me months to figure it out, a nybble at a time.
Third, my deepest sympathies to those who come looking for hints on how to develop on anything but Ubuntu, or for any other purpose. When I searched for web pages with practical advice (outside of the standard docs), I almost invariably stumbled on instructions for installing ActivePerl on a BeOS server through telnet. There don't seem to be many tutorials for my situation, since perl is installed by default and anyway, Ubuntu is Linux and any build here must be a snap.
Fourth, I'm more than willing to incorporate reasonable suggestions and sage corrections from anyone who cares to provide them. I'm only too aware of how little I understand what I just did and I'm convinced I only did so-so, perhaps less well considering the investment.
Please take note that I owe far more than I can possibly repay to the many Saints and lights who made these tools available and those who condescended to explain them to me, sometimes five or six times. If on occasion these tools have made a poor cut, let us blame the hand that held them.
s(sudo ./Configure)(./Configure) per duelafn ++
Correct my remotehost access.
There are many good reasons to partition. My hope in showing off my table is to provoke the newcomer to study up and think about a plan beyond /, /home, and swap. I agree with wazoox that this is the absolute minimum. Would anyone like to suggest a starting point for the newcomer to read more about partitions?
Since as an Ubuntu noob I use bash exclusively, is there any reason to retain the gubbins? Has it any purpose? I admire the gubbins as I admire any cute dodge. I detest the gubbins as I loathe obfuscated code in production. It's an extremely clever way to fix an issue that should never have arisen. I wasted much time studying the gubbins before I was certain how to dispose of it and that alone incurs my resentment. I'm not entirely sure of the effect of the second (third?) hashbang (perhaps none) but seeing it there gives me the willies. I should probably extend the list of scripts to trim.
There is probably no reason to sudo ./Configure; fixed. Truth be told, I sudo su and did the whole thing from a root shell. (Please don't hit me!) Even worse, I gksu nautilus. This is a step down from logging into GNOME as root. Sorry; I'm not quite in the spirit of doing admin stuff as a humble user on my own machine. I'd probably run as root all day long if it wouldn't so irk my fellow Monks. I'm the sort of person who rarely sets write-protect tabs and often runs heavy equipment with the chain guards off. I admit that age has made me a little more cautious.
I need clarification on using aptitude to install perl. How would I use it to install a second executable for development only? How is that better than downloading the tarball from CPAN? Do I get more stuff that way? Will cpan and friends work better out of the box? What's the win? How many steps above do I bypass this way? When I screw up the build, how do I re-unpack the download and go around again? Note my bias in favor of GUI tools appropriate to the newcomer. I fear that using Synaptic will do nothing at best and wreak havoc at worst. I surely wouldn't want to mislead the newcomer who, like me, has learned to substitute Synaptic for the many apt-get suggestions he reads. I'll be happy to incorporate an aptitude call, though, if there's a commensurate payout.
Originally, I chose /dev, for "development", as my development folder. (You see how raw I really am.) Then I chose /lab, for "laboratory". Then I realized that eventually I would build at least two perl environments (in addition to the preinstalled system perl): one latest-and-greatest and one to match exactly that installed on my remotehost. I chose /rad for "radical" and put a symlink at /run (for "run"), which keeps paths short and gives me some flexibility.
I definitely don't want to install to /usr or any subfolder thereof. That folder is full (> 200,000 items) and I've already put /usr/share on its own partition, also full. Besides, it is, as rowdog points out, the default for many things. I want to be very sure that nothing defaultish hits into my development environment or vice versa.
Yes, /opt (for "optional"?) seems reasonable enough. Perhaps when I feel confident enough to rewrite this personal saga as an authoritative tutorial I'll change over to that.
I very rarely invoke perl from the command line; I'm not a one-liner guy and I expect my script to find its interpreter. I can see some benefit to a modified PATH. I'll have to study up. rowdog++
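For anyone curious what a modified PATH would look like, here is a minimal sketch. It uses a throwaway folder and a stand-in file in place of a real perl binary (both assumptions for illustration), and shows only that the shell finds the first match on PATH:

```shell
demo=$(mktemp -d)
mkdir -p "$demo/rad/perl/bin"

# Stand-in for the development perl: a tiny script that only identifies itself.
printf '%s\n' '#!/bin/sh' 'echo "development perl stand-in"' > "$demo/rad/perl/bin/perl"
chmod +x "$demo/rad/perl/bin/perl"

# Prepend the development bin folder so the shell searches it first.
PATH="$demo/rad/perl/bin:$PATH"
export PATH

command -v perl   # prints the path of the perl the shell would now run
```

On a real system the same assignment, with the real bin folder, would go in ~/.bashrc or ~/.profile to make it stick. A hashbang line with a fixed path is not affected by PATH; only commands typed by name (or a hashbang of the form #!/usr/bin/env perl) are.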
I certainly need to add a paragraph about development libraries and XS modules. These headers can be installed easily with Synaptic; it takes only a bit of research to figure out which packages are needed. Unfortunately, cpan does not resolve these non-Perl dependencies nor does it emit illuminating errors when they're missing. I also want to expand the section on cpan to include recovery from failed installs.
Upvoted scorpio17's comment because it's nice to see alternatives.
Um, mpeg4codec, you and perhaps duelafn are pushing Debian packages as a way to install perl and CPAN modules. My understanding is that all dpkg tools will put things in the default location, which is exactly what I don't want. If I'm wrong, tell me and I'll study up. If using aptitude, apt-get, or better, Synaptic itself will get anything installed with less hassle and/or more safety (particularly those stubborn XS modules), I'll not only rewrite this post, I'll tear down my devel-environ and start over.
Is it rational to construct and upload some sort of development environment starter package?