
PM Confessional

by thezip (Vicar)
on Jan 15, 2011 at 00:54 UTC ( [id://882430]=perlmeditation )

When I was a newb contractor about 15 years ago, I wrote a Perl script that, as one of its steps, deleted all of the files in the current working directory. When I ran the script the first time (as root), I noticed that the operation was taking quite a bit longer than I expected.

Later, I noticed that poor Charlie was frantically trying to locate the backup for that server. Apparently cwd was somehow '/'. I don't know how that could have happened!

Sorry Charlie...

What can be asserted without proof can be dismissed without proof. - Christopher Hitchens

Replies are listed 'Best First'.
Re: PM Confessional
by dHarry (Abbot) on Jan 15, 2011 at 14:17 UTC

    Ah confession time...

    I once had to port an environment from Unix AIX to a VAX OpenVMS machine. Believe it or not, it was about 15 years ago :) Creating databases, dumping and loading data, modifying scripts and compiling tons of SW. At the end of a long day it all worked. "Let's just clean up this temporary directory and go home," I told my mate. Oops! I know the feeling, "hmm, this takes longer than it should..." Next day it took about two hours to do everything again... I did learn from it, as it never happened again (so far).

    Later a senior engineer explained the 4-factor to me. When you do overtime and get tired, everything takes 2 times as long and you make 2 times as many mistakes. 2*2=4. It sounds silly, but the 4-factor is a useful concept to know, especially if you're managing a project and ask the SW engineers to do overtime to meet some deadline.



      That reminds me of something a senior engineer told me about 20 years ago on the topic of estimating effort: double your first number and go to the next larger unit. So if your first reaction is 30 seconds, better say 60 minutes. Four hours is more likely to be 8 days, and so on. I'm still crap at estimating individual task effort, but I'm much better at total project cost at least.

      I'd like to be able to assign to an luser

        These sorts of rules of thumb sound half-funny, half-useful, until the day comes that you apply them and still wind up behind the 8-ball...

        Er, not that that's ever happened to me, three times in a row, or anything. No siree...

        Ours was multiply by two and add 40.


Re: PM Confessional
by raybies (Chaplain) on Jan 15, 2011 at 21:34 UTC

    My teamlead, who's a great C# programmer but doesn't know squat about Linux, deleted the /etc directory of our main linux fileserver as root.

    After that we disabled his ability to rm anything... replacing the rm command with a backup feature... I think it was a perl script. :)
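    The thread doesn't show the actual replacement, but a minimal sketch of such an rm wrapper in Perl (the stash location and all names here are hypothetical) might look like this:

    ```perl
    #!/usr/bin/perl
    # Hypothetical "safe rm": stash targets in a backup directory
    # instead of unlinking them. Not the script from the thread.
    use strict;
    use warnings;
    use File::Copy     qw(move);
    use File::Path     qw(make_path);
    use File::Basename qw(basename);

    my $backup = "safe-rm-stash-" . time();   # invented location, for illustration
    make_path($backup);

    for my $file (@ARGV) {
        next if $file =~ /^-/;                # ignore rm-style switches
        if (!-e $file) {
            warn "safe-rm: no such file: $file\n";
            next;
        }
        move($file, "$backup/" . basename($file))
            or warn "safe-rm: could not stash $file: $!\n";
    }
    print "Stashed in $backup/ -- nothing was actually deleted.\n";
    ```

    Dropped ahead of rm in the PATH (or aliased over it), it turns every "deletion" into a move that can be undone later.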


      and he still had root access? :-)


        Heh. Yeah, it's funny how often managers and bosses think that because they're boss they have to have root. Essentially we've neutered root, unless you have some minimal linux skillz, which luckily haven't shown any sign of appearing lately.


      you created another user w/ the uid of 0 right?

Re: PM Confessional
by ack (Deacon) on Jan 18, 2011 at 18:03 UTC

    Back in the late '70s I was promoted to a position in charge of a DEC VAX 11/780 system. I had taken all the Sys Admin courses that DEC offered and quite a few of the sys level programming courses from them, too.

    My first day on the job, all pumped up with my "superior skills" I wrote my first sys admin program to help me better manage the User Accounts, especially the Password file.

    Just the tiniest of errors crept in and I deleted the master password file...without knowing it.

    I logged off and after about an hour I started getting frantic calls from users who could not log on. When I tried I, too, could not log on!

    The previous Sys Admin (who'd worked as an Admin for about 10 years and had about 3 years on VAX systems) let me stew and fret for about 2 hours... after he had quickly, and unbeknownst to me, fixed the problem for everyone else so that only I was locked out... and then revealed to me the "error of my ways." I wasn't even smart enough to see that he'd fixed the problem, or to wonder why I quickly stopped getting those phone calls... Duhhhhh!

    Amazing how a bit of fear mixed with a good measure of embarrassment goes so far to bring humility and caution to our sometimes youthful recklessness. His gentleness with me, however, has also stayed with me, reminding me that "we're all human (except for zentara, who just plays one here on earth) and mistakes happen."

    Loved your post. Brought back some fond, if humbling, memories!

    ack Albuquerque, NM

      I thought I was the only one who had done that ...

Re: PM Confessional
by Tux (Canon) on Jan 17, 2011 at 07:44 UTC

    Does the SQL command "delete from huge_table; where key = 12;" count as well? I did that once :(

    Enjoy, Have FUN! H.Merijn

      I know it's stupid, but I often have the urge to delete editor backup files. I don't dare to type rm *~ though, for fear of accidentally pressing enter too early, so I have an alias rmt which does that.

      That's why I (pretty much always) stick hand-written updates and deletes between BEGIN TRANSACTION and ROLLBACK, and if and only if I get a sane "N rows affected" message do I select and run just the statement.

      Saved my butt a few times.

      Enoch was right!
      Enjoy the last years of Rome.
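      A minimal sketch of that safety net from Perl with DBI, assuming an in-memory SQLite database via DBD::SQLite (the table and data are made up here). A deliberately over-broad DELETE is caught by the row count and rolled back:

      ```perl
      # Run the dangerous statement inside a transaction, check the
      # "N rows affected" count, and only commit if it looks sane.
      use strict;
      use warnings;
      use DBI;

      my $dbh = DBI->connect("dbi:SQLite:dbname=:memory:", "", "",
                             { RaiseError => 1, AutoCommit => 1 });
      $dbh->do("CREATE TABLE huge_table (key INTEGER, val TEXT)");
      $dbh->do("INSERT INTO huge_table VALUES ($_, 'row $_')") for 1 .. 5;

      $dbh->begin_work;
      my $rows = $dbh->do("DELETE FROM huge_table WHERE key > 0");  # oops: > not <

      if ($rows > 1) {          # far more rows than the one we intended
          $dbh->rollback;       # undo it -- nothing is lost
      } else {
          $dbh->commit;
      }

      my ($left) = $dbh->selectrow_array("SELECT COUNT(*) FROM huge_table");
      print "$left rows survived\n";
      ```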

        Sane way of operating. It will also protect you from the less obvious mistake of writing

        delete from table where key > 0;

        instead of

        delete from table where key < 0;

        which is only one key apart. I know someone else who did that.

        This thread is not about errors someone else made, but I will still tell one from the recent past:

        After months of preparing and exercising all the needed commands for the move of one production platform to a newer platform of a more modern architecture, someone (not in our company) issued the main "rsync" command (which also had a --delete) THE WRONG WAY AROUND, deleting two days of work. The restore from backup took 6 hours and regained 1½ days of work.

        Enjoy, Have FUN! H.Merijn
      Oh man, that is the worst. I did the same, but with a worse SQL command, something like this: delete from unindexed_table order by id where key = AF0425; Worst experience of my newb times, but that led me to the discovery of MySQL logs. The good ol' days.

Re: PM Confessional
by ELISHEVA (Prior) on Jan 18, 2011 at 21:37 UTC

    My mea culpa:

    I had designed a C API where most of the functions were supposed to return 0 for "all is well" and an error code if something was wrong. This is a standard return-value pattern for C.

    One of our team members went on vacation. His code was a "mess", so I thought I'd clean it up a bit. He came back a few days later. About that time, a big chunk of the API that had been coded and tested stopped working. We couldn't figure out what could possibly be the problem. Since I was the lead and we were really stuck, I went through the code looking for a bug. After laboriously tracing code (this was back in the days of character monitors - VT-100, I think) I discovered that the bug was ... mine.

    Turns out that the programmer thought 0=OK, !0=problem was too confusing. 0=false and 1=true right? So within the code that was hidden from the public interface he'd gone by his own conventions and had successful functions return 1 and failed functions return 0. When I had "fixed" the code to match convention, I broke it because I didn't realize that his returning 1 where I expected 0 was by intent. We lost three days.

    Lessons learned?

    • Don't change working code to make it look good unless you have really good regression tests. And even then, think twice unless there is good maintenance/testability reason.
    • Don't assume that everyone knows the conventions. State and restate them until you know your team members and their assumptions well.
Re: PM Confessional
by zentara (Archbishop) on Jan 16, 2011 at 15:19 UTC
Re: PM Confessional
by CountZero (Bishop) on Jan 15, 2011 at 10:19 UTC
    We all learn from other people's mistakes (never from our own).


    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

Re: PM Confessional
by tilly (Archbishop) on Jan 18, 2011 at 17:18 UTC
Re: PM Confessional
by NateTut (Deacon) on Jan 19, 2011 at 18:59 UTC
    Back in the day, when Novell networks were hot and 20M was a big drive, I discovered a new utility. It was called pkarc and it compressed files - way cool! So I had to try it out on our Novell server. pkarc had a bunch of nifty options, one of which, -m, moved files into the archive file - in other words, deleted them.

    Well I kicked it off and it cranked away for a while and then I started getting the aforementioned calls. "I can't login". "I can't get to my files" etc. I soon discovered I had included the -m switch - doh! Fortunately I didn't panic. I prayed that pkarc was as good as it sounded and it would let me restore everything. It took hours and made a very stressful day for me but I recovered all the files except one.

    It was the login executable! So even though the network files were back, no one could log in. I started looking around and found an install disk with the file; I copied it to the right place and voila! We were back in business.

    I can't say I have never made another mistake, but that experience made me a much more cautious person.
Re: PM Confessional
by Anonymous Monk on Jan 15, 2011 at 01:03 UTC
    Did you have to sacrifice a digit for this mistake? Some confession
Re: PM Confessional
by shmem (Chancellor) on Jan 18, 2011 at 22:33 UTC

    Back in the good days I wrecked /etc/fstab on our company's main file server running SunOS 4.1.3. 'Twas just the day I learnt Sun's /bin/sh and some low-level unix system tools. Not before, but while fixing the mess.

Re: PM Confessional
by MidLifeXis (Monsignor) on Jan 18, 2011 at 15:45 UTC

    ISTR doing something along the lines of running init with parameters for a different OS once, on a production DB machine. I was trying to get it to reread the inittab, and ended up taking the system to a lower run level. *sigh*


Re: PM Confessional
by sundialsvc4 (Abbot) on Jan 20, 2011 at 23:27 UTC

    This is not a true Perl story, but I do remember learning how IBM's many service-tapes were best used as scratch tapes.   They called us up in a panic, having just delivered a service tape (apparently, to all their customers) that would render a mainframe un-bootable.

    Fortunately, I had not installed it, and never planned to. “Computer software like fine wine ... let it age.” Knowing that their service-tapes were cumulative, I actually applied them about once a quarter, and used tapes that were at least a month old (after reviewing all of the issues-lists for the tapes that had arrived subsequently). The rest went into a (big...) box.

      “Computer software like fine wine ... let it age.”
      My firm's equivalent of this was to never use a version *.0 of any vendors software...

        Likewise, turn off “automatic updates” (anywhere and everywhere you find them ...), and when advised that a new version of software is available ... wait.

        For instance, when iOS 4.0 came out for my phone, I waited several months as the version-numbers quickly bounced along.   When they finally settled down and stayed that way for more than 30 days running, I proceeded with one uneventful upgrade.

        Software’s difficult ... we all know that.   So, there is no reason to be “on the bleeding edge” about the bugs that will inevitably crop-up with a new release of anything.

Re: PM Confessional
by hesco (Deacon) on Feb 12, 2011 at 04:23 UTC
    My last night on a job seven years ago, cleaning up the desktop I used next to the rack which ran the operation, I used an ill-advised `rm -rf`, which, like the OP's, took too long to run. By the time I took note of what I had done and pounded on Control-C for a while, I had done some damage to my home directory. Fortunately it was something of a mess, and I had mostly deleted large caches buried in .dot files at the beginning of the alphabet, without doing too much damage to the archives I was leaving for those who were to come later.

    One of the guys who helped to set up our systems in that office once advised me never to be root after midnight. I also learned never to run a DELETE FROM until I had run a SELECT using the same WHERE clause.

    I also have had an awful experience asking a script to `rm files`. My path on that command was programmatically built, and a bug in that code did some damage. I'm pretty cautious now about letting a script use `rm` for any reason, particularly if any wildcards are involved.

    -- Hugh

    if( $lal && $lol ) { $life++; }
    if( $insurance->rationing() ) { $people->die(); }
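    One cautious alternative to shelling out to `rm` with a programmatically built path is to build the file list in Perl, sanity-check it, and use unlink() directly. A hypothetical sketch (all names invented here):

    ```perl
    # Delete only plain files, only inside the given directory,
    # and refuse suspiciously large batches before touching anything.
    use strict;
    use warnings;
    use File::Spec;

    sub cautious_unlink {
        my ($dir, @names) = @_;
        my @victims;
        for my $name (@names) {
            # refuse path separators and dot-dirs: nothing can escape $dir
            next if $name =~ m{[/\\]} or $name eq '.' or $name eq '..';
            my $path = File::Spec->catfile($dir, $name);
            push @victims, $path if -f $path;   # plain files only
        }
        # a crude sanity brake before anything is removed
        die "refusing: suspiciously large delete (" . @victims . " files)\n"
            if @victims > 100;
        return unlink @victims;                 # number actually deleted
    }
    ```

    Unlike `rm $path/*`, there is no shell, no wildcard expansion, and a bug in the path-building code fails the -f check instead of wiping something else.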

Node Type: perlmeditation [id://882430]
Approved by planetscape
Front-paged by Arunbear