PerlMonks  

OT: When to put things live

by simon.proctor (Vicar)
on Apr 19, 2007 at 11:42 UTC ( #610928=perlmeditation )

I've been putting various nuggets of software live recently, either directly or by handing them over to the client. In all cases I've kept to my personal rules about launching or handing over major work, and I wondered what rules of thumb people follow here.

This isn't about programming practice per se. At this stage you've built and (hopefully) tested the software.

So, I generally do this:

• Never put it live on a Monday or a Friday

  There is more to this than the Monday drag or the excitement of the impending weekend.

  Where I work, changes to systems occur over the weekend; should they affect your work, you'll need Monday to check. Also, people may be ill, and you'll need that day to recommunicate to the intended management/audience.

  No one likes working late on a Friday. Ideal world aside, if something goes wrong with your deployment you need people around to help diagnose and fix it. Avoid Friday and you have an extra working day when most people will be in the office.

• Avoid going live at a specific time

  Picking a time of day is fraught with danger. If you can, make the system 'pre-live', so that only the most minor of changes are required at launch, like switching internal IP addresses. DNS changes always take time to propagate, so a fixed-time launch for a site would need a launch page (hence the minor change rule).

• Don't put things live first thing in the morning

  For a start, you don't know if anyone is going to be late for work (traffic, etc.).

• If you can, have the existing system still in place

  Sometimes the best contingency is to be able to fall back to the existing system while you fix any problems. Back out any minor changes and you are back where you started. Diagnose, fix and try again.

So for me, currently, Tuesdays to Thursdays from 11:00 to 4:00 are the ideal times to put things live. I am lucky, though, that I do have a major say in how these things proceed.
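
For anyone who wants to wire a window like this into a deploy script, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the name `in_launch_window` and the constants are made up for the example, the end of the window ("4:00") is read as 16:00, and the time passed in is assumed to be local office time.

```python
from datetime import datetime, time

# Assumed values, taken from the rules of thumb above; adjust to taste.
ALLOWED_WEEKDAYS = {1, 2, 3}   # Monday is 0, so this is Tuesday to Thursday
WINDOW_START = time(11, 0)
WINDOW_END = time(16, 0)       # assumption: "4:00" means 16:00


def in_launch_window(moment: datetime) -> bool:
    """Return True if `moment` (local office time) is inside the launch window."""
    if moment.weekday() not in ALLOWED_WEEKDAYS:
        return False
    return WINDOW_START <= moment.time() <= WINDOW_END


if __name__ == "__main__":
    # A Thursday, late morning: inside the window.
    print(in_launch_window(datetime(2007, 4, 19, 11, 42)))
    # A Friday afternoon: outside the window.
    print(in_launch_window(datetime(2007, 4, 20, 15, 0)))
```

A check like this only guards against launching at a bad moment; it says nothing about whether the people needed for diagnosis are actually in the office.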

Re: OT: When to put things live
by derby (Abbot) on Apr 19, 2007 at 12:42 UTC

    simon.proctor++ ... that's pretty close to how we handle it. Sometimes we have pushes that would impact customers (database needs to go down because you cannot pre-live) so we also evaluate the impact to the end-user. If it's too much of an impact, we co-ordinate to do the push off hours (weekend or after biz hours). I'd say more than 90% of our pushes follow your rules and about 2-3 times a year I can expect to be working on a Saturday.

    -derby
      I'll second that... much the same where I am.
      One added thing - whenever possible (and it's not always possible), I try to have the "live" system up and running for a day before announcing it. One last chance to check for... unforeseen consequences :-)

      -- WARNING: You are logged into reality as root.
Re: OT: When to put things live
by jhourcle (Prior) on Apr 19, 2007 at 14:06 UTC

    Some of these issues have to do with your specific job, and the nature of the systems.

    For instance, when I worked for a university, if a system change would require downtime, we never started before 6pm on a Friday for a planned change, so we had the whole weekend to clean up if things went wrong.* Things were put into place on Friday night, stakeholders got to review the system on Saturday, and sysadmins and programmers had to get it working by 6am on Monday.

    If it was something that could be done with a quick cutover, I'd prep everything in advance, get signoff on it in testing, and then cut it over at 7am on a Monday morning (specifically because people came in late on Mondays, so the trouble calls came in slower**).

    These days, my work is international, and as I'm a contractor, they don't like me working overtime or odd hours. So, I'm a firm believer in the Tues-Thurs window. We had a specific rule of NO system changes after noon on Fridays. For some types of changes, we have to wait for specific gaps which occur. For other changes, we do 'em late morning, as most of Europe's left for the day, and many of our West Coast users aren't early risers. We then have people around for a full day of debugging, should something go wrong.

    So, I'd have to say my normal window these days is Tuesday-Thursday, 10am-1pm. (if it's a local change, and not going to affect the European folks, 8am-10am)

    ...

    * the 6pm rule, unfortunately, is what got us into problems when our management wouldn't let us take down a mail server when we noticed it was having disk problems, and resulted in us spending 16 days around the clock trying to get things working (and by 'working' I mean e-mail was lost for all students) ... because management didn't want to react when we noticed the problem (a little past noon), and the system failed in a cascading manner at about 4:30pm.

    ** Of course, that also resulted in a problem one day when I got all of the signoffs, but something wasn't transferred cleanly ... and as no one noticed 'till about 10am, I had to manually merge changes, which took me about 2 days ... normally, not a big deal, but the cutover was just to buy me 2 weeks to finish a project ... which then spiralled out of control, and took years to complete (of course, I had been fired for 'use of sarcasm', and as I was the lead on the project, that might've explained why they were delayed by 3+ years (well, that, and bringing in a 'third party' to review the system, who didn't understand our business needs, or the software we were using which added the first year of delay, and may have resulted in my sarcastic attitude))

Re: OT: When to put things live
by graq (Curate) on Apr 19, 2007 at 14:15 UTC

    If it were realistic, I would push the end time back to 15:00.

    If you are writing from a (purely) development side, then that potentially leaves you with other teams in the mix (say a database management and operations teams). You may not have any access at all to any live machines, and you are reliant on these other departments to deploy your changes. That means you are reliant on them to help fix errors and roll back.

    Also, as you mention DNS, it sounds like web development, and if you are UK based, then one of your peak traffic periods may be 16:00.

    And this all ignores the times when you are implementing major changes (that might include hardware changes as well as software) where downtime is unavoidable (or highly likely). Then you must consider the user/customer base first and pick off-peak times in order not to cause long lasting damage to the business.

    -=( Graq )=-

      > that potentially leaves you with other teams in the mix (say a database management and operations teams).

      Almost true in my case. Unfortunately, there are no real database teams per se. The hardware team looks after hardware, makes sure the OS is up and the capacity is ok. Beyond that, it is considered 'application' and out of their remit.

      Because of that, I tend to get a little more say in the stuff that I do. I'm also lucky that most of what I do is internal to the business. However, as an international business we have 24 hour access requirements. This is where our SLAs come into play.

      Downtime is inevitable but is generally mitigated as much as possible. I don't really mention it as downtime is a lot rarer for my work.

      > As you mention DNS it sounds like web development

      That is part of my work, but I used DNS as an easy-to-understand example of not choosing a fixed time for launches.

      Thanks for your comments :)

Re: OT: When to put things live
by Herkum (Parson) on Apr 19, 2007 at 14:26 UTC
    I did support for a Fortune 500 company and typically all major changes were done on the weekend. There were two reasons for this:
    • It supposedly gave the developers and admins extra time to test the environment without everyone banging on it during the work day. They would bring in employees on Saturday/Sunday to help with testing the implementation.
    • The real reason was that the Production environment was so convoluted that there was not a good 'test' environment. The only way to make sure everything was going to work was to push it into Production and try patching and fixing things as they broke.

      What ended up happening was that all the basic stuff got checked in Production and they went with that. Come Monday, as the users got onto the system, a ton of bugs were consistently found, which usually caused corporate-wide problems.

    I have not yet worked with a company that has been consistent about how it rolls out code into production. Usually it happened in the evening hours because that interrupted fewer people, but those systems typically had little testing, so it always resulted in major problems the next day.

Re: OT: When to put things live
by Old_Gray_Bear (Bishop) on Apr 19, 2007 at 17:26 UTC
    That's pretty much the size of it. We aim for 'non-disruptive enhancement' around here. That means:
    • No Friday installs of anything that has a Customer facing component.

      Of late this has become problematic, since we have a sizable User base distributed around the Globe. Someone is always inconvenienced, no matter when we install; so we have settled on Developer Convenience as the determining factor. This means that the last five or six Product releases have occurred on Wednesday Evening between 1800 and 2000.

    • The Service DOES NOT GO DOWN.

      The New Service (Client enhancement, bug fix and code maintenance, new database features, etc) will run in parallel with the Old for a period of weeks. (Or a period of years, in the case of Client code. We have Client-software in the Wild that is over eight years old. The feature set is still supported. We don't accept bugs on it, however.)

    • While there is a committee involved in the planning and scheduling of a Change, there is a single Change Captain. The Change Captain has final authority to say 'Oops. Back it out'.

    • We try to give our User Community a reasonable estimate of when a new feature set will be available. We plan to have it in play four to six hours before the announced go-live. This gives us a little 'final-checkout' time. Our Users know this and so we sometimes get 'early adopters'. We don't discourage this.

    • The Usual Time Line:
      1. Two weeks before the date -- feature freeze
      2. One week before the date -- code freeze and QA begins regression testing.
      3. On the day:
        1. 1300 -- Final Change Review meeting -- are we really ready?
        2. 1600 -- New code/hardware active and checked out
        3. 1800 -- Go-time -- everybody involved gathers in a conference room and watches the logs and monitors.
        4. 1810 -- Pizza delivered (on the Project Manager's nickel)
        5. 1930 -- More Pizza, this time with beer (ditto; there is a line item in the project budget for this)
        6. 2000 -- Go-Live for the Users; Ice cream arrives
        7. Afternoon of the following day -- Post Mortem
    This seems to be a working method; it has served for the past fourteen months. We have only had one release aborted by the Change Captain -- when it was announced in the 1300 'final readiness' meeting that the primary power system to one of the co-location facilities had failed at 0300, and we were on standby generators. The power vendor 'expected to have it back online by close-of-business today.' The CC said "that's nice. We ain't going until the generator has been up for at least 12 hours." We slipped the install a day.
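
The usual time line above can be generated from a single go-live date. What follows is a minimal sketch in Python, not anything the team is known to use: the function name `release_timeline` is hypothetical, the two freezes are placed at midnight because the list gives only a date for them, and 1300 is an arbitrary stand-in for "afternoon of the following day". The pizza and ice cream are left out.

```python
from datetime import date, datetime, time, timedelta


def release_timeline(go_live: date) -> list[tuple[datetime, str]]:
    """Build the checkpoints listed above for a given go-live date."""

    def at(day: date, hhmm: str) -> datetime:
        # Turn a 24-hour "HHMM" string into a datetime on the given day.
        return datetime.combine(day, time(int(hhmm[:2]), int(hhmm[2:])))

    return [
        # Assumption: freezes start at midnight; the list only names the day.
        (at(go_live - timedelta(weeks=2), "0000"), "Feature freeze"),
        (at(go_live - timedelta(weeks=1), "0000"),
         "Code freeze; QA begins regression testing"),
        (at(go_live, "1300"), "Final Change Review meeting"),
        (at(go_live, "1600"), "New code/hardware active and checked out"),
        (at(go_live, "1800"), "Go-time: watch the logs and monitors"),
        (at(go_live, "2000"), "Go-live for the users"),
        # Assumption: 1300 stands in for "afternoon of the following day".
        (at(go_live + timedelta(days=1), "1300"), "Post mortem"),
    ]


if __name__ == "__main__":
    # Example: a Wednesday release.
    for when, what in release_timeline(date(2007, 4, 18)):
        print(when.strftime("%a %Y-%m-%d %H%M"), "--", what)
```

Keeping the schedule as plain data makes it easy to shift every checkpoint by a day when the install slips, as it did in the generator incident.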

    ----
    I Go Back to Sleep, Now.

    OGB

Re: OT: When to put things live
by dana (Monk) on Apr 19, 2007 at 21:34 UTC

    I don't have anything to add but wanted to drop a note to say that I really enjoyed this discussion and reading the various approaches. I've worked as a sys admin in a highly heterogeneous environment where 24/7 was critical (associated with patient care) and I've worked as a programmer in research where lucky users were supported between 8am and 5pm. Clearly the needs vary depending on users, systems, etc. Clearly the demands on the admin and/or programmer also vary greatly.

    Thanks again for sparking such a discussion.

Re: OT: When to put things live
by talexb (Canon) on Apr 20, 2007 at 03:04 UTC

    I support one system that has a team working 9am-6pm in Toronto on one server, and another team working about 10am-10pm in Mumbai, India, on a satellite server. That means the system's in use from about 11pm through to about 6pm local time, so my 'maintenance window' is technically 6-8pm.

    So when it's time for a roll-out, it's a little tricky because the two teams share a server for some operations, but not for everything .. so we usually make the changes at both ends sometime during the day and get our end tested. If it all checks out, we test the Mumbai satellite from Toronto, and then finally get them to try it out at about midnight when they start their day.

    We usually have an opportunity to roll back to the previous version, unless it's a database schema change, in which case any fixes depend on changes to live Production code. That hasn't been a problem in the four years I've been doing the job.

    Alex / talexb / Toronto

    "Groklaw is the open-source mentality applied to legal research" ~ Linus Torvalds

Re: OT: When to put things live
by rodion (Chaplain) on Apr 21, 2007 at 07:16 UTC
    Our group handles information used for clinical care in a hospital, so limiting Friday and late-in-the-day installs to exceptional cases, and having easy backouts available are both essential rules. It's harder on weekends and nights to contact the software staff needed to diagnose and fix something if bugs surface sometime later, and the staff using the software on off shifts is smaller and less well informed about changes.

    We usually avoid Mondays because they can be crowded with handling questions and issues from the weekends, but we will sometimes choose a Monday install if we want things running for a full 5 days with full staff on-hand.

Node Type: perlmeditation [id://610928]
Approved by NovMonk
Front-paged by TStanley