This is the Message Centre for Icy North

Dilemma

Post 1

Icy North

I had a difficult decision to make yesterday.

The client's main financial reporting system was down - it had been down for a couple of days. We'd only taken over supporting it in January and the previous support team weren't around any more to help.

The software vendor was particularly unhelpful, either getting us to try things which we all knew wouldn't work, or telling us we had to upgrade as they didn't support that version of software any more. Upgrading such a critical system just can't be done there and then - it would take many days, what with all the customised code and testing we'd have to do.

There was a disaster-recovery system we could switch to, but the last time we did that it was too slow to be of use, and we were in the long process of upgrading it all.

There was a final option - just switching the live servers off and on again. We tried it on the development servers - it failed. We tried it again - it failed with different errors. Then someone told us that the live servers hadn't been rebooted for over 2 years. To do it now would be incredibly risky.

Then someone reminded us that financial month end was coming up and we had to do something now.

What would you do?

1. Tell the client they can't have their system for a week while we upgrade.
2. Fire up the disaster recovery system in the hope that it doesn't grind to an immediate halt.
3. Reboot the live system which hasn't been shut down for 2 years.

What did I do?


Post 2

bobstafford

Option 2 sounds most logical, as you can resort to options 1 and 3 if that fails. smiley - smiley


Post 3

Icy North

Option 2 would take many hours to do, as we have to migrate a massive database. Does that change your decision?


Post 4

Galaxy Babe - eclectic editor

put the kettle on?


Post 5

bobstafford

Sounds like a late night. If the database is unusable without a danger of data loss, what choice have you, unless you risk the data?


Post 6

Bluebottle

Is this a trick question where the answer isn't one of the three options?

<BB<


Post 7

8584330

You ordered drinks.


Post 8

Icy North

Drinks? This was a Hamlet cigar moment.

Feel free to suggest another option, BB. I couldn't think of one.


Post 9

Geggs

I'm guessing Option 1 will take far too long for the customer's liking.

And Option 2 will be a bit too long also.

So, it's got to be the incredibly dangerous Option 3. It's quicker, and you've just got to hope that it'll work.


Geggs


Post 10

Beatrice

If Option 3 doesn't work - what will the situation be? Are any of the other options then ruled out?


Post 11

Icy North

Spot on, Geggs. We gulped and ordered a server restart ...


Post 12

Gnomon - time to move on

What sort of a company doesn't reboot a server for 2 years? You should at least reboot them every month when you apply the operating system patches.

You haven't been applying the patches? Oh dear..

smiley - biggrin


Post 13

Icy North

If it were so easy...

Oh, I forgot to tell you the outcome. By divine providence it actually restarted in one piece. Not only that, but it runs jobs a lot faster than it has for the last 2 years.

There's a moral to this story, but I'm not sure what it is. I'd still feel very uncomfortable if I had to make that decision again.


Post 14

Recumbentman

Moral would seem to be change your boots more frequently?


Post 15

Milla, h2g2 Operations

Regular reboots, regular backups, and restore tests. (Yes, tests!)

A mirrored server, that you can switch to as a failover.

Documented procedures for all these.

A disaster recovery plan in place. Documented, and tested.

A business continuity plan in place. Documented, and tested.

All signed and approved by the client.

To begin with, at least.

smiley - towel


Post 16

Dmitri Gheorgheni, Post Editor

Boy, you have nerves of steel, Icy. I'd never have guessed you'd take Option 3. smiley - bigeyes

Glad the Force was with you.


Post 17

Icy North

Thanks for the best practice, Milla.

Sadly, the real world only works like that when the client stumps up the cash.


Post 18

bobstafford

Client must be HM GOV or MOD smiley - biggrin


Post 19

Milla, h2g2 Operations

smiley - winkeye
Isn't reality precious....

Yeah, I know it doesn't happen unless disaster has struck once, and then only maybe....

smiley - towel


Post 20

Baron Grim

I'm with Gnomon; I'm aghast that they haven't been performing system software patches. smiley - yikes


But I wasn't at all surprised that you went with option 3.

Just last night I was rewatching episodes of The IT Crowd.

"Have you tried turning it off and back on again?" That's always step 1.

