Icy Naj 24 - You Are The Tech...

Post 1

Icy North

This is another in my occasional series of entries about the life of IT technicians. I describe an incident. You have to say what you think caused it and what they should do next.


This is a true story. Only the names have been changed, to protect the incompetent.

* * *

An engineer visits a modern data centre to install some new network equipment. Let's say it's on a business park somewhere between London and Swindon. The data centre contains lots of racks of IT equipment: servers, network switches, storage devices, etc. The equipment belongs to two different organisations. In the racks at one end is the equipment for a well-known utility company, let's call them O-Power. In the racks at the other end is the equipment running vital services for a high-street bank. Let's call them HTSB.

The engineer installs new network equipment for HTSB into an empty rack. It's switched on, and everything works fine.

A week later the engineer goes back to the data centre to decommission all the redundant HTSB equipment. He unplugs all the network cables and power supplies, then removes all the servers and devices from the racks. When he has finished, he has an empty rack, and lots of equipment piled haphazardly on the floor behind him.

Then the phone rings.

"We've lost access to O-Power's systems! This is critical! Please can you put everything back in reverse order so we can find out what caused it!"

The engineer looks up at the empty rack, then down at the haphazard pile of hardware and wishes he'd brought a spare pair of trousers.

What should he do next?

* * *


Post 2

2legs - Hey, babe, take a walk on the wild side...

*thinks* Hmmm... did they switch the power off at the master mains switch before removing just the bits they were there to remove, and thus power down all the stuff in all the racks? Well... smiley - doh


Post 3

Icy North

They just flipped the power switches on the devices then yanked the leads out.


Post 4

Amy Pawloski, aka 'paper lady'--'Mufflewhump'?!? click here to find out... (ACE)

[Amy P]


Post 5

Gnomon - time to move on

It appears that one of the servers at the HTSB end was actually an O-Power server.

My first response would be - switch to the other data centre.


Post 6

Icy North

The servers were in the right rack in this case.

Invoking Disaster Recovery is probably exactly what I'd have suggested in the circumstances, although it would presumably be others in operations management who would have made that decision. Automatic failover would have been nice, too.

In this case, the engineer kept a cool head and worked out what had happened.


Post 7

Florida Sailor All is well with the world

I would try powering up all the removed servers off-line to see if any contain O-Power files.

I would try to ping the routers from the other bank in the meantime.

F smiley - dolphin S


Post 8

2legs - Hey, babe, take a walk on the wild side...

I'd probably try screaming at it, then get out the hammer... but that might not work so well... works for a lot of stuff though smiley - laugh


Post 9

Florida Sailor All is well with the world

2legs, are you working as a secret consultant for my company's IT subcontractors? smiley - rofl

F smiley - dolphin S


Post 10

2legs - Hey, babe, take a walk on the wild side...

Maybe! There are some people who hold solemnly to the adage that you need only two tools: WD40 and duct tape. To these people, I say they are wrong. You need three tools: WD40, duct tape, and a hammer. Curiously, it's also the same list for my medical supplies kept in the house: WD40, duct tape and a hammer... fixes 99% of all DIY and medical emergencies... smiley - zen


Post 11

Bluebottle

Coincidentally, my son put his finger in a hole in a railing yesterday, where a bolt had been, and got it stuck smiley - doh. Guess which of the 3 tools suggested above was used to extract him?

<BB<


Post 12

Icy North

Any more takers for this?

I'll tell you that the engineer decided to wander over to the O-Power racks and take a look.


Post 13

Icy North

...and he saw that most of the devices were happily flashing away, but that 3 or 4 of them appeared to be powered off...


Post 14

scorp

2legs! I was brought up to believe that the 'big hammer' should only be used when all else fails smiley - biggrin


Post 15

Beatrice

Was it something else that had happened, not related to the engineer at all?

Or else it was the cleaner.


Post 16

Florida Sailor All is well with the world

I did consider this might be a coincidental incident, but as more than one unit was down, I would suspect they had been pulling power from the other rack.
I would start pulling the units that were down and trace their power supplies and other connections.

I suspect I would have found this while examining O-Power's rack smiley - winkeye

F smiley - dolphin S


Post 17

Icy North

Yes, you're right, FS. It was a power issue.

What happened was that when the engineer switched off one of the HTSB units it happened to trip a circuit breaker. What he didn't realise was that this unit was actually powered through the same circuit breaker as some of the O-Power machines.

When the data centre operators were quizzed on it, they admitted that it had been like that for years, from a time when other equipment was installed in the racks. They had never properly checked the power circuits since.

But, hey, it only brought down a major utility's IT systems.
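
For anyone wondering what "properly checking the power circuits" might look like on paper, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the inventory format, the device names and the breaker labels are invented, since the story doesn't say how the data centre kept its records. It simply flags any circuit breaker that feeds equipment belonging to more than one organisation.

```python
from collections import defaultdict

# Hypothetical inventory of (device, owner, breaker).
# All names and labels here are invented for illustration.
INVENTORY = [
    ("htsb-switch-old", "HTSB", "breaker-07"),
    ("opower-db-1", "O-Power", "breaker-07"),
    ("opower-db-2", "O-Power", "breaker-07"),
    ("opower-web-1", "O-Power", "breaker-03"),
    ("htsb-switch-new", "HTSB", "breaker-11"),
]


def shared_breakers(inventory):
    """Return {breaker: sorted device names} for every breaker
    that powers devices owned by more than one organisation."""
    owners = defaultdict(set)
    devices = defaultdict(list)
    for device, owner, breaker in inventory:
        owners[breaker].add(owner)
        devices[breaker].append(device)
    return {
        breaker: sorted(devices[breaker])
        for breaker in owners
        if len(owners[breaker]) > 1
    }


if __name__ == "__main__":
    for breaker, devs in shared_breakers(INVENTORY).items():
        print(breaker, "feeds more than one organisation:", ", ".join(devs))
```

A check like this is only as good as the records behind it, so in practice it would have to be paired with physically tracing the power leads, as FS suggested.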

