This is the Message Centre for Icy North

Icy Naj 19 - Problems, Problems...

Post 1

Icy North

In yesterday's journal I wrote about the system of best practice that many IT people use to organise themselves, and how when something goes wrong we call it an 'incident' rather than a 'problem'. I also said that when a help desk gets you working again, however they do it, they can close your incident ticket and go on to help somebody else in the queue.


Now, sometimes there's still an underlying fault that needs to be fixed. Maybe you could get online again by connecting via a cable, but your wifi's still not working. Or maybe the engineers 'failed over' your server to a backup to keep your billing system working. They still need to find out what went wrong with the wifi or with the primary billing server, so they can fix them for other people and prevent them from failing again in the future.

These are what IT people call 'problems'. Something has failed, and needs investigating to find what the root cause is. It can involve a lot of activities: looking at screenshots and log files, discussing the fault with hardware and software vendors, and testing the system in a lab to see if you can re-create the error.

It's often very difficult to find the root cause of a problem. On the one hand, IT systems are very complicated. If, for example, you have a problem printing out an invoice from a software application, well, the fault could be anywhere. It could be the printer, or the paper tray, or the ink levels, the cables, the network, the PC hardware you were using, or the software on it, or the server the application is on, the data centre it's hosted in, the software licences, your internet connection, power leads, or, very occasionally, maybe the user didn't know which keys to hit.

On the other hand, problem solving is iterative. You may find that the software application crashed, but it worked fine after you rebooted your PC. But why did the software crash? You find it was because it was leaking memory. And this was because there was a bug in the code, and this was because there was a design fault. Sometimes you just don't have the time or the inclination to investigate every problem through to its ultimate cause.

When you do find a root cause that's at a controllable level, you need to record it. We call these 'known errors' and get around to fixing them as time allows, according to how important they are. We also let the help desk know about them, as some other caller is sure to have the same problem in the future. The help desk need to know how to advise those callers until we get it permanently fixed.
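
To make the 'known error' idea concrete, here is a minimal sketch in Python of the kind of record a help desk might look up. It is an illustration only, not how any particular help desk tool works: the `KnownError` class, its field names, the two helper functions and the root-cause text are all made up for the example. The symptoms and workarounds are borrowed from the wifi and billing server examples above.

```python
from dataclasses import dataclass


@dataclass
class KnownError:
    """A root cause that has been found but not yet permanently fixed."""
    symptom: str      # keyword a caller is likely to mention
    root_cause: str   # what the problem investigation found
    workaround: str   # what the help desk can advise in the meantime
    importance: int   # 1 = fix first; larger numbers can wait


def find_workarounds(known_errors, caller_report):
    """Return the workaround for every known error whose symptom is in the report."""
    report = caller_report.lower()
    return [ke.workaround for ke in known_errors if ke.symptom.lower() in report]


def fix_order(known_errors):
    """List known errors in the order they would be fixed, most important first."""
    return sorted(known_errors, key=lambda ke: ke.importance)


# Hypothetical entries: the root causes here are invented for illustration.
known_errors = [
    KnownError("wifi", "hypothetical: fault in the wifi equipment",
               "connect via a cable", importance=2),
    KnownError("billing system", "hypothetical: fault on the primary server",
               "fail over to the backup server", importance=1),
]

print(find_workarounds(known_errors, "My wifi has stopped working"))
print([ke.symptom for ke in fix_order(known_errors)])
```

The point of the sketch is that the record serves two audiences: the help desk reads the workaround, while the engineers use the importance to decide what to fix as time allows.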


So, to put it in h2g2 terms: If you see '502' errors and can't connect to the site, then that's an incident. If you can't edit your entry in Pliny, then that's not an incident, as you can instead edit it in Ripley skins for the time being. It is, however, a problem, and technicians will work on it when they get around to it, if they think it's important.


Post 2

Recumbentman

Or "very occasionally, maybe the user didn't know which keys to hit."

Do I detect a note of irony, maybe bordering on sarcasm here?


Post 3

Icy North

SARCHASM. n. The gulf between the author of sarcastic wit and the person who doesn't get it. smiley - run


Post 4

Gnomon - time to move on

User error is so common a cause of computer problems that we don't bother being sarcastic about it anymore.

Problems are the things that make my work interesting. We pay contractors to do the routine stuff, but to solve a problem you need someone with years of experience of the system. That's what I do.


Post 5

bobstafford

smiley - senior you are not that old


Post 6

Icy North

I had an interesting discussion this morning about how we should prioritise problem work.

If you had a high-priority incident (the billing system failed, say), then should it always be a high-priority problem investigation? A lot of people would say 'yes', but then what happens if someone assigns you a medium-priority incident (let's say: "I can't run this report which needs to go out tomorrow")? Which do you work on first?


Post 7

paulh, vaccinated against the Omigod Variant

I would go with the report that has an impending deadline. [Well, you did ask after all.]


Post 8

Icy North

Yes, you will generally get a lot more noise from those guys, and in practice, that's what you'll have to do. It illustrates that it can be difficult finding time to do problem investigations, even critical ones.


Post 9

Gnomon - time to move on

We've a problem in work and Mr Messy is going to sort it out.


Post 10

Amy Pawloski, aka 'paper lady'--'Mufflewhump'?!? click here to find out... (ACE)

[Amy P]


Post 11

Florida Sailor All is well with the world

You remind me of my first big program for a bank supply warehouse. This was shortly after the PC had been introduced and they had a huge hard drive that could store 70 megabytes smiley - erm

It was a simple database that held a file line for each item carried. As each item was picked, it reduced the inventory on hand. It also created a monthly report showing how much to charge each branch, and a re-order report showing which items needed to be replaced.

Everything was printed on three-part carbon paper on an impact printer. As each form printed, the operator had an option to select whether it had printed correctly. If they typed (Y)es, the form was marked 'Original' and everything was charged; if they typed (N)o, the form was marked 'Void' and nothing was charged.

Of course someone got into a hurry and kept hitting (Y)es even though there was a huge paper jam in the other room. Fortunately I did include an option to allow them to print forms marked 'Copy' without readjusting the inventory.
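
For anyone curious, the confirm-then-commit logic might be sketched like this in Python. This is a reconstruction for illustration, not the original program: the function names, the dictionaries, the item, the branch and the prices are all hypothetical, and it assumes the inventory was adjusted at the moment the operator confirmed the print.

```python
def print_form(inventory, charges, branch, item, qty, unit_price, printed_ok):
    """Commit a picking form only if the operator confirms it printed correctly."""
    if printed_ok:
        # Operator typed (Y)es: reduce stock and charge the branch.
        inventory[item] -= qty
        charges[branch] = charges.get(branch, 0) + qty * unit_price
        return "Original"
    # Operator typed (N)o: nothing is charged and stock is left alone.
    return "Void"


def reprint_copy():
    """Reprint a form after a paper jam without touching stock or charges."""
    return "Copy"


# Hypothetical data: one item in stock, prices in whole pence.
inventory = {"deposit slips": 100}
charges = {}

print(print_form(inventory, charges, "Branch A", "deposit slips", 10, 5, True))
print(print_form(inventory, charges, "Branch A", "deposit slips", 10, 5, False))
print(reprint_copy())
print(inventory, charges)
```

The separate 'Copy' path is what rescues the hurried operator: the (Y)es has already been committed, so the reprint must not adjust anything a second time.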

smiley - biggrin

F smiley - dolphin S

