ITIL Problem Management Process – Your Lifecycle Guide

Information Technology is full of jargon that most of us do not understand. On top of these complex terms come processes, frameworks, tools, and acronyms.

Among these floating terms, one that has found prominence in recent times is ITIL problem management.

Why is this concept important?

That’s because the goal of IT service desk professionals is to constantly support their users and deliver an exemplary service experience.

The utopian dream would be to have no incidents at all. However, that’s a far-fetched idea. So, the next best aim is to manage incidents and restore service swiftly and efficiently. These results become possible by putting the ITIL Problem Management Process in place.

But before that, you need to know what ITIL Problem Management is.

Defining ITIL Problem Management

Prior to deep-diving into the concept of IT Infrastructure Library Problem Management, let’s first understand the relationship between ITIL and Problem Management.

In simple words, in the context of ITIL, the term ‘Problem’ basically defines an unknown cause resulting in one or more incidents.

On the other hand, the ITIL problem management definition explains that “Problem Management ensures the identification of problems and performs Root Cause Analysis. It also ensures that recurring incidents are minimized, and problems can be prevented.”

As a key component of ITIL, the Problem Management process requires you to manage the lifecycle of all underlying problems. It succeeds only if problems are detected rapidly and solutions or workarounds follow in quick succession. The objective, of course, is to reduce the effect of these underlying problems on the business and ideally to prevent any recurrence.

Let’s look at problem management examples to get a better understanding of the concept.

Imagine you have a flat tire. Just like any vehicle owner, you’d want to replace or fix the flat tire and be back on the road as quickly as you can. Now imagine you get a flat tire again; you will go through the same fix, even though the repeat incident feels like more of an annoyance.

So far, this is incident handling: your service provider dutifully fixes the tire every time you get a flat and helps you get back on the road as soon as possible. Problem Management begins with the next question: why does the tire keep going flat?

Several reasons may have caused a flat tire.

The first flat may have been the work of a nail. But what about the second time?

Instead of following through to find the answer, many organizations stop at the first incident. A key question then remains unanswered: what if this problem occurs again?

Therefore, to put things into perspective, Problem Management is the process that comes into use to respond to key queries, locate and recognize the underlying causes of the problems and then eventually take corrective measures.

In reference to the usage within the process of IT Infrastructure Library (ITIL), take a look at the following core definitions of common Problem Management terms:

  1. Problem: The cause of one or more Incidents. The cause is not usually known at the time a problem record is created.
  2. Error: A design flaw or malfunction that causes a failure of one or more IT services or other configuration items.
  3. Known Error: A Problem that has a documented root cause and workaround.
  4. Root Cause: The underlying or original cause of an incident or problem.

Hence, you may rightfully describe Problem Management as the process that covers the end-to-end handling of problems, from identification to elimination.

In the absence of an effective ITIL problem management process, organizations can suffer unnecessarily, as IT teams frequently confuse Problem Management with Incident Management. The distinction must be clearly defined from the outset to eliminate misperceptions.

ITIL Problem Management vs. Incident Management

ITIL defines a problem as the underlying cause of one or more incidents. For example, a malfunctioning mouse or even server downtime is classified as an incident. On its own, it does not yet indicate a problem. However, when such incidents keep recurring (repeated hardware failures or network outages, for instance), they point to an underlying problem.

Even though both processes work in tandem, the key differentiator between Problem Management and Incident Management is that the former gets rid of the root cause of a failed service, whereas the latter restores the service even if it is temporary in nature.

A successful problem management process rests on the Problem and Incident Management procedures complementing each other. It is a long-term proposition that requires conducting post-incident reviews to determine the root cause of a problem through a disciplined approach to analysis.

Types of Problem Management

Identifying and removing the underlying causes of recurring incidents is the sole objective of Problem Management. If you find yourself under pressure to locate the problem simply to get your services up and running, that is no longer Problem Management. It has become Incident Management, as the focus lies on the restoration of service.

As far as Problem Management is concerned, there are two main types: Reactive and Proactive.

  • Reactive Problem Management refers to problem-solving that is triggered after one or more incidents have occurred.
  • Proactive Problem Management looks into locating and resolving problems before they cause incidents.

That said, Reactive Problem Management is a result of a direct trigger in response to an incident. For major incidents, a number of companies do perform Post Incident Reviews that are directed at identifying any underlying problem. If any such issue is, in fact, detected, a Reactive Problem Management effort is initiated right away.

Potential problem cases that are dealt with under Proactive Problem Management generally depend on trending and historical data. This can be gathered from various sources, including moderate analysis, formal Continual Service Improvement, or plain old gut feeling.

Prioritization of Problem Management cases should be based on the value added to the business. Business Impact Analysis and Pain Value Analysis are acceptable supporting methodologies for ranking problems. The problems whose elimination carries the highest business value are tackled first.

Problem Management Process Flow

Every single step of the Problem Management process flow is key to solving a problem successfully while delivering first-rate service. Take an in-depth look at ITIL’s problem management lifecycle:

#1: Detect

You cannot progress further unless you identify what the problem is. An incident can be treated as a problem, but only under the following circumstances:

  • It keeps occurring across the organization in very similar situations
  • In spite of being resolved successfully, it keeps recurring
  • The service desk is unable to resolve it
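One way to spot the first two circumstances is to group incident tickets and flag any group that keeps recurring. The following is a minimal sketch, assuming a hypothetical ticket shape with `category` and `ci` (affected configuration item) fields and an arbitrary threshold of three occurrences; neither the field names nor the threshold come from ITIL.

```python
from collections import Counter

def find_problem_candidates(incidents, threshold=3):
    """Return (category, ci) pairs seen at least `threshold` times.

    `incidents` is a list of dicts with the hypothetical keys
    'category' and 'ci' (the affected configuration item).
    """
    counts = Counter((i["category"], i["ci"]) for i in incidents)
    # Keep only groups that recur often enough to suggest an underlying problem.
    return sorted(key for key, n in counts.items() if n >= threshold)

incidents = [
    {"id": 1, "category": "network", "ci": "vpn-gateway"},
    {"id": 2, "category": "network", "ci": "vpn-gateway"},
    {"id": 3, "category": "hardware", "ci": "laptop-114"},
    {"id": 4, "category": "network", "ci": "vpn-gateway"},
]
print(find_problem_candidates(incidents))
```

Here only the VPN gateway incidents cross the threshold, so they would be the candidates to raise as a problem.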

#2: Log

Help desk management software works extremely well for logging problems within the ITIL Problem Management framework. The log is essentially a compilation of all occurring and recurring problems from across the organization.

Since the aim is to find the root cause of a problem, you need to log all relevant data, including the date and time of occurrence, symptoms, references to related problems, and the steps taken to troubleshoot them.
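To make the logged data concrete, here is an illustrative sketch of a problem record holding the fields mentioned above. The class name, field names, and ID format are assumptions chosen for the example, not a schema prescribed by ITIL or by any particular help desk tool.

```python
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class ProblemRecord:
    """Hypothetical problem log entry holding the data listed above."""
    problem_id: str
    occurred_at: datetime
    symptoms: str
    related_problems: list = field(default_factory=list)
    troubleshooting_steps: list = field(default_factory=list)

    def add_step(self, description):
        # Record each troubleshooting attempt in the order it was tried.
        self.troubleshooting_steps.append(description)

record = ProblemRecord(
    problem_id="PRB-001",
    occurred_at=datetime(2024, 1, 15, 9, 30),
    symptoms="VPN sessions drop after roughly ten minutes",
    related_problems=["PRB-000"],
)
record.add_step("Restarted the VPN gateway service")
record.add_step("Checked gateway firmware version")
print(len(record.troubleshooting_steps))
```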

#3: Categorize

Once an issue crops up, you will need to assign a primary and secondary category to it. This categorization of the issue enables the service desk to filter through and model frequently recurring incidents.

This also builds a solid foundation for the service desk to collect data which can be later analyzed to generate reports. Ultimately, these insights also assist the service desk in recognizing trends and evaluating the overall impact on service demand.




#4: Prioritize

The urgency of the problem and its effect on the users and the company determine how the issue is prioritized. Prioritization brings efficient resource allocation, which, in turn, helps to reduce SLA breaches.

You can reallocate your resources as soon as the detection of the issue takes place.
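A common way to combine urgency and impact is a simple priority matrix. The sketch below assumes a hypothetical three-level scale and a made-up rule (sum of the two ranks) that yields priorities P1 to P5; your organization's matrix may well differ.

```python
# Hypothetical ranking: a lower number means more severe.
LEVELS = {"high": 1, "medium": 2, "low": 3}

def priority(impact, urgency):
    """Map impact and urgency ('high'/'medium'/'low') to a priority from 1 to 5."""
    # Summing the two ranks gives 2..6; subtract 1 so P1 is the most critical.
    return LEVELS[impact] + LEVELS[urgency] - 1

problems = [
    ("Printer driver crash", "low", "medium"),
    ("Payroll database timeouts", "high", "high"),
    ("Intermittent Wi-Fi drops", "medium", "high"),
]
# Work the queue from the most critical priority (1) downward.
for name, impact, urgency in sorted(problems, key=lambda p: priority(p[1], p[2])):
    print(f"P{priority(impact, urgency)}: {name}")
```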

#5: Investigate and Diagnose

The problem’s assigned priority defines how and when its investigation and diagnosis are performed. A problem that has been assigned a high priority has to be addressed first since it has the power to impact service delivery.

This typically involves the assessment and testing of logged incidents in the problem report. These incidents have not been tested at the service desk level.

#6: Identify a Workaround

Resolving a problem can take anywhere from a few hours to several months. This is primarily because problems cannot be resolved at the incident level. Generally, a short-term workaround is needed to assist the service desk in its effort to restore services. In parallel, the problem resolution takes place as the company addresses the core issues that triggered the problem.

That said, a short-term workaround is just a temporary solution. While it is in use, the problem’s status remains open and unresolved.

#7: Raise a Known Error Record

The next stage in the ITIL problem management process is to record a known error. Once a workaround has been found, the problem should be communicated to employees as a ‘known error’.

Here the KEDB, or Known Error Database, comes into play, as it is the database in which known errors are recorded.

With known errors documented, the service desk can resolve matching incidents quickly. This also prevents duplicate problem records from being raised.
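As a rough illustration of how a KEDB speeds up incident handling, the sketch below keeps known errors in memory and matches new incident descriptions against their symptom keywords. The class, its methods, and the keyword-matching rule are all hypothetical; a real KEDB would normally live inside your help desk tool.

```python
class KnownErrorDatabase:
    """Toy in-memory KEDB used only to illustrate the lookup idea."""

    def __init__(self):
        self._entries = []

    def record(self, error_id, symptom_keywords, workaround):
        # Store keywords as a lowercase set for case-insensitive matching.
        self._entries.append(
            (error_id, {k.lower() for k in symptom_keywords}, workaround)
        )

    def lookup(self, incident_description):
        """Return (error_id, workaround) pairs whose keywords all appear in the description."""
        words = set(incident_description.lower().split())
        return [(eid, fix) for eid, keys, fix in self._entries if keys <= words]

kedb = KnownErrorDatabase()
kedb.record("KE-101", ["vpn", "disconnects"], "Switch the client to the backup gateway")
print(kedb.lookup("User reports VPN disconnects every ten minutes"))
```

An agent who finds a match can apply the documented workaround immediately instead of raising a new problem.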

#8: Resolve

The company’s objective should be to resolve any arising problems. Some key and complex problems that have a direct impact on your service delivery levels may require the constitution of a change management board.

Take, for example, an instance where your operations are slowed down as a result of switching databases. For optimal output, the business must be able to assess and account for all associated risks prior to implementing any resolution.

#9: Close

A problem can only be closed once it has gone through all the required stages: identification, logging, categorization, prioritization, diagnosis, and resolution. The majority of organizations follow only these nine steps in their Problem Management process flow.
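The closure rule can be expressed as a simple check. This is only a sketch: the stage names are taken from the paragraph above, while the function names and the idea of tracking completed stages as a list are assumptions made for the example.

```python
REQUIRED_STAGES = [
    "identification", "logging", "categorization",
    "prioritization", "diagnosis", "resolution",
]

def missing_stages(completed):
    """Return the required stages not yet completed, in lifecycle order."""
    done = set(completed)
    return [s for s in REQUIRED_STAGES if s not in done]

def can_close(completed):
    # A problem may be closed only when no required stage is outstanding.
    return not missing_stages(completed)

print(can_close(["identification", "logging", "categorization"]))
print(missing_stages(["identification", "logging", "categorization"]))
```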

A tenth step is recommended to be followed in the ITIL problem management lifecycle that comprises reviewing the problem. The end goal is to prevent the recurrence of the same problem in the future.

This last step is key, as the relevant teams perform Post Incident Reviews that closely analyze the problem. Errors and mistakes can be identified by cross-referencing with the problem log. This, in turn, exposes loopholes and areas for improvement so that appropriate corrective actions can be taken for the future. Overall, it helps enhance operational efficiency and increases employee productivity and performance.

Problem Management Techniques

Understanding the Problem Management process is key. What you also need to be aware of are the various Problem Management Techniques that are available to the service provider.


These techniques are recommended by experts as they help remove fear, uncertainty, and doubt. They do so by identifying the root cause of the problem and driving teams towards a permanent resolution.

Take a look at some of the problem management techniques and their respective applications.

#1: Chronological Analysis

Being a time-based approach, Chronological Analysis enables the user to trace the order of events, which leads to the identification of the chain of cause and effect.

This technique comes in handy when the user is trying to look at potential long term problems that have slowly developed over a period of time. Generally, these problems showcase very distinctive symptoms.

#2: Pain Value Analysis

If you are looking to narrow your search down to the precise effects that specific incidents and events have on the business, then Pain Value Analysis is the technique to adopt.

When the business is encountering a series of problems simultaneously, conducting a Pain Value Analysis helps in the segregation and prioritization of the items.
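There is no single standard formula for pain value. The sketch below uses a made-up one (incident count × average hours of downtime × users affected × an optional cost weight) purely to show how such a score can rank several simultaneous problems; the formula, names, and figures are all illustrative assumptions.

```python
def pain_value(incident_count, avg_hours_down, users_affected, cost_weight=1.0):
    """Hypothetical pain score: more incidents, longer outages, and more users hurt more."""
    return incident_count * avg_hours_down * users_affected * cost_weight

problems = {
    "Mail server queue stalls": pain_value(12, 0.5, 200),
    "CRM login failures": pain_value(4, 2.0, 50),
    "Badge reader offline": pain_value(20, 0.25, 10),
}
# Highest pain first: these are the problems to tackle before the others.
for name, score in sorted(problems.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{score:8.1f}  {name}")
```

Note that the badge reader has the most incidents but the lowest pain, which is exactly the kind of distinction this technique is meant to surface.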

#3: Brainstorming

Taking on a more collaborative approach, Brainstorming is a key Problem Management technique that allows knowledgeable individuals and subject matter experts to collectively share their opinions and perspectives on the cause of the problem.

The methodology does not subscribe to the saying ‘too many cooks spoil the broth’. On the contrary, this ‘coming together’ strategy allows the problem to be viewed through multiple sets of eyes, minds, and perspectives. Broadening the perspectives in this way can enhance problem-solving capabilities.

#4: Kepner-Tregoe Problem Solving Method

The Kepner-Tregoe Problem Solving method comes in handy when businesses encounter a major problem that needs formality and structure.

With a set of carefully formulated steps, this popular Problem Management method helps in zeroing in on the root cause of the problem. It then recommends a solution and sees it through the implementation stage.

#5: Ishikawa Diagram (Fishbone)

Using a structured and categorized approach, the Ishikawa Diagram technique of Problem Management enables the service provider to frame a question and come up with possible causes. As a more visual approach, the technique can be a precursor to other methods being used in the future.

#6: Pareto Analysis

Pareto Analysis enables the service provider to concentrate, through a process of prioritization, on the problems that have the greatest effect. The technique works on the assumption that roughly 80% of incidents result from 20% of the underlying problems.

The service provider first resolves the problems that offer the greatest return. Combining Pareto Analysis with Ishikawa diagrams makes for an even more powerful Problem Management tool.
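A Pareto analysis can be sketched in a few lines: rank problems by the number of incidents they cause and keep adding them until a cutoff share of all incidents is covered. The function name, the 80% default, and the sample counts below are illustrative assumptions.

```python
def pareto_vital_few(incident_counts, cutoff_percent=80):
    """Return the smallest set of top problems covering at least `cutoff_percent` of incidents."""
    total = sum(incident_counts.values())
    ranked = sorted(incident_counts.items(), key=lambda kv: kv[1], reverse=True)
    selected, running = [], 0
    for problem, count in ranked:
        # Stop once the selected problems already cover the cutoff share.
        if running * 100 >= cutoff_percent * total:
            break
        selected.append(problem)
        running += count
    return selected

counts = {
    "expired certificates": 50,
    "disk full on file server": 30,
    "password lockouts": 10,
    "printer jams": 6,
    "monitor flicker": 4,
}
print(pareto_vital_few(counts))
```

In this made-up data set, two of the five problems account for 80 of the 100 incidents, so those two are where resolution effort pays off most.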

These Problem Management techniques may seem quite complex and difficult to use initially, but in reality, they are pretty easy to adopt and implement. What may be a struggle is to identify the correct technique according to the prevalent circumstances to battle the problem at hand.

Problem Management Roles and Responsibilities

To execute ITIL problem management processes successfully, it is imperative that the system includes people with clearly defined roles and responsibilities.

#1: The Problem Manager

At the helm is the Problem Manager who is designated as the leader and has complete ownership of the Problem Management process. Typically, his or her role will include:

  • Coordinating between all concerned teams and departments involved in Problem resolution
  • Keeping up with designated SLA timelines for Problem resolution
  • Taking total ownership and management of the KEDB, otherwise referred to as the Known Error Database
  • Ensuring that all Problems have been properly closed
  • Driving and liaising on major Problem reviews

That said, do not make the mistake of assigning a Problem Manager with the role of an Incident Manager too. This will create a conflict of interest and shift focus from resolution execution.

#2: The Problem Solving Team

The Problem Manager needs to be supported by either internal technical support team members or external vendors or suppliers to solve problems quickly and efficiently. Having said that, if the situation is serious and a major Problem needs resolving, the Problem Manager generally has the power to constitute a Project Management team.

This special team is dedicated and committed to finding a solution to a major problem. Usually, different team members will have their own individual expertise and skills that can solve the problem in a collective effort.

#3: Problem Management Software

Along with the expertise of a skilled leader and team members, incorporating a Problem Management system in the form of a help desk ticketing software suite greatly enhances the capabilities of the service desk.

The service desk is considered to be an evolved version of the help desk and is a by-product of the ITIL best practice framework. Remember that this is based on the concept of ‘managing IT as a service’.

Reasons Why a Help Desk Software Makes Problem Management Efficient

Being born of IT centricity, help desk principles may seem a petty element in the overall scheme of Problem Management efforts. But incorporating them can certainly go a long way in making the process efficient. The focus of the help desk is what is known as ‘break-fix’.

Help desk ticketing software is an add-on to the existing IT activities and offers constant support to the IT ecosystem of the business. Here are some compelling reasons why a help desk has become a necessary addition today.

#1: Minimize Incident Ticket Volumes

When you get a chance to view your Incident Reports, you may notice that your IT support staff are tackling the same issues over and over again. Fixing the same issue repeatedly points to a pattern, possibly the same users hitting the same incidents.

This generally occurs when reviews are restricted to the incident level, which translates into a gap in problem management. By limiting your review to the symptoms, which in this case are the incidents, you overlook the root cause of the problem. If it is not fixed when it should be, it can rear its ugly head once again, very soon.

When only incidents are addressed, the problem remains in the background and keeps interrupting your service delivery. This leads to an increased number of repeat incidents dealing with the same type of issue. It also means more tickets, which not only weigh on your agents but also show your organization in a bad light.

The role of Problem Management at this stage is to tackle the cause of the interruption head-on. By fixing it here, the problem can go away permanently. This also translates into fewer recurring incident tickets landing at your service desk.

#2: Reduce the Burden on Your IT Service Desk Staff

Heavy workload is one of the primary reasons IT Service Desk staff suffer from burnout. With a massive volume of incoming tickets, each seemingly as urgent as the next, Incident Prioritization takes a backseat as the tickets keep piling up. This ends up impacting overall business operations adversely.

Plus, SLA breaches occur more frequently as a huge number of incoming tickets fail to be triaged within the defined timelines. This shows up as a heavy hit to the SLA statistics when the reports are eventually generated.

In addition, the monotony of servicing the same incidents every day takes its toll on the mental well-being of the agents. A lack of appropriate Problem Management thus translates into rising staff turnover.

With a help desk ticketing system in place, the organization can take more cognizance of the scenario and effectively deal with commonly recurring problems.

With a varied workload and less stressful work environment, the IT Service Desk staff can have a more meaningful and enjoyable experience at work.

#3: Stop Disruptions Before They Happen

What if you could avoid repeat incidents altogether? Help desk software integrated with your Problem Management system is able to identify trends of repeat issues.

A 360-degree suite of help desk services includes incident management, asset management, event management, change management, and access management. The tools are available to monitor and track recurring issues that can be effectively dealt with by having a dedicated Problem Management team in place.

Preventing major business-crippling problems also means fewer tickets for your staff to deal with. With a dedicated help desk tool, your IT team can be more transparent, collaborative, and efficient.

The Goal of Problem Management

Unquestionably, the ITIL problem management process is implemented to achieve specific goals. The primary amongst them all is the prevention of incidents from occurring over and over again. Just think about the role of a Service Provider when they have to constantly react to repetitive incidents that never truly get resolved.

Combating relapsing incidents and problems is definitely not a ‘business as usual’ scenario. From a business point of view, it leads to a loss of resources and an increase in operational expenditure simply because there is no concrete solution to recurring incidents.

A growing number of incidents will eventually send user and customer satisfaction plummeting. Moreover, this will adversely impact the reputation of the Service Desk. Shadow IT initiatives may also become standard, and together these effects can undermine your ability to enable the business to function.

Problem Management implies taking time out to apply the techniques that fit the scenario. The goal is to go beyond resolving Incidents and concentrate on their underlying causes. The irony of the situation is that when you focus on Problem Management, you automatically address issues relating to Incident Management too. This may appear to consume more resources than you anticipated.

But when done right, Problem Management is not resource-hungry. Commitment and attention are the most important requirements for success here.

Having taken a look at all aspects of ITIL Problem Management, here is an overview of the most commonly asked questions.


Q.1 Which Factor Does not Impact the Complexity of an Incident?

Several factors can impact incident complexity. These include political sensitivity, weather and other environmental factors, and possibly media relations.
Incident complexity can also be influenced by a terror attack or the presence of potentially hazardous materials.

Q.2 What are the Different Types of Problem Management?

The different types of Problem Management are:

  • Reactive Problem Management
  • Proactive Problem Management

Q.3 What are the KPIs of Problem Management?

The KPIs of Problem Management include:

  • Number of Problems
  • Problem Resolution Time
  • Number of unresolved Problems
  • Number of Incidents per Known Problem
  • Time until Problem Identification
  • Problem Resolution Effort
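Several of these KPIs can be computed directly from the problem log. The sketch below assumes a hypothetical record shape (ID, opened date, resolved date or `None`, linked incident count) and made-up sample data; it covers only four of the KPIs listed above.

```python
from datetime import datetime
from statistics import mean

# Hypothetical problem records: (id, opened, resolved or None, linked incident count)
problems = [
    ("PRB-1", datetime(2024, 3, 1), datetime(2024, 3, 11), 8),
    ("PRB-2", datetime(2024, 3, 5), None, 3),
    ("PRB-3", datetime(2024, 3, 7), datetime(2024, 3, 9), 4),
]

def problem_kpis(records):
    """Compute a few Problem Management KPIs from a list of problem records."""
    # Resolution time in whole days, for resolved problems only.
    resolved_days = [(closed - opened).days for _, opened, closed, _ in records if closed]
    return {
        "number_of_problems": len(records),
        "unresolved_problems": sum(1 for r in records if r[2] is None),
        "avg_resolution_days": mean(resolved_days) if resolved_days else None,
        "incidents_per_problem": mean(r[3] for r in records) if records else None,
    }

print(problem_kpis(problems))
```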

Q.4 How Many Types of Changes are There in ITIL?

ITIL distinguishes three types of change:

  • Standard change
  • Normal change (often further classed as minor, significant, or major)
  • Emergency change

About the author

Jared is a customer support expert. He has been published in CrazyEgg, Foundr, and CXL. As a customer support executive at ProProfs, he has been instrumental in developing a complete customer support system that more than doubled customer satisfaction.
