Mod 3 > Week 2 > Day 2

Overview of the day

Today we look at the Maintenance stage of the SDLC.


Learning Objectives

Lesson

Purpose of the Maintenance stage

The maintenance stage is where deployed software is supported and enhanced. It is where we find out whether the business case that initiated the software development life cycle has been realised, and how decisions made during the design and development stages affect the longevity of the system.

Software needs to be maintained for the following reasons:

Inputs to the Maintenance stage

The input to the maintenance stage is production software. This stage mainly deals with:

Service Level Agreements

A Service Level Agreement (SLA) is a legal agreement between two parties (a supplier and a customer) that defines the services the supplier will deliver and the expected level of service. If the expected level of service is not met, the supplier typically incurs a financial penalty.

Here is an example SLA for Google Workspace for ISPs.
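To make the idea of an "expected level of service" concrete, here is a minimal sketch of how an availability target could be checked. The 99.9% target, the 30-day month and the function names are assumptions chosen for illustration; a real SLA defines its own measurement period, exclusions and remedies.

```python
def availability_percent(total_minutes: int, downtime_minutes: int) -> float:
    """Return the percentage of the period during which the service was up."""
    if total_minutes <= 0:
        raise ValueError("total_minutes must be positive")
    if not 0 <= downtime_minutes <= total_minutes:
        raise ValueError("downtime_minutes must be between 0 and total_minutes")
    return 100.0 * (total_minutes - downtime_minutes) / total_minutes


def sla_met(total_minutes: int, downtime_minutes: int, target_percent: float) -> bool:
    """Return True if measured availability reaches the agreed target."""
    return availability_percent(total_minutes, downtime_minutes) >= target_percent


# Hypothetical figures: a 30-day month and a 99.9% availability target.
MINUTES_IN_MONTH = 30 * 24 * 60  # 43,200 minutes

print(sla_met(MINUTES_IN_MONTH, 30, 99.9))  # 30 minutes of downtime -> True
print(sla_met(MINUTES_IN_MONTH, 60, 99.9))  # 60 minutes of downtime -> False
```

With these assumed numbers, a 99.9% target allows roughly 43 minutes of downtime per month, which is why 30 minutes passes and 60 minutes does not.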

1st, 2nd, 3rd line support

One aspect of this phase of the SDLC is support. Support is usually organised into three tiers: 1st, 2nd and 3rd line. Support requests are often tracked in a ticketing system. Let us have a look at how it works.

Support level | Purpose
1st line support | Usually the first point of contact when a bug or error is detected. This might be over the phone, for example. It often involves raising a ticket on a system so that the reported incident can be tracked. Quick solutions may be offered, for example, "have you tried turning it off and on again?"
2nd line support | This is usually more involved, and often a support professional with domain knowledge will try to solve the problem. It typically involves on-site visits or remote sessions. 2nd line support technicians might escalate something they are unable to fix to 3rd line support.
3rd line support | This level involves the experts. These are the people who actually wrote the software, or the network engineers who actually installed the network.
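The escalation path between the three support levels can be sketched as a small ticket model. This is an illustration only: the `Ticket` class, its fields and the example incident are hypothetical and are not taken from any particular help desk product.

```python
from dataclasses import dataclass, field

# The three tiers described above, in escalation order.
SUPPORT_LEVELS = ("1st line", "2nd line", "3rd line")


@dataclass
class Ticket:
    """A hypothetical support ticket that starts at 1st line support."""
    summary: str
    level_index: int = 0
    history: list = field(default_factory=list)

    @property
    def level(self) -> str:
        return SUPPORT_LEVELS[self.level_index]

    def escalate(self, reason: str) -> None:
        """Move the ticket to the next support level and record why."""
        if self.level_index == len(SUPPORT_LEVELS) - 1:
            raise ValueError("already at 3rd line support; cannot escalate further")
        self.history.append((self.level, reason))
        self.level_index += 1


ticket = Ticket("Users cannot log in")
ticket.escalate("Restart did not help")            # 1st line -> 2nd line
ticket.escalate("Suspected bug in the login code") # 2nd line -> 3rd line
print(ticket.level)  # 3rd line
```

Recording the reason for each escalation in `history` mirrors the purpose of raising a ticket: the incident can be tracked as it moves between teams.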

Watch this video to hear more about the life of a Service Desk Support Analyst.

Companies typically use help desk software such as Zendesk or LiveAgent to handle support queries.

When things go wrong

It happens. Sometimes software breaks in production. The important thing is how we identify the root cause and provide a fix.

In this video we hear how NASA engineers used problem solving to return the Apollo 13 astronauts safely to Earth. An explosion in an oxygen tank had caused a loss of oxygen, water and electrical power, as well as the loss of use of the propulsion system.

In the following sections we are going to learn about four problem-solving techniques, ranging from informal to highly structured.

Brainstorming

This is the most informal kind of problem solving and is often a starting point. Members of the team pile in with ideas about what might have caused a failure. There is no particular structure or order.

Fault tree analysis

Fault tree analysis is a top-down approach for identifying all the potential causes of a defect. The analysis begins with a major defect. All the potential events that may cause the defect, individually or in combination, are identified and connected using Boolean logic (AND or OR gates). Each of these events is then traced down in the same way until the lowest possible level of events or faults is reached.
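The Boolean logic behind a fault tree can be sketched in a few lines of code. In this minimal example a tree node is either the name of a lowest-level event or a gate with child nodes. The tree itself ("website unavailable") and all event names are hypothetical and chosen only to illustrate AND and OR gates.

```python
def evaluate(node, events):
    """Return True if the fault represented by `node` occurs.

    `node` is either a string naming a lowest-level event, or a tuple
    (gate, children) where gate is "AND" or "OR".
    `events` maps each lowest-level event name to True (occurred) or False.
    """
    if isinstance(node, str):
        return events[node]
    gate, children = node
    results = [evaluate(child, events) for child in children]
    if gate == "AND":
        return all(results)  # every child fault must occur
    if gate == "OR":
        return any(results)  # one child fault is enough
    raise ValueError(f"unknown gate: {gate}")


# Hypothetical top-level defect: the website is unavailable if the database
# is down, OR if the primary server fails AND the failover also fails.
website_unavailable = (
    "OR",
    ["database_down", ("AND", ["primary_server_failed", "failover_failed"])],
)

observed = {
    "database_down": False,
    "primary_server_failed": True,
    "failover_failed": False,
}
print(evaluate(website_unavailable, observed))  # False: the failover worked
```

Working from the top-level defect down to the individual events in this way shows which combinations of faults are sufficient to cause the failure.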

Ishikawa (fishbone) diagrams

Ishikawa diagrams show the potential causes of a specific event. Causes are grouped into major categories, typically People, Process, Equipment, Materials, Environment and Management.

Kepner-Tregoe "Root cause analysis"

The Kepner-Tregoe method is a systematic way to analyse a problem and find its root cause, instead of making assumptions and jumping to conclusions. The Apollo 13 Mission Control and NASA engineers used the Kepner-Tregoe methodology to return the astronauts safely to Earth.

Kepner-Tregoe training is rigorous. It requires trainees to work through complex simulations that are extremely intellectually challenging.

There are four basic steps when using the Kepner-Tregoe decision matrix:

Type | Analysis
Situation Appraisal | This is used to clarify the problem (what happened).
Problem Analysis | This is used to find the cause of the problem.
Decision Analysis | This determines the options for potential problem resolution and the risks associated with each.
Potential Problem Analysis | This anticipates future problems and looks at preventative actions.
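One common way to compare options during Decision Analysis is a weighted scoring matrix: each criterion is given a weight, each option is rated against each criterion, and the weighted ratings are summed. The sketch below is a simplified illustration of that idea and not the official Kepner-Tregoe worksheet. The options, criteria, weights and ratings are all hypothetical.

```python
def weighted_scores(weights, ratings):
    """Return a total score per option.

    `weights` maps each criterion to its importance (higher = more important).
    `ratings` maps each option to a dict of criterion -> rating (0 to 10).
    """
    return {
        option: sum(weights[criterion] * rating
                    for criterion, rating in option_ratings.items())
        for option, option_ratings in ratings.items()
    }


# Hypothetical criteria and weights for resolving a production incident.
weights = {"speed_of_fix": 5, "low_risk": 8, "low_cost": 3}

# Hypothetical ratings for two possible resolutions.
ratings = {
    "roll back release": {"speed_of_fix": 9, "low_risk": 8, "low_cost": 7},
    "hotfix in production": {"speed_of_fix": 6, "low_risk": 4, "low_cost": 8},
}

scores = weighted_scores(weights, ratings)
best_option = max(scores, key=scores.get)
print(scores)       # {'roll back release': 130, 'hotfix in production': 86}
print(best_option)  # roll back release
```

The score only ranks the options. The risks associated with the highest-scoring option still need to be reviewed before a decision is made.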

Output of the Maintenance stage

The output of the Maintenance stage is operational software that satisfies the business needs identified in the first stage of the SDLC.

Assignment

TODO

Additional resources

Research paper on the Apollo 13 decision making process
