Risk Based Testing and Failure Mode and Effects Analysis
Risk is the probability of a negative or undesirable outcome or event. A risk is any problem that might occur and that would decrease customer, user, or stakeholder perception of quality and/or project success.
Types of risks:
There are 2 types of risks- Product Risk and Project Risk.
- Product risk-
When the primary effect of a potential problem is on product quality, the potential problem is called a product risk. It can also be called a quality risk. Example- A defect that can cause a system crash or monetary loss.
- Project Risk-
When the primary effect of a potential problem is on the overall success of the project, the potential problem is called a project risk. It can also be called a planning risk. Example- Staffing issues that can delay project completion.
Not all risks are equally important. There are a number of ways to classify the level of risk. The easiest way is to look at two factors-
- Likelihood of occurrence of the problem. Likelihood arises from technical considerations.
- Impact of the problem, if it occurs. Impact arises from business considerations.
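As a minimal sketch of how the two factors might be combined, the snippet below multiplies likelihood by impact to get a single risk score. The 1 to 5 scales, the multiplication rule, the function name `risk_level`, and the sample risk items are all assumptions made for illustration; the text above only says that both factors are considered.

```python
def risk_level(likelihood: int, impact: int) -> int:
    """Combine likelihood and impact (each on an assumed 1-5 scale) into one score."""
    for name, value in (("likelihood", likelihood), ("impact", impact)):
        if not 1 <= value <= 5:
            raise ValueError(f"{name} must be between 1 and 5, got {value}")
    # Assumption: multiplying the two factors is one common convention, not the only one.
    return likelihood * impact


# Hypothetical risk items: likelihood comes from technical input, impact from business input.
risks = {
    "payment rounding error": risk_level(likelihood=2, impact=5),
    "slow report export": risk_level(likelihood=4, impact=2),
}

for name, score in sorted(risks.items(), key=lambda pair: pair[1], reverse=True):
    print(f"{name}: {score}")
```

A rare but costly problem can therefore outrank a frequent but minor one, which is why both factors are scored separately before being combined.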
What Is Risk-Based Testing?
In risk-based testing, we use the risk items identified during risk analysis, along with the level of risk associated with each risk item, to guide the testing. It is a software testing technique that prioritizes testing primarily by the probability of risk. Risk-based testing involves assessing the risk based on factors such as:
- Software quality.
- Frequency of use.
- Criticality to the business.
- Possible areas with defects, etc.
Characteristics Of Risk-Based Testing (RBT):
Below are some characteristics of Risk-based testing (RBT)-
- The RBT strategy matches the level of test effort to the level of risk. The higher the risk, the greater the test effort. This applies to test execution as well as other test activities like test design and implementation.
- It matches the order of testing with the level of risk. Higher-risk tests tend to find more defects, or are related to more important areas in the application, or both. So the higher the risk, the earlier in the cycle we plan the tests, both in design and execution. This helps in building the confidence of business stakeholders as well.
- By allocating effort and maintaining the order of testing, the quality risk gets systematically and predictably reduced. By maintaining a traceability matrix of tests vs risks and of identified defects vs risks, reporting in terms of residual risk makes sense. This allows the concerned stakeholders to decide when to stop testing i.e. when the risk of continuing testing exceeds the risk of testing completion.
- If schedule reduction requires scope reduction, it is easier to decide what needs to go out. It will always be acceptable and explainable to business stakeholders as risk levels are agreed upon.
- To identify risks, we need to take inputs from various sources like – Requirements specifications, design specifications, existing application data, previous incident records, etc. However, if this information is not readily available, we can still use this approach by getting inputs from stakeholders for the risk identification and assessment process.
Note that the ability to work with little or no documentation makes this testing strategy more robust (if not fail-proof), because its dependency on upstream processes like requirement gathering is reduced to an extent.
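The first two characteristics (effort matched to risk, and order matched to risk) can be sketched as follows. This is only an illustration: the `RiskItem` class, the `plan_tests` function, the strictly proportional split of hours, and the sample scores are assumptions, not a prescribed method.

```python
from dataclasses import dataclass


@dataclass
class RiskItem:
    name: str
    risk_score: int  # hypothetical score, e.g. likelihood x impact


def plan_tests(items, total_hours):
    """Order items by descending risk and split the hours in proportion to risk."""
    total_risk = sum(item.risk_score for item in items)
    if total_risk <= 0:
        raise ValueError("total risk must be positive")
    ordered = sorted(items, key=lambda item: item.risk_score, reverse=True)
    return [
        (item.name, total_hours * item.risk_score / total_risk)
        for item in ordered
    ]


# Hypothetical test items and a 40-hour test budget.
plan = plan_tests(
    [RiskItem("login", 6), RiskItem("checkout", 12), RiskItem("help pages", 2)],
    total_hours=40,
)
for name, hours in plan:
    print(f"{name}: {hours:.1f} h")
```

If the schedule shrinks, the same ordering shows which items drop out first: the ones at the bottom of the list.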
When To Implement Risk-Based Testing?
A risk-based testing approach is implemented in scenarios where-
- There are time, resource, or budget constraints. For example- A hotfix to be deployed in production.
- A proof of concept is being implemented.
- The team is new to the technology platform or to the application under test.
- Testing is done in incremental or iterative models.
- Security testing is done in cloud computing environments.
How Is Risk-Based Testing Implemented?
Risk can guide testing in multiple ways but below are the major ones –
- The effort is allocated by test managers proportional to the risk associated with the test item.
- Test techniques are selected in a way that matches the rigor and extensiveness required based on the level of risk of the test item.
- Test activities should be carried out in descending order of risk i.e. the highest-risk item should be tested first.
- Prioritization and resolution of defects should be done as appropriate with the level of risk associated.
- During test planning and control, test managers should carry out risk control for all significant risk items. The higher the level of risk associated, the more thoroughly it should be controlled.
- Reporting should be done in terms of residual risks. Example- Which tests have not been run yet? Which defects have not been fixed yet?
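Reporting in terms of residual risk can be sketched as a filter over a simple traceability record. The `RiskRecord` fields, the `residual_risks` function, and the sample data below are hypothetical; a real project would pull these numbers from its test management and defect tracking tools.

```python
from dataclasses import dataclass


@dataclass
class RiskRecord:
    name: str
    risk_score: int      # hypothetical risk level for the item
    tests_planned: int   # tests traced to this risk
    tests_run: int       # tests already executed
    open_defects: int    # defects traced to this risk that are not fixed yet


def residual_risks(records):
    """Return risks with unrun tests or open defects, highest risk first."""
    remaining = [
        r for r in records
        if r.tests_run < r.tests_planned or r.open_defects > 0
    ]
    return sorted(remaining, key=lambda r: r.risk_score, reverse=True)


records = [
    RiskRecord("checkout", 12, tests_planned=20, tests_run=20, open_defects=1),
    RiskRecord("login", 6, tests_planned=10, tests_run=10, open_defects=0),
    RiskRecord("help pages", 2, tests_planned=5, tests_run=3, open_defects=0),
]

for r in residual_risks(records):
    not_run = r.tests_planned - r.tests_run
    print(f"{r.name}: {not_run} tests not run, {r.open_defects} open defects")
```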
It is important to note that risk management is not something that happens only at the project start. It should be an ongoing activity throughout the project lifecycle. However, the nature of the risks keeps changing depending on which test phase we are in.
Risks and their risk levels should be re-evaluated periodically based on new developments in the project. This may result in some risks being downgraded or even closed. Based on the outcomes, test effort allocation and other test control activities will also change.
Also, the effort should be to reduce quality risks by running tests and finding defects, and to reduce project risks through mitigation and contingency actions.
Benefits of Risk-Based Testing (RBT):
By identifying and analyzing the risks related to the system, it is possible to make testing efficient and effective-
RBT is efficient because you test the most critical areas of the system early in the cycle (the earlier a defect is detected, the lower the cost of fixing it).
RBT is effective because your time is spent according to the risk rating and mitigation plans. You do not spend time on items and activities which might not be very important in the overall scheme of things.
- Reduced Test cases-
Test case count gets reduced as test cases become more focused.
- Cost reduction-
Cost is reduced because critical issues get fixed early in the cycle, which lowers the cost of change.
- Faster time to market-
Faster time to market is more easily achievable with the RBT approach, as the most important features are in a shippable state early in the cycle.
Failure Mode And Effects Analysis (FMEA):
FMEA is a systematic technique used to identify quality risk items known as failure modes, i.e. it identifies where and how the system under test might fail and then assesses the relative impact of the different failures.
The steps involved in the process are:
- Failure modes-
What could fail?
- Failure causes-
Why would the failure happen?
- Failure effects-
What would be the outcome of each failure?
The FMEA approach is iterative i.e. re-evaluation of residual risk needs to be done repeatedly. This technique was originally designed to help prevent defects during the design and implementation phases and hence is expected to be used early in the cycle.
Failure mode analysis needs to be fine-grained, as the effects on users and customers need to be identified as well. Because this depth of analysis is required, FMEA documentation can be intricate.
It is mostly used in safety-critical, high-risk, and conservative projects. For example- Industrial control software, nuclear control software etc.
FMEA is very useful in evaluating a new process prior to implementation and in assessing the impact of a proposed change on an existing process.
This section details the FMEA process. Below are the steps-
1. Review the Process- Use a process flowchart to identify each process component and list each process component in the FMEA table.
2. Identify potential failure modes and map their potential impact-
- Review existing documentation, design, and data to identify the ways each component can fail to come up with an exhaustive list.
- Map each failure to the impact each failure has on the end product or on subsequent steps in the process.
- Assign severity to each failure.
|10||Hazardous, without warning|
|9||Hazardous, with warning|
3. Once impact and severity are identified, also assign the occurrence ranking.
- An occurrence is a ranking number associated with the likelihood that the failure mode and its associated cause will be present in the item being analyzed. The occurrence ranking has a relative meaning rather than an absolute value and is determined without regard to the severity or likelihood of detection.
- For System and Design FMEAs, the occurrence ranking considers the likelihood of occurrence during the design life of the product.
|10||>100 Per 1,000|
|9||50 Per 1,000|
|8||20 Per 1,000|
|7||10 Per 1,000|
|6||5 Per 1,000|
|5||2 Per 1,000|
|4||1 Per 1,000|
|3||0.5 Per 1,000|
|2||0.1 Per 1,000|
|1||<0.01 Per 1,000|
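As a sketch, the occurrence table above can be turned into a lookup. The table lists single rates rather than ranges, so this example assumes each listed rate is the lower bound for its ranking, and that anything below 0.1 per 1,000 gets ranking 1; the function name `occurrence_ranking` is also hypothetical.

```python
# (ranking, assumed lower bound in failures per 1,000), taken from the table above.
OCCURRENCE_THRESHOLDS = [
    (9, 50), (8, 20), (7, 10), (6, 5), (5, 2), (4, 1), (3, 0.5), (2, 0.1),
]


def occurrence_ranking(failures_per_1000: float) -> int:
    """Map a failure rate to an occurrence ranking from 1 to 10."""
    if failures_per_1000 < 0:
        raise ValueError("failure rate cannot be negative")
    if failures_per_1000 > 100:
        return 10
    for ranking, lower_bound in OCCURRENCE_THRESHOLDS:
        if failures_per_1000 >= lower_bound:
            return ranking
    # Assumption: rates below 0.1 per 1,000 fall into the lowest ranking.
    return 1


print(occurrence_ranking(3))     # falls between 2 and 5 per 1,000
print(occurrence_ranking(0.005)) # below every listed bound
```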
4. Assign Detection Ranking
- The detection ranking considers the likelihood of detection of the failure mode/cause, according to defined criteria.
- Detection is a relative ranking within the scope of the specific FMEA and is determined without regard to the severity or likelihood of occurrence.
5. Calculate the Risk Priority Number.
- The RPN is calculated by multiplying the three scoring columns-
- Severity (Point 2).
- Occurrence (Point 3).
- Detection (Point 4).
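The RPN calculation multiplies the severity, occurrence, and detection rankings. A minimal sketch, assuming each ranking is on a 1 to 10 scale as in the ranking tables above (the function name `rpn` and the sample rankings are illustrative):

```python
def rpn(severity: int, occurrence: int, detection: int) -> int:
    """Risk Priority Number = severity x occurrence x detection."""
    rankings = {"severity": severity, "occurrence": occurrence, "detection": detection}
    for name, value in rankings.items():
        if not 1 <= value <= 10:
            raise ValueError(f"{name} must be between 1 and 10, got {value}")
    return severity * occurrence * detection


# Hypothetical failure mode: hazardous without warning, occasional, fairly detectable.
print(rpn(severity=10, occurrence=3, detection=4))
```

With three 1 to 10 scales, the RPN ranges from 1 to 1,000. It is a relative number for ranking failure modes against each other, not an absolute measure of risk.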
6. Develop an action plan according to RPN.
- Decide which failures to work on first based on the Risk Priority Numbers. The highest RPNs get the highest priority.
- Define the action plan, the responsible person, and the expected date of completion.
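An action plan ordered by RPN could be sketched like this. The `ActionItem` fields mirror the three things the step asks for (action, responsible person, expected completion date); the class, the `prioritize` function, and all sample values are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ActionItem:
    failure_mode: str
    rpn: int
    action: str
    owner: str   # responsible person
    due: date    # expected date of completion


def prioritize(items):
    """Sort action items so that the highest RPN comes first."""
    return sorted(items, key=lambda item: item.rpn, reverse=True)


backlog = prioritize([
    ActionItem("sensor value stale", 120, "add staleness check", "owner-a", date(2030, 1, 15)),
    ActionItem("alarm not raised", 400, "add watchdog test", "owner-b", date(2030, 1, 8)),
])

for item in backlog:
    print(f"RPN {item.rpn}: {item.failure_mode} -> {item.action} ({item.owner}, due {item.due})")
```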
7. Implement the action plan.
8. Re-evaluate and repeat.
- Re-evaluate each of the potential failures once improvements have been made and determine the impact of the improvements.
- Repeat the process to identify the next actions.
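One way to quantify the re-evaluation step is to recompute the RPN after an improvement and compare it with the earlier value. This is a sketch under assumptions: the helper names and the before/after rankings are invented for illustration, and severity is kept unchanged here on the assumption that the improvement affects only occurrence and detection.

```python
def rpn(severity: int, occurrence: int, detection: int) -> int:
    """Risk Priority Number = severity x occurrence x detection."""
    return severity * occurrence * detection


def rpn_reduction_percent(before: int, after: int) -> float:
    """Percentage by which the RPN dropped after an improvement."""
    if before <= 0:
        raise ValueError("the RPN before improvement must be positive")
    return (before - after) * 100 / before


# Hypothetical failure mode: the improvement lowers occurrence and detection rankings.
before = rpn(severity=8, occurrence=5, detection=6)
after = rpn(severity=8, occurrence=2, detection=3)

print(f"RPN before: {before}, after: {after}")
print(f"Reduction: {rpn_reduction_percent(before, after):.0f}%")
```

Failure modes whose RPN is still high after re-evaluation are the candidates for the next round of actions.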
Benefits Of FMEA:
Below are the benefits of FMEA-
- It is precise and thorough and hence less likely to omit risks.
- It gives a holistic view of potential problems, since we need to do a detailed analysis of expected system failures.
- It provides justification for not doing certain tests i.e. where the probability of failure is lowest.
Challenges Of FMEA:
Below are the challenges of FMEA-
- It is documentation-heavy and hence time-consuming.
- When trying to determine the causes of failures, it might be challenging to determine the true cause from intermediate effects.
- Many organizations fail to recognize that the FMEA is not a static model. For successful risk management, the FMEA should be regularly updated as new potential failure modes are identified and corresponding control plans are developed.