Curve Fitting models – Software Engineering

Last Updated : 12 Dec, 2023

The curve fitting group of models uses statistical regression analysis to study the relationship between software complexity and the number of faults in a program, the number of changes, or the failure rate. These models find a relationship between input and output variables by using linear regression, nonlinear regression, or time series analysis. The dependent variable is, for example, the number of errors in a program. The independent variables include the number of modules changed in the maintenance phase, the time between failures, programmers' skill, program size, etc. The models in this group are: Estimation of Errors, Estimation of Complexity, and Estimation of Failure Rate. Each is explained below.

[Figure: graph of actual vs. estimated values]

1. Estimation of Errors Model:

The number of errors in a program can be estimated by using a linear or nonlinear regression model. A simple nonlinear regression model to estimate the total number of initial errors in the program, N, can be presented as follows:

$N = \Sigma a_{i} X_{i} + \Sigma b_{i} X^{2}_{i} + \Sigma c_{i} X^{3}_{i} + \varepsilon$

where $X_{i}$ is the $i$th error factor; $a_{i}$, $b_{i}$, $c_{i}$ are the coefficients of the model; and $\varepsilon$ is an error term. Typical error factors are software complexity metrics and environmental factors. Most curve fitting models involve only one error factor.
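As a minimal sketch of the single-factor case, the cubic model can be fitted by ordinary least squares. The function names `fit_error_model` and `predict_errors`, and the synthetic data, are illustrative assumptions and do not come from any particular tool or dataset.

```python
import numpy as np

def fit_error_model(x, n_errors):
    """Least-squares fit of N = a*x + b*x^2 + c*x^3 for one error factor x.

    The model has no intercept term, matching the formula in the text.
    """
    x = np.asarray(x, dtype=float)
    n_errors = np.asarray(n_errors, dtype=float)
    # Design matrix with one column per power of the error factor.
    design = np.column_stack([x, x**2, x**3])
    coeffs, *_ = np.linalg.lstsq(design, n_errors, rcond=None)
    return coeffs  # array [a, b, c]

def predict_errors(coeffs, x):
    """Evaluate the fitted cubic model at new values of the error factor."""
    x = np.asarray(x, dtype=float)
    a, b, c = coeffs
    return a * x + b * x**2 + c * x**3

# Synthetic, noise-free data (assumed for illustration only).
complexity = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
errors = 2.0 * complexity + 0.5 * complexity**2 + 0.1 * complexity**3

coeffs = fit_error_model(complexity, errors)
estimate = predict_errors(coeffs, 7.0)
```

With real project data the residuals will not be zero, so the fitted coefficients should be read together with a goodness-of-fit check rather than taken at face value.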

2. Estimation of Complexity Model:

This model is used to estimate the software complexity, CR, using the time series approach. The software complexity model is summarized as follows:

$C_{R} = a_{0} + a_{1} R + a_{2} E_{R} + a_{3} M_{R} + a_{4} I_{R} + a_{5} D + \varepsilon$

where
$C_{R}$ = software complexity at release $R$
$R$ = release sequence number
$E_{R}$ = environmental factor(s) at release $R$
$M_{R}$ = number of modules at release $R$
$I_{R}$ = inter-release interval for release $R$
$D$ = number of days when first error occurs
$\varepsilon$ = error

This model is used when the software is evaluated over time, that is, as successive versions of the software are released.
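The complexity model is linear in its coefficients, so one way to estimate them from per-release records is ordinary least squares. The sketch below assumes a single numeric environmental factor per release and uses randomly generated data; the function name `fit_complexity_model` and all values are hypothetical.

```python
import numpy as np

def fit_complexity_model(release, env, modules, interval, days, complexity):
    """Estimate a0..a5 in CR = a0 + a1*R + a2*ER + a3*MR + a4*IR + a5*D."""
    columns = [np.ones(len(release)), release, env, modules, interval, days]
    # One row per release, one column per term of the model.
    design = np.column_stack([np.asarray(col, dtype=float) for col in columns])
    target = np.asarray(complexity, dtype=float)
    coeffs, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coeffs

# Synthetic per-release history (assumed for illustration only).
rng = np.random.default_rng(0)
n_releases = 12
release = np.arange(1, n_releases + 1)       # R: release sequence number
env = rng.uniform(0.0, 1.0, n_releases)      # ER: environmental factor
modules = rng.integers(50, 200, n_releases)  # MR: number of modules
interval = rng.uniform(20.0, 90.0, n_releases)  # IR: inter-release interval
days = rng.uniform(1.0, 30.0, n_releases)    # D: days when first error occurs

true_coeffs = np.array([5.0, 0.8, 2.0, 0.05, -0.02, 0.1])
complexity = (true_coeffs[0] + true_coeffs[1] * release + true_coeffs[2] * env
              + true_coeffs[3] * modules + true_coeffs[4] * interval
              + true_coeffs[5] * days)

fitted = fit_complexity_model(release, env, modules, interval, days, complexity)
```

At least six releases are needed to determine the six coefficients, and in practice considerably more are needed before the estimates are stable.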

3. Estimation of Failure Rate Model:

This model is used to estimate the failure rate of software. Given failure times $t_{1}, t_{2}, \ldots, t_{n}$, a rough estimate of the failure rate in the $i$th failure interval is

$\widehat{\lambda}_{i} = \frac{1}{t_{i} - t_{i-1}}$

where $t_{0} = 0$ is taken as the start of observation. Assuming that the failure rate is monotonically non-increasing, an estimate of this function $\lambda_{i}$, $i = 1, 2, \ldots, n$, can be obtained by using the least squares method.
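One possible reading of this procedure is sketched below. Each rough rate is taken as the reciprocal of the positive gap between consecutive failure times, with observation assumed to start at time 0, and the non-increasing least squares fit is computed with the pool-adjacent-violators algorithm. That algorithm is a choice made for this sketch, not something the model prescribes, and the function names and failure times are hypothetical.

```python
import numpy as np

def rough_failure_rates(failure_times, start=0.0):
    """Reciprocal of each gap between consecutive failure times."""
    t = np.concatenate(([start], np.asarray(failure_times, dtype=float)))
    gaps = np.diff(t)
    if np.any(gaps <= 0):
        raise ValueError("failure times must be strictly increasing")
    return 1.0 / gaps

def nonincreasing_least_squares(values):
    """Closest non-increasing sequence to `values` in the least-squares sense.

    Uses pool-adjacent-violators: neighbouring blocks that break the
    ordering are merged and replaced by their mean.
    """
    blocks = []  # each block is [sum of values, number of values]
    for v in values:
        blocks.append([float(v), 1])
        # Merge while the earlier block's mean is below the later block's mean.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] < blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    return np.concatenate([np.full(count, total / count) for total, count in blocks])

# Hypothetical cumulative failure times (for illustration only).
failure_times = [2.0, 3.0, 7.0, 9.0, 17.0, 30.0]
rough = rough_failure_rates(failure_times)
smoothed = nonincreasing_least_squares(rough)
```

The rough rates jump up and down because each depends on a single interval; the smoothed sequence averages out those reversals while staying as close as possible to the rough estimates.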