# Project Size Estimation Techniques – Software Engineering

Project size estimation is a crucial aspect of software engineering, as it helps in planning and allocating resources for the project. Here are some of the popular project size estimation techniques used in software engineering:

• Expert Judgment: In this technique, a group of experts in the relevant field estimates the project size based on their experience and expertise. This technique is often used when there is limited information available about the project.
• Analogous Estimation: This technique involves estimating the project size based on the similarities between the current project and previously completed projects. This technique is useful when historical data is available for similar projects.
• Bottom-up Estimation: In this technique, the project is divided into smaller modules or tasks, and each task is estimated separately. The estimates are then aggregated to arrive at the overall project estimate.
• Three-point Estimation: This technique involves estimating the project size using three values: optimistic, pessimistic, and most likely. These values are then used to calculate the expected project size using a formula such as the PERT formula.
• Function Points: This technique involves estimating the project size based on the functionality provided by the software. Function points consider factors such as inputs, outputs, inquiries, and files to arrive at the project size estimate.
• Use Case Points: This technique involves estimating the project size based on the number of use cases that the software must support. Use case points consider factors such as the complexity of each use case, the number of actors involved, and the number of use cases.
• Parametric Estimation: Mathematical models built from historical data and project parameters are used to produce repeatable size estimates.
• COCOMO (Constructive Cost Model): An algorithmic model that estimates the effort, time, and cost of a software development project from the estimated size (typically in KLOC) and a set of cost drivers.
• Wideband Delphi: A consensus-based estimation method in which a panel of experts produces anonymous individual estimates and then refines them through moderated group discussion until the estimates converge.
• Monte Carlo Simulation: Uses random sampling and statistical analysis to estimate project size and assess risk; it works especially well for complex, unpredictable projects.
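The three-point (PERT) and Monte Carlo techniques above can be sketched in a few lines of Python. The module name and the three size values below are hypothetical, chosen only to illustrate the arithmetic:

```python
import random

def pert_estimate(optimistic, most_likely, pessimistic):
    """Expected size via the PERT (beta) formula E = (O + 4M + P) / 6,
    with standard deviation (P - O) / 6."""
    expected = (optimistic + 4 * most_likely + pessimistic) / 6
    std_dev = (pessimistic - optimistic) / 6
    return expected, std_dev

# Hypothetical module sized in KLOC: best case 8, most likely 12, worst case 22.
expected, sd = pert_estimate(8, 12, 22)
print(f"PERT estimate: {expected:.1f} KLOC (std dev {sd:.1f})")
# → PERT estimate: 13.0 KLOC (std dev 2.3)

# Monte Carlo variant: sample a triangular distribution over the same three
# points and read off percentiles instead of a single number.
random.seed(42)
samples = sorted(random.triangular(8, 22, 12) for _ in range(10_000))
print(f"Monte Carlo median: {samples[5000]:.1f} KLOC, "
      f"90th percentile: {samples[9000]:.1f} KLOC")
```

The Monte Carlo percentiles are useful because they express the estimate as a range with confidence levels rather than a single point value.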

Each of these techniques has its strengths and weaknesses, and the choice of technique depends on various factors such as the project’s complexity, available data, and the expertise of the team.

## Importance of Project Size Estimation Techniques

• Resource Allocation: Accurate estimation ensures that financial and human resources are distributed appropriately.
• Risk Management: Early assessment of project size and complexity exposes risks, so mitigation strategies can be planned in advance.
• Time Management: Supports the creation of realistic schedules and milestones for efficient time management.
• Cost Control and Budgeting: Size estimates feed directly into budgets, reducing the likelihood of cost overruns.
• Work Allocation: Enables efficient task delegation and optimization of work assignments.
• Scope Definition: Defines the scope of a project, keeps project boundaries intact, and guards against scope creep.

## Estimating the size of the Software

Estimation of the size of the software is an essential part of Software Project Management. It helps the project manager to further predict the effort and time that will be needed to build the project. Various measures are used in project size estimation. Some of these are:

• Lines of Code
• Number of entities in the ER diagram
• Total number of processes in detailed data flow diagram
• Function points

1. Lines of Code (LOC): As the name suggests, LOC counts the total number of lines of source code in a project. The units of LOC are:

• KLOC – Thousand lines of code
• NLOC – Non-comment lines of code
• KDSI – Thousands of delivered source instructions

The size is estimated by comparison with existing systems of the same kind: experts predict the required size of the various software components and then add the component estimates to obtain the total size.

It is difficult to estimate LOC from the problem definition alone; an accurate count is available only after the whole code has been developed. This makes the metric of limited use to project managers, because project planning must be completed before development activity can begin.

Two source files with a similar number of lines may not require the same effort: a file with complicated logic takes longer to create than one with simple logic, so a proper estimate may not be attainable from LOC alone.

LOC also varies greatly from one programmer to the next: a seasoned programmer can write the same logic in fewer lines than a novice.

Advantages:

• It is universally accepted and used in many models, such as COCOMO.
• Estimation is close to the developer’s perspective.
• LOC is easily quantified at project completion.
• It has a direct connection to the final result.
• It is simple to use.

Disadvantages:

• Different programming languages need a different number of lines for the same functionality.
• No proper industry standard exists for this technique.
• It is difficult to estimate size with this technique in the early stages of a project.
• LOC cannot be used to normalize across different platforms and languages.
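As a concrete illustration of the LOC and NLOC units, here is a minimal line counter. It assumes `#`-style line comments; a real tool would use a language-aware parser:

```python
def count_loc(source: str):
    """Count total non-blank lines (LOC) and non-comment lines (NLOC)
    in a source string. Assumes '#'-style line comments."""
    total = nloc = 0
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # blank lines are not counted
        total += 1
        if not stripped.startswith("#"):
            nloc += 1
    return total, nloc

# A small illustrative source file.
sample = """# compute factorial
def fact(n):
    # base case
    if n <= 1:
        return 1
    return n * fact(n - 1)
"""
print(count_loc(sample))  # → (6, 4)
```

Note how the two units diverge: the file has 6 countable lines, but only 4 of them are actual code.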

2. Number of entities in ER diagram: The ER model provides a static view of the project: it describes the entities and their relationships. The number of entities in the ER model can be used to estimate the size of the project, because more entities require more classes/structures and therefore more code.

Advantages:

• Size estimation can be done during the initial stages of planning.
• The number of entities is independent of the programming technologies used.

Disadvantages:

• No fixed standards exist, and some entities contribute more to project size than others.
• Like function point analysis, it is rarely used directly in cost estimation models, so it must first be converted to LOC.
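The conversion to LOC mentioned above is usually a simple multiplication. The entity list and the 500-LOC-per-entity average below are hypothetical; in practice the average must be calibrated from completed projects of the same kind:

```python
# Rough size estimate from an ER model: multiply the entity count by an
# average LOC-per-entity figure taken from historical data.
entities = ["Customer", "Order", "OrderLine", "Product", "Invoice", "Payment"]
AVG_LOC_PER_ENTITY = 500  # hypothetical historical average

estimated_loc = len(entities) * AVG_LOC_PER_ENTITY
print(f"{len(entities)} entities -> ~{estimated_loc} LOC")
# → 6 entities -> ~3000 LOC
```

This also makes the disadvantage concrete: a simple lookup entity and a complex transactional entity both count as one, even though they contribute very differently to the final size.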

3. Total number of processes in detailed data flow diagram: A Data Flow Diagram (DFD) represents the functional view of the software: it depicts the main processes/functions and the flow of data between them. The number of functions in the DFD can be used to predict software size: existing processes of a similar type are studied to estimate the size of each process, and the sum of the per-process estimates gives the final estimated size.

Advantages:

• It is independent of the programming language.
• Each major process can be decomposed into smaller processes, which increases the accuracy of the estimation.

Disadvantages:

• Studying similar processes to estimate size takes additional time and effort.
• A DFD is not constructed for every software project.
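The per-process summation described above is straightforward to sketch. The process names and the historical LOC figures below are purely illustrative:

```python
# Bottom-up DFD sizing: estimate each process in the DFD from a similar
# historical process, then sum the estimates. All figures are hypothetical.
historical_loc = {
    "validate_order": 300,  # sized from a similar validation process
    "bill_customer": 450,   # sized from a similar billing process
    "ship_item": 250,       # sized from a similar dispatch process
}

dfd_processes = ["validate_order", "bill_customer", "ship_item"]
total_size = sum(historical_loc[p] for p in dfd_processes)
print(f"Estimated size: {total_size} LOC")  # → Estimated size: 1000 LOC
```

Decomposing a major process into sub-processes simply means replacing one dictionary entry with several finer-grained ones, which is why decomposition tends to improve accuracy.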

4. Function Point Analysis: In this method, the number and type of functions supported by the software are utilized to find FPC(function point count). The steps in function point analysis are:

• Count the number of functions of each proposed type.
• Compute the Unadjusted Function Points(UFP).
• Find the Total Degree of Influence(TDI).
• Find the Function Point Count(FPC).

The explanation of the above points is given below:

• Count the number of functions of each proposed type: Find the number of functions belonging to the following types:
• External Inputs: Functions related to data entering the system.
• External outputs: Functions related to data exiting the system.
• External Inquiries: They lead to data retrieval from the system but don’t change the system.
• Internal Files: Logical files maintained within the system. Log files are not included here.
• External interface Files: These are logical files for other applications which are used by our system.
• Compute the Unadjusted Function Points (UFP): Categorize each function of the five types as simple, average, or complex based on its complexity. Multiply the count of each category by its weighting factor and find the weighted sum. The weighting factors for each type are as follows:
| Function type | Simple | Average | Complex |
|---|---|---|---|
| External Inputs | 3 | 4 | 6 |
| External Outputs | 4 | 5 | 7 |
| External Inquiries | 3 | 4 | 6 |
| Internal Logical Files | 7 | 10 | 15 |
| External Interface Files | 5 | 7 | 10 |
• Find the Total Degree of Influence (TDI): Use the ’14 general characteristics’ of a system to find the degree of influence of each of them. The sum of all 14 degrees of influence gives the TDI, so its range is 0 to 70. The 14 general characteristics are: Data Communications, Distributed Data Processing, Performance, Heavily Used Configuration, Transaction Rate, On-Line Data Entry, End-User Efficiency, Online Update, Complex Processing, Reusability, Installation Ease, Operational Ease, Multiple Sites, and Facilitate Change.
Each of the above characteristics is evaluated on a scale of 0-5.
• Compute the Value Adjustment Factor (VAF): Use the following formula to calculate VAF:

```
VAF = (TDI * 0.01) + 0.65
```

• Find the Function Point Count (FPC): Use the following formula to calculate FPC:

```
FPC = UFP * VAF
```
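The whole calculation can be sketched end to end in Python. The weighting table is the one given above; the example project counts and the TDI of 35 are hypothetical:

```python
# Weighting factors from the FPA table above: (simple, average, complex).
WEIGHTS = {
    "EI":  (3, 4, 6),    # External Inputs
    "EO":  (4, 5, 7),    # External Outputs
    "EQ":  (3, 4, 6),    # External Inquiries
    "ILF": (7, 10, 15),  # Internal Logical Files
    "EIF": (5, 7, 10),   # External Interface Files
}
COMPLEXITY = {"simple": 0, "average": 1, "complex": 2}

def unadjusted_fp(counts):
    """counts maps (function_type, complexity) -> number of such functions."""
    return sum(n * WEIGHTS[ftype][COMPLEXITY[cx]]
               for (ftype, cx), n in counts.items())

def function_point_count(counts, tdi):
    """FPC = UFP * VAF, where VAF = (TDI * 0.01) + 0.65 and 0 <= TDI <= 70."""
    vaf = tdi * 0.01 + 0.65
    return unadjusted_fp(counts) * vaf

# Hypothetical project: 10 simple inputs, 5 average outputs, 4 complex files.
counts = {("EI", "simple"): 10, ("EO", "average"): 5, ("ILF", "complex"): 4}
print(unadjusted_fp(counts))                       # 10*3 + 5*5 + 4*15 = 115
print(round(function_point_count(counts, 35), 2))  # VAF = 1.00, so 115.0
```

With a TDI of 35 the VAF is exactly 1.0, so the adjusted count equals the UFP; a TDI below 35 shrinks the estimate and a TDI above 35 inflates it.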

Advantages:

• It can easily be used in the early stages of project planning.
• It is independent of the programming language.
• It can be used to compare different projects even if they use different technologies (database, language, etc.).