Software reverse engineering is the process of recovering the design, requirement specifications, and functions of a product by analyzing its code. It builds a program database and generates information from it. This article discusses reverse engineering in detail.
What is Reverse Engineering?
Reverse engineering can extract design information from source code, but the abstraction level, the completeness of the documentation, the degree to which tools and a human analyst work together, and the directionality of the process are highly variable.
Objectives of Reverse Engineering:
- Reducing Costs: Reverse engineering can help cut costs in product development by finding replacements or cost-effective alternatives for systems or components.
- Analysis of Security: Reverse engineering is used in cybersecurity to examine exploits, vulnerabilities, and malware. This helps security experts understand threat mechanisms and develop practical defenses.
- Integration and Customization: Through the process of reverse engineering, developers can incorporate or modify hardware or software components into pre-existing systems to improve their operation or tailor them to meet particular needs.
- Recovering Lost Source Code: Reverse engineering can be used to recover the source code of a software application that has been lost or is inaccessible, or at the very least to produce a higher-level representation of it.
- Fixing bugs and maintenance: Reverse engineering can help find and repair flaws or provide updates for systems for which the original source code is either unavailable or inadequately documented.
Reverse Engineering Goals:
- Cope with Complexity: Reverse engineering is a common tool used to understand and control system complexity. It gives engineers the ability to analyze complex systems and reveal details about their architecture, relationships and design patterns.
- Recover lost information: Reverse engineering seeks to retrieve as much information as possible in situations where source code or documentation is lost or unavailable. Rebuilding source code, analyzing data structures, and retrieving design details are a few examples of this.
- Detect side effects: Understanding a system or component’s behavior requires analyzing its side effects. Unintended implications, dependencies, and interactions that might not be obvious from the system’s documentation or original source code can be found with the use of reverse engineering.
- Synthesize higher abstractions: Abstracting low-level features in order to build higher-level representations is a common practice in reverse engineering. This abstraction makes communication and analysis easier by giving a clearer understanding of the system’s functionality.
- Facilitate Reuse: Reverse engineering can be used to find reusable parts or modules in systems that already exist. By understanding the functionality and architecture of a system, developers can extract and repurpose components for use in other projects, improving efficiency and decreasing development time.
Reverse Engineering to Understand Data:
Reverse engineering of data occurs at different levels of abstraction. It is often the first reengineering task.
- At the program level, internal program data structures must often be reverse engineered as part of an overall reengineering effort.
- At the system level, global data structures (e.g., files, databases) are often reengineered to accommodate new database management paradigms (e.g., the move from flat file to relational or object-oriented database systems).
Internal Data Structures
Reverse engineering techniques for internal program data focus on the definition of classes of objects.
- This is accomplished by examining the program code with the intent of grouping related program variables.
- In many cases, the data organization within the code identifies abstract data types.
- For example, record structures, files, lists, and other data structures often provide an initial indicator of classes.
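The grouping idea above can be sketched in a few lines of Python. This is a minimal illustration, not a production technique: it assumes that variables referenced by the same function belong together, and it merges such groups into candidate classes. The legacy source, the function name `candidate_classes`, and all variable names are hypothetical.

```python
import ast

# Hypothetical legacy code: related data passed around as loose variables.
LEGACY_SOURCE = '''
def open_account(acct_id, acct_owner, acct_balance):
    return acct_id, acct_owner, acct_balance

def deposit(acct_id, acct_balance, amount):
    return acct_id, acct_balance + amount

def print_invoice(inv_number, inv_total):
    return f"{inv_number}: {inv_total}"

def void_invoice(inv_number, inv_total):
    return inv_number, 0
'''

def candidate_classes(source):
    """Cluster variable names that are used together in at least one function."""
    groups = []  # disjoint sets of variable names
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.FunctionDef):
            continue
        # Names this function touches: its parameters plus any referenced names.
        names = {a.arg for a in node.args.args}
        names |= {n.id for n in ast.walk(node) if isinstance(n, ast.Name)}
        # Merge with every existing group that shares a name with this function.
        merged, remaining = set(names), []
        for group in groups:
            if group & names:
                merged |= group
            else:
                remaining.append(group)
        groups = remaining + [merged]
    return sorted(sorted(g) for g in groups)

for group in candidate_classes(LEGACY_SOURCE):
    print(group)
# ['acct_balance', 'acct_id', 'acct_owner', 'amount']
# ['inv_number', 'inv_total']
```

The two clusters suggest an account-like class and an invoice-like class. A human analyst would still need to review the result, for example to decide whether `amount` is really an attribute or just a transient argument.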
A database allows the definition of data objects and supports some method for establishing relationships among the objects. Therefore, reengineering one database schema into another requires an understanding of existing objects and their relationships.
The following steps define the existing data model as a precursor to reengineering a new database model:
- Build an initial object model.
- Determine candidate keys (the attributes are examined to determine whether they are used to point to another record or table; those that serve as pointers become candidate keys).
- Refine the tentative classes.
- Define generalizations.
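The key-determination step can be illustrated with a small Python sketch. It assumes the existing data is available as in-memory rows and treats a column as a pointer when every one of its values appears in a uniquely valued column of another table. The tables, column names, and helper functions are hypothetical, and a real schema would need more careful checks (types, nulls, sample size).

```python
def unique_columns(rows):
    """Columns whose values are all non-null and distinct (possible identifiers)."""
    return {
        col for col in rows[0]
        if all(r[col] is not None for r in rows)
        and len({r[col] for r in rows}) == len(rows)
    }

def pointer_columns(tables):
    """Return (table, column, target_table, target_column) for columns that point elsewhere."""
    found = []
    for name, rows in tables.items():
        for col in rows[0]:
            values = {r[col] for r in rows if r[col] is not None}
            for other, other_rows in tables.items():
                if other == name or not values:
                    continue
                for key in unique_columns(other_rows):
                    targets = {r[key] for r in other_rows}
                    if values <= targets:  # every value refers to an existing record
                        found.append((name, col, other, key))
    return sorted(found)

# Hypothetical flat-file data loaded into dictionaries.
TABLES = {
    "customers": [
        {"cust_no": 1, "name": "Ada"},
        {"cust_no": 2, "name": "Grace"},
        {"cust_no": 3, "name": "Edsger"},
    ],
    "orders": [
        {"order_no": 100, "cust_no": 1, "total": 25.0},
        {"order_no": 101, "cust_no": 1, "total": 40.0},
        {"order_no": 102, "cust_no": 3, "total": 15.5},
    ],
}

print(pointer_columns(TABLES))
# [('orders', 'cust_no', 'customers', 'cust_no')]
```

The single match indicates that `orders.cust_no` serves as a pointer into `customers`, which is the kind of relationship the object model needs to capture before a new schema is designed.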
Reverse Engineering to Understand Processing:
Reverse engineering to understand processing begins with an attempt to understand and then extract the procedural abstractions represented by the source code. To understand procedural abstractions, the code is analyzed at varying levels of abstraction: system, program, component, pattern, and statement.
- Each of the programs that make up the application system represents a functional abstraction at a high level of detail. A block diagram, representing the interaction between these functional abstractions, is created.
- Each component performs some subfunction and represents a defined procedural abstraction. A processing narrative for each component is developed.
For large systems, reverse engineering is generally accomplished using a semiautomated (partially automated) approach. Automated tools can be used to help you understand the semantics of existing code. The output of this process is then passed to restructuring and forward engineering tools to complete the reengineering process.
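As a rough illustration of what such automated support can look like, the sketch below uses Python's standard `ast` module to recover which functions call which others, the raw material for a block diagram of interacting components. The sample source and the name `call_graph` are hypothetical, and only direct calls to functions defined in the same module are tracked.

```python
import ast

# Hypothetical program under analysis.
SOURCE = '''
def load(path):
    return parse(read(path))

def read(path):
    return open(path).read()

def parse(text):
    return text.split()

def main():
    words = load("input.txt")
    report(words)

def report(words):
    print(len(words))
'''

def call_graph(source):
    """Map each top-level function to the module-level functions it calls directly."""
    tree = ast.parse(source)
    functions = [n for n in tree.body if isinstance(n, ast.FunctionDef)]
    defined = {f.name for f in functions}
    graph = {}
    for func in functions:
        callees = {
            node.func.id
            for node in ast.walk(func)
            if isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)   # skip method calls like text.split()
            and node.func.id in defined            # skip builtins like print and len
        }
        graph[func.name] = sorted(callees)
    return graph

for caller, callees in call_graph(SOURCE).items():
    print(caller, "->", callees)
# load -> ['parse', 'read']
# read -> []
# parse -> []
# main -> ['load', 'report']
# report -> []
```

This covers only the mechanical part of the task. Writing the processing narrative for each component still depends on an analyst interpreting what the calls mean.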
Steps of Software Reverse Engineering:
- Collecting Information: This step focuses on collecting all possible information (e.g., source code and design documents) about the software.
- Examining the Information: The information collected in step 1 is studied to become familiar with the system.
- Extracting the Structure: This step concerns identifying program structure in the form of a structure chart where each node corresponds to some routine.
- Recording the Functionality: During this step, the processing details of each module in the structure chart are recorded using a structured language, decision tables, etc.
- Recording Data Flow: From the information extracted in steps 3 and 4, a set of data flow diagrams is derived to show the flow of data among the processes.
- Recording Control Flow: The high-level control structure of the software is recorded.
- Review Extracted Design: The design document extracted is reviewed several times to ensure consistency and correctness. It also ensures that the design represents the program.
- Generate Documentation: Finally, in this step, the complete documentation including SRS, design document, history, overview, etc. is recorded for future use.
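Some of the steps above lend themselves to partial automation. As one example, the control-flow recording step could be approximated by the following Python sketch, which walks a syntax tree and prints an indented outline of each function's control structures. The sample source, the `CONTROL` table, and the function names are hypothetical, and the outline is deliberately coarse: it does not distinguish `else` branches, and an `elif` appears as a nested `if`.

```python
import ast

# Hypothetical routine whose control structure we want to record.
SOURCE = '''
def process(orders):
    for order in orders:
        if order["paid"]:
            ship(order)
        else:
            while not order["reminded"]:
                remind(order)
    return len(orders)
'''

# Statement types treated as high-level control structures in this sketch.
CONTROL = {ast.If: "if", ast.For: "for", ast.While: "while", ast.Try: "try"}

def label_for(node):
    """Return an outline label for functions and control statements, else None."""
    if isinstance(node, ast.FunctionDef):
        return f"def {node.name}"
    return CONTROL.get(type(node))

def control_outline(node, depth=0):
    """Collect an indented outline of the control structures nested under node."""
    lines = []
    for child in ast.iter_child_nodes(node):
        label = label_for(child)
        if label:
            lines.append("  " * depth + label)
            lines.extend(control_outline(child, depth + 1))
        else:
            lines.extend(control_outline(child, depth))
    return lines

print("\n".join(control_outline(ast.parse(SOURCE))))
# def process
#   for
#     if
#       while
```

An outline like this is only a starting point for the reviewed design document; the review step still has to confirm that it represents what the program actually does.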
Reverse Engineering Tools:
Reverse engineering tools accept source code as input and produce a variety of structural, procedural, data, and behavioral design representations. Done manually, reverse engineering would consume a great deal of time and human labor, so it must be supported by automated tools. Some of these tools are listed below:
- CIAO and CIA: A graphical navigator for software and web repositories and a collection of Reverse Engineering tools.
- Rigi: A visual software understanding tool.
- Bunch: A software clustering/modularization tool.
- GEN++: An application generator to support the development of analysis tools for the C++ language.
- PBS: Software Bookshelf tools for extracting and visualizing the architecture of programs.