Search Techniques in Cyber Forensics
Computer forensic examinations use computer-generated data as their primary source. The goal of any computer forensic examination is to find facts and, through these facts, to reconstruct the truth of an event. Search techniques are used to find out whether objects of a given type, such as hacking tools or pictures of a specific kind, are present in the information that has been collected.
There are two types of search techniques: Manual Browsing and Automated Searches.
What is Manual Browsing?
In Manual Browsing, the forensic analyst browses the information that has been gathered and selects objects of the desired type. The tool used for this browsing is a viewer of some kind: it takes a data object, e.g., a file, decodes it, and presents the result in human-readable form. Manual Browsing is slow and time-consuming because many investigations gather a massive amount of data.
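To make the idea of a viewer concrete, here is a minimal sketch of a hex viewer in Python. The function name hex_view and the column layout are assumptions chosen for illustration, not the behaviour of any particular forensic tool.

```python
def hex_view(data: bytes, width: int = 16) -> str:
    """Render raw bytes as offset, hex, and printable-ASCII columns."""
    pad = width * 3 - 1  # width of a full hex column, including spaces
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Replace non-printable bytes with '.' so the text column stays readable.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{pad}}  {text_part}")
    return "\n".join(lines)
```

Calling print(hex_view(b"GIF89a\x00\x01")) shows the bytes of a data object next to their printable form. The analyst still has to read every line, which is why this approach does not scale to large collections.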
What are Automated Searches?
The word automated comes from the Greek word automatos, meaning “acting of oneself.” Something that is automated can do what it is meant to do without a person helping to run it. An automated search procedure provides direct access to the automated files of another party, and the response to the search procedure is fully automated.
The types of automated searches are: Keyword Search, Regular Expression Search, Approximate Matching Search, Custom Searches, and Search of Modifications.
- Keyword Search –
Keyword search is a feature of cyber forensic tools used to find evidence in a large amount of electronic data. During a cyber crime investigation, for example, a forensic email search is performed on the basis of keywords that the analyst enters in the computer forensics tool. A keyword search looks for a specific set of keywords. It is a widely used, easy technique that speeds up manual browsing. The output of a keyword search is the list of data objects in which the keywords were found. However, there are two problems with keyword search: False Positives and False Negatives.
- (i). False Positive :
A keyword search only approximates the required type of data object, so its output can contain false positives. False positives are objects that do not belong to the required type even though they contain the specified keywords. The forensic analyst has to browse the data objects found by the keyword search manually to discard false positives.
- (ii). False Negative :
False negatives are objects of the required type that are missed by the search. A false negative results when the search utility fails to interpret a data object correctly. Encryption, compression, or the inability of the search utility to interpret a new data format may be the cause.
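The keyword search and both of its failure modes can be sketched in a few lines of Python. The function keyword_search and the sample evidence set are hypothetical, and zlib stands in for whatever compression a real data object might use.

```python
import zlib

def keyword_search(objects: dict[str, bytes], keywords: list[str]) -> dict[str, list[str]]:
    """Return, for each data object, the keywords found in its raw bytes."""
    hits = {}
    for name, data in objects.items():
        lowered = data.lower()  # case-insensitive comparison
        found = [kw for kw in keywords if kw.lower().encode() in lowered]
        if found:
            hits[name] = found
    return hits

message = b"Run the password cracker against the dump tonight."
evidence = {
    "notes.txt": message,                   # relevant and found
    "policy.txt": b"Employees must not install any password cracker software.",
    "notes.zlib": zlib.compress(message),   # relevant but compressed
}
results = keyword_search(evidence, ["password cracker"])
```

In this sketch policy.txt is a false positive: it contains the keyword but is not evidence of the tool being used. notes.zlib is a false negative: it holds the same text as notes.txt, but the search does not decompress it and therefore misses it.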
- Regular Expression Search –
A regular expression (regex) is a powerful way to search text-based files for data with an identifiable pattern. Regular expression search is an extension of keyword search: it provides a more expressive language than keywords for describing objects of interest. Regular expressions can be used, for example, to specify searches for e-mail addresses and for files of a precise type. Tools such as EnCase support regular expression searches. Not all types of data can be adequately described using a regex, and Regular Expression Search also produces false positives and false negatives.
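As an illustration, the following Python sketch searches raw bytes for e-mail addresses. The pattern EMAIL_RE is a deliberately simple assumption, not the pattern used by EnCase or any other tool, and it does not cover every valid address form.

```python
import re

# Simplified pattern: local part, '@', then at least two dot-separated labels.
EMAIL_RE = re.compile(rb"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")

def find_emails(data: bytes) -> list[str]:
    """Return the distinct e-mail-like strings found in data, sorted."""
    return sorted({m.group().decode("ascii") for m in EMAIL_RE.finditer(data)})

sample = b"Contact bob@example.com or alice.smith@mail.example.org; logo@2x.png is an image."
addresses = find_emails(sample)
```

Here addresses contains the two real addresses and also logo@2x.png, a file name that happens to fit the pattern. This is the kind of false positive a regular expression search can produce.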
- Approximate Matching Search –
Approximate Matching Search is an extension of regular expression search. It uses a matching algorithm that allows character mismatches while searching for a keyword, so it can detect misspelled words. Allowing mismatches, however, raises a lot of false positives. The agrep utility is one tool used for approximate matching.
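One way to allow character mismatches is to bound the edit distance between the keyword and a substring of the text. The Python sketch below uses a standard dynamic-programming formulation for this; it is an illustrative assumption and is not taken from agrep, which uses its own algorithm. The name approx_find is hypothetical.

```python
def approx_find(pattern: str, text: str, max_errors: int) -> list[int]:
    """Return end offsets in text where pattern matches with at most max_errors edits."""
    m = len(pattern)
    # prev[i] = fewest edits needed to match pattern[:i] against some
    # substring of text ending at the previous position.
    prev = list(range(m + 1))
    ends = []
    for j, ch in enumerate(text, start=1):
        cur = [0]  # a match may start anywhere in the text
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == ch else 1
            cur.append(min(prev[i - 1] + cost,  # match or substitution
                           prev[i] + 1,         # extra character in text
                           cur[i - 1] + 1))     # character missing from text
        if cur[m] <= max_errors:
            ends.append(j)
        prev = cur
    return ends
```

With one allowed error, approx_find("password", "the pasword file", 1) reports a match ending at offset 11, so the misspelling is caught. With two allowed errors the keyword also matches "passport", which shows how a larger error budget increases false positives.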
- Custom Searches –
Because regular expressions have limited expressiveness, custom programs are written for more complex searches. One example is the FILTER_1 tool from New Technologies Inc., which uses a heuristic procedure to find the full names of people in the gathered data. Custom searches also suffer from false positives and false negatives.
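A custom search of this kind might look like the Python sketch below. The heuristic (a known first name followed by another capitalized word) and the KNOWN_FIRST_NAMES list are assumptions made up for illustration; they do not describe how FILTER_1 works.

```python
import re

KNOWN_FIRST_NAMES = {"alice", "bob", "carol", "david"}  # hypothetical sample list

# A capitalized word followed by another capitalized word. The second word
# sits in a lookahead so that overlapping pairs are still examined.
CAPITALIZED_PAIR = re.compile(r"\b([A-Z][a-z]+)(?=\s+([A-Z][a-z]+)\b)")

def find_full_names(text: str) -> list[str]:
    """Return 'First Last' candidates whose first word is a known first name."""
    names = []
    for first, last in CAPITALIZED_PAIR.findall(text):
        if first.lower() in KNOWN_FIRST_NAMES:
            names.append(f"{first} {last}")
    return names
```

For the text "Yesterday Alice Johnson emailed Bob Smith about New York." this returns Alice Johnson and Bob Smith and skips New York. It also reports "Carol Street" as a name in "Meet at Carol Street.", a false positive, and it misses any name whose first part is not in the list, a false negative.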
- Search of Modifications –
This search finds data objects that have been modified since a specified instant in the past. Modification of data objects that do not change frequently, such as operating system utilities, can be detected by comparing their current hash with their expected hash. A library of expected hashes must be built before the search.
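The hash comparison can be sketched as follows in Python. SHA-256 is an assumed choice of hash function, the data objects are held in memory to keep the example self-contained, and the names sha256_of and find_modified are hypothetical.

```python
import hashlib

def sha256_of(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()

def find_modified(current: dict[str, bytes], expected_hashes: dict[str, str]) -> list[str]:
    """Return the paths whose current hash differs from the expected hash."""
    modified = []
    for path, data in current.items():
        expected = expected_hashes.get(path)
        # Objects without a baseline entry cannot be judged either way.
        if expected is not None and sha256_of(data) != expected:
            modified.append(path)
    return sorted(modified)

# The library of expected hashes is built before the search.
baseline = {
    "/bin/ls": sha256_of(b"original ls"),
    "/bin/ps": sha256_of(b"original ps"),
}
current_state = {"/bin/ls": b"original ls", "/bin/ps": b"trojaned ps"}
changed = find_modified(current_state, baseline)
```

In this example changed contains only /bin/ps, the utility whose content no longer matches the baseline.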