Pattern Recognition | Introduction

Last Updated : 04 Apr, 2024

Pattern is everything around in this digital world. A pattern can either be seen physically or it can be observed mathematically by applying algorithms.

Example: The colors on the clothes, speech pattern, etc. In computer science, a pattern is represented using vector feature values.
What is Pattern Recognition?
Pattern recognition is the process of recognizing patterns by using a machine learning algorithm. Pattern recognition can be defined as the classification of data based on knowledge already gained or on statistical information extracted from patterns and/or their representation. One of the important aspects of pattern recognition is its application potential.

Examples: Speech recognition, speaker identification, multimedia document recognition (MDR), automatic medical diagnosis.
In a typical pattern recognition application, the raw data is processed and converted into a form that is amenable for a machine to use. Pattern recognition involves the classification and cluster of patterns.

In classification, an appropriate class label is assigned to a pattern based on an abstraction that is generated using a set of training patterns or domain knowledge. Classification is used in supervised learning.
Clustering generated a partition of the data which helps decision making, the specific decision-making activity of interest to us. Clustering is used in unsupervised learning.

Features may be represented as continuous, discrete, or discrete binary variables. A feature is a function of one or more measurements, computed so that it quantifies some significant characteristics of the object.

Example: consider our face then eyes, ears, nose, etc are features of the face.
A set of features that are taken together, forms the features vector.

Example: In the above example of a face, if all the features (eyes, ears, nose, etc) are taken together then the sequence is a feature vector([eyes, ears, nose]). The feature vector is the sequence of a feature represented as a d-dimensional column vector. In the case of speech, MFCC (Mel-frequency Cepstral Coefficient) is the spectral feature of the speech. The sequence of the first 13 features forms a feature vector.
Pattern recognition possesses the following features:

Pattern recognition system should recognize familiar patterns quickly and accurate
Recognize and classify unfamiliar objects
Accurately recognize shapes and objects from different angles
Identify patterns and objects even when partly hidden
Recognize patterns quickly with ease, and with automaticity.

Training and Learning in Pattern Recognition
Learning is a phenomenon through which a system gets trained and becomes adaptable to give results in an accurate manner. Learning is the most important phase as to how well the system performs on the data provided to the system depends on which algorithms are used on the data. The entire dataset is divided into two categories, one which is used in training the model i.e. Training set, and the other that is used in testing the model after training, i.e. Testing set.

Training set:
The training set is used to build a model. It consists of the set of images that are used to train the system. Training rules and algorithms are used to give relevant information on how to associate input data with output decisions. The system is trained by applying these algorithms to the dataset, all the relevant information is extracted from the data, and results are obtained. Generally, 80% of the data of the dataset is taken for training data.
Testing set:
Testing data is used to test the system. It is the set of data that is used to verify whether the system is producing the correct output after being trained or not. Generally, 20% of the data of the dataset is used for testing. Testing data is used to measure the accuracy of the system. For example, a system that identifies which category a particular flower belongs to is able to identify seven categories of flowers correctly out of ten and the rest of others wrong, then the accuracy is 70 %

Real-time Examples and Explanations:
A pattern is a physical object or an abstract notion. While talking about the classes of animals, a description of an animal would be a pattern. While talking about various types of balls, then a description of a ball is a pattern. In the case balls considered as pattern, the classes could be football, cricket ball, table tennis ball, etc. Given a new pattern, the class of the pattern is to be determined. The choice of attributes and representation of patterns is a very important step in pattern classification. A good representation is one that makes use of discriminating attributes and also reduces the computational burden in pattern classification.

An obvious representation of a pattern will be a vector. Each element of the vector can represent one attribute of the pattern. The first element of the vector will contain the value of the first attribute for the pattern being considered.

Example: While representing spherical objects, (25, 1) may be represented as a spherical object with 25 units of weight and 1 unit diameter. The class label can form a part of the vector. If spherical objects belong to class 1, the vector would be (25, 1, 1), where the first element represents the weight of the object, the second element, the diameter of the object and the third element represents the class of the object.
Advantages:

Pattern recognition solves classification problems
Pattern recognition solves the problem of fake biometric detection.
It is useful for cloth pattern recognition for visually impaired blind people.
It helps in speaker diarization.
We can recognize particular objects from different angles.

Disadvantages:

The syntactic pattern recognition approach is complex to implement and it is a very slow process.
Sometimes to get better accuracy, a larger dataset is required.
It cannot explain why a particular object is recognized.
Example: my face vs my friend’s face.

Applications:

Image processing, segmentation, and analysis
Pattern recognition is used to give human recognition intelligence to machines that are required in image processing.
Computer vision
Pattern recognition is used to extract meaningful features from given image/video samples and is used in computer vision for various applications like biological and biomedical imaging.
Seismic analysis
The pattern recognition approach is used for the discovery, imaging, and interpretation of temporal patterns in seismic array recordings. Statistical pattern recognition is implemented and used in different types of seismic analysis models.
Radar signal classification/analysis
Pattern recognition and signal processing methods are used in various applications of radar signal classifications like AP mine detection and identification.
Speech recognition
The greatest success in speech recognition has been obtained using pattern recognition paradigms. It is used in various algorithms of speech recognition which tries to avoid the problems of using a phoneme level of description and treats larger units such as words as pattern
Fingerprint identification
Fingerprint recognition technology is a dominant technology in the biometric market. A number of recognition methods have been used to perform fingerprint matching out of which pattern recognition approaches are widely used.

Imagine we have a dataset containing information about apples and oranges. The features of each fruit are its color (red or yellow) and its shape (round or oval). We can represent each fruit using a list of strings, e.g. [‘red’, ’round’] for a red, round fruit.

Our goal is to write a function that can predict whether a given fruit is an apple or an orange. To do this, we will use a simple pattern recognition algorithm called k-nearest neighbors (k-NN).

Here is the function in Python:

Python3

\from collections import Counter

def predict(fruit):
    # Count the number of apples and oranges in the training data
    num_apples = sum([1 for f in training_data if f[-1] == 'apple'])
    num_oranges = sum([1 for f in training_data if f[-1] == 'orange'])
    
    # Find the k nearest neighbors of the fruit
    nearest_neighbors = find_nearest_neighbors(fruit, training_data, k=5)
    
    # Count the number of apples and oranges among the nearest neighbors
    num_apples_nn = sum([1 for nn in nearest_neighbors if nn[-1] == 'apple'])
    num_oranges_nn = sum([1 for nn in nearest_neighbors if nn[-1] == 'orange'])
    
    # Predict the label of the fruit based on the majority class among the nearest neighbors
    if num_apples_nn > num_oranges_nn:
        return 'apple'
    else:
        return 'orange'

# Create a training dataset
training_data = [
    ['red', 'round', 'apple'],
    ['red', 'oval', 'apple'],
    ['yellow', 'round', 'orange'],
    ['yellow', 'oval', 'orange']
]

# Create a test fruit
test_fruit = ['red', 'round']

# Predict the label of the test fruit
prediction = predict(test_fruit)
print(prediction)