Bitmap Indexing in DBMS
INTRODUCTION OF BITMAP INDEXING:
Bitmap indexing is a data indexing technique used in database management systems (DBMS) to improve the performance of read-only queries that involve large datasets. It involves creating a bitmap index, which is a data structure that represents the presence or absence of data values in a table or column.
In a bitmap index, each distinct value in a column is assigned a bit vector that represents the presence or absence of that value in each row of the table. The bit vector contains one bit for each row in the table, where a set bit indicates the presence of the corresponding value in the row, and a cleared bit indicates the absence of the value.
Bitmap indexing can be used to improve the performance of queries that involve complex logical operations, such as AND, OR, and NOT. The logical operations are performed on the bit vectors, which can be efficiently combined using bitwise operators.
Bitmap indexing is particularly useful for datasets with low cardinality columns, where each value appears in a large number of rows. It is also useful for read-only queries that involve aggregations, such as counting or summing values, as the bit vectors can be quickly scanned to retrieve the required data.
Bitmap Indexing is a special type of database indexing that uses bitmaps. This technique is used for huge databases, when column is of low cardinality and these columns are most frequently used in the query.
Bitmap indexing is a data structure used in database management systems (DBMS) to efficiently represent and query large datasets with many attributes (columns). Bitmap indexes use a compact binary representation to store the occurrence of each value or combination of values in each attribute, allowing for fast, set-based operations.
The key features of bitmap indexing in DBMS include:
Space efficiency: Bitmap indexes are highly space-efficient because they use a compact binary representation to store the occurrence of each value or combination of values in each attribute. This makes them especially useful for large datasets with many attributes.
Fast query processing: Bitmap indexes can be used to quickly answer complex queries involving multiple attributes using set-based operations such as AND, OR, and NOT. This allows for fast query processing and reduces the need for full table scans.
Low maintenance overhead: Bitmap indexes require relatively low maintenance overhead because they can be updated incrementally as data changes. This makes them especially useful for applications where the data is frequently updated.
Flexibility: Bitmap indexes can be used for both numerical and categorical data types, and can also be used to index text data using techniques such as term frequency-inverse document frequency (TF-IDF).
Reduced I/O overhead: Bitmap indexes can be used to avoid expensive I/O operations by using a compressed representation of the data. This reduces the amount of data that needs to be read from disk, improving query performance.
bitmap indexing is a powerful technique for efficiently querying large datasets with many attributes. Its compact representation and set-based operations make it an ideal choice for data warehousing and other applications where fast query processing is critical.
Applications of bitmap indexing in DBMS:
Fast queries on large datasets: Bitmap indexing is particularly useful for fast queries on large datasets. When querying a table with a bitmap index, the database engine can quickly determine which rows satisfy the query by performing a bitwise operation on the corresponding bitmaps. This can greatly reduce query execution time, especially for queries that involve multiple columns or complex conditions.
Efficient range queries: Bitmap indexing can also be used for efficient range queries on numeric or date columns. The bitmap for a range of values can be constructed by performing bitwise operations on the bitmaps for individual values. This allows the database engine to quickly identify all rows that fall within a given range.
Space efficiency: Bitmap indexing can be more space-efficient than other indexing techniques, especially for columns with a small number of distinct values. For example, a column with only two distinct values (such as gender) can be represented using a single bitmap, while other indexing techniques would require multiple index entries.
Multi-dimensional indexing: Bitmap indexing can be used for multi-dimensional indexing by creating a bitmap for each dimension. The bitmaps can then be combined using bitwise operations to identify rows that satisfy a query with multiple conditions.
Data warehousing applications: Bitmap indexing is commonly used in data warehousing applications where the focus is on fast query performance. In these applications, tables often have large numbers of columns with a relatively small number of distinct values, making bitmap indexing a particularly effective technique.
Need of Bitmap Indexing –
The need of Bitmap Indexing will be clear through the below given example :
For example, Let us say that a company holds an employee table with entries like EmpNo, EmpName, Job, New_Emp and salary. Let us assume that the employees are hired once in the year, therefore the table will be updated very less and will remain static most of the time. But the columns will be frequently used in queries to retrieve data like : No. of female employees in the company etc. In this case we need a file organization method which should be fast enough to give quick results. But any of the traditional file organization method is not that fast, therefore we switch to a better method of storing and retrieving data known as Bitmap Indexing.
How Bitmap Indexing is done –
In the above example of table employee, we can see that the column New_Emp has only two values Yes and No based upon the fact that the employee is new to the company or not. Similarly let us assume that the Job of the Employees is divided into 4 categories only i.e Manager, Analyst, Clerk and Salesman. Such columns are called columns with low cardinality. Even though these columns have less unique values, they can be queried very often.
Bit: Bit is a basic unit of information used in computing that can have only one of two values either 0 or 1 . The two values of a binary digit can also be interpreted as logical values true/false or yes/no.
In Bitmap Indexing these bits are used to represent the unique values in those low cardinality columns. This technique of storing the low cardinality rows in form of bits are called bitmap indices.
Continuing the Employee example, Given below is the Employee table :
If New_Emp is the data to be indexed, the content of the bitmap index is shown as four( As we have four rows in the above table) columns under the heading Bitmap Indices. Here Bitmap Index “Yes” has value 1001 because row 1 and row four has value “Yes” in column New_Emp.
In this case there are two such bitmaps, one for “New_Emp” Yes and one for “New_Emp” NO. It is easy to see that each bit in bitmap indices shows that whether a particular row refer to a person who is New to the company or not.
The above scenario is the simplest form of Bitmap Indexing. Most columns will have more distinct values. For example the column Job here will have only 4 unique values (As mentioned earlier). Variations on the bitmap index can effectively index this data as well. For Job column the bitmap Indexing is shown below:
Now Suppose, If we want to find out the details for the Employee who is not new in the company and is a sales person then we will run the query:
SELECT * FROM Employee WHERE New_Emp = "No" and Job = "Salesperson";
For this query the DBMS will search the bitmap index of both the columns and perform logical AND operation on those bits and find out the actual result:
Here the result 0100 represents that the second row has to be retrieved as a result.
Bitmap Indexing in SQL – The syntax for creating bitmap index in sql is given below:
CREATE BITMAP INDEX Index_Name ON Table_Name (Column_Name);
For the above example of employee table, the bitmap index on column New_Emp will be created as follows:
CREATE BITMAP INDEX index_New_Emp ON Employee (New_Emp);
- Efficiency in terms of insertion deletion and updation.
- Faster retrieval of records
- Only suitable for large tables
- Bitmap Indexing is time consuming
Please Login to comment...