Database Administrator Interview Questions

Last Updated : 01 Feb, 2024

Explore these carefully collected Database Administrator (DBA) interview questions to equip yourself for a successful career move in the realm of database management. Familiarize yourself with the types of questions often encountered in technical assessments and problem-solving scenarios. Enhance your preparation for interviews by diving into these questions. A Database Administrator is a crucial role responsible for managing and organizing data systems. As an expert in database technology, a DBA ensures the efficiency, security, and reliability of data storage. Discover exciting career prospects in the dynamic field of database administration and gear up for a rewarding journey in this important role.

Securing the role you want as a Database Administrator isn’t only about technical know-how – it’s also about understanding how interviews in this fieldwork. This guide offers a detailed exploration of what to expect in Database Administrator interviews, giving you insights into the kinds of questions you might encounter.

Q1. Write a simple SQL query to retrieve all columns from a table named “employees.”

To retrieve all columns from a table named “employees,” you can use the following SQL query:

SELECT * FROM employees;

Q2. Explain the difference between INNER JOIN and LEFT JOIN. Provide an example for each.

INNER JOIN

The INNER JOIN keyword selects all rows from both tables as long as the condition is satisfied. This keyword will create the result set by combining all rows from both the tables where the condition satisfies i.e. the value of the common field will be the same.

Syntax:

SELECT table1.column1,table1.column2,table2.column1,....
FROM table1 
INNER JOIN table2
ON table1.matching_column = table2.matching_column;

table1: First table.
table2: Second table
matching_column: Column common to both the tables.

Note: We can also write JOIN instead of INNER JOIN. JOIN is the same as INNER JOIN.

LEFT JOIN

This join returns all the rows of the table on the left side of the join and matches rows for the table on the right side of the join. For the rows for which there is no matching row on the right side, the result-set will contain null. LEFT JOIN is also known as LEFT OUTER JOIN.

Syntax:

SELECT table1.column1,table1.column2,table2.column1,....
FROM table1 
LEFT JOIN table2
ON table1.matching_column = table2.matching_column;

table1: First table.
table2: Second table
matching_column: Column common to both the tables.

Note: We can also use LEFT OUTER JOIN instead of LEFT JOIN, both are the same.

Q3. What is a subquery, and how is it different from a JOIN? Provide an example of using a subquery.
A subquery, as the name itself suggests, is a query within another query. It acts like a mini-query that helps to solve a specific part of the main query, and its results are used by the outer query to filter or manipulate the final data. Subqueries are nested queries that filter data based on the results of another query, while joins combine data from multiple tables based on a shared relationship. Subqueries are generally easier to understand but less efficient for large datasets, while joins are more efficient but can be a little complex.

For ex:

Find all customers who placed orders exceeding the average order amount.

Subquery for the above question:

SELECT * FROM customers

WHERE customer_id IN (

SELECT customer_id

FROM orders

WHERE amount > (SELECT AVG(amount) FROM orders)

);

Q4.Write a SQL query to calculate the average salary of employees in a specific department.

SELECT AVG(salary) AS average_salary

FROM employees

WHERE department_id = your_department_id;

Q5. Write a SQL query to find the second-highest salary in an “employees” table.

select * from employee where salary=(select Max(salary) from employee);

Q6. SQL HAVING Clause with Examples

The HAVING clause was introduced in SQL to allow the filtering of query results based on aggregate functions and groupings, which cannot be achieved using the WHERE clause that is used to filter individual rows.

Q7. Stored procedures in SQL
Stored procedures are prepared SQL code that you save so you can reuse it over and over again. So if you have an SQL query that you write over and over again, save it as a stored procedure and call it to run it.

Q8 What is Data Normalization and Why Is It Important?

Normalization is the process of reducing data redundancy in a table and improving data integrity. Then why do you need it? If there is no normalization in SQL, there will be many problems, such as:

Insert Anomaly: This happens when we cannot insert data into the table without another.
Update Anomaly: This is due to data inconsistency caused by data redundancy and data update.
Delete exception: Occurs when some attributes are lost due to the deletion of other attributes

The main use of normalization is to utilize in order to remove anomalies that are caused because of the transitive dependency. Normalization is to minimize the redundancy and remove Insert, Update and Delete Anomaly. It divides larger tables into smaller tables and links them using relationships.

Need for normalization :

It eliminates redundant data.
It reduces the chances of data error.
The normalization is important because it allows the database to take up less disk
space.
It also helps in increasing the performance.
It improves the data integrity and consistency.

Q9. Third Normal Form (3NF)

A relation is in the third normal form, if there is no transitive dependency for non-prime attributes as well as it is in the second normal form. A relation is in 3NF if at least one of the following conditions holds in every non-trivial function dependency X –> Y.

X is a super key.
Y is a prime attribute (each element of Y is part of some candidate key).

In other words

A relation that is in First and Second Normal Form and in which no non-primary-key attribute is transitively dependent on the primary key, then it is in Third Normal Form (3NF).

Q10. Denormalization in Databases
Denormalization is a database optimization technique in which we add redundant data to one or more tables. This can help us avoid costly joins in a relational database. Note that denormalization does not mean ‘reversing normalization’ or ‘not to normalize’. It is an optimization technique that is applied after normalization.

Q11.Composite Key in SQL
A composite key is made by the combination of two or more columns in a table that can be used to uniquely identify each row in the table when the columns are combined uniqueness of a row is guaranteed, but when it is taken individually it does not guarantee uniqueness, or it can also be understood as a primary key made by the combination of two or more attributes to uniquely identify every row in a table.

Note:

A composite key can also be made by the combination of more than one candidate key.
A composite key cannot be null.

Q12. How to Draw Entity Relationship Diagrams
In the world of database design, entity relationship diagrams serve as valuable tools for designing complex systems and their relationships. In this article will go through the step-by-step process of designing an ER diagram, and defining how entities, attributes, and relationships are defined. Entity relationship diagrams are extremely important in database design and require a clear structure of all data.

Entity Relationship Diagram for BANK

We are following the below steps to design an ER diagram:

Defining Entities
Adding Attributes
Establishing Relationships
Specify Cardinality
Identify Primary Keys
Draw the ER Diagram
Benefits of ER Diagram

Q13. Database Normalization vs Database Optimization

Factor	Database Normalization	Database Optimization
Process	Database Normalization involves breaking up data into smaller, related tables and creating relationships between them.	Database Optimization involves making changes to the physical structure of the database, such as adding indexes, creating partitions, and reorganizing tables and other system parameters.
Security	Database normalization does not affect database security.	Database optimization can improve database security.
Data Access	Database normalization does not affect data access.	Database optimization can improve data access.
Output	Database Normalization results in a more organized and efficient database structure.	Database Optimization results in a faster and more efficient database.

Q14. SQL*Plus Command Reference
SQLPlus is a command-line tool for Oracle Database that allows users to interact with the database using SQL and PL/SQL commands. When passing SQL commands as arguments to SQLPlus in a Linux environment, there are limits on the maximum length of the command line. In this article, we will learn about the Maximum length of command line arguments that can be passed to SQL*Plus.

Command Line Arguments in SQL*Plus

Command line arguments in SQL*Plus serve as parameters that control the behavior of the tool. These arguments can include script names, connection details, and other options. Users frequently utilize these arguments to streamline their workflows and automate tasks.

Maximum Length of Command Line Argument

The maximum length of a command line argument is determined by the operating system. In Linux, the maximum length is typically 131,072 bytes (or 128 KB). This includes the length of the SQL*Plus command, SQL query, and any additional parameters.

Syntax:

sqlplus [username]/[password]@[database] @script.sql

sqlplus: Command to start SQL*Plus.
[username]/[password]@[database]: Connection details to log in to the Oracle Database.
@script.sql: The SQL script file to be executed.

Q15. Difference between MS SQL Server and PostgreSQL

MS SQL SERVER	POSTGRESQL
Developed by Microsoft Corporation and initially released on April 24, 1989	Developed by PostgreSQL Global Development Group on 1989.
MS SQL server is written in C++ language.	PostgreSQL is written in C language.
It is a Microsoft relational DBMS.	It is a widely used open-source RDBMS.
The primary database model for MS SQL Server is Relational DBMS.	The primary database model for PostgreSQL is also Relational DBMS.
It also has two Secondary database models – Document Store and Graph DBMS.	It has a Document store as a Secondary database model.

Q16. What is the Data Replication?
Data Replication is the process of storing data in more than one site or node. It is useful in improving the availability of data. It is simply copying data from a database from one server to another server so that all the users can share the same data without any inconsistency.
Types of Data Replication –

Transactional Replication: In Transactional replication users receive full initial copies of the database and then receive updates as data changes. Data is copied in real-time from the publisher to the receiving database(subscriber) in the same order as they occur with the publisher therefore in this type of replication, transactional consistency is guaranteed.
Snapshot Replication: Snapshot replication distributes data exactly as it appears at a specific moment in time and the does not monitor for updates to the data. The entire snapshot is generated and sent to Users. Snapshot replication is generally used when data changes are infrequent.
Merge Replication: Data from two or more databases is combined into a single database. Merge replication is the most complex type of replication because it allows both publisher and subscriber to independently make changes to the database.

Q17. What is SQL injection?

SQL injection is a technique used to extract user data by injecting web page inputs as statements through SQL commands. Basically, malicious users can use these instructions to manipulate the application’s web server.

SQL injection is a code injection technique that can compromise your database.
SQL injection is one of the most common web hacking techniques.
SQL injection is the injection of malicious code into SQL statements via web page input.

Q18. What is Data Masking?
Data masking is a very important concept to keep data safe from any breaches. Especially, for big organizations that contain heaps of sensitive data that can be easily compromised. Details like credit card information, phone numbers, and house addresses are highly vulnerable information that must be protected. To understand data masking better we first need to know what computer networks are.

Q19. Control Methods of Database Security
Database Security means keeping sensitive information safe and preventing the loss of data. The security of the database is controlled by Database Administrator (DBA).

The following are the main control measures used to provide the security of data in databases:

Authentication

Access control

Inference control

Flow control

Database Security applying Statistical Method

Encryption

Q20. What is Data Encryption?
Data Encryption is a method of preserving data confidentiality by transforming it into ciphertext, which can only be decoded using a unique decryption key produced at the time of the encryption or prior to it.

Data encryption converts data into a different form (code) that can only be accessed by people who have a secret key (formally known as a decryption key) or password. Data that has not been encrypted is referred to as plaintext, and data that has been encrypted is referred to as ciphertext. Encryption is one of the most widely used and successful data protection technologies in today’s corporate world.

Q21. What are the benefits of implementing two-factor authentication in a database environment?

Implementing two-factor authentication (2FA) in a database context adds a strong layer of protection, functioning as a virtual bouncer for sensitive data. Consider it a double lock for your most critical information. Here are a few human-friendly benefits:

Double Lock, Double Security: Just as opening a high-security vault takes both a key and a fingerprint, two-factor authentication requires two distinct forms of identity. This implies that even if someone learns your password, they’ll still need the second piece of the puzzle.
Guarding Against Stolen Passwords: Passwords can leak or be stolen. Even if your password is hacked, 2FA adds an extra layer of security (such as a temporary code texted to your phone) to prevent an intruder from gaining access.
Remote Access Protection: In a world where remote work is becoming more frequent, 2FA serves as a virtual bodyguard for your data, making it more difficult for unauthorised users to penetrate the system, even if they are not physically there.
Preventing Unauthorized Access: 2FA provides an additional degree of security against brute force attacks and other malicious efforts to acquire unauthorised access. It’s like having a gatekeeper who requires both a key and a secret handshake.

Q22. What is DBCC?
Database Console Commands (DBCC) is a set of Transact-SQL commands for database management and maintenance in Microsoft SQL Server.

Here are a few key uses of DBCC commands:

Checking Database Consistency: DBCC commands, such as DBCC CHECKDB, are used to verify the logical and physical integrity of a database. They can assist in identifying and resolving corruption, missing indexes, and other structural issues.
Performance Tuning: Some DBCC commands provide information about database performance and can be used to tune the performance of SQL Server. For example, DBCC SQLPERF provides information about various performance-related counters.
Statistics Updates: DBCC procedures like as DBCC UPDATEUSAGE and DBCC SHOW_STATISTICS are used to update and show statistics about database objects. Keeping statistics up to current is critical for the query optimizer to build effective execution plans.

Q23. SQL Queries on Clustered and Non-Clustered Indexes

Indexing is a procedure that returns your requested data faster from the defined table. Without indexing, the SQL server has to scan the whole table for your data. By indexing, the SQL server will do the exact same thing you do when searching for content in a book by checking the index page. In the same way, a table’s index allows us to locate the exact data without scanning the whole table. There are two types of indexing in SQL.

Clustered index
Non-clustered index

Clustered Index

A clustered index is the type of indexing that establishes a physical sorting order of rows.

Suppose you have a table Student_info which contains ROLL_NO as a primary key, then the clustered index which is self-created on that primary key will sort the Student_info table as per ROLL_NO. A clustered index is like a Dictionary; in the dictionary, the sorting order is alphabetical and there is no separate index page.

Examples:

CREATE TABLE Student_info
(
ROLL_NO int(10) primary key,
NAME varchar(20),
DEPARTMENT varchar(20),
);
INSERT INTO Student_info values(1410110405, 'H Agarwal', 'CSE');
INSERT INTO Student_info values(1410110404, 'S Samadder', 'CSE');
INSERT INTO Student_info values(1410110403, 'MD Irfan', 'CSE'); 

SELECT * FROM Student_info;

Output:

ROLL_NO	NAME	DEPARTMENT
1410110403	MD Irfan	CSE
1410110404	S Samadder	CSE
1410110405	H Agarwal	CSE

Q24. Describe what taking a database offline means.

In the context of database administration, taking a database offline means temporarily removing it from an active state, rendering it unreachable to users and applications. When a database is taken offline, the database engine disallows new connections and terminates current ones. This action can be used for a variety of administrative and maintenance duties. Here’s an overview of what putting a database offline entails:

No User Access: When a database is taken offline, users and programs are no longer able to connect or interact with it. Attempts to access the database while in this offline state will result in an error.
Administrative tasks: Taking a database offline is frequently used to undertake administrative procedures that require exclusive access to the database. This might involve things like running a backup, installing updates or fixes, or migrating database files.
Maintenance Operations: Maintenance procedures, such as rebuilding indexes, updating statistics, or making substantial changes to the database structure, may necessitate taking the database offline to guarantee data consistency and prevent any conflicts.

Q25. What is Summary.txt and Detail.txt?

File Name	Purpose	Content
Summary.txt	Overview of high-level summary	Condensed summary of events, actions, or data
Detail.txt	Detailed information or granular breakdown	In-depth log or detailed information about specific events/data