Statistical Database Security
  • Difficulty Level : Medium
  • Last Updated : 24 Aug, 2020

Prerequisite – Control Methods of Database Security

Certain databases contain confidential data about individuals, such as Aadhaar numbers or PAN card numbers. Such a database must not be accessible to attackers, so it has to be protected from unauthorized access.

A database that holds details of a very large population is called a statistical database, and it is used mainly to produce statistics on various populations. Users are allowed to retrieve only certain statistical information about a population, such as the average for a particular state or district, along with sums, counts, maximums, minimums, and standard deviations.

It is the responsibility of ethical hackers to monitor statistical database security. Statistical users are not permitted to access individual data, such as the income, phone number, or debit card number of a specific person, because statistical database security techniques prohibit the retrieval of individual data. It is also the responsibility of the DBMS to keep data about individuals confidential.

Statistical Queries :
Queries that use only aggregate functions such as COUNT, SUM, MIN, MAX, AVERAGE, and STANDARD DEVIATION are called statistical queries. They are mainly used to compute population statistics, and by companies and industries to report on their employee databases.



Example –
Consider the following examples of statistical queries, where EMP_SALARY is a confidential table that contains the income of each employee of a company.

Query-1:

SELECT COUNT(*) 
FROM EMP_SALARY
WHERE Emp_department = '3';

Query-2:

SELECT AVG(income) 
FROM EMP_SALARY 
WHERE Emp_id = '2';

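Here, an attacker can manipulate the WHERE condition. If the attacker knows the id or name of a particular employee, the condition can be narrowed to that one person, and the aggregate then exposes that employee's income or other confidential data. Query-2 selects by employee id, so if each id identifies one employee, the average is taken over a single record and equals that employee's income.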
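The leak described above can be reproduced with a small sketch. It uses Python's built-in sqlite3 module and an in-memory table; the three rows are made-up sample data, and the column names are written with underscores (Emp_id, Emp_department) so that they are valid SQL identifiers. The helper run_statistic is a hypothetical name introduced only for this illustration.

```python
import sqlite3

# In-memory table with invented sample rows (not real data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMP_SALARY (Emp_id TEXT, Emp_department TEXT, income REAL)"
)
conn.executemany(
    "INSERT INTO EMP_SALARY VALUES (?, ?, ?)",
    [("1", "3", 52000.0), ("2", "3", 61000.0), ("3", "4", 47000.0)],
)


def run_statistic(sql, params=()):
    """Run an aggregate query and return its single result value."""
    return conn.execute(sql, params).fetchone()[0]


# The query is "statistical" in form, but its population is one employee.
population = run_statistic(
    "SELECT COUNT(*) FROM EMP_SALARY WHERE Emp_id = ?", ("2",)
)
leaked = run_statistic(
    "SELECT AVG(income) FROM EMP_SALARY WHERE Emp_id = ?", ("2",)
)

print(population)  # size of the population the WHERE clause selected
print(leaked)      # average over that population
```

With this sample data the population size is 1, so the reported average is simply the income stored for employee '2'. The aggregate function offers no protection once the selection condition isolates a single record.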

The possibility of accessing individual information from statistical queries is reduced by using the following measures –

  1. Partitioning of Database – The records of the database must not be stored in bulk as a single collection. They must be divided into groups of some minimum size according to the confidentiality of the records.

    The advantage of partitioning is that queries can refer to any complete group or set of groups, but cannot access subsets of records within a group. So, an attacker can reach at most one or two groups, which are less private.

  2. Permit no statistical queries whenever the number of tuples in the population specified by the selection condition falls below some threshold.
  3. Prohibit sequences of queries that refer repeatedly to the same population of tuples.
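Measures 2 and 3 can be sketched as a thin guard in front of the database. This is only an illustration under stated assumptions, not how any particular DBMS implements them: the class StatisticalGuard, the exception QueryRefused, the threshold of 3, and the sample rows are all hypothetical, and measure 3 is read in its strictest form, where a population may be queried only once.

```python
import sqlite3

MIN_QUERY_SET_SIZE = 3  # assumed threshold; the real value is a policy decision


class QueryRefused(Exception):
    """Raised when a statistical query is blocked by one of the checks."""


class StatisticalGuard:
    """Hypothetical wrapper that vets a query's population before answering."""

    def __init__(self, conn, min_size=MIN_QUERY_SET_SIZE):
        self.conn = conn
        self.min_size = min_size
        self.seen_populations = set()  # populations already answered

    def average_income(self, where_sql, params=()):
        # where_sql is assumed to come from trusted code in this sketch;
        # user-supplied values are passed separately through params.
        rows = self.conn.execute(
            f"SELECT Emp_id FROM EMP_SALARY WHERE {where_sql}", params
        ).fetchall()
        population = frozenset(row[0] for row in rows)

        # Measure 2: refuse populations smaller than the threshold.
        if len(population) < self.min_size:
            raise QueryRefused("population below threshold")
        # Measure 3 (strict reading): refuse a repeated population.
        if population in self.seen_populations:
            raise QueryRefused("population already queried")

        self.seen_populations.add(population)
        return self.conn.execute(
            f"SELECT AVG(income) FROM EMP_SALARY WHERE {where_sql}", params
        ).fetchone()[0]


# Invented sample rows: three employees in department '3', one in '4'.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMP_SALARY (Emp_id TEXT, Emp_department TEXT, income REAL)"
)
conn.executemany(
    "INSERT INTO EMP_SALARY VALUES (?, ?, ?)",
    [("1", "3", 52000.0), ("2", "3", 61000.0),
     ("3", "4", 47000.0), ("4", "3", 55000.0)],
)

guard = StatisticalGuard(conn)
dept_avg = guard.average_income("Emp_department = ?", ("3",))
```

In this sketch the department-wide average is answered because its population has three tuples. A second request for the same population, or a request narrowed to a single Emp_id, raises QueryRefused. A real system would also have to consider overlapping populations, which this simple check does not detect.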
