Statistical Database Security
Prerequisite – Control Methods of Database Security

Certain databases contain confidential or secret data about the individuals of a country, such as Aadhaar numbers and PAN card numbers. Such a database must not be accessible to attackers, so it has to be protected from unauthorized access. A database that holds details of a huge population and is used mainly to produce statistics on various populations is called a statistical database. Users are allowed to retrieve only certain statistical information about the population, such as the average for a particular state or district, along with sums, counts, maximums, minimums, and standard deviations.

It is the responsibility of ethical hackers to monitor statistical database security. Statistical users are not permitted to access individual data, such as the income, phone number, or debit card number of a specific person, because statistical database security techniques prohibit the retrieval of individual data. It is also the responsibility of the DBMS to keep data about individuals confidential.

Statistical Queries: Queries that allow only aggregate functions such as COUNT, SUM, MIN, MAX, AVERAGE, and STANDARD DEVIATION are called statistical queries. They are mainly used to obtain population statistics, and by companies and industries to maintain their employee databases.

Example – Consider the following statistical queries, where EMP_SALARY is a confidential database that contains the income of each employee of a company.
Query-1:
WHERE Emp-department = '3';
WHERE Emp-id = '2';
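To make the two selection conditions above concrete, here is a minimal sketch in Python using the standard `sqlite3` module. The SELECT clauses of the queries are not shown in the text, so the aggregates used here (`COUNT(*)` and `AVG(income)`) are assumptions, as are the column names `emp_id`, `emp_department`, and `income` and all of the sample rows.

```python
import sqlite3

# Hypothetical schema and rows; column names and values are assumptions
# made for this sketch, not taken from a real EMP_SALARY database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMP_SALARY (emp_id TEXT, emp_department TEXT, income REAL)"
)
conn.executemany(
    "INSERT INTO EMP_SALARY VALUES (?, ?, ?)",
    [("1", "3", 40000.0), ("2", "3", 55000.0), ("3", "4", 70000.0)],
)


def run_aggregate(sql, params=()):
    """Run a single-value aggregate query and return that value."""
    return conn.execute(sql, params).fetchone()[0]


# A condition that selects a whole department yields a genuine statistic.
dept_count = run_aggregate(
    "SELECT COUNT(*) FROM EMP_SALARY WHERE emp_department = ?", ("3",)
)

# A condition that selects exactly one employee turns the "average"
# into that employee's own income.
single_avg = run_aggregate(
    "SELECT AVG(income) FROM EMP_SALARY WHERE emp_id = ?", ("2",)
)
```

With this sample data, the second query averages over a population of one tuple, so its result is simply the income stored for the employee with id '2'.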
In queries like these, an attacker can manipulate the WHERE condition. Anyone who knows the id or name of a particular employee may be able to narrow the condition to that one person and read their income or other confidential data. The possibility of accessing individual information through statistical queries is reduced by the following measures –
- Partitioning of Database – The records of the database must not be stored in bulk as a single group. They must be divided into groups of some minimum size according to the confidentiality of the records. The advantage of partitioning is that queries can refer to any complete group or set of groups, but cannot access subsets of records within a group. So an attacker can access at most one or two groups, which are less private.
- No statistical queries are permitted whenever the number of tuples in the population specified by the selection condition falls below some threshold.
- Prohibit sequences of queries that refer repeatedly to the same population of tuples.
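The last two measures can be sketched as a small gatekeeper placed in front of the table. This is an illustrative sketch, not a production design: the class `StatisticalGateway`, the threshold `MIN_QUERY_SET_SIZE`, the limit `MAX_QUERIES_PER_POPULATION`, and the column names are all hypothetical, and capping the number of answers per identical population is only one simple way to interpret "prohibit repeated queries".

```python
import sqlite3

MIN_QUERY_SET_SIZE = 3          # assumed threshold; a real system would tune it
MAX_QUERIES_PER_POPULATION = 2  # assumed cap on queries over one population
ALLOWED_AGGREGATES = {"COUNT", "SUM", "MIN", "MAX", "AVG"}


class StatisticalGateway:
    """Hypothetical wrapper that screens aggregate queries on EMP_SALARY."""

    def __init__(self, conn):
        self.conn = conn
        self.population_hits = {}  # frozenset of emp_ids -> queries answered

    def query(self, aggregate, where_sql, params=()):
        if aggregate not in ALLOWED_AGGREGATES:
            raise ValueError("only aggregate functions are allowed")
        # where_sql is assumed to come from trusted code, not from end users.
        rows = self.conn.execute(
            f"SELECT emp_id FROM EMP_SALARY WHERE {where_sql}", params
        ).fetchall()
        population = frozenset(row[0] for row in rows)
        # Measure: refuse populations smaller than the threshold.
        if len(population) < MIN_QUERY_SET_SIZE:
            raise PermissionError("population below threshold")
        # Measure: refuse repeated queries over the same set of tuples.
        hits = self.population_hits.get(population, 0)
        if hits >= MAX_QUERIES_PER_POPULATION:
            raise PermissionError("population queried too often")
        self.population_hits[population] = hits + 1
        return self.conn.execute(
            f"SELECT {aggregate}(income) FROM EMP_SALARY WHERE {where_sql}",
            params,
        ).fetchone()[0]


# Hypothetical sample data: three employees in department '3', one in '4'.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMP_SALARY (emp_id TEXT, emp_department TEXT, income REAL)"
)
conn.executemany(
    "INSERT INTO EMP_SALARY VALUES (?, ?, ?)",
    [
        ("1", "3", 40000.0),
        ("2", "3", 55000.0),
        ("3", "3", 61000.0),
        ("4", "4", 70000.0),
    ],
)
gateway = StatisticalGateway(conn)
```

With this sample data, `gateway.query("AVG", "emp_department = ?", ("3",))` is answered because the population holds three tuples, while `gateway.query("AVG", "emp_id = ?", ("2",))` raises `PermissionError` because it targets a single employee. Tracking the population as a set of ids, and not by the query text, means that rephrasing the WHERE clause does not reset the counter.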