Database sensitive information detection method and system
Technical field
The present invention relates to the field of information security technology, and more particularly to a method and system for detecting sensitive information in a database.
Background art
Leaks of sensitive information occur frequently, and data security is receiving more and more attention. At the same time, database systems keep growing, and large data volumes create new problems for database security management. A database that holds only a small amount of data is easy to manage, but when there are dozens of databases and thousands of tables, protection and auditing are impossible if nobody knows where the important information is stored. A method is therefore needed that can quickly and comprehensively locate important information in a database, so that this information can receive focused protection and auditing.
Common database security scanning software mainly detects configuration risks of the database system and security vulnerabilities in the database software itself. It analyzes only database system information and does not detect where sensitive information is located.
Summary of the invention
The main object of the present invention is to overcome the deficiencies of the prior art by providing a database detection method and system that can find where sensitive information is located. To solve the above technical problem, the solution of the present invention is as follows:
A database sensitive information detection method is provided, which specifically comprises the following steps:
(1) scanning the system views of the database to obtain all user tables;
(2) extracting a portion of the data from each field of each user table as a sample;
(3) analyzing and matching the samples to judge whether they are sensitive information.
Step (1) specifically comprises the following steps:
Step A: connect to the database;
Step B: use a SELECT statement to obtain all table names of the database from its system views, and remove the system tables so that only user tables remain;
Step C: return the list of user tables obtained in step B.
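Steps A to C can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation of the invention: SQLite (through Python's standard `sqlite3` module) stands in for the database, its `sqlite_master` catalog plays the role of the system view, and tables whose names start with `sqlite_` are treated as system tables. The function name `list_user_tables` is hypothetical.

```python
import sqlite3
from typing import List


def list_user_tables(conn: sqlite3.Connection) -> List[str]:
    """Steps B and C: read table names from the catalog and drop system tables."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    # Assumption: SQLite reserves the 'sqlite_' prefix for its internal tables.
    return [name for (name,) in rows if not name.startswith("sqlite_")]


# Step A: connect to the database (an in-memory database for illustration).
conn = sqlite3.connect(":memory:")
# AUTOINCREMENT makes SQLite create the internal table sqlite_sequence,
# which gives the filter above something to remove.
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY AUTOINCREMENT, phone TEXT)")
conn.execute("CREATE TABLE orders (id INTEGER, note TEXT)")

user_tables = list_user_tables(conn)
print(user_tables)
```

On another database product the catalog query and the rule for recognizing system tables would differ; only the overall shape (query the catalog, filter, return a list) carries over.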
Step (2) specifically comprises the following steps:
Step D: take one user table name from the user table list returned in step C;
Step E: use a SELECT statement with a paging query to obtain partial data for all fields of the user table selected in step D; partial data means the first N records of the table (for example, if a table has 1000 records, retrieving all of them would put load on the database, so a paging query takes only the first 20 or 30 records for judgment);
Step F: return the partial data of all fields obtained in step E as sample data.
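The sampling in steps D to F might look like the sketch below. It again assumes SQLite as a stand-in, so the column list comes from the SQLite-specific `PRAGMA table_info` and the paging query is a `LIMIT` clause; the function name `sample_table` and the default of 20 rows are illustrative choices, not values fixed by the method.

```python
import sqlite3
from typing import Dict, List


def sample_table(conn: sqlite3.Connection, table: str, n: int = 20) -> Dict[str, List]:
    """Steps E and F: return the first n values of every column of one table."""
    # PRAGMA table_info is SQLite-specific; the column name is at index 1.
    columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
    samples: Dict[str, List] = {}
    for column in columns:
        # LIMIT keeps the query cheap: only the first n rows are read.
        cursor = conn.execute(f'SELECT "{column}" FROM "{table}" LIMIT ?', (n,))
        samples[column] = [value for (value,) in cursor]
    return samples


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, phone TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(f"user{i}", f"1380000{i:04d}") for i in range(1000)],
)

# Step D would pick "customers" from the user table list; it is hard-coded here.
samples = sample_table(conn, "customers", n=20)
print({column: len(values) for column, values in samples.items()})
```

The table has 1000 rows, but each field contributes only 20 values to the sample, which is the point of the paging query.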
Step (3) specifically comprises the following steps:
Step G: take the sample data of one field returned in step F;
Step H: use regular expression matching to judge whether the sample data selected in step G is sensitive information, as follows: if all of the sample data matches sensitive information in the sensitive information feature library, the field is considered a sensitive data field and the judgment result is "yes"; if the sample data matches sensitive information in the feature library at a ratio of not less than 80%, the judgment result is "suspected"; if none of the sample data matches, or it matches at a ratio of less than 20%, the judgment result is "no";
Step I: repeat steps D, E, F, G and H until all user tables returned in step C have been judged, then return the judgment results obtained in step H and display a scan report containing them; when a judgment result is "yes" or "suspected", the scan report also includes a sensitive information field inventory, that is, the part of the sample data that matches the sensitive information feature library. For example, if field B of table A contains sensitive information (a mobile phone number), the returned entry is table name: A, field: B, content: 138XXXXXXXX, so that the user can see directly what kind of sensitive information the field contains.
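The judgment rule of step H can be written down as a small function. The sketch below makes three assumptions that are not in the text: the pattern `PHONE` is a simplified illustration of a mobile number expression, an empty sample is treated as "no", and a match ratio between 20% and 80% (which the rule above does not cover) is reported as "undetermined". The name `classify_field` is hypothetical.

```python
import re
from typing import Iterable, Pattern

# Assumed, simplified pattern: 11 digits starting with 1.
PHONE = re.compile(r"1\d{10}")


def classify_field(samples: Iterable, pattern: Pattern[str]) -> str:
    """Step H: map the share of matching sample values to a judgment result."""
    values = [str(v) for v in samples if v is not None]
    if not values:
        return "no"  # assumption: nothing to match means not sensitive
    matched = sum(1 for v in values if pattern.fullmatch(v))
    if matched == len(values):
        return "yes"
    ratio = matched / len(values)
    if ratio >= 0.8:
        return "suspected"
    if ratio < 0.2:
        return "no"
    # Assumption: the text defines no result for ratios from 20% up to 80%.
    return "undetermined"


all_phones = ["13812345678"] * 10
mostly_phones = ["13812345678"] * 9 + ["n/a"]
no_phones = ["hello"] * 10
half_phones = ["13812345678"] * 5 + ["n/a"] * 5

for sample in (all_phones, mostly_phones, no_phones, half_phones):
    print(classify_field(sample, PHONE))
```

`fullmatch` is used so that a value only counts when the whole string fits the pattern; a longer string that merely contains 11 digits is not counted.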
In the present invention, the sensitive information feature library in step H is a set of regular expressions used to identify sensitive information (for example, one regular expression identifies mobile phone numbers and another identifies bank card numbers). Sensitive information means data that needs to be protected and audited, including mobile phone numbers, bank card numbers, ID card numbers and email addresses.
In the present invention, user-defined regular expressions can be added to the sensitive information feature library for use in matching (for example, if users consider employee numbers to be sensitive, they can define a regular expression that matches employee numbers so that fields containing them are listed). Regular expressions are a publicly known, general-purpose string matching method.
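A feature library of this kind can be modeled as a mapping from a feature name to a compiled regular expression. In the sketch below every pattern is a simplified assumption chosen for illustration (real formats are stricter), the `EMP` plus six digits employee number format is invented, and the helper names `add_feature` and `matching_features` are hypothetical.

```python
import re
from typing import Dict, List, Pattern

# Assumed, simplified patterns; a production library would be more precise.
feature_library: Dict[str, Pattern[str]] = {
    "mobile_number": re.compile(r"1\d{10}"),
    "id_card_number": re.compile(r"\d{17}[\dXx]"),
    "bank_card_number": re.compile(r"\d{16,19}"),
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
}


def add_feature(library: Dict[str, Pattern[str]], name: str, regex: str) -> None:
    """Register a user-defined kind of sensitive information."""
    library[name] = re.compile(regex)


def matching_features(library: Dict[str, Pattern[str]], value: str) -> List[str]:
    """Return the names of all features whose pattern matches the whole value."""
    return [name for name, pattern in library.items() if pattern.fullmatch(value)]


# A user who treats employee numbers as sensitive adds a custom expression.
add_feature(feature_library, "employee_number", r"EMP\d{6}")

print(matching_features(feature_library, "13812345678"))
print(matching_features(feature_library, "alice@example.com"))
print(matching_features(feature_library, "EMP001234"))
```

Because the patterns are independent, one value can match more than one feature (an 18-digit string fits both the ID card and the bank card patterns above), so the helper returns a list rather than a single name.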
A system implementing the described database sensitive information detection method is also provided, comprising a system table, a scan module, a judgment module and a display module. The scan module is connected to the database; the judgment module is connected to the sensitive information feature library and to the scan module; and the judgment module is connected to the display module;
The system table is used to scan the database and to obtain and return the user tables in the database;
The scan module is used to obtain the sample data of each field in the user tables returned by the system table;
The judgment module takes the sample data obtained by the scan module, uses regular expression matching to judge whether it is sensitive information, and produces one of the judgment results "yes", "no" or "suspected";
The display module is used to display the fields in the database that the judgment module has judged to be sensitive information.
The principle of the present invention is as follows: the regular expression feature library first defines the features of sensitive information, and the running database is then scanned and checked against it. The regular expression feature library collects the features of sensitive information such as mobile phone numbers, bank card numbers, ID card numbers and email addresses. The scan module scans the database and returns the user sample data it has scanned; the judgment module then compares this data against the regular expression feature library to judge whether it is sensitive information. If the result is "yes" or "suspected", the sensitive information field inventory is listed in the scan report so that the database administrator can give those fields focused protection and auditing.
Compared with the prior art, the beneficial effects of the present invention are as follows:
The present invention is based on a regular expression feature library. By scanning and checking the user data in the database against this library, it can find where sensitive information such as mobile phone numbers, bank card numbers, ID card numbers and email addresses is located, and it provides a detailed scan report so that the database administrator can give that information focused protection and auditing.
Brief description of the drawings
Fig. 1 is a schematic diagram of the working principle of the database sensitive information detection system of the present invention.
Fig. 2 is a workflow diagram of the database sensitive information detection method of the present invention.
Detailed description of the embodiments
It should first be noted that the present invention relates to database technology and is an application of computer technology in the field of information security. Implementing the invention involves several software function modules. The applicant believes that, after carefully reading the application documents and accurately understanding the principle and purpose of the invention, a person skilled in the art can fully implement the invention by combining known techniques with their own software programming skills. The software function modules include, but are not limited to, the regular expression feature library, the scan module, the judgment module and the display module; everything mentioned in the application documents of the present invention falls into this category, and the applicant will not enumerate each item.
The present invention is described in further detail below with reference to the accompanying drawings and embodiments:
As shown in Fig. 2, the database sensitive information detection method specifically comprises the following steps:
(1) scanning the system views of the database to obtain all user tables;
(2) extracting a portion of the data from each field of each user table as a sample;
(3) analyzing and matching the samples to judge whether they are sensitive information.
Step (1) specifically comprises the following steps:
Step A: connect to the database;
Step B: use a SELECT statement to obtain all table names of the database from its system views, and remove the system tables so that only user tables remain;
Step C: return the list of user tables obtained in step B.
Step (2) specifically comprises the following steps:
Step D: take one user table name from the user table list returned in step C;
Step E: use a SELECT statement with a paging query to obtain partial data for all fields of the user table selected in step D; partial data means the first N records of the table. For example, if a table has 1000 records, retrieving all of them would put load on the database, so a paging query takes only the first 20 or 30 records for judgment;
Step F: return the partial data of all fields obtained in step E as sample data.
Step (3) specifically comprises the following steps:
Step G: take the sample data of one field returned in step F;
Step H: use regular expression matching to judge whether the sample data selected in step G is sensitive information, as follows: if all of the sample data matches sensitive information in the sensitive information feature library, the field is considered a sensitive data field and the judgment result is "yes"; if the sample data matches sensitive information in the feature library at a ratio of not less than 80%, the judgment result is "suspected"; if none of the sample data matches, or it matches at a ratio of less than 20%, the judgment result is "no";
The sensitive information feature library is a set of regular expressions used to identify sensitive information; for example, one regular expression identifies mobile phone numbers and another identifies bank card numbers. Sensitive information means data that needs focused protection and auditing, including mobile phone numbers, bank card numbers, ID card numbers, email addresses and the like. Regular expression matching means that, because each user may define sensitive information differently, custom matching can be performed with regular expressions, which are a publicly known, general-purpose string matching method. In addition, user-defined regular expressions can be added to the sensitive information feature library: for example, if users consider employee numbers to be sensitive, they can define a regular expression that matches employee numbers so that fields containing them are listed.
Step I: repeat steps D, E, F, G and H until all user tables returned in step C have been judged, then return the judgment results obtained in step H and display a scan report containing them; when a judgment result is "yes" or "suspected", the scan report also includes a sensitive information field inventory. The sensitive information field inventory is the part of the sample data that matches the sensitive information feature library. For example, if field B of table A contains sensitive information (a mobile phone number), the returned entry is table name: A, field: B, content: 138XXXXXXXX, so that the user can see directly what kind of sensitive information the field contains.
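An inventory entry like the one in the example above could be produced as sketched here. Two points are assumptions read into the example rather than stated by the text: the content is masked by keeping the first three characters and replacing the rest with "X" (suggested by "138XXXXXXXX"), and the mobile number pattern is a simplified illustration. The names `mask` and `inventory_entries` are hypothetical.

```python
import re
from typing import Dict, List, Pattern

# Assumed, simplified pattern: 11 digits starting with 1.
PHONE = re.compile(r"1\d{10}")


def mask(value: str, keep: int = 3) -> str:
    """Keep the first `keep` characters and replace the rest with 'X'."""
    return value[:keep] + "X" * max(len(value) - keep, 0)


def inventory_entries(
    table: str, field: str, samples: List[str], pattern: Pattern[str]
) -> List[Dict[str, str]]:
    """Build one inventory entry per sample value that matches the pattern."""
    return [
        {"table": table, "field": field, "content": mask(value)}
        for value in samples
        if pattern.fullmatch(value)
    ]


entries = inventory_entries("A", "B", ["13812345678", "n/a"], PHONE)
line = "table name: {table}, field: {field}, content: {content}".format(**entries[0])
print(line)
```

Only the matching values enter the inventory; the non-matching "n/a" is left out, so the report shows the user what was actually recognized.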
As shown in Fig. 1, the database sensitive information detection system comprises a system table, a scan module, a judgment module and a display module. The scan module is connected to the database; the judgment module is connected to the sensitive information feature library and to the scan module; and the judgment module is connected to the display module.
The system table is used to scan the database and to obtain and return the user tables in the database.
The scan module is used to obtain the sample data of each field in the user tables returned by the system table.
The judgment module takes the sample data obtained by the scan module, uses the regular expression feature library to judge whether it is sensitive information, and produces one of the judgment results "yes", "no" or "suspected".
The display module is used to display the fields in the database that the judgment module has judged to be sensitive information.
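How the four components hand data to one another can be sketched by passing each one in as a function. Everything concrete here is an assumption made for illustration: `FAKE_DB` is an in-memory stand-in for a real database, the mobile number pattern is simplified, ratios between 20% and 80% are labeled "undetermined" because the text defines no result for them, and the names `run_detection` and `judge` are hypothetical.

```python
import re
from typing import Callable, Dict, List, Tuple

PHONE = re.compile(r"1\d{10}")  # assumed, simplified mobile number pattern

# Hypothetical stand-in for a database: table -> field -> sampled values.
FAKE_DB: Dict[str, Dict[str, List[str]]] = {
    "customers": {"name": ["Ann", "Bob"], "phone": ["13812345678", "13987654321"]},
    "logs": {"msg": ["started", "stopped"]},
}


def judge(samples: List[str]) -> str:
    """Judgment module: apply the yes / suspected / no rule to one field."""
    if not samples:
        return "no"
    matched = sum(1 for v in samples if PHONE.fullmatch(v))
    if matched == len(samples):
        return "yes"
    ratio = matched / len(samples)
    if ratio >= 0.8:
        return "suspected"
    if ratio < 0.2:
        return "no"
    return "undetermined"  # assumption: band not defined in the text


def run_detection(
    list_tables: Callable[[], List[str]],
    sample_table: Callable[[str], Dict[str, List[str]]],
    judge_field: Callable[[List[str]], str],
    display: Callable[[List[Tuple[str, str, str]]], None],
) -> List[Tuple[str, str, str]]:
    """Wire the components: system table -> scan -> judgment -> display."""
    findings = []
    for table in list_tables():
        for field, samples in sample_table(table).items():
            verdict = judge_field(samples)
            if verdict in ("yes", "suspected"):
                findings.append((table, field, verdict))
    display(findings)
    return findings


shown: List[Tuple[str, str, str]] = []
results = run_detection(lambda: list(FAKE_DB), lambda t: FAKE_DB[t], judge, shown.extend)
print(results)
```

Keeping the components as separate callables mirrors the connections described above: the judgment step sees only sample data, and the display step sees only judged fields.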
The following example will help those skilled in the art understand the present invention more fully, but it does not limit the invention in any way. Assume that sensitive information detection is to be performed on an Oracle 10g database.
First, install the database sensitive information detection system; then enter the IP address, port, SID, user name and password of the database to be scanned and connect to the database.
The scanning and judgment process mainly performs the following steps:
(1) obtain the names of all user tables by querying the system view ALL_TABLES with a SELECT statement;
(2) for each table name, use a paging query to obtain the sample data of each field of that user table;
(3) judge, by checking the sample data against the regular expression feature library, whether it belongs to a corresponding kind of sensitive information;
(4) generate a report from the table names and field names where sensitive information was detected, and provide the sensitive information list to the user.
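For the Oracle case, the two queries behind steps (1) and (2) could take roughly the following form. This sketch only builds SQL text and is not run against a database; `ALL_TABLES` (with its `OWNER` and `TABLE_NAME` columns) and the `ROWNUM` pseudocolumn are standard Oracle features, while the list of schemas treated as system owners, the bind names `:o0`, `:o1`, and the function names are assumptions.

```python
from typing import Sequence

# Assumption: a real deployment would exclude more built-in schemas than these.
SYSTEM_OWNERS = ("SYS", "SYSTEM")


def user_tables_query(system_owners: Sequence[str] = SYSTEM_OWNERS) -> str:
    """Step (1): list tables from ALL_TABLES, skipping system-owned ones.

    The owners are meant to be supplied as bind variables :o0, :o1, ...
    """
    placeholders = ", ".join(f":o{i}" for i in range(len(system_owners)))
    return (
        "SELECT owner, table_name FROM ALL_TABLES "
        f"WHERE owner NOT IN ({placeholders})"
    )


def sample_query(owner: str, table: str, column: str, n: int = 20) -> str:
    """Step (2): read only the first n rows of one column using ROWNUM.

    Identifiers are assumed to come from ALL_TABLES and the column catalog,
    not from user input, and are double-quoted exactly as stored.
    """
    return f'SELECT "{column}" FROM "{owner}"."{table}" WHERE ROWNUM <= {int(n)}'


print(user_tables_query())
print(sample_query("HR", "EMPLOYEES", "PHONE", 20))
```

`ROWNUM <= n` stops the scan after n rows, which is what keeps sampling cheap on large tables; the schema, table and column names in the usage lines are examples only.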
After detection is complete, the detection report shows which sensitive information the database contains and in which fields of which tables it is stored.
Finally, it should be noted that the above is only a specific embodiment of the present invention. The invention is clearly not limited to the above embodiment, and many variations are possible. All variations that a person of ordinary skill in the art can directly derive from or associate with the disclosure of the present invention shall be considered to fall within the scope of protection of the invention.