CN104123370B - Database sensitive information detection method and system - Google Patents

Publication number
CN104123370B
Authority
CN
China
Prior art keywords
sensitive information
database
user
data
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410356492.8A
Other languages
Chinese (zh)
Other versions
CN104123370A (en)
Inventor
刘海卫
范渊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Dbappsecurity Technology Co Ltd
Original Assignee
DBAPPSecurity Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by DBAPPSecurity Co Ltd
Priority to CN201410356492.8A
Publication of CN104123370A
Application granted
Publication of CN104123370B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2457 Query processing with adaptation to user needs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing

Abstract

The present invention relates to the field of information security technology and provides a method and system for detecting sensitive information in a database. The method comprises the steps of: scanning the system views of the database to obtain all user tables; extracting part of the data from each field of each user table as a sample; and analyzing and matching the sample to judge whether it is sensitive information. The system comprises a system table, a scan module, a judge module and a display module; the scan module is connected to the database, the judge module is connected to both the sensitive information feature library and the scan module, and the judge module is also connected to the display module. Based on a feature library of regular expressions, the invention scans the user data in the database, finds where sensitive information such as mobile phone numbers, bank card numbers, ID card numbers and email addresses is located, and provides a detailed scan report so that the database administrator (DBA) can focus protection and auditing on it.

Description

Database sensitive information detection method and system
Technical field
The present invention relates to the field of information security technology, and in particular to a method and system for detecting sensitive information in a database.
Background
Sensitive information leaks still occur frequently, and data security is receiving more and more attention. At the same time, database systems keep growing, and large data volumes bring new problems to database security management. A database that holds only a small amount of data is easy to manage, but with dozens of databases and thousands of tables nobody knows where the important information is, and protection and auditing have nowhere to start. A method is therefore needed that quickly and comprehensively finds where important information is located in a database, so that it can be given focused protection and auditing.
Common database security scanners mainly detect configuration risks of the database system and security vulnerabilities in the database software itself. What they analyze is information about the database system; they have no function for detecting where sensitive information is located.
Summary of the invention
The main object of the present invention is to overcome the deficiencies of the prior art and to provide a database detection method and system that can find where sensitive information is located. To solve the above technical problem, the solution of the present invention is as follows:
A database sensitive information detection method is provided, comprising the following steps:
(1) scan the system views of the database to obtain all user tables;
(2) extract part of the data from each field of each user table as a sample;
(3) analyze and match the sample to judge whether it is sensitive information;
Step (1) comprises the following steps:
Step A: connect to the database;
Step B: use a SELECT statement to obtain all table names of the database from the database's system views, and remove the system tables, leaving the user tables;
Step C: return the list of user tables obtained in Step B;
Step (2) comprises the following steps:
Step D: take one user table name from the list of user tables returned in Step C;
Step E: using a SELECT statement with a paging query, obtain part of the data of all fields in the user table chosen in Step D; the partial data is the first N records of a table (for example, if a table has 1,000 records, fetching all of them would put load on the database, so a paging query fetches only the first 20 or 30 records for judgment);
Step F: return the partial data of all fields obtained in Step E as sample data;
Step (3) comprises the following steps:
Step G: take the sample data of one field returned in Step F;
Step H: for the sample data chosen in Step G, use regular expression matching to judge whether the sample data is sensitive information. The judgment rule is: if all of the sample data matches the sensitive information in the sensitive information feature library, the sample data is considered a sensitive data field and the judgment result is "yes"; if the sample data matches the sensitive information in the feature library at a ratio of not less than 80%, the judgment result is "suspected"; if none of the sample data matches, or it matches at a ratio of less than 20%, the judgment result is "no";
Step I: repeat Steps D, E, F, G and H until all user tables returned in Step C have been judged, then return the judgment results obtained in Step H and display a scan report containing them; when a judgment result is "yes" or "suspected", the scan report also includes a sensitive information field inventory (that is, the parts of the sample data that match the sensitive information feature library; for example, if field B of table A contains sensitive information (a mobile phone number), what is returned is table name: A, field: B, content: 138XXXXXXXX, so that the user can see more directly what kind of sensitive information the field holds).
In the present invention, the sensitive information feature library in Step H is a set of regular expressions for judging sensitive information (for example, one regular expression judges mobile phone numbers and another judges bank card numbers); sensitive information is data that needs protection and auditing, including mobile phone numbers, bank card numbers, ID card numbers and email addresses.
In the present invention, users can add their own regular expressions to the sensitive information feature library for use in matching sensitive information (for example, a user who considers employee numbers to be sensitive information can define a regular expression that matches employee numbers, so that fields containing employee numbers are listed); regular expressions are a publicly known, general-purpose string matching method.
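As an illustration of such a feature library, the following minimal Python sketch keeps one regular expression per sensitive information type and lets a user register an extra one. The concrete patterns, the names `FEATURE_LIBRARY`, `add_custom_pattern` and `matching_types`, and the employee-number format are all assumptions made for this example; the patent text does not give any actual expressions.

```python
import re

# Illustrative patterns only (assumptions, not taken from the patent text).
FEATURE_LIBRARY = {
    # 11-digit mainland China mobile number starting with 13 to 19
    "mobile_number": re.compile(r"1[3-9]\d{9}"),
    # 16 to 19 digits, with no checksum validation
    "bank_card_number": re.compile(r"\d{16,19}"),
    # 18-character resident ID number, last character a digit or X
    "id_card_number": re.compile(r"\d{17}[\dXx]"),
    # deliberately simple email address pattern
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
}


def add_custom_pattern(library, name, pattern):
    """Register a user-defined regular expression under a new type name."""
    if name in library:
        raise ValueError(f"pattern {name!r} already exists")
    library[name] = re.compile(pattern)


def matching_types(library, value):
    """Return the names of all patterns that match the whole value."""
    text = str(value).strip()
    return [name for name, rx in library.items() if rx.fullmatch(text)]


# Hypothetical employee-number format: the letter E followed by six digits.
add_custom_pattern(FEATURE_LIBRARY, "employee_number", r"E\d{6}")
```

A single value can match more than one type (an 18-digit string satisfies both the bank card and the ID card pattern above), so a real library would probably need stricter expressions or extra validation.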
A system implementing the described database sensitive information detection method is provided, comprising a system table, a scan module, a judge module and a display module; the scan module is connected to the database, the judge module is connected to both the sensitive information feature library and the scan module, and the judge module is connected to the display module;
The system table is used to scan the database and to obtain and return the user tables in the database;
The scan module is used to obtain the sample data of each field in the user tables returned by the system table;
The judge module is used to take the sample data obtained by the scan module, judge by regular expression matching whether it is sensitive information, and produce one of three judgment results: "yes", "no" or "suspected";
The display module is used to display the fields in the database that the judge module has judged to be sensitive information.
The realization principle of the present invention is as follows. The regular expression feature library first defines the features of sensitive information; the database, while in operation, is then scanned and checked against it. The regular expression feature library collects the features of sensitive information such as mobile phone numbers, bank card numbers, ID card numbers and email addresses. The scan module is responsible for scanning the database and returning the user sample data it has scanned; the judge module then compares that data against the regular expression feature library to judge whether the scanned user sample data is sensitive information. If the result is "yes" or "suspected", the sensitive information field inventory is listed in the scan report so that the DBA can focus protection and auditing on it.
Compared with the prior art, the beneficial effects of the present invention are:
Based on a feature library of regular expressions, the invention scans the user data in the database, finds where sensitive information such as mobile phone numbers, bank card numbers, ID card numbers and email addresses is located, and provides a detailed scan report so that the DBA can focus protection and auditing on it.
Brief description of the drawings
Fig. 1 is a diagram of the working principle of the database sensitive information detection system of the present invention.
Fig. 2 is a workflow diagram of the database sensitive information detection method of the present invention.
Detailed description
It should first be explained that the present invention relates to database technology and is an application of computer technology in the field of information security. Implementing the invention involves the use of several software function modules. The applicant believes that, after carefully reading the application documents and accurately understanding the realization principle and purpose of the invention, a person skilled in the art can fully implement the invention using their software programming skills in combination with existing known technology. The aforementioned software function modules include but are not limited to the regular expression feature library, the scan module, the judge module and the display module; everything mentioned in the application documents falls into this category, and the applicant will not enumerate them one by one.
The present invention is described in further detail below with reference to the accompanying drawings and an embodiment:
As shown in Fig. 2, the database sensitive information detection method comprises the following steps:
(1) scan the system views of the database to obtain all user tables;
(2) extract part of the data from each field of each user table as a sample;
(3) analyze and match the sample to judge whether it is sensitive information.
Step (1) comprises the following steps:
Step A: connect to the database;
Step B: use a SELECT statement to obtain all table names of the database from the database's system views, and remove the system tables, leaving the user tables;
Step C: return the list of user tables obtained in Step B.
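Steps A to C can be sketched as follows. To keep the example runnable without a database server, it uses Python's built-in `sqlite3` module as a stand-in: SQLite's `sqlite_master` catalog plays the role of the system view, and tables whose names start with `sqlite_` play the role of system tables. The function name `list_user_tables` and the demo tables are assumptions for illustration only.

```python
import sqlite3


def list_user_tables(conn):
    """Steps B and C: read table names from the catalog and drop system tables."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
        "ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


# Step A: connect to the database (an in-memory database for demonstration).
conn = sqlite3.connect(":memory:")
# AUTOINCREMENT makes SQLite create its internal sqlite_sequence table,
# which gives the filter a system table to exclude.
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY AUTOINCREMENT, phone TEXT)")
conn.execute("CREATE TABLE orders (id INTEGER, note TEXT)")

user_tables = list_user_tables(conn)
```

On other database products the catalog query differs, but the shape stays the same: one SELECT against a system view plus a filter that removes system-owned tables.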
Step (2) comprises the following steps:
Step D: take one user table name from the list of user tables returned in Step C;
Step E: using a SELECT statement with a paging query, obtain part of the data of all fields in the user table chosen in Step D; the partial data is the first N records of a table. For example, if a table has 1,000 records, fetching all of them would put load on the database, so a paging query fetches only the first 20 or 30 records for judgment;
Step F: return the partial data of all fields obtained in Step E as sample data.
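A minimal sketch of Steps E and F, again using `sqlite3` as a stand-in database: the paging query is expressed with `LIMIT`, and the result is regrouped so that each field has its own list of sample values. The helper names `quote_identifier` and `sample_table`, the default of 20 rows, and the choice to skip NULL values are assumptions of this example.

```python
import sqlite3


def quote_identifier(name):
    """Quote a table or column name so it can be embedded in SQL text."""
    return '"' + name.replace('"', '""') + '"'


def sample_table(conn, table, n=20):
    """Steps E and F: fetch the first n rows and group the values by field."""
    cursor = conn.execute(
        f"SELECT * FROM {quote_identifier(table)} LIMIT ?", (n,)
    )
    columns = [d[0] for d in cursor.description]
    samples = {column: [] for column in columns}
    for row in cursor:
        for column, value in zip(columns, row):
            if value is not None:  # assumption: NULLs are left out of the sample
                samples[column].append(value)
    return samples


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, phone TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(f"user{i}", f"138{i:08d}") for i in range(1000)],
)

# Only 20 of the 1,000 rows are read.
samples = sample_table(conn, "customers", n=20)
```

Limiting the query on the database side, rather than fetching everything and truncating in the client, is what keeps the load on the scanned database small.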
Step (3) comprises the following steps:
Step G: take the sample data of one field returned in Step F;
Step H: for the sample data chosen in Step G, use regular expression matching to judge whether the sample data is sensitive information. The judgment rule is: if all of the sample data matches the sensitive information in the sensitive information feature library, the sample data is considered a sensitive data field and the judgment result is "yes"; if the sample data matches the sensitive information in the feature library at a ratio of not less than 80%, the judgment result is "suspected"; if none of the sample data matches, or it matches at a ratio of less than 20%, the judgment result is "no";
The sensitive information feature library is a set of regular expressions for judging sensitive information; for example, one regular expression judges mobile phone numbers and another judges bank card numbers. Sensitive information is data that needs focused protection and auditing, including mobile phone numbers, bank card numbers, ID card numbers, email addresses and so on. Regular expression matching allows custom matching according to each user's own definition of sensitive information; regular expressions are a publicly known, general-purpose string matching method. In addition, users can add their own regular expressions to the sensitive information feature library; for example, a user who considers employee numbers to be sensitive information can define a regular expression that matches employee numbers, so that fields containing employee numbers are listed.
Step I: repeat Steps D, E, F, G and H until all user tables returned in Step C have been judged, then return the judgment results obtained in Step H and display a scan report containing them; when a judgment result is "yes" or "suspected", the scan report also includes a sensitive information field inventory. The sensitive information field inventory is the part of the sample data that matches the sensitive information feature library; for example, if field B of table A contains sensitive information (a mobile phone number), what is returned is table name: A, field: B, content: 138XXXXXXXX, so that the user can see more directly what sensitive information the field holds.
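The judgment rule of Step H can be written down as a small function. This is a sketch under stated assumptions: the text defines results for a 100% match, for a ratio of at least 80%, and for a ratio below 20%, but does not say what happens between 20% and 80%, so the example returns a separate label, "undetermined", for that range. Treating an empty sample as "no" and the name `judge_field` are also assumptions.

```python
import re


def judge_field(values, pattern):
    """Step H: classify one field's sample against one regular expression."""
    total = len(values)
    if total == 0:
        return "no"  # assumption: nothing to match means not sensitive
    hits = sum(1 for v in values if pattern.fullmatch(str(v).strip()))
    if hits == total:
        return "yes"
    # Integer arithmetic avoids floating-point edge cases at the thresholds.
    if hits * 100 >= 80 * total:
        return "suspected"
    if hits * 100 < 20 * total:
        return "no"
    return "undetermined"  # 20% <= ratio < 80% is not specified in the text


# Illustrative pattern (an assumption): 11-digit mobile number.
MOBILE = re.compile(r"1[3-9]\d{9}")
phone = "13812345678"
```

For example, a sample of ten values in which nine are mobile numbers yields "suspected", while a sample in which only one is a mobile number yields "no". With several expressions in the feature library, the function would be called once per expression for each field.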
As shown in Fig. 1, the database sensitive information detection system comprises a system table, a scan module, a judge module and a display module; the scan module is connected to the database, the judge module is connected to both the sensitive information feature library and the scan module, and the judge module is connected to the display module.
The system table is used to scan the database and to obtain and return the user tables in the database.
The scan module is used to obtain the sample data of each field in the user tables returned by the system table.
The judge module is used to take the sample data obtained by the scan module, judge against the regular expression feature library whether it is sensitive information, and produce one of three judgment results: "yes", "no" or "suspected".
The display module is used to display the fields in the database that the judge module has judged to be sensitive information.
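One possible way to wire the four components together is sketched below, with each component passed in as a plain function so that the data flow (system table, then scan module, then judge module, then display module) is visible in a few lines. The `Finding` record, the `run_detection` function and the in-memory stand-in data are hypothetical; the patent describes the connections between modules but not their programming interfaces.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass
class Finding:
    table: str
    field: str
    verdict: str


def run_detection(
    list_tables: Callable[[], Iterable[str]],
    sample_fields: Callable[[str], Dict[str, List[str]]],
    judge: Callable[[List[str]], str],
    display: Callable[[List[Finding]], None],
) -> List[Finding]:
    """Pass data from the system table through scan, judge and display."""
    findings = []
    for table in list_tables():                              # system table
        for field, values in sample_fields(table).items():   # scan module
            verdict = judge(values)                          # judge module
            if verdict in ("yes", "suspected"):
                findings.append(Finding(table, field, verdict))
    display(findings)                                        # display module
    return findings


# Stand-in data instead of a live database connection.
FAKE_DB = {"A": {"B": ["13812345678", "13912345678"], "C": ["hello", "world"]}}
MOBILE = re.compile(r"1[3-9]\d{9}")  # illustrative pattern (assumption)

shown = []
findings = run_detection(
    list_tables=lambda: FAKE_DB.keys(),
    sample_fields=lambda table: FAKE_DB[table],
    judge=lambda values: "yes" if all(MOBILE.fullmatch(v) for v in values) else "no",
    display=shown.extend,
)
```

Keeping the components as separate callables mirrors the module boundaries in Fig. 1 and makes it straightforward to swap in a real database connection or a different report format.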
The following example is intended to help those skilled in the art understand the present invention more fully, but does not limit the invention in any way. Assume that sensitive information detection is to be performed on an Oracle 10g database.
First install the database sensitive information detection system, enter the IP address, port, SID, user name and password of the database to be scanned, and connect to the database.
The scanning and judging process mainly performs the following steps:
(1) query the system view ALL_TABLES with a SELECT statement to obtain the names of all user tables;
(2) for each table name, use a paging query to obtain the sample data of each field of each user table;
(3) judge, using the regular expression feature library, whether the sample data belongs to a corresponding type of sensitive information;
(4) generate a report from the names of the tables and fields where the detected sensitive information is located, and provide the sensitive information list to the user.
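For steps (1) and (2) on Oracle, the SQL text might look like the strings built below. The sketch only constructs the statements and their bind values; it does not use any Oracle driver. `ALL_TABLES` with its `OWNER` and `TABLE_NAME` columns and the `ROWNUM` pseudocolumn are standard Oracle features, but filtering system tables by a list of owners, the two owners listed, and the function names are assumptions of this example.

```python
# Assumed list of system schemas; real installations contain more of them.
SYSTEM_OWNERS = ("SYS", "SYSTEM")


def user_tables_query(system_owners=SYSTEM_OWNERS):
    """Build the ALL_TABLES query and its bind values for step (1)."""
    placeholders = ", ".join(f":o{i}" for i in range(len(system_owners)))
    sql = (
        "SELECT owner, table_name FROM all_tables "
        f"WHERE owner NOT IN ({placeholders}) "
        "ORDER BY owner, table_name"
    )
    binds = {f"o{i}": owner for i, owner in enumerate(system_owners)}
    return sql, binds


def quote(name):
    """Quote an Oracle identifier; names containing a double quote are rejected."""
    if '"' in name or "\0" in name:
        raise ValueError(f"invalid identifier: {name!r}")
    return f'"{name}"'


def sample_query(owner, table, n=20):
    """Build the step (2) query that returns at most n rows of one table."""
    sql = f"SELECT * FROM {quote(owner)}.{quote(table)} WHERE ROWNUM <= :n"
    return sql, {"n": n}
```

Because `ROWNUM` is assigned before any sorting, this form returns an arbitrary set of up to n rows, which is sufficient for sampling.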
After detection is complete, the detection report shows what sensitive information the database holds and in which fields of which tables it is stored.
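The field inventory in such a report could be rendered roughly as follows, modelled on the example entry given earlier (table name: A, field: B, content: 138XXXXXXXX). Whether the X characters in that entry are meant as masking is not stated in the text; this sketch assumes they are and keeps only the first three characters of the matched value. The function names and the dictionary keys are hypothetical.

```python
def mask(value, keep=3, mask_char="X"):
    """Keep the first `keep` characters and replace the rest with mask_char."""
    text = str(value)
    return text[:keep] + mask_char * max(len(text) - keep, 0)


def format_report(findings):
    """Render one report line per sensitive or suspected field."""
    lines = []
    for f in findings:
        lines.append(
            f"table name: {f['table']}, field: {f['field']}, "
            f"type: {f['type']}, result: {f['verdict']}, "
            f"content: {mask(f['example'])}"
        )
    return "\n".join(lines)


report = format_report(
    [
        {
            "table": "A",
            "field": "B",
            "type": "mobile number",
            "verdict": "yes",
            "example": "13812345678",
        }
    ]
)
```

Masking the sample content keeps the report itself from becoming a new copy of the sensitive data while still showing the reader what kind of value was found.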
Finally, it should be noted that the above is only a specific embodiment of the present invention. Obviously, the invention is not limited to the above embodiment, and many variations are possible. All variations that a person of ordinary skill in the art can directly derive or infer from the disclosure of the present invention shall be considered within the scope of protection of the invention.

Claims (2)

1. A database sensitive information detection method, characterized in that it comprises the following steps:
(1) scan the system views of the database to obtain all user tables;
(2) extract part of the data from each field of each user table as a sample;
(3) analyze and match the sample to judge whether it is sensitive information;
Step (1) comprises the following steps:
Step A: connect to the database;
Step B: use a SELECT statement to obtain all table names of the database from the database's system views, and exclude the system tables, leaving the user tables;
Step C: return the list of user tables obtained in Step B;
Step (2) comprises the following steps:
Step D: take one user table name from the list of user tables returned in Step C;
Step E: using a SELECT statement with a paging query, obtain part of the data of all fields in the user table chosen in Step D; the partial data is the first N records of a table;
Step F: return the partial data of all fields obtained in Step E as sample data;
Step (3) comprises the following steps:
Step G: take the sample data of one field returned in Step F;
Step H: for the sample data chosen in Step G, use regular expression matching to judge whether the sample data is sensitive information, the judgment rule being: if all of the sample data matches the sensitive information in the sensitive information feature library, the sample data is considered a sensitive data field and the judgment result is "yes"; if the sample data matches the sensitive information in the sensitive information feature library at a ratio of not less than 80%, the judgment result is "suspected"; if none of the sample data matches the sensitive information in the sensitive information feature library, or it matches at a ratio of less than 20%, the judgment result is "no";
The sensitive information feature library is a set of regular expressions for judging sensitive information; sensitive information is data that needs protection and auditing, including mobile phone numbers, bank card numbers, ID card numbers and email addresses; user-defined regular expressions can be added to the sensitive information feature library for use in matching sensitive information; regular expressions are a publicly known, general-purpose string matching method;
Step I: repeat Steps D, E, F, G and H until all user tables returned in Step C have been judged, then return the judgment results obtained in Step H and display a scan report containing them; when a judgment result is "yes" or "suspected", the scan report also includes a sensitive information field inventory.
2. A system implementing the database sensitive information detection method of claim 1, characterized in that it comprises a system table, a scan module, a judge module and a display module; the scan module is connected to the database, the judge module is connected to both the sensitive information feature library and the scan module, and the judge module is connected to the display module;
The system table is used to scan the database and to obtain and return the user tables in the database;
The scan module is used to obtain the sample data of each field in the user tables returned by the system table;
The judge module is used to take the sample data obtained by the scan module, judge by regular expression matching whether it is sensitive information, and produce one of three judgment results: "yes", "no" or "suspected";
The display module is used to display the fields in the database that the judge module has judged to be sensitive information.
CN201410356492.8A 2014-07-24 2014-07-24 Database sensitive information detection method and system Active CN104123370B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410356492.8A CN104123370B (en) 2014-07-24 2014-07-24 Database sensitive information detection method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410356492.8A CN104123370B (en) 2014-07-24 2014-07-24 Database sensitive information detection method and system

Publications (2)

Publication Number Publication Date
CN104123370A CN104123370A (en) 2014-10-29
CN104123370B true CN104123370B (en) 2017-11-24

Family

ID=51768781

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410356492.8A Active CN104123370B (en) 2014-07-24 2014-07-24 Database sensitive information detection method and system

Country Status (1)

Country Link
CN (1) CN104123370B (en)

Families Citing this family (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104462451B (en) * 2014-12-15 2017-12-05 中电长城网际系统应用有限公司 The detection method and device of database file sensitive content
CN105825138B (en) * 2015-01-04 2019-02-15 北京神州泰岳软件股份有限公司 A kind of method and apparatus of sensitive data identification
CN106156046B (en) * 2015-03-27 2021-03-30 中国移动通信集团云南有限公司 Information management method, device and system and analysis equipment
CN104794204B (en) * 2015-04-23 2018-11-09 上海新炬网络技术有限公司 A kind of database sensitive data automatic identifying method
CN107305615B (en) * 2016-04-25 2019-12-17 深信服科技股份有限公司 Data table identification method and system
CN106203145A (en) * 2016-08-04 2016-12-07 北京网智天元科技股份有限公司 Data desensitization method and relevant device
CN106295400A (en) * 2016-08-04 2017-01-04 北京网智天元科技股份有限公司 Masking type data desensitization method and relevant device
CN107066882B (en) * 2017-03-17 2019-07-12 平安科技(深圳)有限公司 Information leakage detection method and device
CN107295009A (en) * 2017-08-01 2017-10-24 杭州安恒信息技术有限公司 A kind of method for bypassing audit sqlserver link informations
CN107729456A (en) * 2017-09-30 2018-02-23 武汉汉思信息技术有限责任公司 Sensitive information search method, server and storage medium
CN108062484A (en) * 2017-12-11 2018-05-22 北京安华金和科技有限公司 A kind of classification stage division based on data sensitive feature and database metadata
CN108536739B (en) * 2018-03-07 2021-10-12 中国平安人寿保险股份有限公司 Metadata sensitive information field identification method, device, equipment and storage medium
CN109597823B (en) * 2018-11-05 2023-08-29 中国平安财产保险股份有限公司 Data source configuration method, device, computer equipment and storage medium
CN109829327A (en) * 2018-12-15 2019-05-31 中国平安人寿保险股份有限公司 Sensitive information processing method, device, electronic equipment and storage medium
CN109617880A (en) * 2018-12-17 2019-04-12 杭州安恒信息技术股份有限公司 Actively protect the method and apparatus of privacy information
CN110472432A (en) * 2019-05-31 2019-11-19 上海上湖信息技术有限公司 A kind of method and device of sensitive information desensitization
CN111737742B (en) * 2020-06-19 2023-06-20 建信金融科技有限责任公司 Sensitive data scanning method and system
CN113157854B (en) * 2021-01-22 2023-08-04 奇安信科技集团股份有限公司 API sensitive data leakage detection method and system
CN113177233A (en) * 2021-05-31 2021-07-27 上海英方软件股份有限公司 Sensitive data identification method and device
CN113704573A (en) * 2021-08-26 2021-11-26 北京中安星云软件技术有限公司 Database sensitive data scanning method and device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101448007A (en) * 2008-12-31 2009-06-03 中国电力科学研究院 Attack prevention system based on structured query language (SQL)
CN102521536A (en) * 2011-12-06 2012-06-27 杭州安恒信息技术有限公司 Method and system for detecting inner core object invasion of database
US8272051B1 (en) * 2008-03-27 2012-09-18 Trend Micro Incorporated Method and apparatus of information leakage prevention for database tables
CN102902703A (en) * 2012-07-19 2013-01-30 中国人民解放军国防科学技术大学 Network sensitive information-oriented screenshot discovery and locking callback method

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8272051B1 (en) * 2008-03-27 2012-09-18 Trend Micro Incorporated Method and apparatus of information leakage prevention for database tables
CN101448007A (en) * 2008-12-31 2009-06-03 中国电力科学研究院 Attack prevention system based on structured query language (SQL)
CN102521536A (en) * 2011-12-06 2012-06-27 杭州安恒信息技术有限公司 Method and system for detecting inner core object invasion of database
CN102902703A (en) * 2012-07-19 2013-01-30 中国人民解放军国防科学技术大学 Network sensitive information-oriented screenshot discovery and locking callback method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
New method of detection and wiping of sensitive information;George P.等;《Intelligent computer communication and processing》;20111020;145-148 *
无线音乐业务敏感数据保护技术研究;杨雪涛;《电信工程技术与标准化》;20131231(第12期);55-59 *

Also Published As

Publication number Publication date
CN104123370A (en) 2014-10-29

Similar Documents

Publication Publication Date Title
CN104123370B (en) Database sensitive information detection method and system
US9736331B2 (en) Device, system and method for identifying sections of documents
CN107870927B (en) File evaluation method and device
CN106980637A (en) SQL checking methods and device
WO2019194028A1 (en) Image processing device, image processing method, and storage medium for storing program
CN106384057A (en) Data access authority identification method and device
Fu et al. Automatic record linkage of individuals and households in historical census data
CN110599289A (en) Method for formatting official document
KR101019627B1 (en) System and Method for Construction Automatic Bibliography based Pattern, and Recording Medium therefor
JP2011197997A (en) Device, processing program, and method for controlling information display
CN116340387A (en) Statistical analysis method and system for personal information disclosure condition of data table
CN108268462A (en) A kind of data quality checking system of relation integraity
CN114417099B (en) Archive management system based on RFID (radio frequency identification) label
Dejean Extracting structured data from unstructured document with incomplete resources
CN110570207A (en) commodity tracing method and device
CN115081916A (en) DNA digital management system
CN105893527A (en) Intelligent user information inputting method
US20150324813A1 (en) System and method for determining by an external entity the human hierarchial structure of an rganization, using public social networks
JP5419475B2 (en) Security management apparatus and program
CN113706311B (en) Efficient verifiable supply chain data sharing architecture
Alotaibi ETDC: an efficient technique to cleanse data in the data warehouse
TWI684109B (en) A computer implemented system and method for collating and presenting multi-format information
US20230077642A1 (en) Systems and methods for performing Correlated Multiphasic Analysis
Solaro Sensitivity analysis and robust approach in multidimensional scaling: An evaluation of customer satisfaction
JP7206415B2 (en) Daily report data shaping device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP03 Change of name, title or address

Address after: Zhejiang Zhongcai Building No. 68 Hangzhou 310051 Zhejiang province Binjiang District Tong Road 15

Patentee after: Hangzhou Annan information technology Limited by Share Ltd

Address before: Hangzhou City, Zhejiang province 310051 Binjiang District and Zhejiang road in the 15 storey building

Patentee before: Dbappsecurity Co.,ltd.

CP02 Change in the address of a patent holder

Address after: Zhejiang Zhongcai Building No. 68 Binjiang District road Hangzhou City, Zhejiang Province, the 310051 and 15 layer

Patentee after: Hangzhou Annan information technology Limited by Share Ltd

Address before: Zhejiang Zhongcai Building No. 68 Hangzhou 310051 Zhejiang province Binjiang District Tong Road 15

Patentee before: Hangzhou Annan information technology Limited by Share Ltd