CN111125758A - Dynamic desensitization method based on full syntax tree analysis - Google Patents

Info

Publication number
CN111125758A
Authority
CN
China
Prior art keywords
field
syntax tree
information
matching
desensitization
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911313880.7A
Other languages
Chinese (zh)
Inventor
杨海峰
王佩思
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Dbsec Technology Co ltd
Original Assignee
Beijing Dbsec Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Dbsec Technology Co ltd
Priority to CN201911313880.7A
Publication of CN111125758A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2246 Trees, e.g. B+trees
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2433 Query languages

Abstract

The invention relates to a dynamic desensitization method based on full syntax tree parsing, characterized by the following steps: performing full syntax tree parsing on the original SQL statement; matching the parsed result against the field information requiring desensitization through fast access to obtain a matching result; if the matching is unsuccessful, passing the statement through directly; if the matching is successful, entering the syntax tree desensitization logic, rewriting the syntax tree, and restoring the rewritten syntax tree to an SQL statement. In this method the user configures the sensitive fields; the SQL statement is parsed into a syntax tree, from which the logical relationships of the statement are obtained; all parsed field information is then obtained by fast traversal to determine which fields are queried and which tables they belong to; finally the sensitive field information is matched against the full syntax tree parsing result, thereby achieving desensitization.

Description

Dynamic desensitization method based on full syntax tree analysis
Technical Field
The invention belongs to the technical field of database security, and particularly relates to a dynamic desensitization method based on full syntax tree analysis.
Background
With the widespread use of big data, personal information protection faces unprecedented challenges. How to protect personal privacy information is the key problem that database desensitization must solve. Database desensitization is a technology that transforms sensitive information according to desensitization rules so that sensitive private data is reliably protected. Dynamic desensitization processes the data returned by the production database in real time, so that the returned data is both usable and safe.
Existing dynamic desensitization methods can prevent a user from operating on certain sensitive fields by assigning different permissions to different users, but this imposes many restrictions on the user's applications. Dynamic desensitization can also be achieved by rewriting SQL statements, but statement rewriting has many limitations as well: because it does not capture the hierarchical relationships within the SQL statement, desensitization easily fails.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a dynamic desensitization method based on full syntax tree analysis, which is reasonable in design, high in accuracy, strong in real-time performance, safe and reliable.
The technical problem addressed by the invention is solved by the following technical scheme:
A dynamic desensitization method based on full syntax tree parsing comprises the following steps:
step 1, performing full syntax tree parsing on the original SQL statement;
step 2, matching the parsed result against the field information requiring desensitization through fast access to obtain a matching result;
step 3, if the matching is unsuccessful, passing the statement through directly; if the matching is successful, entering the syntax tree desensitization logic, rewriting the syntax tree, and restoring the rewritten syntax tree.
The specific implementation of step 1 is as follows: the top node of the parsed SQL statement is query, and field information, table information, where information and order information reside in the leaf nodes of the query; the parsing results (query, field, table) each occupy a separate block of memory.
The matching result obtained in step 2 comprises desensitization information fields and the * field.
The specific implementation of step 3 comprises the following steps:
⑴ if the field is not *, go to step ⑵; otherwise perform the following steps:
① traverse the table information associated with the *;
② examine each single table; if it is a sub-query, expand the node to obtain a new query structure and enter step 2;
③ if the single table is an association (join) table, split the association table and enter step ②;
④ if it is an ordinary table, interact with the database using the table information to obtain the fields of the current table, replace the * of the current node with the retrieved field information, and enter step ⑺;
⑵ if the field is a function, examine the parameters of the function, then enter step ⑴;
⑶ if the field has the form a + b, split the current field into leaf nodes, then enter step ⑴ for each in turn;
⑷ if the field is a sub-query, expand the node to obtain a new query structure and enter step 2;
⑸ if the field is in the ordinary format, obtain the current field and the table it belongs to, match them against the desensitization information, and determine whether the matching succeeds;
⑹ if the matching succeeds, the current field is a sensitive field: desensitize and rewrite the current field and modify the node information of the current syntax tree; otherwise the current node is not a sensitive field and remains unchanged;
⑺ write the rewritten syntax tree back to obtain the rewritten SQL statement.
The invention has the advantages and positive effects that:
In this method the user configures the sensitive fields; the SQL statement is parsed into a syntax tree, from which the logical relationships of the statement are obtained; all parsed field information is then obtained by fast traversal to determine which fields are queried and which tables they belong to; finally the sensitive field information is matched against the full syntax tree parsing result, thereby achieving desensitization.
Drawings
FIG. 1 is a schematic diagram of an application of the present invention;
FIG. 2 is a syntax tree model of the full syntax tree parsing of the present invention;
FIG. 3 is a process flow diagram of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings.
The working principle of the invention is shown in fig. 1. The invention is deployed between the client and the database. The client sends an SQL statement toward the database; the dynamic desensitization method intercepts and parses the statement in between, rewrites it for desensitization, and sends the rewritten statement to the database; finally the database returns the desensitized result, so that sensitive data is masked.
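The deployment described above can be illustrated with a minimal Python sketch. It is not the patented implementation: `mask_rewrite` is a hypothetical stand-in that hard-codes one rewritten statement in place of steps 1 to 3, the `users` table and its contents are invented for illustration, and SQLite is used only because it runs in memory. The point shown is that the masking expression is executed by the database itself, so the client only ever receives desensitized values.

```python
import sqlite3


def mask_rewrite(sql: str) -> str:
    """Hypothetical stand-in for steps 1-3 (parse, match, rewrite).

    A real implementation would rewrite the syntax tree; here a single
    known statement is mapped to its masked form and anything else is
    passed through unchanged.
    """
    if sql == "SELECT name, phone FROM users":
        return "SELECT name, substr(phone, 1, 3) || '****' AS phone FROM users"
    return sql


def proxy_execute(conn: sqlite3.Connection, sql: str) -> list:
    """Sit between client and database: rewrite first, then forward."""
    return conn.execute(mask_rewrite(sql)).fetchall()


# Illustrative data only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, phone TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', '13800001111')")

rows = proxy_execute(conn, "SELECT name, phone FROM users")
print(rows)
```

Statements that touch no sensitive field, such as `SELECT name FROM users`, reach the database unmodified in this sketch, which corresponds to the pass-through branch of the method.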
A dynamic desensitization method based on full syntax tree parsing, as shown in fig. 3, includes the following steps:
Step 1, full syntax tree parsing is performed on the original SQL statement.
As shown in fig. 2, the original SQL statement is parsed as follows: the top node of the SQL statement is query (the same applies to sub-queries, which sit at a lower level of the hierarchy), and the leaf nodes of the query hold field (fields) information, table (tables) information, where information, order information, and the like. These leaf nodes may themselves be expandable, that is, they may hold compound information; for example, the sub-query under tables in fig. 2 can be expanded. Table or field information at the same level is linked in the form of a linked list, which establishes an association between siblings so that the order is preserved when the syntax tree is restored to the original statement.
In this step, the parsing results (query, field, table, and so on) each occupy a separate block of memory, which allows fast traversal: only the fields to be desensitized and the field contents need to be traversed in sequence.
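One possible in-memory shape for such a tree is sketched below in Python. The class and attribute names (`Query`, `Field`, `Table`, `field_names`) are assumptions made for illustration and do not come from the patent; ordinary Python lists stand in for the linked lists of sibling nodes, since both preserve the original order.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class Field:
    name: str
    table: Optional[str] = None  # qualifier, e.g. "t" in t.a


@dataclass
class Table:
    name: str


@dataclass
class Query:
    # Sibling nodes are kept in source order so the statement can be
    # restored later. A sub-query may appear wherever a field or table may.
    fields: List[Union[Field, "Query"]] = field(default_factory=list)
    tables: List[Union[Table, "Query"]] = field(default_factory=list)
    where: Optional[str] = None
    order: Optional[str] = None


def field_names(query: Query) -> List[str]:
    """Walk only the field list, without touching tables, where or order."""
    return [f.name for f in query.fields if isinstance(f, Field)]


# Hand-built tree for: select a, t.b from t, (select c from u) order by a
inner = Query(fields=[Field("c")], tables=[Table("u")])
outer = Query(
    fields=[Field("a"), Field("b", "t")],
    tables=[Table("t"), inner],
    order="a",
)
print(field_names(outer))
```

Keeping fields and tables in separate containers is what lets the matching step scan field names directly instead of walking the whole statement.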
Step 2, the parsed result is matched through fast access against the field information requiring desensitization. If no match is found, the statement is passed through directly; if the parsed result contains field information requiring desensitization, step 3 is entered.
In this step, if the matched field is *, the table information must be examined further; if the table information matches, desensitization may be required and step 3 is likewise entered.
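The pre-check in step 2 might look like the following Python sketch. The configuration format (a set of `(table, column)` pairs), the sample entries, and the function name `needs_desensitization` are all assumptions for illustration. As in the text, a plain field is matched by name only at this stage, while a `*` field is matched through the tables it ranges over; the precise field-to-table check is left to step 3.

```python
# Hypothetical user configuration: (table, column) pairs marked sensitive.
SENSITIVE = {("users", "phone"), ("users", "id_card")}


def needs_desensitization(fields, tables, sensitive=SENSITIVE):
    """Return True if the statement must enter the rewrite logic (step 3).

    fields: field names found by the parser, possibly including "*"
    tables: table names found by the parser
    """
    sensitive_tables = {table for table, _ in sensitive}
    sensitive_columns = {column for _, column in sensitive}
    for name in fields:
        if name == "*":
            # A wildcard hides the column names, so fall back to the tables.
            if any(table in sensitive_tables for table in tables):
                return True
        elif name in sensitive_columns:
            return True
    return False  # nothing sensitive: pass the statement through


print(needs_desensitization(["name"], ["users"]))
print(needs_desensitization(["*"], ["users"]))
```

Using sets keeps each lookup constant time, which is one plausible reading of the "fast access" the text refers to.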
Step 3, desensitization is performed according to the matching result.
The specific processing procedure of this step is shown in fig. 3 and includes the following steps:
⑴ Determine whether the field is * (this mainly concerns the field information of the outermost layer of the syntax tree); if so, proceed as follows:
① Traverse the table information associated with the *.
② Examine each single table. If the table is a sub-query, the node can be expanded; after expansion it is a new query structure, so processing enters step 2 and returns to the query judgment.
③ If the table is an association table (e.g., t1 join t2), the association table must be split into leaf nodes that cannot be expanded further, and each node then re-enters the table judgment logic, i.e., step ②.
④ If it is an ordinary table (general format), the database is queried using the table information to obtain the fields of the current table, the * of the current node is replaced with the retrieved field information, and processing goes to step ⑺.
⑵ If the field information is a function, the parameters of the function must be examined; each parameter enters the field judgment logic, i.e., step ⑴.
⑶ If the field has a form such as a + b, the current field must be split into leaf nodes (single-field information that cannot be expanded further), each of which then enters the field judgment logic, i.e., step ⑴.
⑷ If the field is a sub-query, the node can be expanded; after expansion it is a new query structure, so processing returns to step 2 and re-enters the query judgment.
⑸ If the field is in the general format, the current field and the table it belongs to are obtained and matched together against the configured desensitization information to determine whether the matching succeeds. During matching, a field may be associated with several tables, so the tables must be matched in turn. For example, in select a from t1, t2, field a is associated with 2 tables; in select t.a from t, t1, field a is associated with 1 table, because the table it belongs to is specified explicitly.
⑹ If the matching succeeds, the current field is a sensitive field and must be rewritten, i.e., wrapped in a desensitization function, which modifies the node information of the current syntax tree; if the current node is not a sensitive field, it remains unchanged.
⑺ Write the rewritten syntax tree back to obtain the rewritten SQL statement.
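The field judgment logic of steps ⑴ to ⑹ can be sketched recursively in Python. Everything named here is an assumption for illustration: the node classes, the `SENSITIVE` configuration, and `SCHEMA`, a dictionary that stands in for the round trip to the database in step ④. Sub-queries, join splitting and table aliases are omitted to keep the sketch short, and the sketch also matches each column produced by expanding `*`, which is an interpretation of the intent rather than something the text states.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Column:
    name: str
    table: Optional[str] = None  # explicit qualifier, if any


@dataclass
class Func:
    name: str
    args: list


@dataclass
class BinOp:
    op: str
    left: object
    right: object


@dataclass
class Star:
    pass


@dataclass
class Masked:
    inner: Column  # column wrapped in a desensitization function


SENSITIVE = {("t1", "a")}                   # hypothetical configuration
SCHEMA = {"t1": ["a", "b"], "t2": ["c"]}    # stands in for asking the database


def candidate_tables(col: Column, tables: List[str]) -> List[str]:
    """Step 5: a qualified field has one table, otherwise every FROM table."""
    return [col.table] if col.table else list(tables)


def rewrite_field(node, tables):
    if isinstance(node, Func):              # step 2: examine each parameter
        return Func(node.name, [rewrite_field(a, tables) for a in node.args])
    if isinstance(node, BinOp):             # step 3: split a + b into leaves
        return BinOp(node.op,
                     rewrite_field(node.left, tables),
                     rewrite_field(node.right, tables))
    if isinstance(node, Column):            # steps 5 and 6: match, then wrap
        hit = any((t, node.name) in SENSITIVE
                  for t in candidate_tables(node, tables))
        return Masked(node) if hit else node
    return node                             # anything else stays unchanged


def rewrite_select_list(fields, tables):
    out = []
    for node in fields:
        if isinstance(node, Star):          # step 1: expand * per table
            for t in tables:
                out.extend(rewrite_field(Column(c, t), tables)
                           for c in SCHEMA[t])
        else:
            out.append(rewrite_field(node, tables))
    return out


print(rewrite_select_list([Star()], ["t1"]))
```

The two cases from step ⑸ map onto `candidate_tables`: an unqualified `a` over `t1, t2` yields two candidate tables, while `t.a` over `t, t1` yields only `t`.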
Through the above logic the syntax tree has been rewritten. Because the syntax tree preserves a definite logical order and can be restored to an SQL statement, restoring the rewritten syntax tree yields the rewritten SQL statement, which realizes the rewriting logic.
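Restoring a rewritten tree to SQL text can be sketched as a small recursive renderer. The tuple-based node layout and the function name `mask` are hypothetical; a real deployment would call whatever desensitization function the target database provides. The sketch handles only a select list, a comma-separated table list and an optional where clause.

```python
def render(node) -> str:
    """Turn a (rewritten) syntax tree back into SQL text.

    Node layout (an assumption for this sketch):
      ("col", table_or_None, name)
      ("func", name, [args])
      ("binop", op, left, right)
      ("query", [fields], [table_names], where_or_None)
    """
    kind = node[0]
    if kind == "col":
        _, table, name = node
        return f"{table}.{name}" if table else name
    if kind == "func":
        _, name, args = node
        rendered_args = ", ".join(render(a) for a in args)
        return f"{name}({rendered_args})"
    if kind == "binop":
        _, op, left, right = node
        return f"{render(left)} {op} {render(right)}"
    if kind == "query":
        _, fields, tables, where = node
        # Children are emitted in stored order, so the statement keeps
        # the shape of the original.
        sql = "SELECT " + ", ".join(render(f) for f in fields)
        sql += " FROM " + ", ".join(tables)
        if where:
            sql += " WHERE " + where
        return sql
    raise ValueError(f"unknown node kind: {kind}")


# Tree for: SELECT name, phone FROM users WHERE id = 1
tree = ("query",
        [("col", None, "name"), ("col", None, "phone")],
        ["users"],
        "id = 1")

# Rewrite: wrap the sensitive node in the hypothetical mask() function.
fields = list(tree[1])
fields[1] = ("func", "mask", [fields[1]])
rewritten = ("query", fields, tree[2], tree[3])

print(render(rewritten))
```

Because only the one sensitive node is replaced and every other node is rendered as stored, the where clause and the field order of the original statement survive the rewrite.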
It should be emphasized that the embodiments described herein are illustrative rather than restrictive, and thus the present invention is not limited to the embodiments described in the detailed description, but other embodiments derived from the technical solutions of the present invention by those skilled in the art are also within the scope of the present invention.

Claims (4)

1. A dynamic desensitization method based on full syntax tree parsing, characterized by comprising the following steps:
step 1, performing full syntax tree parsing on the original SQL statement;
step 2, matching the parsed result against the field information requiring desensitization through fast access to obtain a matching result;
step 3, if the matching is unsuccessful, passing the statement through directly; if the matching is successful, entering the syntax tree desensitization logic, rewriting the syntax tree, and restoring the rewritten syntax tree.
2. The dynamic desensitization method based on full syntax tree parsing according to claim 1, characterized in that the specific implementation of step 1 is as follows: the top node of the parsed SQL statement is query, and field information, table information, where information and order information reside in the leaf nodes of the query; the parsing results (query, field, table) each occupy a separate block of memory.
3. The dynamic desensitization method based on full syntax tree parsing according to claim 1, characterized in that the matching result obtained in step 2 comprises desensitization information fields and the * field.
4. The dynamic desensitization method based on full syntax tree parsing according to claim 1, characterized in that the specific implementation of step 3 comprises the following steps:
⑴ if the field is not *, go to step ⑵; otherwise perform the following steps:
① traverse the table information associated with the *;
② examine each single table; if it is a sub-query, expand the node to obtain a new query structure and enter step 2;
③ if the single table is an association (join) table, split the association table and enter step ②;
④ if it is an ordinary table, interact with the database using the table information to obtain the fields of the current table, replace the * of the current node with the retrieved field information, and enter step ⑺;
⑵ if the field is a function, examine the parameters of the function, then enter step ⑴;
⑶ if the field has the form a + b, split the current field into leaf nodes, then enter step ⑴ for each in turn;
⑷ if the field is a sub-query, expand the node to obtain a new query structure and enter step 2;
⑸ if the field is in the ordinary format, obtain the current field and the table it belongs to, match them against the desensitization information, and determine whether the matching succeeds;
⑹ if the matching succeeds, the current field is a sensitive field: desensitize and rewrite the current field and modify the node information of the current syntax tree; otherwise the current node is not a sensitive field and remains unchanged;
⑺ write the rewritten syntax tree back to obtain the rewritten SQL statement.
CN201911313880.7A 2019-12-19 2019-12-19 Dynamic desensitization method based on full syntax tree analysis Pending CN111125758A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911313880.7A CN111125758A (en) 2019-12-19 2019-12-19 Dynamic desensitization method based on full syntax tree analysis

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911313880.7A CN111125758A (en) 2019-12-19 2019-12-19 Dynamic desensitization method based on full syntax tree analysis

Publications (1)

Publication Number Publication Date
CN111125758A true CN111125758A (en) 2020-05-08

Family

ID=70498403

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911313880.7A Pending CN111125758A (en) 2019-12-19 2019-12-19 Dynamic desensitization method based on full syntax tree analysis

Country Status (1)

Country Link
CN (1) CN111125758A (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111767300A (en) * 2020-05-11 2020-10-13 全球能源互联网研究院有限公司 Dynamic desensitization method and device for penetration of internal and external networks of electric power data
CN112214796A (en) * 2020-10-19 2021-01-12 上海观安信息技术股份有限公司 Dynamic desensitization method based on menu
CN112256721A (en) * 2020-10-21 2021-01-22 平安科技(深圳)有限公司 SQL statement parsing method, system, computer device and storage medium
CN112328599A (en) * 2020-11-12 2021-02-05 杭州数梦工场科技有限公司 Metadata-based field blood relationship analysis method and device
CN112989412A (en) * 2021-03-18 2021-06-18 城云科技(中国)有限公司 Data desensitization method and device based on SQL statement analysis
CN114003231A (en) * 2021-09-28 2022-02-01 厦门国际银行股份有限公司 SQL syntax parse tree optimization method and system
CN115906178A (en) * 2022-12-23 2023-04-04 星环信息科技(上海)股份有限公司 Database management method, data subscription end and data publishing end
CN116303370A (en) * 2023-05-17 2023-06-23 建信金融科技有限责任公司 Script blood margin analysis method, script blood margin analysis device, storage medium, script blood margin analysis equipment and script blood margin analysis product

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060122993A1 (en) * 2004-12-06 2006-06-08 International Business Machines Corporation Abstract query plan
CN106203170A (en) * 2016-07-19 2016-12-07 北京同余科技有限公司 The Database Dynamic desensitization method of servicing of based role and system
CN106778288A (en) * 2015-11-24 2017-05-31 阿里巴巴集团控股有限公司 A kind of method and system of data desensitization
CN107194270A (en) * 2017-04-07 2017-09-22 广东精点数据科技股份有限公司 A kind of system and method for realizing data desensitization
CN109426725A (en) * 2017-08-22 2019-03-05 中兴通讯股份有限公司 Data desensitization method, equipment and computer readable storage medium
CN109902514A (en) * 2019-03-07 2019-06-18 杭州比智科技有限公司 A kind of data desensitization control system, method, server and storage medium

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111767300A (en) * 2020-05-11 2020-10-13 全球能源互联网研究院有限公司 Dynamic desensitization method and device for penetration of internal and external networks of electric power data
CN112214796A (en) * 2020-10-19 2021-01-12 上海观安信息技术股份有限公司 Dynamic desensitization method based on menu
CN112256721A (en) * 2020-10-21 2021-01-22 平安科技(深圳)有限公司 SQL statement parsing method, system, computer device and storage medium
CN112256721B (en) * 2020-10-21 2021-08-17 平安科技(深圳)有限公司 SQL statement parsing method, system, computer device and storage medium
WO2021179722A1 (en) * 2020-10-21 2021-09-16 平安科技(深圳)有限公司 Sql statement parsing method and system, and computer device and storage medium
CN112328599A (en) * 2020-11-12 2021-02-05 杭州数梦工场科技有限公司 Metadata-based field blood relationship analysis method and device
CN112989412A (en) * 2021-03-18 2021-06-18 城云科技(中国)有限公司 Data desensitization method and device based on SQL statement analysis
CN114003231A (en) * 2021-09-28 2022-02-01 厦门国际银行股份有限公司 SQL syntax parse tree optimization method and system
CN115906178A (en) * 2022-12-23 2023-04-04 星环信息科技(上海)股份有限公司 Database management method, data subscription end and data publishing end
CN116303370A (en) * 2023-05-17 2023-06-23 建信金融科技有限责任公司 Script blood margin analysis method, script blood margin analysis device, storage medium, script blood margin analysis equipment and script blood margin analysis product
CN116303370B (en) * 2023-05-17 2023-08-15 建信金融科技有限责任公司 Script blood margin analysis method, script blood margin analysis device, storage medium, script blood margin analysis equipment and script blood margin analysis product

Similar Documents

Publication Publication Date Title
CN111125758A (en) Dynamic desensitization method based on full syntax tree analysis
CN109614432B (en) System and method for acquiring data blood relationship based on syntactic analysis
CN104123288B (en) A kind of data query method and device
US7904488B2 (en) Time stamp methods for unified plant model
CN102609402B (en) Device and method for generation and management of ontology model based on real-time strategy
Peng et al. Adaptive distributed RDF graph fragmentation and allocation based on query workload
WO2018130142A1 (en) Statement parsing method for database statement
CN100561471C (en) Data base automatic operation method based on web service
KR102099069B1 (en) Hybrid ERD Management System, and method thereof
CN111198898B (en) Big data query method and big data query device
CN114756569A (en) Multi-layer parsing method of structured query statement, computer device and storage medium
WO2018059430A1 (en) Database searching
CN115269631A (en) Data query method, data query system, device and storage medium
CN114861229B (en) Hive dynamic desensitization method and system
CN116301755A (en) Automatic batch flow data marking framework construction method based on directed calculation graph
KR102202792B1 (en) Method and device for performing multi-caching on data sources of same or different types by using cluster-based processing system
CN105183736A (en) Universal searching system according to network equipment configuration and state information, and universal searching method thereof
CN111158653B (en) SQL language-based integrated development and execution system for real-time computing program
Akbar et al. An approach for refactoring in model layer on MVC based web application
Li et al. Research on, and development of, data extraction and data cleaning technology based on the internet of things
CN114880351B (en) Recognition method and device of slow query statement, storage medium and electronic equipment
WO2020238597A1 (en) Hadoop-based data updating method, device, system and medium
Wang et al. Constructing data warehouses based on operational metadata-driven builder pattern
Juan et al. A framework of ontology management system
KR100501904B1 (en) Database replication and synchronization method using object identifier

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200508