CN111125758A - Dynamic desensitization method based on full syntax tree analysis - Google Patents
- Publication number
- CN111125758A
- Authority
- CN
- China
- Prior art keywords
- field
- syntax tree
- information
- matching
- desensitization
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6218—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
- G06F21/6245—Protecting personal data, e.g. for financial or medical purposes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
- G06F16/2246—Trees, e.g. B+trees
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/242—Query formulation
- G06F16/2433—Query languages
Abstract
The invention relates to a dynamic desensitization method based on full syntax tree parsing, comprising the following steps: performing full syntax tree parsing on the original SQL statement; matching the parsed fields against the field information that requires desensitization, using fast access, to obtain a matching result; if the matching is unsuccessful, passing the statement through unchanged; if the matching is successful, entering the syntax tree desensitization logic, rewriting the syntax tree, and restoring the rewritten syntax tree to an SQL statement. In the method, the user configures the sensitive fields; the SQL statement is parsed into a syntax tree, from which the logical relationships of the statement are obtained; all parsed field information is then obtained by fast traversal to determine which fields are queried and which tables they belong to; finally, the configured sensitive field information is matched against the result of the full syntax tree parse, thereby achieving desensitization.
Description
Technical Field
The invention belongs to the technical field of database security, and particularly relates to a dynamic desensitization method based on full syntax tree analysis.
Background
With the widespread use of big data, the protection of personal information faces unprecedented challenges. Protecting personal privacy information is the key problem that database desensitization must solve. Database desensitization is a technique that transforms sensitive information according to desensitization rules so that sensitive private data is reliably protected. Dynamic desensitization processes the data returned by a production database in real time, so that the returned data remains both usable and safe.
Existing dynamic desensitization methods can prevent users from operating on certain sensitive fields by assigning different permissions to different users, but this places many restrictions on how users can work with the application. Dynamic desensitization can also be implemented by rewriting SQL statements, but rewriting at the statement level has many limitations of its own: because such rewriting does not take the hierarchical structure of the SQL statement into account, desensitization easily fails.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a dynamic desensitization method based on full syntax tree analysis, which is reasonable in design, high in accuracy, strong in real-time performance, safe and reliable.
The technical problem to be solved by the invention is realized by adopting the following technical scheme:
A dynamic desensitization method based on full syntax tree parsing comprises the following steps:
step 1, performing full syntax tree parsing on the original SQL statement;
step 2, matching the parsed fields against the field information requiring desensitization, using fast access, to obtain a matching result;
step 3, if the matching is unsuccessful, passing the statement through directly; if the matching is successful, entering the syntax tree desensitization logic, rewriting the syntax tree, and restoring the rewritten syntax tree.
Step 1 is implemented as follows: the root node of the parsed SQL statement is a query, and the field information, table information, where information, and order information are held in the leaf nodes of the query; the query, field, and table results of the parse are each kept in their own block of memory.
The matching result obtained in step 2 comprises the fields that match the desensitization information and the * field.
Step 3 is implemented as follows:
⑴ if the field is not *, go to step ⑵; otherwise perform the following sub-steps:
① traverse the table information associated with the *;
② examine each table in the table list; if a table is a sub-query, expand the node to obtain a new query structure and return to step 2;
③ if a table is an association (join) table, split it and re-enter sub-step ② for each part;
④ if a table is an ordinary table, interact with the database using the table information to obtain the fields of that table, replace the * of the current node with the fields obtained, and go to step ⑺;
⑵ if the field is a function, examine the parameters of the function and enter step ⑴ for each parameter;
⑶ if the field has the form a + b, split the current field into leaf nodes and enter step ⑴ for each leaf node in turn;
⑷ if the field is a sub-query, obtain a new query structure for the node and return to step 2;
⑸ if the field is an ordinary field, take the current field together with the table it belongs to, match them against the desensitization information, and determine whether the matching is successful;
⑹ if the matching is successful, the current field is a sensitive field: rewrite it for desensitization and modify the node information of the current syntax tree; otherwise the current field is not sensitive and the current node remains unchanged;
⑺ write the rewritten syntax tree back to obtain the rewritten SQL statement.
The invention has the advantages and positive effects that:
The user configures the sensitive fields; the SQL statement is parsed into a syntax tree, from which the logical relationships of the statement are obtained; all parsed field information is then obtained by fast traversal to determine which fields are queried and which tables they belong to; finally, the configured sensitive field information is matched against the result of the full syntax tree parse, thereby achieving desensitization.
Drawings
FIG. 1 is a schematic diagram of an application of the present invention;
FIG. 2 is a syntax tree model of the full syntax tree parsing of the present invention;
FIG. 3 is a process flow diagram of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings.
The working principle of the invention is shown in Fig. 1. The invention sits between the client and the database. The client sends an SQL statement toward the database; the dynamic desensitization method intercepts and parses the statement, rewrites it for desensitization, and sends the rewritten statement to the database; the database then returns the desensitized result, so that sensitive data is masked.
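The placement described above can be sketched as a thin proxy in front of a database connection. This is only an illustrative sketch, not the patented implementation: `DesensitizingProxy`, `stub_rewrite`, and `mask_phone` are hypothetical names, an in-memory SQLite database stands in for the production database, and the rewrite step is a string-replacement stub standing in for the syntax-tree logic of steps 1 to 3.

```python
import sqlite3
from typing import Callable, List, Tuple


def mask_phone(value):
    """Hypothetical masking rule: keep the first 3 and last 2 characters."""
    if value is None or len(value) <= 5:
        return value
    return value[:3] + "*" * (len(value) - 5) + value[-2:]


def stub_rewrite(sql: str) -> str:
    # Stand-in for steps 1-3. A real rewriter would work on the syntax tree,
    # not on the statement text.
    return sql.replace("SELECT phone", "SELECT mask_phone(phone) AS phone")


class DesensitizingProxy:
    """Sits between the client and the database and rewrites each statement."""

    def __init__(self, conn: sqlite3.Connection, rewrite: Callable[[str], str]):
        self.conn = conn
        self.rewrite = rewrite
        # Register the masking function so the database itself applies it.
        conn.create_function("mask_phone", 1, mask_phone)

    def execute(self, sql: str) -> List[Tuple]:
        # The client never sees the result of the original statement.
        return self.conn.execute(self.rewrite(sql)).fetchall()


def demo() -> List[Tuple]:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, phone TEXT)")
    conn.execute("INSERT INTO users VALUES ('alice', '13812345678')")
    proxy = DesensitizingProxy(conn, stub_rewrite)
    return proxy.execute("SELECT phone FROM users")
```

Because the masking function runs inside the database, the unmasked value never leaves it; the proxy only changes the statement that is sent.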
A dynamic desensitization method based on full syntax tree parsing, as shown in fig. 3, includes the following steps:
Step 1, performing full syntax tree parsing on the original SQL statement.
As shown in Fig. 2, the original SQL statement is parsed as follows: the root node of the SQL statement is a query (a sub-query is also a query node, but at a lower level of the hierarchy), and the field information, table information, where information, order information, and so on are held in the leaf nodes of the query. These leaf nodes may themselves be expandable, that is, they may hold compound information; for example, the sub-query in the table list of Fig. 2 can be expanded. Table or field entries at the same level are linked in the form of a linked list, which associates them with one another and ensures that the order in which the original statement is restored from the syntax tree is not affected.
In this step, the query, field, and table results of the syntax tree parse are each kept in their own block of memory, so that fast traversal is possible: only the fields to be desensitized and the field contents need to be traversed in sequence.
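One possible in-memory shape for such a tree is sketched below. The class and function names (`Node`, `Query`, `chain`, `walk`, `field_names`) are assumptions made for illustration and do not come from the patent; the sketch only shows a query node with separate field and table lists, sibling links that preserve statement order, and a traversal that visits field entries without touching the where or order parts.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class Node:
    kind: str                        # "field" or "table"
    text: str = ""                   # field or table name; empty for a sub-query
    child: Optional["Query"] = None  # set when this node expands into a sub-query
    next: Optional["Node"] = None    # next sibling at the same level


@dataclass
class Query:
    fields: Optional[Node] = None    # head of the field list
    tables: Optional[Node] = None    # head of the table list
    where: Optional[str] = None
    order: Optional[str] = None


def chain(kind: str, *items) -> Optional[Node]:
    """Link items as siblings, left to right, so statement order is kept."""
    head = None
    for item in reversed(items):
        if isinstance(item, Query):
            head = Node(kind, child=item, next=head)
        else:
            head = Node(kind, text=item, next=head)
    return head


def walk(head: Optional[Node]) -> Iterator[Node]:
    """Follow the sibling links of one list."""
    while head is not None:
        yield head
        head = head.next


def field_names(q: Query) -> Iterator[str]:
    """Visit field entries only, descending into sub-queries in either list."""
    for n in walk(q.fields):
        if n.child is not None:
            yield from field_names(n.child)
        else:
            yield n.text
    for n in walk(q.tables):
        if n.child is not None:
            yield from field_names(n.child)
```

For a statement such as `select a, b from t1, (select c from t2) where a > 1`, the outer query would hold the field list `a -> b` and the table list `t1 -> sub-query`, and `field_names` would yield `a`, `b`, `c`.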
Step 2, matching the parsed fields against the field information requiring desensitization, using fast access. If nothing matches, the statement is passed through directly; if the parsed result contains field information requiring desensitization, the method enters step 3.
In this step, if the matched field is *, the table information must also be examined; if the table information matches, desensitization may be required, and the method likewise enters step 3.
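The pre-check of step 2 might look like the following sketch. It assumes, purely for illustration, that the sensitive-field configuration is a set of (table, column) pairs and that the field and table names have already been collected from the parse; `quick_match` is a hypothetical name.

```python
from typing import Iterable, Set, Tuple

SensitiveConfig = Set[Tuple[str, str]]  # assumed shape: {(table, column), ...}


def quick_match(field_names: Iterable[str],
                table_names: Iterable[str],
                sensitive: SensitiveConfig) -> bool:
    """Return True when the statement may need rewriting (enter step 3)."""
    sensitive_columns = {column for _, column in sensitive}
    sensitive_tables = {table for table, _ in sensitive}
    tables = list(table_names)
    for name in field_names:
        if name == "*":
            # A wildcard hides the column names, so fall back to the tables.
            if any(t in sensitive_tables for t in tables):
                return True
        elif name in sensitive_columns:
            return True
    return False  # nothing matched: pass the statement through unchanged
```

The check is deliberately coarse: a hit only means that step 3 should examine the tree in detail, which is where a column is matched together with the table it belongs to.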
Step 3, desensitizing according to the matching result.
The specific processing of this step is shown in Fig. 3 and includes the following steps:
⑴ Determine whether the field is * (this mainly concerns the field information of the outermost layer of the syntax tree). If so, proceed as follows:
① Traverse the table information associated with the *.
② Examine each table in the table list. If a table is a sub-query, the node can be expanded; once expanded it is a new query structure, so processing returns to step 2, that is, to the query-level decision.
③ If a table is an association table (for example, t1 join t2), it must be split into leaf nodes that cannot be expanded further, and each resulting node re-enters the table decision logic, that is, sub-step ②.
④ If a table is an ordinary table (general format), the method interacts with the database using the table information to obtain the fields of that table, replaces the * of the current node with those fields, and goes to step ⑺.
⑵ If the field is a function, the parameters of the function must be examined; each parameter enters the field decision logic, that is, step ⑴.
⑶ If the field has a form such as a + b, it must be split into leaf nodes (single fields that cannot be expanded further), each of which then enters the field decision logic, that is, step ⑴.
⑷ If the field is a sub-query, the node can be expanded into a new query structure, and processing returns to step 2 to re-enter the query decision structure.
⑸ If the field is in the general format, the current field and the table it belongs to are obtained and matched together against the configured desensitization information. A field may be associated with several tables, in which case each must be matched in turn. For example, in select a from t1, t2 the field a is associated with two tables, whereas in select t.a from t, t1 the field a is associated with one table, because the qualifier specifies the table it belongs to.
⑹ If the matching is successful, the current field is a sensitive field and must be rewritten, that is, wrapped in a desensitization function, and the node information of the current syntax tree is modified accordingly. If the current field is not a sensitive field, the current node remains unchanged.
⑺ Write the rewritten syntax tree back to obtain the rewritten SQL statement.
Through the above logic the syntax tree has been rewritten. Because the syntax tree preserves the logical order of the statement and can be restored to an SQL statement, restoring the rewritten syntax tree yields the rewritten SQL statement.
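The decision logic of step 3 can be sketched as a recursive tree rewrite. Everything named here is an assumption for illustration: the node classes, the `mask` function name, the `catalog` dictionary (standing in for the interaction with the database that returns a table's fields), and the set of sensitive (table, column) pairs. The sketch omits where and order clauses, join conditions, and table aliases, and it sends the columns produced by expanding * through the matching of steps ⑸ and ⑹, which is one reading of the procedure rather than something the text states.

```python
from dataclasses import dataclass

MASK = "mask"  # hypothetical desensitization function available in the database

@dataclass
class Col:            # a or t.a
    name: str
    table: str = ""

@dataclass
class Star:           # * or t.*
    table: str = ""

@dataclass
class Func:           # f(arg, ...)
    name: str
    args: list

@dataclass
class BinOp:          # a + b
    op: str
    left: object
    right: object

@dataclass
class Join:           # association table: t1 join t2
    left: object
    right: object

@dataclass
class Query:          # tables holds table names (str), Join, or Query entries
    fields: list
    tables: list
    alias: str = ""

def leaves(tables):
    # Sub-step 3: split association tables until only leaves remain.
    for t in tables:
        if isinstance(t, Join):
            yield from leaves([t.left, t.right])
        else:
            yield t

def rewrite_table(t, ctx):
    if isinstance(t, Join):
        return Join(rewrite_table(t.left, ctx), rewrite_table(t.right, ctx))
    return rewrite(t, ctx) if isinstance(t, Query) else t  # sub-query: new query

def rewrite_expr(e, names, ctx):
    if isinstance(e, Func):                      # step (2): check each parameter
        return Func(e.name, [rewrite_expr(a, names, ctx) for a in e.args])
    if isinstance(e, BinOp):                     # step (3): split into leaves
        return BinOp(e.op, rewrite_expr(e.left, names, ctx), rewrite_expr(e.right, names, ctx))
    if isinstance(e, Query):                     # step (4): new query structure
        return rewrite(e, ctx)
    if isinstance(e, Col):                       # steps (5) and (6)
        candidates = [e.table] if e.table else names
        if any((t, e.name) in ctx[1] for t in candidates):
            return Func(MASK, [e])               # wrap in the masking function
    return e

def rewrite(q, ctx):
    catalog, _ = ctx                             # ctx = (catalog, sensitive)
    tables = [rewrite_table(t, ctx) for t in q.tables]
    sources = list(leaves(tables))
    names = [s for s in sources if isinstance(s, str)]
    fields = []
    for f in q.fields:
        if not isinstance(f, Star):
            fields.append(rewrite_expr(f, names, ctx))
            continue
        for src in sources:                      # step (1): expand the wildcard
            label = src.alias if isinstance(src, Query) else src
            if f.table not in ("", label):
                continue
            if isinstance(src, Query):
                fields.append(Star(label))       # inner query is already rewritten
            else:
                fields += [rewrite_expr(Col(c, src), names, ctx) for c in catalog[src]]
    return Query(fields, tables, q.alias)

def to_sql(n, top=True):
    # Step (7): restore a statement from the tree.
    s = lambda x: to_sql(x, False)
    if isinstance(n, str):
        return n
    if isinstance(n, (Col, Star)):
        name = n.name if isinstance(n, Col) else "*"
        return f"{n.table}.{name}" if n.table else name
    if isinstance(n, Func):
        return f"{n.name}({', '.join(map(s, n.args))})"
    if isinstance(n, (BinOp, Join)):
        op = n.op if isinstance(n, BinOp) else "JOIN"
        return f"{s(n.left)} {op} {s(n.right)}"
    sql = f"SELECT {', '.join(map(s, n.fields))} FROM {', '.join(map(s, n.tables))}"
    if top:
        return sql
    return f"({sql}) {n.alias}" if n.alias else f"({sql})"
```

With `ctx = ({"users": ["id", "phone"]}, {("users", "phone")})`, calling `to_sql(rewrite(Query([Star()], ["users"]), ctx))` gives `SELECT users.id, mask(users.phone) FROM users`. An unqualified field is checked against every plain table in the query, while a qualified field is checked only against its own table, mirroring the two select examples in step ⑸.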
It should be emphasized that the embodiments described herein are illustrative rather than restrictive, and thus the present invention is not limited to the embodiments described in the detailed description, but other embodiments derived from the technical solutions of the present invention by those skilled in the art are also within the scope of the present invention.
Claims (4)
1. A dynamic desensitization method based on full syntax tree parsing, characterized by comprising the following steps:
step 1, performing full syntax tree parsing on the original SQL statement;
step 2, matching the parsed fields against the field information requiring desensitization, using fast access, to obtain a matching result;
step 3, if the matching is unsuccessful, passing the statement through directly; if the matching is successful, entering the syntax tree desensitization logic, rewriting the syntax tree, and restoring the rewritten syntax tree.
2. The dynamic desensitization method based on full syntax tree parsing according to claim 1, characterized in that step 1 is implemented as follows: the root node of the parsed SQL statement is a query, and the field information, table information, where information, and order information are held in the leaf nodes of the query; the query, field, and table results of the parse are each kept in their own block of memory.
3. The dynamic desensitization method based on full syntax tree parsing according to claim 1, characterized in that the matching result obtained in step 2 comprises the fields that match the desensitization information and the * field.
4. The dynamic desensitization method based on full syntax tree parsing according to claim 1, characterized in that step 3 is implemented as follows:
⑴ if the field is not *, go to step ⑵; otherwise perform the following sub-steps:
① traverse the table information associated with the *;
② examine each table in the table list; if a table is a sub-query, expand the node to obtain a new query structure and return to step 2;
③ if a table is an association (join) table, split it and re-enter sub-step ② for each part;
④ if a table is an ordinary table, interact with the database using the table information to obtain the fields of that table, replace the * of the current node with the fields obtained, and go to step ⑺;
⑵ if the field is a function, examine the parameters of the function and enter step ⑴ for each parameter;
⑶ if the field has the form a + b, split the current field into leaf nodes and enter step ⑴ for each leaf node in turn;
⑷ if the field is a sub-query, obtain a new query structure for the node and return to step 2;
⑸ if the field is an ordinary field, take the current field together with the table it belongs to, match them against the desensitization information, and determine whether the matching is successful;
⑹ if the matching is successful, the current field is a sensitive field: rewrite it for desensitization and modify the node information of the current syntax tree; otherwise the current field is not sensitive and the current node remains unchanged;
⑺ write the rewritten syntax tree back to obtain the rewritten SQL statement.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911313880.7A CN111125758A (en) | 2019-12-19 | 2019-12-19 | Dynamic desensitization method based on full syntax tree analysis |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111125758A (en) | 2020-05-08 |
Family
ID=70498403
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911313880.7A Pending CN111125758A (en) | 2019-12-19 | 2019-12-19 | Dynamic desensitization method based on full syntax tree analysis |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111125758A (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20060122993A1 (en) * | 2004-12-06 | 2006-06-08 | International Business Machines Corporation | Abstract query plan |
CN106203170A (en) * | 2016-07-19 | 2016-12-07 | 北京同余科技有限公司 | The Database Dynamic desensitization method of servicing of based role and system |
CN106778288A (en) * | 2015-11-24 | 2017-05-31 | 阿里巴巴集团控股有限公司 | A kind of method and system of data desensitization |
CN107194270A (en) * | 2017-04-07 | 2017-09-22 | 广东精点数据科技股份有限公司 | A kind of system and method for realizing data desensitization |
CN109426725A (en) * | 2017-08-22 | 2019-03-05 | 中兴通讯股份有限公司 | Data desensitization method, equipment and computer readable storage medium |
CN109902514A (en) * | 2019-03-07 | 2019-06-18 | 杭州比智科技有限公司 | A kind of data desensitization control system, method, server and storage medium |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111767300A (en) * | 2020-05-11 | 2020-10-13 | 全球能源互联网研究院有限公司 | Dynamic desensitization method and device for penetration of internal and external networks of electric power data |
CN112214796A (en) * | 2020-10-19 | 2021-01-12 | 上海观安信息技术股份有限公司 | Dynamic desensitization method based on menu |
CN112256721A (en) * | 2020-10-21 | 2021-01-22 | 平安科技(深圳)有限公司 | SQL statement parsing method, system, computer device and storage medium |
CN112256721B (en) * | 2020-10-21 | 2021-08-17 | 平安科技(深圳)有限公司 | SQL statement parsing method, system, computer device and storage medium |
WO2021179722A1 (en) * | 2020-10-21 | 2021-09-16 | 平安科技(深圳)有限公司 | Sql statement parsing method and system, and computer device and storage medium |
CN112328599A (en) * | 2020-11-12 | 2021-02-05 | 杭州数梦工场科技有限公司 | Metadata-based field blood relationship analysis method and device |
CN112989412A (en) * | 2021-03-18 | 2021-06-18 | 城云科技(中国)有限公司 | Data desensitization method and device based on SQL statement analysis |
CN114003231A (en) * | 2021-09-28 | 2022-02-01 | 厦门国际银行股份有限公司 | SQL syntax parse tree optimization method and system |
CN115906178A (en) * | 2022-12-23 | 2023-04-04 | 星环信息科技(上海)股份有限公司 | Database management method, data subscription end and data publishing end |
CN116303370A (en) * | 2023-05-17 | 2023-06-23 | 建信金融科技有限责任公司 | Script blood margin analysis method, script blood margin analysis device, storage medium, script blood margin analysis equipment and script blood margin analysis product |
CN116303370B (en) * | 2023-05-17 | 2023-08-15 | 建信金融科技有限责任公司 | Script blood margin analysis method, script blood margin analysis device, storage medium, script blood margin analysis equipment and script blood margin analysis product |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111125758A (en) | Dynamic desensitization method based on full syntax tree analysis | |
CN109614432B (en) | System and method for acquiring data blood relationship based on syntactic analysis | |
CN104123288B (en) | A kind of data query method and device | |
US7904488B2 (en) | Time stamp methods for unified plant model | |
CN102609402B (en) | Device and method for generation and management of ontology model based on real-time strategy | |
Peng et al. | Adaptive distributed RDF graph fragmentation and allocation based on query workload | |
WO2018130142A1 (en) | Statement parsing method for database statement | |
CN100561471C (en) | Data base automatic operation method based on web service | |
KR102099069B1 (en) | Hybrid ERD Management System, and method thereof | |
CN111198898B (en) | Big data query method and big data query device | |
CN114756569A (en) | Multi-layer parsing method of structured query statement, computer device and storage medium | |
WO2018059430A1 (en) | Database searching | |
CN115269631A (en) | Data query method, data query system, device and storage medium | |
CN114861229B (en) | Hive dynamic desensitization method and system | |
CN116301755A (en) | Automatic batch flow data marking framework construction method based on directed calculation graph | |
KR102202792B1 (en) | Method and device for performing multi-caching on data sources of same or different types by using cluster-based processing system | |
CN105183736A (en) | Universal searching system according to network equipment configuration and state information, and universal searching method thereof | |
CN111158653B (en) | SQL language-based integrated development and execution system for real-time computing program | |
Akbar et al. | An approach for refactoring in model layer on MVC based web application | |
Li et al. | Research on, and development of, data extraction and data cleaning technology based on the internet of things | |
CN114880351B (en) | Recognition method and device of slow query statement, storage medium and electronic equipment | |
WO2020238597A1 (en) | Hadoop-based data updating method, device, system and medium | |
Wang et al. | Constructing data warehouses based on operational metadata-driven builder pattern | |
Juan et al. | A framework of ontology management system | |
KR100501904B1 (en) | Database replication and synchronization method using object identifier |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | ||
Application publication date: 20200508 |