CN107644073A - A field lineage (consanguinity) analysis method, system and device based on depth-first traversal
Publication number: CN107644073A
Application number: CN201710842320.5A
Authority: CN (China)
Keywords: traversal; depth; field; syntax tree; abstract syntax
Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classification: Information Retrieval, Db Structures And Fs Structures Therefor
Abstract
The invention discloses a field lineage (consanguinity) analysis method, system and device based on depth-first traversal. The method comprises: converting the SQL statement to be analyzed to generate an abstract syntax tree; performing a depth-first traversal of the generated abstract syntax tree; and performing verification parsing on the result of the depth-first traversal to obtain the lineage relationships between fields in the SQL statement. The system comprises a conversion processing module, a depth-first traversal module, and a verification parsing module. The device comprises a memory and a processor. By analyzing the lineage relationships between fields through depth-first traversal, the invention reduces analysis complexity and improves analysis precision. In addition, the method processes SQL statements directly and does not depend on an existing metadata system, which further reduces the complexity of association analysis and improves compatibility. The invention can be widely applied in the field of association analysis.
Description
Technical field
The present invention relates to the field of association analysis, and in particular to a field lineage analysis method, system and device based on depth-first traversal.
Background
With the modernization of land tax business systems and the rapid move toward integrated processing, the range of products has expanded rapidly and the relationships between products have become increasingly complex. At the same time, fast-changing business requirements in the Internet era have substantially increased the workload and complexity of system development, testing, and operation and maintenance. The main causes of the high complexity of current land tax information systems are as follows. First, system scale grows as products are extended, producing very large bodies of program and software resources; systems are even split into multiple subsystems that are developed and maintained by different organizations. Second, early land tax core system architectures used process-oriented design techniques, so isolation and encapsulation between program modules are relatively poor and coupling between subsystems is high. Third, business processes change often and implementation cycles are short, so core code is modified frequently. For these reasons, accurately identifying the associations between software entities is essential for development, testing, workload assessment, and quality management.
Software entities include software resources such as programs, databases, files, common components, and the structure definitions of their communication interfaces. Association analysis of software entities can be divided into two levels: the entity level and the data-field level. Entity-level association refers to the coupling and call relationships between software entities. Data-field-level association refers to coupling relationships such as mapping, computation, and transfer between a data field of one software entity and the data fields of other software entities. The former is simple to perform and efficient to execute, but its analysis precision is low and its value as guidance for development is limited. The latter works from program source code (or intermediate code): through semantic analysis of program instructions it identifies the mapping and transfer relationships between the data fields of related software entities. Its analysis precision is high, and its results can be used to build impact-analysis views and business-logic views of a program, helping designers, testers, and operations staff assess in detail how software entities affect one another and determine the scope of a program change. It therefore has broader application prospects and higher practical value, but it is relatively complex to implement.
In addition, existing field-level lineage analysis must be paired with a specific metadata system. It therefore has poor compatibility and depends heavily on a particular system or platform, so existing field-level lineage analysis methods cannot be widely applied in the field of association analysis.
Summary of the invention
To solve the above technical problems, the first object of the present invention is to provide a field lineage analysis method based on depth-first traversal that has low complexity, high analysis precision, and good compatibility.
The second object of the present invention is to provide a field lineage analysis system based on depth-first traversal that has low complexity, high analysis precision, and good compatibility.
The third object of the present invention is to provide a field lineage analysis device based on depth-first traversal that has low complexity, high analysis precision, and good compatibility.
The first technical solution adopted by the present invention is:
A field lineage analysis method based on depth-first traversal, comprising the following steps:
Converting the SQL statement to be analyzed to generate an abstract syntax tree;
Performing a depth-first traversal of the generated abstract syntax tree;
Performing verification parsing on the result of the depth-first traversal to obtain the lineage relationships between fields in the SQL statement.
Further, the step of converting the SQL statement to be analyzed to generate an abstract syntax tree comprises the following steps:
Performing syntactic parsing on the SQL statement to generate an abstract syntax tree;
Traversing the nodes of the abstract syntax tree to generate query blocks.
Further, the step of traversing the nodes of the abstract syntax tree to generate query blocks comprises the following steps:
S1. Create a parent query block and take the parent query block as the current query block;
S2. Determine whether the node in the abstract syntax tree is TOK_FROM; if so, save the table-name syntax portion of the node to the aliasToTabs attribute of the current query block; otherwise, go to step S3;
S3. Determine whether the node in the abstract syntax tree is TOK_DESTINATION; if so, save the output-target syntax portion of the node to the nameToDest attribute of the current query block; otherwise, go to step S4;
S4. Determine whether the node in the abstract syntax tree is TOK_SELECT; if so, save the query-expression syntax portion of the node to the destToSelExpr, destToAggregationExprs, and destToDistinctFuncExprs attributes of the current query block; otherwise, go to step S5;
S5. Determine whether the node in the abstract syntax tree is TOK_WHERE; if so, save the condition syntax portion of the node to the destToWhereExpr attribute of the current query block; otherwise, go to step S6;
S6. Determine whether the node in the abstract syntax tree is TOK_INSERT; if so, create a subquery block, take the subquery block as the current query block, and return to step S2; otherwise, go to step S7;
S7. Output the parent query block and the subquery blocks.
Further, the step of outputting the parent query block and the subquery blocks comprises the following steps:
Performing a preorder traversal of the parent query block and the subquery blocks;
Traversing the result of the preorder traversal again to generate an intermediate table;
Determining whether the generated intermediate table satisfies a set condition; if so, parsing the fields in the intermediate table and then performing the next step; otherwise, performing the next step directly;
Performing information mining on the parent query block and the subquery blocks.
Further, the step of performing a depth-first traversal of the generated abstract syntax tree comprises the following steps:
Traversing the abstract syntax tree and storing the metadata of tables and the metadata of fields in a state machine;
Traversing the abstract syntax tree to obtain Union subqueries, Join subqueries, and SubSelect subqueries, and pushing the obtained Union, Join, and SubSelect subqueries onto a stack in their corresponding execution order;
Popping the stacked subqueries, performing lineage analysis on them, and storing the lineage analysis results in the state machine.
Further, the step of popping the stacked subqueries, performing lineage analysis on them, and storing the lineage analysis results in the state machine is specifically:
The state machine determines the statement type of the subquery currently popped:
If the current subquery is a Union subquery, field association is performed on the subquery and the result is stored in the state machine;
If the current subquery is a Join subquery or a SubSelect subquery, the subquery is stored in the state machine;
If the current subquery is not a Union, Join, or SubSelect subquery, lineage analysis is performed on the current subquery according to the tables and fields stored in the state machine.
Further, the step of performing verification parsing on the result of the depth-first traversal to obtain the lineage relationships between fields in the SQL statement comprises the following steps:
Inputting a first test case;
Parsing the first test case both manually and by machine;
Determining whether the manual parsing result and the machine parsing result agree; if so, performing the next step; otherwise, inputting a new test case as the first test case and returning to the step of parsing the first test case both manually and by machine;
Inputting a second test case and parsing the second test case by machine;
Analyzing the lineage relationship between the first test case and the second test case according to the machine parsing result of the first test case and the machine parsing result of the second test case;
Verifying the machine parsing result of the second test case to obtain the lineage relationships between fields in the SQL statement.
The second technical solution adopted by the present invention is:
A field lineage analysis system based on depth-first traversal, comprising:
A conversion processing module, for converting the SQL statement to be analyzed to generate an abstract syntax tree;
A depth-first traversal module, for performing a depth-first traversal of the generated abstract syntax tree;
A verification parsing module, for performing verification parsing on the result of the depth-first traversal to obtain the lineage relationships between fields in the SQL statement.
Further, the depth-first traversal module comprises:
A metadata storage module, for traversing the abstract syntax tree and storing the metadata of tables and the metadata of fields in a state machine;
A stack storage module, for traversing the abstract syntax tree to obtain Union subqueries, Join subqueries, and SubSelect subqueries, and pushing the subqueries onto a stack in their corresponding execution order;
A lineage analysis module, for popping the stacked subqueries, performing lineage analysis on them, and storing the lineage analysis results in the state machine.
The third technical solution adopted by the present invention is:
A field lineage analysis device based on depth-first traversal, comprising:
A memory, for storing a program;
A processor, for executing the program in order to: convert the SQL statement to be analyzed to generate an abstract syntax tree;
perform a depth-first traversal of the generated abstract syntax tree;
perform verification parsing on the result of the depth-first traversal to obtain the lineage relationships between fields in the SQL statement.
The beneficial effects of the method of the present invention are: the method performs a depth-first traversal of the abstract syntax tree and performs verification parsing on the traversal result to obtain the lineage relationships between fields in the SQL statement. Compared with existing association analysis methods based on the software entity level, it reduces analysis complexity and improves analysis precision. In addition, the method processes the SQL statement directly and does not depend on an existing metadata system, which further reduces the complexity of association analysis and improves compatibility.
The beneficial effects of the system of the present invention are: the system performs a depth-first traversal of the abstract syntax tree through the depth-first traversal module and performs verification parsing on the traversal result through the verification parsing module to obtain the lineage relationships between fields in the SQL statement. Compared with existing association analysis systems based on the software entity level, it reduces analysis complexity and improves analysis precision. In addition, the system processes the SQL statement directly and does not depend on an existing metadata system, which further reduces the complexity of association analysis and improves the compatibility of the system.
The beneficial effects of the device of the present invention are: the device performs a depth-first traversal of the abstract syntax tree through the processor and performs verification parsing on the traversal result to obtain the lineage relationships between fields in the SQL statement. Compared with existing association analysis systems based on the software entity level, it reduces complexity and improves analysis precision. In addition, the device processes the SQL statement directly and does not depend on an existing metadata system, which further reduces the complexity of association analysis and improves the compatibility of the device.
Brief description of the drawings
Fig. 1 is a flow chart of the steps of the field lineage analysis method based on depth-first traversal of the present invention;
Fig. 2 is a program module block diagram of the field lineage analysis system based on depth-first traversal of the present invention;
Fig. 3 is a flow chart of the steps of the verification parsing process in Embodiment 1 of the present invention.
Detailed description of embodiments
Referring to Fig. 1, a field lineage analysis method based on depth-first traversal comprises the following steps:
Converting the SQL statement to be analyzed to generate an abstract syntax tree;
Performing a depth-first traversal of the generated abstract syntax tree;
Performing verification parsing on the result of the depth-first traversal to obtain the lineage relationships between fields in the SQL statement.
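As a minimal illustration of the depth-first traversal step, the following Python sketch walks a small hand-built tree. The AstNode class, the depth_first function, the ROOT label, and the sample tree are assumptions made for illustration only; the patent does not specify a node structure, and in practice the tree would be produced by the parser.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class AstNode:
    """Hypothetical AST node: a label plus an ordered list of children."""
    label: str
    children: List["AstNode"] = field(default_factory=list)


def depth_first(node: AstNode) -> Iterator[AstNode]:
    """Yield the node itself, then each subtree from left to right."""
    yield node
    for child in node.children:
        yield from depth_first(child)


# Hand-built stand-in for the tree of: INSERT INTO t2 SELECT a FROM t1
tree = AstNode("ROOT", [
    AstNode("TOK_FROM", [AstNode("t1")]),
    AstNode("TOK_INSERT", [
        AstNode("TOK_DESTINATION", [AstNode("t2")]),
        AstNode("TOK_SELECT", [AstNode("a")]),
    ]),
])

visited = [n.label for n in depth_first(tree)]
print(visited)
```

Each subtree is fully visited before its next sibling, so the source table under TOK_FROM is seen before the TOK_INSERT subtree in this sample.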
As a further preferred embodiment, the step of converting the SQL statement to be analyzed to generate an abstract syntax tree comprises the following steps:
Performing syntactic parsing on the SQL statement to generate an abstract syntax tree;
Traversing the nodes of the abstract syntax tree to generate query blocks.
Here, a query block corresponds to each query statement in the abstract syntax tree and is used to save the syntax portions of the corresponding nodes.
As a further preferred embodiment, the step of traversing the nodes of the abstract syntax tree to generate query blocks comprises the following steps:
S1. Create a parent query block and take the parent query block as the current query block;
S2. Determine whether the node in the abstract syntax tree is TOK_FROM; if so, save the table-name syntax portion of the node to the aliasToTabs attribute of the current query block; otherwise, go to step S3;
S3. Determine whether the node in the abstract syntax tree is TOK_DESTINATION; if so, save the output-target syntax portion of the node to the nameToDest attribute of the current query block; otherwise, go to step S4;
S4. Determine whether the node in the abstract syntax tree is TOK_SELECT; if so, save the query-expression syntax portion of the node to the destToSelExpr, destToAggregationExprs, and destToDistinctFuncExprs attributes of the current query block; otherwise, go to step S5;
S5. Determine whether the node in the abstract syntax tree is TOK_WHERE; if so, save the condition syntax portion of the node to the destToWhereExpr attribute of the current query block; otherwise, go to step S6;
S6. Determine whether the node in the abstract syntax tree is TOK_INSERT; if so, create a subquery block, take the subquery block as the current query block, and return to step S2; otherwise, go to step S7;
S7. Output the parent query block and the subquery blocks.
Here, TOK_FROM, TOK_DESTINATION, TOK_SELECT, TOK_WHERE, and TOK_INSERT are the labels of the corresponding nodes in the abstract syntax tree; aliasToTabs, nameToDest, destToSelExpr, destToAggregationExprs, destToDistinctFuncExprs, and destToWhereExpr are the corresponding attribute identifiers in a query block.
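Steps S1 to S7 can be sketched in Python as follows. This is only one possible reading: the QueryBlock class, the build_query_blocks function, and the flat list of (label, syntax text) pairs standing in for the visited nodes are assumptions for illustration. For brevity the sketch stores select expressions only in destToSelExpr and omits destToAggregationExprs and destToDistinctFuncExprs.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class QueryBlock:
    """Hypothetical query block holding the attributes named in S2 to S5."""
    aliasToTabs: List[str] = field(default_factory=list)
    nameToDest: List[str] = field(default_factory=list)
    destToSelExpr: List[str] = field(default_factory=list)
    destToWhereExpr: List[str] = field(default_factory=list)


def build_query_blocks(nodes: List[Tuple[str, str]]) -> List[QueryBlock]:
    parent = QueryBlock()          # S1: the parent block is the current block
    blocks = [parent]
    current = parent
    for label, text in nodes:
        if label == "TOK_FROM":                # S2: table-name portion
            current.aliasToTabs.append(text)
        elif label == "TOK_DESTINATION":       # S3: output-target portion
            current.nameToDest.append(text)
        elif label == "TOK_SELECT":            # S4: query-expression portion
            current.destToSelExpr.append(text)
        elif label == "TOK_WHERE":             # S5: condition portion
            current.destToWhereExpr.append(text)
        elif label == "TOK_INSERT":            # S6: new subquery block becomes current
            current = QueryBlock()
            blocks.append(current)
    return blocks                              # S7: parent block first, then subquery blocks


# Assumed node sequence for: INSERT INTO t2 SELECT a FROM t1 WHERE a > 0
nodes = [
    ("TOK_FROM", "t1"),
    ("TOK_INSERT", ""),
    ("TOK_DESTINATION", "t2"),
    ("TOK_SELECT", "a"),
    ("TOK_WHERE", "a > 0"),
]
blocks = build_query_blocks(nodes)
print(blocks)
```

In this reading, everything that follows a TOK_INSERT node is recorded in the newly created subquery block, which matches the "return to step S2" wording of S6.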
As a further preferred embodiment, the step of outputting the parent query block and the subquery blocks comprises the following steps:
Performing a preorder traversal of the parent query block and the subquery blocks;
Traversing the result of the preorder traversal again to generate an intermediate table;
Determining whether the generated intermediate table satisfies a set condition; if so, parsing the fields in the intermediate table and then performing the next step; otherwise, performing the next step directly;
Performing information mining on the parent query block and the subquery blocks.
In SQL, a table is a two-dimensional structure used to store a data set, and its keyword is TABLE. The intermediate table in this embodiment is a table generated by the second traversal and used in the subsequent determination. Performing information mining on the parent query block and the subquery blocks means extracting information such as the identifiers and attributes stored in the respective query blocks.
As a further preferred embodiment, the step of performing a depth-first traversal of the generated abstract syntax tree comprises the following steps:
Traversing the abstract syntax tree and storing the metadata of tables and the metadata of fields in a state machine;
Traversing the abstract syntax tree to obtain Union subqueries, Join subqueries, and SubSelect subqueries, and pushing the obtained Union, Join, and SubSelect subqueries onto a stack in their corresponding execution order;
Popping the stacked subqueries, performing lineage analysis on them, and storing the lineage analysis results in the state machine.
Here, the state machine stores the data of the corresponding tables and fields. In SQL, Union subqueries, Join subqueries, and SubSelect subqueries are different types of query statement.
By adding a state machine that caches the information produced during depth-first traversal, the present invention can perform lineage analysis not only on operations such as create, insert, and update but also on change operations such as alter, which makes the analysis more comprehensive.
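The patent does not say how the state machine represents metadata or how an alter operation is processed. The Python sketch below shows one assumed arrangement: a MetadataState class (hypothetical) that records table columns on create and updates them on an alter that adds a column, so that a later statement can still resolve the new field.

```python
from typing import Dict, Iterable, List


class MetadataState:
    """Hypothetical cache of table and field metadata kept between statements."""

    def __init__(self) -> None:
        self.columns: Dict[str, List[str]] = {}   # table name -> column names

    def create(self, table: str, columns: Iterable[str]) -> None:
        # A create statement registers the table and its columns.
        self.columns[table] = list(columns)

    def alter_add_column(self, table: str, column: str) -> None:
        # An alter statement changes cached metadata instead of being ignored.
        self.columns.setdefault(table, []).append(column)

    def owners_of(self, column: str) -> List[str]:
        # Tables that currently declare the given column, in sorted order.
        return sorted(t for t, cols in self.columns.items() if column in cols)


state = MetadataState()
state.create("t1", ["a"])
before = state.owners_of("b")      # the column is not known yet
state.alter_add_column("t1", "b")
after = state.owners_of("b")       # resolvable once the alter has been applied
print(before, after)
```

Because the cache persists across statements, a field introduced by an alter can take part in the lineage analysis of any statement processed afterwards.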
As a further preferred embodiment, the step of popping the stacked subqueries, performing lineage analysis on them, and storing the lineage analysis results in the state machine is specifically:
The state machine determines the statement type of the subquery currently popped:
If the current subquery is a Union subquery, field association is performed on the subquery and the result is stored in the state machine;
If the current subquery is a Join subquery or a SubSelect subquery, the subquery is stored in the state machine;
If the current subquery is not a Union, Join, or SubSelect subquery, lineage analysis is performed on the current subquery according to the tables and fields stored in the state machine.
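The push, pop, and type dispatch described above might look like the following Python sketch. The Subquery and StateMachine classes, the "plain" kind, and the way a field association or a lineage entry is recorded are all assumptions; the patent names the three branches but not their data structures.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Subquery:
    kind: str            # assumed values: "union", "join", "subselect", "plain"
    name: str
    fields: List[str]


@dataclass
class StateMachine:
    tables: Dict[str, List[str]] = field(default_factory=dict)
    associations: List[Tuple[str, Tuple[str, ...]]] = field(default_factory=list)
    stored: List[str] = field(default_factory=list)
    lineage: List[Tuple[str, str, List[str]]] = field(default_factory=list)

    def handle(self, sq: Subquery) -> None:
        if sq.kind == "union":
            # Union: associate the fields and store the association.
            self.associations.append((sq.name, tuple(sq.fields)))
        elif sq.kind in ("join", "subselect"):
            # Join or SubSelect: store the subquery itself.
            self.stored.append(sq.name)
        else:
            # Any other statement: resolve each field against the cached tables.
            for f in sq.fields:
                sources = sorted(t for t, cols in self.tables.items() if f in cols)
                self.lineage.append((sq.name, f, sources))


def analyze(subqueries: List[Subquery], state: StateMachine) -> StateMachine:
    stack: List[Subquery] = []
    for sq in subqueries:          # push in the given execution order
        stack.append(sq)
    while stack:                   # pop and dispatch on the subquery type
        state.handle(stack.pop())
    return state


state = analyze(
    [Subquery("union", "u", ["x", "y"]),
     Subquery("join", "j", ["a"]),
     Subquery("plain", "p", ["a"])],
    StateMachine(tables={"t1": ["a", "b"]}),
)
print(state.associations, state.stored, state.lineage)
```

Because a stack is last-in, first-out, the subquery pushed last is analyzed first in this sketch.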
Referring to Fig. 3, as a further preferred embodiment, the step of performing verification parsing on the result of the depth-first traversal to obtain the lineage relationships between fields in the SQL statement comprises the following steps:
Inputting a first test case;
Parsing the first test case both manually and by machine;
Determining whether the manual parsing result and the machine parsing result agree; if so, performing the next step; otherwise, inputting a new test case as the first test case and returning to the step of parsing the first test case both manually and by machine;
Inputting a second test case and parsing the second test case by machine;
Analyzing the lineage relationship between the first test case and the second test case according to the machine parsing result of the first test case and the machine parsing result of the second test case;
Verifying the machine parsing result of the second test case to obtain the lineage relationships between fields in the SQL statement.
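The verification flow can be sketched in Python under several assumptions. The toy_machine_parse function is a deliberately tiny stand-in for the machine parser that only understands statements of the form INSERT INTO x SELECT cols FROM y; pick_first_case and chain are hypothetical helper names; and reading "the lineage relationship between the first and second test cases" as chaining their field edges is an interpretation, not something the patent states.

```python
import re
from typing import Callable, List, Optional, Set, Tuple

Edge = Tuple[str, str]   # (source field, target field)


def toy_machine_parse(sql: str) -> Set[Edge]:
    """Stand-in machine parser for 'INSERT INTO x SELECT cols FROM y' only."""
    m = re.fullmatch(r"INSERT INTO (\w+) SELECT ([\w, ]+) FROM (\w+)", sql.strip())
    if m is None:
        return set()
    target, cols, source = m.groups()
    return {(f"{source}.{c.strip()}", f"{target}.{c.strip()}") for c in cols.split(",")}


def pick_first_case(candidates: List[Tuple[str, Set[Edge]]],
                    machine_parse: Callable[[str], Set[Edge]]) -> Optional[str]:
    """Try candidates in turn until manual and machine results agree."""
    for sql, manual_result in candidates:
        if machine_parse(sql) == manual_result:
            return sql
    return None


def chain(first: Set[Edge], second: Set[Edge]) -> Set[Edge]:
    """Link edges where a target of the first case is a source of the second."""
    return {(s1, d2) for s1, d1 in first for s2, d2 in second if d1 == s2}


candidates = [
    # The manual result disagrees with the machine result, so this case is replaced.
    ("INSERT INTO t2 SELECT a FROM t0", {("t1.a", "t2.a")}),
    ("INSERT INTO t2 SELECT a, b FROM t1", {("t1.a", "t2.a"), ("t1.b", "t2.b")}),
]
first_sql = pick_first_case(candidates, toy_machine_parse)
first_edges = toy_machine_parse(first_sql)
second_edges = toy_machine_parse("INSERT INTO t3 SELECT a FROM t2")
cross_case = chain(first_edges, second_edges)
print(first_sql, cross_case)
```

The first candidate is rejected because the manual and machine results differ, which mirrors the loop back to inputting a new first test case.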
Referring to Fig. 2, a field lineage analysis system based on depth-first traversal of the present invention, corresponding to the method of Fig. 1, comprises:
A conversion processing module, for converting the SQL statement to be analyzed to generate an abstract syntax tree;
A depth-first traversal module, for performing a depth-first traversal of the generated abstract syntax tree;
A verification parsing module, for performing verification parsing on the result of the depth-first traversal to obtain the lineage relationships between fields in the SQL statement.
As a further preferred embodiment, the depth-first traversal module comprises:
A metadata storage module, for traversing the abstract syntax tree and storing the metadata of tables and the metadata of fields in a state machine;
A stack storage module, for traversing the abstract syntax tree to obtain Union subqueries, Join subqueries, and SubSelect subqueries, and pushing the subqueries onto a stack in their corresponding execution order;
A lineage analysis module, for popping the stacked subqueries, performing lineage analysis on them, and storing the lineage analysis results in the state machine.
Corresponding to the method of Fig. 1, a field lineage analysis device based on depth-first traversal of the present invention comprises:
A memory, for storing a program;
A processor, for executing the program in order to: convert the SQL statement to be analyzed to generate an abstract syntax tree;
perform a depth-first traversal of the generated abstract syntax tree;
perform verification parsing on the result of the depth-first traversal to obtain the lineage relationships between fields in the SQL statement.
The present invention is described in further detail below with reference to the accompanying drawings and a specific embodiment.
Embodiment 1
To address the low analysis precision of existing entity-level association analysis and the complex implementation and poor compatibility of existing field-level association analysis, the present invention proposes a field lineage analysis method, system and device based on depth-first traversal. The invention analyzes the lineage relationships between fields by depth-first traversal, which greatly reduces analysis complexity and improves analysis precision. The invention also does not depend on an existing metadata system and performs lineage analysis solely by parsing SQL statements, which greatly improves compatibility and makes it applicable to any platform.
The invention is described in detail below in two parts: an explanation of terms and the specific workflow.
(1) Explanation of terms
antlr: antlr is an open-source parser generator that can automatically generate a syntax tree from its input and display it visually. antlr combines a lexical analyzer, a syntax analyzer, and a tree analyzer, and lets the user define lexical rules for recognizing a character stream and parsing rules for interpreting a token stream. From a grammar file supplied by the user, antlr automatically generates the corresponding lexer and parser, which the user can then use to compile input text and convert it into another form (such as an AST: Abstract Syntax Tree).
(2) Specific workflow
Referring to Fig. 1, the specific workflow of the field lineage analysis method based on depth-first traversal of the present invention is as follows:
Step 1: Parsing process.
The parsing process specifically comprises the following steps:
Inputting the SQL statement to be analyzed into the antlr parser;
The antlr parser converts the SQL statement into an abstract syntax tree according to specific lexical and syntax rules, while performing lexical and syntactic analysis on the SQL statement to be analyzed.
The step in which the antlr parser converts the SQL statement into an abstract syntax tree according to specific lexical and syntax rules, while performing lexical and syntactic analysis on the SQL statement to be analyzed, comprises the following steps:
Performing syntactic parsing on the SQL statement to generate an abstract syntax tree;
Traversing the nodes of the abstract syntax tree to generate query blocks.
Specifically, the step of traversing the nodes of the abstract syntax tree to generate query blocks comprises the following steps:
S1. Create a parent query block and take the parent query block as the current query block;
S2. Determine whether the node in the abstract syntax tree is TOK_FROM; if so, save the table-name syntax portion of the node to the aliasToTabs attribute of the current query block; otherwise, go to step S3;
S3. Determine whether the node in the abstract syntax tree is TOK_DESTINATION; if so, save the output-target syntax portion of the node to the nameToDest attribute of the current query block; otherwise, go to step S4;
S4. Determine whether the node in the abstract syntax tree is TOK_SELECT; if so, save the query-expression syntax portion of the node to the destToSelExpr, destToAggregationExprs, and destToDistinctFuncExprs attributes of the current query block; otherwise, go to step S5;
S5. Determine whether the node in the abstract syntax tree is TOK_WHERE; if so, save the condition syntax portion of the node to the destToWhereExpr attribute of the current query block; otherwise, go to step S6;
S6. Determine whether the node in the abstract syntax tree is TOK_INSERT; if so, create a subquery block, take the subquery block as the current query block, and return to step S2; otherwise, go to step S7;
S7. Output the parent query block and the subquery blocks.
The step of outputting the parent query block and the subquery blocks comprises the following steps:
Performing a preorder traversal of the parent query block and the subquery blocks;
Traversing the result of the preorder traversal again to generate an intermediate table;
Determining whether the generated intermediate table satisfies a set condition; if so, parsing the fields in the intermediate table and then performing the next step; otherwise, performing the next step directly;
Performing information mining on the parent query block and the subquery blocks.
Step 2: Depth-first traversal process.
The depth-first traversal process comprises the following steps:
Traversing the abstract syntax tree and storing the metadata of tables and the metadata of fields in a state machine;
Traversing the abstract syntax tree to obtain Union subqueries, Join subqueries, and SubSelect subqueries, and pushing the obtained Union, Join, and SubSelect subqueries onto a stack in their corresponding execution order;
Popping the stacked subqueries, performing lineage analysis on them, and storing the lineage analysis results in the state machine.
Here, the step of popping the stacked subqueries, performing lineage analysis on them, and storing the lineage analysis results in the state machine is specifically:
The state machine determines the statement type of the subquery currently popped:
If the current subquery is a Union subquery, field association is performed on the subquery and the result is stored in the state machine;
If the current subquery is a Join subquery, the subquery is stored in the state machine;
If the current subquery is a SubSelect subquery, the subquery is stored in the state machine;
If the current subquery is not a Union, Join, or SubSelect subquery, lineage analysis is performed on the current subquery according to the tables and fields stored in the state machine.
By parsing the SQL statement to be analyzed, the present invention obtains the corresponding input and output tables, the input and output fields, and the corresponding processing conditions, so that the lineage relationships between fields can be analyzed and displayed.
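One way to hold and display such a result is sketched below in Python. The LineageResult class, its field names, and the text format produced by render are illustrative assumptions; the patent does not define an output format.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class LineageResult:
    """Hypothetical container for what parsing one statement yields."""
    input_tables: List[str]
    output_tables: List[str]
    field_edges: List[Tuple[str, str]]   # (input field, output field)
    conditions: List[str]                # processing conditions, e.g. WHERE clauses


def render(result: LineageResult) -> List[str]:
    """Turn a result into display lines, one per field edge plus the conditions."""
    lines = [f"{src} -> {dst}" for src, dst in result.field_edges]
    if result.conditions:
        lines.append("where: " + " AND ".join(result.conditions))
    return lines


result = LineageResult(
    input_tables=["t1"],
    output_tables=["t2"],
    field_edges=[("t1.a", "t2.a"), ("t1.b", "t2.total")],
    conditions=["t1.a > 0"],
)
report = render(result)
print("\n".join(report))
```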
The depth-first traversal process of the present invention is implemented by traversing the abstract syntax tree of the SQL statement to be analyzed several times:
First, a traversal stores the metadata of the tables and fields it finds in the state machine and eliminates the influence of aliases;
A second traversal then analyzes the subquery statements containing Union and Join (under the Oracle SQL standard, subquery statements containing subSelect must also be analyzed) and pushes the analyzed subquery statements onto the stack in execution order;
Finally, each subquery statement is popped and traversed independently, and the corresponding field associations are stored in the state machine. The state machine determines the scope of influence of the subquery statement and then decides whether to merge or discard the field association. The result of each SQL statement analysis is retained in the state machine, so that each analysis result can supply metadata and association information for later analyses.
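Eliminating the influence of aliases in the first pass could, for instance, amount to rewriting each qualified field reference to use the real table name. The Python sketch below assumes a hypothetical resolve_alias helper and a simple alias-to-table dictionary; the patent does not describe the actual mechanism.

```python
from typing import Dict


def resolve_alias(field_ref: str, alias_to_table: Dict[str, str]) -> str:
    """Replace a leading table alias with the real table name, if one is known."""
    qualifier, sep, column = field_ref.partition(".")
    if not sep:
        # Unqualified reference: nothing to rewrite.
        return field_ref
    return f"{alias_to_table.get(qualifier, qualifier)}.{column}"


# Assumed alias map for: SELECT o.id FROM orders o
aliases = {"o": "orders"}
resolved = [resolve_alias(ref, aliases) for ref in ["o.id", "id", "t.x"]]
print(resolved)
```

With aliases resolved up front, the later passes can compare field names directly against the table metadata held in the state machine.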
Referring to Fig. 3, step 3: the verification parsing process.
The verification parsing process specifically comprises the following steps:
Input a first test case;
Parse the first test case both manually and by machine;
Judge whether the result of manual parsing is consistent with the result of machine parsing; if so, perform the next step; otherwise, input a new test case as the first test case and return to the step of parsing the first test case both manually and by machine;
Input a second test case and parse the second test case by machine;
According to the machine parsing result of the first test case and the machine parsing result of the second test case, analyze the genetic connection between the first test case and the second test case;
Perform a verification operation on the machine parsing result of the second test case to obtain the genetic connection between fields in the SQL statement.
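The verification flow can be sketched as below. The representation of a parsing result as a list of `(source_field, target_field)` edges, the function names and the stubbed machine parser are all assumptions made for the example; the patent does not specify how results are compared.

```python
def results_match(manual_edges, machine_edges):
    """Manual and machine results agree when they contain the same edges."""
    return set(manual_edges) == set(machine_edges)


def pick_first_case(candidates, machine_parse):
    """Try candidate (sql, manual_edges) pairs until manual and machine results agree."""
    for sql, manual_edges in candidates:
        machine_edges = machine_parse(sql)
        if results_match(manual_edges, machine_edges):
            return sql, machine_edges
    raise ValueError("no test case passed the manual/machine comparison")


def cross_case_links(first_edges, second_edges):
    """Edges of the second case whose source field is produced by the first case."""
    first_targets = {dst for _, dst in first_edges}
    return sorted((src, dst) for src, dst in second_edges if src in first_targets)


# Stub standing in for the machine parser: a lookup of pre-computed results.
parsed = {
    "sql1": [("src.a", "mid.x")],
    "sql2": [("mid.x", "out.y")],
}


def machine_parse(sql):
    return parsed.get(sql, [])


first_sql, first_edges = pick_first_case(
    [("bad", [("q", "r")]), ("sql1", [("src.a", "mid.x")])],
    machine_parse,
)
links = cross_case_links(first_edges, machine_parse("sql2"))
print(first_sql, links)  # sql1 [('mid.x', 'out.y')]
```

Here the first candidate is rejected because the manual and machine results differ, and the second candidate becomes the first test case against which the second test case is linked.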
In summary, the field consanguinity analysis method based on depth-first traversal of the present invention has the following advantages:
1) The present invention analyzes the genetic connection between fields by means of depth-first traversal; compared with existing association analysis methods based on the software entity level, it reduces the complexity of the analysis and improves its precision.
2) The present invention parses SQL statements directly and does not depend on an existing metadata system, which further reduces the complexity of association analysis and improves the compatibility of the system.
3) By introducing a state machine, the present invention can cache the information processed during association analysis, so that the result of each analysis can provide metadata and association information for later analyses.
4) In addition to the SQL92 standard, the present invention also supports SQL standards such as those of MySQL and Oracle.
5) By introducing a state machine, the present invention can perform consanguinity analysis not only on operations such as create, insert and update, but also on change operations such as alter, which makes the analysis more comprehensive.
The above describes preferred implementations of the present invention, but the present invention is not limited to these embodiments. Those skilled in the art can make various equivalent variations or replacements without departing from the spirit of the present invention, and all such equivalent variations or replacements fall within the scope defined by the claims of this application.
Claims (10)
- 1. A field consanguinity analysis method based on depth-first traversal, characterized by comprising the following steps: performing conversion processing on the SQL statement to be analyzed to generate an abstract syntax tree; performing depth-first traversal on the generated abstract syntax tree; performing verification parsing on the result of the depth-first traversal to obtain the genetic connection between fields in the SQL statement.
- 2. The field consanguinity analysis method based on depth-first traversal according to claim 1, characterized in that the step of performing conversion processing on the SQL statement to be analyzed to generate an abstract syntax tree comprises the following steps: performing syntax parsing on the SQL statement to generate the abstract syntax tree; traversing the nodes in the abstract syntax tree to generate query blocks.
- 3. The field consanguinity analysis method based on depth-first traversal according to claim 2, characterized in that the step of traversing the nodes in the abstract syntax tree to generate query blocks comprises the following steps: S1, creating a parent query block and taking the parent query block as the current query block; S2, judging whether the node in the abstract syntax tree is TOK_FROM; if so, saving the syntax part of the table name of the node into the aliasToTabs attribute of the current query block; otherwise, performing step S3; S3, judging whether the node in the abstract syntax tree is TOK_DESTINATION; if so, saving the syntax part of the output target of the node into the nameToDest attribute of the current query block; otherwise, performing step S4; S4, judging whether the node in the abstract syntax tree is TOK_SELECT; if so, saving the syntax part of the query expression of the node into the destToSelExpr, destToAggregationExprs and destToDistinctFuncExprs attributes of the current query block; otherwise, performing step S5; S5, judging whether the node in the abstract syntax tree is TOK_WHERE; if so, saving the syntax part of the condition specified by the node into the destToWhereExpr attribute of the current query block; otherwise, performing step S6; S6, judging whether the node in the abstract syntax tree is TOK_INSERT; if so, creating a subquery block, taking the subquery block as the current query block and returning to step S2; otherwise, performing step S7; S7, outputting the parent query block and the subquery block.
- 4. The field consanguinity analysis method based on depth-first traversal according to claim 3, characterized in that the step of outputting the parent query block and the subquery block comprises the following steps: performing a preorder traversal on the parent query block and the subquery block; performing a preamble traversal on the result of the preorder traversal to generate an intermediate table; judging whether the generated intermediate table satisfies a set condition; if so, parsing the fields in the intermediate table and performing the next step; otherwise, performing the next step directly; performing an information mining operation on the parent query block and the subquery block.
- 5. The field consanguinity analysis method based on depth-first traversal according to claim 1, characterized in that the step of performing depth-first traversal on the generated abstract syntax tree comprises the following steps: traversing the abstract syntax tree and storing the metadata information of the tables and the metadata information of the fields in a state machine; traversing the abstract syntax tree to obtain Union subqueries, Join subqueries and SubSelect subqueries, and pushing the obtained Union subqueries, Join subqueries and SubSelect subqueries onto a stack in turn according to their execution order; popping the subqueries stored on the stack, performing genetic connection analysis on them, and storing the analysis result of the genetic connection in the state machine.
- 6. The field consanguinity analysis method based on depth-first traversal according to claim 5, characterized in that the step of popping the subqueries stored on the stack, performing genetic connection analysis on them and storing the analysis result of the genetic connection in the state machine is specifically: the state machine determines the statement type of the subquery currently popped: if the current subquery is a Union subquery, field association is performed on the subquery and the result is stored in the state machine; if the current subquery is a Join subquery or a SubSelect subquery, the subquery is stored in the state machine; if the current subquery is not a Union, Join or SubSelect subquery, consanguinity analysis is performed on the current subquery according to the tables and fields stored in the state machine.
- 7. The field consanguinity analysis method based on depth-first traversal according to claim 1, characterized in that the step of performing verification parsing on the result of the depth-first traversal to obtain the genetic connection between fields in the SQL statement comprises the following steps: inputting a first test case; parsing the first test case both manually and by machine; judging whether the result of manual parsing is consistent with the result of machine parsing; if so, performing the next step; otherwise, inputting a new test case as the first test case and returning to the step of parsing the first test case both manually and by machine; inputting a second test case and parsing the second test case by machine; according to the machine parsing result of the first test case and the machine parsing result of the second test case, analyzing the genetic connection between the first test case and the second test case; performing a verification operation on the machine parsing result of the second test case to obtain the genetic connection between fields in the SQL statement.
- 8. A field consanguinity analysis system based on depth-first traversal, characterized by comprising: a conversion processing module, for performing conversion processing on the SQL statement to be analyzed to generate an abstract syntax tree; a depth-first traversal module, for performing depth-first traversal on the generated abstract syntax tree; a verification parsing module, for performing verification parsing on the result of the depth-first traversal to obtain the genetic connection between fields in the SQL statement.
- 9. The field consanguinity analysis system based on depth-first traversal according to claim 8, characterized in that the depth-first traversal module comprises: a metadata storage module, for traversing the abstract syntax tree and storing the metadata information of the tables and the metadata information of the fields in a state machine; a stack storage module, for traversing the abstract syntax tree to obtain Union subqueries, Join subqueries and SubSelect subqueries, and pushing the subqueries onto a stack in turn according to their execution order; a consanguinity analysis module, for popping the subqueries stored on the stack, performing genetic connection analysis on them, and storing the analysis result of the genetic connection in the state machine.
- 10. A field consanguinity analysis device based on depth-first traversal, characterized by comprising: a memory, for storing a program; a processor, for executing the program in order to: perform conversion processing on the SQL statement to be analyzed to generate an abstract syntax tree; perform depth-first traversal on the generated abstract syntax tree; perform verification parsing on the result of the depth-first traversal to obtain the genetic connection between fields in the SQL statement.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710842320.5A CN107644073A (en) | 2017-09-18 | 2017-09-18 | A kind of field consanguinity analysis method, system and device based on depth-first traversal |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107644073A (en) | 2018-01-30 |
Family ID: 61111944
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710842320.5A (Pending, published as CN107644073A) | 2017-09-18 | 2017-09-18 | A kind of field consanguinity analysis method, system and device based on depth-first traversal |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107644073A (en) |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6078746A (en) * | 1993-10-29 | 2000-06-20 | Microsoft Corporation | Method and system for reducing an intentional program tree represented by high-level computational constructs |
CN1859359A (en) * | 2005-07-12 | 2006-11-08 | 上海华为技术有限公司 | Realizing method and its device for communication protocol described by abstract grammar rule |
CN103186541A (en) * | 2011-12-27 | 2013-07-03 | 阿里巴巴集团控股有限公司 | Generation method and device for mapping relationship |
CN103235723A (en) * | 2013-04-23 | 2013-08-07 | 浙江天正思维信息技术有限公司 | Application software code extraction method based on abstract syntax tree and software product features |
CN104199831A (en) * | 2014-07-31 | 2014-12-10 | 深圳市腾讯计算机系统有限公司 | Information processing method and device |
CN104424269A (en) * | 2013-08-30 | 2015-03-18 | 中国电信股份有限公司 | Data linage analysis method and device |
CN104899314A (en) * | 2015-06-17 | 2015-09-09 | 北京京东尚科信息技术有限公司 | Pedigree analysis method and device of data warehouse |
CN105608086A (en) * | 2014-11-17 | 2016-05-25 | 中兴通讯股份有限公司 | Transaction processing method and device of distributed database system |
CN105912595A (en) * | 2016-04-01 | 2016-08-31 | 华南理工大学 | Data origin collection method of relational databases |
CN107133027A (en) * | 2017-03-30 | 2017-09-05 | 南京南瑞继保电气有限公司 | A kind of syntax tree stratification method for expressing |
Non-Patent Citations (1)
Title |
---|
王飞: ""云环境下海量数据查询处理与分析技术研究"", 《中国优秀硕士学位论文全文数据库 信息科技辑I138-701》 * |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | RJ01 | Rejection of invention patent application after publication | Application publication date: 20180130 |