CN110147396B - Mapping relation generation method and device - Google Patents

Mapping relation generation method and device Download PDF

Info

Publication number
CN110147396B
CN110147396B CN201711091430.9A CN201711091430A CN110147396B CN 110147396 B CN110147396 B CN 110147396B CN 201711091430 A CN201711091430 A CN 201711091430A CN 110147396 B CN110147396 B CN 110147396B
Authority
CN
China
Prior art keywords
name
value pair
field
ith key
alias
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711091430.9A
Other languages
Chinese (zh)
Other versions
CN110147396A (en
Inventor
朱刘江
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Suzhou Software Technology Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Suzhou Software Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, China Mobile Suzhou Software Technology Co Ltd filed Critical China Mobile Communications Group Co Ltd
Priority to CN201711091430.9A priority Critical patent/CN110147396B/en
Publication of CN110147396A publication Critical patent/CN110147396A/en
Application granted granted Critical
Publication of CN110147396B publication Critical patent/CN110147396B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25Integrating or interfacing systems involving database management systems
    • G06F16/254Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a mapping relation generation method and a mapping relation generation device, and aims to solve the problem that the generated mapping relation in the prior art is low in accuracy. The method comprises the following steps: carrying out syntax analysis on a Structured Query Language (SQL) program to construct an Abstract Syntax Tree (AST) of the SQL program; recursively traversing the AST to obtain information of SQL elements obtained by each recursion, wherein the information of the SQL elements at least comprises table names and field names; establishing and storing a first set according to the acquired information of the SQL element; the ith key value pair in the first set is used for storing information of the SQL element obtained by the jth recursion, the key in the ith key value pair at least comprises the table name obtained by the jth recursion, the value in the ith key value pair at least comprises the field name obtained by the jth recursion, i is 1, 2, … N, j is 1, 2, … N, and N is the recursion frequency; and generating a mapping relation according to the first set.

Description

Mapping relation generation method and device
Technical Field
The invention relates to the field of computers and databases, in particular to a mapping relation generation method and device.
Background
The Data extraction, transformation and loading (ETL) technology is used as a core and a soul of a Data Warehouse (DW) and is responsible for a process of converting Data from a Data source to a target Data Warehouse, namely, required Data is extracted from the Data source, and is loaded into the Data Warehouse through Data cleaning and finally according to a predefined Data Warehouse model, wherein the DW is used for integrating Data generated by an On-Line Transaction Processing (OLTP) system of an enterprise, discovering commercial value in the Data, and providing decision support for the enterprise.
In the ETL technology mainly based on Structured Query Language (SQL) scripts, an SQL program needs to be analyzed to extract SQL elements in an SQL syntax book, and a table-level mapping relationship and a field-level mapping relationship are generated according to the extracted SQL elements. In the prior art, SQL elements obtained after an SQL program is analyzed are numbered according to an appearance sequence, a table-level mapping relationship and a field-level mapping relationship are generated according to the number of the SQL elements and corresponding rules, and the correctness of the number of the SQL elements directly affects the correctness of the mapping relationship.
Therefore, table-level mapping relationships and field-level mapping relationships are generated according to the number of the SQL basic element object and the corresponding rule, which easily causes mapping relationships to be wrong, especially for complex SQL programs. Taking a join query as an example, suppose that the table a is generated by joining the table B and the table C, i.e., [ B, C ] → a, in an Abstract Syntax Tree (AST), the order of the table B and the table C is to obtain the table B first and then obtain the table C, i.e., B → C → a, so that the numbers of the table B and the table C are different, and further, the mapping relationship generated according to the numbers of the table B and the table C is wrong.
Disclosure of Invention
The embodiment of the invention provides a mapping relation generation method and a mapping relation generation device, and aims to solve the problem that the accuracy of generating a mapping relation according to the number of an SQL element is low in the prior art.
The embodiment of the invention provides the following specific technical scheme:
in a first aspect, an embodiment of the present invention provides a mapping relationship generation method, including:
carrying out syntax analysis on a Structured Query Language (SQL) program to construct an Abstract Syntax Tree (AST) of the SQL program;
recursively traversing the AST to obtain information of SQL elements obtained by each recursion, wherein the information of the SQL elements at least comprises table names and field names;
establishing and storing a first set according to the acquired information of the SQL element; the ith key value pair in the first set is used for storing information of the SQL element obtained by the jth recursion, the key in the ith key value pair at least comprises the table name obtained by the jth recursion, the value in the ith key value pair at least comprises the field name obtained by the jth recursion, i is 1, 2, … N, j is 1, 2, … N, N is the recursion frequency, and i, j and N are positive integers;
and generating a mapping relation according to the first set.
Optionally, if the information of the SQL element obtained by the jth recursion further includes a sub-query name, a table alias and/or a field alias, the key in the ith key value pair further includes the sub-query name and/or the table alias obtained by the jth recursion, and the value in the ith key value pair further includes the table alias and/or the field alias obtained by the jth recursion.
Optionally, when the mapping relationship is a table-level mapping relationship, generating a mapping relationship according to the first set includes:
analyzing the keys in the ith key value pair, if the keys of the ith key value pair comprise sub query names or first table names, determining the sub query names or the first table names in the keys of the ith key value pair as the keys of a second set, otherwise, determining the second table names in the keys of the ith key value pair as the keys of the second set; the first table name is a table name of a table created in the SQL program, the second table name is a table name included in the SQL program except the first table name, and the second set is used for storing the table-level mapping relationship;
analyzing values in the ith key value pair, when the value of the ith key value pair comprises a table alias, querying a third set according to the table alias included in the value of the ith key value pair, if the table name corresponding to the table alias included in the value of the ith key value pair is found, determining the table name corresponding to the table alias included in the value of the ith key value pair as the value corresponding to the key of the second set, otherwise, determining the table alias included in the value of the ith key value pair as the value corresponding to the key of the second set, wherein the third set is used for storing the corresponding relation between the table name and the table alias in the SQL program;
when the value in the ith key value pair does not include a table alias, determining a second table name included in the key of the ith key value pair as a value corresponding to the key of the second set.
Optionally, when the mapping relationship is a table-level mapping relationship, generating a mapping relationship according to the first set includes:
analyzing keys in the ith key value pair, if the keys of the ith key value pair comprise sub query names or first table names, determining the sub query names or the first table names included in the keys of the ith key value pair as keys of a second set, otherwise, determining the second table names included in the keys of the ith key value pair as keys of the second set, wherein the first table names are table names of tables created in the SQL program, the second table names are table names except the first table names in the SQL program, and the second set is used for storing the table-level mapping relationship;
if the key of the ith key value pair comprises a first table name, analyzing the value of the ith key value pair, when the value of the ith key value pair comprises a table alias, querying a third set according to the table alias comprised in the value of the ith key value pair, if the table name corresponding to the table alias comprised in the value of the ith key value pair is found, determining the table name corresponding to the table alias comprised in the value of the ith key value pair as the value corresponding to the key of the second set, otherwise, determining the table alias comprised in the value of the ith key value pair as the value corresponding to the key of the second set, wherein the third set is used for storing the corresponding relation between the table name and the table alias in the SQL program; and when the keys in the ith key value pair do not comprise the first table name, determining a second table name included in the keys of the ith key value pair as the values corresponding to the keys of the second set.
Optionally, when the mapping relationship is a field-level mapping relationship, generating a mapping relationship according to the first set includes:
analyzing the ith key value pair, when the value of the ith key value pair comprises a table/alias, inquiring a third set according to the table alias included in the value of the ith key value pair, if the table name corresponding to the table alias included in the value of the ith key value pair is found, determining the table name corresponding to the table alias included in the value of the ith key value pair as a source sub inquiry name/table name, otherwise, determining the table alias included in the value of the ith key value pair as the source inquiry sub inquiry name/table name; when the value of the ith key-value pair does not comprise a table alias, determining a second table name included in the key of the ith key-value pair as a source sub-query name/table name, wherein the second table name is a table name in the SQL program except a first table name, and the first table name is a table name of a table created in the SQL program;
if the key in the ith key value pair comprises a sub-query name or a first table name, determining the sub-query name or the first table name included in the key of the ith key value pair as a target sub-query name/table name corresponding to the source sub-query name/table name, otherwise, determining a second table name included in the key of the ith key value pair as the target sub-query name/table name, wherein the first table name is the table name of the table created in the SQL program;
determining a field name included in the value of the ith key value pair as a source field name;
if the value of the ith key value pair comprises a field alias, determining the field alias included in the value of the ith key value pair as a target field name corresponding to the source field name, otherwise, determining the field name included in the value of the ith key value pair as the target field name corresponding to the source field name;
and generating a field level mapping relation according to the target sub-query name/table name, the target field name, the source sub-query name/table name and the source field name.
Optionally, before querying the third set according to the table alias included in the value of the ith key-value pair, the method further includes: resolving the key of the ith key-value pair, if the key of the ith key-value pair comprises a table alias, determining the table alias included in the key of the ith key-value pair as the key in the third set, otherwise, determining a second table name included in the key of the ith key-value pair as the key in the third set; and determining a second table name included in the key of the ith key value pair as a value corresponding to the key in the third set to obtain the third set.
Optionally, a key in the ith key-value pair includes a first field, a second field and a third field, the first field is used for storing a sub-query name or a first table name, the second field is used for storing a second table name, and the third field is used for storing a table alias; any element in the value of the ith key-value pair comprises a fourth field, a fifth field and a sixth field, wherein the fourth field is used for saving a table alias, the fifth field is used for saving a field name, and the sixth field is used for saving a field alias;
if the sub-query name and the first table name are not obtained in the jth recursion, the first field of the ith key value pair is represented by the second table name obtained in the jth recursion; if the jth recursion does not obtain the table alias, a third field and a fourth field of the ith key value pair are represented by a first set character identifier or the table name obtained by the jth recursion; and if the field alias is not obtained in the jth recursion, identifying a sixth field of the ith key value pair by using a second set character.
In a second aspect, an embodiment of the present invention provides an apparatus for generating a mapping relationship, where the apparatus includes:
the syntax tree building module is used for carrying out syntax analysis on a Structured Query Language (SQL) program and building an abstract syntax tree AST of the SQL program;
the information acquisition module is used for recursively traversing the AST and acquiring information of SQL elements obtained by each recursion, wherein the information of the SQL elements at least comprises table names and field names;
the processing module is used for establishing and storing a first set according to the acquired information of the SQL element and generating a mapping relation according to the first set; the ith key value pair in the first set is used for storing information of the SQL element obtained by the jth recursion, the key in the ith key value pair at least includes the table name obtained by the jth recursion, the value in the ith key value pair at least includes the field name obtained by the jth recursion, i is 1, 2, … N, j is 1, 2, … N, N is the recursion frequency, and i, j, N are positive integers.
Optionally, if the information of the SQL element obtained by the jth recursion further includes a sub-query name, a table alias and/or a field alias, the key in the ith key value pair further includes the sub-query name and/or the table alias obtained by the jth recursion, and the value in the ith key value pair further includes the table alias and/or the field alias obtained by the jth recursion.
Optionally, the processing module is specifically configured to:
analyzing the keys in the ith key value pair, if the keys of the ith key value pair comprise sub query names or first table names, determining the sub query names or the first table names included in the keys of the ith key value pair as the keys of a second set, otherwise, determining the second table names included in the keys of the ith key value pair as the keys of the second set; the first table name is a table name of a table created in the SQL program, the second table name is a table name included in the SQL program except the first table name, and the second set is used for storing the table-level mapping relationship;
analyzing values in the ith key value pair, when the value of the ith key value pair comprises a table alias, querying a third set according to the table alias included in the value of the ith key value pair, if the table name corresponding to the table alias included in the value of the ith key value pair is found, determining the table name corresponding to the table alias included in the value of the ith key value pair as the value corresponding to the key of the second set, otherwise, determining the table alias included in the value of the ith key value pair as the value corresponding to the key of the second set, wherein the third set is used for storing the corresponding relation between the table name and the table alias in the SQL program;
when the value in the ith key value pair does not include a table alias, determining a second table name included in the key of the ith key value pair as a value corresponding to the key of the second set.
Optionally, the processing module is specifically configured to:
analyzing keys in the ith key value pair, if the keys of the ith key value pair comprise sub query names or first table names, determining the sub query names or the first table names included in the keys of the ith key value pair as keys of a second set, otherwise, determining the second table names included in the keys of the ith key value pair as keys of the second set, wherein the first table names are table names of tables created in the SQL program, the second table names are table names except the first table names in the SQL program, and the second set is used for storing the table-level mapping relationship;
if the key of the ith key value pair comprises a first table name, analyzing the value of the ith key value pair, when the value of the ith key value pair comprises a table alias, querying a third set according to the table alias comprised in the value of the ith key value pair, if the table name corresponding to the table alias comprised in the value of the ith key value pair is found, determining the table name corresponding to the table alias comprised in the value of the ith key value pair as the value corresponding to the key of the second set, otherwise, determining the table alias comprised in the value of the ith key value pair as the value corresponding to the key of the second set, wherein the third set is used for storing the corresponding relation between the table name and the table alias in the SQL program; and when the keys in the ith key value pair do not comprise the first table name, determining a second table name included in the keys of the ith key value pair as the values corresponding to the keys of the second set.
Optionally, the processing module is specifically configured to:
analyzing the ith key value pair, when the value of the ith key value pair comprises a table alias, inquiring a third set according to the table alias included in the value of the ith key value pair, if the table name corresponding to the table alias included in the value of the ith key value pair is found, determining the table name corresponding to the table alias included in the value of the ith key value pair as a source sub inquiry name/table name, otherwise, determining the table alias included in the value of the ith key value pair as the source inquiry sub inquiry name/table name; when the value of the ith key-value pair does not comprise a table alias, determining a second table name included in the key of the ith key-value pair as a source sub-query name/table name, wherein the second table name is a table name in the SQL program except a first table name, and the first table name is a table name of a table created in the SQL program;
if the key in the ith key value pair comprises a sub-query name or a first table name, determining the sub-query name or the first table name included in the key of the ith key value pair as a target sub-query name/table name corresponding to the source sub-query name/table name, otherwise, determining a second table name included in the key of the ith key value pair as the target sub-query name/table name, wherein the first table name is the table name of the table created in the SQL program;
determining a field name included in the value of the ith key value pair as a source field name;
if the value of the ith key value pair comprises a field alias, determining the field alias included in the value of the ith key value pair as a target field name corresponding to the source field name, otherwise, determining the field name included in the value of the ith key value pair as the target field name corresponding to the source field name;
and generating a field level mapping relation according to the target sub-query name/table name, the target field name, the source sub-query name/table name and the source field name.
Optionally, the processing module is further configured to:
and before querying a third set according to the table aliases included in the value of the ith key-value pair, analyzing the key of the ith key-value pair, determining a second table alias included in the key of the ith key-value pair as the key in the third set, and determining a second table name included in the key of the ith key-value pair as a value corresponding to the key in the third set to obtain the third set.
Based on the technical scheme, the syntax analysis is carried out on the SQL program to construct the AST of the SQL program; recursively traversing the AST to obtain information of SQL elements obtained by each recursion; establishing and storing a first set according to the acquired information of the SQL elements, and generating a mapping relation according to the first set, wherein the information of the SQL elements at least comprises table names and field names, an ith key value pair in the first set is used for storing the information of the SQL elements obtained by a jth recursion, a key in the ith key value pair at least comprises the table name obtained by the jth recursion, a value in the ith key value pair at least comprises the field name obtained by the jth recursion, i is 1, 2, … N, j is 1, 2, … N, N is the recursion times, and i, j and N are positive integers. The mapping relation method provided by the embodiment of the invention does not need to number the SQL elements, can avoid the problem of disordered mapping relation caused by wrong numbering of the SQL elements, and further achieves the purpose of improving the accuracy of the generated mapping relation.
Drawings
Fig. 1 is a schematic flow chart of a mapping relationship generation method according to an embodiment of the present invention;
FIG. 2 is a diagram illustrating a table-level mapping relationship in an embodiment of the invention;
FIG. 3 is a diagram illustrating field level mapping relationships in an embodiment of the invention;
fig. 4 is a schematic structural diagram of an apparatus for generating a mapping relationship according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The mapping relationship generation method provided by the embodiment of the invention is executed by equipment with a database access function, such as a personal computer, a server and the like, and is suitable for a Data Definition Language (DDL) in SQL.
In the embodiment of the present invention, the first table name is a table name of a table created in the SQL program, and the second table name is a table name other than the first table name in the SQL program. The first set is used for storing information of SQL elements obtained by AST trees of the SQL program recursion each time, the second set is used for storing table-level mapping relations, and the third set is used for storing the corresponding relations between table names and table aliases in the SQL program.
In addition, in the description of the present application, a plurality means two or more. And/or, describing the association relationship of the associated objects, indicating that three relationships may exist. For example, a and/or B, may represent: a exists alone, A and B exist simultaneously, and B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship. The terms "first," "second," and the like, are used for distinguishing between descriptions and not necessarily for describing a sequential or chronological order.
Referring to fig. 1, a method for generating a mapping relationship according to an embodiment of the present invention specifically includes the following steps:
s101: and carrying out syntactic analysis on the SQL program to construct the AST of the SQL program.
Wherein the SQL program is used to create a new table that is created from other tables in the database.
S102: and recursively traversing the constructed AST to acquire the information of the SQL elements obtained by each recursion.
The information of the SQL element at least comprises a table name and a field name, and the SQL element comprises a sub SQL, a table, a field and the like.
S103: and establishing and storing a first set according to the acquired information of the SQL elements.
The key (key) in the ith key value pair at least comprises a table name obtained by the jth recursion, the value (value) in the ith key value pair at least comprises a field name obtained by the jth recursion, i is 1, 2, … N, j is 1, 2, … N, N is the total number of recursions, i, j and N are positive integers, that is, each key value pair in the first set is respectively used for storing SQL element information obtained by each recursion. It should be noted that any one of the keys in the first set corresponds to a value, the value corresponding to any one of the keys in the first set may include one or more elements, and the order of storing the key-value pairs in the first set is independent of the order of recursion, i.e., i and j may be the same or different.
In implementation, if the information of the SQL element obtained by the jth recursion further includes a sub-query name, a table alias and/or a field alias, the key in the ith key-value pair further includes a sub-query name and/or a table alias obtained by the jth recursion, and the value in the ith key-value pair further includes a table alias and/or a field alias obtained by the jth recursion. Specifically, when the information of the SQL element obtained by the jth recursion further includes a sub-query name, the key in the ith key value pair further includes the sub-query name obtained by the jth recursion; when the information of the SQL element obtained by the jth recursion also comprises a table alias, the key in the ith key value pair also comprises the table alias obtained by the jth recursion, and at least one element in the value of the ith key value pair also comprises the table alias obtained by the jth recursion; when the information of the SQL element obtained by the jth recursion also comprises a field alias, at least one element in the value of the ith key value pair also comprises the field alias obtained by the jth recursion; when the information of the SQL element obtained by the jth recursion further comprises a sub query name and a table alias, the key in the ith key value pair further comprises the sub query name and the table alias obtained by the jth recursion, and at least one element in the value of the ith key value pair further comprises the table alias obtained by the jth recursion; when the information of the SQL element obtained by the jth recursion further comprises a sub-query name and a field alias, the key in the ith key value pair further comprises the sub-query name obtained by the jth recursion, and at least one element in the value of the ith key value pair further comprises the field alias obtained by the jth recursion; when the information of the SQL element obtained by the jth recursion further includes a sub-query name, a table alias and a field alias, the key in the ith key-value pair further includes the sub-query name and the table alias obtained by the jth recursion, and at least one element in the value of the ith key-value pair further includes the table alias and the field alias obtained by the jth recursion. Wherein the subquery name is also a table alias.
In a specific embodiment, a key in the ith key-value pair includes a first field, a second field and a third field, the first field is used for storing a subquery name or a first table name, the second field is used for storing a second table name, and the third field is used for storing a table alias; any one element in the value of the ith key-value pair includes a fourth field for holding a table alias, a fifth field for holding a field name, and a sixth field for holding a field alias. If the sub-query name and the first table name are not obtained in the jth recursion, the first field of the ith key value pair is represented by the table name obtained in the jth recursion; if the jth recursion does not obtain the table alias, the third field and the fourth field of the ith key value pair are represented by a first set character identifier or the table name obtained by the jth recursion; if the field alias is not obtained in the jth recursion, the sixth field of the ith key value pair is represented by the second set character. The first and second set characters may be the same or different, for example, the first and second set characters may be "NONE".
In implementation, the first field, the second field, and the third field included in the key of the ith key pair of the first set may be separated by a separation identifier, and the fourth field, the fifth field, and the sixth field included in the value of the ith key pair of the first set may be separated by a separation identifier, so as to identify and extract the content stored in the different fields when parsing the ith key pair, where the separation identifier may be "#", "$", or the like.
For example, the information of the SQL elements obtained by the AST constructed by the jth recursion includes a subquery name B, a table name bit _ MON, a field name SUBS _ ID, fe _ notify, and INC _ notify, and if no table alias and field alias are obtained, the key of the ith key-value pair in the first set is B $ bit _ MON, and the value of the ith key-value pair is NONE # SUBS _ ID # NONE, NONE # fe _ notify # NONE, and NONE # INC _ notify # NONE.
S104: and generating a mapping relation according to the first set.
Specifically, a mapping relationship is generated by analyzing key-value pairs in the first set, wherein a method for generating a table-level mapping relationship mainly includes the following two methods:
the first method is that each key-value pair in the first set is analyzed to generate a second set used for storing the table-level mapping relationship, wherein one key-value pair in the second set can be obtained by analyzing one key-value pair in the first set. The method mainly comprises the following steps:
step 1 a: and analyzing the keys in the ith key value pair of the first set, if the keys of the ith key value pair comprise subquery names or first table names, determining the subquery names or the first table names included in the keys of the ith key value pair as the keys of the second set, otherwise, determining the second table names included in the keys of the ith key value pair as the keys of the second set.
Step 2 a: analyzing the value in the ith key value pair, when the value of the ith key value pair comprises a table alias, inquiring a third set according to the table alias included in the value of the ith key value pair, if the table name corresponding to the table alias included in the value of the ith key value pair is found, determining the table name corresponding to the table alias included in the value of the ith key value pair as the value corresponding to the key of the second set, otherwise, determining the table alias included in the value of the ith key value pair as the value corresponding to the key of the second set; and when the table alias is not included in the values in the ith key value pair, determining a second table name included in the key of the ith key value pair as the value corresponding to the key of the second set.
For example, when the key of the ith key value pair in the first set is B $ bit _ MON, the values of the ith key value pair are NONE # SUBS _ ID # NONE, NONE # FEE _ NOTAX # NONE, and NONE # INC _ NOTAX # NONE, the key of the ith key value pair is parsed, it is determined that the key of the ith key value pair includes a subquery name, the subquery name B in the key of the ith key value pair is taken as the key in the second set, the value of the ith key value pair is parsed, it is determined that the value of the ith key value pair does not include a table alias, and the table name bit _ MON in the key of the ith key value pair is taken as the value corresponding to the key in the second set.
In an implementation, if the value of the ith key-value pair includes a plurality of elements, the step 2a is performed on each element in the value of the ith key-value pair, and the result obtained by performing the step 2a on all elements in the value of the ith key-value pair is used as the value corresponding to the key of the second set determined according to the key in the ith key-value pair.
And secondly, analyzing each key in the first set and the value corresponding to the key comprising the first table name, and generating a second set for storing the table-level mapping relationship. The method mainly comprises the following steps:
step 1 b: and analyzing the keys in the ith key value pair of the first set, if the keys of the ith key value pair comprise the subquery names or the first table names, determining the subquery names or the first table names included in the keys of the ith key value pair as the keys of the second set, otherwise, determining the second table names included in the keys of the ith key value pair as the keys of the second set.
And step 2 b: if the key of the ith key value pair comprises a first table name, analyzing the value of the ith key value pair, when the value of the ith key value pair comprises a table alias, inquiring a third set according to the table alias comprised in the value of the ith key value pair, if the table name corresponding to the table alias comprised in the value of the ith key value pair is found, determining the table name corresponding to the table alias comprised in the value of the ith key value pair as the value corresponding to the key of the second set, otherwise, determining the table alias comprised in the value of the ith key value pair as the value corresponding to the key of the second set; and when the key in the ith key value pair does not comprise the first table name, determining a second table name included in the key of the ith key value pair as a value corresponding to the key of the second set.
In an implementation, if the key of the ith key-value pair includes the first table name and the value of the ith key-value pair includes a plurality of elements, the step 2b is performed on each element of the value of the ith key-value pair, and the result obtained by performing the step 2b on all elements of the value of the ith key-value pair is used as the value corresponding to the key of the second set determined according to the key of the ith key-value pair.
For example, when the key of the ith key value pair in the first set is B $ bit _ MON, the values of the ith key value pair are NONE # SUBS _ ID # NONE, NONE # FEE _ NOTAX # NONE, and NONE # INC _ NOTAX # NONE, the key of the ith key value pair is parsed, it is determined that the key of the ith key value pair does not include the first table name, the subquery name B in the key of the ith key value pair is taken as the key in the second set, and the table name bit _ MON in the key of the ith key value pair is taken as the value corresponding to the key in the second set. Also, for example, if the key of the ith key-value pair in the first set is DM _ SUM $ list _ TO _ MON, the value of the ith key-value pair is a # SUBS _ ID # NONE, B # FEE _ NOTAX # CNSM _ MON _ NOTAX, where DM _ SUM is the first table name, list _ TO _ MON is the table name, A, B is the table alias, SUBS _ ID, FEE _ NOTAX are the field names, CNSM _ MON _ NOTAX is the field alias, the key of the ith key-value pair is resolved, it is determined that the key of the ith key-value pair includes the first table name, the value of the ith key-value pair is resolved, it is determined that the table alias is included in the value of the ith key-value pair, a third set is queried according TO the value of the ith key-value pair including the alias A, B, and the table name corresponding TO the alias table alias A, B is not queried, and then the table alias A, B in the value of the ith key-value pair is used as the table alias of the corresponding second key-set.
It should be noted that, a value corresponding to any one of the keys in the second set may include one or more elements.
In implementation, after generating the second set for storing the field-level mapping relationship, the table-level mapping relationship may be obtained in reverse direction by using the first table name as an entry according to the second set, and the table-level mapping relationship is displayed in page image rendering.
Specifically, the field-level mapping relationship is generated by analyzing each key-value pair in the first set, and the method mainly includes the following steps:
step 1 c: analyzing the ith key value pair, when the value of the ith key value pair comprises a table alias, inquiring a third set according to the table alias included in the value of the ith key value pair, if the table name corresponding to the table alias included in the value of the ith key value pair is found, determining the table name corresponding to the table alias included in the value of the ith key value pair as a source sub inquiry name/table name, otherwise, determining the table alias included in the value of the ith key value pair as a source inquiry sub inquiry name/table name; when the table alias is not included in the value of the ith key-value pair, a second table name included in the key of the ith key-value pair is determined as the source sub-query name/table name.
And step 2 c: and if the key in the ith key value pair comprises the sub-query name or the first table name, determining the sub-query name or the first table name included in the key of the ith key value pair as the target sub-query name/table name corresponding to the determined source sub-query name/table name, and otherwise, determining the second table name included in the key of the ith key value pair as the target sub-query name/table name.
And step 3 c: the field name included in the value of the ith key-value pair is determined as the source field name.
And 4 c: and if the value of the ith key value pair comprises the field alias, determining the field alias included in the value of the ith key value pair as the target field name corresponding to the source field name, otherwise, determining the field name included in the value of the ith key value pair as the target field name corresponding to the determined source field name.
And step 5 c: and generating a field level mapping relation according to the determined target sub-query name/table name, the target field name, the source sub-query name/table name and the source field name.
In implementation, if the value of the ith key-value pair includes multiple elements, the above steps 1c to 4c are performed on each element in the value of the ith key-value pair, one element in the value of the ith key-value pair corresponds to one field mapping relationship, and the generated field-level mapping relationship is formed by the field mapping relationships corresponding to all elements in the value of the first set.
It should be noted that, in the embodiment of the present invention, the key and the value of the ith key-value pair may be analyzed separately, or the key and the value of the ith key-value pair may be analyzed simultaneously. In addition, the embodiment of the present invention does not limit the determination of the sequence of the source sub-query name/table name, the target sub-query name/table name, the source field name, and the target field name.
In implementation, in the process of generating the mapping relationship, before querying the third set according to the table alias included in the value of the ith key-value pair, the third set is further established by: analyzing the key of the ith key value pair, if the key of the ith key value pair comprises a table alias, determining the table alias included in the key of the ith key value pair as the key in the third set, otherwise, determining a second table name included in the key of the ith key value pair as the key in the third set; and determining a second table name included in the key of the ith key value pair as a value corresponding to the key in the third set to obtain the third set.
In summary, in the embodiment of the present invention, the syntax analysis is performed on the SQL program to construct the AST of the SQL program; recursively traversing the AST to obtain information of SQL elements obtained by each recursion; establishing and storing a first set according to the acquired information of the SQL elements, and generating a mapping relation according to the first set, wherein the information of the SQL elements at least comprises table names and field names, an ith key value pair in the first set is used for storing the information of the SQL elements obtained by a jth recursion, a key in the ith key value pair at least comprises the table name obtained by the jth recursion, a value in the ith key value pair at least comprises the field name obtained by the jth recursion, i is 1, 2, … N, j is 1, 2, … N, N is the recursion times, and i, j and N are positive integers. The mapping relation method provided by the embodiment of the invention does not need to number the SQL elements, can avoid the problem of disordered mapping relation caused by wrong numbering of the SQL elements, and further achieves the purpose of improving the accuracy of the generated mapping relation.
The following describes the mapping relationship generation method according to the present invention in detail by using a specific embodiment.
The SQL program to be parsed is as follows:
Figure BDA0001461303490000151
Figure BDA0001461303490000161
the method for generating the mapping relation of the SQL program comprises the following steps:
1. and according to the SQL grammar rule, performing grammar analysis and lexical analysis on the SQL program to construct the AST of the SQL program.
2. And recursively traversing the constructed AST syntax tree to acquire SQL information obtained by each recursion.
The obtained SQL information may further include a sub-query name, a table alias, a field alias, and join (join) information, in addition to the table name and the field name.
3. Establishing and storing a first set according to the obtained SQL information, wherein the first set is shown in table 1:
TABLE 1 first set
Figure BDA0001461303490000162
Figure BDA0001461303490000171
The key in the first set is obtained by analyzing a from statement or a table building statement in the SQL program, and the value in the first set is obtained by analyzing a select statement in the SQL program. The key in the first set comprises a first field, a second field and a third field, wherein the first field is used for storing a sub-query name or a first table name, the second field is used for storing a second table name, the third field is used for storing a table alias, and different fields are separated by a separation identifier, namely the format of the key in the first set is sub-query name/first table name $ second table name $ table alias; any one element in the value in the first set comprises a fourth field, a fifth field and a sixth field, wherein the fourth field is used for storing table aliases, the fifth field is used for storing field names, the sixth field is used for storing field aliases, different fields are separated by separation identifiers, namely the format of the element in the value in the first set is table alias # field name # alias. And if the sub query name or the first table name is not obtained by a certain recursion, the sub query name/the first table name in the first field is represented by using the second table name obtained by the certain recursion, if the table alias is not obtained by the certain recursion, the table alias in the second field and the fourth field is represented by using the second table name or the set character obtained by the certain recursion, and if the field alias is not obtained by the certain recursion, the field alias in the sixth field is replaced by using the set character.
4. Analyzing each key in the first set, determining the second table name in the first set key as a key in the third set, and determining the table alias in the first set key as a value in the third set, so as to obtain the third set shown in table 2.
TABLE 2 third set
Figure BDA0001461303490000172
When the key of DM _ SUM $ list _ MON $ list _ TO _ MON in the first set is analyzed, the key pair in the third set is obtained as list _ TO _ MON, the value is list _ TO _ MON, and when the key of a $ list _ TO _ MON $ TMP in the first set is analyzed, the key pair in the third set is obtained as list _ TO _ MON, the value is TMP, that is, the value of the key pair in the third set obtained by analyzing the key of DM _ list _ TO _ MON $ list is replaced by TMP, so that the result shown in table 2 is obtained by replacing the value of the key pair in the third set obtained by analyzing the key of DM _ list _ TO _ MON $ list.
5. And generating a table-level mapping relation and a field-level mapping relation according to the first set and the third set.
(a) Generating table-level mapping relationships
For key DM _ SUM $ list _ TO _ MON in the first set, because the key includes the first table name DM _ SUM, the first table name DM _ SUM is determined as the key in the second set, the value corresponding TO the key DM _ SUM $ list _ TO _ MON in the first set includes the table alias A, B, the third set is queried according TO the table alias A, B, the table name corresponding TO the table alias A, B is not found, and therefore the table alias A, B is determined as the value corresponding TO the key DM _ SUM in the third set; for the key B $ BILL _ MON $ BILL _ MON in the first set, determining the subquery name B as the key in the second set, and determining the table name BILL _ MON in the key B $ BILL _ MON in the first set as the value corresponding to the key B in the second set because the value corresponding to the key B $ BILL _ MON in the first set does not comprise the table alias; for key a $ list _ MON $ TMP in the first set, the subquery name a is determined as the key in the second set, since the value corresponding TO the key a $ list _ MON $ TMP in the first set includes the table alias TMP, the third set is queried according TO the table alias TMP TO obtain the table name list _ list corresponding TO the table alias TMP, and the table name list _ list is determined as the value corresponding TO the key a in the second set, and the result of the obtained second set is shown in table 3:
TABLE 3 second set
Figure BDA0001461303490000181
From the second set, the table-level mapping relationship shown in fig. 2 can be pushed backwards, using the first table name DM _ SUM as an entry.
(b) Generating field level mapping relationships
Determining a first table name DM _ SUM in key DM _ SUM $ CUST _ TO _ MON $ CUST _ TO _ MON as a target subquery/table name for a first key-value pair in the first set; for value A # SUBS _ ID # NONE corresponding TO key DM _ SUM $ CUST _ TO _ MON $ CUST _ TO _ MON, because table alias A is included in the A # SUBS _ ID # NONE, the third set is inquired according TO the table alias A, the table name corresponding TO the table alias A is not inquired, and therefore the table alias A is determined as the source sub-inquiry name/table name; since the value a # SUBS _ ID # NONE includes no field alias, the field names in the value a # SUBS _ ID # NONE are determined as the source field name and the target field name, and the determined target sub-query name/table name DM _ SUM, source sub-query name/table name a, source field name SUBS _ ID, and target field name SUBS _ ID are determined as one field mapping relationship. For value B # FEE _ NOTAX # CNSM _ MON _ NOTAX corresponding TO key DM _ SUM $ cue _ TO _ MON, because table alias B is included in B # FEE _ NOTAX # CNSM _ MON _ NOTAX, the third set is queried according TO table alias B without querying the table name corresponding TO table alias B, so table alias B is determined as the source sub-query name/table name, field name FEE _ NOTAX in B # FEE _ NOTAX # CNSM _ MON _ NOTAX is determined as the source field name, and because alias field CNSM _ NOTAX is included in B # FEE _ NOTAX _ CNSM _ NOTAX, field alias _ max _ cnax is determined as the target field name, the determined target sub-query name/table name DM _ SUM, source sub-query name/table name B, source name field notsm _ NOTAX and target field cnax _ NOTAX are determined as one target field name mapping relationship. Similarly, for value B # INC _ NOTAX # NB _ CNSM _ MON _ NOTAX corresponding TO key DM _ SUM $ list _ TO _ MON, the determined source sub-query name/table name is B, the source field name is INC _ NOTAX, the target field name is NB _ CNSM _ MON _ NOTAX, and the determined target sub-query name/table name DM _ SUM, source sub-query name/table name B, source field name INC _ NOTAX, and target field name NB _ CNSM _ MON _ NOTAX are determined as a field mapping relationship.
For the second key-value pair in the first set, determining the subquery name B in key B $ BILL _ MON $ BILL _ MON as the target subquery name/table name; for value NONE # SUBS _ ID # NONE corresponding to key B $ bit _ MON, since no table alias is included in value NONE # SUBS _ ID # NONE, the table name bit _ MON in key B $ bit _ MON is determined as the source query name/table name; since the field alias is not included in the value NONE # SUBS _ ID # NONE, the field name SUBS _ ID in NONE # SUBS _ ID # NONE is determined as the source field name and the target field name; and determining the determined target subquery name/table name B, the source subquery name/table name BILL _ MON, the source field name SUBS _ ID and the target field name SUBS _ ID as a field mapping relation. Similarly, for value NONE # FEE _ NOTAX # NONE corresponding to key B $ bit _ MON, the determined target sub-query name/table name B, the source sub-query name/table name is bit _ MON, the source field name is FEE _ NOTAX, the target field name is FEE _ NOTAX, and the determined target sub-query name/table name B, source sub-query name/table name bit _ MON, source field name FEE _ NOTAX and target field name FEE _ NOTAX are determined as a field mapping relation; for value NONE # INC _ NOAX # NONE corresponding to key B $ BILL _ MON $ BILL _ MON, the determined target sub-query name/table name B, the source sub-query name/table name BILL _ MON, the source field name FEE _ NOTAX, and the target field name FEE _ NOTAX are determined as a field mapping relation.
For the third key-value pair in the first set, determining the sub-query name A in key A $ CUST _ TO _ MON $ TMP as the target sub-query name/table name; because the value TMP # SUBS _ ID # NONE corresponding TO the key A $ CUST _ TO _ MON $ TMP comprises the table alias TMP, the third set is queried according TO the table alias TMP TO obtain the table name CUST _ TO _ MON corresponding TO the table alias TMP, and the table name CUST _ TO _ MON is determined as the source query name/table name; since the field alias is not included in the value TMP # SUBS _ ID # NONE, the field name SUBS _ ID in the value TMP # SUBS _ ID # NONE is determined as the source field name and the target field name; and determining the determined target sub-query name/table name A, the source sub-query name/table name CUST _ TO _ MON, the source field name SUBS _ ID and the target field name SUBS _ ID as a field mapping relation.
And storing the field mapping relation determined according to each key value pair in the first set in a fourth set according to a set format to obtain a field level mapping relation. The format of each field mapping relationship may be target sub-query name/table name $ target field name $ source sub-query name/table name $ source field name, and the result of the fourth set is shown in table 4:
TABLE 4 fourth set
Fourth set
B$SUBS_ID$BILL_MON$SUBS_ID
B$FEE_NOTAX$BILL_MON$FEE_NOTAX
B$INC_NOTAX$BILL_MON$INC_NOTAX
DM_SUM$SUBS_ID$A$SUBS_ID
DM_SUM$CNSM_MON_NOTAX$B$FEE_NOTAX
DM_SUM$NB_CNSM_MON_NOTAX$B$INC_NOTAX
A$SUBS_ID$CUST_TO_MON$SUBS_ID
After the fourth set is obtained, the field-level mapping relationship stored in the fourth set is obtained by using the first table name as an entry, and is displayed in a page image rendering manner, as shown in fig. 3.
Based on the above embodiments, an embodiment of the present application further provides an apparatus for generating a mapping relationship, so as to implement the mapping relationship generation method shown in fig. 1, referring to fig. 4, where the apparatus 400 includes: a syntax tree building module 401, an information acquisition module 402 and a processing module 403.
A syntax tree building module 401, configured to perform syntax analysis on a structured query language SQL program, and build an abstract syntax tree AST of the SQL program;
an information obtaining module 402, configured to recursively traverse the AST, and obtain information of an SQL element obtained by each recursion, where the information of the SQL element at least includes a table name and a field name;
the processing module 403 is configured to establish and store a first set according to the obtained information of the SQL element, and generate a mapping relationship according to the first set; the ith key value pair in the first set is used for storing information of the SQL element obtained by the jth recursion, the key in the ith key value pair at least includes the table name obtained by the jth recursion, the value in the ith key value pair at least includes the field name obtained by the jth recursion, i is 1, 2, … N, j is 1, 2, … N, N is the recursion frequency, and i, j, N are positive integers.
Optionally, if the information of the SQL element obtained by the jth recursion further includes a sub-query name, a table alias and/or a field alias, the key in the ith key value pair further includes the sub-query name and/or the table alias obtained by the jth recursion, and the value in the ith key value pair further includes the table alias and/or the field alias obtained by the jth recursion.
Optionally, the processing module 403 is specifically configured to:
analyzing the keys in the ith key value pair, if the keys of the ith key value pair comprise sub query names or first table names, determining the sub query names or the first table names included in the keys of the ith key value pair as the keys of a second set, otherwise, determining the second table names included in the keys of the ith key value pair as the keys of the second set; the first table name is a table name of a table created in the SQL program, the second table name is a table name included in the SQL program except the first table name, and the second set is used for storing the table-level mapping relationship;
analyzing values in the ith key value pair, when the value of the ith key value pair comprises a table alias, querying a third set according to the table alias included in the value of the ith key value pair, if the table name corresponding to the table alias included in the value of the ith key value pair is found, determining the table name corresponding to the table alias included in the value of the ith key value pair as the value corresponding to the key of the second set, otherwise, determining the table alias included in the value of the ith key value pair as the value corresponding to the key of the second set, wherein the third set is used for storing the corresponding relation between the table name and the table alias in the SQL program;
when the value in the ith key value pair does not include a table alias, determining a second table name included in the key of the ith key value pair as a value corresponding to the key of the second set.
Optionally, the processing module 403 is specifically configured to:
analyzing keys in the ith key value pair, if the keys of the ith key value pair comprise sub query names or first table names, determining the sub query names or the first table names included in the keys of the ith key value pair as keys of a second set, otherwise, determining the second table names included in the keys of the ith key value pair as keys of the second set, wherein the first table names are table names of tables created in the SQL program, the second table names are table names except the first table names in the SQL program, and the second set is used for storing the table-level mapping relationship;
if the key of the ith key value pair comprises a first table name, analyzing the value of the ith key value pair, when the value of the ith key value pair comprises a table alias, querying a third set according to the table alias comprised in the value of the ith key value pair, if the table name corresponding to the table alias comprised in the value of the ith key value pair is found, determining the table name corresponding to the table alias comprised in the value of the ith key value pair as the value corresponding to the key of the second set, otherwise, determining the table alias comprised in the value of the ith key value pair as the value corresponding to the key of the second set, wherein the third set is used for storing the corresponding relation between the table name and the table alias in the SQL program; and when the keys in the ith key value pair do not comprise the first table name, determining a second table name included in the keys of the ith key value pair as the values corresponding to the keys of the second set.
Optionally, the processing module 403 is specifically configured to:
analyzing the ith key value pair, when the value of the ith key value pair comprises a table alias, inquiring a third set according to the table alias included in the value of the ith key value pair, if the table name corresponding to the table alias included in the value of the ith key value pair is found, determining the table name corresponding to the table alias included in the value of the ith key value pair as a source sub inquiry name/table name, otherwise, determining the table alias included in the value of the ith key value pair as the source inquiry sub inquiry name/table name; when the value of the ith key-value pair does not comprise a table alias, determining a second table name included in the key of the ith key-value pair as a source sub-query name/table name, wherein the second table name is a table name in the SQL program except a first table name, and the first table name is a table name of a table created in the SQL program;
if the key in the ith key value pair comprises a sub-query name or a first table name, determining the sub-query name or the first table name included in the key of the ith key value pair as a target sub-query name/table name corresponding to the source sub-query name/table name, otherwise, determining a second table name included in the key of the ith key value pair as the target sub-query name/table name, wherein the first table name is the table name of the table created in the SQL program;
determining a field name included in the value of the ith key value pair as a source field name;
if the value of the ith key value pair comprises a field alias, determining the field alias included in the value of the ith key value pair as a target field name corresponding to the source field name, otherwise, determining the field name included in the value of the ith key value pair as the target field name corresponding to the source field name;
and generating a field level mapping relation according to the target sub-query name/table name, the target field name, the source sub-query name/table name and the source field name.
Optionally, the processing module 403 is further configured to:
and before querying a third set according to the table aliases included in the value of the ith key-value pair, analyzing the key of the ith key-value pair, determining a second table alias included in the key of the ith key-value pair as the key in the third set, and determining a second table name included in the key of the ith key-value pair as a value corresponding to the key in the third set to obtain the third set.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (12)

1. A mapping relationship generation method is characterized by comprising the following steps:
carrying out syntax analysis on a Structured Query Language (SQL) program to construct an Abstract Syntax Tree (AST) of the SQL program;
recursively traversing the AST to obtain information of SQL elements obtained by each recursion, wherein the information of the SQL elements at least comprises table names and field names;
establishing and storing a first set according to the acquired information of the SQL element; the ith key value pair in the first set is used for storing information of the SQL element obtained by the jth recursion, the key in the ith key value pair at least comprises the table name obtained by the jth recursion, the value in the ith key value pair at least comprises the field name obtained by the jth recursion, i is 1, 2, … N, j is 1, 2, … N, N is the recursion frequency, and i, j and N are positive integers;
the keys in the ith key value pair at least include the table names obtained by the jth recursion, and the table names include: keys in the ith key value pair comprise a first field, a second field and a third field, wherein the first field is used for storing a sub-query name or a first table name, the second field is used for storing a second table name, the third field is used for storing a table alias, and the second table name is a table name in the SQL program except the first table name;
the value in the ith key value pair at least includes the field name obtained by the jth recursion, and the method includes: any element in the value of the ith key-value pair comprises a fourth field, a fifth field and a sixth field, wherein the fourth field is used for saving a table alias, the fifth field is used for saving a field name, and the sixth field is used for saving a field alias; if the sub-query name and the first table name are not obtained in the jth recursion, the first field of the ith key value pair is represented by the second table name obtained in the jth recursion; if the jth recursion does not obtain the table alias, a third field and a fourth field of the ith key value pair are represented by a first set character identifier or the table name obtained by the jth recursion; if the field alias is not obtained in the jth recursion, the sixth field of the ith key value pair is represented by a second set character;
and generating a mapping relation according to the first set.
2. The method according to claim 1, wherein if the information of the SQL element obtained by the jth recursion further includes a sub-query name, a table alias and/or a field alias, the key in the ith key-value pair further includes the sub-query name and/or the table alias obtained by the jth recursion, and the value in the ith key-value pair further includes the table alias and/or the field alias obtained by the jth recursion.
3. The method of claim 2, wherein when the mapping is a table-level mapping, generating a mapping from the first set comprises:
analyzing the keys in the ith key value pair, if the keys of the ith key value pair comprise sub query names or first table names, determining the sub query names or the first table names in the keys of the ith key value pair as the keys of a second set, otherwise, determining the second table names in the keys of the ith key value pair as the keys of the second set; the first table name is a table name of a table created in the SQL program, the second table name is a table name included in the SQL program except the first table name, and the second set is used for storing the table-level mapping relationship;
analyzing values in the ith key value pair, when the value of the ith key value pair comprises a table alias, querying a third set according to the table alias included in the value of the ith key value pair, if the table name corresponding to the table alias included in the value of the ith key value pair is found, determining the table name corresponding to the table alias included in the value of the ith key value pair as the value corresponding to the key of the second set, otherwise, determining the table alias included in the value of the ith key value pair as the value corresponding to the key of the second set, wherein the third set is used for storing the corresponding relation between the table name and the table alias in the SQL program;
when the value in the ith key value pair does not include a table alias, determining a second table name included in the key of the ith key value pair as a value corresponding to the key of the second set.
4. The method of claim 2, wherein when the mapping is a table-level mapping, generating a mapping from the first set comprises:
analyzing keys in the ith key value pair, if the keys of the ith key value pair comprise sub query names or first table names, determining the sub query names or the first table names included in the keys of the ith key value pair as keys of a second set, otherwise, determining the second table names included in the keys of the ith key value pair as keys of the second set, wherein the first table names are table names of tables created in the SQL program, the second table names are table names except the first table names in the SQL program, and the second set is used for storing the table-level mapping relationship;
if the key of the ith key value pair comprises a first table name, analyzing the value of the ith key value pair, when the value of the ith key value pair comprises a table alias, querying a third set according to the table alias comprised in the value of the ith key value pair, if the table name corresponding to the table alias comprised in the value of the ith key value pair is found, determining the table name corresponding to the table alias comprised in the value of the ith key value pair as the value corresponding to the key of the second set, otherwise, determining the table alias comprised in the value of the ith key value pair as the value corresponding to the key of the second set, wherein the third set is used for storing the corresponding relation between the table name and the table alias in the SQL program; and when the keys in the ith key value pair do not comprise the first table name, determining a second table name included in the keys of the ith key value pair as the values corresponding to the keys of the second set.
5. The method of claim 2, wherein when the mapping relationship is a field-level mapping relationship, generating a mapping relationship from the first set comprises:
analyzing the ith key value pair, when the value of the ith key value pair comprises a table alias, inquiring a third set according to the table alias included in the value of the ith key value pair, if the table name corresponding to the table alias included in the value of the ith key value pair is found, determining the table name corresponding to the table alias included in the value of the ith key value pair as a source sub inquiry name/table name, otherwise, determining the table alias included in the value of the ith key value pair as the source inquiry sub inquiry name/table name; when the value of the ith key-value pair does not comprise a table alias, determining a second table name included in the key of the ith key-value pair as a source sub-query name/table name, wherein the second table name is a table name in the SQL program except a first table name, and the first table name is a table name of a table created in the SQL program;
if the key in the ith key value pair comprises a sub-query name or a first table name, determining the sub-query name or the first table name included in the key of the ith key value pair as a target sub-query name/table name corresponding to the source sub-query name/table name, otherwise, determining a second table name included in the key of the ith key value pair as the target sub-query name/table name;
determining a field name included in the value of the ith key value pair as a source field name;
if the value of the ith key value pair comprises a field alias, determining the field alias included in the value of the ith key value pair as a target field name corresponding to the source field name, otherwise, determining the field name included in the value of the ith key value pair as the target field name corresponding to the source field name;
and generating a field level mapping relation according to the target sub-query name/table name, the target field name, the source sub-query name/table name and the source field name.
6. The method of any of claims 3-5, wherein prior to querying the third set according to the table aliases included in the values of the ith key-value pair, further comprising:
resolving the key of the ith key-value pair, if the key of the ith key-value pair comprises a table alias, determining the table alias included in the key of the ith key-value pair as the key in the third set, otherwise, determining a second table name included in the key of the ith key-value pair as the key in the third set; and determining a second table name included in the key of the ith key value pair as a value corresponding to the key in the third set to obtain the third set.
7. An apparatus for generating a mapping relationship, comprising:
the syntax tree building module is used for carrying out syntax analysis on a Structured Query Language (SQL) program and building an abstract syntax tree AST of the SQL program;
the information acquisition module is used for recursively traversing the AST and acquiring information of SQL elements obtained by each recursion, wherein the information of the SQL elements at least comprises table names and field names;
the processing module is used for establishing and storing a first set according to the acquired information of the SQL element and generating a mapping relation according to the first set; the ith key value pair in the first set is used for storing information of the SQL element obtained by the jth recursion, the key in the ith key value pair at least comprises the table name obtained by the jth recursion, the value in the ith key value pair at least comprises the field name obtained by the jth recursion, i is 1, 2, … N, j is 1, 2, … N, N is the recursion frequency, and i, j and N are positive integers;
the keys in the ith key value pair at least include the table names obtained by the jth recursion, and the table names include: keys in the ith key value pair comprise a first field, a second field and a third field, wherein the first field is used for storing a sub-query name or a first table name, the second field is used for storing a second table name, the third field is used for storing a table alias, and the second table name is a table name in the SQL program except the first table name;
the value in the ith key value pair at least includes the field name obtained by the jth recursion, and the method includes: any element in the value of the ith key-value pair comprises a fourth field, a fifth field and a sixth field, wherein the fourth field is used for saving a table alias, the fifth field is used for saving a field name, and the sixth field is used for saving a field alias; if the sub-query name and the first table name are not obtained in the jth recursion, the first field of the ith key value pair is represented by the second table name obtained in the jth recursion; if the jth recursion does not obtain the table alias, a third field and a fourth field of the ith key value pair are represented by a first set character identifier or the table name obtained by the jth recursion; and if the field alias is not obtained in the jth recursion, the sixth field of the ith key value pair is represented by a second set character.
8. The apparatus of claim 7, wherein if the information of the SQL element obtained by the jth recursion further includes a sub-query name, a table alias, and/or a field alias, the key in the ith key-value pair further includes the sub-query name and/or the table alias obtained by the jth recursion, and the value in the ith key-value pair further includes the table alias and/or the field alias obtained by the jth recursion.
9. The apparatus of claim 8, wherein the processing module is specifically configured to:
analyzing the keys in the ith key value pair, if the keys of the ith key value pair comprise sub query names or first table names, determining the sub query names or the first table names included in the keys of the ith key value pair as the keys of a second set, otherwise, determining the second table names included in the keys of the ith key value pair as the keys of the second set; the first table name is a table name of a table created in the SQL program, the second table name is a table name included in the SQL program except the first table name, and the second set is used for storing a table-level mapping relation;
analyzing values in the ith key value pair, when the value of the ith key value pair comprises a table alias, querying a third set according to the table alias included in the value of the ith key value pair, if the table name corresponding to the table alias included in the value of the ith key value pair is found, determining the table name corresponding to the table alias included in the value of the ith key value pair as the value corresponding to the key of the second set, otherwise, determining the table alias included in the value of the ith key value pair as the value corresponding to the key of the second set, wherein the third set is used for storing the corresponding relation between the table name and the table alias in the SQL program;
when the value in the ith key value pair does not include a table alias, determining a second table name included in the key of the ith key value pair as a value corresponding to the key of the second set.
10. The apparatus of claim 8, wherein the processing module is specifically configured to:
analyzing keys in the ith key value pair, if the keys of the ith key value pair comprise sub query names or first table names, determining the sub query names or the first table names included in the keys of the ith key value pair as keys of a second set, otherwise, determining the second table names included in the keys of the ith key value pair as keys of the second set, wherein the first table names are table names of tables created in the SQL program, the second table names are table names except the first table names in the SQL program, and the second set is used for storing table-level mapping relations;
if the key of the ith key value pair comprises a first table name, analyzing the value of the ith key value pair, when the value of the ith key value pair comprises a table alias, querying a third set according to the table alias comprised in the value of the ith key value pair, if the table name corresponding to the table alias comprised in the value of the ith key value pair is found, determining the table name corresponding to the table alias comprised in the value of the ith key value pair as the value corresponding to the key of the second set, otherwise, determining the table alias comprised in the value of the ith key value pair as the value corresponding to the key of the second set, wherein the third set is used for storing the corresponding relation between the table name and the table alias in the SQL program; and when the keys in the ith key value pair do not comprise the first table name, determining a second table name included in the keys of the ith key value pair as the values corresponding to the keys of the second set.
11. The apparatus of claim 8, wherein the processing module is specifically configured to:
analyzing the ith key value pair, when the value of the ith key value pair comprises a table alias, inquiring a third set according to the table alias included in the value of the ith key value pair, if the table name corresponding to the table alias included in the value of the ith key value pair is found, determining the table name corresponding to the table alias included in the value of the ith key value pair as a source sub inquiry name/table name, otherwise, determining the table alias included in the value of the ith key value pair as the source inquiry sub inquiry name/table name; when the value of the ith key-value pair does not comprise a table alias, determining a second table name included in the key of the ith key-value pair as a source sub-query name/table name, wherein the second table name is a table name in the SQL program except a first table name, and the first table name is a table name of a table created in the SQL program;
if the key in the ith key value pair comprises a sub-query name or a first table name, determining the sub-query name or the first table name included in the key of the ith key value pair as a target sub-query name/table name corresponding to the source sub-query name/table name, otherwise, determining a second table name included in the key of the ith key value pair as the target sub-query name/table name;
determining a field name included in the value of the ith key value pair as a source field name;
if the value of the ith key value pair comprises a field alias, determining the field alias included in the value of the ith key value pair as a target field name corresponding to the source field name, otherwise, determining the field name included in the value of the ith key value pair as the target field name corresponding to the source field name;
and generating a field level mapping relation according to the target sub-query name/table name, the target field name, the source sub-query name/table name and the source field name.
12. The apparatus of any of claims 9-11, wherein the processing module is further configured to:
before querying a third set according to table aliases included in values of the ith key-value pair, resolving keys of the ith key-value pair, if the keys of the ith key-value pair include the table aliases, determining the table aliases included in the keys of the ith key-value pair as the keys in the third set, otherwise, determining second table names included in the keys of the ith key-value pair as the keys in the third set; and determining a second table name included in the key of the ith key value pair as a value corresponding to the key in the third set to obtain the third set.
CN201711091430.9A 2017-11-08 2017-11-08 Mapping relation generation method and device Active CN110147396B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711091430.9A CN110147396B (en) 2017-11-08 2017-11-08 Mapping relation generation method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711091430.9A CN110147396B (en) 2017-11-08 2017-11-08 Mapping relation generation method and device

Publications (2)

Publication Number Publication Date
CN110147396A CN110147396A (en) 2019-08-20
CN110147396B true CN110147396B (en) 2021-08-31

Family

ID=67588022

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711091430.9A Active CN110147396B (en) 2017-11-08 2017-11-08 Mapping relation generation method and device

Country Status (1)

Country Link
CN (1) CN110147396B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11507575B2 (en) * 2019-11-21 2022-11-22 Sap Se Complex query rewriting
CN117331926B (en) * 2023-12-01 2024-03-01 太平金融科技服务(上海)有限公司 Data auditing method and device, electronic equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103186541A (en) * 2011-12-27 2013-07-03 阿里巴巴集团控股有限公司 Generation method and device for mapping relationship
CN104899314A (en) * 2015-06-17 2015-09-09 北京京东尚科信息技术有限公司 Pedigree analysis method and device of data warehouse
CN105912595A (en) * 2016-04-01 2016-08-31 华南理工大学 Data origin collection method of relational databases
CN106547796A (en) * 2015-09-23 2017-03-29 南京中兴新软件有限责任公司 The execution method and device of data base

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8739118B2 (en) * 2010-04-08 2014-05-27 Microsoft Corporation Pragmatic mapping specification, compilation and validation

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103186541A (en) * 2011-12-27 2013-07-03 阿里巴巴集团控股有限公司 Generation method and device for mapping relationship
CN104899314A (en) * 2015-06-17 2015-09-09 北京京东尚科信息技术有限公司 Pedigree analysis method and device of data warehouse
CN106547796A (en) * 2015-09-23 2017-03-29 南京中兴新软件有限责任公司 The execution method and device of data base
CN105912595A (en) * 2016-04-01 2016-08-31 华南理工大学 Data origin collection method of relational databases

Also Published As

Publication number Publication date
CN110147396A (en) 2019-08-20

Similar Documents

Publication Publication Date Title
CN110908997B (en) Data blood relationship construction method and device, server and readable storage medium
CN110633292B (en) Query method, device, medium, equipment and system for heterogeneous database
US9870382B2 (en) Data encoding and corresponding data structure
US9600507B2 (en) Index structure for a relational database table
US8943059B2 (en) Systems and methods for merging source records in accordance with survivorship rules
US9323809B2 (en) System and methods for rapid data analysis
JP4955876B2 (en) Cost-based materialized view selection for query optimization
CN103810224B (en) information persistence and query method and device
CN104899295B (en) A kind of heterogeneous data source data relation analysis method
CN105868204A (en) Method and apparatus for converting script language SQL of Oracle
US11269880B2 (en) Retroreflective clustered join graph generation for relational database queries
CN109145003B (en) Method and device for constructing knowledge graph
CN110597844B (en) Unified access method for heterogeneous database data and related equipment
EP2862101B1 (en) Method and a consistency checker for finding data inconsistencies in a data repository
CN112579610A (en) Multi-data source structure analysis method, system, terminal device and storage medium
CN112015722A (en) Database management method, data blood relationship analysis method and related device
US9053207B2 (en) Adaptive query expression builder for an on-demand data service
CA2461871A1 (en) An efficient index structure to access hierarchical data in a relational database system
CN105095436A (en) Automatic modeling method for data of data sources
CN110147396B (en) Mapping relation generation method and device
US20110302220A1 (en) Sql processing for data conversion
CN110008231B (en) MySQL data backtracking method and storage medium
CN104636471A (en) Procedure code finding method and device
CN111460000A (en) Backtracking data query method and system based on relational database
CN112540978A (en) Wide table generation method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant