CN117493372A - SQL sentence generation method and device based on relational awareness - Google Patents


Publication number
CN117493372A
Authority
CN
China
Prior art keywords
sql
problem word
database
structure diagram
characteristic information
Prior art date
Legal status
Pending
Application number
CN202311549842.8A
Other languages
Chinese (zh)
Inventor
王俊荣
李邦明
李勇
肖颀
马学旭
辜希武
庞杰
吴君
陈朝旭
刘子平
Current Assignee
719th Research Institute Of China State Shipbuilding Corp
Original Assignee
719th Research Institute Of China State Shipbuilding Corp
Application filed by 719th Research Institute Of China State Shipbuilding Corp
Priority to CN202311549842.8A
Publication of CN117493372A

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a method and a device for generating SQL statements based on relation awareness. The method comprises: obtaining a problem word pattern diagram based on a problem word structure diagram and a database pattern structure diagram; obtaining target node characteristic information based on the problem word pattern diagram, a graph convolution network with relation perception and a neural network with relation perception; and generating an abstract syntax tree of SQL based on a decoder and the target node characteristic information, and traversing the abstract syntax tree depth-first to output a target SQL statement. The problem word structure diagram is obtained from the problem words produced by word segmentation of the target natural language problem; the database pattern structure diagram is obtained from the schema information of the database. The method thereby improves the expressive power of the representations of the problem and the database schema, and improves the accuracy of the generated SQL statements.

Description

SQL sentence generation method and device based on relational awareness
Technical Field
The invention relates to the technical field of natural language processing, in particular to a method and a device for generating SQL sentences based on relational awareness.
Background
With the popularity of electronic devices, databases have become a major tool for storing information. Relational databases organize data using the relational model: data is stored in rows and columns, which together form tables. Tables are the dominant way to store large-scale structured data, and users can query the information stored in them through Structured Query Language (SQL).
SQL statements are subject to strict syntax constraints, so the barrier to using them is high. While database professionals can write SQL statements to access table contents effectively, it is difficult for non-professionals to write correct SQL queries. Therefore, to improve the efficiency of information retrieval from databases and lower the barrier for users, research on converting natural language problems into machine-executable SQL statements has attracted wide attention in industry and academia. The task of generating the corresponding SQL statement from a natural language question is called the Text-to-SQL task.
The Text-to-SQL task lets non-professional users query the contents of a database easily, and has wide demand and research value in practical applications such as intelligent question answering services, voice assistants and robot navigation. The question posed by the user is automatically converted into an SQL statement, the query is executed in a back-end database to obtain an accurate retrieval result, and the result is returned to the user. Automatically generating SQL statements from natural language questions and executing them to obtain retrieval results can greatly reduce labor costs.
In the prior art, in generating SQL based on natural language problems, encoder-decoder architectures based on deep learning designs are often used. The encoder encodes the natural language problem and the known database mode information, and adds the relation information into the encoder to encode by constructing the relation information between the natural language problem and the database mode so as to capture the characteristic information of the natural language problem and the database mode; the decoder decodes the characteristic information generated by the encoder, predicts and generates an abstract syntax tree of the SQL sentence according to the step sequence, and further converts the abstract syntax tree into a corresponding SQL sentence.
The expressive power of this approach is insufficient, and the accuracy of the generated SQL statements is not high.
Disclosure of Invention
Aiming at the problems in the related art, the invention provides a method and a device for generating SQL sentences based on relational awareness, which are used for solving the defects of insufficient expression capability and low accuracy of the generated SQL sentences in the prior art, realizing the improvement of the expression capability of the problems and database modes and improving the accuracy of the generated SQL sentences.
The invention provides a SQL sentence generation method based on relational awareness, which comprises the following steps:
obtaining a problem word pattern diagram based on the problem word structure diagram and the database pattern structure diagram;
obtaining target node characteristic information based on the problem word pattern diagram, a graph convolution network with relation perception and a neural network with relation perception;
based on the decoder and the target node characteristic information, generating an abstract syntax tree of SQL, and traversing the abstract syntax tree of SQL according to depth to output a target SQL sentence;
the problem word structure diagram is obtained based on problem words obtained after word segmentation processing is carried out on the target natural language problem; the database pattern structure diagram is obtained according to schema information of the database.
According to the SQL sentence generation method based on relational awareness provided by the invention, the problem word pattern diagram is obtained based on a problem word structure diagram and a database pattern structure diagram, and the method concretely comprises the following steps:
respectively determining an explicit matching relationship and an implicit matching relationship between a problem word node in the problem word structure diagram and a list name node in the database mode structure diagram;
and connecting the problem word structure diagram with the database pattern structure diagram based on the explicit matching relation and the implicit matching relation to obtain the problem word pattern diagram.
According to the SQL sentence generation method based on relational awareness, the target node characteristic information is obtained based on the problem word pattern diagram, the graph convolution network with relational awareness and the neural network with relational awareness, and the method specifically comprises the following steps:
inputting the problem word pattern diagram into a diagram convolution network with relation perception to obtain first node characteristic information;
calculating the attention weights of the table name nodes and the problem word nodes, and determining probability values of the table name nodes mentioned by the target natural language problems based on the attention weights;
updating the characteristic information of the list name nodes based on the probability value to obtain second node characteristic information;
and inputting the second node characteristic information into a neural network with relation sensing to obtain target node characteristic information.
According to the method for generating the SQL sentence based on the relational awareness, which is provided by the invention, the abstract syntax tree of SQL is generated based on the characteristic information of the decoder and the target node, and the method specifically comprises the following steps:
inputting the target node characteristic information and the encoding of the decoder's previous action into the decoder to obtain the next action of the decoder;
and generating an abstract syntax tree of SQL according to the next action of the decoder.
According to the SQL sentence generation method based on relation awareness, the implicit matching relation between the problem word nodes in the problem word structure diagram and the list name nodes in the database mode structure diagram is determined, and the method specifically comprises the following steps:
and capturing an implicit matching relation between the problem word nodes in the problem word structure diagram and the list name nodes in the database mode structure diagram by utilizing an inference knowledge graph.
According to the SQL sentence generation method based on relational awareness provided by the invention, the generation mode of the problem word structure diagram comprises the following steps:
performing word segmentation processing on the target natural language problem through a Chinese word segmentation tool to obtain the problem word;
acquiring the dependency relationship among the problem words through a dependency analysis tool;
and taking the problem word as a node, taking the dependency relationship as an edge, and generating the problem word structure diagram.
The invention also provides a SQL sentence generating device based on relational awareness, which comprises:
the structure diagram module is used for obtaining a problem word pattern diagram based on the problem word structure diagram and the database pattern structure diagram;
the feature module is used for obtaining feature information of the target node based on the problem word pattern diagram, the graph convolution network with relation perception and the neural network with relation perception;
the statement module is used for generating an abstract syntax tree of SQL based on the decoder and the target node characteristic information, and traversing the abstract syntax tree of SQL according to depth to output a target SQL statement;
the problem word structure diagram is obtained by carrying out word segmentation processing on a target natural language problem to obtain problem words and obtaining dependency relations among the problem words; the database schema structure is obtained according to schema information of the database.
The invention also provides electronic equipment, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor realizes the SQL sentence generation method based on the relational awareness when executing the program.
The present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a relational awareness based SQL statement generation method as described in any of the above.
The invention also provides a computer program product comprising a computer program which when executed by a processor implements a relational awareness based SQL statement generation method as described in any one of the above.
According to the SQL sentence generation method and device based on relational perception, the problem words are obtained through word segmentation processing on the target natural language problem, the problem word structure diagram is generated, the problem word structure diagram and the database mode structure diagram are fused to obtain the problem word mode diagram, then the target node characteristic information is obtained through a graph convolution network with relational perception and a neural network with relational perception on the problem word mode diagram, the abstract syntax tree of SQL is further decoded and generated, the target SQL sentence is obtained through deep traversal, the structural information of the target natural language problem is fully considered, and the structural information of the target natural language problem is fully fused with the database structural information, so that the expression capability of the natural language problem and the database mode is enhanced, and more accurate SQL sentences can be generated.
Drawings
In order to more clearly illustrate the invention or the technical solutions of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are some embodiments of the invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a SQL sentence generation method based on relational awareness;
FIG. 2 is a schematic flow chart of an embodiment of the present invention;
FIG. 3 is a schematic diagram of a structure of the SQL sentence generating device based on relational awareness;
FIG. 4 is a schematic structural diagram of an electronic device provided by the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The existing method in the process of generating SQL based on natural language problems has the following defects:
1) The existing method is basically realized based on an encoder-decoder, the encoder part utilizes modeling structure information of the problem and the database mode to enhance the representation capability, and finally, the characteristic information of the natural language problem and the database mode is obtained. Modeling structure information is roughly divided into three parts: the connection structure of the natural language problem and the database mode, the internal structure of the database mode and the natural language problem structure. The existing method mainly concentrates modeling structure information on a connection structure and a database internal structure, and ignores a structure of natural language problems;
2) In practical applications a database contains many tables and columns, and existing methods find it increasingly difficult to select the proper tables or columns in a large-scale database when encoding and when predicting and generating the SQL statement;
3) Some studies find that existing methods are not robust: when the natural language problem or the table names and column names of the database are perturbed, the performance of the method is seriously affected.
Aiming at the problems, the invention provides a SQL sentence generation method and device based on relational awareness.
Fig. 1 is a flow chart of an SQL statement generating method based on relational awareness, which is provided by the invention, as shown in fig. 1, and the method comprises the following steps:
Step 100, obtaining a problem word pattern diagram based on the problem word structure diagram and the database pattern structure diagram.
Step 101, obtaining target node characteristic information based on the problem word pattern diagram, the graph convolution network with relation perception and the neural network with relation perception.
Step 102, generating an abstract syntax tree of SQL based on the decoder and the target node characteristic information, and traversing the abstract syntax tree of SQL depth-first to output a target SQL statement.
The problem word structure diagram is obtained based on problem words obtained after word segmentation processing is carried out on the target natural language problem; the database schema structure is derived from schema information of the database.
Specifically, the target natural language problem refers to a natural language problem that needs to be converted into an SQL statement, and the target SQL statement is the SQL statement into which the target natural language problem is converted by the SQL statement generation method provided by the invention.
During encoding, the prior art focuses mainly on the linking relations between the natural language problem and the database schema and on the relations inside the database schema, but ignores the structural relations inside the natural language problem itself. The embodiment of the invention therefore builds a problem word structure diagram for the target natural language problem.
Optionally, the generating mode of the problem word structure diagram includes:
performing word segmentation processing on the target natural language problem through a Chinese word segmentation tool to obtain a problem word;
acquiring the dependency relationship between the problem words through a dependency analysis tool;
and taking the problem word as a node, taking the dependency relationship as an edge, and generating a problem word structure diagram.
Specifically, in the process of generating a problem word structure diagram, word segmentation processing can be performed on the target natural language problem through a Chinese word segmentation tool to obtain the problem words. The Chinese word segmentation tool in the embodiment of the invention can be Jieba, SnowNLP, LTP, HanLP and the like, and the invention does not limit the type of Chinese word segmentation tool.
After the problem words are obtained, the dependency relations among them can be obtained through a dependency analysis tool. Dependency analysis expresses the structure of a sentence through the relations between each word and the other words, showing which words depend on which.
After the dependency relations among the problem words are acquired, the problem words can be used as nodes, the dependency relations are used as edges, and a problem word structure diagram is generated.
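The graph construction described above can be sketched in a few lines of Python. This is a minimal illustration rather than the patented implementation: the function name `build_question_graph`, the example sentence, and the dependency labels (`VOB`, `ATT`, `RAD`) are assumptions, and the segmenter and parser outputs are hard-coded instead of being produced by a real tool.

```python
from collections import defaultdict


def build_question_graph(words, arcs):
    """Build a problem word structure graph.

    words: segmented problem words; each word is a node, indexed by position.
    arcs:  (head_index, dependent_index, label) dependency triples (the edges).
    Returns an adjacency map: node index -> list of (neighbor index, label).
    """
    graph = defaultdict(list)
    for idx in range(len(words)):
        graph[idx]  # touch the key so isolated words still appear as nodes
    for head, dep, label in arcs:
        if not (0 <= head < len(words) and 0 <= dep < len(words)):
            raise ValueError(f"arc ({head}, {dep}) points outside the sentence")
        graph[head].append((dep, label))
    return dict(graph)


# Hypothetical segmenter and parser output for
# "查询 所有 学生 的 姓名" ("query the names of all students").
words = ["查询", "所有", "学生", "的", "姓名"]
arcs = [(0, 4, "VOB"), (4, 2, "ATT"), (2, 1, "ATT"), (2, 3, "RAD")]
question_graph = build_question_graph(words, arcs)
```

Storing edges with their labels keeps the dependency type available to a later relation-aware encoder, which treats each label as a distinct relation.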
The embodiment of the invention also needs to obtain a database pattern structure diagram from the schema information of the database: the list names in the schema information (that is, the table names and the column names, which are treated as one concept throughout and not distinguished again below) are used as nodes, and the membership relations between tables and columns, the primary key-foreign key relations and the like are used as edges, to generate the database pattern structure diagram.
After the problem word structure diagram and the database pattern structure diagram are respectively generated, the two diagrams can be fused: the connection relation between each node in the problem word structure diagram and each node in the database pattern structure diagram is analyzed, so that the problem word pattern diagram is obtained.
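A schema graph of this kind might be assembled as follows. The sketch assumes a schema given as a plain dictionary; the function name `build_schema_graph`, the edge labels `belongs_to` and `foreign_key`, and the example tables are all hypothetical.

```python
def build_schema_graph(schema, foreign_keys):
    """Build a database schema graph.

    schema:       {table name: [column names]}
    foreign_keys: [((table, column), (referenced table, referenced column))]
    Returns (nodes, edges); edges are (source, target, relation) triples.
    """
    nodes = []
    edges = []
    for table, columns in schema.items():
        nodes.append(("table", table))
        for column in columns:
            qualified = f"{table}.{column}"
            nodes.append(("column", qualified))
            # Membership relation between a column and its table.
            edges.append((qualified, table, "belongs_to"))
    for (src_table, src_col), (dst_table, dst_col) in foreign_keys:
        # Primary key-foreign key relation between two columns.
        edges.append(
            (f"{src_table}.{src_col}", f"{dst_table}.{dst_col}", "foreign_key")
        )
    return nodes, edges


schema = {"student": ["id", "name"], "score": ["student_id", "grade"]}
foreign_keys = [(("score", "student_id"), ("student", "id"))]
schema_nodes, schema_edges = build_schema_graph(schema, foreign_keys)
```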
After the problem word pattern diagram is obtained, the characteristic information of each node in the problem word pattern diagram, namely the characteristic information of the target node, can be obtained based on the problem word pattern diagram, the graph convolution network with relation perception and the neural network with relation perception.
In one embodiment, the problem word pattern graph can sequentially pass through a graph convolution network with relational awareness and a neural network with relational awareness to obtain the target node characteristic information.
The graph convolution network with relation perception in the embodiment of the invention can be a Relational Graph Convolutional Network (R-GCN), and the neural network with relation perception can be a Relation-Aware Transformer (RAT).
After the target node characteristic information is obtained, it can be decoded by a decoder to generate an abstract syntax tree of SQL, and finally the abstract syntax tree is traversed depth-first to output the target SQL statement.
According to the SQL sentence generation method based on relational perception, problem words are obtained through word segmentation processing on target natural language problems, a problem word structure diagram is generated, the problem word structure diagram and a database mode structure diagram are fused to obtain a problem word mode diagram, then the problem word mode diagram is subjected to graph convolution network with relational perception and neural network with relational perception to obtain target node characteristic information, and further abstract grammar tree of SQL is decoded and generated, the target SQL sentence is obtained through deep traversal, the structural information of the target natural language problem is fully considered, and the problem word structure diagram is fully fused with the database structural information, so that the expression capability of the natural language problem and the database mode is enhanced, and more accurate SQL sentences can be generated.
According to the SQL sentence generation method based on relational awareness, which is provided by the invention, a problem word pattern diagram is obtained based on a problem word structure diagram and a database pattern structure diagram, and the method specifically comprises the following steps:
respectively determining an explicit matching relationship and an implicit matching relationship between a problem word node in a problem word structure diagram and a list name node in a database mode structure diagram;
and connecting the problem word structure diagram with the database pattern structure diagram based on the explicit matching relation and the implicit matching relation to obtain a problem word pattern diagram.
Specifically, in the process of obtaining the problem word pattern graph according to the problem word structure graph and the database pattern structure graph, an explicit matching relationship between the problem word nodes in the problem word structure graph and the list name nodes in the database pattern structure graph can be determined first, and in some embodiments, the matching probability between the problem word and the list name can be calculated by using an n-gram method, so that the explicit matching relationship between the problem word nodes and the list name nodes can be determined.
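One way to picture n-gram based explicit matching is the sketch below. The text does not say how the matching probability is computed, so the Jaccard overlap of character bigrams used here is a stand-in, and the function names and the threshold of 0.5 are assumptions.

```python
def char_ngrams(text, n=2):
    """Return the set of character n-grams of a lower-cased string."""
    text = text.lower()
    if len(text) < n:
        return {text} if text else set()
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def ngram_match_score(word, name, n=2):
    """Jaccard overlap of character n-grams, a value in [0, 1]."""
    a, b = char_ngrams(word, n), char_ngrams(name, n)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def explicit_matches(question_words, schema_names, threshold=0.5):
    """List (word, name, score) triples whose score reaches the threshold."""
    return [
        (w, s, round(ngram_match_score(w, s), 3))
        for w in question_words
        for s in schema_names
        if ngram_match_score(w, s) >= threshold
    ]


matches = explicit_matches(["students", "name"], ["student", "name", "grade"])
```

Each surviving triple would become an explicit-match edge between a problem word node and a list name node.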
An implicit matching relation between the problem word nodes in the problem word structure diagram and the list name nodes in the database pattern structure diagram can also be determined. In some embodiments, knowledge graph reasoning may be used: unknown facts or relations are inferred from the facts or relations already present in the knowledge graph, thereby capturing implicit matches between problem word nodes and list name nodes, such as hyponyms and synonyms.
Because the embodiment of the invention takes the implicit matching relation into account, the link between the problem and the database schema is better preserved, the algorithm is less likely to miss relations between words that share a meaning but differ in form, and the robustness of the model is enhanced.
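The linking step can be illustrated with a toy lookup table standing in for a knowledge graph. Real knowledge graph reasoning is far richer than a dictionary lookup; the `KNOWLEDGE` entries, relation names, and function names below are hypothetical and serve only to show how explicit and implicit matches become cross edges in one fused graph.

```python
# Toy stand-in for a knowledge graph: word -> [(related term, relation)].
KNOWLEDGE = {
    "pupil": [("student", "synonym")],
    "mark": [("grade", "synonym")],
    "sophomore": [("student", "hypernym")],
}


def implicit_matches(question_words, schema_names):
    """Find problem words related to a schema name through the lookup table."""
    names = set(schema_names)
    return [
        (word, target, relation)
        for word in question_words
        for target, relation in KNOWLEDGE.get(word, [])
        if target in names
    ]


def link_graphs(question_edges, schema_edges, explicit, implicit):
    """Union of both graphs plus cross edges labelled by match type."""
    cross = [(w, s, "explicit_match") for w, s in explicit]
    cross += [(w, s, f"implicit_{rel}") for w, s, rel in implicit]
    return list(question_edges) + list(schema_edges) + cross


implicit = implicit_matches(["pupil", "mark", "list"], ["student", "grade"])
linked = link_graphs(
    question_edges=[("list", "mark", "dep")],
    schema_edges=[("grade", "student", "belongs_to")],
    explicit=[],
    implicit=implicit,
)
```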
According to the SQL sentence generation method based on relational awareness, provided by the invention, the target node characteristic information is obtained based on a problem word pattern diagram, a graph convolution network with relational awareness and a neural network with relational awareness, and the method specifically comprises the following steps:
inputting the problem word pattern diagram into a graph convolution network with relation perception to obtain first node characteristic information;
calculating the attention weights of the table name node and the problem word node, and determining the probability value of the table name node mentioned by the target natural language problem based on the attention weights;
updating the characteristic information of the list name nodes based on the probability value to obtain second node characteristic information;
and inputting the second node characteristic information into a neural network with relation awareness to obtain the target node characteristic information.
In the process of obtaining the target node characteristic information based on the problem word pattern diagram, the graph convolution network with relation perception and the neural network with relation perception, the problem word pattern diagram can first be input into the graph convolution network with relation perception, for example an R-GCN. For any node, its characteristic information can be obtained from the information of its adjacent nodes and of its edges (that is, its relations with other nodes), so that the first node characteristic information of the whole problem word pattern diagram is obtained.
Let the problem word node vector set be $Q = \{q_i\}$, where $q_i$ represents the feature vector of the i-th problem word node, and let the database schema node vector set be $S = \{s_j\}$, where $s_j$ represents the feature vector of the j-th database schema node, namely the feature vector of a list name node;
inputting the problem word pattern diagram into an n-layer graph convolution network with relation perception (R-GCN), the update formula for obtaining the first node characteristic information is as follows:
$$h_i^{(l+1)} = \sigma\left(\sum_{r \in R} \sum_{j \in N_i^r} \frac{1}{c_{i,r}} W_r^{(l)} h_j^{(l)} + W_0^{(l)} h_i^{(l)}\right)$$
wherein $h_i^{(l+1)}$ represents the characteristic information of the i-th node in the graph at layer $l+1$, $r$ represents an edge relation in the graph, $R$ represents the set of all possible edge relations, $N_i^r$ represents the set of nodes pointing to the i-th node under relation $r$, $c_{i,r}$ is the number of nodes in $N_i^r$, $W_r^{(l)}$ and $W_0^{(l)}$ are trainable parameter matrices, and $\sigma$ is an activation function (ReLU).
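The symbols defined above correspond to the standard R-GCN layer: each node sums the messages from its incoming neighbors per relation, normalized by the neighbor count, adds a self-connection term, and applies ReLU. The pure-Python sketch below shows one such layer on a two-node graph; the function names and the toy weights (identity matrices) are assumptions chosen to keep the arithmetic easy to follow.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a column vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rgcn_layer(features, edges, rel_weights, self_weight):
    """One R-GCN update.

    features:    {node: feature vector of the current layer}
    edges:       (source, target, relation) triples; source points to target
    rel_weights: {relation: weight matrix for that relation}
    self_weight: weight matrix of the self-connection
    """
    # Group the incoming neighbors of every node by relation.
    incoming = {}
    for src, dst, rel in edges:
        incoming.setdefault((dst, rel), []).append(src)

    updated = {}
    for node, h in features.items():
        total = matvec(self_weight, h)  # self-connection term
        for (dst, rel), sources in incoming.items():
            if dst != node:
                continue
            count = len(sources)  # normalization constant for this relation
            for src in sources:
                message = matvec(rel_weights[rel], features[src])
                total = [t + m / count for t, m in zip(total, message)]
        updated[node] = [max(0.0, t) for t in total]  # ReLU
    return updated


identity = [[1.0, 0.0], [0.0, 1.0]]
features = {"q": [1.0, -2.0], "t": [0.5, 0.5]}
edges = [("q", "t", "match")]
out = rgcn_layer(features, edges, {"match": identity}, identity)
```

Stacking several such layers lets information travel along longer paths, for example from a problem word through a column to its table.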
Then, the attention weight between each table name node and each problem word node in the problem word pattern diagram can be calculated, that is, how much attention the edge between each pair of such nodes should receive, so that the probability value of each table name node being mentioned by the target natural language problem can be obtained. The specific calculation formula is as follows:
$$e_{ij} = q_i W_Q \left(s_j W_k + r_{ij}\right)^{T}$$
wherein $q_i$ represents the feature vector of the i-th problem word node; $s_j$ represents the feature vector of the j-th database schema node, namely the feature vector of a table name node; $\alpha_{n \times m}$ represents the attention weights between the problem word nodes and the list name nodes; $W_Q$ and $W_k$ are trainable weight matrices; $r_{ij}$ is the relation vector corresponding to the i-th problem word node and the j-th database schema node; $u_j$ is the highest probability that the j-th node in the database schema node vector set $S$ is mentioned by the target natural language problem; and $u_j$ after normalization represents the normalized maximum probability.
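The score $e_{ij}$ and the per-node mention probability can be traced with a small numeric sketch. Only the score formula is taken from the text; the softmax over schema nodes and the division by the largest $u_j$ used for normalization are assumptions, since those formulas are not shown here, and all function names are hypothetical.

```python
import math


def matmul_vec(vector, matrix):
    """Multiply a row vector by a matrix (list of rows)."""
    return [
        sum(v * matrix[k][c] for k, v in enumerate(vector))
        for c in range(len(matrix[0]))
    ]


def attention_scores(questions, schemas, w_q, w_k, relations):
    """scores[i][j] = (q_i W_Q) . (s_j W_k + r_ij)"""
    scores = []
    for i, q in enumerate(questions):
        q_proj = matmul_vec(q, w_q)
        row = []
        for j, s in enumerate(schemas):
            key = [a + b for a, b in zip(matmul_vec(s, w_k), relations[i][j])]
            row.append(sum(a * b for a, b in zip(q_proj, key)))
        scores.append(row)
    return scores


def softmax(row):
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [x / total for x in exps]


def mention_probabilities(scores):
    # Assumption: attention weights are a softmax over the schema nodes.
    alpha = [softmax(row) for row in scores]
    # u_j: highest weight any problem word assigns to schema node j.
    u = [max(alpha[i][j] for i in range(len(alpha))) for j in range(len(alpha[0]))]
    # Assumption: normalize by the largest u_j.
    peak = max(u)
    return [x / peak for x in u]


identity = [[1.0, 0.0], [0.0, 1.0]]
questions = [[1.0, 0.0], [0.0, 1.0]]
schemas = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
no_relation = [[[0.0, 0.0]] * 3 for _ in questions]
scores = attention_scores(questions, schemas, identity, identity, no_relation)
probs = mention_probabilities(scores)
```

In this toy input the third schema node matches no problem word, so its normalized probability is the lowest of the three.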
A high probability value indicates a high probability of occurrence and a low probability value indicates a low probability of occurrence. Therefore, after obtaining the probability value of each table name node mentioned by the target natural language problem, the characteristic information of the table name node can be updated according to the probability value. In the updating process, the expression capacity of the list name nodes with high occurrence probability is enhanced, the expression capacity of the list name nodes with low occurrence probability is weakened, and finally the second node characteristic information is obtained. The specific calculation formula is as follows:
wherein the result of the update represents the updated second node characteristic information.
After the second node characteristic information is obtained, the second node characteristic information can be input into a neural network with relation awareness, such as a RAT, the characteristic information of the nodes is further updated by fully utilizing the information of edges (namely the relation among the nodes) among the nodes, and finally the target node characteristic information is obtained. The specific update formula is as follows:
wherein $x_i$ and $x_j$ represent the second node characteristic information; $H$ is the number of heads of the multi-head attention; $\alpha_{ij}^{(h)}$ represents the attention weight between the i-th node and the j-th node in the h-th head; $W_Q^{(h)}$, $W_K^{(h)}$ and $W_V^{(h)}$ represent the trainable weight matrices of the h-th head; $r_{ij}$ is the relation vector corresponding to the i-th node and the j-th node; $d_z$ is the dimension of the feature vector; $z_i^{(h)}$ is the characteristic information of the i-th node after the update by the h-th head; $z_i$ is the feature vector of dimension $d_z$ formed by concatenating the characteristic information of the $H$ heads; $y_i$ represents the characteristic information finally obtained for the i-th node; LayerNorm(·) is the layer normalization method; FC(·) represents a fully connected layer; and ReLU(·) represents an activation function.
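For orientation, the symbols listed above match the relation-aware self-attention layer as it is commonly formulated for relation-aware transformers. The block below writes out that common form; it is offered as a reference sketch, not as the exact equations of this document. The intermediate symbol $\tilde{y}_i$, the scaling by $\sqrt{d_z / H}$, and the use of the same $r_{ij}$ in both the key and the value terms are assumptions.

```latex
\begin{aligned}
e_{ij}^{(h)} &= \frac{x_i W_Q^{(h)} \left(x_j W_K^{(h)} + r_{ij}\right)^{T}}{\sqrt{d_z / H}}, \\
\alpha_{ij}^{(h)} &= \operatorname{softmax}_j\left(e_{ij}^{(h)}\right), \\
z_i^{(h)} &= \sum_{j} \alpha_{ij}^{(h)} \left(x_j W_V^{(h)} + r_{ij}\right), \\
z_i &= \operatorname{Concat}\left(z_i^{(1)}, \ldots, z_i^{(H)}\right), \\
\tilde{y}_i &= \operatorname{LayerNorm}\left(x_i + z_i\right), \\
y_i &= \operatorname{LayerNorm}\left(\tilde{y}_i + \operatorname{FC}\left(\operatorname{ReLU}\left(\operatorname{FC}\left(\tilde{y}_i\right)\right)\right)\right).
\end{aligned}
```

The relation vector $r_{ij}$ is what distinguishes this layer from ordinary self-attention: it biases both the attention score and the aggregated value by the type of edge between the two nodes.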
Therefore, the probability of selecting correct table columns for decoding and generating SQL sentences can be improved under the condition that the number of the table columns in the database is large.
According to the SQL sentence generation method based on relational awareness, which is provided by the invention, the abstract syntax tree of SQL is generated based on the characteristic information of the decoder and the target node, and the method concretely comprises the following steps:
inputting the target node characteristic information and the encoding of the decoder's previous action into the decoder to obtain the next action of the decoder;
and generating an abstract syntax tree of SQL according to the next action of the decoder.
Specifically, in the process of decoding the target node characteristic information to generate an abstract syntax tree of SQL, the target node characteristic information and the encoding of the decoder's previous action may be input to the decoder, which predicts the next action to execute. An action may be selecting a table, selecting a column, or applying a predefined SQL grammar rule.
The decoder can therefore decode according to its previous action and the target node characteristic information, and predict the next action. Each action is in effect one step in constructing the abstract syntax tree of SQL, so when the decoder finishes predicting, the abstract syntax tree of SQL is complete.
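How a sequence of decoder actions grows a tree, and how a depth-first traversal turns that tree into SQL text, can be shown with a toy grammar. The action vocabulary (`rule`, `table`, `column`, `close`), the `Node` class, and the two supported clauses are assumptions for illustration; the action sequence is hard-coded instead of being predicted by a trained decoder.

```python
class Node:
    def __init__(self, kind, value=None):
        self.kind = kind
        self.value = value
        self.children = []


def apply_actions(actions):
    """Build an abstract syntax tree from (action, argument) pairs.

    "rule"             opens a grammar node that expects children,
    "table" / "column" attach a leaf to the innermost open node,
    "close"            finishes the innermost open node.
    """
    root = Node("sql")
    stack = [root]
    for action, arg in actions:
        if action == "rule":
            node = Node(arg)
            stack[-1].children.append(node)
            stack.append(node)
        elif action in ("table", "column"):
            stack[-1].children.append(Node(action, arg))
        elif action == "close":
            stack.pop()
        else:
            raise ValueError(f"unknown action: {action}")
    return root


def to_sql(node):
    """Depth-first traversal that emits SQL text."""
    if node.kind in ("table", "column"):
        return node.value
    parts = [to_sql(child) for child in node.children]
    if node.kind == "select":
        return "SELECT " + ", ".join(parts)
    if node.kind == "from":
        return "FROM " + ", ".join(parts)
    return " ".join(parts)  # root node: join the clauses in order


actions = [
    ("rule", "select"), ("column", "name"), ("close", None),
    ("rule", "from"), ("table", "student"), ("close", None),
]
sql = to_sql(apply_actions(actions))
```

Because every action corresponds to a grammar rule or a schema item, a tree built this way is syntactically valid by construction, which is the usual motivation for decoding into a syntax tree instead of raw text.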
The SQL sentence generation method based on the relational awareness provided by the invention is further described below through an embodiment in a specific application scene.
Fig. 2 is a schematic flow chart of an embodiment provided in the present invention, as shown in fig. 2, the embodiment specifically includes the following steps:
(1) A Chinese word segmentation tool is used to segment the natural language problem into problem words;
(2) A dependency analysis tool is used to obtain the dependency relations between the problem words; with the words as nodes and the dependency relations as edges, this yields the problem structure diagram (i.e., the problem word structure diagram);
(3) According to the schema information of the database, the table names/column names are taken as nodes, and relations such as table-column membership and primary key-foreign key relations are taken as edges, yielding the database pattern structure diagram;
(4) The explicit matching relation between the problem words and the table names/column names is calculated with an n-gram method, and the implicit matching relation between them is captured with an inference knowledge graph; these matching relations link the problem structure diagram and the database pattern structure diagram into a single diagram, the problem-pattern diagram (i.e., the problem word pattern diagram);
(5) The problem-pattern diagram is input into a graph convolution network with relation perception (R-GCN layer), which updates the characteristic information of each node using the information of its neighbor nodes and of the relations (edges), yielding preliminary node characteristic information;
(6) The attention weights between the table/column nodes and the problem word nodes are calculated to obtain the probability that each table/column node is mentioned by the problem; the characteristic information of the table/column nodes is updated according to this probability, weakening the representation of tables/columns that are unlikely to appear;
(7) The updated node information is input into a neural network with relation perception (RAT layer), which makes full use of the relation (edge) information to update the characteristic information of the nodes, yielding the final characteristic information;
(8) The node characteristic information is used as part of the decoder input; the decoder predicts the next action from the previous action and the node characteristic information (for example, a select-table action, a select-column action, or an action that applies a predefined SQL rule), and finally generates the abstract syntax tree of SQL;
(9) The abstract syntax tree of SQL is traversed depth-first and output to obtain the generated SQL.
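Step (9) can be illustrated with a small depth-first traversal that turns a tree into SQL text. This is a sketch under stated assumptions: the tree is a nested dictionary, and the node kinds (`sql`, `select`, `from`, `where`, `eq`, `column`, `table`, `value`) are hypothetical names chosen for the example, not the grammar used by the invention.

```python
def to_sql(node):
    """Depth-first, left-to-right traversal that emits SQL text."""
    kind = node["kind"]
    if kind in ("column", "table", "value"):
        return node["value"]  # leaf: emit the stored token
    # Visit the children first, in order, then combine their text.
    parts = [to_sql(child) for child in node.get("children", [])]
    if kind == "sql":
        return " ".join(parts)
    if kind == "select":
        return "SELECT " + ", ".join(parts)
    if kind == "from":
        return "FROM " + ", ".join(parts)
    if kind == "where":
        return "WHERE " + " AND ".join(parts)
    if kind == "eq":
        return f"{parts[0]} = {parts[1]}"
    raise ValueError(f"unknown node kind: {kind!r}")

example_ast = {"kind": "sql", "children": [
    {"kind": "select", "children": [{"kind": "column", "value": "name"}]},
    {"kind": "from", "children": [{"kind": "table", "value": "file"}]},
    {"kind": "where", "children": [
        {"kind": "eq", "children": [
            {"kind": "column", "value": "label"},
            {"kind": "value", "value": "'report'"},
        ]},
    ]},
]}

print(to_sql(example_ast))  # SELECT name FROM file WHERE label = 'report'
```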
Through the above steps, the embodiment generates the corresponding SQL sentence from the natural language problem input by the user, and the SQL sentence is then used to query the database to obtain the content the user needs.
The method provided by the invention has been verified in several application scenarios, such as an intelligent question-answering service in a two-loop system: users query the database through natural language questions, and the service assembles the data and returns the relevant answers (for example, querying the basic information of related files by label), thereby improving information retrieval efficiency.
The relational awareness based SQL sentence generating device provided by the invention is described below, and the relational awareness based SQL sentence generating device described below and the relational awareness based SQL sentence generating method described above can be referred to correspondingly.
Fig. 3 is a schematic structural diagram of an SQL statement generating device based on relational awareness, where, as shown in fig. 3, the device includes:
a structure diagram module 300, configured to obtain a problem word pattern diagram based on the problem word structure diagram and the database pattern structure diagram;
a feature module 310, configured to obtain target node characteristic information based on the problem word pattern diagram, the graph convolution network with relation perception and the neural network with relation perception;
a statement module 320, configured to generate an abstract syntax tree of SQL based on the decoder and the target node characteristic information, and to traverse the abstract syntax tree of SQL according to depth to output a target SQL statement;
wherein the problem word structure diagram is obtained by performing word segmentation on a target natural language problem to obtain problem words and obtaining the dependency relations among the problem words; the database pattern structure diagram is obtained from schema information of the database.
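To make the database pattern structure diagram concrete, the sketch below builds one from a small schema description: table and column names become nodes, and table-column membership, primary key, and foreign key relations become labelled edges. The input format, the edge labels, and the example schema are assumptions made for illustration.

```python
def build_schema_graph(schema):
    """schema maps each table name to a dict with "columns", an optional
    "primary_key", and optional "foreign_keys" {column: (table, column)}."""
    nodes, edges = [], []
    for table, info in schema.items():
        table_node = ("table", table)
        nodes.append(table_node)
        for col in info["columns"]:
            col_node = ("column", f"{table}.{col}")
            nodes.append(col_node)
            edges.append((table_node, "has-column", col_node))
        pk = info.get("primary_key")
        if pk is not None:
            edges.append((("column", f"{table}.{pk}"), "primary-key-of", table_node))
    # Foreign keys are added in a second pass so every column node exists.
    for table, info in schema.items():
        for col, (ref_table, ref_col) in info.get("foreign_keys", {}).items():
            edges.append((
                ("column", f"{table}.{col}"),
                "foreign-key-to",
                ("column", f"{ref_table}.{ref_col}"),
            ))
    return nodes, edges

example_schema = {
    "file": {"columns": ["id", "name"], "primary_key": "id"},
    "label": {
        "columns": ["id", "file_id", "text"],
        "primary_key": "id",
        "foreign_keys": {"file_id": ("file", "id")},
    },
}
schema_nodes, schema_edges = build_schema_graph(example_schema)
```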
According to the SQL sentence generating device based on relational awareness, which is provided by the invention, a problem word pattern diagram is obtained based on a problem word structure diagram and a database pattern structure diagram, and the device specifically comprises the following steps:
respectively determining an explicit matching relationship and an implicit matching relationship between a problem word node in the problem word structure diagram and a table/column name node in the database pattern structure diagram;
and connecting the problem word structure diagram with the database pattern structure diagram based on the explicit matching relation and the implicit matching relation to obtain a problem word pattern diagram.
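The explicit matching relation can be illustrated with a simple n-gram comparison between spans of problem words and table/column names. The sketch below is one possible reading, not the invention's exact procedure: the normalization rule, the `exact`/`partial` labels, and the maximum n-gram length are assumptions. The implicit matching via an inference knowledge graph is not shown.

```python
def normalize(name):
    # Assumed normalization: lowercase, drop underscores and spaces.
    return name.lower().replace("_", "").replace(" ", "")

def ngram_matches(words, schema_names, max_n=3):
    """Return {(start, end, name): label} for word spans [start, end),
    where label is "exact" or "partial"."""
    matches = {}
    for n in range(min(max_n, len(words)), 0, -1):
        for start in range(len(words) - n + 1):
            gram = normalize("".join(words[start:start + n]))
            if not gram:
                continue
            for name in schema_names:
                target = normalize(name)
                if gram == target:
                    matches[(start, start + n, name)] = "exact"
                elif gram in target:
                    matches[(start, start + n, name)] = "partial"
    return matches

example_words = ["list", "file", "name", "by", "label"]
example_names = ["file", "file_name", "label"]
example_matches = ngram_matches(example_words, example_names)
```

Each entry of the result can then be added as an edge between the matched problem word nodes and the corresponding table/column name node, which is what links the two diagrams into one.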
According to the SQL sentence generating device based on the relational awareness, provided by the invention, the target node characteristic information is obtained based on the problem word pattern diagram, the graph convolution network with the relational awareness and the neural network with the relational awareness, and the device concretely comprises the following steps:
inputting the problem word pattern diagram into a graph convolution network with relation perception to obtain first node characteristic information;
calculating the attention weights between the table/column name nodes and the problem word nodes, and determining, based on the attention weights, the probability value that each table/column name node is mentioned by the target natural language problem;
updating the characteristic information of the table/column name nodes based on the probability value to obtain second node characteristic information;
and inputting the second node characteristic information into a neural network with relation awareness to obtain the target node characteristic information.
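The probability-based update of the table/column name nodes can be sketched as follows. The scoring and pooling choices here are assumptions made for illustration only: dot-product scores are normalized with a softmax over the problem words, pooled into one score, passed through a sigmoid to obtain a probability, and the node features are scaled by that probability. The names `mention_probability` and `soft_prune` are hypothetical.

```python
import math

def mention_probability(schema_vec, question_vecs):
    """Estimate how likely a table/column node is mentioned by the problem."""
    d = len(schema_vec)
    scores = [
        sum(a * b for a, b in zip(schema_vec, q)) / math.sqrt(d)
        for q in question_vecs
    ]
    # Attention weights of this schema node over the problem words.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Assumption: pool the scores with the attention weights, then squash.
    pooled = sum(w * s for w, s in zip(weights, scores))
    return 1.0 / (1.0 + math.exp(-pooled))

def soft_prune(schema_vecs, question_vecs):
    """Scale each table/column feature vector by its mention probability."""
    updated = {}
    for name, vec in schema_vecs.items():
        p = mention_probability(vec, question_vecs)
        updated[name] = [p * v for v in vec]
    return updated

question_demo = [[1.0, 0.0], [0.0, 1.0]]
schema_demo = {"file.name": [2.0, 0.0], "label.text": [-2.0, 0.0]}
pruned_demo = soft_prune(schema_demo, question_demo)
```

In this toy example the column aligned with a problem word keeps most of its magnitude, while the misaligned column is scaled down, which corresponds to weakening tables/columns that are unlikely to appear in the SQL.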
According to the SQL sentence generating device based on relational awareness provided by the invention, generating the abstract syntax tree of SQL based on the decoder and the target node characteristic information specifically comprises the following steps:
inputting the target node characteristic information and the encoding of the decoder's previous action into the decoder to obtain the next action of the decoder;
and generating an abstract syntax tree of SQL according to the next action of the decoder.
According to the SQL sentence generating device based on relational awareness provided by the invention, determining the implicit matching relation between the problem word nodes in the problem word structure diagram and the table/column name nodes in the database pattern structure diagram specifically comprises the following step:
capturing the implicit matching relation between the problem word nodes in the problem word structure diagram and the table/column name nodes in the database pattern structure diagram by utilizing an inference knowledge graph.
According to the SQL sentence generating device based on relational awareness provided by the invention, the generating mode of the problem word structure diagram comprises the following steps:
performing word segmentation processing on the target natural language problem through a Chinese word segmentation tool to obtain a problem word;
acquiring the dependency relationship between the problem words through a dependency analysis tool;
and taking the problem word as a node, taking the dependency relationship as an edge, and generating a problem word structure diagram.
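The construction of the problem word structure diagram can be sketched as below. The segmentation and dependency results are written by hand here, standing in for the output of a Chinese word segmentation tool and a dependency analysis tool; the dependency labels and the added inverse edges are assumptions for illustration.

```python
def build_question_graph(words, dependencies):
    """words: list of problem words; dependencies: (head, dependent, label)
    triples given as word indices. Returns (nodes, edges)."""
    nodes = list(enumerate(words))
    edges = []
    for head, dep, label in dependencies:
        if not (0 <= head < len(words) and 0 <= dep < len(words)):
            raise ValueError(f"dependency index out of range: {(head, dep)}")
        edges.append((head, dep, label))
        # Assumption: also add the inverse edge so information flows both ways.
        edges.append((dep, head, label + "-inv"))
    return nodes, edges

# Hand-written stand-in for tool output on "查询文件的标签" (query the file's label).
demo_words = ["查询", "文件", "的", "标签"]
demo_dependencies = [
    (0, 3, "object"),    # 查询 -> 标签
    (3, 1, "modifier"),  # 标签 -> 文件
    (1, 2, "particle"),  # 文件 -> 的
]
question_nodes, question_edges = build_question_graph(demo_words, demo_dependencies)
```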
Fig. 4 is a schematic structural diagram of an electronic device according to the present invention, as shown in fig. 4, the electronic device may include: processor 410, communication interface (Communications Interface) 420, memory 430 and communication bus 440, wherein processor 410, communication interface 420 and memory 430 communicate with each other via communication bus 440. The processor 410 may invoke logic instructions in the memory 430 to execute the relational awareness based SQL statement generation method provided by the methods described above, the method comprising:
obtaining a problem word pattern diagram based on the problem word structure diagram and the database pattern structure diagram;
obtaining target node characteristic information based on a problem word pattern diagram, a graph convolution network with relation perception and a neural network with relation perception;
based on the decoder and the target node characteristic information, generating an abstract syntax tree of SQL, and traversing the abstract syntax tree of SQL according to depth to output a target SQL sentence;
the problem word structure diagram is obtained based on problem words obtained after word segmentation processing is carried out on the target natural language problem; the database schema structure is derived from schema information of the database.
Further, the logic instructions in the memory 430 described above may be implemented in the form of software functional units and, when sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
In another aspect, the present invention also provides a computer program product, where the computer program product includes a computer program, where the computer program can be stored on a non-transitory computer readable storage medium, and when the computer program is executed by a processor, the computer can execute the method for generating the SQL statement based on relational awareness provided by the above methods, and the method includes:
obtaining a problem word pattern diagram based on the problem word structure diagram and the database pattern structure diagram;
obtaining target node characteristic information based on a problem word pattern diagram, a graph convolution network with relation perception and a neural network with relation perception;
based on the decoder and the target node characteristic information, generating an abstract syntax tree of SQL, and traversing the abstract syntax tree of SQL according to depth to output a target SQL sentence;
the problem word structure diagram is obtained based on problem words obtained after word segmentation processing is carried out on the target natural language problem; the database schema structure is derived from schema information of the database.
In yet another aspect, the present invention further provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the relational awareness based SQL statement generation method provided by the methods above, the method comprising:
obtaining a problem word pattern diagram based on the problem word structure diagram and the database pattern structure diagram;
obtaining target node characteristic information based on a problem word pattern diagram, a graph convolution network with relation perception and a neural network with relation perception;
based on the decoder and the target node characteristic information, generating an abstract syntax tree of SQL, and traversing the abstract syntax tree of SQL according to depth to output a target SQL sentence;
the problem word structure diagram is obtained based on problem words obtained after word segmentation processing is carried out on the target natural language problem; the database schema structure is derived from schema information of the database.
The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the present invention without creative effort.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. The structured query language SQL sentence generation method based on relational awareness is characterized by comprising the following steps:
obtaining a problem word pattern diagram based on the problem word structure diagram and the database pattern structure diagram;
obtaining target node characteristic information based on the problem word pattern diagram, a graph convolution network with relation perception and a neural network with relation perception;
based on the decoder and the target node characteristic information, generating an abstract syntax tree of SQL, and traversing the abstract syntax tree of SQL according to depth to output a target SQL sentence;
the problem word structure diagram is obtained based on problem words obtained after word segmentation processing is carried out on the target natural language problem; the database schema structure is obtained according to schema information of the database.
2. The method for generating SQL statements based on relational awareness according to claim 1, wherein the obtaining the problem word pattern diagram based on the problem word structure diagram and the database pattern structure diagram comprises:
respectively determining an explicit matching relationship and an implicit matching relationship between a problem word node in the problem word structure diagram and a table/column name node in the database pattern structure diagram;
and connecting the problem word structure diagram with the database pattern structure diagram based on the explicit matching relation and the implicit matching relation to obtain the problem word pattern diagram.
3. The method for generating the SQL statement based on the relational awareness according to claim 2, wherein the obtaining the target node characteristic information based on the problem word pattern graph, the graph convolution network with the relational awareness and the neural network with the relational awareness specifically comprises:
inputting the problem word pattern diagram into the graph convolution network with relation perception to obtain first node characteristic information;
calculating the attention weights between the table/column name nodes and the problem word nodes, and determining, based on the attention weights, probability values that the table/column name nodes are mentioned by the target natural language problem;
updating the characteristic information of the table/column name nodes based on the probability values to obtain second node characteristic information;
and inputting the second node characteristic information into the neural network with relation perception to obtain the target node characteristic information.
4. The method for generating the SQL sentence based on the relational awareness according to claim 1, wherein the generating the abstract syntax tree of the SQL based on the decoder and the target node characteristic information specifically comprises:
inputting the target node characteristic information and the encoding of the decoder's previous action into the decoder to obtain the next action of the decoder;
and generating an abstract syntax tree of SQL according to the next action of the decoder.
5. The relational awareness based SQL statement generation method of claim 2, wherein determining an implicit matching relationship between a problem word node in the problem word structure diagram and a table/column name node in the database pattern structure diagram comprises:
capturing the implicit matching relation between the problem word nodes in the problem word structure diagram and the table/column name nodes in the database pattern structure diagram by utilizing an inference knowledge graph.
6. The method for generating the SQL sentence based on the relational awareness according to claim 1, wherein the generating mode of the problem word structure diagram comprises the following steps:
performing word segmentation processing on the target natural language problem through a Chinese word segmentation tool to obtain the problem word;
acquiring the dependency relationship among the problem words through a dependency analysis tool;
and taking the problem word as a node, taking the dependency relationship as an edge, and generating the problem word structure diagram.
7. A structured query language SQL statement generation device based on relational awareness, characterized by comprising:
the structure diagram module is used for obtaining a problem word pattern diagram based on the problem word structure diagram and the database pattern structure diagram;
the feature module is used for obtaining feature information of the target node based on the problem word pattern diagram, the graph convolution network with relation perception and the neural network with relation perception;
the statement module is used for generating an abstract syntax tree of SQL based on the decoder and the target node characteristic information, and traversing the abstract syntax tree of SQL according to depth to output a target SQL statement;
the problem word structure diagram is obtained by carrying out word segmentation processing on a target natural language problem to obtain problem words and obtaining dependency relations among the problem words; the database schema structure is obtained according to schema information of the database.
8. An electronic device comprising a memory, a processor and a computer program stored on the memory and running on the processor, wherein the processor implements the relational awareness based SQL statement generation method of any one of claims 1 to 6 when executing the program.
9. A non-transitory computer readable storage medium having stored thereon a computer program, wherein the computer program when executed by a processor implements the relational awareness based SQL statement generation method of any one of claims 1 to 6.
10. A computer program product comprising a computer program which, when executed by a processor, implements the relational awareness based SQL statement generation method of any one of claims 1 to 6.
CN202311549842.8A 2023-11-17 2023-11-17 SQL sentence generation method and device based on relational awareness Pending CN117493372A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311549842.8A CN117493372A (en) 2023-11-17 2023-11-17 SQL sentence generation method and device based on relational awareness

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311549842.8A CN117493372A (en) 2023-11-17 2023-11-17 SQL sentence generation method and device based on relational awareness

Publications (1)

Publication Number Publication Date
CN117493372A true CN117493372A (en) 2024-02-02

Family

ID=89668895

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311549842.8A Pending CN117493372A (en) 2023-11-17 2023-11-17 SQL sentence generation method and device based on relational awareness

Country Status (1)

Country Link
CN (1) CN117493372A (en)

Similar Documents

Publication Publication Date Title
CN111310438B (en) Chinese sentence semantic intelligent matching method and device based on multi-granularity fusion model
CN110188167B (en) End-to-end dialogue method and system integrating external knowledge
CN112633010B (en) Aspect-level emotion analysis method and system based on multi-head attention and graph convolution network
CN110096567B (en) QA knowledge base reasoning-based multi-round dialogue reply selection method and system
CN112015868B (en) Question-answering method based on knowledge graph completion
CN111831789B (en) Question-answering text matching method based on multi-layer semantic feature extraction structure
CN110837738B (en) Method, device, computer equipment and storage medium for identifying similarity
CN111930906A (en) Knowledge graph question-answering method and device based on semantic block
CN110765277B (en) Knowledge-graph-based mobile terminal online equipment fault diagnosis method
CN109344242B (en) Dialogue question-answering method, device, equipment and storage medium
CN111274267A (en) Database query method and device and computer readable storage medium
CN115495568B (en) Training method and device for dialogue model, dialogue response method and device
CN116991869A (en) Method for automatically generating database query statement based on NLP language model
CN114091450B (en) Judicial domain relation extraction method and system based on graph convolution network
CN116719520B (en) Code generation method and device
CN110851584A (en) Accurate recommendation system and method for legal provision
CN112632250A (en) Question and answer method and system under multi-document scene
CN113705196A (en) Chinese open information extraction method and device based on graph neural network
CN115658846A (en) Intelligent search method and device suitable for open-source software supply chain
CN112015890B (en) Method and device for generating movie script abstract
CN117290478A (en) Knowledge graph question-answering method, device, equipment and storage medium
KR102277787B1 (en) Column and table prediction method for text to SQL query translation based on a neural network
CN117271558A (en) Language query model construction method, query language acquisition method and related devices
CN114579605B (en) Table question-answer data processing method, electronic equipment and computer storage medium
CN116910190A (en) Method, device and equipment for acquiring multi-task perception model and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination