CN115878662A - Statement generation method and device, electronic equipment and medium - Google Patents

Info

Publication number
CN115878662A
Authority
CN
China
Prior art keywords
pair
column
target
vector
compatible
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211358681.XA
Other languages
Chinese (zh)
Inventor
王路涛
刘识
李博
李继伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Big Data Center Of State Grid Corp Of China
Original Assignee
Big Data Center Of State Grid Corp Of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Big Data Center Of State Grid Corp Of China
Priority to CN202211358681.XA
Publication of CN115878662A

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the invention disclose a statement generation method, a statement generation device, electronic equipment and a medium. The method comprises: determining at least one vector pair, a first joint probability density corresponding to each vector pair, at least one aggregation operator, at least one column, and an aggregation column; determining a target vector pair based on the at least one vector pair and the first joint probability density of each vector pair; determining a target compatible pair based on the at least one aggregation operator, the at least one column, the aggregation column, and a bidirectional recurrent neural network; and generating a target SQL query statement based on each target vector pair, each target compatible pair and a preset SQL query statement frame. The method uses the obtained vector pairs, their first joint probability densities, the aggregation operators, the columns and the aggregation column to determine target compatible pairs and target vector pairs with a higher degree of association; combining these with the preset SQL query statement frame yields an SQL query statement that can be executed accurately, which improves the accuracy of SQL query statement generation.

Description

Statement generation method and device, electronic equipment and medium
Technical Field
The embodiment of the invention relates to the technical field of databases, in particular to a statement generation method, a statement generation device, electronic equipment and a medium.
Background
With the development of society and technology, converting natural language questions into accurately executable Structured Query Language (SQL) query statements has received much attention and has been applied in many fields.
NL2SQL (natural language to structured query language) is a technique that converts a user's natural language statements into executable SQL query statements so that query results can be obtained from a database. Deep learning methods have been introduced into NL2SQL models to convert natural language questions into SQL query statements. However, such methods cannot generate sufficiently effective and accurate SQL query statements at the syntactic and semantic levels, which reduces the accuracy of converting natural language questions into SQL query statements.
Disclosure of Invention
The embodiment of the invention provides a statement generation method, a statement generation device, electronic equipment and a medium, which are used for improving the accuracy of SQL query statement generation.
According to an aspect of the embodiments of the present invention, there is provided a statement generation method, including:
determining at least one word and at least one field corresponding to input natural language description information, and determining the at least one word and the at least one field as information to be processed, wherein one word corresponds to one piece of information to be processed, one field corresponds to one piece of information to be processed, the word is a minimum semantic unit forming the natural language description information, and the field is a field corresponding to the natural language description information in a set database table;
determining at least one vector pair, a first joint probability density corresponding to each vector pair, at least one aggregation operator, at least one column and an aggregation column according to each piece of information to be processed and a bidirectional recurrent neural network, wherein the vector pair is composed of two corresponding decoding vectors, the first joint probability density indicates the association degree between the two decoding vectors in the corresponding vector pair, the decoding vectors are obtained after the information to be processed is coded and decoded, and the aggregation column is a column composed of at least one column;
determining at least one target vector pair based on at least one vector pair and a first joint probability density corresponding to each of the vector pairs;
determining at least one target compatible pair based on the at least one aggregation operator, the at least one column, the aggregation column, and the bidirectional recurrent neural network, the target compatible pair indicating an executable compatible pair, the compatible pair indicating a combined pair consisting of one aggregation operator and one column or a combined pair consisting of one aggregation operator and one aggregation column;
and generating a target SQL query statement corresponding to the natural language description information based on each target vector pair, each target compatible pair and a preset SQL query statement frame.
According to another aspect of the embodiments of the present invention, there is provided a statement generation apparatus, including:
a first determining module, configured to determine at least one word and at least one field corresponding to input natural language description information, and determine the at least one word and the at least one field as information to be processed, where one word corresponds to one piece of information to be processed, one field corresponds to one piece of information to be processed, the word is a minimum semantic unit forming the natural language description information, and the field is a field corresponding to the natural language description information in a set database table;
a second determining module, configured to determine, according to each piece of information to be processed and a bidirectional recurrent neural network, at least one vector pair, a first joint probability density corresponding to each vector pair, at least one aggregation operator, at least one column, and an aggregation column, where the vector pair is formed by two corresponding decoded vectors, the first joint probability density indicates a degree of association between the two decoded vectors in the corresponding vector pair, the decoded vector is a vector obtained after the information to be processed is encoded and decoded, and the aggregation column is a column formed by combining the at least one column;
a third determining module, configured to determine at least one target vector pair based on at least one vector pair and the first joint probability density corresponding to each of the vector pairs;
a fourth determination module for determining at least one target compatible pair based on the at least one aggregation operator, the at least one column, the aggregation column, and the bidirectional recurrent neural network, the target compatible pair indicating an executable compatible pair, the compatible pair indicating a combined pair of one aggregation operator and one column or a combined pair of one aggregation operator and one aggregation column;
and the generating module is used for generating the target SQL query statement corresponding to the natural language description information based on each target vector pair, each target compatible pair and a preset SQL query statement frame.
According to another aspect of the embodiments of the present invention, there is provided an electronic apparatus, including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores a computer program executable by the at least one processor, the computer program being executable by the at least one processor to enable the at least one processor to perform the statement generation method according to any of the embodiments of the invention.
According to another aspect of the embodiments of the present invention, there is provided a computer-readable storage medium storing computer instructions for causing a processor to implement the statement generation method according to any one of the embodiments of the present invention when the computer instructions are executed.
In the technical solution of the embodiments of the present invention, at least one word and at least one field corresponding to input natural language description information are first determined, and the at least one word and the at least one field are determined as information to be processed, where one word corresponds to one piece of information to be processed, one field corresponds to one piece of information to be processed, the word is a minimum semantic unit forming the natural language description information, and the field is a field corresponding to the natural language description information in a set database table. Second, at least one vector pair, a first joint probability density corresponding to each vector pair, at least one aggregation operator, at least one column and an aggregation column are determined according to each piece of information to be processed and a bidirectional recurrent neural network, where the vector pair is composed of two corresponding decoding vectors, the first joint probability density indicates the degree of association between the two decoding vectors in the corresponding vector pair, the decoding vectors are obtained after the information to be processed is encoded and decoded, and the aggregation column is a column composed of at least one column. Then, at least one target vector pair is determined based on the at least one vector pair and the first joint probability density corresponding to each vector pair. Next, at least one target compatible pair is determined based on the at least one aggregation operator, the at least one column, the aggregation column and the bidirectional recurrent neural network, where the target compatible pair indicates an executable compatible pair, and the compatible pair indicates a combined pair formed by one aggregation operator and one column or a combined pair formed by one aggregation operator and one aggregation column. Finally, a target SQL query statement corresponding to the natural language description information is generated based on each target vector pair, each target compatible pair and a preset SQL query statement frame.
In this way, the vector pairs, the first joint probability density corresponding to each vector pair, the aggregation operators, the columns and the aggregation column can be obtained from the information to be processed corresponding to the natural language description information and the bidirectional recurrent neural network. On this basis, target compatible pairs and target vector pairs with a higher degree of association can be obtained, and, in combination with the preset SQL query statement frame, an accurately executable SQL query statement corresponding to the natural language description information can be obtained, which improves the accuracy of SQL query statement generation.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present invention, nor do they necessarily limit the scope of the invention. Other features of the present invention will become apparent from the following description.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of a statement generation method according to a first embodiment of the present invention;
Fig. 2 is a schematic flowchart of a statement generation method according to a second embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a statement generation apparatus according to a third embodiment of the present invention;
Fig. 4 is a schematic structural diagram of an electronic device according to a fourth embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Example one
Fig. 1 is a schematic flowchart of a statement generation method according to a first embodiment of the present invention. The method is applicable to converting natural language description information into an SQL query statement so as to improve the accuracy of SQL query statement generation. The method may be executed by a statement generation apparatus, which may be implemented in software and/or hardware and is generally integrated on an electronic device. The electronic device in this embodiment includes, but is not limited to, desktop computers, notebook computers, servers and the like.
As shown in fig. 1, a statement generating method provided in an embodiment of the present invention includes the following steps:
S110, determining at least one word and at least one field corresponding to the input natural language description information, and determining the at least one word and the at least one field as information to be processed.
In this embodiment, the natural language description information may be understood as description information, in natural language form, related to a data query. Natural language generally refers to a language that evolves naturally with culture. The natural language description information may be a natural language question, for example a phrase such as "data C in table A and table B".
The information to be processed can be understood as the information awaiting processing. One word corresponds to one piece of information to be processed, that is, one word can be used as one piece of information to be processed; likewise, one field corresponds to one piece of information to be processed, that is, one field can be used as one piece of information to be processed.
A word may be understood as the smallest semantic unit constituting the natural language description information. The natural language description information may include at least one word; in the example above, each of "table", "A", and "in" may be regarded as one word. A field may be understood as a field in a set database table that corresponds to the natural language description information. The set database table may be understood as at least one preset database table; it is not particularly limited, and may include all of the database tables in the queried database or only some of them. In a relational database, a database table is a collection of two-dimensional arrays used to represent and store relationships between data objects, and consists of vertical columns and horizontal rows. For example, in a database table of author information, each column contains one specific type of information for all authors, such as "last name", "first name", or "address", and each row contains all of the information for one specific author: last name, first name, address, and so on. "Last name", "first name" and "address" can each be regarded as a field in the database table and can be located at the head of the corresponding column; that is, the column header of each column is a field.
How to determine the at least one word corresponding to the natural language description information is not particularly limited. For example, word segmentation may be performed on the natural language description information to obtain the at least one word it contains. On this basis, the obtained words may be filtered to remove useless and/or repeated words; how the filtering is done is not particularly limited here.
How to determine the at least one field corresponding to the natural language description information is likewise not particularly limited. For example, the words in the natural language description information that are associated with a table may be determined first and treated as individual character strings. All fields contained in the set database table may also be treated as individual character strings. A character string matching algorithm then searches the fields contained in the set database table for at least one character string that corresponds to those words, and the at least one character string found may be regarded as the determined at least one field.
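The word and field determination described above can be sketched as follows. This is a minimal illustration only: the tokenizer, the stop-word list, the function names and the rule "a field matches when every part of its name appears among the words" are all assumptions made for the example, since the text leaves the segmentation, filtering and string matching algorithms open.

```python
import re

# Hypothetical stop-word list used to filter out "useless" words.
STOPWORDS = {"what", "is", "the", "of", "by", "in", "and"}

def segment(question):
    """Split a question into lowercase words, dropping stop words and repeats."""
    words = []
    for token in re.findall(r"[A-Za-z0-9]+", question.lower()):
        if token not in STOPWORDS and token not in words:
            words.append(token)
    return words

def match_fields(words, table_fields):
    """Return the fields whose name parts all occur among the words.

    A field such as "last_name" is split on underscores, so it matches a
    question containing both "last" and "name". This rule is an assumption.
    """
    matched = []
    for field in table_fields:
        parts = field.lower().split("_")
        if all(part in words for part in parts):
            matched.append(field)
    return matched

question = "What is the average age of authors by last name"
fields = ["last_name", "first_name", "age", "address"]

words = segment(question)
matched = match_fields(words, fields)
to_be_processed = words + matched  # each word and each field is one item
```

Here `to_be_processed` plays the role of the information to be processed: one entry per remaining word and one entry per matched field.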
S120, determining at least one vector pair, a first joint probability density corresponding to each vector pair, at least one aggregation operator, at least one column and an aggregation column according to the information to be processed and the bidirectional recurrent neural network.
In this embodiment, a vector pair may be formed by two corresponding decoding vectors, that is, it may be regarded as a pair consisting of two corresponding decoding vectors. One vector pair may correspond to one first joint probability density. The first joint probability density may indicate the degree of association between the two decoding vectors in the corresponding vector pair, where association can be understood as a relationship between the features of the two decoding vectors. A decoding vector can be understood as a vector obtained after the information to be processed is encoded and decoded. An aggregation column can be understood as a column composed of at least one column; if there are two columns, the aggregation column can be regarded as the combination of the two. A column can be understood as a column in a database table, and may also be regarded as the column header that describes the data contained in that column. An aggregation operator can be understood as an operator used for aggregation operations in an SQL database; aggregation refers to an operation that reduces the values contained in a column to a single value, such as the sum or the average of the values in the column. For example, aggregation operators may include, but are not limited to, SUM (the sum operator, which sums the values in a column), AVG (the average operator, which averages the values in a column), MIN (the minimum operator, which finds the minimum of the values in a column), MAX (the maximum operator, which finds the maximum of the values in a column), and the like.
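As a quick illustration of the four aggregation operators named above, the following sketch runs each of them against a small in-memory SQLite table. The table name `authors`, its columns and its rows are made up for the example, and SQLite is used only because it ships with Python; the source does not name a database engine.

```python
import sqlite3

# Hypothetical table and data, used only to show what each operator returns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE authors (last_name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO authors VALUES (?, ?)",
    [("Li", 30), ("Wang", 40), ("Liu", 50)],
)

ALLOWED_OPS = {"SUM", "AVG", "MIN", "MAX"}
ALLOWED_COLUMNS = {"age"}

def aggregate(op, column):
    """Reduce all values in one column to a single value."""
    # Identifiers cannot be bound as SQL parameters, so they are checked
    # against fixed whitelists before being placed into the statement.
    if op not in ALLOWED_OPS or column not in ALLOWED_COLUMNS:
        raise ValueError("unsupported operator or column")
    return conn.execute(f"SELECT {op}({column}) FROM authors").fetchone()[0]

results = {op: aggregate(op, "age") for op in sorted(ALLOWED_OPS)}
```

Each call collapses the three `age` values into one number, which is the sense in which aggregation "forms the values contained in a column into a single value".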
How to determine the at least one vector pair, the first joint probability density corresponding to each vector pair, the at least one aggregation operator, the at least one column, and the aggregation column according to each piece of information to be processed and a Bidirectional Recurrent Neural Network (BRNN) is not particularly limited. For example, each piece of information to be processed can be input to a bidirectional recurrent neural network as input data, and the bidirectional recurrent neural network is a model trained in advance, so that at least one vector pair, a first joint probability density corresponding to each vector pair, at least one aggregation operator, at least one column and an aggregation column can be output.
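For readers unfamiliar with the term, the sketch below shows what "bidirectional" means in a recurrent network: the input sequence is read once left to right and once right to left, and the two hidden states at each position are concatenated. It is a toy with random, untrained weights written in plain Python. The hidden size, the weight ranges and all function names are assumptions for illustration; the source does not disclose the architecture or training of its network, and this sketch does not produce vector pairs, probability densities or aggregation operators.

```python
import math
import random

HIDDEN = 3  # hypothetical hidden size per direction

def make_weights(rng, input_size, hidden_size):
    """Create small random input and recurrent weight matrices (untrained)."""
    w_in = [[rng.uniform(-0.5, 0.5) for _ in range(input_size)]
            for _ in range(hidden_size)]
    w_rec = [[rng.uniform(-0.5, 0.5) for _ in range(hidden_size)]
             for _ in range(hidden_size)]
    return w_in, w_rec

def rnn_pass(inputs, w_in, w_rec):
    """Run a simple tanh RNN over the sequence; return one state per step."""
    h = [0.0] * len(w_rec)
    states = []
    for x in inputs:
        h = [
            math.tanh(
                sum(w * xi for w, xi in zip(w_in[i], x))
                + sum(w * hk for w, hk in zip(w_rec[i], h))
            )
            for i in range(len(w_rec))
        ]
        states.append(h)
    return states

def bidirectional_encode(inputs, forward_weights, backward_weights):
    """Concatenate forward and backward hidden states at every position."""
    forward = rnn_pass(inputs, *forward_weights)
    # Read the sequence in reverse, then flip the states back into order.
    backward = rnn_pass(inputs[::-1], *backward_weights)[::-1]
    return [f + b for f, b in zip(forward, backward)]

rng = random.Random(0)
fwd = make_weights(rng, 2, HIDDEN)
bwd = make_weights(rng, 2, HIDDEN)
sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # three toy input vectors
encoded = bidirectional_encode(sequence, fwd, bwd)
```

Because of the backward pass, the encoding of the first item also depends on the items that come after it, which is what allows each word or field to be represented in the context of the whole question.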
S130, determining at least one target vector pair based on at least one vector pair and the first joint probability density corresponding to each vector pair.
In the present embodiment, the target vector pair may be understood as a vector pair with a higher correlation degree between the two corresponding decoding vectors.
One vector pair corresponds to one first joint probability density. How to determine the at least one target vector pair based on the at least one vector pair and the first joint probability density corresponding to each vector pair is not particularly limited. Since each first joint probability density is a numerical value, the first joint probability densities may, for example, be sorted in descending order, a set number of the highest-ranked first joint probability densities may be selected, and the vector pair corresponding to each selected first joint probability density may be determined as a target vector pair.
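The descending sort and selection described above amounts to a top-k selection, sketched below. The pair labels, the density values and the function name are invented for the example; in the described method the pairs would be pairs of decoding vectors and the densities would come from the bidirectional recurrent neural network.

```python
def select_target_pairs(pair_densities, set_number):
    """Return the `set_number` vector pairs with the highest density.

    `pair_densities` maps each vector pair (represented here by a tuple of
    labels, an assumption) to its first joint probability density.
    """
    ranked = sorted(pair_densities.items(), key=lambda item: item[1], reverse=True)
    return [pair for pair, _density in ranked[:set_number]]

# Hypothetical densities for three candidate pairs.
densities = {
    ("age", "authors"): 0.82,
    ("age", "address"): 0.05,
    ("name", "authors"): 0.35,
}

targets = select_target_pairs(densities, set_number=2)
```

With these made-up values the two most strongly associated pairs are kept and the weakly associated pair is discarded.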
S140, determining at least one target compatible pair based on the at least one aggregation operator, the at least one column, the aggregation column and the bidirectional recurrent neural network.
In the present embodiment, the target compatible pair may indicate an executable compatible pair. Compatible pairs may indicate a combined pair of an aggregation operator and a column, or a combined pair of an aggregation operator and an aggregation column.
How to determine the at least one target compatible pair based on the at least one aggregation operator, the at least one column, the aggregation column, and the bidirectional recurrent neural network is not particularly limited. For example, each compatible pair composed of an aggregation operator and the aggregation column may first be checked for executability: the compatible pair is added to a preset SQL statement used for testing, and whether that SQL statement executes successfully determines whether the compatible pair is executable. A compatible pair composed of an aggregation operator and the aggregation column that executes successfully may be determined as a target compatible pair. In addition, each compatible pair composed of an aggregation operator and a column may be input in advance, as input data, to the decoder of the bidirectional recurrent neural network to obtain a joint probability density for that compatible pair, which indicates the association between the aggregation operator and the column. Then, for a compatible pair composed of an aggregation operator and the aggregation column that is not executable, the compatible pairs composed of that aggregation operator and the individual columns are determined, and the one with the maximum joint probability density is selected as the target compatible pair.
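The executability test and the fallback to the most probable single column can be sketched as follows. Several things here are assumptions for illustration: SQLite stands in for the database, an aggregation column is written as a comma-separated list of its columns, the test statement is a bare `SELECT op(...) FROM table`, and the joint probability densities are hard-coded rather than produced by a decoder.

```python
import sqlite3

def is_executable(conn, table, op, columns):
    """Place the pair into a test statement and report whether it runs."""
    sql = f"SELECT {op}({', '.join(columns)}) FROM {table}"
    try:
        conn.execute(sql).fetchall()
        return True
    except sqlite3.Error:
        return False

def target_compatible_pair(conn, table, op, aggregation_column, column_densities):
    """Keep an executable (operator, aggregation column) pair, else fall back.

    The fallback picks the single column whose pair with `op` has the
    largest joint probability density in `column_densities`.
    """
    if is_executable(conn, table, op, aggregation_column):
        return (op, tuple(aggregation_column))
    best = max(aggregation_column, key=lambda col: column_densities[(op, col)])
    return (op, (best,))

# Hypothetical table and densities.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (age INTEGER, salary INTEGER)")
conn.execute("INSERT INTO staff VALUES (30, 100)")
densities = {("SUM", "age"): 0.2, ("SUM", "salary"): 0.7}

kept = target_compatible_pair(conn, "staff", "MAX", ["age", "salary"], {})
fallback = target_compatible_pair(conn, "staff", "SUM", ["age", "salary"], densities)
```

In SQLite, `MAX` accepts several arguments while `SUM` accepts exactly one, so the first pair passes the test unchanged and the second falls back to the column with the larger density.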
S150, generating a target SQL query statement corresponding to the natural language description information based on each target vector pair, each target compatible pair and a preset SQL query statement frame.
In this embodiment, the preset SQL query statement frame may be understood as a preset statement frame used to generate the target SQL query statement. The preset SQL query statement frame is not specifically limited. For example, it may include the clauses used for querying (such as a SELECT clause, a FROM clause, and a WHERE clause), where the information related to query conditions under each clause is left empty and is filled in subsequently to generate the corresponding predicted SQL query statement. The target SQL query statement may be understood as the generated, accurately executable SQL query statement corresponding to the natural language description information.
The target SQL query statement corresponding to the natural language description information may be generated from each target vector pair, each target compatible pair and the preset SQL query statement frame as follows. One target vector pair corresponds to two decoding vectors, and one decoding vector corresponds to one word or field (that is, one piece of information to be processed), so one target vector pair corresponds to two pieces of information to be processed. For example, the information to be processed corresponding to each target vector pair, together with each target compatible pair, may be filled into the preset SQL query statement frame to obtain the target SQL query statement corresponding to the natural language description information.
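A minimal sketch of filling a preset frame is shown below. The frame text, the function name and, in particular, the reading of a target vector pair as a (field, value) pair that becomes a WHERE condition are assumptions made for the example; the source does not say which slot each piece of information goes into.

```python
# Hypothetical preset frame with empty slots under each clause.
FRAME = "SELECT {select} FROM {table} WHERE {condition}"

def fill_frame(compatible_pairs, table, vector_pairs):
    """Fill the frame and return the statement plus its bound parameters.

    `compatible_pairs` holds (aggregation operator, column) tuples.
    `vector_pairs` holds (field, value) tuples, which is an assumed
    interpretation of the two items behind each target vector pair.
    """
    select = ", ".join(f"{op}({column})" for op, column in compatible_pairs)
    condition = " AND ".join(f"{field} = ?" for field, _value in vector_pairs)
    params = [value for _field, value in vector_pairs]
    return FRAME.format(select=select, table=table, condition=condition), params

statement, params = fill_frame(
    compatible_pairs=[("AVG", "age")],
    table="authors",
    vector_pairs=[("last_name", "Wang")],
)
```

Values are returned separately as parameters rather than pasted into the statement text, a design choice that keeps user-supplied words out of the SQL string.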
In the statement generation method provided by the first embodiment of the invention, at least one word and at least one field corresponding to input natural language description information are first determined, and the at least one word and the at least one field are determined as information to be processed, where one word corresponds to one piece of information to be processed, one field corresponds to one piece of information to be processed, the word is a minimum semantic unit forming the natural language description information, and the field is a field corresponding to the natural language description information in a set database table. Second, at least one vector pair, a first joint probability density corresponding to each vector pair, at least one aggregation operator, at least one column and an aggregation column are determined according to each piece of information to be processed and the bidirectional recurrent neural network, where the vector pair is composed of two corresponding decoding vectors, the first joint probability density indicates the degree of association between the two decoding vectors in the corresponding vector pair, the decoding vectors are obtained after the information to be processed is encoded and decoded, and the aggregation column is a column composed of at least one column. Then, at least one target vector pair is determined based on the at least one vector pair and the first joint probability density corresponding to each vector pair. Next, at least one target compatible pair is determined based on the at least one aggregation operator, the at least one column, the aggregation column and the bidirectional recurrent neural network, where the target compatible pair indicates an executable compatible pair, and the compatible pair indicates a combined pair formed by one aggregation operator and one column or a combined pair formed by one aggregation operator and one aggregation column. Finally, a target SQL query statement corresponding to the natural language description information is generated based on each target vector pair, each target compatible pair and a preset SQL query statement frame.
In this way, the vector pairs, the first joint probability density corresponding to each vector pair, the aggregation operators, the columns and the aggregation column can be obtained from the information to be processed corresponding to the natural language description information and the bidirectional recurrent neural network. On this basis, target compatible pairs and target vector pairs with a higher degree of association can be obtained, and, in combination with the preset SQL query statement frame, an accurately executable SQL query statement corresponding to the natural language description information can be obtained, which improves the accuracy of SQL query statement generation.
Optionally, determining at least one word and at least one field corresponding to the input natural language description information includes: performing word segmentation processing on the natural language description information to obtain at least one word; and searching at least one field corresponding to the natural language description information from the set database table through a semantic analysis algorithm.
In this embodiment, word segmentation may be understood as a text processing method in natural language processing that splits text content at the word level. Taking the natural language description information as the text content and performing word segmentation on it yields at least one word.
A semantic parsing algorithm can be understood as an algorithm for parsing the semantics and syntax of text content. How to search the set database table for at least one field corresponding to the natural language description information through the semantic parsing algorithm is not particularly limited. For example, the natural language description information may be treated as one piece of text content and parsed by the semantic parsing algorithm to obtain information related to the table query (such as an age). Based on that information, the fields that are related to and match it (such as an age field) are then searched for among all fields contained in the set database table and taken as the at least one field corresponding to the natural language description information.
Optionally, generating a target SQL query statement corresponding to the natural language description information based on each target vector pair, each target compatible pair, and a preset SQL query statement frame includes: and filling statement information corresponding to each target vector pair and each target compatible pair into a preset SQL query statement frame to obtain a target SQL query statement corresponding to the natural language description information, wherein the statement information indicates to-be-processed information corresponding to two decoding vectors in the corresponding target vector pair.
In this embodiment, the statement information may indicate the to-be-processed information corresponding to the two decoding vectors in the corresponding target vector pair. One target vector pair may correspond to one piece of statement information.
The statement information corresponding to each target vector pair and each target compatible pair can be filled in an area for placing information related to the query condition under each clause in a preset SQL query statement framework, and a target SQL query statement corresponding to the natural language description information is obtained. How to fill under each clause is not specifically limited herein.
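Since the filling procedure is left open, the following Python sketch shows only one possible way to fill a preset frame. The frame text, the function `fill_frame`, the use of equality conditions, and the sample table are all hypothetical choices made for illustration and are not taken from the embodiment.

```python
# Hypothetical preset SQL query statement frame with one slot per clause.
FRAME = "SELECT {select} FROM {table} WHERE {where}"

def fill_frame(target_compatible_pairs, target_vector_pairs, table):
    """Fill the frame from (operator, column) pairs and (field, value) pairs.

    Treating every target vector pair as an equality condition is an assumption.
    """
    select = ", ".join(f"{op}({col})" for op, col in target_compatible_pairs)
    where = " AND ".join(
        f"{field} = '{value}'" for field, value in target_vector_pairs
    ) or "TRUE"
    return FRAME.format(select=select, table=table, where=where)

query = fill_frame([("AVG", "age")], [("department", "sales")], "employees")
```

Building SQL by string formatting is acceptable for a sketch, but values taken from user input would normally be passed through parameter binding instead.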
Example two
Fig. 2 is a schematic flow chart of a statement generating method according to a second embodiment of the present invention, which is further detailed based on the above embodiments. In this embodiment, a process of determining at least one vector pair, a first joint probability density corresponding to each vector pair, at least one aggregation operator, at least one column, and an aggregation column according to each piece of the information to be processed and the bidirectional recurrent neural network, a process of determining at least one target vector pair based on the first joint probability density corresponding to at least one vector pair and each vector pair, and a process of determining at least one target compatible pair based on at least one aggregation operator, at least one column, an aggregation column, and the bidirectional recurrent neural network are specifically described. It should be noted that technical details that are not described in detail in the present embodiment may be referred to any of the above embodiments. As shown in fig. 2, the method includes:
S210, determining at least one word and at least one field corresponding to the input natural language description information, and determining the at least one word and the at least one field as information to be processed.
S220, inputting the information to be processed to an encoder of a bidirectional recurrent neural network to obtain encoding vectors corresponding to the information to be processed respectively.
In this embodiment, the bidirectional recurrent neural network may include an encoder and a decoder. The encoding vector can be understood as a vector obtained by encoding the information to be processed. One piece of information to be processed may correspond to one coded vector.
The information to be processed is input into the encoder of the bidirectional recurrent neural network as input data, and the encoder outputs the encoding vectors corresponding to the respective pieces of information to be processed.
S230, inputting each encoding vector to a decoder of the bidirectional recurrent neural network to obtain at least one vector pair, a first joint probability density corresponding to each vector pair, at least one aggregation operator, at least one column and an aggregation column.
In this embodiment, each encoded vector is input as input data to a decoder of the bidirectional recurrent neural network, and at least one vector pair, a first joint probability density corresponding to each vector pair, at least one aggregation operator, at least one column, and an aggregation column may be output.
Optionally, inputting each encoding vector to a decoder of a bidirectional recurrent neural network to obtain at least one vector pair and a first joint probability density corresponding to each vector pair, including: obtaining decoding vectors corresponding to the coding vectors respectively through a decoder of a bidirectional recurrent neural network; selecting any two different decoding vectors from the decoding vectors through a decoder, forming a vector pair by the any two different decoding vectors, and determining a first joint probability density corresponding to the vector pair; the operation of selecting any two different decoded vectors from each decoded vector is repeatedly performed until there are no two decoded vectors in each decoded vector that are not selected simultaneously.
In this embodiment, after each encoding vector is input to the decoder of the bidirectional recurrent neural network, the decoding vectors corresponding to each encoding vector can be obtained, and one encoding vector can correspond to one decoding vector. The decoded vector may be understood as a vector obtained by decoding the encoded vector.
Through the decoder, any two different decoding vectors are selected from the decoding vectors, the two selected decoding vectors form a vector pair, and the first joint probability density corresponding to the vector pair is determined. How the first joint probability density corresponding to the vector pair is determined is not particularly limited; for example, it may be output by a pre-trained decoder. The operation of selecting any two different decoding vectors from the decoding vectors is repeated until no two decoding vectors remain that have not been selected together as a pair.
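The exhaustive pairing described above can be sketched in Python as follows. The scoring function `stand_in_density` (a logistic function of the dot product) is purely an assumption used to make the example runnable; in the embodiment the first joint probability density is output by a pre-trained decoder. The sample vectors are likewise made up.

```python
import math
from itertools import combinations

def stand_in_density(u, v):
    """Hypothetical score in (0, 1); NOT the decoder output described in the embodiment."""
    dot = sum(a * b for a, b in zip(u, v))
    return 1.0 / (1.0 + math.exp(-dot))

def enumerate_vector_pairs(decoding_vectors, density=stand_in_density):
    """Form every unordered pair of distinct decoding vectors exactly once and score it."""
    pairs = []
    for i, j in combinations(range(len(decoding_vectors)), 2):
        pairs.append(((i, j), density(decoding_vectors[i], decoding_vectors[j])))
    return pairs

decoding_vectors = [[0.2, 0.1], [0.4, -0.3], [-0.5, 0.9], [1.0, 0.0]]  # made-up values
vector_pairs = enumerate_vector_pairs(decoding_vectors)
```

Using `itertools.combinations` guarantees the stopping condition of the text: once the loop ends, every pair of distinct decoding vectors has been selected together exactly once, so n vectors yield n(n-1)/2 vector pairs.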
S240, selecting a set number of first joint probability densities from the first joint probability densities in descending order, and taking each selected first joint probability density as a target joint probability density.
In the present embodiment, the set number may be understood as a preset number; the set number is not particularly limited, and can be flexibly set according to actual requirements. It will be appreciated that the set number is less than the number of vector pairs. The target joint probability density may be understood as a higher value joint probability density.
A set number of first joint probability densities can be selected from the obtained first joint probability densities in descending order, and each selected first joint probability density is taken as a target joint probability density.
S250, determining the vector pair corresponding to each target joint probability density as a target vector pair.
In this embodiment, the vector pair corresponding to each target joint probability density is determined as a target vector pair. One target joint probability density may correspond to one target vector pair.
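Steps S240 and S250 amount to a top-k selection. A minimal Python sketch, assuming the vector pairs are held as `(pair, density)` tuples (a representation chosen here for illustration, with made-up sample values), could look like this:

```python
import heapq

def select_target_vector_pairs(scored_pairs, set_number):
    """Return the set_number pairs with the largest first joint probability density."""
    top = heapq.nlargest(set_number, scored_pairs, key=lambda item: item[1])
    return [pair for pair, _ in top]

# Hypothetical scored vector pairs: ((item_a, item_b), first joint probability density).
scored_pairs = [
    (("age", "30"), 0.9),
    (("name", "age"), 0.2),
    (("department", "sales"), 0.7),
]
target_vector_pairs = select_target_vector_pairs(scored_pairs, set_number=2)
```

`heapq.nlargest` returns the selected items in descending order of density, which matches the descending selection described in S240.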
S260, for each aggregation operator, forming a corresponding first compatible pair from the aggregation operator and the aggregation column, and forming a corresponding second compatible pair from the aggregation operator and each column respectively.
In the present embodiment, the first compatible pair may be considered to be a compatible pair consisting of one aggregation operator and one aggregation column. The second compatible pair may be considered a compatible pair consisting of an aggregation operator and a column.
For each aggregation operator, the aggregation operator and the aggregation column may form a corresponding first compatible pair, and the aggregation operator and each column may form a corresponding second compatible pair.
S270, determining a second joint probability density of each second compatible pair through a decoder of the bidirectional recurrent neural network, wherein the first compatible pair and the second compatible pairs of the same aggregation operator correspond to each other.
In this embodiment, the decoder of the bidirectional recurrent neural network parses each second compatible pair to obtain the second joint probability density corresponding to that second compatible pair. One second compatible pair corresponds to one second joint probability density. The second joint probability density may indicate the degree of association between the aggregation operator and the column in the corresponding second compatible pair. The first compatible pair and the second compatible pairs of the same aggregation operator correspond to each other.
S280, for each first compatible pair, adding the first compatible pair to a selection area of a set SQL statement to obtain the SQL statement to be executed corresponding to the first compatible pair.
In this embodiment, the set SQL statement may be understood as a preset SQL statement for testing whether the first compatible pair is executable; the set SQL statement is not particularly limited, and may be, for example, "SELECT (selection area) WHERE TRUE". The selection area may be understood as an area of the SQL statement for placing information related to the query condition, and may be the selection area under the SELECT clause.
The SQL statement to be executed may be understood as an SQL statement that is run to test whether the first compatible pair is executable. For each first compatible pair, the first compatible pair is added to the selection area of the set SQL statement to obtain the SQL statement to be executed corresponding to the first compatible pair; for example, the SQL statement to be executed can be expressed as "SELECT (first compatible pair) WHERE TRUE".
S290, determining whether the SQL statement to be executed is executable by running the SQL statement to be executed corresponding to the first compatible pair; if yes, executing S2100; otherwise, executing S2110.
In this embodiment, by running the SQL statement to be executed corresponding to the first compatible pair, it may be determined whether the SQL statement to be executed is executable. If the SQL statement to be executed runs successfully, it is executable, and S2100 may then be executed. If the SQL statement to be executed fails to run, it is not executable, which may also indicate that the first compatible pair is not compatible, and S2110 may then be executed.
S2100, determining the first compatible pair as a target compatible pair.
In this embodiment, if the to-be-executed SQL statement corresponding to the first compatible pair is executable, the first compatible pair may be determined as the target compatible pair.
S2110, determining a maximum joint probability density from the second joint probability densities of the second compatible pairs corresponding to the first compatible pair, and determining the second compatible pair corresponding to the maximum joint probability density as a target compatible pair.
In the present embodiment, the maximum joint probability density may be understood as the second joint probability density having the maximum value. A first compatible pair may correspond to at least one second compatible pair, and each corresponding second compatible pair has the same aggregation operator as the first compatible pair.
If the SQL statement to be executed corresponding to the first compatible pair is not executable, determining the second joint probability density with the maximum value from the second joint probability densities of the second compatible pairs corresponding to the first compatible pair as the maximum joint probability density, and then determining the second compatible pair corresponding to the maximum joint probability density as the target compatible pair corresponding to the first compatible pair.
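Steps S280 to S2110 can be sketched with Python's built-in `sqlite3` module as below. This is only an illustration under several assumptions: the test statement adds a `FROM` clause (the example statement in the text has none) so that column references can be resolved, `WHERE 1 = 1` is used as a portable equivalent of `WHERE TRUE`, and the table `employees`, its columns, the pairs, and the density values are all hypothetical.

```python
import sqlite3

def is_executable(connection, pair, table):
    """Place the pair in the selection area of a test statement and try to run it."""
    operator, column = pair
    statement = f"SELECT {operator}({column}) FROM {table} WHERE 1 = 1"
    try:
        connection.execute(statement).fetchall()
        return True
    except sqlite3.Error:
        return False  # the statement failed to run, so the pair is not executable

def choose_target_pair(connection, table, first_pair, scored_second_pairs):
    """Keep an executable first compatible pair; otherwise fall back to the second
    compatible pair with the maximum second joint probability density."""
    if is_executable(connection, first_pair, table):
        return first_pair
    best_pair, _ = max(scored_second_pairs, key=lambda item: item[1])
    return best_pair

connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE employees (name TEXT, age INTEGER, salary REAL)")
connection.execute("INSERT INTO employees VALUES ('Ann', 30, 5000.0)")

# Hypothetical second compatible pairs of the operator AVG with their densities.
scored_second_pairs = [(("AVG", "age"), 0.8), (("AVG", "salary"), 0.6), (("AVG", "name"), 0.1)]

kept = choose_target_pair(connection, "employees", ("AVG", "age + salary"), scored_second_pairs)
replaced = choose_target_pair(connection, "employees", ("AVG", "age + bonus"), scored_second_pairs)
```

In this sketch the pair `("AVG", "age + bonus")` fails because the column `bonus` does not exist, so the fallback pair `("AVG", "age")` is chosen. Note that SQLite is permissive about types, so a stricter database engine may reject more pairs than this example does.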
S2120, generating a target SQL query statement corresponding to the natural language description information based on each target vector pair, each target compatible pair and a preset SQL query statement framework.
The second embodiment of the present invention provides a method that details the process of determining at least one vector pair, a first joint probability density corresponding to each vector pair, at least one aggregation operator, at least one column, and an aggregation column according to each piece of information to be processed and the bidirectional recurrent neural network, the process of determining at least one target vector pair based on the at least one vector pair and the first joint probability density corresponding to each vector pair, and the process of determining at least one target compatible pair based on the at least one aggregation operator, the at least one column, the aggregation column, and the bidirectional recurrent neural network. According to the method, a first compatible pair whose SQL statement to be executed is executable is selected as a target compatible pair, and an incompatible first compatible pair is replaced, as the target compatible pair, by the corresponding second compatible pair with the largest second joint probability density, so that more reliable compatible pairs with a higher degree of association are obtained for generating the target SQL query statement. By selecting the vector pairs with higher first joint probability density as target vector pairs, more reliable information to be processed with a higher degree of association can be obtained. On this basis, the target SQL query statement is generated by combining the more reliable target vector pairs and target compatible pairs with the preset SQL query statement framework, which can effectively improve the accuracy of generating the SQL query statement.
The present invention is exemplified below.
In the present embodiment, execution-guided decoding is proposed as an extension of the standard recurrent autoregressive decoder; it can be seen as an extension of standard beam search applied to the decoding of the decoder unit of a particular model. At the current time step t, each result corresponds to an executable partial program, and only the SQL query statements in the beam whose partial programs produce neither an execution error nor an empty output are retained. The frame with the highest probability is selected and passed on to the next stage of decoding. When the aggregation operator f and the aggregation column c are decoded, the execution engine is run to select, from the partial programs at step t, the compatible pair (f; c) with the highest joint probability density. The top k (c1; c2) combinations (i.e., target vector pairs) with the highest joint probability densities can be retained, which helps avoid execution errors.
The execution guidance mechanism may be used as a filtering step at the end of the decoding process, for example by deleting generated programs that produce execution errors. The same applies at the end of beam decoding for any autoregressive decoder. However, in many application domains (including SQL generation), the check may also be applied to partially decoded programs. Standard beam decoding of width k may be performed first, and then, at the end of decoding, the generated SQL program ranked highest by joint probability density is selected so as to avoid execution errors.
In practice, rather than computing all the correct choices, the guided decoder is parameterized by a beam width k and simply discards the results that trigger errors. This is similar to a standard beam decoder, except that, in addition to generating the top-k (i.e., the first k) results, the candidates are evaluated, erroneous programs are discarded, and the remaining result with the highest probability is used.
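The filtering behaviour described above can be illustrated with a short Python sketch. It assumes the candidates are complete SQL strings paired with probabilities and uses an in-memory SQLite database as the execution engine; the function `execution_guided_filter`, the table, and the candidate list are hypothetical and only meant to show the discard-on-error and discard-on-empty-output rule.

```python
import sqlite3

def execution_guided_filter(candidates, beam_width, try_execute):
    """Keep up to beam_width candidates, in descending probability, whose execution
    neither raises an error nor returns an empty output."""
    survivors = []
    for program, probability in sorted(candidates, key=lambda c: c[1], reverse=True):
        try:
            result = try_execute(program)
        except Exception:
            continue  # execution error: discard the candidate
        if not result:
            continue  # empty output: discard the candidate
        survivors.append((program, probability))
        if len(survivors) == beam_width:
            break
    return survivors

connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE employees (name TEXT, age INTEGER)")
connection.execute("INSERT INTO employees VALUES ('Ann', 30)")

def run(program):
    return connection.execute(program).fetchall()

candidates = [  # made-up candidate programs and probabilities
    ("SELECT bonus FROM employees", 0.5),                # fails: no such column
    ("SELECT name FROM employees WHERE age > 99", 0.3),  # runs but returns nothing
    ("SELECT name FROM employees WHERE age > 18", 0.15),
    ("SELECT age FROM employees", 0.05),
]
beam = execution_guided_filter(candidates, beam_width=2, try_execute=run)
```

Here the two most probable candidates are dropped, and the beam keeps the next two that execute and return rows. Applying the same filter after each decoding stage, rather than only once at the end, would correspond to the check on partially decoded programs.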
If no valid result is found, the method backtracks and a different frame is emitted by the "coarse" model. The coarse model can be seen as a hybrid of a template-based model and an end-to-end model. It is a general two-stage text-to-code translation model, in which the first stage generates a coarse "frame" (template) of the target program and the second stage fills the missing "slots" of that coarse frame. The template-based model makes two predictions: (a) which template to use; and (b) which words in the natural language question should be used to fill the slots in the selected template. A bidirectional RNN is run on the natural language question and outputs a "used in slot" or "unused in query" signal for each token. The selected template is then predicted from the final state of the RNN using a small fully connected network. The output SQL query statement is constructed by filling the slots of the template with the predicted tokens from the input question.
The model generates a program in the following three steps. First, the input natural language query is encoded using a bidirectional RNN with Long Short-Term Memory (LSTM) cells. The frame generator then uses a classifier to select a partial query frame of the form "Where ()". The frame determines the number of conditions in the Where clause and the comparison operators. Finally, the input and the generated frame are used to generate the complete SQL query statement by filling the slots.
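The relationship between the frame and the slot filling can be sketched as follows. The sketch assumes that a frame is represented simply as the list of comparison operators of its Where clause and that the slot fillers are `(column, value)` pairs; both representations, and the function `build_where_clause`, are hypothetical. The neural encoder and the classifier that would choose the frame are not modeled.

```python
def build_where_clause(frame_operators, slot_fillers):
    """Fill a Where frame: one (column, value) slot filler per comparison operator."""
    if len(frame_operators) != len(slot_fillers):
        raise ValueError("frame and slot fillers disagree on the number of conditions")
    conditions = [
        f"{column} {operator} {value!r}"
        for operator, (column, value) in zip(frame_operators, slot_fillers)
    ]
    return "WHERE " + " AND ".join(conditions)

# A frame with two conditions, as it might be selected by the classifier (made-up example).
frame_operators = ["=", ">"]
slot_fillers = [("department", "sales"), ("age", 30)]
where_clause = build_where_clause(frame_operators, slot_fillers)
```

Because the frame fixes the number of conditions in advance, a mismatch between the frame and the slot fillers is treated as an error here; a decoder with backtracking could instead respond by requesting a different frame.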
The present embodiment can improve the accuracy of semantic parsing of natural language questions by allowing any autoregressive decoder to be conditioned on the results of non-differentiable partial execution during inference, thereby eliminating semantically invalid programs from the candidate programs.
Example three
Fig. 3 is a schematic structural diagram of a statement generating apparatus according to a third embodiment of the present invention, where the apparatus may be implemented by software and/or hardware. As shown in fig. 3, the apparatus includes:
a first determining module 310, configured to determine at least one word and at least one field corresponding to input natural language description information, and determine the at least one word and the at least one field as to-be-processed information, where a word corresponds to one to-be-processed information, a field corresponds to one to-be-processed information, the word is a minimum semantic unit constituting the natural language description information, and the field is a field corresponding to the natural language description information in a set database table;
a second determining module 320, configured to determine, according to each piece of information to be processed and the bidirectional recurrent neural network, at least one vector pair, a first joint probability density corresponding to each vector pair, at least one aggregation operator, at least one column, and an aggregation column, where the vector pair is formed by two corresponding decoded vectors, the first joint probability density indicates a degree of association between two decoded vectors in the corresponding vector pair, the decoded vectors are obtained after the information to be processed is encoded and decoded, and the aggregation column is a column formed by combining the at least one column;
a third determining module 330, configured to determine at least one target vector pair based on at least one vector pair and the first joint probability density corresponding to each of the vector pairs;
a fourth determining module 340, configured to determine at least one target compatible pair based on the at least one aggregation operator, the at least one column, the aggregation column, and the bidirectional recurrent neural network, the target compatible pair indicating an executable compatible pair, the compatible pair indicating a combined pair consisting of one aggregation operator and one column or a combined pair consisting of one aggregation operator and one aggregation column;
a generating module 350, configured to generate a target SQL query statement corresponding to the natural language description information based on each target vector pair, each target compatible pair, and a preset structured query language SQL query statement frame.
In this embodiment, the apparatus first determines, through the first determining module 310, at least one word and at least one field corresponding to the input natural language description information, and determines the at least one word and the at least one field as information to be processed, where one word corresponds to one piece of information to be processed, one field corresponds to one piece of information to be processed, the word is a minimum semantic unit constituting the natural language description information, and the field is a field corresponding to the natural language description information in a set database table. Secondly, through the second determining module 320, at least one vector pair, a first joint probability density corresponding to each vector pair, at least one aggregation operator, at least one column, and an aggregation column are determined according to each piece of information to be processed and the bidirectional recurrent neural network, where the vector pair is composed of two corresponding decoding vectors, the first joint probability density indicates the degree of association between the two decoding vectors in the corresponding vector pair, the decoding vectors are obtained by encoding and decoding the information to be processed, and the aggregation column is a column formed by combining the at least one column. Then, through the third determining module 330, at least one target vector pair is determined based on the at least one vector pair and the first joint probability density corresponding to each vector pair. Then, through the fourth determining module 340, at least one target compatible pair is determined based on the at least one aggregation operator, the at least one column, the aggregation column, and the bidirectional recurrent neural network, where the target compatible pair indicates an executable compatible pair, and the compatible pair indicates a combined pair composed of one aggregation operator and one column or a combined pair composed of one aggregation operator and one aggregation column. Finally, through the generating module 350, a target SQL query statement corresponding to the natural language description information is generated based on each target vector pair, each target compatible pair, and the preset SQL query statement frame.
The apparatus can obtain the vector pairs, the first joint probability density corresponding to each vector pair, the aggregation operators, the columns, and the aggregation column from the to-be-processed information corresponding to the natural language description information and the bidirectional recurrent neural network. On this basis, target compatible pairs and target vector pairs with a higher degree of association can be obtained, and an accurate, executable SQL query statement corresponding to the natural language description information can be obtained by combining them with the preset SQL query statement frame, thereby improving the accuracy of SQL query statement generation.
Optionally, the second determining module 320 includes:
the first input unit is used for inputting the information to be processed to an encoder of a bidirectional recurrent neural network to obtain encoding vectors corresponding to the information to be processed respectively;
and the second input unit is used for inputting each encoding vector to a decoder of the bidirectional recurrent neural network to obtain at least one vector pair, a first joint probability density corresponding to each vector pair, at least one aggregation operator, at least one column and an aggregation column.
Optionally, the second input unit includes:
the decoding subunit is used for obtaining decoding vectors corresponding to the coding vectors through a decoder of the bidirectional recurrent neural network;
a selecting subunit, configured to select, by the decoder, any two different decoding vectors from the decoding vectors, form a vector pair with the any two different decoding vectors, and determine a first joint probability density corresponding to the vector pair;
and an execution subunit, configured to repeatedly execute an operation of selecting any two different decoding vectors from the decoding vectors until there are no two decoding vectors that are not simultaneously selected from the decoding vectors.
Optionally, the third determining module 330 includes:
a selecting unit, configured to select a set number of first joint probability densities from the first joint probability densities in descending order, and use each selected first joint probability density as a target joint probability density;
and the vector pair determining unit is used for determining the vector pair corresponding to each target joint probability density as a target vector pair.
Optionally, the fourth determining module 340 includes:
a forming unit, configured to, for each aggregation operator, form a corresponding first compatible pair by the aggregation operator and the aggregation column, and form a corresponding second compatible pair by the aggregation operator and each column respectively;
a density determination unit, configured to determine a second joint probability density of each of the second compatible pairs by a decoder of the bidirectional recurrent neural network, wherein the first compatible pair and the second compatible pair of the same aggregation operator correspond to each other;
the adding unit is used for adding the first compatible pair to a selection area of a set SQL statement to obtain an SQL statement to be executed corresponding to the first compatible pair;
the running unit is used for determining whether the SQL statement to be executed is executable by running the SQL statement to be executed corresponding to the first compatible pair;
a first determining unit, configured to determine, if yes, the first compatible pair as a target compatible pair;
and a second determining unit, configured to determine, if not, a maximum joint probability density from second joint probability densities of second compatible pairs corresponding to the first compatible pair, and determine, as the target compatible pair, the second compatible pair corresponding to the maximum joint probability density.
Optionally, the generating module 350 includes:
and the generating unit is used for filling statement information corresponding to each target vector pair and each target compatible pair into a preset SQL query statement frame to obtain a target SQL query statement corresponding to the natural language description information, wherein the statement information indicates to-be-processed information corresponding to two decoding vectors in the corresponding target vector pair.
Optionally, the first determining module 310 includes:
the word segmentation unit is used for carrying out word segmentation processing on the natural language description information to obtain at least one word;
and the searching unit is used for searching at least one field corresponding to the natural language description information from a set database table through a semantic analysis algorithm.
The statement generation device provided by the embodiment of the invention can execute the statement generation method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.
Example four
Fig. 4 is a schematic structural diagram of an electronic device according to a fourth embodiment of the present invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed herein.
As shown in fig. 4, the electronic device 10 includes at least one processor 11, and a memory communicatively connected to the at least one processor 11, such as a Read Only Memory (ROM) 12, a Random Access Memory (RAM) 13, and the like, wherein the memory stores a computer program executable by the at least one processor, and the processor 11 can perform various suitable actions and processes according to the computer program stored in the Read Only Memory (ROM) 12 or the computer program loaded from a storage unit 18 into the Random Access Memory (RAM) 13. In the RAM 13, various programs and data necessary for the operation of the electronic apparatus 10 can also be stored. The processor 11, the ROM 12, and the RAM 13 are connected to each other via a bus 14. An input/output (I/O) interface 15 is also connected to bus 14.
A number of components in the electronic device 10 are connected to the I/O interface 15, including: an input unit 16 such as a keyboard, a mouse, or the like; an output unit 17 such as various types of displays, speakers, and the like; a storage unit 18 such as a magnetic disk, an optical disk, or the like; and a communication unit 19 such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The processor 11 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of processor 11 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, or the like. The processor 11 performs the respective methods and processes described above, such as the statement generation method.
In some embodiments, the statement generation method may be implemented as a computer program tangibly embodied in a computer-readable storage medium, such as storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM 12 and/or the communication unit 19. When the computer program is loaded into the RAM 13 and executed by the processor 11, one or more steps of the above-described statement generation method may be performed. Alternatively, in other embodiments, the processor 11 may be configured to perform the statement generation method in any other suitable manner (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on a Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
A computer program for implementing the methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be performed. A computer program can execute entirely on a machine, partly on a machine, as a stand-alone software package partly on a machine and partly on a remote machine or entirely on a remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. A computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user may provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local area networks (LANs), wide area networks (WANs), blockchain networks, and the internet.
The computing system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, also called a cloud computing server or cloud host, which is a host product in a cloud computing service system and addresses the high management difficulty and weak service scalability of traditional physical hosts and VPS services.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present invention may be executed in parallel, sequentially, or in different orders, and are not limited herein as long as the desired results of the technical solution of the present invention can be achieved.
The above-described embodiments should not be construed as limiting the scope of the invention. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A statement generation method, characterized in that the method comprises:
determining at least one word and at least one field corresponding to input natural language description information, and determining the at least one word and the at least one field as information to be processed, wherein one word corresponds to one piece of information to be processed, one field corresponds to one piece of information to be processed, the word is a minimum semantic unit forming the natural language description information, and the field is a field corresponding to the natural language description information in a set database table;
determining at least one vector pair, a first joint probability density corresponding to each vector pair, at least one aggregation operator, at least one column and an aggregation column according to each piece of information to be processed and a bidirectional recurrent neural network, wherein the vector pair is composed of two corresponding decoding vectors, the first joint probability density indicates the association degree between the two decoding vectors in the corresponding vector pair, the decoding vectors are obtained after the information to be processed is coded and decoded, and the aggregation column is a column composed of at least one column;
determining at least one target vector pair based on at least one vector pair and a first joint probability density corresponding to each of the vector pairs;
determining at least one target compatible pair based on the at least one aggregation operator, the at least one column, the aggregation column, and the bidirectional recurrent neural network, the target compatible pair indicating an executable compatible pair, the compatible pair indicating a combined pair consisting of one aggregation operator and one column or a combined pair consisting of one aggregation operator and one aggregation column;
and generating a target SQL query statement corresponding to the natural language description information based on each target vector pair, each target compatible pair and a preset Structured Query Language (SQL) query statement frame.
2. The method of claim 1, wherein determining at least one vector pair, a first joint probability density corresponding to each vector pair, at least one aggregation operator, at least one column, and an aggregation column according to each of the information to be processed and a bidirectional recurrent neural network comprises:
inputting each piece of information to be processed to an encoder of a bidirectional recurrent neural network to obtain a coding vector corresponding to each piece of information to be processed;
and inputting each coded vector into a decoder of the bidirectional recurrent neural network to obtain at least one vector pair, a first joint probability density corresponding to each vector pair, at least one aggregation operator, at least one column and an aggregation column.
3. The method of claim 2, wherein inputting each of the encoded vectors into a decoder of the bidirectional recurrent neural network to obtain at least one vector pair and a first joint probability density corresponding to each of the vector pairs comprises:
obtaining decoding vectors corresponding to the coding vectors respectively through a decoder of the bidirectional recurrent neural network;
selecting any two different decoding vectors from the decoding vectors through the decoder, forming the any two different decoding vectors into a vector pair, and determining a first joint probability density corresponding to the vector pair;
and repeating the operation of selecting any two different decoding vectors from the decoding vectors until every combination of two different decoding vectors has been selected.
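The pair enumeration recited in claim 3 can be sketched as follows. This is a minimal illustration, not the patented decoder: the function name `enumerate_vector_pairs` and the `dot_product` score are hypothetical, and the dot product merely stands in for the first joint probability density, whose actual computation the claim leaves to the decoder.

```python
from itertools import combinations
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]


def enumerate_vector_pairs(
    decoding_vectors: List[Vector],
    joint_density: Callable[[Vector, Vector], float],
) -> Dict[Tuple[int, int], float]:
    """Map every unordered pair of distinct vector indices to its score."""
    return {
        (i, j): joint_density(decoding_vectors[i], decoding_vectors[j])
        for i, j in combinations(range(len(decoding_vectors)), 2)
    }


def dot_product(u: Vector, v: Vector) -> float:
    # Placeholder score (assumption): NOT the decoder's joint probability density.
    return sum(a * b for a, b in zip(u, v))


# Three decoding vectors yield 3 * 2 / 2 = 3 vector pairs.
pairs = enumerate_vector_pairs([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]], dot_product)
```

Using `itertools.combinations` guarantees that each combination of two different decoding vectors is visited exactly once, which matches the stopping condition of the repeated selection.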
4. The method of claim 1, wherein determining at least one target vector pair based on at least one vector pair and a first joint probability density for each of the vector pairs comprises:
selecting a set number of first joint probability densities from the first joint probability densities, and taking each selected first joint probability density as a target joint probability density;
and determining the vector pair corresponding to each target joint probability density as a target vector pair.
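The selection in claim 4 can be illustrated with a short sketch. The claim only requires that a set number of first joint probability densities be selected; the sketch assumes that the largest ones are chosen, and the name `select_target_pairs` and the sample scores are hypothetical.

```python
import heapq
from typing import Dict, List, Tuple

PairKey = Tuple[int, int]


def select_target_pairs(pair_scores: Dict[PairKey, float], set_number: int) -> List[PairKey]:
    """Return the vector pairs with the highest scores (assumed selection rule)."""
    # Iterating a dict yields its keys; each key is ranked by its score.
    return heapq.nlargest(set_number, pair_scores, key=pair_scores.get)


# Hypothetical first joint probability densities for three vector pairs.
scores = {(0, 1): 0.62, (0, 2): 0.05, (1, 2): 0.33}
targets = select_target_pairs(scores, 2)
```

With these sample values the two retained target vector pairs are `(0, 1)` and `(1, 2)`.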
5. The method of claim 1, wherein determining at least one target compatible pair based on the at least one aggregation operator, the at least one column, the aggregation column, and the bidirectional recurrent neural network comprises:
for each aggregation operator, forming the aggregation operator and the aggregation column into a corresponding first compatible pair, and forming the aggregation operator and each column into a corresponding second compatible pair respectively;
determining, by a decoder of the bidirectional recurrent neural network, a second joint probability density for each of the second compatible pairs, wherein a first compatible pair and a second compatible pair formed from the same aggregation operator correspond to each other;
for each first compatible pair, adding the first compatible pair to a selection area of a set SQL statement to obtain an SQL statement to be executed corresponding to the first compatible pair;
determining whether each SQL statement to be executed can be executed by running the SQL statement to be executed corresponding to each first compatible pair;
if yes, determining the first compatible pair as a target compatible pair;
otherwise, determining the maximum joint probability density from the second joint probability densities of the second compatible pairs corresponding to the first compatible pair, and determining the second compatible pair corresponding to the maximum joint probability density as a target compatible pair.
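The execute-then-fall-back logic of claim 5 can be sketched with an in-memory SQLite database. Several points are assumptions made for illustration: the aggregation column is modeled as an SQL expression string, the "selection area" is taken to be the SELECT list, the second joint probability densities are supplied as a plain dictionary rather than by a decoder, and the names `is_executable` and `choose_target_pairs` are hypothetical. Identifiers are interpolated directly into the SQL text, so the sketch is suitable only for trusted input.

```python
import sqlite3
from typing import Dict, List, Tuple

Pair = Tuple[str, str]  # (aggregation operator, column or aggregation column)


def is_executable(conn: sqlite3.Connection, table: str, pair: Pair) -> bool:
    """Place the pair in the SELECT list and report whether the statement runs."""
    operator, expression = pair
    try:
        conn.execute(f"SELECT {operator}({expression}) FROM {table}")
        return True
    except sqlite3.Error:
        return False


def choose_target_pairs(
    conn: sqlite3.Connection,
    table: str,
    operators: List[str],
    columns: List[str],
    aggregation_column: str,
    second_density: Dict[Pair, float],
) -> List[Pair]:
    targets = []
    for op in operators:
        first_pair = (op, aggregation_column)
        if is_executable(conn, table, first_pair):
            targets.append(first_pair)
        else:
            # Fall back to the single-column pair with the highest density.
            second_pairs = [(op, col) for col in columns]
            targets.append(max(second_pairs, key=lambda p: second_density[p]))
    return targets


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (price REAL, quantity INTEGER)")
conn.execute("INSERT INTO orders VALUES (2.5, 4)")
densities = {("SUM", "price"): 0.7, ("SUM", "quantity"): 0.3}  # hypothetical values

executable = choose_target_pairs(
    conn, "orders", ["SUM"], ["price", "quantity"], "price * quantity", densities)
fallback = choose_target_pairs(
    conn, "orders", ["SUM"], ["price", "quantity"], "price * missing_col", densities)
```

In the first call the aggregation column runs successfully, so the first compatible pair is kept. In the second call the expression refers to a column that does not exist, so the second compatible pair with the largest density, `("SUM", "price")`, is chosen instead.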
6. The method of claim 1, wherein generating the target SQL query statement corresponding to the natural language description information based on each target vector pair, each target compatible pair, and a preset SQL query statement framework comprises:
and filling statement information corresponding to each target vector pair and each target compatible pair into a preset SQL query statement frame to obtain a target SQL query statement corresponding to the natural language description information, wherein the statement information indicates to-be-processed information corresponding to two decoding vectors in the corresponding target vector pair.
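The frame filling in claim 6 can be sketched as simple template substitution. The claim does not fix the layout of the preset frame or the role of each target vector pair, so the sketch assumes a `SELECT ... FROM ... WHERE ...` frame and assumes that the statement information of each target vector pair is a (field, value) equality condition. `SQL_FRAME` and `fill_frame` are hypothetical names.

```python
from typing import List, Tuple

# Assumed shape of the preset SQL query statement frame.
SQL_FRAME = "SELECT {select_clause} FROM {table}{where_clause}"


def fill_frame(
    table: str,
    target_compatible_pairs: List[Tuple[str, str]],
    target_statement_info: List[Tuple[str, str]],
) -> Tuple[str, List[str]]:
    """Fill the frame; condition values are returned separately as parameters."""
    select_clause = ", ".join(f"{op}({col})" for op, col in target_compatible_pairs)
    where_clause = ""
    if target_statement_info:
        conditions = " AND ".join(f"{field} = ?" for field, _ in target_statement_info)
        where_clause = " WHERE " + conditions
    params = [value for _, value in target_statement_info]
    sql = SQL_FRAME.format(
        select_clause=select_clause, table=table, where_clause=where_clause)
    return sql, params


sql, params = fill_frame("orders", [("SUM", "price")], [("region", "north")])
```

Returning the values as bound parameters rather than splicing them into the text is a design choice of this sketch, not something the claim requires; it keeps user-supplied words out of the SQL string.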
7. The method of claim 1, wherein determining at least one word and at least one field corresponding to the input natural language description information comprises:
performing word segmentation processing on the natural language description information to obtain at least one word;
and searching at least one field corresponding to the natural language description information from a set database table through a semantic analysis algorithm.
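The two steps of claim 7 can be illustrated with deliberately simple stand-ins. The claim names neither a segmentation tool nor a specific semantic analysis algorithm, so the sketch substitutes regular-expression tokenization for word segmentation and an alias lookup table for semantic analysis. The functions `segment` and `find_fields` and the alias table are hypothetical.

```python
import re
from typing import Dict, List, Set


def segment(description: str) -> List[str]:
    """Stand-in for word segmentation: lowercase alphanumeric tokens."""
    return re.findall(r"\w+", description.lower())


def find_fields(words: List[str], field_aliases: Dict[str, Set[str]]) -> List[str]:
    """Stand-in for semantic analysis: keep fields whose aliases occur in the words."""
    word_set = set(words)
    return [field for field, aliases in field_aliases.items() if word_set & aliases]


# Hypothetical fields of the set database table, each with a few aliases.
aliases = {
    "price": {"price", "cost"},
    "region": {"region", "area"},
    "quantity": {"quantity", "count"},
}
words = segment("Total cost of orders in the north region")
fields = find_fields(words, aliases)
```

For this description the matched fields are `price` (via the alias "cost") and `region`; the words and fields together would form the information to be processed.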
8. A statement generation apparatus, comprising:
a first determining module, configured to determine at least one word and at least one field corresponding to input natural language description information, and to determine the at least one word and the at least one field as information to be processed, wherein one word corresponds to one piece of information to be processed, one field corresponds to one piece of information to be processed, the word is a minimum semantic unit forming the natural language description information, and the field is a field corresponding to the natural language description information in a set database table;
a second determining module, configured to determine, according to each piece of information to be processed and a bidirectional recurrent neural network, at least one vector pair, a first joint probability density corresponding to each vector pair, at least one aggregation operator, at least one column, and an aggregation column, where the vector pair is formed by two corresponding decoded vectors, the first joint probability density indicates a degree of association between the two decoded vectors in the corresponding vector pair, the decoded vector is a vector obtained after the information to be processed is encoded and decoded, and the aggregation column is a column formed by combining the at least one column;
a third determining module, configured to determine at least one target vector pair based on at least one vector pair and the first joint probability density corresponding to each of the vector pairs;
a fourth determination module for determining at least one target compatible pair based on the at least one aggregation operator, the at least one column, the aggregation column, and the bidirectional recurrent neural network, the target compatible pair indicating an executable compatible pair, the compatible pair indicating a combined pair of one aggregation operator and one column or a combined pair of one aggregation operator and one aggregation column;
and the generating module is used for generating a target SQL query statement corresponding to the natural language description information based on each target vector pair, each target compatible pair and a preset Structured Query Language (SQL) query statement frame.
9. An electronic device, characterized in that the electronic device comprises:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores a computer program executable by the at least one processor, the computer program being executable by the at least one processor to enable the at least one processor to perform the statement generation method of any of claims 1-7.
10. A computer-readable storage medium storing computer instructions for causing a processor to implement the statement generation method of any one of claims 1-7 when executed.
CN202211358681.XA 2022-11-01 2022-11-01 Statement generation method and device, electronic equipment and medium Pending CN115878662A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211358681.XA CN115878662A (en) 2022-11-01 2022-11-01 Statement generation method and device, electronic equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211358681.XA CN115878662A (en) 2022-11-01 2022-11-01 Statement generation method and device, electronic equipment and medium

Publications (1)

Publication Number Publication Date
CN115878662A true CN115878662A (en) 2023-03-31

Family

ID=85759306

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211358681.XA Pending CN115878662A (en) 2022-11-01 2022-11-01 Statement generation method and device, electronic equipment and medium

Country Status (1)

Country Link
CN (1) CN115878662A (en)

Similar Documents

Publication Publication Date Title
CN107301170B (en) Method and device for segmenting sentences based on artificial intelligence
US20220318275A1 (en) Search method, electronic device and storage medium
CN114281968B (en) Model training and corpus generation method, device, equipment and storage medium
CN118210908B (en) Retrieval enhancement method and device, electronic equipment and storage medium
CN114579104A (en) Data analysis scene generation method, device, equipment and storage medium
CN116450867B (en) Graph data semantic search method based on contrast learning and large language model
CN114861889A (en) Deep learning model training method, target object detection method and device
CN116028618B (en) Text processing method, text searching method, text processing device, text searching device, electronic equipment and storage medium
CN115293149A (en) Entity relationship identification method, device, equipment and storage medium
CN115576983A (en) Statement generation method and device, electronic equipment and medium
CN115062718A (en) Language model training method and device, electronic equipment and storage medium
CN115454706A (en) System abnormity determining method and device, electronic equipment and storage medium
CN113641830A (en) Model pre-training method and device, electronic equipment and storage medium
CN118410146A (en) Cross search method, device, equipment and storage medium based on large language model
KR102608867B1 (en) Method for industry text increment, apparatus thereof, and computer program stored in medium
CN113051896B (en) Method and device for correcting text, electronic equipment and storage medium
CN115168537A (en) Training method and device of semantic retrieval model, electronic equipment and storage medium
CN116955075A (en) Method, device, equipment and medium for generating analytic statement based on log
CN117371406A (en) Annotation generation method, device, equipment and medium based on large language model
CN116738323A (en) Fault diagnosis method, device, equipment and medium for railway signal equipment
CN115130470B (en) Method, device, equipment and medium for generating text keywords
CN114647739B (en) Entity chain finger method, device, electronic equipment and storage medium
CN115860003A (en) Semantic role analysis method and device, electronic equipment and storage medium
CN115878662A (en) Statement generation method and device, electronic equipment and medium
CN114969371A (en) Heat sorting method and device of combined knowledge graph

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination