CN112506949B - Method, device and storage medium for generating structured query language query statement - Google Patents

Method, device and storage medium for generating structured query language query statement

Info

Publication number
CN112506949B
Authority
CN
China
Prior art keywords
sub
sql
processed
clause
clauses
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011412459.4A
Other languages
Chinese (zh)
Other versions
CN112506949A (en)
Inventor
王丽杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202011412459.4A
Publication of CN112506949A
Application granted
Publication of CN112506949B
Legal status: Active (current)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2433 Query languages
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a method, a device, and a storage medium for generating SQL query statements, relating to artificial intelligence fields such as natural language processing and deep learning. The method may comprise: splitting a question to be processed into at least two sub-questions; obtaining the SQL clause corresponding to each sub-question; and combining the SQL clauses to obtain the SQL query statement corresponding to the question to be processed. Applying this scheme can improve, among other things, the accuracy of the resulting SQL query statement.

Description

Method, device and storage medium for generating structured query language query statement
Technical Field
The present application relates to the field of artificial intelligence technologies, and in particular to a method, an apparatus, and a storage medium for generating a structured query language query statement in the fields of natural language processing and deep learning.
Background
Semantic parsing (text-to-SQL) is a core technology of language understanding. It aims to automatically convert natural language questions into Structured Query Language (SQL) query statements that can interact with a database. The technology helps users obtain information from a database and lowers the threshold and cost of using one.
In practical applications, many questions are complex, typically involving multi-table joins, nesting, multiple operations, and the like. Currently, for any question, a pre-trained model is generally used to generate the corresponding SQL query statement. However, this approach generally performs poorly on complex questions, i.e., the accuracy of the resulting SQL query statement is low.
Disclosure of Invention
In view of this, the present application provides a method, an apparatus, and a storage medium for generating an SQL query statement.
A method of generating a structured query language query statement, comprising:
dividing the problem to be processed into at least two sub-problems;
respectively obtaining structured query language SQL clauses corresponding to all the sub-questions;
and combining the SQL clauses to obtain the SQL query statement corresponding to the problem to be processed.
A structured query language query statement generation apparatus comprising: the device comprises a segmentation module, an acquisition module and a combination module;
the segmentation module is used for segmenting the problem to be processed into at least two sub-problems;
the acquisition module is used for respectively acquiring structured query language SQL clauses corresponding to all the sub-problems;
and the combination module is used for combining the SQL clauses to obtain the SQL query statement corresponding to the problem to be processed.
An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method as described above.
A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform a method as described above.
One embodiment of the above application has the following advantages or benefits: the question to be processed is first split into a plurality of sub-questions, the SQL clause corresponding to each sub-question is then obtained, and the SQL clauses are combined to obtain the required SQL query statement. In other words, a complex question is reduced to a sequence of simple questions. Because the SQL clauses obtained for simple sub-questions are usually accurate, the accuracy of the combined SQL query statement is improved as well.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The drawings are for better understanding of the present solution and do not constitute a limitation of the present application. Wherein:
FIG. 1 is a flowchart of an embodiment of the method for generating an SQL query statement described in the present application;
FIG. 2 is a schematic diagram of the overall implementation process of the SQL query statement generation method described in the present application;
FIG. 3 is a schematic structural diagram of an embodiment of the SQL query statement generating device 30 described in the present application;
FIG. 4 is a block diagram of an electronic device for implementing the method according to an embodiment of the present application.
Detailed Description
Exemplary embodiments of the present application are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present application to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
In addition, it should be understood that the term "and/or" herein is merely one association relationship describing the associated object, and means that three relationships may exist, for example, a and/or B may mean: a exists alone, A and B exist together, and B exists alone. In addition, the character "/" herein generally indicates that the front and rear associated objects are an "or" relationship.
FIG. 1 is a flowchart of an embodiment of the method for generating an SQL query statement described in the present application. As shown in FIG. 1, the method includes the following specific implementation.
In step 101, the question to be processed is split into at least two sub-questions.
The specific number of sub-questions may depend on the actual situation. Preferably, the question to be processed may be a complex question.
In step 102, SQL clauses (Clause) corresponding to the respective sub-questions are obtained.
An SQL clause may also be known by other equivalent names.
In step 103, each SQL clause is combined to obtain the SQL query statement corresponding to the problem to be processed.
The SQL clauses are combined into a single SQL query statement, which is the SQL query statement corresponding to the question to be processed.
It can be seen that, in the solution of this method embodiment, the question to be processed is first split into a plurality of sub-questions, the SQL clause corresponding to each sub-question is then obtained, and the SQL clauses are combined to obtain the required SQL query statement. In other words, a complex question is reduced to a sequence of simple questions. Because the SQL clauses obtained for simple sub-questions are generally more accurate, the accuracy of the combined SQL query statement is improved as well.
For a problem to be processed, it may preferably be split into at least two semantically complete and independent sub-problems, the overall semantics of all sub-problem expressions being identical to the semantics of the problem to be processed. For example, the problem to be processed is split into three sub-problems, and then the whole semantics expressed by the three sub-problems need to be equivalent to the semantics of the problem to be processed.
Through the processing, the finally obtained SQL query statement can be matched with the semantics of the problem to be processed, and the accuracy and the like of the obtained SQL query statement are further improved.
Preferably, for the problem to be processed, a coded representation of the problem to be processed may be obtained first, and then the coded representation may be segmented and decoded using a pre-trained problem segmentation model, thereby obtaining at least two sub-problems as required.
How the encoded representation of the question to be processed is obtained is not limited. For example, it may be generated by a pre-trained Bidirectional Encoder Representations from Transformers (BERT) model.
Based on the obtained encoded representation, segmentation decoding can be performed using the question segmentation model, thereby obtaining at least two sub-questions. In practical applications, the question segmentation model may be trained on existing data sets in which complex sentences are split into sequences of simple sentences, such as the WebSplit and/or WikiSplit data sets; how to train the model is prior art. The question segmentation model may be a sequence-to-sequence (Seq2Seq) generation model based on an encoder-decoder structure.
By means of the problem segmentation model, the problem to be processed can be rapidly and accurately segmented, and therefore a good foundation is laid for subsequent processing.
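The segmentation step can be pictured with a small sketch. This is only an illustration under stated assumptions: the `MarkerSplitter` class, its connective list, and the example question are hypothetical, and a hand-written rule stands in for the trained Seq2Seq segmentation model that the text describes.

```python
import re
from typing import List, Protocol


class QuestionSplitter(Protocol):
    """Anything that maps one question to a list of sub-questions."""

    def split(self, question: str) -> List[str]: ...


class MarkerSplitter:
    """Rule-based stand-in for the trained Seq2Seq segmentation model.

    It cuts on a few hand-picked connectives; a real model would instead
    decode the sub-questions from the encoded representation.
    """

    # Hypothetical connectives chosen only for this illustration.
    _MARKERS = re.compile(r",\s*(?:and then|and|then)\s+|;\s*", re.IGNORECASE)

    def split(self, question: str) -> List[str]:
        parts = [p.strip(" .?") for p in self._MARKERS.split(question)]
        return [p for p in parts if p]


def split_question(question: str, splitter: QuestionSplitter) -> List[str]:
    sub_questions = splitter.split(question)
    # The method requires at least two sub-questions.
    if len(sub_questions) < 2:
        raise ValueError("question could not be split into at least two sub-questions")
    return sub_questions


example = (
    "List the names of all employees, and keep those older than 30, "
    "then sort them by salary"
)
sub_questions = split_question(example, MarkerSplitter())
```

Because `split_question` depends only on the `QuestionSplitter` interface, a model-backed splitter could replace the rule-based one without changing the calling code.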
For each obtained sub-problem, the corresponding SQL clause can be obtained respectively, namely, each sub-problem is mapped into the corresponding SQL clause respectively.
Preferably, for each sub-question, the corresponding SQL clause can be obtained using a pre-trained semantic parsing model. For example, the sub-question is fed as input to the semantic parsing model, which quickly and accurately outputs the SQL clause.
Further, for each sub-problem, the following processes may also be performed separately: determining the clause type of the SQL clause corresponding to the sub-problem according to the sub-problem; and obtaining the SQL clause corresponding to the sub-problem by using the semantic analysis model corresponding to the determined clause type, wherein different clause types respectively correspond to the respective semantic analysis models.
In practice, SQL clauses may be divided into different clause types, such as select, filter, order, etc. For each sub-question, the clause type of the SQL clause corresponding to the sub-question can be determined by analyzing the sub-question.
A corresponding semantic parsing model can be trained in advance for each clause type; how to train the semantic parsing models is prior art. Accordingly, for each sub-question, once the clause type of its SQL clause has been determined, the semantic parsing model corresponding to that clause type can be called, and the corresponding elements are selected from the database to fill in the clause, thereby obtaining the SQL clause corresponding to the sub-question.
Assume there are M different clause types, clause type 1 through clause type M, where M is a positive integer greater than one. A semantic parsing model can be trained for each clause type, yielding M semantic parsing models, semantic parsing model 1 through semantic parsing model M. If the SQL clause corresponding to a certain sub-question is of clause type 2, semantic parsing model 2, which corresponds to clause type 2, is called to obtain the SQL clause for that sub-question.
Corresponding semantic analysis models are respectively trained for different clause types, and correspondingly, the semantic analysis models of the corresponding types are called, so that the accuracy and the like of the obtained SQL clauses can be further improved.
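The per-type dispatch described above can be sketched as a lookup table from clause type to parser. The clause types `select`, `filter`, and `order` come from the text; everything else here is an assumption for illustration: the keyword rules in `classify_clause_type`, the three `parse_*` functions (fixed-output placeholders for separately trained semantic parsing models), and the `employees` schema.

```python
from typing import Callable, Dict

# Clause types named in the text; the keyword rules below are hypothetical.
SELECT, FILTER, ORDER = "select", "filter", "order"


def classify_clause_type(sub_question: str) -> str:
    """Toy classifier standing in for the clause-type determination step."""
    text = sub_question.lower()
    if "sort" in text or "order" in text:
        return ORDER
    if "older than" in text or "where" in text or "keep" in text:
        return FILTER
    return SELECT


# One parser per clause type. Each is a placeholder for a separately trained
# semantic parsing model and returns a fixed clause for the demo schema.
def parse_select(sub_question: str) -> str:
    return "SELECT name FROM employees"


def parse_filter(sub_question: str) -> str:
    return "WHERE age > 30"


def parse_order(sub_question: str) -> str:
    return "ORDER BY salary"


PARSERS: Dict[str, Callable[[str], str]] = {
    SELECT: parse_select,
    FILTER: parse_filter,
    ORDER: parse_order,
}


def sub_question_to_clause(sub_question: str) -> str:
    """Determine the clause type, then call the parser registered for it."""
    clause_type = classify_clause_type(sub_question)
    return PARSERS[clause_type](sub_question)
```

Keeping the parsers in a dictionary means that supporting an additional clause type only requires registering one more entry.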
After the SQL clauses corresponding to the sub-questions are obtained respectively, the SQL clauses can be combined, so that the SQL query statement corresponding to the question to be processed is obtained.
Preferably, the SQL clauses are combined according to their grammatical relations. For example, three sub-questions are obtained, and accordingly, three SQL clauses can be obtained, which can be combined according to the grammar relationship, so as to obtain the final required SQL query statement.
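One simple reading of combining clauses "according to their grammatical relations" is to arrange them in the keyword order of a SELECT statement. The sketch below assumes exactly that and nothing more: the `CLAUSE_ORDER` list and both function names are hypothetical, and nested or multi-table combinations, which the source also mentions, are not handled.

```python
from typing import List

# Position of each clause keyword in a common SELECT statement.
CLAUSE_ORDER = ["SELECT", "FROM", "WHERE", "GROUP BY", "HAVING", "ORDER BY", "LIMIT"]


def clause_rank(clause: str) -> int:
    """Return the position of a clause, judged by its leading keyword."""
    head = clause.strip().upper()
    for rank, keyword in enumerate(CLAUSE_ORDER):
        if head.startswith(keyword):
            return rank
    raise ValueError(f"unrecognized clause: {clause!r}")


def combine_clauses(clauses: List[str]) -> str:
    """Sort flat clauses into statement order and join them."""
    ordered = sorted((c.strip() for c in clauses), key=clause_rank)
    if not ordered or clause_rank(ordered[0]) != 0:
        raise ValueError("a SELECT clause is required")
    return " ".join(ordered)


query = combine_clauses(
    ["ORDER BY salary", "WHERE age > 30", "SELECT name FROM employees"]
)
# query == "SELECT name FROM employees WHERE age > 30 ORDER BY salary"
```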
Based on the above description, fig. 2 is a schematic diagram of an overall implementation process of the SQL query statement generating method described in the present application.
As shown in fig. 2, for a problem to be processed, such as a complex problem, its coded representation may be obtained first. For example, an encoded representation of the problem to be processed may be generated by a BERT pre-training model.
Then, segmentation decoding can be performed on the encoded representation of the question to be processed, thereby obtaining N sub-questions, sub-question 1 through sub-question N.
For example, a pre-trained question segmentation model may be used to segment and decode the encoded representation of the question to be processed, thereby obtaining sub-question 1 through sub-question N. In practical applications, the question segmentation model may be trained on existing data sets in which complex sentences are split into sequences of simple sentences, such as the WebSplit and/or WikiSplit data sets.
The N sub-questions are all independent sub-questions with complete semantics, and the whole semantics of all the sub-question expressions are equivalent to the semantics of the questions to be processed.
Then, the SQL clause corresponding to each sub-question can be generated, that is, the sub-question-to-SQL generation process shown in FIG. 2 is performed, yielding N SQL clauses, SQL clause 1 through SQL clause N, where SQL clause 1 corresponds to sub-question 1, SQL clause 2 corresponds to sub-question 2, and so on.
For each sub-problem, the SQL clause corresponding to the sub-problem can be obtained by respectively utilizing a semantic analysis model obtained through pre-training.
Further, for each sub-problem, the clause type of the SQL clause corresponding to the sub-problem can be determined according to the sub-problem, and the SQL clause corresponding to the sub-problem can be obtained by utilizing the semantic analysis model corresponding to the determined clause type, wherein different clause types respectively correspond to the semantic analysis models. Corresponding semantic analysis models can be trained in advance according to different clause types, and correspondingly, according to each sub-problem, after determining the clause type of the SQL clause corresponding to the sub-problem, the semantic analysis model corresponding to the determined clause type can be called, so that the SQL clause corresponding to the sub-problem is obtained.
And then, combining the obtained N SQL clauses so as to obtain the SQL query statement corresponding to the problem to be processed. For example, N SQL clauses may be combined according to a grammatical relationship between the N SQL clauses.
For the specific implementation of the SQL query statement generation process shown in FIG. 2, refer to the foregoing description, which is not repeated here.
It should be noted that, for simplicity of description, the foregoing method embodiments are shown as a series of acts, but it should be understood by those skilled in the art that the present application is not limited by the order of acts described, as some steps may be performed in other orders or concurrently in accordance with the present application. Further, those skilled in the art will also appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily required in the present application.
The foregoing is a description of embodiments of the method, and the following further describes embodiments of the device.
FIG. 3 is a schematic structural diagram of an embodiment of the SQL query statement generating device 30 described in the present application. As shown in FIG. 3, the device includes: a segmentation module 301, an acquisition module 302, and a combination module 303.
The splitting module 301 is configured to split a problem to be processed into at least two sub-problems.
And the obtaining module 302 is configured to obtain the SQL clauses corresponding to the sub-questions respectively.
And the combination module 303 is configured to combine the SQL clauses to obtain an SQL query statement corresponding to the to-be-processed problem.
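The three-module structure can be sketched as a small class that wires the modules in sequence. This is a minimal illustration, not the patented device: the class name `SqlGenerationDevice`, the callable signatures, and the lambda stand-ins (including the demo question and the lookup table of clauses) are all assumptions made so the example runs without trained models.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SqlGenerationDevice:
    """Mirrors device 30: three modules wired in sequence."""

    segmentation_module: Callable[[str], List[str]]   # module 301
    acquisition_module: Callable[[str], str]          # module 302
    combination_module: Callable[[List[str]], str]    # module 303

    def generate(self, question: str) -> str:
        sub_questions = self.segmentation_module(question)
        if len(sub_questions) < 2:
            raise ValueError("expected at least two sub-questions")
        clauses = [self.acquisition_module(q) for q in sub_questions]
        return self.combination_module(clauses)


# Hypothetical stand-ins so the sketch runs without trained models.
device = SqlGenerationDevice(
    segmentation_module=lambda q: [part.strip() for part in q.split(";")],
    acquisition_module=lambda q: {
        "list employee names": "SELECT name FROM employees",
        "only those older than 30": "WHERE age > 30",
    }[q],
    combination_module=lambda clauses: " ".join(clauses),
)

sql = device.generate("list employee names; only those older than 30")
# sql == "SELECT name FROM employees WHERE age > 30"
```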
Preferably, the question to be processed may be a complex question. The segmentation module 301 may split the question to be processed into at least two independent, semantically complete sub-questions, where the overall semantics expressed by all the sub-questions are equivalent to the semantics of the question to be processed.
Specifically, for the question to be processed, the segmentation module 301 may first obtain its encoded representation, for example by using a pre-trained BERT model, and then segment and decode the encoded representation using a pre-trained question segmentation model, thereby obtaining at least two sub-questions.
In practical applications, the question segmentation model may be trained on existing data sets in which complex sentences are split into sequences of simple sentences, such as the WebSplit and/or WikiSplit data sets.
The obtaining module 302 may obtain the SQL clauses corresponding to the sub-questions respectively. For any sub-problem, the obtaining module 302 may obtain the SQL clause corresponding to the sub-problem by using a semantic parsing model obtained by training in advance.
Further, for any sub-problem, the obtaining module 302 may further determine a clause type of the SQL clause corresponding to the sub-problem according to the sub-problem, and obtain the SQL clause corresponding to the sub-problem by using the semantic analysis model corresponding to the determined clause type, where different clause types respectively correspond to respective semantic analysis models.
Corresponding semantic analysis models can be trained in advance for different clause types, and correspondingly, for any sub-problem, after the clause type of the SQL clause corresponding to the sub-problem is determined, the semantic analysis model corresponding to the determined clause type can be called, so that the SQL clause corresponding to the sub-problem is obtained.
Then, the combination module 303 may combine the SQL clauses, so as to obtain an SQL query statement corresponding to the problem to be processed.
For example, each SQL clause may be combined according to the grammatical relation between each SQL clause, so as to obtain the SQL query statement corresponding to the problem to be processed.
For the specific workflow of the device embodiment shown in FIG. 3, refer to the related description in the foregoing method embodiment, which is not repeated here.
In summary, with the scheme of the device embodiment of the present application, the question to be processed is first split into a plurality of sub-questions, the SQL clause corresponding to each sub-question is then obtained, and the SQL clauses are combined to obtain the required SQL query statement. In other words, a complex question is reduced to a sequence of simple questions. Because the SQL clauses obtained for simple sub-questions are generally more accurate, the accuracy of the combined SQL query statement is improved as well.
The scheme can be applied to the field of artificial intelligence, and particularly relates to the fields of natural language processing, deep learning and the like.
Artificial intelligence is the discipline that studies how to make computers simulate certain human thinking processes and intelligent behaviors (such as learning, reasoning, thinking, and planning), and it involves technologies at both the hardware and software levels. Artificial intelligence hardware technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, and the like. Artificial intelligence software technologies mainly include computer vision, speech recognition, natural language processing, machine learning/deep learning, big data processing, knowledge graph technologies, and the like.
According to embodiments of the present application, an electronic device and a readable storage medium are also provided.
FIG. 4 is a block diagram of an electronic device for implementing the method according to an embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be exemplary only, and are not meant to limit implementations of the application described and/or claimed herein.
As shown in FIG. 4, the electronic device includes: one or more processors Y01, memory Y02, and interfaces for connecting the components, including high-speed interfaces and low-speed interfaces. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions executed within the electronic device, including instructions stored in or on the memory to display graphical information of a graphical user interface on an external input/output device, such as a display device coupled to the interface. In other embodiments, multiple processors and/or multiple buses may be used, if desired, along with multiple memories. Also, multiple electronic devices may be connected, each providing a portion of the necessary operations (e.g., as a server array, a set of blade servers, or a multiprocessor system). In FIG. 4, a single processor Y01 is taken as an example.
The memory Y02 is a non-transitory computer readable storage medium provided in the present application. Wherein the memory stores instructions executable by the at least one processor to cause the at least one processor to perform the methods provided herein. The non-transitory computer readable storage medium of the present application stores computer instructions for causing a computer to perform the methods provided herein.
The memory Y02 serves as a non-transitory computer readable storage medium, and may be used to store a non-transitory software program, a non-transitory computer executable program, and modules, such as program instructions/modules corresponding to the methods in the embodiments of the present application. The processor Y01 executes various functional applications of the server and data processing, i.e., implements the methods in the above-described method embodiments, by running non-transitory software programs, instructions, and modules stored in the memory Y02.
The memory Y02 may include a program storage area and a data storage area. The program storage area may store an operating system and at least one application program required for functions; the data storage area may store data created according to the use of the electronic device, etc. In addition, memory Y02 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, memory Y02 may optionally include memory located remotely from processor Y01, which may be connected to the electronic device via a network. Examples of such networks include, but are not limited to, the internet, intranets, blockchain networks, local area networks, mobile communication networks, and combinations thereof.
The electronic device may further include: an input device Y03 and an output device Y04. The processor Y01, memory Y02, input device Y03, and output device Y04 may be connected by a bus or otherwise; connection by a bus is taken as an example in FIG. 4.
The input device Y03 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device; examples include a touch screen, keypad, mouse, trackpad, touchpad, pointer stick, one or more mouse buttons, trackball, joystick, and similar input devices. The output device Y04 may include a display device, an auxiliary lighting device, a tactile feedback device (e.g., a vibration motor), and the like. The display device may include, but is not limited to, a liquid crystal display, a light emitting diode display, and a plasma display. In some implementations, the display device may be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, application specific integrated circuitry, computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implemented in one or more computer programs, the one or more computer programs may be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special purpose or general-purpose programmable processor, that may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also referred to as programs, software applications, or code) include machine instructions for a programmable processor, and may be implemented in a high-level procedural and/or object-oriented programming language, and/or in assembly/machine language. The terms "machine-readable medium" and "computer-readable medium" as used herein refer to any computer program product, apparatus, and/or device (e.g., magnetic disks, optical disks, memory, programmable logic devices) for providing machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a cathode ray tube or a liquid crystal display monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a background component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such background, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local area networks, wide area networks, blockchain networks, and the internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, also called a cloud computing server or cloud host, which is a host product in a cloud computing service system and overcomes the high management difficulty and weak service scalability of traditional physical hosts and virtual private server (VPS) services.
It should be appreciated that steps may be reordered, added, or deleted using the various forms of flow shown above. For example, the steps described in the present application may be performed in parallel, sequentially, or in a different order, provided that the desired results of the technical solutions disclosed in the present application can be achieved; no limitation is imposed herein.
The above embodiments do not limit the scope of the application. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present application are intended to be included within the scope of the present application.

Claims (8)

1. A method of generating a structured query language query statement, comprising:
splitting a problem to be processed into at least two sub-problems, including: dividing the problem to be processed into at least two independent sub-problems, each with complete semantics, wherein the overall semantics expressed by all the sub-problems is equal to the semantics of the problem to be processed;
respectively obtaining a structured query language (SQL) clause corresponding to each sub-problem, including: for any sub-problem, determining, according to the sub-problem, the clause type of the SQL clause corresponding to the sub-problem, and obtaining the SQL clause corresponding to the sub-problem by using a semantic analysis model corresponding to the determined clause type, wherein different clause types correspond to respective semantic analysis models;
and combining the SQL clauses to obtain the SQL query statement corresponding to the problem to be processed.
2. The method of claim 1, wherein the splitting the problem to be processed into at least two sub-problems comprises:
acquiring a coded representation of the problem to be processed;
and utilizing a pre-trained problem segmentation model to carry out segmentation decoding on the coded representation to obtain the at least two sub-problems.
3. The method of claim 1, wherein the combining the SQL clauses comprises: combining the SQL clauses according to the grammatical relation among the SQL clauses.
4. An apparatus for generating a structured query language query statement, comprising: a segmentation module, an acquisition module and a combination module;
the segmentation module is configured to split a problem to be processed into at least two sub-problems, including: dividing the problem to be processed into at least two independent sub-problems, each with complete semantics, wherein the overall semantics expressed by all the sub-problems is equal to the semantics of the problem to be processed;
the acquisition module is configured to respectively obtain a structured query language (SQL) clause corresponding to each sub-problem, including: for any sub-problem, determining, according to the sub-problem, the clause type of the SQL clause corresponding to the sub-problem, and obtaining the SQL clause corresponding to the sub-problem by using a semantic analysis model corresponding to the determined clause type, wherein different clause types correspond to respective semantic analysis models;
and the combination module is configured to combine the SQL clauses to obtain the SQL query statement corresponding to the problem to be processed.
5. The apparatus of claim 4, wherein,
the segmentation module acquires the coded representation of the problem to be processed, and segments and decodes the coded representation by utilizing a problem segmentation model obtained by pre-training to obtain the at least two sub-problems.
6. The apparatus of claim 4, wherein,
the combination module combines the SQL clauses according to the grammatical relation among the SQL clauses to obtain the SQL query statement corresponding to the problem to be processed.
7. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-3.
8. A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method of any one of claims 1-3.
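As a reading aid for the method of claims 1 to 3, the following Python sketch walks through the three steps: splitting a problem into sub-problems, obtaining one SQL clause per sub-problem through a parser selected by clause type, and combining the clauses by their grammatical order. It is a minimal illustration under stated assumptions, not the claimed implementation: the trained problem segmentation model and the per-type semantic analysis models are replaced by hypothetical hand-written rules, and every name here (`split_question`, `classify_clause_type`, `PARSERS`, `generate_sql`, the `cities` table) is invented for the example.

```python
import re
from typing import Callable, Dict, List, Tuple

# Grammatical order in which the clause types appear in a query statement.
CLAUSE_ORDER = ["SELECT", "WHERE", "ORDER BY"]


def split_question(question: str) -> List[str]:
    """Toy stand-in for the problem segmentation model (assumed rule:
    split on commas and before the word "whose")."""
    text = question.strip().rstrip("?.")
    parts = re.split(r",\s*|\s+(?=whose\b)", text)
    return [p.strip() for p in parts if p.strip()]


def classify_clause_type(sub_question: str) -> str:
    """Toy clause-type decision based on the leading words (assumption)."""
    text = sub_question.lower()
    if text.startswith(("sorted by", "ordered by")):
        return "ORDER BY"
    if text.startswith("whose"):
        return "WHERE"
    return "SELECT"


def _match(pattern: str, sub_question: str) -> "re.Match[str]":
    m = re.search(pattern, sub_question, flags=re.IGNORECASE)
    if m is None:
        raise ValueError(f"cannot parse sub-question: {sub_question!r}")
    return m


def parse_select(sub_question: str) -> str:
    # The FROM part is folded into the SELECT clause to keep the sketch short.
    m = _match(r"the (\w+) of (\w+)", sub_question)
    return f"SELECT {m.group(1)} FROM {m.group(2)}"


def parse_where(sub_question: str) -> str:
    m = _match(r"whose (\w+) is (over|under) (\d+)", sub_question)
    op = {"over": ">", "under": "<"}[m.group(2).lower()]
    return f"WHERE {m.group(1)} {op} {m.group(3)}"


def parse_order_by(sub_question: str) -> str:
    m = _match(r"(?:sorted|ordered) by (\w+)", sub_question)
    return f"ORDER BY {m.group(1)}"


# One parser per clause type, mirroring "different clause types correspond
# to respective semantic analysis models".
PARSERS: Dict[str, Callable[[str], str]] = {
    "SELECT": parse_select,
    "WHERE": parse_where,
    "ORDER BY": parse_order_by,
}


def generate_sql(question: str) -> str:
    typed_clauses: List[Tuple[str, str]] = []
    for sub in split_question(question):
        clause_type = classify_clause_type(sub)
        typed_clauses.append((clause_type, PARSERS[clause_type](sub)))
    # Combine according to the grammatical relation among the clauses.
    typed_clauses.sort(key=lambda tc: CLAUSE_ORDER.index(tc[0]))
    return " ".join(clause for _, clause in typed_clauses)


if __name__ == "__main__":
    print(generate_sql(
        "List the name of cities whose population is over 1000000, sorted by area"
    ))
```

Because each sub-problem is parsed independently and the clauses are ordered only at the combination step, the sub-problems may appear in any order in the input without changing the resulting statement.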
CN202011412459.4A 2020-12-03 2020-12-03 Method, device and storage medium for generating structured query language query statement Active CN112506949B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011412459.4A CN112506949B (en) 2020-12-03 2020-12-03 Method, device and storage medium for generating structured query language query statement

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011412459.4A CN112506949B (en) 2020-12-03 2020-12-03 Method, device and storage medium for generating structured query language query statement

Publications (2)

Publication Number Publication Date
CN112506949A (en) 2021-03-16
CN112506949B (en) 2023-07-25

Family

ID=74970185

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011412459.4A Active CN112506949B (en) 2020-12-03 2020-12-03 Method, device and storage medium for generating structured query language query statement

Country Status (1)

Country Link
CN (1) CN112506949B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113420111B (en) * 2021-06-17 2023-08-11 中国科学院声学研究所 Intelligent question answering method and device for multi-hop reasoning problem
CN114281968B (en) * 2021-12-20 2023-02-28 北京百度网讯科技有限公司 Model training and corpus generation method, device, equipment and storage medium
CN114490709B (en) * 2021-12-28 2023-03-24 北京百度网讯科技有限公司 Text generation method and device, electronic equipment and storage medium

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108108426A (en) * 2017-12-15 2018-06-01 杭州网蛙科技有限公司 Method, device and electronic equipment for understanding natural language questions
CN109657244A (en) * 2018-12-18 2019-04-19 语联网(武汉)信息技术有限公司 A kind of English long sentence automatic segmentation method and system
CN109815318A (en) * 2018-12-24 2019-05-28 平安科技(深圳)有限公司 Method, system and computer equipment for querying answers to questions in a question answering system
CN111104423A (en) * 2019-12-18 2020-05-05 北京百度网讯科技有限公司 SQL statement generation method and device, electronic equipment and storage medium
CN111177184A (en) * 2019-12-24 2020-05-19 深圳壹账通智能科技有限公司 Structured query language conversion method based on natural language and related equipment thereof
CN111241245A (en) * 2020-01-14 2020-06-05 百度在线网络技术(北京)有限公司 Human-computer interaction processing method and device and electronic equipment
CN111309753A (en) * 2020-01-21 2020-06-19 上海达梦数据库有限公司 Method, device and equipment for optimizing structured query statement and storage medium
CN111414380A (en) * 2020-03-20 2020-07-14 华泰证券股份有限公司 Chinese database SQL statement generation method, equipment and storage medium
CN111858880A (en) * 2020-06-18 2020-10-30 北京百度网讯科技有限公司 Method and device for obtaining query result, electronic equipment and readable storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9020806B2 (en) * 2012-11-30 2015-04-28 Microsoft Technology Licensing, Llc Generating sentence completion questions
US20200210525A1 (en) * 2018-12-28 2020-07-02 Microsoft Technology Licensing, Llc Predicting query language statements from natural language analytic questions
US11966389B2 (en) * 2019-02-13 2024-04-23 International Business Machines Corporation Natural language to structured query generation via paraphrasing

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
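Question generation from SQL queries improves neural semantic parsing; Daya Guo et al.; Computation and Language; full text *
Research on a natural language database query interface for restricted domains; Yu Zhengtao, Fan Xiaozhong, Geng Zengmin; Journal of Kunming University of Science and Technology (Science and Technology Edition) (No. 04); full text *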

Also Published As

Publication number Publication date
CN112506949A (en) 2021-03-16

Similar Documents

Publication Publication Date Title
CN111428507B (en) Entity chain finger method, device, equipment and storage medium
CN112506949B (en) Method, device and storage medium for generating structured query language query statement
US20220004714A1 (en) Event extraction method and apparatus, and storage medium
CN111274764B (en) Language generation method and device, computer equipment and storage medium
CN112148871B (en) Digest generation method, digest generation device, electronic equipment and storage medium
CN111709247A (en) Data set processing method and device, electronic equipment and storage medium
CN111144507B (en) Emotion analysis model pre-training method and device and electronic equipment
CN111368046A (en) Man-machine conversation method, device, electronic equipment and storage medium
CN112489637A (en) Speech recognition method and device
CN111967256A (en) Event relation generation method and device, electronic equipment and storage medium
CN113220836A (en) Training method and device of sequence labeling model, electronic equipment and storage medium
CN111079945B (en) End-to-end model training method and device
JP2021131858A (en) Entity word recognition method and apparatus
US20220129448A1 (en) Intelligent dialogue method and apparatus, and storage medium
CN114281968B (en) Model training and corpus generation method, device, equipment and storage medium
US20220027575A1 (en) Method of predicting emotional style of dialogue, electronic device, and storage medium
CN112541362B (en) Generalization processing method, device, equipment and computer storage medium
CN110807331A (en) Polyphone pronunciation prediction method and device and electronic equipment
CN112633017A (en) Translation model training method, translation processing method, translation model training device, translation processing equipment and storage medium
CN112559552B (en) Data pair generation method and device, electronic equipment and storage medium
CN111539224A (en) Pruning method and device of semantic understanding model, electronic equipment and storage medium
CN112507697A (en) Event name generation method, device, equipment and medium
CN111738015A (en) Method and device for analyzing emotion polarity of article, electronic equipment and storage medium
CN112232089B (en) Pre-training method, device and storage medium of semantic representation model
CN113360751A (en) Intention recognition method, apparatus, device and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant