CN118132723A - Training method, training equipment, training medium, training software and training product for form question-answer model - Google Patents


Info

Publication number: CN118132723A
Authority: CN (China)
Prior art keywords: sql, question, sentence, training, tables
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Application number: CN202410334221.6A
Other languages: Chinese (zh)
Inventors: 李犇, 张�杰, 范清, 于皓
Current Assignee: Beijing Zhongguancun Kejin Technology Co Ltd (the listed assignee may be inaccurate; Google has not performed a legal analysis)
Original Assignee: Beijing Zhongguancun Kejin Technology Co Ltd
Priority date: (the priority date is an assumption and is not a legal conclusion)
Application filed by Beijing Zhongguancun Kejin Technology Co Ltd
Priority to CN202410334221.6A; publication of CN118132723A; legal status: Pending


Classifications

    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management (Y: general tagging of new technological developments and cross-sectional technologies; Y02: climate change mitigation or adaptation technologies; Y02D: climate change mitigation in information and communication technologies, i.e. ICT aiming at reducing their own energy use)

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the application provides a training method for a table question-answering model, together with a computer device, a storage medium, computer software and a computer program product. The method includes: collecting a plurality of tables and the questions corresponding to each table; generating, based on a plurality of preset SQL templates, the SQL statements corresponding to each table, wherein one table yields a plurality of SQL statements comprising first SQL statements, which are associated with a question collected for the table, and second SQL statements, which are not; determining, based on the first SQL statements of a first part of the plurality of tables, the questions associated with the second SQL statements of a second part of the plurality of tables; determining the execution result of each SQL statement based on the questions and SQL statements associated with the tables; and training a table question-answering model based on the associated questions, SQL statements and execution results.

Description

Training method, equipment, medium, software and product for a table question-answering model
Technical Field
The present application relates to the field of language processing technology, and in particular to a training method for a table question-answering model, a computer device, a storage medium, computer software, and a computer program product.
Background
Tables are everywhere in daily work. A table consists of fields (i.e. a header) and values and is the most common form of structured knowledge storage: its structure is clear and easy to maintain, it is friendly to both human and machine understanding, and it is widely used for transmitting information across industries, appearing in documents such as Excel and Word files as well as in database tables. Querying a table, however, has a high usage threshold: if the table lives in a relational database, a professional must write a query statement; if it is embedded in a text document, its knowledge may not be retrievable as an answer even with a search engine. Table question answering (Table Question Answering, TableQA) addresses this by letting a user interact with a table directly through natural-language questions and obtain the corresponding answers or knowledge from the table.
A table question-answering model is a pre-trained language model (Pre-trained Language Model, PLM) trained on tables for the table question-answering task. Most current pre-trained language models, however, target unstructured text whose semantics are largely continuous, whereas a table comprises fields (i.e. headers) and values, and its text occupies a discontinuous semantic space. Pre-trained language models therefore perform poorly on table question answering, and adapting them to the task requires a large amount of labeled data, whose annotation cost is very high. In short, currently trained table question-answering models suffer from low prediction accuracy and high cost.
Disclosure of Invention
The embodiments of the application aim to provide a training method for a table question-answering model, a computer device, a storage medium, computer software and a computer program product, with which a table question-answering model that accurately predicts the answers to table question-answering tasks can be trained at low cost.
To achieve this, the embodiments of the application are realized as follows:
In a first aspect, an embodiment of the present application provides a training method for a table question-answering model, comprising:
collecting a plurality of tables and the questions corresponding to each table, wherein a table comprises table fields and the values corresponding to those fields;
generating, based on a plurality of preset SQL templates, the SQL statements corresponding to each table, wherein one table yields a plurality of SQL statements comprising first SQL statements, which are associated with a question collected for the table, and second SQL statements, which are not;
determining, based on the first SQL statements of a first part of the plurality of tables, the questions associated with the second SQL statements of a second part of the plurality of tables;
determining the execution result of each SQL statement based on the questions and SQL statements associated with the tables of the first part and the second part;
and training a table question-answering model based on the questions, SQL statements and execution results associated with the tables of the first part and the second part.
In a second aspect, an embodiment of the present application provides a computer device, including:
a processor; and a memory arranged to store computer-executable instructions configured to be executed by the processor, the instructions comprising instructions for performing some or all of the steps of the method according to the first aspect.
In a third aspect, an embodiment of the present application provides a storage medium configured to store computer-executable instructions which cause a computer to perform some or all of the steps of the method according to the first aspect.
In a fourth aspect, embodiments of the present application provide computer software stored in a storage medium, the software being executed by at least one processor to perform some or all of the steps of the method according to the first aspect.
In a fifth aspect, embodiments of the present application provide a computer program product comprising a non-transitory computer-readable storage medium storing a computer program operable to cause a computer to perform some or all of the steps of the method according to the first aspect.
It can be seen that, in the embodiments of the present application, a plurality of tables and the questions corresponding to each table are collected, a table comprising table fields and the values corresponding to those fields; based on a plurality of preset SQL templates, the SQL statements corresponding to each table are generated, one table yielding several SQL statements comprising first SQL statements, which are associated with a question collected for the table, and second SQL statements, which are not; based on the first SQL statements of a first part of the tables, the questions associated with the second SQL statements of a second part are determined; the execution result of each SQL statement is determined from the questions and SQL statements associated with the tables of both parts; and a table question-answering model is trained on the associated questions, SQL statements and execution results. The large pre-training corpus required for training is thereby obtained from a small amount of labeled real data, which reduces the cost of training the table question-answering model and improves the accuracy of its predictions, so that a table question-answering model that accurately predicts the answers to table question-answering tasks can be trained at low cost.
Drawings
To illustrate the embodiments of the present application or the technical solutions of the prior art more clearly, the drawings required for the embodiments or the prior-art description are briefly introduced below. The drawings described below are obviously only some embodiments of the present application; other drawings can be derived from them by a person skilled in the art without inventive effort.
Fig. 1 is a first flowchart of a training method of a table question-answering model according to an embodiment of the present application.
Fig. 2 is a second flowchart of a training method of a table question-answering model according to an embodiment of the application.
Fig. 3 is a flowchart of a method for processing a table question-answering task according to an embodiment of the present application.
Fig. 4 is a schematic structural diagram of a computer device according to an embodiment of the present application.
Detailed Description
In order that those skilled in the art may better understand the technical solutions of one or more embodiments of the present application, those solutions will be described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are evidently only some, not all, of the embodiments of the application; all other embodiments obtained from them by a person of ordinary skill in the art without inventive effort fall within the scope of the application.
It should be noted that, without conflict, one or more embodiments of the present application and features of the embodiments may be combined with each other. Embodiments of the present application will be described in detail below with reference to the accompanying drawings in conjunction with the embodiments.
One or more embodiments of the present application provide a training method for a table question-answering model, a computer device, a storage medium, computer software and a computer program product. They address the fact that existing table question-answering models are trained on table text whose semantic space is discontinuous and require a large amount of annotation data, so that trained models predict table answers with low accuracy and at high cost. The technical solution therefore collects a plurality of tables and the questions corresponding to each table, wherein a table comprises table fields and the values corresponding to those fields; generates, based on a plurality of preset SQL templates, the SQL statements corresponding to each table, where one table yields several SQL statements comprising first SQL statements associated with a collected question and second SQL statements that are not; determines, based on the first SQL statements of a first part of the tables, the questions associated with the second SQL statements of a second part of the tables; determines the execution result of each SQL statement from the questions and SQL statements associated with the tables of both parts; and trains a table question-answering model on the associated questions, SQL statements and execution results. In this way the large pre-training corpus required for training is obtained from a small amount of labeled real data, reducing the cost of training the table question-answering model and improving the accuracy of its predictions, so that a table question-answering model that accurately predicts the answers to table question-answering tasks can be trained at low cost.
Fig. 1 and fig. 2 are schematic flow diagrams of a training method of a table question-answering model according to one or more embodiments of the present application; fig. 1 shows the steps of the flow, and fig. 2 the objects involved in it.
As shown in fig. 1, the method at least includes the following steps 102 to 110.
Step 102: collect a plurality of tables and the questions corresponding to each table, wherein a table comprises table fields and the values corresponding to those fields.
Referring to fig. 2, in the embodiment of the present application a crawler system 4 may be used to obtain real tables 8, and the natural-language questions 6 corresponding to them, from multiple data sources 2 such as the Internet, document repositories and relational databases. A table may come from an Excel file, from a document of another type such as Word, or from a database. A table includes fields (headers) and field values (values); a real table obtained by the crawler system is shown, for example, in Table 1 below:
TABLE 1
Taxpayer name | Income tax rate
Company 1 | 15%
Company 2 | 20%
The table in Table 1 includes two fields, the taxpayer name and the income tax rate. The values of the taxpayer-name field are Company 1 and Company 2, and the values of the income-tax-rate field are 15% and 20%. That is, header 1 is the "taxpayer name" and header 2 the "income tax rate"; value 1 of header 1 is "Company 1" and value 2 is "Company 2"; value 1 of header 2 is "15%" and value 2 is "20%".
For example, a question collected in association with the table is "Tell me the tax rate of taxpayers other than Company 1".
In this way, a plurality of real tables and their corresponding questions can be obtained. Of course, some collected tables may have no corresponding question.
Step 104: based on a plurality of preset SQL templates, generate the SQL statements corresponding to each table, wherein one table yields a plurality of SQL statements comprising first SQL statements, which are associated with a question collected for the table, and second SQL statements, which are not.
In general, a table collected in step 102 corresponds to a question, and answering that question can be understood as executing a corresponding SQL statement against the table. For example, for the question "Tell me the tax rate of taxpayers other than Company 1", "taxpayers other than Company 1" corresponds to value 2 of header 1, and their tax rate corresponds to value 2 of header 2. The execution result "20%" is obtained by executing the corresponding SQL statement "Select [header 2] where [header 1] = [value 2]" to query the table.
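The query pattern above can be reproduced in a minimal sketch; the table name `t` and the English column names are illustrative translations assumed for the example, not identifiers from the patent:

```python
import sqlite3

# Illustrative reconstruction of Table 1 in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (taxpayer_name TEXT, income_tax_rate TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)",
                 [("Company 1", "15%"), ("Company 2", "20%")])

# "Select [header 2] where [header 1] = [value 2]" instantiated on Table 1:
# header 2 = income_tax_rate, header 1 = taxpayer_name, value 2 = "Company 2".
result = conn.execute(
    "SELECT income_tax_rate FROM t WHERE taxpayer_name = ?", ("Company 2",)
).fetchall()
print(result)  # [('20%',)]
```

Executing the statement yields the answer "20%", matching the execution result described above.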
The number of collected tables and corresponding questions is limited. In step 104, in conjunction with fig. 2, a plurality of different SQL templates 10 may therefore be used, on the basis of each collected table 8, to generate a plurality of SQL statements corresponding to different questions about the table 8.
Different SQL templates are, for example, as follows:
Select [header 1] where [header 2] = [value 1];
Select [header 1] where [header 2] > [value 1];
Select sum [header 1] where [header 2] = [value 2].
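Template instantiation could be sketched as follows; the template strings mirror the examples above, while the function name and slot-filling strategy are assumptions for illustration, not the patent's implementation:

```python
from itertools import permutations

# Hypothetical preset templates with slots for two headers and one value.
TEMPLATES = [
    "Select [{h1}] where [{h2}] = [{v}]",
    "Select [{h1}] where [{h2}] > [{v}]",
    "Select sum [{h1}] where [{h2}] = [{v}]",
]

def generate_sql(headers, rows):
    """Fill every template with every ordered header pair and row value."""
    statements = []
    for tpl in TEMPLATES:
        for i, j in permutations(range(len(headers)), 2):
            for row in rows:
                statements.append(
                    tpl.format(h1=headers[i], h2=headers[j], v=row[j]))
    return statements

sqls = generate_sql(["header 1", "header 2"],
                    [("value 1", "15%"), ("value 2", "20%")])
# 3 templates x 2 ordered header pairs x 2 rows = 12 candidate statements
print(len(sqls))
```

One table thus yields many candidate statements, most of which have no collected question attached; the next steps filter and label them.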
Thus, from a plurality of different SQL templates and the tables, several SQL statements can be generated for each table. An SQL statement may correspond to a question actually collected for the table, in which case it is associated with that question; the remaining SQL statements, including all statements generated for tables with no collected question, are not associated with any collected question.
Answers to different questions, such as "What is the tax rate of the taxpayer named Company 1?" or "What are the tax rates of the taxpayers named Company 1 and Company 2?", can be obtained by executing SQL statements against the table. A plurality of SQL statements corresponding to different questions are therefore generated for each table through the SQL templates, and the execution result of an SQL statement is the answer to its question. Using different SQL templates thus expands the questions and answers associated with the collected tables.
Optionally, generating the SQL statements corresponding to each table based on the plurality of preset SQL templates includes: executing, for a collected target table, the SQL statements produced by the plurality of preset templates, to obtain a plurality of SQL statements of the target table and execution results in one-to-one correspondence with them; and, if the execution result corresponding to a target SQL template is correct, generating an SQL statement of the target table from the SQL statement corresponding to that template.
The headers and values of each collected table 8 are filled into the different SQL templates 10, yielding as many SQL statements 12 as there are templates; executing each statement then produces the SQL query result of the corresponding table, one execution result per statement. If a target SQL statement yields no execution result, it is removed; each SQL statement whose execution result is correct is kept as an SQL statement generated for the corresponding table.
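The filtering rule can be sketched as below, under assumed table and statement texts: candidates that fail to execute, or execute but return nothing, are discarded; the rest are kept together with their results.

```python
import sqlite3

def filter_executable(statements, conn):
    """Keep only SQL statements that execute and return a result."""
    kept = []
    for sql in statements:
        try:
            rows = conn.execute(sql).fetchall()
        except sqlite3.Error:
            continue                      # statement invalid for this table
        if rows and rows != [(None,)]:    # no result obtained: remove it
            kept.append((sql, rows))      # statement paired with its result
    return kept

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT, rate REAL)")
conn.executemany("INSERT INTO t VALUES (?, ?)",
                 [("Company 1", 15.0), ("Company 2", 20.0)])

candidates = [
    "SELECT rate FROM t WHERE name = 'Company 2'",  # valid, non-empty: kept
    "SELECT rate FROM t WHERE name = 'Company 9'",  # empty result: removed
    "SELECT bogus FROM t",                          # no such column: removed
]
kept = filter_executable(candidates, conn)
print([sql for sql, _ in kept])
```

Only the first candidate survives, paired with its query result, which matches the "remove statements with no execution result" rule above.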
Step 106: determine, based on the first SQL statements of a first part of the plurality of tables, the questions associated with the second SQL statements of a second part of the plurality of tables.
As described above, a first SQL statement is associated with a question actually collected for its table, while a second SQL statement, generated from the table and an SQL template, is not associated with any collected question. In this step, based on the collected real tables of the first part and the SQL statements corresponding to their questions, the questions associated with the unassociated SQL statements generated for the remaining, second part of the tables are determined. The second part includes both tables with collected questions and tables without.
Optionally, determining the questions associated with the second SQL statements of the second-part tables based on the first SQL statements of the first-part tables includes: training a language conversion model, which converts between questions and SQL statements, on a training data set of sentence pairs each comprising a first SQL statement of a first-part table and the question associated with it; and converting, with the language conversion model, the questions or second SQL statements corresponding to the second-part tables, so as to determine the questions associated with the second SQL statements and the SQL statements associated with the questions of the second part.
For the training data set, the tables (Table), their associated questions (Question), the SQL statements corresponding to the questions (SQL) and the execution results of those statements (Answer) need to be labeled in one-to-one correspondence as sentence pairs containing the first SQL statements and associated questions of the first-part tables; only a small number of the collected tables need labeling. Labeling yields (Table-Question-SQL-Answer) sentence pairs with association relations for the first part of the tables.
In conjunction with fig. 2, the language conversion model optionally includes an NL2SQL model 14 and an SQL2NL model 16, and the conversion step includes: converting the questions of the second-part tables with the NL2SQL model to obtain their associated SQL statements; and converting the second SQL statements of the second-part tables with the SQL2NL model to obtain their associated questions.
(Question-SQL) sentence pairs with association relations are selected from the first part's (Table-Question-SQL-Answer) pairs as the corpus of the training data set, and a pre-trained language model such as BART or GPT is used to train, respectively, the NL2SQL model (converting a natural-language question into an SQL statement) and the SQL2NL model (converting an SQL statement into a natural-language question).
Language-conversion-model training uses a commonly used generative pre-trained language model and fine-tunes it for the domain task on the corresponding training data set; the NL2SQL model 14 and the SQL2NL model 16 differ only in their training data, the model architecture and training method being the same.
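Since the two models share architecture and method and differ only in data direction, the corpus preparation can be sketched as follows; the actual seq2seq fine-tuning (e.g. of BART or GPT) is out of scope here, and all field names are illustrative assumptions:

```python
# (Question-SQL) pairs selected from the labeled first-part sentence pairs.
labeled_pairs = [
    {"question": "What is the income tax rate of Company 2?",
     "sql": "Select [header 2] where [header 1] = [value 2]"},
]

# NL2SQL learns question -> SQL; SQL2NL learns SQL -> question.
# Same pairs, with source and target swapped.
nl2sql_data = [{"src": p["question"], "tgt": p["sql"]} for p in labeled_pairs]
sql2nl_data = [{"src": p["sql"], "tgt": p["question"]} for p in labeled_pairs]
```

The symmetry makes the "same architecture, different training data" point concrete: one labeled pair feeds both models.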
For a second SQL statement, generated from a collected table but not associated with any question collected for that table (across the first and second parts this also includes statements generated from tables for which no question was collected), the SQL2NL model can convert the statement into an associated question, i.e. a question associated with the table.
For the questions collected with the tables, the NL2SQL model can convert each question into its associated SQL statement.
Using the NL2SQL model 14 and SQL2NL model 16 trained in the steps above, the corresponding questions or SQL statements are generated. Apart from the first SQL statements in the labeled (Table-Question-SQL-Answer) pairs of the first part, the other collected tables are completed, from the converted questions or SQL statements, into (Table-Question-SQL) sentence pairs with association relations.
Thus the collected tables of the first part and of the second part each have their associated (Table-Question-SQL) sentence pairs.
Step 108: determine the execution result of each SQL statement based on the questions and SQL statements associated with the tables of the first part and the second part.
That the questions and SQL statements are associated with the first-part and second-part tables means that each table's SQL statement is selected from its (Table-Question-SQL) sentence pair, for the first part and the second part alike.
The (Table-Question-SQL) sentence pairs obtained in step 106 for both parts still lack the execution result of each SQL statement, i.e. the answer (Answer) to the question associated with the table. SQL statements whose execution results were already obtained in the steps above need not be executed again here.
Specifically, determining the execution result of each SQL statement based on the questions and SQL statements associated with the tables of the first part and the second part includes: executing the SQL statements corresponding to the first-part tables to obtain their execution results; and executing the SQL statements corresponding to the second-part tables to obtain theirs.
For the SQL statements of all collected tables, first part and second part alike, the execution result of each statement is obtained; one SQL statement may correspond to one or more execution results.
Thus each table, its associated questions, the SQL statements corresponding to those questions and the statements' execution results are combined into a plurality of complete (Question-Table-SQL-Answer) sentence pairs for the collected tables.
Step 110: train the table question-answering model based on the questions, SQL statements and execution results associated with the tables of the first part and the second part.
Specifically, this includes: generating, from those associated questions, SQL statements and execution results, first sentence pairs comprising the associated table, SQL statement and execution result, and second sentence pairs comprising the associated question and SQL statement; and training on the first and second sentence pairs as pre-training corpus to obtain a table question-answering model, which predicts the execution result of the SQL statement associated with a table question-answering task comprising a question and a table.
From the complete (Question-Table-SQL-Answer) sentence pairs of the collected tables of the first and second parts, the questions, SQL statements and execution results are combined into a (Table-SQL-Answer) sentence pair per table, i.e. the first sentence pair, and the questions and SQL statements into (Question-SQL) sentence pairs, i.e. the second sentence pairs.
The data sets of the two kinds of sentence pairs are mixed together as a pre-training corpus 18 (see fig. 2), and the target pre-trained language model is pre-trained by predicting the next word of an input sentence. In one embodiment, the target model may be a general generative pre-trained language model (see general PLM 20 of fig. 2), such as BART or GPT, which continues pre-training on the corpus 18; the specific pre-training procedure follows that of the GPT model and the like.
The format of the pre-training corpus corresponding to the first sentence pair (Table-SQL-Answer) is, for example, as follows:
{"input": "[Head]head_1|head_2[Row]row_1|row_2[Row]row_3|row_4[SQL]Select head_2 from T where head_1=row_1[Ans]row_2"}.
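Producing this linearized form from a (Table, SQL, Answer) triple can be sketched as below; only the [Head]/[Row]/[SQL]/[Ans] separator tokens come from the example above, while the helper itself is an assumption:

```python
def linearize(headers, rows, sql, answer):
    """Flatten one table plus its SQL statement and answer into one string."""
    head = "[Head]" + "|".join(headers)
    body = "".join("[Row]" + "|".join(row) for row in rows)
    return {"input": head + body + "[SQL]" + sql + "[Ans]" + answer}

sample = linearize(["head_1", "head_2"],
                   [["row_1", "row_2"], ["row_3", "row_4"]],
                   "Select head_2 from T where head_1=row_1",
                   "row_2")
print(sample["input"])
```

The result reproduces the first-sentence-pair corpus entry shown above, one JSON-like record per (Table-SQL-Answer) pair.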
The format of the corresponding pre-training corpus of the second sentence pair (Question-SQL) is, for example, as follows:
{"input": "[Question]Which product sells best?[SQL]Select product from product sales table order by sales desc limit 1"}.
Through the second sentence pairs, the target pre-trained language model learns to generate the SQL statement corresponding to a question; through the first sentence pairs, it acquires the ability to execute SQL statements, i.e. to find the answer in the table. Each sentence pair is fed in turn into the target model as pre-training corpus, and the model's predicted output is compared with the true label value to adjust the model's parameters until the loss between prediction and label converges, finally yielding the table question-answering model (see table PLM 22 of fig. 2). The trained model predicts the execution result of the SQL statement associated with a table question-answering task comprising a question and a table.
Optionally, after training to obtain the table question-answering model, the method further comprises: generating third sentence pairs, each comprising an associated table, question, and execution result, based on the associated questions, SQL statements, and execution results corresponding to the tables of the first part and the second part; and adjusting the table question-answering model with the third sentence pairs as a table question-answering training data set, wherein the adjusted table question-answering model is used for outputting a complete answer to the table question-answering task based on the predicted execution results of a plurality of SQL statements.
In this embodiment, the third sentence pairs (Table-Question-Answer) are synthesized by combining, in a certain proportion, the complete sentence pairs (Question-Table-SQL-Answer) corresponding to the plurality of collected tables. The third sentence pairs serve as a table question-answering data set 34 (see fig. 2); the table question-answering model obtained by the foregoing training (see table PLM 22 of fig. 2) is fine-tuned for the table question-answering domain, and the final table question-answering model (see table question-answering model 26 of fig. 2) is obtained, which has the ability to predict multi-word answers and can output a complete answer synthesized from the execution results corresponding to a plurality of SQL statements.
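The synthesis of a third sentence pair from a complete (Question-Table-SQL-Answer) tuple might be sketched as follows; the input layout and the rule for joining several execution results into one complete answer are illustrative assumptions:

```python
def build_qa_sample(headers, rows, question, exec_results):
    """Synthesize a Table-Question-Answer fine-tuning sample: the
    linearized table plus the question form the input, and the execution
    results of the question's SQL statements are joined into one
    complete answer used as the label."""
    table = "[Head]" + "|".join(headers)
    for row in rows:
        table += "[Row]" + "|".join(str(v) for v in row)
    answer = ", ".join(str(r) for r in exec_results)
    return {"input": table + "[Question]" + question, "label": answer}

sample = build_qa_sample(
    ["product", "sales"],
    [["A", 10], ["B", 30]],
    "Which products sold more than 5?",
    ["A", "B"],  # results of two generated SQL statements
)
```

A model fine-tuned on such samples learns to emit the multi-word complete answer directly, rather than a single SQL execution result.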
In the embodiments of the present application, a plurality of tables and the questions corresponding to the tables are collected, wherein the tables comprise table fields and the values corresponding to the table fields; based on a plurality of preset SQL templates, the SQL statements corresponding to each table are generated, one table correspondingly generating a plurality of SQL statements, the SQL statements comprising a first SQL statement associated with a question corresponding to the table and a second SQL statement not associated with a question corresponding to the table; a question associated with a second SQL statement of a table of a second part of the plurality of tables is determined based on a first SQL statement of a table of a first part of the plurality of tables; the execution result of each SQL statement is determined based on the questions and SQL statements correspondingly associated with the tables of the first part and the second part; and a table question-answering model is obtained by training based on the associated questions, SQL statements, and execution results of the tables of the first part and the second part. In this way, the large amount of pre-training corpus required for training the table question-answering model can be obtained from a small amount of labeled real data, which reduces the training cost while improving the accuracy of the model's predictions, so that a table question-answering model capable of accurately answering table question-answering tasks can be trained at low cost.
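The template-driven SQL generation step summarized above can be sketched as follows; the template strings and helper names are hypothetical examples, since the patent does not disclose concrete templates:

```python
import itertools

# Hypothetical preset SQL templates; placeholders are filled with table
# fields and cell values, so one table yields many SQL statements.
SQL_TEMPLATES = [
    "select {field_a} from {table} where {field_b} = '{value}'",
    "select count(*) from {table} where {field_b} = '{value}'",
    "select max({field_a}) from {table}",
]

def generate_sql(table_name, fields, values_by_field):
    """Instantiate every template with every (field_a, field_b, value)
    combination drawn from the table schema, deduplicating statements
    produced by templates that ignore some placeholders."""
    statements = []
    for template in SQL_TEMPLATES:
        for field_a, field_b in itertools.product(fields, fields):
            for value in values_by_field.get(field_b, []):
                # str.format ignores unused keyword arguments.
                statements.append(template.format(
                    table=table_name, field_a=field_a,
                    field_b=field_b, value=value))
    return list(dict.fromkeys(statements))  # preserve order, drop duplicates

stmts = generate_sql("T", ["head_1", "head_2"],
                     {"head_1": ["row_1"], "head_2": ["row_2"]})
```

Statements whose execution result is incorrect for a table can then be filtered out, as described for the target SQL templates above.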
Further, as shown in fig. 3, an embodiment of the present application further provides a method for processing a table question-answering task, comprising:
step 202, acquiring a target table question-answering task, wherein the target table question-answering task comprises a table and a question corresponding to the table;
step 204, outputting an answer to the target table question-answering task by inputting the target table question-answering task into a preset table question-answering model.
The preset table question-answering model is trained based on the steps of the method described in any of the embodiments of figs. 1 to 2, which are not repeated here.
In the embodiments of the present application, by inputting a table question-answering task comprising a table and the question corresponding to the table into the preset table question-answering model, an accurate answer to the corresponding table question-answering task can be output.
Further, corresponding to the training method of the table question-answering model described in figs. 1 to 2 and the method for processing a table question-answering task described in fig. 3, based on the same technical concept, an embodiment of the present application further provides a computer device, shown in fig. 4, for executing the above training method of the table question-answering model or the above method for processing a table question-answering task.
The computer device 700 may vary considerably in configuration or performance and may include one or more processors 701 and a memory 702, where the memory 702 may store one or more application programs or data. The memory 702 may be transient storage or persistent storage. The application programs stored in the memory 702 may include one or more modules (not shown), each of which may include a series of computer-executable instructions for the computer device. Still further, the processor 701 may be arranged to communicate with the memory 702 and execute the series of computer-executable instructions in the memory 702 on the computer device. The computer device 700 may also include one or more power supplies 703, one or more wired or wireless network interfaces 704, one or more input/output interfaces 705, one or more keyboards 706, and the like.
In a particular embodiment, the computer device 700 includes a memory and one or more programs, wherein the one or more programs are stored in the memory, may include one or more modules each comprising a series of computer-executable instructions for the computer device, and are configured to be executed by one or more processors, the one or more programs comprising computer-executable instructions for:
collecting a plurality of tables and the questions corresponding to the tables, wherein the tables comprise table fields and the values corresponding to the table fields;
generating, based on a plurality of preset SQL templates, the SQL statements corresponding to each table, wherein one table correspondingly generates a plurality of SQL statements, the SQL statements comprise a first SQL statement and a second SQL statement, the first SQL statement is associated with a question corresponding to the table, and the second SQL statement is not associated with a question corresponding to the table;
determining a question associated with a second SQL statement of a table of a second part of the plurality of tables based on a first SQL statement of a table of a first part of the plurality of tables;
determining an execution result of each SQL statement based on the questions and SQL statements correspondingly associated with the tables of the first part and the second part;
and training to obtain a table question-answering model based on the questions, SQL statements, and execution results correspondingly associated with the tables of the first part and the second part.
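The "determining an execution result" instruction above amounts to running each generated SQL statement against its table. A minimal sketch using an in-memory SQLite database (the fixed table name `T` and TEXT-typed columns are simplifying assumptions):

```python
import sqlite3

def execute_sql_on_table(headers, rows, sql):
    """Load a collected table into an in-memory SQLite database and
    execute one generated SQL statement against it, returning the
    result rows (the execution result used as a training label)."""
    conn = sqlite3.connect(":memory:")
    cols = ", ".join(f'"{h}" TEXT' for h in headers)
    conn.execute(f"CREATE TABLE T ({cols})")
    placeholders = ", ".join("?" for _ in headers)
    conn.executemany(f"INSERT INTO T VALUES ({placeholders})", rows)
    try:
        return conn.execute(sql).fetchall()
    finally:
        conn.close()

result = execute_sql_on_table(
    ["product", "sales"],
    [("A", "10"), ("B", "30")],
    "SELECT product FROM T ORDER BY CAST(sales AS INTEGER) DESC LIMIT 1",
)
```

Because every cell is stored as TEXT, numeric comparisons require an explicit CAST, as in the statement shown; statements that fail to execute would simply be discarded.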
In the embodiments of the present application, a plurality of tables and the questions corresponding to the tables are collected, wherein the tables comprise table fields and the values corresponding to the table fields; based on a plurality of preset SQL templates, the SQL statements corresponding to each table are generated, one table correspondingly generating a plurality of SQL statements, the SQL statements comprising a first SQL statement associated with a question corresponding to the table and a second SQL statement not associated with a question corresponding to the table; a question associated with a second SQL statement of a table of a second part of the plurality of tables is determined based on a first SQL statement of a table of a first part of the plurality of tables; the execution result of each SQL statement is determined based on the questions and SQL statements correspondingly associated with the tables of the first part and the second part; and a table question-answering model is obtained by training based on the associated questions, SQL statements, and execution results of the tables of the first part and the second part. In this way, the large amount of pre-training corpus required for training the table question-answering model can be obtained from a small amount of labeled real data, which reduces the training cost while improving the accuracy of the model's predictions, so that a table question-answering model capable of accurately answering table question-answering tasks can be trained at low cost.
It should be noted that this computer device embodiment and the training method embodiment of the table question-answering model in the present application are based on the same inventive concept; therefore, for the specific implementation of this embodiment, reference may be made to the implementation of the foregoing corresponding training method of the table question-answering model, and repeated description is omitted.
In a particular embodiment, the computer device 700 includes a memory and one or more programs, wherein the one or more programs are stored in the memory, may include one or more modules each comprising a series of computer-executable instructions for the computer device, and are configured to be executed by one or more processors, the one or more programs comprising computer-executable instructions for:
acquiring a target table question-answering task, wherein the target table question-answering task comprises a table and a question corresponding to the table;
and outputting an answer to the target table question-answering task by inputting the target table question-answering task into a preset table question-answering model, wherein the preset table question-answering model is trained based on the steps of the method in any one of the embodiments of figs. 1 to 2.
In the embodiments of the present application, by inputting a table question-answering task comprising a table and the question corresponding to the table into the preset table question-answering model, an accurate answer to the corresponding table question-answering task can be output.
It should be noted that this computer device embodiment and the embodiment of the method for processing a table question-answering task in the present application are based on the same inventive concept; therefore, for the specific implementation of this embodiment, reference may be made to the implementation of the foregoing corresponding method for processing a table question-answering task, and repeated description is omitted.
Further, corresponding to the methods shown in figs. 1 to 2, based on the same technical concept, an embodiment of the present application further provides a storage medium for storing computer-executable instructions. In a specific embodiment, the storage medium may be a USB flash drive, an optical disc, a hard disk, etc., and the computer-executable instructions stored in the storage medium, when executed by a processor, can implement the following flow:
collecting a plurality of tables and the questions corresponding to the tables, wherein the tables comprise table fields and the values corresponding to the table fields;
generating, based on a plurality of preset SQL templates, the SQL statements corresponding to each table, wherein one table correspondingly generates a plurality of SQL statements, the SQL statements comprise a first SQL statement and a second SQL statement, the first SQL statement is associated with a question corresponding to the table, and the second SQL statement is not associated with a question corresponding to the table;
determining a question associated with a second SQL statement of a table of a second part of the plurality of tables based on a first SQL statement of a table of a first part of the plurality of tables;
determining an execution result of each SQL statement based on the questions and SQL statements correspondingly associated with the tables of the first part and the second part;
and training to obtain a table question-answering model based on the questions, SQL statements, and execution results correspondingly associated with the tables of the first part and the second part.
In the embodiments of the present application, a plurality of tables and the questions corresponding to the tables are collected, wherein the tables comprise table fields and the values corresponding to the table fields; based on a plurality of preset SQL templates, the SQL statements corresponding to each table are generated, one table correspondingly generating a plurality of SQL statements, the SQL statements comprising a first SQL statement associated with a question corresponding to the table and a second SQL statement not associated with a question corresponding to the table; a question associated with a second SQL statement of a table of a second part of the plurality of tables is determined based on a first SQL statement of a table of a first part of the plurality of tables; the execution result of each SQL statement is determined based on the questions and SQL statements correspondingly associated with the tables of the first part and the second part; and a table question-answering model is obtained by training based on the associated questions, SQL statements, and execution results of the tables of the first part and the second part. In this way, the large amount of pre-training corpus required for training the table question-answering model can be obtained from a small amount of labeled real data, which reduces the training cost while improving the accuracy of the model's predictions, so that a table question-answering model capable of accurately answering table question-answering tasks can be trained at low cost.
It should be noted that this storage medium embodiment and the training method embodiment of the table question-answering model in the present application are based on the same inventive concept; therefore, for the specific implementation of this embodiment, reference may be made to the implementation of the foregoing corresponding training method of the table question-answering model, and repeated description is omitted.
Further, corresponding to the method shown in fig. 3, based on the same technical concept, an embodiment of the present application further provides a storage medium for storing computer-executable instructions. In a specific embodiment, the storage medium may be a USB flash drive, an optical disc, a hard disk, etc., and the computer-executable instructions stored in the storage medium, when executed by a processor, can implement the following flow:
acquiring a target table question-answering task, wherein the target table question-answering task comprises a table and a question corresponding to the table;
and outputting an answer to the target table question-answering task by inputting the target table question-answering task into a preset table question-answering model, wherein the preset table question-answering model is trained based on the steps of the method in any one of the embodiments of figs. 1 to 2.
In the embodiments of the present application, by inputting a table question-answering task comprising a table and the question corresponding to the table into the preset table question-answering model, an accurate answer to the corresponding table question-answering task can be output.
It should be noted that this storage medium embodiment and the embodiment of the method for processing a table question-answering task in the present application are based on the same inventive concept; therefore, for the specific implementation of this embodiment, reference may be made to the implementation of the foregoing corresponding method for processing a table question-answering task, and repeated description is omitted.
Further, corresponding to the methods shown in figs. 1 to 3, based on the same technical concept, an embodiment of the present application further provides computer software, which is stored in a storage medium and is executed by at least one processor to implement the training method of the table question-answering model or the method for processing a table question-answering task.
It should be noted that this computer software embodiment and the embodiment of the training method of the table question-answering model or of the method for processing a table question-answering task in the present application are based on the same inventive concept; therefore, for the specific implementation of this embodiment, reference may be made to the implementation of the foregoing corresponding method, and repeated description is omitted.
Further, corresponding to the methods shown in figs. 1 to 3, based on the same technical concept, an embodiment of the present application further provides a computer program product comprising a non-transitory computer-readable storage medium storing a computer program, the computer program being operable to cause a computer to execute the training method of the table question-answering model or the method for processing a table question-answering task.
It should be noted that this computer program product embodiment and the embodiment of the training method of the table question-answering model or of the method for processing a table question-answering task in the present application are based on the same inventive concept; therefore, for the specific implementation of this embodiment, reference may be made to the implementation of the foregoing corresponding method, and repeated description is omitted.
The foregoing describes certain embodiments of the present application. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-readable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random-access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of storage media for a computer include, but are not limited to, phase-change memory (PRAM), static random-access memory (SRAM), dynamic random-access memory (DRAM), other types of random-access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium which can be used to store information that can be accessed by a computing device. Computer-readable media, as defined herein, do not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
Embodiments of the application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. One or more embodiments of the application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
The embodiments of the present application are described in a progressive manner; for identical or similar parts, reference may be made between the embodiments, and each embodiment focuses on its differences from the other embodiments. In particular, the system embodiments are described relatively simply since they are substantially similar to the method embodiments; for relevant parts, reference may be made to the corresponding description of the method embodiments.
The foregoing description is by way of example only and is not intended to limit the present disclosure. Various modifications and changes may occur to those skilled in the art. Any modifications, equivalent substitutions, improvements, etc. that fall within the spirit and principles of the present document are intended to be included within the scope of the claims of the present document.

Claims (11)

1. A method for training a table question-answering model, comprising:
collecting a plurality of tables and the questions corresponding to the tables, wherein the tables comprise table fields and the values corresponding to the table fields;
generating, based on a plurality of preset SQL templates, the SQL statements corresponding to each table, wherein one table correspondingly generates a plurality of SQL statements, the SQL statements comprise a first SQL statement and a second SQL statement, the first SQL statement is associated with a question corresponding to the table, and the second SQL statement is not associated with a question corresponding to the table;
determining a question associated with a second SQL statement of a table of a second part of the plurality of tables based on a first SQL statement of a table of a first part of the plurality of tables;
determining an execution result of each SQL statement based on the questions and SQL statements correspondingly associated with the tables of the first part and the second part;
and training to obtain a table question-answering model based on the questions, SQL statements, and execution results correspondingly associated with the tables of the first part and the second part.
2. The method of claim 1, wherein the generating, based on the plurality of preset SQL templates, the SQL statements corresponding to each table comprises:
executing, respectively through the plurality of preset SQL templates, the SQL statements corresponding to a collected target table, to obtain a plurality of SQL statements of the target table and a plurality of execution results in one-to-one correspondence with the SQL statements;
and if the execution result corresponding to a target SQL template is correct, generating the SQL statement corresponding to the target table based on the SQL statement corresponding to the target SQL template.
3. The method of claim 1, wherein the determining a question associated with a second SQL statement of a table of the second part of the plurality of tables based on a first SQL statement of a table of the first part of the plurality of tables comprises:
training to obtain a language conversion model by taking, as a training data set, sentence pairs each comprising a first SQL statement of a table of the first part and the question correspondingly associated with the first SQL statement, wherein the language conversion model is used for interconversion between questions and SQL statements;
and converting, based on the language conversion model, a question or a second SQL statement corresponding to a table of the second part, to determine the question associated with the second SQL statement of the table of the second part and determine the SQL statement associated with the question of the table of the second part.
4. The method of claim 3, wherein the language conversion model comprises an NL2SQL model and an SQL2NL model, and
the converting, based on the language conversion model, a question or a second SQL statement corresponding to a table of the second part to determine the question associated with the second SQL statement of the table of the second part and determine the SQL statement associated with the question of the table of the second part comprises:
converting, based on the NL2SQL model, the questions of the tables of the second part to obtain the correspondingly associated SQL statements;
and converting, based on the SQL2NL model, the second SQL statements of the tables of the second part to obtain the correspondingly associated questions.
5. The method of claim 3, wherein the determining an execution result of each SQL statement based on the questions and SQL statements correspondingly associated with the tables of the first part and the second part comprises:
executing the SQL statements corresponding to the tables of the first part to obtain the execution results of the SQL statements corresponding to the tables of the first part;
and executing the SQL statements corresponding to the tables of the second part to obtain the execution results of the SQL statements corresponding to the tables of the second part.
6. The method of claim 1, wherein the training to obtain a table question-answering model based on the questions, SQL statements, and execution results correspondingly associated with the tables of the first part and the second part comprises:
generating, based on the associated questions, SQL statements, and execution results corresponding to the tables of the first part and the second part, first sentence pairs each comprising an associated table, SQL statement, and execution result, and second sentence pairs each comprising an associated question and SQL statement;
and training with the first sentence pairs and the second sentence pairs as pre-training corpus to obtain the table question-answering model, wherein the table question-answering model is used for predicting the execution result of the SQL statement correspondingly associated with a table question-answering task comprising a question and a table.
7. The method of claim 6, further comprising, after the training to obtain the table question-answering model:
generating, based on the associated questions, SQL statements, and execution results corresponding to the tables of the first part and the second part, third sentence pairs each comprising an associated table, question, and execution result;
and adjusting the table question-answering model with the third sentence pairs as a table question-answering training data set, wherein the adjusted table question-answering model is used for outputting a complete answer to the table question-answering task based on the predicted execution results of a plurality of SQL statements.
8. A computer device, the computer device comprising:
a processor; and
a memory arranged to store computer-executable instructions that, when executed by the processor, perform part or all of the steps of the method of any one of claims 1 to 7.
9. A storage medium storing computer-executable instructions for causing a computer to perform part or all of the steps of the method of any one of claims 1 to 7.
10. Computer software stored in a storage medium, wherein the computer software is executed by at least one processor to perform part or all of the steps of the method of any one of claims 1 to 7.
11. A computer program product, comprising a non-transitory computer-readable storage medium storing a computer program, the computer program being operable to cause a computer to perform part or all of the steps of the method according to any one of claims 1 to 7.
CN202410334221.6A 2024-03-22 2024-03-22 Training method, training equipment, training medium, training software and training product for form question-answer model Pending CN118132723A (en)

Publications (1)

Publication Number Publication Date
CN118132723A true CN118132723A (en) 2024-06-04

Family

ID=91233444


Staniek et al. Text-to-OverpassQL: A Natural Language Interface for Complex Geodata Querying of OpenStreetMap
CN117708282A (en) Knowledge question-answering method and system based on large language model
CN118132722A (en) Training method, training device, training equipment, training medium and training product for form question-answer model

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination