CN117743546A - Method, apparatus, device and readable medium for question and answer - Google Patents

Method, apparatus, device and readable medium for question and answer

Info

Publication number
CN117743546A
Authority
CN
China
Prior art keywords
question
entity
target
answer
query
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311767471.0A
Other languages
Chinese (zh)
Inventor
刘瑞雪
祝天刚
陈蒙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jingdong City Beijing Digital Technology Co Ltd
Jingdong Technology Information Technology Co Ltd
Original Assignee
Jingdong City Beijing Digital Technology Co Ltd
Jingdong Technology Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jingdong City Beijing Digital Technology Co Ltd, Jingdong Technology Information Technology Co Ltd filed Critical Jingdong City Beijing Digital Technology Co Ltd
Priority to CN202311767471.0A priority Critical patent/CN117743546A/en
Publication of CN117743546A publication Critical patent/CN117743546A/en
Pending legal-status Critical Current

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the present disclosure provide a method, apparatus, device, and readable medium for question answering. The method comprises: determining at least one category corresponding to at least one named entity detected from a target question, the category of each named entity being a category defined for structured query statements of a question-answer database; determining, from the question-answer database, at least one query entity that respectively matches the at least one named entity, based on the respective semantics of the at least one named entity and the at least one category to which the at least one named entity corresponds; determining a target structured query statement associated with the target question based on the at least one query entity and the at least one category; and determining a target answer to the target question by querying the question-answer database based on the target structured query statement. In this way, answers to questions can be determined based on the database, and the efficiency of question answering can be improved while the quality of the answers is ensured.

Description

Method, apparatus, device and readable medium for question and answer
Technical Field
Example embodiments of the present disclosure relate generally to the field of computer technology and, more particularly, relate to a method, apparatus, device, and computer-readable storage medium for question-answering.
Background
With the rapid development of information technology, more and more applications or platforms and the like provide question-answering functions, and bring convenience to the majority of users. An application or platform with question and answer functionality may provide a question and answer service to a user based on a database. After the application or platform with the question and answer function obtains the user question, the user question can be understood and identified, and then corresponding data is obtained from the database based on the result of understanding and identification and provided to the user as an answer. It is desirable to improve the accuracy and efficiency of understanding and identifying questions and retrieving data from databases.
Disclosure of Invention
In a first aspect of the present disclosure, a method for question answering is provided. The method comprises the following steps: determining, based on at least one named entity detected from the target question, that the at least one named entity corresponds to at least one category, the category of each named entity being a category defined for the structured query statement of the question-answer database; determining at least one query entity from the question-answer database that matches the at least one named entity, respectively, based on the semantics of the at least one named entity, respectively, and the at least one category to which the at least one named entity corresponds; determining a target structured query statement associated with the target question based on the at least one query entity and the at least one category; and determining a target answer to the target question by querying a question-answer database based on the target structured query statement.
In a second aspect of the present disclosure, an apparatus for question answering is provided. The device comprises: a category determination module configured to determine, based on at least one named entity detected from the target question, that the at least one named entity corresponds to at least one category, the category of each named entity being a category defined for the structured query statement of the question-answer database; an entity determination module configured to determine at least one query entity that matches the at least one named entity, respectively, from the question-answer database based on the semantics of the at least one named entity, respectively, and at least one category to which the at least one named entity corresponds; a statement determination module configured to determine a target structured query statement associated with a target question based on at least one query entity and at least one category; and an answer determination module configured to determine a target answer to the target question by querying the question-answer database based on the target structured query statement.
In a third aspect of the present disclosure, an electronic device is provided. The electronic device comprises at least one processing unit; and at least one memory coupled to the at least one processing unit and storing instructions for execution by the at least one processing unit, the instructions when executed by the at least one processing unit cause the electronic device to perform the method of the first aspect of the disclosure.
In a fourth aspect of the present disclosure, a computer-readable storage medium is provided. The computer readable storage medium has stored thereon a computer program executable by a processor to perform the method according to the first aspect of the present disclosure.
It should be understood that what is described in this summary is not intended to identify key or essential features of the embodiments of the disclosure, nor is it intended to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The above and other features, advantages, and aspects of various implementations of the present disclosure will become more apparent from the following detailed description taken in conjunction with the accompanying drawings. In the drawings, like or similar reference numerals denote like or similar elements, in which:
FIG. 1 illustrates a schematic diagram of an example environment in which embodiments of the present disclosure may be implemented;
FIG. 2 illustrates a flow chart of a process for question answering according to some embodiments of the present disclosure;
FIG. 3 shows a schematic diagram of an example descriptive diagram in accordance with some embodiments of the present disclosure;
FIG. 4 illustrates a schematic diagram of an example architecture for question answering, according to some embodiments of the present disclosure;
FIG. 5 illustrates a schematic block diagram of an apparatus for question answering according to some embodiments of the present disclosure; and
FIG. 6 illustrates a block diagram of a computing device in which one or more embodiments of the disclosure may be implemented.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While the embodiments of the present disclosure have been illustrated in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein, but rather, embodiments are provided to provide a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the present disclosure are for illustration purposes only and are not intended to limit the scope of the present disclosure.
In describing embodiments of the present disclosure, the term "comprising" and its like should be taken to be open-ended, i.e., including, but not limited to. The term "based on" should be understood as "based at least in part on". The term "one embodiment" or "the embodiment" should be understood as "at least one embodiment". The term "some embodiments" should be understood as "at least some embodiments". Other explicit and implicit definitions are also possible below.
As used herein, the term "model" may learn the association between the respective inputs and outputs from training data so that, for a given input, a corresponding output may be generated after training is completed. The generation of the model may be based on machine learning techniques. Deep learning is a machine learning algorithm that processes inputs and provides corresponding outputs through the use of multiple layers of processing units. The "model" may also be referred to herein as a "machine learning model," "machine learning network," "neural network," or "network," and these terms are used interchangeably herein.
A "neural network" is a machine learning network based on deep learning. The neural network is capable of processing the input and providing a corresponding output, which generally includes an input layer and an output layer, and one or more hidden layers between the input layer and the output layer. Neural networks used in deep learning applications typically include many hidden layers, thereby increasing the depth of the network. The layers of the neural network are connected in sequence such that the output of the previous layer is provided as an input to the subsequent layer, wherein the input layer receives the input of the neural network and the output of the output layer is provided as the final output of the neural network. Each layer of the neural network includes one or more nodes (also referred to as processing nodes or neurons), each of which processes input from a previous layer.
Machine learning may generally include three phases, namely a training phase, a testing phase, and an application phase (also referred to as an inference phase). In the training phase, a given model may be trained using a large amount of training data, with parameter values updated iteratively until the model can consistently draw inferences from the training data that meet the desired goal. Through training, the model may be considered to have learned the association between input and output (also referred to as the input-to-output mapping) from the training data, and the parameter values of the trained model are determined. In the testing phase, test inputs are applied to the trained model to test whether the model can provide the correct outputs, thereby determining the performance of the model. In the application phase, the model may be used to process actual inputs based on the trained parameter values and determine the corresponding outputs.
It should be noted that, in the technical solution of the present disclosure, the acquisition, storage, application, etc. of the user's personal information all comply with the relevant laws and regulations and do not violate public order and good morals.
It will be appreciated that before the technical solutions disclosed in the embodiments of the present disclosure are used, the user should be informed, in an appropriate manner and in accordance with the relevant laws and regulations, of the type, usage range, usage scenario, etc. of the personal information involved in the present disclosure, and the user's authorization should be obtained.
For example, in response to receiving an active request from a user, prompt information is sent to the user to explicitly indicate that the requested operation will require obtaining and using the user's personal information, so that the user may decide, according to the prompt information, whether to provide personal information to the software or hardware, such as an electronic device, an application, a server, or a storage medium, that performs the operations of the technical solution of the present disclosure.
As an alternative but non-limiting implementation, in response to receiving an active request from a user, the prompt information may be sent to the user, for example, in a pop-up window, where the prompt information may be presented in text. In addition, a selection control for the user to select "agree" or "disagree" to provide personal information to the electronic device may also be carried in the pop-up window.
As discussed above, an application or platform having a question-answering function may provide a question-answering service to a user based on a database. After obtaining a user question, the application or platform can understand and identify the question, and then retrieve corresponding data from the database based on the result and provide it to the user as an answer. Traditionally, only answers in a single modality (i.e., text) have been provided: the question entered by the user is text, and the answer provided by the application or platform is also text. In addition, database-based question answering has traditionally relied on high-quality table information: information such as column names and column values is located in the user question by comparing the question with a particular table, and is then assembled into a Structured Query Language (SQL) statement. The assembled SQL statement is then used to extract the corresponding data from the database.
However, databases for question answering often contain a large amount of data, and tables may share semantically similar data or have column names and column values that are easily confused with one another. This makes the relevant table difficult to locate, so a correct SQL statement cannot be generated by NL2SQL technology alone. It is desirable to increase the accuracy of the generated SQL statement. Furthermore, in tax scenarios, the data obtained from a tax question-answer database based on SQL statements often involves numerical information such as tax amounts, and the volume of such information is generally large (e.g., "the national value added tax and registered tax for the last year" is data with two columns and a plurality of rows), so it is difficult to convey the numerical information in a single modality. It is desirable to present multimodal answers that combine other information such as charts, so as to visualize the data and improve the user experience. Moreover, because of the complexity of the scenarios, a tax question-answer database cannot cover all tax question-answering scenarios; it is desirable to collect and feed back the data that the database does not cover, so as to improve the database.
In view of the foregoing, embodiments of the present disclosure provide a method for question answering. The method comprises the following steps. At least one category corresponding to at least one named entity detected from a target question is determined, the category of each named entity being a category defined for structured query statements of a question-answer database. At least one query entity that respectively matches the at least one named entity is determined from the question-answer database based on the respective semantics of the at least one named entity and the at least one category to which the at least one named entity corresponds. A target structured query statement associated with the target question is determined based on the at least one query entity and the at least one category. A target answer to the target question is determined by querying the question-answer database based on the target structured query statement. In this way, answers to questions can be determined based on the database, and structured query statements can be generated using query entities that match the named entities in the question. The quality of the answers can thus be ensured while the efficiency of question answering and the user experience are improved.
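The four steps above can be sketched end to end as follows. This is only a minimal illustration under stated assumptions: the toy table `registrations`, its column names, the exact-match lookup in `QUERY_ENTITIES`, and the convention in `build_sql` for turning categories into SQL clauses are all hypothetical and are not taken from the disclosure.

```python
import sqlite3

# Step 1 (assumed output): named entities detected in the question and their categories.
DETECTED = {
    "legal person": "label",
    "each province": "dimension key",
    "registered people": "index",
}

# Step 2 (assumed output): (category, named entity) -> query entity, expressed here
# as column names or values of the toy table. Values come from this trusted dict only.
QUERY_ENTITIES = {
    ("label", "legal person"): "legal_person",
    ("dimension key", "each province"): "province",
    ("index", "registered people"): "registrants",
}


def build_sql(matched):
    """Step 3: assemble a parameterized SQL statement from category -> query entity.

    Hypothetical convention: "index" is the aggregated column, "dimension key"
    is the grouping column, and "label" filters on the `label` column.
    """
    index_col = matched["index"]
    group_col = matched["dimension key"]
    sql = (
        f"SELECT {group_col}, SUM({index_col}) FROM registrations "
        f"WHERE label = ? GROUP BY {group_col} ORDER BY {group_col}"
    )
    return sql, (matched["label"],)


def answer(conn):
    """Step 4: run the assembled statement against the question-answer database."""
    matched = {cat: QUERY_ENTITIES[(cat, ent)] for ent, cat in DETECTED.items()}
    sql, params = build_sql(matched)
    return conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE registrations (province TEXT, label TEXT, registrants INTEGER)")
conn.executemany(
    "INSERT INTO registrations VALUES (?, ?, ?)",
    [("A", "legal_person", 10), ("A", "legal_person", 5),
     ("B", "legal_person", 7), ("B", "investor", 99)],
)
rows = answer(conn)
```

Keeping the filter value as a bound parameter and taking column names only from a fixed mapping avoids building SQL directly from user text, which is one reasonable design choice for a sketch like this.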
FIG. 1 illustrates a schematic diagram of an example environment 100 in which embodiments of the present disclosure may be implemented. As shown in FIG. 1, environment 100 may include an electronic device 110.
The electronic device 110 may generate a target answer 112 corresponding to the target question 102 based on the target question 102. That is, the electronic device 110 may execute a question and answer service based on the target question 102 to generate the target answer 112. The electronic device 110 may obtain the target question 102 in any suitable manner. For example, the electronic device 110 may determine the retrieved input text as the target question 102 in response to detecting user input in the input box. For example, the electronic device 110 may receive the user's voice and convert it to input text, which in turn is determined to be the target question 102. The speech herein may be speech of any suitable language, of any duration, of any tone.
The electronic device 110 may, for example, utilize a model 120 to perform the question-answering service. The model 120 may include, for example, but is not limited to, a Transformer model, a Convolutional Neural Network (CNN), a Recurrent Neural Network (RNN), a Deep Neural Network (DNN), a multi-layer perceptron (MLP), and the like. The model 120 may be local to the electronic device 110 or may be installed in another electronic device (e.g., a remote device). The model 120 may include a plurality of models, for example models for chart generation, models for semantic recognition, language models, and the like. The electronic device 110 may, for example, utilize the database 130 to determine the target answer to the target question 102. The database 130 may be, for example, a question-answer database, which may include a large amount of the data required for question answering. The electronic device 110 may, for example, determine a plurality of entities associated with the target question 102 and query the database 130 based on the plurality of entities to obtain a query result. The electronic device 110 may in turn determine the query result as the target answer.
Electronic device 110 may include any computing system having computing capabilities, such as various computing devices/systems, terminal devices, server devices, and the like. The terminal device may be any type of mobile terminal, fixed terminal, or portable terminal, including a mobile handset, desktop computer, laptop computer, notebook computer, netbook computer, tablet computer, media computer, multimedia tablet, palmtop computer, portable gaming terminal, VR/AR device, personal communication system (Personal Communication System, PCS) device, personal navigation device, personal digital assistant (Personal Digital Assistant, PDA), audio/video player, digital camera/video camera, positioning device, television receiver, radio broadcast receiver, electronic book device, gaming device, or any combination of the preceding, including accessories and peripherals for these devices, or any combination thereof.
The server device may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery networks, and big data and artificial intelligence platforms. The server devices may include, for example, computing systems/servers, such as mainframes, edge computing nodes, computing devices in a cloud environment, and so forth.
It should be understood that the structure and function of the various elements in environment 100 are described for illustrative purposes only and are not meant to suggest any limitation as to the scope of the disclosure.
Some example embodiments of the present disclosure will be described below with continued reference to the accompanying drawings.
FIG. 2 illustrates a flow chart of a process 200 for question answering according to some embodiments of the present disclosure. Process 200 may be implemented at the electronic device 110. For ease of discussion, the process 200 will be described with reference to the environment 100 of FIG. 1.
At block 210, the electronic device 110 determines at least one category corresponding to at least one named entity detected from the target question, the category of each named entity being a category defined for structured query statements of the question-answer database.
The target question may take a variety of forms. The target question may be, for example, a question captured in text form, in which case the electronic device 110 may directly obtain the text sequence of the question entered by the user. For convenience, the target question may also be a question captured in voice form, in which case the electronic device 110 may capture voice data of the user's question through a voice capture means. The voice data may be in any language (e.g., Chinese, English, Japanese, etc.), of any duration (e.g., 3 s, 5 s, etc.), and of any timbre. It will be appreciated that the target question may also be a question captured in any other suitable form.
The electronic device 110 performs entity detection on the obtained target question to determine at least one named entity in the target question. Named entities are entities in text that have a particular meaning or strong referential meaning. For example, if the target question is "registered people of legal persons in each province of China", the electronic device 110 may identify the named entities "China", "legal person", "each province", and "registered people".
The category of an entity may be a category predefined by the user or a category determined by the electronic device 110 itself. In embodiments of the present disclosure, the categories of entities are categories defined for structured query statements of the question-answer database. The electronic device 110 may determine, based on the structured query statements, the different categories corresponding to different entities in the question-answer database. By way of example, the categories of entities may include dimension, label, dimension key, index, and the like. It is to be understood that the categories here are merely examples, that any suitable category may be included, and that the disclosure is not limited in this respect. It should be noted that the named entities and categories in a target question may be in one-to-one correspondence. For example, the target question includes 4 named entities, and the 4 named entities respectively correspond to 4 categories. The named entities and categories in a target question may also be in many-to-one correspondence. For example, a target question includes 5 named entities, two of which correspond to the same category.
Regarding how the at least one named entity is identified and its category determined, in some embodiments the electronic device 110 may apply a Named Entity Recognition (NER) technique to the target question to determine the at least one named entity and the category to which each named entity corresponds. Illustratively, the electronic device 110 may determine through NER that the category of the named entity "China" in the target question is dimension, that of the named entity "legal person" is label, that of the named entity "each province" is dimension key, and that of the named entity "registered people" is index.
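As a rough illustration of this step, the sketch below assigns categories with a plain dictionary lookup. It is only a stand-in: a trained NER model would normally do this work, and the vocabulary `CATEGORY_VOCAB`, the function name `detect_entities`, and the English rendering of the example question are assumptions made for illustration.

```python
# Hypothetical vocabulary: surface form -> category defined for the SQL statements.
CATEGORY_VOCAB = {
    "China": "dimension",
    "legal person": "label",
    "each province": "dimension key",
    "registered people": "index",
}


def detect_entities(question):
    """Return (named entity, category) pairs in order of appearance.

    Simple case-insensitive substring lookup stands in for a trained NER model.
    """
    lowered = question.lower()
    found = []
    for surface, category in CATEGORY_VOCAB.items():
        pos = lowered.find(surface.lower())
        if pos != -1:
            found.append((pos, surface, category))
    # Sort by position so the output follows the order of the question text.
    return [(surface, category) for _, surface, category in sorted(found)]


pairs = detect_entities("registered people of legal person in each province of China")
```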
At block 220, the electronic device 110 determines, from the question-answer database, at least one query entity that respectively matches the at least one named entity, based on the respective semantics of the at least one named entity and the at least one category to which the at least one named entity corresponds. Since the target question may be a question in any scenario, in some embodiments, to improve the accuracy of the generated answer, the question-answer database may be one that matches the scenario of the target question. For example, if the target question is a question in a tax scenario, the question-answer database is a tax question-answer database. Accordingly, the query entity is an entity included in the question-answer database; where the question-answer database is a tax question-answer database, the query entities include entities related to tax question answering (e.g., proper nouns in a tax scenario).
In some embodiments, since a named entity may not be an entity included in the question-answer database, the electronic device 110 may perform semantic recognition on the named entity in order to improve the accuracy of the finally generated target answer. The electronic device 110 may in turn determine, based on the recognition result, a query entity that matches the named entity from the entities included in the database. For example, the named entity "legal person" may not be included in the question-answer database. After performing semantic recognition on the named entity, the electronic device 110 determines, from the entities included in the question-answer database, that the entity matching the named entity is "legal person representative", and determines the entity "legal person representative" as the query entity matching the named entity "legal person".
Regarding the specific manner in which the query entity is determined, in some embodiments, the electronic device 110 may obtain an entity lookup table from the question-answer database. The entity lookup table comprises at least one column, wherein the column name of each column is a category in the at least one category, and the column values of each column are the query entities under the corresponding category. Illustratively, Table 1 shows an example of an entity lookup table.
TABLE 1
Label | Dimension | Dimension key | Index | ……
Legal person representative | Country-country dimension | Province dimension | Registration number | ……
Talent to the soldier | National security-industry dimension | Enterprise dimension | Paramedics and insurers | ……
Investors | A national Bank-group dimension | National dimensions | Health degree of staff | ……
…… | …… | …… | …… | ……
Note that in the "dimension" column of Table 1, each column value is a dimension value-dimension name pair. For example, in the column value "country-country dimension", "country" is the dimension value and "country dimension" is the dimension name.
The electronic device 110 may determine at least one target column in the entity lookup table based on the at least one category to which the at least one named entity corresponds. For example, after determining that the category of the named entity "legal person" is label, the electronic device 110 may determine the column of the entity lookup table whose column name is "label" as the target column. That is, the electronic device 110 may determine, for example, the first column in Table 1 as the target column for the named entity "legal person". Similarly, the electronic device 110 may determine the second column in Table 1 (i.e., the column named "dimension") as the target column for the named entity "China", the third column in Table 1 (i.e., the column named "dimension key") as the target column for the named entity "each province", and the fourth column in Table 1 (i.e., the column named "index") as the target column for the named entity "registered people". Thus, the electronic device 110 may determine the target columns corresponding to the named entities "China", "legal person", "each province", and "registered people" in the target question "registered people of legal persons in each province of China".
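The column selection described above amounts to a lookup keyed by category. The following minimal sketch assumes the entity lookup table is held as a dictionary from column name to column values; that layout, the name `ENTITY_LOOKUP`, and the function `target_columns` are illustrative assumptions, and the values only loosely follow Table 1.

```python
# Entity lookup table: column name (category) -> column values (query entities).
# The dict-of-lists layout is an assumption made for this sketch.
ENTITY_LOOKUP = {
    "label": ["legal person representative", "investors"],
    "dimension": ["country-country dimension"],
    "dimension key": ["province dimension", "enterprise dimension"],
    "index": ["registration number"],
}


def target_columns(entity_categories):
    """Map each named entity to the column whose name equals its category.

    An unknown category yields an empty column rather than raising an error.
    """
    return {
        entity: ENTITY_LOOKUP.get(category, [])
        for entity, category in entity_categories.items()
    }


columns = target_columns({
    "legal person": "label",
    "each province": "dimension key",
    "unknown thing": "no such category",
})
```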
The electronic device 110 in turn queries the query entities included in the at least one target column based on the semantics of each of the at least one named entity to determine at least one query entity that matches the at least one named entity. Specifically, for each named entity in the at least one named entity, the electronic device 110 may perform semantic recognition on the named entity, so as to determine a semantic matching degree (which may also be referred to as a semantic similarity) between the recognition result (i.e., the semantics of the named entity) and the semantics of each entity in the plurality of entities (i.e., the plurality of column values) included in the corresponding target column. The electronic device 110 may semantically identify the named entity in any suitable manner, and the disclosure is not limited to a particular manner of semantic identification.
In some embodiments, if the target column contains an entity whose semantic matching degree with the named entity is above a matching threshold, the electronic device 110 may determine that entity as the query entity that matches the named entity. In some embodiments, if the target column contains multiple entities whose semantic matching degree with the named entity is above the matching threshold, the electronic device 110 may determine the entity with the highest semantic matching degree among them as the query entity that matches the named entity. For example, for the named entity "legal person", the electronic device 110 may determine from the "label" column the query entity "legal person representative", whose semantic matching degree with the named entity is above the threshold. Similarly, according to Table 1, the electronic device 110 may determine from the "dimension" column the query entity "country-country dimension" matching the named entity "China", from the "dimension key" column the query entity "province dimension" matching the named entity "each province", and from the "index" column the query entity "registration number" matching the named entity "registered people".
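The threshold-then-highest-score rule can be sketched as below. Note the substitution: the disclosure leaves the semantic recognition method open, so this example uses the character-level ratio from Python's standard `difflib.SequenceMatcher` purely as a placeholder for a real semantic similarity (for instance one computed from embeddings). The function names and the threshold value are assumptions.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Placeholder for the semantic matching degree: a character-level ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(named_entity, target_column, threshold):
    """Return the column value with the highest score above the threshold, else None."""
    scored = [(similarity(named_entity, value), value) for value in target_column]
    above = [pair for pair in scored if pair[0] > threshold]
    if not above:
        return None  # no query entity matches; the named entity may be a new entity
    return max(above, key=lambda pair: pair[0])[1]


match = best_match("legal person", ["legal person representative", "investors"], 0.5)
no_match = best_match("xyz", ["legal person representative", "investors"], 0.5)
```

Returning `None` when nothing clears the threshold leaves room for the case, described next, in which the named entity is not yet in the database.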
It should be noted that, in addition to being based on the entity lookup table, the electronic device 110 may determine at least one query entity matching at least one named entity from the question-answer database based on any suitable manner such as text matching, field matching, etc., and the present disclosure is not limited to a specific determination method.
In some embodiments, if the question-answer database does not include a query entity that matches a first named entity of the at least one named entity (e.g., if there is no query entity in the target column that matches the first named entity), the electronic device 110 may determine the first named entity as a first entity. It is to be appreciated that the first named entity here may also include a plurality of entities. The first entity is an entity newly discovered by the electronic device 110 that is not included in the question-answer database. The electronic device 110 may assign a preset identifier to the first entity; the identifier is used to identify the first entity in the target question and may be any suitable identifier. By way of example, the electronic device 110 may identify the first entity using sequence labeling. The electronic device 110 may, for example, employ a BIO tagging scheme to identify the first character of the first entity as B (i.e., begin) and the remaining characters within the first entity as I (i.e., inside). The electronic device 110 may also identify non-entities in the target question as O (i.e., outside).
For example, if the target question is "address of student finding all last names", the electronic device 110 determines that the target question includes the named entities "find", "last name", "student", and "address", as well as non-entities such as "all". The electronic device 110 may in turn identify these named entities and non-entities with the preset identifiers. For example, the electronic device 110 may identify the first character of the named entity "find" as B and its second character as I, and identify the non-entity "all" as O.
The identified first entity includes the first entity and an identifier corresponding thereto. For example, if the first entity is "find", the identified first entity may be, for example, "find-BI". It is to be appreciated that the electronic device 110 can identify the first entity by any suitable identification means, which is not limiting of the present disclosure. Electronic device 110 may in turn store the identified first entity in a question-answer database. For example, the identified first entity may be added to an entity lookup table. In some embodiments, when a first entity is identified, electronic device 110 may also determine a category to which the first entity corresponds, so that the first entity may be subsequently stored in a column of the corresponding category of the entity lookup table.
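The BIO identification described above can be sketched as follows. This is a minimal sketch under assumptions: the function names `bio_tags` and `label_entity` are hypothetical, the labeling units may be characters or tokens depending on the language, and the example units are illustrative only.

```python
from typing import List, Tuple


def bio_tags(units: List[str], entity_spans: List[Tuple[int, int]]) -> List[str]:
    """Tag every unit as B (begin), I (inside) or O (outside).
    Each span is a half-open index range [start, end) covering one entity."""
    tags = ["O"] * len(units)
    for start, end in entity_spans:
        if start >= end:
            continue  # ignore empty spans
        tags[start] = "B"
        for i in range(start + 1, end):
            tags[i] = "I"
    return tags


def label_entity(entity_units: List[str]) -> str:
    """Build an identified entity of the form '<entity>-<tags>'."""
    if not entity_units:
        raise ValueError("an entity needs at least one unit")
    tags = "B" + "I" * (len(entity_units) - 1)
    return " ".join(entity_units) + "-" + tags


# Hypothetical units: a two-unit entity followed by a non-entity.
units = ["find", "out", "all"]
tags = bio_tags(units, [(0, 2)])
identified = label_entity(units[0:2])
```

The identified string produced by `label_entity` is what would then be stored in the question-answer database, for instance in the column of the entity lookup table that corresponds to the entity's category.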
Thus, by storing the newly discovered entity in the question-answer database, the electronic device 110 updates the database, which may improve the accuracy of subsequent question answering based on the database. This also helps to build a more comprehensive and accurate question-answer database and may improve the efficiency of subsequent question answering.
At block 230, the electronic device 110 determines a target structured query statement associated with the target question based on the at least one query entity and the at least one category.
In some embodiments, the electronic device 110 may obtain a preset structured query statement template, which may include at least one slot. It may be appreciated that the structured query statement template may be preset by the user or may be determined by the electronic device 110 itself. The at least one slot may correspond to the at least one category. The electronic device 110 may populate the at least one query entity into the at least one slot in the structured query statement template based on the at least one category. Illustratively, the structured query statement template may be, for example, "SELECT [dimension key], SUM([index]) FROM entity lookup table WHERE [dimension] AND tag = '[tag]' GROUP BY [dimension key]". In this example, each bracketed area represents a slot (shown underlined in the original layout), and the text inside the brackets represents the category to which the slot corresponds.
For example, for the above-mentioned target question "registrant numbers in legal provinces of our country", the electronic device 110 may take the query entities "legal person representative", "country-country dimension", "province dimension", and "registration number", which were determined from Table 1 for the named entities "in China", "legal person", "provinces", and "registered person number" in the target question, and populate them into the slots of the structured query statement template according to their corresponding categories. The electronic device 110 may thus obtain, for the target question, the target structured query statement "SELECT province dimension, SUM(registration number) FROM entity lookup table WHERE country dimension = 'a country' AND tag = 'legal person representative' GROUP BY province dimension".
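The slot filling described above can be sketched with a plain string template. This is a minimal sketch under assumptions: the placeholder names (`dimension_key`, `index`, `table`, `dimension`, `tag`) and the function name `fill_template` are hypothetical, identifiers are wrapped in double quotes because they contain spaces, and the dimension slot is assumed to hold a complete condition.

```python
from string import Template
from typing import Dict

# One placeholder per category; $dimension_key fills two slots.
QUERY_TEMPLATE = Template(
    'SELECT "$dimension_key", SUM("$index") FROM "$table" '
    "WHERE $dimension AND tag = '$tag' GROUP BY \"$dimension_key\""
)


def fill_template(slots: Dict[str, str]) -> str:
    """Populate every slot; substitute() raises KeyError if a category is missing."""
    return QUERY_TEMPLATE.substitute(slots)


# Query entities keyed by the category of the slot they fill.
slots = {
    "dimension_key": "province dimension",
    "index": "registration number",
    "table": "entity lookup table",
    "dimension": "\"country dimension\" = 'a country'",
    "tag": "legal person representative",
}
statement = fill_template(slots)
```

Because the values come from the entity lookup table rather than from raw user text, plain substitution is enough for a sketch; a production system would more likely bind literal values as query parameters.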
In some embodiments, the electronic device 110 may also generate a target structured query statement associated with the target question directly based on the at least one query entity and the at least one category. For example, the electronic device 110 may provide at least one query entity and at least one category to a trained language model and obtain a generated target structured query statement from the language model. It is to be appreciated that the electronic device 110 can determine the target structured query statement based on any suitable manner, which is not limiting of the present disclosure.
At block 240, the electronic device 110 determines a target answer to the target question by querying a question-answer database based on the target structured query statement.
The electronic device 110 may query the question-answer database based on the target structured query statement to obtain a query result for the target question. In some embodiments, the electronic device 110 may directly determine the query result as the target answer to the target question. The electronic device 110 may also provide the target answer to the user, for example, by presenting text, playing voice, and the like.
In some embodiments, if the target question is a question about a numerical value, the query result obtained by the electronic device 110 includes at least the numerical information for the target question in the question-answer database. In this case, if there is a large amount of numerical information, the user may be unable to digest and understand it quickly, which may degrade the user's question-answering experience. Thus, in some embodiments, the electronic device 110 may also process the query result and generate a multimodal target answer based on the processed result. In particular, the electronic device 110 may process the query result using a first model (e.g., a language model) to determine descriptive text for describing the query result. The electronic device 110 may also process the query result using a second model (e.g., a chart generation model) to determine a description table and/or a description image (which may also be referred to simply as a description chart; these terms may be used interchangeably herein) for describing the query result. The electronic device 110 in turn generates the multimodal target answer based on the descriptive text and the description chart. Illustratively, if the target question is "what is the tax revenue of city A in year XX", the electronic device 110 determines that it is a question about a numerical value and obtains a query result for the target question from the question-answer database based on the above-described method. The query result may include, for example, numerical information regarding the tax revenue of city A in year XX.
After obtaining the query result, the electronic device 110 processes it using the first model to obtain the descriptive text "The tax revenue of the urban area of city A in year XX is XXX, of which value-added tax is XX, accounting for 22%; consumption tax is XX, accounting for 5%; customs duty is XX, accounting for 19%; enterprise income tax is XX, accounting for 26%; individual income tax is XX, accounting for 13%; and other taxes are XX, accounting for 15%". The electronic device 110 may also process the query result using the second model to obtain a description chart describing the query result. In some embodiments, the electronic device 110 may also provide the descriptive text obtained using the first model to the second model to obtain the description chart. Fig. 3 shows a schematic diagram of an example description chart 300, according to some embodiments of the present disclosure. The electronic device 110 may provide the query result to the second model, which may generate the description chart 300 shown in fig. 3 based on the query result. The electronic device 110 may in turn generate a multimodal target answer based on the descriptive text and the description chart, and may provide the multimodal target answer to the user. In this way, a multimodal answer can be generated that helps the user quickly digest and understand a large amount of numerical information, improves the ability to analyze data and make decisions, visualizes the answer, and helps improve the user's question-answering experience.
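Executing a target structured query statement against a database can be sketched with Python's built-in `sqlite3` module. This is a minimal sketch under assumptions: the table name `lookup`, its column names, and all rows are made-up toy data, and SQLite merely stands in for whatever database actually backs the question-answer database.

```python
import sqlite3
from typing import List, Tuple


def run_query(conn: sqlite3.Connection, sql: str) -> List[Tuple]:
    """Execute a structured query statement and return all result rows."""
    return conn.execute(sql).fetchall()


# In-memory toy database (hypothetical schema and data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE lookup (province TEXT, country TEXT, tag TEXT, registrations INTEGER)"
)
conn.executemany(
    "INSERT INTO lookup VALUES (?, ?, ?, ?)",
    [
        ("P1", "a country", "legal person representative", 10),
        ("P1", "a country", "legal person representative", 5),
        ("P2", "a country", "legal person representative", 7),
        ("P2", "a country", "other tag", 100),  # filtered out by the tag condition
    ],
)

sql = (
    "SELECT province, SUM(registrations) FROM lookup "
    "WHERE country = 'a country' AND tag = 'legal person representative' "
    "GROUP BY province ORDER BY province"
)
rows = run_query(conn, sql)  # one (province, total) row per province
conn.close()
```

The returned rows correspond to the query result, which may be used directly as the target answer or processed further.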
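Combining descriptive text and a chart into one multimodal answer can be sketched as follows. This is a minimal sketch under assumptions: a fixed sentence pattern stands in for the first model, a plain dictionary describing a pie chart stands in for the output of the second model, the function names are hypothetical, and the amounts are made-up values chosen only so that the shares equal the percentages in the example text.

```python
from typing import Dict, List, Tuple

Item = Tuple[str, float]


def describe_result(subject: str, items: List[Item]) -> str:
    """Stand-in for the first model: state each amount and its share of the total."""
    total = sum(value for _, value in items)
    if total == 0:
        return f"{subject} is 0."
    parts = [
        f"{name} is {value}, accounting for {value / total:.0%}"
        for name, value in items
    ]
    return f"{subject} is {total}, of which " + "; ".join(parts) + "."


def chart_spec(items: List[Item]) -> Dict[str, object]:
    """Stand-in for the second model: a renderer-agnostic chart description."""
    return {
        "type": "pie",
        "labels": [name for name, _ in items],
        "values": [value for _, value in items],
    }


def multimodal_answer(subject: str, items: List[Item]) -> Dict[str, object]:
    """Bundle the descriptive text and the chart description into one answer."""
    return {"text": describe_result(subject, items), "chart": chart_spec(items)}


# Hypothetical query result (amounts are illustrative only).
result = [
    ("value-added tax", 22),
    ("consumption tax", 5),
    ("customs duty", 19),
    ("enterprise income tax", 26),
    ("individual income tax", 13),
    ("other taxes", 15),
]
answer = multimodal_answer("The tax revenue", result)
```

Keeping the chart as a data description rather than a rendered image leaves the choice of rendering library open.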
Having described the case where the question-answer database includes at least one query entity that matches the at least one named entity in the target question, the following describes the case where the question-answer database does not include a query entity that matches a first named entity of the at least one named entity in the target question. In this case, the electronic device 110 may determine that the target question contains a named entity for which the question-answer database includes no corresponding query entity, and the electronic device 110 may therefore be unable to determine the target structured query statement for the target question.
In some embodiments, the electronic device 110 may determine, from the question-answer database, a first query entity whose text similarity to the first entity is above a first threshold. The electronic device 110 may do so in any suitable manner. For example, the electronic device 110 may determine the amount of text that each of a plurality of entities included in the database has in common with the first entity; the more text in common, the higher the text similarity. The electronic device 110 may then determine the entity with the most text in common (i.e., the entity with the highest text similarity) as the first query entity. For example, if the first entity is "bank card amount", the first query entity determined by the electronic device 110 may be, for example, "apply for bank card". The electronic device 110 may generate an approximate question for the target question based on the first query entity and the named entities of the at least one named entity other than the first named entity. That is, the electronic device 110 may replace the first entity in the target question with the first query entity to generate the approximate question. The electronic device 110 in turn determines a first answer to the approximate question and provides the approximate question and the first answer to the user. The first answer may be determined in the same manner as described above for the case in which the question-answer database includes at least one query entity corresponding to the at least one named entity.
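The text-similarity lookup and the substitution that yields the approximate question can be sketched as follows. This is a minimal sketch under assumptions: counting shared non-space characters is only one possible reading of "text in common", and the function names, the threshold value, and the sample question are hypothetical.

```python
from collections import Counter
from typing import Optional, Sequence


def common_text_count(a: str, b: str) -> int:
    """Number of non-space characters the two strings share (multiset intersection)."""
    ca = Counter(a.replace(" ", ""))
    cb = Counter(b.replace(" ", ""))
    return sum((ca & cb).values())


def find_first_query_entity(
    first_entity: str, candidates: Sequence[str], first_threshold: int
) -> Optional[str]:
    """Return the candidate sharing the most text with the first entity,
    provided the shared amount is above the first threshold."""
    best = max(
        candidates, key=lambda c: common_text_count(first_entity, c), default=None
    )
    if best is None or common_text_count(first_entity, best) <= first_threshold:
        return None
    return best


def approximate_question(target_question: str, first_entity: str, query_entity: str) -> str:
    """Replace the unmatched first entity with the first query entity."""
    return target_question.replace(first_entity, query_entity)


candidates = ["apply for bank card", "registration number"]
first_query = find_first_query_entity("bank card amount", candidates, first_threshold=5)
question = approximate_question(
    "how to check bank card amount", "bank card amount", first_query
)
```

The approximate question can then be answered through the regular pipeline, since every entity in it now has a counterpart in the database.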
Alternatively or additionally, in some embodiments, the question-answer database includes at least one historical question. The electronic device 110 may determine, based on the target question, a target history question from the at least one historical question whose text similarity to the target question is above a second threshold. The electronic device 110 may also, for example, perform semantic recognition on the target question and, based on the recognition result, determine from the at least one historical question a target history question whose semantic similarity to the target question is above a threshold. The electronic device 110 may also determine the target history question, for example, by combining semantic similarity and text similarity. The electronic device 110 may determine a second answer to the target history question and provide the target history question and the second answer. For example, the electronic device 110 may store historical answers to the historical questions, in which case it may directly determine the historical answer corresponding to the target history question and determine that historical answer as the second answer.
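Retrieving a stored historical question and its answer can be sketched with the standard-library `difflib` module. This is a minimal sketch under assumptions: `SequenceMatcher.ratio()` stands in for the text similarity measure, the second threshold of 0.6 is arbitrary, and the stored questions and answers are made-up placeholders.

```python
from difflib import SequenceMatcher
from typing import Dict, Optional, Tuple


def find_history_answer(
    target_question: str,
    history: Dict[str, str],
    second_threshold: float = 0.6,  # hypothetical second threshold
) -> Optional[Tuple[str, str]]:
    """Return (historical question, stored answer) for the most similar
    historical question whose similarity is above the threshold, else None."""
    best_question, best_score = None, second_threshold
    for question in history:
        score = SequenceMatcher(None, target_question.lower(), question.lower()).ratio()
        if score > best_score:
            best_question, best_score = question, score
    if best_question is None:
        return None
    return best_question, history[best_question]


# Hypothetical stored historical questions and their answers.
history = {
    "What is the tax revenue of city A?": "stored answer 1",
    "How do I apply for a bank card?": "stored answer 2",
}
hit = find_history_answer("What is the tax revenue for city A", history)
miss = find_history_answer("zzz qqq", history)
```

A semantic similarity score could be combined with the text score, for example as a weighted sum, before the threshold comparison.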
In this way, in the case that the target answer to the target question cannot be determined directly based on the database, the electronic device 110 may provide similar questions and corresponding answers to the user by determining the approximate question and/or the history question, so that the user may more conveniently obtain the desired information, and the question and answer efficiency may be improved while ensuring the user experience.
Fig. 4 illustrates a schematic diagram of an example architecture 400 for question-answering, according to some embodiments of the present disclosure. As shown in fig. 4, the architecture 400 includes an entity determination unit 410. After the electronic device 110 obtains the target question 102, the target question may be provided to the entity determination unit 410. The entity determination unit 410 is configured to determine at least one named entity detected from the target question 102 and to determine at least one category to which the at least one named entity corresponds. The entity determining unit 410 may further determine whether at least one query entity that matches with at least one named entity, respectively, is included in the question-answer database. In case it is determined that the question-answer database includes at least one query entity that matches with at least one named entity, respectively, the entity determining unit 410 may provide the determined at least one query entity and at least one category to the structured query language processing unit 420.
The structured query language processing unit 420 may determine a target structured query statement for the target question 102 based on the at least one query entity and the at least one category. The structured query language processing unit 420 may also query the question-answer database based on the determined target structured query statement to determine a query result for the target question.
The structured query language processing unit 420 may provide the query results to the multimodal-text generation unit 430 and the multimodal-graph generation unit 440, respectively. The multimodal-text generating unit 430 and the multimodal-graph generating unit 440 may generate description text and a description graph for describing the query result, respectively, based on the query result. The electronic device 110 may generate a target answer 112 to the target question based on the descriptive text and the descriptive chart.
In case it is determined that the question-answer database does not include a query entity matching a first named entity of the at least one named entity, the entity determining unit 410 may determine the first named entity as the first entity and provide the first entity to the new entity discovery unit 450. The new entity discovery unit 450 may identify the first entity and provide the identified first entity to the database accumulation unit 460. Database accumulating unit 460 may store the identified first entity in a question-answer database.
The electronic device 110 may in turn provide the target question with the identified first entity to the question recommending unit 470. The question recommending unit 470 may generate an approximate question for the target question and/or determine a target history question for the target question. The electronic device 110 may generate a first answer based on the approximate question and/or a second answer based on the target history question, and may determine the approximate question together with the first answer, and/or the target history question together with the second answer, as the target answer.
In summary, the embodiments of the present disclosure may determine answers to questions based on a database, may generate a structured query statement using a query entity that matches a named entity in the questions, may improve efficiency of questions and answers while guaranteeing quality of answers, and may help to improve user experience of users in the questions and answers.
Embodiments of the present disclosure also provide corresponding apparatus for implementing the above-described methods or processes. Fig. 5 illustrates a schematic block diagram of an apparatus 500 for question-answering according to some embodiments of the present disclosure. The apparatus 500 may be implemented as or included in the electronic device 110. The various modules/components in apparatus 500 may be implemented in hardware, software, firmware, or any combination thereof.
As shown in fig. 5, the apparatus 500 includes a category determination module 510 configured to determine, based on at least one named entity detected from the target question, that the at least one named entity corresponds to at least one category, the category of each named entity being a category defined for the structured query statement of the question-answer database. The apparatus 500 further comprises an entity determination module 520 configured to determine at least one query entity from the question-answer database that matches the at least one named entity, respectively, based on the semantics of the at least one named entity, respectively, and the at least one category to which the at least one named entity corresponds. The apparatus 500 further includes a statement determination module 530 configured to determine a target structured query statement associated with the target question based on the at least one query entity and the at least one category. The apparatus 500 further comprises an answer determination module 540 configured to determine a target answer to the target question by querying the question-answer database based on the target structured query statement.
In some embodiments, the entity determination module 520 includes: a lookup table acquisition module configured to acquire an entity lookup table from the question-answer database, wherein the entity lookup table comprises at least one column, the column name of each column is a category in the at least one category, and the column values of each column are query entities under the corresponding category; a target column determination module configured to determine at least one target column in the entity lookup table based on the at least one category corresponding to the at least one named entity; and a query entity determination module configured to query the query entities included in the at least one target column based on the semantics of each of the at least one named entity to determine at least one query entity that matches the at least one named entity.
In some embodiments, the target answer comprises a multimodal answer, and the answer determination module 540 comprises: a query result acquisition module configured to query the question-answer database based on the target structured query statement to acquire a query result for the target question; a text determination module configured to process the query result using a first model to determine descriptive text for describing the query result; a chart determination module configured to process the query result using a second model to determine a description table and/or a description image for describing the query result; and an answer generation module configured to generate the target answer based on the descriptive text, the description table, and/or the description image.
In some embodiments, the first model is a language model and the second model is a chart generation model.
In some embodiments, if the target question is a question for a value, the query result includes at least the value information for the target question in the question-and-answer database.
In some embodiments, the apparatus 500 further comprises: a first entity determination module configured to determine a first named entity as a first entity if it is determined that the question-answer database does not include a query entity that matches the first named entity of the at least one named entity; an identification module configured to assign a preset identifier to the first entity so as to identify the first entity, wherein the identifier is used for identifying the first entity in the target question; and a storage module configured to store the identified first entity in the question-answer database.
In some embodiments, the apparatus 500 further comprises: a first query entity determination module configured to determine, from the question-answer database, a first query entity having a text similarity to the first entity above a first threshold; an approximate question generation module configured to generate an approximate question for the target question based on the first query entity and the named entities of the at least one named entity other than the first named entity; a first answer determination module configured to determine a first answer to the approximate question; and a first providing module configured to provide the approximate question and the first answer.
In some embodiments, the question-answer database includes at least one historical question, and the apparatus 500 further includes: a history question determination module configured to determine, based on the target question, a target history question having a text similarity to the target question higher than a second threshold from among the at least one historical question; a second answer determination module configured to determine a second answer to the target history question; and a second providing module configured to provide the target history question and the second answer.
In some embodiments, the question-answer database is a tax question-answer database, and the querying entity includes entities related to tax questions and answers.
The units and/or modules included in apparatus 500 may be implemented in various manners, including software, hardware, firmware, or any combination thereof. In some embodiments, one or more units and/or modules may be implemented using software and/or firmware, such as machine-executable instructions stored on a storage medium. In addition to or in lieu of machine-executable instructions, some or all of the units and/or modules in apparatus 500 may be implemented at least in part by one or more hardware logic components. By way of example and not limitation, exemplary types of hardware logic components that can be used include Field Programmable Gate Arrays (FPGAs), Application-Specific Integrated Circuits (ASICs), Application-Specific Standard Products (ASSPs), Systems on Chip (SoCs), Complex Programmable Logic Devices (CPLDs), and the like.
It will be appreciated that one or more steps of the above methods may be performed by suitable electronic devices or combinations of electronic devices. Such an electronic device or combination of electronic devices may include, for example, electronic device 110 of fig. 1.
Fig. 6 illustrates a block diagram of an electronic device 600 in which one or more embodiments of the disclosure may be implemented. It should be understood that the electronic device 600 illustrated in fig. 6 is merely exemplary and should not be construed as limiting the functionality and scope of the embodiments described herein. The electronic device 600 shown in fig. 6 may be used to implement the electronic device 110 of fig. 1.
As shown in fig. 6, the electronic device 600 is in the form of a general-purpose electronic device. The components of electronic device 600 may include, but are not limited to, one or more processors or processing units 610, memory 620, storage 630, one or more communication units 640, one or more input devices 650, and one or more output devices 660. The processing unit 610 may be an actual or virtual processor and is capable of performing various processes according to programs stored in the memory 620. In a multiprocessor system, multiple processing units execute computer-executable instructions in parallel to increase the parallel processing capabilities of electronic device 600.
The electronic device 600 typically includes a number of computer storage media. Such media may be any available media that are accessible by the electronic device 600, including, but not limited to, volatile and non-volatile media, and removable and non-removable media. The memory 620 may be volatile memory (e.g., registers, cache, Random Access Memory (RAM)), non-volatile memory (e.g., Read-Only Memory (ROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), flash memory), or some combination thereof. The storage device 630 may be a removable or non-removable medium and may include machine-readable media such as flash drives, magnetic disks, or any other media that are capable of storing information and/or data and that may be accessed within the electronic device 600.
The electronic device 600 may further include additional removable/non-removable, volatile/nonvolatile storage media. Although not shown in fig. 6, a magnetic disk drive for reading from or writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk may be provided. In these cases, each drive may be connected to a bus (not shown) by one or more data medium interfaces. Memory 620 may include a computer program product 625 having one or more program modules configured to perform the various methods or acts of the various embodiments of the disclosure.
The communication unit 640 enables communication with other electronic devices through a communication medium. Additionally, the functionality of the components of the electronic device 600 may be implemented in a single computing cluster or in multiple computing machines capable of communicating over a communication connection. Thus, the electronic device 600 may operate in a networked environment using logical connections to one or more other servers, a network Personal Computer (PC), or another network node.
The input device 650 may be one or more input devices such as a mouse, keyboard, trackball, etc. The output device 660 may be one or more output devices such as a display, speakers, printer, etc. The electronic device 600 may also communicate with one or more external devices (not shown), such as storage devices, display devices, etc., with one or more devices that enable a user to interact with the electronic device 600, or with any device (e.g., network card, modem, etc.) that enables the electronic device 600 to communicate with one or more other electronic devices, as desired, via the communication unit 640. Such communication may be performed via an input/output (I/O) interface (not shown).
According to an exemplary implementation of the present disclosure, there is provided a computer-readable storage medium having computer-executable instructions stored thereon, wherein the computer-executable instructions are executed by a processor to implement the method described above. According to an exemplary implementation of the present disclosure, there is also provided a computer program product tangibly stored on a non-transitory computer-readable medium and comprising computer-executable instructions that are executed by a processor to implement the method described above.
Various aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus, devices, and computer program products implemented according to the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer readable program instructions may be provided to a processing unit of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processing unit of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable medium having the instructions stored therein includes an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various implementations of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The foregoing description of implementations of the present disclosure has been provided for illustrative purposes, is not exhaustive, and is not limited to the implementations disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the various implementations described. The terminology used herein was chosen in order to best explain the principles of each implementation, the practical application, or the improvement of technology in the marketplace, or to enable others of ordinary skill in the art to understand each implementation disclosed herein.

Claims (12)

1. A question-answering method, comprising:
determining, based on at least one named entity detected from a target question, at least one category to which the at least one named entity corresponds, the category of each named entity being a category defined for a structured query statement of a question-answer database;
determining at least one query entity respectively matched with the at least one named entity from the question-answer database based on the respective semantics of the at least one named entity and at least one category corresponding to the at least one named entity;
determining a target structured query statement associated with the target question based on the at least one query entity and the at least one category; and
determining a target answer to the target question by querying the question-answer database based on the target structured query statement.
2. The method of claim 1, wherein determining at least one query entity from the question-answer database that matches the at least one named entity comprises:
obtaining an entity lookup table from the question-answer database, wherein the entity lookup table comprises at least one column, the column name of each column is a category in at least one category, and the column value of each column is a query entity under the corresponding category;
determining at least one target column in the entity lookup table based on at least one category corresponding to the at least one named entity; and
querying the query entities included in the at least one target column based on the respective semantics of the at least one named entity to determine at least one query entity that matches the at least one named entity.
3. The method of claim 1, wherein the target answer comprises a multimodal answer, and determining a target answer to the target question comprises:
querying the question-answer database based on the target structured query statement to obtain a query result for the target question;
processing the query result using a first model to determine descriptive text for describing the query result;
processing the query results using a second model to determine a description table and/or a description image for describing the query results; and
generating the target answer based on the descriptive text, the description table, and/or the description image.
4. The method of claim 3, wherein the first model is a language model and the second model is a chart generation model.
5. The method of claim 3, wherein, if the target question is a question about a numerical value, the query result includes at least numerical value information in the question-answer database for the target question.
6. The method of claim 1, further comprising:
if it is determined that the question-answer database does not include a query entity that matches a first named entity of the at least one named entity, determining the first named entity as a first entity;
assigning a preset identifier to the first entity, the identifier being used to identify the first entity in the target question; and
storing the identified first entity in the question-answer database.
7. The method of claim 6, further comprising:
determining a first query entity from the question-answer database, the text similarity between the first query entity and the first entity being higher than a first threshold;
generating an approximate question for the target question based on the first query entity and other named entities of the at least one named entity than the first named entity;
determining a first answer to the approximate question; and
providing the approximate question and the first answer.
8. The method of claim 6, wherein the question-answer database includes at least one historical question, the method further comprising:
determining, based on the target question, a target historical question from the at least one historical question, the text similarity between the target historical question and the target question being higher than a second threshold;
determining a second answer to the target historical question; and
providing the target historical question and the second answer.
9. The method of claim 1, wherein the question-answer database is a tax question-answer database, and the query entity comprises an entity related to tax question answering.
10. An apparatus for question answering, comprising:
a category determination module configured to determine, based on at least one named entity detected from a target question, at least one category to which the at least one named entity corresponds, the category of each named entity being a category defined for a structured query statement of a question-answer database;
an entity determination module configured to determine, from the question-answer database, at least one query entity respectively matched with the at least one named entity, based on the respective semantics of the at least one named entity and the at least one category to which the at least one named entity corresponds;
a statement determination module configured to determine a target structured query statement associated with the target question based on the at least one query entity and the at least one category; and
an answer determination module configured to determine a target answer to the target question by querying the question-answer database based on the target structured query statement.
11. An electronic device, comprising:
at least one processing unit; and
at least one memory coupled to the at least one processing unit and storing instructions for execution by the at least one processing unit, which when executed by the at least one processing unit, cause the electronic device to perform the method of any one of claims 1 to 9.
12. A computer readable storage medium having stored thereon a computer program executable by a processor to implement the method of any of claims 1 to 9.
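
The flow recited in claims 1 and 2 can be illustrated with a minimal Python sketch. It is not the applicant's implementation, and everything named in it is an assumption made for illustration: the table `tax_rates`, the categories `tax_type` and `region`, the helper names, the placeholder rate values (which are not real tax rates), and the use of `difflib.SequenceMatcher` string similarity as a simple stand-in for the semantic matching the claims describe. Named entity detection is assumed to happen upstream and to yield (named entity, category) pairs.

```python
import sqlite3
from difflib import SequenceMatcher

# Hypothetical entity lookup table (claim 2): each key is a category (column name),
# each value lists the query entities stored under that category.
ENTITY_LOOKUP = {
    "tax_type": ["value-added tax", "corporate income tax"],
    "region": ["Beijing", "Shanghai"],
}


def match_query_entity(named_entity, category, threshold=0.6):
    """Return the closest query entity in the category's column, or None.

    String similarity is an assumed stand-in for semantic matching.
    """
    best, best_score = None, 0.0
    for candidate in ENTITY_LOOKUP.get(category, []):
        score = SequenceMatcher(None, named_entity.lower(), candidate.lower()).ratio()
        if score > best_score:
            best, best_score = candidate, score
    return best if best_score >= threshold else None


def build_query(matched):
    """Build a parameterized SELECT from {category: query_entity} pairs."""
    for category in matched:
        # Column names must come from the lookup table, never from user text.
        if category not in ENTITY_LOOKUP:
            raise ValueError(f"unknown category: {category}")
    where = " AND ".join(f"{category} = ?" for category in matched)
    return f"SELECT rate FROM tax_rates WHERE {where}", list(matched.values())


def answer(conn, detected):
    """Answer a question given (named_entity, category) pairs from an NER step."""
    matched = {}
    for named_entity, category in detected:
        entity = match_query_entity(named_entity, category)
        if entity is None:
            return None  # no matching query entity; claims 6 to 8 cover this case
        matched[category] = entity
    sql, params = build_query(matched)
    return conn.execute(sql, params).fetchall()


def make_demo_db():
    """Create an in-memory database with placeholder rows (not real tax rates)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tax_rates (tax_type TEXT, region TEXT, rate REAL)")
    conn.executemany(
        "INSERT INTO tax_rates VALUES (?, ?, ?)",
        [("value-added tax", "Beijing", 1.0), ("corporate income tax", "Beijing", 2.0)],
    )
    return conn
```

In this sketch the category selects which lookup column is searched, so a named entity is only compared against query entities of its own category, and the entity values are passed as bound parameters rather than interpolated into the statement.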
CN202311767471.0A 2023-12-20 2023-12-20 Method, apparatus, device and readable medium for question and answer Pending CN117743546A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311767471.0A CN117743546A (en) 2023-12-20 2023-12-20 Method, apparatus, device and readable medium for question and answer

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311767471.0A CN117743546A (en) 2023-12-20 2023-12-20 Method, apparatus, device and readable medium for question and answer

Publications (1)

Publication Number Publication Date
CN117743546A true CN117743546A (en) 2024-03-22

Family

ID=90250546

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311767471.0A Pending CN117743546A (en) 2023-12-20 2023-12-20 Method, apparatus, device and readable medium for question and answer

Country Status (1)

Country Link
CN (1) CN117743546A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination