CN114064862A - Question answering method, device and equipment - Google Patents

Question answering method, device and equipment

Info

Publication number
CN114064862A
Authority
CN
China
Prior art keywords
query
information
question
data
data name
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010764789.3A
Other languages
Chinese (zh)
Inventor
石翔
宋建春
李维
孙健
李永彬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN202010764789.3A
Publication of CN114064862A
Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/247 Thesauruses; Synonyms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Human Computer Interaction (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a question answering method, a question answering device and question answering equipment. The method comprises the following steps: receiving query information of a user; extracting entity information from the query information; splicing the entity information to obtain a query condition clause and a query data name clause; generating a structured query statement according to the query condition clause and the query data name clause; and acquiring response content for the query information from a question-answer database according to the structured query statement. With this processing mode, a question-answering system for a target field is constructed automatically from the data table of that field, so that the system has a lightweight table question-answering capability; it can therefore effectively answer questions that a user poses in natural language and that depend on data query processing.

Description

Question answering method, device and equipment
Technical Field
The application relates to the technical field of natural language processing, in particular to an automatic question answering method and device and electronic equipment.
Background
A question-answering system accurately locates, in a question-and-answer manner, the knowledge a user is asking about, and provides personalized information services through interaction with the user. The system uses natural language processing technology to understand the user's question on the one hand and to generate a correct answer on the other.
A typical question-answering system is implemented on the basis of a knowledge base for the question-answering field. The system needs a pre-constructed knowledge base of that field; after a user question is received, a question understanding module identifies the question, an answer searching module searches the knowledge base for the corresponding answer, and the answer is returned to the user side for the user to view. Such a system usually confines the human-machine conversation to a fixed flow. For example, in an invoicing task conversation the time must be determined first, then the amount, and finally the unit and the details; the invoicing function can be completed only after the whole flow has been carried out step by step.
However, in the process of implementing the invention, the inventor found that the prior art has at least the following problem: an automatic question-answering system implemented on the basis of a knowledge base is only suitable for question-answering scenarios that handle fixed knowledge questions; it has no question-answering capability based on a basic data table and cannot answer questions that a user poses in natural language and that depend on data query processing. In summary, how to construct an automatic question-answering system with a table question-answering capability, so as to answer such questions, is a problem that urgently needs to be solved by those skilled in the art.
Disclosure of Invention
The application provides a question-answering method to solve the problem that the prior art lacks a table question-answering capability. The application additionally provides a question answering device and an electronic device.
The application provides a question answering method, which comprises the following steps:
receiving query information of a user;
extracting entity information from the query information;
splicing the entity information to obtain a query condition clause and a query data name clause;
generating a structured query statement according to the query condition clause and the query data name clause;
and acquiring response content of the query information in a question-answer database according to the structured query statement.
Optionally, the method further includes: receiving a data table of the target field, and constructing a question-answer database of the target field according to the data table.
Optionally, the query condition clauses are spliced by the following steps:
determining a data name, a data value and an operator;
if the attribute of the data value supports the operator and the text distance of the data value and the operator in the question information is smaller than a first distance threshold, and if the attribute of the data name supports the operator and the text distance of the data name and the operator in the question information is smaller than a second distance threshold, splicing the data name, the data value and the operator into a query condition, and taking the data name in the query condition as a first data name;
determining the logic relation among all the query conditions;
and splicing the plurality of query conditions into query condition clauses according to the logical relationship.
Optionally, the query data name clauses are spliced by the following steps:
determining a second data name and/or an aggregation function related to the query data name clause according to the first data name and the entity word;
if the data value attribute corresponding to the second data name supports the aggregation function and the text distance of the second data name and the aggregation function in the question information is smaller than a third distance threshold, generating an aggregated data name according to the second data name and the aggregation function;
and splicing the second data names that are not associated with an aggregation function, together with the aggregated data names, into a query data name clause.
Optionally, the method further includes:
determining the corresponding relation between the data name of each data in the target field and the synonym of the data name;
correspondingly, the extracting entity information from the query information includes:
and extracting related entity words from the query information according to the corresponding relation.
Optionally, the method further includes:
learning an entity word recognition model from a training data set through a machine learning algorithm;
the extracting of the related entity words from the query information includes:
and extracting related entity words through the recognition model.
Optionally, the method further includes:
receiving data name synonyms of each data in the target field;
and storing the corresponding relation between the data name of each data in the target field and the synonym of the data name.
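As a non-limiting illustration, the correspondence between data names and their synonyms may be pictured as a reverse lookup table that maps every surface form back to its data name. The following is a minimal sketch under stated assumptions, not the implementation of the application: the names SYNONYMS, build_lookup and extract_data_names are hypothetical, and the synonym entries are illustrative (only "near half year rise" for "near 6 month rise" comes from the worked example in the detailed description).

```python
# Hypothetical synonym configuration: data name -> list of synonyms.
SYNONYMS = {
    "near 6 month rise": ["near half year rise", "last half-year rise"],
    "fund code": ["code"],
}


def build_lookup(synonyms):
    """Map every surface form (data name or synonym) to its data name."""
    lookup = {}
    for name, alternatives in synonyms.items():
        lookup[name] = name
        for alt in alternatives:
            lookup[alt] = name
    return lookup


def extract_data_names(question, lookup):
    """Return the data names whose surface form occurs in the question."""
    text = question.lower()
    found = []
    # Longer surface forms are tried first so they win over shorter ones they contain.
    for surface in sorted(lookup, key=len, reverse=True):
        if surface in text and lookup[surface] not in found:
            found.append(lookup[surface])
    return found


lookup = build_lookup(SYNONYMS)
names = extract_data_names("Which funds have a near half year rise above 10%?", lookup)
```

Storing the mapping in reverse form means that a synonym configured by the data owner is resolved to its data name with a single dictionary access.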
The application also provides a question answering method, which comprises the following steps:
sending natural language query information of a user to a server;
displaying the response content of the query information returned by the server;
the server side determines the response content by adopting the following steps: extracting entity information from the query information; splicing the entity information to obtain a query condition clause and a query data name clause; generating a structured query statement according to the query condition clause and the query data name clause; and acquiring response content of the query information in a question-answer database according to the structured query statement.
The present application further provides a question answering device, including:
the query information receiving unit is used for receiving query information of a user;
an entity information extraction unit, configured to extract entity information from the query information;
the query sentence clause generating unit is used for splicing the entity information to obtain a query condition clause and a query data name clause;
the query statement generating unit is used for generating a structured query statement according to the query condition clause and the query data name clause;
and the query unit is used for acquiring the response content of the query information in a question-answer database according to the structured query statement.
The present application further provides an electronic device, comprising:
a processor; and
a memory for storing a program for implementing the question answering method, the device executing the following steps after being powered on and running the program of the method through the processor: receiving query information of a user; extracting entity information from the query information; splicing the entity information to obtain a query condition clause and a query data name clause; generating a structured query statement according to the query condition clause and the query data name clause; and acquiring response content of the query information in a question-answer database according to the structured query statement.
The present application also provides a computer-readable storage medium having stored therein instructions, which when run on a computer, cause the computer to perform the various methods described above.
The present application also provides a computer program product comprising instructions which, when run on a computer, cause the computer to perform the various methods described above.
Compared with the prior art, the method has the following advantages:
the question answering method provided by the embodiment of the application receives the query information of a user; extracting entity information from the query information; splicing the entity information to obtain a query condition clause and a query data name clause; generating a structured query statement according to the query condition clause and the query data name clause; obtaining response content of the query information in a question-answer database according to the structured query statement; by the processing mode, the question-answering system in the field is automatically constructed on the basis of the data table in the target field, so that the question-answering system has the light-weight type table question-answering capability; therefore, the question response requirement which is provided by the user in natural language and depends on data query processing can be effectively met.
Drawings
FIG. 1 is a flow chart of an embodiment of a question answering method provided by the present application;
FIG. 2 is a schematic view of an application scenario of an embodiment of a question answering method provided by the present application;
FIG. 3 is a schematic diagram of an interaction of a device according to an embodiment of a question answering method provided by the present application;
FIG. 4 is a system architecture diagram of an embodiment of a question answering method provided by the present application;
FIG. 5 is a diagram illustrating an SQL clause structure according to an embodiment of a question-answering method provided by the present application;
FIG. 6 is an SQL parsing framework diagram of an embodiment of a question-answering method provided by the present application.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application. However, the application can be implemented in many ways other than those described here, and those skilled in the art can make similar extensions without departing from the spirit of the application; the application is therefore not limited to the specific implementations disclosed below.
In the application, a question answering method and device and electronic equipment are provided. Each of the schemes is described in detail in the following examples.
First embodiment
Please refer to fig. 1, which is a flowchart of an embodiment of a question answering method according to the present application, wherein an execution subject of the method includes but is not limited to a server, and may be any device capable of implementing the method. In this embodiment, the method may include the steps of:
Step S101: receiving query information of a user.
In this embodiment, the question-answering system includes a server and a first client. The server may be deployed on a cloud server, or may be a server dedicated to implementing the automatic question answering system and deployed in a data center. The server may be a cluster server or a single server. The first client includes, but is not limited to, mobile communication devices such as mobile phones or smart phones, and also includes terminal devices such as personal computers, PADs and iPads.
Please refer to fig. 2, which is a schematic view of an application scenario of the automatic question answering system of the present application. The server and the first client can be connected through a network; for example, the first client can be networked through WIFI or the like. In this embodiment, the server constructs a question-answering system for the field, such as a financial field question-answering system, according to the data table of the target field. The questioning user interacts with the server through the first client, for example by asking "what is the fund code of Shanghai depth 300A?" or "which funds rose more than 10% in the last half year?". The server converts the user question into a database query statement (such as an SQL statement), queries the financial field data table for the financial data related to the question, generates reply information according to the query result, and returns it to the first client, which displays it for the questioning user to view. Examples of reply information are "the fund code of Shanghai depth 300A is 961" and "the funds that rose more than 10% in the last half year are Shanghai board A, medicine 100A and computer A".
Please refer to fig. 3, which is a schematic diagram of an apparatus interaction of an embodiment of the question answering system of the present application. In this embodiment, the server is configured to construct a question-answering system of the target domain according to the data table of the target domain; converting natural language question information aiming at the target field sent by a first client into a database query statement aiming at the data table through a database query statement analysis model included in the target field question-answering system; executing the query statement, generating reply information according to a query result, and returning the reply information to the first client; correspondingly, the first client is used for sending the question information to the server; and displaying the reply information returned by the server.
The target field can be a financial field, a catering field, an electric power field, an automobile field and the like. The target domain data may include various basic data of the domain. The target area data may be data that changes in real time or data that is relatively fixed for a period of time.
For example, in the financial field, a user may ask for information such as the code of a financial product (e.g., the fund code of Shanghai depth 300A), or the latest net value, accumulated net value and rise over the last half year of a fund product, all of which may be derived from the fund information data table. The fund name and the fund code are fixed data, whereas the latest net value, the accumulated net value, the rise over the last half year and similar data of a fund product can be variable data updated in real time.
Taking the catering field as an example, a user who wants to order coffee online from a shop of a coffee chain may want to query the price of a certain coffee (for example, "how much is a latte?"), the cup sizes of the coffee (such as small, medium and large), the ingredients of the coffee (such as milk), and the like; all of this information can be derived from the commodity information data table of the coffee shop.
The data table of the target domain may be a data table stored in a database. The database may be a relational database (e.g., MySQL, Oracle database, etc.), a non-relational database (e.g., BigTable, MongoDB database, etc.), or a key-value database (e.g., LevelDB). The data table of the target domain can also be a table in spreadsheet software, such as an Excel sheet.
In the present embodiment, the target domain data is stored in the database. In specific implementation, data of different fields are usually stored in different data tables because data structures of different fields are different. Data tables of a plurality of fields can be stored in the same database. Table 1 lists the data table structure of the financial field and its contents.
Table 1. Financial field data table
Table 2 lists the structure of the merchandise information data table of a certain coffee shop chain and its contents.
Table 2. Coffee shop commodity information data table
As can be seen from tables 1 and 2, the data tables in different fields are different in the number of columns, column names (data names), column attributes, column value attributes, and the like.
In one example, the method may further comprise the following step: receiving a data table of the target field, and constructing a question-answer database of the target field according to the data table. In this embodiment, the system may further include a second client. The second client is used for sending the target domain data to the server; the server is specifically configured to create the data table in the database according to the target domain data, and store the target domain data in the data table.
The second client is used by the data owner to submit data; for example, a financial company submits the financial field data it owns to the server through the second client. The second client includes, but is not limited to, mobile communication devices such as mobile phones or smart phones, and also includes terminal devices such as personal computers, PADs and iPads.
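As a non-limiting illustration, creating the data table in a database and storing the target domain data in it may be sketched as follows. The sketch assumes SQLite (through the Python standard library) as the relational database; the function name build_qa_database and the table name "fund" are hypothetical, and the rows are invented for illustration, except for the fund code 961 of Shanghai depth 300A, which is taken from the example above.

```python
import sqlite3


def build_qa_database(table_name, columns, rows):
    """Create an in-memory SQLite table and load the uploaded rows into it.

    columns maps a column name to an SQLite type. Identifiers are
    double-quoted so that column names containing spaces are accepted.
    """
    conn = sqlite3.connect(":memory:")
    col_defs = ", ".join(f'"{name}" {sql_type}' for name, sql_type in columns.items())
    conn.execute(f'CREATE TABLE "{table_name}" ({col_defs})')
    placeholders = ", ".join("?" for _ in columns)
    conn.executemany(f'INSERT INTO "{table_name}" VALUES ({placeholders})', rows)
    conn.commit()
    return conn


# Illustrative data only: the rise figures and "Fund B" are invented.
conn = build_qa_database(
    "fund",
    {"fund name": "TEXT", "fund code": "TEXT", "near 6 month rise": "REAL"},
    [("Shanghai depth 300A", "961", 12.5), ("Fund B", "002", 3.1)],
)
```

Once the table exists, any structured query statement generated from a user question can be executed against it.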
In another example, the server can collect data of multiple fields from the internet through web crawler technology, store the data in a database, and provide information consulting question-answering service of multiple fields to the user according to the data.
In this embodiment, after the server stores the target domain data in the database, a target domain question-answering system associated with the target domain data table is constructed. The server automatically analyzes and processes the target domain data table and constructs a robot (namely the target domain question-answering system) associated with the table. A user can then ask the robot questions about the table; for example, for the table information given in Table 1, the user can ask "what is the fund code of Shanghai depth 300A?" or "which funds rose more than 10% in the last half year?".
In one example, building a robot associated with a target domain data table requires the following data table information: column names (data names) and synonyms thereof, column values (data values), column value attributes and the like in the data table; the construction of the robot may comprise the following specific steps:
S1: a second client user uploads the table data to the server through the second client, and the column names, column values, column name synonyms, column value attributes and the like are configured on the server;
S2: the server stores the table data into a database and a cache;
S3: the server reads the data from the cache and builds the capability of converting natural language into database query statements (for example, NL2SQL) by filling the uploaded table data into the provided parsing framework (for example, an SQL parsing framework).
The above steps are completed automatically; once they are completed, the table-based question answering capability can be provided.
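As a non-limiting illustration, the table information configured in S1 and cached in S2 may be sketched as a small data structure plus an index from column values to the columns that contain them. This is a minimal sketch under stated assumptions: the names ColumnConfig and build_cache are hypothetical, the cache is a plain in-process dictionary, only text values are indexed, and the row is illustrative (the figure 12.5 is invented).

```python
from dataclasses import dataclass, field


@dataclass
class ColumnConfig:
    name: str                 # column name (data name)
    value_attribute: str      # column value attribute: "TEXT" or "NUMBER"
    synonyms: list = field(default_factory=list)


def build_cache(columns, rows):
    """Index the table: text column value -> set of column names containing it."""
    value_index = {}
    for row in rows:
        for column, value in zip(columns, row):
            if column.value_attribute == "TEXT":
                value_index.setdefault(value, set()).add(column.name)
    return value_index


columns = [
    ColumnConfig("fund name", "TEXT"),
    ColumnConfig("fund code", "TEXT"),
    ColumnConfig("near 6 month rise", "NUMBER", ["near half year rise"]),
]
rows = [("Shanghai depth 300A", "961", 12.5)]  # 12.5 is an invented figure
cache = build_cache(columns, rows)
```

An index of this kind lets the parsing framework decide quickly which column a recognized column value belongs to.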
As shown in fig. 4, when a user asks the robot a question, the dialog management module (Uniform DM: dialog manager) sends the user's query to a database query statement parsing model (for example, an NL2SQL module); the NL2SQL module then converts the user's query into the corresponding SQL statement; next, the SQL statement is used to search the database; finally, the database result is returned to the questioning user.
By applying the question-answering method provided by the embodiment of the application, a user only needs to express the data query intention in natural language, and the database query statement parsing model converts that intention into a database query statement. If the database is a relational database, the intention is converted into a structured query statement (SQL) through NL2SQL (Natural Language to SQL); if the database is a non-relational database, it is converted into a NoSQL query statement through NL2NoSQL. This greatly shortens the distance between the user and the target domain data. The natural-language-to-SQL task may be regarded as semantic parsing, that is, natural language is automatically converted into an SQL expression that a machine can understand and execute.
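As a non-limiting illustration, the flow of fig. 4 (forward the query to the parser, execute the resulting SQL, return the result) may be sketched as follows. The sketch assumes SQLite as the database; the function names answer and stub_nl2sql are hypothetical, and stub_nl2sql is only a stand-in that returns a fixed statement rather than a real NL2SQL module.

```python
import sqlite3


def answer(question, nl2sql, conn):
    """Forward the question to the parser, run the SQL, and return the rows."""
    sql, params = nl2sql(question)
    return conn.execute(sql, params).fetchall()


def stub_nl2sql(question):
    # Stand-in for the NL2SQL module: always returns the same parameterized query.
    return 'SELECT "fund code" FROM fund WHERE "fund name" = ?', ("Shanghai depth 300A",)


conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE fund ("fund name" TEXT, "fund code" TEXT)')
conn.execute("INSERT INTO fund VALUES (?, ?)", ("Shanghai depth 300A", "961"))
result = answer("what is the fund code of Shanghai depth 300A?", stub_nl2sql, conn)
```

Passing the column value as a query parameter instead of embedding it in the SQL text keeps user-supplied text out of the statement itself.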
In this embodiment, the dialog management module in fig. 4 is a core module of the dialog robot, and is responsible for unified scheduling of resources. In the table question-answering, the dialogue management module is responsible for forwarding user query to NL2SQL, and determines the action to be executed next according to the analyzed SQL statement returned by the NL2SQL and the current dialogue state.
In this embodiment, the server may further be configured to determine dialog state information through the dialog management module included in the question-answering system; and if the dialog state information comprises context information, determine the query statement according to the context information and the data value included in the question information.
The dialog management module records dialog state information, which may be "query", "context inheritance", "rejection", "regular dialog", and so on. If the current dialog state is "query" (the action of executing SQL), the database is called to execute the SQL statement, and the search result is packaged and returned to the first client user. If the current dialog state is "context inheritance", the question information needs to be combined with context information before it is parsed into an SQL statement. For example, if the previous question was "how much did fund A rise in the last 3 months", the context information includes "fund A"; if the current question is "what about the rise in the last half year", the complete question determined by combining the context information is "how much did fund A rise in the last half year". If the current dialog state is "rejection", the user's question is not associated with the table data and no reply is required. If the current dialog state is "regular dialog", the information replied to the first client user can be determined by a regular dialog module; for example, the user says "thank you" and the system replies "cheer again".
In this embodiment, the server constructs the parsing model according to the structure information of the data table. In step S3, the server reads the data from the cache, constructs the NL2SQL capability, and fills the uploaded table data into the provided SQL parsing framework, so as to form the SQL parsing framework of the target domain shown in fig. 6. After the NL2SQL capability of the target domain is constructed, user question information can be converted into SQL statements through the SQL parsing framework of the target domain.
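As a non-limiting illustration, choosing the next action from the dialog state may be sketched as a simple dispatch. This is a minimal sketch under stated assumptions: the function name handle_turn, the context key data_value and the two callables run_query and small_talk are hypothetical, and context inheritance is reduced to prefixing the remembered data value to the new question.

```python
def handle_turn(state, question, context, run_query, small_talk):
    """Choose the next action from the dialog state.

    run_query and small_talk are caller-supplied callables (hypothetical).
    """
    if state == "query":
        return run_query(question)
    if state == "context inheritance":
        # Carry over the data value remembered from the previous turn.
        return run_query(f"{context['data_value']} {question}")
    if state == "regular dialog":
        return small_talk(question)
    if state == "rejection":
        return None  # question unrelated to the table data: no reply
    raise ValueError(f"unknown dialog state: {state}")


completed = handle_turn(
    "context inheritance",
    "rise in the last half year",
    {"data_value": "fund A"},
    run_query=lambda q: q,          # echo the completed question for illustration
    small_talk=lambda q: "cheer again",
)
```

In the illustration, run_query simply echoes its input so that the completed question is visible; in practice it would parse and execute the question.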
The method first preprocesses the table and stores the related column name and column value information in a background database; the user then poses a query about the table, and the query is parsed and converted into the corresponding SQL. As shown in fig. 5, a basic SQL statement consists of two parts: a SELECT part and a WHERE part. The SELECT part is composed of an aggregation function (AGG) and a column name. The WHERE part is composed of conditions and connectors, where each condition includes a column name, an operator and a column value, and a connector can be either AND or OR.
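As a non-limiting illustration, the two-part structure of fig. 5 may be sketched as two small record types and a function that renders them as a parameterized SQL string. The names SelectItem, Condition and render are hypothetical, as is the table name "fund"; a single connector for the whole WHERE part is a simplifying assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SelectItem:
    column: str
    agg: Optional[str] = None   # aggregation function such as "MAX"; None means none


@dataclass
class Condition:
    column: str
    operator: str               # SQL operator such as "=", ">" or "<"
    value: object


def render(table, select, where, connector="AND"):
    """Render the SELECT part and the WHERE part as SQL text plus parameters."""
    cols = ", ".join(
        f'{item.agg}("{item.column}")' if item.agg else f'"{item.column}"'
        for item in select
    )
    sql = f'SELECT {cols} FROM "{table}"'
    if where:
        clause = f" {connector} ".join(f'"{c.column}" {c.operator} ?' for c in where)
        sql += f" WHERE {clause}"
    return sql, [c.value for c in where]


sql, params = render(
    "fund",
    [SelectItem("fund code")],
    [Condition("fund name", "=", "Shanghai depth 300A")],
)
```

Keeping the column values in a separate parameter list lets the database driver handle quoting of the values.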
As shown in fig. 6, in this embodiment, the process of parsing the user question information through the SQL parsing framework may include the following steps:
Step S103: extracting entity information from the query information.
In this step, all entity words in the user's query that may be related to SQL are identified, including column names, column values, aggregation functions, operators and connectors. For example, in the question "what is the fund code of Shanghai depth 300A?", the column name is "fund code" and the column value is "Shanghai depth 300A".
Step S105: splicing the entity information to obtain a query condition clause and a query data name clause.
In this step, the entity words are spliced into a query condition clause and a query data name clause according to the grammar rules of the query statement. In this embodiment, the entities obtained by entity recognition are spliced according to SQL syntax rules.
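As a non-limiting illustration, recognizing entity words with a dictionary may be sketched as a greedy longest-match scan over the question. This is a minimal sketch under stated assumptions, not the recognition method of the application (which may also use a learned model): the names recognize_entities and LEXICON are hypothetical, the lexicon entries are illustrative, and no word-boundary check is performed.

```python
def recognize_entities(question, lexicon):
    """Greedy longest-match scan; returns (start, end, entity type, text) tuples.

    lexicon maps a surface string to an entity type.
    """
    surfaces = sorted(lexicon, key=len, reverse=True)
    entities, pos = [], 0
    while pos < len(question):
        for surface in surfaces:
            if question.startswith(surface, pos):
                entities.append((pos, pos + len(surface), lexicon[surface], surface))
                pos += len(surface)
                break
        else:
            pos += 1  # no entity starts here: move one character forward
    return entities


# Illustrative lexicon covering the five entity types named in the text.
LEXICON = {
    "fund code": "column name",
    "Shanghai depth 300A": "column value",
    "greater than": "operator",
    "average": "aggregation function",
    "and": "connector",
}
question = "what is the fund code of Shanghai depth 300A?"
entities = recognize_entities(question, LEXICON)
```

Recording the start and end offsets of each entity makes it possible to compute text distances between entities in the later splicing steps.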
First, the process of forming the query condition clause, that is, splicing the WHERE part in this embodiment, is described. In specific implementation, the query condition clause can be spliced in the following way: 1) determining a data name, a data value and an operator; 2) if the attribute of the data value supports the operator and the text distance between the data value and the operator in the question information is smaller than a first distance threshold (for example, 0), and if the attribute of the data name supports the operator and the text distance between the data name and the operator in the question information is smaller than a second distance threshold (for example, 0), splicing the data name, the data value and the operator into a query condition, and taking the data name in the query condition as a first data name; 3) determining the logical relationship among all the query conditions; 4) splicing the query conditions into a query condition clause according to the logical relationship.
In this embodiment, for each possible column value, all candidate column names of that column value are determined first. This can be done on the basis of the distance between the column value and the column name in the question text and of the relationship between the column value and the column name in the data table: if such a relationship exists, the column name is retained; otherwise it is discarded. After the column value and the column name are determined, the attribute of the column value (text or number) and the distance between the column value and the operator in the question text are used to judge whether the column value and the column name are retained, and the WHERE part is finally determined.
The candidate columns are all the columns that entity recognition identifies in the user query, together with the columns of the table in which a recognized column value occurs exclusively. For example: if the user asks "the rise in the last half year is more than 10%", the candidate column is "near 6 month rise"; if the user asks about "Shanghai depth 300A rise in approximately 3 months", then since "Shanghai depth 300A" can appear only in "fund name", "fund name" is a candidate column.
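As a non-limiting illustration, one reading of the candidate column rule may be sketched as follows. This is an interpretation under stated assumptions: the function name candidate_columns and the value_index structure (column value to the set of columns containing it) are hypothetical; a value found in the table takes the columns that contain it, and any other value falls back to the column names recognized in the question.

```python
def candidate_columns(value, recognized_columns, value_index):
    """Return the candidate columns for one column value.

    recognized_columns: column names recognized in the question.
    value_index: column value -> set of columns of the table containing it.
    """
    owners = value_index.get(value)
    if owners:
        # The value occurs in the table: its own columns are the candidates.
        return sorted(owners)
    # Otherwise fall back to the columns named in the question.
    return list(recognized_columns)


value_index = {"Shanghai depth 300A": {"fund name"}}
for_text_value = candidate_columns("Shanghai depth 300A", ["near 3 month rise"], value_index)
for_number_value = candidate_columns("10%", ["near 6 month rise"], value_index)
```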
The process of splicing the WHERE part is illustrated by taking the question information "the rise in the last half year is more than 10%" as an example. After the synonym mapping "near half year rise" -> "near 6 month rise" has been added, the system identifies the entity words: the column name "near 6 month rise", the operator "greater than" and the column value "10%". These are then spliced into the WHERE part of the SQL as follows:
First, the column value "10%" is taken out and the operator of the column value is determined. The default operator is "equal to". Because entity recognition has identified the operator "greater than", it must be judged whether the "greater than" operator matches the column value "10%", which requires two checks:
First, the column value attribute of the column value "10%" is "NUMBER", which supports the "greater than" operator.
Second, the distance between the column value "10%" and the operator "greater than" is determined; since the two entities are adjacent in the original text, the distance is 0, so the operator of the column value "10%" can be determined to be "greater than".
If either of the two conditions is not met, the operator of the column value "10%" is the default operator "equal to".
Next, the column of the column value "10%" is determined. Entity recognition has identified the column "near 6 month rise", and the same two checks as above are applied to it. If they are satisfied, "near 6 month rise" and "10%" form a WHERE condition; if not, no WHERE condition can be formed. If there are multiple column values, the above method is applied to each of them. The WHERE conditions are finally generated.
Finally, the connection relationship among all the conditions is determined, that is, whether they are joined by AND or OR, by using the entity recognition result; the default is usually AND.
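The two checks and the default operator described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of the present application: the `Entity` structure, the token positions, the `SUPPORTED_OPERATORS` table, the normalization of "10%" to `0.10`, and all function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed mapping from column value attribute to the operators it supports.
SUPPORTED_OPERATORS = {"NUMBER": {"=", ">", "<"}, "TEXT": {"="}}


@dataclass
class Entity:
    text: str   # normalized form: a column name, an SQL operator, or a literal
    start: int  # index of the first token in the question
    end: int    # index one past the last token


def gap(a: Entity, b: Entity) -> int:
    """Number of tokens between two entities; 0 means they are adjacent."""
    first, second = (a, b) if a.start <= b.start else (b, a)
    return max(0, second.start - first.end)


def pick_operator(anchor: Entity, attr: str, operators: List[Entity], max_gap: int = 0) -> str:
    """Return a recognized operator that passes both checks, else the default '='."""
    for op in operators:
        if op.text in SUPPORTED_OPERATORS[attr] and gap(anchor, op) <= max_gap:
            return op.text
    return "="


def build_condition(column: Entity, value: Entity, attr: str,
                    operators: List[Entity], max_gap: int = 0) -> Optional[str]:
    """Splice one WHERE condition, or return None if the column fails the checks."""
    op = pick_operator(value, attr, operators, max_gap)
    # The same two checks are applied to the column name.
    if op != "=" and pick_operator(column, attr, operators, max_gap) != op:
        return None
    literal = value.text if attr == "NUMBER" else "'" + value.text.replace("'", "''") + "'"
    return f'"{column.text}" {op} {literal}'


def build_where(conditions: List[str], connector: str = "AND") -> str:
    """Join the conditions; AND is the default connection relationship."""
    return f" {connector} ".join(conditions)


# Hypothetical token positions for "near 6 month rise" / "greater than" / "10%".
column = Entity("near 6 month rise", 0, 4)
operator = Entity(">", 4, 6)
value = Entity("0.10", 6, 7)
print(build_where([build_condition(column, value, "NUMBER", [operator])]))
```

Here the distance thresholds are both taken as 0 (adjacent entities), matching the example in the text; a real system might choose different thresholds for the column value and the column name.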
The forming process of the WHERE condition has now been described. The forming process of the query data name clause, which in this embodiment is the splicing of the SELECT part, is described next.
In this embodiment, the query data name clause is spliced by the following steps: 1) determining, according to the first data name and the entity words, a second data name related to the query data name clause, or a second data name and an aggregation function; 2) if the data value attribute corresponding to the second data name supports the aggregation function and the text distance between the second data name and the aggregation function in the question information is smaller than a third distance threshold (for example, the distance is 0), generating an aggregated data name according to the second data name and the aggregation function; 3) splicing the second data names that have no associated aggregation function and the aggregated data names into the query data name clause.
To determine the SELECT part, the column information and aggregation functions identified by entity recognition are used together with the result of the WHERE part. Column names that already appear in the WHERE part are removed, and then an aggregation function is matched to each remaining column name. The judgment conditions are the column attribute and the distance in the text; if no aggregation function is associated with a column name, the default is no aggregation function.
Taking Table 1 above as an example, suppose the user asks "what is the highest near 3 month rise of the fund name?". First, entity recognition determines that "fund name" and "near 3 month rise" are table column names and that "highest" is the aggregation function "MAX". It must then be judged which column name the aggregation function "MAX" is combined with. As shown in fig. 6, in this embodiment the default aggregation function of a column name is "Null". To generate the SELECT part, the aggregation function of each column name is determined by two conditions: first, whether the aggregation function matches the column value attribute (for example, if the column value attribute is NUMBER, the aggregation function may be MAX, MIN, COUNT and the like); second, the distance between the position of the aggregation function and the position of the column name in the question text.
For example, the column value attribute of "fund name" is "TEXT". Text has no maximum value, so the aggregation function "MAX" obviously cannot match, and the aggregation function of "fund name" is therefore "Null".
As another example, the column value attribute of "near 3 month rise" is "NUMBER", which matches the aggregation function "MAX". The position of "MAX" in the text is adjacent to the position of "near 3 month rise", so the distance is 0, and the aggregation function of "near 3 month rise" is therefore "MAX".
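The matching of aggregation functions to column names can be sketched as follows. This is an illustrative sketch only: the `AGG_SUPPORT` table (in particular the assumption that TEXT columns support no aggregation function), the token spans, and the function names are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Span = Tuple[int, int]  # (first token index, one past the last token index)

# Assumed mapping from column value attribute to supported aggregation functions.
AGG_SUPPORT = {"NUMBER": {"MAX", "MIN", "COUNT"}, "TEXT": set()}


def gap(a: Span, b: Span) -> int:
    """Number of tokens between two spans; 0 means they are adjacent."""
    first, second = (a, b) if a[0] <= b[0] else (b, a)
    return max(0, second[0] - first[1])


def select_items(
    columns: List[Tuple[str, Span]],
    aggregates: List[Tuple[str, Span]],
    column_attrs: Dict[str, str],
    where_columns: Iterable[str] = (),
    max_gap: int = 0,
) -> List[str]:
    """Return the SELECT items, attaching an aggregation function where it fits."""
    used_in_where = set(where_columns)
    items = []
    for name, span in columns:
        if name in used_in_where:
            continue  # columns already used in the WHERE part are removed
        chosen = None  # corresponds to the default "Null"
        for func, func_span in aggregates:
            if func in AGG_SUPPORT[column_attrs[name]] and gap(span, func_span) <= max_gap:
                chosen = func
                break
        items.append(f'{chosen}("{name}")' if chosen else f'"{name}"')
    return items


# Hypothetical spans for "fund name", "near 3 month rise" and "highest" (MAX).
columns = [("fund name", (0, 2)), ("near 3 month rise", (2, 6))]
aggregates = [("MAX", (6, 7))]
attrs = {"fund name": "TEXT", "near 3 month rise": "NUMBER"}
print(select_items(columns, aggregates, attrs))
```

In this sketch "fund name" keeps no aggregation function because its attribute does not support "MAX", while "near 3 month rise" is both numeric and adjacent to "MAX".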
Step S107: generating a structured query statement according to the query condition clause and the query data name clause.
In this step, a query statement is generated according to the query condition clause and the query data name clause. This embodiment generates an SQL query statement: the WHERE part and the SELECT part spliced from the entities are concatenated according to SQL syntax to produce the final SQL statement.
This concludes the description of the SQL parsing process.
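The concatenation of the SELECT part and the WHERE part into a final SQL statement can be sketched as follows. The function name `build_sql` and the table name `funds` are assumptions introduced for illustration; the present application does not name them.

```python
from typing import Sequence


def build_sql(
    select_items: Sequence[str],
    table: str,
    where_conditions: Sequence[str] = (),
    connector: str = "AND",
) -> str:
    """Concatenate the SELECT part and the optional WHERE part into one statement."""
    sql = f'SELECT {", ".join(select_items)} FROM "{table}"'
    if where_conditions:
        sql += " WHERE " + f" {connector} ".join(where_conditions)
    return sql


# "funds" is a hypothetical table name for the uploaded data table.
print(build_sql(['"fund name"'], "funds", ['"near 6 month rise" > 0.10']))
```

A question with no recognized condition simply yields a statement without a WHERE part.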
In one example, the server may further be configured to determine a correspondence between a data name of each data in the target domain and a synonym of the data name; and identifying entity words related to the query sentence in the question information according to the corresponding relation.
In this embodiment, the user uploads a table and configures a small number of synonyms to improve the generalization capability of the system. Taking the column "fund name" in Table 1 as an example, if only "fund name" is available, the constructed table question answering robot can recognize only "fund name" and cannot recognize "fund abbreviation", "fund product name", and the like. To improve the recognition capability of the system, synonym pairs such as "fund name" - "fund abbreviation" and "fund abbreviation" - "fund product name" need to be provided.
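One simple way to apply configured synonyms is to rewrite each synonym in the question to its column name before entity recognition. The sketch below assumes a plain dictionary configuration and string replacement; the `SYNONYMS` contents and the function names are hypothetical, and a real system may match entities differently.

```python
from typing import Dict, List

# Hypothetical synonym configuration: column name -> user-supplied synonyms.
SYNONYMS: Dict[str, List[str]] = {
    "fund name": ["fund abbreviation", "fund product name"],
    "near 6 month rise": ["near half year rise"],
}


def build_lookup(synonyms: Dict[str, List[str]]) -> Dict[str, str]:
    """Invert the configuration into a synonym -> column name mapping."""
    return {alt: column for column, alts in synonyms.items() for alt in alts}


def normalize(question: str, lookup: Dict[str, str]) -> str:
    """Rewrite every known synonym in the question to its column name."""
    # Longer synonyms are replaced first so a short one cannot split a longer one.
    for phrase in sorted(lookup, key=len, reverse=True):
        question = question.replace(phrase, lookup[phrase])
    return question


lookup = build_lookup(SYNONYMS)
print(normalize("near half year rise is more than 10%", lookup))
```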
In one example, an entity word recognition model is learned from a training data set through a machine learning algorithm; in this case, the entity words may be identified from the question information by the recognition model. The training data may include a correspondence between user question information and query statement parsing information. In this way, synonyms of data names in the user question text are determined automatically by the recognition model without manual configuration, which can further improve the generalization capability of the system.
Step S107: acquiring response content of the query information in a question-answer database according to the structured query statement.
After the structured query statement is generated, the response content of the query information can be obtained in a question-answer database.
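Executing the generated statement against a question-answer database can be sketched with Python's standard `sqlite3` module. The choice of SQLite, the table name `funds`, the sample rows, and the function names are assumptions made for illustration; the present application does not specify a database engine.

```python
import sqlite3
from typing import List, Tuple


def build_database(rows: List[Tuple[str, float]]) -> sqlite3.Connection:
    """Create an in-memory question-answer database from (fund name, rise) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE funds ("fund name" TEXT, "near 6 month rise" REAL)')
    conn.executemany("INSERT INTO funds VALUES (?, ?)", rows)
    return conn


def answer(conn: sqlite3.Connection, sql: str) -> List[tuple]:
    """Run the structured query statement and return the matching rows."""
    return conn.execute(sql).fetchall()


# Hypothetical sample data standing in for the uploaded data table.
conn = build_database([("Fund A", 0.12), ("Fund B", 0.05), ("Fund C", 0.30)])
sql = ('SELECT "fund name" FROM funds '
       'WHERE "near 6 month rise" > 0.10 ORDER BY "fund name"')
print(answer(conn, sql))
```

The `ORDER BY` clause is added here only to make the output order deterministic.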
It should be noted that the task dialog flow of the question answering system provided by the embodiment of the present application differs from the knowledge-base-based task dialog flow in the prior art. In a task-based conversation (a conversation with a specific task, such as handling a mobile package or issuing an invoice), people can communicate freely, but machines cannot yet do so. Some constraints are therefore needed: the scope of the human-machine conversation is confined to a fixed flow (for example, in an invoicing conversation, the time is determined first, then the amount, and finally the unit and the details, so that the invoicing function is completed step by step), and the machine then converses with the person according to the agreed flow. The same applies to the task flow of the table question answering system provided in the embodiment of the present application. As can be seen from fig. 4, it differs from the task flow generation method in the prior art in that the flow is generated automatically from the data table, without manually setting a task conversation flow. The flow can be regarded as a virtual step: it collects the WHERE and SELECT information in the conversation and then outputs the result.
As can be seen from the foregoing embodiments, the question answering method provided in the embodiments of the present application receives query information of a user; extracts entity information from the query information; splices the entity information to obtain a query condition clause and a query data name clause; generates a structured query statement according to the query condition clause and the query data name clause; and obtains response content of the query information in a question-answer database according to the structured query statement. With this processing mode, a question answering system for the target field is constructed automatically on the basis of the data table of the target field, so that the system has a lightweight table question answering capability. Therefore, questions that the user poses in natural language and whose answers depend on data query processing can be effectively answered.
Second embodiment
In the above embodiment, a question answering method is provided, and correspondingly, a question answering device is also provided in the present application. The apparatus corresponds to an embodiment of the method described above.
Parts of this embodiment that are the same as the first embodiment are not described again; please refer to the corresponding parts of the first embodiment. The question answering device provided by the present application includes:
the query information receiving unit is used for receiving query information of a user;
an entity information extraction unit, configured to extract entity information from the query information;
the query sentence clause generating unit is used for splicing the entity information to obtain a query condition clause and a query data name clause;
the query statement generating unit is used for generating a structured query statement according to the query condition clause and the query data name clause;
and the query unit is used for acquiring the response content of the query information in a question-answer database according to the structured query statement.
Third embodiment
The application also provides an electronic device. Since the apparatus embodiments are substantially similar to the method embodiments, they are described in a relatively simple manner, and reference may be made to some of the descriptions of the method embodiments for relevant points. The device embodiments described below are merely illustrative.
An electronic device of the present embodiment includes: a processor and a memory; a memory for storing a program for implementing the question answering method, the device executing the following steps after being powered on and running the program of the method through the processor: receiving query information of a user; extracting entity information from the query information; splicing the entity information to obtain a query condition clause and a query data name clause; generating a structured query statement according to the query condition clause and the query data name clause; and acquiring response content of the query information in a question-answer database according to the structured query statement.
Fourth embodiment
Corresponding to the above question-answering method, the present application also provides a question-answering method, where an execution subject of the method includes, but is not limited to, the first client, and may also be any device capable of implementing the method. Parts of this embodiment that are the same as the first embodiment are not described again, please refer to corresponding parts in the first embodiment.
In this embodiment, the question answering method includes the following steps:
step 1: sending natural language query information of a user to a server;
step 2: displaying the response content of the query information returned by the server;
the server side determines the response content by adopting the following steps: extracting entity information from the query information; splicing the entity information to obtain a query condition clause and a query data name clause; generating a structured query statement according to the query condition clause and the query data name clause; and acquiring response content of the query information in a question-answer database according to the structured query statement.
Fifth embodiment
In the above embodiment, a question answering method is provided, and correspondingly, a question answering device is also provided in the present application. The apparatus corresponds to an embodiment of the method described above.
Parts of this embodiment that are the same as the first embodiment are not described again; please refer to the corresponding parts of the first embodiment. The question answering device provided by the present application includes:
the query information sending unit is used for sending the natural language query information of the user to the server;
the response information display unit is used for displaying the response content of the query information returned by the server;
the server side determines the response content by adopting the following steps: extracting entity information from the query information; splicing the entity information to obtain a query condition clause and a query data name clause; generating a structured query statement according to the query condition clause and the query data name clause; and acquiring response content of the query information in a question-answer database according to the structured query statement.
Sixth embodiment
The application also provides an electronic device embodiment. Since the apparatus embodiments are substantially similar to the method embodiments, they are described in a relatively simple manner, and reference may be made to some of the descriptions of the method embodiments for relevant points. The device embodiments described below are merely illustrative.
An electronic device of the present embodiment includes: a processor and a memory; a memory for storing a program for implementing the question answering method, the device executing the following steps after being powered on and running the program of the method through the processor: sending natural language query information of a user to a server; displaying the response content of the query information returned by the server; the server side determines the response content by adopting the following steps: extracting entity information from the query information; splicing the entity information to obtain a query condition clause and a query data name clause; generating a structured query statement according to the query condition clause and the query data name clause; and acquiring response content of the query information in a question-answer database according to the structured query statement.
Although the present application has been described with reference to the preferred embodiments, these embodiments are not intended to limit the present application. Those skilled in the art can make variations and modifications without departing from the spirit and scope of the present application; therefore, the scope of protection of the present application should be determined by the claims that follow.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include forms of volatile memory in a computer readable medium, Random Access Memory (RAM) and/or non-volatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
1. Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer readable media does not include transitory computer readable media (transitory media), such as modulated data signals and carrier waves.
2. As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

Claims (10)

1. A question-answering method, comprising:
receiving query information of a user;
extracting entity information from the query information;
splicing the entity information to obtain a query condition clause and a query data name clause;
generating a structured query statement according to the query condition clause and the query data name clause;
and acquiring response content of the query information in a question-answer database according to the structured query statement.
2. The method of claim 1, further comprising: receiving a data table of the target field, and constructing a question-answer database of the target field according to the data table.
3. The method of claim 1, wherein the query condition clause is spliced by:
determining a data name, a data value and an operator;
if the attribute of the data value supports the operator and the text distance of the data value and the operator in the question information is smaller than a first distance threshold, and if the attribute of the data name supports the operator and the text distance of the data name and the operator in the question information is smaller than a second distance threshold, splicing the data name, the data value and the operator into a query condition, and taking the data name in the query condition as a first data name;
determining the logic relation among all the query conditions;
and splicing the plurality of query conditions into query condition clauses according to the logical relationship.
4. The method of claim 3, wherein the query data name clause is concatenated by:
determining a second data name and/or an aggregation function related to the query data name clause according to the first data name and the entity word;
if the data value attribute corresponding to the second data name supports the aggregation function and the text distance of the second data name and the aggregation function in the question information is smaller than a third distance threshold, generating an aggregated data name according to the second data name and the aggregation function;
and splicing the second data name and the aggregation data name of the unassociated aggregation function into a query data name clause.
5. The method of claim 3, further comprising:
determining the corresponding relation between the data name of each data in the target field and the synonym of the data name;
correspondingly, the extracting entity information from the query information includes:
and extracting related entity words from the query information according to the corresponding relation.
6. The method of claim 3, further comprising:
learning from the training data set to obtain an entity word recognition model through a machine learning algorithm;
the extracting of the related entity words from the query information includes:
and extracting related entity words through the recognition model.
7. The method of claim 2, further comprising:
receiving data name synonyms of each data in the target field;
and storing the corresponding relation between the data name of each data in the target field and the synonym of the data name.
8. A question-answering method, comprising:
sending natural language query information of a user to a server;
displaying the response content of the query information returned by the server;
the server side determines the response content by adopting the following steps: extracting entity information from the query information; splicing the entity information to obtain a query condition clause and a query data name clause; generating a structured query statement according to the query condition clause and the query data name clause; and acquiring response content of the query information in a question-answer database according to the structured query statement.
9. A question answering device, comprising:
the query information receiving unit is used for receiving query information of a user;
an entity information extraction unit, configured to extract entity information from the query information;
the query sentence clause generating unit is used for splicing the entity information to obtain a query condition clause and a query data name clause;
the query statement generating unit is used for generating a structured query statement according to the query condition clause and the query data name clause;
and the query unit is used for acquiring the response content of the query information in a question-answer database according to the structured query statement.
10. An electronic device, comprising:
a processor; and
a memory for storing a program for implementing the question answering method, the device executing the following steps after being powered on and running the program of the method through the processor: receiving query information of a user; extracting entity information from the query information; splicing the entity information to obtain a query condition clause and a query data name clause; generating a structured query statement according to the query condition clause and the query data name clause; and acquiring response content of the query information in a question-answer database according to the structured query statement.
CN202010764789.3A 2020-07-31 2020-07-31 Question answering method, device and equipment Pending CN114064862A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010764789.3A CN114064862A (en) 2020-07-31 2020-07-31 Question answering method, device and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010764789.3A CN114064862A (en) 2020-07-31 2020-07-31 Question answering method, device and equipment

Publications (1)

Publication Number Publication Date
CN114064862A true CN114064862A (en) 2022-02-18

Family

ID=80231392

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010764789.3A Pending CN114064862A (en) 2020-07-31 2020-07-31 Question answering method, device and equipment

Country Status (1)

Country Link
CN (1) CN114064862A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination