CN112860865A - Method, device, equipment and storage medium for realizing intelligent question answering - Google Patents



Publication number
CN112860865A
Authority
CN
China
Prior art keywords
information
question
answer
type
question information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110185116.7A
Other languages
Chinese (zh)
Inventor
孙伟伟
昝云飞
纪传俊
陈运文
纪达麒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Datagrand Tech Inc
Original Assignee
Datagrand Tech Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Datagrand Tech Inc
Priority to CN202110185116.7A
Publication of CN112860865A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/253 Grammatical analysis; Style critique
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Animal Behavior & Ethology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An embodiment of the invention discloses a method, an apparatus, a device, and a storage medium for implementing intelligent question answering. The method comprises: acquiring question information of a user, and classifying the question information through a semantic classification model to determine the classification type of the question information; if the question information is of a fact type, querying the question information in a knowledge graph database to obtain answer information matching the question information; if the question information is of an analysis type or an opinion type, retrieving the question information through a search engine to obtain at least one recall result, and comparing the similarity between the question information and the at least one recall result to obtain answer information matching the question information; and sending the answer information of the question information to the user. In this way, user questions are analyzed more accurately and the accuracy of the recalled answers is improved.

Description

Method, device, equipment and storage medium for realizing intelligent question answering
Technical Field
The embodiment of the invention relates to the technical field of computers, in particular to a method, a device, equipment and a storage medium for realizing intelligent question answering.
Background
With the rapid development of internet technology, the demand for obtaining information quickly and accurately keeps growing. Because an intelligent question-answering system can provide users with more accurate and concise answers, it has become a widely studied research direction in artificial intelligence and natural language processing.
At present, intelligent question answering is mainly implemented through knowledge base search: preset questions and their corresponding answer information are stored in the knowledge base of a question-answering system. When a user asks a question, the system matches it against the preset questions and, if the match succeeds, feeds back the answer information corresponding to the matched preset question. When the user's question falls outside the coverage of the knowledge base, however, the system cannot provide an accurate answer, and its efficiency is low.
Disclosure of Invention
The embodiment of the invention provides a method, a device, equipment and a storage medium for realizing intelligent question answering, which are used for realizing accurate classification of user question information and improving the accuracy of answer recall.
In a first aspect, an embodiment of the present invention provides a method for implementing intelligent question answering, including:
acquiring question information of a user, and classifying the question information through a semantic classification model to determine the classification type of the question information; wherein the classification type comprises a fact type, an analysis type and/or an opinion type;
if the question information is of a fact type, querying the question information in a knowledge graph database to obtain answer information matching the question information;
if the question information is of an analysis type or an opinion type, retrieving the question information through a search engine to obtain at least one recall result, and comparing the similarity between the question information and the at least one recall result to obtain answer information matching the question information;
and sending the answer information of the question information to the user.
In a second aspect, an embodiment of the present invention provides an apparatus for implementing intelligent question answering, including:
a classification type determining module, configured to acquire question information of a user and classify the question information through a semantic classification model to determine the classification type of the question information; wherein the classification type comprises a fact type, an analysis type and/or an opinion type;
a first answer information acquisition module, configured to query the question information in a knowledge graph database if the question information is of a fact type, so as to obtain answer information matching the question information;
a second answer information acquisition module, configured to retrieve the question information through a search engine to obtain at least one recall result if the question information is of an analysis type or an opinion type, and to compare the similarity between the question information and the at least one recall result to obtain answer information matching the question information;
and an answer information sending module, configured to send the answer information of the question information to the user.
In a third aspect, an embodiment of the present invention further provides an electronic device, where the electronic device includes:
one or more processors;
storage means for storing one or more programs;
when the one or more programs are executed by the one or more processors, the one or more processors implement the method for implementing intelligent question answering according to any embodiment of the present invention.
In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the method for implementing intelligent question answering according to any embodiment of the present invention.
According to the technical solution provided by the embodiment of the invention, question information of a user is acquired and classified through a semantic classification model to determine its classification type; for question information of different classification types, different answer retrieval strategies are adopted to obtain answer information matching the question information; finally, the answer information is sent to the user. User questions are thereby classified accurately, and the accuracy of the recalled answers is improved.
Drawings
Fig. 1A is a flowchart of a method for implementing intelligent question answering according to a first embodiment of the present invention;
Fig. 1B is a word segmentation probability graph of an N-gram model according to the first embodiment of the present invention;
Fig. 2 is a flowchart of another method for implementing intelligent question answering according to a second embodiment of the present invention;
Fig. 3 is a block diagram of an apparatus for implementing intelligent question answering according to a third embodiment of the present invention;
Fig. 4 is a block diagram of an electronic device according to a fourth embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
Example one
Fig. 1A is a flowchart of a method for implementing intelligent question answering according to a first embodiment of the present invention. This embodiment is applicable to accurately classifying a user's question and recalling a corresponding accurate answer. The method may be performed by the apparatus for implementing intelligent question answering provided by the embodiments of the present invention; the apparatus may be implemented in software and/or hardware and integrated in an electronic device, typically a terminal device or a server. The method specifically includes the following steps:
S110, acquiring question information of a user, and classifying the question information through a semantic classification model to determine the classification type of the question information; wherein the classification type includes a fact type, an analysis type, and/or an opinion type.
The question information of the user includes question information raised in various forms, for example, questions raised by voice command, questions entered manually, and questions entered by tapping an alternative question on the screen. Acquiring the question information of the user includes obtaining the content of the user's question with a recognition technology matching the form in which the question was raised; for example, when the user raises a question by voice command, the text content corresponding to the voice command is obtained through speech recognition, and the question information of the user is thereby acquired. In the embodiment of the invention, after the question information of the user is acquired, the semantic classification model extracts information features of the question information and determines the type of the question information through information feature matching; the classification type comprises a fact type, an analysis type, and an opinion type.
Specifically, a fact-type question is a question with a single, definite answer: the questioner uses an interrogative tone to request an accurate answer, and the answer is the accurate answer given by the respondent. For example, the question is "What are the three primary colors in optics?" and the answer is "Red, green, and blue".
An analysis-type question is a question without a definite answer: the questioner usually uses an interrogative tone to request an analysis, the question does not include a subjective opinion of the questioner, and the corresponding answer includes the respondent's personal subjective analysis. For example, the question is "How effective is washing fruit with baking soda?" and the answer is "Most pesticides on the market are weakly acidic, whereas baking soda is weakly alkaline, so it can neutralize the pesticides to produce non-toxic, water-soluble salts and thereby remove pesticide residue; washing fruit with baking soda is therefore quite effective."
An opinion-type question is a question without a definite answer in which the questioner states a viewpoint that he or she is not sure is correct and asks the respondent for a positive or negative response; the corresponding answer includes that positive or negative response. For example, the question is "Is diligence a prerequisite for success?" and the answer is "Yes".
Classifying the user's question information with the semantic classification model, and then performing the processing and retrieval appropriate to the question type, yields more accurate answers and improves the accuracy of intelligent question answering.
Optionally, in the embodiment of the present invention, the semantic classification model includes a word segmentation processing model, a word vector extraction model, and a classification processing model connected in sequence. Classifying the question information through the semantic classification model comprises: performing word segmentation on the question information through the word segmentation processing model to obtain a word-segmented sentence; extracting word vectors of the word-segmented sentence through the word vector extraction model, and obtaining a sentence vector of the word-segmented sentence from the word vectors; and classifying the question information according to the sentence vector through the classification processing model to determine the classification type of the question information.
The word segmentation processing model segments the user's question information so as to identify and separate all the words in it. Because Chinese characters are polysemous, that is, the same character can have completely different meanings in different words, directly classifying the Chinese character sequence of the question information with the classification processing model may lead to inaccurate classification. Performing word segmentation before classification yields the word-segmented sentence that best matches the intent of the user's question and improves classification accuracy. The word vector extraction model expresses the word-segmented sentence in vector form, which is better suited to computer processing: after the word-segmented sentence is obtained, it is converted into vectors to obtain its sentence vector, which speeds up classification by the classification processing model. The classification processing model classifies the user's question to determine its type; once the type is determined, the corresponding retrieval strategy can be executed, which improves the accuracy of answer recall.
Optionally, in this embodiment of the present invention, the word segmentation processing model includes an N-gram model, and/or the word vector extraction model includes a word2vec model, and/or the classification processing model includes a long short-term memory network model.
In the N-gram model, the occurrence of the N-th word is related only to the preceding (N-1) words, and the probability of the whole word-segmented sentence is the product of the occurrence probabilities of its segments. The N-gram model can therefore find the word-segmented sentence with the highest probability and segment the user's question information accurately. For example, a general-purpose segmenter may split the sentence "南京市长江大桥" (Nanjing Yangtze River Bridge) as "Nanjing City / Yangtze River / Bridge" or as "Nanjing / City Yangtze River / Bridge", which is a word segmentation ambiguity. The N-gram model builds the probability of the sentence from conditional probabilities; for this sentence, the probability can be expressed character by character as: p(南京市长江大桥) = p(南) p(京|南) p(市|南京) … p(桥|南京市长江大), where p(南) is the probability that the single character "南" occurs, p(京|南) is the probability that "京" occurs given that "南" has occurred, p(市|南京) is the probability that "市" occurs given that "南京" has occurred, and p(桥|南京市长江大) is the probability that the next character is "桥" given that "南京市长江大" has occurred. That is, the occurrence of each character depends on the characters that have already occurred in the sentence.
Fig. 1B shows the word segmentation probability graph of a bigram model (i.e., N = 2) for the sentence "南京市长江大桥". The probability of a sentence $s = w_1 w_2 \dots w_l$ can be approximated as:
$$p(s) = \prod_{i=1}^{l} p(w_i \mid w_1 \dots w_{i-1}) \approx \prod_{i=1}^{l} p(w_i \mid w_{i-1})$$
where $w_i$ is a character or word, $l$ is the length of the sentence in characters or words, $p(w_i \mid w_1 \dots w_{i-1})$ is the probability that $w_i$ occurs given that $w_1 \dots w_{i-1}$ have occurred, and $p(w_i \mid w_{i-1})$ is the probability that $w_i$ occurs given that $w_{i-1}$ has occurred; for $i = 1$ the factor is simply $p(w_1)$. In other words, each word depends on all preceding words, and this dependence is approximated as a dependence on the immediately preceding word only, which simplifies the computation.
The word2vec model can train word vectors for the word-segmented sentence and express it with high-dimensional vectors; superposing the word vectors of the words in the user's question information gives the vector of the question information, which speeds up processing by the classification processing model. The long short-term memory (LSTM) network model addresses the short-term memory problem that arises when long sequences are processed, so that important data is not lost; classifying the user's question information with an LSTM model avoids missing important information when the question is long, and thus classifies the question information more accurately.
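As an illustration of how a bigram model can resolve the segmentation ambiguity described above, the following minimal sketch picks the highest-probability segmentation by dynamic programming. The vocabulary, the probability values, the smoothing floor, and the function name `segment` are all assumptions made for this example; they do not come from the patent, and a real system would estimate the probabilities from a corpus.

```python
import math

# Hypothetical toy bigram probabilities p(word | previous word);
# "<s>" marks the start of the sentence. Values are made up for illustration.
BIGRAM = {
    ("<s>", "南京市"): 0.4,
    ("<s>", "南京"): 0.3,
    ("南京市", "长江"): 0.5,
    ("长江", "大桥"): 0.6,
    ("南京", "市长"): 0.2,
    ("市长", "江大桥"): 0.01,
}
VOCAB = {word for _, word in BIGRAM}
FLOOR = 1e-8  # assumed probability for unseen bigrams


def segment(sentence):
    """Return the segmentation that maximizes the product of bigram probabilities."""
    n = len(sentence)
    # best[i] maps the last word of a segmentation of sentence[:i]
    # to (log probability, (start position of that word, previous word)).
    best = [dict() for _ in range(n + 1)]
    best[0]["<s>"] = (0.0, None)
    for end in range(1, n + 1):
        for start in range(end):
            word = sentence[start:end]
            if word not in VOCAB:
                continue
            for prev, (score, _) in best[start].items():
                cand = score + math.log(BIGRAM.get((prev, word), FLOOR))
                if word not in best[end] or cand > best[end][word][0]:
                    best[end][word] = (cand, (start, prev))
    if not best[n]:
        return []  # no segmentation covers the whole sentence
    # Backtrack from the best final word to the sentence start.
    word = max(best[n], key=lambda w: best[n][w][0])
    pos, words = n, []
    while word != "<s>":
        words.append(word)
        _, (pos, word) = best[pos][word]
    return words[::-1]


print("/".join(segment("南京市长江大桥")))
```

With these toy values the path through "南京市", "长江", and "大桥" has probability 0.4 × 0.5 × 0.6, which outweighs the competing path, so it is the one selected.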
Optionally, in this embodiment of the present invention, before the question information is classified through the semantic classification model, the method may further include: acquiring a set of question-answer corpus pairs labeled with classification types, and training an initial semantic classification model on it to obtain a trained semantic classification model, wherein the set of question-answer corpus pairs is acquired based on user behavior. Specifically, corpus pairs may be collected from external data by a data acquisition module, for example by using web crawler technology to collect users' corpus information from web pages and screening it by comment count, like count, and/or favorite count to obtain higher-quality corpus pairs; internal question-answer corpus pairs may be obtained from users' behavior data in an internal search system, such as searches and clicks. The question type of a question-answer corpus pair may be labeled manually or automatically, for example by keywords in the question sentence (such as "yes", "how", "opposite"). The labeled question-answer corpus pairs are added to the set of question-answer corpus pairs, which is then used as the training sample for supervised classification training of the initial semantic classification model, so that the classification results of the model approach the labels as closely as possible and the trained semantic classification model is obtained. Training the initial semantic classification model on the pre-acquired set of question-answer corpus pairs yields a semantic classification model with higher classification accuracy, which improves the accuracy of the question-answering system.
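The automatic labeling by keywords mentioned above could look roughly like the sketch below. The English keyword lists, their assignment to the three types, the rule order, and the function name `auto_label` are illustrative assumptions; the patent does not specify which keyword maps to which type.

```python
import re

# Hypothetical keyword rules, checked in order; the first match wins.
RULES = [
    ("opinion", ("is it true", "right", "isn't it")),
    ("analysis", ("how", "why")),
    ("fact", ("what", "who", "when", "where")),
]


def auto_label(question):
    """Return a classification type for the question, or None if no rule matches."""
    text = question.lower()
    for label, keywords in RULES:
        for keyword in keywords:
            # Match whole words or phrases only, so "how" does not match "show".
            if re.search(r"\b" + re.escape(keyword) + r"\b", text):
                return label
    return None  # leave unmatched questions for manual labeling


print(auto_label("What are the three primary colors in optics?"))
print(auto_label("How effective is washing fruit with baking soda?"))
```

Questions that no rule covers return `None`, which leaves room for the manual labeling that the text also allows.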
S120, if the question information is of a fact type, querying the question information in a knowledge graph database to obtain answer information matching the question information.
A knowledge graph is a structured semantic knowledge base used to describe concepts in the physical world and the relationships among them. By effectively processing and integrating complex document data and converting it into simple, clear triples of the form "subject entity, entity relation, object entity", it improves the speed and accuracy of question retrieval. Answer information for fact-type questions can be obtained in several ways, for example by retrieving the question through a search engine or by querying a knowledge graph. However, a fact-type question has a unique and exact answer, and data in a knowledge graph is stored as triples: once the subject entity and the entity relation are determined, the corresponding object entity is also uniquely determined. Using the knowledge graph to obtain answers to fact-type questions therefore improves answer accuracy.
Optionally, in this embodiment of the present invention, before querying the question information in the knowledge graph database to obtain answer information matching the question information, the method may further include: acquiring a set of fact-type question-answer corpus pairs, and constructing the knowledge graph from it. Specifically, acquiring fact-type question-answer data includes obtaining it through a search engine, for example collecting fact-type question-answer information from web pages with web crawler technology and screening it by comment count, like count, and/or favorite count to obtain high-quality fact-type question-answer information. The acquired data is then represented as fact data in triple form, that is, through a subject entity, an entity relation, and an object entity, where the subject entity is the subject of the sentence, the object entity is the object word associated with the subject, and the entity relation reflects the specific association between the two in the sentence. For example, the fact-type question-answer data "the three primary colors in optics are red, green, and blue" is converted into the triple: subject entity "colors in optics", entity relation "three primary colors", object entity "red, green, and blue". The fact data in triple form is imported into a JanusGraph database for storage, which yields the constructed knowledge graph database. JanusGraph is a scalable graph database that supports real-time graph traversal and analytical queries as well as real-time concurrent access by a large number of users; it serves as the storage database for relational data in the embodiment of the invention. Constructing a fact-type knowledge graph database improves the accuracy of answers to fact-type question information.
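To make the triple representation concrete, here is a minimal in-memory sketch in which a (subject entity, entity relation) pair determines the object entity. It is only a stand-in for illustration: the patent stores triples in JanusGraph, whereas the `TripleStore` class, its method names, and the dictionary-based storage are assumptions of this example.

```python
from typing import Dict, Optional, Tuple


class TripleStore:
    """Toy store for (subject entity, entity relation, object entity) triples."""

    def __init__(self) -> None:
        self._objects: Dict[Tuple[str, str], str] = {}

    def add(self, subject: str, relation: str, obj: str) -> None:
        # A fact-type question has one exact answer, so each
        # (subject, relation) pair maps to a single object entity.
        self._objects[(subject, relation)] = obj

    def query(self, subject: str, relation: str) -> Optional[str]:
        # Returns None when the graph holds no matching fact.
        return self._objects.get((subject, relation))


kg = TripleStore()
kg.add("colors in optics", "three primary colors", "red, green, and blue")
print(kg.query("colors in optics", "three primary colors"))
```

A lookup that returns `None` corresponds to the case in which no matching answer is found in the knowledge graph.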
Optionally, in the embodiment of the present invention, after querying the question information in the knowledge graph database, the method may further include: if no answer information matching the question information is obtained, acquiring recommendation information in the domain to which the question information belongs and using it as the answer information. Specifically, if no matching answer is retrieved from the knowledge graph for fact-type user question information, the domain to which the current question information belongs is determined, and either information about that domain is retrieved through a search engine or other information in the same domain as the user's question is retrieved, so as to obtain recommendation information that is used as the answer. Domains may be classified according to international classification standards, for example treating fruit as one category when determining the domain of the user's question; finer second-level sub-domains may also be used as categories, for example apples and pears as sub-domains of fruit. Recalling related recommendation information as the answer when no matching answer exists avoids returning a blank answer to the user and improves the user experience.
S130, if the question information is of an analysis type or an opinion type, retrieving the question information through a search engine to obtain at least one recall result, and comparing the similarity between the question information and the at least one recall result to obtain answer information matching the question information.
Specifically, if the question information is determined to be of an analysis type or an opinion type, the word-segmented sentence corresponding to the question information is obtained and retrieved through a search engine to obtain at least one recall result. The similarity between each recall result and the question information is then calculated, and the recall result with the highest similarity is taken as the answer information. The similarity calculation uses the BM25 (Best Match 25) algorithm: the question information is parsed into morphemes, a relevance score between each morpheme and the recall result is calculated, and the per-morpheme scores are combined by weighted summation into a relevance score between the question information and each recall result; the higher the relevance score, the more similar that recall result is to the question information. Comparing the similarity between the question information and the recall results and returning the most similar answer to the user further improves the accuracy of the question-answering system.
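The BM25 scoring step can be sketched as follows. This is a generic Okapi BM25 implementation under stated assumptions, not the patent's own code: whitespace-separated tokens stand in for morphemes, the parameters `k1 = 1.5` and `b = 0.75` are commonly used defaults, and the smoothed IDF formula is one of several variants in use.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each tokenized recall result in `docs` against the query terms."""
    if not docs:
        return []
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of recall results containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            # The IDF weight plays the role of the per-morpheme weight.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency part, normalized by recall result length.
            norm = tf[term] * (k1 + 1) / (
                tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            )
            score += idf * norm
        scores.append(score)
    return scores


question = "baking soda fruit".split()
recalls = [
    "baking soda neutralizes pesticide residue on fruit".split(),
    "the bridge spans the river".split(),
]
scores = bm25_scores(question, recalls)
best = recalls[scores.index(max(scores))]
print(" ".join(best))
```

The recall result with the highest score is the one returned as the answer; a recall result sharing no term with the question scores zero.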
Optionally, in an embodiment of the present invention, after retrieving the question information through a search engine, the method may further include: if no recall result is obtained, acquiring recommendation information in the domain to which the question information belongs and using it as the answer information. Specifically, if no recall result can be retrieved for the word-segmented sentence of the current question information, the domain of the current question information is likewise determined, and either information about that domain is retrieved through a search engine or question information in the same domain as the user's question is retrieved, so as to obtain recommendation information as the answer. This avoids returning a blank answer to the user and improves the user experience.
And S140, sending the answer information of the question information to the user.
Specifically, sending the answer information of the question information to the user includes sending the retrieved answer information in the same form in which the user posed the question. For example, if the user asks the question as a voice instruction, the question answering system may, after obtaining the answer information, send it to the user as a voice broadcast, or play a prompt such as "The following content was found for you".
Optionally, in this embodiment of the present invention, after the answer information of the question information is sent to the user, the method may further include: updating the semantic classification model according to the current question information of the user and the corresponding answer information, and acquiring the updated semantic classification model. Specifically, after the answer information is sent to the user, the question information and the obtained answer information are stored and the user's behavior data are recorded, for example the time the user spends browsing the answer information or whether the user likes it. When the user likes the current answer information, or the browsing time of the current answer information exceeds a preset threshold, the user can be considered satisfied with the answer information provided. This verifies the accuracy of the current answer information, or identifies it as information the user is interested in, so that the type of information the user is interested in can be determined and used as a reference for providing recommendation information to the user. To update the semantic classification model, the current question information and the corresponding answer information are taken as new training samples and added to the initial question-answer pair set to form a new training set, and the semantic classification model is trained on this set to obtain the optimized semantic classification model. Iteratively optimizing the semantic classification model in this way, according to the current question information of the user and the corresponding answer information, can further improve its classification accuracy.
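The feedback loop described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patent's implementation: the names `Interaction`, `is_satisfied`, and `extend_training_set` are hypothetical, the 30-second value stands in for the unspecified "preset threshold", and filtering new samples by user approval is one possible reading of the text.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical value; the source only refers to "a preset threshold".
BROWSE_THRESHOLD_SECONDS = 30.0


@dataclass
class Interaction:
    """One recorded question-answer exchange plus the user's behavior data."""
    question: str
    answer: str
    liked: bool
    browse_seconds: float


def is_satisfied(item: Interaction, threshold: float = BROWSE_THRESHOLD_SECONDS) -> bool:
    """Treat a like, or a browsing time above the threshold, as approval."""
    return item.liked or item.browse_seconds > threshold


def extend_training_set(
    training_set: List[Tuple[str, str]], interactions: List[Interaction]
) -> List[Tuple[str, str]]:
    """Return a new training set with the approved question-answer pairs appended."""
    approved = [(i.question, i.answer) for i in interactions if is_satisfied(i)]
    return training_set + approved
```

The extended set would then be passed to whatever training routine the semantic classification model uses; that routine is outside the scope of this sketch.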
According to the technical solution provided by this embodiment of the present invention, the question information of a user is acquired and classified by a semantic classification model to determine its classification type; different answer retrieval strategies are then adopted for question information of different classification types to obtain answer information matched with the question information; finally, the answer information is sent to the user. This achieves accurate classification of user questions and improves the accuracy of the recalled answers.
Example two
Fig. 2 is a flowchart of a method for implementing intelligent question answering according to a second embodiment of the present invention. This embodiment builds on the foregoing embodiment: fact-type question information is queried in a knowledge graph database to obtain answer information matched with the question information. The method specifically includes:
S210, extracting entity words from the word segmentation sentences of the question information, and obtaining subject information through syntactic analysis.
Specifically, the entity words in the question information, including the subject entity, the entity relationship, and the objective entity, are extracted, and the subject information and the relation information among the entity words are determined through syntactic analysis. The syntactic analysis may use a Natural Language Processing (NLP) algorithm to analyze the entity words and take the subject entity component as the subject information. NLP comprises methods for effective communication between people and computers in natural language, and can perform Chinese word segmentation, part-of-speech tagging, syntactic analysis, and the like; using NLP allows the subject information of the question information to be obtained accurately, which improves the accuracy of question retrieval.
S220, acquiring, in the knowledge graph database, a target subject entity matched with the subject information.
Specifically, to find the target subject entity in the knowledge graph database, the obtained subject information may be compared one by one with the subject entities in the knowledge graph database; alternatively, a search may be performed in the knowledge graph database according to the obtained subject information to obtain the matched target subject entity.
S230, if a target subject entity matched with the subject information is obtained, returning at least one candidate entity relationship matched with the target subject entity, and obtaining, among the at least one candidate entity relationship, the target entity relationship matched with the entity relationship of the question information.
Specifically, when a target subject entity matched with the subject information is retrieved from the knowledge graph, all candidate entity relationships related to the target subject entity are extracted from the knowledge graph; there may be one or more of them. If there is one candidate entity relationship, it is compared directly with the relation information of the question information, and if the two match, it is determined to be the target entity relationship. If there are multiple candidate entity relationships, they are compared one by one with the relation information of the question information, and the matching candidate is taken as the target entity relationship.
S240, determining a target objective entity according to the target subject entity and the target entity relationship, and taking the target objective entity as the answer information.
Specifically, in a fact-type question, one question corresponds to a uniquely determined answer. Therefore, in the knowledge graph, once the subject entity and the entity relationship are determined, the objective entity is determined as well. A search is performed in the knowledge graph according to the determined target subject entity and target entity relationship to determine the target objective entity, namely the answer information corresponding to the user's question information. Using the knowledge graph to obtain answers to fact-type questions can improve the accuracy of answer recall for such questions.
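Steps S220 to S240 can be illustrated with a small sketch. It assumes the subject information and relation information have already been extracted in S210, stores the knowledge graph as a nested dictionary, and uses exact string matching; the names `TOY_GRAPH` and `answer_fact_question` and the sample triples are hypothetical and are not taken from the source.

```python
from typing import Dict, Optional

# Toy knowledge graph: subject entity -> {entity relationship -> objective entity}.
# The contents are invented for illustration only.
KnowledgeGraph = Dict[str, Dict[str, str]]

TOY_GRAPH: KnowledgeGraph = {
    "France": {"capital": "Paris", "currency": "euro"},
    "Japan": {"capital": "Tokyo"},
}


def answer_fact_question(
    graph: KnowledgeGraph, subject: str, relation: str
) -> Optional[str]:
    """Match the subject, then the relationship, then read off the objective entity."""
    # S220: look for a target subject entity matching the subject information.
    candidate_relations = graph.get(subject)
    if candidate_relations is None:
        return None  # no matching subject entity in the graph
    # S230: compare the candidate entity relationships one by one.
    for candidate, objective in candidate_relations.items():
        if candidate == relation:
            # S240: the objective entity is fixed by subject and relationship.
            return objective
    return None  # the subject exists, but no relationship matched
```

A call such as `answer_fact_question(TOY_GRAPH, "France", "capital")` walks all three steps; a `None` result corresponds to the case where no matching answer information is obtained. A production system would likely replace exact matching with alias or fuzzy matching.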
According to the technical solution provided by this embodiment of the present invention, when the user's question information is determined to be of the fact type, the subject information and the relation information of the question information are obtained through syntactic analysis, and the target objective entity is obtained as the answer information by searching and matching in the pre-established knowledge graph. This improves the accuracy of answer recall for fact-type questions and, in turn, the accuracy of intelligent question answering.
Example three
Fig. 3 is a block diagram of a structure of an apparatus for implementing an intelligent question answering according to a third embodiment of the present invention, where the apparatus specifically includes: a classification type determination module 310, a first answer information acquisition module 320, a second answer information acquisition module 330, and an answer information transmission module 340;
a classification type determining module 310, configured to obtain question information of a user, and perform classification processing on the question information through a semantic classification model to determine a classification type of the question information; wherein the classification type comprises a fact type, an analysis type and/or an opinion type;
a first answer information obtaining module 320, configured to query the question information in a knowledge graph database to obtain answer information matched with the question information if the question information is of a fact type;
a second answer information obtaining module 330, configured to, if the question information is of an analysis type or an opinion type, retrieve the question information through a search engine to obtain at least one recall result, and perform similarity comparison between the question information and the at least one recall result to obtain answer information matched with the question information;
the answer information sending module 340 is configured to send answer information of the question information to the user.
According to the technical solution provided by this embodiment of the present invention, the question information of a user is acquired and classified by a semantic classification model to determine its classification type; different answer retrieval strategies are then adopted for question information of different classification types to obtain answer information matched with the question information; finally, the answer information is sent to the user. This achieves accurate classification of user questions and improves the accuracy of the recalled answers.
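The way the modules cooperate can be sketched as a simple dispatcher. This is a hedged illustration only: `route_question` and its three callable parameters are hypothetical names standing in for the classification type determining module and the two answer information acquisition modules, and the string labels for the types are assumptions.

```python
from typing import Callable, Optional

# Assumed labels for the three classification types named in the text.
FACT, ANALYSIS, OPINION = "fact", "analysis", "opinion"


def route_question(
    question: str,
    classify: Callable[[str], str],
    query_graph: Callable[[str], Optional[str]],
    search_and_rank: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Pick the answer retrieval strategy from the classification type."""
    kind = classify(question)
    if kind == FACT:
        # Fact-type questions go to the knowledge graph database.
        return query_graph(question)
    if kind in (ANALYSIS, OPINION):
        # Analysis-type and opinion-type questions go through the search
        # engine followed by similarity comparison.
        return search_and_rank(question)
    raise ValueError(f"unknown classification type: {kind!r}")
```

Passing the three stages in as callables keeps the routing logic independent of any particular classifier, graph store, or search engine.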
Optionally, on the basis of the above technical scheme, the semantic classification model includes a word segmentation processing model, a word vector extraction model, and a classification processing model, which are connected in sequence.
Optionally, on the basis of the above technical solution, the classification type determining module 310 is specifically configured to perform word segmentation processing on the problem information through the word segmentation processing model to obtain a word segmentation sentence; extracting word vectors of the word segmentation sentences through the word vector extraction model, and obtaining sentence vectors of the word segmentation sentences according to the word vectors; and classifying the question information according to the sentence vector through the classification processing model so as to determine the classification type of the question information.
Optionally, on the basis of the above technical solution, the word segmentation processing model includes an N-gram model, and/or the word vector extraction model includes a word2vec model, and/or the classification processing model includes a long short-term memory (LSTM) network model.
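The three-stage pipeline (word segmentation, word vectors, classification) can be sketched with simple stand-ins. To be clear about the substitutions: whitespace splitting replaces the N-gram segmentation model, a hand-made table replaces trained word2vec vectors, averaging is one assumed way to derive a sentence vector from word vectors, and a nearest-prototype rule replaces the LSTM classifier. All names and values below are hypothetical.

```python
from typing import Dict, List

Vector = List[float]

# Hand-made 2-d "word vectors"; a real system would load trained word2vec vectors.
TOY_VECTORS: Dict[str, Vector] = {
    "who": [1.0, 0.0],
    "when": [1.0, 0.0],
    "why": [0.0, 1.0],
    "how": [0.0, 1.0],
}

# One prototype vector per classification type (stand-in for a trained classifier).
PROTOTYPES: Dict[str, Vector] = {"fact": [1.0, 0.0], "analysis": [0.0, 1.0]}


def segment(question: str) -> List[str]:
    """Stand-in for the word segmentation model: lowercase and split on whitespace."""
    return question.lower().replace("?", " ").split()


def sentence_vector(words: List[str]) -> Vector:
    """Average the vectors of known words; unknown words are skipped."""
    known = [TOY_VECTORS[w] for w in words if w in TOY_VECTORS]
    if not known:
        return [0.0, 0.0]
    return [sum(v[i] for v in known) / len(known) for i in range(2)]


def classify(question: str) -> str:
    """Return the type whose prototype has the largest dot product with the sentence vector."""
    vec = sentence_vector(segment(question))
    return max(
        PROTOTYPES,
        key=lambda t: sum(a * b for a, b in zip(PROTOTYPES[t], vec)),
    )
```

The point of the sketch is the data flow from text to tokens to a sentence vector to a label, not the quality of the toy classifier.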
Optionally, on the basis of the above technical solution, the apparatus for implementing intelligent question answering further includes:
the classification model training module is used for acquiring a question-answer corpus pair set labeled with classification types and training an initial semantic classification model to acquire a trained semantic classification model; wherein the question-answer corpus pair set is acquired based on user behaviors.
Optionally, on the basis of the above technical solution, the knowledge graph database includes at least one fact data information; the fact data information comprises a subject entity, an entity relation and an objective entity.
Optionally, on the basis of the foregoing technical solution, the first answer information obtaining module 320 includes:
the subject information acquisition unit is used for extracting entity words from the word segmentation sentences of the question information and acquiring subject information through syntactic analysis;
a target subject entity acquiring unit, configured to acquire, in the knowledge graph database, a target subject entity matched with the subject information;
a target entity relationship obtaining unit, configured to, if a target subject entity matching the subject information is obtained, return at least one candidate entity relationship matching the target subject entity, and obtain, among the at least one candidate entity relationship, a target entity relationship matching the entity relationship of the question information;
and the target objective entity obtaining unit is used for determining a target objective entity according to the target subject entity and the target entity relationship, and taking the target objective entity as answer information.
Optionally, on the basis of the above technical solution, the apparatus for implementing intelligent question answering further includes:
the first recommendation information acquisition module is used for acquiring recommendation information in the field of the question information and using the recommendation information as answer information if the answer information matched with the question information is not acquired;
and/or the second recommendation information acquisition module is used for acquiring recommendation information in the field to which the question information belongs and serving as answer information if the recall result is not acquired.
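The fallback behavior of the two recommendation modules can be illustrated as below. This is a minimal sketch under assumptions: the source does not say how the field of a question is determined or where recommendation information is stored, so the `RECOMMENDATIONS` table, its contents, and the function name `answer_or_recommend` are all hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical per-field recommendation lists; contents invented for illustration.
RECOMMENDATIONS: Dict[str, List[str]] = {
    "finance": ["How are interest rates set?"],
    "health": ["What counts as a balanced diet?"],
}


def answer_or_recommend(answer: Optional[str], field: str) -> List[str]:
    """Return the matched answer if there is one, else recommendations for the field."""
    if answer is not None:
        return [answer]
    # No answer (or no recall result) was obtained: fall back to
    # recommendation information in the field the question belongs to.
    return RECOMMENDATIONS.get(field, [])
```

The same function covers both cases in the text, since a failed knowledge graph query and an empty search recall can both be represented as `None`.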
Optionally, on the basis of the above technical solution, the apparatus for implementing intelligent question answering further includes:
and the classification model updating module is used for updating the semantic classification model according to the current question information of the user and the corresponding answer information and acquiring the updated semantic classification model.
The device can execute the intelligent question answering implementation method provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method. For technical details not described in detail in this embodiment, reference may be made to the method provided in any embodiment of the present invention.
Example four
Fig. 4 is a schematic structural diagram of an electronic device according to a fourth embodiment of the present invention. FIG. 4 illustrates a block diagram of an exemplary electronic device 12 suitable for use in implementing embodiments of the present invention. The electronic device 12 shown in fig. 4 is only an example and should not bring any limitation to the function and the scope of use of the embodiment of the present invention.
As shown in FIG. 4, electronic device 12 is embodied in the form of a general purpose computing device. The components of electronic device 12 may include, but are not limited to: one or more processors or processing units 16, a memory 28, and a bus 18 that couples various system components including the memory 28 and the processing unit 16.
Bus 18 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus.
Electronic device 12 typically includes a variety of computer system readable media. Such media may be any available media that is accessible by electronic device 12 and includes both volatile and nonvolatile media, removable and non-removable media.
The memory 28 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 30 and/or cache memory 32. The electronic device 12 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 34 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 4, and commonly referred to as a "hard drive"). Although not shown in FIG. 4, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to bus 18 by one or more data media interfaces. Memory 28 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the invention.
A program/utility 40 having a set (at least one) of program modules 42 may be stored, for example, in memory 28, such program modules 42 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each of which examples or some combination thereof may comprise an implementation of a network environment. Program modules 42 generally carry out the functions and/or methodologies of the described embodiments of the invention.
Electronic device 12 may also communicate with one or more external devices 14 (e.g., keyboard, pointing device, display 24, etc.), with one or more devices that enable a user to interact with electronic device 12, and/or with any devices (e.g., network card, modem, etc.) that enable electronic device 12 to communicate with one or more other computing devices. Such communication may be through an input/output (I/O) interface 22. Also, the electronic device 12 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the Internet) via the network adapter 20. As shown, the network adapter 20 communicates with other modules of the electronic device 12 via the bus 18. It should be understood that although not shown in the figures, other hardware and/or software modules may be used in conjunction with electronic device 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
The processing unit 16 executes various functional applications and data processing by running programs stored in the memory 28, for example, implementing the method for implementing intelligent question answering provided by any embodiment of the present invention. Namely: acquiring question information of a user, and classifying the question information through a semantic classification model to determine the classification type of the question information; wherein the classification type comprises a fact type, an analysis type and/or an opinion type; if the question information is of a fact type, querying the question information in a knowledge graph database to obtain answer information matched with the question information; if the question information is of an analysis type or an opinion type, retrieving the question information through a search engine to obtain at least one recall result, and comparing the similarity between the question information and the at least one recall result to obtain answer information matched with the question information; and sending answer information of the question information to the user.
Example five
The fifth embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements the method for implementing intelligent question answering according to any embodiment of the present invention; the method comprises the following steps:
acquiring question information of a user, and classifying the question information through a semantic classification model to determine the classification type of the question information; wherein the classification type comprises a fact type, an analysis type and/or an opinion type;
if the question information is of a fact type, querying the question information in a knowledge graph database to obtain answer information matched with the question information;
if the question information is of an analysis type or an opinion type, retrieving the question information through a search engine to obtain at least one recall result, and comparing the question information with the at least one recall result in a similarity manner to obtain answer information matched with the question information;
and sending answer information of the question information to the user.
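The similarity comparison between the question information and the recall results can be sketched as follows. The source does not specify the similarity measure, so cosine similarity over bag-of-words counts is an assumption chosen for brevity; `cosine_similarity` and `best_recall` are hypothetical names, and the recall list stands in for whatever the search engine returns.

```python
import math
from collections import Counter
from typing import List, Optional


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between the bag-of-words counts of two texts."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    # Counter returns 0 for missing words, so only shared words contribute.
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def best_recall(question: str, recalled: List[str]) -> Optional[str]:
    """Return the recall result most similar to the question, or None if nothing was recalled."""
    if not recalled:
        return None
    return max(recalled, key=lambda text: cosine_similarity(question, text))
```

In practice the comparison would more likely use sentence vectors from the same word vector model as the classifier; the ranking step itself stays the same.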
Computer storage media for embodiments of the invention may employ any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (10)

1. A method for implementing intelligent question answering, characterized by comprising the following steps:
acquiring question information of a user, and classifying the question information through a semantic classification model to determine the classification type of the question information; wherein the classification type comprises a fact type, an analysis type and/or an opinion type;
if the question information is of a fact type, querying the question information in a knowledge graph database to obtain answer information matched with the question information;
if the question information is of an analysis type or an opinion type, retrieving the question information through a search engine to obtain at least one recall result, and comparing the question information with the at least one recall result in a similarity manner to obtain answer information matched with the question information;
and sending answer information of the question information to the user.
2. The method according to claim 1, wherein the semantic classification model comprises a word segmentation processing model, a word vector extraction model and a classification processing model which are connected in sequence;
the classifying the question information through the semantic classification model comprises the following steps:
performing word segmentation processing on the problem information through the word segmentation processing model to obtain word segmentation sentences;
extracting word vectors of the word segmentation sentences through the word vector extraction model, and obtaining sentence vectors of the word segmentation sentences according to the word vectors;
and classifying the question information according to the sentence vector through the classification processing model so as to determine the classification type of the question information.
3. The method according to claim 2, wherein the word segmentation processing model comprises an N-gram model, and/or the word vector extraction model comprises a word2vec model, and/or the classification processing model comprises a long short-term memory (LSTM) network model.
4. The method of claim 1, further comprising, prior to classifying the problem information by a semantic classification model:
acquiring a question-answer corpus pair set labeled with classification types, and training an initial semantic classification model to acquire a trained semantic classification model; wherein the question-answer corpus pair set is acquired based on user behaviors.
5. The method of claim 2, wherein the knowledge graph database includes at least one piece of fact-type data information; the fact-type data information comprises a subject entity, an entity relationship and an objective entity;
the querying the question information in a knowledge graph database to obtain answer information matched with the question information includes:
extracting entity words from the word segmentation sentences of the question information, and obtaining subject information through syntactic analysis;
acquiring, in the knowledge graph database, a target subject entity matched with the subject information;
if a target subject entity matched with the subject information is obtained, returning at least one candidate entity relationship matched with the target subject entity, and obtaining, among the at least one candidate entity relationship, a target entity relationship matched with the entity relationship of the question information;
and determining a target objective entity according to the target subject entity and the target entity relationship, and taking the target objective entity as answer information.
6. The method of claim 5, wherein after querying the question information in a knowledge graph database, further comprising:
if answer information matched with the question information is not obtained, obtaining recommendation information in the field to which the question information belongs and using the recommendation information as answer information;
and/or after the question information is retrieved through a search engine, further comprising:
and if the recall result is not obtained, obtaining recommendation information in the field of the question information and using the recommendation information as answer information.
7. The method of claim 1, further comprising, after sending answer information of the question information to the user:
and updating the semantic classification model according to the current question information and the corresponding answer information of the user, and acquiring the updated semantic classification model.
8. A device for implementing intelligent question answering, characterized by comprising:
a classification type determining module, used for acquiring question information of a user and classifying the question information through a semantic classification model so as to determine the classification type of the question information; wherein the classification type comprises a fact type, an analysis type and/or an opinion type;
a first answer information acquisition module, used for querying the question information in a knowledge graph database if the question information is of a fact type, so as to acquire answer information matched with the question information;
the second answer information acquisition module is used for retrieving the question information through a search engine to acquire at least one recall result if the question information is of an analysis type or an opinion type, and comparing the question information with the at least one recall result in a similarity manner to acquire answer information matched with the question information;
and the answer information sending module is used for sending the answer information of the question information to the user.
9. An electronic device, characterized in that the electronic device comprises:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method for implementing intelligent question answering as recited in any one of claims 1-7.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, implements the method for implementing intelligent question answering according to any one of claims 1 to 7.
CN202110185116.7A 2021-02-10 2021-02-10 Method, device, equipment and storage medium for realizing intelligent question answering Pending CN112860865A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110185116.7A CN112860865A (en) 2021-02-10 2021-02-10 Method, device, equipment and storage medium for realizing intelligent question answering

Publications (1)

Publication Number Publication Date
CN112860865A true CN112860865A (en) 2021-05-28

Family

ID=75989612

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110185116.7A Pending CN112860865A (en) 2021-02-10 2021-02-10 Method, device, equipment and storage medium for realizing intelligent question answering

Country Status (1)

Country Link
CN (1) CN112860865A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113505206A (en) * 2021-07-01 2021-10-15 北京有竹居网络技术有限公司 Information processing method and device based on natural language reasoning and electronic equipment
CN113742469A (en) * 2021-09-03 2021-12-03 科讯嘉联信息技术有限公司 Pipeline processing and ES storage based question-answering system construction method
CN113886546A (en) * 2021-09-29 2022-01-04 平安银行股份有限公司 Knowledge question-answer matching processing method, device, medium and electronic equipment
CN114372215A (en) * 2022-01-12 2022-04-19 北京字节跳动网络技术有限公司 Search result display method, search request processing method and device
CN114610845A (en) * 2022-03-02 2022-06-10 北京百度网讯科技有限公司 Multisystem-based intelligent question answering method, device and equipment

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140280307A1 (en) * 2013-03-15 2014-09-18 Google Inc. Question answering to populate knowledge base
US9646250B1 (en) * 2015-11-17 2017-05-09 International Business Machines Corporation Computer-implemented cognitive system for assessing subjective question-answers
CN108959531A (en) * 2018-06-29 2018-12-07 北京百度网讯科技有限公司 Information search method, device, equipment and storage medium
CN109033229A (en) * 2018-06-29 2018-12-18 北京百度网讯科技有限公司 Question and answer treating method and apparatus
CN109284363A (en) * 2018-12-03 2019-01-29 北京羽扇智信息科技有限公司 A kind of answering method, device, electronic equipment and storage medium
CN110895561A (en) * 2019-11-13 2020-03-20 中国科学院自动化研究所 Medical question and answer retrieval method, system and device based on multi-mode knowledge perception
CN111737499A (en) * 2020-07-27 2020-10-02 平安国际智慧城市科技股份有限公司 Data searching method based on natural language processing and related equipment

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113505206A (en) * 2021-07-01 2021-10-15 北京有竹居网络技术有限公司 Information processing method and device based on natural language reasoning and electronic equipment
CN113742469A (en) * 2021-09-03 2021-12-03 科讯嘉联信息技术有限公司 Pipeline processing and ES storage based question-answering system construction method
CN113742469B (en) * 2021-09-03 2023-12-15 科讯嘉联信息技术有限公司 Method for constructing question-answering system based on Pipeline processing and ES storage
CN113886546A (en) * 2021-09-29 2022-01-04 平安银行股份有限公司 Knowledge question-answer matching processing method, device, medium and electronic equipment
CN114372215A (en) * 2022-01-12 2022-04-19 北京字节跳动网络技术有限公司 Search result display method, search request processing method and device
CN114610845A (en) * 2022-03-02 2022-06-10 北京百度网讯科技有限公司 Multisystem-based intelligent question answering method, device and equipment
CN114610845B (en) * 2022-03-02 2024-05-14 北京百度网讯科技有限公司 Intelligent question-answering method, device and equipment based on multiple systems

Similar Documents

Publication Publication Date Title
US11775760B2 (en) Man-machine conversation method, electronic device, and computer-readable medium
CN108829822B (en) Media content recommendation method and device, storage medium and electronic device
CN108829757B (en) Intelligent service method, server and storage medium for chat robot
CN112860865A (en) Method, device, equipment and storage medium for realizing intelligent question answering
CN109783631B (en) Community question-answer data verification method and device, computer equipment and storage medium
CN110363194A (en) Intelligently reading method, apparatus, equipment and storage medium based on NLP
WO2021218028A1 (en) Artificial intelligence-based interview content refining method, apparatus and device, and medium
CN109086265B (en) Semantic training method and multi-semantic word disambiguation method in short text
US20180204106A1 (en) System and method for personalized deep text analysis
CN113505586A (en) Seat-assisted question-answering method and system integrating semantic classification and knowledge graph
CN110096572B (en) Sample generation method, device and computer readable medium
CN113672708A (en) Language model training method, question and answer pair generation method, device and equipment
CN115599899B (en) Intelligent question-answering method, system, equipment and medium based on aircraft knowledge graph
CN112287090A (en) Financial question asking back method and system based on knowledge graph
CN110889024A (en) Method and device for calculating information-related stock
CN110717021A (en) Input text and related device for obtaining artificial intelligence interview
CN110969005B (en) Method and device for determining similarity between entity corpora
CN110852071A (en) Knowledge point detection method, device, equipment and readable storage medium
CN113918703A (en) Intelligent customer service question and answer method, device, server and storage medium
CN112330387A (en) Virtual broker applied to house-watching software
CN115878847B (en) Video guiding method, system, equipment and storage medium based on natural language
CN111708870A (en) Deep neural network-based question answering method and device and storage medium
CN113468311B (en) Knowledge graph-based complex question and answer method, device and storage medium
CN116401344A (en) Method and device for searching table according to question
CN114722830A (en) Intelligent customer service semantic recognition general model construction method and question-answering robot

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination