CN103577556A - Device and method for obtaining association degree of question and answer pair - Google Patents

Info

Publication number
CN103577556A
Authority
CN
China
Prior art keywords
answer
question
word
analyzed
awj
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201310495641.4A
Other languages
Chinese (zh)
Other versions
CN103577556B (en)
Inventor
孙林
陈培军
秦吉胜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Qihoo Technology Co Ltd
Original Assignee
Beijing Qihoo Technology Co Ltd
Qizhi Software Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Qihoo Technology Co Ltd and Qizhi Software Beijing Co Ltd
Priority to CN201310495641.4A
Publication of CN103577556A
Priority to PCT/CN2014/086838 (WO2015058604A1)
Application granted
Publication of CN103577556B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Abstract

The invention discloses a device and a method for obtaining the association degree of a question and answer pair. The method comprises the following steps: carrying out word extraction on question content and answer content of a question and answer pair to be analyzed to obtain at least one question word to be analyzed and at least one answer word to be analyzed; selecting at least one question and answer knowledge record from a question and answer knowledge library including a plurality of question and answer knowledge records according to the question words to be analyzed and the answer words to be analyzed, and calculating the association degree of the question and answer pair to be analyzed according to the selected question and answer knowledge records. According to the device and the method, the quality of the question and answer pair can be semantically evaluated, and the evaluation effect is good; in addition, the method is easy to implement and good in universality.

Description

Device and method for obtaining the association degree of a question and answer pair
Technical field
The present invention relates to the field of network data communication, and in particular to a device and a method for obtaining the association degree of a question and answer pair.
Background
A question-and-answer community is a network application built on user-generated content. In its basic form, users ask questions according to their own needs and other users provide the answers. This gives users a new channel for obtaining information on the network. However, because any user can create content freely, information quality in question-and-answer communities varies widely, and a large number of low-quality question and answer pairs have appeared. This not only makes it harder for users to find information but also lowers the quality of the community. Meanwhile, prior-art methods rely heavily on non-text features of a question and answer pair to evaluate its quality, which limits their generality.
Summary of the invention
In view of the above problems, the present invention is proposed to provide a device for obtaining the association degree of a question and answer pair, and a corresponding method, that overcome the above problems or at least partially solve them.
According to one aspect of the present invention, a device for obtaining the association degree of a question and answer pair is provided. The device comprises:
A question-and-answer knowledge base, adapted to store a plurality of question-and-answer knowledge records;
A word extraction unit, adapted to perform a word extraction operation on the question content and the answer content of a question and answer pair to be analyzed, obtaining at least one question word to be analyzed and at least one answer word to be analyzed;
An association degree calculation unit, adapted to select at least one question-and-answer knowledge record from the question-and-answer knowledge base according to the question words to be analyzed and the answer words to be analyzed, and to calculate the association degree of the question and answer pair to be analyzed according to the selected question-and-answer knowledge records.
Optionally, the device further comprises a knowledge base construction unit, adapted to extract a plurality of question and answer pairs in advance from web pages containing question and answer pairs, and to build, from the extracted pairs, the question-and-answer knowledge base comprising a plurality of question-and-answer knowledge records. The knowledge base construction unit is further adapted to capture, when extracting the pairs from the web pages, the categories corresponding to those pairs, and, when building the knowledge base, to build the question-and-answer knowledge records according to the pairs and their corresponding categories. Each question-and-answer knowledge record corresponds to one category and comprises one question word, one answer word, and the semantic relevancy between that question word and that answer word.
Optionally, the association degree calculation unit is adapted to: choose the question-and-answer knowledge records whose question word matches a question word to be analyzed and whose answer word matches an answer word to be analyzed; obtain, from the chosen records that correspond to the same category, the association degree of the question and answer pair to be analyzed for each category; and take the maximum of these per-category association degrees as the association degree of the question and answer pair to be analyzed.
Optionally, the association degree calculation unit is adapted to compute a weighted sum of the semantic relevancies of the chosen question-and-answer knowledge records that correspond to the same category, thereby obtaining the association degree of the question and answer pair to be analyzed for each category.
Optionally, the word extraction unit is adapted to perform word segmentation, stop-word removal, word merging, and entity word extraction on the question content and the answer content of the question and answer pair to be analyzed.
Optionally, the knowledge base construction unit is adapted to perform the following operations on each question and answer pair: perform a word extraction operation on the question content and the answer content of the pair to obtain a question word set and an answer word set; and combine each question word in the question word set with each answer word in the answer word set, under each category corresponding to the pair, to form an information record. The knowledge base construction unit is adapted to perform the following operations on each information record: calculate the probability that the answer word belongs to the category; calculate the specificity with which the answer word explains the question word within the category; calculate the strength with which the question word is explained by the answer word within the category; multiply the probability, the specificity, and the strength, the resulting product being the semantic relevancy of the answer word and the question word; and form, from the question word, the answer word, and their semantic relevancy, one question-and-answer knowledge record corresponding to the category.
Optionally, the knowledge base construction unit is adapted to calculate the probability that the answer word belongs to the category as follows:
$$P(C_k \mid AW_j) = \frac{P(AW_j \mid C_k)\, P(C_k)}{P(AW_j)};$$
The knowledge base construction unit is adapted to calculate the specificity with which each answer word explains the question word within the category as follows:
$$\mathrm{specific}(QW_i, AW_j \mid C = C_k) = P(QW_i \mid AW_j, C = C_k) = \left.\frac{\#(QW_i, AW_j)}{\#(AW_j)}\right|_{C = C_k};$$
The knowledge base construction unit is adapted to calculate the strength with which the question word is explained by each answer word within the category as follows:
$$\mathrm{interpret}(QW_i, AW_j \mid C = C_k) = P(AW_j \mid QW_i, C = C_k) = \left.\frac{\#(QW_i, AW_j)}{\sum_{j=1}^{x} \#(QW_i, AW_j)}\right|_{C = C_k};$$
The knowledge base construction unit is adapted to multiply the above probability, specificity, and strength as follows:
$$\mathrm{weight}(QW_i, AW_j \mid C = C_k) = P(C_k \mid AW_j) \cdot \mathrm{specific}(QW_i, AW_j \mid C = C_k) \cdot \mathrm{interpret}(QW_i, AW_j \mid C = C_k);$$
Here, $P(C_k)$ is the probability that category $C_k$ occurs; $P(AW_j)$ is the probability that the answer word is $AW_j$; $P(AW_j \mid C_k)$ is the probability that the answer word is $AW_j$ given category $C_k$;
$\#(QW_i, AW_j)$ is the number of times the question word is $QW_i$ and the answer word is $AW_j$;
$\#(AW_j)$ is the number of times the answer word is $AW_j$.
According to a further aspect of the present invention, a method for obtaining the association degree of a question and answer pair is provided. The method comprises the following steps:
Performing a word extraction operation on the question content and the answer content of a question and answer pair to be analyzed, obtaining at least one question word to be analyzed and at least one answer word to be analyzed;
Selecting, according to the question words to be analyzed and the answer words to be analyzed, at least one question-and-answer knowledge record from a question-and-answer knowledge base comprising a plurality of question-and-answer knowledge records, and calculating the association degree of the question and answer pair to be analyzed according to the selected question-and-answer knowledge records.
Optionally, the method further comprises: extracting a plurality of question and answer pairs in advance from web pages containing question and answer pairs, and building, from the extracted pairs, the question-and-answer knowledge base comprising a plurality of question-and-answer knowledge records; capturing, when extracting the pairs from the web pages, the categories corresponding to those pairs; and, when building the knowledge base, building the question-and-answer knowledge records according to the pairs and their corresponding categories. Each question-and-answer knowledge record corresponds to one category and comprises one question word, one answer word, and the semantic relevancy between that question word and that answer word.
Optionally, selecting at least one question-and-answer knowledge record from the knowledge base according to the question words to be analyzed and the answer words to be analyzed, and calculating the association degree of the question and answer pair to be analyzed according to the selected records, specifically comprises: choosing the question-and-answer knowledge records whose question word matches a question word to be analyzed and whose answer word matches an answer word to be analyzed; obtaining, from the chosen records that correspond to the same category, the association degree of the question and answer pair to be analyzed for each category; and taking the maximum of these per-category association degrees as the association degree of the question and answer pair to be analyzed.
Optionally, obtaining the association degree of the question and answer pair to be analyzed for each category from the chosen records that correspond to the same category specifically comprises: computing a weighted sum of the semantic relevancies of the chosen question-and-answer knowledge records that correspond to the same category, thereby obtaining the association degree of the question and answer pair to be analyzed for each category.
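The per-category weighted sum and the final maximum described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not say which weights enter the weighted sum, so the sketch defaults to a weight of 1.0 per record, and it returns 0.0 when no record was chosen. The function name `association_degree` and the record layout are hypothetical.

```python
from collections import defaultdict


def association_degree(selected_records, weights=None):
    """Combine chosen knowledge records into one association degree.

    selected_records: list of (question_word, answer_word, category, relevancy).
    weights: optional list of per-record weights (assumption: 1.0 each if omitted).
    Returns (overall_degree, per_category_degrees).
    """
    per_category = defaultdict(float)
    for index, (_, _, category, relevancy) in enumerate(selected_records):
        weight = 1.0 if weights is None else weights[index]
        # Weighted sum of the semantic relevancies within the same category.
        per_category[category] += weight * relevancy
    if not per_category:
        # Assumption: no matching record means no evidence of association.
        return 0.0, {}
    # The overall degree is the maximum over the per-category degrees.
    return max(per_category.values()), dict(per_category)


records = [
    ("capital", "Beijing", "geography", 0.4),
    ("China", "Beijing", "geography", 0.2),
    ("capital", "Beijing", "travel", 0.5),
]
degree, by_category = association_degree(records)
print(degree, by_category)
```

Here the "geography" records add up to roughly 0.6, which exceeds the single "travel" record at 0.5, so the geography total is taken as the association degree of the pair.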
Optionally, performing the word extraction operation on the question content and the answer content of the question and answer pair to be analyzed specifically comprises: performing word segmentation, stop-word removal, word merging, and entity word extraction on the question content and the answer content of the question and answer pair to be analyzed.
Optionally, building the question-and-answer knowledge base according to the question and answer pairs and their corresponding categories specifically comprises: for each question and answer pair, performing a word extraction operation on the question content and the answer content of the pair to obtain a question word set and an answer word set; combining each question word in the question word set with each answer word in the answer word set, under each category corresponding to the pair, to form an information record; and performing the following operations on each information record: calculating the probability that the answer word belongs to the category; calculating the specificity with which the answer word explains the question word within the category; calculating the strength with which the question word is explained by the answer word within the category; multiplying the probability, the specificity, and the strength, the resulting product being the semantic relevancy of the answer word and the question word; and forming, from the question word, the answer word, and their semantic relevancy, one question-and-answer knowledge record corresponding to the category.
Optionally, calculating the probability that the answer word belongs to the category specifically comprises:
$$P(C_k \mid AW_j) = \frac{P(AW_j \mid C_k)\, P(C_k)}{P(AW_j)};$$
Calculating the specificity with which each answer word explains the question word within the category specifically comprises:
$$\mathrm{specific}(QW_i, AW_j \mid C = C_k) = P(QW_i \mid AW_j, C = C_k) = \left.\frac{\#(QW_i, AW_j)}{\#(AW_j)}\right|_{C = C_k};$$
Calculating the strength with which the question word is explained by each answer word within the category specifically comprises:
$$\mathrm{interpret}(QW_i, AW_j \mid C = C_k) = P(AW_j \mid QW_i, C = C_k) = \left.\frac{\#(QW_i, AW_j)}{\sum_{j=1}^{x} \#(QW_i, AW_j)}\right|_{C = C_k};$$
Multiplying the above probability, specificity, and strength specifically comprises:
$$\mathrm{weight}(QW_i, AW_j \mid C = C_k) = P(C_k \mid AW_j) \cdot \mathrm{specific}(QW_i, AW_j \mid C = C_k) \cdot \mathrm{interpret}(QW_i, AW_j \mid C = C_k);$$
Here, $P(C_k)$ is the probability that category $C_k$ occurs; $P(AW_j)$ is the probability that the answer word is $AW_j$; $P(AW_j \mid C_k)$ is the probability that the answer word is $AW_j$ given category $C_k$;
$\#(QW_i, AW_j)$ is the number of times the question word is $QW_i$ and the answer word is $AW_j$;
$\#(AW_j)$ is the number of times the answer word is $AW_j$.
According to the technical solution of the present invention, a plurality of question and answer pairs are extracted from web pages containing question and answer pairs, and a question-and-answer knowledge base comprising a plurality of question-and-answer knowledge records is built from the extracted pairs. A word extraction operation is performed on the question content and the answer content of the question and answer pair to be analyzed, yielding at least one question word to be analyzed and at least one answer word to be analyzed. At least one question-and-answer knowledge record is then selected from the knowledge base according to these words, and the association degree of the pair to be analyzed is calculated according to the selected records. The quality of a question and answer pair can thus be evaluated at the semantic level, which solves the prior-art problem of poor evaluation results caused by evaluating pair quality only at the lexical level. The solution is also easy to implement and highly general.
Brief description of the drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art from the following detailed description of the preferred embodiments. The drawings serve only to illustrate the preferred embodiments and are not to be regarded as limiting the present invention. Throughout the drawings, the same reference symbols denote the same parts. In the drawings:
Fig. 1 shows a flow chart of a method for obtaining the association degree of a question and answer pair according to one embodiment of the present invention;
Fig. 2 shows a detailed flow chart of building the question-and-answer knowledge base;
Fig. 3 shows a schematic diagram of an interpretation model of the question-and-answer knowledge base obtained using the steps shown in Fig. 2;
Fig. 4 shows a detailed flow chart of step S200 in Fig. 1;
Fig. 5 shows a block diagram of a device for obtaining the association degree of a question and answer pair according to one embodiment of the present invention; and
Fig. 6 shows a block diagram of a device for obtaining the association degree of a question and answer pair according to another embodiment of the present invention.
Embodiment
The existing method of obtaining the association degree of a question and answer pair describes the question and the answer of the pair using text features and non-text features. Text features mainly comprise visual text features (for example punctuation density, average word length, and text entropy) and content features (for example the proportion of content words, interrogative density, features widely used in automatic Chinese error detection such as single-character density, and related-word coverage). Non-text features comprise the user's authority index, the answered state of the question, the answer response time, user interaction features, and so on. After features are extracted from the question and the answer respectively, a question quality prediction model and an answer quality prediction model are each learned on a training set, and the outputs of the two models are used to evaluate the quality of the pair. However, when the existing method evaluates answer quality, it uses only the related-word coverage feature to describe the semantic match between question and answer. This stays at the lexical level and does not truly consider the semantic match between question and answer, yet that semantic match is precisely the core of pair quality. For example, take the question "Where is the capital of China?", with answer 1 being "Beijing" and answer 2 being "The capital of China is Shanghai". After word segmentation and stop-word removal, the question becomes "China capital where", answer 1 becomes "Beijing", and answer 2 becomes "China capital Shanghai". In the prior art, the semantic matching degree can be defined as the number of words that occur in both the question and the answer, divided by the number of all distinct words in the question and the answer. The semantic matching degree of the question and answer 1 is 0/4 = 0, and that of the question and answer 2 is 2/4 = 0.5. The prior art would therefore conclude that answer 2 matches the question better, which is obviously wrong.
Exemplary embodiments of the present disclosure are described in more detail below with reference to the drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure can be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure can be understood more thoroughly and its scope can be fully conveyed to those skilled in the art.
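The word-overlap measure that the background discussion attributes to the prior art can be reproduced in a few lines, which makes its weakness easy to see. The sketch below assumes the measure is the number of shared words divided by the number of distinct words in the question and answer together, which is what the 0/4 and 2/4 figures imply. The function name `word_overlap` and the English token lists are illustrative stand-ins for the segmented Chinese text.

```python
def word_overlap(question_words, answer_words):
    """Shared words divided by all distinct words in question and answer."""
    question_set, answer_set = set(question_words), set(answer_words)
    union = question_set | answer_set
    if not union:
        return 0.0
    return len(question_set & answer_set) / len(union)


# Tokens after segmentation and stop-word removal (illustrative English stand-ins).
question = ["China", "capital", "where"]
answer_1 = ["Beijing"]
answer_2 = ["China", "capital", "Shanghai"]

print(word_overlap(question, answer_1))  # 0.0 for the correct answer
print(word_overlap(question, answer_2))  # 0.5 for the wrong answer
```

The purely lexical score ranks the wrong answer above the correct one, which is the shortcoming the knowledge-base approach of the embodiments is meant to address.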
Fig. 1 shows a flow chart of a method for obtaining the association degree of a question and answer pair according to one embodiment of the present invention. The method comprises the following steps S100 and S200:
S100: perform a word extraction operation on the question content and the answer content of the question and answer pair to be analyzed, obtaining at least one question word to be analyzed and at least one answer word to be analyzed.
In one embodiment of the present invention, the word extraction operation on the question content and the answer content of the pair to be analyzed specifically comprises word segmentation, stop-word removal, word merging (word join), and extraction of entity words (such as nouns and verbs). At least one question word to be analyzed is obtained from the question content of the pair, and at least one answer word to be analyzed is obtained from the answer content of the pair.
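As a rough sketch of the word extraction pipeline just described, the following code chains tokenization, stop-word removal, and word merging. It is a simplification under several assumptions: a regular expression stands in for a real Chinese word segmenter, the stop-word list and the merge lexicon are hypothetical toy data, and the final entity word filter (which would need a part-of-speech tagger) is left out.

```python
import re

# Hypothetical toy resources; a real system would use full lexicons.
STOP_WORDS = {"the", "of", "is", "a", "in"}
MERGE_LEXICON = {("New", "York"): "New York"}


def extract_words(text, stop_words=STOP_WORDS, merge_lexicon=MERGE_LEXICON):
    """Tokenize, drop stop words, then merge adjacent tokens found in the lexicon."""
    tokens = re.findall(r"\w+", text)  # stand-in for word segmentation
    tokens = [t for t in tokens if t.lower() not in stop_words]
    merged, i = [], 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if pair in merge_lexicon:
            merged.append(merge_lexicon[pair])  # word merging (word join)
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


print(extract_words("Where is the capital of China?"))
print(extract_words("The subway in New York"))
```

Running the same function on the question content and on the answer content yields the question words to be analyzed and the answer words to be analyzed, respectively.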
S200: according to the question words to be analyzed and the answer words to be analyzed, select at least one question-and-answer knowledge record from a question-and-answer knowledge base comprising a plurality of question-and-answer knowledge records, and calculate the association degree of the question and answer pair to be analyzed according to the selected records.
In step S200 of this embodiment, the question content and the answer content of the pair to be analyzed are analyzed at the semantic level using the question-and-answer knowledge base to obtain the association degree of the pair. The evaluation results are good and the step is easy to implement.
Further, the question-and-answer knowledge base comprising a plurality of question-and-answer knowledge records is built from a plurality of question and answer pairs extracted in advance from web pages containing question and answer pairs. In one embodiment of the present invention, the categories corresponding to the pairs are captured when the pairs are extracted from the web pages, and the question-and-answer knowledge records are built according to the pairs and their corresponding categories. Each question-and-answer knowledge record in the resulting knowledge base corresponds to one category and comprises one question word (QW), one answer word (AW), and the semantic relevancy between that question word and that answer word.
By building the knowledge base from a massive number of high-quality question and answer pairs extracted from web pages, the semantic relevancy between the question word and the answer word of each knowledge record can be learned from a massive amount of information. Moreover, because the knowledge base is built from information extracted from web pages, it applies to a wider range of content and the method is more general.
Fig. 2 shows a detailed flow chart of building the question-and-answer knowledge base. It specifically comprises the following steps S310, S320, and S330:
S310: extract a plurality of question and answer pairs in advance from web pages containing question and answer pairs, and capture the categories corresponding to those pairs.
In this embodiment, a web crawler can be used to capture data from web pages on the internet that contain high-quality question and answer pairs and to extract the pairs from that data, so as to ensure the quality of the extracted pairs. Such web pages include community question answering (cQA) sites, major professional forums, and the like. Post-position ("floor") recognition can be used to extract the pairs, treating the post of the thread starter as the question and the subsequent replies (first floor, second floor, and so on) as the answers. Because these web pages contain category information for each question and answer pair, the corresponding categories can be captured together with the pairs.
S320: for each question and answer pair, perform a word extraction operation on the question content and the answer content of the pair to obtain a question word set and an answer word set; combine each question word in the question word set with each answer word in the answer word set, under each category corresponding to the pair, to form an information record.
In one embodiment of the present invention, the word extraction operation performed on the question content and the answer content of each pair extracted in step S310 specifically comprises word segmentation, stop-word removal, word merging, and entity word extraction.
At least one question word is obtained from the question content of each pair and at least one answer word is obtained from the answer content of each pair. For a given pair this yields a category set <C_1, ..., C_k, ..., C_p>, a question word set <QW_1, ..., QW_i, ..., QW_m>, and an answer word set <AW_1, ..., AW_j, ..., AW_n>.
Combining each question word (QW_i) in the question word set with each answer word (AW_j) in the answer word set under each category (C_k) corresponding to the pair forms an information record, for example <QW_i, AW_j, C_k>. In total, m*n*p information records can be formed.
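The formation of the m*n*p information records amounts to a Cartesian product of the three sets, which can be sketched as follows. The function name `build_information_records` and the sample words are hypothetical and serve only as illustration.

```python
from itertools import product


def build_information_records(question_words, answer_words, categories):
    """Return every (question_word, answer_word, category) combination."""
    return list(product(question_words, answer_words, categories))


question_words = ["capital", "China"]               # m = 2
answer_words = ["Beijing", "city", "north"]         # n = 3
categories = ["geography", "travel"]                # p = 2

records = build_information_records(question_words, answer_words, categories)
print(len(records))   # m * n * p
print(records[0])
```

Each tuple corresponds to one information record of the form <QW_i, AW_j, C_k>.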
S330: for each information record, perform the following operations: calculate the probability that the answer word belongs to the category; calculate the specificity with which the answer word explains the question word within the category; calculate the strength with which the question word is explained by the answer word within the category; multiply the probability, the specificity, and the strength, the resulting product being the semantic relevancy of the answer word and the question word; and form, from the question word, the answer word, and their semantic relevancy, one question-and-answer knowledge record corresponding to the category, <QW_i, AW_j, weight(QW_i, AW_j)> or <QW_i, AW_j, C_k, weight(QW_i, AW_j)>. In this embodiment, step S330 can be carried out on the massive set of information records obtained after the word extraction operation of step S320 has been performed on the massive number of question and answer pairs captured from web pages. Semantic relevancies obtained from such a massive set of information records are more accurate.
Preferably, calculating the probability that the answer word belongs to the category specifically comprises:
$$P(C_k \mid AW_j) = \frac{P(AW_j \mid C_k)\, P(C_k)}{P(AW_j)};$$
Calculating the specificity with which each answer word explains the question word within the category specifically comprises:
$$\mathrm{specific}(QW_i, AW_j \mid C = C_k) = P(QW_i \mid AW_j, C = C_k) = \left.\frac{\#(QW_i, AW_j)}{\#(AW_j)}\right|_{C = C_k};$$
Calculating the strength with which the question word is explained by each answer word within the category specifically comprises:
$$\mathrm{interpret}(QW_i, AW_j \mid C = C_k) = P(AW_j \mid QW_i, C = C_k) = \left.\frac{\#(QW_i, AW_j)}{\sum_{j=1}^{x} \#(QW_i, AW_j)}\right|_{C = C_k};$$
Multiplying the probability, the specificity, and the strength specifically comprises:
$$\mathrm{weight}(QW_i, AW_j \mid C = C_k) = P(C_k \mid AW_j) \cdot \mathrm{specific}(QW_i, AW_j \mid C = C_k) \cdot \mathrm{interpret}(QW_i, AW_j \mid C = C_k);$$
where P(Ck) denotes the probability that category Ck occurs; P(AWj) denotes the probability that the answer word is AWj; and P(AWj | Ck) denotes the probability that the answer word is AWj given category Ck;
#(QWi, AWj) denotes the number of times the question word is QWi and the answer word is AWj;
#(AWj) denotes the number of times the answer word is AWj.
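The four formulas above can be sketched in code. The following is a minimal illustration rather than the patented implementation: it assumes that every probability is estimated by relative frequency over the information records, which the text does not state, and the function name `semantic_relevance` and the tuple layout `(question_word, answer_word, category)` are hypothetical.

```python
from collections import Counter


def semantic_relevance(records):
    """Compute weight(QWi, AWj | C=Ck) for every distinct information record.

    `records` is an iterable of (question_word, answer_word, category) tuples.
    Assumption: probabilities are relative frequencies over these records.
    """
    records = list(records)
    total = len(records)
    cat_count = Counter(c for _, _, c in records)          # occurrences of Ck
    aw_count = Counter(a for _, a, _ in records)           # occurrences of AWj overall
    aw_cat_count = Counter((a, c) for _, a, c in records)  # #(AWj) within C=Ck
    qw_cat_count = Counter((q, c) for q, _, c in records)  # sum_j #(QWi, AWj) within C=Ck
    pair_count = Counter(records)                          # #(QWi, AWj) within C=Ck

    weights = {}
    for (q, a, c), n_qa in pair_count.items():
        p_c = cat_count[c] / total
        p_a = aw_count[a] / total
        p_a_given_c = aw_cat_count[(a, c)] / cat_count[c]
        p_c_given_a = p_a_given_c * p_c / p_a              # Bayes' rule
        specific = n_qa / aw_cat_count[(a, c)]             # P(QWi | AWj, C=Ck)
        interpret = n_qa / qw_cat_count[(q, c)]            # P(AWj | QWi, C=Ck)
        weights[(q, a, c)] = p_c_given_a * specific * interpret
    return weights
```

Each key of the returned dictionary, together with its value, corresponds to one knowledge record of the form <QWi, AWj, Ck, weight(QWi, AWj)>.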
Through steps S310, S320, and S330, question-and-answer knowledge records are obtained and the question-and-answer knowledge base is built. Fig. 3 is a schematic diagram of an interpretation model of the knowledge base obtained using the steps shown in Fig. 2. As can be seen, for each question word QWi, n question-and-answer knowledge records can be obtained for each category in the category set <C1, ..., Ck, ..., Cp>. Of course, those skilled in the art will appreciate that a knowledge record whose computed semantic relevance is 0 can be deleted. Moreover, if the number of knowledge records is so large that storing them in the knowledge base and computing the degree of association of a question-and-answer pair to be analyzed become too expensive, a threshold can be preset, and the knowledge records whose semantic relevance is below the threshold can be deleted to reduce the overhead.
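The pruning described above can be sketched as follows. This is an illustrative sketch only: the function name `prune_records`, the record layout `(question_word, answer_word, category, weight)`, and the default threshold are assumptions, not values given in the text.

```python
def prune_records(records, threshold=0.0):
    """Drop knowledge records that carry no or too little semantic relevance.

    `records` is an iterable of (question_word, answer_word, category, weight).
    A record is kept only if its weight is non-zero and not below `threshold`.
    """
    return [rec for rec in records if rec[3] != 0 and rec[3] >= threshold]
```

With the default threshold only zero-weight records are removed; raising the threshold trades coverage for lower storage and lookup cost.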
Fig. 4 is a detailed flowchart of step S200 in Fig. 1. After at least one question word to be analyzed and at least one answer word to be analyzed have been obtained in step S100, step S200 comprises the following steps S210, S220, and S230:
S210: select the question-and-answer knowledge records whose question word matches a question word to be analyzed and whose answer word matches an answer word to be analyzed. In this embodiment, a question word matches a question word to be analyzed when the two are identical or the question word to be analyzed is a substring of the question word; likewise, an answer word matches an answer word to be analyzed when the two are identical or the answer word to be analyzed is a substring of the answer word. Step S210 thus uses field matching or field search to select from the knowledge base the knowledge records relevant to the question-and-answer pair to be analyzed.
S220: from the selected knowledge records that correspond to the same category, obtain the degree of association of the pair to be analyzed for each category. Specifically, the semantic relevance values of the selected knowledge records corresponding to the same category are weighted and summed, giving the degree of association of the pair to be analyzed for each category.
In this embodiment, the knowledge records selected in step S210 are grouped by category, with the records of one category forming one group. The semantic relevance values of the records in each group are weighted (for example, with a weight of 1 or 100) and summed to give the degree of association of the pair to be analyzed for that category. At least one degree of association is thus obtained (in this embodiment, the number of degrees of association equals the number of categories to which the pair to be analyzed corresponds).
S230: take the maximum of the degrees of association of the pair to be analyzed over all categories, and use this maximum as the degree of association of the pair to be analyzed.
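The matching rule of step S210 can be sketched as a substring test. This is a minimal sketch under assumed names: `matches`, `select_records`, and the record layout `(question_word, answer_word, category, weight)` are hypothetical, and a linear scan stands in for whatever field-matching or field-search mechanism an actual knowledge base would use.

```python
def matches(record_word, analyzed_word):
    """True if the analyzed word equals the record word or is a substring of it."""
    return bool(analyzed_word) and analyzed_word in record_word


def select_records(knowledge_base, question_words, answer_words):
    """Select records whose question word and answer word both match.

    `knowledge_base` is an iterable of
    (question_word, answer_word, category, weight) tuples.
    """
    return [
        rec
        for rec in knowledge_base
        if any(matches(rec[0], q) for q in question_words)
        and any(matches(rec[1], a) for a in answer_words)
    ]
```

Python's `in` operator on strings covers both cases of the rule, since a string is a substring of itself.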
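Steps S220 and S230 amount to a grouped weighted sum followed by a maximum. The sketch below assumes a single uniform weighting factor (the text gives 1 or 100 as examples) and returns 0.0 when no record was selected; both choices, along with the name `association_degree`, are assumptions made for illustration.

```python
from collections import defaultdict


def association_degree(selected_records, scale=1.0):
    """Return (overall degree, per-category degrees) for the selected records.

    `selected_records` is an iterable of
    (question_word, answer_word, category, weight) tuples from step S210.
    """
    per_category = defaultdict(float)
    for _question_word, _answer_word, category, weight in selected_records:
        per_category[category] += scale * weight  # S220: weighted sum per category
    if not per_category:
        return 0.0, {}                            # assumption: no match means 0
    return max(per_category.values()), dict(per_category)  # S230: maximum
```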
Fig. 5 is a block diagram of a device for obtaining the degree of association of a question-and-answer pair according to one embodiment of the invention. The device comprises a question-and-answer knowledge base 100, a word extraction unit 200, and an association degree calculation unit 300.
The question-and-answer knowledge base 100 is adapted to store a plurality of question-and-answer knowledge records. In this embodiment, the knowledge base 100 can be built from a massive number of question-and-answer pairs crawled from web pages.
The word extraction unit 200 is adapted to perform a word extraction operation on the question content and the answer content of the question-and-answer pair to be analyzed, obtaining at least one question word to be analyzed and at least one answer word to be analyzed.
In one embodiment of the invention, the word extraction unit 200 is adapted to perform word segmentation, stop-word removal, word merging (word join), and entity-word extraction (for example, of nouns and verbs) on the question content and the answer content of the pair to be analyzed, so as to obtain at least one question word to be analyzed and at least one answer word to be analyzed.
The association degree calculation unit 300 is adapted to select at least one knowledge record from the knowledge base according to the question words and answer words to be analyzed, and to calculate the degree of association of the pair to be analyzed from the selected knowledge records.
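A heavily simplified sketch of the word extraction operation follows. It covers only tokenization and stop-word removal for English text; real word segmentation of Chinese, word merging, and part-of-speech filtering for entity words require a segmentation or tagging library and are omitted here. The function name `extract_words` and the stop-word list are hypothetical.

```python
import re

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = frozenset(
    {"a", "an", "the", "is", "are", "has", "and", "or", "to", "of", "my", "what", "i"}
)


def extract_words(text, stop_words=STOP_WORDS):
    """Tokenize `text`, drop stop words, and return unique words in first-seen order."""
    # Lowercase alphanumeric tokens; internal hyphens are kept ("cough-relieving").
    tokens = re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())
    seen = set()
    words = []
    for token in tokens:
        if token in stop_words or token in seen:
            continue
        seen.add(token)
        words.append(token)
    return words
```

The same function would be applied separately to the question content and to the answer content to obtain the two word sets.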
In one embodiment of the invention, the association degree calculation unit 300 is adapted to select the question-and-answer knowledge records whose question word matches a question word to be analyzed and whose answer word matches an answer word to be analyzed. In this embodiment, a question word matches a question word to be analyzed when the two are identical or the question word to be analyzed is a substring of the question word, and an answer word matches an answer word to be analyzed when the two are identical or the answer word to be analyzed is a substring of the answer word. From the selected knowledge records that correspond to the same category, the unit obtains the degree of association of the pair to be analyzed for each category; more specifically, it weights (for example, with a weight of 1 or 100) and sums the semantic relevance values of the selected records of the same category. At least one degree of association is thus obtained (in this embodiment, the number of degrees of association equals the number of categories to which the pair to be analyzed corresponds). The unit then takes the maximum of these per-category degrees of association as the degree of association of the pair to be analyzed.
With the question-and-answer knowledge base 100, the word extraction unit 200, and the association degree calculation unit 300, at least one knowledge record is selected from the knowledge base using the question words and answer words to be analyzed, and the degree of association of the pair to be analyzed is calculated from the selected records. The pair can thus be analyzed at the semantic level, which gives better evaluation results and is easy to implement. Because the knowledge base is built from information extracted from web pages, the approach applies more widely and is more general.
Fig. 6 is a block diagram of a device for obtaining the degree of association of a question-and-answer pair according to another embodiment of the invention. In this embodiment, the device further comprises a question-and-answer knowledge base construction unit 400, which is adapted to extract a plurality of question-and-answer pairs in advance from web pages containing question-and-answer pairs and to build, from the extracted pairs, a knowledge base comprising a plurality of knowledge records. In the device shown in Fig. 5, the knowledge base already exists. Because the amount of information on real networks keeps growing and its content changes quickly, the content of the knowledge base needs frequent updating. By adding the construction unit 400 to build (or, equivalently, update) the knowledge base, this embodiment keeps the content of the knowledge base timely and reliable.
Preferably, when extracting a plurality of question-and-answer pairs from web pages containing such pairs, the construction unit 400 also crawls the categories corresponding to the pairs. In this embodiment, a web crawler can be used to crawl data from Internet pages that contain high-quality question-and-answer pairs and to extract the pairs, which ensures the quality of the extracted pairs; such pages include cQA communities, major professional forums, and the like. Because these pages include category information for each question-and-answer pair, the construction unit 400 can crawl the corresponding category together with each pair.
In this embodiment, the construction unit 400 is adapted to perform the following operations on each question-and-answer pair: perform a word extraction operation on the question content and the answer content of the pair to obtain a question word set and an answer word set (specifically, the construction unit 400 performs word segmentation, stop-word removal, word merging, and entity-word extraction on the question content and answer content of each extracted pair to obtain the question words and answer words); then, for each category to which the pair corresponds, form one information record from each question word in the question word set and each answer word in the answer word set. The construction unit 400 is further adapted to perform the following operations on each information record: calculate the probability that the answer word belongs to the category; calculate the specificity with which the answer word explains the question word within the category; and calculate the strength with which the question word is explained by the answer word within the category. The probability, the specificity, and the strength are multiplied, and the resulting product is the semantic relevance between the answer word and the question word. The question word, the answer word, and their semantic relevance then form one question-and-answer knowledge record corresponding to the category.
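The formation of information records described above is a Cartesian product of the question word set and the answer word set, repeated for each category of the pair. A minimal sketch, with the hypothetical name `build_information_records` and an assumed tuple layout `(question_word, answer_word, category)`:

```python
from itertools import product


def build_information_records(question_words, answer_words, categories):
    """Form one information record per (question word, answer word, category)."""
    return [
        (question_word, answer_word, category)
        for category in categories
        for question_word, answer_word in product(question_words, answer_words)
    ]
```

A pair with m question words, n answer words, and p categories therefore contributes m * n * p information records.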
More specifically, the construction unit 400 is adapted to calculate the probability that the answer word belongs to the category as follows:
$$P(C_k \mid AW_j) = \frac{P(AW_j \mid C_k)\, P(C_k)}{P(AW_j)};$$
More specifically, the construction unit 400 is adapted to calculate the specificity with which each answer word explains the question word within the category as follows:
$$\mathrm{specific}(QW_i, AW_j \mid C = C_k) = P(QW_i \mid AW_j, C = C_k) = \left.\frac{\#(QW_i, AW_j)}{\#(AW_j)}\right|_{C = C_k};$$
More specifically, the construction unit 400 is adapted to calculate the strength with which the question word is explained by each answer word within the category as follows:
$$\mathrm{interpret}(QW_i, AW_j \mid C = C_k) = P(AW_j \mid QW_i, C = C_k) = \left.\frac{\#(QW_i, AW_j)}{\sum_{j=1}^{x} \#(QW_i, AW_j)}\right|_{C = C_k};$$
More specifically, the construction unit 400 is adapted to multiply the probability, the specificity, and the strength as follows:
$$\mathrm{weight}(QW_i, AW_j \mid C = C_k) = P(C_k \mid AW_j) \cdot \mathrm{specific}(QW_i, AW_j \mid C = C_k) \cdot \mathrm{interpret}(QW_i, AW_j \mid C = C_k);$$
where P(Ck) denotes the probability that category Ck occurs; P(AWj) denotes the probability that the answer word is AWj; and P(AWj | Ck) denotes the probability that the answer word is AWj given category Ck;
#(QWi, AWj) denotes the number of times the question word is QWi and the answer word is AWj;
#(AWj) denotes the number of times the answer word is AWj.
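As a worked illustration of how the three factors combine, consider purely hypothetical counts that are not taken from the patent: within category Ck the pair (QWi, AWj) occurs once, AWj occurs twice, QWi occurs with answer words twice in total, and P(Ck | AWj) is assumed to be 2/3.

```latex
\begin{align*}
\mathrm{specific}(QW_i, AW_j \mid C = C_k) &= \frac{\#(QW_i, AW_j)}{\#(AW_j)} = \frac{1}{2}, \\
\mathrm{interpret}(QW_i, AW_j \mid C = C_k) &= \frac{\#(QW_i, AW_j)}{\sum_{j=1}^{x} \#(QW_i, AW_j)} = \frac{1}{2}, \\
\mathrm{weight}(QW_i, AW_j \mid C = C_k) &= P(C_k \mid AW_j) \cdot \frac{1}{2} \cdot \frac{1}{2}
  = \frac{2}{3} \cdot \frac{1}{4} = \frac{1}{6}.
\end{align*}
```

Since each factor is a probability, the product lies between 0 and 1 and is large only when the answer word is typical of the category, strongly tied to the question word, and a common explanation for it.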
The effect achieved by embodiments of the invention is illustrated below by an example. Suppose there is the following question-and-answer pair, whose category is "medical treatment & health":
Figure BDA0000399250180000134
Figure BDA0000399250180000141
After word segmentation, the question words to be analyzed and the answer words to be analyzed are as follows:
Figure BDA0000399250180000142
The segmentation result shows that the question and the answer share no related words. With the prior art, this pair would therefore readily be judged to have a low degree of association and to be of low quality. A human judge, however, can readily see that it is a high-quality question-and-answer pair.
When the method and device of the invention are used to process this pair, the first step is to load an existing question-and-answer knowledge base, or to build one by crawling question-and-answer pairs from cQA communities and major professional forums;
In the second step, the word extraction operation is applied to the pair to be analyzed, yielding the set of question words to be analyzed <child, cough, nasal mucus> and the set of answer words to be analyzed <symptom, medicine, treatment, antiviral, xiao'er ganmao granules, explanation, dosage, cough-relieving, Chinese medicine, electuary, antibiotic, Amoxicillin, amoxicillin granules, particle, oral, Roxithromycin, curative effect>, and the category of the pair to be analyzed is obtained as "medical treatment & health";
In the third step, according to each question word to be analyzed and the category, the knowledge records whose question word matches a question word to be analyzed are selected from the knowledge base, giving the following answer words and semantic relevance values (for readability, the semantic relevance values in the table below have been suitably normalized):
Figure BDA0000399250180000151
Figure BDA0000399250180000161
Figure BDA0000399250180000171
In the fourth step, according to the answer words in the set of answer words to be analyzed, the knowledge records selected in the third step are filtered to those whose answer word matches an answer word to be analyzed, and the semantic relevance values of the filtered records are obtained. Analysis shows that, in this example, the answer words to be analyzed that match answer words in knowledge records include: <oral, coughs and breathes heavily, xiao'er ganmao granules, checks, cough-relieving, treatment, flu-like symptom, cold granules>.
The degree of association of the pair to be analyzed can then be calculated; it reaches 0.9 (with the degree of association ranging from 0 to 1).
It should be noted that:
The algorithms and displays provided herein are not inherently related to any particular computer, virtual system, or other apparatus. Various general-purpose systems may also be used with the teachings herein. From the description above, the structure required to construct such systems is apparent. Moreover, the invention is not directed to any particular programming language. It should be understood that the content of the invention described herein may be implemented in a variety of programming languages, and that the description of a specific language above is given to disclose the best mode of the invention.
Numerous specific details are set forth in the description provided herein. It will be understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures, and techniques are not shown in detail so as not to obscure the understanding of this description.
Similarly, it should be understood that, in order to streamline the disclosure and aid the understanding of one or more of the various inventive aspects, the features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof in the above description of exemplary embodiments. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in fewer than all features of a single disclosed embodiment. The claims following the detailed description are therefore expressly incorporated into the detailed description, with each claim standing on its own as a separate embodiment of the invention.
Those skilled in the art will appreciate that the modules in the device of an embodiment can be adaptively changed and arranged in one or more devices different from that embodiment. The modules, units, or components of an embodiment can be combined into one module, unit, or component, and can furthermore be divided into a plurality of sub-modules, sub-units, or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract, and drawings) and all processes or units of any method or device so disclosed may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, an equivalent, or a similar purpose.
Furthermore, those skilled in the art will understand that, although some embodiments described herein include some features but not other features included in other embodiments, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the following claims, any of the claimed embodiments can be used in any combination.
The various component embodiments of the invention may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will understand that a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the device for obtaining the degree of association of a question-and-answer pair according to embodiments of the invention. The invention may also be implemented as a device or apparatus program (for example, a computer program and a computer program product) for carrying out part or all of the method described herein. Such a program implementing the invention may be stored on a computer-readable medium, or may take the form of one or more signals. Such a signal may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above embodiments illustrate rather than limit the invention, and that those skilled in the art can design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference sign placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention can be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, and so on does not indicate any order; these words may be interpreted as names.

Claims (10)

1. A device for obtaining the degree of association of a question-and-answer pair, the device comprising:
a question-and-answer knowledge base, adapted to store a plurality of question-and-answer knowledge records;
a word extraction unit, adapted to perform a word extraction operation on the question content and the answer content of a question-and-answer pair to be analyzed, obtaining at least one question word to be analyzed and at least one answer word to be analyzed; and
an association degree calculation unit, adapted to select at least one question-and-answer knowledge record from the question-and-answer knowledge base according to the question word to be analyzed and the answer word to be analyzed, and to calculate the degree of association of the question-and-answer pair to be analyzed from the selected question-and-answer knowledge records.
2. The device according to claim 1, wherein the device further comprises a question-and-answer knowledge base construction unit,
the question-and-answer knowledge base construction unit being adapted to extract a plurality of question-and-answer pairs in advance from web pages containing question-and-answer pairs, and to build, from the extracted question-and-answer pairs, the question-and-answer knowledge base comprising a plurality of question-and-answer knowledge records;
the question-and-answer knowledge base construction unit being further adapted, when extracting the plurality of question-and-answer pairs from the web pages containing question-and-answer pairs, to crawl the categories corresponding to the question-and-answer pairs;
the question-and-answer knowledge base construction unit being further adapted, when building the question-and-answer knowledge base from the extracted question-and-answer pairs, to build the question-and-answer knowledge records from the question-and-answer pairs and the categories corresponding to the question-and-answer pairs; each question-and-answer knowledge record corresponding to one category and comprising one question word, one answer word, and the semantic relevance between the question word and the answer word.
3. device according to claim 1 and 2, wherein,
The described degree computing unit that is associated, is suitable for choosing the question and answer knowledge record of it problem word comprising and problem word match to be analyzed and the answer word comprising and answer word match to be analyzed; According in the described question and answer knowledge record of choosing corresponding to the question and answer knowledge record of identical category, obtain these question and answer to be analyzed to the degree that is associated for each classification; Choose the maximal value of above-mentioned these question and answer to be analyzed to the degree that is associated for each classification, using this maximal value as the right degree that is associated of question and answer to be analyzed.
4. The device according to claim 2, wherein
the question-and-answer knowledge base construction unit is adapted to perform the following operations on each question-and-answer pair:
performing a word extraction operation on the question content and the answer content of the question-and-answer pair to obtain a question word set and an answer word set; and, for each category to which the question-and-answer pair corresponds, forming one information record from each question word in the question word set and each answer word in the answer word set;
the question-and-answer knowledge base construction unit is adapted to perform the following operations on each information record:
calculating the probability that the answer word belongs to the category, calculating the specificity with which the answer word explains the question word within the category, and calculating the strength with which the question word is explained by the answer word within the category; multiplying the probability, the specificity, and the strength, the resulting product being the semantic relevance between the answer word and the question word; and forming, from the question word, the answer word, and their semantic relevance, one question-and-answer knowledge record corresponding to the category.
5. A method for obtaining the degree of association of a question-and-answer pair, the method comprising the following steps:
performing a word extraction operation on the question content and the answer content of a question-and-answer pair to be analyzed, obtaining at least one question word to be analyzed and at least one answer word to be analyzed; and
selecting, according to the question word to be analyzed and the answer word to be analyzed, at least one question-and-answer knowledge record from a question-and-answer knowledge base comprising a plurality of question-and-answer knowledge records, and calculating the degree of association of the question-and-answer pair to be analyzed from the selected question-and-answer knowledge records.
6. The method according to claim 5, wherein the method further comprises:
extracting a plurality of question-and-answer pairs in advance from web pages containing question-and-answer pairs, and building, from the extracted question-and-answer pairs, the question-and-answer knowledge base comprising a plurality of question-and-answer knowledge records;
when extracting the plurality of question-and-answer pairs from the web pages containing question-and-answer pairs, crawling the categories corresponding to the question-and-answer pairs;
when building the question-and-answer knowledge base from the extracted question-and-answer pairs, building the question-and-answer knowledge records from the question-and-answer pairs and the categories corresponding to the question-and-answer pairs;
each question-and-answer knowledge record corresponding to one category and comprising one question word, one answer word, and the semantic relevance between the question word and the answer word.
7. The method according to claim 5 or 6, wherein
selecting at least one question-and-answer knowledge record from the question-and-answer knowledge base according to the question word to be analyzed and the answer word to be analyzed, and calculating the degree of association of the question-and-answer pair to be analyzed from the selected question-and-answer knowledge records, specifically comprises:
selecting the question-and-answer knowledge records whose question word matches the question word to be analyzed and whose answer word matches the answer word to be analyzed;
obtaining, from the selected question-and-answer knowledge records that correspond to the same category, the degree of association of the question-and-answer pair to be analyzed for each category; and
taking the maximum of the degrees of association of the question-and-answer pair to be analyzed over all categories as the degree of association of the question-and-answer pair to be analyzed.
8. The method according to claim 7, wherein
obtaining, from the selected question-and-answer knowledge records that correspond to the same category, the degree of association of the question-and-answer pair to be analyzed for each category specifically comprises:
weighting and summing the semantic relevance values of the selected question-and-answer knowledge records that correspond to the same category, to obtain the degree of association of the question-and-answer pair to be analyzed for each category.
9. The method according to claim 6, wherein building the question-and-answer knowledge base from the question-and-answer pairs and the categories corresponding to the question-and-answer pairs specifically comprises:
for each question-and-answer pair, performing a word extraction operation on the question content and the answer content of the question-and-answer pair to obtain a question word set and an answer word set;
for each category to which the question-and-answer pair corresponds, forming one information record from each question word in the question word set and each answer word in the answer word set;
performing the following operations on each information record:
calculating the probability that the answer word belongs to the category, calculating the specificity with which the answer word explains the question word within the category, and calculating the strength with which the question word is explained by the answer word within the category;
multiplying the probability, the specificity, and the strength, the resulting product being the semantic relevance between the answer word and the question word; and
forming, from the question word, the answer word, and their semantic relevance, one question-and-answer knowledge record corresponding to the category.
10. The method according to claim 9, wherein:
calculating the probability that the answer word belongs to the category specifically comprises:
$$P(C_k \mid AW_j) = \frac{P(AW_j \mid C_k)\, P(C_k)}{P(AW_j)};$$
calculating the specificity with which each answer word explains the question word within the category specifically comprises:
$$\mathrm{specific}(QW_i, AW_j \mid C = C_k) = P(QW_i \mid AW_j, C = C_k) = \left.\frac{\#(QW_i, AW_j)}{\#(AW_j)}\right|_{C = C_k};$$
calculating the strength with which the question word is explained by each answer word within the category specifically comprises:
$$\mathrm{interpret}(QW_i, AW_j \mid C = C_k) = P(AW_j \mid QW_i, C = C_k) = \left.\frac{\#(QW_i, AW_j)}{\sum_{j=1}^{x} \#(QW_i, AW_j)}\right|_{C = C_k};$$
multiplying the probability, the specificity, and the strength specifically comprises:
$$\mathrm{weight}(QW_i, AW_j \mid C = C_k) = P(C_k \mid AW_j) \cdot \mathrm{specific}(QW_i, AW_j \mid C = C_k) \cdot \mathrm{interpret}(QW_i, AW_j \mid C = C_k);$$
wherein $P(C_k)$ denotes the probability that category $C_k$ occurs; $P(AW_j)$ denotes the probability that the answer word is $AW_j$; $P(AW_j \mid C_k)$ denotes the probability that the answer word $AW_j$ occurs in category $C_k$;
$\#(QW_i, AW_j)$ denotes the number of times the question word is $QW_i$ and the answer word is $AW_j$;
$\#(AW_j)$ denotes the number of times the answer word is $AW_j$.
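By way of non-limiting illustration, the calculation recited in claim 10 may be sketched as follows. The sketch assumes that every probability is estimated as a relative frequency over a list of (question word, answer word, category) information records; that estimation method, the function name `relevance_weights`, and the variable names are assumptions of this sketch and are not stated in the claim.

```python
from collections import Counter


def relevance_weights(records):
    """Return weight(QWi, AWj | C=Ck) for every distinct record.

    records: iterable of (question_word, answer_word, category) tuples.
    All probabilities are relative-frequency estimates (an assumption).
    """
    records = list(records)
    total = len(records)
    cat_count = Counter(c for _, _, c in records)            # occurrences of Ck
    aw_count = Counter(aw for _, aw, _ in records)           # AWj over all categories
    aw_in_cat = Counter((aw, c) for _, aw, c in records)     # #(AWj) within Ck
    qw_in_cat = Counter((qw, c) for qw, _, c in records)     # sum over j of #(QWi, AWj) within Ck
    pair_in_cat = Counter(records)                           # #(QWi, AWj) within Ck

    weights = {}
    for (qw, aw, c), n_pair in pair_in_cat.items():
        p_aw_given_c = aw_in_cat[(aw, c)] / cat_count[c]
        p_c = cat_count[c] / total
        p_aw = aw_count[aw] / total
        p_c_given_aw = p_aw_given_c * p_c / p_aw             # Bayes' rule
        specific = n_pair / aw_in_cat[(aw, c)]               # P(QWi | AWj, C=Ck)
        interpret = n_pair / qw_in_cat[(qw, c)]              # P(AWj | QWi, C=Ck)
        weights[(qw, aw, c)] = p_c_given_aw * specific * interpret
    return weights
```

An answer word that occurs in several categories receives a lower category probability in each of them, which lowers the resulting weight even when its specificity and strength within one category are high.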
CN201310495641.4A 2013-10-21 2013-10-21 Device and method for obtaining association degree of question and answer pair Active CN103577556B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201310495641.4A CN103577556B (en) 2013-10-21 2013-10-21 Device and method for obtaining association degree of question and answer pair
PCT/CN2014/086838 WO2015058604A1 (en) 2013-10-21 2014-09-18 Apparatus and method for obtaining degree of association of question and answer pair and for search ranking optimization

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310495641.4A CN103577556B (en) 2013-10-21 2013-10-21 Device and method for obtaining association degree of question and answer pair

Publications (2)

Publication Number Publication Date
CN103577556A (en) 2014-02-12
CN103577556B (en) 2017-01-18

Family

ID=50049332

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310495641.4A Active CN103577556B (en) 2013-10-21 2013-10-21 Device and method for obtaining association degree of question and answer pair

Country Status (1)

Country Link
CN (1) CN103577556B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070073683A1 (en) * 2003-10-24 2007-03-29 Kenji Kobayashi System and method for question answering document retrieval
CN101286161A (en) * 2008-05-28 2008-10-15 华中科技大学 Intelligent Chinese request-answering system based on concept
CN101441660A (en) * 2008-12-16 2009-05-27 腾讯科技(深圳)有限公司 Knowledge evaluating system and method in inquiry and answer community
CN101520802A (en) * 2009-04-13 2009-09-02 腾讯科技(深圳)有限公司 Question-answer pair quality evaluation method and system

Cited By (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105404618B (en) * 2014-09-16 2018-10-02 阿里巴巴集团控股有限公司 A kind of dialog text treating method and apparatus
CN105404618A (en) * 2014-09-16 2016-03-16 阿里巴巴集团控股有限公司 Dialogue text data processing method and apparatus
CN105786851A (en) * 2014-12-23 2016-07-20 北京奇虎科技有限公司 Question and answer knowledge base construction method as well as search provision method and apparatus
CN105786872A (en) * 2014-12-23 2016-07-20 北京奇虎科技有限公司 Method and device for providing question-answer onebox based on user searches
CN106909572A (en) * 2015-12-23 2017-06-30 北京奇虎科技有限公司 A kind of construction method and device of question and answer knowledge base
CN106909573A (en) * 2015-12-23 2017-06-30 北京奇虎科技有限公司 A kind of method and apparatus for evaluating question and answer to quality
CN107168967A (en) * 2016-03-07 2017-09-15 阿里巴巴集团控股有限公司 The acquisition methods and device of object knowledge point
CN107168967B (en) * 2016-03-07 2020-12-04 创新先进技术有限公司 Target knowledge point acquisition method and device
CN107305578A (en) * 2016-04-25 2017-10-31 北京京东尚科信息技术有限公司 Human-machine intelligence's answering method and device
CN107436916A (en) * 2017-06-15 2017-12-05 百度在线网络技术(北京)有限公司 The method and device of intelligent prompt answer
CN107436916B (en) * 2017-06-15 2021-04-27 百度在线网络技术(北京)有限公司 Intelligent answer prompting method and device
CN108090127A (en) * 2017-11-15 2018-05-29 北京百度网讯科技有限公司 Question and answer text evaluation model is established with evaluating the method, apparatus of question and answer text
CN109271495A (en) * 2018-08-14 2019-01-25 阿里巴巴集团控股有限公司 Question and answer recognition effect detection method, device, equipment and readable storage medium storing program for executing
CN109271495B (en) * 2018-08-14 2023-02-17 创新先进技术有限公司 Question-answer recognition effect detection method, device, equipment and readable storage medium
CN108932349A (en) * 2018-08-17 2018-12-04 齐鲁工业大学 Medical automatic question-answering method and device, storage medium, electronic equipment
CN108932349B (en) * 2018-08-17 2019-03-26 齐鲁工业大学 Medical automatic question-answering method and device, storage medium, electronic equipment
AU2019322953B2 (en) * 2018-08-17 2021-08-19 Qilu University Of Technology Method, system, storage medium and electric device of medical automatic question answering
WO2020034642A1 (en) * 2018-08-17 2020-02-20 齐鲁工业大学 Automatic medical question answering method and apparatus, storage medium, and electronic device
CN109783631A (en) * 2019-02-02 2019-05-21 北京百度网讯科技有限公司 Method of calibration, device, computer equipment and the storage medium of community's question and answer data
US11372942B2 (en) 2019-02-02 2022-06-28 Beijing Baidu Netcom Science And Technology Co., Ltd. Method, apparatus, computer device and storage medium for verifying community question answer data
CN110442690B (en) * 2019-06-26 2021-08-17 重庆兆光科技股份有限公司 Query optimization method, system and medium based on probabilistic reasoning
CN110442690A (en) * 2019-06-26 2019-11-12 重庆兆光科技股份有限公司 A kind of query optimization method, system and medium based on probability inference
CN110399466A (en) * 2019-08-01 2019-11-01 北京百度网讯科技有限公司 Screening technique, device, equipment and the storage medium of question and answer data
CN111444724A (en) * 2020-03-23 2020-07-24 腾讯科技(深圳)有限公司 Medical question-answer quality testing method and device, computer equipment and storage medium
WO2024051115A1 (en) * 2022-09-05 2024-03-14 苏州元脑智能科技有限公司 Text generation method and apparatus, device, and non-volatile readable storage medium

Also Published As

Publication number Publication date
CN103577556B (en) 2017-01-18

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220727

Address after: Room 801, 8th floor, No. 104, floors 1-19, building 2, yard 6, Jiuxianqiao Road, Chaoyang District, Beijing 100015

Patentee after: BEIJING QIHOO TECHNOLOGY Co.,Ltd.

Address before: 100088 room 112, block D, 28 new street, new street, Xicheng District, Beijing (Desheng Park)

Patentee before: BEIJING QIHOO TECHNOLOGY Co.,Ltd.

Patentee before: Qizhi software (Beijing) Co.,Ltd.
