WO2015058604A1 - Apparatus and method for obtaining the degree of association of a question and answer pair and for optimizing search ranking - Google Patents

Info

Publication number
WO2015058604A1
Authority
WO
WIPO (PCT)
Prior art keywords
question
answer
word
analyzed
category
Prior art date
Application number
PCT/CN2014/086838
Other languages
English (en)
Chinese (zh)
Inventor
孙林
陈培军
秦吉胜
Original Assignee
北京奇虎科技有限公司
奇智软件(北京)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from CN201310495881.4A external-priority patent/CN103577558B/zh
Priority claimed from CN201310495641.4A external-priority patent/CN103577556B/zh
Priority claimed from CN201310495856.6A external-priority patent/CN103577557B/zh
Application filed by 北京奇虎科技有限公司, 奇智软件(北京)有限公司
Publication of WO2015058604A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis

Definitions

  • The present invention relates to the field of network data communication technologies, and in particular to an apparatus and method for obtaining the degree of association of a question and answer pair, an apparatus and method for optimizing the search ranking of question and answer pairs, and an apparatus and method for determining the crawl frequency of a network resource point.
  • A Q&A community is a web application whose content is generated by its users.
  • In its basic form, users ask questions according to their own needs, and other users give answers. This form provides a new channel for users to obtain information on the web.
  • The quality of information in Q&A communities varies so widely that they contain a large number of low-quality Q&A pairs. This makes it inconvenient for users to find information and also lowers the quality of the Q&A community.
  • Prior-art methods of judging question and answer quality rely mainly on non-text features of the question and answer pair, which limits their versatility.
  • Prior-art methods of setting the crawl frequency for a network resource point rely mainly on link analysis of websites. When such methods are applied to question-and-answer search, they cannot semantically analyze question and answer pairs and cannot adjust the crawl frequency (or crawl granularity) according to the quality of the network resource point, which affects the accuracy and versatility of search results.
  • The present invention has been made in view of the above problems, in order to provide an apparatus and method for obtaining the degree of association of a question and answer pair, an apparatus and method for optimizing the search ranking of question and answer pairs, and an apparatus and method for determining the crawl frequency of a network resource point, which overcome the above problems or at least partially solve them.
  • An apparatus for obtaining the degree of association of a question and answer pair, comprising: a question and answer knowledge base adapted to store a plurality of question and answer knowledge records; a word extraction unit adapted to perform a word extraction operation on the question content and the answer content of the question and answer pair to be analyzed, to obtain at least one question word to be analyzed and at least one answer word to be analyzed; and an association degree calculation unit adapted to select at least one question and answer knowledge record from the question and answer knowledge base according to the question word to be analyzed and the answer word to be analyzed, and to calculate the degree of association of the question and answer pair to be analyzed based on the selected question and answer knowledge record.
  • An apparatus for optimizing the search ranking of question and answer pairs, comprising: a question and answer knowledge base adapted to store a plurality of question and answer knowledge records; a search unit adapted to receive a user's search request and to obtain, according to the search request, a plurality of question and answer pairs to be analyzed that match the search request; a calculation unit adapted to obtain, according to the question and answer knowledge base, the degree of association of each question and answer pair to be analyzed; and a search ranking unit adapted to optimize the search ranking of the question and answer pairs to be analyzed according to their degrees of association.
  • An apparatus for determining the crawl frequency of a network resource point, comprising: a question and answer knowledge base adapted to store a plurality of question and answer knowledge records; a resource analysis unit adapted to crawl a plurality of question and answer pairs to be analyzed from the network resource point; a calculation unit adapted to obtain, according to the question and answer knowledge base, the degree of association of each question and answer pair to be analyzed; and a crawl frequency determining unit adapted to determine the crawl frequency of the network resource point according to the degrees of association of the question and answer pairs to be analyzed.
  • A method for obtaining the degree of association of a question and answer pair, comprising the steps of: performing a word extraction operation on the question content and the answer content of the question and answer pair to be analyzed, to obtain at least one question word to be analyzed and at least one answer word to be analyzed; and selecting at least one question and answer knowledge record from a question and answer knowledge base including a plurality of question and answer knowledge records according to the question word to be analyzed and the answer word to be analyzed, and calculating the degree of association of the question and answer pair to be analyzed according to the selected question and answer knowledge record.
  • A method for optimizing the search ranking of question and answer pairs, comprising the steps of: receiving a search request of a user, and obtaining, according to the search request, a plurality of question and answer pairs to be analyzed that match the search request; obtaining the degree of association of each question and answer pair to be analyzed according to a question and answer knowledge base including a plurality of question and answer knowledge records; and optimizing the search ranking of the question and answer pairs to be analyzed according to their degrees of association.
  • A method for determining the crawl frequency of a network resource point, comprising the steps of: crawling a plurality of question and answer pairs to be analyzed from the network resource point; obtaining the degree of association of each question and answer pair to be analyzed according to a question and answer knowledge base including a plurality of question and answer knowledge records; and determining the crawl frequency of the network resource point according to the degrees of association of the question and answer pairs to be analyzed.
  • According to the invention, multiple question and answer pairs are extracted from webpages containing question and answer pairs, and a question and answer knowledge base comprising multiple question and answer knowledge records is constructed from the extracted pairs. A word extraction operation is performed on the question content and the answer content of the question and answer pair to be analyzed to obtain at least one question word to be analyzed and at least one answer word to be analyzed. At least one question and answer knowledge record is then selected from the knowledge base according to these words, and the degree of association of the question and answer pair to be analyzed is calculated from the selected records. This evaluates the quality of the question and answer pair at the semantic level and solves the problem that the prior art evaluates only at the lexical level.
  • Further, obtaining the degree of association of each question and answer pair to be analyzed according to the question and answer knowledge base, and optimizing the search ranking of the question and answer pairs according to their degrees of association, evaluates the quality of the pairs at the semantic level and solves the problem of poor ranking results in the prior art.
  • Further, crawling a plurality of question and answer pairs to be analyzed from a network resource point, obtaining the degree of association of each pair according to the question and answer knowledge base, and determining the crawl frequency of the network resource point according to those degrees of association allows the crawl frequency to be set by evaluating the quality of the network resource point, and solves the problem that the prior art cannot select the crawl frequency according to the quality of the network resource point.
  • the solution of the present application is easy to implement and has high versatility.
  • FIG. 1 shows a flow chart of a method of obtaining the degree of association of a question and answer pair, in accordance with one embodiment of the present invention;
  • FIG. 2 shows a detailed flow chart for building a Q&A knowledge base;
  • FIG. 3 is a schematic diagram showing an explanation model of the question and answer knowledge base obtained by using the steps shown in FIG. 2;
  • FIG. 4 shows a detailed flow chart of step S200 of FIG. 1;
  • FIG. 5 shows a block diagram of an apparatus for obtaining the degree of association of a question and answer pair, in accordance with one embodiment of the present invention;
  • FIG. 6 shows a flow chart of a method for optimizing the search ranking of question and answer pairs, in accordance with one embodiment of the present invention;
  • FIG. 7 shows a block diagram of an apparatus for optimizing the search ranking of question and answer pairs, in accordance with one embodiment of the present invention;
  • FIG. 8 shows a flow chart of a method of determining the crawl frequency of a network resource point, in accordance with one embodiment of the present invention;
  • FIG. 9 shows a block diagram of an apparatus for determining the crawl frequency of a network resource point, in accordance with one embodiment of the present invention;
  • FIG. 10 shows a block diagram of an application server for performing the method according to the invention; and
  • FIG. 11 shows a storage unit for holding or carrying program code implementing the method according to the invention.
  • The existing method of obtaining the degree of association of a question and answer pair uses text features and non-text features to describe the question and the answer of the pair.
  • the existing method for obtaining a search ranking of a question and answer pair is to use a text feature and a non-text feature to describe the question and answer pair to rank the question and answer pair, or to answer questions based on the question and answer.
  • Text features mainly include textual visual features (such as punctuation density, average word length, text entropy, etc.) and text content features (such as text content word scale, question word density, related word coverage, etc.), and extract Chinese automatic errors widely used.
  • non-text features include user weightedness indicators, answer question status, answer answer time, user relationship interaction features, and so on.
  • A question quality prediction model and an answer quality prediction model are learned separately on the training set, and the outputs of the two models are used to evaluate the quality of the question and answer pair.
  • The related word coverage feature is used to describe how well the question and the answer match, but it operates only at the lexical level and does not consider the semantic matching of the question and the answer.
  • Semantic matching of the question and the answer is precisely the core of question answering.
  • For example, the question is "Where is the capital of China?", answer 1 is "Beijing", and answer 2 is "China's capital is Shanghai". After word segmentation and removal of stop words, the question becomes "where is the capital of China", the word segmentation result of answer 1 is "Beijing", and the word segmentation result of answer 2 is "China capital Shanghai". Measured by word overlap alone, answer 2 shares more words with the question than answer 1 does, even though answer 1 is the correct answer.
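  • The weakness of a purely lexical measure can be sketched in a few lines. The function `word_coverage` and the token lists are illustrative assumptions (the tokens stand in for the segmented Chinese text); they are not taken from the prior-art systems discussed here.

```python
def word_coverage(question_words, answer_words):
    """Fraction of question words that also occur in the answer (lexical overlap only)."""
    question = set(question_words)
    if not question:
        return 0.0
    return len(question & set(answer_words)) / len(question)


# Assumed segmentation results for the example in the text.
question = ["China", "capital", "where"]
answer_1 = ["Beijing"]                       # correct answer
answer_2 = ["China", "capital", "Shanghai"]  # wrong answer

coverage_1 = word_coverage(question, answer_1)
coverage_2 = word_coverage(question, answer_2)
print(coverage_1, coverage_2)
```

  • The wrong answer scores higher than the correct one, which is why a measure that captures the semantic relation between question words and answer words is needed.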
  • FIG. 1 shows a flow chart of a method of obtaining the degree of association of a question and answer pair, in accordance with one embodiment of the present invention.
  • The method of obtaining the degree of association of a question and answer pair comprises the following steps S100 and S200:
  • S100: Perform a word extraction operation on the question content and the answer content of the question and answer pair to be analyzed, and obtain at least one question word to be analyzed and at least one answer word to be analyzed.
  • The word extraction operation specifically includes word segmentation, stop-word removal, word merging (word join), and extraction of entity words (such as nouns and verbs) on the question content and the answer content of the question and answer pair to be analyzed. At least one question word to be analyzed is then obtained from the question content, and at least one answer word to be analyzed is obtained from the answer content.
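  • A minimal sketch of step S100, under simplifying assumptions: a regular expression stands in for a real Chinese word segmenter, the stop-word list is hypothetical, duplicate words are dropped, and word merging and entity-word extraction are omitted.

```python
import re

# Hypothetical stop-word list; a real system would use a language-specific one.
STOP_WORDS = {"a", "an", "the", "is", "of", "to"}


def extract_words(text, stop_words=STOP_WORDS):
    """Tokenize, drop stop words, and keep each remaining word once, in order."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    words = []
    seen = set()
    for token in tokens:
        if token in stop_words or token in seen:
            continue
        seen.add(token)
        words.append(token)
    return words


question_words = extract_words("Where is the capital of China?")
answer_words = extract_words("The capital of China is Beijing.")
print(question_words, answer_words)
```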
  • S200: Select at least one question and answer knowledge record from the question and answer knowledge base including the plurality of question and answer knowledge records according to the question word to be analyzed and the answer word to be analyzed, and calculate the degree of association of the question and answer pair to be analyzed according to the selected question and answer knowledge record.
  • In this way, the question content and the answer content of the question and answer pair to be analyzed can be analyzed at the semantic level by using the question and answer knowledge base to obtain the degree of association of the pair; the evaluation effect is better and the method is easy to implement.
  • The question and answer knowledge base including a plurality of question and answer knowledge records is obtained in advance by extracting a plurality of question and answer pairs from webpages containing question and answer pairs and constructing the records from the extracted pairs.
  • When a question and answer pair is extracted, the category corresponding to the question and answer pair is also captured.
  • The question and answer knowledge record is constructed according to the question and answer pair and the category corresponding to the question and answer pair.
  • Each question and answer knowledge record in the resulting question and answer knowledge base corresponds to a category and includes a question word (QW), an answer word (AW), and the semantic relevance between the question word and the answer word.
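  • One possible in-memory shape for such a record is sketched below. The class name `QAKnowledgeRecord`, its field names, and the sample values are illustrative assumptions; the text only specifies that a record carries a category, a question word, an answer word, and their semantic relevance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QAKnowledgeRecord:
    category: str              # category C_k the record corresponds to
    question_word: str         # QW
    answer_word: str           # AW
    semantic_relevance: float  # relevance between QW and AW on this category


# Hypothetical sample record.
record = QAKnowledgeRecord("medical health", "cold", "cough", 0.8)
print(record)
```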
  • FIG. 2 shows a detailed flow chart for building a Q&A knowledge base. Specifically, the following steps S310, S320, and S330 are included:
  • In step S310, data may be fetched from webpages on the Internet that contain high-quality question and answer pairs, and question and answer pairs are extracted from them, which ensures the quality of the extracted pairs.
  • Webpages containing high-quality question and answer pairs include cQA (community question answering) communities, major professional forums, and so on.
  • Since a webpage containing high-quality question and answer pairs includes the category information corresponding to each pair, the category corresponding to a question and answer pair can be captured together with the pair.
  • In step S320, a word extraction operation is performed on the question content and the answer content of each of the question and answer pairs extracted in step S310, specifically including word segmentation, stop-word removal, word merging, and extraction of entity words.
  • At least one question word is obtained from the question content of each question and answer pair, at least one answer word is obtained from the answer content of each question and answer pair, and the category set <C_1, ..., C_k, ..., C_p> of the question and answer pair can be obtained.
  • Step S330 in this embodiment may be performed on the massive number of information records obtained by applying the word extraction operation described in step S320 to the massive number of question and answer pairs obtained from webpages.
  • The semantic relevance obtained from massive information records is more accurate.
  • Calculating the probability that the answer word belongs to the category includes:
  • Calculating the degree of specificity with which each answer word explains the question word on the category includes:
  • Calculating the strength with which the question word is explained by each answer word on the category specifically includes:
  • … | C = C_k) * interpret(QW_i, AW_j | C = C_k);
  • where P(C_k) represents the probability of occurrence of the category C_k;
  • P(AW_j) represents the probability that the answer word is AW_j;
  • P(AW_j | C_k) represents the probability that an answer word in category C_k is AW_j;
  • #(QW_i, AW_j) indicates the number of times the question word is QW_i and the answer word is AW_j;
  • #(AW_j) indicates the number of times the answer word is AW_j.
  • In this way, question and answer knowledge records can be obtained to construct the question and answer knowledge base.
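  • The quantities named in these definitions can be estimated by counting over information records of the form (question word, answer word, category). The sketch below is an illustration only: combining P(C_k), P(AW_j | C_k), and P(AW_j) by Bayes' rule is an assumption, not a formula taken from the text, and the degree of specificity and the explanation strength are treated as given inputs. Only the final step, multiplying the three factors, follows the description.

```python
from collections import Counter


def estimate_statistics(info_records):
    """Count the named quantities from (question_word, answer_word, category) records."""
    records = list(info_records)
    return {
        "total": len(records),
        "category": Counter(c for _, _, c in records),
        "answer": Counter(aw for _, aw, _ in records),       # #(AW_j)
        "pair": Counter((qw, aw) for qw, aw, _ in records),  # #(QW_i, AW_j)
        "answer_in_category": Counter((aw, c) for _, aw, c in records),
    }


def category_given_answer(stats, answer_word, category):
    """Assumed Bayes combination: P(C_k | AW_j) = P(C_k) * P(AW_j | C_k) / P(AW_j)."""
    total = stats["total"]
    n_category = stats["category"][category]
    n_answer = stats["answer"][answer_word]
    if total == 0 or n_category == 0 or n_answer == 0:
        return 0.0
    p_c = n_category / total
    p_aw = n_answer / total
    p_aw_given_c = stats["answer_in_category"][(answer_word, category)] / n_category
    return p_c * p_aw_given_c / p_aw


def semantic_relevance(probability, specificity, strength):
    """Semantic relevance as the product of the three factors."""
    return probability * specificity * strength


# Hypothetical information records.
records = [
    ("cold", "cough", "medical health"),
    ("cold", "cough", "medical health"),
    ("cold", "granules", "medical health"),
    ("engine", "cough", "cars"),
]
stats = estimate_statistics(records)
posterior = category_given_answer(stats, "cough", "medical health")
print(posterior)
```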
  • FIG. 3 shows a schematic diagram of an explanatory model of the question and answer knowledge base obtained using the steps shown in FIG. 2. It can be seen that for each question word QW_i, n question and answer knowledge records can be obtained for each category in the category set <C_1, ..., C_k, ..., C_p>.
  • If the calculated semantic relevance is 0, the corresponding question and answer knowledge record can be deleted. Further, if the number of question and answer knowledge records in the knowledge base is so large that the overhead of storing the records and of calculating the degree of association of the question and answer pairs to be analyzed becomes excessive, a threshold can be preset and the question and answer knowledge records whose semantic relevance is less than the threshold can be deleted to reduce the overhead.
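  • The pruning rule can be sketched as a simple filter. Records are represented here as (category, question word, answer word, semantic relevance) tuples; this layout, the function name `prune_records`, and the sample values are assumptions for illustration.

```python
def prune_records(records, threshold=0.0):
    """Drop records with zero relevance and records whose relevance is below the threshold."""
    return [r for r in records if r[3] != 0 and r[3] >= threshold]


# Hypothetical records: (category, question_word, answer_word, semantic_relevance).
knowledge_records = [
    ("medical health", "cold", "cough", 0.4),
    ("medical health", "cold", "weather", 0.05),
    ("medical health", "cold", "engine", 0.0),
]
kept_default = prune_records(knowledge_records)
kept_strict = prune_records(knowledge_records, threshold=0.1)
print(len(kept_default), len(kept_strict))
```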
  • FIG. 4 shows a detailed flowchart of step S200 in FIG. 1.
  • step S200 specifically includes the following steps S210, S220, and S230:
  • S210: Select the question and answer knowledge records whose question word matches a question word to be analyzed and whose answer word matches an answer word to be analyzed.
  • A question word matches a question word to be analyzed when the question word to be analyzed is the same as the question word or is a substring of the question word; an answer word matches an answer word to be analyzed when the answer word to be analyzed is the same as the answer word or is a substring of the answer word.
  • In this step, a field matching or field search method is used to select, from the question and answer knowledge base, the part of the question and answer knowledge records related to the question and answer pair to be analyzed.
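  • A sketch of the selection in step S210, using the rule that a word to be analyzed matches a stored word when it is equal to it or is a substring of it. The tuple layout (category, question word, answer word, semantic relevance), the function names, and the sample data are assumptions; a linear scan stands in for whatever field search the knowledge base actually uses, and the words to be analyzed are assumed to be non-empty.

```python
def word_matches(stored_word, analyzed_word):
    """True if the analyzed word equals the stored word or is a substring of it."""
    return analyzed_word in stored_word


def select_records(records, analyzed_question_words, analyzed_answer_words):
    """Keep records whose question word and answer word both match some analyzed word."""
    selected = []
    for record in records:
        _category, question_word, answer_word, _relevance = record
        question_ok = any(word_matches(question_word, w) for w in analyzed_question_words)
        answer_ok = any(word_matches(answer_word, w) for w in analyzed_answer_words)
        if question_ok and answer_ok:
            selected.append(record)
    return selected


# Hypothetical knowledge base contents.
knowledge_base = [
    ("medical health", "cold", "flu symptoms", 0.6),
    ("medical health", "cold", "cough", 0.8),
    ("cars", "engine", "oil", 0.5),
]
selected = select_records(knowledge_base, ["cold"], ["symptoms", "cough"])
print(selected)
```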
  • S220: According to the question and answer knowledge records corresponding to the same category among the selected records, obtain the degree of association of the question and answer pair to be analyzed for each category. Specifically, the semantic relevance values of the selected records that correspond to the same category are weighted and added, giving the degree of association of the question and answer pair to be analyzed for each category.
  • That is, the Q&A knowledge records selected in step S210 are grouped by their corresponding categories, so that records corresponding to the same category form one group. Within each group, the semantic relevance values are weighted (for example, with a weight of 1 or 100) and added to obtain the degree of association of the question and answer pair to be analyzed for that category. At least one degree of association is thereby obtained (in this embodiment, the number of degrees of association equals the number of categories corresponding to the question and answer pair to be analyzed).
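  • The grouping and weighted addition can be sketched as follows. A single uniform weight is assumed, as in the "weight of 1 or 100" example; the tuple layout (category, question word, answer word, semantic relevance) and the sample values are hypothetical.

```python
from collections import defaultdict


def association_by_category(selected_records, weight=1.0):
    """Sum weight * semantic relevance over the selected records of each category."""
    totals = defaultdict(float)
    for category, _question_word, _answer_word, relevance in selected_records:
        totals[category] += weight * relevance
    return dict(totals)


# Hypothetical selected records.
selected_records = [
    ("medical health", "cold", "cough", 0.5),
    ("medical health", "cold", "flu symptoms", 0.25),
    ("parenting", "child", "cough", 0.125),
]
per_category = association_by_category(selected_records)
print(per_category)
```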
  • Figure 5 illustrates a block diagram of an apparatus for obtaining the degree of association of a question and answer pair, in accordance with one embodiment of the present invention.
  • The apparatus includes a question and answer knowledge base 100, a word extraction unit 200, and an association degree calculation unit 300.
  • the question and answer knowledge base 100 is adapted to store a plurality of question and answer knowledge records; the question and answer knowledge base 100 of the present embodiment can be constructed by crawling a large number of question and answer pairs in the web page.
  • the word extracting unit 200 is adapted to perform a word extracting operation on the question content and the answer content of the question and answer pair to be analyzed, and obtain at least one question word to be analyzed and at least one answer word to be analyzed.
  • Specifically, the word extracting unit 200 is adapted to perform word segmentation, stop-word removal, word merging (word join), and extraction of entity words (for example, nouns and verbs) on the question content and the answer content of the question and answer pair to be analyzed, to obtain at least one question word to be analyzed and at least one answer word to be analyzed.
  • The association degree calculation unit 300 is adapted to select at least one question and answer knowledge record from the question and answer knowledge base according to the question word to be analyzed and the answer word to be analyzed, and to calculate the degree of association of the question and answer pair to be analyzed according to the selected question and answer knowledge record.
  • Specifically, the association degree calculation unit 300 is adapted to select the question and answer knowledge records whose question word matches a question word to be analyzed and whose answer word matches an answer word to be analyzed.
  • A question word matches a question word to be analyzed when the question word to be analyzed is the same as the question word or is a substring of the question word; an answer word matches an answer word to be analyzed when the answer word to be analyzed is the same as the answer word or is a substring of the answer word. According to the Q&A knowledge records corresponding to the same category among the selected records, the degree of association of the question and answer pair to be analyzed is obtained for each category; more specifically, the semantic relevance values of the selected records corresponding to the same category are weighted (for example, with a weight of 1 or 100) and added. In this embodiment, the number of degrees of association equals the number of categories corresponding to the question and answer pair to be analyzed. The maximum of the degrees of association of the question and answer pair to be analyzed over the categories is then selected and used as the degree of association of the question and answer pair to be analyzed.
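  • Taking the maximum over categories is a one-line reduction. In the sketch below the input is a mapping from category to its degree of association; returning 0.0 when no category matched is an assumption, since the text does not cover that case, and the sample values are hypothetical.

```python
def overall_association(per_category):
    """Degree of association of the pair: the maximum over its per-category degrees."""
    if not per_category:
        return 0.0  # assumption: no matching records means no association
    return max(per_category.values())


# Hypothetical per-category degrees of association.
degree = overall_association({"medical health": 0.75, "parenting": 0.125})
print(degree)
```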
  • Using the question and answer knowledge base 100, the word extracting unit 200, and the association degree calculation unit 300, at least one question and answer knowledge record is selected from the question and answer knowledge base according to the question word to be analyzed and the answer word to be analyzed, and the degree of association of the question and answer pair to be analyzed is calculated according to the selected record. The question and answer pair to be analyzed can thus be analyzed at the semantic level; the evaluation effect is better, the apparatus is easier to implement, its scope of application is wider, and its versatility is stronger.
  • The device further includes a question and answer knowledge base construction unit 400, which is adapted to extract a plurality of question and answer pairs in advance from webpages containing question and answer pairs, and to construct from the extracted pairs a question and answer knowledge base comprising a plurality of question and answer knowledge records.
  • In the preceding description the Q&A knowledge base already exists. Since the amount of information on the network keeps increasing and its content changes rapidly, the content of the Q&A knowledge base often needs to be updated. Adding the Q&A knowledge base construction unit 400 to build (or update) the Q&A knowledge base ensures the timeliness and reliability of its content.
  • When extracting a question and answer pair, the question and answer knowledge base construction unit 400 also captures the category corresponding to the question and answer pair.
  • Data may be fetched from webpages on the Internet that contain high-quality question and answer pairs, and question and answer pairs extracted from them, which ensures the quality of the extracted pairs. Such webpages include cQA communities, major professional forums, and so on. Since a webpage containing high-quality question and answer pairs includes category information corresponding to each pair, the question and answer knowledge base construction unit 400 can capture the category corresponding to a question and answer pair while capturing the pair.
  • The question and answer knowledge base construction unit 400 is adapted to perform the following operations on each question and answer pair: performing a word extraction operation on the question content and the answer content of the question and answer pair to obtain a question word set and an answer word set (specifically, performing word segmentation, stop-word removal, word merging, and extraction of entity words on the question content and the answer content of each extracted question and answer pair to obtain the question words and the answer words); and forming an information record from each question word in the question word set and each answer word in the answer word set on each of the categories corresponding to the question and answer pair.
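  • Forming information records amounts to taking every combination of question word, answer word, and category for one question and answer pair. The sketch below assumes a record is a plain (question word, answer word, category) tuple; the function name and the sample words are hypothetical.

```python
from itertools import product


def build_information_records(question_words, answer_words, categories):
    """One record per (question word, answer word, category) combination."""
    return list(product(question_words, answer_words, categories))


# Hypothetical word sets and category for one question and answer pair.
info_records = build_information_records(
    ["cold", "child"],
    ["cough", "granules", "oral"],
    ["medical health"],
)
print(len(info_records))
```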
  • The question and answer knowledge base construction unit 400 is adapted to perform the following operations for each information record: calculating the probability that the answer word belongs to the category, the degree of specificity with which the answer word explains the question word on the category, and the strength with which the question word is explained by the answer word on the category; and multiplying the probability, the degree of specificity, and the strength, the product being the semantic relevance of the answer word and the question word.
  • The question word, the answer word, and their semantic relevance form a question and answer knowledge record corresponding to the category.
  • The question and answer knowledge base construction unit 400 is adapted to calculate the probability that the answer word belongs to the category according to the following method:
  • The question and answer knowledge base construction unit 400 is adapted to calculate the degree of specificity with which each answer word explains the question word on the category according to the following method:
  • The question and answer knowledge base construction unit 400 is adapted to calculate the strength with which the question word is explained by each answer word on the category according to the following method:
  • The question and answer knowledge base construction unit 400 is adapted to multiply the above probability, degree of specificity, and strength according to the following method:
  • … | C = C_k) * interpret(QW_i, AW_j | C = C_k);
  • where P(C_k) represents the probability of occurrence of the category C_k;
  • P(AW_j) represents the probability that the answer word is AW_j;
  • P(AW_j | C_k) represents the probability that an answer word in category C_k is AW_j;
  • #(QW_i, AW_j) indicates the number of times the question word is QW_i and the answer word is AW_j;
  • #(AW_j) indicates the number of times the answer word is AW_j.
  • The question words to be analyzed and the answer words to be analyzed are handled as follows:
  • In the first step, an existing Q&A knowledge base may be retrieved, or a Q&A knowledge base may be constructed by crawling question and answer pairs from cQA communities and major professional forums;
  • In the second step, the word extraction operation is performed on the question and answer pair to be analyzed.
  • The question word set to be analyzed is obtained.
  • The answer word set to be analyzed is <symptoms, drugs, treatment, anti-virus, pediatric cold particles, description, dosage, cough, Chinese medicine, granules, antibiotics, amoxicillin, amoxicillin granules, granules, oral, roxithromycin, efficacy>, and the category of the question and answer pair to be analyzed is "medical health";
  • A plurality of question and answer knowledge records matching the question words to be analyzed and the answer words to be analyzed are selected from the question and answer knowledge base, thereby obtaining the following answer words and semantic relevance (for ease of reading, the semantic relevance values in the table have been suitably normalized):
  • The Q&A knowledge records whose answer words match the answer words to be analyzed are selected, and the semantic relevance of the selected question and answer knowledge records is then obtained.
  • In this example, the answer words in the Q&A knowledge records that match the answer words to be analyzed include: <oral, Kechuan, pediatric cold particles, examination, cough, treatment, flu symptoms, cold particles>.
  • The degree of association of the question and answer pair to be analyzed may then be calculated; it reaches 0.9 (with the degree of association ranging from 0 to 1).
  • FIG. 6 shows a flow chart of a method of optimizing a search ranking of a question and answer pair, in accordance with one embodiment of the present invention.
  • the method includes the following steps S610, S620, and S630:
  • S610: Receive a search request of the user, and obtain a plurality of question and answer pairs to be analyzed that match the search request.
  • a network search technique may be used; for example, a question and answer pair search engine may obtain the question and answer pairs to be analyzed according to the user's search request.
  • S620: Obtain the degree of association of each question and answer pair to be analyzed according to a Q&A knowledge base including a plurality of Q&A knowledge records.
  • the question and answer knowledge base may be used to analyze the question content and the answer content of each question and answer pair semantically and so obtain its degree of association; this evaluates the pair more effectively and is easy to implement.
  • step S620 of this embodiment is substantially the same as the method of obtaining the degree of association of a question and answer pair described above, and is not repeated here.
  • the question and answer knowledge base including a plurality of question and answer knowledge records is obtained by extracting a plurality of question and answer pairs from a webpage having a question and answer pair in advance, and constructing according to the extracted question and answer pairs.
  • the category corresponding to the question and answer pair is captured.
  • the question and answer knowledge record is constructed according to the question and answer pair and the category corresponding to the question and answer pair.
  • Each question and answer knowledge record in the obtained question and answer knowledge base corresponds to a category, which includes a question word (QW), an answer word (AW), and a semantic relevance between the question word and the answer word.
  • the semantic relevance between the question words and the answer words of the question and answer knowledge records can be obtained by learning from massive amounts of information; because the question and answer knowledge base is built from information extracted from web pages, the method applies more broadly and is more versatile.
  • the method of this embodiment further includes the step of constructing the question and answer knowledge base; the construction process is substantially the same as the process shown in FIG. 2, and the interpretation model of the question and answer knowledge base of this embodiment is substantially the same as the interpretation model described above. They are not repeated here.
  • the search ranking of the question and answer pair to be analyzed can be optimized by using the degree of association, and the ranking effect is better.
  • the question and answer pairs to be analyzed may be ranked in descending order of their degrees of association, so that a pair with a higher degree of association is ranked earlier. Alternatively, the websites to which the question and answer pairs belong may first be given a preliminary arrangement using a search ranking technique, and the search ranking of each pair is then calculated from the sequence number of that preliminary arrangement and the pair's degree of association; for example, the degree of association of a question and answer pair is multiplied by the sequence number of the preliminary arrangement of the website to which it belongs, and the order of the products is used as the search ranking of the pairs. Because the pairs are sorted by combining the quality of each pair with the ranking of the website to which it belongs, users get better-ordered results when searching for questions and answers.
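The two ranking options described for this step can be sketched as follows. This is only an illustrative sketch, not the implementation of the embodiment: the names `QAPair`, `rank_by_association`, and `rank_combined` are hypothetical, and the combined variant assumes that the sequence number of the preliminary arrangement is first turned into a site score of 1/position so that a larger product means a better result. The description only states that the two quantities are multiplied.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class QAPair:
    title: str          # hypothetical identifier of the question and answer pair
    association: float  # degree of association, assumed to lie in [0, 1]
    site_position: int  # sequence number of the site in the preliminary arrangement (1 = first)


def rank_by_association(pairs: List[QAPair]) -> List[QAPair]:
    """Option 1: pairs with a higher degree of association are ranked earlier."""
    return sorted(pairs, key=lambda p: p.association, reverse=True)


def rank_combined(pairs: List[QAPair]) -> List[QAPair]:
    """Option 2: combine the degree of association with the site's preliminary position.

    Assumption: the position is converted to a score 1/position before multiplying,
    so that well-ranked sites and highly associated pairs both raise the product.
    """
    return sorted(pairs, key=lambda p: p.association * (1.0 / p.site_position), reverse=True)


if __name__ == "__main__":
    demo = [QAPair("a", 0.9, 3), QAPair("b", 0.5, 1), QAPair("c", 0.8, 2)]
    print([p.title for p in rank_by_association(demo)])
    print([p.title for p in rank_combined(demo)])
```

Other ways of combining the two quantities (for example a weighted sum) would fit the same description; the reciprocal is chosen here only to keep the sort direction unambiguous.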
  • the device includes a question and answer knowledge base 710, a search unit 720, a calculation unit 730, and a search ranking unit 740.
  • the question and answer knowledge base 710 is adapted to store a plurality of question and answer knowledge records.
  • the question and answer knowledge base 710 of the present embodiment can be constructed by crawling a massive question and answer pair in a web page.
  • the searching unit 720 is adapted to receive a search request of the user, and obtain a plurality of question and answer pairs to be analyzed that match the search request according to the search request of the user.
  • the search unit 720 may be a question and answer pair search engine that obtains the question and answer pairs to be analyzed according to the user's search request; for example, the search unit 720 is a web search engine for question and answer search, which receives the search request entered by the user through a browser and obtains the question and answer pairs to be analyzed.
  • the calculating unit 730 is adapted to obtain the degree of association of each question and answer pair to be analyzed according to the question and answer knowledge base 710.
  • the calculation unit 730 can use the question and answer knowledge base to analyze the question content and the answer content of each question and answer pair semantically and so obtain its degree of association; this evaluates the pair more effectively and is easy to implement.
  • the question and answer knowledge base 710 is constructed from a large number of high-quality question and answer pairs extracted from web pages and includes a plurality of question and answer knowledge records; the semantic relevance between the question words and answer words of those records can be obtained by learning from massive amounts of information.
  • the search ranking unit 740 is adapted to optimize the search ranking of the question and answer pair to be analyzed according to the degree of association of the question and answer pairs to be analyzed.
  • the search ranking unit 740 may rank the question and answer pairs to be analyzed in descending order of their degrees of association, so that a pair with a higher degree of association is ranked earlier. Alternatively, the websites to which the question and answer pairs belong may first be given a preliminary arrangement using a search ranking technique, and the search ranking of each pair is then calculated from the sequence number of that preliminary arrangement and the pair's degree of association; for example, the degree of association of a question and answer pair is multiplied by the sequence number of the preliminary arrangement of the website to which it belongs, and the order of the products is used as the search ranking of the pairs.
  • the apparatus further includes a question and answer knowledge base construction unit 750, which is adapted to extract a plurality of question and answer pairs in advance from web pages containing question and answer pairs and to construct, from the extracted pairs, a question and answer knowledge base including a plurality of question and answer knowledge records.
  • In the device shown in FIG. 7, the Q&A knowledge base 710 already exists. Because the volume of information on real networks keeps growing and its content changes rapidly, the content of the Q&A knowledge base 710 often needs to be updated.
  • having the knowledge base construction unit 750 construct (or update) the question and answer knowledge base 710 keeps the content of the question and answer knowledge base 710 current and reliable.
  • the question and answer knowledge base construction unit 750 of the present embodiment is the same as the question and answer knowledge base construction unit 400 shown in FIG. 5, and the description thereof will not be repeated here.
  • the calculation unit 730 in FIG. 7 specifically includes a word extraction subunit and an association degree calculation subunit (not shown).
  • the word extraction subunit is adapted to perform the word extraction operation on the question content and the answer content of the question and answer pair to be analyzed, and obtain at least one question word to be analyzed and at least one answer word to be analyzed.
  • the word extraction subunit is adapted to perform word segmentation, stop-word removal, word joining, and entity word extraction (e.g., nouns, verbs, etc.) on the question content and the answer content of the question and answer pair to be analyzed, to obtain at least one question word to be analyzed and at least one answer word to be analyzed.
  • the association degree calculation subunit is adapted to select at least one question and answer knowledge record from the question and answer knowledge base according to the question words and answer words to be analyzed, and to calculate the degree of association of the question and answer pair to be analyzed from the selected records.
  • the association degree calculation subunit is adapted to select the question and answer knowledge records whose question word matches a question word to be analyzed and whose answer word matches an answer word to be analyzed.
  • a question word matches a question word to be analyzed when the two are identical or the question word to be analyzed is a substring of the question word; an answer word matches an answer word to be analyzed when the two are identical or the answer word to be analyzed is a substring of the answer word. From the selected question and answer knowledge records that correspond to the same category, the degree of association of the question and answer pair to be analyzed is obtained for each category; more specifically, the semantic relevance values of the selected records of the same category are weighted (for example, with a weight of 1 or 100) and added, giving at least one degree of association (in this embodiment the number of degrees of association equals the number of categories involved). The maximum of these per-category degrees of association is taken as the degree of association of the question and answer pair to be analyzed.
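The selection and aggregation performed by the association degree calculation subunit can be illustrated with a minimal sketch. It assumes that each knowledge record is held as a small in-memory object and that a single constant weight is applied to every semantic relevance value; the names `KnowledgeRecord`, `matches`, and `association_degree` are hypothetical and do not appear in the description.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class KnowledgeRecord:
    category: str        # category the record corresponds to
    question_word: str   # QW
    answer_word: str     # AW
    relevance: float     # semantic relevance between QW and AW


def matches(record_word: str, analyzed_word: str) -> bool:
    """True when the words are identical or the analyzed word is a substring of the record word."""
    return bool(analyzed_word) and analyzed_word in record_word


def association_degree(question_words: Iterable[str],
                       answer_words: Iterable[str],
                       knowledge_base: List[KnowledgeRecord],
                       weight: float = 1.0) -> float:
    """Sum the weighted relevance of the matching records per category and return the maximum."""
    q_words, a_words = list(question_words), list(answer_words)
    per_category: Dict[str, float] = defaultdict(float)
    for rec in knowledge_base:
        q_hit = any(matches(rec.question_word, q) for q in q_words)
        a_hit = any(matches(rec.answer_word, a) for a in a_words)
        if q_hit and a_hit:
            per_category[rec.category] += weight * rec.relevance
    # No matching record at all: treat the degree of association as 0 (an assumption).
    return max(per_category.values(), default=0.0)
```

The returned value is not normalized; mapping it into the range 0 to 1, as in the worked example of the description, would need an additional normalization step that is not sketched here.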
  • FIG. 8 illustrates a flow chart of a method of determining a crawl frequency of a network resource point, in accordance with one embodiment of the present invention.
  • the method includes the following steps S810, S820, and S830:
  • the plurality of question and answer pairs to be analyzed are captured from the network resource point.
  • the network resource point may be one whose crawl frequency specifically needs to be determined, for example a Q&A community; using a floor identification technique, the content posted by the original poster (i.e., the user who first posts about a question) is taken as the question, and the content of the replies on the 1st floor, 2nd floor, and so on (i.e., from the users who reply to the post in order) is taken as the answers, so as to extract the question and answer pairs to be analyzed.
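A minimal sketch of the floor-based extraction follows. It assumes the thread has already been parsed into an ordered list of post texts, with the original poster's content first and the replies after it in floor order; the function name `extract_qa_pairs` and this input format are hypothetical, and the page parsing itself is outside the sketch.

```python
from typing import List, Tuple


def extract_qa_pairs(posts: List[str]) -> List[Tuple[str, str]]:
    """Pair the original poster's question with each non-empty reply, in floor order."""
    if not posts:
        return []
    question = posts[0].strip()
    if not question:
        return []
    return [(question, reply.strip()) for reply in posts[1:] if reply.strip()]


if __name__ == "__main__":
    thread = ["What helps a child's cough?", "Try cold granules.", "   ", "See a doctor."]
    for q, a in extract_qa_pairs(thread):
        print(q, "->", a)
```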
  • the question and answer knowledge base may be used to analyze the question content and the answer content of each question and answer pair semantically and so obtain the degree of association of the question and answer pair to be analyzed; this evaluates the pair more effectively and is easier to implement.
  • step S820 of this embodiment is substantially the same as the method of obtaining the degree of association of a question and answer pair described above, and is not repeated here.
  • the question and answer knowledge base including a plurality of question and answer knowledge records is obtained by extracting a plurality of question and answer pairs from a webpage having a question and answer pair in advance, and constructing according to the extracted question and answer pairs.
  • the category corresponding to the question and answer pair is captured.
  • the question and answer knowledge record is constructed according to the question and answer pair and the category corresponding to the question and answer pair.
  • Each question and answer knowledge record in the obtained question and answer knowledge base corresponds to a category, which includes a question word (QW), an answer word (AW), and a semantic relevance between the question word and the answer word.
  • the semantic relevance between the question words and the answer words of the question and answer knowledge records can be obtained by learning from massive amounts of information; because the question and answer knowledge base is built from information extracted from web pages, the method applies more broadly and is more versatile.
  • the method of this embodiment further includes the step of constructing a question and answer knowledge base; the construction process is substantially the same as the process shown in FIG. 2, and the interpretation model of the question and answer knowledge base of this embodiment is substantially the same as the interpretation model shown in FIG. 3. They are not repeated here.
  • the quality of the network resource point can be determined from the degrees of association of the plurality of question and answer pairs to be analyzed, and the crawl frequency of the network resource point is determined accordingly.
  • the average of the degrees of association of the question and answer pairs to be analyzed may be used as the crawl frequency of the network resource point, so that a network resource point with a larger average (i.e., better quality) is crawled more often (for example, a spider crawler crawls that network resource point at a higher frequency). Alternatively, a spider crawler may be used to obtain an initial crawl frequency of the network resource point, the average of the degrees of association of the question and answer pairs to be analyzed is calculated, and the average is used to adjust the initial crawl frequency; for example, the initial crawl frequency is weighted by the average (including multiplication, normalization, etc.) to determine the crawl frequency of the network resource point, so that high-quality network resource points are crawled more frequently and search quality improves.
  • the degrees of association of the question and answer pairs captured from the network resource point are analyzed, and the crawl frequency of the network resource point is determined from those degrees of association, which can improve the accuracy of the crawl results.
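The adjustment of the crawl frequency can be sketched as below, assuming that the weighting is a plain multiplication of the initial crawl frequency by the average degree of association. The function names and the choice of crawls per day as the unit are hypothetical; the description also allows other weightings, such as normalization.

```python
from typing import Sequence


def mean_association(degrees: Sequence[float]) -> float:
    """Average degree of association of the pairs captured from one network resource point."""
    return sum(degrees) / len(degrees) if degrees else 0.0


def adjusted_crawl_frequency(initial_frequency: float, degrees: Sequence[float]) -> float:
    """Weight the initial crawl frequency (e.g. crawls per day) by the average degree of association.

    Assumption: the degrees lie in [0, 1], so the result never exceeds the initial frequency.
    """
    return initial_frequency * mean_association(degrees)


if __name__ == "__main__":
    # A resource point with better question and answer pairs keeps more of its initial frequency.
    print(adjusted_crawl_frequency(10.0, [0.9, 0.7]))
    print(adjusted_crawl_frequency(10.0, [0.2, 0.4]))
```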
  • the apparatus includes a question and answer knowledge base 910, a resource analysis unit 920, a calculation unit 930, and a capture frequency determining unit 940.
  • the Q&A knowledge base 910 is adapted to store a plurality of Q&A knowledge records.
  • the question and answer knowledge base 910 of the present embodiment can be constructed by crawling a large number of question and answer pairs in a web page.
  • the resource analysis unit 920 is adapted to capture a plurality of question and answer pairs to be analyzed from the network resource point.
  • the resource analysis unit 920 may target a network resource point whose crawl frequency specifically needs to be determined, for example a question and answer community; using a floor identification technique, the content posted by the original poster (i.e., the user who first posts about a question) is taken as the question, and the content of the replies on the 1st floor, 2nd floor, and so on (i.e., from the users who reply to the post in order) is taken as the answers, so as to extract the question and answer pairs to be analyzed.
  • the calculating unit 930 is adapted to obtain the degree of association of each question and answer pair to be analyzed according to the question and answer knowledge base.
  • the calculation unit 930 can use the question and answer knowledge base to analyze the question content and the answer content of each question and answer pair semantically and so obtain its degree of association; this evaluates the pair more effectively and is easy to implement.
  • the Q&A knowledge base 910 is constructed from a large number of high-quality Q&A pairs extracted from web pages and includes a plurality of Q&A knowledge records; the semantic relevance between the question words and answer words of those records can be obtained by learning from massive amounts of information.
  • the capture frequency determining unit 940 is adapted to determine a crawling frequency of the network resource point according to the correlation degree of the question and answer pair to be analyzed.
  • the quality of the network resource point can be determined from the degrees of association of the plurality of question and answer pairs to be analyzed, and the crawl frequency of the network resource point is determined accordingly.
  • the average of the degrees of association of the question and answer pairs to be analyzed may be used as the crawl frequency of the network resource point, so that a network resource point with a larger average (i.e., better quality) is crawled more often (for example, a spider crawler crawls that network resource point at a higher frequency). Alternatively, a spider crawler may be used to obtain an initial crawl frequency of the network resource point, the average of the degrees of association of the question and answer pairs to be analyzed is calculated, and the average is used to adjust the initial crawl frequency; for example, the initial crawl frequency is weighted by the average (including multiplication, normalization, etc.) to determine the crawl frequency of the network resource point, so that high-quality network resource points are crawled more frequently and search quality improves.
  • the apparatus further includes a question and answer knowledge base construction unit 950, which is adapted to extract a plurality of question and answer pairs in advance from web pages containing question and answer pairs and to construct, from the extracted pairs, a question and answer knowledge base including a plurality of question and answer knowledge records.
  • the Q&A knowledge base 910 already exists. Because the volume of information on real networks keeps growing and its content changes rapidly, the content of the Q&A knowledge base 910 often needs to be updated.
  • having the knowledge base construction unit 950 construct (or update) the Q&A knowledge base keeps its content current and reliable.
  • the question and answer knowledge base construction unit 950 of the present embodiment is the same as the question and answer knowledge base construction unit 400 shown in FIG. 5, and the description thereof will not be repeated here.
  • the calculation unit 930 in FIG. 9 specifically includes a word extraction subunit and an association degree calculation subunit (not shown).
  • the word extraction subunit is adapted to perform the word extraction operation on the question content and the answer content of the question and answer pair to be analyzed, and obtain at least one question word to be analyzed and at least one answer word to be analyzed.
  • the word extraction subunit is adapted to perform word segmentation, stop-word removal, word joining, and entity word extraction (e.g., nouns, verbs, etc.) on the question content and the answer content of the question and answer pair to be analyzed, to obtain at least one question word to be analyzed and at least one answer word to be analyzed.
  • the association degree calculation subunit is adapted to select at least one question and answer knowledge record from the question and answer knowledge base according to the question words and answer words to be analyzed, and to calculate the degree of association of the question and answer pair to be analyzed from the selected records.
  • the association degree calculation subunit is adapted to select the question and answer knowledge records whose question word matches a question word to be analyzed and whose answer word matches an answer word to be analyzed.
  • a question word matches a question word to be analyzed when the two are identical or the question word to be analyzed is a substring of the question word; an answer word matches an answer word to be analyzed when the two are identical or the answer word to be analyzed is a substring of the answer word. From the selected question and answer knowledge records that correspond to the same category, the degree of association of the question and answer pair to be analyzed is obtained for each category; more specifically, the semantic relevance values of the selected records of the same category are weighted (for example, with a weight of 1 or 100) and added, giving at least one degree of association (in this embodiment the number of degrees of association equals the number of categories involved). The maximum of these per-category degrees of association is taken as the degree of association of the question and answer pair to be analyzed.
  • the various component embodiments of the present invention may be implemented in hardware, or in a software module running on one or more processors, or in a combination thereof. It should be understood by those skilled in the art that a microprocessor or digital signal processor (DSP) can be used in practice to implement some or all of the functions of some or all of the components of the device for obtaining the degree of association of a question and answer pair, the device for optimizing the search ranking of a question and answer pair, and the device for determining the crawl frequency of a network resource point according to embodiments of the present invention.
  • the invention can also be implemented as a device or device program (e.g., a computer program and a computer program product) for performing some or all of the methods described herein. Such a program implementing the invention may be stored on a computer readable medium or may be in the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
  • FIG. 10 illustrates a server, such as an application server, that can implement the method of obtaining the degree of association of a question and answer pair, the method of optimizing the search ranking of a question and answer pair, and the method of determining the crawl frequency of a network resource point according to the present invention.
  • the application server conventionally includes a processor 1010 and a computer program product or computer readable medium in the form of a memory 1020.
  • the memory 1020 may be an electronic memory such as a flash memory, an EEPROM (Electrically Erasable Programmable Read Only Memory), an EPROM, a hard disk, or a ROM.
  • the memory 1020 has a storage space 1030 for program code 1031 for performing any of the above method steps.
  • for example, the storage space 1030 for program code may include individual program codes 1031 for implementing the respective steps of the above methods.
  • the program code can be read from or written to one or more computer program products.
  • These computer program products include program code carriers such as hard disks, compact disks (CDs), memory cards or floppy disks.
  • Such a computer program product is typically a portable or fixed storage unit as described with reference to FIG.
  • the storage unit may have a storage section, a storage space, and the like arranged similarly to the storage 1020 in the application server of FIG.
  • the program code can be compressed, for example, in an appropriate form.
  • the storage unit includes computer readable code 1131', i.e., code that can be read by a processor such as the processor 1010, which, when executed by a server, causes the server to perform each step of the methods described above.

Abstract

The invention relates to an apparatus and a method for obtaining the degree of association of a question and answer pair, a method for optimizing the search ranking of a question and answer pair, and an apparatus and a method for determining the crawl frequency of a network resource point. The method for obtaining the degree of association of a question and answer pair comprises the following steps: performing a word extraction operation on the question content and the answer content of a question and answer pair to be analyzed, to obtain at least one question word to be analyzed and at least one answer word to be analyzed; selecting at least one question and answer knowledge record from a question and answer knowledge base comprising a plurality of question and answer knowledge records, according to the question word to be analyzed and the answer word to be analyzed; and calculating the degree of association of the question and answer pair to be analyzed according to the selected question and answer knowledge record. With the apparatus and the method for obtaining the degree of association of a question and answer pair, the quality of the question and answer pair can be evaluated semantically and the evaluation is more effective; in addition, the apparatus and the method are easy to implement and broadly applicable.
PCT/CN2014/086838 2013-10-21 2014-09-18 Apparatus and method for obtaining the degree of association of a question and answer pair and for optimizing search ranking WO2015058604A1 (fr)

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
CN201310495881.4A CN103577558B (zh) 2013-10-21 2013-10-21 一种优化问答对的搜索排名的装置和方法
CN201310495856.6 2013-10-21
CN201310495881.4 2013-10-21
CN201310495641.4 2013-10-21
CN201310495641.4A CN103577556B (zh) 2013-10-21 2013-10-21 一种获取问答对的相关联程度的装置和方法
CN201310495856.6A CN103577557B (zh) 2013-10-21 2013-10-21 一种确定网络资源点的抓取频率的装置和方法

Publications (1)

Publication Number Publication Date
WO2015058604A1 true WO2015058604A1 (fr) 2015-04-30

Family

ID=52992233

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2014/086838 WO2015058604A1 (fr) 2013-10-21 2014-09-18 Apparatus and method for obtaining the degree of association of a question and answer pair and for optimizing search ranking

Country Status (1)

Country Link
WO (1) WO2015058604A1 (fr)

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9760627B1 (en) 2016-05-13 2017-09-12 International Business Machines Corporation Private-public context analysis for natural language content disambiguation
CN108717433A (zh) * 2018-05-14 2018-10-30 南京邮电大学 一种面向程序设计领域问答系统的知识库构建方法及装置
CN109934347A (zh) * 2017-12-18 2019-06-25 上海智臻智能网络科技股份有限公司 扩展问答知识库的装置
CN110019729A (zh) * 2017-12-25 2019-07-16 上海智臻智能网络科技股份有限公司 智能问答方法及存储介质、终端
CN110019838A (zh) * 2017-12-25 2019-07-16 上海智臻智能网络科技股份有限公司 智能问答系统及智能终端
US10361981B2 (en) 2015-05-15 2019-07-23 Microsoft Technology Licensing, Llc Automatic extraction of commitments and requests from communications and content
CN110334272A (zh) * 2019-05-29 2019-10-15 平安科技(深圳)有限公司 基于知识图谱的智能问答方法、装置及计算机存储介质
CN110580313A (zh) * 2018-06-08 2019-12-17 北京搜狗科技发展有限公司 一种数据处理方法、装置和用于数据处理的装置
CN111382235A (zh) * 2018-12-27 2020-07-07 上海智臻智能网络科技股份有限公司 一种问答知识库的优化方法及其装置
CN111552789A (zh) * 2020-04-27 2020-08-18 中国银行股份有限公司 一种客服知识库自学习方法及装置
CN111984768A (zh) * 2019-05-24 2020-11-24 北京京东尚科信息技术有限公司 语料处理及问答交互方法、装置、计算机设备及存储介质
US10984387B2 (en) 2011-06-28 2021-04-20 Microsoft Technology Licensing, Llc Automatic task extraction and calendar entry
CN113239164A (zh) * 2021-05-13 2021-08-10 杭州摸象大数据科技有限公司 多轮对话流程构建方法、装置、计算机设备及存储介质
CN113807512A (zh) * 2020-06-12 2021-12-17 株式会社理光 机器阅读理解模型的训练方法、装置及可读存储介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101520802A (zh) * 2009-04-13 2009-09-02 腾讯科技(深圳)有限公司 一种问答对的质量评价方法和系统
CN101986293A (zh) * 2010-09-03 2011-03-16 百度在线网络技术(北京)有限公司 用于在搜索界面中呈现搜索答案信息的方法及设备
US20120078826A1 (en) * 2010-09-29 2012-03-29 International Business Machines Corporation Fact checking using and aiding probabilistic question answering
US8346701B2 (en) * 2009-01-23 2013-01-01 Microsoft Corporation Answer ranking in community question-answering sites
CN102884527A (zh) * 2010-04-06 2013-01-16 新加坡国立大学 根据基于社区的问题回答档案库的自动常问问题汇编

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8346701B2 (en) * 2009-01-23 2013-01-01 Microsoft Corporation Answer ranking in community question-answering sites
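CN101520802A (zh) * 2009-04-13 2009-09-02 腾讯科技(深圳)有限公司 Method and system for evaluating the quality of question-answer pairs
CN102884527A (zh) * 2010-04-06 2013-01-16 新加坡国立大学 Automatic compilation of frequently asked questions from a community-based question answering archive
CN101986293A (zh) * 2010-09-03 2011-03-16 百度在线网络技术(北京)有限公司 Method and device for presenting search answer information in a search interface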
US20120078826A1 (en) * 2010-09-29 2012-03-29 International Business Machines Corporation Fact checking using and aiding probabilistic question answering

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10984387B2 (en) 2011-06-28 2021-04-20 Microsoft Technology Licensing, Llc Automatic task extraction and calendar entry
US10361981B2 (en) 2015-05-15 2019-07-23 Microsoft Technology Licensing, Llc Automatic extraction of commitments and requests from communications and content
US9760627B1 (en) 2016-05-13 2017-09-12 International Business Machines Corporation Private-public context analysis for natural language content disambiguation
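CN109934347A (zh) * 2017-12-18 2019-06-25 上海智臻智能网络科技股份有限公司 Device for expanding a question-and-answer knowledge base
CN109934347B (zh) * 2017-12-18 2024-02-02 上海智臻智能网络科技股份有限公司 Device for expanding a question-and-answer knowledge base
CN110019729A (zh) * 2017-12-25 2019-07-16 上海智臻智能网络科技股份有限公司 Intelligent question answering method, storage medium, and terminal
CN110019838A (zh) * 2017-12-25 2019-07-16 上海智臻智能网络科技股份有限公司 Intelligent question answering system and intelligent terminal
CN110019729B (zh) * 2017-12-25 2024-03-15 上海智臻智能网络科技股份有限公司 Intelligent question answering method, storage medium, and terminal
CN108717433A (zh) * 2018-05-14 2018-10-30 南京邮电大学 Knowledge base construction method and device for a question answering system in the programming domain
CN110580313B (zh) * 2018-06-08 2024-02-02 北京搜狗科技发展有限公司 Data processing method, device, and device for data processing
CN110580313A (zh) * 2018-06-08 2019-12-17 北京搜狗科技发展有限公司 Data processing method, device, and device for data processing
CN111382235A (zh) * 2018-12-27 2020-07-07 上海智臻智能网络科技股份有限公司 Method and device for optimizing a question-and-answer knowledge base
CN111984768A (zh) * 2019-05-24 2020-11-24 北京京东尚科信息技术有限公司 Corpus processing and question-answer interaction method, device, computer equipment, and storage medium
CN110334272B (zh) * 2019-05-29 2022-04-12 平安科技(深圳)有限公司 Knowledge-graph-based intelligent question answering method, device, and computer storage medium
CN110334272A (zh) * 2019-05-29 2019-10-15 平安科技(深圳)有限公司 Knowledge-graph-based intelligent question answering method, device, and computer storage medium
CN111552789A (zh) * 2020-04-27 2020-08-18 中国银行股份有限公司 Self-learning method and device for a customer service knowledge base
CN113807512A (zh) * 2020-06-12 2021-12-17 株式会社理光 Training method and device for a machine reading comprehension model, and readable storage medium
CN113807512B (zh) * 2020-06-12 2024-01-23 株式会社理光 Training method and device for a machine reading comprehension model, and readable storage medium
CN113239164B (zh) * 2021-05-13 2023-07-04 杭州摸象大数据科技有限公司 Multi-turn dialogue flow construction method, device, computer equipment, and storage medium
CN113239164A (zh) * 2021-05-13 2021-08-10 杭州摸象大数据科技有限公司 Multi-turn dialogue flow construction method, device, computer equipment, and storage medium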

Similar Documents

Publication Publication Date Title
WO2015058604A1 (fr) Apparatus and method for obtaining the degree of association of a question-answer pair and optimizing search ranking
US10831769B2 (en) Search method and device for asking type query based on deep question and answer
US9558264B2 (en) Identifying and displaying relationships between candidate answers
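CN103577558B (zh) Apparatus and method for optimizing the search ranking of question-answer pairs
JP7153004B2 (ja) Method, device, computer equipment, and storage medium for verifying community question answering data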
Hartawan et al. Using vector space model in question answering system
US20160019293A1 (en) Interpreting and Distinguishing Lack of an Answer in a Question Answering System
CN107193796B (zh) Public opinion event detection method and device
US8825620B1 (en) Behavioral word segmentation for use in processing search queries
US20180204106A1 (en) System and method for personalized deep text analysis
CN104376115B (zh) Method and device for determining fuzzy words based on global search
US20150206101A1 (en) System for determining infringement of copyright based on the text reference point and method thereof
WO2020074017A1 (fr) Deep-learning-based method and device for screening keywords in a medical document
CN107784069B (zh) Method for intelligently diagnosing students' knowledge ability
CN108280081B (zh) Method and device for generating a web page
US20190294705A1 (en) Image annotation
CN103577557A (zh) Apparatus and method for determining the crawl frequency of network resource points
WO2017000659A1 (fr) Method and apparatus for identifying an enriched uniform resource locator (URL)
CN109033318A (zh) Intelligent question answering method and device
US10783140B2 (en) System and method for augmenting answers from a QA system with additional temporal and geographic information
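CN113010639A (zh) Product analysis method and device based on an e-commerce platform
CN117454217A (zh) Method, device, and system for recognizing depressive mood based on deep ensemble learning
WO2019192122A1 (fr) Method for extracting document topic parameters, product recommendation method and device, and storage medium
CN113569044B (zh) Method for classifying web page text content based on natural language processing technology
CN104933097A (zh) Data processing method and device for retrieval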

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application. Ref document number: 14856111; Country of ref document: EP; Kind code of ref document: A1

NENP Non-entry into the national phase. Ref country code: DE

122 EP: PCT application non-entry in European phase. Ref document number: 14856111; Country of ref document: EP; Kind code of ref document: A1