CN107168967B - Target knowledge point acquisition method and device

Info

Publication number: CN107168967B
Application number: CN201610127893.5A
Authority: CN
Inventors: 张云; 魏洪平; 阮征
Original assignee: Advanced New Technologies Co Ltd
Current assignee: Advanced New Technologies Co Ltd; Advantageous New Technologies Co Ltd
Priority date: 2016-03-07
Filing date: 2016-03-07
Publication date: 2020-12-04
Anticipated expiration: 2036-03-07
Also published as: CN107168967A

Abstract

The embodiments of the application relate to a method and a device for acquiring a target knowledge point. The method comprises: collecting the content of multiple sessions between a user and a question-answering robot, wherein the content of each session comprises at least one question-answer pair, each question-answer pair comprises a question and at least one knowledge point, the knowledge point is an answer corresponding to the question, and each knowledge point belongs to a preset category; determining, according to a preset algorithm, a matching value between each knowledge point in the content of the multiple sessions and its corresponding question, and taking the matching value as a first score value of the knowledge point; and acquiring a target knowledge point from the plurality of knowledge points in the content of the multiple sessions according to the first score value of each knowledge point. Because the application takes into account how well the knowledge points actually answer the users' questions, the accuracy of acquiring the target knowledge point can be improved, and the user's need to quickly learn about popular knowledge points can be met.

Description

Target knowledge point acquisition method and device

Technical Field

The present application relates to the field of computer technologies, and in particular, to a method and an apparatus for acquiring a target knowledge point.

Background

Knowledge points may refer to hot topics on the Internet, and can generally be pushed to a user before any input from the user has been received. For example, in an automatic question-answering platform, after the chat window of the platform is opened, the titles of knowledge points can be displayed in the chat window. If the knowledge points contain content that the user wants to know, the user only needs to click the title of the corresponding knowledge point to follow the link to its content. The user can thus conveniently learn about hot topics on the Internet, which improves the user experience.

In the prior art, a target knowledge point (e.g., a hot topic) is acquired as follows: the number of clicks on each of a plurality of knowledge points within a preset time period is counted, and a target knowledge point is then acquired from the plurality of knowledge points according to the click counts. However, this method does not consider how well the knowledge points answer the users' questions. A user may click a knowledge point by mistake or merely out of curiosity, and such clicks distort the click statistics and therefore reduce the accuracy of acquiring the target knowledge point. As a result, the recommended target knowledge points deviate from what the user actually needs, and the user has to type in a question manually to obtain an answer. The process of learning about hot knowledge points becomes cumbersome, and the user's need to learn about them quickly cannot be met.

Disclosure of Invention

The embodiment of the application provides a method and a device for acquiring a target knowledge point, which can improve the accuracy of acquiring the target knowledge point, so that the requirement that a user wants to quickly know the popular knowledge point can be met.

In a first aspect, a method for acquiring a target knowledge point is provided, where the method includes:

collecting multiple conversation contents of a user and a question-answering robot, wherein each conversation content comprises at least one question-answering pair, each question-answering pair comprises a question and at least one knowledge point, the knowledge points are answers corresponding to the question, and each knowledge point belongs to a preset category;

determining, according to a preset algorithm, a matching value between each knowledge point in the content of the multiple sessions and its corresponding question, and taking the matching value as a first score value of each knowledge point;

and acquiring a target knowledge point from a plurality of knowledge points in the multi-session content according to the first score value of each knowledge point.

In a second aspect, a method for acquiring a target knowledge point is provided, and the method includes:

collecting multiple conversation contents of a user and a question-answering robot, wherein each conversation content comprises at least one question-answer pair, each question-answer pair comprises a question and at least one knowledge point, and the knowledge point refers to an answer corresponding to the question; each knowledge point belongs to a preset category and corresponds to an initial score value;

for each knowledge point in each conversation content, determining a category attenuation factor of each knowledge point according to the category to which the knowledge point belongs and the category to which the previous knowledge point of the knowledge point belongs;

updating the initial score value of each knowledge point according to the category attenuation factor of the knowledge point and the category attenuation factors of the subsequent knowledge points of the knowledge point to obtain a second score value of each knowledge point in the multi-session content;

and acquiring a target knowledge point from a plurality of knowledge points in the multi-session content according to the second score value of each knowledge point.

In a third aspect, a method for acquiring a target knowledge point is provided, where the method includes:

for each knowledge point in each conversation content, determining a time attenuation factor of each knowledge point according to the current time, the creation time of a question-answer pair to which the knowledge point belongs and a preset threshold value;

updating the initial score value of each knowledge point according to the time attenuation factor of each knowledge point to obtain a third score value of each knowledge point in the multi-session content;

and acquiring a target knowledge point from a plurality of knowledge points in the multi-session content according to the third score value of each knowledge point.

In a fourth aspect, an apparatus for acquiring a target knowledge point is provided, the apparatus including:

a collecting unit, used for collecting the content of multiple sessions between a user and a question-answering robot, wherein each conversation content comprises at least one question-answer pair, each question-answer pair comprises a question and at least one knowledge point, the knowledge points refer to answers corresponding to the question, and each knowledge point belongs to a preset category;

a determining unit, used for determining, according to a preset algorithm, a matching value between each knowledge point in the multi-session content collected by the collecting unit and its corresponding question, and taking the matching value as a first score value of each knowledge point;

and an acquisition unit, used for acquiring a target knowledge point from a plurality of knowledge points in the multi-session content according to the first score value of each knowledge point determined by the determining unit.

In a fifth aspect, an apparatus for acquiring a target knowledge point is provided, the apparatus comprising:

a collecting unit, used for collecting the content of multiple sessions between a user and a question-answering robot, wherein each conversation content comprises at least one question-answer pair, each question-answer pair comprises a question and at least one knowledge point, and the knowledge point refers to an answer corresponding to the question; each knowledge point belongs to a preset category and corresponds to an initial score value;

the determining unit is used for determining a category attenuation factor of each knowledge point in each conversation content according to the category to which the knowledge point belongs and the category to which the previous knowledge point of the knowledge point belongs;

the updating unit is used for updating the initial score value of each knowledge point according to the category attenuation factor of the knowledge point determined by the determining unit and the category attenuation factors of the subsequent knowledge points of the knowledge point to obtain a second score value of each knowledge point in the multi-session content;

and the acquisition unit is used for acquiring a target knowledge point from a plurality of knowledge points in the multi-session content according to the second score value of each knowledge point updated by the updating unit.

In a sixth aspect, an apparatus for acquiring a target knowledge point is provided, the apparatus comprising:

the determining unit is used for determining a time attenuation factor of each knowledge point in each conversation content according to the current time, the creation time of a question-answer pair to which the knowledge point belongs and a preset threshold;

the updating unit is used for updating the initial score value of each knowledge point according to the time attenuation factor of each knowledge point determined by the determining unit to obtain a third score value of each knowledge point in the multi-session content;

and the acquisition unit is used for acquiring a target knowledge point from a plurality of knowledge points in the multi-session content according to the third score value of each knowledge point updated by the updating unit.

The method and the device for acquiring the target knowledge point provided by the application collect the content of multiple sessions between a user and a question-answering robot, wherein the content of each session comprises at least one question-answer pair, each question-answer pair comprises a question and at least one knowledge point, the knowledge point refers to an answer corresponding to the question, and each knowledge point belongs to a preset category; determine, according to a preset algorithm, a matching value between each knowledge point in the content of the multiple sessions and its corresponding question, and take the matching value as a first score value of the knowledge point; and acquire a target knowledge point from the plurality of knowledge points in the content of the multiple sessions according to the first score value of each knowledge point. Because the application takes into account how well the knowledge points actually answer the users' questions, the accuracy of acquiring the target knowledge point can be improved, and the user's need to quickly learn about popular knowledge points can be met.

Drawings

Fig. 1 is a flowchart of a method for acquiring a target knowledge point according to an embodiment of the present application;

Fig. 2 is a flowchart of a method for acquiring a target knowledge point according to another embodiment of the present application;

Fig. 3 is a flowchart of a method for acquiring a target knowledge point according to yet another embodiment of the present application;

Fig. 4 is a flowchart of a method for acquiring a target knowledge point according to another embodiment of the present application;

Fig. 5 is a schematic diagram of an apparatus for acquiring a target knowledge point according to an embodiment of the present application;

Fig. 6 is a schematic diagram of an apparatus for acquiring a target knowledge point according to another embodiment of the present application;

Fig. 7 is a schematic diagram of an apparatus for acquiring a target knowledge point according to still another embodiment of the present application.

Detailed Description

Embodiments of the present invention will be described below with reference to the accompanying drawings.

The method and the device for acquiring the target knowledge point are suitable for scenarios in which hot knowledge points on the Internet are acquired, and are particularly suitable for scenarios in which hot knowledge points in an automatic question-answering platform are acquired. The automatic question-answering platform mainly answers user questions through a question-answering robot with an instant messaging function; the question-answering robot answers user questions received in real time on the basis of having learned a large number of pre-collected question-answer corpora. The platform may have a certain business background. For example, it may be the automatic question-answering platform in the Alipay system, or the automatic question-answering platform on Taobao.

The knowledge point of the application can be understood as the answer pushed by the automatic question and answer platform according to the user question. In an automated question-and-answer platform with a business context, knowledge points generally have the following three characteristics:

1) Long-tail effect: the click counts of different knowledge points follow a long-tail distribution; hot knowledge points attract most of the clicks, while cold knowledge points still receive a small number of clicks;

2) Timeliness: some knowledge points are time-sensitive; they are clicked heavily in a specific time period, and their click counts drop markedly in other periods;

3) Relevance: different knowledge points are correlated; when some knowledge points are clicked heavily, knowledge points with similar topics or business backgrounds also receive more clicks.

If the statistics of hot knowledge points are carried out manually, the following problems can exist:

1) Labor cost: user sessions involve a large amount of data, so relying on manual work creates a heavy workload;

2) Subjectivity: manual work is inevitably colored by subjective judgment, so the hot knowledge points it produces may not truly reflect what users need;

3) Lag: manual statistics take a long time, so the resulting knowledge points always lag behind the current hot topics and cannot reflect the timeliness of the knowledge points;

4) Bias: manual statistics can hardly avoid the correlation among knowledge points, so the resulting knowledge points tend to share the same or similar topic backgrounds, and coverage differs greatly across services.

In view of the above characteristics of knowledge points and the problems of manual statistics, acquiring popular knowledge points by technical means has practical value.

Fig. 1 is a flowchart of a method for acquiring a target knowledge point according to an embodiment of the present application. The method may be executed by a device with processing capability. As shown in Fig. 1, the method may specifically include:

Step 110: collecting the content of multiple sessions between a user and a question-answering robot.

Wherein, each conversation content comprises at least one question-answer pair, each question-answer pair comprises a question and at least one knowledge point, the knowledge point is the answer corresponding to the question, and each knowledge point belongs to a preset category.

Here, the question-answering robot refers to a machine model in an automatic question-answering platform that can answer received user questions in real time. It may be obtained by learning a large number of pre-collected question-answer pairs (each including a question and an answer); examples include Alibaba Group's Xiaomi (小蜜) assistant and Baidu's Dumi (度秘). Specifically, the server may collect in advance, from a background database of the automatic question-answering platform, the content of multiple sessions between users and the question-answering robot within a preset time period (e.g., 01:00-24:00).

In the present application, a session starts when the user opens a chat window in the automatic question-answering platform and ends when the user closes the chat window. Between the opening and the closing of the chat window, the user chats with the question-answering robot at least once. Each chat forms a chat record, which may also be called a question-answer pair. A question-answer pair may include a question asked by the user and one knowledge point pushed by the question-answering robot, or a question asked by the user and a plurality of knowledge points pushed by the question-answering robot. Specifically, if the question-answering robot can accurately locate the user question, that is, if it has learned the question-answer corpus related to the user question in advance, it may push one knowledge point corresponding to the user question to the user; otherwise, it may push a plurality of knowledge points to the user.

It should be noted that, in the present application, the category to which each collected knowledge point belongs may be set in advance. Here, the number of preset categories may be one or more; for example, when the number of preset categories is two, the two categories may be a first-level sub-category and a second-level sub-category. For example, assuming that a knowledge point in the Alipay system is "Yu'e Bao income calculation", the first-level sub-category to which the knowledge point belongs may be "Yu'e Bao", and the second-level sub-category to which it belongs may be "Yu'e Bao income".

Optionally, after the plurality of knowledge points in the content of the multiple sessions are collected, the initial score value of each knowledge point may be set to 1.
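The collected data described above can be pictured with a small data model. The following is a minimal sketch rather than the structure used by the application: the class names, field names and the sample record are all hypothetical, and the time unit of `created_at` is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class KnowledgePoint:
    title: str
    # Category path, from the first-level sub-category downwards.
    category: List[str]
    # Optional initial score value; the text suggests 1 as a starting point.
    score: float = 1.0


@dataclass
class QAPair:
    question: str
    # One knowledge point when the robot locates the question precisely,
    # several knowledge points otherwise.
    knowledge_points: List[KnowledgePoint]
    created_at: float  # creation time of the pair (unit is an assumption)


@dataclass
class Session:
    qa_pairs: List[QAPair] = field(default_factory=list)

    def knowledge_points(self) -> List[KnowledgePoint]:
        """Knowledge points in the order in which they appear in the session."""
        return [kp for pair in self.qa_pairs for kp in pair.knowledge_points]


# Hypothetical sample record mirroring the category example in the text.
session = Session([
    QAPair(
        question="How is Yu'e Bao income calculated?",
        knowledge_points=[
            KnowledgePoint("Yu'e Bao income calculation",
                           ["Yu'e Bao", "Yu'e Bao income"]),
        ],
        created_at=0.0,
    ),
])
print(session.knowledge_points()[0].score)
```

Keeping the category as an ordered list of levels makes the later comparison of adjacent knowledge points straightforward.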

In one example, the multiple session content collected may be as shown in table 1.

TABLE 1

For brevity, Table 1 lists only a few examples; in practice, the collected session content may contain tens of thousands of knowledge points.

Step 120: determining, according to a preset algorithm, a matching value between each knowledge point in the content of the multiple sessions and its corresponding question, and taking the matching value as a first score value of each knowledge point.

Here, the preset algorithm may be, for example, a word frequency position weighted ranking algorithm. The matching value between a knowledge point and its corresponding question determined by the preset algorithm can be used to measure the correlation between the two. Specifically, when the matching value is large, the correlation between the knowledge point and the corresponding question is high; when the matching value is small, the correlation is weak. The process of obtaining the matching value is a straightforward application of the preset algorithm and is not described in detail here.

After the matching value between each knowledge point and its corresponding question is determined, the matching value may be taken directly as the first score value of the knowledge point; alternatively, when an initial score value has been set for each knowledge point, the initial score value is updated to the matching value.

It should be noted that step 120 mainly addresses the problem that, in the prior art, the correlation between a question and a knowledge point is not considered when a knowledge point is classified, which reduces the accuracy of acquiring the target knowledge point.

Step 130: acquiring a target knowledge point from the plurality of knowledge points in the content of the multiple sessions according to the first score value of each knowledge point.

Specifically, after the first score values of the knowledge points are determined, the knowledge points may be sorted by first score value from largest to smallest, and the top K knowledge points may be taken as target knowledge points. It should be noted that, when the same knowledge point appears in the content of different sessions, its final score value may be obtained by summing, averaging, weighted averaging, or Bayesian scoring of its first score values.
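Step 130 can be sketched as follows. This is an illustration under stated assumptions: the function name `top_k`, the string identifiers and the sample scores are hypothetical, and only the "sum" and "mean" aggregation options mentioned in the text are shown.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def top_k(scored: Iterable[Tuple[str, float]], k: int,
          aggregate: str = "sum") -> List[Tuple[str, float]]:
    """Merge the scores of each knowledge point, then return the K highest."""
    buckets: Dict[str, List[float]] = defaultdict(list)
    for kp_id, score in scored:
        buckets[kp_id].append(score)

    if aggregate == "sum":
        final = {kp: sum(values) for kp, values in buckets.items()}
    elif aggregate == "mean":
        final = {kp: sum(values) / len(values) for kp, values in buckets.items()}
    else:
        raise ValueError(f"unsupported aggregate: {aggregate}")

    # Sort from largest to smallest score; ties are broken by identifier
    # so that the result is deterministic.
    ranked = sorted(final.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]


# Hypothetical (knowledge point, first score) observations from several sessions.
observations = [("kp_a", 0.9), ("kp_b", 0.4), ("kp_a", 0.7), ("kp_c", 0.8)]
print(top_k(observations, k=2))
```

Summing favors knowledge points that appear often, whereas averaging favors those that match well each time they appear; which is preferable depends on how "hot" is meant to be interpreted.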

It should be noted that the knowledge points collected in the present application have category attributes, and these categories can be regarded as the topics of the knowledge points; meanwhile, the chat records within one session of a user depend on one another to some degree and have similar topics. When the category of the knowledge points in a session jumps, this indicates one of the following: 1) the previously returned knowledge point did not match the user's intention; 2) the user clicked a recommended knowledge point at random. Therefore, the category jumps of subsequent knowledge points in a session can be used as one of the factors for measuring the score value of a preceding knowledge point; that is, the score value of the preceding knowledge point is attenuated as the category jumps, which can improve the accuracy of acquiring the target knowledge point. Accordingly, before step 130 is executed in the embodiment of the present application, the following steps may also be executed:

Step A: for each knowledge point in the content of each session, determining a category attenuation factor of the knowledge point according to the category to which the knowledge point belongs and the category to which the previous knowledge point belongs.

The process of determining the category attenuation factor of a knowledge point in the content of one session is described below.

Taking the automatic question-answering platform in the Alipay system as an example, the pre-collected content of one session may be as follows:

Question 1: What are the rules for the red envelopes?

Knowledge point 1: Alipay gives out 800 million in cash red envelopes during the Spring Festival Gala red envelope event, and each of the first four rounds of luck-draw red envelopes exceeds 100 million in cash.

Question 2: Why is there no Dedication fortune?

Knowledge point 2: The five-fortune event has ended. To date, more than 780,000 people have collected all five fortunes, …, and the game strength is strong. Thank you for participating in Alipay's red envelope grab.

Question 3: I have collected all five fortunes, so why have I not received a share of the cash?

Knowledge point 3: After the event ended, an accurate count showed 780,000 people, and 215 million in cash is now being paid out. Because this is currently a peak period for cash disbursement, payment may be delayed; please wait patiently.

…

In the above example, "knowledge point 1" is the previous knowledge point of "knowledge point 2". It is understood that "knowledge point 1" has no previous knowledge point.

In one example, a category decay factor for a knowledge point in the content of a session may be determined according to equation 1:

(formula 1)

Here, jump_j is the category attenuation factor of a knowledge point in the content of one session. Taking this knowledge point to be the i-th knowledge point, j denotes the position at which the i-th knowledge point appears in the session content; in the preceding example, "knowledge point 3" is third in the session content, so j is 3. cat_j is the category to which the knowledge point at position j in the session content belongs, i.e., the category of the i-th knowledge point; cat_(j-1) is the category to which the knowledge point at position j-1 in the session content belongs, i.e., the category of the knowledge point immediately preceding the i-th knowledge point; LCP(cat_(j-1), cat_j) is the longest common prefix of the categories of the knowledge points at positions j and j-1 in the session content; and minLen(cat_(j-1), cat_j) is the minimum string length of the categories of the knowledge points at positions j and j-1 in the session content.

The longest common prefix and the minimum string length are illustrated by the following examples. Take knowledge point 1 and knowledge point 2 in session content 1 in Table 1: both knowledge points have three levels of sub-categories, their first-level and second-level sub-categories are the same, and their third-level sub-categories differ, so the longest common prefix of the two knowledge points is "Yu'e Bao + Yu'e Bao income", which is eight characters long in the original Chinese (余额宝 + 余额宝收益); that is, LCP(knowledge point 1, knowledge point 2) = 8. Take knowledge point 2 and knowledge point 3 in session content 1 in Table 1: knowledge point 2 has three levels of sub-categories while knowledge point 3 has only two, and their first two levels are the same, so the shorter category string of the two is "Yu'e Bao + Yu'e Bao income"; that is, minLen(knowledge point 2, knowledge point 3) = 8.

It should be noted that, when a knowledge point is ranked first in the content of a session, that is, when j is 1, the knowledge point has no previous knowledge point in the session content, so its category attenuation factor in the session content may be set to a threshold whose value lies in the range [0, 1].

By the above method for determining the category attenuation factor of one knowledge point in the content of one session, the category attenuation factor of each knowledge point in the content of the multiple sessions can be determined. For example, according to formula 1, the category attenuation factors of knowledge point 1, knowledge point 2 and knowledge point 3 in session content 1 in Table 1 can be obtained, as can the category attenuation factor of knowledge point 4 in session content 2, and so on. Of course, in practical application, neither knowledge point 1 nor knowledge point 4 has a previous knowledge point, so formula 1 cannot be applied to them and their category attenuation factors are set to the threshold; that is, when a knowledge point is ranked first in the content of a session, its category attenuation factor in that session is set to the threshold.

Step B: updating the first score value of each knowledge point according to the category attenuation factor of the knowledge point and the category attenuation factors of the subsequent knowledge points, to obtain the second score value of each knowledge point in the content of the multiple sessions.
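The body of formula 1 is not reproduced in this text, so the following sketch rests on an assumption: it takes the category attenuation factor to be the ratio LCP / minLen of the two quantities defined above, which yields 1 when the category does not jump and 0 when it jumps completely. The function names, the threshold value and the category names are hypothetical, and the common prefix is measured over whole sub-category levels, following the worked example.

```python
from typing import List, Optional

# Hypothetical threshold in [0, 1] for a knowledge point with no predecessor.
HEAD_THRESHOLD = 1.0


def lcp_length(prev_cat: List[str], cat: List[str]) -> int:
    """Character length of the leading sub-category levels shared by both paths."""
    length = 0
    for a, b in zip(prev_cat, cat):
        if a != b:
            break
        length += len(a)
    return length


def min_length(prev_cat: List[str], cat: List[str]) -> int:
    """Length in characters of the shorter of the two category strings."""
    return min(sum(map(len, prev_cat)), sum(map(len, cat)))


def category_decay(prev_cat: Optional[List[str]], cat: List[str]) -> float:
    """ASSUMED form of formula 1: LCP / minLen; threshold when there is no predecessor."""
    if prev_cat is None:
        return HEAD_THRESHOLD
    shortest = min_length(prev_cat, cat)
    return lcp_length(prev_cat, cat) / shortest if shortest else 0.0


# Hypothetical category paths; the first two levels total 8 characters,
# matching the LCP = 8 and minLen = 8 values in the worked example.
kp2 = ["余额宝", "余额宝收益", "收益计算"]
kp3 = ["余额宝", "余额宝收益"]
other = ["红包", "红包规则"]

print(category_decay(kp2, kp3))    # same topic
print(category_decay(kp3, other))  # complete category jump
print(category_decay(None, kp2))   # first knowledge point in a session
```

Under this assumed form a complete category jump drives the factor to 0, which would also zero the scores of all preceding knowledge points in the session once the factors are multiplied together.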

In one example, a second score value for a knowledge point in the content of a session may be obtained according to equation 2.

Score₂(i) = Score₁(i) × jump_j × jump_(j+1) × … × jump_M    (formula 2)

Here, Score₂(i) is the second score value of the i-th knowledge point, Score₁(i) is the first score value of the i-th knowledge point, M is the number of knowledge points in the content of one session, and jump_n is the category attenuation factor of the n-th knowledge point in the content of the session. Written out, the formula is Score₂(i) = Score₁(i) × jump_j × jump_(j+1) × … × jump_M, where j is the position of the i-th knowledge point in the session content. That is, when the second score value of the i-th knowledge point is calculated, the category jumps of the subsequent knowledge points in the same session (those at positions j+1, j+2, …, M) are also taken into account, which adds a further factor for measuring target knowledge points.

According to the method for determining the second score value of one knowledge point in the one-time conversation content, the second score value of each knowledge point in the multi-time conversation content can be determined. According to the formula 2, second score values of the knowledge point 1, the knowledge point 2 and the knowledge point 3 in the session content 1 in the table 1 can be obtained; and a second score value for the knowledge point 4 in the session content 2, etc. may be obtained.
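The product described above can be computed for a whole session in a single backward pass. The sketch below assumes that the first score values and the category attenuation factors are supplied as two lists in session order; the function name and the sample numbers are hypothetical.

```python
from typing import List


def second_scores(first_scores: List[float], jumps: List[float]) -> List[float]:
    """Score2 at position j = Score1 * jump_j * jump_(j+1) * ... * jump_M."""
    if len(first_scores) != len(jumps):
        raise ValueError("one attenuation factor is needed per knowledge point")
    result = [0.0] * len(first_scores)
    suffix = 1.0
    # Walk backwards so that each suffix product is built incrementally.
    for idx in range(len(first_scores) - 1, -1, -1):
        suffix *= jumps[idx]
        result[idx] = first_scores[idx] * suffix
    return result


# Hypothetical session of three knowledge points; the category jumps
# at the third one, so all three scores are halved.
scores = second_scores([0.9, 0.8, 0.7], [1.0, 1.0, 0.5])
print(scores)
```

Because each factor lies in [0, 1], a jump late in the session attenuates every knowledge point before it, which is the intended penalty for answers that did not settle the user's question.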

After obtaining the second score values for the respective knowledge points, step 130 may be replaced by step C: and acquiring the target knowledge point from a plurality of knowledge points in the multi-session content according to the second score value of each knowledge point.

Here, the process of obtaining the target knowledge point according to the second score value of each knowledge point is similar to the process of obtaining the target knowledge point according to the first score value of each knowledge point in step 130, and the details of the process are not repeated herein.

Because hot knowledge points are time-sensitive, simple counting uses too coarse a statistical granularity and buries the timeliness of the knowledge points. Therefore, in the embodiment of the application, a time attenuation factor may be considered when the final score value of each knowledge point is determined, so that recently clicked knowledge points carry a higher weight. Accordingly, before step C in the present embodiment is executed, the following steps may also be executed:

Step X: for each knowledge point in the content of each session, determining the time attenuation factor of the knowledge point according to the current time, the creation time of the question-answer pair to which the knowledge point belongs, and a preset threshold.

In one example, the time attenuation factor of each knowledge point in the content of one session may be obtained according to formula 3.

timeFactor_i = (currTime - chatTime_i + 2)^G    (formula 3)

Here, timeFactor_i is the time attenuation factor of the i-th knowledge point in the content of one session, currTime is the current time, chatTime_i is the creation time of the question-answer pair to which the i-th knowledge point in the content of the session belongs, and G is a preset threshold, which may also be called a gravity factor.

According to formula 3, the time attenuation factors of knowledge point 1, knowledge point 2 and knowledge point 3 in session content 1 in table 1 can be obtained; and the time decay factor etc. of the knowledge point 4 in the session content 2 can be obtained.

Step Y: updating the second score value of each knowledge point according to the time decay factor of the knowledge point, to obtain the third score value of each knowledge point in the multi-session content.

In one example, the third score value of a knowledge point in one session content may be obtained according to formula 4.

$$\mathrm{Score}_3(i) = \mathrm{Score}_2(i) \times \mathrm{timeFactor}_i \qquad \text{(formula 4)}$$

Where $\mathrm{Score}_3(i)$ is the third score value of the i-th knowledge point, $\mathrm{Score}_2(i)$ is the second score value of the i-th knowledge point, and $\mathrm{timeFactor}_i$ is the time decay factor of the i-th knowledge point in one session content.

By applying the above method for obtaining the third score value of one knowledge point in one session content, the third score value of each knowledge point in the multi-session content can be obtained. For example, according to formula 4, the third score values of knowledge point 1, knowledge point 2 and knowledge point 3 in session content 1 in Table 1 can be obtained, as can the third score value of knowledge point 4 in session content 2, and so on.
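Step Y amounts to an element-wise multiplication of each second score value by its time decay factor. The following minimal sketch assumes the scores and factors are held in dictionaries keyed by a knowledge point identifier; the identifiers and all numeric values are hypothetical.

```python
def third_scores(second_scores: dict[str, float],
                 time_factors: dict[str, float]) -> dict[str, float]:
    """Formula 4: Score3(i) = Score2(i) * timeFactor_i for every knowledge point."""
    return {kp: score * time_factors[kp] for kp, score in second_scores.items()}


# Hypothetical values for the three knowledge points of one session content.
second = {"kp1": 0.8, "kp2": 0.6, "kp3": 0.9}
factors = {"kp1": 0.5, "kp2": 0.25, "kp3": 1.0}
third = third_scores(second, factors)
```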

After the third score value of each knowledge point is obtained, step C may be replaced by step Z: acquiring the target knowledge point from the plurality of knowledge points in the multi-session content according to the third score value of each knowledge point.

Here, the process of obtaining the target knowledge point according to the third score value of each knowledge point is similar to the process of obtaining the target knowledge point according to the first score value of each knowledge point in step 130, and the details of the process are not repeated herein.

Knowledge points of different categories differ in both average score and number of clicks. If hot knowledge points are ranked by average score or by total score, the acquired hot knowledge points cover only a narrow range and are concentrated in a small number of categories. Bayesian scoring may therefore be further adopted to compute a weighted score for the knowledge points. Accordingly, before step Z in the embodiment of the present application is executed, the following steps may also be executed:

Step a: for each knowledge point in the multi-session content, determining the first average score value of the knowledge point according to the third score values of the knowledge point in the multi-session content and its number of clicks.

In one example, the first average score value of each knowledge point may be determined according to formula 5:

$$\mathrm{avgScore}_i = \frac{\sum_{i} \mathrm{Score}_3(i)}{\mathrm{hit}_i} \qquad \text{(formula 5)}$$

Where $\mathrm{avgScore}_i$ is the first average score value of the i-th knowledge point, $\sum_{i} \mathrm{Score}_3(i)$ is the sum of the third score values of the i-th knowledge point in the multi-session content, and $\mathrm{hit}_i$ is the number of clicks of the i-th knowledge point, which is counted in advance.

It should be noted that some knowledge points in the multi-session content may be the same. For the same knowledge point, the first average score value is calculated only once, that is, the final score value of the knowledge point is calculated only once. For example, assuming that knowledge point 1 in session content 1 in Table 1 is the same as knowledge point 4 in session content 2, it is sufficient to calculate the first average score value of knowledge point 1 in session content 1. Specifically, assuming that the third score value of knowledge point 1 in session content 1 is 0.5, the third score value of knowledge point 4 in session content 2 is 0.7, and the number of clicks of the knowledge point (the number of clicks of knowledge point 1 + the number of clicks of knowledge point 4) is 20, the first average score value of knowledge point 1 is (0.5 + 0.7) / 20 = 0.06; it is understood that 0.06 is also the first average score value of knowledge point 4.
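The merging of identical knowledge points in step a can be sketched as follows. The sketch assumes that each occurrence of a knowledge point is given as an (identifier, third score value) pair and that click counts are already totalled per distinct knowledge point; the names `first_averages`, `occurrences` and `hits` are hypothetical.

```python
from collections import defaultdict


def first_averages(occurrences, hits):
    """Formula 5: sum of third score values of a knowledge point / its click count.

    occurrences: iterable of (knowledge_point_id, third_score) over all sessions.
    hits: dict mapping knowledge_point_id to its total (positive) click count.
    """
    totals = defaultdict(float)
    for kp_id, score in occurrences:
        totals[kp_id] += score  # identical knowledge points are merged here
    return {kp_id: total / hits[kp_id] for kp_id, total in totals.items()}


# The same knowledge point appears in session content 1 (0.5) and 2 (0.7),
# with 20 clicks in total, as in the worked example in the text.
occurrences = [("kp1", 0.5), ("kp1", 0.7)]
hits = {"kp1": 20}
avg = first_averages(occurrences, hits)
```

Here `avg["kp1"]` evaluates to (0.5 + 0.7) / 20, that is, approximately 0.06.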

Step b: determining the second average score value of each knowledge point according to the first average score values of the knowledge points and the total number of knowledge points in the multi-session content.

In one example, the second average score value of each knowledge point may be determined according to formula 6:

$$\overline{\mathrm{avgScore}} = \frac{\sum_{i=1}^{N} \mathrm{avgScore}_i}{N} \qquad \text{(formula 6)}$$

Where $\overline{\mathrm{avgScore}}$ is the second average score value of the i-th knowledge point, $\mathrm{avgScore}_i$ is the first average score value of the i-th knowledge point, $\sum_{i=1}^{N} \mathrm{avgScore}_i$ is the sum of the first average score values of all the knowledge points, and $N$ is the total number of distinct knowledge points in the multi-session content. It will be appreciated that the second average score value is the same for every knowledge point. $N$ is counted as follows: assuming that two session contents are collected in total, with 20 knowledge points in session content 1, 10 knowledge points in session content 2, and 5 knowledge points common to both, the total number of knowledge points in the two session contents is 25, that is, the value of N is 25.
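The counting of N and the averaging in step b can be sketched as below. Representing each session content as a set of knowledge point identifiers is an assumption made for illustration, as are the identifiers and the sample first average score values.

```python
def second_average(first_avgs: dict[str, float]) -> float:
    """Formula 6: mean of the first average score values over the N distinct points."""
    if not first_avgs:
        raise ValueError("at least one knowledge point is required")
    return sum(first_avgs.values()) / len(first_avgs)


# Counting N as in the text: 20 points in session 1, 10 in session 2, 5 shared.
session_1 = {f"kp{i}" for i in range(1, 21)}   # kp1 .. kp20
session_2 = {f"kp{i}" for i in range(16, 26)}  # kp16 .. kp25, of which kp16 .. kp20 are shared
n = len(session_1 | session_2)                 # distinct knowledge points

# Hypothetical first average score values for three distinct knowledge points.
mean_of_three = second_average({"kp1": 0.06, "kp2": 0.02, "kp3": 0.04})
```

The set union gives `n == 25`, matching the count in the text, because shared knowledge points are counted once.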

Step c: determining the final score value of each knowledge point in the multi-session content according to the number of clicks of the knowledge point, the first average score value, the second average score value and a preset number of clicks.

In one example, the final score value of each knowledge point in the multi-session content may be determined according to formula 7:

$$\mathrm{finalScore}_i = \frac{\mathrm{hit}_i}{\mathrm{hit}_i + \mathrm{minHit}} \cdot \mathrm{avgScore}_i + \frac{\mathrm{minHit}}{\mathrm{hit}_i + \mathrm{minHit}} \cdot \overline{\mathrm{avgScore}} \qquad \text{(formula 7)}$$

Where $\mathrm{finalScore}_i$ is the final score value of the i-th knowledge point, $\mathrm{hit}_i$ is the number of clicks of the i-th knowledge point, $\mathrm{avgScore}_i$ is the first average score value of the i-th knowledge point, $\overline{\mathrm{avgScore}}$ is the second average score value of the i-th knowledge point, and $\mathrm{minHit}$ is the preset number of clicks.

In the case where all the knowledge points in Table 1 are different, the final score values of knowledge point 1, knowledge point 2 and knowledge point 3 in session content 1 in Table 1 can be obtained according to formulas 5 to 7, as can the final score value of knowledge point 4 in session content 2, and so on.
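Step c can be illustrated with a Bayesian weighted average. The sketch below assumes the commonly used form in which the first average score value is weighted by hit / (hit + minHit) and the second average score value by the remainder; this form is an assumption consistent with the four inputs named in step c, and the function name and sample values are hypothetical.

```python
def final_score(hit: int, first_avg: float, second_avg: float, min_hit: int) -> float:
    """Bayesian weighted score (assumed form): few clicks pull the score toward
    the global second average, many clicks pull it toward the point's own average."""
    if hit < 0 or min_hit < 0 or hit + min_hit == 0:
        raise ValueError("hit and min_hit must be non-negative and not both zero")
    weight = hit / (hit + min_hit)
    return weight * first_avg + (1.0 - weight) * second_avg


balanced = final_score(hit=20, first_avg=0.06, second_avg=0.04, min_hit=20)
unclicked = final_score(hit=0, first_avg=0.06, second_avg=0.04, min_hit=20)
popular = final_score(hit=2000, first_avg=0.06, second_avg=0.04, min_hit=20)
```

Under this assumption, a knowledge point with no clicks receives the second average score value, and a heavily clicked one approaches its own first average score value, which keeps a category with few clicks from being dominated or ignored outright.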

After the final score value of each knowledge point is obtained, step Z may be replaced by step d: acquiring the target knowledge point from the plurality of knowledge points in the multi-session content according to the final score value of each knowledge point.

Here, the process of obtaining the target knowledge point according to the final score value of each knowledge point is similar to the process of obtaining the target knowledge point according to the first score value of each knowledge point in step 130, and details of the process are not repeated herein.
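The selection rule of step 130 is not reproduced in this passage. As an illustration only, the sketch below assumes that the target knowledge points are the k knowledge points with the highest final score values, with ties broken by identifier; the function name, the parameter `k` and the sample scores are hypothetical.

```python
def select_target_points(final_scores: dict[str, float], k: int) -> list[str]:
    """Return the ids of the k highest-scoring knowledge points (assumed rule)."""
    ranked = sorted(final_scores.items(), key=lambda item: (-item[1], item[0]))
    return [kp_id for kp_id, _ in ranked[:k]]


top_two = select_target_points({"kp1": 0.05, "kp2": 0.09, "kp3": 0.02}, k=2)
```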

After the target knowledge point is acquired, it can be pushed to the user. For example, in an automatic question-answering platform, after the chat window of the platform is opened, the title of the target knowledge point can be displayed in the chat window. If the target knowledge point contains content that the user wants to know, the user only needs to click the title of the corresponding knowledge point to be linked to its related content. This makes it convenient for the user to learn about hot topics on the Internet and further improves the user experience.

It should be noted that, the present application also provides the following ways to obtain the target knowledge point:

Fig. 2 is a flowchart of a target knowledge point acquisition method according to another embodiment of the present application. As shown in Fig. 2, the method may include the following steps:

Step 210: collecting the multi-session content of the user and the question-answering robot.

Each session content comprises at least one question-answer pair, each question-answer pair comprises a question and at least one knowledge point, and the knowledge point is an answer corresponding to the question; each knowledge point belongs to a preset category and corresponds to an initial score value.

Step 220: for each knowledge point in each session content, determining the category attenuation factor of the knowledge point according to the category to which the knowledge point belongs and the category to which the previous knowledge point of the knowledge point belongs.

Step 230: updating the initial score value of each knowledge point according to the category attenuation factor of the knowledge point and the category attenuation factors of the subsequent knowledge points of the knowledge point, to obtain the second score value of each knowledge point in the multi-session content.

Step 240: acquiring the target knowledge point from the plurality of knowledge points in the multi-session content according to the second score value of each knowledge point.

Step 210 is the same as step 110 in the foregoing embodiment, and step 220 and step 240 are the same as step A and step C in the foregoing embodiment, respectively, so they are not repeated herein. Step 230 is similar to step B, and the common parts are not repeated herein; the difference is that in step 230 the second score value of each knowledge point in the multi-session content may be obtained according to formula 8, which is a variant of formula 2.

Equation 8 may be:

(formula 8)

Where Score(i) is the initial score value of the i-th knowledge point. The other parameters in formula 8 are the same as those in formula 2; for their descriptions, refer to the description of the parameters of formula 2, which is not repeated herein.

It is understood that before step 240 of this embodiment is executed, step X to step Y in the foregoing embodiment may also be executed, after which step 240 may be replaced by step Z. Furthermore, after steps X to Y are executed and before step 240 is replaced, steps a to c may also be executed, after which step 240 may be directly replaced by step d. For the implementation of both schemes, refer to the description above; details are not repeated herein.

In this embodiment, the category jump of subsequent knowledge points in the session content is used as one of the factors for measuring the score value of a preceding knowledge point. That is, the score value of the preceding knowledge point is attenuated when the category of the subsequent knowledge points jumps, so the accuracy of acquiring the target knowledge points can be improved.

Fig. 3 is a flowchart of a method for acquiring a target knowledge point according to yet another embodiment of the present application. As shown in Fig. 3, the method may include the following steps:

Step 310: collecting the multi-session content of the user and the question-answering robot.

Each session content comprises at least one question-answer pair, each question-answer pair comprises a question and at least one knowledge point, and the knowledge point is an answer corresponding to the question; each knowledge point belongs to a preset category and corresponds to an initial score value.

Step 320: for each knowledge point in each session content, determining the time decay factor of the knowledge point according to the current time, the creation time of the question-answer pair to which the knowledge point belongs, and a preset threshold.

Step 330: updating the initial score value of each knowledge point according to the time decay factor of the knowledge point, to obtain the third score value of each knowledge point in the multi-session content.

Step 340: acquiring the target knowledge point from the plurality of knowledge points in the multi-session content according to the third score value of each knowledge point.

Step 310 is the same as step 110 in the foregoing embodiment, and step 320 and step 340 are the same as step X and step Z in the foregoing embodiment, respectively, so they are not repeated herein. Step 330 is similar to step Y, and the common parts are not repeated herein; the difference is that in step 330 the third score value of each knowledge point in the multi-session content may be obtained according to formula 9, which is a variant of formula 4.

Formula 9 may be:

$$\mathrm{Score}_3(i) = \mathrm{Score}(i) \times \mathrm{timeFactor}_i \qquad \text{(formula 9)}$$

Where $\mathrm{Score}(i)$ is the initial score value of the i-th knowledge point. The other parameters in formula 9 are the same as those in formula 4; for their descriptions, refer to the description of the parameters of formula 4, which is not repeated herein.

It is understood that before step 340 of the present embodiment is performed, steps a to c of the foregoing embodiment may also be performed, and after steps a to c are performed, step 340 may be replaced with step d. The implementation of this scheme can be seen in the foregoing, and is not repeated here.

In this embodiment, the time decay factor of each knowledge point is taken into account when the final score value of the knowledge point is determined, so that the most recently clicked knowledge points carry a higher weight, which improves the accuracy of selecting the target knowledge point.

Fig. 4 is a flowchart of a method for acquiring a target knowledge point according to another embodiment of the present application. As shown in Fig. 4, the method may include:

Step 410: collecting the multi-session content of the user and the question-answering robot.

Each session content comprises at least one question-answer pair, each question-answer pair comprises a question and at least one knowledge point, the knowledge point is an answer corresponding to the question, and each knowledge point belongs to a preset category.

Step 420: determining, according to a preset algorithm, the matching value between each knowledge point in the multi-session content and the corresponding question, and taking the matching value as the first score value of the knowledge point.

Step 430: for each knowledge point in each session content, determining the category attenuation factor of the knowledge point according to the category to which the knowledge point belongs and the category to which the previous knowledge point of the knowledge point belongs.

Step 440: updating the first score value of each knowledge point according to the category attenuation factor of the knowledge point and the category attenuation factors of the subsequent knowledge points of the knowledge point, to obtain the second score value of each knowledge point in the multi-session content.

Step 450: for each knowledge point in each session content, determining the time decay factor of the knowledge point according to the current time, the creation time of the question-answer pair to which the knowledge point belongs, and a preset threshold.

Step 460: updating the second score value of each knowledge point according to the time decay factor of the knowledge point, to obtain the third score value of each knowledge point in the multi-session content.

Step 470: for each knowledge point in the multi-session content, determining the first average score value of the knowledge point according to the third score values of the knowledge point in the multi-session content and its number of clicks.

Step 480: determining the second average score value of each knowledge point according to the first average score values of the knowledge points and the total number of knowledge points in the multi-session content.

Step 490: determining the final score value of each knowledge point in the multi-session content according to the number of clicks of the knowledge point, the first average score value, the second average score value and the preset number of clicks.

Step 4100: acquiring the target knowledge point from the plurality of knowledge points in the multi-session content according to the final score value of each knowledge point.

Therefore, the accuracy of acquiring the target knowledge points can be improved, and the requirement that a user wants to know the hot knowledge points quickly can be met.
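The later part of the Fig. 4 flow (steps 450 to 4100) can be chained as in the sketch below. It starts from second score values, because the matching algorithm of step 420 and the category attenuation of steps 430 and 440 are not specified in this passage. The record layout `Occurrence`, the hour time unit, the Bayesian weighted-average form used for step 490, and the top-k rule used for step 4100 are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Occurrence:
    """One appearance of a knowledge point in a session (hypothetical layout)."""
    kp_id: str           # identifier of the knowledge point
    second_score: float  # second score value produced by step 440
    chat_time: float     # creation time of the question-answer pair, in hours


def rank_knowledge_points(occurrences, hits, curr_time, gravity, min_hit, top_k):
    """Run steps 450 to 4100; hits maps each id to a positive click count."""
    if not occurrences:
        return [], {}

    # Steps 450-460: time decay factor (formula 3) and third score value
    # (formula 4), summed per distinct knowledge point.
    third_totals = defaultdict(float)
    for occ in occurrences:
        factor = (curr_time - occ.chat_time + 2.0) ** gravity
        third_totals[occ.kp_id] += occ.second_score * factor

    # Step 470: first average score value per distinct knowledge point.
    first_avg = {kp: total / hits[kp] for kp, total in third_totals.items()}

    # Step 480: second average score value over the N distinct knowledge points.
    second_avg = sum(first_avg.values()) / len(first_avg)

    # Step 490: final score value, assuming a Bayesian weighted average.
    final = {
        kp: (hits[kp] * first_avg[kp] + min_hit * second_avg) / (hits[kp] + min_hit)
        for kp in first_avg
    }

    # Step 4100: assumed selection rule, the top_k highest final score values.
    ranked = sorted(final, key=lambda kp: (-final[kp], kp))
    return ranked[:top_k], final


occurrences = [
    Occurrence("kp1", 0.8, 99.0),
    Occurrence("kp2", 0.8, 50.0),
    Occurrence("kp1", 0.6, 98.0),
]
targets, final_scores = rank_knowledge_points(
    occurrences, hits={"kp1": 10, "kp2": 10},
    curr_time=100.0, gravity=-1.5, min_hit=5, top_k=1,
)
```

With the assumed negative gravity factor, the recently answered `kp1` outranks `kp2`, whose only question-answer pair is 50 hours old.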

Corresponding to the method for acquiring a target knowledge point provided by one of the foregoing embodiments, an embodiment of the present application further provides an apparatus for acquiring a target knowledge point. As shown in Fig. 5, the apparatus includes:

a collecting unit 501, configured to collect the multi-session content of a user and a question-answering robot, where each session content includes at least one question-answer pair, each question-answer pair includes a question and at least one knowledge point, the knowledge point is an answer corresponding to the question, and each knowledge point belongs to a preset category;

a determining unit 502, configured to determine, according to a preset algorithm, the matching value between each knowledge point in the multi-session content collected by the collecting unit 501 and the corresponding question, and to use the matching value as the first score value of the knowledge point;

an obtaining unit 503, configured to obtain a target knowledge point from the plurality of knowledge points in the multi-session content according to the first score value of each knowledge point determined by the determining unit 502.

Optionally, the obtaining unit 503 may be specifically configured to:

updating the first score value of each knowledge point according to the category attenuation factor of the knowledge point and the category attenuation factors of the subsequent knowledge points of the knowledge point to obtain a second score value of each knowledge point in the multi-session content;

Optionally, the obtaining unit 503 may be further specifically configured to:

updating the second score value of each knowledge point according to the time attenuation factor of each knowledge point to obtain a third score value of each knowledge point in the multi-session content;

Optionally, the obtaining unit may be further specifically configured to:

for each knowledge point in the multi-session content, determining the first average score value of the knowledge point according to the third score values of the knowledge point in the multi-session content and its number of clicks;

determining the second average score value of each knowledge point according to the first average score values of the knowledge points and the total number of knowledge points in the multi-session content;

determining the final score value of each knowledge point in the multi-session content according to the number of clicks of the knowledge point, the first average score value, the second average score value and the preset number of clicks;

and acquiring a target knowledge point from a plurality of knowledge points in the multi-session content according to the final score value of each knowledge point.

The functions of the functional modules of the device in the embodiment of the present application may be implemented through the steps in the method embodiment described above, and therefore, the specific working process of the device provided in the present application is not repeated herein.

In the apparatus for acquiring a target knowledge point provided by the embodiment of the present application, the collecting unit 501 collects the multi-session content of the user and the question-answering robot; the determining unit 502 determines, according to a preset algorithm, the matching value between each knowledge point in the multi-session content and the corresponding question, and takes the matching value as the first score value of the knowledge point; and the obtaining unit 503 obtains a target knowledge point from the plurality of knowledge points in the multi-session content according to the first score value of each knowledge point. Therefore, the accuracy of acquiring the target knowledge points can be improved, and the user's need to quickly learn about hot knowledge points can be met.

Corresponding to the method for acquiring a target knowledge point provided in the other embodiment above, an embodiment of the present application further provides an apparatus for acquiring a target knowledge point. As shown in Fig. 6, the apparatus includes:

a collecting unit 601, configured to collect the multi-session content of a user and a question-answering robot, where each session content includes at least one question-answer pair, each question-answer pair includes a question and at least one knowledge point, and the knowledge point is an answer corresponding to the question; each knowledge point belongs to a preset category and corresponds to an initial score value.

The determining unit 602 is configured to, for each knowledge point in the content of each session, determine a category attenuation factor of each knowledge point according to the category to which the knowledge point belongs and the category to which the previous knowledge point of the knowledge point belongs.

An updating unit 603, configured to update the initial score value of each knowledge point according to the category attenuation factor of the knowledge point determined by the determining unit 602 and the category attenuation factors of the subsequent knowledge points of the knowledge point, so as to obtain a second score value of each knowledge point in the multiple session contents.

An obtaining unit 604, configured to obtain a target knowledge point from multiple knowledge points in the multiple session contents according to the second score value of each knowledge point obtained by updating in the updating unit 603.

Corresponding to the method for acquiring a target knowledge point provided by the further embodiment above, an embodiment of the present application further provides an apparatus for acquiring a target knowledge point. As shown in Fig. 7, the apparatus includes:

a collecting unit 701, configured to collect content of multiple sessions between a user and a question-answering robot, where the content of each session includes at least one question-answer pair, each question-answer pair includes a question and at least one knowledge point, and the knowledge point refers to an answer corresponding to the question; each knowledge point belongs to a preset category and corresponds to an initial score value.

The determining unit 702 is configured to, for each knowledge point in the content of each session, determine a time decay factor of each knowledge point according to the current time, the creation time of the question-answer pair to which the knowledge point belongs, and a preset threshold.

An updating unit 703 is configured to update the initial score value of each knowledge point according to the time decay factor of each knowledge point determined by the determining unit 702, so as to obtain a third score value of each knowledge point in the multi-session content.

An obtaining unit 704, configured to obtain a target knowledge point from multiple knowledge points in the multiple session content according to the third fractional value of each knowledge point obtained by updating in the updating unit 703.

Those of skill would further appreciate that the various illustrative objects and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.

The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied in hardware, a software module executed by a processor, or a combination of the two. A software module may reside in Random Access Memory (RAM), memory, Read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.

The foregoing specific embodiments describe the objects, technical solutions and advantages of the present application in further detail. It should be understood that they are merely exemplary embodiments of the present application and are not intended to limit its scope. Any modification, equivalent substitution, improvement and the like made within the spirit and principle of the present application shall fall within the scope of the present application.

Claims

1. A method for acquiring a target knowledge point is characterized by comprising the following steps:

acquiring a target knowledge point from a plurality of knowledge points in the multi-session content according to the first score value of each knowledge point;

the acquiring, according to the first score value of each knowledge point, a target knowledge point from a plurality of knowledge points in the multi-session content specifically includes:

2. The method according to claim 1, wherein the obtaining of the target knowledge point from the plurality of knowledge points in the multi-session content according to the second score value of each knowledge point comprises:

3. The method according to claim 2, wherein the obtaining of the target knowledge point from the plurality of knowledge points in the multi-session content according to the third score value of each knowledge point comprises:

4. A method for acquiring a target knowledge point is characterized by comprising the following steps:

5. An apparatus for acquiring a target knowledge point, the apparatus comprising:

an obtaining unit, configured to obtain a target knowledge point from a plurality of knowledge points in the multiple session contents according to the first score value of each knowledge point determined by the determining unit;

the obtaining unit is specifically configured to:

6. The apparatus according to claim 5, wherein the obtaining unit is further specifically configured to:

7. The apparatus according to claim 6, wherein the obtaining unit is further specifically configured to:

8. An apparatus for acquiring a target knowledge point, the apparatus comprising: