CN116956885A

CN116956885A - Question screening method, device, electronic equipment and readable storage medium

Info

Publication number: CN116956885A
Application number: CN202210320027.3A
Authority: CN
Inventors: 蔡少委; 张清平; 周聪聪; 罗晓衡; 谭闯; 易晨希; 沈欣; 张宁; 胡孝思
Original assignee: SF Technology Co Ltd
Current assignee: SF Technology Co Ltd
Priority date: 2022-03-29
Filing date: 2022-03-29
Publication date: 2023-10-27

Abstract

The application discloses a problem screening method, a device, electronic equipment and a readable storage medium. On the one hand, when screening target problems to be recommended, the method considers the topic attention degree at coarse granularity, from the topic to which an initial problem belongs as a whole, and considers the problem attention degree at fine granularity, from the individual initial problem, so that the screened target problems better match the preferences of the target user. This avoids inaccurate screening in two cases: when only the coarse granularity is considered, the target user may not be interested in a specific problem within a topic; when only the fine granularity is considered, the initial problem may be uncommon or of low quality. On the other hand, the method does not depend on feedback data provided by the user after problems are recommended, but is based on the user's historical behavior data, so its accuracy is higher.

Description

Question screening method, device, electronic equipment and readable storage medium

Technical Field

The application relates to the technical field of question and answer recommendation, in particular to a method and device for screening questions, electronic equipment and a readable storage medium.

Background

As company business develops, the number of knowledge-related platforms keeps increasing, employees' questions and solutions reach into many fields, and knowledge content keeps accumulating. This knowledge covers operation methods, processes, rules and regulations, and other aspects of daily work. Spreading it more widely so that it can be reused in actual work can greatly improve employees' working efficiency. However, 85% of newly created question entries often go unnoticed and unrecognized by staff because they lack enough high-quality answers. Questions therefore need to be screened so that questions of interest to staff can be recommended to them, which in turn improves the quality of the answers.

The problem screening methods currently on the market mainly analyze users' feedback after problems are recommended. The accuracy of this kind of analysis depends on the accuracy of the users' feedback data, and many users fill in feedback data at random because they find it troublesome. As a result, such methods have low accuracy, and the recommended problems do not match staff preferences.

Disclosure of Invention

The application provides a problem screening method, a device, electronic equipment and a readable storage medium, and aims to solve the problem that the existing problem screening method depends on the accuracy of user feedback data and is low in accuracy.

In a first aspect, the present application provides a method for screening a problem, including:

acquiring an initial problem to be screened;

counting the topic attention degree of the initial problem according to the historical behavior data of the target user and the topic type of the initial problem;

determining the problem attention degree of the initial problem according to the similarity between the initial problem and each historical problem corresponding to the historical behavior data;

and screening and obtaining target problems to be recommended from the initial problems according to the topic attention degree and the problem attention degree.

In one possible implementation manner of the present application, the counting the topic attention degree of the initial problem according to the historical behavior data of the target user and the topic type of the initial problem includes:

extracting the topic type corresponding to each user behavior from the historical behavior data of the target user;

dividing the user behaviors according to their corresponding topic types to obtain target behaviors corresponding to the topic type of the initial problem;

and determining the topic attention degree of the initial problem according to the number of the user behaviors and the number of the target behaviors.

In one possible implementation manner of the present application, the determining the topic attention degree of the initial problem according to the number of the user behaviors and the number of the target behaviors includes:

acquiring a preset score corresponding to the target behavior;

weighting the number of the target behaviors according to the preset score to obtain a weighted number;

and determining the ratio between the weighted number and the number of the user behaviors as the topic attention degree of the initial problem.
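The weighted variant above can be illustrated with a minimal Python sketch. The score table, the operation names, and the choice to sum one preset score per target behavior are assumptions made for illustration; the application does not specify the preset scores or the exact weighting rule.

```python
# Hypothetical preset scores per operation type (values are not from the source).
PRESET_SCORES = {"browse": 1.0, "praise": 2.0, "comment": 3.0}


def weighted_topic_attention(behaviors, topic_type, scores=PRESET_SCORES):
    """Return weighted number of target behaviors / number of user behaviors.

    behaviors: list of (operation, topic_type) pairs for one target user.
    A behavior is a target behavior when its topic type equals `topic_type`.
    """
    if not behaviors:
        return 0.0
    weighted_number = sum(
        scores.get(operation, 1.0)
        for operation, behavior_topic in behaviors
        if behavior_topic == topic_type
    )
    return weighted_number / len(behaviors)


behaviors = [
    ("praise", "product"),
    ("browse", "product"),
    ("comment", "IT"),
    ("browse", "IT"),
]
product_attention = weighted_topic_attention(behaviors, "product")
it_attention = weighted_topic_attention(behaviors, "IT")
```

With weights larger than 1 the result may exceed 1; whether the value should be normalized afterwards is left open here.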

In one possible implementation manner of the present application, the determining, according to the similarity between the initial problem and each historical problem corresponding to the historical behavior data, the problem attention of the initial problem includes:

calculating the similarity between the initial problem and each historical problem corresponding to the historical behavior data to obtain the maximum target similarity;

counting the number of preset types contained in the thematic types of the initial problems;

and determining the problem attention of the initial problem according to the value coefficient corresponding to the number of the preset types and the target similarity.

In one possible implementation manner of the present application, the calculating the similarity between the initial problem and each historical problem corresponding to the historical behavior data, to obtain the maximum target similarity includes:

detecting the parts of speech of each word in the initial question to obtain a target word corresponding to the target part of speech in the initial question and the total number of words of the target word in the initial question;

counting the number of the target words in each historical problem corresponding to the historical behavior data;

and calculating the ratio between the total number of the words and the number of the target words in each historical problem, and obtaining the similarity between the initial problem and each historical problem and the maximum target similarity.
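One possible reading of this ratio is sketched below in Python. It assumes that the similarity is the share of the initial problem's target words that also occur in a historical problem, that the target words have already been extracted by a part-of-speech tagger, and that distinct words are counted; the function names are hypothetical and the application may intend a different orientation of the ratio.

```python
def target_word_similarity(target_words, history_tokens):
    """Share of the initial problem's target words found in one historical problem."""
    targets = set(target_words)
    if not targets:
        return 0.0
    found = targets & set(history_tokens)
    return len(found) / len(targets)


def max_target_similarity(target_words, history_problems):
    """Largest similarity over all historical problems (0.0 if there are none)."""
    return max(
        (target_word_similarity(target_words, tokens) for tokens in history_problems),
        default=0.0,
    )


# Target words of the initial problem, assumed to come from a POS tagger.
target_words = ["parcel", "refund", "claim"]
history_problems = [
    ["how", "to", "claim", "a", "refund"],
    ["parcel", "tracking"],
]
best = max_target_similarity(target_words, history_problems)
```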

In one possible implementation manner of the present application, the screening and obtaining target problems to be recommended from the initial problems according to the topic attention degree and the problem attention degree includes:

calculating a first recommendation score of the initial problem according to the topic attention degree and the problem attention degree;

if the first recommendation score meets a preset behavior data deletion condition, obtaining second recommendation scores of users to be recommended, and constructing a recommendation score matrix between the users and the initial problem according to the first recommendation score and each second recommendation score;

performing iterative processing on the recommendation score matrix by adopting the alternating least squares method to obtain a target score matrix;

updating the first recommendation score according to the target score matrix;

and screening and obtaining target questions to be recommended from the initial questions according to the updated first recommendation score.
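The iteration over the recommendation score matrix can be sketched in pure Python as follows. This is a minimal illustration, not the implementation of the application: it assumes a rank-1 factorization, marks missing scores with `None`, and uses a hypothetical regularization value `reg`. Each pass fixes one set of factors and solves for the other in closed form over the observed entries only.

```python
def als_rank1(scores, iterations=200, reg=0.01):
    """Fill a user-by-problem score matrix by rank-1 alternating least squares.

    scores: list of rows; None marks a missing recommendation score.
    Returns the dense target score matrix u[i] * v[j].
    """
    n_users, n_items = len(scores), len(scores[0])
    u = [1.0] * n_users   # one factor per user
    v = [1.0] * n_items   # one factor per problem
    for _ in range(iterations):
        # Fix v, solve each user factor by regularized least squares.
        for i in range(n_users):
            seen = [j for j in range(n_items) if scores[i][j] is not None]
            num = sum(scores[i][j] * v[j] for j in seen)
            den = sum(v[j] ** 2 for j in seen) + reg
            u[i] = num / den
        # Fix u, solve each problem factor the same way.
        for j in range(n_items):
            seen = [i for i in range(n_users) if scores[i][j] is not None]
            num = sum(scores[i][j] * u[i] for i in seen)
            den = sum(u[i] ** 2 for i in seen) + reg
            v[j] = num / den
    return [[u[i] * v[j] for j in range(n_items)] for i in range(n_users)]


# Rows are users, columns are initial problems; the last user lacks one score.
matrix = [[1.0, 2.0], [2.0, 4.0], [3.0, None]]
target_matrix = als_rank1(matrix)
updated_score = target_matrix[2][1]
```

In this toy matrix the observed rows follow a 1:2 pattern, so the filled-in score for the last user lands close to 6.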

In one possible implementation manner of the present application, after the target problem to be recommended is screened from the initial problem according to the topic attention degree and the problem attention degree, the method further includes:

inquiring the recommended quantity of the problems corresponding to the target user;

and if the recommended number of the questions is smaller than or equal to a preset number threshold, sending the target questions to the target users.

In a second aspect, the present application provides a problem screening apparatus comprising:

an acquisition unit for acquiring initial problems to be screened;

the statistics unit is used for counting the topic attention degree of the initial problem according to the historical behavior data of the target user and the topic type of the initial problem;

a determining unit, configured to determine a problem attention degree of the initial problem according to a similarity between the initial problem and each historical problem corresponding to the historical behavior data;

and the screening unit is used for screening and obtaining target problems to be recommended from the initial problems according to the topic attention degree and the problem attention degree.

In a possible implementation of the application, the statistics unit is further configured to:

acquiring a preset score corresponding to the target behavior;

In a possible implementation of the application, the determining unit is further configured to:

In a possible implementation of the application, the screening unit is further configured to:

updating the first recommendation score according to the target score matrix;

In a third aspect, the present application also provides an electronic device, the electronic device comprising a processor, a memory and a computer program stored in the memory and executable on the processor, the processor executing the steps of any of the problem screening methods provided by the present application when calling the computer program in the memory.

In a fourth aspect, the present application also provides a readable storage medium having stored thereon a computer program which when executed by a processor performs steps in any of the problem screening methods provided by the present application.

In summary, the problem screening method provided by the application includes: acquiring an initial problem to be screened; counting the topic attention degree of the initial problem according to the historical behavior data of the target user and the topic type of the initial problem; determining the problem attention degree of the initial problem according to the similarity between the initial problem and each historical problem corresponding to the historical behavior data; and screening and obtaining target problems to be recommended from the initial problems according to the topic attention degree and the problem attention degree. On the one hand, when the target problems to be recommended are screened, the topic attention degree is considered at coarse granularity, from the topic of the initial problem as a whole, and the problem attention degree is considered at fine granularity, from the individual initial problem, so that the screened target problems better match the preferences of the target user. This avoids inaccurate screening in two cases: when only the coarse granularity is considered, the target user may not be interested in a specific problem within a topic; when only the fine granularity is considered, the initial problem may be uncommon or of low quality. On the other hand, the problem screening method provided by the application does not depend on feedback data provided by the user after problems are recommended, but is based on the user's historical behavior data, so its accuracy is higher.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a schematic diagram of an application scenario of a problem screening method according to an embodiment of the present application;

FIG. 2 is a schematic flow chart of a problem screening method according to an embodiment of the present application;

FIG. 3 is a schematic flow chart of determining attention to a problem provided in an embodiment of the present application;

FIG. 4 is a schematic flow chart of obtaining a target similarity according to an embodiment of the present application;

FIG. 5 is a schematic structural view of an embodiment of a problem screening apparatus according to the present application;

FIG. 6 is a schematic structural diagram of an embodiment of an electronic device provided in an embodiment of the present application.

Detailed Description

The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to fall within the scope of the application.

In describing embodiments of the present application, it should be understood that the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or the number of features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of the described features. In the description of the embodiments of the present application, "plurality" means two or more, unless explicitly defined otherwise.

The following description is presented to enable any person skilled in the art to make and use the application. In the following description, details are set forth for purposes of explanation. It will be apparent to one of ordinary skill in the art that the present application may be practiced without these specific details. In other instances, well-known processes have not been described in detail in order to avoid unnecessarily obscuring the description of the embodiments of the application. Thus, the present application is not intended to be limited to the embodiments shown, but is to be accorded the widest scope consistent with the principles and features disclosed herein.

The embodiment of the application provides a problem screening method, a device, electronic equipment and a readable storage medium. The problem screening device can be integrated in electronic equipment, and the electronic equipment can be a server or a terminal and other equipment.

The execution body of the problem screening method according to the embodiment of the present application may be a problem screening device provided by the embodiment of the present application, or different types of electronic devices such as a server device, a physical host, or a User Equipment (UE) that is integrated with the problem screening device, where the problem screening device may be implemented in a hardware or software manner, and the UE may specifically be a terminal device such as a smart phone, a tablet computer, a notebook computer, a palm computer, a desktop computer, or a personal digital assistant (Personal Digital Assistant, PDA).

The electronic device may be operated in a single operation mode, or may also be operated in a device cluster mode.

Referring to fig. 1, fig. 1 is a schematic view of a scenario of a problem screening system according to an embodiment of the present application. The problem screening system may include an electronic device 101, where a problem screening apparatus is integrated in the electronic device 101.

In addition, as shown in FIG. 1, the problem screening system may also include a memory 102 for storing data, such as text data.

It should be noted that, the schematic view of the scenario of the problem screening system shown in fig. 1 is only an example, and the problem screening system and scenario described in the embodiment of the present application are for more clearly describing the technical solution of the embodiment of the present application, and do not constitute a limitation on the technical solution provided by the embodiment of the present application, and those skilled in the art can know that, along with the evolution of the problem screening system and the appearance of a new service scenario, the technical solution provided by the embodiment of the present application is equally applicable to similar technical problems.

An embodiment of the problem screening method is described below with the electronic device as the execution body; to simplify the description, the execution body is omitted in the subsequent method embodiments. The problem screening method includes: acquiring an initial problem to be screened; counting the topic attention degree of the initial problem according to the historical behavior data of the target user and the topic type of the initial problem; determining the problem attention degree of the initial problem according to the similarity between the initial problem and each historical problem corresponding to the historical behavior data; and screening and obtaining target problems to be recommended from the initial problems according to the topic attention degree and the problem attention degree.

Referring to fig. 2, fig. 2 is a schematic flow chart of a problem screening method according to an embodiment of the present application. It should be noted that although a logical order is depicted in the flowchart, in some cases the steps depicted or described may be performed in a different order than presented herein. The problem screening method specifically may include the following steps 201 to 204, in which:

201. Acquire the initial problem to be screened.

The initial question may refer to a question that the user did not answer. For example, the initial question may refer to a question that is newly added to a preset knowledge database, is not recommended to the user, and thus has not been answered. For example, when updating and maintaining a preset knowledge database, a new question added to the knowledge database can be used as an initial question to be screened.

The preset knowledge database may refer to a knowledge database inside an enterprise, in which questions and answers related to information such as enterprise culture, salary treatment, regulation and system of the enterprise are stored, and staff in the enterprise may query the knowledge database to obtain solutions of the questions when encountering the questions related to work. For example, an employee can access an internal App of an enterprise through a terminal such as a smart phone, a personal computer and the like, and connect a corresponding knowledge database through a trigger key on the internal App so as to inquire the content in the knowledge database.

In some embodiments, the questions not yet answered by users may include questions of little value. For example, the answer to a question may be common knowledge in the corresponding industry, or the knowledge database may already contain an answered question similar to it. The initial questions may therefore be the valuable ones among the unanswered questions. To reduce the amount of calculation of the problem screening method in the embodiment of the present application, the unanswered questions may first be screened to obtain the valuable initial questions. For example, the unanswered questions may be compared with the preset questions stored in the knowledge database, and those highly similar to any preset question removed; at the same time, the unanswered questions may be compared with the industry basic information stored in a preset industry information database, and those highly similar to any piece of industry basic information removed.

The preset questions may refer to answered questions in a knowledge database, among other things.

The preset industry information database stores the basic common knowledge in the industry corresponding to the unanswered questions of the user, and may be a background database corresponding to App such as encyclopedia of knowledge. For example, industry common knowledge in information sources such as internet, textbook and the like can be stored in a cloud database associated with apps such as knowledge encyclopedia and the like, so as to obtain a preset industry information database.

Before the similarity is calculated, each question or each piece of industry basic information may be processed through a preset word segmentation model to obtain the corresponding word vectors, and the similarity is then calculated from the word vectors. Taking the comparison of questions unanswered by the user with the preset questions stored in the knowledge database as an example:

(1.1) First, word segmentation is performed on the questions not answered by the user and on the preset questions stored in the knowledge database through a preset word segmentation model, to obtain the word vector corresponding to each word and symbol in each question and the part of speech of each word. An open-source word segmentation model such as jieba may be used as the preset word segmentation model.

(1.2) Then, the word vectors corresponding to symbols are removed, and the word vectors corresponding to words of specific parts of speech are removed according to the part of speech of each word. For example, the word vectors corresponding to function words may be removed.

(1.3) The remaining word vectors of each question are fused to obtain the semantic features of that question.

(1.4) The semantic features of each question not answered by the user are compared with the semantic features of each preset question to obtain the similarity between the unanswered question and each preset question.

Here, jieba is a word segmentation model that can segment words according to a custom vocabulary.

It can be seen that through the steps (1.1) - (1.4), redundant words and/or symbols can be reduced in the problem, so that on one hand, the calculation amount when calculating the similarity is reduced, and on the other hand, inaccuracy in calculating the similarity caused by the redundant words and/or symbols can be avoided.
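Steps (1.1) to (1.4) can be sketched in Python as follows. The sketch substitutes simpler techniques for the ones named above, and these substitutions are assumptions: a regular-expression tokenizer on English text stands in for the word segmentation model, a hypothetical fixed list of function words stands in for part-of-speech filtering, a bag-of-words count stands in for the fused word vectors, and cosine similarity is used as the comparison.

```python
import math
import re
from collections import Counter

# Hypothetical function-word list; a real system would use part-of-speech tags.
FUNCTION_WORDS = {"the", "a", "an", "of", "to", "is", "be", "can", "do", "i", "my", "for"}


def semantic_feature(question):
    """Steps (1.1)-(1.3): segment, drop symbols and function words, fuse."""
    tokens = re.findall(r"[a-z0-9]+", question.lower())  # symbols are dropped here
    kept = [token for token in tokens if token not in FUNCTION_WORDS]
    return Counter(kept)  # bag-of-words stands in for fused word vectors


def cosine_similarity(a, b):
    """Step (1.4): compare two semantic features."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


unanswered = semantic_feature("How do I reset my password?")
preset_close = semantic_feature("How can I reset the password?")
preset_far = semantic_feature("Where is the canteen?")
sim_close = cosine_similarity(unanswered, preset_close)
sim_far = cosine_similarity(unanswered, preset_far)
```

The first two questions differ only in function words and punctuation, so their features coincide after filtering, which is the effect the filtering steps aim for.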

In other embodiments, if a new question added in the preset knowledge database is used as an initial question to be screened, since the text added in the knowledge database may also include non-question content such as comments of the preset question, updated answers of the preset question, and the like, the text added in the knowledge database may also be first identified, and the question therein is used as the initial question.

For example, whether a preset keyword exists in each sentence added to the knowledge database may be detected to determine whether the sentence is a question, and the questions found in this way may be taken as initial questions. For example, interrogative words may be set as the preset keywords, and a sentence containing an interrogative word may be determined to be a question. An example follows:

Interrogative words that ask about persons, things, time, place, and quantity, such as "who", "what", and "where", and interrogative words that ask about manner, state, and reason, such as "how" and "why", are used as the preset keywords. If a sentence contains any of these interrogative words, the sentence is determined to be a question.
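The keyword check can be sketched in Python as follows. The keyword set is a hypothetical English stand-in for the preset interrogative words, and the sentence splitter is a simple assumption based on end punctuation.

```python
import re

# Hypothetical English stand-ins for the preset interrogative keywords.
PRESET_KEYWORDS = {"who", "what", "when", "where", "which", "how", "why"}


def is_question(sentence, keywords=PRESET_KEYWORDS):
    """A sentence counts as a question if it contains any preset keyword."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return any(word in keywords for word in words)


def extract_questions(text):
    """Split newly added text into sentences and keep those judged to be questions."""
    sentences = [s.strip() for s in re.split(r"(?<=[.?!])\s+", text) if s.strip()]
    return [s for s in sentences if is_question(s)]


new_text = (
    "Updated answer for parcel claims. "
    "How do I report a damaged parcel? "
    "Thanks for the comment."
)
questions = extract_questions(new_text)
```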

Taking as an example the case where the industry corresponding to the initial problem is the express logistics industry and the problems are stored in the logistics knowledge database of an express logistics enterprise, the flow of step 201 is described below:

(2.1) Staff import a new text into the logistics knowledge database. The electronic device reads the new text, detects whether each sentence of the new text contains a preset keyword, and takes the sentences containing a keyword as new questions that have not yet been answered by users in the enterprise, such as couriers.

(2.2) The electronic device reads the new questions and the preset questions stored in the logistics knowledge database, filters the words and symbols in them through the method described above, and fuses the remaining word vectors to obtain the semantic features corresponding to each new question and each preset question.

(2.3) Each new question is compared with each preset question, and the new questions are screened according to the similarities obtained by the comparison to obtain the high-value initial questions. For example, suppose the new questions include a first question, "Can a gun be sent for repair?", and the preset questions include a second question, "Can the gun be modified?". When the similarity between the semantic features of the first question and those of the second question is greater than or equal to a preset first similarity threshold, the first question can be judged to be of no value, and the electronic device does not take the first question as an initial question. The size of the first similarity threshold is not limited and can be set according to actual scene requirements.
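The screening in step (2.3) can be sketched in Python as follows. The threshold value is hypothetical, and the character-level ratio from the standard library's `difflib.SequenceMatcher` is used only as a stand-in for the comparison of semantic features.

```python
from difflib import SequenceMatcher

FIRST_SIMILARITY_THRESHOLD = 0.8  # hypothetical value; the source leaves it open


def similarity(a, b):
    """Character-level ratio, standing in for semantic-feature similarity."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def select_initial_questions(new_questions, preset_questions,
                             threshold=FIRST_SIMILARITY_THRESHOLD):
    """Keep a new question only if it stays below the threshold for every preset question."""
    return [
        question
        for question in new_questions
        if all(similarity(question, preset) < threshold for preset in preset_questions)
    ]


preset_questions = ["How do I reset my password?"]
new_questions = [
    "How do I reset my password",
    "Where can I find the holiday schedule?",
]
initial_questions = select_initial_questions(new_questions, preset_questions)
```

The near-duplicate of the preset question is discarded, and only the question without a close match is kept as an initial question.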

202. And counting the topic attention degree of the initial problem according to the historical behavior data of the target user and the topic type of the initial problem.

The topic type refers to the type of the topic to which a problem belongs. The topic type of the initial problem may include one or more of a plurality of preset types, and the preset types may be set according to the requirements of the scene, which is not limited in the embodiment of the present application. For example, when the preset knowledge database is a knowledge database within an enterprise, the preset types may include human resources, operations, products, IT, and the like. For ease of understanding, unless otherwise specified, the topic type of the initial problem is assumed to include exactly one of the preset types; when it includes several preset types, the specific implementation is similar and is not described in detail.

The historical behavior data may include historical data generated when the user operates on the preset problems in the knowledge database. For example, the historical behavior data of the target user may include historical data generated when the target user performs different types of operations, such as commenting on, praising, paying attention to, browsing, or sharing, on the preset problems in the knowledge database. For example, if the target user performs a praise operation on the first preset problem stored in the knowledge database, an attention operation on the second preset problem, and a sharing operation on the third preset problem, and performs no operation on the fourth preset problem (that is, the target user does not comment on, praise, pay attention to, browse, or share the fourth preset problem), then the historical behavior data includes 3 pieces in total: (x) the target user performs a praise operation on the first preset problem; (xi) the target user performs an attention operation on the second preset problem; (xii) the target user performs a sharing operation on the third preset problem. It can be seen that each piece of historical behavior data records both the type of operation performed by the target user and the problem on which the operation was performed.
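One way to represent such records is sketched below in Python. The class name, field names, and identifiers are hypothetical; the application only states that each record carries an operation type and the problem operated on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BehaviorRecord:
    """One piece of historical behavior data (field names are illustrative)."""
    user_id: str
    operation: str    # e.g. "praise", "attention", "share"
    problem_id: str   # the preset problem the operation was performed on


# The three records from the example above; identifiers are placeholders.
history = [
    BehaviorRecord("target_user", "praise", "preset_1"),
    BehaviorRecord("target_user", "attention", "preset_2"),
    BehaviorRecord("target_user", "share", "preset_3"),
]

# The fourth preset problem received no operation, so it produces no record.
operated_problems = {record.problem_id for record in history}
```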

For example, the historical behavior data may be stored in a preset user information database. The user information database may be a database associated with the App corresponding to the knowledge database, where multiple types of information such as behavior, identity, and the like of the user are stored.

The topic attention degree of a problem refers to the degree of attention the target user pays to the topic corresponding to the topic type of the problem. The higher the topic attention degree of a problem, the more attention the target user pays to content related to that topic. For example, when the topic type of an initial problem includes only "product" among the preset types, a higher topic attention degree of the initial problem indicates that the target user pays more attention to content related to the topic "product". Here, the topic "product" means the topic whose topic type is product; this is not repeated below.

In some embodiments, the topic attention degree of the initial problem may be determined based on the number of behaviors in the historical behavior data. In this case, the step of "counting the topic attention degree of the initial problem according to the historical behavior data of the target user and the topic type of the initial problem" may be performed as follows:

(3.1) Extract the topic type corresponding to each user behavior from the historical behavior data of the target user.

User behavior may be understood as the user's operation on a preset problem, as described above. For example, when the historical behavior data includes the 3 pieces (x) the target user performs a praise operation on the first preset problem, (xi) the target user performs an attention operation on the second preset problem, and (xii) the target user performs a sharing operation on the third preset problem, the user behaviors in the historical behavior data include: behavior one, "praise operation on the first preset problem"; behavior two, "attention operation on the second preset problem"; and behavior three, "sharing operation on the third preset problem". For the first, second, and third preset problems, refer to the description above; details are not repeated.

The topic type corresponding to a user behavior refers to the topic type of the problem on which the behavior was performed. Continuing the example above: for behavior one, behavior two, and behavior three, the corresponding topic types are the topic type of the first preset problem, the topic type of the second preset problem, and the topic type of the third preset problem, respectively.

(3.2) Divide the user behaviors according to their corresponding topic types to obtain the target behaviors corresponding to the topic type of the initial problem.

The partitioning is illustrated by way of example in step (3.1): if the first, second and third behaviors are as described above, and the topic type of the first preset question, the topic type of the second preset question and the topic type of the third preset question are the first type, the first type and the second type, respectively, then the first set of the first and second behaviors and the second set of the third behavior can be obtained after the division. It will be appreciated that the first set corresponds to a first type and the second set corresponds to a second type.

The target behavior corresponding to the topic type of an initial problem is a user behavior whose associated problem has the same topic type as the initial problem. For example, if the topic type of the first initial problem is the first type and the topic type of the first preset problem is also the first type, then behavior one is a target behavior corresponding to the topic type of the first initial problem. An initial problem may have one target behavior or several.

After the division, the topic type of each set is matched against the topic type of each initial problem to obtain the set corresponding to the topic type of that initial problem; the behaviors in that set are its target behaviors. For example, if the division yields a first set containing behaviors one and two and a second set containing behavior three, the topic type of the second initial problem is the first type, and the topic type of the third initial problem is the second type, then after matching the target behaviors of the second initial problem are behaviors one and two, and the target behavior of the third initial problem is behavior three.

(3.3) Determining the topic attention of the initial problem according to the number of user behaviors and the number of target behaviors.

In some embodiments, the ratio of the number of target behaviors to the number of user behaviors may be taken as the target user's topic attention to the initial problem. For example, suppose the user behaviors comprise only behaviors one, two and three, the target behaviors of the second initial problem after matching are behaviors one and two, and the target behavior of the third initial problem is behavior three. For the second initial problem, the ratio of the number of target behaviors to the number of user behaviors is 2/3, so the target user's topic attention to the second initial problem is 2/3. For the third initial problem, the ratio is 1/3, so the target user's topic attention to the third initial problem is 1/3.
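Steps (3.1) to (3.3) can be sketched as follows. This is a minimal illustration rather than the claimed implementation: the function name `topic_attention`, the (operation, topic type) pair layout and the labels `type_1` and `type_2` are assumptions made for the example.

```python
from collections import defaultdict
from fractions import Fraction


def topic_attention(behaviors, initial_topic):
    """Ratio of target behaviors to all user behaviors.

    behaviors: list of (operation, topic_type) pairs, one per user behavior
    (hypothetical layout chosen for this sketch).
    """
    if not behaviors:
        return Fraction(0)
    # Step (3.2): divide the behaviors into sets keyed by topic type.
    by_topic = defaultdict(list)
    for operation, topic in behaviors:
        by_topic[topic].append(operation)
    # Target behaviors are those whose topic type matches the initial problem.
    target = by_topic.get(initial_topic, [])
    # Step (3.3): number of target behaviors over number of user behaviors.
    return Fraction(len(target), len(behaviors))


behaviors = [("praise", "type_1"), ("attention", "type_1"), ("share", "type_2")]
print(topic_attention(behaviors, "type_1"))  # 2/3
print(topic_attention(behaviors, "type_2"))  # 1/3
```

`Fraction` is used only so that the output matches the 2/3 and 1/3 of the example exactly; a float ratio would serve equally well.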

Different types of user behavior reflect different degrees of interest. For example, a "praise" operation on a preset problem expresses a higher degree of interest than a "browse" operation on the same problem. To reflect this difference in the topic attention, in other embodiments a score may be assigned to each target behavior and the number of target behaviors weighted by these scores, which improves the accuracy of the topic attention. In this case, the step of "determining the topic attention of the initial problem according to the number of user behaviors and the number of target behaviors" may be performed as follows:

(3.31) Obtaining the preset score corresponding to each target behavior.

In some embodiments, preset scores may be assigned in advance to the different behavior types, where a behavior type may be understood as the operation type described above. For example, when the behavior types are comment, praise, attention, browse and share, they may be assigned preset scores of 5, 4, 3, 2 and 1, respectively, according to the degree of user interest they express; a higher score indicates a higher degree of interest and a lower score a lower degree of interest. In this case, if the first target behavior is behavior one above, i.e. "praise operation on the first preset problem", and the second target behavior is behavior two above, i.e. "attention operation on the second preset problem", the preset score of the first target behavior is 4 and the preset score of the second target behavior is 3.

In other embodiments, the preset score may be associated with a topic type in addition to a behavioral type. For example, the preset score corresponding to the target behavior may be determined according to the priorities of different topics in the App corresponding to the knowledge database, and the behavior type and the corresponding topic type of the target behavior. The following specific examples are illustrative:

Assume that the preset behavior types comment, praise, attention, browse and share have reference preset scores of 5, 4, 3, 2 and 1, respectively, and that the preset topic types comprise human resources, operations, products and IT. If the home page of the App corresponding to the knowledge database is set to the topic "product", that is, the topic interface shown by default when a user opens the App is the one for "product", then the topic "product" has a higher priority in the App than the topics "human resources", "operations" and "IT". Therefore, when the topic type corresponding to a target behavior is "product", the reference preset score of that behavior may be reduced and the reduced value used as its preset score, which lessens the influence of topic priority on the topic attention. When the topic type corresponding to a target behavior is one of the other preset types, such as human resources or IT, the reference preset score is used directly as the preset score. For example, when the topic type corresponding to a target behavior is "product" and its behavior type is "comment", the reference preset score of 5 may be reduced, for example to 4, and the reduced value used as the preset score. When the topic type is "IT" and the behavior type is "comment", the reference preset score of 5 is used directly as the preset score.

The magnitude of the reduction may be set according to the actual scenario. For convenience of description, unless otherwise stated below, the preset score of a target behavior is taken to depend only on its behavior type; this is not to be construed as limiting the embodiments of the present application.
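The priority adjustment described above might look like the sketch below. The names `REFERENCE_SCORES`, `preset_score`, `home_topic` and `reduction` are hypothetical, and the fixed reduction of 1 point (floored at 0) is an assumption; the text only states that the magnitude is set according to the actual scenario.

```python
# Reference preset scores per behavior type, as in the example above.
REFERENCE_SCORES = {"comment": 5, "praise": 4, "attention": 3, "browse": 2, "share": 1}


def preset_score(behavior_type, topic_type, home_topic="product", reduction=1):
    """Preset score of a target behavior, lowered for the high-priority topic.

    home_topic and reduction are illustrative assumptions, not values
    prescribed by the described method.
    """
    base = REFERENCE_SCORES[behavior_type]
    if topic_type == home_topic:
        # The home-page topic gets extra exposure, so its score is reduced.
        return max(base - reduction, 0)
    return base


print(preset_score("comment", "product"))  # 4
print(preset_score("comment", "IT"))       # 5
```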

(3.32) Weighting the number of target behaviors by the preset scores to obtain a weighted number.

During weighting, if an initial problem has several target behaviors, the counts of those target behaviors are weighted by their preset scores and summed to obtain the weighted number of target behaviors for that initial problem. For example, suppose the user behaviors comprise only behavior one "praise operation on the first preset problem", behavior two "attention operation on the second preset problem" and behavior three "sharing operation on the third preset problem", the target behaviors of the second initial problem after matching are behaviors one and two, and the preset scores of the behavior types comment, praise, attention, browse and share are 5, 4, 3, 2 and 1, respectively. The weighted number of target behaviors of the second initial problem is then 4 + 3 = 7. Likewise, if the target behavior of the third initial problem is behavior three and the preset scores are unchanged, the weighted number of target behaviors of the third initial problem is 1.

(3.33) Determining the ratio of the weighted number to the number of user behaviors as the topic attention of the initial problem.

Continuing the example in step (3.32): the weighted number of target behaviors of the second initial problem is 7 and the user behaviors comprise only behaviors one, two and three, so the topic attention of the second initial problem is 7/3. The weighted number of target behaviors of the third initial problem is 1, so its topic attention is 1/3.

It can be seen that, through the steps (3.31) and (3.32), the number of target behaviors can be weighted by the preset score, so as to obtain more accurate topical attention.
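The weighted variant in steps (3.31) to (3.33) can be sketched in the same way. The function name `weighted_topic_attention` and the data layout are assumptions for illustration; the scores are those of the example, with the preset score depending on behavior type only.

```python
from fractions import Fraction

# Preset scores per behavior type, taken from the example in step (3.31).
PRESET_SCORES = {"comment": 5, "praise": 4, "attention": 3, "browse": 2, "share": 1}


def weighted_topic_attention(behaviors, initial_topic, scores=PRESET_SCORES):
    """Score-weighted count of target behaviors over the number of user behaviors.

    behaviors: list of (behavior_type, topic_type) pairs (hypothetical layout).
    """
    if not behaviors:
        return Fraction(0)
    # Step (3.32): sum the preset scores of the target behaviors.
    weighted = sum(scores[kind] for kind, topic in behaviors if topic == initial_topic)
    # Step (3.33): divide by the total number of user behaviors.
    return Fraction(weighted, len(behaviors))


behaviors = [("praise", "type_1"), ("attention", "type_1"), ("share", "type_2")]
print(weighted_topic_attention(behaviors, "type_1"))  # 7/3
print(weighted_topic_attention(behaviors, "type_2"))  # 1/3
```

Note that with weighting the topic attention is no longer bounded by 1, as the value 7/3 shows.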

203. Determining the problem attention of the initial problem according to the similarity between the initial problem and each historical problem corresponding to the historical behavior data.

As described for step 202, the historical behavior data includes both the type of each operation performed by the target user and the problem on which it was performed. Each historical problem corresponding to the historical behavior data is therefore a problem that the target user has operated on, that is, a preset problem stored in the knowledge database that the target user has commented on, praised, paid attention to, browsed or shared. Continuing the example in step 202: if the knowledge database stores a first, second, third and fourth preset problem, and the historical behavior data of the target user consists of three records, namely (x) the target user performed a praise operation on the first preset problem, (xi) the target user performed an attention operation on the second preset problem, and (xii) the target user performed a sharing operation on the third preset problem, then the first, second and third preset problems are historical problems corresponding to the historical behavior data, while the fourth preset problem is not.

By way of example, the semantic features of the initial problem and the semantic features of each historical problem can be extracted through a semantic model, and the semantic features of the initial problem and the semantic features of each historical problem are respectively compared to obtain the similarity between the initial problem and each historical problem.

The semantic model may be an open-source model such as word2vec.

Word2vec is a model that converts text into a vector format by encoding.

In some embodiments, the average of these similarities may be calculated and used as the problem attention of the initial problem. The higher the average similarity between the initial problem and the historical problems, the more likely the target user is to be interested in the initial problem, so the higher its problem attention and the more likely it is to be recommended to the target user.
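The averaging in step 203 can be illustrated as below. The text does not say how semantic features are compared; cosine similarity is assumed here as a common choice, and the two-dimensional vectors are made-up stand-ins for features that a semantic model would produce.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors (assumed metric)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def average_problem_attention(initial_vec, history_vecs):
    """Mean similarity between the initial problem and each historical problem."""
    if not history_vecs:
        return 0.0
    sims = [cosine_similarity(initial_vec, h) for h in history_vecs]
    return sum(sims) / len(sims)


# Toy features: one historical problem identical in direction, one orthogonal.
print(average_problem_attention([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]))  # 0.5
```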

204. Screening the initial problems according to the topic attention and the problem attention to obtain target problems to be recommended.

In some embodiments, the topic attention and the problem attention of each initial problem may be combined, for example by addition or multiplication, to obtain a recommendation score for each initial problem, and the target problems are selected from the initial problems according to the recommendation scores. For example, the product of the topic attention and the problem attention of each initial problem may be taken as its recommendation score, and the initial problems whose recommendation scores are greater than a preset score threshold taken as the target problems to be recommended.
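A minimal sketch of the product-and-threshold screening follows. The function name `screen_problems`, the identifiers `q1` and `q2`, and the numeric values are invented for illustration.

```python
def screen_problems(candidates, threshold):
    """Keep the initial problems whose recommendation score exceeds the threshold.

    candidates: dict mapping a problem id to a
    (topic_attention, problem_attention) pair (hypothetical layout).
    """
    selected = []
    for problem_id, (topic_att, problem_att) in candidates.items():
        score = topic_att * problem_att  # recommendation score as a product
        if score > threshold:
            selected.append(problem_id)
    return selected


candidates = {"q1": (2 / 3, 0.9), "q2": (1 / 3, 0.3)}
print(screen_problems(candidates, threshold=0.3))  # ['q1']
```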

In other embodiments, the topic attention and the problem attention of each initial problem may be multiplied and the product processed by a Box-Cox transformation to obtain the recommendation score of the initial problem, and the initial problems whose recommendation scores are greater than the preset score threshold taken as the target problems to be recommended. This reduces the correlation between unobservable errors and the recommendation scores of different initial problems and brings the recommendation scores closer to a normal distribution, which makes the screening of the initial problems more reasonable.

The Box-Cox transformation is a data transformation method used when a continuous response variable does not follow a normal distribution.
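For reference, the standard one-parameter Box-Cox transformation can be written as below. How the parameter is chosen is not stated in the text, so `lam` is left to the caller as an assumption. The transformation is defined only for positive inputs, so a product of 0 would need separate handling.

```python
import math


def box_cox(x, lam):
    """One-parameter Box-Cox transform of a positive value x.

    lam is the transformation parameter; its choice is an assumption here.
    """
    if x <= 0:
        raise ValueError("Box-Cox requires a positive input")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1) / lam


print(box_cox(4.0, 0.5))  # 2.0
print(box_cox(1.0, 0.5))  # 0.0
```

In practice the parameter is often estimated from the data, for example by maximum likelihood over all recommendation scores.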

It can be seen that the method of steps 201 to 204 considers, at coarse granularity, the topics that the target user pays attention to and, at fine granularity, the target user's interest in the individual initial problem, so the screened target problems better match the target user's preferences. If only coarse granularity is considered, the target user may be interested in the topic of the initial problem, but a topic covers several directions and the target user may not be interested in the direction of the initial problem, so the screened target problem may not match the target user's preferences. For example, the topic "IT" may include both instructions for using various software and background descriptions of that software. If the target user is interested only in the usage instructions but the initial problem concerns software background, screening at coarse granularity alone may select the initial problem as a target problem to be recommended, even though it clearly does not match the target user's preferences. If only fine granularity is considered, then when the initial problem uses rare terms or is of low quality, for example because it contains grammatical errors or wrongly written characters, the calculated similarity between the initial problem and each historical problem is inaccurate, and the target user's preferences are hard to determine from that similarity alone. The problem screening method provided by the embodiments of the present application avoids both issues.

After the target problems to be recommended are obtained, the number of problems already recommended to the target user may be queried. If that number is too large, no further problems are recommended, so as not to annoy the target user; otherwise, the target problems to be recommended are sent to the target user. In this case, after the step of "screening the initial problems according to the topic attention and the problem attention to obtain target problems to be recommended", whether to recommend the target problems to the target user may be decided as follows:

(4.1) Querying the recommended number of problems corresponding to the target user.

The recommended number of problems may be the total number of problems that have been recommended to the target user, or the number of problems recommended to the target user within a certain period. For example, the number of problems recommended to the target user in the n days before the current time may be obtained as the recommended number of problems.

The electronic device may read the interaction record between the knowledge database and the target user, and obtain the recommended number of questions corresponding to the target user. The interaction record between the knowledge database and the target user can be stored in the user information database.

(4.2) If the recommended number of problems is less than or equal to a preset number threshold, sending the target problems to the target user.

The number threshold may be set according to the requirements of the actual scenario, which is not limited in the embodiment of the present application.
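Steps (4.1) and (4.2) with a time window can be sketched as follows. The function `should_send`, its parameters and the list of timestamps are assumptions; the text only says the count is read from the interaction record between the knowledge database and the target user.

```python
from datetime import datetime, timedelta


def should_send(recommendation_times, now, window_days, number_threshold):
    """Decide whether the target problems may still be sent to the target user.

    recommendation_times: timestamps of past recommendations to this user,
    assumed to come from the stored interaction record.
    """
    window_start = now - timedelta(days=window_days)
    # Step (4.1): count the recommendations made within the last n days.
    recent = sum(1 for t in recommendation_times if window_start <= t <= now)
    # Step (4.2): send only if the count does not exceed the threshold.
    return recent <= number_threshold


now = datetime(2022, 3, 29)
times = [datetime(2022, 3, 28), datetime(2022, 3, 20), datetime(2022, 1, 1)]
print(should_send(times, now, window_days=14, number_threshold=2))  # True
print(should_send(times, now, window_days=14, number_threshold=1))  # False
```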

In summary, the problem screening method provided by the embodiments of the present application includes: acquiring an initial problem to be screened; determining the topic attention of the initial problem according to the historical behavior data of the target user and the topic type of the initial problem; determining the problem attention of the initial problem according to the similarity between the initial problem and each historical problem corresponding to the historical behavior data; and screening the initial problems according to the topic attention and the problem attention to obtain target problems to be recommended. On the one hand, when screening the target problems to be recommended, the method considers the coarse-grained topic attention, based on the topic to which the initial problem belongs, and the fine-grained problem attention, based on the individual initial problem, so the screened target problems better match the target user's preferences. This avoids inaccurate screening both when only coarse granularity is considered and the target user is not interested in the specific problem within the topic, and when only fine granularity is considered and the initial problem is uncommon or of low quality. On the other hand, the method does not depend on user feedback collected after problems are recommended but on the user's historical behavior data, so its accuracy is higher.

Although the approach in step 203 can roughly determine how likely the target user is to be interested in the initial problem, it is still not sufficiently accurate. Fig. 3 provides a method for obtaining the problem attention more accurately, in which the step of "determining the problem attention of the initial problem according to the similarity between the initial problem and each historical problem corresponding to the historical behavior data" includes:

301. Calculating the similarity between the initial problem and each historical problem corresponding to the historical behavior data, and taking the maximum similarity as the target similarity.

The description of the history problem and the method for calculating the similarity may refer to step 203, which is not described in detail.

302. Counting the number of preset types contained in the topic type of the initial problem.

As described above, the topic type of an initial problem may contain one preset type or several. For example, when the preset topic types comprise the four types "human resources, operations, products, IT" and the topic type of an initial problem contains human resources, operations and products, the number of preset types is 3. If the topic type of an initial problem contains only human resources, the number of preset types is 1.

303. Determining the problem attention of the initial problem according to the target similarity and the value coefficient corresponding to the number of preset types.

The value coefficient characterizes the value of recommending the initial problem to the target user.

In some embodiments, the value coefficient corresponding to the number of preset types may be obtained by the equation (1):

where $P_1$ is the value coefficient; $N$ is the total number of preset topic types, which is a predetermined value (for example, $N$ is 4 when the preset topic types comprise the four types "human resources, operations, products, IT"); and $i$ is the number of preset types contained in the topic type of the initial problem.

It can be seen that the more preset types the topic type of the initial problem contains, the broader and more common the content of the initial problem is, and thus the lower the value coefficient.

For example, the product of the value coefficient and the target similarity may be taken as the problem attention of the initial problem. For example, the problem attention of the initial problem can be calculated by the equation (2):

$$P_3 = P_1 \times P_2 \qquad (2)$$

where $P_1$ is the value coefficient, $P_2$ is the target similarity, and $P_3$ is the problem attention of the initial problem.
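Formula (2) combined with the maximum similarity of step 301 can be sketched as below. Because equation (1) is not reproduced in this text, the value coefficient is simply passed in by the caller; the function name and the sample numbers are illustrative assumptions.

```python
def problem_attention(similarities, value_coefficient):
    """P3 = P1 * P2, where P2 is the maximum (target) similarity.

    similarities: similarity of the initial problem to each historical problem.
    value_coefficient: P1, supplied by the caller because equation (1)
    is not available here.
    """
    if not similarities:
        return 0.0
    target_similarity = max(similarities)  # step 301
    return value_coefficient * target_similarity  # formula (2)


print(problem_attention([0.2, 0.9, 0.4], value_coefficient=0.5))  # 0.45
```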

Compared with the way the problem attention is obtained in step 203, the method of steps 301 to 303 differs in two respects. First, it introduces the value coefficient, so that the recommendation value of the initial problem is taken into account when the problem attention is determined. Second, it uses the maximum similarity (the target similarity) rather than the average similarity. As a result, when the content of the initial problem is uncommon, the method of steps 301 to 303 can still determine that the target user may be interested in the initial problem as long as that content appears in one of the target user's historical problems, and it yields an accurate problem attention even if the other historical problems have low similarity to the initial problem. If the average similarity were used instead, the low similarity of those other historical problems would pull the average down, so the resulting problem attention would be lower than the actual problem attention and therefore inaccurate.

In some embodiments, although the method of steps 301 to 303 improves the accuracy of the problem attention, some semantics may be lost when the similarity is obtained by comparing semantic features, and the lost semantics may affect the accuracy of the similarity. The semantic features also contain redundant information such as function words and punctuation marks, so the accuracy of the problem attention is limited. Fig. 4 provides another method for obtaining the target similarity, in which the step of "calculating the similarity between the initial problem and each historical problem corresponding to the historical behavior data to obtain the maximum target similarity" includes:

401. Detecting the part of speech of each word in the initial problem to obtain the target words of the target parts of speech in the initial problem and the total number of target words in the initial problem.

The target parts of speech may be all parts of speech other than function words and punctuation. Because function words and punctuation contribute little to expressing the meaning of the initial problem, step 401 removes this redundant information from the initial problem, which improves the efficiency and accuracy of the problem screening.

In some embodiments, the word segmentation model in the step (1.1) above may also be used to perform word segmentation on the initial problem to obtain word vectors corresponding to each word and symbol in the initial problem, and part of speech of each word, and then select word vectors corresponding to the target part of speech to obtain target words, where the target words may include one word or multiple words.

It should be noted that if steps (1.1) - (1.4) have been performed before step 201, then when step 401 is performed, the initial problem is not required to be processed through the word segmentation model again, and only the word vectors corresponding to the initial problem and having been segmented are required to be read, and then the parts of speech corresponding to the word vectors are detected, so that the target word can be obtained.

402. Counting the number of target words in each historical problem corresponding to the historical behavior data.

The description of the history problem may refer to the above, and detailed description is omitted.

In some embodiments, the word segmentation model may also be used to segment each historical problem to obtain the word vectors of that historical problem. For each historical problem, its word vectors are then compared with the one or more word vectors of the target words to determine whether the historical problem contains target words and how many it contains. For a historical problem A1, suppose the first word vectors obtained after word segmentation are a1, a2, a3 and a4, and the second word vectors corresponding to the target words are b1, b2 and b3. When step 402 is performed, each of b1, b2 and b3 may be compared in turn with a1, a2, a3 and a4, giving 4 similarities for each of b1, b2 and b3. If any of the similarities for a second word vector is greater than or equal to a preset second similarity threshold, that second word vector is judged to be the same as one of the first word vectors (a1, a2, a3 and a4), so the historical problem A1 contains the target word corresponding to that second word vector. For example, suppose the second similarity threshold is 0.9, the similarities between b1 and a1, a2, a3 and a4 are, in order, 0.5, 0.3, 0.6 and 0.9; those between b2 and a1, a2, a3 and a4 are, in order, 0.1, 0.2, 0.3 and 0.4; and those between b3 and a1, a2, a3 and a4 are, in order, 0.8, 0.9, 0.4 and 0.3. The historical problem A1 then contains the first target word and the third target word, corresponding to b1 and b3 respectively, and the number of target words in the historical problem A1 is 2. The value of the second similarity threshold is not limited and may be set according to the requirements of the actual scenario.
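The counting rule in this example can be reproduced with a few lines. The sketch starts from the similarity values given in the text instead of real word vectors; the function name `count_target_words` and the dictionary layout are assumptions.

```python
def count_target_words(similarity_rows, threshold):
    """Count target words matched by at least one word of a historical problem.

    similarity_rows: for each target word vector, its similarities to the
    word vectors of the historical problem (hypothetical layout).
    """
    return sum(1 for row in similarity_rows if row and max(row) >= threshold)


# Similarities of b1, b2 and b3 to a1..a4, as in the example for problem A1.
similarities_a1 = {
    "b1": [0.5, 0.3, 0.6, 0.9],
    "b2": [0.1, 0.2, 0.3, 0.4],
    "b3": [0.8, 0.9, 0.4, 0.3],
}
print(count_target_words(similarities_a1.values(), threshold=0.9))  # 2
```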

In other embodiments, the one or more word vectors corresponding to the target words may be compared directly with the semantic features of each historical problem to determine whether each historical problem contains target words and how many it contains. For the method of obtaining the semantic features, refer to the description above; details are not repeated here.

By way of example, each history problem can be processed through a word2vec model to obtain semantic features of each history problem, one or more word vectors corresponding to target words are respectively compared with the semantic features of each history problem, and word vectors with the corresponding similarity greater than a preset third similarity threshold value for each history problem are selected to obtain the number of target words in each history problem.

403. Calculating, for each historical problem, the ratio of the number of target words in that historical problem to the total number of words, to obtain the similarity between the initial problem and each historical problem, and taking the maximum similarity as the target similarity.

For each historical problem, the ratio of the number of target words to the total number of words indicates the degree of overlap between the words of the initial problem and the words of the historical problem. For example, if the total number of target words in the initial problem is 10 and a historical problem contains 8 of them, then after redundant information is removed 80% of the words of the initial problem appear in that historical problem, so 80% may be taken as the similarity between the initial problem and that historical problem. The largest of these similarities may be taken as the target similarity.
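Step 403 reduces to a ratio per historical problem followed by a maximum, as in this sketch. The function name and the counts 3 and 5 for the other historical problems are invented for illustration.

```python
def target_similarity(target_word_counts, total_target_words):
    """Maximum overlap ratio between the initial problem and the historical problems.

    target_word_counts: number of target words found in each historical problem.
    total_target_words: total number of target words in the initial problem.
    """
    if total_target_words == 0 or not target_word_counts:
        return 0.0
    # Similarity per historical problem is count / total; keep the largest.
    return max(count / total_target_words for count in target_word_counts)


print(target_similarity([8, 3, 5], total_target_words=10))  # 0.8
```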

It can be seen that, by the method from step 401 to step 403, since redundant information is eliminated, and the situation that the semantic information of the initial problem is lost in the process of extracting the semantic features is avoided, the accuracy of the target similarity can be improved.

In some embodiments, if the historical behavior data of the target user contains little information, for example only a small number of historical problems, the recommendation score of some initial problems may not be calculated correctly, and the recommendation score obtained by the electronic device for those problems may be 0. These problems may therefore be processed by the alternating least squares method to obtain corrected recommendation scores for the target user, and the initial problems screened according to the corrected scores to obtain the target problems to be recommended.

For example, a first recommendation score of the initial problem for the target user may first be calculated from the topic attention and the problem attention of the initial problem. When the first recommendation score meets a preset behavior-data-missing condition, for example when the first recommendation score is 0, second recommendation scores of the initial problem for the other users to be recommended, apart from the target user, are calculated by the same method. A recommendation score matrix is then constructed from the second recommendation scores and the first recommendation score. For example, the users may be taken as one dimension of the matrix and the initial problems as the other, so that each value in the matrix represents the recommendation score of one initial problem for one user. The recommendation score matrix is then iteratively processed by the alternating least squares method to obtain a target score matrix, the first recommendation score is updated with the corresponding value in the target score matrix, and the target problems to be recommended are screened from the initial problems according to the updated first recommendation score.

The alternating least squares method decomposes the target matrix into the product of two smaller matrices and then applies least squares alternately, fixing one small matrix while solving for the other, so as to estimate the missing values in the target matrix.
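The idea can be illustrated with a deliberately small sketch. It assumes a rank-1 factorization (one latent factor per user and per problem), marks missing scores with `None` instead of 0, and adds a tiny regularization term to avoid division by zero. None of these choices, nor the name `als_rank1`, comes from the described method; a practical system would typically use a higher rank and a library implementation.

```python
def als_rank1(scores, iterations=200, reg=1e-9):
    """Fill missing entries (None) of a user-by-problem score matrix.

    Hypothetical helper: fits scores[i][j] ~ u[i] * v[j] on the observed
    entries only, alternating between the two factor vectors.
    """
    n_users, n_problems = len(scores), len(scores[0])
    u = [1.0] * n_users      # one latent factor per user
    v = [1.0] * n_problems   # one latent factor per problem
    for _ in range(iterations):
        # Fix v and solve each u[i] by least squares over its observed scores.
        for i in range(n_users):
            seen = [j for j in range(n_problems) if scores[i][j] is not None]
            num = sum(scores[i][j] * v[j] for j in seen)
            den = sum(v[j] ** 2 for j in seen) + reg
            u[i] = num / den
        # Fix u and solve each v[j] the same way.
        for j in range(n_problems):
            seen = [i for i in range(n_users) if scores[i][j] is not None]
            num = sum(scores[i][j] * u[i] for i in seen)
            den = sum(u[i] ** 2 for i in seen) + reg
            v[j] = num / den
    # The completed matrix is the product of the two factor vectors.
    return [[u[i] * v[j] for j in range(n_problems)] for i in range(n_users)]


# Rows are users, columns are initial problems; one score is missing.
observed = [[1.0, 2.0],
            [2.0, None]]
completed = als_rank1(observed)
print(round(completed[1][1], 2))  # close to 4.0
```

In this toy matrix the second row is twice the first in the observed column, so the missing entry is estimated as roughly twice 2.0.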

In order to better implement the problem screening method according to the embodiment of the present application, on the basis of the problem screening method, the embodiment of the present application further provides a problem screening apparatus, as shown in fig. 5, which is a schematic structural diagram of an embodiment of the problem screening apparatus according to the embodiment of the present application, where the problem screening apparatus 500 includes:

an obtaining unit 501, configured to obtain an initial problem to be screened;

a statistics unit 502, configured to compute a topic interest degree of the initial problem according to historical behavior data of a target user and a topic type of the initial problem;

a determining unit 503, configured to determine a problem interest degree of the initial problem according to a similarity between the initial problem and each historical problem corresponding to the historical behavior data;

and a screening unit 504, configured to screen, from the initial problems, a target problem to be recommended according to the topic interest degree and the problem interest degree.
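The cooperation of the four units can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the implementation of the present application: the topic interest is approximated here by the share of the user's historical behaviors that fall on the same topic type, the similarity is a token-overlap (Jaccard) measure standing in for whatever similarity the embodiment uses, and the two values are combined by a hypothetical weight `w_topic`. All function names, field names and parameters are invented for the example.

```python
from collections import Counter


def topic_interest(history, topic):
    """Share of historical behaviors on this topic type (assumed measure)."""
    if not history:
        return 0.0
    counts = Counter(item["topic"] for item in history)
    return counts[topic] / len(history)


def jaccard(a, b):
    """Token-overlap similarity, used here only as a stand-in."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def problem_interest(history, text):
    """Maximum similarity between the problem and any historical problem."""
    return max((jaccard(text, item["text"]) for item in history), default=0.0)


def screen(initial, history, top_k=1, w_topic=0.5):
    """Rank initial problems by a weighted sum of the two interest values."""
    scored = []
    for problem in initial:
        score = (w_topic * topic_interest(history, problem["topic"])
                 + (1 - w_topic) * problem_interest(history, problem["text"]))
        scored.append((score, problem["text"]))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:top_k]]


history = [
    {"topic": "customs", "text": "how to file customs declaration"},
    {"topic": "customs", "text": "customs fee refund"},
    {"topic": "hr", "text": "annual leave policy"},
]
initial = [
    {"topic": "customs", "text": "how to file a customs declaration form"},
    {"topic": "it", "text": "reset vpn password"},
]
recommended = screen(initial, history)
```

With these sample inputs, the customs problem ranks first because it scores on both the coarse-grained topic measure and the fine-grained similarity measure, while the unrelated problem scores zero on both.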

In a possible implementation of the present application, the statistics unit 502 is further configured to:

acquiring a preset score corresponding to the target behavior;

In a possible implementation of the present application, the determining unit 503 is further configured to:

In a possible implementation of the present application, the screening unit 504 is further configured to:

updating the first recommendation score according to the target score matrix;

In a specific implementation, each of the above units may be implemented as an independent entity, or the units may be combined arbitrarily and implemented as one entity or several entities. For the specific implementation of each unit, refer to the foregoing method embodiments; details are not repeated here.

Since the problem screening apparatus can execute the steps of the problem screening method in any embodiment, it can achieve the beneficial effects of the problem screening method in any embodiment of the present application; details are not repeated here.

In addition, to better implement the problem screening method of the embodiments of the present application, an embodiment of the present application further provides, on the basis of that method, an electronic device. Referring to fig. 6, which shows a schematic structural diagram of the electronic device, the electronic device includes a processor 601 configured to implement the steps of the problem screening method in any embodiment when executing a computer program stored in a memory 602; alternatively, the processor 601 is configured to implement the functions of the units in the embodiment corresponding to fig. 5 when executing the computer program stored in the memory 602.

By way of example, the computer program may be partitioned into one or more modules/units, which are stored in the memory 602 and executed by the processor 601 to implement the embodiments of the present application. The one or more modules/units may be a series of computer program instruction segments capable of performing particular functions, the instruction segments being used to describe the execution process of the computer program in the computer device.

The electronic device may include, but is not limited to, the processor 601 and the memory 602. Those skilled in the art will appreciate that the illustration is merely an example of the electronic device and does not limit it; the electronic device may include more or fewer components than illustrated, combine certain components, or use different components.

The processor 601 may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor or any conventional processor. The processor is the control center of the electronic device and connects the parts of the whole electronic device through various interfaces and lines.

The memory 602 may be used to store computer programs and/or modules, and the processor 601 implements the various functions of the computer device by running or executing the computer programs and/or modules stored in the memory 602 and invoking the data stored in the memory 602. The memory 602 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system, an application program required for at least one function (such as a sound playing function or an image playing function), and the like; the data storage area may store data created according to the use of the electronic device (such as audio data or video data), and the like. In addition, the memory may include high-speed random access memory, and may also include non-volatile memory, such as a hard disk, an internal memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash card, at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device.

It will be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working process of the problem screening apparatus, the electronic device and the corresponding units described above may refer to the description of the problem screening method in any embodiment, and will not be described in detail herein.

Those of ordinary skill in the art will appreciate that all or some of the steps of the methods in the above embodiments may be completed by instructions, or by instructions controlling associated hardware, and that the instructions may be stored in a readable storage medium and loaded and executed by a processor.

Accordingly, an embodiment of the present application provides a readable storage medium storing a computer program which, when executed by a processor, performs the steps of the problem screening method in any embodiment of the present application. For the specific operations, refer to the description of the problem screening method in any embodiment; details are not repeated here.

The readable storage medium may include a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, and the like.

Because the instructions stored in the readable storage medium can execute the steps in the problem screening method in any embodiment of the present application, the beneficial effects that can be achieved by the problem screening method in any embodiment of the present application can be achieved, and detailed descriptions are omitted herein.

The problem screening method, apparatus, storage medium and electronic device provided by the embodiments of the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the description of the above embodiments is intended only to help understand the method and core idea of the present application. Meanwhile, those skilled in the art may, following the ideas of the present application, make changes to the specific implementations and the scope of application. In summary, the content of this description should not be construed as limiting the present application.

Claims

1. A problem screening method, comprising:

acquiring an initial problem to be screened;

2. The method according to claim 1, wherein the counting the topic interest degree of the initial question according to the historical behavior data of the target user and the topic type of the initial question comprises:

3. The method of claim 2, wherein determining the topic interest level of the initial problem according to the number of user behaviors and the number of target behaviors comprises:

acquiring a preset score corresponding to the target behavior;

4. The method according to claim 1, wherein determining the problem interest level of the initial problem according to the similarity between the initial problem and each historical problem corresponding to the historical behavior data includes:

5. The method of claim 4, wherein calculating the similarity between the initial problem and each historical problem corresponding to the historical behavior data to obtain the maximum target similarity comprises:

6. The method of claim 1, wherein the step of screening the initial question for a target question to be recommended according to the topic interest level and the question interest level includes:

if the first recommendation score meets a preset behavior data deletion condition, obtaining second recommendation scores of the initial questions for the users to be recommended, and constructing a recommendation score matrix between the users and the initial questions according to the first recommendation scores and the second recommendation scores;

updating the first recommendation score according to the target score matrix;

7. The method according to any one of claims 1 to 6, wherein after the target problem to be recommended is screened from the initial problems according to the topic interest level and the problem interest level, the method further comprises:

8. A problem screening apparatus, comprising:

an acquisition unit for acquiring initial problems to be screened;

9. An electronic device, comprising a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the problem screening method according to any one of claims 1 to 7 when executing the computer program.

10. A readable storage medium, characterized in that the readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of the problem screening method of any one of claims 1 to 7.