CN114840762A - Recommended content determining method and device and electronic equipment - Google Patents

Recommended content determining method and device and electronic equipment

Info

Publication number
CN114840762A
Authority
CN
China
Prior art keywords
retrieval
contents
recommended
cluster
content
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210546185.0A
Other languages
Chinese (zh)
Inventor
吕乐宾
王洪斌
权佳成
李宽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Mashang Xiaofei Finance Co Ltd
Original Assignee
Mashang Xiaofei Finance Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mashang Xiaofei Finance Co Ltd
Priority to CN202210546185.0A
Publication of CN114840762A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques

Abstract

The application discloses a method and a device for determining recommended content, and the scheme comprises the following steps: acquiring target information, wherein the target information comprises source target content of a scene to be recommended, the number n of the content to be recommended and a recommendation condition; determining the distances between the source target content and W retrieval clusters in a content retrieval library to obtain a plurality of distances; each retrieval cluster comprises one or more retrieval contents, the association degree between the retrieval contents in the same retrieval cluster is greater than or equal to a first threshold, the association degree between the retrieval contents in different retrieval clusters is smaller than a second threshold, and the second threshold is smaller than the first threshold; and determining N contents to be recommended which meet the recommendation condition based on the plurality of distances. By adopting the method and the device, the actual requirement of relevance between the recommended contents in the scene to be recommended is met, and the accuracy of the recommended contents is improved.

Description

Recommended content determining method and device and electronic equipment
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method and an apparatus for determining recommended content, and an electronic device.
Background
In the information age, information production and consumption have promoted the rapid development of the information industry and information technology. Currently, the internet has become an important information source, however, the huge scale of the internet and the rapid growth of information resources bring about the problem of information overload, i.e. although the current information resources are abundant, people have difficulty in effectively acquiring useful information.
At present, content recommendation usually selects recommended content that matches a user's behavioral habits and user profile, based on information such as those habits and that profile. However, this recommendation approach is one-dimensional and cannot meet the requirements of some scenes.
Disclosure of Invention
The embodiments of the present application aim to provide a method and an apparatus for determining recommended content, which serve scenes that place requirements on the degree of association between recommended contents.
In a first aspect, a method for determining recommended content is provided, including:
acquiring target information, wherein the target information comprises source target content of a scene to be recommended, the number n of the content to be recommended and a recommendation condition, and n is an integer greater than 1;
determining distances between the source target content and W retrieval clusters in a content retrieval library to obtain a plurality of distances, wherein W is an integer larger than 1; each retrieval cluster comprises one or more retrieval contents, the association degree between the retrieval contents in the same retrieval cluster is greater than or equal to a first threshold, the association degree between the retrieval contents in different retrieval clusters is smaller than a second threshold, and the second threshold is smaller than the first threshold;
determining, based on the plurality of distances, N contents to be recommended which meet the recommendation condition, wherein N is less than or equal to n;
wherein the degree of association among M of the N contents to be recommended is greater than or equal to the first threshold; or the degree of association among K of the N contents to be recommended is smaller than the second threshold; M and K are both integers greater than N/2, and M and K are both less than or equal to N.
In a second aspect, a recommended content determining apparatus is provided, including:
an acquisition module, configured to acquire target information, wherein the target information comprises source target content of a scene to be recommended, the number n of contents to be recommended and a recommendation condition, and n is an integer greater than 1;
a distance determining module, configured to determine the distances between the source target content and W retrieval clusters in a content retrieval library to obtain a plurality of distances, wherein W is an integer greater than 1; each retrieval cluster comprises one or more retrieval contents, the association degree between the retrieval contents in the same retrieval cluster is greater than or equal to a first threshold, the association degree between the retrieval contents in different retrieval clusters is smaller than a second threshold, and the second threshold is smaller than the first threshold;
a content determination module, configured to determine, based on the plurality of distances, N contents to be recommended which meet the recommendation condition, wherein N is less than or equal to n;
wherein the degree of association among M of the N contents to be recommended is greater than or equal to the first threshold; or the degree of association among K of the N contents to be recommended is smaller than the second threshold; M and K are both integers greater than N/2, and M and K are both less than or equal to N.
In a third aspect, an electronic device is provided, which comprises a processor, a memory and a computer program stored on the memory and operable on the processor, which computer program, when executed by the processor, performs the steps of the recommended content determination method according to the first aspect.
In a fourth aspect, a computer-readable storage medium is provided, on which a computer program is stored, which computer program, when being executed by a processor, carries out the steps of the recommended content determination method according to the first aspect.
In the embodiments of the present application, the retrieval contents in the content retrieval library are grouped into clusters in advance, such that retrieval contents in the same cluster are related to one another and retrieval contents in different clusters differ from one another. When recommended content is determined, the distance between the source target content of the scene to be recommended and each cluster in the content retrieval library is determined first, and the recommended contents meeting the recommendation condition are then determined based on those distances. Because most of the determined recommended contents are either related to one another or different from one another, the actual requirement that the scene to be recommended places on the association between recommended contents is met. In addition, because the recommended contents are determined based on the distance between the source target content and each cluster, the accuracy of the recommended contents is improved.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
Fig. 1 is a flowchart illustrating a recommended content determining method according to an embodiment of the present application.
Fig. 2 is a diagram illustrating a first practical scenario of an application of a recommended content determining method according to an embodiment of the present application.
Fig. 3 is an exemplary diagram of a second practical scenario of an application of a recommended content determining method according to an embodiment of the present application.
Fig. 4 is a schematic structural diagram of a recommended content determining apparatus according to an embodiment of the present application.
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application. The reference numbers in the present application are only used for distinguishing the steps in the scheme and are not used for limiting the execution sequence of the steps, and the specific execution sequence is described in the specification.
As described in the background, conventional content recommendation often selects, for a user, recommended content that matches the user's behavioral habits and user profile, based on information such as those habits and that profile. More specifically, content recommendation is often centered on source target content that matches the user's behavioral habits and user profile, and retrieval contents having a high content similarity with the source target content are looked up in a content retrieval library as the contents to be recommended. As a result, in a scene that calls for strongly related recommendations, retrieval contents that are each highly similar to the source target content but differ greatly from one another may all be recommended, so that the contents recommended to the user differ widely. Conversely, in a scene that calls for diverse recommendations, several retrieval contents of the same category may all be recommended to the user. Clearly, in a scene to be recommended that places an additional requirement on the similarity between recommended contents, the contents to be recommended determined by existing content recommendation methods are not sufficiently accurate.
In order to solve the above problems in the prior art, an embodiment of the present application provides a method for determining recommended content. The method is applied to a recommended content determining apparatus. As shown in fig. 1, a method for determining recommended content provided in an embodiment of the present application includes:
S110, acquiring target information, wherein the target information comprises source target content of a scene to be recommended, the number n of contents to be recommended and a recommendation condition, and n is an integer greater than 1.
As an example, the scene to be recommended may be one in which an intelligent customer service answers questions posed by a user. In this scene, the question posed by the user can be acquired as the source target content, and the number n of contents to be recommended can be preset based on the scene. For example, it may be set that, after the user's question is acquired, 5 questions related to it are returned together with their answers; these questions and answers are recommended to the user as the n contents to be recommended, so that the user can select the question that best fits, which raises the probability that the intelligent customer service correctly resolves the user's question. For example, when a user enters "I want to raise my limit; how do I do that?" in the input box of the intelligent customer service of a financial APP, the content of that question can be acquired as the source target content.
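As a minimal sketch, the target information acquired in S110 could be held in a small record such as the one below. The class names `TargetInfo` and `RecommendationCondition`, and the two condition labels, are hypothetical and are introduced here for illustration only.

```python
from dataclasses import dataclass
from enum import Enum


class RecommendationCondition(Enum):
    """Hypothetical labels for the two kinds of recommendation condition."""
    RELATED = "related"   # most recommended contents are associated with one another
    DIVERSE = "diverse"   # most recommended contents differ from one another


@dataclass(frozen=True)
class TargetInfo:
    source_content: str                 # source target content, e.g. the user's question
    n: int                              # requested number of contents to be recommended
    condition: RecommendationCondition  # recommendation condition of the scene

    def __post_init__(self) -> None:
        # The method requires n to be an integer greater than 1.
        if not isinstance(self.n, int) or self.n <= 1:
            raise ValueError("n must be an integer greater than 1")


info = TargetInfo("example user question", 5, RecommendationCondition.RELATED)
```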
S120, determining the distances between the source target content and W retrieval clusters in the content retrieval library to obtain a plurality of distances, wherein W is an integer larger than 1.
Each retrieval cluster comprises one or more retrieval contents, the association degree between the retrieval contents in the same retrieval cluster is larger than or equal to a first threshold, the association degree between the retrieval contents in different retrieval clusters is smaller than a second threshold, and the second threshold is smaller than the first threshold.
As an example, the distance between the source target content and the W retrieval clusters in the content retrieval library may specifically be the content similarity between the source target content and those W retrieval clusters. More specifically, this content similarity can be obtained by calculating the Euclidean distance, the Hamming distance, the edit distance, the Jaccard distance, the cosine distance, the Manhattan distance, and the like.
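The distance measures named above can be sketched in plain Python as follows. How contents are represented (numeric vectors, token sets or strings) is not specified here, so the input types below are assumptions chosen for illustration.

```python
import math
from typing import Sequence, Set


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def jaccard_distance(a: Set[str], b: Set[str]) -> float:
    union = a | b
    if not union:
        return 0.0  # two empty sets are treated as identical
    return 1.0 - len(a & b) / len(union)


def hamming_distance(a: str, b: str) -> int:
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length inputs")
    return sum(ca != cb for ca, cb in zip(a, b))


def edit_distance(a: str, b: str) -> int:
    # Levenshtein distance computed row by row.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]
```

Which measure is suitable depends on the representation of the contents; vector measures fit embedded text, while the set and string measures fit tokenized or raw text.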
Optionally, in order to improve accuracy and efficiency of determining the distance between the source target content and the W search clusters in the content search library, in the embodiment of the present application, the distance between the source target content and the W search clusters in the content search library may be determined by a pre-trained search model. Specifically, determining the distance between the source target content and W retrieval clusters in the content retrieval library comprises the following steps:
determining the distance between the source target content and W retrieval clusters in the content retrieval library through a retrieval model;
the retrieval model is trained on a plurality of source target contents, the retrieval contents similar to each of those source target contents, and the distance between each source target content and its corresponding similar retrieval content.
As an example, the retrieval model may be used to detect similarity between the source target content and retrieval content, or to match the source target content against retrieval content; in other words, it determines the content similarity between the two. The training process of the retrieval model may include: first, according to the scene to be recommended, selecting two or more pieces of content data with similar contents, such as question dialogues, as a group of positive samples, and selecting two or more pieces of content data with dissimilar contents as a group of negative samples; then feeding the positive and negative samples into a neural network model in turn, which performs feature extraction and matching to produce a content similarity score for each input sample; and finally, optimizing the neural network model against preset content similarity labels, thereby obtaining a content similarity calculation model suited to the scene to be recommended, namely the retrieval model.
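The sample preparation step described above can be sketched as follows. This covers only the construction of labeled positive and negative pairs; the network architecture and the optimization procedure are not specified here and are not shown. The function name `build_training_pairs`, the grouping of inputs and the label values 1.0 and 0.0 are assumptions made for illustration.

```python
from itertools import combinations
from typing import List, Tuple


def build_training_pairs(groups: List[List[str]]) -> List[Tuple[str, str, float]]:
    """Build (content_a, content_b, label) triples.

    Assumes each inner list holds contents that are similar to one another,
    and that contents in different lists are dissimilar.
    """
    pairs: List[Tuple[str, str, float]] = []
    # Positive samples: every pair of contents inside the same group.
    for group in groups:
        for a, b in combinations(group, 2):
            pairs.append((a, b, 1.0))
    # Negative samples: every pair of contents drawn from two different groups.
    for g1, g2 in combinations(groups, 2):
        for a in g1:
            for b in g2:
                pairs.append((a, b, 0.0))
    return pairs


groups = [["q1", "q2", "q3"], ["q4"]]
pairs = build_training_pairs(groups)
```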
Optionally, in order to improve the efficiency of determining contents similar to the source target content, in the embodiments of the present application the retrieval contents in the content retrieval library may be clustered in advance to obtain a plurality of retrieval clusters. The retrieval contents within each retrieval cluster satisfy a set content similarity, that is, their content similarity on certain set features lies within a preset range; in other words, the degree of association between the retrieval contents included in each retrieval cluster is greater than or equal to the first threshold. Specifically, the W retrieval clusters are obtained by clustering the retrieval contents in the content retrieval library with W retrieval target contents as centers, wherein each retrieval cluster includes at least one retrieval content. Each retrieval cluster is obtained by clustering around one selected retrieval target content, and a certain content difference exists among the retrieval target contents, that is, the degree of association among the retrieval target contents is smaller than the second threshold.
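A minimal sketch of clustering around W chosen centers is given below. It makes several assumptions that go beyond the text: the association measure is a simple word-overlap score, a content is attached to its most associated center only when that association reaches the first threshold, contents that reach no center are returned separately, and only the association with the center is checked rather than every pair inside a cluster. The function names are hypothetical.

```python
from typing import Callable, Dict, List, Tuple


def word_overlap(a: str, b: str) -> float:
    """Toy association measure: Jaccard similarity of the word sets."""
    wa, wb = set(a.split()), set(b.split())
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0


def cluster_around_centers(
    contents: List[str],
    centers: List[str],
    association: Callable[[str, str], float],
    first_threshold: float,
) -> Tuple[Dict[str, List[str]], List[str]]:
    clusters: Dict[str, List[str]] = {c: [c] for c in centers}
    unassigned: List[str] = []
    for item in contents:
        if item in clusters:
            continue  # the centers are already members of their own clusters
        best = max(centers, key=lambda c: association(item, c))
        if association(item, best) >= first_threshold:
            clusters[best].append(item)
        else:
            unassigned.append(item)  # assumption: left out rather than forced in
    return clusters, unassigned


contents = [
    "raise card limit", "check card limit", "card limit missing",
    "reset login password", "change login password",
]
centers = ["raise card limit", "reset login password"]
clusters, unassigned = cluster_around_centers(contents, centers, word_overlap, 0.5)
```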
Optionally, the distance between a retrieval cluster and the source target content may be taken as the distance between the source target content and the retrieval content in that cluster having the highest content similarity with the source target content. Specifically, the distance between the source target content and a retrieval cluster is determined as follows:
determining the distance between the source target content and each retrieval content in the retrieval cluster;
and taking the minimum of the determined distances as the distance between the source target content and the retrieval cluster.
After the retrieval model is trained, it can be used to determine the distance, that is, the content similarity, between the source target content and the plurality of retrieval clusters in the content retrieval library. The retrieval model outputs the distance between the source target content and each retrieval content in each retrieval cluster; for each retrieval cluster, the retrieval content with the smallest distance to the source target content is determined as that cluster's target retrieval content, and the distance between the source target content and that target retrieval content is taken as the distance between the source target content and the cluster.
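The minimum-distance rule for a retrieval cluster can be sketched as below. For brevity the contents are stood in for by plain numbers and the distance by an absolute difference; both are assumptions for illustration, and in practice the distance would come from the retrieval model.

```python
from typing import Callable, Sequence, Tuple, TypeVar

T = TypeVar("T")


def cluster_distance(
    source: T,
    cluster: Sequence[T],
    distance: Callable[[T, T], float],
) -> Tuple[float, T]:
    """Return (distance to the cluster, target retrieval content of the cluster)."""
    # The target retrieval content is the member closest to the source target content.
    target = min(cluster, key=lambda item: distance(source, item))
    return distance(source, target), target


def toy_distance(a: float, b: float) -> float:
    return abs(a - b)


retrieval_clusters = {"c1": [0.9, 0.3, 0.5], "c2": [0.2]}
distances = {
    name: cluster_distance(0.0, members, toy_distance)[0]
    for name, members in retrieval_clusters.items()
}
```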
S130, determining, based on the plurality of distances, N contents to be recommended which meet the recommendation condition, wherein N is less than or equal to n.
The N contents to be recommended are contents in a content search library.
The degree of association among M of the N contents to be recommended is greater than or equal to the first threshold; or the degree of association among K of the N contents to be recommended is smaller than the second threshold; M and K are both integers greater than N/2, and M and K are both less than or equal to N.
The degree of association between contents to be recommended can be represented by the content similarity between them: the greater the content similarity, the greater the degree of association, and the smaller the content similarity, the smaller the degree of association.
In the embodiments of the present application, the scene to be recommended may be of two types. In the first, contents that are related to one another are mainly recommended to users, that is, the determined contents to be recommended satisfy the recommendation condition that the degree of association among most (more than half) of them is greater than or equal to the first threshold. In the second, contents that differ from one another are mainly recommended to users, that is, the determined contents to be recommended satisfy the recommendation condition that the degree of association among most (more than half) of them is smaller than the second threshold.
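One possible reading of the two recommendation conditions is sketched below as a brute-force check: a result satisfies the first condition if some subset of more than half of the contents is pairwise associated at or above the first threshold, and the second condition if some such subset is pairwise associated below the second threshold. The pairwise interpretation, the function names and the example scores are all assumptions for illustration; the check enumerates subsets and is only practical for small N.

```python
from itertools import combinations
from typing import Callable, Dict, FrozenSet, Sequence


def _majority_pairwise(contents: Sequence[str], pair_ok: Callable[[str, str], bool]) -> bool:
    # Smallest integer greater than N/2; if a larger qualifying subset exists,
    # a qualifying subset of exactly this size exists as well.
    m = len(contents) // 2 + 1
    return any(
        all(pair_ok(a, b) for a, b in combinations(subset, 2))
        for subset in combinations(contents, m)
    )


def majority_related(contents, association, first_threshold: float) -> bool:
    return _majority_pairwise(contents, lambda a, b: association(a, b) >= first_threshold)


def majority_diverse(contents, association, second_threshold: float) -> bool:
    return _majority_pairwise(contents, lambda a, b: association(a, b) < second_threshold)


# Hypothetical pairwise association scores for four recommended contents.
scores: Dict[FrozenSet[str], float] = {
    frozenset(("A", "C")): 0.90, frozenset(("A", "D")): 0.80, frozenset(("C", "D")): 0.85,
    frozenset(("A", "E")): 0.10, frozenset(("C", "E")): 0.10, frozenset(("D", "E")): 0.10,
}


def lookup(a: str, b: str) -> float:
    return scores[frozenset((a, b))]


recommended = ["A", "C", "D", "E"]
```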
When the scene to be recommended mainly recommends to users contents that are related to one another, the recommendation condition may be summarized as: the degree of association between recommended contents is greater than or equal to the first threshold. In this case, a first retrieval cluster closest to the source target content may be determined from the plurality of retrieval clusters in the content retrieval library, and it may be determined whether the number of retrieval contents in the first retrieval cluster is greater than or equal to the number n of contents to be recommended. If the number of retrieval contents in the first retrieval cluster is greater than n, the n retrieval contents in the first retrieval cluster closest to the source target content may be determined as the N contents to be recommended, where N equals n. It should be understood that if the number of retrieval contents in the first retrieval cluster equals n, all retrieval contents in the first retrieval cluster may be determined as the N contents to be recommended, where N again equals n.
Specifically, if the recommendation condition is that the degree of association between recommended contents is greater than or equal to the first threshold, determining N contents to be recommended which meet the recommendation condition based on the plurality of distances comprises:
if the number of retrieval contents included in the first retrieval cluster is greater than n, selecting contents in the first retrieval cluster based on a first selection strategy to obtain the N contents to be recommended;
the first retrieval cluster is the retrieval cluster corresponding to the minimum distance among the plurality of distances;
the first selection strategy is: selecting the n retrieval contents in the first retrieval cluster that are closest to the source target content.
It should be understood that if the number of retrieval contents included in the first retrieval cluster equals n, all retrieval contents included in the first retrieval cluster may be determined as the N contents to be recommended.
When the scene to be recommended mainly recommends to users contents that are related to one another, if the number of retrieval contents in the first retrieval cluster is smaller than the number n of contents to be recommended, all retrieval contents in the first retrieval cluster can be added to the set of contents to be recommended, and a second retrieval cluster closest to the source target content is then determined from the retrieval clusters other than the first retrieval cluster among the plurality of retrieval clusters in the content retrieval library. If the sum M of the number of retrieval contents included in the first retrieval cluster and the number of retrieval contents included in the second retrieval cluster is greater than or equal to n, then, when M > n, the first L retrieval contents in the second retrieval cluster closest to the source target content are added to the set of contents to be recommended, where the sum of L and the number of retrieval contents included in the first retrieval cluster equals n, and N equals n. It should be understood that when M equals n, all retrieval contents in the second retrieval cluster are added to the set of contents to be recommended, where N equals n.
Specifically, if the recommendation condition is that the degree of association between recommended contents is greater than or equal to the first threshold, determining N contents to be recommended which meet the recommendation condition based on the plurality of distances may further include:
if the number of retrieval contents included in the first retrieval cluster is smaller than n, and the total number of retrieval contents included in the first retrieval cluster and the second retrieval cluster is greater than n, selecting contents in the first retrieval cluster and the second retrieval cluster based on a second selection strategy to obtain the N contents to be recommended;
the second retrieval cluster is the retrieval cluster corresponding to the smallest of the plurality of distances other than the minimum distance;
the second selection strategy is: selecting all retrieval contents in the first retrieval cluster, and selecting the first L retrieval contents in the second retrieval cluster that are closest to the source target content, where the sum of L and the number of retrieval contents included in the first retrieval cluster equals n.
It should be understood that if the number of retrieval contents included in the first retrieval cluster is smaller than n, and the total number of retrieval contents included in the first retrieval cluster and the second retrieval cluster equals n, all retrieval contents included in the first retrieval cluster and the second retrieval cluster may be determined as the N contents to be recommended.
If the sum of the number of retrieval contents included in the first retrieval cluster and the number of retrieval contents included in the second retrieval cluster is still less than n, a third retrieval cluster closest to the source target content may be determined from the retrieval clusters other than the first and second retrieval clusters, and the first M retrieval contents closest to the source target content may be selected from the third retrieval cluster as contents to be recommended, and so on until the number of contents to be recommended reaches n.
However, contents drawn from further retrieval clusters differ greatly from those already selected. To avoid the determined contents to be recommended failing the recommendation condition of the scene to be recommended when the recommendation condition is that the degree of association between recommended contents is greater than or equal to the first threshold and the total number of retrieval contents included in the first and second retrieval clusters is smaller than n, all retrieval contents included in the first retrieval cluster and the second retrieval cluster may be determined as the N contents to be recommended, and no retrieval contents are selected from the remaining retrieval clusters. In this case N is less than n. It should be understood that, in order to satisfy the recommendation condition of the scene to be recommended, N is a positive integer greater than n/2.
Specifically, if the recommendation condition is that the degree of association between recommended contents is greater than or equal to the first threshold, determining N contents to be recommended which meet the recommendation condition based on the plurality of distances may further include:
if the number of retrieval contents included in the first retrieval cluster is less than n, and the total number of retrieval contents included in the first retrieval cluster and the second retrieval cluster is less than n, determining all retrieval contents included in the first retrieval cluster and the second retrieval cluster as the N contents to be recommended.
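The selection for the related-content condition can be sketched as one function covering the first selection strategy, the second selection strategy, and the case in which the first two retrieval clusters together hold fewer than n contents. This sketch follows the variant that stops after the second retrieval cluster. Contents are again stood in for by numbers with an absolute-difference distance, which is an assumption for illustration, as is the function name `select_related`.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def select_related(
    source: T,
    clusters: Sequence[Sequence[T]],
    n: int,
    distance: Callable[[T, T], float],
) -> List[T]:
    def nearest_first(cluster: Sequence[T]) -> List[T]:
        return sorted(cluster, key=lambda item: distance(source, item))

    # Rank clusters by cluster distance (minimum member distance to the source).
    ranked = sorted(clusters, key=lambda c: min(distance(source, x) for x in c))
    first = nearest_first(ranked[0])
    if len(first) >= n:
        return first[:n]              # first selection strategy, N == n
    second = nearest_first(ranked[1])  # W > 1, so a second cluster exists
    needed = n - len(first)            # L in the second selection strategy
    # If the second cluster holds fewer than `needed` contents, the slice returns
    # all of it, which gives N < n without touching any further cluster.
    return first + second[:needed]


def toy_distance(a: float, b: float) -> float:
    return abs(a - b)


clusters_a = [[0.1, 0.2], [0.5, 0.6, 0.7], [0.9]]
clusters_b = [[0.1], [0.5], [0.9, 1.0]]
```

With these toy inputs, `select_related(0.0, clusters_a, 2, toy_distance)` uses only the first cluster, `select_related(0.0, clusters_a, 4, toy_distance)` tops up from the second cluster, and `select_related(0.0, clusters_b, 3, toy_distance)` returns two contents, so N is less than n.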
Fig. 2 is an exemplary diagram of a first practical scenario to which a recommended content determining method according to an embodiment of the present application is applied, where the scene to be recommended mainly recommends to users contents that are related to one another, and the number n of contents to be recommended is set to 3. In Fig. 2, the source target content S is the question a user puts to the intelligent customer service: "I want to raise my limit; how do I do that?" The content retrieval library contains retrieval content A, "Cloud flash payment limit increase?"; retrieval content B, "What is the limit for activating cloud flash payment?"; retrieval content C, "Why is there no limit on cloud flash payment?"; retrieval content D, "How do I check my cloud flash payment limit?"; and retrieval content E, "How do I raise my limit on ease flower?". The content similarity between every two of the retrieval contents A to E can be determined in advance through the retrieval model, and the retrieval contents A to E are clustered according to these pairwise content similarities; assume the clustering result is as shown in Fig. 2, where the retrieval contents A to D form one retrieval cluster and the retrieval content E forms a retrieval cluster of its own. Then, determining N contents to be recommended which meet the recommendation condition based on the plurality of distances may include:
determining a first retrieval cluster closest to the source target content S from the plurality of retrieval clusters, namely the first retrieval cluster containing retrieval contents A-D in FIG. 2;
since the number of retrieval contents included in the first retrieval cluster (4) is greater than the set number of contents to be recommended (3), the 3 retrieval contents closest to the source target content S are determined from the first retrieval cluster containing retrieval contents A to D; these are assumed to be retrieval contents D, C and A.
With the method of the embodiment of the present application, the 3 retrieval contents closest to the source target content S are selected in units of clusters, so that retrieval contents D, C and A are selected preferentially. This avoids the situation that arises when the 3 closest retrieval contents are selected in units of individual retrieval contents (namely retrieval contents D, C and E in fig. 2), in which the degree of association among the selected contents is low, that is, the content difference is large; for example, retrieval content E differs considerably from retrieval contents D and C.
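The contrast drawn above between cluster-unit selection and content-unit selection can be illustrated with a minimal sketch. The numeric distances below are hypothetical values chosen only so that the outcome matches the example of fig. 2; the specification does not give actual distances, and the names `distances`, `clusters` and `cluster_distance` are assumptions for illustration.

```python
# Hypothetical distances from the source target content S to each
# retrieval content (smaller means closer); not taken from the patent.
distances = {"A": 0.30, "B": 0.50, "C": 0.20, "D": 0.10, "E": 0.25}

# Clustering result assumed in fig. 2: {A, B, C, D} and {E}.
clusters = [["A", "B", "C", "D"], ["E"]]
n = 3  # number of contents to be recommended


def cluster_distance(cluster):
    """Distance from S to a cluster: the minimum distance to any member."""
    return min(distances[c] for c in cluster)


# Cluster-unit selection: take the nearest cluster, then its n closest members.
first_cluster = min(clusters, key=cluster_distance)
by_cluster = sorted(first_cluster, key=distances.get)[:n]

# Content-unit selection: take the n closest contents regardless of cluster.
by_content = sorted(distances, key=distances.get)[:n]

print(by_cluster)  # ['D', 'C', 'A']
print(by_content)  # ['D', 'C', 'E']
```

Under these assumed distances, content-unit selection admits retrieval content E, which lies in a different cluster, whereas cluster-unit selection keeps all three results within the first retrieval cluster.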
Optionally, if the scene to be recommended mainly recommends to users contents that differ from one another, then, to ensure diversity among the determined recommended contents, only one retrieval content may be selected from each cluster, taking the cluster as the unit, as a content to be recommended to the user. Specifically, the recommendation condition is that the degree of association between recommended contents is smaller than the second threshold, and determining the N contents to be recommended that satisfy the recommendation condition based on the plurality of distances comprises:
selecting the contents in the H third retrieval clusters based on a third selection strategy to obtain N contents to be recommended, wherein H is greater than 1, H is less than W, and H is less than or equal to N;
the H third retrieval clusters are H retrieval clusters with the smallest distance in the plurality of distances, or the distance between the third retrieval cluster and the source target content is smaller than a third threshold value;
the N contents to be recommended comprise at least one retrieval content in each third retrieval cluster.
When the scene to be recommended mainly recommends to users contents that differ from one another, there are two cases for the number W of retrieval clusters in the content retrieval library: in the first case, W is greater than or equal to n; in the second case, W is smaller than n. In the first case, the third selection policy may be: selecting one retrieval content in each third retrieval cluster, the selected retrieval content being the one with the smallest distance to the source target content. In the second case, the third selection policy may be: selecting n-H-1 retrieval contents in a fourth retrieval cluster, and selecting one retrieval content in each of the third retrieval clusters other than the fourth retrieval cluster, wherein the fourth retrieval cluster is the third retrieval cluster with the smallest distance to the source target content.
Specifically, the third selection policy is: selecting one retrieval content in each third retrieval cluster, wherein the distance between the selected retrieval content and the source target content is minimum;
or, the third selection policy is: if the H is smaller than the n, selecting n-H-1 retrieval contents in a fourth retrieval cluster, and selecting one retrieval content in other third retrieval clusters except the fourth retrieval cluster, wherein the fourth retrieval cluster is the third retrieval cluster with the minimum distance from the source target content.
Fig. 3 is an exemplary diagram of a second practical application scenario of the recommended content determining method according to an embodiment of the present application. In this scenario, the scene to be recommended is assumed to mainly recommend to users contents that differ from one another, and the number n of contents to be recommended is set to 3. In fig. 3, the source target content S is a question that a user puts to the intelligent customer service: "how do I want to offer the amount to do?". The content retrieval library contains retrieval content F, "cloud flash payment quotation?"; retrieval content G, "what is the quota for opening the cloud flash payment?"; retrieval content H, "how to submit the cloud payment"; retrieval content I, "how to inquire about the cloud payment quotation?"; and retrieval content J, "how does ease flower raise?". The content similarity between every pair of the retrieval contents F to J may be determined in advance by a retrieval model, and the retrieval contents F to J are clustered according to these pairwise similarities. Assume the clustering result is as shown in fig. 3, where retrieval contents F, H and G form one retrieval cluster, retrieval content I forms a separate retrieval cluster, and retrieval content J forms a separate retrieval cluster. Then, determining the N contents to be recommended that satisfy the recommendation condition based on the plurality of distances may include:
determining the first 3 search clusters closest to the source target content S from the plurality of search clusters, namely, the search cluster containing the search contents F, H and G, the search cluster containing the search content I and the search cluster containing the search content J in fig. 3;
then, from each of these three retrieval clusters, the retrieval content closest to the source target content S is determined, namely retrieval contents H, I and J shown in fig. 3.
With the method of the embodiment of the present application, one retrieval content is selected, in units of clusters, from each of the 3 retrieval clusters closest to the source target content S, so that retrieval contents H, I and J are selected preferentially. This avoids the situation in which the 3 closest retrieval contents selected in units of individual retrieval contents (namely retrieval contents F, H and G in fig. 3) are highly associated with one another and differ little, and thus ensures diversity among the contents to be recommended determined for the user in the scene to be recommended.
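The diversity-oriented selection of fig. 3 can be sketched in the same way. Again, the distances are hypothetical values picked only to reproduce the outcome described above, and the helper name `nearest_member` is an assumption; the sketch covers only the policy of selecting one retrieval content per cluster.

```python
# Hypothetical distances from S to each retrieval content (smaller is closer).
distances = {"F": 0.15, "G": 0.20, "H": 0.10, "I": 0.30, "J": 0.60}

# Clustering result assumed in fig. 3: {F, H, G}, {I} and {J}.
clusters = [["F", "H", "G"], ["I"], ["J"]]
n = 3  # number of contents to be recommended


def nearest_member(cluster):
    """Retrieval content in the cluster with the smallest distance to S."""
    return min(cluster, key=distances.get)


# Rank clusters by their distance to S (minimum member distance),
# keep the n nearest clusters, and take one content from each.
ranked = sorted(clusters, key=lambda c: distances[nearest_member(c)])
diverse = [nearest_member(c) for c in ranked[:n]]

# For comparison: the n closest contents chosen without regard to clusters.
by_content = sorted(distances, key=distances.get)[:n]

print(diverse)     # ['H', 'I', 'J']
print(by_content)  # ['H', 'F', 'G']
```

With these assumed values, content-unit selection returns three contents from the same cluster, while cluster-unit selection returns one content from each of the three clusters.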
In the embodiment of the present application, the retrieval contents in the content retrieval library are clustered in advance, so that retrieval contents in the same cluster are correlated with one another and retrieval contents in different clusters differ from one another. When recommended contents are determined, the distance between the source target content of the scene to be recommended and each cluster in the content retrieval library is determined first, and the recommended contents satisfying the recommendation condition are then determined based on these distances. Because most of the determined recommended contents are either correlated with one another or different from one another, the practical requirement that some scenes place on the correlation among recommended contents is met. In addition, because the recommended contents are determined based on the distance between the source target content and each cluster, the accuracy of the recommended contents is improved.
In order to solve the problems in the prior art, as shown in fig. 4, an embodiment of the present application further provides a recommended content determining apparatus 400, including:
an obtaining module 401, configured to obtain target information, where the target information includes source target content of a scene to be recommended, a number n of contents to be recommended, and a recommendation condition, where n is an integer greater than 1;
a distance determining module 402, configured to determine distances between the source target content and W search clusters in a content search library, to obtain a plurality of distances, where W is an integer greater than 1; each retrieval cluster comprises one or more retrieval contents, the association degree between the retrieval contents in the same retrieval cluster is greater than or equal to a first threshold, the association degree between the retrieval contents in different retrieval clusters is smaller than a second threshold, and the second threshold is smaller than the first threshold;
a content determining module 403, configured to determine, based on the plurality of distances, N contents to be recommended that satisfy the recommendation condition, where N is less than or equal to n;
among the N contents to be recommended, the degree of association among M contents to be recommended is greater than or equal to the first threshold; or, among the N contents to be recommended, the degree of association among K contents to be recommended is smaller than the second threshold; M and K are both integers greater than N/2, and M and K are both less than or equal to N.
With the recommended content determining apparatus provided by the embodiment of the present application, the retrieval contents in the content retrieval library are clustered in advance, so that retrieval contents in the same cluster are correlated with one another and retrieval contents in different clusters differ from one another. When recommended contents are determined, the distance between the source target content of the scene to be recommended and each cluster in the content retrieval library is determined first, and the recommended contents satisfying the recommendation condition are then determined based on these distances. Because most of the determined recommended contents are either correlated with one another or different from one another, the practical requirement that some scenes place on the correlation among recommended contents is met. In addition, because the recommended contents are determined based on the distance between the source target content and each cluster, the accuracy of the recommended contents is improved.
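The majority property stated for the N contents to be recommended (M or K greater than N/2) can be checked with a short sketch. It relies on a simplifying assumption that follows from the cluster definition: contents in the same retrieval cluster have a degree of association at or above the first threshold, and contents in different clusters have one below the second threshold, so cluster labels stand in for pairwise association. The function name `majority_property` and the mapping `cluster_of` are hypothetical.

```python
from collections import Counter


def majority_property(selected, cluster_of):
    """Report whether a selection is mostly related or mostly diverse.

    M is taken as the size of the largest group of selected contents that
    share one cluster; K is taken as the number of distinct clusters
    represented (one content per cluster is pairwise dissimilar).
    """
    labels = [cluster_of[item] for item in selected]
    total = len(labels)                      # N in the specification
    m = max(Counter(labels).values())        # largest same-cluster group
    k = len(set(labels))                     # number of distinct clusters
    return {"related": m > total / 2, "diverse": k > total / 2}


# Cluster labels for the examples of fig. 2 and fig. 3.
fig2 = {"A": 0, "B": 0, "C": 0, "D": 0, "E": 1}
fig3 = {"F": 0, "G": 0, "H": 0, "I": 1, "J": 2}

print(majority_property(["D", "C", "A"], fig2))  # {'related': True, 'diverse': False}
print(majority_property(["H", "I", "J"], fig3))  # {'related': False, 'diverse': True}
```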
Optionally, in an embodiment, the recommendation condition is that the degree of association between the recommended contents is greater than or equal to the first threshold; the content determining module 403, when determining, based on the distances, N to-be-recommended contents that satisfy the recommendation condition, specifically performs:
if the number of the retrieval contents included in the first retrieval cluster is larger than N, selecting the contents in the first retrieval cluster based on a first selection strategy to obtain the N contents to be recommended;
wherein the first search cluster is a search cluster corresponding to the minimum distance in the plurality of distances;
the first selection policy is: selecting the N retrieval contents in the first retrieval cluster that are closest to the source target content.
Optionally, in an implementation manner, when the content determining module 403 determines, based on the plurality of distances, N to-be-recommended contents that satisfy the recommendation condition, further specifically performs:
if the number of the retrieval contents included in the first retrieval cluster is smaller than N, and the total number of the retrieval contents included in the first retrieval cluster and the retrieval contents included in the second retrieval cluster is larger than N, selecting the contents in the first retrieval cluster and the second retrieval cluster based on a second selection strategy to obtain the N contents to be recommended;
the second retrieval cluster is the retrieval cluster corresponding to the smallest of the plurality of distances other than the minimum distance, that is, the second-smallest distance;
the second selection policy is: selecting all retrieval contents in the first retrieval cluster, and selecting the first L retrieval contents in the second retrieval cluster which are closest to the source target content, wherein the sum of L and the number of the retrieval contents included in the first retrieval cluster is equal to N.
Optionally, in an embodiment, when the content determining module 403 determines, based on the multiple distances, N to-be-recommended contents that satisfy the recommendation condition, further specifically performs:
if the number of the retrieval contents included in the first retrieval cluster is smaller than N, and the total number of the retrieval contents included in the first retrieval cluster and the retrieval contents included in the second retrieval cluster is smaller than N, determining all the retrieval contents included in the first retrieval cluster and the second retrieval cluster as the N contents to be recommended.
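The three cases handled by the content determining module 403 for the relevance-oriented recommendation condition (first selection policy, second selection policy, and taking all contents of both clusters) can be combined in one minimal sketch. The function name `select_related` and the toy data are assumptions. The text leaves the boundary cases of exact equality unspecified; the sketch simply takes as many contents as are available up to n, which is one possible reading.

```python
def select_related(distances, clusters, n):
    """Pick up to n mutually related contents from the nearest cluster(s).

    distances: mapping from retrieval content to its distance to the source.
    clusters:  list of retrieval clusters (lists of retrieval contents).
    """
    # Rank clusters by their distance to the source (minimum member distance).
    ranked = sorted(clusters, key=lambda c: min(distances[x] for x in c))

    first = sorted(ranked[0], key=distances.get)
    if len(first) >= n:
        # First selection policy: the n closest contents of the first cluster.
        return first[:n]

    # Second selection policy: all of the first cluster plus the L closest
    # contents of the second cluster, with L = n - len(first). If the two
    # clusters together hold fewer than n contents, all of them are returned.
    second = sorted(ranked[1], key=distances.get) if len(ranked) > 1 else []
    return first + second[: n - len(first)]


# Hypothetical data: three clusters with distances to the source.
distances = {"a": 0.1, "b": 0.2, "c": 0.4, "d": 0.5, "e": 0.9}
clusters = [["a", "b"], ["c", "d"], ["e"]]

print(select_related(distances, clusters, 1))  # ['a']
print(select_related(distances, clusters, 3))  # ['a', 'b', 'c']
print(select_related(distances, clusters, 5))  # ['a', 'b', 'c', 'd']
```

In the last call only four contents are returned although n is 5, which corresponds to the case where N is less than n.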
Optionally, in an embodiment, the recommendation condition is that the degree of association between the recommended contents is less than the second threshold; when the content determining module 403 determines, based on the distances, N to-be-recommended contents that satisfy the recommendation condition, further specifically perform:
selecting retrieval contents in H third retrieval clusters based on a third selection strategy to obtain the N contents to be recommended, wherein H is greater than 1, H is less than W, and H is less than or equal to N;
wherein the H third search clusters are H search clusters with the smallest distance among the plurality of distances, or the distance between the third search cluster and the source target content is smaller than a third threshold;
the N contents to be recommended comprise at least one retrieval content in each third retrieval cluster.
Optionally, in an embodiment, the third selection policy is: selecting one retrieval content in each third retrieval cluster, wherein the distance between the selected retrieval content and the source target content is minimum;
or, the third selection policy is: if the H is smaller than the n, selecting n-H-1 retrieval contents in a fourth retrieval cluster, and selecting one retrieval content in other third retrieval clusters except the fourth retrieval cluster, wherein the fourth retrieval cluster is the third retrieval cluster with the minimum distance from the source target content.
Optionally, in an embodiment, when determining the distance between the source target content and the W search clusters in the content search library, the distance determining module 402 specifically performs:
determining a distance between the source target content and each retrieved content in the retrieval cluster;
and determining the determined minimum distance as the distance between the source target content and the retrieval cluster.
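The rule applied by the distance determining module 402 (the distance to a retrieval cluster is the minimum distance to any of its retrieval contents) can be sketched as follows. The representation of contents as 2-D points and the use of Euclidean distance are assumptions for illustration only; the specification does not fix the distance measure here.

```python
import math
from typing import Callable, Sequence, Tuple

Point = Tuple[float, float]


def cluster_distance(
    source: Point,
    cluster: Sequence[Point],
    dist: Callable[[Point, Point], float] = math.dist,
) -> float:
    """Distance from the source to a cluster: minimum over its members."""
    return min(dist(source, item) for item in cluster)


# Hypothetical embeddings: the source at the origin and two clusters.
source = (0.0, 0.0)
clusters = [
    [(3.0, 4.0), (1.0, 0.0)],  # member distances 5.0 and 1.0
    [(0.0, 2.0)],              # member distance 2.0
]

cluster_dists = [cluster_distance(source, c) for c in clusters]
print(cluster_dists)  # [1.0, 2.0]
```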
The modules in the device provided by the embodiment of the present application may also implement the method steps provided by the above method embodiment. Alternatively, the apparatus provided in the embodiment of the present application may further include other modules besides the modules described above, so as to implement the method steps provided in the foregoing method embodiment. The device provided by the embodiment of the application can achieve the technical effects achieved by the method embodiment.
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present specification. Referring to fig. 5, at the hardware level, the electronic device includes a processor, and optionally further includes an internal bus, a network interface, and a memory. The memory may include an internal memory, such as a random-access memory (RAM), and may further include a non-volatile memory, such as at least one disk memory. Of course, the electronic device may also include hardware required for other services.
The processor, the network interface, and the memory may be connected to each other via an internal bus, which may be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one double-headed arrow is shown in FIG. 5, but this does not indicate only one bus or one type of bus.
The memory is used for storing a program. Specifically, the program may include program code, and the program code includes computer operating instructions. The memory may include an internal memory and a non-volatile memory, and provides instructions and data to the processor.
The processor reads the corresponding computer program from the non-volatile memory into the internal memory and then runs it, forming the recommended content determining apparatus at the logical level. The processor executes the program stored in the memory, and is specifically configured to perform the following operations:
acquiring target information, wherein the target information comprises source target content of a scene to be recommended, the number n of the content to be recommended and a recommendation condition, and n is an integer greater than 1;
determining distances between the source target content and W retrieval clusters in a content retrieval library to obtain a plurality of distances, wherein W is an integer larger than 1; each retrieval cluster comprises one or more retrieval contents, the association degree between the retrieval contents in the same retrieval cluster is greater than or equal to a first threshold, the association degree between the retrieval contents in different retrieval clusters is smaller than a second threshold, and the second threshold is smaller than the first threshold;
determining, based on the plurality of distances, N contents to be recommended that satisfy the recommendation condition, wherein N is less than or equal to n;
among the N contents to be recommended, the degree of association among M contents to be recommended is greater than or equal to the first threshold; or, among the N contents to be recommended, the degree of association among K contents to be recommended is smaller than the second threshold; M and K are both integers greater than N/2, and M and K are both less than or equal to N.
In the electronic device provided by the embodiment of the present application, the retrieval contents in the content retrieval library are clustered in advance, so that retrieval contents in the same cluster are correlated with one another and retrieval contents in different clusters differ from one another. When recommended contents are determined, the distance between the source target content of the scene to be recommended and each cluster in the content retrieval library is determined first, and the recommended contents satisfying the recommendation condition are then determined based on these distances. Because most of the determined recommended contents are either correlated with one another or different from one another, the practical requirement that some scenes place on the correlation among recommended contents is met. In addition, because the recommended contents are determined based on the distance between the source target content and each cluster, the accuracy of the recommended contents is improved.
The method performed by the recommended content determining apparatus according to the embodiments shown in fig. 1 to 3 of the present specification may be applied to, or implemented by, a processor. The processor may be an integrated circuit chip having signal processing capability. In implementation, the steps of the above method may be completed by integrated logic circuits of hardware in the processor or by instructions in the form of software. The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The processor may implement or perform the methods, steps and logic blocks disclosed in the embodiments of the present specification. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present specification may be embodied directly as being performed by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM or EPROM, or registers. The storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the above method in combination with its hardware.
The electronic device may further execute the methods of fig. 1 to fig. 3 and implement the functions of the recommended content determining apparatus in the embodiments shown in fig. 1 to fig. 3, which are not described again here.
Embodiments of the present specification also propose a computer-readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by a portable electronic device comprising a plurality of application programs, are capable of causing the portable electronic device to perform the method of the embodiments shown in fig. 1-3, and in particular to perform the following:
acquiring target information, wherein the target information comprises source target content of a scene to be recommended, the number n of the contents to be recommended and a recommendation condition, and n is an integer greater than 1;
determining distances between the source target content and W retrieval clusters in a content retrieval library to obtain a plurality of distances, wherein W is an integer larger than 1; each retrieval cluster comprises one or more retrieval contents, the association degree between the retrieval contents in the same retrieval cluster is greater than or equal to a first threshold, the association degree between the retrieval contents in different retrieval clusters is smaller than a second threshold, and the second threshold is smaller than the first threshold;
determining, based on the plurality of distances, N contents to be recommended that satisfy the recommendation condition, wherein N is less than or equal to n;
among the N contents to be recommended, the degree of association among M contents to be recommended is greater than or equal to the first threshold; or, among the N contents to be recommended, the degree of association among K contents to be recommended is smaller than the second threshold; M and K are both integers greater than N/2, and M and K are both less than or equal to N.
With the computer-readable storage medium provided by the embodiment of the present application, the retrieval contents in the content retrieval library are clustered in advance, so that retrieval contents in the same cluster are correlated with one another and retrieval contents in different clusters differ from one another. When recommended contents are determined, the distance between the source target content of the scene to be recommended and each cluster in the content retrieval library is determined first, and the recommended contents satisfying the recommendation condition are then determined based on these distances. Because most of the determined recommended contents are either correlated with one another or different from one another, the practical requirement that some scenes place on the correlation among recommended contents is met. In addition, because the recommended contents are determined based on the distance between the source target content and each cluster, the accuracy of the recommended contents is improved.
Of course, besides the software implementation, the electronic device in this specification does not exclude other implementations, such as logic devices or a combination of software and hardware, and the like, that is, the execution subject of the following processing flow is not limited to each logic unit, and may also be hardware or logic devices.
The foregoing description of specific embodiments has been presented for purposes of illustration and description. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
In short, the above description is only a preferred embodiment of the present disclosure, and is not intended to limit the scope of the present disclosure. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present specification shall be included in the protection scope of the present specification.
The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. One typical implementation device is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media, such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
The embodiments in the present specification are described in a progressive manner; for the same or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, since the system embodiment is substantially similar to the method embodiment, its description is relatively brief, and for the relevant points reference may be made to the corresponding description of the method embodiment.

Claims (10)

1. A recommended content determining method, comprising:
acquiring target information, wherein the target information comprises source target content of a scene to be recommended, the number n of the content to be recommended and a recommendation condition, and n is an integer greater than 1;
determining distances between the source target content and W retrieval clusters in a content retrieval library to obtain a plurality of distances, wherein W is an integer larger than 1; each retrieval cluster comprises one or more retrieval contents, the association degree between the retrieval contents in the same retrieval cluster is greater than or equal to a first threshold, the association degree between the retrieval contents in different retrieval clusters is smaller than a second threshold, and the second threshold is smaller than the first threshold;
determining, based on the plurality of distances, N contents to be recommended that satisfy the recommendation condition, wherein N is less than or equal to n;
among the N contents to be recommended, the degree of association among M contents to be recommended is greater than or equal to the first threshold; or, among the N contents to be recommended, the degree of association among K contents to be recommended is smaller than the second threshold; M and K are both integers greater than N/2, and M and K are both less than or equal to N.
2. The method according to claim 1, wherein the recommendation condition is that a degree of association between recommended contents is greater than or equal to the first threshold; the determining N contents to be recommended which meet the recommendation condition based on the plurality of distances comprises:
if the number of the retrieval contents included in the first retrieval cluster is larger than N, selecting the contents in the first retrieval cluster based on a first selection strategy to obtain the N contents to be recommended;
wherein the first search cluster is a search cluster corresponding to the minimum distance in the plurality of distances;
the first selection policy is: selecting the N retrieval contents in the first retrieval cluster that are closest to the source target content.
3. The method of claim 2, further comprising:
if the number of the retrieval contents included in the first retrieval cluster is smaller than N, and the total number of the retrieval contents included in the first retrieval cluster and the retrieval contents included in the second retrieval cluster is larger than N, selecting the contents in the first retrieval cluster and the second retrieval cluster based on a second selection strategy to obtain the N contents to be recommended;
the second retrieval cluster is the retrieval cluster corresponding to the smallest of the plurality of distances other than the minimum distance, that is, the second-smallest distance;
the second selection policy is: selecting all the search contents in the first search cluster, and selecting the first L search contents in the second search cluster which are closest to the source target content, wherein the sum of L and the number of the search contents included in the first search cluster is equal to N.
4. The method of claim 3, further comprising:
if the number of the retrieval contents included in the first retrieval cluster is smaller than N, and the total number of the retrieval contents included in the first retrieval cluster and the retrieval contents included in the second retrieval cluster is smaller than N, determining all the retrieval contents included in the first retrieval cluster and the second retrieval cluster as the N contents to be recommended.
5. The method according to claim 1, wherein the recommendation condition is that a degree of association between recommended contents is smaller than the second threshold; the determining N contents to be recommended which meet the recommendation condition based on the plurality of distances comprises:
selecting retrieval contents from H third retrieval clusters based on a third selection policy to obtain the N contents to be recommended, wherein H is greater than 1, H is less than W, and H is less than or equal to N;
wherein the H third retrieval clusters are the H retrieval clusters with the smallest distances among the plurality of distances, or the distance between each third retrieval cluster and the source target content is smaller than a third threshold;
the N contents to be recommended comprise at least one retrieval content in each third retrieval cluster.
6. The method of claim 5, wherein the third selection policy is: selecting, in each third retrieval cluster, the one retrieval content with the minimum distance to the source target content;
or, the third selection policy is: if H is smaller than n, selecting n-H-1 retrieval contents in a fourth retrieval cluster, and selecting one retrieval content in each of the third retrieval clusters other than the fourth retrieval cluster, wherein the fourth retrieval cluster is the third retrieval cluster with the minimum distance from the source target content.
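The first alternative of the third selection policy (one item from each of the H nearest clusters) can be sketched as follows. This is an illustrative Python sketch only: `select_diverse`, the tuple representation of contents, and Euclidean distance are assumptions, and the second alternative involving the fourth retrieval cluster is not covered.

```python
import math
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def select_diverse(
    source: Vector, clusters: Sequence[Sequence[Vector]], h: int
) -> List[Vector]:
    """Return the closest item from each of the h clusters nearest to source."""

    def item_distance(item: Vector) -> float:
        # Assumption: Euclidean distance between content vectors.
        return math.dist(source, item)

    # The closest member of each cluster also defines that cluster's distance,
    # so ranking these representatives ranks the clusters themselves.
    representatives = [min(cluster, key=item_distance) for cluster in clusters]
    representatives.sort(key=item_distance)

    # Keep one representative per cluster for the h nearest clusters.
    return representatives[:h]
```

Since every returned item comes from a different cluster, the pairwise association degree between results stays below the second threshold by construction of the clusters.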
7. The method of any of claims 1-6, wherein the distance between the source target content and a retrieval cluster is determined by:
determining a distance between the source target content and each retrieval content in the retrieval cluster;
and taking the minimum of the determined distances as the distance between the source target content and the retrieval cluster.
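The cluster distance of claim 7 amounts to a minimum over member distances. A minimal Python sketch follows, assuming contents are represented as coordinate tuples and compared by Euclidean distance; neither assumption comes from the claims, and `cluster_distance` is a hypothetical name.

```python
import math
from typing import Sequence, Tuple

Vector = Tuple[float, ...]


def cluster_distance(source: Vector, cluster: Sequence[Vector]) -> float:
    """Distance from source to a cluster: the smallest member distance."""
    # Assumption: Euclidean distance; any other metric could be substituted.
    return min(math.dist(source, item) for item in cluster)
```

For example, `cluster_distance((0.0, 0.0), [(3.0, 4.0), (1.0, 0.0)])` evaluates the two member distances 5.0 and 1.0 and returns the smaller one.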
8. A recommended content determining apparatus, comprising:
an acquisition module, configured to acquire target information, wherein the target information comprises source target content of a scene to be recommended, the number n of contents to be recommended, and a recommendation condition, n being an integer greater than 1;
a distance determining module, configured to determine the distances between the source target content and W retrieval clusters in a content retrieval library to obtain a plurality of distances, wherein W is an integer greater than 1; each retrieval cluster comprises one or more retrieval contents, the association degree between retrieval contents in the same retrieval cluster is greater than or equal to a first threshold, the association degree between retrieval contents in different retrieval clusters is smaller than a second threshold, and the second threshold is smaller than the first threshold;
a content determination module, configured to determine N contents to be recommended that satisfy the recommendation condition based on the plurality of distances, wherein N is less than or equal to n;
wherein the association degree among M of the N contents to be recommended is greater than or equal to the first threshold; or the association degree among K of the N contents to be recommended is smaller than the second threshold; M and K are both integers greater than N/2, and both M and K are less than or equal to N.
9. An electronic device, comprising: a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the recommended content determining method according to any one of claims 1 to 7.
10. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program which, when executed by a processor, implements the steps of the recommended content determining method according to any one of claims 1 to 7.
CN202210546185.0A 2022-05-19 2022-05-19 Recommended content determining method and device and electronic equipment Pending CN114840762A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210546185.0A CN114840762A (en) 2022-05-19 2022-05-19 Recommended content determining method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210546185.0A CN114840762A (en) 2022-05-19 2022-05-19 Recommended content determining method and device and electronic equipment

Publications (1)

Publication Number Publication Date
CN114840762A true CN114840762A (en) 2022-08-02

Family

ID=82569462

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210546185.0A Pending CN114840762A (en) 2022-05-19 2022-05-19 Recommended content determining method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN114840762A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP7371844B1 (en) 2023-03-02 2023-10-31 17Live株式会社 Systems, methods, and computer-readable media for recommendations

Similar Documents

Publication Publication Date Title
CN108763952B (en) Data classification method and device and electronic equipment
CN108829808B (en) Page personalized sorting method and device and electronic equipment
CN110321537B (en) Method and device for generating file
WO2019169978A1 (en) Resource recommendation method and device
CN111242217A (en) Training method and device of image recognition model, electronic equipment and storage medium
CN115862088A (en) Identity recognition method and device
CN113688313A (en) Training method of prediction model, information pushing method and device
CN110599307A (en) Commodity recommendation method and device
CN114817538B (en) Training method of text classification model, text classification method and related equipment
CN116108150A (en) Intelligent question-answering method, device, system and electronic equipment
CN113536003A (en) Feature extraction model training method, image retrieval method, device and equipment
CN112733024A (en) Information recommendation method and device
CN110704423B (en) Excitation information acquisition method and device, storage medium and electronic equipment
CN114840762A (en) Recommended content determining method and device and electronic equipment
CN115129791A (en) Data compression storage method, device and equipment
CN110334936B (en) Method, device and equipment for constructing credit qualification scoring model
CN109063967B (en) Processing method and device for wind control scene feature tensor and electronic equipment
CN108830298B (en) Method and device for determining user feature tag
CN117195046A (en) Abnormal text recognition method and related equipment
CN110866085A (en) Data feedback method and device
CN111461892B (en) Method and device for selecting derived variables of risk identification model
CN111144098B (en) Recall method and device for extended question
CN111311372A (en) User identification method and device
CN110059272B (en) Page feature recognition method and device
CN113343069A (en) User information processing method, device, medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination