WO2022213662A1 - Application recommendation method, system, terminal and storage medium - Google Patents

Application recommendation method, system, terminal and storage medium

Info

Publication number
WO2022213662A1
Authority
WO
WIPO (PCT)
Prior art keywords
user
application
algorithm
recommendation
matrix decomposition
Application number
PCT/CN2021/138552
Other languages
English (en)
French (fr)
Inventor
王洋
吴嘉澍
须成忠
Original Assignee
深圳先进技术研究院
Application filed by 深圳先进技术研究院
Publication of WO2022213662A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00 Computing arrangements based on specific mathematical models
    • G06N7/01 Probabilistic graphical models, e.g. probabilistic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0631 Item recommendations

Definitions

  • The present invention relates to an application recommendation method, system, terminal and storage medium.
  • Recommendation systems have become one of the effective tools for addressing information overload, and they are used ever more widely.
  • A recommendation system can deliver personalized recommendations of the information a user is interested in, according to the user's information needs, interests and hobbies.
  • The purpose of an application recommendation system is to recommend applications that users may be interested in, so that users can make better use of smart devices such as mobile phones.
  • An embodiment of the present application provides an application recommendation method comprising the following steps: a. using a matrix decomposition algorithm to decompose a user feature matrix containing user features, to obtain dimensionality-reduced and denoised user features; b. clustering the user features obtained from the matrix decomposition with a hierarchical clustering algorithm, so that users with similar user features are clustered into the same user class, and using the mean vector of each user class as the user feature of that class; c. using a Bayesian model to predict the recommendation probability of each application from the decomposed and clustered user features, the features of the user's situation and the application features, and generating an application recommendation list sorted by predicted probability in descending order.
  • The method further includes the step of: d. calculating the recall rate on the application recommendation data set, and using a Bayesian optimization grid search algorithm to automatically tune the parameters required by the hierarchical clustering algorithm so as to optimize the recall rate.
  • The user features include age, gender, location and interests; the matrix decomposition algorithm includes the latent semantic analysis matrix decomposition algorithm, the singular value decomposition algorithm, the non-negative matrix factorization algorithm and the neural factorization machine algorithm.
  • The hierarchical clustering algorithm specifically includes the following steps:
  • first, each user is initialized as a leaf node, that is, each user forms its own group;
  • then the Euclidean distance between the mean vectors of the groups is calculated, and the user groups with the smallest Euclidean distance are merged, until the distance between the mean vectors of any two user groups produced by the clustering is at least the stop threshold.
  • Using the mean vector of each user class as the user feature of that class specifically includes:
  • the mean vector of each group is used to represent the features of the users in that group.
  • The mean vector of the c-th user class, denoted U_c, is the average of the user feature vectors of all users assigned to that class.
  • Step c specifically includes:
  • M' is the user feature vector after the matrix decomposition and clustering operations;
  • A is the feature of the recommended application, such as the application category and the application rating;
  • S is the feature of the user's situation, including the type of application the user last browsed, the user's clicks on applications, and the length of time the user stays in a certain type of application;
  • the probabilities of the different applications being recommended are obtained and sorted in descending order to find the top K applications with the highest recommendation probability, which form the application recommendation list.
  • An embodiment of the present application provides an application recommendation system comprising a decomposition module, a clustering module and a recommendation module, wherein: the decomposition module is used to decompose a user feature matrix containing user features with a matrix decomposition algorithm, to obtain dimensionality-reduced and denoised user features; the clustering module is used to cluster the user features obtained from the matrix decomposition with a hierarchical clustering algorithm, so that users with similar user features are clustered into the same user class, the mean vector of each user class being used as the user feature of that class; the recommendation module is used to predict the recommendation probability of each application with a Bayesian model from the decomposed and clustered user features, the features of the user's situation and the application features, and to generate an application recommendation list sorted by predicted probability in descending order.
  • The system also includes:
  • an optimization module, used to calculate the recall rate on the application recommendation data set and to automatically tune the parameters required by the hierarchical clustering algorithm with a Bayesian optimization grid search algorithm, so as to optimize the recall rate.
  • An embodiment of the present application provides a terminal, where the terminal includes a processor and a memory coupled to the processor, wherein,
  • the memory stores program instructions for implementing the application recommendation method
  • the processor is configured to execute the program instructions stored in the memory for application recommendation.
  • An embodiment of the present application provides a storage medium storing program instructions executable by a processor, where the program instructions are used to execute the application recommendation method.
  • This application provides an application recommendation method and system.
  • Matrix decomposition not only removes redundant features and features with weak representation ability from the data, which improves recommendation quality, but also reduces the dimensionality of the user features, which makes the computation more efficient.
  • Hierarchical clustering is performed on the decomposed user features, and the Bayesian optimization grid search algorithm automatically tunes the hierarchical clustering parameters, so no parameters need to be set manually.
  • The present application uses the mean vector of the features of each class of users as the feature representation of the users in that class.
  • The present application combines the decomposed and clustered user features, the features of the user's situation and the application features, predicts the probability of each application being recommended with the Bayesian model, and sorts the applications in descending order of probability to generate the application recommendation list for the user, so that application software the user may be interested in is recommended more effectively and efficiently.
  • FIG. 1 is a flowchart of the application recommendation method according to an embodiment of the present application.
  • FIG. 2 is a schematic diagram of the user feature matrix M according to an embodiment of the present application.
  • FIG. 3 is a schematic diagram of the hierarchical clustering algorithm and its stop threshold according to an embodiment of the present application.
  • FIG. 4 is a hardware architecture diagram of an application recommendation system according to an embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of a terminal according to an embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of a storage medium according to an embodiment of the present application.
  • Referring to FIG. 1, it is a flowchart of a preferred embodiment of the application recommendation method of the present invention.
  • Step S1: a matrix decomposition algorithm is used to decompose a user feature matrix containing user features, to obtain dimensionality-reduced and denoised user features.
  • The user features include age, gender, location, interests, etc.
  • The matrix decomposition algorithm includes the latent semantic analysis (LSA) matrix decomposition algorithm, the singular value decomposition (SVD) algorithm, the non-negative matrix factorization (NMF) algorithm, the neural factorization machine (NFM) algorithm, etc.
  • The LSA matrix decomposition algorithm is used to decompose the user feature matrix containing the user features (age, gender, location, interests, etc.).
  • In the user feature matrix M, each user is represented by N dimensions and there are T users in total, so M is a matrix with T rows and N columns, as shown in FIG. 2.
  • The LSA matrix decomposition algorithm decomposes the user feature matrix M into the matrix product of the user-latent space matrix U, the latent space transition matrix Σ, and the transpose of the latent space-feature matrix V, as shown in the following formula:
  • $M = U \Sigma V^{T}$
  • Matrix decomposition not only removes redundant features and features with weak representation ability from the data, which improves the quality of the recommendations, but also reduces the dimensionality of the user features, which makes the computation more efficient. Compared with an algorithm without matrix decomposition, the user feature matrix is therefore more compact and the computation is more efficient.
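The decomposition step can be illustrated with a truncated singular value decomposition. The following is a minimal sketch, not the implementation of the filing: the function name `reduce_user_features`, the target dimension `k` and the toy matrix are assumptions made for illustration, and the numeric feature encoding is hypothetical.

```python
import numpy as np

def reduce_user_features(M, k):
    """Return a k-dimensional representation of each user (row of M) via truncated SVD."""
    # Thin SVD: M = U @ diag(s) @ Vt, with singular values in descending order.
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    # Keep only the k largest singular values; the remaining ones are
    # treated as redundancy or noise and dropped.
    return U[:, :k] * s[:k]

# Hypothetical user feature matrix: T = 4 users (rows), N = 3 features (columns).
M = np.array([
    [25.0, 1.0, 0.9],
    [27.0, 1.0, 0.8],
    [52.0, 0.0, 0.1],
    [49.0, 0.0, 0.2],
])
reduced = reduce_user_features(M, k=2)
print(reduced.shape)
```

Each row of `reduced` is the denoised, lower-dimensional feature vector of one user, which is the kind of input the clustering step consumes.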
  • Step S2: the hierarchical clustering algorithm is used to cluster the user features obtained from the matrix decomposition, so that users with similar user features are clustered into the same user class.
  • After the hierarchical clustering, the mean vector of each user class is used as the user feature of that class. Specifically:
  • The user features after matrix decomposition are clustered by the hierarchical clustering algorithm, so that users are clustered into user groups with similar features and similar interests.
  • The bottom-up hierarchical clustering algorithm is shown in FIG. 3:
  • First, each user is initialized as a leaf node, that is, each user forms its own group.
  • Then the similarity between the mean vectors of the groups, such as the Euclidean distance, is calculated, and the user groups with the smallest Euclidean distance are merged, so that the minimum distance between the mean vectors of the user groups produced by the clustering is the stop threshold.
  • The hierarchical clustering algorithm keeps merging iteratively until the termination condition is reached, that is, until the distance between the mean vectors of every pair of groups is greater than or equal to the distance threshold.
  • In this example, the hierarchical clustering algorithm stops after one iteration, because the minimum distance between the mean vectors of the groups at that point is 1.26, which already satisfies the stop threshold condition. When the algorithm stops, users U1 and U2 form one group, users U5 and U6 form another group, and users U3 and U4 each form their own group. Satisfying the stop condition means that the user groups are sufficiently distinct from one another, which prevents users from being ambiguously assigned to groups they do not belong to.
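The bottom-up procedure with a distance stop threshold can be sketched as follows. This is an illustrative sketch under assumptions: the function name `cluster_users`, the toy coordinates and the threshold value are hypothetical, and the sketch merges one pair of groups per loop pass, whereas an iteration in the described method may merge several pairs.

```python
import numpy as np

def cluster_users(features, threshold):
    """Bottom-up clustering that stops once all group mean vectors are >= threshold apart."""
    # Every user starts as its own group (a leaf node).
    groups = [[i] for i in range(len(features))]
    while len(groups) > 1:
        means = np.array([features[g].mean(axis=0) for g in groups])
        # Pairwise Euclidean distances between the group mean vectors.
        dists = np.linalg.norm(means[:, None, :] - means[None, :, :], axis=-1)
        np.fill_diagonal(dists, np.inf)
        a, b = np.unravel_index(np.argmin(dists), dists.shape)
        if dists[a, b] >= threshold:
            break  # termination: every pair of groups is at least `threshold` apart
        a, b = min(a, b), max(a, b)
        groups[a] = groups[a] + groups[b]  # merge the two closest groups
        del groups[b]
    return groups

# Hypothetical reduced features for six users (not the values from the filing).
features = np.array([
    [0.0, 0.0], [0.5, 0.0],   # two users close together
    [3.0, 3.0], [6.0, 0.0],   # two isolated users
    [9.0, 9.0], [9.0, 9.5],   # two users close together
])
groups = cluster_users(features, threshold=1.0)
print(groups)
```

With these toy values the first two users and the last two users are merged, while the two middle users stay in groups of their own, mirroring the shape of the example in the text.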
  • The user feature representation of the c-th user class, U_c, is the average of the user feature vectors of all users assigned to that class.
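Written out, the class mean described above takes the following form. The symbols G_c (the set of users assigned to class c) and m_i (the decomposed feature vector of user i) are notation introduced here for illustration and may differ from the notation of the filing.

```latex
U_c = \frac{1}{|G_c|} \sum_{i \in G_c} m_i
```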
  • Step S3: the user features after matrix decomposition and clustering, the features of the user's situation and the application features are combined; the Bayesian model is used to predict the probability of each application being recommended, and the applications are arranged in descending order of predicted probability to generate an application recommendation list for users with such user features and situation features.
  • The features of the user's situation include the type of application the user last browsed, the length of time the user stayed in a certain type of application, and other features.
  • The application features include the application type, the application rating, etc. For example, a certain application (ID #000) belongs to category #8, real-time news applications, and has a news attribute index of 0.95, a technology attribute index of 0.76, a sports attribute index of 0.66, and so on.
  • These attributes are used as the vector representing the application's features. Specifically:
  • When using data features for analysis and application recommendation, the present application is no longer limited to the user's own features such as age, gender and hobbies.
  • The user's context information better assists in analyzing the user's preference features, thereby improving recommendation accuracy.
  • The present application combines the user features after matrix decomposition and clustering, the features of the user's situation (see Table 2) and the application features, uses the Bayesian model to predict the probability of each application being recommended, and arranges the applications in descending order of predicted probability to generate an application recommendation list for users with such user features and situation features.
  • M' is the user feature vector after the matrix decomposition and clustering operations;
  • A is the feature of the recommended application, such as the application category and the application rating;
  • S is the feature of the user's situation, such as the type of application the user last browsed and the length of time the user stayed in a certain type of application.
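With the symbols M', A and S as defined above, a Bayesian posterior of this kind can be written with the standard Bayes rule. The form below is a sketch under that assumption; the exact conditioning and factorization used in the filing may differ.

```latex
P(A \mid M', S) = \frac{P(M', S \mid A)\, P(A)}{P(M', S)}
```

For a fixed user class and situation the denominator is the same for every application, so ranking applications by the numerator alone gives the same order.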
  • With the Bayesian model, the probability of each application being recommended can be calculated for given user features and situation features; the probabilities are then sorted in descending order to find the top K applications with the highest recommendation probability, that is, the K applications most worth recommending, which form the application recommendation list.
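The scoring and top-K selection can be sketched with a small count-based model. Everything in this sketch is an assumption made for illustration: the class name `NaiveBayesRecommender`, the discrete labels, the Laplace smoothing, and above all the naive conditional-independence assumption between user class and situation, which the text does not state.

```python
from collections import Counter, defaultdict

class NaiveBayesRecommender:
    """Count-based naive Bayes over (user class, situation) -> application."""

    def __init__(self, smoothing=1.0):
        self.smoothing = smoothing
        self.app_counts = Counter()                 # interactions per application
        self.class_counts = defaultdict(Counter)    # app -> user-class counts
        self.context_counts = defaultdict(Counter)  # app -> situation counts
        self.classes = set()
        self.contexts = set()

    def update(self, user_class, context, app):
        """Record one observed interaction; can be called whenever new feedback arrives."""
        self.app_counts[app] += 1
        self.class_counts[app][user_class] += 1
        self.context_counts[app][context] += 1
        self.classes.add(user_class)
        self.contexts.add(context)

    def _likelihood(self, counts, value, n_values, total):
        # Laplace-smoothed estimate of P(value | app).
        return (counts[value] + self.smoothing) / (total + self.smoothing * n_values)

    def top_k(self, user_class, context, k):
        """Return the k applications with the highest posterior, in descending order."""
        total = sum(self.app_counts.values())
        scores = {}
        for app, n in self.app_counts.items():
            prior = n / total
            scores[app] = (
                prior
                * self._likelihood(self.class_counts[app], user_class, len(self.classes), n)
                * self._likelihood(self.context_counts[app], context, len(self.contexts), n)
            )
        norm = sum(scores.values())  # normalise so the posteriors sum to 1
        ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
        return [(app, score / norm) for app, score in ranked[:k]]

# Hypothetical interaction log: (user class, last browsed app type, clicked app).
rec = NaiveBayesRecommender()
rec.update("class_0", "news", "news_app")
rec.update("class_0", "news", "news_app")
rec.update("class_0", "games", "game_app")
rec.update("class_1", "games", "game_app")
top = rec.top_k("class_0", "news", k=2)
print(top)
```

Because the model only stores counts, calling `update` with fresh feedback changes the probabilities returned by the next `top_k` call without any retraining step.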
  • The present application can handle the fact that user context information changes easily over time: the probabilities in the formula are updated from the real-time situation information fed back by the user, so that the application recommendations respond in real time.
  • The present application adopts the user's clicks on applications and the length of stay as features of the user's situation, which avoids the drawbacks that users may not rate or review applications and that rating scales differ widely between users.
  • The selected context information requires no special action from users, is easy to obtain, and has no problem of inconsistent scales.
  • Step S4: according to the recall rate of the recommendation system on the application recommendation data set, the Bayesian optimization grid search algorithm is used to automatically tune the parameters required by the hierarchical clustering algorithm, so as to optimize the recall rate of the recommendation results. The present application therefore does not need hierarchical clustering parameters to be specified manually, which is more efficient. Specifically:
  • The recall rate can be calculated on the application recommendation data set.
  • Based on the recall rate, the stopping distance threshold of the hierarchical clustering is tuned, thereby optimizing the recommendation performance of the present application.
  • The adopted Bayesian optimization grid search algorithm takes the effect of earlier parameter adjustments into account before each new adjustment, so that better-performing parameter settings are found more quickly.
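The recall computation and the threshold tuning loop can be sketched as follows. Note the substitution: this sketch uses a plain exhaustive grid search, not Bayesian optimization, which would instead pick each next candidate from the results of earlier trials. The names `recall_at_k`, `tune_threshold` and `toy_evaluate`, and the candidate values, are hypothetical.

```python
def recall_at_k(recommended, relevant):
    """Fraction of the relevant applications that appear in the recommended list."""
    if not relevant:
        return 0.0
    hits = len(set(recommended) & set(relevant))
    return hits / len(relevant)

def tune_threshold(candidates, evaluate):
    """Return the (threshold, recall) pair with the highest recall among the candidates."""
    best = None
    for threshold in candidates:
        recall = evaluate(threshold)
        if best is None or recall > best[1]:
            best = (threshold, recall)
    return best

# Hypothetical stand-in for the real evaluation, which would re-run clustering
# and recommendation with the given threshold and measure recall on the data set.
def toy_evaluate(threshold):
    return 1.0 - abs(threshold - 1.5) / 3.0

example_recall = recall_at_k(["a", "b", "c"], ["a", "c", "d"])
best_threshold, best_recall = tune_threshold([0.5, 1.0, 1.5, 2.0], toy_evaluate)
print(example_recall, best_threshold, best_recall)
```

In a real pipeline `evaluate` is the expensive part, which is why a search strategy that needs fewer evaluations than a full grid is attractive.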
  • Referring to FIG. 4, it is a hardware architecture diagram of the application recommendation system 10 of the present invention.
  • The system includes a decomposition module 101, a clustering module 102, a recommendation module 103 and an optimization module 104.
  • The decomposition module 101 is configured to decompose a user feature matrix containing user features with a matrix decomposition algorithm, to obtain dimensionality-reduced and denoised user features.
  • The user features include age, gender, location, interests, etc.
  • The matrix decomposition algorithm includes the latent semantic analysis (LSA) matrix decomposition algorithm, the singular value decomposition (SVD) algorithm, the non-negative matrix factorization (NMF) algorithm, the neural factorization machine (NFM) algorithm, etc.
  • The LSA matrix decomposition algorithm is used to decompose the user feature matrix containing the user features (age, gender, location, interests, etc.).
  • In the user feature matrix M, each user is represented by N dimensions and there are T users in total, so M is a matrix with T rows and N columns, as shown in FIG. 2.
  • The LSA matrix decomposition algorithm decomposes the user feature matrix M into the matrix product of the user-latent space matrix U, the latent space transition matrix Σ, and the transpose of the latent space-feature matrix V, as shown in the following formula:
  • $M = U \Sigma V^{T}$
  • Matrix decomposition not only removes redundant features and features with weak representation ability from the data, which improves the quality of the recommendations, but also reduces the dimensionality of the user features, which makes the computation more efficient. Compared with an algorithm without matrix decomposition, the user feature matrix is therefore more compact and the computation is more efficient.
  • The clustering module 102 is configured to cluster the user features obtained from the matrix decomposition with a hierarchical clustering algorithm, so that users with similar user features are clustered into the same user class. After the hierarchical clustering, the mean vector of each user class is used as the user feature of that class. Specifically:
  • The user features after matrix decomposition are clustered by the hierarchical clustering algorithm, so that users are clustered into user groups with similar features and similar interests.
  • The bottom-up hierarchical clustering algorithm is shown in FIG. 3:
  • First, each user is initialized as a leaf node, that is, each user forms its own group.
  • Then the similarity between the mean vectors of the groups, such as the Euclidean distance, is calculated, and the user groups with the smallest Euclidean distance are merged, so that the minimum distance between the mean vectors of the user groups produced by the clustering is the stop threshold.
  • The hierarchical clustering algorithm keeps merging iteratively until the termination condition is reached, that is, until the distance between the mean vectors of every pair of groups is greater than or equal to the distance threshold.
  • In this example, the hierarchical clustering algorithm stops after one iteration, because the minimum distance between the mean vectors of the groups at that point is 1.26, which already satisfies the stop threshold condition. When the algorithm stops, users U1 and U2 form one group, users U5 and U6 form another group, and users U3 and U4 each form their own group. Satisfying the stop condition means that the user groups are sufficiently distinct from one another, which prevents users from being ambiguously assigned to groups they do not belong to.
  • The user feature representation of the c-th user class, U_c, is the average of the user feature vectors of all users assigned to that class.
  • The recommendation module 103 is configured to combine the user features after matrix decomposition and clustering, the features of the user's situation and the application features, to predict the probability of each application being recommended with the Bayesian model, and to arrange the applications in descending order of predicted probability to generate an application recommendation list for users with such user features and situation features.
  • The features of the user's situation include the type of application the user last browsed, the length of time the user stayed in a certain type of application, and other features.
  • The application features include the application type, the application rating, etc. For example, a certain application (ID #000) belongs to category #8, real-time news applications, and has a news attribute index of 0.95, a technology attribute index of 0.76, a sports attribute index of 0.66, and so on.
  • These attributes are used as the vector representing the application's features. Specifically:
  • When using data features for analysis and application recommendation, the present application is no longer limited to the user's own features such as age, gender and hobbies.
  • The user's context information better assists in analyzing the user's preference features, thereby improving recommendation accuracy.
  • The present application combines the user features after matrix decomposition and clustering, the features of the user's situation (see Table 2) and the application features, uses a Bayesian model to predict the probability of each application being recommended, and arranges the applications in descending order of predicted probability to generate an application recommendation list for users with such user features and situation features.
  • M' is the user feature vector after the matrix decomposition and clustering operations;
  • A is the feature of the recommended application, such as the application category and the application rating;
  • S is the feature of the user's situation, such as the type of application the user last browsed and the length of time the user stayed in a certain type of application.
  • With the Bayesian model, the probability of each application being recommended can be calculated for given user features and situation features; the probabilities are then sorted in descending order to find the top K applications with the highest recommendation probability, that is, the K applications most worth recommending, which form the application recommendation list.
  • The present application can handle the fact that user context information changes easily over time: the probabilities in the formula are updated from the real-time situation information fed back by the user, so that the application recommendations respond in real time.
  • The present application adopts the user's clicks on applications and the length of stay as features of the user's situation, which avoids the drawbacks that users may not rate or review applications and that rating scales differ widely between users.
  • The selected context information requires no special action from users, is easy to obtain, and has no problem of inconsistent scales.
  • The optimization module 104 is configured to automatically tune the parameters required by the hierarchical clustering algorithm with the Bayesian optimization grid search algorithm, according to the recall rate of the recommendation system on the application recommendation data set, so as to optimize the recall rate of the recommendation results. The present application therefore does not need hierarchical clustering parameters to be specified manually, which is more efficient. Specifically:
  • The recall rate can be calculated on the application recommendation data set.
  • Based on the recall rate, the stopping distance threshold of the hierarchical clustering is tuned, so as to optimize the recommendation performance of the present application.
  • The adopted Bayesian optimization grid search algorithm takes the effect of earlier parameter adjustments into account before each new adjustment, so that better-performing parameter settings are found more quickly.
  • FIG. 5 is a schematic structural diagram of a terminal according to an embodiment of the present application.
  • the terminal 50 includes a processor 51 and a memory 52 coupled to the processor 51 .
  • The memory 52 stores program instructions for implementing the above application recommendation method.
  • The processor 51 is configured to execute the program instructions stored in the memory 52 to perform application recommendation.
  • The processor 51 may also be referred to as a CPU (Central Processing Unit).
  • The processor 51 may be an integrated circuit chip with signal processing capability.
  • The processor 51 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
  • FIG. 6 is a schematic structural diagram of a storage medium according to an embodiment of the present application.
  • The storage medium of this embodiment of the present application stores a program file 61 capable of implementing all the above methods. The program file 61 may be stored in the storage medium in the form of a software product and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute all or part of the steps of the methods of the various embodiments of the present application.
  • The aforementioned storage medium includes a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disk or other media that can store program code, or terminal devices such as computers, servers, mobile phones and tablets.
  • the present application can use the user feature information that has undergone matrix decomposition and hierarchical clustering, combined with the user's context information and application features, and use a Bayesian model to predict the probability of an application being recommended, so as to provide better application recommendation for the user. specifically:
  • the present application can perform hierarchical clustering on the user features after matrix decomposition and take the mean vector for the features of each type of users.
  • the features are integrated, and the diversity brought by similar users can be appropriately introduced without destroying the original interest orientation of users, so that the characteristics of similar users with roughly the same interests as a user can be used to help the user discover similar users.
  • Other points of interest so as to guide the user to discover potential interests that similar users may have, making the application recommendation of the present application more effective.
  • By performing matrix decomposition on the user feature matrix, the present application not only removes redundant features and features with weak representation ability from the data, which improves recommendation quality, but also reduces the dimensionality of the user features, which makes operation more efficient. Compared with an algorithm without matrix decomposition, the user features are therefore more refined and operation is more efficient.
  • By considering the user's context information, the present application can make more effective application recommendations for the user.
  • The present application uses user context information to analyze user preference features, which helps the application recommendation capture user preferences more accurately and provide the user with better application recommendations.
  • The present application uses the user's clicks on applications, dwell time and the like as context information. This avoids the drawbacks of users not rating applications and of large differences in rating scales between users.
  • The context information selected in the present application requires no special operation by the user, is easy to obtain, and does not suffer from inconsistent scales.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Accounting & Taxation (AREA)
  • Evolutionary Computation (AREA)
  • Finance (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Algebra (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

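An application recommendation method and system. The method comprises: performing matrix decomposition, using a matrix decomposition algorithm, on a user feature matrix containing user features, to obtain dimension-reduced and denoised user features (S1); clustering the user features obtained from the matrix decomposition using a hierarchical clustering algorithm, so that users with similar user features are clustered into the same user class, and using the mean vector of each user class as the user features of the users in that class (S2); predicting, using a Bayesian model and according to the user features after matrix decomposition and clustering, the user context features and the application features, the probability of each application being recommended, and generating an application recommendation list in descending order of predicted probability (S3); and, according to the recall of the recommendation system on an application recommendation data set, automatically tuning the parameters required by the hierarchical clustering algorithm using a Bayesian optimization grid search algorithm, so as to optimize the recall of the recommendation results (S4). The method can use user feature information, combined with the user's context information, application features and the predicted probability of an application being recommended, to provide better application recommendations for the user.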

Description

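Application recommendation method, system, terminal and storage medium
Technical Field
The present invention relates to an application recommendation method, system, terminal and storage medium.
Background
With the arrival of the digital age and the widespread use of smart mobile devices, the amount of information available to people has increased sharply, and the problem of information overload has become increasingly serious. To counter the drawbacks of information overload and deliver genuinely useful information to users in a personalized form, recommendation systems have become one of the effective tools for addressing information overload, and their use has become increasingly widespread. Faced with massive amounts of information, a recommendation system can recommend information of interest to a user in a personalized manner according to the user's information needs, interests and hobbies. An application recommendation system, in particular, aims to recommend applications that a user may be interested in, so that the user can make better use of smart devices such as mobile phones.
However, after a long period of development and evolution, recommendation algorithms still have shortcomings that need to be addressed. First, the vast majority of recommendation systems suffer from homogeneous recommendations: the system keeps recommending information resources similar to the user's historical interests, the repeated recommendation of similar resources lacks novelty, and the user cannot be guided to discover more potential new points of interest, which severely limits the recommendation system. In addition, the vast majority of existing recommendation algorithms focus on analyzing the attributes of users and of the recommended information, and often fail to consider the user's context during recommendation, such as the type of application the user last browsed or the time the user spends in a certain type of application, so the recommendation results are unsatisfactory.
Summary
In view of this, it is necessary to provide an application recommendation method, system, terminal and storage medium that can use user feature information, combined with the user's context information, application features and the predicted probability of an application being recommended, to provide better application recommendations for the user.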
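An embodiment of the present application provides an application recommendation method comprising the following steps: a. performing matrix decomposition, using a matrix decomposition algorithm, on a user feature matrix containing user features, to obtain dimension-reduced and denoised user features; b. clustering the user features obtained from the matrix decomposition using a hierarchical clustering algorithm, so that users with similar user features are clustered into the same user class, and using the mean vector of each user class as the user features of the users in that class; c. predicting, using a Bayesian model and according to the user features after matrix decomposition and clustering, the user context features and the application features, the probability of each application being recommended, and generating an application recommendation list in descending order of predicted probability.
The method further comprises the step of: d. computing recall on an application recommendation data set, and automatically tuning the parameters required by the hierarchical clustering algorithm using a Bayesian optimization grid search algorithm so as to optimize the recall.
The user features include age, gender, location and interests; the matrix decomposition algorithm includes the latent semantic analysis matrix decomposition algorithm, the singular value matrix decomposition algorithm, the non-negative matrix factorization algorithm and the neural factorization machine algorithm.
The hierarchical clustering algorithm specifically comprises the following steps:
initializing every user as a leaf node, that is, one group per user;
in each iteration, computing the Euclidean distance between the mean vectors of the groups, and then merging the user groups with the smallest Euclidean distance, so that the distance between the mean vectors of the user groups produced by the clustering is at least λ;
continuing to merge iteratively until the termination condition is met: the distances between the mean vectors of all groups are greater than or equal to the distance threshold λ.
Using the mean vector of each user class as the user features of the users in that class specifically comprises:
using the mean vector of each group to represent the features of the users in that class, where the mean vector is computed by the following formula:
Figure PCTCN2021138552-appb-000001
where the user feature representation of the c-th user class, U_c, is the mean of the user feature vectors of all users assigned to that class.
Step c specifically comprises:
the formula used by the Bayesian model is:
Figure PCTCN2021138552-appb-000002
where M' is the user feature vector after matrix decomposition and clustering; A denotes the features of the application to be recommended, such as application category and application rating; and S denotes the user context features, including the type of application the user last browsed, the user's clicks on applications, and the time the user spends in a certain type of application;
the probabilities of different applications being recommended are computed according to the above formula and sorted in descending order, so as to find the top K applications with the highest recommendation probability and obtain the application recommendation list.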
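An embodiment of the present application provides an application recommendation system comprising a decomposition module, a clustering module and a recommendation module, wherein: the decomposition module is configured to perform matrix decomposition, using a matrix decomposition algorithm, on a user feature matrix containing user features, to obtain dimension-reduced and denoised user features; the clustering module is configured to cluster the user features obtained from the matrix decomposition using a hierarchical clustering algorithm, so that users with similar user features are clustered into the same user class, and to use the mean vector of each user class as the user features of the users in that class; and the recommendation module is configured to predict, using a Bayesian model and according to the user features after matrix decomposition and clustering, the user context features and the application features, the probability of each application being recommended, and to generate an application recommendation list in descending order of predicted probability.
The system further comprises:
an optimization module, configured to compute recall on an application recommendation data set and to automatically tune the parameters required by the hierarchical clustering algorithm using a Bayesian optimization grid search algorithm so as to optimize the recall.
An embodiment of the present application provides a terminal comprising a processor and a memory coupled to the processor, wherein
the memory stores program instructions for implementing the application recommendation method;
the processor is configured to execute the program instructions stored in the memory to perform application recommendation.
An embodiment of the present application provides a storage medium storing program instructions executable by a processor, the program instructions being used to execute the application recommendation method.
The present application provides an application recommendation method and system. First, LSA matrix decomposition is performed on the user feature matrix; matrix decomposition not only removes redundant features and features with weak representation ability from the data, which improves recommendation quality, but also reduces the dimensionality of the user features, which makes operation more efficient. Next, hierarchical clustering is performed on the decomposed user features, and the hierarchical clustering parameters are adjusted automatically using a Bayesian optimization grid search algorithm, so no parameters need to be set manually. After the user features are clustered, the present application uses the mean vector of the features of each user class as the representation of the features of the users in that class. Finally, because user context information allows user preferences to be analyzed more accurately and applications to be recommended more effectively, the present application combines the decomposed and clustered user features, the user's context and the application features, predicts the probability of each application being recommended using a Bayesian model, and sorts the applications in descending order of probability, thereby generating an application recommendation list for the user and recommending applications of possible interest more effectively and more efficiently.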
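Brief Description of the Drawings
Figure 1 is a flowchart of the application recommendation method according to an embodiment of the present application;
Figure 2 is a schematic diagram of the user feature matrix M according to an embodiment of the present application;
Figure 3 is a schematic diagram of the hierarchical clustering algorithm and the stop threshold λ according to an embodiment of the present application;
Figure 4 is a hardware architecture diagram of the application recommendation system according to an embodiment of the present application;
Figure 5 is a schematic structural diagram of the terminal according to an embodiment of the present application;
Figure 6 is a schematic structural diagram of the storage medium according to an embodiment of the present application.
Detailed Description
The present invention is described in further detail below with reference to the drawings and specific embodiments.
Figure 1 is a flowchart of a preferred embodiment of the application recommendation method of the present invention.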
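Step S1: perform matrix decomposition, using a matrix decomposition algorithm, on a user feature matrix containing user features, to obtain dimension-reduced and denoised user features. The user features include age, gender, location, interests and the like. The matrix decomposition algorithm includes the LSA (latent semantic analysis) matrix decomposition algorithm, the SVD (singular value decomposition) matrix decomposition algorithm, the NMF (non-negative matrix factorization) algorithm, the NFM (neural factorization machines) algorithm and the like. Different matrix decomposition algorithms differ in accuracy and computational complexity, and an appropriate one can be chosen according to the situation. Specifically:
In this embodiment, the user feature matrix containing the user features (age, gender, location, interests, etc.) is decomposed using the LSA matrix decomposition algorithm. In the user feature matrix M, each user is represented by N dimensions and there are T users in total, so M is a matrix with T rows and N columns, as shown in Figure 2.
The user features used in this embodiment are shown in Table 1:
Figure PCTCN2021138552-appb-000003
Table 1
The LSA matrix decomposition algorithm decomposes the user feature matrix M into the matrix product of the user-latent-space matrix U, the latent-space transfer matrix Σ, and the transpose of the latent-space-feature matrix V, as shown in the following formula:
M = UΣV*
Matrix decomposition not only removes redundant features and features with weak representation ability from the data, which improves the quality of the recommendations, but also reduces the dimensionality of the user features, which makes the present application run more efficiently. Compared with an algorithm that does not decompose the user feature matrix, the user features are therefore more refined and operation is more efficient.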
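The decomposition M = UΣV* followed by truncation to the strongest latent dimensions can be sketched as follows. This is a minimal illustration that assumes NumPy is available; the function name `reduce_user_features`, the number of retained dimensions `k` and the toy matrix are assumptions for illustration and are not taken from the application.

```python
import numpy as np

def reduce_user_features(M: np.ndarray, k: int) -> np.ndarray:
    """Return a T x k matrix of reduced user features from a T x N matrix M."""
    # Thin SVD: M = U @ diag(s) @ Vt, singular values in descending order.
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    # Keep the k strongest latent dimensions; each row is one user
    # expressed in the latent space (U scaled by the singular values).
    return U[:, :k] * s[:k]

# Hypothetical toy matrix: 3 users (rows) described by 4 features (columns).
M = np.array([[25.0, 1.0, 0.9, 0.1],
              [27.0, 1.0, 0.8, 0.2],
              [52.0, 0.0, 0.1, 0.9]])
reduced = reduce_user_features(M, k=2)
print(reduced.shape)  # (3, 2)
```

Dropping the dimensions with small singular values is what discards weakly expressive directions in the data; how many dimensions to keep is a design choice that the application does not specify.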
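Step S2: cluster the user features obtained from the matrix decomposition using a hierarchical clustering algorithm, so that users with similar user features are clustered into the same user class. After the users are hierarchically clustered, the mean vector of each user class is used as the user features of the users in that class. Specifically:
The user features after matrix decomposition are clustered using a hierarchical clustering algorithm, so that users are clustered into groups with similar features and similar interests. The bottom-up hierarchical clustering algorithm is shown in Figure 3:
First, every user is initialized as a leaf node, that is, one group per user. Then, in each iteration, the similarity between the mean vectors of the groups, such as the Euclidean distance, is computed, and the user groups with the smallest Euclidean distance are merged, so that the distance between the mean vectors of the user groups produced by the clustering is at least λ. The hierarchical clustering algorithm keeps merging iteratively until the termination condition is met, namely that the distances between the mean vectors of all groups are greater than or equal to the distance threshold λ. As shown in Figure 3, if the stop threshold λ is set to 1.2, the hierarchical clustering algorithm stops after one iteration, because the minimum distance between the mean vectors of the groups is then 1.26, which satisfies the stop condition. When the algorithm stops, users U1 and U2 are therefore in one group, users U5 and U6 are in one group, and users U3 and U4 each form their own group. Meeting the stop condition means that the user groups already differ sufficiently from one another, which prevents a user from being ambiguously assigned to a group to which the user should not belong.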
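The bottom-up procedure with the stop threshold λ can be sketched in plain Python as below. The function names and the two-dimensional coordinates are hypothetical; the coordinates are chosen only so that the grouping matches the U1 to U6 example, and they are not the values behind Figure 3.

```python
import math
from itertools import combinations

def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cluster_users(features, lam):
    """Merge the closest groups until all group means are at least lam apart.

    features: dict mapping user id -> feature vector (after decomposition).
    Returns a sorted list of groups, each a sorted list of user ids.
    """
    groups = [[uid] for uid in features]  # one leaf node per user
    while len(groups) > 1:
        means = [mean_vector([features[u] for u in g]) for g in groups]
        # Find the pair of groups whose mean vectors are closest.
        i, j = min(combinations(range(len(groups)), 2),
                   key=lambda p: math.dist(means[p[0]], means[p[1]]))
        if math.dist(means[i], means[j]) >= lam:  # termination condition
            break
        merged = sorted(groups[i] + groups[j])
        groups = [g for k, g in enumerate(groups) if k not in (i, j)] + [merged]
    return sorted(groups)

# Hypothetical coordinates for six users.
features = {"U1": [0.0, 0.0], "U2": [1.0, 0.0], "U3": [3.0, 0.0],
            "U4": [5.0, 0.0], "U5": [8.0, 0.0], "U6": [9.0, 0.0]}
groups = cluster_users(features, lam=1.2)
print(groups)  # [['U1', 'U2'], ['U3'], ['U4'], ['U5', 'U6']]
```

This sketch merges one pair per pass, so the toy data needs two merges before the stop condition holds; a larger λ yields fewer, coarser groups and a smaller λ yields more, finer ones.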
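Through the hierarchical clustering algorithm, users with similar features are aggregated into the same group, after which the mean vector of each group is used to represent the features of the users in that class. The mean vector is computed by the following formula:
Figure PCTCN2021138552-appb-000004
where the user feature representation of the c-th user class, U_c, is the mean of the user feature vectors of all users assigned to that class.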
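The formula itself is available only as an image, but the definition in the text (U_c is the mean of the feature vectors of all users assigned to class c) suggests a form like the one below. The symbols G_c, for the set of users in class c, and x_u, for the decomposed feature vector of user u, are notation assumed here and may differ from the original figure.

```latex
U_c = \frac{1}{|G_c|} \sum_{u \in G_c} x_u
% Worked example with two users in class c:
% x_1 = (1, 2), x_2 = (3, 4)  =>  U_c = \tfrac{1}{2}\bigl((1, 2) + (3, 4)\bigr) = (2, 3)
```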
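Users in the same class have broadly similar features, but there is still some difference and diversity between them; they are not identical. Taking the mean vector of the features of each user class fuses the features of similar users well and introduces an appropriate amount of the diversity they bring without destroying the user's original interest orientation. The features of similar users whose interests are broadly the same as a given user's can therefore help that user discover other points of interest held by similar users and guide the user toward potential interests, making the recommendation more effective.
Step S3: combining the user features after matrix decomposition and clustering, the user context features and the application features, predict the probability of each application being recommended using a Bayesian model, sort the applications in descending order of predicted probability, and generate an application recommendation list for users with those user features and user context features. The user context features include the type of application the user last browsed, the time the user spends in a certain type of application, and similar features. The application features include application type, application rating and the like; for example, a certain application (ID #000) belongs to category #8, real-time news applications, and has a news attribute index of 0.95, a technology attribute index of 0.76, a sports attribute index of 0.66, and so on. These attributes form a vector that represents the application's features. Specifically:
Unlike traditional recommendation algorithms, the present application is not limited to the user's own features, such as age, gender and hobbies, when analyzing data features and recommending applications. User context information helps analyze user preference features more accurately and thus improves recommendation accuracy. The present application combines the user features after matrix decomposition and clustering, the user context features (see Table 2) and the application features, predicts the probability of each application being recommended using a Bayesian model, and sorts the applications in descending order of predicted probability to generate an application recommendation list for users with those user features and user context features.
The formula used by the Bayesian model is as follows:
Figure PCTCN2021138552-appb-000005
where M' is the user feature vector after matrix decomposition and clustering; A denotes the features of the application to be recommended, such as application category and application rating; and S denotes the user context features, such as the type of application the user last browsed and the time the user spends in a certain type of application. For given user features and user context features, the formula yields the probability of each application being recommended. The probabilities are then sorted in descending order to find the top K applications with the highest recommendation probability, which are the K applications most worth recommending, and the application recommendation list is obtained.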
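Because the Bayesian formula is available only as an image, the sketch below substitutes a simple naive Bayes estimate, P(app | user class, context) ∝ P(app) · P(user class | app) · P(context | app), with Laplace smoothing. This is an assumed stand-in rather than the formula of the application; the function names, the smoothing constant `alpha` and the toy usage log are all hypothetical.

```python
from collections import Counter, defaultdict

def train(log, alpha=1.0):
    """log: iterable of (user_class, context, app) tuples for observed uses."""
    app_n = Counter()
    cls_n = defaultdict(Counter)  # app -> counts of user classes
    ctx_n = defaultdict(Counter)  # app -> counts of contexts
    classes, contexts = set(), set()
    for user_class, context, app in log:
        app_n[app] += 1
        cls_n[app][user_class] += 1
        ctx_n[app][context] += 1
        classes.add(user_class)
        contexts.add(context)
    return app_n, cls_n, ctx_n, len(classes), len(contexts), alpha

def recommend(model, user_class, context, k):
    """Return the k apps with the highest posterior, in descending order."""
    app_n, cls_n, ctx_n, n_cls, n_ctx, alpha = model
    total = sum(app_n.values())
    scores = {}
    for app, n in app_n.items():
        prior = n / total
        # Laplace-smoothed likelihoods of the user class and the context.
        p_cls = (cls_n[app][user_class] + alpha) / (n + alpha * n_cls)
        p_ctx = (ctx_n[app][context] + alpha) / (n + alpha * n_ctx)
        scores[app] = prior * p_cls * p_ctx
    norm = sum(scores.values())  # normalise so the posteriors sum to 1
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [(app, s / norm) for app, s in ranked[:k]]

# Hypothetical log: (user class, last browsed app type, app used).
log = [("c1", "news", "newsapp"), ("c1", "news", "newsapp"),
       ("c1", "game", "chess"), ("c2", "game", "chess"),
       ("c2", "news", "weather")]
top = recommend(train(log), user_class="c1", context="news", k=2)
print([app for app, _ in top])  # ['newsapp', 'chess']
```

Here the user class plays the role of M' and the context label plays the role of S; continuous features such as dwell time would need to be discretised or modelled with a density before they could be used in this way.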
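In addition, the present application can accommodate the fact that user context information tends to change over time: the probabilities in the formula are updated from the real-time context information fed back by the user, so the recommendations can be updated in real time and are more timely.
Furthermore, the present application uses the user's clicks on applications, dwell time and the like as user context features, which avoids the drawbacks of users not rating or reviewing applications and of large differences in rating scales between users. The context information selected in the present application requires no special operation by the user, is easy to obtain, and does not suffer from inconsistent scales.
The user context features used in this embodiment are shown in Table 2:
User context features
Average number of daily launches for each application category
Average number of weekly launches for each application category
Average number of monthly launches for each application category
Average usage duration for each application category
Type of the application the user last opened
Table 2
Step S4: according to the recall of the recommendation system on the application recommendation data set, automatically tune the parameters required by the hierarchical clustering algorithm using a Bayesian optimization grid search algorithm, so as to optimize the recall of the recommendation results. The present application therefore does not require the hierarchical clustering parameters to be specified manually, which is more efficient. Specifically:
The recall of the recommendations produced by the recommendation system is computed on the recommendation data set. A sample of the recommendation data set is shown in Table 3:
Figure PCTCN2021138552-appb-000006
Table 3
Recall can be computed using this data set. Then, with the goal of optimizing the recall of the recommendation results, the Bayesian optimization grid search algorithm tunes the stop distance threshold λ of the hierarchical clustering, thereby optimizing the recommendation performance of the present application. Each time the parameter is adjusted, the Bayesian optimization grid search algorithm takes the effect of previous adjustments into account, so it finds a well-performing parameter setting more quickly.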
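Recall and the tuning of λ can be illustrated with the sketch below. It uses a plain exhaustive search over candidate thresholds as a simpler stand-in for the Bayesian optimization grid search named above, which would additionally use earlier evaluations to choose the next candidate. The function names and the `evaluate` callback are hypothetical.

```python
def recall_at_k(recommended, relevant):
    """Fraction of the relevant apps that appear in the recommended list."""
    if not relevant:
        return 0.0
    hits = len(set(recommended) & set(relevant))
    return hits / len(relevant)

def tune_threshold(candidates, evaluate):
    """Return (best_lambda, best_recall) over the candidate thresholds.

    evaluate: callable mapping a threshold to the recall measured on the
    recommendation data set (clustering plus recommendation run inside it).
    """
    best = max(candidates, key=evaluate)
    return best, evaluate(best)

# One relevant app out of two is recommended, so recall is 0.5.
print(recall_at_k(["a", "b", "c"], ["a", "d"]))  # 0.5

# Hypothetical recall curve peaking at lambda = 1.2, for illustration only.
best_lam, best_recall = tune_threshold(
    [0.8, 1.0, 1.2, 1.4], evaluate=lambda lam: 1.0 - abs(lam - 1.2))
print(best_lam, best_recall)  # 1.2 1.0
```

In practice `evaluate` would rerun the clustering and recommendation steps for each λ and average the recall over the users in the data set, which is why a search that needs fewer evaluations is attractive.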
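Figure 4 is a hardware architecture diagram of the application recommendation system 10 of the present invention. The system includes a decomposition module 101, a clustering module 102, a recommendation module 103 and an optimization module 104.
The decomposition module 101 is configured to perform matrix decomposition, using a matrix decomposition algorithm, on a user feature matrix containing user features, to obtain dimension-reduced and denoised user features. The user features include age, gender, location, interests and the like. The matrix decomposition algorithm includes the LSA (latent semantic analysis) matrix decomposition algorithm, the SVD (singular value decomposition) matrix decomposition algorithm, the NMF (non-negative matrix factorization) algorithm, the NFM (neural factorization machines) algorithm and the like. Different matrix decomposition algorithms differ in accuracy and computational complexity, and an appropriate one can be chosen according to the situation. Specifically:
In this embodiment, the user feature matrix containing the user features (age, gender, location, interests, etc.) is decomposed using the LSA matrix decomposition algorithm. In the user feature matrix M, each user is represented by N dimensions and there are T users in total, so M is a matrix with T rows and N columns, as shown in Figure 2.
The user features used in this embodiment are shown in Table 1:
Figure PCTCN2021138552-appb-000007
Figure PCTCN2021138552-appb-000008
Table 1
The LSA matrix decomposition algorithm decomposes the user feature matrix M into the matrix product of the user-latent-space matrix U, the latent-space transfer matrix Σ, and the transpose of the latent-space-feature matrix V, as shown in the following formula:
M = UΣV*
Matrix decomposition not only removes redundant features and features with weak representation ability from the data, which improves the quality of the recommendations, but also reduces the dimensionality of the user features, which makes the present application run more efficiently. Compared with an algorithm that does not decompose the user feature matrix, the user features are therefore more refined and operation is more efficient.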
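The clustering module 102 is configured to cluster the user features obtained from the matrix decomposition using a hierarchical clustering algorithm, so that users with similar user features are clustered into the same user class. After the users are hierarchically clustered, the mean vector of each user class is used as the user features of the users in that class. Specifically:
The user features after matrix decomposition are clustered using a hierarchical clustering algorithm, so that users are clustered into groups with similar features and similar interests. The bottom-up hierarchical clustering algorithm is shown in Figure 3:
First, every user is initialized as a leaf node, that is, one group per user. Then, in each iteration, the similarity between the mean vectors of the groups, such as the Euclidean distance, is computed, and the user groups with the smallest Euclidean distance are merged, so that the distance between the mean vectors of the user groups produced by the clustering is at least λ. The hierarchical clustering algorithm keeps merging iteratively until the termination condition is met, namely that the distances between the mean vectors of all groups are greater than or equal to the distance threshold λ. As shown in Figure 3, if the stop threshold λ is set to 1.2, the hierarchical clustering algorithm stops after one iteration, because the minimum distance between the mean vectors of the groups is then 1.26, which satisfies the stop condition. When the algorithm stops, users U1 and U2 are therefore in one group, users U5 and U6 are in one group, and users U3 and U4 each form their own group. Meeting the stop condition means that the user groups already differ sufficiently from one another, which prevents a user from being ambiguously assigned to a group to which the user should not belong.
Through the hierarchical clustering algorithm, users with similar features are aggregated into the same group, after which the mean vector of each group is used to represent the features of the users in that class. The mean vector is computed by the following formula:
Figure PCTCN2021138552-appb-000009
where the user feature representation of the c-th user class, U_c, is the mean of the user feature vectors of all users assigned to that class.
Users in the same class have broadly similar features, but there is still some difference and diversity between them; they are not identical. Taking the mean vector of the features of each user class fuses the features of similar users well and introduces an appropriate amount of the diversity they bring without destroying the user's original interest orientation. The features of similar users whose interests are broadly the same as a given user's can therefore help that user discover other points of interest held by similar users and guide the user toward potential interests, making the recommendation more effective.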
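The recommendation module 103 is configured to combine the user features after matrix decomposition and clustering, the user context features and the application features, predict the probability of each application being recommended using a Bayesian model, sort the applications in descending order of predicted probability, and generate an application recommendation list for users with those user features and user context features. The user context features include the type of application the user last browsed, the time the user spends in a certain type of application, and similar features. The application features include application type, application rating and the like; for example, a certain application (ID #000) belongs to category #8, real-time news applications, and has a news attribute index of 0.95, a technology attribute index of 0.76, a sports attribute index of 0.66, and so on. These attributes form a vector that represents the application's features. Specifically:
Unlike traditional recommendation algorithms, the present application is not limited to the user's own features, such as age, gender and hobbies, when analyzing data features and recommending applications. User context information helps analyze user preference features more accurately and thus improves recommendation accuracy. The present application combines the user features after matrix decomposition and clustering, the user context features (see Table 2) and the application features, predicts the probability of each application being recommended using a Bayesian model, and sorts the applications in descending order of predicted probability to generate an application recommendation list for users with those user features and user context features.
The formula used by the Bayesian model is as follows:
Figure PCTCN2021138552-appb-000010
where M' is the user feature vector after matrix decomposition and clustering; A denotes the features of the application to be recommended, such as application category and application rating; and S denotes the user context features, such as the type of application the user last browsed and the time the user spends in a certain type of application. For given user features and user context features, the formula yields the probability of each application being recommended. The probabilities are then sorted in descending order to find the top K applications with the highest recommendation probability, which are the K applications most worth recommending, and the application recommendation list is obtained.
In addition, the present application can accommodate the fact that user context information tends to change over time: the probabilities in the formula are updated from the real-time context information fed back by the user, so the recommendations can be updated in real time and are more timely.
Furthermore, the present application uses the user's clicks on applications, dwell time and the like as user context features, which avoids the drawbacks of users not rating or reviewing applications and of large differences in rating scales between users. The context information selected in the present application requires no special operation by the user, is easy to obtain, and does not suffer from inconsistent scales.
The user context features used in this embodiment are shown in Table 2:
User context features
Average number of daily launches for each application category
Average number of weekly launches for each application category
Average number of monthly launches for each application category
Average usage duration for each application category
Type of the application the user last opened
Table 2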
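The optimization module 104 is configured to automatically tune, according to the recall of the recommendation system on the application recommendation data set, the parameters required by the hierarchical clustering algorithm using a Bayesian optimization grid search algorithm, so as to optimize the recall of the recommendation results. The present application therefore does not require the hierarchical clustering parameters to be specified manually, which is more efficient. Specifically:
The recall of the recommendations produced by the recommendation system is computed on the recommendation data set. A sample of the recommendation data set is shown in Table 3:
Figure PCTCN2021138552-appb-000011
Table 3
Recall can be computed using this data set. Then, with the goal of optimizing the recall of the recommendation results, the Bayesian optimization grid search algorithm tunes the stop distance threshold λ of the hierarchical clustering, thereby optimizing the recommendation performance of the present application. Each time the parameter is adjusted, the Bayesian optimization grid search algorithm takes the effect of previous adjustments into account, so it finds a well-performing parameter setting more quickly.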
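Figure 5 is a schematic structural diagram of the terminal according to an embodiment of the present application. The terminal 50 includes a processor 51 and a memory 52 coupled to the processor 51.
The memory 52 stores program instructions for implementing the application recommendation method described above.
The processor 51 is configured to execute the program instructions stored in the memory 52 to perform application recommendation.
The processor 51 may also be referred to as a CPU (central processing unit). The processor 51 may be an integrated circuit chip with signal processing capability. The processor 51 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The general-purpose processor may be a microprocessor or any conventional processor.
Figure 6 is a schematic structural diagram of the storage medium according to an embodiment of the present application. The storage medium stores a program file 61 capable of implementing all of the above methods. The program file 61 may be stored in the storage medium in the form of a software product and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disk, or any other medium that can store program code, or a terminal device such as a computer, server, mobile phone or tablet.
The present application can use user feature information that has undergone matrix decomposition and hierarchical clustering, combined with the user's context information and application features, and use a Bayesian model to predict the probability of an application being recommended, so as to provide better application recommendations for the user. Specifically:
1) Compared with algorithms that do not cluster user features, the present application performs hierarchical clustering on the user features after matrix decomposition and takes the mean vector of the features of each user class, which fuses the features of similar users well and introduces an appropriate amount of the diversity they bring without destroying the user's original interest orientation. The features of similar users whose interests are broadly the same as a given user's can therefore help that user discover other points of interest held by similar users and guide the user toward potential interests, making the application recommendation of the present application more effective.
2) Compared with algorithms that do not decompose the user features, the present application performs matrix decomposition on the user feature matrix, which not only removes redundant features and features with weak representation ability from the data and improves recommendation quality, but also reduces the dimensionality of the user features and makes operation more efficient. Compared with an algorithm without matrix decomposition, the user features are therefore more refined and operation is more efficient.
3) Compared with algorithms that do not consider the user's context, the present application can make more effective application recommendations by considering the user's context information.
4) Compared with algorithms that use user context information to analyze the recommended products, the present application uses user context information to analyze user preference features, which helps the application recommendation capture user preferences more accurately and recommend applications to the user more effectively.
5) Compared with algorithms that use ratings, reviews and similar information as user context information, the present application uses the user's clicks on applications, dwell time and the like as context information. This avoids the drawbacks of users not rating or reviewing applications and of large differences in rating scales between users. The context information selected in the present application requires no special operation by the user, is easy to obtain, and does not suffer from inconsistent scales.
Although the present invention has been described with reference to the current preferred embodiments, those skilled in the art should understand that the above preferred embodiments are intended only to illustrate the present invention and not to limit its scope of protection. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.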

Claims (10)

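  1. An application recommendation method, characterized in that the method comprises the following steps:
    a. performing matrix decomposition, using a matrix decomposition algorithm, on a user feature matrix containing user features, to obtain dimension-reduced and denoised user features;
    b. clustering the user features obtained from the matrix decomposition using a hierarchical clustering algorithm, so that users with similar user features are clustered into the same user class, and using the mean vector of each user class as the user features of the users in that class;
    c. predicting, using a Bayesian model and according to the user features after matrix decomposition and clustering, the user context features and the application features, the probability of each application being recommended, and generating an application recommendation list in descending order of predicted probability.
  2. The method according to claim 1, characterized in that the method further comprises the step of:
    d. computing recall on an application recommendation data set, and automatically tuning the parameters required by the hierarchical clustering algorithm using a Bayesian optimization grid search algorithm so as to optimize the recall.
  3. The method according to claim 2, characterized in that:
    the user features include age, gender, location and interests;
    the matrix decomposition algorithm includes the latent semantic analysis matrix decomposition algorithm, the singular value matrix decomposition algorithm, the non-negative matrix factorization algorithm and the neural factorization machine algorithm.
  4. The method according to claim 3, characterized in that the hierarchical clustering algorithm specifically comprises the following steps:
    initializing every user as a leaf node, that is, one group per user;
    in each iteration, computing the Euclidean distance between the mean vectors of the groups, and then merging the user groups with the smallest Euclidean distance, so that the distance between the mean vectors of the user groups produced by the clustering is at least λ;
    continuing to merge iteratively until the termination condition is met: the distances between the mean vectors of all groups are greater than or equal to the distance threshold λ.
  5. The method according to claim 4, characterized in that using the mean vector of each user class as the user features of the users in that class specifically comprises:
    using the mean vector of each group to represent the features of the users in that class, where the mean vector is computed by the following formula:
    Figure PCTCN2021138552-appb-100001
    where the user feature representation of the c-th user class, U_c, is the mean of the user feature vectors of all users assigned to that class.
  6. The method according to claim 5, characterized in that step c specifically comprises:
    the formula used by the Bayesian model is:
    Figure PCTCN2021138552-appb-100002
    where M' is the user feature vector after matrix decomposition and clustering; A denotes the features of the application to be recommended, such as application category and application rating; and S denotes the user context features, including the type of application the user last browsed, the user's clicks on applications, and the time the user spends in a certain type of application;
    the probabilities of different applications being recommended are computed according to the above formula and sorted in descending order, so as to find the top K applications with the highest recommendation probability and obtain the application recommendation list.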
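  7. An application recommendation system, characterized in that the system comprises a decomposition module, a clustering module and a recommendation module, wherein:
    the decomposition module is configured to perform matrix decomposition, using a matrix decomposition algorithm, on a user feature matrix containing user features, to obtain dimension-reduced and denoised user features;
    the clustering module is configured to cluster the user features obtained from the matrix decomposition using a hierarchical clustering algorithm, so that users with similar user features are clustered into the same user class, and to use the mean vector of each user class as the user features of the users in that class;
    the recommendation module is configured to predict, using a Bayesian model and according to the user features after matrix decomposition and clustering, the user context features and the application features, the probability of each application being recommended, and to generate an application recommendation list in descending order of predicted probability.
  8. The system according to claim 7, characterized in that the system further comprises:
    an optimization module, configured to compute recall on an application recommendation data set and to automatically tune the parameters required by the hierarchical clustering algorithm using a Bayesian optimization grid search algorithm so as to optimize the recall.
  9. A terminal, characterized in that the terminal comprises a processor and a memory coupled to the processor, wherein
    the memory stores program instructions for implementing the application recommendation method according to any one of claims 1 to 6;
    the processor is configured to execute the program instructions stored in the memory to perform application recommendation.
  10. A storage medium, characterized in that it stores program instructions executable by a processor, the program instructions being used to execute the application recommendation method according to any one of claims 1 to 6.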
PCT/CN2021/138552 2021-04-06 2021-12-15 Application recommendation method, system, terminal and storage medium WO2022213662A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110367692.3 2021-04-06
CN202110367692.3A CN113158039A (zh) 2021-04-06 2021-04-06 Application recommendation method, system, terminal and storage medium

Publications (1)

Publication Number Publication Date
WO2022213662A1 true WO2022213662A1 (zh) 2022-10-13

Family

ID=76888837

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/138552 WO2022213662A1 (zh) 2021-04-06 2021-12-15 Application recommendation method, system, terminal and storage medium

Country Status (2)

Country Link
CN (1) CN113158039A (zh)
WO (1) WO2022213662A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113158039A (zh) * 2021-04-06 2021-07-23 深圳先进技术研究院 Application recommendation method, system, terminal and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090271356A1 (en) * 2008-04-25 2009-10-29 Samsung Electronics Co., Ltd. Situation-aware thresholding for recommendation
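CN103093376A (zh) * 2013-01-16 2013-05-08 北京邮电大学 Clustering collaborative filtering recommendation system based on singular value decomposition algorithm
CN103744917A (zh) * 2013-12-27 2014-04-23 东软集团股份有限公司 Hybrid recommendation method and system
CN111368210A (zh) * 2020-05-27 2020-07-03 腾讯科技(深圳)有限公司 Artificial-intelligence-based information recommendation method and apparatus, and electronic device
CN111798259A (zh) * 2019-04-09 2020-10-20 Oppo广东移动通信有限公司 Application recommendation method and apparatus, storage medium and electronic device
CN113158039A (zh) * 2021-04-06 2021-07-23 深圳先进技术研究院 Application recommendation method, system, terminal and storage medium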

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
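CN108108453A (zh) * 2017-12-28 2018-06-01 北京奇虎科技有限公司 Method and device for recommending application information
CN108763515B (zh) * 2018-05-31 2021-12-17 天津理工大学 Time-sensitive personalized recommendation method based on probabilistic matrix factorization
CN109410075A (zh) * 2018-10-23 2019-03-01 广州市勤思网络科技有限公司 Bayesian-based intelligent insurance recommendation method and system
CN109840833B (zh) * 2019-02-13 2020-11-10 苏州大学 Bayesian collaborative filtering recommendation method
CN110765364A (zh) * 2019-10-22 2020-02-07 哈尔滨理工大学 Collaborative filtering method based on locally optimized dimensionality reduction and clustering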

Also Published As

Publication number Publication date
CN113158039A (zh) 2021-07-23

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21935878

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21935878

Country of ref document: EP

Kind code of ref document: A1