WO2021068610A1

WO2021068610A1 - Resource recommendation method and apparatus, electronic device and storage medium

Info

Publication number: WO2021068610A1
Application number: PCT/CN2020/105925
Authority: WO
Inventors: 陆园丽; 余玉霞
Original assignee: 平安国际智慧城市科技股份有限公司
Priority date: 2019-10-12
Filing date: 2020-07-30
Publication date: 2021-04-15
Also published as: CN110866181A; CN110866181B

Abstract

A resource recommendation method and apparatus, an electronic device and a storage medium, relating to a data analysis technique, the method comprising: acquiring first user information of a user, acquiring first resource information explicitly associated with the user and, on the basis of the first user information and the first resource information, generating a user explicit vector (S1); acquiring second resource information, acquiring second user information of the user explicitly associated with the second resource information and, on the basis of the second resource information and the second user information, generating a resource explicit vector of the same dimension as the user explicit vector (S2); acquiring an implicit behaviour feature of the user, acquiring third user information and third resource information associated with the implicit behaviour feature and, on the basis of the third user information and the third resource information, constructing a triplet relationship matrix and using a predetermined algorithm to decompose the triplet relationship matrix to obtain a user implicit vector and a resource implicit vector of the user (S3); calculating a first similarity of the user explicit vector of the user and the corresponding resource explicit vector and calculating a second similarity of the user implicit vector and the corresponding resource implicit vector (S4); implementing a weighted sum calculation of the first similarity and the second similarity and, on the basis of the result of the weighted sum, selecting resource information and recommending same to the user (S5). The present method can increase the accuracy of resource recommendation.

Description

Resource recommendation method, device, electronic equipment and storage medium

This application claims priority to Chinese patent application No. 201910970985.3, titled "Resource Recommendation Method, Apparatus, and Storage Medium", filed with the Chinese Patent Office on October 12, 2019, the entire content of which is incorporated into this application by reference.

Technical field

This application relates to the field of data analysis technology, and in particular to a method, device, electronic device, and storage medium for resource recommendation.

Background

User portraits and resource portraits are important ways to improve the accuracy of a recommendation system. Comprehensive and accurate tags can fully reflect user characteristics and resource characteristics. Based on the characteristics captured by the portraits, a personalized resource pool can be generated for each user, so that every user sees recommendations tailored to them, which improves recommendation accuracy and, at the same time, user satisfaction.

At present, when portraits are applied in recommendation systems, the inventor has realized that two methods are used for resource selection and prediction. One uses explicit features (for example, features that are similar in content and/or attributes) to select resources. This method generally requires a large amount of feature engineering to find a suitable feature combination; the quality of the feature combination determines, to a certain extent, the quality of the final selection and prediction, and the accuracy needs to be improved. The other uses machine learning algorithms to compute implicit features (for example, features whose content and/or attributes are not similar but which are nevertheless related) to select resources. With a large amount of data, this method can alleviate data sparsity to a certain extent, but resources are updated slowly and the results have low interpretability, so the accuracy also needs to be improved.

Summary of the invention

A method for recommending resources, the method for recommending resources includes:

Acquiring first user information of a user, acquiring first resource information explicitly associated with the user, and generating a user explicit vector based on the first user information and the first resource information;

Obtaining second resource information, obtaining second user information of users explicitly associated with the second resource information, and generating, based on the second resource information and the second user information, a resource explicit vector of the same dimension as the user explicit vector;

Obtaining implicit behavior features of the user, obtaining third user information and third resource information associated with the implicit behavior features, constructing a triple relationship matrix based on the third user information and the third resource information, and decomposing the triple relationship matrix using a predetermined algorithm to obtain the user implicit vector and the resource implicit vector of the user;

Calculating the first similarity between the user explicit vector of the user and the corresponding resource explicit vector, and calculating the second similarity between the user implicit vector and the corresponding resource implicit vector;

A weighted summation is performed on the first similarity degree and the second similarity degree, and resource information is selected based on the result of the weighted summation and recommended to the user.

A device for recommending resources, the device for recommending resources includes:

A first generation module, configured to obtain first user information of a user, obtain first resource information explicitly associated with the user, and generate a user explicit vector based on the first user information and first resource information;

A second generation module, configured to obtain second resource information, obtain second user information of users explicitly associated with the second resource information, and generate, based on the second resource information and the second user information, a resource explicit vector of the same dimension as the user explicit vector;

A decomposition module, configured to obtain implicit behavior features of the user, obtain third user information and third resource information associated with the implicit behavior features, construct a triple relationship matrix based on the third user information and the third resource information, and decompose the triple relationship matrix using a predetermined algorithm to obtain the user implicit vector and the resource implicit vector of the user;

A calculation module, configured to calculate the first similarity between the user explicit vector of the user and the corresponding resource explicit vector, and calculate the second similarity between the user implicit vector and the corresponding resource implicit vector;

The recommendation module is configured to perform a weighted summation on the first similarity and the second similarity, select resource information based on the result of the weighted summation, and recommend to the user.

An electronic device including a memory and a processor connected to the memory, the memory stores a computer program that can run on the processor, and the computer program is executed by the processor to implement the following steps:

A computer-readable storage medium having a computer program stored on the computer-readable storage medium, and when the computer program is executed by a processor, the following steps are implemented:

Description of the drawings

FIG. 1 is a schematic diagram of the hardware architecture of an embodiment of an electronic device of this application;

Figure 2 is a program module diagram of an embodiment of a resource recommendation device;

FIG. 3 is a schematic flowchart of an embodiment of a resource recommendation method of this application.

Detailed description

In order to make the purpose, technical solutions, and advantages of this application clearer, the following further describes this application in detail with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present application, and are not used to limit the present application. Based on the embodiments in this application, all other embodiments obtained by those of ordinary skill in the art without creative work shall fall within the protection scope of this application.

It should be noted that the descriptions related to "first", "second", etc. in this application are only for descriptive purposes, and cannot be understood as indicating or implying their relative importance or implicitly indicating the number of indicated technical features. Therefore, the features defined with "first" and "second" may explicitly or implicitly include at least one of the features. In addition, the technical solutions of the various embodiments can be combined with each other, but only on the basis that the combination can be achieved by a person of ordinary skill in the art. When a combination of technical solutions is contradictory or cannot be achieved, it should be considered that such a combination does not exist and is not within the scope of protection claimed by this application.

Refer to FIG. 1, which is a schematic diagram of the hardware architecture of an embodiment of the electronic device of the present application. The electronic device 1 is a device that can automatically perform numerical calculation and/or information processing in accordance with pre-set or stored instructions. The electronic device 1 may be a computer, a single web server, a server group composed of multiple web servers, or a cloud composed of a large number of hosts or web servers based on cloud computing, where cloud computing is a type of distributed computing: a super virtual computer composed of a group of loosely coupled computers.

In this embodiment, the electronic device 1 may include, but is not limited to, a memory 11, a processor 12, and a network interface 13 that can be communicably connected to each other through a system bus. The memory 11 stores a computer program that can run on the processor 12. It should be pointed out that FIG. 1 only shows the electronic device 1 with the components 11-13, but it should be understood that it is not required to implement all the illustrated components, and more or fewer components may be implemented instead.

The memory 11 includes an internal memory and at least one type of readable storage medium, and the readable storage medium may be non-volatile or volatile. The internal memory provides a cache for the operation of the electronic device 1; the readable storage medium can be, for example, flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory, etc.), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disks, optical disks and other non-volatile storage media. In some embodiments, the readable storage medium may be an internal storage unit of the electronic device 1, such as a hard disk of the electronic device 1. In other embodiments, the non-volatile storage medium may also be an external storage device of the electronic device 1, such as a plug-in hard disk, Smart Media Card (SMC), Secure Digital (SD) card, or flash memory card (Flash Card) equipped on the electronic device 1. In this embodiment, the readable storage medium of the memory 11 is generally used to store the operating system and various application software installed in the electronic device 1, for example, to store the code of the computer program 14 in an embodiment of the present application. In addition, the memory 11 can also be used to temporarily store various types of data that have been output or will be output.

In some embodiments, the processor 12 may be a central processing unit (CPU), controller, microcontroller, microprocessor, or other data processing chip, and is used to run program code stored in the memory 11 or to process data, for example to run the computer program 14.

The network interface 13 may include a standard wireless network interface and a wired network interface. The network interface 13 is usually used to establish a communication connection between the electronic device 1 and other electronic devices.

The computer program 14 is stored in the memory 11 and includes at least one computer-readable instruction, which can be executed by the processor 12 to implement the method of each embodiment of the present application; the at least one computer-readable instruction can be divided into different logic modules according to the different functions implemented by each part thereof.

In an embodiment, when the above-mentioned computer program 14 is executed by the processor 12, the following steps are implemented:

Preferably, the first user information includes basic information and behavior information of the user, the first resource information includes resource information with the same or different business attributes, and the step of generating a user explicit vector based on the first user information and the first resource information includes:

Obtaining a pre-defined multi-dimensional vector, the multi-dimensional vector including basic information and business attributes;

Assigning values to the basic information in the multi-dimensional vector based on the basic information, and assigning values to the business attributes in the multi-dimensional vector based on each piece of resource information in the first resource information, the basic information, and the behavior information, so that the assigned multi-dimensional vector is used as the user explicit vector.

Preferably, the behavior information includes explicit behavior features and implicit behavior features, and the step of assigning values to the business attributes in the multi-dimensional vector based on each piece of resource information in the first resource information, the basic information, and the behavior information specifically includes:

Obtaining the explicit behavior features and time information generated when the user operates on each piece of resource information in the first resource information, calculating the user's preference degree for the corresponding resource information based on the explicit behavior features and the time information, and using the preference degree as the value of the corresponding business attribute; or

The users are grouped based on the basic information, and the user's preference degree for each resource information in the first resource information is predicted through the association rules in the group, and the preference degree is used as the value of the corresponding business attribute.

Preferably, the step of performing a weighted summation on the first similarity degree and the second similarity degree, and selecting resource information based on the result of the weighted summation and recommending to the user specifically includes:

Normalizing the first similarity and the second similarity respectively, obtaining predetermined weights, and performing a weighted summation based on the normalized first similarity, the normalized second similarity, and the weights to obtain a total similarity;

The shelf time and popularity of each resource information are acquired, and multiple resource information is selected based on the total similarity, the shelf time and popularity of each resource information, and recommended to the user.
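The normalization and weighted summation described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not name a normalization method, so min-max scaling is assumed; the weights 0.6 and 0.4 and all function names are hypothetical; and the shelf time and popularity criteria are left out because the text does not say how they are combined with the total similarity.

```python
from typing import Dict, List


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]. Min-max scaling is an assumption of this sketch."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal, so no resource stands out on this similarity.
        return {key: 0.0 for key in scores}
    return {key: (value - lo) / (hi - lo) for key, value in scores.items()}


def total_similarity(
    first_similarity: Dict[str, float],
    second_similarity: Dict[str, float],
    w_first: float = 0.6,   # hypothetical predetermined weight
    w_second: float = 0.4,  # hypothetical predetermined weight
) -> Dict[str, float]:
    """Weighted sum of the normalized explicit and implicit similarities per resource."""
    norm_first = min_max_normalize(first_similarity)
    norm_second = min_max_normalize(second_similarity)
    return {
        resource: w_first * norm_first[resource] + w_second * norm_second[resource]
        for resource in first_similarity
    }


def top_resources(total: Dict[str, float], n: int) -> List[str]:
    """Return the n resources with the highest total similarity."""
    return sorted(total, key=total.get, reverse=True)[:n]


explicit = {"r1": 0.2, "r2": 0.8, "r3": 0.5}
implicit = {"r1": 0.9, "r2": 0.1, "r3": 0.5}
ranked = top_resources(total_similarity(explicit, implicit), 2)
```

Normalizing before summing keeps one similarity from dominating the total merely because it is measured on a larger scale.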

Refer to FIG. 2, which is a program module diagram of the resource recommendation device 10. The resource recommendation device 10 is divided into multiple modules, and the multiple modules are stored in the memory 11 and executed by the processor 12 to implement this application. The module referred to in this application refers to a series of computer program instruction segments that can complete specific functions.

The resource recommendation device 10 can be divided into: a first generation module 101, a second generation module 102, a decomposition module 103, a calculation module 104, and a recommendation module 105.

The first generating module 101 is configured to obtain first user information of a user, obtain first resource information explicitly associated with the user, and generate a user explicit vector based on the first user information and the first resource information;

The second generation module 102 is configured to obtain second resource information, obtain second user information of users explicitly associated with the second resource information, and generate, based on the second resource information and the second user information, a resource explicit vector of the same dimension as the user explicit vector;

The decomposition module 103 is configured to obtain implicit behavior features of the user, obtain third user information and third resource information associated with the implicit behavior features, construct a triple relationship matrix based on the third user information and the third resource information, and decompose the triple relationship matrix using a predetermined algorithm to obtain the user implicit vector and the resource implicit vector of the user;

The calculation module 104 is configured to calculate the first similarity between the user explicit vector of the user and the corresponding resource explicit vector, and calculate the second similarity between the user implicit vector and the corresponding resource implicit vector;

The recommendation module 105 is configured to perform a weighted summation on the first similarity degree and the second similarity degree, select resource information based on the result of the weighted summation, and make recommendations to the user.

For the specific principle, please refer to the introduction of the flow chart of the method in Figure 3 below.

FIG. 3 is a schematic flowchart of an embodiment of a resource recommendation method of this application. When the processor 12 of the electronic device 1 executes the computer program 14 stored in the memory 11, the following steps of the method are implemented:

Step S1, acquiring first user information of a user, acquiring first resource information explicitly associated with the user, and generating a user explicit vector based on the first user information and first resource information;

The first user information includes the user's basic information and behavior information. The basic information includes gender, age, consumption ability, work information, etc. The behavior information is the operation information produced when the user browses or operates on resources; it can be obtained from logs and includes explicit behavior features and implicit behavior features. Explicit behavior features, such as favoriting, liking, and sharing, directly reflect the user's preference for resources. Implicit behavior features, such as time spent browsing a resource page, search keywords, comments, clicks, and mouse scrolling, do not directly reflect the user's preference for resources.

The first resource information is resource information on the network, including various resource information with the same or different business attributes, which are distinguished according to business attributes. For example, the resource information can be product information, sales information, training information, artificial intelligence information, etc.

Among them, from the perspective of the user, if the user's behavior information when browsing or operating resource information is an explicit behavior feature, then the user is explicitly associated with the resource information.

Wherein, the step of generating a user explicit vector based on the first user information and the first resource information specifically includes:

Predefine a multi-dimensional vector (a_1, a_2, ..., a_j, b_1, b_2, ..., b_k), where the multi-dimensional vector can be extended and configured through a configuration file. a_1, a_2, ..., a_j are the user's basic information (including gender, age, consumption ability, work information, etc.), each taking the value 0 or 1. The basic-information entries of the multi-dimensional vector are assigned based on the user's basic information: a discrete variable maps directly to its value, and a continuous variable is discretized with the minimum-entropy binning method to obtain its value. For example, for gender, male corresponds to 0 and female to 1; for age, 20 years old and above corresponds to 0 and under 20 to 1; for work information, a writer corresponds to 0 and a non-writer to 1.

b_1, b_2, ..., b_k are the business attributes of each piece of resource information in the first resource information that is explicitly associated with the first user information. The business attribute of each piece of resource information can be determined as follows: pre-build a business attribute label system that meets the business development goals, then extract the text of each piece of resource information in the first resource information. The subsequent processing of the text can use existing techniques, namely word segmentation, data cleaning, LDA topic extraction, vectorization, and vector-based business attribute similarity calculation. When the similarity between the text and a business attribute label exceeds a threshold, the business attribute of that piece of resource information is the business attribute indicated by that label.

The value of each business attribute can be obtained in any of the following predetermined ways:

The first way is to obtain the explicit behavior features and time information generated when the user operates on each piece of resource information in the first resource information, calculate the user's preference degree for the corresponding resource information based on the explicit behavior features and the time information, and use the preference degree as the value of the corresponding business attribute. That is, the user's preference degree for resource information of each business attribute is calculated from the user's explicit behavior features and a time factor, and used as the value of that business attribute. For resource information of a given business attribute, the user's behavior is closely related to time. Obtain the user's explicit behavior features (for example, likes, favorites, etc.) from the behavior information, and calculate the user's preference degree b using the following formula:

Here, t is the number of days since the user performed the explicit behavior on the resource information, and α, β, c, and t_γ are constant parameters with α>0, β>0, c>0; the default values of α, β, c, and t_γ are 1, 0.42, 0.025, and 0.0025 respectively. Of course, the values of α, β, c, and t_γ can also be generated according to changes in the data of the business attribute. Since a user may browse or operate on the same resource information at different times, the results can be aggregated by user and business attribute. For example, if a user interacts with the same resource information several times within a certain period, the maximum preference degree in that period can be taken as the value of b. Finally, the preference degree for each business attribute is used as the value of the corresponding business attribute b_1, b_2, ..., b_k.

This embodiment incorporates a time factor when calculating the user's preference degree; taking the time dependence into account helps improve the accuracy of resource recommendation.
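The aggregation step, taking the maximum preference degree per user and business attribute, can be sketched as below. The sketch assumes the preference degrees have already been computed with the formula of this embodiment, which is not reproduced here; the function name and the record layout are hypothetical.

```python
from typing import Dict, Iterable, Tuple

# (user_id, business_attribute, preference_degree); hypothetical record layout.
Record = Tuple[str, str, float]


def aggregate_max_preference(records: Iterable[Record]) -> Dict[Tuple[str, str], float]:
    """Keep the largest preference degree seen for each (user, business attribute) pair."""
    best: Dict[Tuple[str, str], float] = {}
    for user_id, attribute, preference in records:
        key = (user_id, attribute)
        if key not in best or preference > best[key]:
            best[key] = preference
    return best


# Preference degrees here are placeholder numbers, not outputs of the formula.
records = [
    ("u1", "product", 0.3),
    ("u1", "product", 0.7),
    ("u1", "training", 0.2),
]
values = aggregate_max_preference(records)
```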

The second way is to group users based on the basic information and predict the user's preference degree for resource information of each business attribute through association rules within the group, using it as the value of each business attribute. Users are first grouped; how the groups are defined can depend on the scale of the user data. For example, if the user data is small, all users belong to a single group; if the user data is large, users can be grouped by their basic information, for example by region or industry, so that each user has a corresponding group. On the Spark platform, an association rule algorithm (FP-Growth) is used to predict the preferred resource information of users in each group. Specifically, the groups are denoted G = {g_1, g_2, g_3, ..., g_n}, where n is the number of groups. The users' explicit behavior features are obtained, and the explicit behavior feature v_i of user u_i is denoted {u_i, v_i}. Each user u_i has a corresponding group g_i, which gives the relation R = {r_1, r_2, ..., r_m}, r_i = {g_i, v_i}, where m is the number of users; from this, the frequent resource items of each group are constructed. A preferred-resource list and recommendation scores are generated for each group from the frequent resource items. Then, based on the relation R and each group's preferred-resource list and recommendation scores, the pairs (user's recommendation score, business attribute of the preferred resource) are obtained, and the recommendation scores are used as the values of the corresponding business attributes b_1, b_2, ..., b_k.

The association rules within the group proposed in this embodiment help to solve the problem of excessive consumption of calculation resources of the association rule algorithm on the one hand, and on the other hand, it helps to enhance the group effect of users and improve the accuracy of resource recommendation.
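The idea of computing frequent resources separately within each group can be illustrated with a deliberately simplified stand-in. The sketch below is not FP-Growth and does not run on Spark: it only counts single-resource support within each group and uses that support as a recommendation score. The function name, the `min_support` threshold, and the input layout are assumptions made for illustration.

```python
from collections import Counter, defaultdict
from typing import Dict, Set


def group_recommendation_scores(
    user_groups: Dict[str, str],      # user -> group, e.g. by region or industry
    user_items: Dict[str, Set[str]],  # user -> resources with explicit behavior
    min_support: float = 0.5,         # hypothetical support threshold
) -> Dict[str, Dict[str, float]]:
    """Per group, score each resource by the share of group members who acted on it."""
    group_sizes = Counter(user_groups.values())
    counts: Dict[str, Counter] = defaultdict(Counter)
    for user, items in user_items.items():
        counts[user_groups[user]].update(set(items))

    scores: Dict[str, Dict[str, float]] = {}
    for group, counter in counts.items():
        size = group_sizes[group]
        scores[group] = {
            item: count / size
            for item, count in counter.items()
            if count / size >= min_support
        }
    return scores


user_groups = {"u1": "east", "u2": "east", "u3": "west"}
user_items = {"u1": {"r1", "r2"}, "u2": {"r1"}, "u3": {"r3"}}
scores = group_recommendation_scores(user_groups, user_items, min_support=0.6)
```

Because each group is processed on its own, the counting work scales with the size of a group rather than with the whole user base, which is the resource-saving effect described above.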

The third way is to fuse the first and second ways: the business attribute values b_1, b_2, ..., b_k obtained in the first way and those obtained in the second way are matched attribute by attribute and combined by a weighted sum, where each weight can be determined in advance. The weights for the business attribute values of the first way are all the same, for example 0.55, and the weights for the business attribute values of the second way are all the same, for example 0.45. The weighted sum gives the final values of the business attributes b_1, b_2, ..., b_k.

After the values of the above multi-dimensional vector (a_1, a_2, ..., a_j, b_1, b_2, ..., b_k) are determined, the user explicit vector is generated.
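As a small worked example of this fusion, the sketch below combines the two sets of business attribute values with the example weights 0.55 and 0.45. The function name is hypothetical, and treating an attribute missing from one set as 0 is an assumption of this sketch, not something the text specifies.

```python
from typing import Dict


def fuse_attribute_values(
    first_way: Dict[str, float],
    second_way: Dict[str, float],
    w_first: float = 0.55,   # example weight given in the text
    w_second: float = 0.45,  # example weight given in the text
) -> Dict[str, float]:
    """Weighted sum of the two value sets, matched by business attribute."""
    attributes = set(first_way) | set(second_way)
    return {
        # Assumption: an attribute absent from one set contributes 0 from that set.
        name: w_first * first_way.get(name, 0.0) + w_second * second_way.get(name, 0.0)
        for name in attributes
    }


fused = fuse_attribute_values(
    {"product": 0.8, "training": 0.2},
    {"product": 0.4, "training": 1.0},
)
# product: 0.55 * 0.8 + 0.45 * 0.4; training: 0.55 * 0.2 + 0.45 * 1.0
```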
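Assembling the user explicit vector can be sketched as follows, using the 0/1 encodings given as examples in this embodiment (gender, age threshold of 20, writer or non-writer). The function names, the dictionary keys, and the choice to fill a business attribute with no value as 0 are assumptions for illustration only.

```python
from typing import Dict, List


def encode_basic_info(basic: Dict[str, object]) -> List[float]:
    """Encode basic information as 0/1 values, following the examples in the text."""
    return [
        0.0 if basic["gender"] == "male" else 1.0,        # male -> 0, female -> 1
        0.0 if basic["age"] >= 20 else 1.0,               # 20 and above -> 0, under 20 -> 1
        0.0 if basic["occupation"] == "writer" else 1.0,  # writer -> 0, non-writer -> 1
    ]


def build_user_explicit_vector(
    basic: Dict[str, object],
    attribute_values: Dict[str, float],
    attribute_order: List[str],
) -> List[float]:
    """Concatenate the basic-information entries and the business attribute values."""
    a_part = encode_basic_info(basic)
    # Assumption: a business attribute the user never interacted with gets 0.
    b_part = [attribute_values.get(name, 0.0) for name in attribute_order]
    return a_part + b_part


vector = build_user_explicit_vector(
    {"gender": "female", "age": 18, "occupation": "engineer"},
    {"product": 0.7},
    ["product", "training"],
)
```

Fixing the attribute order in one place matters because the resource explicit vector must use the same dimensions for the similarity calculation to be meaningful.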

Step S2, acquiring second resource information, acquiring second user information of users explicitly associated with the second resource information, and generating, based on the second resource information and the second user information, a resource explicit vector of the same dimension as the user explicit vector;

Wherein, the second resource information is also resource information on the network, including various resource information with the same or different service attributes. The second user information also includes the user's basic information and behavior information.

From the perspective of a resource, if the behavior information of a user when browsing or operating on the resource information is an explicit behavior feature, then the resource information is explicitly associated with that user, and that user's user information is obtained. The user information of all explicitly associated users constitutes the second user information.

Wherein, based on the second resource information and second user information, generating an explicit vector of resources with the same dimension as the explicit vector of the user specifically includes:

Predefine a multi-dimensional vector (A_1, A_2, ..., A_j, B_1, B_2, ..., B_k), where the multi-dimensional vector can be extended and configured through a configuration file and has the same dimension as the aforementioned user explicit vector. B_1, B_2, ..., B_k are the business attributes of each piece of resource information in the second resource information, determined as follows: pre-build a business attribute label system that meets the business development goals, then extract the text of each piece of resource information in the second resource information. The subsequent processing of the text can use existing techniques, namely word segmentation, data cleaning, LDA topic extraction, vectorization, and vector-based business attribute similarity calculation. When the similarity between the text and a business attribute label exceeds a threshold, the business attribute of that piece of resource information is the business attribute indicated by that label, and its value is the similarity with that label.

A_1, A_2, ..., A_j are the basic information of users that are explicitly associated with the second resource information. The users who are explicitly associated with the second resource information may have multiple attribute tags (for example, male, high-consumption group, R&D engineer, etc.), which makes the users' basic information relatively scattered. Resource information of one type of business attribute may be browsed and operated on by different users, but it may not actually be suitable for all of these users. Therefore, this embodiment selects users with the corresponding attribute tags by clustering, and obtains the values of A_1, A_2, ..., A_j from the basic information of the users with these attribute tags.

The clustering in this embodiment can use the k-means algorithm. Users are grouped (for example by region or industry, based on basic user information), and a preset number of center points, denoted k, is obtained from an analysis of the user groups. Clustering the basic information of users who have historical behavior records yields the cluster centers of the basic information and the relation [basic information list, cluster center] between users' basic information and the cluster centers. After clustering, the relation [resource information, basic information list] can be obtained from the users' historical behavior information, and merging the two relations gives the relation [resource information, cluster center]. Because one piece of resource information corresponds to multiple clusters, the values of N cluster centers are selected according to the total number of users in each attribute category; by default, the values of the top 3 cluster centers can be used. Finally, a weighted sum is computed with the proportion of users as the weight, and the result of the weighted sum gives the final values of A_1, A_2, ..., A_j.
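The last step, combining the top cluster centers with user proportions as weights, can be sketched as below. The sketch assumes the k-means clustering has already been run and its centers are given; the function name and input layout are hypothetical, and computing the proportions over the selected clusters only (rather than over all clusters) is an assumption, since the text does not say which denominator is used.

```python
from typing import Dict, List


def resource_basic_values(
    cluster_user_counts: Dict[int, int],        # cluster -> users of this resource in it
    cluster_centers: Dict[int, List[float]],    # cluster -> center in basic-info space
    top_n: int = 3,                             # default of 3 as stated in the text
) -> List[float]:
    """Weighted sum of the top-N cluster centers, weighted by share of users."""
    if not cluster_user_counts:
        raise ValueError("at least one cluster is required")
    top = sorted(cluster_user_counts, key=cluster_user_counts.get, reverse=True)[:top_n]
    total_users = sum(cluster_user_counts[c] for c in top)
    if total_users == 0:
        raise ValueError("selected clusters contain no users")

    values = [0.0] * len(cluster_centers[top[0]])
    for cluster in top:
        # Assumption: proportion is taken over the selected clusters only.
        weight = cluster_user_counts[cluster] / total_users
        for i, coordinate in enumerate(cluster_centers[cluster]):
            values[i] += weight * coordinate
    return values


counts = {0: 6, 1: 3, 2: 1, 3: 0}
centers = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0], 3: [0.0, 0.0]}
a_values = resource_basic_values(counts, centers)
# weights 0.6, 0.3, 0.1 give approximately [0.7, 0.4]
```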

This embodiment vectorizes user explicit features and resource explicit features to obtain user explicit vectors and resource explicit vectors, which avoids the process of feature combination and related preprocessing of a large number of features, and can reduce the complexity of calculation.

Step S3: Obtain the implicit behavior features of the user, obtain third user information and third resource information associated with the implicit behavior features, construct a triple relationship matrix based on the third user information and the third resource information, and decompose the triple relationship matrix with a predetermined algorithm to obtain the user implicit vector and the resource implicit vector of the user;

In this embodiment, the user's implicit behavior features can be obtained from logs, together with the third user information and third resource information associated with those features. The third user information also includes the user's basic information, and the third resource information is likewise resource information on the network, including resource information with the same or different business attributes. A triple relationship matrix R[user, product, rating] is constructed based on the third user information and the third resource information. The matrix covers m users and n products, where user represents a user, product represents a resource, and rating represents the rating (that is, the degree of preference). In this embodiment, the rating corresponding to an implicit behavior feature is uniformly defined as 1.
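As a non-limiting illustration, the construction of R from log records can be sketched as follows. The record layout (a list of (user, product) pairs) and the name `build_rating_matrix` are assumptions made for this sketch; a dense array is used only for brevity, and a sparse representation would be more suitable at realistic scale.

```python
import numpy as np

def build_rating_matrix(log_records):
    """Build the m x n matrix R from (user, product) implicit-behavior records.

    Every observed (user, product) pair gets rating 1; all other cells stay 0.
    """
    users = sorted({u for u, _ in log_records})
    products = sorted({p for _, p in log_records})
    u_idx = {u: i for i, u in enumerate(users)}
    p_idx = {p: i for i, p in enumerate(products)}
    R = np.zeros((len(users), len(products)))
    for u, p in log_records:
        R[u_idx[u], p_idx[p]] = 1.0   # rating is uniformly defined as 1
    return R, users, products
```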

In practical applications, since the numbers of n and m are both very large, the scale of the triple relation matrix R is very large. At this time, the traditional matrix decomposition method is difficult to handle such a large amount of data; furthermore, it is impossible for a user to rate all resource products. Therefore, the triple relation matrix R is a sparse matrix with many missing items.

This embodiment runs on the Spark platform and uses a predetermined algorithm (alternating least squares, ALS) to calculate the user implicit vector and the resource implicit vector. Since the triple relationship matrix R is an m*n matrix, it can be approximately regarded as the product of two matrices of sizes m*k and k*n, where k << m, n and k typically ranges from 20 to 200, which gives the following formula:

R_{m*n} = u_{m*k} × p_{k*n};

In the above formula, u_{m*k} represents the users' preference for the implicit behavior features, and p_{k*n} represents the degree to which the resources contain the implicit behavior features. u_{m*k} and p_{k*n} can be calculated from the above formula; the calculated u_{m*k} is used as the user implicit vector, and p_{k*n} is used as the resource implicit vector.
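As a non-limiting illustration, the alternating scheme behind ALS can be sketched with NumPy as follows. The embodiment itself relies on the Spark platform; this stand-in, the name `als_factorize`, the regularization term `reg`, and the treatment of unobserved cells as 0 are assumptions made for this sketch.

```python
import numpy as np

def als_factorize(R, k=2, reg=0.1, iters=50, seed=0):
    """Approximate R (m x n) by U (m x k) times P (k x n).

    Alternates two ridge-regression solves: fix P and solve for U,
    then fix U and solve for P.
    """
    rng = np.random.default_rng(seed)
    m, n = R.shape
    U = rng.normal(scale=0.1, size=(m, k))
    P = rng.normal(scale=0.1, size=(k, n))
    I = reg * np.eye(k)
    for _ in range(iters):
        # U = R P^T (P P^T + reg I)^-1, computed via a linear solve.
        U = np.linalg.solve(P @ P.T + I, P @ R.T).T
        # P = (U^T U + reg I)^-1 U^T R.
        P = np.linalg.solve(U.T @ U + I, U.T @ R)
    return U, P
```

Row i of U would then serve as the implicit vector of user i, and column j of P as the implicit vector of resource j.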

Step S4: Calculate the first similarity between the user explicit vector of the user and the corresponding resource explicit vector, and calculate the second similarity between the user implicit vector and the corresponding resource implicit vector;

Because directly calculating the similarity between the user explicit vector and the corresponding resource explicit vector is computationally expensive, the locality-sensitive hashing (LSH) algorithm is used to calculate the similarity. First, hash mapping is applied to the user explicit vector and the corresponding resource explicit vector: the feature column and the unique identification column are specified, the features are used as the input of the algorithm, and the output column of the algorithm is specified. Then the approximate similarity join method is applied to the mapped vectors to calculate the Euclidean distance between them as the similarity value:

Calculate the Euclidean distance between the user explicit vector and the resource explicit vector using the locality-sensitive hashing (LSH) procedure described above, to obtain the first similarity sim_explicit_init;

Calculate the Euclidean distance between the user implicit vector and the resource implicit vector using the locality-sensitive hashing (LSH) procedure described above, to obtain the second similarity sim_implicit_init.
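As a non-limiting illustration, hash mapping followed by an approximate similarity join can be sketched with random projections as follows. The feature-column and output-column wording above resembles the interface of a Spark MLlib LSH estimator, but the text does not name one, so this NumPy stand-in, the name `lsh_euclidean_join`, and the parameters `bucket_length` and `num_tables` are assumptions made for this sketch.

```python
from collections import defaultdict

import numpy as np

def lsh_euclidean_join(users, resources, bucket_length=2.0, num_tables=4, seed=0):
    """Return {(user_row, resource_row): Euclidean distance} for pairs that
    share a bucket in at least one hash table."""
    rng = np.random.default_rng(seed)
    dim = users.shape[1]
    proj = rng.normal(size=(dim, num_tables))
    proj /= np.linalg.norm(proj, axis=0, keepdims=True)   # unit directions
    # Hash mapping: project each vector and cut the line into buckets.
    u_hash = np.floor(users @ proj / bucket_length).astype(int)
    r_hash = np.floor(resources @ proj / bucket_length).astype(int)
    buckets = defaultdict(list)
    for r_id, row in enumerate(r_hash):
        for t, h in enumerate(row):
            buckets[(t, int(h))].append(r_id)
    distances = {}
    for u_id, row in enumerate(u_hash):
        candidates = set()
        for t, h in enumerate(row):
            candidates.update(buckets.get((t, int(h)), []))
        # Exact distance is computed only for candidate pairs.
        for r_id in candidates:
            distances[(u_id, r_id)] = float(
                np.linalg.norm(users[u_id] - resources[r_id]))
    return distances
```

Pairs that never share a bucket are simply absent from the result, which is where the saving over an all-pairs comparison comes from.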

Step S5: Perform a weighted summation on the first similarity and the second similarity, select resources based on the result of the weighted summation, and recommend to the user.

The first similarity sim_explicit_init and the second similarity sim_implicit_init are normalized to the (0,1) interval to obtain sim_explicit and sim_implicit, and a weighted summation is performed on the two sets of similarity values to obtain the total similarity:

Sim = α*sim_explicit + β*sim_implicit,

where the weights α and β are set in one of two ways. One is expert scoring, which fixes the values of α and β. The other is linear regression: users are randomly sampled to act as evaluators who score the similarity of the resources provided to them, and the scoring results are used as training data to fit the values of α and β.
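As a non-limiting illustration, the normalization, the weighted summation, and the regression-based choice of α and β can be sketched as follows. The normalization method is not specified above, so min-max scaling (which yields the closed interval [0, 1]) is an assumption, as are the function names and the use of ordinary least squares without an intercept.

```python
import numpy as np

def min_max(x):
    """Scale values to [0, 1]; a constant input maps to all zeros."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    return np.zeros_like(x) if span == 0 else (x - x.min()) / span

def fit_weights(sim_explicit, sim_implicit, scores):
    """Fit alpha and beta by least squares from evaluator scores."""
    X = np.column_stack([sim_explicit, sim_implicit])
    (alpha, beta), *_ = np.linalg.lstsq(X, np.asarray(scores, dtype=float), rcond=None)
    return alpha, beta

def total_similarity(sim_explicit, sim_implicit, alpha, beta):
    """Sim = alpha * sim_explicit + beta * sim_implicit."""
    return alpha * np.asarray(sim_explicit) + beta * np.asarray(sim_implicit)
```

With expert scoring, `fit_weights` would be skipped and fixed values of α and β passed directly to `total_similarity`.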

Finally, the priority of each piece of resource information is calculated from the total similarity Sim obtained by the weighted summation together with the shelf time and popularity of the resource information, and the resource information is sorted by priority to select the top N. In the priority calculation, view represents the popularity (by default, that of the previous day), age represents the number of days since the resource information was put on the shelf, and the constant parameters i and j are both set to 1 by default. The sorted top-N resource information is then pushed to the user.

From the above description, it can be seen that this embodiment combines the user's implicit features and the resource's implicit features with the user's explicit features and the resource's explicit features. This improves on existing recommendation algorithms, raising the accuracy of resource recommendation while enhancing the interpretability of the system.

In addition, an embodiment of the present application also proposes a computer-readable storage medium, which may be non-volatile or volatile. The computer-readable storage medium may be any one or any combination of a hard disk, a multimedia card, an SD card, a flash memory card, an SMC, a read-only memory (ROM), an erasable programmable read-only memory (EPROM), a portable compact disc read-only memory (CD-ROM), a USB memory, and the like. The computer-readable storage medium includes a computer program. For the functions implemented when the computer program is executed by a processor, refer to the above description with respect to FIG. 3, which is not repeated here.

The serial numbers of the foregoing embodiments of the present application are only for description, and do not represent the superiority of the embodiments.

It should be noted that, herein, the terms "include", "comprise", and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, device, article, or method that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to the process, device, article, or method.

Through the description of the above implementation manners, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus the necessary general-purpose hardware platform, or by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of this application, in essence or in the part that contributes over the existing technology, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that enable a terminal device (which can be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to execute the methods described in the embodiments of the present application.

The above are only preferred embodiments of this application and do not limit its patent scope. Any equivalent structure or equivalent process transformation made using the content of the description and drawings of this application, or any direct or indirect application in other related technical fields, is likewise included in the scope of patent protection of this application.

Claims

1. A method for recommending resources, wherein the method includes:

acquiring first user information of a user, acquiring first resource information explicitly associated with the user, and generating a user explicit vector based on the first user information and the first resource information;

acquiring second resource information, acquiring second user information of users explicitly associated with the second resource information, and generating, based on the second resource information and the second user information, a resource explicit vector with the same dimensions as the user explicit vector;

acquiring implicit behavior features of the user, acquiring third user information and third resource information associated with the implicit behavior features, constructing a triple relationship matrix based on the third user information and the third resource information, and decomposing the triple relationship matrix with a predetermined algorithm to obtain a user implicit vector and a resource implicit vector of the user;

calculating a first similarity between the user explicit vector of the user and the corresponding resource explicit vector, and calculating a second similarity between the user implicit vector and the corresponding resource implicit vector; and

performing a weighted summation on the first similarity and the second similarity, selecting resource information based on the result of the weighted summation, and recommending it to the user.
2. The method for resource recommendation according to claim 1, wherein the first user information includes basic information and behavior information of the user, the first resource information includes resource information with the same or different business attributes, and the step of generating a user explicit vector based on the first user information and the first resource information specifically includes:

obtaining a predefined multi-dimensional vector, the multi-dimensional vector including basic information and business attributes; and

assigning values to the basic information in the multi-dimensional vector based on the basic information, assigning values to the business attributes in the multi-dimensional vector based on each piece of resource information in the first resource information, the basic information, and the behavior information, and using the assigned multi-dimensional vector as the user explicit vector.
3. The method for resource recommendation according to claim 2, wherein the behavior information includes explicit behavior features and implicit behavior features, and the step of assigning values to the business attributes in the multi-dimensional vector based on each piece of resource information in the first resource information, the basic information, and the behavior information specifically includes:

obtaining the explicit behavior features and time information generated when the user operates on each piece of resource information in the first resource information, calculating the user's preference degree for the corresponding resource information based on the explicit behavior features and the time information, and using the preference degree as the value of the corresponding business attribute; or

grouping users based on the basic information, predicting the user's preference degree for each piece of resource information in the first resource information through association rules within the group, and using the preference degree as the value of the corresponding business attribute.
4. The method for resource recommendation according to any one of claims 1 to 3, wherein the step of performing a weighted summation on the first similarity and the second similarity, selecting resource information based on the result of the weighted summation, and recommending it to the user specifically includes:

normalizing the first similarity and the second similarity respectively, obtaining predetermined weights, and performing a weighted summation based on the normalized first similarity, the normalized second similarity, and the weights to obtain a total similarity; and

acquiring the shelf time and popularity of each piece of resource information, selecting multiple pieces of resource information based on the total similarity and the shelf time and popularity of each piece of resource information, and recommending them to the user.
5. The method for resource recommendation according to claim 4, wherein the step of selecting multiple pieces of resource information based on the total similarity and the shelf time and popularity of each piece of resource information and recommending them to the user specifically includes:

calculating the priority of each piece of resource information based on the total similarity and the shelf time and popularity of each piece of resource information, selecting multiple pieces of resource information according to their priorities, and recommending them to the user.
6. The method for resource recommendation according to claim 2, wherein the determination of the business attribute specifically includes the following steps:

pre-building a business attribute label structure that meets business development goals; and

extracting the text information of each piece of resource information in the first resource information, and calculating the similarity between the extracted text information and the business attribute labels; when the similarity between the text information and a corresponding business attribute label exceeds a threshold, the business attribute of that piece of resource information is the business attribute indicated by the business attribute label.
7. The method for resource recommendation according to claim 4, wherein the first similarity and the second similarity are calculated by a locality-sensitive hashing algorithm, and the specific calculation method includes the following steps:

performing hash mapping on the user explicit vector and the corresponding resource explicit vector, specifying the feature column and the unique identification column, using the features as the input of the algorithm, specifying the output column of the algorithm, and applying an approximate similarity join method to the mapped user explicit vector and resource explicit vector to calculate the Euclidean distance between the two as the first similarity; and

performing hash mapping on the user implicit vector and the corresponding resource implicit vector, specifying the feature column and the unique identification column, using the features as the input of the algorithm, specifying the output column of the algorithm, and applying an approximate similarity join method to the mapped user implicit vector and resource implicit vector to calculate the Euclidean distance between the two as the second similarity.
8. A device for recommending resources, wherein the device includes:

a first generation module, configured to acquire first user information of a user, acquire first resource information explicitly associated with the user, and generate a user explicit vector based on the first user information and the first resource information;

a second generation module, configured to acquire second resource information, acquire second user information of users explicitly associated with the second resource information, and generate, based on the second resource information and the second user information, a resource explicit vector with the same dimensions as the user explicit vector;

a decomposition module, configured to acquire implicit behavior features of the user, acquire third user information and third resource information associated with the implicit behavior features, construct a triple relationship matrix based on the third user information and the third resource information, and decompose the triple relationship matrix with a predetermined algorithm to obtain a user implicit vector and a resource implicit vector of the user;

a calculation module, configured to calculate a first similarity between the user explicit vector of the user and the corresponding resource explicit vector, and calculate a second similarity between the user implicit vector and the corresponding resource implicit vector; and

a recommendation module, configured to perform a weighted summation on the first similarity and the second similarity, select resource information based on the result of the weighted summation, and recommend it to the user.
9. An electronic device comprising a memory and a processor connected to the memory, the memory storing a computer program that can run on the processor, wherein the computer program, when executed by the processor, implements the following steps:

acquiring first user information of a user, acquiring first resource information explicitly associated with the user, and generating a user explicit vector based on the first user information and the first resource information;

acquiring second resource information, acquiring second user information of users explicitly associated with the second resource information, and generating, based on the second resource information and the second user information, a resource explicit vector with the same dimensions as the user explicit vector;

acquiring implicit behavior features of the user, acquiring third user information and third resource information associated with the implicit behavior features, constructing a triple relationship matrix based on the third user information and the third resource information, and decomposing the triple relationship matrix with a predetermined algorithm to obtain a user implicit vector and a resource implicit vector of the user;

calculating a first similarity between the user explicit vector of the user and the corresponding resource explicit vector, and calculating a second similarity between the user implicit vector and the corresponding resource implicit vector; and

performing a weighted summation on the first similarity and the second similarity, selecting resource information based on the result of the weighted summation, and recommending it to the user.
10. The electronic device according to claim 9, wherein the first user information includes basic information and behavior information of the user, the first resource information includes resource information with the same or different business attributes, and the step of generating a user explicit vector based on the first user information and the first resource information specifically includes:

obtaining a predefined multi-dimensional vector, the multi-dimensional vector including basic information and business attributes; and

assigning values to the basic information in the multi-dimensional vector based on the basic information, assigning values to the business attributes in the multi-dimensional vector based on each piece of resource information in the first resource information, the basic information, and the behavior information, and using the assigned multi-dimensional vector as the user explicit vector.
11. The electronic device according to claim 10, wherein the behavior information includes explicit behavior features and implicit behavior features, and the step of assigning values to the business attributes in the multi-dimensional vector based on each piece of resource information in the first resource information, the basic information, and the behavior information specifically includes:

obtaining the explicit behavior features and time information generated when the user operates on each piece of resource information in the first resource information, calculating the user's preference degree for the corresponding resource information based on the explicit behavior features and the time information, and using the preference degree as the value of the corresponding business attribute; or

grouping users based on the basic information, predicting the user's preference degree for each piece of resource information in the first resource information through association rules within the group, and using the preference degree as the value of the corresponding business attribute.
12. The electronic device according to any one of claims 9 to 11, wherein the step of performing a weighted summation on the first similarity and the second similarity, selecting resource information based on the result of the weighted summation, and recommending it to the user specifically includes:

normalizing the first similarity and the second similarity respectively, obtaining predetermined weights, and performing a weighted summation based on the normalized first similarity, the normalized second similarity, and the weights to obtain a total similarity; and

acquiring the shelf time and popularity of each piece of resource information, selecting multiple pieces of resource information based on the total similarity and the shelf time and popularity of each piece of resource information, and recommending them to the user.
13. The electronic device according to claim 12, wherein the step of selecting multiple pieces of resource information based on the total similarity and the shelf time and popularity of each piece of resource information and recommending them to the user specifically includes:

calculating the priority of each piece of resource information based on the total similarity and the shelf time and popularity of each piece of resource information, selecting multiple pieces of resource information according to their priorities, and recommending them to the user.
14. The electronic device according to claim 10, wherein the determination of the business attribute specifically includes the following steps:

pre-building a business attribute label structure that meets business development goals; and

extracting the text information of each piece of resource information in the first resource information, and calculating the similarity between the extracted text information and the business attribute labels; when the similarity between the text information and a corresponding business attribute label exceeds a threshold, the business attribute of that piece of resource information is the business attribute indicated by the business attribute label.
15. The electronic device according to claim 12, wherein the first similarity and the second similarity are calculated by a locality-sensitive hashing algorithm, and the specific calculation method includes the following steps:

performing hash mapping on the user explicit vector and the corresponding resource explicit vector, specifying the feature column and the unique identification column, using the features as the input of the algorithm, specifying the output column of the algorithm, and applying an approximate similarity join method to the mapped user explicit vector and resource explicit vector to calculate the Euclidean distance between the two as the first similarity; and

performing hash mapping on the user implicit vector and the corresponding resource implicit vector, specifying the feature column and the unique identification column, using the features as the input of the algorithm, specifying the output column of the algorithm, and applying an approximate similarity join method to the mapped user implicit vector and resource implicit vector to calculate the Euclidean distance between the two as the second similarity.
16. A computer-readable storage medium, wherein a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the following steps are implemented:

acquiring first user information of a user, acquiring first resource information explicitly associated with the user, and generating a user explicit vector based on the first user information and the first resource information;

acquiring second resource information, acquiring second user information of users explicitly associated with the second resource information, and generating, based on the second resource information and the second user information, a resource explicit vector with the same dimensions as the user explicit vector;

acquiring implicit behavior features of the user, acquiring third user information and third resource information associated with the implicit behavior features, constructing a triple relationship matrix based on the third user information and the third resource information, and decomposing the triple relationship matrix with a predetermined algorithm to obtain a user implicit vector and a resource implicit vector of the user;

calculating a first similarity between the user explicit vector of the user and the corresponding resource explicit vector, and calculating a second similarity between the user implicit vector and the corresponding resource implicit vector; and

performing a weighted summation on the first similarity and the second similarity, selecting resource information based on the result of the weighted summation, and recommending it to the user.
17. The computer-readable storage medium according to claim 16, wherein the first user information includes basic information and behavior information of the user, the first resource information includes resource information with the same or different business attributes, and the step of generating a user explicit vector based on the first user information and the first resource information specifically includes:

obtaining a predefined multi-dimensional vector, the multi-dimensional vector including basic information and business attributes; and

assigning values to the basic information in the multi-dimensional vector based on the basic information, assigning values to the business attributes in the multi-dimensional vector based on each piece of resource information in the first resource information, the basic information, and the behavior information, and using the assigned multi-dimensional vector as the user explicit vector.
18. The computer-readable storage medium according to claim 17, wherein the behavior information includes explicit behavior features and implicit behavior features, and the step of assigning values to the business attributes in the multi-dimensional vector based on each piece of resource information in the first resource information, the basic information, and the behavior information specifically includes:

obtaining the explicit behavior features and time information generated when the user operates on each piece of resource information in the first resource information, calculating the user's preference degree for the corresponding resource information based on the explicit behavior features and the time information, and using the preference degree as the value of the corresponding business attribute; or

grouping users based on the basic information, predicting the user's preference degree for each piece of resource information in the first resource information through association rules within the group, and using the preference degree as the value of the corresponding business attribute.
19. The computer-readable storage medium according to any one of claims 16 to 18, wherein the step of performing a weighted summation on the first similarity and the second similarity, selecting resource information based on the result of the weighted summation, and recommending it to the user specifically includes:

normalizing the first similarity and the second similarity respectively, obtaining predetermined weights, and performing a weighted summation based on the normalized first similarity, the normalized second similarity, and the weights to obtain a total similarity; and

acquiring the shelf time and popularity of each piece of resource information, selecting multiple pieces of resource information based on the total similarity and the shelf time and popularity of each piece of resource information, and recommending them to the user.
20. The computer-readable storage medium according to claim 19, wherein the step of selecting multiple pieces of resource information based on the total similarity and the shelf time and popularity of each piece of resource information and recommending them to the user specifically includes:

calculating the priority of each piece of resource information based on the total similarity and the shelf time and popularity of each piece of resource information, selecting multiple pieces of resource information according to their priorities, and recommending them to the user.