CN114297521A - Product recommendation sorting algorithm based on big data

Info

Publication number: CN114297521A
Application number: CN202111674049.1A
Authority: CN
Inventors: 谢洋
Original assignee: Shenzhen Yiyun Cloud Computing Co ltd
Current assignee: Shenzhen Yiyun Cloud Computing Co ltd
Priority date: 2021-12-31
Filing date: 2021-12-31
Publication date: 2022-04-08

Abstract

The invention discloses a big-data-based product recommendation sorting algorithm. A pre-installed big data analysis algorithm generates recommendation candidate information that includes products the user is expected to want. Historical relationship information among users is collected, and a set of historical relationship information with similar tendencies is generated from the calculated historical relationship information. Candidate products are then recommended according to the ordering produced by the recommendation algorithm that generates the recommendation candidate information and the set of historical relationship information whose tendencies are similar to the user's, and a preset number of the recommended candidate products are extracted as the recommended product information. By combining the historical relationship information with the ordering of the recommendation algorithm, when several users whose historical behaviors, as extracted through big data, are similar make new product purchases, the purchased products can be recommended and sorted by accumulating similarity, so that the likely purchasing behavior of the user is predicted and the recommended products are sorted.

Description

Product recommendation sorting algorithm based on big data

Technical Field

The invention relates to the technical field of intelligent big data analysis and application, and in particular to a product recommendation sorting algorithm based on big data.

Background

With the rapid development of information technology and the rise of network technology, network data is growing explosively, and networks are filled with ever more data, information, and services. The internet has also become a common channel through which people gather information. However, these information resources vary in quality and are complex in structure, so people find it hard to locate what they need within the huge volume of information. In the prior art, for example, the Chinese patent application with publication number CN112380451A discloses a favorite-content recommendation method based on big data. The method acquires target user information by capturing a face image of a target person, acquires the user history information clustered with the target user information, optimizes the user history information, calculates the similarity between the target user information and the acquired user history information in order to score the items that the target user has not yet rated, and finally pushes the high-scoring items to the target user. Its similarity formula based on item coincidence dependency improves the precision of the similarity between items, and its use of an item-based collaborative filtering recommendation algorithm alleviates the sparsity of the rating data, which greatly improves the recommendation effect of the user-based collaborative filtering recommendation algorithm.

Disclosure of Invention

In view of the prior art, the invention aims to provide a product recommendation sorting algorithm based on big data that improves user experience and pushes recommendations accurately and efficiently.

The product recommendation sorting algorithm based on big data uses a pre-installed big data analysis algorithm to generate recommendation candidate information that includes products the user is expected to want. Historical relationship information among users is collected, and a set of historical relationship information with similar tendencies is generated from the calculated historical relationship information. Candidate products are recommended according to the ordering produced by the recommendation algorithm that generates the recommendation candidate information and the set of historical relationship information whose tendencies are similar to the user's, and a preset number of the recommended candidate products are extracted as the recommended product information. Specifically, the similarities between the first m recommended candidate products and the first n users are extracted, the similarity of each user to the m recommended candidate products is accumulated to obtain the recommendation scores of the m recommended candidate products, and the product information is recommended in order of score. By combining the historical relationship information with the ordering of the recommendation algorithm, when several users whose historical behaviors, as extracted through big data, are similar make new product purchases, the purchased products can be recommended and sorted by accumulating similarity, so that the likely purchasing behavior of the user is predicted and the recommended products are sorted.

To further optimize the technical solution, the following measures are also adopted:

The historical relationship information includes browsing, purchasing, searching, collecting (adding to favorites), and adding to an order; the recommendation candidate information contains the product name, product introduction, and purchase information.

The historical relationship information is stored in a user profile, and the user's context data, which indicates user activity, can be accessed. User profiles corresponding to the users' activities are created through induction, comparison, clustering, and the like, and the habits and intentions that users have in common are evaluated.

The user similarity is the similarity between the context data in the user profile of a first user and the context data in the user profile of a second user. It can be calculated with a classical text or data similarity comparison method and then normalized.
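As a non-limiting illustration, the user similarity could be computed as follows. This is a minimal sketch that assumes cosine similarity over activity counts as the "classical" comparison method; the text does not name a specific method, and the helper names `profile_vector` and `cosine_similarity` are hypothetical. Because activity counts are non-negative, the result already lies in the range 0 to 1, so no further normalization is needed in this sketch.

```python
import math
from collections import Counter


def profile_vector(events):
    """Count (activity, product) pairs found in a user's context data."""
    return Counter(events)


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors (assumed method)."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty profile is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical context data: (activity, product) pairs per user.
first_user = profile_vector([("browse", "p1"), ("purchase", "p1"), ("search", "p2")])
second_user = profile_vector([("browse", "p1"), ("purchase", "p1")])
similarity = cosine_similarity(first_user, second_user)
```

Other comparison methods, such as Jaccard similarity over the sets of activities, would fit the same description equally well.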

User activities include browsing, purchasing, searching, collecting (adding to favorites), and adding to an order, together with the corresponding product names, product introductions, and purchase information. The records of user activities are used to compare the similarity between users. As the number of times that common behaviors are triggered grows, the corresponding users are gradually classified as users with similar behaviors and intentions. These users are then used to predict the behaviors likely to occur for a new (current) user with similar characteristics, and the product information related to those likely behaviors is recommended to the new (current) user.

The invention also discloses a computer program of the product recommendation sorting algorithm based on the big data.

The invention further discloses a storage medium storing the computer program of the big data based product recommendation sorting algorithm.

The invention adopts a product recommendation sorting algorithm based on big data, characterized in that a pre-installed big data analysis algorithm is used to generate recommendation candidate information that includes products the user is expected to want. Historical relationship information among users is collected, and a set of historical relationship information with similar tendencies is generated from the calculated historical relationship information. Candidate products are recommended according to the ordering produced by the recommendation algorithm that generates the recommendation candidate information and the set of historical relationship information whose tendencies are similar to the user's, and a preset number of the recommended candidate products are extracted as the recommended product information. The similarities between the first m recommended candidate products and the first n users are extracted, the similarity of each user to the m recommended candidate products is accumulated to obtain the recommendation scores of the m recommended candidate products, and the product information is recommended in order of score. By combining the historical relationship information with the ordering of the recommendation algorithm, when several users whose historical behaviors, as extracted through big data, are similar make new product purchases, the purchased products can be recommended and sorted by accumulating similarity, so that the likely purchasing behavior of the user is predicted and the recommended products are sorted. The invention therefore has the advantages of improving user experience and pushing recommendations accurately and efficiently.

Drawings

FIG. 1 is a schematic diagram of the prior-art method mode referred to by an embodiment of the present invention;

FIG. 2 is a schematic diagram of an improved method mode of an embodiment of the present invention;

FIG. 3 is a flowchart illustrating a sequence of steps according to an embodiment of the present invention.

Detailed Description

The present invention will be described in further detail with reference to the following examples.

Example:

Referring to FIGS. 1 to 3, a product recommendation sorting algorithm based on big data uses a pre-installed big data analysis algorithm to generate recommendation candidate information that includes products the user is expected to want. Historical relationship information among users is collected, and a set of historical relationship information with similar tendencies is generated from the calculated historical relationship information. Candidate products are recommended according to the ordering produced by the recommendation algorithm that generates the recommendation candidate information and the set of historical relationship information whose tendencies are similar to the user's, and a predetermined number of the recommended candidate products are extracted as the recommended product information. The similarities between the first m recommended candidate products and the first n users are extracted, the similarity of each user to the m recommended candidate products is accumulated to obtain the recommendation scores of the m recommended candidate products, and the product information is recommended in order of score. By combining the historical relationship information with the ordering of the recommendation algorithm, when several users whose historical behaviors, as extracted through big data, are similar make new product purchases, the purchased products can be recommended and sorted by accumulating similarity, so that the likely purchasing behavior of the user is predicted and the recommended products are sorted.

Referring to FIGS. 1 and 2, the invention improves on the prior art. Where the prior art derives the predicted behavior directly from the behavior of individual historical users, the invention first abstracts those behaviors into a data set through learning and then performs behavior prediction. The abstracted behavior is pre-judged and taken as the next intention of the current user, so that fewer individual historical user behaviors need to be evaluated; this reduces the computational cost and makes the method more efficient. In addition, because the abstract behavior is obtained through learning and clustering, the resulting prediction is more stable.
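The similarity accumulation described above might be sketched as follows. This is one possible reading, not a definitive implementation: it assumes that a candidate product's score is the sum of the similarities of those users, among the n most similar, who purchased it. The function name `rank_candidates` and the sample data are hypothetical.

```python
def rank_candidates(user_similarity, purchases, candidates, n):
    """Return (product, score) pairs for the m candidates, highest score first.

    user_similarity: similarity of each historical user to the current user
    purchases:       products newly purchased by each historical user
    candidates:      the first m recommended candidate products
    n:               number of most similar users to consider
    """
    # Keep the n users most similar to the current user.
    top_users = sorted(user_similarity, key=user_similarity.get, reverse=True)[:n]
    scores = {product: 0.0 for product in candidates}
    for user in top_users:
        # Each user contributes its similarity once per purchased candidate.
        for product in set(purchases.get(user, ())):
            if product in scores:
                scores[product] += user_similarity[user]
    # Sort by descending score; ties are broken by product name.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical data: three historical users, three candidate products.
similarity = {"u1": 0.75, "u2": 0.5, "u3": 0.25}
purchases = {"u1": ["p1", "p2"], "u2": ["p2"], "u3": ["p3"]}
ranking = rank_candidates(similarity, purchases, ["p1", "p2", "p3"], n=2)
```

With n = 2 only `u1` and `u2` contribute, so `p2`, which both of them purchased, is ranked first.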

Historical relationship information is typically stored in a user profile, and the user's context data, which indicates user activity (e.g., browsing, adding to an order), may be accessed. An activity of the user may be determined from the context data, and at least a portion of the user profile may thereby be indicated as corresponding to the determined activity. User profiles corresponding to the users' activities can be created by induction, comparison, clustering, and the like, and the habits and intentions that users have in common can be evaluated. This may have the effect of creating a plurality of user profiles (e.g., taste profiles), each corresponding to one of a plurality of activities engaged in or performed by the user. For example, when a user searches for merchandise, the user profile may describe the user, the user's behavior, and the information of the purchased merchandise.

The pseudocode for extracting the historical relationship information is as follows:

Input: a given data set (i.e., the set of standard user history data) $X = [x_1, x_2, \ldots, x_N]^T$, where $x_i \in \mathbb{R}^d$ and $d$ is the feature dimension; the set of historical relationship information corresponding to $X$, $Y = [y_1, y_2, \ldots, y_N]^T$, where $y_i \in \{1, 2, \ldots, C\}$; the test set (i.e., the set of current user history data) $T = [t_1, t_2, \ldots, t_M]^T$, where $t_m \in \mathbb{R}^d$ and $d$ is the sample feature dimension; the parameter $k_1$ of the KNN algorithm used to construct the computing network; the truncation distance $d_c$; the width parameter $\sigma$; the parameter $k_2$ of the KNN algorithm used when implementing the physical similarity assumption; the damping coefficient $\lambda$; the maximum number of iterations $H$; and the threshold $\theta$ used in the iteration termination condition.

Output: the set of computed historical relationship information for the test set.

Training phase

Step 1: Construct the computing network $Q$ from the given training set $X$ and the corresponding historical relationship information set $Y$ using the KNN algorithm.

Step 2: Calculate the concentration $\rho_j$ of each node in the network $Q$ using formula (7).

Step 3: Using formula (6) and formula (5) respectively, calculate the weight $\varepsilon_c^i$ of the $i$-th node in sub-network $q_c$ and the weight $\varepsilon_c$ of sub-network $q_c$.

Step 4: Set $h = 0$.

Step 5: Loop until the iteration termination condition is met or $h > H$.

Step 5.1: $h = h + 1$.

Step 5.2: Loop until $j = N$.

Step 5.2.1: Update the node influence factor $In_j^h$ using formula (1).

Calculation phase

Step 6: Set $m = 1$ and loop until $m > M$.

Step 6.1: Using formula (2), apply the similarity assumption through the KNN algorithm to determine the neighbor node set $v_t$ of the test sample $t_m$ in the test set $T$.

Step 6.2: Calculate the class of maximum probability $c^*$ using formulas (5) and (4), determine the historical relationship information type of the test sample $t_m$ according to $c^*$, and classify $t_m$ into class $c^*$.

Step 6.3: $m = m + 1$.

Step 7: Output the set of computed historical relationship information $[y_1, y_2, \ldots, y_M]$ for the test set $T$.

Step 8: Export the user information in the historical relationship information set to generate the set of associated information of users with similar tendencies.

End
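Step 1 of the training phase might be sketched as follows. The text does not specify how the KNN algorithm forms the computing network, so this sketch assumes that each node receives directed edges to its $k_1$ nearest neighbors by Euclidean distance and that the sub-networks group the nodes by label. The function name `build_knn_network` and the sample data are hypothetical.

```python
import math


def build_knn_network(X, Y, k1):
    """Build a directed k-nearest-neighbor graph and group nodes by label.

    Returns (edges, subnetworks), where edges[i] lists the k1 nodes that
    node i points to and subnetworks[c] lists the nodes labelled c.
    """
    edges = {}
    for i, xi in enumerate(X):
        # Order every other node by Euclidean distance to node i.
        by_distance = sorted(
            (j for j in range(len(X)) if j != i),
            key=lambda j: math.dist(xi, X[j]),
        )
        edges[i] = by_distance[:k1]
    subnetworks = {}
    for i, label in enumerate(Y):
        subnetworks.setdefault(label, []).append(i)
    return edges, subnetworks


# Hypothetical training set: four samples in two classes.
X = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
Y = [1, 1, 2, 2]
edges, subnetworks = build_knn_network(X, Y, k1=1)
```

The brute-force neighbor search is quadratic in the number of samples; a spatial index would be the usual substitute for large data sets.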

The computed historical relationship information involves the following quantities.

The influence factor $In_i^h$ of node $i$ during the $h$-th iteration is updated by formula (1).

$v_t = \mathrm{KNN}(t)$    (2)

In formula (1), when $h = 0$, the concentration of the node is set as the influence factor of the node, that is, $In_i^0$ equals the concentration $\rho_i$ of node $i$.

$In_i^h$ represents the influence factor of node $i$ during the $h$-th iteration, and $de_i$ represents the out-degree of node $i$, i.e., the number of directed edges $e_{ij}$.

In formula (2), $v_t$ represents the set of $k$ neighbor nodes of $t$; the nodes in $v_t$ may come from different sub-networks $q_c$, $1 \le c \le C$. To implement the style similarity assumption from a data point of view, $t$ is assigned to the class (i.e., sub-network $q_c$) of maximum probability:

$c^* = \arg\max_c \Psi_c$    (3)

$\Psi_c$ represents the probability, measured from a style perspective, that $t$ belongs to the $c$-th class.

$\Psi_c$ is defined by formula (4) in terms of the weights of the sub-networks and the influence factors of the nodes in the computing network $Q$. The quantities in formula (4) are the $i$-th neighbor node of $t$ in the set $v_t$ and the influence factor of that node.

The weight of sub-network $q_c$ is defined by formula (5): it is the average of the weights of all nodes that the sub-network contains. The weight of the $i$-th node in sub-network $q_c$ is given by formula (6), in which $e_{ij}$ represents the directed edge from node $i$ to node $j$ in sub-network $q_c$ and $N_i$ represents the number of directed edges $e_{ij}$.

Because the local concentrations (densities) of the nodes are not equal, influence factors computed by iterating from a uniform value of $1/N$ do not match the actual distribution of the nodes. The concentration of node $j$ (corresponding to the $j$-th training sample in the training set) is therefore defined by formula (7), in which $N$ represents the total number of nodes in the computing network $Q$, $d_{jl}$ represents the Euclidean distance between node $j$ and node $l$, $d_c$ represents the truncation distance, and $\chi(\cdot)$ is an indicator function of the distance that equals 1 when its condition is satisfied and 0 otherwise.
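The concentration of a node might be computed as sketched below. The condition under which $\chi$ equals 1 is not reproduced in the text, so this sketch assumes the common cutoff form in which a neighbor is counted when its distance is strictly less than the truncation distance, i.e. $\rho_j = \sum_{l \ne j} \chi(d_{jl} - d_c)$ with $\chi(x) = 1$ for $x < 0$ and $0$ otherwise. The function name `local_concentration` and the sample data are hypothetical.

```python
import math


def local_concentration(points, dc):
    """Count, for each node j, the other nodes closer than dc (assumed form)."""
    rho = []
    for j, pj in enumerate(points):
        count = sum(
            1
            for l, pl in enumerate(points)
            if l != j and math.dist(pj, pl) < dc  # assumed condition for chi = 1
        )
        rho.append(count)
    return rho


# Hypothetical nodes: three on a line and one far away.
nodes = [(0.0, 0.0), (0.0, 1.0), (0.0, 2.0), (10.0, 10.0)]
rho = local_concentration(nodes, dc=1.5)
```

Under this assumption an isolated node has concentration 0, which is consistent with the stated motivation that a uniform starting value of $1/N$ ignores how the nodes are actually distributed.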

The invention also discloses a computer program of the product recommendation sorting algorithm based on the big data and a storage medium for storing the computer program.

While the invention has been described in connection with a preferred embodiment, it is not intended to limit the invention, and it will be understood by those skilled in the art that various changes may be made and equivalents may be substituted for elements thereof without departing from the spirit and scope of the invention.

Claims

1. A product recommendation sorting algorithm based on big data, characterized in that: a pre-installed big data analysis algorithm is used to generate recommendation candidate information comprising products expected by users; historical relationship information among the users is collected, and a set of historical relationship information with similar tendencies is generated based on the calculated historical relationship information; candidate products are recommended according to the ordering of the recommendation algorithm that generates the recommendation candidate information and the set of historical relationship information with tendencies similar to the user's; a preset number of pieces of recommendation candidate product information are extracted as recommended product information; and the similarities between the first m recommended candidate products and the first n users are extracted, the similarity of each user to the m recommended candidate products is accumulated to obtain the recommendation scores of the m recommended candidate products, and the product information is recommended in order of score.

2. The big-data based product recommendation ranking algorithm of claim 1, wherein: the historical relationship information comprises browsing, purchasing, searching, collecting (adding to favorites), and adding to an order; and the recommendation candidate information comprises product name, product introduction, and purchase information.

3. The big-data based product recommendation ranking algorithm of claim 1, wherein: the historical relationship information is stored in a user profile, and context data of the user, which is indicative of user activity, is accessible; user profiles corresponding to the activities of the users are created through induction, comparison, clustering, and the like, and the habits and intentions that the users have in common are evaluated.

4. The big-data based product recommendation ranking algorithm of claim 3 wherein: the user similarity is the similarity between the contextual data in the user profile of the first user and the contextual data in the user profile of the second user.

5. The big-data based product recommendation ranking algorithm of claim 3 wherein: the user activities comprise browsing, purchasing, searching, collecting (adding to favorites), and adding to an order, together with the corresponding product names, product introductions, and purchase information.

6. A computer program implementing a big-data based product recommendation ranking algorithm according to claim 1.

7. A storage medium storing the computer program of claim 6.