CN108921670A

CN108921670A - Drug transaction recommendation method fusing user potential interest, spatio-temporal data and category popularity

Info

Publication number: CN108921670A
Application number: CN201810724191.4A
Authority: CN
Inventors: 冯永; 王亚男; 王亚清; 魏然; 尚家兴
Original assignee: Chongqing Medical Data Mdt Infotech Ltd; Chongqing University
Current assignee: Chongqing Medical Data Mdt Infotech Ltd; Chongqing University
Priority date: 2018-07-04
Filing date: 2018-07-04
Publication date: 2018-11-30
Anticipated expiration: 2038-07-04
Also published as: CN108921670B

Abstract

The invention discloses a drug transaction recommendation method fusing user potential interest, spatio-temporal data and category popularity. The method comprises: obtaining purchase record data of drugs purchased by users from the data set of an e-commerce platform, and sorting the purchase record data to obtain a user-drug scoring matrix; establishing a user potential interest model based on the purchase records of similar users in the purchase record data, and obtaining user potential interest data based on that model; merging the user potential interest data into the user-drug scoring matrix; establishing a category correlation model based on the popularity of the categories of the drugs purchased by users in the purchase record data and the users' preference for those categories; and performing matrix decomposition on the user-drug scoring matrix that incorporates the user potential interest data, then linearly fusing the user preference prediction matrix obtained by the decomposition with the category correlation model to generate a recommendation list. The invention addresses the prior-art problem that the sparsity of the scoring matrix degrades recommendation efficiency.

Description

Drug transaction recommendation method fusing potential interest, spatio-temporal data and category popularity of user

Technical Field

The invention relates to the technical field of computers, in particular to a medicine transaction recommendation method fusing potential interest, spatio-temporal data and category popularity of a user.

Background

In recent years, electronic commerce has been actively carried out with the development of internet and information technology, and more consumers have started shopping online. Electronic commerce not only opens up a new commercial profit channel, but also subverts the traditional sales mode, and endows more convenience and autonomy to both sides of the transaction from space and time. In particular, medicine, as a necessity of daily life, has gradually come into the field of electronic commerce in recent years, and more pharmaceutical enterprises have acquired the qualification of establishing a pharmaceutical electronic commerce platform, and the development prospect of electronic commerce in the pharmaceutical industry is clear.

Because the medicine e-commerce platform contains multiple types and large quantities of medicines, a user needs to spend a large amount of time and energy to screen out the required medicines, and the user experience of the platform is greatly reduced. In order to solve the problem that the user experience is poor due to the fact that the user consumes too much time in massive medicines, it is necessary to introduce a personalized recommendation technology into a medicine e-commerce platform.

In a medical e-commerce platform, owing to the particular nature of medicines, the number of medicines scored by a user is far lower than the number of items (such as music and movies) scored by a user in traditional recommendation. The user-medicine scoring matrix is therefore quite sparse, and recommendation on a medical e-commerce platform faces a more serious cold start problem than traditional recommendation.

In the face of massive and diversified medicines in the field of medical e-commerce, designing a recommendation algorithm that provides accurate recommendations for users remains a difficult problem. Some recommendation algorithms currently exist in the field, but most of them operate on the original user-medicine scoring matrix and are strongly affected by the sparsity of that matrix.

Therefore, how to effectively alleviate the influence of the sparsity of the scoring matrix on the recommendation efficiency is an urgent problem to be solved.

Disclosure of Invention

In view of the above, the invention provides a medicine transaction recommendation method fusing user potential interest, spatio-temporal data and category popularity, which learns the user potential interest through historical purchase data of a user, and then fills the user potential interest into a user-medicine scoring matrix, thereby effectively solving the problem that the scoring matrix sparsity affects the recommendation efficiency in the prior art.

In order to achieve the above object, the present invention provides a drug transaction recommendation method fusing user potential interest, spatiotemporal data and category popularity, the method comprising the steps of:

S1, acquiring purchase record data of the medicines purchased by users from the data set of the e-commerce platform, and sorting the purchase record data to obtain a user-medicine scoring matrix;

S2, establishing a user potential interest model based on the purchase records of similar users in the purchase record data, and obtaining user potential interest data based on the user potential interest model;

S3, merging the user potential interest data into the user-medicine scoring matrix, which relieves the influence of matrix sparsity on the recommendation result and improves recommendation efficiency;

S4, establishing a category correlation model based on the popularity of the category of the medicine purchased by the user in the purchase record data and the preference of the user for the category;

S5, performing matrix decomposition on the user-medicine scoring matrix combined with the user potential interest data, and performing linear fusion on the user preference prediction matrix obtained by decomposition and the category correlation model of step S4 to generate a recommendation list.

Preferably, the step S1 includes the steps of:

S1-1, collating the purchase record data of the users, the purchase record data including the user's score, the time of purchase and the type of medicine, and obtaining a user set $U = \{u_1, u_2, \ldots, u_i, \ldots, u_n\}$ and a drug set $D = \{d_1, d_2, \ldots, d_j, \ldots, d_m\}$, where $u$ represents a user and $i$ the ID of the user, and $d$ represents a drug and $j$ the ID of the drug;

S1-2, counting the number of medicines purchased and scored by each user, and deleting the user if that number is lower than a preset value, so that the remaining users carry sufficient information;

S1-3, counting the number of times each medicine is purchased and scored, and deleting the records of the medicine if that number is lower than a preset value, because medicines with too little data tend to introduce noise;

S1-4, obtaining the original user-medicine scoring matrix based on the sorted purchase record data.
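Steps S1-1 to S1-4 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the record layout `(user_id, drug_id, score)`, the function name `build_rating_matrix` and the two thresholds are assumptions chosen for the example, and a score of 0 is assumed to mean "no score".

```python
import numpy as np

def build_rating_matrix(records, min_user_count=2, min_drug_count=2):
    """records: list of (user_id, drug_id, score) tuples (hypothetical layout).

    Returns (R, users, drugs), where R[i, j] is the score of users[i] for
    drugs[j] and 0 marks a missing score.
    """
    # S1-2: drop users who scored fewer than min_user_count medicines.
    user_counts = {}
    for u, _, _ in records:
        user_counts[u] = user_counts.get(u, 0) + 1
    records = [r for r in records if user_counts[r[0]] >= min_user_count]

    # S1-3: drop medicines purchased fewer than min_drug_count times.
    drug_counts = {}
    for _, d, _ in records:
        drug_counts[d] = drug_counts.get(d, 0) + 1
    records = [r for r in records if drug_counts[r[1]] >= min_drug_count]

    # S1-4: index the remaining users and medicines and fill the matrix.
    users = sorted({r[0] for r in records})
    drugs = sorted({r[1] for r in records})
    u_idx = {u: i for i, u in enumerate(users)}
    d_idx = {d: j for j, d in enumerate(drugs)}
    R = np.zeros((len(users), len(drugs)))
    for u, d, s in records:
        R[u_idx[u], d_idx[d]] = s
    return R, users, drugs

records = [
    ("u1", "d1", 5), ("u1", "d2", 3),
    ("u2", "d1", 4), ("u2", "d2", 2),
    ("u3", "d3", 1),  # u3 has a single record and is filtered out
]
R, users, drugs = build_rating_matrix(records)
```

The filtering is done in a single pass per step; whether the thresholds should be re-applied until the data stabilises is not stated in the source.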

Preferably, the step S2 includes the steps of:

S2-1, obtaining the similar user set $F_i$ incorporating the time factor:

1) Dividing a year into T discrete time periods by a time discretization method, and dividing the original user-medicine scoring matrix of step S1 into T time period-user-medicine scoring matrices according to the time of purchase and scoring;

2) given a target user i, defining the scoring vector of user i in a time period $t$ ($t \in [1, T]$) as $r_{i,t} = \{r_{i,t,1}, r_{i,t,2}, \ldots, r_{i,t,m}\}$, where $r_{i,t,m}$ indicates the value of user i's score for drug m over time period t. For a user i, calculating the cosine similarity of the user's score vectors $r_{i,t_p}$ and $r_{i,t_q}$ in any two time periods $t_p$ and $t_q$, then taking the average of the cosine similarity values of all users for the two time periods as the similarity of the two time periods, thereby obtaining the similarity between any two of the discrete time periods;

3) representing the similarity between any two discrete time periods, averaged over all users, as a time period similarity matrix TS, and translating the time period-user-medicine scoring matrix by using the time period similarity matrix TS, wherein the specific translation formula is as follows:

wherein the left-hand side of the formula is the new time period-user-drug scoring matrix obtained after the translation, which is used in the subsequent calculation; $TS_{t,t^*}$ denotes the similarity between time periods $t$ and $t^*$, $t^* \in [1, T]$; and $r_{i,t,j}$ is user i's score for drug j over time period t;

then, the translated matrix is used to calculate the similarity between users, and the s users with the highest similarity to user i are taken as the similar users $F_i$;
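Step S2-1 can be sketched as follows. The time period similarity follows the description above (per-user cosine similarity, averaged over users). The translation formula itself is not reproduced in this text, so `translate` uses an assumed rule, a similarity-weighted average of the scores over all periods; the function names, the tensor layout `(T, N, M)` and the use of cosine similarity between users on the translated scores are likewise assumptions made for the illustration.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def period_similarity(R):
    """R has shape (T, N, M): period x user x drug. Returns the T x T matrix TS."""
    T, N, _ = R.shape
    TS = np.zeros((T, T))
    for p in range(T):
        for q in range(T):
            # Average the per-user cosine similarity of the two period vectors.
            TS[p, q] = np.mean([cosine(R[p, i], R[q, i]) for i in range(N)])
    return TS

def translate(R, TS):
    """ASSUMED translation rule: similarity-weighted average over periods."""
    row_sums = TS.sum(axis=1, keepdims=True)
    weights = np.divide(TS, row_sums, out=np.zeros_like(TS), where=row_sums > 0)
    return np.einsum("ts,snm->tnm", weights, R)

def similar_users(R_hat, i, s):
    """Indices of the s users most similar to user i on the translated tensor."""
    N = R_hat.shape[1]
    flat = R_hat.transpose(1, 0, 2).reshape(N, -1)  # one row per user
    sims = np.array([-np.inf if k == i else cosine(flat[i], flat[k])
                     for k in range(N)])
    return [int(k) for k in np.argsort(-sims)[:s]]

# Two periods, three users, two drugs (toy data).
R = np.array([
    [[5.0, 0.0], [4.0, 0.0], [0.0, 3.0]],
    [[4.0, 0.0], [5.0, 1.0], [0.0, 2.0]],
])
TS = period_similarity(R)
R_hat = translate(R, TS)
neighbours = similar_users(R_hat, i=0, s=1)
```

In the toy data, users 0 and 1 score the same drug in both periods while user 2 scores a different one, so user 1 is the nearest neighbour of user 0.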

S2-2, obtaining the user potential interest data based on the similar users $F_i$:

for user i, the medicines that the similar users $F_i$ obtained in step S2-1 have purchased but user i has not are taken as candidate potential-interest medicines of user i, and a user potential interest model is established to learn the potential interest of the user, thereby obtaining the user potential interest data.

Preferably, the step S3 includes the steps of:

S3-1, filling the user potential interest data into the original user-medicine scoring matrix of step S1. For each user i, the medicines are divided into three categories: $D_i$ is the set of medicines purchased by the user; $P_i$ is the set of medicines potentially purchased by the user; and $U_i$ is the set of medicines neither purchased nor potentially purchased by the user. The original user-drug scoring matrix is transformed into a new scoring matrix and a new weight matrix:

wherein NewR is the new scoring matrix and $NewR_{i,j}$ represents user i's score for drug j; NewW is the new weight matrix and $NewW_{i,j}$ represents user i's preference for drug j; when the medicine is a potentially purchased medicine of the user, the user's score for it is a value between 0 and 1; $\mu$ is the tuning parameter, here taken to be 0.3; and $*$ denotes multiplication.
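The filling step can be sketched as follows. The piecewise definitions of NewR and NewW are not reproduced in this text, so the rule used here is an assumption: purchased medicines keep their score with weight 1, potentially purchased medicines take the potential-interest score with weight `mu` times that score, and all other entries are 0. The names `fill_matrix` and `PR` are hypothetical.

```python
import numpy as np

def fill_matrix(R, PR, mu=0.3):
    """R: original N x M scores (0 = not purchased).
    PR: N x M potential-interest scores in [0, 1] (0 = no potential interest).
    Returns (new_r, new_w)."""
    purchased = R > 0                    # D_i: medicines the user bought
    potential = (~purchased) & (PR > 0)  # P_i: potentially purchased medicines
    # Everything else falls in U_i and stays at 0.
    new_r = np.where(purchased, R, np.where(potential, PR, 0.0))
    # ASSUMED weights: 1 for purchased, mu * score for potential, 0 otherwise.
    new_w = np.where(purchased, 1.0, np.where(potential, mu * PR, 0.0))
    return new_r, new_w

R = np.array([[5.0, 0.0, 0.0],
              [0.0, 3.0, 0.0]])
PR = np.array([[0.0, 0.8, 0.0],
               [0.5, 0.0, 0.0]])
new_r, new_w = fill_matrix(R, PR)
```

The lower weight on filled entries lets the later decomposition trust observed scores more than inferred ones, which is the usual motivation for a separate weight matrix.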

Preferably, the step S4 includes the steps of:

S4-1, establishing a scoring matrix $B_{N,|C|}$ of users for medicine categories from the users' scoring matrix for medicines and the types of the medicines, where N is the number of users and $|C|$ is the number of medicine types; each element of the scoring matrix represents the user's score for the category to which a purchased medicine belongs;

S4-2, constructing a medicine popularity matrix $P_{|C|,M}$, where $|C|$ is the number of medicine types and M is the number of medicines; each element of the popularity matrix represents the popularity of a medicine within the category to which it belongs, and the number of times the medicine is purchased is used to represent its popularity in that category;

S4-3, obtaining the category correlation model of the medicines purchased by the user as follows:

wherein $y_{i,j}$ represents user i's score for drug j under the category model; $B_{i,c} \in B_{N,|C|}$, $P_{c,j} \in P_{|C|,M}$.
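Steps S4-1 to S4-3 can be sketched as follows. The formula for $y_{i,j}$ is not reproduced in this text; the sketch assumes $y_{i,j} = \sum_c B_{i,c} P_{c,j}$, which reduces to a single product because each medicine belongs to one category. Using the mean score per category for B and normalising purchase counts within each category for P are also assumptions, as are the function and argument names.

```python
import numpy as np

def category_model(R, drug_category, n_categories):
    """R: N x M scores (0 = not purchased);
    drug_category[j] is the category index of drug j.
    Returns (Y, B, P) with Y = B @ P."""
    N, M = R.shape
    cat = np.asarray(drug_category)
    bought = R > 0

    # B[i, c]: user i's mean score over purchased drugs in category c (ASSUMED).
    B = np.zeros((N, n_categories))
    for c in range(n_categories):
        counts = bought[:, cat == c].sum(axis=1)
        sums = R[:, cat == c].sum(axis=1)
        B[:, c] = np.divide(sums, counts, out=np.zeros(N), where=counts > 0)

    # P[c, j]: purchase count of drug j, normalised within its category (ASSUMED).
    P = np.zeros((n_categories, M))
    purchases = bought.sum(axis=0)
    for c in range(n_categories):
        total = purchases[cat == c].sum()
        if total > 0:
            P[c, cat == c] = purchases[cat == c] / total

    return B @ P, B, P

R = np.array([[5.0, 0.0, 2.0],
              [4.0, 3.0, 0.0]])
Y, B, P = category_model(R, drug_category=[0, 0, 1], n_categories=2)
```

Here drugs 0 and 1 share category 0 and drug 2 is alone in category 1, so user 1, who never bought from category 1, gets a category score of 0 for drug 2.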

Preferably, the step S5 includes the following steps:

S5-1, decomposing the obtained new scoring matrix and weight matrix by using a matrix decomposition algorithm, wherein the error function in the decomposition process is as follows:

where i denotes a user, j denotes a medicine, N denotes the number of users and M denotes the number of medicines; the product of the hidden factor vector of user i and the hidden factor vector of drug j represents the score of user i for drug j; $\gamma$ represents the weight of the user and drug terms; $U$ represents the user hidden factor matrix and $D$ the drug hidden factor matrix; $\|U\|_F^2$ is the square of the Frobenius norm of the user hidden factor matrix, and $\|D\|_F^2$ is the square of the Frobenius norm of the drug hidden factor matrix;
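The decomposition can be sketched as a weighted matrix factorisation. The error function is not reproduced in this text; the sketch assumes the common form $\sum_{i,j} W_{i,j}(R_{i,j} - U_i D_j^T)^2 + \gamma(\|U\|_F^2 + \|D\|_F^2)$, which matches the symbols listed above, and minimises it by plain gradient descent. The optimiser, the rank `k`, the learning rate and the function names are all assumptions.

```python
import numpy as np

def loss(R, W, U, D, gamma):
    """Assumed error function: weighted squared error plus Frobenius penalties."""
    return float(np.sum(W * (R - U @ D.T) ** 2)
                 + gamma * (np.sum(U ** 2) + np.sum(D ** 2)))

def weighted_mf(R, W, k=2, gamma=0.01, lr=0.01, epochs=2000, seed=0):
    """Returns hidden factor matrices U (N x k) and D (M x k)."""
    rng = np.random.default_rng(seed)
    N, M = R.shape
    U = 0.1 * rng.standard_normal((N, k))
    D = 0.1 * rng.standard_normal((M, k))
    for _ in range(epochs):
        E = W * (R - U @ D.T)                   # weighted residuals
        grad_U = -2 * E @ D + 2 * gamma * U     # gradient of the loss w.r.t. U
        grad_D = -2 * E.T @ U + 2 * gamma * D   # gradient of the loss w.r.t. D
        U -= lr * grad_U
        D -= lr * grad_D
    return U, D

R = np.array([[5.0, 3.0, 0.0],
              [4.0, 0.0, 1.0],
              [0.0, 2.0, 5.0]])
W = (R > 0).astype(float)   # weight 1 on observed scores, 0 elsewhere
U, D = weighted_mf(R, W)
pred = U @ D.T              # user preference prediction matrix
```

Entries with weight 0 do not contribute to the error, so the factorisation is free to predict non-zero scores for them; those predictions are what the recommendation step ranks.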

S5-2, decomposing the new scoring matrix and the new weight matrix to obtain a user hidden feature matrix and a medicine hidden feature matrix, multiplying the two matrices obtained by the decomposition to obtain a user preference prediction matrix, and then combining the user preference prediction matrix with the category correlation model to obtain the final recommendation model as follows:

wherein the left-hand side of the formula is user i's score for drug j; the product of the updated user hidden factor vector and drug hidden factor vector represents the predicted score of user i for drug j; $y_{i,j}$ represents user i's score for drug j under the category model; $\propto$ means proportional to; and $*$ denotes multiplication.

S5-3, sorting the medicines by the magnitude of their score values, then selecting the top-ranked medicines in descending order of score to generate a recommendation list, and recommending the list to the user.
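Steps S5-2 and S5-3 can be sketched as follows. The final model is not reproduced in this text; because the symbol list mentions proportionality and multiplication, the sketch assumes the final score is the product of the matrix-factorisation prediction and the category-model score, although the text elsewhere calls the combination a linear fusion. Excluding already purchased medicines and the cut-off `k` are further assumptions, and `recommend` is a hypothetical name.

```python
import numpy as np

def recommend(pred, Y, purchased, k=2):
    """pred: N x M predictions from the decomposition;
    Y: N x M category-model scores;
    purchased: N x M boolean mask of medicines already bought.
    Returns one list of up to k drug indices per user, best first."""
    score = pred * Y                             # ASSUMED fusion by multiplication
    score = np.where(purchased, -np.inf, score)  # ASSUMPTION: skip bought drugs
    order = np.argsort(-score, axis=1, kind="stable")
    return [[int(j) for j in row[:k] if np.isfinite(score[i, j])]
            for i, row in enumerate(order)]

pred = np.array([[4.0, 3.0, 2.0, 1.0]])
Y = np.array([[0.1, 0.5, 0.9, 0.2]])
purchased = np.array([[True, False, False, False]])
top = recommend(pred, Y, purchased, k=2)
```

In the example the fused scores of the three unpurchased drugs are 1.5, 1.8 and 0.2, so drug 2 is ranked ahead of drug 1 even though its raw prediction is lower.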

In summary, the invention discloses a medicine transaction recommendation method fusing the potential interest, the time-space data and the category popularity of a user, which comprises the steps of firstly obtaining the purchase record data of medicines purchased by the user from the data set of an e-commerce platform, and sorting the purchase record data to obtain a user-medicine scoring matrix; then, establishing a user potential interest model based on the purchase records of similar users in the purchase record data, and obtaining user potential interest data based on the user potential interest model; then merging the potential interest data of the users into a user-medicine scoring matrix; establishing a category correlation model based on the popularity of the category of the medicine purchased by the user in the purchase record data and the preference of the user to the category; and finally, performing matrix decomposition on the user-medicine scoring matrix combined with the user potential interest data, and performing linear fusion on the user preference prediction matrix obtained by decomposition and the category correlation model to generate a recommendation list. According to the method and the device, the potential interest of the user is learned through the historical purchase data of the user, and then the potential interest of the user is filled into the user-medicine scoring matrix, so that the problem that the sparsity of the scoring matrix influences the recommendation efficiency in the prior art is effectively solved.

Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.

Drawings

The above and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:

FIG. 1 is a basic flow chart of a method for recommending a drug transaction according to the present invention, which combines the user's potential interest, spatiotemporal data and category popularity;

FIG. 2 is a schematic diagram of a learning algorithm of potential interest of a user according to the present disclosure;

FIG. 3 is a schematic diagram of a process for creating a category-dependent model according to the present disclosure.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

In the description of the present invention, it is to be understood that the terms "longitudinal", "lateral", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", and the like, indicate orientations or positional relationships based on those shown in the drawings, and are used merely for convenience of description and for simplicity of description, and do not indicate or imply that the referenced devices or elements must have a particular orientation, be constructed in a particular orientation, and be operated, and thus, are not to be construed as limiting the present invention.

In the description of the present invention, unless otherwise specified and limited, it is to be noted that the terms "mounted," "connected," and "connected" are to be interpreted broadly, and may be, for example, a mechanical connection or an electrical connection, a communication between two elements, a direct connection, or an indirect connection via an intermediate medium, and specific meanings of the terms may be understood by those skilled in the art according to specific situations.

The invention provides a medicine transaction recommendation method fusing potential interest, spatio-temporal data and category popularity of a user. As shown in figures 1-3, the method comprises steps S1 to S5, preferably with the sub-steps S1-1 to S5-3, as set out in the Disclosure of Invention above. In step S5-3, the medicines are sorted by the magnitude of their score values, and the top k medicines in descending order of score are selected to generate the recommendation list.

Specifically, in the above embodiment, the user potential interest model established in step S2-2 to learn the user's potential interest can be learned by either of the following two selection algorithms:

The first selection algorithm is a maximum value selection strategy: among the similar users of the target user i who purchased medicine j, the one with the highest similarity to the target user is used to represent the target user's preference. The linear model is expressed as follows:

wherein $pr_{i,j}$ indicates user i's score for drug j, the similarity term is the similarity between the preferences of user i and a relevant user for medicine j, and $f \in F_i$ is a relevant user of user i.
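The maximum value selection strategy can be sketched as follows. The model formula is not reproduced in this text, so the sketch assumes that $pr_{i,j}$ is the highest similarity among the similar users who purchased medicine j, restricted to medicines the target user has not bought. The dictionary layout and the name `max_selection` are hypothetical.

```python
def max_selection(similar, purchases, target_purchases):
    """similar: {user: similarity to the target user};
    purchases: {user: set of drugs that user bought};
    target_purchases: set of drugs the target user already bought.
    Returns {drug: potential-interest score}."""
    pr = {}
    for f, sim in similar.items():
        # Candidate drugs: bought by the similar user, not by the target.
        for drug in purchases.get(f, set()) - target_purchases:
            pr[drug] = max(pr.get(drug, 0.0), sim)
    return pr

similar = {"u2": 0.9, "u3": 0.6}
purchases = {"u2": {"d1", "d2"}, "u3": {"d2", "d3"}}
pr = max_selection(similar, purchases, target_purchases={"d1"})
```

Drug d2 was bought by both similar users and receives the larger similarity, 0.9; d1 is skipped because the target user already bought it.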

The second selection algorithm is a meta-path selection strategy. In the heterogeneous network $G = \langle V, E, A \rangle$, V is the set of nodes, E is the set of edges, and A is the set of node categories. A meta path is defined as a path of the form $A_1 \xrightarrow{R_1} A_2 \xrightarrow{R_2} \cdots \xrightarrow{R_n} A_{n+1}$, where $A_i \in A$ and $R_i$ represents the relationship existing between nodes, $R_i \in \{U\text{-}U, U\text{-}D, D\text{-}D\}$. For a meta path $P$, if a path $p = \{v_1, v_2, \ldots, v_{n+1}\}$ is an instance of the meta path, then all such paths form the set of instance paths $P'$ of the meta path $P$. For each instance path, a feature value is defined to describe the degree of association between nodes $v_1$ and $v_{n+1}$, denoted $cor(p)$; the feature value of the meta path is then the sum of the feature values of all its instance paths, denoted as:

$$Eig(P) = \sum_{p \in P'} cor(p)$$

For an instance path $p = \{a_1, a_2, \ldots, a_{n+1}\}$, $a_1 \in U$ is a user node, $a_{n+1} \in D$ is a drug node, and the other $a_i$ are intermediate nodes of the instance path. The degree of association $cor(p)$ between the start and end nodes of the path follows the idea of a random walk: assume an object starts from node $a_1$ and walks randomly in the network, and define $cor(p)$ as the probability that the object walks to node $a_{n+1}$ along the instance path p. Because the steps of the random walk are assumed to be independent of each other, the probability of the object walking along p is equal to the product of the probabilities of each step, and the calculation formula is as follows:

$$cor(p) = \prod_{i=1}^{n} Pro(a_i, a_{i+1})$$

wherein $Pro(a_i, a_{i+1})$ represents the probability of going directly from node $a_i$ to node $a_{i+1}$ in the random walk. In a heterogeneous network, its formula is defined as:

wherein $N(a_i)$ denotes the set of nodes associated with $a_i$ whose type is consistent with that of $a_{i+1}$.

Finally, the user's potential interest is expressed as:

$$pr_{i,j} = Eig(P_{i,j})$$

which yields the potential interest points of the target user.
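The meta-path strategy can be sketched as follows. The formula for the one-step probability is not reproduced in this text, so the sketch assumes a uniform choice among the neighbours of the current node whose type matches the next type in the meta path, that is, $Pro(a_i, a_{i+1}) = 1/|N(a_i)|$. The graph layout, the type labels and the function name `eig` are hypothetical.

```python
def eig(graph, node_type, start, end, meta_path):
    """graph: {node: set of neighbouring nodes};
    node_type: {node: 'U' for users or 'D' for drugs};
    meta_path: sequence of node types, e.g. ('U', 'U', 'D').
    Returns the sum of cor(p) over all instance paths p from start to end."""
    if node_type[start] != meta_path[0]:
        return 0.0

    def walk(node, step, prob):
        if step == len(meta_path) - 1:
            return prob if node == end else 0.0
        # N(a_i): neighbours whose type matches the next type in the meta path.
        candidates = [n for n in graph[node]
                      if node_type[n] == meta_path[step + 1]]
        if not candidates:
            return 0.0
        # ASSUMED uniform step probability 1 / |N(a_i)|; steps are independent,
        # so the path probability is the running product.
        return sum(walk(n, step + 1, prob / len(candidates))
                   for n in candidates)

    return walk(start, 0, 1.0)

graph = {
    "u1": {"u2", "u3", "d2"},
    "u2": {"u1", "d1", "d2"},
    "u3": {"u1", "d1"},
    "d1": {"u2", "u3"},
    "d2": {"u1", "u2"},
}
node_type = {"u1": "U", "u2": "U", "u3": "U", "d1": "D", "d2": "D"}
score = eig(graph, node_type, start="u1", end="d1", meta_path=("U", "U", "D"))
```

Two instance paths reach d1 from u1: via u2 with probability 0.5 × 0.5 and via u3 with probability 0.5 × 1, so the feature value is 0.75.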

Specifically, in the above embodiment, the matrix decomposition algorithm in step S5-1 adopts the following pseudo code of the hidden matrix learning algorithm:

It should be noted that the system structures or method flows shown in fig. 1 to fig. 3 of the present invention are only some preferred embodiments of the present invention, and the illustration is only for the convenience of understanding the present invention and is not to be construed as a limitation of the present invention.

In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.

The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

While embodiments of the invention have been shown and described, it will be understood by those of ordinary skill in the art that: various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.

Claims

1. A drug transaction recommendation method fusing user potential interest, spatiotemporal data and category popularity is characterized by comprising the following steps:

S3, merging the user potential interest data into the user-medicine scoring matrix;

2. The drug transaction recommendation method fusing user potential interest, spatiotemporal data and category popularity according to claim 1, wherein the step S1 comprises the steps of:

S1-2, counting the number of medicines purchased and scored by each user, and deleting the user if that number is lower than a preset value;

S1-3, counting the number of times each medicine is purchased and scored, and deleting the records of the medicine if that number is lower than a preset value;

S1-4, obtaining a user-drug scoring matrix based on the sorted purchase record data.

3. The drug transaction recommendation method fusing user potential interest, spatiotemporal data and category popularity according to claim 1, wherein the step S2 comprises the steps of:

S2-1, obtaining the similar user set $F_i$ incorporating the time factor:

2) given a target user i, defining the scoring vector of user i in a time period $t$ ($t \in [1, T]$) as $r_{i,t} = \{r_{i,t,1}, r_{i,t,2}, \ldots, r_{i,t,m}\}$, where $r_{i,t,m}$ indicates the value of user i's score for drug m over time period t; for a user i, calculating the cosine similarity of the user's score vectors $r_{i,t_p}$ and $r_{i,t_q}$ in any two time periods $t_p$ and $t_q$, then taking the average of the cosine similarity values of all users for the two time periods as the similarity of the two time periods, thereby obtaining the similarity between any two of the discrete time periods;

S2-2, obtaining the user potential interest data based on the similar users $F_i$:

4. The drug transaction recommendation method fusing user potential interest, spatiotemporal data and category popularity according to claim 1, said step S3 comprising the steps of:

S3-1, filling the user potential interest data into the user-medicine scoring matrix of step S1. For each user i, the medicines are divided into three categories: $D_i$ is the set of medicines purchased by the user; $P_i$ is the set of medicines potentially purchased by the user; and $U_i$ is the set of medicines neither purchased nor potentially purchased by the user. The original user-drug scoring matrix is transformed into a new scoring matrix and a new weight matrix:

wherein NewR is the new scoring matrix and $NewR_{i,j}$ represents user i's score for drug j; NewW is the new weight matrix and $NewW_{i,j}$ represents user i's preference for drug j; when the medicine is a potentially purchased medicine of the user, the user's score for the medicine is a value between 0 and 1; and $\mu$ is the tuning parameter.

5. The drug transaction recommendation method fusing user potential interest, spatiotemporal data and category popularity according to claim 1, wherein the step S4 comprises the steps of:

6. The drug transaction recommendation method fusing user potential interest, spatiotemporal data and category popularity according to claim 1, said step S5 comprising the steps of:

wherein the left-hand side of the formula is user i's score for drug j; the product of the updated user hidden factor vector and drug hidden factor vector represents the predicted score of user i for drug j; $y_{i,j}$ represents user i's score for drug j under the category model; $\propto$ means proportional to; and $*$ denotes multiplication;