CN108776919A

CN108776919A - Item recommendation method that builds an information core based on clustering and an evolutionary algorithm

Info

Publication number: CN108776919A
Application number: CN201810550780.5A
Authority: CN
Inventors: 慕彩红; 刘逸; 朱贤武; 刘若辰; 张丹; 侯彪; 熊涛; 焦李成
Original assignee: Xidian University
Current assignee: Xidian University
Priority date: 2018-05-31
Filing date: 2018-05-31
Publication date: 2018-11-09
Anticipated expiration: 2038-05-31
Also published as: CN108776919B

Abstract

The present invention discloses an item recommendation method that builds an information core based on clustering and an evolutionary algorithm. The steps are: (1) build the user-item rating matrix; (2) reduce the dimensionality of the user-item rating matrix; (3) use a clustering algorithm to build the virtual user-item rating matrix; (4) build and update the user-item training matrix and the user-item optimization matrix; (5) initialize the parent population; (6) generate the transition population by crossover and mutation; (7) calculate the recommendation precision of each information core individual; (8) generate the offspring population; (9) update the parent population; (10) judge whether the number of iterations has reached 100; (11) complete the construction of the information core; (12) use the information core to recommend items to users. The invention builds the information core quickly and recommends items to users more accurately.

Description

Item recommendation method that builds an information core based on clustering and an evolutionary algorithm

Technical field

The invention belongs to the field of computer technology and further relates to an item recommendation method, in the field of item recommendation technology, that builds an information core based on clustering and an evolutionary algorithm. Based on users' rating information on items, the invention can recommend the items a user needs by means of the constructed information core.

Background art

A recommender system is an information filtering system that analyzes users' historical behavior data to discover their preferences and recommends items or information they are interested in. Many recommendation methods exist today. Collaborative filtering is currently the most widely used recommendation algorithm, but as the amount of data grows, both the running time of the algorithm and the time needed to produce recommendations for a user increase. This scalability problem greatly restricts the development of collaborative filtering.

In their paper "Information core optimization using Evolutionary Algorithm with Elite Population in recommender systems" (Congress on Evolutionary Computation (CEC), 2017 IEEE), Caihong Mu et al. proposed a recommendation method that extracts the information core with an evolutionary algorithm based on an elite population. The steps of the method are: Step 1, establish the sparse rating matrix of users and items; Step 2, apply the evolutionary algorithm: initialize the parent population and calculate the fitness of each individual in the population; Step 3, following the M-elite strategy, sort the individuals by fitness from largest to smallest, perform ordered crossover according to individual fitness, and extract the information core; Step 4, based on the information core, predict the target user's ratings for all unrated items and make recommendations. The shortcoming of this method is that, when the information core is extracted with the evolutionary algorithm, computing the fitness of every information core individual in the population takes too long, so offline selection of the information core is too slow.

In its patent application "A personalized recommendation method and system based on key users" (application number: 201510157504.9, application publication number: CN 104778237A), the University of Electronic Science and Technology of China discloses a personalized recommendation method based on key users. The steps of the method are: Step 1, obtain users' ratings of items from the user-item rating data set, and use the users' ratings of different items to calculate the similarity between different users; Step 2, determine for the target user a neighbor list of length N, in which the neighbors are arranged in descending order of similarity to the target user; Step 3, set a weighting rule: the more often a user appears in other users' neighbor lists, and the nearer the front the user is placed, the greater the user's weight; Step 4, take the P users with the largest weights as key users; Step 5, based on the key users' rating information on items, predict ratings for all items the target user has not rated, and make recommendations to the target user according to the predicted ratings. The shortcoming of this method is that the selection criterion for the information core is set from the similarity between users, and selecting the information core in this way yields an information core with low recommendation precision.

Summary of the invention

The object of the present invention is to address the above deficiencies of the prior art by proposing an item recommendation method that builds an information core based on clustering and an evolutionary algorithm.

The specific idea for achieving this object is to reduce the dimensionality of the user-item rating matrix to obtain a low-dimensional matrix, cluster the users in the low-dimensional matrix multiple times with a clustering algorithm to build a virtual user-item rating matrix, extract the information core from the virtual user-item rating matrix with an evolutionary algorithm, and use the information core to recommend items to users.

The specific implementation steps of the present invention are as follows:

(1) Build the user-item rating matrix:

(1a) Extract all rating information of users on items from the user-item rating data set and create the user-item rating matrix, whose number of rows equals the number of users in the rating data set and whose number of columns equals the number of items in the rating data set;

(1b) In the rating matrix, use 0 as the rating value of each item a user has not rated and the actual rating value for each item the user has rated;

(2) Reduce the dimensionality of the user-item rating matrix:

Use the t-distributed stochastic neighbor embedding (t-SNE) algorithm to reduce the dimensionality of the user-item rating matrix and obtain a low-dimensional matrix;

(3) Use a clustering algorithm to cluster the users in the low-dimensional matrix multiple times:

(3a) Use the clustering algorithm to cluster the users in the low-dimensional matrix into Ψ classes, obtaining the class of each user among the Ψ classes;

(3b) Use the class of each user among the Ψ classes to obtain the class of the corresponding user in the user-item rating matrix;

(3c) After clustering the users in the low-dimensional matrix 5 times, obtain all the user classes corresponding to the users in the user-item rating matrix;

(4) Build the virtual user-item rating matrix:

(4a) Arbitrarily select one user class from the user-item rating matrix, take the mean of the ratings of the users in the selected class as the cluster center of that class, and save the cluster center as a vector;

(4b) Judge whether all user classes in the user-item rating matrix have been selected; if so, execute step (4d); otherwise, execute step (4a);

(4d) Form the virtual user-item rating matrix from the vectors corresponding to the cluster centers of all user classes;

(5) Build the optimization matrix:

(5a) Use the clustering algorithm to cluster the users in the low-dimensional matrix into K classes, obtaining the class of each user among the K classes;

(5b) Use the class of each user among the K classes to obtain the class of the corresponding user in the user-item rating matrix;

(5c) Arbitrarily select one user class from the user-item rating matrix, take the mean of the ratings of the users in the selected class as the cluster center of that class, and save the cluster center as a vector;

(5d) Judge whether all user classes in the user-item rating matrix have been selected; if so, execute step (5e); otherwise, execute step (5c);

(5e) Form the optimization matrix from the vectors corresponding to the cluster centers of all user classes;

(5f) Arbitrarily select one user from the optimization matrix as the target user;

(5g) Arbitrarily select one item from the optimization matrix as the target item;

(5h) Update the rating value of the target user for the target item according to the following formula:

Where p_bj denotes the rating value of target user b for target item j, |·| denotes the absolute-value operation (for a set, its number of elements), G_c denotes the set of users in the user-item rating matrix that belong to user class c and have rated target item j, c denotes the user class of the cluster center corresponding to target user b, and T_c denotes the set of users in the user-item rating matrix that belong to user class c;

(5i) Judge whether all users in the optimization matrix have been selected; if so, execute step (5j); otherwise, execute step (5f);

(5j) Judge whether all items in the optimization matrix have been selected; if so, execute step (6); otherwise, execute step (5g);

(6) Build the user-item training matrix and the user-item optimization matrix:

(6a) Build the user-item training matrix, whose number of rows equals the number of users in the optimization matrix and whose number of columns equals the number of items in the optimization matrix;

(6b) Build the user-item optimization matrix, whose number of rows equals the number of users in the optimization matrix and whose number of columns equals the number of items in the optimization matrix;

(7) Update the user-item training matrix and the user-item optimization matrix:

(7a) Extract 80% of the rating data in the optimization matrix as training data and use the remaining 20% of the rating data as optimization data;

(7b) Extract the rating information from the training data; in the user-item training matrix, set the rating value of each item a user has not rated to 0 and the rating value of each item the user has rated to the actual rating value;

(7c) Extract the rating information from the optimization data; in the user-item optimization matrix, set the rating value of each item a user has not rated to 0 and the rating value of each item the user has rated to the actual rating value;

(8) Select the information core with an evolutionary algorithm:

(8a) Extract the user IDs from the virtual user-item rating matrix to form the user ID set;

(8b) Take 60% of the size of the user ID set as the information core length;

(8c) From the user ID set, arbitrarily select 100 user ID subsets whose length equals the information core length to form the parent population;

(8d) Apply crossover and mutation operations to each information core individual in the parent population to generate the transition population;

(8e) Use the recommendation precision method to obtain the recommendation precision of each information core individual in the parent population and the transition population;

(8f) Sort the information core individuals in the parent population and the transition population by recommendation precision from largest to smallest, and select the first 100 information core individuals in the ordering to form the offspring population;

(8g) Replace each information core individual in the parent population with the corresponding information core individual in the offspring population to generate the new parent population;

(8h) Judge whether the current iteration count has reached the maximum of 100 iterations; if so, execute step (9); otherwise, execute step (8d);

(9) Complete the construction of the information core:

Select the information core individual with the highest recommendation precision from the parent population as the best information core;

(10) Use the information core to recommend items to users:

Using the best information core found, recommend to each user in the user-item rating matrix the items that user needs.

Compared with the prior art, the present invention has the following advantages:

First, because the present invention builds the user-item training matrix and the user-item optimization matrix and then selects the information core with an evolutionary algorithm, it overcomes the drawback of the prior art that, when the information core is extracted with an evolutionary algorithm, computing the fitness of every information core individual in the population takes too long and makes offline selection of the information core too slow. The invention thus shortens the time needed to select the information core offline and alleviates the long offline optimization time incurred when an evolutionary algorithm is used to select the information core.

Second, because the present invention selects the information core from the virtual user-item rating matrix with an evolutionary algorithm and takes the information core individual with the highest recommendation precision in the parent population as the best information core, it overcomes the drawback of the prior art that setting the selection criterion for the information core from the similarity between users yields an information core with low recommendation precision. The information core selected by the invention therefore achieves higher recommendation precision.

Brief description of the drawings

Fig. 1 is a flow chart of the present invention;

Fig. 2 shows the simulation results of the present invention.

Detailed description of the embodiments

The present invention is described in further detail below with reference to the drawings.

With reference to Fig. 1, the specific implementation steps of the present invention are described further.

Step 1, build the user-item rating matrix.

Extract all rating information of users on items from the user-item rating data set and create the user-item rating matrix, whose number of rows equals the number of users in the rating data set and whose number of columns equals the number of items in the rating data set.

In the embodiment of the present invention, the user-item rating data sets are the MovieLens-100K rating data set and the MovieLens-1M rating data set.

The rating information includes the user ID, the item ID, and the user's rating value for the item.

In the rating matrix, 0 is used as the rating value of each item a user has not rated, and the actual rating value is used for each item the user has rated.
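Step 1 can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: the function name `build_rating_matrix` and the toy triples are hypothetical, and rows and columns are simply ordered by sorted ID.

```python
def build_rating_matrix(ratings):
    """Build a dense user-item matrix from (user_id, item_id, score) triples.

    Unrated cells are 0, rated cells hold the actual rating value.
    """
    user_ids = sorted({u for u, _, _ in ratings})
    item_ids = sorted({i for _, i, _ in ratings})
    row = {u: r for r, u in enumerate(user_ids)}
    col = {i: c for c, i in enumerate(item_ids)}
    # One row per user, one column per item, initialised to 0 (not rated).
    matrix = [[0] * len(item_ids) for _ in user_ids]
    for u, i, score in ratings:
        matrix[row[u]][col[i]] = score
    return matrix, user_ids, item_ids


# Toy data (hypothetical): three users, three items.
triples = [(1, 10, 5), (1, 20, 3), (2, 20, 4), (3, 30, 1)]
matrix, user_ids, item_ids = build_rating_matrix(triples)
```

Keeping `user_ids` and `item_ids` alongside the matrix makes it possible to map a row or column index back to the original ID later.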

Step 2, reduce the dimensionality of the user-item rating matrix.

Use the t-distributed stochastic neighbor embedding (t-SNE) algorithm to reduce the dimensionality of the user-item rating matrix and obtain a low-dimensional matrix.

Step 3, use a clustering algorithm to cluster the users in the low-dimensional matrix multiple times.

Use the clustering algorithm to cluster the users in the low-dimensional matrix into Ψ classes, obtaining the class of each user among the Ψ classes. Ψ is 20 for the MovieLens-100K rating data set and 64 for the MovieLens-1M rating data set.

The steps of the clustering algorithm are as follows:

A. Randomly select Ψ users from the low-dimensional matrix as the Ψ initial cluster centers; each cluster center corresponds to one user class.

B. Arbitrarily select one user from the low-dimensional matrix as the user to be assigned.

C. Based on the Euclidean distances between the user to be assigned and the Ψ cluster centers, find the target cluster center with the smallest Euclidean distance to that user, and label the user with the user class corresponding to the target cluster center.

D. Judge whether all users in the low-dimensional matrix have been selected; if so, execute step E; otherwise, execute step B.

E. Arbitrarily select one user class from all user classes as the target class.

F. Take the mean of the ratings of the users in the selected target class as the cluster center of that class.

G. Judge whether the cluster centers of all user classes no longer change; if so, the user class of each user in the low-dimensional matrix has been obtained; otherwise, execute step B.

Use the class of each user among the Ψ classes to obtain the class of the corresponding user in the user-item rating matrix.

After clustering the users in the low-dimensional matrix 5 times, obtain all the user classes corresponding to the users in the user-item rating matrix.

Step 4, build the virtual user-item rating matrix.

1st step: arbitrarily select one user class from the user-item rating matrix, take the mean of the ratings of the users in the selected class as the cluster center of that class, and save the cluster center as a vector.

2nd step: judge whether all user classes in the user-item rating matrix have been selected; if so, execute the 3rd step of this step; otherwise, execute the 1st step of this step.

3rd step: form the virtual user-item rating matrix from the vectors corresponding to the cluster centers of all user classes.
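The construction of the virtual user-item rating matrix can be sketched as below. Two points are assumptions rather than statements from the text: the class mean is taken as a plain column mean over the members' rating rows (zeros included), and every class from every one of the repeated clusterings contributes one virtual user. The function names and toy data are hypothetical.

```python
def class_centers(rating_matrix, labels):
    """Mean rating vector of every user class in one clustering."""
    centers = []
    for c in sorted(set(labels)):
        rows = [row for row, lab in zip(rating_matrix, labels) if lab == c]
        # Column-wise mean over the members of class c.
        centers.append([sum(col) / len(rows) for col in zip(*rows)])
    return centers


def build_virtual_matrix(rating_matrix, clusterings):
    """Stack the class centers of all clusterings; each row is a virtual user."""
    virtual = []
    for labels in clusterings:
        virtual.extend(class_centers(rating_matrix, labels))
    return virtual


# Toy data (hypothetical): four users, three items, two clusterings.
rating_matrix = [[5, 0, 3], [3, 0, 1], [0, 4, 0], [0, 2, 2]]
clusterings = [[0, 0, 1, 1], [0, 1, 1, 1]]
virtual = build_virtual_matrix(rating_matrix, clusterings)
```

With Ψ classes and 5 clusterings, this layout would yield up to 5Ψ virtual users, far fewer rows than the original matrix, which is what makes the later fitness evaluations cheaper.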

Step 5, build the optimization matrix.

1st step: use the clustering algorithm to cluster the users in the low-dimensional matrix into K classes, obtaining the class of each user among the K classes. K is 100 for the MovieLens-100K rating data set and 320 for the MovieLens-1M rating data set.

The steps of the clustering algorithm are as follows:

A. Randomly select K users from the low-dimensional matrix as the K initial cluster centers; each cluster center corresponds to one user class.

B. Arbitrarily select one user from the low-dimensional matrix as the user to be assigned.

C. Based on the Euclidean distances between the user to be assigned and the K cluster centers, find the target cluster center with the smallest Euclidean distance to that user, and label the user with the user class corresponding to the target cluster center.

2nd step: use the class of each user among the K classes to obtain the class of the corresponding user in the user-item rating matrix.

3rd step: arbitrarily select one user class from the user-item rating matrix, take the mean of the ratings of the users in the selected class as the cluster center of that class, and save the cluster center as a vector.

4th step: judge whether all user classes in the user-item rating matrix have been selected; if so, execute the 5th step of this step; otherwise, execute the 3rd step of this step.

5th step: form the optimization matrix from the vectors corresponding to the cluster centers of all user classes.

6th step: arbitrarily select one user from the optimization matrix as the target user.

7th step: arbitrarily select one item from the optimization matrix as the target item.

8th step: update the rating value of the target user for the target item according to the following formula.

Where p_bj denotes the rating value of the selected target user b for the selected target item j, |·| denotes the absolute-value operation (for a set, its number of elements), G_c denotes the set of users in the user-item rating matrix that belong to user class c and have rated the selected target item j, c denotes the user class of the cluster center corresponding to the selected target user b, and T_c denotes the set of users in the user-item rating matrix that belong to user class c.

9th step: judge whether all users in the optimization matrix have been selected; if so, execute the 10th step of this step; otherwise, execute the 6th step of this step.

10th step: judge whether all items in the optimization matrix have been selected; if so, execute Step 6; otherwise, execute the 7th step of this step.

Step 6, build the user-item training matrix and the user-item optimization matrix.

Build the user-item training matrix, whose number of rows equals the number of users in the optimization matrix and whose number of columns equals the number of items in the optimization matrix.

Build the user-item optimization matrix, whose number of rows equals the number of users in the optimization matrix and whose number of columns equals the number of items in the optimization matrix.

Step 7, update the user-item training matrix and the user-item optimization matrix.

Extract 80% of the rating data in the optimization matrix as training data and use the remaining 20% of the rating data as optimization data.

Extract the rating information from the training data; in the user-item training matrix, set the rating value of each item a user has not rated to 0 and the rating value of each item the user has rated to the actual rating value.

Extract the rating information from the optimization data; in the user-item optimization matrix, set the rating value of each item a user has not rated to 0 and the rating value of each item the user has rated to the actual rating value.
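Steps 6 and 7 amount to splitting the rated cells of the optimization matrix into two matrices of the same shape. The sketch below assumes the 80% is drawn uniformly at random over all rated cells, which the text does not specify; `split_ratings`, `train_fraction`, and `seed` are hypothetical names.

```python
import random


def split_ratings(matrix, train_fraction=0.8, seed=0):
    """Split the rated cells of `matrix` into training and optimization matrices."""
    rng = random.Random(seed)
    rated = [
        (r, c)
        for r, row in enumerate(matrix)
        for c, value in enumerate(row)
        if value != 0
    ]
    rng.shuffle(rated)  # assumption: the 80% is a uniform random sample
    cut = int(len(rated) * train_fraction)
    # Both matrices have the same shape as the source; unrated cells stay 0.
    train = [[0] * len(row) for row in matrix]
    optimize = [[0] * len(row) for row in matrix]
    for r, c in rated[:cut]:
        train[r][c] = matrix[r][c]
    for r, c in rated[cut:]:
        optimize[r][c] = matrix[r][c]
    return train, optimize


# Toy optimization matrix (hypothetical): 11 rated cells.
optimization_matrix = [[4, 0, 2, 5, 1], [0, 3, 1, 2, 4], [5, 5, 0, 0, 3]]
train, optimize = split_ratings(optimization_matrix)
```

Every rating lands in exactly one of the two matrices, so a rating used to predict is never also used to score the prediction.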

Step 8, select the information core with an evolutionary algorithm.

1st step: extract the user IDs from the virtual user-item rating matrix to form the user ID set.

2nd step: take 60% of the size of the user ID set as the information core length.

3rd step: from the user ID set, arbitrarily select 100 user ID subsets whose length equals the information core length to form the parent population.

4th step: apply crossover and mutation operations to each information core individual in the parent population to generate the transition population.

5th step: use the recommendation precision method to obtain the recommendation precision of each information core individual in the parent population and the transition population.

The steps of the recommendation precision method are as follows:

A. Calculate the cosine similarity between each user in the user-item training matrix and each user in the information core individual according to the following formula.

Where sim(u, v) denotes the cosine similarity between the u-th user in the user-item training matrix and the v-th user in the information core individual, ∑ denotes summation, i denotes the i-th item in the intersection of item sets I(u) and I(v), I(u) denotes the set of items rated by the u-th user in the user-item training matrix, I(v) denotes the set of items rated by the v-th user in the information core individual, ∈ denotes set membership, ∩ denotes intersection, r_ui denotes the rating value of the u-th user in the user-item training matrix for the i-th item, r_vi denotes the rating value of the v-th user in the information core individual for the i-th item, and √ denotes the square-root operation.

B. Arbitrarily select one user from the user-item training matrix as the experience user.

C. From the information core individual, select the N users with the largest cosine similarity to the experience user to form the neighbor set of the experience user.

D. Arbitrarily select one item from the set of items for which the experience user's rating is 0 as the experience item.

E. Calculate the predicted rating value of the experience user for the experience item according to the following formula.

Where p_ad denotes the predicted rating value of the selected experience user a for the selected experience item d, s denotes the s-th user in user set H_ad, H_ad denotes the set of users in the neighbor set of the selected experience user a that have rated the selected experience item d, sim(a, s) denotes the cosine similarity between the selected experience user a and the s-th user, and r_sd denotes the rating value of the s-th user for the selected experience item d.

F. Judge whether all items in the set of items for which the experience user's rating is 0 have been selected; if so, execute step G; otherwise, execute step D.

G. Sort the items whose rating value for the experience user is 0 in the user-item training matrix by predicted rating value from largest to smallest, and select the top-ranked items in the ordering to form the recommendation list of the experience user.

H. Calculate the recommendation precision of the experience user according to the following formula.

Where pr_a denotes the recommendation precision of the selected experience user a, Q denotes the number of items that the selected experience user a has rated in the user-item optimization matrix and that appear in the user's recommendation list, and L denotes the number of items in the recommendation list of the selected experience user a.

I. Judge whether the recommendation precision has been obtained for all users in the user-item training matrix; if so, execute step J; otherwise, execute step B.

J. Calculate the recommendation precision of the information core individual according to the following formula.

Where PR denotes the recommendation precision of the information core individual, e denotes the e-th user in user set U, U denotes the set of users in the user-item training matrix, and pr_e denotes the recommendation precision of the e-th user.
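Steps A to J can be sketched as follows. The formulas themselves are not reproduced in this text, so the forms used here are assumptions read off the symbol definitions: cosine similarity with all sums taken over the co-rated items, a similarity-weighted average for the predicted rating, pr_a = Q / L for a single user, and the mean of pr_e over all users for PR. All function and parameter names are hypothetical.

```python
import math


def cosine_similarity(ru, rv):
    # Assumption: numerator and both norms run over the co-rated items only.
    common = [i for i, (a, b) in enumerate(zip(ru, rv)) if a != 0 and b != 0]
    if not common:
        return 0.0
    dot = sum(ru[i] * rv[i] for i in common)
    norm_u = math.sqrt(sum(ru[i] ** 2 for i in common))
    norm_v = math.sqrt(sum(rv[i] ** 2 for i in common))
    return dot / (norm_u * norm_v)


def user_precision(train_row, optimize_row, core_rows, n_neighbors, list_length):
    # Step C: the N information-core users most similar to the experience user.
    scored = [(cosine_similarity(train_row, v), v) for v in core_rows]
    neighbors = sorted(scored, key=lambda t: t[0], reverse=True)[:n_neighbors]
    # Steps D-F: predict a rating for every item the user has not rated.
    predictions = {}
    for d, score in enumerate(train_row):
        if score != 0:
            continue
        raters = [(s, v[d]) for s, v in neighbors if v[d] != 0]
        weight = sum(s for s, _ in raters)
        # Assumption: similarity-weighted average of the neighbors' ratings.
        predictions[d] = (
            sum(s * r for s, r in raters) / weight if weight > 0 else 0.0
        )
    # Step G: recommendation list = top predicted items.
    ranked = sorted(predictions, key=predictions.get, reverse=True)[:list_length]
    if not ranked:
        return 0.0
    # Step H: Q / L, where Q counts listed items rated in the optimization matrix.
    hits = sum(1 for d in ranked if optimize_row[d] != 0)
    return hits / len(ranked)


def core_precision(train, optimize, core_rows, n_neighbors=10, list_length=20):
    # Steps I-J: assumed to be the mean precision over all training users.
    scores = [
        user_precision(t, o, core_rows, n_neighbors, list_length)
        for t, o in zip(train, optimize)
    ]
    return sum(scores) / len(scores)


# Toy data (hypothetical): two training users, two information-core users.
train = [[5, 0, 0, 3], [0, 4, 0, 0]]
optimize = [[0, 4, 0, 0], [0, 0, 2, 0]]
core_rows = [[5, 4, 0, 3], [0, 4, 5, 1]]
PR = core_precision(train, optimize, core_rows, n_neighbors=2, list_length=2)
```

Because each training user is compared only with the users inside the information core, the cost of one fitness evaluation grows with the core size rather than with the full user population.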

6th step: sort the information core individuals in the parent population and the transition population by recommendation precision from largest to smallest, and select the first 100 information core individuals in the ordering to form the offspring population.

7th step: replace each information core individual in the parent population with the corresponding information core individual in the offspring population to generate the new parent population.

8th step: judge whether the current iteration count has reached the maximum of 100 iterations; if so, execute Step 9; otherwise, execute the 4th step of this step.
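The loop formed by the 1st to 8th steps can be sketched as below. The text does not describe the crossover and mutation operators, so the two operators here are illustrative assumptions, as are the function name, the `mutation_rate` parameter, and the toy fitness function; in the method itself the fitness would be the recommendation precision of the information core individual.

```python
import random


def evolve_information_core(user_ids, fitness, core_fraction=0.6,
                            population_size=100, generations=100,
                            mutation_rate=0.1, seed=0):
    rng = random.Random(seed)
    # 2nd step: the information core length is 60% of the ID set size.
    core_length = int(len(user_ids) * core_fraction)

    def crossover(a, b):
        # Assumed operator: draw the child from the union of both parents.
        pool = sorted(set(a) | set(b))
        return frozenset(rng.sample(pool, core_length))

    def mutate(core):
        # Assumed operator: swap one member for an ID outside the core.
        outside = [u for u in user_ids if u not in core]
        if outside and rng.random() < mutation_rate:
            members = sorted(core)
            members[rng.randrange(core_length)] = rng.choice(outside)
            return frozenset(members)
        return core

    # 3rd step: random ID subsets of the core length form the parent population.
    parents = [
        frozenset(rng.sample(user_ids, core_length))
        for _ in range(population_size)
    ]
    for _ in range(generations):
        # 4th step: crossover and mutation produce the transition population.
        transition = [mutate(crossover(p, rng.choice(parents))) for p in parents]
        # 5th-7th steps: rank parents plus transition, keep the best individuals.
        merged = parents + transition
        merged.sort(key=fitness, reverse=True)
        parents = merged[:population_size]
    # Step 9: the individual with the highest fitness is the best core.
    return max(parents, key=fitness)


# Toy run (hypothetical): ten IDs, fitness is simply the sum of the chosen IDs.
user_ids = list(range(10))
best = evolve_information_core(user_ids, fitness=sum,
                               population_size=20, generations=30)
```

Selecting the survivors from the merged parent and transition populations is elitist: the best fitness in the population never decreases from one iteration to the next.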

Step 9, complete the construction of the information core.

Select the information core individual with the highest recommendation precision from the parent population as the best information core.

Step 10, perform collaborative filtering recommendation.

The effect of the present invention is further described below with reference to simulation experiments.

1. Simulation conditions:

The simulation experiments of the present invention were run in the following environment: 64-bit Windows 7 operating system; CPU: Intel(R) Core(TM) i3-CPU 550U 3.20GHz; memory: 4 GB; programming environment: MATLAB 2017a.

2. Simulation data and evaluation metric:

The simulation experiments use MovieLens-100K and MovieLens-1M, two data sets commonly used in the recommender system field. To verify the recommendation performance of the information core extracted by the present invention, each of the two data sets is split into a training data subset Train and a test data subset Test. The statistics of the two data sets are shown in Table 1 below.

Table 1. Data set statistics

In Table 1, Dataset (original) denotes the original data set; Dataset (subset) denotes a subset of the original data set, namely the training data subset Train or the test data subset Test; #User denotes the number of users and U the user set; #Item denotes the number of items and I the item set; #Ratings denotes the number of ratings and R the users' ratings of items.

The present invention uses recommendation precision (precision), an evaluation metric commonly used in the recommender system field, and calculates the recommendation precision of the information core on the test data subset as follows:

1st step: arbitrarily select one user from the test data subset as the target user and calculate the recommendation precision of the information core for the target user according to the following formula.

Where precision_u denotes the recommendation precision of the information core for the selected target user u in the test data subset, Q denotes the number of items needed by user u among the items the information core recommends to the selected target user u, and L denotes the number of items the information core recommends to the selected target user u.

2nd step: judge whether the recommendation precision of the information core has been obtained for all users in the test data subset; if so, execute the 3rd step; otherwise, execute the 1st step.

3rd step: calculate the recommendation precision of the information core on the test data subset according to the following formula.

Where precision denotes the recommendation precision of the information core on the test data subset, u denotes the u-th user in user set U, U denotes the set of users in the test data subset, and precision_u denotes the recommendation precision of the information core for the u-th user.
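The two formulas referred to in the 1st and 3rd steps are not reproduced in this text. Read against the symbol definitions, they presumably take the standard precision form below; the averaging over |U| in particular is an assumption, since the definitions name only u, U and precision_u.

```latex
\mathrm{precision}_u = \frac{Q}{L},
\qquad
\mathrm{precision} = \frac{1}{|U|} \sum_{u \in U} \mathrm{precision}_u
```

For example, with a recommendation list of length L = 20 of which Q = 5 items are needed by user u, precision_u would be 5 / 20 = 0.25.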

3. emulation experiment content and interpretation of result：

There are two the emulation experiments of the present invention.

The emulation experiment 1 of the present invention is training data subset Train and the survey after two frequently-used data collection are respectively split It tries on data subset Test, to recommend precision precision as evaluation index, recommendation list length is 20 times progress.Emulation Experiment 1 is that the present invention based on cluster and evolution algorithm builds the item recommendation method of information core with 4 prior arts (based on row The name information core construction method of Rank, the information core construction method based on frequency Frequency, based on common evolution algorithm EA's Information core construction method and method based on traditional collaborative filtering CF) recommendation precision compared, it is contemplated that the method for the present invention It reruns 10 times and is averaging when seeking recommendation precision with the randomness of the information core construction method based on common evolution algorithm EA Value, data set MovieLens-100K, shown in the result such as table 2 and Fig. 2 (a) of comparison, data set MovieLens-1M, comparison As a result as shown in table 3 and Fig. 2 (b).

Ordinate in Fig. 2 (a), Fig. 2 (b) indicates that precision, abscissa is recommended to indicate neighbours' number.In Fig. 2 (a), Fig. 2 (b) Indicate that the information core construction method based on frequency Frequency recommends precision curve with obtaining with the curve of square mark.Fig. 2 (a), the curve indicated with diamond shape in Fig. 2 (b) indicates that the information core construction method based on ranking Rank recommends precision bent with obtaining Line.Indicate that the method based on traditional collaborative filtering CF recommends precision with obtaining in Fig. 2 (a), Fig. 2 (b) with the curve of circle mark Curve indicates that the information core construction method of common evolution algorithm EA obtains ground in Fig. 2 (a), Fig. 2 (b) with the curve of Asterisk marks Recommend precision curve.The method that the curve indicated with right triangle in Fig. 2 (a), Fig. 2 (b) indicates the present invention recommends essence with obtaining It writes music line.

As can be seen from Fig. 2(a) and Fig. 2(b), the curve of the present invention lies above the curves of the Frequency-based information core construction method, the Rank-based information core construction method, the method based on traditional collaborative filtering (CF), and the information core construction method based on the ordinary evolutionary algorithm (EA), which shows that the recommendation precision of the present invention is the highest.

Table 2. Recommendation precision of each method on MovieLens-100K with a recommendation list length of 20

Table 3. Recommendation precision of each method on MovieLens-1M with a recommendation list length of 20

In Tables 2 and 3, "number of neighbours" is the number of users in a user's neighbour set; "Frequency" is the recommendation precision of the information core construction method based on frequency; "Rank" is the recommendation precision of the information core construction method based on ranking; "CF" is the recommendation precision of the method based on traditional collaborative filtering; "EA" is the recommendation precision of the information core construction method based on the ordinary evolutionary algorithm; and "the present invention" is the recommendation precision of the method of the present invention.

As can be seen from Tables 2 and 3, on both commonly used rating data sets, MovieLens-100K and MovieLens-1M, the recommendation precision of the present invention is higher than that of all four prior-art methods.

Simulation experiment 2 of the present invention was carried out on the training subset Train and the test subset Test obtained by splitting each of the two commonly used data sets, with the time needed to select the information core offline as the evaluation index and the number of neighbours set to 10. Experiment 2 compares the offline information core selection time of the present invention with that of the prior art (the information core construction method based on the ordinary evolutionary algorithm EA). Because the present method and the EA-based information core construction method are both stochastic, each was run 10 times and the offline selection time was averaged. The comparison results are shown in Table 4.

Table 4. Offline information core selection time of the two compared information core selection methods

As can be seen from Table 4, on both data sets, MovieLens-100K and MovieLens-1M, the information core construction method based on the ordinary evolutionary algorithm EA takes longer to select the information core offline, whereas the method of the present invention takes less time, which shows that the present invention can select the information core offline more quickly.
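The averaging protocol used for the timing comparison can be illustrated with a minimal Python sketch. The names `average_offline_time` and `dummy_builder` are hypothetical and do not come from the patent; `dummy_builder` is only a stand-in for a stochastic information core construction, and no measured value from Table 4 is reproduced here.

```python
import random
import time


def average_offline_time(build_information_core, runs=10):
    """Average wall-clock time of `build_information_core` over `runs` calls."""
    elapsed = []
    for _ in range(runs):
        start = time.perf_counter()
        build_information_core()
        elapsed.append(time.perf_counter() - start)
    return sum(elapsed) / len(elapsed)


def dummy_builder():
    # Hypothetical stand-in for a stochastic information core construction.
    return random.sample(range(1000), 20)


mean_seconds = average_offline_time(dummy_builder, runs=10)
print(f"mean offline selection time: {mean_seconds:.6f} s")
```

Averaging over repeated runs reduces the influence of the random initialisation on the reported time.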

Claims

1. An item recommendation method that builds an information core based on clustering and an evolutionary algorithm, characterized in that a clustering algorithm is used to build a virtual-user-item rating matrix, an evolutionary algorithm is used to extract an information core from the virtual-user-item rating matrix, and the information core is used to recommend items to users; the method comprises the following steps:

(1) Build the user-item rating matrix:

(1a) Extract all rating information of users on items from the data set of user ratings on items, and create a user-item rating matrix whose number of rows equals the number of users in the rating data set and whose number of columns equals the number of items in the rating data set;

(1b) In the rating matrix, represent the rating of an item the user has not rated by 0, and represent the rating of an item the user has rated by its actual rating value;

(2) Reduce the dimensionality of the user-item rating matrix:

(3a) Using the clustering algorithm, cluster the users in the low-dimensional matrix into Ψ categories to obtain the category of each user among the Ψ categories;

(3b) Using the category of each user among the Ψ categories, obtain the category of the corresponding user in the user-item rating matrix;

(3c) After the users in the low-dimensional matrix have been clustered 5 times, obtain all the user categories corresponding to the users in the user-item rating matrix;

(4) Build the virtual-user-item rating matrix:

(4a) Arbitrarily select a user category from the user-item rating matrix, take the mean of the ratings of the users in the selected user category as the cluster centre of the selected category, and save the cluster centre of the selected category as a vector;

(4b) Judge whether all user categories in the user-item rating matrix have been selected; if so, execute step (4d); otherwise, execute step (4a);

(5) Build the optimization matrix:

(5a) Using the clustering algorithm, cluster the users in the low-dimensional matrix into K categories to obtain the category of each user among the K categories;

(5b) Using the category of each user among the K categories, obtain the category of the corresponding user in the user-item rating matrix;

(5c) Arbitrarily select a user category from the user-item rating matrix, take the mean of the ratings of the users in the selected user category as the cluster centre of the selected category, and save the cluster centre of the selected category as a vector;

(5d) Judge whether all user categories in the user-item rating matrix have been selected; if so, execute step (5e); otherwise, execute step (5c);

(5f) Arbitrarily select a user from the optimization matrix as the target user;

(5g) Arbitrarily select an item from the optimization matrix as the target item;

Wherein, p_bj denotes the rating of target user b on target item j, | | denotes the absolute-value operation, G_c denotes the set of users in the user-item rating matrix who belong to user category c and have rated target item j, c denotes the user category of the cluster centre corresponding to target user b, and T_c denotes the set of users in the user-item rating matrix who belong to user category c;

(6a) Build the user-item training matrix, whose number of rows equals the number of users in the optimization matrix and whose number of columns equals the number of items in the optimization matrix;

(6b) Build the user-item optimization matrix, whose number of rows equals the number of users in the optimization matrix and whose number of columns equals the number of items in the optimization matrix;

(7a) Extract 80% of the rating data in the optimization matrix as training data, and use the remaining 20% of the rating data as optimization data;

(7b) Extract rating information from the training data; in the user-item training matrix, replace the rating of each item the user has not rated with 0, and replace the rating of each item the user has rated with its actual rating value;

(7c) Extract rating information from the optimization data; in the user-item optimization matrix, replace the rating of each item the user has not rated with 0, and replace the rating of each item the user has rated with its actual rating value;

(8) Select the information core using the evolutionary algorithm:

(8c) From the set of user IDs, arbitrarily select 100 user ID subsets whose length equals the length of the information core to form the parent population;

(8e) Using the recommendation precision method, obtain the recommendation precision of each information core individual in the parent population and the transition population;

(8f) Sort the information core individuals in the parent population and the transition population in descending order of recommendation precision, and select the top 100 information core individuals from the sorted list to form the offspring population;

(8g) Replace the information core individuals in the parent population with the information core individuals in the offspring population to generate a new parent population;

(8h) Judge whether the current iteration count has reached the maximum of 100 iterations; if so, execute step (9); otherwise, execute step (8d);

(9) Complete the construction of the information core:

(10) Use the information core to recommend items to users:
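The selection loop of step (8) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the crossover and mutation operators in `make_transition` are assumed stand-ins (one-point crossover with duplicate repair, then random replacement), the `precision` argument is any caller-supplied fitness function, returning the best individual of the final population is one possible reading of step (9), and all function and parameter names are hypothetical.

```python
import random


def make_transition(parents, all_ids, rng, mutation_rate=0.1):
    """Assumed operators: one-point crossover, then random-replacement mutation."""
    transition = []
    for _ in range(len(parents)):
        a, b = rng.sample(parents, 2)
        cut = rng.randrange(1, len(a))
        # Join the two halves and drop duplicate user IDs, keeping order.
        child = list(dict.fromkeys(a[:cut] + b[cut:]))
        # Refill with unused IDs so every individual keeps the core length.
        unused = [uid for uid in all_ids if uid not in child]
        rng.shuffle(unused)
        child += unused[:len(a) - len(child)]
        if rng.random() < mutation_rate:
            candidates = [uid for uid in all_ids if uid not in child]
            if candidates:
                child[rng.randrange(len(child))] = rng.choice(candidates)
        transition.append(child)
    return transition


def select_information_core(all_ids, core_len, precision,
                            pop_size=100, iterations=100, seed=0):
    rng = random.Random(seed)
    # (8c): the parent population is pop_size random user ID subsets.
    parents = [rng.sample(all_ids, core_len) for _ in range(pop_size)]
    for _ in range(iterations):  # (8h): fixed number of iterations
        transition = make_transition(parents, all_ids, rng)
        # (8e), (8f): rank parents and transition individuals by precision.
        merged = sorted(parents + transition, key=precision, reverse=True)
        # (8g): the best pop_size individuals become the new parents.
        parents = merged[:pop_size]
    return parents[0]  # assumption: the best individual is the information core


# Toy fitness for demonstration only: share of IDs below 10.
toy_precision = lambda core: sum(1 for uid in core if uid < 10) / len(core)
best_core = select_information_core(list(range(50)), 5, toy_precision,
                                    pop_size=20, iterations=10)
print(sorted(best_core), toy_precision(best_core))
```

Because the parents compete with the transition population in every iteration, the best recommendation precision in the population never decreases from one iteration to the next.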

2. The item recommendation method that builds an information core based on clustering and an evolutionary algorithm according to claim 1, characterized in that the rating information described in step (1a), step (7b) and step (7c) comprises a user ID, an item ID, and the user's rating of the item.
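As an illustration of how rating information of this form can be turned into matrices, the following Python sketch builds a rating matrix from (user ID, item ID, rating) triples and performs an 80%/20% split in the manner of step (7a). It assumes zero-based, contiguous integer IDs and a random split; the claim itself applies the split to the rating data of the optimization matrix and does not state how the 80% is chosen. The function names and the sample data are hypothetical.

```python
import numpy as np


def build_rating_matrix(triples, n_users, n_items):
    """Dense user-item matrix; 0 marks an item the user has not rated."""
    matrix = np.zeros((n_users, n_items))
    for user_id, item_id, rating in triples:
        matrix[user_id, item_id] = rating
    return matrix


def split_train_opt(triples, n_users, n_items, train_fraction=0.8, seed=0):
    """Randomly split the triples (assumed policy) and build both matrices."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(triples))
    n_train = int(round(train_fraction * len(triples)))
    train = [triples[i] for i in order[:n_train]]
    opt = [triples[i] for i in order[n_train:]]
    return (build_rating_matrix(train, n_users, n_items),
            build_rating_matrix(opt, n_users, n_items))


# Made-up sample data: (user ID, item ID, rating).
ratings = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1), (2, 1, 2),
           (2, 2, 5), (0, 2, 4), (1, 1, 3), (2, 0, 1), (3, 3, 4)]
train_m, opt_m = split_train_opt(ratings, n_users=4, n_items=4)
print(train_m)
print(opt_m)
```

Both matrices have the same shape, and every rating appears in exactly one of them.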

3. The item recommendation method that builds an information core based on clustering and an evolutionary algorithm according to claim 1, characterized in that the clustering algorithm described in step (3a) and step (5a) proceeds as follows:

Step 1: randomly select N users from the low-dimensional matrix as the N initial cluster centres, each cluster centre corresponding to one user category;

Step 2: arbitrarily select a user from the low-dimensional matrix as the user to be assigned;

Step 3: according to the Euclidean distances between the user to be assigned and the N cluster centres, find the target cluster centre with the smallest Euclidean distance to that user, and label the user with the user category corresponding to the target cluster centre;

Step 4: judge whether all users in the low-dimensional matrix have been selected; if so, execute Step 5; otherwise, execute Step 2;

Step 5: arbitrarily select a user category from all user categories as the target category;

Step 6: take the mean of the ratings of the users in the selected target user category as the cluster centre of that category;

Step 7: judge whether the cluster centres of all user categories remain unchanged; if so, obtain the user category corresponding to each user in the low-dimensional matrix; otherwise, execute Step 2.
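The clustering procedure of claim 3 follows the pattern of k-means clustering and can be sketched in Python with NumPy as below. This is a minimal sketch: the function name `cluster_users`, the iteration cap `max_rounds`, and the rule that an empty category keeps its previous centre are assumptions added for illustration and do not appear in the claim.

```python
import numpy as np


def cluster_users(low_dim, n_clusters, seed=0, max_rounds=100):
    rng = np.random.default_rng(seed)
    # Step 1: N randomly chosen users serve as the initial cluster centres.
    picks = rng.choice(len(low_dim), size=n_clusters, replace=False)
    centres = low_dim[picks].copy()
    for _ in range(max_rounds):  # assumption: safety cap on the number of rounds
        # Steps 2-4: label every user with the nearest centre (Euclidean distance).
        diffs = low_dim[:, None, :] - centres[None, :, :]
        labels = np.linalg.norm(diffs, axis=2).argmin(axis=1)
        # Steps 5-6: move each centre to the mean of the users in its category.
        new_centres = centres.copy()
        for k in range(n_clusters):
            members = low_dim[labels == k]
            if len(members) > 0:  # assumption: an empty category keeps its centre
                new_centres[k] = members.mean(axis=0)
        # Step 7: stop once no cluster centre changes.
        if np.allclose(new_centres, centres):
            break
        centres = new_centres
    return labels, centres


# Two well-separated groups of made-up low-dimensional user vectors.
points = np.array([[0, 0], [0, 1], [1, 0],
                   [10, 10], [10, 11], [11, 10]], dtype=float)
labels, centres = cluster_users(points, n_clusters=2)
print(labels)
```

The returned `centres` are the per-category means, which is the quantity that steps (4a) and (5c) of claim 1 save as a vector for each user category.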

4. The item recommendation method that builds an information core based on clustering and an evolutionary algorithm according to claim 1, characterized in that the recommendation precision method described in step (8e) proceeds as follows:

Step 1: calculate, according to the following formula, the cosine similarity between each user in the user-item training matrix and each user in the information core individual:

Wherein, sim(u, v) denotes the cosine similarity between the u-th user in the user-item training matrix and the v-th user in the information core individual, ∑ denotes the summation operation, i denotes the i-th item in the intersection of item sets I(u) and I(v), I(u) denotes the set of items rated by the u-th user in the user-item training matrix, I(v) denotes the set of items rated by the v-th user in the information core individual, ∈ denotes set membership, ∩ denotes the intersection operation, r_ui denotes the rating of the u-th user in the user-item training matrix on the i-th item, r_vi denotes the rating of the v-th user in the information core individual on the i-th item, and √ denotes the square-root operation;

Step 2: arbitrarily select a user from the user-item training matrix as the experience user;

Step 3: from the information core individual, select the top N users with the greatest cosine similarity to the experience user to form the neighbour set of the experience user;

Step 4: arbitrarily select an item from the set of items whose rating by the experience user is 0 as the experience item;

Step 5: calculate, according to the following formula, the predicted rating of the experience user on the experience item:

Wherein, p_ad denotes the predicted rating of the selected experience user a on the selected experience item d, s denotes the s-th user in user set H_ad, H_ad denotes the set of users in the neighbour set of the selected experience user a who have rated the selected experience item d, sim(a, s) denotes the cosine similarity between the selected experience user a and the s-th user, and r_sd denotes the rating of the s-th user on the selected experience item d;

Step 6: judge whether all items in the set of items whose rating by the experience user is 0 have been selected; if so, execute Step 7; otherwise, execute Step 4;

Step 7: sort the items whose rating by the experience user in the user-item training matrix is 0 in descending order of predicted rating, and select the top-ranked items from the sorted list to form the recommendation list of the experience user;

Step 8: calculate, according to the following formula, the recommendation precision of the experience user:

Wherein, pr_a denotes the recommendation precision of the selected experience user a, Q denotes the number of items that the selected experience user a has rated in the user-item optimization matrix and that appear in that user's recommendation list, and L denotes the number of items in the recommendation list of the selected experience user a;

Step 9: judge whether the recommendation precision of all users in the user-item training matrix has been obtained; if so, execute Step 10; otherwise, execute Step 2;

Step 10: calculate, according to the following formula, the recommendation precision of the information core individual:

Wherein, PR denotes the recommendation precision of the information core individual, e denotes the e-th user in user set U, U denotes the set of users in the user-item training matrix, and pr_e denotes the recommendation precision of the e-th user.
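The ten steps of claim 4 can be sketched in Python with NumPy as follows. Since the formulas themselves are not reproduced in the text above, several details here are assumptions made for illustration: the cosine similarity normalises by each user's full rating vector, the predicted rating is read as a similarity-weighted average over the neighbours in H_ad, pr_a is read as Q / L, PR is read as the mean of pr_e over all users, a user is never counted as their own neighbour, and the IDs in `core_ids` are taken to be row indices of the training matrix. The function names and the sample data are hypothetical.

```python
import numpy as np


def cosine_similarity(r_u, r_v):
    """Assumed form: dot product over co-rated items / product of vector norms."""
    denom = np.linalg.norm(r_u) * np.linalg.norm(r_v)
    return float(r_u @ r_v / denom) if denom > 0 else 0.0


def core_precision(train, opt, core_ids, n_neighbours=10, list_len=20):
    precisions = []
    for a in range(train.shape[0]):  # Steps 2 and 9: visit every user
        # Steps 1 and 3: similarity to each core user, keep the top N.
        others = [v for v in core_ids if v != a]
        sims = {v: cosine_similarity(train[a], train[v]) for v in others}
        neighbours = sorted(sims, key=sims.get, reverse=True)[:n_neighbours]
        # Steps 4-6: predict a rating for every item the user has not rated.
        scores = {}
        for d in np.flatnonzero(train[a] == 0):
            raters = [s for s in neighbours if train[s, d] > 0]  # H_ad
            weight = sum(sims[s] for s in raters)
            weighted = sum(sims[s] * train[s, d] for s in raters)
            scores[d] = weighted / weight if weight > 0 else 0.0
        # Step 7: recommendation list = top items by predicted rating.
        recommended = sorted(scores, key=scores.get, reverse=True)[:list_len]
        # Step 8: Q = recommended items rated in the optimization matrix.
        q = sum(1 for d in recommended if opt[a, d] > 0)
        precisions.append(q / len(recommended) if recommended else 0.0)
    return float(np.mean(precisions))  # Step 10: average over all users


# Made-up 3-user, 4-item example.
train = np.array([[5, 0, 0, 1],
                  [5, 4, 0, 0],
                  [0, 2, 3, 5]], dtype=float)
opt = np.array([[0, 4, 0, 0],
                [0, 0, 0, 2],
                [0, 0, 0, 0]], dtype=float)
pr = core_precision(train, opt, core_ids=[1, 2], n_neighbours=1, list_len=1)
print(pr)
```

In this toy example the first two users each receive one recommended item that they have rated in the optimization matrix, while the third user has no ratings there, so the information core individual scores 2/3.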