CN108776919A - Item recommendation method that builds an information core based on clustering and an evolutionary algorithm - Google Patents

Publication number
CN108776919A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810550780.5A
Other languages
Chinese (zh)
Other versions
CN108776919B (en
Inventor
慕彩红
刘逸
朱贤武
刘若辰
张丹
侯彪
熊涛
焦李成
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0631 Item recommendations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/086 Learning methods using evolutionary algorithms, e.g. genetic algorithms or genetic programming

Abstract

The present invention discloses an item recommendation method that builds an information core based on clustering and an evolutionary algorithm. The steps are: (1) build the user-item rating matrix; (2) reduce the dimensionality of the user-item rating matrix; (3) use a clustering algorithm to build the virtual user-item rating matrix; (4) build and update the user-item training matrix and the user-item optimization matrix; (5) initialize the parent population; (6) apply crossover and mutation to generate the transition population; (7) compute the recommendation precision of each information core individual; (8) generate the offspring population; (9) update the parent population; (10) check whether the number of iterations has reached 100; (11) complete the construction of the information core; (12) use the information core to recommend items to users. The invention builds the information core quickly and recommends items to users more accurately.

Description

Item recommendation method that builds an information core based on clustering and an evolutionary algorithm
Technical field
The invention belongs to the field of computer technology and relates more specifically, within the field of item recommendation technology, to an item recommendation method that builds an information core based on clustering and an evolutionary algorithm. Based on users' ratings of items, the invention can use the constructed information core to recommend to each user the items that user needs.
Background art
A recommender system is an information filtering system: it analyzes users' historical behavior data to discover their preferences and recommends items or information they are likely to be interested in. Many recommendation methods already exist. Collaborative filtering is currently the most widely used, but as the data volume grows, both the running time of the algorithm and the time needed to produce recommendations increase, and this scalability problem severely restricts the development of collaborative filtering.
In their paper "Information core optimization using Evolutionary Algorithm with Elite Population in recommender systems" (Congress on Evolutionary Computation (CEC), 2017 IEEE), Caihong Mu et al. propose a recommendation method that extracts an information core with an evolutionary algorithm based on an elite population. The steps of that method are: Step 1, build the sparse user-item rating matrix; Step 2, apply the evolutionary algorithm: initialize the parent population and compute the fitness of each individual in the population; Step 3, following the M-elite strategy, sort the individuals by fitness in descending order, perform crossover in that order, and extract the information core; Step 4, using the information core, predict the target user's ratings for all unrated items and make recommendations. The shortcoming of this method is that, when the information core is extracted with the evolutionary algorithm, computing the fitness of each information core individual in the population takes too long, so the offline selection of the information core is too slow.
In its patent application "A personalized recommendation method and system based on key users" (application number 201510157504.9, application publication number CN 104778237A), the University of Electronic Science and Technology discloses a personalized recommendation method based on key users. The method is implemented as follows: Step 1, obtain users' ratings of items from the user-item rating data set, and compute the similarity between different users from their ratings of different items; Step 2, determine for the target user a neighbor list of length N, in which neighbors are ordered by decreasing similarity to the target user; Step 3, set a weighting rule: the more often a user appears in other users' neighbor lists and the nearer the front the user is ranked, the larger the user's weight; Step 4, take the P users with the largest weights as key users; Step 5, using the key users' ratings of items, predict ratings for all items the target user has not rated, and recommend to the target user according to the predicted ratings. The shortcoming of this method is that it fixes the selection criterion for the information core using the similarity between users, and information cores selected this way give low recommendation precision.
Summary of the invention
The object of the invention is to address the above shortcomings of the prior art by proposing an item recommendation method that builds an information core based on clustering and an evolutionary algorithm.
The idea behind the invention is to reduce the dimensionality of the user-item rating matrix to obtain a low-dimensional matrix, cluster the users in the low-dimensional matrix several times with a clustering algorithm, build the virtual user-item rating matrix, extract the information core from the virtual user-item rating matrix with an evolutionary algorithm, and use the information core to recommend items to users.
The specific implementation steps of the invention are as follows:
(1) Build the user-item rating matrix:
(1a) Extract all rating information from the user-item rating data set and create the user-item rating matrix; the number of rows of the matrix equals the number of users in the rating data set, and the number of columns equals the number of items in the rating data set;
(1b) Use 0 to represent the rating of an item a user has not rated, and use the actual rating value to represent the rating of an item a user has rated;
(2) Reduce the dimensionality of the user-item rating matrix:
Reduce the dimensionality of the user-item rating matrix with the t-distributed stochastic neighbor embedding (t-SNE) algorithm to obtain the low-dimensional matrix;
(3) Cluster the users in the low-dimensional matrix several times with a clustering algorithm:
(3a) Using the clustering algorithm, cluster the users in the low-dimensional matrix into Ψ classes, obtaining the class to which each user belongs among the Ψ classes;
(3b) Using each user's class among the Ψ classes, obtain the class of the corresponding user in the user-item rating matrix;
(3c) After clustering the users in the low-dimensional matrix 5 times, obtain all user classes of the users in the user-item rating matrix;
(4) Build the virtual user-item rating matrix:
(4a) Arbitrarily choose one user class from the user-item rating matrix, take the mean of the ratings of the users in the chosen class as the cluster center of that class, and save the cluster center as a vector;
(4b) Check whether all user classes in the user-item rating matrix have been chosen; if so, go to step (4d); otherwise, go to step (4a);
(4d) Assemble the vectors of the cluster centers of all user classes into the virtual user-item rating matrix;
(5) Build the optimization matrix:
(5a) Using the clustering algorithm, cluster the users in the low-dimensional matrix into K classes, obtaining the class to which each user belongs among the K classes;
(5b) Using each user's class among the K classes, obtain the class of the corresponding user in the user-item rating matrix;
(5c) Arbitrarily choose one user class from the user-item rating matrix, take the mean of the ratings of the users in the chosen class as the cluster center of that class, and save the cluster center as a vector;
(5d) Check whether all user classes in the user-item rating matrix have been chosen; if so, go to step (5e); otherwise, go to step (5c);
(5e) Assemble the vectors of the cluster centers of all user classes into the optimization matrix;
(5f) Arbitrarily choose one user from the optimization matrix as the target user;
(5g) Arbitrarily choose one item from the optimization matrix as the target item;
(5h) Update the target user's rating of the target item according to the following formula:
where p_bj denotes the rating of target user b for target item j; |·| denotes the absolute-value operation (for a set, its number of elements); G_c denotes the set of users in the user-item rating matrix who belong to user class c and have rated target item j; c denotes the user class of the cluster center corresponding to target user b; and T_c denotes the set of users in the user-item rating matrix who belong to user class c;
(5i) Check whether all users in the optimization matrix have been chosen; if so, go to step (5j); otherwise, go to step (5f);
(5j) Check whether all items in the optimization matrix have been chosen; if so, go to step (6); otherwise, go to step (5g);
(6) Build the user-item training matrix and the user-item optimization matrix:
(6a) Build the user-item training matrix; its number of rows equals the number of users in the optimization matrix, and its number of columns equals the number of items in the optimization matrix;
(6b) Build the user-item optimization matrix; its number of rows equals the number of users in the optimization matrix, and its number of columns equals the number of items in the optimization matrix;
(7) Update the user-item training matrix and the user-item optimization matrix:
(7a) Extract 80% of the rating data from the optimization matrix as training data; the remaining 20% of the rating data serves as optimization data;
(7b) Extract the rating information from the training data; in the user-item training matrix, set the rating of each item a user has not rated to 0 and the rating of each item a user has rated to the actual rating value;
(7c) Extract the rating information from the optimization data; in the user-item optimization matrix, set the rating of each item a user has not rated to 0 and the rating of each item a user has rated to the actual rating value;
(8) Select the information core with an evolutionary algorithm:
(8a) Extract the user IDs from the virtual user-item rating matrix to form the user ID set;
(8b) Take 60% of the length of the user ID set as the information core length;
(8c) Arbitrarily choose from the user ID set 100 user ID subsets whose length equals the information core length, forming the parent population;
(8d) Apply crossover and mutation to each information core individual in the parent population to generate the transition population;
(8e) Use the recommendation precision method to obtain the recommendation precision of each information core individual in the parent population and the transition population;
(8f) Sort the information core individuals of the parent population and the transition population by recommendation precision in descending order, and take the first 100 individuals of the ranking to form the offspring population;
(8g) Replace each information core individual in the parent population with the corresponding information core individual of the offspring population, generating the new parent population;
(8h) Check whether the current iteration count has reached the maximum of 100 iterations; if so, go to step (9); otherwise, go to step (8d);
(9) Complete the construction of the information core:
Choose from the parent population the information core individual with the highest recommendation precision as the best information core;
(10) Use the information core to recommend items to users:
Using the best information core found, recommend to each user in the user-item rating matrix the items that user needs.
Compared with the prior art, the invention has the following advantages:
First, because the invention builds the user-item training matrix and the user-item optimization matrix and then selects the information core with the evolutionary algorithm, it overcomes the prior-art drawback that computing the fitness of each information core individual in the population takes too long and makes offline selection of the information core too slow. The invention thus shortens the time needed to select the information core offline and alleviates the long offline optimization time of selecting an information core with an evolutionary algorithm.
Second, because the invention selects the information core from the virtual user-item rating matrix with the evolutionary algorithm and takes the individual with the highest recommendation precision in the parent population as the best information core, it overcomes the prior-art drawback that fixing the selection criterion by the similarity between users yields information cores with low recommendation precision. The information core selected by the invention therefore has higher recommendation precision.
Description of the drawings
Fig. 1 is the flow chart of the invention;
Fig. 2 is the simulation diagram of the invention.
Detailed description of the embodiments:
The invention is described in further detail below with reference to the drawings.
With reference to Fig. 1, the specific implementation steps of the invention are described further.
Step 1, build the user-item rating matrix.
Extract all rating information from the user-item rating data set and create the user-item rating matrix; the number of rows of the matrix equals the number of users in the rating data set, and the number of columns equals the number of items in the rating data set.
In the embodiment of the invention, the user-item rating data sets are the MovieLens-100K rating data set and the MovieLens-1M rating data set.
The rating information comprises the user ID, the item ID, and the user's rating of the item.
Use 0 to represent the rating of an item a user has not rated, and use the actual rating value to represent the rating of an item a user has rated.
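As an illustration of Step 1, the following minimal Python sketch builds the rating matrix from (user ID, item ID, rating) triples. The function name `build_rating_matrix` and the tiny sample data are assumptions made for this example; they do not come from the patent.

```python
def build_rating_matrix(ratings):
    """Build a users x items matrix from a list of (user_id, item_id, score) triples.

    Unrated cells hold 0; rated cells hold the actual score.
    """
    users = sorted({u for u, _, _ in ratings})
    items = sorted({i for _, i, _ in ratings})
    row = {u: k for k, u in enumerate(users)}   # user ID -> row index
    col = {i: k for k, i in enumerate(items)}   # item ID -> column index
    matrix = [[0] * len(items) for _ in users]
    for u, i, score in ratings:
        matrix[row[u]][col[i]] = score
    return matrix, users, items


# Hypothetical sample: three users, three items, four ratings.
sample = [(1, 10, 5), (1, 20, 3), (2, 20, 4), (3, 30, 1)]
matrix, users, items = build_rating_matrix(sample)
```

The returned `users` and `items` lists record which ID each row and column stands for, so later steps can map matrix indices back to IDs.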
Step 2, reduce the dimensionality of the user-item rating matrix.
Reduce the dimensionality of the user-item rating matrix with the t-distributed stochastic neighbor embedding (t-SNE) algorithm to obtain the low-dimensional matrix.
Step 3, cluster the users in the low-dimensional matrix several times with a clustering algorithm.
Using the clustering algorithm, cluster the users in the low-dimensional matrix into Ψ classes, obtaining the class to which each user belongs among the Ψ classes; Ψ is 20 for the MovieLens-100K rating data set and 64 for the MovieLens-1M rating data set.
The steps of the clustering algorithm are as follows:
A. Randomly select Ψ users from the low-dimensional matrix as the Ψ initial cluster centers; each cluster center corresponds to one user class.
B. Arbitrarily choose one user from the low-dimensional matrix as the user to be assigned.
C. From the Euclidean distances between the user to be assigned and the Ψ cluster centers, find the target cluster center with the smallest Euclidean distance to that user, and label the user with the user class of the target cluster center.
D. Check whether all users in the low-dimensional matrix have been chosen; if so, go to step E; otherwise, go to step B.
E. Arbitrarily choose one user class from all user classes as the target class.
F. Take the mean of the ratings of the users in the chosen target class as the cluster center of that class.
G. Check whether the cluster centers of all user classes are unchanged; if so, the user class of each user in the low-dimensional matrix has been obtained; otherwise, go to step B.
Using each user's class among the Ψ classes, obtain the class of the corresponding user in the user-item rating matrix.
After clustering the users in the low-dimensional matrix 5 times, obtain all user classes of the users in the user-item rating matrix.
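Steps A to G describe the familiar k-means scheme. A minimal sketch in plain Python follows; the function name `kmeans`, the `seed` and `max_iter` parameters, the handling of empty classes, and the toy points are assumptions for illustration. The sketch compares squared Euclidean distances, which selects the same nearest center as the Euclidean distance itself.

```python
import random


def kmeans(points, k, seed=0, max_iter=100):
    """Cluster `points` (a list of equal-length tuples) into k classes."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]        # step A
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Steps B-D: assign every user to the nearest cluster center.
        for n, p in enumerate(points):
            labels[n] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
        # Steps E-F: recompute each center as the mean of its members.
        new_centers = []
        for c in range(k):
            members = [points[n] for n in range(len(points)) if labels[n] == c]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[c])  # assumption: an empty class keeps its center
        if new_centers == centers:                             # step G: centers unchanged
            break
        centers = new_centers
    return labels, centers


# Toy low-dimensional points; cluster them 5 times with different seeds.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
clusterings = [kmeans(points, k=2, seed=s)[0] for s in range(5)]
```

Repeating the clustering with different random initial centers, as in the last line, gives each user one class label per run, which corresponds to clustering the users 5 times.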
Step 4, build the virtual user-item rating matrix.
1st step: arbitrarily choose one user class from the user-item rating matrix, take the mean of the ratings of the users in the chosen class as the cluster center of that class, and save the cluster center as a vector.
2nd step: check whether all user classes in the user-item rating matrix have been chosen; if so, go to the 3rd step of this step; otherwise, go to the 1st step of this step.
3rd step: assemble the vectors of the cluster centers of all user classes into the virtual user-item rating matrix.
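A possible reading of Step 4 is sketched below: every user class produced by every clustering run contributes one virtual user, whose rating vector is the per-item mean of the ratings of the class members. The function name `virtual_user_matrix` is hypothetical, and averaging over all members (zeros included) is an assumption, since the text does not say whether unrated entries are excluded from the mean.

```python
def virtual_user_matrix(rating_matrix, clusterings):
    """Return one row (cluster-center vector) per user class of each clustering.

    rating_matrix: list of user rating rows (0 = unrated).
    clusterings:   list of label lists, one label per user, one list per run.
    """
    virtual = []
    for labels in clusterings:
        for c in sorted(set(labels)):
            members = [rating_matrix[n] for n, lab in enumerate(labels) if lab == c]
            # Assumption: plain per-item mean over all members, zeros included.
            virtual.append([sum(col) / len(members) for col in zip(*members)])
    return virtual


# Hypothetical data: 3 users, 2 items, 2 clustering runs.
rating_matrix = [[5, 0], [3, 0], [0, 4]]
clusterings = [[0, 0, 1], [0, 1, 1]]
virtual = virtual_user_matrix(rating_matrix, clusterings)
```

With 5 runs of Ψ classes each, this construction would yield up to 5Ψ virtual users, far fewer rows than the original rating matrix.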
Step 5, build the optimization matrix.
1st step: using the clustering algorithm, cluster the users in the low-dimensional matrix into K classes, obtaining the class to which each user belongs among the K classes; K is 100 for the MovieLens-100K rating data set and 320 for the MovieLens-1M rating data set.
The steps of the clustering algorithm are as follows:
A. Randomly select K users from the low-dimensional matrix as the K initial cluster centers; each cluster center corresponds to one user class.
B. Arbitrarily choose one user from the low-dimensional matrix as the user to be assigned.
C. From the Euclidean distances between the user to be assigned and the K cluster centers, find the target cluster center with the smallest Euclidean distance to that user, and label the user with the user class of the target cluster center.
D. Check whether all users in the low-dimensional matrix have been chosen; if so, go to step E; otherwise, go to step B.
E. Arbitrarily choose one user class from all user classes as the target class.
F. Take the mean of the ratings of the users in the chosen target class as the cluster center of that class.
G. Check whether the cluster centers of all user classes are unchanged; if so, the user class of each user in the low-dimensional matrix has been obtained; otherwise, go to step B.
2nd step: using each user's class among the K classes, obtain the class of the corresponding user in the user-item rating matrix.
3rd step: arbitrarily choose one user class from the user-item rating matrix, take the mean of the ratings of the users in the chosen class as the cluster center of that class, and save the cluster center as a vector.
4th step: check whether all user classes in the user-item rating matrix have been chosen; if so, go to the 5th step of this step; otherwise, go to the 3rd step of this step.
5th step: assemble the vectors of the cluster centers of all user classes into the optimization matrix.
6th step: arbitrarily choose one user from the optimization matrix as the target user.
7th step: arbitrarily choose one item from the optimization matrix as the target item.
8th step: update the target user's rating of the target item according to the following formula.
where p_bj denotes the rating of the selected target user b for the selected target item j; |·| denotes the absolute-value operation (for a set, its number of elements); G_c denotes the set of users in the user-item rating matrix who belong to user class c and have rated the selected target item j; c denotes the user class of the cluster center corresponding to the selected target user b; and T_c denotes the set of users in the user-item rating matrix who belong to user class c.
9th step: check whether all users in the optimization matrix have been chosen; if so, go to the 10th step of this step; otherwise, go to the 6th step of this step.
10th step: check whether all items in the optimization matrix have been chosen; if so, go to Step 6; otherwise, go to the 7th step of this step.
Step 6, build the user-item training matrix and the user-item optimization matrix.
Build the user-item training matrix; its number of rows equals the number of users in the optimization matrix, and its number of columns equals the number of items in the optimization matrix.
Build the user-item optimization matrix; its number of rows equals the number of users in the optimization matrix, and its number of columns equals the number of items in the optimization matrix.
Step 7, update the user-item training matrix and the user-item optimization matrix.
Extract 80% of the rating data from the optimization matrix as training data; the remaining 20% of the rating data serves as optimization data.
Extract the rating information from the training data; in the user-item training matrix, set the rating of each item a user has not rated to 0 and the rating of each item a user has rated to the actual rating value.
Extract the rating information from the optimization data; in the user-item optimization matrix, set the rating of each item a user has not rated to 0 and the rating of each item a user has rated to the actual rating value.
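Steps 6 and 7 amount to splitting the non-zero entries of the optimization matrix 80/20 into two matrices of the same shape. The sketch below shows one way to do this; the function name `split_ratings`, the random shuffle, the `seed` parameter, and the rounding of the cut point are assumptions, since the text does not say how the 80% is drawn.

```python
import random


def split_ratings(opt_matrix, train_fraction=0.8, seed=0):
    """Split the non-zero ratings of opt_matrix into training and optimization matrices."""
    entries = [(r, c) for r, row in enumerate(opt_matrix)
               for c, value in enumerate(row) if value != 0]
    rng = random.Random(seed)
    rng.shuffle(entries)                       # assumption: uniform random split
    cut = round(train_fraction * len(entries))
    n_rows, n_cols = len(opt_matrix), len(opt_matrix[0])
    train = [[0] * n_cols for _ in range(n_rows)]   # user-item training matrix
    held = [[0] * n_cols for _ in range(n_rows)]    # user-item optimization matrix
    for k, (r, c) in enumerate(entries):
        target = train if k < cut else held
        target[r][c] = opt_matrix[r][c]
    return train, held


# Hypothetical optimization matrix with 10 ratings: 8 go to training, 2 are held out.
opt_matrix = [[1, 2, 3, 4, 5], [5, 4, 3, 2, 1]]
train, held = split_ratings(opt_matrix)
```

Every rating lands in exactly one of the two matrices, and the other matrix holds 0 at that position.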
Step 8, select the information core with an evolutionary algorithm.
1st step: extract the user IDs from the virtual user-item rating matrix to form the user ID set.
2nd step: take 60% of the length of the user ID set as the information core length.
3rd step: arbitrarily choose from the user ID set 100 user ID subsets whose length equals the information core length, forming the parent population.
4th step: apply crossover and mutation to each information core individual in the parent population to generate the transition population.
5th step: use the recommendation precision method to obtain the recommendation precision of each information core individual in the parent population and the transition population.
The steps of the recommendation precision method are as follows:
A. Compute the cosine similarity between each user in the user-item training matrix and each user in the information core individual according to the following formula.
$$\mathrm{sim}(u,v)=\frac{\sum_{i\in I(u)\cap I(v)} r_{ui}\,r_{vi}}{\sqrt{\sum_{i\in I(u)\cap I(v)} r_{ui}^{2}}\;\sqrt{\sum_{i\in I(u)\cap I(v)} r_{vi}^{2}}}$$
where sim(u, v) denotes the cosine similarity between the u-th user in the user-item training matrix and the v-th user in the information core individual; ∑ denotes summation; i denotes the i-th item in the intersection of the item sets I(u) and I(v); I(u) denotes the set of items rated by the u-th user in the user-item training matrix; I(v) denotes the set of items rated by the v-th user in the information core individual; ∈ denotes set membership; ∩ denotes set intersection; r_ui denotes the rating of the i-th item by the u-th user in the user-item training matrix; r_vi denotes the rating of the i-th item by the v-th user in the information core individual; and √ denotes the square-root operation.
B. Arbitrarily choose one user from the user-item training matrix as the experience user.
C. From the information core individual, choose the N users with the largest cosine similarity to the experience user to form the experience user's neighbor set.
D. Arbitrarily choose one item, as the experience item, from the set of items whose rating by the experience user is 0.
E. Compute the experience user's predicted rating of the experience item according to the following formula.
where p_ad denotes the predicted rating of the selected experience item d by the selected experience user a; s denotes the s-th user in the user set H_ad; H_ad denotes the set of users in the neighbor set of the selected experience user a who have rated the selected experience item d; sim(a, s) denotes the cosine similarity between the selected experience user a and the s-th user; and r_sd denotes the rating of the selected experience item d by the s-th user.
F. Check whether all items in the set of items whose rating by the experience user is 0 have been chosen; if so, go to step G; otherwise, go to step D.
G. Sort the items whose rating by the experience user is 0 in the user-item training matrix by predicted rating in descending order, and take the top-ranked items to form the experience user's recommendation list.
H. Compute the recommendation precision of the experience user according to the following formula.
$$pr_a=\frac{Q}{L}$$
where pr_a denotes the recommendation precision of the selected experience user a; Q denotes the number of items that the selected experience user a has rated in the user-item optimization matrix and that also appear in that user's recommendation list; and L denotes the number of items in the recommendation list of the selected experience user a.
I. Check whether the recommendation precision has been obtained for all users in the user-item training matrix; if so, go to step J; otherwise, go to step B.
J. Compute the recommendation precision of the information core individual according to the following formula.
$$PR=\frac{1}{|U|}\sum_{e\in U} pr_e$$
where PR denotes the recommendation precision of the information core individual; e denotes the e-th user in the user set U; U denotes the set of users in the user-item training matrix; and pr_e denotes the recommendation precision of the e-th user.
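Steps A to J can be sketched end to end as a single fitness function. The version below is a hedged illustration, not the patent's implementation: the names `cosine` and `core_precision`, the parameters `top_n` and `list_len`, and the toy matrices are hypothetical. Two points are assumptions in particular: the cosine similarity is taken over co-rated items only, and the predicted rating in step E is taken to be a similarity-weighted sum over the neighbors who rated the item (a normalized variant would divide by the sum of those similarities).

```python
import math


def cosine(u, v):
    """Cosine similarity over the items rated (non-zero) by both users (assumed form)."""
    common = [i for i in range(len(u)) if u[i] != 0 and v[i] != 0]
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = math.sqrt(sum(u[i] ** 2 for i in common))
    norm_v = math.sqrt(sum(v[i] ** 2 for i in common))
    return dot / (norm_u * norm_v)


def core_precision(train, optimization, core_rows, top_n=2, list_len=1):
    """Mean Q/L over all users of `train` for one information core individual.

    core_rows: rating rows of the virtual users that make up the individual.
    """
    precisions = []
    for a, row in enumerate(train):                                   # steps B and I
        sims = [cosine(row, core) for core in core_rows]              # step A
        order = sorted(range(len(core_rows)), key=lambda s: sims[s], reverse=True)
        neighbors = order[:top_n]                                     # step C
        unrated = [d for d, r in enumerate(row) if r == 0]            # steps D and F
        predicted = {}
        for d in unrated:
            # Step E (ASSUMED form): similarity-weighted sum over neighbors who rated d.
            predicted[d] = sum(sims[s] * core_rows[s][d]
                               for s in neighbors if core_rows[s][d] != 0)
        rec_list = sorted(unrated, key=lambda d: predicted[d], reverse=True)[:list_len]  # step G
        if not rec_list:
            precisions.append(0.0)
            continue
        hits = sum(1 for d in rec_list if optimization[a][d] != 0)    # Q
        precisions.append(hits / len(rec_list))                       # step H: Q / L
    return sum(precisions) / len(precisions)                          # step J


# Hypothetical toy data: 2 training users, 3 items, a core of 2 virtual users.
train = [[5, 0, 0], [0, 4, 0]]
optimization = [[0, 3, 0], [0, 0, 2]]
core_rows = [[5, 4, 0], [0, 4, 3]]
pr = core_precision(train, optimization, core_rows)
```

In this toy case the first user is recommended item 1, which that user rated in the optimization matrix, and the second user is recommended item 0, which that user did not rate there, so the individual scores 0.5.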
6th step: sort the information core individuals of the parent population and the transition population by recommendation precision in descending order, and take the first 100 individuals of the ranking to form the offspring population.
7th step: replace each information core individual in the parent population with the corresponding information core individual of the offspring population, generating the new parent population.
8th step: check whether the current iteration count has reached the maximum of 100 iterations; if so, go to Step 9; otherwise, go to the 4th step of this step.
Step 9, complete the construction of the information core.
Choose from the parent population the information core individual with the highest recommendation precision as the best information core.
Step 10, carry out collaborative filtering recommendation.
Using the best information core found, recommend to each user in the user-item rating matrix the items that user needs.
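The loop of Steps 8 and 9 can be sketched as follows. The text does not specify the crossover and mutation operators, so the ones used here are assumptions chosen to keep every individual a duplicate-free ID subset of fixed length: crossover keeps the IDs shared by both parents and fills the rest from their remaining IDs, and mutation swaps an ID for one outside the individual. The function name `evolve`, its parameters, the rounding of the core length, and the toy fitness are likewise hypothetical.

```python
import random


def evolve(user_ids, fitness, pop_size=100, core_fraction=0.6,
           iterations=100, mutation_rate=0.1, seed=0):
    """Return the best information core (a sorted list of user IDs) found."""
    rng = random.Random(seed)
    core_len = max(1, round(core_fraction * len(user_ids)))          # 2nd step
    parents = [sorted(rng.sample(user_ids, core_len))                # 3rd step
               for _ in range(pop_size)]
    for _ in range(iterations):                                      # 8th step
        transition = []
        for individual in parents:                                   # 4th step
            mate = rng.choice(parents)
            # ASSUMED crossover: keep shared IDs, fill up from the parents' other IDs.
            common = set(individual) & set(mate)
            pool = sorted((set(individual) | set(mate)) - common)
            rng.shuffle(pool)
            child = sorted(common) + pool[:core_len - len(common)]
            # ASSUMED mutation: swap an ID with one not currently in the child.
            outside = [u for u in user_ids if u not in child]
            for k in range(core_len):
                if outside and rng.random() < mutation_rate:
                    j = rng.randrange(len(outside))
                    child[k], outside[j] = outside[j], child[k]
            transition.append(sorted(child))
        # 5th to 7th steps: rank parents and transition individuals, keep the best.
        ranked = sorted(parents + transition, key=fitness, reverse=True)
        parents = ranked[:pop_size]
    return max(parents, key=fitness)                                 # Step 9


# Toy run with a stand-in fitness (sum of IDs) instead of recommendation precision.
user_ids = list(range(10))
best = evolve(user_ids, fitness=sum, pop_size=20, iterations=30)
```

In a real run `fitness` would be the recommendation precision of the individual, which is expensive to compute, so caching the precision of individuals that survive from one generation to the next would avoid recomputing it.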
The effect of the present invention is further described below with reference to simulation experiments.
1. Simulation conditions:
The simulation experiments of the present invention were run in the following environment: a 64-bit Windows 7 operating system, an Intel (R) Core (TM) i3-CPU 550U 3.20GHz CPU, 4 GB of memory, and MATLAB 2017a as the programming environment.
2. Simulation data and evaluation index:
The simulation experiments of the present invention use two data sets common in the recommender system field, MovieLens-100K and MovieLens-1M. To verify the recommendation effect of the information core extracted by the present invention, each of the two data sets is split into a training data subset Train and a test data subset Test. The statistics of the two data sets are shown in Table 1 below.
Table 1. Data set statistics
Wherein, in Table 1, Dataset (original) denotes the original data set; Dataset (subset) denotes the subsets of the original data set, namely the training data subset Train and the test data subset Test; #User denotes the number of users; U denotes the user set; #Item denotes the number of items; I denotes the item set; #Ratings denotes the number of ratings; and R denotes the users' ratings of items.
The present invention uses recommendation precision (precision), an evaluation index common in the recommender system field, and calculates the recommendation precision of the information core on the test data subset as follows:
1st step, arbitrarily select a user from the test data subset as the target user, and calculate the recommendation precision of the information core for the target user according to the following formula.
Wherein, precision_u denotes the recommendation precision of the information core for the selected target user u in the test data subset, Q denotes the number of items that user u needs among the items the information core recommends to the selected target user u, and L denotes the number of items the information core recommends to the selected target user u.
2nd step, determine whether the recommendation precision of the information core has been obtained for all users in the test data subset; if so, execute the 3rd step; otherwise, execute the 1st step.
3rd step, calculate the recommendation precision of the information core on the test data subset according to the following formula.
Wherein, precision denotes the recommendation precision of the information core on the test data subset, u denotes the u-th user in the user set U, U denotes the set of users in the test data subset, and precision_u denotes the recommendation precision of the information core for the u-th user.
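A reading of the two formulas that is consistent with the symbol definitions above is given here as an assumption, not as the original expressions. In particular, averaging over the user set U in the second formula is inferred from the definitions of u, U, and precision_u.

```latex
\mathrm{precision}_u = \frac{Q}{L},
\qquad
\mathrm{precision} = \frac{1}{|U|} \sum_{u \in U} \mathrm{precision}_u
```

Under this reading, a recommendation list of length L = 20 that contains Q = 5 items the target user needs gives precision_u = 0.25.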
3. Simulation content and analysis of results:
The present invention has two simulation experiments.
Simulation experiment 1 is carried out on the training data subset Train and the test data subset Test obtained by splitting each of the two common data sets, with recommendation precision as the evaluation index and a recommendation list length of 20. It compares the recommendation precision of the item recommendation method of the present invention, which builds an information core based on clustering and an evolutionary algorithm, with that of four prior-art methods: the information core construction method based on ranking (Rank), the information core construction method based on frequency (Frequency), the information core construction method based on the common evolutionary algorithm (EA), and the method based on traditional collaborative filtering (CF). Because the method of the present invention and the information core construction method based on the common evolutionary algorithm EA are stochastic, each is run 10 times and the recommendation precision is averaged. The comparison results on the data set MovieLens-100K are shown in Table 2 and Fig. 2(a); the comparison results on the data set MovieLens-1M are shown in Table 3 and Fig. 2(b).
In Fig. 2(a) and Fig. 2(b), the ordinate denotes the recommendation precision and the abscissa denotes the number of neighbors. The curve marked with squares is the recommendation precision curve obtained by the information core construction method based on frequency (Frequency). The curve marked with diamonds is the recommendation precision curve obtained by the information core construction method based on ranking (Rank). The curve marked with circles is the recommendation precision curve obtained by the method based on traditional collaborative filtering (CF). The curve marked with asterisks is the recommendation precision curve obtained by the information core construction method based on the common evolutionary algorithm (EA). The curve marked with right triangles is the recommendation precision curve obtained by the method of the present invention.
As can be seen from Fig. 2(a) and Fig. 2(b), the curve of the present invention lies above the curves of the information core construction method based on frequency (Frequency), the information core construction method based on ranking (Rank), the method based on traditional collaborative filtering (CF), and the information core construction method based on the common evolutionary algorithm (EA), which shows that the recommendation precision of the present invention is the highest.
Table 2. Recommendation precision of each method on MovieLens-100K with a recommendation list length of 20
Table 3. Recommendation precision of each method on MovieLens-1M with a recommendation list length of 20
Wherein, in Tables 2 and 3, the number of neighbors denotes the number of users in a user's neighbor set; Frequency denotes the recommendation precision of the information core construction method based on frequency; Rank denotes the recommendation precision of the information core construction method based on ranking; CF denotes the recommendation precision of the method based on traditional collaborative filtering; EA denotes the recommendation precision of the information core construction method based on the common evolutionary algorithm; and "the present invention" denotes the recommendation precision of the method of the present invention.
As can be seen from Tables 2 and 3, on both common rating data sets, MovieLens-100K and MovieLens-1M, the recommendation precision of the present invention is higher than that of the four existing techniques.
Simulation experiment 2 is carried out on the training data subset Train and the test data subset Test obtained by splitting each of the two common data sets, with the time taken to select the information core offline as the evaluation index and the number of neighbors set to 10. It compares the time taken by the present invention to select the information core offline with that of the prior art (the information core construction method based on the common evolutionary algorithm EA). Because the method of the present invention and the information core construction method based on the common evolutionary algorithm EA are stochastic, each is run 10 times and the offline selection times are averaged. The comparison results are shown in Table 4.
Table 4. Time taken by the two compared methods to select the information core offline
As can be seen from Table 4, on both data sets, MovieLens-100K and MovieLens-1M, the information core construction method based on the common evolutionary algorithm EA takes longer to select the information core offline, while the method of the present invention takes less time, which shows that the present invention can select the information core offline more quickly.

Claims (4)

1. An item recommendation method for constructing an information core based on clustering and an evolutionary algorithm, characterized in that a virtual user-item rating matrix is built using a clustering algorithm, an information core is extracted from the virtual user-item rating matrix by an evolutionary algorithm, and the information core is used to recommend items to users; the method comprises the following steps:
(1) Build the user-item rating matrix:
(1a) Extract all rating information of users for items from the user-to-item rating data set and create the user-item rating matrix, whose number of rows equals the number of users in the rating data set and whose number of columns equals the number of items in the rating data set;
(1b) Use 0 as the rating value of each item that a user in the rating matrix has not rated, and use the actual rating value as the rating value of each item that a user in the rating matrix has rated;
(2) Reduce the dimensionality of the user-item rating matrix:
Reduce the dimensionality of the user-item rating matrix using the t-distributed stochastic neighbor embedding (t-SNE) algorithm to obtain a low-dimensional matrix;
(3) Cluster the users in the low-dimensional matrix multiple times using a clustering algorithm:
(3a) Using the clustering algorithm, cluster the users in the low-dimensional matrix into Ψ classes to obtain the class, among the Ψ classes, to which each user belongs;
(3b) Using the class of each user among the Ψ classes, obtain the class of the corresponding user in the user-item rating matrix;
(3c) After clustering the users in the low-dimensional matrix 5 times, obtain all the user classes corresponding to each user in the user-item rating matrix;
(4) Build the virtual user-item rating matrix:
(4a) Arbitrarily select a user class from the user-item rating matrix, take the mean of the ratings of the users in the selected user class as the cluster center of the selected class, and save the cluster center of the selected class as a vector;
(4b) Determine whether all user classes in the user-item rating matrix have been selected; if so, execute step (4d); otherwise, execute step (4a);
(4d) Form the virtual user-item rating matrix from the vectors corresponding to the cluster centers of all user classes;
(5) Build the optimization matrix:
(5a) Using the clustering algorithm, cluster the users in the low-dimensional matrix into K classes to obtain the class, among the K classes, to which each user belongs;
(5b) Using the class of each user among the K classes, obtain the class of the corresponding user in the user-item rating matrix;
(5c) Arbitrarily select a user class from the user-item rating matrix, take the mean of the ratings of the users in the selected user class as the cluster center of the selected class, and save the cluster center of the selected class as a vector;
(5d) Determine whether all user classes in the user-item rating matrix have been selected; if so, execute step (5e); otherwise, execute step (5c);
(5e) Form the optimization matrix from the vectors corresponding to the cluster centers of all user classes;
(5f) Arbitrarily select a user from the optimization matrix as the target user;
(5g) Arbitrarily select an item from the optimization matrix as the target item;
(5h) Update the target user's rating value for the target item according to the following formula:
Wherein, p_bj denotes the rating value of target user b for target item j, | · | denotes the absolute value operation, G_c denotes the set of users in the user-item rating matrix who belong to user class c and have rated target item j, c denotes the user class of the cluster center corresponding to target user b, and T_c denotes the set of users in the user-item rating matrix who belong to user class c;
(5i) Determine whether all users in the optimization matrix have been selected; if so, execute step (5j); otherwise, execute step (5f);
(5j) Determine whether all items in the optimization matrix have been selected; if so, execute step (6); otherwise, execute step (5g);
(6) Build the user-item training matrix and the user-item optimization matrix:
(6a) Build the user-item training matrix, whose number of rows equals the number of users in the optimization matrix and whose number of columns equals the number of items in the optimization matrix;
(6b) Build the user-item optimization matrix, whose number of rows equals the number of users in the optimization matrix and whose number of columns equals the number of items in the optimization matrix;
(7) Update the user-item training matrix and the user-item optimization matrix:
(7a) Extract 80% of the rating data from the optimization matrix as training data, and use the remaining 20% of the rating data as optimization data;
(7b) Extract the rating information from the training data; in the user-item training matrix, set the rating value of each item a user has not rated to 0 and set the rating value of each item a user has rated to the actual rating value;
(7c) Extract the rating information from the optimization data; in the user-item optimization matrix, set the rating value of each item a user has not rated to 0 and set the rating value of each item a user has rated to the actual rating value;
(8) Select the information core using the evolutionary algorithm:
(8a) Extract the user IDs from the virtual user-item rating matrix to form the user ID set;
(8b) Take 60% of the length of the user ID set as the information core length;
(8c) Arbitrarily select, from the user ID set, 100 user ID subsets whose length equals the information core length to form the parent population;
(8d) Apply crossover and mutation operations to each information core individual in the parent population to generate the transition population;
(8e) Obtain the recommendation precision of each information core individual in the parent population and the transition population using the recommendation precision method;
(8f) Sort the information core individuals in the parent population and the transition population by recommendation precision in descending order, and select the first 100 information core individuals from the ranking to form the offspring population;
(8g) Replace each information core individual in the parent population with the corresponding information core individual in the offspring population to generate a new parent population;
(8h) Determine whether the current number of iterations has reached the maximum of 100; if so, execute step (9); otherwise, execute step (8d);
(9) Complete the construction of the information core:
Select from the parent population the information core individual with the highest recommendation precision as the best information core;
(10) Recommend items to users using the information core:
Using the best information core found, recommend to each user in the user-item rating matrix the items that user needs.
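Steps (1) and (4) of claim 1 can be sketched as follows: a rating matrix is filled from rating records with 0 for unrated items, and each user class is collapsed into one virtual user whose row is the mean rating vector of that class. This is a minimal illustration under stated assumptions: users and items are identified by zero-based integer indices, the class labels are supplied by some earlier clustering step, and unrated items (zeros) are included in the mean. All function names are hypothetical.

```python
def build_rating_matrix(ratings, n_users, n_items):
    """Rows are users, columns are items; unrated entries stay 0 (steps 1a and 1b)."""
    matrix = [[0.0] * n_items for _ in range(n_users)]
    for user, item, score in ratings:
        matrix[user][item] = float(score)
    return matrix


def virtual_users(matrix, labels):
    """One vector per user class: the mean rating vector of its members (step 4a)."""
    groups = {}
    for row, label in zip(matrix, labels):
        groups.setdefault(label, []).append(row)
    return {
        label: [sum(column) / len(rows) for column in zip(*rows)]
        for label, rows in groups.items()
    }


# Toy data: (user index, item index, rating) records and one class label per user.
records = [(0, 0, 4), (0, 1, 2), (1, 0, 2), (2, 2, 5)]
matrix = build_rating_matrix(records, n_users=3, n_items=3)
centers = virtual_users(matrix, labels=[0, 0, 1])
```

Stacking the vectors in `centers` row by row gives a much smaller matrix than the original, which is what makes the later evolutionary search over user IDs cheaper.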
2. The item recommendation method for constructing an information core based on clustering and an evolutionary algorithm according to claim 1, characterized in that the rating information described in step (1a), step (7b), and step (7c) includes the user ID, the item ID, and the user's rating value for the item.
3. The item recommendation method for constructing an information core based on clustering and an evolutionary algorithm according to claim 1, characterized in that the clustering algorithm described in step (3a) and step (5a) is as follows:
First step, randomly select N users from the low-dimensional matrix as the initial N cluster centers, each cluster center corresponding to one user class;
Second step, arbitrarily select a user from the low-dimensional matrix as the user to be assigned;
Third step, according to the Euclidean distances between the user to be assigned and the N cluster centers, obtain the target cluster center with the smallest Euclidean distance to the user to be assigned, and label the user to be assigned with the user class corresponding to the target cluster center;
Fourth step, determine whether all users in the low-dimensional matrix have been selected; if so, execute the fifth step; otherwise, execute the second step;
Fifth step, arbitrarily select a user class from all user classes as the target class;
Sixth step, take the mean of the ratings of the users in the selected target user class as the cluster center of the selected target user class;
Seventh step, determine whether the cluster centers of all user classes no longer change; if so, obtain the user class corresponding to each user in the low-dimensional matrix; otherwise, execute the second step.
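The clustering procedure of claim 3 follows the familiar k-means pattern and can be sketched as follows. This is an illustrative sketch rather than the patented implementation: it assumes each user is a tuple of low-dimensional coordinates, recomputes the centers of all classes in every round, keeps the old center when a class becomes empty, and adds a round limit as a safeguard. The function name and parameters are hypothetical.

```python
import random


def kmeans(points, n_clusters, max_rounds=100, seed=0):
    """Assign each point to its nearest center, then move centers to class means."""
    rng = random.Random(seed)
    # First step: pick n_clusters users at random as the initial cluster centers.
    centers = [list(p) for p in rng.sample(points, n_clusters)]
    labels = [0] * len(points)
    for _ in range(max_rounds):
        # Second to fourth steps: label every user with the nearest center.
        # Squared Euclidean distance gives the same nearest center as the distance itself.
        for idx, p in enumerate(points):
            labels[idx] = min(
                range(n_clusters),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
        # Fifth and sixth steps: the new center of a class is the mean of its members.
        new_centers = []
        for c in range(n_clusters):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[c])  # assumption: keep an empty class's center
        # Seventh step: stop once no cluster center changes.
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


# Toy data: two well-separated groups of users in a 2-D embedding.
users_2d = [(0, 0), (0, 1), (10, 10), (10, 11)]
labels, centers = kmeans(users_2d, n_clusters=2)
```

Because the initial centers are chosen at random, repeated runs with different seeds can give different class assignments, which is consistent with claim 1 clustering the users several times.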
4. The item recommendation method for constructing an information core based on clustering and an evolutionary algorithm according to claim 1, characterized in that the recommendation precision method described in step (8e) is as follows:
First step, calculate, according to the following formula, the cosine similarity between each user in the user-item training matrix and each user in the information core individual;
Wherein, sim(u, v) denotes the cosine similarity between the u-th user in the user-item training matrix and the v-th user in the information core individual, ∑ denotes the summation operation, i denotes the i-th item in the intersection of item set I(u) and item set I(v), I(u) denotes the set of items rated by the u-th user in the user-item training matrix, I(v) denotes the set of items rated by the v-th user in the information core individual, ∈ denotes the set membership symbol, ∩ denotes the intersection operation, r_ui denotes the rating value of the u-th user in the user-item training matrix for the i-th item, r_vi denotes the rating value of the v-th user in the information core individual for the i-th item, and √ denotes the square root operation;
Second step, arbitrarily select a user from the user-item training matrix as the experience user;
Third step, from the information core individual, select the top N users with the greatest cosine similarity to the experience user to form the neighbor set of the experience user;
Fourth step, arbitrarily select an item, as the experience item, from the set of items for which the experience user's rating is 0;
Fifth step, calculate the predicted rating value of the experience user for the experience item according to the following formula:
Wherein, p_ad denotes the predicted rating value of the selected experience user a for the selected experience item d, s denotes the s-th user in user set H_ad, H_ad denotes the set of users in the neighbor set of the selected experience user a who have rated the selected experience item d, sim(a, s) denotes the cosine similarity between the selected experience user a and the s-th user, and r_sd denotes the rating value of the s-th user for the selected experience item d;
Sixth step, determine whether all items in the set of items for which the experience user's rating is 0 have been selected; if so, execute the seventh step; otherwise, execute the fourth step;
Seventh step, sort the items rated 0 by the experience user in the user-item training matrix by predicted rating value in descending order, and select the top-ranked items to form the recommendation list of the experience user;
Eighth step, calculate the recommendation precision of the experience user according to the following formula:
Wherein, pr_a denotes the recommendation precision of the selected experience user a, Q denotes the number of items that the selected experience user a has rated in the user-item optimization matrix and that also appear in its recommendation list, and L denotes the number of items in the recommendation list of the selected experience user a;
Ninth step, determine whether the recommendation precision of every user in the user-item training matrix has been obtained; if so, execute the tenth step; otherwise, execute the second step;
Tenth step, calculate the recommendation precision of the information core individual according to the following formula:
Wherein, PR denotes the recommendation precision of the information core individual, e denotes the e-th user in the user set U, U denotes the set of users in the user-item training matrix, and pr_e denotes the recommendation precision of the e-th user.
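The first to seventh steps of claim 4 can be sketched as follows. This is a hedged illustration, not the claimed formulas: it assumes that the cosine similarity sums over the co-rated items only, in both the numerator and the norms, and that the predicted rating is the similarity-weighted average of the neighbors' ratings. Each user is represented as a dictionary from item ID to rating that holds rated items only, and all names are hypothetical.

```python
import math


def cosine_similarity(ratings_u, ratings_v):
    """Cosine similarity over the items both users have rated (assumed form)."""
    common = set(ratings_u) & set(ratings_v)
    if not common:
        return 0.0
    dot = sum(ratings_u[i] * ratings_v[i] for i in common)
    norm_u = math.sqrt(sum(ratings_u[i] ** 2 for i in common))
    norm_v = math.sqrt(sum(ratings_v[i] ** 2 for i in common))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(target, core, n_neighbors, list_length):
    """Rank the target's unrated items by a similarity-weighted average rating."""
    sims = {name: cosine_similarity(target, r) for name, r in core.items()}
    # Neighbor set: the n_neighbors core users most similar to the target.
    neighbors = sorted(sims, key=sims.get, reverse=True)[:n_neighbors]
    # Candidate items: rated by some neighbor but not by the target.
    candidates = {i for n in neighbors for i in core[n]} - set(target)
    scores = {}
    for item in candidates:
        raters = [n for n in neighbors if item in core[n]]  # plays the role of H_ad
        weight = sum(sims[n] for n in raters)
        if weight > 0:
            scores[item] = sum(sims[n] * core[n][item] for n in raters) / weight
    return sorted(scores, key=scores.get, reverse=True)[:list_length]


# Toy data: one experience user and an information core of two users.
experience_user = {"a": 5, "b": 3}
information_core = {
    "v1": {"a": 5, "b": 3, "c": 4},
    "v2": {"a": 1, "b": 5, "d": 2},
}
top_items = recommend(experience_user, information_core, n_neighbors=2, list_length=2)
```

Only users inside the information core are searched for neighbors, so the cost of the neighbor search depends on the core length rather than on the full number of users.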
CN201810550780.5A 2018-05-31 2018-05-31 Article recommendation method for constructing information core based on clustering and evolutionary algorithm Active CN108776919B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810550780.5A CN108776919B (en) 2018-05-31 2018-05-31 Article recommendation method for constructing information core based on clustering and evolutionary algorithm

Publications (2)

Publication Number Publication Date
CN108776919A true CN108776919A (en) 2018-11-09
CN108776919B CN108776919B (en) 2021-07-20

Family

ID=64028262

Country Status (1)

Country Link
CN (1) CN108776919B (en)

Also Published As

Publication number Publication date
CN108776919B (en) 2021-07-20

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant