Summary of the invention
In view of the above deficiencies in the prior art, the object of the present invention is to provide a method and system for implementing personalized recommendation, intended to solve the problems of existing recommendation methods: inaccurate recommended content, data sparsity, complicated data processing, and difficulty in balancing diversity and accuracy.
The technical solution of the present invention is as follows:
A method for implementing personalized recommendation, comprising the steps of:
A. acquiring user behavior feature data, and obtaining an original user behavior feature data set from the user behavior feature data;
B. performing kernel space mapping on the original user behavior feature data set using a heterogeneous multi-kernel matrix to obtain a kernel space data set;
C. training on the original user behavior feature data set with the heterogeneous multi-kernel matrix to establish an initial heterogeneous multi-kernel user behavior detection criterion, and reducing the kernel space data set with the established initial criterion to obtain a reduced data set;
D. inputting the reduced data set, the heterogeneous multi-kernel matrix and the initial heterogeneous multi-kernel user behavior detection criterion into a kernel space clustering algorithm, which outputs an optimized heterogeneous multi-kernel user behavior detection criterion and a clustering result;
E. performing classification detection on the reduced data set according to the optimized heterogeneous multi-kernel user behavior detection criterion and the clustering result to obtain a classification result set, and recommending to the user, according to the classification result set, content associated with the user behavior feature category.
The smart TV personalized recommendation method described above, wherein step B specifically comprises:
B1. presetting a group of kernel functions, calculating the kernel matrix of each kernel function on the original user behavior feature data set, and determining the optimal working parameters of each kernel matrix;
B2. calculating the optimal combination coefficients of the heterogeneous multi-kernel matrix using the interior-point method of semidefinite programming together with the optimal working parameters of each kernel matrix;
B3. linearly combining the kernel matrices according to the optimal combination coefficients to obtain the heterogeneous multi-kernel matrix $K = \sum_{i} \mu_i K_i$, where $K_i$ is the $i$-th kernel matrix and $\mu_i$ is the coefficient of that kernel matrix;
B4. performing kernel space mapping on the user behavior feature data using this heterogeneous multi-kernel matrix to obtain the kernel space data set.
The smart TV personalized recommendation method described above, wherein step C specifically comprises:
C1. determining the initial heterogeneous multi-kernel user behavior detection criterion from the heterogeneous multi-kernel matrix and the intermediate solution results of the semidefinite programming;
C2. reducing the kernel space data set with the initial heterogeneous multi-kernel user behavior detection criterion to obtain the reduced data set.
The smart TV personalized recommendation method described above, wherein step D specifically comprises:
D1. mapping the data objects in the reduced data set into the kernel space by the heterogeneous multi-kernel matrix to obtain mapped data objects;
D2. selecting m objects from the reduced data set as quasi-initial center points, and preliminarily partitioning the mapped data objects with the quasi-initial center points;
D3. within each class obtained by the preliminary partition, adjusting the quasi-initial center point to obtain a final initial center point;
D4. re-partitioning the mapped data objects according to the final initial center points, the resulting optimized classes serving as the clustering result;
D5. adding a center point candidate set from which replacements for the initial center points are drawn, and iteratively updating the center points until they no longer change, thereby optimizing the heterogeneous multi-kernel user behavior detection criterion.
The smart TV personalized recommendation method described above, wherein in step D3 the initial center point is adjusted as follows: each data object in each class is taken in turn as the class center, the sum of the distances between that class center and the other data objects in the same class is calculated, and the class center with the smallest distance sum is taken as the final initial center point.
The smart TV personalized recommendation method described above, wherein in step D5 the center point candidate set is obtained by calculating the distances in kernel space between the data objects in the reduced data set and collecting candidate points accordingly to form the center point candidate set.
A system for implementing personalized recommendation, comprising:
an original user behavior feature data set acquisition module, for acquiring user behavior feature data and obtaining an original user behavior feature data set from the user behavior feature data;
a kernel space mapping module, for performing kernel space mapping on the original user behavior feature data set using a heterogeneous multi-kernel matrix to obtain a kernel space data set;
a data set reduction module, for training on the original user behavior feature data set with the heterogeneous multi-kernel matrix to establish an initial heterogeneous multi-kernel user behavior detection criterion, and reducing the kernel space data set with the established initial criterion to obtain a reduced data set;
an optimization module, for inputting the reduced data set, the heterogeneous multi-kernel matrix and the initial heterogeneous multi-kernel user behavior detection criterion into a kernel space clustering algorithm, which outputs an optimized heterogeneous multi-kernel user behavior detection criterion and a clustering result;
a recommendation module, for performing classification detection on the reduced data set according to the optimized heterogeneous multi-kernel user behavior detection criterion and the clustering result to obtain a classification result set, and recommending to the user, according to the classification result set, content associated with the user behavior feature category.
The system for implementing personalized recommendation described above, wherein the kernel space mapping module specifically comprises:
a kernel matrix acquisition unit, for presetting a group of kernel functions, calculating the kernel matrix of each kernel function on the original user behavior feature data set, and determining the optimal working parameters of each kernel matrix;
an optimal combination coefficient calculation unit, for calculating the optimal combination coefficients of the heterogeneous multi-kernel matrix using the interior-point method of semidefinite programming together with the optimal working parameters of each kernel matrix;
a linear combination unit, for linearly combining the kernel matrices according to the optimal combination coefficients to obtain the heterogeneous multi-kernel matrix $K = \sum_{i} \mu_i K_i$, where $K_i$ is the $i$-th kernel matrix and $\mu_i$ is the coefficient of that kernel matrix;
a kernel space mapping unit, for performing kernel space mapping on the user behavior feature data using this heterogeneous multi-kernel matrix to obtain the kernel space data set.
The system for implementing personalized recommendation described above, wherein the data set reduction module comprises:
an initial criterion determination unit, for determining the initial heterogeneous multi-kernel user behavior detection criterion from the heterogeneous multi-kernel matrix and the intermediate solution results of the semidefinite programming;
a reduction unit, for reducing the kernel space data set with the initial heterogeneous multi-kernel user behavior detection criterion to obtain the reduced data set.
The system for implementing personalized recommendation described above, wherein the optimization module comprises:
a data object mapping unit, for mapping the data objects in the reduced data set into the kernel space by the heterogeneous multi-kernel matrix to obtain mapped data objects;
a preliminary partition unit, for selecting m objects from the reduced data set as quasi-initial center points, and preliminarily partitioning the mapped data objects with the quasi-initial center points;
an initial center point adjustment unit, for adjusting the quasi-initial center point within each class obtained by the preliminary partition to obtain a final initial center point;
a re-partition unit, for re-partitioning the mapped data objects according to the final initial center points, the resulting optimized classes serving as the clustering result;
a criterion optimization unit, for adding a center point candidate set from which replacements for the initial center points are drawn, and iteratively updating the center points until they no longer change, thereby optimizing the heterogeneous multi-kernel user behavior detection criterion.
Beneficial effects: the present invention uses a heterogeneous multi-kernel matrix to perform kernel space mapping on the user behavior feature data, and uses an optimized kernel space clustering algorithm to cluster the reduced data set, obtaining a clustering result and an optimized heterogeneous multi-kernel user behavior detection criterion. The user's preference features are then classified and detected to obtain a classification result set, from which corresponding content is recommended to the user. The present invention improves the accuracy and efficiency of the recommended content and the user's satisfaction, and also reduces the time complexity of data training, thereby improving recommendation efficiency.
Detailed description of the embodiments
The present invention provides a method and system for implementing personalized recommendation. To make the object, technical solution and effects of the present invention clearer and more definite, the present invention is described in further detail below. It should be understood that the specific embodiments described herein serve only to explain the present invention and are not intended to limit it.
Refer to Fig. 1, which shows a method for implementing personalized recommendation according to the present invention. As shown, a preferred embodiment comprises the steps of:
S101. acquiring user behavior feature data, and obtaining an original user behavior feature data set from the user behavior feature data;
S102. performing kernel space mapping on the original user behavior feature data set using a heterogeneous multi-kernel matrix to obtain a kernel space data set;
S103. training on the original user behavior feature data set with the heterogeneous multi-kernel matrix to establish an initial heterogeneous multi-kernel user behavior detection criterion, and reducing the kernel space data set with the established initial criterion to obtain a reduced data set;
S104. inputting the reduced data set, the heterogeneous multi-kernel matrix and the initial heterogeneous multi-kernel user behavior detection criterion into a kernel space clustering algorithm, which outputs an optimized heterogeneous multi-kernel user behavior detection criterion and a clustering result;
S105. performing classification detection on the reduced data set according to the optimized heterogeneous multi-kernel user behavior detection criterion and the clustering result to obtain a classification result set, and recommending to the user, according to the classification result set, content associated with the user behavior feature category.
Taking personalized recommendation on a smart TV as an example, the above steps are described in detail below.
In step S101, the user first enters the example application interface, and the user behavior feature data are acquired and combined into the original user behavior feature data set. If the user is entering the example application interface for the first time, default recommended content is displayed. The layout of the example application interface is shown in Fig. 2 and can of course be adjusted as needed.
The user behavior feature data comprise:
1. access traces left when the user uses the smart TV;
2. improvement requests and usage feedback that the user submits about the smart TV;
3. evaluation information or ratings given after the user uses smart TV application software or watches film and television content;
4. traces of the user accessing the smart TV's recommended content through other channels;
5. tags left when the user uses the network.
The above can serve as the main sources of user behavior feature data; other user behavior feature data can of course be obtained in other ways as required.
In step S102, a heterogeneous multi-kernel matrix is first obtained and used to perform kernel space mapping on the original user behavior feature data set. In the present invention, the heterogeneous multi-kernel matrix is a new kernel matrix formed by linearly combining a plurality of different kernel matrices.
Because user behavior on a smart TV is random and sparse, the user behavior feature data acquired in step S101 are high-dimensional and nonlinear, which makes the user's behavior features hard to identify and lowers the effectiveness of the recommended content. Mapping into kernel space can handle high-dimensional, nonlinear data and reduce the dimension of the data set while making the data linearly separable. Because each kernel function has different characteristics, the data distributions obtained after mapping also differ. The embodiment of the present invention therefore linearly combines the kernel matrices corresponding to several different kernel functions, so as to suit the characteristics of the user behavior feature data and improve the effectiveness of the recommended content.
Specifically, as shown in Fig. 3, step S102 can be refined into the following steps:
S201. presetting a group of kernel functions, calculating the kernel matrix of each kernel function on the original user behavior feature data set, and determining the optimal working parameters of each kernel matrix;
In the present embodiment, the kernel functions adopted include the Gaussian kernel function, the polynomial kernel function and the perceptron kernel function; the kernel matrices corresponding to these kernel functions are linearly combined to obtain the heterogeneous multi-kernel matrix.
S202. calculating the optimal combination coefficients of the heterogeneous multi-kernel matrix using the interior-point method of semidefinite programming together with the optimal working parameters of each kernel matrix;
S203. linearly combining the kernel matrices according to the optimal combination coefficients to obtain the heterogeneous multi-kernel matrix $K = \sum_{i} \mu_i K_i$, where $K_i$ is the $i$-th kernel matrix and $\mu_i$ is the coefficient of that kernel matrix;
S204. performing kernel space mapping on the user behavior feature data using this heterogeneous multi-kernel matrix to obtain the kernel space data set, and also calculating, by the Euclidean distance formula, the kernel distance of each data item of the kernel space data set, where the kernel distance is the point-to-point distance between data in kernel space.
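The kernel matrix computation, linear combination and kernel distance of steps S201 to S204 can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the kernel parameters (`sigma`, `degree`, `alpha`, `c`), the toy feature vectors and the combination coefficients are hypothetical placeholders, and the semidefinite programming step that would produce the optimal coefficients is not implemented here. The kernel distance uses the standard identity d(i, j)^2 = K_ii - 2 K_ij + K_jj for Euclidean distance in the space induced by a kernel matrix K.

```python
import math

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian kernel; sigma is an assumed working parameter."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2.0 * sigma ** 2))

def polynomial_kernel(x, y, degree=2, c=1.0):
    """Polynomial kernel (x.y + c)^degree."""
    return (sum(a * b for a, b in zip(x, y)) + c) ** degree

def perceptron_kernel(x, y, alpha=0.1, c=0.0):
    """Perceptron (sigmoid) kernel tanh(alpha * x.y + c)."""
    return math.tanh(alpha * sum(a * b for a, b in zip(x, y)) + c)

def kernel_matrix(data, kernel):
    """Kernel matrix with entry [i][j] = kernel(data[i], data[j])."""
    return [[kernel(xi, xj) for xj in data] for xi in data]

def combine_kernels(matrices, coeffs):
    """Linear combination sum_i coeffs[i] * matrices[i]."""
    n = len(matrices[0])
    return [[sum(mu * K[r][c] for mu, K in zip(coeffs, matrices))
             for c in range(n)] for r in range(n)]

def kernel_distance(K, i, j):
    """Euclidean distance between objects i and j in kernel space.

    The squared value is clamped at zero because the perceptron kernel
    is not guaranteed to be positive semidefinite.
    """
    sq = K[i][i] - 2.0 * K[i][j] + K[j][j]
    return math.sqrt(max(sq, 0.0))

# Toy feature vectors (hypothetical values).
data = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
mats = [kernel_matrix(data, k)
        for k in (gaussian_kernel, polynomial_kernel, perceptron_kernel)]
coeffs = [0.5, 0.3, 0.2]  # placeholders for the coefficients from S202
K = combine_kernels(mats, coeffs)
```

Only the combined matrix `K` is needed afterwards: every distance used by the later clustering steps can be read off it without constructing the mapped vectors explicitly.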
In step S103, the user behavior feature data are trained with the heterogeneous multi-kernel matrix to obtain an initial heterogeneous multi-kernel user behavior detection criterion, which is then used to reduce the initial kernel space data set. Specifically, as shown in Fig. 4, step S103 can be refined into the following steps:
S301. determining the initial heterogeneous multi-kernel user behavior detection criterion from the heterogeneous multi-kernel matrix and the intermediate solution results of the semidefinite programming;
S302. reducing the kernel space data set with the initial heterogeneous multi-kernel user behavior detection criterion to obtain the reduced data set.
The reduced data set, also called the boundary support data set, is the set of data lying on or close to the classification criterion line. These data have a large influence on the feature classification result and may be called feature vectors, whereas the data removed by the reduction have little influence on the classification result and are redundant, so they need to be weeded out.
In step S104, the reduced data set, the heterogeneous multi-kernel matrix and the initial heterogeneous multi-kernel user behavior detection criterion obtained above are input into the kernel space clustering algorithm, and the heterogeneous multi-kernel matrix is used to perform kernel space mapping on the reduced data set, yielding an optimized kernel space data set. As shown in Fig. 5, this specifically comprises the steps of:
S401. mapping the data objects in the reduced data set into the kernel space by the heterogeneous multi-kernel matrix to obtain mapped data objects;
S402. selecting m objects from the reduced data set as quasi-initial center points, and preliminarily partitioning the mapped data objects with the quasi-initial center points;
S403. within each class obtained by the preliminary partition, adjusting the quasi-initial center point to obtain a final initial center point;
S404. re-partitioning the mapped data objects according to the final initial center points, the resulting optimized classes serving as the clustering result;
S405. adding a center point candidate set from which replacements for the initial center points are drawn, and iteratively updating the center points until they no longer change, thereby optimizing the heterogeneous multi-kernel user behavior detection criterion.
In the present embodiment, the kernel space clustering algorithm is obtained by optimizing and improving the original kernel space K-means clustering algorithm.
The improvements are mainly as follows:
1. In step S403, the initial center points are optimized and adjusted within a small range. In the original kernel space clustering algorithm the initial center points are chosen at random. The embodiment of the present invention adjusts them: after the quasi-initial center points are chosen at random, the data objects are partitioned; within each class obtained by the partition, the quasi-initial center point is adjusted by taking each data object in the class in turn as the class center, calculating the sum of the distances between that class center and the other data objects in the class, and taking the class center with the smallest distance sum as the final initial center point.
2. In step S405, a center point candidate set is added for replacing center points. If one of the center points needs to be replaced, suitable candidate points exist around that center point (that is, in its own class or in nearby classes), so only these candidate points need to be searched to complete the replacement. Candidate points can be collected using the kernel distances calculated in the preceding steps, that is, the distances in kernel space between the data objects in the reduced data set are calculated and the candidate points are collected into the center point candidate set. Each time the center points are updated, the candidate points are collected first, and the candidate set keeps growing over the iterations. As the number of iterations increases, the search range of the candidate set can extend to all non-center objects in the data set, covering the whole data set and improving clustering precision.
The improved kernel space clustering algorithm significantly reduces the number of clustering iterations, lowers the time complexity, and improves clustering precision.
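The center point adjustment of step S403 amounts to picking, in each class, the member with the smallest summed distance to the other members. A minimal Python sketch, assuming the pairwise kernel distances are already available as a matrix; the function name `adjust_center` and the toy distance matrix `D` are illustrative only:

```python
def adjust_center(members, dist):
    """Return the member whose summed distance to the others is smallest."""
    return min(
        members,
        key=lambda c: sum(dist(c, o) for o in members if o != c),
    )

# Hypothetical pairwise kernel distances for a class of three objects.
D = [
    [0.0, 1.0, 4.0],
    [1.0, 0.0, 2.0],
    [4.0, 2.0, 0.0],
]
members = [0, 1, 2]
center = adjust_center(members, lambda a, b: D[a][b])
```

Here the distance sums are 5.0, 3.0 and 6.0 for objects 0, 1 and 2, so object 1 becomes the final initial center point of the class.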
The input and output of the kernel space clustering algorithm in the present invention are as follows:
Input: the number of classes m, the heterogeneous multi-kernel matrix, a training data set containing n objects, and the initial heterogeneous multi-kernel user behavior detection criterion.
Output: m classes and the optimized heterogeneous multi-kernel user behavior detection criterion.
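The overall flow of steps S401 to S405 can be sketched as a medoid-style clustering loop over a precomputed kernel matrix. This is one possible reading under stated assumptions, not the patented algorithm itself: the swap acceptance rule (accept a candidate when the total within-class distance decreases), the candidate search width that grows by one per iteration, and the names `kernel_cluster`, `assign`, `medoid` and `total_cost` are choices made for illustration. The detection criterion is omitted; only the number of classes m and the kernel matrix K are taken as input.

```python
import math
import random

def kernel_dist(K, i, j):
    """Distance between objects i and j in kernel space."""
    return math.sqrt(max(K[i][i] - 2.0 * K[i][j] + K[j][j], 0.0))

def assign(K, centers):
    """Assign each object to its nearest center; a center keeps its own class."""
    clusters = {c: [] for c in centers}
    for i in range(len(K)):
        if i in clusters:
            owner = i
        else:
            owner = min(centers, key=lambda c: kernel_dist(K, i, c))
        clusters[owner].append(i)
    return clusters

def medoid(K, members):
    """Member with the smallest summed distance to the rest of its class."""
    return min(members, key=lambda c: sum(kernel_dist(K, c, o) for o in members))

def total_cost(K, clusters):
    """Sum of distances from every object to the center of its class."""
    return sum(kernel_dist(K, c, o) for c, mem in clusters.items() for o in mem)

def kernel_cluster(K, m, seed=0, max_iter=100):
    n = len(K)
    rng = random.Random(seed)
    centers = rng.sample(range(n), m)                        # quasi-initial centers (S402)
    clusters = assign(K, centers)                            # preliminary partition
    centers = [medoid(K, mem) for mem in clusters.values()]  # final initial centers (S403)
    clusters = assign(K, centers)                            # re-partition (S404)
    width = 1                                                # candidates searched per center
    for _ in range(max_iter):                                # iterative update (S405)
        changed = False
        for c in list(centers):
            others = [i for i in range(n) if i not in centers]
            candidates = sorted(others, key=lambda i: kernel_dist(K, i, c))[:width]
            for cand in candidates:
                trial = [cand if x == c else x for x in centers]
                trial_clusters = assign(K, trial)
                if total_cost(K, trial_clusters) < total_cost(K, clusters):
                    centers, clusters, changed = trial, trial_clusters, True
                    break
        # Stop once no center changed and the candidates cover all non-centers.
        if not changed and width >= n - m:
            break
        width += 1
    return centers, clusters

# Toy example: two well-separated groups, with a linear kernel as a stand-in.
points = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
K = [[a * b for b in points] for a in points]
centers, clusters = kernel_cluster(K, m=2)
```

Searching only the nearest candidates first keeps the early iterations cheap, and widening the search each round reproduces the described behaviour in which the candidate set eventually covers every non-center object.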
In step S105, classification detection is performed on the reduced data set using the optimized heterogeneous multi-kernel user behavior detection criterion and the clustering result to obtain a classification result set, according to which content related to the behavior feature category is recommended to the user.
In this step, the user behavior feature category is obtained after classification detection, and from it the classification result set, that is, the content related to the user behavior feature category. The recommendation data contained in the classification result set can be arranged according to the needs of the example application interface; for example, for a program recommendation application, the classification result set contains information such as the program poster, program ID, program details and program playback source.
The example application obtains the recommendation data in the classification result set by downloading it, then displays these data on the application interface for the user, completing the personalized recommendation process.
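One way to picture the classification result set for a program recommendation application is as a mapping from behavior feature category to a list of program records. The sketch below is hypothetical: the field names, the `ProgramRecommendation` class, the `recommend` helper and the sample catalog entry are assumptions chosen to mirror the poster, ID, details and playback source mentioned above.

```python
from dataclasses import dataclass

@dataclass
class ProgramRecommendation:
    program_id: str   # program ID
    poster_url: str   # location of the program poster
    details: str      # program details
    source: str       # program playback source

def recommend(category, catalog, limit=10):
    """Return up to `limit` records for the user's behavior feature category."""
    return catalog.get(category, [])[:limit]

# Hypothetical classification result set keyed by behavior feature category.
catalog = {
    "drama": [
        ProgramRecommendation(
            program_id="p1",
            poster_url="http://example.invalid/p1.jpg",
            details="Pilot episode",
            source="channel-a",
        )
    ]
}
picks = recommend("drama", catalog)
```

A category with no entries simply yields an empty list, in which case the application could fall back to the default recommended content shown on first entry.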
By mapping the data set into the high-dimensional kernel space, the embodiment of the present invention strengthens the grouping of similar data and thus better resolves the problem of data sparsity. By linearly combining kernel functions, the embodiment makes full use of the characteristics of each kernel function to obtain the heterogeneous multi-kernel matrix, performs clustering and reduction on large-scale data, and obtains more accurate clusters and user behavior features, improving the accuracy of the recommendation results. By performing kernel space mapping on high-dimensional data, the embodiment rejects redundant data and obtains useful feature vector data (the reduced data set), significantly reducing the dimension of the data and the time complexity of the recommendation algorithm and improving recommendation efficiency. The embodiment also optimizes the clustering recommendation algorithm, reducing the number of clustering iterations while improving clustering precision. To address the limitations of existing recommendation methods in mining user behavior, the present invention further mines and reduces the user behavior feature data, so that recommendations are made to the user on the basis of more, and more accurate, data. This improves user satisfaction and makes the personalized recommendation method more intelligent and more personalized.
Based on the above method, the present invention also provides a system for implementing personalized recommendation. As shown in Fig. 6, a preferred embodiment comprises:
an original user behavior feature data set acquisition module 100, for acquiring user behavior feature data and obtaining an original user behavior feature data set from the user behavior feature data;
a kernel space mapping module 200, for performing kernel space mapping on the original user behavior feature data set using a heterogeneous multi-kernel matrix to obtain a kernel space data set;
a data set reduction module 300, for training on the original user behavior feature data set with the heterogeneous multi-kernel matrix to establish an initial heterogeneous multi-kernel user behavior detection criterion, and reducing the kernel space data set with the established initial criterion to obtain a reduced data set;
an optimization module 400, for inputting the reduced data set, the heterogeneous multi-kernel matrix and the initial heterogeneous multi-kernel user behavior detection criterion into a kernel space clustering algorithm, which outputs an optimized heterogeneous multi-kernel user behavior detection criterion and a clustering result;
a recommendation module 500, for performing classification detection on the reduced data set according to the optimized heterogeneous multi-kernel user behavior detection criterion and the clustering result to obtain a classification result set, and recommending to the user, according to the classification result set, content associated with the user behavior feature category.
Further, as shown in Fig. 7, the kernel space mapping module 200 specifically comprises:
a kernel matrix acquisition unit 210, for presetting a group of kernel functions, calculating the kernel matrix of each kernel function on the original user behavior feature data set, and determining the optimal working parameters of each kernel matrix;
an optimal combination coefficient calculation unit 220, for calculating the optimal combination coefficients of the heterogeneous multi-kernel matrix using the interior-point method of semidefinite programming together with the optimal working parameters of each kernel matrix;
a linear combination unit 230, for linearly combining the kernel matrices according to the optimal combination coefficients to obtain the heterogeneous multi-kernel matrix $K = \sum_{i} \mu_i K_i$, where $K_i$ is the $i$-th kernel matrix and $\mu_i$ is the coefficient of that kernel matrix;
a kernel space mapping unit 240, for performing kernel space mapping on the user behavior feature data using this heterogeneous multi-kernel matrix to obtain the kernel space data set.
Further, as shown in Fig. 8, the data set reduction module 300 comprises:
an initial criterion determination unit 310, for determining the initial heterogeneous multi-kernel user behavior detection criterion from the heterogeneous multi-kernel matrix and the intermediate solution results of the semidefinite programming;
a reduction unit 320, for reducing the kernel space data set with the initial heterogeneous multi-kernel user behavior detection criterion to obtain the reduced data set.
Further, as shown in Fig. 9, the optimization module 400 comprises:
a data object mapping unit 410, for mapping the data objects in the reduced data set into the kernel space by the heterogeneous multi-kernel matrix to obtain mapped data objects;
a preliminary partition unit 420, for selecting m objects from the reduced data set as quasi-initial center points, and preliminarily partitioning the mapped data objects with the quasi-initial center points;
an initial center point adjustment unit 430, for adjusting the quasi-initial center point within each class obtained by the preliminary partition to obtain a final initial center point;
a re-partition unit 440, for re-partitioning the mapped data objects according to the final initial center points, the resulting optimized classes serving as the clustering result;
a criterion optimization unit 450, for adding a center point candidate set from which replacements for the initial center points are drawn, and iteratively updating the center points until they no longer change, thereby optimizing the heterogeneous multi-kernel user behavior detection criterion.
The technical details of the above modules and units have been described in detail in the method above and are not repeated here.
In summary, the embodiment of the present invention uses a heterogeneous multi-kernel matrix to perform kernel space mapping on the user behavior feature data, and uses an optimized kernel space clustering algorithm to cluster the reduced data set, obtaining a clustering result and an optimized heterogeneous multi-kernel user behavior detection criterion. The user's preference features are then classified and detected to obtain a classification result set, from which corresponding content is recommended to the user. The embodiment improves the accuracy and efficiency of the recommended content and the user's satisfaction, and also reduces the time complexity of data training, thereby improving recommendation efficiency.
It should be understood that the application of the present invention is not limited to the above examples. Those of ordinary skill in the art can make improvements or modifications based on the above description, and all such improvements and modifications shall fall within the protection scope of the appended claims of the present invention.