CN102289478A - System and method for recommending video on demand based on fuzzy clustering - Google Patents

Publication number: CN102289478A
Application number: CN2011102169330A
Authority: CN (China)
Other languages: Chinese (zh)
Inventors: 王小军, 朱祎, 王红林
Original and current assignee (as listed; may be inaccurate): JIANGSU RADIO AND TV UNIVERSITY
Legal status (as listed; an assumption, not a legal conclusion): Pending
Prior art keywords: user, module, fuzzy clustering, data, submodule

Abstract

The invention discloses a video-on-demand (VOD) recommendation system and method based on fuzzy clustering, in the technical field of personalized recommendation for Internet video on demand. The system consists of a data preprocessing module, a fuzzy cluster analysis module, a personalized user recommendation module, and a system effectiveness analysis module connected in sequence. User access records are preprocessed and screened, and a fuzzy clustering algorithm generates the cluster centers and the user access pattern; a personalized VOD recommendation list is then generated for the user from the membership degrees between the user access pattern and the system access pattern. The system and method effectively reduce the influence of noise in the source data on the recommendation results, improve the execution efficiency and convergence time of the fuzzy clustering, and provide users with a more reasonable personalized video program recommendation service.

Description

Video-on-demand recommendation system and method based on fuzzy clustering
Technical field
The present invention relates to a video-on-demand recommendation system and method based on fuzzy clustering, and belongs to the technical field of personalized recommendation for Internet video on demand.
Background
As the population of mobile-access users keeps growing, information exchange carried by online video has permeated every aspect of daily life. In video on demand, user loyalty and user "stickiness" have become the key measures of whether an application succeeds.
Existing personalized recommendation systems mainly recommend commodities to customers to meet their individual needs. Their main advantage is that they can collect user feature data and, based on the users' access characteristics and preferences, provide concise navigation and personalized product recommendation services. Personalized recommendation for online video on demand mainly covers two aspects, page navigation and optimization, and the techniques used include statistical analysis, association analysis, collaborative filtering, and classification analysis such as Bayesian and decision-tree classification. In all of these techniques, the prerequisite for providing a personalized recommendation service is to build and refine a user access model from the users' access behavior. Two aspects of building the user access model still need improvement. The first is feature selection: for massive, high-dimensional, non-numeric data in a personalized recommendation system, the feature selection problem (extracting key features, screening derived and irrelevant features) has no good solution in the relevant patents and literature. The second is the representation of the personalized user access model: prior-art solutions usually substitute the user access records and the system access feature model for the user access feature model and do not recommend separately according to each user's personal characteristics, so they do not achieve truly personalized recommendation.
Summary of the invention
To address the above problems, the invention provides a video-on-demand recommendation system and method based on fuzzy clustering, which refines the user access model in online video-on-demand applications and establishes an efficient, personalized recommendation system.
To solve this technical problem, the present invention adopts the following technical solution:
A video-on-demand recommendation system based on fuzzy clustering consists of a data preprocessing module, a fuzzy cluster analysis module, a personalized user recommendation module, and a system effectiveness analysis module connected in sequence. The data preprocessing module consists of a source data acquisition submodule, a data cleaning submodule, a user session identification submodule, a character attribute conversion submodule, a data standardization submodule, a feature screening submodule, and a principal component analysis submodule connected in sequence. The fuzzy cluster analysis module consists of a cluster center initialization submodule, a fuzzy clustering algorithm application submodule, and a system access pattern generation submodule connected in sequence. The personalized recommendation module consists of a user access pattern generation submodule, a personalized recommendation generation submodule, and a feedback and evaluation submodule connected in sequence.
The recommendation method of the video-on-demand recommendation system based on fuzzy clustering comprises the following steps:
(1) collect the access records of the video-on-demand system and clean out the abnormal access records; identify user sessions from the status attribute of the access records and convert the character attributes; perform feature screening on the converted access records, selecting the key features according to the feature similarity index; then perform principal component analysis on the screened data and determine the feature dimension from the cumulative contribution rate of the features;
(2) standardize the output of the data preprocessing module and draw random samples with replacement; use the KR density estimation method and K-Means to initialize the fuzzy cluster centers; apply the fuzzy clustering algorithm SFCM to produce the fuzzy cluster centers, the system access pattern, and the system recommendation list;
(3) with the user as the classification basis, preprocess the user's access records using caching; use the KR density estimation method to generate the user's initial cluster centers and compute the composite variable values of the principal component analysis; produce the user's personalized VOD recommendation list according to the membership threshold and the ratio of the membership degrees between the user access pattern and the system access pattern;
(4) by defining the partition coefficient and the possibility partition coefficient, and in combination with the cluster validity function, adjust the parameters of the fuzzy clustering algorithm SFCM to achieve a better fuzzy clustering result and a better personalized VOD recommendation service.
The beneficial effects of the present invention are as follows:
1. The invention uses feature screening and principal component analysis (PCA) to reduce the dimension of the source data while preserving its information content, which improves the efficiency of the fuzzy clustering.
2. For samples with a large data volume, random sampling and the KR density estimation algorithm are used to initialize the fuzzy cluster centers, which speeds up the convergence of the fuzzy clustering; the fuzzy cluster centers are then used to build the system access pattern. Fuzzy clustering produces the user access pattern and its membership degrees in the system access pattern, and the personalized user recommendation list is generated from the ratio of the membership degrees, which realizes personalized VOD recommendation.
3. The influence of noise in the source data on the recommendation results is effectively reduced, the execution efficiency and convergence time of the fuzzy clustering are improved, and users are given a more reasonable personalized video program recommendation service.
Description of drawings
Fig. 1 is a block diagram of the structure of the video-on-demand recommendation system of the invention.
Fig. 2 is a diagram of feature selection and data preprocessing.
Fig. 3 is the data flow chart for generating the system access pattern by fuzzy clustering.
Fig. 4 is the data flow chart for generating the personalized user recommendation list.
Embodiment
The invention is described in further detail below with reference to the accompanying drawings.
Fig. 1 is a block diagram of the structure of the video-on-demand recommendation system of the invention. The system consists of a data preprocessing module, a fuzzy cluster analysis module, a personalized user recommendation module, and a system effectiveness analysis module connected in sequence. The data preprocessing module consists of a source data acquisition submodule, a data cleaning submodule, a user session identification submodule, a character attribute conversion submodule, a data standardization submodule, a feature screening submodule, and a principal component analysis submodule connected in sequence. The fuzzy cluster analysis module consists of a cluster center initialization submodule, a fuzzy clustering algorithm application submodule, and a system access pattern generation submodule connected in sequence. The personalized recommendation module consists of a user access pattern generation submodule, a personalized recommendation generation submodule, and a feedback and evaluation submodule connected in sequence.
The recommendation method of the video-on-demand recommendation system based on fuzzy clustering comprises the following steps:
(1) collect the access records of the video-on-demand system and clean out the abnormal access records; identify user sessions from the status attribute of the access records and convert the character attributes; perform feature screening on the converted access records, selecting the key features according to the feature similarity index; then perform principal component analysis on the screened data and determine the feature dimension from the cumulative contribution rate of the features;
(2) standardize the output of the data preprocessing module and draw random samples with replacement; use the KR density estimation method and K-Means to initialize the fuzzy cluster centers; apply the fuzzy clustering algorithm SFCM to produce the fuzzy cluster centers, the system access pattern, and the system recommendation list;
(3) with the user as the classification basis, preprocess the user's access records using caching; use the KR density estimation method to generate the user's initial cluster centers and compute the composite variable values of the principal component analysis; produce the user's personalized VOD recommendation list according to the membership threshold and the ratio of the membership degrees between the user access pattern and the system access pattern;
(4) by defining the partition coefficient and the possibility partition coefficient, and in combination with the cluster validity function, adjust the parameters of the fuzzy clustering algorithm SFCM to achieve a better fuzzy clustering result and a better personalized VOD recommendation service.
Fig. 2 is the flow chart of feature selection and data preprocessing, which comprises seven parts: source data acquisition, data cleaning, user session identification, character attribute conversion, data standardization, feature screening, and principal component analysis.
Before fuzzy cluster analysis is performed on the system access records, the data format must be converted and the data cleaned. A system access record generally identifies the user's access activity through fields such as client IP, server IP, port, request method, access date, access time, request path, URL, protocol type, bytes transferred, browser version, system version, access status, user agent (User-agent), and referrer (Reference). The access status (C-State) takes 8 values; for example, status code 200 means the link succeeded, and status code 500 means an internal server error that stopped the access. A preliminary screening on access status removes the outlier records of failed accesses. For user and session identification, users are either registered or anonymous: different client IPs are identified as different users, and because of NAT, proxies, and similar factors, records with the same IP are also identified as different users if the browser or system version changes. Sessions are identified from the value of the client receive time field (X-Duration): an access is considered valid if data buffering and reception lasted more than 30 seconds. Character-type feature values are inconvenient for cluster analysis and must be converted: protocol names such as TCP and UDP are converted to their protocol numbers according to RFC 1340, and each IP address is converted to the decimal value of its 32-bit encoding, so that each IP address is identified by a unique number.
Screening on access status, identifying user sessions, and converting non-numeric data yield reasonable input data for fuzzy clustering. However, because the fuzzy clustering handles high-dimensional, massive data, timeliness must be considered along with clustering accuracy. The method therefore uses feature screening and principal component analysis to reduce the dimension of the source data as far as possible while retaining as much of the original information as possible.
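The status screening and the user and session identification rules described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `AccessRecord` type and its field names are hypothetical, and only the rules stated in the text are encoded (status 200 means success, an access counts as valid when X-Duration exceeds 30 seconds, and the same IP with a different browser or system version is a different user).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRecord:
    # Hypothetical record layout; the field names are assumptions for this sketch.
    client_ip: str
    browser: str
    os_version: str
    status: int        # access status code (C-State)
    duration_s: float  # client receive time (X-Duration), in seconds


def is_valid(record, min_duration_s=30.0):
    """Keep successful accesses whose receive time exceeds 30 seconds."""
    return record.status == 200 and record.duration_s > min_duration_s


def user_key(record):
    """Same IP with a different browser or OS version counts as a different user."""
    return (record.client_ip, record.browser, record.os_version)


def group_valid_sessions(records):
    """Map each identified user to the list of that user's valid accesses."""
    sessions = {}
    for rec in records:
        if is_valid(rec):
            sessions.setdefault(user_key(rec), []).append(rec)
    return sessions


records = [
    AccessRecord("10.0.0.1", "Firefox", "Windows", 200, 95.0),
    AccessRecord("10.0.0.1", "Chrome", "Linux", 200, 41.0),      # same IP, other user
    AccessRecord("10.0.0.1", "Firefox", "Windows", 500, 120.0),  # server error: dropped
    AccessRecord("10.0.0.2", "Firefox", "Windows", 200, 12.0),   # too short: dropped
]
sessions = group_valid_sessions(records)
print(len(sessions))  # 2
```

Grouping by the (IP, browser, system version) triple is one simple way to express the NAT and proxy rule; a real log pipeline would also need the registered user name where one is available.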
Feature screening uses a value in [0,1] to express the similarity between features in the system access records: 0 means completely uncorrelated, 1 means completely correlated, and other values express the degree of similarity. The feature similarity measure is given by formula (1):
[Formula (1): feature similarity measure; the formula image is not available in this text]
where n is the sample size, $s_{ij}$ denotes the similarity between the i-th feature and the j-th feature in the sample, and $x_{ki}$ denotes the normalized value of the i-th feature of the k-th sample. The similarity index $s_{ij}$ is computed for every pair of features in the source data and compared with a given reference threshold. If the similarity between feature i and feature j is greater than the reference threshold, the feature of the pair that is harder to process can be removed. In this method, the reference threshold is 0.75.
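The screening rule above can be illustrated with a short sketch. Because formula (1) is not available in this text, the sketch substitutes the absolute Pearson correlation as a stand-in similarity in [0,1]; that substitution is an assumption, not the patent's measure. The text also does not define which feature of a pair is "harder to process", so the sketch simply drops the later one. The feature names are hypothetical.

```python
import math


def abs_correlation(a, b):
    """Absolute Pearson correlation, used here as a stand-in similarity in [0, 1]."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant column carries no correlation information
    return abs(cov) / math.sqrt(var_a * var_b)


def screen_features(columns, threshold=0.75):
    """columns maps a feature name to its normalized values; returns the names kept.

    A feature is dropped when its similarity to an already kept feature
    exceeds the reference threshold (0.75 in the text).
    """
    kept = []
    for name, values in columns.items():
        if all(abs_correlation(values, columns[k]) <= threshold for k in kept):
            kept.append(name)
    return kept


columns = {
    "bytes_sent": [1.0, 2.0, 3.0, 4.0, 5.0],
    "duration": [2.1, 3.9, 6.2, 8.0, 9.9],  # nearly proportional to bytes_sent
    "hour": [3.0, 1.0, 4.0, 1.0, 5.0],
}
kept = screen_features(columns)
print(kept)  # ['bytes_sent', 'hour']
```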
The steps for performing principal component analysis on the screened data are as follows:
(1) compute the covariance matrix C of the sample;
(2) compute the eigenvectors and the eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_n$ of the covariance matrix C, where n is a natural number and the eigenvalues are sorted in descending order: $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_n$;
(3) project the data onto the space spanned by the eigenvectors whose corresponding eigenvalues are $\lambda_1, \lambda_2, \dots, \lambda_m$, so that the data can be represented in an m-dimensional space. The dimension m is selected according to the cumulative contribution rate of the first m eigenvalues: when the cumulative contribution rate is greater than the reference value, the first m eigenvalues are considered to carry sufficient information about the sample. The reference value of the cumulative contribution rate in this method is 0.8.
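The dimension selection in step (3) can be sketched as below. The sketch assumes the usual definition of the cumulative contribution rate, the sum of the first m eigenvalues divided by the sum of all eigenvalues, which the text does not spell out; the eigenvalues in the example are made up.

```python
def choose_dimension(eigenvalues, reference=0.8):
    """Return the smallest m whose cumulative contribution rate exceeds the reference.

    Assumes contribution rate = (sum of the first m eigenvalues) / (sum of all).
    """
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for m, value in enumerate(ordered, start=1):
        running += value
        if running / total > reference:
            return m
    return len(ordered)


# Hypothetical eigenvalues of a covariance matrix.
eigenvalues = [4.2, 2.6, 1.4, 1.0, 0.5, 0.3]
m = choose_dimension(eigenvalues)
print(m)  # 3: the first three eigenvalues carry about 82% of the total
```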
Source data collection covers web server logs in the NCSA format and the Common format, as well as streaming media server logs (MediaServer, FlashServer). Data cleaning uses the sed scripting tool to delete abnormal access records from the data source; these mainly comprise access forbidden (error code 403), file not found (error code 404), and internal server error (error code 500). User identification first distinguishes by client IP: different IPs are treated as different users; for the same IP, because of IP-sharing techniques such as NAT and proxies, records with a different browser version or operating system version are also treated as different users. Session identification relies on the receive time field (X-Duration) in the access record: if the value exceeds 30 seconds, the access is considered valid. Character attributes are converted to numeric values according to their actual meaning: protocol names such as TCP and UDP are converted to their protocol numbers according to RFC 1340 (TCP is protocol number 6, UDP is protocol number 17), and each IP address is converted to the decimal value of its 32-bit encoding, so that one IP address is identified by one number.
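The character attribute conversion can be illustrated with Python's standard `ipaddress` module. The helper names are hypothetical; the protocol numbers 6 and 17 are the ones stated in the text.

```python
import ipaddress

# Protocol numbers as stated in the text (TCP = 6, UDP = 17).
PROTOCOL_NUMBERS = {"TCP": 6, "UDP": 17}


def ip_to_number(ip_text):
    """Convert a dotted IPv4 address to the decimal value of its 32-bit encoding."""
    return int(ipaddress.IPv4Address(ip_text))


def protocol_to_number(name):
    """Convert a protocol name to its assigned protocol number."""
    return PROTOCOL_NUMBERS[name.upper()]


print(ip_to_number("192.168.1.10"))  # 3232235786
print(protocol_to_number("tcp"))     # 6
```

Note that the resulting integer is an identifier rather than a quantity: numerically close addresses are not necessarily related users, which is worth keeping in mind when such a column is fed to a distance-based clustering step.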
Data standardization helps ensure reliable results. After the character-type values have been converted to numeric values, the data are standardized with the standard-deviation method of formulas (1-1) and (1-2). The standardization can be carried out with the SPSS statistical analysis software or with Matlab, recording at the same time the mean m and the mean absolute deviation s of each attribute of the sample. Attributes are then screened on the standardized data: formula (1) is used to compute the similarity between every two attributes, and if the similarity between two attributes is greater than the reference threshold (reference value 0.75), the attribute of the pair that is harder to process can be screened out. Principal component analysis is applied to the standardized data after feature screening, using the principal component analysis function in SPSS. In the analysis result, the composite variables are arranged in descending order of contribution rate. If the cumulative contribution rate of the first m composite variables is greater than the threshold (reference value 0.8), the first m composite variables are selected as the preprocessing result of the source data and used to generate the system access pattern. The data after feature screening can thus be represented by m composite variables, and a value of m between 5 and 10 is usually suitable. With the information content of the source data sufficiently preserved, this further reduces the dimension of the source data and effectively improves the efficiency of the cluster analysis.
Fig. 3 is the data flow chart for generating the system access pattern by fuzzy clustering; its main outputs are the system access pattern and the system recommendation list. The source data are standardized as follows:
$m_f = \frac{1}{n}\left(x_{1f} + x_{2f} + \cdots + x_{nf}\right)$
(1-1)
$s_f = \frac{1}{n}\left(|x_{1f} - m_f| + |x_{2f} - m_f| + \cdots + |x_{nf} - m_f|\right)$
(1-2)
where $m_f$ is the mean of each feature attribute and $s_f$ is its mean absolute deviation. The data of each attribute are then standardized as follows:
$z_{kf} = \frac{x_{kf} - m_f}{s_f}$
where $z_{kf}$ denotes the standardized f-th attribute of the k-th sample. This transformation maps the data from the original space to the standardized space and completes the standardization of the source data.
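The standardization of one attribute column can be sketched as follows, assuming the mean and mean absolute deviation named in the text are the usual ones (arithmetic mean, and the average of absolute differences from that mean). The function name is hypothetical.

```python
def standardize_column(values):
    """Standardize one attribute: subtract its mean, divide by its mean absolute deviation."""
    n = len(values)
    mean = sum(values) / n
    mad = sum(abs(v - mean) for v in values) / n
    if mad == 0:
        # A constant attribute has no spread; map it to zeros instead of dividing by zero.
        return [0.0] * n
    return [(v - mean) / mad for v in values]


z = standardize_column([2.0, 4.0, 6.0, 8.0])
print(z)  # [-1.5, -0.5, 0.5, 1.5]
```

Dividing by the mean absolute deviation rather than the standard deviation makes the scaling less sensitive to outliers, which fits the text's concern with noisy access records.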
Cluster center initialization: because the sample data volume is large, initialization combines random sampling of the sample with a density-based clustering method. First, the sample data are randomly sampled with replacement (for example at a ratio of 0.1%), and the sampling is repeated n times. For each sampling result, the KR density estimation method is used to obtain the cluster centers of that sample. The n groups of cluster centers are then clustered with the k-means algorithm, and the best-performing group is selected as the initial cluster centers.
The fuzzy clustering algorithm SFCM (Shift-Fuzzy C-means) proceeds as follows:
(1) initialization: set the initial cluster centers, the iteration counter, the maximum number of iterations T, the convergence threshold, and the variable parameter;
(2) update the membership degrees with the membership update formula (2);
(3) update the cluster centers with the cluster center update formula (3);
(4) if the change in the cluster centers is smaller than the threshold, or the iteration count exceeds T, stop; otherwise increment the iteration counter and go to step (2).
[Formula (2): membership update; the formula image is not available in this text]
[Formula (3): cluster center update; the formula image is not available in this text]
[Formula (4): objective function; the formula image is not available in this text]
Formula (4) is the objective function of the fuzzy cluster analysis.
The system access pattern and the recommendation list of the access pattern are generated from the result of the fuzzy clustering (the fuzzy cluster centers).
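The iteration structure above can be sketched with the standard fuzzy c-means updates. This is explicitly not SFCM: formulas (2) to (4) and the role of the variable parameter are not available in this text, so the sketch uses the textbook membership and center updates as a stand-in and keeps only the loop and stopping rule described above (stop when the largest center shift is below the threshold or the iteration limit T is reached). The data points are made up.

```python
import math


def fcm(points, centers, m=2.0, eps=1e-4, max_iter=20):
    """Standard fuzzy c-means (a stand-in for SFCM, whose formulas are not given here).

    points and centers are lists of equal-length tuples. Returns the final
    centers and the memberships from the last update; memberships[i][j] is
    the degree of point j in cluster i.
    """
    memberships = []
    for _ in range(max_iter):
        # Membership update from the distances to the current centers.
        memberships = [[0.0] * len(points) for _ in centers]
        for j, x in enumerate(points):
            dists = [math.dist(x, v) for v in centers]
            if 0.0 in dists:
                # A point lying exactly on a center belongs fully to that cluster.
                memberships[dists.index(0.0)][j] = 1.0
                continue
            for i, d_i in enumerate(dists):
                total = sum((d_i / d_k) ** (2.0 / (m - 1.0)) for d_k in dists)
                memberships[i][j] = 1.0 / total
        # Center update: mean of the points weighted by membership ** m.
        new_centers = []
        for row in memberships:
            weights = [u ** m for u in row]
            norm = sum(weights)
            new_centers.append(tuple(
                sum(w * x[d] for w, x in zip(weights, points)) / norm
                for d in range(len(points[0]))))
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < eps:
            break
    return centers, memberships


points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
final_centers, memberships = fcm(points, [(1.0, 1.0), (9.0, 9.0)])
print(final_centers)
```

The defaults mirror values quoted elsewhere in the text (iteration limit 20, threshold 0.0001); the exponent m = 2.0 is a common textbook choice and is an assumption here.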
Fuzzy cluster analysis is performed on the source data prepared as in Fig. 2 to generate the system access pattern. The cluster centers are initialized first: n samples are drawn at random with replacement from the source data; depending on the size of the source data sample space, each sampling ratio is kept at about 0.1% to 1%, and each sample contains about 10,000 records. For each of the n random samples, the KR density estimation method of Leonard Kaufman is used to compute the cluster center vectors, and the standard k-means clustering method is then used to compute the optimal cluster centers, which serve as the initial fuzzy cluster centers. The steps in the description of the fuzzy clustering algorithm are then carried out, with the iteration limit T set to 20, the convergence threshold set to 0.0001, the variable parameter set to 0.001, and the fuzzy weighting exponent m taken from [1.5, 2.5] (the typical value is given in the original only as an image). Formula (2) updates the membership matrix and formula (3) updates the cluster center matrix, until the maximum distance between the cluster centers of two successive iterations is less than the threshold or the number of iterations exceeds the limit T, at which point the fuzzy clustering ends. In general, the iteration converges within 10 to 15 iterations. The result of the fuzzy clustering is the system access pattern, which comprises the number of clusters, the cluster centers, and the membership matrix of the video programs in each class. For example, if a video program in the system access pattern belongs to class 1 with probability 90%, to class 2 with probability 40%, and to class 3 with probability 10%, and the membership threshold is set to 70%, then the program belongs to class 1. The membership threshold is usually set to 75%. Generating the system access pattern with this fuzzy clustering method effectively reduces the influence of noise in the source data, improves the execution efficiency of the fuzzy clustering, shortens the convergence time, and provides users with a more reasonable video program recommendation service.
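The threshold rule in the example above amounts to keeping the classes whose membership degree exceeds the threshold. A minimal sketch, with hypothetical class labels:

```python
def classes_above_threshold(memberships, threshold=0.75):
    """Return the class labels whose membership degree exceeds the threshold."""
    return [label for label, degree in memberships.items() if degree > threshold]


# The worked example from the text: 90%, 40%, and 10% with a 70% threshold.
program = {"class 1": 0.90, "class 2": 0.40, "class 3": 0.10}
print(classes_above_threshold(program, threshold=0.70))  # ['class 1']
```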
Fig. 4 is the data flow chart for generating the personalized user recommendation list. Personalized VOD recommendation based on the user's access records comprises the following steps:
(1) load the registered user's access records from the system access records into the cache to reduce system I/O overhead;
(2) standardize the data in the cache and use the KR density estimation method to generate the initial cluster centers;
(3) from the principal component analysis result, compute the composite variable values of the user's access records with formula (5):
[Formula (5): composite variable of the principal component analysis; the formula image is not available in this text]
where each coefficient vector is the eigenvector corresponding to an eigenvalue of the covariance matrix Σ, and the variables are the standardized values of the original variables;
(4) set the user membership threshold T and, according to the system access pattern, compute the membership degree with which the user belongs to each class. If the membership degree is greater than the threshold T, online video programs of that class are recommended to the user with priority; if it is less than T, programs of that class are not recommended to the user. This calculation yields the classes of online video programs that can be recommended to the user;
(5) when customizing the personalized user recommendation list, use formula (6) to set, according to the ratio of the user access pattern membership degrees, the share of each recommended class of video programs in the recommendation to that user, and finally generate the personalized VOD recommendation list:
[Formula (6): personalized recommendation list; the formula image is not available in this text]
where the result is the personalized recommendation list of the k-th user, the count is the number of system access pattern classes whose membership degree for this user is greater than the threshold, and the per-class term is the recommendation list of the i-th class of the system access pattern.
For example, if a user's access pattern has a membership degree of 90% in one class and 60% in a second class, then in this user's personalized recommendation list the content of the first class accounts for 60% and the content of the second class for 40% (0.9 : 0.6 = 3 : 2). This yields a reasonable recommendation function.
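The proportional split in this example can be sketched as follows. Formula (6) is not available in this text, so the sketch encodes only what the worked examples show: each class above the threshold receives a share proportional to its membership degree. The class labels, the threshold value 0.5, the per-class lists, and the list size are all hypothetical.

```python
def recommendation_shares(memberships, threshold):
    """Share of each class above the threshold, proportional to its membership degree."""
    selected = {c: u for c, u in memberships.items() if u > threshold}
    total = sum(selected.values())
    return {c: u / total for c, u in selected.items()}


def build_list(shares, class_lists, size):
    """Take the top items of each class list according to its share.

    Rounding can make the result slightly shorter or longer than `size`.
    """
    result = []
    for c, share in shares.items():
        result.extend(class_lists[c][:round(share * size)])
    return result


shares = recommendation_shares({"A": 0.9, "B": 0.6}, threshold=0.5)
class_lists = {
    "A": [f"A{i}" for i in range(1, 11)],  # per-class recommendation lists, best first
    "B": [f"B{i}" for i in range(1, 11)],
}
personal_list = build_list(shares, class_lists, size=10)
print(personal_list)  # six items from class A followed by four from class B
```

The same function reproduces the anonymous-user example in the text: memberships of 0.9 and 0.45 give shares of 2/3 and 1/3.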
Users are mainly divided into registered users and anonymous users. For anonymous users, individual behavior is hard to distinguish clearly from the access records, so recommendations are made from the system access pattern. For example, if the video program the user is currently accessing belongs to one class with probability 90% and to a second class with probability 45%, then according to formula (6), 2/3 of this anonymous user's video program recommendation list comes from the first class and 1/3 from the second class. Within a class, video programs are ranked by their weight in the class and by user attention, which completes the video program recommendation for the anonymous user.
For a registered user, the user's access records can be filtered out of the system access records by user name. After the data are preprocessed, the composite variable values after principal component analysis are obtained with formula (5). If the number of the user's access records is large, random sampling combined with the KR density estimation method can be used to compute the fuzzy cluster centers of the user's access records; otherwise the random sampling step is omitted. The user's cluster centers are likewise obtained after fuzzy clustering, and the membership degree between the user's cluster centers and the system's cluster centers is then computed. If the membership degree is greater than the threshold T, online video programs of that class are recommended to the user with priority; if it is less than T, programs of that class are not recommended to the user. This calculation yields the classes of online video programs that can be recommended to the user, and formula (6) likewise gives the share of each class's programs in this user's personalized VOD recommendation list. As for anonymous users, video programs within a class are ranked by their weight in the class and by user attention, which completes the personalized video program recommendation. Unlike an anonymous user, a registered user receives a personalized VOD recommendation list generated from the user's own access pattern.
The effectiveness analysis of the system is an analysis of the validity of the fuzzy clustering, for which a validity function based on the fuzzy partition coefficient can be used. For a given number of cluster centers c and membership matrix U, the partition coefficient is defined as:
[Formula (7): partition coefficient; the formula image is not available in this text] (n: the vector dimension)
The possibility partition coefficient is defined as:
[Formula (8): possibility partition coefficient; the formula image is not available in this text]
The cluster validity function [5] is defined as:
[Formula (9): cluster validity function; the formula image is not available in this text]
[Formula (10): optimality condition for the cluster validity function; the formula image is not available in this text]
If there exists a partition that satisfies formula (10), it is the "optimal" valid clustering. After the system and user access patterns have been established, formula (10) can be used to analyze the validity of the clustering; the smaller the function value, the better the fuzzy clustering result.
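As an illustration of a partition coefficient, the sketch below computes Bezdek's classical partition coefficient, the average over samples of the sum of squared membership degrees. Formula (7) is not available in this text, so treating it as this classical form is an assumption, and the sketch covers neither the possibility partition coefficient nor the validity function of formulas (9) and (10). Here n is taken as the number of samples.

```python
def partition_coefficient(memberships):
    """Bezdek's partition coefficient: (1 / n) * sum of squared membership degrees.

    memberships[i][j] is the degree of sample j in cluster i, and n is the
    number of samples. The value ranges from 1 / c (fully fuzzy) to 1 (crisp).
    """
    n = len(memberships[0])
    return sum(u * u for row in memberships for u in row) / n


crisp = [[1.0, 0.0], [0.0, 1.0]]   # every sample belongs to exactly one cluster
fuzzy = [[0.5, 0.5], [0.5, 0.5]]   # every sample is split evenly
print(partition_coefficient(crisp))  # 1.0
print(partition_coefficient(fuzzy))  # 0.5
```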

Claims (2)

1. A video-on-demand recommendation system based on fuzzy clustering, characterized in that it is formed by sequentially connecting a data preprocessing module, a fuzzy cluster analysis module, a personalized user recommendation module and a system effectiveness analysis module, wherein the data preprocessing module is formed by sequentially connecting a source data acquisition submodule, a data cleaning submodule, a user session identification submodule, a character attribute conversion submodule, a data normalization submodule, a feature screening submodule and a principal component analysis submodule; the fuzzy cluster analysis module is formed by sequentially connecting a cluster center initialization submodule, a fuzzy clustering algorithm application submodule and a system access pattern generation submodule; and the personalized recommendation module is formed by sequentially connecting a user access pattern generation submodule, a personalized recommendation generation submodule, and a feedback and evaluation submodule.
2. A recommendation method applied to the video-on-demand recommendation system based on fuzzy clustering according to claim 1, characterized in that it comprises the following steps:
(1) collecting the access records of the video-on-demand system; cleaning abnormal access records; identifying user sessions according to the status attribute of the access records; converting character attributes; performing feature screening on the converted access records, selecting key features according to a feature similarity index; and then performing principal component analysis on the screened data, determining the feature dimension according to the cumulative contribution rate of the features;
(2) normalizing the output data of the data preprocessing module and performing random sampling with replacement; initializing the fuzzy clustering centers using the KR density estimation method and K-Means analysis; and applying the fuzzy clustering algorithm SFCM to produce the fuzzy clustering centers, the system access patterns and the system access recommendation list;
(3) taking the user as the classification basis, preprocessing the user's access records using caching technology; using the KR density estimation method to generate the user's initial cluster centers and calculating the composite variable values of the principal component analysis; and producing the user's personalized video-on-demand recommendation list according to the membership degree threshold and the proportions relating the user access pattern to the system access pattern;
(4) defining the partition coefficient and the possibility partition coefficient and, in combination with the cluster validity function, adjusting the parameters of the fuzzy clustering algorithm SFCM to achieve a better fuzzy clustering effect and realize a better personalized video-on-demand recommendation service.
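The normalization, sampling-with-replacement and fuzzy clustering operations named in step (2) can be sketched in plain Python as below. Several assumptions apply: the source does not spell out SFCM, so the standard fuzzy c-means update rules are used in its place; the initial centers are passed in as a parameter rather than derived from KR density estimation and K-Means; and min-max scaling is assumed for normalization. All function names are hypothetical.

```python
import random


def normalize(data):
    """Min-max scale each feature column to [0, 1] (assumed normalization)."""
    cols = list(zip(*data))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [[(x - l) / (h - l) if h > l else 0.0 for x, l, h in zip(row, lo, hi)]
            for row in data]


def bootstrap_sample(data, size, rng):
    """Random sampling with replacement."""
    return [rng.choice(data) for _ in range(size)]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def memberships(data, centers, m=2.0):
    """Standard FCM membership update; returns one row of c memberships per sample."""
    U = []
    for x in data:
        # Clamp distances so a sample sitting exactly on a center does not divide by zero.
        d = [max(sq_dist(x, v), 1e-12) for v in centers]
        # With squared distances the FCM exponent 2/(m-1) becomes 1/(m-1).
        inv = [di ** (-1.0 / (m - 1.0)) for di in d]
        total = sum(inv)
        U.append([v / total for v in inv])
    return U


def update_centers(data, U, m=2.0):
    """Each center is the mean of the samples weighted by membership**m."""
    centers = []
    for i in range(len(U[0])):
        w = [row[i] ** m for row in U]
        total = sum(w)
        centers.append([sum(wk * x[j] for wk, x in zip(w, data)) / total
                        for j in range(len(data[0]))])
    return centers


def fuzzy_c_means(data, init_centers, m=2.0, tol=1e-6, max_iter=100):
    """Alternate membership and center updates until the centers stop moving."""
    centers = [list(v) for v in init_centers]
    for _ in range(max_iter):
        new_centers = update_centers(data, memberships(data, centers, m), m)
        shift = max(sq_dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships(data, centers, m)
```

A typical call would normalize the preprocessed records, draw a sample with `bootstrap_sample(data, size, random.Random(seed))`, and pass the sample together with the initial centers to `fuzzy_c_means`. Good initial centers mainly reduce the number of iterations needed, which is consistent with the convergence benefit the abstract attributes to the initialization step.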
CN2011102169330A 2011-08-01 2011-08-01 System and method for recommending video on demand based on fuzzy clustering Pending CN102289478A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2011102169330A CN102289478A (en) 2011-08-01 2011-08-01 System and method for recommending video on demand based on fuzzy clustering

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2011102169330A CN102289478A (en) 2011-08-01 2011-08-01 System and method for recommending video on demand based on fuzzy clustering

Publications (1)

Publication Number Publication Date
CN102289478A (en) 2011-12-21

Family

ID=45335905

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2011102169330A Pending CN102289478A (en) 2011-08-01 2011-08-01 System and method for recommending video on demand based on fuzzy clustering

Country Status (1)

Country Link
CN (1) CN102289478A (en)

Cited By (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103207896B (en) * 2013-03-14 2017-02-01 无锡清华信息科学与技术国家实验室物联网技术中心 Method and system for stable and efficient self-adaptive clustering
CN103207896A (en) * 2013-03-14 2013-07-17 无锡清华信息科学与技术国家实验室物联网技术中心 Method and system for stable and efficient self-adaptive clustering
CN103544206A (en) * 2013-07-16 2014-01-29 Tcl集团股份有限公司 Method and system for achieving individualized recommendations
CN103544206B (en) * 2013-07-16 2017-09-15 Tcl集团股份有限公司 A kind of realization method and system of personalized recommendation
CN103577602A (en) * 2013-11-18 2014-02-12 浪潮(北京)电子信息产业有限公司 Secondary clustering method and system
CN103686236A (en) * 2013-11-19 2014-03-26 乐视致新电子科技(天津)有限公司 Method and system for recommending video resource
CN103647800A (en) * 2013-11-19 2014-03-19 乐视致新电子科技(天津)有限公司 Method and system of recommending application resources
CN103647800B (en) * 2013-11-19 2017-12-12 乐视致新电子科技(天津)有限公司 Recommend the method and system of application resource
WO2014180224A1 (en) * 2013-11-21 2014-11-13 中兴通讯股份有限公司 Method and device for service recommendation
CN104462383A (en) * 2014-12-10 2015-03-25 山东科技大学 Movie recommendation method based on feedback of users' various behaviors
CN104462383B (en) * 2014-12-10 2017-11-21 山东科技大学 A kind of film based on a variety of behavior feedbacks of user recommends method
CN104853248A (en) * 2015-05-07 2015-08-19 海信集团有限公司 Video recommendation method and device
CN104853248B (en) * 2015-05-07 2017-09-22 海信集团有限公司 A kind of video recommendation method and device
CN105812834A (en) * 2016-05-10 2016-07-27 南京大学 Video recommendation server, recommendation method and pre-caching method based on cluster information
CN105812834B (en) * 2016-05-10 2019-03-12 南京大学 Video recommendations server, recommended method and pre-cache method based on clustering information
CN106649540A (en) * 2016-10-26 2017-05-10 Tcl集团股份有限公司 Video recommendation method and system
CN106649540B (en) * 2016-10-26 2022-04-01 Tcl科技集团股份有限公司 Video recommendation method and system
CN107180088A (en) * 2017-05-10 2017-09-19 广西师范学院 News based on Fuzzy C-Means Cluster Algorithm recommends method
CN107194769A (en) * 2017-05-17 2017-09-22 东莞市华睿电子科技有限公司 A kind of Method of Commodity Recommendation that content is searched for based on user
CN110165657A (en) * 2018-08-30 2019-08-23 中国南方电网有限责任公司 Consider substation's load characteristics clustering analysis method of user's industry attribute
WO2020087388A1 (en) * 2018-10-31 2020-05-07 深圳市欢太科技有限公司 Quick application recommendation method and apparatus, storage medium, and electronic device
CN112673370A (en) * 2018-10-31 2021-04-16 深圳市欢太科技有限公司 Fast application recommendation method and device, storage medium and electronic equipment
CN112148920A (en) * 2020-08-11 2020-12-29 中标慧安信息技术股份有限公司 Data management method
CN114610234A (en) * 2022-02-28 2022-06-10 浪潮电子信息产业股份有限公司 Storage system parameter recommendation method and related device
CN114610234B (en) * 2022-02-28 2024-02-20 浪潮电子信息产业股份有限公司 Storage system parameter recommendation method and related device

Similar Documents

Publication Publication Date Title
CN102289478A (en) System and method for recommending video on demand based on fuzzy clustering
US20210248613A1 (en) Systems and methods for real-time processing of data streams
Wu et al. Contextual bandits in a collaborative environment
US11172048B2 (en) Method and apparatus for predicting experience degradation events in microservice-based applications
US9911143B2 (en) Methods and systems that categorize and summarize instrumentation-generated events
US20170300966A1 (en) Methods and systems that predict future actions from instrumentation-generated events
CN111797321B (en) Personalized knowledge recommendation method and system for different scenes
CN102075851B (en) Method and system for acquiring user preference in mobile network
Liu et al. Personalized recommendation of popular blog articles for mobile applications
WO2021196639A1 (en) Message pushing method and apparatus, and computer device and storage medium
WO2009032856A2 (en) Customized today module
US20220417339A1 (en) Feature-based network embedding
CN101957968A (en) Online transaction service aggregation method based on Hadoop
Zhang et al. LA-LMRBF: Online and long-term web service QoS forecasting
Bagherjeiran et al. Combining behavioral and social network data for online advertising
Varghese et al. Cluster optimization for enhanced web usage mining using fuzzy logic
Mehta et al. Collaborative personalized web recommender system using entropy based similarity measure
CN103095849A (en) A method and a system of spervised web service finding based on attribution forecast and error correction of quality of service (QoS)
CN107958070B (en) Personalized message pushing method based on user preference
Chen et al. User intent-oriented video QoE with emotion detection networking
Martín-Guerrero et al. Studying the feasibility of a recommender in a citizen web portal based on user modeling and clustering algorithms
Liu et al. Recognizing and characterizing dynamics of cellular devices in cellular data network through massive data analysis
Wang et al. Intent mining: A social and semantic enhanced topic model for operation-friendly digital marketing
CN116860856A (en) Financial data processing method and device, computer equipment and storage medium
Jun et al. Parallelized Jaccard-based learning method and MapReduce implementation for mobile devices recognition from massive network data

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20111221