Summary of the invention
The invention provides a method and a device for evaluating the quality of advertising media, to solve the problem that advertising media evaluation has low stability and accuracy.
According to one aspect of the present invention, a method for evaluating the quality of advertising media is provided, comprising: obtaining multiple groups of access information generated by users in response to a plurality of advertising media respectively, wherein each group of access information includes multiple kinds of access information; performing principal component analysis (PCA) on the multiple kinds of access information to obtain multiple principal components, wherein the principal components are mutually independent and cover most of the access information of the plurality of advertising media; and determining the quality of the plurality of advertising media according to the multiple principal components and the weight of each principal component, that is, the proportion of the information of all the principal components that it contains.
Preferably, performing principal component analysis on the multiple kinds of access information comprises: determining the correlation between the kinds of access information; and performing noise reduction and redundancy removal on the multiple kinds of access information according to the correlation.
Preferably, determining the correlation between the kinds of access information comprises: standardizing the multiple groups of access information and generating an original sample matrix from the standardized groups; and calculating the covariance matrix of the original sample matrix with respect to the multiple kinds of access information, wherein the elements of the covariance matrix indicate the correlation between the kinds of access information.
Preferably, performing noise reduction and redundancy removal on the multiple kinds of access information according to the correlation comprises: diagonalizing the covariance matrix to obtain a group of eigenvalues, wherein each eigenvalue represents the relative amount of information of the original sample matrix contained in the corresponding dimension; determining the contribution rate of each eigenvalue in the group; determining another group of eigenvalues according to the contribution rates and a preselected threshold condition, the eigenvectors corresponding to the other group of eigenvalues being the principal components, wherein the other group of eigenvalues is a subset of the first group; and calculating a new sample matrix from the principal components corresponding to the other group of eigenvalues and the original sample matrix, wherein each element of the new sample matrix represents the score of one advertising medium on one of the principal components.
Preferably, determining the quality of the plurality of advertising media according to the multiple principal components and the weight of each principal component comprises: determining the weight corresponding to each of the principal components, wherein the weight represents the proportion of the information of all the principal components that the corresponding principal component contains; and calculating a composite score for each of the plurality of advertising media according to the scores of that advertising medium on the principal components and the weights, wherein the composite score represents the quality of the corresponding advertising medium.
According to another aspect of the present invention, a device for evaluating the quality of advertising media is also provided, comprising: an acquisition module, configured to obtain multiple groups of access information generated by users in response to a plurality of advertising media respectively, wherein each group of access information includes multiple kinds of access information; an analysis module, configured to perform principal component analysis (PCA) on the multiple kinds of access information to obtain multiple principal components, wherein the principal components are mutually independent and cover most of the access information of the plurality of advertising media; and a determination module, configured to determine the quality of the plurality of advertising media according to the multiple principal components and the weight of each principal component, that is, the proportion of the information of all the principal components that it contains.
Preferably, the analysis module comprises: a first determining unit, configured to determine the correlation between the kinds of access information; and a processing unit, configured to perform noise reduction and redundancy removal on the multiple kinds of access information according to the correlation.
Preferably, the first determining unit comprises: a processing subunit, configured to standardize the multiple groups of access information and to generate an original sample matrix from the standardized groups; and a first calculation subunit, configured to calculate the covariance matrix of the original sample matrix with respect to the multiple kinds of access information, wherein the elements of the covariance matrix indicate the correlation between the kinds of access information.
Preferably, the processing unit comprises: a diagonalization subunit, configured to diagonalize the covariance matrix to obtain a group of eigenvalues, wherein each eigenvalue represents the relative amount of information of the original sample matrix contained in the corresponding dimension; a first determining subunit, configured to determine the contribution rate of each eigenvalue in the group; a second determining subunit, configured to determine another group of eigenvalues according to the contribution rates and a preselected threshold condition, the eigenvectors corresponding to the other group of eigenvalues being the principal components, wherein the other group of eigenvalues is a subset of the first group; and a second calculation subunit, configured to calculate a new sample matrix from the principal components corresponding to the other group of eigenvalues and the original sample matrix, wherein each element of the new sample matrix represents the score of one advertising medium on one of the principal components.
Preferably, the determination module comprises: a second determining unit, configured to determine the weight corresponding to each of the principal components, wherein the weight represents the proportion of the information of all the principal components that the corresponding principal component contains; and a calculation unit, configured to calculate a composite score for each of the plurality of advertising media according to the scores of that advertising medium on the principal components and the weights, wherein the composite score represents the quality of the corresponding advertising medium.
With the present invention, multiple groups of access information generated by users in response to a plurality of advertising media respectively are obtained, each group including multiple kinds of access information; principal component analysis (PCA) is performed on the multiple kinds of access information to obtain multiple principal components that are mutually independent and cover most of the access information of the plurality of advertising media; and the quality of the plurality of advertising media is determined according to the principal components and the weight of each principal component among all the principal components. This solves the problem of low stability and accuracy in advertising media evaluation and improves both.
Embodiment
It should be noted that, where there is no conflict, the embodiments in the present application and the features in those embodiments can be combined with one another. The present invention is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
This embodiment provides a method for evaluating the quality of advertising media. Fig. 1 is a schematic flowchart of the method according to an embodiment of the present invention. As shown in Fig. 1, the flow comprises the following steps:
Step S102: obtaining multiple groups of access information generated by users in response to a plurality of advertising media respectively, wherein each group of access information includes multiple kinds of access information;
Step S104: performing principal component analysis (PCA) on the multiple kinds of access information to obtain multiple principal components, wherein the principal components are mutually independent and cover most of the access information of the plurality of advertising media;
Step S106: determining the quality of the plurality of advertising media according to the multiple principal components and the weight of each principal component, that is, the proportion of the information of all the principal components that it contains.
Through the above steps, principal component analysis is performed on the multiple groups of access information of the plurality of advertising media, and the quality of the advertising media is determined according to principal components that are mutually independent and cover most of the access information, together with the weight of each principal component among all the principal components. Interference from multicollinearity between the kinds of access information is thereby avoided. Moreover, because the principal components obtained cover most of the access information, the objectivity of the evaluated data and of the evaluation result is also ensured. This embodiment therefore solves the problem of low stability and accuracy in advertising media evaluation and improves both.
When statistical analysis is used to study a multivariate problem, too many variables increase the complexity of the problem. One naturally wants fewer variables that carry more information. In many cases the variables are correlated; when two variables are correlated, the information they reflect about the problem can be understood to overlap to some extent. Principal component analysis constructs, from all the originally proposed variables, as few new variables as possible, such that the new variables are pairwise uncorrelated and retain as much of the original information as possible.
The related art offers several methods of principal component analysis, and any of them can be applied in the embodiments of the present invention to similar effect. Preferably, principal component analysis in this embodiment comprises: determining the correlation between the kinds of access information; and performing noise reduction and redundancy removal on the multiple kinds of access information according to the correlation.
Preferably, determining the correlation between the kinds of access information comprises: standardizing the multiple groups of access information and generating an original sample matrix from the standardized groups; and calculating the covariance matrix of the original sample matrix with respect to the multiple kinds of access information, wherein the elements of the covariance matrix indicate the correlation between the kinds of access information.
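The standardization and covariance step can be sketched as follows. This is a minimal illustration rather than the method as claimed: it assumes NumPy, assumes that standardization means subtracting the column mean and dividing by the sample standard deviation, and uses hypothetical function names (`standardize`, `covariance_matrix`) and made-up data.

```python
import numpy as np

def standardize(raw):
    """Center each column and scale it to unit sample standard deviation."""
    raw = np.asarray(raw, dtype=float)
    mean = raw.mean(axis=0)
    std = raw.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # assumption: leave constant columns unscaled
    return (raw - mean) / std

def covariance_matrix(s):
    """Covariance of the columns of an already-centered n x p sample matrix."""
    n = s.shape[0]
    return s.T @ s / (n - 1)

# Hypothetical data: 5 advertising media (rows) x 3 kinds of access information (columns).
raw = np.array([
    [1200, 300, 12.0],
    [ 800, 210,  9.5],
    [1500, 390, 15.0],
    [ 400,  90,  4.0],
    [1000, 260, 11.0],
])
S = standardize(raw)        # original sample matrix after standardization
C = covariance_matrix(S)    # p x p covariance matrix
print(np.round(C, 3))
```

Because every column is scaled to unit variance, the covariance matrix computed here coincides with the correlation matrix of the raw data, which is what allows kinds of access information with different units to be compared.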
Preferably, performing noise reduction and redundancy removal on the multiple kinds of access information according to the correlation comprises: diagonalizing the covariance matrix to obtain a group of eigenvalues, wherein each eigenvalue represents the relative amount of information of the original sample matrix contained in the corresponding dimension; determining the contribution rate of each eigenvalue in the group; determining another group of eigenvalues according to the contribution rates and a preselected threshold condition, the eigenvectors corresponding to the other group of eigenvalues being the principal components, wherein the other group of eigenvalues is a subset of the first group; and calculating a new sample matrix from the principal components corresponding to the other group of eigenvalues and the original sample matrix, wherein each element of the new sample matrix represents the score of one advertising medium on one of the principal components.
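A sketch of the diagonalization and projection described above, assuming NumPy and taking the number of retained components `d` as given (how `d` is chosen is a separate matter). The function names and the random data are hypothetical and only serve the illustration.

```python
import numpy as np

def principal_components(cov, d):
    """Return the d largest eigenvalues of cov and their eigenvectors (as columns)."""
    # eigh is intended for symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:d]
    return eigvals[order], eigvecs[:, order]

def project(s, h1):
    """New sample matrix: one row per medium, one column per retained component."""
    return s @ h1

# Hypothetical data: 50 media x 4 kinds of access information, one column nearly redundant.
rng = np.random.default_rng(0)
raw = rng.normal(size=(50, 4))
raw[:, 1] = 2 * raw[:, 0] + rng.normal(scale=0.1, size=50)

S = (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)
C = S.T @ S / (S.shape[0] - 1)

vals, H1 = principal_components(C, d=3)
S1 = project(S, H1)
print(S1.shape)
```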
Preferably, determining the quality of the plurality of advertising media according to the multiple principal components and the weight of each principal component comprises: determining the weight corresponding to each of the principal components, wherein the weight represents the proportion of the information of all the principal components that the corresponding principal component contains; and calculating a composite score for each of the plurality of advertising media according to the scores of that advertising medium on the principal components and the weights, wherein the composite score represents the quality of the corresponding advertising medium.
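The weighting step can be illustrated with a short sketch. It assumes, as one reading of the text, that the weight of a principal component is its eigenvalue divided by the sum of the retained eigenvalues; the function name `composite_scores` and the numbers are hypothetical.

```python
import numpy as np

def composite_scores(scores, eigvals):
    """Weight each component score by its share of the retained eigenvalues."""
    weights = eigvals / eigvals.sum()
    return scores @ weights, weights

# Hypothetical scores of 3 media on 2 retained principal components.
scores = np.array([
    [ 1.8, -0.2],
    [-0.5,  0.9],
    [-1.3, -0.7],
])
eigvals = np.array([2.4, 0.6])

total, weights = composite_scores(scores, eigvals)
ranking = np.argsort(total)[::-1]  # media indices from highest to lowest composite score
print(weights, total, ranking)
```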
Fig. 2 is a schematic flowchart of a preferred method for evaluating the quality of advertising media according to an embodiment of the present invention. As shown in Fig. 2, this embodiment provides a preferred method for evaluating the quality of advertising media, comprising the following steps:
Step S202: obtaining multiple groups of access information generated by users in response to a plurality of advertising media respectively, wherein each group of access information includes the same multiple kinds of access information;
Step S204: determining the correlation between the kinds of access information;
Step S206: performing noise reduction and redundancy removal on the multiple kinds of access information according to the correlation;
Step S208: determining the quality of the plurality of advertising media according to the multiple groups of new principal component scores and the weights determined after noise reduction and redundancy removal.
Through the above steps, noise reduction and redundancy removal are applied on the basis of the correlation between the kinds of access information shared by the plurality of advertising media, so that the quality of the advertising media can be evaluated according to the newly determined principal components and their weights. The related art does not consider the correlation between kinds of access information, so its evaluation results lack stability and accuracy. In this embodiment, by contrast, the newly determined groups of information remove the effect on the evaluation result of two or more kinds of access information that are correlated to some degree. This solves the problem of low stability and accuracy in advertising media evaluation and improves both.
In the above steps, each advertising medium corresponds to one group of access information and to one group of new principal component scores.
Preferably, the related art offers several ways of determining the correlation between two elements. In this embodiment, for example, the multiple groups of access information corresponding to the plurality of advertising media are arranged into a matrix, and the correlation between the kinds of access information is determined by calculating the covariance matrix of the resulting original sample matrix. Each element of the covariance matrix indicates the correlation between the two corresponding kinds of access information: the larger the absolute value of the element, the stronger the correlation, and a value of zero means that the two kinds are uncorrelated. The elements of the covariance matrix can thus indicate the correlation between each pair among the at least two kinds of access information.
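How the elements of the covariance matrix reflect correlation can be seen in a small synthetic sketch. The three columns (`visits`, `browse_time`, `clicks`) and the way they are generated are assumptions made purely for illustration: the second column is built to follow the first closely, and the third is generated independently.

```python
import numpy as np

rng = np.random.default_rng(42)
visits = rng.normal(size=2000)
browse_time = 0.9 * visits + 0.1 * rng.normal(size=2000)  # strongly tied to visits
clicks = rng.normal(size=2000)                            # generated independently

X = np.column_stack([visits, browse_time, clicks])
S = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)  # standardized sample matrix
C = np.cov(S, rowvar=False)                       # columns are the variables

print(np.round(C, 2))
# C[0, 1] has a large absolute value (the first two columns are strongly correlated);
# C[0, 2] and C[1, 2] are close to zero (the third column is unrelated to the others).
```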
In the above embodiment, in order to build the original sample matrix from different kinds of access information, the access information needs to be standardized, which is also called centering or normalization.
Preferably, several kinds of access information may be strongly correlated. For example, for an advertising medium that pushes content of fixed duration, the number of visits and the browsing time may be strongly correlated. In this case, if the quality of the advertising medium is evaluated on both the number of visits and the browsing time at once, the indices interfere with one another through collinearity and the evaluation result is biased upward. It should be noted that this is only a simple example of what may arise when this embodiment is applied; practical situations are not limited to it and may be more complex.
To remove the effect on the evaluation result of kinds of access information that are correlated to some degree, and at the same time to avoid the inaccuracy and chance factors introduced by manual judgment, this embodiment applies a predetermined noise reduction procedure to perform noise reduction and redundancy removal on the elements of the covariance matrix. For example, the covariance matrix is diagonalized to obtain a group of eigenvalues and a new matrix formed from the corresponding eigenvectors. The number of eigenvalues in the group equals the number of kinds of access information in one group of access information, and each eigenvalue serves as an index of how much influence the corresponding new principal component has. For example, if an eigenvalue is less than 1, the explanatory power of that principal component is lower than the average explanatory power of directly introducing one original variable. During redundancy removal, therefore, a predetermined value can be set for the eigenvalues, and the eigenvalues in the group that are smaller than the predetermined value are removed to obtain another, new group of eigenvalues. Finally, the new principal components are determined from the eigenvectors corresponding to the new group of eigenvalues. The new principal components can explain the great majority of each group of original access information; that is, the principal components corresponding to the selected eigenvalues cover most of the information contained in the original sample matrix. The quality of the advertising media can therefore be evaluated accurately according to the multiple groups of new principal components thus determined.
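The eigenvalue cutoff described above can be sketched as follows, assuming NumPy and a predetermined value of 1 for the cutoff. The function name `keep_by_cutoff` and the example covariance matrix (three standardized indices with pairwise correlation 0.8) are hypothetical.

```python
import numpy as np

def keep_by_cutoff(eigvals, eigvecs, cutoff=1.0):
    """Drop eigenvalues below the predetermined value, with their eigenvectors."""
    keep = eigvals >= cutoff
    return eigvals[keep], eigvecs[:, keep]

# Hypothetical covariance matrix of three standardized, mutually correlated indices.
C = np.array([
    [1.0, 0.8, 0.8],
    [0.8, 1.0, 0.8],
    [0.8, 0.8, 1.0],
])
eigvals, eigvecs = np.linalg.eigh(C)  # eigenvalues are approximately 0.2, 0.2 and 2.6
kept_vals, kept_vecs = keep_by_cutoff(eigvals, eigvecs, cutoff=1.0)
print(kept_vals, kept_vecs.shape)
```

In this example only one component survives the cutoff, which matches the intuition that three strongly correlated indices carry roughly one dimension of information.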
Preferably, in the above redundancy removal, the contribution rate of each eigenvalue in the group can be determined from the value of that eigenvalue relative to the sum of all the eigenvalues in the group. Preferably, after the contribution rates have been determined, either the largest eigenvalues whose cumulative contribution rate reaches 90% or more are taken as the other group of eigenvalues, or the smallest eigenvalues whose cumulative contribution rate is less than 10% are removed and the remaining eigenvalues are taken as the other group of eigenvalues.
Preferably, in step S208, after a new sample matrix has been generated from the new principal components determined by noise reduction and redundancy removal, the quality of each of the plurality of advertising media can be determined from the new sample matrix using a predetermined evaluation scheme.
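The contribution rate and the 90% rule can be sketched like this, assuming NumPy and taking the contribution rate of an eigenvalue to be its share of the sum of all eigenvalues. The names `contribution_rates` and `choose_d` and the eigenvalues used are hypothetical.

```python
import numpy as np

def contribution_rates(eigvals):
    """Share of each eigenvalue in the sum of all eigenvalues."""
    eigvals = np.asarray(eigvals, dtype=float)
    return eigvals / eigvals.sum()

def choose_d(eigvals, threshold=0.90):
    """Smallest number of largest eigenvalues whose cumulative rate reaches the threshold."""
    rates = np.sort(contribution_rates(eigvals))[::-1]
    cumulative = np.cumsum(rates)
    d = int(np.searchsorted(cumulative, threshold)) + 1
    return min(d, len(rates))  # guard against rounding when threshold is close to 1

eigvals = np.array([2.5, 1.2, 0.2, 0.1])
print(contribution_rates(eigvals))  # individual contribution rates
print(choose_d(eigvals))            # number of eigenvalues kept at the 90% threshold
```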
Corresponding to the above method embodiment, this embodiment also provides a device for evaluating the quality of advertising media. The functions of the device embodiment have been described in detail in the method embodiment; where there is no conflict, this embodiment may be read in conjunction with the method embodiment, and the details are not repeated here.
Fig. 3 is a schematic structural diagram of a device for evaluating the quality of advertising media according to an embodiment of the present invention. As shown in Fig. 3, the device comprises an acquisition module 32, an analysis module 34 and a determination module 36. The acquisition module 32 is configured to obtain multiple groups of access information generated by users in response to a plurality of advertising media respectively, wherein each group of access information includes multiple kinds of access information. The analysis module 34 is coupled to the acquisition module 32 and is configured to perform principal component analysis (PCA) on the multiple kinds of access information to obtain multiple principal components, wherein the principal components are mutually independent and cover most of the access information of the plurality of advertising media. The determination module 36 is coupled to the analysis module 34 and is configured to determine the quality of the plurality of advertising media according to the multiple principal components and the weight of each principal component among all the principal components.
The modules and units involved in the embodiments of the present invention may be implemented in software or in hardware. The modules and units in this embodiment may also be arranged in a processor; for example, this may be described as: a processor comprising an acquisition module 32, an analysis module 34 and a determination module 36. In some cases the names of these modules do not limit the modules themselves; for example, the acquisition module may also be described as "a module for obtaining multiple groups of access information generated by users in response to a plurality of advertising media respectively".
Preferably, the analysis module 34 comprises: a first determining unit, configured to determine the correlation between the kinds of access information; and a processing unit, coupled to the first determining unit and configured to perform noise reduction and redundancy removal on the multiple kinds of access information according to the correlation.
Preferably, the first determining unit comprises: a processing subunit, configured to standardize the multiple groups of access information and to generate an original sample matrix from the standardized groups; and a first calculation subunit, coupled to the processing subunit and configured to calculate the covariance matrix of the original sample matrix with respect to the multiple kinds of access information, wherein the elements of the covariance matrix indicate the correlation between the kinds of access information.
Preferably, the processing unit comprises: a diagonalization subunit, configured to diagonalize the covariance matrix to obtain a group of eigenvalues, wherein each eigenvalue represents the relative amount of information of the original sample matrix contained in the corresponding dimension; a first determining subunit, coupled to the diagonalization subunit and configured to determine the contribution rate of each eigenvalue in the group; a second determining subunit, coupled to the first determining subunit and configured to determine another group of eigenvalues according to the contribution rates and a preselected threshold condition, the eigenvectors corresponding to the other group of eigenvalues being the principal components, wherein the other group of eigenvalues is a subset of the first group; and a second calculation subunit, coupled to the second determining subunit and configured to calculate a new sample matrix from the principal components corresponding to the other group of eigenvalues and the original sample matrix, wherein each element of the new sample matrix represents the score of one advertising medium on one of the principal components.
Preferably, the determination module 36 comprises: a second determining unit, configured to determine the weight corresponding to each of the principal components, wherein the weight represents the proportion of the information of all the principal components that the corresponding principal component contains; and a calculation unit, coupled to the second determining unit and configured to calculate a composite score for each of the plurality of advertising media according to the scores of that advertising medium on the principal components and the weights, wherein the composite score represents the quality of the corresponding advertising medium.
The present invention is described below with reference to preferred embodiments and implementations.
Fig. 4 is a flowchart of evaluating advertising media quality samples according to a preferred embodiment of the present invention. As shown in Fig. 4, the flow comprises the following steps:
Step S402: sampling users' responses to the advertising media to form a sample matrix;
Step S404: standardizing each element of the sample matrix so that the samples are centered;
Step S406: building a standardized sample matrix from the standardized elements;
Step S408: calculating the covariance matrix of the standardized sample matrix;
Step S410: performing noise reduction and redundancy removal on the covariance matrix, and building a new sample matrix from which the dimensions that contain little of the users' access information have been removed, wherein the new sample matrix is the product of the standardized original matrix and the new eigenvector matrix;
Step S412: determining the index weights according to the new group of eigenvalues;
Step S414: calculating the composite score of each advertising medium according to the index weights, wherein the composite score identifies the quality of the corresponding advertising medium.
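Steps S402 to S414 can be strung together in one short sketch. It is an illustration under stated assumptions, not the implementation of the embodiment: it uses NumPy, assumes no index column is constant, takes the 90% cumulative contribution rate as the threshold condition, and takes each weight as the eigenvalue's share among the retained eigenvalues. The function name `evaluate_media` and the data are hypothetical.

```python
import numpy as np

def evaluate_media(raw, threshold=0.90):
    """Return (composite score per medium, number of retained components)."""
    raw = np.asarray(raw, dtype=float)
    n = raw.shape[0]
    # S404-S406: standardize each index and build the standardized sample matrix.
    S = (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)
    # S408: covariance matrix of the standardized sample matrix.
    C = S.T @ S / (n - 1)
    # S410: diagonalize, sort eigenvalues in descending order, keep the largest d.
    eigvals, H = np.linalg.eigh(C)
    order = np.argsort(eigvals)[::-1]
    eigvals, H = eigvals[order], H[:, order]
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    d = min(int(np.searchsorted(cumulative, threshold)) + 1, len(eigvals))
    S1 = S @ H[:, :d]  # new sample matrix
    # S412: index weights from the retained eigenvalues.
    weights = eigvals[:d] / eigvals[:d].sum()
    # S414: composite score of each medium.
    return S1 @ weights, d

# S402 (hypothetical sample): 6 media x 4 response indices.
raw = np.array([
    [1200, 300, 40, 12.0],
    [ 800, 210, 22,  9.5],
    [1500, 390, 55, 15.0],
    [ 400,  90, 10,  4.0],
    [1000, 260, 35, 11.0],
    [ 600, 140, 18,  6.5],
])
scores, d = evaluate_media(raw)
print(d, np.round(scores, 3))
```

One caveat that the text does not address: the sign of each eigenvector returned by an eigensolver is arbitrary, so a practical implementation would need a convention for orienting each component (for example, so that a higher score means a better response) before the composite scores are ranked.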
Through the above steps, the multicollinearity interference caused by feeding many indices into the model is avoided, so that the indices no longer carry overlapping information. Problems such as differing units of measurement and how to determine the weight coefficients when the indices are summed are also resolved.
The above steps are described in detail below using a concrete example.
In this preferred embodiment, an algorithm is used to evaluate the quality of the n media in which an advertiser currently places advertisements, and the quality of the advertisements. After the data are collected, an $n \times p$ matrix is formed. The sample observation data matrix of the online advertising media is:
In step S404, the original index data are standardized (centered):
Wherein,
In steps S406 to S408, the standardized sample matrix is denoted S, and its covariance matrix C is calculated as:
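One conventional way to write steps S404 to S408, consistent with the surrounding description, is given below. The symbols $x_{ij}$ (raw value of index j for medium i), $\bar{x}_j$, $s_j$ and $x^{*}_{ij}$ are introduced here as assumptions, as is the choice of $n-1$ in the denominators; the text itself fixes only the names S and C and the sizes n and p.

```latex
x^{*}_{ij} = \frac{x_{ij} - \bar{x}_j}{s_j}, \qquad
\bar{x}_j = \frac{1}{n} \sum_{i=1}^{n} x_{ij}, \qquad
s_j = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} \left( x_{ij} - \bar{x}_j \right)^{2}}

S = \left( x^{*}_{ij} \right) \in \mathbb{R}^{n \times p}, \qquad
C = \frac{1}{n-1} S^{T} S \in \mathbb{R}^{p \times p}
```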
In step S410, the covariance matrix C is diagonalized by finding an orthogonal matrix H that satisfies:
$H^{T} C H = \Lambda$, with $H, \Lambda \in \mathbb{R}^{p \times p}$.
The p eigenvalues obtained are $\gamma_1, \gamma_2, \ldots, \gamma_p$.
The dimensions corresponding to the d largest eigenvalues (d < p) are kept. These d eigenvalues form a new diagonal matrix $\Lambda_1 \in \mathbb{R}^{d \times d}$, and the corresponding d eigenvectors form a new eigenvector matrix $H_1 \in \mathbb{R}^{p \times d}$.
In step S412, the contribution rates of the d eigenvalues corresponding to the d new indices are calculated as:
Assuming that the cumulative contribution rate of the d eigenvalues must reach 90% or more, the value of d is determined accordingly.
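The contribution rate and the 90% condition can be written as follows, assuming the eigenvalues are ordered so that $\gamma_1 \ge \gamma_2 \ge \cdots \ge \gamma_p$ and that each contribution rate is taken relative to the sum of all p eigenvalues. The symbol $\eta_i$ is introduced here for illustration and does not appear in the text.

```latex
\eta_i = \frac{\gamma_i}{\sum_{j=1}^{p} \gamma_j}, \qquad
\sum_{i=1}^{d} \eta_i \ge 0.9
```

Under this reading, d is the smallest integer for which the second condition holds.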
The new sample matrix is then obtained: $S_1 = S H_1$, with $S_1 \in \mathbb{R}^{n \times d}$.
In step S414, once the new sample matrix has been obtained, the dimension of each sample becomes d, and the new indices of the new samples are $F_1, F_2, \ldots, F_d$.
The composite score can then be calculated by the following formula:
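A form of the composite score that agrees with the description of the weights as each component's share of the information of all retained components is sketched below. The symbols $w_i$ and $Z$ are assumptions introduced for illustration, and taking the denominator over the d retained eigenvalues is one possible reading.

```latex
w_i = \frac{\gamma_i}{\sum_{j=1}^{d} \gamma_j}, \qquad
Z = \sum_{i=1}^{d} w_i F_i
```

Here $F_i$ is the i-th column of $S_1$, so Z holds one composite score per advertising medium.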
In the related art, the index information (that is, the access information) used to evaluate advertising media quality suffers from multicollinearity. In this preferred embodiment the indices undergo noise reduction and redundancy removal. Centering the samples in step S404 and building the standardized sample matrix in step S406 make the data dimensionless, so that indices with different units or orders of magnitude can be compared and weighted. Calculating the covariance matrix in step S408 and the new sample matrix in step S410 achieves noise reduction and redundancy removal. The purpose of "noise reduction" is to make the correlation between the retained dimensions as small as possible, that is, to make the off-diagonal elements of the p×p covariance matrix essentially zero, which is exactly what diagonalizing the covariance matrix does. "Redundancy removal" means removing the dimensions that correspond to the smaller variances on the diagonal of the diagonalized covariance matrix, so that the "energy" (that is, the variance) contained in the retained dimensions is as large as possible. This preferred embodiment keeps only the dimensions that contain larger energy (eigenvalues).
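The two effects described above can be checked numerically. The sketch below assumes NumPy and uses synthetic data in which two of three indices are nearly duplicates; after projection onto the two leading eigenvectors, the covariance matrix of the new samples has essentially zero off-diagonal elements, and its diagonal holds the retained eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(1)
base = rng.normal(size=(200, 1))
X = np.hstack([
    base + 0.05 * rng.normal(size=(200, 1)),  # index 1
    base + 0.05 * rng.normal(size=(200, 1)),  # index 2, nearly redundant with index 1
    rng.normal(size=(200, 1)),                # index 3, unrelated
])

S = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
C = np.cov(S, rowvar=False)

eigvals, H = np.linalg.eigh(C)
order = np.argsort(eigvals)[::-1]
eigvals, H = eigvals[order], H[:, order]

S1 = S @ H[:, :2]               # keep the two dimensions with the largest variance
C1 = np.cov(S1, rowvar=False)   # covariance matrix of the new samples

off_diagonal = C1 - np.diag(np.diag(C1))
print(np.round(C1, 6))
```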
This preferred embodiment can be applied to advertising media (including automobile and fast-moving consumer goods advertising, among others): the advertising media in which an advertiser places advertisements are ranked according to the index produced by the algorithm. For example, a medium with a large number of visits tends to bring correspondingly more ad clicks and conversions, so evaluating all of these indices directly, without processing, introduces multicollinearity and a large error in the result. The method of the above preferred embodiment weights the indices: an index with a large effect receives a large weight, and otherwise a smaller one, so that the quality of the advertising media is evaluated accurately.
It should be noted that the steps shown in the flowcharts of the accompanying drawings can be executed in a computer system, for example as a set of computer-executable instructions. Moreover, although a logical order is shown in the flowcharts, in some cases the steps shown or described can be executed in a different order.
Obviously, those skilled in the art should understand that the modules or steps of the present invention described above can be implemented with general-purpose computing devices. They can be concentrated on a single computing device or distributed over a network formed by multiple computing devices. Optionally, they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; or they can each be made into an individual integrated circuit module; or several of the modules or steps can be made into a single integrated circuit module. The present invention is thus not restricted to any specific combination of hardware and software.
The foregoing describes only the preferred embodiments of the present invention and is not intended to limit the present invention. For those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.