CN107229702A

CN107229702A - Micro-video popularity prediction method based on low-rank constraint and multi-view feature fusion

Info

Publication number: CN107229702A
Application number: CN201710378158.6A
Authority: CN
Inventors: 苏育挺; 白须; 井佩光; 张静
Original assignee: Tianjin University
Current assignee: Tianjin University
Priority date: 2017-05-24
Filing date: 2017-05-24
Publication date: 2017-10-03
Anticipated expiration: 2037-05-24
Also published as: CN107229702B

Abstract

The invention discloses a micro-video popularity prediction method based on low-rank constraint and multi-view feature fusion. The method comprises: performing low-rank approximation separately on the features of 4 view modalities to obtain 4 kinds of denoised low-rank feature information; fusing the 4 kinds of low-rank feature information by multi-view canonical correlation analysis; using the fused feature information to build a Laplacian matrix representing the graph relationships among the micro-videos; and, based on the Laplacian matrix, predicting the popularity of the micro-videos with a semi-supervised method. The invention avoids the limitation that single-view features impose on popularity prediction, and processes the features of each view under a low-rank constraint, so that the Laplacian matrix built from the features is more stable.

Description

Micro-video popularity prediction method based on low-rank constraint and multi-view feature fusion

Technical field

The present invention relates to the field of micro-video popularity prediction, and more particularly to a micro-video popularity prediction method based on low-rank constraint and multi-view feature fusion.

Background art

With the spread of network technology and social platforms, micro-videos, as a new form of user-generated content, are receiving more and more attention. A micro-video is a video clip as short as 30 seconds and no longer than 20 minutes. Micro-videos not only suit online viewing habits and the characteristics of mobile terminals under the fast pace of modern life, but also satisfy consumers' sense of autonomous participation and their demand for a return on attention in an era of exploding entertainment and scarce attention. It is foreseeable that micro-videos will bring the public video enjoyment anytime and anywhere. Micro-video popularity prediction provides guidance for advertisement pushing, video recommendation and bandwidth reservation, so predicting micro-video popularity is of great significance.

In real life, each object can be represented by features from multiple views. For example, a micro-video may be represented by acoustic features, visual features, social attribute features, text features and other forms. Features from different views play different roles in predicting micro-video popularity, so feature fusion and feature selection are currently popular ways of handling multi-view features^[1].

In the course of realizing the present invention, the inventors found that the prior art has at least the following shortcomings and deficiencies:

In practical applications, changes in the external environment and camera shake contaminate micro-videos, so the features extracted from the videos contain noise and cannot be fully relied upon. Current methods do not adequately address the influence of noise on the features and cannot meet the various needs of practical applications.

Summary of the invention

The invention provides a micro-video popularity prediction method based on low-rank constraint and multi-view feature fusion. The invention avoids the limitation that single-view features impose on popularity prediction, and processes the features of each view under a low-rank constraint, so that the Laplacian matrix built from the features is more stable, as described below:

A micro-video popularity prediction method based on low-rank constraint and multi-view feature fusion, the method comprising:

performing low-rank approximation separately on the features of 4 view modalities to obtain 4 kinds of denoised low-rank feature information;

fusing the 4 kinds of low-rank feature information by multi-view canonical correlation analysis;

using the fused feature information to build a Laplacian matrix representing the graph relationships among the micro-videos; and, based on the Laplacian matrix, predicting the popularity of the micro-videos with a semi-supervised method.

The method further comprises: extracting the features of the 4 view modalities from the given micro-videos.

The features of the 4 view modalities are specifically: visual features, acoustic features, text features and social attribute features.

The fusing of the 4 kinds of low-rank feature information by multi-view canonical correlation analysis is specifically:

projecting the low-rank features of each view onto a common subspace so that the cosine similarity between the projections is maximized, representing the fused feature space by this common subspace of the low-rank features, and predicting micro-video popularity on this basis.

The using of the fused feature information to build a Laplacian matrix representing the graph relationships among the micro-videos is specifically:

where L is the normalized Laplacian matrix of the low-rank feature subspace, and D is a diagonal matrix whose diagonal entries are the column sums of the similarity matrix.

The beneficial effects of the technical solution provided by the invention are:

1. Low-rank approximation is applied to the feature information of each view, yielding more compact structural features, and noise is removed, so that the resulting Laplacian matrix is more stable;

2. The information of the 4 views is learned by multi-view canonical correlation analysis to achieve feature fusion; a common subspace is learned, which removes the limitation that a single feature space imposes on the prediction results;

3. The resulting Laplacian matrix represents the graph relationships among the micro-video features, which improves the accuracy of popularity prediction and meets various needs of practical applications.

Brief description of the drawings

Fig. 1 is a flow chart of a micro-video popularity prediction method based on low-rank constraint and multi-view feature fusion;

Fig. 2 is a schematic diagram comparing the results of the method proposed by the invention with those of other popularity prediction algorithms.

Detailed description of the embodiments

To make the objectives, technical solutions and advantages of the invention clearer, the embodiments of the invention are described in further detail below.

Embodiment 1

To achieve good prediction results, a method is needed that predicts micro-video popularity comprehensively, automatically and accurately. Research shows that micro-videos with similar features have similar popularity. The embodiment of the invention proposes a micro-video popularity prediction method based on low-rank constraint and multi-view feature fusion; see Fig. 1 and the description below:

101: Perform low-rank approximation separately on the features of 4 view modalities to obtain 4 kinds of denoised low-rank feature information;

102: Fuse the 4 kinds of low-rank feature information by multi-view canonical correlation analysis;

103: Use the fused feature information to build a Laplacian matrix representing the graph relationships among the micro-videos; based on the Laplacian matrix, predict the popularity of the micro-videos with a semi-supervised method.

Before step 101, the method further comprises: extracting the features of the 4 view modalities from the given micro-videos.

Further, the features of the 4 view modalities are specifically: visual features, acoustic features, text features and social attribute features.

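Wherein, the fusing of the 4 kinds of low-rank feature information by multi-view canonical correlation analysis in step 102 is specifically: projecting the low-rank features of each view onto a common subspace so that the cosine similarity between the projections is maximized, and representing the fused feature space by this common subspace.

Wherein, the building in step 103 of the Laplacian matrix representing the graph relationships among the micro-videos from the fused feature information is specifically: computing the similarity between micro-videos in the low-rank feature subspace and forming the normalized Laplacian matrix from it, as detailed in step 204 of Embodiment 2.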

In summary, through steps 101 to 103 the embodiment of the invention avoids the limitation that single-view features impose on popularity prediction, and processes the features of each view under a low-rank constraint, so that the Laplacian matrix built from the features is more stable.

Embodiment 2

The scheme of Embodiment 1 is further described below with reference to specific calculation formulas and examples:

201: Extract the features of 4 view modalities from the given micro-videos, namely visual features, acoustic features, text features and social attribute features;

The embodiment of the invention first extracts, from the given micro-videos, 4 kinds of features commonly used in micro-video research: visual features, acoustic features, text features and social attribute features.

1. Visual features include: color histogram information, object information in the micro-video (which can be obtained with a convolutional neural network or by other methods; the embodiment of the invention places no limitation on this), and aesthetic features.

2. Acoustic features include: features of the music and other main background sounds in the micro-video.

3. Text features include: the text annotations of the micro-video and the like, which can be obtained directly with the word2vec^[2] method.

4. Social attribute features refer to information about the user account, including whether the account is verified, the number of followers, and the like.

The features of these 4 modalities all influence popularity prediction and complement one another.

The visual, acoustic, text and social attribute features above are well-known technical terms in the micro-video field; the embodiment of the invention only introduces them briefly and does not describe them further here.
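As a rough illustration of the simplest of the visual features listed above, the following sketch computes a per-channel color histogram for one micro-video. The function name `color_histogram`, the bin count and the frame layout are assumptions made for this example; the patent does not specify how the histogram is computed.

```python
import numpy as np

def color_histogram(frames, bins=8):
    """Per-channel color histogram pooled over the frames of one micro-video.

    frames: uint8 array of shape (num_frames, height, width, 3) (assumed layout).
    Returns a vector of length 3 * bins; each channel's block sums to 1.
    """
    frames = np.asarray(frames)
    channels = []
    for c in range(3):
        # Count the intensities of channel c across every pixel of every frame.
        counts, _ = np.histogram(frames[..., c], bins=bins, range=(0, 256))
        channels.append(counts / counts.sum())
    return np.concatenate(channels)

# Hypothetical stand-in for decoded video frames (random pixels).
rng = np.random.default_rng(0)
video = rng.integers(0, 256, size=(5, 32, 32, 3), dtype=np.uint8)
hist = color_histogram(video)
```

Stacking one such vector per micro-video would give one possible visual feature matrix for the later steps.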

202: Process the features of the 4 view modalities separately with a low-rank approximation method, obtaining 4 kinds of denoised low-rank feature information;

When the features of the 4 view modalities are actually extracted, the noise of the micro-video itself, the shooting angle and similar influences make the resulting video graph relationships less applicable. The extracted micro-video modality features are therefore processed by low-rank approximation to remove contaminating information such as noise, so that the resulting Laplacian matrix is more stable. The implicit low-rank processing is formulated as follows:

$$\min_{Z_k,\,E_k} \; \|Z_k\|_* + \lambda \|E_k\|_1 \quad \text{s.t.} \quad X_k = A_k Z_k + E_k \qquad (1)$$

where λ is a balance constant, $\|\cdot\|_1$ denotes the l1 norm of a matrix, $\|\cdot\|_*$ denotes the trace norm (nuclear norm) of a matrix, X_k is the original feature data of the k-th view, Z_k is the low-rank transformation matrix, E_k is the noise information, and A_k is a preset dictionary matrix. In practice A_k = X_k is usually chosen, which yields the low-rank representation of the original feature space. The trace-norm minimization in the objective above can be solved with the singular value thresholding (SVT) algorithm; the specific solution procedure is well known to those skilled in the art and is not repeated here.
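The low-rank step can be sketched as follows. This is a minimal illustration, not the patent's own solver: it assumes the objective min ‖Z‖_* + λ‖E‖₁ subject to X = XZ + E (formula (1) with A_k = X_k) and uses an inexact augmented Lagrangian loop in which the trace-norm sub-problem is solved by singular value thresholding. The function names, the column-per-video layout, the default parameters and the choice of X @ Z as the denoised features are all assumptions.

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: proximal operator of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def soft(M, tau):
    """Entrywise soft thresholding: proximal operator of tau * l1 norm."""
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)

def low_rank_representation(X, lam=0.1, n_iter=100, mu=1e-2, rho=1.5, mu_max=1e6):
    """Approximately solve min ||Z||_* + lam * ||E||_1  s.t.  X = X Z + E.

    X has shape (d, n), one column per micro-video (assumed layout).
    An auxiliary variable J = Z turns the nuclear-norm step into a plain SVT.
    """
    d, n = X.shape
    Z = np.zeros((n, n)); J = np.zeros((n, n)); E = np.zeros((d, n))
    Y1 = np.zeros((d, n)); Y2 = np.zeros((n, n))  # Lagrange multipliers
    XtX = X.T @ X
    lhs = np.eye(n) + XtX
    for _ in range(n_iter):
        J = svt(Z + Y2 / mu, 1.0 / mu)                      # nuclear-norm step
        Z = np.linalg.solve(lhs, XtX - X.T @ E + J + (X.T @ Y1 - Y2) / mu)
        E = soft(X - X @ Z + Y1 / mu, lam / mu)             # sparse-noise step
        Y1 += mu * (X - X @ Z - E)                          # enforce X = X Z + E
        Y2 += mu * (Z - J)                                  # enforce Z = J
        mu = min(rho * mu, mu_max)
    return Z, E

# Synthetic view: rank-3 features plus sparse corruption (hypothetical data).
rng = np.random.default_rng(0)
clean = rng.standard_normal((20, 3)) @ rng.standard_normal((3, 40))
mask = rng.random((20, 40)) < 0.05
X = clean + mask * 5.0 * rng.standard_normal((20, 40))
Z, E = low_rank_representation(X)
X_low = X @ Z  # denoised low-rank features of this view
```

The same call would be repeated once per view, giving 4 low-rank feature matrices for the fusion step.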

203: Process the 4 kinds of low-rank feature information with multi-view canonical correlation analysis to fuse the features;

Step 202 yields the low-rank results of the feature data of the 4 views. Multi-view canonical correlation analysis is then needed to fuse the features, obtaining a common subspace that jointly takes the information of every view into account. Multi-view canonical correlation analysis is formulated as follows:

where W₁,...,W_K are the feature transformation matrices in multi-view canonical correlation analysis, S_ij is the covariance matrix between the micro-video features of different views, S_ii is the auto-covariance matrix, K is the number of view features, D_i is the feature dimension of the i-th modality, T denotes transposition, i and j index the feature views of the micro-videos, I is the identity matrix, and K is a positive integer greater than 1. W₁,...,W_K can be solved with the standard Trace Ratio^[3] method; the specific solution steps are well known to those skilled in the art and are not repeated here.

The purpose of multi-view canonical correlation analysis is to compute a common subspace such that the cosine similarity between the features of the views projected onto it is maximized, i.e., the projections are as close as possible. The common subspace can then represent the fused feature space, on which micro-video popularity is predicted.
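The fusion step can be illustrated with one common formulation of multi-view CCA, which maximizes $\sum_{i \ne j} \mathrm{tr}(W_i^T S_{ij} W_j)$ under the joint normalization $\sum_i W_i^T S_{ii} W_i = I$ and reduces to a generalized eigenvalue problem. This is an assumed stand-in: it is not the Trace Ratio solver the patent names, and the function name `multiview_cca`, the ridge term `reg` and the averaging of the projected views are choices made for this sketch.

```python
import numpy as np

def multiview_cca(views, dim, reg=1e-3):
    """Project K views onto a shared subspace of size `dim`.

    views: list of arrays; view i has shape (n, D_i), one row per micro-video.
    Solves A w = rho * B w, where A holds the cross-view covariances S_ij
    (i != j) and B the regularized within-view covariances S_ii.
    """
    centered = [V - V.mean(axis=0) for V in views]
    n = centered[0].shape[0]
    stacked = np.hstack(centered)
    C = stacked.T @ stacked / (n - 1)            # every S_ij block at once
    A = C.copy()
    B = np.zeros_like(C)
    bounds = np.cumsum([0] + [V.shape[1] for V in centered])
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        B[lo:hi, lo:hi] = C[lo:hi, lo:hi] + reg * np.eye(hi - lo)
        A[lo:hi, lo:hi] = 0.0                    # keep only cross-view terms
    # Whiten with the Cholesky factor of B to get an ordinary eigenproblem.
    Lc = np.linalg.cholesky(B)
    M = np.linalg.solve(Lc, np.linalg.solve(Lc, A).T).T
    M = (M + M.T) / 2.0
    _, vecs = np.linalg.eigh(M)                  # eigenvalues in ascending order
    top = vecs[:, ::-1][:, :dim]                 # leading `dim` eigenvectors
    W = np.linalg.solve(Lc.T, top)
    Ws = [W[lo:hi] for lo, hi in zip(bounds[:-1], bounds[1:])]
    fused = sum(V @ Wi for V, Wi in zip(centered, Ws)) / len(views)
    return Ws, fused

# Three hypothetical views generated from a shared 2-D latent signal.
rng = np.random.default_rng(0)
latent = rng.standard_normal((200, 2))
views = [latent @ rng.standard_normal((2, D)) + 0.05 * rng.standard_normal((200, D))
         for D in (6, 5, 4)]
Ws, fused = multiview_cca(views, dim=2)
```

Here `fused` plays the role of the common subspace representation, with one row per micro-video.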

204: Use the fused feature information to build a Laplacian matrix representing the graph relationships among the micro-videos;

There is a prior: similar micro-videos should have similar popularity scores. On the basis of this prior, the graph relationships among the micro-videos need to be built. A common way to characterize the graph relationships among videos is to build a Laplacian matrix, which is computed as follows:

where the similarity between micro-videos is measured by the radial basis function (RBF) distance between them in the low-rank feature subspace, and σ₀ is the median of the Euclidean distances in the low-rank feature subspace. The two feature vectors in the formula are the features of the i-th and j-th micro-videos in the low-rank feature subspace.

On this basis, the normalized Laplacian matrix can be computed as follows:

where L is the normalized Laplacian matrix of the low-rank feature subspace, and D is a diagonal matrix whose diagonal entries are the column sums of the similarity matrix. The prior that micro-videos with similar features are likely to have similar popularity can then be written mathematically as follows:

where f denotes the predicted micro-video popularity scores.
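A sketch of this step, under stated assumptions: the similarity is taken to be the Gaussian kernel $S_{ij} = \exp(-\|x_i - x_j\|^2 / (2\sigma_0^2))$, the normalization is the symmetric one, $L = I - D^{-1/2} S D^{-1/2}$, and the smoothness prior is measured by $f^T L f$. The symbols S and x_i, the factor 2 in the kernel and the function name are introduced here for illustration and are not taken from the patent text.

```python
import numpy as np

def normalized_laplacian(F):
    """Build an RBF similarity graph and its normalized Laplacian.

    F: fused features, shape (n, d), one row per micro-video.
    Returns (L, S) with L = I - D^{-1/2} S D^{-1/2}.
    """
    sq = np.sum(F ** 2, axis=1)
    dist2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * F @ F.T, 0.0)
    dist = np.sqrt(dist2)
    upper = np.triu_indices_from(dist, k=1)
    sigma0 = np.median(dist[upper])              # median pairwise distance
    S = np.exp(-dist2 / (2.0 * sigma0 ** 2))     # assumed Gaussian kernel form
    deg = S.sum(axis=0)                          # column sums: diagonal of D
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    L = np.eye(len(F)) - d_inv_sqrt[:, None] * S * d_inv_sqrt[None, :]
    return L, S

# Hypothetical fused features for 30 micro-videos.
rng = np.random.default_rng(0)
F = rng.standard_normal((30, 4))
L, S = normalized_laplacian(F)
f = rng.standard_normal(30)
smoothness = f @ L @ f  # small when similar videos get similar scores
```

Because L is positive semidefinite, `smoothness` is never negative, which is what lets it act as a penalty in the regression step.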

205: Based on the Laplacian matrix, predict the popularity of the micro-videos with a semi-supervised method.

On the basis of the Laplacian matrix, the popularity scores are predicted by semi-supervised regression, as follows:

where α is a balance coefficient, f is the predicted popularity, y is the true popularity score, and M is a diagonal matrix whose entries are 1 for labeled micro-videos and 0 for unlabeled ones. In this popularity prediction, only the predicted popularity of the training set needs to stay close to the true values; the popularity scores of the test set are determined by the graph relationships. The objective function can be solved by the standard approach of setting its derivative to zero; the specific solution procedure is well known to those skilled in the art and is not repeated here.
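One objective consistent with the symbols defined above is $\min_f (f - y)^T M (f - y) + \alpha f^T L f$; setting its gradient to zero gives the linear system $(M + \alpha L) f = M y$. The sketch below assumes this form, which is reconstructed from the symbol definitions rather than quoted from the patent, and uses a hypothetical toy graph of two clusters with one labeled micro-video each.

```python
import numpy as np

def predict_popularity(L, y, labeled, alpha=1.0):
    """Closed-form semi-supervised regression on a graph.

    Minimizes (f - y)^T M (f - y) + alpha * f^T L f, where M is diagonal with
    1 for labeled micro-videos and 0 otherwise; the minimizer satisfies
    (M + alpha * L) f = M y.
    """
    M = np.diag(labeled.astype(float))
    return np.linalg.solve(M + alpha * L, M @ np.where(labeled, y, 0.0))

# Toy 1-D features: two tight clusters of three micro-videos each.
x = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])
S = np.exp(-(x[:, None] - x[None, :]) ** 2 / 2.0)
d_inv_sqrt = 1.0 / np.sqrt(S.sum(axis=0))
L = np.eye(6) - d_inv_sqrt[:, None] * S * d_inv_sqrt[None, :]

y = np.array([1.0, 0.0, 0.0, 5.0, 0.0, 0.0])   # only entries 0 and 3 are known
labeled = np.array([True, False, False, True, False, False])
f = predict_popularity(L, y, labeled)
```

In this toy case the unlabeled micro-videos inherit scores close to that of the labeled micro-video in their own cluster, which is the behavior the graph prior is meant to produce.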

In summary, through steps 201 to 205 the embodiment of the invention avoids the limitation that single-view features impose on popularity prediction, and processes the features of each view under a low-rank constraint, so that the Laplacian matrix built from the features is more stable.

Embodiment 3

The feasibility of the schemes of Embodiments 1 and 2 is verified below with a specific example:

1. Test data set

The test data set used in this experiment is a set of micro-videos downloaded from the Vine social networking site; every micro-video is 6 s long.

2. Evaluation criteria

The micro-video popularity prediction performance of the method is measured with the normalized mean squared error (nMSE) and the p-value: the nMSE characterizes prediction accuracy, and the p-value characterizes the reliability of the prediction.
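The patent does not spell out how the nMSE is normalized. A minimal sketch, assuming the common convention of dividing the squared prediction error by the squared deviation of the true scores from their mean, is:

```python
def nmse(predicted, actual):
    """Normalized mean squared error (assumed normalization).

    Squared prediction error divided by the squared deviation of the true
    scores from their mean; 0 is a perfect prediction and 1 matches a
    predictor that always outputs the mean of the true scores.
    """
    mean_actual = sum(actual) / len(actual)
    error = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    spread = sum((a - mean_actual) ** 2 for a in actual)
    return error / spread

perfect = nmse([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
baseline = nmse([2.0, 2.0, 2.0], [1.0, 2.0, 3.0])
```

Under this convention a lower value means a more accurate prediction, which matches how the nMSE is read in the experimental results.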

3. Compared algorithms

In the experiment the method is compared with several methods, including TMALL^[1], MLR^[2], Lasso^[3], SVR^[4], RegMVMT^[5], MLHR^[6], MSNL^[7] and MvDA^[8], 8 micro-video popularity prediction methods in common use recently.

4. Experimental results

Fig. 2 compares the nMSE and p-value metrics of this method with those of the other 8 micro-video popularity prediction algorithms. The comparison shows that TLRMVR (the method proposed by the invention) predicts more accurately on the existing data set than the other compared methods (its nMSE is the smallest) and is more stable (the mean-square deviation of its nMSE is smaller). The p-values, computed for the other methods with this experiment as the control, are small, which demonstrates the reliability of the method. The experiment verifies the feasibility and superiority of the method.

References:

[1] Chen J, Song X, Nie L, et al. Micro tells macro: predicting the popularity of micro-videos via a transductive model[C]//Proceedings of the 2016 ACM on Multimedia Conference. ACM, 2016: 898-907.

[2] T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean. Distributed representations of words and phrases and their compositionality. In Proceedings of the Annual Conference on Neural Information Processing Systems, pages 3111–3119. NIPS Foundation, 2013.

[3] Yangqing Jia, Feiping Nie, Changshui Zhang. Trace Ratio Problem Revisited. IEEE Transactions on Neural Networks (TNN), Volume 20, Issue 4, Pages 729-735, 2009.

[4] A. J. Smola and B. Schölkopf, “A tutorial on support vector regression,” Statistics and Computing, vol. 14, no. 3, pp. 199–222, 2004.

[5] J. Zhang and J. Huan, “Inductive multi-task learning with multiple view data,” in Proceedings of ACM International Conference on Knowledge Discovery and Data Mining. ACM, 2012, pp. 543–551.

[6] Y. Yang, J. Song, Z. Huang, and Z. Ma, “Multi-feature fusion via hierarchical regression for multimedia analysis,” IEEE Transactions on Multimedia, vol. 15, no. 3, pp. 572–581, 2013.

[7] X. Song, L. Nie, L. Zhang, M. Akbari, and T.-S. Chua, “Multiple social network learning and its application in volunteerism tendency prediction,” in Proceedings of ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 2015, pp. 213–222.

[8] M. Kan, S. Shan, H. Zhang, S. Lao, and X. Chen, “Multi-view discriminant analysis,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 1, pp. 188–194, 2016.

Unless otherwise specified, the embodiments of the invention place no limitation on the models of the devices used; any device capable of performing the above functions may be used.

Those skilled in the art will appreciate that the accompanying drawings are schematic diagrams of a preferred embodiment, and that the serial numbers of the embodiments above are for description only and do not indicate the relative merits of the embodiments.

The foregoing describes only preferred embodiments of the invention and is not intended to limit the invention. Any modification, equivalent substitution or improvement made within the spirit and principles of the invention shall fall within the scope of protection of the invention.

Claims

1. A micro-video popularity prediction method based on low-rank constraint and multi-view feature fusion, characterized in that the method comprises:

using the fused feature information to build a Laplacian matrix representing the graph relationships among the micro-videos; and, based on the Laplacian matrix, predicting the popularity of the micro-videos with a semi-supervised method.

2. The micro-video popularity prediction method based on low-rank constraint and multi-view feature fusion according to claim 1, characterized in that the method further comprises: extracting the features of 4 view modalities from the given micro-videos.

3. The micro-video popularity prediction method based on low-rank constraint and multi-view feature fusion according to claim 1 or 2, characterized in that the features of the 4 view modalities are specifically: visual features, acoustic features, text features and social attribute features.

4. The micro-video popularity prediction method based on low-rank constraint and multi-view feature fusion according to claim 1, characterized in that the fusing of the 4 kinds of low-rank feature information by multi-view canonical correlation analysis is specifically:

projecting the low-rank features of each view onto a common subspace so that the cosine similarity between the projections is maximized, representing the fused feature space by this common subspace of the low-rank features, and predicting micro-video popularity on this basis.

5. The micro-video popularity prediction method based on low-rank constraint and multi-view feature fusion according to claim 1, characterized in that the using of the fused feature information to build a Laplacian matrix representing the graph relationships among the micro-videos is specifically:

where L is the normalized Laplacian matrix of the low-rank feature subspace, and D is a diagonal matrix whose diagonal entries are the column sums of the similarity matrix.