CN105260398A - Quick sorting method for movie types based on poster and plot summary - Google Patents

Quick sorting method for movie types based on poster and plot summary

Info

Publication number
CN105260398A
CN105260398A
Authority
CN
China
Prior art keywords
film
poster
text
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510592018.XA
Other languages
Chinese (zh)
Inventor
胡卫明 (Hu Weiming)
付志康 (Fu Zhikang)
李兵 (Li Bing)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Automation of Chinese Academy of Science
Original Assignee
Institute of Automation of Chinese Academy of Science
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Automation of Chinese Academy of Science
Priority to CN201510592018.XA
Publication of CN105260398A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30: Information retrieval of unstructured textual data
    • G06F 16/35: Clustering; Classification
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/70: Information retrieval of video data
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/24: Classification techniques
    • G06F 18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2411: Classification techniques based on the proximity to a decision surface, e.g. support vector machines
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/40: Extraction of image or video features
    • G06V 10/56: Extraction of image or video features relating to colour

Abstract

The invention discloses a quick classification method for movie genres based on posters and plot summaries. The method comprises: determining the set of genres a movie may belong to, and establishing poster and plot-summary training sets for each genre; extracting features from the posters, and training a support vector machine with the features of each poster and its corresponding label to obtain a poster classification model; extracting features from the plot-summary texts, and training a support vector machine with the features of each text and its corresponding label to obtain a text classification model; using the poster classification model to predict the poster of the movie under test, obtaining a result Y1, then calling the text classification model to predict its plot summary, obtaining a result Y2; and finally performing an OR operation on Y1 and Y2 to obtain the genre of the movie under test. The method can rapidly and accurately predict the genre of a movie when no movie video is available.

Description

A rapid classification method for film genres based on posters and plot summaries
Technical field
The present invention relates to the field of pattern recognition, and in particular to techniques for detecting film genres.
Background art
With the rapid development of the Internet, films have become an indispensable part of people's leisure life. No unified standard for film genres has been established so far; broadly, films are divided into horror, romance, action, comedy, science fiction and so on. Movie websites still attach genre labels to films manually, so a method for fast automatic classification of films is needed.
The detection of film genres is essentially based on the video content itself. Detection from video content comprises: shot boundary detection, shot key-frame detection and audio feature detection. The basic assumption of shot boundary detection is that the content of two adjacent shots differs considerably; the shot boundary can therefore be determined by measuring the degree of difference between consecutive frames. Key-frame features include the colour, contrast, lightness and texture of the key frames, and the video is analysed by extracting these features from its key frames. Audio features mainly comprise temporal features, frequency-domain features and acoustic perception features.
Detection based on video content suffers from the following problems: it requires large amounts of data, runs rather slowly, obviously cannot complete the detection task at all when the video content itself is unavailable, and at the same time its accuracy is not very high.
Summary of the invention
(1) Technical problem to be solved
The object of the present invention is to propose a quick and convenient method for detecting film genres, so that films can be classified rapidly even when no film video is available.
(2) Technical solution
In order to solve the above technical problem, the present invention proposes a rapid classification method for film genres based on posters and plot summaries, comprising the following steps: Step 1: determine the set of genres a film may belong to, and establish a poster training set and a plot-summary training set for each genre;
Step 2: extract the features of the posters; train a support vector machine with the features of each poster and its corresponding label to obtain a poster classification model;
Step 3: extract the features of the plot-summary texts; train a support vector machine with the features of each text and its corresponding label to obtain a text classification model;
Step 4: use the poster classification model to predict the poster of the film under test, obtaining genre Y1; then call the text classification model to predict the plot summary of the film under test, obtaining genre Y2; finally perform an OR operation on Y1 and Y2, that is, compare them with the genre label of the film under test: if either prediction is correct, the correct prediction is taken as the final genre of the film under test; otherwise Y1 is taken as the final genre.
(3) Beneficial effects
The present invention detects the genre of a film by combining its poster and its plot summary, and can detect the genre quickly and with high accuracy when no film video is available.
Brief description of the drawings
Fig. 1 is the flow chart of the rapid classification method for film genres based on posters and plot summaries of the present invention.
Fig. 2 is the flow chart of determining the film genres and obtaining the genre set according to the present invention.
Fig. 3 is the flow chart of obtaining the poster classification model according to the present invention.
Fig. 4 is the flow chart of obtaining the text classification model according to the present invention.
Fig. 5 is the flow chart of obtaining the genre of the film under test according to the present invention.
Embodiment
To make the object, technical solution and advantages of the present invention clearer, the present invention is described in further detail below with reference to specific embodiments and the accompanying drawings.
The hardware and programming language used to carry out the method of the present invention are not restricted; the method can be implemented in any language. The present embodiment uses a computer with a 2.67 GHz central processing unit and 4 GB of memory, with the programs involved in the present invention written in C++.
Fig. 1 shows the flow chart of the rapid classification method for film genres based on posters and plot summaries.
Step 101: survey Chinese and foreign video websites, determine the set of genres a film may belong to, collect as many film posters and plot summaries as possible, and establish the poster training set and the plot-summary training set; the detailed flow is shown in Fig. 2.
First, the common film genres on Chinese and foreign video websites are collected, and the set of common genres is determined as: horror, romance, comedy and action. Then as many posters and plot summaries as possible of films of these four genres are collected, and the poster training set and the plot-summary training set are established respectively.
Step 102: extract the features of the posters; train a support vector machine with the features of each poster and its corresponding label to obtain the poster classification model; the detailed flow is shown in Fig. 3.
1) Extract the features of the posters.
The features extracted from each poster comprise: the colour emotion feature, the colour harmony feature, the edge feature, the texture feature, the colour variation feature and the number of faces in the poster.
The colour emotion feature is computed as follows. Colour emotion is commonly used to describe the emotion of an image. In the computation, the image is first converted from the RGB colour space into the CIELAB and CIELCH colour spaces. The three factors related to colour emotion are heat, weight and activity, computed as:

activity = -2.1 + 0.06[(L* - 50)² + (a* - 3)² + ((b* - 17)/1.4)²]^(1/2)

weight = -1.8 + 0.45 cos(h - 100°) + 0.04(100 - L*)

heat = -0.5 + 0.02 (C*)^1.07 cos(h - 50°)

where (L*, C*, h) and (L*, a*, b*) are the colour components of the CIELCH and CIELAB colour spaces respectively.

The colour emotion feature EI(x, y) adopted by the present invention is defined as:

EI(x, y) = (activity² + weight² + heat²)^(1/2)
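By way of illustration, this feature can be sketched as follows. This is a minimal Python sketch (the embodiment itself is written in C++); it assumes an sRGB input image, and reducing the per-pixel emotion map to its mean is our assumption, not the patent's exact definition.

```python
import numpy as np
from skimage import color, io

def color_emotion_feature(path):
    """Per-pixel colour emotion (heat, weight, activity) averaged over a poster."""
    rgb = io.imread(path)[..., :3] / 255.0
    lab = color.rgb2lab(rgb)                         # L*, a*, b*
    lch = color.lab2lch(lab)                         # L*, C*, h (h in radians)
    L, a, b = lab[..., 0], lab[..., 1], lab[..., 2]
    C, h = lch[..., 1], np.degrees(lch[..., 2])
    activity = -2.1 + 0.06 * np.sqrt((L - 50)**2 + (a - 3)**2 + ((b - 17) / 1.4)**2)
    weight = -1.8 + 0.45 * np.cos(np.deg2rad(h - 100)) + 0.04 * (100 - L)
    heat = -0.5 + 0.02 * C**1.07 * np.cos(np.deg2rad(h - 50))
    ei = np.sqrt(activity**2 + weight**2 + heat**2)  # EI(x, y)
    return float(ei.mean())
```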
The colour harmony feature is computed as follows. The colour harmony of a colour pair is likewise commonly used to describe the emotion of an image. In the computation, the image is first converted from the RGB colour space to the CIELAB colour space. The factors related to colour harmony are the hue effect H_H, the lightness effect H_L and the chromatic effect H_C:

H_L = H_Lsum + H_ΔL

H_Lsum = 0.28 + 0.54 tanh(-3.88 + 0.029 L_sum)

L_sum = L*_1 + L*_2

H_ΔL = 0.14 + 0.15 tanh(-2 + 0.2 ΔL)

ΔL = |L*_1 - L*_2|

H_H = H_SY1 + H_SY2

H_SY = E_C (H_S + E_Y)

E_C = 0.5 + 0.5 tanh(-2 + 0.5 C*_ab)

H_S = 0.08 - 0.14 sin(h_ab + 50°) - 0.07 sin(2h_ab + 90°)

where h_ab and C*_ab denote the hue and chroma in the CIELAB colour space, ΔC*_ab and ΔH*_ab are the chroma and hue differences of the two colours of the pair in the CIELAB colour space, and L*_1 and L*_2 are the lightness values of the colour pair in the CIELAB colour space.

The overall colour harmony feature then combines the hue effect H_H, the lightness effect H_L and the chromatic effect H_C:

CH = H_H + H_C + H_L
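As a concrete example, the lightness effect H_L is fully specified by the formulas above and can be computed directly; the E_Y term of the hue effect and the chromatic effect H_C are not reproduced in this text, so this minimal Python sketch covers the lightness effect only:

```python
import numpy as np

def lightness_harmony(L1, L2):
    """Lightness effect H_L of a colour pair, from the formulas above.
    L1, L2 are the CIELAB L* values of the two colours."""
    H_Lsum = 0.28 + 0.54 * np.tanh(-3.88 + 0.029 * (L1 + L2))  # sum term
    H_dL = 0.14 + 0.15 * np.tanh(-2 + 0.2 * abs(L1 - L2))      # difference term
    return H_Lsum + H_dL

# Example: a light/dark colour pair.
print(lightness_harmony(85.0, 30.0))
```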
The edge feature is computed as follows. Research shows that the HSV colour space describes human colour perception more accurately than the RGB colour space. Therefore the image is first converted from the RGB colour space to the HSV colour space; the V channel is then filtered with a Gaussian filter, the result is masked with the output of an edge detector, and finally the number of pixels exceeding a threshold is counted.
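A minimal OpenCV sketch of this procedure follows; the choice of Canny as the edge detector, the 5x5 Gaussian kernel and the threshold value are assumptions, since the text does not fix them:

```python
import cv2
import numpy as np

def edge_feature(path, threshold=100):
    """Count of strong edge pixels on the V channel of HSV."""
    bgr = cv2.imread(path)
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
    v = hsv[..., 2]
    blurred = cv2.GaussianBlur(v, (5, 5), 0)       # Gaussian filtering of V
    edges = cv2.Canny(blurred, 50, 150)            # edge detector output as a mask
    strong = cv2.bitwise_and(v, v, mask=edges)     # keep V values on edge pixels
    return int(np.count_nonzero(strong > threshold))
```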
The texture feature is computed as follows. Texture features are closely related to image emotion. The spatial texture of a scene follows a Weibull distribution:

w_b(x) = (γ/β)(x/β)^(γ-1) e^(-(x/β)^γ)

where x is the random variable and (β, γ) are the parameters of the Weibull distribution. These parameters give a very complete description of the spatial structure of the image texture: the parameter β represents the contrast of the image, a larger value meaning higher contrast; the parameter γ represents the grain of the image, a larger value meaning a finer grain.
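For illustration, the Weibull parameters (β, γ) can be estimated by fitting the distribution to an image response. The text does not say which response is fitted; gradient magnitudes are a common choice and are assumed in this sketch:

```python
import cv2
import numpy as np
from scipy.stats import weibull_min

def weibull_texture(path):
    """Fit a Weibull distribution to gradient magnitudes;
    beta ~ contrast, gamma ~ grain, as described above."""
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE).astype(np.float64)
    gx = cv2.Sobel(gray, cv2.CV_64F, 1, 0)
    gy = cv2.Sobel(gray, cv2.CV_64F, 0, 1)
    mag = np.sqrt(gx**2 + gy**2).ravel()
    mag = mag[mag > 0]                              # Weibull support is x > 0
    gamma, _, beta = weibull_min.fit(mag, floc=0)   # (shape, loc, scale)
    return beta, gamma
```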
The colour variation feature is computed as follows. Research shows that the Luv colour space is perceptually uniform. The present invention uses the determinant Δf = det(ρ) to represent the colour variation feature. In the computation, the image is first converted from the RGB colour space to the Luv colour space, and the colour transform matrix is then obtained:

ρ = | σ_L²   σ_Lu²  σ_Lv² |
    | σ_Lu²  σ_u²   σ_uv² |
    | σ_Lv²  σ_uv²  σ_v²  |

where σ_i² denotes the variance of channel i of the Luv space and σ_ij² denotes the covariance of channels i and j of the Luv space.
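This feature is straightforward to sketch: convert to Luv, form the 3x3 covariance matrix ρ of the channels over all pixels, and take its determinant. A minimal Python sketch:

```python
import numpy as np
from skimage import color, io

def color_variation(path):
    """Determinant of the 3x3 covariance matrix of the L, u, v channels."""
    rgb = io.imread(path)[..., :3] / 255.0
    luv = color.rgb2luv(rgb).reshape(-1, 3)   # one (L, u, v) row per pixel
    rho = np.cov(luv, rowvar=False)           # colour transform matrix rho
    return float(np.linalg.det(rho))          # delta_f = det(rho)
```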
The number of faces in the poster is computed as follows. Horror posters rarely contain a normal face, romance posters mostly show two faces, and in comedy posters the number of faces is often greater than two. The present invention therefore extracts the number of faces in the poster to capture the differences between film genres. The faces are detected with the face detection model that ships with OpenCV.
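A minimal sketch with OpenCV's bundled frontal-face Haar cascade follows; the text only says that the model OpenCV provides is used, so the specific cascade and detection parameters here are assumptions:

```python
import cv2

def face_count(path):
    """Number of faces detected in a poster with OpenCV's bundled cascade."""
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    return len(faces)
```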
2) Train a support vector machine with the features of each poster and its corresponding label to obtain the poster classification model.
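A minimal sketch of this training step follows; scikit-learn is used here only for brevity (the embodiment itself is written in C++), and the toy data, one-vs-rest scheme and RBF kernel are assumptions, since the text only says a support vector machine is trained on the poster features and labels:

```python
import numpy as np
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

# Toy stand-ins for the real training data of step 101: one row of
# concatenated poster features per poster, plus its genre label.
rng = np.random.default_rng(0)
X_posters = rng.normal(size=(40, 8))          # 40 posters, 8 features each
y_genres = rng.choice(["horror", "romance", "comedy", "action"], size=40)

# One binary SVM per genre (one-vs-rest); the RBF kernel is an assumption.
poster_model = OneVsRestClassifier(SVC(kernel="rbf")).fit(X_posters, y_genres)
print(poster_model.predict(X_posters[:3]))
```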
Step 103: extract the features of the plot-summary texts; train a support vector machine with the features of each text and its corresponding label to obtain the text classification model; the detailed flow is shown in Fig. 4.
1) Pre-process the plot-summary texts.
First, the punctuation marks and stop words in the text are removed. The films involved in the present embodiment are foreign films, so their plot summaries are written in English; the English words therefore also need to be reduced to their base word forms.
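A minimal Python sketch of this preprocessing with NLTK follows (lemmatization standing in for the reduction of word forms; NLTK itself is an assumption, as the text names no library):

```python
import string
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer
from nltk.tokenize import word_tokenize

def preprocess(summary):
    """Strip punctuation and stop words, then lemmatize an English summary.
    Requires the NLTK tokenizer, 'stopwords' and 'wordnet' data packages."""
    lemmatizer = WordNetLemmatizer()
    stop = set(stopwords.words("english"))
    cleaned = summary.lower().translate(str.maketrans("", "", string.punctuation))
    tokens = word_tokenize(cleaned)
    return [lemmatizer.lemmatize(t) for t in tokens if t not in stop]
```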
Then the bag-of-words model is built. Building the bag-of-words model requires feature words, which the present invention selects by information gain. Information gain is computed as follows:

IG(T) = H(C) - H(C|T)

H(C) = -Σ_{i=1}^{n} P(C_i) log₂ P(C_i)

H(C|T) = -P(t) Σ_{i=1}^{n} P(C_i|t) log₂ P(C_i|t) - P(t̄) Σ_{i=1}^{n} P(C_i|t̄) log₂ P(C_i|t̄)

where P(C_i) is the probability that film genre C_i occurs, P(t) is the probability that feature T occurs, and P(C_i|t) is the probability of class C_i given that T occurs. H(C) is the entropy of the system with n film genres, H(C|T) is the conditional entropy of the system once feature T is known, and their difference IG(T) is the entropy reduction brought by feature T.
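A minimal Python sketch of this computation follows, treating feature T as the presence of a word in a document:

```python
import numpy as np

def information_gain(docs_with_term, labels, n_genres):
    """IG(T) = H(C) - H(C|T) over a corpus, following the formulas above.
    `docs_with_term` marks the documents containing term T;
    `labels` holds each document's genre index in [0, n_genres)."""
    labels = np.asarray(labels)
    t = np.asarray(docs_with_term, dtype=bool)

    def entropy(mask):
        if mask.sum() == 0:
            return 0.0
        p = np.bincount(labels[mask], minlength=n_genres) / mask.sum()
        p = p[p > 0]
        return float(-(p * np.log2(p)).sum())

    p_t = t.mean()
    h_c = entropy(np.ones_like(t, dtype=bool))          # H(C)
    h_c_given_t = p_t * entropy(t) + (1 - p_t) * entropy(~t)  # H(C|T)
    return h_c - h_c_given_t
```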
2) Represent the plot summary of each film as a vector in the bag-of-words space. Train a support vector machine with the features of each text and its corresponding label to obtain the text classification model.
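A minimal sketch of this step with scikit-learn follows; the vocabulary and toy summaries shown are purely illustrative, and the linear kernel is an assumption:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import LinearSVC

# feature_words: the words retained by the information gain selection above
# (the four words here are purely illustrative).
feature_words = ["haunted", "romance", "laugh", "explosion"]
vectorizer = CountVectorizer(vocabulary=feature_words)

# Toy stand-ins for the plot-summary training set of step 101.
plot_summaries = ["A haunted house terrifies a young family.",
                  "Two strangers meet and a romance slowly blooms."]
genre_labels = ["horror", "romance"]

X_text = vectorizer.transform(plot_summaries)   # bag-of-words vectors
text_model = LinearSVC().fit(X_text, genre_labels)
```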
Step 104:
Use the poster classification model to predict the poster of the film under test, obtaining genre Y1; then call the text classification model to predict the plot summary of the film under test, obtaining genre Y2. Finally perform an "or" operation on Y1 and Y2, that is, compare them with the genre label of the film under test to obtain the final genre of the film; the detailed flow is shown in Fig. 5.
As long as either of Y1 and Y2 is correct, the correct result is taken as the final prediction; otherwise the result Y1 predicted by the poster model is taken as the final prediction.
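The fusion step, as literally described (note that it consults the genre label of the film under test, so it is an evaluation-time rule), can be sketched as:

```python
def fuse_predictions(y1, y2, true_genre):
    """'OR' fusion of step 104: keep whichever of the poster prediction Y1
    and the text prediction Y2 matches the genre label; otherwise fall
    back to the poster prediction Y1."""
    if y1 == true_genre:
        return y1
    if y2 == true_genre:
        return y2
    return y1
```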
The specific embodiments described above further explain the object, technical solution and beneficial effects of the present invention in detail. It should be understood that the above is merely a specific embodiment of the present invention and does not limit the present invention; any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall be included within the scope of protection of the present invention.

Claims (5)

1. A rapid classification method for film genres based on posters and plot summaries, the method comprising the following steps:
Step 1: determine the set of genres a film may belong to, and establish a poster training set and a plot-summary training set for each genre;
Step 2: extract the features of the posters; train a support vector machine with the features of each poster and its corresponding label to obtain a poster classification model;
Step 3: extract the features of the plot-summary texts; train a support vector machine with the features of each text and its corresponding label to obtain a text classification model;
Step 4: use the poster classification model to predict the poster of the film under test, obtaining result Y1; then call the text classification model to predict the plot summary of the film under test, obtaining result Y2; finally perform an OR operation on Y1 and Y2, that is, compare them with the genre label of the film under test: if either prediction is correct, the correct prediction is taken as the final genre of the film under test; otherwise Y1 is taken as the final genre.
2. The method according to claim 1, characterized in that the film genres on Chinese and foreign video websites are collected to determine the set of genres a film may belong to; the posters and plot summaries corresponding to the films are collected, and the poster training set and the text training set are established respectively.
3. The method according to claim 2, characterized in that the features of the poster comprise: the colour emotion feature, the colour harmony feature, the edge feature, the texture feature, the colour variation feature and the number of faces in the poster.
4. The method according to claim 3, characterized in that, when the plot-summary text is in English, extracting the features of the plot-summary text comprises:
Step 4a: remove the punctuation marks and stop words in the text;
Step 4b: reduce the words to their base word forms;
Step 4c: select the feature words and build the bag-of-words model;
Step 4d: represent the plot summary of each film as a vector space model over the bag of words.
5. The method according to claim 3, characterized in that the determinant Δf = det(ρ) is used to represent the colour variation feature; in the computation of the colour variation feature, the image is first converted from the RGB colour space to the Luv colour space, and the colour transform matrix is then obtained; with the image in the Luv space, the covariance matrix produced by the three channels of the pixels is:

ρ = | σ_L²   σ_Lu²  σ_Lv² |
    | σ_Lu²  σ_u²   σ_uv² |
    | σ_Lv²  σ_uv²  σ_v²  |

where σ_i² denotes the variance of channel i of the Luv space and σ_ij² denotes the covariance of channels i and j of the Luv space.
CN201510592018.XA 2015-09-17 2015-09-17 Quick sorting method for movie types based on poster and plot summary Pending CN105260398A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510592018.XA CN105260398A (en) 2015-09-17 2015-09-17 Quick sorting method for movie types based on poster and plot summary

Publications (1)

Publication Number Publication Date
CN105260398A 2016-01-20

Family

ID=55100090

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510592018.XA Pending CN105260398A (en) 2015-09-17 2015-09-17 Quick sorting method for movie types based on poster and plot summary

Country Status (1)

Country Link
CN (1) CN105260398A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101923561A (en) * 2010-05-24 2010-12-22 中国科学技术信息研究所 Automatic document classifying method
CN101937445A (en) * 2010-05-24 2011-01-05 中国科学技术信息研究所 Automatic file classification system
CN103473340A (en) * 2013-09-23 2013-12-25 江苏刻维科技信息有限公司 Classifying method for internet multimedia contents based on video image
CN104657468A (en) * 2015-02-12 2015-05-27 中国科学院自动化研究所 Fast video classification method based on images and texts

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108153908A (en) * 2018-01-18 2018-06-12 逄泽沐风 Film poster analysis method and system
CN108898165A (en) * 2018-06-12 2018-11-27 浙江大学 A kind of recognition methods of billboard style
CN109151563A (en) * 2018-08-31 2019-01-04 李叶 Audio intensity modifies platform automatically

Similar Documents

Publication Publication Date Title
CN104657468A (en) Fast video classification method based on images and texts
CN110135269B (en) Fire image detection method based on mixed color model and neural network
CN108875602A (en) Monitor the face identification method based on deep learning under environment
CN106097366B (en) A kind of image processing method based on improved Codebook foreground detection
CN101477633B (en) Method for automatically estimating visual significance of image and video
CN104408745A (en) Real-time smog scene detection method based on video image
CN104268590B (en) The blind image quality evaluating method returned based on complementary combination feature and multiphase
CN102236796A (en) Method and system for sorting defective contents of digital video
CN104978565B (en) A kind of pictograph extracting method of universality
CN110570420B (en) No-reference contrast distortion image quality evaluation method
CN104700405B (en) A kind of foreground detection method and system
CN110119688A (en) A kind of Image emotional semantic classification method using visual attention contract network
CN105260398A (en) Quick sorting method for movie types based on poster and plot summary
CN106127234A (en) The non-reference picture quality appraisement method of feature based dictionary
CN108985298A (en) A kind of human body clothing dividing method based on semantic consistency
CN103985130A (en) Image significance analysis method for complex texture images
CN105488475A (en) Method for detecting human face in mobile phone
CN106910195A (en) A kind of web page layout monitoring method and device
CN103489012A (en) Crowd density detecting method and system based on support vector machine
Fu et al. Fast film genres classification combining poster and synopsis
CN104680189A (en) Pornographic image detection method based on improved bag-of-words model
CN100548030C (en) A kind of news main broadcaster's lens detection method based on the space-time strip pattern analysis
Wang et al. Low-light Images In-the-wild: A Novel Visibility Perception-guided Blind Quality Indicator
Cowie et al. An intelligent system for facial emotion recognition
CN104581379A (en) Video preview image selecting method and device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20160120
