CN110852195A - Video slice-based video type classification method - Google Patents

Video slice-based video type classification method

Info

Publication number
CN110852195A
Authority
CN
China
Prior art keywords
video
classification
slice
image
classification method
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911015725.7A
Other languages
Chinese (zh)
Inventor
胡能
李云夕
杨金江
胡耀武
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
HANGZHOU QUWEI SCIENCE & TECHNOLOGY Co Ltd
Original Assignee
HANGZHOU QUWEI SCIENCE & TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by HANGZHOU QUWEI SCIENCE & TECHNOLOGY Co Ltd filed Critical HANGZHOU QUWEI SCIENCE & TECHNOLOGY Co Ltd
Priority to CN201911015725.7A
Publication of CN110852195A
Legal status: Pending (current)

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/40: Scenes; Scene-specific elements in video content
    • G06V20/41: Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/40: Scenes; Scene-specific elements in video content
    • G06V20/46: Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a video slice-based video type classification method, which comprises the following steps: 101) a classification setting step, 102) a slice image extraction step, 103) a classification step, and 104) a category judgment step. The invention provides a video slice-based video type classification method whose performance is improved by at least 100 times.

Description

Video slice-based video type classification method
Technical Field
The invention relates to the technical field of video classification, in particular to a video slice-based video type classification method.
Background
Video content analysis is a basic technology of video processing and includes: (1) shot-switch detection; (2) type classification, e.g. determining whether a video is comedy, war, suspense, animation, and the like; (3) specific-object detection, such as detecting human faces, cars, or billboards; (4) semantic analysis; and (5) content proportion analysis, among others. Video content analysis can be applied in many fields, such as video editing software, video recommendation, content review, advertisement delivery, and copyright protection.
A video is composed of many images; a 10-minute video can contain more than 10,000 frames, so it is difficult to analyze video content frame by frame. Assuming one image takes 50 ms to process (it is usually longer), 10,000 images take more than 8 minutes. Current applications are therefore heavily limited: even relatively mature copyright protection (which uses video comparison techniques) is still less accurate than manual review and requires a large number of servers. As a result, many video-processing tasks, such as content review on video websites, are still essentially done by hand.
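As a quick check of this estimate (using the 50 ms per-image figure assumed above):
\[
10{,}000 \text{ images} \times 50\ \mathrm{ms} = 500\ \mathrm{s} \approx 8.3\ \mathrm{minutes}
\]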
Disclosure of Invention
The invention overcomes the defects of the prior art and provides a video slice-based video type classification method.
The technical scheme of the invention is as follows:
a video slice-based video type classification method comprises the following steps:
101) a classification setting step: designing classification labels according to the application scene, wherein the classification labels comprise comedy, science fiction, cartoon, pornography, and violence/terrorism;
102) a slice image extraction step: rapidly extracting the video content by a video slicing method, in which part of the pixels are extracted from each frame of image and combined into an image that follows a certain rule and can represent a short segment of video content, the formula being as follows:
x = I1 ⊕ I2 ⊕ … ⊕ IN
wherein x is the slice image, Ii is the middle row of pixels of the i-th video frame, and ⊕ denotes the splicing (concatenation) operation;
forming new images by the video slicing method and extracting n new images of size 256, wherein n is proportional to the total number of frames of the video, and detecting scene switching in the video through the video slices;
103) a classification step: using a self-developed neural network as the classification method, i.e. a convolutional layer Lc, an activation layer Lr and a pooling layer Lp combined in cascade and followed by an inner-product (fully connected) layer, so that the classification task can be carried out quickly and effectively, the formulas being expressed as follows;
f=Lp(Lr(Lc(x)))
arg min|F-M·g|
wherein f is the single-stage feature of the neural network, g is the joint feature, M is the inner-product matrix, and F is the classification probability;
104) a category judgment step: counting the labels of the n new images and taking the most frequent classification result as the classification label of the whole video.
Furthermore, a break position in the slice image is judged to be a cut (abrupt) shot switch, and a blurred transition position is judged to be a gradual shot switch.
Further, the framework of the video type classification method is deployed on a server; the client uploads a video via an HTTP request, and the server returns the classification result after classifying.
Compared with the prior art, the invention has the advantages that:
Compared with classification methods that use original frame images (sampled or full-frame) or manual work, the performance of the method is improved by at least 100 times. The video slicing method effectively abstracts the characteristics of the video and reduces redundant information, and combined with the strong classification capability of the self-developed network, it can at present completely replace manual classification.
Drawings
FIG. 1 is a diagram of a video slice extracting part of the content of a frame according to the present invention;
FIG. 2 shows examples of the new (slice) images of the present invention;
FIG. 3 is a schematic diagram of the classification process of the present invention;
FIG. 4 is a flow chart of video classification according to the present invention.
Detailed Description
The invention is further described with reference to the following figures and detailed description.
As shown in fig. 1 to 4, a video slice-based video type classification method includes the following steps:
101) A classification setting step: classification labels are designed according to the application scene; the labels include comedy, science fiction, cartoon, pornography, violence/terrorism, and the like.
102) A slice image extraction step: for the continuous images in a video, the middle row of pixels of each frame is taken as one column to form a new image. The new image reflects several characteristics of the video, such as shot switching, the motion pattern of the video content, and the content style. As shown in fig. 2, a break position in the new image corresponds to a cut (abrupt) shot switch, and a blurred transition position corresponds to a gradual shot switch. The motion pattern of the content is also visible: still content indicates a static shot (as in the middle of the first image from left to right in fig. 2), while continuously changing content indicates camera motion (as in the middle of the second image from left to right in fig. 2, where the shot moves upward). The content style is visible as well: in fig. 2, the first two images from left to right come from animation videos (adventure type) and the last two from drama videos (suspense type); the first two are clearly bright in color, while the last two are dull in color.
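As an illustration of how the break and blur positions might be located automatically, the following is a minimal sketch (not taken from the patent; it assumes the slice image is a grayscale NumPy array whose columns are the middle rows of consecutive frames, and the two thresholds are hypothetical):
import numpy as np

def detect_shot_changes(slice_img, cut_thresh=60.0, grad_thresh=25.0):
    # Mean absolute difference between adjacent columns of the slice image.
    cols = slice_img.astype(np.float32)
    diff = np.abs(np.diff(cols, axis=1)).mean(axis=0)
    cuts = np.where(diff > cut_thresh)[0]                                # sharp breaks: cut shot switches
    graduals = np.where((diff > grad_thresh) & (diff <= cut_thresh))[0]  # blurred bands: gradual switches
    return cuts, graduals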
The video content is extracted quickly by the video slicing method: part of the pixels are extracted from each frame of image and spliced into an image that follows a certain rule and can represent a short segment of video content. The formula is as follows, where x is the slice image, Ii is the middle row of pixels of the i-th video frame, and ⊕ is the splicing (concatenation) operation;
x = I1 ⊕ I2 ⊕ … ⊕ IN
the method comprises the steps of obtaining n 256 new images after extraction through a video slicing method, wherein the size of n is in direct proportion to the total frame number of video, and detecting scene switching in a video image through video slice. For example, if there are 25600 frames in a video, 256 slice images are obtained, and the above feature detection is performed on each image.
103) A classification step: a neural network is used as the classification method, i.e. a convolutional layer Lc, an activation layer Lr and a pooling layer Lp are combined in cascade and followed by an inner-product (fully connected) layer, so that the classification task can be carried out quickly and effectively based on the detected features; the formulas are expressed as follows;
f=Lp(Lr(Lc(x)))
g = f1 ⊕ f2 ⊕ … ⊕ fK
argmin|F-M·g|
f is the single-stage characteristic of the neural network, g is the joint characteristic, M is the inner product matrix, and F is the classification probability.
That is, each new image passes through multiple stages of the cascaded combination of a convolutional layer Lc, an activation layer Lr and a pooling layer Lp, and the resulting data then passes through an inner-product (fully connected) layer to obtain the classification result. Compared with a fully convolutional network (such as cascade cnn), this network mainly introduces pooling layers and adopts an ingenious structure in which the final feature contains the statistical values of all convolutional stages, so the front-to-back correlation is well preserved; with the same number of parameters as a fully convolutional network, the computation is smaller and the representation capability is stronger. The convolutional layer, activation layer and pooling layer are all basic components of common deep neural networks.
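The cascade-plus-statistics idea can be sketched as follows in PyTorch. This is an illustration only: the number of stages, channel widths, pooled statistic and five-class output are assumptions, not the patent's self-developed architecture.
import torch
import torch.nn as nn

class SliceClassifier(nn.Module):
    # Each stage is Lc (conv) -> Lr (ReLU) -> Lp (pooling); the joint feature g
    # concatenates a pooled statistic from every stage, and an inner-product
    # (fully connected) layer maps g to the classification probabilities F.
    def __init__(self, num_classes=5, channels=(16, 32, 64)):
        super().__init__()
        self.stages = nn.ModuleList()
        in_ch = 1
        for out_ch in channels:
            self.stages.append(nn.Sequential(
                nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
                nn.ReLU(),
                nn.MaxPool2d(2),
            ))
            in_ch = out_ch
        self.fc = nn.Linear(sum(channels), num_classes)

    def forward(self, x):                      # x: (batch, 1, H, W) slice images
        feats = []
        for stage in self.stages:
            x = stage(x)                       # f = Lp(Lr(Lc(x))) for this stage
            feats.append(x.mean(dim=(2, 3)))   # per-stage statistic (global average)
        g = torch.cat(feats, dim=1)            # joint feature g
        return self.fc(g).softmax(dim=1)       # classification probabilities F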
104) A category judgment step: the labels of the n new images are counted, and the most frequent classification result is taken as the classification label of the whole video.
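This category judgment amounts to a majority vote over the n per-slice labels; a small sketch (the label strings are examples only):
from collections import Counter

def video_label(slice_labels):
    # Return the most frequent per-slice label as the label of the whole video.
    return Counter(slice_labels).most_common(1)[0][0]

# e.g. video_label(["comedy", "comedy", "science fiction"]) returns "comedy"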
Finally, the framework of the video type classification method is deployed on a server; the client uploads a video via an HTTP request, and the server returns the classification result, so that a complete application is realized.
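A minimal server-side sketch of such a deployment is shown below, using Flask purely for illustration; the endpoint path, form field name, temporary file path and classify_video helper are hypothetical, as the patent does not specify the serving framework.
from flask import Flask, request, jsonify

app = Flask(__name__)

def classify_video(path):
    # Hypothetical helper: build slice images, classify each, then majority-vote.
    return "comedy"

@app.route("/classify", methods=["POST"])
def classify():
    f = request.files["video"]                 # video uploaded by the client over HTTP
    f.save("/tmp/upload.mp4")
    return jsonify({"label": classify_video("/tmp/upload.mp4")})

if __name__ == "__main__":
    app.run()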
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make several modifications and refinements without departing from the spirit of the present invention, and these modifications and refinements should also be regarded as falling within the protection scope of the present invention.

Claims (3)

1. A video slice-based video type classification method is characterized by comprising the following steps:
101) a classification setting step: designing classification labels according to the application scene, wherein the classification labels comprise comedy, science fiction, cartoon, pornography, and violence/terrorism;
102) a slice image extraction step: rapidly extracting the video content by a video slicing method, in which part of the pixels are extracted from each frame of image and combined into an image that follows a certain rule and can represent a short segment of video content, the formula being as follows:
x = I1 ⊕ I2 ⊕ … ⊕ IN
wherein x is the slice image, Ii is the middle row of pixels of the i-th video frame, and ⊕ denotes the splicing (concatenation) operation;
forming new images by the video slicing method and extracting n new images of size 256, wherein n is proportional to the total number of frames of the video, and detecting scene switching in the video through the video slices;
103) a classification step: using a self-developed neural network as the classification method, i.e. a convolutional layer Lc, an activation layer Lr and a pooling layer Lp combined in cascade and followed by an inner-product (fully connected) layer, so that the classification task can be carried out quickly and effectively, the formulas being expressed as follows;
f=Lp(Lr(Lc(x)))
g = f1 ⊕ f2 ⊕ … ⊕ fK
arg min|F-M·g|
wherein f is the single-stage feature of the neural network, g is the joint feature, M is the inner-product matrix, and F is the classification probability;
104) a category judgment step: counting the labels of the n new images and taking the most frequent classification result as the classification label of the whole video.
2. The video slice-based video type classification method according to claim 1, wherein a break position in the slice image is determined to be a cut (abrupt) shot switch, and a blurred transition position is determined to be a gradual shot switch.
3. The video slice-based video type classification method according to claim 1, wherein a framework of the video type classification method is deployed on a server; the client uploads a video via an HTTP request, and the server returns the classification result after classifying.
CN201911015725.7A 2019-10-24 2019-10-24 Video slice-based video type classification method Pending CN110852195A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911015725.7A CN110852195A (en) 2019-10-24 2019-10-24 Video slice-based video type classification method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911015725.7A CN110852195A (en) 2019-10-24 2019-10-24 Video slice-based video type classification method

Publications (1)

Publication Number Publication Date
CN110852195A true CN110852195A (en) 2020-02-28

Family

ID=69596990

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911015725.7A Pending CN110852195A (en) 2019-10-24 2019-10-24 Video slice-based video type classification method

Country Status (1)

Country Link
CN (1) CN110852195A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111753790A (en) * 2020-07-01 2020-10-09 武汉楚精灵医疗科技有限公司 Video classification method based on random forest algorithm

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107330392A (en) * 2017-06-26 2017-11-07 司马大大(北京)智能系统有限公司 Video scene annotation equipment and method
CN107679462A (en) * 2017-09-13 2018-02-09 哈尔滨工业大学深圳研究生院 A kind of depth multiple features fusion sorting technique based on small echo
CN108594997A (en) * 2018-04-16 2018-09-28 腾讯科技(深圳)有限公司 Gesture framework construction method, apparatus, equipment and storage medium
CN109948721A (en) * 2019-03-27 2019-06-28 北京邮电大学 A kind of video scene classification method based on video presentation
CN110070067A (en) * 2019-04-29 2019-07-30 北京金山云网络技术有限公司 The training method of video classification methods and its model, device and electronic equipment

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107330392A (en) * 2017-06-26 2017-11-07 司马大大(北京)智能系统有限公司 Video scene annotation equipment and method
CN107679462A (en) * 2017-09-13 2018-02-09 哈尔滨工业大学深圳研究生院 A kind of depth multiple features fusion sorting technique based on small echo
CN108594997A (en) * 2018-04-16 2018-09-28 腾讯科技(深圳)有限公司 Gesture framework construction method, apparatus, equipment and storage medium
CN109948721A (en) * 2019-03-27 2019-06-28 北京邮电大学 A kind of video scene classification method based on video presentation
CN110070067A (en) * 2019-04-29 2019-07-30 北京金山云网络技术有限公司 The training method of video classification methods and its model, device and electronic equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zeeshan Rasheed et al., "On the Use of Computable Features for Film Classification", IEEE Transactions on Circuits and Systems for Video Technology. *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111753790A (en) * 2020-07-01 2020-10-09 武汉楚精灵医疗科技有限公司 Video classification method based on random forest algorithm
CN111753790B (en) * 2020-07-01 2023-12-12 武汉楚精灵医疗科技有限公司 Video classification method based on random forest algorithm

Similar Documents

Publication Publication Date Title
US10779037B2 (en) Method and system for identifying relevant media content
CN102819528B (en) The method and apparatus generating video frequency abstract
Smeaton et al. Video shot boundary detection: Seven years of TRECVid activity
CN110309795B (en) Video detection method, device, electronic equipment and storage medium
Mussel Cirne et al. VISCOM: A robust video summarization approach using color co-occurrence matrices
CN111489372A (en) Video foreground and background separation method based on cascade convolution neural network
CN111598026A (en) Action recognition method, device, equipment and storage medium
CN112150450B (en) Image tampering detection method and device based on dual-channel U-Net model
CN109033476B (en) Intelligent spatio-temporal data event analysis method based on event cue network
CN112580523A (en) Behavior recognition method, behavior recognition device, behavior recognition equipment and storage medium
CN111222450A (en) Model training method, model training device, model live broadcast processing equipment and storage medium
Verde et al. Video codec forensics based on convolutional neural networks
CN112163488A (en) Video false face detection method and electronic device
CN110852195A (en) Video slice-based video type classification method
CN107301245B (en) Power information video search system
CN114996227A (en) Monitoring video compression and restoration method
US11830286B2 (en) Data processing apparatus, data processing method, and non-transitory storage medium
EP2345978B1 (en) Detection of flash illuminated scenes in video clips and related ranking of video clips
Qiu et al. Dual focus attention network for video emotion recognition
CN115410131A (en) Method for intelligently classifying short videos
O’Byrne et al. Impact of video compression on the performance of object detection systems for surveillance applications
CN116189027A (en) Faster R-cnn context mechanism optimization method based on multilayer feature fusion
Roka et al. Deep stacked denoising autoencoder for unsupervised anomaly detection in video surveillance
Andreadou et al. Media REVEALr: A social multimedia monitoring and intelligence system for web multimedia verification
WO2002095662A1 (en) Method for detecting text zones in a video image

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200228