CN105955708A - Sports video lens classification method based on deep convolutional neural networks - Google Patents
- Publication number
- CN105955708A (application CN201610302292.3A)
- Authority
- CN
- China
- Prior art keywords
- convolutional neural
- neural networks
- shot
- training
- degree
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
- G06V20/42—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items of sport video content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- General Physics & Mathematics (AREA)
- Biomedical Technology (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Multimedia (AREA)
- Biophysics (AREA)
- Data Mining & Analysis (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Image Analysis (AREA)
- Image Processing (AREA)
Abstract
The invention discloses a sports video shot classification method based on deep convolutional neural networks. The method comprises the following steps: 1) performing shot segmentation on existing football video, where each shot is a continuous image sequence captured by one camera, selecting 3 to 10 key frame images from each shot segment, and attaching a shot-category label to each image to construct a training sample set; 2) constructing a seven-layer deep convolutional neural network comprising five convolutional layers and three fully connected layers; 3) training the deep convolutional neural network of step 2) with the training samples of step 1), where training uses softmax regression as the classification algorithm and the error back-propagation algorithm to adjust the network parameters of the CNN; 4) testing a test sample set with the convolutional neural network model obtained by the training in step 3), and outputting the final shot classification result for each image.
Description
Technical field:
The invention belongs to the fields of video processing and machine learning, and specifically relates to a sports video shot classification method based on deep convolutional neural networks.
Background:
Shot classification is a basic technique in sports video analysis. It is important for detecting particular events in sports video, for sports video retrieval, and for extracting high-level semantics. In football video analysis, for example, both the detection of specific events (red and yellow cards, shots on goal, interruptions of play, etc.) and the detection of specific players rely on shot classification results. A fast and accurate shot classification method therefore greatly helps the performance of subsequent analysis.
In broadcast video of sports matches, shots are generally divided into three classes: long shots, medium shots, and close-up shots. A long shot covers most of the field. A medium shot captures some players and the scene in a local region of the field. A close-up shot captures an athlete's upper body or action details. Medium shots and close-up shots include not only footage of the field but also footage of the spectators off the field.
Current methods for distinguishing these shot classes rely mainly on computing the area ratio of the dominant-color region. Such methods define the color of the playing field as the dominant color (for a football pitch, green) and then judge the class of a shot from the proportion of the frame that the dominant color occupies: a shot with a large dominant-color area ratio is taken to be a long shot, and a shot with a small ratio is taken to be a close-up shot. The dominant-color area ratio feature used by this approach is strongly disturbed by background colors in medium and close-up shots, which limits the final classification accuracy.
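The dominant-color heuristic described above can be sketched as follows. This is only an illustration of the prior-art idea, not code from the patent: the hue band for "field green", the saturation floor, and the two decision thresholds are all assumed values chosen for the example.

```python
import colorsys

def dominant_color_ratio(pixels, hue_range=(0.20, 0.45), min_saturation=0.25):
    """Fraction of RGB pixels whose hue lies in an assumed 'field green' band."""
    if not pixels:
        return 0.0
    hits = 0
    for r, g, b in pixels:
        h, s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        if hue_range[0] <= h <= hue_range[1] and s >= min_saturation:
            hits += 1
    return hits / len(pixels)

def classify_by_ratio(ratio, long_threshold=0.6, close_threshold=0.2):
    """Map the area ratio to a shot class; both thresholds are hypothetical."""
    if ratio >= long_threshold:
        return "long"
    if ratio <= close_threshold:
        return "close-up"
    return "medium"
```

A frame dominated by spectators or advertising boards yields a low ratio regardless of the actual framing, which is the weakness the background section points out.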
Summary of the invention:
To overcome the deficiencies of the prior art, the present invention provides a sports video shot classification method based on deep convolutional neural networks. A deep convolutional neural network learns the image features of each shot class in a database. At test time, the class corresponding to the largest output of the network's softmax layer is taken directly as the shot classification result, so that a given key frame is assigned to its shot class automatically. The invention can improve the accuracy of shot classification and has good feasibility and robustness.
To achieve the above object, the present invention adopts the following technical scheme:
A sports video shot classification method based on deep convolutional neural networks, comprising the following steps:
1) performing shot segmentation on existing football video, where each shot is a continuous image sequence captured by one camera; selecting 3 to 10 key frame images from each shot segment and attaching a shot-category label to each image to construct a training sample set;
2) constructing a seven-layer deep convolutional neural network comprising five convolutional layers and three fully connected layers;
3) training the deep convolutional neural network model of step 2) with the training samples of step 1), where training uses softmax regression as the classification algorithm and the error back-propagation algorithm to adjust the network parameters of the CNN;
4) testing a test sample set with the convolutional neural network model trained in step 3), and outputting the final shot classification result for each image.
In a further improvement of the invention, in step 1) the shot-category labels are divided into 6 kinds: long shot, in-field medium shot, out-of-field medium shot, in-field close-up shot, out-of-field close-up shot, and other shots that belong to none of these 5 kinds.
In a further improvement of the invention, in step 2) each input image is scaled to 256 × 256, and a 224 × 224 square block is then cropped from it at random and fed to the network with three RGB color channels; after the activation outputs of the first, second, and fifth convolutional layers, a max-pooling downsampling operation is applied and the result is passed to the next layer; the deep convolutional neural network finally outputs a 6-dimensional neuron response, corresponding to the 6 shot categories of the image to be classified.
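The random 224 × 224 crop from a 256 × 256 image can be sketched as below. The patent does not say how the crop position is drawn, so uniform sampling of the top-left corner is an assumption, and the function names are hypothetical; the image is represented as a plain list of pixel rows to keep the example self-contained.

```python
import random

def random_crop_box(size=256, crop=224, rng=random):
    """Return (top, left, bottom, right) of a crop placed uniformly at random."""
    top = rng.randint(0, size - crop)    # randint is inclusive on both ends
    left = rng.randint(0, size - crop)
    return top, left, top + crop, left + crop

def crop_image(image, box):
    """image: list of rows, each row a list of (r, g, b) tuples."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]
```

Passing a seeded `random.Random` instance as `rng` makes the crop reproducible, which is convenient when debugging a training pipeline.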
In a further improvement of the invention, in step 3) the parameters of the convolutional neural network are initialized during training with different small random numbers.
Compared with the prior art, the invention has the following beneficial effects:
The sports video shot classification method based on deep convolutional neural networks of the present invention designs a deep convolutional neural network that takes key frame images as its input, implicitly learns the image features of each shot class, and then uses these features to classify shots more effectively.
Brief description of the drawings:
Fig. 1 is a flow chart of the present invention.
Fig. 2 is a structural diagram of the convolutional neural network in an example of the present invention.
Detailed description:
The present invention is described in further detail below with reference to the drawings.
Referring to Fig. 1, the sports video shot classification method based on deep convolutional neural networks of the present invention comprises the following steps:
1) Shot segmentation is performed on existing football video, where each shot is a continuous image sequence captured by one camera. Five key frame images are selected from each shot segment, and each image is labeled to construct a training sample set. The shot-category labels are divided into 6 kinds: long shot, in-field medium shot, out-of-field medium shot, in-field close-up shot, out-of-field close-up shot, and other shots that belong to none of these 5 kinds.
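Step 1) can be sketched as follows. The patent does not state how the five key frames are chosen within a shot, so evenly spaced sampling is an assumption made here for illustration; `ShotLabel`, `select_key_frames`, and `build_samples` are hypothetical names, and shot boundaries are assumed to be already available as frame-index pairs.

```python
from enum import Enum

class ShotLabel(Enum):
    LONG = 0
    MEDIUM_IN_FIELD = 1
    MEDIUM_OUT_OF_FIELD = 2
    CLOSE_UP_IN_FIELD = 3
    CLOSE_UP_OUT_OF_FIELD = 4
    OTHER = 5

def select_key_frames(start, end, count=5):
    """Evenly spaced frame indices within the inclusive range [start, end]."""
    length = end - start + 1
    if length <= 0 or count <= 0:
        return []
    count = min(count, length)           # short shots yield fewer key frames
    step = length / count
    return [start + int(step * (i + 0.5)) for i in range(count)]

def build_samples(shots):
    """shots: iterable of (start, end, ShotLabel). Returns (frame_index, label) pairs."""
    return [(idx, label)
            for start, end, label in shots
            for idx in select_key_frames(start, end)]
```

Every key frame inherits the label of the shot it came from, so a shot contributes up to five labeled training images.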
2) A seven-layer deep convolutional neural network (Convolutional Neural Network, CNN) is constructed, comprising five convolutional layers and three fully connected layers.
Each input image is scaled to 256 × 256, and a 224 × 224 square block is then cropped from it at random and fed to the network with three RGB color channels. After the activation outputs of the first, second, and fifth convolutional layers, a max-pooling downsampling operation is applied and the result is passed to the next layer. The deep convolutional neural network finally outputs a 6-dimensional neuron response, corresponding to the 6 shot categories of the image to be classified. As shown in Fig. 2, an input image passes through the layers as follows:
The first convolutional layer consists of 96 feature maps of size 55 × 55. After max pooling, it outputs 96 feature maps of size 27 × 27.
The second convolutional layer consists of 256 feature maps of size 27 × 27. After max pooling, it outputs 256 feature maps of size 13 × 13.
The third convolutional layer consists of 384 feature maps of size 13 × 13.
The fourth convolutional layer consists of 384 feature maps of size 13 × 13.
The fifth convolutional layer consists of 256 feature maps of size 13 × 13. After max pooling, it outputs 256 feature maps of size 6 × 6.
The sixth and seventh layers are fully connected layers, each outputting a 4096-dimensional feature vector.
The eighth layer is a fully connected layer that outputs a 6-dimensional feature vector, which the softmax layer classifies to output the classification result.
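The feature map sizes listed above can be checked with the usual output-size formula, floor((n + 2p - k) / s) + 1. The patent gives only the map counts and sizes; the kernel sizes, strides, and paddings below are assumptions borrowed from the common AlexNet configuration, chosen because they reproduce the 55, 27, 13, and 6 sizes from a 224 × 224 input. Pooling acts on each map separately, so it leaves the number of maps unchanged.

```python
def out_size(n, kernel, stride=1, pad=0):
    """Spatial output size of a convolution or pooling window."""
    return (n + 2 * pad - kernel) // stride + 1

def feature_map_shapes(input_size=224):
    # (name, number of maps, kernel, stride, padding); hyperparameters are assumed
    layers = [
        ("conv1", 96, 11, 4, 2), ("pool1", 96, 3, 2, 0),
        ("conv2", 256, 5, 1, 2), ("pool2", 256, 3, 2, 0),
        ("conv3", 384, 3, 1, 1),
        ("conv4", 384, 3, 1, 1),
        ("conv5", 256, 3, 1, 1), ("pool5", 256, 3, 2, 0),
    ]
    shapes, n = [], input_size
    for name, maps, kernel, stride, pad in layers:
        n = out_size(n, kernel, stride, pad)
        shapes.append((name, maps, n))
    return shapes

if __name__ == "__main__":
    for name, maps, n in feature_map_shapes():
        print(f"{name}: {maps} maps of {n} x {n}")
```

Under these assumptions the final pooled output has 256 × 6 × 6 = 9216 values, which is what the first fully connected layer would take as input.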
A convolutional layer of the network can be expressed as follows: the $j$-th feature map matrix $x_j^l$ of layer $l$ may be obtained as a weighted combination of several convolved feature maps of the preceding layer:
$$x_j^l = f\Big(\sum_{i \in N_j} x_i^{l-1} * k_{ij}^l + b_j^l\Big), \quad (1)$$
where $f$ is the neuron activation function, $N_j$ denotes the combination of input feature maps, $*$ denotes the convolution operation, $k_{ij}^l$ is the convolution kernel matrix, and $b_j^l$ is the bias matrix.
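A minimal pure-Python sketch of one output feature map of a convolutional layer is given below. Several choices are assumptions made for illustration: ReLU stands in for the unspecified activation function, the bias is a single scalar per output map rather than a full matrix, and the convolution covers only the "valid" region with stride 1.

```python
def conv2d_valid(x, k):
    """True 2-D convolution (kernel flipped) over the 'valid' region, stride 1."""
    kh, kw = len(k), len(k[0])
    oh, ow = len(x) - kh + 1, len(x[0]) - kw + 1
    return [[sum(x[r + i][c + j] * k[kh - 1 - i][kw - 1 - j]
                 for i in range(kh) for j in range(kw))
             for c in range(ow)]
            for r in range(oh)]

def relu(v):
    return max(0.0, v)

def conv_layer_map(inputs, kernels, bias, f=relu):
    """One output map: f(sum_i inputs[i] * kernels[i] + bias)."""
    acc = None
    for x, k in zip(inputs, kernels):
        y = conv2d_valid(x, k)
        if acc is None:
            acc = y
        else:
            acc = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(acc, y)]
    return [[f(v + bias) for v in row] for row in acc]
```

Many deep learning libraries implement this layer as cross-correlation (no kernel flip); since the kernels are learned, the two conventions are equivalent in practice.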
The sampling process can be expressed as:
$$x_j^l = \mathrm{down}\big(x_j^{l-1}\big), \quad (2)$$
where $\mathrm{down}(\cdot)$ denotes the sampling function; a commonly used choice is the maximum sampling function (Max Pooling). The sampling process is similar to the convolution process: a sampling function without weight parameters slides over the input feature map, starting from the upper-left corner and moving right (or down) by a fixed stride, and outputs the sampled value of the pixels in each window position.
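The sliding-window max sampling can be sketched as follows. The default window of 3 and stride of 2 are assumptions (they match the 27 to 13 and 13 to 6 reductions described for this network); the patent itself does not state the window size or stride.

```python
def max_pool(x, window=3, stride=2):
    """Slide a window over x from the top-left corner and keep each window's maximum."""
    oh = (len(x) - window) // stride + 1
    ow = (len(x[0]) - window) // stride + 1
    return [[max(x[r * stride + i][c * stride + j]
                 for i in range(window) for j in range(window))
             for c in range(ow)]
            for r in range(oh)]
```

The function has no trainable parameters, which matches the description of a sampling function without weights.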
Each neuron of a fully connected layer of the convolutional neural network is connected to every neuron of the next layer. The feature vector $x^l$ of fully connected layer $l$ can be expressed as follows:
$$x^l = f\big(w^l x^{l-1} + b^l\big), \quad (3)$$
where $w^l$ is the weight matrix and $b^l$ is the bias vector.
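The fully connected layer is a matrix-vector product followed by an element-wise activation, as in this short sketch. ReLU is used as a stand-in because the patent does not name the activation function; the function name is hypothetical.

```python
def fully_connected(w, x, b, f=lambda v: max(0.0, v)):
    """Compute f(w x + b) for weight matrix w (list of rows), input x, and bias b."""
    return [f(sum(wi * xi for wi, xi in zip(row, x)) + bi)
            for row, bi in zip(w, b)]
```

For example, `fully_connected([[1, 2], [-1, 0]], [3, 4], [0.5, -1])` evaluates the two neurons as 1·3 + 2·4 + 0.5 and -1·3 + 0·4 - 1, with the negative result clipped to zero by ReLU.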
3) The deep convolutional neural network model of step 2) is trained with the training samples of step 1). Training uses softmax regression as the classification algorithm and the error back-propagation algorithm to adjust the network parameters of the CNN.
The parameters of the network are initialized with different small random numbers. Training the CNN model requires continuous iterative optimization: the classification result of each iteration is used to adjust the parameters for the next iteration. An image is fed into the network and passes through two training stages, forward propagation and backward propagation. In forward propagation, a sample is input to the network and the corresponding actual output is computed. In backward propagation, the difference between the actual output and the ideal output is computed, and the network parameters are optimized repeatedly according to this error to train the model.
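The forward and backward stages can be illustrated on the final softmax layer alone. This is a sketch under several assumptions not taken from the patent: cross-entropy is used as the error measure, the update is plain gradient descent with a fixed learning rate, the initial weights are drawn from a zero-mean Gaussian with standard deviation 0.01, and only one linear layer is trained rather than the whole network.

```python
import math
import random

def softmax(z):
    m = max(z)                                   # subtract the max for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def init_weights(n_out, n_in, scale=0.01, seed=0):
    """Small random initial weights (Gaussian, assumed scale)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]

def train_step(w, b, x, target, lr=0.1):
    """One forward/backward pass on a softmax layer; updates w and b in place.

    Returns the cross-entropy loss measured before the update."""
    # Forward propagation: actual output for this sample
    z = [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]
    p = softmax(z)
    loss = -math.log(p[target])
    # Backward propagation: gradient of the loss w.r.t. each logit is p_k - y_k
    for k in range(len(w)):
        grad = p[k] - (1.0 if k == target else 0.0)
        for i in range(len(x)):
            w[k][i] -= lr * grad * x[i]
        b[k] -= lr * grad
    return loss
```

In the full network the same logit gradient is propagated further back through the fully connected and convolutional layers by the chain rule; repeating `train_step` on one sample should drive its loss down from roughly log 6.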
4) The convolutional neural network model trained in step 3) is used to test a test sample set, and the final shot classification result for each image is output.
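At test time the predicted class is the one with the largest softmax output. In the sketch below the order of the six class names is an assumption (the patent lists the categories but does not assign them to output indices), and `classify` is a hypothetical helper that takes the 6-dimensional output of the last fully connected layer.

```python
import math

# Assumed index order for the six shot categories
SHOT_CLASSES = [
    "long shot",
    "in-field medium shot",
    "out-of-field medium shot",
    "in-field close-up shot",
    "out-of-field close-up shot",
    "other",
]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def classify(logits):
    """Return (class name, probability) for the largest softmax output."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return SHOT_CLASSES[best], probs[best]
```

Because softmax is monotonic, taking the argmax of the raw 6-dimensional output gives the same class; the softmax value is useful mainly as a confidence score.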
Claims (4)
1. A sports video shot classification method based on deep convolutional neural networks, characterized by comprising the following steps:
1) performing shot segmentation on existing football video, where each shot is a continuous image sequence captured by one camera; selecting 3 to 10 key frame images from each shot segment and attaching a shot-category label to each image to construct a training sample set;
2) constructing a seven-layer deep convolutional neural network comprising five convolutional layers and three fully connected layers;
3) training the deep convolutional neural network model of step 2) with the training samples of step 1), where training uses softmax regression as the classification algorithm and the error back-propagation algorithm to adjust the network parameters of the CNN;
4) testing a test sample set with the convolutional neural network model trained in step 3), and outputting the final shot classification result for each image.
2. The sports video shot classification method based on deep convolutional neural networks according to claim 1, characterized in that in step 1) the shot-category labels are divided into 6 kinds: long shot, in-field medium shot, out-of-field medium shot, in-field close-up shot, out-of-field close-up shot, and other shots that belong to none of these 5 kinds.
3. The sports video shot classification method based on deep convolutional neural networks according to claim 2, characterized in that in step 2) each input image is scaled to 256 × 256, and a 224 × 224 square block is then cropped from it at random and input with three RGB color channels; after the activation outputs of the first, second, and fifth convolutional layers, a max-pooling downsampling operation is applied and the result is output to the next convolutional layer; the deep convolutional neural network finally outputs a 6-dimensional neuron response, corresponding to the 6 shot categories of the image to be classified.
4. The sports video shot classification method based on deep convolutional neural networks according to claim 1, characterized in that in step 3) the parameters of the convolutional neural network are initialized during training with different small random numbers.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610302292.3A CN105955708A (en) | 2016-05-09 | 2016-05-09 | Sports video lens classification method based on deep convolutional neural networks |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610302292.3A CN105955708A (en) | 2016-05-09 | 2016-05-09 | Sports video lens classification method based on deep convolutional neural networks |
Publications (1)
Publication Number | Publication Date |
---|---|
CN105955708A (en) | 2016-09-21 |
Family
ID=56914080
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610302292.3A Pending CN105955708A (en) | 2016-05-09 | 2016-05-09 | Sports video lens classification method based on deep convolutional neural networks |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105955708A (en) |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106504190A (en) * | 2016-12-29 | 2017-03-15 | 浙江工商大学 | A kind of three-dimensional video-frequency generation method based on 3D convolutional neural networks |
CN106779073A (en) * | 2016-12-27 | 2017-05-31 | 西安石油大学 | Media information sorting technique and device based on deep neural network |
CN106897714A (en) * | 2017-03-23 | 2017-06-27 | 北京大学深圳研究生院 | A kind of video actions detection method based on convolutional neural networks |
CN107241645A (en) * | 2017-06-09 | 2017-10-10 | 成都索贝数码科技股份有限公司 | A kind of method that splendid moment of scoring is automatically extracted by the subtitle recognition to video |
CN108270946A (en) * | 2016-12-30 | 2018-07-10 | 央视国际网络无锡有限公司 | A kind of computer-aided video editing device in feature based vector library |
CN108810620A (en) * | 2018-07-18 | 2018-11-13 | 腾讯科技(深圳)有限公司 | Identify method, computer equipment and the storage medium of the material time point in video |
CN109299687A (en) * | 2018-09-18 | 2019-02-01 | 成都网阔信息技术股份有限公司 | A kind of fuzzy anomalous video recognition methods based on CNN |
CN109325533A (en) * | 2018-09-18 | 2019-02-12 | 成都网阔信息技术股份有限公司 | A kind of artificial intelligence frame progress CNN repetitive exercise method |
CN109858514A (en) * | 2018-12-20 | 2019-06-07 | 北京以萨技术股份有限公司 | A kind of video behavior classification method neural network based |
WO2020077494A1 (en) * | 2018-10-15 | 2020-04-23 | 华为技术有限公司 | Intelligent photographing method and system, and related device |
CN108848389B (en) * | 2018-07-27 | 2021-03-30 | 恒信东方文化股份有限公司 | Panoramic video processing method and playing system |
CN116991298A (en) * | 2023-09-27 | 2023-11-03 | 子亥科技(成都)有限公司 | Virtual lens control method based on antagonistic neural network |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101894125A (en) * | 2010-05-13 | 2010-11-24 | 复旦大学 | Content-based video classification method |
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101894125A (en) * | 2010-05-13 | 2010-11-24 | 复旦大学 | Content-based video classification method |
Non-Patent Citations (2)
Title |
---|
ALEX KRIZHEVSKY等: ""ImageNet Classification with Deep Convolutional Neural Networks"", 《PROCEEDING OF THE NEURAL INFORMATION PROCESSING SYSTEMS 2012》 * |
JAKE BOUVRIE: ""Notes on Convolutional Neural Networks"", 《MASSACHUSETTS: CENTER FOR BIOLOGICAL AND COMPUTATIONAL LEARNING》 * |
Cited By (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106779073A (en) * | 2016-12-27 | 2017-05-31 | 西安石油大学 | Media information sorting technique and device based on deep neural network |
CN106779073B (en) * | 2016-12-27 | 2019-05-31 | 西安石油大学 | Media information classification method and device based on deep neural network |
CN106504190A (en) * | 2016-12-29 | 2017-03-15 | 浙江工商大学 | A kind of three-dimensional video-frequency generation method based on 3D convolutional neural networks |
CN106504190B (en) * | 2016-12-29 | 2019-09-13 | 浙江工商大学 | A kind of three-dimensional video-frequency generation method based on 3D convolutional neural networks |
CN108270946A (en) * | 2016-12-30 | 2018-07-10 | 央视国际网络无锡有限公司 | A kind of computer-aided video editing device in feature based vector library |
CN106897714B (en) * | 2017-03-23 | 2020-01-14 | 北京大学深圳研究生院 | Video motion detection method based on convolutional neural network |
CN106897714A (en) * | 2017-03-23 | 2017-06-27 | 北京大学深圳研究生院 | A kind of video actions detection method based on convolutional neural networks |
WO2018171109A1 (en) * | 2017-03-23 | 2018-09-27 | 北京大学深圳研究生院 | Video action detection method based on convolutional neural network |
CN107241645A (en) * | 2017-06-09 | 2017-10-10 | 成都索贝数码科技股份有限公司 | A kind of method that splendid moment of scoring is automatically extracted by the subtitle recognition to video |
CN107241645B (en) * | 2017-06-09 | 2020-07-24 | 成都索贝数码科技股份有限公司 | Method for automatically extracting goal wonderful moment through caption recognition of video |
CN108810620A (en) * | 2018-07-18 | 2018-11-13 | 腾讯科技(深圳)有限公司 | Identify method, computer equipment and the storage medium of the material time point in video |
CN108810620B (en) * | 2018-07-18 | 2021-08-17 | 腾讯科技(深圳)有限公司 | Method, device, equipment and storage medium for identifying key time points in video |
CN108848389B (en) * | 2018-07-27 | 2021-03-30 | 恒信东方文化股份有限公司 | Panoramic video processing method and playing system |
CN109325533A (en) * | 2018-09-18 | 2019-02-12 | 成都网阔信息技术股份有限公司 | A kind of artificial intelligence frame progress CNN repetitive exercise method |
CN109299687A (en) * | 2018-09-18 | 2019-02-01 | 成都网阔信息技术股份有限公司 | A kind of fuzzy anomalous video recognition methods based on CNN |
WO2020077494A1 (en) * | 2018-10-15 | 2020-04-23 | 华为技术有限公司 | Intelligent photographing method and system, and related device |
US11470246B2 (en) | 2018-10-15 | 2022-10-11 | Huawei Technologies Co., Ltd. | Intelligent photographing method and system, and related apparatus |
CN109858514A (en) * | 2018-12-20 | 2019-06-07 | 北京以萨技术股份有限公司 | A kind of video behavior classification method neural network based |
CN116991298A (en) * | 2023-09-27 | 2023-11-03 | 子亥科技(成都)有限公司 | Virtual lens control method based on antagonistic neural network |
CN116991298B (en) * | 2023-09-27 | 2023-11-28 | 子亥科技(成都)有限公司 | Virtual lens control method based on antagonistic neural network |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN105955708A (en) | Sports video lens classification method based on deep convolutional neural networks | |
CN111310862B (en) | Image enhancement-based deep neural network license plate positioning method in complex environment | |
CN110334765B (en) | Remote sensing image classification method based on attention mechanism multi-scale deep learning | |
CN110443143B (en) | Multi-branch convolutional neural network fused remote sensing image scene classification method | |
CN109670528B (en) | Data expansion method facing pedestrian re-identification task and based on paired sample random occlusion strategy | |
CN105184309B (en) | Classification of Polarimetric SAR Image based on CNN and SVM | |
CN106326937B (en) | Crowd density distribution estimation method based on convolutional neural networks | |
CN107122776A (en) | A kind of road traffic sign detection and recognition methods based on convolutional neural networks | |
CN113065558A (en) | Lightweight small target detection method combined with attention mechanism | |
CN109284669A (en) | Pedestrian detection method based on Mask RCNN | |
CN107016413B (en) | A kind of online stage division of tobacco leaf based on deep learning algorithm | |
CN107424159A (en) | Image, semantic dividing method based on super-pixel edge and full convolutional network | |
CN106203523A (en) | The classification hyperspectral imagery of the semi-supervised algorithm fusion of decision tree is promoted based on gradient | |
CN108960404B (en) | Image-based crowd counting method and device | |
CN106815604A (en) | Method for viewing points detecting based on fusion of multi-layer information | |
CN111178120B (en) | Pest image detection method based on crop identification cascading technology | |
CN111489370B (en) | Remote sensing image segmentation method based on deep learning | |
CN108154102A (en) | A kind of traffic sign recognition method | |
CN104598924A (en) | Target matching detection method | |
CN113160062B (en) | Infrared image target detection method, device, equipment and storage medium | |
CN107784319A (en) | A kind of pathological image sorting technique based on enhancing convolutional neural networks | |
CN109242826B (en) | Mobile equipment end stick-shaped object root counting method and system based on target detection | |
CN109919073B (en) | Pedestrian re-identification method with illumination robustness | |
CN108776777A (en) | The recognition methods of spatial relationship between a kind of remote sensing image object based on Faster RCNN | |
CN114863263B (en) | Snakehead fish detection method for blocking in class based on cross-scale hierarchical feature fusion |
Legal Events
Code | Title | Description |
---|---|---|
C06 | Publication | |
PB01 | Publication | |
C10 | Entry into substantive examination | |
SE01 | Entry into force of request for substantive examination | |
TA01 | Transfer of patent application right | Effective date of registration: 20180112. Address after: 100022 building 3, building 88, building 7-10, Jianguo Road, Beijing, Chaoyang District, 305. Applicant after: Beijing Hippo energy Sports Technology Co., Ltd. Address before: 710075 Shaanxi city of Xi'an province high tech Zone Feng Hui Road No. 18 sigma building room 10201-224-26. Applicant before: Xi'an Brision Information Technology Co., Ltd. |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20160921 |