CN105701460A - Video-based basketball goal detection method and device - Google Patents

Publication number
CN105701460A
Authority
CN
China
Prior art keywords
layer
recursive convolution
goal
output
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610012877.1A
Other languages
Chinese (zh)
Other versions
CN105701460B (en)
Inventor
王跃明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ROOT SPORTS SCIENCE AND TECHNOLOGY (BEIJING) Co.,Ltd.
Original Assignee
Apex Sports Investments (beijing) Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Apex Sports Investments (beijing) Ltd
Priority to CN201610012877.1A
Publication of CN105701460A
Application granted
Publication of CN105701460B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/41 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G06V20/42 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items of sport video content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/44 Event detection

Abstract

The present invention relates to a video-based basketball goal detection method and device. The method comprises: constructing a recurrent convolutional neural network model based on a convolutional neural network and a recurrent neural network; constructing a picture library sample set of basketball videos, wherein the pictures in the picture library have labels, and the labels include time sequence information and goal identification information; training the recurrent convolutional neural network model based on the sample set; extracting images from the basketball video to be detected and processing the extracted images with the trained recurrent convolutional neural network model to obtain output vectors; and determining, according to the output vectors of the recurrent convolutional neural network, whether a goal occurs in the current basketball video. According to the present invention, basketball goals can be detected accurately based on video.

Description

Video-based basketball goal detection method and device
Technical field
The present invention relates to video detection technology, and more specifically to a video-based basketball goal detection method and device.
Background art
With the rapid development of science and technology, videos that once could only be shot with professional cameras can now be recorded with a smartphone carried in one's pocket. As a result, the volume of videos of all kinds uploaded to the Internet has surged in recent years, and with the development of Internet and digital media technology, how to analyze particular types of video has increasingly become a research focus.
A large number of sports videos are available online today, and a large share of them are basketball videos. Some research has addressed basketball video, but most of it concentrates on player detection or court event analysis, for example analyzing the players or the spectators on the court. Very little addresses goal detection in basketball video. A major reason is that a goal in a basketball video is a continuous dynamic process, unlike player detection or audience analysis, which are relatively static.
How to detect goals in basketball video accurately is therefore a technical problem that needs to be solved.
Summary of the invention
In view of this, an object of the present invention is to provide a video-based basketball goal detection method and device that substantially eliminate one or more problems caused by the limitations and shortcomings of the prior art.
To achieve this object, one aspect of the present invention provides a video-based basketball goal detection method comprising the following steps:
constructing a recurrent convolutional neural network model based on a convolutional neural network and a recurrent neural network;
constructing a picture library sample set of basketball videos, wherein the pictures in the picture library have labels, and the labels include time sequence information and goal identification information;
training the recurrent convolutional neural network model based on the sample set;
extracting images from the basketball video to be detected, and processing the extracted images with the trained recurrent convolutional neural network model to obtain an output vector;
determining, according to the output vector of the recurrent convolutional neural network, whether a goal occurs in the current basketball video.
Further, the step of constructing the picture library sample set of basketball videos may include: extracting each frame image from the basketball video, classifying each picture as a goal picture or a non-goal picture, and adding labels to the pictures; and delimiting the basket position in each picture and cropping a color image block of a predetermined size, wherein two color image blocks that are adjacent in time and have the same goal identification information form a single sample, thereby obtaining, from the frame images extracted from the basketball video, a goal sample set and a non-goal sample set in a predetermined ratio.
Further, the recurrent convolutional neural network model includes three parallel recurrent convolutional neural networks. Each parallel recurrent convolutional neural network includes six serial convolutional neural network layers, in order: a first recurrent convolutional layer, a first pooling layer, a second recurrent convolutional layer, a second pooling layer, a fully connected layer and an output layer. The three parallel recurrent convolutional neural networks share the fully connected layer and the output layer. The input of the first and second parallel recurrent convolutional neural networks is the color image of the earlier time step in the sample, and the input of the third parallel recurrent convolutional neural network is the color image of the later time step in the sample. The output of each convolutional neural network layer in one parallel recurrent convolutional neural network serves as part of the input of the corresponding layer in the next parallel recurrent convolutional neural network.
Further, the first recurrent convolutional layer has 10 feature maps, the first pooling layer has 10 feature maps, the second recurrent convolutional layer has 30 feature maps, and the second pooling layer has 30 feature maps. The output of the first recurrent convolutional layer is the input of the first pooling layer, the output of the first pooling layer is the input of the second recurrent convolutional layer, the output of the second recurrent convolutional layer is the input of the second pooling layer, and the outputs of the second pooling layers of the parallel earlier and later time steps are the input of the fully connected layer. The fully connected layer has 30 nodes and the output layer has 2 nodes.
Further, the step of training the recurrent convolutional neural network model based on the sample set may include: inputting the goal samples and non-goal samples of the sample set into the recurrent convolutional neural network, adopting the negative log-likelihood as the loss function, and optimizing the recurrent convolutional neural network with the back-propagation through time algorithm so that the value of the loss function becomes progressively smaller, thereby completing the training of the recurrent convolutional neural network.
Further, the step of training the recurrent convolutional neural network model based on the sample set may comprise the following steps:
A. Initializing the recurrent convolutional neural network, wherein, denoting the fan-out and fan-in of a single neuron as fan_out and fan_in respectively, the initial weights of a single neuron in the convolutional neural network are drawn from a uniform distribution over the interval [interval expression missing in the source text], and the initial weights of a single neuron in the recurrent neural network are drawn from a uniform distribution over the interval [-0.1, 0.1];
B. Inputting the goal samples and non-goal samples of the sample set into the initialized recurrent convolutional neural network to obtain the actual output of the recurrent convolutional neural network;
C. Using the negative log-likelihood loss function, computing the gradients of the network connection weights with the back-propagation algorithm, and using the gradients to optimize the weights of the recurrent convolutional neural network so that the value of the loss function becomes progressively smaller, thereby completing the training of the recurrent convolutional neural network.
The negative log-likelihood loss function is:
$$\mathrm{NLL}(\theta, D) = -\sum_{i=0}^{|D|} \log P\left(Y = y^{(i)} \mid x^{(i)}, \theta\right);$$
where θ denotes the current parameters of the neural network; D denotes the current sample set, which includes the goal sample set and the non-goal sample set; |D| denotes the size of the current sample set; P(Y = y^{(i)} | x^{(i)}, θ) denotes the probability that, under the current parameters θ and given the input x^{(i)}, the output is y^{(i)}; and log denotes the logarithm operation;
The formula used by the output layer is:
$$P(y = j \mid x) = \frac{e^{x^{T}\theta_{j}}}{\sum_{k=1}^{K} e^{x^{T}\theta_{k}}};$$
where x denotes the input vector, x^T denotes the transpose of the input vector, θ denotes the parameters of the output layer, K denotes the number of output classes of the current output layer, and P(y = j | x) denotes the probability that the output y is class j given the current input x.
Further, the step of extracting images from the basketball video to be detected and processing the extracted images with the trained recurrent convolutional neural network model may include:
D. Delimiting a region to be detected in the basketball video, and extracting the region to be detected from the video as per-frame images;
E. Inputting the images into the trained recurrent convolutional neural network as a continuous data stream, and computing layer by layer up to the output layer;
F. Obtaining the goal probability and the non-goal probability at the output layer; if the goal probability is greater than the non-goal probability, determining that a goal occurs in the current video; otherwise, determining that no goal occurs in the current video.
According to another aspect, a video-based basketball goal detection device is also provided. The device includes a training sample acquisition module, a recurrent convolutional neural network construction module and a goal detection module, wherein:
the training sample acquisition module constructs the picture library sample set of basketball videos, wherein the pictures in the picture library have labels, and the labels include time sequence information and goal identification information;
the recurrent convolutional neural network construction module constructs the recurrent convolutional neural network model based on a convolutional neural network and a recurrent neural network, and trains the recurrent convolutional neural network model based on the sample set; and
the goal detection module extracts images from the basketball video to be detected, processes the extracted images with the trained recurrent convolutional neural network model to obtain an output vector, and determines, according to the output vector of the recurrent convolutional neural network, whether a goal occurs in the current basketball video.
According to the embodiments of the present invention, whether a basketball goal has been scored can be detected accurately based on video.
Additional advantages, objects and features of the present invention will be set forth in part in the description that follows, and in part will become apparent to those of ordinary skill in the art upon examination of the following, or may be learned from practice of the present invention. The objects and other advantages of the present invention may be realized and attained by the structure particularly pointed out in the written description, the claims and the accompanying drawings.
Those skilled in the art will appreciate that the objects and advantages that can be achieved with the present invention are not limited to those specifically described above, and that the above and other objects that the present invention can achieve will be more clearly understood from the following detailed description.
Brief description of the drawings
Many aspects of the present invention can be better understood with reference to the following drawings. In the drawings:
Fig. 1 is a schematic diagram of the method for video-based basketball goal detection in an embodiment of the present invention;
Fig. 2 is a schematic diagram of the recurrent convolutional neural network of the present invention;
Fig. 3 is an architecture diagram of the recurrent convolutional neural network of the present invention;
Fig. 4 is a flow chart of training the recurrent convolutional neural network in an embodiment of the present invention;
Fig. 5 is a flow chart of the process of detecting whether a goal occurs in an embodiment of the present invention;
Fig. 6 shows the receiver operating characteristic (ROC) curve of the test results of the present invention; and
Fig. 7 is a block diagram of the video-based basketball goal detection device in an embodiment of the present invention.
Detailed description of the embodiments
The preferred embodiments of the present invention are described in detail below. Examples of these preferred embodiments are illustrated in the accompanying drawings. The embodiments shown in the drawings and described with reference to them are merely exemplary, and the technical spirit of the present invention and its primary operation are not limited to these embodiments.
It should also be noted that, to avoid obscuring the present invention with unnecessary detail, the drawings show only the structures and/or processing steps closely related to the solution of the present invention and omit other details of little relevance.
Machine learning is an important subdiscipline of artificial intelligence. In recent years, with the rapid rise of neural networks as a branch of machine learning, many neural network models have been applied successfully in multiple fields. Convolutional neural networks have been applied successfully to computer vision tasks such as face recognition and pedestrian detection, with impressive results, and recurrent neural networks have been applied successfully to speech recognition, where they have effectively improved recognition accuracy.
The present invention uses convolutional neural networks to perform video-based basketball goal detection.
The video-based basketball goal detection method of the present invention is based on the following design. First, pictures are extracted from basketball videos and marked with target information and time sequence information to build a sample library (the target information can be goal identification information indicating whether a goal occurs). Next, a recurrent convolutional neural network is constructed. The sample library is then used to train the recurrent convolutional neural network. Finally, the trained recurrent convolutional neural network processes the basketball video, and whether a goal occurs in the video at the current time is determined from the output vector of the network. The order in which the sample library and the recurrent convolutional neural network are built is not restricted.
Specifically, as shown in Fig. 1, the video-based basketball goal detection of the present invention comprises the following steps:
Step S130: constructing the picture library sample set of basketball videos, wherein the pictures in the picture library have labels, and the labels can include time sequence information and goal identification information.
Specifically, each frame image can be extracted from a number of basketball videos. For all frame images of a given video, the corresponding time sequence information can be marked, and, according to whether a goal occurs in the current frame, the goal identification information can be used to mark each frame as a goal picture or a non-goal picture. The time sequence information and the goal identification information can be added to the pictures manually; alternatively, the time sequence information can be a time tag generated automatically in the frame image. By delimiting the basket position, a color image block of a predetermined size is cropped (for example a color image block of 32*40 pixels; the present invention is not limited to this). The goal identification of a goal picture can be positive, and that of a non-goal picture can be negative. The goal pictures and non-goal pictures obtained are combined according to the time sequence information: two color image blocks that are adjacent in time and have the same goal identification form a single training sample, which means that the two color image blocks are stored as a previous frame and a next frame. A goal sample set and a non-goal sample set in a predetermined quantity ratio are finally obtained, for example a positive sample set and a negative sample set with a quantity ratio of roughly 1 to 50. Here, the positive sample set is the set of all positive samples; each positive sample contains two color image blocks, a previous frame and a next frame, cropped from goal pictures that are adjacent in time, and the number of positive samples can be approximately 900. The negative sample set is the set of all negative samples; each negative sample contains two color image blocks, a previous frame and a next frame, cropped from non-goal pictures that are adjacent in time, and the number of negative samples can be approximately 44000. The sample sizes of 900 and 44000 are merely examples and are not intended to limit the present invention.
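The pairing rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `build_samples` and the tuple layout `(timestamp, is_goal, patch)` are hypothetical, and frames are assumed to be consecutive and already sorted by time, so that neighbours in the list are adjacent in time.

```python
def build_samples(frames):
    """Pair temporally adjacent patches that share the same goal label.

    frames: list of (timestamp, is_goal, patch) tuples sorted by timestamp
    (hypothetical layout, chosen for illustration only).
    Returns (positives, negatives), each a list of (previous, next) patch pairs.
    """
    positives, negatives = [], []
    for prev, nxt in zip(frames, frames[1:]):
        if prev[1] != nxt[1]:
            continue  # labels differ, so the pair is not used as a sample
        sample = (prev[2], nxt[2])  # stored as previous frame and next frame
        if prev[1]:
            positives.append(sample)
        else:
            negatives.append(sample)
    return positives, negatives


demo_frames = [(0, False, "a"), (1, False, "b"), (2, True, "c"),
               (3, True, "d"), (4, False, "e")]
demo_pos, demo_neg = build_samples(demo_frames)
```

In this toy input, only the pairs ("a", "b") and ("c", "d") become samples; the pairs that straddle a label change are skipped.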
Step S110: constructing the recurrent convolutional neural network model based on a convolutional neural network and a recurrent neural network.
The recurrent convolutional neural network constructed can include six serial convolutional neural network layers: two recurrent convolutional layers, two pooling layers, one fully connected layer and one softmax regression layer. The inputs to the network at the earlier and later time steps are two consecutive three-channel color images (for example RGB color images) of 32*40 pixels. Recurrent convolutional layer RC1 has 10 feature maps, pooling layer M2 has 10 feature maps, recurrent convolutional layer RC3 has 30 feature maps, and pooling layer M4 has 30 feature maps. The input of fully connected layer F5 is the output of the pooling layers M4 of the earlier and later time steps, and F5 has 30 nodes. Output layer L6 has 2 nodes. The original image is the input of the first recurrent convolutional layer RC1, whose output is 10 feature maps. These 10 feature maps are the input of the first pooling layer M2, whose output is also 10 feature maps, which are the input of the second recurrent convolutional layer RC3. The output of RC3 is 30 feature maps, which are the input of the second pooling layer M4. The output of M4 is also 30 feature maps, which are flattened into a vector as the input of fully connected layer F5. F5 outputs a 30-dimensional vector, which is the input of the softmax regression layer L6. The output of L6 is a 2-dimensional vector whose components represent the probabilities that the original image shows a goal or no goal, respectively.
The recurrent convolutional layers extract features by sliding filters over the input and convolving; in essence, it is the filters that extract the features. A filter is a matrix of fixed size that slides over the image input to the convolutional layer from left to right and from top to bottom, a certain number of pixels at a time. Convolution here means multiplying the filter element-wise with the pixel values at the corresponding positions of an image block of the same size and summing the products. The sum is passed through a nonlinear function (for example the tanh function), and the result is the value of one pixel in the convolved image. Convolving the whole input image and applying the tanh function pixel by pixel in this way is called the feature extraction process. By taking the local maximum of the feature values, the pooling layers not only reduce the feature dimension but also maintain robustness to small deformations of the input. The fully connected layer aggregates the features of the multiple feature maps to obtain the global feature of the original image. The output layer uses the output features of the fully connected layer to compute the goal and non-goal probabilities.
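The filter-sliding and pooling operations described above can be illustrated with a minimal pure-Python sketch. It assumes a single input channel, a stride of one pixel, no bias term and non-overlapping pooling windows, none of which the text fixes; the function names are hypothetical.

```python
import math


def conv2d_tanh(image, kernel):
    """Slide the filter left to right, top to bottom (stride 1), multiply it
    element-wise with each same-sized image block, sum, then apply tanh."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            s = sum(kernel[i][j] * image[r + i][c + j]
                    for i in range(kh) for j in range(kw))
            row.append(math.tanh(s))
        out.append(row)
    return out


def max_pool(fmap, size=2):
    """Keep the local maximum of each non-overlapping size x size window."""
    h, w = len(fmap) // size, len(fmap[0]) // size
    return [[max(fmap[r * size + i][c * size + j]
                 for i in range(size) for j in range(size))
             for c in range(w)]
            for r in range(h)]


image = [[1.0] * 6 for _ in range(6)]     # toy 6x6 single-channel input
kernel = [[0.1] * 3 for _ in range(3)]    # toy 3x3 filter
features = conv2d_tanh(image, kernel)     # 4x4 feature map
pooled = max_pool(features)               # 2x2 after 2x2 max pooling
```

A multi-channel layer would sum such responses over all input feature maps for each filter; that extension is omitted here to keep the sketch short.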
In the present invention, the six-layer convolutional neural network is chosen as a trade-off between the goal detection accuracy to be achieved and the computational complexity. The present invention is not limited to this, and more or fewer layers may also be selected.
Fig. 2 shows the details of a single convolutional neural network branch of the present invention. The annotation (10, 3, 5, 9) below the first recurrent convolutional layer RC1 indicates that the number of input feature maps is 3, the number of output feature maps is 10, and each of the 10 filters needed to produce the output feature maps (in general, producing n output feature maps requires n filters) has a size of 5*9, where the size of a filter is the size of the matrix block used in the convolution. The annotation (2, 2) below the first pooling layer M2 indicates that the filter size of pooling layer M2 is 2*2. The annotation (30, 10, 5, 5) below the second recurrent convolutional layer RC3 indicates that the number of input feature maps is 10, the number of output feature maps is 30, and the filter size is 5*5. The annotation (2, 2) below the second pooling layer M4 indicates that the filter size of this pooling layer is 2*2. The annotation (2700, 1) below fully connected layer F5 indicates that F5 has a size of 2700*1. The annotation (2, 1) below the softmax layer indicates that this layer has a size of 2*1.
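The layer sizes quoted for Fig. 2 can be checked with a short worked calculation. The sketch below rests on assumptions the text does not state explicitly: convolutions are unpadded with stride 1, pooling windows do not overlap, the 5-pixel side of the first filter runs along the 32-pixel side of the image, and the input of F5 concatenates the M4 outputs of the three parallel branches of Fig. 3.

```python
def conv_out(size, k):
    """Output length of an unpadded, stride-1 convolution."""
    return size - k + 1


def pool_out(size, p):
    """Output length of non-overlapping pooling."""
    return size // p


w, h = 32, 40                            # input patch, 32*40 pixels
w, h = conv_out(w, 5), conv_out(h, 9)    # RC1 with 5*9 filters: 28 x 32
w, h = pool_out(w, 2), pool_out(h, 2)    # M2 with 2*2 pooling: 14 x 16
w, h = conv_out(w, 5), conv_out(h, 5)    # RC3 with 5*5 filters: 10 x 12
w, h = pool_out(w, 2), pool_out(h, 2)    # M4 with 2*2 pooling: 5 x 6

per_branch = 30 * w * h                  # 30 feature maps per branch: 900
f5_input = 3 * per_branch                # three parallel branches: 2700
```

Under these assumptions the flattened input of F5 has 2700 elements, which agrees with the (2700, 1) annotation.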
Fig. 3 shows the architecture of the recurrent convolutional neural network of the present invention. It comprises three parallel convolutional neural networks, each with the details shown in Fig. 2. Recurrent neural network (RNN) connections are added between the three networks to link the time steps: the output of a recurrent convolutional layer at one time step serves as part of the input of the recurrent convolutional layer at the next time step, which realizes the temporal connection and yields the final architecture shown in Fig. 3. In the figure, the sample input at times T and T+1 is the earlier frame of the sample, and the sample input at time T+2 is the later frame. In addition, the connections between convolutional layers are full connections in block form, and after the second pooling layer a fully connected layer links all the pooling layers together.
That is to say, in the present invention the recurrent convolutional neural network model uses a convolutional neural network and a recurrent neural network to introduce inter-frame connections between images on the basis of the sample data, so as to capture the temporal relationships in the video.
Step S150: training the recurrent convolutional neural network model based on the sample set.
For example, the positive and negative samples of the basketball video sample set are input to the RC1 layer of the recurrent convolutional neural network, the negative log-likelihood (NLL) is adopted as the loss function, and the back-propagation through time (BPTT) algorithm is used to optimize the parameters of the recurrent convolutional neural network so that the value of the loss function becomes progressively smaller, thereby completing the training of the recurrent convolutional neural network. This step may specifically comprise the following steps:
Step S1501: the recurrent convolutional neural network is initialized. Denoting the fan-out and fan-in of a single neuron as fan_out and fan_in respectively, the initial weights of a single neuron in the convolutional neural network are drawn from a uniform distribution over the interval [interval expression missing in the source text], and the initial weights of a single neuron in the recurrent neural network are drawn from a uniform distribution over the interval [-0.1, 0.1]. The convolutional weights are initialized in this way so that the initial weights are as small as possible without being concentrated at or near a single point, which speeds up back-propagation; the weight initialization of the recurrent neural network is based on the same rationale.
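Uniform initialization of this kind can be sketched as follows. The helper name `uniform_init` and its `limit` parameter are hypothetical; only the recurrent bound of 0.1 comes from the text, and the bound for the convolutional weights is left to the caller because its expression is not reproduced here.

```python
import random


def uniform_init(n_weights, limit, rng=random):
    """Draw n_weights values uniformly from [-limit, limit]."""
    return [rng.uniform(-limit, limit) for _ in range(n_weights)]


# Recurrent connections: uniform over [-0.1, 0.1], as stated in the text.
# A seeded generator is used here only to make the sketch reproducible.
recurrent_weights = uniform_init(100, 0.1, random.Random(0))
```

Small weights spread over an interval, rather than concentrated at a single value, keep tanh units away from saturation while still breaking symmetry between neurons.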
Step S1502: the training samples are input into the initialized recurrent convolutional neural network for training, and the actual output of the recurrent convolutional neural network is obtained. Each training sample includes an input vector and its corresponding ideal output vector; the input vector passes through the computation of the recurrent convolutional neural network up to the softmax layer, which yields the actual output vector. Network training uses stochastic gradient descent: each time, a subset of samples is selected from the training set as the input of recurrent convolutional layer RC1; the output of RC1 is the input of pooling layer M2; the output of M2 is the input of recurrent convolutional layer RC3; the output of RC3 is the input of pooling layer M4; the outputs of pooling layer M4 at the three time steps are merged into one vector as the input of fully connected layer F5; the output of F5 is the input of the softmax output layer; and the output of the softmax layer is used to compute the value of the loss function. Based on the classical gradient descent algorithm and back-propagation, the loss function is used to compute the gradients of the network connection weights layer by layer in reverse, and the gradients are used to update the weights so that the value of the loss function becomes progressively smaller, which achieves the purpose of training the network.
Step S1503: the negative log-likelihood in formula 1 is adopted as the loss function, and the recurrent convolutional neural network is trained with the back-propagation through time algorithm.
For example, the negative log-likelihood in formula 1 is adopted as the loss function:
$$\mathrm{NLL}(\theta, D) = -\sum_{i=0}^{|D|} \log P\left(Y = y^{(i)} \mid x^{(i)}, \theta\right) \tag{1}$$
where θ denotes the current parameters of the neural network; D denotes the current sample set, which includes the positive sample set and the negative sample set; |D| denotes the size of the current sample set; P(Y = y^{(i)} | x^{(i)}, θ) denotes the probability that, under the current parameters θ and given the input x^{(i)}, the output for a sample is y^{(i)}; and log denotes the logarithm operation. The negative log-likelihood (NLL) loss function is the optimization objective: training is the process of continually reducing the value of this loss function, and the recurrent convolutional neural network is optimized with the back-propagation through time algorithm.
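Formula 1 can be evaluated directly once the network's output probabilities are available. The following Python sketch assumes each sample's softmax output is given as a list of class probabilities and each label as a class index; the function name is hypothetical.

```python
import math


def negative_log_likelihood(probs, labels):
    """Sum of -log P(Y = y_i | x_i, theta) over the sample set.

    probs:  one list of class probabilities per sample
    labels: the true class index of each sample
    """
    return -sum(math.log(p[y]) for p, y in zip(probs, labels))


# Two toy samples with classes (no goal = 0, goal = 1).
toy_probs = [[0.9, 0.1], [0.2, 0.8]]
toy_labels = [0, 1]
toy_loss = negative_log_likelihood(toy_probs, toy_labels)
```

The loss approaches zero as the probability assigned to each true class approaches one, which is why reducing it drives the network toward correct classifications.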
The softmax layer can be expressed as a function, whose definition is given by formula 2:
$$P(y = j \mid x) = \frac{e^{x^{T}\theta_{j}}}{\sum_{k=1}^{K} e^{x^{T}\theta_{k}}} \tag{2}$$
where x denotes an input vector, x^T denotes the transpose of the input vector, θ denotes the parameters of the softmax layer, K denotes the number of output classes of the current softmax layer, and P(y = j | x) denotes the probability that the output y is class j given the current input x. The class with the highest softmax output probability (goal or non-goal) is taken as the output result, so the softmax layer acts as a classifier. The output of the softmax layer is used to compute the value of the loss function, and the loss function is used with the back-propagation algorithm to compute the gradients of the network connection weights layer by layer in reverse. The gradients are then used to update the weights so that the value of the loss function becomes progressively smaller, which achieves the purpose of training the network. When back-propagation through time is used, the network is unrolled over the time steps to form the network shown in Fig. 3, and back-propagation is then applied to the network of each time step. Back-propagation uses the loss function to compute a gradient for each network connection weight and changes the weights by gradient descent, thereby reducing the value of the loss function and completing the training of the recurrent convolutional neural network.
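Formula 2 and the classification rule can be sketched in Python as follows. The layout of `theta` as one parameter vector per class is an assumption made for illustration, and subtracting the maximum score before exponentiating is a standard numerical-stability step that is not part of the text; it leaves the resulting probabilities unchanged.

```python
import math


def softmax(scores):
    """Normalize scores into probabilities (max subtracted for stability)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def output_layer(x, theta):
    """P(y = j | x) for every class j, where theta[j] is the parameter
    vector of class j (hypothetical layout)."""
    scores = [sum(xi * tj for xi, tj in zip(x, theta_j)) for theta_j in theta]
    return softmax(scores)


x = [1.0, 2.0]
theta = [[0.0, 0.0], [1.0, 0.0]]          # K = 2 classes
class_probs = output_layer(x, theta)
predicted = max(range(len(class_probs)), key=class_probs.__getitem__)
```

Here the scores are 0 and 1, so the second class receives the larger probability and is selected as the output.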
Steps S170-S190: extract images from the basketball video to be detected, process the extracted images with the trained recursive convolutional neural network model to obtain an output vector, and determine from the output vector of the recursive convolutional neural network whether a goal occurs in the current basketball video.
Steps S170-S190 are the goal detection steps. The basketball video is treated as a continuous stream of image frames; every two consecutive frames form one group and are input to the trained recursive convolutional neural network, which detects whether a goal sequence occurs in the video at a given moment and obtains the goal start time and the goal end time. The detection result is finally output. The specific steps are as follows:
Step S1701: delimit a region to be detected in the basketball video, which generally refers to the position of the basket, and extract this region from every frame of the video to form a stream of image frames.
Step S1702: input the images, in the form of a continuous data stream, to the trained recursive convolutional neural network, and compute layer by layer up to the output layer.
Step S1703: obtain the goal probability and the non-goal probability at the output layer. If the goal probability is greater than the non-goal probability, determine that a goal occurs in the current video; otherwise, determine that no goal occurs in the current video.
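Steps S1701-S1703 can be sketched as follows. The sketch assumes that frames are given as lists of pixel rows, that consecutive frames are paired in a sliding fashion (the text does not say whether the pairs overlap), and that the trained network is available as a callable returning the goal and non-goal probabilities. `crop`, `detect_goals`, and `stub_model` are hypothetical names, and `stub_model` is a placeholder, not the patented network.

```python
def crop(frame, region):
    """Cut the region (top, left, height, width) out of a frame given as a list of rows."""
    top, left, h, w = region
    return [row[left:left + w] for row in frame[top:top + h]]

def detect_goals(frames, region, model):
    """Return one True/False goal decision per pair of consecutive frames."""
    patches = [crop(f, region) for f in frames]        # S1701: extract the basket region
    decisions = []
    for prev, curr in zip(patches, patches[1:]):
        p_goal, p_no_goal = model(prev, curr)          # S1702: forward pass up to the output layer
        decisions.append(p_goal > p_no_goal)           # S1703: compare the two probabilities
    return decisions

def stub_model(prev, curr):
    """Placeholder for the trained network: favors 'goal' when the later patch is brighter."""
    def mean(patch):
        return sum(map(sum, patch)) / (len(patch) * len(patch[0]))
    p_goal = 0.9 if mean(curr) > mean(prev) else 0.1
    return p_goal, 1.0 - p_goal

# Three hypothetical 4x4 grayscale frames; only the last one differs.
frames = [[[0] * 4 for _ in range(4)],
          [[0] * 4 for _ in range(4)],
          [[5] * 4 for _ in range(4)]]
decisions = detect_goals(frames, (1, 1, 2, 2), stub_model)
```

Keeping the model behind a plain callable separates the region extraction and the decision rule from the network itself, so the same loop works with any trained classifier that returns the two probabilities.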
With the video-based basketball goal detection method of the present invention, whether a basketball goal is scored can be detected accurately.
To evaluate the detection accuracy of the present invention, a test set comprising 18314 negative samples and 371 positive samples was generated, and the trained recursive convolutional neural network was tested on it. Of the 18685 test samples in total, 18646 were classified correctly, giving an accuracy rate of 99.79128%; 39 samples were misclassified, giving an error rate of 0.20872%, of which 18 were negative samples and 21 were positive samples. The overall recall rate is 94.33962% and the false alarm rate is 0.09829%.
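The reported figures can be reproduced from the raw counts. The sketch below assumes that the 18 misclassified negative samples are false alarms, that the 21 misclassified positive samples are missed goals, and that recall and false alarm rate are defined relative to the numbers of positive and negative samples respectively; the variable names are illustrative.

```python
negatives, positives = 18314, 371          # test set composition
false_positives, false_negatives = 18, 21  # misclassified negatives / positives

total = negatives + positives                                   # 18685 samples
errors = false_positives + false_negatives                      # 39 misclassified
accuracy = (total - errors) / total                             # correctly classified share
error_rate = errors / total
recall = (positives - false_negatives) / positives              # detected goals / all goals
false_alarm_rate = false_positives / negatives                  # false alarms / all non-goals

print(f"accuracy         = {accuracy:.5%}")
print(f"error rate       = {error_rate:.5%}")
print(f"recall           = {recall:.5%}")
print(f"false alarm rate = {false_alarm_rate:.5%}")
```

Because positive samples make up only about 2% of the test set, the recall and false alarm rate are more informative than the overall accuracy.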
As can be seen, the present invention achieves very high goal detection accuracy. The present invention can effectively detect goals in basketball video and has good feasibility and robustness.
The order of the steps shown in the drawings of the present invention is merely illustrative and not limiting, and may be changed reasonably; for example, the order of step S110 and step S130 may be exchanged, or the two steps may be performed in parallel. Such changes in the order of the steps all fall within the protection scope of the present invention.
The present invention also provides a device for implementing the above method. As shown in Fig. 7, the device includes a training sample acquisition module, a recursive convolutional neural network construction module, and a goal detection module, wherein:
the training sample acquisition module builds a picture library sample set from basketball video, the pictures in the picture library have labels, and the labels include time sequence information and goal identification information;
the recursive convolutional neural network construction module builds a recursive convolutional neural network model based on a convolutional neural network and a recurrent neural network, and trains the recursive convolutional neural network model on the sample set to obtain a classifier output indicating whether a goal occurs in the video; and
the goal detection module extracts images from the basketball video to be detected, processes the extracted images with the trained recursive convolutional neural network model to obtain an output vector, and determines from the output vector of the recursive convolutional neural network whether a goal occurs in the current basketball video.
Each part of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented with software or firmware that is stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, they may be implemented by any one or a combination of the following techniques well known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application-specific integrated circuits having suitable combinational logic gates, programmable gate arrays (PGA), field programmable gate arrays (FPGA), and so on.
The logic and/or steps represented in the flowcharts or otherwise described herein may, for example, be regarded as an ordered list of executable instructions for implementing logic functions, and may be embodied in any computer-readable medium for use by, or in connection with, an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from an instruction execution system, apparatus, or device and execute them).
In the present invention, features described and/or illustrated for one embodiment may be used in the same or a similar manner in one or more other embodiments, and/or may be combined with or substituted for features of other embodiments.
It should be noted that the above embodiments are merely illustrative and do not limit the scope of the claims of the present invention; any equivalent technical variation based on the present invention shall fall within the scope of patent protection of the present invention.

Claims (9)

1. A video-based basketball goal detection method, characterized in that the method comprises the following steps:
building a picture library sample set from basketball video, wherein the pictures in the picture library have labels, and the labels include time sequence information and goal identification information;
building a recursive convolutional neural network model based on a convolutional neural network and a recurrent neural network;
training the recursive convolutional neural network model on the sample set;
extracting images from the basketball video to be detected, and processing the extracted images with the trained recursive convolutional neural network model to obtain an output vector; and
determining, from the output vector of the recursive convolutional neural network, whether a goal occurs in the current basketball video.
2. The method according to claim 1, characterized in that the step of building the picture library sample set from basketball video comprises:
extracting each frame image from the basketball video, classifying it as a goal picture or a non-goal picture, and adding a label to the picture;
delimiting the basket position in the picture and cutting out a color image block of a predefined size, wherein two color image blocks that are adjacent in time and have the same goal identification information form a single sample, thereby obtaining a goal sample set and a non-goal sample set in a predetermined ratio from the frame images extracted from the basketball video.
3. The method according to claim 2, characterized in that the recursive convolutional neural network model includes three parallel layers of recursive convolutional neural network; each parallel layer includes six serial convolutional neural network layers, which are, in order: a first recursive convolution layer, a first pooling layer, a second recursive convolution layer, a second pooling layer, a fully connected layer, and an output layer; the three parallel layers share the fully connected layer and the output layer; the input of the first and second parallel layers is the color image of the earlier time step in the sample, and the input of the third parallel layer is the color image of the later time step in the sample; and the output of each convolutional neural network layer in a preceding parallel layer serves as part of the input of the corresponding convolutional neural network layer in the following parallel layer.
4. The method according to claim 3, characterized in that the first recursive convolution layer has 10 feature maps, the first pooling layer has 10 feature maps, the second recursive convolution layer has 30 feature maps, and the second pooling layer has 30 feature maps; the output of the first recursive convolution layer is the input of the first pooling layer, the output of the first pooling layer is the input of the second recursive convolution layer, and the output of the second recursive convolution layer is the input of the second pooling layer; the outputs of the second pooling layers of the earlier and later parallel time steps are the input of the fully connected layer; the fully connected layer has 30 nodes, and the output layer has 2 nodes.
5. The method according to claim 1, characterized in that the step of training the recursive convolutional neural network model on the sample set comprises:
inputting the goal samples and non-goal samples of the sample set into the recursive convolutional neural network, adopting the negative log-likelihood as the loss function, and optimizing the recursive convolutional neural network with the backpropagation-through-time algorithm so that the value of the loss function becomes smaller and smaller, thereby completing the training of the recursive convolutional neural network.
6. The method according to any one of claims 1-5, characterized in that the step of training the recursive convolutional neural network model on the sample set comprises the following steps:
A. initializing the recursive convolutional neural network, wherein, denoting the fan-out and fan-in of a single neuron as $fan_{out}$ and $fan_{in}$ respectively, the initial weights of a single neuron in the convolutional neural network are drawn from a uniform distribution over an interval determined by $fan_{in}$ and $fan_{out}$, and the initial weights of a single neuron in the recurrent neural network are drawn from a uniform distribution over the interval [-0.1, 0.1];
B. inputting the goal samples and non-goal samples of the sample set into the initialized recursive convolutional neural network to obtain the actual output of the recursive convolutional neural network;
C. using the negative log-likelihood loss function, computing the gradients of the network connection weights with the backpropagation algorithm, and using the gradients to update the weights and optimize the recursive convolutional neural network so that the value of the loss function becomes smaller and smaller, thereby completing the training of the recursive convolutional neural network.
7. The method according to claim 6, characterized in that the negative log-likelihood loss function is:
$$NLL(\theta, D) = -\sum_{i=0}^{|D|} \log P(Y = y^{(i)} \mid x^{(i)}, \theta);$$
where θ denotes the parameters of the current neural network; D denotes the current sample set, which includes the goal sample set and the non-goal sample set; |D| denotes the size of the current sample set; $P(Y=y^{(i)} \mid x^{(i)}, \theta)$ denotes the probability that, under the current parameters θ and given the input $x^{(i)}$, a sample yields the output $y^{(i)}$; and log denotes the logarithm;
the formula used by the output layer is:
$$P(y=j \mid x) = \frac{e^{x^{T}\theta_{j}}}{\sum_{k=1}^{K} e^{x^{T}\theta_{k}}};$$
where x denotes an input vector, $x^{T}$ the transpose of this input vector, θ the parameters of the output layer, and K the number of output classes of the current output layer; and $P(y=j \mid x)$ denotes the probability of the output y given the current input x.
8. The method according to claim 3, characterized in that the step of extracting images from the basketball video to be detected and processing the extracted images with the trained recursive convolutional neural network model comprises:
D. delimiting a region to be detected in the basketball video, and extracting the region to be detected from every frame of the video;
E. inputting the images, in the form of a continuous data stream, to the trained recursive convolutional neural network, and computing layer by layer up to the output layer;
F. obtaining the goal probability and the non-goal probability at the output layer; if the goal probability is greater than the non-goal probability, determining that a goal occurs in the current video; otherwise, determining that no goal occurs in the current video.
9. A video-based basketball goal detection device implementing the method of any one of the preceding claims, characterized in that the device includes a training sample acquisition module, a recursive convolutional neural network construction module, and a goal detection module, wherein:
the training sample acquisition module builds a picture library sample set from basketball video, the pictures in the picture library have labels, and the labels include time sequence information and goal identification information;
the recursive convolutional neural network construction module builds a recursive convolutional neural network model based on a convolutional neural network and a recurrent neural network, and trains the recursive convolutional neural network model on the sample set; and
the goal detection module extracts images from the basketball video to be detected, processes the extracted images with the trained recursive convolutional neural network model to obtain an output vector, and determines from the output vector of the recursive convolutional neural network whether a goal occurs in the current basketball video.
CN201610012877.1A 2016-01-07 2016-01-07 A kind of basketball goal detection method and apparatus based on video Active CN105701460B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610012877.1A CN105701460B (en) 2016-01-07 2016-01-07 A kind of basketball goal detection method and apparatus based on video

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610012877.1A CN105701460B (en) 2016-01-07 2016-01-07 A kind of basketball goal detection method and apparatus based on video

Publications (2)

Publication Number Publication Date
CN105701460A true CN105701460A (en) 2016-06-22
CN105701460B CN105701460B (en) 2019-01-29

Family

ID=56226994

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610012877.1A Active CN105701460B (en) 2016-01-07 2016-01-07 A kind of basketball goal detection method and apparatus based on video

Country Status (1)

Country Link
CN (1) CN105701460B (en)

Also Published As

Publication number Publication date
CN105701460B (en) 2019-01-29

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP01 Change in the name or title of a patent holder

Address after: No.38, Zheda Road, Hangzhou, Zhejiang 310007

Patentee after: Wang Yueming

Patentee after: ROOT SPORTS SCIENCE AND TECHNOLOGY (BEIJING) Co.,Ltd.

Address before: No.38, Zheda Road, Hangzhou, Zhejiang 310007

Patentee before: Wang Yueming

Patentee before: ROOTSPORTS INVESTMENT (BEIJING) Co.,Ltd.