CN105701460A

CN105701460A - Video-based basketball goal detection method and device

Info

Publication number: CN105701460A
Application number: CN201610012877.1A
Authority: CN
Inventors: 王跃明
Original assignee: Apex Sports Investments (beijing) Ltd
Current assignee: ROOT SPORTS SCIENCE AND TECHNOLOGY (BEIJING) Co.,Ltd.
Priority date: 2016-01-07
Filing date: 2016-01-07
Publication date: 2016-06-22
Anticipated expiration: 2036-01-07
Also published as: CN105701460B

Abstract

The present invention relates to a video-based basketball goal detection method and device. The method comprises: constructing a recurrent convolutional neural network model based on a convolutional neural network and a recurrent neural network; constructing a picture library sample set from basketball videos, wherein the pictures in the library carry labels comprising time sequence information and goal identification information; training the recurrent convolutional neural network model on the sample set; extracting images from the basketball video to be detected and processing them with the trained model to obtain output vectors; and determining from the output vectors of the recurrent convolutional neural network whether a goal occurs in the current basketball video. According to the present invention, basketball goals can be detected accurately from video.

Description

A video-based basketball goal detection method and device

Technical field

The present invention relates to video detection technology, and more specifically to a video-based basketball goal detection method and device.

Background art

With the rapid development of science and technology, video that once could only be shot with a professional camera can now be shot with a smartphone carried in a pocket. As a result, the volume of video of all kinds uploaded to the Internet has surged in recent years, and with the development of Internet and digital media technology, the analysis of particular types of video has increasingly become a research focus.

A large amount of sports video is now available online, and a large share of it is basketball video. Some research has addressed basketball video, but most of it concentrates on player detection or court event analysis, such as analyzing the players or the spectators on the court; little work addresses goal detection. The main reason is that a goal in a basketball video is a continuous dynamic process, whereas player detection and audience analysis deal with relatively static content.

How to detect goals in basketball video accurately is a technical problem that needs to be solved.

Summary of the invention

In view of this, an object of the present invention is to provide a video-based basketball goal detection method and device that substantially eliminate one or more problems caused by the limitations and shortcomings of the prior art.

To achieve this object, one aspect of the present invention provides a video-based basketball goal detection method comprising the following steps:

building a recurrent convolutional neural network model based on a convolutional neural network and a recurrent neural network;

building a picture library sample set from basketball video, wherein the pictures in the picture library carry labels, and the labels include time sequence information and goal identification information;

training the recurrent convolutional neural network model on the sample set;

extracting images from the basketball video to be detected and processing the extracted images with the trained recurrent convolutional neural network model to obtain an output vector; and

judging, from the output vector of the recurrent convolutional neural network, whether a goal occurs in the current basketball video.

Further, the step of building the picture library sample set may comprise: extracting each frame image from the basketball video, classifying each picture as a goal picture or a non-goal picture, and labeling it accordingly; and marking the basket position in each picture and cropping out a color image block of predefined size, wherein two color image blocks that are adjacent in time and carry the same goal identification information form a single sample, so that a goal sample set and a non-goal sample set in a predetermined ratio are obtained from the frame images extracted from the basketball video.

Further, the recurrent convolutional neural network model comprises three parallel recurrent convolutional neural networks. Each parallel network comprises six layers in series: a first recurrent convolutional layer, a first pooling layer, a second recurrent convolutional layer, a second pooling layer, a fully connected layer, and an output layer. The three parallel networks share the fully connected layer and the output layer. The input to the first and second parallel networks is the color image of the earlier time step in a sample, and the input to the third parallel network is the color image of the later time step in the sample. The output of each convolutional layer in one parallel network serves as part of the input to the corresponding convolutional layer in the next parallel network.

Further, the first recurrent convolutional layer has 10 feature maps, the first pooling layer has 10 feature maps, the second recurrent convolutional layer has 30 feature maps, and the second pooling layer has 30 feature maps. The output of the first recurrent convolutional layer is the input to the first pooling layer, the output of the first pooling layer is the input to the second recurrent convolutional layer, and the output of the second recurrent convolutional layer is the input to the second pooling layer. The outputs of the second pooling layers of the parallel earlier and later time steps are the input to the fully connected layer. The fully connected layer has 30 nodes, and the output layer has 2 nodes.

Further, the step of training the recurrent convolutional neural network model on the sample set may comprise: feeding the goal samples and non-goal samples of the sample set into the recurrent convolutional neural network, using the negative log-likelihood as the loss function, and optimizing the network with the backpropagation through time algorithm so that the value of the loss function decreases, thereby completing the training of the recurrent convolutional neural network.

Further, the step of training the recurrent convolutional neural network model on the sample set may comprise the following steps:

A. initializing the recurrent convolutional neural network, wherein, denoting the fan-out and fan-in of a single neuron by fan_out and fan_in respectively, the initial weights of a single neuron in the convolutional neural network are drawn from a uniform distribution over an interval (the interval expression is not reproduced here), and the initial weights of a single neuron in the recurrent neural network are drawn from a uniform distribution over the interval [-0.1, 0.1];

B. feeding the goal samples and non-goal samples of the sample set into the initialized recurrent convolutional neural network to obtain the actual output of the network;

C. using the negative log-likelihood loss function, computing the gradients of the network connection weights with the backpropagation algorithm, and using the gradients to optimize the weights of the recurrent convolutional neural network so that the value of the loss function decreases, thereby completing the training of the recurrent convolutional neural network.

The negative log-likelihood loss function is:

$$\mathrm{NLL}(\theta, D) = -\sum_{i=0}^{|D|} \log P(Y = y^{(i)} \mid x^{(i)}, \theta)$$

Here, θ denotes the current parameters of the neural network; D denotes the current sample set, which includes the goal sample set and the non-goal sample set; |D| denotes the size of the current sample set; P(Y = y^(i) | x^(i), θ) denotes the probability that, under the current parameters θ and given the input x^(i), the output is y^(i); and log denotes the logarithm.

The formula used by the output layer is:

$$P(y = j \mid x) = \frac{e^{x^{T}\theta_{j}}}{\sum_{k=1}^{K} e^{x^{T}\theta_{k}}}$$

Here, x denotes the input vector, x^T denotes the transpose of the input vector, θ denotes the parameters of the output layer, K denotes the number of output classes of the current output layer, and P(y = j | x) denotes the probability that the output y is class j given the current input x.
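The output-layer formula can be illustrated with a minimal sketch. The function name `softmax_probabilities`, the two-element input, and the parameter values are assumptions chosen for illustration; subtracting the largest score before exponentiating is a common numerical safeguard and is not described in the text.

```python
import math


def softmax_probabilities(x, theta):
    """Return P(y = j | x) for each class j, given one parameter vector per class."""
    # Score of class j is the dot product x^T theta_j.
    scores = [sum(xi * tj for xi, tj in zip(x, theta_j)) for theta_j in theta]
    peak = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


x = [1.0, 2.0]                       # stand-in for the 30-dimensional input
theta = [[0.5, -0.5], [-0.5, 0.5]]   # K = 2 classes; values are illustrative
p = softmax_probabilities(x, theta)
```

The two returned values sum to 1, so they can be read directly as the no-goal and goal probabilities under whichever class ordering is chosen.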

Further, the step of extracting images from the basketball video to be detected and processing the extracted images with the trained recurrent convolutional neural network model may comprise:

D. marking the region to be detected in the basketball video and extracting that region from the video frame by frame as images;

E. feeding the images, as a continuous data stream, into the trained recurrent convolutional neural network and computing layer by layer up to the output layer;

F. obtaining the goal and no-goal probabilities at the output layer; if the goal probability is greater than the no-goal probability, judging that a goal occurs in the current video; otherwise, judging that no goal occurs in the current video.

According to another aspect, a video-based basketball goal detection device is also provided. The device comprises a training sample acquisition module, a recurrent convolutional neural network building module, and a goal detection module, wherein:

the training sample acquisition module builds the picture library sample set from basketball video, wherein the pictures in the picture library carry labels, and the labels include time sequence information and goal identification information;

the recurrent convolutional neural network building module builds the recurrent convolutional neural network model based on a convolutional neural network and a recurrent neural network, and trains the model on the sample set; and

the goal detection module extracts images from the basketball video to be detected, processes the extracted images with the trained recurrent convolutional neural network model to obtain an output vector, and judges from the output vector of the recurrent convolutional neural network whether a goal occurs in the current basketball video.

According to embodiments of the present invention, whether a basketball goal has been scored can be detected accurately from video.

Additional advantages, objects, and features of the present invention will be set forth in part in the description that follows, and in part will become apparent to those of ordinary skill in the art upon examination of the following, or may be learned from practice of the invention. The objects and other advantages of the invention may be realized and attained by the structure particularly pointed out in the written description, the claims, and the accompanying drawings.

Those skilled in the art will appreciate that the objects and advantages achievable with the present invention are not limited to those specifically described above, and that the above and other objects the present invention can achieve will be understood more clearly from the following detailed description.

Brief description of the drawings

Many aspects of the present invention can be better understood with reference to the following drawings. In the drawings:

Fig. 1 is a schematic diagram of the video-based basketball goal detection method in an embodiment of the present invention;

Fig. 2 is a schematic diagram of the recurrent convolutional neural network of the present invention;

Fig. 3 is an architecture diagram of the recurrent convolutional neural network of the present invention;

Fig. 4 is a flow chart of training the recurrent convolutional neural network in an embodiment of the present invention;

Fig. 5 is a flow chart of the goal detection process in an embodiment of the present invention;

Fig. 6 is a receiver operating characteristic (ROC) curve of the test results of the present invention; and

Fig. 7 is a block diagram of the video-based basketball goal detection device in an embodiment of the present invention.

Detailed description of the invention

Preferred embodiments of the present invention are described in detail below. Examples of these preferred embodiments are illustrated in the accompanying drawings. The embodiments shown in and described with reference to the drawings are merely exemplary, and the technical spirit of the present invention and its main operations are not limited to these embodiments.

It should also be noted that, to avoid obscuring the present invention with unnecessary detail, the drawings show only the structures and/or processing steps closely related to the solution of the present invention and omit other details of little relevance.

Machine learning is an important subdiscipline of artificial intelligence. In recent years, with the rapid rise of neural networks as a branch of machine learning, many neural network models have been applied successfully in multiple fields. Convolutional neural networks have been applied successfully to computer vision tasks such as face recognition and pedestrian detection, with impressive results, and recurrent neural networks have been applied successfully to speech recognition, where they have effectively improved recognition accuracy.

The present invention uses convolutional neural networks to detect basketball goals in video.

The video-based basketball goal detection method of the present invention is based on the following design. First, pictures are extracted from basketball video and labeled with target information and time sequence information to build a sample library (the target information may be goal identification information indicating whether a goal is scored). Next, a recurrent convolutional neural network is constructed and trained on the sample library. Finally, the trained recurrent convolutional neural network processes the basketball video, and its output vector is used to judge whether a goal occurs in the video at the current moment. The order in which the sample library and the recurrent convolutional neural network are built is not restricted.

Specifically, as shown in Fig. 1, the video-based basketball goal detection method of the present invention comprises the following steps:

Step S130: build the picture library sample set from basketball video. The pictures in this picture library carry labels, which may include time sequence information and goal identification information.

Specifically, each frame image can be extracted from a number of basketball videos. All frame images of a given video can be labeled with their time sequence information and, depending on whether a goal occurs in the frame, labeled with goal identification information as goal pictures or non-goal pictures. The time sequence information and goal identification information can be added manually; alternatively, the time sequence information can be a time tag generated automatically for each frame image. By marking the basket position, a color image block of predefined size is cropped out (for example, a color image block of 32*40 pixels, although the present invention is not limited to this size). The goal identifier of a goal picture can be positive, and the goal identifier of a non-goal picture can be negative. Using the time sequence information, the goal pictures and non-goal pictures are combined: two color image blocks that are adjacent in time and carry the same goal identifier form a single training sample, meaning that the two blocks are stored as a previous frame and a next frame. This finally yields a goal sample set and a non-goal sample set in a predetermined ratio, for example a positive sample set and a negative sample set in a ratio of roughly 1 to 50. The positive sample set is the set of all positive samples; each positive sample comprises two color image blocks, a previous frame and a next frame, cropped from temporally adjacent goal pictures, and the number of positive samples can be approximately 900. The negative sample set is the set of all negative samples; each negative sample comprises two color image blocks, a previous frame and a next frame, cropped from temporally adjacent non-goal pictures, and the number of negative samples can be approximately 44000. The sample counts of 900 and 44000 are merely illustrative and do not limit the present invention.
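The pairing rule for training samples can be sketched as follows. This is a minimal illustration, assuming each labeled frame is represented as a tuple of an integer time index, a goal flag, and a cropped patch (here a placeholder string); the names `Frame`, `Sample`, and `build_samples` are hypothetical.

```python
from typing import List, Tuple

# (time_index, is_goal, patch); the patch stands in for the cropped
# 32*40 color block around the basket.
Frame = Tuple[int, bool, str]
# (previous patch, next patch, goal label)
Sample = Tuple[str, str, bool]


def build_samples(frames: List[Frame]) -> List[Sample]:
    """Pair temporally adjacent frames that carry the same goal label."""
    ordered = sorted(frames, key=lambda f: f[0])
    samples = []
    for (t0, g0, p0), (t1, g1, p1) in zip(ordered, ordered[1:]):
        # Keep the pair only if the frames are consecutive and agree on the label.
        if t1 - t0 == 1 and g0 == g1:
            samples.append((p0, p1, g0))
    return samples


frames = [(0, False, "f0"), (1, False, "f1"), (2, True, "f2"),
          (3, True, "f3"), (5, True, "f5")]
samples = build_samples(frames)
positives = [s for s in samples if s[2]]
negatives = [s for s in samples if not s[2]]
```

In this toy input the pair (f1, f2) is dropped because the labels differ, and (f3, f5) is dropped because the frames are not consecutive.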

Step S110: build the recurrent convolutional neural network model based on a convolutional neural network and a recurrent neural network.

The recurrent convolutional neural network that is built can comprise six convolutional neural network layers in series: two recurrent convolutional layers, two pooling layers, one fully connected layer, and one softmax regression layer. The inputs to the networks of the earlier and later time steps are two consecutive frames, each a three-channel color image (for example an RGB image) of 32*40 pixels. Recurrent convolutional layer RC1 has 10 feature maps, pooling layer M2 has 10 feature maps, recurrent convolutional layer RC3 has 30 feature maps, and pooling layer M4 has 30 feature maps. The input to fully connected layer F5, which has 30 nodes, is the output of pooling layer M4 of the earlier and later time steps, and output layer L6 has 2 nodes.

The original image is the input to the first recurrent convolutional layer RC1, whose output is 10 feature maps. These 10 feature maps are the input to the first pooling layer M2, whose output is also 10 feature maps, which in turn are the input to the second recurrent convolutional layer RC3. RC3 outputs 30 feature maps, which are the input to the second pooling layer M4. The output of M4 is also 30 feature maps, which are flattened into a vector as the input to fully connected layer F5. F5 outputs a 30-dimensional vector, which is the input to softmax regression layer L6. L6 outputs a 2-dimensional vector whose components are the probabilities that the original image shows a goal or no goal.

The recurrent convolutional layers extract features by sliding filters over their input and convolving. A filter is a matrix of fixed size that slides over the image input to the convolutional layer from left to right and top to bottom, moving a certain number of pixels each time. At each position, convolution multiplies the filter values by the pixel values at the corresponding positions of the equally sized image block and sums the products; the sum is passed through a nonlinear function (such as tanh), and the result is the value of one pixel in the convolved image. Convolving the whole input image and applying tanh pixel by pixel is called the feature extraction process. The pooling layers select the local maximum of the feature values, which not only reduces the feature dimension but also keeps the features robust to small deformations of the input. The fully connected layer aggregates the features of the multiple feature maps to obtain a global feature of the original image. The output layer uses the output of the fully connected layer to compute the goal and no-goal probabilities.
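The filter-and-tanh feature extraction and the local-maximum pooling described above can be sketched for a single channel. This is an illustrative sketch only: it assumes no padding, a step of one pixel, non-overlapping pooling windows, and no bias term, none of which the text specifies, and the names `conv2d_tanh` and `max_pool` are hypothetical.

```python
import math


def conv2d_tanh(image, kernel):
    """Slide the filter over the image, multiply-and-sum, then apply tanh."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            s = sum(kernel[i][j] * image[r + i][c + j]
                    for i in range(kh) for j in range(kw))
            row.append(math.tanh(s))
        out.append(row)
    return out


def max_pool(feature_map, size=2):
    """Keep the local maximum of each non-overlapping size*size window."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [[max(feature_map[r * size + i][c * size + j]
                 for i in range(size) for j in range(size))
             for c in range(out_w)]
            for r in range(out_h)]


image = [[0.0] * 40 for _ in range(32)]   # one channel, 32 rows by 40 columns
image[10][20] = 1.0                       # a single bright pixel
kernel = [[1.0] * 9 for _ in range(5)]    # a 5*9 filter of ones (illustrative)
fmap = conv2d_tanh(image, kernel)
pooled = max_pool(fmap)
```

A real layer would sum over all input channels and produce one such feature map per filter; the single-channel version is kept short to show the per-pixel arithmetic.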

In the present invention, a 6-layer convolutional neural network is chosen as a trade-off between the goal detection precision to be achieved and the computational complexity. The present invention is not limited to this choice; more or fewer layers can also be selected.

Fig. 2 shows the details of a single convolutional neural network of the present invention. The (10, 3, 5, 9) below the first recurrent convolutional layer RC1 indicates that the number of input feature maps is 3, the number of output feature maps is 10, and each of the 10 filters needed to produce the output feature maps (in general, as many filters are needed as there are output feature maps) has size 5*9, where the filter size is the size of the matrix block used in the convolution. The (2, 2) below the first pooling layer M2 indicates that the pooling window of M2 is 2*2. The (30, 10, 5, 5) below the second recurrent convolutional layer RC3 indicates that the number of input feature maps is 10, the number of output feature maps is 30, and the filter size is 5*5. The (2, 2) below the second pooling layer M4 indicates that its pooling window is 2*2. The (2700, 1) below fully connected layer F5 indicates that the size of F5 is 2700*1. The (2, 1) below the softmax layer indicates that the size of this layer is 2*1.
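The layer sizes listed for Fig. 2 can be cross-checked with a short calculation. The sketch assumes convolution without padding and with a step of one pixel, non-overlapping 2*2 pooling, and an input of 32 rows by 40 columns with the 5*9 filter read as 5 rows by 9 columns; these conventions are assumptions, not statements from the text. Under them, the flattened M4 outputs of three time steps total 2700 values, which matches the (2700, 1) noted for F5.

```python
def conv_out(size, kernel):
    """Output length of a no-padding, step-one convolution along one axis."""
    return size - kernel + 1


def pool_out(size, window):
    """Output length of non-overlapping pooling along one axis."""
    return size // window


h, w = 32, 40                               # input patch, assumed rows*columns
h, w = conv_out(h, 5), conv_out(w, 9)       # RC1: 5*9 filters  -> 28*32
h, w = pool_out(h, 2), pool_out(w, 2)       # M2:  2*2 pooling  -> 14*16
h, w = conv_out(h, 5), conv_out(w, 5)       # RC3: 5*5 filters  -> 10*12
h, w = pool_out(h, 2), pool_out(w, 2)       # M4:  2*2 pooling  -> 5*6
per_step = 30 * h * w                       # 30 feature maps per time step
f5_inputs = 3 * per_step                    # three time steps concatenated
print(h, w, per_step, f5_inputs)            # prints "5 6 900 2700"
```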

Fig. 3 shows the architecture of the recurrent convolutional neural network of the present invention. It comprises three parallel convolutional neural networks, each as detailed in Fig. 2. Recurrent neural network (RNN) connections are added between the three networks to link the time steps: the output of the recurrent convolutional layer of one time step serves as part of the input to the recurrent convolutional layer of the next time step, which realizes the temporal connection and yields the final architecture shown in Fig. 3. In the figure, the input at times T and T+1 is the earlier frame of the sample, and the input at time T+2 is the later frame of the sample. In addition, the connections between convolutional layers are full connections in block form, and after the second pooling layer a single fully connected layer links all the pooling layers together.
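The temporal link can be illustrated with a deliberately simplified sketch. It replaces each feature map with a single number and assumes that the previous output is combined with the current input by a weighted sum followed by tanh; the text does not say how the two are combined, and the names `recurrent_layer`, `w_in`, and `w_rec` are hypothetical.

```python
import math


def recurrent_layer(inputs, w_in, w_rec):
    """Unroll one recurrent unit over a sequence of scalar inputs."""
    state = 0.0
    states = []
    for x in inputs:
        # The previous step's output is part of the current step's input.
        state = math.tanh(w_in * x + w_rec * state)
        states.append(state)
    return states


# Times T and T+1 see the earlier frame (1.0); time T+2 sees the later frame (0.0).
states = recurrent_layer([1.0, 1.0, 0.0], w_in=0.5, w_rec=0.1)
```

Even though the last input is zero, the last state is nonzero, because it still carries information from the earlier time steps.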

That is, in the present invention, the recurrent convolutional neural network model combines a convolutional neural network with a recurrent neural network to introduce inter-frame connections between images on top of the sample data, so as to capture the temporal relationships in the video.

Step S150: train the recurrent convolutional neural network model on the sample set.

For example, the positive and negative samples of the basketball video sample set are fed into layer RC1 of the recurrent convolutional neural network. The negative log-likelihood (NLL) is used as the loss function and, combined with the backpropagation through time (BPTT) algorithm, the parameters of the recurrent convolutional neural network are optimized so that the value of the loss function decreases, thereby completing the training of the network. This step may specifically comprise the following steps:

Step S1501: initialize the recurrent convolutional neural network. Denoting the fan-out and fan-in of a single neuron by fan_out and fan_in respectively, the initial weights of a single neuron in the convolutional neural network are drawn from a uniform distribution over an interval (the interval expression is not reproduced here), and the initial weights of a single neuron in the recurrent neural network are drawn from a uniform distribution over the interval [-0.1, 0.1]. The convolutional weights are initialized in this way so that the initial weights are as small as possible without being concentrated at or near a single point, which speeds up backpropagation. The initialization of the recurrent weights follows the same rationale.
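Uniform initialization can be sketched as follows. Only the recurrent interval [-0.1, 0.1] comes from the text; the bound for the convolutional weights is not given in this text, so `conv_bound` below is a hypothetical placeholder value, and the helper name `init_uniform` is likewise an assumption.

```python
import random


def init_uniform(n_weights, bound, rng):
    """Draw n_weights values uniformly from [-bound, bound]."""
    return [rng.uniform(-bound, bound) for _ in range(n_weights)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Recurrent connections: the interval [-0.1, 0.1] is stated in the text.
recurrent_weights = init_uniform(30, 0.1, rng)

# Convolutional connections: the real bound is defined in terms of fan_in and
# fan_out, but the expression is unavailable, so this value is illustrative only.
conv_bound = 0.05  # hypothetical placeholder
conv_weights = init_uniform(5 * 9, conv_bound, rng)
```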

Step S1502: feed all the training samples into the initialized recurrent convolutional neural network for training and obtain the actual output of the network. Each training sample comprises an input vector and its corresponding desired output vector; the input vector is propagated through the recurrent convolutional neural network to the softmax layer, which yields the actual output vector. Training uses stochastic gradient descent: each time, a subset of the training samples is selected as the input to recurrent convolutional layer RC1. The output of RC1 is the input to pooling layer M2, the output of M2 is the input to recurrent convolutional layer RC3, and the output of RC3 is the input to pooling layer M4. The outputs of pooling layer M4 for the three time steps are merged into one vector as the input to fully connected layer F5, and the output of F5 is the input to the softmax output layer. The output of the softmax layer is used to compute the value of the loss function. Using the classical gradient descent algorithm and backpropagation, the loss function is used to compute the gradients of the network connection weights layer by layer in reverse, and the gradients are used to update the weights so that the value of the loss function decreases, which trains the network.

Step S1503: use the negative log-likelihood in formula 1 as the loss function and train the recurrent convolutional neural network with the backpropagation through time algorithm.

For example, the negative log-likelihood in formula 1 is used as the loss function:

$$\mathrm{NLL}(\theta, D) = -\sum_{i=0}^{|D|} \log P(Y = y^{(i)} \mid x^{(i)}, \theta) \tag{1}$$

Here, θ denotes the current parameters of the neural network; D denotes the current sample set, which includes the positive sample set and the negative sample set; |D| denotes the size of the current sample set; P(Y = y^(i) | x^(i), θ) denotes the probability that, under the current parameters θ and given the input x^(i), a sample's output is y^(i); and log denotes the logarithm. With the negative log-likelihood (NLL) as the optimization objective, training is the process of continually reducing the value of this loss function, and the recurrent convolutional neural network is optimized with the backpropagation through time algorithm.
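Formula 1 can be evaluated directly once the model's class probabilities are known. The sketch below assumes the probabilities are already available as a list of per-class values for each sample and that class 0 means no goal and class 1 means goal; the function name `negative_log_likelihood` and the numbers are illustrative.

```python
import math


def negative_log_likelihood(probabilities, labels):
    """Sum of -log P(Y = y_i | x_i) over the sample set.

    probabilities[i][k] is the model's probability of class k for sample i;
    labels[i] is the index of the true class of sample i.
    """
    return -sum(math.log(p[y]) for p, y in zip(probabilities, labels))


# Class 0 = no goal, class 1 = goal (an assumed encoding).
probs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
labels = [0, 1, 1]
loss = negative_log_likelihood(probs, labels)
```

The loss shrinks as the probability assigned to each true class approaches 1, which is why minimizing it trains the classifier.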

The softmax layer can be expressed as a function, defined by formula 2:

$$P(y = j \mid x) = \frac{e^{x^{T}\theta_{j}}}{\sum_{k=1}^{K} e^{x^{T}\theta_{k}}} \tag{2}$$

Here, x denotes an input vector, x^T denotes its transpose, θ denotes the parameters of the softmax layer, K denotes the number of output classes of the softmax layer, and P(y = j | x) denotes the probability that the output y is class j given the current input x. The class with the highest softmax output probability (goal or no goal) is taken as the output result, so the softmax layer acts as a classifier. The output of the softmax layer is used to compute the value of the loss function, and the loss function is used with the backpropagation algorithm to compute the gradients of the network connection weights layer by layer in reverse; the gradients are then used to update the weights so that the value of the loss function decreases, which trains the network. With backpropagation through time, the network is unrolled over the time steps to form the network shown in Fig. 3, and the backpropagation algorithm is then applied to the network of each time step. Backpropagation uses the loss function to compute a gradient for each network connection weight and changes the weights by gradient descent, thereby reducing the value of the loss function and completing the training of the recurrent convolutional neural network.

Steps S170-S190: extract images from the basketball video to be detected, process the extracted images with the trained recurrent convolutional neural network model to obtain an output vector, and judge from the output vector whether a goal occurs in the current basketball video.

Steps S170-S190 are the goal detection steps. The basketball video is treated as a continuous stream of image frames, and consecutive frames are grouped in pairs (an earlier and a later frame) and fed into the trained recurrent convolutional neural network, which detects whether a goal sequence occurs in the video at a given moment and obtains the goal start time and goal end time before finally outputting the detection result. The specific steps are as follows:

Step S1701: mark the region to be detected in the basketball video, which generally is the basket position, and extract that region from the video frame by frame to form a stream of image frames.

Step S1702: feed the images, as a continuous data stream, into the trained recurrent convolutional neural network and compute layer by layer up to the output layer.

Step S1703: obtain the goal and no-goal probabilities at the output layer. If the goal probability is greater than the no-goal probability, judge that a goal occurs in the current video; otherwise, judge that no goal occurs in the current video.
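The decision rule of step S1703, applied to a stream of output vectors, can be sketched as follows. The comparison of the two probabilities follows the text; grouping consecutive goal decisions into (start, end) intervals is one assumed way of obtaining goal start and end times, and the name `detect_goals` and the probability values are hypothetical.

```python
def detect_goals(frame_probs):
    """Label each step as goal or no goal and collect (start, end) goal intervals.

    frame_probs[t] is (p_no_goal, p_goal) from the output layer at step t.
    """
    decisions = [p_goal > p_no_goal for p_no_goal, p_goal in frame_probs]
    intervals, start = [], None
    for t, is_goal in enumerate(decisions):
        if is_goal and start is None:
            start = t                          # a goal sequence begins
        elif not is_goal and start is not None:
            intervals.append((start, t - 1))   # the sequence ended at t - 1
            start = None
    if start is not None:                      # sequence runs to the last step
        intervals.append((start, len(decisions) - 1))
    return decisions, intervals


probs = [(0.9, 0.1), (0.3, 0.7), (0.2, 0.8), (0.6, 0.4), (0.4, 0.6)]
decisions, intervals = detect_goals(probs)
```

Multiplying the step indices by the frame interval of the video would turn each interval into start and end times.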

With the video-based basketball goal detection method of the present invention, whether a goal is scored can be detected accurately.

To evaluate the detection accuracy of the present invention, a test set comprising 18314 negative samples and 371 positive samples was generated and used to test the trained recurrent convolutional neural network. Of the 18685 test samples in total, 18646 were classified correctly, an accuracy rate of 99.79128%; 39 samples were misclassified, an error rate of 0.20872%, of which 18 were negative samples and 21 were positive samples. The overall recall rate is 94.33962% and the false alarm rate is 0.09829%.
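The reported rates can be reproduced from the counts. The calculation below reads the 18 and 21 as the numbers of misclassified negative and positive samples respectively, and uses the usual definitions of recall (correct positives over all positives) and false alarm rate (misclassified negatives over all negatives); that reading is an interpretation, supported by the fact that it yields the stated percentages.

```python
negatives, positives = 18314, 371
false_positives, false_negatives = 18, 21   # misclassified negatives / positives

total = negatives + positives
errors = false_positives + false_negatives
accuracy = (total - errors) / total
error_rate = errors / total
recall = (positives - false_negatives) / positives
false_alarm_rate = false_positives / negatives

print(f"{accuracy:.5%} {error_rate:.5%} {recall:.5%} {false_alarm_rate:.5%}")
```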

It can be seen that the present invention achieves very high goal detection accuracy. The present invention can effectively detect goals in basketball video and has good feasibility and robustness.

The order of the steps shown in the accompanying drawings is merely illustrative and not limiting, and may be changed where reasonable; for example, the order of step S110 and step S130 may be exchanged, or the two steps may be carried out in parallel. Such changes in the order of steps all fall within the protection scope of the present invention.

The present invention also provides a device for implementing the above method. As shown in Fig. 7, the device includes a training sample acquisition module, a recurrent convolutional neural network building module, and a goal detection module, wherein:

The recurrent convolutional neural network building module constructs a recurrent convolutional neural network model based on a convolutional neural network and a recurrent neural network, and trains the recurrent convolutional neural network model based on the sample set to obtain a classifier output indicating whether a goal occurs in the video; and

Each part of the present invention can be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods can be implemented in software or firmware that is stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, they can be implemented by any one or a combination of the following techniques well known in the art: a discrete logic circuit having logic gates for implementing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gates, a programmable gate array (PGA), a field programmable gate array (FPGA), and so on.

The logic and/or steps represented in the flowcharts or otherwise described herein may, for example, be regarded as an ordered list of executable instructions for implementing logic functions, and may be embodied in any computer-readable medium for use by, or in connection with, an instruction execution system, device, or apparatus (such as a computer-based system, a system including a processor, or another system that can fetch instructions from the instruction execution system, device, or apparatus and execute them).

In the present invention, features described and/or illustrated for one embodiment may be used in the same or a similar manner in one or more other embodiments, and/or combined with or substituted for features of other embodiments.

It should be noted that the above embodiments are illustrative only and do not limit the scope of the claims of the present invention; any equivalent technical solution based on the present invention shall fall within the scope of patent protection of the present invention.

Claims

1. A video-based basketball goal detection method, characterized in that the method comprises the following steps:

2. The method according to claim 1, characterized in that the step of constructing the picture library sample set of basketball videos includes:

Extracting each frame image from the basketball videos, classifying it as a goal picture or a non-goal picture, and adding a label to the picture;

Defining the basket position in each picture and cropping a color image block of a predetermined size, wherein two color image blocks that are adjacent in time and have the same goal identification information form a single sample, thereby obtaining, from the frame images extracted from the basketball videos, a goal sample set and a non-goal sample set in a predetermined ratio.
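The pairing rule in claim 2 can be illustrated with a short sketch. The function name `build_samples` and the representation of the labelled blocks as a time-ordered list of `(block, is_goal)` tuples are assumptions made for illustration; balancing the two sets to the predetermined ratio is not shown.

```python
def build_samples(labelled_blocks):
    """Pair temporally adjacent image blocks that share the same goal label.

    labelled_blocks: list of (block, is_goal) tuples in temporal order
    (a hypothetical representation). Returns (earlier, later, is_goal) samples.
    """
    samples = []
    for (earlier, label_a), (later, label_b) in zip(labelled_blocks, labelled_blocks[1:]):
        if label_a == label_b:  # adjacent blocks with different labels form no sample
            samples.append((earlier, later, label_a))
    return samples

# Blocks are stood in for by frame names; True marks a goal picture.
blocks = [("f0", False), ("f1", False), ("f2", True), ("f3", True), ("f4", False)]
samples = build_samples(blocks)
goal_samples = [s for s in samples if s[2]]
non_goal_samples = [s for s in samples if not s[2]]
print(goal_samples)      # [('f2', 'f3', True)]
print(non_goal_samples)  # [('f0', 'f1', False)]
```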

3. The method according to claim 2, characterized in that the recurrent convolutional neural network model includes three parallel recurrent convolutional neural network layers; each parallel recurrent convolutional neural network layer includes 6 serial convolutional neural network layers, which are, in order: a first recurrent convolutional layer, a first pooling layer, a second recurrent convolutional layer, a second pooling layer, a fully connected layer, and an output layer; the three parallel recurrent convolutional neural network layers share the fully connected layer and the output layer; the input of the first and second parallel recurrent convolutional neural network layers is the color image of the earlier time step in the sample, and the input of the third parallel recurrent convolutional neural network layer is the color image of the later time step in the sample; and the output of each convolutional neural network layer in a preceding parallel recurrent convolutional neural network layer serves as part of the input to the corresponding convolutional neural network layer in the following parallel recurrent convolutional neural network layer.

4. The method according to claim 3, characterized in that the first recurrent convolutional layer has 10 feature maps, the first pooling layer has 10 feature maps, the second recurrent convolutional layer has 30 feature maps, and the second pooling layer has 30 feature maps; the output of the first recurrent convolutional layer is the input of the first pooling layer, the output of the first pooling layer is the input of the second recurrent convolutional layer, and the output of the second recurrent convolutional layer is the input of the second pooling layer; the outputs of the second pooling layers of the two parallel time steps, earlier and later, are the input of the fully connected layer; the fully connected layer has 30 nodes, and the output layer has 2 nodes.

5. The method according to claim 1, characterized in that the step of training the recurrent convolutional neural network model based on the sample set includes:

Inputting the goal samples and non-goal samples of the sample set into the recurrent convolutional neural network, adopting the negative log-likelihood as the loss function, and optimizing the recurrent convolutional neural network with the backpropagation-through-time algorithm so that the value of the loss function becomes progressively smaller, thereby completing the training of the recurrent convolutional neural network.

6. The method according to any one of claims 1-5, characterized in that the step of training the recurrent convolutional neural network model based on the sample set comprises the following steps:

7. The method according to claim 6, characterized in that the negative log-likelihood loss function is:

NLL(\theta, D) = -\sum_{i=0}^{|D|} \log P(Y = y^{(i)} \mid x^{(i)}, \theta);

The formula used by the output layer is:

P(y = j \mid x) = \frac{e^{x^{T}\theta_{j}}}{\sum_{k=1}^{K} e^{x^{T}\theta_{k}}};
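The two formulas of claim 7 can be evaluated numerically as in the sketch below. It is illustrative only: the function names, the toy inputs, and the layout of `theta` as a matrix with one column per class are assumptions, and the maximum score is subtracted before exponentiation purely for numerical stability, which leaves the softmax value unchanged.

```python
import numpy as np

def softmax(x, theta):
    """P(y = j | x) for each class j; theta has one column per class."""
    scores = x @ theta               # x^T theta_j for every class j
    scores = scores - scores.max()   # stability shift; does not change the result
    exp_scores = np.exp(scores)
    return exp_scores / exp_scores.sum()

def negative_log_likelihood(xs, ys, theta):
    """NLL(theta, D) = -sum_i log P(Y = y_i | x_i, theta)."""
    return -sum(np.log(softmax(x, theta)[y]) for x, y in zip(xs, ys))

# Toy data (assumption): 3-dimensional inputs, K = 2 classes (no goal, goal).
theta = np.zeros((3, 2))             # all-zero weights give uniform probabilities
xs = [np.array([1.0, 2.0, 3.0]), np.array([0.5, -1.0, 2.0])]
ys = [0, 1]
print(softmax(xs[0], theta))                   # [0.5 0.5]
print(negative_log_likelihood(xs, ys, theta))  # 2 * log(2), about 1.386
```

With K = 2 the two softmax outputs correspond to the no-goal and goal probabilities produced by the 2-node output layer, and training lowers the loss by raising the probability assigned to each sample's true label.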

8. The method according to claim 3, characterized in that the step of processing the basketball video to be detected to extract images and processing the extracted images with the trained recurrent convolutional neural network model includes:

9. A video-based basketball goal detection device for implementing the method according to any one of the preceding claims, characterized in that the device includes a training sample acquisition module, a recurrent convolutional neural network building module, and a goal detection module, wherein: