CN111932620B - Method for judging whether a volleyball passes over the net and method for acquiring the serving speed - Google Patents

Method for judging whether a volleyball passes over the net and method for acquiring the serving speed

Info

Publication number
CN111932620B
CN111932620B CN202010754590.2A CN202010754590A
Authority
CN
China
Prior art keywords
volleyball
image
layer
neural network
convolutional neural
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010754590.2A
Other languages
Chinese (zh)
Other versions
CN111932620A (en)
Inventor
王海滨
凌展鸿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Root Sports Science And Technology Beijing Co ltd
Original Assignee
Root Sports Science And Technology Beijing Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Root Sports Science And Technology Beijing Co ltd filed Critical Root Sports Science And Technology Beijing Co ltd
Priority to CN202010754590.2A priority Critical patent/CN111932620B/en
Publication of CN111932620A publication Critical patent/CN111932620A/en
Application granted granted Critical
Publication of CN111932620B publication Critical patent/CN111932620B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • G06T7/73Determining position or orientation of objects or cameras using feature-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/13Edge detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence

Abstract

The application discloses a method for judging whether a volleyball serve passes over the net and a method for acquiring the serving speed. The judging method comprises the following steps: constructing a convolutional neural network; training the convolutional neural network; and processing the volleyball serve video with the trained convolutional neural network to judge whether the volleyball serve passes over the net. Constructing the convolutional neural network comprises constructing a six-layer convolutional neural network, the six layers comprising a first convolutional layer, a second pooling layer, a third convolutional layer, a fourth pooling layer, a fifth fully connected layer and a sixth output layer. The judging method processes the volleyball serve video with the convolutional neural network to judge whether the serve passes over the net; the judgment result is accurate, the processing speed is fast, the precision is high, the labor intensity is reduced, the labor cost is saved, and the requirements of practical application can be well met.

Description

Method for judging whether a volleyball passes over the net and method for acquiring the serving speed
Technical Field
The application relates to the technical field of video processing, and in particular to a video-based method for judging whether a volleyball serve passes over the net and a method for acquiring the serving speed.
Background
The serve is an important technical link in volleyball. In practice, whether in volleyball matches or in daily volleyball training, statistics need to be gathered specifically on serves in order to describe the quality and performance level of certain key players in the serving link.
For a long time, such technical statistics could only be compiled by purely manual means, and past technical solutions have suffered from high cost and limited precision when acquiring data such as the serving speed and the net-crossing angle. With the rapid development of science and technology, high-speed cameras have become widely available in low-cost application scenarios, and how to use such cameras to capture the high-speed motion of the volleyball serve and to complete the statistics through corresponding image analysis is a problem to be solved.
Disclosure of Invention
The purpose of the application is to provide a method for judging whether a volleyball passes over the net and a method for acquiring the serving speed. The following presents a simplified summary in order to provide a basic understanding of some aspects of the disclosed embodiments. This summary is not an extensive overview and is intended to neither identify key/critical elements nor delineate the scope of such embodiments. Its sole purpose is to present some concepts in a simplified form as a prelude to the more detailed description that is presented later.
According to one aspect of the embodiments of the present application, there is provided a method for determining whether a volleyball is passing through a net, including:
constructing a convolutional neural network;
training the convolutional neural network;
and processing volleyball service video by using the trained convolutional neural network, and judging whether volleyball service is passed or not.
Further, the constructing a convolutional neural network includes:
constructing a six-layer convolutional neural network;
the six layers comprise a first convolutional layer, a second pooling layer, a third convolutional layer, a fourth pooling layer, a fifth fully connected layer and a sixth output layer; the output of the first convolutional layer is the feature data obtained after the three RGB channels of the original sample pass through several filter convolution operations, and this feature data is used as the input of the second pooling layer; the second pooling layer pools the output of the first convolutional layer with a max-pooling filter, and the pooled result is used as the input of the third convolutional layer.
Further, the training the convolutional neural network includes:
training the convolutional neural network using a training sample set;
the training sample set comprises a plurality of pictures, and the pictures are marked with a net-crossing identification label and a corresponding time-sequence label; the net-crossing identification label is added to the pictures by manual labeling; the pictures are taken from video captured by a high-speed camera.
Further, the processing of the volleyball serve video using the trained convolutional neural network to judge whether the volleyball serve passes through the net includes:
inputting the volleyball serve video shot by a high-speed camera into the trained convolutional neural network, recognizing the images of the volleyball serve video frame by frame with the trained convolutional neural network, and outputting a judgment result of whether the volleyball serve passes through the net.
Further, before the training the convolutional neural network, the method further comprises a step of initializing the convolutional neural network by using an Xavier initialization mode.
According to another aspect of the embodiments of the present application, there is provided a method for acquiring a serving speed, including:
calculating the average ball speed of volleyball before and after passing through the net for the service video judged to pass through the net; wherein the service video judged as passing through the net is obtained by the method for judging whether the volleyball passes through the net or not.
Further, the calculating of the average ball speed of the volleyball over a period of time before and after it passes through the net includes:
for a serve video judged to have passed through the net, extracting, with a deep neural network model, the first frame image and the last frame image of the serve video in which the volleyball is located within the identification area;
and measuring the volleyball displacement distance between the first frame image and the last frame image, and calculating the average ball speed within a preset time period before and after the serve passes through the net according to the measurement result and the time difference between the first frame image and the last frame image.
Further, the measuring the volleyball displacement distance of the first frame image and the last frame image includes:
performing relation conversion on the image coordinates and world coordinates of the camera by using a monocular ranging method;
performing volleyball edge detection and pixel coordinate extraction;
and calculating the volleyball midpoint space Euclidean distance of the two frame images according to the volleyball midpoint actual coordinates of the first frame image and the last frame image.
Further, the measuring the volleyball displacement distance of the first frame image and the last frame image includes:
let a point in the real world be R(X_R, Y_R, Z_R), whose homogeneous coordinate matrix is [X_R Y_R Z_R 1]^T, and let r(x_r, y_r) be its corresponding image point, whose homogeneous pixel coordinate matrix is [u_r v_r 1]^T, where u_r = x_r/dx + u_0 and v_r = y_r/dy + v_0, (u_0, v_0) are the pixel coordinates of the image center point, and dx and dy denote the physical length represented by each pixel, i.e. the pixel size; the coordinates corresponding to points R and r are converted into each other through the camera parameter matrix K, i.e. s[u_r v_r 1]^T = K[a_1 a_2 a_3 d][X_R Y_R Z_R 1]^T, where s is a scale factor, K is a 3x3 matrix comprising the pixel size data and the center point coordinates, and a_1, a_2, a_3 and d are respectively the rotation matrix (columns) and the translation vector of the camera coordinate system relative to the world coordinate system;
constructing a system of equations from the feature points of the real scene and the positions of those feature points in the image to obtain the values of a_1, a_2, a_3 and d;
placing the volleyball centered at world coordinates P(X_P, Y_P, Z_P), constructing the plane Z_P, collecting the world coordinates of the volleyball edge points on that plane and the corresponding pixel coordinates in the generated image, constructing equations with the obtained camera parameter matrix K, and obtaining the coordinate values corresponding to the plane Z_P at a height of two meters above the ground; suppose that on the plane Z_R = 0 the image of the volleyball diameter has end-edge coordinates (x_0, y_0) and (x′_0, y′_0), the distance between the two points being l_0 = sqrt((x_0 - x′_0)^2 + (y_0 - y′_0)^2), and that on the plane Z_P the image of the volleyball diameter has end-edge coordinates (x_1, y_1) and (x′_1, y′_1), the distance between the two points being l_1 = sqrt((x_1 - x′_1)^2 + (y_1 - y′_1)^2); from the image similarity relationship, Z_P/(Z_P - Z_R) ≈ (l_1 + d_1)/(l_1 + d_1 - l_0 - d_0), where d_1 and d_0 are the translation-vector components at the heights corresponding to l_1 and l_0, respectively; after the volleyball diameter edge is identified in the image and the corresponding inter-pixel distance l_x is obtained, the real-world height coordinate Z_x of the volleyball is obtained;
Carrying out graying processing on the image: weighting the three RGB channel values of the input image and converting them into a single-channel gray-scale image with a value range of 0-255; for a pixel whose three RGB channel values are R, G, B, the gray value of the pixel generated after graying is Gray = αR + βG + γB, where α, β, γ are weighting coefficients and α + β + γ = 1;
performing smoothing processing on the image by using Gaussian smoothing filtering on the gray-scale image after graying;
carrying out convolution processing on the image subjected to Gaussian smoothing processing so as to strengthen the edge of the image;
traversing the gray value G of each pixel, comparing it with the two adjacent pixels along the θ direction, retaining it if it is the maximum gray value and otherwise setting it to 0, and extracting the remaining G values to obtain the volleyball edge coordinates;
selecting the four pixel points G_1, G_2, G_3, G_4 at θ_1 = 0°, θ_2 = 90°, θ_3 = 180°, θ_4 = 270°, calculating the distance l_13 between points G_1 and G_3 in the image and the distance l_24 between points G_2 and G_4 in the image, taking the arithmetic mean l_G = (l_13 + l_24)/2, and obtaining the actual height coordinates Z_t and Z_t′ of the volleyball in the two pictures;
after Z_t and Z_t′ are determined, taking the intersection of the line connecting image pixel points G_1 and G_3 and the line connecting G_2 and G_4 as the volleyball midpoint, finally obtaining the actual coordinates (X_t, Y_t, Z_t) and (X_t′, Y_t′, Z_t′) of the volleyball midpoint at times t and t′, and calculating the spatial Euclidean distance l_tt′ between the two points.
According to another aspect of the embodiments of the present application, there is provided an electronic device including a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor executing the program to implement the method described above.
One of the technical solutions provided in one aspect of the embodiments of the present application may include the following beneficial effects:
according to the method for judging whether the volleyball passes through the net, the volleyball serve video is processed with the convolutional neural network to judge whether the volleyball passes through the net; the judgment result is accurate, the processing speed is fast, the precision is high, the labor intensity is reduced, the labor cost is saved, and the requirements of practical application can be well met.
Additional features and advantages of the application will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the application. The objectives and other advantages of the application will be realized and attained by the structure particularly pointed out in the written description and claims thereof as well as the appended drawings.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments described in the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart showing a method for determining whether a volleyball is passing through a net according to one embodiment of the present application;
FIG. 2 is a schematic diagram of a convolutional neural network;
FIG. 3 is a flowchart of a method for determining whether a volleyball is passing through a net according to another embodiment of the present application;
FIG. 4 is a flowchart of a method for acquiring a serving speed according to another embodiment of the present application;
FIG. 5 is a flowchart for determining whether a volleyball is passing through a net in an embodiment of the present application;
fig. 6 is a flowchart for determining a movement speed of a volleyball before and after passing through a net from a video clip from a time t to a time t+1s in an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described with reference to the accompanying drawings and specific embodiments. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
It will be understood by those skilled in the art that all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs unless defined otherwise. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
As shown in fig. 1, one embodiment of the present application provides a method for determining whether a volleyball is passing through a net, including:
constructing a convolutional neural network;
training the convolutional neural network;
and processing volleyball service video by using the trained convolutional neural network, and judging whether volleyball service is passed or not.
In certain embodiments, the constructing a convolutional neural network comprises:
constructing a six-layer convolutional neural network;
as shown in fig. 2, the six layers include a first convolutional layer, a second pooling layer, a third convolutional layer, a fourth pooling layer, a fifth fully connected layer, and a sixth output layer; the output of the first convolutional layer is the feature data obtained after the three RGB channels of the original sample pass through several filter convolution operations, and this feature data is used as the input of the second pooling layer; the second pooling layer pools the output of the first convolutional layer with a max-pooling filter, and the pooled result is used as the input of the third convolutional layer. The input layer connected to the convolutional neural network is an RGB three-channel input layer.
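For illustration only, the six-layer structure described above can be sketched in PyTorch as follows; the kernel sizes, channel counts, input resolution (64x64) and the four output classes are assumptions made for the sketch, since the embodiment does not fix these values.

    import torch
    import torch.nn as nn

    class NetCrossingCNN(nn.Module):
        """Sketch of the six-layer network: conv, pool, conv, pool, fully connected, output."""
        def __init__(self, num_classes: int = 4):
            super().__init__()
            self.conv1 = nn.Conv2d(3, 16, kernel_size=3, padding=1)   # layer 1: convolution over the RGB input
            self.pool1 = nn.MaxPool2d(2)                               # layer 2: max pooling
            self.conv2 = nn.Conv2d(16, 32, kernel_size=3, padding=1)  # layer 3: convolution
            self.pool2 = nn.MaxPool2d(2)                               # layer 4: max pooling
            self.fc = nn.Linear(32 * 16 * 16, 64)                      # layer 5: fully connected (assumes 64x64 input)
            self.out = nn.Linear(64, num_classes)                      # layer 6: output layer

        def forward(self, x):
            x = self.pool1(torch.relu(self.conv1(x)))
            x = self.pool2(torch.relu(self.conv2(x)))
            x = torch.relu(self.fc(x.flatten(1)))
            return self.out(x)                                          # class scores; softmax is applied at the output stage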
In some embodiments, the number of layers of the convolutional neural network may be other numbers, not limited to six layers, and the specific number of layers may be set according to actual needs. Meanwhile, the filter size of the relevant convolution layer and the maximum pooling filter size of the pooling layer can be changed and improved according to actual conditions.
In some embodiments, the training the convolutional neural network comprises:
training the convolutional neural network using a training sample set;
the training sample set comprises a plurality of pictures, and the pictures are marked with a net-crossing identification label and a corresponding time-sequence label; the net-crossing identification label is added to the pictures by manual labeling; the pictures are taken from video captured by a high-speed camera.
In some embodiments, the training sample set is obtained by:
the volleyball service video is collected through fixed point position camera equipment (such as a high-speed camera), key frame images in the video are selected for marking, and a training sample set is formed by using the marked images.
The processing of the volleyball serve video using the trained convolutional neural network to judge whether the volleyball serve passes through the net comprises the following steps:
inputting the volleyball serve video shot by a high-speed camera into the trained convolutional neural network, recognizing the images of the volleyball serve video frame by frame with the trained convolutional neural network, and outputting a judgment result of whether the volleyball serve passes through the net.
Another embodiment of the present application further provides a method for obtaining a serving speed, where the method includes:
for the service video judged to be passing through the net, calculating the average ball speed of volleyball before and after passing through the net. The service video determined as passing through is obtained by the method for determining whether the volleyball service passes through the net according to the previous embodiment.
In some embodiments, the calculating the average ball speed for a period of time before and after the volleyball passes the net comprises:
for a serve video judged to have passed through the net, extracting, with a deep neural network model, the first frame image and the last frame image of the serve video in which the volleyball is located within the identification area;
and measuring the volleyball displacement distance between the first frame image and the last frame image, and calculating the average ball speed within a preset time period before and after the serve passes through the net according to the measurement result and the time difference between the first frame image and the last frame image.
In some embodiments, the measuring the volleyball displacement distance of the first frame image and the last frame image includes:
performing relation conversion on the image coordinates and world coordinates of the camera by using a monocular ranging method;
performing volleyball edge detection and pixel coordinate extraction;
and calculating the volleyball midpoint space Euclidean distance of the two frame images according to the volleyball midpoint actual coordinates of the first frame image and the last frame image.
In some embodiments, the method further comprises the step of initializing the convolutional neural network using an Xavier initialization mode prior to the training of the convolutional neural network.
Another embodiment of the present application provides a method for obtaining a serving speed, including: extracting a key frame image in a video judged as passing through the net by the judgment method of passing through the net by the volleyball of the first embodiment; extracting volleyball edges in the key frame image; determining the space position of volleyball according to the extracted edge; and calculating the service speed according to the space position distance between the key frames.
Another embodiment of the present application provides a method for determining whether a volleyball is passing through a net, as shown in fig. 3, including the following steps:
installing cameras with fixed positions and viewing angles at the corresponding points of the volleyball court, and collecting serve video images;
selecting a required image from the acquired video to construct a picture sample set, and carrying out corresponding classification and labeling;
constructing a convolutional neural network, and training the convolutional neural network by using the constructed image sample set;
and identifying the service video by using the trained convolutional neural network, and judging whether volleyball in the video passes through the network.
Another embodiment of the present application provides a method for determining whether a volleyball is passing through a net. The method utilizes images acquired by a high-speed camera at a fixed machine position, and utilizes a convolutional neural network and a real-time single-lens ranging method to judge whether volleyball is passing through the net or not and acquire the speed of the volleyball before and after passing through the net.
The method comprises the following steps:
acquiring a video of a ball serving process at a specific point position by a high-speed camera;
constructing a picture-library sample set from the captured video, which is used as the training sample set for training a convolutional neural network; the pictures in the training sample set are manually marked with a net-crossing identification label and a corresponding time-sequence label;
constructing a convolutional neural network model, and training the model with the sample set to obtain a model for judging whether a given serve passes through the net, referred to as the net-crossing judgment model;
and judging whether the serve in a given video passes through the net by using the net-crossing judgment model.
In one embodiment, the method for judging whether a volleyball is passing through a net or not specifically comprises the following steps:
step one, a fixed point location high-speed camera is erected above the midpoint of a volleyball court according to volleyball sports characteristics, so that the view finding range of the camera at least can comprise a field middle net and three rice lines of two sides of the field, and the corresponding end line area is covered. Considering that the peak of the service speed of a professional volleyball player can reach 130 km/h, i.e. more than 36 m/s, in order for the present solution to be effectively applied, a high-speed camera with a shooting speed of 100 frames/s or more should be used, which camera should be able to acquire clear motion videos and images at illuminance of 200 lux or more.
Step two: acquire the serve video using the erected fixed-position high-speed camera.
Step three: taking the net as the boundary, divide the three-meter line areas on the two sides of the court into two parts, and construct picture-library sample sets of the volleyball in the two areas (which may be named side A and side B respectively). Extract each frame image from the collected volleyball video, frame the net and the three-meter line areas on both sides, and label the images of these areas, the label being the specific position of the volleyball in the image. The specific labeling method is as follows: when the whole volleyball in the image is completely located within the three-meter line area on side A or side B, the picture is labeled as the volleyball being on side A or side B, i.e. as a class A or class B sample; pictures in which no volleyball appears in the area, or in which the volleyball has not completely entered the corresponding area, are labeled as class C samples; pictures in which the volleyball has completely entered the framed area but cannot be completely separated from the net, i.e. in which the projections of the volleyball and the net overlap in the image, are labeled as class D samples. At a shooting rate of 100 frames per second or more, many clear sample images covering these situations can be obtained.
Steps four to seven construct a multi-layer convolutional neural network and train it so as to accurately identify whether the volleyball serve passes over the net.
Step four: construct a serial multi-layer convolutional neural network; for example, a six-layer network includes a first convolutional layer, a second pooling layer, a third convolutional layer, a fourth pooling layer, a fifth fully connected layer and a sixth output layer. The output of the first convolutional layer is the feature data obtained after the three RGB channels of the original sample pass through several filter convolution operations; this feature data is used as the input of the second pooling layer, where a max-pooling filter pools the result of the first convolutional layer, and the pooled result is used as the input of the third convolutional layer; likewise, the output of the third convolutional layer is used as the input of the fourth layer, the output of the fourth pooling layer is used as the input of the fifth fully connected layer, and the output of the fifth fully connected layer is used as the input of the sixth output layer. In actual use, more convolutional and pooling layers can be added according to how the training of the neural network actually behaves, so as to achieve a better effect.
Step five: initialize the multi-layer convolutional neural network using the Xavier initialization method. Let m and n be the input and output dimensions of a single neuron; the initialization weights of that neuron in the neural network are then generated from a uniform distribution over the interval [-sqrt(6/(m+n)), sqrt(6/(m+n))].
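A minimal sketch of this initialization rule (assuming the standard Xavier/Glorot uniform range, which the interval above corresponds to):

    import numpy as np

    def xavier_uniform(m: int, n: int, rng=None):
        # m: input dimension of the neuron, n: output dimension
        if rng is None:
            rng = np.random.default_rng()
        limit = np.sqrt(6.0 / (m + n))
        # weights drawn uniformly from [-limit, +limit]
        return rng.uniform(-limit, limit, size=(n, m))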
Step six: input the sample set into the initialized neural network to obtain a first output result. The neural network has four output classes, and a softmax function is constructed at the output layer, y_k = exp(z_k) / Σ_j exp(z_j) (j = 1, ..., 4), where z_k is the input to the k-th output-layer neuron (the argument of the exponential), j is the index of the output class, and the class corresponding to the maximum y_k is taken as the output of the neural network.
Step seven: compare the output of the neural network with the actual classification of the samples and construct a loss function L(θ) for the neural network, where θ is the set of parameters of the current neural network and P_jt;θ is the proportion, among all samples, of the samples of a class that are correctly classified under those parameters. When all samples are correctly classified, L(θ) reaches its minimum value of 0. The neural network is trained and optimized with a gradient descent algorithm so that L(θ) becomes smaller and smaller until it approaches 0; when L(θ) reaches its minimum, training of the neural network is complete.
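Since the exact loss formula is not reproduced in the text, the following training sketch substitutes a standard cross-entropy loss with stochastic gradient descent, which likewise approaches its minimum as every sample is classified correctly with full confidence; the dataset of labeled A/B/C/D frames is a hypothetical placeholder.

    import torch
    import torch.nn as nn
    from torch.utils.data import DataLoader

    def train(model: nn.Module, dataset, epochs: int = 20, lr: float = 0.01):
        loader = DataLoader(dataset, batch_size=32, shuffle=True)
        criterion = nn.CrossEntropyLoss()                # softmax + negative log-likelihood (assumed stand-in loss)
        optimizer = torch.optim.SGD(model.parameters(), lr=lr)
        for _ in range(epochs):
            for images, labels in loader:                # images: N x 3 x H x W tensors, labels: class indices 0..3
                optimizer.zero_grad()
                loss = criterion(model(images), labels)
                loss.backward()
                optimizer.step()                         # gradient descent update
        return model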
Step eight: input a video segment to be recognized into the trained neural network and recognize the video frame by frame in time order; when the output for the frame at time t-1 in the video is class C and the output for the frame at time t is class A or class B, the volleyball is considered to have entered the corresponding three-meter zone at time t. Classes A, B and C are preset according to actual needs.
Specifically, during the flight of the volleyball the shortest distance from any point of the three-meter zone to the net is three meters, so at a movement speed of 36 m/s the volleyball needs at least about 0.08 seconds to travel from the three-meter line to the net. In the high-speed shooting environment, the images at times t+1 and t+2 can additionally be checked; if the outputs for the images at times t+1 and t+2 are the same as that at time t, the serve is considered to have reached the area.
Step nine: after time t is determined, according to the characteristics of volleyball movement, intercept the video segment from time t to time t+1 s and regard it as one serve; use the trained neural network to recognize the ten frame images at t+0.1 s, t+0.2 s, ..., up to t+1 s respectively (as sketched below); if one or more of the ten frames is judged to be on the side opposite to the class corresponding to time t (for example, time t outputs class A and a later frame outputs class B), the serve is judged to have passed over the net; otherwise, the volleyball is judged not to have passed over the net. The flow of the steps in some embodiments may be as shown in fig. 5.
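The frame-sequence decision of steps eight and nine can be sketched as follows; the class encoding, the 100 frames/s rate and the classify() helper (which wraps the trained network) are illustrative assumptions rather than part of the embodiment.

    A_SIDE, B_SIDE, C_NONE, D_OVERLAP = 0, 1, 2, 3   # assumed class encoding

    def entered_zone(prev_cls: int, cur_cls: int) -> bool:
        # step eight: transition from "no / partial ball" (class C) to a three-meter zone (class A or B)
        return prev_cls == C_NONE and cur_cls in (A_SIDE, B_SIDE)

    def serve_crossed_net(classify, frames, t_index: int, fps: int = 100) -> bool:
        # step nine: check the ten frames at t+0.1 s, t+0.2 s, ..., t+1 s after the entry time t
        entry_cls = classify(frames[t_index])
        other_side = B_SIDE if entry_cls == A_SIDE else A_SIDE
        for k in range(1, 11):
            idx = t_index + k * fps // 10
            if idx < len(frames) and classify(frames[idx]) == other_side:
                return True                              # ball recognized on the opposite side: serve crossed the net
        return False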
Step ten: repeat steps eight and nine to obtain a plurality of serve video image sequences judged to have successfully passed over the net.
The above steps one to ten provide an embodiment of a method for determining whether a volleyball is passing through a net.
In another embodiment, a method for acquiring a serving speed is provided, including:
constructing a deep neural network model, and for a serve video judged to have passed over the net, automatically extracting with the deep neural network model the first-frame and last-frame images in which the volleyball is inside the identification area;
based on the monocular (single-lens) ranging principle and positioning method, measuring the volleyball displacement distance between the first-frame and last-frame images, and calculating, from the time difference between the two frames and the displacement distance, the average ball speed over a period of time before and after the volleyball passes over the net; the period of time refers to the time from when the volleyball enters the three-meter zone on the serving side to when it leaves the three-meter zone on the receiving side, and may be a time interval between specific frames selected according to actual needs, for example 1 s or 2 s.
In another embodiment, a method for acquiring a serving speed is provided, including the steps of:
step eleven, the following uses these screened service video image sequences to measure the movement speed of volleyball before and after passing the net each time of service. In each image sequence, the image at the time t is calibrated, and the last frame image of the volleyball leaving the opposite-region three-second region is confirmed between the time t and the time t+1s. In order to improve the determination accuracy, a rule may be set, for example, when the output of three frames of images t '+1, t' +2, and t '+3 are all determined to be C-type, the image at time t' is selected as the last frame of image of the three second zone of the volleyball leaving the opposite zone. Reference is made to fig. 6.
Steps twelve and thirteen use the monocular ranging method to convert between the image coordinates of the camera and world coordinates.
Step twelve: let a point in the real world be R(X_R, Y_R, Z_R), whose homogeneous coordinate matrix is [X_R Y_R Z_R 1]^T, and let r(x_r, y_r) be its corresponding image point, whose homogeneous pixel coordinate matrix is [u_r v_r 1]^T, where u_r = x_r/dx + u_0 and v_r = y_r/dy + v_0, (u_0, v_0) are the pixel coordinates of the image center point, and dx and dy denote the physical length represented by each pixel, i.e. the pixel size. The coordinates corresponding to points R and r can be converted into each other through the camera parameter matrix K, i.e. s[u_r v_r 1]^T = K[a_1 a_2 a_3 d][X_R Y_R Z_R 1]^T, where s is a scale factor, K is a 3x3 matrix comprising the pixel size data and the center point coordinates, and a_1, a_2, a_3 and d are respectively the rotation matrix (columns) and the translation vector of the camera coordinate system relative to the world coordinate system. A system of equations is constructed from the feature points of the real scene (such as the four vertices of the framed range) and the positions of those feature points in the image to obtain the values of a_1, a_2, a_3 and d. In this method, the problem of the rotation matrix is generally not involved.
Step thirteen: use the court-surface plane Z_R = 0 as the template plane for calibration. For a standard volleyball court, the distance between the two attack (three-meter) lines is 6 meters, and the distance between the two sidelines is 9 meters. Frame the area where the four marking lines are located, calculate the pixel length and width of the picture occupied by this area, and construct a system of equations from the actual coordinates of the four vertices and the image coordinates of the corresponding points of the framed area to obtain the value of K.
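One way to realize this calibration step is sketched below: the four vertices formed by the attack lines and sidelines (court coordinates in meters on the ground plane Z_R = 0) are matched with their pixel positions, and a plane homography — which for the Z_R = 0 plane plays the role of K[a_1 a_2 d] — is estimated. The pixel coordinates here are made-up placeholders; in practice they come from the framed marking-line area.

    import numpy as np
    import cv2

    # Ground-plane (X, Y) coordinates in meters: attack lines 6 m apart, sidelines 9 m apart.
    court_pts = np.array([[0.0, 0.0], [6.0, 0.0], [6.0, 9.0], [0.0, 9.0]], dtype=np.float32)

    # Pixel coordinates of the same four vertices in the image (placeholder values).
    pixel_pts = np.array([[412, 655], [978, 642], [1105, 310], [295, 322]], dtype=np.float32)

    # Homography mapping ground-plane points to pixels on the plane Z_R = 0.
    H, _ = cv2.findHomography(court_pts, pixel_pts)

    def ground_to_pixel(X: float, Y: float):
        p = H @ np.array([X, Y, 1.0])
        return p[0] / p[2], p[1] / p[2]    # divide by the scale factor s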
Step fourteen: at a height of two meters above the ground, place the volleyball centered at world coordinates P(X_P, Y_P, Z_P), construct the plane Z_P, collect the world coordinates of the volleyball edge points on that plane and the corresponding pixel coordinates in the generated image, construct equations with the obtained camera parameter matrix K, and obtain the coordinate values corresponding to the plane Z_P at a height of two meters above the ground. Suppose that on the plane Z_R = 0 the image of the volleyball diameter has end-edge coordinates (x_0, y_0) and (x′_0, y′_0), the distance between the two points being l_0 = sqrt((x_0 - x′_0)^2 + (y_0 - y′_0)^2), and that on the plane Z_P the image of the volleyball diameter has end-edge coordinates (x_1, y_1) and (x′_1, y′_1), the distance between the two points being l_1 = sqrt((x_1 - x′_1)^2 + (y_1 - y′_1)^2). From the image similarity relationship, Z_P/(Z_P - Z_R) ≈ (l_1 + d_1)/(l_1 + d_1 - l_0 - d_0), where d_1 and d_0 are the translation-vector components at the heights corresponding to l_1 and l_0, respectively. Using this relationship, once the volleyball diameter edge is identified in the image and the corresponding inter-pixel distance l_x is obtained, the real-world height coordinate Z_x of the volleyball can also be obtained.
Steps fifteen to seventeen are the process of volleyball edge detection and pixel-coordinate extraction.
Step fifteen: gray the image by weighting the three RGB channel values of the input image and converting them into a single-channel gray-scale image with a value range of 0-255. For a pixel whose RGB channel values are R, G, B, Gray = αR + βG + γB, where α, β, γ are weighting coefficients and α + β + γ = 1. The coefficients can be adjusted according to the specific situation, the goal being that the outline of the volleyball in the grayed image is clearly separated from its surroundings.
Smooth the grayed image using Gaussian smoothing filtering. A (2n+1) x (2n+1) Gaussian smoothing filter matrix can be generated as H(i, j) = (1/(2πσ^2)) · exp(-((i - n - 1)^2 + (j - n - 1)^2)/(2σ^2)), i, j = 1, ..., 2n+1,
where σ^2 is the variance of the filter. For most images, a value of 0.5 ≤ σ^2 ≤ 2 gives good results.
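A sketch of the kernel generation (normalization of the kernel sum to 1 is added here as a practical assumption so that smoothing preserves overall brightness):

    import numpy as np

    def gaussian_kernel(n: int, sigma2: float) -> np.ndarray:
        # (2n+1) x (2n+1) Gaussian smoothing filter with variance sigma2 (0.5 <= sigma2 <= 2 suggested above)
        ax = np.arange(-n, n + 1)
        xx, yy = np.meshgrid(ax, ax)
        kernel = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma2)) / (2.0 * np.pi * sigma2)
        return kernel / kernel.sum()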
Step sixteen: convolve the Gaussian-smoothed image to strengthen the image edges. For example, a pair of 3x3 Sobel operators can be used as filters to convolve the individual pixels of the image; the typical horizontal operator S_x and vertical operator S_y have the matrix values S_x = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]] and S_y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]].
For the Gaussian-smoothed image, if A is the 3x3 matrix of gray values formed by a pixel and the eight pixels surrounding it, then after filtering the gray (gradient) value of that pixel is G = sqrt(G_x^2 + G_y^2) and the gradient direction is θ = arctan(G_y / G_x), where G_x = S_x · A and G_y = S_y · A. Based on the results, a better edge-strengthening effect can be obtained by adjusting the element values of the operators.
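A sketch of the gradient computation with these operators (using scipy's convolution; the function names are illustrative):

    import numpy as np
    from scipy.ndimage import convolve

    SX = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)   # horizontal operator S_x
    SY = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], dtype=float)   # vertical operator S_y

    def sobel_gradients(gray: np.ndarray):
        gx = convolve(gray.astype(float), SX)
        gy = convolve(gray.astype(float), SY)
        magnitude = np.hypot(gx, gy)        # G = sqrt(Gx^2 + Gy^2)
        direction = np.arctan2(gy, gx)      # gradient direction theta
        return magnitude, direction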
Step seventeen: traverse the gray value G of each pixel of the image processed in step sixteen, compare it with the two adjacent pixels along the θ direction, keep it if it is the maximum gray value and otherwise set it to 0, and extract the remaining G values to obtain the volleyball edge coordinates.
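A straightforward (unoptimized) sketch of this suppression rule, with the gradient direction quantized to four directions, is:

    import numpy as np

    def non_max_suppression(magnitude: np.ndarray, direction: np.ndarray) -> np.ndarray:
        # Keep a pixel only if its gradient value is the maximum among the two neighbours
        # along the (quantized) direction theta; otherwise set it to 0.
        h, w = magnitude.shape
        out = np.zeros_like(magnitude)
        angle = np.rad2deg(direction) % 180
        for i in range(1, h - 1):
            for j in range(1, w - 1):
                a = angle[i, j]
                if a < 22.5 or a >= 157.5:       # ~0 deg: compare left/right neighbours
                    n1, n2 = magnitude[i, j - 1], magnitude[i, j + 1]
                elif a < 67.5:                   # ~45 deg
                    n1, n2 = magnitude[i - 1, j + 1], magnitude[i + 1, j - 1]
                elif a < 112.5:                  # ~90 deg: compare up/down neighbours
                    n1, n2 = magnitude[i - 1, j], magnitude[i + 1, j]
                else:                            # ~135 deg
                    n1, n2 = magnitude[i - 1, j - 1], magnitude[i + 1, j + 1]
                if magnitude[i, j] >= n1 and magnitude[i, j] >= n2:
                    out[i, j] = magnitude[i, j]
        return out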
Step eighteen: process the pictures at times t and t′ from step eleven using the methods of steps fifteen to seventeen. From the processed results, select the four pixel points G_1, G_2, G_3, G_4 at θ_1 = 0°, θ_2 = 90°, θ_3 = 180°, θ_4 = 270°, calculate the distance l_13 between points G_1 and G_3 in the image and the distance l_24 between points G_2 and G_4 in the image, take the arithmetic mean l_G = (l_13 + l_24)/2, and obtain the actual height coordinates Z_t and Z_t′ of the volleyball in the two pictures by the method of step fourteen.
Step nineteen: after Z_t and Z_t′ are determined, take the intersection of the line connecting image pixel points G_1 and G_3 from step eighteen and the line connecting G_2 and G_4 as the volleyball midpoint, finally obtaining the actual coordinates (X_t, Y_t, Z_t) and (X_t′, Y_t′, Z_t′) of the volleyball midpoint at times t and t′. Calculate the spatial Euclidean distance l_tt′ between the two points; with the time difference t′ - t between the two frame images known, the average speed of the ball crossing the net after the serve is calculated as the ratio of the spatial Euclidean distance between the two points to the time difference between the two frames.
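The final computation of step nineteen reduces to a Euclidean distance divided by a time difference; a minimal sketch is shown below, where midpoint_from_edges relies on the fact that for edge points taken at 0°, 90°, 180° and 270° the two connecting chords are diameters, so their intersection coincides with the mean of the chord midpoints.

    import numpy as np

    def midpoint_from_edges(g1, g2, g3, g4):
        # Centre of the ball image as the intersection of the G1-G3 and G2-G4 chords,
        # approximated here by averaging the two chord midpoints.
        g1, g2, g3, g4 = (np.asarray(g, dtype=float) for g in (g1, g2, g3, g4))
        return ((g1 + g3) / 2.0 + (g2 + g4) / 2.0) / 2.0

    def average_speed(p_t, p_t2, t: float, t2: float) -> float:
        # p_t, p_t2: real-world midpoint coordinates (X, Y, Z) at times t and t'
        l_tt = np.linalg.norm(np.asarray(p_t2, dtype=float) - np.asarray(p_t, dtype=float))
        return l_tt / (t2 - t)              # average speed = spatial Euclidean distance / time difference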
Steps eleven to nineteen above are one embodiment of a method for obtaining a serving speed.
Another embodiment of the present application provides a method for obtaining a serving speed, as shown in fig. 4, including the following steps:
using a monocular ranging method to perform relation conversion on the image coordinates and world coordinates of the camera;
selecting two frames of images in the video clips judged to pass through the net successfully in the service, and extracting volleyball image coordinates through an edge detection algorithm;
and respectively converting space coordinates of volleyball in the two frames of images according to the extracted volleyball image coordinates, and further calculating the volleyball passing speed.
The acquisition method of this embodiment processes the key image frames in the net-crossing video with convolution-kernel processing and a monocular positioning method and calculates the movement speed of the volleyball in the video, with accurate calculation results.
Another embodiment of the present application provides an electronic device, including a memory, a processor, and a computer program stored in the memory and capable of running on the processor, where the processor executes the program to implement the above-mentioned method for determining whether a volleyball is passing through a net.
According to the method for judging whether the volleyball passes over the net, the volleyball serve video is processed with the convolutional neural network to judge whether the volleyball passes over the net; the judgment result is accurate, the processing speed is fast, the precision is high, the labor intensity is reduced, the labor cost is saved, and the requirements of practical application can be well met.
It should be noted that:
the algorithms and displays presented herein are not inherently related to any particular computer, virtual machine, or other apparatus. Various general purpose devices may also be used with the teachings herein. The required structure for the construction of such devices is apparent from the description above. In addition, the present application is not directed to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the teachings of the present application as described herein, and the above description of specific languages is provided for disclosure of preferred embodiments of the present application.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the application, various features of the application are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be construed as reflecting the intention that: i.e., the claimed application requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this application.
It should be understood that, although the steps in the flowcharts of the figures are shown in order as indicated by the arrows, these steps are not necessarily performed in order as indicated by the arrows. The steps are not strictly limited in order and may be performed in other orders, unless explicitly stated herein. Moreover, at least some of the steps in the flowcharts of the figures may include a plurality of sub-steps or stages that are not necessarily performed at the same time, but may be performed at different times, the order of their execution not necessarily being sequential, but may be performed in turn or alternately with other steps or at least a portion of the other steps or stages.
The foregoing examples merely represent embodiments of the present application, which are described in more detail and are not to be construed as limiting the scope of the present application. It should be noted that it would be apparent to those skilled in the art that various modifications and improvements could be made without departing from the spirit of the present application, which would be within the scope of the present application. Accordingly, the scope of protection of the present application shall be subject to the appended claims.

Claims (7)

1. A method for acquiring a serving speed, comprising:
calculating the average ball speed of volleyball before and after passing through the net for the service video judged to pass through the net; wherein the service video judged as passing through the net is obtained by a judging method of whether the volleyball is passed through the net or not;
the method for judging whether volleyball is in service or not comprises the following steps:
constructing a convolutional neural network;
training the convolutional neural network;
processing volleyball service video by using the trained convolutional neural network, and judging whether volleyball service is passed or not;
the calculating the average ball speed of volleyball in a period of time before and after passing through the net comprises the following steps:
for a serve video judged to have passed through the net, extracting, with a deep neural network model, the first frame image and the last frame image of the serve video in which the volleyball is located within the identification area;
measuring volleyball displacement distances of the first frame image and the last frame image, and calculating average ball speed in a preset time period before and after passing through the net for the ball serving according to a measurement result and a time difference between the first frame image and the last frame image;
the measuring the volleyball displacement distance of the first frame image and the last frame image includes:
let a point in the real world be R(X_R, Y_R, Z_R), whose homogeneous coordinate matrix is [X_R Y_R Z_R 1]^T, and let r(x_r, y_r) be its corresponding image point, whose homogeneous pixel coordinate matrix is [u_r v_r 1]^T, where u_r = x_r/dx + u_0 and v_r = y_r/dy + v_0, (u_0, v_0) are the pixel coordinates of the image center point, and dx and dy denote the physical length represented by each pixel, i.e. the pixel size; the coordinates corresponding to points R and r are converted into each other through the camera parameter matrix K, i.e. s[u_r v_r 1]^T = K[a_1 a_2 a_3 d][X_R Y_R Z_R 1]^T, where s is a scale factor, K is a 3x3 matrix comprising the pixel size data and the center point coordinates, and a_1, a_2, a_3 and d are respectively the rotation matrix (columns) and the translation vector of the camera coordinate system relative to the world coordinate system;
constructing a system of equations from the feature points of the real scene and the positions of those feature points in the image to obtain the values of a_1, a_2, a_3 and d;
placing the volleyball centered at world coordinates P(X_P, Y_P, Z_P), constructing the plane Z_P, collecting the world coordinates of the volleyball edge points on that plane and the corresponding pixel coordinates in the generated image, constructing equations with the obtained camera parameter matrix K, and obtaining the coordinate values corresponding to the plane Z_P at a height of two meters above the ground; suppose that on the plane Z_R = 0 the image of the volleyball diameter has end-edge coordinates (x_0, y_0) and (x′_0, y′_0), the distance between the two points being l_0 = sqrt((x_0 - x′_0)^2 + (y_0 - y′_0)^2), and that on the plane Z_P the image of the volleyball diameter has end-edge coordinates (x_1, y_1) and (x′_1, y′_1), the distance between the two points being l_1 = sqrt((x_1 - x′_1)^2 + (y_1 - y′_1)^2); from the image similarity relationship, Z_P/(Z_P - Z_R) ≈ (l_1 + d_1)/(l_1 + d_1 - l_0 - d_0), where d_1 and d_0 are the translation-vector components at the heights corresponding to l_1 and l_0, respectively; after the volleyball diameter edge is identified in the image and the corresponding inter-pixel distance l_x is obtained, the real-world height coordinate Z_x of the volleyball is obtained;
Carrying out graying processing on the image: weighting the three RGB channel values of the input image and converting them into a single-channel gray-scale image with a value range of 0-255; for a pixel whose three RGB channel values are R, G, B, the gray value of the pixel generated after graying is Gray = αR + βG + γB, where α, β, γ are weighting coefficients and α + β + γ = 1;
performing smoothing processing on the image by using Gaussian smoothing filtering on the gray-scale image after graying;
carrying out convolution processing on the image subjected to Gaussian smoothing processing so as to strengthen the edge of the image;
traversing the gray value G of each pixel, comparing it with the two adjacent pixels along the θ direction, retaining it if it is the maximum gray value and otherwise setting it to 0, and extracting the remaining G values to obtain the volleyball edge coordinates;
selecting the four pixel points G_1, G_2, G_3, G_4 at θ_1 = 0°, θ_2 = 90°, θ_3 = 180°, θ_4 = 270°, calculating the distance l_13 between points G_1 and G_3 in the image and the distance l_24 between points G_2 and G_4 in the image, taking the arithmetic mean l_G = (l_13 + l_24)/2, and obtaining the actual height coordinates Z_t and Z_t′ of the volleyball in the two pictures;
after Z_t and Z_t′ are determined, taking the intersection of the line connecting image pixel points G_1 and G_3 and the line connecting G_2 and G_4 as the volleyball midpoint, finally obtaining the actual coordinates (X_t, Y_t, Z_t) and (X_t′, Y_t′, Z_t′) of the volleyball midpoint at times t and t′, and calculating the spatial Euclidean distance l_tt′ between the two points.
2. The method of claim 1, wherein measuring the volleyball displacement distance of the first frame image and the last frame image comprises:
performing relation conversion on the image coordinates and world coordinates of the camera by using a monocular ranging method;
performing volleyball edge detection and pixel coordinate extraction;
and calculating the volleyball midpoint space Euclidean distance of the two frame images according to the volleyball midpoint actual coordinates of the first frame image and the last frame image.
3. The method of claim 1, wherein said constructing a convolutional neural network comprises:
constructing a six-layer convolutional neural network;
the six layers comprise a first convolution layer, a second pooling layer, a third convolution layer, a fourth pooling layer, a fifth full-connection layer and a sixth output layer; the output of the first layer of convolution layer is characteristic data obtained after the three channels of the original sample RGB pass through a plurality of filter convolution operations, the characteristic data is used as the input of a second layer of pooling layer, the second layer of pooling layer is used for pooling the output result of the first layer of convolution layer by using a maximum pooling filter, and the pooling processed result is used as the input of a third layer of convolution layer.
4. The method of claim 1, wherein the training the convolutional neural network comprises:
training the convolutional neural network using a training sample set;
the training sample set comprises a plurality of pictures, and the pictures are marked with a net-crossing identification label and a corresponding time-sequence label; the net-crossing identification label is added to the pictures by manual labeling; the pictures are taken from video captured by a high-speed camera.
5. The method of claim 1, wherein processing the volleyball service video using the trained convolutional neural network to determine whether volleyball service is passing the net comprises:
inputting the volleyball serve video shot by a high-speed camera into the trained convolutional neural network, recognizing the images of the volleyball serve video frame by frame with the trained convolutional neural network, and outputting a judgment result of whether the volleyball serve passes through the net.
6. The method of claim 1, further comprising the step of initializing the convolutional neural network using Xavier initialization prior to the training of the convolutional neural network.
7. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor executing the program to implement the method of any of claims 1-6.
CN202010754590.2A 2020-07-27 2020-07-27 Method for judging whether volleyball is out of net or not and method for acquiring service speed Active CN111932620B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010754590.2A CN111932620B (en) 2020-07-27 2020-07-27 Method for judging whether volleyball is out of net or not and method for acquiring service speed

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010754590.2A CN111932620B (en) 2020-07-27 2020-07-27 Method for judging whether volleyball is out of net or not and method for acquiring service speed

Publications (2)

Publication Number Publication Date
CN111932620A CN111932620A (en) 2020-11-13
CN111932620B true CN111932620B (en) 2024-01-12

Family

ID=73314374

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010754590.2A Active CN111932620B (en) 2020-07-27 2020-07-27 Method for judging whether volleyball is out of net or not and method for acquiring service speed

Country Status (1)

Country Link
CN (1) CN111932620B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113111225A (en) * 2021-03-28 2021-07-13 根尖体育科技(北京)有限公司 Automatic matching team forming method for basketball game

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103903019A (en) * 2014-04-11 2014-07-02 北京工业大学 Automatic generating method for multi-lane vehicle track space-time diagram
CN104504694A (en) * 2014-12-16 2015-04-08 成都体育学院 Method for acquiring three-dimensional information of moving ball
CN105488816A (en) * 2015-11-27 2016-04-13 中南大学 On-line detection device and method of mineral flotation froth flow velocity on the basis of three-dimensional visual information
CN105701460A (en) * 2016-01-07 2016-06-22 王跃明 Video-based basketball goal detection method and device
WO2019244153A1 (en) * 2018-06-21 2019-12-26 Baseline Vision Ltd. Device, system, and method of computer vision, object tracking, image analysis, and trajectory estimation

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20110007806A (en) * 2009-07-17 2011-01-25 삼성전자주식회사 Apparatus and method for detecting hand motion using a camera
CN104748750B (en) * 2013-12-28 2015-12-02 华中科技大学 A kind of model constrained under the Attitude estimation of Three dimensional Targets in-orbit method and system
US9697613B2 (en) * 2015-05-29 2017-07-04 Taylor Made Golf Company, Inc. Launch monitor
CN110622169A (en) * 2017-05-15 2019-12-27 渊慧科技有限公司 Neural network system for motion recognition in video

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103903019A (en) * 2014-04-11 2014-07-02 北京工业大学 Automatic generating method for multi-lane vehicle track space-time diagram
CN104504694A (en) * 2014-12-16 2015-04-08 成都体育学院 Method for acquiring three-dimensional information of moving ball
CN105488816A (en) * 2015-11-27 2016-04-13 中南大学 On-line detection device and method of mineral flotation froth flow velocity on the basis of three-dimensional visual information
CN105701460A (en) * 2016-01-07 2016-06-22 王跃明 Video-based basketball goal detection method and device
WO2019244153A1 (en) * 2018-06-21 2019-12-26 Baseline Vision Ltd. Device, system, and method of computer vision, object tracking, image analysis, and trajectory estimation

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Multi-camera calibration for recovering soccer match scenes; Chen Zhan; Jiang Yuming; Computer Engineering and Design; Vol. 28, No. 18; pp. 4539-4541, 4545 *

Also Published As

Publication number Publication date
CN111932620A (en) 2020-11-13

Similar Documents

Publication Publication Date Title
Chen et al. Sports camera calibration via synthetic data
Xue et al. Learning to calibrate straight lines for fisheye image rectification
US11582426B2 (en) Method and apparatus for sensing moving ball
US8005264B2 (en) Method of automatically detecting and tracking successive frames in a region of interesting by an electronic imaging device
CN113065558A (en) Lightweight small target detection method combined with attention mechanism
CN109460764A (en) A kind of satellite video ship monitoring method of combination brightness and improvement frame differential method
CN103902953B (en) A kind of screen detecting system and method
CN113435282B (en) Unmanned aerial vehicle image ear recognition method based on deep learning
CN111784658B (en) Quality analysis method and system for face image
CN111161295A (en) Background stripping method for dish image
CN116309685A (en) Multi-camera collaborative swimming movement speed measurement method and system based on video stitching
CN111932620B (en) Method for judging whether volleyball is out of net or not and method for acquiring service speed
CN109684919B (en) Badminton service violation distinguishing method based on machine vision
CN111695373B (en) Zebra stripes positioning method, system, medium and equipment
CN103533332B (en) A kind of 2D video turns the image processing method of 3D video
CN115049689A (en) Table tennis identification method based on contour detection technology
CN112884795A (en) Power transmission line inspection foreground and background segmentation method based on multi-feature significance fusion
CN110910489B (en) Monocular vision-based intelligent court sports information acquisition system and method
CN116977316A (en) Full-field detection and quantitative evaluation method for damage defects of complex-shape component
JP6456244B2 (en) Camera calibration method and apparatus
CN116385496A (en) Swimming movement real-time speed measurement method and system based on image processing
CN116385477A (en) Tower image registration method based on image segmentation
CN110415256B (en) Rapid multi-target identification method and system based on vision
Wang et al. Deep learning-based human activity analysis for aerial images
CN109410272A (en) A kind of identification of transformer nut and positioning device and method

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant