CN109045664B

CN109045664B - Diving scoring method, server and system based on deep learning

Info

Publication number: CN109045664B
Application number: CN201811030493.8A
Authority: CN
Inventors: 李永祺; 杜存宵; 林俊宇; 甘甜; 宋雪萌; 聂礼强
Original assignee: Shandong University
Current assignee: Shandong University
Priority date: 2018-09-05
Filing date: 2018-09-05
Publication date: 2019-10-01
Anticipated expiration: 2038-09-05
Also published as: CN109045664A

Abstract

The invention discloses a diving scoring method, server and system based on deep learning. The method comprises: constructing a diving scoring model, where the model includes a C3DNet model, a PoseNet model and an SVR (support vector regression) model, the C3DNet model and the PoseNet model are connected in parallel and then connected in series with the SVR model, and the C3DNet model is implemented on the Intel-optimized Caffe framework; training the diving scoring model with a known diving video data set and the corresponding set of diving scores; and inputting a diving video into the trained diving scoring model to output a diving score. The method avoids human interference in the diving score and improves its accuracy.

Description

Diving scoring method, server and system based on deep learning

Technical field

The invention belongs to the field of information processing, and in particular relates to a diving scoring method, server and system based on deep learning.

Background art

Diving is an aquatic acrobatic sport in which an athlete takes off from an apparatus at a certain height, completes a series of movements in the air, and finishes by entering the water. Diving is generally divided into two categories, competitive and non-competitive. Competitive diving, however, is often marked by unfair results and disputes, and some of these are so obvious after the event that they provoke heated public debate. For example, in the men's 3 m springboard event held on the morning of August 17, 2008 (Beijing time), the Chinese athlete Peng Bo held a large lead after the first four rounds going into his final dive, but the judges deliberately ruled that the Chinese team had made an error, and the gold medal was lost. As another example, at the National Games of '10, a judge disclosed under his or her real name that all the diving gold medals had been decided in advance and that the judge and the judging panel had been manipulated.

In addition, training judges is time-consuming and costly. An excellent judge typically takes decades to train and must keep learning throughout that time. A single competition generally needs 5 to 7 judges, and the cost of hiring them is often a significant item in the competition budget.

The scoring rules for competitive diving are as follows. A panel has either 7 or 5 judges, plus 1 chief referee; the Olympic Games, World Championships and World Cup must use 7 judges. Each dive is scored out of 10. When scoring, the highest and lowest scores are discarded, and the sum of the remaining scores is multiplied by the degree of difficulty to give the score for the dive. Judges evaluate the athlete's approach (walking on the springboard or running on the platform), take-off, flight and entry into the water. The score therefore consists of a difficulty component and a judges' component. The difficulty component is entirely objective, whereas the judges' component rests entirely on the judges' subjective marks. Although detailed rules govern judging, the marks still depend largely on each judge's personal experience. That experience can be unreliable, and whether the resulting scores are convincing remains open to question.

In summary, part of the score in competitive diving currently comes from the judges' subjective marks, so the results may be inaccurate and unfair to athletes. An accurate and efficient diving scoring method and system is therefore urgently needed.
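The scoring rule described above can be sketched in a few lines. This is a minimal illustration that follows the rule exactly as stated in this text (discard one highest and one lowest mark, sum the rest, multiply by the degree of difficulty); the function name and the sample marks are hypothetical, and an actual competition rulebook may trim more marks than this.

```python
def dive_score(judge_scores, difficulty):
    """Score of one dive under the rule described in the text.

    Discards one highest and one lowest judge mark, sums the rest,
    and multiplies the sum by the degree of difficulty.
    """
    if len(judge_scores) not in (5, 7):
        raise ValueError("expected marks from 5 or 7 judges")
    if any(not 0 <= s <= 10 for s in judge_scores):
        raise ValueError("each mark must be between 0 and 10")
    kept = sorted(judge_scores)[1:-1]  # drop the lowest and the highest mark
    return sum(kept) * difficulty


# Hypothetical marks from a 7-judge panel and a hypothetical difficulty of 3.0.
seven_judges = dive_score([8.0, 8.5, 8.5, 9.0, 7.5, 8.0, 8.5], difficulty=3.0)
# Hypothetical marks from a 5-judge panel.
five_judges = dive_score([7.0, 8.0, 9.0, 8.0, 7.0], difficulty=2.0)
print(seven_judges, five_judges)
```

The difficulty factor is a fixed input here; only the list of judge marks is the subjective part that the invention aims to replace with a model output.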

Summary of the invention

To remedy the deficiencies of the prior art, the first object of the present invention is to provide a diving scoring method based on deep learning, which introduces deep learning into the field of competition scoring for the first time and can improve the accuracy of diving scores.

A diving scoring method based on deep learning according to the invention comprises:

constructing a diving scoring model, where the diving scoring model includes a C3DNet model, a PoseNet model and an SVR (support vector regression) model, the C3DNet model and the PoseNet model are connected in parallel and then connected in series with the SVR model, and the C3DNet model is implemented on the Intel-optimized Caffe framework;

training the diving scoring model with a known diving video data set and the corresponding set of diving scores;

inputting a diving video into the trained diving scoring model and outputting a diving score.

Further, the method also includes:

connecting the C3DNet model and the PoseNet model in parallel and then in series with an SVM classifier to form an action analysis model;

training the action analysis model with a known diving video data set and the corresponding set of action analysis result labels;

inputting a diving video into the trained action analysis model and outputting comments on the dive.

Further, the method also includes:

training the PoseNet model with a known diving video data set and the corresponding set of torso recognition results;

inputting a diving video into the trained PoseNet model and outputting a torso recognition result.

The second object of the present invention is to provide a diving scoring server based on deep learning.

A diving scoring server based on deep learning according to the invention comprises:

a diving scoring model construction module, configured to construct a diving scoring model, where the diving scoring model includes a C3DNet model, a PoseNet model and an SVR (support vector regression) model, the C3DNet model and the PoseNet model are connected in parallel and then connected in series with the SVR model, and the C3DNet model is implemented on the Intel-optimized Caffe framework;

a diving scoring model training module, configured to train the diving scoring model with a known diving video data set and the corresponding set of diving scores;

a diving score output module, configured to input a diving video into the trained diving scoring model and output a diving score.

Further, the server also includes:

an action analysis model construction module, configured to connect the C3DNet model and the PoseNet model in parallel and then in series with an SVM classifier to form an action analysis model;

an action analysis model training module, configured to train the action analysis model with a known diving video data set and the corresponding set of action analysis result labels;

a dive comment output module, configured to input a diving video into the trained action analysis model and output comments on the dive.

Further, the server also includes:

a PoseNet model training module, configured to train the PoseNet model with a known diving video data set and the corresponding set of torso recognition results;

a torso recognition result output module, configured to input a diving video into the trained PoseNet model and output a torso recognition result.

The third object of the present invention is to provide a diving scoring system based on deep learning.

A diving scoring system based on deep learning according to the invention includes the deep-learning-based diving scoring server described above.

Further, the diving scoring system based on deep learning also includes a client, which is connected to the deep-learning-based diving scoring server and displays the output of that server.

Compared with prior art, the beneficial effects of the present invention are:

(1) The C3DNet model of the invention is implemented on the Intel-optimized Caffe framework. A new layer that reads video data directly and a new 3-D convolution layer are compiled into the stock Intel Caffe. The video therefore does not have to be split into frames and processed frame by frame with conventional convolution layers; the four-dimensional array (blob) is processed directly with 3-D convolution, which increases data processing speed.

(2) The invention adopts the idea of ensemble learning and characterizes a dive with two networks: the C3D network captures temporal features and the PoseNet network captures single-frame human body information. The two are finally fused by SVR to give the final score, which improves the accuracy of the final diving score.

(3) The invention introduces multi-task learning: all dive types share the shallow-layer parameters, and the networks are trained separately at the final SVR stage. This is a parameter-sharing multi-task learning method and effectively improves network performance.

Brief description of the drawings

The accompanying drawings, which form a part of this application, are provided for further understanding of the application. The illustrative embodiments of the application and their descriptions are used to explain the application and do not unduly limit it.

Fig. 1 is a flow chart of an embodiment of the deep-learning-based diving scoring method of the invention.

Fig. 2 is a structural diagram of an embodiment of the deep-learning-based diving scoring server of the invention.

Detailed description of the embodiments

It should be noted that the following detailed description is illustrative and is intended to provide further explanation of the application. Unless otherwise indicated, all technical and scientific terms used herein have the meanings commonly understood by a person of ordinary skill in the technical field of this application.

It should also be noted that the terminology used here serves only to describe specific embodiments and is not intended to limit the illustrative embodiments of the application. As used herein, unless the context clearly indicates otherwise, the singular forms are intended to include the plural forms as well. In addition, the terms "comprising" and/or "including", when used in this specification, indicate the presence of features, steps, operations, devices, components and/or combinations thereof.

Explanation of technical terms:

(1) Deep neural network: an artificial neural network with multiple layers of neurons, having several hidden layers between the input layer and the output layer. Data can be passed between the neurons of each layer, and the network adjusts its own weights dynamically according to its objective function.

(2) C3DNet model: a 3-D convolutional network for video feature extraction.

(3) PoseNet model: PoseNet is a visual localization model developed at the University of Cambridge that can determine your pose from a single color image. In a large urban environment it needs only about 5 ms to obtain the pose, with higher accuracy than GPS. Unlike GPS, it can also determine your orientation and works indoors. PoseNet is trained on the Cambridge Landmarks data set, a large-scale urban localization data set containing more than 12,000 images of 6 scenes around Cambridge, each labeled with a 6-degree-of-freedom camera pose.

(4) SVR (support vector regression) model: SVR is the SVM algorithm applied to regression, that is, the method used when the labels are continuous values.

(5) SVM: Support Vector Machine, a common discriminative method. It is a supervised learning model in the field of machine learning, typically used for pattern recognition, classification and regression analysis.

(6) Caffe: a deep learning framework developed by BVLC, built on C++ and CUDA, with Python and Matlab interfaces. The framework is well suited to convolutional neural networks (CNN), recurrent neural networks (RNN) and multilayer perceptrons.
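For reference, the regression variant named in term (4) is usually written in its standard ε-insensitive form, shown below. The symbols (w, b, φ, C, ε, ξ) are textbook notation and do not come from this document, which does not state the kernel or hyperparameters that were used.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi,\,\xi^{*}} \quad & \tfrac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{n} \left( \xi_i + \xi_i^{*} \right) \\
\text{s.t.} \quad & y_i - \left( w^{\top} \phi(x_i) + b \right) \le \varepsilon + \xi_i, \\
& \left( w^{\top} \phi(x_i) + b \right) - y_i \le \varepsilon + \xi_i^{*}, \\
& \xi_i \ge 0,\ \xi_i^{*} \ge 0, \quad i = 1, \dots, n.
\end{aligned}
```

Here x_i would be the feature vector of a diving video and y_i its true score; errors smaller than ε are not penalized.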

Deep learning has made steady progress in recent years and has even exceeded average human performance in the ImageNet computer vision competition. It also gives the current state-of-the-art performance on video processing. Combining deep learning with diving scoring, however, raises the following problems:

(1) Data set size: diving scoring is a very new field, and no open labeled data set is available on the public internet, so the data must be collected in-house. Limited by time and funding, only a little over 2,000 well-labeled samples are available. This is too few for deep learning: training directly leads to severe overfitting, gradients fail to backpropagate properly, and the network cannot be trained normally.

(2) Difficulty of characterizing continuous diving features: a dive is an extremely complex continuous movement performed at high speed. CNN (Convolutional Neural Network), VGG (Visual Geometry Group, a deep convolutional network for large-scale image recognition) and even ResNet (deep residual network) all perform poorly when capturing these features.

(3) Many dive types with correspondingly varied scoring standards: different dive types have different scoring standards, and even the same movement may receive different scores. The data are already scarce, and training one model per dive type would make them sparser still.

These problems lead to inaccurate diving scores and, in turn, to unfair treatment of athletes.

To solve these problems, this application introduces deep learning into the field of competition scoring for the first time, which can improve the accuracy of diving scores.

As shown in Fig. 1, a diving scoring method based on deep learning according to this embodiment comprises:

Step 1: constructing a diving scoring model, where the diving scoring model includes a C3DNet model, a PoseNet model and an SVR (support vector regression) model, the C3DNet model and the PoseNet model are connected in parallel and then connected in series with the SVR model, and the C3DNet model is implemented on the Intel-optimized Caffe framework;

Step 2: training the diving scoring model with a known diving video data set and the corresponding set of diving scores;

Step 3: inputting a diving video into the trained diving scoring model and outputting a diving score.

Because the PoseNet model can accurately capture human posture features, these features compensate well for information that the C3DNet model cannot capture.

The C3DNet model and the PoseNet model are fused with an SVR model. Both models are trained directly on the ground truth and output a numerical score. The role of the SVR model is to implement the ensemble: its input is the concatenation of the feature layers of the two models, and its training targets are the true scores.
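The fusion step can be illustrated with a small, dependency-free sketch: the feature vectors of the two branches are concatenated, and a regressor is fitted on the concatenation with the true scores as targets. The regressor below is a simplified stand-in, a linear model trained by subgradient descent on the ε-insensitive loss with an L2 penalty; it is not the kernel SVR solver a library would provide. All function names, feature values and hyperparameters are hypothetical toy choices, not values from this document.

```python
def fuse(c3d_features, pose_features):
    """Concatenate the feature vectors produced by the two branches."""
    return list(c3d_features) + list(pose_features)


def predict(w, b, x):
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def objective(w, b, X, y, eps, lam):
    # Mean epsilon-insensitive loss plus an L2 penalty on the weights.
    loss = sum(max(0.0, abs(predict(w, b, x) - t) - eps) for x, t in zip(X, y))
    return loss / len(X) + 0.5 * lam * sum(wi * wi for wi in w)


def fit_linear_svr(X, y, eps=0.1, lam=1e-3, lr=0.05, steps=2000):
    """Subgradient descent; returns the best (weights, bias) seen so far."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    best = (objective(w, b, X, y, eps, lam), list(w), b)
    for _ in range(steps):
        grad_w, grad_b = [lam * wi for wi in w], 0.0
        for x, t in zip(X, y):
            r = predict(w, b, x) - t
            if abs(r) > eps:  # only points outside the epsilon tube contribute
                s = 1.0 if r > 0 else -1.0
                for j in range(d):
                    grad_w[j] += s * x[j] / n
                grad_b += s / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
        current = objective(w, b, X, y, eps, lam)
        if current < best[0]:
            best = (current, list(w), b)
    return best[1], best[2]


# Toy stand-ins for the feature layers of the two branches and the true scores.
c3d_feats = [[1.0], [0.0], [1.0], [2.0]]
pose_feats = [[0.0], [1.0], [1.0], [1.0]]
true_scores = [2.0, 3.0, 5.0, 7.0]

X = [fuse(c, p) for c, p in zip(c3d_feats, pose_feats)]
w, b = fit_linear_svr(X, true_scores)
mae = sum(abs(predict(w, b, x) - t) for x, t in zip(X, true_scores)) / len(X)
print(w, b, mae)
```

In practice a library implementation such as scikit-learn's `SVR` would normally replace `fit_linear_svr`; the point of the sketch is only the data flow, with concatenated branch features in and a single score out.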

Specifically, the workflow of the C3DNet model is as follows:

1) After being read, the video becomes a 4-dimensional array whose dimensions are channel, frame number, height and width.

2) A 3-D convolution is applied to the array, giving a 2-dimensional array.

3) The 2-dimensional array from step 2) is passed through a bidirectional LSTM to obtain a vector.

4) The vector from step 3) is passed through a fully connected layer to obtain a single value, which is the diving score.
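To make the array shapes in steps 1) and 2) concrete, the sketch below computes how a (channel, frame, height, width) blob changes under one 3-D convolution, using the standard convolution size formula. The clip size, kernel size and channel count are hypothetical, and the per-frame flattening used to reach a 2-D array is an assumption, since the text does not say how the 2-D array is formed.

```python
def conv3d_output_shape(in_shape, out_channels, kernel,
                        stride=(1, 1, 1), padding=(0, 0, 0)):
    """Return the (channels, frames, height, width) shape after one 3-D convolution."""
    _, frames, height, width = in_shape
    out_dims = []
    for size, k, s, p in zip((frames, height, width), kernel, stride, padding):
        out = (size + 2 * p - k) // s + 1  # standard convolution size formula
        if out < 1:
            raise ValueError("kernel is larger than the padded input")
        out_dims.append(out)
    return (out_channels, *out_dims)


def to_sequence_shape(blob_shape):
    """Assumed flattening: one row per frame, ready for a recurrent layer."""
    channels, frames, height, width = blob_shape
    return (frames, channels * height * width)


# Hypothetical clip: 3 color channels, 16 frames, 112 x 112 pixels.
video_blob = (3, 16, 112, 112)
features = conv3d_output_shape(video_blob, out_channels=64,
                               kernel=(3, 3, 3), padding=(1, 1, 1))
sequence = to_sequence_shape(features)
print(features, sequence)
```

Because the kernel spans the frame axis as well as the two spatial axes, a single layer sees motion across neighboring frames, which is why the video does not need to be processed one frame at a time.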

The workflow of the PoseNet model is as follows:

1) The video is processed with the open-source OpenPose tool (the tool is freely available online; its implementation is very complex and it is simply applied here as is), which yields the coordinates of the human key points in each frame together with the processed video.

2) The key point coordinates of each frame are fed into a bidirectional GRU network to obtain a vector.

3) The vector is passed through a fully connected layer to obtain a single value, which is the diving score.
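Steps 2) and 3) can be sketched without any deep learning library. The code below flattens per-frame key points into a sequence, runs a minimal GRU cell forward and backward over it, concatenates the two final states and applies a linear layer. The weights are random and untrained, biases are omitted, and the hidden size, key point values and all names are hypothetical; the sketch shows the data flow only and its output is not a meaningful score.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


class GRUCell:
    """Minimal GRU cell without bias terms."""

    def __init__(self, input_size, hidden_size, rng):
        def mat(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
        self.hidden_size = hidden_size
        # One input matrix and one recurrent matrix per gate:
        # update (z), reset (r) and candidate state (n).
        self.W = {g: mat(hidden_size, input_size) for g in "zrn"}
        self.U = {g: mat(hidden_size, hidden_size) for g in "zrn"}

    def step(self, x, h):
        z = [sigmoid(a + b) for a, b in zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        rh = [ri * hi for ri, hi in zip(r, h)]
        n = [math.tanh(a + b) for a, b in zip(matvec(self.W["n"], x), matvec(self.U["n"], rh))]
        return [(1 - zi) * hi + zi * ni for zi, hi, ni in zip(z, h, n)]

    def run(self, sequence):
        h = [0.0] * self.hidden_size
        for x in sequence:
            h = self.step(x, h)
        return h


def flatten_keypoints(frame):
    """Turn [(x1, y1), (x2, y2), ...] into [x1, y1, x2, y2, ...]."""
    return [coord for point in frame for coord in point]


def pose_branch(frames, hidden_size=4, seed=0):
    rng = random.Random(seed)
    sequence = [flatten_keypoints(f) for f in frames]
    input_size = len(sequence[0])
    forward = GRUCell(input_size, hidden_size, rng)
    backward = GRUCell(input_size, hidden_size, rng)
    # Bidirectional: read the sequence in both directions and concatenate.
    vector = forward.run(sequence) + backward.run(sequence[::-1])
    fc_weights = [rng.uniform(-0.1, 0.1) for _ in vector]
    score = sum(w * v for w, v in zip(fc_weights, vector))  # fully connected layer
    return vector, score


# Hypothetical key points: 3 frames, 2 key points each, normalized coordinates.
frames = [[(0.1, 0.2), (0.3, 0.4)],
          [(0.2, 0.3), (0.4, 0.5)],
          [(0.3, 0.4), (0.5, 0.6)]]
vector, score = pose_branch(frames)
print(len(vector), score)
```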

In another embodiment, the method further includes:

The workflow of the action analysis model is as follows:

4) The vector obtained in step 3) is passed through a sigmoid activation function, and each value in the vector becomes 0 or 1. A value of 1 means that the corresponding comment is selected.

The action analysis model is used to identify which parts of a dive are performed to standard and which are not.

The key to implementing the action analysis model is the data. The label set for action analysis results consists of 6 common labels, for example "take-off timing too early" and "splash too large". Each video has 4 labels, so action analysis becomes a classification problem. Features are extracted with the same C3DNet model as in the FocNet model, and the final SVR model is replaced by an SVM classifier to form the action analysis model.
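The sigmoid selection step described above can be illustrated as follows. The text says only that each value becomes 0 or 1 after the sigmoid; the 0.5 cut-off used here is an assumption. Two of the six labels are the examples given in the text, and the other four are placeholders because the text does not list them.

```python
import math

COMMENT_LABELS = [
    "take-off timing too early",  # example label given in the text
    "splash too large",           # example label given in the text
    "placeholder comment 3",      # hypothetical
    "placeholder comment 4",      # hypothetical
    "placeholder comment 5",      # hypothetical
    "placeholder comment 6",      # hypothetical
]


def select_comments(raw_outputs, labels=COMMENT_LABELS, threshold=0.5):
    """Apply a sigmoid to each raw output and keep the labels that are switched on."""
    if len(raw_outputs) != len(labels):
        raise ValueError("one raw output is required per label")
    probabilities = [1.0 / (1.0 + math.exp(-v)) for v in raw_outputs]
    flags = [1 if p >= threshold else 0 for p in probabilities]  # assumed cut-off
    chosen = [label for label, flag in zip(labels, flags) if flag == 1]
    return flags, chosen


# Hypothetical raw network outputs for one video.
flags, chosen = select_comments([2.0, -1.0, 0.3, -3.0, -0.2, -4.0])
print(flags, chosen)
```

Each label is decided independently, so one video can receive several comments at once.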

In another embodiment, the method further includes:

The torso recognition result is used to identify the torso and key points of the person.

The torso recognition model provides part of the features for the scoring model, and its feature visualization results can also be displayed on the platform. PoseNet is obtained by fine-tuning the OpenPose model open-sourced by Carnegie Mellon University. Because OpenPose already achieves high accuracy, labeling only a small number of diving videos is enough to obtain very good results.

For example, the model scores 0.5 on this metric, which can be read as follows: over a large sample, the expected gap between the score given by the model and the true value does not exceed 0.5. The correlation coefficient, which measures whether the ranking is accurate, is 0.87. Competition videos from the 2012 London Olympic Games were selected as the test set, and the Olympic judges were evaluated with the same metrics. The Olympic Games use seven judges, and each was tested separately. In mean absolute error the method of the invention outperformed most individual judges and ranked second; in correlation coefficient it ranked third. The method can therefore be considered to have reached expert level.
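The two metrics mentioned above can be computed as in the sketch below. The text does not say which correlation coefficient was used; because it is described as measuring ranking accuracy, a Spearman rank correlation is assumed here. The scores are made-up toy values, not results from this document.

```python
import math


def mean_absolute_error(predicted, actual):
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def spearman(xs, ys):
    """Rank correlation: Pearson correlation of the two rank vectors."""
    return pearson(ranks(xs), ranks(ys))


# Toy example: model scores versus true scores for four dives.
model_scores = [80.0, 75.5, 90.0, 60.0]
true_scores = [81.0, 75.0, 89.0, 62.0]
mae = mean_absolute_error(model_scores, true_scores)
rho = spearman(model_scores, true_scores)
print(mae, rho)
```

Applying the same two functions to each judge's marks against the final official scores would give the judge-by-judge comparison described in the text.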

As shown in Fig. 2, a diving scoring server based on deep learning according to the invention comprises:

(1) a diving scoring model construction module, configured to construct a diving scoring model, where the diving scoring model includes a C3DNet model, a PoseNet model and an SVR (support vector regression) model, the C3DNet model and the PoseNet model are connected in parallel and then connected in series with the SVR model, and the C3DNet model is implemented on the Intel-optimized Caffe framework;

(2) a diving scoring model training module, configured to train the diving scoring model with a known diving video data set and the corresponding set of diving scores;

(3) a diving score output module, configured to input a diving video into the trained diving scoring model and output a diving score.

In another embodiment, the server further includes:

The present invention also provides a diving scoring system based on deep learning.

A diving scoring system based on deep learning according to the invention includes the deep-learning-based diving scoring server shown in Fig. 2.

In a specific implementation, the diving scoring system based on deep learning also includes a client, which is connected to the deep-learning-based diving scoring server and displays the output of that server.

The C3DNet model of the invention is implemented on the Intel-optimized Caffe framework. A new layer that reads video data directly and a new 3-D convolution layer are compiled into the stock Intel Caffe. The video therefore does not have to be split into frames and processed frame by frame with conventional convolution layers; the four-dimensional array (blob) is processed directly with 3-D convolution, which increases data processing speed.

The invention adopts the idea of ensemble learning and characterizes a dive with two networks: the C3D network captures temporal features and the PoseNet network captures single-frame human body information. The two are finally fused by SVR to give the final score, which improves the accuracy of the final diving score.

The invention introduces multi-task learning: all dive types share the shallow-layer parameters, and the networks are trained separately at the final SVR stage. This is a parameter-sharing multi-task learning method and effectively improves network performance.

Those skilled in the art will understand that embodiments of the invention may be provided as a method, a system or a computer program product. Accordingly, the invention may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware. Moreover, the invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage and optical storage) that contain computer-usable program code.

The invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems) and computer program products according to embodiments of the invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks, can be implemented by computer program instructions. These instructions may be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor or other programmable data processing device to produce a machine, so that the instructions executed by the processor produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, so that the instructions stored in the memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing. The instructions executed on the computer or other programmable device thereby provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

Those of ordinary skill in the art will appreciate that all or part of the processes in the methods of the above embodiments can be carried out by a computer program instructing the relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, may include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disc, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM) or the like.

Although specific embodiments of the invention have been described above with reference to the accompanying drawings, they do not limit the scope of protection of the invention. Those skilled in the art should understand that modifications or variations that can be made on the basis of the technical solution of the invention without creative effort still fall within the scope of protection of the invention.

Claims

1. A diving scoring method based on deep learning, characterized by comprising:

constructing a diving scoring model, where the diving scoring model includes a C3DNet model, a PoseNet model and an SVR (support vector regression) model, the C3DNet model and the PoseNet model are connected in parallel and then connected in series with the SVR model, and the C3DNet model is implemented on the Intel-optimized Caffe framework;

2. The diving scoring method based on deep learning according to claim 1, characterized in that the method further comprises:

connecting the C3DNet model and the PoseNet model in parallel and then in series with an SVM classifier to form an action analysis model;

3. The diving scoring method based on deep learning according to claim 1, characterized in that the method further comprises:

4. A diving scoring server based on deep learning, characterized by comprising:

a diving scoring model training module, configured to train the diving scoring model with a known diving video data set and the corresponding set of diving scores;

a diving score output module, configured to input a diving video into the trained diving scoring model and output a diving score.

5. The diving scoring server based on deep learning according to claim 4, characterized in that the server further comprises:

an action analysis model construction module, configured to connect the C3DNet model and the PoseNet model in parallel and then in series with an SVM classifier to form an action analysis model;

an action analysis model training module, configured to train the action analysis model with a known diving video data set and the corresponding set of action analysis result labels;

a dive comment output module, configured to input a diving video into the trained action analysis model and output comments on the dive.

6. The diving scoring server based on deep learning according to claim 4, characterized in that the server further comprises:

a PoseNet model training module, configured to train the PoseNet model with a known diving video data set and the corresponding set of torso recognition results;

a torso recognition result output module, configured to input a diving video into the trained PoseNet model and output a torso recognition result.

7. A diving scoring system based on deep learning, characterized by comprising the diving scoring server based on deep learning according to any one of claims 4 to 6.

8. The diving scoring system based on deep learning according to claim 7, characterized by further comprising a client, which is connected to the diving scoring server based on deep learning and displays the output of that server.