CN110298214A - Stage multi-target tracking and classification method based on a combined deep neural network - Google Patents

Stage multi-target tracking and classification method based on a combined deep neural network

Info

Publication number
CN110298214A
CN110298214A (application CN201810242992.7A)
Authority
CN
China
Prior art keywords
neural network
performer
stage
method based
classification method
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810242992.7A
Other languages
Chinese (zh)
Inventor
程飞
罗恒阳
谢彦春
肖继民
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Kai Ming Zhen Nana Electronic Technology Co Ltd
Original Assignee
Suzhou Kai Ming Zhen Nana Electronic Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Kai Ming Zhen Nana Electronic Technology Co Ltd filed Critical Suzhou Kai Ming Zhen Nana Electronic Technology Co Ltd
Priority to CN201810242992.7A priority Critical patent/CN110298214A/en
Publication of CN110298214A publication Critical patent/CN110298214A/en
Pending legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/41Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G06V20/42Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items of sport video content

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a stage multi-target tracking and classification method based on a combined deep neural network. The hardware used in the method comprises one or more video cameras and a computation server. The method combines several deep neural networks with image processing techniques: by training the deep neural network models, it simultaneously tracks multiple performers during a stage performance and distinguishes their roles. By combining several neural network algorithms with computer vision algorithms, the invention solves the detection, tracking and classification of the different types of performers during stage rehearsals and performances. The technique provides an important basis for subsequent intelligent stage applications such as stage scene understanding, intelligent generation of scenery and lighting schemes, and intelligent assistance for stage choreography.

Description

Stage multi-target tracking and classification method based on a combined deep neural network
Technical field
The present invention relates to the technical field of performer tracking and classification, and in particular to a stage multi-target tracking and classification method based on a combined deep neural network.
Background art
An excellent stage performance is, in visual terms, inseparable from outstanding scenery and stage lighting, and the performers' blocking (walking paths) and role assignment are among the key factors in scenery and stage lighting design. In traditional scenery and stage lighting design, an experienced designer typically designs suitable scenery and lighting effects according to the performers' walking arrangement observed during rehearsal. However, this fixed design greatly restricts the performers during the actual performance: they must perform strictly according to the rehearsed blocking. In some large-scale performances it frequently happens that a performer fails to reach the correct position and misses the light.
At present, the mainstream approach to this problem is to position performers on the stage by having them wear short-range wireless positioning sensors. However, this method is highly susceptible to the complex electromagnetic interference of a stage environment, is poorly suited to scenes with many performers, and cannot identify the performers' roles.
The present invention combines deep neural network algorithms designed for different purposes to solve the problems of locating stage performers and identifying their roles, providing an important information basis for subsequent stage intelligence.
Summary of the invention
The purpose of the present invention is to provide a stage multi-target tracking and classification method based on a combined deep neural network, so as to solve the problems raised in the background above and to provide an information basis for subsequent intelligent stage applications.
To achieve the above object, the invention provides the following technical solution:
A stage multi-target tracking and classification method based on a combined deep neural network, in which the hardware comprises one or more video cameras and a computation server. The method combines several deep neural networks with image processing techniques: by training the deep neural network models, multiple performers are simultaneously tracked during a stage performance and their roles are distinguished.
As a further solution of the present invention, the method specifically comprises the following steps:
1) Data acquisition: the target scene is captured by the video camera(s), and the images contain all the performers to be tracked and distinguished;
2) Performer position detection and image segmentation: a properly trained object detection deep neural network finds the positions of all performers in every frame and crops out a picture of each performer;
3) Image preprocessing after segmentation: the cropped pictures are preprocessed to a uniform size, scaling the images while preserving the aspect ratio of the person;
4) Performer position tracking: a properly trained object tracking neural network yields the motion trajectory of each performer;
5) Performer role differentiation: unsupervised cluster analysis is applied to the performer roles to distinguish the different performer types.
As a further solution of the present invention, the performers in the method include singers, dancers or actors.
As a further solution of the present invention, the video camera is a high-definition high-speed camera.
As a further solution of the present invention, the object detection deep neural network in step 2) is a Fast R-CNN network, a Faster R-CNN network or a YOLO network.
As a further solution of the present invention, the training data in step 2) is obtained by extracting a large number of videos and images from various large-scale concerts and performances and annotating them with human assistance, forming a rich set of training samples.
As a further solution of the present invention, the object tracking neural network in step 4) uses an end-to-end tracking algorithm based on correlation filters, or a tracking algorithm based on trained, high-speed deep regression networks.
Compared with the prior art, the beneficial effects of the present invention are:
(1) tracking and classification are performed with a contact-free method;
(2) real-time tracking of multiple target performers is achieved;
(3) accurate differentiation of performer roles is achieved.
Brief description of the drawings
Fig. 1 is a schematic diagram of the usage scenario of the method of the present invention;
Fig. 2 is the system flow chart corresponding to the method of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only a part of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
Referring to Figs. 1-2, the present invention provides the following technical solution: a stage multi-target tracking and classification method based on a combined deep neural network, in which the hardware comprises one or more video cameras and a computation server. The method combines several deep neural networks with image processing techniques: by training the deep neural network models, multiple performers are simultaneously tracked during a stage performance and their roles are distinguished.
The specific implementation comprises the following steps:
1) Data acquisition: the target scene is captured by the video camera(s), and the images contain all the performers to be tracked and distinguished;
2) Performer position detection and image segmentation: a properly trained object detection deep neural network finds the positions of all performers in every frame and crops out a picture of each performer;
3) Image preprocessing after segmentation: the cropped pictures are preprocessed to a uniform size, scaling the images while preserving the aspect ratio of the person (a minimal resize-with-padding sketch is given after this list);
4) Performer position tracking: a properly trained object tracking neural network yields the motion trajectory of each performer;
5) Performer role differentiation: unsupervised cluster analysis is applied to the performer roles to distinguish the different performer types.
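The patent does not fix a concrete preprocessing routine for step 3; the following is only a minimal sketch, assuming OpenCV is available. The 128x256 target size, the black padding and the helper name letterbox are illustrative assumptions, not part of the invention.

```python
# A minimal sketch of step 3 (referenced above): scale each cropped performer
# picture to a uniform size while preserving the person's aspect ratio, and pad
# the remainder. Target size and padding colour are assumptions.
import cv2
import numpy as np

def letterbox(crop: np.ndarray, size=(128, 256)) -> np.ndarray:
    """Resize a BGR crop to `size` (width, height) without distorting the person."""
    target_w, target_h = size
    h, w = crop.shape[:2]
    scale = min(target_w / w, target_h / h)
    new_w, new_h = int(round(w * scale)), int(round(h * scale))
    resized = cv2.resize(crop, (new_w, new_h), interpolation=cv2.INTER_LINEAR)
    canvas = np.zeros((target_h, target_w, 3), dtype=resized.dtype)
    x0, y0 = (target_w - new_w) // 2, (target_h - new_h) // 2
    canvas[y0:y0 + new_h, x0:x0 + new_w] = resized  # centre the scaled person
    return canvas
```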
The term "performer" above covers performers of all kinds, such as singers, dancers and actors.
In step 1), there is no restriction on the model or the number of video cameras; they only need to cover the whole stage. A high-definition high-speed camera is preferred, as it improves the recognition accuracy.
In step 2), the object detection neural networks that can be used include Fast R-CNN (reference: Girshick R. Fast R-CNN [C] // Computer Vision (ICCV), 2015 IEEE International Conference on. IEEE, 2015: 1440-1448.), Faster R-CNN (reference: Ren S, He K, Girshick R, et al. Faster R-CNN: towards real-time object detection with region proposal networks [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149.) and YOLO (reference: Redmon J, Divvala S, Girshick R, et al. You only look once: Unified, real-time object detection [C] // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016: 779-788.).
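As an illustration only (the patent leaves the concrete implementation open), the sketch below uses the pretrained Faster R-CNN shipped with torchvision (version 0.13 or later) to find the person class in a frame and crop out each performer. The score threshold, the helper name detect_performers and the use of a COCO-pretrained model rather than a stage-specific one are assumptions.

```python
# A minimal sketch of step 2: detect performers with a pretrained Faster R-CNN
# and crop each detection from the frame.
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

PERSON_CLASS_ID = 1  # "person" in the COCO label map used by torchvision

def detect_performers(frame: Image.Image, score_thresh: float = 0.7):
    """Return a list of (box, crop) pairs for every detected performer."""
    with torch.no_grad():
        pred = model([to_tensor(frame)])[0]
    crops = []
    for box, label, score in zip(pred["boxes"], pred["labels"], pred["scores"]):
        if label.item() == PERSON_CLASS_ID and score.item() >= score_thresh:
            x1, y1, x2, y2 = [int(v) for v in box.tolist()]
            crops.append(((x1, y1, x2, y2), frame.crop((x1, y1, x2, y2))))
    return crops
```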
In step 2), the training data can be obtained by extracting a large number of videos and images from various large-scale concerts and performances and annotating them with human assistance, forming a rich set of training samples.
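As a hedged illustration of how such raw material might be harvested before the human annotation step, the sketch below samples frames from a performance video with OpenCV. The sampling stride, the output layout and the helper name sample_frames are assumptions.

```python
# A minimal sketch of collecting raw training frames; per-performer bounding
# boxes would still be added afterwards by human annotators.
import os
import cv2

def sample_frames(video_path: str, out_dir: str, every_n: int = 30) -> int:
    """Save every n-th frame of a performance video as a JPEG for annotation."""
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    idx = saved = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % every_n == 0:
            cv2.imwrite(os.path.join(out_dir, f"frame_{idx:06d}.jpg"), frame)
            saved += 1
        idx += 1
    cap.release()
    return saved
```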
In step 4), any of the various mature single-object tracking algorithms can be used, for example the end-to-end tracking algorithm based on correlation filters (Valmadre J, Bertinetto L, Henriques J, et al. End-to-end representation learning for correlation filter based tracking [C] // Computer Vision and Pattern Recognition (CVPR), 2017 IEEE Conference on. IEEE, 2017: 5000-5008.) or the tracking algorithm based on trained, high-speed deep regression networks (Held D, Thrun S, Savarese S. Learning to track at 100 fps with deep regression networks [C] // European Conference on Computer Vision. Springer, Cham, 2016: 749-765.).
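The sketch below is not an implementation of the cited correlation-filter or deep-regression trackers; it is only a minimal greedy IoU association, shown to make concrete how per-frame detections from step 2 can be linked into one trajectory per performer. The helper names (iou, update_tracks) and the 0.3 threshold are assumptions.

```python
# A minimal greedy IoU-association sketch: link this frame's detections to the
# existing per-performer trajectories, starting new trajectories as needed.
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def update_tracks(tracks, detections, iou_thresh=0.3):
    """tracks: {track_id: [boxes...]}; detections: list of boxes for this frame."""
    unmatched = list(detections)
    for tid, history in tracks.items():
        if not unmatched:
            break
        best = max(unmatched, key=lambda d: iou(history[-1], d))
        if iou(history[-1], best) >= iou_thresh:
            history.append(best)
            unmatched.remove(best)
    for det in unmatched:  # start a new trajectory for each unmatched performer
        tracks[max(tracks, default=0) + 1] = [det]
    return tracks
```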
The important difference between step 4) and step 2) is that step 2) only finds positions and cannot tell whether certain performers have swapped places, whereas step 4) tracks individual features and follows the motion of each single performer, so that tracking remains accurate even when performers wear similar costumes and make similar movements.
In step 5), an unsupervised clustering algorithm can be used to distinguish the different roles on the stage. The roles may fall into several classes, such as star and backup dancers, leading and supporting roles, or the members of a group. In order not to be limited to a fixed classification pattern, an unsupervised clustering algorithm is used, for example the classical K-means algorithm (Ahmad A, Dey L. A k-mean clustering algorithm for mixed numeric and categorical data [J]. Data & Knowledge Engineering, 2007, 63(2): 503-527.).
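A minimal sketch of step 5 with scikit-learn's K-means follows. The trajectory features (mean stage position plus spread of movement), the default of two clusters and the helper name role_clusters are illustrative assumptions; the patent only requires some form of unsupervised clustering of the performer roles.

```python
# A minimal sketch of step 5: cluster performer trajectories into role groups.
import numpy as np
from sklearn.cluster import KMeans

def role_clusters(tracks: dict, n_roles: int = 2) -> dict:
    """tracks: {track_id: [(x1, y1, x2, y2), ...]} -> {track_id: role_label}."""
    ids, feats = [], []
    for tid, boxes in tracks.items():
        centers = np.array([[(x1 + x2) / 2, (y1 + y2) / 2] for x1, y1, x2, y2 in boxes])
        # feature vector: average stage position + how widely the performer moves
        feats.append(np.concatenate([centers.mean(axis=0), centers.std(axis=0)]))
        ids.append(tid)
    labels = KMeans(n_clusters=n_roles, n_init=10).fit_predict(np.array(feats))
    return dict(zip(ids, labels.tolist()))
```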
Finally, through the above scheme, the motion trajectory of every performer can be analyzed in real time during a stage performance.
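For orientation only, the sketch below wires the illustrative helpers defined above into a single per-video loop: detect performers in each frame, link them into trajectories, then cluster the trajectories into role groups. It is a sketch under the same assumptions as the earlier snippets, not the patent's implementation.

```python
# A minimal end-to-end sketch combining the illustrative helpers above.
import cv2
from PIL import Image

def analyse_performance(video_path: str, n_roles: int = 2):
    cap = cv2.VideoCapture(video_path)
    tracks = {}
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        rgb = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        boxes = [box for box, _crop in detect_performers(rgb)]   # step 2
        tracks = update_tracks(tracks, boxes)                    # step 4
    cap.release()
    return tracks, role_clusters(tracks, n_roles)                # step 5
```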
It is obvious to a person skilled in the art that the invention is not limited to the details of the above exemplary embodiments and that the present invention can be realized in other specific forms without departing from the spirit or essential attributes of the invention. Therefore, from whichever point of view, the embodiments are to be regarded as illustrative and not restrictive, and the scope of the present invention is defined by the appended claims rather than by the above description; all changes that fall within the meaning and range of equivalency of the claims are therefore intended to be embraced in the present invention. Any reference signs in the claims shall not be construed as limiting the claims concerned.

Claims (7)

1. A stage multi-target tracking and classification method based on a combined deep neural network, characterized in that the hardware used in the method comprises one or more video cameras and a computation server, and in that the method combines several deep neural networks with image processing techniques: by training the deep neural network models, multiple performers are simultaneously tracked during a stage performance and their roles are distinguished.
2. The stage multi-target tracking and classification method based on a combined deep neural network according to claim 1, characterized in that it specifically comprises the following steps:
1) Data acquisition: the target scene is captured by the video camera(s), and the images contain all the performers to be tracked and distinguished;
2) Performer position detection and image segmentation: a properly trained object detection deep neural network finds the positions of all performers in every frame and crops out a picture of each performer;
3) Image preprocessing after segmentation: the cropped pictures are preprocessed to a uniform size, scaling the images while preserving the aspect ratio of the person;
4) Performer position tracking: a properly trained object tracking neural network yields the motion trajectory of each performer;
5) Performer role differentiation: unsupervised cluster analysis is applied to the performer roles to distinguish the different performer types.
3. The stage multi-target tracking and classification method based on a combined deep neural network according to claim 2, characterized in that the performers in the method include singers, dancers or actors.
4. The stage multi-target tracking and classification method based on a combined deep neural network according to claim 2, characterized in that the video camera is a high-definition high-speed camera.
5. The stage multi-target tracking and classification method based on a combined deep neural network according to claim 2, characterized in that the object detection deep neural network in step 2) is a Fast R-CNN neural network, a Faster R-CNN neural network or a YOLO neural network.
6. The stage multi-target tracking and classification method based on a combined deep neural network according to claim 2, characterized in that the training data in step 2) is obtained by extracting a large number of videos and images from various large-scale concerts and performances and annotating them with human assistance, forming a rich set of training samples.
7. The stage multi-target tracking and classification method based on a combined deep neural network according to claim 2, characterized in that the object tracking neural network in step 4) uses an end-to-end tracking algorithm based on correlation filters, or a tracking algorithm based on trained, high-speed deep regression networks.
CN201810242992.7A 2018-03-23 2018-03-23 Stage multi-target tracking and classification method based on a combined deep neural network Pending CN110298214A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810242992.7A CN110298214A (en) 2018-03-23 2018-03-23 Stage multi-target tracking and classification method based on a combined deep neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810242992.7A CN110298214A (en) 2018-03-23 2018-03-23 Stage multi-target tracking and classification method based on a combined deep neural network

Publications (1)

Publication Number Publication Date
CN110298214A true CN110298214A (en) 2019-10-01

Family

ID=68025806

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810242992.7A Pending CN110298214A (en) 2018-03-23 2018-03-23 Stage multi-target tracking and classification method based on a combined deep neural network

Country Status (1)

Country Link
CN (1) CN110298214A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103489199A (en) * 2012-06-13 2014-01-01 通号通信信息集团有限公司 Video image target tracking processing method and system
CN103839049A (en) * 2014-02-26 2014-06-04 中国计量学院 Double-person interactive behavior recognizing and active role determining method
CN106659940A (en) * 2014-05-21 2017-05-10 环球城市电影有限责任公司 Optical tracking system for automation of amusement park elements
US20160239982A1 (en) * 2014-08-22 2016-08-18 Zhejiang Shenghui Lighting Co., Ltd High-speed automatic multi-object tracking method and system with kernelized correlation filters
CN106097391A (en) * 2016-06-13 2016-11-09 浙江工商大学 A kind of multi-object tracking method identifying auxiliary based on deep neural network
CN106407891A (en) * 2016-08-26 2017-02-15 东方网力科技股份有限公司 Target matching method based on convolutional neural network and device

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111339687A (en) * 2020-03-26 2020-06-26 北京理工大学 Crowd performance site sparing system based on deep learning
CN113920164A (en) * 2021-10-27 2022-01-11 浙江工商大学 Actor identity re-identification method based on near-infrared anti-counterfeiting ink in theater environment
CN113920164B (en) * 2021-10-27 2024-05-24 浙江工商大学 Actor identity re-identification method based on near infrared anti-counterfeiting ink in theatre environment
CN114071842A (en) * 2021-11-23 2022-02-18 横店集团得邦照明股份有限公司 Intelligent stage lighting system based on target detection and tracking
CN113993250A (en) * 2021-12-24 2022-01-28 深圳市奥新科技有限公司 Stage lighting control method, device, equipment and storage medium
CN113993250B (en) * 2021-12-24 2022-03-15 深圳市奥新科技有限公司 Stage lighting control method, device, equipment and storage medium
CN114245543A (en) * 2021-12-27 2022-03-25 江苏领焰智能科技股份有限公司 Stage lighting method, system, equipment and storage medium based on space positioning
CN115881001A (en) * 2022-12-30 2023-03-31 深圳市联诚发科技股份有限公司 Display control method and device of polymorphic display screen and polymorphic display screen

Similar Documents

Publication Publication Date Title
CN110298214A (en) Stage multi-target tracking and classification method based on a combined deep neural network
Manafifard et al. A survey on player tracking in soccer videos
Zhao et al. Crossing-line crowd counting with two-phase deep neural networks
CN100390811C (en) Method for tracking multiple human faces from video in real time
Senst et al. Detecting people carrying objects based on an optical flow motion model
Vats et al. Player tracking and identification in ice hockey
CN103761514A (en) System and method for achieving face recognition based on wide-angle gun camera and multiple dome cameras
Zhang et al. Robust multi-object tracking via cross-domain contextual information for sports video analysis
Lian et al. Spatial–temporal consistent labeling of tracked pedestrians across non-overlapping camera views
CN109325456A (en) Target identification method, device, target identification equipment and storage medium
CN109344842A (en) A kind of pedestrian's recognition methods again based on semantic region expression
Liu et al. Soccer video event detection using 3D convolutional networks and shot boundary detection via deep feature distance
Shah et al. Multi-view action recognition using contrastive learning
Liang et al. Methods of moving target detection and behavior recognition in intelligent vision monitoring.
Zhou et al. A survey of multi-object video tracking algorithms
CN109753962A (en) Text filed processing method in natural scene image based on hybrid network
Wang Exploring intelligent image recognition technology of football robot using omnidirectional vision of internet of things
Prasanna et al. RETRACTED ARTICLE: An effiecient human tracking system using Haar-like and hog feature extraction
CN108830222A (en) A kind of micro- expression recognition method based on informedness and representative Active Learning
Mu et al. Resgait: The real-scene gait dataset
Cheng et al. Agricultural pests tracking and identification in video surveillance based on deep learning
CN108765459A (en) Semi-online visual multi-target tracking method based on small trajectory graph association model
Praveenkumar et al. Real-time multi-object tracking of pedestrians in a video using convolution neural network and Deep SORT
CN108491832A (en) A kind of embedded human face identification follow-up mechanism and method
Yang et al. Design and implementation of intelligent analysis technology in sports video target and trajectory tracking algorithm

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20191001