CN109711232A

CN109711232A - Deep learning pedestrian re-identification method based on multiple objective functions

Info

Publication number: CN109711232A
Application number: CN201711018885.8A
Authority: CN
Inventors: 单鼎一; 范宇; 张晓林; 刘惟锦; 武玉亭
Original assignee: China Changfeng Science Technology Industry Group Corp
Current assignee: China Changfeng Science Technology Industry Group Corp
Priority date: 2017-10-26
Filing date: 2017-10-26
Publication date: 2019-05-03

Abstract

The present invention provides a deep learning pedestrian re-identification method based on multiple objective functions. Video frames are extracted from preset regions under cameras at different angles, and pedestrian detection and label annotation are performed to obtain training data. The training data pictures are preprocessed so that each sample is a pair of pictures. For the test picture and the pictures in the search library, features are extracted with the trained deep learning network, and the output of the last convolutional layer is selected as the abstract representation of a pedestrian picture. For each test picture, the feature similarity between it and the abstract features of all pedestrians in the picture library is calculated; this result indicates their degree of similarity. When the feature similarity meets a preset requirement, the pedestrian in the test picture and the person with the greatest similarity in the picture library are determined to be the same person.

Description

Deep learning pedestrian re-identification method based on multiple objective functions

Technical field

The invention belongs to the technical field of pattern recognition, and more particularly relates to a multi-objective-function deep learning method and system for pedestrian re-identification.

Background technique

Pedestrian re-identification (Person Re-Identification) refers to matching pedestrians across a network of multiple cameras with non-overlapping fields of view, that is, determining whether pedestrians captured by cameras at different locations at different times are the same person. Specifically, a target pedestrian to be searched for is used as the query, and appearance features are used to automatically find pictures related to the query in an existing pedestrian library built from non-overlapping cameras. Pedestrian re-identification can also be understood as a video picture retrieval task, and the technology has important applications in video surveillance fields such as criminal investigation, pedestrian detection, multi-camera pedestrian tracking and behavior analysis. However, because the same target under different cameras is affected by factors such as viewing angle, illumination and occlusion, its feature representation often deviates between views, and the appearance of different pedestrians may be more similar than that of the same person. Pedestrian re-identification therefore still faces great challenges in application.

As the new favorite for video and image processing tasks, "deep learning" automatically learns features from big data through multi-layer nonlinear mappings instead of using hand-designed features. This improves the capability of classification and recognition algorithms and opens a new page for intelligent video image analysis technology. For the pedestrian re-identification problem, the prior art trains a supervised classification (identification) network and then extracts the abstract features of a certain layer to represent pedestrians. Although such features are highly discriminative, they do not take into account the inherent relationship of the same person under different cameras, so they perform poorly when similarity is measured. To overcome this lack of intra-class correlation, some algorithms adopt a deep learning verification model that takes a pair of pictures as input each time, with a label indicating whether they show the same person; this, however, ignores the multi-class discriminative ability of the classification (identification) problem.

To solve the above problems, the invention proposes a pedestrian re-identification method that combines the identification model and the verification model of deep learning technology.

Summary of the invention

The object of the present invention is to address the above shortcomings of the prior art and the need for improvement by proposing a pedestrian re-identification method based on deep learning multi-task fusion. The method seeks to improve the discriminability and representativeness of features from multiple angles, to better ensure the accuracy and practicality of similarity discrimination, and to improve the accuracy of pedestrian re-identification technology.

To achieve the above object, the present invention adopts the following technical solution:

A deep learning pedestrian re-identification method based on multiple objective functions, characterized in that a joint learning method combining a multi-pedestrian cross-entropy identification loss function and a pairwise-picture binary-classification verification loss function is added to the deep learning network structure, comprising the following steps:

Step 1: Acquisition module: pedestrian detection. Video frames are extracted from preset regions under cameras at different angles, and pedestrian detection and label annotation are performed to obtain training data.

Step 2: Training module: the training data pictures are preprocessed. Each sample is a pair of pictures, which may be two pictures of the same pedestrian under different cameras or a pair of pictures of different pedestrians.

Step 3: Extraction module: for the test picture and the pictures in the search library, features are extracted with the trained deep learning network, and the output of the last convolutional layer is selected as the abstract representation of a pedestrian picture.

Step 4: Computing module: for each test picture, the feature similarity between it and the abstract features of all pedestrians in the picture library is calculated; this result indicates their degree of similarity.

Step 5: Confirmation module: when the feature similarity meets the preset requirement, the pedestrian in the test picture and the person with the greatest similarity in the picture library are determined to be the same person.

The beneficial effects of the present invention are:

1. The core of the invention, the deep learning network feature extractor, is trained with multi-task joint learning. The two tasks share the convolutional neural network features, which reduces model complexity while solving the identification and verification tasks at the same time.

2. The model of the method has strong generalization ability, and the extracted abstract features are highly robust.

3. The invention can efficiently handle the pedestrian re-identification problem of a monitoring system in real time.

Brief description of the drawings

Fig. 1 shows the two deep learning structures that the present invention draws on;

Fig. 2 is the structure diagram of the multi-task convolutional neural network proposed by the invention;

Fig. 3 is the supervised training flow chart of the neural network structure of the invention;

Fig. 4 is the offline pedestrian re-identification flow chart of the invention;

Fig. 5 is the online multi-camera real-time pedestrian re-identification flow chart of the invention.

Specific embodiment

The present invention adds to the deep learning network structure a joint learning method that combines the multi-pedestrian cross-entropy identification loss function with the pairwise-picture binary-classification verification loss function, as described below:

Fig. 1(a) shows a multi-class identification convolutional neural network structure, which can learn a large amount of discriminative identity information. Fig. 1(b) shows a verification network, which has a strong ability to learn similarity comparison.

Fig. 2 shows the deep convolutional network for multi-task learning designed by the present invention. The method first inputs a pair of pedestrian images, which may show the same pedestrian or different pedestrians, into the network structure of the deep model. The model uses a multi-task optimization strategy that fuses the classification (identification) task with the binary verification task, embodied in the two components of the loss function: the multi-label cross-entropy loss function and the binary-classification verification loss function. In addition, supported by a large amount of training data and the error back-propagation strategy of gradient descent, the deep convolutional neural network is fine-tuned so that the discriminative power of the features and the ability to judge by similarity measurement are improved together, completing the training task.
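The two components of the loss function described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the weighting factor `lam`, and the use of softmax cross-entropy for the verification term are assumptions, since the source only states that an identification cross-entropy loss and a binary-classification verification loss are learned jointly.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    """Negative log-probability assigned to the target class index."""
    return -math.log(softmax(logits)[target])

def multitask_loss(id_logits_a, id_a, id_logits_b, id_b,
                   verif_logits, same_person, lam=1.0):
    """Identification loss on both images plus a weighted verification loss.

    id_logits_*: identity scores for each image of the pair.
    verif_logits: two scores (different, same) for the pair.
    lam: hypothetical weighting factor; the source does not specify one.
    """
    identification = (cross_entropy(id_logits_a, id_a)
                      + cross_entropy(id_logits_b, id_b))
    verification = cross_entropy(verif_logits, 1 if same_person else 0)
    return identification + lam * verification
```

Because both terms are computed from the same pair of inputs, their gradients flow back into the shared convolutional features, which is the sharing the source describes.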

Step 3: Extraction module: for the test picture and the pictures in the search library, features are extracted with the deep learning network trained by the present invention, and the output of the last convolutional layer is selected as the abstract representation of a pedestrian picture.

To make the purpose, technical solution and advantages of the present invention clearer, a specific and complete description is given below with reference to the drawings and specific embodiments:

Referring to Fig. 3, the core deep convolutional neural network of the invention is introduced.

S301 Data set construction: pedestrian detection can be performed with a deep learning pedestrian detection algorithm, background subtraction, optical flow, inter-frame differencing, or methods based on statistical learning. Pictures of the same pedestrian taken from different angles use the same label. To match the network input, a large number of positive and negative sample image pairs can be generated (a positive sample is a pair of detection pictures of the same person under two different cameras; a negative sample is a pair of detection pictures of different people under different cameras).
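The pairing rule in S301 can be sketched as below. The tuple layout `(image_id, person_label, camera_id)` and the function name `build_pairs` are hypothetical; the source only specifies that positive pairs show the same person under two different cameras and negative pairs show different people under different cameras.

```python
from itertools import combinations

def build_pairs(detections):
    """Split all cross-camera picture pairs into positives and negatives.

    detections: list of (image_id, person_label, camera_id) tuples
                (an assumed layout, not taken from the source).
    Returns (positives, negatives), each a list of (image_id_1, image_id_2).
    """
    positives, negatives = [], []
    for (img1, pid1, cam1), (img2, pid2, cam2) in combinations(detections, 2):
        if cam1 == cam2:
            continue  # the source pairs pictures taken by different cameras
        if pid1 == pid2:
            positives.append((img1, img2))
        else:
            negatives.append((img1, img2))
    return positives, negatives
```

Enumerating every pair grows quadratically with the number of detections, so a real pipeline would probably sample negatives rather than keep them all; the source does not say how pairs are sampled.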

S302 Convolutional neural network structure: the backbone network used in the present invention is the 50-layer deep residual network ResNet50. Deep learning is used to learn the implicit discriminative features of pedestrian pictures, which overcomes the interference of hand-designed traditional features. The network mainly comprises a large number of convolutional layers, pooling layers and a fully connected layer. To meet the needs of the task, its last fully connected layer is removed and a new convolutional layer is added, making the network fully convolutional. In addition, for pictures input in pairs, the extracted features jointly enter a new squared-difference layer (similarity comparison layer), which is followed by the verification loss function, as shown in Fig. 2.
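A sketch of the squared-difference (similarity comparison) layer follows, using plain lists in place of network tensors. The linear map `verification_logits` that turns the squared difference into two scores is an assumption for illustration; the source states only that the squared-difference layer feeds the verification loss.

```python
def squared_difference(feat_a, feat_b):
    """Element-wise (a - b)^2 of two feature vectors of equal length."""
    if len(feat_a) != len(feat_b):
        raise ValueError("feature vectors must have the same length")
    return [(a - b) ** 2 for a, b in zip(feat_a, feat_b)]

def verification_logits(diff, weights, bias):
    """Map the squared-difference vector to two scores (different, same).

    weights: two rows, each as long as diff; bias: two values.
    Both are hypothetical learnable parameters.
    """
    return [sum(w * d for w, d in zip(row, diff)) + b
            for row, b in zip(weights, bias)]
```

The squared difference is symmetric in its two inputs, so swapping the pictures of a pair leaves the verification scores unchanged.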

S303 Model learning process: paired positive and negative samples are input into the network and propagated forward. From their identity labels and the information on whether the pair shows the same person, the model obtains the identification loss and the verification loss respectively. At a given learning rate, the invention obtains the weight updates from the partial derivatives via gradient descent and the chain rule. With a large amount of labeled training data, the deep convolutional neural network is optimized and fine-tuned, and multi-task learning ensures the discriminability and representativeness of the features, until the model converges and stabilizes.

Referring to Fig. 4, the offline pedestrian re-identification flow of the invention is as follows.

S401 Database picture construction: pedestrian detection can be performed with a deep learning pedestrian detection algorithm, background subtraction, optical flow, inter-frame differencing, or methods based on statistical learning. Pictures of the same pedestrian taken from different angles use the same label, forming the target pedestrian search library.

S402 In the actual test scenario, pedestrian detection is performed on the video data obtained by the camera to obtain the pedestrian pictures to be tested.

S403 Using the trained deep learning model, features are extracted from the database pictures and the pictures to be tested respectively (database features are extracted once and saved). The present invention retains the output of the last convolutional layer as the discriminative feature vector of the target.

S404 Forming the feature library: the extracted database features are saved together with their corresponding labels; this can also be understood as the feature dictionary of the pedestrian re-identification system.

S405 Calculating similarity: the similarity between every feature in the feature dictionary and the feature of the pedestrian to be tested is calculated. Various measures can be used, such as cosine distance or Euclidean distance; the smaller the distance, the more similar the two features and the more likely they belong to the same pedestrian target.
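The two distance measures named in S405 can be sketched with the standard library alone. The helper `rank_gallery` and the `(label, feature)` layout of the feature dictionary are assumptions made for illustration.

```python
import math

def euclidean_distance(u, v):
    """Straight-line distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cosine_distance(u, v):
    """1 minus cosine similarity; 0 when the vectors point the same way."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    if norm == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / norm

def rank_gallery(query, gallery, distance=cosine_distance):
    """Return (label, distance) pairs sorted from closest to farthest.

    gallery: list of (label, feature) entries (an assumed layout).
    """
    scored = [(label, distance(query, feat)) for label, feat in gallery]
    return sorted(scored, key=lambda item: item[1])
```

Cosine distance ignores the length of the feature vectors while Euclidean distance does not, so the two measures can rank the same library differently unless the features are normalized first.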

S406 Determining the result: a candidate whose similarity reaches the preset value may be the same pedestrian. Since the feature library may contain several features that pass the threshold, two strategies can be used: 1. the pedestrian whose label corresponds to the feature with the smallest distance in the feature library is taken as the pedestrian to be tested; 2. all features that pass the threshold vote by label to determine the class of the pedestrian to be tested.
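The two decision strategies of S406 can be sketched as follows. The sketch works on distances, so "passing the threshold" is interpreted as a distance no greater than `max_distance`; that interpretation, the function names and the `(label, distance)` input layout are assumptions.

```python
from collections import Counter

def nearest_label(scored, max_distance):
    """Strategy 1: label of the closest feature that passes the threshold.

    scored: list of (label, distance) pairs; returns None if nothing passes.
    """
    within = [(label, d) for label, d in scored if d <= max_distance]
    if not within:
        return None
    return min(within, key=lambda item: item[1])[0]

def vote_label(scored, max_distance):
    """Strategy 2: majority vote among all features that pass the threshold."""
    votes = Counter(label for label, d in scored if d <= max_distance)
    if not votes:
        return None
    return votes.most_common(1)[0][0]
```

The two strategies can disagree: a single very close feature wins under the first, while a label with several moderately close features wins under the second. Returning `None` stands for "no match in the library".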

Referring to Fig. 5, the online multi-camera real-time pedestrian re-identification flow of the invention is as follows.

S501, S502: cameras A and B capture real-time video and perform pedestrian detection to obtain pedestrian pictures. Pedestrians in A are treated as appearing for the first time, and the goal of the system is to judge whether a pedestrian in B has appeared in A.

S503 Using the trained deep learning model, features are extracted from the pedestrian pictures of cameras A and B respectively, and the output of the last convolutional layer is retained as the feature vector.

S504 Every feature extracted from camera A is saved, providing the feature library for pedestrian re-identification in camera B.

S505 Calculating similarity: the similarity between every feature in the feature library and the feature of the pedestrian to be tested in camera B is calculated. Various measures can be used, such as cosine distance or Euclidean distance; the smaller the distance, the more similar the two features and the more likely they belong to the same pedestrian target.

S506 Determining the result: a candidate whose similarity reaches the preset value may be the same pedestrian. Since the feature library may contain several features that pass the threshold, two strategies can be used: 1. the pedestrian whose label corresponds to the feature with the smallest distance in the feature library is taken as the pedestrian to be tested; 2. all features that pass the threshold vote by label to determine the class of the pedestrian to be tested.
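The online flow S501 to S506 can be sketched as a small in-memory feature library that is filled from camera A and queried from camera B. The class name `OnlineGallery`, the track identifiers and the choice of Euclidean distance with the nearest-feature strategy are all assumptions; feature extraction by the trained network is taken as given and represented by plain lists.

```python
import math

class OnlineGallery:
    """Feature library built from camera A and queried by camera B."""

    def __init__(self, max_distance):
        self.max_distance = max_distance  # hypothetical preset threshold
        self.entries = []                 # list of (track_id, feature)

    def add(self, track_id, feature):
        """S504: save a feature extracted from a camera-A pedestrian."""
        self.entries.append((track_id, list(feature)))

    def match(self, feature):
        """S505/S506: return the closest camera-A track, or None."""
        best_id, best_dist = None, math.inf
        for track_id, stored in self.entries:
            dist = math.sqrt(sum((a - b) ** 2
                                 for a, b in zip(feature, stored)))
            if dist < best_dist:
                best_id, best_dist = track_id, dist
        return best_id if best_dist <= self.max_distance else None
```

A linear scan is enough for a sketch; a deployed system with a large library would likely need an indexed nearest-neighbor search, which the source does not discuss.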

Claims

1. A deep learning pedestrian re-identification method based on multiple objective functions, characterized in that a joint learning method combining a multi-pedestrian cross-entropy identification loss function and a pairwise-picture binary-classification verification loss function is added to the deep learning network structure, comprising the following steps:

Step 1: Acquisition module: pedestrian detection; video frames are extracted from preset regions under cameras at different angles, and pedestrian detection and label annotation are performed to obtain training data;

Step 2: Training module: the training data pictures are preprocessed; each sample is a pair of pictures, which may be two pictures of the same pedestrian under different cameras or a pair of pictures of different pedestrians;

Step 3: Extraction module: for the test picture and the pictures in the search library, features are extracted with the trained deep learning network, and the output of the last convolutional layer is selected as the abstract representation of a pedestrian picture;

Step 4: Computing module: for each test picture, the feature similarity between it and the abstract features of all pedestrians in the picture library is calculated, this result indicating their degree of similarity;

Step 5: Confirmation module: when the feature similarity meets the preset requirement, the pedestrian in the test picture and the person with the greatest similarity in the picture library are determined to be the same person.