CN109063649A - Pedestrian re-identification method based on a Siamese pedestrian alignment residual network - Google Patents
Pedestrian re-identification method based on a Siamese pedestrian alignment residual network
- Publication number
- CN109063649A CN201810876899.1A
- Authority
- CN
- China
- Prior art keywords
- pedestrian
- residual error
- error network
- branch
- twin
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Biomedical Technology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Human Computer Interaction (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
- Image Processing (AREA)
Abstract
The invention discloses a pedestrian re-identification method based on a Siamese (twin) pedestrian alignment residual network, comprising the following steps: S1, constructing a base-branch Siamese residual network; S2, constructing a pedestrian-alignment-branch Siamese residual network; S3, training the parameters of both constructed Siamese residual networks on a constructed training data set, then taking the base branch prototype out of the trained base-branch Siamese residual network and the pedestrian alignment branch prototype out of the trained pedestrian-alignment-branch Siamese residual network to serve as the classification model for pedestrian re-identification. The invention improves the accuracy of pedestrian re-identification over the original algorithm.
Description
Technical field
The invention belongs to the technical field of image retrieval, which uses computer vision techniques to determine whether a specific pedestrian is present in an image or video sequence. More specifically, it relates to a pedestrian re-identification method based on a Siamese pedestrian alignment residual network, within the field of pedestrian re-identification.
Background technique
In surveillance video, an image usable for face recognition often cannot be obtained, for reasons such as background occlusion and the low resolution that results when a pedestrian is far from the camera. When face recognition cannot be used normally, pedestrian re-identification becomes a very important alternative. A key characteristic of pedestrian re-identification is that it works across cameras, so academic papers evaluate performance by retrieving images of the same pedestrian captured under different cameras. Pedestrian re-identification has been studied for many years, but only in recent years, with the development of deep learning, has it achieved major breakthroughs in academia.
Traditional image-based pedestrian re-identification algorithms that rely on feature representation fall roughly into the following categories:
(1) Low-level visual features: these methods generally divide an image into multiple regions, extract several different low-level visual features from each region, and combine them into a more robust feature representation. The most common such feature is the color histogram.
(2) Mid-level semantic attributes: semantic information, such as colors, clothing and carried bags, is used to judge whether two images show the same pedestrian. The semantic attributes of a given pedestrian seldom change across different cameras.
(3) High-level visual features: feature selection techniques improve the recognition rate of pedestrian re-identification. The biggest difference between deep learning methods and traditional methods is that deep learning needs no manual feature selection: through end-to-end learning, it automatically learns the various features in pedestrian images.
Therefore, when pedestrian re-identification relies on manually selected features, the number of candidate features is large and the images actually captured by cameras vary widely, so it is difficult to find one specific feature that performs well on all images. Compared with manual selection of pedestrian features, methods based on deep learning models can achieve better results.
Existing deep learning models mainly belong to the class of convolutional neural networks; commonly used models include CaffeNet, VGGNet and residual networks.
Summary of the invention
The invention proposes a pedestrian re-identification method based on a Siamese pedestrian alignment residual network. It can effectively raise the precision of the whole network and improve the accuracy of pedestrian re-identification.
The network adopts a dual-channel structure in which pedestrian images are input in pairs. The image pairs include both same-class and different-class images, which provides feedback from positive and negative samples and lets the network learn discriminative features.
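The paired input described above can be sketched in a few lines. This is only an illustration under assumptions: the function name `make_training_pairs`, the 50/50 ratio of positive to negative pairs, and the random sampling scheme are hypothetical and are not specified by the source.

```python
import random
from collections import defaultdict


def make_training_pairs(labels, num_pairs, positive_ratio=0.5, seed=0):
    """Return (index_a, index_b, same) tuples, where same is 1 for a
    same-identity pair and 0 for a different-identity pair."""
    rng = random.Random(seed)
    by_identity = defaultdict(list)
    for index, identity in enumerate(labels):
        by_identity[identity].append(index)

    identities = list(by_identity)
    # Only identities with at least two images can form a positive pair.
    multi = [ident for ident in identities if len(by_identity[ident]) >= 2]
    if not multi or len(identities) < 2:
        raise ValueError("need two identities, one of them with two images")

    pairs = []
    for _ in range(num_pairs):
        if rng.random() < positive_ratio:
            # Positive sample: two distinct images of one identity.
            a, b = rng.sample(by_identity[rng.choice(multi)], 2)
            pairs.append((a, b, 1))
        else:
            # Negative sample: one image each from two distinct identities.
            id_a, id_b = rng.sample(identities, 2)
            a = rng.choice(by_identity[id_a])
            b = rng.choice(by_identity[id_b])
            pairs.append((a, b, 0))
    return pairs
```

Each tuple supplies the two branch inputs together with the same/different label that the binary classification head is trained against.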
To achieve the above technical purpose, the invention adopts the following technical solution:
A pedestrian re-identification method based on a Siamese pedestrian alignment residual network, comprising the following steps:
S1, construct the base-branch Siamese residual network;
S1.1, construct the first base-branch deep residual network: using a transfer learning strategy, import residual network parameters pre-trained on the ImageNet data set as the initial parameters of the first base-branch deep residual network;
S1.2, obtain the second base-branch deep residual network by copying the model structure and parameters of the first base-branch deep residual network;
S1.3, compute the square of the difference between the feature vectors output by the two base-branch deep residual networks, and perform binary classification with a convolutional layer and a classifier to judge whether the inputs of the two base-branch deep residual networks are images of the same class;
S2, construct the pedestrian-alignment-branch Siamese residual network;
S2.1, construct the first pedestrian-alignment-branch deep residual network: take either the trained first or the trained second base-branch deep residual network and delete the residual block that regresses the high-dimensional feature map into a feature vector;
the resulting high-dimensional feature map is regressed by a residual block into the parameters of an affine transformation, and this affine transformation is applied to the output low-dimensional feature map to obtain the aligned pedestrian image;
from either the trained first or the trained second base-branch deep residual network, remove the residual block that produces the low-dimensional feature map used for the affine transformation, together with the blocks before it, and train the remaining network on the aligned pedestrian image;
S2.2, obtain the second pedestrian-alignment-branch deep residual network by copying the model structure and parameters of the first pedestrian-alignment-branch deep residual network;
S2.3, compute the square of the difference between the feature vectors output by the two pedestrian-alignment-branch deep residual networks, and perform binary classification with a convolutional layer and a classifier to judge whether the inputs of the two branch networks are images of the same class;
S3, train the parameters of the constructed base-branch Siamese residual network and pedestrian-alignment-branch Siamese residual network on the constructed training data set, then take the base branch prototype out of the trained base-branch Siamese residual network and the pedestrian alignment branch prototype out of the trained pedestrian-alignment-branch Siamese residual network as the classification model for pedestrian re-identification;
S3.1, using the constructed training data set, train the parameters of the base-branch Siamese residual network and of the pedestrian-alignment-branch Siamese residual network separately with batch gradient descent;
S3.2, after training, take out either branch of the base-branch Siamese residual network and either branch of the pedestrian-alignment-branch Siamese residual network as the pedestrian image classification models;
S4, construct the test samples and query samples;
S5, classify the test samples: feed the test samples into the trained base-branch deep residual network and the trained pedestrian-alignment-branch deep residual network for feature extraction;
S6, concatenate the features obtained from the two branch networks;
S7, compute Euclidean distances between the test sample images and the query sample images to obtain a ranked list;
S8, perform pedestrian re-identification on the basis of re-ranking.
Step S1.1 is as follows:
S1.1.1, output the 2048-dimensional vector f_1 obtained after average pooling;
S1.1.2, set the number of feature maps of the convolutional layer to the number of pedestrian classes n; the convolutional layer maps f_1 to an n-dimensional vector, and a fully connected classifier outputs the final class prediction;
S1.1.3, for the inputs and outputs of the base-branch Siamese residual network, define the first loss function:
where softmax denotes the classifier function, ∘ denotes the convolution operation, θ_I denotes the parameters of the convolutional layer used, t is the pedestrian class, and f is the feature vector obtained after feature extraction by the base-branch deep residual network. The classifier function outputs the probability that feature vector f belongs to pedestrian class t. For any image i and a pedestrian class t, p_i indicates whether image i belongs to class t: p_i = 1 if it does, otherwise p_i = 0. The corresponding predicted probability for image i is the value obtained after softmax processing.
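The formula for the first loss function does not appear in this text. A cross-entropy form consistent with the definitions above might read as follows; the symbols \hat{p} (the softmax output) and L_1 (the loss) are notation assumed here, not taken from the source:

```latex
\hat{p} = \mathrm{softmax}(\theta_I \circ f), \qquad
L_1(f, t, \theta_I) = -\sum_{i=1}^{n} p_i \log \hat{p}_i
```

Because p_i equals 1 only for the true class t, the sum reduces to -\log \hat{p}_t in this form.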
Step S1.3 is as follows:
S1.3.1, set up a square layer that takes the squared difference of the feature vectors f_1 and f_2 output by the two base-branch deep residual networks, giving f_s = (f_1 - f_2)^2;
S1.3.2, set up a convolutional layer with 2 feature maps, which maps f_s to a 2-dimensional vector;
S1.3.3, connect the output of S1.3.2 to a binary classifier to produce the final prediction, i.e. whether the input image pair comes from the same class;
S1.3.4, for an input image pair q of the same class or of different classes, define the second loss function:
where softmax denotes the classifier function, ∘ denotes the convolution operation, θ_S denotes the parameters of the convolutional layer used, s is one of the two classes "same" and "different", and f_s is the feature vector obtained after the square layer. The classifier function outputs the probability that f_s belongs to class s. f_1 and f_2 are the features extracted by the two base-branch deep residual networks that form the base-branch Siamese residual network. If f_1 and f_2 belong to the same person, q_1 = 1 and q_2 = 0; otherwise q_1 = 0 and q_2 = 1. After the convolutional layer and the softmax function, f_s is mapped to a two-dimensional vector whose components represent the probabilities that the two input images do or do not belong to the same pedestrian class.
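The formula for the second loss function is likewise missing from this text. A form consistent with the definitions above might be written as follows, where \hat{q} (the two-dimensional softmax output) and L_2 (the loss) are notation assumed here for illustration:

```latex
f_s = (f_1 - f_2)^2, \qquad
\hat{q} = \mathrm{softmax}(\theta_S \circ f_s), \qquad
L_2(f_1, f_2, s, \theta_S) = -\sum_{i=1}^{2} q_i \log \hat{q}_i
```

Since \hat{q} is a softmax output, \hat{q}_1 + \hat{q}_2 = 1 holds by construction.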
Step S2.1 is as follows:
S2.1.1, output the 2048-dimensional vector f_a obtained after average pooling;
S2.1.2, set the number of feature maps of the convolutional layer to the number of pedestrian classes n; the convolutional layer maps f_a to an n-dimensional vector, and a fully connected classifier outputs the final class prediction;
S2.1.3, for the inputs and outputs of the pedestrian-alignment-branch deep Siamese residual network, define the third loss function:
where softmax denotes the classifier function, ∘ denotes the convolution operation, θ_I denotes the parameters of the convolutional layer used, t is the pedestrian class, and f_a is the feature vector obtained after feature extraction by the pedestrian-alignment-branch deep residual network. The classifier function outputs the probability that the feature vector belongs to pedestrian class t. For any image i and a pedestrian class t, an indicator records whether image i belongs to class t: it is 1 if it does and 0 otherwise. The corresponding predicted probability for image i is the value obtained after softmax processing.
Step S2.3 is as follows:
S2.3.1, set up a square layer that takes the squared difference of the feature vectors output by the two pedestrian-alignment-branch deep residual networks, giving f_s;
S2.3.2, set up a convolutional layer with 2 feature maps, which maps f_s to a 2-dimensional vector;
S2.3.3, connect the output of S2.3.2 to a binary classifier to produce the final prediction, i.e. whether the input image pair comes from the same class;
S2.3.4, for an input image pair q of the same class or of different classes, define the fourth loss function:
where softmax denotes the classifier function, ∘ denotes the convolution operation, θ_S denotes the parameters of the convolutional layer used, s is one of the two classes "same" and "different", and f_s is the feature vector obtained after the square layer. The classifier function outputs the probability that this feature vector belongs to class s. The two input features are those extracted by the two pedestrian-alignment-branch deep residual networks that form the Siamese residual network. If the two features belong to the same person, the target vector is (1, 0); otherwise it is (0, 1). After the convolutional layer and the softmax function, f_s is mapped to a two-dimensional vector whose components represent the probabilities that the two input images do or do not belong to the same pedestrian class.
Compared with the prior art, the invention has the following advantages:
First, the invention uses both the identification model and the verification model of convolutional neural networks and effectively combines the advantages of both. The identification model extracts image features, while the verification model measures the similarity of an input image pair. Because the two models complement each other, the whole network learns more discriminative feature descriptions, which effectively avoids over-fitting.
Second, the invention uses a pedestrian alignment network that maps the key pedestrian features on the high-dimensional feature map onto the low-dimensional feature map, so that the whole neural network concentrates on learning pedestrian features from the very start. It also effectively reduces the interference caused by excess background and by partially missing pedestrians in the images, improving the accuracy of neural network recognition.
Third, when performing pedestrian re-identification, the invention uses the two groups of features from the base-branch deep residual network and the alignment-branch deep residual network together. Compared with using only the features extracted by the base-branch deep residual network, or only those extracted by the pedestrian-alignment-branch deep residual network, this further improves the precision of pedestrian re-identification.
Brief description of the drawings
Fig. 1 is a diagram of the network structure of the invention;
Fig. 2 is a flow chart of the steps of the invention.
Detailed description of the embodiments
The technical solution of the invention is described in further detail below with reference to the drawings.
Construct the base-branch Siamese residual network. Delete the fully connected layer of the residual network and add a convolutional layer and a classification layer to obtain the base-branch deep residual network prototype; then duplicate the network prototype and add a binary classification network to obtain the base-branch Siamese residual network.
Construct the pedestrian-alignment-branch Siamese residual network. On the trained base-branch deep residual network, delete the last residual block, add a grid network, and stack on top of it the base-branch deep residual network with its first residual block removed, to obtain the pedestrian-alignment-branch deep residual network prototype; then duplicate the network prototype and add a binary classification network to obtain the pedestrian-alignment-branch Siamese residual network.
Train the parameters of the base-branch Siamese residual network and the pedestrian-alignment-branch Siamese residual network on the constructed training data set, then take out the trained base branch prototype and pedestrian alignment branch prototype as the classification model for pedestrian re-identification.
In the training stage, the base-branch Siamese residual network is trained first. Two weight-sharing base-branch deep residual network prototypes form the base-branch Siamese residual network and extract features from the two images of each input image pair. The binary classification network computes the Euclidean distance between the resulting features and judges whether they belong to the same pedestrian class. The result is compared with the image labels and used to adjust the parameters of the entire base-branch Siamese residual network.
Next, the deep Siamese residual network formed by two weight-sharing alignment-branch deep residual networks is trained. An alignment-branch deep residual network is a trained base-branch deep residual network with a grid network added. The grid network generates the six parameters of the affine transformation used for pedestrian alignment; the input pedestrian image is passed through this affine transformation to obtain an aligned pedestrian image, from which features are then extracted. The Euclidean distance between the paired features is computed to judge whether they belong to the same class, and the result is used to adjust the parameters of the entire alignment-branch deep Siamese residual network.
In the test stage, one branch of the base-branch Siamese residual network and one branch of the alignment-branch deep Siamese residual network are each used for feature extraction. The two resulting features are fused and used to judge the pedestrian class.
Referring to Fig. 1, the invention is implemented through the following specific steps:
Step S1, construct the base-branch Siamese residual network:
S1.1, construct the first base-branch deep residual network: using a transfer learning strategy, import residual network parameters pre-trained on the ImageNet data set as the initial parameters of the first base-branch deep residual network;
S1.2, obtain the second base-branch deep residual network by copying the model structure and parameters of the first base-branch deep residual network;
S1.3, compute the square of the difference between the feature vectors output by the two base-branch deep residual networks, and perform binary classification with a convolutional layer and a classifier to judge whether the inputs of the two base-branch deep residual networks are images of the same class;
Step S2, construct the pedestrian-alignment-branch Siamese residual network:
S2.1, construct the first pedestrian-alignment-branch deep residual network: take the trained base-branch deep residual network and delete its last residual block;
S2.2, regress the output of the fourth residual block into six parameters via a residual block, use these as the parameters of an affine transformation, and apply the affine transformation to the image output by the second residual block to obtain the aligned pedestrian image;
S2.3, remove the first residual block from the trained base-branch deep residual network and train the remaining network on the aligned pedestrian image;
S2.4, obtain the second pedestrian-alignment-branch deep residual network by copying the model structure and parameters of the first pedestrian-alignment-branch deep residual network;
S2.5, compute the square of the difference between the feature vectors output by the two pedestrian-alignment-branch deep residual networks, and perform binary classification with a convolutional layer and a classifier to judge whether the inputs of the two branch networks are images of the same class;
Step S3, construct the training data set and train the base-branch Siamese residual network and the pedestrian-alignment-branch deep Siamese residual network in separate stages:
S3.1, using the constructed training data set, train the parameters of the base-branch Siamese residual network and of the pedestrian-alignment-branch deep Siamese residual network separately with batch gradient descent;
S3.2, after training, take out any one base-branch deep residual network and any one pedestrian-alignment-branch deep residual network as the pedestrian image classification models.
Step S4, construct the test samples and query samples.
Step S5, classify the test samples: feed the test samples into the base-branch deep residual network and the pedestrian-alignment-branch deep residual network for feature extraction.
Step S6, concatenate the features obtained from the two branch networks, each with a weight of 0.5.
Step S7, compute Euclidean distances between the test sample images and the query sample images to obtain a ranked list.
Step S8, perform pedestrian re-identification on the basis of re-ranking.
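Steps S6 and S7 can be illustrated with a small sketch. It assumes that "a weight of 0.5" means each branch feature is scaled by 0.5 before concatenation, which is one possible reading of the source; the function names `fuse`, `euclidean` and `rank_gallery` are hypothetical, and the re-ranking of step S8 is not shown because the source does not describe it.

```python
import math


def fuse(base_feat, align_feat, weight=0.5):
    """Concatenate the two branch features, scaling each part by its weight."""
    return ([weight * x for x in base_feat]
            + [(1.0 - weight) * x for x in align_feat])


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def rank_gallery(query_feat, gallery_feats):
    """Return gallery indices ordered from nearest to farthest from the query."""
    dists = [euclidean(query_feat, g) for g in gallery_feats]
    return sorted(range(len(gallery_feats)), key=lambda i: dists[i])
```

For example, `rank_gallery(fuse(q_base, q_align), [fuse(b, a) for b, a in gallery])` yields the ranked list for one query, with the most similar image first.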
The deep residual network structure described in step S1 can be divided into five residual blocks, structured as follows:
The input is the original pedestrian image, with 3 feature maps, i.e. the three color channels of the image;
Residual block one consists of four layers: a convolutional layer, a batch normalization layer, a rectified linear unit layer and a max pooling layer; it outputs 64 feature maps;
Residual blocks two to five each consist of multiple identity blocks; each identity block is an arrangement of three convolutional layers, batch normalization layers, rectified linear unit layers and one summation layer;
Residual block two contains three identity blocks and outputs 256 feature maps;
Residual block three contains four identity blocks and outputs 512 feature maps;
Residual block four contains six identity blocks and outputs 1024 feature maps;
Residual block five contains three identity blocks and outputs 2048 feature maps;
Average pooling is applied to the 2048 output feature maps to obtain a 2048-dimensional feature vector;
A fully connected layer and a classifier layer map the feature vector to a vector whose dimension equals the number of pedestrian classes.
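The block layout above can be written down as a small configuration table, which makes the layer counts easy to check. This is an illustrative sketch only; the names `BASE_BRANCH_BLOCKS`, `count_conv_layers` and `feature_dim` are hypothetical, and shortcut-path convolutions are not counted.

```python
# (name, number of identity blocks, output feature maps)
BASE_BRANCH_BLOCKS = [
    ("residual block 1", 0, 64),    # conv + batch norm + ReLU + max pooling
    ("residual block 2", 3, 256),
    ("residual block 3", 4, 512),
    ("residual block 4", 6, 1024),
    ("residual block 5", 3, 2048),
]

CONVS_PER_IDENTITY_BLOCK = 3  # each identity block holds three conv layers


def count_conv_layers(blocks):
    """Count the convolutional layers on the main path of the network."""
    stem_convs = 1  # the single convolutional layer in residual block 1
    return stem_convs + sum(n * CONVS_PER_IDENTITY_BLOCK for _, n, _ in blocks)


def feature_dim(blocks):
    """Dimension of the feature vector left after average pooling."""
    return blocks[-1][2]
```

With these numbers the main path has 49 convolutional layers, which together with the final classifier layer matches the familiar 50-layer residual network layout, although the source does not name a specific variant.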
Further, step S1.1 is as follows:
S1.1.1, output the 2048-dimensional vector f_1 obtained after average pooling;
S1.1.2, set the number of feature maps of the convolutional layer to the number of pedestrian classes n; the convolutional layer maps f_1 to an n-dimensional vector, and a fully connected classifier outputs the final class prediction;
S1.1.3, for the inputs and outputs of the base-branch Siamese residual network, define the first loss function:
where softmax denotes the classifier function, ∘ denotes the convolution operation, θ_I denotes the parameters of the convolutional layer used, t is the pedestrian class, and f is the feature vector obtained after feature extraction by the base-branch deep residual network. The classifier function outputs the probability that feature vector f belongs to pedestrian class t. For any image i and a pedestrian class t, p_i indicates whether image i belongs to class t: p_i = 1 if it does, otherwise p_i = 0. The corresponding predicted probability for image i is the value obtained after softmax processing.
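The identification loss described in S1.1.3 can be sketched numerically. The sketch assumes a cross-entropy loss, which fits the definitions but is not confirmed by the surviving text, and it replaces the convolutional layer with an equivalent per-class weight row, since a convolution applied to a 1 × 1 spatial feature is a linear map. All function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear_map(weights, feature):
    """Stand-in for the convolutional layer: one weight row per class."""
    return [sum(w * x for w, x in zip(row, feature)) for row in weights]


def identification_loss(weights, feature, target):
    """Cross-entropy between the one-hot target and the softmax output.

    With a one-hot target the sum over classes reduces to a single term.
    """
    probs = softmax(linear_map(weights, feature))
    return -math.log(probs[target])
```

The loss is small when the predicted probability of the true class is close to 1 and grows as that probability falls.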
Further, step S1.3 is as follows:
S1.3.1. Set up a square layer that takes the squared difference of the feature vectors f_1 and f_2 output by the two deep residual network models, giving f_s = (f_1 - f_2)^2;
S1.3.2. Set up a convolutional layer with 2 feature maps, which maps f_s to a 2-dimensional vector;
S1.3.3. Connect the output of S1.3.2 to a binary classifier to produce the final prediction, i.e. whether the input image pair comes from the same class;
S1.3.4. For an input image pair q (same class or different classes), define the second loss function:
where softmax denotes the classifier function, ∘ denotes the convolution operation, θ_S denotes the parameters of the convolutional layer used, s is one of the two classes "same" and "different", and f_s is the feature vector obtained after the square layer. The classifier function outputs the probability that f_s belongs to class s. f_1 and f_2 are the features extracted by the two base-branch deep residual networks that form the base-branch Siamese residual network. If f_1 and f_2 belong to the same person, q_1 = 1 and q_2 = 0; otherwise q_1 = 0 and q_2 = 1. After the convolutional layer and the softmax function, f_s is mapped to a two-dimensional vector whose components represent the probabilities that the two input images do or do not belong to the same pedestrian class.
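The verification head of S1.3.1 to S1.3.4 can be sketched in the same spirit. As before, the cross-entropy form of the loss is an assumption, the convolutional layer with 2 feature maps is modeled as two weight rows, and the function names are hypothetical.

```python
import math


def square_layer(f1, f2):
    """Element-wise squared difference f_s = (f_1 - f_2)^2."""
    return [(a - b) ** 2 for a, b in zip(f1, f2)]


def verification_probs(weights, f1, f2):
    """Map f_s to two probabilities: (same identity, different identity)."""
    fs = square_layer(f1, f2)
    scores = [sum(w * x for w, x in zip(row, fs)) for row in weights]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def verification_loss(weights, f1, f2, same):
    """Cross-entropy against the target (1, 0) if same, else (0, 1)."""
    q_hat = verification_probs(weights, f1, f2)
    return -math.log(q_hat[0] if same else q_hat[1])
```

Identical features give f_s = 0, so with any weights the two probabilities are equal at 0.5; the loss only becomes informative once the features differ.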
The deep residual network structure described in step S2 can be divided into eight residual blocks and one grid residual block, structured as follows:
The input is the original pedestrian image, with 3 feature maps, i.e. the three color channels of the image;
Residual block one consists of four layers: a convolutional layer, a batch normalization layer, a rectified linear unit layer and a max pooling layer; it outputs 64 feature maps;
Residual blocks two to four each consist of multiple identity blocks; each identity block is an arrangement of three convolutional layers, batch normalization layers, rectified linear unit layers and one summation layer;
Residual block two contains three identity blocks and outputs 256 feature maps;
Residual block three contains four identity blocks and outputs 512 feature maps;
Residual block four contains six identity blocks and outputs 1024 feature maps;
The grid network block contains three identity blocks and one average pooling layer, but its output is a six-dimensional set of transformation parameters, used to generate the image grid for pedestrian alignment;
The image output by residual block two is grid-aligned, and the resulting aligned pedestrian image has 256 feature maps;
Residual block five contains four identity blocks and outputs 512 feature maps;
Residual block six contains six identity blocks and outputs 1024 feature maps;
Residual block seven contains three identity blocks and outputs 2048 feature maps;
Average pooling is applied to the 2048 output feature maps to obtain a 2048-dimensional feature vector;
A fully connected layer and a classifier layer map the feature vector to a vector whose dimension equals the number of pedestrian classes.
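How six transformation parameters generate an image grid can be shown with a short sketch. It assumes the common convention of normalized coordinates in [-1, 1] and a parameter order (a, b, tx, c, d, ty), neither of which is stated in the source; the function name `affine_grid` is hypothetical, and the sampling step that reads pixel values at the grid positions is omitted.

```python
def affine_grid(theta, height, width):
    """For each output pixel, return the source coordinate to sample from.

    theta = (a, b, tx, c, d, ty) defines
        x_src = a * x + b * y + tx
        y_src = c * x + d * y + ty
    with x and y normalized to the range [-1, 1].
    """
    a, b, tx, c, d, ty = theta
    grid = []
    for row in range(height):
        y = -1.0 + 2.0 * row / (height - 1) if height > 1 else 0.0
        line = []
        for col in range(width):
            x = -1.0 + 2.0 * col / (width - 1) if width > 1 else 0.0
            line.append((a * x + b * y + tx, c * x + d * y + ty))
        grid.append(line)
    return grid
```

With theta = (1, 0, 0, 0, 1, 0) the grid is the identity mapping, while theta = (0.5, 0, 0, 0, 0.5, 0) samples only the central half of the feature map, which corresponds to zooming in on a pedestrian surrounded by excess background.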
Further, steps S2.1-S2.3 are as follows:
S2.1.1, output the 2048-dimensional vector f_a obtained after average pooling;
S2.1.2, set the number of feature maps of the convolutional layer to the number of pedestrian classes n; the convolutional layer maps f_a to an n-dimensional vector, and a fully connected classifier outputs the final class prediction;
S2.1.3, for the inputs and outputs of the pedestrian-alignment-branch Siamese residual network, define the third loss function:
where softmax denotes the classifier function, ∘ denotes the convolution operation, θ_I denotes the parameters of the convolutional layer used, t is the pedestrian class, and f_a is the feature vector obtained after feature extraction by the pedestrian-alignment-branch deep residual network. The classifier function outputs the probability that the feature vector belongs to pedestrian class t. For any image i and a pedestrian class t, an indicator records whether image i belongs to class t: it is 1 if it does and 0 otherwise. The corresponding predicted probability for image i is the value obtained after softmax processing.
Further, step S2.3 is specified as follows:
S2.3.1. Set up a square layer that takes the squared difference of the feature vectors $f_1$, $f_2$ output by the two deep residual network models, obtaining $f_s = (f_1 - f_2)^2$;
S2.3.2. Set up a convolutional layer with 2 feature maps, which maps $f_s$ to a 2-dimensional vector;
S2.3.3. Connect a fully connected binary classifier that generates the final prediction from the output of S2.3.2, i.e., whether the input image pair comes from the same class;
S2.3.4. For an input image pair with label q (same class or different classes), define the fourth loss function:
where softmax denotes a classifier function, $\circ$ denotes a convolution operation, $\theta_S$ denotes the parameters of the convolutional layer used, s denotes the two classes "same" and "different", $f_s$ is the feature vector obtained after the square layer, and $\hat{q}$ is the probability, output by the classifier function with parameters $\theta_S$, that the feature vector $f_s$ corresponds to the same pedestrian class. $f_1$, $f_2$ are the features extracted by the two base-branch deep residual networks that make up the base-branch twin residual network. If $f_1$, $f_2$ belong to the same person, $q_1 = 1$, $q_2 = 0$; otherwise $q_1 = 0$, $q_2 = 1$. After processing by the convolutional layer and the softmax function, $f_s$ is mapped to a two-dimensional vector $(\hat{q}_1, \hat{q}_2)$, which represents the probability that the two input images belong to the same pedestrian class, where $\hat{q}_1 + \hat{q}_2 = 1$.
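The verification step S2.3 can be sketched in a few lines of NumPy. This is a minimal illustration, not the patented implementation: the convolutional layer acting on a single 2048-dimensional vector is modelled as a plain 2 × 2048 matrix `theta_s`, the weights and features are random placeholders, and the cross-entropy form of the loss is an assumption, since the formula is not reproduced in the text.

```python
import numpy as np

def softmax(z):
    # Subtract the maximum for numerical stability before exponentiating.
    e = np.exp(z - np.max(z))
    return e / e.sum()

def verification_loss(f1, f2, theta_s, same_person):
    """Square layer -> 2-way linear map -> softmax -> cross-entropy."""
    f_s = (f1 - f2) ** 2                    # square layer: element-wise squared difference
    q_hat = softmax(theta_s @ f_s)          # 2-dimensional probability vector (q_hat_1, q_hat_2)
    q = np.array([1.0, 0.0]) if same_person else np.array([0.0, 1.0])
    loss = -np.sum(q * np.log(q_hat + 1e-12))   # cross-entropy (assumed loss form)
    return f_s, q_hat, loss

rng = np.random.default_rng(0)
f1 = rng.normal(size=2048)                          # placeholder feature of image 1
f2 = rng.normal(size=2048)                          # placeholder feature of image 2
theta_s = rng.normal(scale=0.01, size=(2, 2048))    # hypothetical layer weights

f_s, q_hat, loss = verification_loss(f1, f2, theta_s, same_person=True)
```

Because the two outputs pass through softmax, they always sum to 1, so a single probability of "same pedestrian" fully describes the prediction.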
Further, step S3.1 is specified as follows:
S3.1.1. Construct the training set: shuffle the order of the images in the training dataset and generate training pairs;
S3.1.2. Optimize the 4 loss functions of S1.1.3, S1.3.4, S2.1.3 and S2.3.4 using batch gradient descent;
S3.1.3. Set the weights of the 4 loss functions, λ1, λ2, λ3 and λ4 respectively;
S3.1.4. Determine the optimal weight values through a series of parameter-tuning experiments;
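How the four weights enter the objective is not spelled out in the text. A minimal sketch, assuming the overall objective is a weighted sum of the four losses, is given below; the function name `total_loss` and all numeric values are hypothetical and serve only as illustration.

```python
def total_loss(losses, weights):
    """Weighted sum of the four branch losses (an assumed combination rule)."""
    if len(losses) != len(weights):
        raise ValueError("need one weight per loss")
    return sum(w * l for w, l in zip(weights, losses))

# Hypothetical per-batch loss values, in the order S1.1.3, S1.3.4, S2.1.3, S2.3.4.
losses = [2.0, 0.5, 1.5, 0.25]
# Hypothetical weights lambda_1..lambda_4; the source determines them experimentally.
weights = [1.0, 0.5, 1.0, 0.5]
combined = total_loss(losses, weights)
```

With such a formulation, the tuning experiments of S3.1.4 amount to repeating training for several weight settings and keeping the one with the best validation result.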
Further, step S3.2 is specified as follows:
S3.2.1. Train until the 4 loss functions are minimized;
S3.2.2. Take out the trained deep residual network to serve as the classification model for the next step;
Step S4 constructs the test samples as follows:
S4-1. The remaining images in the pedestrian re-identification dataset are used as test samples;
S4-2. Each image in the test samples is cropped and resized to 224 × 224;
Step S5 is specified as follows:
S5-1. The classification model is a single deep residual network model, and the corresponding input is a single image;
S5-2. The evaluation criteria are overall accuracy and Rank-1 accuracy, i.e., respectively, the percentage of test samples that are classified correctly and the percentage of cases in which a pedestrian of the same class is correctly identified at the first rank.
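The Rank-1 criterion can be illustrated with a small NumPy sketch. It assumes features have already been extracted and ranks gallery images by Euclidean distance; the function name `rank1_accuracy` and the toy data are hypothetical, and common re-identification protocols add further rules (for example, excluding gallery images from the query's own camera) that are omitted here.

```python
import numpy as np

def rank1_accuracy(query_feats, query_ids, gallery_feats, gallery_ids):
    """Fraction of queries whose nearest gallery feature has the same identity."""
    # Pairwise Euclidean distances, shape (num_queries, num_gallery).
    diff = query_feats[:, None, :] - gallery_feats[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    nearest = dist.argmin(axis=1)           # index of the first-ranked gallery image
    return float(np.mean(gallery_ids[nearest] == query_ids))

# Toy 2-dimensional features standing in for the real descriptors.
gallery_feats = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
gallery_ids = np.array([10, 11, 12])
query_feats = np.array([[0.1, -0.1], [4.8, 5.1], [0.4, 0.3]])
query_ids = np.array([10, 12, 11])

acc = rank1_accuracy(query_feats, query_ids, gallery_feats, gallery_ids)
```

In this toy example the first two queries retrieve their own identity at rank 1, while the third lies closer to identity 10 than to its true identity 11, so two of three queries are counted as correct.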
The effect of the invention is further described below:
1. Experimental conditions:
The experiments of the invention were carried out in a hardware environment comprising an NVIDIA GTX 1080Ti GPU and an i7-8700K CPU, and in a software environment of MATLAB 2017.
The experiments of the invention used three pedestrian re-identification datasets: Market-1501, DukeMTMC and CUHK03.
The Market-1501 dataset was collected in front of a supermarket at Tsinghua University. Six cameras were used in total, including 5 high-resolution cameras and one low-resolution camera. The fields of view of different cameras overlap. Overall, the dataset contains 32,668 annotated bounding boxes of 1,501 identities. In this open system, the images of each identity are captured by at most six cameras. Each annotated identity is guaranteed to appear in at least two cameras, so that cross-camera search can be performed.
DukeMTMC provides a tracking system that works within and across cameras, a new large-scale high-definition video dataset recorded by 8 synchronized cameras, which contains more than 7,000 single-camera trajectories and more than 2,000 unique identities, and a new performance evaluation method.
CUHK03 contains 13,164 images of 1,360 pedestrians, and the entire dataset was captured by six surveillance cameras. Each identity is captured by two disjoint cameras. The dataset was collected at the Chinese University of Hong Kong, and the images of each identity come from 2 different cameras. The dataset provides two subsets, one with machine-detected bounding boxes and one with manually labeled bounding boxes. The detected subset contains some detection errors and is therefore closer to practical conditions. On average, each person has 9.6 training images.
2. Result analysis
In the simulation experiments of the invention, the three datasets are classified using the method of the invention, (1) a twin network without the pedestrian alignment network, and (2) a pedestrian alignment network without the twin structure, and the classification results are compared and analyzed.
Table 1 is a statistical table comparing the overall accuracy obtained in the experiments with the compared convolutional neural network models and with the method of the invention. In Table 1, "Data Set" indicates the pedestrian re-identification dataset used, "result" indicates the re-identification result, "Accuracy" indicates the classification accuracy, and Rank-1 indicates the probability that the first-ranked result is the correct pedestrian. "Verif+identif" denotes the twin network without the pedestrian alignment network, "Base+Align" denotes the pedestrian alignment network without the twin structure, and "(Base+Verif)+(Align+Verif)" denotes the method used by the invention.
Table 1. Comparison of pedestrian re-identification results
As can be seen from Table 1, the results of the method of the invention on all three datasets are superior to those of the other two methods.
Claims (4)
1. A pedestrian re-identification method based on a twin pedestrian alignment residual network, characterized by comprising the following steps:
S1. Construct the base-branch twin residual network;
S1.1. Construct a first base-branch deep residual network: using a transfer learning strategy, import residual network parameters pre-trained on the ImageNet dataset as the initial parameters of the first base-branch deep residual network;
S1.2. Obtain a second base-branch deep residual network by copying the model structure and parameters of the first base-branch deep residual network;
S1.3. Compute the square of the difference between the feature vectors output by the two base-branch deep residual networks, and perform binary classification with a convolutional layer and a classifier to judge whether the inputs of the two base-branch deep residual networks are images of the same class;
S2. Construct the pedestrian alignment branch twin residual network;
S2.1. Construct a first pedestrian alignment branch deep residual network: take either the trained first base-branch deep residual network or the trained second base-branch deep residual network, and delete the residual block used to regress the high-dimensional feature image into a feature vector;
Regress the output high-dimensional feature image, via a residual block, into the parameters of an affine transformation, and apply the affine transformation to the output low-dimensional feature image to obtain the aligned pedestrian image;
Take either the trained first base-branch deep residual network or the trained second base-branch deep residual network, delete the residual block that produces the low-dimensional feature image subjected to the affine transformation together with the residual blocks before it, and use the remaining network for training on the obtained aligned pedestrian image;
S2.2. Copy the model structure and parameters of the first pedestrian alignment branch deep residual network to obtain a second pedestrian alignment branch deep residual network;
S2.3. Compute the square of the difference between the feature vectors output by the two pedestrian alignment branch deep residual networks, and perform binary classification with a convolutional layer and a classifier to judge whether the inputs of the two branch networks are images of the same class;
S3. Using the constructed training dataset, train the parameters of the constructed base-branch twin residual network and pedestrian alignment branch twin residual network; take out the base-branch model from the trained base-branch twin residual network and the pedestrian alignment branch model from the trained pedestrian alignment branch twin residual network as the classification models for pedestrian re-identification;
S3.1. Using the constructed training dataset, train the parameters of the constructed base-branch twin residual network and pedestrian alignment branch twin residual network respectively with batch gradient descent;
S3.2. After the parameters have been trained, take out any one branch of the base-branch twin residual network and any one branch of the pedestrian alignment branch twin residual network to serve as the pedestrian image classification models;
S4. Construct the test samples and query samples;
S5. Test sample classification: feed the test samples into the trained base-branch deep residual network and the trained pedestrian alignment branch deep residual network respectively for feature extraction;
S6. Concatenate the features obtained by the two branch networks;
S7. Compute the Euclidean distances between the images of the test samples and the images of the query samples to obtain a ranked list;
S8. Perform pedestrian re-identification on the basis of re-ranking.
2. The pedestrian re-identification method based on a twin pedestrian alignment residual network according to claim 1, characterized in that step S1.1 is specified as follows:
S1.1.1. Output the 2048-dimensional vector $f_1$ obtained after average pooling;
S1.1.2. Set the number of feature maps of the convolutional layer to the number of pedestrian classes n; the convolutional layer maps $f_1$ to an n-dimensional vector, and the final class prediction is output by a fully connected classifier;
S1.1.3. For the input and output of the base-branch twin residual network, define the first loss function:
where softmax denotes a classifier function, $\circ$ denotes a convolution operation, $\theta_I$ denotes the parameters of the convolutional layer used, t is the pedestrian class, f is the feature vector obtained after feature extraction by the base-branch deep residual network, and $\hat{p}$ is the probability, output by the classifier function, that the feature vector f belongs to a given pedestrian class t; for any image i and a given pedestrian class t, $p_i$ indicates whether image i belongs to class t: if it does, $p_i = 1$, otherwise $p_i = 0$; $\hat{p}_i$ is the probability value obtained for image i after processing by the softmax function.
3. The pedestrian re-identification method based on a twin pedestrian alignment residual network according to claim 1, characterized in that step S1.3 is specified as follows:
S1.3.1. Set up a square layer that takes the squared difference of the feature vectors $f_1$, $f_2$ output by the two base-branch deep residual networks, obtaining $f_s = (f_1 - f_2)^2$;
S1.3.2. Set up a convolutional layer with 2 feature maps, which maps $f_s$ to a 2-dimensional vector;
S1.3.3. Connect a fully connected binary classifier that generates the final prediction from the output of S1.3.2, i.e., whether the input image pair comes from the same class;
S1.3.4. For an input image pair with label q (same class or different classes), define the second loss function:
where softmax denotes a classifier function, $\circ$ denotes a convolution operation, $\theta_S$ denotes the parameters of the convolutional layer used, s denotes the two classes "same" and "different", $f_s$ is the feature vector obtained after the square layer, and $\hat{q}$ is the probability, output by the classifier function with parameters $\theta_S$, that the feature vector $f_s$ corresponds to the same pedestrian class; $f_1$, $f_2$ are the features extracted by the two base-branch deep residual networks that make up the base-branch twin residual network; if $f_1$, $f_2$ belong to the same person, $q_1 = 1$, $q_2 = 0$; otherwise $q_1 = 0$, $q_2 = 1$; after processing by the convolutional layer and the softmax function, $f_s$ is mapped to a two-dimensional vector $(\hat{q}_1, \hat{q}_2)$, which represents the probability that the two input images belong to the same pedestrian class, where $\hat{q}_1 + \hat{q}_2 = 1$.
4. The pedestrian re-identification method based on a twin pedestrian alignment residual network according to claim 1, characterized in that step S2.1 is specified as follows:
S2.1.1. Output the 2048-dimensional vector $f_a$ obtained after average pooling;
S2.1.2. Set the number of feature maps of the convolutional layer to the number of pedestrian classes n; the convolutional layer maps $f_a$ to an n-dimensional vector, and the final class prediction is output by a fully connected classifier;
S2.1.3. For the input and output of the pedestrian alignment branch twin residual network, define the third loss function:
where softmax denotes a classifier function, $\circ$ denotes a convolution operation, $\theta_I$ denotes the parameters of the convolutional layer used, t is the pedestrian class, $f_a$ is the feature vector obtained after feature extraction by the base-branch deep residual network, and $\hat{p}$ is the probability, output by the classifier function, that the feature vector belongs to a given pedestrian class t; for any image i and a given pedestrian class t, $p_i$ indicates whether image i belongs to class t: if it does, $p_i = 1$, otherwise $p_i = 0$; $\hat{p}_i$ is the probability value obtained for image i after processing by the softmax function.
Step S2.3 is specified as follows:
S2.3.1. Set up a square layer that takes the squared difference of the feature vectors $f_1$, $f_2$ output by the two pedestrian alignment branch deep residual networks, obtaining $f_s = (f_1 - f_2)^2$;
S2.3.2. Set up a convolutional layer with 2 feature maps, which maps $f_s$ to a 2-dimensional vector;
S2.3.3. Connect a fully connected binary classifier that generates the final prediction from the output of S2.3.2, i.e., whether the input image pair comes from the same class;
S2.3.4. For an input image pair with label q (same class or different classes), define the fourth loss function:
where softmax denotes a classifier function, $\circ$ denotes a convolution operation, $\theta_S$ denotes the parameters of the convolutional layer used, s denotes the two classes "same" and "different", $f_s$ is the feature vector obtained after the square layer, and $\hat{q}$ is the probability, output by the classifier function with parameters $\theta_S$, that the feature vector $f_s$ corresponds to the same pedestrian class; $f_1$, $f_2$ are the features extracted by the two base-branch deep residual networks that make up the base-branch twin residual network; if $f_1$, $f_2$ belong to the same person, $q_1 = 1$, $q_2 = 0$; otherwise $q_1 = 0$, $q_2 = 1$; after processing by the convolutional layer and the softmax function, $f_s$ is mapped to a two-dimensional vector $(\hat{q}_1, \hat{q}_2)$, which represents the probability that the two input images belong to the same pedestrian class, where $\hat{q}_1 + \hat{q}_2 = 1$.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810876899.1A CN109063649B (en) | 2018-08-03 | 2018-08-03 | Pedestrian re-identification method based on twin pedestrian alignment residual error network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109063649A (en) | 2018-12-21 |
CN109063649B (en) | 2021-05-14 |
Family
ID=64833110
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810876899.1A (Active, CN109063649B) | Pedestrian re-identification method based on twin pedestrian alignment residual error network | 2018-08-03 | 2018-08-03 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109063649B (en) |
Also Published As
Publication number | Publication date |
---|---|
CN109063649B (en) | 2021-05-14 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||