CN109063649A - Pedestrian re-identification method based on a twin pedestrian-alignment residual network - Google Patents

Pedestrian re-identification method based on a twin pedestrian-alignment residual network

Info

Publication number
CN109063649A
CN109063649A (application number CN201810876899.1A)
Authority
CN
China
Prior art keywords
pedestrian
residual network
branch
twin
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810876899.1A
Other languages
Chinese (zh)
Other versions
CN109063649B (en)
Inventor
周勇
郑沂
赵佳琦
姚睿
刘兵
夏士雄
刘栩宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China University of Mining and Technology CUMT
Original Assignee
China University of Mining and Technology CUMT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China University of Mining and Technology CUMT filed Critical China University of Mining and Technology CUMT
Priority to CN201810876899.1A priority Critical patent/CN109063649B/en
Publication of CN109063649A publication Critical patent/CN109063649A/en
Application granted granted Critical
Publication of CN109063649B publication Critical patent/CN109063649B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a pedestrian re-identification method based on a twin pedestrian-alignment residual network, comprising the following steps: S1, construct a basic-branch twin residual network; S2, construct a pedestrian-alignment-branch twin residual network; S3, train the parameters of the constructed basic-branch twin network and pedestrian-alignment-branch twin residual network on a constructed training data set, then take the basic branch prototype from the trained basic-branch twin residual network and the pedestrian-alignment branch prototype from the trained pedestrian-alignment-branch twin residual network as the classification model for pedestrian re-identification. The invention improves the accuracy of pedestrian re-identification compared with the original algorithm.

Description

Pedestrian re-identification method based on a twin pedestrian-alignment residual network
Technical field
The invention belongs to the technical field of image retrieval, in which computer vision techniques are used to judge whether a specific pedestrian is present in an image or video sequence; it further relates to a pedestrian re-identification method based on a twin pedestrian-alignment residual network within the field of pedestrian re-identification.
Background technique
In surveillance video, because of background occlusion and the low resolution caused by pedestrians being far from the camera, images suitable for face recognition often cannot be obtained. When face recognition cannot be applied under normal conditions, pedestrian re-identification becomes a very important substitute technology. A key characteristic of pedestrian re-identification is that it works across cameras, so when performance is evaluated in academic papers the task is to retrieve images of the same pedestrian captured by different cameras. Pedestrian re-identification has been studied for many years, but only with the development of deep learning in recent years has it achieved major breakthroughs.
Traditional image-based pedestrian re-identification algorithms can be roughly divided into the following classes according to how features are represented:
(1) Low-level visual features: these methods generally divide an image into multiple regions, extract several different low-level visual features from each region, and combine them into a more robust feature representation; the most common example is the color histogram (a minimal sketch of such a feature is given after this list);
(2) Mid-level semantic attributes: semantic information such as color, clothing and carried bags is used to judge whether two images show the same pedestrian, since the semantic attributes of the same pedestrian change little across different cameras;
(3) High-level visual features: feature selection techniques are used to improve the re-identification rate. The biggest difference between deep-learning-based re-identification methods and conventional methods is that no hand-crafted features are needed; the various features of pedestrian images are learned automatically through end-to-end training.
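As an illustration of the low-level visual feature mentioned in item (1), the following is a minimal sketch of a per-region color-histogram descriptor; the grid size and bin count are arbitrary choices for the example and are not taken from the patent.

```python
import numpy as np

def color_histogram_descriptor(image, rows=4, cols=2, bins=8):
    """Split an HxWx3 uint8 image into a rows x cols grid and
    concatenate a per-channel color histogram from each cell."""
    h, w, _ = image.shape
    feats = []
    for r in range(rows):
        for c in range(cols):
            cell = image[r * h // rows:(r + 1) * h // rows,
                         c * w // cols:(c + 1) * w // cols]
            for ch in range(3):  # one histogram per color channel
                hist, _ = np.histogram(cell[..., ch], bins=bins, range=(0, 256))
                feats.append(hist / max(hist.sum(), 1))  # normalize each histogram
    return np.concatenate(feats)  # rows * cols * 3 * bins dimensional descriptor
```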
Therefore, in the field of pedestrian re-identification, when features are selected manually the number of candidate features is large and the images actually captured by cameras are highly varied, so it is difficult to find a specific feature that performs well on all images. Compared with manually chosen pedestrian features, methods based on deep learning models can therefore achieve better results.
Existing deep learning models mainly belong to the family of convolutional neural networks; commonly used models include CaffeNet, VGGNet and residual networks (ResNet).
Summary of the invention
The invention proposes a pedestrian re-identification method based on a twin pedestrian-alignment residual network. It can effectively improve the precision of the whole network and raise the accuracy of pedestrian re-identification.
At the same time, the network adopts a two-channel (twin) structure: pedestrian images are input in pairs, and the image pairs include both same-class and different-class images, providing positive and negative sample feedback so that the network learns discriminative features.
To achieve the above technical purposes, the present invention adopts the following technical solution:
A pedestrian re-identification method based on a twin pedestrian-alignment residual network, comprising the following steps:
S1, construct the basic-branch twin residual network;
S1.1, construct the first basic-branch deep residual network: using a transfer learning strategy, import the parameters of a residual network pre-trained on the ImageNet data set as the base parameters of the first basic-branch deep residual network;
S1.2, copy the model structure and parameters of the first basic-branch deep residual network to obtain the second basic-branch deep residual network;
S1.3, compute the square of the difference between the feature vectors output by the two basic-branch deep residual networks, and perform binary classification with a convolutional layer and a classifier to judge whether the inputs of the two basic-branch deep residual networks are images of the same class (a minimal sketch of this twin-branch structure is given after this step list);
S2, construct the pedestrian-alignment-branch twin residual network;
S2.1, construct the first pedestrian-alignment-branch deep residual network: take either of the trained first or second basic-branch deep residual networks and delete the residual block that reduces the high-dimensional feature map to a feature vector;
regress the output high-dimensional feature map through a residual block into the parameters of an affine transformation, apply the affine transformation to the output low-dimensional feature map, and obtain the aligned pedestrian image;
from either of the trained first or second basic-branch deep residual networks, remove the residual blocks preceding the low-dimensional feature map that undergoes the affine transformation, and use the remaining network to train on the aligned pedestrian images;
S2.2, copy the model structure and parameters of the first pedestrian-alignment-branch deep residual network to obtain the second pedestrian-alignment-branch deep residual network;
S2.3, compute the square of the difference between the feature vectors output by the two pedestrian-alignment-branch deep residual networks, and perform binary classification with a convolutional layer and a classifier to judge whether the inputs of the two branch networks are images of the same class;
S3, use the constructed training data set to train the parameters of the basic-branch twin network and the pedestrian-alignment-branch twin residual network, and take the basic branch prototype from the trained basic-branch twin residual network and the pedestrian-alignment branch prototype from the trained pedestrian-alignment-branch twin residual network as the classification model for pedestrian re-identification;
S3.1, using the constructed training data set, train the parameters of the constructed basic-branch twin residual network and pedestrian-alignment-branch twin residual network separately with batch gradient descent;
S3.2, after the parameters are trained, take out any one branch of the basic-branch twin residual network and any one branch of the pedestrian-alignment-branch twin residual network as the pedestrian image classification models;
S4, construct the test samples and the query samples;
S5, test sample classification: feed the test samples into the trained basic-branch deep residual network and the pedestrian-alignment-branch deep residual network for feature extraction;
S6, concatenate the features obtained from the two branch networks;
S7, compute the Euclidean distance between the images of the test samples and the query samples to obtain a sorted list;
S8, perform pedestrian re-identification on the basis of re-ranking.
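A minimal sketch of the twin construction of steps S1.1-S1.3, using PyTorch as an assumed framework; weight sharing is implemented by reusing one branch module, which stands in for copying the structure and parameters, and the class and attribute names are illustrative, not taken from the patent.

```python
import torch
import torch.nn as nn
import torchvision

class BasicBranch(nn.Module):
    """One basic branch: an ImageNet-pretrained ResNet-50 trunk (transfer learning)
    whose average-pooled output is the 2048-dim feature vector f."""
    def __init__(self):
        super().__init__()
        resnet = torchvision.models.resnet50(pretrained=True)
        self.trunk = nn.Sequential(*list(resnet.children())[:-1])  # drop the FC layer

    def forward(self, x):
        return self.trunk(x).flatten(1)  # (batch, 2048)

class TwinBasicNetwork(nn.Module):
    """Two weight-sharing branches plus a square layer and a 2-class verification head."""
    def __init__(self, num_classes):
        super().__init__()
        self.branch = BasicBranch()                     # shared by both inputs
        self.identif = nn.Conv2d(2048, num_classes, 1)  # identification head (n classes)
        self.verif = nn.Conv2d(2048, 2, 1)              # verification head (2 classes)

    def forward(self, img1, img2):
        f1, f2 = self.branch(img1), self.branch(img2)
        fs = (f1 - f2) ** 2                             # square layer
        id1 = self.identif(f1[:, :, None, None]).flatten(1)
        id2 = self.identif(f2[:, :, None, None]).flatten(1)
        same = self.verif(fs[:, :, None, None]).flatten(1)
        return id1, id2, same
```

In training, the two identification outputs and the verification output would be fed to the loss functions defined in the following steps.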
Step S1.1 is specified as follows:
S1.1.1, output the 2048-dimensional vector f1 obtained after average pooling;
S1.1.2, set the number of feature maps of a convolutional layer to the number of pedestrian classes n; the convolutional layer maps f1 to an n-dimensional vector, and a fully connected classifier outputs the final class prediction;
S1.1.3, for the input and output of the basic-branch twin residual network, define the first loss function:
Identif(f, t, θ_I) = -Σ_i p_i log(q̂_i), with q̂ = softmax(θ_I ∘ f),
where softmax denotes the classifier function, ∘ denotes a convolution operation, θ_I denotes the parameters of the convolutional layer used, t is the pedestrian class, f is the feature vector obtained after feature extraction by the basic-branch deep residual network, and q̂ is the probability output by the classifier function that the feature vector f belongs to a certain pedestrian class t; for any image i and pedestrian class t, p_i indicates whether image i belongs to class t (p_i = 1 if it does, otherwise p_i = 0), and q̂_i is the probability value obtained for image i after the softmax function.
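A minimal sketch of the first loss function under the assumption that it is the standard softmax cross-entropy over pedestrian identities (PyTorch; the function name and the 751-class example are illustrative).

```python
import torch
import torch.nn.functional as F

def identification_loss(logits, labels):
    """Cross-entropy between the softmax of the n-dimensional class scores
    (theta_I convolved with the 2048-dim feature f) and the true identity t."""
    return F.cross_entropy(logits, labels)

# example: a batch of 8 images and 751 identities (e.g. Market-1501 training classes)
logits = torch.randn(8, 751)
labels = torch.randint(0, 751, (8,))
loss = identification_loss(logits, labels)
```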
Step S1.3 is specified as follows:
S1.3.1, set up a square layer that takes the squared difference of the feature vectors f1 and f2 output by the two basic-branch deep residual networks, giving fs = (f1 - f2)^2;
S1.3.2, set up a convolutional layer with 2 feature maps that maps fs to a 2-dimensional vector;
S1.3.3, fully connect a two-class classifier to the output of S1.3.2 to produce the final prediction, i.e. whether the input image pair comes from the same class;
S1.3.4, for a same-class or different-class input image pair q, define the second loss function:
Verif(f1, f2, s, θ_S) = -Σ_{i=1,2} q_i log(q̂_i), with q̂ = softmax(θ_S ∘ fs),
where softmax denotes the classifier function, ∘ denotes a convolution operation, θ_S denotes the parameters of the convolutional layer used, s indicates the two classes "same" and "not same", fs is the feature vector obtained by convolution after the square layer, q̂ is the probability output by the classifier function that fs corresponds to the same pedestrian, and f1, f2 are the features extracted by the two basic-branch deep residual networks that form the basic-branch twin residual network. If f1 and f2 belong to the same person, q1 = 1 and q2 = 0; otherwise q1 = 0 and q2 = 1. After the convolutional layer and the softmax function, fs is mapped to a two-dimensional vector (q̂1, q̂2) representing the probability that the two input images belong to the same pedestrian class, where q̂1 + q̂2 = 1.
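A minimal sketch of the square layer and the second (verification) loss, again assuming a standard two-class cross-entropy formulation (PyTorch; the module and variable names are illustrative).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VerificationHead(nn.Module):
    """Square layer followed by a 1x1 convolution with 2 feature maps (theta_S)."""
    def __init__(self, feat_dim=2048):
        super().__init__()
        self.conv = nn.Conv2d(feat_dim, 2, kernel_size=1)

    def forward(self, f1, f2):
        fs = (f1 - f2) ** 2                                 # square layer: fs = (f1 - f2)^2
        return self.conv(fs[:, :, None, None]).flatten(1)   # 2-dimensional score vector

def verification_loss(scores, same):
    """same = 1 if the pair shows one person, 0 otherwise (the q1/q2 one-hot target)."""
    return F.cross_entropy(scores, same)

head = VerificationHead()
f1, f2 = torch.randn(8, 2048), torch.randn(8, 2048)
same = torch.randint(0, 2, (8,))
loss = verification_loss(head(f1, f2), same)
```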
Step S2.1 is specified as follows:
S2.1.1, output the 2048-dimensional vector fa obtained after average pooling;
S2.1.2, set the number of feature maps of a convolutional layer to the number of pedestrian classes n; the convolutional layer maps fa to an n-dimensional vector, and a fully connected classifier outputs the final class prediction;
S2.1.3, for the input and output of the pedestrian-alignment-branch deep twin residual network, define the third loss function:
Identif(fa, t, θ_I) = -Σ_i p_i log(q̂_i), with q̂ = softmax(θ_I ∘ fa),
where softmax denotes the classifier function, ∘ denotes a convolution operation, θ_I denotes the parameters of the convolutional layer used, t is the pedestrian class, fa is the feature vector obtained after feature extraction by the alignment-branch deep residual network, and q̂ is the probability output by the classifier function that fa belongs to a certain pedestrian class t; for any image i and pedestrian class t, p_i indicates whether image i belongs to class t (p_i = 1 if it does, otherwise p_i = 0), and q̂_i is the probability value obtained for image i after the softmax function.
Step S2.3 is specified as follows:
S2.3.1, set up a square layer that takes the squared difference of the feature vectors fa1 and fa2 output by the two pedestrian-alignment-branch deep residual networks, giving fs_a = (fa1 - fa2)^2;
S2.3.2, set up a convolutional layer with 2 feature maps that maps fs_a to a 2-dimensional vector;
S2.3.3, fully connect a two-class classifier to the output of S2.3.2 to produce the final prediction, i.e. whether the input image pair comes from the same class;
S2.3.4, for a same-class or different-class input image pair q, define the fourth loss function:
Verif(fa1, fa2, s, θ_S) = -Σ_{i=1,2} q_i log(q̂_i), with q̂ = softmax(θ_S ∘ fs_a),
where softmax denotes the classifier function, ∘ denotes a convolution operation, θ_S denotes the parameters of the convolutional layer used, s indicates the two classes "same" and "not same", fs_a is the feature vector obtained by convolution after the square layer, q̂ is the probability output by the classifier function that fs_a corresponds to the same pedestrian, and fa1, fa2 are the features extracted by the two branch networks that form the pedestrian-alignment-branch twin residual network. If fa1 and fa2 belong to the same person, q1 = 1 and q2 = 0; otherwise q1 = 0 and q2 = 1. After the convolutional layer and the softmax function, fs_a is mapped to a two-dimensional vector (q̂1, q̂2) representing the probability that the two input images belong to the same pedestrian class, where q̂1 + q̂2 = 1.
Compared with the prior art, the present invention has the following advantages:
First, the invention adopts both the identification model and the verification model of convolutional neural networks and effectively combines the advantages of the two: the identification model extracts image features, while the verification model checks the similarity of the input image pair. The complementarity of the two models lets the whole network learn more discriminative feature descriptions and effectively avoids overfitting.
Second, the invention uses a pedestrian-alignment network that maps the key pedestrian features from the high-dimensional feature map onto the low-dimensional feature map, so that the whole neural network focuses on pedestrian features from the very beginning; at the same time, this effectively reduces the interference caused by excess background and by partially missing pedestrian regions in the images, improving the recognition accuracy of the neural network.
Third, when performing pedestrian re-identification the invention uses the two groups of features from the basic-branch deep residual network and the alignment-branch deep residual network simultaneously; compared with using only the features extracted by the basic-branch deep residual network or only the features extracted by the pedestrian-alignment-branch deep residual network, this method further improves the precision of pedestrian re-identification.
Detailed description of the invention
Fig. 1 is the network structure diagram of the invention;
Fig. 2 is the step diagram of the invention.
Specific embodiment
The technical solution of the present invention is described in further detail below with reference to the accompanying drawings.
S1, construct the basic-branch twin residual network: delete the fully connected layer of the residual network and add a convolutional layer and a classification layer to obtain the basic-branch deep residual network prototype; copy the network prototype and add two classification networks to obtain the basic-branch twin residual network.
Construct the pedestrian-alignment-branch twin residual network: on the trained basic-branch deep residual network, delete the last residual block, add a grid network, and stack a basic-branch deep residual network with its first residual block removed, obtaining the pedestrian-alignment-branch deep residual network prototype; copy the network prototype and add two classification networks to obtain the pedestrian-alignment-branch twin residual network.
Use the constructed training data set to train the parameters of the basic-branch twin network and the pedestrian-alignment-branch twin residual network, then take the trained basic branch prototype and pedestrian-alignment branch prototype as the classification model for pedestrian re-identification.
In the training stage, the basic-branch twin residual network is trained first: two weight-sharing basic-branch deep residual network prototypes form the basic-branch twin residual network and extract features from the two images of each input image pair; the two classification networks then perform a Euclidean-distance calculation on the obtained features to judge whether the two images belong to the same pedestrian class, the result is compared with the image labels, and the comparison is used to adjust the parameters of the whole basic-branch twin residual network.
Then the deep twin residual network formed by two weight-sharing alignment-branch deep residual networks is trained. The alignment-branch deep residual network adds a grid network to the trained basic-branch deep residual network; its role is to generate the six affine-transformation parameters used for pedestrian alignment. The input pedestrian image is aligned by the affine transformation, features are then extracted from the aligned image, the Euclidean distance of the resulting feature pair is computed to judge whether the pair belongs to the same class, and the parameters of the whole alignment-branch deep twin residual network are adjusted accordingly.
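A minimal sketch of the alignment step, assuming it follows the usual spatial-transformer pattern: a small grid network regresses the six affine parameters and the feature map is resampled with an affine grid (PyTorch; the regressor layers shown are illustrative, not the patent's exact grid-network configuration).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GridNetwork(nn.Module):
    """Regresses the six parameters of a 2x3 affine matrix from a feature map
    and applies the resulting affine transformation to align the pedestrian."""
    def __init__(self, in_channels=256):
        super().__init__()
        self.regressor = nn.Sequential(
            nn.Conv2d(in_channels, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, 6),
        )
        # start from the identity transform so early training is stable
        self.regressor[-1].weight.data.zero_()
        self.regressor[-1].bias.data.copy_(torch.tensor([1., 0., 0., 0., 1., 0.]))

    def forward(self, feat):
        theta = self.regressor(feat).view(-1, 2, 3)            # six affine parameters
        grid = F.affine_grid(theta, feat.size(), align_corners=False)
        return F.grid_sample(feat, grid, align_corners=False)  # aligned feature map
```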
In the test stage, one branch of the basic-branch twin residual network and one branch of the alignment-branch deep twin residual network are used for feature extraction, and the two kinds of features obtained are fused and used to judge the pedestrian class.
Referring to Fig. 1, the specific steps of the invention are as follows:
Step S1, construct the basic-branch twin residual network:
S1.1, construct the first basic-branch deep residual network: using a transfer learning strategy, import the parameters of a residual network pre-trained on the ImageNet data set as the base parameters of the first basic-branch deep residual network;
S1.2, copy the model structure and parameters of the first basic-branch deep residual network to obtain the second basic-branch deep residual network;
S1.3, compute the square of the difference between the feature vectors output by the two basic-branch deep residual networks, and perform binary classification with a convolutional layer and a classifier to judge whether the inputs of the two basic-branch deep residual networks are images of the same class;
Step S2, construct the pedestrian-alignment-branch twin residual network:
S2.1, construct the first pedestrian-alignment-branch deep residual network: take the trained basic-branch deep residual network and delete its last residual block;
S2.2, regress the output of the fourth residual block through a residual block into six parameters, use them as the parameters of an affine transformation, and apply the affine transformation to the image output by the second residual block to obtain the aligned pedestrian image;
S2.3, remove the first residual block from the trained basic-branch deep residual network and use the remaining network to train on the aligned pedestrian images;
S2.4, copy the model structure and parameters of the first pedestrian-alignment-branch deep residual network to obtain the second pedestrian-alignment-branch deep residual network;
S2.5, compute the square of the difference between the feature vectors output by the two pedestrian-alignment-branch deep residual networks, and perform binary classification with a convolutional layer and a classifier to judge whether the inputs of the two branch networks are images of the same class;
Step S3, construct the training data set and train the basic-branch twin residual network and the pedestrian-alignment-branch deep twin residual network step by step:
S3.1, using the constructed training data set, train the parameters of the basic-branch twin residual network and the pedestrian-alignment-branch deep twin residual network separately with batch gradient descent;
S3.2, after the parameters are trained, take out any one basic-branch deep residual network and any one pedestrian-alignment-branch deep residual network model as the pedestrian image classification models.
Step S4, construct the test samples and the query samples.
Step S5, test sample classification: feed the test samples into the basic-branch deep residual network and the pedestrian-alignment-branch deep residual network for feature extraction.
Step S6, concatenate the features obtained from the two branch networks with a weight of 0.5 for each.
Step S7, compute the Euclidean distance between the images of the test samples and the query samples to obtain a sorted list.
Step S8, perform pedestrian re-identification on the basis of re-ranking.
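A minimal sketch of steps S6-S7: weighted concatenation of the two branch features followed by Euclidean-distance ranking of the gallery for each query (numpy; the 0.5 weight comes from step S6, everything else is illustrative).

```python
import numpy as np

def fuse_features(f_basic, f_align, weight=0.5):
    """Concatenate the basic-branch and alignment-branch features with equal weight."""
    return np.concatenate([weight * f_basic, weight * f_align], axis=1)

def rank_gallery(query_feats, gallery_feats):
    """Return, for every query, gallery indices sorted by ascending Euclidean distance."""
    dists = np.linalg.norm(query_feats[:, None, :] - gallery_feats[None, :, :], axis=2)
    return np.argsort(dists, axis=1)   # a re-ranking step (step S8) can refine this list

# example shapes: 10 queries, 100 gallery images, 2048-dim features per branch
q = fuse_features(np.random.rand(10, 2048), np.random.rand(10, 2048))
g = fuse_features(np.random.rand(100, 2048), np.random.rand(100, 2048))
ranked = rank_gallery(q, g)
```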
The deep residual network structure described in step S1 can be divided into five residual blocks, whose structures are as follows:
the input is the original pedestrian image with 3 feature maps, i.e. the three color channels of the image;
residual block one consists of four layers: a convolutional layer, a batch normalization layer, a linear rectification (ReLU) layer and a max pooling layer, with 64 output feature maps;
residual blocks two to five each consist of several identity blocks; each identity block is built from three convolutional layers, batch normalization layers, linear rectification layers and one summation layer;
residual block two contains three identity blocks and outputs 256 feature maps;
residual block three contains four identity blocks and outputs 512 feature maps;
residual block four contains six identity blocks and outputs 1024 feature maps;
residual block five contains three identity blocks and outputs 2048 feature maps;
average pooling is applied to the 2048-channel output feature map to obtain a 2048-dimensional feature vector;
a fully connected layer and a classifier layer map the feature vector to a vector whose dimension equals the number of pedestrian classes.
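The five residual blocks described above match the standard ResNet-50 layout (a convolutional stem followed by stages of 3/4/6/3 bottleneck blocks with 256/512/1024/2048 output channels). A minimal sketch of obtaining this split from a pretrained model, as an assumed implementation:

```python
import torch.nn as nn
import torchvision

resnet = torchvision.models.resnet50(pretrained=True)   # ImageNet pretraining (step S1.1)

# residual block one: convolution, batch norm, ReLU, max pooling -> 64 feature maps
block1 = nn.Sequential(resnet.conv1, resnet.bn1, resnet.relu, resnet.maxpool)
block2 = resnet.layer1   # 3 identity (bottleneck) blocks, 256 feature maps
block3 = resnet.layer2   # 4 blocks, 512 feature maps
block4 = resnet.layer3   # 6 blocks, 1024 feature maps
block5 = resnet.layer4   # 3 blocks, 2048 feature maps
pool = resnet.avgpool    # average pooling -> 2048-dimensional feature vector

num_classes = 751        # illustrative: number of pedestrian identities in the training set
classifier = nn.Linear(2048, num_classes)  # maps the feature vector to class scores
```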
Further, step S1.1 is specified as follows:
S1.1.1, output the 2048-dimensional vector f1 obtained after average pooling;
S1.1.2, set the number of feature maps of a convolutional layer to the number of pedestrian classes n; the convolutional layer maps f1 to an n-dimensional vector, and a fully connected classifier outputs the final class prediction;
S1.1.3, for the input and output of the basic-branch twin residual network, define the first loss function:
Identif(f, t, θ_I) = -Σ_i p_i log(q̂_i), with q̂ = softmax(θ_I ∘ f),
where softmax denotes the classifier function, ∘ denotes a convolution operation, θ_I denotes the parameters of the convolutional layer used, t is the pedestrian class, f is the feature vector obtained after feature extraction by the basic-branch deep residual network, and q̂ is the probability output by the classifier function that f belongs to a certain pedestrian class t; for any image i and pedestrian class t, p_i indicates whether image i belongs to class t (p_i = 1 if it does, otherwise p_i = 0), and q̂_i is the probability value obtained for image i after the softmax function.
Further, step S1.3 is specified as follows:
S1.3.1, set up a square layer that takes the squared difference of the feature vectors f1 and f2 output by the two deep residual network models, giving fs = (f1 - f2)^2;
S1.3.2, set up a convolutional layer with 2 feature maps that maps fs to a 2-dimensional vector;
S1.3.3, fully connect a two-class classifier to the output of S1.3.2 to produce the final prediction, i.e. whether the input image pair comes from the same class;
S1.3.4, for an input image pair q (same class or different class), define the second loss function:
Verif(f1, f2, s, θ_S) = -Σ_{i=1,2} q_i log(q̂_i), with q̂ = softmax(θ_S ∘ fs),
where softmax denotes the classifier function, ∘ denotes a convolution operation, θ_S denotes the parameters of the convolutional layer used, s indicates the two classes "same" and "not same", fs is the feature vector obtained by convolution after the square layer, q̂ is the probability output by the classifier function that fs corresponds to the same pedestrian, and f1, f2 are the features extracted by the two basic-branch deep residual networks that form the basic-branch twin residual network. If f1 and f2 belong to the same person, q1 = 1 and q2 = 0; otherwise q1 = 0 and q2 = 1. After the convolutional layer and the softmax function, fs is mapped to a two-dimensional vector (q̂1, q̂2) representing the probability that the two input images belong to the same pedestrian class, where q̂1 + q̂2 = 1.
The deep residual network structure described in step S2 can be divided into eight residual blocks and one grid residual block, whose structures are as follows:
the input is the original pedestrian image with 3 feature maps, i.e. the three color channels of the image;
residual block one consists of four layers: a convolutional layer, a batch normalization layer, a linear rectification (ReLU) layer and a max pooling layer, with 64 output feature maps;
residual blocks two to four each consist of several identity blocks; each identity block is built from three convolutional layers, batch normalization layers, linear rectification layers and one summation layer;
residual block two contains three identity blocks and outputs 256 feature maps;
residual block three contains four identity blocks and outputs 512 feature maps;
residual block four contains six identity blocks and outputs 1024 feature maps;
the grid network block contains three identity blocks and an average pooling layer, but its output is a six-dimensional transformation parameter used to generate the image grid for pedestrian alignment;
the image output by residual block two undergoes grid alignment, and the aligned pedestrian image has 256 feature maps;
residual block five contains four identity blocks and outputs 512 feature maps;
residual block six contains six identity blocks and outputs 1024 feature maps;
residual block seven contains three identity blocks and outputs 2048 feature maps;
average pooling is applied to the 2048-channel output feature map to obtain a 2048-dimensional feature vector;
a fully connected layer and a classifier layer map the feature vector to a vector whose dimension equals the number of pedestrian classes.
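A minimal sketch of how the alignment branch described above could be assembled: the block-two feature map is aligned by the grid network and then passed through the remaining stages (PyTorch; GridNetwork is the illustrative module sketched earlier, and the stage split follows the ResNet-50 layout).

```python
import torch.nn as nn
import torchvision

class AlignmentBranch(nn.Module):
    """Blocks one and two extract a 256-channel map, the grid network aligns it,
    and the remaining stages produce the 2048-dim feature vector fa."""
    def __init__(self, grid_network):
        super().__init__()
        r = torchvision.models.resnet50(pretrained=True)
        self.block1 = nn.Sequential(r.conv1, r.bn1, r.relu, r.maxpool)
        self.block2 = r.layer1          # 256 feature maps
        self.grid = grid_network        # outputs the aligned 256-channel map
        self.rest = nn.Sequential(r.layer2, r.layer3, r.layer4, r.avgpool)

    def forward(self, x):
        feat = self.block2(self.block1(x))
        aligned = self.grid(feat)                # grid alignment of the block-two output
        return self.rest(aligned).flatten(1)     # 2048-dimensional feature vector fa
```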
Further, steps S2.1-S2.3 are specified as follows:
S2.1.1, output the 2048-dimensional vector fa obtained after average pooling;
S2.1.2, set the number of feature maps of a convolutional layer to the number of pedestrian classes n; the convolutional layer maps fa to an n-dimensional vector, and a fully connected classifier outputs the final class prediction;
S2.1.3, for the input and output of the pedestrian-alignment-branch twin residual network, define the third loss function:
Identif(fa, t, θ_I) = -Σ_i p_i log(q̂_i), with q̂ = softmax(θ_I ∘ fa),
where softmax denotes the classifier function, ∘ denotes a convolution operation, θ_I denotes the parameters of the convolutional layer used, t is the pedestrian class, fa is the feature vector obtained after feature extraction by the alignment-branch deep residual network, and q̂ is the probability output by the classifier function that fa belongs to a certain pedestrian class t; for any image i and pedestrian class t, p_i indicates whether image i belongs to class t (p_i = 1 if it does, otherwise p_i = 0), and q̂_i is the probability value obtained for image i after the softmax function.
Further, step S2.5 is specified as follows:
S2.3.1, set up a square layer that takes the squared difference of the feature vectors fa1 and fa2 output by the two deep residual network models, giving fs_a = (fa1 - fa2)^2;
S2.3.2, set up a convolutional layer with 2 feature maps that maps fs_a to a 2-dimensional vector;
S2.3.3, fully connect a two-class classifier to the output of S2.3.2 to produce the final prediction, i.e. whether the input image pair comes from the same class;
S2.3.4, for an input image pair q (same class or different class), define the fourth loss function:
Verif(fa1, fa2, s, θ_S) = -Σ_{i=1,2} q_i log(q̂_i), with q̂ = softmax(θ_S ∘ fs_a),
where softmax denotes the classifier function, ∘ denotes a convolution operation, θ_S denotes the parameters of the convolutional layer used, s indicates the two classes "same" and "not same", fs_a is the feature vector obtained by convolution after the square layer, q̂ is the probability output by the classifier function that fs_a corresponds to the same pedestrian, and fa1, fa2 are the features extracted by the two branch networks that form the pedestrian-alignment-branch twin residual network. If fa1 and fa2 belong to the same person, q1 = 1 and q2 = 0; otherwise q1 = 0 and q2 = 1. After the convolutional layer and the softmax function, fs_a is mapped to a two-dimensional vector (q̂1, q̂2) representing the probability that the two input images belong to the same pedestrian class, where q̂1 + q̂2 = 1.
Further, step S3.1 is specified as follows:
S3.1.1, construct the training set: shuffle the order of the images in the training data set and generate training data pairs;
S3.1.2, optimize the four loss functions of S1.1.3, S1.3.4, S2.1.3 and S2.3.4 using batch gradient descent;
S3.1.3, set the weights of the four loss functions to λ1, λ2, λ3 and λ4 respectively;
S3.1.4, determine the optimal weight values through a series of parameter-tuning experiments.
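A minimal sketch of step S3.1: a weighted sum of the four losses optimized by gradient descent; the λ values below are placeholders rather than the experimentally tuned weights, and the stand-in classifier head exists only to make the example runnable.

```python
import torch
import torch.nn.functional as F

# lam1..lam4 weight the two identification and the two verification losses (step S3.1.3);
# the values below are placeholders, not the patent's tuned weights.
lam = (1.0, 1.0, 1.0, 1.0)

def total_loss(id_basic, verif_basic, id_align, verif_align):
    """Weighted sum of the four losses defined in S1.1.3, S1.3.4, S2.1.3 and S2.3.4."""
    return (lam[0] * id_basic + lam[1] * verif_basic
            + lam[2] * id_align + lam[3] * verif_align)

# illustrative gradient step on a stand-in classifier head
head = torch.nn.Linear(2048, 751)
optimizer = torch.optim.SGD(head.parameters(), lr=0.01)  # stands in for batch gradient descent
feats, labels = torch.randn(8, 2048), torch.randint(0, 751, (8,))
id_loss = F.cross_entropy(head(feats), labels)
loss = total_loss(id_loss, id_loss, id_loss, id_loss)    # one scalar reused only for the demo
loss.backward()
optimizer.step()
```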
Further, step S3.2 is specified as follows:
S3.2.1, train with the four loss functions until the losses are minimized to their optimum;
S3.2.2, take out the trained deep residual networks as the classification models for the next step.
Step S4, constructing the test samples, is as follows:
S4-1, take the remaining images of the pedestrian re-identification data set as test samples;
S4-2, crop and resize every image in the test samples to 224 × 224.
The classification in the test stage is specified as follows:
S4-1, the classification model is a single deep residual network model, and the corresponding input is a single image;
S4-2, the classification criteria are the overall accuracy and the rank-1 recognition rate, i.e. the percentage of test samples classified correctly and the percentage of queries for which a pedestrian of the same class is correctly identified at the first rank.
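A minimal sketch of the rank-1 criterion mentioned above, computed from a ranked gallery list (numpy; the identities and the ranking are random stand-ins).

```python
import numpy as np

def rank1_accuracy(ranked_indices, query_ids, gallery_ids):
    """Fraction of queries whose top-ranked gallery image has the same pedestrian identity."""
    top1 = gallery_ids[ranked_indices[:, 0]]
    return float(np.mean(top1 == query_ids))

# illustrative identities for 10 queries and 100 gallery images
query_ids = np.random.randint(0, 20, size=10)
gallery_ids = np.random.randint(0, 20, size=100)
ranked = np.argsort(np.random.rand(10, 100), axis=1)  # stand-in for the distance ranking
print(rank1_accuracy(ranked, query_ids, gallery_ids))
```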
The effect of the invention is further described below.
1. Experimental conditions:
The experiments of the invention were carried out on an NVIDIA GTX 1080Ti GPU and an i7-8700K CPU under a MATLAB 2017 software environment.
The experiments used three pedestrian re-identification data sets: Market-1501, DukeMTMC and CUHK03.
The Market-1501 data set was collected in front of a supermarket at Tsinghua University. Six cameras were used in total, including five high-resolution cameras and one low-resolution camera, with overlap between different cameras. Overall, the data set contains 32,668 annotated bounding boxes of 1,501 identities. In this open system, each identity is captured by at most six cameras, and each annotated identity is guaranteed to appear in at least two cameras so that cross-camera search is possible.
DukeMTMC provides a tracking system that works both within and across cameras, a novel large-scale high-definition video data set recorded by 8 synchronized cameras containing more than 7,000 single-camera trajectories and more than 2,700 unique identities, and a new performance evaluation method.
CUHK03 contains 13,164 images of 1,360 pedestrians, shot by six surveillance cameras; each identity is captured by two disjoint cameras. The data set was collected at the Chinese University of Hong Kong, with images from two different cameras per identity. It provides both machine-detected and hand-labeled bounding boxes; the detected set contains some detection errors and is closer to real conditions. On average each person has 9.6 training images.
2. Analysis of the results
In the simulation experiments, the method of the invention, (1) a twin network without the pedestrian-alignment network, and (2) a pedestrian-alignment network without the twin structure were used to classify the three data sets, and the classification results were compared and analyzed.
Table 1 is a statistical comparison of the overall accuracy obtained in the experiments with three convolutional-neural-network models and the method of the invention. In Table 1, "Data Set" indicates the pedestrian re-identification data set used, "result" indicates the recognition result, "Accuracy" indicates the classification accuracy, "Rank-1" indicates the probability that the first retrieved image is the correct pedestrian, "Verif+identif" indicates the twin network without the pedestrian-alignment network, "Base+Align" indicates the pedestrian-alignment network without the twin structure, and "(Base+Verif)+(Align+Verif)" indicates the method used by the invention.
Table 1. Comparison of pedestrian re-identification results
As can be seen from Table 1, the results of the method of the invention on all three data sets are better than those of the other two methods.

Claims (4)

1. A pedestrian re-identification method based on a twin pedestrian-alignment residual network, characterized by comprising the following steps:
S1, construct the basic-branch twin residual network;
S1.1, construct the first basic-branch deep residual network: using a transfer learning strategy, import the parameters of a residual network pre-trained on the ImageNet data set as the base parameters of the first basic-branch deep residual network;
S1.2, copy the model structure and parameters of the first basic-branch deep residual network to obtain the second basic-branch deep residual network;
S1.3, compute the square of the difference between the feature vectors output by the two basic-branch deep residual networks, and perform binary classification with a convolutional layer and a classifier to judge whether the inputs of the two basic-branch deep residual networks are images of the same class;
S2, construct the pedestrian-alignment-branch twin residual network;
S2.1, construct the first pedestrian-alignment-branch deep residual network: take either of the trained first or second basic-branch deep residual networks and delete the residual block that reduces the high-dimensional feature map to a feature vector;
regress the output high-dimensional feature map through a residual block into the parameters of an affine transformation, apply the affine transformation to the output low-dimensional feature map, and obtain the aligned pedestrian image;
from either of the trained first or second basic-branch deep residual networks, remove the residual blocks preceding the low-dimensional feature map that undergoes the affine transformation, and use the remaining network to train on the aligned pedestrian images;
S2.2, copy the model structure and parameters of the first pedestrian-alignment-branch deep residual network to obtain the second pedestrian-alignment-branch deep residual network;
S2.3, compute the square of the difference between the feature vectors output by the two pedestrian-alignment-branch deep residual networks, and perform binary classification with a convolutional layer and a classifier to judge whether the inputs of the two branch networks are images of the same class;
S3, use the constructed training data set to train the parameters of the basic-branch twin network and the pedestrian-alignment-branch twin residual network, and take the basic branch prototype from the trained basic-branch twin residual network and the pedestrian-alignment branch prototype from the trained pedestrian-alignment-branch twin residual network as the classification model for pedestrian re-identification;
S3.1, using the constructed training data set, train the parameters of the constructed basic-branch twin residual network and pedestrian-alignment-branch twin residual network separately with batch gradient descent;
S3.2, after the parameters are trained, take out any one branch of the basic-branch twin residual network and any one branch of the pedestrian-alignment-branch twin residual network as the pedestrian image classification models;
S4, construct the test samples and the query samples;
S5, test sample classification: feed the test samples into the trained basic-branch deep residual network and the pedestrian-alignment-branch deep residual network for feature extraction;
S6, concatenate the features obtained from the two branch networks;
S7, compute the Euclidean distance between the images of the test samples and the query samples to obtain a sorted list;
S8, perform pedestrian re-identification on the basis of re-ranking.
2. The pedestrian re-identification method based on a twin pedestrian-alignment residual network according to claim 1, characterized in that step S1.1 is specified as follows:
S1.1.1, output the 2048-dimensional vector f1 obtained after average pooling;
S1.1.2, set the number of feature maps of a convolutional layer to the number of pedestrian classes n; the convolutional layer maps f1 to an n-dimensional vector, and a fully connected classifier outputs the final class prediction;
S1.1.3, for the input and output of the basic-branch twin residual network, define the first loss function:
Identif(f, t, θ_I) = -Σ_i p_i log(q̂_i), with q̂ = softmax(θ_I ∘ f),
where softmax denotes the classifier function, ∘ denotes a convolution operation, θ_I denotes the parameters of the convolutional layer used, t is the pedestrian class, f is the feature vector obtained after feature extraction by the basic-branch deep residual network, and q̂ is the probability output by the classifier function that the feature vector f belongs to a certain pedestrian class t; for any image i and pedestrian class t, p_i indicates whether image i belongs to class t (p_i = 1 if it does, otherwise p_i = 0), and q̂_i is the probability value obtained for image i after the softmax function.
3. The pedestrian re-identification method based on a twin pedestrian-alignment residual network according to claim 1, characterized in that step S1.3 is specified as follows:
S1.3.1, set up a square layer that takes the squared difference of the feature vectors f1 and f2 output by the two basic-branch deep residual networks, giving fs = (f1 - f2)^2;
S1.3.2, set up a convolutional layer with 2 feature maps that maps fs to a 2-dimensional vector;
S1.3.3, fully connect a two-class classifier to the output of S1.3.2 to produce the final prediction, i.e. whether the input image pair comes from the same class;
S1.3.4, for a same-class or different-class input image pair q, define the second loss function:
Verif(f1, f2, s, θ_S) = -Σ_{i=1,2} q_i log(q̂_i), with q̂ = softmax(θ_S ∘ fs),
where softmax denotes the classifier function, ∘ denotes a convolution operation, θ_S denotes the parameters of the convolutional layer used, s indicates the two classes "same" and "not same", fs is the feature vector obtained by convolution after the square layer, q̂ is the probability output by the classifier function that fs corresponds to the same pedestrian, and f1, f2 are the features extracted by the two basic-branch deep residual networks that form the basic-branch twin residual network; if f1 and f2 belong to the same person, q1 = 1 and q2 = 0, otherwise q1 = 0 and q2 = 1; after the convolutional layer and the softmax function, fs is mapped to a two-dimensional vector (q̂1, q̂2) representing the probability that the two input images belong to the same pedestrian class, where q̂1 + q̂2 = 1.
4. The pedestrian re-identification method based on a twin pedestrian-alignment residual network according to claim 1, characterized in that step S2.1 is specified as follows:
S2.1.1, output the 2048-dimensional vector fa obtained after average pooling;
S2.1.2, set the number of feature maps of a convolutional layer to the number of pedestrian classes n; the convolutional layer maps fa to an n-dimensional vector, and a fully connected classifier outputs the final class prediction;
S2.1.3, for the input and output of the pedestrian-alignment-branch deep twin residual network, define the third loss function:
Identif(fa, t, θ_I) = -Σ_i p_i log(q̂_i), with q̂ = softmax(θ_I ∘ fa),
where softmax denotes the classifier function, ∘ denotes a convolution operation, θ_I denotes the parameters of the convolutional layer used, t is the pedestrian class, fa is the feature vector obtained after feature extraction by the alignment-branch deep residual network, and q̂ is the probability output by the classifier function that fa belongs to a certain pedestrian class t; for any image i and pedestrian class t, p_i indicates whether image i belongs to class t (p_i = 1 if it does, otherwise p_i = 0), and q̂_i is the probability value obtained for image i after the softmax function.
Step S2.3 is specified as follows:
S2.3.1, set up a square layer that takes the squared difference of the feature vectors output by the two pedestrian-alignment-branch deep residual networks, giving fs_a = (fa1 - fa2)^2;
S2.3.2, set up a convolutional layer with 2 feature maps that maps fs_a to a 2-dimensional vector;
S2.3.3, fully connect a two-class classifier to the output of S2.3.2 to produce the final prediction, i.e. whether the input image pair comes from the same class;
S2.3.4, for a same-class or different-class input image pair q, define the fourth loss function:
Verif(fa1, fa2, s, θ_S) = -Σ_{i=1,2} q_i log(q̂_i), with q̂ = softmax(θ_S ∘ fs_a),
where softmax denotes the classifier function, ∘ denotes a convolution operation, θ_S denotes the parameters of the convolutional layer used, s indicates the two classes "same" and "not same", fs_a is the feature vector obtained by convolution after the square layer, q̂ is the probability output by the classifier function that fs_a corresponds to the same pedestrian, and fa1, fa2 are the features extracted by the two branch networks that form the pedestrian-alignment-branch twin residual network; if fa1 and fa2 belong to the same person, q1 = 1 and q2 = 0, otherwise q1 = 0 and q2 = 1; after the convolutional layer and the softmax function, fs_a is mapped to a two-dimensional vector (q̂1, q̂2) representing the probability that the two input images belong to the same pedestrian class, where q̂1 + q̂2 = 1.
CN201810876899.1A 2018-08-03 2018-08-03 Pedestrian re-identification method based on twin pedestrian alignment residual error network Active CN109063649B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810876899.1A CN109063649B (en) 2018-08-03 2018-08-03 Pedestrian re-identification method based on twin pedestrian alignment residual error network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810876899.1A CN109063649B (en) 2018-08-03 2018-08-03 Pedestrian re-identification method based on twin pedestrian alignment residual error network

Publications (2)

Publication Number Publication Date
CN109063649A true CN109063649A (en) 2018-12-21
CN109063649B CN109063649B (en) 2021-05-14

Family

ID=64833110

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810876899.1A Active CN109063649B (en) 2018-08-03 2018-08-03 Pedestrian re-identification method based on twin pedestrian alignment residual error network

Country Status (1)

Country Link
CN (1) CN109063649B (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109784237A (en) * 2018-12-29 2019-05-21 北京航天云路有限公司 The scene classification method of residual error network training based on transfer learning
CN110084215A (en) * 2019-05-05 2019-08-02 上海海事大学 A kind of pedestrian of the twin network model of binaryzation triple recognition methods and system again
CN110163117A (en) * 2019-04-28 2019-08-23 浙江大学 A kind of pedestrian's recognition methods again based on autoexcitation identification feature learning
CN110570490A (en) * 2019-09-06 2019-12-13 北京航空航天大学 saliency image generation method and equipment
CN111382834A (en) * 2018-12-29 2020-07-07 杭州海康威视数字技术股份有限公司 Confidence degree comparison method and device
CN111797700A (en) * 2020-06-10 2020-10-20 南昌大学 Vehicle re-identification method based on fine-grained discrimination network and second-order reordering
CN112507835A (en) * 2020-12-01 2021-03-16 燕山大学 Method and system for analyzing multi-target object behaviors based on deep learning technology
WO2021047190A1 (en) * 2019-09-09 2021-03-18 深圳壹账通智能科技有限公司 Alarm method based on residual network, and apparatus, computer device and storage medium
CN112949608A (en) * 2021-04-15 2021-06-11 南京邮电大学 Pedestrian re-identification method based on twin semantic self-encoder and branch fusion

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107145900A (en) * 2017-04-24 2017-09-08 清华大学 Pedestrian based on consistency constraint feature learning recognition methods again
CN108334849A (en) * 2018-01-31 2018-07-27 中山大学 A kind of recognition methods again of the pedestrian based on Riemann manifold
CN108345837A (en) * 2018-01-17 2018-07-31 浙江大学 A kind of pedestrian's recognition methods again based on the study of human region alignmentization feature representation

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107145900A (en) * 2017-04-24 2017-09-08 清华大学 Pedestrian based on consistency constraint feature learning recognition methods again
CN108345837A (en) * 2018-01-17 2018-07-31 浙江大学 A kind of pedestrian's recognition methods again based on the study of human region alignmentization feature representation
CN108334849A (en) * 2018-01-31 2018-07-27 中山大学 A kind of recognition methods again of the pedestrian based on Riemann manifold

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109784237A (en) * 2018-12-29 2019-05-21 北京航天云路有限公司 The scene classification method of residual error network training based on transfer learning
CN111382834A (en) * 2018-12-29 2020-07-07 杭州海康威视数字技术股份有限公司 Confidence degree comparison method and device
CN111382834B (en) * 2018-12-29 2023-09-29 杭州海康威视数字技术股份有限公司 Confidence degree comparison method and device
CN110163117A (en) * 2019-04-28 2019-08-23 浙江大学 A kind of pedestrian's recognition methods again based on autoexcitation identification feature learning
CN110163117B (en) * 2019-04-28 2021-03-05 浙江大学 Pedestrian re-identification method based on self-excitation discriminant feature learning
CN110084215A (en) * 2019-05-05 2019-08-02 上海海事大学 A kind of pedestrian of the twin network model of binaryzation triple recognition methods and system again
CN110570490A (en) * 2019-09-06 2019-12-13 北京航空航天大学 saliency image generation method and equipment
WO2021047190A1 (en) * 2019-09-09 2021-03-18 深圳壹账通智能科技有限公司 Alarm method based on residual network, and apparatus, computer device and storage medium
CN111797700A (en) * 2020-06-10 2020-10-20 南昌大学 Vehicle re-identification method based on fine-grained discrimination network and second-order reordering
CN112507835A (en) * 2020-12-01 2021-03-16 燕山大学 Method and system for analyzing multi-target object behaviors based on deep learning technology
CN112949608A (en) * 2021-04-15 2021-06-11 南京邮电大学 Pedestrian re-identification method based on twin semantic self-encoder and branch fusion
CN112949608B (en) * 2021-04-15 2022-08-02 南京邮电大学 Pedestrian re-identification method based on twin semantic self-encoder and branch fusion

Also Published As

Publication number Publication date
CN109063649B (en) 2021-05-14

Similar Documents

Publication Publication Date Title
CN109063649A (en) Pedestrian's recognition methods again of residual error network is aligned based on twin pedestrian
WO2021134871A1 (en) Forensics method for synthesized face image based on local binary pattern and deep learning
CN109670528A (en) The data extending method for blocking strategy at random based on paired samples towards pedestrian's weight identification mission
WO2020155939A1 (en) Image recognition method and device, storage medium and processor
CN110222792A (en) A kind of label defects detection algorithm based on twin network
CN110851645B (en) Image retrieval method based on similarity maintenance under deep metric learning
CN108830209B (en) Remote sensing image road extraction method based on generation countermeasure network
CN102804208B (en) Individual model for visual search application automatic mining famous person
CN104504362A (en) Face detection method based on convolutional neural network
CN108171184A (en) Method for distinguishing is known based on Siamese networks again for pedestrian
CN106529499A (en) Fourier descriptor and gait energy image fusion feature-based gait identification method
CN107563280A (en) Face identification method and device based on multi-model
CN109190643A (en) Based on the recognition methods of convolutional neural networks Chinese medicine and electronic equipment
CN108564094A (en) A kind of Material Identification method based on convolutional neural networks and classifiers combination
CN111325115A (en) Countermeasures cross-modal pedestrian re-identification method and system with triple constraint loss
CN104268593A (en) Multiple-sparse-representation face recognition method for solving small sample size problem
CN109902202A (en) A kind of video classification methods and device
CN110866134B (en) Image retrieval-oriented distribution consistency keeping metric learning method
CN114998220B (en) Tongue image detection and positioning method based on improved Tiny-YOLO v4 natural environment
CN109492528A (en) A kind of recognition methods again of the pedestrian based on gaussian sum depth characteristic
WO2022062419A1 (en) Target re-identification method and system based on non-supervised pyramid similarity learning
CN109214298A (en) A kind of Asia women face value Rating Model method based on depth convolutional network
CN112052772A (en) Face shielding detection algorithm
CN110163117A (en) A kind of pedestrian's recognition methods again based on autoexcitation identification feature learning
CN109614866A (en) Method for detecting human face based on cascade deep convolutional neural networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant