CN111062329B - Unsupervised pedestrian re-identification method based on augmented network - Google Patents

Unsupervised pedestrian re-identification method based on augmented network

Info

Publication number
CN111062329B
CN111062329B
Authority
CN
China
Prior art keywords
augmentation
network
pedestrian
image
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911310016.1A
Other languages
Chinese (zh)
Other versions
CN111062329A (en)
Inventor
Wei-Shi Zheng
Ziyi Yuan
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sun Yat Sen University
Original Assignee
Sun Yat Sen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sun Yat Sen University filed Critical Sun Yat Sen University
Priority to CN201911310016.1A priority Critical patent/CN111062329B/en
Publication of CN111062329A publication Critical patent/CN111062329A/en
Application granted granted Critical
Publication of CN111062329B publication Critical patent/CN111062329B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4007Scaling of whole images or parts thereof, e.g. expanding or contracting based on interpolation, e.g. bilinear interpolation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention provides an unsupervised pedestrian re-identification method based on an augmentation network. Starting from the pedestrian image data in an original database, several forms of data augmentation are applied, and the features of the augmented data, which share the identity of the original data they were derived from, are extracted by networks whose parameters are not shared, which helps the network train. The method mainly considers how to utilize unlabeled data, which cannot be used directly as input, when datasets are not abundant; the main network model obtained by the method can extract features from a test set directly and thus be used for testing as-is. The method can also pre-train the several augmentation networks and the main network with unlabeled data and then fine-tune the main network parameters with labeled data, so that unlabeled information is exploited effectively and the accuracy of pedestrian re-identification is improved.

Description

Unsupervised pedestrian re-identification method based on augmented network
Technical Field
The invention relates to the field of deep learning, and in particular to an unsupervised pedestrian re-identification method.
Background
In recent years, deep learning technology has developed continuously, and deep learning methods based on deep neural networks have been applied to many aspects of our lives, such as text translation and text classification in the field of natural language processing (Natural Language Processing), and image retrieval and face recognition in the field of computer vision (Computer Vision). The emergence of deep learning methods has brought great convenience to human society.
The pedestrian re-identification method is an important application of deep learning. Pedestrian re-identification (Person re-identification), also called person re-identification, is a technique that uses computer vision to judge whether a specific pedestrian is present in images or video sequences captured by cameras whose fields of view do not overlap. Because different camera devices differ from one another, pedestrians are both rigid and deformable, and their appearance is easily affected by clothing, scale, occlusion, pose and viewing angle, pedestrian re-identification has become a research topic in computer vision that is both valuable and highly challenging.
Dedicated datasets for pedestrian re-identification exist in academia, but because data acquisition and annotation require substantial manpower and financial resources, these datasets contain relatively few images. Market-1501 and DukeMTMC-reID are two commonly used datasets.
The Market-1501 dataset was collected on the Tsinghua University campus, with images from 6 different cameras. The training set contains 12,936 images and the test set contains 19,732 images. The training data covers 751 identities and the test set 750, with an average of 17.2 training images per class (per person).
The DukeMTMC-reID dataset was collected at Duke University, with images from 8 different cameras. The training set contains 16,522 images and the test set contains 17,661 images. The training data covers 702 identities, with an average of 23.5 training images per class (per person).
These two common pedestrian re-identification datasets contain only around 33,000 images each, a marked contrast to the tens of millions of images held by enterprises. When the dataset is too small, neural network training tends to overfit (overfitting), reducing the accuracy of a network trained on the original dataset when it is tested on other datasets.
Against this background, many data augmentation (Data Augmentation) methods have begun to be used in pedestrian re-identification, such as random cropping and random flipping. However, such methods only apply secondary processing to the images of the original labeled dataset; other unlabeled datasets still go unexploited.
Disclosure of Invention
The invention mainly aims to overcome the defects and shortcomings of the prior art, and provides an unsupervised pedestrian re-identification method based on an augmentation network, which makes effective use of unlabeled pedestrian image data that could not otherwise serve as training input, and thereby improves the accuracy of pedestrian re-identification.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
an unsupervised pedestrian re-identification method based on an augmented network comprises the following steps:
s1: performing an augmentation operation on an unlabeled original pedestrian image dataset D0, wherein the augmentation operation comprises one or more of image scaling, random cropping, random erasing, noise addition and Gaussian blur, to obtain M new augmented datasets D1-DM, where M is a positive integer;
s2: the original image data in the original pedestrian image data set D0 is introduced into a convolutional neural network as a main network N0 to carry out forward propagation extraction to obtain a feature F0;
s3: respectively inputting the corresponding augmented image data of the M augmented datasets D1-DM into M convolutional neural networks with unshared parameters, serving as augmentation networks N1-NM, for forward propagation to extract features F1-FM;
s4: randomly selecting an image Inegative from the original pedestrian image dataset D0 as a negative sample, and feeding it into the main network N0 for forward propagation to extract a feature Fnegative;
s5: calculating the Euclidean distance between the output feature F0 and each of the output features F1-FM to obtain M loss values L1-LM;
s6: calculating the Euclidean distance between the output feature Fnegative and each of the output features F0-FM to obtain M+1 loss values L0negative-LMnegative;
s7: taking, for each augmentation network, the result of subtracting the corresponding loss value among L1negative-LMnegative obtained in S6 from the corresponding loss value among L1-LM obtained in S5 as the loss, and performing backward propagation on the augmentation networks N1-NM to compute gradients and update their parameters;
s8: summing the M loss values L1-LM obtained in S5 and subtracting the sum of the loss values L0negative-LMnegative obtained in S6 to obtain a total loss value L0;
s9: taking the total loss value L0 obtained in S8 as the loss and performing backward propagation on the main network N0 to compute gradients and update the main network parameters;
s10: repeating the operations of S2-S9 until the main network and the augmentation networks converge;
s11: the master network model is taken as output.
As a preferred technical solution, in step S1, when the augmentation operation includes image scaling, the image is scaled using bilinear interpolation, so as to simulate the images of various resolutions that may occur in natural datasets. The specific calculation is:
\[ f(x,y) \approx \frac{f(Q_{11})(x_2-x)(y_2-y) + f(Q_{21})(x-x_1)(y_2-y) + f(Q_{12})(x_2-x)(y-y_1) + f(Q_{22})(x-x_1)(y-y_1)}{(x_2-x_1)(y_2-y_1)} \]
where Q11=(x1,y1), Q12=(x1,y2), Q21=(x2,y1) and Q22=(x2,y2) are the four pixel points closest to the point (x,y).
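For illustration, the following NumPy sketch implements the bilinear scaling described above; the function name and interface are assumptions of this sketch, not part of the patent:

```python
import numpy as np

def bilinear_resize(img, out_h, out_w):
    """Scale an H x W (x C) image by bilinear interpolation (illustrative sketch)."""
    in_h, in_w = img.shape[:2]
    out = np.zeros((out_h, out_w) + img.shape[2:], dtype=np.float32)
    for i in range(out_h):
        for j in range(out_w):
            # Map the output pixel back to source coordinates.
            y = i * (in_h - 1) / max(out_h - 1, 1)
            x = j * (in_w - 1) / max(out_w - 1, 1)
            y1, x1 = int(y), int(x)
            y2, x2 = min(y1 + 1, in_h - 1), min(x1 + 1, in_w - 1)
            dy, dx = y - y1, x - x1
            # Weighted sum of the four nearest pixels Q11, Q21, Q12, Q22.
            out[i, j] = ((1 - dx) * (1 - dy) * img[y1, x1]
                         + dx * (1 - dy) * img[y1, x2]
                         + (1 - dx) * dy * img[y2, x1]
                         + dx * dy * img[y2, x2])
    return out
```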
As a preferred technical solution, in step S1, when the augmentation operation includes random cropping, the augmentation is performed using a random cropping method, so as to simulate the various partial pedestrian images that may occur in natural datasets. The specific method is as follows:
firstly, a pixel point is randomly selected in the image; then, taking this pixel as the upper left corner, a rectangle of random length and width is formed, and the pixels within the whole rectangle are output as the cropping result.
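A minimal sketch of such a random crop; the crop size is passed in as assumed parameters:

```python
import random

def random_crop(img, crop_h, crop_w):
    """Pick a random upper-left pixel, then return the crop_h x crop_w rectangle."""
    h, w = img.shape[:2]
    top = random.randint(0, h - crop_h)    # random upper-left corner
    left = random.randint(0, w - crop_w)
    return img[top:top + crop_h, left:left + crop_w]
```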
As a preferred technical solution, in step S1, when the augmentation operation includes random erasing, the augmentation is performed using a random erasing method, so as to simulate the various missing or incomplete pedestrian images that may occur in natural datasets. The specific method is as follows:
a pixel point is randomly selected in the image; then, taking this pixel as the upper left corner, a rectangle of random length and width is formed, all pixels within the whole rectangle are set to black, i.e. pixel value (0, 0, 0), and the whole image after this operation is output as the random erasing result.
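A matching sketch of the erasing step, under the same assumptions (NumPy image array, hypothetical function name):

```python
import random

def random_erase(img, erase_h, erase_w):
    """Blacken a random erase_h x erase_w rectangle, i.e. set it to (0, 0, 0)."""
    out = img.copy()
    h, w = out.shape[:2]
    top = random.randint(0, h - erase_h)   # random upper-left corner
    left = random.randint(0, w - erase_w)
    out[top:top + erase_h, left:left + erase_w] = 0
    return out
```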
As a preferred technical solution, in step S1, when the augmentation operation includes noise addition, the augmentation is performed using a noise-adding method, so as to simulate the image noise that may occur in natural datasets. The specific operation is as follows:
each pixel becomes, with a certain probability, a white point, i.e. pixel value (255, 255, 255), or a black point, i.e. pixel value (0, 0, 0), and the whole image after this operation is output as the noise-adding result.
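A salt-and-pepper sketch of this operation; the per-pixel probability p is an assumed parameter, since the text only specifies "a certain probability":

```python
import numpy as np

def add_salt_pepper(img, p=0.05):
    """With probability p per pixel, set it to white (255, 255, 255) or black (0, 0, 0)."""
    out = img.copy()
    r = np.random.rand(*out.shape[:2])
    out[r < p / 2] = 0                  # pepper: black points
    out[(r >= p / 2) & (r < p)] = 255   # salt: white points
    return out
```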
As a preferred technical solution, in step S1, when the augmentation operation includes Gaussian blur, the augmentation is performed using a Gaussian blur method, so as to simulate the image blur that may occur in natural datasets, according to the following formula:
\[ G(x,y) = \frac{1}{2\pi\sigma^2} e^{-\frac{x^2+y^2}{2\sigma^2}} \]
once the value of sigma is set, a weight matrix can be computed; performing this matrix operation centered on each pixel of the image achieves the blurring of the image.
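A sketch of computing and applying such a weight matrix, assuming a single-channel image and an assumed kernel size; scipy's convolve2d performs the per-pixel matrix operation:

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(size=5, sigma=1.0):
    """Weight matrix G(x, y) for a given sigma, normalized so it sums to 1."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    g = np.exp(-(xx ** 2 + yy ** 2) / (2 * sigma ** 2)) / (2 * np.pi * sigma ** 2)
    return g / g.sum()

def gaussian_blur_gray(img, size=5, sigma=1.0):
    """Apply the kernel centered on every pixel (single-channel sketch)."""
    return convolve2d(img, gaussian_kernel(size, sigma), mode='same', boundary='symm')
```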
As a preferable technical solution, in step S2 and step S3, the respective pedestrian image data are fed into the corresponding convolutional neural networks, and feature extraction is performed by forward propagation. The specific forward propagation formula is:
\[ a^{l} = \sigma(z^{l}) = \sigma(a^{l-1} * W^{l} + b^{l}) \]
where a denotes the intermediate layer output; σ denotes the activation function; z denotes the input of the activation layer; the superscript denotes the layer number; * denotes the convolution operation; W denotes the convolution kernel; and b denotes the bias.
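A one-layer illustration of this forward step in PyTorch; the shapes are assumptions of this sketch:

```python
import torch
import torch.nn.functional as F

# One layer of the forward pass: z^l = a^{l-1} * W^l + b^l, then a^l = sigma(z^l).
a_prev = torch.randn(1, 64, 32, 16)         # a^{l-1}: previous layer's output
W = torch.randn(128, 64, 3, 3)              # W^l: convolution kernels
b = torch.randn(128)                        # b^l: bias
z = F.conv2d(a_prev, W, bias=b, padding=1)  # convolution plus bias gives z^l
a = torch.relu(z)                           # sigma: here ReLU as the activation
```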
As a preferable technical scheme, step S5 specifically includes:
the Euclidean distance is calculated by the characteristic F0 extracted by the main network N0 and the characteristics F1-FM extracted by the augmentation networks N1-NM respectively, and the specific formula is as follows:
\[ d(x,y) = \sqrt{\sum_{i=1}^{n}(x_i - y_i)^2} \]
where x takes the feature F0 extracted by the main network; y takes, in turn, the features F1-FM extracted by the M augmentation networks; and xi and yi are the values of the corresponding features in each dimension;
the step S6 specifically comprises the following steps:
an image is randomly selected from the dataset as the negative sample, its feature Fnegative is extracted by the main network, and the Euclidean distance is calculated between Fnegative and each of the output features F0-FM.
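A short PyTorch sketch of this distance computation for S5/S6, with dummy feature shapes:

```python
import torch

def euclidean_dist(x, y):
    """d(x, y) = sqrt(sum_i (x_i - y_i)^2) over the feature dimension."""
    return torch.sqrt(((x - y) ** 2).sum(dim=-1))

F0 = torch.randn(8, 256)            # batch of main-network features (dummy shapes)
Fi = torch.randn(8, 256)            # features from one augmentation network
Li = euclidean_dist(F0, Fi).mean()  # one scalar loss value of L1..LM
```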
As a preferable technical scheme, step S7 specifically includes:
the calculated error value is propagated back to the corresponding convolutional neural network, and the parameter values of the convolutional neural network are iteratively updated using the backward propagation algorithm. The specific formula is:
\[ \delta^{l-1} = \delta^{l} * \operatorname{rot180}(W^{l}) \odot \sigma'(z^{l-1}) \]
where the superscript denotes the layer number; δ denotes the gradient value; * denotes the convolution operation; W denotes the convolution kernel; rot180 denotes rotating the matrix by 180 degrees, i.e. flipping it vertically and then horizontally; ⊙ denotes element-wise multiplication; and σ' denotes the derivative of the activation function.
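For illustration, a single-channel NumPy sketch of this rule. In practice frameworks compute it by automatic differentiation; note also that the "convolution" in a CNN forward pass is implemented as cross-correlation, which is why the rotated kernel is applied with correlate2d here:

```python
import numpy as np
from scipy.signal import correlate2d

def backprop_delta(delta_l, W, z_prev, sigma_prime):
    """delta^{l-1} = delta^l * rot180(W^l), element-wise times sigma'(z^{l-1})."""
    W_rot = np.rot90(W, 2)  # flip up-down then left-right = 180-degree rotation
    # 'full' mode grows the error map back to the previous layer's size.
    return correlate2d(delta_l, W_rot, mode='full') * sigma_prime(z_prev)
```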
As a preferable technical scheme, step S8 specifically includes:
the loss values L1-LM, obtained from the Euclidean distances between the features extracted by the augmentation networks and the feature extracted by the main network, are summed, and the sum of L0negative-LMnegative is subtracted from the result to obtain the total error value L0. The specific formula is:
\[ L_0 = \sum_{i=1}^{M} \lambda_i L_i - \sum_{j=0}^{M} \lambda_{j,negative} L_{j,negative} \]
where λi, i ∈ [1, M], are the positive-sample weight values and Li, i ∈ [1, M], the corresponding positive-sample error values, here taking λi = 1; and λj,negative, j ∈ [0, M], are the negative-sample weight values and Lj,negative, j ∈ [0, M], the corresponding negative-sample error values, here taking λj,negative = 1.
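With the weights fixed to 1 as above, the combined loss reduces to a one-liner:

```python
def total_loss(pos_losses, neg_losses, lam_pos=1.0, lam_neg=1.0):
    """L0 = sum_i lam_i * L_i - sum_j lam_j,negative * L_j,negative (weights 1 here)."""
    return lam_pos * sum(pos_losses) - lam_neg * sum(neg_losses)
```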
Compared with the prior art, the invention has the following advantages and beneficial effects:
the method uses the method of the augmentation network to utilize the unlabeled pedestrian image data which cannot be directly used as the input of the deep neural network, and the augmentation network and the main network can be trained by end-to-end operation after the unlabeled data set is subjected to the augmentation operation. The deep neural network is trained by utilizing the information that the characteristics extracted by the original data and the augmented data obtained by the original data are consistent as much as possible. The method has great benefits for the pedestrian re-recognition field in which the data set and the data quantity are relatively lacking, and in addition, various different augmentation operations simulate the possible condition of blurring and missing of the pedestrian re-recognition data to a certain extent, so the method provided by the invention can promote the generalization of the trained deep neural network, relieve the overfitting and finally achieve the effect of improving the recognition accuracy
Drawings
FIG. 1 is a flow chart of an unsupervised pedestrian re-identification method based on an augmented network of the present invention.
Detailed Description
The present invention will be described in further detail with reference to examples and drawings, but embodiments of the present invention are not limited thereto.
Examples
As shown in fig. 1, the embodiment provides an unsupervised pedestrian re-identification method based on an augmentation network, which includes the following steps:
s1: the enhancement operation is performed on the unlabeled pedestrian image data set D0, including image scaling, random clipping, random erasure, noise addition and gaussian blur (several of these five enhancement operations may be selected to perform the combined call volume, or all of them may be selected, and the embodiment further illustrates the enhancement operation in step 5), so as to obtain five new enhancement data sets D1 to D5.
In the step S1, the image scaling, random clipping, random erasing, noise adding and Gaussian blur are specifically as follows:
s11: the original unlabeled pedestrian image data is scaled by a bilinear interpolation method, so that various images with different resolutions possibly appearing in a natural dataset are simulated, and the specific calculation mode is as follows:
\[ f(x,y) \approx \frac{f(Q_{11})(x_2-x)(y_2-y) + f(Q_{21})(x-x_1)(y_2-y) + f(Q_{12})(x_2-x)(y-y_1) + f(Q_{22})(x-x_1)(y-y_1)}{(x_2-x_1)(y_2-y_1)} \]
where Q11=(x1,y1), Q12=(x1,y2), Q21=(x2,y1) and Q22=(x2,y2) are the four pixel points closest to the point (x,y).
S12: The original unlabeled pedestrian image data is augmented using a random cropping method, so as to simulate the various partial pedestrian images that may occur in natural datasets. The specific operation is as follows: firstly, a pixel point is randomly selected in the image; then, taking this pixel as the upper left corner, a rectangle of random length and width is formed, and the pixels within the whole rectangle are output as the cropping result.
S13: The original unlabeled pedestrian image data is augmented using a random erasing method, so as to simulate the various missing or incomplete pedestrian images that may occur in natural datasets. The specific operation is as follows:
a pixel point is randomly selected in the image; then, taking this pixel as the upper left corner, a rectangle of random length and width is formed, all pixels within the whole rectangle are set to black (i.e. pixel value (0, 0, 0)), and the whole image after this operation is output as the random erasing result.
S14: The original unlabeled pedestrian image data is augmented using a noise-adding method, so as to simulate the image noise that may occur in natural datasets. Specifically:
each pixel becomes, with a certain probability, a white point (i.e. pixel value (255, 255, 255)) or a black point (i.e. pixel value (0, 0, 0)), and the whole image after this operation is output as the noise-adding result.
S15: The original unlabeled pedestrian image data is augmented with a Gaussian blur method, so as to simulate the image blur that may occur in natural datasets, according to the following formula:
\[ G(x,y) = \frac{1}{2\pi\sigma^2} e^{-\frac{x^2+y^2}{2\sigma^2}} \]
once the value of sigma is set, a weight matrix can be computed; performing this matrix operation centered on each pixel of the image achieves the blurring of the image.
S2: the original image data in the original pedestrian image data set D0 is introduced into a convolutional neural network as a main network N0 to carry out forward propagation extraction to obtain a feature F0; the respective pedestrian image data are transmitted to the corresponding convolutional neural network, and feature extraction is carried out by using a forward propagation method, wherein the specific formula of forward propagation is as follows:
\[ a^{l} = \sigma(z^{l}) = \sigma(a^{l-1} * W^{l} + b^{l}) \]
where a denotes the intermediate layer output; σ denotes the activation function; z denotes the input of the activation layer; the superscript denotes the layer number; * denotes the convolution operation; W denotes the convolution kernel; and b denotes the bias.
S3: The corresponding augmented image data of the five augmented datasets D1-D5 are respectively input into five convolutional neural networks with unshared parameters, serving as augmentation networks N1-N5, for forward propagation to extract features F1-F5; the forward propagation in step S3 is performed in the same way as in step S2.
S4: An image Inegative is randomly selected from the original pedestrian image dataset D0 as a negative sample and fed into the main network N0 for forward propagation to extract a feature Fnegative;
s5: the Euclidean distance is calculated by using the output characteristic F0 and the output characteristics F1 to F5 respectively to obtain five loss values L1 to L5, specifically:
the Euclidean distance is calculated by the characteristic F0 extracted by the main network N0 and the characteristics F1 to F5 extracted by the augmentation networks N1 to N5 respectively, and the specific formula is as follows:
\[ d(x,y) = \sqrt{\sum_{i=1}^{n}(x_i - y_i)^2} \]
where x takes the feature F0 extracted by the main network; y takes, in turn, the features F1 to F5 extracted by the five augmentation networks; and xi and yi are the values of the corresponding features in each dimension.
S6: The Euclidean distance is calculated between the output feature Fnegative and each of the output features F0-F5 to obtain six loss values L0negative-L5negative; since the amount of data in a dataset is generally large and each class accounts for only a small proportion of the total, taking a randomly selected image as a negative sample is valid in most cases.
S7: For each augmentation network, the result of subtracting the corresponding loss value among L1negative-L5negative obtained in step S6 from the corresponding loss value among L1-L5 obtained in step S5 is taken as the loss, and backward propagation is performed on the augmentation networks N1-N5 to compute gradients and update their parameters.
The calculated error value is transmitted back to the corresponding convolutional neural network, and the parameter value of the convolutional neural network is iteratively updated by using a backward propagation algorithm, wherein the specific formula is as follows:
\[ \delta^{l-1} = \delta^{l} * \operatorname{rot180}(W^{l}) \odot \sigma'(z^{l-1}) \]
where the superscript denotes the layer number; δ denotes the gradient value; * denotes the convolution operation; W denotes the convolution kernel; rot180 denotes rotating the matrix by 180 degrees, i.e. flipping it vertically and then horizontally; ⊙ denotes element-wise multiplication; and σ' denotes the derivative of the activation function.
S8: The five loss values L1 to L5 obtained in step S5 are summed, and the sum of the loss values L0negative to L5negative obtained in step S6 is subtracted to obtain the total loss value L0. The specific formula is:
\[ L_0 = \sum_{i=1}^{5} \lambda_i L_i - \sum_{j=0}^{5} \lambda_{j,negative} L_{j,negative} \]
where λi, i ∈ [1, 5], are the positive-sample weight values and Li, i ∈ [1, 5], the corresponding positive-sample error values, here taking λi = 1; and λj,negative, j ∈ [0, 5], are the negative-sample weight values and Lj,negative, j ∈ [0, 5], the corresponding negative-sample error values, here taking λj,negative = 1.
S9: The total loss value L0 obtained in step S8 is taken as the loss, and backward propagation is performed on the main network N0 to compute gradients and update the main network parameters.
S10: The operations of S2 to S9 are repeated until the main network and the augmentation networks converge.
S11: the master network model is taken as output.
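To tie the steps together, the following self-contained PyTorch sketch runs one training iteration of S2-S9. The tiny CNN, the placeholder augmentations and the random tensors are illustrative stand-ins, not the networks or data of the patent:

```python
import torch
import torch.nn as nn

def backbone():
    """A tiny feature-extraction CNN standing in for the real networks."""
    return nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                         nn.AdaptiveAvgPool2d(1), nn.Flatten())

M = 5
N0 = backbone()                                    # main network N0
aug_nets = [backbone() for _ in range(M)]          # N1-N5, parameters not shared
opt_main = torch.optim.SGD(N0.parameters(), lr=1e-3)
opt_augs = [torch.optim.SGD(n.parameters(), lr=1e-3) for n in aug_nets]
augment = [lambda x: x.flip(-1)] * M               # placeholder augmentations

def dist(a, b):                                    # Euclidean distance per sample
    return ((a - b) ** 2).sum(dim=1).sqrt()

img = torch.rand(8, 3, 128, 64)                    # a batch from D0 (dummy data)
neg_img = torch.rand(8, 3, 128, 64)                # randomly chosen negatives

F0, Fneg = N0(img), N0(neg_img)                    # S2 and S4
feats = [aug_nets[i](augment[i](img)) for i in range(M)]              # S3
pos = [dist(F0, f).mean() for f in feats]                             # S5: L1..L5
neg = [dist(Fneg, F0).mean()] + \
      [dist(Fneg, f).mean() for f in feats]                           # S6: L0neg..L5neg

# S7: each augmentation network is driven by Li - Linegative.
for i in range(M):
    grads = torch.autograd.grad(pos[i] - neg[i + 1],
                                list(aug_nets[i].parameters()),
                                retain_graph=True)
    for p, g in zip(aug_nets[i].parameters(), grads):
        p.grad = g

# S8-S9: the main network is driven by L0 = sum(Li) - sum(Linegative).
grads = torch.autograd.grad(sum(pos) - sum(neg), list(N0.parameters()))
for p, g in zip(N0.parameters(), grads):
    p.grad = g

for opt in opt_augs + [opt_main]:                  # apply all updates (S7, S9)
    opt.step()
```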
The above examples are preferred embodiments of the present invention, but the embodiments of the present invention are not limited to them; any change, modification, substitution, combination or simplification made without departing from the spirit and principle of the present invention is an equivalent replacement and falls within the protection scope of the present invention.

Claims (10)

1. An unsupervised pedestrian re-identification method based on an augmented network is characterized by comprising the following steps:
s1: performing an augmentation operation on an unlabeled original pedestrian image dataset D0, wherein the augmentation operation comprises one or more of image scaling, random cropping, random erasing, noise addition and Gaussian blur, to obtain M new augmented datasets D1-DM, where M is a positive integer;
s2: the original image data in the original pedestrian image data set D0 is introduced into a convolutional neural network as a main network N0 to carry out forward propagation extraction to obtain a feature F0;
s3: respectively inputting the corresponding augmented image data of the M augmented datasets D1-DM into M convolutional neural networks with unshared parameters, serving as augmentation networks N1-NM, for forward propagation to extract features F1-FM;
s4: randomly selecting an image Inegative from the original pedestrian image dataset D0 as a negative sample, and feeding it into the main network N0 for forward propagation to extract a feature Fnegative;
s5: calculating the Euclidean distance between the output feature F0 and each of the output features F1-FM to obtain M loss values L1-LM;
s6: calculating the Euclidean distance between the output feature Fnegative and each of the output features F0-FM to obtain M+1 loss values L0negative-LMnegative;
s7: taking, for each augmentation network, the result of subtracting the corresponding loss value among L1negative-LMnegative obtained in S6 from the corresponding loss value among L1-LM obtained in S5 as the loss, and performing backward propagation on the augmentation networks N1-NM to compute gradients and update their parameters;
s8: summing the M loss values L1-LM obtained in S5 and subtracting the sum of the loss values L0negative-LMnegative obtained in S6 to obtain a total loss value L0;
s9: taking the total loss value L0 obtained in S8 as the loss and performing backward propagation on the main network N0 to compute gradients and update the main network parameters;
s10: repeating the operations of S2-S9 until the main network and the augmentation networks converge;
s11: the master network model is taken as output.
2. The unsupervised pedestrian re-identification method based on the augmentation network according to claim 1, wherein in step S1, when the augmentation operation includes image scaling, the image is scaled using bilinear interpolation, so as to simulate the images of various resolutions that may occur in natural datasets. The specific calculation is:
\[ f(x,y) \approx \frac{f(Q_{11})(x_2-x)(y_2-y) + f(Q_{21})(x-x_1)(y_2-y) + f(Q_{12})(x_2-x)(y-y_1) + f(Q_{22})(x-x_1)(y-y_1)}{(x_2-x_1)(y_2-y_1)} \]
where Q11=(x1,y1), Q12=(x1,y2), Q21=(x2,y1) and Q22=(x2,y2) are the four pixel points closest to the point (x,y).
3. The unsupervised pedestrian re-identification method based on the augmentation network according to claim 1, wherein in step S1, when the augmentation operation includes random cropping, the augmentation is performed using a random cropping method, so as to simulate the various partial pedestrian images that may occur in natural datasets. The specific method is as follows:
firstly, a pixel point is randomly selected in the image; then, taking this pixel as the upper left corner, a rectangle of random length and width is formed, and the pixels within the whole rectangle are output as the cropping result.
4. The unsupervised pedestrian re-identification method based on the augmentation network according to claim 1, wherein in step S1, when the augmentation operation includes random erasing, the augmentation is performed using a random erasing method, so as to simulate the various missing or incomplete pedestrian images that may occur in natural datasets. The specific method is as follows:
a pixel point is randomly selected in the image; then, taking this pixel as the upper left corner, a rectangle of random length and width is formed, all pixels within the whole rectangle are set to black, i.e. pixel value (0, 0, 0), and the whole image after this operation is output as the random erasing result.
5. The unsupervised pedestrian re-identification method based on the augmentation network according to claim 1, wherein in step S1, when the augmentation operation includes noise addition, the augmentation is performed using a noise-adding method, so as to simulate the image noise that may occur in natural datasets. The specific operation is as follows:
each pixel becomes, with a certain probability, a white point, i.e. pixel value (255, 255, 255), or a black point, i.e. pixel value (0, 0, 0), and the whole image after this operation is output as the noise-adding result.
6. The unsupervised pedestrian re-identification method based on the augmentation network according to claim 1, wherein in step S1, when the augmentation operation includes Gaussian blur, the augmentation is performed using a Gaussian blur method, so as to simulate the image blur that may occur in natural datasets, according to the following formula:
\[ G(x,y) = \frac{1}{2\pi\sigma^2} e^{-\frac{x^2+y^2}{2\sigma^2}} \]
once the value of sigma is set, a weight matrix can be computed; performing this matrix operation centered on each pixel of the image achieves the blurring of the image.
7. The unsupervised pedestrian re-identification method based on the augmentation network according to claim 1, wherein in step S2 and step S3, the respective pedestrian image data are fed into the corresponding convolutional neural networks, and feature extraction is performed by forward propagation. The specific forward propagation formula is:
\[ a^{l} = \sigma(z^{l}) = \sigma(a^{l-1} * W^{l} + b^{l}) \]
where a denotes the intermediate layer output; σ denotes the activation function; z denotes the input of the activation layer; the superscript denotes the layer number; * denotes the convolution operation; W denotes the convolution kernel; and b denotes the bias.
8. The method for unsupervised pedestrian re-identification based on the augmented network according to claim 1, wherein step S5 specifically comprises:
the Euclidean distance is calculated by the characteristic F0 extracted by the main network N0 and the characteristics F1-FM extracted by the augmentation networks N1-NM respectively, and the specific formula is as follows:
\[ d(x,y) = \sqrt{\sum_{i=1}^{n}(x_i - y_i)^2} \]
where x takes the feature F0 extracted by the main network; y takes, in turn, the features F1-FM extracted by the M augmentation networks; and xi and yi are the values of the corresponding features in each dimension;
the step S6 specifically comprises the following steps:
an image is randomly selected from the dataset as the negative sample, its feature Fnegative is extracted by the main network, and the Euclidean distance is calculated between Fnegative and each of the output features F0-FM.
9. The method for unsupervised pedestrian re-identification based on the augmented network according to claim 1, wherein step S7 specifically comprises:
the calculated error value is propagated back to the corresponding convolutional neural network, and the parameter values of the convolutional neural network are iteratively updated using the backward propagation algorithm. The specific formula is:
\[ \delta^{l-1} = \delta^{l} * \operatorname{rot180}(W^{l}) \odot \sigma'(z^{l-1}) \]
where the superscript denotes the layer number; δ denotes the gradient value; * denotes the convolution operation; W denotes the convolution kernel; rot180 denotes rotating the matrix by 180 degrees, i.e. flipping it vertically and then horizontally; ⊙ denotes element-wise multiplication; and σ' denotes the derivative of the activation function.
10. The method for unsupervised pedestrian re-identification based on the augmented network according to claim 1, wherein step S8 specifically comprises:
the loss values L1-LM, obtained from the Euclidean distances between the features extracted by the augmentation networks and the feature extracted by the main network, are summed, and the sum of L0negative-LMnegative is subtracted from the result to obtain the total error value L0. The specific formula is:
\[ L_0 = \sum_{i=1}^{M} \lambda_i L_i - \sum_{j=0}^{M} \lambda_{j,negative} L_{j,negative} \]
where λi, i ∈ [1, M], are the positive-sample weight values and Li, i ∈ [1, M], the corresponding positive-sample error values, here taking λi = 1; and λj,negative, j ∈ [0, M], are the negative-sample weight values and Lj,negative, j ∈ [0, M], the corresponding negative-sample error values, here taking λj,negative = 1.
CN201911310016.1A 2019-12-18 2019-12-18 Unsupervised pedestrian re-identification method based on augmented network Active CN111062329B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911310016.1A CN111062329B (en) 2019-12-18 2019-12-18 Unsupervised pedestrian re-identification method based on augmented network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911310016.1A CN111062329B (en) 2019-12-18 2019-12-18 Unsupervised pedestrian re-identification method based on augmented network

Publications (2)

Publication Number Publication Date
CN111062329A CN111062329A (en) 2020-04-24
CN111062329B (en) 2023-05-30

Family

ID=70302269

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911310016.1A Active CN111062329B (en) 2019-12-18 2019-12-18 Unsupervised pedestrian re-identification method based on augmented network

Country Status (1)

Country Link
CN (1) CN111062329B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111985645A (en) * 2020-08-28 2020-11-24 北京市商汤科技开发有限公司 Neural network training method and device, electronic equipment and storage medium
CN112043260B (en) * 2020-09-16 2022-11-15 杭州师范大学 Electrocardiogram classification method based on local mode transformation
CN112200187A (en) * 2020-10-16 2021-01-08 广州云从凯风科技有限公司 Target detection method, device, machine readable medium and equipment
CN112580720B (en) * 2020-12-18 2024-07-09 华为技术有限公司 Model training method and device
CN113033410B (en) * 2021-03-26 2023-06-06 中山大学 Domain generalization pedestrian re-recognition method, system and medium based on automatic data enhancement

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109583379A (en) * 2018-11-30 2019-04-05 常州大学 A kind of pedestrian's recognition methods again being aligned network based on selective erasing pedestrian

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109426858B (en) * 2017-08-29 2021-04-06 京东方科技集团股份有限公司 Neural network, training method, image processing method, and image processing apparatus

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109583379A (en) * 2018-11-30 2019-04-05 常州大学 A kind of pedestrian's recognition methods again being aligned network based on selective erasing pedestrian

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Fast Open-World Person Re-Identification; Wei-Shi Zheng et al.; IEEE Transactions on Image Processing; 2017-08-16; Vol. 27, No. 5; pp. 1-2 *

Also Published As

Publication number Publication date
CN111062329A (en) 2020-04-24

Similar Documents

Publication Publication Date Title
CN111062329B (en) Unsupervised pedestrian re-identification method based on augmented network
CN112287940B (en) Semantic segmentation method of attention mechanism based on deep learning
Li et al. Semantic relationships guided representation learning for facial action unit recognition
CN109886121B (en) Human face key point positioning method for shielding robustness
CN110210551B (en) Visual target tracking method based on adaptive subject sensitivity
CN108256562B (en) Salient target detection method and system based on weak supervision time-space cascade neural network
CN107945118B (en) Face image restoration method based on generating type confrontation network
WO2019136591A1 (en) Salient object detection method and system for weak supervision-based spatio-temporal cascade neural network
CN111861886B (en) Image super-resolution reconstruction method based on multi-scale feedback network
CN113052775B (en) Image shadow removing method and device
CN116682120A (en) Multilingual mosaic image text recognition method based on deep learning
CN114898284B (en) Crowd counting method based on feature pyramid local difference attention mechanism
CN116310693A (en) Camouflage target detection method based on edge feature fusion and high-order space interaction
CN110751271B (en) Image traceability feature characterization method based on deep neural network
CN114821050B (en) Method for dividing reference image based on transformer
CN113378812A (en) Digital dial plate identification method based on Mask R-CNN and CRNN
Wang et al. Msfnet: multistage fusion network for infrared and visible image fusion
Pang et al. PTRSegNet: A Patch-to-Region Bottom-Up Pyramid Framework for the Semantic Segmentation of Large-Format Remote Sensing Images
Li et al. Semantic prior-driven fused contextual transformation network for image inpainting
AU2021104479A4 (en) Text recognition method and system based on decoupled attention mechanism
CN114708591A (en) Document image Chinese character detection method based on single character connection
Sun et al. Knock knock, who’s there: Facial recognition using CNN-based classifiers
CN116188774B (en) Hyperspectral image instance segmentation method and building instance segmentation method
Ding et al. Pointnet: Learning Point Representation for High-Resolution Remote Sensing Imagery Land-Cover Classification
Zhang Image Style Transfer based on DeepLabV3+ Semantic Segmentation

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant