CN112395997A - Weakly supervised training method for a pedestrian re-identification model based on differentiable graph learning - Google Patents
Weakly supervised training method for a pedestrian re-identification model based on differentiable graph learning
- Publication number
- CN112395997A CN112395997A CN202011303629.5A CN202011303629A CN112395997A CN 112395997 A CN112395997 A CN 112395997A CN 202011303629 A CN202011303629 A CN 202011303629A CN 112395997 A CN112395997 A CN 112395997A
- Authority
- CN
- China
- Prior art keywords
- pedestrian
- label
- model
- bag
- picture
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
- G06V20/53—Recognition of crowd images, e.g. recognition of crowd congestion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention provides a weakly supervised training method for a pedestrian re-identification model based on differentiable graph learning. Pedestrian pictures are first grouped into bags according to their shooting time period, and each bag is assigned a bag-level category label. The dependency relationships among the pictures in each bag are then captured to generate a reliable pseudo pedestrian-category label for every picture in the bag; these pseudo labels serve as the supervision signal for training the re-identification model. The re-identification model and the graph model are trained jointly: a linear combination of the graph-model loss and the re-identification loss is used as the total loss function, and the parameters of all network layers are updated by back-propagation. The method achieves state-of-the-art model performance without heavy manual labeling cost and with almost no increase in computational complexity.
Description
Technical Field
The invention relates to the technical field of machine vision, and in particular to a weakly supervised training method for a pedestrian re-identification model based on differentiable graph learning.
Background
At present, pedestrian re-identification is mainly realized in three ways: (1) extracting discriminative features; (2) learning a stable metric or subspace for matching; (3) a combination of the two. However, most implementations require strongly supervised training labels, i.e. manual annotation of every picture in the dataset. Unsupervised pedestrian re-identification methods that avoid manual labeling rely on local saliency matching or clustering models, but the pronounced appearance differences across camera views are difficult to model, so high accuracy is hard to achieve. In contrast, the weakly supervised pedestrian re-identification method provided by the invention achieves high accuracy without expensive manual labeling.
Weakly supervised learning: although training deep neural networks with weak supervision is challenging, it has been studied for tasks such as image classification, semantic segmentation and object detection. Like those studies, the present invention is based on pseudo-label generation, but weakly supervised pedestrian re-identification has two distinctive characteristics: (1) no single representative image exists for each pedestrian, since people may change clothes within a short time, so the labels are ambiguous; (2) the entropy is larger than in other tasks; for example, the pixels of an image in weakly supervised semantic segmentation are relatively stable, whereas pedestrians in re-identification are far more disordered and irregular. These two characteristics increase the difficulty of weakly supervised pedestrian re-identification.
Learning with uncertain labels: single-sample (one-shot) pedestrian re-identification is the setting most relevant to the present invention, but there are two differences: (1) one-shot re-identification requires at least one picture instance per pedestrian category, whereas the dataset of the present method needs no exact pedestrian-category labels; (2) the present method introduces the bag category label to constrain and guide the estimation of pseudo pedestrian-category labels, which makes pseudo-label generation more reliable than in one-shot re-identification.
Pedestrian search: this combines pedestrian detection with pedestrian re-identification. The present invention differs from it in two main respects: (1) the invention is concerned only with visual feature matching, because the capability of current pedestrian detectors is adequate; (2) the present method benefits from low-cost weak labels, whereas every training picture in pedestrian search still requires strong labels.
Application No. 201710487019.7 discloses an image quality scoring method using a deep generative machine-learning model, which builds a generative model of expected good-quality images for scoring images from a medical scanner. The deviation of the input image from the generative model is used as the input feature vector of a discriminative model, which may also operate on additional feature vectors derived from the input image. However, that patent cannot express graph learning as a loss function that is differentiable with respect to the network parameters, so its graph learning cannot be optimized by stochastic gradient descent, and the integrated training of a graph model and a pedestrian re-identification model cannot be realized.
Disclosure of Invention
The invention provides a weakly supervised training method for a pedestrian re-identification model based on differentiable graph learning, in which a module that automatically generates training labels is added to the pedestrian re-identification deep neural network and trained jointly with it, thereby reducing algorithmic complexity.
In order to achieve the technical effects, the technical scheme of the invention is as follows:
A weakly supervised training method for a pedestrian re-identification model based on differentiable graph learning comprises the following steps:
S1: group the pedestrian pictures into bags according to their shooting time period and assign bag category labels;
S2: capture the dependency relationships among the pictures in each bag to generate a reliable pseudo pedestrian-category label for each picture in the bag, the label serving as supervision information for training the pedestrian re-identification model;
S3: carry out integrated training of the pedestrian re-identification model and the graph model;
S4: take the linear combination of the graph-model loss and the re-identification loss as the total loss function, and update the parameters of all network layers with the back-propagation algorithm.
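Step S1 can be sketched in plain Python as follows. The 600-second window length, the `(timestamp, image_id)` picture representation and the function name are illustrative assumptions; the patent fixes neither a concrete period length nor a data format, and the bag-level label (the identities known to appear in each window) would be attached separately.

```python
from collections import defaultdict

def group_into_bags(pictures, period_seconds=600):
    """Group (timestamp, image_id) pairs into bags by shooting time period.

    Hypothetical sketch of step S1: pictures whose timestamps fall in the
    same fixed-length window join the same bag.
    """
    bags = defaultdict(list)
    for timestamp, image_id in pictures:
        window = int(timestamp // period_seconds)  # index of the time window
        bags[window].append(image_id)
    return dict(bags)

pics = [(5, "a"), (30, "b"), (700, "c"), (1300, "d")]
bags = group_into_bags(pics, period_seconds=600)
# pictures taken within the same 600 s window share a bag
```
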
Further, the specific process of step S1 is:
Denote by b a bag containing p pictures, i.e. b = {x_1, x_2, …, x_j, …, x_p}, with corresponding labels y = {y_1, y_2, …, y_j, …, y_p}; the bag category label is denoted by l.
Further, the process of step S2 is:
In weakly supervised pedestrian re-identification only the bag category label l is available, so a pseudo pedestrian-category label must be estimated for each picture, represented by a probability vector Y. Assuming that the bag under label l contains n pedestrian categories while the whole training set contains m pedestrian categories, and using the bag category label to constrain Y, the probability vector of the pedestrian-category label of each picture x_j is:
Y_j[k] = 1/n if category k ∈ l, and 0 otherwise (1)
further, the process of step S3 is:
A directed graph is defined in which each node represents a picture x_i in a bag and each edge represents the relationship between two pictures. The energy function of assigning pedestrian-category labels y to the nodes x of the graph is:
E(y|x) = Σ_{i∈U} Φ(y_i|x_i) + Σ_{(i,j)∈V} Ψ(y_i, y_j|x_i, x_j) (2)
where U and V denote the node and edge sets respectively, Φ(y_i|x_i) is the unary term computing the cost of assigning label y_i to picture x_i, and Ψ(y_i, y_j|x_i, x_j) is the pairwise term computing the cost of assigning a label pair to the picture pair (x_i, x_j); minimizing formula (2) eliminates erroneous pseudo labels generated by weakly supervised learning;
the unary term in formula (2) is defined as:
Φ(y_i|x_i) = -log([Y_i ⊙ P_i][y_i]) (3)
where P_i is the probability of the pedestrian-category label computed by the neural network for picture x_i, Y_i is the bag constraint expressed by formula (1), ⊙ denotes the element-wise product, and [·] denotes vector indexing;
since the unary terms of different pictures are output independently of one another, they are unstable and need to be smoothed by the pairwise term:
Ψ(y_i, y_j|x_i, x_j) = ζ(y_i, y_j) · exp(-‖f_i − f_j‖² / (2σ²)) (4)
The appearance similarity is computed with a Gaussian kernel on RGB-color features f, the kernel width is controlled by the hyper-parameter σ, and pictures with similar appearance are constrained to share the same label; the label compatibility ζ(y_i, y_j) is expressed by the Potts model:
ζ(y_i, y_j) = 1 if y_i ≠ y_j, and 0 otherwise (5)
Further, the bag category label contains additional information that improves pseudo-label generation: the estimated pseudo label is corrected to the pedestrian category with the highest prediction score in the bag, and some pictures may be assigned to pedestrian categories that were not predicted. The pseudo pedestrian-category label of each picture is obtained by minimizing formula (2):
ŷ = argmin_{y ∈ {1,2,…,m}^p} E(y|x) (6)
where {1, 2, 3, …, m} denotes all pedestrian categories of the training set.
Further, in step S3, before performing the integrated training of the pedestrian re-identification model and the graph model, the graph model needs to be miniaturized, and the specific process is as follows:
obtaining a pseudo-pedestrian category label by using an external graph model for supervising training of a pedestrian to re-identify a deep neural network, wherein the calculation of obtaining the pseudo label by minimizing the formula (2) is not trivial, so that the graph model is not compatible with the deep neural network, and therefore the relaxation formula (2) is required to be:
the discrete Φ and Ψ are serialized:
the difference between the formula (8) and the formula (3) is that in the non-micrographic model, all possible y needs to be input into the energy function, and the y with the lowest energy is taken as the optimal solution; in a micro-map model, directly inputting a picture x into a deep neural network to obtain the prediction of y; the difference between equation (9) and equation (4) is that the cross-entropy term- (Y) is usediPi)T log(YjPj) Approximation of the irreducible term ζ (y) in equation (4)i,yj)YiYj。
Further, in the step S4, the graph model loss LDrawing (A)And classification/re-identification loss LClassification,LClassificationIs a pseudo tagAs a supervised normalized exponential cross entropy loss function:
whereinShow thatConversion into a function of the unique heat vector, n representing the number of pictures in a bag, PiThe probability representing the pedestrian class calculated by the neural network is a normalized exponential function of the logarithm of the network output z:
where m represents the number of pedestrian classes of the training set, the total loss function L is a linear combination of the two loss functions:
L=wclassificationLClassification+wDrawing (A)LDrawing (A) (12)
Wherein wClassificationAnd wDrawing (A)The weights representing the two losses, respectively, are set to 1 and 0.5, respectively.
Compared with the prior art, the technical scheme of the invention has the beneficial effects that:
The invention combines differentiable graph learning with weakly supervised learning: a module that automatically generates training labels is added to the pedestrian re-identification deep neural network and trained jointly with it.
Drawings
FIG. 1 shows the graph model that generates pseudo pedestrian-category labels for a bag of pictures;
FIG. 2 is a training flow diagram of the present invention.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the patent;
for the purpose of better illustrating the embodiments, certain features of the drawings may be omitted, enlarged or reduced, and do not represent the size of an actual product;
it will be understood by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted.
The technical solution of the present invention is further described below with reference to the accompanying drawings and examples.
A weakly supervised training method for a pedestrian re-identification model based on differentiable graph learning comprises the following steps:
1. From supervised pedestrian re-identification to weakly supervised pedestrian re-identification
Denote by b a bag containing p pictures, i.e. b = {x_1, x_2, …, x_j, …, x_p}, with corresponding labels y = {y_1, y_2, …, y_j, …, y_p}; the bag category label is denoted by l. Supervised pedestrian re-identification uses the pedestrian-category labels y to supervise the classification predictions of the model; weakly supervised pedestrian re-identification uses only the bag category label l, so a pseudo pedestrian-category label must be estimated for each picture, represented by a probability vector Y. Assuming that l contains n pedestrian categories while the whole training set contains m categories, and constraining Y with the bag category label, the probability vector of the pedestrian-category label of each picture x_j is:
Y_j[k] = 1/n if category k ∈ l, and 0 otherwise (1)
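The bag constraint and its interaction with the network prediction can be sketched as follows. The uniform 1/n mass over the bag's categories is an assumed reading of formula (1), and the renormalization helper is an illustrative addition, not part of the patent's stated method.

```python
def bag_constraint(bag_classes, m):
    """Vector Y restricting a picture's label to the classes in its bag:
    entry k is nonzero only when class k appears in the bag label l."""
    n = len(bag_classes)
    return [1.0 / n if k in bag_classes else 0.0 for k in range(m)]

def constrained_probs(P, Y):
    """Element-wise product Y ⊙ P followed by renormalization, so that
    probability mass outside the bag's categories is removed."""
    masked = [p * y for p, y in zip(P, Y)]
    s = sum(masked)
    return [v / s for v in masked] if s > 0 else masked

Y = bag_constraint({0, 2}, m=4)   # the bag contains categories 0 and 2
P = [0.1, 0.6, 0.2, 0.1]          # network prediction over m = 4 categories
Q = constrained_probs(P, Y)       # category 1, outside the bag, is zeroed out
```
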
2. Weakly supervised pedestrian re-identification based on differentiable graph learning
Graph model pedestrian re-identification
As shown in FIG. 1, a directed graph is defined in which each node represents a picture x_i in a bag and each edge represents the relationship between two pictures. The energy function of assigning pedestrian-category labels y to the nodes x of the graph is:
E(y|x) = Σ_{i∈U} Φ(y_i|x_i) + Σ_{(i,j)∈V} Ψ(y_i, y_j|x_i, x_j) (2)
where U and V denote the node and edge sets respectively, Φ(y_i|x_i) is the unary term computing the cost of assigning label y_i to picture x_i, and Ψ(y_i, y_j|x_i, x_j) is the pairwise term computing the cost of assigning a label pair to the picture pair (x_i, x_j). Minimizing formula (2) eliminates erroneous pseudo labels generated by weakly supervised learning.
Unary term
The unary term in formula (2) is defined as:
Φ(y_i|x_i) = -log([Y_i ⊙ P_i][y_i]) (3)
where P_i is the probability of the pedestrian-category label computed by the neural network for picture x_i, Y_i is the bag constraint expressed by formula (1), ⊙ denotes the element-wise product, and [·] denotes vector indexing.
Pairwise term
Since the output of the unary terms of different pictures are independent of each other, the unary terms are unstable and need to be smoothed by the pairwise terms:
calculating the appearance similarity by using a Gaussian kernel based on RGB colors, controlling the size of the Gaussian kernel by using a super-parameter sigma, and limiting pictures with similar appearances to have the same label; tag compatibility ζ (y)i,yj) Expressed by the glass model:
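A minimal sketch of the pairwise term: Potts compatibility times a Gaussian kernel on appearance. Representing each picture by its mean RGB color and the σ = 25 kernel width are illustrative assumptions; the patent only states that the kernel is based on RGB colors with a hyper-parameter σ.

```python
import math

def potts(yi, yj):
    """Potts label compatibility: cost 1 for disagreeing labels, 0 otherwise."""
    return 1.0 if yi != yj else 0.0

def appearance_similarity(rgb_i, rgb_j, sigma=25.0):
    """Gaussian kernel on (assumed) mean-RGB features; sigma sets the width."""
    d2 = sum((a - b) ** 2 for a, b in zip(rgb_i, rgb_j))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def pairwise_term(yi, yj, rgb_i, rgb_j, sigma=25.0):
    """Cost of a label pair: high when similar-looking pictures disagree."""
    return potts(yi, yj) * appearance_similarity(rgb_i, rgb_j, sigma)

same = pairwise_term(3, 3, (120, 80, 60), (121, 82, 61))  # agreement: zero cost
diff = pairwise_term(3, 7, (120, 80, 60), (121, 82, 61))  # similar looks, different labels
```
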
Bag constraint
Indeed, the bag category label contains additional information that improves pseudo-label generation: the estimated pseudo label is corrected to the pedestrian category with the highest prediction score in the bag, and some pictures may be assigned to pedestrian categories that were not predicted.
Inference of pseudo pedestrian-category labels
The pseudo pedestrian-category label of each picture is obtained by minimizing formula (2):
ŷ = argmin_{y ∈ {1,2,…,m}^p} E(y|x) (6)
where {1, 2, 3, …, m} denotes all pedestrian categories of the training set.
Making the graph model differentiable
The weakly supervised pedestrian re-identification method described so far cannot be trained end to end, because an external graph model would first be needed to obtain pseudo pedestrian-category labels for supervising the training of the re-identification deep neural network. The minimization of formula (2) is not differentiable, which makes the graph model incompatible with the deep neural network; formula (2) is therefore relaxed to:
Ẽ(x) = Σ_{i∈U} Φ̃(x_i) + Σ_{(i,j)∈V} Ψ̃(x_i, x_j) (7)
and the discrete Φ and Ψ are made continuous:
Φ̃(x_i) = -log([Y_i ⊙ P_i][ŷ_i]) (8)
Ψ̃(x_i, x_j) = -exp(-‖f_i − f_j‖² / (2σ²)) · (Y_i ⊙ P_i)^T log(Y_j ⊙ P_j) (9)
The difference between formula (8) and formula (3) is that the non-differentiable graph model must feed every possible y into the energy function and take the y with the lowest energy as the optimal solution, whereas the differentiable graph model feeds the picture x directly into the deep neural network to obtain the prediction of y. The difference between formula (9) and formula (4) is that the cross-entropy term -(Y_i ⊙ P_i)^T log(Y_j ⊙ P_j) approximates the non-differentiable term ζ(y_i, y_j)Y_iY_j of formula (4).
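The continuous surrogate for the Potts pair term can be sketched as follows. It follows the cross-entropy form stated for formula (9), with the appearance similarity folded into a scalar weight `sim`; treating the inputs as already bag-masked and renormalized probabilities, and the epsilon for numerical safety, are illustrative assumptions.

```python
import math

def smooth_pairwise(Qi, Qj, sim):
    """Continuous surrogate for the Potts pair term: the cross-entropy
    between two pictures' (bag-masked) class distributions, weighted by
    their appearance similarity sim."""
    eps = 1e-12  # avoid log(0)
    ce = -sum(qi * math.log(qj + eps) for qi, qj in zip(Qi, Qj))
    return sim * ce

agree    = smooth_pairwise([0.9, 0.1], [0.9, 0.1], sim=1.0)
disagree = smooth_pairwise([0.9, 0.1], [0.1, 0.9], sim=1.0)
# disagreeing predictions on similar-looking pictures incur a larger loss,
# and the expression is differentiable in the network outputs
```
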
3. Overall neural network structure
FIG. 2 shows the network structure for training and inference; dashed lines represent the training data flow and solid lines the inference data flow, and the graph model participates only in the training phase. The overall structure comprises three main modules:
feature extraction module
Referring to FIG. 2(a), ResNet-50 is used as the backbone network; the last layer of the original ResNet-50 is removed and replaced by a fully connected layer with 512-dimensional output, batch normalization, a leaky rectified linear unit (leaky ReLU) and dropout.
Coarse pedestrian re-identification module
As shown in FIG. 2(b), a fully connected layer whose output dimension equals the number of pedestrian categories is added on top of the feature extraction module, followed by a softmax (normalized exponential) cross-entropy loss function. The pedestrian-category prediction score serves as the coarse re-identification estimate and represents the pedestrian-category probability of each picture in bag b.
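The softmax cross-entropy used by this module can be sketched with stdlib Python; the concrete logit values are illustrative, since the real module operates on the 512-dimensional ResNet-50 features.

```python
import math

def softmax(z):
    """Normalized exponential of the logits z."""
    mx = max(z)
    e = [math.exp(v - mx) for v in z]  # subtract the max for numerical stability
    s = sum(e)
    return [v / s for v in e]

def cross_entropy(z, label):
    """Softmax cross-entropy of logits z against a one-hot target class."""
    return -math.log(softmax(z)[label])

logits = [2.0, 0.5, -1.0]
loss_good = cross_entropy(logits, 0)  # confident and correct: small loss
loss_bad  = cross_entropy(logits, 2)  # wrong class: large loss
```
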
Refined pedestrian re-identification module
As shown in FIG. 2(c), the coarse re-identification scores, the appearance features and the bag constraint are fed into the graph model according to formulas (8) and (9); the pseudo labels produced by the graph model can then update the network parameters just like manually annotated ground-truth labels.
4. Optimization
Once the pseudo pedestrian-category labels are obtained, the gradient of the overall loss with respect to the deep-neural-network parameters can be computed and propagated back to all layers of the network, realizing integrated training of all parameters of the weakly supervised model.
Loss function
The optimization objective of the method comprises the graph-model loss L_graph and the classification (re-identification) loss L_cls; L_cls is a softmax (normalized exponential) cross-entropy loss supervised by the pseudo label ŷ:
L_cls = -(1/n) Σ_{i=1}^n onehot(ŷ_i)^T · log(P_i) (10)
where onehot(·) converts the pseudo label ŷ_i into a one-hot vector, n is the number of pictures in a bag, and P_i is the pedestrian-category probability computed by the neural network as the softmax of the output logits z:
P_i[k] = exp(z_k) / Σ_{j=1}^m exp(z_j) (11)
where m is the number of pedestrian categories of the training set.
The total loss function L is a linear combination of these two losses:
L = w_cls·L_cls + w_graph·L_graph (12)
where w_cls and w_graph are the weights of the two losses, set to 1 and 0.5 respectively.
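The total loss of formula (12) is a plain weighted sum; a one-line sketch with the weights stated in the text (the input loss values are illustrative):

```python
def total_loss(l_cls, l_graph, w_cls=1.0, w_graph=0.5):
    """Linear combination of the re-identification (classification) loss and
    the graph-model loss, with default weights 1 and 0.5 as stated."""
    return w_cls * l_cls + w_graph * l_graph

L = total_loss(0.8, 0.4)  # 1.0 * 0.8 + 0.5 * 0.4
```
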
The same or similar reference numerals correspond to the same or similar parts;
the positional relationships depicted in the drawings are for illustrative purposes only and are not to be construed as limiting the present patent;
it should be understood that the above-described embodiments of the present invention are merely examples for clearly illustrating the present invention, and are not intended to limit the embodiments of the present invention. Other variations and modifications will be apparent to persons skilled in the art in light of the above description. And are neither required nor exhaustive of all embodiments. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the claims of the present invention.
Claims (10)
1. A weakly supervised training method for a pedestrian re-identification model based on differentiable graph learning, characterized by comprising the following steps:
S1: group the pedestrian pictures into bags according to their shooting time period and assign bag category labels;
S2: capture the dependency relationships among the pictures in each bag to generate a reliable pseudo pedestrian-category label for each picture in the bag, the label serving as supervision information for training the pedestrian re-identification model;
S3: carry out integrated training of the pedestrian re-identification model and the graph model;
S4: take the linear combination of the graph-model loss and the re-identification loss as the total loss function, and update the parameters of all network layers with the back-propagation algorithm.
2. The weakly supervised training method for a pedestrian re-identification model based on differentiable graph learning of claim 1, wherein the specific process of step S1 is:
Denote by b a bag containing p pictures, i.e. b = {x_1, x_2, …, x_j, …, x_p}, with corresponding labels y = {y_1, y_2, …, y_j, …, y_p}; the bag category label is denoted by l.
3. The weakly supervised training method for a pedestrian re-identification model based on differentiable graph learning of claim 1, wherein the process of step S2 is:
In weakly supervised pedestrian re-identification only the bag category label l is available, so a pseudo pedestrian-category label must be estimated for each picture, represented by a probability vector Y; assuming that the bag under label l contains n pedestrian categories while the whole training set contains m pedestrian categories, and using the bag category label to constrain Y, the probability vector of the pedestrian-category label of each picture x_j is:
Y_j[k] = 1/n if category k ∈ l, and 0 otherwise (1)
4. The weakly supervised training method for a pedestrian re-identification model based on differentiable graph learning of claim 3, wherein the process of step S3 is:
A directed graph is defined in which each node represents a picture x_i in a bag and each edge represents the relationship between two pictures; the energy function of assigning pedestrian-category labels y to the nodes x of the graph is:
E(y|x) = Σ_{i∈U} Φ(y_i|x_i) + Σ_{(i,j)∈V} Ψ(y_i, y_j|x_i, x_j) (2)
where U and V denote the node and edge sets respectively, Φ(y_i|x_i) is the unary term computing the cost of assigning label y_i to picture x_i, and Ψ(y_i, y_j|x_i, x_j) is the pairwise term computing the cost of assigning a label pair to the picture pair (x_i, x_j); minimizing formula (2) eliminates erroneous pseudo labels generated by weakly supervised learning;
the unary term in formula (2) is defined as:
Φ(y_i|x_i) = -log([Y_i ⊙ P_i][y_i]) (3)
where P_i is the probability of the pedestrian-category label computed by the neural network for picture x_i, Y_i is the bag constraint expressed by formula (1), ⊙ denotes the element-wise product, and [·] denotes vector indexing;
since the unary terms of different pictures are output independently of one another, they are unstable and need to be smoothed by the pairwise term:
Ψ(y_i, y_j|x_i, x_j) = ζ(y_i, y_j) · exp(-‖f_i − f_j‖² / (2σ²)) (4)
the appearance similarity is computed with a Gaussian kernel on RGB-color features f, the kernel width is controlled by the hyper-parameter σ, and pictures with similar appearance are constrained to share the same label; the label compatibility ζ(y_i, y_j) is expressed by the Potts model:
ζ(y_i, y_j) = 1 if y_i ≠ y_j, and 0 otherwise (5)
5. The weakly supervised training method for a pedestrian re-identification model based on differentiable graph learning of claim 4, wherein the bag category label contains additional information that improves pseudo-label generation: the estimated pseudo label is corrected to the pedestrian category with the highest prediction score in the bag, and some pictures may be assigned to pedestrian categories that were not predicted.
6. The weakly supervised training method for a pedestrian re-identification model based on differentiable graph learning of claim 5, wherein the pseudo pedestrian-category label of each picture is obtained by minimizing formula (2):
ŷ = argmin_{y ∈ {1,2,…,m}^p} E(y|x) (6)
where {1, 2, 3, …, m} denotes all pedestrian categories of the training set.
7. The weakly supervised training method for a pedestrian re-identification model based on differentiable graph learning of claim 6, wherein in step S3, before the integrated training of the pedestrian re-identification model and the graph model, the graph model must be made differentiable, the specific process being:
if an external graph model were used to obtain pseudo pedestrian-category labels for supervising the training of the re-identification deep neural network, the minimization of formula (2) would not be differentiable, making the graph model incompatible with the deep neural network; formula (2) is therefore relaxed to:
Ẽ(x) = Σ_{i∈U} Φ̃(x_i) + Σ_{(i,j)∈V} Ψ̃(x_i, x_j) (7)
and the discrete Φ and Ψ are made continuous:
Φ̃(x_i) = -log([Y_i ⊙ P_i][ŷ_i]) (8)
Ψ̃(x_i, x_j) = -exp(-‖f_i − f_j‖² / (2σ²)) · (Y_i ⊙ P_i)^T log(Y_j ⊙ P_j) (9)
the difference between formula (8) and formula (3) is that the non-differentiable graph model must feed every possible y into the energy function and take the y with the lowest energy as the optimal solution, whereas the differentiable graph model feeds the picture x directly into the deep neural network to obtain the prediction of y; the difference between formula (9) and formula (4) is that the cross-entropy term -(Y_i ⊙ P_i)^T log(Y_j ⊙ P_j) approximates the non-differentiable term ζ(y_i, y_j)Y_iY_j of formula (4).
8. The weakly supervised training method for a pedestrian re-identification model based on differentiable graph learning of claim 7, wherein in step S4 the optimization objective comprises the graph-model loss L_graph and the classification (re-identification) loss L_cls; L_cls is a softmax (normalized exponential) cross-entropy loss supervised by the pseudo label ŷ:
L_cls = -(1/n) Σ_{i=1}^n onehot(ŷ_i)^T · log(P_i) (10)
where onehot(·) converts the pseudo label ŷ_i into a one-hot vector, n is the number of pictures in a bag, and P_i is the pedestrian-category probability computed by the neural network as the softmax of the output logits z:
P_i[k] = exp(z_k) / Σ_{j=1}^m exp(z_j) (11)
where m is the number of pedestrian categories of the training set; the total loss function L is a linear combination of the two losses:
L = w_cls·L_cls + w_graph·L_graph (12)
where w_cls and w_graph denote the weights of the two losses respectively.
9. The method of claim 8, wherein w_cls is set to 1.
10. The method of claim 8, wherein w_graph is set to 0.5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011303629.5A CN112395997B (en) | 2020-11-19 | 2020-11-19 | Weak supervision training method based on pedestrian re-recognition model capable of micro-graph learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112395997A true CN112395997A (en) | 2021-02-23 |
CN112395997B CN112395997B (en) | 2023-11-24 |
Family
ID=74605913
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011303629.5A Active CN112395997B (en) | 2020-11-19 | 2020-11-19 | Weak supervision training method based on pedestrian re-recognition model capable of micro-graph learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112395997B (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112949630A (en) * | 2021-03-01 | 2021-06-11 | 北京交通大学 | Weak supervision target detection method based on frame classification screening |
CN113128410A (en) * | 2021-04-21 | 2021-07-16 | 湖南大学 | Weak supervision pedestrian re-identification method based on track association learning |
CN113688781A (en) * | 2021-09-08 | 2021-11-23 | 北京邮电大学 | Pedestrian re-identification anti-attack method with blocking elasticity |
CN114913472A (en) * | 2022-02-23 | 2022-08-16 | 北京航空航天大学 | Infrared video pedestrian significance detection method combining graph learning and probability propagation |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2019136946A1 (en) * | 2018-01-15 | 2019-07-18 | 中山大学 | Deep learning-based weakly supervised salient object detection method and system |
CN110942025A (en) * | 2019-11-26 | 2020-03-31 | 河海大学 | Unsupervised cross-domain pedestrian re-identification method based on clustering |
CN111445488A (en) * | 2020-04-22 | 2020-07-24 | 南京大学 | Method for automatically identifying and segmenting salt body through weak supervised learning |
CN111723645A (en) * | 2020-04-24 | 2020-09-29 | 浙江大学 | Multi-camera high-precision pedestrian re-identification method for in-phase built-in supervised scene |
CN111860678A (en) * | 2020-07-29 | 2020-10-30 | 中国矿业大学 | Unsupervised cross-domain pedestrian re-identification method based on clustering |
- 2020-11-19: CN202011303629.5A patent CN112395997B (en), status active
Non-Patent Citations (1)
Title |
---|
Zheng Baoyu; Wang Yu; Wu Jinwen; Zhou Quan: "Weakly supervised image semantic segmentation based on deep convolutional neural networks", Journal of Nanjing University of Posts and Telecommunications (Natural Science Edition), no. 05, pages 5-16 *
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112949630A (en) * | 2021-03-01 | 2021-06-11 | 北京交通大学 | Weak supervision target detection method based on frame classification screening |
CN112949630B (en) * | 2021-03-01 | 2024-03-19 | 北京交通大学 | Weak supervision target detection method based on frame hierarchical screening |
CN113128410A (en) * | 2021-04-21 | 2021-07-16 | 湖南大学 | Weak supervision pedestrian re-identification method based on track association learning |
CN113688781A (en) * | 2021-09-08 | 2021-11-23 | 北京邮电大学 | Pedestrian re-identification anti-attack method with blocking elasticity |
CN113688781B (en) * | 2021-09-08 | 2023-09-15 | 北京邮电大学 | Pedestrian re-identification anti-attack method capable of shielding elasticity |
CN114913472A (en) * | 2022-02-23 | 2022-08-16 | 北京航空航天大学 | Infrared video pedestrian significance detection method combining graph learning and probability propagation |
CN114913472B (en) * | 2022-02-23 | 2024-06-25 | 北京航空航天大学 | Infrared video pedestrian significance detection method combining graph learning and probability propagation |
Also Published As
Publication number | Publication date |
---|---|
CN112395997B (en) | 2023-11-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112395997A (en) | Weak supervision training method of pedestrian re-recognition model based on micrographic learning | |
Aly et al. | DeepArSLR: A novel signer-independent deep learning framework for isolated arabic sign language gestures recognition | |
Kong et al. | Interactive phrases: Semantic descriptions for human interaction recognition | |
CN112182166B (en) | Text matching method and device, electronic equipment and storage medium | |
CN107203753B (en) | Action recognition method based on fuzzy neural network and graph model reasoning | |
Mao et al. | Hierarchical Bayesian theme models for multipose facial expression recognition | |
Felzenszwalb et al. | Object detection grammars. | |
CN110309306A (en) | A kind of Document Modeling classification method based on WSD level memory network | |
Wang et al. | Ppp: Joint pointwise and pairwise image label prediction | |
Ding et al. | Inferring social relations from visual concepts | |
Singla et al. | Discovery of social relationships in consumer photo collections using markov logic | |
CN113761887A (en) | Matching method and device based on text processing, computer equipment and storage medium | |
Katkade et al. | Advances in real-time object detection and information retrieval: A review | |
Joodi et al. | Increasing validation accuracy of a face mask detection by new deep learning model-based classification | |
CN113627237A (en) | Late-stage fusion face image clustering method and system based on local maximum alignment | |
Chergui et al. | Investigating deep cnns models applied in kinship verification through facial images | |
Srininvas et al. | A framework to recognize the sign language system for deaf and dumb using mining techniques | |
Yun et al. | Head pose classification by multi-class AdaBoost with fusion of RGB and depth images | |
de Souza et al. | Building semantic understanding beyond deep learning from sound and vision | |
Yadaiah et al. | A Fuzzy logic based soft computing approach in CBIR system using incremental filtering feature selection to identify patterns | |
Dohnálek et al. | Application and comparison of modified classifiers for human activity recognition | |
Jain et al. | Unsupervised temporal segmentation of human action using community detection | |
Lagunes-Fortiz et al. | Centroids Triplet Network and Temporally-Consistent Embeddings for In-Situ Object Recognition | |
Balfaqih et al. | An Intelligent Movies Recommendation System Based Facial Attributes Using Machine Learning | |
Pereira et al. | Real-Time Multi-Stage Deep Learning Pipeline for Facial Recognition by Service Robots |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
CB03 | Change of inventor or designer information | Inventor after: Nie Lin; Zhang Jiqi; Lin Jing; Wang Guangrun; Wang Guangcong. Inventor before: Zhang Jiqi; Lin Jing; Nie Lin; Wang Guangrun; Wang Guangcong ||
GR01 | Patent grant | ||