CN112395997A - Weak supervision training method of pedestrian re-recognition model based on micrographic learning - Google Patents

Weak supervision training method of pedestrian re-recognition model based on micrographic learning Download PDF

Info

Publication number
CN112395997A
CN112395997A CN202011303629.5A CN202011303629A CN112395997A CN 112395997 A CN112395997 A CN 112395997A CN 202011303629 A CN202011303629 A CN 202011303629A CN 112395997 A CN112395997 A CN 112395997A
Authority
CN
China
Prior art keywords
pedestrian
label
model
bag
picture
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011303629.5A
Other languages
Chinese (zh)
Other versions
CN112395997B (en
Inventor
张吉祺
林倞
聂琳
王广润
王广聪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sun Yat Sen University
Original Assignee
Sun Yat Sen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sun Yat Sen University filed Critical Sun Yat Sen University
Priority to CN202011303629.5A priority Critical patent/CN112395997B/en
Publication of CN112395997A publication Critical patent/CN112395997A/en
Application granted granted Critical
Publication of CN112395997B publication Critical patent/CN112395997B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/52Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V20/53Recognition of crowd images, e.g. recognition of crowd congestion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Molecular Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a weak supervision training method of a pedestrian re-recognition model based on micrographic learning, which comprises the steps of firstly grouping pedestrian pictures into bags according to a shooting time period and distributing bag type labels; then, capturing the dependency relationship among all pictures in each bag to generate a reliable pseudo-pedestrian category label for each picture in the bag of the category, wherein the label is used as supervision information for training a pedestrian re-identification model; then, carrying out integral training of the pedestrian re-identification model and the graph model; and (3) taking the linear combination of the graph model loss and the re-identification loss as a total loss function, and updating parameters of all layers of the network by using a back propagation algorithm. The method can achieve advanced model performance without heavy manual labeling cost and almost without increasing the computational complexity.

Description

Weak supervision training method of pedestrian re-recognition model based on micrographic learning
Technical Field
The invention relates to the technical field of machine vision, in particular to a weak supervision training method of a pedestrian re-recognition model based on micrographic learning.
Background
At present, the pedestrian re-identification problem mainly has three realization methods: (1) extracting distinguishing features; (2) learning a stable metric or subspace for matching; (3) combining the two methods. However, most implementations require strong supervised training labels, i.e. manual labeling of each picture of the dataset. In addition, a pedestrian re-identification method based on unsupervised learning without manual labeling uses a local saliency matching or clustering model, but the remarkable difference of cross-camera views is difficult to model, so that high precision is difficult to achieve. In contrast, the weak supervision pedestrian re-identification method provided by the invention is an excellent training method, and can achieve higher precision without expensive manual labeling cost.
Weak supervision learning: although training deep neural networks with weakly supervised approaches is a challenging problem, it has been little studied to solve certain tasks such as picture classification, semantic segmentation and object detection. Similar to these studies, the present invention is also based on the generation of pseudo-labels, but the task of weakly supervised pedestrian re-identification has two features: (1) a representative image of each individual pedestrian cannot be found because people change clothes in a short time, and therefore the label is not clear; (2) the entropy is larger than that of other tasks, for example, pixels of an image in a weak supervision semantic segmentation task have certain stability, and pedestrians in a pedestrian re-identification task are more disordered and irregular. The two characteristics improve the difficulty of re-identification of the weak supervision pedestrian.
Uncertain label learning: where single sample pedestrian re-identification is most relevant to the present invention, but two differences exist: (1) each pedestrian category of single-sample pedestrian re-identification needs at least one picture instance, and the data set of the method does not need an accurate pedestrian category label; (2) the method introduces the bag type label as the estimation for limiting and guiding the pseudo-pedestrian type label, and ensures that the pseudo-label is more reliable to generate than the single-sample pedestrian re-identification.
Pedestrian search: the process of pedestrian detection and pedestrian re-identification is combined. The present invention has two main differences with it: (1) the present invention is only concerned with visual feature matching because the capabilities of current people detectors are adequate; (2) the method benefits from low-cost weak labels, and each training picture searched by the pedestrian still needs strong labels.
Application No. 201710487019.7 discloses an image quality scoring method using a depth-generating machine learning model that can use depth machine learning to create a generative model expecting good quality images for image quality scoring of images from a medical scanner. The deviation of the input image from the generative model is used as an input feature vector for the discriminant model; the discriminant model may also operate on additional input feature vectors derived from the input image. Based on these. However, the patent cannot directly express graph learning as a loss function which can be slightly applied to network parameters, so that the graph learning can be optimized by a random gradient descent method, and the integrated training of a graph model and a pedestrian re-recognition model is realized.
Disclosure of Invention
The invention provides a weak supervision training method of a pedestrian re-recognition model based on micrographic learning, which is characterized in that a module for automatically generating a training label is added to a pedestrian re-recognition deep neural network and is trained with the pedestrian re-recognition deep neural network integrally, so that the algorithm complexity is reduced.
In order to achieve the technical effects, the technical scheme of the invention is as follows:
a pedestrian re-recognition model weak supervision training method based on micrographic learning comprises the following steps:
s1: grouping the pedestrian pictures into bags according to the shooting time period and distributing bag type labels;
s2: capturing the dependency relationship among all pictures in each bag to generate a reliable pseudo-pedestrian category label for each picture in the bag of the category, wherein the label is used as supervision information for training a pedestrian re-identification model;
s3: carrying out integrated training on a pedestrian re-identification model and a graph model;
s4: and (3) taking the linear combination of the graph model loss and the re-identification loss as a total loss function, and updating parameters of all layers of the network by using a back propagation algorithm.
Further, the specific process of step S1 is:
denote by b a bag containing p pictures, i.e. b ═ x1,x2,…,xj,…,xp,y=y1,y2,…,yj,…,ypThe pedestrian category label is denoted by l.
Further, the process of step S2 is:
if only the bag type label l is available for the weak supervision pedestrian re-identification, a false pedestrian type label needs to be estimated for each picture, and is represented by a probability vector Y; assuming that n pedestrian categories are contained in the bags under the category label, the whole training set has m pedestrian categories, and the bag category label is used for limiting Y, each picture xjThe probability vector of the pedestrian category label is:
Figure BDA0002787516160000021
further, the process of step S3 is:
defining a directed graph, each node representing a picture x in a bagiEach edge represents the relationship between pictures, and the energy function of assigning a pedestrian category label y to a node x on the graph is as follows:
Figure BDA0002787516160000031
where U and V represent nodes and edges, respectively, phi (y)i|xi) Is calculated as picture xiAssigning label yiA unitary term of cost of Ψ (y)i,yj|xi;xj) Is calculated as a picture pair (x)i,xj) Assigning pairs of penalties for labels, and eliminating false labels generated by weak supervised learning by formula (2);
the univariate term in equation (2) is defined as:
Φ(yi|xi)=-log(Yi[yi]) Wherein
Figure BDA0002787516160000032
Wherein P isiIs the neural network as picture xiCalculated probability of pedestrian class label, YiIs the bag limit expressed by equation (1),
Figure BDA0002787516160000033
representing element-by-element products [ ·]Representing a vector index;
since the output of the unary terms of different pictures are independent of each other, the unary terms are unstable and need to be smoothed by the pairwise terms:
Figure BDA0002787516160000034
calculating the appearance similarity by using a Gaussian kernel based on RGB colors, controlling the size of the Gaussian kernel by using a super-parameter sigma, and limiting pictures with similar appearances to have the same label; tag compatibility ζ (y)i,yj) Expressed by the glass model:
Figure BDA0002787516160000035
further, the bag category label contains additional information to improve the generation of the pseudo label: correcting the estimated pseudo label as the pedestrian classification with the highest prediction score in the bag; causing a portion of the picture to be assigned to a pedestrian category that is not predicted; the false pedestrian category label for each picture can be obtained by minimizing equation (2):
Figure BDA0002787516160000036
where {1,2,3, …, m } represents all pedestrian classes in the training set.
Further, in step S3, before performing the integrated training of the pedestrian re-identification model and the graph model, the graph model needs to be miniaturized, and the specific process is as follows:
obtaining a pseudo-pedestrian category label by using an external graph model for supervising training of a pedestrian to re-identify a deep neural network, wherein the calculation of obtaining the pseudo label by minimizing the formula (2) is not trivial, so that the graph model is not compatible with the deep neural network, and therefore the relaxation formula (2) is required to be:
Figure BDA0002787516160000041
the discrete Φ and Ψ are serialized:
Figure BDA0002787516160000042
Figure BDA0002787516160000043
the difference between the formula (8) and the formula (3) is that in the non-micrographic model, all possible y needs to be input into the energy function, and the y with the lowest energy is taken as the optimal solution; in a micro-map model, directly inputting a picture x into a deep neural network to obtain the prediction of y; the difference between equation (9) and equation (4) is that the cross-entropy term- (Y) is usediPi)T log(YjPj) Approximation of the irreducible term ζ (y) in equation (4)i,yj)YiYj
Further, in the step S4, the graph model loss LDrawing (A)And classification/re-identification loss LClassification,LClassificationIs a pseudo tag
Figure BDA0002787516160000048
As a supervised normalized exponential cross entropy loss function:
Figure BDA0002787516160000044
wherein
Figure BDA0002787516160000045
Show that
Figure BDA0002787516160000046
Conversion into a function of the unique heat vector, n representing the number of pictures in a bag, PiThe probability representing the pedestrian class calculated by the neural network is a normalized exponential function of the logarithm of the network output z:
Figure BDA0002787516160000047
where m represents the number of pedestrian classes of the training set, the total loss function L is a linear combination of the two loss functions:
L=wclassificationLClassification+wDrawing (A)LDrawing (A) (12)
Wherein wClassificationAnd wDrawing (A)The weights representing the two losses, respectively, are set to 1 and 0.5, respectively.
Compared with the prior art, the technical scheme of the invention has the beneficial effects that:
the invention combines the micrographic learning method and the weak supervised learning method, adds a module for automatically generating the training label for the deep neural network for pedestrian re-identification and trains the module and the deep neural network integrally.
Drawings
FIG. 1 is a graph model of a bag of pictures generating pseudo-pedestrian category labels;
FIG. 2 is a training flow diagram of the present invention.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the patent;
for the purpose of better illustrating the embodiments, certain features of the drawings may be omitted, enlarged or reduced, and do not represent the size of an actual product;
it will be understood by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted.
The technical solution of the present invention is further described below with reference to the accompanying drawings and examples.
A pedestrian re-recognition model weak supervision training method based on micrographic learning comprises the following steps:
1. ranging from supervised pedestrian re-identification to weakly supervised pedestrian re-identification
Denote by b a bag containing p pictures, i.e. b ═ x1,x2,…,xj,…,xp,y=y1,y2,…,yj,…,ypThe pedestrian category label is denoted by l. The supervised pedestrian re-identification needs classification prediction of a pedestrian class label y supervision model; the weak supervision pedestrian re-identification only uses a bag type label l, a false pedestrian type label needs to be estimated for each picture, and the false pedestrian type label is represented by a probability vector Y. Assuming that l contains n pedestrian categories, the whole training set has m pedestrian categories, and Y is limited by bag category labels, then each picture xjThe probability vector of the pedestrian category label is:
Figure BDA0002787516160000051
2. weak supervision pedestrian re-identification based on micrographic learning
Graph model pedestrian re-identification
As shown in FIG. 1, a directed graph is defined, each node representing a picture x in a bagiEach edge represents the relationship between pictures, and the energy function of assigning a pedestrian category label y to a node x on the graph is as follows:
Figure BDA0002787516160000052
where U and V represent nodes and edges, respectively, phi (y)i|xi) Is calculated as picture xiAssigning label yiA unitary term of cost of Ψ (y)i,yj|xi;xj) Is calculated as a picture pair (x)i,xj) A pair of penalties for the tag is assigned. Equation (2) eliminates false tags of errors generated by weakly supervised learning.
Unary item
The univariate term in equation (2) is defined as:
Φ(yi|xi)=-log(Yi[yi]) Wherein
Figure BDA0002787516160000061
Wherein P isiIs the neural network as picture xiCalculated probability of pedestrian class label, YiIs the bag limit expressed by equation (1),
Figure BDA0002787516160000062
representing element-by-element products [ ·]Representing the vector index.
Paired items
Since the output of the unary terms of different pictures are independent of each other, the unary terms are unstable and need to be smoothed by the pairwise terms:
Figure BDA0002787516160000063
calculating the appearance similarity by using a Gaussian kernel based on RGB colors, controlling the size of the Gaussian kernel by using a super-parameter sigma, and limiting pictures with similar appearances to have the same label; tag compatibility ζ (y)i,yj) Expressed by the glass model:
Figure BDA0002787516160000064
bag restraint
Indeed, the bag category label contains additional information to improve the generation of the pseudo label: correcting the estimated pseudo label as the pedestrian classification with the highest prediction score in the bag; causing portions of the picture to be assigned to pedestrian categories that are not predicted.
Inference of false pedestrian category labels
The false pedestrian category label for each picture can be obtained by minimizing equation (2):
Figure BDA0002787516160000065
where {1,2,3, …, m } represents all pedestrian classes in the training set.
The study of the drawings can be miniaturized
The weak supervision pedestrian re-identification method is not integrally trained, because firstly, an external graph model is needed to obtain a pseudo pedestrian category label for training the supervision pedestrian re-identification deep neural network. Minimizing the computation of equation (2) to obtain pseudo labels is not trivial, making the graph model incompatible with deep neural networks, and therefore requires relaxing equation (2) as:
Figure BDA0002787516160000071
the discrete Φ and Ψ are serialized:
Figure BDA0002787516160000072
Figure BDA0002787516160000073
the difference between the formula (8) and the formula (3) is that in the non-micrographic model, all possible y needs to be input into the energy function, and the y with the lowest energy is taken as the optimal solution; in the micrographic model, the picture x is directly input into the deep neural network to obtain the prediction of y. The difference between equation (9) and equation (4) is that the cross-entropy term- (Y) is usediPi)T log(YjPj) Approximation of the irreducible term ζ (y) in equation (4)i,yj)YiYj
3. Whole neural network structure
Fig. 2 is a network structure of training and reasoning, with dashed lines representing the training data flow and solid lines representing the reasoning data flow, wherein the graph model only participates in the training phase. The overall structure comprises three main modules:
feature extraction module
Referring to FIG. 2(a), ResNet-50 is used as backbone network, the last layer of original ResNet-50 is removed, and the network is changed into a full connection layer with 512-dimensional output, a batch normalization, a linear rectification function with leakage and a dropout.
Coarse pedestrian heavy identification module
As shown in fig. 2(b), a full-link layer with the same output dimension as the number of the pedestrian categories is added to the top of the feature extraction module, and then the normalized exponential cross entropy is used as a loss function. Pedestrian category prediction score
Figure BDA0002787516160000074
As a rough pedestrian re-identification estimate, the probability of the pedestrian category of the picture in pocket b is represented.
Refined pedestrian re-identification module
As shown in fig. 2(c), coarse pedestrian re-identification scores, appearances and bag limits are input into the graph model according to equations (8) and (9), and the pseudo labels generated by the graph model can be used to update network parameters like artificially labeled real labels.
4. Optimization
The gradient of the integral loss value corresponding to the deep neural network parameters can be calculated by obtaining the pseudo-pedestrian category label, and the gradient is transmitted back to all layers of the network by utilizing a back propagation algorithm, so that the integral training of all parameters of the weak supervision model is realized.
Loss function
The optimization objective of the method comprises graph model loss LDrawing (A)And classification/re-identification loss LClassification,LClassificationIs a pseudo tag
Figure BDA0002787516160000081
As a supervised normalized exponential cross entropy loss function:
Figure BDA0002787516160000082
wherein
Figure BDA0002787516160000083
Show that
Figure BDA0002787516160000084
Conversion into a function of the unique heat vector, n representing the number of pictures in a bag, PiThe probability representing the pedestrian class calculated by the neural network is a normalized exponential function of the logarithm of the network output z:
Figure BDA0002787516160000085
where m represents the number of pedestrian classes of the training set.
The total loss function L is a linear combination of these two loss functions:
L=wclassificationLClassification+wDrawing (A)LDrawing (A) (12)
Wherein wClassificationAnd wDrawing (A)Representing the weights of the two losses, respectively, the method is set to 1 and 0.5.
The same or similar reference numerals correspond to the same or similar parts;
the positional relationships depicted in the drawings are for illustrative purposes only and are not to be construed as limiting the present patent;
it should be understood that the above-described embodiments of the present invention are merely examples for clearly illustrating the present invention, and are not intended to limit the embodiments of the present invention. Other variations and modifications will be apparent to persons skilled in the art in light of the above description. And are neither required nor exhaustive of all embodiments. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the claims of the present invention.

Claims (10)

1. A pedestrian re-recognition model weak supervision training method based on micrographic learning is characterized by comprising the following steps of:
s1: grouping the pedestrian pictures into bags according to the shooting time period and distributing bag type labels;
s2: capturing the dependency relationship among all pictures in each bag to generate a reliable pseudo-pedestrian category label for each picture in the bag of the category, wherein the label is used as supervision information for training a pedestrian re-identification model;
s3: carrying out integrated training on a pedestrian re-identification model and a graph model;
s4: and (3) taking the linear combination of the graph model loss and the re-identification loss as a total loss function, and updating parameters of all layers of the network by using a back propagation algorithm.
2. The weakly supervised training method of pedestrian re-identification model based on micro-graph learning of claim 1, wherein the specific process of the step S1 is:
denote by b a bag containing p pictures, i.e. b ═ x1,x2,…,xj,…,xp,y=y1,y2,…,yj,…,ypThe pedestrian category label is denoted by l.
3. The method for weakly supervised training of a pedestrian re-recognition model based on micrographic learning according to claim 1, wherein the step S2 is performed by:
if only the bag type label l is available for the weak supervision pedestrian re-identification, a false pedestrian type label needs to be estimated for each picture, and is represented by a probability vector Y; assuming that n pedestrian categories are contained in the bags under the category label, the whole training set has m pedestrian categories, and the bag category label is used for limiting Y, each picture xjThe probability vector of the pedestrian category label is:
Figure FDA0002787516150000011
4. the weakly supervised training method of pedestrian re-identification model based on micro-graph learning of claim 3, wherein the process of step S3 is:
defining a directed graph, each node representing a picture x in a bagiEach edge represents the relationship between pictures, and the energy function of assigning a pedestrian category label y to a node x on the graph is as follows:
Figure FDA0002787516150000012
where U and V represent nodes and edges, respectively, phi (y)i|xi) Is calculated as picture xiAssigning label yiA unitary term of cost of Ψ (y)i,yj|xi;xj) Is calculated as a picture pair (x)i,xj) Assigning pairs of penalties for labels, and eliminating false labels generated by weak supervised learning by formula (2);
the univariate term in equation (2) is defined as:
Figure FDA0002787516150000021
wherein P isiIs the neural network as picture xiCalculated probability of pedestrian class label, YiIs a bag limit indicated by formula (1), which indicates a element-by-element product, [ ·]Representing a vector index;
since the output of the unary terms of different pictures are independent of each other, the unary terms are unstable and need to be smoothed by the pairwise terms:
Figure FDA0002787516150000022
calculating the appearance similarity by using a Gaussian kernel based on RGB colors, controlling the size of the Gaussian kernel by using a super-parameter sigma, and limiting pictures with similar appearances to have the same label; tag compatibility ζ (y)i,yj) Expressed by the glass model:
Figure FDA0002787516150000023
5. the weakly supervised training method of pedestrian re-recognition model based on micrographic learning of claim 4, wherein the bag class label contains additional information to improve the generation of pseudo labels: correcting the estimated pseudo label as the pedestrian classification with the highest prediction score in the bag; causing portions of the picture to be assigned to pedestrian categories that are not predicted.
6. The weakly supervised training method of pedestrian re-identification model based on micrographic learning of claim 5, wherein the false pedestrian category label of each picture can be obtained by minimizing formula (2):
Figure FDA0002787516150000024
where {1,2,3, …, m } represents all pedestrian classes in the training set.
7. The weakly supervised training method of pedestrian re-recognition model based on micro-graph learning as claimed in claim 6, wherein in step S3, before performing the integrated training of pedestrian re-recognition model and graph model, the graph model needs to be micro-modeled, and the specific process is as follows:
obtaining a pseudo-pedestrian category label by using an external graph model for supervising training of a pedestrian to re-identify a deep neural network, wherein the calculation of obtaining the pseudo label by minimizing the formula (2) is not trivial, so that the graph model is not compatible with the deep neural network, and therefore the relaxation formula (2) is required to be:
Figure FDA0002787516150000031
the discrete Φ and Ψ are serialized:
Figure FDA0002787516150000032
Figure FDA0002787516150000033
the difference between the formula (8) and the formula (3) is that in the non-micrographic model, all possible y needs to be input into the energy function, and the y with the lowest energy is taken as the optimal solution; in a micro-map model, directly inputting a picture x into a deep neural network to obtain the prediction of y; the difference between equation (9) and equation (4) is that the cross-entropy term- (Y) is usediPi)Tlog(YjPj) Approximation of the irreducible term ζ (y) in equation (4)i,yj)YiYj
8. The weakly supervised training method of pedestrian re-identification model based on micro-graph learning of claim 7, wherein in the step S4, the graph model is lost by LDrawing (A)And classification/re-identification loss LClassification,LClassificationIs a pseudo tag
Figure FDA0002787516150000038
As a supervised normalized exponential cross entropy loss function:
Figure FDA0002787516150000034
wherein
Figure FDA0002787516150000035
Show that
Figure FDA0002787516150000036
Conversion into a function of the unique heat vector, n representing the number of pictures in a bag, PiThe probability representing the pedestrian class calculated by the neural network is a normalized exponential function of the logarithm of the network output z:
Figure FDA0002787516150000037
where m represents the number of pedestrian classes of the training set, the total loss function L is a linear combination of the two loss functions:
L=wclassificationLClassification+wDrawing (A)LDrawing (A) (12)
Wherein wClassificationAnd wDrawing (A)Representing the two lost weights respectively.
9. The method of claim 8, wherein w is wClassificationIs set to 1.
10. The method of claim 8, wherein w is wDrawing (A)Set to 0.5.
CN202011303629.5A 2020-11-19 2020-11-19 Weak supervision training method based on pedestrian re-recognition model capable of micro-graph learning Active CN112395997B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011303629.5A CN112395997B (en) 2020-11-19 2020-11-19 Weak supervision training method based on pedestrian re-recognition model capable of micro-graph learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011303629.5A CN112395997B (en) 2020-11-19 2020-11-19 Weak supervision training method based on pedestrian re-recognition model capable of micro-graph learning

Publications (2)

Publication Number Publication Date
CN112395997A true CN112395997A (en) 2021-02-23
CN112395997B CN112395997B (en) 2023-11-24

Family

ID=74605913

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011303629.5A Active CN112395997B (en) 2020-11-19 2020-11-19 Weak supervision training method based on pedestrian re-recognition model capable of micro-graph learning

Country Status (1)

Country Link
CN (1) CN112395997B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112949630A (en) * 2021-03-01 2021-06-11 北京交通大学 Weak supervision target detection method based on frame classification screening
CN113128410A (en) * 2021-04-21 2021-07-16 湖南大学 Weak supervision pedestrian re-identification method based on track association learning
CN113688781A (en) * 2021-09-08 2021-11-23 北京邮电大学 Pedestrian re-identification anti-attack method with blocking elasticity
CN114913472A (en) * 2022-02-23 2022-08-16 北京航空航天大学 Infrared video pedestrian significance detection method combining graph learning and probability propagation

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019136946A1 (en) * 2018-01-15 2019-07-18 中山大学 Deep learning-based weakly supervised salient object detection method and system
CN110942025A (en) * 2019-11-26 2020-03-31 河海大学 Unsupervised cross-domain pedestrian re-identification method based on clustering
CN111445488A (en) * 2020-04-22 2020-07-24 南京大学 Method for automatically identifying and segmenting salt body through weak supervised learning
CN111723645A (en) * 2020-04-24 2020-09-29 浙江大学 Multi-camera high-precision pedestrian re-identification method for in-phase built-in supervised scene
CN111860678A (en) * 2020-07-29 2020-10-30 中国矿业大学 Unsupervised cross-domain pedestrian re-identification method based on clustering

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019136946A1 (en) * 2018-01-15 2019-07-18 中山大学 Deep learning-based weakly supervised salient object detection method and system
CN110942025A (en) * 2019-11-26 2020-03-31 河海大学 Unsupervised cross-domain pedestrian re-identification method based on clustering
CN111445488A (en) * 2020-04-22 2020-07-24 南京大学 Method for automatically identifying and segmenting salt body through weak supervised learning
CN111723645A (en) * 2020-04-24 2020-09-29 浙江大学 Multi-camera high-precision pedestrian re-identification method for in-phase built-in supervised scene
CN111860678A (en) * 2020-07-29 2020-10-30 中国矿业大学 Unsupervised cross-domain pedestrian re-identification method based on clustering

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
郑宝玉;王雨;吴锦雯;周全;: "基于深度卷积神经网络的弱监督图像语义分割", 南京邮电大学学报(自然科学版), no. 05, pages 5 - 16 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112949630A (en) * 2021-03-01 2021-06-11 北京交通大学 Weak supervision target detection method based on frame classification screening
CN112949630B (en) * 2021-03-01 2024-03-19 北京交通大学 Weak supervision target detection method based on frame hierarchical screening
CN113128410A (en) * 2021-04-21 2021-07-16 湖南大学 Weak supervision pedestrian re-identification method based on track association learning
CN113688781A (en) * 2021-09-08 2021-11-23 北京邮电大学 Pedestrian re-identification anti-attack method with blocking elasticity
CN113688781B (en) * 2021-09-08 2023-09-15 北京邮电大学 Pedestrian re-identification anti-attack method capable of shielding elasticity
CN114913472A (en) * 2022-02-23 2022-08-16 北京航空航天大学 Infrared video pedestrian significance detection method combining graph learning and probability propagation
CN114913472B (en) * 2022-02-23 2024-06-25 北京航空航天大学 Infrared video pedestrian significance detection method combining graph learning and probability propagation

Also Published As

Publication number Publication date
CN112395997B (en) 2023-11-24

Similar Documents

Publication Publication Date Title
CN112395997A (en) Weak supervision training method of pedestrian re-recognition model based on micrographic learning
Aly et al. DeepArSLR: A novel signer-independent deep learning framework for isolated arabic sign language gestures recognition
Kong et al. Interactive phrases: Semantic descriptionsfor human interaction recognition
CN112182166B (en) Text matching method and device, electronic equipment and storage medium
CN107203753B (en) Action recognition method based on fuzzy neural network and graph model reasoning
Mao et al. Hierarchical Bayesian theme models for multipose facial expression recognition
Felzenszwalb et al. Object detection grammars.
CN110309306A (en) A kind of Document Modeling classification method based on WSD level memory network
Wang et al. Ppp: Joint pointwise and pairwise image label prediction
Ding et al. Inferring social relations from visual concepts
Singla et al. Discovery of social relationships in consumer photo collections using markov logic
CN113761887A (en) Matching method and device based on text processing, computer equipment and storage medium
Katkade et al. Advances in real-time object detection and information retrieval: A review
Joodi et al. Increasing validation accuracy of a face mask detection by new deep learning model-based classification
CN113627237A (en) Late-stage fusion face image clustering method and system based on local maximum alignment
Chergui et al. Investigating deep cnns models applied in kinship verification through facial images
Srininvas et al. A framework to recognize the sign language system for deaf and dumb using mining techniques
Yun et al. Head pose classification by multi-class AdaBoost with fusion of RGB and depth images
de Souza et al. Building semantic understanding beyond deep learning from sound and vision
Yadaiah et al. A Fuzzy logic based soft computing approach in CBIR system using incremental filtering feature selection to identify patterns
Dohnálek et al. Application and comparison of modified classifiers for human activity recognition
Jain et al. Unsupervised temporal segmentation of human action using community detection
Lagunes-Fortiz et al. Centroids Triplet Network and Temporally-Consistent Embeddings for In-Situ Object Recognition
Balfaqih et al. An Intelligent Movies Recommendation System Based Facial Attributes Using Machine Learning
Pereira et al. Real-Time Multi-Stage Deep Learning Pipeline for Facial Recognition by Service Robots

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information
CB03 Change of inventor or designer information

Inventor after: Nie Lin

Inventor after: Zhang Jiqi

Inventor after: Lin Jing

Inventor after: Wang Guangrun

Inventor after: Wang Guangcong

Inventor before: Zhang Jiqi

Inventor before: Lin Jing

Inventor before: Nie Lin

Inventor before: Wang Guangrun

Inventor before: Wang Guangcong

GR01 Patent grant
GR01 Patent grant