CN114648777A - Pedestrian re-identification method, pedestrian re-identification training method and device - Google Patents

Info

Publication number
CN114648777A
CN114648777A
Authority
CN
China
Prior art keywords
pedestrian
image
feature
network
feature map
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011499944.XA
Other languages
Chinese (zh)
Inventor
陈亮雨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Intellifusion Technologies Co Ltd
Original Assignee
Shenzhen Intellifusion Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Intellifusion Technologies Co Ltd filed Critical Shenzhen Intellifusion Technologies Co Ltd
Priority to CN202011499944.XA priority Critical patent/CN114648777A/en
Publication of CN114648777A publication Critical patent/CN114648777A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Library & Information Science (AREA)
  • Databases & Information Systems (AREA)
  • Image Analysis (AREA)

Abstract

The embodiment of the invention provides a pedestrian re-identification method, a pedestrian re-identification training method and a device, wherein the training method comprises the following steps: acquiring a first image set and a first classification label corresponding to the first image set; intercepting image segments at the positions of the persons included in the first image set to obtain an image segment set; inputting a first image segment into a pedestrian re-recognition network to be trained to obtain a first feature map, a first feature and a second classification label; inputting the first image segment into a first pedestrian re-recognition network to obtain a second feature map and a second feature, wherein the complexity of the first pedestrian re-recognition network is greater than that of the pedestrian re-recognition network to be trained; calculating a total loss according to the first classification label, the second classification label, the first feature map, the second feature map, the first feature and the second feature; and optimizing the parameters of the pedestrian re-recognition network to be trained according to the total loss to obtain a second pedestrian re-recognition network. The embodiment of the invention can improve the recognition efficiency of the trained pedestrian re-identification network while ensuring its accuracy.

Description

Pedestrian re-identification method, pedestrian re-identification training method and device
Technical Field
The invention relates to the technical field of image recognition, in particular to a pedestrian re-recognition method, a pedestrian re-recognition training method and a pedestrian re-recognition training device.
Background
With the continuous development of monitoring technology, pedestrian re-identification is applied more and more widely. Pedestrian re-identification, also known as person re-identification, is a technique that uses computer vision to determine whether a specific pedestrian is present in an image or video sequence. How to accurately perform pedestrian re-identification on an image or video has therefore become a technical problem to be solved urgently. At present, the accuracy of pedestrian re-identification can be improved by increasing the complexity of the pedestrian re-identification network. However, increasing the complexity of the network reduces its recognition efficiency.
Disclosure of Invention
The embodiment of the invention provides a pedestrian re-identification method, a pedestrian re-identification training method and a device, which can improve the recognition efficiency of a trained pedestrian re-identification network without affecting its recognition accuracy.
A first aspect provides a pedestrian re-identification training method, including:
obtaining training data, wherein the training data comprises a first image set and a first classification label of a person included in each image in the first image set;
intercepting image segments of positions of people included in each image in the first image set to obtain an image segment set;
inputting a first image segment into a pedestrian re-recognition network to be trained to obtain a first feature map, a first feature and a second classification label, wherein the first image segment is any one image segment in the image segment set;
inputting the first image segment into the first pedestrian re-recognition network to obtain a second feature map and a second feature, wherein the first pedestrian re-recognition network is a trained pedestrian re-recognition network, and the complexity of the first pedestrian re-recognition network is greater than that of the pedestrian re-recognition network to be trained;
calculating a total loss from the first classification label, the second classification label, the first feature map, the second feature map, the first feature, and the second feature;
and optimizing the parameters of the pedestrian re-identification network to be trained according to the total loss to obtain the second pedestrian re-identification network.
In the embodiment of the invention, the first pedestrian re-recognition network, which has higher accuracy, is used as the teacher network during training, and the pedestrian re-recognition network to be trained, which has lower complexity, is used as the student network obtained by distillation training. During training, the parameters of the teacher network are kept unchanged and only the parameters of the student network are optimized; after training converges, the trained student network is kept as the second pedestrian re-recognition network. A student network can thus be obtained whose accuracy is almost the same as that of the teacher network but whose speed is higher, so the recognition efficiency of the pedestrian re-recognition network can be improved without affecting its recognition accuracy. In addition, when the loss is calculated, not only the real classification labels are used, but also the classification labels, feature map and features output by the pedestrian re-recognition network to be trained and the feature map and features output by the first pedestrian re-recognition network, so the information used is rich. Although the complexity of the second pedestrian re-recognition network is lower than that of the first pedestrian re-recognition network, its precision can reach that of the first pedestrian re-recognition network, so the recognition efficiency of the trained pedestrian re-recognition network can be further improved without affecting its recognition accuracy.
As a possible implementation manner, the to-be-trained pedestrian re-recognition network includes a feature map module, a feature module, and a classification module, and the inputting the first image segment into the to-be-trained pedestrian re-recognition network to obtain the first feature map, the first feature, and the second classification label includes:
extracting a feature map of the first image segment through the feature map module to obtain a first feature map;
extracting the features of the first feature map through the feature module to obtain first features;
and classifying the first characteristics through the classification module to obtain a second classification label.
In the embodiment of the invention, the distillation training process considers not only feature map distillation but also feature distillation. Since more information is distilled, the training accuracy can be further improved, and the accuracy of the trained pedestrian re-recognition network can be further ensured.
As a possible implementation, the calculating a total loss according to the first classification label, the second classification label, the first feature map, the second feature map, the first feature, and the second feature includes:
calculating a first loss according to the first feature map and the second feature map;
calculating a second loss based on the first feature and the second feature;
calculating a third loss according to the first classification label and the second classification label;
calculating a total loss from the first loss, the second loss, and the third loss.
In the embodiment of the invention, the total loss is obtained from the feature map loss, the feature loss and the classification loss. Since more loss terms are considered, the factors that affect the accuracy of the network are covered more completely, so the accuracy of the trained pedestrian re-identification network can be ensured.
As a possible implementation manner, the network for re-identifying pedestrians to be trained further includes a conversion module, and the calculating a first loss according to the first feature map and the second feature map includes:
adjusting the dimension of the first feature map through the conversion module to obtain a third feature map, wherein the dimension of the third feature map is the same as that of the second feature map;
and calculating a first loss according to the second feature map and the third feature map.
In the embodiment of the invention, because the network structures of the pedestrian re-identification network to be trained and the first pedestrian re-identification network are different, the dimensions of the feature maps they output are also different, which would reduce the accuracy of the loss calculation. The conversion module ensures that the feature map dimensions are the same, so the accuracy of the loss can be improved and the accuracy of the trained pedestrian re-identification network can be ensured.
As a possible implementation, the calculating a first loss according to the second feature map and the third feature map includes:
calculating a first loss according to the second feature map, the third feature map and a first loss function;
the first loss function may be expressed as follows:
L_1 = \frac{1}{N} \sum_{i=1}^{N} (y_1 - y_2)^2

where L1 represents the first loss, N is the number of image segments included in the image segment set, y1 represents the second feature map, and y2 represents the third feature map.
In the embodiment of the invention, the loss function is a mean square error, which keeps the outputs of the teacher network and the student network as close as possible, so the accuracy of pedestrian re-identification can be further improved.
A second aspect provides a pedestrian re-identification method, including:
obtaining a query image comprising a first person and a second image set comprising at least one image;
inputting the query image into a second pedestrian re-recognition network to obtain a query feature, wherein the second pedestrian re-recognition network is obtained by training through the pedestrian re-recognition training method provided by the first aspect;
inputting the second image set into the second pedestrian re-identification network to obtain at least one characteristic;
determining a similarity of the query feature to each of the at least one feature;
determining images in the second set of images having a similarity greater than a threshold as images including the first person.
In the embodiment of the invention, the second pedestrian re-identification network has lower complexity, so its network structure is simpler and the efficiency of pedestrian re-identification can be improved. In addition, although its network structure is simpler, the second pedestrian re-identification network is obtained by training with the first pedestrian re-identification network of higher complexity, so its detection precision is not affected. The efficiency of pedestrian re-identification can therefore be improved while its accuracy is ensured.
As a possible implementation, the inputting the query image into the second pedestrian re-identification network to obtain the query feature includes:
extracting a feature map of the query image through a feature map module in a second pedestrian re-identification network to obtain a query feature map;
and extracting the characteristics of the query characteristic graph through a characteristic module in the second pedestrian re-identification network to obtain query characteristics.
A third aspect provides a pedestrian re-recognition training device, which includes a unit for executing the pedestrian re-recognition training method provided by the first aspect or any one of the embodiments of the first aspect.
A fourth aspect provides a pedestrian re-identification device comprising means for performing the pedestrian re-identification method provided by the second aspect or any one of the embodiments of the second aspect.
A fifth aspect provides a pedestrian re-recognition training device, which includes a processor and a memory, where the processor and the memory are connected to each other, where the memory is used to store a computer program, and the computer program includes program instructions, and the processor is used to call the program instructions to execute the pedestrian re-recognition training method provided by the first aspect or any implementation manner of the first aspect.
A sixth aspect provides a pedestrian re-identification apparatus, comprising a processor and a memory, the processor and the memory being connected to each other, wherein the memory is used for storing a computer program, the computer program comprises program instructions, and the processor is used for calling the program instructions to execute the pedestrian re-identification method provided by the second aspect or any one of the embodiments of the second aspect.
A seventh aspect provides a computer readable storage medium storing a computer program comprising program instructions which, when executed by a processor, cause the processor to perform the pedestrian re-identification training method provided by the first aspect or any one of the embodiments of the first aspect, or the pedestrian re-identification method provided by the second aspect or any one of the embodiments of the second aspect.
An eighth aspect provides an application program for executing, when running, the pedestrian re-recognition training method provided by the first aspect or any one of the embodiments of the first aspect, or the pedestrian re-identification method provided by the second aspect or any one of the embodiments of the second aspect.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to these drawings without creative efforts.
Fig. 1 is a schematic flow chart of a pedestrian re-identification training method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of an image segment being cut according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a training session provided by an embodiment of the present invention;
FIG. 4 is a schematic illustration of another training provided by embodiments of the present invention;
fig. 5 is a schematic flow chart of a pedestrian re-identification method according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a pedestrian re-identification training device according to an embodiment of the present invention;
fig. 7 is a schematic structural diagram of a pedestrian re-identification apparatus according to an embodiment of the present invention;
fig. 8 is a schematic structural diagram of an apparatus according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, belong to the protection scope of the present invention.
The embodiment of the invention provides a pedestrian re-identification method, a pedestrian re-identification training method and a device, which can improve the recognition efficiency of a trained pedestrian re-identification network without affecting its recognition accuracy. The details are described below.
Referring to fig. 1, fig. 1 is a schematic flow chart of a pedestrian re-identification training method according to an embodiment of the present invention. According to different requirements, some steps in the flowchart shown in fig. 1 can be divided into several steps. As shown in fig. 1, the pedestrian re-recognition training method may include the following steps.
101. Training data is acquired.
When the pedestrian re-recognition network to be trained needs to be trained, training data for training can be acquired. The training data may include a first image set and a first classification label of the person included in each image in the first image set. Each image in the first image set may include only one person or a plurality of persons. The pedestrian re-recognition network to be trained covers a plurality of pedestrians, that is, a plurality of categories. The first classification label indicates, for each of the plurality of categories, the probability that the person included in an image of the first image set belongs to that category. When an image in the first image set includes a person belonging to one of the plurality of categories, the probability of that category is 1 and the probability of every other category is 0. The first classification label therefore includes as many values as there are categories in the pedestrian re-recognition network to be trained, one value per category.
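For illustration only (the class count and identity index below are hypothetical, not values from the patent), such a first classification label can be represented as a one-hot vector:

```python
import torch

# Hypothetical example: 751 pedestrian identities (categories) in the training set,
# and the person in one image belongs to identity 42.
num_classes = 751
person_id = 42

# One value per category: probability 1 for the person's category, 0 for the others.
first_classification_label = torch.zeros(num_classes)
first_classification_label[person_id] = 1.0
```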
102. And intercepting image segments of the positions of the persons included in each image in the first image set to obtain an image segment set.
After the training data is obtained, image segments of positions of people included in each image in the first image set can be captured to obtain an image segment set. The position of the person included in each image in the first image set can be identified, and then the image segment of the position of the person included in each image in the first image set can be intercepted according to the position to obtain the image segment set. A human recognition network may be used to identify where each image in the first set of images includes a person. The position of the person is the position of the human body in the image.
Each image segment in the set of image segments includes a person. When each image in the first image set includes a person, the first image set includes a number of images equal to a number of image segments included in the set of image segments. When there are images including a plurality of persons in the first image set, the first image set includes a smaller number of images than the number of image segments included in the image segment set.
Referring to fig. 2, fig. 2 is a schematic diagram of capturing an image segment according to an embodiment of the present invention. As shown in fig. 2, when a person occupies the entire image, the size of the image segment after the image is cut may be the same as that of the image before the image is cut, that is, the image may be directly determined as the image segment without cutting the image. When a person occupies only a part of the image, the size of the image segment after the image is cut out is smaller than that of the image before the image is cut out, that is, the image needs to be cut out, and the part of the image including the person can be determined as the image segment.
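As a minimal sketch of this cropping step (the file name and box coordinates are assumed for illustration, and the bounding boxes are taken to come from a separate human recognition network):

```python
from PIL import Image

def crop_image_segments(image_path, person_boxes):
    """Crop one image segment per person in an image.

    person_boxes: list of (left, top, right, bottom) pixel coordinates, assumed
    to be produced by a human recognition network.
    """
    image = Image.open(image_path)
    segments = []
    for box in person_boxes:
        # If a person occupies the whole image, the crop equals the original image.
        segments.append(image.crop(box))
    return segments

# Hypothetical usage: one person filling a 640x480 frame, one in a sub-region.
# segments = crop_image_segments("frame_001.jpg", [(0, 0, 640, 480), (100, 50, 220, 400)])
```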
103. And training the pedestrian re-recognition network to be trained by using the image fragment set and the first pedestrian re-recognition network to obtain a second pedestrian re-recognition network.
After image segments at the positions of the persons included in each image in the first image set are intercepted to obtain the image segment set, the image segment set and the first pedestrian re-recognition network can be used to train the pedestrian re-recognition network to be trained to obtain the second pedestrian re-recognition network. Specifically, a first image segment can be input into the pedestrian re-recognition network to be trained to obtain a first feature map, a first feature and a second classification label, and the first image segment can be input into the first pedestrian re-recognition network to obtain a second feature map and a second feature. A total loss is then calculated according to the first classification label, the second classification label, the first feature map, the second feature map, the first feature and the second feature, and the parameters of the pedestrian re-recognition network to be trained are optimized according to the total loss to obtain the second pedestrian re-recognition network. The first image segment is any image segment in the image segment set, the first pedestrian re-recognition network is a trained pedestrian re-recognition network, and the complexity of the first pedestrian re-recognition network is greater than that of the pedestrian re-recognition network to be trained. After all the image segments in the image segment set have been input into the first pedestrian re-recognition network and the pedestrian re-recognition network to be trained, the total loss can be calculated once and the parameters of the pedestrian re-recognition network to be trained can be optimized once; training stops when the total loss no longer decreases or the accuracy on a verification set no longer increases.
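The procedure above can be sketched roughly as follows; `student`, `teacher`, `segment_loader` and `total_loss_fn` are hypothetical placeholders standing for the pedestrian re-recognition network to be trained, the first pedestrian re-recognition network, the image segment set and the total loss described later:

```python
import torch

def train_student(student, teacher, segment_loader, total_loss_fn,
                  max_epochs=100, lr=1e-3):
    """Distillation training sketch: the teacher is frozen, only the student is optimized."""
    teacher.eval()
    for p in teacher.parameters():
        p.requires_grad_(False)          # the teacher keeps its trained parameters

    optimizer = torch.optim.SGD(student.parameters(), lr=lr, momentum=0.9)
    best_loss = float("inf")

    for epoch in range(max_epochs):
        optimizer.zero_grad()
        epoch_loss = 0.0
        for segments, labels in segment_loader:
            s_fmap, s_feat, s_logits, s_fmap_converted = student(segments)
            with torch.no_grad():
                t_fmap, t_feat = teacher(segments)
            loss = total_loss_fn(labels, s_logits, s_fmap_converted, t_fmap, s_feat, t_feat)
            loss.backward()              # accumulate gradients over all image segments
            epoch_loss += loss.item()
        optimizer.step()                 # one parameter update per full pass, as described above

        if epoch_loss >= best_loss:      # stop when the total loss no longer decreases
            break
        best_loss = epoch_loss

    return student                       # the trained student is the second pedestrian re-recognition network
```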
Referring to fig. 3, fig. 3 is a schematic diagram of a training session according to an embodiment of the invention. As shown in fig. 3, the to-be-trained pedestrian re-recognition network may include a feature map module, a feature module and a classification module, and the first pedestrian re-recognition network may include a feature map module and a feature module. The feature map of the first image segment can be extracted through a feature map module in the to-be-trained pedestrian re-recognition network to obtain a first feature map, the feature of the first feature map can be extracted through the feature module in the to-be-trained pedestrian re-recognition network to obtain a first feature, and the first feature can be classified through a classification module in the to-be-trained pedestrian re-recognition network to obtain a second classification label. The feature map of the first image segment can be extracted through a feature map module in the first pedestrian re-recognition network to obtain a second feature map, and the feature of the second feature map can be extracted through a feature module in the first pedestrian re-recognition network to obtain a second feature.
A first loss may be calculated from the first feature map and the second feature map, a second loss may be calculated from the first feature and the second feature, a third loss may be calculated from the first classification label and the second classification label, and a total loss may be calculated from the first loss, the second loss and the third loss.
As shown in fig. 3, the pedestrian re-recognition network to be trained may further include a conversion module. The dimensionality of the first feature map may be adjusted by the conversion module in the pedestrian re-recognition network to be trained to obtain a third feature map, and the first loss may be calculated according to the second feature map and the third feature map. The third feature map has the same dimensions as the second feature map. The first loss may be calculated from the second feature map, the third feature map and the first loss function. The first loss function may be expressed as follows:

L_1 = \frac{1}{N} \sum_{i=1}^{N} (y_1 - y_2)^2

where L1 represents the first loss, N is the number of image segments included in the image segment set, y1 represents the second feature map, and y2 represents the third feature map.

A second loss may be calculated based on the first feature and the second feature. The second loss may be calculated based on the first feature, the second feature and a second loss function. The second loss function may be expressed as follows:

L_2 = \frac{1}{N} \sum_{i=1}^{N} (x_1 - x_2)^2

where L2 represents the second loss, x1 represents the second feature, and x2 represents the first feature.
The third loss L3 may be calculated from the first classification label and the second classification label. The third loss may be calculated from the first classification label, the second classification label and a third loss function. The third loss function may be a classification loss function.

The total loss L can be expressed as follows:

L = C_1 L_1 + C_2 L_2 + C_3 L_3

The total loss may be the sum of the first loss, the second loss and the third loss, or a weighted sum of the first loss, the second loss and the third loss. C1, C2 and C3 are weighting coefficients.
The feature map module may include a backbone network (backbone), and the number of backbone networks may be one or more. In the case where the feature map module includes a plurality of backbone networks, the first loss may be the sum or a weighted sum of the losses corresponding to the feature maps output by each of the plurality of backbone networks. The feature module may also include a backbone network and a feature transformation network for transforming the feature map into a 2-dimensional feature. The feature transformation network may be a pooling layer, or another functional layer, unit or module that can transform the feature map into a feature. The classification module may include a fully connected layer and a softmax function. The conversion module may include a convolution layer and a normalization layer, and the convolution layer may raise the dimensionality of the first feature map so that the number of channels of the raised first feature map is the same as the number of channels of the second feature map.
Referring to fig. 4, fig. 4 is a schematic diagram of another training process according to an embodiment of the present invention. As shown in fig. 4, the feature map module includes two backbone networks, the feature module includes one backbone network and one pooling layer, the classification module may include a fully connected layer and a softmax function, the conversion module may include a 1 × 1 convolution layer and a normalization layer, and the first loss function and the second loss function are mean square errors.
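The module structure described above could be sketched as follows; the toy backbone layers and channel counts are placeholders rather than values given in the text, and the softmax of the classification module is folded into the classification loss:

```python
import torch
import torch.nn as nn

class StudentReIDNet(nn.Module):
    """Sketch of the pedestrian re-recognition network to be trained."""

    def __init__(self, num_classes, student_channels=256, teacher_channels=2048):
        super().__init__()
        # Feature map module: one or more backbone networks (a toy two-stage
        # convolutional backbone stands in for them here).
        self.feature_map_module = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, student_channels, kernel_size=3, stride=2, padding=1), nn.ReLU(),
        )
        # Feature module: a backbone plus a pooling layer that turns the
        # feature map into a 2-dimensional (batch x channels) feature.
        self.feature_module = nn.Sequential(
            nn.Conv2d(student_channels, student_channels, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        # Classification module: fully connected layer (the softmax is applied
        # inside the classification loss in this sketch).
        self.classification_module = nn.Linear(student_channels, num_classes)
        # Conversion module: 1x1 convolution plus normalization layer that raises
        # the channel count of the first feature map to match the teacher's.
        self.conversion_module = nn.Sequential(
            nn.Conv2d(student_channels, teacher_channels, kernel_size=1),
            nn.BatchNorm2d(teacher_channels),
        )

    def forward(self, x):
        first_feature_map = self.feature_map_module(x)
        first_feature = self.feature_module(first_feature_map)
        second_label_logits = self.classification_module(first_feature)
        third_feature_map = self.conversion_module(first_feature_map)
        return first_feature_map, first_feature, second_label_logits, third_feature_map
```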
When the sizes of the image segments in the image segment set are different, in order to ensure that the second pedestrian re-recognition network can perform recognition normally, the image segments in the image segment set can first be converted to the same size, and then the converted image segment set and the first pedestrian re-recognition network can be used to train the pedestrian re-recognition network to be trained to obtain the second pedestrian re-recognition network.
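A minimal sketch of this size normalization; the target size below is an illustrative choice, since the text does not prescribe one:

```python
from PIL import Image

def resize_segments(segments, size=(128, 256)):
    """Resize all image segments to a common (width, height) before training.

    The target size (128, 256) is only an illustrative choice; the text does
    not prescribe a particular size.
    """
    return [segment.resize(size) for segment in segments]
```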
Referring to fig. 5, fig. 5 is a schematic flowchart of a pedestrian re-identification method according to an embodiment of the present invention. Some steps in the flowchart shown in fig. 5 may be divided into several steps according to different requirements. As shown in fig. 5, the pedestrian re-identification method may include the following steps.
501. A query image including a first person and a second set of images including at least one image are obtained.
When pedestrian re-identification is required, a query image including a first person and a second set of images including at least one image may be obtained. The query image may include only the first person, or may include the first person and others. Each image in the second set of images may include one person or a plurality of persons. The query image and the second image set may be locally stored images, images obtained from a network or a server, or images acquired by an image acquisition device, which is not limited herein.
502. And inputting the query image into a second pedestrian re-identification network to obtain query characteristics.
After the query image and the second image set are obtained, the feature of the first person in the query image can be extracted by using a second pedestrian re-identification network, namely, the image to be queried is input into the second pedestrian re-identification network, and the feature of the first person is output by the second pedestrian re-identification network. The feature of the first person may be a feature of a human body of the first person. The second pedestrian re-identification network is a pedestrian re-identification network trained in advance by the pedestrian re-identification training method corresponding to fig. 1. The query image can be input into a feature map module in the second pedestrian re-identification network to obtain a query feature map, then the query feature map can be input into a feature module in the second pedestrian re-identification network to obtain query features, namely, the feature map module in the second pedestrian re-identification network extracts the feature map of the query image to obtain the query feature map, and the feature module in the second pedestrian re-identification network extracts the features of the query feature map to obtain the query features.
503. The second set of images is input into a second pedestrian re-identification network to obtain at least one characteristic.
After the query image and the second image set are obtained, the second pedestrian re-identification network can be used for extracting the characteristics of the people included in each image in the second image set, namely, each image in the second image set is input into the second pedestrian re-identification network, and at least one characteristic is output by the second pedestrian re-identification network. Each image in the second set of images may correspond to a feature. When the image includes a person, a feature may include a feature of the person. When the image includes a plurality of persons, one feature may include features of the plurality of persons.
Step 502 and step 503 may be executed in parallel or in series.
504. A similarity of the query feature to each of the at least one feature is determined.
After the query feature and the at least one feature are obtained, the similarity between the query feature and each of the at least one feature may be determined, that is, the similarity between the query feature and each of the at least one feature is calculated. The similarity may be calculated by a cosine distance, a cosine similarity, a Euclidean distance, or other methods, which is not limited herein.
505. And determining the images with the similarity larger than the threshold value in the second image set as the images comprising the first person.
After determining the similarity of the query feature to each of the at least one feature, the images in the second image set whose similarity is greater than the threshold may be determined as images including the first person. When an image includes a plurality of persons, the image corresponds to a plurality of similarities, and the similarity being greater than the threshold can be understood as at least one of the plurality of similarities being greater than the threshold.
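Steps 502 to 505 can be sketched end to end as follows, assuming `second_reid_net` maps an image tensor to its feature and using cosine similarity as the measure (one of the options mentioned above); the threshold value is illustrative:

```python
import torch
import torch.nn.functional as F

def find_first_person(second_reid_net, query_image, gallery_images, threshold=0.6):
    """Return indices of gallery images whose similarity to the query exceeds the threshold."""
    second_reid_net.eval()
    with torch.no_grad():
        # Step 502: query feature of the first person.
        query_feature = second_reid_net(query_image.unsqueeze(0))
        # Step 503: one feature per image in the second image set.
        gallery_features = torch.stack(
            [second_reid_net(img.unsqueeze(0)).squeeze(0) for img in gallery_images]
        )
    # Step 504: cosine similarity between the query feature and each gallery feature.
    similarities = F.cosine_similarity(query_feature, gallery_features, dim=1)
    # Step 505: images with similarity greater than the threshold include the first person.
    return [i for i, s in enumerate(similarities.tolist()) if s > threshold]
```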
Referring to fig. 6, fig. 6 is a schematic structural diagram of a pedestrian re-identification training device according to an embodiment of the present invention. As shown in fig. 6, the pedestrian re-recognition training apparatus may include:
an obtaining unit 601, configured to obtain training data, where the training data includes a first image set and a first classification label of a person included in each image in the first image set;
an intercepting unit 602, configured to intercept an image segment of a position where a person is located in each image in the first image set, to obtain an image segment set;
the input unit 603 is configured to input the first image segment into the pedestrian re-identification network to be trained to obtain a first feature map, a first feature and a second classification label, where the first image segment is any image segment in an image segment set;
the input unit 603 is further configured to input the first image segment into a first pedestrian re-recognition network to obtain a second feature map and a second feature, where the first pedestrian re-recognition network is a trained pedestrian re-recognition network, and the complexity of the first pedestrian re-recognition network is greater than that of a pedestrian re-recognition network to be trained;
a calculating unit 604, configured to calculate a total loss according to the first classification label, the second classification label, the first feature map, the second feature map, the first feature, and the second feature;
and the optimizing unit 605 is configured to optimize parameters of the pedestrian re-identification network to be trained according to the total loss to obtain a second pedestrian re-identification network.
In one embodiment, the to-be-trained pedestrian re-recognition network includes a feature map module, a feature module, and a classification module, and the inputting unit 603 inputs the first image segment into the to-be-trained pedestrian re-recognition network to obtain the first feature map, the first feature, and the second classification label includes:
extracting a feature map of the first image segment through a feature map module to obtain a first feature map;
extracting the characteristics of the first characteristic diagram through a characteristic module to obtain first characteristics;
and classifying the first characteristics through a classification module to obtain a second classification label.
In an embodiment, the calculating unit 604 is specifically configured to:
calculating a first loss according to the first feature map and the second feature map;
calculating a second loss according to the first feature and the second feature;
calculating a third loss according to the first classification label and the second classification label;
the total loss is calculated based on the first loss, the second loss, and the third loss.
In one embodiment, the network for re-identifying pedestrians to be trained further includes a conversion module, and the calculating unit 604 calculates the first loss according to the first feature map and the second feature map, including:
adjusting the dimension of the first feature map through a conversion module to obtain a third feature map, wherein the dimension of the third feature map is the same as that of the second feature map;
and calculating the first loss according to the second feature map and the third feature map.
In one embodiment, the calculating unit 604 calculates the first loss according to the second feature map and the third feature map, including:
calculating a first loss according to the second feature map, the third feature map and the first loss function;
the first loss function may be expressed as follows:
L_1 = \frac{1}{N} \sum_{i=1}^{N} (y_1 - y_2)^2

where L1 represents the first loss, N is the number of image segments included in the image segment set, y1 represents the second feature map, and y2 represents the third feature map.
More detailed descriptions of the obtaining unit 601, the intercepting unit 602, the input unit 603, the calculating unit 604 and the optimizing unit 605 can be obtained by referring to the related descriptions in the method embodiment shown in fig. 1, and are not repeated here.
Referring to fig. 7, fig. 7 is a schematic structural diagram of a pedestrian re-identification apparatus according to an embodiment of the present invention. As shown in fig. 7, the pedestrian re-recognition apparatus may include:
an obtaining unit 701, configured to obtain a query image including a first person and a second image set including at least one image;
an input unit 702, configured to input the query image into a second pedestrian re-identification network to obtain a query feature, where the second pedestrian re-identification network is obtained by training through the pedestrian re-identification training method;
an input unit 702, further configured to input the second image set into a second pedestrian re-identification network, so as to obtain at least one feature;
a determining unit 703, configured to determine a similarity between the query feature and each of the at least one feature;
the determining unit 703 is further configured to determine, as the image including the first person, the image in the second image set whose similarity is greater than the threshold.
In one embodiment, the inputting unit 702 inputs the query image into the second pedestrian re-identification network, and obtaining the query feature includes:
extracting a feature map of the query image through a feature map module in a second pedestrian re-identification network to obtain a query feature map;
and extracting the characteristics of the query characteristic graph through a characteristic module in the second pedestrian re-recognition network to obtain query characteristics.
More detailed descriptions about the obtaining unit 701, the input unit 702, and the determining unit 703 can be directly obtained by referring to the related descriptions in the method embodiment shown in fig. 5, which are not repeated herein.
Referring to fig. 8, fig. 8 is a schematic structural diagram of an apparatus according to an embodiment of the present invention. As shown in fig. 8, the apparatus may include a processor 801, a memory 802 and a bus 803. The processor 801 may be a general purpose Central Processing Unit (CPU) or multiple CPUs, one or more Graphics Processing Units (GPUs), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling the execution of programs in accordance with the present invention. The memory 802 may be a Read-Only Memory (ROM) or another type of static storage device that can store static information and instructions, a Random Access Memory (RAM) or another type of dynamic storage device that can store information and instructions, an Electrically Erasable Programmable Read-Only Memory (EEPROM), a Compact Disc Read-Only Memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, and the like), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited to these. The memory 802 may be separate or integrated with the processor 801. The bus 803 is connected to the processor 801 and transfers information between the above components.
In one case, the device may be a pedestrian re-identification training device, the memory 802 stores a set of program codes, and the processor 801 is configured to call the program codes stored in the memory 802 to perform the operations performed by the acquiring unit 601, the intercepting unit 602, the input unit 603, the calculating unit 604, and the optimizing unit 605.
In another case, the device may be a pedestrian re-identification device, the memory 802 stores a set of program codes, and the processor 801 is configured to call the program codes stored in the memory 802 to perform the operations performed by the acquiring unit 701, the input unit 702, and the determining unit 703.
In one embodiment, a computer-readable storage medium is provided for storing an application program for performing a method corresponding to fig. 1 or fig. 5 when the application program is executed.
In one embodiment, an application program is provided for performing the method corresponding to fig. 1 or fig. 5 at runtime.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by a program instructing relevant hardware. The program may be stored in a computer-readable storage medium, and the storage medium may include: a flash disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disc, and the like.
The above embodiments of the present invention are described in detail, and the principle and implementation of the present invention are explained with specific examples. The above description of the embodiments is only intended to help in understanding the method of the present invention and its core idea. Meanwhile, for those skilled in the art, the specific embodiments and the application scope may vary according to the idea of the present invention. In summary, the content of this specification should not be construed as a limitation of the present invention.

Claims (11)

1. A pedestrian re-identification training method is characterized by comprising the following steps:
obtaining training data, wherein the training data comprises a first image set and a first classification label of a person included in each image in the first image set;
intercepting image segments of the positions of the people included in each image in the first image set to obtain an image segment set;
inputting a first image segment into a pedestrian re-recognition network to be trained to obtain a first feature map, a first feature and a second classification label, wherein the first image segment is any one image segment in the image segment set;
inputting the first image segment into a first pedestrian re-recognition network to obtain a second feature map and a second feature, wherein the first pedestrian re-recognition network is a trained pedestrian re-recognition network, and the complexity of the first pedestrian re-recognition network is greater than that of the pedestrian re-recognition network to be trained;
calculating a total loss from the first classification label, the second classification label, the first feature map, the second feature map, the first feature, and the second feature;
and optimizing the parameters of the pedestrian re-identification network to be trained according to the total loss to obtain a second pedestrian re-identification network.
2. The method according to claim 1, wherein the pedestrian re-recognition network to be trained comprises a feature map module, a feature module and a classification module, and the inputting the first image segment into the pedestrian re-recognition network to be trained to obtain a first feature map, a first feature and a second classification label comprises:
extracting a feature map of the first image segment through the feature map module to obtain a first feature map;
extracting the features of the first feature map through the feature module to obtain first features;
and classifying the first characteristics through the classification module to obtain a second classification label.
3. The method of claim 2, wherein said calculating a total loss from the first classification label, the second classification label, the first feature map, the second feature map, the first feature, and the second feature comprises:
calculating a first loss according to the first feature map and the second feature map;
calculating a second loss based on the first feature and the second feature;
calculating a third loss according to the first classification label and the second classification label;
calculating a total loss from the first loss, the second loss, and the third loss.
4. The method according to claim 3, wherein the network for re-identifying pedestrians to be trained further comprises a conversion module, and the calculating the first loss according to the first feature map and the second feature map comprises:
adjusting the dimension of the first feature map through the conversion module to obtain a third feature map, wherein the dimension of the third feature map is the same as that of the second feature map;
and calculating a first loss according to the second feature map and the third feature map.
5. A pedestrian re-identification method is characterized by comprising the following steps:
obtaining a query image comprising a first person and a second image set comprising at least one image;
inputting the query image into a second pedestrian re-recognition network to obtain query features, wherein the second pedestrian re-recognition network is obtained by training through the pedestrian re-recognition training method of any one of claims 1-4;
inputting the second image set into the second pedestrian re-identification network to obtain at least one characteristic;
determining a similarity of the query feature to each of the at least one feature;
determining images in the second set of images having a similarity greater than a threshold as images including the first person.
6. The method of claim 5, wherein inputting the query image into a second pedestrian re-identification network, resulting in a query feature comprises:
extracting a feature map of the query image through a feature map module in a second pedestrian re-identification network to obtain a query feature map;
and extracting the characteristics of the query characteristic graph through a characteristic module in the second pedestrian re-identification network to obtain query characteristics.
7. A pedestrian re-recognition training device, comprising:
an obtaining unit configured to obtain training data, the training data including a first image set and a first classification label of a person included in each image in the first image set;
the intercepting unit is used for intercepting image segments of positions of people included in each image in the first image set to obtain an image segment set;
the input unit is used for inputting a first image segment into a pedestrian re-identification network to be trained to obtain a first feature map, a first feature and a second classification label, wherein the first image segment is any one image segment in the image segment set;
the input unit is further configured to input the first image segment into a first pedestrian re-recognition network to obtain a second feature map and a second feature, the first pedestrian re-recognition network is a trained pedestrian re-recognition network, and the complexity of the first pedestrian re-recognition network is greater than that of the pedestrian re-recognition network to be trained;
a calculating unit, configured to calculate a total loss according to the first classification label, the second classification label, the first feature map, the second feature map, the first feature, and the second feature;
and the optimization unit is used for optimizing the parameters of the pedestrian re-identification network to be trained according to the total loss to obtain a second pedestrian re-identification network.
8. A pedestrian re-recognition apparatus, comprising:
an acquisition unit configured to acquire a query image including a first person and a second image set including at least one image;
an input unit, configured to input the query image into a second pedestrian re-recognition network to obtain a query feature, where the second pedestrian re-recognition network is obtained by training through the pedestrian re-recognition training method according to any one of claims 1 to 4;
the input unit is further configured to input the second image set into the second pedestrian re-identification network to obtain at least one feature;
a determining unit configured to determine a similarity of the query feature to each of the at least one feature;
the determining unit is further configured to determine, as the image including the first person, the image in the second image set whose similarity is greater than a threshold.
9. A pedestrian re-recognition training apparatus comprising a processor and a memory, the processor and the memory being interconnected, wherein the memory is configured to store a computer program comprising program instructions, and wherein the processor is configured to invoke the program instructions to perform the pedestrian re-recognition training method of any one of claims 1 to 4.
10. Pedestrian re-identification device comprising a processor and a memory, said processor and said memory being interconnected, wherein said memory is adapted to store a computer program comprising program instructions, said processor being adapted to invoke said program instructions to perform the pedestrian re-identification method according to claim 5 or 6.
11. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program comprising program instructions that, when executed by a processor, cause the processor to perform the pedestrian re-recognition training method according to any one of claims 1-4, or the pedestrian re-recognition method according to claim 5 or 6.
CN202011499944.XA 2020-12-17 2020-12-17 Pedestrian re-identification method, pedestrian re-identification training method and device Pending CN114648777A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011499944.XA CN114648777A (en) 2020-12-17 2020-12-17 Pedestrian re-identification method, pedestrian re-identification training method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011499944.XA CN114648777A (en) 2020-12-17 2020-12-17 Pedestrian re-identification method, pedestrian re-identification training method and device

Publications (1)

Publication Number Publication Date
CN114648777A true CN114648777A (en) 2022-06-21

Family

ID=81989806

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011499944.XA Pending CN114648777A (en) 2020-12-17 2020-12-17 Pedestrian re-identification method, pedestrian re-identification training method and device

Country Status (1)

Country Link
CN (1) CN114648777A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination