CN115937906A - Occlusion scene pedestrian re-identification method based on occlusion inhibition and feature reconstruction - Google Patents
- Publication number
- CN115937906A CN115937906A CN202310121979.7A CN202310121979A CN115937906A CN 115937906 A CN115937906 A CN 115937906A CN 202310121979 A CN202310121979 A CN 202310121979A CN 115937906 A CN115937906 A CN 115937906A
- Authority
- CN
- China
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Abstract
The invention discloses an occluded-scene pedestrian re-identification method based on occlusion suppression and feature reconstruction, belonging to the technical field of image processing. The method first uses a random grid-aligned block occlusion enhancement strategy to generate enhanced image samples that simulate occlusion, and uses them to train an occlusion perceptron in a self-supervised manner so that the occluded positions in a pedestrian image can be predicted. An occlusion-suppression encoder then extracts features from the input image: the encoder uses a self-attention mechanism to divide the image into blocks and exchange information fully between the image blocks, and during this process the occlusion perception result suppresses feature propagation from the occluded positions, so that global features of the non-occluded regions are generated. Finally, a feature repair network reconstructs the complete pedestrian features, yielding a robust feature representation. The global features constructed by the method reduce occlusion interference and improve retrieval accuracy in occluded scenes.
Description
Technical Field
The invention relates to the field of pedestrian re-identification in image processing and machine vision, and in particular to an occluded-scene pedestrian re-identification method based on occlusion suppression and feature reconstruction.
Background
Pedestrian re-identification is an important research topic in the field of computer vision. Its objective is to match images of the same pedestrian across different cameras, and the research can be applied to tasks such as pedestrian retrieval in surveillance scenarios. In recent years, conventional pedestrian re-identification based on complete pedestrian images has achieved great success; however, pedestrian re-identification in occluded scenes remains a major challenge, as the task must use a partially occluded pedestrian image as the query to be searched for in a gallery. Pedestrian targets are frequently occluded in real surveillance scenes, so strengthening the model's stability under occlusion greatly improves the practicality of a pedestrian re-identification method.
The difficulty of occluded pedestrian re-identification is two-fold. First, it is hard to extract discriminative features when key parts of the pedestrian are occluded; second, when one person occludes another, the non-target pedestrian introduces interfering features that easily produce false matches. Current work addresses occlusion in pedestrian re-identification mainly from two directions. The first exploits global information to produce robust feature representations: to cope with occlusion, such methods mine discriminative features from as many positions or scales as possible, so that errors are reduced when some regions are occluded. The second enhances the local features of key body parts with extra cues. In occluded scenes, enhancing the local features of certain key parts is crucial, and some works use extra cues to locate key parts that are not occluded.
In an occluded scene, if all local regions are used indiscriminately to extract a unified feature, the features introduced by the occluder easily contaminate it. This produces many false matches, for example different pedestrian images being matched to the same occluder. Existing work uses an additional model to extract the pedestrian's skeleton, predicts the visibility of each body part, suppresses the occluded local features, and enhances the visible ones. However, this approach incurs extra computational overhead, and the external model may fail when the target pedestrian is occluded by other pedestrians.
Disclosure of Invention
To address these technical problems in the prior art, the invention provides a pedestrian re-identification method based on occlusion suppression and feature reconstruction that improves pedestrian re-identification accuracy in occluded scenes.
According to a first aspect of the invention, an occluded-scene pedestrian re-identification method based on occlusion suppression and feature reconstruction is provided, comprising the following steps:
step 1, performing data enhancement on a complete pedestrian image using a grid-aligned block occlusion enhancement strategy to generate an occlusion-enhanced image simulating occlusion and a corresponding occlusion label;
step 2, constructing an occlusion perceptron and training it with the occlusion-enhanced image and its corresponding occlusion label;
step 3, constructing a feature extraction network and extracting features from the complete pedestrian image and the occlusion-enhanced image respectively, obtaining complete pedestrian image features and occlusion-enhanced image features; when features are extracted from the occlusion-enhanced image, occlusion interference is suppressed using the occlusion perception result of the occlusion perceptron;
step 4, constructing a feature reconstruction network and training it with the occlusion-enhanced image features and the complete pedestrian image features;
step 5, performing occlusion perception on an occluded pedestrian image from a real scene with the occlusion perceptron to obtain an occlusion perception result; extracting features from the occluded pedestrian image and suppressing the features of the occluded regions with the occlusion perception result, obtaining global features focused on the visible regions of the pedestrian; performing feature reconstruction on these global features with the feature reconstruction network to obtain the final global features for pedestrian re-identification; and computing feature distances based on the final global features to complete pedestrian re-identification.
On the basis of the technical scheme, the invention can be improved as follows.
Optionally, the process of generating the occlusion-enhanced image and the corresponding occlusion label in step 1 includes:
scaling the complete pedestrian image to a set size and dividing it into image blocks according to an even grid;
taking a number of occlusion-enhanced images as one batch, setting an occlusion ratio, cyclically and randomly generating grid-aligned rectangles and adding them to a mask set until the total area of all rectangles in the set matches the occlusion ratio, forming an irregular block mask from the set, and generating multiple random masks for the batch through repeated sampling;
at each sampling, randomly selecting another complete pedestrian image with a different identity from the batch and randomly selecting a region of the same shape to cover the complete pedestrian image being processed, generating the occlusion-enhanced image and its corresponding occlusion label.
Optionally, the occlusion perceptron is composed of multiple self-attention layers and one linear layer;
the input of the occlusion perceptron is a sequence consisting of the image-block feature embedding sequence and an occlusion indication feature initialized as an all-zero vector.
Optionally, the training process of the occlusion perceptron includes:
the self-attention modules integrate the information carried in the image-block feature embeddings and update the occlusion indication feature; the linear layer then converts the occlusion indication feature into the occlusion perception result, where the occlusion prediction is supervised by the occlusion label corresponding to the occlusion-enhanced image.
Optionally, the feature extraction network constructed in step 3 is composed of multiple layers of self-attention modules;
the input of the feature extraction network is an input sequence consisting of an image block feature embedding sequence and an identity classification indication feature.
Optionally, the extraction process of the feature extraction network includes: integrating the information carried in the image-block feature embeddings and updating the identity classification indication feature, with each self-attention layer generating an attention matrix; the N elements of the first row of the attention matrix represent the strength of information transfer from the N image-block feature embeddings to the identity indication feature;
when features are extracted from the occlusion-enhanced image, the attention matrix is corrected according to the occlusion perception result so that image-block embeddings with high occlusion scores carry smaller weights during feature exchange, and the corrected attention map is used to compute the feature update.
Optionally, the feature reconstruction network constructed in step 4 is formed by two branches of self-attention layers, where the two branches are: global feature construction network and complete feature reasoning network;
the global feature construction network obtains complete global features based on the complete pedestrian image feature construction(ii) a The full feature inference network derives reconstructed global features &basedon the occlusion enhanced image feature inference>。
Optionally, the goal of the feature reconstruction network is for the complete feature inference network to construct, from the incomplete image, features as similar as possible to those of the complete image;
the overall training loss is L = λ1·L_id + λ2·L_tri + λ3·L_occ + λ4·L_inf, where L_id denotes the identity classification loss, L_tri the triplet loss, L_occ the occlusion prediction loss, and L_inf = ||f_r − f_c||_2 the inference loss between the reconstructed global feature f_r and the complete global feature f_c, with ||·||_2 denoting the Euclidean distance; λ1, λ2, λ3 and λ4 are the balancing weights of the four losses.
The invention provides a pedestrian re-identification method based on occlusion suppression and feature reconstruction for an occlusion scene, which has the beneficial effects that:
1. a data enhancement strategy is designed that generates occlusion-enhanced images and corresponding labels, used to strengthen the model's understanding of occlusion conditions;
2. the method improves how key visible parts are determined when enhancing local features: it does not depend on an additional model and performs occlusion perception in a self-supervised manner;
3. the method improves how local features are enhanced during feature extraction: instead of generating independent local features and re-weighting and fusing them, it directly generates global features that focus on specific local parts, which suits a feature extraction network based on the self-attention mechanism;
4. a feature reconstruction network is designed that repairs incomplete features in occluded scenes and reconstructs complete global features, improving the accuracy of pedestrian re-identification in occluded scenes.
Drawings
FIG. 1 is a diagram of an overall network structure of an embodiment of occlusion scene pedestrian re-identification based on occlusion suppression and feature reconstruction provided by the present invention;
FIG. 2 is a network training flowchart of an embodiment of occlusion scene pedestrian re-identification based on occlusion suppression and feature reconstruction provided by the present invention;
FIG. 3 is a flow chart of network prediction for an embodiment of occlusion scene pedestrian re-identification based on occlusion suppression and feature reconstruction provided by the present invention;
FIG. 4 is an exemplary diagram of labels generated by the grid-aligned block occlusion enhancement of the occluded-scene pedestrian re-identification method based on occlusion suppression and feature reconstruction according to the present invention;
FIG. 5 is a schematic diagram of a feature reconstruction embodiment of the occluded-scene pedestrian re-identification method based on occlusion suppression and feature reconstruction provided by the invention.
Detailed Description
The principles and features of this invention are described below in conjunction with the following drawings, which are set forth by way of illustration only and are not intended to limit the scope of the invention.
In the occluded-scene pedestrian re-identification method based on occlusion suppression and feature reconstruction, a random grid-aligned block occlusion enhancement strategy first generates enhanced image samples that simulate occlusion, and these samples train an occlusion perceptron in a self-supervised manner so that the occluded positions in a pedestrian image can be predicted. An occlusion-suppression encoder then extracts features from the input image: the encoder uses a self-attention mechanism to divide the image into blocks and exchange information fully between the image blocks, and during this process the occlusion perception result suppresses feature propagation from the occluded positions, thereby generating global features of the non-occluded regions. Finally, a feature repair network reconstructs the complete pedestrian features, yielding a robust feature representation.
Fig. 1 is an overall network structure diagram of an embodiment of occlusion scene pedestrian re-identification based on occlusion suppression and feature reconstruction provided by the present invention, and fig. 2 and 3 are a network training flow diagram and a network prediction flow diagram of an embodiment of occlusion scene pedestrian re-identification based on occlusion suppression and feature reconstruction provided by the present invention, respectively, and it can be known from fig. 1 to fig. 3 that the re-identification method includes:
Step 1, performing data enhancement on a complete pedestrian image using a grid-aligned block occlusion enhancement strategy to generate an occlusion-enhanced image simulating occlusion and its corresponding occlusion label. In a particular implementation, the complete pedestrian image may be obtained from a public pedestrian re-identification dataset.
Step 2, constructing an occlusion perceptron and training it with the occlusion-enhanced image and the corresponding occlusion label.
Step 3, constructing a feature extraction network and extracting features from the complete pedestrian image and the occlusion-enhanced image respectively, obtaining complete pedestrian image features and occlusion-enhanced image features; when features are extracted from the occlusion-enhanced image, occlusion interference is suppressed using the occlusion perception result of the occlusion perceptron.
Step 4, constructing a feature reconstruction network and training it with the occlusion-enhanced image features and the complete pedestrian image features.
Step 5, performing occlusion perception on an occluded pedestrian image from a real scene with the occlusion perceptron to obtain an occlusion perception result; extracting features from the occluded pedestrian image and suppressing the features of the occluded regions with the occlusion perception result, obtaining global features focused on the visible regions of the pedestrian; performing feature reconstruction on these global features with the feature reconstruction network to obtain the final global features for pedestrian re-identification; and computing feature distances based on the final global features to complete pedestrian re-identification.
The invention provides an occluded-scene pedestrian re-identification method based on occlusion suppression and feature reconstruction. First, a grid-aligned block occlusion enhancement strategy is proposed to train the model's perception of occlusion in a self-supervised manner; second, an occlusion perceptron is constructed that predicts the occlusion score of each image block of a partitioned pedestrian image; third, a feature reconstruction network is constructed that repairs the occluded, incomplete features and reconstructs the complete global features.
Example 1
In a possible embodiment, the process of generating the occlusion-enhanced image and the corresponding occlusion label in step 1 includes:
scaling the complete pedestrian image to a set size and dividing it into image blocks according to an even grid;
taking a number of occlusion-enhanced images as one batch, setting an occlusion ratio, cyclically and randomly generating grid-aligned rectangles and adding them to a mask set until the total area of all rectangles in the set matches the occlusion ratio, forming an irregular block mask from the set, and generating multiple random masks for the batch through repeated sampling;
at each sampling, randomly selecting another complete pedestrian image with a different identity from the batch and randomly selecting a region of the same shape to cover the complete pedestrian image being processed, generating the occlusion-enhanced image and its corresponding occlusion label.
In a specific implementation, the process may be as follows. The original image is first scaled to H×W and divided into N = (H/p)×(W/p) image blocks according to an even grid, where p denotes the side length of an image block. An occlusion ratio r is set, and rectangles of random size, with length and width that are integer multiples of p, are continuously generated and added to an initially empty set until the total area of the rectangles in the set is sufficiently close to r·H·W, completing the construction of the random irregular block mask. Each sampling generates one random mask for all images in a batch; another pedestrian image with a different identity is randomly selected from the same batch, and a region of the same shape is selected to cover the original image, producing the occlusion-enhanced image. At the same time, an occlusion label matrix of size (H/p)×(W/p) is obtained, whose elements take the value 0 or 1, where 0 denotes an original image block and 1 denotes a simulated-occlusion image block.
Fig. 4 shows examples of the grid-aligned block occlusion enhancement of the occluded-scene pedestrian re-identification method based on occlusion suppression and feature reconstruction provided by the present invention; in Fig. 4, white represents the mask and black the original image.
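The mask-generation and covering procedure described above can be sketched in Python with numpy. This is an illustrative sketch rather than the patented implementation: the function names, the rectangle-size ranges, and the single-channel image layout are assumptions, and the loop simply stops once the masked area first reaches the target ratio.

```python
import numpy as np

def grid_aligned_occlusion_mask(grid_h, grid_w, ratio, seed=None):
    """Accumulate random grid-aligned rectangles on a grid_h x grid_w patch
    grid until the masked area first reaches the target occlusion ratio.
    Returns the occlusion label matrix: 0 = original patch, 1 = occluded."""
    rng = np.random.default_rng(seed)
    mask = np.zeros((grid_h, grid_w), dtype=np.int64)
    target = ratio * grid_h * grid_w
    while mask.sum() < target:
        # rectangle sides are whole numbers of patches, i.e. grid-aligned
        h = int(rng.integers(1, grid_h // 2 + 1))
        w = int(rng.integers(1, grid_w // 2 + 1))
        top = int(rng.integers(0, grid_h - h + 1))
        left = int(rng.integers(0, grid_w - w + 1))
        mask[top:top + h, left:left + w] = 1
    return mask

def apply_occlusion(image, donor, mask, patch):
    """Cover the masked patches of `image` with the same regions of `donor`
    (an image of a different identity), giving the occlusion-enhanced image."""
    out = image.copy()
    # expand each grid cell of the mask to a patch x patch pixel block
    pixel_mask = np.kron(mask, np.ones((patch, patch), dtype=np.int64)).astype(bool)
    out[pixel_mask] = donor[pixel_mask]
    return out
```

`apply_occlusion` also works on H×W×C color arrays, since the boolean pixel mask selects whole pixels.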
Step 2, constructing an occlusion perceptron and training it with the occlusion-enhanced image and the corresponding occlusion label.
In one possible embodiment, the occlusion perceptron is composed of multiple self-attention layers and one linear layer; in particular, it may use 3 self-attention layers.
The input of the occlusion perceptron is a sequence consisting of the image-block feature embedding sequence and an occlusion indication feature initialized as an all-zero vector.
In a possible embodiment, the training process of the occlusion perceptron includes:
the occlusion perceptron applies a self-attention mechanism to the input: the self-attention modules integrate the information carried in the image-block feature embeddings and update the occlusion indication feature, and the linear layer converts the occlusion indication feature into the occlusion perception result S. The occlusion prediction is supervised by the occlusion label M obtained in step 1 for the occlusion-enhanced image, giving the occlusion prediction loss L_occ = CE(S, M), where CE denotes the cross-entropy function.
Through the above steps, the construction and training of the occlusion perceptron are completed.
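As a rough illustration of the perceptron's structure, the following numpy sketch stacks a few single-head self-attention layers over the sequence [occlusion token; patch embeddings] and maps each updated patch embedding to an occlusion score in (0, 1). All names and dimensions are assumptions, and training (the cross-entropy supervision against the occlusion labels) is omitted; a real implementation would use a deep-learning framework with learned multi-head attention.

```python
import numpy as np

def softmax(z, axis=-1):
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

class OcclusionPerceptron:
    """Toy occlusion perceptron: single-head self-attention layers over the
    sequence [occlusion token; patch embeddings], then a linear head mapping
    each updated patch embedding to an occlusion score in (0, 1)."""
    def __init__(self, dim, n_layers=3, seed=0):
        rng = np.random.default_rng(seed)
        self.layers = [tuple(rng.standard_normal((dim, dim)) / np.sqrt(dim)
                             for _ in range(3)) for _ in range(n_layers)]
        self.head = rng.standard_normal((dim, 1)) / np.sqrt(dim)

    def forward(self, patches):
        occ_token = np.zeros((1, patches.shape[1]))  # all-zero indicator feature
        x = np.concatenate([occ_token, patches], axis=0)
        for wq, wk, wv in self.layers:
            q, k, v = x @ wq, x @ wk, x @ wv
            attn = softmax(q @ k.T / np.sqrt(x.shape[1]))  # information exchange
            x = x + attn @ v  # residual update of every token
        logits = x[1:] @ self.head  # per-patch occlusion logits
        return (1.0 / (1.0 + np.exp(-logits))).ravel()
```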
Step 3, constructing a feature extraction network and extracting features from the complete pedestrian image and the occlusion-enhanced image respectively, obtaining complete pedestrian image features and occlusion-enhanced image features; when features are extracted from the occlusion-enhanced image, occlusion interference is suppressed using the occlusion perception result of the occlusion perceptron.
In a possible embodiment, the feature extraction network constructed in step 3 suppresses the feature embeddings of occluded image blocks using the occlusion perception result and is composed of multiple self-attention layers; in particular, it may use 11 self-attention layers.
The input of the feature extraction network is a sequence consisting of the image-block feature embedding sequence and an identity classification indication feature.
In a possible embodiment, the feature extraction network applies a self-attention mechanism to the above input. The extraction process integrates the information carried in the image-block feature embeddings and updates the identity classification indication feature; during this process, each self-attention layer generates an attention matrix, and the N elements a = (a_1, …, a_N) of its first row represent the strength of information transfer from the N image-block feature embeddings to the identity indication feature.
When features are extracted from the occlusion-enhanced image, the attention matrix is corrected according to the occlusion perception result so that image-block feature embeddings with high occlusion scores carry smaller weights during feature exchange, and the corrected attention map is used to compute the feature update.
In a specific implementation, the correction process may be: â = a ⊙ (1 − γ·s), where ⊙ denotes the Hadamard product, 1 denotes the unit vector, s the occlusion scores given by the occlusion perception result, and γ the degree of correction of the attention map.
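Assuming the occlusion scores are given as a vector in [0, 1], a minimal numpy sketch of this attention-row correction follows; the function name and the renormalization step are illustrative, and it presumes at least one patch remains partly visible so the row does not vanish.

```python
import numpy as np

def suppress_occluded_attention(attn_row, occ_scores, gamma=1.0):
    """Down-weight the attention flowing from highly occluded patch
    embeddings to the identity token, then renormalize the row."""
    corrected = attn_row * (1.0 - gamma * occ_scores)
    corrected = np.clip(corrected, 0.0, None)  # guard against gamma > 1
    return corrected / corrected.sum()
```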
Step 4, constructing a feature reconstruction network and training it with the occlusion-enhanced image features and the complete pedestrian image features.
Fig. 5 is a schematic diagram of a feature reconstruction embodiment of the occluded-scene pedestrian re-identification method based on occlusion suppression and feature reconstruction provided by the invention. As shown in Fig. 5, in a possible embodiment, the feature reconstruction network constructed in step 4 is composed of two branches of self-attention layers: a global feature construction network and a complete feature inference network.
The global feature construction network constructs the complete global feature f_c from the complete pedestrian image features; the complete feature inference network infers the reconstructed global feature f_r from the occlusion-enhanced image features.
The feature reconstruction network is trained in a self-supervised manner: at the feature level, it recovers the incomplete features, which attend only to the visible regions, into the global features of the complete pedestrian, so that the resulting features are more discriminative for pedestrian re-identification in occluded scenes.
In particular, the goal of the feature reconstruction network is for the complete feature inference network to construct, from the incomplete image, features as similar as possible to those of the complete image.
The overall training loss is L = λ1·L_id + λ2·L_tri + λ3·L_occ + λ4·L_inf, where L_id denotes the identity classification loss, L_tri the triplet loss, L_occ the occlusion prediction loss, and L_inf = ||f_r − f_c||_2 the inference loss between the reconstructed global feature f_r and the complete global feature f_c, with ||·||_2 denoting the Euclidean distance; the inference loss is back-propagated only through the complete feature inference branch. λ1, λ2, λ3 and λ4 are the balancing weights of the four losses.
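The combined objective can be sketched numerically as follows; the function name, the default unit weights, and the scalar inputs for the first three losses are illustrative assumptions (in training these would be differentiable tensors produced by the network heads).

```python
import numpy as np

def total_loss(l_id, l_tri, l_occ, f_rec, f_full, weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of the four losses; the inference loss is the Euclidean
    distance between the reconstructed and the complete global feature."""
    l_inf = float(np.linalg.norm(np.asarray(f_rec) - np.asarray(f_full)))
    w1, w2, w3, w4 = weights
    return w1 * l_id + w2 * l_tri + w3 * l_occ + w4 * l_inf
```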
Step 5, performing occlusion perception on an occluded pedestrian image from a real scene with the occlusion perceptron to obtain an occlusion perception result; extracting features from the occluded pedestrian image and suppressing the features of the occluded regions with the occlusion perception result, obtaining global features focused on the visible regions of the pedestrian; performing feature reconstruction on these global features with the feature reconstruction network to obtain the final global features for pedestrian re-identification; and computing feature distances based on the final global features to complete pedestrian re-identification.
In the prediction stage of the method, the occlusion perceptron trained in step 2 performs occlusion perception on the occluded pedestrian images in the real-scene query set to obtain the occlusion perception results. The feature extraction network obtained in step 3 then extracts features from the occluded pedestrian image, with the occlusion perception result suppressing the features of the occluded regions by correcting the attention map, generating global features of the visible regions of the occluded pedestrian. Next, one branch of the feature reconstruction network trained in step 4, namely the complete feature inference network, performs feature reconstruction on these global features to obtain the final global features for pedestrian re-identification. Finally, the feature distances between the obtained global features and the gallery features extracted in the same way are computed, the results are sorted by cosine distance, and matches are output in order of increasing feature distance to complete the pedestrian re-identification task.
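The closing retrieval step, ranking the gallery by cosine distance to the query feature, can be sketched as follows (names are illustrative; features are assumed to be nonzero vectors):

```python
import numpy as np

def rank_gallery(query_feat, gallery_feats):
    """Return gallery indices ordered by cosine distance to the query,
    nearest first, as in the final retrieval step."""
    q = query_feat / np.linalg.norm(query_feat)
    g = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)
    return np.argsort(1.0 - g @ q)  # cosine distance, ascending
```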
The beneficial effects include:
1. A data enhancement strategy is designed that generates occlusion-enhanced images and corresponding labels, used to strengthen the model's understanding of occlusion conditions.
2. The method improves how key visible parts are determined when enhancing local features: it does not depend on an additional model and performs occlusion perception in a self-supervised manner.
3. The method improves how local features are enhanced during feature extraction: instead of generating independent local features and re-weighting and fusing them, it directly generates global features that focus on specific local parts, which suits a feature extraction network based on the self-attention mechanism.
4. A feature reconstruction network is designed that repairs incomplete features in occluded scenes and reconstructs complete global features, improving the accuracy of pedestrian re-identification in occluded scenes.
It should be noted that, in the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to relevant descriptions of other embodiments for parts that are not described in detail in a certain embodiment.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While the preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.
Claims (8)
1. A pedestrian re-identification method for occlusion scenes based on occlusion suppression and feature reconstruction, characterized in that the re-identification method comprises the following steps:
step 1, performing data enhancement on a complete pedestrian image by using a grid-aligned block occlusion enhancement strategy to generate an occlusion-enhanced image that simulates occlusion and a corresponding occlusion label;
step 2, constructing an occlusion perceptron, and training the occlusion perceptron with the occlusion-enhanced image and the corresponding occlusion label;
step 3, constructing a feature extraction network, and extracting features from the complete pedestrian image and the occlusion-enhanced image respectively to obtain complete pedestrian image features and occlusion-enhanced image features; when extracting features from the occlusion-enhanced image, suppressing occlusion interference by using the occlusion perception result of the occlusion perceptron;
step 4, constructing a feature reconstruction network, and training it with the occlusion-enhanced image features and the complete pedestrian image features;
step 5, performing occlusion perception on an occluded pedestrian image from a real scene by using the occlusion perceptron to obtain an occlusion perception result; extracting features from the occluded pedestrian image, and suppressing the features of occluded regions by using the occlusion perception result to obtain global features focused on the visible pedestrian regions; reconstructing the global features with the feature reconstruction network to obtain final global features for pedestrian re-identification; and computing feature distances based on the final global features to complete pedestrian re-identification.
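The matching at the end of claim 1 reduces to nearest-neighbour ranking over distances between final global features. A minimal sketch, assuming a Euclidean metric over feature vectors (the function name and metric choice are illustrative, not specified by the claims):

```python
import numpy as np

def rank_gallery(query_feat, gallery_feats):
    """Rank gallery entries by Euclidean distance between final global
    features, as in the matching step of claim 1 (step 5)."""
    dists = np.linalg.norm(gallery_feats - query_feat, axis=1)
    return np.argsort(dists)  # indices of gallery entries, nearest first
```

The top-ranked gallery index is the predicted identity match for the query.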
2. The re-identification method of claim 1, wherein the process of generating the occlusion-enhanced image and the corresponding occlusion label in step 1 comprises:
scaling the complete pedestrian image to a set size and dividing it into image blocks along an equally spaced grid;
taking a plurality of occlusion-enhanced images as a batch, setting an occlusion ratio, cyclically generating random grid-aligned rectangles and adding them to a mask set until the total area of all rectangles in the mask set matches the occlusion ratio, forming an irregular block mask from the mask set, and generating a plurality of random masks for the batch through repeated sampling;
and at each sampling, randomly selecting another complete pedestrian image of a different identity from the batch, randomly selecting a region of the same shape to cover the complete pedestrian image being processed, and generating the occlusion-enhanced image and the corresponding occlusion label.
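The grid-aligned occlusion enhancement of claim 2 can be sketched as a simplified single-image version. This is a sketch under stated assumptions: image height and width divisible by the grid, rectangle sizes capped at half the grid, and a cell-level binary mask as the occlusion label; none of these specifics come from the claims.

```python
import numpy as np

def grid_block_occlusion(img, donor, grid=(16, 8), ratio=0.3, rng=None):
    """Cover grid-aligned random rectangles of `img` with the same regions
    of a donor image of a different identity; return the occlusion-enhanced
    image and a cell-level occlusion label (1 = occluded cell)."""
    rng = rng or np.random.default_rng()
    gh, gw = grid
    ch, cw = img.shape[0] // gh, img.shape[1] // gw  # pixels per grid cell
    mask = np.zeros((gh, gw), dtype=bool)
    # Add grid-aligned rectangles until the masked area reaches the target
    # occlusion ratio (the final area may slightly overshoot the ratio).
    while mask.mean() < ratio:
        rh, rw = rng.integers(1, gh // 2 + 1), rng.integers(1, gw // 2 + 1)
        r0, c0 = rng.integers(0, gh - rh + 1), rng.integers(0, gw - rw + 1)
        mask[r0:r0 + rh, c0:c0 + rw] = True
    # Expand the cell mask to pixel resolution and paste the donor regions.
    pix = np.kron(mask.astype(np.uint8),
                  np.ones((ch, cw), np.uint8)).astype(bool)
    out = img.copy()
    out[pix] = donor[pix]
    return out, mask.astype(np.float32)
```

The returned cell mask plays the role of the occlusion label that supervises the occlusion perceptron in step 2.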
3. The re-identification method of claim 1, wherein the occlusion perceptron is composed of multiple layers of self-attention modules and a linear layer;
the input of the occlusion perceptron is an input sequence consisting of an image block feature embedding sequence and an occlusion indication feature initialized to an all-zero vector.
4. The re-identification method of claim 3, wherein the training process of the occlusion perceptron comprises:
integrating, through the self-attention modules, the information carried in the image block feature embeddings and updating the occlusion indication feature; and converting, through the linear layer, the occlusion indication feature into the occlusion perception result; wherein the occlusion prediction is supervised by the occlusion label corresponding to the occlusion-enhanced image.
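A minimal sketch of the forward pass described in claims 3 and 4: an all-zero indication token is prepended to the patch embedding sequence, refined by a stack of self-attention layers, and mapped by a linear layer to occlusion scores. The single-head attention, residual update, sigmoid output, and one-score-per-patch output shape are illustrative assumptions, not details taken from the claims.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def occlusion_perceptron(patch_embed, layers, w_out):
    """Prepend an all-zero occlusion-indication token to the patch
    embedding sequence (N, d), refine the sequence through self-attention
    layers, and map the indication token through a linear head to
    per-patch occlusion scores in (0, 1)."""
    d = patch_embed.shape[1]
    seq = np.vstack([np.zeros((1, d)), patch_embed])  # token 0 = indicator
    for wq, wk, wv in layers:                          # self-attention stack
        q, k, v = seq @ wq, seq @ wk, seq @ wv
        att = softmax(q @ k.T / np.sqrt(d))
        seq = seq + att @ v                            # residual update
    logits = seq[0] @ w_out                            # linear head, (N,)
    return 1.0 / (1.0 + np.exp(-logits))               # sigmoid scores
```

In training, these scores would be compared against the cell-level occlusion label from the enhancement step.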
5. The re-identification method of claim 1, wherein the feature extraction network constructed in step 3 is composed of multiple layers of self-attention modules;
the input of the feature extraction network is an input sequence consisting of an image block feature embedding sequence and an identity classification indication feature.
6. The re-identification method of claim 5, wherein the extraction process of the feature extraction network comprises: integrating the information carried in the image block feature embeddings and updating the identity classification indication feature, with each layer of self-attention modules generating an attention matrix; wherein the N elements in the first row of the attention matrix represent the strength of information transfer from the N image block feature embeddings to the identity indication feature;
when extracting features from the occlusion-enhanced image, the attention matrix is corrected according to the occlusion perception result so that image block feature embeddings with high occlusion scores receive smaller weights in the feature exchange process, and the corrected attention map is used to compute the feature update.
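The attention correction of claim 6 can be illustrated as re-weighting the identity token's attention row by the occlusion scores. Masking only row 0 and renormalising that row are assumptions made for this sketch; the patent does not specify the exact correction formula.

```python
import numpy as np

def suppress_occluded_attention(att, occ_scores):
    """Down-weight, in the identity token's attention row (row 0), the
    patch tokens with high occlusion scores so they contribute less to
    the feature update, then renormalise the row to sum to one.
    att: (N+1, N+1) attention map; occ_scores: (N,) scores in [0, 1]."""
    att = att.copy()
    keep = np.concatenate(([1.0], 1.0 - occ_scores))  # indicator token kept
    att[0] = att[0] * keep                            # suppress occluded patches
    att[0] = att[0] / att[0].sum()                    # renormalise the row
    return att
```

A patch with occlusion score 1 thus contributes nothing to the identity indication feature's update.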
7. The re-identification method of claim 1, wherein the feature reconstruction network constructed in step 4 is composed of two branches of self-attention layers, the two branches being a global feature construction network and a complete feature inference network, respectively;
8. The re-identification method of claim 7, wherein the feature reconstruction network aims to construct, through the complete feature inference network, features from the incomplete image that are as similar as possible to the features of the complete image;
wherein the training objective combines an identity classification loss, a triplet loss, an occlusion prediction loss, and a feature inference loss measured by the Euclidean distance, with four balance weights weighting the respective four losses.
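The four-term objective of claim 8 can be sketched as a weighted sum. The concrete loss forms used here (softmax cross-entropy for identity, margin-based triplet loss, binary cross-entropy for occlusion prediction, plain Euclidean distance for feature inference) and the default weights are assumptions, since the claim only names the loss types and the distance metric.

```python
import numpy as np

def total_loss(id_logits, id_label, anchor, pos, neg,
               occ_pred, occ_label, feat_rec, feat_full,
               weights=(1.0, 1.0, 1.0, 1.0), margin=0.3):
    """Weighted sum of the four training losses named in claim 8."""
    # 1) identity classification loss: softmax cross-entropy
    p = np.exp(id_logits - id_logits.max())
    p = p / p.sum()
    l_id = -np.log(p[id_label])
    # 2) triplet loss with margin over Euclidean distances
    dist = lambda a, b: np.linalg.norm(a - b)
    l_tri = max(0.0, dist(anchor, pos) - dist(anchor, neg) + margin)
    # 3) occlusion prediction loss: binary cross-entropy, scores in (0, 1)
    eps = 1e-7
    l_occ = -np.mean(occ_label * np.log(occ_pred + eps)
                     + (1.0 - occ_label) * np.log(1.0 - occ_pred + eps))
    # 4) feature inference loss: Euclidean distance between reconstructed
    #    and complete-image features
    l_rec = dist(feat_rec, feat_full)
    w1, w2, w3, w4 = weights
    return w1 * l_id + w2 * l_tri + w3 * l_occ + w4 * l_rec
```

The four balance weights correspond to the per-loss weights described in the claim.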
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310121979.7A CN115937906B (en) | 2023-02-16 | 2023-02-16 | Occlusion scene pedestrian re-identification method based on occlusion suppression and feature reconstruction |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115937906A true CN115937906A (en) | 2023-04-07 |
CN115937906B CN115937906B (en) | 2023-06-06 |
Family
ID=85827197
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310121979.7A Active CN115937906B (en) | 2023-02-16 | 2023-02-16 | Occlusion scene pedestrian re-identification method based on occlusion suppression and feature reconstruction |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115937906B (en) |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150154765A1 (en) * | 2011-10-28 | 2015-06-04 | Carestream Health, Inc. | Tomosynthesis reconstruction with rib suppression |
CN110929578A (en) * | 2019-10-25 | 2020-03-27 | 南京航空航天大学 | Anti-blocking pedestrian detection method based on attention mechanism |
US20200125925A1 (en) * | 2018-10-18 | 2020-04-23 | Deepnorth Inc. | Foreground Attentive Feature Learning for Person Re-Identification |
CN111310718A (en) * | 2020-03-09 | 2020-06-19 | 成都川大科鸿新技术研究所 | High-accuracy detection and comparison method for face-shielding image |
CN112465872A (en) * | 2020-12-10 | 2021-03-09 | 南昌航空大学 | Image sequence optical flow estimation method based on learnable occlusion mask and secondary deformation optimization |
CN112801051A (en) * | 2021-03-29 | 2021-05-14 | 哈尔滨理工大学 | Method for re-identifying blocked pedestrians based on multitask learning |
CN114022823A (en) * | 2021-11-16 | 2022-02-08 | 北京信息科技大学 | Shielding-driven pedestrian re-identification method and system and storable medium |
CN114419671A (en) * | 2022-01-18 | 2022-04-29 | 北京工业大学 | Hypergraph neural network-based occluded pedestrian re-identification method |
CN114639122A (en) * | 2022-03-22 | 2022-06-17 | 江苏大学 | Attitude correction pedestrian re-recognition method based on convolution generation countermeasure network |
WO2022222766A1 (en) * | 2021-04-21 | 2022-10-27 | 中山大学 | Semantic segmentation-based face integrity measurement method and system, device and storage medium |
CN115565207A (en) * | 2022-11-29 | 2023-01-03 | 武汉图科智能科技有限公司 | Occlusion scene downlink person detection method with feature simulation fused |
Non-Patent Citations (3)
Title |
---|
KAZIWA SALEH et al.: "Occlusion Handling in Generic Object Detection: A Review", 《IEEE》, pages 1 - 8 *
YUEQIAO FAN et al.: "DSF-net: occluded person re-identification based on dual structure features", 《NEURAL COMPUTING AND APPLICATIONS》, pages 3537 *
WANG Xudong: "Research on deep-learning-based occluded face detection and restoration", 《China Excellent Dissertations Electronic Journals Network》, pages 44 - 59 *
Also Published As
Publication number | Publication date |
---|---|
CN115937906B (en) | 2023-06-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113223068B (en) | Multi-mode image registration method and system based on depth global features | |
CN112699786B (en) | Video behavior identification method and system based on space enhancement module | |
CN111626184B (en) | Crowd density estimation method and system | |
CN109784283A (en) | Based on the Remote Sensing Target extracting method under scene Recognition task | |
CN112801169A (en) | Camouflage target detection method based on improved YOLO algorithm | |
CN112084923A (en) | Semantic segmentation method for remote sensing image, storage medium and computing device | |
CN116503399B (en) | Insulator pollution flashover detection method based on YOLO-AFPS | |
CN112884802A (en) | Anti-attack method based on generation | |
Wei et al. | Traffic sign detection and recognition using novel center-point estimation and local features | |
Kang et al. | YOLO-FA: Type-1 fuzzy attention based YOLO detector for vehicle detection | |
CN115222998A (en) | Image classification method | |
CN112329771A (en) | Building material sample identification method based on deep learning | |
CN115546171A (en) | Shadow detection method and device based on attention shadow boundary and feature correction | |
Pratiwi et al. | Early detection of deforestation through satellite land geospatial images based on CNN architecture | |
CN113704276A (en) | Map updating method and device, electronic equipment and computer readable storage medium | |
CN115937906A (en) | Occlusion scene pedestrian re-identification method based on occlusion inhibition and feature reconstruction | |
Ataş | Performance Evaluation of Jaccard-Dice Coefficient on Building Segmentation from High Resolution Satellite Images | |
CN115953668A (en) | Method and system for detecting camouflage target based on YOLOv5 algorithm | |
CN112966569B (en) | Image processing method and device, computer equipment and storage medium | |
CN117593470B (en) | Street view reconstruction method and system based on AI model | |
Li et al. | Visual Servo Technology for Fault Detection in Power Inspection UAVs | |
Ren et al. | Incremental Land Cover Classification via Label Strategy and Adaptive Weights | |
CN117765231A (en) | Radar image sea clutter suppression auxiliary target detection method and system | |
El Rai et al. | MSLandslide: A MultiSource Segmentation For Remote Sensing Landslide Images | |
Liu et al. | DMSHNet: Multi-Scale and Multi-Supervised Hierarchical network for Remote-Sensing Image Change Detection |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CP03 | Change of name, title or address |

Address after: No. 548, 5th Floor, Building 10, No. 28 Linping Avenue, Donghu Street, Linping District, Hangzhou City, Zhejiang Province Patentee after: Hangzhou Tuke Intelligent Information Technology Co.,Ltd. Address before: 430000 B033, No. 05, 4th floor, building 2, international enterprise center, No. 1, Guanggu Avenue, Donghu New Technology Development Zone, Wuhan, Hubei (Wuhan area of free trade zone) Patentee before: Wuhan Tuke Intelligent Technology Co.,Ltd. |