CN115937906A - Occlusion scene pedestrian re-identification method based on occlusion inhibition and feature reconstruction - Google Patents


Info

Publication number
CN115937906A
CN115937906A (application CN202310121979.7A)
Authority
CN
China
Prior art keywords
occlusion
feature
image
shielding
pedestrian
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310121979.7A
Other languages
Chinese (zh)
Other versions
CN115937906B (en)
Inventor
韩守东
章孜闻
郭维
刘东海生
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Tuke Intelligent Information Technology Co ltd
Original Assignee
Wuhan Tuke Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Tuke Intelligent Technology Co ltd filed Critical Wuhan Tuke Intelligent Technology Co ltd
Priority to CN202310121979.7A
Publication of CN115937906A
Application granted
Publication of CN115937906B
Legal status: Active
Anticipated expiration

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00: Road transport of goods or passengers
    • Y02T 10/10: Internal combustion engine [ICE] based vehicles
    • Y02T 10/40: Engine management systems

Abstract

The invention discloses an occluded-scene pedestrian re-identification method based on occlusion suppression and feature reconstruction, belonging to the technical field of image processing. The method first uses a random grid-aligned block occlusion augmentation strategy to generate augmented image samples that simulate occlusion; these samples train an occlusion perceptron in a self-supervised manner, so that occluded positions in a pedestrian image can be predicted. An occlusion-suppression encoder then extracts features from the input image: the encoder divides the image into blocks and exchanges information among them with a self-attention mechanism, and during this exchange the occlusion-perception result suppresses feature propagation from occluded positions, so that global features of the non-occluded regions are produced. Finally, a feature repair network rebuilds the complete pedestrian feature, yielding a robust feature representation. The global feature constructed in this way reduces occlusion interference and improves retrieval accuracy in occluded scenes.

Description

Occlusion scene pedestrian re-identification method based on occlusion inhibition and feature reconstruction
Technical Field
The invention relates to the field of pedestrian re-identification within image processing and machine vision, and in particular to an occluded-scene pedestrian re-identification method based on occlusion suppression and feature reconstruction.
Background
Pedestrian re-identification is an important research topic in computer vision. Its goal is to match images of the same pedestrian across different cameras, and it supports tasks such as pedestrian retrieval in surveillance scenes. In recent years, conventional pedestrian re-identification based on complete pedestrian images has been highly successful; re-identification in occluded scenes, however, remains a major challenge, because the task must use a partially occluded pedestrian image as the query when searching a gallery. Pedestrian targets are frequently occluded in real surveillance scenes, so strengthening a model's stability under occlusion greatly improves the practicality of pedestrian re-identification methods.
The difficulty of occluded pedestrian re-identification is two-fold. First, it is hard to extract discriminative features when key parts of a pedestrian are occluded. Second, when one pedestrian blocks another, the non-target pedestrian introduces interfering features that easily cause false matches. Current work addresses occlusion in pedestrian re-identification along two lines. The first exploits global information to produce robust feature representations: to cope with occlusion, such methods mine discriminative features from as many positions or scales as possible, so that errors are reduced when some regions are occluded. The second enhances the local features of key body parts with extra cues: in occluded scenes it is crucial to strengthen the local features of certain key parts, and some works use extra cues to locate key parts that are not occluded.
In an occluded scene, if all local regions are used indiscriminately to extract a unified feature, features introduced by the occluding object easily contaminate it. This produces many false matches, for example different pedestrian images matching the same occluder. Existing work uses an additional model to extract the pedestrian's skeleton, predicts the visibility of each body part, suppresses occluded local features, and enhances visible ones. However, this approach incurs extra computational overhead, and the external model may fail when the target pedestrian is occluded by other pedestrians.
Disclosure of Invention
To address these problems in the prior art, the invention provides a pedestrian re-identification method based on occlusion suppression and feature reconstruction that improves re-identification accuracy in occluded scenes.
According to a first aspect of the invention, an occluded-scene pedestrian re-identification method based on occlusion suppression and feature reconstruction is provided, comprising the following steps:
Step 1, perform data augmentation on complete pedestrian images using a grid-aligned block occlusion augmentation strategy, generating occlusion-augmented images that simulate occlusion together with their corresponding occlusion labels;
Step 2, construct an occlusion perceptron and train it with the occlusion-augmented images and their corresponding occlusion labels;
Step 3, construct a feature extraction network and extract features from the complete pedestrian images and the occlusion-augmented images, obtaining complete-image features and occlusion-augmented image features; when extracting features from an occlusion-augmented image, suppress occlusion interference using the perceptron's occlusion-perception result;
Step 4, construct a feature reconstruction network and train it with the occlusion-augmented image features and the complete-image features;
Step 5, use the occlusion perceptron to perform occlusion perception on a real-scene occluded pedestrian image, obtaining the occlusion-perception result; extract features from the occluded pedestrian image, suppressing occluded-region features with the occlusion-perception result to obtain a global feature focused on the pedestrian's visible regions; reconstruct this global feature with the feature reconstruction network to obtain the final global feature for re-identification; and compute feature distances on the final global features to complete pedestrian re-identification.
On the basis of the technical scheme, the invention can be improved as follows.
Optionally, the process of generating an occlusion-augmented image and its corresponding occlusion label in step 1 comprises:
scaling the complete pedestrian image to a set size and dividing it into image blocks along an evenly spaced grid;
taking several images as one batch and setting an occlusion ratio, then repeatedly generating random grid-aligned rectangles and adding them to a mask set until the total area of the rectangles in the set matches the occlusion ratio, forming an irregular block mask; multiple samplings generate several random masks for the batch;
at each sampling, randomly selecting another complete pedestrian image with a different identity from the batch and copying the region of the same shape over the complete pedestrian image being processed, generating the occlusion-augmented image and its corresponding occlusion label.
Optionally, the occlusion perceptron consists of several self-attention layers followed by a linear layer;
the input of the occlusion perceptron is a sequence formed by the image-block feature embeddings and an occlusion-indication feature initialised as an all-zero vector.
Optionally, the training process of the occlusion perceptron comprises:
the self-attention modules integrate the information carried in the image-block feature embeddings and update the occlusion-indication feature x_occ;
the linear layer then converts x_occ into the occlusion-perception result s;
the occlusion prediction is supervised by the occlusion label corresponding to the occlusion-augmented image.
Optionally, the feature extraction network constructed in step 3 consists of several self-attention layers;
the input of the feature extraction network is a sequence formed by the image-block feature embeddings and an identity-classification indication feature.
Optionally, the extraction process of the feature extraction network comprises: integrating the information carried in the image-block feature embeddings and updating the identity-classification indication feature, with each self-attention layer producing an attention matrix; the N elements in the first row of the attention matrix give the strength of information transfer from the N image-block embeddings to the identity-indication feature;
when extracting features from an occlusion-augmented image, the attention matrix is corrected according to the occlusion-perception result, so that image-block embeddings with high occlusion scores receive smaller weights during feature exchange, and the corrected attention map is used to compute the feature update.
Optionally, the feature reconstruction network constructed in step 4 consists of two branches of self-attention layers: a global-feature construction network and a complete-feature inference network;
the global-feature construction network builds the complete global feature f_g from the complete pedestrian-image features, while the complete-feature inference network infers the reconstructed global feature f̂_g from the occlusion-augmented image features.
Optionally, the goal of the feature reconstruction network is that the complete-feature inference network constructs, from the incomplete image, features as similar as possible to those of the complete image;
the overall training loss of the feature reconstruction network is

L = λ1·L_id + λ2·L_tri + λ3·L_occ + λ4·L_rec ,

where L_id denotes the identity-classification loss, L_tri the triplet loss, L_occ the occlusion-prediction loss, and L_rec = ‖f̂_g − f_g‖₂ the inference loss, with ‖·‖₂ denoting the Euclidean distance; λ1, λ2, λ3 and λ4 are the balance weights of the four losses.
The occluded-scene pedestrian re-identification method based on occlusion suppression and feature reconstruction provides the following benefits:
1. A data augmentation strategy is designed that generates occlusion-augmented images with corresponding labels, strengthening the model's understanding of occlusion;
2. The way key visible parts are identified for local-feature enhancement is improved: no additional model is required, and occlusion perception is learned in a self-supervised manner;
3. The way local features are enhanced during feature extraction is improved: instead of generating separate local features and re-weighting and fusing them, a global feature attending to specific local regions is produced directly, which better suits a self-attention-based feature extraction network;
4. A feature reconstruction network is designed that repairs incomplete features in occluded scenes and reconstructs the complete global feature, improving re-identification accuracy under occlusion.
Drawings
FIG. 1 is the overall network structure diagram of an embodiment of occluded-scene pedestrian re-identification based on occlusion suppression and feature reconstruction provided by the invention;
FIG. 2 is the network training flowchart of the embodiment;
FIG. 3 is the network prediction flowchart of the embodiment;
FIG. 4 is an example of the grid-aligned block occlusion augmentation and the label it generates;
FIG. 5 is a schematic diagram of the feature reconstruction in the method.
Detailed Description
The principles and features of this invention are described below in conjunction with the following drawings, which are set forth by way of illustration only and are not intended to limit the scope of the invention.
In the occluded-scene pedestrian re-identification method based on occlusion suppression and feature reconstruction, a random grid-aligned block occlusion augmentation strategy first generates augmented image samples that simulate occlusion; these samples train an occlusion perceptron in a self-supervised manner, so that occluded positions in a pedestrian image can be predicted. An occlusion-suppression encoder then extracts features from the input image: the encoder divides the image into blocks and exchanges information among them with a self-attention mechanism, and during this exchange the occlusion-perception result suppresses feature propagation from occluded positions, generating global features of the non-occluded regions. Finally, a feature repair network rebuilds the complete pedestrian feature, yielding a robust feature representation.
Fig. 1 shows the overall network structure of an embodiment of occluded-scene pedestrian re-identification based on occlusion suppression and feature reconstruction; Figs. 2 and 3 show the corresponding network training and prediction flows. As shown in Figs. 1 to 3, the re-identification method comprises:
step 1, performing data enhancement on a complete pedestrian image by using a grid alignment block-shaped shielding enhancement strategy to generate a shielding enhancement image simulating shielding and a shielding label corresponding to the shielding enhancement image.
In particular implementations, a complete pedestrian image may be obtained from the pedestrian re-identification public dataset.
And 2, constructing an occlusion sensor, and training the occlusion sensor by using the occlusion enhanced image and the corresponding occlusion label.
Step 3, constructing a feature extraction network, and respectively extracting features of the complete pedestrian image and the shielding enhanced image to obtain a complete pedestrian image feature and a shielding enhanced image feature; and when the feature extraction is carried out on the shielding enhanced image, shielding interference is suppressed by using a shielding sensing result of the shielding sensor.
And 4, constructing a characteristic reconstruction network, and training by using the shielding enhanced image characteristics and the complete pedestrian image characteristics to obtain the characteristic reconstruction network.
Step 5, carrying out occlusion perception on an occluded pedestrian image of a real scene by using the occlusion perceptron to obtain an occlusion perception result; extracting features of the image of the pedestrians sheltered, and restraining the features of the sheltered area by using the sheltered sensing result to obtain the global features of the visible area of the pedestrians concerned; carrying out feature reconstruction on the global features based on the feature reconstruction network to obtain final global features for pedestrian re-identification; and calculating a characteristic distance based on the final global characteristics to complete pedestrian re-identification.
In this occluded-scene pedestrian re-identification method based on occlusion suppression and feature reconstruction: first, a grid-aligned block occlusion augmentation strategy is proposed to train the model's occlusion-perception capability in a self-supervised manner; second, an occlusion perceptron is constructed that predicts the occlusion score of each block of the partitioned pedestrian image; third, a feature reconstruction network is constructed that repairs occluded, incomplete features and reconstructs the complete global feature.
Example 1
Embodiment 1 of the invention is an implementation of the occluded-scene pedestrian re-identification method based on occlusion suppression and feature reconstruction; as shown in Figs. 1 to 3, it comprises:
step 1, performing data enhancement on a complete pedestrian image by using a grid alignment block-shaped shielding enhancement strategy to generate a shielding enhancement image simulating shielding and a shielding label corresponding to the shielding enhancement image.
In particular implementations, a complete pedestrian image may be obtained from the pedestrian re-identification public dataset.
In a possible embodiment, the process of generating the occlusion enhanced image and the corresponding occlusion label in step 1 includes:
scaling the complete pedestrian image to a set size and dividing into image blocks according to an equally divided grid.
Setting a plurality of shielding enhanced images as a batch image, setting a shielding proportion, circularly and randomly generating rectangles aligned with grids, adding the rectangles into a mask set until the total area of all rectangles in the mask set is consistent with the shielding proportion, forming an irregular block mask based on the mask set, and generating a plurality of random masks of the batch image after multiple sampling.
And selecting other complete pedestrian images with different identities from the batch images at random during sampling each time, randomly selecting the same-shaped area to cover the complete pedestrian image to be processed, and generating the shielding enhanced image and the corresponding shielding label.
In a specific implementation, the process may be as follows. The original image is first scaled to H×W and divided along an evenly spaced grid into N image blocks of side length P. An occlusion ratio r is set; rectangles of random size, with side lengths that are integer multiples of P, are then generated continuously and added to an initially empty set until the total area of the rectangles in the set is sufficiently close to r·H·W, completing the construction of the random irregular block mask. Each sampling generates one random mask for all images in a batch; another pedestrian image with a different identity is randomly selected from the same batch, and the region of the same shape is copied over the original image to obtain the occlusion-augmented image. At the same time, an occlusion label matrix M of size (H/P)×(W/P) is obtained, whose elements take the value 0 or 1: 0 marks an original image block and 1 marks a simulated-occlusion image block.
Fig. 4 shows an example of the grid-aligned block occlusion augmentation used by the method; in Fig. 4, white represents the mask and black represents the original image.
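The augmentation above can be sketched in NumPy as follows. This is a hypothetical re-implementation: the function and parameter names are invented, the loop cap `max_tries` is an added safeguard, and copying the occluder region at the same location as the mask is one of several placements consistent with the description.

```python
import numpy as np

def make_block_mask(h_cells, w_cells, ratio, rng, max_tries=1000):
    """Random grid-aligned irregular block mask (hypothetical sketch).

    h_cells, w_cells: mask size in grid cells (image side / block side P).
    ratio: target fraction r of cells to occlude. Grid-aligned rectangles
    are accumulated until the masked area reaches the target (the patent
    only requires the area to be "sufficiently close", so a small
    overshoot by the last rectangle is accepted).
    Returns a (h_cells, w_cells) 0/1 array; 1 marks a simulated occlusion.
    """
    target = int(round(ratio * h_cells * w_cells))
    mask = np.zeros((h_cells, w_cells), dtype=np.int64)
    for _ in range(max_tries):
        if mask.sum() >= target:
            break
        rh = int(rng.integers(1, h_cells + 1))        # height in cells
        rw = int(rng.integers(1, w_cells + 1))        # width in cells
        top = int(rng.integers(0, h_cells - rh + 1))
        left = int(rng.integers(0, w_cells - rw + 1))
        mask[top:top + rh, left:left + rw] = 1
    return mask

def occlude(image, occluder, cell_mask, patch):
    """Copy occluder pixels over the image wherever the cell mask is 1."""
    # Expand each grid cell of the mask to a patch x patch pixel block.
    pix = np.kron(cell_mask, np.ones((patch, patch), dtype=np.int64))
    out = image.copy()
    out[pix == 1] = occluder[pix == 1]
    return out
```

The cell mask doubles as the occlusion label matrix M for the augmented image.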
Step 2, construct an occlusion perceptron and train it with the occlusion-augmented images and their corresponding occlusion labels.
In one possible embodiment, the occlusion perceptron consists of several self-attention layers followed by a linear layer; in particular, three self-attention layers may be used.
The input of the occlusion perceptron is a sequence formed by the image-block feature embeddings and an occlusion-indication feature x_occ initialised as an all-zero vector.
In a possible embodiment, the training process of the occlusion perceptron comprises: the perceptron applies self-attention to the input; the self-attention modules integrate the information carried in the image-block feature embeddings and update the occlusion-indication feature x_occ; the linear layer then converts x_occ into the occlusion-perception result s. The occlusion prediction is supervised by the occlusion label obtained in step 1, yielding the occlusion-prediction loss L_occ = CE(s, M), where CE denotes the cross-entropy function and M the occlusion label.
This completes the construction and training of the occlusion perceptron.
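A minimal numerical sketch of such a perceptron is given below. The single attention head, the weight shapes, the (d, N) head that maps the occlusion token to one logit per block, and the sigmoid that turns logits into per-block scores are all assumptions; the patent fixes only the overall structure (self-attention layers plus a linear layer, with a zero-initialised occlusion token).

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens, wq, wk, wv):
    # One single-head self-attention update with a residual connection.
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv
    att = softmax(q @ k.T / np.sqrt(k.shape[-1]))
    return tokens + att @ v

def occlusion_perceptron(patch_emb, layers, head):
    """patch_emb: (N, d) image-block feature embeddings.
    layers: list of (wq, wk, wv) weight triples, one per attention layer.
    head: (d, N) linear layer. Prepends an all-zero occlusion-indication
    token x_occ, runs the attention layers, then maps the updated token
    to N per-block occlusion scores in (0, 1)."""
    d = patch_emb.shape[1]
    tokens = np.vstack([np.zeros((1, d)), patch_emb])  # x_occ goes first
    for wq, wk, wv in layers:
        tokens = self_attention(tokens, wq, wk, wv)
    logits = tokens[0] @ head                          # occlusion token -> N logits
    return 1.0 / (1.0 + np.exp(-logits))               # sigmoid scores
```

In training, these scores would be compared against the occlusion label matrix (flattened to N entries) with a cross-entropy loss.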
Step 3, construct a feature extraction network and extract features from the complete pedestrian images and the occlusion-augmented images, obtaining complete-image features and occlusion-augmented image features; when extracting features from an occlusion-augmented image, suppress occlusion interference using the perceptron's occlusion-perception result.
In a possible embodiment, the feature extraction network constructed in step 3 suppresses the feature embeddings of occluded image blocks using the occlusion-perception result and consists of several self-attention layers; in particular, eleven self-attention layers may be used.
The input of the feature extraction network is a sequence formed by the image-block feature embeddings and an identity-classification indication feature x_cls.
In a possible embodiment, the network applies self-attention to this input. The extraction process integrates the information carried in the image-block feature embeddings and updates x_cls; during this process each self-attention layer produces an attention matrix whose first row, denoted a = (a_1, …, a_N), gives the strength of information transfer from the N image-block embeddings to the identity-indication feature x_cls.
When extracting features from an occlusion-augmented image, the attention matrix is corrected according to the occlusion-perception result, so that image-block embeddings with high occlusion scores receive smaller weights during feature exchange, and the corrected attention row ã is used to compute the feature update.
In a specific implementation, the correction may be ã = a ⊙ (1 − γ·s), where ⊙ denotes the Hadamard product, 1 denotes the all-ones vector, s the occlusion-perception result, and γ the degree of correction of the attention map.
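This correction can be sketched as follows; renormalising the corrected row so that it remains a probability distribution is an added assumption, not something the text states.

```python
import numpy as np

def correct_attention(att_row, occ_scores, gamma=0.5):
    """Apply the correction ã = a ⊙ (1 - gamma * s) to the first row of
    the attention matrix, where s holds per-block occlusion scores in
    [0, 1] and gamma sets the degree of correction. Blocks flagged as
    occluded pass less information to the identity token."""
    corrected = att_row * (1.0 - gamma * occ_scores)
    return corrected / corrected.sum()  # renormalise (assumption)
```

For example, with a uniform attention row and one fully occluded block, the occluded block's weight drops below the others while the row still sums to one.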
Step 4, construct a feature reconstruction network and train it with the occlusion-augmented image features and the complete-image features.
Fig. 5 is a schematic diagram of the feature reconstruction in the method. As shown in Fig. 5, in a possible embodiment the feature reconstruction network constructed in step 4 consists of two branches of self-attention layers: a global-feature construction network and a complete-feature inference network.
The global-feature construction network builds the complete global feature f_g from the complete-image features; the complete-feature inference network infers the reconstructed global feature f̂_g from the occlusion-augmented image features.
The feature reconstruction network is trained in a self-supervised manner: at the feature level it recovers the incomplete features of the attended visible regions into the global feature of the complete pedestrian, making the obtained features more discriminative for re-identification in occluded scenes.
Specifically, the goal of the feature reconstruction network is that the complete-feature inference network constructs, from the incomplete image, features as similar as possible to those of the complete image.
The overall training loss of the feature reconstruction network is

L = λ1·L_id + λ2·L_tri + λ3·L_occ + λ4·L_rec ,

where L_id denotes the identity-classification loss, L_tri the triplet loss, L_occ the occlusion-prediction loss, and L_rec = ‖f̂_g − f_g‖₂ the inference loss, with ‖·‖₂ denoting the Euclidean distance; backpropagation of L_rec passes only through the complete-feature inference branch. λ1, λ2, λ3 and λ4 are the balance weights of the four losses.
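Under these definitions the total loss can be assembled as below; the function name and the equal default weights are illustrative placeholders, since the patent does not fix the weight values.

```python
import numpy as np

def total_loss(l_id, l_tri, l_occ, f_rec, f_full,
               weights=(1.0, 1.0, 1.0, 1.0)):
    """L = λ1·L_id + λ2·L_tri + λ3·L_occ + λ4·L_rec, where L_rec is the
    Euclidean distance between the reconstructed global feature f_rec
    and the complete global feature f_full. The default weights are
    placeholders for the four balance weights."""
    l_rec = float(np.linalg.norm(np.asarray(f_rec) - np.asarray(f_full)))
    l1, l2, l3, l4 = weights
    return l1 * l_id + l2 * l_tri + l3 * l_occ + l4 * l_rec
```

In a full training loop, the gradient of the L_rec term would be stopped from flowing into the global-feature construction branch, matching the one-branch backpropagation described above.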
Step 5, use the occlusion perceptron to perform occlusion perception on a real-scene occluded pedestrian image, obtaining the occlusion-perception result; extract features from the occluded pedestrian image, suppressing occluded-region features with the occlusion-perception result to obtain a global feature focused on the pedestrian's visible regions; reconstruct this global feature with the feature reconstruction network to obtain the final global feature for re-identification; and compute feature distances on the final global features to complete pedestrian re-identification.
In the prediction stage of the method, the occlusion perceptron trained in step 2 performs occlusion perception on the occluded pedestrian images in the real-scene query set, producing the occlusion-perception result s. The feature extraction network obtained in step 3 then extracts features from the occluded pedestrian image, suppressing occluded-region features by correcting the attention map with this occlusion-perception result and generating the global feature of the pedestrian's visible regions. Next, one branch of the feature reconstruction network trained in step 4, namely the complete-feature inference network, reconstructs this global feature into the final global feature used for re-identification. Finally, feature distances are computed between the obtained global feature and gallery features extracted in the same way; the gallery is sorted by cosine distance and output from nearest to farthest, completing the pedestrian re-identification task.
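The final ranking step can be sketched as a cosine-distance sort over the gallery features; the function name is illustrative.

```python
import numpy as np

def rank_gallery(query, gallery):
    """Return gallery indices sorted by cosine distance to the query,
    nearest first. query: (d,) final global feature of the probe image;
    gallery: (M, d) final global features of the gallery images."""
    q = query / np.linalg.norm(query)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    dist = 1.0 - g @ q          # cosine distance = 1 - cosine similarity
    return np.argsort(dist)
```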
The beneficial effects include:
1. A data augmentation strategy is designed that generates occlusion-augmented images with corresponding labels, strengthening the model's understanding of occlusion.
2. The way key visible parts are identified for local-feature enhancement is improved: no additional model is required, and occlusion perception is learned in a self-supervised manner.
3. The way local features are enhanced during feature extraction is improved: instead of generating separate local features and re-weighting and fusing them, a global feature attending to specific local regions is produced directly, which better suits a self-attention-based feature extraction network.
4. A feature reconstruction network is designed that repairs incomplete features in occluded scenes and reconstructs the complete global feature, improving re-identification accuracy under occlusion.
It should be noted that, in the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to relevant descriptions of other embodiments for parts that are not described in detail in a certain embodiment.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While the preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (8)

1. An occlusion-scene pedestrian re-identification method based on occlusion suppression and feature reconstruction, characterized in that the re-identification method comprises the following steps:
step 1, performing data enhancement on a complete pedestrian image using a grid-aligned block occlusion enhancement strategy, generating an occlusion-enhanced image that simulates occlusion and an occlusion label corresponding to the occlusion-enhanced image;
step 2, constructing an occlusion perceptron, and training the occlusion perceptron with the occlusion-enhanced image and its corresponding occlusion label;
step 3, constructing a feature extraction network, and separately extracting features from the complete pedestrian image and the occlusion-enhanced image to obtain complete pedestrian image features and occlusion-enhanced image features; when extracting features from the occlusion-enhanced image, suppressing occlusion interference using the occlusion perception result of the occlusion perceptron;
step 4, constructing a feature reconstruction network, and training it with the occlusion-enhanced image features and the complete pedestrian image features to obtain the trained feature reconstruction network;
step 5, performing occlusion perception on an occluded pedestrian image from a real scene using the occlusion perceptron to obtain an occlusion perception result; extracting features from the occluded pedestrian image and suppressing the features of occluded regions using the occlusion perception result, obtaining global features focused on the visible regions of the pedestrian; performing feature reconstruction on the global features with the feature reconstruction network to obtain the final global features for pedestrian re-identification; and computing feature distances based on the final global features to complete pedestrian re-identification.
2. The re-identification method of claim 1, wherein generating the occlusion-enhanced image and the corresponding occlusion label in step 1 comprises:
scaling the complete pedestrian image to a set size and dividing it into image blocks along an equally divided grid;
taking a plurality of images as a batch and setting an occlusion ratio; repeatedly generating random grid-aligned rectangles and adding them to a mask set until the total area of all rectangles in the mask set matches the occlusion ratio, forming an irregular block mask from the mask set; and sampling multiple times to generate a plurality of random masks for the batch of images;
at each sampling, randomly selecting another complete pedestrian image with a different identity from the batch, and randomly selecting from it a region of the same shape as the mask to cover the complete pedestrian image being processed, thereby generating the occlusion-enhanced image and its corresponding occlusion label.
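A minimal sketch of the grid-aligned block mask generation described in this claim. The grid size, occlusion ratio, and the simple "cover at least the target area" stopping rule are illustrative assumptions; the patent only requires that the masked area be consistent with the occlusion ratio:

```python
import random

def grid_block_mask(rows: int, cols: int, ratio: float, seed: int = 0):
    """Return a rows x cols binary patch mask (1 = occluded), built by
    accumulating random grid-aligned rectangles until at least `ratio`
    of the patches are covered."""
    rng = random.Random(seed)
    mask = [[0] * cols for _ in range(rows)]
    target = int(rows * cols * ratio)
    covered = 0
    while covered < target:
        # Sample a grid-aligned rectangle [r0, r1] x [c0, c1].
        r0 = rng.randrange(rows); c0 = rng.randrange(cols)
        r1 = rng.randrange(r0, rows); c1 = rng.randrange(c0, cols)
        for r in range(r0, r1 + 1):
            for c in range(c0, c1 + 1):
                mask[r][c] = 1
        covered = sum(map(sum, mask))
    return mask
```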
3. The re-identification method of claim 1, wherein the occlusion perceptron consists of multiple layers of self-attention modules and a linear layer;
the input of the occlusion perceptron is an input sequence consisting of an image block feature embedding sequence and an occlusion indication feature initialized as an all-zero vector.
4. The re-identification method of claim 3, wherein the training process of the occlusion perceptron comprises:
integrating, through the self-attention modules, the information carried in the image block feature embeddings and updating the occlusion indication feature \(f_{occ}\); and converting the occlusion indication feature \(f_{occ}\) into the occlusion perception result \(S_{occ}\) through the linear layer;
wherein the occlusion prediction is supervised by the occlusion label corresponding to the occlusion-enhanced image.
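As an illustrative sketch of the perception head implied by claims 3-4, a linear layer can map the updated occlusion indication feature to per-patch occlusion scores. The dimensions, the sigmoid output range, and all weight values here are assumptions, not specified by the claims:

```python
import numpy as np

def occlusion_head(f_occ: np.ndarray, W: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Map a D-dim occlusion indication feature to N occlusion scores in (0, 1)."""
    logits = W @ f_occ + b                 # linear layer: (N, D) @ (D,) + (N,)
    return 1.0 / (1.0 + np.exp(-logits))   # sigmoid -> per-patch occlusion scores

# Toy example: D = 2 feature dimensions, N = 3 image patches.
f_occ = np.array([1.0, -1.0])
W = np.zeros((3, 2))
b = np.array([0.0, 10.0, -10.0])
scores = occlusion_head(f_occ, W, b)
```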
5. The re-identification method of claim 1, wherein the feature extraction network constructed in step 3 consists of multiple layers of self-attention modules;
the input of the feature extraction network is an input sequence consisting of an image block feature embedding sequence and an identity classification indication feature.
6. The re-identification method of claim 5, wherein the extraction process of the feature extraction network comprises: integrating the information carried in the image block feature embeddings and updating the identity classification indication feature, with each layer of self-attention modules generating an attention matrix; wherein the N elements in the first row of the attention matrix represent the strength of information transfer from the N image block feature embeddings to the identity indication feature;
when extracting features from the occlusion-enhanced image, the attention matrix is corrected according to the occlusion perception result, so that image block feature embeddings with high occlusion scores receive smaller weights in the feature exchange process, and the corrected attention map is used to compute the feature update.
7. The re-identification method of claim 1, wherein the feature reconstruction network constructed in step 4 consists of two branches of self-attention layers, the two branches being a global feature construction network and a complete feature inference network;
the global feature construction network constructs the complete global feature \(f_g\) from the complete pedestrian image features; the complete feature inference network infers the reconstructed global feature \(\hat{f}_g\) from the occlusion-enhanced image features.
8. The re-identification method of claim 7, wherein the goal of the feature reconstruction network is for the complete feature inference network to construct, from the incomplete image, features as similar as possible to the features of the complete image;
the overall loss of the feature reconstruction network during training is
\(L = \lambda_1 L_{id} + \lambda_2 L_{tri} + \lambda_3 L_{occ} + \lambda_4 L_{inf}\);
wherein \(L_{id}\) denotes the identity classification loss, \(L_{tri}\) denotes the triplet loss, \(L_{occ}\) denotes the occlusion prediction loss, and \(L_{inf} = \lVert f_g - \hat{f}_g \rVert_2\) denotes the inference loss, where \(\lVert \cdot \rVert_2\) denotes the Euclidean distance; \(\lambda_1\), \(\lambda_2\), \(\lambda_3\) and \(\lambda_4\) are the balance weights of the four losses.
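For illustration only, the four-term weighted loss of this claim can be combined as below; the function name, weight values, and feature vectors are invented for the example:

```python
import numpy as np

def reconstruction_total_loss(l_id, l_tri, l_occ, f_complete, f_reconstructed,
                              w=(1.0, 1.0, 1.0, 1.0)):
    """Total loss = weighted sum of identity, triplet, occlusion-prediction,
    and inference losses; the inference loss is the Euclidean distance between
    the complete and reconstructed global features."""
    l_inf = float(np.linalg.norm(f_complete - f_reconstructed))
    return w[0] * l_id + w[1] * l_tri + w[2] * l_occ + w[3] * l_inf
```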
CN202310121979.7A 2023-02-16 2023-02-16 Occlusion scene pedestrian re-identification method based on occlusion suppression and feature reconstruction Active CN115937906B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310121979.7A CN115937906B (en) 2023-02-16 2023-02-16 Occlusion scene pedestrian re-identification method based on occlusion suppression and feature reconstruction

Publications (2)

Publication Number Publication Date
CN115937906A true CN115937906A (en) 2023-04-07
CN115937906B CN115937906B (en) 2023-06-06


Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150154765A1 (en) * 2011-10-28 2015-06-04 Carestream Health, Inc. Tomosynthesis reconstruction with rib suppression
CN110929578A (en) * 2019-10-25 2020-03-27 南京航空航天大学 Anti-blocking pedestrian detection method based on attention mechanism
US20200125925A1 (en) * 2018-10-18 2020-04-23 Deepnorth Inc. Foreground Attentive Feature Learning for Person Re-Identification
CN111310718A (en) * 2020-03-09 2020-06-19 成都川大科鸿新技术研究所 High-accuracy detection and comparison method for face-shielding image
CN112465872A (en) * 2020-12-10 2021-03-09 南昌航空大学 Image sequence optical flow estimation method based on learnable occlusion mask and secondary deformation optimization
CN112801051A (en) * 2021-03-29 2021-05-14 哈尔滨理工大学 Method for re-identifying blocked pedestrians based on multitask learning
CN114022823A (en) * 2021-11-16 2022-02-08 北京信息科技大学 Shielding-driven pedestrian re-identification method and system and storable medium
CN114419671A (en) * 2022-01-18 2022-04-29 北京工业大学 Hypergraph neural network-based occluded pedestrian re-identification method
CN114639122A (en) * 2022-03-22 2022-06-17 江苏大学 Attitude correction pedestrian re-recognition method based on convolution generation countermeasure network
WO2022222766A1 (en) * 2021-04-21 2022-10-27 中山大学 Semantic segmentation-based face integrity measurement method and system, device and storage medium
CN115565207A (en) * 2022-11-29 2023-01-03 武汉图科智能科技有限公司 Occlusion scene downlink person detection method with feature simulation fused


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
KAZIWA SALEH et al.: "Occlusion Handling in Generic Object Detection: A Review", IEEE, pages 1-8 *
YUEQIAO FAN et al.: "DSF-net: occluded person re-identification based on dual structure features", Neural Computing and Applications, page 3537 *
WANG XUDONG: "Research on Deep-Learning-Based Occluded Face Detection and Restoration", China Excellent Theses Electronic Journal Network, pages 44-59 *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP03 Change of name, title or address

Address after: No. 548, 5th Floor, Building 10, No. 28 Linping Avenue, Donghu Street, Linping District, Hangzhou City, Zhejiang Province

Patentee after: Hangzhou Tuke Intelligent Information Technology Co.,Ltd.

Address before: 430000 B033, No. 05, 4th floor, building 2, international enterprise center, No. 1, Guanggu Avenue, Donghu New Technology Development Zone, Wuhan, Hubei (Wuhan area of free trade zone)

Patentee before: Wuhan Tuke Intelligent Technology Co.,Ltd.