CN117935172A - Visible light infrared pedestrian re-identification method and system based on spectral information filtering

Visible light infrared pedestrian re-identification method and system based on spectral information filtering

Info

Publication number
CN117935172A
Authority
CN
China
Prior art keywords
infrared
visible light
pedestrian
training
loss
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202410325387.1A
Other languages
Chinese (zh)
Other versions
CN117935172B (en
Inventor
张国庆
王准
张家伟
郑钰辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Information Science and Technology
Original Assignee
Nanjing University of Information Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Information Science and Technology filed Critical Nanjing University of Information Science and Technology
Priority to CN202410325387.1A priority Critical patent/CN117935172B/en
Publication of CN117935172A publication Critical patent/CN117935172A/en
Application granted granted Critical
Publication of CN117935172B publication Critical patent/CN117935172B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/52Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V20/53Recognition of crowd images, e.g. recognition of crowd congestion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/776Validation; Performance evaluation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems


Abstract

The invention discloses a visible light infrared pedestrian re-identification method and system based on spectral information filtering. The method comprises the following steps: (1) obtain the original data, divide it into a training set, a verification set and a test set, and preprocess it; (2) randomly form cross-modal image pairs from the obtained batch of training samples; (3) build a three-branch pedestrian re-identification network based on PyTorch and set the training parameters; (4) divide the training period into two stages, V-T and V-I; in the V-T stage, calculate the semantic consistency loss to update the network weights, taking the transition modality as a filtering condition and retaining, from the visible light modality, the spectral information most relevant to the infrared modality; (5) in the V-I stage, calculate the cascaded aggregation loss and update the network weights, achieving modality alignment directly between the visible light and infrared modalities and extracting a modality-shared representation; verify the accuracy of the algorithm with the verification set, and save the network weights with the best accuracy.

Description

Visible light infrared pedestrian re-identification method and system based on spectral information filtering
Technical Field
The invention relates to the technical field of intelligent transportation, and in particular to a visible light infrared pedestrian re-identification method and system based on spectral information filtering.
Background
Pedestrian re-identification aims to accurately recognize and match the same pedestrian at different positions or time points by analyzing the pedestrian's visual characteristics across different scenes or cameras. In recent years, this technology has received increasing attention due to its wide application in intelligent security and intelligent transportation. With the continued development of deep learning and neural network architectures, pedestrian re-identification has achieved remarkable results. However, most current methods focus on single-modality (visible light) pedestrian re-identification and ignore the fact that visible light images are easily affected by illumination conditions during imaging. Under poor lighting or in night environments, captured visible light images lack sufficient visual cues to accurately discern identity. To make up for this limitation, recent research has gradually shifted its focus to visible light infrared pedestrian re-identification, which fully exploits the advantages of infrared images in low-illumination environments and provides a more reliable solution for identity recognition.
Visible light infrared pedestrian re-identification must not only address the challenges inherent to the pedestrian re-identification task (e.g., viewpoint changes, posture changes, and illumination differences) but also overcome the significant gap between the two sensor modalities. This modality gap results from the different physical principles underlying the acquisition of visible and infrared images: a visible light image captures the surface color and texture of an object under natural illumination, while an infrared image is based on the thermal radiation of the target and reflects its temperature distribution. These two different physical properties lead to significant differences in the appearance, texture, brightness and thermal profile of the same pedestrian. Existing methods can generally be divided into two categories: (1) modality-shared feature learning, which mines a common representation of cross-modal pedestrian pictures through metric learning or by disentangling modality-specific information; (2) modality compensation learning, which generates missing modality attributes at the image or feature level and mitigates modality differences through attribute complementation. However, both approaches attempt to bridge the huge modality difference directly while neglecting the large modality gap itself, which hinders exploration of the spectral correspondence between visible and infrared images and makes it difficult to learn sufficiently discriminative semantics.
Disclosure of Invention
The invention aims to: provide a visible light infrared pedestrian re-identification method and system based on spectral information filtering, which improve and optimize existing visible light infrared pedestrian re-identification algorithms and capture the latent spectral correspondence between modalities, thereby solving the problem of low feature-matching accuracy.
The technical scheme is as follows: the visible light infrared pedestrian re-identification method based on spectral information filtering of the invention comprises the following steps:
(1) Obtaining original data, dividing a training set, a verification set and a test set, and preprocessing;
(2) Randomly forming cross-modal image pairs from the batch training samples processed in the step (1);
(3) Setting up a three-branch pedestrian re-recognition network based on PyTorch, setting training parameters, taking an original training sample and a synthesized transition picture as network input, and extracting pedestrian characterization;
(4) Dividing the training period into two stages, V-T and V-I; in the V-T stage, calculating the semantic consistency loss to update the network weights, taking the transition modality as a filtering condition and retaining, from the visible light modality, the spectral information most relevant to the infrared modality;
(5) In the V-I stage, calculating the cascaded aggregation loss and updating the network weights, achieving modality alignment directly between the visible light and infrared modalities and extracting a modality-shared representation; during training, verifying the accuracy of the algorithm with the verification set and saving the network weights with the best accuracy.
Further, in the step (1), the original dataset is the published SYSU-MM01 or RegDB; the preprocessing comprises: training samples are cropped and scaled to 288×144 pixels and augmented with random horizontal flipping and random erasing to increase sample diversity, and finally normalized using the channel mean and standard deviation computed from the ImageNet dataset.
Further, the step (2) is specifically as follows: first, one of the red, green and blue color channels of the visible light image is randomly selected and expanded into three channels; then the processed visible light image and the original infrared image are horizontally divided into several parts, each part keeping one of the two modalities with equal probability; finally, the parts are spliced along the height dimension of the image, generating one transition image for each image pair. The formulas are as follows:

Let the visible light image in the extracted image pair be $X^v$. One of its red, green and blue channels is randomly selected and expanded into three channels:

$$\tilde{X}^v = E\big(S(X^v_R, X^v_G, X^v_B)\big)$$

where $X^v_R$, $X^v_G$ and $X^v_B$ respectively represent the red, green and blue channels of the visible light picture, and $S(\cdot)$ and $E(\cdot)$ are the random selection and expansion operations. The transformed visible light image $\tilde{X}^v$ and the original infrared image $X^r$ are then horizontally divided into $n$ sections, and the transition image $X^t$ is obtained by

$$X^t = CAT\big(RSC([\tilde{s}^v_1, \dots, \tilde{s}^v_n], [s^r_1, \dots, s^r_n])\big)$$

where $[\tilde{s}^v_1, \dots, \tilde{s}^v_n]$ and $[s^r_1, \dots, s^r_n]$ are the arrays formed by the horizontal stripes of the two pictures, $RSC(\cdot)$ represents randomly selecting, at each position, the element from one of the two arrays to obtain a new array, and $CAT(\cdot)$ splices the picture stripes in the height dimension.
Further, the step (3) is specifically as follows: a ResNet model pre-trained on ImageNet is selected as the feature extractor, and the network is then fine-tuned on the SYSU-MM01 or RegDB dataset. ResNet has five stages in total; the first stage is duplicated three times to serve as the input ports for the visible light, transition and infrared modalities respectively, while the last four stages are modality-shared, forming the three-branch pedestrian re-identification network. The visible light and infrared pictures read in step (1) and the transition pictures synthesized in step (2) are input into the pedestrian re-identification network to extract pedestrian features.
Further, the step (4) is specifically as follows: using the pedestrian features extracted in step (3) and the identity label information read in step (1), a joint training loss consisting of the basic loss and the semantic consistency loss is calculated, and the pedestrian re-identification network is updated through gradient back-propagation; the semantic consistency loss takes the transition modality as a constraint condition and retains, from the visible light modality, the spectral information most relevant to the infrared modality. In the semantic consistency loss, $P$ represents the number of pedestrian categories in the current batch, $n$ is the number of visible/transition pictures under a single ID, $g(\cdot)$ is a fully connected network, $\tau$ is the hyper-parameter weight controlling the knowledge propagation rates of strongly and weakly correlated sample pairs, $\|\cdot\|_2$ denotes L2 regularization, and $f^v_{i,j}$ and $f^t_{i,k}$ respectively represent the $j$-th visible light feature and the $k$-th transition feature under the $i$-th identity.
Further, the basic loss consists of the identity loss, i.e., the cross-entropy loss, and the triplet loss. The visible light modality identity loss is computed as

$$\mathcal{L}_{id} = -\frac{1}{N}\sum_{i=1}^{N} \log \frac{\exp(W_{y_i}^{\top} f^v_i)}{\sum_{c=1}^{C} \exp(W_c^{\top} f^v_i)}$$

where $N$ is the number of visible light pictures, $C$ is the total number of IDs, $f^v_i$ is the feature extracted from the $i$-th visible light picture, $W_{y_i}$ is the classifier weight corresponding to the true identity of the sample, and $W_c$ is the classifier weight of class $c$; the computation for the infrared and transition modalities is analogous.
The triplet loss is computed as

$$\mathcal{L}_{tri} = \sum_{f_a \in \mathcal{F}} \big[\, \rho + d(f_a, f_p) - d(f_a, f_n) \,\big]_+$$

where $\mathcal{F}$ represents the set of visible light and transition features, $(f_a, f_p)$ is a positive sample pair, $(f_a, f_n)$ is a negative sample pair, $d(\cdot,\cdot)$ is the Euclidean distance, $[x]_+ = \max(x, 0)$, and $\rho$ is the margin parameter.
Further, in the step (5), the pedestrian re-identification network is updated through gradient back-propagation. In the cascaded aggregation loss, $c^v_i$ and $c^r_i$ respectively represent the visible and infrared feature centers of the $i$-th identity, $s^v_i$ and $s^r_i$ respectively represent the visible and infrared semantic centers of the $i$-th identity, $P$ is the total number of IDs in the current batch, and $f^r_{i,j}$ is the feature extracted from the $j$-th infrared picture of the $i$-th ID in the current batch.
The invention relates to a visible light infrared pedestrian re-identification system based on spectral information filtering, which comprises:
An acquisition and preprocessing module: used for obtaining the original data, dividing it into a training set, a verification set and a test set, and preprocessing;
A cross-modal image pair module: used for randomly forming cross-modal image pairs from the batch training samples processed by the acquisition and preprocessing module;
An extraction module: used for building a three-branch pedestrian re-identification network based on PyTorch, setting training parameters, taking the original training samples and the synthesized transition pictures as network input, and extracting pedestrian representations;
A V-T stage module: used for dividing the training period into two stages, V-T and V-I, and, in the V-T stage, calculating the semantic consistency loss to update the network weights, taking the transition modality as a filtering condition and retaining, from the visible light modality, the spectral information most relevant to the infrared modality;
A V-I stage module: used for calculating the cascaded aggregation loss and updating the network weights in the V-I stage, achieving modality alignment directly between the visible light and infrared modalities and extracting a modality-shared representation; during training, the accuracy of the algorithm is verified with the verification set, and the network weights with the best accuracy are saved.
The invention also discloses an electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that the computer program, when loaded into the processor, implements any of the above visible light infrared pedestrian re-identification methods based on spectral information filtering.
The storage medium of the invention stores a computer program, wherein the computer program, when executed by a processor, implements any of the above visible light infrared pedestrian re-identification methods based on spectral information filtering.
The beneficial effects are that: compared with the prior art, the invention has the following remarkable advantages: pedestrian identities are recognized and matched through deep learning with a high recognition rate, saving substantial time and labor costs. In addition, the invention adds no extra model complexity, achieves better performance using only global features, and has modest hardware and computation requirements at deployment.
Drawings
FIG. 1 is an overall flow chart of the present invention;
FIG. 2 is a network architecture diagram of the visible infrared pedestrian re-identification method based on spectral information filtering provided by the invention;
FIG. 3 is a schematic diagram of transition mode generation in accordance with the present invention;
FIG. 4 is a schematic diagram of a two-stage training penalty of the present invention;
FIG. 5 is a flow chart of the model training of the present invention.
Detailed Description
The technical scheme of the invention is further described below with reference to the accompanying drawings.
As shown in fig. 1-5, an embodiment of the present invention provides a visible light infrared pedestrian re-recognition method based on spectral information filtering, including the following steps:
(1) Obtain the original data, divide it into a training set, a verification set and a test set, and preprocess it; the original dataset is the published SYSU-MM01 or RegDB. The preprocessing comprises: training samples are cropped and scaled to 288×144 pixels and augmented with random horizontal flipping and random erasing to increase sample diversity, and finally normalized using the channel mean and standard deviation computed from the ImageNet dataset.
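As a minimal sketch of the preprocessing just described (assuming NumPy arrays already cropped and scaled to 288×144; the helper name `preprocess` and the erasing patch sizes are illustrative, not taken from the patent):

```python
import numpy as np

# ImageNet channel statistics (standard published values) used for normalisation.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32).reshape(3, 1, 1)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32).reshape(3, 1, 1)

def preprocess(img, rng, flip_p=0.5, erase_p=0.5):
    """img: float array (3, 288, 144) in [0, 1], already cropped/scaled.
    Applies random horizontal flip, random erasing, then normalisation."""
    if rng.random() < flip_p:                  # random horizontal flip
        img = img[:, :, ::-1]
    if rng.random() < erase_p:                 # random erasing: overwrite a patch
        h = int(rng.integers(10, 80))
        w = int(rng.integers(10, 40))
        y = int(rng.integers(0, 288 - h))
        x = int(rng.integers(0, 144 - w))
        img = img.copy()
        img[:, y:y + h, x:x + w] = rng.random()
    return (img - IMAGENET_MEAN) / IMAGENET_STD
```

In a real pipeline the equivalent torchvision transforms would normally be used; the sketch only makes the order of operations explicit.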
(2) Randomly form cross-modal image pairs from the batch training samples processed in step (1). Specifically: first, one of the red, green and blue color channels of the visible light image is randomly selected and expanded into three channels; then the processed visible light image and the original infrared image are horizontally divided into several parts, each part keeping one of the two modalities with equal probability; finally, the parts are spliced along the height dimension of the image, generating one transition image for each image pair. The formulas are as follows:

Let the visible light image in the extracted image pair be $X^v$. One of its red, green and blue channels is randomly selected and expanded into three channels:

$$\tilde{X}^v = E\big(S(X^v_R, X^v_G, X^v_B)\big)$$

where $X^v_R$, $X^v_G$ and $X^v_B$ respectively represent the red, green and blue channels of the visible light picture, and $S(\cdot)$ and $E(\cdot)$ are the random selection and expansion operations. The transformed visible light image $\tilde{X}^v$ and the original infrared image $X^r$ are then horizontally divided into $n$ sections, and the transition image $X^t$ is obtained by

$$X^t = CAT\big(RSC([\tilde{s}^v_1, \dots, \tilde{s}^v_n], [s^r_1, \dots, s^r_n])\big)$$

where $[\tilde{s}^v_1, \dots, \tilde{s}^v_n]$ and $[s^r_1, \dots, s^r_n]$ are the arrays formed by the horizontal stripes of the two pictures, $RSC(\cdot)$ represents randomly selecting, at each position, the element from one of the two arrays to obtain a new array, and $CAT(\cdot)$ splices the picture stripes in the height dimension.
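The channel selection, stripe division and height-wise splicing above can be sketched as follows (a minimal NumPy version; the function name `make_transition` and the default of 6 stripes are assumptions for illustration):

```python
import numpy as np

def make_transition(vis, ir, n_parts=6, rng=None):
    """vis, ir: (3, H, W) arrays forming a cross-modal image pair;
    H must be divisible by n_parts. Returns the synthesised transition image."""
    if rng is None:
        rng = np.random.default_rng()
    c = int(rng.integers(0, 3))                    # randomly pick R, G or B ...
    vis_mono = np.repeat(vis[c:c + 1], 3, axis=0)  # ... and expand it to 3 channels
    h = vis.shape[1] // n_parts
    strips = []
    for i in range(n_parts):                       # each stripe keeps one modality
        src = vis_mono if rng.random() < 0.5 else ir
        strips.append(src[:, i * h:(i + 1) * h])
    return np.concatenate(strips, axis=1)          # splice along the height axis
```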
(3) Build a three-branch pedestrian re-identification network based on PyTorch, set the training parameters, take the original training samples and the synthesized transition pictures as network input, and extract pedestrian representations. Specifically: a ResNet model pre-trained on ImageNet is selected as the feature extractor, and the network is then fine-tuned on the SYSU-MM01 or RegDB dataset. ResNet has five stages in total; the first stage is duplicated three times to serve as the input ports for the visible light, transition and infrared modalities respectively, while the last four stages are modality-shared, forming the three-branch pedestrian re-identification network. The visible light and infrared pictures read in step (1) and the transition pictures synthesized in step (2) are input into the pedestrian re-identification network to extract pedestrian features.
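The three-branch layout (one modality-specific first stage per input, shared later stages) can be illustrated with a deliberately tiny stand-in, replacing each ResNet stage with a single dense layer; nothing here reproduces the actual ResNet architecture or weights:

```python
import numpy as np

rng = np.random.default_rng(0)

class Dense:
    """A single dense layer standing in for one ResNet stage."""
    def __init__(self, d_in, d_out):
        self.W = rng.standard_normal((d_in, d_out)) * 0.01
    def __call__(self, x):
        return np.maximum(x @ self.W, 0.0)  # ReLU

class ThreeBranchNet:
    """Stage 0 duplicated per modality (visible 'v' / transition 't' /
    infrared 'i'); the remaining stages (collapsed here into one shared
    layer) are modality-shared."""
    def __init__(self, d_in=128, d_hidden=64, d_out=32):
        self.stems = {m: Dense(d_in, d_hidden) for m in ("v", "t", "i")}
        self.shared = Dense(d_hidden, d_out)
    def __call__(self, x, modality):
        return self.shared(self.stems[modality](x))
```

The same routing pattern applies with real ResNet stages in PyTorch: copy the first stage three times and keep stages 2-5 shared.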
(4) Divide the training period into two stages, V-T and V-I; in the V-T stage, calculate the semantic consistency loss to update the network weights, taking the transition modality as a filtering condition and retaining, from the visible light modality, the spectral information most relevant to the infrared modality. Specifically: using the pedestrian features extracted in step (3) and the identity label information read in step (1), a joint training loss consisting of the basic loss and the semantic consistency loss is calculated, and the pedestrian re-identification network is updated through gradient back-propagation; the semantic consistency loss takes the transition modality as a constraint condition. In the semantic consistency loss, $P$ represents the number of pedestrian categories in the current batch, $n$ is the number of visible/transition pictures under a single ID, $g(\cdot)$ is a fully connected network, $\tau$ is the hyper-parameter weight controlling the knowledge propagation rates of strongly and weakly correlated sample pairs, $\|\cdot\|_2$ denotes L2 regularization, and $f^v_{i,j}$ and $f^t_{i,k}$ respectively represent the $j$-th visible light feature and the $k$-th transition feature under the $i$-th identity.
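The exact formula of the semantic consistency loss appears in the patent drawings and is not reproduced here; the sketch below is only one plausible reading consistent with the symbols defined above (a projection network $g$, a correlation-dependent weight raised to $\tau$, L2-normalized feature pairs) and should not be taken as the patented loss:

```python
import numpy as np

def semantic_consistency_loss(vis_feats, trans_feats, proj, tau=2.0):
    """vis_feats, trans_feats: (P, n, d) features grouped by identity;
    proj stands in for the fully connected network g(.). Each pair is
    weighted by its correlation raised to tau, so strongly correlated
    pairs propagate knowledge faster than weakly correlated ones."""
    P, n, _ = vis_feats.shape
    loss = 0.0
    for i in range(P):
        for j in range(n):
            for k in range(n):
                v = proj(vis_feats[i, j])
                t = trans_feats[i, k]
                v = v / np.linalg.norm(v)          # L2 normalisation
                t = t / np.linalg.norm(t)
                w = max(float(v @ t), 0.0) ** tau  # correlation-dependent weight
                loss += w * float(((v - t) ** 2).sum())
    return loss / (P * n * n)
```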
The basic loss consists of the identity loss, i.e., the cross-entropy loss, and the triplet loss. The visible light modality identity loss is computed as

$$\mathcal{L}_{id} = -\frac{1}{N}\sum_{i=1}^{N} \log \frac{\exp(W_{y_i}^{\top} f^v_i)}{\sum_{c=1}^{C} \exp(W_c^{\top} f^v_i)}$$

where $N$ is the number of visible light pictures, $C$ is the total number of IDs, $f^v_i$ is the feature extracted from the $i$-th visible light picture, $W_{y_i}$ is the classifier weight corresponding to the true identity of the sample, and $W_c$ is the classifier weight of class $c$; the computation for the infrared and transition modalities is analogous.
The triplet loss is computed as

$$\mathcal{L}_{tri} = \sum_{f_a \in \mathcal{F}} \big[\, \rho + d(f_a, f_p) - d(f_a, f_n) \,\big]_+$$

where $\mathcal{F}$ represents the set of visible light and transition features, $(f_a, f_p)$ is a positive sample pair, $(f_a, f_n)$ is a negative sample pair, $d(\cdot,\cdot)$ is the Euclidean distance, $[x]_+ = \max(x, 0)$, and $\rho$ is the margin parameter.
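Both components of the basic loss are standard and can be sketched directly (NumPy; variable names are illustrative):

```python
import numpy as np

def identity_loss(feats, labels, W):
    """Cross-entropy over classifier logits W (C, d) for features (N, d)."""
    logits = feats @ W.T
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def triplet_loss(anchor, pos, neg, margin=0.3):
    """Hinge on Euclidean distances: push d(a, n) beyond d(a, p) + margin."""
    d_ap = np.linalg.norm(anchor - pos, axis=1)
    d_an = np.linalg.norm(anchor - neg, axis=1)
    return np.maximum(d_ap - d_an + margin, 0.0).mean()
```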
(5) In the V-I stage, calculate the cascaded aggregation loss and update the network weights, achieving modality alignment directly between the visible light and infrared modalities and extracting a modality-shared representation; during training, the accuracy of the algorithm is verified with the verification set, and the network weights with the best accuracy are saved. The pedestrian re-identification network is updated through gradient back-propagation. In the cascaded aggregation loss, $c^v_i$ and $c^r_i$ respectively represent the visible and infrared feature centers of the $i$-th identity, $s^v_i$ and $s^r_i$ respectively represent the visible and infrared semantic centers of the $i$-th identity, $P$ is the total number of IDs in the current batch, and $f^r_{i,j}$ is the feature extracted from the $j$-th infrared picture of the $i$-th ID in the current batch.
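The exact cascaded aggregation formula also appears only in the drawings; the following two-step sketch merely illustrates the "cascaded" idea consistent with the symbols above (first align the per-identity visible/infrared centers, then aggregate instances around a shared semantic center), and is an assumption rather than the patented loss:

```python
import numpy as np

def cascaded_aggregation_loss(vis_feats, ir_feats):
    """vis_feats, ir_feats: (P, n, d) features grouped by identity.
    Stage 1 aligns per-identity visible/infrared feature centres; stage 2
    pulls every instance toward the joint (semantic) centre of its identity."""
    c_v = vis_feats.mean(axis=1)                 # (P, d) visible centres
    c_i = ir_feats.mean(axis=1)                  # (P, d) infrared centres
    centre_align = ((c_v - c_i) ** 2).sum(axis=1).mean()
    c_s = (c_v + c_i) / 2                        # shared semantic centres
    pull = (((vis_feats - c_s[:, None]) ** 2).sum(-1).mean()
            + ((ir_feats - c_s[:, None]) ** 2).sum(-1).mean())
    return centre_align + pull
```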
In this embodiment, the initial learning rate is set to 0.01 and increased linearly to 0.1 over the first 10 epochs. Thereafter, the learning rate is multiplied by 0.1 at the 20th and 60th epochs. Each batch contains 64 images: 8 identities are randomly selected, with 4 visible images and 4 infrared images per identity. The total training duration is 110 epochs, with the first 100 epochs used for V-T stage training and the last 10 epochs for the V-I stage. SGD is used as the optimizer, with the weight decay set to 0.0005 and the momentum set to 0.9.
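The schedule just described can be written down directly (a sketch; the function name and the exact warm-up interpolation are illustrative assumptions):

```python
def learning_rate(epoch, base=0.1, warmup=10):
    """Linear warm-up from 0.01 to `base` over the first `warmup` epochs,
    then decay by a factor of 0.1 at epochs 20 and 60."""
    if epoch < warmup:
        return 0.01 + (base - 0.01) * epoch / (warmup - 1)
    lr = base
    if epoch >= 20:
        lr *= 0.1
    if epoch >= 60:
        lr *= 0.1
    return lr
```

Such a function would typically be plugged into the SGD optimizer via a per-epoch scheduler.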
Compared with other methods, excellent performance is achieved on the two mainstream visible light infrared pedestrian re-identification datasets, SYSU-MM01 and RegDB, as shown in Table 1:
Table 1: performance comparison of the method and other visible light infrared pedestrian re-identification methods
The embodiment of the invention provides a visible light infrared pedestrian re-identification system based on spectral information filtering, which comprises the following modules:
An acquisition and preprocessing module: used for obtaining the original data, dividing it into a training set, a verification set and a test set, and preprocessing;
A cross-modal image pair module: used for randomly forming cross-modal image pairs from the batch training samples processed by the acquisition and preprocessing module;
An extraction module: used for building a three-branch pedestrian re-identification network based on PyTorch, setting training parameters, taking the original training samples and the synthesized transition pictures as network input, and extracting pedestrian representations;
A V-T stage module: used for dividing the training period into two stages, V-T and V-I, and, in the V-T stage, calculating the semantic consistency loss to update the network weights, taking the transition modality as a filtering condition and retaining, from the visible light modality, the spectral information most relevant to the infrared modality;
A V-I stage module: used for calculating the cascaded aggregation loss and updating the network weights in the V-I stage, achieving modality alignment directly between the visible light and infrared modalities and extracting a modality-shared representation; during training, the accuracy of the algorithm is verified with the verification set, and the network weights with the best accuracy are saved.
An embodiment of the invention provides an electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that the computer program, when loaded into the processor, implements any of the above visible light infrared pedestrian re-identification methods based on spectral information filtering.
An embodiment of the invention provides a storage medium storing a computer program, wherein the computer program, when executed by a processor, implements any of the above visible light infrared pedestrian re-identification methods based on spectral information filtering.

Claims (10)

1. The visible light infrared pedestrian re-identification method based on spectral information filtering is characterized by comprising the following steps of:
(1) Obtaining original data, dividing a training set, a verification set and a test set, and preprocessing;
(2) Randomly forming cross-modal image pairs from the batch training samples processed in the step (1);
(3) Setting up a three-branch pedestrian re-recognition network based on PyTorch, setting training parameters, taking an original training sample and a synthesized transition picture as network input, and extracting pedestrian characterization;
(4) Dividing the training period into two stages, V-T and V-I; in the V-T stage, calculating the semantic consistency loss to update the network weights, taking the transition modality as a filtering condition and retaining, from the visible light modality, the spectral information most relevant to the infrared modality;
(5) In the V-I stage, calculating the cascaded aggregation loss and updating the network weights, achieving modality alignment directly between the visible light and infrared modalities and extracting a modality-shared representation; during training, verifying the accuracy of the algorithm with the verification set and saving the network weights with the best accuracy.
2. The visible light infrared pedestrian re-identification method based on spectral information filtering according to claim 1, wherein in the step (1), the original dataset is the published SYSU-MM01 or RegDB; the preprocessing comprises: training samples are cropped and scaled to 288×144 pixels and augmented with random horizontal flipping and random erasing to increase sample diversity, and finally normalized using the channel mean and standard deviation computed from the ImageNet dataset.
3. The visible-infrared pedestrian re-identification method based on spectral information filtering according to claim 1, wherein the step (2) is specifically as follows: first, one of the red, green and blue color channels of the visible image is randomly selected and expanded into three channels; then the processed visible image and the original infrared image are each horizontally divided into several parts, each part being kept from one of the two modalities with equal probability; finally, the parts are concatenated along the height dimension; one transition image is generated for each image pair; the formulas are as follows:
Let the visible image in the extracted image pair be $x^{V}$. One of its red, green and blue channels is randomly selected and expanded into three channels:
$$x^{C} = \mathcal{E}\big(\mathcal{S}(x^{V}_{R}, x^{V}_{G}, x^{V}_{B})\big)$$
where $x^{V}_{R}$, $x^{V}_{G}$ and $x^{V}_{B}$ denote the red, green and blue channels of the visible picture, and $\mathcal{S}(\cdot)$ and $\mathcal{E}(\cdot)$ are the random-selection and channel-expansion operations. The transformed visible image $x^{C}$ and the original infrared image $x^{I}$ are then horizontally divided into $n$ parts:
$$A^{C} = \{x^{C}_{1}, \ldots, x^{C}_{n}\}, \qquad A^{I} = \{x^{I}_{1}, \ldots, x^{I}_{n}\}$$
where $A^{C}$ and $A^{I}$ are the arrays formed by the horizontal stripes of the two pictures. The transition image is then
$$x^{T} = \mathcal{C}\big(\mathcal{R}(A^{C}, A^{I})\big)$$
where $\mathcal{R}(\cdot)$ denotes the random selection over the array elements (each stripe position taking the element of one array with equal probability) to obtain a new array, and $\mathcal{C}(\cdot)$ concatenates the stripes along the height dimension of the picture.
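The transition-image synthesis of step (2) can be sketched in pure Python. The image layout (H×W×3 nested lists), the function names, and the default stripe count `n=6` are illustrative assumptions:

```python
import random

def expand_channel(img_v, c=None):
    """Randomly select one of the R/G/B channels and replicate it into
    three channels (the channel index can be fixed for reproducibility)."""
    if c is None:
        c = random.randrange(3)
    return [[[px[c]] * 3 for px in row] for row in img_v]

def synthesize_transition(img_v, img_i, n=6, rng=random):
    """Divide the channel-expanded visible image and the infrared image
    into n horizontal stripes, keep each stripe from one modality with
    equal probability, and concatenate the kept stripes along height."""
    xc = expand_channel(img_v)
    h = len(xc)
    assert h == len(img_i) and h % n == 0, "images must share a height divisible by n"
    s = h // n  # stripe height
    out = []
    for k in range(n):
        src = xc if rng.random() < 0.5 else img_i  # pick modality per stripe
        out.extend(src[k * s:(k + 1) * s])         # concat along height
    return out
```

Each output stripe therefore comes verbatim from one of the two source images, which is what makes the result an intermediate ("transition") modality.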
4. The visible-infrared pedestrian re-identification method based on spectral information filtering according to claim 1, wherein the step (3) is specifically as follows: a ResNet model pre-trained on ImageNet is selected as the feature extractor, and the network is then fine-tuned on the SYSU-MM01 or RegDB dataset; ResNet has five stages in total, of which the first stage is duplicated three times to serve as the input ports of the visible, transition and infrared modalities respectively, while the last four stages are modality-shared, forming a three-branch pedestrian re-identification network; the visible and infrared pictures read in step (1) and the transition pictures synthesized in step (2) are input into the pedestrian re-identification network to extract pedestrian features.
5. The visible-infrared pedestrian re-identification method based on spectral information filtering according to claim 1, wherein the step (4) is specifically as follows: using the pedestrian features extracted in step (3) and the identity label information read in step (1), a joint training loss consisting of the basic loss and the semantic consistency loss is calculated, and the pedestrian re-identification network is updated through gradient back-propagation; the semantic consistency loss takes the transition modality as a constraint condition and retains from the visible modality the spectral information most relevant to the infrared modality; the semantic consistency loss is calculated as:
$$L_{sc} = \frac{1}{PK^{2}} \sum_{i=1}^{P} \sum_{j=1}^{K} \sum_{k=1}^{K} \alpha_{jk}\, \Big\| \phi\big(g(f^{V}_{i,j})\big) - \phi\big(g(f^{T}_{i,k})\big) \Big\|_{2}$$
where $P$ represents the number of pedestrian categories in the current batch, $K$ is the number of visible/transition pictures under a single ID, $g(\cdot)$ is a fully connected network, $\alpha_{jk}$ is the hyper-parameter weight controlling the knowledge propagation rates of strongly and weakly correlated sample pairs, $\phi(\cdot)$ is L2 regularization, and $f^{V}_{i,j}$ and $f^{T}_{i,k}$ denote the $j$-th visible feature and the $k$-th transition feature under the $i$-th identity, respectively.
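Since the exact pairwise weighting of the semantic consistency loss is not recoverable from the text, the sketch below assumes a single uniform weight `alpha` and stands in an identity mapping for the fully connected network $g$; it only illustrates the core idea of pulling L2-normalized visible and transition features of the same identity together:

```python
import math

def l2_normalize(v):
    """phi: L2 normalization of a feature vector."""
    n = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / n for x in v]

def semantic_consistency_loss(vis_feats, trans_feats, alpha=1.0):
    """vis_feats[i][j]: j-th visible feature of identity i;
    trans_feats[i][k]: k-th transition feature of identity i.
    Average alpha-weighted Euclidean distance over all same-identity
    visible/transition pairs after L2 normalization."""
    total, pairs = 0.0, 0
    for fv_id, ft_id in zip(vis_feats, trans_feats):
        for fv in fv_id:
            for ft in ft_id:
                total += alpha * math.dist(l2_normalize(fv), l2_normalize(ft))
                pairs += 1
    return total / pairs
```

A feature and any positive multiple of it normalize to the same vector, so the loss is driven by direction rather than magnitude, matching the stated L2-regularization step.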
6. The visible-infrared pedestrian re-identification method based on spectral information filtering according to claim 5, wherein the basic loss consists of the identity loss, i.e. the cross-entropy loss, and the triplet loss; the identity loss of the visible modality is calculated as:
$$L^{V}_{id} = -\frac{1}{N}\sum_{i=1}^{N} \log \frac{\exp\!\big(W_{y_i}^{\top} f^{V}_{i}\big)}{\sum_{c=1}^{C} \exp\!\big(W_{c}^{\top} f^{V}_{i}\big)}$$
where $N$ is the number of visible pictures, $C$ is the total number of IDs, $f^{V}_{i}$ is the feature extracted from the $i$-th visible picture, $W_{y_i}$ is the classifier weight corresponding to the true identity $y_i$ of the sample, and $W_{c}$ is the classifier weight corresponding to class $c$;
the triplet loss is calculated as:
$$L_{tri} = \sum_{(f_a, f_p, f_n) \in \mathcal{F}} \big[\rho + d(f_a, f_p) - d(f_a, f_n)\big]_{+}$$
where $\mathcal{F}$ denotes the set of visible and transition features; $f_a$ and $f_p$ form a positive sample pair; $f_a$ and $f_n$ form a negative sample pair; $d(\cdot,\cdot)$ is the Euclidean distance; $[z]_{+} = \max(z, 0)$; and $\rho$ is the margin parameter.
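The two components of the basic loss can be sketched in pure Python (per-sample versions, with logits already computed as classifier scores $W_c^{\top} f_i$); the margin default is illustrative:

```python
import math

def identity_loss(logits, labels):
    """Cross-entropy identity loss; logits[i][c] is the classifier score
    of sample i for identity class c."""
    total = 0.0
    for lg, y in zip(logits, labels):
        m = max(lg)  # numerically stable log-sum-exp
        lse = m + math.log(sum(math.exp(v - m) for v in lg))
        total += lse - lg[y]
    return total / len(logits)

def triplet_loss(anchor, positive, negative, margin=0.3):
    """Hinge on Euclidean distances: [margin + d(a,p) - d(a,n)]_+ ."""
    return max(0.0, margin + math.dist(anchor, positive) - math.dist(anchor, negative))
```

In a PyTorch implementation these correspond to `nn.CrossEntropyLoss` and `nn.TripletMarginLoss`; the sketch makes the arithmetic of both terms explicit.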
7. The visible-infrared pedestrian re-identification method based on spectral information filtering according to claim 1, wherein in the step (5), the pedestrian re-identification network is updated through gradient back-propagation; the cascaded aggregation loss is calculated as:
$$L_{ca} = \frac{1}{P}\sum_{i=1}^{P}\Big( \big\| c^{V}_{i} - c^{I}_{i} \big\|_{2} + \big\| s^{V}_{i} - s^{I}_{i} \big\|_{2} \Big), \qquad c^{I}_{i} = \frac{1}{K}\sum_{j=1}^{K} f^{I}_{i,j}$$
where $c^{V}_{i}$ and $c^{I}_{i}$ denote the visible and infrared feature centers of the $i$-th identity, respectively; $s^{V}_{i}$ and $s^{I}_{i}$ denote the visible and infrared semantic centers of the $i$-th identity, respectively; $P$ is the total number of IDs in the current batch; and $f^{I}_{i,j}$ is the feature extracted from the $j$-th infrared picture of the $i$-th ID in the current batch.
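A pure-Python sketch of the center-alignment idea behind the cascaded aggregation loss; the semantic centers are omitted because their construction is not recoverable from the text, so the sketch aligns only the per-identity feature centers of the two modalities:

```python
import math

def center(feats):
    """Mean (center) of a list of equally sized feature vectors."""
    n = len(feats)
    return [sum(f[d] for f in feats) / n for d in range(len(feats[0]))]

def cascaded_aggregation_loss(vis_feats, ir_feats):
    """vis_feats[i] / ir_feats[i]: feature lists of identity i.
    Average distance between the visible and infrared feature centers
    of each identity in the batch."""
    total = 0.0
    for fv_id, fi_id in zip(vis_feats, ir_feats):
        total += math.dist(center(fv_id), center(fi_id))
    return total / len(vis_feats)
```

Working on centers rather than individual samples makes the alignment robust to per-sample noise, which is presumably why the loss aggregates before comparing.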
8. A visible-infrared pedestrian re-identification system based on spectral information filtering, comprising:
an acquisition and preprocessing module, used for acquiring the original data, dividing it into a training set, a validation set and a test set, and preprocessing it;
a cross-modal image pair module, used for constructing visible-infrared cross-modal image pairs from the preprocessed training samples and synthesizing a transition picture for each pair;
an extraction module, used for building a three-branch pedestrian re-identification network based on PyTorch, setting training parameters, taking the original training samples and the synthesized transition pictures as network input, and extracting pedestrian representations;
a V-T stage module, used for dividing the training period into two stages, V-T and V-I, and, when training is in the V-T stage, calculating the semantic consistency loss to update the network weights, taking the transition modality as a filtering condition and retaining from the visible modality the spectral information most relevant to the infrared modality;
a V-I stage module, used for calculating the cascaded aggregation loss when training is in the V-I stage, updating the network weights, achieving modality alignment directly between the visible and infrared modalities, and extracting a modality-shared representation; during training, the accuracy of the algorithm is verified with the validation set and the network weights with the best accuracy are saved.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the computer program, when executed by the processor, implements the visible-infrared pedestrian re-identification method based on spectral information filtering according to any one of claims 1-7.
10. A storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the visible-infrared pedestrian re-identification method based on spectral information filtering according to any one of claims 1-7.
CN202410325387.1A 2024-03-21 2024-03-21 Visible light infrared pedestrian re-identification method and system based on spectral information filtering Active CN117935172B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410325387.1A CN117935172B (en) 2024-03-21 2024-03-21 Visible light infrared pedestrian re-identification method and system based on spectral information filtering


Publications (2)

Publication Number Publication Date
CN117935172A true CN117935172A (en) 2024-04-26
CN117935172B CN117935172B (en) 2024-06-14

Family

ID=90751103

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410325387.1A Active CN117935172B (en) 2024-03-21 2024-03-21 Visible light infrared pedestrian re-identification method and system based on spectral information filtering

Country Status (1)

Country Link
CN (1) CN117935172B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112651262A (en) * 2019-10-09 2021-04-13 四川大学 Cross-modal pedestrian re-identification method based on self-adaptive pedestrian alignment
WO2022027986A1 (en) * 2020-08-04 2022-02-10 杰创智能科技股份有限公司 Cross-modal person re-identification method and device
CN114241517A (en) * 2021-12-02 2022-03-25 河南大学 Cross-modal pedestrian re-identification method based on image generation and shared learning network
CN114511878A (en) * 2022-01-05 2022-05-17 南京航空航天大学 Visible light infrared pedestrian re-identification method based on multi-modal relational polymerization
US20220180132A1 (en) * 2020-12-09 2022-06-09 Tongji University Cross-modality person re-identification method based on local information learning
CN115546844A (en) * 2022-11-09 2022-12-30 中国农业银行股份有限公司 Cross-modal pedestrian re-identification model generation method, cross-modal pedestrian re-identification model identification device and equipment
CN116503792A (en) * 2022-01-17 2023-07-28 安徽大学 Multispectral vehicle re-identification method based on cross consistency
CN116798070A (en) * 2023-05-15 2023-09-22 安徽理工大学 Cross-mode pedestrian re-recognition method based on spectrum sensing and attention mechanism
CN116824625A (en) * 2023-05-29 2023-09-29 北京交通大学 Target re-identification method based on generation type multi-mode image fusion
CN117523609A (en) * 2023-11-15 2024-02-06 安徽大学 Visible light and near infrared pedestrian re-identification method based on specific and shared representation learning


Non-Patent Citations (5)

Title
GUOQING ZHANG et al.: "Hybrid-attention guided network with multiple resolution features for person re-identification", Information Sciences, 21 July 2021, page 525, XP086820710, DOI: 10.1016/j.ins.2021.07.058
GUOQING ZHANG et al.: "Learning dual attention enhancement feature for visible-infrared person re-identification", J. Vis. Commun. Image R., 31 March 2024, pages 1-10
HAO YU et al.: "Modality Unifying Network for Visible-Infrared Person Re-Identification", Computer Vision and Pattern Recognition, 12 September 2023, pages 1-11
MANG YE et al.: "Deep Learning for Person Re-identification: A Survey and Outlook", IEEE Transactions on Pattern Analysis and Machine Intelligence, 31 December 2020, pages 1-25
WANG Luyao et al.: "Cross-modal person re-identification combining multi-scale features and confusion learning", CAAI Transactions on Intelligent Systems, 12 March 2024, pages 1-12



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant