CN112633417A - Pedestrian depth feature fusion method for pedestrian re-identification and with neural network modularization - Google Patents

Pedestrian depth feature fusion method for pedestrian re-identification and with neural network modularization

Info

Publication number
CN112633417A
CN112633417A (Application CN202110059638.2A)
Authority
CN
China
Prior art keywords
pedestrian
network
feature
identification
loss function
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110059638.2A
Other languages
Chinese (zh)
Inventor
张涛
孙星
李璇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University filed Critical Tianjin University
Priority to CN202110059638.2A priority Critical patent/CN112633417A/en
Publication of CN112633417A publication Critical patent/CN112633417A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/25 Fusion techniques
    • G06F 18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/29 Graphical models, e.g. Bayesian networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/103 Static body considered as a whole, e.g. static pedestrian or occupant recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

A pedestrian depth feature fusion method with neural network modularization for pedestrian re-identification. The method realizes a deep-learning-based pedestrian re-identification approach that suits different neural networks and loss functions, and fuses features taken from different depths of different network structures. The feature fusion method applied to pedestrian re-identification therefore has greater flexibility and robustness, improves the accuracy of pedestrian re-identification, and weakens the interference of factors such as changes of the monitoring view angle. A ResNet50 network and a Multiple Granularity Network (MGN) for pedestrian re-identification are adopted as baseline algorithms, with cross-entropy loss, triplet loss and circle loss as candidate loss functions. Tests use the Market-1501, DukeMTMC-reID and CUHK03 datasets; on CUHK03, the method achieves a Rank-1/mAP improvement of +2.71%/+2.11% over the baseline MGN.

Description

Pedestrian depth feature fusion method for pedestrian re-identification and with neural network modularization
Technical Field
The invention relates to pedestrian re-identification, and in particular to a more flexible and robust method for fusing pedestrian features extracted at different depths of a neural network during training.
Background
The pedestrian re-identification task can be regarded as a subtask of image retrieval: searching for a pedestrian of a specific identity in videos or images captured by cameras covering different imaging areas. As an emerging technology in the field of intelligent video analysis, pedestrian re-identification plays an important role in security and surveillance applications. Its essence is to detect, track and analyse the features of pedestrians in images or videos. The accuracy of re-identification results is affected by the monitoring environment, changes of shooting view angle, differences between shooting environments, poor image quality, and changes of human posture. With the rapid development of computer vision and machine learning, many key problems in pedestrian re-identification have been effectively addressed, and the task has gradually become a very popular research direction.
Although deep learning has effectively improved the efficiency of pedestrian re-identification, many research difficulties remain because of the interference caused by changes of camera view angle and pedestrian posture. Since pictures of the same pedestrian captured from different camera angles can differ greatly, it is necessary to fuse the features of the same pedestrian across camera angles. Existing deep learning methods for pedestrian re-identification usually focus on feature extraction for the classification problem, and learning features of the same pedestrian under different view angles usually relies on loss-function constraints to raise similarity. Effective feature fusion during network training can therefore improve the performance of a re-identification network. The common practice is to fuse features at a fixed node of the network, which limits the flexibility of application.
The invention provides a pedestrian feature fusion method for pedestrian re-identification that modularizes the neural network, giving the fusion method greater flexibility and robustness. Modularization enables fusion of features extracted at different network depths, focuses attention on how features from different depths affect pedestrian re-identification, and lets the feature fusion module be applied more conveniently to different re-identification networks, thereby improving re-identification accuracy.
Disclosure of Invention
The pedestrian re-identification algorithm optimizes the fusion of image features of the same pedestrian under different view angles. By modularizing the neural network, the feature fusion module can be applied more flexibly within a neural network framework, and attention is paid to how features extracted from network layers of different depths affect the pedestrian features finally used for identification, so that a comparatively good fusion scheme can be found. The feature fusion module itself is formed by cascaded 1×1 convolution kernels.
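The patent does not specify the internal wiring of the fusion module beyond cascaded 1×1 convolution kernels; the following PyTorch sketch is one plausible reading, assuming the node-p feature maps of the k same-identity images are concatenated along the channel axis before the cascaded 1×1 convolutions (the class name and parameters are illustrative, not from the patent):

    import torch
    import torch.nn as nn

    class FeatureFusionModule(nn.Module):
        """Hypothetical B_f: cascaded 1x1 convolutions that fuse the node-p
        feature maps of k same-identity images into a single map."""
        def __init__(self, channels: int, k: int):
            super().__init__()
            self.k = k
            self.fuse = nn.Sequential(
                nn.Conv2d(k * channels, channels, kernel_size=1),
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels, channels, kernel_size=1),
            )

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # x: (B, C, H, W), with the k images of each identity adjacent
            # in the batch, as the training setup requires.
            b, c, h, w = x.shape
            grouped = x.view(b // self.k, self.k * c, h, w)  # stack on channels
            return self.fuse(grouped)                        # (B/k, C, H, W)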
1) Divide the neural network architecture W into several training modules, taking multiple inputs/outputs as nodes; the loss function is an independent module, and the architecture W is defined as the backbone network.
2) Select the output node p of a partition module B_p as the input node of the feature fusion module B_f.
3) Define the sub-network behind node p in the original architecture W as W_q, and construct a sub-network M_q with the same structure as W_q.
4) Splice M_q after the feature fusion module B_f to construct the feature fusion branch network (the assembly of steps 3 and 4 is sketched after step 6 below).
5) Select suitable loss functions to constrain the backbone network and the feature fusion branch network, respectively.
6) During network training, the loss function of the whole architecture is the sum of the loss function of the backbone network and that of the feature fusion branch network.
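Steps 3) and 4) amount to duplicating the tail of the backbone and hanging it behind B_f; a minimal sketch, assuming W_q is an nn.Module that can be duplicated with copy.deepcopy (build_fusion_branch is an illustrative name):

    import copy
    import torch.nn as nn

    def build_fusion_branch(fusion_module: nn.Module, w_q: nn.Module) -> nn.Module:
        # Step 3): M_q copies the structure of the sub-network W_q behind
        # node p; the copy trains its own, independent weights.
        m_q = copy.deepcopy(w_q)
        # Step 4): splice M_q after the fusion module B_f to form the
        # feature fusion branch network.
        return nn.Sequential(fusion_module, m_q)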
The candidate loss functions are the cross-entropy loss function $L_{softmax}$, the triplet loss function $L_{triplet}$ and the circle loss function $L_{circle}$, defined as follows:

$$L_{softmax} = -\frac{1}{N}\sum_{i=1}^{N}\log\frac{e^{W_{y_i}^{T}x_i}}{\sum_{k=1}^{C}e^{W_k^{T}x_i}}$$

where $x_i$ is the $i$-th feature, $W_k$ is the classifier weight vector scoring $x_i$ against the $k$-th identity, and $W_{y_i}$ is the weight vector of $x_i$'s true identity.

$$L_{triplet} = \max\left(\lVert x_a - x_p\rVert_2 - \lVert x_a - x_n\rVert_2 + \alpha,\; 0\right)$$

where $x_a$, $x_p$ and $x_n$ denote the current feature, the hardest positive sample and the hardest negative sample, respectively: the hardest positive sample is the least similar feature in the current training batch that belongs to the same pedestrian as the current feature, and the hardest negative sample is the most similar feature in the current training batch that belongs to a different pedestrian. $\alpha$ is a margin hyper-parameter.

$$L_{circle} = \log\left[1 + \sum_{j=1}^{L}\exp\left(\gamma\,\alpha_n^j\left(s_n^j - \Delta_n\right)\right)\sum_{i=1}^{K}\exp\left(-\gamma\,\alpha_p^i\left(s_p^i - \Delta_p\right)\right)\right]$$

where $s_p^i$ and $s_n^j$ are the within-class and between-class similarity scores in the batch, $\alpha_p^i$ and $\alpha_n^j$ their adaptive weighting factors, $\Delta_p$ and $\Delta_n$ the margins, and $\gamma$ a scale factor.
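A minimal PyTorch sketch of the batch-hard triplet loss just described, assuming Euclidean distance as the dissimilarity measure (the function name and default margin are illustrative, not from the patent):

    import torch

    def batch_hard_triplet_loss(features: torch.Tensor,
                                labels: torch.Tensor,
                                margin: float = 0.3) -> torch.Tensor:
        # Pairwise Euclidean distances, shape (N, N).
        dist = torch.cdist(features, features, p=2)
        same_id = labels.unsqueeze(0) == labels.unsqueeze(1)
        # Hardest positive: the least similar (farthest) same-identity feature.
        pos = dist.masked_fill(~same_id, float('-inf')).max(dim=1).values
        # Hardest negative: the most similar (closest) different-identity feature.
        neg = dist.masked_fill(same_id, float('inf')).min(dim=1).values
        return torch.clamp(pos - neg + margin, min=0).mean()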
7) Input each picture into the whole network architecture.
8) Perform cross-image feature fusion at node p.
9) Constrain the parameters of the whole network through the loss function of the backbone network and the loss function of the feature fusion branch network.
10) Judge whether the iteration count set for network training has been reached; if so, execute step 11), otherwise return to step 7).
11) Encode the feature feature_m obtained by each pedestrian picture through the backbone network together with the feature feature_f obtained through the feature fusion branch network to produce the feature finally used for pedestrian re-identification: take the mean or the maximum of each dimension of feature_m and feature_f as the pedestrian feature feature_o finally used for retrieval.
12) Judge pedestrian identity by measuring distances between the retrieval features feature_o of the pedestrians: the smaller the distance, the more likely two pictures show the same person; the larger the distance, the less likely.
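A sketch of steps 11) and 12), assuming feature_m and feature_f are equal-length vectors and Euclidean distance is the metric (encode and rank_gallery are illustrative names):

    import torch

    def encode(feature_m: torch.Tensor, feature_f: torch.Tensor,
               mode: str = 'mean') -> torch.Tensor:
        # Step 11): per-dimension mean or maximum of the backbone feature
        # and the fusion-branch feature gives the retrieval feature feature_o.
        if mode == 'mean':
            return (feature_m + feature_f) / 2
        return torch.maximum(feature_m, feature_f)

    def rank_gallery(query: torch.Tensor, gallery: torch.Tensor) -> torch.Tensor:
        # Step 12): smaller distance means more likely the same pedestrian.
        dist = torch.cdist(query.unsqueeze(0), gallery).squeeze(0)
        return torch.argsort(dist)  # gallery indices, nearest first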
Drawings
FIG. 1 shows the overall flow chart of the process
Table 1: Comparison of ResNet50 with the method of the invention [table rendered as an image in the original; values not recoverable]
Table 2: Comparison of MGN with the method of the invention [table rendered as an image in the original; values not recoverable]
Detailed Description
Taking the CUHK03 data set and the MGN network as an example, the best mode of implementation in one scenario is as follows:
1) Take res_conv1, res_conv2 and res_conv3 of the neural network architecture MGN as input/output nodes and divide the network into training modules, where the loss function is an independent module; the architecture W is defined as the backbone network.
2) Select res_conv3 as the partition module B_p; its output node p is the input node of the feature fusion module B_f.
3) Define the sub-network behind node p in the original architecture W as W_q, and construct a sub-network M_q with the same structure as W_q.
4) Splice M_q after the feature fusion module B_f to construct the feature fusion branch network.
5) For both the backbone network and the feature fusion branch network, select the cross-entropy loss function and the circle loss function.
6) During network training, the loss function of the whole architecture is the sum of the loss function of the backbone network and that of the feature fusion branch network.
The cross-entropy loss function $L_{softmax}$ and the circle loss function $L_{circle}$ are defined as follows:

$$L_{softmax} = -\frac{1}{N}\sum_{i=1}^{N}\log\frac{e^{W_{y_i}^{T}x_i}}{\sum_{k=1}^{C}e^{W_k^{T}x_i}}$$

where $x_i$ is the $i$-th feature, $W_k$ is the classifier weight vector scoring $x_i$ against the $k$-th identity, and $W_{y_i}$ is the weight vector of $x_i$'s true identity.

$$L_{circle} = \log\left[1 + \sum_{j=1}^{L}\exp\left(\gamma\,\alpha_n^j\left(s_n^j - \Delta_n\right)\right)\sum_{i=1}^{K}\exp\left(-\gamma\,\alpha_p^i\left(s_p^i - \Delta_p\right)\right)\right]$$

where the within-class similarities $s_p^i$ are computed between the current feature and features of the same pedestrian in the current training batch, and the between-class similarities $s_n^j$ against features of different pedestrians; $\alpha_p^i$ and $\alpha_n^j$ are adaptive weighting factors, $\Delta_p$ and $\Delta_n$ margins, and $\gamma$ a scale hyper-parameter.
7) Input the pictures into the whole network architecture: each batch contains 8 pedestrian identities with 8 images per identity, i.e. 64 images in total are input into the deep neural network MGN for pedestrian re-identification, and the images of each identity are required to be adjacent within the batch during parallel training (the batch construction is sketched after this list of steps).
8) Perform cross-image feature fusion at node p.
9) Constrain the parameters of the whole network through the loss function of the backbone network and the loss function of the feature fusion branch network.
10) Set the iteration count to 400; judge whether it has been reached, and if so, execute step 11), otherwise return to step 7).
11) Encode the feature feature_m obtained by each pedestrian picture through the backbone network together with the feature feature_f obtained through the feature fusion branch network to produce the feature finally used for pedestrian re-identification: take the mean of each dimension of feature_m and feature_f as the pedestrian feature feature_o finally used for retrieval.
12) Judge pedestrian identity by measuring distances between the retrieval features feature_o of the pedestrians: the smaller the distance, the more likely two pictures show the same person; the larger the distance, the less likely.
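A sketch of the batch construction required in step 7): 8 identities with 8 images each, 64 images per batch, each identity's images adjacent. The patent does not specify a sampler, so pk_batches below is an illustrative stand-in:

    import random
    from collections import defaultdict

    def pk_batches(labels, p=8, k=8):
        """Yield index batches of p identities x k images (8 x 8 = 64 in this
        embodiment), with each identity's images adjacent in the batch."""
        by_id = defaultdict(list)
        for idx, pid in enumerate(labels):
            by_id[pid].append(idx)
        ids = [pid for pid, idxs in by_id.items() if len(idxs) >= k]
        random.shuffle(ids)
        for start in range(0, len(ids) - p + 1, p):
            batch = []
            for pid in ids[start:start + p]:
                batch.extend(random.sample(by_id[pid], k))
            yield batch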

Claims (1)

1. A method for optimizing the fusion of image features of the same pedestrian under different view angles based on neural network modularization, characterized by comprising the following steps:
1) dividing a neural network architecture W into several training modules, taking multiple inputs/outputs as nodes, wherein the loss function is an independent module and the architecture W is defined as the backbone network;
2) selecting the output node p of a partition module B_p as the input node of the feature fusion module B_f;
3) defining the sub-network behind node p in the original architecture W as W_q, and constructing a sub-network M_q with the same structure as W_q;
4) splicing M_q after the feature fusion module B_f to construct the feature fusion branch network;
5) selecting suitable loss functions, from among the cross-entropy loss, the triplet loss and the circle loss, to constrain the backbone network and the feature fusion branch network respectively;
6) inputting each picture into the whole network architecture and performing cross-image feature fusion at node p;
7) constraining the parameters of the whole network through the loss function of the backbone network and the loss function of the feature fusion branch network;
8) judging whether the iteration count set for network training has been reached; if so, executing step 9), and if not, returning to step 6);
9) encoding the feature feature_m obtained by each pedestrian picture through the backbone network together with the feature feature_f obtained through the feature fusion branch network to produce the feature finally used for pedestrian re-identification, wherein the mean or the maximum of each dimension of feature_m and feature_f is taken as the pedestrian feature feature_o finally used for retrieval;
10) judging pedestrian identity by measuring distances between the retrieval features feature_o, wherein the smaller the distance, the more likely two pictures show the same person, and the larger the distance, the less likely.
CN202110059638.2A 2021-01-18 2021-01-18 Pedestrian depth feature fusion method for pedestrian re-identification and with neural network modularization Pending CN112633417A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110059638.2A CN112633417A (en) 2021-01-18 2021-01-18 Pedestrian depth feature fusion method for pedestrian re-identification and with neural network modularization

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110059638.2A CN112633417A (en) 2021-01-18 2021-01-18 Pedestrian depth feature fusion method for pedestrian re-identification and with neural network modularization

Publications (1)

Publication Number Publication Date
CN112633417A (en) 2021-04-09

Family

ID=75294466

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110059638.2A Pending CN112633417A (en) 2021-01-18 2021-01-18 Pedestrian depth feature fusion method for pedestrian re-identification and with neural network modularization

Country Status (1)

Country Link
CN (1) CN112633417A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113255604A (en) * 2021-06-29 2021-08-13 苏州浪潮智能科技有限公司 Pedestrian re-identification method, device, equipment and medium based on deep learning network
CN113255604B (en) * 2021-06-29 2021-10-15 苏州浪潮智能科技有限公司 Pedestrian re-identification method, device, equipment and medium based on deep learning network
WO2023272995A1 (en) * 2021-06-29 2023-01-05 苏州浪潮智能科技有限公司 Person re-identification method and apparatus, device, and readable storage medium
US11810388B1 (en) 2021-06-29 2023-11-07 Inspur Suzhou Intelligent Technology Co., Ltd. Person re-identification method and apparatus based on deep learning network, device, and medium
US11830275B1 (en) 2021-06-29 2023-11-28 Inspur Suzhou Intelligent Technology Co., Ltd. Person re-identification method and apparatus, device, and readable storage medium
CN115100690A (en) * 2022-08-24 2022-09-23 天津大学 Image feature extraction method based on joint learning

Similar Documents

Publication Publication Date Title
CN112633417A (en) Pedestrian depth feature fusion method for pedestrian re-identification and with neural network modularization
CN113516012B (en) Pedestrian re-identification method and system based on multi-level feature fusion
CN111723645B (en) Multi-camera high-precision pedestrian re-identification method for in-phase built-in supervised scene
CN111639564B (en) Video pedestrian re-identification method based on multi-attention heterogeneous network
CN111340123A (en) Image score label prediction method based on deep convolutional neural network
CN112651262B (en) Cross-modal pedestrian re-identification method based on self-adaptive pedestrian alignment
CN109508663A (en) A kind of pedestrian's recognition methods again based on multi-level supervision network
Shankar et al. Refining architectures of deep convolutional neural networks
CN113159466B (en) Short-time photovoltaic power generation prediction system and method
CN115171165A (en) Pedestrian re-identification method and device with global features and step-type local features fused
CN110751018A (en) Group pedestrian re-identification method based on mixed attention mechanism
CN110022422B (en) Video frame sequence generation method based on dense connection network
Giraldo et al. Graph CNN for moving object detection in complex environments from unseen videos
CN111126223A (en) Video pedestrian re-identification method based on optical flow guide features
CN112418087A (en) Underwater video fish identification method based on neural network
CN115063832A (en) Global and local feature-based cross-modal pedestrian re-identification method for counterstudy
CN115713546A (en) Lightweight target tracking algorithm for mobile terminal equipment
CN109359530B (en) Intelligent video monitoring method and device
CN111079585A (en) Image enhancement and pseudo-twin convolution neural network combined pedestrian re-identification method based on deep learning
CN109598227B (en) Single-image mobile phone source re-identification method based on deep learning
CN114694185B (en) Cross-modal target re-identification method, device, equipment and medium
CN114418003B (en) Double-image recognition and classification method based on attention mechanism and multi-size information extraction
CN113537032B (en) Diversity multi-branch pedestrian re-identification method based on picture block discarding
CN114821629A (en) Pedestrian re-identification method for performing cross image feature fusion based on neural network parallel training architecture
Hernandez et al. Classification of color textures with random field models and neural networks

Legal Events

Date Code Title Description
PB01 Publication