CN114663861A - Vehicle re-identification method based on dimension decoupling and non-local relation - Google Patents

Vehicle re-identification method based on dimension decoupling and non-local relation

Info

Publication number
CN114663861A
Authority
CN
China
Prior art keywords
local
channel
decoupling
dimension
relationship
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210531995.9A
Other languages
Chinese (zh)
Other versions
CN114663861B (en)
Inventor
王成
孟庆兰
田鑫
郑艳丽
姜刚武
庞希愚
栗士涛
李曦
周厚仁
郑美凤
孙珂
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Jiaotong University
Original Assignee
Shandong Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Jiaotong University filed Critical Shandong Jiaotong University
Priority to CN202210531995.9A priority Critical patent/CN114663861B/en
Publication of CN114663861A publication Critical patent/CN114663861A/en
Application granted granted Critical
Publication of CN114663861B publication Critical patent/CN114663861B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the technical field of vehicle re-identification, in particular to a vehicle re-identification method based on dimension decoupling and non-local relations, which comprises the following steps: copying the later residual layers of ResNet-50 into three branches with the same structure, and introducing a global feature extraction mechanism, a non-local relationship capture mechanism and a dimension decoupling mechanism respectively into the three branches after the residual layers. The non-local relationship capture mechanism comprises a channel-based non-local relationship capture module and a space-based non-local relationship capture module, which respectively reduce the noise in the non-local relationships at the channel level and at the spatial level, while assigning different weights to the relationships between channels and between positions. The dimension decoupling mechanism decouples space from channels, so that part of the features is concentrated in a specific subspace. The invention addresses the problems of large intra-class differences and small inter-class differences in vehicle re-identification.

Description

Vehicle re-identification method based on dimension decoupling and non-local relation
Technical Field
The invention relates to the technical field of vehicle re-identification, in particular to a vehicle re-identification method based on dimension decoupling and non-local relation.
Background
With the popularization of the automobile, a large number of technical problems related to vehicle management and scheduling have arisen that urgently need to be solved. Vehicle re-identification is regarded by researchers in the field as one of the technical difficulties of vehicle management and scheduling. Vehicle re-identification aims to find vehicle images belonging to the same identity among images captured by different cameras. In recent years, deep-learning-based vehicle re-identification algorithms have shown unique adaptivity and strong recognition accuracy, so deep learning has been widely applied in the field of vehicle re-identification. At present, the main difficulty in vehicle re-identification is that large intra-class differences and small inter-class differences lead to low generalization capability and low accuracy of the re-identification network.
Disclosure of Invention
The invention aims to overcome the above-mentioned defects in the prior art by providing a vehicle re-identification method based on dimension decoupling and non-local relations. The method is designed around a global feature extraction mechanism, a non-local relationship capture mechanism and a dimension decoupling mechanism; it addresses the problems of large intra-class differences and small inter-class differences in vehicle re-identification and improves the generalization capability and accuracy of the re-identification network.
The technical scheme adopted by the invention for solving the technical problems is as follows:
a vehicle re-identification method based on dimension decoupling and non-local relation comprises the following steps:
the convolutional neural network ResNet-50 is used as the backbone; the later residual layers of ResNet-50 (res_conv4_2 to res_conv5) are copied into three branches with the same structure, and a global feature extraction mechanism, a non-local relationship capture mechanism and a dimension decoupling mechanism are respectively introduced into the three branches after the residual layers;
the global feature extraction mechanism is used for extracting global features of the vehicle: the feature maps generated by the res_conv5 layer are input sequentially into Global Average Pooling (GAP) and a channel dimension-reduction module consisting of a 1 × 1 convolution, Batch Normalization (BN) and a ReLU activation function, yielding a 256-dimensional feature representation;
the non-local relationship capture mechanism comprises a channel-based non-local relationship capture module and a space-based non-local relationship capture module, which respectively reduce the noise in the non-local relationships at the channel level and at the spatial level, while assigning different weights to the relationships between channels and between positions; this mechanism aims to mine the useful non-local relationships and thereby improve the performance of the network;
the dimension decoupling mechanism is used for stripping out the mutual interference of information between channels and space, decoupling space from channels so that part of the features is concentrated in a specific subspace.
Further, the non-local relationship capture mechanism operates as follows: the tensor output by the res_conv5 layer is input simultaneously into the space-based non-local relationship capture module and the channel-based non-local relationship capture module; the feature size is unchanged through the two modules, after which the dimensionality is reduced from 2048 to 256 through Global Average Pooling (GAP), a 1 × 1 convolution, Batch Normalization (BN) and a ReLU operation.
Further, the dimension decoupling mechanism operates as follows: the feature map is first divided into two parts along the horizontal direction in the spatial dimension, giving two feature maps; these are then decoupled evenly in the channel dimension, i.e. each of the two feature maps is divided into two parts along the channel dimension, the aim being to cut off, for each of the two spatial parts, the redundant channels in the channel dimension, thereby reducing the amount of computation and stripping out the relationship between space and channels. In the third branch, the invention takes the two gray features in the feature map generated by res_conv5 as the two decoupled subspaces. After the decoupling operation, feature extraction is performed independently in the two subspaces: the feature maps are passed through Global Average Pooling (GAP), a 1 × 1 convolution, Batch Normalization (BN) and a ReLU operation, reducing the dimensionality from 2048 to 256.
Further, the composition structure of the space-based non-local relationship capture module is as follows: let $X \in \mathbb{R}^{W \times H \times C}$ be the input of the feature extraction module, where $W$ and $H$ are respectively the width and the height of the input tensor and $C$ is the number of channels. The function $\theta(\cdot)$ applies to the input $X$ a group of $n$ depth-separable convolutions whose kernel size is $W \times H$, yielding $n$ feature maps, each of size $1 \times 1 \times C$; each is reshaped to $1 \times C$ and they are spliced along the spatial dimension to obtain an $n \times C$ matrix, denoted $A$. The function $\phi(\cdot)$ applies a $1 \times 1$ convolution to the input tensor $X$ and reshapes the result into a matrix $B \in \mathbb{R}^{C \times WH}$. Multiplying $A$ by $B$ determines the non-local relationships and yields an $n \times WH$ matrix; a softmax activation applied along each row of this matrix gives a probability matrix $M \in \mathbb{R}^{n \times WH}$. The entries of each column of $M$ are then summed to obtain a weight vector $w \in \mathbb{R}^{WH}$, in which each element $w_j$ represents the weight of the $j$-th spatial position:

$$w_j = \sum_{i=1}^{n} M_{i,j}$$

The final output feature $Y$ of the space-based non-local relationship capture module is:

$$Y = X \odot w + X$$

where $w$ is broadcast over the channel dimension. In the feature extraction module, the function $\theta(\cdot)$ consists of $n$ depth-separable convolutions with kernel size $W \times H$. This group of depth-separable convolutions is used to obtain a global set of key-value distributions that measures the importance of each point: because the kernel size is $W \times H$, each convolution fuses all feature points into a global feature representation while assigning different weights to different positions. Here $n$ is an adjustable hyper-parameter whose optimal value, at which the network achieves its best performance, was determined experimentally.
Furthermore, in order to complement the function of the space-based non-local relationship capture module and improve the performance of the network, a channel-based non-local relationship capture module is added to the network. Its principle is similar to that of the space-based module; the difference is that it focuses on the relationships between channels, improving the performance of the network by establishing non-local relationships among groups of channels.
The channel-based non-local relationship capture module is composed as follows: first, the original feature $X \in \mathbb{R}^{W \times H \times C}$ is input to two $1 \times 1$ convolution operations. The first $1 \times 1$ convolution compresses the channels from $C$ to $C/r$, where $r$ is a hyper-parameter; the result is then reshaped into a matrix $P$ of size $(C/r) \times WH$. The output of the second $1 \times 1$ convolution operation is reshaped into a matrix $Q$ of size $C \times WH$. $P$ is multiplied by the transpose of $Q$ to obtain a global relationship matrix $G$ of size $(C/r) \times C$. A softmax activation is applied along each row of $G$ to obtain a probability matrix; the entries of each column of this matrix are then summed, an operation denoted by the function $F_{\mathrm{sum}}(\cdot)$ (it is analogous to the computation of $w$ in the space-based non-local relationship capture module), giving a channel-based global feature mask $m \in \mathbb{R}^{C}$:

$$m = F_{\mathrm{sum}}\big(\mathrm{softmax}(G)\big)$$

Then $m$ is element-wise multiplied with the input feature $X$ and the result is added to $X$ to obtain the final output feature representation. The final output feature $Y$ of the channel-based non-local relationship capture module is:

$$Y = X \odot m + X$$
The technical effects of the invention are as follows:
Compared with the prior art, the vehicle re-identification method based on dimension decoupling and non-local relations is designed around a global feature extraction mechanism, a non-local relationship capture mechanism and a dimension decoupling mechanism. The global feature extraction mechanism captures relatively complete, coarse-grained features; the non-local relationship capture mechanism extracts salient information from the feature map produced by the backbone network in the spatial dimension and the channel dimension respectively, so that the network can extract finer-grained features; and the dimension decoupling mechanism thoroughly decouples space from channels, concentrating part of the features in a specific subspace. The three branches extract different useful information and assist one another, so that the model achieves its best performance and the generalization capability and accuracy of the vehicle re-identification network are greatly improved.
Drawings
FIG. 1 is a vehicle re-identification network architecture diagram based on dimensional decoupling and non-local relationships in accordance with the present invention;
FIG. 2 is a diagram of a space-based non-local relationship capture module architecture according to the present invention;
FIG. 3 is a diagram of the channel-based non-local relationship capture module architecture according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments are described below clearly and completely with reference to the drawings of the specification.
Example 1:
the embodiment relates to a vehicle re-identification method based on dimension decoupling and non-local relation, which comprises the following steps:
the convolutional neural network ResNet-50 is used as the backbone network to strengthen the feature extraction capability; the network structure is shown in FIG. 1. The later residual layers of ResNet-50 (res_conv4_2 to res_conv5) are copied into three branches with the same structure, and a global feature extraction mechanism, a non-local relationship capture mechanism and a dimension decoupling mechanism are respectively introduced into the three branches after the residual layers;
the global feature extraction mechanism is used for extracting global features of the vehicle and is called the global branch; the global branch inputs the feature maps generated by the res_conv5 layer sequentially into Global Average Pooling (GAP) and a channel dimension-reduction module consisting of a 1 × 1 convolution, Batch Normalization (BN) and a ReLU activation function, yielding a 256-dimensional feature representation;
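The global branch's reduction head can be sketched as follows. This is a minimal NumPy illustration, not the patented implementation: the weight matrix `W_reduce`, the input shapes and the single-sample form of BN are all assumptions for illustration. GAP collapses the spatial dimensions of the res_conv5 output, and a 1 × 1 convolution applied to a 1 × 1 map is equivalent to a matrix multiplication from 2048 to 256 dimensions, followed by BN and ReLU.

```python
import numpy as np

def global_branch_head(feat, W_reduce, gamma=1.0, beta=0.0, eps=1e-5):
    """Global branch sketch: GAP -> 1x1 conv (as a matmul) -> BN -> ReLU.

    feat: res_conv5 output of shape (C, H, W), here C = 2048 (assumed).
    W_reduce: (256, C) weights of the 1x1 channel-reduction convolution.
    """
    # Global Average Pooling over the spatial dimensions -> (C,)
    pooled = feat.mean(axis=(1, 2))
    # A 1x1 convolution on a 1x1 spatial map is a linear projection 2048 -> 256
    reduced = W_reduce @ pooled
    # Batch Normalization sketch (normalizing one sample's feature vector;
    # real BN normalizes over the batch)
    normed = gamma * (reduced - reduced.mean()) / np.sqrt(reduced.var() + eps) + beta
    # ReLU activation
    return np.maximum(normed, 0.0)

rng = np.random.default_rng(0)
feat = rng.standard_normal((2048, 16, 16))
W_reduce = rng.standard_normal((256, 2048)) * 0.01
out = global_branch_head(feat, W_reduce)
print(out.shape)  # (256,)
```

The same GAP → 1 × 1 conv → BN → ReLU head recurs in all three branches of the network.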
the non-local relationship capture mechanism comprises a channel-based non-local relationship capture module and a space-based non-local relationship capture module; firstly, simultaneously inputting the tensor output by the res _ conv5 layer into a space-based non-local relation capturing module and a channel-based non-local relation capturing module, and capturing highly detailed space and channel correlation by adopting a parallel structure; the non-local relation capture module based on the channel and the non-local relation capture module based on the space respectively perform noise reduction on the non-local relation between the channel and the space level, and meanwhile different weights are distributed for the relation between the channels and the positions on the channel and the space level; the mechanism aims to dig out useful non-local relations and further improve the performance of the network. The feature size of the spatial and channel non-local relation capture module is unchanged, the subsequent operation is the same as the global branch, and the dimensionality is reduced from 2048 to 256 through Global Average Pooling (GAP), 1 × 1 convolution, Batch Normalization (BN) and ReLU operation. Because the non-local relations of the input features are not all useful, only the specific non-local relations have a remarkable effect on improving the network precision, but the redundant non-local relations can become interference factors for capturing the features with the identifiability by the network, therefore, the non-local relation capturing mechanism solves the problems that the calculation amount is too large due to the fact that all relations of all positions are calculated by the existing network, the redundant non-local relations occupy calculation resources and interfere the network to capture the features which really play the role of identification, and improves the network precision of the vehicle re-identification network.
The dimension decoupling mechanism is used for stripping out the mutual interference of information between channels and space, decoupling space from channels so that part of the features is concentrated in a specific subspace. Specifically, the feature map is first divided into two parts along the horizontal direction in the spatial dimension, giving two feature maps; these are then decoupled evenly in the channel dimension, i.e. each of the two feature maps is divided into two parts along the channel direction. The aim is to cut off, for each of the two spatial parts, the redundant channels in the channel dimension, thereby reducing the amount of computation and stripping out the relationship between space and channels; compared with previous methods in which every subspace is associated with all channels, the decoupling branch truly extracts local features from only part of the channels and subspaces. In the third branch, as shown in FIG. 1, the invention takes the two gray features in the feature map generated by res_conv5 as the two decoupled subspaces. After the decoupling operation, feature extraction is performed independently in the two subspaces, with the same subsequent operations as in the first two branches. Hard-division strategies simply divide the input features at the spatial level and ignore the channel-dimension feature information of each hard-divided region; since the features to be attended to differ between the partitions after hard division, the region of interest in the channel dimension also changes between partitions.
Therefore, the invention adopts a more precise feature capture approach, the dimension decoupling mechanism, which removes most of the computation and effectively resolves the problem that, when existing hard-division strategies extract fine-grained features, the information between subspaces and channels interferes mutually and degrades network accuracy.
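The decoupling step can be sketched with plain array slicing. This is a hedged illustration: which half of the channels is assigned to which spatial half is an assumption, and the patent's actual channel-to-subspace assignment may differ. The res_conv5 feature map is split into upper and lower halves along the height, and each half then retains only a disjoint half of the channels, so each subspace is associated with C/2 channels rather than all C.

```python
import numpy as np

def dimension_decouple(feat):
    """Split a (C, H, W) feature map into two decoupled subspaces.

    Step 1: split horizontally (height) into an upper and a lower part.
    Step 2: split on the channel dimension, keeping a disjoint half of the
            channels for each spatial part (the assignment is assumed).
    """
    C, H, W = feat.shape
    upper, lower = feat[:, : H // 2, :], feat[:, H // 2 :, :]
    # Each subspace keeps only half of the channels: the redundant half is cut off.
    sub1 = upper[: C // 2]   # first C/2 channels, upper spatial half
    sub2 = lower[C // 2 :]   # last C/2 channels, lower spatial half
    return sub1, sub2

feat = np.arange(2048 * 8 * 4, dtype=float).reshape(2048, 8, 4)
s1, s2 = dimension_decouple(feat)
print(s1.shape, s2.shape)  # (1024, 4, 4) (1024, 4, 4)
```

Each subspace then goes through the same GAP → 1 × 1 conv → BN → ReLU head as the other branches.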
As shown in FIG. 2, the structure of the space-based non-local relationship capture module is as follows: let $X \in \mathbb{R}^{W \times H \times C}$ be the input of the feature extraction module, where $W$ and $H$ are respectively the width and the height of the input tensor and $C$ is the number of channels. The function $\theta(\cdot)$ applies to the input $X$ a group of $n$ depth-separable convolutions ($n$ is an adjustable hyper-parameter) whose kernel size is $W \times H$, yielding $n$ feature maps, each of size $1 \times 1 \times C$; each is reshaped to $1 \times C$ and they are spliced along the spatial dimension to obtain an $n \times C$ matrix. Let this matrix be $A$. The function $\phi(\cdot)$ applies a $1 \times 1$ convolution to the input tensor $X$ and reshapes the result into a matrix $B \in \mathbb{R}^{C \times WH}$. $A$ is then multiplied by $B$ to determine the non-local relationships, giving an $n \times WH$ matrix; a softmax activation applied along each row of this matrix yields a probability matrix $M \in \mathbb{R}^{n \times WH}$. The entries of each column of $M$ are then summed to obtain a weight vector $w \in \mathbb{R}^{WH}$, each element $w_j$ of which represents the weight of the $j$-th spatial position:

$$w_j = \sum_{i=1}^{n} M_{i,j}$$

The final output feature $Y$ of the space-based non-local relationship capture module is:

$$Y = X \odot w + X$$

where $w$ is broadcast over the channel dimension. In the feature extraction module, the function $\theta(\cdot)$ consists of $n$ depth-separable convolutions with kernel size $W \times H$. This group of depth-separable convolutions is used to obtain a global set of key-value distributions that measures the importance of each point: because the kernel size is $W \times H$, each convolution fuses all feature points into a global feature representation while assigning different weights to different positions. Here $n$ is an adjustable hyper-parameter whose optimal value, at which the network achieves its best performance, was determined experimentally.
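Under the shapes described for this module, the computation reduces to a few matrix operations. The sketch below is a NumPy illustration of that computation only: the $n$ depth-separable convolutions with a $W \times H$ kernel are stood in for by $n$ full-map weighted sums, and all weights are random placeholders rather than learned parameters.

```python
import numpy as np

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spatial_nonlocal(X, n=2, rng=None):
    """Space-based non-local relationship capture (shape-level sketch).

    X: input of shape (W, H, C). Returns (output of the same shape, weights w).
    """
    rng = rng or np.random.default_rng(0)
    W, H, C = X.shape
    X_flat = X.reshape(W * H, C)                 # WH x C

    # theta: n depth-separable convolutions with a W x H kernel; each collapses
    # the whole map to a 1 x 1 x C response (a per-position weighted sum).
    kernels = rng.standard_normal((n, W * H))    # placeholder kernel weights
    A = kernels @ X_flat                         # n x C

    # phi: a 1x1 convolution, then reshape to C x WH.
    W_phi = rng.standard_normal((C, C)) * 0.05   # placeholder 1x1 conv weights
    B = (X_flat @ W_phi.T).T                     # C x WH

    rel = A @ B                                  # n x WH non-local relationships
    M = softmax(rel, axis=1)                     # probability matrix, rows sum to 1
    w = M.sum(axis=0)                            # weight per spatial position (WH,)
    Y = X * w.reshape(W, H, 1) + X               # reweight positions, add residual
    return Y, w

rng = np.random.default_rng(1)
X = rng.standard_normal((4, 4, 8))
Y, w = spatial_nonlocal(X, n=2, rng=rng)
print(Y.shape, w.shape)  # (4, 4, 8) (16,)
```

Because each of the $n$ softmax rows sums to one, the position weights $w$ sum to $n$ overall; positions that score highly across several key-value distributions receive the largest weights.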
In order to complement the function of the space-based non-local relationship capture module and improve the performance of the network, the invention adds a channel-based non-local relationship capture module to the network. Its principle is similar to that of the space-based module; the difference is that it focuses on the relationships between channels, improving the performance of the network by establishing non-local relationships among groups of channels.
As shown in FIG. 3, the channel-based non-local relationship capture module is composed as follows: first, the original feature $X \in \mathbb{R}^{W \times H \times C}$ is input to two $1 \times 1$ convolution operations. The first $1 \times 1$ convolution compresses the channels from $C$ to $C/r$ ($r$ is an adjustable hyper-parameter; the network sets $r$ to 4); the result is then reshaped into a matrix $P$ of size $(C/r) \times WH$. The output of the second $1 \times 1$ convolution operation is reshaped into a matrix $Q$ of size $C \times WH$. $P$ is multiplied by the transpose of $Q$ to obtain a global relationship matrix $G$ of size $(C/r) \times C$. A softmax activation is applied along each row of $G$ to obtain a probability matrix; the entries of each column of this matrix are then summed, an operation denoted by the function $F_{\mathrm{sum}}(\cdot)$ (it is analogous to the computation of $w$ in the space-based non-local relationship capture module), giving a channel-based global feature mask $m \in \mathbb{R}^{C}$:

$$m = F_{\mathrm{sum}}\big(\mathrm{softmax}(G)\big)$$

Then $m$ is element-wise multiplied with the input feature $X$ and the result is added to $X$ to obtain the final output feature representation. The final output feature $Y$ of the channel-based non-local relationship capture module is:

$$Y = X \odot m + X$$
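The channel-based module follows the same pattern on the channel axis. The sketch below mirrors the shapes described for this module; the two learned $1 \times 1$ convolutions are replaced by random placeholder weights, so it illustrates the data flow only.

```python
import numpy as np

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def channel_nonlocal(X, r=4, rng=None):
    """Channel-based non-local relationship capture (shape-level sketch).

    X: input of shape (W, H, C). Returns (output, channel mask m of size C).
    """
    rng = rng or np.random.default_rng(0)
    W, H, C = X.shape
    X_flat = X.reshape(W * H, C)                  # WH x C

    W1 = rng.standard_normal((C // r, C)) * 0.05  # 1x1 conv compressing C -> C/r
    W2 = rng.standard_normal((C, C)) * 0.05       # second 1x1 conv, C channels kept
    P = (X_flat @ W1.T).T                         # (C/r) x WH
    Q = (X_flat @ W2.T).T                         # C x WH

    G = P @ Q.T                                   # (C/r) x C global relationships
    M = softmax(G, axis=1)                        # softmax along each row
    m = M.sum(axis=0)                             # channel mask of size C (F_sum)
    Y = X * m.reshape(1, 1, C) + X                # dot-multiply per channel, residual
    return Y, m

rng = np.random.default_rng(2)
X = rng.standard_normal((4, 4, 16))
Y, m = channel_nonlocal(X, r=4, rng=rng)
print(Y.shape, m.shape)  # (4, 4, 16) (16,)
```

As in the spatial module, each softmax row sums to one, so the channel mask $m$ sums to $C/r$; channels that are prominent in several compressed relationship rows receive the largest weights.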
In order to improve the learning and discrimination capability of the network, the invention constrains it with a cross-entropy loss function and a triplet loss function:
The invention uses ResNet-50 as the backbone network, sets the batch size to 16 and the number of training epochs to 450, and resizes each image to 256 × 256 before inputting it into the network. In the training phase, the 256-dimensional features after dimension reduction are constrained by the triplet loss. In addition, the reduced features are passed through a fully connected (FC) layer that converts the 256-dimensional features into the number of vehicle IDs of the dataset, and the cross-entropy loss is then used as a training constraint. In the testing phase, the Euclidean distance is used to measure the similarity between vehicle images.
The invention provides a vehicle re-identification network based on dimension decoupling and non-local relations for the vehicle re-identification task. In this network, three branches are used to learn a variety of useful information. The first branch captures relatively complete, coarse-grained features. The second branch extracts the salient information of the feature map produced by the backbone network in the spatial dimension and the channel dimension respectively, so that the network can extract finer-grained features. The third branch completely decouples space from channels, with some features dedicated to one particular subspace. Overall, the three branches extract different useful information and assist one another, optimizing the performance of the model.
The above embodiments are only specific examples of the present invention, and the scope of the present invention includes but is not limited to the above embodiments, and any suitable changes or modifications by those of ordinary skill in the art, which are consistent with the claims of the present invention, shall fall within the scope of the present invention.

Claims (6)

1. A vehicle re-identification method based on dimension decoupling and non-local relation is characterized in that: the method comprises the following steps:
using a convolutional neural network Resnet50 as a backbone, copying partial residual error layers after ResNet-50 into three branches with the same structure, and sequentially introducing a global feature extraction mechanism, a non-local relationship capture mechanism and a dimension decoupling mechanism into each branch after the residual error layers;
the global feature extraction mechanism is used for extracting global features of the vehicle, and feature maps generated by a res _ conv5 layer are sequentially input into a global average pooling module and a channel dimension reduction module consisting of a 1 × 1 convolution, batch standardization and a ReLU activation function to obtain 256-dimensional feature representation;
the non-local relationship capturing mechanism comprises a non-local relationship capturing module based on a channel and a non-local relationship capturing module based on a space, wherein the non-local relationship capturing module and the non-local relationship capturing module respectively perform noise reduction on the non-local relationship between the channel and the space level, and meanwhile different weights are distributed for the relationship between the channel and the position on the channel and the space level; the mechanism aims to dig out useful non-local relations so as to improve the performance of the network;
the dimension decoupling mechanism is used for stripping mutual interference of information between the channel and the space, and decoupling the space and the channel, so that a part of features are concentrated in a specific subspace.
2. The vehicle re-identification method based on dimension decoupling and non-local relationships of claim 1, characterized in that the non-local relationship capture mechanism specifically operates as follows: the tensor output by the res_conv5 layer is input simultaneously into the space-based non-local relationship capture module and the channel-based non-local relationship capture module; the feature size is unchanged after passing through the spatial and channel non-local relationship capture modules, and the dimensionality is then reduced from 2048 to 256 by global average pooling, a 1×1 convolution, batch normalization and a ReLU operation.
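The dimension-reduction step described in claims 1 and 2 (global average pooling followed by a 1×1 convolution, batch normalization and ReLU, 2048 → 256) can be sketched in NumPy: on a pooled vector, a 1×1 convolution reduces to a plain matrix multiply. The random weights and the per-vector normalization are placeholders for the learned layers, not the patent's trained parameters.

```python
import numpy as np

def reduce_head(feat, w, eps=1e-5):
    """GAP + 1x1 conv (a matrix multiply on the pooled vector) +
    batch-norm-style normalization + ReLU, per claims 1-2.
    Normalizing a single vector by its own statistics is a stand-in
    for batch normalization over a real mini-batch."""
    pooled = feat.mean(axis=(1, 2))              # (2048,) global average pooling
    reduced = w @ pooled                          # 1x1 conv == linear map, (256,)
    normed = (reduced - reduced.mean()) / np.sqrt(reduced.var() + eps)
    return np.maximum(normed, 0.0)                # ReLU

rng = np.random.default_rng(0)
feat = rng.standard_normal((2048, 16, 16))        # res_conv5-style output, C x H x W
w = rng.standard_normal((256, 2048)) * 0.01       # placeholder 1x1-conv weights
out = reduce_head(feat, w)
print(out.shape)                                  # (256,)
```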
3. The vehicle re-identification method based on dimension decoupling and non-local relationships of claim 1, characterized in that the dimension decoupling mechanism specifically operates as follows: the feature map is first divided into two parts along the horizontal direction in the spatial dimension, yielding two feature maps, and each of the two resulting feature maps is then divided equally into two parts along the channel dimension; after this decoupling operation, feature extraction is performed independently in each of the resulting subspaces: the feature map undergoes global average pooling, a 1×1 convolution, batch normalization and a ReLU operation, and the dimensionality is reduced to 256.
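The decoupling in claim 3 (split the map in half along the height, then split each half along the channel dimension, and process each resulting subspace independently) amounts to simple tensor slicing. A NumPy sketch follows; the per-subspace input width of the 1×1 convolution (here the split channel count, 1024) is an assumption, since the translated claim only states the 256-dimensional output.

```python
import numpy as np

def dimension_decouple(feat):
    """Split a C x H x W map into upper/lower spatial halves, then split
    each half into two equal channel groups, per claim 3 (4 subspaces)."""
    C, H, W = feat.shape
    parts = []
    for spatial in (feat[:, : H // 2, :], feat[:, H // 2 :, :]):
        parts.append(spatial[: C // 2])           # first channel group
        parts.append(spatial[C // 2 :])           # second channel group
    return parts

def pool_and_reduce(part, w):
    """Independent per-subspace extraction: GAP + 1x1 conv to 256 dims."""
    return w @ part.mean(axis=(1, 2))             # (256,)

rng = np.random.default_rng(1)
feat = rng.standard_normal((2048, 16, 16))
w = rng.standard_normal((256, 1024)) * 0.01       # 1024 channels per subspace (assumed)
subspaces = dimension_decouple(feat)
vecs = [pool_and_reduce(p, w) for p in subspaces]
print(len(subspaces), vecs[0].shape)              # 4 (256,)
```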
4. The vehicle re-identification method based on dimension decoupling and non-local relationships of claim 1, characterized in that the space-based non-local relationship capture module is composed as follows: let X be the input to the feature extraction module, where W and H are respectively the width and height of the input tensor and C is the number of channels; the function g(·) applies to the input X a number of depthwise separable convolution operations with a fixed kernel size, yielding that number of feature maps; each map is reshaped into a row vector and the results are concatenated along the spatial dimension to obtain a matrix denoted A; θ(·) applies a 1×1 convolution operation to the input tensor X, and the result is reshaped into a matrix B; A is then multiplied by the matrix B to determine the non-local relationships, yielding a relation matrix; on the basis of the derived non-local relations, a softmax activation function is applied to each column to obtain a probability matrix P; the sums over the rows of P are then computed to obtain a weight vector w, each element w_i of which represents the weight of the i-th spatial position (the explicit expression for w_i is given by a formula that appears only as an image in the source publication); the final output feature Y of the space-based non-local relationship capture module is likewise given by a formula that appears only as an image; in the feature extraction module, the function g(·) is implemented as the aforementioned depthwise separable convolutions of fixed kernel size.
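The matrix bookkeeping of claim 4 — reference responses concatenated into A, a 1×1-convolution projection B of the input, their product softmax-normalized into per-position weights — can be sketched in NumPy. Because the exact sizes, the expression for w_i, and the final combination appear only as images in the source, the reference count N, the matrix orientations, and the residual re-weighting below are all assumptions.

```python
import numpy as np

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spatial_nonlocal(X, A_refs, theta_w):
    """Sketch of claim 4's space-based module. X: (C, H, W).
    A_refs: (N, C) stands in for the N depthwise-separable-conv responses
    concatenated into matrix A; theta_w: (C, C) stands in for the 1x1
    convolution theta. Shapes and the final residual combination are
    assumptions, since the source formulas are images."""
    C, H, W = X.shape
    B = theta_w @ X.reshape(C, H * W)       # (C, HW) projected input
    R = A_refs @ B                           # (N, HW) non-local relations
    P = softmax(R, axis=1)                   # probability over positions per reference
    w = P.sum(axis=0)                        # (HW,) one weight per spatial position
    Y = X * w.reshape(1, H, W) + X           # assumed residual re-weighting
    return Y, w

rng = np.random.default_rng(2)
X = rng.standard_normal((8, 4, 4))
Y, w = spatial_nonlocal(X, rng.standard_normal((3, 8)),
                        rng.standard_normal((8, 8)))
print(Y.shape, w.shape)                      # (8, 4, 4) (16,)
```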
5. The vehicle re-identification method based on dimension decoupling and non-local relationships of claim 1, characterized in that the channel-based non-local relationship capture module is composed as follows: first, the original feature X is input into two 1×1 convolution operations; the first 1×1 convolution compresses the number of channels from C to a reduced count C/r, where r is a hyperparameter, and the result is reshaped into a matrix B1; after the second 1×1 convolution operation, the result is reshaped into a matrix B2; the transpose of B2 is multiplied with B1 to obtain a global reference matrix G; a softmax activation function is applied to each column of G to obtain a probability matrix; the sums over the rows of the probability matrix, expressed by a summation function, then yield a channel-based global feature mask M of size C (the explicit formula appears only as an image in the source publication); M is element-wise multiplied with the input feature X and the result is added to X to obtain the final output feature representation; the final output feature Y of the channel-based non-local relationship capture module is given by a formula that likewise appears only as an image.
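The channel-based module of claim 5 can be sketched in the same way: two 1×1 projections, a global reference matrix, a softmax-derived channel mask, and a residual combination. Since the source formulas are images, the matrix orientations, the softmax axis, and the reduction ratio r below are assumptions.

```python
import numpy as np

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def channel_nonlocal(X, w1, w2):
    """Sketch of claim 5's channel-based module. X: (C, H, W).
    w1: (C/r, C) compressing 1x1 conv; w2: (C, C) second 1x1 conv.
    Orientation of the softmax and the mask normalization are assumed."""
    C, H, W = X.shape
    flat = X.reshape(C, H * W)
    B1 = w1 @ flat                    # (C/r, HW) compressed projection
    B2 = w2 @ flat                    # (C, HW) second projection
    G = B1 @ B2.T                     # (C/r, C) global reference relations
    P = softmax(G, axis=1)            # distribution over the C channels per row
    M = P.sum(axis=0)                 # (C,) channel-wise global feature mask
    Y = X * M.reshape(C, 1, 1) + X    # element-wise mask, then residual add
    return Y, M

rng = np.random.default_rng(4)
X = rng.standard_normal((8, 4, 4))
w1 = rng.standard_normal((2, 8)) * 0.1   # compress channels C=8 -> C/r=2 (r=4 assumed)
w2 = rng.standard_normal((8, 8)) * 0.1
Y, M = channel_nonlocal(X, w1, w2)
print(Y.shape, M.shape)                  # (8, 4, 4) (8,)
```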
6. The vehicle re-identification method based on dimension decoupling and non-local relationships of any one of claims 1-5, characterized in that the network is constrained with a cross-entropy loss function and a triplet loss function:
ResNet-50 is used as the backbone network, the batch size is set to 16, the number of training epochs to 450, and images are resized to 256 × 256 before being input into the network; in the training stage, the 256-dimensional features obtained after dimension reduction are constrained by training with the triplet loss; in addition, the dimension-reduced 256-dimensional features are mapped through a fully connected layer to the number of vehicle IDs in the dataset and constrained by training with the cross-entropy loss; in the testing stage, the Euclidean distance is used to measure the similarity between vehicle images.
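The two training constraints of claim 6 can be sketched in NumPy for a single sample: a hinge triplet loss on Euclidean distances for the 256-dimensional embedding, and a softmax cross-entropy on the fully-connected logits. The margin value and the number of vehicle IDs (576) are illustrative assumptions, not values stated in the claim.

```python
import numpy as np

def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample (numerically stable)."""
    z = logits - logits.max()
    log_probs = z - np.log(np.exp(z).sum())
    return -log_probs[label]

def triplet_loss(anchor, positive, negative, margin=0.3):
    """Hinge triplet loss on Euclidean distances (margin value assumed)."""
    d_ap = np.linalg.norm(anchor - positive)
    d_an = np.linalg.norm(anchor - negative)
    return max(d_ap - d_an + margin, 0.0)

rng = np.random.default_rng(3)
anchor, pos, neg = (rng.standard_normal(256) for _ in range(3))
fc = rng.standard_normal((576, 256)) * 0.01   # FC layer: 256 -> number of IDs (576 assumed)
logits = fc @ anchor
total = cross_entropy(logits, label=0) + triplet_loss(anchor, pos, neg)
print(total >= 0)                             # True: both losses are non-negative
```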
CN202210531995.9A 2022-05-17 2022-05-17 Vehicle re-identification method based on dimension decoupling and non-local relation Active CN114663861B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210531995.9A CN114663861B (en) 2022-05-17 2022-05-17 Vehicle re-identification method based on dimension decoupling and non-local relation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210531995.9A CN114663861B (en) 2022-05-17 2022-05-17 Vehicle re-identification method based on dimension decoupling and non-local relation

Publications (2)

Publication Number Publication Date
CN114663861A true CN114663861A (en) 2022-06-24
CN114663861B CN114663861B (en) 2022-08-26

Family

ID=82037194

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210531995.9A Active CN114663861B (en) 2022-05-17 2022-05-17 Vehicle re-identification method based on dimension decoupling and non-local relation

Country Status (1)

Country Link
CN (1) CN114663861B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116665019A (en) * 2023-07-31 2023-08-29 山东交通学院 Multi-axis interaction multi-dimensional attention network for vehicle re-identification

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021120157A1 (en) * 2019-12-20 2021-06-24 Intel Corporation Light weight multi-branch and multi-scale person re-identification
CN113420742A (en) * 2021-08-25 2021-09-21 山东交通学院 Global attention network model for vehicle weight recognition
CN113822246A (en) * 2021-11-22 2021-12-21 山东交通学院 Vehicle weight identification method based on global reference attention mechanism
CN114005078A (en) * 2021-12-31 2022-02-01 山东交通学院 Vehicle weight identification method based on double-relation attention mechanism
CN114332919A (en) * 2021-12-11 2022-04-12 南京行者易智能交通科技有限公司 Pedestrian detection method and device based on multi-spatial relationship perception and terminal equipment
CN114398979A (en) * 2022-01-13 2022-04-26 四川大学华西医院 Ultrasonic image thyroid nodule classification method based on feature decoupling

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021120157A1 (en) * 2019-12-20 2021-06-24 Intel Corporation Light weight multi-branch and multi-scale person re-identification
CN113420742A (en) * 2021-08-25 2021-09-21 山东交通学院 Global attention network model for vehicle weight recognition
CN113822246A (en) * 2021-11-22 2021-12-21 山东交通学院 Vehicle weight identification method based on global reference attention mechanism
CN114332919A (en) * 2021-12-11 2022-04-12 南京行者易智能交通科技有限公司 Pedestrian detection method and device based on multi-spatial relationship perception and terminal equipment
CN114005078A (en) * 2021-12-31 2022-02-01 山东交通学院 Vehicle weight identification method based on double-relation attention mechanism
CN114398979A (en) * 2022-01-13 2022-04-26 四川大学华西医院 Ultrasonic image thyroid nodule classification method based on feature decoupling

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
LINMING GAO et al.: "Low-Rank Nonlocal Representation for Remote Sensing Scene Classification", IEEE Geoscience and Remote Sensing Letters *
ZIYI CHEN et al.: "Corse-to-Fine Road Extraction Based on Local Dirichlet Mixture Models and Multiscale-High-Order Deep Learning", IEEE Transactions on Intelligent Transportation Systems *
WANG CHENG: "Research on the optimal dispatching model of electricity retailers considering the participation of multiple types of distributed generation", China Masters' Theses Full-text Database, Engineering Science and Technology II *
WANG HUITAO et al.: "Efficient video classification method based on global spatio-temporal receptive field", Journal of Chinese Computer Systems *
XIE PENGYU et al.: "Person re-identification based on multi-scale joint learning", Journal of Beijing University of Aeronautics and Astronautics *


Also Published As

Publication number Publication date
CN114663861B (en) 2022-08-26

Similar Documents

Publication Publication Date Title
CN111462126B (en) Semantic image segmentation method and system based on edge enhancement
Wang et al. Exploring linear relationship in feature map subspace for convnets compression
CN113902926A (en) General image target detection method and device based on self-attention mechanism
CN113822246B (en) Vehicle weight identification method based on global reference attention mechanism
JP2015052832A (en) Weight setting device and method
CN107292225B (en) Face recognition method
CN109670418B (en) Unsupervised object identification method combining multi-source feature learning and group sparsity constraint
CN108154133B (en) Face portrait-photo recognition method based on asymmetric joint learning
CN112580480B (en) Hyperspectral remote sensing image classification method and device
CN107564007B (en) Scene segmentation correction method and system fusing global information
CN114005078B (en) Vehicle weight identification method based on double-relation attention mechanism
CN114998958B (en) Face recognition method based on lightweight convolutional neural network
CN112084895B (en) Pedestrian re-identification method based on deep learning
CN112733590A (en) Pedestrian re-identification method based on second-order mixed attention
CN114663861B (en) Vehicle re-identification method based on dimension decoupling and non-local relation
Zhang et al. Fusion of multifeature low-rank representation for synthetic aperture radar target configuration recognition
CN116128944A (en) Three-dimensional point cloud registration method based on feature interaction and reliable corresponding relation estimation
CN112967210B (en) Unmanned aerial vehicle image denoising method based on full convolution twin network
CN114005046A (en) Remote sensing scene classification method based on Gabor filter and covariance pooling
CN113221992A (en) Based on L2,1Large-scale data rapid clustering method of norm
Huang et al. A convolutional neural network architecture for vehicle logo recognition
Li et al. POLSAR Target Recognition Using a Feature Fusion Framework Based on Monogenic Signal and Complex-Valued Nonlocal Network
CN116030495A (en) Low-resolution pedestrian re-identification algorithm based on multiplying power learning
CN110210443B (en) Gesture recognition method for optimizing projection symmetry approximate sparse classification
CN111931767B (en) Multi-model target detection method, device and system based on picture informativeness and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant