CN113128441B

CN113128441B - Attribute- and state-guided structure embedding system and method for vehicle re-identification

Info

Publication number: CN113128441B
Application number: CN202110465670.0A
Authority: CN
Inventors: 郑爱华; 高亚飞; 李洪潮; 李成龙; 汤进; 罗斌
Original assignee: Anhui University
Current assignee: Anhui University
Priority date: 2021-04-28
Filing date: 2021-04-28
Publication date: 2022-10-14
Anticipated expiration: 2041-04-28
Also published as: CN113128441A

Abstract

An attribute- and state-guided structure embedding system and method for vehicle re-identification, belonging to the technical field of computer vision. The invention addresses the large intra-class differences between images of the same vehicle and the small inter-class differences between different vehicles in vehicle re-identification, and learns discriminative features through attribute-based enhancement and state-based weakening. The system comprises a residual network module, an attribute-based enhancement and expansion module, a state-based weakening and contraction module and a global structure embedding module; the method comprises two stages: a training stage and a testing stage. The feature distance between different vehicles is increased by the enhancement and expansion of attributes, and the feature distance between images of the same vehicle is reduced by the weakening and contraction of states, so that the network learns vehicle features better.

Description

Attribute- and state-guided structure embedding system and method for vehicle re-identification

Technical Field

The invention belongs to the technical field of computer vision, and relates to an attribute- and state-guided structure embedding system and method for vehicle re-identification.

Background

The task of vehicle re-identification is to match vehicle images across non-overlapping surveillance cameras. Because of its wide application in video surveillance, public security, smart cities, intelligent transportation and similar fields, it is an active and challenging computer vision task that has attracted wide attention. Despite recent breakthroughs in vehicle re-identification, it still faces two serious challenges.

As shown in fig. 1 (a) and (b), there are large intra-class differences between pictures of the same vehicle in different states (e.g., different camera angles, vehicle viewing angles, and shooting times). As shown in fig. 1 (b), (c) and (d), there are only slight inter-class differences between different vehicles, especially when two vehicles have the same or similar attributes, such as color, model and manufacturer. Therefore, handling the intra-class differences between images of the same vehicle and the inter-class differences between different vehicles has become an important task in vehicle re-identification.

Existing solutions take different approaches to the two challenges described above. Representative methods fall into four main categories: 1) global-feature-based methods, which extract global hand-crafted or deep features of the vehicle image through a specific metric learning method; 2) path-based methods, which generally use spatio-temporal information to remove implausible vehicles in the inference stage and refine the retrieval results; 3) viewpoint-based methods, which aim to handle viewpoint changes through metric learning and learn multi-view features for vehicle Re-ID; 4) local-information-enhancement methods, which enhance inter-class differences in vehicle re-identification by providing stable and discriminative cues.

However, with respect to the two major challenges of vehicle re-identification, namely the large intra-class differences between images of the same vehicle and the small inter-class differences between different vehicles, the existing solutions have the following disadvantages:

1) Global-feature-based methods only consider the appearance of the vehicle image, and often have difficulty capturing intra-class similarities and inter-class differences;

2) Path-based methods usually ignore, in the feature learning stage, the changes in vehicle appearance caused by spatio-temporal changes;

3) Viewpoint-based methods, while significantly reducing intra-class variation, ignore intrinsic state factors of the vehicle (such as camera perspective and capture time) as well as the challenge of subtle inter-class variation;

4) Local-information-enhancement methods require a local region extraction operation, but local region extraction models usually need a large amount of labeled data, which is time-consuming and labor-intensive to obtain.

In the prior art, the document 'Study on vehicle re-identification method based on multiple attributes' (Xiamen University, Li Ke), published in 2019, discloses a vehicle re-identification algorithm that fuses vehicle angles, provides a vehicle re-identification algorithm that fuses vehicle colors and vehicle types, and builds a vehicle re-identification demonstration system on a web platform. Although this document considers the attribute information of the vehicle, it only concatenates the attribute information directly to the vehicle feature as auxiliary information; it obtains good re-identification performance in this way, but does not consider optimizing the attribute information and the vehicle feature at the same time. The document 'Research on a multi-view sparse fusion and multi-scale attention vehicle re-identification method' (Anhui University, dong's King), published in 2020, designs a vehicle re-identification network based on a multi-scale attention mechanism: it extracts a vehicle feature map with a backbone network, generates vehicle feature maps at other scales by a quadratic interpolation method, and sends each feature map to a corresponding sub-network containing spatial and channel attention modules. After the sub-networks are trained separately, the multi-scale attention feature maps are fused using the joining layer, and the entire network is fine-tuned. The network obtains more robust features by fusing multi-scale complementary information and by mining discriminative local details with the attention mechanism; however, this document does not take attribute and state information into account.

Disclosure of Invention

The invention aims to design an attribute- and state-guided structure embedding method and system for vehicle re-identification, and to address the large intra-class differences between images of the same vehicle and the small inter-class differences between different vehicles in vehicle re-identification.

The invention solves the technical problems through the following technical scheme:

an attribute- and state-guided structure embedding vehicle re-identification system, which learns discriminative features for vehicle re-identification through attribute-based enhancement and state-based weakening, comprising: a residual network module, an attribute-based enhancement and expansion module, a state-based weakening and contraction module and a global structure embedding module;

the residual network module is used for extracting a visible-light feature map and copying the feature map into three parts: one part is input into the attribute-based enhancement and expansion module, one part is input into the state-based weakening and contraction module, and the remaining part yields a category score through global average pooling and a fully connected operation;

the attribute-based enhancement and expansion module is used for enhancing the discriminability of the vehicle features through identity-related attribute information and increasing the feature distance between different vehicles through an attribute-based expansion loss function;

the state-based weakening and contraction module is used for weakening state information that interferes with recognition and reducing the intra-class feature distance through a state-based contraction loss function;

the global structure embedding module is used for obtaining a final vehicle feature vector from the attribute-based enhancement and the state-based weakening operations, calculating a category score from the final feature vector, transmitting it to a search library and judging whether the vehicle appears in other cameras.

The technical scheme of the invention provides an attribute- and state-guided structure embedding method for vehicle re-identification, which increases the feature distance between different vehicles by enhancing and expanding the attributes, and reduces the feature distance between images of the same vehicle by weakening and contracting the states, thereby helping the network to learn vehicle features better.

A method applied to the attribute- and state-guided structure embedding vehicle re-identification system, characterized by comprising two stages: a training stage and a testing stage;

the training phase comprises the following steps:

step 1): acquiring a visible-light image of a vehicle, and inputting the visible-light image into the system;

step 2): extracting a visible-light feature map through a residual network and copying the feature map into three parts: one part is input into the attribute-based enhancement and expansion module, another part is input into the state-based weakening and contraction module, and the remaining part yields category scores through global average pooling and a fully connected operation; computing the classification loss between the predicted category and the ground truth with a cross-entropy function for each of the three constraints (attribute, state and identity), and adding the three losses in proportion to form a multi-label classification loss function that supervises the prediction of all three;

step 3): expanding the feature distribution of the attributes through the attribute expansion loss function, thereby increasing the inter-class attribute differences; narrowing the distribution of the states through the state contraction loss function, thereby reducing the intra-class state differences;

step 4): using a global structure embedding loss function, training all samples by separating heterogeneous samples and drawing homogeneous samples together, so that the vehicle feature distance of heterogeneous samples increases with the expansion of the attributes until it reaches the upper bound, and the vehicle feature distance of homogeneous samples decreases with the contraction of the states until it falls below the lower bound;

step 5): the final loss of the model is defined as the sum of the multi-label classification loss, the attribute-based expansion loss, the state-based contraction loss and the global structure embedding loss; in each training iteration, the attribute- and state-guided structure embedding system back-propagates the final loss value and, with the goal of reducing the loss value, updates the model parameters of the residual blocks and the parameters of the attribute enhancement module and the state weakening module by an adaptive-momentum stochastic gradient descent method; after multiple iterations, when the loss value of the system no longer decreases, all parameters are regarded as optimal and the training of the system is finished;

the testing stage comprises the following steps:

step a): loading a visible light image of a vehicle and inputting the visible light image into the trained system;

step b): extracting a visible-light feature map through the trained residual network;

step c): passing the visible-light feature map through the trained attribute-based enhancement and expansion module and the trained state-based weakening and contraction module respectively, to obtain an attribute-enhanced feature tensor and a state-weakened feature tensor;

step d): obtaining a final vehicle feature vector through the attribute-based enhancement and state-based weakening operations;

step e): calculating a category score from the final feature vector, transmitting the category score to a search library, and judging whether the vehicle appears in other cameras.

As a further improvement of the technical solution of the present invention, the method in step 3) of expanding the feature distribution of the attributes through the attribute expansion loss function, thereby increasing the inter-class attribute differences, comprises:

inputting a vehicle image I with an image size of 256 × 256, and first acquiring a feature map T of the vehicle image through a 50-layer residual network, wherein T = ResNet50(I);

making three copies of the feature map T of the vehicle image, the first of which is passed into the attribute-based enhancement and expansion module;

then, inputting the feature map T into the 1 × 1 convolution blocks corresponding to the different attributes to obtain the attribute-related feature maps:

applying the batch normalization (BN) operation and the rectified linear unit (ReLU) operation to the attribute-related feature maps to obtain:

then performing a global average pooling operation to obtain the attribute-related features:

introducing the attribute-related labels and adding a fully connected (FC) layer, so that the attribute-related features are constrained by the labels in end-to-end training, the attribute-related label constraint being formulated as:

then, calculating the feature distribution of the attributes of the current vehicle relative to the attribute mean over all vehicles, where ‖·‖₂ denotes the ℓ₂ norm:

designing an attribute-based expansion loss function to force the network to continuously expand the attribute-related feature distribution in end-to-end training, the expansion formula being as follows:

wherein i denotes the type of attribute, namely: the color attribute of the vehicle, the type attribute of the vehicle, and the manufacturer attribute of the vehicle;

training of the system back-propagates in the direction that decreases the expansion loss, so that the attribute feature distribution keeps expanding and the feature distances between different vehicles keep growing;

using a sigmoid function to normalize the attribute-related feature map to between 0 and 1, and performing an element-wise product with the vehicle feature map T to obtain the attribute-related enhancement map T^e, the formula being as follows:

during the iterations of the system, the expansion of the attributes is performed continuously, and the attribute-related feature map is used to enhance the vehicle feature map; the enhanced vehicle feature map T′ is calculated as:

T′ = T + T^e

wherein T^e denotes the attribute-related enhancement map, T denotes the feature map of the vehicle image, and T′ denotes the enhanced vehicle feature map.
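The enhancement step T′ = T + T^e can be illustrated with a minimal sketch. The formula for T^e is not reproduced in this text, so the sketch assumes that T^e is the element-wise product of T with the sigmoid of an attribute-related map of the same shape, and it handles a single attribute and a single 2-D channel. The names `sigmoid`, `enhance`, `T` and `A` are illustrative only and do not come from the patent.

```python
import math


def sigmoid(x):
    """Logistic function, maps any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def enhance(T, A):
    """Return T' = T + T * sigmoid(A), element-wise.

    T: vehicle feature map (one 2-D channel as nested lists).
    A: attribute-related map of the same shape (an assumption).
    """
    return [[t + t * sigmoid(a) for t, a in zip(t_row, a_row)]
            for t_row, a_row in zip(T, A)]


T = [[1.0, 2.0], [0.5, 0.0]]
A = [[0.0, 10.0], [-10.0, 3.0]]
T_prime = enhance(T, A)
```

In this sketch, positions where the attribute map responds strongly are almost doubled, while positions with a weak attribute response stay close to their original value.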

As a further improvement of the technical solution of the present invention, the method in step 3) of narrowing the distribution of the states through the state contraction loss function, thereby reducing the intra-class state differences, comprises:

making three copies of the feature map T of the vehicle image, one of which is passed into the state-based weakening and contraction module;

then, inputting the feature map T into the 1 × 1 convolution blocks corresponding to the different states to obtain the state-independent feature maps:

applying the batch normalization (BN) operation and the rectified linear unit (ReLU) operation to the state-independent feature maps to obtain:

then performing a global average pooling operation to obtain the state-independent features:

introducing the state-independent labels and adding a fully connected (FC) layer, so that the state-independent features are constrained by the labels in end-to-end training, the state-independent label constraint being formulated as:

then, calculating the feature distribution of the states of the current vehicle relative to the state mean over all vehicles, where ‖·‖₂ denotes the ℓ₂ norm;

designing a state-based contraction loss function to force the network to continuously contract the feature distribution of the different states in end-to-end training, the contraction formula of the states being as follows:

training of the network back-propagates in the direction that decreases the contraction loss, so that the state feature distribution keeps contracting and the feature distance between images of the same vehicle keeps shrinking;

using a sigmoid function to normalize the state-independent feature map to between 0 and 1, and performing an element-wise product with the vehicle feature map T to obtain the state-independent weakening map T^w, the formula being as follows:

during the iterations of the system, the contraction of the states is performed continuously, and the state-independent feature map is used to weaken the enhanced vehicle feature map T′ to obtain the final vehicle feature map T″, calculated as:

T″ = T′ − T^w

wherein T″ denotes the final vehicle feature map, T′ denotes the enhanced vehicle feature map, and T^w denotes the state-independent weakening map.
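The weakening step T″ = T′ − T^w can be sketched in the same way as the enhancement step. The formula for T^w is not reproduced in this text, so the sketch assumes that T^w is the element-wise product of T with the sigmoid of the state branch's map, and it works on a single 2-D channel. The names `weaken`, `T_enh` and `S` are illustrative assumptions.

```python
import math


def sigmoid(x):
    """Logistic function, maps any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def weaken(T, T_enh, S):
    """Return T'' = T' - T * sigmoid(S), element-wise.

    T:     original vehicle feature map (one 2-D channel).
    T_enh: enhanced feature map T' from the attribute branch.
    S:     map produced by the state branch, same shape (an assumption).
    """
    return [[te - t * sigmoid(s) for t, te, s in zip(t_row, te_row, s_row)]
            for t_row, te_row, s_row in zip(T, T_enh, S)]


T = [[1.0, 2.0]]
T_enh = [[1.5, 3.0]]
S = [[0.0, -20.0]]
T_final = weaken(T, T_enh, S)
```

Where the state branch responds strongly, a larger share of the original response is subtracted; where it responds weakly, the enhanced value passes through almost unchanged.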

As a further improvement of the technical scheme of the invention, the method in step 4) of training all samples by separating heterogeneous samples and drawing homogeneous samples together comprises:

for the input image I, obtaining the final vehicle feature map T″ through the attribute-based enhancement and expansion module and the state-based weakening and contraction module, and performing a global average pooling operation on the final vehicle feature map T″ to obtain the feature vector of the image I:

f = GAP(T″)

for any heterogeneous sample pair (I_i, I_j) in the training set, the corresponding vehicle features (f_i, f_j) and the corresponding attribute features can be obtained;

the designed attribute-driven heterogeneous sample separation constraint is as follows:

wherein y_ij = 0 indicates a heterogeneous sample pair, the attribute term is derived from the attribute features of the heterogeneous sample pair (I_i, I_j), and d_ij denotes the Euclidean distance between the vehicle features of the heterogeneous sample pair; the attribute-driven heterogeneous sample separation gives the feature learning of heterogeneous samples an attribute-related weight in the metric learning stage, so that the vehicle feature distance of heterogeneous samples increases with the expansion of the attributes until the upper bound 1 is reached;

for any homogeneous sample pair (I_i, I_j) in the training set, the corresponding vehicle features (f_i, f_j) and the corresponding state features can be obtained,

wherein y_ij = 1 indicates a homogeneous sample pair, the state term is derived from the state features of the homogeneous sample pair (I_i, I_j), and d_ij denotes the Euclidean distance between the vehicle features of the homogeneous sample pair; the state-driven drawing together of homogeneous samples gives the feature learning of homogeneous samples a state-related weight in the metric learning stage, so that the vehicle feature distance of homogeneous samples decreases with the contraction of the states until it is smaller than the lower bound 0.3.
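The pair constraints above can be illustrated with a small sketch. The patent's exact loss formula is not reproduced in this text, so the sketch assumes a hinge-style form: heterogeneous pairs are penalized while their distance is below the upper bound 1, and homogeneous pairs are penalized while their distance is above the lower bound 0.3. The attribute- or state-related weight is passed in as a plain scalar `weight`, which is a placeholder for the patent's weighting term. The names `gap`, `euclidean` and `pair_loss` are illustrative.

```python
import math

UPPER = 1.0   # upper bound for heterogeneous pairs, from the text
LOWER = 0.3   # lower bound for homogeneous pairs, from the text


def gap(feature_map):
    """Global average pooling: (C, H, W) nested lists -> length-C vector."""
    return [sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
            for channel in feature_map]


def euclidean(f_i, f_j):
    """Euclidean distance d_ij between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f_i, f_j)))


def pair_loss(f_i, f_j, same_identity, weight=1.0):
    """Assumed hinge-style pair loss with the two bounds from the text."""
    d = euclidean(f_i, f_j)
    if same_identity:
        # Homogeneous pair: pull together until d falls below LOWER.
        return weight * max(0.0, d - LOWER)
    # Heterogeneous pair: push apart until d reaches UPPER.
    return weight * max(0.0, UPPER - d)


f_a = gap([[[0.0, 0.0], [0.0, 0.0]], [[1.0, 1.0], [1.0, 1.0]]])
f_b = [0.6, 1.0]
loss_negative = pair_loss(f_a, f_b, same_identity=False)
loss_positive = pair_loss(f_a, f_b, same_identity=True)
```

With this form, a pair contributes no gradient once it satisfies its bound, so training effort concentrates on close heterogeneous pairs and distant homogeneous pairs.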

As a further improvement of the technical solution of the present invention, the global structure embedding loss function in step 5) is:

training of the system back-propagates in the direction that decreases the global structure embedding loss, so that the vehicle feature distance of heterogeneous samples increases with the expansion of the attributes until it reaches the upper bound 1, and the vehicle feature distance of homogeneous samples decreases with the contraction of the states until it is smaller than the lower bound 0.3; finally, the multi-label classification loss value, the attribute-based expansion loss value, the state-based contraction loss value and the global structure embedding loss value are back-propagated, and the parameters are updated over 120 iterations to obtain the optimal network parameters.

The invention has the advantages that:

(1) The technical scheme of the invention increases the feature distance between different vehicles by enhancing and expanding the attributes, and reduces the feature distance between the same vehicles by weakening and contracting the state, thereby better helping the network to learn the features of the vehicles.

(2) In the technical scheme, when the vehicle re-identification task is executed in the testing stage, the attribute information of the vehicle is obtained directly by means of the attribute labels used in the training stage. Enhancing the attribute information of the vehicle filters out most vehicles that do not share the same attributes during re-identification, and expanding the attribute information helps to better mine vehicles of the same identity among those sharing the same attributes. Meanwhile, the camera environment, viewing-angle information and time information of the tested vehicle are determined by means of the state labels used in the training stage. Weakening the state information of the vehicle makes vehicles of the same identity easier for the re-identification model to recognize, and contracting the state information helps to better retrieve vehicles of the same identity.

(3) The technical scheme of the invention starts from a single vehicle test image and associates it with the attribute information and state information of the vehicle, which makes it easier to find vehicles of the same identity in a huge vehicle search library and provides a simple and effective strategy for deploying a fast re-identification system for intelligent transportation.

Drawings

FIG. 1 shows example vehicle images illustrating the intra-class and inter-class differences in vehicle re-identification;

FIG. 2 is a block diagram of the system of the present invention;

FIG. 3 is a block diagram of the global structure embedding module of the present invention;

FIG. 4 is a flow chart of a training phase of the method of the present invention;

FIG. 5 is a flow chart of a testing phase of the method of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the embodiments of the present invention, and it is obvious that the described embodiments are some embodiments of the present invention, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

The technical scheme of the invention is further described by combining the drawings and the specific embodiments in the specification:

as shown in fig. 2, an attribute- and state-guided structure embedding vehicle re-identification system, which learns discriminative features for vehicle re-identification through attribute-based enhancement and state-based weakening, comprises: a residual network module, an attribute-based enhancement and expansion module, a state-based weakening and contraction module and a global structure embedding module;

the residual network module is used for extracting a visible-light feature map and copying the feature map into three parts: one part is input into the attribute-based enhancement and expansion module, another part is input into the state-based weakening and contraction module, and the remaining part yields category scores through global average pooling and a fully connected operation;

the attribute-based enhancement and expansion module is used for enhancing the discriminability of the vehicle features through identity-related attribute information and increasing the feature distance between different vehicles by means of an attribute-based expansion loss function;

the state-based weakening and contraction module is used for weakening state information that interferes with recognition and reducing the intra-class feature distance through a state-based contraction loss function;

as shown in fig. 3, the global structure embedding module is configured to obtain a final vehicle feature vector from the attribute-based enhancement and the state-based weakening operations, calculate a category score from the final feature vector, transmit the category score to the search library, and determine whether the vehicle appears in another camera.

Calculating the category score from the final feature vector specifically includes: computing the classification loss between the predicted category and the ground truth with a cross-entropy function for each of the three constraints (attribute, state and identity), and adding the three losses in proportion to form a multi-label classification loss function that supervises the prediction of all three.

The feature distance between different vehicles is increased by the enhancement and expansion of the attributes, and the feature distance between images of the same vehicle is reduced by the weakening and contraction of the states, so that the network learns vehicle features better.

As shown in fig. 4-5, an attribute- and state-guided structure embedding method for vehicle re-identification includes two stages: a training phase and a testing phase.

1. The training phase comprises the following steps

Step 1: a visible light image of a vehicle is acquired and transmitted as input to the network.

Step 2: a visible-light feature map is extracted through a residual network commonly used in the field of vehicle re-identification, and the feature map is copied into three parts: one part is input into the attribute-based enhancement and expansion module, another part is input into the state-based weakening and contraction module, and the remaining part yields category scores through global average pooling and a fully connected operation.

The classification loss between the predicted category and the ground truth is computed with a cross-entropy function for each of the three constraints (attribute, state and identity), and the three losses are added in proportion to form a multi-label classification loss function that supervises the prediction of all three.

Step 3: in addition to attribute-based enhancement and state-based weakening, an attribute-based expansion loss and a state-based contraction loss are further proposed. The attribute-based expansion loss is intended to expand the feature distribution of the attributes, thereby increasing the inter-class attribute differences. The state-based contraction loss is intended to narrow the distribution of the states, thereby reducing the intra-class state differences.

Step 4: a state- and attribute-guided global structure embedding loss is proposed, which gives larger gradient magnitudes to negative samples with smaller attribute differences and to positive samples with larger state differences.

Step 5: the final loss of the model is defined as the sum of the multi-label classification loss, the attribute-based expansion loss, the state-based contraction loss, and the global structure embedding loss. In each training iteration, the attribute- and state-guided structure embedding network back-propagates the final loss value and, with the goal of reducing the loss value, updates the model parameters of the residual blocks and the parameters of the attribute enhancement module and the state weakening module using an adaptive-momentum stochastic gradient descent method. After a certain number of iterations, when the loss value of the network no longer decreases, all parameters are regarded as optimal and the training of the network is finished.
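The loss composition in Steps 2 and 5 can be sketched as follows. This is a minimal illustration under stated assumptions: the text says the three classification losses are added "in proportion" but gives no proportions, so the weights `(1.0, 0.5, 0.5)` are placeholders, and each of the identity, attribute and state branches is reduced to a single classification head. The function names are illustrative.

```python
import math


def cross_entropy(logits, target):
    """Softmax cross-entropy for one sample, computed in a stable way."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_norm - logits[target]


def multi_label_loss(identity, attribute, state, weights=(1.0, 0.5, 0.5)):
    """Weighted sum of three cross-entropy terms.

    Each argument is a (logits, target) pair; the weights are assumed values.
    """
    heads = (identity, attribute, state)
    return sum(w * cross_entropy(logits, target)
               for w, (logits, target) in zip(weights, heads))


def total_loss(classification, expansion, contraction, embedding):
    """Final loss: plain sum of the four terms, as stated in Step 5."""
    return classification + expansion + contraction + embedding


uniform = ([0.0, 0.0], 0)            # two classes, no preference yet
cls = multi_label_loss(uniform, uniform, uniform)
final = total_loss(cls, 0.2, 0.1, 0.4)
```

The expansion, contraction and embedding values passed to `total_loss` here are arbitrary numbers used only to show the summation.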

The specific training mode comprises the following steps:

(1) The batch size is set to 64 in each training iteration, i.e., the network reads 64 images at a time. Each image is resized to 256 × 256 pixels, padded with 10 zero-valued pixels on each side, and then randomly cropped back to a 256 × 256 image. Each image is flipped with probability 0.5, decoded into 32-bit floating-point data with pixel values in [0, 1], and normalized per channel (R, G, B) with means 0.485, 0.456, 0.406 and standard deviations 0.229, 0.224, 0.225, respectively.

(2) For a vehicle image input into the network, the corresponding feature map is first extracted through a trained 50-layer residual network. The feature map is copied into three parts: one part is input into the attribute-based enhancement and expansion module, another part is input into the state-based weakening and contraction module, and the remaining part yields category scores through global average pooling and a fully connected operation. On the one hand, a multi-label classification loss is obtained by combining the classification losses for attributes, states and identities, to ensure that the category predicted by each module is consistent with the ground truth. At the same time, the attribute-based expansion loss increases the inter-class attribute differences, and the state-based contraction loss reduces the intra-class state differences. On the other hand, the attribute- and state-guided global structure embedding loss gives larger gradient magnitudes to negative samples with smaller attribute differences and to positive samples with larger state differences. The final loss value of the attribute- and state-guided structure embedding network model is the sum of the multi-label classification loss, the attribute-based expansion loss, the state-based contraction loss, and the attribute- and state-guided global structure embedding loss.

(3) The loss value is back-propagated and the parameters are updated, and the iteration is repeated until the network converges. The loss obtained in the previous step is propagated back through the network, and the model parameters are updated by an optimizer. The optimizer uses stochastic gradient descent, and the learning rate is set dynamically according to the number of training iterations: the initial learning rate is 0.00035, and it is reduced by a factor of 10 at the 40th iteration and again at the 70th iteration. Training runs for 120 iterations in total.
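The preprocessing and learning-rate settings above can be sketched in plain Python. This is an illustration under stated assumptions: the flip is taken to be horizontal, the augmentation works on a single 2-D channel stored as nested lists, and the learning-rate drop is assumed to take effect from the 40th and 70th iterations onward with 1-based counting. The function names are hypothetical.

```python
import random

# Per-channel statistics quoted in the text (R, G, B order).
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)


def pad_crop_flip(image, pad=10, rng=random):
    """Zero-pad a 2-D channel, crop it back to its size, flip with p = 0.5."""
    height, width = len(image), len(image[0])
    blank = [0.0] * (width + 2 * pad)
    padded = [list(blank) for _ in range(pad)]
    padded += [[0.0] * pad + list(row) + [0.0] * pad for row in image]
    padded += [list(blank) for _ in range(pad)]
    top = rng.randint(0, 2 * pad)      # random crop offset, inclusive bounds
    left = rng.randint(0, 2 * pad)
    crop = [row[left:left + width] for row in padded[top:top + height]]
    if rng.random() < 0.5:             # assumed horizontal flip
        crop = [row[::-1] for row in crop]
    return crop


def normalize_pixel(rgb):
    """Normalize one RGB pixel with values in [0, 1] using MEAN and STD."""
    return tuple((v - m) / s for v, m, s in zip(rgb, MEAN, STD))


def learning_rate(iteration, base_lr=0.00035, milestones=(40, 70)):
    """Step schedule: divide by 10 at each milestone (1-based iterations)."""
    lr = base_lr
    for milestone in milestones:
        if iteration >= milestone:
            lr /= 10.0
    return lr


augmented = pad_crop_flip([[1.0] * 4 for _ in range(4)], pad=1,
                          rng=random.Random(0))
schedule = [learning_rate(i) for i in (1, 40, 70, 120)]
```

Under these assumptions the schedule uses 0.00035 for iterations 1 to 39, one tenth of that for iterations 40 to 69, and one hundredth for iterations 70 to 120.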

2. The testing phase comprises the following steps:

Step 1: Load a visible-light image of the vehicle and input it to the trained network.

Step 2: Extract a visible-light feature map with the trained residual network.

Step 3: Pass the visible-light feature map through the trained attribute-based enhancement and expansion module and the trained state-based weakening and contraction module to obtain an attribute-enhanced feature tensor and a state-weakened feature tensor.

Step 4: Obtain the final vehicle feature vector through the attribute-based enhancement and state-based weakening operations.

Step 5: Compute a category score from the final feature vector, pass it to the retrieval library, and determine whether the vehicle appears in other cameras.

The specific test procedure is as follows:

1) A vehicle image is input. Each image is resized to 256 × 256 pixels and fed into the trained network.

2) The network computes the category scores after attribute enhancement and state weakening.

3) Once the category score of the vehicle is obtained, the vehicle features are compared against the image retrieval library to determine whether the vehicle appears in other cameras.
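
The comparison against the retrieval library in step 3) can be sketched as a nearest-neighbour ranking. This is a minimal sketch, assuming the library stores one feature vector per gallery image and that the comparison uses Euclidean distance (the distance the metric learning stage uses); the gallery keys, the toy vectors, and the function names are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_gallery(query, gallery):
    """Return gallery ids ordered from the nearest to the farthest feature."""
    return sorted(gallery, key=lambda image_id: euclidean(query, gallery[image_id]))


# Hypothetical gallery: image id -> feature vector extracted by the network.
gallery = {
    "cam79_img1": [0.9, 0.1, 0.0],
    "cam139_img7": [0.1, 0.8, 0.3],
    "cam12_img4": [0.8, 0.2, 0.1],
}
ranking = rank_gallery([1.0, 0.0, 0.0], gallery)
```

The top-ranked gallery images are the candidates for the same vehicle seen by other cameras.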

3. Specific workflow of the expansion phase

(1) A vehicle image I of size 256 × 256 is input, and the feature map T of the vehicle image is obtained through the 50-layer residual network: T = ResNet50(I).

(2) The feature map T of the vehicle image is copied three times, and the first copy is passed into the attribute-based enhancement and expansion module.

(3) The feature map T is then fed into a 1 × 1 convolution block associated with the color attribute to obtain the color-attribute feature map:

(4) Batch normalization (BN) and a rectified linear unit (ReLU) are applied to the color-attribute feature map:

(5) Global average pooling is applied to the color-attribute feature map to obtain the color-attribute feature:

(6) The color-attribute label is introduced

A fully connected (FC) layer is added so that the color-attribute feature is constrained by this label during end-to-end training. The label constraint for the color attribute is formulated as:

(7) The label constraints for the multiple (M) attributes are applied simultaneously:

(8) The mean of the color-attribute features of the current vehicle relative to all vehicles is then computed

is distributed over the features, and ‖·‖₂ denotes the ℓ₂ norm, i.e. the square root of the sum of the squared feature components.

(9) An attribute-based expansion loss function is designed to force the network to keep expanding the distribution of attribute-related features during end-to-end training. The expansion formula for one attribute is:

(10) The multiple (M) attributes are expanded simultaneously:

Training back-propagates in the direction that decreases the expansion loss, so the distribution of attribute-related features keeps expanding (for example, the black and gray attribute features move apart). This drives the feature distances between different vehicles to keep growing, making images of different vehicles easier for the network to distinguish.
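
The ReLU and pooling operations of steps (4) and (5), and the mean and ℓ₂ distance of step (8), can be sketched in plain Python. This is a minimal sketch under stated assumptions: feature maps are nested lists of shape channels × height × width, batch normalization is left out because it depends on learned batch statistics, and the toy values are hypothetical. The expansion loss formula itself is not reproduced in this text, so the sketch stops at each feature's distance to the mean and does not claim to be the patent's loss.

```python
import math


def relu(feature_map):
    """Element-wise ReLU on a C x H x W feature map stored as nested lists."""
    return [[[max(0.0, v) for v in row] for row in channel] for channel in feature_map]


def global_average_pool(feature_map):
    """Average each channel over its H x W positions: one value per channel."""
    return [sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
            for channel in feature_map]


def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def l2_distance(a, b):
    """Square root of the sum of squared component differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Toy color-attribute feature maps (2 channels, 2 x 2) for two vehicles.
map_black = [[[1.0, -1.0], [3.0, 1.0]], [[0.0, 0.0], [2.0, 2.0]]]
map_gray = [[[-2.0, 0.0], [0.0, 1.0]], [[4.0, 4.0], [4.0, 4.0]]]

f_black = global_average_pool(relu(map_black))
f_gray = global_average_pool(relu(map_gray))
mean_color = mean_vector([f_black, f_gray])
spread = [l2_distance(f, mean_color) for f in (f_black, f_gray)]
```

A larger `spread` corresponds to attribute features that lie farther from their mean, which is the direction the expansion loss is described as encouraging.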

4. Specific workflow of the enhancement phase

Because the expansion of the color attribute is carried out continuously over the iterations of the network, a progressively more discriminative color feature map is obtained. This more discriminative color feature map is used to enhance the vehicle feature map T.

(1) The color feature map is normalized to the range 0-1 with a sigmoid function and multiplied element-wise with the vehicle feature map:

(2) The enhancement map for multiple (M) attributes can be formulated as:

(3) The multi-attribute enhancement map is added element-wise to the original vehicle feature map:

T′ = T + T^e
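
The enhancement T′ = T + T^e can be sketched on flattened feature maps. This is a minimal sketch, assuming that the multi-attribute enhancement map T^e is the sum over attributes of sigmoid(attribute map) multiplied element-wise with T; the text states the sigmoid gating and the final addition, but the way the M attribute maps are combined is an assumption here, and the names and values are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def enhance(T, attribute_maps):
    """Return T' = T + T^e for a flattened feature map T.

    Assumption: T^e sums sigmoid(attribute_map) * T over all attribute maps.
    """
    T_e = [0.0] * len(T)
    for attribute_map in attribute_maps:
        for k, (t, a) in enumerate(zip(T, attribute_map)):
            T_e[k] += sigmoid(a) * t
    return [t + e for t, e in zip(T, T_e)]


T = [2.0, -1.0, 0.5]          # flattened vehicle feature map (toy values)
color_map = [0.0, 0.0, 0.0]   # sigmoid(0) = 0.5 at every position
T_prime = enhance(T, [color_map])
```

With a gate of 0.5 everywhere, every element of T is scaled by 1.5; positions where the attribute map is strongly positive would be amplified more.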

5. Specific workflow of the contraction phase

(2) The feature map T of the vehicle image is copied three times, and one copy is passed into the state-based weakening and contraction module.

(3) The feature map T is then fed into a 1 × 1 convolution block associated with the camera state to obtain the camera-state feature map:

(5) Global average pooling is applied to the camera-state feature map to obtain the camera-state feature:

(6) The camera-state label is introduced

A fully connected (FC) layer is added so that the camera-state feature is constrained by this label during end-to-end training. The label constraint for the camera state is formulated as:

(7) The label constraints for the multiple (N) states are applied simultaneously:

(8) The mean of the camera-state features of the current vehicle relative to all vehicles is then computed

(9) A state-based contraction loss function is designed to force the network to keep contracting the distribution of state-related features during end-to-end training. The contraction formula for one state is:

(10) The multiple (N) states are contracted simultaneously:

Training back-propagates in the direction that decreases the contraction loss, so the distribution of state-related features (for example, those of camera 139 and camera 79) keeps contracting. This drives the feature distances between images of the same vehicle to keep shrinking, making images of the same vehicle in different states easier for the network to recognize as the same vehicle.

6. Specific workflow of the weakening phase

Because the contraction of the camera state is carried out continuously over the iterations of the network, a camera feature map that is irrelevant to vehicle discrimination is obtained. This camera feature map is used to weaken the vehicle feature map T.

(2) The weakening map for multiple (N) states can be formulated as:

(3) The multi-state weakening map is subtracted element-wise from the enhanced vehicle feature map to obtain the final vehicle feature map:

T″ = T′ − T^w
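
The weakening T″ = T′ − T^w mirrors the enhancement step and can be sketched the same way. This is a minimal sketch, assuming that T^w is the sum over states of sigmoid(state map) multiplied element-wise with the original feature map T (the claims describe the product with T; the sum over the N states is an assumption); all names and values are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def weaken(T, T_prime, state_maps):
    """Return T'' = T' - T^w for flattened feature maps.

    Assumption: T^w sums sigmoid(state_map) * T over all state maps.
    """
    T_w = [0.0] * len(T)
    for state_map in state_maps:
        for k, (t, s) in enumerate(zip(T, state_map)):
            T_w[k] += sigmoid(s) * t
    return [tp - w for tp, w in zip(T_prime, T_w)]


T = [2.0, -1.0, 0.5]           # original vehicle feature map (toy values)
T_prime = [3.0, -1.5, 0.75]    # enhanced feature map from the previous stage
camera_map = [0.0, 0.0, 0.0]   # sigmoid(0) = 0.5 at every position
T_final = weaken(T, T_prime, [camera_map])
```

Positions where a state map responds strongly are subtracted more heavily, so state-specific responses contribute less to the final feature map.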

7. Attribute-driven heterogeneous sample separation

(1) For the input image I, the final vehicle feature map T″ is obtained through the attribute-based enhancement and expansion module and the state-based weakening and contraction module. Global average pooling of T″ yields the feature vector of image I:

f = GAP(T″)

(2) For any heterogeneous sample pair (I_i, I_j) in the training set, the corresponding vehicle features (f_i, f_j) and the corresponding attribute features can be obtained

The designed attribute-driven heterogeneous sample separation constraint is as follows:

where y_ij = 0 indicates that this is a heterogeneous sample pair,

represents the Euclidean distance between the attribute features of the heterogeneous sample pair (I_i, I_j), and d_ij represents the Euclidean distance between the vehicle features of the pair. Attribute-driven heterogeneous sample separation gives the feature learning of heterogeneous samples an attribute-related weight in the metric learning stage, so that the vehicle feature distance of heterogeneous samples increases as the attributes expand, until the upper boundary 1 is reached.

8. State-driven homogeneous sample closeness

f = GAP(T″)

(2) For any homogeneous sample pair (I_i, I_j) in the training set, the corresponding vehicle features (f_i, f_j) and the corresponding state features can be obtained

where y_ij = 1 indicates that this is a homogeneous sample pair,

represents the Euclidean distance between the state features of the homogeneous sample pair (I_i, I_j), and d_ij represents the Euclidean distance between the vehicle features of the pair. State-driven homogeneous sample closeness gives the feature learning of homogeneous samples a state-related weight in the metric learning stage, so that the vehicle feature distance of homogeneous samples shrinks as the states contract, until it falls below the lower boundary 0.3.
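
The two boundaries described in sections 7 and 8 can be illustrated with a hinge-style pair loss. This is only a sketch under stated assumptions: the patent's own constraint formulas are not reproduced in this text, so the hinge form, the way the weight multiplies the term, and the name `pair_loss` are all hypothetical. The only values taken from the text are the upper boundary 1 for heterogeneous pairs (y_ij = 0) and the lower boundary 0.3 for homogeneous pairs (y_ij = 1). The attribute- or state-related weight is passed in by the caller rather than derived here.

```python
import math

UPPER_BOUNDARY = 1.0  # heterogeneous pairs are pushed apart until d_ij reaches 1
LOWER_BOUNDARY = 0.3  # homogeneous pairs are pulled together until d_ij < 0.3


def euclidean(a, b):
    """Euclidean distance d_ij between two vehicle feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pair_loss(f_i, f_j, same_vehicle, weight=1.0):
    """Hinge-style penalty for one sample pair (assumed form, not the patent's).

    `weight` stands in for the attribute-related weight (heterogeneous pairs)
    or the state-related weight (homogeneous pairs).
    """
    d = euclidean(f_i, f_j)
    if same_vehicle:                                   # y_ij = 1
        return weight * max(0.0, d - LOWER_BOUNDARY)
    return weight * max(0.0, UPPER_BOUNDARY - d)       # y_ij = 0


far_negative = pair_loss([0.0, 0.0], [3.0, 4.0], same_vehicle=False)
near_negative = pair_loss([0.0, 0.0], [0.3, 0.4], same_vehicle=False)
close_positive = pair_loss([0.0, 0.0], [0.0, 0.2], same_vehicle=True)
```

In this form a pair stops contributing to the loss once it is on the correct side of its boundary, which matches the "until the boundary is reached" behaviour described above.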

9. Global structure embedding

(1) Because heterogeneous sample separation and homogeneous sample closeness operate on all samples of the training set, the combined constraint is named the global structure embedding module, and the global structure embedding loss function is:

Training back-propagates in the direction that decreases the global structure embedding loss, so that the vehicle feature distance of heterogeneous samples increases as the attributes expand, until the upper boundary 1 is reached, and the vehicle feature distance of homogeneous samples shrinks as the states contract, until it falls below the lower boundary 0.3. Images of different vehicles become easier for the network to distinguish, and images of the same vehicle in different states become easier for the network to recognize as the same vehicle.

(2) Finally, the multi-label classification loss, the attribute-based expansion loss, the state-based contraction loss, and the global structure embedding loss are back-propagated, and the parameters are updated over 120 iterations to obtain the optimal network parameters.
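
The way the four loss terms combine can be sketched as plain arithmetic. This is a minimal sketch: the text states that the final loss is the sum of the four terms, and the claims state that the attribute, state, and identity classification losses are added according to a ratio, but the ratio values are not given, so the default `ratios` below are hypothetical placeholders, as are the function names and toy numbers.

```python
def multi_label_loss(attribute_loss, state_loss, identity_loss, ratios=(1.0, 1.0, 1.0)):
    """Weighted sum of the three classification losses.

    Assumption: `ratios` are placeholder weights; the text does not state them.
    """
    r_attr, r_state, r_id = ratios
    return r_attr * attribute_loss + r_state * state_loss + r_id * identity_loss


def total_loss(classification, expansion, contraction, embedding):
    """Final loss: plain sum of the four terms named in the text."""
    return classification + expansion + contraction + embedding


cls = multi_label_loss(attribute_loss=0.5, state_loss=0.25, identity_loss=1.0)
loss = total_loss(cls, expansion=0.25, contraction=0.5, embedding=0.5)
```

The single scalar `loss` is what would be back-propagated in each training step.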

In this embodiment, feature learning and metric learning are trained with an end-to-end deep neural network, and both attribute information (color, vehicle type, and manufacturer) and state information are taken into account in the feature learning stage and in the metric learning stage. The state information comprises the camera number (which generally indicates where the camera is installed, so the number of camera numbers equals the number of shooting locations), the viewpoint of the vehicle (five viewpoints: front, rear, side, front-side, and rear-side), and the shooting time (0-23, covering the 24 hours of a day).

Optimizing over the attribute information of the vehicle means that, as the different attributes are continuously expanded during end-to-end learning, the feature distances between different vehicles grow. This is more meaningful than directly concatenating features and yields better vehicle re-identification performance.

In addition, the state information of the vehicle is also taken into account, which is likewise original. The information about a vehicle is divided into two groups: one considered helpful for vehicle re-identification (the attributes of the vehicle) and one considered to interfere with vehicle discrimination (the states of the vehicle).

The above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims

1. An attribute- and state-guided structure-embedded vehicle re-identification system, which learns discriminative features for vehicle re-identification through attribute-based enhancement and state-based weakening, comprising: a residual network module, an attribute-based enhancement and expansion module, a state-based weakening and contraction module, and a global structure embedding module;

the attribute-based expansion loss function is designed as follows:

a vehicle image I is input, and a feature map T of the vehicle image is first obtained through the residual network;

the feature map T is then input into 1 × 1 convolution blocks corresponding to the different attributes to obtain the attribute-related feature maps:

attribute-related labels are introduced

is distributed over the features, and ‖·‖₂ denotes the ℓ₂ norm:

wherein i denotes the type of attribute, namely: the color attribute of the vehicle, the type attribute of the vehicle, and the manufacturer attribute of the vehicle;

the state-based contraction loss function is designed as follows:

the feature map T of the vehicle image is copied three times, and the first copy is passed into the state-based weakening and contraction module;

batch normalization (BN) and a rectified linear unit (ReLU) are applied to the state-independent feature map to obtain:

T_j^stt = ReLU(BN(T_j^st))

f_j^st = GAP(T_j^stt)

state-independent labels are introduced

is distributed over the features, and ‖·‖₂ denotes the ℓ₂ norm;

a state-based contraction loss function is designed to force the network to keep contracting the feature distributions of the different states during end-to-end training, the contraction formula for a state being:

wherein j denotes the type of state, namely: the camera number, the viewpoint of the vehicle, and the shooting time;

2. A method applying the attribute- and state-guided structure-embedded vehicle re-identification system of claim 1, characterized by comprising two stages: a training stage and a testing stage;

the training stage comprises the following steps:

step 2): a visible-light feature map is extracted through the residual network and copied three times; one copy is input to the attribute-based enhancement and expansion module, one copy is input to the state-based weakening and contraction module, and the third copy yields category scores through global average pooling and a fully connected operation; the category constraint loss between the predicted category and the ground truth is computed with a cross-entropy function, giving three category constraint losses for attribute, state, and identity, which are added according to a ratio to form a multi-label classification loss function for predicting the three kinds of category;

step 3): the feature distribution of the attributes is expanded through the attribute expansion loss function, thereby increasing the attribute differences between classes; the distribution of the states is contracted through the state contraction loss function, thereby reducing the state differences within a class;

the testing stage comprises the following steps:

step a): a visible-light image of a vehicle is loaded and input into the trained system;

step c): the visible-light feature map is passed through the trained attribute-based enhancement and expansion module and the trained state-based weakening and contraction module respectively, to obtain an attribute-enhanced feature tensor and a state-weakened feature tensor;

3. The method according to claim 2, wherein in step 3) the feature distribution of the attributes is expanded through the attribute expansion loss function, thereby increasing the attribute differences between classes, as follows:

attribute-related labels are introduced

a fully connected (FC) layer is added so that the attribute-related features are constrained by the labels during end-to-end training, the attribute-related label constraint being formulated as:

is distributed over the features, and ‖·‖₂ denotes the ℓ₂ norm:

a sigmoid function is used to normalize the attribute-related feature map

to the range 0-1, which is then multiplied element-wise with the vehicle feature map T to obtain the attribute-related enhancement map T^e, formulated as:

during the iterations of the system, the expansion of the attributes is carried out continuously, and the attribute-related feature map

is used to obtain the enhanced vehicle feature map T′, calculated as:

T′ = T + T^e

4. The method according to claim 3, wherein in step 3) the distribution of the states is contracted through the state contraction loss function, thereby reducing the state differences within a class, as follows:

a vehicle image I of size 256 × 256 is input, and a feature map T of the vehicle image is obtained through a 50-layer residual network, where T = ResNet50(I);

batch normalization (BN) and a rectified linear unit (ReLU) are applied to the state-independent feature map to obtain:

T_j^stt = ReLU(BN(T_j^st))

f_j^st = GAP(T_j^stt)

state-independent labels are introduced

is distributed over the features, and ‖·‖₂ denotes the ℓ₂ norm;

a sigmoid function is used to normalize the state-independent feature map T_j^st to the range 0-1, which is then multiplied element-wise with the vehicle feature map T to obtain the state-independent weakening map T^w, formulated as:

during the iterations of the system, the contraction of the states is carried out continuously, and the state-independent feature map T_j^st is used to weaken the enhanced vehicle feature map T′ to obtain the final vehicle feature map T″, calculated as:

T″ = T′ − T^w

wherein T″ denotes the final vehicle feature map and T′ denotes the enhanced vehicle feature map.

5. The method of claim 4, wherein all samples are trained in step 4) through heterogeneous sample separation and homogeneous sample closeness as follows:

f = GAP(T″)

wherein y_ij = 0 indicates that this is a heterogeneous sample pair,

represents the Euclidean distance between the attribute features of the heterogeneous sample pair (I_i, I_j), and d_ij represents the Euclidean distance between the vehicle features of the heterogeneous sample pair; attribute-driven heterogeneous sample separation gives the feature learning of heterogeneous samples an attribute-related weight in the metric learning stage, so that the vehicle feature distance of heterogeneous samples increases as the attributes expand, until the upper boundary 1 is reached;

the designed state-driven homogeneous sample closeness constraint is as follows:

wherein y_ij = 1 indicates that this is a homogeneous sample pair,

represents the Euclidean distance between the state features of the homogeneous sample pair (I_i, I_j), and d_ij represents the Euclidean distance between the vehicle features of the homogeneous sample pair; state-driven homogeneous sample closeness gives the feature learning of homogeneous samples a state-related weight in the metric learning stage, so that the vehicle feature distance of homogeneous samples shrinks as the states contract, until it falls below the lower boundary 0.3.

6. The method of claim 5, wherein the global structure embedding loss function in step 5) is: