CN116052150A - Vehicle face recognition method for shielding license plate - Google Patents

Vehicle face recognition method for shielding license plate

Info

Publication number
CN116052150A
Authority
CN
China
Prior art keywords
layer
relu
vector
feature map
convolution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310061292.9A
Other languages
Chinese (zh)
Inventor
邓玉辉
汤智敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jinan University
Original Assignee
Jinan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jinan University filed Critical Jinan University
Priority to CN202310061292.9A priority Critical patent/CN116052150A/en
Publication of CN116052150A publication Critical patent/CN116052150A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/60Type of objects
    • G06V20/62Text, e.g. of license plates, overlay texts or captions on TV images
    • G06V20/625License plates
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/52Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V20/54Surveillance or monitoring of activities, e.g. for recognising suspicious objects of traffic, e.g. cars on the road, trains or boats
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a vehicle face recognition method for occluded license plates, which addresses the technical problem of identifying vehicles whose license plates have been deliberately obscured. Color prediction information and picture vector information are obtained from a color prediction module, a global perception module and a detail perception module; the folder corresponding to the predicted color is then located, cosine distances are computed in turn against the picture vectors stored in that folder, and the top 5 results, ordered from largest to smallest, are selected as the recognition result. Compared with traditional methods, the method adds an auxiliary vehicle color classification flow, performs recognition through the global perception module and the detail perception module, and carries out multiple normalization calculations in the global perception module over both the batch and channel directions. In other words, vehicles with occluded license plates are retrieved from a vehicle picture library across the dimensions of color appearance, global information and detail information, which improves both the retrieval capability and the model's representation of occluded vehicles, so that the corresponding vehicle with an unoccluded license plate is successfully found.

Description

Vehicle face recognition method for shielding license plate
Technical Field
The invention relates to the technical field of computer vision and pattern recognition, in particular to a vehicle face recognition method for occluded license plates.
Background
More and more households choose the automobile as their means of transport. In complex traffic environments, however, some drivers commit violations such as speeding or overloading on a given road, occlude their license plates to evade tracking by traffic police, and restore the plates to normal after a period of time. This behavior is difficult to detect and poses a serious challenge for traffic monitoring and control. Although large numbers of cameras are deployed on roads to track offending vehicles, many vehicles share the same model and color, and manually locating the corresponding target pictures in the massive picture libraries captured by non-overlapping cameras is impractical and wastes a great deal of time. A method is therefore needed that uses deep learning to find, from the features in a vehicle face picture library, the matching unoccluded vehicle face pictures of an occluded vehicle. Existing deep-learning-based methods extract an effective global feature representation for each vehicle face picture but neglect detail feature information. In addition, features extracted by a neural network struggle to capture appearance changes, so the ranked recognition results often differ greatly from the actual vehicle in the picture to be detected.
Disclosure of Invention
The invention aims to overcome the defects in the prior art and provides a vehicle face recognition method for occluded license plates.
The aim of the invention can be achieved by adopting the following technical scheme:
a face recognition method for a license plate, the face recognition method comprising the steps of:
s1, inputting a certain face picture P in a data set into a backbone network to obtain a feature map F p ∈R C×W×H Wherein C represents the number of channels of the feature map, W and H are the width and height of the feature map, and R represents the real number domain;
s2, mapping the characteristic diagram F p The method comprises the steps of inputting the color of the vehicle into a color prediction module to predict the color of the vehicle, obtaining a predicted vehicle color result i, wherein the color prediction result i is used for step S6;
s3, mapping the characteristic diagram F p Inputting to global perception module for prediction to obtain vector
Figure SMS_1
Where D is the dimension of the column vector, vector Q 1 The global information of the face picture is contained;
s4, mapping the characteristic diagram F p Inputting to a detail perception module for prediction to obtain a vector
Figure SMS_2
Vector Q 2 The detail information of the face picture is contained;
s5, vector Q 1 Sum vector Q 2 Splicing to obtain a vector Q r ∈R D×1 Vector Q r Global information and detail information of the face picture are contained;
s6, according to the predicted vehicle color result i of the step S2, the vector Q is calculated r A folder named i;
s7, sequentially and repeatedly executing the operations from the step S1 to the step S6 on all the face pictures in the picture library in the data set to obtain a vector set Q i ={Q i,k |1≤k≤I n }, wherein I n For the number of vectors in folder i, Q i,k Refers to a vector corresponding to a kth picture in the folder i;
s8, inputting face pictures to be detected, and sequentially executing the step S1 to obtain a feature map F q
S9, inputting the feature map F_q and executing step S2 to obtain a color prediction result i_q, which is used in step S11;
s10, inputting a feature map F q Sequentially executing the operations of the steps S3, S4 and S5 to obtain a spliced vector Q q ∈R D×1 Vector Q q Global information and detail information of the face picture to be detected are contained;
s11, vector Q q And file i q Cosine distance calculation is carried out on all vectors in the set to obtain a set
Figure SMS_3
I nq For folder i q Number of medium vectors, ++>
Figure SMS_4
Refers to a folder i q The corresponding vector of the g-th picture, and the color prediction classification processing enables the identification of the face picture to be more targeted;
s12, collecting
Figure SMS_5
And sequencing the images according to the distance from large to small, and selecting the images corresponding to the top 5 vectors as prediction results.
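To relate steps S1 to S12 to code, the following minimal Python sketch shows one way the gallery indexing and query lookup could be organized. The callables backbone, color_head, global_head and detail_head are hypothetical stand-ins for the backbone network, color prediction module, global perception module and detail perception module described below; the patent publishes no code, so this is an illustrative sketch that additionally assumes L2-normalized descriptors.

```python
import numpy as np

def build_gallery(pictures, backbone, color_head, global_head, detail_head):
    """Steps S1-S7: index every gallery picture under its predicted color i."""
    folders = {}                                   # color index i -> list of (pic_id, Q_r)
    for pic_id, img in pictures:
        f_p = backbone(img)                        # S1: feature map F_p
        i = color_head(f_p)                        # S2: predicted vehicle color result i
        q_r = np.concatenate([global_head(f_p),    # S3: global vector Q_1
                              detail_head(f_p)])   # S4/S5: detail vector Q_2, spliced
        folders.setdefault(i, []).append((pic_id, q_r))  # S6: store in folder named i
    return folders

def query_top5(img, folders, backbone, color_head, global_head, detail_head):
    """Steps S8-S12: retrieve 5 gallery pictures for the picture to be detected."""
    f_q = backbone(img)                            # S8
    i_q = color_head(f_q)                          # S9: search only the matching color folder
    q_q = np.concatenate([global_head(f_q), detail_head(f_q)])  # S10
    dists = [(pic_id, 1.0 - float(q_q @ v))        # S11: cosine distance l = 1 - J.B^T
             for pic_id, v in folders.get(i_q, [])]
    dists.sort(key=lambda t: t[1], reverse=True)   # S12: ordered from large to small
    return dists[:5]                               # top 5 as the prediction result
```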
Further, the backbone network structure is connected in sequence from the input layer to the output layer as follows: convolution layer conv1_1, Relu layer conv1_1_relu, convolution layer conv1_2, Relu layer conv1_2_relu, pooling layer max_pooling1, convolution layer conv2_1, Relu layer conv2_1_relu, convolution layer conv2_2, BN layer conv2_2_bn, Relu layer conv2_2_relu, pooling layer max_pooling2, convolution layer conv3_1, Relu layer conv3_1_relu, convolution layer conv3_2, Relu layer conv3_2_relu, convolution layer conv3_3, Relu layer conv3_3_relu, pooling layer max_pooling3, convolution layer conv4_1, Relu layer conv4_1_relu, convolution layer conv4_2, Relu layer conv4_2_relu, convolution layer conv4_3, Relu layer conv4_3_relu, pooling layer max_pooling4, convolution layer conv5_1, Relu layer conv5_1_relu, convolution layer conv5_2, Relu layer conv5_2_relu, convolution layer conv5_3, Relu layer conv5_3_relu, pooling layer max_pooling5, convolution layer fc6, Relu layer fc6_relu, convolution layer fc7, Relu layer fc7_relu, convolution layer conv6_1, Relu layer conv6_1_relu, convolution layer conv6_2, Relu layer conv6_2_relu, convolution layer conv7_1, Relu layer conv7_1_relu, convolution layer conv7_2, Relu layer conv7_2_relu, convolution layer conv8_1, Relu layer conv8_1_relu, convolution layer conv8_2, Relu layer conv8_2_relu, convolution layer conv9_1, Relu layer conv9_1_relu, convolution layer conv9_2, Relu layer conv9_2_relu, convolution layer conv10_1, Relu layer conv10_1_relu, convolution layer conv10_2, Relu layer conv10_2_relu.
Further, the color prediction module structure is connected in sequence from the input layer to the output layer as follows: pooling layer global_pooling_1, fc layer fc_1, fc layer fc_2.
Further, the global perception module structure is connected in sequence from the input layer to the output layer as follows: pooling layer global_pooling_2, multidimensional normalization layer conv11_1_in_bn, convolution layer conv11_1, Relu layer conv11_1_relu, convolution layer conv11_2, Relu layer conv11_2_relu; the multidimensional normalization layer conv11_1_in_bn is composed of a BN layer conv11_1_bn and an IN layer conv11_1_in.
Further, the detail perception module structure is connected in sequence from the input layer to the output layer as follows: feature detail compression layer horizontal_compressing_1, BN layer conv12_1_bn, convolution layer conv12_1, Relu layer conv12_1_relu, convolution layer conv12_2, Relu layer conv12_2_relu.
Further, the step S2 is as follows:
s21, feature map F p Inputting to global average pooling layer global_pooling_1 to obtain feature map E a ∈R C×1 For characteristic diagram F p Is compressed according to the information;
s22, combining the characteristic diagram E a Input to fc layer fc_1, fc layer fc_2, and obtain vector E by softmax function f ∈R N ×1 Wherein N is the number of vehicle colors in the data set, and the vehicle colors are predicted through the full-connection structure of the two fc layers;
s23, at vector E f The position index i corresponding to the highest value is obtained, and a vehicle color result i is output;
further, the step S3 is as follows:
s31, feature map F p Input to global_pooling_2 layer to obtain vector V e R D×1 For characteristic diagram F p Is compressed according to the information;
s32, inputting the vector V into the BN layer conv11_1_bn to obtain the vector V b ∈R D×1 The method comprises the steps of carrying out a first treatment on the surface of the Vector V b The vector V is normalized from the batch direction;
s33, inputting the vector V into the IN layer conv11_1_in to obtain the vector V i ∈R D×1 The method comprises the steps of carrying out a first treatment on the surface of the Vector V i Is the vector V normalized from the channel directionResults of the formatting;
s34, vector V b Vector V of AND i Splicing to obtain a vector V c ∈R 2D×1
S35, passing vector V_c through convolution layer conv11_1, Relu layer conv11_1_relu, convolution layer conv11_2 and Relu layer conv11_2_relu to obtain the vector Q_1; vector Q_1 thus fuses the batch-normalized and channel-normalized versions of vector V, which avoids overfitting and the loss of style features.
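A PyTorch sketch of one possible realization of steps S31 to S35 follows. Two points are assumptions: the IN layer's channel-direction normalization is implemented as a per-sample normalization over the D pooled channels, and conv11_1/conv11_2 are taken to be 1×1 convolutions mapping the 2D-dimensional splice back to D dimensions; the patent specifies neither kernel sizes nor layer widths.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GlobalPerceptionModule(nn.Module):
    """Sketch of S31-S35: pooled vector normalized along batch and channel directions."""
    def __init__(self, dim: int = 2048):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)           # global_pooling_2: F_p -> V
        self.bn = nn.BatchNorm1d(dim)                 # conv11_1_bn: batch-direction norm
        self.conv11_1 = nn.Conv1d(2 * dim, dim, kernel_size=1)  # widths are assumptions
        self.conv11_2 = nn.Conv1d(dim, dim, kernel_size=1)

    def forward(self, f_p: torch.Tensor) -> torch.Tensor:
        v = self.pool(f_p).flatten(1)                 # S31: V, shape (batch, D)
        v_b = self.bn(v)                              # S32: normalized across the batch
        # S33: channel-direction norm, assumed per sample over the D channels
        v_i = (v - v.mean(dim=1, keepdim=True)) / (v.std(dim=1, keepdim=True) + 1e-5)
        v_c = torch.cat([v_b, v_i], dim=1)            # S34: splice to (batch, 2D)
        q1 = self.conv11_2(F.relu(self.conv11_1(v_c.unsqueeze(-1))))  # S35
        return F.relu(q1).squeeze(-1)                 # Q_1, shape (batch, D)
```

Feeding both a batch-normalized and a channel-normalized copy of the pooled vector into the convolutions is what the abstract calls multiple normalization calculation from batches and channels.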
Further, the step S4 is as follows:
s41, feature map F p Input into a feature detail compression layer horizontal_compressing_1 for compression, and input into a BN layer conv12_1_bn for normalization to obtain a feature diagram J s ∈R C×H×1
S42, inputting the feature map J_s into convolution layer conv12_1, Relu layer conv12_1_relu, convolution layer conv12_2 and Relu layer conv12_2_relu to obtain the vector Q_2.
Further, the feature detail compression layer compresses detail features to facilitate subsequent recognition; its working process is as follows:
assume that the input of the feature detail compression layer is a feature map M ∈ R^{C×W×H} and the output is a feature map Y ∈ R^{C×H×1};
the feature map M is split along its height direction into a set A* = {A_m | 1 ≤ m ≤ H} of H feature maps, where A_m ∈ R^{C×W×1} is the feature map at the m-th position of set A*;
a feature map A_z in set A* is compressed according to formula (1) to obtain the result k_z, where A_{z,x,y} is the x-th row vector with channel number y in the feature map A_z at the z-th position of set A* (formula (1) is rendered as an image in the original publication and is not reproduced here);
formula (1) is applied to every feature map in set A*, and the results are collected into a column vector T ∈ R^{H×1}, where the value at the t-th position of column vector T is the result of applying formula (1) to the feature map at the t-th position of set A*.
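Because formula (1) survives only as an image, the sketch below assumes the compression is a mean over the width dimension; that reading is consistent with the feature map J_s ∈ R^{C×H×1} produced in step S41, while additionally averaging over channels would instead yield the column vector T ∈ R^{H×1} mentioned above.

```python
import torch
import torch.nn as nn

class HorizontalCompressing(nn.Module):
    """Sketch of the feature detail compression layer (horizontal_compressing_1).

    Assumed reduction: mean over width. Input M has shape (batch, C, W, H);
    output has shape (batch, C, H, 1), matching J_s in step S41. Each height
    slice A_z of shape (C, W, 1) collapses to a C-dimensional vector.
    """
    def forward(self, m: torch.Tensor) -> torch.Tensor:
        return m.mean(dim=2).unsqueeze(-1)  # (batch, C, H, 1)
```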
Further, the cosine distance is used to calculate the similarity between two vectors; the calculation process is as follows:
assume that vector J and vector B undergo cosine distance calculation to obtain a value l, computed as:
l = 1 - J·B^T
where the symbol "·" denotes element-wise multiplication of the elements of the vectors and the superscript "T" denotes the vector transpose operation.
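As a sketch, assuming J and B are L2-normalized one-dimensional vectors (the patent does not state the normalization explicitly), the distance reduces to a single dot product:

```python
import numpy as np

def cosine_distance(j: np.ndarray, b: np.ndarray) -> float:
    """l = 1 - J.B^T for L2-normalized 1-D vectors j and b."""
    return 1.0 - float(np.dot(j, b))

# Identical vectors give distance 0; orthogonal vectors give distance 1.
j = np.array([1.0, 0.0]); b = np.array([0.0, 1.0])
assert cosine_distance(j, j) == 0.0 and cosine_distance(j, b) == 1.0
```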
Compared with the prior art, the invention has the following advantages and effects:
(1) The invention can find the 5 vehicles most similar to the occluded-license-plate picture in the picture library, and the matching unoccluded vehicle picture can be found among these 5 pictures.
(2) Compared with traditional methods, the invention improves the normalization applied to the neural network and the feature vector, improving the feature vector's ability to express the vehicle face picture.
(3) Compared with traditional methods, the invention adds a color prediction module as an aid to subsequent recognition: vehicle faces are classified by color and stored in folders of different colors, making vehicle face recognition more targeted.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiments of the invention and together with the description serve to explain the invention and do not constitute a limitation on the invention. In the drawings:
FIG. 1 is a flowchart of the vehicle face recognition method for occluded license plates disclosed in the present invention.
FIG. 2 is a schematic structural diagram of the vehicle face recognition method for occluded license plates disclosed in the present invention.
FIG. 3 is a bar graph comparing the accuracy of the present method with other methods on the VehicleID dataset in Example 1;
FIG. 4 is a bar graph comparing the accuracy of the present method with other methods on the VeRi dataset in Example 2;
FIG. 5 is a structural diagram of the backbone network in the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Example 1
The public VehicleID dataset is selected for Example 1. Each picture in the dataset carries an id tag corresponding to a real-world identity. The training set contains 13164 vehicles with 113346 pictures in total; the test set contains 2400 vehicles with 19777 pictures.
The method comprises the following steps:
Step S1, acquiring the number N of vehicle color categories in the data set, where N = 9; the number of color categories sets the output dimension of fc layer fc_2 in the color prediction module. 1000 pictures are taken at random from the test set according to vehicle id tags to serve as the library of pictures to be tested.
Step S2, the experiment in this embodiment only needs a general hardware configuration plus a graphics processing unit (GPU) to accelerate computation. The model is built and trained, and the training results are tested, under the PyTorch deep learning framework and the FastReID tool, with CUDA (Compute Unified Device Architecture) used so that the GPU can handle the complex calculations. The operating environment configuration for this experiment is shown in Table 1.
Table 1. Experiment operating environment configuration for this embodiment (the table is rendered as an image in the original publication and is not reproduced here).
Step S3, constructing the vehicle face recognition model for occluded license plates; its structure is shown in FIG. 2, and the specific construction steps are as follows:
Step S31, a backbone network is constructed, as shown in FIG. 5; the backbone network structure is connected in sequence from the input layer to the output layer as follows: convolution layer conv1_1, Relu layer conv1_1_relu, convolution layer conv1_2, Relu layer conv1_2_relu, pooling layer max_pooling1, convolution layer conv2_1, Relu layer conv2_1_relu, convolution layer conv2_2, BN layer conv2_2_bn, Relu layer conv2_2_relu, pooling layer max_pooling2, convolution layer conv3_1, Relu layer conv3_1_relu, convolution layer conv3_2, Relu layer conv3_2_relu, convolution layer conv3_3, Relu layer conv3_3_relu, pooling layer max_pooling3, convolution layer conv4_1, Relu layer conv4_1_relu, convolution layer conv4_2, Relu layer conv4_2_relu, convolution layer conv4_3, Relu layer conv4_3_relu, pooling layer max_pooling4, convolution layer conv5_1, Relu layer conv5_1_relu, convolution layer conv5_2, Relu layer conv5_2_relu, convolution layer conv5_3, Relu layer conv5_3_relu, pooling layer max_pooling5, convolution layer fc6, Relu layer fc6_relu, convolution layer fc7, Relu layer fc7_relu, convolution layer conv6_1, Relu layer conv6_1_relu, convolution layer conv6_2, Relu layer conv6_2_relu, convolution layer conv7_1, Relu layer conv7_1_relu, convolution layer conv7_2, Relu layer conv7_2_relu, convolution layer conv8_1, Relu layer conv8_1_relu, convolution layer conv8_2, Relu layer conv8_2_relu, convolution layer conv9_1, Relu layer conv9_1_relu, convolution layer conv9_2, Relu layer conv9_2_relu, convolution layer conv10_1, Relu layer conv10_1_relu, convolution layer conv10_2, Relu layer conv10_2_relu;
Step S32, a color prediction module is constructed, connected in sequence from the input layer to the output layer as follows: pooling layer global_pooling_1, fc layer fc_1, fc layer fc_2;
Step S33, a global perception module is constructed, connected in sequence from the input layer to the output layer as follows: pooling layer global_pooling_2, multidimensional normalization layer conv11_1_in_bn, convolution layer conv11_1, Relu layer conv11_1_relu, convolution layer conv11_2, Relu layer conv11_2_relu; the multidimensional normalization layer conv11_1_in_bn is composed of a BN layer conv11_1_bn and an IN layer conv11_1_in;
Step S34, a detail perception module is constructed, connected in sequence from the input layer to the output layer as follows: feature detail compression layer horizontal_compressing_1, BN layer conv12_1_bn, convolution layer conv12_1, Relu layer conv12_1_relu, convolution layer conv12_2, Relu layer conv12_2_relu;
Step S4, the pictures of the training set are divided into a picture library and a query library according to vehicle ids; vehicle ids in the query library are unique, while the picture library may contain several pictures with the same vehicle id. The processed training set pictures are used to train the model. During training, the optimizer is Adam, the learning rate is set to 0.00035, the learning-rate momentum to 0.0005, the training batch size to 64 and the number of epochs to 60, and the learning rate is decreased by a factor of 0.1 at epochs 30 and 50.
S5, inputting a vehicle face picture P from the data set into the backbone network to obtain a feature map F_p ∈ R^{C×W×H}, where C is the number of channels of the feature map, W and H are its width and height, and R denotes the real number domain;
Step S6, inputting the feature map F_p into the color prediction module to predict the vehicle color, obtaining a predicted vehicle color result i, where 1 ≤ i ≤ N;
the specific process is as follows:
S61, inputting the feature map F_p into the global average pooling layer global_pooling_1 to obtain a feature map E_a ∈ R^{C×1}, which compresses the information of feature map F_p;
S62, inputting the feature map E_a into fc layer fc_1 and fc layer fc_2, and obtaining a vector E_f ∈ R^{N×1} through a softmax function; the vehicle color is predicted through the fully connected structure of the two fc layers;
S63, taking the position index i corresponding to the highest value in vector E_f and outputting the vehicle color result i.
Step S7, inputting the feature map F_p into the global perception module for prediction to obtain a vector Q_1 ∈ R^{D×1}, where D is the dimension of the column vector; vector Q_1 contains the global information of the vehicle face picture; in Example 1, D = 2048;
the specific process is as follows:
S71, inputting the feature map F_p into the pooling layer global_pooling_2 to obtain a vector V ∈ R^{D×1};
S72, inputting the vector V into the BN layer conv11_1_bn to obtain a vector V_b ∈ R^{D×1};
S73, inputting the vector V into the IN layer conv11_1_in to obtain a vector V_i ∈ R^{D×1};
S74, splicing vector V_b and vector V_i to obtain a vector V_c ∈ R^{2D×1};
S75, passing vector V_c through convolution layer conv11_1, Relu layer conv11_1_relu, convolution layer conv11_2 and Relu layer conv11_2_relu to obtain the vector Q_1.
Step S8, inputting the feature map F_p into the detail perception module for prediction to obtain a vector Q_2; vector Q_2 contains the detail information of the vehicle face picture;
the specific process is as follows:
S41, inputting the feature map F_p into the feature detail compression layer horizontal_compressing_1 for compression and then into the BN layer conv12_1_bn for normalization to obtain a feature map J_s ∈ R^{C×H×1};
S42, inputting the feature map J_s into convolution layer conv12_1, Relu layer conv12_1_relu, convolution layer conv12_2 and Relu layer conv12_2_relu to obtain the vector Q_2.
The working process of the feature detail compression layer in the detail perception module is as follows:
assume that the input of the feature detail compression layer is a feature map M ∈ R^{C×W×H} and the output is a feature map Y ∈ R^{C×H×1};
the feature map M is split along its height direction into a set A* = {A_m | 1 ≤ m ≤ H} of H feature maps, where A_m ∈ R^{C×W×1} is the feature map at the m-th position of set A*;
a feature map A_z in set A* is compressed according to formula (1) to obtain the result k_z, where A_{z,x,y} is the x-th row vector with channel number y in the feature map A_z at the z-th position of set A* (formula (1) is rendered as an image in the original publication and is not reproduced here);
formula (1) is applied to every feature map in set A*, and the results are collected into a column vector T ∈ R^{H×1}, where the value at the t-th position of column vector T is the result of applying formula (1) to the feature map at the t-th position of set A*.
Step S9, splicing vector Q_1 and vector Q_2 to obtain a vector Q_r ∈ R^{D×1}; vector Q_r contains the global information and detail information of the vehicle face picture;
Step S10, according to the predicted vehicle color result i of step S6, storing the vector Q_r in a folder named i;
Step S11, performing steps S5 to S10 in sequence on all vehicle face pictures in the picture library of the test set to obtain a vector set Q_i = {Q_{i,k} | 1 ≤ k ≤ I_n}, where I_n is the number of vectors in folder i and Q_{i,k} is the vector corresponding to the k-th picture in folder i;
Step S12, inputting the vehicle face picture to be detected and executing step S5 to obtain a feature map F_q;
Step S13, inputting the feature map F_q and executing step S6 to obtain a color prediction result i_q, which is used in step S15;
Step S14, inputting the feature map F_q and executing steps S7, S8 and S9 in sequence to obtain a spliced vector Q_q ∈ R^{D×1}; vector Q_q contains the global information and detail information of the vehicle face picture to be detected;
Step S15, performing cosine distance calculation between vector Q_q and all vectors in folder i_q to obtain a set of distances {l_g | 1 ≤ g ≤ I_{n_q}}, where I_{n_q} is the number of vectors in folder i_q and l_g is the cosine distance between Q_q and the vector corresponding to the g-th picture in folder i_q; the color prediction classification makes recognition of the vehicle face picture more targeted;
the cosine distance is calculated as follows:
assume that vector J and vector B undergo cosine distance calculation to obtain a value l, computed as:
l = 1 - J·B^T
where the symbol "·" denotes element-wise multiplication of the elements of the vectors and the superscript "T" denotes the vector transpose operation.
Step S16, sorting the set of distances from large to small and selecting the pictures corresponding to the top 5 vectors as the prediction result.
Step S17, executing steps S12 to S16 in sequence for the pictures in the library to be tested of the test set (the specific flow is shown in FIG. 1), obtaining all prediction results for the library to be tested and calculating the accuracy.
The experimental comparison for Example 1 is shown in FIG. 3. Example 1 is compared with other methods on the 1000-picture library to be tested; a prediction is counted as correct when the 5 returned pictures contain the matching picture. The method achieves 88.1% accuracy, 70.21% higher than the BOW-CN method, 62.47% higher than the LOMO method, 27.61% higher than the FACT method, 27.38% higher than the NuFACT method, 17.81% higher than VAML and 4.73% higher than QD-DLF, which demonstrates the effectiveness of the method: the corresponding unoccluded license plate can be successfully identified.
Example 2
For Example 2, the public VeRi dataset is selected; it contains more than 50,000 images of 776 vehicles, with 37778 training pictures, 11579 pictures in the library and 1678 pictures to be tested. Each picture in the dataset carries an id tag corresponding to a real-world identity.
The method comprises the following steps:
Step S1, acquiring the number N of vehicle color categories in the data set, where N = 9; the number of color categories sets the output dimension of fc layer fc_2 in the color prediction module;
Step S2, the experiment in this embodiment only needs a general hardware configuration plus a graphics processing unit (GPU) to accelerate computation. The model is built and trained, and the training results are tested, under the PyTorch deep learning framework and the FastReID tool, with CUDA (Compute Unified Device Architecture) used so that the GPU can handle the complex calculations. Specifically, the hardware and software configuration adopted in this embodiment is: GPU GeForce RTX 2080Ti; CPU Intel(R) Xeon(R) Silver 4216 CPU @ 2.10GHz; operating system CentOS 8.3.2011; Python 3.6.13; CUDA 11.2.
Step S3, constructing the vehicle face recognition model for occluded license plates; its structure is shown in FIG. 2, and the specific construction steps are as follows:
Step S31, a backbone network is constructed, as shown in FIG. 5; the backbone network structure is connected in sequence from the input layer to the output layer as follows: convolution layer conv1_1, Relu layer conv1_1_relu, convolution layer conv1_2, Relu layer conv1_2_relu, pooling layer max_pooling1, convolution layer conv2_1, Relu layer conv2_1_relu, convolution layer conv2_2, BN layer conv2_2_bn, Relu layer conv2_2_relu, pooling layer max_pooling2, convolution layer conv3_1, Relu layer conv3_1_relu, convolution layer conv3_2, Relu layer conv3_2_relu, convolution layer conv3_3, Relu layer conv3_3_relu, pooling layer max_pooling3, convolution layer conv4_1, Relu layer conv4_1_relu, convolution layer conv4_2, Relu layer conv4_2_relu, convolution layer conv4_3, Relu layer conv4_3_relu, pooling layer max_pooling4, convolution layer conv5_1, Relu layer conv5_1_relu, convolution layer conv5_2, Relu layer conv5_2_relu, convolution layer conv5_3, Relu layer conv5_3_relu, pooling layer max_pooling5, convolution layer fc6, Relu layer fc6_relu, convolution layer fc7, Relu layer fc7_relu, convolution layer conv6_1, Relu layer conv6_1_relu, convolution layer conv6_2, Relu layer conv6_2_relu, convolution layer conv7_1, Relu layer conv7_1_relu, convolution layer conv7_2, Relu layer conv7_2_relu, convolution layer conv8_1, Relu layer conv8_1_relu, convolution layer conv8_2, Relu layer conv8_2_relu, convolution layer conv9_1, Relu layer conv9_1_relu, convolution layer conv9_2, Relu layer conv9_2_relu, convolution layer conv10_1, Relu layer conv10_1_relu, convolution layer conv10_2, Relu layer conv10_2_relu;
Step S32, a color prediction module is constructed, connected in sequence from the input layer to the output layer as follows: pooling layer global_pooling_1, fc layer fc_1, fc layer fc_2;
Step S33, a global perception module is constructed, connected in sequence from the input layer to the output layer as follows: pooling layer global_pooling_2, multidimensional normalization layer conv11_1_in_bn, convolution layer conv11_1, Relu layer conv11_1_relu, convolution layer conv11_2, Relu layer conv11_2_relu; the multidimensional normalization layer conv11_1_in_bn is composed of a BN layer conv11_1_bn and an IN layer conv11_1_in;
Step S34, a detail perception module is constructed, connected in sequence from the input layer to the output layer as follows: feature detail compression layer horizontal_compressing_1, BN layer conv12_1_bn, convolution layer conv12_1, Relu layer conv12_1_relu, convolution layer conv12_2, Relu layer conv12_2_relu;
Step S4, the pictures of the training set are divided into a picture library and a query library according to vehicle ids; vehicle ids in the query library are unique, while the picture library may contain several pictures with the same vehicle id. The processed training set pictures are used to train the model. During training, the optimizer is SGD, the learning rate is set to 0.001, the learning-rate momentum to 0.0005, the training batch size to 64 and the number of epochs to 60, and the learning rate is decreased by a factor of 0.1 at epochs 30 and 50.
S5, inputting a vehicle face picture P from the data set into the backbone network to obtain a feature map F_p ∈ R^{C×W×H}, where C is the number of channels of the feature map, W and H are its width and height, and R denotes the real number domain;
Step S6, inputting the feature map F_p into the color prediction module to predict the vehicle color, obtaining a predicted vehicle color result i, where 1 ≤ i ≤ N;
the specific process is as follows:
S21, inputting the feature map F_p into the global average pooling layer global_pooling_1 to obtain a feature map E_a ∈ R^{C×1}, which compresses the information of feature map F_p;
S22, inputting the feature map E_a into fc layer fc_1 and fc layer fc_2, and obtaining a vector E_f ∈ R^{N×1} through a softmax function; the vehicle color is predicted through the fully connected structure of the two fc layers;
S23, taking the position index i corresponding to the highest value in vector E_f and outputting the vehicle color result i.
Step S7, inputting the feature map F_p into the global perception module for prediction to obtain a vector Q_1 ∈ R^{D×1}, where D is the dimension of the column vector; vector Q_1 contains the global information of the vehicle face picture; in Example 2, D = 2048;
the specific process is as follows:
S71, inputting the feature map F_p into the pooling layer global_pooling_2 to obtain a vector V ∈ R^{D×1};
S72, inputting the vector V into the BN layer conv11_1_bn to obtain a vector V_b ∈ R^{D×1};
S73, inputting the vector V into the IN layer conv11_1_in to obtain a vector V_i ∈ R^{D×1};
S74, splicing vector V_b and vector V_i to obtain a vector V_c ∈ R^{2D×1};
S75, passing vector V_c through convolution layer conv11_1, Relu layer conv11_1_relu, convolution layer conv11_2 and Relu layer conv11_2_relu to obtain the vector Q_1.
Step S8, inputting the feature map F_p into the detail perception module for prediction to obtain a vector Q_2; vector Q_2 contains the detail information of the vehicle face picture;
the specific process is as follows:
S41, inputting the feature map F_p into the feature detail compression layer horizontal_compressing_1 for compression and then into the BN layer conv12_1_bn for normalization to obtain a feature map J_s ∈ R^{C×H×1};
S42, inputting the feature map J_s into convolution layer conv12_1, Relu layer conv12_1_relu, convolution layer conv12_2 and Relu layer conv12_2_relu to obtain the vector Q_2.
The working process of the feature detail compression layer in the detail perception module is as follows:
assume that the input of the feature detail compression layer is a feature map M ∈ R^{C×W×H} and the output is a feature map Y ∈ R^{C×H×1};
the feature map M is split along its height direction into a set A* = {A_m | 1 ≤ m ≤ H} of H feature maps, where A_m ∈ R^{C×W×1} is the feature map at the m-th position of set A*;
a feature map A_z in set A* is compressed according to formula (1) to obtain the result k_z, where A_{z,x,y} is the x-th row vector with channel number y in the feature map A_z at the z-th position of set A* (formula (1) is rendered as an image in the original publication and is not reproduced here);
formula (1) is applied to every feature map in set A*, and the results are collected into a column vector T ∈ R^{H×1}, where the value at the t-th position of column vector T is the result of applying formula (1) to the feature map at the t-th position of set A*.
Step S9, splicing vector Q_1 and vector Q_2 to obtain a vector Q_r ∈ R^{D×1}; vector Q_r contains the global information and detail information of the vehicle face picture;
Step S10, according to the predicted vehicle color result i of step S6, storing the vector Q_r in a folder named i;
Step S11, performing steps S5 to S10 in sequence on all vehicle face pictures in the picture library to obtain a vector set Q_i = {Q_{i,k} | 1 ≤ k ≤ I_n}, where I_n is the number of vectors in folder i and Q_{i,k} is the vector corresponding to the k-th picture in folder i;
Step S12, inputting the vehicle face picture to be detected and executing step S5 to obtain a feature map F_q;
Step S13, inputting the feature map F_q and executing step S6 to obtain a color prediction result i_q, which is used in step S15;
Step S14, inputting the feature map F_q and executing steps S7, S8 and S9 in sequence to obtain a spliced vector Q_q ∈ R^{D×1}; vector Q_q contains the global information and detail information of the vehicle face picture to be detected;
Step S15, performing cosine distance calculation between vector Q_q and all vectors in folder i_q to obtain a set of distances {l_g | 1 ≤ g ≤ I_{n_q}}, where I_{n_q} is the number of vectors in folder i_q and l_g is the cosine distance between Q_q and the vector corresponding to the g-th picture in folder i_q; the color prediction classification makes recognition of the vehicle face picture more targeted;
the cosine distance is calculated as follows:
assume that vector J and vector B undergo cosine distance calculation to obtain a value l, computed as:
l = 1 - J·B^T
where the symbol "·" denotes element-wise multiplication of the elements of the vectors and the superscript "T" denotes the vector transpose operation.
Step S16, sorting the set of distances from large to small and selecting the pictures corresponding to the top 5 vectors as the prediction result.
Step S17, executing steps S12 to S16 in sequence for the pictures in the library to be tested of the test set (the specific flow is shown in FIG. 1), obtaining all prediction results for the library to be tested and calculating the accuracy.
The experimental comparison for Example 2 is shown in FIG. 4. Example 2 is compared with other methods on the 1678-picture library to be tested; a prediction is counted as correct when the 5 returned pictures contain the matching picture. The method achieves 97.3% accuracy, 43.61% higher than the BOW-CN method, 50.82% higher than the LOMO method, 24.42% higher than the FACT method, 5.88% higher than the NuFACT method, 6.48% higher than VAML and 2.84% higher than QD-DLF, which demonstrates the effectiveness of the method: the corresponding unoccluded license plate can be successfully identified.
In summary, the above embodiments disclose a vehicle face recognition method for occluded license plates. Color prediction information and picture vector information are obtained from the color prediction module, the global perception module and the detail perception module; the folder corresponding to the predicted color is then located, cosine distances are computed in turn against the picture vectors in that folder, and the top 5 results, ordered from largest to smallest, are selected as the recognition result. This solves the technical problem of recognizing vehicles with occluded license plates. Compared with traditional methods, an auxiliary vehicle color classification flow is added, recognition is carried out through the global perception module and the detail perception module, and multiple normalization calculations over both the batch and channel directions are performed in the global perception module, which improves retrieval capability and the model's representation of occluded vehicles, so that the corresponding vehicle in its unoccluded state can be successfully found. The high recognition rate of the method is shown by the experimental results of Example 1 and Example 2.
The above examples are preferred embodiments of the present invention, but the embodiments of the present invention are not limited to them; any change, modification, substitution, combination or simplification made without departing from the spirit and principle of the present invention is an equivalent replacement and falls within the protection scope of the present invention.

Claims (10)

1. A vehicle face recognition method for an occluded license plate, characterized by comprising the following steps:
s1, inputting a certain face picture P in a data set into a backbone network to obtain a feature map F p ∈R C×E×H Wherein C represents the number of channels of the feature map, W and H are the width and height of the feature map, and R represents the real number domain;
s2, mapping the characteristic diagram F p Input to colorThe method comprises the steps that a prediction module predicts the color of a vehicle to obtain a predicted vehicle color result i;
s3, mapping the characteristic diagram F p Inputting to global perception module for prediction to obtain vector
Figure FDA0004061264920000011
Wherein D is the dimension of the column vector;
s4, mapping the characteristic diagram F p Inputting to a detail perception module for prediction to obtain a vector
Figure FDA0004061264920000012
S5, splicing vector Q_1 and vector Q_2 to obtain a vector Q_r ∈ R^{D×1};
S6, according to the predicted vehicle color result i of step S2, storing the vector Q_r in a folder named i;
s7, sequentially and repeatedly executing the operations from the step S1 to the step S6 on all the face pictures in the picture library in the data set to obtain a vector set Q i ={Q i,k |1≤k≤I n }, wherein I n For the number of vectors in folder i, Q i,k Refers to a vector corresponding to a kth picture in the folder i;
s8, inputting face pictures to be detected, and sequentially executing the step S1 to obtain a feature map F q
S9, inputting the feature map F_q and executing step S2 to obtain a color prediction result i_q;
S10, inputting the feature map F_q and executing steps S3, S4 and S5 in sequence to obtain a spliced vector Q_q ∈ R^{D×1};
S11, performing cosine distance calculation between vector Q_q and all vectors in folder i_q to obtain a set of distances {l_g | 1 ≤ g ≤ I_{n_q}}, where I_{n_q} is the number of vectors in folder i_q and l_g is the cosine distance between Q_q and the vector corresponding to the g-th picture in folder i_q;
s12, collecting
Figure FDA0004061264920000021
And sequencing the images according to the distance from large to small, and selecting the images corresponding to the top 5 vectors as prediction results.
2. The vehicle face recognition method for an occluded license plate according to claim 1, wherein the backbone network structure is connected in sequence from the input layer to the output layer as follows: convolution layer conv1_1, Relu layer conv1_1_relu, convolution layer conv1_2, Relu layer conv1_2_relu, pooling layer max_pooling1, convolution layer conv2_1, Relu layer conv2_1_relu, convolution layer conv2_2, BN layer conv2_2_bn, Relu layer conv2_2_relu, pooling layer max_pooling2, convolution layer conv3_1, Relu layer conv3_1_relu, convolution layer conv3_2, Relu layer conv3_2_relu, convolution layer conv3_3, Relu layer conv3_3_relu, pooling layer max_pooling3, convolution layer conv4_1, Relu layer conv4_1_relu, convolution layer conv4_2, Relu layer conv4_2_relu, convolution layer conv4_3, Relu layer conv4_3_relu, pooling layer max_pooling4, convolution layer conv5_1, Relu layer conv5_1_relu, convolution layer conv5_2, Relu layer conv5_2_relu, convolution layer conv5_3, Relu layer conv5_3_relu, pooling layer max_pooling5, convolution layer fc6, Relu layer fc6_relu, convolution layer fc7, Relu layer fc7_relu, convolution layer conv6_1, Relu layer conv6_1_relu, convolution layer conv6_2, Relu layer conv6_2_relu, convolution layer conv7_1, Relu layer conv7_1_relu, convolution layer conv7_2, Relu layer conv7_2_relu, convolution layer conv8_1, Relu layer conv8_1_relu, convolution layer conv8_2, Relu layer conv8_2_relu, convolution layer conv9_1, Relu layer conv9_1_relu, convolution layer conv9_2, Relu layer conv9_2_relu, convolution layer conv10_1, Relu layer conv10_1_relu, convolution layer conv10_2, Relu layer conv10_2_relu.
3. The vehicle face recognition method for an occluded license plate according to claim 1, wherein the color prediction module structure is connected in sequence from the input layer to the output layer as follows: pooling layer global_pooling_1, fc layer fc_1, fc layer fc_2.
4. The vehicle face recognition method for an occluded license plate according to claim 1, wherein the global perception module structure is connected in sequence from the input layer to the output layer as follows: pooling layer global_pooling_2, multidimensional normalization layer conv11_1_in_bn, convolution layer conv11_1, Relu layer conv11_1_relu, convolution layer conv11_2, Relu layer conv11_2_relu; the multidimensional normalization layer conv11_1_in_bn is composed of a BN layer conv11_1_bn and an IN layer conv11_1_in.
5. The vehicle face recognition method for an occluded license plate according to claim 1, wherein the detail perception module structure is connected in sequence from the input layer to the output layer as follows: feature detail compression layer horizontal_compressing_1, BN layer conv12_1_bn, convolution layer conv12_1, Relu layer conv12_1_relu, convolution layer conv12_2, Relu layer conv12_2_relu.
6. The vehicle face recognition method for an occluded license plate according to claim 3, wherein step S2 is as follows:
s21, feature map F p Inputting to global average pooling layer global_pooling_1 to obtain feature map E a ∈R C×1
S22, inputting the feature map E_a into fc layer fc_1 and fc layer fc_2, and obtaining a vector E_f ∈ R^{N×1} through a softmax function, where N is the number of vehicle colors in the data set;
s23, at vector E f And (5) taking the position index i corresponding to the highest value, and outputting a vehicle color result i.
7. The vehicle face recognition method for an occluded license plate according to claim 4, wherein step S3 is as follows:
s31, feature map F p Input to global_pooling_2 layer to obtain vector V e R D×1
S32, inputting the vector V into the BN layer conv11_1_bn to obtain the vector V b ∈R D×1
S33, inputting the vector V into the IN layer conv11_1_in to obtain the vector V i ∈R D×1
S34, vector V b Vector V of AND i Splicing to obtain a vector V c ∈R 2D×1
S35, vector V c Vector Q is obtained by convolving layers conv11_1, relu layer conv11_1_relu, convolving layers conv11_2, relu layer conv11_2_relu 1
8. The vehicle face recognition method for an occluded license plate according to claim 5, wherein step S4 is as follows:
s41, feature map F p Inputting to feature detail compression layer horizontal_compressing_1 and BN layer conv12_1_bn to obtain feature diagram J s ∈R C×H×1
S42, feature map J s Input to convolution layers conv12_1, relu layer conv12_1_relu, convolution layers conv12_2, relu layer conv12_2_relu to obtain vector Q 2
9. The vehicle face recognition method for an occluded license plate according to claim 5, wherein the working process of the feature detail compression layer is as follows:
assume that the input of the feature detail compression layer is a feature map M ∈ R^{C×W×H} and the output is a feature map Y ∈ R^{C×H×1};
the feature map M is split along its height direction into a set A* = {A_m | 1 ≤ m ≤ H} of H feature maps, where A_m ∈ R^{C×W×1} is the feature map at the m-th position of set A*;
a feature map A_z in set A* is compressed according to formula (1) to obtain the result k_z, where A_{z,x,y} is the x-th row vector with channel number y in the feature map A_z at the z-th position of set A* (formula (1) is rendered as an image in the original publication and is not reproduced here);
formula (1) is applied to every feature map in set A*, and the results are collected into a column vector T ∈ R^{H×1}, where the value at the t-th position of column vector T is the result of applying formula (1) to the feature map at the t-th position of set A*.
10. The vehicle face recognition method for an occluded license plate according to claim 1, wherein the cosine distance is calculated as follows:
assume that vector J and vector B undergo cosine distance calculation to obtain a value l, computed as:
l = 1 - J·B^T
where the symbol "·" denotes element-wise multiplication of the elements of the vectors and the superscript "T" denotes the vector transpose operation.
CN202310061292.9A 2023-01-18 2023-01-18 Vehicle face recognition method for shielding license plate Pending CN116052150A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310061292.9A CN116052150A (en) 2023-01-18 2023-01-18 Vehicle face recognition method for shielding license plate

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310061292.9A CN116052150A (en) 2023-01-18 2023-01-18 Vehicle face recognition method for shielding license plate

Publications (1)

Publication Number Publication Date
CN116052150A true CN116052150A (en) 2023-05-02

Family

ID=86131084

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310061292.9A Pending CN116052150A (en) 2023-01-18 2023-01-18 Vehicle face recognition method for shielding license plate

Country Status (1)

Country Link
CN (1) CN116052150A (en)


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116740661A (en) * 2023-08-11 2023-09-12 科大国创软件股份有限公司 Method for reversely tracking Mongolian vehicle based on face recognition
CN116740661B (en) * 2023-08-11 2023-12-22 科大国创软件股份有限公司 Method for reversely tracking Mongolian vehicle based on face recognition
CN117745720A (en) * 2024-02-19 2024-03-22 成都数之联科技股份有限公司 Vehicle appearance detection method, device, equipment and storage medium
CN117745720B (en) * 2024-02-19 2024-05-07 成都数之联科技股份有限公司 Vehicle appearance detection method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
CN111553205B (en) Vehicle weight recognition method, system, medium and video monitoring system without license plate information
Tian et al. A dual neural network for object detection in UAV images
Li et al. Deep neural network for structural prediction and lane detection in traffic scene
CN116052150A (en) Vehicle face recognition method for shielding license plate
CN111079674B (en) Target detection method based on global and local information fusion
CN110110689B (en) Pedestrian re-identification method
CN113408492A (en) Pedestrian re-identification method based on global-local feature dynamic alignment
CN111709313B (en) Pedestrian re-identification method based on local and channel combination characteristics
CN112348849A (en) Twin network video target tracking method and device
CN114170516B (en) Vehicle weight recognition method and device based on roadside perception and electronic equipment
CN111814705B (en) Pedestrian re-identification method based on batch blocking shielding network
CN113205026A (en) Improved vehicle type recognition method based on fast RCNN deep learning network
CN111680678A (en) Target area identification method, device, equipment and readable storage medium
CN104881640A (en) Method and device for acquiring vectors
CN116311105B (en) Vehicle re-identification method based on inter-sample context guidance network
CN117689928A (en) Unmanned aerial vehicle detection method for improving yolov5
CN112861970A (en) Fine-grained image classification method based on feature fusion
Fontanel et al. Detecting anomalies in semantic segmentation with prototypes
Liu et al. An end to end framework with adaptive spatio-temporal attention module for human action recognition
Xin et al. Siamraan: Siamese residual attentional aggregation network for visual object tracking
CN113870312A (en) Twin network-based single target tracking method
Li et al. Siamese global location-aware network for visual object tracking
CN113298037B (en) Vehicle weight recognition method based on capsule network
CN115496966A (en) Method and system for generating video confrontation sample in cross-mode
Zheng et al. Dual-relational attention network for vehicle re-identification

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination