CN112686125A - Vehicle type determination method and device, storage medium and electronic device - Google Patents

Vehicle type determination method and device, storage medium and electronic device

Info

Publication number
CN112686125A
CN112686125A
Authority
CN
China
Prior art keywords
vehicle
feature vector
vehicle type
target
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011567540.XA
Other languages
Chinese (zh)
Inventor
张震
余言勋
王耀农
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Dahua Technology Co Ltd
Original Assignee
Zhejiang Dahua Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Dahua Technology Co Ltd filed Critical Zhejiang Dahua Technology Co Ltd
Priority to CN202011567540.XA priority Critical patent/CN112686125A/en
Publication of CN112686125A publication Critical patent/CN112686125A/en
Legal status: Pending

Landscapes

  • Image Analysis (AREA)

Abstract

The embodiment of the invention provides a vehicle type determination method and device, a storage medium, and an electronic device. The method includes: acquiring image information of a vehicle to be tested to obtain a target image; inputting the target image into a multi-task target model to obtain a fusion feature vector output by the multi-task target model, where the fusion feature vector comprises a key point feature vector, a vehicle system feature vector, and a vehicle type feature vector of the vehicle to be tested; and determining the vehicle type of the vehicle to be tested based on the fusion feature vector. The method and device solve the problem in the related art that the vehicle type determination process is complex and inaccurate, and achieve accurate determination of the vehicle type.

Description

Vehicle type determination method and device, storage medium and electronic device
Technical Field
The embodiment of the invention relates to the field of vehicles, and in particular to a vehicle type determination method and device, a storage medium, and an electronic device.
Background
On urban and rural roads, illegal behaviors involving agricultural vehicles are increasingly common, such as driving an unlicensed agricultural vehicle, illegally carrying passengers, illegal refitting, and overloading. These driving behaviors are seriously harmful, and local governments have increased penalties for them.
In the prior art, the determination process of the vehicle type is complex and inaccurate.
In view of the above technical problems, no effective solution has been proposed in the related art.
Disclosure of Invention
The embodiment of the invention provides a vehicle type determination method and device, a storage medium, and an electronic device, so as to at least solve the problem in the related art that the vehicle type determination process is complex and inaccurate.
According to an embodiment of the present invention, there is provided a vehicle type determination method, including: acquiring image information of a vehicle to be tested to obtain a target image; inputting the target image into a multi-task target model to obtain a fusion feature vector output by the multi-task target model, wherein the fusion feature vector comprises a key point feature vector, a vehicle system feature vector, and a vehicle type feature vector of the vehicle to be tested; and determining the vehicle type of the vehicle to be tested based on the fusion feature vector.
According to another embodiment of the present invention, there is provided a vehicle type determination apparatus, including: a first acquisition module, configured to acquire image information of a vehicle to be tested to obtain a target image; a first input module, configured to input the target image into a multi-task target model to obtain a fusion feature vector output by the multi-task target model, wherein the fusion feature vector comprises a key point feature vector, a vehicle system feature vector, and a vehicle type feature vector of the vehicle to be tested; and a first determining module, configured to determine the vehicle type of the vehicle to be tested based on the fusion feature vector.
In an exemplary embodiment, the first input module includes: a first input unit, configured to input the target image into a backbone network in the multitask target model, and obtain N main feature maps of the target image output by the backbone network; a second input unit, configured to input M feature maps in the N main feature maps into a keypoint branch network, so as to obtain a keypoint feature vector output by the keypoint branch network; a third input unit, configured to input K feature maps in the N main feature maps into a vehicle system branch network, so as to obtain a vehicle system feature vector output by the vehicle system branch network; a fourth input unit, configured to input K feature maps in the N main feature maps into a vehicle type branch network, so as to obtain a vehicle type feature vector output by the vehicle type branch network; wherein each of N, M, and K is a natural number of 1 or more, and each of M and K is less than N.
In an exemplary embodiment, in a case where the backbone network includes a plurality of convolutional layers, the M feature maps are located at a P-th layer of the plurality of convolutional layers, and the K feature maps are located at a Q-th layer of the plurality of convolutional layers, where P and Q are natural numbers greater than or equal to 1 counted from the last layer, and P is greater than Q.
In an exemplary embodiment, the second input unit includes: a first prediction subunit, configured to predict positions of the M feature maps by using a convolution layer in the keypoint branch network, so as to obtain keypoint position prediction information; a first determining subunit, configured to determine the keypoint feature vector based on the keypoint location prediction information and the K feature maps.
In an exemplary embodiment, the vehicle system branch network includes at least one fully connected layer, wherein the fully connected layer in the vehicle system branch network is configured to output the vehicle system feature vector.
In an exemplary embodiment, the vehicle type branch network includes at least one fully connected layer, wherein the fully connected layer in the vehicle type branch network is configured to output the vehicle type feature vector.
In an exemplary embodiment, the first determining module includes: a first determining unit, configured to determine that the vehicle type of the vehicle to be tested is the vehicle type of the target vehicle when the vehicle system corresponding to the fusion feature vector is the vehicle system of the target vehicle and the vehicle type corresponding to the vehicle type feature vector is the vehicle type of the target vehicle.
In an exemplary embodiment, the apparatus further includes: a second determining module, configured to determine a vehicle type loss function L_type of the vehicle to be tested before the target image is input into the multi-task target model to obtain the fusion feature vector output by the multi-task target model; a third determining module, configured to determine a vehicle system loss function L_model of the vehicle to be tested; a fourth determining module, configured to determine a key point loss function L_point of the vehicle to be tested; a fifth determining module, configured to determine a target loss function L_final using L_type, L_model, and L_point, where L_final = L_model + α·L_type + β·L_point, and α and β represent weight coefficients of the multi-task original model; and a sixth determining module, configured to train the multi-task original model based on the target loss function to obtain the multi-task target model.
In an exemplary embodiment, the fourth determining module determines L_point according to the following formula:

[Equation image in the original publication: definition of the key point loss L_point in terms of p*_i(h, w) and p_i(h, w) over all H×W pixel positions of the sample image.]

where p*_i(h, w) is used to represent the ground-truth annotation information corresponding to the sample image, p_i(h, w) is used to represent the predicted annotation information corresponding to the sample image, (h, w) is used to represent pixel coordinates in the sample image, H is used to represent the height of the sample image, W is used to represent the width of the sample image, and the sample image is used to train the multi-task original model.
According to a further embodiment of the present invention, there is also provided a computer-readable storage medium having a computer program stored thereon, wherein the computer program is arranged to perform the steps of any of the above method embodiments when executed.
According to yet another embodiment of the present invention, there is also provided an electronic device, including a memory in which a computer program is stored and a processor configured to execute the computer program to perform the steps in any of the above method embodiments.
According to the invention, image information of a vehicle to be tested is acquired to obtain a target image; the target image is input into a multi-task target model to obtain a fusion feature vector output by the multi-task target model, where the fusion feature vector comprises a key point feature vector, a vehicle system feature vector, and a vehicle type feature vector of the vehicle to be tested; and the vehicle type of the vehicle to be tested is determined based on the fusion feature vector. This achieves the purpose of determining the type of the vehicle, solves the problem in the related art that the vehicle type determination process is complex and inaccurate, and achieves accurate and efficient determination of the vehicle type.
Drawings
Fig. 1 is a block diagram of a hardware configuration of a mobile terminal of a vehicle type determination method according to an embodiment of the present invention;
FIG. 2 is a flow chart of a method of vehicle type determination according to an embodiment of the present invention;
FIG. 3 is a schematic illustration of key point locations according to an embodiment of the invention;
FIG. 4 is a schematic structural diagram of annotation information according to an embodiment of the present invention;
fig. 5 is a block diagram of the structure of a vehicle type determination device according to an embodiment of the present invention.
Detailed Description
Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings in conjunction with the embodiments.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.
The method embodiments provided in the embodiments of the present application may be executed in a mobile terminal, a computer terminal, or a similar computing device. Taking an example of the method performed by a mobile terminal, fig. 1 is a block diagram of a hardware structure of the mobile terminal according to an embodiment of the present invention. As shown in fig. 1, the mobile terminal may include one or more (only one shown in fig. 1) processors 102 (the processor 102 may include, but is not limited to, a processing device such as a microprocessor MCU or a programmable logic device FPGA), and a memory 104 for storing data, wherein the mobile terminal may further include a transmission device 106 for communication functions and an input-output device 108. It will be understood by those skilled in the art that the structure shown in fig. 1 is only an illustration, and does not limit the structure of the mobile terminal. For example, the mobile terminal may also include more or fewer components than shown in FIG. 1, or have a different configuration than shown in FIG. 1.
The memory 104 may be used to store a computer program, for example, a software program and a module of application software, such as a computer program corresponding to the method for determining a vehicle type according to the embodiment of the present invention, and the processor 102 executes various functional applications and data processing by running the computer program stored in the memory 104, so as to implement the method described above. The memory 104 may include high speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, which may be connected to the mobile terminal over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is used for receiving or transmitting data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of the mobile terminal. In one example, the transmission device 106 includes a Network adapter (NIC), which can be connected to other Network devices through a base station so as to communicate with the internet. In one example, the transmission device 106 may be a Radio Frequency (RF) module, which is used for communicating with the internet in a wireless manner.
In the present embodiment, a vehicle type determination method is provided, and fig. 2 is a flowchart of a vehicle type determination method according to an embodiment of the present invention, as shown in fig. 2, the flowchart includes the steps of:
step S202, acquiring image information of a vehicle to be detected to obtain a target image;
Optionally, this embodiment may be applied, but is not limited, to scenarios in which the type of a vehicle is determined, for example, detecting whether a vehicle is an agricultural vehicle. In this embodiment, the target image is annotated with key information of the vehicle to be tested; for example, the vehicle system, the vehicle type, the vehicle logo, the left lamp, and the right lamp of the vehicle to be tested are annotated as the key information.
Optionally, the target image may be, but is not limited to being, captured by a camera, for example, a surveillance camera installed at a traffic junction that photographs passing vehicles.
Step S204, inputting the target image into a multitask target model to obtain a fusion feature vector output by the multitask target model, wherein the fusion feature vector comprises a key point feature vector, a vehicle system feature vector and a vehicle type feature vector of a vehicle to be detected;
Optionally, in this embodiment, the multi-task target model is a model based on a convolutional neural network. The multi-task target model includes a target detection model for detecting vehicle information of the target vehicle; the target detection model may be, for example, SSD, YOLO, FasterRCNN, CenterNet, or FCOS, and the backbone network in the target detection model may be a network such as ResNet, Inception, DenseNet, or MobileNet. The multi-task target model also includes a classification model into which the target image is input to classify the type of the target vehicle; for example, the classification model may be RegNetY.
The vehicle information includes, but is not limited to, characteristics of the vehicle, such as the vehicle type, the vehicle system, and the vehicle logo. As shown in fig. 3, point A represents the vehicle logo key point, point B represents the left lamp key point, and point C represents the right lamp key point. The vehicle system types include common series such as Audi, BMW, and Mercedes-Benz, as well as agricultural vehicle series such as Wuzheng, Shifeng, and Juli. The vehicle types include cars, Sport Utility Vehicles (SUVs), vans, trucks, coaches, and agricultural vehicles, where the vehicle type corresponding to an agricultural vehicle series is the agricultural vehicle.
And step S206, determining the vehicle type of the vehicle to be detected based on the fusion feature vector.
The execution subject of the above steps may be a terminal, but is not limited thereto.
Through the steps, the target image is obtained by acquiring the image information of the vehicle to be detected; inputting a target image into a multi-task target model to obtain a fusion feature vector output by the multi-task target model, wherein the fusion feature vector comprises a key point feature vector, a vehicle system feature vector and a vehicle type feature vector of a vehicle to be detected; and determining the vehicle type of the vehicle to be tested based on the fusion feature vector. The purpose of determining the type of the vehicle is achieved. Therefore, the problems that the determination process of the vehicle type is complex and inaccurate in the related art can be solved, and the effect of accurately and efficiently determining the vehicle type is achieved.
In an exemplary embodiment, inputting a target image into a multitask target model to obtain a fused feature vector output by the multitask target model, includes:
S1, inputting the target image into a backbone network in the multi-task target model to obtain N main feature maps of the target image output by the backbone network;
S2, inputting M feature maps of the N main feature maps into the key point branch network to obtain the key point feature vector output by the key point branch network;
S3, inputting K feature maps of the N main feature maps into the vehicle system branch network to obtain the vehicle system feature vector output by the vehicle system branch network;
S4, inputting K feature maps of the N main feature maps into the vehicle type branch network to obtain the vehicle type feature vector output by the vehicle type branch network;
where N, M, and K are all natural numbers greater than or equal to 1, and M and K are both less than N.
Optionally, in this embodiment, the target image is input into the multi-task target model and passes through the backbone network of the model to obtain N feature maps of different sizes. The M feature maps at the P-th layer from the end (the penultimate layer) are F_m1 ∈ R^(D×H2×W2), where D denotes the number of channels, H2 the height of the feature map, and W2 its width. The K feature maps at the Q-th layer from the end (the last layer) have size F_m2 ∈ R^(C×H×W).
In an exemplary embodiment, in a case where a plurality of convolutional layers are included in the backbone network, the M feature maps are located at a P-th layer of the plurality of convolutional layers, and the K feature maps are located at a Q-th layer of the plurality of convolutional layers, where P and Q are natural numbers greater than or equal to 1 counted from the last layer, and P is greater than Q.
Optionally, in this embodiment, the end of the target model is divided into three subtasks: a vehicle type task, a key point task, and a vehicle system task. Each branch task corresponds to different vehicle annotation information and is attached to a different network layer. For example, the vehicle system branch adds two fully connected layers, f_y1 and f_y2, after the last-layer feature map F_m2. The vehicle type branch adds another two fully connected layers, f_t1 and f_t2, after the same layer F_m2. The key point branch connects two convolutional layers after F_m1, namely F_p1 ∈ R^(E×H2×W2) and F_p2 ∈ R^(3×H×W).
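To make the branch layout concrete, the following is a minimal PyTorch sketch of the three-branch structure. It is not the patent's implementation: the small convolutional stack stands in for the RegNetY backbone, and the channel sizes D, E, and C, the class counts, and the global average pooling before the fully connected heads are all illustrative assumptions.

    import torch
    import torch.nn as nn

    class MultiTaskVehicleModel(nn.Module):
        """Sketch of the three-branch multi-task model (all sizes illustrative)."""

        def __init__(self, num_series=100, num_types=6, D=256, E=128, C=512):
            super().__init__()
            # Toy backbone standing in for RegNetY: stage_p emits the penultimate-layer
            # map F_m1 (D x H2 x W2), stage_q the last-layer map F_m2 (C x H x W),
            # with H2 = 2H and W2 = 2W.
            self.stem = nn.Sequential(nn.Conv2d(3, D, 3, stride=4, padding=1), nn.ReLU())
            self.stage_p = nn.Sequential(nn.Conv2d(D, D, 3, stride=2, padding=1), nn.ReLU())
            self.stage_q = nn.Sequential(nn.Conv2d(D, C, 3, stride=2, padding=1), nn.ReLU())
            # Vehicle system branch: two fully connected layers f_y1, f_y2 after F_m2.
            self.f_y1, self.f_y2 = nn.Linear(C, 256), nn.Linear(256, num_series)
            # Vehicle type branch: another two fully connected layers f_t1, f_t2 after F_m2.
            self.f_t1, self.f_t2 = nn.Linear(C, 256), nn.Linear(256, num_types)
            # Key point branch: two convolutional layers after F_m1; F_p2 has three
            # channels (logo, left lamp, right lamp) and strides down to H x W.
            self.F_p1 = nn.Conv2d(D, E, 3, padding=1)
            self.F_p2 = nn.Conv2d(E, 3, 3, stride=2, padding=1)

        def forward(self, x):
            f_m1 = self.stage_p(self.stem(x))       # F_m1: [B, D, H2, W2]
            f_m2 = self.stage_q(f_m1)               # F_m2: [B, C, H, W]
            # Sigmoid turns the F_p2 activations into per-pixel key point heatmaps.
            heat = torch.sigmoid(self.F_p2(torch.relu(self.F_p1(f_m1))))
            pooled = f_m2.mean(dim=(2, 3))          # global average pooling (assumption)
            f_y2 = self.f_y2(torch.relu(self.f_y1(pooled)))   # vehicle system vector
            f_t2 = self.f_t2(torch.relu(self.f_t1(pooled)))   # vehicle type vector
            return f_y2, f_t2, heat, f_m2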
Optionally, a sigmoid function is applied to F_p2 to predict, at each position of the corresponding channel in the feature map, the final key point prediction result (corresponding to the first key point result). The vehicle key point result F_p2 and the feature map F_m2 are then multiplied cross-channel and pixel by pixel to obtain a new feature vector f_p3 ∈ R^(C×3) (corresponding to the first feature vector). The calculation method is as follows: the pixel value p_i(h, w) ∈ R in the i-th channel of F_p2 is multiplied by the corresponding vector F(h, w) ∈ R^C of F_m2, and the product is divided by the sum of all values of p_i(h, w), giving the vector f_i. The three key points generate three such vectors, which are merged into the vector f_p3. The specific formula is:

f_i = ( Σ_{h,w} p_i(h, w) · F(h, w) ) / ( Σ_{h,w} p_i(h, w) + ε )

f_p3 = [f_1, f_2, f_3], where ε = 10^-6 and i = 1, 2, 3.
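As a sketch, this weighted pooling can be written directly from the formula above; the batch dimension and tensor shapes ([B, 3, H, W] heatmaps from the sigmoid over F_p2, [B, C, H, W] for F_m2) are conveniences added here, not specified in the patent.

    import torch

    def keypoint_weighted_pool(p, f_m2, eps=1e-6):
        # f_i = sum_{h,w} p_i(h,w) * F(h,w) / (sum_{h,w} p_i(h,w) + eps), per key point i.
        num = torch.einsum("bkhw,bchw->bkc", p, f_m2)   # weighted feature sums, [B, 3, C]
        den = p.sum(dim=(2, 3)).unsqueeze(-1) + eps     # heatmap mass per key point, [B, 3, 1]
        return (num / den).flatten(1)                   # f_p3 = [f_1, f_2, f_3], [B, 3*C]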
In an exemplary embodiment, inputting M feature maps of the N main feature maps into the keypoint branch network to obtain keypoint feature vectors output by the convolutional layers in the keypoint branch network, includes:
s1, predicting the positions of the M feature maps by utilizing the convolution layer in the key point branch network to obtain the position prediction information of the key point;
and S2, determining a key point feature vector based on the key point position prediction information and the K feature maps.
In one exemplary embodiment, the vehicle system branch network comprises at least one fully connected layer, wherein the fully connected layer in the vehicle system branch network is used for outputting the vehicle system feature vector.
In one exemplary embodiment, the vehicle type branch network comprises at least one fully connected layer, wherein the fully connected layer in the vehicle type branch network is used for outputting the vehicle type feature vector.
In one exemplary embodiment, determining the vehicle type of the vehicle under test based on the fused feature vector comprises:
and S1, determining that the vehicle type of the vehicle to be tested is the vehicle type of the target vehicle under the condition that the vehicle system corresponding to the fusion feature vector is the vehicle system of the target vehicle and the vehicle type corresponding to the vehicle type feature vector is the vehicle type of the target vehicle.
Optionally, in this embodiment, when the vehicle system result corresponding to the vehicle is a vehicle system of an agricultural vehicle, and the vehicle type result is a vehicle type of the agricultural vehicle, it is determined that the vehicle is the agricultural vehicle.
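Expressed as code, this joint decision rule looks like the sketch below; the label strings and the set of agricultural vehicle series are illustrative stand-ins, not values fixed by the patent.

    AGRICULTURAL_SERIES = {"Wuzheng", "Shifeng", "Juli"}  # example agricultural brands

    def is_agricultural(series_label: str, type_label: str) -> bool:
        # Judge "agricultural" only when BOTH the series branch and the type
        # branch agree, avoiding the instability of a single attribute result.
        return series_label in AGRICULTURAL_SERIES and type_label == "agricultural"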
In an exemplary embodiment, before inputting the target image into the multitask target model and obtaining the fused feature vector output by the multitask target model, the method further includes:
S1, determining a vehicle type loss function L_type of the vehicle to be tested;
S2, determining a vehicle system loss function L_model of the vehicle to be tested;
S3, determining a key point loss function L_point of the vehicle to be tested;
S4, determining a target loss function L_final using L_type, L_model, and L_point, where L_final = L_model + α·L_type + β·L_point, and α and β represent weight coefficients of the multi-task original model;
and S5, training the multi-task original model based on the target loss function to obtain a multi-task target model.
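A minimal sketch of the combined objective follows. The cross-entropy losses for the vehicle type and vehicle system tasks match what the description states for these tasks further below; the per-pixel binary cross-entropy standing in for L_point is an assumption (the patent gives that formula only as an image), and the α, β defaults are illustrative.

    import torch.nn.functional as F

    def total_loss(type_logits, type_target, series_logits, series_target,
                   point_pred, point_target, alpha=1.0, beta=1.0):
        # L_final = L_model + alpha * L_type + beta * L_point
        l_type = F.cross_entropy(type_logits, type_target)          # vehicle type loss
        l_model = F.cross_entropy(series_logits, series_target)     # vehicle system loss
        l_point = F.binary_cross_entropy(point_pred, point_target)  # assumed form; heatmaps in [0, 1]
        return l_model + alpha * l_type + beta * l_point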
In one exemplary embodiment, determining the key point loss function L_point of the vehicle to be tested includes:

[Equation image in the original publication: definition of L_point in terms of p*_i(h, w) and p_i(h, w) over all H×W pixel positions of the sample image.]

where p*_i(h, w) is used to represent the ground-truth annotation information corresponding to the sample image, p_i(h, w) is used to represent the predicted annotation information corresponding to the sample image, (h, w) is used to represent pixel coordinates in the sample image, H is used to represent the height of the sample image, W is used to represent the width of the sample image, and the sample image is used to train the multi-task original model.
Optionally, the annotation information of the vehicle includes, for example, the vehicle system, the vehicle type, the vehicle logo key point, the left lamp key point, and the right lamp key point.
The invention is illustrated below with reference to specific examples:
The vehicle to be tested in this embodiment is described taking an agricultural vehicle as an example. In this embodiment, pictures of urban and rural roads are obtained, the motor vehicles in the pictures are detected, and pictures of the motor vehicles are obtained. The motor vehicle pictures are annotated with the vehicle system, the vehicle type, and the vehicle logo, left lamp, and right lamp key point information. Using the vehicle pictures and the corresponding annotation information, the network model for agricultural vehicle recognition is trained with a multi-task training method to obtain the target model. The backbone network in the target model adopts RegNetY, and vehicle type, vehicle system, and local key point branch networks are connected to the end of the network model.
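Under the same assumptions as the sketches above, one training step of this multi-task method might look like the following; the dummy batch, the SGD optimizer, and the heatmap targets are illustrative only.

    import torch

    model = MultiTaskVehicleModel()                      # sketch model from above
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

    images = torch.randn(4, 3, 224, 224)                 # dummy annotated batch
    type_t = torch.randint(0, 6, (4,))                   # vehicle type labels
    series_t = torch.randint(0, 100, (4,))               # vehicle system labels
    point_t = torch.rand(4, 3, 14, 14)                   # key point heatmap targets at F_m2 resolution

    f_y2, f_t2, heat, _ = model(images)
    loss = total_loss(f_t2, type_t, f_y2, series_t, heat, point_t, alpha=1.0, beta=1.0)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()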
For example, the vehicle picture is passed through the trained convolutional neural network model, which outputs the vehicle system and vehicle type information of the vehicle; when the vehicle system is an agricultural vehicle brand and the vehicle type is the agricultural vehicle type, the vehicle is judged to be an agricultural vehicle.
The method comprises the following specific steps:
and marking the vehicles in the scenes of urban roads and rural roads, wherein the marking result comprises the position information of each vehicle. And training a target detection model by using the labeling result, wherein the target detection model comprises but is not limited to SSD, Yolo, FasterRCNN, CenterNet, FCOS and the like, and the backbone network can be a network such as ResNet, increment, DenseNet, MobileNet and the like.
After the vehicle target is obtained, the vehicle system, vehicle type, vehicle logo key point, left lamp key point, and right lamp key point of the vehicle are annotated. The key point positions are shown in fig. 3: point A represents the vehicle logo key point, point B represents the left lamp key point, and point C represents the right lamp key point. The vehicle system types include common series such as Audi, BMW, and Mercedes-Benz, as well as agricultural vehicle series such as Wuzheng, Shifeng, and Juli. The vehicle types include cars, SUVs, vans, trucks, passenger cars, and agricultural vehicles, where the vehicle type corresponding to an agricultural vehicle series is the agricultural vehicle.
For example, a RegNetY network is used as a backbone network, an input picture is subjected to a series of convolution operations, and is divided into three subtasks at the end of the network, wherein the three subtasks are respectively a vehicle type task, a key point task and a vehicle series task, and each branch task corresponds to different vehicle marking information, as shown in fig. 4.
The input vehicle picture first passes through the backbone network RegNetY to obtain feature maps of different sizes. The feature map of the penultimate layer is F_m1 ∈ R^(D×H2×W2), where D denotes the number of channels and H2 and W2 denote the height and width of the feature map. The feature map of the last layer has size F_m2 ∈ R^(C×H×W). The vehicle system branch adds two fully connected layers, f_y1 and f_y2, after F_m2; the vehicle type branch adds another two fully connected layers, f_t1 and f_t2, after the same layer. The key point branch connects two convolutional layers after F_m1, namely F_p1 ∈ R^(E×H2×W2) and F_p2 ∈ R^(3×H×W). A sigmoid function is applied to F_p2 to predict each position of the corresponding channel as the final key point prediction result. The vehicle key point result F_p2 and F_m2 are multiplied cross-channel, pixel by pixel, to obtain a new feature vector f_p3 ∈ R^(C×3). The multiplication method is as follows: the pixel value p_i(h, w) ∈ R in the i-th channel of F_p2 is multiplied by the corresponding vector F(h, w) ∈ R^C of F_m2 and divided by the sum of all values of p_i(h, w), giving the vector f_i; the three key points generate three vectors, which are merged into f_p3, as in the formula below, where ε = 10^-6 and i = 1, 2, 3.

f_i = ( Σ_{h,w} p_i(h, w) · F(h, w) ) / ( Σ_{h,w} p_i(h, w) + ε )

f_p3 = [f_1, f_2, f_3];
After f_t2, f_p3, and f_y2 are obtained, the three feature vectors are combined into a single feature vector f_all through a concat operation, which serves as the feature vector for vehicle system classification.
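In code, this fusion step is a single concatenation (the names follow the sketches above; the classifier applied to f_all is implied by the description but not shown here):

    f_all = torch.cat([f_t2, f_p3, f_y2], dim=1)  # fused feature vector for vehicle system classification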
The vehicle type and vehicle system tasks use cross-entropy loss functions, denoted L_type and L_model respectively. The loss function L_point of the vehicle key points is defined below, where p*_i(h, w) represents the ground-truth annotation information.
[Equation image in the original publication: definition of the key point loss L_point in terms of p*_i(h, w) and the predicted p_i(h, w) over all pixel positions.]
The final loss function is a weighted sum of the loss functions corresponding to the vehicle system, the vehicle type, and the vehicle key points, where α and β are weight coefficients:

L_final = L_model + α·L_type + β·L_point
The vehicle picture is passed through the trained network model to obtain the vehicle type information and vehicle system information of the vehicle; when the vehicle system result corresponding to the vehicle is an agricultural vehicle system and the vehicle type result is the agricultural vehicle type, the vehicle is judged to be an agricultural vehicle.
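Putting the sketches together, an end-to-end inference pass might look like this; the preprocessing, the file name, and the label tables are assumptions for illustration, and a real system would first load trained weights.

    import torch
    from PIL import Image
    import torchvision.transforms as T

    TYPE_LABELS = ["car", "SUV", "van", "truck", "coach", "agricultural"]
    SERIES_LABELS = ["Audi", "BMW", "Mercedes-Benz", "Wuzheng", "Shifeng", "Juli"]

    model = MultiTaskVehicleModel(num_series=len(SERIES_LABELS), num_types=len(TYPE_LABELS))
    model.eval()  # e.g. model.load_state_dict(torch.load("weights.pt")) in practice

    preprocess = T.Compose([T.Resize((224, 224)), T.ToTensor()])
    image = preprocess(Image.open("vehicle.jpg").convert("RGB")).unsqueeze(0)

    with torch.no_grad():
        f_y2, f_t2, heat, f_m2 = model(image)
        f_p3 = keypoint_weighted_pool(heat, f_m2)     # local key point semantics
        f_all = torch.cat([f_t2, f_p3, f_y2], dim=1)  # fused feature vector

    series = SERIES_LABELS[f_y2.argmax(1).item()]
    vehicle_type = TYPE_LABELS[f_t2.argmax(1).item()]
    print(series, vehicle_type, is_agricultural(series, vehicle_type))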
In summary, this embodiment adopts a deep convolutional neural network trained directly end to end: no additional local input pictures or other pre-processing information is needed; the vehicle picture is input directly, and the corresponding vehicle logo, vehicle type, and vehicle key point information are output. Through multi-task training, the vehicle key points are multiplied with the convolutional feature map to fuse the local semantic features of the lamp and logo regions, and the global vehicle type features are added to form the feature vector for vehicle classification, which improves classification accuracy. Judging whether a vehicle is an agricultural vehicle by combining the agricultural vehicle system result with the vehicle type classification result avoids the instability of a single attribute result.
Through the above description of the embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but the former is a better implementation mode in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.
In this embodiment, a vehicle type determining apparatus is further provided, and the apparatus is used to implement the foregoing embodiments and preferred embodiments, which have already been described and are not described again. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. Although the means described in the embodiments below are preferably implemented in software, an implementation in hardware, or a combination of software and hardware is also possible and contemplated.
Fig. 5 is a block diagram showing the configuration of a vehicle type determination apparatus according to an embodiment of the present invention, which includes, as shown in fig. 5:
the first obtaining module 52 is configured to obtain image information of a vehicle to be detected, so as to obtain a target image;
the first input module 54 is configured to input the target image into the multitask target model to obtain a fusion feature vector output by the multitask target model, where the fusion feature vector includes a key point feature vector of the vehicle to be tested, a vehicle system feature vector, and a vehicle type feature vector;
and a first determination module 56 for determining the vehicle type of the vehicle to be tested based on the fused feature vector.
In an exemplary embodiment, the first input module includes: a first input unit, configured to input the target image into a backbone network in the multitask target model, and obtain N main feature maps of the target image output by the backbone network; a second input unit, configured to input M feature maps in the N main feature maps into a keypoint branch network, so as to obtain a keypoint feature vector output by the keypoint branch network; a third input unit, configured to input K feature maps in the N main feature maps into a vehicle system branch network, so as to obtain a vehicle system feature vector output by the vehicle system branch network; a fourth input unit, configured to input K feature maps in the N main feature maps into a vehicle type branch network, so as to obtain a vehicle type feature vector output by the vehicle type branch network; wherein each of N, M, and K is a natural number of 1 or more, and each of M and K is less than N.
In an exemplary embodiment, in a case where the backbone network includes a plurality of convolutional layers, the M feature maps are located at a P-th layer of the plurality of convolutional layers, and the K feature maps are located at a Q-th layer of the plurality of convolutional layers, where P and Q are natural numbers greater than or equal to 1 counted from the last layer, and P is greater than Q.
In an exemplary embodiment, the second input unit includes: a first prediction subunit, configured to predict positions of the M feature maps by using a convolution layer in the keypoint branch network, so as to obtain keypoint position prediction information; a first determining subunit, configured to determine the keypoint feature vector based on the keypoint location prediction information and the K feature maps.
In an exemplary embodiment, the vehicle system branch network includes at least one fully connected layer, wherein the fully connected layer in the vehicle system branch network is configured to output the vehicle system feature vector.
In an exemplary embodiment, the vehicle type branch network includes at least one fully connected layer, wherein the fully connected layer in the vehicle type branch network is configured to output the vehicle type feature vector.
In an exemplary embodiment, the first determining module includes: a first determining unit, configured to determine that the vehicle type of the vehicle to be tested is the vehicle type of the target vehicle when the vehicle system corresponding to the fusion feature vector is the vehicle system of the target vehicle and the vehicle type corresponding to the vehicle type feature vector is the vehicle type of the target vehicle.
In an exemplary embodiment, the apparatus further includes: a second determining module, configured to determine a vehicle type loss function L_type of the vehicle to be tested before the target image is input into the multi-task target model to obtain the fusion feature vector output by the multi-task target model; a third determining module, configured to determine a vehicle system loss function L_model of the vehicle to be tested; a fourth determining module, configured to determine a key point loss function L_point of the vehicle to be tested; a fifth determining module, configured to determine a target loss function L_final using L_type, L_model, and L_point, where L_final = L_model + α·L_type + β·L_point, and α and β represent weight coefficients of the multi-task original model; and a sixth determining module, configured to train the multi-task original model based on the target loss function to obtain the multi-task target model.
In an exemplary embodiment, the fourth determining module determines L_point according to the following formula:

[Equation image in the original publication: definition of the key point loss L_point in terms of p*_i(h, w) and p_i(h, w) over all H×W pixel positions of the sample image.]

where p*_i(h, w) is used to represent the ground-truth annotation information corresponding to the sample image, p_i(h, w) is used to represent the predicted annotation information corresponding to the sample image, (h, w) is used to represent pixel coordinates in the sample image, H is used to represent the height of the sample image, W is used to represent the width of the sample image, and the sample image is used to train the multi-task original model.
It should be noted that, the above modules may be implemented by software or hardware, and for the latter, the following may be implemented, but not limited to: the modules are all positioned in the same processor; alternatively, the modules are respectively located in different processors in any combination.
Embodiments of the present invention also provide a computer-readable storage medium having a computer program stored thereon, wherein the computer program is arranged to perform the steps of any of the above-mentioned method embodiments when executed.
In the present embodiment, the above-mentioned computer-readable storage medium may be configured to store a computer program for executing the steps of:
s1, acquiring image information of the vehicle to be detected to obtain a target image;
s2, inputting the target image into the multitask target model to obtain a fusion feature vector output by the multitask target model, wherein the fusion feature vector comprises a key point feature vector, a vehicle system feature vector and a vehicle type feature vector of a vehicle to be detected;
and S3, determining the vehicle type of the vehicle to be tested based on the fusion feature vector.
In an exemplary embodiment, the computer-readable storage medium may include, but is not limited to: various media capable of storing computer programs, such as a usb disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.
Embodiments of the present invention also provide an electronic device comprising a memory having a computer program stored therein and a processor arranged to run the computer program to perform the steps of any of the above method embodiments.
In an exemplary embodiment, the electronic apparatus may further include a transmission device and an input/output device, wherein the transmission device is connected to the processor, and the input/output device is connected to the processor.
In an exemplary embodiment, the processor may be configured to execute the following steps by a computer program:
s1, acquiring image information of the vehicle to be detected to obtain a target image;
s2, inputting the target image into the multitask target model to obtain a fusion feature vector output by the multitask target model, wherein the fusion feature vector comprises a key point feature vector, a vehicle system feature vector and a vehicle type feature vector of a vehicle to be detected;
and S3, determining the vehicle type of the vehicle to be tested based on the fusion feature vector.
For specific examples in this embodiment, reference may be made to the examples described in the above embodiments and exemplary embodiments, and details of this embodiment are not repeated herein.
It will be apparent to those skilled in the art that the various modules or steps of the invention described above may be implemented using a general purpose computing device, they may be centralized on a single computing device or distributed across a network of computing devices, and they may be implemented using program code executable by the computing devices, such that they may be stored in a memory device and executed by the computing device, and in some cases, the steps shown or described may be performed in an order different than that described herein, or they may be separately fabricated into various integrated circuit modules, or multiple ones of them may be fabricated into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the principle of the present invention should be included in the protection scope of the present invention.

Claims (12)

1. A method of determining a type of a vehicle, comprising:
acquiring image information of a vehicle to be detected to obtain a target image;
inputting the target image into a multitask target model to obtain a fusion feature vector output by the multitask target model, wherein the fusion feature vector comprises a key point feature vector, a vehicle system feature vector and a vehicle type feature vector of the vehicle to be detected;
and determining the vehicle type of the vehicle to be tested based on the fusion feature vector.
2. The method of claim 1, wherein inputting the target image into a multitask target model to obtain a fused feature vector output by the multitask target model comprises:
inputting the target image into a backbone network in the multitask target model to obtain N main feature maps of the target image output by the backbone network;
inputting M feature maps of the N main feature maps into a key point branch network to obtain the key point feature vector output by the key point branch network;

inputting K feature maps of the N main feature maps into a vehicle system branch network to obtain the vehicle system feature vector output by the vehicle system branch network;

inputting K feature maps of the N main feature maps into a vehicle type branch network to obtain the vehicle type feature vector output by the vehicle type branch network;
wherein each of N, M, and K is a natural number greater than or equal to 1, and each of M and K is less than N.
3. The method of claim 2, comprising:
in a case where a plurality of convolutional layers are included in the backbone network, the M feature maps are located at a P-th layer of the plurality of convolutional layers, and the K feature maps are located at a Q-th layer of the plurality of convolutional layers, where P and Q are both natural numbers greater than or equal to 1 counted from the last layer, and P is greater than Q.
4. The method of claim 2, wherein inputting M of the N main feature maps into a keypoint branch network to obtain the keypoint feature vector output by a convolutional layer in the keypoint branch network comprises:
predicting the positions of the M characteristic graphs by utilizing the convolution layer in the key point branch network to obtain key point position prediction information;
determining the keypoint feature vector based on the keypoint location prediction information and the K feature maps.
5. The method of claim 2, comprising:
the vehicle system branch network comprises at least one fully connected layer, wherein the fully connected layer in the vehicle system branch network is used for outputting the vehicle system feature vector.
6. The method of claim 2, comprising:
the vehicle type branch network at least comprises a full connection layer, wherein the full connection layer in the vehicle type branch network is used for outputting the vehicle type feature vector.
7. The method of claim 1, wherein determining the vehicle type of the vehicle under test based on the fused feature vector comprises:
and determining that the vehicle type of the vehicle to be tested is the vehicle type of the target vehicle under the condition that the vehicle system corresponding to the fusion feature vector is the vehicle system of the target vehicle and the vehicle type corresponding to the vehicle type feature vector is the vehicle type of the target vehicle.
8. The method of claim 1, wherein before inputting the target image into a multitask target model and obtaining a fused feature vector output by the multitask target model, the method further comprises:
determining a model loss function L of the vehicle to be testedtype
Determining a train loss function L of the vehicle to be testedmodel
Determining a key point loss function L of the vehicle to be testedpoint
Using said LtypeThe said LmodelAnd said LpointDetermining a target loss function LfinalWherein L isfinal=Lmodel+αLtype+βLpointThe alpha and the beta are used for representing weight coefficients of the multitask original model;
and training the multi-task original model based on the target loss function to obtain the multi-task target model.
9. The method of claim 8, wherein determining the key point loss function L_point of the vehicle to be tested comprises:

[Equation image in the original publication: definition of L_point in terms of p*_i(h, w) and p_i(h, w) over all H×W pixel positions of the sample image.]

wherein the p*_i(h, w) is used for representing the ground-truth annotation information corresponding to the sample image, the p_i(h, w) is used for representing the predicted annotation information corresponding to the sample image, (h, w) is used for representing pixel coordinates in the sample image, H is used for representing the height of the sample image, W is used for representing the width of the sample image, and the sample image is used for training the multi-task original model.
10. A vehicle type determination device, characterized by comprising:
the first acquisition module is used for acquiring the image information of the vehicle to be detected to obtain a target image;
the first input module is used for inputting the target image into a multi-task target model to obtain a fusion feature vector output by the multi-task target model, wherein the fusion feature vector comprises a key point feature vector, a vehicle system feature vector and a vehicle type feature vector of the vehicle to be detected;
and the first determination module is used for determining the vehicle type of the vehicle to be detected based on the fusion feature vector.
11. A computer-readable storage medium, in which a computer program is stored, wherein the computer program is arranged to perform the method of any of claims 1 to 9 when executed.
12. An electronic device comprising a memory and a processor, wherein the memory has stored therein a computer program, and wherein the processor is arranged to execute the computer program to perform the method of any of claims 1 to 9.
CN202011567540.XA 2020-12-25 2020-12-25 Vehicle type determination method and device, storage medium and electronic device Pending CN112686125A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011567540.XA CN112686125A (en) 2020-12-25 2020-12-25 Vehicle type determination method and device, storage medium and electronic device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011567540.XA CN112686125A (en) 2020-12-25 2020-12-25 Vehicle type determination method and device, storage medium and electronic device

Publications (1)

Publication Number Publication Date
CN112686125A (en) 2021-04-20

Family

ID=75451815

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011567540.XA Pending CN112686125A (en) 2020-12-25 2020-12-25 Vehicle type determination method and device, storage medium and electronic device

Country Status (1)

Country Link
CN (1) CN112686125A (en)


Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113139608A (en) * 2021-04-28 2021-07-20 北京百度网讯科技有限公司 Feature fusion method and device based on multi-task learning
CN113139608B (en) * 2021-04-28 2023-09-29 北京百度网讯科技有限公司 Feature fusion method and device based on multi-task learning
CN115294537A (en) * 2022-08-10 2022-11-04 青岛文达通科技股份有限公司 Vehicle attribute identification method and system based on feature association


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination