WO2022052375A1 - Vehicle identification method and apparatus, electronic device and storage medium - Google Patents
Vehicle identification method and apparatus, electronic device and storage medium
- Publication number
- WO2022052375A1 (PCT/CN2020/140315)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- loss
- feature data
- feature
- vehicle
- data
- Prior art date
Links
- 238000000034 method Methods 0.000 title claims abstract description 162
- 238000012545 processing Methods 0.000 claims abstract description 112
- 238000000605 extraction Methods 0.000 claims abstract description 84
- 238000007499 fusion processing Methods 0.000 claims abstract description 24
- 238000012549 training Methods 0.000 claims description 155
- 230000008569 process Effects 0.000 claims description 89
- 230000000875 corresponding effect Effects 0.000 claims description 36
- 238000011176 pooling Methods 0.000 claims description 35
- 230000002596 correlated effect Effects 0.000 claims description 24
- 230000015654 memory Effects 0.000 claims description 23
- 238000004590 computer program Methods 0.000 claims description 20
- 230000001174 ascending effect Effects 0.000 claims description 4
- 238000013527 convolutional neural network Methods 0.000 description 48
- 238000010586 diagram Methods 0.000 description 22
- 239000000284 extract Substances 0.000 description 17
- 230000009467 reduction Effects 0.000 description 8
- 230000006870 function Effects 0.000 description 7
- 238000010606 normalization Methods 0.000 description 6
- 230000008878 coupling Effects 0.000 description 4
- 238000010168 coupling process Methods 0.000 description 4
- 238000005859 coupling reaction Methods 0.000 description 4
- 230000000694 effects Effects 0.000 description 4
- 230000004927 fusion Effects 0.000 description 4
- 230000005540 biological transmission Effects 0.000 description 3
- 238000005516 engineering process Methods 0.000 description 3
- 206010039203 Road traffic accident Diseases 0.000 description 2
- 238000004891 communication Methods 0.000 description 2
- 238000013461 design Methods 0.000 description 2
- 238000007689 inspection Methods 0.000 description 2
- 238000012544 monitoring process Methods 0.000 description 2
- 230000003287 optical effect Effects 0.000 description 2
- 238000004458 analytical method Methods 0.000 description 1
- 238000013500 data storage Methods 0.000 description 1
- 239000000835 fiber Substances 0.000 description 1
- 239000011521 glass Substances 0.000 description 1
- 230000006872 improvement Effects 0.000 description 1
- 230000010354 integration Effects 0.000 description 1
- 238000002372 labelling Methods 0.000 description 1
- 239000004065 semiconductor Substances 0.000 description 1
- 239000007787 solid Substances 0.000 description 1
- 230000009466 transformation Effects 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/56—Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
- G06V20/58—Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/08—Detecting or categorising vehicles
Definitions
- the present disclosure relates to the field of computer vision technology, and in particular, to a vehicle identification method and device, an electronic device, and a storage medium.
- the vehicle identification method obtains two vehicle feature data by extracting vehicle features from two images respectively, and compares the two vehicle feature data to determine whether the vehicles in the two images are the same vehicle.
- however, the information included in the vehicle feature data extracted in this way is of limited accuracy.
- the present disclosure provides a vehicle identification method and device, an electronic device and a storage medium.
- a vehicle identification method comprising:
- a vehicle identification device comprising:
- an acquisition unit configured to acquire a to-be-processed image containing the first vehicle to be identified
- a first processing unit configured to perform a first feature extraction process on the to-be-processed image to obtain first feature data including local feature information of the first to-be-recognized vehicle;
- a second processing unit configured to perform a second feature extraction process on the to-be-processed image to obtain second feature data including global feature information of the first to-be-recognized vehicle;
- a fusion processing unit configured to perform fusion processing on the first feature data and the second feature data to obtain third feature data of the first vehicle to be identified; the third feature data is used to obtain the identification result of the first vehicle to be identified.
- an electronic device comprising: a processor and a memory, wherein the memory is used to store computer program code, the computer program code includes computer instructions, when the processor executes the computer instructions , the electronic device executes the method according to the above-mentioned first aspect and any possible implementation manner thereof.
- an electronic device comprising: a processor, a sending device, an input device, an output device, and a memory, the memory being used to store computer program code, the computer program code comprising computer instructions; when the processor executes the computer instructions, the electronic device executes the method according to the first aspect and any possible implementation manner thereof.
- a computer-readable storage medium in which a computer program is stored, the computer program including program instructions that, when executed by a processor, cause the processor to execute the method according to the above-mentioned first aspect and any possible implementation manner thereof.
- a computer program product comprising a computer program or instructions, which, when run on a computer, cause the computer to perform the method according to the above-mentioned first aspect and any possible implementation manner thereof.
- Embodiments of the present disclosure provide a vehicle identification method and device, an electronic device, and a storage medium.
- first feature data containing the local feature information of the first vehicle to be identified is extracted, second feature data containing the global feature information of the first vehicle to be identified is extracted, and the first feature data is fused with the second feature data, thereby enriching the detailed feature information of the first vehicle to be identified; determining the recognition result of the first vehicle to be identified based on this enriched detailed feature information can improve the accuracy of the recognition result.
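The fusion step above can be sketched as concatenating the local and global feature vectors and normalizing the result. The disclosure does not fix a specific fusion operator, so concatenation with L2 normalization here is an illustrative assumption, not the patent's definitive method:

```python
import numpy as np

def fuse_features(local_feat: np.ndarray, global_feat: np.ndarray) -> np.ndarray:
    """Fuse first feature data (local) and second feature data (global)
    into third feature data. Concatenation is one common fusion choice;
    the disclosure leaves the exact operator open."""
    fused = np.concatenate([local_feat, global_feat])
    norm = np.linalg.norm(fused)
    return fused / norm if norm > 0 else fused

# Toy example: 4-dim local features and 4-dim global features.
local_feat = np.array([1.0, 0.0, 2.0, 0.0])
global_feat = np.array([0.0, 3.0, 0.0, 1.0])
third_feature = fuse_features(local_feat, global_feat)
```

The normalized, concatenated vector carries both kinds of information and can be compared directly with stored vehicle features by a dot product.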
- FIG. 1 is a schematic flowchart of a vehicle identification method according to an embodiment of the present disclosure
- FIG. 2 is a schematic diagram of a key point provided by an embodiment of the present disclosure
- FIG. 3 is a schematic diagram of a local pixel area according to an embodiment of the present disclosure.
- FIG. 4 is a schematic structural diagram of a vehicle identification network according to an embodiment of the present disclosure.
- FIG. 5 is a schematic structural diagram of a feature extraction module provided by an embodiment of the present disclosure.
- FIG. 6 is a schematic structural diagram of a key point and local pixel point region generation module according to an embodiment of the present disclosure
- FIG. 7 is a schematic structural diagram of a joint training module provided by an embodiment of the present disclosure.
- FIG. 8 is a schematic structural diagram of a first actor-critic according to an embodiment of the present disclosure.
- FIG. 9 is a schematic structural diagram of a first molecule breaking module according to an embodiment of the present disclosure.
- FIG. 10 is a schematic structural diagram of a second actor-critic module according to an embodiment of the present disclosure.
- FIG. 11 is a schematic structural diagram of a second molecule breaking module according to an embodiment of the present disclosure.
- FIG. 12 is a schematic structural diagram of a vehicle identification device according to an embodiment of the present disclosure.
- FIG. 13 is a schematic diagram of a hardware structure of a vehicle identification device according to an embodiment of the present disclosure.
- in order to enhance safety in work, life, or social environments, monitoring equipment is installed in many areas. With the improvement of people's living standards, there are more and more vehicles on the road and more and more traffic accidents. How to effectively determine the whereabouts of a vehicle (hereinafter referred to as the target vehicle) through the video streams collected by monitoring equipment is therefore of great significance. For example, when pursuing a hit-and-run vehicle, a vehicle identification method can be used to process the images collected by different cameras to determine the whereabouts of the hit-and-run vehicle.
- the vehicle identification method obtains the features of the vehicle to be confirmed by extracting the overall appearance feature information of the vehicle to be confirmed from the image, and compares these features with target vehicle features that include the overall appearance feature information of the target vehicle to obtain the similarity between the target vehicle and the vehicle to be confirmed, wherein the overall appearance features include model and color. When the similarity exceeds a similarity threshold, it is determined that the vehicle to be confirmed and the target vehicle are the same vehicle.
- the embodiments of the present disclosure provide a vehicle identification method, which can enrich the information included in the vehicle features.
- the execution subject of the embodiment of the present disclosure is a vehicle identification device.
- optionally, the vehicle identification device may be one of the following: a mobile phone, a server, a computer, a tablet computer, or a wearable device. Please refer to FIG. 1, which is a schematic flowchart of a vehicle identification method provided by an embodiment of the present disclosure.
- the to-be-processed image includes the first to-be-identified vehicle.
- the vehicle identification device receives the image to be processed input by the user through the input component.
- the above input components include: keyboard, mouse, touch screen, touch pad, audio input and so on.
- the vehicle identification device receives the to-be-processed image sent by the data terminal.
- the above data terminal may be any one of the following: a mobile phone, a computer, a tablet computer, and a server.
- the vehicle identification device receives the to-be-processed image sent by the surveillance camera.
- the surveillance cameras are deployed on roads (including: highways, expressways, and urban roads).
- the local feature information includes detailed feature information of the vehicle, such as: feature information of a car lamp, feature information of a car logo, and feature information of a car window.
- the vehicle identification device can extract the local feature information of the first vehicle to be identified from the image to be processed by performing the first feature extraction process on the image to be processed to obtain the first feature data.
- the first feature extraction process may be implemented by a first convolutional neural network.
- the convolutional neural network is trained by using the image with label information as training data, so that the first convolutional neural network obtained by training can complete the first feature extraction processing of the image to be processed.
- the annotation information of the training data may be the detailed feature information of the vehicle in the image (such as the type of headlights, the type of the vehicle logo, the type of the vehicle window).
- the convolutional neural network extracts feature data including the detailed feature information of the vehicle from the training data, and obtains the detailed information of the vehicle according to the extracted feature data as the training result.
- the training of the convolutional neural network can be completed to obtain the first convolutional neural network.
- the vehicle identification device can use the first convolutional neural network to process the to-be-processed image to obtain detailed feature information of the first to-be-recognized vehicle to obtain first feature data.
- the vehicle identification device uses the first convolution kernel to perform convolution processing on the image to be processed, and extracts semantic information of the image to be processed including detailed feature information of the vehicle to obtain the first feature data.
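The convolution step above can be illustrated with a minimal 2D convolution: a kernel slides over the image and produces a feature map that responds to local patterns. The hand-written vertical-edge kernel below is a stand-in for a learned "first convolution kernel", not a value from the disclosure:

```python
import numpy as np

def conv2d(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Naive valid-mode 2D convolution (cross-correlation, as used in CNNs)."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Illustrative vertical-edge kernel; a trained CNN learns such kernels.
kernel = np.array([[1.0, 0.0, -1.0],
                   [1.0, 0.0, -1.0],
                   [1.0, 0.0, -1.0]])
image = np.zeros((5, 5))
image[:, 2:] = 1.0          # right half bright: a vertical edge at column 2
feature_map = conv2d(image, kernel)
```

Columns of the feature map overlapping the brightness edge produce nonzero responses, while uniform regions produce zero, which is how convolution localizes detail such as lamps or logos.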
- the global feature information of the vehicle includes the overall appearance feature information of the vehicle.
- the vehicle identification device can extract the global feature information of the first vehicle to be identified from the to-be-processed image by performing the second feature extraction process on the to-be-processed image to obtain second feature data.
- the second feature extraction process may be implemented by a second convolutional neural network.
- the convolutional neural network is trained by using the image with label information as training data, so that the second convolutional neural network obtained by training can complete the second feature extraction processing of the image to be processed.
- the annotation information of the training data may be the overall appearance feature information of the vehicle in the image (such as vehicle type, body color).
- the convolutional neural network extracts feature data including the overall appearance feature information of the vehicle from the training data, and obtains the overall appearance information of the vehicle according to the extracted feature data, as training results.
- the vehicle identification device can use the second convolutional neural network to process the to-be-processed image to obtain the overall appearance feature information of the first to-be-recognized vehicle to obtain the second feature data.
- the vehicle identification device uses the second convolution kernel to perform convolution processing on the to-be-processed image, and extracts semantic information of the to-be-processed image including the overall appearance feature information of the vehicle to obtain the second feature data.
- the parameters of the first convolution kernel are different from those of the second convolution kernel.
- the third feature data is used to obtain an identification result of the first vehicle to be identified, wherein the identification result includes the identity of the first vehicle to be identified.
- the vehicle identification device may further determine the vehicle to be identified as vehicle a according to the third characteristic data.
- for example, the vehicle identification device compares the third feature data with the feature data in a vehicle feature database and determines that the similarity between target vehicle feature data in the vehicle feature database and the third feature data exceeds the similarity threshold. If the vehicle corresponding to the target vehicle feature data is vehicle b, the vehicle identification device can determine that the vehicle corresponding to the third feature data is vehicle b; that is, the recognition result of the first vehicle to be identified determined according to the third feature data is vehicle b.
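The database comparison can be sketched as a nearest-neighbor search with a similarity threshold. Cosine similarity and the threshold value 0.8 are assumptions for illustration; the disclosure does not fix the similarity measure:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def identify(query_feat: np.ndarray, database: dict, threshold: float = 0.8):
    """Return the identity whose stored feature data is most similar to the
    query (third feature data), or None if nothing exceeds the threshold."""
    best_id, best_sim = None, -1.0
    for vehicle_id, feat in database.items():
        sim = cosine_similarity(query_feat, feat)
        if sim > best_sim:
            best_id, best_sim = vehicle_id, sim
    return best_id if best_sim > threshold else None

# Toy vehicle feature database; "vehicle b" closely matches the query.
database = {
    "vehicle a": np.array([1.0, 0.0, 0.0]),
    "vehicle b": np.array([0.6, 0.8, 0.0]),
}
query = np.array([0.58, 0.81, 0.05])
result = identify(query, database)
```

Returning None when no stored feature exceeds the threshold corresponds to the vehicle not being present in the database.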
- the vehicle identification device can obtain third feature data including both global feature information of the first vehicle to be identified and local feature information of the first vehicle to be identified by fusing the first feature data and the second feature data. Using the third feature data as the feature data of the first vehicle to be recognized can enrich the information included in the feature data of the first vehicle to be recognized.
- the above-mentioned local feature information includes key point feature information.
- the key point feature information includes the position of the key point in the image to be processed and the semantic information of the key point.
- the key point 6 shown in FIG. 2 is the key point of the left front tire, and the semantic information of the key point 6 includes the information of the left front tire (such as tire specification, wheel size, tire brand).
- the key point 23 shown in FIG. 2 is the key point of the rear license plate, and the semantic information of the key point 23 includes the information of the rear license plate (such as the license plate number).
- the labeling method of the key points of the vehicle is shown in FIG. 2 .
- the vehicle model shown in FIG. 2 is only an example. In practical applications, any vehicle type (such as a dump truck, bus, or truck) can be marked according to the key point marking method shown in FIG. 2.
- the vehicle identification device obtains first feature data including key point feature information of the first to-be-identified vehicle by performing a first feature extraction process on the to-be-processed image.
- the first characteristic data may include characteristic information of the key point of the left front tire and characteristic information of the key point of the rear license plate of the vehicle to be identified.
- the local feature information includes not only key point feature information but also local pixel point region feature information.
- the local pixel area belongs to the pixel area covered by the first vehicle to be recognized, and the area of the local pixel area is smaller than the area of the pixel area covered by the first vehicle to be recognized.
- the right local pixel area 301 includes the right area of the first vehicle to be identified 300
- the head pixel area 302 includes the head area of the first vehicle to be identified.
- the feature information of the local pixel point region includes semantic information of the local pixel point region.
- when the local pixel area includes the pixel area covered by a headlight, the semantic information of the local pixel area includes the model of the headlight; when the local pixel area includes the pixel area covered by a car window, the semantic information of the local pixel area includes the type of the car window and the objects in the car that can be observed through the car window.
- when the local pixel area includes the pixel area covered by the front windshield, the semantic information of the local pixel area includes the type of the front windshield, the objects in the car that can be observed through the front windshield, the annual inspection mark on the front windshield, and the position of the annual inspection mark on the front windshield.
- to obtain the local feature information, the vehicle identification device performs the following steps:
- the fourth characteristic data includes characteristic information of at least one key point of the first vehicle to be identified.
- the vehicle identification device can extract feature information of at least one key point of the first vehicle to be identified from the image to be processed to obtain fourth feature data.
- the third feature extraction process may be implemented by a third convolutional neural network.
- the convolutional neural network is trained by using the image with label information as training data, so that the third convolutional neural network obtained by training can complete the third feature extraction processing of the image to be processed.
- the annotation information of the training data may be the key point feature information of the vehicle in the image (eg, the position of the key point, the semantic information of the key point).
- the convolutional neural network extracts the feature data including the key point feature information of the vehicle from the training data, and obtains the key point feature information according to the extracted feature data, as training results.
- the training of the convolutional neural network can be completed to obtain a third convolutional neural network.
- the vehicle identification device can use the third convolutional neural network to process the to-be-processed image to obtain the feature information of the key points extracted from the first to-be-identified vehicle to obtain fourth feature data.
- the vehicle identification device uses a third convolution kernel to perform convolution processing on the to-be-processed image, extracts semantic information of the to-be-processed image including the key point feature information of the vehicle, and obtains fourth feature data.
- the parameters of the third convolution kernel are different from those of the first convolution kernel, and the parameters of the third convolution kernel are also different from those of the second convolution kernel.
- the fifth characteristic data includes characteristic information of at least one local pixel area of the first vehicle to be identified.
- the fourth feature extraction process may be implemented by a fourth convolutional neural network.
- the convolutional neural network is trained by using the image with label information as training data, so that the fourth convolutional neural network obtained by training can complete the fourth feature extraction processing of the image to be processed.
- the annotation information of the training data may be the feature information of the local pixel area of the vehicle in the image.
- the convolutional neural network extracts feature data including the feature information of the local pixel area of the vehicle from the training data, and obtains the local pixel points according to the extracted feature data.
- the feature information of the region is used as the training result.
- the training of the convolutional neural network can be completed to obtain a fourth convolutional neural network.
- the vehicle identification device can use the fourth convolutional neural network to process the to-be-processed image to obtain the feature information of the local pixel point region of the first to-be-identified vehicle to obtain fifth feature data.
- the vehicle identification device uses a fourth convolution kernel to perform convolution processing on the to-be-processed image, and extracts the feature information of the local pixel area of the first to-be-recognized vehicle of the to-be-processed image, and obtains the fifth characteristic data.
- the parameters of the fourth convolution kernel are different from the parameters of the first convolution kernel, the parameters of the second convolution kernel, and the parameters of the third convolution kernel.
- the feature information of the local pixel area contains the semantic information of the local pixel area, and there is a correlation between adjacent pixels in an image (the correlation here includes semantic correlation); therefore, fusing the semantic information of the local pixel area with the key point feature information can enrich the detailed feature information of the vehicle.
- the vehicle identification device fuses the key point feature information of the first vehicle to be identified with the feature information of the local pixel areas of the first vehicle to be identified by fusing the fourth feature data and the fifth feature data, thereby enriching the detailed feature information of the first vehicle to be identified and obtaining the first feature data.
- the vehicle identification device performs the following steps in the process of executing step 1:
- the sixth feature data includes key point feature information of the first vehicle to be identified, and the feature information included in any two sixth feature data belongs to different key points.
- the first vehicle to be identified includes a left rearview mirror keypoint and a right taillight keypoint.
- At least one sixth feature data includes: feature data 1 and feature data 2, wherein feature data 1 includes feature information of a key point of the left rearview mirror, and feature data 2 includes feature information of a key point of the right tail light.
- the vehicle identification device extracts the key point feature information of the first vehicle to be identified by performing the fifth feature extraction process on the image to be processed, and obtains the first intermediate feature data with the number of channels not less than 1, wherein , the data of each channel in the first intermediate feature data includes the key point feature information of the first vehicle to be identified, and the information included in the data of any two channels belongs to different key points.
- the vehicle identification device may use one channel data in the first intermediate characteristic data as a sixth characteristic data.
- the vehicle identification device may select, from the at least one sixth feature data, the k feature data containing the largest amount of information (that is, the seventh feature data) for subsequent processing, wherein k is an integer not less than 1.
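The top-k selection can be sketched as ranking channels of a (C, H, W) feature map. The disclosure does not define "amount of information" precisely in this excerpt, so ranking by the L2 magnitude of each channel's responses is an assumed proxy:

```python
import numpy as np

def select_top_k_channels(feature: np.ndarray, k: int) -> np.ndarray:
    """Select the k channels of a (C, H, W) feature map with the largest
    L2 response, as a stand-in for 'largest amount of information'."""
    scores = np.linalg.norm(feature.reshape(feature.shape[0], -1), axis=1)
    top_k = np.argsort(scores)[::-1][:k]   # indices of the k largest scores
    return feature[np.sort(top_k)]         # keep the original channel order

# 4 channels on a 2x2 grid; channels 1 and 3 carry the strongest responses.
feat = np.stack([
    np.full((2, 2), 0.1),
    np.full((2, 2), 0.9),
    np.full((2, 2), 0.2),
    np.full((2, 2), 0.8),
])
seventh = select_top_k_channels(feat, k=2)
```

Keeping the channels in their original order preserves the correspondence between each retained channel and its key point.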
- one seventh feature data can be obtained by executing step 5.
- the vehicle identification device can use the seventh feature data as the fourth feature data; that is, the fourth feature data includes the feature information of one key point.
- At least two seventh feature data can be obtained by performing step 5.
- the vehicle identification device can perform fusion processing on at least two seventh feature data to obtain fourth feature data.
- the at least two seventh feature data include: seventh feature data 1, seventh feature data 2, and seventh feature data 3, wherein the seventh feature data 1 includes feature information of key points of the left front lamp, and the seventh feature data 2 includes the feature information of the key point of the left rear lamp, and the seventh feature data 3 includes the feature information of the key point of the left rearview mirror.
- the vehicle identification device may obtain the fourth characteristic data by performing fusion processing on the seventh characteristic data 1 and the seventh characteristic data 2 .
- the fourth characteristic data includes characteristic information of the key point of the left front lamp and characteristic information of the key point of the left rear lamp.
- the vehicle identification device may also obtain the fourth characteristic data by performing fusion processing on the seventh characteristic data 1 , the seventh characteristic data 2 and the seventh characteristic data 3 .
- the fourth feature data includes the feature information of the key point of the left front lamp, the feature information of the key point of the left rear lamp, and the feature information of the key point of the left rearview mirror.
- the vehicle identification device performs the following steps in the process of executing step 4:
- the first heat map includes position information of key points in the image to be processed, and the information included in any two first heat maps belong to different key points.
- the key points of the first vehicle to be identified include a left rearview mirror key point and a right tail light key point.
- At least one first heat map includes: a first heat map 1 and a first heat map 2, wherein the first heat map 1 includes the position information of the key points of the left rearview mirror in the image to be processed, and the first heat map 2 includes The position information of the right taillight key point in the image to be processed.
- pixels at the same position in two images are said to be co-located with each other.
- for example, if the position of pixel A in the first heat map 1 is the same as the position of pixel B in the first heat map 2, pixel A and pixel B are co-located with each other.
- likewise, if pixel A is in a first heat map, pixel B may denote the pixel in the image to be processed that is co-located with pixel A.
- the size of the first heat map is the same as the size of the image to be processed.
- the pixel value of a pixel in the first heat map represents the confidence that a key point exists at the co-located pixel position in the image to be processed. For example, pixel A in the first heat map 1 and pixel B in the image to be processed are co-located with each other. If the first heat map 1 includes the position information of the key point of the left headlight in the to-be-processed image, and the pixel value of pixel A is 0.7, the confidence that the left headlight key point exists at pixel B is 0.7.
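Because the heat map is the same size as the image, reading off the most likely key point position and its confidence amounts to taking the argmax of the heat map. A minimal sketch of that decoding:

```python
import numpy as np

def keypoint_from_heatmap(heatmap: np.ndarray):
    """Return ((row, col), confidence) for the most likely key point location.

    The heat map matches the image size, so the argmax index is directly
    a pixel position in the image to be processed."""
    idx = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    return idx, float(heatmap[idx])

# Toy 4x4 heat map: confidence 0.7 that the key point lies at pixel (1, 2).
heatmap = np.zeros((4, 4))
heatmap[1, 2] = 0.7
position, confidence = keypoint_from_heatmap(heatmap)
```

In practice one heat map is decoded per key point, since the information in any two first heat maps belongs to different key points.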
- the sixth feature extraction processing may be convolution processing, pooling processing, or a combination of convolution processing and pooling processing, which is not limited in this disclosure.
- the sixth feature extraction process may be implemented by a fifth convolutional neural network.
- the convolutional neural network is trained by using the image with label information as training data, so that the fifth convolutional neural network obtained by training can complete the extraction processing of the sixth feature of the image to be processed.
- the annotation information of the training data can be the position of the key point in the image.
- the convolutional neural network extracts the feature data including the position information of the key points from the training data, and obtains the positions of the key points in the image according to the extracted feature data, as the training result.
- the training of the convolutional neural network can be completed to obtain the fifth convolutional neural network.
- the vehicle identification device can use the fifth convolutional neural network to process the image to be processed to obtain the position information of the key points of the first vehicle to be identified, and obtain the first heat map.
- Each pixel in the image to be processed includes semantic information, and the semantic information includes feature information of key points.
- the first feature image not only includes key point feature information of pixels, but also includes relative position information between pixels.
- the information included in the fourth feature data does not include relative position information between pixels.
- the key points to which the location information included in the first heat map belongs are referred to as key points of the first heat map.
- the first heat map 1 includes the location information of the left headlight key point, that is, the information included in the first heat map 1 belongs to the left headlight key point.
- the key point of the first heat map 1 is the key point of the left headlight.
- the size of the image to be processed, the size of the first heat map, and the size of the first feature image are all the same. For example, if the length of the image to be processed is 50 and the width is 30, the length of the first heat map and the length of the first feature image are both 50, and the width of the first heat map and the width of the first feature image are both 30.
- the dot product refers to an element-wise product.
- the vehicle identification device may normalize the pixel values in the first heat map to obtain the normalized first heat map.
- for example, in a first heat map, pixel values not less than 0.6 are adjusted to 1, and pixel values less than 0.6 are adjusted to 0.3.
- the vehicle identification device can extract the feature information of the key points of the first heat map by determining the dot product between the normalized first heat map and the first feature image, and obtain sixth feature data.
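The normalization and dot product of this step can be sketched as follows; the thresholds 0.6, 1, and 0.3 are the example values given above, the array contents are hypothetical, and "dot product" is taken to mean the element-wise product as defined earlier:

```python
import numpy as np

# Hypothetical first heat map and first feature image of the same spatial size.
first_heat_map = np.array([[0.9, 0.2],
                           [0.5, 0.7]])
first_feature_image = np.array([[1.0, 2.0],
                                [3.0, 4.0]])

# Normalization as described above: pixel values not less than 0.6 become 1,
# pixel values below 0.6 become 0.3.
normalized_heat_map = np.where(first_heat_map >= 0.6, 1.0, 0.3)

# The element-wise product keeps the feature responses near the key point
# and attenuates the rest, yielding one sixth feature data.
sixth_feature_data = normalized_heat_map * first_feature_image
print(sixth_feature_data)
```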
- the vehicle identification device performs the following steps in the process of executing step 5:
- the vehicle identification device can reduce the amount of data in the sixth feature data by performing pooling processing on one sixth feature data, and obtain an eighth feature data. In this way, processing the eighth characteristic data in the subsequent processing can reduce the data processing amount of the vehicle identification device.
- the vehicle identification device obtains at least one eighth characteristic data by pooling the characteristic data in the at least one sixth characteristic data respectively.
- the at least one sixth feature data includes: sixth feature data 1 , sixth feature data 2 , and sixth feature data 3 .
- the vehicle identification device obtains the eighth feature data 1 by pooling the sixth feature data 1, and obtains the eighth feature data 2 by performing the pooling process on the sixth feature data 2.
- at this time, the at least one eighth feature data includes eighth feature data 1 and eighth feature data 2.
- the vehicle identification device obtains eighth feature data 1 by pooling sixth feature data 1, eighth feature data 2 by pooling sixth feature data 2, and eighth feature data 3 by pooling sixth feature data 3.
- at least one eighth feature data includes eighth feature data 1 , eighth feature data 2 , and eighth feature data 3 .
- the pooling process in step 10 is a global average pooling process.
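A minimal sketch of the global average pooling in step 10, assuming (this shape is an assumption, not from the disclosure) that a sixth feature data is an array of shape (channels, height, width); each channel collapses to a single value, which is the data reduction described above:

```python
import numpy as np

# Hypothetical sixth feature data with 2 channels on a 3x4 spatial grid.
sixth_feature_data_1 = np.arange(24, dtype=np.float64).reshape(2, 3, 4)

# Global average pooling: average over the spatial dimensions of each
# channel, so subsequent processing handles far less data.
eighth_feature_data_1 = sixth_feature_data_1.mean(axis=(1, 2))
print(eighth_feature_data_1)  # one value per channel
```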
- the first probability is used to represent the amount of information included in the sixth feature data corresponding to the first probability.
- at least one eighth feature data includes eighth feature data 1
- at least one first probability includes first probability 1
- the first probability 1 is obtained according to the amount of information included in the eighth feature data 1
- the eighth characteristic data 1 is obtained by performing pooling processing on the sixth characteristic data 1 .
- the first probability 1 is used to represent the amount of information included in the sixth feature data 1 .
- there is a correlation between the first probability and the amount of information included in the sixth feature data. For example, in the case where the first probability is positively correlated with the amount of information included in the sixth feature data, in Example 1, the larger the first probability 1 is, the greater the amount of information included in the sixth feature data 1 is; in the case where the first probability is negatively correlated with the amount of information included in the sixth feature data, in Example 1, the larger the first probability 1 is, the smaller the amount of information included in the sixth feature data 1 is.
- the vehicle identification device can obtain the first probability according to the amount of information included in the eighth characteristic data.
- the vehicle identification device inputs the eighth characteristic data into the softmax function, and the first probability can be obtained.
- the vehicle identification device can obtain a first probability according to the information amount included in one eighth characteristic data, and obtain at least one first probability according to the information amount included in at least one eighth characteristic data.
- the at least one eighth characteristic data includes eighth characteristic data 1 and eighth characteristic data 2 .
- the vehicle identification device obtains the first probability 1 according to the amount of information included in the eighth characteristic data 1 , and at this time, at least one first probability includes the first probability 1 .
- the vehicle identification device obtains the first probability 1 according to the amount of information included in the eighth feature data 1, and obtains the first probability 2 according to the amount of information included in the eighth feature data 2. At this time, at least one first probability includes the first probability 1 and the first probability 2.
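A hedged sketch of how the first probabilities might be produced: the disclosure states that the eighth feature data is fed into a softmax function, so the example below applies a standard softmax over two hypothetical scalar scores (the score values are invented):

```python
import numpy as np

def softmax(x):
    # Subtracting the maximum keeps the exponentials numerically stable.
    e = np.exp(x - np.max(x))
    return e / e.sum()

# Hypothetical scalar scores standing in for eighth feature data 1 and 2.
eighth_scores = np.array([2.0, 1.0])
first_probabilities = softmax(eighth_scores)
print(first_probabilities)  # a larger score yields a larger first probability
```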
- in the case that the first probability is positively correlated with the amount of information included in the sixth feature data, the vehicle identification device executes step 12; in the case that the first probability is negatively correlated with the amount of information included in the sixth feature data, the vehicle identification device executes step 13.
- the vehicle identification device may determine the weight of each seventh feature data according to the amount of information included in the seventh feature data, and perform weighted fusion on the at least one seventh feature data according to the weights to obtain the fourth feature data.
- the vehicle identification device performs the following steps in the process of executing step 2:
- the ninth feature data includes key point feature information of the first vehicle to be identified, and the feature information included in any two ninth feature data belong to different local pixel regions.
- the first to-be-identified vehicle includes a local pixel area 1 and a local pixel area 2, wherein local pixel area 1 includes the pixel area covered by the front windshield, and local pixel area 2 includes the pixel area covered by the left window glass.
- the at least one ninth feature data includes: feature data 1 and feature data 2 , wherein the feature data 1 includes feature information of the local pixel area 1 , and the feature data 2 includes feature information of the local pixel area 2 .
- the vehicle identification device extracts the key point feature information of the first vehicle to be identified by performing the tenth feature extraction process on the image to be processed, and obtains fourth intermediate feature data with a channel number of not less than 1, wherein , the data of each channel in the fourth intermediate feature data includes the feature information of the local pixel area of the first vehicle to be identified, and the information included in the data of any two channels belongs to different local pixel areas.
- the vehicle identification device may use one channel data in the fourth intermediate feature data as a ninth feature data.
- the vehicle identification device may select, from the at least one ninth feature data, the m feature data including the largest amount of information (that is, m tenth feature data) for subsequent processing, wherein m is an integer not less than 1.
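The selection of the m most informative feature data can be sketched as below; the per-feature "information amount" scores are hypothetical stand-ins for whatever measure the device uses (for instance, the second probabilities):

```python
import numpy as np

# Hypothetical information-amount score for each ninth feature data.
info_scores = np.array([0.1, 0.5, 0.4])
ninth_feature_data = [np.full((2, 2), i, dtype=np.float64) for i in range(3)]

m = 2  # keep the m feature data carrying the most information
top_m_idx = np.argsort(info_scores)[::-1][:m]   # indices, highest score first
tenth_feature_data = [ninth_feature_data[i] for i in top_m_idx]
print(sorted(top_m_idx.tolist()))  # → [1, 2]
```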
- one tenth feature data can be obtained by executing step 17.
- the vehicle identification device can use the tenth feature data as the fifth feature data, that is, the fifth feature data includes the feature information of one local pixel area.
- At least two tenth feature data can be obtained by executing step 5, and at this time, the vehicle identification device can perform fusion processing on at least two tenth feature data to obtain fifth feature data.
- the at least two tenth feature data include: tenth feature data 1, tenth feature data 2, and tenth feature data 3, wherein tenth feature data 1 includes feature information of the pixel area covered by the front of the vehicle,
- tenth feature data 2 includes feature information of the pixel area covered by the right front windshield,
- tenth feature data 3 includes feature information of the pixel area covered by the left tire.
- the vehicle identification device may obtain the fifth characteristic data by performing fusion processing on the tenth characteristic data 1 and the tenth characteristic data 2 .
- the fifth characteristic data includes characteristic information of the pixel area covered by the front of the vehicle and characteristic information of the pixel area covered by the right front windshield.
- the vehicle identification device may also obtain the fifth characteristic data by performing fusion processing on the tenth characteristic data 1 , the tenth characteristic data 2 and the tenth characteristic data 3 .
- the fifth feature data includes feature information of the pixel point area covered by the front of the vehicle, feature information of the pixel point area covered by the right front windshield, and feature information of the pixel point area covered by the left tire.
- the vehicle identification device performs the following steps in the process of executing step 14:
- the second heat map includes position information of key points in the image to be processed, and the information included in any two second heat maps belong to different local pixel regions.
- the local pixel point area of the first vehicle to be identified includes a front windshield area and a head area.
- the at least one second heat map includes: a second heat map 1 and a second heat map 2, wherein the second heat map 1 includes the position information of the front windshield area in the image to be processed, and the second heat map 2 includes the position information of the head area in the image to be processed.
- pixels located at the same position in two images are said to be co-located with each other.
- for example, if the position of pixel A in the second heat map 1 is the same as the position of pixel B in the second heat map 2, then pixel A and pixel B are co-located with each other.
- pixel B in the image to be processed is the pixel at the same location as pixel A.
- the size of the second heat map is the same as the size of the image to be processed.
- the pixel value of the pixel point in the second heat map represents the confidence level that the position of the pixel point in the image to be processed that is co-located with the pixel point belongs to the local pixel point area. For example, pixel A in the second heat map 1 and pixel B in the image to be processed are co-located with each other. If the second heat map 1 includes the position information of the head area in the image to be processed, and the pixel value of pixel A is 0.7, the confidence that pixel B belongs to the head area is 0.7.
- the eleventh feature extraction processing may be convolution processing, pooling processing, or a combination of convolution processing and pooling processing, which is not limited in this disclosure.
- the eleventh feature extraction process may be implemented by the sixth convolutional neural network.
- the convolutional neural network is trained by using the image with label information as training data, so that the sixth convolutional neural network obtained by training can complete the eleventh feature extraction processing of the image to be processed.
- the annotation information of the training data can be the position of the local pixel area in the image.
- the convolutional neural network extracts the feature data including the position information of the local pixel area from the training data, and obtains the location of the local pixel area in the image according to the extracted feature data as the training result.
- the training of the convolutional neural network can be completed to obtain the sixth convolutional neural network.
- the vehicle identification device can use the sixth convolutional neural network to process the to-be-processed image to obtain the position information of the local pixel areas of the first to-be-identified vehicle, and obtain the second heat map.
- Each pixel in the image to be processed includes semantic information, and by performing the seventh feature extraction process on the image to be processed, the semantic information of each pixel can be extracted to obtain a second feature image.
- the second feature image not only includes semantic information of pixels, but also includes relative position information between pixels.
- the information included in the fifth feature data does not include relative position information between pixels.
- the first feature image and the second feature image may be the same.
- both the first feature image and the second feature image include semantic information of each pixel in the image to be processed.
- the local pixel area to which the location information included in the second heat map belongs is called the local pixel area of the second heat map.
- the second heat map 1 includes the location information of the front windshield area, that is, the information included in the second heat map 1 belongs to the front windshield area.
- the local pixel area of the second heat map 1 is the front windshield area.
- the size of the image to be processed, the size of the second heat map, and the size of the second feature image are all the same. For example, if the length of the image to be processed is 50 and the width is 30, the length of the second heat map and the length of the second feature image are both 50, and the width of the second heat map and the width of the second feature image are both 30.
- ninth feature data can be obtained from the feature information of the local pixel region of the second heat map extracted from the second feature image.
- the vehicle identification device may perform normalization processing on the pixel values in the second heat map to obtain the normalized second heat map.
- for example, in a second heat map, pixel values over 0.7 are adjusted to 1, and pixel values not over 0.7 are adjusted to 0.
- the vehicle identification device can extract the feature information of the local pixel area of the second heat map by determining the dot product between the normalized second heat map and the second feature image, and obtain the ninth feature data.
- the vehicle identification device performs the following steps in the process of executing step 15:
- the vehicle identification device can reduce the amount of data in the ninth feature data by performing pooling processing on a ninth feature data, and obtain an eleventh feature data. In this way, by processing the eleventh characteristic data in the subsequent processing, the data processing amount of the vehicle identification device can be reduced.
- the vehicle identification device obtains at least one eleventh characteristic data by pooling the characteristic data in the at least one ninth characteristic data respectively.
- the at least one ninth feature data includes: ninth feature data 1 , ninth feature data 2 , and ninth feature data 3 .
- the vehicle identification device obtains eleventh feature data 1 by pooling the ninth feature data 1, and obtains eleventh feature data 2 by pooling the ninth feature data 2.
- at this time, the at least one eleventh feature data includes eleventh feature data 1 and eleventh feature data 2.
- the vehicle identification device obtains eleventh feature data 1 by pooling ninth feature data 1, eleventh feature data 2 by pooling ninth feature data 2, and eleventh feature data 3 by pooling ninth feature data 3.
- at least one eleventh feature data includes eleventh feature data 1 , eleventh feature data 2 , and eleventh feature data 3 .
- the pooling process in step 20 is the global average pooling process.
- the second probability is used to represent the amount of information included in the ninth feature data corresponding to the second probability.
- at least one eleventh feature data includes eleventh feature data 1
- at least one second probability includes second probability 1
- the second probability 1 is obtained according to the amount of information included in the eleventh feature data 1
- the eleventh feature data 1 is obtained by pooling the ninth feature data 1 . That is, the second probability 1 is used to represent the amount of information included in the ninth feature data 1 .
- in one case, the second probability is positively correlated with the amount of information included in the ninth feature data
- in another case, the second probability is negatively correlated with the amount of information included in the ninth feature data
- the vehicle identification device can obtain the second probability according to the amount of information included in the eleventh characteristic data.
- the vehicle identification device inputs the eleventh characteristic data into the softmax function, and the second probability can be obtained.
- the vehicle identification device may obtain a second probability according to the amount of information included in one eleventh characteristic data, and may obtain at least one second probability according to the amount of information included in at least one eleventh characteristic data.
- the at least one eleventh feature data includes eleventh feature data 1 and eleventh feature data 2 .
- the vehicle identification device obtains the second probability 1 according to the amount of information included in the eleventh characteristic data 1 , and at this time, at least one second probability includes the second probability 1 .
- the vehicle identification device obtains the second probability 1 according to the amount of information included in the eleventh feature data 1, and obtains the second probability 2 according to the amount of information included in the eleventh feature data 2.
- at this time, the at least one second probability includes second probability 1 and second probability 2.
- in the case that the second probability is positively correlated with the amount of information included in the ninth feature data, the vehicle identification device executes step 22; in the case that the second probability is negatively correlated with the amount of information included in the ninth feature data, the vehicle identification device executes step 23.
- a tenth feature data includes feature information of a local pixel area
- the number of local pixel areas in at least one local pixel area exceeds 1
- different tenth characteristic data include different amounts of information.
- the vehicle identification device may determine the weight of each tenth feature data according to the amount of information included in the tenth feature data, and perform weighted fusion on the at least one tenth feature data according to the weights to obtain the fifth feature data.
- the at least one local pixel point area includes a first local pixel point area and a second local pixel point area, and both the number of ninth feature data and m are greater than 1.
- the vehicle identification device selects the m feature data including the most information from the at least two ninth feature data, and obtains twelfth feature data including the feature information of the first local pixel area and thirteenth feature data including the feature information of the second local pixel area.
- the vehicle identification device performs the following steps in the process of executing step 18:
- the first weight is positively correlated with the amount of information included in the twelfth feature data
- the second weight is positively correlated with the amount of information included in the thirteenth feature data
- the vehicle identification device performs weighted fusion of the twelfth feature data and the thirteenth feature data according to the first weight and the second weight to obtain the fifth feature data including the feature information of the local pixel areas of the first vehicle to be identified, which can improve the accuracy of the local pixel area feature information of the first vehicle to be identified.
- the vehicle identification device performs weighted summation on the twelfth characteristic data and the thirteenth characteristic data according to the first weight and the second weight to obtain the fifth characteristic data.
- the first weight is α3
- the second weight is α4
- the twelfth feature data is n4
- the thirteenth feature data is n5
- the fifth feature data is n6
- n6 = α3 × n4 + α4 × n5 + d
- the vehicle identification device multiplies the first weight by the twelfth characteristic data to obtain fifth intermediate characteristic data, and multiplies the second weight by the thirteenth characteristic data to obtain sixth intermediate characteristic data , and the fifth characteristic data is obtained by fusing the fifth intermediate characteristic data and the sixth intermediate characteristic data.
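The weighted fusion just described can be sketched as follows, with invented values for the weights and feature data; the constant term d in the formula above is treated as zero here, since this step describes only the fusion of the two intermediate results:

```python
import numpy as np

# Hypothetical twelfth/thirteenth feature data and their weights.
n4 = np.array([1.0, 2.0])   # twelfth feature data
n5 = np.array([3.0, 4.0])   # thirteenth feature data
a3, a4 = 0.75, 0.25         # first weight (alpha3) and second weight (alpha4)

# Each weight scales its feature data, producing the intermediate results,
# which are then fused by addition into the fifth feature data.
fifth_intermediate = a3 * n4
sixth_intermediate = a4 * n5
n6 = fifth_intermediate + sixth_intermediate
print(n6)
```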
- the embodiments of the present disclosure also provide a vehicle identification network, which can be configured to implement the technical solutions disclosed above.
- the vehicle recognition network includes: a feature extraction module 401 , a key point and local pixel region generation module 402 , and a joint training module 403 .
- the to-be-processed image 400 is processed by the feature extraction module 401 to obtain a third feature image 404 of the to-be-processed image.
- At least one first heat map and at least one second heat map 405 are obtained by processing the image to be processed by the key point and local pixel region generating module.
- the third feature image, the at least one first heat map, and the at least one second heat map are input to the joint training module to obtain the third feature data 406 .
- FIG. 5 is a schematic structural diagram of a feature extraction module.
- the feature extraction module includes three convolutional layers connected in series.
- the first convolutional layer 501 is conv2_x in ResNet50
- the second convolutional layer 502 is conv3_x in ResNet50
- the third convolutional layer 503 is conv4_x in ResNet50.
- Feature extraction is performed on the image 500 to be processed through the three convolution layers to obtain a third feature image 504 .
- Figure 6 shows a schematic diagram of the structure of the key point and local pixel region generation module.
- the keypoint and local pixel region generation module includes four convolutional layers in series.
- the first convolutional layer 601 is conv2_x in ResNet50
- the second convolutional layer 602 is conv3_x in ResNet50
- the third convolutional layer 603 is conv4_x in ResNet50
- the fourth convolutional layer 604 is conv5_x in ResNet50.
- the image 600 to be processed is processed through the four convolution layers to obtain at least one first heat map and at least one second heat map 605 .
- Figure 7 shows a schematic diagram of the structure of the joint training module.
- the third feature image 700 is processed by the first convolution layer 701 of the joint training module to obtain the first general feature image.
- the first feature image is obtained by performing dimensionality reduction on the channel dimension on the first general feature image through the first dimensionality reduction layer 702 .
- the first actor-critic module 703 processes the first feature image and at least one first heat map 704 to obtain k first critic feature data 705 .
- the k first critic feature data are processed through the first pooling layer 71 and the first normalization layer 72 in sequence, and k seventh feature data 705 are obtained.
- the third feature image is processed by the first convolution layer 701 of the joint training module to obtain a second general feature image.
- the second feature image is obtained by performing dimension reduction on the channel dimension on the second general feature image through the second dimension reduction layer 711 .
- the second feature image and at least one second heat map 713 are processed by the second actor-critic module 712 to obtain m second critic feature data.
- the m pieces of second critic feature data are processed through the second pooling layer 73 and the second normalization layer 74 in sequence, and m pieces of tenth feature data 714 are obtained.
- the third feature image is processed in turn by the second convolutional layer 721, the third dimension reduction layer 722, the third pooling layer 75, and the third normalization layer 76 of the joint training module to obtain the second feature data 723 .
- the first convolutional layer 701 and the second convolutional layer 721 are both conv5_x in ResNet50.
- the first dimension reduction layer 702, the second dimension reduction layer 711, and the third dimension reduction layer 722 all include a convolution kernel with a size of 1*1.
- FIG. 8 is a schematic diagram of the structure of the first actor-critic module.
- the input of the first actor-critic module is at least a first heatmap 801 and a first feature image 802 .
- the first actor-critic module respectively determines the dot product between each first heat map and the first feature image to obtain at least one sixth feature data 803 .
- a first probability corresponding to the sixth characteristic data can be obtained by processing a sixth characteristic data by the first scoring module 804 .
- the first probabilities are respectively multiplied by the corresponding sixth feature data to obtain k first actor feature data.
- the k first actor feature data are respectively normalized to obtain k first critic feature data 807 .
- FIG. 9 is a schematic structural diagram of the first scoring module.
- the sixth feature data 901 passes through the normalization layer 902, the pooling layer 903, and the fully connected layer 904 in turn to obtain the eighth feature data, and the softmax layer 905 processes the eighth feature data to obtain the first probability 906 .
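The normalize, pool, fully-connect, softmax pipeline of the scoring module can be sketched as below. This is a minimal NumPy stand-in, not the disclosed network: the input shape, the random parameters, and the choice of global average pooling are all assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def scoring_module(sixth_feature_data, fc_weight, fc_bias):
    """Sketch: normalization -> global average pooling -> FC -> softmax."""
    x = sixth_feature_data
    x = (x - x.mean()) / (x.std() + 1e-5)   # normalization layer
    pooled = x.mean(axis=(1, 2))            # pooling layer (one value per channel)
    logits = fc_weight @ pooled + fc_bias   # fully connected layer -> eighth feature data
    e = np.exp(logits - logits.max())       # softmax layer -> first probability
    return e / e.sum()

feats = rng.standard_normal((4, 8, 8))      # hypothetical sixth feature data
w = rng.standard_normal((2, 4))             # hypothetical FC parameters
b = np.zeros(2)
probs = scoring_module(feats, w, b)
print(probs.sum())  # the probabilities sum to 1
```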
- FIG. 10 is a schematic structural diagram of the second actor-critic module.
- the input to the second actor-critic module is at least a second heatmap and a third feature image.
- the second actor-critic module respectively determines the dot product between each second heat map 1001 and the third feature image 1002 to obtain at least one ninth feature data 1003 .
- a second probability 1005 corresponding to the ninth characteristic data can be obtained by processing a ninth characteristic data by the second scoring module 1004 .
- the second probabilities are respectively multiplied by the corresponding ninth feature data to obtain m second actor feature data 1006 .
- the m second actor feature data are respectively normalized to obtain m second critic feature data 1007 .
- FIG. 11 is a schematic structural diagram of the second scoring module.
- the ninth feature data 1101 passes through the normalization layer 1102, the pooling layer 1103, and the fully connected layer 1104 in turn to obtain the eleventh feature data, and the eleventh feature data is processed by the softmax layer 1105 to obtain a second probability 1106 .
- the present disclosure also provides a training method for a vehicle identification network.
- the training method may include the following steps:
- the training image includes the first vehicle to be recognized.
- the vehicle identification device receives the training image input by the user through the input component.
- the above input components include: keyboard, mouse, touch screen, touch pad, audio input and so on.
- the vehicle identification device receives the training image sent by the training data terminal.
- the above training data terminal can be any one of the following: a mobile phone, a computer, a tablet computer, and a server.
- the vehicle identification device receives the network to be trained input by the user through the input component.
- the above input components include: keyboard, mouse, touch screen, touch pad, audio input and so on.
- the vehicle identification device receives the network to be trained sent by the training data terminal.
- the above training data terminal can be any one of the following: a mobile phone, a computer, a tablet computer, and a server.
- the global feature information of the second vehicle to be identified includes overall appearance feature information of the second vehicle to be identified.
- the label of the training image includes category information of the second vehicle to be identified.
- vehicle 1 and vehicle 2 are included in all training data.
- the category information of the second vehicle to be identified is vehicle 1
- it is indicated that the second vehicle to be identified is vehicle 1 .
- the vehicle identification device may obtain the category of the second vehicle to be identified (hereinafter referred to as the global category) according to the fourteenth feature data, and a first global loss can be obtained according to the difference between the global category and the category information included in the label.
- the vehicle identification device can obtain the category of the second vehicle to be identified (hereinafter referred to as the key point category) according to the fifteenth feature data, and the first key point loss can be obtained according to the difference between the key point category and the category information included in the label.
- G1, p1, and Lt satisfy formula (1):
- G1, p1, and Lt satisfy formula (3):
- the vehicle identification device adjusts the parameters of the network to be trained according to the total loss until the total loss is less than the convergence threshold, and the vehicle identification network is obtained.
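Formulas (1) to (6) are not reproduced in this excerpt, so the sketch below assumes the simplest additive combination of the first global loss and the first key point loss, with a made-up one-parameter quadratic model, purely to illustrate "adjust the parameters until the total loss is less than the convergence threshold":

```python
# Hypothetical 1-D illustration of the training rule: combine the individual
# losses into a total loss, update the parameter by gradient descent, and
# stop once the total loss falls below the convergence threshold.
def train(theta, lr=0.1, convergence_threshold=1e-4, max_steps=1000):
    for _ in range(max_steps):
        global_loss = (theta - 1.0) ** 2          # stand-in for G1
        keypoint_loss = 0.5 * (theta - 1.0) ** 2  # stand-in for p1
        total_loss = global_loss + keypoint_loss  # assumed additive combination
        if total_loss < convergence_threshold:
            break
        grad = 2 * (theta - 1.0) + 1.0 * (theta - 1.0)  # d(total_loss)/d(theta)
        theta -= lr * grad
    return theta, total_loss

theta, final_loss = train(theta=5.0)
print(final_loss < 1e-4)  # True: training stopped below the threshold
```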
- the vehicle recognition network is obtained by adjusting the parameters of the network to be trained based on the total loss.
- the vehicle recognition network can be used to process the image to be processed to obtain the global feature information and key point feature information of the first vehicle to be recognized.
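The training procedure above can be sketched as a loop that adjusts the parameters according to the total loss until the total loss falls below the convergence threshold. The optimizer choice (plain gradient descent) and all names below are illustrative assumptions, not the patent's specified method:

```python
def train_until_convergence(params, total_loss_fn, grad_fn, lr=0.1,
                            threshold=1e-3, max_steps=10000):
    """Adjust the parameters according to the total loss until the total loss
    is less than the convergence threshold (plain gradient descent assumed;
    the patent does not name an optimizer)."""
    loss = total_loss_fn(params)
    for _ in range(max_steps):
        if loss < threshold:
            break  # converged: this is the trained vehicle identification network
        grads = grad_fn(params)
        params = [p - lr * g for p, g in zip(params, grads)]
        loss = total_loss_fn(params)
    return params, loss
```

As a usage sketch, minimizing the stand-in loss f(p) = p² from p = 1.0 drives the loss below the threshold within a few dozen steps.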
- before executing step 30, the vehicle identification device further executes the following steps:
- the vehicle identification device can obtain the category of the second vehicle to be identified (hereinafter referred to as the local pixel point region category) according to the sixteenth feature data, and can obtain a first local pixel point region loss according to the difference between the local pixel point region category and the category information included in the label.
- after obtaining the first local pixel point region loss, the vehicle identification device performs the following steps in the process of performing step 30:
- the first global loss is G 1
- the first key point loss is p 1
- the first local pixel region loss is ⁇ 1
- the total loss is L t
- G 1 ,p 1 , ⁇ 1 , L t satisfies formula (4):
- G 1 , p 1 , ⁇ 1 , L t satisfy formula (5):
- G 1 , p 1 , ⁇ 1 , L t satisfy formula (6):
- the vehicle recognition network is obtained by adjusting the parameters of the network to be trained based on the total loss, and the vehicle recognition network can be used to process the image to be processed to obtain the global feature information and key point feature information of the first vehicle to be recognized.
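Formulas (4)-(6) are not reproduced in this excerpt. One plausible instance, shown purely as an assumption, combines the first global loss G1, the first key point loss p1 and the first local pixel point region loss γ1 into the total loss Lt as a weighted sum (the weights are hypothetical):

```python
def total_loss(global_loss, keypoint_loss, local_region_loss,
               w_g=1.0, w_p=1.0, w_r=1.0):
    """One possible combination of G1, p1 and gamma1 into Lt.
    Equal unit weights are an assumption; the patent's formulas (4)-(6)
    may weight or transform the terms differently."""
    return w_g * global_loss + w_p * keypoint_loss + w_r * local_region_loss
```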
- the vehicle identification device performs the following steps in the process of executing step 27:
- the seventeenth feature data includes key point feature information of the second vehicle to be identified, and the feature information included in any two seventeenth feature data belongs to different key points.
- the s eighteenth feature data are fused to obtain the fifteenth feature data; correspondingly, when the vehicle identification network is used to process the to-be-processed image, the fourth feature data can be obtained according to the k seventh feature data.
- after obtaining the s eighteenth feature data and before executing step 34, the vehicle identification device further executes the following steps:
- the first identification result includes category information of the second vehicle to be identified.
- the vehicle identification device can obtain one first identification result according to one eighteenth feature data; according to the s eighteenth feature data, s first identification results of the second vehicle to be identified can be obtained.
- the vehicle identification device may obtain a first identification difference according to a first identification result and a label, and obtain s first identification differences according to the s first identification results and the label.
- the vehicle identification device obtains the keypoint category loss by determining the sum of the s first identification differences.
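The key point category loss above is the sum of the s first identification differences. The patent only says "difference"; cross-entropy between a result's probabilities and the label is assumed here as one concrete choice, and all names are illustrative:

```python
import math

def identification_difference(result_probs, label_index):
    """Difference between one first identification result and the label.
    Cross-entropy is an assumption; the patent does not fix the measure."""
    return -math.log(result_probs[label_index])

def keypoint_category_loss(first_results, label_index):
    """Sum of the s first identification differences."""
    return sum(identification_difference(r, label_index) for r in first_results)
```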
- after obtaining the key point category loss, the vehicle identification device performs the following steps in the process of executing step 34:
- the first global loss is G 1
- the first key point loss is p 1
- the first local pixel point region loss is ⁇ 1
- the key point category loss is p 2
- the total loss is L t . In a possible implementation, G 1 , p 1 , p 2 , γ 1 , L t satisfy formula (7):
- G 1 , p 1 , p 2 , ⁇ 1 , L t satisfy formula (8):
- G 1 , p 1 , p 2 , ⁇ 1 , L t satisfy formula (9):
- the fourth feature data can be obtained according to the k seventh feature data in the process of using the vehicle identification network to process the image to be processed.
- the vehicle identification device performs the following steps in the process of executing step 36:
- the first order may be the order of the included information amount from large to small, or the order of the included information amount from small to large.
- when the first order is the order of the included information amount from large to small, the vehicle identification device selects the first s feature data in the first order as the s eighteenth feature data; when the first order is the order of the included information amount from small to large, the vehicle identification device selects the last s feature data in the first order as the s eighteenth feature data.
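The selection rule above can be sketched as follows. Representing the "included information amount" as an explicit per-feature score is an assumption made for illustration:

```python
def select_s_feature_data(feature_data, info_amount, s, descending=True):
    """Sort the feature data into the first order by information amount and
    keep the s most informative entries: the first s when the order is from
    large to small, the last s when it is from small to large."""
    order = sorted(range(len(feature_data)),
                   key=lambda i: info_amount[i], reverse=descending)
    picked = order[:s] if descending else order[len(order) - s:]
    return [feature_data[i] for i in picked]
```

Either direction of the first order yields the same set of s most informative feature data.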
- the vehicle identification device also performs the following steps before performing step 40:
- the second order is the order of the key point category loss from small to large. That is, the smaller the keypoint category loss, the higher the ranking of the first recognition result in the second order.
- the second order is the order of the keypoint category loss from large to small. That is, the larger the keypoint category loss, the higher the ranking of the first recognition result in the second order.
- after obtaining the key point sorting loss, the vehicle identification device performs the following steps in the process of executing step 40:
- the first global loss is G 1
- the first key point loss is p 1
- the first local pixel region loss is ⁇ 1
- the key point category loss is p 2
- the key point sorting loss is p 3
- the total loss is L t
- G 1 , p 1 , p 2 , p 3 , ⁇ 1 , L t satisfy formula (10):
- G 1 , p 1 , p 2 , p 3 , ⁇ 1 , L t satisfy formula (11):
- G 1 , p 1 , p 2 , p 3 , ⁇ 1 , L t satisfy formula (12):
- adding the key point category loss to the total loss can improve the accuracy of the s eighteenth feature data, and then improve the accuracy of the information included in the fifteenth feature data.
- the accuracy of the k seventh feature data can be improved, thereby improving the accuracy of the information included in the fourth feature data.
- the vehicle identification device performs the following steps in the process of executing step 32:
- the nineteenth feature data includes local pixel point region feature information of the second vehicle to be identified, and the feature information included in any two nineteenth feature data belongs to different local pixel point regions.
- the sixteenth feature data is obtained by fusing the p twentieth feature data; correspondingly, when the vehicle identification network is used to process the to-be-processed image, the fifth feature data can be obtained according to the m tenth feature data.
- the vehicle identification device further executes the following steps:
- the second identification result includes category information of the second vehicle to be identified.
- the vehicle identification device can obtain one second identification result according to one twentieth feature data; according to the p twentieth feature data, p second identification results of the second vehicle to be identified may be obtained.
- the vehicle identification device may obtain a second identification difference according to a second identification result and a label, and may obtain p second identification differences according to the p second identification results and the label.
- the vehicle identification device obtains the local pixel point region category loss by determining the sum of the p second identification differences.
- after obtaining the local pixel point region category loss, the vehicle identification device performs the following steps in the process of executing step 45:
- the first global loss is G 1
- the first key point loss is p 1
- the first local pixel point region loss is ⁇ 1
- the key point category loss is p 2
- the key point sorting loss is p 3
- the local pixel point region category loss is γ 2
- the total loss is L t .
- G 1 ,p 1 ,p 2 ,p 3 , ⁇ 1 , ⁇ 2 ,L t satisfy equation (13):
- G 1 , p 1 , p 2 , p 3 , ⁇ 1 , ⁇ 2 , L t satisfy formula (14):
- G 1 , p 1 , p 2 , p 3 , ⁇ 1 , ⁇ 2 , L t satisfy formula (15):
- the fifth feature data can be obtained according to the m tenth feature data in the process of using the vehicle identification network to process the image to be processed.
- the vehicle identification device performs the following steps in the process of executing step 47:
- the third order may be the order of the included information amount from large to small, or the order of the included information amount from small to large.
- when the third order is the order of the included information amount from large to small, the vehicle identification device selects the first p feature data in the third order as the p twentieth feature data; when the third order is the order of the included information amount from small to large, the vehicle identification device selects the last p feature data in the third order as the p twentieth feature data.
- the vehicle identification device also performs the following steps before performing step 51:
- the fourth order is the order of the local pixel area category loss from small to large. That is, the smaller the local pixel area category loss, the higher the ranking of the second recognition result in the fourth order.
- the fourth order is the order of the local pixel region category loss from large to small. That is, the larger the local pixel region category loss, the higher the ranking of the second recognition result in the fourth order.
- after obtaining the local pixel point region sorting loss, the vehicle identification device performs the following steps in the process of executing step 51:
- the total loss is obtained according to the first global loss, the first key point loss, the first local pixel point region loss, the key point category loss, the key point sorting loss, the local pixel point region category loss, and the local pixel point region sorting loss.
- the first global loss is G 1
- the first key point loss is p 1
- the first local pixel point region loss is ⁇ 1
- the key point category loss is p 2
- the key point sorting loss is p 3
- the local pixel point region category loss is γ 2
- the local pixel region sorting loss is ⁇ 3
- the total loss is L t .
- G 1 ,p 1 ,p 2 ,p 3 , ⁇ 1 , ⁇ 2 , ⁇ 3 , L t satisfies formula (16):
- G 1 ,p 1 ,p 2 ,p 3 , ⁇ 1 , ⁇ 2 , ⁇ 3 ,L t satisfy formula (17):
- G 1 ,p 1 ,p 2 ,p 3 , ⁇ 1 , ⁇ 2 , ⁇ 3 ,L t satisfy formula (18):
- adding the local pixel area category loss to the total loss can improve the accuracy of the p twentieth feature data, and further improve the accuracy of the information included in the sixteenth feature data.
- the accuracy of the m tenth feature data can be improved, thereby improving the accuracy of the information included in the fifth feature data.
- the first global loss includes a global focus loss
- the vehicle identification device performs the following steps in the process of performing step 28:
- the third identification result includes category information of the second vehicle to be identified.
- the vehicle identification device can determine the category of the second vehicle to be identified according to the fourteenth characteristic data, and then obtain the third identification result.
- B is the number of training images
- ⁇ n is a positive number
- ⁇ is a non-negative number
- u n is the probability corresponding to the category of the label in the third recognition result.
- the training image includes image a
- the third recognition result 1 is obtained by processing the image a using the network to be trained, and the category included in the label of image a is vehicle 1 (that is, the label of image a is vehicle 1).
- the probability that the second vehicle to be recognized in the image a is the vehicle 1 is 0.9
- the probability that the second vehicle to be recognized in the image a is the vehicle 2 is 0.1.
- L F1 -2 ⁇ (1-0.9) 2 ⁇ log0.9.
- the training image includes image a and image b
- the image a is processed by the network to be trained to obtain the third recognition result 1
- the image b is processed by the network to be trained to obtain the third recognition result 2.
- the category included in the label of image a is vehicle 1 (that is, the label of image a is vehicle 1)
- the category included in the label of image b is vehicle 2 (that is, the label of image b is vehicle 2).
- the probability that the second vehicle to be recognized in the image a is the vehicle 1 is 0.3
- the probability that the second vehicle to be recognized in the image a is the vehicle 2 is 0.7.
- the probability that the second vehicle to be recognized in the image b is the vehicle 1 is 0.2
- the probability that the second vehicle to be recognized in the image b is the vehicle 2 is 0.8.
- L F1 -2 ⁇ (1-0.3) 2 ⁇ log0.3-2 ⁇ (1-0.8) 2 ⁇ log0.8.
- the image corresponding to the third recognition result whose maximum probability is between the first probability threshold and the second probability threshold is called a first difficult sample, and the images other than the first difficult samples in the training images are called first easy samples.
- the network to be trained obtains the third recognition result 1 by processing the image a.
- the probability that the second vehicle to be recognized in the image a is the vehicle 1 is 0.8, and the probability that the second vehicle to be recognized in the image a is the vehicle 2 is 0.2. Since the maximum probability of the third recognition result 1 is 0.8, which is greater than the second probability threshold, the image a is a first easy sample.
- the probability that the second vehicle to be recognized in the image a is the vehicle 1 is 0.5
- the probability that the second vehicle to be recognized in the image a is the vehicle 2 is 0.5. Since the maximum probability of the third recognition result 1 is 0.5, which is greater than the first probability threshold and less than the second probability threshold, the image a is a first difficult sample.
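The easy/difficult split in the two examples reduces to a threshold test on a recognition result's maximum probability. The default thresholds 0.4 and 0.7 are borrowed from the later numeric example for the third/fourth thresholds and are assumptions here:

```python
def is_difficult_sample(max_prob, low=0.4, high=0.7):
    """A result whose maximum probability lies strictly between the two
    probability thresholds marks a difficult sample; anything else (e.g. a
    confident maximum such as 0.8) marks an easy sample.
    Thresholds 0.4/0.7 are illustrative defaults."""
    return low < max_prob < high
```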
- the global focus loss is obtained by calculating the focal loss of the third recognition result, and then the total loss is determined, which can improve the training effect of the network to be trained.
- the training image belongs to a training image set
- the training image set further includes a first positive sample image of the training image and a first negative sample image of the training image
- the first global loss further includes a global triplet loss .
- the vehicle identification device also executes the following steps:
- the category information included in the label of the first positive sample image is the same as the category information included in the label of the training image
- the category information included in the label of the first negative sample image is different from the category information included in the label of the training image.
- the feature data of the first positive sample image includes semantic information of the first positive sample image, and the semantic information can be used to identify the category of the second vehicle to be identified in the first positive sample image.
- the feature data of the first negative sample image includes semantic information of the first negative sample image, and the semantic information can be used to identify the category of the second vehicle to be recognized in the first negative sample image.
- the vehicle identification device calculates the similarity between the twelfth feature data and the feature data of the first positive sample image to obtain the first positive similarity, and calculates the similarity between the twelfth feature data and the feature data of the first negative sample image Get the first negative similarity.
- the first positive similarity is a second norm between the twelfth feature data and the feature data of the first positive sample image.
- the first negative similarity is the second norm between the twelfth feature data and the feature data of the first negative sample image.
- the vehicle recognition apparatus may classify the images other than the training image in the training image set into a positive sample image set and a negative sample image set.
- the class information included in the labels of the images in the positive sample image set is the same as the class information included in the labels of the training images, and the class information included in the labels of the images in the negative sample image set is different from the class information included in the labels of the training images.
- the vehicle identification device performs feature extraction processing on the images in the positive sample image set to obtain a positive sample feature data set, and performs feature extraction processing on the images in the negative sample image set to obtain a negative sample feature data set.
- the vehicle identification device calculates the similarity between the twelfth feature data and the feature data in the positive sample feature data set to obtain a first positive similarity set, and calculates the similarity between the twelfth feature data and the feature data in the negative sample feature data set get the first negative similarity set.
- the minimum value in the first positive similarity set is called the minimum similarity within the first class
- the maximum value in the first negative similarity set is called the maximum similarity outside the first class.
- the similarity between the twelfth feature data and the feature data in the positive sample feature data set is the second norm between the twelfth feature data and the feature data in the positive sample feature data set.
- the similarity between the twelfth feature data and the feature data in the negative sample feature data set is the second norm between the twelfth feature data and the feature data in the negative sample feature data set.
- the global triplet loss can improve the accuracy of the recognition result of the second to-be-recognized vehicle obtained by the network to be trained based on the twelfth feature data, thereby improving the classification accuracy of the first to-be-recognized vehicle by the vehicle recognition network .
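The global triplet loss described above uses the second (L2) norm between feature data as the similarity measure. A minimal sketch follows; the margin value and the max(0, ·) hinge form come from the standard triplet formulation and are assumptions, not the patent's exact formula:

```python
import math

def l2_norm(a, b):
    # second norm between two feature vectors
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def global_triplet_loss(anchor, positive, negative, margin=0.3):
    """Encourage the twelfth feature data (anchor) to lie closer to the first
    positive sample image's features than to the first negative sample image's
    features by at least `margin` (margin=0.3 is an assumption)."""
    return max(0.0, l2_norm(anchor, positive) - l2_norm(anchor, negative) + margin)
```

When the positive sample is already much closer than the negative one, the loss is zero; otherwise the gap (plus the margin) is penalized.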
- the first global loss may be the sum of the global focus loss and the global triplet loss.
- before performing step 56, the vehicle identification device further performs the following steps:
- the fourth identification result includes category information of the second vehicle to be identified.
- the vehicle identification device can determine the category of the second vehicle to be identified according to the fifteenth characteristic data, and then obtain a fourth identification result.
- B is the number of training images
- ⁇ n is a positive number
- ⁇ is a non-negative number
- u m is the probability corresponding to the category of the label in the fourth recognition result.
- the training image includes image a
- the training image includes image a and image b
- the image a is processed by the network to be trained to obtain the fourth recognition result 1
- the image b is processed by the network to be trained to obtain the fourth recognition result 2.
- the category included in the label of image a is vehicle 1 (that is, the label of image a is vehicle 1)
- the category included in the label of image b is vehicle 2 (that is, the label of image b is vehicle 2).
- the probability that the second vehicle to be recognized in the image a is the vehicle 1 is 0.3
- the probability that the second vehicle to be recognized in the image a is the vehicle 2 is 0.7.
- the probability that the second vehicle to be recognized in the image b is the vehicle 1 is 0.2
- the probability that the second vehicle to be recognized in the image b is the vehicle 2 is 0.8.
- L F2 -2 ⁇ (1-0.3) 2 ⁇ log0.3-2 ⁇ (1-0.8) 2 ⁇ log0.8.
- after obtaining the key point focus loss, the vehicle identification device performs the following steps in the process of executing step 58:
- the total loss is obtained according to the first global loss, the first key point loss, the first local pixel point region loss, the key point category loss, the key point sorting loss, the local pixel point region category loss, the key point focus loss, and the local pixel point region sorting loss.
- the first global loss is G 1
- the first key point loss is p 1
- the first local pixel point region loss is ⁇ 1
- the key point category loss is p 2
- the key point sorting loss is p 3
- the local pixel point region category loss is γ 2
- the local pixel region sorting loss is ⁇ 3
- the key point focus loss is p 4
- the total loss is L t
- G 1 ,p 1 ,p 2 ,p 3 , p 4 , ⁇ 1 , ⁇ 2 , ⁇ 3 , L t satisfy formula (23):
- G 1 ,p 1 ,p 2 ,p 3 ,p 4 , ⁇ 1 , ⁇ 2 , ⁇ 3 ,L t satisfy formula (25):
- the image corresponding to the fourth recognition result whose maximum probability is between the third probability threshold and the fourth probability threshold is called a second difficult sample, and the images other than the second difficult samples in the training images are called second easy samples.
- the third probability threshold is 0.4 and the fourth probability threshold is 0.7.
- the network to be trained obtains the fourth recognition result 1 by processing the image a.
- the probability that the second vehicle to be recognized in the image a is the vehicle 1 is 0.8, and the probability that the second vehicle to be recognized in the image a is the vehicle 2 is 0.2. Since the maximum probability of the fourth recognition result 1 is 0.8, which is greater than the fourth probability threshold, the image a is a second easy sample.
- the probability that the second vehicle to be recognized in the image a is the vehicle 1 is 0.5
- the probability that the second vehicle to be recognized in the image a is the vehicle 2 is 0.5. Since the maximum probability of the fourth recognition result 1 is 0.5, which is greater than the third probability threshold and less than the fourth probability threshold, the image a is a second difficult sample.
- the key point focus loss is obtained and added to the total loss, which can improve the training effect of the network to be trained.
- before performing step 63, the vehicle identification device further performs the following steps:
- the key point triplet loss is obtained according to the fifteenth feature data, the feature data of the first positive sample image, and the feature data of the first negative sample image.
- the vehicle identification device calculates the similarity between the fifteenth feature data and the feature data of the first positive sample image to obtain the second positive similarity, and calculates the similarity between the fifteenth feature data and the feature data of the first negative sample image Get the second negative similarity.
- the second positive similarity is a second norm between the fifteenth feature data and the feature data of the first positive sample image.
- the second negative similarity is the second norm between the fifteenth feature data and the feature data of the first negative sample image.
- the vehicle identification device performs feature extraction processing on the images in the positive sample image set to obtain a positive sample feature data set, and performs feature extraction processing on the images in the negative sample image set to obtain a negative sample feature data set.
- the vehicle identification device calculates the similarity between the fifteenth feature data and the feature data in the positive sample feature data set to obtain a second positive similarity set, and calculates the similarity between the fifteenth feature data and the feature data in the negative sample feature data set degree to get the second negative similarity set.
- the minimum value in the second positive similarity set is called the minimum similarity within the second class
- the maximum value in the second negative similarity set is called the maximum similarity outside the second class.
- the similarity between the fifteenth feature data and the feature data in the positive sample feature data set is the second norm between the fifteenth feature data and the feature data in the positive sample feature data set.
- the similarity between the fifteenth feature data and the feature data in the negative sample feature data set is the second norm between the fifteenth feature data and the feature data in the negative sample feature data set.
- after obtaining the key point triplet loss, the vehicle identification device performs the following steps in the process of executing step 63:
- the total loss is obtained according to the first global loss, the first key point loss, the first local pixel point region loss, the key point category loss, the key point sorting loss, the local pixel point region category loss, the key point focus loss, the key point triplet loss, and the local pixel point region sorting loss.
- the first global loss is G 1
- the first key point loss is p 1
- the first local pixel point region loss is ⁇ 1
- the key point category loss is p 2
- the key point sorting loss is p 3
- the local pixel point region category loss is γ 2
- the local pixel region sorting loss is ⁇ 3
- the keypoint focus loss is p 4
- the key point triplet loss is p 5
- the total loss is L t .
- G 1 ,p 1 ,p 2 ,p 3 ,p 4 ,p 5 , ⁇ 1 , ⁇ 2 , ⁇ 3 ,L t satisfies formula (28):
- G 1 ,p 1 ,p 2 ,p 3 ,p 4 ,p 5 , ⁇ 1 , ⁇ 2 , ⁇ 3 ,L t satisfy formula (29):
- G 1 ,p 1 ,p 2 ,p 3 ,p 4 ,p 5 , ⁇ 1 , ⁇ 2 , ⁇ 3 ,L t satisfy formula (30):
- the triple loss of key points can improve the accuracy of the recognition result of the second vehicle to be recognized obtained by the network to be trained based on the fifteenth feature data, thereby improving the accuracy of the classification of the first vehicle to be recognized by the vehicle recognition network.
- before performing step 66, the vehicle identification device further performs the following steps:
- the fifth identification result includes category information of the second vehicle to be identified.
- the vehicle identification device can determine the type of the second vehicle to be identified according to the sixteenth characteristic data, and then obtain the fifth identification result.
- the focal loss of the fifth recognition result is calculated to obtain the local pixel point region focus loss.
- B is the number of training images
- ⁇ n is a positive number
- ⁇ is a non-negative number
- u k is the probability corresponding to the category of the label in the fifth recognition result.
- the training image includes image a
- the fifth recognition result 1 is obtained by processing the image a with the network to be trained, and the category included in the label of image a is vehicle 1 (that is, the label of image a is vehicle 1).
- the probability that the second vehicle to be recognized in the image a is the vehicle 1 is 0.9
- the training image includes image a and image b
- the image a is processed by the network to be trained to obtain the fifth recognition result 1
- the image b is processed by the network to be trained to obtain the fifth recognition result 2.
- the category included in the label of image a is vehicle 1 (that is, the label of image a is vehicle 1)
- the category included in the label of image b is vehicle 2 (that is, the label of image b is vehicle 2).
- the probability that the second vehicle to be recognized in the image a is the vehicle 1 is 0.3
- the probability that the second vehicle to be recognized in the image a is the vehicle 2 is 0.7.
- the probability that the second vehicle to be recognized in the image b is the vehicle 1 is 0.2
- the probability that the second vehicle to be recognized in the image b is the vehicle 2 is 0.8.
- L F3 -2 ⁇ (1-0.3) 2 ⁇ log0.3-2 ⁇ (1-0.8) 2 ⁇ log0.8.
- after obtaining the local pixel point region focus loss, the vehicle identification device performs the following steps in the process of executing step 66:
- the total loss is obtained according to the first global loss, the first key point loss, the first local pixel point region loss, the key point category loss, the key point sorting loss, the local pixel point region category loss, the key point focus loss, the key point triplet loss, the local pixel point region focus loss, and the local pixel point region sorting loss.
- the first global loss is G 1
- the first key point loss is p 1
- the first local pixel point region loss is ⁇ 1
- the key point category loss is p 2
- the key point sorting loss is p 3
- the local pixel point region category loss is γ 2
- the local pixel region sorting loss is ⁇ 3
- the local pixel region focus loss is ⁇ 4
- the key point focus loss is p 4
- the key point triple loss is p 5
- the total loss is L t .
- G 1 ,p 1 ,p 2 ,p 3 ,p 4 ,p 5 , ⁇ 1 , ⁇ 2 , ⁇ 3 , ⁇ 4 ,L t satisfy formula (32):
- G 1 ,p 1 ,p 2 ,p 3 ,p 4 ,p 5 , ⁇ 1 , ⁇ 2 , ⁇ 3 , ⁇ 4 ,L t satisfy formula (33):
- G 1 ,p 1 ,p 2 ,p 3 ,p 4 ,p 5 , ⁇ 1 , ⁇ 2 , ⁇ 3 , ⁇ 4 ,L t satisfy formula (34):
- the image corresponding to the fifth recognition result whose maximum probability is between the fifth probability threshold and the sixth probability threshold is called a third difficult sample, and the images other than the third difficult samples in the training images are called third easy samples.
- the fifth probability threshold is 0.4 and the sixth probability threshold is 0.7.
- the network to be trained obtains the fifth recognition result 1 by processing the image a.
- the probability that the second vehicle to be recognized in the image a is the vehicle 1 is 0.8, and the probability that the second vehicle to be recognized in the image a is the vehicle 2 is 0.2. Since the maximum probability of the fifth recognition result 1 is 0.8, which is greater than the sixth probability threshold, the image a is a third easy sample.
- the probability that the second vehicle to be recognized in the image a is the vehicle 1 is 0.5
- the probability that the second vehicle to be recognized in the image a is the vehicle 2 is 0.5. Since the maximum probability of the fifth recognition result 1 is 0.5, which is greater than the fifth probability threshold and less than the sixth probability threshold, the image a is a third difficult sample.
- the focus loss of the local pixel point region is obtained, and then the total loss is determined, which can improve the training effect of the third difficult sample, thereby improving the training effect of the network to be trained.
- before performing step 69, the vehicle identification device further performs the following steps:
- the local pixel point region triplet loss is obtained according to the sixteenth feature data, the feature data of the first positive sample image, and the feature data of the first negative sample image.
- the vehicle identification device calculates the similarity between the sixteenth feature data and the feature data of the first positive sample image to obtain a third positive similarity, and calculates the similarity between the sixteenth feature data and the feature data of the first negative sample image Get the third negative similarity.
- the third positive similarity is the 2-norm between the sixteenth feature data and the feature data of the first positive sample image.
- the third negative similarity is the 2-norm between the sixteenth feature data and the feature data of the first negative sample image.
- the vehicle identification device calculates the similarities between the sixteenth feature data and the feature data in the positive sample feature data set to obtain a third positive similarity set, and calculates the similarities between the sixteenth feature data and the feature data in the negative sample feature data set to obtain a third negative similarity set.
- the minimum value in the third positive similarity set is called the minimum similarity within the third class, and the maximum value in the third negative similarity set is called the maximum similarity outside the third class.
- the similarity between the sixteenth feature data and a feature data in the positive sample feature data set is the 2-norm between the sixteenth feature data and that feature data.
- the similarity between the sixteenth feature data and a feature data in the negative sample feature data set is the 2-norm between the sixteenth feature data and that feature data.
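A small sketch of the similarity computations above, with "similarity" taken as the 2-norm (Euclidean distance) between feature vectors as the text states; the helper names are illustrative:

```python
import math

def l2(a, b):
    # 2-norm between two feature vectors
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def class_extremes(anchor, positive_set, negative_set):
    # third positive / negative similarity sets
    pos = [l2(anchor, f) for f in positive_set]
    neg = [l2(anchor, f) for f in negative_set]
    # minimum similarity within the class and maximum similarity outside it
    return min(pos), max(neg)

lo, hi = class_extremes([0.0, 0.0], [[1.0, 0.0], [3.0, 0.0]], [[2.0, 0.0], [5.0, 0.0]])
print(lo, hi)  # 1.0 5.0
```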
- after obtaining the focus loss of the local pixel point region, the vehicle identification device performs the following steps in the process of executing step 69:
- according to the above-mentioned first global loss, first key point loss, first local pixel point region loss, key point category loss, key point sorting loss, local pixel point region category loss, key point focus loss, key point triplet loss, local pixel point region focus loss, local pixel point region triplet loss and local pixel point region sorting loss, the above-mentioned total loss is obtained.
- the first global loss is G1, the first key point loss is p1, the key point category loss is p2, the key point sorting loss is p3, the key point focus loss is p4, and the key point triplet loss is p5;
- the first local pixel point region loss is ℓ1, the local pixel point region category loss is ℓ2, the local pixel point region sorting loss is ℓ3, the local pixel point region focus loss is ℓ4, and the local pixel point region triplet loss is ℓ5.
- the total loss is Lt. In one possible implementation, G1, p1, p2, p3, p4, p5, ℓ1, ℓ2, ℓ3, ℓ4, ℓ5 and Lt satisfy formula (37):
- in another possible implementation, G1, p1, p2, p3, p4, p5, ℓ1, ℓ2, ℓ3, ℓ4, ℓ5 and Lt satisfy formula (38):
- in yet another possible implementation, G1, p1, p2, p3, p4, p5, ℓ1, ℓ2, ℓ3, ℓ4, ℓ5 and Lt satisfy formula (39):
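The bodies of formulas (37) to (39) are not reproduced in this text. Purely as an illustration of how the listed terms could combine, a weighted sum of the eleven losses (the weights α and β are our assumption, not the patent's) would read:

```latex
L_t = \alpha_0 G_1 + \sum_{i=1}^{5} \alpha_i\, p_i + \sum_{j=1}^{5} \beta_j\, \ell_j
```

The three formulas would then differ only in which weights are nonzero, matching the three loss combinations described above.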
- the local pixel point region triplet loss can improve the accuracy of the recognition result of the second vehicle to be recognized that the network to be trained obtains based on the sixteenth feature data, thereby improving the accuracy of the vehicle identification network's recognition result for the first vehicle to be identified.
- the vehicle identification device acquires the generated data set, and uses the generated data set to train the key point and local pixel point region generation module.
- the generated data set includes at least one heatmap training image
- the labels of each heatmap training image include a keypoint label heatmap and a local pixel region label heatmap.
- the key point label heatmap includes location information of key points in the heatmap training image
- the local pixel area label heatmap includes location information of the local pixel area in the heatmap training image.
- the embodiments of the present disclosure also provide an application scenario of the vehicle identification method. With the rapid growth of the number of cameras in public places, how to effectively determine the whereabouts of hit-and-run vehicles through massive video streams is of great significance.
- the police can input the image of the hit-and-run vehicle into the vehicle identification device.
- the vehicle identification device uses the technical solutions provided by the embodiments of the present disclosure to extract feature data of the hit-and-run vehicle from the image of the hit-and-run vehicle.
- the vehicle identification device can be connected with a plurality of surveillance cameras, different surveillance cameras are installed in different locations, and the vehicle identification device can obtain real-time captured video streams from each surveillance camera.
- the vehicle identification device uses the technical solutions provided by the embodiments of the present disclosure to extract feature data of vehicles in the video stream from the images in the video stream to obtain a feature database.
- the vehicle identification device compares the feature data of the hit-and-run vehicle with the feature data in the feature database, and obtains the feature data matching the feature data of the hit-and-run vehicle as the target feature data. It is determined that the image corresponding to the target feature data is an image containing the hit-and-run vehicle, and then the whereabouts of the hit-and-run vehicle can be determined according to the image containing the hit-and-run vehicle.
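The retrieval step in this scenario amounts to a nearest-neighbour search over the feature database. The sketch below uses cosine similarity, which is our assumption; the embodiment only states that matching feature data are found:

```python
import math

def cosine(a, b):
    # cosine similarity between two feature vectors
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def find_target(query_feature, feature_db):
    """feature_db: dict mapping frame id -> feature vector; returns best match."""
    return max(feature_db, key=lambda k: cosine(query_feature, feature_db[k]))

db = {"cam1_frame9": [0.9, 0.1], "cam2_frame3": [0.1, 0.9]}
print(find_target([0.8, 0.2], db))  # cam1_frame9
```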
- the writing order of the steps does not imply a strict execution order or constitute any limitation on the implementation process; the specific execution order of each step should be determined by its function and possible internal logic.
- FIG. 12 is a schematic structural diagram of a vehicle identification device 1 according to an embodiment of the present disclosure.
- the vehicle identification device 1 includes: an acquisition unit 11, a first processing unit 12, a second processing unit 13, a fusion processing unit 14, a third processing unit 15 and a fourth processing unit 16, wherein:
- an acquisition unit 11 configured to acquire the to-be-processed image containing the first vehicle to be identified
- the first processing unit 12 is configured to perform a first feature extraction process on the to-be-processed image to obtain first feature data including local feature information of the first to-be-recognized vehicle;
- the second processing unit 13 is configured to perform a second feature extraction process on the to-be-processed image to obtain second feature data including global feature information of the first to-be-recognized vehicle;
- the fusion processing unit 14 is configured to perform fusion processing on the first feature data and the second feature data to obtain third feature data of the first vehicle to be identified; the third feature data is used to obtain the first identification result of the first vehicle to be identified.
- the local feature information includes key point feature information
- the first feature data includes feature information of at least one key point of the vehicle to be identified.
- the local feature information further includes local pixel region feature information
- the first feature data further includes feature information of at least one local pixel region of the vehicle to be identified.
- the first processing unit 12 is configured to:
- perform a third feature extraction process on the to-be-processed image to obtain fourth feature data; the fourth feature data includes feature information of at least one key point of the first vehicle to be identified;
- perform a fourth feature extraction process on the to-be-processed image to obtain fifth feature data; the fifth feature data includes feature information of at least one local pixel point region of the first vehicle to be identified; the local pixel point region belongs to the pixel point region covered by the first vehicle to be identified, and the area of the local pixel point region is smaller than the area of the pixel point region covered by the first vehicle to be identified;
- fuse the fourth feature data and the fifth feature data to obtain the first feature data.
- the first processing unit 12 is configured to:
- perform a fifth feature extraction process on the to-be-processed image to obtain at least one sixth feature data; the sixth feature data includes feature information of the key points, and the feature information included in any two of the sixth feature data belongs to different key points;
- select, from the at least one sixth feature data, the k feature data including the largest amount of information to obtain k seventh feature data; the k is an integer not less than 1;
- obtain the fourth feature data according to the k seventh feature data.
- the first processing unit 12 is configured to:
- perform a sixth feature extraction process on the to-be-processed image to obtain at least one first heat map; the first heat map includes position information of the key points in the to-be-processed image, and the information included in any two of the first heat maps belongs to different key points;
- perform a seventh feature extraction process on the to-be-processed image to obtain a first feature image of the to-be-processed image; the first feature image includes feature information of the key points in the to-be-processed image;
- determine the dot product between each first heat map and the first feature image respectively to obtain the at least one sixth feature data.
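A minimal sketch of the keypoint-feature step described above: a keypoint heat map marks where a keypoint lies, and the elementwise ("dot") product of the heat map with the first feature image keeps only the features at that keypoint's location. Sizes and values are illustrative, not from the patent:

```python
H, W = 4, 4
# first feature image: a shared H x W feature map of the whole image
feature_image = [[float(r * W + c) for c in range(W)] for r in range(H)]
# first heat map for one keypoint: all zeros except at the keypoint location
heatmap = [[0.0] * W for _ in range(H)]
heatmap[1][2] = 1.0  # keypoint at row 1, column 2

# one "sixth feature data": heat map ⊙ feature image
sixth_feature = [[heatmap[r][c] * feature_image[r][c] for c in range(W)]
                 for r in range(H)]
print(sum(sum(row) for row in sixth_feature))  # 6.0: only the keypoint's feature survives
```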
- the first processing unit 12 is configured to:
- perform pooling on the feature data in the at least one sixth feature data respectively to obtain at least one eighth feature data;
- obtain at least one first probability according to the amount of information included in the at least one eighth feature data; the first probability is used to characterize the amount of information included in the sixth feature data, and the first probabilities are in one-to-one correspondence with the sixth feature data;
- in the case that the first probability is positively correlated with the amount of information included in the sixth feature data, select the sixth feature data corresponding to the k largest first probabilities as the k seventh feature data; or, in the case that the first probability is negatively correlated with the amount of information included in the sixth feature data, select the sixth feature data corresponding to the k smallest first probabilities as the k seventh feature data.
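The pooling-and-selection step can be sketched as follows. Average pooling and a softmax over the pooled values (so the first probability is positively correlated with the information amount) are our assumptions; the text leaves both unspecified:

```python
import math

def top_k_features(sixth_features, k):
    # eighth feature data: average-pool each feature map to a scalar
    pooled = [sum(sum(row) for row in f) / (len(f) * len(f[0]))
              for f in sixth_features]
    # first probabilities: softmax of the pooled values; positively correlated
    # with the information amount, so we keep the k largest
    exps = [math.exp(v) for v in pooled]
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [sixth_features[i] for i in ranked[:k]]

feats = [[[v, v], [v, v]] for v in (0.1, 0.9, 0.5)]  # three toy 2x2 feature maps
best = top_k_features(feats, k=2)
print([f[0][0] for f in best])  # [0.9, 0.5]
```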
- the first processing unit 12 is configured to:
- perform a tenth feature extraction process on the to-be-processed image to obtain at least one ninth feature data; the ninth feature data includes the feature information of the key points, and the feature information included in any two of the ninth feature data belongs to different local pixel point regions;
- select, from the at least two ninth feature data, the m feature data containing the largest amount of information to obtain m tenth feature data; the m is an integer not less than 1;
- obtain the fifth feature data according to the m tenth feature data.
- the first processing unit 12 is configured to:
- perform an eleventh feature extraction process on the to-be-processed image to obtain at least one second heat map; the second heat map includes position information of the local pixel point region in the to-be-processed image, and the information included in any two of the second heat maps belongs to different local pixel point regions;
- perform a twelfth feature extraction process on the to-be-processed image to obtain a second feature image of the to-be-processed image; the second feature image includes feature information of the local pixel point regions in the to-be-processed image;
- determine the dot product between each second heat map and the second feature image respectively to obtain the at least one ninth feature data.
- the first processing unit 12 is configured to:
- perform pooling on the feature data in the at least one ninth feature data respectively to obtain at least one eleventh feature data;
- obtain at least one second probability according to the amount of information included in the at least one eleventh feature data; the second probability is used to characterize the amount of information included in the ninth feature data, and the second probabilities are in one-to-one correspondence with the ninth feature data;
- in the case that the second probability is positively correlated with the amount of information included in the ninth feature data, select the ninth feature data corresponding to the m largest second probabilities as the m tenth feature data; or, in the case that the second probability is negatively correlated with the amount of information included in the ninth feature data, select the ninth feature data corresponding to the m smallest second probabilities as the m tenth feature data.
- the at least one local pixel point region includes a first pixel point region and a second pixel point region, the number of the ninth feature data and the m are both greater than 1, and the m tenth feature data include twelfth feature data and thirteenth feature data; the twelfth feature data includes feature information of the first pixel point region, and the thirteenth feature data includes feature information of the second pixel point region;
- the first processing unit 12 is configured as:
- obtain a first weight according to the amount of information included in the twelfth feature data, and obtain a second weight according to the amount of information included in the thirteenth feature data; the first weight is positively correlated with the amount of information included in the twelfth feature data, and the second weight is positively correlated with the amount of information included in the thirteenth feature data;
- according to the first weight and the second weight, perform weighted fusion on the twelfth feature data and the thirteenth feature data to obtain the fifth feature data.
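A minimal sketch of the weighted fusion, under the assumption that the "amount of information" of a feature is approximated by its L2 norm (the patent does not fix a measure); the normalization is also our choice:

```python
import math

def weighted_fusion(f12, f13):
    # information-amount proxies for the twelfth and thirteenth feature data
    w1 = math.sqrt(sum(x * x for x in f12))
    w2 = math.sqrt(sum(x * x for x in f13))
    total = w1 + w2
    w1, w2 = w1 / total, w2 / total  # first and second weights, normalized
    # fifth feature data: weight-blended combination of the two features
    return [w1 * a + w2 * b for a, b in zip(f12, f13)]

print(weighted_fusion([2.0, 0.0], [0.0, 1.0]))  # weights 2/3 and 1/3
```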
- the vehicle identification method executed by the vehicle identification device is applied to a vehicle identification network, and the acquisition unit 11 is further configured to acquire a training image containing the second vehicle to be identified and the network to be trained;
- the first processing unit 12 is further configured to use the network to be trained to process the training image to obtain fourteenth feature data including the global feature information of the second vehicle to be identified and fifteenth feature data including the key point feature information of the second vehicle to be identified;
- the third processing unit 15 is configured to obtain the first global loss according to the fourteenth feature data and the label of the training image
- the third processing unit 15 is further configured to obtain the first key point loss according to the fifteenth feature data and the label;
- the third processing unit 15 is further configured to obtain the total loss of the network to be trained according to the first global loss and the first key point loss;
- the fourth processing unit 16 is configured to adjust parameters of the network to be trained based on the total loss to obtain the vehicle identification network.
- the first processing unit 12 is further configured to, before the total loss of the network to be trained is obtained according to the first global loss and the first key point loss, use the network to be trained to process the training image to obtain sixteenth feature data including feature information of the local pixel point region of the second vehicle to be identified;
- the third processing unit 15 is further configured to obtain the first local pixel area loss according to the sixteenth feature data and the label;
- the third processing unit 15 is further configured to: obtain the total loss according to the first global loss, the first key point loss and the first local pixel area loss.
- the first processing unit 12 is configured to:
- use the network to be trained to process the training image to obtain at least one seventeenth feature data; the seventeenth feature data includes the key point feature information of the second vehicle to be identified, and the feature information included in any two of the seventeenth feature data belongs to different key points;
- select, from the at least one seventeenth feature data, the s feature data including the largest amount of information to obtain s eighteenth feature data; the s is an integer not less than 1;
- fuse the s eighteenth feature data to obtain the fifteenth feature data.
- the third processing unit 15 is further configured to, before the total loss is obtained, obtain s first recognition results of the second vehicle to be identified according to the s eighteenth feature data;
- the key point category loss is obtained according to the differences between the s first recognition results and the label respectively;
- the fourth processing unit 16 is configured as:
- the total loss is obtained according to the first global loss, the first keypoint loss, the first local pixel region loss, and the keypoint category loss.
- the first processing unit 12 is configured to:
- sort the at least one seventeenth feature data according to the amount of information included to obtain a first order; the first order is the order of the included amount of information from large to small, or the order of the included amount of information from small to large;
- according to the first order, select, from the at least one seventeenth feature data, the s feature data including the largest amount of information to obtain the s eighteenth feature data;
- the third processing unit 15 is configured to, before the total loss is obtained according to the first global loss, the first key point loss, the first local pixel point region loss and the key point category loss, sort the s first recognition results according to the corresponding key point category losses to obtain a second order; the second order is the order of the key point category loss from large to small, or the order of the key point category loss from small to large; a key point sorting loss is obtained according to the difference between the first order and the second order;
- the fourth processing unit 16 is configured as:
- the total loss is obtained according to the first global loss, the first keypoint loss, the first local pixel region loss, the keypoint category loss, and the keypoint sorting loss.
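One way to picture the key point sorting loss: it penalizes disagreement between the first order (features ranked by information amount) and the second order (recognition results ranked by category loss). Measuring disagreement as total rank displacement is our illustrative choice, not the patent's formula:

```python
def sorting_loss(first_order, second_order):
    """Both arguments are lists of the same item ids, in ranked order."""
    pos = {item: i for i, item in enumerate(second_order)}
    # total displacement of each item between the two rankings
    return sum(abs(i - pos[item]) for i, item in enumerate(first_order))

print(sorting_loss(["a", "b", "c"], ["a", "b", "c"]))  # 0: rankings agree
print(sorting_loss(["a", "b", "c"], ["c", "b", "a"]))  # 4: rankings reversed
```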
- the first processing unit 12 is configured to:
- use the network to be trained to process the training image to obtain at least one nineteenth feature data; the nineteenth feature data includes the feature information of the local pixel point region, and the feature information included in any two of the nineteenth feature data belongs to different local pixel point regions;
- select, from the at least one nineteenth feature data, the p feature data including the largest amount of information to obtain p twentieth feature data; the p is an integer not less than 1;
- fuse the p twentieth feature data to obtain the sixteenth feature data.
- the third processing unit 15 is configured to, before the total loss is obtained according to the first global loss, the first key point loss, the first local pixel point region loss, the key point category loss and the key point sorting loss, obtain p second recognition results of the second vehicle to be identified according to the p twentieth feature data;
- the fourth processing unit 16 is configured as:
- according to the first global loss, the first key point loss, the first local pixel point region loss, the key point category loss, the key point sorting loss and the local pixel point region category loss, the total loss is obtained.
- the first processing unit 12 is configured to:
- sort the at least one nineteenth feature data according to the amount of information included to obtain a third order; the third order is the order of the included amount of information from large to small, or the order of the included amount of information from small to large;
- according to the third order, select, from the at least one nineteenth feature data, the p feature data including the largest amount of information to obtain the p twentieth feature data;
- the third processing unit 15 is configured to, before the total loss is obtained according to the first global loss, the first key point loss, the first local pixel point region loss, the key point category loss, the key point sorting loss and the local pixel point region category loss, sort the p second recognition results according to the corresponding local pixel point region category losses to obtain a fourth order; the fourth order is the order of the local pixel point region category loss from large to small, or the order of the local pixel point region category loss from small to large;
- the fourth processing unit 16 is configured as:
- according to the first global loss, the first key point loss, the first local pixel point region loss, the key point category loss, the key point sorting loss, the local pixel point region category loss and the local pixel point region sorting loss, the total loss is obtained.
- the first global loss includes a global focus loss
- the third processing unit 15 is configured to:
- the focus loss of the third identification result is obtained as the global focus loss.
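The global focus loss behaves like a standard focal loss: samples the network already classifies confidently contribute little, so training concentrates on difficult samples. A sketch with the conventional focal-loss form (γ = 2 is our assumption, not a value stated in the text):

```python
import math

def focal_loss(p_true, gamma=2.0):
    """p_true: probability the model assigns to the ground-truth class."""
    # the (1 - p)^gamma factor down-weights confidently classified samples
    return -((1.0 - p_true) ** gamma) * math.log(p_true)

easy = focal_loss(0.9)  # confident prediction: heavily down-weighted
hard = focal_loss(0.5)  # ambiguous prediction: dominates the loss
print(easy < hard)      # True
```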
- the training image belongs to a training image set;
- the training image set further includes a first positive sample image of the training image and a first negative sample image of the training image;
- the first global loss also includes the global triplet loss;
- the third processing unit 15 is further configured to:
- the global triplet loss is obtained according to the fourteenth feature data, the feature data of the first positive sample image, and the feature data of the first negative sample image.
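A sketch of a triplet loss of the kind referenced here: the anchor feature is pulled toward the positive sample's feature and pushed away from the negative sample's, using Euclidean distance and a margin (the margin value is our assumption):

```python
import math

def l2_dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=0.3):
    # zero loss once the negative is farther than the positive by the margin
    return max(0.0, l2_dist(anchor, positive) - l2_dist(anchor, negative) + margin)

anchor, pos, neg = [0.0, 0.0], [0.1, 0.0], [1.0, 0.0]
print(triplet_loss(anchor, pos, neg))  # 0.0: negative already far enough
```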
- by performing fusion processing on the first feature data and the second feature data, the vehicle identification device can obtain third feature data that includes both the global feature information and the local feature information of the first vehicle to be identified. Using the third feature data as the feature data of the first vehicle to be recognized enriches the information included in that feature data.
- the functions or modules included in the apparatuses provided in the embodiments of the present disclosure may be used to execute the methods described in the above method embodiments.
- FIG. 13 is a schematic diagram of a hardware structure of a vehicle identification device according to an embodiment of the present disclosure.
- the vehicle identification device 2 includes a processor 21 , a memory 22 , an input device 23 , and an output device 24 .
- the processor 21 , the memory 22 , the input device 23 , and the output device 24 are coupled through a connector, and the connector includes various types of interfaces, transmission lines, or buses, which are not limited in this embodiment of the present disclosure. It should be understood that, in various embodiments of the present disclosure, coupling refers to mutual connection in a specific manner, including direct connection or indirect connection through other devices, such as various interfaces, transmission lines, and buses.
- the processor 21 may be one or more graphics processing units (graphics processing units, GPUs).
- the GPU may be a single-core GPU or a multi-core GPU.
- the processor 21 may be a processor group composed of multiple GPUs, and the multiple processors are coupled to each other through one or more buses.
- the processor may also be other types of processors, etc., which is not limited in this embodiment of the present disclosure.
- the memory 22 may be used to store computer program instructions, as well as various types of computer program code, including program code for implementing the disclosed aspects.
- the memory includes, but is not limited to, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), or compact disc read-only memory (CD-ROM), and is used for related instructions and data.
- the input device 23 is configured to input data and/or signals
- the output device 24 is configured to output data and/or signals.
- the input device 23 and the output device 24 may be independent devices or may be an integral device.
- the memory 22 can be used not only to store related instructions but also to store related data; for example, the memory 22 can be used to store the to-be-processed images obtained through the input device 23, or to store the third feature data obtained through the processor 21; the embodiment of the present disclosure does not limit the data specifically stored in the memory.
- FIG. 13 only shows a simplified design of a vehicle identification device.
- the vehicle identification device may also include other necessary elements, including but not limited to any number of input/output devices, processors, memories, etc.; all vehicle identification devices that can implement the embodiments of the present disclosure fall within the protection scope of the present disclosure.
- the disclosed system, apparatus and method may be implemented in other manners.
- the apparatus embodiments described above are only illustrative.
- the division of the units is only a logical function division, and there may be other division methods in actual implementation; multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
- the shown or discussed mutual coupling or direct coupling or communication connection may be through some interfaces, indirect coupling or communication connection of devices or units, and may be in electrical, mechanical or other forms.
- the units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
- each functional unit in each embodiment of the present disclosure may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
- the above-mentioned embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof.
- when implemented in software, they may be implemented in whole or in part in the form of a computer program product.
- the computer program product includes one or more computer instructions.
- the computer may be a general purpose computer, special purpose computer, computer network, or other programmable device.
- the computer instructions may be stored in or transmitted over a computer-readable storage medium.
- the computer instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave) means.
- the computer-readable storage medium may be any available medium that can be accessed by a computer or a data storage device such as a server, data center, etc. that includes an integration of one or more available media.
- the available media may be magnetic media (e.g., floppy disks, hard disks, magnetic tapes), optical media (e.g., digital versatile discs (DVDs)), semiconductor media (e.g., solid state disks (SSDs)), etc.
- the process can be completed by a computer program instructing the relevant hardware; the program can be stored in a computer-readable storage medium, and when the program is executed, it may include the processes of the foregoing method embodiments.
- the aforementioned storage medium includes: read-only memory (read-only memory, ROM) or random access memory (random access memory, RAM), magnetic disk or optical disk and other media that can store program codes.
- the present disclosure discloses a vehicle identification method and device, an electronic device and a storage medium.
- the method includes: acquiring a to-be-processed image containing a first vehicle to be identified; performing a first feature extraction process on the to-be-processed image to obtain first feature data including local feature information of the first vehicle to be identified; performing a second feature extraction process on the to-be-processed image to obtain second feature data including the global feature information of the first vehicle to be identified; performing fusion processing on the first feature data and the second feature data to obtain the and the third feature data of the first vehicle to be identified; the third feature data is used to obtain the identification result of the first vehicle to be identified.
Claims (24)
- 一种车辆识别方法,所述方法包括:获取包含第一待识别车辆的待处理图像;对所述待处理图像进行第一特征提取处理,得到包括所述第一待识别车辆的局部特征信息的第一特征数据;对所述待处理图像进行第二特征提取处理,得到包括所述第一待识别车辆的全局特征信息的第二特征数据;对所述第一特征数据和所述第二特征数据进行融合处理,得到所述第一待识别车辆的第三特征数据;其中,所述第三特征数据用于获得所述第一待识别车辆的识别结果。
- 根据权利要求1所述的方法,所述局部特征信息包括关键点特征信息,所述第一特征数据包括所述待识别车辆的至少一个关键点的特征信息。
- 根据权利要求2所述的方法,所述局部特征信息还包括局部像素点区域特征信息,所述第一特征数据还包括所述待识别车辆的至少一个局部像素点区域的特征信息。
- 根据权利要求3所述的方法,所述对所述待处理图像进行第一特征提取处理,得到包括所述第一待识别车辆的局部特征信息的第一特征数据,包括:对所述待处理图像进行第三特征提取处理,得到第四特征数据;所述第四特征数据包括所述第一待识别车辆的至少一个关键点的特征信息;对所述待处理图像进行第四特征提取处理,得到第五特征数据;所述第五特征数据包括所述第一待识别车辆的至少一个局部像素点区域的特征信息;所述局部像素点区域属于所述第一待识别车辆所覆盖的像素点区域,且所述局部像素点区域的面积小于所述第一待识别车辆所覆盖的像素点区域的面积;对所述第四特征数据和第五特征数据进行融合处理,得到所述第一特征数据。
- 根据权利要求4所述的方法,所述对所述待处理图像进行第三特征提取处理,得到第四特征数据,包括:对所述待处理图像进行第五特征提取处理,得到至少一个第六特征数据;所述第六特征数据包括所述关键点的特征信息,且任意两个所述第六特征数据所包括的特征信息属于不同的关键点;从所述至少一个第六特征数据中选取包括信息量最多的k个特征数据,得到k个第七特征数据;所述k为不小于1的整数;依据所述k个第七特征数据得到所述第四特征数据。
- 根据权利要求5所述的方法,所述对所述待处理图像进行第五特征提取处理,得到至少一个第六特征数据,包括:对所述待处理图像进行第六特征提取处理,得到至少一张第一热力图;所述第一热力图包括所述关键点在所述待处理图像中的位置信息,且任意两张所述第一热力图所包括的信息属于不同的关键点;对所述待处理图像进行第七特征提取处理,得到所述待处理图像的第一特征图像;所述第一特征图像包括所述待处理图像中的关键点的特征信息;分别确定每张所述第一热力图与所述第一特征图像之间的点积,得到所述至少一个第六特征数据。
- 根据权利要求5或6所述的方法,所述从所述至少一个第六特征数据中选取包括信息量最多的k个特征数据,得到k个第七特征数据,包括:对所述至少一个第六特征数据中的特征数据分别进行池化处理,得到至少一个第八特征数据;依据所述至少一个第八特征数据所包括的信息量,得到至少一个第一概率;所述第一概率用于表征所述第六特征数据所包括的信息量;所述第一概率与所述第六特征数据一一对应;在所述第一概率与所述第六特征数据所包括的信息量呈正相关的情况下,选取最大的k个所述第一概率所对应的所述第六特征数据,作为所述k个第七特征数据;或,在所述第一概率与所述第六特征数据所包括的信息量呈负相关的情况下,选取最小的k个所述第一概率所对应的所述第六特征数据,作为所述k个第七特征数据。
- 根据权利要求3至7中任意一项所述的方法,所述对所述待处理图像进行第四特征提取处理,得到第五特征数 据,包括:对所述待处理图像进行第十特征提取处理,得到至少一个第九特征数据;所述第九特征数据包括所述关键点的特征信息,且任意两个所述第九特征数据所包括的特征信息属于不同的局部像素点区域;从所述至少两个第九特征数据中选取包含信息量最多的m个特征数据,得到m个第十特征数据;所述m为不小于1的整数;依据所述m个第十特征数据得到所述第五特征数据。
- 根据权利要求8所述的方法,所述对所述待处理图像进行第十特征提取处理,得到至少一个第九特征数据,包括:对所述待处理图像进行第十一特征提取处理,得到所述至少一张第二热力图;所述第二热力图包括所述局部像素点区域在所述待处理图像中的位置信息,且任意两张所述第二热力图所包括的信息属于不同的局部像素点区域;对所述待处理图像进行第十二特征提取处理,得到所述待处理图像的第二特征图像;所述第二特征图像包括所述待处理图像中的局部像素点区域的特征信息;分别确定每张所述第二热力图与所述第二特征图像之间的点积,得到所述至少一个第九特征数据。
- 根据权利要求8或9所述的方法,所述从所述至少两个第九特征数据中选取包含信息量最多的m个特征数据,得到m个第十特征数据,包括:对所述第九特征数据中的特征数据分别进行池化处理,得到至少一个第十一特征数据;依据所述至少一个第十一特征数据所包括的信息量,得到至少一个第二概率;所述第二概率用于表征所述第九特征数据中包括的信息量;所述第二概率与所述第九特征数据一一对应;在所述第二概率与所述第九特征数据所包括的信息量呈正相关的情况下,选取最大的m个所述第二概率所对应的所述第九特征数据,作为所述m个第十特征数据;或,在所述第二概率与所述第九特征数据所包括的信息量呈负相关的情况下,选取最小的m个所述第二概率所对应的所述第九特征数据,作为所述m个第十特征数据。
- 根据权利要求8至10中任意一项所述的方法,所述至少一个局部像素点区域包括:第一像素点区域和第二像素点区域,所述第九特征数据的数量和所述m均大于1,所述m个第十特征数据包括:第十二特征数据和第十三特征数据,所述第十二特征数据包括所述第一像素点区域的特征信息,所述第十三特征数据包括所述第二像素点区域的特征信息;所述依据所述m个第十特征数据得到所述第五特征数据,包括:依据所述第十二特征数据所包括的信息量得到第一权重,依据所述第十三特征数据所包括的信息量得到第二权重;所述第一权重与所述第十二特征数据所包括的信息量呈正相关,所述第二权重与所述第十三特征数据所包括的信息量呈正相关;依据所述第一权重和所述第二权重,对所述第十二特征数据和所述第十三特征数据进行加权融合,得到所述第五特征数据。
- The method according to any one of claims 1 to 11, wherein the vehicle identification method is applied to a vehicle identification network, and a method of training the vehicle identification network comprises: acquiring a training image containing a second to-be-identified vehicle and a network to be trained; processing the training image with the network to be trained to obtain fourteenth feature data including global feature information of the second to-be-identified vehicle and fifteenth feature data including key-point feature information of the second to-be-identified vehicle; obtaining a first global loss from the fourteenth feature data and a label of the training image; obtaining a first key-point loss from the fifteenth feature data and the label; obtaining a total loss of the network to be trained from the first global loss and the first key-point loss; and adjusting parameters of the network to be trained based on the total loss to obtain the vehicle identification network.
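The claim combines the per-branch losses into a total loss without fixing the combination rule; a weighted sum is one plausible sketch (the weights are illustrative assumptions, not values from the claims):

```python
def total_loss(global_loss, keypoint_loss, w_g=1.0, w_k=1.0):
    # Hypothetical combination: the claim only requires that the total
    # loss be obtained from both branch losses.
    return w_g * global_loss + w_k * keypoint_loss
```

The later claims extend the sum with further terms (local-pixel-region, category and ranking losses) in the same fashion.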
- The method according to claim 12, wherein, before the total loss of the network to be trained is obtained from the first global loss and the first key-point loss, the method further comprises: processing the training image with the network to be trained to obtain sixteenth feature data including feature information of a local pixel region of the second to-be-identified vehicle; and obtaining a first local-pixel-region loss from the sixteenth feature data and the label; and obtaining the total loss of the network to be trained from the first global loss and the first key-point loss comprises: obtaining the total loss from the first global loss, the first key-point loss and the first local-pixel-region loss.
- The method according to claim 13, wherein processing the training image with the network to be trained to obtain the fifteenth feature data including the key-point feature information of the second to-be-identified vehicle comprises: processing the training image with the network to be trained to obtain at least one piece of seventeenth feature data, each piece of seventeenth feature data including key-point feature information of the second to-be-identified vehicle, wherein any two pieces of seventeenth feature data include feature information of different key points; selecting, from the at least one piece of seventeenth feature data, the s pieces of feature data containing the most information to obtain s pieces of eighteenth feature data, s being an integer not less than 1; and performing fusion processing on the s pieces of eighteenth feature data to obtain the fifteenth feature data.
- The method according to claim 14, wherein, before the total loss is obtained from the first global loss, the first key-point loss and the first local-pixel-region loss, the method further comprises: obtaining s first identification results of the second to-be-identified vehicle from the s pieces of eighteenth feature data; and obtaining a key-point category loss from the differences between the s first identification results and the label respectively; and obtaining the total loss from the first global loss, the first key-point loss and the first local-pixel-region loss comprises: obtaining the total loss from the first global loss, the first key-point loss, the first local-pixel-region loss and the key-point category loss.
- The method according to claim 14 or 15, wherein selecting, from the at least one piece of seventeenth feature data, the s pieces of feature data containing the most information to obtain the s pieces of eighteenth feature data comprises: sorting the at least one piece of seventeenth feature data by the amount of information included to obtain a first order, the first order being the descending order of the amount of information included, or the first order being the ascending order of the amount of information included; and selecting, according to the first order, the s pieces of feature data containing the most information from the at least one piece of seventeenth feature data to obtain the s pieces of eighteenth feature data; before the total loss is obtained from the first global loss, the first key-point loss, the first local-pixel-region loss and the key-point category loss, the method further comprises: sorting the s first identification results by the corresponding key-point category losses to obtain a second order, the second order being the descending order of the key-point category loss, or the second order being the ascending order of the key-point category loss; and obtaining a key-point ranking loss from the difference between the first order and the second order; and obtaining the total loss from the first global loss, the first key-point loss, the first local-pixel-region loss and the key-point category loss comprises: obtaining the total loss from the first global loss, the first key-point loss, the first local-pixel-region loss, the key-point category loss and the key-point ranking loss.
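The key-point ranking loss penalizes disagreement between the information-content order and the category-loss order: a piece carrying more information should incur a smaller classification loss. A pairwise hinge over all mis-ordered pairs is one sketch of such a loss (the margin and the hinge form are assumptions; the claim only requires a loss derived from the difference between the two orders):

```python
def ranking_loss(info_scores, category_losses, margin=0.1):
    # For every pair where item i is more informative than item j,
    # require loss_i to be smaller than loss_j by at least `margin`;
    # violations contribute linearly to the loss.
    loss = 0.0
    n = len(info_scores)
    for i in range(n):
        for j in range(n):
            if info_scores[i] > info_scores[j]:
                loss += max(0.0, category_losses[i] - category_losses[j] + margin)
    return loss
```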
- The method according to claim 16, wherein processing the training image with the network to be trained to obtain the sixteenth feature data including the feature information of the local pixel region of the second to-be-identified vehicle comprises: processing the training image with the network to be trained to obtain at least one piece of nineteenth feature data, each piece of nineteenth feature data including feature information of a local pixel region, wherein any two pieces of nineteenth feature data include feature information of different local pixel regions; selecting, from the at least one piece of nineteenth feature data, the p pieces of feature data containing the most information to obtain p pieces of twentieth feature data, p being an integer not less than 1; and performing fusion processing on the p pieces of twentieth feature data to obtain the sixteenth feature data.
- The method according to claim 17, wherein, before the total loss is obtained from the first global loss, the first key-point loss, the first local-pixel-region loss, the key-point category loss and the key-point ranking loss, the method further comprises: obtaining p second identification results of the second to-be-identified vehicle from the p pieces of twentieth feature data; and obtaining a local-pixel-region category loss from the differences between the p second identification results and the label respectively; and obtaining the total loss from the first global loss, the first key-point loss, the first local-pixel-region loss, the key-point category loss and the key-point ranking loss comprises: obtaining the total loss from the first global loss, the first key-point loss, the first local-pixel-region loss, the key-point category loss, the key-point ranking loss and the local-pixel-region category loss.
- The method according to claim 17 or 18, wherein selecting, from the at least one piece of nineteenth feature data, the p pieces of feature data containing the most information to obtain the p pieces of twentieth feature data comprises: sorting the at least one piece of nineteenth feature data by the amount of information included to obtain a third order, the third order being the descending order of the amount of information included, or the third order being the ascending order of the amount of information included; and selecting, according to the third order, the p pieces of feature data containing the most information from the at least one piece of nineteenth feature data to obtain the p pieces of twentieth feature data; before the total loss is obtained from the first global loss, the first key-point loss, the first local-pixel-region loss, the key-point category loss, the key-point ranking loss and the local-pixel-region category loss, the method further comprises: sorting the p second identification results by the corresponding local-pixel-region category losses to obtain a fourth order, the fourth order being the descending order of the local-pixel-region category loss, or the fourth order being the ascending order of the local-pixel-region category loss; and obtaining a local-pixel-region ranking loss from the difference between the third order and the fourth order; and obtaining the total loss from the first global loss, the first key-point loss, the first local-pixel-region loss, the key-point category loss, the key-point ranking loss and the local-pixel-region category loss comprises: obtaining the total loss from the first global loss, the first key-point loss, the first local-pixel-region loss, the key-point category loss, the key-point ranking loss, the local-pixel-region category loss and the local-pixel-region ranking loss.
- The method according to any one of claims 12 to 19, wherein the first global loss includes a global focal loss; and obtaining the first global loss from the fourteenth feature data and the label of the training image comprises: obtaining a third identification result of the second to-be-identified vehicle from the fourteenth feature data; and obtaining the focal loss of the third identification result from the third identification result and the label, as the global focal loss.
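The global focal loss is the standard focal loss (Lin et al., "Focal Loss for Dense Object Detection") applied to the global identification result. A minimal sketch for a single sample; the gamma and alpha values below are the usual defaults, not values taken from the claims:

```python
import numpy as np

def focal_loss(probs, target, gamma=2.0, alpha=0.25):
    # probs: predicted class probabilities for one sample; target: true class.
    # The (1 - p_t)^gamma factor down-weights easy examples, so the loss
    # concentrates on samples the network still misclassifies.
    p_t = probs[target]
    return -alpha * (1.0 - p_t) ** gamma * np.log(p_t)
```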
- The method according to claim 20, wherein the training image belongs to a training image set, the training image set further including a first positive sample image of the training image and a first negative sample image of the training image, and the first global loss further includes a global triplet loss; the method further comprises: performing feature extraction processing on the first positive sample image with the network to be trained to obtain feature data of the first positive sample image; performing feature extraction processing on the first negative sample image with the network to be trained to obtain feature data of the first negative sample image; and obtaining the global triplet loss from the twelfth feature data, the feature data of the first positive sample image and the feature data of the first negative sample image.
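The global triplet loss pulls the training-image feature toward the positive sample's feature and pushes it away from the negative sample's. A standard sketch with Euclidean distances (the margin is an illustrative hyperparameter, not specified by the claim):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.3):
    # anchor: feature of the training image; positive/negative: features
    # of the first positive/negative sample images.
    d_pos = np.linalg.norm(anchor - positive)  # anchor-positive distance
    d_neg = np.linalg.norm(anchor - negative)  # anchor-negative distance
    return max(0.0, d_pos - d_neg + margin)
```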
- A vehicle identification apparatus, comprising: an acquisition unit configured to acquire a to-be-processed image containing a first to-be-identified vehicle; a first processing unit configured to perform first feature extraction processing on the to-be-processed image to obtain first feature data including local feature information of the first to-be-identified vehicle; a second processing unit configured to perform second feature extraction processing on the to-be-processed image to obtain second feature data including global feature information of the first to-be-identified vehicle; and a fusion processing unit configured to perform fusion processing on the first feature data and the second feature data to obtain third feature data of the first to-be-identified vehicle, the third feature data being used to obtain an identification result of the first to-be-identified vehicle.
- An electronic device, comprising a processor and a memory, the memory being configured to store computer program code including computer instructions, wherein, in the case where the processor executes the computer instructions, the electronic device performs the method according to any one of claims 1 to 21.
- A computer-readable storage medium storing a computer program, the computer program including program instructions which, in the case where they are executed by a processor, cause the processor to perform the method according to any one of claims 1 to 21.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020217042600A KR20220035335A (ko) | 2020-09-10 | 2020-12-28 | Vehicle identification method and apparatus, electronic device and storage medium |
JP2021575043A JP2023501028A (ja) | 2020-09-10 | 2020-12-28 | Vehicle identification method and apparatus, electronic device and storage medium |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010947349.1A CN112101183B (zh) | 2020-09-10 | 2020-09-10 | Vehicle identification method and apparatus, electronic device and storage medium |
CN202010947349.1 | 2020-09-10 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022052375A1 (zh) | 2022-03-17 |
Family
ID=73752542
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2020/140315 WO2022052375A1 (zh) | Vehicle identification method and apparatus, electronic device and storage medium | 2020-09-10 | 2020-12-28 |
Country Status (5)
Country | Link |
---|---|
JP (1) | JP2023501028A (zh) |
KR (1) | KR20220035335A (zh) |
CN (2) | CN112101183B (zh) |
TW (1) | TW202221567A (zh) |
WO (1) | WO2022052375A1 (zh) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117455957A (zh) * | 2023-12-25 | 2024-01-26 | 山东未来网络研究院(紫金山实验室工业互联网创新应用基地) | Deep-learning-based vehicle trajectory positioning and tracking method and system |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112101183B (zh) * | 2020-09-10 | 2021-08-24 | 深圳市商汤科技有限公司 | 车辆识别方法及装置、电子设备及存储介质 |
CN113569912A (zh) * | 2021-06-28 | 2021-10-29 | 北京百度网讯科技有限公司 | 车辆识别方法、装置、电子设备及存储介质 |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140270384A1 (en) * | 2013-03-15 | 2014-09-18 | Mitek Systems, Inc. | Methods for mobile image capture of vehicle identification numbers |
CN107862340A (zh) * | 2017-11-16 | 2018-03-30 | 深圳市华尊科技股份有限公司 | Vehicle model identification method and apparatus |
CN108229468A (zh) * | 2017-06-28 | 2018-06-29 | 北京市商汤科技开发有限公司 | Vehicle appearance feature recognition and vehicle retrieval method, apparatus, storage medium, and electronic device |
CN110533119A (zh) * | 2019-09-04 | 2019-12-03 | 北京迈格威科技有限公司 | Logo recognition method, model training method therefor, apparatus and electronic system |
CN112101183A (zh) * | 2020-09-10 | 2020-12-18 | 深圳市商汤科技有限公司 | Vehicle identification method and apparatus, electronic device and storage medium |
Family Cites Families (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105913405B (zh) * | 2016-04-05 | 2019-03-29 | 智车优行科技(北京)有限公司 | Processing method and apparatus for presenting image details, and vehicle |
US10423855B2 (en) * | 2017-03-09 | 2019-09-24 | Entit Software Llc | Color recognition through learned color clusters |
CN108229353B (zh) * | 2017-12-21 | 2020-09-22 | 深圳市商汤科技有限公司 | Human body image classification method and apparatus, electronic device, storage medium, and program |
CN108319907A (zh) * | 2018-01-26 | 2018-07-24 | 腾讯科技(深圳)有限公司 | Vehicle identification method, apparatus and storage medium |
CN108564119B (zh) * | 2018-04-04 | 2020-06-05 | 华中科技大学 | Method for generating pedestrian images in arbitrary poses |
CN108960140B (zh) * | 2018-07-04 | 2021-04-27 | 国家新闻出版广电总局广播科学研究院 | Pedestrian re-identification method based on multi-region feature extraction and fusion |
CN109063768B (zh) * | 2018-08-01 | 2021-10-01 | 北京旷视科技有限公司 | Vehicle re-identification method, apparatus and system |
CN109685023A (zh) * | 2018-12-27 | 2019-04-26 | 深圳开立生物医疗科技股份有限公司 | Facial key point detection method for ultrasound images and related apparatus |
CN110689481A (zh) * | 2019-01-17 | 2020-01-14 | 成都通甲优博科技有限责任公司 | Vehicle type identification method and apparatus |
CN110348463B (zh) * | 2019-07-16 | 2021-08-24 | 北京百度网讯科技有限公司 | Method and apparatus for identifying a vehicle |
CN111126379B (zh) * | 2019-11-22 | 2022-05-17 | 苏州浪潮智能科技有限公司 | Object detection method and apparatus |
CN111274954B (zh) * | 2020-01-20 | 2022-03-15 | 河北工业大学 | Real-time fall detection method on an embedded platform based on an improved pose estimation algorithm |
CN111339846B (zh) * | 2020-02-12 | 2022-08-12 | 深圳市商汤科技有限公司 | Image recognition method and apparatus, electronic device and storage medium |
CN111340701B (zh) * | 2020-02-24 | 2022-06-28 | 南京航空航天大学 | Circuit board image stitching method based on clustering-based matching point screening |
CN111401265B (zh) * | 2020-03-19 | 2020-12-25 | 重庆紫光华山智安科技有限公司 | Pedestrian re-identification method and apparatus, electronic device and computer-readable storage medium |
CN111311532B (zh) * | 2020-03-26 | 2022-11-11 | 深圳市商汤科技有限公司 | Image processing method and apparatus, electronic device, storage medium |
CN111199550B (zh) * | 2020-04-09 | 2020-08-11 | 腾讯科技(深圳)有限公司 | Training method for image segmentation network, segmentation method, apparatus and storage medium |
- 2020
- 2020-09-10 CN CN202010947349.1A patent/CN112101183B/zh active Active
- 2020-09-10 CN CN202111056820.9A patent/CN113780165A/zh not_active Withdrawn
- 2020-12-28 JP JP2021575043A patent/JP2023501028A/ja not_active Withdrawn
- 2020-12-28 KR KR1020217042600A patent/KR20220035335A/ko unknown
- 2020-12-28 WO PCT/CN2020/140315 patent/WO2022052375A1/zh active Application Filing
- 2021
- 2021-07-15 TW TW110126129A patent/TW202221567A/zh unknown
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117455957A (zh) * | 2023-12-25 | 2024-01-26 | 山东未来网络研究院(紫金山实验室工业互联网创新应用基地) | Deep-learning-based vehicle trajectory positioning and tracking method and system |
CN117455957B (zh) * | 2023-12-25 | 2024-04-02 | 山东未来网络研究院(紫金山实验室工业互联网创新应用基地) | Deep-learning-based vehicle trajectory positioning and tracking method and system |
Also Published As
Publication number | Publication date |
---|---|
KR20220035335A (ko) | 2022-03-22 |
TW202221567A (zh) | 2022-06-01 |
CN112101183A (zh) | 2020-12-18 |
CN112101183B (zh) | 2021-08-24 |
JP2023501028A (ja) | 2023-01-18 |
CN113780165A (zh) | 2021-12-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2022052375A1 (zh) | Vehicle identification method and apparatus, electronic device and storage medium | |
WO2020042489A1 (zh) | Illegal parking case identification method, apparatus and computer device | |
WO2021203882A1 (zh) | Pose detection and video processing method and apparatus, electronic device and storage medium | |
WO2021051601A1 (zh) | Method and system for selecting detection boxes using Mask R-CNN, electronic apparatus and storage medium | |
CN109034086B (zh) | Vehicle re-identification method, apparatus and system | |
CN111767831B (zh) | Method, apparatus, device and storage medium for processing images | |
CN111435446A (zh) | LeNet-based license plate recognition method and apparatus | |
Salarian et al. | A vision based system for traffic lights recognition | |
WO2023024790A1 (zh) | Vehicle identification method and apparatus, electronic device, computer-readable storage medium and computer program product | |
WO2023246921A1 (zh) | Target attribute recognition method, model training method and apparatus | |
CN112733666A (zh) | Hard example image collection and model training method, device and storage medium | |
Latha et al. | Image understanding: semantic segmentation of graphics and text using faster-RCNN | |
CN117218622A (zh) | Road condition detection method, electronic device and storage medium | |
CN111709377B (zh) | Feature extraction method, target re-identification method, apparatus and electronic device | |
CN111178181B (zh) | Traffic scene segmentation method and related apparatus | |
CN116071557A (zh) | Long-tail object detection method, computer-readable storage medium and driving device | |
US20220207879A1 (en) | Method for evaluating environment of a pedestrian passageway and electronic device using the same | |
CN114724128A (zh) | License plate recognition method, apparatus, device and medium | |
CN114882469A (zh) | Traffic sign detection method and system based on a DL-SSD model | |
Wang et al. | Cost effective and accurate vehicle make/model recognition method using YoloV5 | |
CN111931680A (zh) | Multi-scale vehicle re-identification method and system | |
CN116052220B (zh) | Pedestrian re-identification method, apparatus, device and medium | |
Balabid et al. | Cell phone usage detection in roadway images: from plate recognition to violation classification | |
CN113505653B (zh) | Object detection method, apparatus, device, medium and program product | |
CN111988506B (zh) | Supplementary lighting method and apparatus, electronic device and storage medium | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
ENP | Entry into the national phase |
Ref document number: 2021575043 Country of ref document: JP Kind code of ref document: A |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 20953160 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
WWE | Wipo information: entry into national phase |
Ref document number: 522431534 Country of ref document: SA |
|
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 05.07.2023) |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 20953160 Country of ref document: EP Kind code of ref document: A1 |