WO2022052375A1 - Vehicle identification method and device, electronic device, and storage medium

Vehicle identification method and device, electronic device, and storage medium

Info

Publication number
WO2022052375A1
Authority
WO
WIPO (PCT)
Prior art keywords
loss
feature data
feature
vehicle
data
Prior art date
Application number
PCT/CN2020/140315
Other languages
English (en)
French (fr)
Inventor
何智群
武伟
朱铖恺
闫俊杰
Original Assignee
深圳市商汤科技有限公司 (Shenzhen SenseTime Technology Co., Ltd.)
Priority date
Filing date
Publication date
Application filed by 深圳市商汤科技有限公司 (Shenzhen SenseTime Technology Co., Ltd.)
Priority to KR1020217042600A (published as KR20220035335A)
Priority to JP2021575043A (published as JP2023501028A)
Published as WO2022052375A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/50 Context or environment of the image
    • G06V 20/56 Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V 20/58 Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/25 Fusion techniques
    • G06F 18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V 2201/08 Detecting or categorising vehicles

Definitions

  • the present disclosure relates to the field of computer vision technology, and in particular, to a vehicle identification method and device, an electronic device, and a storage medium.
  • the vehicle identification method obtains two pieces of vehicle feature data by extracting vehicle features from two images respectively, and compares the two pieces of feature data to determine whether the vehicles in the two images are the same vehicle.
  • the accuracy of the information included in the vehicle feature data extracted in this way is not high.
  • the present disclosure provides a vehicle identification method and device, an electronic device and a storage medium.
  • a vehicle identification method comprising:
  • a vehicle identification device comprising:
  • an acquisition unit configured to acquire a to-be-processed image containing the first vehicle to be identified
  • a first processing unit configured to perform a first feature extraction process on the to-be-processed image to obtain first feature data including local feature information of the first to-be-recognized vehicle;
  • a second processing unit configured to perform a second feature extraction process on the to-be-processed image to obtain second feature data including global feature information of the first to-be-recognized vehicle;
  • a fusion processing unit configured to perform fusion processing on the first feature data and the second feature data to obtain third feature data of the first vehicle to be identified; the third feature data is used to obtain the identification result of the first vehicle to be identified.
  • an electronic device, comprising: a processor and a memory, wherein the memory is used to store computer program code, the computer program code includes computer instructions, and when the processor executes the computer instructions, the electronic device executes the method according to the above first aspect and any possible implementation manner thereof.
  • an electronic device, comprising: a processor, a sending device, an input device, an output device, and a memory, the memory being used to store computer program code, the computer program code comprising computer instructions, and when the processor executes the computer instructions, the electronic device executes the method according to the first aspect and any possible implementation manner thereof.
  • a computer-readable storage medium, where a computer program is stored in the computer-readable storage medium, and the computer program includes program instructions that, when executed by a processor, cause the processor to execute the method according to the above first aspect and any possible implementation manner thereof.
  • a computer program product, which includes a computer program or instructions that, when run on a computer, cause the computer to perform the method according to the above first aspect and any possible implementation manner thereof.
  • Embodiments of the present disclosure provide a vehicle identification method and device, an electronic device, and a storage medium.
  • first feature data including local feature information of the first vehicle to be identified is extracted, second feature data including global feature information of the first vehicle to be identified is extracted, and the first feature data is fused with the second feature data, so as to enrich the detailed feature information of the first vehicle to be identified; determining the recognition result of the first vehicle to be identified based on this rich detailed feature information can then improve the accuracy of the recognition result.
  • FIG. 1 is a schematic flowchart of a vehicle identification method according to an embodiment of the present disclosure
  • FIG. 2 is a schematic diagram of a key point provided by an embodiment of the present disclosure
  • FIG. 3 is a schematic diagram of a local pixel area according to an embodiment of the present disclosure.
  • FIG. 4 is a schematic structural diagram of a vehicle identification network according to an embodiment of the present disclosure.
  • FIG. 5 is a schematic structural diagram of a feature extraction module provided by an embodiment of the present disclosure.
  • FIG. 6 is a schematic structural diagram of a key point and local pixel point region generation module according to an embodiment of the present disclosure
  • FIG. 7 is a schematic structural diagram of a joint training module provided by an embodiment of the present disclosure.
  • FIG. 8 is a schematic structural diagram of a first actor-critic module according to an embodiment of the present disclosure.
  • FIG. 9 is a schematic structural diagram of a first scoring module according to an embodiment of the present disclosure.
  • FIG. 10 is a schematic structural diagram of a second actor-critic module according to an embodiment of the present disclosure.
  • FIG. 11 is a schematic structural diagram of a second scoring module according to an embodiment of the present disclosure.
  • FIG. 12 is a schematic structural diagram of a vehicle identification device according to an embodiment of the present disclosure.
  • FIG. 13 is a schematic diagram of a hardware structure of a vehicle identification device according to an embodiment of the present disclosure.
  • In order to enhance safety in work, life, or social environments, monitoring equipment is installed in many areas. With the improvement of people's living standards, there are more and more vehicles on the road, and more and more traffic accidents. How to effectively determine the whereabouts of a vehicle (hereinafter referred to as the target vehicle) through the video streams collected by monitoring equipment is therefore of great significance. For example, when pursuing a hit-and-run vehicle, a vehicle identification method is used to process the images collected by different cameras to determine the whereabouts of the hit-and-run vehicle.
  • the vehicle identification method obtains the vehicle features of the vehicle to be confirmed by extracting the overall appearance feature information of the vehicle to be confirmed in the image, and compares these vehicle features with target vehicle features including the overall appearance feature information of the target vehicle to obtain the similarity between the target vehicle and the vehicle to be confirmed, wherein the overall appearance features include: model and color. When the similarity exceeds the similarity threshold, it is determined that the vehicle to be confirmed and the target vehicle are the same vehicle.
  • the embodiments of the present disclosure provide a vehicle identification method, which can enrich the information included in the vehicle features.
  • the execution subject of the embodiment of the present disclosure is a vehicle identification device.
  • the optional vehicle identification device can be one of the following: a mobile phone, a server, a computer, a tablet computer, and a wearable device. Please refer to FIG. 1 , which is a schematic flowchart of a vehicle identification method provided by an embodiment of the present disclosure.
  • the to-be-processed image includes the first to-be-identified vehicle.
  • the vehicle identification device receives the image to be processed input by the user through the input component.
  • the above input components include: keyboard, mouse, touch screen, touch pad, audio input and so on.
  • the vehicle identification device receives the to-be-processed image sent by the data terminal.
  • the above data terminal may be any one of the following: a mobile phone, a computer, a tablet computer, and a server.
  • the vehicle identification device receives the to-be-processed image sent by the surveillance camera.
  • the surveillance cameras are deployed on roads (including: highways, expressways, and urban roads).
  • the local feature information includes detailed feature information of the vehicle, such as: feature information of a car lamp, feature information of a car logo, and feature information of a car window.
  • the vehicle identification device can extract the local feature information of the first vehicle to be identified from the image to be processed by performing the first feature extraction process on the image to be processed to obtain the first feature data.
  • the first feature extraction process may be implemented by a first convolutional neural network.
  • the convolutional neural network is trained by using the image with label information as training data, so that the first convolutional neural network obtained by training can complete the first feature extraction processing of the image to be processed.
  • the annotation information of the training data may be the detailed feature information of the vehicle in the image (such as the type of headlights, the type of the vehicle logo, the type of the vehicle window).
  • the convolutional neural network extracts the feature data including the detailed feature information of the vehicle from the training data, and obtains the detailed information of the vehicle according to the extracted feature data as the training result.
  • the training of the convolutional neural network can be completed to obtain the first convolutional neural network.
  • the vehicle identification device can use the first convolutional neural network to process the to-be-processed image to obtain detailed feature information of the first to-be-recognized vehicle to obtain first feature data.
  • the vehicle identification device uses the first convolution kernel to perform convolution processing on the image to be processed, and extracts semantic information of the image to be processed including detailed feature information of the vehicle to obtain the first feature data.
  • the global feature information of the vehicle includes the overall appearance feature information of the vehicle.
  • the vehicle identification device can extract the global feature information of the first vehicle to be identified from the to-be-processed image by performing the second feature extraction process on the to-be-processed image to obtain second feature data.
  • the second feature extraction process may be implemented by a second convolutional neural network.
  • the convolutional neural network is trained by using the image with label information as training data, so that the second convolutional neural network obtained by training can complete the second feature extraction processing of the image to be processed.
  • the annotation information of the training data may be the overall appearance feature information of the vehicle in the image (such as vehicle type, body color).
  • the convolutional neural network extracts feature data including the overall appearance feature information of the vehicle from the training data, and obtains the overall appearance information of the vehicle according to the extracted feature data, as training results.
  • the vehicle identification device can use the second convolutional neural network to process the to-be-processed image to obtain the overall appearance feature information of the first to-be-recognized vehicle to obtain the second feature data.
  • the vehicle identification device uses the second convolution kernel to perform convolution processing on the to-be-processed image, and extracts semantic information of the to-be-processed image including the overall appearance feature information of the vehicle to obtain the second feature data.
  • the parameters of the first convolution kernel are different from those of the second convolution kernel.
  • the third feature data is used to obtain an identification result of the first vehicle to be identified, wherein the identification result includes the identity of the first vehicle to be identified.
  • the vehicle identification device may further determine the vehicle to be identified as vehicle a according to the third characteristic data.
  • the vehicle identification device compares the third feature data with the feature data in a vehicle feature database and determines that the similarity between certain target vehicle feature data in the database and the third feature data exceeds the similarity threshold. If the vehicle corresponding to the target vehicle feature data is vehicle b, the vehicle identification device can determine that the vehicle corresponding to the third feature data is vehicle b, that is, the recognition result of the first vehicle to be identified determined according to the third feature data is vehicle b.
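As an illustrative sketch only (the patent does not fix a similarity measure), the comparison against the vehicle feature database could use cosine similarity with a threshold; the function name, the 0.85 threshold, and the tensor shapes below are assumptions:

```python
import torch
import torch.nn.functional as F

def identify_vehicle(third_feature, gallery_features, gallery_ids, threshold=0.85):
    """Return the id of the gallery vehicle whose feature data is most
    similar to the third feature data, or None if no similarity exceeds
    the threshold. gallery_features: (N, D); third_feature: (D,)."""
    sims = F.cosine_similarity(gallery_features, third_feature.unsqueeze(0), dim=1)
    best = int(torch.argmax(sims))
    return gallery_ids[best] if sims[best] > threshold else None
```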
  • the vehicle identification device can obtain third feature data including both global feature information of the first vehicle to be identified and local feature information of the first vehicle to be identified by fusing the first feature data and the second feature data. Using the third feature data as the feature data of the first vehicle to be recognized can enrich the information included in the feature data of the first vehicle to be recognized.
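The text above leaves the exact fusion operation open. A minimal sketch, assuming simple concatenation of the local (first) and global (second) feature data as the fusion step:

```python
import torch

def fuse_features(first_feature, second_feature):
    # Concatenate the local and global feature data into the third feature data.
    # Concatenation is one simple choice; weighted addition would also work.
    return torch.cat([first_feature, second_feature], dim=-1)
```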
  • the above-mentioned local feature information includes key point feature information.
  • the key point feature information includes the position of the key point in the image to be processed and the semantic information of the key point.
  • the key point 6 shown in FIG. 2 is the key point of the left front tire, and the semantic information of the key point 6 includes the information of the left front tire (such as tire specification, wheel size, tire brand).
  • the key point 23 shown in FIG. 2 is the key point of the rear license plate, and the semantic information of the key point 23 includes the information of the rear license plate (such as the license plate number).
  • the labeling method of the key points of the vehicle is shown in FIG. 2 .
  • the vehicle model shown in FIG. 2 is only an example. In practical applications, any vehicle type (such as a dump truck, bus, or truck) can be marked according to the key point marking method shown in FIG. 2.
  • the vehicle identification device obtains first feature data including key point feature information of the first to-be-identified vehicle by performing a first feature extraction process on the to-be-processed image.
  • the first characteristic data may include characteristic information of the key point of the left front tire and characteristic information of the key point of the rear license plate of the vehicle to be identified.
  • the local feature information includes not only key point feature information but also local pixel point region feature information.
  • the local pixel area belongs to the pixel area covered by the first vehicle to be recognized, and the area of the local pixel area is smaller than the area of the pixel area covered by the first vehicle to be recognized.
  • the right local pixel area 301 includes the right-side area of the first vehicle to be identified 300, and the head pixel area 302 includes the head area of the first vehicle to be identified.
  • the feature information of the local pixel point region includes semantic information of the local pixel point region.
  • in the case that the local pixel area includes the pixel area covered by a car lamp, the semantic information of the local pixel area includes: the model of the lamp; in the case that the local pixel area includes the pixel area covered by a car window, the semantic information of the local pixel area includes: the type of the window and the objects in the car that can be observed through the window; in the case that the local pixel area includes the pixel area covered by the front windshield, the semantic information of the local pixel area includes: the type of the front windshield, the objects in the car that can be observed through the front windshield, the annual inspection mark on the front windshield, and the location of the annual inspection mark on the front windshield.
  • in order to obtain the local feature information, the vehicle identification device performs the following steps:
  • the fourth characteristic data includes characteristic information of at least one key point of the first vehicle to be identified.
  • the vehicle identification device can extract feature information of at least one key point of the first vehicle to be identified from the image to be processed to obtain fourth feature data.
  • the third feature extraction process may be implemented by a third convolutional neural network.
  • the convolutional neural network is trained by using the image with label information as training data, so that the third convolutional neural network obtained by training can complete the third feature extraction processing of the image to be processed.
  • the annotation information of the training data may be the key point feature information of the vehicle in the image (eg, the position of the key point, the semantic information of the key point).
  • the convolutional neural network extracts the feature data including the key point feature information of the vehicle from the training data, and obtains the key point feature information according to the extracted feature data, as training results.
  • the training of the convolutional neural network can be completed to obtain a third convolutional neural network.
  • the vehicle identification device can use the third convolutional neural network to process the to-be-processed image to extract the feature information of the key points of the first to-be-identified vehicle and obtain the fourth feature data.
  • the vehicle identification device uses a third convolution kernel to perform convolution processing on the to-be-processed image, extracts semantic information of the to-be-processed image including the key point feature information of the vehicle, and obtains the fourth feature data.
  • the parameters of the third convolution kernel are different from those of the first convolution kernel, and the parameters of the third convolution kernel are also different from those of the second convolution kernel.
  • the fifth characteristic data includes characteristic information of at least one local pixel area of the first vehicle to be identified.
  • the fourth feature extraction process may be implemented by a fourth convolutional neural network.
  • the convolutional neural network is trained by using the image with label information as training data, so that the fourth convolutional neural network obtained by training can complete the fourth feature extraction processing of the image to be processed.
  • the annotation information of the training data may be the feature information of the local pixel area of the vehicle in the image.
  • the convolutional neural network extracts feature data including the feature information of the local pixel areas of the vehicle from the training data, and obtains the feature information of the local pixel areas according to the extracted feature data as the training result.
  • the training of the convolutional neural network can be completed to obtain a fourth convolutional neural network.
  • the vehicle identification device can use the fourth convolutional neural network to process the to-be-processed image to obtain the feature information of the local pixel point region of the first to-be-identified vehicle to obtain fifth feature data.
  • the vehicle identification device uses a fourth convolution kernel to perform convolution processing on the to-be-processed image, and extracts the feature information of the local pixel area of the first to-be-recognized vehicle of the to-be-processed image, and obtains the fifth characteristic data.
  • the parameters of the fourth convolution kernel are different from the parameters of the first convolution kernel, the parameters of the second convolution kernel, and the parameters of the third convolution kernel.
  • the feature information of a local pixel area contains the semantic information of that area, and there is a correlation between adjacent pixels in an image (including semantic correlation); therefore, fusing the semantic information of the local pixel areas with the key point feature information can enrich the detailed feature information of the vehicle.
  • the vehicle identification device fuses the key point feature information of the first vehicle to be identified with the feature information of the local pixel areas of the first vehicle to be identified by fusing the fourth feature data and the fifth feature data, thereby enriching the detailed feature information of the first vehicle to be identified and obtaining the first feature data.
  • the vehicle identification device performs the following steps in the process of executing step 1:
  • the sixth feature data includes the key point feature information of the first vehicle to be identified, and the feature information included in any two sixth feature data belongs to different key points.
  • the first vehicle to be identified includes a left rearview mirror keypoint and a right taillight keypoint.
  • At least one sixth feature data includes: feature data 1 and feature data 2, wherein feature data 1 includes feature information of a key point of the left rearview mirror, and feature data 2 includes feature information of a key point of the right tail light.
  • the vehicle identification device extracts the key point feature information of the first vehicle to be identified by performing the fifth feature extraction process on the image to be processed, and obtains the first intermediate feature data with the number of channels not less than 1, wherein , the data of each channel in the first intermediate feature data includes the key point feature information of the first vehicle to be identified, and the information included in the data of any two channels belongs to different key points.
  • the vehicle identification device may use one channel data in the first intermediate characteristic data as a sixth characteristic data.
  • the vehicle identification device may select, from the at least one sixth feature data, the k feature data including the largest amount of information (that is, k seventh feature data) for subsequent processing, wherein k is an integer not less than 1.
  • in the case that one seventh feature data is obtained by executing step 5, the vehicle identification device can use the seventh feature data as the fourth feature data, that is, the fourth feature data includes the feature information of one key point.
  • in the case that at least two seventh feature data are obtained by executing step 5, the vehicle identification device can perform fusion processing on the at least two seventh feature data to obtain the fourth feature data.
  • the at least two seventh feature data include: seventh feature data 1, seventh feature data 2, and seventh feature data 3, wherein the seventh feature data 1 includes feature information of key points of the left front lamp, and the seventh feature data 2 includes the feature information of the key point of the left rear lamp, and the seventh feature data 3 includes the feature information of the key point of the left rearview mirror.
  • the vehicle identification device may obtain the fourth characteristic data by performing fusion processing on the seventh characteristic data 1 and the seventh characteristic data 2 .
  • the fourth characteristic data includes characteristic information of the key point of the left front lamp and characteristic information of the key point of the left rear lamp.
  • the vehicle identification device may also obtain the fourth characteristic data by performing fusion processing on the seventh characteristic data 1 , the seventh characteristic data 2 and the seventh characteristic data 3 .
  • the fourth feature data includes the feature information of the key point of the left front lamp, the feature information of the key point of the left rear lamp, and the feature information of the key point of the left rearview mirror.
  • the vehicle identification device performs the following steps in the process of executing step 4:
  • the first heat map includes the position information of a key point in the image to be processed, and the information included in any two first heat maps belongs to different key points.
  • the key points of the first vehicle to be identified include a left rearview mirror key point and a right tail light key point.
  • At least one first heat map includes: a first heat map 1 and a first heat map 2, wherein the first heat map 1 includes the position information of the key points of the left rearview mirror in the image to be processed, and the first heat map 2 includes The position information of the right taillight key point in the image to be processed.
  • pixels located at the same position in two images are referred to as co-located points of each other. For example, if the position of pixel A in the first heat map 1 is the same as the position of pixel B in the image to be processed, then pixel A and pixel B are co-located with each other, that is, pixel B is the pixel in the image to be processed that is co-located with pixel A.
  • the size of the first heat map is the same as the size of the image to be processed.
  • the pixel value of a pixel in the first heat map represents the confidence that a key point exists at the co-located position in the image to be processed. For example, pixel A in the first heat map 1 and pixel B in the image to be processed are co-located with each other. If the first heat map 1 includes the position information of the key point of the left headlight in the to-be-processed image, and the pixel value of pixel A is 0.7, the confidence that the left headlight key point exists at pixel B is 0.7.
  • the sixth feature extraction processing may be convolution processing, pooling processing, or a combination of convolution processing and pooling processing, which is not limited in this disclosure.
  • the sixth feature extraction process may be implemented by a fifth convolutional neural network.
  • the convolutional neural network is trained by using the image with label information as training data, so that the fifth convolutional neural network obtained by training can complete the extraction processing of the sixth feature of the image to be processed.
  • the annotation information of the training data can be the position of the key point in the image.
  • the convolutional neural network extracts the feature data including the position information of the key points from the training data, and obtains the positions of the key points in the image according to the extracted feature data, as the training result.
  • the training of the convolutional neural network can be completed to obtain the fifth convolutional neural network.
  • the vehicle identification device can use the fifth convolutional neural network to process the image to be processed to obtain the position information of the key points of the first vehicle to be identified, and obtain the first heat map.
  • each pixel in the image to be processed includes semantic information, and the semantic information includes the feature information of key points. By extracting the semantic information of each pixel from the image to be processed, the first feature image can be obtained.
  • the first feature image not only includes key point feature information of pixels, but also includes relative position information between pixels.
  • the information included in the fourth feature data does not include relative position information between pixels.
  • for ease of description, the key point to which the position information included in a first heat map belongs is referred to as the key point of that first heat map. For example, the first heat map 1 includes the position information of the key point of the left headlight, that is, the information included in the first heat map 1 belongs to the key point of the left headlight; the key point of the first heat map 1 is therefore the key point of the left headlight.
  • the size of the image to be processed, the size of the first heat map, and the size of the first feature image are all the same. For example, if the length of the image to be processed is 50 and the width is 30, the length of the first heat map and the length of the first feature image are both 50, and the width of the first heat map and the width of the first feature image are both 30.
  • the dot product refers to an element-wise product.
  • the vehicle identification device may normalize the pixel values in the first heat map to obtain the normalized first heat map, for example, by adjusting pixel values not less than 0.6 to 1 and pixel values less than 0.6 to 0.3.
  • the vehicle identification device can extract the feature information of the key points of the first heat map by determining the dot product between the normalized first heat map and the first feature image, and obtain sixth feature data.
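A minimal sketch of this dot product (element-wise product), assuming one normalized first heat map per key point and the tensor shapes noted in the comments:

```python
import torch

def extract_sixth_feature_data(norm_heatmaps, first_feature_image):
    # norm_heatmaps:       (K, H, W), one normalized first heat map per key point
    # first_feature_image: (C, H, W), per-pixel semantic features
    # Returns (K, C, H, W): each heat map gates every channel of the
    # feature image element-wise, keeping features near its key point.
    return norm_heatmaps[:, None, :, :] * first_feature_image[None, :, :, :]
```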
  • the vehicle identification device performs the following steps in the process of executing step 5:
  • the vehicle identification device can reduce the amount of data in the sixth feature data by performing pooling processing on one sixth feature data, and obtain an eighth feature data. In this way, processing the eighth characteristic data in the subsequent processing can reduce the data processing amount of the vehicle identification device.
  • the vehicle identification device obtains at least one eighth characteristic data by pooling the characteristic data in the at least one sixth characteristic data respectively.
  • the at least one sixth feature data includes: sixth feature data 1 , sixth feature data 2 , and sixth feature data 3 .
  • the vehicle identification device obtains the eighth feature data 1 by pooling the sixth feature data 1, and obtains the eighth feature data 2 by pooling the sixth feature data 2; at this time, at least one eighth feature data includes eighth feature data 1 and eighth feature data 2.
  • alternatively, the vehicle identification device obtains eighth feature data 1 by pooling sixth feature data 1, obtains eighth feature data 2 by pooling sixth feature data 2, and obtains eighth feature data 3 by pooling sixth feature data 3; at this time, at least one eighth feature data includes eighth feature data 1, eighth feature data 2, and eighth feature data 3.
  • the pooling process in step 10 is a global average pooling process.
  • the first probability is used to represent the amount of information included in the sixth feature data corresponding to the first probability.
  • for example (Example 1), at least one eighth feature data includes eighth feature data 1, at least one first probability includes first probability 1, and the first probability 1 is obtained according to the amount of information included in the eighth feature data 1, where the eighth feature data 1 is obtained by pooling the sixth feature data 1; that is, the first probability 1 is used to represent the amount of information included in the sixth feature data 1.
  • there is a correlation between the first probability and the amount of information included in the sixth feature data. For example, in the case where the first probability is positively correlated with the amount of information included in the sixth feature data, in Example 1, the larger the first probability 1, the greater the amount of information included in the sixth feature data 1; in the case where the first probability is negatively correlated with the amount of information included in the sixth feature data, in Example 1, the larger the first probability 1, the smaller the amount of information included in the sixth feature data 1.
  • the vehicle identification device can obtain the first probability according to the amount of information included in the eighth characteristic data.
  • the vehicle identification device inputs the eighth characteristic data into the softmax function, and the first probability can be obtained.
  • the vehicle identification device can obtain a first probability according to the information amount included in one eighth characteristic data, and obtain at least one first probability according to the information amount included in at least one eighth characteristic data.
  • the at least one eighth characteristic data includes eighth characteristic data 1 and eighth characteristic data 2 .
  • the vehicle identification device obtains the first probability 1 according to the amount of information included in the eighth characteristic data 1 , and at this time, at least one first probability includes the first probability 1 .
  • the vehicle identification device obtains the first probability 1 according to the amount of information included in the eighth feature data 1, and obtains the first probability 2 according to the amount of information included in the eighth feature data 2. At this time, at least one first probability includes the first probability 1 and the first probability 2.
  • in the case that the first probability is positively correlated with the amount of information included in the sixth feature data, the vehicle identification device executes step 12; in the case that the first probability is negatively correlated with the amount of information included in the sixth feature data, the vehicle identification device executes step 13.
  • the vehicle identification device may determine the weight of each seventh feature data according to the amount of information included in that seventh feature data, and perform weighted fusion on the at least one seventh feature data according to these weights to obtain the fourth feature data.
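A minimal sketch combining the selection of the k most informative feature data (step 5) with this weighted fusion; using the first probabilities directly as the weights is an assumption:

```python
import torch

def select_and_fuse(pooled_sixth, first_probs, k):
    # pooled_sixth: (N, D), one pooled vector per key point
    # first_probs:  (N,), first probability per key point
    top_probs, idx = torch.topk(first_probs, k)      # k seventh feature data
    seventh = pooled_sixth[idx]                      # (k, D)
    weights = top_probs / top_probs.sum()            # weights from information amount
    return (weights[:, None] * seventh).sum(dim=0)   # fourth feature data, (D,)
```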
  • the vehicle identification device performs the following steps in the process of executing step 2:
  • the ninth feature data includes the feature information of the local pixel areas of the first vehicle to be identified, and the feature information included in any two ninth feature data belongs to different local pixel areas.
  • the first to-be-identified vehicle includes a local pixel area 1 and a local pixel area 2, wherein the local pixel area 1 includes the pixel area covered by the front windshield, and the local pixel area 2 includes the pixel area covered by the left window glass.
  • the at least one ninth feature data includes: feature data 1 and feature data 2 , wherein the feature data 1 includes feature information of the local pixel area 1 , and the feature data 2 includes feature information of the local pixel area 2 .
  • the vehicle identification device extracts the feature information of the local pixel areas of the first vehicle to be identified by performing the tenth feature extraction process on the image to be processed, and obtains fourth intermediate feature data with a channel number of not less than 1, wherein the data of each channel in the fourth intermediate feature data includes the feature information of a local pixel area of the first vehicle to be identified, and the information included in the data of any two channels belongs to different local pixel areas.
  • the vehicle identification device may use one channel data in the fourth intermediate feature data as a ninth feature data.
  • the vehicle identification device may select, from the at least one ninth feature data, the m feature data including the largest amount of information (that is, m tenth feature data) for subsequent processing, wherein m is an integer not less than 1.
  • in the case that one tenth feature data is obtained by executing step 17, the vehicle identification device can use the tenth feature data as the fifth feature data, that is, the fifth feature data includes the feature information of one local pixel area.
  • in the case that at least two tenth feature data are obtained by executing step 17, the vehicle identification device can perform fusion processing on the at least two tenth feature data to obtain the fifth feature data.
  • the at least two tenth feature data include: tenth feature data 1, tenth feature data 2, and tenth feature data 3, wherein the tenth feature data 1 includes the feature information of the pixel area covered by the front of the vehicle, the tenth feature data 2 includes the feature information of the pixel area covered by the right front windshield, and the tenth feature data 3 includes the feature information of the pixel area covered by the left tire.
  • the vehicle identification device may obtain the fifth characteristic data by performing fusion processing on the tenth characteristic data 1 and the tenth characteristic data 2 .
  • the fifth characteristic data includes characteristic information of the pixel area covered by the front of the vehicle and characteristic information of the pixel area covered by the right front windshield.
  • the vehicle identification device may also obtain the fifth characteristic data by performing fusion processing on the tenth characteristic data 1 , the tenth characteristic data 2 and the tenth characteristic data 3 .
  • the fifth feature data includes feature information of the pixel point area covered by the front of the vehicle, feature information of the pixel point area covered by the right front windshield, and feature information of the pixel point area covered by the left tire.
  • the vehicle identification device performs the following steps in the process of executing step 14:
  • the second heat map includes the position information of a local pixel area in the image to be processed, and the information included in any two second heat maps belongs to different local pixel areas.
  • the local pixel point area of the first vehicle to be identified includes a front windshield area and a head area.
  • the at least one second heat map includes: a second heat map 1 and a second heat map 2, wherein the second heat map 1 includes the position information of the front windshield area in the image to be processed, and the second heat map 2 includes the position information of the head area in the image to be processed.
  • pixels located at the same position in two images are referred to as co-located points of each other. For example, if the position of pixel A in the second heat map 1 is the same as the position of pixel B in the image to be processed, then pixel A and pixel B are co-located with each other, that is, pixel B is the pixel in the image to be processed that is co-located with pixel A.
  • the size of the second heat map is the same as the size of the image to be processed.
  • the pixel value of a pixel in the second heat map represents the confidence that the co-located position in the image to be processed belongs to the local pixel area. For example, pixel A in the second heat map 1 and pixel B in the image to be processed are co-located with each other. If the second heat map 1 includes the position information of the head area in the image to be processed, and the pixel value of pixel A is 0.7, the confidence that pixel B belongs to the head area is 0.7.
  • the eleventh feature extraction processing may be convolution processing, pooling processing, or a combination of convolution processing and pooling processing, which is not limited in this disclosure.
  • the eleventh feature extraction process may be implemented by the sixth convolutional neural network.
  • the convolutional neural network is trained by using the image with label information as training data, so that the sixth convolutional neural network obtained by training can complete the eleventh feature extraction processing of the image to be processed.
  • the annotation information of the training data can be the position of the local pixel area in the image.
  • the convolutional neural network extracts the feature data including the position information of the local pixel areas from the training data, and obtains the positions of the local pixel areas in the image according to the extracted feature data as the training result.
  • the training of the convolutional neural network can be completed to obtain the sixth convolutional neural network.
  • the vehicle identification device can use the sixth convolutional neural network to process the to-be-processed image to obtain the position information of the local pixel areas of the first to-be-identified vehicle and obtain the second heat map.
  • Each pixel in the image to be processed includes semantic information, and by performing the seventh feature extraction process on the image to be processed, the semantic information of each pixel can be extracted to obtain a second feature image.
  • the second feature image not only includes semantic information of pixels, but also includes relative position information between pixels.
  • the information included in the fifth feature data does not include relative position information between pixels.
  • the first feature image and the second feature image may be the same.
  • both the first feature image and the second feature image include semantic information of each pixel in the image to be processed.
  • for ease of description, the local pixel area to which the position information included in a second heat map belongs is referred to as the local pixel area of that second heat map. For example, the second heat map 1 includes the position information of the front windshield area, that is, the information included in the second heat map 1 belongs to the front windshield area; the local pixel area of the second heat map 1 is therefore the front windshield area.
  • the size of the image to be processed, the size of the second heat map, and the size of the second feature image are all the same. For example, if the length of the image to be processed is 50 and the width is 30, the length of the second heat map and the length of the second feature image are both 50, and the width of the second heat map and the width of the second feature image are both 30.
  • the ninth feature data can be obtained by extracting, from the second feature image, the feature information of the local pixel area of the second heat map.
  • the vehicle identification device may normalize the pixel values in the second heat map to obtain the normalized second heat map, for example, by adjusting pixel values over 0.7 to 1 and pixel values not over 0.7 to 0. The vehicle identification device can then extract the feature information of the local pixel area of the second heat map by determining the dot product between the normalized second heat map and the second feature image, and obtain the ninth feature data.
  • the vehicle identification device performs the following steps in the process of executing step 15:
  • the vehicle identification device can reduce the amount of data in the ninth feature data by performing pooling processing on a ninth feature data, and obtain an eleventh feature data. In this way, by processing the eleventh characteristic data in the subsequent processing, the data processing amount of the vehicle identification device can be reduced.
  • the vehicle identification device obtains at least one eleventh characteristic data by pooling the characteristic data in the at least one ninth characteristic data respectively.
  • the at least one ninth feature data includes: ninth feature data 1 , ninth feature data 2 , and ninth feature data 3 .
  • the vehicle identification device obtains eleventh feature data 1 by pooling the ninth feature data 1, and obtains eleventh feature data 2 by pooling the ninth feature data 2; at this time, at least one eleventh feature data includes eleventh feature data 1 and eleventh feature data 2.
  • alternatively, the vehicle identification device obtains the eleventh feature data 1 by pooling the ninth feature data 1, obtains the eleventh feature data 2 by pooling the ninth feature data 2, and obtains the eleventh feature data 3 by pooling the ninth feature data 3; at this time, at least one eleventh feature data includes eleventh feature data 1, eleventh feature data 2, and eleventh feature data 3.
  • the pooling process in step 20 is the global average pooling process.
  • the second probability is used to represent the amount of information included in the ninth feature data corresponding to the second probability.
  • for example, at least one eleventh feature data includes eleventh feature data 1, at least one second probability includes second probability 1, and the second probability 1 is obtained according to the amount of information included in the eleventh feature data 1, where the eleventh feature data 1 is obtained by pooling the ninth feature data 1; that is, the second probability 1 is used to represent the amount of information included in the ninth feature data 1.
  • there is a correlation between the second probability and the amount of information included in the ninth feature data. In the case where the second probability is positively correlated with the amount of information included in the ninth feature data, the larger the second probability, the greater the amount of information; in the case where the second probability is negatively correlated with the amount of information included in the ninth feature data, the larger the second probability, the smaller the amount of information.
  • the vehicle identification device can obtain the second probability according to the amount of information included in the eleventh characteristic data.
  • the vehicle identification device inputs the eleventh characteristic data into the softmax function, and the second probability can be obtained.
  • the vehicle identification device may obtain a second probability according to the amount of information included in one eleventh characteristic data, and may obtain at least one second probability according to the amount of information included in at least one eleventh characteristic data.
  • the at least one eleventh feature data includes eleventh feature data 1 and eleventh feature data 2 .
  • the vehicle identification device obtains the second probability 1 according to the amount of information included in the eleventh characteristic data 1 , and at this time, at least one second probability includes the second probability 1 .
  • alternatively, the vehicle identification device obtains the second probability 1 according to the amount of information included in the eleventh feature data 1, and obtains the second probability 2 according to the amount of information included in the eleventh feature data 2; at this time, at least one second probability includes the second probability 1 and the second probability 2.
  • in the case that the second probability is positively correlated with the amount of information included in the ninth feature data, the vehicle identification device executes step 22; in the case that the second probability is negatively correlated with the amount of information included in the ninth feature data, the vehicle identification device executes step 23.
  • since one tenth feature data includes the feature information of one local pixel area, in the case that the number of local pixel areas in the at least one local pixel area exceeds 1, different tenth feature data include different amounts of information.
  • the vehicle identification device may determine the weight of each tenth feature data according to the amount of information included in that tenth feature data, and perform weighted fusion on the at least one tenth feature data according to these weights to obtain the fifth feature data.
  • the at least one local pixel area includes a first local pixel area and a second local pixel area, and both the number of ninth feature data and m are greater than 1.
  • the vehicle identification device selects the m feature data including the most information from the at least two ninth feature data, and obtains twelfth feature data including the feature information of the first local pixel area and thirteenth feature data including the feature information of the second local pixel area.
  • the vehicle identification device performs the following steps in the process of executing step 18:
  • the first weight is positively correlated with the amount of information included in the twelfth feature data
  • the second weight is positively correlated with the amount of information included in the thirteenth feature data
  • the vehicle identification device performs weighted fusion of the twelfth feature data and the thirteenth feature data according to the first weight and the second weight to obtain the fifth feature data including the feature information of the local pixel areas of the first to-be-identified vehicle, which can improve the accuracy of the local pixel area feature information of the first to-be-identified vehicle.
  • the vehicle identification device performs weighted summation on the twelfth characteristic data and the thirteenth characteristic data according to the first weight and the second weight to obtain the fifth characteristic data.
  • for example, assume the first weight is λ3, the second weight is λ4, the twelfth feature data is n4, the thirteenth feature data is n5, and the fifth feature data is n6; then n6 = λ3 × n4 + λ4 × n5 + d.
  • the vehicle identification device multiplies the first weight by the twelfth characteristic data to obtain fifth intermediate characteristic data, and multiplies the second weight by the thirteenth characteristic data to obtain sixth intermediate characteristic data , and the fifth characteristic data is obtained by fusing the fifth intermediate characteristic data and the sixth intermediate characteristic data.
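A small numeric sketch of the weighted summation formula and of the multiply-then-fuse variant just described; the values of λ3, λ4, and d are illustrative, and fusing the intermediate feature data by element-wise addition is an assumption:

```python
import torch

lam3, lam4, d = 0.6, 0.4, 0.0        # illustrative values only
n4 = torch.tensor([1.0, 2.0])        # twelfth feature data
n5 = torch.tensor([3.0, 4.0])        # thirteenth feature data

n6 = lam3 * n4 + lam4 * n5 + d       # weighted summation
fifth_intermediate = lam3 * n4       # multiply-then-fuse variant
sixth_intermediate = lam4 * n5
n6_alt = fifth_intermediate + sixth_intermediate + d
assert torch.allclose(n6, n6_alt)    # identical when the fusion is addition
```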
  • the embodiments of the present disclosure also provide a vehicle identification network, which can be configured to implement the technical solutions disclosed above.
  • the vehicle recognition network includes: a feature extraction module 401 , a key point and local pixel region generation module 402 , and a joint training module 403 .
  • the to-be-processed image 400 is processed by the feature extraction module 401 to obtain a third feature image 404 of the to-be-processed image.
  • At least one first heat map and at least one second heat map 405 are obtained by processing the image to be processed by the key point and local pixel region generating module.
  • the third feature image, the at least one first heat map, and the at least one second heat map are input to the joint training module to obtain the third feature data 406.
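A minimal sketch of this overall flow, with the three modules passed in as placeholders for the feature extraction module 401, the key point and local pixel region generation module 402, and the joint training module 403:

```python
def vehicle_id_forward(image, feature_extractor, heatmap_generator, joint_module):
    # image: the to-be-processed image 400, e.g. a (1, 3, H, W) tensor
    third_feature_image = feature_extractor(image)              # module 401 -> 404
    first_heatmaps, second_heatmaps = heatmap_generator(image)  # module 402 -> 405
    return joint_module(third_feature_image,
                        first_heatmaps, second_heatmaps)        # module 403 -> 406
```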
  • FIG. 5 is a schematic structural diagram of a feature extraction module.
  • the feature extraction module includes three convolutional layers connected in series.
  • the first convolutional layer 501 is conv2_x in ResNet50
  • the second convolutional layer 502 is conv3_x in ResNet50
  • the third convolutional layer 503 is conv4_x in ResNet50.
  • Feature extraction is performed on the image 500 to be processed through the three convolution layers to obtain a third feature image 504 .
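A minimal PyTorch sketch of this module, taking the three serial layers from a standard torchvision ResNet50, whose layer1/layer2/layer3 correspond to conv2_x/conv3_x/conv4_x; including the ResNet stem is an assumption:

```python
import torch.nn as nn
from torchvision.models import resnet50

resnet = resnet50(weights=None)
feature_extractor = nn.Sequential(
    resnet.conv1, resnet.bn1, resnet.relu, resnet.maxpool,  # ResNet50 stem (assumed)
    resnet.layer1,   # conv2_x -> first convolutional layer 501
    resnet.layer2,   # conv3_x -> second convolutional layer 502
    resnet.layer3,   # conv4_x -> third convolutional layer 503
)
```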
  • Figure 6 shows a schematic diagram of the structure of the key point and local pixel region generation module.
  • the keypoint and local pixel region generation module includes four convolutional layers in series.
  • the first convolutional layer 601 is conv2_x in ResNet50
  • the second convolutional layer 602 is conv3_x in ResNet50
  • the third convolutional layer 603 is conv4_x in ResNet50
  • the fourth convolutional layer 604 is conv5_x in ResNet50.
  • the image 600 to be processed is processed through the four convolution layers to obtain at least one first heat map and at least one second heat map 605 .
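A parallel sketch for this module, again built from torchvision's ResNet50; the 1x1 heat-map head, the sigmoid, and the channel counts K and M are assumptions, since FIG. 6 only specifies the four convolutional layers:

```python
import torch.nn as nn
from torchvision.models import resnet50

K, M = 24, 6  # assumed numbers of key points and local pixel areas
net = resnet50(weights=None)
heatmap_generator = nn.Sequential(
    net.conv1, net.bn1, net.relu, net.maxpool,  # stem (assumed)
    net.layer1,  # conv2_x -> first convolutional layer 601
    net.layer2,  # conv3_x -> second convolutional layer 602
    net.layer3,  # conv4_x -> third convolutional layer 603
    net.layer4,  # conv5_x -> fourth convolutional layer 604
    nn.Conv2d(2048, K + M, kernel_size=1),  # one channel per heat map (assumed head;
                                            # upsampling to the input size omitted)
    nn.Sigmoid(),                           # confidences in [0, 1]
)
```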
  • Figure 7 shows a schematic diagram of the structure of the joint training module.
  • the third feature image 700 is processed by the first convolution layer 701 of the joint training module to obtain the first general feature image.
  • the first feature image is obtained by performing dimensionality reduction on the channel dimension on the first general feature image through the first dimensionality reduction layer 702 .
  • the first actor-critic module 703 processes the first feature image and at least one first heat map 704 to obtain k first critic feature data 705 .
  • the k first critic feature data are processed through the first pooling layer 71 and the first normalization layer 72 in sequence, and k seventh feature data 705 are obtained.
  • the third feature image is processed by the first convolution layer 701 of the joint training module to obtain a second general feature image.
  • the second feature image is obtained by performing dimension reduction on the channel dimension on the second general feature image through the second dimension reduction layer 711 .
  • the second feature image and at least one second heat map 713 are processed by the second actor-critic module 712 to obtain m second critic feature data.
  • the m pieces of second critic feature data are processed through the second pooling layer 73 and the second normalization layer 74 in sequence, and m pieces of tenth feature data 714 are obtained.
  • the third feature image is processed by the second convolutional layer 721, the third dimensionality reduction layer 722, the third pooling layer 75, and the third normalization layer 76 of the joint training module in turn to obtain the second feature data 723.
  • the first convolutional layer 701 and the second convolutional layer 721 are both conv5_x in ResNet50.
  • the first dimension reduction layer 702, the second dimension reduction layer 711, and the third dimension reduction layer 722 all include a convolution kernel with a size of 1*1.
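• A minimal sketch of the global branch of the joint training module of FIG. 7 follows, assuming PyTorch; the reduced channel count, the average pooling and the L2 normalization are assumptions, as the text only fixes conv5_x and the 1*1 dimensionality reduction kernel.

```python
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet50

class GlobalBranch(nn.Module):
    def __init__(self, reduced=256):
        super().__init__()
        self.conv5_x = resnet50(weights=None).layer4           # second convolutional layer 721
        self.reduce = nn.Conv2d(2048, reduced, kernel_size=1)  # third dimensionality reduction layer 722

    def forward(self, third_feature_image):
        x = self.conv5_x(third_feature_image)
        x = self.reduce(x)
        x = F.adaptive_avg_pool2d(x, 1).flatten(1)  # third pooling layer 75
        return F.normalize(x, dim=1)                # third normalization layer 76 -> second feature data 723
```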
  • FIG. 8 is a schematic diagram of the structure of the first actor-critic module.
• the input of the first actor-critic module is at least one first heat map 801 and the first feature image 802 .
  • the first actor-critic module respectively determines the dot product between each first heat map and the first feature image to obtain at least one sixth feature data 803 .
• a first probability corresponding to each sixth feature data can be obtained by processing that sixth feature data with the first scoring module 804 .
• the sixth feature data corresponding to the k largest first probabilities are selected to obtain k first actor feature data.
  • the k first actor feature data are respectively normalized to obtain k first critic feature data 807 .
• FIG. 9 is a schematic structural diagram of the first scoring module.
• the sixth feature data 901 passes through the normalization layer 902, the pooling layer 903 and the fully connected layer 904 in turn to obtain the eighth feature data, and the softmax layer 905 processes the eighth feature data to obtain the first probability 906 .
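• The first actor-critic module of FIG. 8 and the first scoring module of FIG. 9 can be sketched together as follows, assuming PyTorch; the channel count, the number of categories, the use of batch normalization, and taking the maximum softmax output as the scalar first probability are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FirstScoringModule(nn.Module):
    # Sketch of FIG. 9: normalization -> pooling -> fully connected -> softmax.
    def __init__(self, channels=256, num_classes=2):
        super().__init__()
        self.norm = nn.BatchNorm2d(channels)        # normalization layer 902
        self.fc = nn.Linear(channels, num_classes)  # fully connected layer 904

    def forward(self, sixth_feature_data):
        x = self.norm(sixth_feature_data)
        x = F.adaptive_avg_pool2d(x, 1).flatten(1)  # pooling layer 903
        eighth_feature_data = self.fc(x)
        return F.softmax(eighth_feature_data, dim=1).max(dim=1).values  # first probability

def first_actor_critic(first_feature_image, first_heat_maps, scorer, k):
    # first_feature_image: (1, C, H, W); first_heat_maps: (1, N, H, W); batch size 1 assumed.
    sixth = [first_feature_image * h[:, None] for h in first_heat_maps.unbind(1)]  # dot products
    probs = torch.stack([scorer(s) for s in sixth], dim=1)  # (1, N) first probabilities
    top_k = probs.argsort(dim=1, descending=True)[0, :k]    # indices of the k largest
    actor = [sixth[i] for i in top_k.tolist()]              # k first actor feature data
    return [F.normalize(a, dim=1) for a in actor]           # k first critic feature data
```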
  • FIG. 10 is a schematic structural diagram of the second actor-critic module.
• the input of the second actor-critic module is at least one second heat map and the second feature image.
• the second actor-critic module respectively determines the dot product between each second heat map 1001 and the second feature image 1002 to obtain at least one ninth feature data 1003 .
  • a second probability 1005 corresponding to the ninth characteristic data can be obtained by processing a ninth characteristic data by the second scoring module 1004 .
• the ninth feature data corresponding to the m largest second probabilities are selected to obtain m second actor feature data 1006 .
  • the m second actor feature data are respectively normalized to obtain m second critic feature data 1007 .
• FIG. 11 is a schematic structural diagram of the second scoring module.
• the ninth feature data 1101 passes through the normalization layer 1102, the pooling layer 1103 and the fully connected layer 1104 in turn to obtain the eleventh feature data, and the softmax layer 1105 processes the eleventh feature data to obtain a second probability 1106 .
  • the present disclosure also provides a training method for a vehicle identification network.
  • the training method may include the following steps:
• the training image includes the second vehicle to be identified.
  • the vehicle identification device receives the training image input by the user through the input component.
  • the above input components include: keyboard, mouse, touch screen, touch pad, audio input and so on.
  • the vehicle identification device receives the training image sent by the training data terminal.
  • the above training data terminal can be any one of the following: a mobile phone, a computer, a tablet computer, and a server.
  • the vehicle identification device receives the network to be trained input by the user through the input component.
  • the above input components include: keyboard, mouse, touch screen, touch pad, audio input and so on.
  • the vehicle identification device receives the network to be trained sent by the training data terminal.
  • the above training data terminal can be any one of the following: a mobile phone, a computer, a tablet computer, and a server.
  • the global feature information of the second vehicle to be identified includes overall appearance feature information of the second vehicle to be identified.
  • the label of the training image includes category information of the second vehicle to be identified.
• for example, suppose all the training data includes only vehicle 1 and vehicle 2.
• if the category information of the second vehicle to be identified is vehicle 1, it indicates that the second vehicle to be identified is vehicle 1 .
• the vehicle identification device may obtain the category of the second vehicle to be identified (hereinafter referred to as the global category) according to the fourteenth feature data, and may obtain a first global loss according to the difference between the global category and the category information included in the label.
• the vehicle identification device may obtain the category of the second vehicle to be identified (hereinafter referred to as the key point category) according to the fifteenth feature data, and may obtain a first key point loss according to the difference between the key point category and the category information included in the label.
• G1, p1 and Lt satisfy formula (1):
• G1, p1 and Lt satisfy formula (3):
  • the vehicle identification device adjusts the parameters of the network to be trained according to the total loss until the total loss is less than the convergence threshold, and the vehicle identification network is obtained.
  • the vehicle recognition network is obtained by adjusting the parameters of the network to be trained based on the total loss.
• the vehicle recognition network can be used to process the image to be processed to obtain the global feature information and key point feature information of the first vehicle to be recognized.
• before executing step 30, the vehicle identification device further executes the following steps:
• the vehicle identification device can obtain the category of the second vehicle to be identified (hereinafter referred to as the local pixel area category) according to the sixteenth feature data, and can obtain a first local pixel area loss according to the difference between the local pixel area category and the category information included in the label.
• after obtaining the first local pixel area loss, the vehicle identification device performs the following steps in the process of performing step 30:
• the first global loss is G1
• the first key point loss is p1
• the first local pixel area loss is a1
• the total loss is Lt
• G1, p1, a1 and Lt satisfy formula (4):
• G1, p1, a1 and Lt satisfy formula (5):
• G1, p1, a1 and Lt satisfy formula (6):
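• The bodies of formulas (4) to (6) are not reproduced in this text; as an assumption only, a common construction is a weighted sum of the component losses, sketched below with hypothetical weights.

```python
def total_loss(g1, p1, a1, w_g=1.0, w_p=1.0, w_a=1.0):
    # g1: first global loss; p1: first key point loss; a1: first local pixel area loss.
    # The weights w_g, w_p, w_a are hypothetical hyperparameters; formulas (4)-(6) may differ.
    return w_g * g1 + w_p * p1 + w_a * a1
```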
• the vehicle recognition network is obtained by adjusting the parameters of the network to be trained based on the total loss, and the vehicle recognition network can be used to process the image to be processed to obtain the global feature information, key point feature information and local pixel area feature information of the first vehicle to be recognized.
  • the vehicle identification device performs the following steps in the process of executing step 27:
• the seventeenth feature data includes key point feature information of the second vehicle to be identified, and the feature information included in any two seventeenth feature data belongs to different key points.
• the s eighteenth feature data are fused to obtain the fifteenth feature data, and in the process of using the vehicle identification network to process the to-be-processed image, the fourth feature data can be obtained according to the k seventh feature data.
• after obtaining the s eighteenth feature data and before executing step 34, the vehicle identification device further executes the following steps:
  • the first identification result includes category information of the second vehicle to be identified.
  • the vehicle identification device can obtain a first identification result according to an eighteenth characteristic data. According to the s eighteenth characteristic data, s first identification results of the second vehicle to be identified can be obtained.
  • the vehicle identification device may obtain a first identification difference according to a first identification result and a label, and obtain s first identification differences according to the s first identification results and the label.
  • the vehicle identification device obtains the keypoint category loss by determining the sum of the s first identification differences.
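• A sketch of the key point category loss described above, assuming PyTorch and assuming each first identification difference is a cross-entropy term; the text only fixes the sum over the s differences.

```python
import torch
import torch.nn.functional as F

def keypoint_category_loss(first_recognition_logits, label):
    # first_recognition_logits: list of s tensors of shape (B, num_classes), one per
    # eighteenth feature data; label: (B,) category indices from the training labels.
    diffs = [F.cross_entropy(logits, label) for logits in first_recognition_logits]
    return torch.stack(diffs).sum()  # sum of the s first identification differences
```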
• after obtaining the key point category loss, the vehicle identification device performs the following steps in the process of executing step 34:
• the first global loss is G1
• the first key point loss is p1
• the first local pixel area loss is a1
• the key point category loss is p2
• the total loss is Lt. In a possible implementation, G1, p1, p2, a1 and Lt satisfy formula (7):
• G1, p1, p2, a1 and Lt satisfy formula (8):
• G1, p1, p2, a1 and Lt satisfy formula (9):
  • the fourth feature data can be obtained according to the k seventh feature data in the process of using the vehicle identification network to process the image to be processed.
  • the vehicle identification device performs the following steps in the process of executing step 36:
• the first order may be the order of the included information amount from large to small, or the order of the included information amount from small to large.
• when the first order is the order of the information amount from large to small, the vehicle identification device selects the first s feature data in the first order as the s eighteenth feature data; when the first order is the order of the information amount from small to large, the vehicle identification device selects the last s feature data in the first order as the s eighteenth feature data.
  • the vehicle identification device also performs the following steps before performing step 40:
• in a possible implementation, the second order is the order of the key point category loss from small to large; that is, the smaller the key point category loss, the higher the ranking of the first recognition result in the second order.
• in another possible implementation, the second order is the order of the key point category loss from large to small; that is, the larger the key point category loss, the higher the ranking of the first recognition result in the second order.
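• A minimal sketch of building the second order follows; representing the recognition results and their per-result losses as Python lists is an assumption.

```python
def build_second_order(first_recognition_results, keypoint_category_losses, ascending=True):
    # Pairs each first recognition result with its key point category loss and sorts;
    # ascending=True gives the small-to-large order, False the large-to-small order.
    pairs = sorted(zip(keypoint_category_losses, range(len(first_recognition_results))),
                   key=lambda p: p[0], reverse=not ascending)
    return [first_recognition_results[i] for _, i in pairs]
```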
• after obtaining the key point sorting loss, the vehicle identification device performs the following steps in the process of executing step 40:
• the first global loss is G1
• the first key point loss is p1
• the first local pixel area loss is a1
• the key point category loss is p2
• the key point sorting loss is p3
• the total loss is Lt
• G1, p1, p2, p3, a1 and Lt satisfy formula (10):
• G1, p1, p2, p3, a1 and Lt satisfy formula (11):
• G1, p1, p2, p3, a1 and Lt satisfy formula (12):
• adding the key point category loss to the total loss can improve the accuracy of the s eighteenth feature data, and in turn improve the accuracy of the information included in the fifteenth feature data.
• in this way, the accuracy of the k seventh feature data can be improved, thereby improving the accuracy of the information included in the fourth feature data.
  • the vehicle identification device performs the following steps in the process of executing step 32:
• the nineteenth feature data includes local pixel area feature information of the second vehicle to be identified, and the feature information included in any two nineteenth feature data belongs to different local pixel regions.
• the sixteenth feature data is obtained by fusing the p twentieth feature data, and in the process of using the vehicle identification network to process the to-be-processed image, the fifth feature data can be obtained according to the m tenth feature data.
  • the vehicle identification device further executes the following steps:
  • the second identification result includes category information of the second vehicle to be identified.
• the vehicle identification device can obtain a second identification result according to a twentieth characteristic data. According to the p twentieth characteristic data, p second identification results of the second vehicle to be identified can be obtained.
  • the vehicle identification device may obtain a second identification difference according to a second identification result and a label, and may obtain p second identification differences according to the p second identification results and the label.
  • the vehicle identification device obtains the local pixel point region category loss by determining the sum of the p second identification differences.
• after obtaining the local pixel area category loss, the vehicle identification device performs the following steps in the process of executing step 45:
• the first global loss is G1
• the first key point loss is p1
• the first local pixel area loss is a1
• the key point category loss is p2
• the key point sorting loss is p3
• the local pixel area category loss is a2
• the total loss is Lt.
• G1, p1, p2, p3, a1, a2 and Lt satisfy formula (13):
• G1, p1, p2, p3, a1, a2 and Lt satisfy formula (14):
• G1, p1, p2, p3, a1, a2 and Lt satisfy formula (15):
  • the fifth feature data can be obtained according to the m tenth feature data in the process of using the vehicle identification network to process the image to be processed.
  • the vehicle identification device performs the following steps in the process of executing step 47:
• the third order may be the order of the included information amount from large to small, or the order of the included information amount from small to large.
• when the third order is the order of the information amount from large to small, the vehicle identification device selects the first p feature data in the third order as the p twentieth feature data; when the third order is the order of the information amount from small to large, the vehicle identification device selects the last p feature data in the third order as the p twentieth feature data.
  • the vehicle identification device also performs the following steps before performing step 51:
• in a possible implementation, the fourth order is the order of the local pixel area category loss from small to large; that is, the smaller the local pixel area category loss, the higher the ranking of the second recognition result in the fourth order.
• in another possible implementation, the fourth order is the order of the local pixel area category loss from large to small; that is, the larger the local pixel area category loss, the higher the ranking of the second recognition result in the fourth order.
• after obtaining the local pixel area sorting loss, the vehicle identification device performs the following steps in the process of executing step 51:
• the total loss is obtained according to the above-mentioned first global loss, first key point loss, first local pixel area loss, key point category loss, key point sorting loss, local pixel area category loss and local pixel area sorting loss.
• the first global loss is G1
• the first key point loss is p1
• the first local pixel area loss is a1
• the key point category loss is p2
• the key point sorting loss is p3
• the local pixel area category loss is a2
• the local pixel area sorting loss is a3
• the total loss is Lt.
• G1, p1, p2, p3, a1, a2, a3 and Lt satisfy formula (16):
• G1, p1, p2, p3, a1, a2, a3 and Lt satisfy formula (17):
• G1, p1, p2, p3, a1, a2, a3 and Lt satisfy formula (18):
• adding the local pixel area category loss to the total loss can improve the accuracy of the p twentieth feature data, and in turn improve the accuracy of the information included in the sixteenth feature data.
• in this way, the accuracy of the m tenth feature data can be improved, thereby improving the accuracy of the information included in the fifth feature data.
  • the first global loss includes a global focus loss
  • the vehicle identification device performs the following steps in the process of performing step 28:
  • the third identification result includes category information of the second vehicle to be identified.
  • the vehicle identification device can determine the category of the second vehicle to be identified according to the fourteenth characteristic data, and then obtain the third identification result.
• B is the number of training images
• αn is a positive number
• γ is a non-negative number
• un is the probability corresponding to the category of the label in the third recognition result.
  • the training image includes image a
• suppose the third recognition result 1 is obtained by processing image a using the network to be trained, and the category included in the label of image a is vehicle 1 (that is, the label of image a is vehicle 1).
  • the probability that the second vehicle to be recognized in the image a is the vehicle 1 is 0.9
  • the probability that the second vehicle to be recognized in the image a is the vehicle 2 is 0.1.
• in this case, LF1 = -2×(1-0.9)²×log 0.9.
  • the training image includes image a and image b
  • the image a is processed by the network to be trained to obtain the third recognition result 1
  • the image b is processed by the network to be trained to obtain the third recognition result 2.
  • the category included in the label of image a is vehicle 1 (that is, the label of image a is vehicle 1)
• the category included in the label of image b is vehicle 2 (that is, the label of image b is vehicle 2).
  • the probability that the second vehicle to be recognized in the image a is the vehicle 1 is 0.3
  • the probability that the second vehicle to be recognized in the image a is the vehicle 2 is 0.7.
  • the probability that the second vehicle to be recognized in the image b is the vehicle 1 is 0.2
  • the probability that the second vehicle to be recognized in the image b is the vehicle 2 is 0.8.
• in this case, LF1 = -2×(1-0.3)²×log 0.3 - 2×(1-0.8)²×log 0.8.
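• The worked examples above are consistent with a focal loss with αn = 2, γ = 2 and a natural logarithm; this combination is an assumption and is sketched below.

```python
import math

def global_focus_loss(label_probs, alpha=2.0, gamma=2.0):
    # label_probs: for each of the B training images, the probability u_n that the
    # third recognition result assigns to the category in the image's label.
    return sum(-alpha * (1.0 - u) ** gamma * math.log(u) for u in label_probs)

print(global_focus_loss([0.3, 0.8]))  # reproduces the two-image example above (~1.1977)
```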
• the image whose third recognition result has its maximum probability between the first probability threshold and the second probability threshold is called a first difficult sample, and the images in the training images other than the first difficult samples are called first easy samples.
  • the network to be trained obtains the third recognition result 1 by processing the image a.
  • the probability that the second vehicle to be recognized in the image a is the vehicle 1 is 0.8, and the probability that the second vehicle to be recognized in the image a is the vehicle 2 is 0.2. Since the maximum probability of the third recognition result 1 is 0.8, the maximum probability is greater than the second probability threshold, and the image a is the first easy sample.
  • the probability that the second vehicle to be recognized in the image a is the vehicle 1 is 0.5
• the probability that the second vehicle to be recognized in the image a is the vehicle 2 is 0.5. Since the maximum probability of the third recognition result 1 is 0.5, which is greater than the first probability threshold and less than the second probability threshold, image a is a first difficult sample.
• the global focus loss is obtained by calculating the focal loss of the third recognition result, and the total loss is then determined accordingly, which can improve the training effect of the network to be trained on the first difficult samples.
  • the training image belongs to a training image set
  • the training image set further includes a first positive sample image of the training image and a first negative sample image of the training image
• the first global loss further includes a global triplet loss.
  • the vehicle identification device also executes the following steps:
• the category information included in the label of the first positive sample image is the same as the category information included in the label of the training image.
• the category information included in the label of the first negative sample image is different from the category information included in the label of the training image.
  • the feature data of the first positive sample image includes semantic information of the first positive sample image, and the semantic information can be used to identify the category of the second vehicle to be identified in the first positive sample image.
• the feature data of the first negative sample image includes semantic information of the first negative sample image, and the semantic information can be used to identify the category of the second vehicle to be recognized in the first negative sample image.
• the vehicle identification device calculates the similarity between the twelfth feature data and the feature data of the first positive sample image to obtain the first positive similarity, and calculates the similarity between the twelfth feature data and the feature data of the first negative sample image to obtain the first negative similarity.
• the first positive similarity is the two-norm (that is, the L2 norm) between the twelfth feature data and the feature data of the first positive sample image.
• the first negative similarity is the two-norm between the twelfth feature data and the feature data of the first negative sample image.
• the vehicle recognition apparatus may classify the images in the training image set other than the training image into a positive sample image set and a negative sample image set.
  • the class information included in the labels of the images in the positive sample image set is the same as the class information included in the labels of the training images, and the class information included in the labels of the images in the negative sample image set is different from the class information included in the labels of the training images.
  • the vehicle identification device performs feature extraction processing on the images in the positive sample image set to obtain a positive sample feature data set, and performs feature extraction processing on the images in the negative sample image set to obtain a negative sample feature data set.
• the vehicle identification device calculates the similarity between the twelfth feature data and the feature data in the positive sample feature data set to obtain a first positive similarity set, and calculates the similarity between the twelfth feature data and the feature data in the negative sample feature data set to obtain a first negative similarity set.
  • the minimum value in the first positive similarity set is called the minimum similarity within the first class
  • the maximum value in the first negative similarity set is called the maximum similarity outside the first class.
• the similarity between the twelfth feature data and the feature data in the positive sample feature data set is the two-norm between the twelfth feature data and the feature data in the positive sample feature data set.
• the similarity between the twelfth feature data and the feature data in the negative sample feature data set is the two-norm between the twelfth feature data and the feature data in the negative sample feature data set.
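• A hedged sketch of the global triplet loss follows, assuming PyTorch; the two-norm matches the similarities described above, while the hinge form and the margin value are assumptions.

```python
import torch
import torch.nn.functional as F

def global_triplet_loss(anchor, positive, negative, margin=0.3):
    # anchor: twelfth feature data (B, D); positive/negative: feature data of the first
    # positive/negative sample images. The margin and hinge form are assumptions.
    d_pos = torch.norm(anchor - positive, p=2, dim=1)  # first positive similarity
    d_neg = torch.norm(anchor - negative, p=2, dim=1)  # first negative similarity
    return F.relu(d_pos - d_neg + margin).mean()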
  • the global triplet loss can improve the accuracy of the recognition result of the second to-be-recognized vehicle obtained by the network to be trained based on the twelfth feature data, thereby improving the classification accuracy of the first to-be-recognized vehicle by the vehicle recognition network .
  • the first global loss may be the sum of the global focus loss and the global triplet loss.
• before performing step 56, the vehicle identification device further performs the following steps:
  • the fourth identification result includes category information of the second vehicle to be identified.
  • the vehicle identification device can determine the category of the second vehicle to be identified according to the fifteenth characteristic data, and then obtain a fourth identification result.
• B is the number of training images
• αn is a positive number
• γ is a non-negative number
• um is the probability corresponding to the category of the label in the fourth recognition result.
  • the training image includes image a and image b
  • the image a is processed by the network to be trained to obtain the fourth recognition result 1
  • the image b is processed by the network to be trained to obtain the fourth recognition result 2.
  • the category included in the label of image a is vehicle 1 (that is, the label of image a is vehicle 1)
• the category included in the label of image b is vehicle 2 (that is, the label of image b is vehicle 2).
  • the probability that the second vehicle to be recognized in the image a is the vehicle 1 is 0.3
  • the probability that the second vehicle to be recognized in the image a is the vehicle 2 is 0.7.
  • the probability that the second vehicle to be recognized in the image b is the vehicle 1 is 0.2
  • the probability that the second vehicle to be recognized in the image b is the vehicle 2 is 0.8.
• in this case, LF2 = -2×(1-0.3)²×log 0.3 - 2×(1-0.8)²×log 0.8.
• after obtaining the key point focus loss, the vehicle identification device performs the following steps in the process of executing step 58:
• the total loss is obtained according to the above-mentioned first global loss, first key point loss, first local pixel area loss, key point category loss, key point sorting loss, local pixel area category loss, key point focus loss and local pixel area sorting loss.
• the first global loss is G1
• the first key point loss is p1
• the first local pixel area loss is a1
• the key point category loss is p2
• the key point sorting loss is p3
• the local pixel area category loss is a2
• the local pixel area sorting loss is a3
• the key point focus loss is p4
• the total loss is Lt
• G1, p1, p2, p3, p4, a1, a2, a3 and Lt satisfy formula (23):
• G1, p1, p2, p3, p4, a1, a2, a3 and Lt satisfy formula (25):
• the image whose fourth recognition result has its maximum probability between the third probability threshold and the fourth probability threshold is called a second difficult sample, and the images in the training images other than the second difficult samples are called second easy samples.
  • the third probability threshold is 0.4 and the fourth probability threshold is 0.7.
  • the network to be trained obtains the fourth recognition result 1 by processing the image a.
• the probability that the second vehicle to be recognized in the image a is the vehicle 1 is 0.8, and the probability that the second vehicle to be recognized in the image a is the vehicle 2 is 0.2. Since the maximum probability of the fourth recognition result 1 is 0.8, which is greater than the fourth probability threshold, image a is a second easy sample.
  • the probability that the second vehicle to be recognized in the image a is the vehicle 1 is 0.5
• the probability that the second vehicle to be recognized in the image a is the vehicle 2 is 0.5. Since the maximum probability of the fourth recognition result 1 is 0.5, which is greater than the third probability threshold and less than the fourth probability threshold, image a is a second difficult sample.
• the key point focus loss is obtained and added to the total loss, which can improve the training effect of the network to be trained on the second difficult samples.
• before performing step 63, the vehicle identification device further performs the following steps:
• the key point triplet loss is obtained according to the fifteenth feature data, the feature data of the first positive sample image, and the feature data of the first negative sample image.
• the vehicle identification device calculates the similarity between the fifteenth feature data and the feature data of the first positive sample image to obtain the second positive similarity, and calculates the similarity between the fifteenth feature data and the feature data of the first negative sample image to obtain the second negative similarity.
• the second positive similarity is the two-norm between the fifteenth feature data and the feature data of the first positive sample image.
• the second negative similarity is the two-norm between the fifteenth feature data and the feature data of the first negative sample image.
  • the vehicle identification device performs feature extraction processing on the images in the positive sample image set to obtain a positive sample feature data set, and performs feature extraction processing on the images in the negative sample image set to obtain a negative sample feature data set.
• the vehicle identification device calculates the similarity between the fifteenth feature data and the feature data in the positive sample feature data set to obtain a second positive similarity set, and calculates the similarity between the fifteenth feature data and the feature data in the negative sample feature data set to obtain a second negative similarity set.
  • the minimum value in the second positive similarity set is called the minimum similarity within the second class
  • the maximum value in the second negative similarity set is called the maximum similarity outside the second class.
• the similarity between the fifteenth feature data and the feature data in the positive sample feature data set is the two-norm between the fifteenth feature data and the feature data in the positive sample feature data set.
• the similarity between the fifteenth feature data and the feature data in the negative sample feature data set is the two-norm between the fifteenth feature data and the feature data in the negative sample feature data set.
• after obtaining the key point triplet loss, the vehicle identification device performs the following steps in the process of executing step 63:
• the total loss is obtained according to the above-mentioned first global loss, first key point loss, first local pixel area loss, key point category loss, key point sorting loss, local pixel area category loss, key point focus loss, key point triplet loss and local pixel area sorting loss.
• the first global loss is G1
• the first key point loss is p1
• the first local pixel area loss is a1
• the key point category loss is p2
• the key point sorting loss is p3
• the local pixel area category loss is a2
• the local pixel area sorting loss is a3
• the key point focus loss is p4
• the key point triplet loss is p5
• the total loss is Lt.
• G1, p1, p2, p3, p4, p5, a1, a2, a3 and Lt satisfy formula (28):
• G1, p1, p2, p3, p4, p5, a1, a2, a3 and Lt satisfy formula (29):
• G1, p1, p2, p3, p4, p5, a1, a2, a3 and Lt satisfy formula (30):
• the key point triplet loss can improve the accuracy of the recognition result of the second vehicle to be recognized obtained by the network to be trained based on the fifteenth feature data, thereby improving the classification accuracy of the vehicle recognition network for the first vehicle to be recognized.
• before performing step 66, the vehicle identification device further performs the following steps:
  • the fifth identification result includes category information of the second vehicle to be identified.
• the vehicle identification device can determine the category of the second vehicle to be identified according to the sixteenth characteristic data, and then obtain the fifth identification result.
• the focal loss of the fifth identification result is obtained as the local pixel area focus loss.
• B is the number of training images
• αn is a positive number
• γ is a non-negative number
• uk is the probability corresponding to the category of the label in the fifth recognition result.
• in one example, the training image includes image a, the fifth recognition result 1 is obtained by processing image a with the network to be trained, and the category included in the label of image a is vehicle 1 (that is, the label of image a is vehicle 1).
• if the probability that the second vehicle to be recognized in image a is vehicle 1 is 0.9, then LF3 = -2×(1-0.9)²×log 0.9.
  • the training image includes image a and image b
  • the image a is processed by the network to be trained to obtain the fifth recognition result 1
  • the image b is processed by the network to be trained to obtain the fifth recognition result 2.
  • the category included in the label of image a is vehicle 1 (that is, the label of image a is vehicle 1)
• the category included in the label of image b is vehicle 2 (that is, the label of image b is vehicle 2).
  • the probability that the second vehicle to be recognized in the image a is the vehicle 1 is 0.3
  • the probability that the second vehicle to be recognized in the image a is the vehicle 2 is 0.7.
  • the probability that the second vehicle to be recognized in the image b is the vehicle 1 is 0.2
  • the probability that the second vehicle to be recognized in the image b is the vehicle 2 is 0.8.
• in this case, LF3 = -2×(1-0.3)²×log 0.3 - 2×(1-0.8)²×log 0.8.
• after obtaining the local pixel area focus loss, the vehicle identification device performs the following steps in the process of executing step 66:
• the total loss is obtained according to the above-mentioned first global loss, first key point loss, first local pixel area loss, key point category loss, key point sorting loss, local pixel area category loss, key point focus loss, key point triplet loss, local pixel area focus loss and local pixel area sorting loss.
• the first global loss is G1
• the first key point loss is p1
• the first local pixel area loss is a1
• the key point category loss is p2
• the key point sorting loss is p3
• the local pixel area category loss is a2
• the local pixel area sorting loss is a3
• the local pixel area focus loss is a4
• the key point focus loss is p4
• the key point triplet loss is p5
• the total loss is Lt.
• G1, p1, p2, p3, p4, p5, a1, a2, a3, a4 and Lt satisfy formula (32):
• G1, p1, p2, p3, p4, p5, a1, a2, a3, a4 and Lt satisfy formula (33):
• G1, p1, p2, p3, p4, p5, a1, a2, a3, a4 and Lt satisfy formula (34):
• the image whose fifth recognition result has its maximum probability between the fifth probability threshold and the sixth probability threshold is called a third difficult sample, and the images in the training images other than the third difficult samples are called third easy samples.
  • the fifth probability threshold is 0.4 and the sixth probability threshold is 0.7.
  • the network to be trained obtains the fifth recognition result 1 by processing the image a.
• the probability that the second vehicle to be recognized in the image a is the vehicle 1 is 0.8, and the probability that the second vehicle to be recognized in the image a is the vehicle 2 is 0.2. Since the maximum probability of the fifth recognition result 1 is 0.8, which is greater than the sixth probability threshold, image a is a third easy sample.
  • the probability that the second vehicle to be recognized in the image a is the vehicle 1 is 0.5
• the probability that the second vehicle to be recognized in the image a is the vehicle 2 is 0.5. Since the maximum probability of the fifth recognition result 1 is 0.5, which is greater than the fifth probability threshold and less than the sixth probability threshold, image a is a third difficult sample.
• the local pixel area focus loss is obtained and the total loss is then determined, which can improve the training effect on the third difficult samples, thereby improving the training effect of the network to be trained.
• before performing step 69, the vehicle identification device further performs the following steps:
• the local pixel area triplet loss is obtained according to the sixteenth feature data, the feature data of the first positive sample image, and the feature data of the first negative sample image.
• the vehicle identification device calculates the similarity between the sixteenth feature data and the feature data of the first positive sample image to obtain a third positive similarity, and calculates the similarity between the sixteenth feature data and the feature data of the first negative sample image to obtain a third negative similarity.
• the third positive similarity is the two-norm between the sixteenth feature data and the feature data of the first positive sample image.
• the third negative similarity is the two-norm between the sixteenth feature data and the feature data of the first negative sample image.
• the vehicle identification device calculates the similarity between the sixteenth feature data and the feature data in the positive sample feature data set to obtain a third positive similarity set, and calculates the similarity between the sixteenth feature data and the feature data in the negative sample feature data set to obtain a third negative similarity set.
  • the minimum value in the third positive similarity set is called the minimum similarity within the third class
  • the maximum value in the third negative similarity set is called the maximum similarity outside the third class.
• the similarity between the sixteenth feature data and the feature data in the positive sample feature data set is the two-norm between the sixteenth feature data and the feature data in the positive sample feature data set.
• the similarity between the sixteenth feature data and the feature data in the negative sample feature data set is the two-norm between the sixteenth feature data and the feature data in the negative sample feature data set.
• after obtaining the local pixel area triplet loss, the vehicle identification device performs the following steps in the process of executing step 69:
• the total loss is obtained according to the above-mentioned first global loss, first key point loss, first local pixel area loss, key point category loss, key point sorting loss, local pixel area category loss, key point focus loss, key point triplet loss, local pixel area focus loss, local pixel area triplet loss and local pixel area sorting loss.
• the first global loss is G1
• the first key point loss is p1
• the first local pixel area loss is a1
• the key point category loss is p2
• the key point sorting loss is p3
• the local pixel area category loss is a2
• the local pixel area sorting loss is a3
• the local pixel area focus loss is a4
• the local pixel area triplet loss is a5
• the key point focus loss is p4
• the key point triplet loss is p5
• the total loss is Lt. In one possible implementation, G1, p1, p2, p3, p4, p5, a1, a2, a3, a4, a5 and Lt satisfy formula (37):
• G1, p1, p2, p3, p4, p5, a1, a2, a3, a4, a5 and Lt satisfy formula (38):
• G1, p1, p2, p3, p4, p5, a1, a2, a3, a4, a5 and Lt satisfy formula (39):
• the local pixel area triplet loss can improve the accuracy of the recognition result of the second vehicle to be recognized obtained by the network to be trained based on the sixteenth feature data, thereby improving the classification accuracy of the vehicle recognition network for the first vehicle to be recognized.
  • the vehicle identification device acquires the generated data set, and uses the generated data set to train the key point and local pixel point region generation module.
  • the generated data set includes at least one heatmap training image
  • the labels of each heatmap training image include a keypoint label heatmap and a local pixel region label heatmap.
  • the key point label heatmap includes location information of key points in the heatmap training image
  • the local pixel area label heatmap includes location information of the local pixel area in the heatmap training image.
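• A minimal training-step sketch for the key point and local pixel region generation module follows, assuming PyTorch; the module's output signature, the mean-squared-error objective and the optimizer are assumptions, as the text only fixes the label heat maps.

```python
import torch.nn.functional as F

def heatmap_training_step(module, optimizer, heatmap_training_image,
                          keypoint_label_heatmap, local_region_label_heatmap):
    # module is assumed to return the predicted key point heat maps and the predicted
    # local pixel region heat maps for the heat map training image.
    pred_keypoints, pred_regions = module(heatmap_training_image)
    loss = (F.mse_loss(pred_keypoints, keypoint_label_heatmap) +
            F.mse_loss(pred_regions, local_region_label_heatmap))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```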
  • the embodiments of the present disclosure also provide an application scenario of the vehicle identification method. With the rapid growth of the number of cameras in public places, how to effectively determine the whereabouts of hit-and-run vehicles through massive video streams is of great significance.
  • the police can input the image of the hit-and-run vehicle into the vehicle identification device.
  • the vehicle identification device uses the technical solutions provided by the embodiments of the present disclosure to extract feature data of the hit-and-run vehicle from the image of the hit-and-run vehicle.
  • the vehicle identification device can be connected with a plurality of surveillance cameras, different surveillance cameras are installed in different locations, and the vehicle identification device can obtain real-time captured video streams from each surveillance camera.
  • the vehicle identification device uses the technical solutions provided by the embodiments of the present disclosure to extract feature data of vehicles in the video stream from the images in the video stream to obtain a feature database.
• the vehicle identification device compares the feature data of the hit-and-run vehicle with the feature data in the feature database, and takes the feature data matching the feature data of the hit-and-run vehicle as the target feature data. The image corresponding to the target feature data is determined to be an image containing the hit-and-run vehicle, and the whereabouts of the hit-and-run vehicle can then be determined according to the images containing it.
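• The retrieval flow can be sketched as follows, assuming the feature data have already been extracted by the vehicle identification network; the cosine similarity and the threshold value are assumptions.

```python
import torch

def find_hit_and_run_matches(query_feature, feature_database, threshold=0.85):
    # query_feature: (D,) feature data of the hit-and-run vehicle;
    # feature_database: (N, D) feature data extracted from the surveillance video streams.
    q = query_feature / query_feature.norm()
    db = feature_database / feature_database.norm(dim=1, keepdim=True)
    scores = db @ q                                  # cosine similarity per database entry
    matches = (scores > threshold).nonzero(as_tuple=True)[0]
    return matches, scores[matches]                  # indices of candidate images and scores
```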
• the writing order of the steps does not imply a strict execution order and does not constitute any limitation on the implementation process; the specific execution order of each step should be determined by its function and possible internal logic.
  • FIG. 12 is a schematic structural diagram of a vehicle identification device 1 according to an embodiment of the present disclosure.
• the vehicle identification device 1 includes: an acquisition unit 11 , a first processing unit 12 , a second processing unit 13 , a fusion processing unit 14 , a third processing unit 15 and a fourth processing unit 16 , wherein:
• an acquisition unit 11 , configured to acquire the to-be-processed image containing the first vehicle to be identified;
  • the first processing unit 12 is configured to perform a first feature extraction process on the to-be-processed image to obtain first feature data including local feature information of the first to-be-recognized vehicle;
  • the second processing unit 13 is configured to perform a second feature extraction process on the to-be-processed image to obtain second feature data including global feature information of the first to-be-recognized vehicle;
• the fusion processing unit 14 is configured to perform fusion processing on the first feature data and the second feature data to obtain third feature data of the first vehicle to be identified; the third feature data is used to obtain the first identification result of the first vehicle to be identified.
  • the local feature information includes key point feature information
  • the first feature data includes feature information of at least one key point of the vehicle to be identified.
  • the local feature information further includes local pixel region feature information
  • the first feature data further includes feature information of at least one local pixel region of the vehicle to be identified.
  • the first processing unit 12 is configured as:
• the fourth feature data includes feature information of at least one key point of the first vehicle to be identified;
• the fifth feature data includes feature information of at least one local pixel area of the first vehicle to be identified; the local pixel area belongs to the pixel area covered by the first vehicle to be identified, and the area of the local pixel area is smaller than the area of the pixel area covered by the first vehicle to be identified;
• the fourth feature data and the fifth feature data are fused to obtain the first feature data.
  • the first processing unit 12 is configured as:
• the sixth feature data includes feature information of the key points, and the feature information included in any two sixth feature data belongs to different key points;
• select, from the at least one sixth feature data, the k feature data including the largest amount of information to obtain k seventh feature data; k is an integer not less than 1;
• the fourth feature data is obtained according to the k seventh feature data.
  • the first processing unit 12 is configured as:
• the first heat map includes position information of the key points in the to-be-processed image, and the information included in any two first heat maps belongs to different key points;
• perform a seventh feature extraction process on the to-be-processed image to obtain a first feature image of the to-be-processed image;
  • the first feature image includes feature information of key points in the to-be-processed image;
  • the dot product between each of the first heat maps and the first feature image is respectively determined to obtain the at least one sixth feature data.
  • the first processing unit 12 is configured as:
• pooling is performed on each of the at least one sixth feature data to obtain at least one eighth feature data;
• at least one first probability is obtained according to the amount of information included in the at least one eighth feature data; the first probability is used to characterize the amount of information included in the sixth feature data, and the first probabilities are in one-to-one correspondence with the sixth feature data;
• select, from the at least one sixth feature data, the sixth feature data corresponding to the largest k first probabilities as the k seventh feature data; or,
• when the first probability is negatively correlated with the amount of information included in the sixth feature data, select the sixth feature data corresponding to the smallest k first probabilities as the k seventh feature data.
  • the first processing unit 12 is configured as:
• the ninth feature data includes feature information of the local pixel areas, and the feature information included in any two ninth feature data belongs to different local pixel regions;
• select, from the at least one ninth feature data, the m feature data containing the most information to obtain m tenth feature data; m is an integer not less than 1;
• the fifth feature data is obtained according to the m tenth feature data.
  • the first processing unit 12 is configured as:
• the second heat map includes position information of the local pixel region in the to-be-processed image, and the information included in any two second heat maps belongs to different local pixel regions;
  • the second feature image includes feature information of a local pixel area in the to-be-processed image
  • the dot product between each of the second heat maps and the second feature image is determined respectively to obtain the at least one ninth feature data.
  • the first processing unit 12 is configured as:
• at least one second probability is obtained according to the amount of information included in the at least one ninth feature data; the second probability is used to represent the amount of information included in the ninth feature data, and the second probabilities are in one-to-one correspondence with the ninth feature data;
• select the ninth feature data corresponding to the largest m second probabilities as the m tenth feature data; or,
• the at least one local pixel area includes a first pixel area and a second pixel area, the number of the ninth feature data and m are both greater than 1, and the m tenth feature data include twelfth feature data and thirteenth feature data; the twelfth feature data includes feature information of the first pixel area, and the thirteenth feature data includes feature information of the second pixel area;
  • the first processing unit 12 is configured as:
• the first weight is obtained according to the amount of information included in the twelfth feature data, and the second weight is obtained according to the amount of information included in the thirteenth feature data;
• the first weight is positively correlated with the amount of information included in the twelfth feature data, and the second weight is positively correlated with the amount of information included in the thirteenth feature data;
• the twelfth feature data and the thirteenth feature data are weighted and fused according to the first weight and the second weight to obtain the fifth feature data, as sketched below.
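• A hedged sketch of this weighted fusion follows, assuming PyTorch; using a softmax over the information amounts as the weights is an assumption, as the text only requires each weight to be positively correlated with its information amount.

```python
import torch

def fuse_fifth_feature_data(twelfth, thirteenth, info_twelfth, info_thirteenth):
    # info_* are 0-dim tensors measuring the information amount (e.g. the second
    # probabilities); softmax keeps each weight positively correlated with its amount.
    weights = torch.softmax(torch.stack([info_twelfth, info_thirteenth]), dim=0)
    return weights[0] * twelfth + weights[1] * thirteenth  # fifth feature data
```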
  • the vehicle identification method executed by the vehicle identification device is applied to a vehicle identification network, and the obtaining unit is further configured to obtain a training image including the second vehicle to be identified and the network to be trained;
• the first processing unit 12 is further configured to use the network to be trained to process the training image to obtain fourteenth feature data including the global feature information of the second vehicle to be identified and fifteenth feature data including the key point feature information of the second vehicle to be identified;
  • the third processing unit 15 is configured to obtain the first global loss according to the fourteenth feature data and the label of the training image
  • the third processing unit 15 is further configured to obtain the first key point loss according to the fifteenth feature data and the label;
  • the third processing unit 15 is further configured to obtain the total loss of the network to be trained according to the first global loss and the first key point loss;
  • the fourth processing unit 16 is configured to adjust parameters of the network to be trained based on the total loss to obtain the vehicle identification network.
• the first processing unit 12 is further configured to, before the total loss of the network to be trained is obtained according to the first global loss and the first key point loss, use the network to be trained to process the training image to obtain sixteenth feature data including feature information of the local pixel area of the second vehicle to be identified;
  • the third processing unit 15 is further configured to obtain the first local pixel area loss according to the sixteenth feature data and the label;
  • the third processing unit 15 is further configured to: obtain the total loss according to the first global loss, the first key point loss and the first local pixel area loss.
  • the first processing unit 12 is configured as:
• the seventeenth feature data includes the key point feature information of the second vehicle to be identified, and the feature information included in any two seventeenth feature data belongs to different key points;
• select, from the at least one seventeenth feature data, the s feature data including the largest amount of information to obtain s eighteenth feature data; s is an integer not less than 1;
• the s eighteenth feature data are fused to obtain the fifteenth feature data.
• the third processing unit 15 is further configured to, before the total loss is obtained, obtain s first identification results of the second vehicle to be identified according to the s eighteenth feature data;
• the key point category loss is obtained according to the s first identification results and the label;
  • the fourth processing unit 16 is configured as:
  • the total loss is obtained according to the first global loss, the first keypoint loss, the first local pixel region loss, and the keypoint category loss.
  • the first processing unit 12 is configured as:
• the first order may be the order of the amount of information included from large to small, or the order of the amount of information included from small to large;
• select, in the first order, from the at least one seventeenth feature data, the s feature data including the most information to obtain the s eighteenth feature data;
• the third processing unit 15 is configured to, before the total loss is obtained according to the first global loss, the first key point loss, the first local pixel area loss and the key point category loss, sort the s first recognition results according to the corresponding key point category losses to obtain a second order; the second order may be the order of the key point category loss from large to small, or the order of the key point category loss from small to large;
  • the fourth processing unit 16 is configured as:
  • the total loss is obtained according to the first global loss, the first keypoint loss, the first local pixel region loss, the keypoint category loss, and the keypoint sorting loss.
  • the first processing unit 12 is configured as:
• use the network to be trained to process the training image to obtain at least one nineteenth feature data;
• the nineteenth feature data includes the feature information of the local pixel area, and the feature information included in any two nineteenth feature data belongs to different local pixel regions;
• select, from the at least one nineteenth feature data, the p feature data including the largest amount of information to obtain p twentieth feature data; p is an integer not less than 1;
• the third processing unit 15 is configured to, before the total loss is obtained according to the first global loss, the first key point loss, the first local pixel area loss, the key point category loss and the key point sorting loss, obtain p second recognition results of the second vehicle to be recognized according to the p twentieth feature data;
  • the local pixel region category loss is obtained according to the differences between the p second recognition results and the label;
  • the fourth processing unit 16 is configured to:
  • obtain the total loss according to the first global loss, the first keypoint loss, the first local pixel region loss, the keypoint category loss, the keypoint sorting loss and the local pixel region category loss.
  • the first processing unit 12 is configured to:
  • sort the at least one piece of nineteenth feature data according to the amount of information included, to obtain a third order; the third order is descending order of the amount of information included, or the third order may be ascending order of the amount of information included;
  • select, according to the third order, the p pieces of feature data including the most information from the at least one piece of nineteenth feature data, to obtain the p pieces of twentieth feature data;
  • the third processing unit 15 is configured to, before the total loss is obtained according to the first global loss, the first keypoint loss, the first local pixel region loss, the keypoint category loss, the keypoint sorting loss and the local pixel region category loss, sort the p second recognition results according to their corresponding local pixel region category losses to obtain a fourth order; the fourth order is descending order of the local pixel region category loss, or the fourth order may be ascending order of the local pixel region category loss;
  • the local pixel region sorting loss is obtained according to the difference between the third order and the fourth order;
  • the fourth processing unit 16 is configured to:
  • obtain the total loss according to the first global loss, the first keypoint loss, the first local pixel region loss, the keypoint category loss, the keypoint sorting loss, the local pixel region category loss and the local pixel region sorting loss.
  • the first global loss includes a global focal loss;
  • the third processing unit 15 is configured to:
  • obtain a third recognition result of the second vehicle to be recognized according to the fourteenth feature data;
  • the focal loss of the third recognition result is obtained as the global focal loss.
  • the training image belongs to a training image set;
  • the training image set further includes a first positive sample image of the training image and a first negative sample image of the training image;
  • the first global loss also includes a global triplet loss;
  • the third processing unit 15 is further configured to:
  • the global triplet loss is obtained according to the twelfth feature data, the feature data of the first positive sample image, and the feature data of the first negative sample image.
  • by fusing the first feature data and the second feature data, the vehicle identification device can obtain third feature data that includes both the global feature information and the local feature information of the first vehicle to be recognized. Using the third feature data as the feature data of the first vehicle to be recognized enriches the information included in that feature data.
  • the functions or modules included in the apparatuses provided in the embodiments of the present disclosure may be used to execute the methods described in the above method embodiments.
  • FIG. 13 is a schematic diagram of a hardware structure of a vehicle identification device according to an embodiment of the present disclosure.
  • the vehicle identification device 2 includes a processor 21 , a memory 22 , an input device 23 , and an output device 24 .
  • the processor 21 , the memory 22 , the input device 23 , and the output device 24 are coupled through a connector, and the connector includes various types of interfaces, transmission lines, or buses, which are not limited in this embodiment of the present disclosure. It should be understood that, in various embodiments of the present disclosure, coupling refers to mutual connection in a specific manner, including direct connection or indirect connection through other devices, such as various interfaces, transmission lines, and buses.
  • the processor 21 may be one or more graphics processing units (graphics processing units, GPUs).
  • the GPU may be a single-core GPU or a multi-core GPU.
  • the processor 21 may be a processor group composed of multiple GPUs, and the multiple processors are coupled to each other through one or more buses.
  • the processor may also be other types of processors, etc., which is not limited in this embodiment of the present disclosure.
  • the memory 22 may be used to store computer program instructions, as well as various types of computer program code, including program code for implementing the disclosed aspects.
  • the memory includes, but is not limited to, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), or compact disc read-only memory (CD-ROM), and the memory is used for related instructions and data.
  • the input device 23 is configured to input data and/or signals
  • the output device 24 is configured to output data and/or signals.
  • the input device 23 and the output device 24 may be independent devices or may be an integral device.
  • the memory 22 can be used not only to store related instructions, but also to store related data.
  • the memory 22 can be used to store images to be processed obtained through the input device 23, or the memory 22 can also be used to store The third characteristic data obtained through the processor 21 is stored, and the embodiment of the present disclosure does not limit the data specifically stored in the memory.
  • FIG. 13 only shows a simplified design of a vehicle identification device.
  • the vehicle identification device may also include other necessary elements, including but not limited to any number of input/output devices, processors and memories, and all vehicle identification devices that can implement the embodiments of the present disclosure fall within the scope of protection of the present disclosure.
  • the disclosed system, apparatus and method may be implemented in other manners.
  • the apparatus embodiments described above are only illustrative.
  • the division of the units is only a logical function division; in actual implementation there may be other division methods.
  • multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be indirect coupling or communication connection of devices or units through some interfaces, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
  • each functional unit in each embodiment of the present disclosure may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • in the above-mentioned embodiments, implementation may be in whole or in part by software, hardware, firmware or any combination thereof.
  • when implemented in software, it can be realized in whole or in part in the form of a computer program product.
  • the computer program product includes one or more computer instructions.
  • the computer may be a general purpose computer, special purpose computer, computer network, or other programmable device.
  • the computer instructions may be stored in or transmitted over a computer-readable storage medium.
  • the computer instructions can be transmitted from one website, computer, server or data center to another website, computer, server or data center via wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave) means.
  • the computer-readable storage medium may be any available medium that can be accessed by a computer or a data storage device such as a server, data center, etc. that includes an integration of one or more available media.
  • the available media may be magnetic media (e.g., floppy disks, hard disks, magnetic tapes), optical media (e.g., digital versatile discs (DVDs)), or semiconductor media (e.g., solid state disks (SSDs)), etc.
  • the processes can be completed by a computer program instructing the relevant hardware, and the program can be stored in a computer-readable storage medium.
  • when the program is executed, it may include the processes of the foregoing method embodiments.
  • the aforementioned storage medium includes: read-only memory (read-only memory, ROM) or random access memory (random access memory, RAM), magnetic disk or optical disk and other media that can store program codes.
  • the present disclosure discloses a vehicle identification method and device, an electronic device and a storage medium.
  • the method includes: acquiring a to-be-processed image containing a first vehicle to be identified; performing a first feature extraction process on the to-be-processed image to obtain first feature data including local feature information of the first vehicle to be identified; performing a second feature extraction process on the to-be-processed image to obtain second feature data including the global feature information of the first vehicle to be identified; performing fusion processing on the first feature data and the second feature data to obtain the and the third feature data of the first vehicle to be identified; the third feature data is used to obtain the identification result of the first vehicle to be identified.


Abstract

The present disclosure discloses a vehicle identification method and apparatus, an electronic device, and a storage medium. The method includes: acquiring a to-be-processed image containing a first vehicle to be identified; performing first feature extraction processing on the to-be-processed image to obtain first feature data including local feature information of the first vehicle to be identified; performing second feature extraction processing on the to-be-processed image to obtain second feature data including global feature information of the first vehicle to be identified; and performing fusion processing on the first feature data and the second feature data to obtain third feature data of the first vehicle to be identified, where the third feature data is used to obtain an identification result of the first vehicle to be identified.

Description

车辆识别方法及装置、电子设备及存储介质
相关申请的交叉引用
本公开基于申请号为202010947349.1、申请日为2020年09月10日的中国专利申请提出,并要求该中国专利申请的优先权,该中国专利申请的全部内容在此引入本公开作为参考。
技术领域
本公开涉及计算机视觉技术领域,尤其涉及一种车辆识别方法及装置、电子设备及存储介质。
背景技术
随着现代社会车辆数量越来越多,各种交通问题接踵而至。在相关技术中,车辆识别方法通过分别从两张图像中提取出车辆的特征,得到两个车辆特征数据,并通过对两个车辆特征数据进行比对,以确定两张图像中的车辆是否是同一辆车。但是这样提取得到的车辆特征数据所包括的信息的准确度不高。
发明内容
本公开提供一种车辆识别方法及装置、电子设备及存储介质。
第一方面,提供了一种车辆识别方法,所述方法包括:
获取包含第一待识别车辆的待处理图像;
对所述待处理图像进行第一特征提取处理,得到包括所述第一待识别车辆的局部特征信息的第一特征数据;
对所述待处理图像进行第二特征提取处理,得到包括所述第一待识别车辆的全局特征信息的第二特征数据;
对所述第一特征数据和所述第二特征数据进行融合处理,得到所述第一待识别车辆的第三特征数据;其中,所述第三特征数据用于获得所述第一待识别车辆的识别结果。
第二方面,提供了一种车辆识别装置,所述装置包括:
获取单元,配置为获取包含第一待识别车辆的待处理图像;
第一处理单元,配置为对所述待处理图像进行第一特征提取处理,得到包括所述第一待识别车辆的局部特征信息的第一特征数据;
第二处理单元,配置为对所述待处理图像进行第二特征提取处理,得到包括所述第一待识别车辆的全局特征信息的第二特征数据;
融合处理单元,配置为对所述第一特征数据和所述第二特征数据进行融合处理,得到所述第一待识别车辆的第三特征数据;所述第三特征数据用于获得所述第一待识别车辆的识别结果。
第三方面,提供了一种电子设备,包括:处理器和存储器,所述存储器用于存储计算机程序代码,所述计算机程序代码包括计算机指令,在所述处理器执行所述计算机指令的情况下,所述电子设备执行如上述第一方面及其任意一种可能实现的方式的方法。
第四方面,提供了一种电子设备,包括:处理器、发送装置、输入装置、输出装置和存储器,所述存储器用于存储计算机程序代码,所述计算机程序代码包括计算机指令,在所述处理器执行所述计算机指令的情况下,所述电子设备执行如上述第一方面及其任意一种可能实现的方式的方法。
第五方面,提供了一种计算机可读存储介质,所述计算机可读存储介质中存储有计算机程序,所述计算机程序包括程序指令,在所述程序指令被处理器执行的情况下,使所述处理器执行如上述第一方面及其任意一种可能实现的方式的方法。
第六方面,提供了一种计算机程序产品,所述计算机程序产品包括计算机程序或指令,在所述计算机程序或指令在计算机上运行的情况下,使得所述计算机执行上述第一方面及其任一种可能的实现方式的方法。
本公开实施例提供一种车辆识别方法及装置、电子设备及存储介质,对于获取的包含第一识别车辆的待处理图像,通过提取第一待识别车辆的局部特征信息的第一特征数据,以及提取第一待识别车辆的全局特征信息的第二特征数据,并将第一特征数据与第二特征数据进行融合,从而能够丰富第一待识别车辆的细节特征信息,进而基于这丰富的细节特征信息来确定第一待识别车辆的识别结果,能够提高识别结果的准确度。
应当理解的是,以上的一般描述和后文的细节描述仅是示例性和解释性的,而非限制本公开。
附图说明
为了更清楚地说明本公开实施例或背景技术中的技术方案,下面将对本公开实施例或背景技术中所需要使用的附图进行说明。
此处的附图被并入说明书中并构成本说明书的一部分,这些附图示出了符合本公开的实施例,并与说明书一起用于说明本公开的技术方案。
图1为本公开实施例提供的一种车辆识别方法的流程示意图;
图2为本公开实施例提供的一种关键点示意图;
图3为本公开实施例提供的一种局部像素点区域示意图;
图4为本公开实施例提供的一种车辆识别网络的结构示意图;
图5为本公开实施例提供的一种特征提取模块的结构示意图;
图6为本公开实施例提供的一种关键点和局部像素点区域生成模块的结构示意图;
图7为本公开实施例提供的一种联合训练模块的结构示意图;
图8为本公开实施例提供的一种第一演员-评论家的结构示意图;
图9为本公开实施例提供的一种第一打分子模块的结构示意图;
图10为本公开实施例提供的一种第二演员-评论家模块的结构示意图;
图11为本公开实施例提供的一种第二打分子模块的结构示意图;
图12为本公开实施例提供的一种车辆识别装置的结构示意图;
图13为本公开实施例提供的一种车辆识别装置的硬件结构示意图。
具体实施方式
为了使本技术领域的人员更好地理解本公开方案,下面将结合本公开实施例中的附图,对本公开实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例仅仅是本公开一部分实施例,而不是全部的实施例。基于本公开中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例,都属于本公开保护的范围。
本公开的说明书和权利要求书及上述附图中的术语“第一”、“第二”等是用于区别不同对象,而不是用于描述特定顺序。此外,术语“包括”和“具有”以及它们任何变形,意图在于覆盖不排他的包含。例如包含了一系列步骤或单元的过程、方法、系统、产品或设备没有限定于已列出的步骤或单元,而是可选地还包括没有列出的步骤或单元,或可选地还包括对于这些过程、方法、产品或设备固有的其他步骤或单元。
在本文中提及“实施例”意味着,结合实施例描述的特定特征、结构或特性可以包含在本公开的至少一个实施例中。在说明书中的各个位置出现该短语并不一定均是指相同的实施例,也不是与其它实施例互斥的独立的或备选的实施例。本领域技术人员显式地和隐式地理解的是,本文所描述的实施例可以与其它实施例相结合。
为了增强工作、生活或者社会环境中的安全性,在各个区域场所内均安装有监控设备。随着人们生活水平的提高,道路上的车辆越来越多,交通事故也越来越多,如何有效的通过监控设备采集的视频流确定车辆(下文将称为目标车辆)的行踪具有重要意义。例如,在追捕肇事逃逸车辆时,使用车辆识别方法对不同摄像头采集到的图像进行处理,可确定肇事逃逸车辆的行踪。
相关技术中,车辆识别方法通过提取出图像中的待确认车辆的整体外观特征信息得到待确认车辆特征,并将待确认车辆的车辆特征与包括目标车辆的整体外观特征信息的目标车辆特征进行比对,得到目标车辆与待确认车辆之间的相似度,其中,整体外观特征包括:车型、颜色。在相似度超过相似度阈值的情况下,确定待确认车辆与目标车辆为同一辆车。
由于仅通过整体外观特征信息判断两辆车是否为同一辆车会带来很大的误差,而通过目前的车辆识别方法从图像中提取出的车辆特征仅包括整体外观特征信息,目前的车辆识别方法的识别准确度低。基于此,本公开实施例提供了一种车辆识别方法,可丰富车辆特征所包括的信息。下面结合本公开实施例中的附图对本公开实施例进行描述。
本公开实施例的执行主体为车辆识别装置。可选的车辆识别装置可以是以下中的一种:手机、服务器、计算机、平板电脑、可穿戴设备。请参阅图1,图1是本公开实施例提供的一种车辆识别方法的流程示意图。
101、获取包含第一待识别车辆的待处理图像。
本公开实施例中,待处理图像包括第一待识别车辆。在一种获取待处理图像的实现方式中,车辆识别装置接收用户通过输入组件输入的待处理图像。上述输入组件包括:键盘、鼠标、触控屏、触控板和音频输入器等。
在另一种获取待处理图像的实现方式中,车辆识别装置接收数据终端发送的待处理图像。上述数据终端可以是以下任意一种:手机、计算机、平板电脑、服务器。
在又一种获取待处理图像的实现方式中,车辆识别装置接收监控摄像头发送的待处理图像。比如,该监控摄像头部署于道路(包括:高速公路、快速公路、城市公路)。
102、对上述待处理图像进行第一特征提取处理,得到包括上述第一待识别车辆的局部特征信息的第一特征数据。
本公开实施例中,局部特征信息包括车辆的细节特征信息,如:车灯的特征信息、车标的特征信息、车窗的特征信息。
车辆识别装置通过对待处理图像进行第一特征提取处理,可从待处理图像中提取出第一待识别车辆的局部特征信息,得到第一特征数据。
在一种可能实现的方式中,第一特征提取处理可通过第一卷积神经网络实现。通过将带有标注信息的图像作为训练数据,对卷积神经网络进行训练,使训练得到的第一卷积神经网络可完成对待处理图像的第一特征提取处理。训练数据的标注信息可以为图像中的车辆的细节特征信息(如车灯的类型、车标的类别、车窗的类别)。在使用训练数据对卷积神经网络进行训练的过程中,卷积神经网络从训练数据中提取出包括车辆的细节特征信息的特征数据,并依据提取出的特征数据得到车辆细节信息,作为训练结果。使用训练数据的标签监督训练结果可完成卷积神经网络的训练,得到第一卷积神经网络。这样,车辆识别装置可使用第一卷积神经网络对待处理图像进行处理,得到提取出第一待识别车辆的细节特征信息,得到第一特征数据。
在另一种可能实现的方式中,车辆识别装置使用第一卷积核对待处理图像进行卷积处理,提取出待处理图像的包含车辆的细节特征信息的语义信息,得到第一特征数据。
103、对上述待处理图像进行第二特征提取处理,得到包括上述第一待识别车辆的全局特征信息的第二特征数据。
本公开实施例中,车辆的全局特征信息包括车辆的整体外观特征信息。车辆识别装置通过对待处理图像进行第二特征提取处理,可从待处理图像中提取出第一待识别车辆的全局特征信息,得到第二特征数据。
在一种可能实现的方式中,第二特征提取处理可通过第二卷积神经网络实现。通过将带有标注信息的图像作为训练数据,对卷积神经网络进行训练,使训练得到的第二卷积神经网络可完成对待处理图像的第二特征提取处理。训练数据的标注信息可以为图像中的车辆的整体外观特征信息(如车型、车身颜色)。在使用训练数据对卷积神经网络进行训练的过程中,卷积神经网络从训练数据中提取出包括车辆的整体外观特征信息的特征数据,并依据提取出的特征数据得到车辆整体外观信息,作为训练结果。使用训练数据的标签监督训练结果可完成卷积神经网络的训练,得到第二卷积神经网络。这样,车辆识别装置可使用第二卷积神经网络对待处理图像进行处理,得到提取出第一待识别车辆的整体外观特征信息,得到第二特征数据。
在另一种可能实现的方式中,车辆识别装置使用第二卷积核对待处理图像进行卷积处理,提取出待处理图像的包含车辆的整体外观特征信息的语义信息,得到第二特征数据。其中,第一卷积核的参数与第二卷积核的参数不同。
104、对上述第一特征数据和上述第二特征数据进行融合处理,得到上述第一待识别车辆的第三特征数据。
本公开实施例中,第三特征数据用于获得第一待识别车辆的识别结果,其中,识别结果包括第一待识别车辆的身份。例如,车辆识别装置可进一步依据第三特征数据,确定待识别车辆为车辆a。又例如,车辆识别装置将第三特征数据与车辆特征数据库中的特征数据进行比对,确定车辆特征数据库中的目标车辆特征数据与第三特征数据之间的相似度超过相似度阈值。再基于目标车辆特征数据所对应的车辆为车辆b,车辆识别装置可确定第三特征数据所对应的车辆为车辆b,即依据第三特征数据确定的第一待识别车辆的识别结果为车辆b。
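To make the comparison step concrete, below is a minimal sketch of matching the third feature data against a vehicle feature database using cosine similarity; the function names, the feature dimension and the threshold value are illustrative assumptions, not details given by the disclosure.

```python
import numpy as np

def match_vehicle(query_feat, feature_db, threshold=0.85):
    """Compare a query feature vector against a database of vehicle features.

    query_feat: (D,) feature of the vehicle to be identified.
    feature_db: dict mapping vehicle id -> (D,) feature vector.
    Returns the best-matching vehicle id, or None if no similarity
    exceeds the threshold (the threshold value is illustrative).
    """
    q = query_feat / np.linalg.norm(query_feat)
    best_id, best_sim = None, threshold
    for vid, feat in feature_db.items():
        f = feat / np.linalg.norm(feat)
        sim = float(q @ f)  # cosine similarity
        if sim > best_sim:
            best_id, best_sim = vid, sim
    return best_id

# toy usage
db = {"vehicle_a": np.random.rand(256), "vehicle_b": np.random.rand(256)}
print(match_vehicle(np.random.rand(256), db))
```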
车辆识别装置通过对第一特征数据和第二特征数据进行融合处理,可得到既包括第一待识别车辆的全局特征信息又包括第一待识别车辆的局部特征信息的第三特征数据。将第三特征数据作为第一待识别车辆的特征数据,可丰富第一待识别车辆的特征数据所包括的信息。
作为一种可选的实施方式,上述局部特征信息包括关键点特征信息。关键点特征信息包括关键点在待处理图像中的位置、关键点的语义信息。例如,图2所示的关键点6为左前轮胎关键点,关键点6的语义信息包括左前轮胎的信息(如轮胎规格、轮毂尺寸、轮胎品牌)。图2所示的关键点23为后车牌关键点,关键点23的语义信息包括后车牌的信息(如车牌号)。
本公开实施例中,车辆的关键点的标注方式如图2所示。应理解,图2所示的车型仅为示例,在实际应用中,可依据图2所示的关键点标注方式对任意车型(如泥头车、公交车或卡车)的车辆进行标注。
在一种可能实现的方式中,车辆识别装置通过对待处理图像进行第一特征提取处理,得到包括第一待识别车辆的关键点特征信息的第一特征数据。以图2所示的关键点为例,第一特征数据可包括待识别车辆的左前轮胎关键点的特征信息和后车牌关键点的特征信息。
作为一种可能实现的实施方式,局部特征信息不仅包括关键点特征信息还包括局部像素点区域特征信息。本公开实施例中,局部像素点区域属于第一待识别车辆所覆盖的像素点区域,且局部像素点区域的面积小于第一待识别车辆所覆盖的像素点区域的面积。例如,在图3中,右侧局部像素点区域301包含第一待识别车辆300的右侧区域,车头像素点区域302包含第一待识别车辆的车头区域。
局部像素点区域特征信息包括局部像素点区域的语义信息。例如,在局部像素点区域包括前车灯所覆盖的像素点区域的情况下,局部像素点区域的语义信息包括:前车灯的型号;在局部像素点区域包括车窗所覆盖的像素点区域的情况下,局部像素点区域的语义信息包括:车窗的类别、透过车窗所能观察到的车内的物件;在局部像素点区域包括前挡风玻璃所覆盖的像素点区域的情况下,局部像素点区域的语义信息包括:前挡风玻璃的类别、透过前挡风玻璃所能观察到的车内的物件、前挡风玻璃上的年检标、年检标在前挡风玻璃上的位置。
在局部特征信息还包括局部像素点区域特征信息的情况下,车辆识别装置在执行步骤102的过程中执行以下步骤:
1、对上述待处理图像进行第三特征提取处理,得到第四特征数据。
本公开实施例中,第四特征数据包括第一待识别车辆的至少一个关键点的特征信息。车辆识别装置通过对待处理图像进行第三特征提取处理,可从待处理图像中提取出第一待识别车辆的至少一个关键点的特征信息,得到第四特征数据。
在一种可能实现的方式中,第三特征提取处理可通过第三卷积神经网络实现。通过将带有标注信息的图像作为训练数据,对卷积神经网络进行训练,使训练得到的第三卷积神经网络可完成对待处理图像的第三特征提取处理。训练数据的标注信息可以为图像中的车辆的关键点特征信息(如关键点的位置、关键点的语义信息)。在使用训练数据对卷积神经网络进行训练的过程中,卷积神经网络从训练数据中提取出包括车辆的关键点特征信息的特征数据,并依据提取出的特征数据得到关键点特征信息,作为训练结果。使用训练数据的标签监督训练结果可完成卷积神经网络的训练,得到第三卷积神经网络。这样,车辆识别装置可使用第三卷积神经网络对待处理图像进行处理,得到提取出第一待识别车辆的关键点特征信息,得到第四特征数据。
在另一种可能实现的方式中,车辆识别装置使用第三卷积核对待处理图像进行卷积处理,提取出待处理图像的包含车辆的关键点特征信息的语义信息,得到第四特征数据。其中,第三卷积核的参数与第一卷积核的参数不同,第三卷积核的参数与第二卷积核的参数也不同。
2、对上述待处理图像进行第四特征提取处理,得到第五特征数据。
本公开实施例中,第五特征数据包括第一待识别车辆的至少一个局部像素点区域的特征信息。
在一种可能实现的方式中,第四特征提取处理可通过第四卷积神经网络实现。通过将带有标注信息的图像作为训练数据,对卷积神经网络进行训练,使训练得到的第四卷积神经网络可完成对待处理图像的第四特征提取处理。训练数据的标注信息可以为图像中的车辆的局部像素点区域的特征信息。在使用训练数据对卷积神经网络进行训练的过程中,卷积神经网络从训练数据中提取出包括车辆的局部像素点区域的特征信息的特征数据,并依据提取出的特征数据得到局部像素点区域的特征信息,作为训练结果。使用训练数据的标签监督训练结果可完成卷积神经网络的训练,得到第四卷积神经网络。这样,车辆识别装置可使用第四卷积神经网络对待处理图像进行处理,得到提取出第一待识别车辆的局部像素点区域的特征信息,得到第五特征数据。
在另一种可能实现的方式中,车辆识别装置使用第四卷积核对待处理图像进行卷积处理,提取出待处理图像的第一待识别车辆的局部像素点区域的特征信息,得到第五特征数据。其中,第四卷积核的参数与第一卷积核的参数、第二卷积核的参数、第三卷积核的参数均不同。
3、对上述第四特征数据和第五特征数据进行融合处理,得到上述第一特征数据。
由于局部像素点区域的特征信息包含局部像素点区域的语义信息,而在图像中相邻像素点之间存在相关性(此处的相关性包括语义相关性),通过将局部像素点区域的语义信息与关键点特征信息融合,可丰富车辆的细节特征信息。
车辆识别装置通过对第四特征数据和第五特征数据进行融合处理,将第一待识别车辆的关键点特征信息与第一待识别车辆的局部像素点区域的特征信息融合,丰富第一待识别车辆的细节特征信息,得到第一特征数据。
作为一种可选的实施方式,车辆识别装置在执行步骤1的过程中执行以下步骤:
4、对上述待处理图像进行第五特征提取处理,得到至少一个第六特征数据。
本公开实施例中,第六特征数据包括第一待识别车辆的关键点特征信息,且任意两个第六特征数据所包括的特征信息属于不同的关键点。例如,第一待识别车辆包含左后视镜关键点和右尾灯关键点。至少一个第六特征数据包括:特征数据1和特征数据2,其中,特征数据1包括左后视镜关键点的特征信息,特征数据2包括右尾灯关键点的特征信息。
在一种可能实现的方式中,车辆识别装置通过对待处理图像进行第五特征提取处理,提取出第一待识别车辆的关键点特征信息,得到通道数不小于1的第一中间特征数据,其中,第一中间特征数据中每个通道的数据均包括第一待识别车辆的关键点特征信息,且任意两个通道的数据所包括的信息属于不同的关键点。车辆识别装置可将第一中间特征数据中一个通道数据作为一个第六特征数据。
5、从上述至少一个第六特征数据中选取包括信息量最多的k个特征数据,得到k个第七特征数据。
由于不同的第六特征数据所包括的信息量不同,为减小后续处理的数据处理量,车辆识别装置可从至少一个第六特征数据中选取包括信息量最多的k个特征数据(即k个第七特征数据)用于后续处理,其中,k为不小于1的整数。
6、依据上述k个第七特征数据得到上述第四特征数据。
在k=1的情况下,通过执行步骤5可得到1个第七特征数据,此时,车辆识别装置可将第七特征数据作为第四特征数据,即第四特征数据中包括一个关键点的特征信息。
在k大于1的情况下,通过执行步骤5可得到至少两个第七特征数据,此时,车辆识别装置可对至少两个第七特征数据进行融合处理,得到第四特征数据。
例如,至少两个第七特征数据包括:第七特征数据1、第七特征数据2、第七特征数据3,其中,第七特征数据1包括左前车灯关键点的特征信息,第七特征数据2包括左后车灯关键点的特征信息,第七特征数据3包括左后视镜关键点的特征信息。车辆识别装置可通过对第七特征数据1和第七特征数据2进行融合处理,可得到第四特征数据。此时第四特征数据包括左前车灯关键点的特征信息和左后车灯关键点的特征信息。车辆识别装置也可通过对第七特征数据1、第七特征数据2和第七特征数据3进行融合处理,可得到第四特征数据。此时第四特征数据包括左前车灯关键点的特征信息、左后车灯关键点的特征信息和左后视镜关键点的特征信息。
作为一种可能实现的实施方式,车辆识别装置在执行步骤4的过程中执行以下步骤:
7、对上述待处理图像进行第六特征提取处理,得到至少一张第一热力图。
本公开实施例中,第一热力图包括关键点在待处理图像中的位置信息,且任意两张第一热力图所包括的信息属于不同的关键点。例如,第一待识别车辆的关键点包括左后视镜关键点和右尾灯关键点。至少一张第一热力图包括:第一热力图1和第一热力图2,其中,第一热力图1包括左后视镜关键点在待处理图像中的位置信息,第一热力图2包括右尾灯关键点在待处理图像中的位置信息。
将两张图像中处于相同位置的像素点称为互为同位点。例如,像素点A在第一热力图1中的位置与像素点B在待处理图像中的位置相同,则像素点A为第一热力图1中与像素点B互为同位点的像素点,像素点B为待处理图像中与像素点A互为同位点的像素点。
在一种可能实现的方式中,第一热力图的尺寸与待处理图像的尺寸相同。第一热力图中像素点的像素值表征,待处理图像中与该像素点互为同位点的像素点所在位置存在关键点的置信度。例如,第一热力图1中的像素点A与待处理图像中的像素点B互为同位点。若第一热力图1包括左前车灯关键点在待处理图像中的位置信息、像素点A的像素值为0.7,则在像素点B处存在左前车灯的置信度为0.7。
本公开实施例中,第六特征提取处理可以是卷积处理,也可以是池化处理,还可以是卷积处理和池化处理的结合,本公开对此不做限定。
在一种可能实现的方式中,第六特征提取处理可通过第五卷积神经网络实现。通过将带有标注信息的图像作为训练数据,对卷积神经网络进行训练,使训练得到的第五卷积神经网络可完成对待处理图像的第六特征的提取处理。训练数据的标注信息可以为关键点在图像中的位置。在使用训练数据对卷积神经网络进行训练的过程中,卷积神经网络从训练数据中提取出包括关键点的位置信息的特征数据,并依据提取出的特征数据得到图像中关键点的位置,作为训练结果。使用训练数据的标签监督训练结果可完成卷积神经网络的训练,得到第五卷积神经网络。这样,车辆识别装置可使用第五卷积神经网络对待处理图像进行处理,得到提取出第一待识别车辆的关键点的位置信息,得到第一热力图。
8、对上述待处理图像进行第七特征提取处理,得到上述待处理图像的第一特征图像。
待处理图像中每个像素点均包括语义信息,而语义信息内包含关键点特征信息,通过对待处理图像进行第七特征提取处理,可提取出每个像素点包括关键点特征信息,得到第一特征图像。
应理解,第一特征图像不仅包括像素点的关键点特征信息,还包括像素点之间的相对位置信息。而第四特征数据所包括的信息中不包含像素点之间的相对位置信息。
9、分别确定每张上述第一热力图与上述第一特征图像之间的点积,得到上述至少一个第六特征数据。
将第一热力图所包括的位置信息所属的关键点称为第一热力图的关键点,例如,第一热力图1包括左前车灯关键点 的位置信息,即第一热力图1所包括的信息属于左前车灯关键点,此时,第一热力图1的关键点为左前车灯关键点。
本公开实施例中,待处理图像的尺寸、第一热力图的尺寸、第一特征图像的尺寸均相同。例如,待处理图像的长为50、宽为30,则第一热力图的长和第一特征图像的长均为50、第一热力图的宽和第一特征图像的宽均为30。
本公开实施例中,点积指智能乘积(element-wise)。通过确定第一特征图像与第一热力图之间的点积,可从第一特征图像中提取出第一热力图的关键点的特征信息,得到第六特征数据。
在一些实施例中,在确定第一特征图像与第一热力图之间的点积之前,车辆识别装置可对第一热力图中的像素值进行归一化处理,得到归一化后的第一热力图,例如,将不小于0.6的像素值调整为1,将小于0.6的像素值调整为0.3。车辆识别装置通过确定归一化后的第一热力图与第一特征图像之间的点积,可提取出第一热力图的关键点的特征信息,得到第六特征数据。
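A minimal sketch of this heatmap-weighted extraction, assuming PyTorch tensors: pixel values not less than 0.6 are set to 1 and the rest to 0.3 as described above, and each normalized heatmap is multiplied element-wise with the feature image to yield one sixth-feature tensor per keypoint.

```python
import torch

def keypoint_features(heatmaps, feature_map):
    """heatmaps: (K, H, W), one per keypoint; feature_map: (C, H, W).

    Normalize each heatmap (>= 0.6 -> 1, otherwise -> 0.3), then take the
    element-wise product with the feature image.
    """
    norm = torch.where(heatmaps >= 0.6,
                       torch.ones_like(heatmaps),
                       torch.full_like(heatmaps, 0.3))
    # broadcast (K,1,H,W) * (1,C,H,W) -> (K,C,H,W): one sixth-feature per keypoint
    return norm.unsqueeze(1) * feature_map.unsqueeze(0)

feats = keypoint_features(torch.rand(23, 16, 16), torch.rand(64, 16, 16))
print(feats.shape)  # torch.Size([23, 64, 16, 16])
```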
作为一种实施方式,车辆识别装置在执行步骤5的过程中执行以下步骤:
10、对上述至少一个第六特征数据中的特征数据分别进行池化处理,得到至少一个第八特征数据。
车辆识别装置通过对一个第六特征数据进行池化处理,可减小第六特征数据中的数据量,得到一个第八特征数据。这样,在后续处理中对第八特征数据进行处理,可减小车辆识别装置的数据处理量。
车辆识别装置通过对至少一个第六特征数据中的特征数据分别进行池化处理,得到至少一个第八特征数据。例如,至少一个第六特征数据包括:第六特征数据1、第六特征数据2、第六特征数据3。车辆识别装置通过对第六特征数据1进行池化处理得到第八特征数据1、通过对第六特征数据2进行池化处理得到第八特征数据2,此时,至少一个第八特征数据包括第八特征数据1和第八特征数据2。车辆识别装置通过对第六特征数据1进行池化处理得到第八特征数据1、通过对第六特征数据2进行池化处理得到第八特征数据2、通过对第六特征数据3进行池化处理得到第八特征数据3,此时,至少一个第八特征数据包括第八特征数据1、第八特征数据2、第八特征数据3。
在一些实施例中,步骤10中的池化处理为全局平均池化处理。
11、依据上述至少一个第八特征数据所包括的信息量,得到至少一个第一概率。
本公开实施例中,第一概率用于表征与第一概率所对应的第六特征数据所包括的信息量。例如(例1),至少一个第八特征数据包括第八特征数据1,至少一个第一概率包括第一概率1,且第一概率1是依据第八特征数据1所包括的信息量得到的,第八特征数据1通过对第六特征数据1进行池化处理得到。则第一概率1用于表征第六特征数据1所包括的信息量。
在一些实施例中,第一概率与第六特征数据所包括的信息量之间具有相关性。例如,在第一概率与第六特征数据所包括的信息量呈正相关的情况下,在例1中,第一概率1越大表征第六特征数据1所包括的信息量越大;在第一概率与第六特征数据所包括的信息量呈负相关的情况下,在例1中,第一概率1越大表征第六特征数据1所包括的信息量越小。
由于第八特征数据依据第六特征数据得到,第八特征数据所包括的信息量与第六特征数据所包括的信息量呈正相关。因此,车辆识别装置可依据第八特征数据所包括的信息量,得到第一概率。
在一种可能实现的方式中,车辆识别装置将第八特征数据输入至softmax函数,可得到第一概率。
车辆识别装置依据一个第八特征数据所包括的信息量可得到一个第一概率,依据至少一个第八特征数据所包括的信息量可得到至少一个第一概率。例如,至少一个第八特征数据包括第八特征数据1和第八特征数据2。车辆识别装置依据第八特征数据1所包括的信息量得到第一概率1,此时,至少一个第一概率包括第一概率1。车辆识别装置依据第八特征数据1所包括的信息量得到第一概率1、依据第八特征数据2所包括的信息量得到第一概率2,此时,至少一个第一概率包括第一概率1和第一概率2。
在第一概率与第六特征数据所包括的信息量呈正相关的情况下,车辆识别装置执行步骤12;在第一概率与第六特征数据所包括的信息量呈负相关的情况下,车辆识别装置执行步骤13。
12、选取最大的k个第一概率所对应的上述第六特征数据,作为上述k个第七特征数据。
13、选取最小的k个第一概率所对应的上述第六特征数据,作为上述k个第七特征数据。
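A sketch of steps 10-13 for the positively-correlated case: global average pooling produces the eighth feature data, a softmax over a pooled summary yields the first probabilities, and the k highest-scoring candidates are kept. Reducing the pooled vector with a plain sum is our assumption; figure 9 routes it through a fully connected layer instead.

```python
import torch

def select_top_k(sixth_feats, k=4):
    """sixth_feats: (N, C, H, W) candidate keypoint features."""
    pooled = sixth_feats.mean(dim=(2, 3))             # (N, C) eighth feature data
    scores = torch.softmax(pooled.sum(dim=1), dim=0)  # (N,) first probabilities
    top_idx = scores.topk(k).indices                  # step 12: largest k
    return sixth_feats[top_idx], scores

selected, probs = select_top_k(torch.rand(23, 64, 16, 16), k=4)
print(selected.shape, probs.shape)  # (4, 64, 16, 16), (23,)
```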
作为一种可选的实施方式,由于一个第七特征数据包括一个关键点的特征信息,在至少一个关键点中关键点的数量超过1的情况下,第七特征数据的数量超过1。而不同的第七特征数据所包括的信息量不同。为提升第一待识别车辆的关键点特征信息的准确度,车辆识别装置可依据第七特征数据所包括的信息量分别确定每个第七特征数据的权重,并依据第七特征数据的权重对至少一个第七特征数据进行加权融合,得到第四特征数据。
作为一种可选的实施方式,车辆识别装置在执行步骤2的过程中执行以下步骤:
14、对上述待处理图像进行第十特征提取处理,得到至少一个第九特征数据。
本公开实施例中,第九特征数据包括第一待识别车辆的局部像素点区域的特征信息,且任意两个第九特征数据所包括的特征信息属于不同的局部像素点区域。例如,第一待识别车辆包含局部像素点区域1和局部像素点区域2,其中,局部像素点区域1包括前挡风玻璃所覆盖的像素点区域,局部像素点区域2包括左侧玻璃所覆盖的像素点区域。至少一个第九特征数据包括:特征数据1和特征数据2,其中,特征数据1包括局部像素点区域1的特征信息,特征数据2包括局部像素点区域2的特征信息。
在一种可能实现的方式中,车辆识别装置通过对待处理图像进行第十特征提取处理,提取出第一待识别车辆的局部像素点区域的特征信息,得到通道数不小于1的第四中间特征数据,其中,第四中间特征数据中每个通道的数据均包括第一待识别车辆的局部像素点区域的特征信息,且任意两个通道的数据所包括的信息属于不同的局部像素点区域。车辆识别装置可将第四中间特征数据中的一个通道数据作为一个第九特征数据。
15、从上述至少两个第九特征数据中选取包含信息量最多的m个特征数据,得到m个第十特征数据。
由于不同的第九特征数据所包括的信息量不同,为减小后续处理的数据处理量,车辆识别装置可从至少一个第九特征数据中选取包括信息量最多的m个特征数据(即m个第十特征数据)用于后续处理,其中,m为不小于1的整数。
16、依据上述m个第十特征数据得到上述第五特征数据。
在m=1的情况下,通过执行步骤15可得到1个第十特征数据,此时,车辆识别装置可将第十特征数据作为第五特征数据,即第五特征数据中包括一个局部像素点区域的特征信息。
在m大于1的情况下,通过执行步骤15可得到至少两个第十特征数据,此时,车辆识别装置可对至少两个第十特征数据进行融合处理,得到第五特征数据。
例如,至少两个第十特征数据包括:第十特征数据1、第十特征数据2、第十特征数据3,其中,第十特征数据1包括车头所覆盖的像素点区域的特征信息,第十特征数据2包括右前挡风玻璃所覆盖的像素点区域的特征信息,第十特征数据3包括左轮胎所覆盖的像素点区域的特征信息。车辆识别装置可通过对第十特征数据1和第十特征数据2进行融合处理,可得到第五特征数据。此时第五特征数据包括车头所覆盖的像素点区域的特征信息和右前挡风玻璃所覆盖的像素点区域的特征信息。车辆识别装置也可通过对第十特征数据1、第十特征数据2和第十特征数据3进行融合处理,可得到第五特征数据。此时第五特征数据包括车头所覆盖的像素点区域的特征信息、右前挡风玻璃所覆盖的像素点区域的特征信息和左轮胎所覆盖的像素点区域的特征信息。
作为一种实施方式,车辆识别装置在执行步骤14的过程中执行以下步骤:
17、对上述待处理图像进行第十一特征提取处理,得到上述至少一张第二热力图。
本公开实施例中,第二热力图包括局部像素点区域在待处理图像中的位置信息,且任意两张第二热力图所包括的信息属于不同的局部像素点区域。例如,第一待识别车辆的局部像素点区域包括前挡风玻璃区域和车头区域。至少一张第二热力图包括:第二热力图1和第二热力图2,其中,第二热力图1包括前挡风玻璃区域在待处理图像中的位置信息,第二热力图2包括车头区域在待处理图像中的位置信息。
将两张图像中处于相同位置的像素点称为互为同位点。例如,像素点A在第二热力图1中的位置与像素点B在待处理图像中的位置相同,则像素点A为第二热力图1中与像素点B互为同位点的像素点,像素点B为待处理图像中与像素点A互为同位点的像素点。
在一种可能实现的方式中,第二热力图的尺寸与待处理图像的尺寸相同。第二热力图中像素点的像素值表征,待处理图像中与该像素点互为同位点的像素点所在位置属于局部像素点区域的置信度。例如,第二热力图1中的像素点A与待处理图像中的像素点B互为同位点。若第二热力图1包括车头区域在待处理图像中的位置信息、像素点A的像素值为0.7,则像素点B属于车头区域的置信度为0.7。
本公开实施例中,第十一特征提取处理可以是卷积处理,也可以是池化处理,还可以是卷积处理和池化处理的结合,本公开对此不做限定。
在一种可能实现的方式中,第十一特征提取处理可通过第六卷积神经网络实现。通过将带有标注信息的图像作为训练数据,对卷积神经网络进行训练,使训练得到的第六卷积神经网络可完成对待处理图像的第十一特征提取处理。训练数据的标注信息可以为局部像素点区域在图像中的位置。在使用训练数据对卷积神经网络进行训练的过程中,卷积神经网络从训练数据中提取出包括局部像素点区域的位置信息的特征数据,并依据提取出的特征数据得到图像中局部像素点区域的位置,作为训练结果。使用训练数据的标签监督训练结果可完成卷积神经网络的训练,得到第六卷积神经网络。这样,车辆识别装置可使用第六卷积神经网络对待处理图像进行处理,提取出第一待识别车辆的局部像素点区域的位置信息,得到第二热力图。
18、对上述待处理图像进行第十二特征提取处理,得到上述待处理图像的第二特征图像。
待处理图像中每个像素点均包括语义信息,通过对待处理图像进行第十二特征提取处理,可提取出每个像素点的语义信息,得到第二特征图像。
应理解,第二特征图像不仅包括像素点的语义信息,还包括像素点之间的相对位置信息。而第五特征数据所包括的信息中不包含像素点之间的相对位置信息。
在一些实施例中,第一特征图像与第二特征图像可以相同,此时,第一特征图像和第二特征图像均包括待处理图像中每个像素点的语义信息。
19、分别确定每张上述第二热力图与上述第二特征图像之间的点积,得到上述至少一个第九特征数据。
将第二热力图所包括的位置信息所属的局部像素点区域称为第二热力图的局部像素点区域,例如,第二热力图1包括前挡风玻璃区域的位置信息,即第二热力图1所包括的信息属于前挡风玻璃区域,此时,第二热力图1的局部像素点区域为前挡风区域。
本公开实施例中,待处理图像的尺寸、第二热力图的尺寸、第二特征图像的尺寸均相同。例如,待处理图像的长为50、宽为30,则第二热力图的长和第二特征图像的长均为50、第二热力图的宽和第二特征图像的宽均为30。
通过确定第二特征图像与第二热力图之间的点积,可从第二特征图像中提取出的第二热力图的局部像素点区域的特征信息,得到第九特征数据。
在一些实施例中,在确定第二特征图像与第二热力图之间的点积之前,车辆识别装置可对第二热力图中的像素值进行归一化处理,得到归一化后的第二热力图,例如,将超过0.7的像素值调整为1,将未超过0.7的像素值调整为0。车辆识别装置通过确定归一化后的第二热力图与第二特征图像之间的点积,可提取出第二热力图的局部像素点区域的特征信息,得到第九特征数据。
作为一种可选的实施方式,车辆识别装置在执行步骤15的过程中执行以下步骤:
20、对上述第九特征数据中的特征数据分别进行池化处理,得到至少一个第十一特征数据。
车辆识别装置通过对一个第九特征数据进行池化处理,可减小第九特征数据中的数据量,得到一个第十一特征数据。这样,在后续处理中对第十一特征数据进行处理,可减小车辆识别装置的数据处理量。
车辆识别装置通过对至少一个第九特征数据中的特征数据分别进行池化处理,得到至少一个第十一特征数据。例如,至少一个第九特征数据包括:第九特征数据1、第九特征数据2、第九特征数据3。车辆识别装置通过对第九特征数据1进行池化处理得到第十一特征数据1、通过对第九特征数据2进行池化处理得到第十一特征数据2,此时,至少一个第十一特征数据包括第十一特征数据1和第十一特征数据2。车辆识别装置通过对第九特征数据1进行池化处理得到第十一特征数据1、通过对第九特征数据2进行池化处理得到第十一特征数据2、通过对第九特征数据3进行池化处理得到第 十一特征数据3,此时,至少一个第十一特征数据包括第十一特征数据1、第十一特征数据2、第十一特征数据3。
步骤20中的池化处理为全局平均池化处理。
21、依据上述至少一个第十一特征数据所包括的信息量,得到至少一个第二概率。
本公开实施例中,第二概率用于表征与第二概率所对应的第九特征数据所包括的信息量。例如(例2),至少一个第十一特征数据包括第十一特征数据1,至少一个第二概率包括第二概率1,且第二概率1依据第十一特征数据1所包括的信息量得到,第十一特征数据1通过对第九特征数据1进行池化处理得到。即第二概率1用于表征第九特征数据1所包括的信息量。
第二概率与第九特征数据所包括的信息量之间具有相关性。例如,在第二概率与第九特征数据所包括的信息量呈正相关的情况下,在例2中,第二概率1越大表征第九特征数据1所包括的信息量越大;在第二概率与第九特征数据所包括的信息量呈负相关的情况下,在例2中,第二概率1越大表征第九特征数据1所包括的信息量越小。
由于第十一特征数据依据第九特征数据得到,第十一特征数据所包括的信息量与第九特征数据所包括的信息量呈正相关。因此,车辆识别装置可依据第十一特征数据所包括的信息量,得到第二概率。
在一种可能实现的方式中,车辆识别装置将第十一特征数据输入至softmax函数,可得到第二概率。
车辆识别装置依据一个第十一特征数据所包括的信息量可得到一个第二概率,依据至少一个第十一特征数据所包括的信息量可得到至少一个第二概率。例如,至少一个第十一特征数据包括第十一特征数据1和第十一特征数据2。车辆识别装置依据第十一特征数据1所包括的信息量得到第二概率1,此时,至少一个第二概率包括第二概率1。车辆识别装置依据第十一特征数据1所包括的信息量得到第二概率1、依据第十一特征数据2所包括的信息量得到第二概率2,此时,至少一个第二概率包括第二概率1和第二概率2。
在第二概率与第九特征数据所包括的信息量呈正相关的情况下,车辆识别装置执行步骤22;在第二概率与第九特征数据所包括的信息量呈负相关的情况下,车辆识别装置执行步骤23。
22、选取最大的m个第二概率所对应的上述第九特征数据,作为上述m个第十特征数据。
23、选取最小的m个第二概率所对应的上述第九特征数据,作为上述m个第十特征数据。
作为一种实施方式,由于一个第十特征数据包括一个局部像素点区域的特征信息,在至少一个局部像素点区域中局部像素点区域的数量超过1的情况下,第十特征数据的数量超过1。而不同的第十特征数据所包括的信息量不同。为提升第一待识别车辆的局部像素点区域特征信息的准确度,车辆识别装置可依据第十特征数据所包括的信息量分别确定每个第十特征数据的权重,并依据第十特征数据的权重对至少一个第十特征数据进行加权融合,得到第五特征数据。
在一种可能实现的方式中,至少一个局部像素点区域包括第一局部像素点区域和第二局部像素点区域,第九特征数据的数量和m均大于1。车辆识别装置从至少两个第九特征数据中选取包括信息量最多的m个特征数据,得到包括第一局部像素点区域的特征信息的第十二特征数据和包括第二局部像素点区域的特征信息的第十三特征数据。车辆识别装置在执行步骤16的过程中执行以下步骤:
24、依据上述第十二特征数据所包括的信息量得到第一权重,依据上述第十三特征数据所包括的信息量得到第二权重。
本公开实施例中,第一权重与第十二特征数据所包括的信息量呈正相关,第二权重与第十三特征数据所包括的信息量呈正相关。
25、依据上述第一权重和上述第二权重,对上述第十二特征数据和上述第十三特征数据进行加权融合,得到上述第五特征数据。
车辆识别装置依据第一权重和第二权重对第十二特征数据和第十三特征数据进行加权融合,得到包括第一待识别车辆的局部像素点区域特征信息的第五特征数据,可提升第一待识别车辆的局部像素点区域特征信息的准确度。
在一种可能实现的方式中,车辆识别装置依据第一权重和第二权重,对第十二特征数据与第十三特征数据进行加权求和得到第五特征数据。例如,假设第一权重为ω_3,第二权重为ω_4,第十二特征数据为n_4,第十三特征数据为n_5,第五特征数据为n_6,则ω_3、ω_4、n_4、n_5、n_6满足下式:n_6 = ω_3×n_4 + ω_4×n_5 + d,其中,d为实数。在一些实施例中,d=0。
在另一种可能实现的方式中,车辆识别装置将第一权重与第十二特征数据相乘得到第五中间特征数据、将第二权重与第十三特征数据相乘得到第六中间特征数据,对第五中间特征数据与第六中间特征数据进行融合处理得到第五特征数据。
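The weighted fusion of the two preceding paragraphs reduces to a one-line computation; the weight values below are placeholders.

```python
import torch

def weighted_fusion(n4, n5, w3, w4, d=0.0):
    """Implements n_6 = w_3*n_4 + w_4*n_5 + d from the formula above
    (d = 0 in the default case); w_3 and w_4 would in practice be
    derived from the information content of each feature."""
    return w3 * n4 + w4 * n5 + d

n6 = weighted_fusion(torch.rand(64), torch.rand(64), w3=0.7, w4=0.3)
print(n6.shape)  # torch.Size([64])
```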
本公开实施例还提供了一种车辆识别网络,可配置为实现前文所公开的技术方案。请参阅图4,图4为本公开实施例提供的一种车辆识别网络的结构示意图。如图4所示,车辆识别网络包括:特征提取模块401、关键点和局部像素点区域生成模块402、联合训练模块403。经特征提取模块401对待处理图像400进行处理,得到待处理图像的第三特征图像404。经关键点和局部像素点区域生成模块对待处理图像进行处理,得到至少一张第一热力图和至少一张第二热力图405。将第三特征图、至少一张第一热力图和至少一张第二热力图像输入至联合训练模块,得到第三特征数据406。
具体的,图5所示为特征提取模块的结构示意图。如图5所示,特征提取模块包括三层依次串联的卷积层。特征提取模块中,第一层卷积层501为ResNet50中的conv2_x,第二层卷积层502为ResNet50中的conv3_x,第三层卷积层503为ResNet50中的conv4_x。对于待处理图像500通过这三层卷积层进行特征提取,得到第三特征图像504。
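A sketch of this three-stage extractor using torchvision's ResNet-50 blocks, assuming the usual mapping conv2_x→layer1, conv3_x→layer2, conv4_x→layer3 and a recent torchvision version; the stem is kept so a raw image can be fed in.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

class FeatureExtractor(nn.Module):
    """Three serial conv stages as in figure 5, mapped to torchvision's
    resnet50 blocks (conv2_x -> layer1, conv3_x -> layer2, conv4_x -> layer3)."""
    def __init__(self):
        super().__init__()
        r = resnet50(weights=None)
        self.stem = nn.Sequential(r.conv1, r.bn1, r.relu, r.maxpool)
        self.stages = nn.Sequential(r.layer1, r.layer2, r.layer3)

    def forward(self, x):
        return self.stages(self.stem(x))

fmap = FeatureExtractor()(torch.rand(1, 3, 224, 224))
print(fmap.shape)  # torch.Size([1, 1024, 14, 14])
```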
图6所示为关键点和局部像素点区域生成模块的结构示意图。如图6所示,关键点和局部像素点区域生成模块包括四层依次串联的卷积层。在一些实施例中,在关键点和局部像素点区域生成模块中,第一层卷积层601为ResNet50中的conv2_x,第二层卷积层602为ResNet50中的conv3_x,第三层卷积层603为ResNet50中的conv4_x,第四层卷积层604为ResNet50中的conv5_x。将待处理图像600,通过这四层卷积层进行处理,得到至少一张第一热力图和至少一张第二热力图605。
图7所示为联合训练模块的结构示意图。如图7所示,经联合训练模块的第一层卷积层701对第三特征图像700进行 处理,得到第一通用特征图像。经第一降维层702对第一通用特征图像进行通道维度上的降维得到第一特征图像。经第一演员-评论家模块703对第一特征图像和至少一张第一热力图704进行处理,得到k个第一评论家特征数据705。依次经第一池化层71和第一归一化层72分别对k个第一评论家特征数据进行处理,得到k个第七特征数据705。
经联合训练模块的第一层卷积层701对第三特征图像进行处理,得到第二通用特征图像。经第二降维层711对第二通用特征图像进行通道维度上的降维得到第二特征图像。经第二演员-评论家模块712对第二特征图像和至少一张第二热力图713进行处理,得到m个第二评论家特征数据。依次经第二池化层73和第二归一化层74分别对m个第二评论家特征数据进行处理,得到m个第十特征数据714。
依次经联合训练模块的第二层卷积层721、第三层降维层722、第三层池化层75、第三归一化层76对第三特征图像进行处理,得到第二特征数据723。
在联合训练模块中,第一层卷积层701和第二层卷积层721均为ResNet50中的conv5_x。第一降维层702、第二降维层711、第三降维层722中均包含一个尺寸为1*1的卷积核。
请参阅图8,图8所示为第一演员-评论家模块的结构示意图。第一演员-评论家模块的输入为至少一张第一热力图801和第一特征图像802。第一演员-评论家模块分别确定每张第一热力图与第一特征图像之间的点积,得到至少一个第六特征数据803。经第一打分子模块804对一个第六特征数据进行处理,可得到与该第六特征数据对应的第一概率。从至少一个第六特征数据中选取最大的k个第一概率805对应的第六特征数据得到k个第一演员特征数据806,或从至少一个第六特征数据中选取最小的k个第一概率对应的第六特征数据得到k个第一演员特征数据。分别对k个第一演员特征数据进行归一化处理,得到k个第一评论家特征数据807。
请参阅图9,图9所示为第一打分子模块的结构示意图。在第一打分子模块中,第六特征数据901依次经过归一化层902、池化层903、全连接层904,得到第八特征数据,经softmax层905对第八特征数据进行处理,得到第一概率906。
请参阅图10,图10所示为第二演员-评论家模块的结构示意图。第二演员-评论家模块的输入为至少一张第二热力图和第三特征图像。第二演员-评论家模块分别确定每张第二热力图1001与第三特征图像1002之间的点积,得到至少一个第九特征数据1003。经第二打分子模块1004对一个第九特征数据进行处理,可得到与该第九特征数据对应的第二概率1005。从至少一个第九特征数据中选取最大的m个第二概率对应的第九特征数据得到m个第二演员特征数据,或从至少一个第九特征数据中选取最小的m个第二概率对应的第九特征数据得到m个第二演员特征数据1006。分别对m个第二演员特征数据进行归一化处理,得到m个第二评论家特征数据1007。
请参阅图11,图11所示为第二打分子模块的结构示意图。在第二打分子模块中,第九特征数据1101依次经过归一化层1102、池化层1103、全连接层1104,得到第十一特征数据,经softmax层1105对第八特征数据进行处理,得到第二概率1106。
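Figures 9 and 11 describe the same normalization-pooling-fully-connected-softmax pipeline; a compact PyTorch rendering might look as follows, with the channel count an assumed example.

```python
import torch
import torch.nn as nn

class ScoringModule(nn.Module):
    """Scoring submodule sketch: normalization, pooling and a fully
    connected layer produce one logit per candidate feature; a softmax
    across candidates yields the selection probabilities."""
    def __init__(self, channels=64):
        super().__init__()
        self.norm = nn.BatchNorm2d(channels)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(channels, 1)

    def forward(self, feats):                        # feats: (N, C, H, W)
        x = self.pool(self.norm(feats)).flatten(1)   # (N, C)
        logits = self.fc(x).squeeze(1)               # (N,)
        return torch.softmax(logits, dim=0)          # one probability per feature

probs = ScoringModule()(torch.rand(23, 64, 16, 16))
print(probs.sum())  # ~1.0
```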
在使用图4所示的车辆识别网络提取图像中的车辆的特征数据之前,需对车辆识别网络进行训练。为此,本公开还提供了一种车辆识别网络的训练方法。该训练方法可包括以下步骤:
26、获取包含第二待识别车辆的训练图像和待训练网络。
本公开实施例中,训练图像包括第一待识别车辆。在一种获取训练图像的实现方式中,车辆识别装置接收用户通过输入组件输入的训练图像。上述输入组件包括:键盘、鼠标、触控屏、触控板和音频输入器等。
在另一种获取训练图像的实现方式中,车辆识别装置接收训练数据终端发送的训练图像。上述训练数据终端可以是以下任意一种:手机、计算机、平板电脑、服务器。
本公开实施例中,待训练网络的具体结构请参见图4。在一种获取待训练网络的实现方式中,车辆识别装置接收用户通过输入组件输入的待训练网络。上述输入组件包括:键盘、鼠标、触控屏、触控板和音频输入器等。
在另一种获取待训练网络的实现方式中,车辆识别装置接收训练数据终端发送的待训练网络。上述训练数据终端可以是以下任意一种:手机、计算机、平板电脑、服务器。
27、使用上述待训练网络对上述训练图像进行处理,得到包括上述第二待识别车辆的全局特征信息的第十四特征数据和包括上述第二待识别车辆的关键点特征信息的第十五特征数据。
本公开实施例中,第二待识别车辆的全局特征信息包括第二待识别车辆的整体外观特征信息。
28、依据上述第十四特征数据和上述训练图像的标签,得到第一全局损失。
本公开实施例中,训练图像的标签包括第二待识别车辆的类别信息。例如,在所有训练数据中总共包含车辆1和车辆2。在第二待识别车辆的类别信息为车辆1的情况下,表明第二待识别车辆为车辆1。
在一种可能实现的方式中,车辆识别装置依据第十四特征数据可得到第二待识别车辆的类别(下文将称为全局类别),依据全局类别和标签所包括的类别信息之间的差异可得到第一全局损失。
29、依据上述第十五特征数据和上述标签,得到第一关键点损失。
在一种可能实现的方式中,车辆识别装置依据第十五特征数据可得到第二待识别车辆的类别(下文将称为关键点类别),依据关键点类别和标签所包括的类别信息之间的差异可得到第一关键点损失。
30、依据上述第一全局损失和上述第一关键点损失,得到上述待训练网络的总损失。
假设第一全局损失为G_1,第一关键点损失为p_1,总损失为L_t。在一种可能实现的方式中,G_1、p_1、L_t满足公式(1):
L_t = G_1 + p_1 + c_1    公式(1);
其中,c_1为实数。c_1=0。
在另一种可能实现的方式中,G_1、p_1、L_t满足公式(2):
L_t = α_1×(G_1 + p_1)    公式(2);
其中,α_1为实数。α_1=1。
在又一种可能实现的方式中,G_1、p_1、L_t满足公式(3):
L_t = α_1×(G_1 + p_1) + c_1    公式(3);
其中,α_1、c_1均为实数。c_1=0,α_1=1。
31、基于上述总损失调整上述待训练网络的参数,得到上述车辆识别网络。
车辆识别装置依据总损失调整待训练网络的参数,直至总损失小于收敛阈值,得到车辆识别网络。
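A minimal sketch of step 31; the optimizer, learning rate and convergence threshold are illustrative choices the text does not fix.

```python
import torch

def train(network, loader, total_loss_fn, threshold=0.05, lr=1e-3):
    """Update the parameters of the network to be trained with the total
    loss until the loss falls below the convergence threshold."""
    opt = torch.optim.SGD(network.parameters(), lr=lr)
    while True:
        for images, labels in loader:
            loss = total_loss_fn(network, images, labels)
            opt.zero_grad()
            loss.backward()
            opt.step()
            if loss.item() < threshold:
                return network  # converged: this is the vehicle identification network
```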
由于总损失中包含第一全局损失和第一关键点损失,基于总损失调整待训练网络的参数得到车辆识别网络,可使用车辆识别网络对待处理图像进行处理得到第一待识别车辆的全局特征信息和关键点特征信息。
作为一种实施方式,在执行步骤30之前,车辆识别装置还执行以下步骤:
32、使用上述待训练网络对上述训练图像进行处理,得到包括上述第二待识别车辆的局部像素点区域的特征信息的第十六特征数据。
33、依据上述第十六特征数据和上述标签,得到第一局部像素点区域损失。
在一种可能实现的方式中,车辆识别装置依据第十六特征数据可得到第二待识别车辆的类别(下文将称为局部像素点区域类别),依据局部像素点区域类别和标签所包括的类别信息之间的差异可得到第一局部像素点区域损失。
在得到第一局部像素点区域损失后,车辆识别装置在执行步骤30的过程中执行以下步骤:
34、依据上述第一全局损失、上述第一关键点损失和上述第一局部像素点区域损失,得到上述总损失。
假设第一全局损失为G_1,第一关键点损失为p_1,第一局部像素点区域损失为γ_1,总损失为L_t。在一种可能实现的方式中,G_1、p_1、γ_1、L_t满足公式(4):
L_t = G_1 + p_1 + γ_1 + c_2    公式(4);
其中,c_2为实数。c_2=0。
在另一种可能实现的方式中,G_1、p_1、γ_1、L_t满足公式(5):
L_t = α_2×(G_1 + p_1 + γ_1)    公式(5);
其中,α_2为实数。α_2=1。
在又一种可能实现的方式中,G_1、p_1、γ_1、L_t满足公式(6):
L_t = α_2×(G_1 + p_1 + γ_1) + c_2    公式(6);
其中,α_2、c_2均为实数。c_2=0,α_2=1。
由于总损失中包含第一全局损失、第一关键点损失和第一局部像素点区域损失,基于总损失调整待训练网络的参数得到车辆识别网络,可使用车辆识别网络对待处理图像进行处理得到第一待识别车辆的全局特征信息、关键点特征信息和局部像素点区域特征信息。
作为一种可选的实施方式,车辆识别装置在执行步骤27的过程中执行以下步骤:
35、使用上述待训练网络对上述训练图像进行处理,得到至少一个第十七特征数据。
本公开实施例中,第十七特征数据包括第二待识别车辆的关键点特征信息,且任意两个第十七特征数据所包括的特征信息属于不同的关键点。
36、从上述至少一个第十七特征数据中选取包括信息量最多的s个特征数据,得到s个第十八特征数据,其中,s为不小于1的整数。
37、对上述s个第十八特征数据进行融合处理,得到上述第十五特征数据。
在对待训练网络的训练过程中,对s个第十八特征数据进行融合处理得到第十五特征数据,可在使用车辆识别网络对待处理图像进行处理过程,依据k个第七特征数据得到第四特征数据。
作为一种实施方式,在得到s个第十八特征数据后,在执行步骤34之前,车辆识别装置还执行以下步骤:
38、依据上述s个第十八特征数据,得到上述第二待识别车辆的s个第一识别结果。
本公开实施例中,第一识别结果包括第二待识别车辆的类别信息。车辆识别装置依据一个第十八特征数据,可得到一个第一识别结果。依据s个第十八特征数据,可得到第二待识别车辆的s个第一识别结果。
39、分别依据上述s个第一识别结果与上述标签之间的差异,得到关键点类别损失。
在一种可能实现的方式中,车辆识别装置依据一个第一识别结果和标签可得到一个第一识别差异,依据s个第一识别结果和标签可得到s个第一识别差异。车辆识别装置通过确定s个第一识别差异的和,得到关键点类别损失。
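A sketch of the keypoint category loss as the sum of the s per-result differences; cross entropy is our stand-in for the unspecified difference measure.

```python
import torch
import torch.nn.functional as F

def keypoint_category_loss(logits_list, label):
    """logits_list: s class-score tensors, one per first recognition result;
    label: scalar long tensor with the vehicle category from the label.
    Returns the sum of the s per-result differences."""
    return sum(F.cross_entropy(lg[None], label[None]) for lg in logits_list)

logits = [torch.randn(10) for _ in range(4)]  # s = 4 recognition results
label = torch.tensor(3)
print(keypoint_category_loss(logits, label))
```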
在得到关键点类别损失之后,车辆识别装置在执行步骤34的过程中执行以下步骤:
40、依据上述第一全局损失、上述第一关键点损失、上述第一局部像素点区域损失和上述关键点类别损失,得到上述总损失。
假设第一全局损失为G_1,第一关键点损失为p_1,第一局部像素点区域损失为γ_1,关键点类别损失为p_2,总损失为L_t。在一种可能实现的方式中,G_1、p_1、p_2、γ_1、L_t满足公式(7):
L_t = G_1 + p_1 + p_2 + γ_1 + c_3    公式(7);
其中,c_3为实数。c_3=0。
在另一种可能实现的方式中,G_1、p_1、p_2、γ_1、L_t满足公式(8):
L_t = α_3×(G_1 + p_1 + p_2 + γ_1)    公式(8);
其中,α_3为实数。α_3=1。
在又一种可能实现的方式中,G_1、p_1、p_2、γ_1、L_t满足公式(9):
L_t = α_3×(G_1 + p_1 + p_2 + γ_1) + c_3    公式(9);
其中,α_3、c_3均为实数。c_3=0,α_3=1。
由于总损失中包含关键点类别损失,可在使用车辆识别网络对待处理图像进行处理的过程中,提高依据k个第七特征数据得到的第四特征数据的准确度。
作为一种可选的实施方式,车辆识别装置在执行步骤36的过程中执行以下步骤:
41、依据所包括的信息量对上述至少一个第十七特征数据进行排序,得到第一顺序。
本公开实施例中,第一顺序为所包括的信息量从大到小的顺序,第一顺序或为所包括的信息量从小到大的顺序。
42、依据上述第一顺序从上述至少一个第十七特征数据中选取包括信息量最多的s个特征数据,得到上述s个第十八特征数据。
在第一顺序为所包括的信息量从大到小的顺序的情况下,车辆识别装置选取第一顺序中的前s个特征数据作为s个第十八特征数据;在第一顺序为所包括的信息量从小到大的顺序的情况下,车辆识别装置选取第一顺序中的后s个特征数据作为s个第十八特征数据。
在得到第一顺序的情况下,车辆识别装置在执行步骤40之前还执行以下步骤:
43、依据所对应的上述关键点类别损失对上述s个第一识别结果进行排序,得到第二顺序。
本公开实施例中,在第一顺序为所包括的信息量从大到小的顺序的情况下,第二顺序为关键点类别损失从小到大的顺序。即关键点类别损失越小,第一识别结果在第二顺序中的排名越高。
在第一顺序为所包括的信息量从小到大的顺序的情况下,第二顺序为关键点类别损失从大到小的顺序。即关键点类别损失越大,第一识别结果在第二顺序中的排名越高。
44、依据上述第一顺序和上述第二顺序之间的差异,得到关键点排序损失。
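The text derives the sorting loss only from "the difference between the first order and the second order"; one plausible rendering measures the mean absolute difference of the rank positions, which is our assumption.

```python
import torch

def order_loss(first_order, second_order):
    """first_order / second_order: 1-D integer tensors holding the indices
    of the s results sorted by information content and by category loss.
    The mean absolute rank difference is an assumed difference measure."""
    s = first_order.numel()
    rank1 = torch.empty(s); rank1[first_order] = torch.arange(s, dtype=torch.float)
    rank2 = torch.empty(s); rank2[second_order] = torch.arange(s, dtype=torch.float)
    return (rank1 - rank2).abs().mean()

print(order_loss(torch.tensor([2, 0, 1]), torch.tensor([0, 2, 1])))
```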
在得到关键点排序损失后,车辆识别装置在执行步骤40的过程中执行以下步骤:
45、依据上述第一全局损失、上述第一关键点损失、上述第一局部像素点区域损失、上述关键点类别损失和上述关键点排序损失,得到上述总损失。
假设第一全局损失为G_1,第一关键点损失为p_1,第一局部像素点区域损失为γ_1,关键点类别损失为p_2,关键点排序损失为p_3,总损失为L_t。在一种可能实现的方式中,G_1、p_1、p_2、p_3、γ_1、L_t满足公式(10):
L_t = G_1 + p_1 + p_2 + p_3 + γ_1 + c_4    公式(10);
其中,c_4为实数。c_4=0。
在另一种可能实现的方式中,G_1、p_1、p_2、p_3、γ_1、L_t满足公式(11):
L_t = α_4×(G_1 + p_1 + p_2 + p_3 + γ_1)    公式(11);
其中,α_4为实数。α_4=1。
在又一种可能实现的方式中,G_1、p_1、p_2、p_3、γ_1、L_t满足公式(12):
L_t = α_4×(G_1 + p_1 + p_2 + p_3 + γ_1) + c_4    公式(12);
其中,α_4、c_4均为实数。c_4=0,α_4=1。
在对待训练网络的训练过程中,在总损失中加入关键点排序损失,可提高s个第十八特征数据的准确度,进而提高第十五特征数据所包括的信息的准确度。这样,在使用车辆识别网络对待处理图像进行处理的过程中,可提高k个第七特征数据的准确度,进而提高第四特征数据所包括的信息的准确度。
作为一种可选的实施方式,车辆识别装置在执行步骤32的过程中执行以下步骤:
46、使用上述待训练网络对上述训练图像进行处理,得到至少一个第十九特征数据。
本公开实施例中,第十九特征数据包括第二待识别车辆的局部像素点区域特征信息,且任意两个第十九特征数据所包括的特征信息属于不同的局部像素点区域。
47、从上述至少一个第十九特征数据中选取包括信息量最多的p个特征数据,得到p个第二十特征数据,其中,p为不小于1的整数。
48、对上述p个第二十特征数据进行融合处理,得到上述第十六特征数据。
在对待训练网络的训练过程中,对p个第二十特征数据进行融合处理得到第十六特征数据,可在使用车辆识别网络对待处理图像进行处理过程,依据m个第十特征数据得到第五特征数据。
作为一种可选的实施方式,在得到p个第二十特征数据后,在执行步骤45之前,车辆识别装置还执行以下步骤:
49、依据上述p个第二十特征数据,得到上述第二待识别车辆的p个第二识别结果。
本公开实施例中,第二识别结果包括第二待识别车辆的类别信息。车辆识别装置依据一个第二十特征数据,可得到一个第二识别结果。依据p个第二十特征数据,可得到第二待识别车辆的p个第二识别结果。
50、分别依据上述p个第二识别结果与上述标签之间的差异,得到局部像素点区域类别损失。
在一种可能实现的方式中,车辆识别装置依据一个第二识别结果和标签可得到一个第二识别差异,依据p个第二识别结果和标签可得到p个第二识别差异。车辆识别装置通过确定p个第二识别差异的和,得到局部像素点区域类别损失。
在得到局部像素点区域类别损失之后,车辆识别装置在执行步骤45的过程中执行以下步骤:
51、依据上述第一全局损失、上述第一关键点损失、上述第一局部像素点区域损失、上述关键点类别损失、上述关键点排序损失和上述局部像素点区域类别损失,得到上述总损失。
假设第一全局损失为G_1,第一关键点损失为p_1,第一局部像素点区域损失为γ_1,关键点类别损失为p_2,关键点排序损失为p_3,局部像素点区域类别损失为γ_2,总损失为L_t。在一种可能实现的方式中,G_1、p_1、p_2、p_3、γ_1、γ_2、L_t满足公式(13):
L_t = G_1 + p_1 + p_2 + p_3 + γ_1 + γ_2 + c_5    公式(13);
其中,c_5为实数。c_5=0。
在另一种可能实现的方式中,G_1、p_1、p_2、p_3、γ_1、γ_2、L_t满足公式(14):
L_t = α_5×(G_1 + p_1 + p_2 + p_3 + γ_1 + γ_2)    公式(14);
其中,α_5为实数。α_5=1。
在又一种可能实现的方式中,G_1、p_1、p_2、p_3、γ_1、γ_2、L_t满足公式(15):
L_t = α_5×(G_1 + p_1 + p_2 + p_3 + γ_1 + γ_2) + c_5    公式(15);
其中,α_5、c_5均为实数。c_5=0,α_5=1。
由于总损失中包含局部像素点区域类别损失,可在使用车辆识别网络对待处理图像进行处理的过程中,提高依据m个第十特征数据得到的第五特征数据的准确度。
作为一种可选的实施方式,车辆识别装置在执行步骤47的过程中执行以下步骤:
52、依据所包括的信息量对上述至少一个第十九特征数据进行排序,得到第三顺序。
本公开实施例中,第三顺序为所包括的信息量从大到小的顺序,第三顺序或为所包括的信息量从小到大的顺序。
53、依据上述第三顺序从上述至少一个第十九特征数据中选取包括信息量最多的p个特征数据,得到上述p个第二十特征数据。
在第三顺序为所包括的信息量从大到小的顺序的情况下,车辆识别装置选取第三顺序中的前p个特征数据作为p个第二十特征数据;在第三顺序为所包括的信息量从小到大的顺序的情况下,车辆识别装置选取第三顺序中的后p个特征数据作为p个第二十特征数据。
在得到第三顺序的情况下,车辆识别装置在执行步骤51之前还执行以下步骤:
54、依据所对应的上述局部像素点区域类别损失对上述p个第二识别结果进行排序,得到第四顺序。
本公开实施例中,在第三顺序为所包括的信息量从大到小的顺序的情况下,第四顺序为局部像素点区域类别损失从小到大的顺序。即局部像素点区域类别损失越小,第二识别结果在第四顺序中的排名越高。
在第三顺序为所包括的信息量从小到大的顺序的情况下,第四顺序为局部像素点区域类别损失从大到小的顺序。即局部像素点区域类别损失越大,第二识别结果在第四顺序中的排名越高。
55、依据上述第三顺序和上述第四顺序之间的差异,得到局部像素点区域排序损失。
在得到局部像素点区域排序损失后,车辆识别装置在执行步骤51的过程中执行以下步骤:
56、依据上述第一全局损失、上述第一关键点损失、上述第一局部像素点区域损失、上述关键点类别损失、上述关键点排序损失、上述局部像素点区域类别损失和上述局部像素点区域排序损失,得到上述总损失。
假设第一全局损失为G_1,第一关键点损失为p_1,第一局部像素点区域损失为γ_1,关键点类别损失为p_2,关键点排序损失为p_3,局部像素点区域类别损失为γ_2,局部像素点区域排序损失为γ_3,总损失为L_t。在一种可能实现的方式中,G_1、p_1、p_2、p_3、γ_1、γ_2、γ_3、L_t满足公式(16):
L_t = G_1 + p_1 + p_2 + p_3 + γ_1 + γ_2 + γ_3 + c_6    公式(16);
其中,c_6为实数。c_6=0。
在另一种可能实现的方式中,G_1、p_1、p_2、p_3、γ_1、γ_2、γ_3、L_t满足公式(17):
L_t = α_6×(G_1 + p_1 + p_2 + p_3 + γ_1 + γ_2 + γ_3)    公式(17);
其中,α_6为实数。α_6=1。
在又一种可能实现的方式中,G_1、p_1、p_2、p_3、γ_1、γ_2、γ_3、L_t满足公式(18):
L_t = α_6×(G_1 + p_1 + p_2 + p_3 + γ_1 + γ_2 + γ_3) + c_6    公式(18);
其中,α_6、c_6均为实数。c_6=0,α_6=1。
在对待训练网络的训练过程中,在总损失中加入局部像素点区域排序损失,可提高p个第二十特征数据的准确度,进而提高第十六特征数据所包括的信息的准确度。这样,在使用车辆识别网络对待处理图像进行处理的过程中,可提高m个第十特征数据的准确度,进而提高第五特征数据所包括的信息的准确度。
作为一种可选的实施方式,第一全局损失包括全局焦点损失,车辆识别装置在执行步骤28的过程中执行以下步骤:
57、依据上述第十四特征数据,得到上述第二待识别车辆的第三识别结果。
本公开实施例中,第三识别结果包括第二待识别车辆的类别信息。车辆识别装置依据第十四特征数据,可确定第二待识别车辆的类别,进而得到第三识别结果。
58、依据上述第三识别结果和上述标签,得到上述第三识别结果的焦点损失,作为上述全局焦点损失。
假设第三识别结果的焦点损失为L_F1,则L_F1满足公式(19):
L_F1 = -∑_{n=1}^{B} β_n×(1-u_n)^γ×log u_n    公式(19);
其中,B为训练图像的数量,β_n为正数,γ为非负数,u_n为第三识别结果中与标签的类别对应的概率。β_n=2,γ=2。
例如,训练图像包括图像a,使用待训练网络对图像a进行处理得到第三识别结果1。若图像a的标签所包括的类别为车辆1(即图像a的标签为车辆1),在第三识别结果1中,图像a中的第二待识别车辆为车辆1的概率为0.9、为车辆2的概率为0.1。假设β_n=2,γ=2,此时,L_F1 = -2×(1-0.9)^2×log 0.9。
又例如,训练图像包括图像a和图像b,使用待训练网络对图像a进行处理得到第三识别结果1,使用待训练网络对图像b进行处理得到第三识别结果2。若图像a的标签所包括的类别为车辆1(即图像a的标签为车辆1),图像b的标签所包括的类别为车辆2(即图像b的标签为车辆2)。在第三识别结果1中,图像a中的第二待识别车辆为车辆1的概率为0.3、为车辆2的概率为0.7。在第三识别结果2中,图像b中的第二待识别车辆为车辆1的概率为0.2、为车辆2的概率为0.8。假设β_n=2,γ=2,此时,L_F1 = -2×(1-0.3)^2×log 0.3 - 2×(1-0.8)^2×log 0.8。
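The reconstructed formula (19) and both worked examples can be checked with a few lines of Python; the natural logarithm is assumed, since the text does not state the base.

```python
import math

def focal_loss(probs, beta=2.0, gamma=2.0):
    """Focal loss of formula (19): L = -sum_n beta_n*(1-u_n)**gamma*log(u_n),
    where u_n is the probability assigned to the labelled class."""
    return -sum(beta * (1.0 - u) ** gamma * math.log(u) for u in probs)

print(focal_loss([0.9]))       # single-image example: -2*(1-0.9)^2*log 0.9
print(focal_loss([0.3, 0.8]))  # two-image example
```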
将最大概率处于第一概率阈值至第二概率阈值之间的第三识别结果所对应的图像称为第一难样本,将训练图像中除第一难样本之外的图像称为第一容易样本。例如,假设第一概率阈值为0.4,第二概率阈值为0.7。在训练过程中,待训练网络通过对图像a进行处理得到第三识别结果1。
若在第三识别结果1中,图像a中的第二待识别车辆为车辆1的概率为0.8,图像a中第二待识别车辆为车辆2的概率为0.2。由于第三识别结果1的最大概率为0.8,该最大概率大于第二概率阈值,图像a为第一容易样本。
若在第三识别结果1中,图像a中的第二待识别车辆为车辆1的概率为0.5,图像a中第二待识别车辆为车辆2的概率为0.5。由于第三识别结果1的最大概率为0.5,该最大概率大于第一概率阈值,且该最大概率小于第二概率阈值,图像a为第一难样本。
在训练过程中,通过计算第三识别结果的焦点损失得到关键点焦点损失,进而确定总损失,可提升对待训练网络的训练效果。
作为一种可选的实施方式,训练图像属于训练图像集,训练图像集还包括训练图像的第一正样本图像和训练图像的第一负样本图像,第一全局损失还包括全局三元组损失。车辆识别装置在执行步骤28的过程中还执行以下步骤:
59、使用上述待训练网络对上述第一正样本图像进行特征提取处理,得到上述第一正样本图像的特征数据。
本公开实施例中,第一正样本图像的标签所包括类别信息与训练图像的标签所包括的类别信息相同,第一负样本图像的标签所包括类别信息与训练图像的标签所包括的类别信息不同。
第一正样本图像的特征数据包括第一正样本图像的语义信息,该语义信息可用于识别第一正样本图像中的第二待识别车辆的类别。
60、使用上述待训练网络对上述第一负样本图像进行特征提取处理,得到上述第一负样本图像的特征数据。
第一负样本图像的特征数据包括第一负样本图像的语义信息,该语义信息可用于识别第一负样本图像中的第二待识别车辆的类别。
61、依据上述第十二特征数据、上述第一正样本图像的特征数据和上述第一负样本图像的特征数据,得到上述全局三元组损失。
车辆识别装置计算第十二特征数据与第一正样本图像的特征数据之间的相似度得到第一正相似度、计算第十二特征数据与第一负样本图像的特征数据之间的相似度得到第一负相似度。
假设第十二特征数据为x_a,第一正相似度为s_1,第一负相似度为s_2,全局三元组损失为L_T1,则L_T1、s_1、s_2满足公式(20):
L_T1 = [v_1 + s_1 - s_2]    公式(20);
其中,v_1为实数。v_1=1。
在一些实施例中,第一正相似度为第十二特征数据与第一正样本图像的特征数据之间的第二范数。第一负相似度为第十二特征数据与第一负样本图像的特征数据之间的第二范数。
在一些实施例中,在训练图像集包括除训练图像、第一正样本图像、第一负样本图像之外的图像的情况下,车辆识别装置可将训练图像集中训练图像之外的图像分为正样本图像集和负样本图像集。正样本图像集中的图像的标签所包括的类别信息与训练图像的标签所包括的类别信息相同,负样本图像集中的图像的标签所包括的类别信息与训练图像的标签所包括的类别信息不同。
车辆识别装置对正样本图像集中的图像进行特征提取处理得到正样本特征数据集、对负样本图像集中的图像进行特征提取处理得到负样本特征数据集。车辆识别装置计算第十二特征数据与正样本特征数据集中的特征数据之间的相似度得到第一正相似度集、计算第十二特征数据与负样本特征数据集中的特征数据之间的相似度得到第一负相似度集。将第一正相似度集中的最小值称为第一类内最小相似度,将第一负相似度集中的最大值称为第一类外最大相似度。
假设第十二特征数据为x_a,第一类内最小相似度为max d(x_a, x_p),第一类外最大相似度为min d(x_a, x_n),全局三元组损失为L_T1,则它们满足公式(21):
L_T1 = [v_1 + max d(x_a, x_p) - min d(x_a, x_n)]    公式(21);
其中,v_1为实数。v_1=1。
在一些实施例中,第十二特征数据与第一正样本特征数据集中的特征数据之间的相似度为,第十二特征数据与第一正样本特征数据集中的特征数据之间的第二范数。第十二特征数据与第一负样本特征数据集中的特征数据之间的相似度为,第十二特征数据与第一负样本特征数据集中的特征数据之间的第二范数。
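A sketch of the hard-mining form of formula (21), with the L2 norm as the distance; whether a hinge is applied to the bracketed value is not stated, so none is applied here.

```python
import torch

def global_triplet_loss(anchor, positives, negatives, v=1.0):
    """Formula (21): L = v + max d(x_a, x_p) - min d(x_a, x_n), using the
    L2 norm between the anchor and the positive/negative feature sets."""
    d_pos = torch.cdist(anchor[None], positives).max()  # hardest positive
    d_neg = torch.cdist(anchor[None], negatives).min()  # hardest negative
    return v + d_pos - d_neg

loss = global_triplet_loss(torch.rand(128), torch.rand(5, 128), torch.rand(8, 128))
print(loss)
```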
在训练过程中,全局三元组损失可提升待训练网络基于第十二特征数据得到的第二待识别车辆的识别结果的准确度,从而提升车辆识别网络对第一待识别车辆的分类准确度。
应理解,在第一全局损失包括全局焦点损失和全局三元组损失的情况下,第一全局损失可以为全局焦点损失和全局三元组损失的和。
作为一种可选的实施方式,在执行步骤56之前,车辆识别装置还执行以下步骤:
62、依据上述第十五特征数据,得到上述第二待识别车辆的第四识别结果。
本公开实施例中,第四识别结果包括第二待识别车辆的类别信息。车辆识别装置依据第十五特征数据,可确定第二待识别车辆的类别,进而得到第四识别结果。
63、依据上述第四识别结果和上述标签,得到上述第四识别结果的焦点损失,作为关键点焦点损失。
假设第四识别结果的焦点损失为L_F2,则L_F2满足公式(22):
L_F2 = -∑_{m=1}^{B} β_m×(1-u_m)^γ×log u_m    公式(22);
其中,B为训练图像的数量,β_m为正数,γ为非负数,u_m为第四识别结果中与标签的类别对应的概率。β_m=2,γ=2。
例如,训练图像包括图像a,使用待训练网络对图像a进行处理得到第四识别结果1。若图像a的标签所包括的类别为车辆1(即图像a的标签为车辆1),在第四识别结果1中,图像a中的第二待识别车辆为车辆1的概率为0.9、为车辆2的概率为0.1。假设β_m=2,γ=2,此时,L_F2 = -2×(1-0.9)^2×log 0.9。
又例如,训练图像包括图像a和图像b,使用待训练网络对图像a进行处理得到第四识别结果1,使用待训练网络对图像b进行处理得到第四识别结果2。若图像a的标签所包括的类别为车辆1(即图像a的标签为车辆1),图像b的标签所包括的类别为车辆2(即图像b的标签为车辆2)。在第四识别结果1中,图像a中的第二待识别车辆为车辆1的概率为0.3、为车辆2的概率为0.7。在第四识别结果2中,图像b中的第二待识别车辆为车辆1的概率为0.2、为车辆2的概率为0.8。假设β_m=2,γ=2,此时,L_F2 = -2×(1-0.3)^2×log 0.3 - 2×(1-0.8)^2×log 0.8。
在得到关键点焦点损失后,车辆识别装置在执行步骤58的过程中执行以下步骤:
64、依据上述第一全局损失、上述第一关键点损失、上述第一局部像素点区域损失、上述关键点类别损失、上述关键点排序损失、上述局部像素点区域类别损失、上述关键点焦点损失和上述局部像素点区域排序损失,得到上述总损失。
假设第一全局损失为G_1,第一关键点损失为p_1,第一局部像素点区域损失为γ_1,关键点类别损失为p_2,关键点排序损失为p_3,局部像素点区域类别损失为γ_2,局部像素点区域排序损失为γ_3,关键点焦点损失为p_4,总损失为L_t。在一种可能实现的方式中,G_1、p_1、p_2、p_3、p_4、γ_1、γ_2、γ_3、L_t满足公式(23):
L_t = G_1 + p_1 + p_2 + p_3 + p_4 + γ_1 + γ_2 + γ_3 + c_7    公式(23);
其中,c_7为实数。c_7=0。
在另一种可能实现的方式中,满足公式(24):
L_t = α_7×(G_1 + p_1 + p_2 + p_3 + p_4 + γ_1 + γ_2 + γ_3)    公式(24);
其中,α_7为实数。α_7=1。
在又一种可能实现的方式中,G_1、p_1、p_2、p_3、p_4、γ_1、γ_2、γ_3、L_t满足公式(25):
L_t = α_7×(G_1 + p_1 + p_2 + p_3 + p_4 + γ_1 + γ_2 + γ_3) + c_7    公式(25);
其中,α_7、c_7均为实数。c_7=0,α_7=1。
将最大概率处于第三概率阈值至第四概率阈值之间的第四识别结果所对应的图像称为第二难样本,将训练图像中除第二难样本之外的图像称为第二容易样本。例如,假设第三概率阈值为0.4,第四概率阈值为0.7。在训练过程中,待训练网络通过对图像a进行处理得到第四识别结果1。
若在第四识别结果1中,图像a中的第二待识别车辆为车辆1的概率为0.8,图像a中第二待识别车辆为车辆2的概率为0.2。由于第四识别结果1的最大概率为0.8,该最大概率大于第四概率阈值,图像a为第二容易样本。
若在第四识别结果1中,图像a中的第二待识别车辆为车辆1的概率为0.5,图像a中第二待识别车辆为车辆2的概率为0.5。由于第四识别结果1的最大概率为0.5,该最大概率大于第三概率阈值,且该最大概率小于第四概率阈值,图像a为第二难样本。
在训练过程中,通过计算第四识别结果的焦点损失得到关键点焦点损失,并在总损失中增加关键点焦点损失,可提升对待训练网络的训练效果。
作为一种可选的实施方式,在执行步骤63之前,车辆识别装置还执行以下步骤:
65、依据上述第十五特征数据、上述第一正样本图像的特征数据和上述第一负样本图像的特征数据,得到关键点三元组损失。
车辆识别装置计算第十五特征数据与第一正样本图像的特征数据之间的相似度得到第二正相似度、计算第十五特征数据与第一负样本图像的特征数据之间的相似度得到第二负相似度。
假设第十五特征数据为x_b,第二正相似度为s_3,第二负相似度为s_4,关键点三元组损失为L_T2,则L_T2、s_3、s_4满足公式(26):
L_T2 = [v_2 + s_3 - s_4]    公式(26);
其中,v_2为实数。v_2=1。
在一些实施例中,第二正相似度为第十五特征数据与第一正样本图像的特征数据之间的第二范数。第二负相似度为第十五特征数据与第一负样本图像的特征数据之间的第二范数。
车辆识别装置对正样本图像集中的图像进行特征提取处理得到正样本特征数据集、对负样本图像集中的图像进行特征提取处理得到负样本特征数据集。车辆识别装置计算第十五特征数据与正样本特征数据集中的特征数据之间的相似度得到第二正相似度集、计算第十五特征数据与负样本特征数据集中的特征数据之间的相似度得到第二负相似度集。将第二正相似度集中的最小值称为第二类内最小相似度,将第二负相似度集中的最大值称为第二类外最大相似度。
假设第十五特征数据为x_b,第二类内最小相似度为max d(x_b, x_p),第二类外最大相似度为min d(x_b, x_n),关键点三元组损失为L_T2,则它们满足公式(27):
L_T2 = [v_2 + max d(x_b, x_p) - min d(x_b, x_n)]    公式(27);
其中,v_2为实数。v_2=1。
在一些实施例中,第十五特征数据与正样本特征数据集中的特征数据之间的相似度为,第十五特征数据与正样本特征数据集中的特征数据之间的第二范数。第十五特征数据与负样本特征数据集中的特征数据之间的相似度为,第十五特征数据与负样本特征数据集中的特征数据之间的第二范数。
在得到关键点三元组损失后,车辆识别装置在执行步骤64的过程中执行以下步骤:
66、依据上述第一全局损失、上述第一关键点损失、上述第一局部像素点区域损失、上述关键点类别损失、上述关键点排序损失、上述局部像素点区域类别损失、上述关键点焦点损失、上述关键点三元组损失和上述局部像素点区域排序损失,得到上述总损失。
假设第一全局损失为G_1,第一关键点损失为p_1,第一局部像素点区域损失为γ_1,关键点类别损失为p_2,关键点排序损失为p_3,局部像素点区域类别损失为γ_2,局部像素点区域排序损失为γ_3,关键点焦点损失为p_4,关键点三元组损失为p_5,总损失为L_t。在一种可能实现的方式中,G_1、p_1、p_2、p_3、p_4、p_5、γ_1、γ_2、γ_3、L_t满足公式(28):
L_t = G_1 + p_1 + p_2 + p_3 + p_4 + p_5 + γ_1 + γ_2 + γ_3 + c_8    公式(28);
其中,c_8为实数。c_8=0。
在另一种可能实现的方式中,G_1、p_1、p_2、p_3、p_4、p_5、γ_1、γ_2、γ_3、L_t满足公式(29):
L_t = α_8×(G_1 + p_1 + p_2 + p_3 + p_4 + p_5 + γ_1 + γ_2 + γ_3)    公式(29);
其中,α_8为实数。α_8=1。
在又一种可能实现的方式中,G_1、p_1、p_2、p_3、p_4、p_5、γ_1、γ_2、γ_3、L_t满足公式(30):
L_t = α_8×(G_1 + p_1 + p_2 + p_3 + p_4 + p_5 + γ_1 + γ_2 + γ_3) + c_8    公式(30);
其中,α_8、c_8均为实数。c_8=0,α_8=1。
在训练过程中,关键点三元组损失可提升待训练网络基于第十五特征数据得到的第二待识别车辆的识别结果的准确度,从而提升车辆识别网络对第一待识别车辆的分类准确度。
作为一种可选的实施方式,在执行步骤66之前,车辆识别装置还执行以下步骤:
67、依据上述第十六特征数据,得到上述第二待识别车辆的第五识别结果。
本公开实施例中,第五识别结果包括第二待识别车辆的类别信息。车辆识别装置依据第十六特征数据,可确定第二待识别车辆的类别,进而得到第五识别结果。
68、依据上述第五识别结果和上述标签,得到上述第五识别结果的焦点损失,作为局部像素点区域焦点损失。
假设第五识别结果的焦点损失为L_F3,则L_F3满足公式(31):
L_F3 = -∑_{k=1}^{B} β_k×(1-u_k)^γ×log u_k    公式(31);
其中,B为训练图像的数量,β_k为正数,γ为非负数,u_k为第五识别结果中与标签的类别对应的概率。β_k=2,γ=2。
例如,训练图像包括图像a,使用待训练网络对图像a进行处理得到第五识别结果1。若图像a的标签所包括的类别为车辆1(即图像a的标签为车辆1),在第五识别结果1中,图像a中的第二待识别车辆为车辆1的概率为0.9、为车辆2的概率为0.1。假设β_k=2,γ=2,此时,L_F3 = -2×(1-0.9)^2×log 0.9。
又例如,训练图像包括图像a和图像b,使用待训练网络对图像a进行处理得到第五识别结果1,使用待训练网络对图像b进行处理得到第五识别结果2。若图像a的标签所包括的类别为车辆1(即图像a的标签为车辆1),图像b的标签所包括的类别为车辆2(即图像b的标签为车辆2)。在第五识别结果1中,图像a中的第二待识别车辆为车辆1的概率为0.3、为车辆2的概率为0.7。在第五识别结果2中,图像b中的第二待识别车辆为车辆1的概率为0.2、为车辆2的概率为0.8。假设β_k=2,γ=2,此时,L_F3 = -2×(1-0.3)^2×log 0.3 - 2×(1-0.8)^2×log 0.8。
在得到局部像素点区域焦点损失后,车辆识别装置在执行步骤66的过程中执行以下步骤:
69、依据上述第一全局损失、上述第一关键点损失、上述第一局部像素点区域损失、上述关键点类别损失、上述关键点排序损失、上述局部像素点区域类别损失、上述关键点焦点损失、上述关键点三元组损失、上述局部像素点区域焦点损失和上述局部像素点区域排序损失,得到上述总损失。
假设第一全局损失为G_1,第一关键点损失为p_1,第一局部像素点区域损失为γ_1,关键点类别损失为p_2,关键点排序损失为p_3,局部像素点区域类别损失为γ_2,局部像素点区域排序损失为γ_3,局部像素点区域焦点损失为γ_4,关键点焦点损失为p_4,关键点三元组损失为p_5,总损失为L_t。在一种可能实现的方式中,G_1、p_1、p_2、p_3、p_4、p_5、γ_1、γ_2、γ_3、γ_4、L_t满足公式(32):
L_t = G_1 + p_1 + p_2 + p_3 + p_4 + p_5 + γ_1 + γ_2 + γ_3 + γ_4 + c_9    公式(32);
其中,c_9为实数。c_9=0。
在另一种可能实现的方式中,G_1、p_1、p_2、p_3、p_4、p_5、γ_1、γ_2、γ_3、γ_4、L_t满足公式(33):
L_t = α_9×(G_1 + p_1 + p_2 + p_3 + p_4 + p_5 + γ_1 + γ_2 + γ_3 + γ_4)    公式(33);
其中,α_9为实数。α_9=1。
在又一种可能实现的方式中,G_1、p_1、p_2、p_3、p_4、p_5、γ_1、γ_2、γ_3、γ_4、L_t满足公式(34):
L_t = α_9×(G_1 + p_1 + p_2 + p_3 + p_4 + p_5 + γ_1 + γ_2 + γ_3 + γ_4) + c_9    公式(34);
其中,α_9、c_9均为实数。c_9=0,α_9=1。
将最大概率处于第五概率阈值至第六概率阈值之间的第五识别结果所对应的图像称为第三难样本,将训练图像中除第三难样本之外的图像称为第三容易样本。例如,假设第五概率阈值为0.4,第六概率阈值为0.7。在训练过程中,待训练网络通过对图像a进行处理得到第五识别结果1。
若在第五识别结果1中,图像a中的第二待识别车辆为车辆1的概率为0.8,图像a中第二待识别车辆为车辆2的概率为0.2。由于第五识别结果1的最大概率为0.8,该最大概率大于第六概率阈值,图像a为第三容易样本。
若在第五识别结果1中,图像a中的第二待识别车辆为车辆1的概率为0.5,图像a中第二待识别车辆为车辆2的概率为0.5。由于第五识别结果1的最大概率为0.5,该最大概率大于第五概率阈值,且该最大概率小于第六概率阈值,图像a为第三难样本。
在训练过程中,通过计算第五识别结果的焦点损失得到局部像素点区域焦点损失,进而确定总损失,可提升对第三难样本的训练效果,进而提升对待训练网络的训练效果。
作为一种可选的实施方式,在执行步骤69之前,车辆识别装置还执行以下步骤:
70、依据上述第十六特征数据、上述第一正样本图像的特征数据和上述第一负样本图像的特征数据,得到局部像素点区域三元组损失。
车辆识别装置计算第十六特征数据与第一正样本图像的特征数据之间的相似度得到第三正相似度、计算第十六特征数据与第一负样本图像的特征数据之间的相似度得到第三负相似度。
假设第十六特征数据为x_c,第三正相似度为s_5,第三负相似度为s_6,局部像素点区域三元组损失为L_T3,则L_T3、s_5、s_6满足公式(35):
L_T3 = [v_3 + s_5 - s_6]    公式(35);
其中,v_3为实数。v_3=1。
在一些实施例中,第三正相似度为第十六特征数据与第一正样本图像的特征数据之间的第二范数。第三负相似度为第十六特征数据与第一负样本图像的特征数据之间的第二范数。
车辆识别装置计算第十六特征数据与正样本特征数据集中的特征数据之间的相似度得到第三正相似度集、计算第十六特征数据与负样本特征数据集中的特征数据之间的相似度得到第三负相似度集。将第三正相似度集中的最小值称为第三类内最小相似度,将第三负相似度集中的最大值称为第三类外最大相似度。
假设第十六特征数据为x_c,第三类内最小相似度为max d(x_c, x_p),第三类外最大相似度为min d(x_c, x_n),局部像素点区域三元组损失为L_T3,则它们满足公式(36):
L_T3 = [v_3 + max d(x_c, x_p) - min d(x_c, x_n)]    公式(36);
其中,v_3为实数。v_3=1。
在一些实施例中,第十六特征数据与正样本特征数据集中的特征数据之间的相似度为,第十六特征数据与正样本特征数据集中的特征数据之间的第二范数。第十六特征数据与负样本特征数据集中的特征数据之间的相似度为,第十六特征数据与负样本特征数据集中的特征数据之间的第二范数。
在得到局部像素点区域三元组损失后,车辆识别装置在执行步骤69的过程中执行以下步骤:
71、依据上述第一全局损失、上述第一关键点损失、上述第一局部像素点区域损失、上述关键点类别损失、上述关键点排序损失、上述局部像素点区域类别损失、上述关键点焦点损失、上述关键点三元组损失、上述局部像素点区域焦点损失、上述局部像素点三元组损失和上述局部像素点区域排序损失,得到上述总损失。
假设第一全局损失为G_1,第一关键点损失为p_1,第一局部像素点区域损失为γ_1,关键点类别损失为p_2,关键点排序损失为p_3,局部像素点区域类别损失为γ_2,局部像素点区域排序损失为γ_3,局部像素点区域焦点损失为γ_4,局部像素点区域三元组损失为γ_5,关键点焦点损失为p_4,关键点三元组损失为p_5,总损失为L_t。在一种可能实现的方式中,G_1、p_1、p_2、p_3、p_4、p_5、γ_1、γ_2、γ_3、γ_4、γ_5、L_t满足公式(37):
L_t = G_1 + p_1 + p_2 + p_3 + p_4 + p_5 + γ_1 + γ_2 + γ_3 + γ_4 + γ_5 + c_10    公式(37);
其中,c_10为实数。c_10=0。
在另一种可能实现的方式中,G_1、p_1、p_2、p_3、p_4、p_5、γ_1、γ_2、γ_3、γ_4、γ_5、L_t满足公式(38):
L_t = α_10×(G_1 + p_1 + p_2 + p_3 + p_4 + p_5 + γ_1 + γ_2 + γ_3 + γ_4 + γ_5)    公式(38);
其中,α_10为实数。α_10=1。
在又一种可能实现的方式中,G_1、p_1、p_2、p_3、p_4、p_5、γ_1、γ_2、γ_3、γ_4、γ_5、L_t满足公式(39):
L_t = α_10×(G_1 + p_1 + p_2 + p_3 + p_4 + p_5 + γ_1 + γ_2 + γ_3 + γ_4 + γ_5) + c_10    公式(39);
其中,α_10、c_10均为实数。c_10=0,α_10=1。
在训练过程中,局部像素点区域三元组损失可提升待训练网络基于第十六特征数据得到的第二待识别车辆的识别结果的准确度,从而提升车辆识别网络对第一待识别车辆的分类准确度。
作为一种可选的实施方式,车辆识别装置获取生成数据集,并使用生成数据集对关键点和局部像素点区域生成模块进行训练。
本公开实施例中,生成数据集包括至少一张热力图训练图像,且每张热力图训练图像的标签包括关键点标签热力图和局部像素点区域标签热力图。其中,关键点标签热力图包括热力图训练图像中关键点的位置信息,局部像素点区域标签热力图包括热力图训练图像中局部像素点区域的位置信息。
基于本公开实施例提供的技术的方案,本公开实施例还提供了一种车辆识别方法的应用场景。随着公共场所内摄像头数量的快速增长,如何有效的通过海量视频流确定肇事逃逸车辆的行踪具有重要的意义。
A地方发生交通事故,且肇事车辆逃逸。通过A事故现场的监控摄像头采集到了肇事逃逸车辆的图像。警方可将肇事逃逸车辆的图像输入至车辆识别装置。
车辆识别装置使用本公开实施例提供的技术方案,从肇事逃逸车辆的图像中提取出肇事逃逸车辆的特征数据。
车辆识别装置可与多个监控摄像头相连,不同的监控摄像头安装在不同位置,且车辆识别装置可从每个监控摄像头获取实时采集的视频流。车辆识别装置使用本公开实施例提供的技术方案,从视频流中的图像中提取出视频流中的车辆的特征数据,得到特征数据库。
车辆识别装置将肇事逃逸车辆的特征数据与特征数据库中的特征数据进行比对,得到与肇事逃逸车辆的特征数据匹配的特征数据,作为目标特征数据。确定与目标特征数据对应的图像为包含肇事逃逸车辆的图像,进而可依据包含肇事逃逸车辆的图像确定肇事逃逸车辆的行踪。
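A hypothetical end-to-end use of this scenario, reusing the match_vehicle sketch shown earlier; extract_features stands in for the trained vehicle identification network and is assumed.

```python
import numpy as np

def build_feature_db(frames_by_camera, extract_features):
    """Build the feature database from per-camera video frames."""
    db = {}
    for cam_id, frames in frames_by_camera.items():
        for idx, frame in enumerate(frames):
            db[(cam_id, idx)] = extract_features(frame)
    return db

# placeholder network: a real deployment would call the trained model here
fake_net = lambda img: np.random.rand(256)
db = build_feature_db({"cam1": [np.zeros((224, 224, 3))] * 3}, fake_net)
query = fake_net(np.zeros((224, 224, 3)))  # feature of the hit-and-run vehicle
# match_vehicle(query, db) would then return the best-matching (camera, frame) key
```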
本领域技术人员可以理解,在具体实施方式的上述方法中,各步骤的撰写顺序并不意味着严格的执行顺序而对实施过程构成任何限定,各步骤的具体执行顺序应当以其功能和可能的内在逻辑确定。
上述详细阐述了本公开实施例的方法,下面提供了本公开实施例的装置。
请参阅图12,图12为本公开实施例提供的一种车辆识别装置1的结构示意图,该车辆识别装置1包括:获取单元11、第一处理单元12、第二处理单元13、融合处理单元14、第三处理单元15、第四处理单元16,其中:
获取单元11,配置为获取包含第一待识别车辆的待处理图像;
第一处理单元12,配置为对所述待处理图像进行第一特征提取处理,得到包括所述第一待识别车辆的局部特征信息的第一特征数据;
第二处理单元13,配置为对所述待处理图像进行第二特征提取处理,得到包括所述第一待识别车辆的全局特征信息的第二特征数据;
融合处理单元14,配置为对所述第一特征数据和所述第二特征数据进行融合处理,得到所述第一待识别车辆的第三特征数据;所述第三特征数据用于获得所述第一待识别车辆的识别结果。
结合本公开任一实施方式,所述局部特征信息包括关键点特征信息,所述第一特征数据包括所述待识别车辆的至少一个关键点的特征信息。
结合本公开任一实施方式,所述局部特征信息还包括局部像素点区域特征信息,所述第一特征数据还包括所述待识别车辆的至少一个局部像素点区域的特征信息。
结合本公开任一实施方式,所述第一处理单元12配置为:
对所述待处理图像进行第三特征提取处理,得到第四特征数据;所述第四特征数据包括所述第一待识别车辆的至少一个关键点的特征信息;
对所述待处理图像进行第四特征提取处理,得到第五特征数据;所述第五特征数据包括所述第一待识别车辆的至少一个局部像素点区域的特征信息;所述局部像素点区域属于所述第一待识别车辆所覆盖的像素点区域,且所述局部像素点区域的面积小于所述第一待识别车辆所覆盖的像素点区域的面积;
对所述第四特征数据和第五特征数据进行融合处理,得到所述第一特征数据。
结合本公开任一实施方式,所述第一处理单元12配置为:
对所述待处理图像进行第五特征提取处理,得到至少一个第六特征数据;所述第六特征数据包括所述关键点的特征信息,且任意两个所述第六特征数据所包括的特征信息属于不同的关键点;
从所述至少一个第六特征数据中选取包括信息量最多的k个特征数据,得到k个第七特征数据;所述k为不小于1的整数;
依据所述k个第七特征数据得到所述第四特征数据。
结合本公开任一实施方式,所述第一处理单元12配置为:
对所述待处理图像进行第六特征提取处理,得到至少一张第一热力图;所述第一热力图包括所述关键点在所述待处理图像中的位置信息,且任意两张所述第一热力图所包括的信息属于不同的关键点;
对所述待处理图像进行第七特征提取处理,得到所述待处理图像的第一特征图像;所述第一特征图像包括所述待处理图像中的关键点的特征信息;
分别确定每张所述第一热力图与所述第一特征图像之间的点积,得到所述至少一个第六特征数据。
结合本公开任一实施方式,所述第一处理单元12配置为:
对所述至少一个第六特征数据中的特征数据分别进行池化处理,得到至少一个第八特征数据;
依据所述至少一个第八特征数据所包括的信息量,得到至少一个第一概率;所述第一概率用于表征所述第六特征数据所包括的信息量;所述第一概率与所述第六特征数据一一对应;
在所述第一概率与所述第六特征数据所包括的信息量呈正相关的情况下,选取最大的k个所述第一概率所对应的所述第六特征数据,作为所述k个第七特征数据;或,
在所述第一概率与所述第六特征数据所包括的信息量呈负相关的情况下,选取最小的k个所述第一概率所对应的所述第六特征数据,作为所述k个第七特征数据。
结合本公开任一实施方式,所述第一处理单元12配置为:
对所述待处理图像进行第十特征提取处理,得到至少一个第九特征数据;所述第九特征数据包括所述关键点的特征信息,且任意两个所述第九特征数据所包括的特征信息属于不同的局部像素点区域;
从所述至少两个第九特征数据中选取包含信息量最多的m个特征数据,得到m个第十特征数据;所述m为不小于1的整数;
依据所述m个第十特征数据得到所述第五特征数据。
结合本公开任一实施方式,所述第一处理单元12配置为:
对所述待处理图像进行第十一特征提取处理,得到所述至少一张第二热力图;所述第二热力图包括所述局部像素点区域在所述待处理图像中的位置信息,且任意两张所述第二热力图所包括的信息属于不同的局部像素点区域;
对所述待处理图像进行第十二特征提取处理,得到所述待处理图像的第二特征图像;所述第二特征图像包括所述待处理图像中的局部像素点区域的特征信息;
分别确定每张所述第二热力图与所述第二特征图像之间的点积,得到所述至少一个第九特征数据。
结合本公开任一实施方式,所述第一处理单元12配置为:
对所述第九特征数据中的特征数据分别进行池化处理,得到至少一个第十一特征数据;
依据所述至少一个第十一特征数据所包括的信息量,得到至少一个第二概率;所述第二概率用于表征所述第九特征数据中包括的信息量;所述第二概率与所述第九特征数据一一对应;
在所述第二概率与所述第九特征数据所包括的信息量呈正相关的情况下,选取最大的m个所述第二概率所对应的所述第九特征数据,作为所述m个第十特征数据;或,
在所述第二概率与所述第九特征数据所包括的信息量呈负相关的情况下,选取最小的m个所述第二概率所对应的所述第九特征数据,作为所述m个第十特征数据。
结合本公开任一实施方式,所述至少一个局部像素点区域包括:第一像素点区域和第二像素点区域,所述第九特征数据的数量和所述m均大于1,所述m个第十特征数据包括:第十二特征数据和第十三特征数据,所述第十二特征数据包括所述第一像素点区域的特征信息,所述第十三特征数据包括所述第二像素点区域的特征信息;
所述第一处理单元12配置为:
依据所述第十二特征数据所包括的信息量得到第一权重,依据所述第十三特征数据所包括的信息量得到第二权重;所述第一权重与所述第十二特征数据所包括的信息量呈正相关,所述第二权重与所述第十三特征数据所包括的信息量呈正相关;
依据所述第一权重和所述第二权重,对所述第十二特征数据和所述第十三特征数据进行加权融合,得到所述第五特征数据。
结合本公开任一实施方式,所述车辆识别装置执行的车辆识别方法应用于车辆识别网络,所述获取单元,还配置为获取包含第二待识别车辆的训练图像和待训练网络;
所述第一处理单元12,还配置为使用所述待训练网络对所述训练图像进行处理,得到包括所述第二待识别车辆的全局特征信息的第十四特征数据和包括所述第二待识别车辆的关键点特征信息的第十五特征数据;
第三处理单元15,配置为依据所述第十四特征数据和所述训练图像的标签,得到第一全局损失;
所述第三处理单元15,还配置为依据所述第十五特征数据和所述标签,得到第一关键点损失;
所述第三处理单元15,还配置为依据所述第一全局损失和所述第一关键点损失,得到所述待训练网络的总损失;
第四处理单元16,配置为基于所述总损失调整所述待训练网络的参数,得到所述车辆识别网络。
结合本公开任一实施方式,所述第一处理单元12,还配置为在所述依据所述第一全局损失和所述第一关键点损失,得到所述待训练网络的总损失之前,使用所述待训练网络对所述训练图像进行处理,得到包括所述第二待识别车辆的局部像素点区域的特征信息的第十六特征数据;
所述第三处理单元15,还配置为依据所述第十六特征数据和所述标签,得到第一局部像素点区域损失;
所述第三处理单元15,还配置为:依据所述第一全局损失、所述第一关键点损失和所述第一局部像素点区域损失,得到所述总损失。
结合本公开任一实施方式,所述第一处理单元12,配置为:
使用所述待训练网络对所述训练图像进行处理,得到至少一个第十七特征数据;所述第十七特征数据包括所述第二待识别车辆的关键点特征信息,且任意两个所述第十七特征数据所包括的特征信息属于不同的关键点;
从所述至少一个第十七特征数据中选取包括信息量最多的s个特征数据,得到s个第十八特征数据;所述s为不小于1的整数;
对所述s个第十八特征数据进行融合处理,得到所述第十五特征数据。
结合本公开任一实施方式,所述第三处理单元,还配置为在所述依据所述第一全局损失、所述第一关键点损失和所述第一局部像素点区域损失,得到所述总损失之前,依据所述s个第十八特征数据,得到所述第二待识别车辆的s个第一识别结果;
分别依据所述s个第一识别结果与所述标签之间的差异,得到关键点类别损失;
所述第四处理单元16,配置为:
依据所述第一全局损失、所述第一关键点损失、所述第一局部像素点区域损失和所述关键点类别损失,得到所述总损失。
结合本公开任一实施方式,所述第一处理单元12,配置为:
依据所包括的信息量对所述至少一个第十七特征数据进行排序,得到第一顺序;所述第一顺序为所包括的信息量从大到小的顺序,所述第一顺序或为所包括的信息量从小到大的顺序;
依据所述第一顺序从所述至少一个第十七特征数据中选取包括信息量最多的s个特征数据,得到所述s个第十八特征数据;
所述第三处理单元15,配置为在所述依据所述第一全局损失、所述第一关键点损失、所述第一局部像素点区域损失和所述关键点类别损失,得到所述总损失之前,依据所对应的所述关键点类别损失对所述s个第一识别结果进行排序,得到第二顺序;所述第二顺序为所述关键点类别损失从大到小的顺序,所述第二顺序或为所述关键点类别损失从小到大的顺序;
依据所述第一顺序和所述第二顺序之间的差异,得到关键点排序损失;
所述第四处理单元16,配置为:
依据所述第一全局损失、所述第一关键点损失、所述第一局部像素点区域损失、所述关键点类别损失和所述关键点排序损失,得到所述总损失。
结合本公开任一实施方式,所述第一处理单元12,配置为:
使用所述待训练网络对所述训练图像进行处理,得到至少一个第十九特征数据;所述第十九特征数据包括所述局部像素点区域的特征信息,且任意两个所述第十九特征数据所包括的特征信息属于不同的局部像素点区域;
从所述至少一个第十九特征数据中选取包括信息量最多的p个特征数据,得到p个第二十特征数据;所述p为不小于1的整数;
对所述p个第二十特征数据进行融合处理,得到所述第十六特征数据。
结合本公开任一实施方式,所述第三处理单元15,配置为在依据所述第一全局损失、所述第一关键点损失、所述第一局部像素点区域损失、所述关键点类别损失和所述关键点排序损失,得到所述总损失之前,依据所述p个第二十特征数据,得到所述第二待识别车辆的p个第二识别结果;
分别依据所述p个第二识别结果与所述标签之间的差异,得到局部像素点区域类别损失;
所述第四处理单元16,配置为:
依据所述第一全局损失、所述第一关键点损失、所述第一局部像素点区域损失、所述关键点类别损失、所述关键点排序损失和所述局部像素点区域类别损失,得到所述总损失。
结合本公开任一实施方式,所述第一处理单元12,配置为:
依据所包括的信息量对所述至少一个第十九特征数据进行排序,得到第三顺序;所述第三顺序为所包括的信息量从大到小的顺序,所述第三顺序或为所包括的信息量从小到大的顺序;
依据所述第三顺序从所述至少一个第十九特征数据中选取包括信息量最多的p个特征数据,得到所述p个第二十特征数据;
所述第三处理单元15,配置为在所述依据所述第一全局损失、所述第一关键点损失、所述第一局部像素点区域损失、所述关键点类别损失、所述关键点排序损失和所述局部像素点区域类别损失,得到所述总损失之前,依据所对应的所述局部像素点区域类别损失对所述p个第二识别结果进行排序,得到第四顺序;所述第四顺序为所述局部像素点区域类别损失从大到小的顺序,所述第四顺序或为所述局部像素点区域类别损失从小到大的顺序;
依据所述第三顺序和所述第四顺序之间的差异,得到局部像素点区域排序损失;
所述第四处理单元16,配置为:
依据所述第一全局损失、所述第一关键点损失、所述第一局部像素点区域损失、所述关键点类别损失、所述关键点排序损失、所述局部像素点区域类别损失和所述局部像素点区域排序损失,得到所述总损失。
结合本公开任一实施方式,所述第一全局损失包括全局焦点损失;所述第三处理单元15,配置为:
依据所述第十四特征数据,得到所述第二待识别车辆的第三识别结果;
依据所述第三识别结果和所述标签,得到所述第三识别结果的焦点损失,作为所述全局焦点损失。
结合本公开任一实施方式,所述训练图像属于训练图像集;所述训练图像集还包括所述训练图像的第一正样本图像和所述训练图像的第一负样本图像;所述第一全局损失还包括全局三元组损失;
所述第三处理单元15,还配置为:
使用所述待训练网络对所述第一正样本图像进行特征提取处理,得到所述第一正样本图像的特征数据;
使用所述待训练网络对所述第一负样本图像进行特征提取处理,得到所述第一负样本图像的特征数据;
依据所述第十二特征数据、所述第一正样本图像的特征数据和所述第一负样本图像的特征数据,得到所述全局三元组损失。
本实施例中,车辆识别装置通过对第一特征数据和第二特征数据进行融合处理,可得到既包括第一待识别车辆的全局特征信息又包括第一待识别车辆的局部特征信息的第三特征数据。将第三特征数据作为第一待识别车辆的特征数据,可丰富第一待识别车辆的特征数据所包括的信息。
在一些实施例中,本公开实施例提供的装置具有的功能或包含的模块可以用于执行上文方法实施例描述的方法,其具体实现可以参照上文方法实施例的描述,为了简洁,这里不再赘述。
图13为本公开实施例提供的一种车辆识别装置的硬件结构示意图。该车辆识别装置2包括处理器21,存储器22,输入装置23,输出装置24。该处理器21、存储器22、输入装置23和输出装置24通过连接器相耦合,该连接器包括各类接口、传输线或总线等等,本公开实施例对此不作限定。应当理解,本公开的各个实施例中,耦合是指通过特定方式的相互联系,包括直接相连或者通过其他设备间接相连,例如可以通过各类接口、传输线、总线等相连。
处理器21可以是一个或多个图形处理器(graphics processing unit,GPU),在处理器21是一个GPU的情况下,该GPU可以是单核GPU,也可以是多核GPU。在一些实施例中,处理器21可以是多个GPU构成的处理器组,多个处理器之间通过一个或多个总线彼此耦合。在一些实施例中,该处理器还可以为其他类型的处理器等等,本公开实施例不作限定。
存储器22可用于存储计算机程序指令,以及用于执行本公开方案的程序代码在内的各类计算机程序代码。可选地,存储器包括但不限于是随机存储记忆体(random access memory,RAM)、只读存储器(read-only memory,ROM)、可擦除可编程只读存储器(erasable programmable read only memory,EPROM)、或便携式只读存储器(compact disc read-only memory,CD-ROM),该存储器用于相关指令及数据。
输入装置23配置为输入数据和/或信号,以及输出装置24配置为输出数据和/或信号。输入装置23和输出装置24可以是独立的器件,也可以是一个整体的器件。
可理解,本公开实施例中,存储器22不仅可用于存储相关指令,还可用于存储相关数据,如该存储器22可用于存储通过输入装置23获取的待处理图像,又或者该存储器22还可用于存储通过处理器21得到的第三特征数据等等,本公开实施例对于该存储器中具体所存储的数据不作限定。
可以理解的是,图13仅仅示出了一种车辆识别装置的简化设计。在实际应用中,车辆识别装置还可以分别包含必要的其他元件,包含但不限于任意数量的输入/输出装置、处理器、存储器等,而所有可以实现本公开实施例的车辆识别装置都在本公开的保护范围之内。
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本公开的范围。
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统、装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。所属领域的技术人员还可以清楚地了解到,本公开各个实施例描述各有侧重,为描述的方便和简洁,相同或类似的部分在不同实施例中可能没有赘述,因此,在某一实施例未描述或未详细描述的部分可以参见其他实施例的记载。
在本公开所提供的几个实施例中,应该理解到,所揭露的系统、装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本公开各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。
在上述实施例中,可以全部或部分地通过软件、硬件、固件或者其任意组合来实现。当使用软件实现时,可以全部或部分地以计算机程序产品的形式实现。所述计算机程序产品包括一个或多个计算机指令。在计算机上加载和执行所述计算机程序指令时,全部或部分地产生按照本公开实施例所述的流程或功能。所述计算机可以是通用计算机、专用计算机、计算机网络、或者其他可编程装置。所述计算机指令可以存储在计算机可读存储介质中,或者通过所述计算机可读存储介质进行传输。所述计算机指令可以从一个网站站点、计算机、服务器或数据中心通过有线(例如同轴电缆、光纤、数字用户线(digital subscriber line,DSL))或无线(例如红外、无线、微波等)方式向另一个网站站点、计算机、服务器或数据中心进行传输。所述计算机可读存储介质可以是计算机能够存取的任何可用介质或者是包含一个或多个可用介质集成的服务器、数据中心等数据存储设备。所述可用介质可以是磁性介质,(例如,软盘、硬盘、磁带)、光介质(例如,数字通用光盘(digital versatile disc,DVD))、或者半导体介质(例如固态硬盘(solid state disk,SSD))等。
本领域普通技术人员可以理解实现上述实施例方法中的全部或部分流程,该流程可以由计算机程序来指令相关的硬件完成,该程序可存储于计算机可读取存储介质中,该程序在执行时,可包括如上述各方法实施例的流程。而前述 的存储介质包括:只读存储器(read-only memory,ROM)或随机存储存储器(random access memory,RAM)、磁碟或者光盘等各种可存储程序代码的介质。
Industrial Applicability
The present disclosure discloses a vehicle identification method and apparatus, an electronic device and a storage medium. The method includes: acquiring a to-be-processed image containing a first vehicle to be identified; performing first feature extraction processing on the to-be-processed image to obtain first feature data including local feature information of the first vehicle to be identified; performing second feature extraction processing on the to-be-processed image to obtain second feature data including global feature information of the first vehicle to be identified; and performing fusion processing on the first feature data and the second feature data to obtain third feature data of the first vehicle to be identified; the third feature data is applied to obtain an identification result of the first vehicle to be identified.

Claims (24)

  1. 一种车辆识别方法,所述方法包括:
    获取包含第一待识别车辆的待处理图像;
    对所述待处理图像进行第一特征提取处理,得到包括所述第一待识别车辆的局部特征信息的第一特征数据;
    对所述待处理图像进行第二特征提取处理,得到包括所述第一待识别车辆的全局特征信息的第二特征数据;
    对所述第一特征数据和所述第二特征数据进行融合处理,得到所述第一待识别车辆的第三特征数据;其中,所述第三特征数据用于获得所述第一待识别车辆的识别结果。
  2. 根据权利要求1所述的方法,所述局部特征信息包括关键点特征信息,所述第一特征数据包括所述待识别车辆的至少一个关键点的特征信息。
  3. 根据权利要求2所述的方法,所述局部特征信息还包括局部像素点区域特征信息,所述第一特征数据还包括所述待识别车辆的至少一个局部像素点区域的特征信息。
  4. 根据权利要求3所述的方法,所述对所述待处理图像进行第一特征提取处理,得到包括所述第一待识别车辆的局部特征信息的第一特征数据,包括:
    对所述待处理图像进行第三特征提取处理,得到第四特征数据;所述第四特征数据包括所述第一待识别车辆的至少一个关键点的特征信息;
    对所述待处理图像进行第四特征提取处理,得到第五特征数据;所述第五特征数据包括所述第一待识别车辆的至少一个局部像素点区域的特征信息;所述局部像素点区域属于所述第一待识别车辆所覆盖的像素点区域,且所述局部像素点区域的面积小于所述第一待识别车辆所覆盖的像素点区域的面积;
    对所述第四特征数据和第五特征数据进行融合处理,得到所述第一特征数据。
  5. 根据权利要求4所述的方法,所述对所述待处理图像进行第三特征提取处理,得到第四特征数据,包括:
    对所述待处理图像进行第五特征提取处理,得到至少一个第六特征数据;所述第六特征数据包括所述关键点的特征信息,且任意两个所述第六特征数据所包括的特征信息属于不同的关键点;
    从所述至少一个第六特征数据中选取包括信息量最多的k个特征数据,得到k个第七特征数据;所述k为不小于1的整数;
    依据所述k个第七特征数据得到所述第四特征数据。
  6. 根据权利要求5所述的方法,所述对所述待处理图像进行第五特征提取处理,得到至少一个第六特征数据,包括:
    对所述待处理图像进行第六特征提取处理,得到至少一张第一热力图;所述第一热力图包括所述关键点在所述待处理图像中的位置信息,且任意两张所述第一热力图所包括的信息属于不同的关键点;
    对所述待处理图像进行第七特征提取处理,得到所述待处理图像的第一特征图像;所述第一特征图像包括所述待处理图像中的关键点的特征信息;
    分别确定每张所述第一热力图与所述第一特征图像之间的点积,得到所述至少一个第六特征数据。
  7. The method according to claim 5 or 6, wherein selecting, from the at least one piece of sixth feature data, the k pieces of feature data that include the most information to obtain the k pieces of seventh feature data comprises:
    performing pooling processing on each piece of the sixth feature data to obtain at least one piece of eighth feature data;
    obtaining at least one first probability according to the amount of information included in the at least one piece of eighth feature data, the first probability characterizing the amount of information included in the sixth feature data, the first probabilities corresponding one-to-one to the pieces of sixth feature data;
    in a case where the first probability is positively correlated with the amount of information included in the sixth feature data, selecting the sixth feature data corresponding to the k largest first probabilities as the k pieces of seventh feature data; or,
    in a case where the first probability is negatively correlated with the amount of information included in the sixth feature data, selecting the sixth feature data corresponding to the k smallest first probabilities as the k pieces of seventh feature data (see the top-k selection sketch following the claims).
  8. The method according to any one of claims 3 to 7, wherein performing the fourth feature extraction processing on the image to be processed to obtain the fifth feature data comprises:
    performing tenth feature extraction processing on the image to be processed to obtain at least one piece of ninth feature data, the ninth feature data including feature information of a local pixel region, wherein the feature information included in any two pieces of the ninth feature data belongs to different local pixel regions;
    selecting, from the at least one piece of ninth feature data, the m pieces of feature data that include the most information to obtain m pieces of tenth feature data, m being an integer not less than 1;
    obtaining the fifth feature data according to the m pieces of tenth feature data.
  9. The method according to claim 8, wherein performing the tenth feature extraction processing on the image to be processed to obtain the at least one piece of ninth feature data comprises:
    performing eleventh feature extraction processing on the image to be processed to obtain at least one second heat map, the second heat map including position information of a local pixel region in the image to be processed, wherein the information included in any two of the second heat maps belongs to different local pixel regions;
    performing twelfth feature extraction processing on the image to be processed to obtain a second feature image of the image to be processed, the second feature image including feature information of the local pixel regions in the image to be processed;
    separately determining the dot product between each second heat map and the second feature image to obtain the at least one piece of ninth feature data.
  10. The method according to claim 8 or 9, wherein selecting, from the at least one piece of ninth feature data, the m pieces of feature data that include the most information to obtain the m pieces of tenth feature data comprises:
    performing pooling processing on each piece of the ninth feature data to obtain at least one piece of eleventh feature data;
    obtaining at least one second probability according to the amount of information included in the at least one piece of eleventh feature data, the second probability characterizing the amount of information included in the ninth feature data, the second probabilities corresponding one-to-one to the pieces of ninth feature data;
    in a case where the second probability is positively correlated with the amount of information included in the ninth feature data, selecting the ninth feature data corresponding to the m largest second probabilities as the m pieces of tenth feature data; or,
    in a case where the second probability is negatively correlated with the amount of information included in the ninth feature data, selecting the ninth feature data corresponding to the m smallest second probabilities as the m pieces of tenth feature data.
  11. The method according to any one of claims 8 to 10, wherein the at least one local pixel region includes a first pixel region and a second pixel region, the number of pieces of ninth feature data and m are both greater than 1, the m pieces of tenth feature data include twelfth feature data and thirteenth feature data, the twelfth feature data includes feature information of the first pixel region, and the thirteenth feature data includes feature information of the second pixel region;
    and obtaining the fifth feature data according to the m pieces of tenth feature data comprises:
    obtaining a first weight according to the amount of information included in the twelfth feature data, and obtaining a second weight according to the amount of information included in the thirteenth feature data, the first weight being positively correlated with the amount of information included in the twelfth feature data, and the second weight being positively correlated with the amount of information included in the thirteenth feature data;
    performing weighted fusion on the twelfth feature data and the thirteenth feature data according to the first weight and the second weight to obtain the fifth feature data (see the weighted-fusion sketch following the claims).
  12. The method according to any one of claims 1 to 11, wherein the vehicle identification method is applied to a vehicle identification network, and a training method of the vehicle identification network comprises:
    acquiring a training image containing a second vehicle to be identified and a network to be trained;
    processing the training image using the network to be trained to obtain fourteenth feature data including global feature information of the second vehicle to be identified and fifteenth feature data including key point feature information of the second vehicle to be identified;
    obtaining a first global loss according to the fourteenth feature data and a label of the training image;
    obtaining a first key point loss according to the fifteenth feature data and the label;
    obtaining a total loss of the network to be trained according to the first global loss and the first key point loss;
    adjusting parameters of the network to be trained based on the total loss to obtain the vehicle identification network.
  13. The method according to claim 12, wherein before obtaining the total loss of the network to be trained according to the first global loss and the first key point loss, the method further comprises:
    processing the training image using the network to be trained to obtain sixteenth feature data including feature information of a local pixel region of the second vehicle to be identified;
    obtaining a first local pixel region loss according to the sixteenth feature data and the label;
    and obtaining the total loss of the network to be trained according to the first global loss and the first key point loss comprises:
    obtaining the total loss according to the first global loss, the first key point loss, and the first local pixel region loss.
  14. The method according to claim 13, wherein processing the training image using the network to be trained to obtain the fifteenth feature data including the key point feature information of the second vehicle to be identified comprises:
    processing the training image using the network to be trained to obtain at least one piece of seventeenth feature data, the seventeenth feature data including key point feature information of the second vehicle to be identified, wherein the feature information included in any two pieces of the seventeenth feature data belongs to different key points;
    selecting, from the at least one piece of seventeenth feature data, the s pieces of feature data that include the most information to obtain s pieces of eighteenth feature data, s being an integer not less than 1;
    performing fusion processing on the s pieces of eighteenth feature data to obtain the fifteenth feature data.
  15. The method according to claim 14, wherein before obtaining the total loss according to the first global loss, the first key point loss, and the first local pixel region loss, the method further comprises:
    obtaining s first identification results of the second vehicle to be identified according to the s pieces of eighteenth feature data;
    obtaining a key point category loss according to the respective differences between the s first identification results and the label;
    and obtaining the total loss according to the first global loss, the first key point loss, and the first local pixel region loss comprises:
    obtaining the total loss according to the first global loss, the first key point loss, the first local pixel region loss, and the key point category loss.
  16. The method according to claim 14 or 15, wherein selecting, from the at least one piece of seventeenth feature data, the s pieces of feature data that include the most information to obtain the s pieces of eighteenth feature data comprises:
    sorting the at least one piece of seventeenth feature data by the amount of information included to obtain a first order, the first order being either a descending order or an ascending order of the amount of information included;
    selecting, according to the first order, the s pieces of feature data that include the most information from the at least one piece of seventeenth feature data to obtain the s pieces of eighteenth feature data;
    before obtaining the total loss according to the first global loss, the first key point loss, the first local pixel region loss, and the key point category loss, the method further comprises:
    sorting the s first identification results by their corresponding key point category losses to obtain a second order, the second order being either a descending order or an ascending order of the key point category loss;
    obtaining a key point ranking loss according to the difference between the first order and the second order (see the ranking-loss sketch following the claims);
    and obtaining the total loss according to the first global loss, the first key point loss, the first local pixel region loss, and the key point category loss comprises:
    obtaining the total loss according to the first global loss, the first key point loss, the first local pixel region loss, the key point category loss, and the key point ranking loss.
  17. The method according to claim 16, wherein processing the training image using the network to be trained to obtain the sixteenth feature data including the feature information of the local pixel region of the second vehicle to be identified comprises:
    processing the training image using the network to be trained to obtain at least one piece of nineteenth feature data, the nineteenth feature data including feature information of a local pixel region, wherein the feature information included in any two pieces of the nineteenth feature data belongs to different local pixel regions;
    selecting, from the at least one piece of nineteenth feature data, the p pieces of feature data that include the most information to obtain p pieces of twentieth feature data, p being an integer not less than 1;
    performing fusion processing on the p pieces of twentieth feature data to obtain the sixteenth feature data.
  18. The method according to claim 17, wherein before obtaining the total loss according to the first global loss, the first key point loss, the first local pixel region loss, the key point category loss, and the key point ranking loss, the method further comprises:
    obtaining p second identification results of the second vehicle to be identified according to the p pieces of twentieth feature data;
    obtaining a local pixel region category loss according to the respective differences between the p second identification results and the label;
    and obtaining the total loss according to the first global loss, the first key point loss, the first local pixel region loss, the key point category loss, and the key point ranking loss comprises:
    obtaining the total loss according to the first global loss, the first key point loss, the first local pixel region loss, the key point category loss, the key point ranking loss, and the local pixel region category loss.
  19. The method according to claim 17 or 18, wherein selecting, from the at least one piece of nineteenth feature data, the p pieces of feature data that include the most information to obtain the p pieces of twentieth feature data comprises:
    sorting the at least one piece of nineteenth feature data by the amount of information included to obtain a third order, the third order being either a descending order or an ascending order of the amount of information included;
    selecting, according to the third order, the p pieces of feature data that include the most information from the at least one piece of nineteenth feature data to obtain the p pieces of twentieth feature data;
    before obtaining the total loss according to the first global loss, the first key point loss, the first local pixel region loss, the key point category loss, the key point ranking loss, and the local pixel region category loss, the method further comprises:
    sorting the p second identification results by their corresponding local pixel region category losses to obtain a fourth order, the fourth order being either a descending order or an ascending order of the local pixel region category loss;
    obtaining a local pixel region ranking loss according to the difference between the third order and the fourth order;
    and obtaining the total loss according to the first global loss, the first key point loss, the first local pixel region loss, the key point category loss, the key point ranking loss, and the local pixel region category loss comprises:
    obtaining the total loss according to the first global loss, the first key point loss, the first local pixel region loss, the key point category loss, the key point ranking loss, the local pixel region category loss, and the local pixel region ranking loss.
  20. The method according to any one of claims 12 to 19, wherein the first global loss includes a global focal loss, and obtaining the first global loss according to the fourteenth feature data and the label of the training image comprises:
    obtaining a third identification result of the second vehicle to be identified according to the fourteenth feature data;
    obtaining, according to the third identification result and the label, a focal loss of the third identification result as the global focal loss (see the focal-loss sketch following the claims).
  21. The method according to claim 20, wherein the training image belongs to a training image set, the training image set further includes a first positive sample image of the training image and a first negative sample image of the training image, and the first global loss further includes a global triplet loss; the method further comprises:
    performing feature extraction processing on the first positive sample image using the network to be trained to obtain feature data of the first positive sample image;
    performing feature extraction processing on the first negative sample image using the network to be trained to obtain feature data of the first negative sample image;
    obtaining the global triplet loss according to the fourteenth feature data, the feature data of the first positive sample image, and the feature data of the first negative sample image (see the triplet-loss sketch following the claims).
  22. A vehicle identification apparatus, the apparatus comprising:
    an acquisition unit, configured to acquire an image to be processed that contains a first vehicle to be identified;
    a first processing unit, configured to perform first feature extraction processing on the image to be processed to obtain first feature data including local feature information of the first vehicle to be identified;
    a second processing unit, configured to perform second feature extraction processing on the image to be processed to obtain second feature data including global feature information of the first vehicle to be identified;
    a fusion processing unit, configured to perform fusion processing on the first feature data and the second feature data to obtain third feature data of the first vehicle to be identified, the third feature data being used to obtain an identification result of the first vehicle to be identified.
  23. An electronic device, comprising a processor and a memory, the memory being configured to store computer program code, the computer program code comprising computer instructions, wherein when the processor executes the computer instructions, the electronic device performs the method according to any one of claims 1 to 21.
  24. A computer-readable storage medium, storing a computer program, the computer program comprising program instructions that, when executed by a processor, cause the processor to perform the method according to any one of claims 1 to 21.
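The following sketches illustrate, in Python, one way the mechanisms named in the claims above could be realized. All tensor shapes, module structures, and hyperparameters are assumptions introduced for illustration, not the patented implementation. First, the heat-map dot product of claims 6 and 9: each key-point (or region) heat map acts as a spatial mask over a shared feature image, yielding one feature vector per key point.

```python
# Sketch of claims 6/9: one feature vector per key point, obtained as the
# dot product of a per-key-point heat map with a shared feature image.
# Shapes are assumptions: heatmaps (N, H, W), feature_image (C, H, W).
import torch

def keypoint_features(heatmaps: torch.Tensor, feature_image: torch.Tensor) -> torch.Tensor:
    # Broadcast each (H, W) heat map across the C channels, then sum over
    # the spatial dimensions -- a per-channel dot product.
    weighted = heatmaps.unsqueeze(1) * feature_image.unsqueeze(0)  # (N, C, H, W)
    return weighted.sum(dim=(2, 3))                                # (N, C)

feats = keypoint_features(torch.rand(20, 56, 56), torch.rand(256, 56, 56))
print(feats.shape)  # torch.Size([20, 256])
```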
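Claims 7 and 10 pool each candidate feature, map the pooled vector to a probability that tracks its information content, and keep the k (or m) best. A minimal sketch, assuming the probability comes from a small learned scoring head and is positively correlated with information content:

```python
# Sketch of claims 7/10: select the k most informative candidate features.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKSelector(nn.Module):
    def __init__(self, channels: int, k: int):
        super().__init__()
        self.k = k
        # Hypothetical scoring head; the claims only require a probability
        # correlated with the amount of information in each candidate.
        self.score = nn.Sequential(nn.Linear(channels, 1), nn.Sigmoid())

    def forward(self, sixth: torch.Tensor) -> torch.Tensor:
        # sixth: (N, C, H, W), the per-key-point "sixth feature data".
        eighth = F.adaptive_avg_pool2d(sixth, 1).flatten(1)  # pooling -> (N, C)
        probs = self.score(eighth).squeeze(-1)               # first probabilities
        _, idx = torch.topk(probs, self.k)                   # positive-correlation case
        return sixth[idx]                                    # k "seventh feature data"

selected = TopKSelector(channels=256, k=5)(torch.rand(20, 256, 56, 56))
```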
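Claim 11 fuses the selected region features with weights that grow with each feature's information content. A sketch, assuming the weights are a softmax over the same information scores used for selection:

```python
# Sketch of claim 11: information-weighted fusion of region features.
import torch

def weighted_fusion(features: torch.Tensor, info_scores: torch.Tensor) -> torch.Tensor:
    # features: (m, C); info_scores: (m,), higher score = more information.
    weights = torch.softmax(info_scores, dim=0)      # positively correlated weights
    return (weights.unsqueeze(1) * features).sum(0)  # fused "fifth feature data", (C,)

fused = weighted_fusion(torch.rand(2, 256), torch.tensor([1.3, 0.4]))
```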
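Claims 16 and 19 derive a ranking loss from the disagreement between two orderings: candidates sorted by information content versus identification results sorted by their category losses. The claims do not fix a formula; one plausible reading, sketched below, is a pairwise hinge that penalizes a candidate scored as more informative whenever its classification loss is nevertheless larger:

```python
# Hedged sketch of claims 16/19: pairwise ranking loss between the
# information-score order and the category-loss order.
import torch

def ranking_loss(info_scores: torch.Tensor, cls_losses: torch.Tensor,
                 margin: float = 0.0) -> torch.Tensor:
    # info_scores, cls_losses: shape (s,), aligned per key point / region.
    loss = info_scores.new_zeros(())
    pairs = 0
    s = info_scores.numel()
    for i in range(s):
        for j in range(s):
            if info_scores[i] > info_scores[j]:
                # A more informative candidate should have a smaller loss;
                # hinge on the violation.
                loss = loss + torch.clamp(cls_losses[i] - cls_losses[j] + margin, min=0)
                pairs += 1
    return loss / max(pairs, 1)
```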
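Claim 20's global focal loss is the standard focal loss of Lin et al., applied to the global identification result. A minimal sketch over classification logits and integer identity labels; the gamma and alpha values are the commonly used defaults, not values from the disclosure:

```python
# Sketch of claim 20: focal loss on the global identification result.
import torch
import torch.nn.functional as F

def focal_loss(logits: torch.Tensor, target: torch.Tensor,
               gamma: float = 2.0, alpha: float = 0.25) -> torch.Tensor:
    log_p = F.log_softmax(logits, dim=1)
    log_pt = log_p.gather(1, target.unsqueeze(1)).squeeze(1)  # log-prob of true class
    pt = log_pt.exp()
    # Down-weight easy, well-classified examples by (1 - pt) ** gamma.
    return (-alpha * (1.0 - pt) ** gamma * log_pt).mean()
```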
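Claim 21's global triplet loss pulls the training image's global feature toward the positive sample's feature and pushes it away from the negative sample's. This is the standard triplet margin loss (PyTorch also ships it as nn.TripletMarginLoss); the margin value below is an illustrative assumption:

```python
# Sketch of claim 21: global triplet loss over anchor/positive/negative features.
import torch
import torch.nn.functional as F

def triplet_loss(anchor: torch.Tensor, positive: torch.Tensor,
                 negative: torch.Tensor, margin: float = 0.3) -> torch.Tensor:
    d_ap = F.pairwise_distance(anchor, positive)  # anchor-positive distance
    d_an = F.pairwise_distance(anchor, negative)  # anchor-negative distance
    return F.relu(d_ap - d_an + margin).mean()    # max(0, d_ap - d_an + margin)
```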
PCT/CN2020/140315 2020-09-10 2020-12-28 Vehicle identification method and apparatus, electronic device, and storage medium WO2022052375A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
KR1020217042600A KR20220035335A (ko) 2020-09-10 2020-12-28 Vehicle identification method and apparatus, electronic device, and storage medium
JP2021575043A JP2023501028A (ja) 2020-09-10 2020-12-28 Vehicle identification method and apparatus, electronic device, and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010947349.1A CN112101183B (zh) 2020-09-10 2020-09-10 Vehicle identification method and apparatus, electronic device, and storage medium
CN202010947349.1 2020-09-10

Publications (1)

Publication Number Publication Date
WO2022052375A1 (zh)

Family

ID=73752542

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/140315 2020-09-10 2020-12-28 Vehicle identification method and apparatus, electronic device, and storage medium

Country Status (5)

Country Link
JP (1) JP2023501028A (zh)
KR (1) KR20220035335A (zh)
CN (2) CN112101183B (zh)
TW (1) TW202221567A (zh)
WO (1) WO2022052375A1 (zh)


Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112101183B * 2020-09-10 2021-08-24 深圳市商汤科技有限公司 Vehicle identification method and apparatus, electronic device, and storage medium
CN113569912A * 2021-06-28 2021-10-29 北京百度网讯科技有限公司 Vehicle identification method and apparatus, electronic device, and storage medium


Family Cites Families (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105913405B * 2016-04-05 2019-03-29 智车优行科技(北京)有限公司 Processing method and apparatus for presenting image details, and vehicle
US10423855B2 * 2017-03-09 2019-09-24 Entit Software Llc Color recognition through learned color clusters
CN108229353B * 2017-12-21 2020-09-22 深圳市商汤科技有限公司 Human body image classification method and apparatus, electronic device, storage medium, and program
CN108319907A * 2018-01-26 2018-07-24 腾讯科技(深圳)有限公司 Vehicle identification method, apparatus, and storage medium
CN108564119B * 2018-04-04 2020-06-05 华中科技大学 Method for generating pedestrian images in arbitrary poses
CN108960140B * 2018-07-04 2021-04-27 国家新闻出版广电总局广播科学研究院 Pedestrian re-identification method based on multi-region feature extraction and fusion
CN109063768B * 2018-08-01 2021-10-01 北京旷视科技有限公司 Vehicle re-identification method, apparatus, and system
CN109685023A * 2018-12-27 2019-04-26 深圳开立生物医疗科技股份有限公司 Facial key point detection method for ultrasound images and related apparatus
CN110689481A * 2019-01-17 2020-01-14 成都通甲优博科技有限责任公司 Vehicle type identification method and apparatus
CN110348463B * 2019-07-16 2021-08-24 北京百度网讯科技有限公司 Method and apparatus for identifying a vehicle
CN111126379B * 2019-11-22 2022-05-17 苏州浪潮智能科技有限公司 Target detection method and apparatus
CN111274954B * 2020-01-20 2022-03-15 河北工业大学 Real-time fall detection method for embedded platforms based on an improved pose estimation algorithm
CN111339846B * 2020-02-12 2022-08-12 深圳市商汤科技有限公司 Image recognition method and apparatus, electronic device, and storage medium
CN111340701B * 2020-02-24 2022-06-28 南京航空航天大学 Circuit board image stitching method using clustering-based matching point screening
CN111401265B * 2020-03-19 2020-12-25 重庆紫光华山智安科技有限公司 Pedestrian re-identification method and apparatus, electronic device, and computer-readable storage medium
CN111311532B * 2020-03-26 2022-11-11 深圳市商汤科技有限公司 Image processing method and apparatus, electronic device, and storage medium
CN111199550B * 2020-04-09 2020-08-11 腾讯科技(深圳)有限公司 Training method for image segmentation network, segmentation method, apparatus, and storage medium

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140270384A1 (en) * 2013-03-15 2014-09-18 Mitek Systems, Inc. Methods for mobile image capture of vehicle identification numbers
CN108229468A * 2017-06-28 2018-06-29 北京市商汤科技开发有限公司 Vehicle appearance feature identification and vehicle retrieval method, apparatus, storage medium, and electronic device
CN107862340A * 2017-11-16 2018-03-30 深圳市华尊科技股份有限公司 Vehicle model identification method and apparatus
CN110533119A * 2019-09-04 2019-12-03 北京迈格威科技有限公司 Logo identification method, model training method and apparatus therefor, and electronic system
CN112101183A * 2020-09-10 2020-12-18 深圳市商汤科技有限公司 Vehicle identification method and apparatus, electronic device, and storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117455957A * 2023-12-25 2024-01-26 山东未来网络研究院(紫金山实验室工业互联网创新应用基地) Deep-learning-based vehicle trajectory positioning and tracking method and system
CN117455957B * 2023-12-25 2024-04-02 山东未来网络研究院(紫金山实验室工业互联网创新应用基地) Deep-learning-based vehicle trajectory positioning and tracking method and system

Also Published As

Publication number Publication date
KR20220035335A (ko) 2022-03-22
TW202221567A (zh) 2022-06-01
CN112101183A (zh) 2020-12-18
CN112101183B (zh) 2021-08-24
JP2023501028A (ja) 2023-01-18
CN113780165A (zh) 2021-12-10

Similar Documents

Publication Publication Date Title
WO2022052375A1 Vehicle identification method and apparatus, electronic device, and storage medium
WO2020042489A1 Method, apparatus, and computer device for identifying illegal parking cases
WO2021203882A1 Pose detection and video processing method and apparatus, electronic device, and storage medium
WO2021051601A1 Method and system for selecting a detection box using Mask R-CNN, electronic apparatus, and storage medium
CN109034086B Vehicle re-identification method, apparatus, and system
CN111767831B Method, apparatus, device, and storage medium for processing images
CN111435446A LeNet-based license plate recognition method and apparatus
Salarian et al. A vision based system for traffic lights recognition
WO2023024790A1 Vehicle identification method and apparatus, electronic device, computer-readable storage medium, and computer program product
WO2023246921A1 Target attribute identification method, model training method, and apparatus
CN112733666A Hard-example image collection and model training method, device, and storage medium
Latha et al. Image understanding: semantic segmentation of graphics and text using faster-RCNN
CN117218622A Road condition detection method, electronic device, and storage medium
CN111709377B Feature extraction method, target re-identification method, apparatus, and electronic device
CN111178181B Traffic scene segmentation method and related apparatus
CN116071557A Long-tail target detection method, computer-readable storage medium, and driving device
US20220207879A1 Method for evaluating environment of a pedestrian passageway and electronic device using the same
CN114724128A License plate recognition method, apparatus, device, and medium
CN114882469A Traffic sign detection method and system based on a DL-SSD model
Wang et al. Cost effective and accurate vehicle make/model recognition method using YoloV5
CN111931680A Multi-scale vehicle re-identification method and system
CN116052220B Pedestrian re-identification method, apparatus, device, and medium
Balabid et al. Cell phone usage detection in roadway images: from plate recognition to violation classification
CN113505653B Target detection method, apparatus, device, medium, and program product
CN111988506B Supplementary lighting method and apparatus, electronic device, and storage medium

Legal Events

Date Code Title Description
ENP Entry into the national phase

Ref document number: 2021575043

Country of ref document: JP

Kind code of ref document: A

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20953160

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 522431534

Country of ref document: SA

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 05.07.2023)

122 Ep: pct application non-entry in european phase

Ref document number: 20953160

Country of ref document: EP

Kind code of ref document: A1