CN114332564A - Vehicle classification method, apparatus and storage medium

Info

Publication number
CN114332564A
CN114332564A
Authority
CN
China
Prior art keywords
classification
vehicle
small sample
feature
sample
Prior art date
Legal status
Pending
Application number
CN202111656401.9A
Other languages
Chinese (zh)
Inventor
马伟
章勇
毛晓蛟
赵妍珠
Current Assignee
Suzhou Keda Technology Co Ltd
Original Assignee
Suzhou Keda Technology Co Ltd
Priority date
Application filed by Suzhou Keda Technology Co Ltd
Priority to CN202111656401.9A
Publication of CN114332564A

Landscapes

  • Image Analysis (AREA)

Abstract

The application relates to a vehicle classification method, apparatus and storage medium, belonging to the technical field of image recognition. The method includes: inputting a target image into a pre-trained basic classification model to obtain the image features of the target image, the classification result of the vehicle in the target image, and the confidence of the classification result, where the classification result of each vehicle includes m levels of classification information; when the confidence is smaller than a preset threshold, obtaining sample vehicle images belonging to the first-level classification information to form a support set; inputting the support set into a pre-trained small sample classification model to execute a meta-test task, obtaining the small sample features of each subclass under the first-level classification information; and determining the final classification result of the vehicle based on the image features and the small sample features of the subclasses. This addresses the problem that a classification model trained on little data lacks robustness and produces erroneous vehicle classification results, and thereby improves the accuracy of vehicle classification.

Description

Vehicle classification method, apparatus and storage medium
[ technical field ]
The application relates to a vehicle classification method, equipment and a storage medium, and belongs to the technical field of image recognition.
[ background of the invention ]
With the development of Intelligent Transport System (ITS) technology, vehicle categories can be recognized automatically from images. The categories a user needs to identify can be varied; for example, identifying a vehicle's brand requires recognizing not only the main brand of the vehicle but also the sub-brands, model years, and the like under that brand.
A conventional vehicle classification method combines the various classification labels of a vehicle image into a single overall label and then trains a neural network model with the vehicle images and the overall labels. The trained classification model is then used to recognize the multiple classifications of a vehicle in an image simultaneously.
However, some current vehicle types have only a small number of available images, so the training data for training the neural network model is also limited. As a result, the trained classification model lacks robustness, and vehicle classification errors may in turn affect subsequent behavior judgment.
[ summary of the invention ]
The application provides a vehicle classification method, apparatus and storage medium, which can solve the problem that, because the training data for training a classification model is scarce, the robustness of the classification model is insufficient and the vehicle classification result is prone to errors. In the method, the features of a basic classification model and the features of a small sample classification model are fused, and an end-to-end vehicle classification scheme is designed to improve the overall accuracy of vehicle type classification. The application provides the following technical solutions:
in a first aspect, a vehicle classification method is provided, the method comprising:
acquiring a target image;
inputting the target image into a pre-trained basic classification model to obtain the image characteristics of the target image, the classification result of the vehicles in the target image and the confidence coefficient of the classification result, wherein the classification result of each vehicle comprises m-level classification information, and m is a positive integer greater than 1; the basic classification model is obtained by training with a first training set, wherein the first training set comprises a sample vehicle image and m classes of classification labels of each vehicle in the sample vehicle image;
under the condition that the confidence coefficient is smaller than a preset threshold value, obtaining a sample vehicle image belonging to the first-level classification information in the classification result from a second training set to obtain a support set; the second training set comprises vehicle sample images corresponding to the first-level classification information and subclass labels of each vehicle sample image under the corresponding first-level classification information;
inputting the support set into a pre-trained small sample classification model to execute a meta-test task, and obtaining small sample characteristics of each subclass under the first-level classification information and an updated small sample classification model; the small sample classification model is obtained by training on a meta-training task constructed from a third training set;
and determining a final classification result of the vehicle based on the image features and the small sample features of the subclasses.
Optionally, the determining a final classification result of the vehicle based on the image features and the small sample features of the respective subclasses includes:
fusing the image features with the small sample features of the subclasses respectively to obtain fused features of the subclasses;
inputting the target image into the updated small sample classification model to obtain the small sample characteristics of the target image;
and determining a final classification result of the vehicle based on the similarity between the small sample feature of the target image and the fused feature of each subclass.
Optionally, the determining a final classification result of the vehicle based on the similarity between the small sample feature of the target image and the fused features of the respective sub-classes includes:
determining a fused feature having the highest similarity with the small sample feature from the fused features of the respective subclasses when the similarity between the small sample feature of the target image and the fused feature of the respective subclasses is greater than a similarity threshold;
and taking the subclass corresponding to the fused feature with the highest similarity and the first-level classification information as the final classification result.
Optionally, the third training set comprises a plurality of sample vehicle images and at least two classification results for each sample vehicle image;
the process of training the small sample classification model by using the meta-training task constructed by the third training set comprises the following steps:
randomly determining N classification results from the at least two classification results; n is a positive integer;
for each classification result, extracting K sample vehicle images from the sample vehicle images corresponding to the classification result as a support set, and extracting P sample vehicle images from the rest sample vehicle images corresponding to the classification result as a query set to obtain the meta-training task; k and P are positive integers;
and performing iterative learning on the meta-training task by using a pre-established neural network model until the learned neural network model converges to obtain the small sample classification model.
Optionally, the inputting the support set into a pre-trained small sample classification model to execute a meta-test task to obtain small sample features of each subclass under the first-level classification information and an updated small sample classification model includes:
predicting the support set by using the pre-trained small sample classification model to obtain a predicted value;
comparing the predicted value with the subclass label corresponding to the support set to obtain a predicted loss value;
updating parameters of the pre-trained small sample classification model based on the predicted loss value to obtain the updated small sample classification model;
and performing small sample feature extraction on the support set by using the updated small sample classification model to obtain small sample features of each subclass under the first-level classification information.
Optionally, the basic classification model includes a feature extraction network, a first branch network, a second branch network, and a third branch network respectively connected to the feature extraction network, and a fusion layer connected to the first branch network, the second branch network, and the third branch network;
the feature extraction network is used for extracting features of the target image to obtain a feature map;
the first branch network is used for directly inputting the feature map output by the feature extraction network into the fusion layer;
the second branch network is used for extracting features of the feature maps of different channels and then cascading the feature maps to obtain cascaded feature maps;
the third branch network is used for giving weight information to different channels of the feature map according to the pre-learned channel weight to obtain an updated feature map;
the fusion layer is used for carrying out feature fusion on the feature map, the cascaded feature map and the updated feature map to obtain the image features.
Optionally, the second branch network includes a global average feature extraction layer, a channel layer, and a channel matching layer, which are connected in sequence;
the global average feature extraction layer is used for extracting global average features of the feature map;
the channel layer is used for respectively extracting different channel characteristics and cascading the extracted characteristics of different channels;
the channel matching layer is configured to adjust the number of channels of the cascaded feature, so that the adjusted number of channels matches the number of channels of the first branch network and the number of channels of the third branch network.
Optionally, the third branch network includes a maximum pooling layer, a full-link layer, an activation function layer, and a channel matching layer, which are connected in sequence;
the maximum pooling layer is used for reducing the size of the feature map;
the full connection layer is used for arranging the reduced feature maps according to the number of channels;
the activation function layer is used for giving weight information to the feature maps of different channels obtained after arrangement;
the channel matching layer is configured to adjust the number of channels of the weight-adjusted feature, so that the adjusted number of channels matches the number of channels of the first branch network and the number of channels of the second branch network.
In a second aspect, an electronic device is provided, the device comprising a processor and a memory; the memory has stored therein a program that is loaded and executed by the processor to implement the vehicle classification method provided by the first aspect.
In a third aspect, a computer-readable storage medium is provided, in which a program is stored, which, when being executed by a processor, is adapted to carry out the vehicle classification method provided in the first aspect.
The beneficial effects of this application include at least the following. A target image is input into a pre-trained basic classification model to obtain the image features of the target image, the classification result of the vehicles in the target image, and the confidence of the classification result, where the classification result of each vehicle includes m levels of classification information. When the confidence is smaller than a preset threshold, sample vehicle images belonging to the first-level classification information in the classification result are obtained from the second training set to form a support set. The support set is input into the pre-trained small sample classification model to execute a meta-test task, yielding the small sample features of each subclass under the first-level classification information and an updated small sample classification model. The final classification result of the vehicle is then determined based on the image features and the small sample features of the subclasses. This solves the problem that the robustness of a classification model is insufficient and the vehicle classification result is error-prone when the training data for training the classification model is scarce: when the classification of the basic classification model is inaccurate, small sample features are further extracted in a meta-learning manner, and since the small sample features extracted by meta-learning are accurate, determining the final classification result by combining the features of the basic classification model and the features of the small sample classification model improves the accuracy of vehicle classification.
In addition, the vehicle type classification is determined by fusing the characteristics of the basic classification model and the characteristics of the small sample classification model, the classification results of the two models can be combined for determination during vehicle classification, and the accuracy of vehicle classification can be further improved.
When the similarity between the small sample feature of the target image and the fused feature of each subclass is greater than the similarity threshold, the subclass corresponding to the fused feature with the highest similarity and the first-level classification information are used as a final classification result; the problem that an erroneous classification result is output when the similarity between all the fused features and the small sample features of the target image is small can be solved; the classification result corresponding to the fused features with the highest similarity and larger than the similarity threshold can be output, and the accuracy of vehicle classification is improved.
In addition, by designing a basic classification network comprising three network branches, original input information can be kept, meanwhile, richer fine-grained characteristics can be obtained, meanwhile, weight information is given to each characteristic channel, and therefore the accuracy of the vehicle classification model is improved.
The foregoing description is only an overview of the technical solutions of the present application. In order to make the technical solutions of the present application clearer and to implement them according to the contents of the description, a detailed description is given below with reference to the preferred embodiments of the present application and the accompanying drawings.
[ description of the drawings ]
FIG. 1 is a flow chart of a vehicle classification method provided by one embodiment of the present application;
FIG. 2 is a schematic diagram of a base classification model provided by an embodiment of the present application;
FIG. 3 is a distribution diagram of vehicle data for different classification results provided by one embodiment of the present application;
FIG. 4 is a schematic illustration of a vehicle classification process provided by one embodiment of the present application;
FIG. 5 is a block diagram of a vehicle classification device provided in an embodiment of the present application;
fig. 6 is a block diagram of an electronic device provided by an embodiment of the application.
[ detailed description of the embodiments ]
The following detailed description of embodiments of the present application will be made with reference to the accompanying drawings and examples. The following examples are intended to illustrate the present application but are not intended to limit the scope of the present application.
First, several terms referred to in the present application will be described.
Meta Learning (Meta Learning): using previously learned tasks to help learn a new task. Meta-training is therefore needed to learn prior knowledge from previous tasks (in this application, the pre-trained small sample classification model), and this prior knowledge is then used to help learn a new meta-test task.
N-Way K-Shot classification: N-Way means N classes, and K-Shot means K samples per class; it refers to constructing classification tasks with a small number of samples. It is mainly applied when sample data is insufficient in the field of Few-Shot Learning, such as in meta-learning.
Optionally, the vehicle classification method provided in each embodiment is described using an electronic device as an example. The electronic device is a terminal or a server; the terminal may be a mobile phone, a computer, a tablet computer, a scanner, an electronic eye, a monitoring camera, and the like, and this embodiment does not limit the type of the electronic device.
Fig. 1 is a flowchart of a vehicle classification method according to an embodiment of the present application, the method at least includes the following steps:
step 101, acquiring a target image.
The target image refers to an image to be subjected to vehicle classification, and the target image may be an image acquired from a vehicle driving environment or a frame image in a video stream obtained by shooting a vehicle driving scene, and the source of the target image is not limited in this embodiment.
Alternatively, the target image may or may not include an image of the vehicle; in the case of including an image of a vehicle, the target image may include images of a plurality of vehicles, or include an image of one vehicle.
In the present application, the vehicle may be an automobile, a bicycle, or an electric vehicle, and the present embodiment does not limit the type of the vehicle.
Step 102, inputting the target image into a pre-trained basic classification model to obtain the image characteristics of the target image, the classification result of the vehicle in the target image and the confidence coefficient of the classification result.
Wherein the classification result of each vehicle comprises m levels of classification information, and m is a positive integer greater than 1. Each piece of first-level classification information has at least one subclass at each of the remaining m-1 levels. In other words, for each piece of first-level classification information, the classification information of levels 2 to m under it constitutes its subclasses.
For example, the classification result of a vehicle includes three levels of classification information, where the first-level classification information is the large brand of the vehicle, the second-level classification information is the small brand (sub-brand), and the third-level classification information is the model year.
The categories of large brands include: a, B and C;
the categories of small brands include: x1 series and X2 series under A, E series and V series under B, A series and B series under C;
the classification of the annual fee includes: 2012 and 2013 of small brand X1, 2020 of small brand X2, 2010E 260L and 2010E 300L of small brand E series, 2018 and 2017 of small brand V series, 2015 and 2016 of small brand a series, 2019, 200T and 2019, 280T of small brand B series.
The above vehicle classification results are only illustrative; in actual implementation, the vehicle classification results may also include other classifications, such as the vehicle color, the vehicle type, and so on, which are not limited in this embodiment. A data-structure sketch of the example hierarchy is given below.
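For illustration only, the three-level label hierarchy in the example above could be represented as a nested mapping. The following Python sketch uses the hypothetical brand names from the example and is not part of the claimed method.
```python
# Illustrative data structure (an assumption, not part of the claimed method):
# a nested mapping of the three-level hierarchy described above,
# large brand -> small brand (sub-brand) -> model year.
label_hierarchy = {
    "A": {"X1 series": ["2012", "2013"], "X2 series": ["2020"]},
    "B": {"E series": ["2010 E260L", "2010 E300L"], "V series": ["2017", "2018"]},
    "C": {"A series": ["2015", "2016"], "B series": ["2019 200T", "2019 280T"]},
}

# A full three-level classification result for one vehicle is then a path
# through this hierarchy, e.g. ("B", "E series", "2010 E260L").
```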
The base classification model is trained using a first training set that includes the sample vehicle images and m class classification labels for each vehicle in the sample vehicle images.
In this embodiment, the number of sample vehicle images corresponding to each m-level classification label in the first training set is greater than a preset number threshold. The preset number threshold may be 50, 60, or the like; its value is not limited in this embodiment.
Illustratively, the acquisition process of the first training set includes: obtaining a plurality of sample vehicle images and classifying them to obtain the m-level classification labels of each sample vehicle image; then, the classification results whose number of sample vehicle images exceeds the preset number threshold are taken as large sample classes to obtain the first training set.
The base classification model is trained using a first training set. In other words, the base classification model is trained using a large sample class.
Optionally, the basic classification model is obtained by improving a conventional classification model. Specifically, the basic classification model is designed to improve the feature expression of the large-sample vehicle type classification model, and mainly relies on an improved residual module and a dedicated attention module to acquire finer-grained features.
Referring to fig. 2, the basic classification model includes a feature extraction network 21, a first branch network 22, a second branch network 23, and a third branch network 24 connected to the feature extraction network 21, respectively, and a fusion layer 25 connected to the first branch network 22, the second branch network 23, and the third branch network 24.
The feature extraction network is used for extracting features of the target image to obtain a feature map. Alternatively, the feature extraction network may be referred to as a backbone network, and the feature extraction network may be composed of a plurality of residual modules.
The first branch network is used for directly inputting the feature graph output by the feature extraction network into the fusion layer. The first branch network, which may also be referred to as the original branch network, connects the feature extraction network to the fusion layer, so that the integrity of the feature map may be maintained.
The second branch network is used for extracting features of the feature maps of different channels and then cascading the feature maps to obtain cascaded feature maps. The second branch network may also be referred to as a multi-feature network. Referring to fig. 2, the second branch network includes a global average feature extraction layer, a channel layer, and a channel matching layer, which are connected in sequence.
The global average feature extraction layer is used for extracting global average features of the feature map. Illustratively, the global feature extraction layer includes a 1 × 1 convolutional layer and an Average pooling (Average pool) layer connected to the 1 × 1 convolutional layer.
The channel layer is used for respectively extracting different channel characteristics and cascading the extracted characteristics of different channels. Illustratively, the channel layer includes a plurality of convolutional layers respectively connected to the global average feature extraction layer, and the number of convolution kernels in different convolutional layers is different, such as the different numbers of convolution kernels represented by × 1, × 2, × 3, and × 4 in fig. 2; after the features of the corresponding channels are extracted from the different convolutional layers, the features of the channels are cascaded, that is, y1, y2, y3 and y4 in fig. 2 are cascaded to obtain the cascaded features. Therefore, the second branch network can finally obtain richer fine-grained characteristics, and the representation of vehicle type classification on the fine-grained characteristics, such as vehicle logos, vehicle body colors and the like, can be effectively improved through the second branch network.
The channel matching layer is used for adjusting the number of the channels of the cascaded features so as to enable the adjusted number of the channels to be matched with the number of the channels of the first branch network and the number of the channels of the third branch network. Illustratively, the channel matching layer is implemented by a 1 × 1 convolutional layer.
And the third branch network is used for giving weight information to different channels of the feature map according to the pre-learned channel weight to obtain an updated feature map. Referring to fig. 2, the third branch network includes a max pool (max pool) layer, a full connection layer, an activation function (sigmoid) layer, and a channel matching layer, which are sequentially connected.
Wherein the maximum pooling layer is used to reduce the size of the feature map, and the full connection layer is used to arrange the reduced feature map according to the number of channels.
And the activation function layer is used for giving weight information to the feature maps of the different channels obtained after arrangement. The activation function layer learns the weight expression for each channel in advance.
The channel matching layer is used for adjusting the channel number of the weight-adjusted feature so as to enable the adjusted channel number to be matched with the channel number of the first branch network and the channel number of the second branch network. Illustratively, the channel matching layer is implemented by a 1 × 1 convolutional layer.
The fusion layer is used for carrying out feature fusion on the feature map, the cascaded feature map and the updated feature map to obtain image features.
According to the above content, the basic classification model provided by the embodiment can obtain richer fine-grained features while maintaining the original input information, and meanwhile, weight information is given to each feature channel, so that the accuracy of the vehicle classification model is improved.
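As a concrete illustration of the three-branch structure described above, the following PyTorch sketch wires a small residual backbone to a pass-through branch, a multi-kernel branch and a channel-attention branch ahead of a fusion layer and a softmax head. The backbone depth, channel counts, kernel sizes, the use of concatenation followed by a 1x1 convolution as the fusion operation, and the returned (image features, prediction, confidence) interface are assumptions made for illustration; the patent does not fix these hyperparameters.
```python
# Hedged sketch of the three-branch base classification model; all sizes are assumptions.
import torch
import torch.nn as nn


class ResidualBlock(nn.Module):
    """Simplified residual module standing in for the backbone's residual modules."""
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.BatchNorm2d(channels), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.BatchNorm2d(channels),
        )
        self.relu = nn.ReLU()

    def forward(self, x):
        return self.relu(x + self.body(x))


class BaseClassifier(nn.Module):
    def __init__(self, channels=64, num_classes=100):
        super().__init__()
        # Feature extraction network (backbone of residual modules).
        self.backbone = nn.Sequential(
            nn.Conv2d(3, channels, 3, stride=2, padding=1),
            ResidualBlock(channels), ResidualBlock(channels),
        )
        # Second branch: global-average feature extraction, several convolutions whose
        # outputs are concatenated, then a 1x1 channel-matching convolution.
        self.avg = nn.Sequential(nn.Conv2d(channels, channels, 1),
                                 nn.AvgPool2d(3, stride=1, padding=1))
        self.multi = nn.ModuleList(
            [nn.Conv2d(channels, channels, k, padding=k // 2) for k in (1, 3, 5, 7)])
        self.match2 = nn.Conv2d(4 * channels, channels, 1)
        # Third branch: max pooling, fully connected layer, sigmoid per-channel weights,
        # then a 1x1 channel-matching convolution.
        self.fc = nn.Linear(channels, channels)
        self.match3 = nn.Conv2d(channels, channels, 1)
        # Fusion layer and classification head.
        self.fuse = nn.Conv2d(3 * channels, channels, 1)
        self.head = nn.Linear(channels, num_classes)

    def forward(self, x):
        fmap = self.backbone(x)                       # feature map from the backbone
        b1 = fmap                                     # first branch: pass-through
        g = self.avg(fmap)
        b2 = self.match2(torch.cat([conv(g) for conv in self.multi], dim=1))
        w = torch.sigmoid(self.fc(torch.amax(fmap, dim=(2, 3))))   # learned channel weights
        b3 = self.match3(fmap * w[:, :, None, None])
        feat = self.fuse(torch.cat([b1, b2, b3], dim=1)).mean(dim=(2, 3))  # image features
        logits = self.head(feat)
        conf, pred = torch.softmax(logits, dim=1).max(dim=1)        # confidence and class
        return feat, pred, conf
```
In this sketch, the second branch concatenates the outputs of several convolutions and restores the channel count with a 1x1 convolution, while the third branch multiplies the feature map by learned per-channel weights before its own 1x1 matching convolution, mirroring the roles described for the channel layer and the activation function layer.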
Optionally, since the basic classification model in this step mainly fits the real distribution of vehicle classification based on large sample data, it performs well on large sample data during testing. In practice, however, the vehicle data follow the distribution shown in fig. 3, which is a clearly long-tailed distribution; such a distribution inevitably leads to poor classification of the tail data, i.e., the small sample data.
Based on this, in this embodiment, large sample data and small sample data are further distinguished by the confidence (or standard deviation) output by the softmax layer of the basic classification model. If the confidence is smaller than a preset threshold (e.g., 0.9), the classification error is large, the input target image is a small sample, and step 103 is executed. If the confidence is greater than or equal to the preset threshold (e.g., 0.9), the classification error is small, the input target image is a large sample, and the classification result is output directly. In other words, when the confidence is greater than or equal to the preset threshold, the classification result output by the basic classification model is taken as the final classification result of the vehicle.
In this embodiment, the preset threshold is taken as 0.9 for example, and in actual implementation, the preset threshold may also be other values, such as: 0.85, etc., and the value of the preset threshold is not limited in this embodiment.
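The routing decided by this threshold can be sketched as follows. The threshold value, the function names classify_vehicle and few_shot_branch, and the assumption that the basic classification model returns image features, a classification result and a confidence are illustrative only.
```python
# Hedged sketch of the confidence-gated routing described above; base_model and
# few_shot_branch are hypothetical callables standing in for the two models.
CONF_THRESHOLD = 0.9  # example value of the preset threshold

def classify_vehicle(image, base_model, few_shot_branch):
    image_feat, result, confidence = base_model(image)
    if confidence >= CONF_THRESHOLD:
        # Large-sample case: output the base model's classification result directly.
        return result
    # Small-sample case: run the meta-test branch, restricted to the sub-classes
    # under the first-level classification information predicted by the base model.
    return few_shot_branch(image, image_feat, first_level=result)
```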
Step 103, under the condition that the confidence coefficient is smaller than a preset threshold value, obtaining sample vehicle images belonging to the first-level classification information in the classification result from a second training set to obtain a support set; the second training set comprises vehicle sample images corresponding to the first-level classification information and subclass labels of each vehicle sample image under the corresponding first-level classification information.
Because the first-level classification information output by the basic classification network is generally accurate, in order to reduce the computation of the small sample classification model and improve its accuracy, in this embodiment the sample vehicle images of the subclasses under the first-level classification information output by the basic classification model are used as the support set.
The type of each subclass label under the first-level classification information in the second training set corresponds to the type of the labels used when training the basic classification model.
Step 104, inputting the support set into a pre-trained small sample classification model to execute a meta-test task, and obtaining the small sample features of each subclass under the first-level classification information and an updated small sample classification model; the small sample classification model is obtained by training on meta-training tasks constructed from a third training set.
Since there is little data for small-sample vehicle types, this embodiment also designs a small sample classification method for vehicle types. Small sample classification mainly involves collecting a large amount of multi-category data and training the small sample classifier through tasks composed in the N-way K-shot form of meta-learning. This form is handled mainly at the data end of the classification model: N categories and K training samples per category are selected randomly, and training on tasks composed in this way adapts the classification model to the small-sample data form; accordingly, at verification time, data composed in the same form is input to extract more suitable features.
Specifically, the third training set comprises a plurality of sample vehicle images and at least two classification results of each sample vehicle image; the process of training the small sample classification model by using the meta-training task constructed by the third training set comprises the following steps: randomly determining N classification results from at least two classification results; for each classification result, extracting K sample vehicle images from the sample vehicle images corresponding to the classification result as a support set, and extracting P sample vehicle images from the rest sample vehicle images corresponding to the classification result as a query set to obtain a meta-training task; and performing iterative learning on the meta-training task by using a pre-established neural network model until the learned neural network model converges to obtain a small sample classification model.
N, K and P are positive integers. In general, the value of K is small, such as 5 or 10; in other words, K is far smaller than the number of samples used when training with large sample data.
The types of the classification results in the third training set may be the same as or different from those in the second training set. For example, if the classification results in the third training set are of the form large brand - small brand - vehicle type - model year, while those in the second training set are of the form large brand - vehicle type - model year, then the types of the classification results in the two sets differ.
There are multiple meta-training tasks; they are divided into several batches and input in turn into the pre-created neural network model for iterative learning. Each meta-training task includes N classification results, and each classification result corresponds to a support set and a query set.
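Assuming the third training set is organized as a mapping from each classification result to its list of sample vehicle images, one meta-training task in the N-way K-shot form described above could be constructed as in the following sketch; the function name build_meta_training_task and its default parameter values are illustrative.
```python
import random

def build_meta_training_task(dataset, n_way=5, k_shot=5, p_query=5):
    """dataset: dict mapping a classification result to a list of sample vehicle images.

    Returns one meta-training task: a support set of K images and a query set of
    P images for each of N randomly chosen classification results.
    """
    chosen = random.sample(list(dataset.keys()), n_way)       # N classification results
    support, query = [], []
    for label in chosen:
        images = random.sample(dataset[label], k_shot + p_query)
        support += [(img, label) for img in images[:k_shot]]  # K support samples
        query += [(img, label) for img in images[k_shot:]]    # P query samples
    return support, query
```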
Optionally, the pre-created neural network model may be a mobilenet small model (i.e. lightweight CNN), or may also be other lightweight networks, and the present embodiment does not limit the type of the neural network model.
Specifically, inputting the support set into the pre-trained small sample classification model to execute a meta-test task, so as to obtain the small sample features of each subclass under the first-level classification information and an updated small sample classification model, includes: predicting on the support set by using the pre-trained small sample classification model to obtain predicted values; comparing the predicted values with the subclass labels corresponding to the support set to obtain a prediction loss value; updating the parameters of the pre-trained small sample classification model based on the prediction loss value to obtain the updated small sample classification model; and performing small sample feature extraction on the support set by using the updated small sample classification model to obtain the small sample features of each subclass under the first-level classification information.
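This meta-test step could look like the following PyTorch sketch, under the assumption that the small sample classification model returns both class logits and a feature vector for each image; the cross-entropy loss, the SGD optimizer, the single update step and the per-subclass averaging of support features are illustrative choices rather than requirements of the method.
```python
import torch
import torch.nn.functional as F

def run_meta_test(model, support_images, support_labels, lr=1e-3, steps=1):
    """support_images: tensor of support-set images; support_labels: sub-class indices."""
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(steps):
        logits, _ = model(support_images)                 # predicted values on the support set
        loss = F.cross_entropy(logits, support_labels)    # prediction loss vs. sub-class labels
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()                                  # parameters updated -> updated model
    with torch.no_grad():
        _, feats = model(support_images)                  # small sample features of the support set
        # One small sample feature per sub-class: average that sub-class's support features
        # (an assumed aggregation step, in the spirit of prototype-based few-shot methods).
        subclass_feats = {int(c): feats[support_labels == c].mean(dim=0)
                          for c in support_labels.unique()}
    return model, subclass_feats
```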
Step 105, determining the final classification result of the vehicle based on the image features and the small sample features of each subclass.
Optionally, determining a final classification result of the vehicle based on the image features and the small sample features of the respective subclasses comprises: fusing the image characteristics with the small sample characteristics of each subclass respectively to obtain fused characteristics of each subclass; inputting the target image into the updated small sample classification model to obtain the small sample characteristics of the target image; and determining the final classification result of the vehicle based on the similarity between the small sample feature of the target image and the fused features of the subclasses.
The similarity between the small sample feature and the fused feature of each subclass can be represented by the Euclidean distance between the features; the Euclidean distance is negatively correlated with the similarity, i.e., a smaller distance means a higher similarity.
In one example, determining a final classification result of the vehicle based on the similarity between the small sample feature of the target image and the fused features of the respective sub-classes includes: determining a fused feature with the highest similarity to the small sample feature from the fused features of the subclasses under the condition that the similarity between the small sample feature of the target image and the fused features of the subclasses is greater than a similarity threshold; and taking the subclass corresponding to the fused features with the highest similarity and the first-level classification information as final classification results.
Optionally, when the similarity between the small sample feature of the target image and the fused feature of each sub-class is less than or equal to the similarity threshold, outputting the first-level classification information, or outputting a classification failure prompt.
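The decision rule above can be sketched as follows, with two assumptions stated up front: element-wise averaging is used as the fusion operation (so the image features and small sample features are assumed to have the same dimension), and the similarity threshold is expressed as a Euclidean-distance threshold, the distance being negatively correlated with similarity.
```python
import torch

def final_classification(image_feat, target_feat, subclass_feats, first_level,
                         dist_threshold=10.0):
    """subclass_feats: dict mapping each sub-class to its small sample feature."""
    best_subclass, best_dist = None, float("inf")
    for subclass, feat in subclass_feats.items():
        fused = 0.5 * (image_feat + feat)             # assumed fusion: element-wise average
        dist = torch.dist(target_feat, fused).item()  # Euclidean distance to the fused feature
        if dist < best_dist:
            best_subclass, best_dist = subclass, dist
    if best_dist > dist_threshold:
        # No fused feature is similar enough: output only the first-level information.
        return first_level, None
    return first_level, best_subclass
```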
Referring to the example of the vehicle classification process shown in fig. 4: after the target image is input into the basic classification model, the classification result and the confidence of the classification result are obtained. If the confidence is greater than or equal to 0.9, the classification result is output. If the confidence is less than 0.9, the sample vehicle images corresponding to each subclass under the first-level classification information in the classification result (taking the vehicle logo as an example in fig. 3) are obtained to form a support set; the support set and the image features obtained by the basic classification model are then input into the pre-trained small sample classification model for feature fusion, so as to obtain the final classification result.
In summary, in the vehicle classification method provided in this embodiment, the target image is input into a pre-trained basic classification model to obtain the image features of the target image, the classification result of the vehicles in the target image, and the confidence of the classification result, where the classification result of each vehicle includes m levels of classification information. When the confidence is smaller than a preset threshold, sample vehicle images belonging to the first-level classification information in the classification result are obtained from the second training set to form a support set. The support set is input into the pre-trained small sample classification model to execute a meta-test task, yielding the small sample features of each subclass under the first-level classification information and an updated small sample classification model. The final classification result of the vehicle is then determined based on the image features and the small sample features of the subclasses. This solves the problem that the robustness of a classification model is insufficient and the vehicle classification result is error-prone when the training data for training the classification model is scarce: when the classification of the basic classification model is inaccurate, small sample features are further extracted in a meta-learning manner, and since the small sample features extracted by meta-learning are accurate, determining the final classification result by combining the features of the basic classification model and the features of the small sample classification model improves the accuracy of vehicle classification.
In addition, the vehicle type classification is determined by fusing the characteristics of the basic classification model and the characteristics of the small sample classification model, the classification results of the two models can be combined for determination during vehicle classification, and the accuracy of vehicle classification can be further improved.
When the similarity between the small sample feature of the target image and the fused feature of each subclass is greater than the similarity threshold, the subclass corresponding to the fused feature with the highest similarity and the first-level classification information are used as a final classification result; the problem that an erroneous classification result is output when the similarity between all the fused features and the small sample features of the target image is small can be solved; the classification result corresponding to the fused features with the highest similarity and larger than the similarity threshold can be output, and the accuracy of vehicle classification is improved.
In addition, by designing a basic classification network comprising three network branches, original input information can be kept, meanwhile, richer fine-grained characteristics can be obtained, meanwhile, weight information is given to each characteristic channel, and therefore the accuracy of the vehicle classification model is improved.
Fig. 5 is a block diagram of a vehicle classification device according to an embodiment of the present application. The device comprises at least the following modules: an image acquisition module 510, a first classification module 520, a data acquisition module 530, a meta-test module 540, and a second classification module 550.
An image acquisition module 510 for acquiring a target image;
a first classification module 520, configured to input the target image into a pre-trained basic classification model, so as to obtain image features of the target image, a classification result of a vehicle in the target image, and a confidence of the classification result, where the classification result of each vehicle includes m-level classification information, and m is a positive integer greater than 1; the basic classification model is obtained by training by using a first training set, wherein the first training set comprises a sample vehicle image and m classes of classification labels of each vehicle in the sample vehicle image;
a data obtaining module 530, configured to obtain, from a second training set, a sample vehicle image belonging to the first-level classification information in the classification result to obtain a support set, when the confidence is smaller than a preset threshold; the second training set comprises vehicle sample images corresponding to the first-level classification information and subclass labels of each vehicle sample image under the corresponding first-level classification information;
a meta-test module 540, configured to input the support set into a pre-trained small sample classification model to perform a meta-test task, so as to obtain small sample features of each sub-class under the first-level classification information and an updated small sample classification model; the small sample classification model is obtained by training a meta-training task constructed by a third training set;
and a second classification module 550, configured to determine the final classification result of the vehicle based on the image features and the small sample features of the respective subclasses.
For relevant details reference is made to the above-described method embodiments.
It should be noted that: in the vehicle classification device provided in the above embodiment, only the division of the functional modules is illustrated when vehicle classification is performed, and in practical applications, the function distribution may be completed by different functional modules according to needs, that is, the internal structure of the vehicle classification device is divided into different functional modules to complete all or part of the functions described above. In addition, the vehicle classification device provided by the embodiment and the vehicle classification method embodiment belong to the same concept, and the specific implementation process is described in the method embodiment and is not described again.
Fig. 6 is a block diagram of an electronic device provided by an embodiment of the application. The device comprises at least a processor 601 and a memory 602.
Processor 601 may include one or more processing cores such as: 4 core processors, 8 core processors, etc. The processor 601 may be implemented in at least one hardware form of a DSP (Digital Signal Processing), an FPGA (Field-Programmable Gate Array), and a PLA (Programmable Logic Array). The processor 601 may also include a main processor and a coprocessor, where the main processor is a processor for Processing data in an awake state, and is also called a Central Processing Unit (CPU); a coprocessor is a low power processor for processing data in a standby state. In some embodiments, the processor 601 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content required to be displayed on the display screen. In some embodiments, processor 601 may also include an AI (Artificial Intelligence) processor for processing computational operations related to machine learning.
The memory 602 may include one or more computer-readable storage media, which may be non-transitory. The memory 602 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in memory 602 is used to store at least one instruction for execution by processor 601 to implement the vehicle classification method provided by the method embodiments herein.
In some embodiments, the electronic device may further include: a peripheral interface and at least one peripheral. The processor 601, memory 602 and peripheral interface may be connected by a bus or signal lines. Each peripheral may be connected to the peripheral interface via a bus, signal line, or circuit board. Illustratively, peripheral devices include, but are not limited to: radio frequency circuit, touch display screen, audio circuit, power supply, etc.
Of course, the electronic device may include fewer or more components, which is not limited by the embodiment.
Optionally, the present application further provides a computer-readable storage medium, in which a program is stored, the program being loaded and executed by a processor to implement the vehicle classification method of the above-mentioned method embodiment.
Optionally, the present application further provides a computer product including a computer readable storage medium, in which a program is stored, the program being loaded and executed by a processor to implement the vehicle classification method of the above-mentioned method embodiment.
The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments only express several embodiments of the present application, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (10)

1. A vehicle classification method, characterized in that the method comprises:
acquiring a target image;
inputting the target image into a pre-trained basic classification model to obtain the image characteristics of the target image, the classification result of the vehicles in the target image and the confidence coefficient of the classification result, wherein the classification result of each vehicle comprises m-level classification information, and m is a positive integer greater than 1; the basic classification model is obtained by training with a first training set, wherein the first training set comprises a sample vehicle image and m classes of classification labels of each vehicle in the sample vehicle image;
under the condition that the confidence coefficient is smaller than a preset threshold value, obtaining a sample vehicle image belonging to the first-level classification information in the classification result from a second training set to obtain a support set; the second training set comprises vehicle sample images corresponding to the first-level classification information and subclass labels of each vehicle sample image under the corresponding first-level classification information;
inputting the support set into a pre-trained small sample classification model to execute a meta-test task, and obtaining small sample characteristics of each subclass under the first-level classification information and an updated small sample classification model; the small sample classification model is obtained by training on a meta-training task constructed from a third training set;
and determining a final classification result of the vehicle based on the image features and the small sample features of the subclasses.
2. The method of claim 1, wherein determining a final classification result for the vehicle based on the image features and the small sample features of the respective sub-classes comprises:
fusing the image features with the small sample features of the subclasses respectively to obtain fused features of the subclasses;
inputting the target image into the updated small sample classification model to obtain the small sample characteristics of the target image;
and determining a final classification result of the vehicle based on the similarity between the small sample feature of the target image and the fused feature of each subclass.
3. The method of claim 2, wherein determining a final classification result of the vehicle based on the similarity between the small sample features of the target image and the fused features of the respective sub-classes comprises:
determining a fused feature having the highest similarity with the small sample feature from the fused features of the respective subclasses when the similarity between the small sample feature of the target image and the fused feature of the respective subclasses is greater than a similarity threshold;
and taking the subclass corresponding to the fused feature with the highest similarity and the first-level classification information as the final classification result.
4. The method of claim 1, wherein the third training set comprises a plurality of sample vehicle images and at least two classification results for each sample vehicle image;
the process of training the small sample classification model by using the meta-training task constructed by the third training set comprises the following steps:
randomly determining N classification results from the at least two classification results; n is a positive integer;
for each classification result, extracting K sample vehicle images from the sample vehicle images corresponding to the classification result as a support set, and extracting P sample vehicle images from the rest sample vehicle images corresponding to the classification result as a query set to obtain the meta-training task; k and P are positive integers;
and performing iterative learning on the meta-training task by using a pre-established neural network model until the learned neural network model converges to obtain the small sample classification model.
5. The method of claim 1, wherein the inputting the support set into a pre-trained small sample classification model to perform a meta-test task to obtain small sample features of each sub-class under the first-level classification information and an updated small sample classification model comprises:
predicting the support set by using the pre-trained small sample classification model to obtain a predicted value;
comparing the predicted value with the subclass label corresponding to the support set to obtain a predicted loss value;
updating parameters of the pre-trained small sample classification model based on the predicted loss value to obtain the updated small sample classification model;
and performing small sample feature extraction on the support set by using the updated small sample classification model to obtain small sample features of each subclass under the first-level classification information.
6. The method of claim 1, wherein the base classification model comprises a feature extraction network, a first branch network, a second branch network, and a third branch network connected to the feature extraction network, respectively, and a fusion layer connected to the first branch network, the second branch network, and the third branch network;
the feature extraction network is used for extracting features of the target image to obtain a feature map;
the first branch network is used for directly inputting the feature map output by the feature extraction network into the fusion layer;
the second branch network is used for extracting features of the feature maps of different channels and then cascading the feature maps to obtain cascaded feature maps;
the third branch network is used for giving weight information to different channels of the feature map according to the pre-learned channel weight to obtain an updated feature map;
the fusion layer is used for carrying out feature fusion on the feature map, the cascaded feature map and the updated feature map to obtain the image features.
7. The method according to claim 6, wherein the second branch network comprises a global average feature extraction layer, a channel layer and a channel matching layer which are connected in sequence;
the global average feature extraction layer is used for extracting global average features of the feature map;
the channel layer is used for respectively extracting different channel characteristics and cascading the extracted characteristics of different channels;
the channel matching layer is configured to adjust the number of channels of the cascaded feature, so that the adjusted number of channels matches the number of channels of the first branch network and the number of channels of the third branch network.
8. The method of claim 6, wherein the third branch network comprises a max pooling layer, a full connection layer, an activation function layer, and a channel matching layer connected in sequence;
the maximum pooling layer is used for reducing the size of the feature map;
the full connection layer is used for arranging the reduced feature maps according to the number of channels;
the activation function layer is used for giving weight information to the feature maps of different channels obtained after arrangement;
the channel matching layer is configured to adjust the number of channels of the weight-adjusted feature, so that the adjusted number of channels matches the number of channels of the first branch network and the number of channels of the second branch network.
9. An electronic device, characterized in that the device comprises a processor and a memory; the memory has stored therein a program that is loaded and executed by the processor to implement the vehicle classification method according to any one of claims 1 to 8.
10. A computer-readable storage medium, characterized in that the storage medium has stored therein a program which, when being executed by a processor, is adapted to carry out the vehicle classification method according to any one of claims 1 to 8.
CN202111656401.9A 2021-12-30 2021-12-30 Vehicle classification method, apparatus and storage medium Pending CN114332564A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111656401.9A CN114332564A (en) 2021-12-30 2021-12-30 Vehicle classification method, apparatus and storage medium


Publications (1)

Publication Number Publication Date
CN114332564A true CN114332564A (en) 2022-04-12

Family

ID=81018153


Country Status (1)

Country Link
CN (1) CN114332564A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117372787A (en) * 2023-12-05 2024-01-09 同方赛威讯信息技术有限公司 Image multi-category identification method and device
CN117372787B (en) * 2023-12-05 2024-02-20 同方赛威讯信息技术有限公司 Image multi-category identification method and device


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination