CN111275061A - Vehicle attribute identification method and model training method and device thereof, and electronic equipment

Vehicle attribute identification method and model training method and device thereof, and electronic equipment

Info

Publication number: CN111275061A
Application number: CN201811474810.5A
Authority: CN (China)
Prior art keywords: layer, convolutional layer, group
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Other languages: Chinese (zh)
Inventors: 甘春生, 赵元, 沈海峰
Current assignee: Beijing Didi Infinity Technology and Development Co Ltd (the listed assignee may be inaccurate)
Original assignee: Beijing Didi Infinity Technology and Development Co Ltd
Application filed by Beijing Didi Infinity Technology and Development Co Ltd, with priority claimed from application CN201811474810.5A

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 — Pattern recognition
    • G06F 18/20 — Analysing
    • G06F 18/24 — Classification techniques
    • G06F 18/241 — Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2413 — Classification techniques relating to the classification model based on distances to training or reference patterns
    • G06F 18/24147 — Distances to closest patterns, e.g. nearest neighbour classification
    • G06V — IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 2201/00 — Indexing scheme relating to image or video recognition or understanding
    • G06V 2201/08 — Detecting or categorising vehicles

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The application provides a vehicle attribute identification method, a model training method and device therefor, and an electronic device. In the vehicle attribute recognition model training method, a feature map corresponding to each first designated convolutional layer is output through the multiple convolutional layers; the feature map corresponding to each first designated convolutional layer is reduced in dimension through the pooling layer and the fully-connected layer in the group connected to that layer; loss values of the dimension-reduced feature vectors are calculated through the loss function in each group; and the initial model is trained based on the per-group loss values until those loss values converge, yielding the target model. In the embodiments of the application, the network structure of the model is simple, the features corresponding to the vehicle attributes share data, data redundancy is reduced, multi-attribute identification of a vehicle is achieved with a simple model structure, the real-time performance of vehicle attribute identification is improved, and the application range of vehicle attribute identification is extended.

Description

Vehicle attribute identification method and model training method and device thereof, and electronic equipment
Technical Field
The application relates to the technical field of image processing, and in particular to a vehicle attribute identification method, a model training method and device therefor, and an electronic device.
Background
Vehicle attribute identification is widely used in highway toll collection, parking lot management, security snapshot systems, public security systems, and the like. Current image-analysis-based approaches to vehicle attribute recognition mostly adopt artificial-intelligence information extraction techniques, realizing structured textual analysis of an image through spatio-temporal segmentation, feature extraction, target recognition, and similar means, so that vehicles in the image can be detected and recognized. In the related art, identifying multiple attributes of a vehicle generally requires building a separate model for each attribute; the resulting recognition system is bulky, data is duplicated across models, real-time requirements are hard to meet, and the application range of the vehicle recognition system is limited.
Disclosure of Invention
In view of this, an object of the embodiments of the present application is to provide a vehicle attribute recognition model training method and device, a vehicle attribute recognition method and device, and an electronic device, so as to improve the real-time performance of vehicle attribute recognition and extend its application range.
According to one aspect of the present application, an electronic device is provided that may include one or more storage media and one or more processors in communication with the storage media. The one or more storage media store machine-readable instructions executable by a processor. When the electronic device operates, the processor communicates with the storage media through a bus and executes the machine-readable instructions to perform one or more of the following operations:
a vehicle attribute recognition model training method, the method comprising: inputting a target training image into an initial model, wherein the initial model comprises multiple sequentially connected convolutional layers and multiple groups each consisting of a sequentially connected pooling layer, fully-connected layer and loss function; a plurality of first designated convolutional layers are predetermined among the multiple convolutional layers; the pooling layer in each group is connected to the corresponding first designated convolutional layer; the loss function in each group is used to evaluate the recognition loss of one vehicle attribute; outputting a feature map corresponding to each first designated convolutional layer through the multiple convolutional layers; performing dimension reduction on the feature map corresponding to each first designated convolutional layer through the pooling layer and the fully-connected layer in the group connected to that layer, to obtain a feature vector of a preset dimension corresponding to the feature map; calculating the loss value of the feature vector corresponding to each group through the loss function in that group; and training the initial model based on the per-group loss values until the per-group loss values converge, to obtain a target model, the target model comprising the trained multiple convolutional layers and multiple groups of pooling layers, fully-connected layers and loss functions.
In some embodiments, the step of outputting the feature map corresponding to each first designated convolutional layer through the multiple convolutional layers includes: calculating, within the multiple convolutional layers, the feature map corresponding to each first designated convolutional layer by means of a residual network.

In some embodiments, the step of calculating, within the multiple convolutional layers, the feature map corresponding to each first designated convolutional layer by means of a residual network includes: determining a plurality of second designated convolutional layers from the multiple convolutional layers at a first preset interval, and performing the following steps for each convolutional layer, one by one, in their order within the multiple convolutional layers: if the current convolutional layer is one of the determined second designated convolutional layers, determining the input feature map of the current convolutional layer according to feature maps output by designated convolutional layers before the current convolutional layer, inputting the input feature map into the current convolutional layer for convolution, and outputting the feature map corresponding to the current convolutional layer; if the current convolutional layer is not a second designated convolutional layer, inputting the feature map output by the convolutional layer immediately preceding the current convolutional layer into the current convolutional layer for convolution, and outputting the feature map corresponding to the current convolutional layer; and traversing all of the convolutional layers to obtain the feature map corresponding to each first designated convolutional layer.

In some embodiments, the step of determining the input feature map of the current convolutional layer according to the feature maps output by designated convolutional layers before the current convolutional layer comprises: fusing the feature map output by the convolutional layer immediately preceding the current convolutional layer with the feature map output by the earlier convolutional layer that is separated from the current convolutional layer by the first preset interval, to obtain a fused feature map; and determining the fused feature map as the input feature map of the current convolutional layer.
In some embodiments, the first of the second designated convolutional layers is the fourth convolutional layer among the multiple convolutional layers, and the first preset interval comprises two convolutional layers.
In some embodiments, the vehicle attributes include a plurality of: a color attribute, a type attribute, a brand attribute, and a model attribute; the first designated convolutional layers connected to the pooling layers corresponding to the respective vehicle attributes are positioned, from front to back within the multiple convolutional layers, as: the first designated convolutional layer corresponding to the color attribute, the first designated convolutional layer corresponding to the type attribute, the first designated convolutional layer corresponding to the brand attribute, and the first designated convolutional layer corresponding to the model attribute; adjacent first designated convolutional layers are separated by a second preset interval of convolutional layers; and the first convolutional layer is the convolutional layer of the multiple convolutional layers into which the target training image is input.

In some embodiments, if the vehicle attributes include a color attribute, a type attribute, a brand attribute, and a model attribute, the first designated convolutional layer corresponding to the color attribute is the fifth convolutional layer, and the second preset interval is four convolutional layers.
In some embodiments, the step of performing dimension reduction on the feature map corresponding to each first designated convolutional layer through the pooling layer and the fully-connected layer in the group connected to that layer, to obtain a feature vector of a preset dimension corresponding to the feature map, includes: performing dimension reduction on the feature map received by the pooling layer in each group to obtain a dimension-reduced feature map; and stretching the dimension-reduced feature map through the fully-connected layer connected to that pooling layer to obtain a one-dimensional feature vector corresponding to the feature map.

In some embodiments, the step of performing dimension reduction on the feature map received by the pooling layer in each group to obtain a dimension-reduced feature map includes: screening, through the pooling layer and based on the vehicle attribute corresponding to the current group, the feature dimensions associated with that vehicle attribute from the received feature map, and forming the screened feature dimensions into the dimension-reduced feature map.
In some embodiments, the loss function comprises a softmax function:

$$S_i = \frac{e^{x_i}}{\sum_{j=1}^{N} e^{x_j}}$$

where $x_i$ denotes the i-th feature element in the feature vector, $x_j$ denotes the j-th feature element in the feature vector, and $N$ denotes the total number of feature elements in the feature vector.
In some embodiments, the step of calculating the loss value of the feature vector corresponding to each group through the loss function in that group includes: calculating, for the feature vector corresponding to the current group, the exponential function value of each feature element through the loss function; determining the probability of the current feature element according to its exponential function value and the sum of the exponential function values of all feature elements in the feature vector corresponding to the current group; and determining the probability of the feature element with the highest probability in the feature vector corresponding to the current group as the loss value of that feature vector.

In some embodiments, after the step of calculating the loss value of the feature vector corresponding to each group through the loss function in that group, the method further includes: summing the per-group loss values to obtain a sum of loss values. The training of the initial model based on the per-group loss values until they converge then includes: training the initial model based on the per-group loss values until the sum of loss values converges, determining at that point that the per-group loss values have converged, and stopping training.
According to another aspect of the application, a vehicle attribute identification method is also provided, applied to a device configured with a recognition model, the recognition model being a target model trained by the above vehicle attribute recognition model training method; the method comprises: acquiring a vehicle image to be identified; and inputting the vehicle image into the target model to obtain multiple attributes of the vehicle corresponding to the vehicle image.

In some embodiments, the multiple attributes of the vehicle include a plurality of: a color attribute, a type attribute, a brand attribute, and a model attribute.
According to another aspect of the present application, there is also provided a vehicle attribute recognition model training apparatus, including: a training image input module, configured to input a target training image into an initial model, wherein the initial model comprises multiple sequentially connected convolutional layers and multiple groups each consisting of a sequentially connected pooling layer, fully-connected layer and loss function; a plurality of first designated convolutional layers are predetermined among the multiple convolutional layers; the pooling layer in each group is connected to the corresponding first designated convolutional layer; and the loss function in each group is used to evaluate the recognition loss of one vehicle attribute; a feature map output module, configured to output a feature map corresponding to each first designated convolutional layer through the multiple convolutional layers; a dimension reduction processing module, configured to perform dimension reduction on the feature map corresponding to each first designated convolutional layer through the pooling layer and the fully-connected layer in the group connected to that layer, to obtain a feature vector of a preset dimension corresponding to the feature map; a loss value calculation module, configured to calculate the loss value of the feature vector corresponding to each group through the loss function in that group; and a training module, configured to train the initial model based on the per-group loss values until they converge, to obtain a target model comprising the trained multiple convolutional layers and multiple groups of pooling layers, fully-connected layers and loss functions.
In some embodiments, the feature map output module is configured to: calculate, within the multiple convolutional layers, the feature map corresponding to each first designated convolutional layer by means of a residual network.

In some embodiments, the feature map output module is configured to: determine a plurality of second designated convolutional layers from the multiple convolutional layers at a first preset interval, and perform the following steps for each convolutional layer, one by one, in their order within the multiple convolutional layers: if the current convolutional layer is one of the determined second designated convolutional layers, determine the input feature map of the current convolutional layer according to feature maps output by designated convolutional layers before the current convolutional layer, input the input feature map into the current convolutional layer for convolution, and output the feature map corresponding to the current convolutional layer; if the current convolutional layer is not a second designated convolutional layer, input the feature map output by the convolutional layer immediately preceding the current convolutional layer into the current convolutional layer for convolution, and output the feature map corresponding to the current convolutional layer; and traverse all of the convolutional layers to obtain the feature map corresponding to each first designated convolutional layer.

In some embodiments, the feature map output module is configured to: fuse the feature map output by the convolutional layer immediately preceding the current convolutional layer with the feature map output by the earlier convolutional layer that is separated from the current convolutional layer by the first preset interval, to obtain a fused feature map; and determine the fused feature map as the input feature map of the current convolutional layer.

In some embodiments, the first of the second designated convolutional layers is the fourth convolutional layer among the multiple convolutional layers, and the first preset interval comprises two convolutional layers.

In some embodiments, the vehicle attributes include a plurality of: a color attribute, a type attribute, a brand attribute, and a model attribute; the first designated convolutional layers connected to the pooling layers corresponding to the respective vehicle attributes are positioned, from front to back within the multiple convolutional layers, as: the first designated convolutional layer corresponding to the color attribute, the first designated convolutional layer corresponding to the type attribute, the first designated convolutional layer corresponding to the brand attribute, and the first designated convolutional layer corresponding to the model attribute; adjacent first designated convolutional layers are separated by a second preset interval of convolutional layers; and the first convolutional layer is the convolutional layer of the multiple convolutional layers into which the target training image is input.

In some embodiments, if the vehicle attributes include a color attribute, a type attribute, a brand attribute, and a model attribute, the first designated convolutional layer corresponding to the color attribute is the fifth convolutional layer, and the second preset interval is four convolutional layers.

In some embodiments, the dimension reduction processing module is configured to: perform dimension reduction on the feature map received by the pooling layer in each group to obtain a dimension-reduced feature map; and stretch the dimension-reduced feature map through the fully-connected layer connected to that pooling layer to obtain a one-dimensional feature vector corresponding to the feature map.

In some embodiments, the dimension reduction processing module is configured to: screen, through the pooling layer and based on the vehicle attribute corresponding to the current group, the feature dimensions associated with that vehicle attribute from the received feature map, and form the screened feature dimensions into the dimension-reduced feature map.
In some embodiments, the loss function comprises a softmax function:

$$S_i = \frac{e^{x_i}}{\sum_{j=1}^{N} e^{x_j}}$$

where $x_i$ denotes the i-th feature element in the feature vector, $x_j$ denotes the j-th feature element in the feature vector, and $N$ denotes the total number of feature elements in the feature vector.
In some embodiments, the loss value calculation module is configured to: calculate, for the feature vector corresponding to each group, the exponential function value of each feature element through the loss function; determine the probability of the current feature element according to its exponential function value and the sum of the exponential function values of all feature elements in the feature vector corresponding to the current group; and determine the probability of the feature element with the highest probability in the feature vector corresponding to the current group as the loss value of that feature vector.

In some embodiments, the apparatus further comprises a summing module configured to: sum the per-group loss values to obtain a sum of loss values; and the training module is configured to: train the initial model based on the per-group loss values until the sum of loss values converges, determine at that point that the per-group loss values have converged, and stop training.

According to another aspect of the application, a vehicle attribute identification device is also provided, applied to a device configured with a recognition model, the recognition model being a target model trained by the above vehicle attribute recognition model training method; the device comprises: a vehicle image acquisition module, configured to acquire a vehicle image to be identified; and a vehicle image input module, configured to input the vehicle image into the target model to obtain multiple attributes of the vehicle corresponding to the vehicle image.

In some embodiments, the multiple attributes of the vehicle include a plurality of: a color attribute, a type attribute, a brand attribute, and a model attribute.
According to another aspect of the present application, there is also provided an electronic device comprising a processor, a storage medium and a bus; the storage medium stores machine-readable instructions executable by the processor; when the electronic device runs, the processor communicates with the storage medium through the bus and executes the machine-readable instructions to perform the steps of the vehicle attribute recognition model training method or the steps of the vehicle attribute recognition method described above.

According to another aspect of the present application, there is also provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the vehicle attribute recognition model training method or the steps of the vehicle attribute recognition method described above.
Based on any one of the above aspects, the initial model comprises multiple sequentially connected convolutional layers and multiple groups each consisting of a sequentially connected pooling layer, fully-connected layer and loss function; a plurality of first designated convolutional layers are predetermined among the convolutional layers; the pooling layer in each group is connected to the corresponding first designated convolutional layer, and each group's loss function is used to evaluate the recognition loss of one vehicle attribute. During training, a feature map corresponding to each first designated convolutional layer is output through the multiple convolutional layers; each such feature map is reduced in dimension through the pooling layer and fully-connected layer in the group connected to that layer; loss values of the dimension-reduced feature vectors are calculated through the per-group loss functions; and the initial model is trained based on the per-group loss values until they converge, yielding the target model. The network structure of the model is simple, the feature maps output by earlier convolutional layers are reused in the feature computation of later layers, data sharing among the features corresponding to the vehicle attributes is achieved, data redundancy is reduced, multi-attribute recognition of a vehicle is achieved with a simple model structure, the real-time performance of vehicle attribute recognition is improved, and the application range of vehicle attribute recognition is extended.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required by the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present application and therefore should not be regarded as limiting its scope; for those skilled in the art, other related drawings can be obtained from these drawings without inventive effort.
FIG. 1 illustrates a block diagram of a vehicle attribute identification system provided by an embodiment of the present application;
FIG. 2 illustrates a schematic diagram of exemplary hardware and software components of an electronic device provided by embodiments of the present application;
FIG. 3 is a flowchart illustrating a vehicle attribute recognition model training method provided by an embodiment of the present application;
FIG. 4 is a flow chart illustrating another vehicle attribute recognition model training method provided by an embodiment of the present application;
FIG. 5 is a diagram illustrating a network structure of an initial model provided by an embodiment of the present application;
FIG. 6 is a flow chart illustrating another vehicle attribute recognition model training method provided by an embodiment of the present application;
FIG. 7 is a flow chart illustrating a vehicle attribute identification method provided by an embodiment of the present application;
FIG. 8 is a schematic structural diagram illustrating a vehicle attribute recognition model training apparatus according to an embodiment of the present application;
FIG. 9 is a schematic structural diagram illustrating a vehicle attribute identification device provided by an embodiment of the present application.
Detailed Description
In order to make the purpose, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described below clearly and completely with reference to the drawings. It should be understood that the drawings in the present application are for illustrative and descriptive purposes only and are not used to limit the scope of protection of the present application; additionally, the schematic drawings are not necessarily drawn to scale. The flowcharts used in this application illustrate operations implemented according to some embodiments of the present application. It should be understood that the operations of the flowcharts may be performed out of order, and steps without a necessary logical order may be performed in reverse order or simultaneously. Moreover, under the guidance of this application, one skilled in the art may add one or more other operations to, or remove one or more operations from, each flowchart.
In addition, the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application.
To enable those skilled in the art to use the present disclosure, the following embodiments are presented in conjunction with a specific application scenario, "vehicle attribute identification". It will be apparent to those skilled in the art that the general principles defined herein may be applied to other embodiments and applications without departing from the spirit and scope of the application. Although the present application is described primarily in the context of a vehicle attribute recognition model training method and a vehicle attribute recognition method, it should be understood that this is merely one exemplary embodiment. The present application may be applied to attribute identification of any other type of vehicle, for example trains, bullet trains, high-speed railways, subways, ships, airplanes, spacecraft, hot air balloons, and the like. The present application may also include any system for vehicle attribute identification, such as a highway toll collection system, a parking lot management system, a security snapshot system, a public security system, and the like.
It should be noted that in the embodiments of the present application, the term "comprising" is used to indicate the presence of the features stated hereinafter, but does not exclude the addition of further features.
One aspect of the present application relates to a vehicle attribute identification system. FIG. 1 is a block diagram of a vehicle attribute identification system 100 according to some embodiments of the present application. The vehicle attribute identification system 100 may include one or more of a server 110, a network 120, an image capture device 130, an image capture device 140, and a database 150; the server 110 may include a processor that executes instruction operations.
In some embodiments, the server 110 may be a single server or a group of servers. The set of servers can be centralized or distributed (e.g., the servers 110 can be a distributed system). In some embodiments, the server 110 may be local or remote to the terminal. For example, server 110 may access information and/or data stored in image capture device 130, image capture device 140, or database 150, or any combination thereof, via network 120. As another example, server 110 may be directly connected to at least one of image capture device 130, image capture device 140, and database 150 to access stored information and/or data. In some embodiments, the server 110 may be implemented on a cloud platform. In some embodiments, the server 110 may be implemented on an electronic device 200 having one or more of the components shown in FIG. 2 in the present application.
In some embodiments, the server 110 may include a processor. The processor may process information and/or data related to the service request to perform one or more of the functions described herein. For example, the processor may determine the target vehicle based on a service request obtained from image capture device 130. Network 120 may be used for the exchange of information and/or data. In some embodiments, one or more components (e.g., server 110, image capture device 130, image capture device 140, and database 150) in vehicle attribute identification system 100 may send information and/or data to other components. For example, server 110 may obtain a service request from image capture device 130 via network 120. In some embodiments, the network 120 may be any type of wired or wireless network, or combination thereof. In some embodiments, network 120 may include one or more network access points. For example, network 120 may include wired or wireless network access points, such as base stations and/or network switching nodes, through which one or more components of vehicle attribute identification system 100 may connect to network 120 to exchange data and/or information.
In some embodiments, image capture devices 130 and 140 may include mobile devices, tablet computers, laptop computers, cameras, the like, or any combination thereof. In some embodiments, the mobile device may include a smart home device, a wearable device, a smart mobile device, a virtual reality device, an augmented reality device, or the like, or any combination thereof.
Database 150 may store data and/or instructions. In some embodiments, database 150 may store data obtained from image acquisition device 130 and/or image acquisition device 140. In some embodiments, database 150 may store data and/or instructions for the exemplary methods described herein. In some embodiments, database 150 may be implemented on a cloud platform. By way of example only, the cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, across clouds, multiple clouds, or the like, or any combination thereof.
In some embodiments, database 150 may be connected to network 120 to communicate with one or more components of vehicle attribute identification system 100 (e.g., server 110, image capture device 130, image capture device 140, etc.). One or more components in vehicle attribute identification system 100 may access data or instructions stored in database 150 via network 120. In some embodiments, database 150 may be directly connected to one or more components in vehicle attribute identification system 100 (e.g., server 110, image capture device 130, image capture device 140, etc.); alternatively, in some embodiments, database 150 may also be part of server 110.
Fig. 2 illustrates a schematic diagram of exemplary hardware and software components of an electronic device 200 that may implement the server 110, the image capture device 130, or the image capture device 140, according to some embodiments of the present application. For example, a processor on the electronic device 200 may be used to perform the functions described herein.
The electronic device 200 may be a general purpose computer or a special purpose computer, both of which may be used to implement the vehicle attribute recognition model training method and the vehicle attribute recognition method of the present application. Although only a single computer is shown, for convenience, the functions described herein may be implemented in a distributed fashion across multiple similar platforms to balance processing loads.
For example, the electronic device 200 may include a network port 210 connected to a network, one or more processors 220 for executing program instructions, a communication bus 230, and a different form of storage medium 240, such as a disk, ROM, or RAM, or any combination thereof. Illustratively, the computer platform may also include program instructions stored in ROM, RAM, or other types of non-transitory storage media, or any combination thereof. The method of the present application may be implemented in accordance with these program instructions. The electronic device 200 also includes an Input/Output (I/O) interface 250 between the computer and other Input/Output devices (e.g., keyboard, display screen).
The storage medium 240 stores machine-readable instructions executable by the processor 220. When the electronic device operates, the processor 220 communicates with the storage medium 240 through the bus and executes the machine-readable instructions to perform the steps of the vehicle attribute recognition model training method or the vehicle attribute recognition method described below. In addition, the storage medium may also be referred to as a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the vehicle attribute recognition model training method described below, or the steps of the vehicle attribute recognition method.
For ease of illustration, only one processor is depicted in the electronic device 200. However, it should be noted that the electronic device 200 in the present application may also comprise multiple processors, and thus steps described herein as performed by one processor may also be performed jointly or separately by multiple processors. For example, if the processor of the electronic device 200 executes steps A and B, it should be understood that steps A and B may also be executed by two different processors, or both within a single processor; for instance, a first processor performs step A and a second processor performs step B, or the first and second processors perform steps A and B together.
Based on the above description of the vehicle attribute identification system and the electronic device, reference is now made to the flowchart of a vehicle attribute recognition model training method shown in FIG. 3; the method comprises the following steps:
Step S302, inputting a target training image into an initial model; the initial model comprises multiple sequentially connected convolutional layers and multiple groups each consisting of a sequentially connected pooling layer, fully-connected layer and loss function; a plurality of first designated convolutional layers are predetermined among the convolutional layers; the pooling layer in each group is connected to the corresponding first designated convolutional layer; the loss function in each group is used to evaluate the recognition loss of one vehicle attribute.
the data of the multilayer convolutional layer is not particularly limited; each convolution layer can implement convolution operation by different convolution parameters (such as convolution kernel). Generally, each set of the pooling layer, the fully-connected layer, and the loss function connected in sequence is used to output a vehicle attribute identification result. The first designated convolutional layer is a plurality of designated convolutional layers in a multi-layer convolutional layer; the feature map output by each first designated convolutional layer is input to the pooling layer, fully-connected layer, and loss function connected to the first designated convolutional layer.
For ease of understanding, the structure of an initial model is given below as an example; the multilayer convolutional layer comprises 10 convolutional layers; wherein, the convolution layer used for inputting the target training image is the 1 st convolution layer. The initial model is intended to identify 3 vehicle attributes, such as attribute a, attribute B, and attribute C, respectively; thus the number of first designated convolutional layers is 3; specifically, the 3 first specific convolutional layers may be selected from the 10 convolutional layers according to the characteristics of the vehicle attributes, for example, the 3 first specific convolutional layers are the 4 th convolutional layer, the 7 th convolutional layer and the 10 th convolutional layer, respectively. The 4 th layer of convolution layer is connected with a group of connected pooling layers, full-connection layers and loss functions and is used for identifying the attribute A; the 7 th convolution layer is connected with a group of connected pooling layers, full-connection layers and loss functions and is used for identifying the attribute B; the layer 10 convolution layer connects a set of connected pooling layers, full-connect layers, and a loss function for identifying attributes.
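For illustration, the 10-layer example above can be sketched in code. The following is a minimal sketch assuming PyTorch; the channel widths, kernel sizes and per-attribute class counts are invented assumptions, and only the branch positions (the 4th, 7th and 10th convolutional layers) follow the example:

```python
import torch
import torch.nn as nn

class InitialModel(nn.Module):
    """Example initial model: 10 sequentially connected convolutional layers,
    with a group (pooling layer + fully-connected layer) branching off the
    4th, 7th and 10th layers for attributes A, B and C respectively."""

    def __init__(self, num_classes):  # e.g. {"A": 12, "B": 8, "C": 100}
        super().__init__()
        chans = [3, 32, 32, 64, 64, 128, 128, 256, 256, 512, 512]  # assumed widths
        self.convs = nn.ModuleList(
            nn.Sequential(nn.Conv2d(chans[i], chans[i + 1], 3, padding=1), nn.ReLU())
            for i in range(10))
        self.branch_at = {4: "A", 7: "B", 10: "C"}   # first designated conv layers
        self.pools = nn.ModuleDict({a: nn.AdaptiveAvgPool2d(1) for a in num_classes})
        self.fcs = nn.ModuleDict({a: nn.Linear(chans[i], num_classes[a])
                                  for i, a in self.branch_at.items()})

    def forward(self, x):
        outputs = {}
        for i, conv in enumerate(self.convs, start=1):
            x = conv(x)                               # feature map of layer i
            attr = self.branch_at.get(i)
            if attr is not None:                      # a first designated layer
                v = self.pools[attr](x).flatten(1)    # pooling: dimension reduction
                outputs[attr] = self.fcs[attr](v)     # fully-connected layer
        return outputs                                # one output per attribute

model = InitialModel({"A": 12, "B": 8, "C": 100})
logits = model(torch.randn(2, 3, 224, 224))  # {"A": (2,12), "B": (2,8), "C": (2,100)}
```

Note how, in this sketch, the feature map output by each designated layer continues down the trunk as well as into its branch; this is the data sharing among attributes that the embodiment describes.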
Step S304, outputting a feature map corresponding to each first designated convolutional layer through the multiple convolutional layers.

After the target training image is input into the first convolutional layer, that layer performs a convolution operation on the image with its corresponding convolution kernel and outputs a feature map; after this feature map is input into the second convolutional layer, the second layer convolves it with its own kernel and outputs a new feature map; and so on, until the last convolutional layer outputs its feature map. The feature map corresponding to each first designated convolutional layer is simply the feature map that layer outputs; it is fed into the pooling layer, fully-connected layer and loss function connected to that layer for dimension reduction and recognition. In addition, unless the first designated convolutional layer is the last convolutional layer, it also passes its output feature map to the next connected convolutional layer so that the map participates in subsequent convolution operations.
Step S306, performing dimension reduction on the feature map corresponding to each first designated convolutional layer through the pooling layer and the fully-connected layer in the group connected to that layer, to obtain a feature vector of a preset dimension corresponding to the feature map.

In general, the feature map output by a convolutional layer has very high dimensionality and a large data volume; the pooling layer and the fully-connected layer reduce the feature dimensions and the data volume, simplifying the computational complexity of the model. They can also screen and compress the feature dimensions to extract the features related to the attribute corresponding to that group's pooling layer, fully-connected layer and loss function. The preset dimension of the feature vector may be set in advance, e.g. a two-dimensional or one-dimensional feature vector.
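As a small numerical illustration of this dimension reduction, a sketch assuming PyTorch (the shapes are invented):

```python
import torch
import torch.nn as nn

fmap = torch.randn(1, 256, 14, 14)        # feature map from a designated conv layer
pooled = nn.AdaptiveAvgPool2d(1)(fmap)    # pooling layer -> (1, 256, 1, 1)
flat = pooled.flatten(1)                  # stretched -> (1, 256)
vec = nn.Linear(256, 128)(flat)           # fully-connected layer -> (1, 128)
print(vec.shape)                          # a one-dimensional feature vector per image
```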
Step S308, calculating the loss value of the feature vector corresponding to each group through the loss function in that group.

This loss function may also be called a supervisory signal. From the feature vector, the loss function judges which class of the group's corresponding vehicle attribute the target training image belongs to, yielding the probability that the feature vector belongs to each class; in this scheme, the loss value is taken from these probabilities. For example, suppose the group's vehicle attribute is color and the classes are preset as red, black, white, blue, etc.; from the current feature vector the loss function computes the probability that it belongs to each color, say 10% for red, 80% for black, 0% for white, and 10% for blue; the loss value of the feature vector is then 80%. At the same time, the vehicle attribute corresponding to the feature vector can be determined from the loss-function output.
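The color example can be reproduced numerically. A sketch assuming PyTorch; the logit values are invented just to yield roughly the probabilities above:

```python
import torch
import torch.nn.functional as F

colors = ["red", "black", "white", "blue"]
logits = torch.tensor([1.0, 3.08, -8.0, 1.0])  # assumed feature vector for one group
probs = F.softmax(logits, dim=0)               # ~ [0.10, 0.80, 0.00, 0.10]
loss_value = probs.max().item()                # 0.80: the loss value in this scheme
predicted = colors[int(probs.argmax())]        # "black": the recognized color
```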
Step S310, training the initial model based on the per-group loss values until the per-group loss values converge, to obtain a target model; the target model includes the trained multiple convolutional layers and multiple groups of pooling layers, fully-connected layers and loss functions.

In the actual training process, the per-group loss values obtained at each iteration may fluctuate, but after repeated training they tend to stabilize; once the per-group loss values have converged, training of the initial model is complete and the target model is obtained. The network structure of the target model is usually unchanged relative to the initial model, but the parameters of each convolutional, pooling and fully-connected layer may have changed.
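Putting steps S302–S310 together, a training loop consistent with this embodiment might look like the sketch below (assuming PyTorch and the InitialModel sketch above; the optimizer, the cross-entropy criterion applied on top of the softmax outputs, the data loader and the convergence threshold are all assumptions, not choices specified by the application):

```python
import torch
import torch.nn.functional as F

model = InitialModel({"A": 12, "B": 8, "C": 100})
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
prev_total = float("inf")

for epoch in range(100):
    total = 0.0
    for images, labels in loader:   # assumed DataLoader; labels is a dict per attribute
        outputs = model(images)     # per-group outputs (steps S304-S306)
        # one loss per group (step S308), summed into a single training signal
        loss = sum(F.cross_entropy(outputs[a], labels[a]) for a in outputs)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        total += loss.item()
    if abs(prev_total - total) < 1e-3:  # sum of loss values has stabilized (step S310)
        break
    prev_total = total
```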
According to the vehicle attribute recognition model training method provided by the embodiments, the initial model comprises multiple sequentially connected convolutional layers and multiple groups each consisting of a sequentially connected pooling layer, fully-connected layer and loss function; a plurality of first designated convolutional layers are predetermined among the convolutional layers; the pooling layer in each group is connected to the corresponding first designated convolutional layer, and each group's loss function is used to evaluate the recognition loss of one vehicle attribute. During training, a feature map corresponding to each first designated convolutional layer is output through the multiple convolutional layers; each such feature map is reduced in dimension through the pooling layer and fully-connected layer in the group connected to that layer; loss values of the dimension-reduced feature vectors are calculated through the per-group loss functions; and the initial model is trained on the per-group loss values until they converge, yielding the target model. The network structure is simple, the feature maps output by earlier convolutional layers are reused in the feature computation of later layers, data sharing among the features corresponding to the vehicle attributes is achieved, data redundancy is reduced, multi-attribute recognition of a vehicle is achieved with a simple model structure, the real-time performance of vehicle attribute recognition is improved, and the application range of vehicle attribute recognition is extended.
The embodiments also provide another vehicle attribute recognition model training method, implemented on the basis of the method provided by the above embodiment; this embodiment describes in detail how the feature map corresponding to each first designated convolutional layer is obtained.
Generally, in a conventional convolutional network, the multiple convolutional layers are connected in sequence and the input of each convolutional layer is the output of the layer before it; increasing the depth of the convolutional network improves the training effect of the model. In this embodiment, within the multi-layer stack, the feature map corresponding to each first designated convolutional layer is calculated by means of a residual network (also called a ResNet-style network): using skip connections, the output of certain convolutional layers is passed directly to deeper convolutional layers, bypassing the one or more layers in between.
Based on this, another vehicle attribute recognition model training method provided in this embodiment is shown in fig. 4, and specifically includes the following steps:
step S402, inputting a target training image into an initial model; the initial model comprises a plurality of convolution layers which are connected in sequence, and a plurality of groups of pooling layers, full-connection layers and loss functions which are connected in sequence; a plurality of first designated convolutional layers are predetermined in the multilayer convolutional layers; the pooling layers in each group are connected with the corresponding first appointed convolution layer; the loss functions in each group are used for evaluating the identification loss of one vehicle attribute;
Step S404, determining a plurality of second designated convolutional layers from the multiple convolutional layers at a first preset interval, and performing the following steps for each convolutional layer, one by one, in their order within the multi-layer stack.

The second designated convolutional layers may be determined according to the actual structure of the multi-layer stack; a given convolutional layer may be both a first designated convolutional layer and a second designated convolutional layer at the same time.
Step S406, judging whether the current convolutional layer is the determined second designated convolutional layer; if yes, go to step S408; if not, executing step S410;
step S408, determining an input feature map of the current convolutional layer according to the feature map output by the specified convolutional layer before the current convolutional layer; inputting the input feature map into the current convolutional layer for convolution operation, and outputting a feature map corresponding to the current convolutional layer; step S412 is executed;
for a conventional convolutional network, except for the first convolutional layer, the input feature map of the current convolutional layer is usually the output feature map of the previous convolutional layer of the current convolutional layer; in this embodiment, the input feature map of the current convolutional layer may include a feature map output by a specified convolutional layer before the current convolutional layer; the number of the designated convolutional layers is not limited, but is usually plural; for example, the sum of the feature maps output by all the specified convolutional layers before the current convolutional layer can be used as the input feature map of the current convolutional layer; or, the feature maps output by the specified convolutional layer before the current convolutional layer may be preprocessed, for example, feature map fusion is performed, and the fusion result is used as the input feature map of the current convolutional layer.
Specifically, step S408 can be implemented through the following steps 02 to 04:

Step 02, fusing the feature map output by the convolutional layer immediately preceding the current convolutional layer with the feature map output by the earlier convolutional layer that is separated from the current convolutional layer by the first preset interval, to obtain a fused feature map.

The first preset interval is usually measured in convolutional layers, e.g. one, two, or three convolutional layers. If the current convolutional layer is the fourth convolutional layer of the stack and the first preset interval is two convolutional layers, then the earlier convolutional layer separated from the current one by the first preset interval is the first convolutional layer.

In the fusion, the feature maps can be combined by point-by-point addition or point-by-point multiplication to obtain the fused feature map. Since the two feature maps are output by convolutional layers at different depths, they may have different scales; to make them fusable, an interpolation operation is usually performed on the smaller-scale feature map before fusion, extending its scale to match that of the larger-scale feature map.
Step 04, determining the fused feature map as the input feature map of the current convolutional layer.
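Steps 02 and 04 can be sketched as a small fusion helper (a sketch assuming PyTorch, point-by-point addition, bilinear interpolation for the scale match, and equal channel counts in the two maps):

```python
import torch
import torch.nn.functional as F

def fuse(prev_fmap, skip_fmap):
    """Fuse the map of the immediately preceding layer with the map of the
    earlier layer at the first preset interval (channel counts assumed equal)."""
    if prev_fmap.shape[-2:] != skip_fmap.shape[-2:]:
        # interpolate the smaller-scale map up to the larger scale
        if prev_fmap.shape[-1] < skip_fmap.shape[-1]:
            prev_fmap = F.interpolate(prev_fmap, size=skip_fmap.shape[-2:],
                                      mode="bilinear", align_corners=False)
        else:
            skip_fmap = F.interpolate(skip_fmap, size=prev_fmap.shape[-2:],
                                      mode="bilinear", align_corners=False)
    return prev_fmap + skip_fmap   # point-by-point addition

# e.g. input to the 4th conv layer = fuse(output of the 3rd, output of the 1st)
x4_in = fuse(torch.randn(1, 64, 28, 28), torch.randn(1, 64, 56, 56))  # (1, 64, 56, 56)
```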
As an example of the network structure of an initial model, as shown in FIG. 5, the multi-layer stack of the initial model comprises 17 convolutional layers, the first of the second designated convolutional layers being the fourth convolutional layer and the first preset interval comprising two convolutional layers; the feature map input to the fourth convolutional layer is thus the fusion of the feature map output by the third convolutional layer and the feature map output by the first convolutional layer. The other second designated convolutional layers are the sixth, eighth, tenth, twelfth, fourteenth and sixteenth convolutional layers, respectively.

As noted above, the initial model also comprises multiple groups of sequentially connected pooling layer, fully-connected layer and loss function; FIG. 5 takes four groups as an example, each connected to a first designated convolutional layer. The first designated convolutional layers are thus the fifth, ninth, thirteenth and seventeenth convolutional layers, respectively.
Vehicle attributes typically include a color attribute, a type attribute, a brand attribute, a model attribute, and the like; the initial model may be used to identify several of these vehicle attributes, with each group of sequentially connected pooling layer, fully-connected layer and loss function corresponding to one vehicle attribute, and the loss function matched to that attribute. Among these attributes, the color attribute is generally the easiest to recognize, the type and brand attributes come next, and the model attribute is the hardest to recognize. Therefore, in the initial model, the first designated convolutional layer connected to the pooling layer corresponding to the color attribute is usually located toward the front of the convolutional layers; the first designated convolutional layer connected to the pooling layer corresponding to the model attribute is usually located toward the back; and the first designated convolutional layers connected to the pooling layers corresponding to the type and brand attributes lie in between. In this way, when a later attribute is identified, some of the parameters used for an earlier attribute, such as feature maps and convolutional layer parameters, can be shared, reducing data redundancy and simplifying the network structure.
Based on this, the positions, from front to back in the multiple convolutional layers, of the first designated convolutional layers connected to the pooling layers corresponding to the vehicle attributes are: the first designated convolutional layer corresponding to the color attribute, the first designated convolutional layer corresponding to the type attribute, the first designated convolutional layer corresponding to the brand attribute, and the first designated convolutional layer corresponding to the model attribute; adjacent first designated convolutional layers are separated by a second preset interval of convolutional layers. Here, the first convolutional layer is the convolutional layer of the multiple convolutional layers into which the target training image is input. Of course, the order of the first designated convolutional layers corresponding to the vehicle attributes may be adjusted according to actual requirements, and exactly which convolutional layer serves as the first designated convolutional layer for each attribute is not particularly limited.
Taking fig. 5 as an example, if the vehicle attributes include the color, type, brand and model attributes, the first designated convolutional layer corresponding to the color attribute is the fifth convolutional layer and the second preset interval is four convolutional layers; the first designated convolutional layer corresponding to the type attribute is the ninth convolutional layer; the first designated convolutional layer corresponding to the brand attribute is the thirteenth convolutional layer; and the first designated convolutional layer corresponding to the model attribute is the seventeenth convolutional layer. It can be understood that, when the model is used to identify only some of these four vehicle attributes, the pooling layer, fully-connected layer and loss function corresponding to the attributes that need not be identified may be removed.
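The 17-layer structure of fig. 5 can be sketched, for illustration, as the following PyTorch module. The channel width, kernel sizes, ReLU activations and class counts are assumptions, while the placement of the second designated layers (layers 4, 6, ..., 16, each fusing the previous output with the output from two layers further back) and of the heads (after layers 5, 9, 13 and 17) follows the example above. Because this sketch never changes the spatial scale, the fusion reduces to a plain addition; a model with downsampling would use the interpolation helper sketched earlier.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiAttributeNet(nn.Module):
    """Illustrative sketch of the initial model in fig. 5: 17 convolutional
    layers; layers 4, 6, ..., 16 (the second designated layers) fuse the
    previous output with the output from two layers further back; heads sit
    after layers 5, 9, 13 and 17 (the first designated layers)."""

    def __init__(self, width=64, classes=(10, 5, 50, 200)):
        # classes: color / type / brand / model class counts (assumptions)
        super().__init__()
        self.convs = nn.ModuleList(
            [nn.Conv2d(3, width, 3, padding=1)] +
            [nn.Conv2d(width, width, 3, padding=1) for _ in range(16)])
        self.pool = nn.AdaptiveAvgPool2d(1)         # global average pooling
        self.heads = nn.ModuleList(nn.Linear(width, c) for c in classes)
        self.head_layers = (5, 9, 13, 17)           # first designated layers
        self.fuse_layers = set(range(4, 17, 2))     # second designated layers

    def forward(self, x):
        feats, logits = {}, []
        for i, conv in enumerate(self.convs, start=1):
            if i in self.fuse_layers:
                # e.g. layer 4 fuses layer 3's output with layer 1's output
                x = x + feats[i - 3]
            x = F.relu(conv(x))
            feats[i] = x
            if i in self.head_layers:
                vec = self.pool(x).flatten(1)       # dimension reduction
                logits.append(self.heads[len(logits)](vec))
        return logits                               # one vector per attribute
```

During training, each returned logit vector would feed the softmax loss of its own group, as described in the following steps.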
Step S410, inputting the feature map output by the convolutional layer immediately before the current convolutional layer into the current convolutional layer for a convolution operation, and outputting the feature map corresponding to the current convolutional layer;
Step S412, judging whether the current convolutional layer is the last convolutional layer; if not, executing step S414; if yes, executing step S416;
Step S414, taking the convolutional layer after the current convolutional layer as the new current convolutional layer, and returning to step S404;
Step S416, determining that all convolutional layers in the multiple convolutional layers have been traversed, and obtaining the feature map corresponding to each first designated convolutional layer.
Step S418, performing dimension-reduction processing on the feature map corresponding to each first designated convolutional layer through the pooling layer and fully-connected layer in the group connected to that layer, to obtain a feature vector of a preset dimension corresponding to the feature map;
step S420, calculating the loss value of the characteristic vector corresponding to each group through the loss function in each group;
Step S422, training the initial model based on the loss values corresponding to each group until the loss values of every group converge, to obtain the target model; the target model includes the trained multiple convolutional layers and the multiple groups of pooling layers, fully-connected layers and loss functions.
In the above embodiment, the feature map corresponding to each first designated convolutional layer is computed in the manner of a residual network; a specific network structure of the initial model is described based on the vehicle attributes to be identified, and the initial model is trained on that structure. The network structure of the model is simple, and the feature map output by an earlier convolutional layer can be reused in the feature computation of a later convolutional layer, so that the features corresponding to the vehicle attributes share data, data redundancy is reduced, the real-time performance of vehicle attribute identification is improved, and the application range of vehicle attribute identification is expanded.
The embodiment of the invention further provides another vehicle attribute recognition model training method, realized on the basis of the method provided by the above embodiment. This embodiment describes in detail the dimension-reduction processing of the feature map by the pooling layer and fully-connected layer corresponding to each vehicle attribute, and the computation of the loss value by the corresponding loss function. As shown in fig. 6, the vehicle attribute recognition model training method specifically includes the following steps:
step S602, inputting a target training image into an initial model; the initial model comprises a plurality of convolution layers which are connected in sequence, and a plurality of groups of pooling layers, full-connection layers and loss functions which are connected in sequence; a plurality of first designated convolutional layers are predetermined in the multilayer convolutional layers; the pooling layers in each group are connected with the corresponding first appointed convolution layer; the loss functions in each group are used for evaluating the identification loss of one vehicle attribute;
step S604, determining a plurality of second designated convolutional layers from the plurality of convolutional layers at a first preset interval; executing the following steps for each convolution layer one by one according to the sequence of each convolution layer in the multilayer convolution layers;
Step S606, judging whether the current convolutional layer is one of the determined second designated convolutional layers; if yes, executing step S608; if not, executing step S610;
step S608, determining the input characteristic diagram of the current convolutional layer according to the output characteristic diagram of the specified convolutional layer before the current convolutional layer; inputting the input feature map into the current convolutional layer for convolution operation, and outputting a feature map corresponding to the current convolutional layer; step S612 is executed;
Step S610, inputting the feature map output by the convolutional layer immediately before the current convolutional layer into the current convolutional layer for a convolution operation, and outputting the feature map corresponding to the current convolutional layer;
Step S612, judging whether the current convolutional layer is the last convolutional layer; if not, executing step S614; if yes, executing step S616;
Step S614, taking the convolutional layer after the current convolutional layer as the new current convolutional layer, and returning to step S606;
Step S616, determining that all convolutional layers in the multiple convolutional layers have been traversed, and obtaining the feature map corresponding to each first designated convolutional layer;
Step S618, performing dimension-reduction processing, through the pooling layer in each group, on the feature map received by that pooling layer, to obtain a dimension-reduced feature map;
The pooling layer may be an average pooling layer (mean-pooling), a global average pooling layer (global average pooling), a maximum pooling layer (max-pooling), or the like. The pooling layer can retain the main features in the feature map, discard non-main features, and reduce the dimension of the feature map. Taking the average pooling layer as an example, it averages the feature values within a neighborhood of preset size around the current feature point and uses that average as the new value of the point. In addition, the pooling layer helps the feature map keep certain invariances, such as rotation invariance, translation invariance and scale invariance.
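As a small illustration of two of the pooling variants mentioned above (the window size and feature map values are assumptions):

```python
import torch
import torch.nn as nn

fmap = torch.arange(16, dtype=torch.float32).reshape(1, 1, 4, 4)  # toy feature map
avg_pool = nn.AvgPool2d(kernel_size=2)       # averages each 2x2 neighborhood
global_pool = nn.AdaptiveAvgPool2d(1)        # global average pooling
print(avg_pool(fmap).shape)                  # torch.Size([1, 1, 2, 2])
print(global_pool(fmap).shape)               # torch.Size([1, 1, 1, 1])
```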
For pooling layers in different groups, corresponding to different vehicle attributes, the parameters of each pooling layer can be preset so that, based on the vehicle attribute corresponding to the current group, the pooling layer screens out from the received feature map the feature dimensions associated with that attribute and combines the screened dimensions into the dimension-reduced feature map. Taking the color attribute as an example, the pooling layer corresponding to the color attribute generally retains the feature dimensions related to color and discards those that are unrelated or only weakly related to it.
Step S620, stretching the dimension-reduced feature map through the fully-connected layer connected to the pooling layer, to obtain a one-dimensional feature vector corresponding to the feature map.
The fully-connected layer stretches the feature map output by the pooling layer into a feature vector. This stretching can also be understood as a classification step: each feature element in the resulting vector corresponds to one class, so that the subsequent loss function can compute a probability for each class.
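A minimal sketch of this stretching step, assuming PyTorch, a 64-channel pooled map and four classes (all assumptions):

```python
import torch
import torch.nn as nn

pooled = torch.randn(1, 64, 1, 1)     # assumed output of a global pooling layer
fc = nn.Linear(64, 4)                 # four classes, e.g. four preset colors
vector = fc(pooled.flatten(1))        # "stretch" into a one-dimensional vector
print(vector.shape)                   # torch.Size([1, 4]): one score per class
```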
Step S622, for each group of corresponding feature vectors, calculating an exponential function value of each feature element in the current group through a loss function;
The identification of a vehicle attribute can be understood as a multi-classification process; therefore the loss function of the multi-classification network model can be realized with a cross-entropy function, here also referred to as the softmax function. Specifically, the softmax function is:
$$S_i = \frac{e^{x_i}}{\sum_{j=1}^{N} e^{x_j}}$$

where $x_i$ denotes the i-th feature element in the feature vector, $x_j$ denotes the j-th feature element in the feature vector, and $N$ denotes the total number of feature elements in the feature vector.
Relative to the feature elements themselves, the exponential function values widen the differences between them; for example, for the feature vector [3, 1, -3], the corresponding vector of exponential function values is approximately [20, 2.7, 0.05]. Computing the probability of each feature element from these exponential values therefore enlarges the probability gaps between elements, makes the probability of the correct recognition result higher, and helps improve the accuracy of the recognition result.
Step S624, determining the probability of the current feature element according to the index function value of each feature element in the current group and the sum of the index function values of each feature element in the feature vector corresponding to the current group;
Step S626, determining the probability of the feature element with the highest probability in the feature vector corresponding to the current group as the loss value of the feature vector corresponding to the current group.
Specifically, the probability of a feature element is obtained by dividing its exponential function value by the sum of the exponential function values of all feature elements in the feature vector; the class corresponding to the feature element with the highest probability is the identification result of the vehicle attribute.
For ease of understanding, an example of the above steps S622 to S626 is described below, with the color attribute preset as red, black, white and blue. The one-dimensional feature vector output by the fully-connected layer corresponding to the color attribute then contains four feature elements, one per color. Assume the feature vector is [2, 3, -1, -3], where 2 corresponds to red, 3 to black, -1 to white and -3 to blue. After the exponential function value of each feature element is calculated, the vector of exponential values is approximately [7, 20, 0.4, 0.05], and the probability vector over the feature elements is approximately [0.26, 0.73, 0.01, 0]. The probability of the color attribute being black is thus the largest, i.e. about 0.73, and this value is the loss value of the feature vector corresponding to the current group.
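The worked example can be checked numerically with a short softmax sketch. Note that the exact probabilities come out to about [0.26, 0.72, 0.01, 0.00]; the 0.73 above results from first rounding the exponentials to [7, 20, 0.4, 0.05]:

```python
import numpy as np

x = np.array([2.0, 3.0, -1.0, -3.0])   # vector from the fully-connected layer
exp_x = np.exp(x)                       # ~[7.39, 20.09, 0.37, 0.05]
probs = exp_x / exp_x.sum()             # ~[0.26, 0.72, 0.01, 0.00]
labels = ["red", "black", "white", "blue"]
print(labels[int(probs.argmax())])      # "black", the recognition result
```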
After model training is finished, the loss function is used during recognition to calculate the probability of each feature element, and the class corresponding to the feature element with the highest probability is output as the final recognition result.
Step S628, training the initial model based on the loss values corresponding to each group until the loss values corresponding to each group are converged to obtain a target model; the target model includes trained multilayer convolutional layers and multiple sets of pooling layers, fully-connected layers, and a loss function.
To facilitate training and reduce the computational burden of the model during training, after the loss value corresponding to each group is calculated, the per-group loss values are summed to obtain a loss-value sum. During training, it suffices to observe whether this sum converges; if the sum of the loss values converges, it can be determined that the loss values of every group have converged, and training can be stopped at that point.
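A minimal training-step sketch under these conventions, assuming PyTorch and the multi-head model sketched earlier (the helper name and attribute order are assumptions; `cross_entropy` applies the softmax internally):

```python
import torch.nn.functional as F

def train_step(model, optimizer, images, labels_per_attr):
    """One training step: sum the per-group losses and monitor the sum for
    convergence. `labels_per_attr` holds one label tensor per vehicle
    attribute, in the same order as the model's heads (an assumption)."""
    optimizer.zero_grad()
    logits = model(images)                   # one logit vector per attribute
    loss = sum(F.cross_entropy(l, y) for l, y in zip(logits, labels_per_attr))
    loss.backward()
    optimizer.step()
    return loss.item()                       # track this sum until it converges
```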
In this embodiment, the feature map output by a convolutional layer is dimension-reduced through the pooling layer and fully-connected layer in the model to obtain the main features related to a vehicle attribute, the loss function computes probabilities over those features, and the recognition result and loss value are output. Because the pooling layers corresponding to different vehicle attributes are connected to different convolutional layers among the multiple convolutional layers, the feature map output by an earlier convolutional layer can be reused in the feature computation of a later convolutional layer, realizing data sharing among the features corresponding to the vehicle attributes, reducing data redundancy, improving the real-time performance of vehicle attribute identification and expanding its application range.
Corresponding to the above vehicle attribute recognition model training method, an embodiment of the invention further provides a vehicle attribute identification method, applied to a device configured with a recognition model; the recognition model is the target model obtained by training with the vehicle attribute recognition model training method of the above embodiment. As shown in fig. 7, the method includes the following steps:
step S702, obtaining a vehicle image to be identified;
The vehicle image usually contains all or part of a vehicle. Images to be identified may be screened in advance to obtain vehicle images, or they may be input into the target model directly without screening, in which case the model usually outputs an error prompt for a non-vehicle image. In addition, the vehicle image may carry position information of the vehicle detected by another detection network: the position information may be localization information, usually a rectangular region containing the vehicle, or segmentation information, usually the positions of the vehicle's edge contours.
Step S704, inputting the vehicle image into the target model, and obtaining multiple attributes of the vehicle corresponding to the vehicle image.
Because the target model includes multiple groups of sequentially connected pooling layers, fully-connected layers and loss functions, the loss function in each group usually outputs one attribute recognition result. If the target model is used to identify the color, type, brand and model attributes of a vehicle, it contains four such groups; of course, the target model may also be used to identify any subset of the color, type, brand and model attributes.
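For illustration, inference with such a target model might look like the following sketch; the helper name and the mapping from heads to class names are assumptions:

```python
import torch

@torch.no_grad()
def identify_vehicle(model, image, class_names):
    """Run the trained target model on a single vehicle image and return the
    most probable class per attribute. `class_names` maps each attribute to
    its label list, in head order (an assumption)."""
    model.eval()
    logits = model(image.unsqueeze(0))       # add a batch dimension
    return {attr: names[int(l.softmax(dim=1).argmax())]
            for (attr, names), l in zip(class_names.items(), logits)}

# e.g. class_names = {"color": ["red", "black", "white", "blue"], ...}
```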
According to the above vehicle attribute identification method, after the vehicle image to be identified is acquired, it is input into the target model to obtain the various attributes of the corresponding vehicle. The method realizes multi-attribute recognition of a vehicle with a structurally simple model, improving the real-time performance of vehicle attribute identification and expanding its application range.
Corresponding to the above embodiment of the vehicle attribute recognition model training method, reference is made to the schematic structural diagram of a vehicle attribute recognition model training apparatus shown in fig. 8; the functions realized by the apparatus correspond to the steps of the above method. The apparatus may be understood as the above server, or the processor of the server, or as a component, independent of the server or the processor, that implements the functions of the present application under the control of the server. As shown in fig. 8, the apparatus includes:
a training image input module 80 for inputting a target training image to the initial model; the initial model comprises a plurality of convolution layers which are connected in sequence, and a plurality of groups of pooling layers, full-connection layers and loss functions which are connected in sequence; a plurality of first designated convolutional layers are predetermined in the multilayer convolutional layers; the pooling layers in each group are connected with the corresponding first appointed convolution layer; the loss functions in each group are used for evaluating the identification loss of one vehicle attribute;
a feature map output module 82, configured to output a feature map corresponding to each of the first designated convolutional layers through the plurality of convolutional layers;
the dimension reduction processing module 84 is configured to perform dimension reduction processing on the feature map corresponding to the first specified convolutional layer through the pooling layer and the full-connection layer in the group to which each layer of the first specified convolutional layer is connected, so as to obtain a feature vector of a preset dimension corresponding to the feature map;
a loss value calculation module 86, configured to calculate a loss value of the feature vector corresponding to each group through the loss function in each group;
the training module 88 is configured to train the initial model based on each group of corresponding loss values until each group of corresponding loss values converges to obtain a target model; the target model includes trained multilayer convolutional layers and multiple sets of pooling layers, fully-connected layers and loss functions.
According to the vehicle attribute recognition model training apparatus provided by the embodiment of the invention, the initial model includes multiple sequentially connected convolutional layers and multiple groups of sequentially connected pooling layers, fully-connected layers and loss functions; a plurality of first designated convolutional layers are predetermined among the convolutional layers; the pooling layer in each group is connected to a corresponding first designated convolutional layer, and the loss function of each group is used to evaluate the identification loss of one vehicle attribute. During training, the convolutional layers output a feature map corresponding to each first designated convolutional layer; the pooling layer and fully-connected layer in the group connected to each first designated convolutional layer reduce the dimension of that feature map; the loss function in each group computes the loss value of the resulting feature vector; and the initial model is trained on the per-group loss values until they all converge, yielding the target model. The network structure of the model is simple, and the feature map output by an earlier convolutional layer can be reused in the feature computation of a later convolutional layer, realizing data sharing among the features corresponding to the vehicle attributes, reducing data redundancy, enabling multi-attribute recognition of the vehicle with a simple model structure, improving the real-time performance of vehicle attribute identification, and expanding its application range.
The modules in the vehicle attribute recognition model training apparatus described above may be connected or communicate with each other via a wired connection or a wireless connection. The wired connection may include a metal cable, an optical cable, a hybrid cable, etc., or any combination thereof. The wireless connection may comprise a connection over a LAN, WAN, bluetooth, ZigBee, NFC, or the like, or any combination thereof. Two or more modules may be combined into a single module, and any one module may be divided into two or more units.
In some embodiments, the feature map output module is configured to: in the multilayer convolutional layers, a characteristic diagram corresponding to the first appointed convolutional layer of each layer is obtained through calculation in a residual error network mode.
In some embodiments, the feature map output module is configured to: determining a plurality of second designated convolutional layers from the plurality of convolutional layers at a first preset interval, and executing the following steps for each convolutional layer one by one according to the sequence of each convolutional layer in the plurality of convolutional layers: if the current convolutional layer is the determined second designated convolutional layer, determining an input characteristic diagram of the current convolutional layer according to the characteristic diagram output by the designated convolutional layer before the current convolutional layer; inputting the input feature map into the current convolutional layer for convolution operation, and outputting a feature map corresponding to the current convolutional layer; if the current convolutional layer is a convolutional layer except the second designated convolutional layer, inputting a feature map output by a convolutional layer which is one layer before the current convolutional layer into the current convolutional layer for convolution operation, and outputting a feature map corresponding to the current convolutional layer; and traversing all the convolutional layers in the multi-layer convolutional layers to obtain a characteristic diagram corresponding to the first appointed convolutional layer in each layer of the multi-layer convolutional layers.
In some embodiments, the feature map output module is configured to: fusing a characteristic graph output by a convolutional layer which is one layer before the current convolutional layer with a characteristic graph which is output by a convolutional layer which is one layer before the previous convolutional layer and has a first preset interval with the current convolutional layer to obtain a fused characteristic graph; and determining the fused feature map as an input feature map of the current convolutional layer.
In some embodiments, the first of the second designated convolutional layers is the fourth convolutional layer of the multiple convolutional layers; the first preset interval includes two convolutional layers.
In some embodiments, the vehicle attributes include multiple of: a color attribute, a type attribute, a brand attribute and a model attribute; the positions, from front to back in the multiple convolutional layers, of the first designated convolutional layers connected to the pooling layers corresponding to the vehicle attributes are: the first designated convolutional layer corresponding to the color attribute, the first designated convolutional layer corresponding to the type attribute, the first designated convolutional layer corresponding to the brand attribute and the first designated convolutional layer corresponding to the model attribute; adjacent first designated convolutional layers are separated by a second preset interval of convolutional layers; wherein the first convolutional layer is the convolutional layer of the multiple convolutional layers into which the target training image is input.
In some embodiments, if the vehicle attributes include a color attribute, a type attribute, a brand attribute, and a model attribute, the first designated convolutional layer corresponding to the color attribute is a fifth layer convolutional layer; the second predetermined interval is four convolutional layers.
In some embodiments, the dimension reduction processing module is configured to: perform dimension-reduction processing, through the pooling layer in each group, on the feature map received by that pooling layer to obtain a dimension-reduced feature map; and stretch the dimension-reduced feature map through the fully-connected layer connected to the pooling layer to obtain a one-dimensional feature vector corresponding to the feature map.
In some embodiments, the dimension reduction processing module is configured to: and screening feature dimensions associated with the vehicle attributes corresponding to the current group from the received feature map through a pooling layer based on the vehicle attributes corresponding to the current group, and forming the screened feature dimensions into a feature map after dimension reduction.
In some embodiments, the loss function comprises a softmax function; the softmax function described above:
$$S_i = \frac{e^{x_i}}{\sum_{j=1}^{N} e^{x_j}}$$

where $x_i$ denotes the i-th feature element in the feature vector, $x_j$ denotes the j-th feature element in the feature vector, and $N$ denotes the total number of feature elements in the feature vector.
In some embodiments, the loss value calculating module is configured to: calculating an exponential function value of each characteristic element in the current group through a loss function for each group of corresponding characteristic vectors; determining the probability of the current characteristic element according to the index function value of each characteristic element in the current group and the sum of the index function values of each characteristic element in the characteristic vector corresponding to the current group; and determining the probability of the characteristic element with the highest probability in the characteristic vectors corresponding to the current group as the loss value of the characteristic vectors corresponding to the current group.
In some embodiments, the apparatus further comprises a summing module configured to: carrying out summation operation on the loss values corresponding to each group to obtain the sum of the loss values; a training module to: and training the initial model based on the loss values corresponding to each group until the sum of the loss values is converged, determining that the loss values corresponding to each group are converged, and stopping training.
Corresponding to the above vehicle attribute identification embodiment, reference is made to the schematic structural diagram of a vehicle attribute identification apparatus shown in fig. 9; the functions realized by the apparatus correspond to the steps of the above method. The apparatus may be understood as the above server, or the processor of the server, or as a component, independent of the server or the processor, that implements the functions of the present application under the control of the server. The apparatus is applied to a device configured with a recognition model; the recognition model is the target model obtained by training with the above vehicle attribute recognition model training method. As shown in fig. 9, the apparatus includes:
a vehicle image acquisition module 90 for acquiring a vehicle image to be identified;
and the vehicle image input module 92 is used for inputting the vehicle image into the target model to obtain various attributes of the vehicle corresponding to the vehicle image.
After acquiring the vehicle image to be identified, the vehicle attribute identification apparatus inputs it into the target model to obtain the various attributes of the corresponding vehicle. The apparatus realizes multi-attribute recognition of a vehicle with a structurally simple model, improving the real-time performance of vehicle attribute identification and expanding its application range.
The modules in the vehicle attribute identification device described above may be connected or communicate with each other via a wired connection or a wireless connection. The wired connection may include a metal cable, an optical cable, a hybrid cable, etc., or any combination thereof. The wireless connection may comprise a connection over a LAN, WAN, bluetooth, ZigBee, NFC, or the like, or any combination thereof. Two or more modules may be combined into a single module, and any one module may be divided into two or more units.
In some embodiments, the plurality of attributes of the vehicle include multiple of: a color attribute, a type attribute, a brand attribute and a model attribute.
The device provided by the embodiment has the same implementation principle and technical effect as the foregoing embodiment, and for the sake of brief description, reference may be made to the corresponding contents in the foregoing method embodiment for the portion of the embodiment of the device that is not mentioned.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the system and the apparatus described above may refer to corresponding processes in the method embodiments, and are not described in detail in this application. In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. The above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is merely a logical division, and there may be other divisions in actual implementation, and for example, a plurality of modules or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or modules through some communication interfaces, and may be in an electrical, mechanical or other form.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: various media capable of storing program codes, such as a U disk, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disk.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (30)

1. A vehicle attribute recognition model training method, the method comprising:
inputting a target training image into an initial model; the initial model comprises a plurality of convolution layers which are connected in sequence, and a plurality of groups of pooling layers, full-connection layers and loss functions which are connected in sequence; a plurality of first designated convolutional layers are predetermined in the multilayer convolutional layers; the pooling layers in each group are connected with the corresponding first designated convolution layer; the loss functions in each group are used for evaluating the identification loss of one vehicle attribute;
outputting a characteristic diagram corresponding to each layer of the first appointed convolutional layer through the plurality of layers of convolutional layers;
performing dimensionality reduction processing on a feature map corresponding to the first specified convolutional layer through a pooling layer and a full-connection layer in a group connected with each layer of the first specified convolutional layer to obtain a feature vector of a preset dimensionality corresponding to the feature map;
calculating the loss value of the characteristic vector corresponding to each group through the loss function in each group;
training the initial model based on each group of corresponding loss values until each group of corresponding loss values are converged to obtain a target model; the target model comprises the trained multilayer convolutional layers and a plurality of groups of pooling layers, full-link layers and loss functions.
2. The method of claim 1, wherein outputting the feature map corresponding to each of the first designated convolutional layer via the plurality of convolutional layers comprises:
and in the multilayer convolutional layers, calculating by means of a residual error network to obtain a characteristic diagram corresponding to each first appointed convolutional layer.
3. The method of claim 2, wherein the step of calculating the feature map corresponding to each of the first designated convolutional layers in the plurality of convolutional layers by means of a residual error network comprises:
determining a plurality of second designated convolutional layers from the plurality of convolutional layers at a first preset interval, and executing the following steps for each convolutional layer one by one according to the sequence of each convolutional layer in the plurality of convolutional layers:
if the current convolutional layer is the determined second designated convolutional layer, determining an input feature map of the current convolutional layer according to a feature map output by the designated convolutional layer before the current convolutional layer; inputting the input feature map into the current convolutional layer for convolution operation, and outputting a feature map corresponding to the current convolutional layer;
if the current convolutional layer is a convolutional layer except the second designated convolutional layer, inputting a feature map output by a convolutional layer which is one layer before the current convolutional layer into the current convolutional layer for convolution operation, and outputting a feature map corresponding to the current convolutional layer;
and traversing all the convolutional layers in the multi-layer convolutional layers to obtain a characteristic diagram corresponding to each first appointed convolutional layer in the multi-layer convolutional layers.
4. The method of claim 3, wherein determining the input profile for the current convolutional layer based on the output profile for the specified convolutional layer prior to the current convolutional layer comprises:
fusing a feature map output by a convolutional layer which is one layer before the current convolutional layer with a feature map output by a convolutional layer which is one layer before the previous convolutional layer and has the first preset interval with the current convolutional layer to obtain a fused feature map;
and determining the fused feature map as an input feature map of the current convolutional layer.
5. The method of claim 3 or 4, wherein the first of said second designated convolutional layers is the fourth convolutional layer of said multi-layer convolutional layers; the first preset interval includes two convolutional layers.
6. The method of claim 1, wherein the vehicle attributes include multiple of: a color attribute, a type attribute, a brand attribute and a model attribute;
the positions, from front to back in the multi-layer convolutional layers, of the first designated convolutional layers connected to the pooling layers corresponding to the vehicle attributes are: the first designated convolutional layer corresponding to the color attribute, the first designated convolutional layer corresponding to the type attribute, the first designated convolutional layer corresponding to the brand attribute and the first designated convolutional layer corresponding to the model attribute; adjacent first designated convolutional layers are separated by a second preset interval of convolutional layers; wherein the first convolutional layer is the convolutional layer of the multi-layer convolutional layers to which the target training image is input.
7. The method of claim 6, wherein if the vehicle attributes include a color attribute, a type attribute, a brand attribute, and a model attribute, the color attribute corresponding first designated convolutional layer is a fifth layer convolutional layer; the second predetermined interval is four convolutional layers.
8. The method according to claim 1, wherein the step of performing dimension reduction processing on the feature map corresponding to each of the first specified convolutional layers through a pooling layer and a fully-connected layer in a group to which the first specified convolutional layer is connected to obtain a feature vector of a preset dimension corresponding to the feature map comprises:
performing dimensionality reduction processing on the feature map received by the pooling layers through the pooling layers in each group to obtain the feature map subjected to dimensionality reduction;
and stretching the feature map after dimensionality reduction through a full connection layer connected with the pooling layer to obtain a one-dimensional feature vector corresponding to the feature map.
9. The method according to claim 8, wherein the step of performing dimension reduction processing on the feature map received by the pooling layers in each group to obtain the feature map after dimension reduction comprises:
and based on the vehicle attributes corresponding to the current group, screening the feature dimensions associated with the vehicle attributes corresponding to the current group from the received feature map through the pooling layer, and forming the screened feature dimensions into the feature map after dimension reduction.
10. The method of claim 1, wherein the loss function comprises a softmax function;
the softmax function:
$$S_i = \frac{e^{x_i}}{\sum_{j=1}^{N} e^{x_j}}$$

wherein $x_i$ represents the i-th feature element in the feature vector; $x_j$ represents the j-th feature element in the feature vector; and $N$ represents the total number of feature elements in the feature vector.
11. The method of claim 1, wherein the step of calculating the loss value for each set of corresponding eigenvectors from the loss function in each set comprises:
for each group of corresponding feature vectors, calculating an exponential function value of each feature element in the current group through the loss function;
determining the probability of the current characteristic element according to the index function value of each characteristic element in the current group and the sum of the index function values of each characteristic element in the characteristic vector corresponding to the current group;
and determining the probability of the characteristic element with the highest probability in the characteristic vectors corresponding to the current group as the loss value of the characteristic vectors corresponding to the current group.
12. The method of claim 1,
after the step of calculating the loss value of each group of corresponding feature vectors by the loss function in each group, the method further comprises: carrying out summation operation on the loss values corresponding to each group to obtain the sum of the loss values;
the step of training the initial model based on the loss values corresponding to each group until the loss values corresponding to each group are all converged includes: and training the initial model based on each group of corresponding loss values until the sum of the loss values is converged, determining that each group of corresponding loss values is converged, and stopping training.
13. A vehicle attribute identification method is characterized in that the method is applied to equipment configured with an identification model; the identification model is a target model obtained by training according to the method of any one of claims 1 to 12; the method comprises the following steps:
acquiring a vehicle image to be identified;
and inputting the vehicle image into the target model to obtain various attributes of the vehicle corresponding to the vehicle image.
14. The method of claim 13, wherein the plurality of attributes of the vehicle include multiple of: a color attribute, a type attribute, a brand attribute and a model attribute.
15. A vehicle attribute recognition model training apparatus, characterized in that the apparatus comprises:
the training image input module is used for inputting the target training image into the initial model; the initial model comprises a plurality of convolution layers which are connected in sequence, and a plurality of groups of pooling layers, full-connection layers and loss functions which are connected in sequence; a plurality of first designated convolutional layers are predetermined in the multilayer convolutional layers; the pooling layers in each group are connected with the corresponding first designated convolution layer; the loss functions in each group are used for evaluating the identification loss of one vehicle attribute;
the characteristic diagram output module is used for outputting a characteristic diagram corresponding to each layer of the first appointed convolutional layer through the plurality of layers of convolutional layers;
the dimension reduction processing module is used for carrying out dimension reduction processing on the feature map corresponding to the first appointed convolutional layer through a pooling layer and a full connecting layer in a group connected with each layer of the first appointed convolutional layer to obtain a feature vector of a preset dimension corresponding to the feature map;
the loss value calculation module is used for calculating the loss value of the characteristic vector corresponding to each group through the loss function in each group;
the training module is used for training the initial model based on each group of corresponding loss values until each group of corresponding loss values are converged to obtain a target model; the target model comprises the trained multilayer convolutional layers and a plurality of groups of pooling layers, full-link layers and loss functions.
16. The apparatus of claim 15, wherein the feature map output module is configured to:
and in the multilayer convolutional layers, calculating by means of a residual error network to obtain a characteristic diagram corresponding to each first appointed convolutional layer.
17. The apparatus of claim 16, wherein the feature map output module is configured to:
determining a plurality of second designated convolutional layers from the plurality of convolutional layers at a first preset interval, and executing the following steps for each convolutional layer one by one according to the sequence of each convolutional layer in the plurality of convolutional layers:
if the current convolutional layer is the determined second designated convolutional layer, determining an input feature map of the current convolutional layer according to a feature map output by the designated convolutional layer before the current convolutional layer; inputting the input feature map into the current convolutional layer for convolution operation, and outputting a feature map corresponding to the current convolutional layer;
if the current convolutional layer is a convolutional layer except the second designated convolutional layer, inputting a feature map output by a convolutional layer which is one layer before the current convolutional layer into the current convolutional layer for convolution operation, and outputting a feature map corresponding to the current convolutional layer;
and traversing all the convolutional layers in the multi-layer convolutional layers to obtain a characteristic diagram corresponding to each first appointed convolutional layer in the multi-layer convolutional layers.
18. The apparatus of claim 17, wherein the feature map output module is configured to:
fusing a feature map output by a convolutional layer which is one layer before the current convolutional layer with a feature map output by a convolutional layer which is one layer before the previous convolutional layer and has the first preset interval with the current convolutional layer to obtain a fused feature map;
and determining the fused feature map as an input feature map of the current convolutional layer.
19. The apparatus of claim 17 or 18, wherein the first of said second designated convolutional layers is the fourth convolutional layer of said multi-layer convolutional layers; the first preset interval includes two convolutional layers.
20. The apparatus of claim 15, wherein the vehicle attributes include multiple of: a color attribute, a type attribute, a brand attribute and a model attribute;
the positions, from front to back in the multi-layer convolutional layers, of the first designated convolutional layers connected to the pooling layers corresponding to the vehicle attributes are: the first designated convolutional layer corresponding to the color attribute, the first designated convolutional layer corresponding to the type attribute, the first designated convolutional layer corresponding to the brand attribute and the first designated convolutional layer corresponding to the model attribute; adjacent first designated convolutional layers are separated by a second preset interval of convolutional layers; wherein the first convolutional layer is the convolutional layer of the multi-layer convolutional layers to which the target training image is input.
21. The apparatus of claim 20, wherein if the vehicle attributes include a color attribute, a type attribute, a brand attribute, and a model attribute, the color attribute corresponding first designated convolutional layer is a fifth layer convolutional layer; the second predetermined interval is four convolutional layers.
22. The apparatus of claim 15, wherein the dimension reduction processing module is configured to:
performing dimensionality reduction processing on the feature map received by the pooling layers through the pooling layers in each group to obtain the feature map subjected to dimensionality reduction;
and stretching the feature map after dimensionality reduction through a full connection layer connected with the pooling layer to obtain a one-dimensional feature vector corresponding to the feature map.
23. The apparatus of claim 22, wherein the dimension reduction processing module is configured to:
and based on the vehicle attributes corresponding to the current group, screening the feature dimensions associated with the vehicle attributes corresponding to the current group from the received feature map through the pooling layer, and forming the screened feature dimensions into the feature map after dimension reduction.
24. The apparatus of claim 15, wherein the loss function comprises a softmax function;
the softmax function:
$$S_i = \frac{e^{x_i}}{\sum_{j=1}^{N} e^{x_j}}$$

wherein $x_i$ represents the i-th feature element in the feature vector; $x_j$ represents the j-th feature element in the feature vector; and $N$ represents the total number of feature elements in the feature vector.
25. The apparatus of claim 15, wherein the loss value calculation module is configured to:
for each group of corresponding feature vectors, calculating an exponential function value of each feature element in the current group through the loss function;
determining the probability of the current characteristic element according to the index function value of each characteristic element in the current group and the sum of the index function values of each characteristic element in the characteristic vector corresponding to the current group;
and determining the probability of the characteristic element with the highest probability in the characteristic vectors corresponding to the current group as the loss value of the characteristic vectors corresponding to the current group.
26. The apparatus of claim 15,
the apparatus further comprises a summing module to: carrying out summation operation on the loss values corresponding to each group to obtain the sum of the loss values;
the training module is configured to: and training the initial model based on each group of corresponding loss values until the sum of the loss values is converged, determining that each group of corresponding loss values is converged, and stopping training.
27. A vehicle attribute recognition apparatus, characterized in that the apparatus is applied to a device provided with a recognition model; the identification model is a target model obtained by training according to the method of any one of claims 1 to 12; the device comprises:
the vehicle image acquisition module is used for acquiring a vehicle image to be identified;
and the vehicle image input module is used for inputting the vehicle image into the target model to obtain various attributes of the vehicle corresponding to the vehicle image.
28. The apparatus of claim 27, wherein the plurality of attributes of the vehicle include multiple of: a color attribute, a type attribute, a brand attribute and a model attribute.
29. An electronic device, comprising: a processor, a storage medium and a bus, the storage medium storing machine-readable instructions executable by the processor, the processor and the storage medium communicating via the bus when the electronic device is running, the processor executing the machine-readable instructions to perform the steps of the vehicle property recognition model training method according to any one of claims 1 to 12, or the steps of the vehicle property recognition method according to claim 13 or 14.
30. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored thereon a computer program which, when being executed by a processor, carries out the steps of the vehicle property recognition model training method according to any one of claims 1 to 12, or the steps of the vehicle property recognition method according to claim 13 or 14.
CN201811474810.5A 2018-12-04 2018-12-04 Vehicle attribute identification method and model training method and device thereof, and electronic equipment Pending CN111275061A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811474810.5A CN111275061A (en) 2018-12-04 2018-12-04 Vehicle attribute identification method and model training method and device thereof, and electronic equipment

Publications (1)

Publication Number Publication Date
CN111275061A true CN111275061A (en) 2020-06-12

Family

ID=71002924

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811474810.5A Pending CN111275061A (en) 2018-12-04 2018-12-04 Vehicle attribute identification method and model training method and device thereof, and electronic equipment

Country Status (1)

Country Link
CN (1) CN111275061A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113313079A (en) * 2021-07-16 2021-08-27 深圳市安软科技股份有限公司 Training method and system of vehicle attribute recognition model and related equipment

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170372174A1 (en) * 2016-06-28 2017-12-28 Conduent Business Services, Llc System and method for expanding and training convolutional neural networks for large size input images
CN106599869A (en) * 2016-12-22 2017-04-26 安徽大学 Vehicle attribute identification method based on multi-task convolutional neural network
CN107886073A * 2017-11-10 2018-04-06 重庆邮电大学 Fine-grained vehicle multi-attribute recognition method based on convolutional neural networks
CN108549926A * 2018-03-09 2018-09-18 中山大学 Deep neural network and training method for fine-grained vehicle attribute recognition

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113313079A (en) * 2021-07-16 2021-08-27 深圳市安软科技股份有限公司 Training method and system of vehicle attribute recognition model and related equipment

Similar Documents

Publication Title
CN108596277B (en) Vehicle identity recognition method and device and storage medium
CN110245678B (en) Image matching method based on heterogeneous twin region selection network
CN111368943B (en) Method and device for identifying object in image, storage medium and electronic device
CN109614935A (en) Car damage identification method and device, storage medium and electronic equipment
CN105512627A (en) Key point positioning method and terminal
CN107862698A Light field foreground segmentation method and device based on K-means clustering
CN112990211A (en) Neural network training method, image processing method and device
CN109635783A (en) Video monitoring method, device, terminal and medium
CN114758337B (en) Semantic instance reconstruction method, device, equipment and medium
CN104616247B Aerial image map stitching method based on super-pixel SIFT
CN109816780B (en) Power transmission line three-dimensional point cloud generation method and device of binocular sequence image
CN113610905B (en) Deep learning remote sensing image registration method based on sub-image matching and application
CN110349215A Camera position and orientation estimation method and device
CN113159024A (en) License plate recognition technology based on improved YOLOv4
CN110348463A Method and apparatus for vehicle identification
CN110807379A (en) Semantic recognition method and device and computer storage medium
CN114494436A (en) Indoor scene positioning method and device
WO2022222036A1 (en) Method and apparatus for determining parking space
CN117132737B (en) Three-dimensional building model construction method, system and equipment
CN117252928B (en) Visual image positioning system for modular intelligent assembly of electronic products
CN111275061A (en) Vehicle attribute identification method and model training method and device thereof, and electronic equipment
CN111402429B (en) Scale reduction and three-dimensional reconstruction method, system, storage medium and equipment
Fang et al. Fast depth estimation from single image using structured forest
CN111914809A (en) Target object positioning method, image processing method, device and computer equipment
CN110516094A Deduplication method and device, electronic device, and storage medium for point-of-interest category data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination