CN113850136A - Yolov5 and BCNN-based vehicle orientation identification method and system - Google Patents

Info

Publication number
CN113850136A
CN113850136A
Authority
CN
China
Prior art keywords
vehicle
vehicle orientation
orientation
bcnn
yolov5
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110973740.3A
Other languages
Chinese (zh)
Inventor
周子淳
黄志鹏
刘建云
李恒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
709th Research Institute of CSIC
Original Assignee
709th Research Institute of CSIC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 709th Research Institute of CSIC filed Critical 709th Research Institute of CSIC
Priority to CN202110973740.3A priority Critical patent/CN113850136A/en
Publication of CN113850136A publication Critical patent/CN113850136A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a yolov5 and BCNN-based vehicle orientation identification method and system. A vehicle detection model is constructed and trained based on the yolov5 algorithm; the trained vehicle detection model identifies a test set and outputs the vehicle position information in each test-set picture. Vehicle regions are cropped from the original images according to the vehicle position information, and the vehicle data are labeled according to the vehicle orientation attribute and stored separately to form a vehicle orientation data set. A vehicle orientation recognition model is then constructed based on the BCNN algorithm and trained on the labeled vehicle orientation data set; images in the input monitoring scene data are predicted in turn by the trained vehicle detection model and the trained vehicle orientation recognition model, and the vehicle orientation category is output. By combining the yolov5 and BCNN algorithms, vehicle localization is more accurate, detection is faster and more robust, and vehicle identification in multi-angle and complex scenes is achieved.

Description

Yolov5 and BCNN-based vehicle orientation identification method and system
Technical Field
The invention relates to the technical field of image processing and pattern recognition, in particular to a yolov5 and BCNN-based vehicle orientation recognition method and system.
Background
Entrances and exits of large campuses usually need to register incoming and outgoing vehicles and to track each vehicle's movement within the campus, so that vehicles can be managed intelligently. At present most campuses register vehicle entry and exit manually, which is very time-consuming and inefficient, and when traffic is heavy it also places great pressure on the staff. Some sites have adopted image-based automatic registration, which recognizes the vehicle head or tail and thereby infers the vehicle's direction of entry or exit. These automatic registration methods still have certain limitations, however, and cameras mounted at high positions cannot achieve good results.
Patent application No. 201710458187.3, entitled "Image-based vehicle head orientation recognition method and device", proposes a method and device that determine the region of the license-plate feature in the image to be recognized, locate from it the region of each vertex of the vehicle face, determine the vehicle-face region from those vertex regions, classify the vehicle-face region, and thereby determine the orientation of the vehicle head.
Patent application No. 201911315840.6, entitled "Network training method, vehicle head orientation recognition method, device and terminal device", proposes a network training method, device and terminal device in which the smallest quadrilateral region containing the target vehicle is determined in an image sample, a first rectangular region is derived from that quadrilateral, an offset and several target regions are computed from set interpolation points, and the network is trained with the offset and target regions so that the vehicle head orientation can be recognized.
Disclosure of Invention
The invention provides a yolov5 and BCNN-based vehicle orientation identification method and system, which aim to overcome the technical defects.
In order to achieve the above technical objective, a first aspect of the present invention provides a vehicle orientation identification method based on yolov5 and BCNN, which includes the following steps:
collecting monitoring scene data to form a vehicle detection data set, dividing the vehicle detection data set into a training set and a testing set according to a proportion, and marking vehicle positions in a picture of the training set;
constructing a vehicle detection model based on a yolov5 algorithm, training the vehicle detection model by adopting a marked training set, identifying a test set by utilizing the trained vehicle detection model, and outputting to obtain vehicle position information in a test set picture;
cutting the vehicle area from the original image according to the vehicle position information, and labeling and separately storing the vehicle data according to the vehicle orientation attribute to form a vehicle orientation data set;
and constructing a vehicle orientation recognition model based on a BCNN algorithm, training the vehicle orientation recognition model by adopting the labeled vehicle orientation data set, sequentially predicting images in the input monitoring scene data by utilizing the trained vehicle detection model and the trained vehicle orientation recognition model, and outputting the type of the vehicle orientation.
The invention provides a vehicle orientation identification system based on yolov5 and BCNN, which comprises the following functional modules:
the data making module is used for acquiring monitoring scene data to form a vehicle detection data set, dividing the vehicle detection data set into a training set and a testing set according to a proportion, and marking the vehicle position in a picture of the training set;
the vehicle position identification module is used for constructing a vehicle detection model based on the yolov5 algorithm, training the vehicle detection model by adopting a marked training set, identifying a test set by utilizing the trained vehicle detection model, and outputting the vehicle position information in a test set picture;
the orientation data processing module is used for cutting the vehicle area from an original image according to the vehicle position information, labeling and separately storing the vehicle data according to the vehicle orientation attribute to form a vehicle orientation data set;
the orientation model training module is used for constructing a vehicle orientation recognition model based on a BCNN algorithm and training the vehicle orientation recognition model by adopting a labeled vehicle orientation data set;
and the vehicle orientation identification module is used for sequentially predicting the images in the input monitoring scene data by utilizing the trained vehicle detection model and the trained vehicle orientation identification model and outputting the type of the vehicle orientation.
A third aspect of the present invention provides a server, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the above vehicle orientation identification method based on yolov5 and BCNN when executing the computer program.
A fourth aspect of the present invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the above-mentioned yolov5 and BCNN-based vehicle orientation identification method.
Compared with the prior art, the yolov5 and BCNN-based vehicle orientation recognition method and system train and predict with the vehicle detection model and the vehicle orientation recognition model in sequence. The prediction of the vehicle detection model yields a rectangular region containing only vehicle information, which reduces interference from objects outside that region; the vehicle orientation recognition model then recognizes the orientation of the detected vehicle and, when extracting features, focuses on local features of the object, which makes it easier to distinguish the vehicle head from the tail and thus determine the orientation. Using the detection model's prediction as the input of the orientation recognition model further improves its classification accuracy. Detecting and recognizing vehicle orientation by combining the yolov5 and BCNN algorithms gives more accurate localization, faster detection and stronger robustness, and accomplishes the task of identifying vehicles in multi-angle and complex scenes.
Drawings
FIG. 1 is a block flow diagram of a vehicle orientation identification method based on yolov5 and BCNN according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a vehicle orientation recognition model according to an embodiment of the present invention;
fig. 3 is a block diagram of a yolov5 and BCNN-based vehicle orientation identification system according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Based on the above, an embodiment of the present invention provides a vehicle orientation identification method based on yolov5 and BCNN, as shown in fig. 1, which includes the following steps:
s1, collecting monitoring scene data to form a vehicle detection data set, dividing the vehicle detection data set into a training set and a testing set according to a proportion, and marking the vehicle position in the training set picture.
Specifically, monitoring scene data is collected, and an image with a moving foreground is extracted from the monitoring scene data and stored to form a vehicle detection data set. The image extraction method with the moving foreground comprises the following steps:
A frame containing no vehicle in the designated area is taken as the template image. For each subsequent image, the difference between the gray values of corresponding pixels of that image and the template is computed; if the absolute difference exceeds a specified threshold the pixel is marked as foreground, otherwise as background, as shown below:
foreground(i, j) = 1, if |img1(i, j) - img2(i, j)| > T; foreground(i, j) = 0, otherwise
where img1(i, j) and img2(i, j) are the pixel values of the corresponding pixel in the two images at a given moment, and T is the threshold. If more than 50 foreground pixels are found in the designated area, a moving foreground is present, and the image is saved into the vehicle detection data set.
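A minimal NumPy sketch of this frame-differencing check; the threshold value 30 is an assumption, as the patent only fixes the 50-pixel foreground count:

```python
import numpy as np

def has_moving_foreground(template, frame, threshold=30, min_pixels=50):
    """A pixel is foreground when the absolute gray-level difference
    against the template exceeds the threshold T."""
    diff = np.abs(frame.astype(np.int16) - template.astype(np.int16))
    foreground = diff > threshold
    return int(foreground.sum()) > min_pixels

# Toy example: a flat background and a frame with a bright 10x10 patch.
template = np.zeros((100, 100), dtype=np.uint8)
frame = template.copy()
frame[20:30, 20:30] = 200          # 100 foreground pixels, above the count
print(has_moving_foreground(template, frame))   # True
```

The `int16` cast avoids unsigned-integer wraparound when the frame is darker than the template.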
The collected data are preprocessed with random brightness changes and added random noise; cutout-style data enhancement is then applied to the preprocessed data, i.e. each picture is randomly rotated within +/-30 degrees and a 30x30 mosaic patch is added at a random position.
The data set is divided 9:1 into a training set and a testing set. The extended training set is annotated with labelImg software: the vehicle position in each picture is labeled and a yolo-format txt annotation file is generated. Each annotation line has the format "0 m1 n1 m2 n2", where 0 is the vehicle category and m1, n1, m2 and n2 are the normalized coordinates.
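As an illustration, a hypothetical helper that normalizes absolute box corners into the "0 m1 n1 m2 n2" line format described above; note that stock yolov5 labels use normalized center/width/height rather than corners, so this follows the patent's description rather than the upstream tool:

```python
def to_label_line(x1, y1, x2, y2, img_w, img_h, cls_id=0):
    """Normalize pixel corner coordinates to the patent's txt label format."""
    m1, n1 = x1 / img_w, y1 / img_h
    m2, n2 = x2 / img_w, y2 / img_h
    return f"{cls_id} {m1:.6f} {n1:.6f} {m2:.6f} {n2:.6f}"

# A box at pixels (100, 50)-(300, 250) inside a 640x480 frame:
print(to_label_line(100, 50, 300, 250, 640, 480))
# 0 0.156250 0.104167 0.468750 0.520833
```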
S2, constructing a vehicle detection model based on yolov5 algorithm, training the vehicle detection model by adopting a marked training set, identifying the test set by utilizing the trained vehicle detection model, and outputting to obtain the vehicle position information in the test set picture.
In the embodiment of the invention, a ResNet-18 pre-trained model is used as the vehicle detection model, and the labeled training set is input into it for training; specifically, the number of training iterations is set to 400, the optimizer is Adam, and the initial learning rate is 0.001. The trained vehicle detection model then recognizes the test set and outputs the vehicle position information and confidence for each test-set picture. After each iteration, the test accuracy is compared with that of the previous iteration; if the current accuracy is higher, the currently generated model is saved as the optimal model, until all iterations are completed.
Vehicle position prediction boxes are formed from the output position information, and non-maximum suppression is applied to them according to their overlap ratios and confidences to obtain the final predicted boxes. The confidence scores are ranked and the highest-scoring prediction box m is selected; for every other prediction box, the intersection and union of its area with that of m are computed and their ratio P is taken. If P is greater than 0.5, the other box is removed. This is repeated until all input prediction boxes have been processed, yielding the final vehicle position prediction boxes.
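The greedy suppression described above can be sketched in plain Python, with the 0.5 overlap threshold from the text:

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def nms(boxes, scores, p_thresh=0.5):
    """Keep the highest-scoring box, drop remaining boxes whose IoU with it
    exceeds p_thresh, and repeat until no boxes are left."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        m = order.pop(0)
        keep.append(m)
        order = [i for i in order if iou(boxes[m], boxes[i]) <= p_thresh]
    return keep

boxes = [(0, 0, 100, 100), (10, 10, 110, 110), (200, 200, 300, 300)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))   # [0, 2] -- the two overlapping boxes collapse to one
```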
The minimum enclosing area q1 of the finally predicted vehicle position box and the labeled vehicle position box is computed, together with their intersection area q2. From q1 and q2, the proportion of the enclosing area that belongs to neither box is q = |q1 - q2| / q1, and the GIoU between the predicted and labeled box coordinates is GIoU = p - q. The GIoU loss is back-propagated through the training network to drive it down until the model converges; the network is recomputed and the model saved at every iteration, and after training the test precision of each saved model is compared and the model with the highest precision is kept.
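A sketch of this step under the standard GIoU formulation (IoU minus the fraction of the smallest enclosing box covered by neither input box), to which the p and q quantities above correspond; function names are illustrative:

```python
def giou(a, b):
    """Generalized IoU of two boxes (x1, y1, x2, y2):
    GIoU = IoU - |C \\ (A u B)| / |C|, with C the smallest enclosing box."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    # smallest axis-aligned box enclosing both inputs
    cx1, cy1 = min(a[0], b[0]), min(a[1], b[1])
    cx2, cy2 = max(a[2], b[2]), max(a[3], b[3])
    c = (cx2 - cx1) * (cy2 - cy1)
    return inter / union - (c - union) / c

def giou_loss(a, b):
    """Loss to back-propagate: 0 for identical boxes, larger when apart."""
    return 1.0 - giou(a, b)

print(round(giou((0, 0, 2, 2), (1, 1, 3, 3)), 4))   # -0.0794
```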
And S3, cutting the vehicle area from the original image according to the vehicle position information, labeling the vehicle data according to the vehicle direction attribute, storing the vehicle data separately, and forming a vehicle direction data set.
That is, each vehicle detected in an image is cropped out according to the vehicle position information output by the vehicle detection model and scaled to a fixed size of 250 x 250, forming the vehicle orientation data set. The amounts of data for the two vehicle orientations are kept at approximately 1:1, and the vehicle orientation data cover as many viewing angles as possible, including front, top and side views.
S4, constructing a vehicle orientation recognition model based on a BCNN algorithm, and training the vehicle orientation recognition model by adopting the labeled vehicle orientation data set.
Before the labeled vehicle orientation data set is used to train the vehicle orientation recognition model, the data set must be enhanced. The enhancement comprises center cropping, adding random noise, horizontal flipping, and random rotation within +/-30 degrees. Center cropping first adapts the short side of the original image to the target size to obtain an adapted image img1, then computes a translation offset from the width of img1 and the target size, and crops img1 by that offset to reach the target size without leaving blank areas. For example, if the image to be cropped is 20 x 8 and the target is 5 x 4, then (20/5 = 4) > (8/4 = 2), so the short side 8 is adapted, i.e. reduced to 4, and the long side 20 becomes 10, giving a 10 x 4 image; the translation offset dx is then computed from the adapted width 10 and the target width 5 as dx = (10 - 5) x 0.5 = 2.5, and the adapted image is cropped by this offset to the target size.
Before this enhancement is applied, however, a black rectangular region is filled into each of the upper-left and upper-right corners of the input image: the upper-left rectangle's top-left corner coincides with the image's top-left corner, its left and top edges coinciding with the image's left and top edges, and the upper-right rectangle is placed symmetrically, its right and top edges coinciding with the image's right and top edges. The four enhancement operations above are then applied individually to each image with the black regions added, and in addition three of them are selected at random and superimposed on each such image, forming the final enhanced vehicle orientation data set. Using both single and superimposed enhancement enriches the data and increases the robustness of the algorithm.
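The center-cropping arithmetic of the 20 x 8 to 5 x 4 example can be sketched as follows; nearest-neighbour resizing is used for brevity, where a real pipeline would use proper interpolation:

```python
import numpy as np

def center_crop(img, tw, th):
    """Short-side-adapted center crop: scale so the tighter dimension matches
    the target, then crop the excess in the other dimension symmetrically."""
    h, w = img.shape[:2]
    scale = max(tw / w, th / h)          # short-side adaptation factor
    nw, nh = round(w * scale), round(h * scale)
    # nearest-neighbour resize via index maps (sketch-grade)
    ys = (np.arange(nh) / scale).astype(int).clip(0, h - 1)
    xs = (np.arange(nw) / scale).astype(int).clip(0, w - 1)
    resized = img[ys][:, xs]
    dx, dy = (nw - tw) // 2, (nh - th) // 2   # translation offsets
    return resized[dy:dy + th, dx:dx + tw]

img = np.arange(20 * 8).reshape(8, 20)   # the 20x8 example from the text
print(center_crop(img, 5, 4).shape)       # (4, 5): a 5x4 crop, no blank area
```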
The enhanced data are divided 9:1 into a vehicle orientation training set and test set, the training set being nine times the size of the test set. The category of each picture is recorded in a txt document with labels 0 and 1, where 0 denotes the vehicle head and 1 the vehicle tail; each line of the txt document consists of the image path followed by the category information, separated by a space.
Specifically, as shown in fig. 2, the vehicle orientation recognition model consists mainly of a backbone network N comprising convolution and pooling layers; conv1^2 denotes two consecutive convolution operations (the exponent gives the number of convolutions), conv3^3 denotes three consecutive convolutions, and so on. Features are extracted by the convolution layers; the last conv5 layer is transposed to obtain conv5_T, the feature matrix of conv5 is matrix-multiplied with that of conv5_T, the result is normalized, and the normalized features are connected to a fully connected layer.
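A NumPy sketch of this bilinear step, multiplying the flattened conv5 feature matrix by its own transpose and normalizing, as in a symmetric BCNN; the signed-square-root step is the usual BCNN normalization and is an assumption here, since the text only says "normalization":

```python
import numpy as np

def bilinear_pool(feat):
    """Bilinear pooling of a conv feature map of shape (C, H, W):
    flatten spatial positions, form the C x C matrix of pairwise feature
    products, then apply signed-sqrt and L2 normalization."""
    c, h, w = feat.shape
    x = feat.reshape(c, h * w)
    bilinear = x @ x.T / (h * w)                 # (C, C), conv5 times conv5_T
    vec = bilinear.ravel()
    vec = np.sign(vec) * np.sqrt(np.abs(vec))    # signed square root
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec           # L2 normalization

# A toy conv5 output: 16 channels over a 7x7 spatial grid.
feat = np.random.default_rng(0).standard_normal((16, 7, 7))
v = bilinear_pool(feat)
print(v.shape)   # (256,) -- a C*C descriptor fed to the fully connected layer
```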
During prediction, the maximum of the fully connected layer's output is taken and the index of its position is recorded; this index value is the predicted label. For example, if the fully connected layer outputs (0.9, 0.3), the maximum is 0.9 at index 0, so the current picture is predicted to be a vehicle head.
Training of the vehicle orientation recognition model proceeds in two steps. First, all layers except the backbone network are frozen and the model is trained to convergence; second, the layers frozen in the first step are unfrozen and fine-tune training is performed on the model from the first step, re-adjusting the weights of every layer until the model converges. At the start of the first step, some hyper-parameters are set: initial learning rate lr = 1, training iterations epochs = 150, and weight decay weight_decay = 1e-8. The input image passes through the normalization layer and the fully connected layer to obtain a predicted value, and the loss between the prediction and the true label is computed with cross-entropy, using the following formula:
loss = -SUM_{x=1}^{n} p(x) * log q(x)
where p (x) is the true label, q (x) is the output value obtained from the fully connected layer, n is the total number of training classes for vehicle orientation, and x is the current training class for vehicle orientation.
The cross-entropy loss value is computed and back-propagated through the network: each iteration back-propagates the loss once and updates the model weights once. The model is saved every 10 iterations and its accuracy is computed on the test set until the model gradually converges; finally the model with the highest test accuracy is selected.
And S5, sequentially predicting the images in the input monitoring scene data by using the trained vehicle detection model and vehicle orientation recognition model, and outputting the type of the vehicle orientation.
Meanwhile, when the vehicle orientation recognition model predicts the orientation in an input vehicle image, black rectangular regions must likewise be added to the upper-left and upper-right corners of the input image, preventing the model from extracting interference information from other vehicles and improving classification accuracy.
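The black-corner masking applied to inputs at both training and prediction time might look like this; the corner size (25% of each dimension) is an assumption, as the patent does not specify the rectangle dimensions:

```python
import numpy as np

def mask_top_corners(img, frac=0.25):
    """Fill black rectangles flush with the top-left and top-right borders,
    as described in the text. frac sets the rectangle size (assumed)."""
    out = img.copy()
    h, w = out.shape[:2]
    rw, rh = int(w * frac), int(h * frac)
    out[:rh, :rw] = 0           # top-left rectangle
    out[:rh, w - rw:] = 0       # top-right rectangle
    return out

img = np.full((8, 8), 255, dtype=np.uint8)
masked = mask_top_corners(img)
print(masked[0, 0], masked[0, 7], masked[7, 0])   # 0 0 255
```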
According to the vehicle orientation recognition method based on yolov5 and BCNN, the vehicle detection model and the vehicle orientation recognition model are trained and used for prediction in sequence. The prediction of the vehicle detection model yields a rectangular region containing only vehicle information, which reduces interference from objects outside that region; the vehicle orientation recognition model then recognizes the orientation of the detected vehicle and, when extracting features, focuses on local features of the object, which makes it easier to distinguish the vehicle head from the tail and thus determine the orientation. Using the detection model's prediction as the input of the orientation recognition model further improves its classification accuracy. Detecting and recognizing vehicle orientation by combining the yolov5 and BCNN algorithms gives more accurate localization, faster detection and stronger robustness, and accomplishes the task of identifying vehicles in multi-angle and complex scenes.
As shown in fig. 3, an embodiment of the present invention further provides a vehicle orientation identification system based on yolov5 and BCNN, which includes the following functional modules:
the data making module 10 is used for acquiring monitoring scene data to form a vehicle detection data set, dividing the vehicle detection data set into a training set and a testing set according to a proportion, and marking vehicle positions in a picture of the training set;
the vehicle position identification module 20 is used for constructing a vehicle detection model based on the yolov5 algorithm, training the vehicle detection model by adopting a marked training set, identifying a test set by utilizing the trained vehicle detection model, and outputting the vehicle position information in a test set picture;
the orientation data processing module 30 is used for cutting the vehicle area from the original image according to the vehicle position information, labeling and separately storing the vehicle data according to the vehicle orientation attribute to form a vehicle orientation data set;
the orientation model training module 40 is used for constructing a vehicle orientation recognition model based on a BCNN algorithm and training the vehicle orientation recognition model by adopting a labeled vehicle orientation data set;
and a vehicle orientation recognition module 50, configured to predict images in the input monitored scene data sequentially by using the trained vehicle detection model and vehicle orientation recognition model, and output a category of the vehicle orientation.
The implementation of the vehicle orientation recognition system based on yolov5 and BCNN in this embodiment is substantially the same as the vehicle orientation recognition method based on yolov5 and BCNN, and therefore, detailed description thereof is omitted.
The server in this embodiment is a device that provides computing services, generally a computer with high computing power made available to multiple consumers over a network. The server of this embodiment comprises a memory containing an executable program, a processor, and a system bus. It will be understood by those skilled in the art that the terminal device structure of this embodiment does not limit the terminal device, which may include more or fewer components than shown, combine certain components, or arrange the components differently.
The memory may be used to store software programs and modules, and the processor may execute various functional applications of the terminal and data processing by operating the software programs and modules stored in the memory. The memory may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data (such as audio data, a phonebook, etc.) created according to the use of the terminal, etc. Further, the memory may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device.
The memory contains an executable program of the vehicle orientation identification method based on yolov5 and BCNN, the executable program can be divided into one or more modules/units, the one or more modules/units are stored in the memory and executed by the processor to complete the information acquisition and implementation process, and the one or more modules/units can be a series of computer program instruction segments capable of completing specific functions, and the instruction segments are used for describing the execution process of the computer program in the server. For example, the computer program may be divided into a data production module 10, a vehicle position identification module 20, an orientation data processing module 30, an orientation model training module 40, a vehicle orientation identification module 50.
The processor is a control center of the server, connects various parts of the whole terminal equipment by various interfaces and lines, and executes various functions of the terminal and processes data by running or executing software programs and/or modules stored in the memory and calling data stored in the memory, thereby performing overall monitoring of the terminal. Alternatively, the processor may include one or more processing units; preferably, the processor may integrate an application processor, which mainly handles operating systems, application programs, etc., and a modem processor, which mainly handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor.
The system bus is used to connect the functional units in the computer and can transmit data information, address information, and control information; the bus type can be, for example, a PCI bus, an ISA bus, or a VESA bus. The system bus is responsible for data and instruction interaction between the processor and the memory. Of course, the system bus can also connect other devices, such as network interfaces and display devices.
The server includes at least a CPU, a chipset, a memory, a disk system, and the like; the other components are not described herein again.
In the embodiment of the present invention, the executable program executed by the processor of the terminal specifically implements a vehicle orientation identification method based on yolov5 and BCNN, comprising the following steps:
collecting monitoring scene data to form a vehicle detection data set, dividing the vehicle detection data set into a training set and a testing set according to a proportion, and marking vehicle positions in a picture of the training set;
constructing a vehicle detection model based on the yolov5 algorithm, training the vehicle detection model with the marked training set, identifying the test set with the trained vehicle detection model, and outputting the vehicle position information in the test set pictures;
cutting the vehicle area from the original image according to the vehicle position information, and labeling and separately storing the vehicle data according to the vehicle orientation attribute to form a vehicle orientation data set;
constructing a vehicle orientation recognition model based on a BCNN algorithm, and training the vehicle orientation recognition model by adopting a labeled vehicle orientation data set;
and sequentially predicting the images in the input monitoring scene data by using the trained vehicle detection model and the trained vehicle orientation recognition model, and outputting the type of the vehicle orientation.
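The orientation recognition stage above relies on the bilinear pooling that gives a BCNN its discriminative power for fine-grained classes. The following is a minimal numpy sketch of that pooling step only, under the assumption that two CNN streams have already produced feature maps; the function name and shapes are illustrative, not the patent's implementation.

```python
import numpy as np

def bilinear_pool(feat_a, feat_b):
    """Bilinear pooling of two CNN feature maps of shapes (C1, H, W) and (C2, H, W).

    Returns an L2-normalised bilinear feature vector, the representation a
    BCNN classifier feeds to its final orientation classifier.
    """
    c1, h, w = feat_a.shape
    c2 = feat_b.shape[0]
    a = feat_a.reshape(c1, h * w)
    b = feat_b.reshape(c2, h * w)
    # Outer product of the two streams, pooled over all spatial locations.
    bilinear = a @ b.T / (h * w)              # shape (C1, C2)
    x = bilinear.reshape(-1)
    x = np.sign(x) * np.sqrt(np.abs(x))       # signed square root
    x = x / (np.linalg.norm(x) + 1e-12)       # L2 normalisation
    return x
```

Orientation prediction is then a linear classifier (e.g. softmax over the orientation categories) on this vector.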
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A vehicle orientation identification method based on yolov5 and BCNN is characterized by comprising the following steps:
collecting monitoring scene data to form a vehicle detection data set, dividing the vehicle detection data set into a training set and a testing set according to a proportion, and marking vehicle positions in a picture of the training set;
constructing a vehicle detection model based on the yolov5 algorithm, training the vehicle detection model with the marked training set, identifying the test set with the trained vehicle detection model, and outputting the vehicle position information in the test set pictures;
cutting the vehicle area from the original image according to the vehicle position information, and labeling and separately storing the vehicle data according to the vehicle orientation attribute to form a vehicle orientation data set;
constructing a vehicle orientation recognition model based on a BCNN algorithm, and training the vehicle orientation recognition model by adopting a labeled vehicle orientation data set;
and sequentially predicting the images in the input monitoring scene data by using the trained vehicle detection model and the trained vehicle orientation recognition model, and outputting the type of the vehicle orientation.
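The hand-off in claim 1 from detected positions to the orientation data set is a crop step. A minimal sketch, assuming detector output as axis-aligned (x1, y1, x2, y2) boxes in pixel coordinates (the function name is illustrative):

```python
import numpy as np

def crop_vehicles(image, boxes):
    """Cut each detected vehicle region (x1, y1, x2, y2) out of the frame.

    Boxes are clipped to the image bounds before cropping; the resulting
    crops are the raw material for the vehicle orientation data set.
    """
    h, w = image.shape[:2]
    crops = []
    for x1, y1, x2, y2 in boxes:
        x1, x2 = max(0, int(x1)), min(w, int(x2))
        y1, y2 = max(0, int(y1)), min(h, int(y2))
        if x2 > x1 and y2 > y1:
            crops.append(image[y1:y2, x1:x2].copy())
    return crops
```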
2. The yolov5 and BCNN-based vehicle orientation identification method according to claim 1, wherein the collecting monitoring scenario data forms a vehicle detection dataset, specifically comprising: and extracting and storing images with motion foregrounds in the monitored scene data to form a vehicle detection data set.
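The motion-foreground filtering of claim 2 can be approximated with simple frame differencing. A hedged sketch: the threshold values are illustrative assumptions, and a production system might instead use a background-subtraction model; the claim does not fix a particular method.

```python
import numpy as np

def has_motion_foreground(prev_frame, frame, diff_thresh=25, min_ratio=0.01):
    """Flag a frame as containing a motion foreground.

    A frame is kept for the vehicle detection data set when the fraction of
    pixels whose absolute grey-level difference from the previous frame
    exceeds `diff_thresh` is at least `min_ratio`.
    """
    diff = np.abs(frame.astype(np.int16) - prev_frame.astype(np.int16))
    moving = (diff > diff_thresh).mean()
    return moving >= min_ratio
```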
3. The vehicle orientation recognition method based on yolov5 and BCNN as claimed in claim 1, wherein the loss calculation method of the vehicle detection model is as follows:
forming prediction frames according to the output vehicle position information, and performing non-maximum suppression on the prediction frames according to the overlap ratio and confidence information between the prediction frames to obtain the final predicted coordinate information of the prediction frame;
and calculating the GIoU loss between the prediction frame and the labeling frame according to the final predicted coordinate information of the prediction frame and the labeled coordinate information of the labeling frame.
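The two steps of claim 3 can be sketched as follows, for axis-aligned (x1, y1, x2, y2) boxes. This is a generic greedy-NMS and GIoU formulation, not the patent's exact code; the IoU threshold is an illustrative assumption.

```python
import numpy as np

def iou_and_giou(box_a, box_b):
    """IoU and GIoU for two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    iou = inter / union
    # Smallest box enclosing both.
    enclose = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    giou = iou - (enclose - union) / enclose
    return iou, giou

def giou_loss(pred_box, gt_box):
    """GIoU loss between a prediction box and its labelled box."""
    _, giou = iou_and_giou(pred_box, gt_box)
    return 1.0 - giou

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression; returns indices of the kept boxes."""
    order = np.argsort(scores)[::-1]          # highest confidence first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        # Drop boxes that overlap the kept box too strongly.
        order = np.array([j for j in rest
                          if iou_and_giou(boxes[i], boxes[j])[0] <= iou_thresh])
    return keep
```

Note that GIoU, unlike plain IoU, still provides a gradient signal when boxes do not overlap, which is why it is the common choice for box-regression losses.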
4. The yolov5 and BCNN-based vehicle orientation recognition method of claim 1, wherein the vehicle orientation data set needs to be enhanced before the vehicle orientation recognition model is trained using the labeled vehicle orientation data set.
5. The yolov5 and BCNN-based vehicle orientation recognition method of claim 4, wherein before the enhancement processing is performed on the vehicle orientation data set, a black rectangular region is filled in each of the upper left corner and the upper right corner of the input image: the upper left corner of the upper-left rectangular region coincides with the upper left corner point of the input image, and its left and upper sides coincide with the left and upper sides of the input image; the upper right corner of the upper-right rectangular region coincides with the upper right corner point of the input image, and its right and upper sides coincide with the right and upper sides of the input image.
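The corner-masking step of claim 5 can be sketched as below. The rectangle sizes are caller-chosen assumptions: the claim only fixes that each rectangle is anchored flush to its corner of the image.

```python
import numpy as np

def add_corner_masks(image, rect_h, rect_w):
    """Fill black rectangles into the upper-left and upper-right corners.

    Both rectangles share their top edge with the top of the image; the left
    one is flush with the left edge, the right one with the right edge, as
    described in claim 5.
    """
    out = image.copy()
    out[:rect_h, :rect_w] = 0          # upper-left rectangle
    out[:rect_h, -rect_w:] = 0         # upper-right rectangle
    return out
```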
6. The yolov5 and BCNN-based vehicle orientation recognition method according to claim 4, wherein the enhancement process comprises a center cropping process, a random noise adding process, a horizontal flipping process and a random rotation process within a range of ± 30 °.
7. The yolov5 and BCNN-based vehicle orientation recognition method according to claim 6, wherein each of the four enhancement processes is performed on each image with black regions added, and then three randomly selected enhancement processes are superimposed on each image with black regions added, so as to form the final enhanced vehicle orientation data set.
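The augmentation scheme of claims 6-7 can be sketched with numpy alone. The crop fraction, noise sigma, and the nearest-neighbour rotation are illustrative assumptions; the claims fix only the four operation types and the ±30° rotation range.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded for reproducibility

def center_crop(img, frac=0.9):
    h, w = img.shape[:2]
    ch, cw = int(h * frac), int(w * frac)
    top, left = (h - ch) // 2, (w - cw) // 2
    return img[top:top + ch, left:left + cw]

def add_noise(img, sigma=10.0):
    noisy = img.astype(np.float32) + rng.normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0, 255).astype(img.dtype)

def hflip(img):
    return img[:, ::-1]

def random_rotate(img, max_deg=30.0):
    """Nearest-neighbour rotation by a random angle in [-30, +30] degrees."""
    angle = np.deg2rad(rng.uniform(-max_deg, max_deg))
    h, w = img.shape[:2]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w]
    # Inverse mapping: for each output pixel, find its source pixel.
    sy = np.round(cy + (ys - cy) * np.cos(angle) - (xs - cx) * np.sin(angle)).astype(int)
    sx = np.round(cx + (ys - cy) * np.sin(angle) + (xs - cx) * np.cos(angle)).astype(int)
    inside = (sy >= 0) & (sy < h) & (sx >= 0) & (sx < w)
    out = np.zeros_like(img)
    out[inside] = img[sy[inside], sx[inside]]
    return out

def augment(img):
    """Claim 7's scheme: each of the four enhancements once, plus one image
    produced by superimposing three randomly chosen enhancements."""
    ops = [center_crop, add_noise, hflip, random_rotate]
    singles = [op(img) for op in ops]
    out = img
    for idx in rng.choice(len(ops), size=3, replace=False):
        out = ops[idx](out)
    return singles + [out]
```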
8. A vehicle orientation identification system based on yolov5 and BCNN is characterized by comprising the following functional modules:
the data making module is used for acquiring monitoring scene data to form a vehicle detection data set, dividing the vehicle detection data set into a training set and a testing set according to a proportion, and marking the vehicle position in a picture of the training set;
the vehicle position identification module is used for constructing a vehicle detection model based on the yolov5 algorithm, training the vehicle detection model by adopting a marked training set, identifying a test set by utilizing the trained vehicle detection model, and outputting the vehicle position information in a test set picture;
the orientation data processing module is used for cutting the vehicle area from an original image according to the vehicle position information, labeling and separately storing the vehicle data according to the vehicle orientation attribute to form a vehicle orientation data set;
the orientation model training module is used for constructing a vehicle orientation recognition model based on a BCNN algorithm and training the vehicle orientation recognition model by adopting a labeled vehicle orientation data set;
and the vehicle orientation identification module is used for sequentially predicting the images in the input monitoring scene data by utilizing the trained vehicle detection model and the trained vehicle orientation identification model and outputting the type of the vehicle orientation.
9. A server comprising a memory, a processor and a computer program stored in said memory and executable on said processor, wherein said processor when executing said computer program performs the steps of any of the yolov5 and BCNN based vehicle orientation recognition methods of claims 1-7.
10. A computer-readable storage medium, in which a computer program is stored, which, when being executed by a processor, carries out the steps of the yolov5 and BCNN-based vehicle orientation identification method according to any one of claims 1 to 7.
CN202110973740.3A 2021-08-24 2021-08-24 Yolov5 and BCNN-based vehicle orientation identification method and system Pending CN113850136A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110973740.3A CN113850136A (en) 2021-08-24 2021-08-24 Yolov5 and BCNN-based vehicle orientation identification method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110973740.3A CN113850136A (en) 2021-08-24 2021-08-24 Yolov5 and BCNN-based vehicle orientation identification method and system

Publications (1)

Publication Number Publication Date
CN113850136A true CN113850136A (en) 2021-12-28

Family

ID=78976078

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110973740.3A Pending CN113850136A (en) 2021-08-24 2021-08-24 Yolov5 and BCNN-based vehicle orientation identification method and system

Country Status (1)

Country Link
CN (1) CN113850136A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114511899A (en) * 2021-12-30 2022-05-17 武汉光庭信息技术股份有限公司 Street view video fuzzy processing method and system, electronic equipment and storage medium
CN114495509A (en) * 2022-04-08 2022-05-13 四川九通智路科技有限公司 Method for monitoring tunnel running state based on deep neural network
CN114495509B (en) * 2022-04-08 2022-07-12 四川九通智路科技有限公司 Method for monitoring tunnel running state based on deep neural network
CN115880676A (en) * 2022-12-21 2023-03-31 南通大学 Self-service vending machine commodity identification method based on deep learning
CN115880676B (en) * 2022-12-21 2024-04-09 南通大学 Self-service vending machine commodity identification method based on deep learning
CN117079324A (en) * 2023-08-17 2023-11-17 厚德明心(北京)科技有限公司 Face emotion recognition method and device, electronic equipment and storage medium
CN117079324B (en) * 2023-08-17 2024-03-12 厚德明心(北京)科技有限公司 Face emotion recognition method and device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN113076871B (en) Fish shoal automatic detection method based on target shielding compensation
CN113850136A (en) Yolov5 and BCNN-based vehicle orientation identification method and system
CN111612008A (en) Image segmentation method based on convolution network
CN111667030B (en) Method, system and storage medium for realizing remote sensing image target detection based on deep neural network
CN109492596B (en) Pedestrian detection method and system based on K-means clustering and regional recommendation network
CN110020650B (en) Inclined license plate recognition method and device based on deep learning recognition model
CN109087337B (en) Long-time target tracking method and system based on hierarchical convolution characteristics
CN110781882A (en) License plate positioning and identifying method based on YOLO model
CN111738344A (en) Rapid target detection method based on multi-scale fusion
CN110827312A (en) Learning method based on cooperative visual attention neural network
CN110852327A (en) Image processing method, image processing device, electronic equipment and storage medium
CN113947766A (en) Real-time license plate detection method based on convolutional neural network
Giang et al. TopicFM: Robust and interpretable topic-assisted feature matching
CN114445651A (en) Training set construction method and device of semantic segmentation model and electronic equipment
CN115482523A (en) Small object target detection method and system of lightweight multi-scale attention mechanism
CN115527050A (en) Image feature matching method, computer device and readable storage medium
CN113487610B (en) Herpes image recognition method and device, computer equipment and storage medium
CN107948586A (en) Trans-regional moving target detecting method and device based on video-splicing
CN117095300B (en) Building image processing method, device, computer equipment and storage medium
CN113887649A (en) Target detection method based on fusion of deep-layer features and shallow-layer features
CN110020688B (en) Shielded pedestrian detection method based on deep learning
CN116523957A (en) Multi-target tracking method, system, electronic equipment and storage medium
CN114219757B (en) Intelligent damage assessment method for vehicle based on improved Mask R-CNN
CN115690770A (en) License plate recognition method based on space attention characteristics in non-limited scene
CN116091784A (en) Target tracking method, device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination