CN111160481A - Deep learning-based adas target detection method and system - Google Patents
Deep learning-based adas target detection method and system
- Publication number
- CN111160481A CN111160481A CN201911412209.8A CN201911412209A CN111160481A CN 111160481 A CN111160481 A CN 111160481A CN 201911412209 A CN201911412209 A CN 201911412209A CN 111160481 A CN111160481 A CN 111160481A
- Authority
- CN
- China
- Prior art keywords
- data set
- data
- model
- adas
- training
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Image Analysis (AREA)
Abstract
The invention relates to a deep learning-based adas target detection method and system. The method comprises the following steps: acquiring road image data to form an initial data set; establishing a data augmentation strategy and expanding the initial data set according to that strategy to form an extended data set; providing an adas data set and combining the adas data set, the extended data set and the initial data set to form a training data set; performing model training with the training data set to obtain a target detection model; and compressing and quantizing the target detection model to form an implantable model adapted to adas. Starting from the adas data set and fusing public data sets with the collected data set, the invention makes the adas data cover the various road conditions found in China and avoids the problem of overly scenario-specific data.
Description
Technical Field
The invention relates to the technical field of target detection, in particular to an adas target detection method and system based on deep learning.
Background
Deep learning is increasingly applied to vehicle perception algorithms. In the field of visual perception, many techniques are based on target detection and road-surface segmentation, and the deep learning training process generally includes data augmentation and preprocessing steps, which are briefly described below.
Because of the particularities of the ADAS application scene, data acquisition and labelling must take into account factors such as road conditions, weather, time and illumination. Currently popular adas data processing basically acquires scenario-specific data, expands it through colour change, brightness adjustment, size scaling, cropping, deflection or added noise, and then preprocesses the extended data set, for example by normalization and whitening. Another popular method is WordTree, which fuses multiple data sets to cover different scenes. However, the data sets used by the WordTree method are each strongly scenario-specific and their sample class distributions are unbalanced: some data sets target a logistics park and emphasise truck detection, while others target ordinary roads and emphasise car detection. Forcing them together easily causes model training to overfit, or leaves objects that do appear poorly detected.
Disclosure of Invention
The aim of the invention is to overcome the defects of the prior art by providing a deep learning-based adas target detection method and system, solving the problem that the strongly scenario-specific data sets used in existing data enhancement processing easily lead to over-trained, over-fitted models.
The technical scheme for realizing the purpose is as follows:
the invention provides an adas target detection method based on deep learning, which comprises the following steps:
acquiring road image data to form an initial data set;
establishing a data augmentation strategy, and performing extension processing on the initial data set according to the data augmentation strategy to form an extended data set;
providing an adas data set, and combining the adas data set, the extended data set and the initial data set to form a training data set;
performing model training by using the training data set to obtain a target detection model; and
and compressing and quantizing the target detection model to form an implantable model adaptive to the adas.
The invention starts from the adas data set and fuses public data sets with the collected data set, so that the adas data covers the various road conditions found in China and the problem of overly scenario-specific data is avoided. The invention also applies data augmentation according to a data augmentation strategy, so that only a small amount of domestic road image data needs to be collected; in other words, the public adas data sets can be applied well to domestic roads.
The deep learning-based adas target detection method is further improved in that the method further comprises the following steps:
establishing various data augmentation strategies;
expanding the data sets with the selected number in the initial data set by using the established multiple data augmentation strategies to form a test set;
performing model training by using the test set to obtain a corresponding test model;
providing a verification set, verifying the test model by using the verification set to obtain a verification result, and sorting the obtained verification results from small to large;
and selecting a data augmentation strategy with the top ranking to perform expansion processing on the initial data set to form an expanded data set.
The deep learning-based adas target detection method is further improved in that before the initial data set is subjected to expansion processing, the method further comprises the following steps:
carrying out sample class distribution statistics on the image data in each initial data set, and drawing a sample class distribution curve graph;
and selecting similar data sets or data set parts in the sample category distribution curve graph for fusion, or selecting the data set parts of the categories with consistent curve trends for fusion.
The deep learning-based adas target detection method is further improved in that the step of performing model training by using the training data set to obtain a target detection model comprises the following steps:
performing model training by adopting a training algorithm based on feature points, and extracting each image sample in the training data set;
converting the image sample into a heatmap;
and taking the peak point of each hot spot in the heatmap as input data of model training, taking the width information, the height information and the category information of each hot spot as output data, and performing model training to obtain a corresponding target detection model.
The deep learning-based adas target detection method is further improved in that the method further comprises the following steps:
establishing a lightweight network forward propagation framework;
inputting the implantable model which is formed by compression quantization and is adapted to adas into the lightweight network forward propagation framework, and obtaining the implantable model of the lightweight framework by utilizing the lightweight network forward propagation framework.
The invention also provides an adas target detection system based on deep learning, which comprises the following steps:
the acquisition module is used for acquiring road image data to form an initial data set;
the data expansion module is connected with the acquisition module and used for expanding the initial data set according to the established data expansion strategy to form an expanded data set;
the obtaining module is used for obtaining an adas data set;
the model training module is connected with the acquisition module, the data expansion module and the obtaining module and is used for utilizing the adas data set, the expanded data set and the initial data set as the training data set and carrying out model training to obtain a target detection model; and
and the model quantization module is connected with the model training module and is used for compressing and quantizing the target detection model to form an implantable model adaptive to the adas.
The deep learning-based adas target detection system is further improved in that the system further comprises a strategy screening module connected with the data expansion module and the acquisition module;
the strategy screening module stores a plurality of established data augmentation strategies; the strategy screening module utilizes the plurality of data augmentation strategies to expand a selected number of data sets in the initial data set to form a test set, utilizes the test set to perform model training to obtain corresponding test models, verifies the test models through the verification set to obtain verification results, sorts the verification results from small to large, and selects the top-ranked data augmentation strategies to send to the data expansion module.
The deep learning-based adas target detection system is further improved by comprising a data fusion module connected with the acquisition module;
the data fusion module is used for carrying out sample class distribution statistics on the image data in each initial data set and drawing a sample class distribution curve graph; and selecting similar data sets or data set parts in the sample category distribution curve graph for fusion, or selecting the data set parts of the categories with consistent curve trends for fusion.
The deep learning-based adas target detection system is further improved in that the model training module performs model training by adopting a training algorithm based on feature points: it extracts each image sample in the training data set, converts the image sample into a heatmap, takes the peak point of each hot spot in the heatmap as input data of model training, takes the width information, height information and category information of each hot spot as output data, and performs model training to obtain a corresponding target detection model.
The adas target detection system based on deep learning is further improved by further comprising a lightweight network forward propagation framework connected with the model quantization module, and the lightweight network forward propagation framework is used for converting an implantable model which is formed by compression quantization and is adaptive to adas into an implantable model of the lightweight framework.
Drawings
Fig. 1 is a flowchart of an adas target detection method based on deep learning according to the present invention.
Fig. 2 is a system diagram of the adas target detection system based on deep learning according to the present invention.
Detailed Description
The invention is further described with reference to the following figures and specific examples.
Referring to fig. 1, the invention provides an adas target detection method and system based on deep learning, intended to solve the problems of the WordTree approach in existing data enhancement processing: the data sets are strongly scenario-specific, the sample classes are unevenly distributed, and forced combination easily causes the model to overfit or leaves some objects poorly detected. The adas target detection method and system design a new data strategy and solve the problem of unbalanced sample distribution after fusion by computing the class distribution of each data set in advance. By applying data augmentation and collecting a small amount of domestic data, the public adas data sets become well suited to domestic roads. The method also addresses the problem that forcibly combined data, acquired under different standards and labelled at different quality levels, causes much noise to be learned as features and further increases the difficulty of model training, as well as the problem that size scaling or cropping loses much of the image's semantic information or introduces unnecessary noise, greatly reducing the expansibility of the image data. The invention designs 10 data augmentation strategies and combines them with model training, reducing the time spent on dedicated data augmentation, and at the same time adopts multi-scale training so that the model framework adapts better to data sets of different scales. The invention adopts a training algorithm based on feature points rather than on candidate boxes: the positions of target pixels are treated as the key points of the target, the sample is sent directly into the convolutional network and the pixels themselves are operated on, which greatly reduces the computation of the algorithm while also reducing resource and memory consumption. The method and system for detecting adas targets based on deep learning according to the present invention are described below with reference to the accompanying drawings.
Referring to fig. 2, a system diagram of the adas target detection system based on deep learning of the present invention is shown. The adas target detection system based on deep learning of the present invention is described with reference to fig. 2.
As shown in fig. 2, the adas target detection system based on deep learning of the present invention includes an acquisition module 21, a data expansion module 22, an obtaining module 23, a model training module 24, and a model quantization module 25. The acquisition module 21 is configured to acquire road image data to form an initial data set; the data expansion module 22 is connected with the acquisition module 21 and is configured to receive the initial data set acquired by the acquisition module 21 and expand it according to the established data augmentation strategy to form an expanded data set; the obtaining module 23 is configured to obtain an adas data set, which is an existing public data set whose road conditions largely fail to match the actual scenes of domestic roads. The model training module 24 is connected with the acquisition module 21, the data expansion module 22 and the obtaining module 23, and is configured to take the adas data set, the expanded data set and the initial data set as the training data set and perform model training to obtain a target detection model; the model quantization module 25 is connected to the model training module 24 and is configured to compress and quantize the target detection model to form an implantable model adapted to adas.
In a specific embodiment of the present invention, the acquisition module 21 is a vehicle-mounted camera configured to capture video of the road on which the vehicle is travelling to form road image data. Preferably, the road image data reflects actual domestic road scenes, and the acquired road image data is fused with the adas data set, so that the model training data covers the various domestic road conditions and the existing adas data is thereby expanded.
In a specific embodiment of the present invention, the adas target detection system of the present invention further includes a data fusion module connected to the acquisition module 21. The data fusion module is used for carrying out sample class distribution statistics on the image data in each initial data set and drawing a sample class distribution graph, and for selecting similar data sets or data set parts in the sample class distribution graphs for fusion, or selecting the data set parts of the classes whose curves follow a consistent trend for fusion. The data fusion module thus fuses data sets, or parts of data sets, of the same classes, so as to resolve the serious sample imbalance in the acquired initial data sets. Specifically, each data set has its own focus: some contain only vehicles and pedestrians without signboard labels, while others do label signboards. If they are fused forcibly without processing, signboards are poorly recognised and the samples are seriously unbalanced. Because the subsequent model training adopts a training algorithm based on feature points (that is, training on pixel points), labelling errors need not be considered too much. For data sets whose distribution curves are dissimilar, the data set parts of the classes whose curves follow a consistent trend are fused, so that the data can be expanded in a targeted manner.
When judging whether two sample class distribution graphs are similar, each graph is divided into segments at a set interval, the corresponding segments are compared to see whether they are the same, and the proportion of matching segments is computed; if that proportion exceeds 85%, the two sample class distribution graphs are judged similar, otherwise they are not. Similarly, when judging whether the curve trends are consistent, the two curves are overlaid; where they coincide, wholly or in part, the overlapping portion is judged to have a consistent trend and those parts are fused.
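By way of illustration only, a minimal sketch of this similarity test is given below. The description does not fix an implementation, so the segment count, the comparison tolerance and all function names are assumptions; the 85% threshold is the one stated above.

```python
import numpy as np

def class_distribution(labels, num_classes):
    """Count per-class sample frequencies and normalize them into a distribution curve."""
    counts = np.bincount(labels, minlength=num_classes).astype(float)
    return counts / max(counts.sum(), 1.0)

def similar_distributions(dist_a, dist_b, num_segments=3, threshold=0.85, tol=0.05):
    """Split both distribution curves into segments and compare them pairwise.

    The two curves are judged 'similar' when the proportion of matching
    segments exceeds the threshold (85% in the description above).
    """
    seg_a = np.array_split(dist_a, num_segments)
    seg_b = np.array_split(dist_b, num_segments)
    same = sum(1 for a, b in zip(seg_a, seg_b)
               if np.allclose(a, b, atol=tol))   # segments match within a tolerance
    return same / num_segments > threshold

# Illustrative usage with two toy label sets (0 = car, 1 = pedestrian, 2 = signboard).
labels_a = np.array([0, 0, 0, 1, 1, 2])
labels_b = np.array([0, 0, 1, 1, 0, 2])
print(similar_distributions(class_distribution(labels_a, 3), class_distribution(labels_b, 3)))
```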
In a specific embodiment of the present invention, the adas target detection system of the present invention further includes a strategy screening module connected to the data expansion module 22 and the acquisition module 21. The strategy screening module stores a plurality of established data augmentation strategies; it utilizes these strategies to expand a selected number of data sets in the initial data set to form a test set, performs model training with the test set to obtain corresponding test models, verifies the test models through the verification set to obtain verification results, sorts the verification results from small to large, and selects the top-ranked data augmentation strategies to send to the data expansion module.
Preferably, 10 data augmentation strategies are established. Each of the 10 strategies is used to expand a selected number of data sets from the initial data set to form a test set; model training is performed on each test set to obtain the corresponding test model; each test model is verified against the verification set to obtain its loss, which is the verification result; the losses are sorted from small to large and the 5 data augmentation strategies with the lowest losses are selected; the initial data sets are then expanded with these 5 strategies to form the expanded data set. During expansion, one of the 5 strategies is chosen at random for each initial data set until all the initial data sets have been expanded.
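The description above does not spell out the training and evaluation routine used during screening. The sketch below therefore treats them as placeholders (train_fn and eval_fn are assumed callables supplied by the rest of the pipeline) and only illustrates the ranking step: train one test model per augmentation strategy, score it on the verification set, and keep the five strategies with the lowest loss.

```python
import random

def screen_augmentation_strategies(strategies, train_subset, val_set,
                                   train_fn, eval_fn, keep=5):
    """Rank augmentation strategies by verification loss and keep the best ones.

    `strategies` maps a strategy name to an augmentation callable;
    `train_fn(test_set)` is assumed to return a trained test model and
    `eval_fn(model, val_set)` its loss on the verification set. Both are
    placeholders for whatever training code the rest of the pipeline uses.
    """
    scored = []
    for name, augment in strategies.items():
        test_set = [augment(sample) for sample in train_subset]  # expand the selected subset
        model = train_fn(test_set)                               # train a test model
        loss = eval_fn(model, val_set)                           # verify it
        scored.append((loss, name))
    scored.sort()                                                # sort losses small to large
    return [name for _, name in scored[:keep]]

def expand_initial_set(initial_set, kept_strategies, strategies):
    """Expand each sample with one randomly chosen strategy from the kept five."""
    return [strategies[random.choice(kept_strategies)](sample) for sample in initial_set]
```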
Specifically, the 10 data augmentation strategies are: a colour augmentation method, a transformation augmentation method, a cutout (partial deletion) augmentation method, an intra-bbox transformation method, a duck-filling (copy-and-paste) method, sample pairing, MixMatch (semi-supervised learning), Mixup, a search-space-based strategy, and fusion of generated images with real images. The colour augmentation method processes the image data by brightness adjustment, colour-balance adjustment, contrast adjustment and image blurring to obtain new, expanded images; this artificially adds some noise to the original image data. The transformation augmentation method applies mirroring, rotation, flipping, scaling and cropping to operate on the image data physically, so that the image data undergoes different transformations and new expanded images are obtained. Cutout adds one or more black patches to a sample image (i.e., to the image data). The intra-bbox transformation method applies 1 or 2 such operations to the target inside the bbox. The duck-filling method cuts out some targets and pastes them onto images that do not contain them, improving robustness; because the targets are transplanted there is a certain randomness, so the transplanted targets cannot be placed at arbitrary positions in the image nor subjected to an arbitrary series of operations. Sample pairing directly averages the pixels of two pictures while the label is left unchanged; during training, the usual data augmentation of the bdd100k data set is used first for several epochs, sample pairing is then enabled intermittently, and once the training loss and accuracy are stable sample pairing is disabled for fine tuning. This is equivalent to randomly introducing noise, and artificially introducing guiding samples, into the training set.
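Two of the simpler strategies listed above, cutout and sample pairing, could be sketched as follows. The patch count, patch size and equal averaging weights are illustrative choices, not values fixed by the description.

```python
import numpy as np

def cutout(image, num_patches=1, patch_size=32, rng=None):
    """Cutout: mask one or more rectangular regions of the sample image with black."""
    rng = rng or np.random.default_rng()
    out = image.copy()
    h, w = out.shape[:2]
    for _ in range(num_patches):
        y = rng.integers(0, max(h - patch_size, 1))
        x = rng.integers(0, max(w - patch_size, 1))
        out[y:y + patch_size, x:x + patch_size] = 0
    return out

def sample_pairing(image_a, image_b):
    """Sample pairing: average the pixels of two images; the label of image_a is kept."""
    return ((image_a.astype(np.float32) + image_b.astype(np.float32)) / 2.0).astype(image_a.dtype)

# Illustrative usage on two random stand-ins for road images.
img1 = np.random.randint(0, 256, (128, 128, 3), dtype=np.uint8)
img2 = np.random.randint(0, 256, (128, 128, 3), dtype=np.uint8)
augmented = sample_pairing(cutout(img1), img2)
```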
MixMatch is mainly used to prevent overfitting and to perform well on larger data sets; it is essentially an implicit way of augmenting data during training. 1) A batch is selected at random from the data set and denoted A; the data of this batch are augmented conventionally, without changing the labels. 2) A batch of the same size as A is taken and denoted B; ignoring its labels, it is randomly augmented k times (preferably k = 2), the augmented data are fed into a simple classifier trained in advance so as to compute an average classification probability, and the result is processed with the temperature Sharpen algorithm, which yields a guessed label for each sample of batch B. The Sharpen algorithm is:

Sharpen(p, T)_i = p_i^(1/T) / Σ_j p_j^(1/T)

where T is a hyper-parameter (the temperature) and p is the probability that a sample belongs to a label; the Sharpen step pushes the model toward low-entropy decisions. 3) At this point batch A has been expanded and carries definite labels, while the processing of the second step yields k batches whose labels have been guessed, so the data have again been expanded; A and the k batches are then randomly rearranged, one batch is picked out of the rearranged data and denoted C, and C is Mixup-processed with the original A (the Mixup method is described below). 4) A further batch is then picked from the rearranged data and Mixup-processed with A to obtain a new batch D; the results of these Mixup operations on D and on A and C are recorded as u' and x'. The loss is then calculated over these data sets as a supervised loss on x' plus a weighted consistency loss on u', in the usual MixMatch form L = L_X + λ_U·L_U.
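A direct reading of the Sharpen step described above could be implemented as below; the temperature value used in the example call is illustrative.

```python
import numpy as np

def sharpen(p, T=0.5):
    """Sharpen averaged class probabilities: p_i^(1/T) / sum_j p_j^(1/T).

    Lower temperatures push the distribution toward a low-entropy (more
    confident) guessed label for the unlabeled batch B samples.
    """
    p = np.asarray(p, dtype=np.float64)
    powered = p ** (1.0 / T)
    return powered / powered.sum(axis=-1, keepdims=True)

# Averaged predictions over k augmentations of one unlabeled sample.
print(sharpen([0.5, 0.3, 0.2]))   # becomes more peaked around the first class
```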
the Mixup domain distribution, assuming that the model behaves linearly when dealing with the sample and the region between samples, reduces the non-adaptability of the data beyond predictive training, as follows:
where λ Beta (α), α ∈ (0, ∞)
The Mixup super-parameter α controls the strength of interpolation between feature-target vectors, reverting to the ERM principle when α → 0.
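A minimal mixup sketch consistent with the formulas above, assuming samples are NumPy arrays with one-hot labels; the value of α in the example is illustrative.

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.2, rng=None):
    """Mixup: form a convex combination of two samples and of their labels.

    lambda is drawn from Beta(alpha, alpha); as alpha approaches 0 the draws
    concentrate at 0 or 1 and training reverts toward ordinary ERM.
    """
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    x = lam * x1 + (1.0 - lam) * x2
    y = lam * y1 + (1.0 - lam) * y2
    return x, y

# Illustrative usage with one-hot labels.
xa = np.random.rand(64, 64, 3); ya = np.array([1.0, 0.0])
xb = np.random.rand(64, 64, 3); yb = np.array([0.0, 1.0])
x_mixed, y_mixed = mixup(xa, ya, xb, yb)
```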
The search-space-based strategy defines an image enhancement policy as an unordered set of K sub-policies, one of which is selected at random during training to enhance each image. Each sub-policy consists of N image transformations applied to the data in turn, and the aim is to search for the most effective policy among these transformations. Together these variables define the search space of a discrete optimization problem; for data enhancement of the target detection task, K = 5 and N = 2. The search space therefore contains five sub-policies, each comprising 2 image operations, and each operation carries two parameters: the probability p with which it is applied and the specific magnitude m of the operation. The probability defines the randomness of the enhancement applied to a data sample, while m defines the magnitude of the enhancement.
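Under the stated assumptions of K = 5 sub-policies with N = 2 operations each, every operation carrying a probability p and a magnitude m, the random-selection step might look like the following skeleton; the concrete operations are placeholders rather than the transforms actually used.

```python
import random

# Each sub-policy: a list of (operation, probability p, magnitude m) triples.
# The operations here are illustrative stand-ins for real image transforms.
def identity(img, m): return img
def fake_rotate(img, m): return img        # placeholder for an m-degree rotation
def fake_brightness(img, m): return img    # placeholder for a brightness shift of m

SEARCH_SPACE = [
    [(fake_rotate, 0.6, 15), (fake_brightness, 0.4, 0.2)],
    [(fake_brightness, 0.8, 0.1), (identity, 1.0, 0)],
    [(fake_rotate, 0.3, 30), (fake_brightness, 0.7, 0.3)],
    [(identity, 1.0, 0), (fake_rotate, 0.5, 10)],
    [(fake_brightness, 0.5, 0.4), (fake_rotate, 0.9, 5)],
]  # K = 5 sub-policies, N = 2 operations each

def apply_random_sub_policy(image, search_space=SEARCH_SPACE, rng=random):
    """Pick one sub-policy at random and apply its operations in order,
    each with its own probability p and magnitude m."""
    for op, p, m in rng.choice(search_space):
        if rng.random() < p:
            image = op(image, m)
    return image
```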
Fusing generated images with real images additionally applies occlusion, surrounding context and similar effects to the generated images, and may also incorporate some of the strategies above.
Further, during the expansion processing, multi-scale training is added to adjust the image size dynamically: the size of the image sample is not fixed, and after expansion a new image size is chosen at random for training, so that target features are obtained at different resolutions (see the sketch after this paragraph). Preferably, the different semantic content of features at different levels is taken into account: shallow features carry weaker semantics but the features of small targets remain clear, so shallow layers suit small-target detection while large targets are not well detected there; deep features carry strong semantics but the features of small targets have largely disappeared, so deep layers suit large-target detection. A range of valid candidate boxes is therefore set: in the shallow features, the candidate boxes of small targets are treated as valid, the loss is computed only for the small targets and targets of other scales are ignored; in the deep features, the candidate boxes of large targets are treated as valid, the loss is computed only for the large targets and the small-target sizes are ignored. In this way the best features are extracted from the feature layer of the corresponding scale and detection accuracy increases.
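The multi-scale step described above amounts to redrawing the network input resolution at intervals during training; a sketch is given below, in which the candidate sizes and the redraw interval are illustrative assumptions.

```python
import random

class MultiScaleSizer:
    """Multi-scale training helper: every `interval` iterations a new input
    resolution is drawn so the model sees targets at several scales.
    The candidate sizes (multiples of the network stride) are illustrative."""
    def __init__(self, sizes=(320, 384, 448, 512, 576, 640), interval=10):
        self.sizes = sizes
        self.interval = interval
        self.current = sizes[0]

    def size_for(self, iteration):
        if iteration % self.interval == 0:
            self.current = random.choice(self.sizes)
        return self.current

# Illustrative loop: the chosen size is handed to whatever resize/letterbox step
# the data pipeline already applies before batching.
sizer = MultiScaleSizer()
for it in range(30):
    input_size = sizer.size_for(it)
```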
In a specific embodiment of the present invention, the model training module 24 performs model training with a training algorithm based on feature points: it extracts each image sample from the training data set, converts the image sample into a heatmap, takes the peak point of each hot spot in the heatmap as input data for model training, takes the width, height and category information of each hot spot as output data, and performs model training to obtain the corresponding target detection model.
The model training module 24 of the invention does not adopt candidate-box-based algorithms such as yolov3 or ssd, but a training algorithm based on feature points: a training sample is sent directly into the convolutional network to obtain its heatmap, and the peak points of the heatmap are taken as centre points. From the position of each heatmap peak, the width and height of the target are predicted. Let (x1^(k), y1^(k), x2^(k), y2^(k)) be the bbox of target k (of class c_k), whose centre point lies at p_k = ((x1^(k)+x2^(k))/2, (y1^(k)+y2^(k))/2). In addition, the size s_k = (x2^(k) - x1^(k), y2^(k) - y1^(k)) of each target k is regressed, and to reduce the computational load a single size prediction is shared across the target classes. An L1 loss is further added at the centre-point position (the offset term L_off below). Here the image size is not normalized; the original pixel coordinates are used directly, so the loss function is adjusted to

L_det = L_k + λ_size·L_size + λ_off·L_off,

taking λ_size = 0.1 and λ_off = 1. The whole target detection model outputs, at every position, the class together with the width and height information of the target.
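The description does not fix the exact key-point encoding. The sketch below shows one common way, in the style of centre-point detectors which the paragraph above resembles, of turning box annotations into a per-class centre heatmap plus width/height targets; the Gaussian radius rule and the stride are illustrative assumptions.

```python
import numpy as np

def gaussian2d(radius, sigma):
    """A (2r+1)x(2r+1) Gaussian kernel used to splat one object centre."""
    y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    return np.exp(-(x * x + y * y) / (2.0 * sigma * sigma))

def build_heatmap(boxes, classes, num_classes, out_h, out_w, stride=4):
    """Turn (x1, y1, x2, y2) boxes into a per-class centre heatmap plus
    width/height regression targets at each centre, as sketched in the text."""
    heatmap = np.zeros((num_classes, out_h, out_w), dtype=np.float32)
    sizes = {}                      # centre pixel -> (width, height) regression target
    for (x1, y1, x2, y2), c in zip(boxes, classes):
        cx = int(((x1 + x2) / 2) / stride)
        cy = int(((y1 + y2) / 2) / stride)
        radius = max(1, int(min(x2 - x1, y2 - y1) / (2 * stride)))  # illustrative radius rule
        g = gaussian2d(radius, sigma=radius / 3.0 + 1e-6)
        y0, y1g = max(0, cy - radius), min(out_h, cy + radius + 1)
        x0, x1g = max(0, cx - radius), min(out_w, cx + radius + 1)
        gy0, gx0 = y0 - (cy - radius), x0 - (cx - radius)
        patch = g[gy0:gy0 + (y1g - y0), gx0:gx0 + (x1g - x0)]
        heatmap[c, y0:y1g, x0:x1g] = np.maximum(heatmap[c, y0:y1g, x0:x1g], patch)
        sizes[(cy, cx)] = ((x2 - x1) / stride, (y2 - y1) / stride)
    return heatmap, sizes

hm, sz = build_heatmap([(40, 60, 120, 140)], [0], num_classes=3, out_h=96, out_w=96)
```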
Preferably, when the model training module 24 performs model training, during training on each training data set the model is saved after every epoch (one full pass over the training data), and an effect-map test and a heatmap test are then run on the saved model, so that the training situation is known and the training strategy can be adjusted at any time. The adas target detection system thus allows the effect map of every epoch to be observed in real time and adjustments to be made promptly.
The model training process is also the process of fusing the initial data set, the extended data set and the adas data set. After data fusion is finished, a small amount of data is selected in equal proportion and a trial training is run on it, mainly to verify whether the initial parameter choices of the model are appropriate, to observe model convergence and adjust the learning-rate strategy, to choose a suitable optimization algorithm according to the convergence behaviour, and to verify that the training process is correct. The trial training can be regarded as a debug mode in which the code is checked and the quality of the network structure is judged. To evaluate the effect of the training model promptly, the model is saved after every epoch and the effect-map and heatmap tests are run on the saved model, so that the training situation is known and the training strategy can be adjusted at any time.
In an embodiment of the present invention, the adas target detection system of the present invention further includes a lightweight network forward propagation framework connected to the model quantization module, for converting the implantable model adapted to adas formed by compression quantization into an implantable model of the lightweight framework. Specifically, because model training frameworks differ, it is difficult to port an implantable model between them, and a training framework contains many components unnecessary for embedded computation. The lightweight network forward propagation framework therefore considers only the forward-propagation implementation of the trained model; because the back-propagation training machinery is removed, a large number of dependencies can be dropped. The forward-propagation implementation covers loading the pre-trained model file, input preprocessing, the implementation of each network layer in the forward pass, and finally the output. The pre-trained model files are first obtained by training on different frameworks, and a conversion tool developed for this purpose then converts the models of the different deep learning frameworks into models supported by the unified lightweight framework.
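The description names neither the conversion tool nor the lightweight framework. Purely as an illustration of the idea, a detection model trained in PyTorch could be exported to a framework-neutral, forward-only format such as ONNX, which a lightweight inference runtime can then load; the stand-in model and file name below are placeholders.

```python
import torch
import torch.nn as nn

# A stand-in for the trained detection model (illustrative only).
model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.Conv2d(16, 8, 1))
model.eval()

# Export only the forward pass; back-propagation machinery is not carried over.
dummy_input = torch.randn(1, 3, 512, 512)
torch.onnx.export(model, dummy_input, "adas_detector.onnx",
                  input_names=["image"], output_names=["heatmap"],
                  opset_version=11)
```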
The model quantization module 25 compresses and quantizes the target detection model, slimming the model by combining channel pruning, quantization and distillation algorithms. First, in the channel pruning stage, two different methods are combined to judge the importance of channels. The first is LASSO regression, i.e. an L1-norm constraint is added to the weights so that they become sparse. The second judges the importance of filters from their entropy: the global average pooling output of each layer converts a feature map into a vector of length c (the number of filters), so n images give an n x c matrix; the responses of each filter are divided into m bins, the probability of each bin is counted and the entropy is computed; the entropy determines the importance of the filter, and unimportant filters are cut. After the entropy of each channel is computed, either a pruning threshold is set or a fixed compression ratio is set, the entropies are sorted from large to small and only the first k filters are kept. Here the n images are a subset of the training set or the validation set. The two methods are independent of each other, and the channels selected by both (their intersection, or their union if the intersection is too small to reach the preset pruning rate) are clipped, which guarantees the quality of the clipped channels.
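A compact sketch of the entropy half of the channel-importance test is given below, assuming the global-average-pooled responses of one layer over the n images have already been collected into an n x c matrix; the bin count and retention ratio are illustrative.

```python
import numpy as np

def filter_entropies(gap_responses, num_bins=32):
    """Entropy of each filter's global-average-pooled response over n images.

    gap_responses: array of shape (n_images, n_filters).
    A low entropy suggests the filter responds almost identically to every
    image and is therefore a candidate for pruning.
    """
    n_images, n_filters = gap_responses.shape
    entropies = np.empty(n_filters)
    for j in range(n_filters):
        hist, _ = np.histogram(gap_responses[:, j], bins=num_bins)
        p = hist / max(hist.sum(), 1)
        p = p[p > 0]
        entropies[j] = -(p * np.log(p)).sum()
    return entropies

def keep_top_k_filters(gap_responses, keep_ratio=0.5):
    """Rank filters by entropy (large to small) and keep the first k."""
    ent = filter_entropies(gap_responses)
    k = max(1, int(len(ent) * keep_ratio))
    return np.argsort(ent)[::-1][:k]     # indices of the filters to keep

# Illustrative usage: 200 images, 64 filters in this layer.
responses = np.random.randn(200, 64)
kept = keep_top_k_filters(responses)
```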
Quantization is then performed on the pruned model. All the weights of each layer are divided into several clusters and the centre of each cluster is found; each weight is then represented by a small index pointing to its centre, together with the position and value of each centre. Assuming n connections, each originally represented by b bits, and k clusters, only log2(k) bits are needed to represent an index, so the compression rate can be calculated as

r = n·b / (n·log2(k) + k·b),

where n·b is the total number of bits required before clustering, and n·log2(k) + k·b is the number of bits for the cluster indices plus the bits needed for the connections (the codebook) after clustering.
The clustering uses the K-means method to determine the shared weights of each layer; the weights within each cluster are shared, noting that weights cannot be shared across layers, only within a layer. Linear initialization is chosen for the shared weights (the cluster centres), i.e. the interval from the minimum to the maximum of the original weights is divided linearly to give the initial cluster centres. In the final round of training, the gradients of the weights in each cluster are accumulated, multiplied by the learning rate and subtracted from the cluster centre, fine-tuning and updating the cluster centres.
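A minimal per-layer weight-sharing sketch following the description above: linear initialization of k cluster centres between the layer's minimum and maximum weight, nearest-centre assignment, and the compression rate r = n·b / (n·log2(k) + k·b). The cluster count and bit width are illustrative.

```python
import numpy as np

def quantize_layer_weights(weights, k=16, iters=10):
    """Cluster one layer's weights into k shared values (weights are shared only
    within the layer, never across layers). Centres are initialized linearly
    between the minimum and maximum weight, as described above."""
    w = weights.ravel().astype(np.float64)
    centers = np.linspace(w.min(), w.max(), k)                 # linear initialization
    for _ in range(iters):                                     # plain k-means updates
        assign = np.argmin(np.abs(w[:, None] - centers[None, :]), axis=1)
        for j in range(k):
            if np.any(assign == j):
                centers[j] = w[assign == j].mean()
    quantized = centers[assign].reshape(weights.shape)         # every weight -> its centre
    return quantized, assign.reshape(weights.shape), centers

def compression_rate(n_connections, k, bits_per_weight=32):
    """r = n*b / (n*log2(k) + k*b): original bits over index bits plus codebook bits."""
    return (n_connections * bits_per_weight) / (
        n_connections * np.log2(k) + k * bits_per_weight)

w = np.random.randn(64, 64)
w_q, idx, codebook = quantize_layer_weights(w)
print(compression_rate(w.size, k=16))   # roughly 8x for 32-bit weights and 16 clusters
```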
Finally, a network distillation method is adopted to recover performance. This teacher-student approach belongs to transfer learning: the teacher network is usually a more complex network with very good performance and generalization ability, and its outputs serve as soft targets to guide a simpler student network, so that the simpler student model, with far fewer parameters and less computation, can approach the performance of the teacher network. This replaces the original fine-tuning step, so the model compressed in the first step can be compressed further while its performance remains essentially unchanged.
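A hedged sketch of the teacher-student step: the student is trained against a softened teacher distribution in addition to the ordinary hard-target loss. The temperature, the loss weighting and the cross-entropy placeholder (which the actual detection losses would replace) are illustrative.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, targets, T=4.0, alpha=0.7):
    """Soft-target loss from the teacher plus the usual hard-target loss.

    The KL term is scaled by T*T so its gradient magnitude stays comparable
    when the temperature changes (standard knowledge-distillation practice).
    """
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, targets)
    return alpha * soft + (1.0 - alpha) * hard

# Illustrative usage with random logits for a 5-class head.
s = torch.randn(8, 5); t = torch.randn(8, 5); y = torch.randint(0, 5, (8,))
loss = distillation_loss(s, t, y)
```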
The adas target detection system of the invention has the following beneficial effects:
the method disclosed by the invention is used for well applying the disclosed adas data set to domestic roads by data expansion and matching with collection of a small amount of domestic data sets, and a model training frame for key point detection is adopted, so that the resource consumption and the memory consumption can be greatly reduced, the detection rate can reach 23ms per frame before uncompressed quantization, and the detection precision of large, medium and small targets can reach map 0.577. After model compression, the detection precision is hardly lost, and the detection speed per frame can reach within 20 ms.
Through the forward inference framework, models trained on any framework can be converted without being limited by the training framework, which makes deployment convenient; because the framework mainly implements the forward propagation of the network, it consumes less memory, and its performance can be specially optimized for the hardware actually deployed.
The invention starts from the adas data set, fusing public data sets with the collected data so that the adas data covers the various road conditions found in China; it then applies several data augmentation strategies during training and combines them with a pixel key-point detection algorithm to design a new detection method. During training, parameter-tuning and training techniques are used, and the effect map of every epoch can be observed in real time and adjusted promptly. Finally, according to the characteristics of the trained detector, a compression-quantization method combining several compression strategies is designed, which effectively removes unnecessary weights, increases model speed and reduces model parameters without losing much accuracy. The compressed model is then converted to a forward-propagation framework suitable for embedded deployment.
The invention also provides an adas target detection method based on deep learning, which is explained below.
The invention discloses an adas target detection method based on deep learning, which comprises the following steps:
as shown in fig. 1, step S11 is executed to acquire road image data to form an initial data set; then, step S12 is executed;
executing step S12, establishing a data augmentation strategy, and performing expansion processing on the initial data set according to the data augmentation strategy to form an expanded data set; then, step S13 is executed;
executing step S13, providing an adas data set, and combining the adas data set, the extended data set and the initial data set to form a training data set; then, step S14 is executed;
executing step S14, performing model training by using the training data set to obtain a target detection model; then, step S15 is executed;
step S15 is executed to perform compression quantization on the target detection model to form an implantable model adapted to adas.
The invention starts from the adas data set and fuses public data sets with the collected data set, so that the adas data covers the various road conditions found in China and the problem of overly scenario-specific data is avoided. The invention also applies data augmentation according to a data augmentation strategy, so that only a small amount of domestic road image data needs to be collected; in other words, the public adas data sets can be applied well to domestic roads.
In a specific embodiment of the present invention, the adas target detection method further includes:
establishing various data augmentation strategies;
expanding the data sets with the selected number in the initial data set by using the established multiple data augmentation strategies to form a test set;
carrying out model training by using the test set to obtain a corresponding test model;
providing a verification set, verifying the test model by using the verification set to obtain a verification result, and sorting the obtained verification results from small to large;
and selecting a data augmentation strategy with the top ranking to perform expansion processing on the initial data set to form an expanded data set.
In an embodiment of the present invention, before the expanding the initial data set, the method further includes:
carrying out sample class distribution statistics on the image data in each initial data set, and drawing a sample class distribution curve graph;
and selecting similar data sets or data set parts in the sample category distribution curve graph for fusion, or selecting the data set parts of the categories with consistent curve trends for fusion.
In one embodiment of the present invention, the step of performing model training using a training data set to obtain a target detection model includes:
performing model training by adopting a training algorithm based on feature points, and extracting each image sample in a training data set;
converting the image sample into a heatmap;
and taking the peak point of each hot spot in the heatmap as input data of model training, taking the width information, the height information and the category information of each hot spot as output data, and performing model training to obtain a corresponding target detection model.
In a specific embodiment of the present invention, the adas target detection method further includes:
establishing a lightweight network forward propagation framework;
inputting the implantable model which is formed by compression quantization and is adapted to adas into a lightweight network forward propagation framework, and obtaining the implantable model of the lightweight framework by using the lightweight network forward propagation framework.
While the present invention has been described in detail and with reference to the embodiments thereof as illustrated in the accompanying drawings, it will be apparent to one skilled in the art that various changes and modifications can be made therein. Therefore, certain details of the embodiments are not to be interpreted as limiting, and the scope of the invention is to be determined by the appended claims.
Claims (10)
1. An adas target detection method based on deep learning is characterized by comprising the following steps:
acquiring road image data to form an initial data set;
establishing a data augmentation strategy, and performing extension processing on the initial data set according to the data augmentation strategy to form an extended data set;
providing an adas data set, and combining the adas data set, the extended data set and the initial data set to form a training data set;
performing model training by using the training data set to obtain a target detection model; and
and compressing and quantizing the target detection model to form an implantable model adaptive to the adas.
2. The adas target detection method based on deep learning of claim 1, further comprising:
establishing various data augmentation strategies;
expanding the data sets with the selected number in the initial data set by using the established multiple data augmentation strategies to form a test set;
performing model training by using the test set to obtain a corresponding test model;
providing a verification set, verifying the test model by using the verification set to obtain a verification result, and sorting the obtained verification results from small to large;
and selecting a data augmentation strategy with the top ranking to perform expansion processing on the initial data set to form an expanded data set.
3. The adas target detection method based on deep learning of claim 2, wherein before the expanding process is performed on the initial data set, the method further comprises:
carrying out sample class distribution statistics on the image data in each initial data set, and drawing a sample class distribution curve graph;
and selecting similar data sets or data set parts in the sample category distribution curve graph for fusion, or selecting the data set parts of the categories with consistent curve trends for fusion.
4. The deep learning based adas target detection method of claim 1, wherein the step of performing model training using the training data set to obtain a target detection model comprises:
performing model training by adopting a training algorithm based on feature points, and extracting each image sample in the training data set;
converting the image sample into a heatmap;
and taking the peak point of each hot spot in the heatmap as input data of model training, taking the width information, the height information and the category information of each hot spot as output data, and performing model training to obtain a corresponding target detection model.
5. The adas target detection method based on deep learning of claim 1, further comprising:
establishing a lightweight network forward propagation framework;
inputting the implantable model which is formed by compression quantization and is adapted to adas into the lightweight network forward propagation framework, and obtaining the implantable model of the lightweight framework by utilizing the lightweight network forward propagation framework.
6. An adas target detection system based on deep learning, comprising:
the acquisition module is used for acquiring road image data to form an initial data set;
the data expansion module is connected with the acquisition module and used for expanding the initial data set according to the established data expansion strategy to form an expanded data set;
the obtaining module is used for obtaining an adas data set;
the model training module is connected with the acquisition module, the data expansion module and the obtaining module and is used for utilizing the adas data set, the expanded data set and the initial data set as the training data set and carrying out model training to obtain a target detection model; and
and the model quantization module is connected with the model training module and is used for compressing and quantizing the target detection model to form an implantable model adaptive to the adas.
7. The deep learning based adas target detection system of claim 6, further comprising a policy screening module connected to the data expansion module and the acquisition module;
the strategy screening module stores a plurality of established data augmentation strategies; the strategy screening module utilizes the plurality of data augmentation strategies to expand a selected number of data sets in the initial data set to form a test set, utilizes the test set to perform model training to obtain corresponding test models, verifies the test models through the verification set to obtain verification results, sorts the verification results from small to large, and selects the top-ranked data augmentation strategies to send to the data expansion module.
8. The deep learning based adas target detection system of claim 7, further comprising a data fusion module connected to the acquisition module;
the data fusion module is used for carrying out sample class distribution statistics on the image data in each initial data set and drawing a sample class distribution curve graph; and selecting similar data sets or data set parts in the sample category distribution curve graph for fusion, or selecting the data set parts of the categories with consistent curve trends for fusion.
9. The deep learning-based adas target detection system according to claim 6, wherein the model training module performs model training by using a feature point-based training algorithm, extracts each image sample from the training data set, converts the image sample into a heatmap, takes the peak point of each hot spot in the heatmap as input data for model training, takes width information, height information, and category information of each hot spot as output data, and performs model training to obtain a corresponding target detection model.
10. The deep learning-based adas target detection system of claim 6, further comprising a lightweight network forward propagation framework connected to the model quantization module for converting an implantable model adapted to adas formed by compressed quantization into an implantable model of a lightweight framework.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911412209.8A CN111160481B (en) | 2019-12-31 | 2019-12-31 | Adas target detection method and system based on deep learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911412209.8A CN111160481B (en) | 2019-12-31 | 2019-12-31 | Adas target detection method and system based on deep learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111160481A true CN111160481A (en) | 2020-05-15 |
CN111160481B CN111160481B (en) | 2024-05-10 |
Family
ID=70560287
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911412209.8A Active CN111160481B (en) | 2019-12-31 | 2019-12-31 | Adas target detection method and system based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111160481B (en) |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105761306A (en) * | 2016-01-29 | 2016-07-13 | 珠海汇迪科技有限公司 | Road surface model based on field depth image or point cloud and establishment method thereof |
CN206383949U (en) * | 2017-01-04 | 2017-08-08 | 江西沃可视发展有限公司 | Driving safety system based on the pure image procossings of ADAS |
CN107316004A (en) * | 2017-06-06 | 2017-11-03 | 西北工业大学 | Space Target Recognition based on deep learning |
WO2019196130A1 (en) * | 2018-04-12 | 2019-10-17 | 广州飒特红外股份有限公司 | Classifier training method and device for vehicle-mounted thermal imaging pedestrian detection |
CN110414480A (en) * | 2019-08-09 | 2019-11-05 | 威盛电子股份有限公司 | Training image production method and electronic device |
Non-Patent Citations (2)
Title |
---|
AI研习社: "多任务深度学习框架在ADAS中的应用 | 分享总结", pages 1 - 8 * |
AI研习社: "多任务深度学习框架在ADAS中的应用|分享总结", HTTPS://ZHUANLAN.ZHIHU.COM/P/29816608, pages 1 - 8 * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112163450A (en) * | 2020-08-24 | 2021-01-01 | 中国海洋大学 | Based on S3High-frequency ground wave radar ship target detection method based on D learning algorithm |
CN112163238A (en) * | 2020-09-09 | 2021-01-01 | 中国科学院信息工程研究所 | Network model training method for multi-party participation data unshared |
CN112598020A (en) * | 2020-11-24 | 2021-04-02 | 深兰人工智能(深圳)有限公司 | Target identification method and system |
CN112488033A (en) * | 2020-12-10 | 2021-03-12 | 北京金山云网络技术有限公司 | Data set construction method and device, electronic equipment and storage medium |
CN113111587A (en) * | 2021-04-20 | 2021-07-13 | 北京理工雷科电子信息技术有限公司 | Reusable and extensible machine learning method based on plug-in model |
CN114037053A (en) * | 2021-10-28 | 2022-02-11 | 岚图汽车科技有限公司 | Vehicle visual perception data enhancement method based on GAN and related equipment |
Also Published As
Publication number | Publication date |
---|---|
CN111160481B (en) | 2024-05-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111160481B (en) | Adas target detection method and system based on deep learning | |
CN113688723B (en) | Infrared image pedestrian target detection method based on improved YOLOv5 | |
CN111008562B (en) | Human-vehicle target detection method with feature map depth fusion | |
CN109684922B (en) | Multi-model finished dish identification method based on convolutional neural network | |
CN110852288B (en) | Cell image classification method based on two-stage convolutional neural network | |
CN110322445B (en) | Semantic segmentation method based on maximum prediction and inter-label correlation loss function | |
CN110807757B (en) | Image quality evaluation method and device based on artificial intelligence and computer equipment | |
CN111833322B (en) | Garbage multi-target detection method based on improved YOLOv3 | |
CN111914797B (en) | Traffic sign identification method based on multi-scale lightweight convolutional neural network | |
CN104598924A (en) | Target matching detection method | |
CN111428556A (en) | Traffic sign recognition method based on capsule neural network | |
CN113239865B (en) | Deep learning-based lane line detection method | |
CN112163508A (en) | Character recognition method and system based on real scene and OCR terminal | |
CN112149526B (en) | Lane line detection method and system based on long-distance information fusion | |
CN110503157B (en) | Image steganalysis method of multitask convolution neural network based on fine-grained image | |
CN117876383B (en) | Yolov5 l-based highway surface strip-shaped crack detection method | |
CN116563683A (en) | Remote sensing image scene classification method based on convolutional neural network and multi-layer perceptron | |
CN118212572A (en) | Road damage detection method based on improvement YOLOv7 | |
CN112560668B (en) | Human behavior recognition method based on scene priori knowledge | |
CN112132839B (en) | Multi-scale rapid face segmentation method based on deep convolution cascade network | |
CN117893737A (en) | Jellyfish identification and classification method based on YOLOv-LED | |
CN112132207A (en) | Target detection neural network construction method based on multi-branch feature mapping | |
CN115761667A (en) | Unmanned vehicle carried camera target detection method based on improved FCOS algorithm | |
CN107341456B (en) | Weather sunny and cloudy classification method based on single outdoor color image | |
CN116091918A (en) | Land utilization classification method and system based on data enhancement |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |