CN110826457A - Vehicle detection method and device under complex scene - Google Patents


Info

Publication number
CN110826457A
CN110826457A (application CN201911050728.4A; granted as CN110826457B)
Authority
CN
China
Prior art keywords
vehicle
module
network
detection network
vehicle detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911050728.4A
Other languages
Chinese (zh)
Other versions
CN110826457B (en)
Inventor
张焕芹
罗国慧
毛士杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Science And Technology Ltd Of Upper Hiroad Army
Original Assignee
Science And Technology Ltd Of Upper Hiroad Army
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Science And Technology Ltd Of Upper Hiroad Army filed Critical Science And Technology Ltd Of Upper Hiroad Army
Priority to CN201911050728.4A priority Critical patent/CN110826457B/en
Publication of CN110826457A publication Critical patent/CN110826457A/en
Application granted granted Critical
Publication of CN110826457B publication Critical patent/CN110826457B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/25Determination of region of interest [ROI] or a volume of interest [VOI]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/46Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462Salient features, e.g. scale invariant feature transforms [SIFT]
    • G06V10/464Salient features, e.g. scale invariant feature transforms [SIFT] using a plurality of salient features, e.g. bag-of-words [BoW] representations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/08Detecting or categorising vehicles
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a vehicle detection method and device for complex scenes, comprising the following steps: processing self-collected vehicle images with a generative adversarial network to generate transformed vehicle images; generating online-occluded vehicle images by adaptively occluding the vehicle images, and expanding them online to form an occlusion module; processing the online-expanded vehicle images with multi-scale feature extraction to obtain region candidate boxes at different scales, fusing the feature information of the different scales, inputting it into a target detection network, and performing target classification and precise box regression to obtain a first vehicle detection network; adding the occlusion module, formed by an online occluded-vehicle-image expansion network, and training the first vehicle detection network to form a second vehicle detection network; and detecting vehicles with the second vehicle detection network. The invention improves detection accuracy and is better suited to detecting occluded targets.

Description

Vehicle detection method and device under complex scene
Technical Field
The invention relates to the technical field of vehicle detection, and in particular to a vehicle detection method and device for complex scenes.
Background
In both the traditional vision field and the computer vision field, target detection and recognition has always been one of the most active and competitive tasks. The target detection task is to classify and recognize all targets in an image, accurately locating targets of any shape and size at any position in the image. With the continuous development of deep learning in recent years, target detection and recognition algorithms have shifted from traditional detection algorithms based on hand-crafted features to detection techniques based on deep neural networks. On the basis of deep-learning-based target detection, new methods keep emerging; by processing mode they can be divided into two-stage methods based on region proposal boxes and one-stage methods based on regression.
Two-stage methods complete the target detection task in two steps. First, regions of the image that may contain targets are selected automatically to generate region candidate boxes: a series of boxes of different sizes and scales is generated, features are extracted from the image inside each box, a binary background/foreground classification is performed, the boxes classified as foreground are screened, and the final candidate boxes are produced. Then, the image inside each candidate box is classified into a detailed category and the box position is adjusted. Detection accuracy is high, but detection speed suffers. One-stage methods need no network to extract candidate boxes in advance; classification and localization are treated together as a regression problem over the image. Compared with two-stage methods they have a clear speed advantage, but they perform poorly on small, dense targets.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a vehicle detection method and device for complex scenes. Compared with public data sets, the self-built data set has greater diversity and complexity, so detection accuracy is greatly improved and the method is better suited to detecting occluded targets; the multi-scale-fusion target detection network also improves detection of small targets.
In order to solve the technical problems, the invention is realized by the following technical scheme:
The invention provides a vehicle detection method for complex scenes, comprising the following steps:
S11: processing self-collected vehicle images with a generative adversarial network to generate transformed vehicle images;
S12: inputting the transformed vehicle images of S11 into a target detection network, generating online-occluded vehicle images by adaptively occluding the input images, and expanding the transformed vehicle images of S11 online according to an online occluded-vehicle-image expansion technique to form an occlusion module;
S13: processing the online-expanded vehicle images of S12 with multi-scale feature extraction to obtain region candidate boxes at different scales and feature layers at different scales, fusing the feature information of the different-scale feature layers, inputting the fused features into the target detection network of S12, and performing target classification and precise box regression to obtain a first vehicle detection network;
S14: adding the occlusion module formed by the online occluded-vehicle-image expansion network of S12, training the first vehicle detection network formed in S13, and feeding the generated training samples back as data, to form a second vehicle detection network;
S15: detecting vehicles with the second vehicle detection network formed in S14.
Preferably, S11 further includes: de-duplicating the training samples of the vehicle detection network.
Preferably, adaptively occluding the input images in S12 further includes: adding a fully connected layer and an occlusion mask layer on the last feature map of the target detection network; convolving the occlusion mask layer with the feature map to generate an occlusion feature map; training the target detection network with the occlusion feature map as input so that it continuously learns how to occlude images; and mapping the optimal occlusion feature map generated by the trained online hard-sample generation network back to the original image, the resulting samples being the hard samples used in training the target detection network.
Preferably, the different-scale feature layers in S13 comprise four feature layers of different scales in a VGG16 convolutional neural network, such that high-level feature information predicts large targets and low-level feature information predicts small targets.
Preferably, S14 further comprises: using the occlusion module formed by the online occluded-vehicle-image expansion network to partially occlude the feature layer at each scale, realizing adaptive occlusion of targets at different scales, such that large targets are occluded on high-resolution feature layers and small targets on low-resolution feature layers.
The invention also provides a vehicle detection device for complex scenes, comprising: a self-collected vehicle image processing module, an occlusion module generation module, a first vehicle detection network formation module, a second vehicle detection network formation module, and a vehicle detection module; wherein:
the self-collected vehicle image processing module is used to process self-collected vehicle images with a generative adversarial network to generate transformed vehicle images;
the occlusion module generation module is used to input the transformed vehicle images from the self-collected vehicle image processing module into a target detection network, generate online-occluded vehicle images by adaptively occluding the input images, and expand the transformed vehicle images online according to an online occluded-vehicle-image expansion technique to form an occlusion module;
the first vehicle detection network formation module is used to process the online-expanded vehicle images from the occlusion module generation module with multi-scale feature extraction to obtain region candidate boxes and feature layers at different scales, fuse the feature information of the different-scale feature layers, input the fused features into the target detection network of the occlusion module generation module, and perform target classification and precise box regression to obtain a first vehicle detection network;
the second vehicle detection network formation module is used to add the occlusion module formed by the online occluded-vehicle-image expansion network in the occlusion module generation module, train the first vehicle detection network formed by the first vehicle detection network formation module, and feed the generated training samples back as data, forming a second vehicle detection network;
the vehicle detection module is used to detect vehicles with the second vehicle detection network formed by the second vehicle detection network formation module.
Preferably, the self-collected vehicle image processing module is further configured to de-duplicate the training samples of the vehicle detection network.
Preferably, the second vehicle detection network formation module is further configured to add a fully connected layer and an occlusion mask layer on the last feature map of the target detection network; convolve the occlusion mask layer with the feature map to generate an occlusion feature map; train the target detection network with the occlusion feature map as input so that it continuously learns how to occlude images; and map the optimal occlusion feature map generated by the trained online hard-sample generation network back to the original image, the resulting samples being the hard samples used in training the target detection network.
Preferably, the different-scale feature layers in the first vehicle detection network formation module comprise four feature layers of different scales in a VGG16 convolutional neural network, such that high-level feature information predicts large targets and low-level feature information predicts small targets.
Preferably, the second vehicle detection network formation module is further configured to use the occlusion module formed by the online occluded-vehicle-image expansion network to partially occlude the feature layer at each scale, realizing adaptive occlusion of targets at different scales, such that large targets are occluded on high-resolution feature layers and small targets on low-resolution feature layers.
Compared with the prior art, the invention has the following advantages:
(1) In the vehicle detection method and device for complex scenes described above, the self-built data set contains a large number of images generated by the adversarial network together with occluded target images, so it is more diverse and complex than public data sets; detection accuracy is greatly improved, and the method is better suited to detecting occluded targets;
(2) Through multi-scale feature fusion, large targets can be predicted with high-level feature information having a large receptive field, and small targets with low-level feature information having a small receptive field, completing the fusion of high-level semantic information and low-level detail information and further improving detection accuracy;
(3) The occlusion module formed by the online occluded-vehicle-image expansion network partially occludes the feature layer at each scale; because different feature layers have different target sensitivities, large targets can be occluded on high-resolution feature layers and small targets on low-resolution feature layers, improving detection of occluded vehicles.
Of course, it is not necessary for any product in which the invention is practiced to achieve all of the above-described advantages at the same time.
Drawings
Embodiments of the invention are further described below with reference to the accompanying drawings:
FIG. 1 is a flow chart of a vehicle detection method in a complex scenario according to an embodiment of the present invention;
FIG. 2 is a vehicle image before augmentation by the deep-convolution-based generative adversarial network;
FIG. 3 is image 1 generated from random noise;
FIG. 4 is image 2 generated from random noise;
FIG. 5 is an online occluded original vehicle image based on a reinforcement learning mechanism;
FIG. 6 is an occlusion gray scale map automatically generated by the network;
FIG. 7 is the occlusion result;
FIG. 8 is an original image of the occluded large target;
FIG. 9 shows the detection results of the conventional SSD300;
FIG. 10 shows the detection results of the conventional YOLOv3;
FIG. 11 shows the detection results of the conventional Faster R-CNN;
FIG. 12 shows the detection results of the conventional Cascade R-CNN;
FIG. 13 is a detection result of a vehicle detection method in a complex scenario according to an embodiment of the present invention;
FIG. 14 is an original image of occluded small targets;
FIG. 15 shows the detection results of the conventional SSD300;
FIG. 16 shows the detection results of the conventional YOLOv3;
FIG. 17 shows the detection results of the conventional Faster R-CNN;
FIG. 18 shows the detection results of the conventional Cascade R-CNN;
fig. 19 shows a detection result of the vehicle detection method in a complex scene according to an embodiment of the present invention.
Detailed Description
The following examples are given for the detailed implementation and specific operation of the present invention, but the scope of the present invention is not limited to the following examples.
Fig. 1 is a flowchart illustrating a vehicle detection method in a complex scenario according to an embodiment of the present invention.
Referring to fig. 1, the vehicle detection method in the complex scene of the present embodiment includes the following steps:
s11: processing the self-collected vehicle image according to the generated countermeasure network to generate a transformed vehicle image;
s12: inputting the transformed vehicle image in S11 into a target detection network, generating an online occluded vehicle image by performing adaptive occlusion on the input image, and performing online expansion on the transformed vehicle image in S11 according to an online occluded vehicle image expansion technology to form an occlusion module;
s13: processing the vehicle image after online expansion in the S12 by utilizing a multi-scale feature extraction technology to obtain region candidate frames with different scales, obtain feature layers with different scales, fusing feature information in the feature layers with different scales, inputting the fused feature information into a target detection network in the S12, and performing target classification and accurate frame regression to obtain a first vehicle detection network;
s14: adding an occlusion module consisting of an online occluded vehicle image expansion network in the S12, training the first vehicle detection network formed in the S13, and returning data of the generated training sample to form a second vehicle detection network;
s15: the vehicle is detected using the second vehicle detection network formed at S14.
In an embodiment, S11 specifically includes: vehicle image expansion based on a deep convolutional generative adversarial network, which allows stable training of a deeper generative model at higher resolution. Another implementation is vehicle image expansion based on a cycle-consistent generative adversarial network, which can transform images between two domains.
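The patent does not give the generator architecture, so the following is only a minimal DCGAN-style generator sketch in PyTorch: the 100-d noise vector, channel counts, and square 48×48 output (the patent crops 96×48 targets) are illustrative assumptions; the final tanh keeps the generated image in [-1, 1], matching the output scaling described later in the text.

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    """DCGAN-style generator sketch: 100-d noise vector -> 3-channel 48x48 image."""
    def __init__(self, z_dim=100, ch=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(z_dim, ch * 4, 6, 1, 0, bias=False),  # 1x1 -> 6x6
            nn.BatchNorm2d(ch * 4), nn.ReLU(True),
            nn.ConvTranspose2d(ch * 4, ch * 2, 4, 2, 1, bias=False),  # 6x6 -> 12x12
            nn.BatchNorm2d(ch * 2), nn.ReLU(True),
            nn.ConvTranspose2d(ch * 2, ch, 4, 2, 1, bias=False),      # 12x12 -> 24x24
            nn.BatchNorm2d(ch), nn.ReLU(True),
            nn.ConvTranspose2d(ch, 3, 4, 2, 1, bias=False),           # 24x24 -> 48x48
            nn.Tanh(),  # output scaled to [-1, 1]
        )

    def forward(self, z):
        return self.net(z)

gen = Generator().eval()
with torch.no_grad():
    fake = gen(torch.randn(2, 100, 1, 1))  # two generated vehicle images
```

A CycleGAN variant would instead take a real image (not noise) as input and add a cycle-consistency loss between the two domains.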
In an embodiment, S12 specifically includes: generating an optimal occlusion mask for the target in each image, realizing online expansion of hard samples, i.e. samples that cause false detections in the target detection network; at the same time, these hard samples are fed back into the target detection network for training, improving its performance on them. The two networks learn adversarially, so that a large number of hard samples are generated while the performance of the target detection network improves.
In an embodiment, S13 specifically includes: performing target detection with four feature maps of different scales from the VGG16 network, predicting large objects with high-level feature information having a large receptive field and small targets with low-level feature information having a small receptive field. The invention introduces a multi-scale feature fusion module to fuse high-level semantic information with low-level detail information and improve detection accuracy.
In an embodiment, S14 specifically includes: introducing an occlusion mask module to further train the network, addressing the detection of occluded targets and further improving detection accuracy. The module is applied in parallel to the four different feature layers, partially occluding the feature map at each scale to realize adaptive occlusion of targets at different scales; at the same time, the online-generated hard samples are fed back as data, improving the network's detection of occluded targets. Because the four feature layers have different sensitivities to targets of different scales in the original image, and the four occlusion mask layers act independently, the module can occlude large targets on the high-resolution feature layers and small targets on the low-resolution feature layers.
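The parallel per-scale occlusion idea can be sketched as follows; the module structure, channel counts, spatial sizes, and the hard-threshold masking are illustrative assumptions, not taken from the patent (training such a binary mask would need a differentiable relaxation or a sampling-based scheme).

```python
import torch
import torch.nn as nn

class OcclusionMask(nn.Module):
    """One occlusion-mask head for a single feature scale.

    Predicts a per-position mask from the feature map and multiplies it in,
    zeroing (occluding) part of the features at that scale.
    """
    def __init__(self, channels):
        super().__init__()
        self.head = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, feat):
        mask = (torch.sigmoid(self.head(feat)) > 0.5).float()  # binary occlusion mask
        return feat * mask                                      # occluded feature map

# Four independent heads, one per feature layer, applied in parallel;
# (channels, size) pairs below are illustrative only.
scales = [(256, 38), (512, 19), (512, 10), (256, 5)]
heads = nn.ModuleList(OcclusionMask(c) for c, _ in scales)
feats = [torch.randn(1, c, s, s) for c, s in scales]
occluded = [h(f) for h, f in zip(heads, feats)]
```

Because each head acts on its own feature layer, the masks can specialize: the head on a high-resolution map learns to occlude targets that map is sensitive to, independently of the other three.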
In a preferred embodiment, S11 specifically includes: targets of size 96×48 are cropped from the originally collected data, and no preprocessing of the training data is required; the images output by the generator network G are scaled to [-1, 1]; training uses mini-batch SGD with a batch of 128 images input to the network each time; and all parameters are initialized with mean 0 and standard deviation 0.02. A modified Adam optimizer is used for network optimization with fine-tuned parameters, the learning rate being reduced from 0.001 to 0.0002. To prevent the model from overfitting, i.e. memorizing simple input features and generating near-duplicate pictures, the training samples are de-duplicated: images with high mutual similarity are removed.
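The training configuration above (batch size 128, parameters drawn from N(0, 0.02), Adam with the learning rate lowered from 0.001 to 0.0002) can be sketched in PyTorch; the BatchNorm treatment (mean 1) and the Adam betas are DCGAN conventions assumed here, not stated in the patent.

```python
import torch
import torch.nn as nn

def dcgan_init(m):
    """Initialise conv weights from N(0, 0.02) as described in the text;
    the BatchNorm handling below is an assumed DCGAN convention."""
    if isinstance(m, (nn.Conv2d, nn.ConvTranspose2d)):
        nn.init.normal_(m.weight, mean=0.0, std=0.02)
    elif isinstance(m, nn.BatchNorm2d):
        nn.init.normal_(m.weight, mean=1.0, std=0.02)
        nn.init.zeros_(m.bias)

net = nn.Sequential(nn.Conv2d(3, 64, 3), nn.BatchNorm2d(64))  # stand-in network
net.apply(dcgan_init)

BATCH_SIZE = 128  # images fed to the network per mini-batch
optimizer = torch.optim.Adam(net.parameters(),
                             lr=2e-4,            # 0.001 lowered to 0.0002
                             betas=(0.5, 0.999))  # assumed DCGAN betas
```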
In a preferred embodiment, S12 specifically includes: the network structure adds a fully connected layer and an occlusion mask layer on the last feature map of the target detection network; for an input original vehicle image, the binary occlusion mask layer directly occludes the original feature map, forming the corresponding occlusion feature map. Its loss function is

L = -(1/n) Σ_{p=1}^{n} Σ_{i,j}^{d} [ M̃_p(i,j) · A(X_p)(i,j) + (1 − M̃_p(i,j)) · (1 − A(X_p)(i,j)) ]

where i and j are the horizontal and vertical coordinates on the feature map, n is the number of training sample pairs, d is the dimension of the feature map, X_p is the p-th of the n original images, A(·) is the occlusion feature map obtained by the network's occlusion operation on the feature map of X_p, and M̃_p(i,j) is the output of the binary occlusion mask at position (i,j), taking the value 0 or 1.
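The loss function of this embodiment appears in the source only as an embedded equation image, so the sketch below assumes a cross-entropy-style form built from the variables the text defines: a predicted soft occlusion map A(X_p) in [0, 1], a binary target mask M_p, a d×d feature map, and n sample pairs.

```python
import torch

def occlusion_mask_loss(pred, target):
    """Mask loss sketch: pred is the soft occlusion map A(X_p) in [0, 1],
    target is the binary mask M_p; both have shape (n, d, d)."""
    # Per-position agreement between predicted map and binary target.
    per_cell = target * pred + (1 - target) * (1 - pred)
    # Negate and average over the n sample pairs (minimised during training).
    return -per_cell.sum() / pred.shape[0]

pred = torch.full((4, 7, 7), 0.5)                # maximally uncertain predictions
target = torch.randint(0, 2, (4, 7, 7)).float()  # binary ground-truth masks
loss = occlusion_mask_loss(pred, target)
```

With pred fixed at 0.5 every position contributes 0.5 regardless of the target, so the loss reduces to -(d·d)/2 per sample, which makes the function easy to sanity-check.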
In a preferred embodiment, S13 specifically includes: applying two 3×3 convolutions with stride 1 to the features in the candidate box, extracting further features while reducing the feature-map scale; applying a 4×4 deconvolution with stride 2 to the feature information of the high-level candidate box to generate a feature map of the same size as the previous layer; summing the two maps pixel-wise; and applying an activation function and a convolution to the sum so that the fused features remain discriminative. The deconvolution kernel size is set according to the image sizes of the four feature layers of the VGG network. Suppose the last two feature layers have sizes m×m and n×n; after the two stride-1 3×3 convolutions the generated feature maps have sizes (m-2)×(m-2) and (n-2)×(n-2), while the stride-2 4×4 deconvolution of the high-level features yields a feature map of size (2n-2)×(2n-2). Because the four feature-map sizes in the selected VGG network are obtained successively by pooling, m = 2n; therefore the deconvolved high-level features and the convolved low-level features have the same scale after these operations and can be summed directly pixel by pixel.
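The fusion step above can be sketched as a small PyTorch module; the padding choices (1 then 0 on the low path, 2 on the deconvolution) and the channel counts are assumptions chosen so the spatial sizes reproduce the (m-2) and (2n-2) arithmetic in the text.

```python
import torch
import torch.nn as nn

class FuseBlock(nn.Module):
    """Fuses a low-level m x m map with the next high-level n x n map (m = 2n).

    Low path: two stride-1 3x3 convolutions shrinking the map to (m-2) x (m-2).
    High path: 4x4 deconvolution, stride 2, padding 2, giving (2n-2) x (2n-2).
    The maps are summed pixel-wise, then pass an activation and a convolution
    so the fused features stay discriminative.
    """
    def __init__(self, c_low, c_high, c_out):
        super().__init__()
        self.low = nn.Sequential(
            nn.Conv2d(c_low, c_out, 3, stride=1, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(c_out, c_out, 3, stride=1, padding=0),   # m -> m-2
        )
        self.high = nn.ConvTranspose2d(c_high, c_out, 4, stride=2, padding=2)  # n -> 2n-2
        self.post = nn.Sequential(nn.ReLU(inplace=True),
                                  nn.Conv2d(c_out, c_out, 3, padding=1))

    def forward(self, f_low, f_high):
        return self.post(self.low(f_low) + self.high(f_high))  # pixel-wise sum

# With m = 38 and n = 19 (m = 2n), both paths produce 36 x 36 maps.
fuse = FuseBlock(c_low=256, c_high=512, c_out=256)
out = fuse(torch.randn(1, 256, 38, 38), torch.randn(1, 512, 19, 19))
```

The deconvolution size follows the transposed-convolution formula (n-1)·stride - 2·padding + kernel = 2n-2, matching the text's figure.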
In a preferred embodiment, S14 specifically includes: pre-training the other networks and fixing their parameters, then attaching the occlusion mask module and forward-propagating all data in the data set; the training process is as described above, using pairs of images and occlusion-mask data. In this stage the parameters of the occlusion mask module are trained alone; the two networks are then trained jointly: after the input image passes through the occlusion mask module, an occlusion feature map is generated, the input is discriminated with the loss function, the classification result is compared with the ground-truth (GT) label of the input image, back-propagation is performed, and all parameters of both networks are fine-tuned.
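The staged training just described (freeze the pre-trained detector, train the attached mask module alone, then fine-tune everything jointly) can be expressed with `requires_grad` flags; the tiny module definitions below are placeholders, not the patent's networks.

```python
import torch.nn as nn

# Placeholder stand-ins for the pre-trained detector and the occlusion
# mask module (layer choices are illustrative only).
detector = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU(), nn.Conv2d(16, 8, 3))
mask_module = nn.Conv2d(8, 1, kernel_size=1)

# Stage 1: fix the pre-trained detector's parameters and train only the
# newly attached occlusion mask module on image / mask data pairs.
for p in detector.parameters():
    p.requires_grad = False
stage1_params = [p for p in mask_module.parameters() if p.requires_grad]

# Stage 2: joint training -- unfreeze everything so back-propagation
# fine-tunes all parameters of both networks.
for p in detector.parameters():
    p.requires_grad = True
stage2_params = list(detector.parameters()) + list(mask_module.parameters())
```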
To verify that the method of this embodiment is better suited to detecting occluded targets, and to demonstrate the effectiveness of this embodiment's self-built data set, the network model trained on the self-built data set is compared with network models trained on public data sets (VOC07+12 and COCO); the comparison algorithms are SSD300, YOLOv3, Faster R-CNN (VGG-16 backbone) and Cascade R-CNN.
TABLE 1 training results for different data sets for different methods
As the data in Table 1 show, the detection accuracy of the different methods on the COCO data set is much lower than on the VOC data set, because COCO contains images collected from more categories of real scenes, with many images per target and many targets per image; hence models trained on COCO have much lower detection accuracy than models trained on VOC. Because the self-built data set of the invention contains a large number of GAN-generated images and occluded target images, the detection accuracy of the comparison methods on it generally drops relative to COCO, while the method of the invention improves greatly. This shows that the self-built data set is more diverse and complex than the public data sets, and that the method of the invention is better suited to detecting occluded targets.
For a more intuitive display, refer to Figs. 2-19, which illustrate an example of vehicle detection in complex scenes using the above method. FIG. 2 is an example vehicle image before augmentation by the deep-convolution-based generative adversarial network; FIG. 3 is image 1 generated from random noise in this example, which is harder to detect than FIG. 2; FIG. 4 is image 2 generated from random noise, which is harder to detect than FIG. 3.
FIG. 5 is the online-occluded original vehicle image based on a reinforcement learning mechanism in this example; FIG. 6 is the occlusion gray-scale map automatically generated by the network; FIG. 7 is the occlusion result, i.e. the optimal occlusion image produced by the online hard-sample generation network. FIG. 8 is the original image for occluded large-target detection in this example.
For comparison with prior-art methods, FIG. 9 shows the detection results of the conventional SSD300; FIG. 10 those of the conventional YOLOv3; FIG. 11 those of the conventional Faster R-CNN; FIG. 12 those of the conventional Cascade R-CNN; and FIG. 13 the detection results of the vehicle detection method of this embodiment. The comparison shows directly that this embodiment detects occluded vehicles better and with higher accuracy.
Fig. 14 is the original image for occluded small-target detection; Fig. 15 shows the detection result of the conventional SSD300; Fig. 16 that of the conventional YOLOv3; Fig. 17 that of the conventional Faster R-CNN; Fig. 18 that of the conventional Cascade R-CNN; and Fig. 19 the detection result of the vehicle detection method in a complex scene according to an embodiment of the present invention. As can be seen directly from the comparison, the embodiment of the invention detects small targets better.
In another embodiment, the present invention further provides a vehicle detection apparatus for complex scenes, which is used to implement the vehicle detection method of the foregoing embodiment and comprises: a self-collected vehicle image processing module, an occlusion module generation module, a first vehicle detection network formation module, a second vehicle detection network formation module and a vehicle detection module; wherein:
the self-collected vehicle image processing module is used for processing self-collected vehicle images according to a generative adversarial network to generate transformed vehicle images;
the occlusion module generation module is used for inputting the transformed vehicle images from the self-collected vehicle image processing module into a target detection network, generating online occluded vehicle images by performing adaptive occlusion on the input images, and performing online expansion on the transformed vehicle images according to an online occluded vehicle image expansion technology to form an occlusion module;
the first vehicle detection network formation module is used for processing the vehicle images after online expansion in the occlusion module generation module by using a multi-scale feature extraction technology to obtain region candidate frames of different scales and feature layers of different scales, fusing the feature information in the feature layers of different scales, inputting the fused feature information into the target detection network in the occlusion module generation module, and performing target classification and accurate frame regression to obtain a first vehicle detection network;
the second vehicle detection network formation module is used for adding the occlusion module formed by the online occluded vehicle image expansion network in the occlusion module generation module, training the first vehicle detection network formed by the first vehicle detection network formation module, and returning the data of the generated training samples to form a second vehicle detection network;
the vehicle detection module is used for detecting vehicles by using the second vehicle detection network formed by the second vehicle detection network formation module.
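The data flow between the five modules described above can be sketched as follows. Every function body is a placeholder with assumed names and data structures; only the wiring between the modules follows the description, so this is an illustrative sketch rather than the patented implementation.

```python
def process_self_collected(images):
    # self-collected vehicle image processing module:
    # GAN-based transformation of the raw images (placeholder)
    return [f"transformed_{img}" for img in images]

def generate_occlusion_module(transformed):
    # occlusion module generation module: adaptive occlusion of the
    # transformed images plus online expansion of the training set
    occluded = [f"occluded_{img}" for img in transformed]
    return transformed + occluded  # online-expanded training set

def form_first_network(expanded):
    # first vehicle detection network formation module: multi-scale feature
    # extraction, fusion, classification and box regression (placeholder)
    return {"stage": "first", "train_size": len(expanded)}

def form_second_network(first_net, expanded):
    # second vehicle detection network formation module: retrain with the
    # occlusion module, feeding the generated hard samples back into training
    return {"stage": "second", "base": first_net["stage"],
            "train_size": first_net["train_size"] + len(expanded)}

def detect_vehicles(second_net, image):
    # vehicle detection module (placeholder result)
    return {"image": image, "boxes": []}

raw = ["img_0", "img_1"]
expanded = generate_occlusion_module(process_self_collected(raw))
first_net = form_first_network(expanded)
second_net = form_second_network(first_net, expanded)
result = detect_vehicles(second_net, "test_img")
```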
The embodiments were chosen and described in order to best explain the principles of the invention and its practical application, and are not intended to limit the invention. Any modifications and variations that may occur to those skilled in the art within the scope of the description are intended to fall within the scope of the invention.

Claims (10)

1. A vehicle detection method in a complex scene, characterized by comprising the following steps:
S11: processing self-collected vehicle images according to a generative adversarial network to generate transformed vehicle images;
S12: inputting the transformed vehicle images of S11 into a target detection network, generating online occluded vehicle images by performing adaptive occlusion on the input images, and performing online expansion on the transformed vehicle images of S11 according to an online occluded vehicle image expansion technology to form an occlusion module;
S13: processing the vehicle images after online expansion in S12 by using a multi-scale feature extraction technology to obtain region candidate frames of different scales and feature layers of different scales, fusing the feature information in the feature layers of different scales, inputting the fused feature information into the target detection network of S12, and performing target classification and accurate frame regression to obtain a first vehicle detection network;
S14: adding the occlusion module formed by the online occluded vehicle image expansion network of S12, training the first vehicle detection network formed in S13, and returning the data of the generated training samples to form a second vehicle detection network;
S15: detecting vehicles by using the second vehicle detection network formed in S14.
2. The vehicle detection method in a complex scene according to claim 1, wherein S11 further comprises: performing de-duplication processing on the training samples in the vehicle detection network.
3. The vehicle detection method in a complex scene according to claim 1, wherein the adaptive occlusion of the input images in S12 further comprises: adding a fully connected layer and an occlusion mask layer on the last feature map of the target detection network; convolving the occlusion mask layer with the feature map to generate an occlusion feature map; training the target detection network with the occlusion feature map as input so that it continuously learns how to occlude the image; and mapping the optimal occlusion feature map generated by the trained online hard-sample generation network back to the original image to obtain the generated samples, namely the hard samples in the training process of the target detection network.
4. The vehicle detection method in a complex scene according to claim 1, wherein the feature layers of different scales in S13 comprise four feature layers of different scales of the VGG16 convolutional neural network, the high-layer feature information being used to predict large targets and the bottom-layer feature information being used to predict small targets.
5. The vehicle detection method in a complex scene according to claim 1, wherein S14 further comprises: using the occlusion module formed by the online occluded vehicle image expansion network to partially occlude the feature layer of each scale respectively, thereby realizing adaptive occlusion of targets of different scales, occlusion of large targets on the high-resolution feature layers, and occlusion of small targets on the low-resolution feature layers.
6. A vehicle detection apparatus in a complex scene, characterized by comprising: a self-collected vehicle image processing module, an occlusion module generation module, a first vehicle detection network formation module, a second vehicle detection network formation module and a vehicle detection module; wherein:
the self-collected vehicle image processing module is used for processing self-collected vehicle images according to a generative adversarial network to generate transformed vehicle images;
the occlusion module generation module is used for inputting the transformed vehicle images from the self-collected vehicle image processing module into a target detection network, generating online occluded vehicle images by performing adaptive occlusion on the input images, and performing online expansion on the transformed vehicle images according to an online occluded vehicle image expansion technology to form an occlusion module;
the first vehicle detection network formation module is used for processing the vehicle images after online expansion in the occlusion module generation module by using a multi-scale feature extraction technology to obtain region candidate frames of different scales and feature layers of different scales, fusing the feature information in the feature layers of different scales, inputting the fused feature information into the target detection network in the occlusion module generation module, and performing target classification and accurate frame regression to obtain a first vehicle detection network;
the second vehicle detection network formation module is used for adding the occlusion module formed by the online occluded vehicle image expansion network in the occlusion module generation module, training the first vehicle detection network formed by the first vehicle detection network formation module, and returning the data of the generated training samples to form a second vehicle detection network;
the vehicle detection module is used for detecting vehicles by using the second vehicle detection network formed by the second vehicle detection network formation module.
7. The vehicle detection apparatus in a complex scene according to claim 6, wherein the self-collected vehicle image processing module is further configured to perform de-duplication processing on the training samples in the vehicle detection network.
8. The vehicle detection apparatus in a complex scene according to claim 6, wherein the second vehicle detection network formation module is further configured to add a fully connected layer and an occlusion mask layer on the last feature map of the target detection network; convolve the occlusion mask layer with the feature map to generate an occlusion feature map; train the target detection network with the occlusion feature map as input so that it continuously learns how to occlude the image; and map the optimal occlusion feature map generated by the trained online hard-sample generation network back to the original image to obtain the generated samples, namely the hard samples in the training process of the target detection network.
9. The vehicle detection apparatus in a complex scene according to claim 6, wherein the feature layers of different scales in the first vehicle detection network formation module comprise four feature layers of different scales of the VGG16 convolutional neural network, the high-layer feature information being used to predict large targets and the bottom-layer feature information being used to predict small targets.
10. The vehicle detection apparatus in a complex scene according to claim 6, wherein the second vehicle detection network formation module is further configured to use the occlusion module formed by the online occluded vehicle image expansion network to partially occlude the feature layer of each scale respectively, thereby realizing adaptive occlusion of targets of different scales, occlusion of large targets on the high-resolution feature layers, and occlusion of small targets on the low-resolution feature layers.
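The multi-scale prediction described in claims 4 and 9 (four VGG16 feature layers, with high-layer feature information predicting large targets and bottom-layer information predicting small targets) can be sketched as a target-to-layer assignment rule. The layer names, the thresholds, and the 300-pixel input size below are illustrative assumptions, not values taken from the patent.

```python
import math

# Illustrative names for four VGG16-derived feature layers, ordered from
# shallow/high-resolution to deep/low-resolution (assumed, not from the patent)
FEATURE_LAYERS = ["conv3_3", "conv4_3", "conv5_3", "conv7"]

def assign_feature_layer(box_w, box_h, img_size=300):
    """Map a target box to the feature layer that should predict it:
    small targets go to shallow layers, large targets to deep layers."""
    scale = math.sqrt(box_w * box_h) / img_size  # relative target size
    thresholds = [0.1, 0.3, 0.6]                 # illustrative cut points
    for layer, t in zip(FEATURE_LAYERS, thresholds):
        if scale < t:
            return layer
    return FEATURE_LAYERS[-1]
```

For example, a 20x20 box in a 300x300 image (relative size about 0.07) would be assigned to the shallowest layer, while a 250x250 box would be assigned to the deepest one.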
CN201911050728.4A 2019-10-31 2019-10-31 Vehicle detection method and device under complex scene Active CN110826457B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911050728.4A CN110826457B (en) 2019-10-31 2019-10-31 Vehicle detection method and device under complex scene


Publications (2)

Publication Number Publication Date
CN110826457A true CN110826457A (en) 2020-02-21
CN110826457B CN110826457B (en) 2022-08-19

Family

ID=69551758

Country Status (1)

Country Link
CN (1) CN110826457B (en)


Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109284704A (en) * 2018-09-07 2019-01-29 中国电子科技集团公司第三十八研究所 Complex background SAR vehicle target detection method based on CNN
US20190156144A1 (en) * 2017-02-23 2019-05-23 Beijing Sensetime Technology Development Co., Ltd Method and apparatus for detecting object, method and apparatus for training neural network, and electronic device


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
SHIFENG ZHANG ET AL.: "Single-Shot Refinement Neural Network for Object Detection", arXiv:1711.06897v3 *
XIAOLONG WANG ET AL.: "A-Fast-RCNN: Hard Positive Generation via Adversary for Object Detection", arXiv:1704.03414v1 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113012176A (en) * 2021-03-17 2021-06-22 北京百度网讯科技有限公司 Sample image processing method and device, electronic equipment and storage medium
CN113012176B (en) * 2021-03-17 2023-12-15 阿波罗智联(北京)科技有限公司 Sample image processing method and device, electronic equipment and storage medium
CN114782799A (en) * 2022-02-10 2022-07-22 成都臻识科技发展有限公司 Simulation method, system, equipment and medium for shielding of large vehicle under high-phase camera visual angle
CN114882449A (en) * 2022-04-11 2022-08-09 淮阴工学院 Car-Det network model-based vehicle detection method and device
CN114882449B (en) * 2022-04-11 2023-08-22 淮阴工学院 Car-Det network model-based vehicle detection method and device
CN115082758A (en) * 2022-08-19 2022-09-20 深圳比特微电子科技有限公司 Training method of target detection model, target detection method, device and medium
CN115082758B (en) * 2022-08-19 2022-11-11 深圳比特微电子科技有限公司 Training method of target detection model, target detection method, device and medium



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant