CN117115616A - Real-time low-illumination image target detection method based on convolutional neural network - Google Patents

Real-time low-illumination image target detection method based on convolutional neural network

Info

Publication number
CN117115616A
Authority
CN
China
Prior art keywords: image, low, network, enhancement, real
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310940678.7A
Other languages
Chinese (zh)
Inventor
袁宥 (Yuan You)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu University of Science and Technology
Original Assignee
Jiangsu University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu University of Science and Technology filed Critical Jiangsu University of Science and Technology
Priority to CN202310940678.7A
Publication of CN117115616A
Legal status: Pending

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/30 Noise filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/50 Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20212 Image combination
    • G06T2207/20221 Image fusion; Image merging
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a real-time low-illumination image target detection method based on a convolutional neural network, applied to the technical field of computer vision. The method comprises the following steps: constructing a real low-illumination image dataset; downsampling the high-resolution image during image enhancement to reduce the computational cost; restoring the contrast of the low-illumination image by deep curve estimation to improve image quality; using a lightweight network in the target detection stage to meet the real-time requirement of the overall detection model; using an efficient coordinate attention mechanism to focus on channel and spatial position information, strengthening the feature-learning capability of the network; treating image enhancement as a preprocessing stage of target detection to form an 'enhancement + detection' model; and adding a weight to each feature channel, learning the importance of each channel in the feature map, and fusing the original image and the enhanced image through a dual-channel module, which suppresses the noise amplification caused by image enhancement and establishes a complementary relation between the original image and the enhanced image.

Description

Real-time low-illumination image target detection method based on convolutional neural network
Technical Field
The invention belongs to the field of deep learning and computer vision, and particularly relates to a real-time low-illumination image target detection method based on a convolutional neural network.
Background
With the development of deep learning in computer vision, target detection algorithms have gradually evolved into two branches: one-stage and two-stage. One-stage algorithms treat the detection task as a regression problem of localization and classification, while two-stage algorithms first select candidate regions and then classify them. Compared with one-stage algorithms, two-stage algorithms usually must be deployed on platforms with greater computing power and take longer to run, which does not meet the real-time requirement of target detection tasks. Existing detectors such as R-CNN, SSD, and YOLO achieve good results on general-purpose datasets such as ImageNet, COCO, and VOC, and are widely applied in intelligent traffic, face recognition, pathological analysis, industrial inspection, and other fields. However, real-world imaging is affected by illumination and equipment, and captured images suffer from insufficient contrast and a low signal-to-noise ratio. Such low-quality images not only degrade the visual effect but also make downstream visual tasks harder, seriously reducing the detection accuracy of the algorithm.
Researchers generally use two kinds of methods to handle detection on low-illumination images. One kind acquires images with devices such as thermal-imaging or infrared sensors, but this places high demands on physical equipment and is costly. The other kind restores image quality through image enhancement, but traditional histogram equalization or Retinex-based methods focus on restoring contrast and fail to restore the true colors of the image. With the application of deep learning to image processing, convolutional neural networks can extract high-level semantics, learn features such as image contrast and illumination color, and produce more expressive results. The present method does not judge image quality by visual inspection alone; instead, it combines enhancement with the downstream visual task, treats image enhancement as a preprocessing step of target detection, and cascades an enhancer and a detector into an 'enhancement + detection' strategy, thereby reducing the influence of low-illumination images on the target detection algorithm.
Disclosure of Invention
The invention provides a real-time low-illumination image target detection method based on a convolutional neural network to solve the problem of low detection accuracy in low-illumination environments. Targeting the insufficient contrast and low signal-to-noise ratio of low-illumination images, the method uses a deep network to enhance and restore the image without relying on physical equipment such as supplementary illumination or infrared sensors, and optimizes the dataset used to train the neural network through data enhancement. The invention mainly solves two technical problems. First, image enhancement restores the image but also amplifies noise, which lowers the detection model's ability to recognize blurred objects. Second, a naively cascaded enhancer and detector has too many parameters and too high a computational cost for practical application, and cannot meet the requirement of real-time detection on embedded platforms with limited computing power.
For the first problem, the invention designs a feature fusion module based on a channel attention mechanism, forming a feature extraction network that attends to pixel-level information; the fusion module fuses the low-level features of the enhanced image and the original image to strengthen the recognition of blurred objects. For the second problem, the enhancer is made lightweight: the ordinary convolutions in the deep network are replaced with depthwise separable convolutions, which greatly reduces the enhancer's parameter count and improves its processing speed and inference capability. The technical scheme adopted by the invention is as follows:
a real-time low-illumination image target detection method based on a convolutional neural network comprises the following steps:
step one, configuring a deep learning software environment, and configuring an image enhancement algorithm and a target detection algorithm environment based on a convolutional neural network;
step two, constructing a low-illumination image data set, acquiring a real low-illumination image, marking the image, and summarizing the image into a tag data set;
step three, an enhancement recovery module is established, and a weight mechanism is used for inhibiting noise amplification caused by image enhancement;
step four, constructing a deep neural network and establishing a detection mode in which the enhancer and the detector are cascaded: the dataset is first image-enhanced as the preprocessing part of the detection network, and feature extraction and detection then follow;
step five, building a lightweight network: the image is first downsampled during preprocessing, and standard convolution is then replaced with depthwise separable convolution to guarantee the network's real-time requirement;
step six, optimizing the network by adding an attention mechanism to the feature extraction network to compensate for the accuracy loss caused by the lightweight design, so that the overall model balances accuracy and speed;
step seven, training the neural network model and verifying the detection effect in a low-light environment.
Specifically:
In step one, the target detection algorithm is a two-step algorithm based on candidate regions or a single-step algorithm based on regression, and the image enhancement algorithm is a histogram image enhancement algorithm, a tone-mapping image enhancement algorithm, or a Retinex image enhancement algorithm;
in step two, the low-illumination image dataset is a composite dataset: real low-illumination images are first collected from the internet, and low-illumination images screened from public datasets are added for expansion; the images are then labeled and summarized into a tag dataset; finally, the tag dataset is randomly divided into a training set, a validation set, and a test set;
in step three, the enhancement recovery module first takes the enhanced image and the original image as dual-channel input for fusion and adds a weight to each feature channel through a channel attention mechanism; next, the importance of each channel in the feature map is learned by the neural network; finally, the dual-channel input feature channels are aggregated according to the learned weights, which raises the module's attention to channels carrying target feature information and suppresses the noise amplification introduced by image enhancement;
in step four, the cascaded network of enhancer and detector consists of a low-illumination image enhancement algorithm, a cascade module, and a target detection algorithm, and comprises an image preprocessing layer, a feature fusion layer, a feature extraction layer, and a prediction layer; the low-illumination image enhancement algorithm serves as the preprocessing layer of the network to improve image quality and comprises a histogram equalization method or a method based on Retinex theory or curve mapping; the cascade module is the enhancement recovery module of step three; the target detection algorithm comprises a two-step algorithm based on candidate regions or a single-step algorithm based on classification regression; feature fusion is performed by a network model such as CSPDarknet, VGG, or MobileNet, feature extraction by a network structure such as a feature pyramid FPN, PANet, or BiFPN, and classification regression by 3×3 and 1×1 convolution modules, which calculate the intersection-over-union and predict the probability that a target appears in a prior box; a schematic sketch of this composition follows;
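As an illustration of how these layers compose, the following PyTorch sketch wires an enhancer, a shared shallow feature stem, the fusion module, and a detector into one cascade. The component interfaces and the shared stem are assumptions made for illustration only; the invention specifies merely that low-level features of the enhanced and original images are fused before detection.

    import torch.nn as nn

    class EnhanceDetectCascade(nn.Module):
        """Schematic 'enhancement + detection' composition: preprocessing
        (enhancer) -> dual-channel feature fusion (cascade module) ->
        feature extraction and prediction (detector)."""

        def __init__(self, enhancer, stem, fusion, detector):
            super().__init__()
            self.enhancer = enhancer  # low-illumination enhancement (preprocessing layer)
            self.stem = stem          # shallow convolutions producing low-level features
            self.fusion = fusion      # enhancement recovery module (feature fusion layer)
            self.detector = detector  # feature extraction and prediction layers

        def forward(self, image):
            enhanced = self.enhancer(image)
            u1 = self.stem(enhanced)  # low-level features of the enhanced image
            u2 = self.stem(image)     # low-level features of the original image
            return self.detector(self.fusion(u1, u2))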
and fifthly, the model is light, and the preprocessing part takes the downsampled small-scale image as the input of the preprocessing layer, so that the calculation cost of convolutional layer learning is reduced. And secondly, restoring the enhanced image to the original resolution through upsampling, and substituting the enhanced image into a subsequent activity. Finally, the common convolution is replaced by the depth separable convolution, and the parameter quantity can be reduced to one tenth of the original one;
and step six, adding an attention mechanism into the feature extraction network to compensate for the problem of precision reduction caused by network weight reduction, wherein the attention mechanism can allocate computing resources to more important tasks under the condition of limited computing capacity, so that the neural network has the feature extraction capacity of concentrating on space information and channel information, and the overall model achieves balance in precision and speed.
In step seven, the model training part randomly divides the low-illumination image dataset into a training set, a validation set, and a test set at a ratio of 8:1:1 to generate the low-illumination image target detection model; the detection effect is then verified by capturing real images under low-illumination conditions, feeding them into both a traditional target detection model and the low-illumination target detection model of the invention, and comparing the results;
the invention has the beneficial effects that:
firstly, the invention uses the image enhancement algorithm as a preprocessing step of target detection, is more suitable for extracting the characteristics of the low-illumination image, and can improve the accuracy of the neural network on the low-illumination image identification. And secondly, the invention designs an enhancement recovery module based on a channel attention mechanism, and weight regulation is carried out on the image noise amplification problem caused by image enhancement, so that an image with higher quality is obtained. And thirdly, the invention pre-processes the downsampled image, reduces the requirement of the model on the calculation force, and can apply the model to a platform with lower calculation force such as a mobile terminal or embedded equipment. Then, the method replaces standard convolution with depth separable convolution in the feature extraction stage, so that the overall model is improved in detection speed, and the requirement of real-time detection is met. Finally, the invention does not need to use hardware equipment such as infrared imaging and the like to process the image, and has lower cost.
Drawings
FIG. 1 is a flow chart of real-time low-illumination target detection based on a convolutional neural network;
FIG. 2 is a block diagram of the enhancement recovery module;
FIG. 3 is a block diagram of the depthwise separable convolution;
FIG. 4 is a block diagram of the coordinate attention mechanism.
Detailed description of the preferred embodiments
To disclose the technical solution and features of the present invention more clearly, the invention is explained below with reference to the accompanying drawings, but the invention is not limited to the examples.
Example 1:
a method for detecting a real-time low-illuminance image target based on a convolutional neural network, the method comprising:
step one, configuring an environment: and configuring an image enhancement algorithm and a target detection algorithm environment based on deep learning. The required development environment is configured under the window system, wherein the computer graphics card used is RTX3060, and each application environment is python 3.9.7,anconda 4.11.0,cuda11.0. The present example obtains the open source procedure of the object detection algorithm YOLOX and the image enhancement algorithm ZeroDCE on the gitsub.
Step two, collecting data: construct a low-illumination image dataset, acquire real low-illumination images, label them, and summarize them into a tag dataset. This example uses the open-source real low-light dataset ExDark, which covers 10 low-light conditions of varying severity and contains 7363 low-light images across 12 categories such as people, bicycles, boats, and chairs. Since PASCAL VOC and the real dim-light detection dataset ExDark share 10 object classes, 2760 low-light images were screened from the VOC2007 dataset for expansion, forming a new dataset A. To facilitate YOLOX training, the labels in dataset A are converted to VOC2007 format and the image resolution is adjusted to fit the network input.
Step three, establish the enhancement recovery module and use a weight mechanism to suppress the noise amplification caused by image enhancement. Referring to the network structure of SKNet, this embodiment proposes a new cascade module. The enhancement recovery module, shown in FIG. 2, consists of an input layer, a feature fusion layer, and a feature aggregation layer. The input layer takes the enhanced-image features and the original-image features as the model input; since point-wise addition would let the noise amplified in the enhanced image contaminate the fused features, vector concatenation (Concatenate) is selected to fuse the two input channels, giving a fused feature of size 2C×H×W: $U = \mathrm{Concat}(U_1, U_2) \in \mathbb{R}^{2C \times H \times W}$, where $U$ denotes the fused feature, $U_1$ the enhanced-image features, and $U_2$ the original-image features. Next, to characterize the importance of each channel's information, the feature fusion layer encodes the feature channels over the H and W dimensions by global average pooling, reducing each channel of $U$ to a scalar $M_c = \frac{1}{W \times H} \sum_{i=1}^{W} \sum_{j=1}^{H} U_c(i, j)$, where $W$ and $H$ denote the width and height of the feature map and $(i, j)$ a spatial location. To learn the correlation between feature channels, the module first reduces and then raises the dimension of $M$ through a shared fully connected layer followed by two fully connected branches fc, yielding the weight vectors $Z_a = F_{fc}(M, W) = \sigma(W_a\,\delta(W M))$ and $Z_b = F_{fc}(M, W) = \sigma(W_b\,\delta(W M))$, where $W$ is the parameter of the first (shared) fully connected layer, of dimension $C/\gamma \times 2C$, with $\gamma$ a scaling factor that reduces the vector dimension and the computation; $W_a$ and $W_b$ are the parameters of the second fully connected layer in the two branches, each of dimension $C \times C/\gamma$, used to generate the weight vector corresponding to each input feature; $\delta$ is the ReLU activation function; and $\sigma$ is a Sigmoid layer. Finally, the feature aggregation layer applies a softmax function to obtain the channel weights $Z_a, Z_b$ of the enhanced-image and original-image features and aggregates the two inputs by weighted addition into the feature map $U^{+} = Z_a \cdot U_1 + Z_b \cdot U_2$. Compared with connecting the enhancer directly to the detector, the enhancement recovery module proposed by the invention first aggregates the enhanced image with the original image, improving image quality and reducing the influence of the noise amplified by enhancement.
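A compact PyTorch sketch of this enhancement recovery module is given below. It follows the formulas above, except that the branch Sigmoids are folded into the final softmax across the two branches for brevity; the class name, the bottleneck floor, and gamma=16 are illustrative assumptions, not the patented implementation.

    import torch
    import torch.nn as nn

    class EnhancementRecoveryModule(nn.Module):
        """Dual-channel fusion: concatenate enhanced and original features,
        squeeze by global average pooling, pass through a shared bottleneck
        FC layer and two branch FC layers, and re-weight the two inputs with
        softmax-normalized channel weights."""

        def __init__(self, channels, gamma=16):
            super().__init__()
            mid = max(4, channels // gamma)           # bottleneck width C/gamma
            self.pool = nn.AdaptiveAvgPool2d(1)       # squeeze H x W to 1 x 1
            self.fc = nn.Linear(2 * channels, mid)    # shared layer W
            self.relu = nn.ReLU(inplace=True)         # delta
            self.fc_a = nn.Linear(mid, channels)      # branch W_a
            self.fc_b = nn.Linear(mid, channels)      # branch W_b

        def forward(self, u1, u2):
            # u1: enhanced-image features, u2: original-image features (B, C, H, W)
            u = torch.cat([u1, u2], dim=1)            # U in R^{2C x H x W}
            m = self.pool(u).flatten(1)               # channel descriptor M
            z = self.relu(self.fc(m))                 # reduce dimension
            z_a, z_b = self.fc_a(z), self.fc_b(z)     # two weight vectors
            w = torch.softmax(torch.stack([z_a, z_b]), dim=0)  # competing weights
            w_a = w[0].unsqueeze(-1).unsqueeze(-1)
            w_b = w[1].unsqueeze(-1).unsqueeze(-1)
            return w_a * u1 + w_b * u2                # U+ = Z_a*U1 + Z_b*U2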
Step four, lightweight the model to ensure it meets the requirement of real-time detection. Zero-DCE and YOLOX are selected as the image enhancement and target detection models, respectively. First, for the image enhancement part, the downsampled small-scale image is taken as the input of the deep network DCE-Net; the depth-estimated curve parameter maps are upsampled back to the original resolution and the subsequent iterative enhancement is then applied. Because the network thus operates on a low-resolution image, this downsampling significantly reduces the computational cost. Second, for the target detection part, the ordinary convolutions used by the feature extraction network are replaced with more efficient depthwise separable convolutions. FIG. 3 shows how a standard convolution (a) is decomposed into a depth-wise convolution (b) and a point-wise convolution (c). A standard convolutional layer takes an input feature map of size $M \times D_F \times D_F$, convolves it with $N$ kernels of size $M \times D_K \times D_K$, and outputs a feature map of size $N \times D_G \times D_G$, where $D_F$ denotes the width and height of the input feature map, $D_G$ the width and height of the output feature map, $D_K$ the spatial dimension of the convolution kernel, $M$ the number of input channels, and $N$ the number of output channels. The parameter counts and computation of the two are then: standard convolution parameters $D_K \times D_K \times M \times N$; standard convolution computation $D_K \times D_K \times M \times N \times D_F \times D_F$; depthwise separable convolution parameters $D_K \times D_K \times M + M \times N$; depthwise separable computation $D_K \times D_K \times M \times D_F \times D_F + M \times N \times D_F \times D_F$. Taking the ratio in either case gives $\frac{D_K^2 M + M N}{D_K^2 M N} = \frac{1}{N} + \frac{1}{D_K^2}$, so with 3×3 kernels the lightweight feature extraction network needs only about one ninth of the original parameters and computation, greatly shrinking the model and raising its inference speed.
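The factorization and its parameter saving can be verified with a short sketch; the layer sizes M = 64 input channels and N = 128 output channels are arbitrary illustrative choices:

    import torch.nn as nn

    class DepthwiseSeparableConv(nn.Module):
        """A standard KxK convolution factored into a depth-wise convolution
        (one filter per input channel) followed by a point-wise 1x1 convolution."""

        def __init__(self, in_ch, out_ch, kernel_size=3, stride=1):
            super().__init__()
            self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size, stride,
                                       padding=kernel_size // 2,
                                       groups=in_ch, bias=False)
            self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)

        def forward(self, x):
            return self.pointwise(self.depthwise(x))

    def count_params(m):
        return sum(p.numel() for p in m.parameters())

    standard = nn.Conv2d(64, 128, 3, padding=1, bias=False)  # D_K^2 * M * N = 73728
    separable = DepthwiseSeparableConv(64, 128)              # D_K^2 * M + M * N = 8768
    print(count_params(separable) / count_params(standard))  # ~0.119 = 1/N + 1/D_K^2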
Step five, optimize the model by adding an attention mechanism to the feature extraction network to compensate for the accuracy loss caused by the lightweight design. This embodiment adds an efficient coordinate attention (CA) mechanism, designed for mobile networks, to the feature extraction layer; it encodes horizontal and vertical position information into the feature channels, letting the mobile network attend to position information over a large range and localize and identify targets better, without introducing excessive computation. The CA module aims to strengthen the expressive power of the features learned by a mobile network; its implementation follows the CA attention structure diagram in FIG. 4. To capture attention along both the width and the height of the image and encode precise position information, CA first splits the input feature map into the width and height directions and applies global average pooling to each, obtaining one feature map per direction. The two direction-aware feature maps, each carrying a global receptive field along its direction, are spliced together and sent into a shared 1×1 convolution module that reduces their channel dimension to C/r; the batch-normalized feature map F1 is then passed through a Sigmoid activation function, giving a feature map of shape 1×(W+H)×C/r. This feature map is split back according to the original height and width and convolved with two 1×1 kernels to obtain feature maps Fh and Fw with the same number of channels as the original, and a Sigmoid activation then yields the attention weights of the feature map along the height and the width, respectively. Finally, multiplying these weights onto the original feature map produces a feature map carrying attention weights in both the width and height directions.
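For reference, the following sketch follows the published coordinate attention design, with ReLU standing in for the non-linearity after batch normalization (the CA paper uses hard-swish); the class name and reduction ratio are illustrative assumptions:

    import torch
    import torch.nn as nn

    class CoordinateAttention(nn.Module):
        """Pool the input separately along height and width, squeeze the two
        direction-aware descriptors with a shared 1x1 convolution, then split
        them back into per-direction attention weights that re-weight the input."""

        def __init__(self, channels, reduction=32):
            super().__init__()
            mid = max(8, channels // reduction)                    # C/r
            self.conv1 = nn.Conv2d(channels, mid, 1, bias=False)   # shared squeeze
            self.bn = nn.BatchNorm2d(mid)
            self.act = nn.ReLU(inplace=True)
            self.conv_h = nn.Conv2d(mid, channels, 1)              # height branch
            self.conv_w = nn.Conv2d(mid, channels, 1)              # width branch

        def forward(self, x):
            b, c, h, w = x.shape
            x_h = x.mean(dim=3, keepdim=True)                      # (B, C, H, 1)
            x_w = x.mean(dim=2, keepdim=True).transpose(2, 3)      # (B, C, W, 1)
            y = torch.cat([x_h, x_w], dim=2)                       # (B, C, H+W, 1)
            y = self.act(self.bn(self.conv1(y)))                   # squeeze to C/r
            y_h, y_w = torch.split(y, [h, w], dim=2)
            a_h = torch.sigmoid(self.conv_h(y_h))                  # height attention
            a_w = torch.sigmoid(self.conv_w(y_w).transpose(2, 3))  # width attention
            return x * a_h * a_w                                   # re-weighted features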
Step six, train the model and test the detection effect in a low-illumination environment. The dataset A from step two is randomly divided at a ratio of 8:1:1 into a training set, a validation set, and a test set, and training finally generates the low-illumination target detection model.
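The 8:1:1 split can be reproduced with a few lines; the helper name and random seed are arbitrary:

    import random

    def split_dataset(sample_ids, ratios=(0.8, 0.1, 0.1), seed=42):
        """Randomly divide a list of sample identifiers into training,
        validation, and test subsets at the given ratios."""
        ids = list(sample_ids)
        random.Random(seed).shuffle(ids)
        n_train = int(len(ids) * ratios[0])
        n_val = int(len(ids) * ratios[1])
        return ids[:n_train], ids[n_train:n_train + n_val], ids[n_train + n_val:]

    # Example: 7363 ExDark images + 2760 screened VOC2007 images form dataset A.
    train, val, test = split_dataset(range(7363 + 2760))
    print(len(train), len(val), len(test))  # 8098 1012 1013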
Experimental results:
and (3) simultaneously sending the low-illumination image into a common target detection model and a model of the embodiment to verify the weak light detection effect. The experimental results are shown in table 1,
table 1 comparison of results
The model of the invention exceeds YOLOX in both detection speed and accuracy, and its model size is smaller. In addition, the invention downsamples the original image before enhancement and restores the enhanced result by upsampling, which reduces the model's demand for computing power without harming the enhancement effect, makes processing faster, and meets the requirement of real-time detection.
While the invention has been described in detail in connection with specific embodiments thereof, it will be understood that the individual details are not limited to the particular embodiments described above, and that various changes and modifications may be made by one skilled in the art without departing from the spirit of the invention, within the scope of the appended claims.

Claims (8)

1. A real-time low-illumination image target detection method based on a convolutional neural network comprises the following steps:
step one, configuring a deep learning software environment, and configuring an image enhancement algorithm and a target detection algorithm environment based on a convolutional neural network;
step two, constructing a low-illumination image data set, acquiring a real low-illumination image, marking the image, and summarizing the image into a tag data set;
step three, an enhancement recovery module is established, and a weight mechanism is used for inhibiting noise amplification caused by image enhancement;
step four, constructing a deep neural network and establishing an 'enhancement + detection' mode, wherein the dataset is first image-enhanced as the preprocessing part of the detection network, and feature extraction and detection then follow;
step five, a lightweight network is adopted, the image is subjected to downsampling treatment in the preprocessing process, and standard convolution is replaced by depth separable convolution, so that the real-time requirement of the network is ensured;
step six, optimizing the network, focusing on the channel and space position information by using a high-efficiency coordinate attention mechanism, and enhancing the characteristic learning capability of the network;
and step seven, training a neural network model, and verifying the detection effect of the low-light environment.
2. The method according to claim 1, wherein the target detection algorithm in the first step is a two-step algorithm based on candidate regions or a single-step algorithm based on regression; the image enhancement algorithm is a histogram image enhancement algorithm, a tone mapped image enhancement algorithm, or a Retinex image enhancement algorithm.
3. The method for detecting real-time low-illuminance images according to claim 1, wherein in step two the low-illuminance image dataset is a composite dataset: real low-illuminance images are first collected from the internet, and low-illuminance images screened from public datasets are added for expansion; the images are then labeled and summarized into a tag dataset; finally, the tag dataset is divided into a training set, a validation set, and a test set.
4. The method according to claim 1, wherein the enhancement recovery module first fuses the enhanced image with the original image as input in a dual-channel manner, and adds a weight to each feature channel by using a channel attention mechanism; secondly, learning the importance of each channel in the feature map through a neural network; and finally, aggregating the input characteristic channels of the two channels according to the weights, improving the attention of the module to the target characteristic information channel, and inhibiting the influence of noise amplification during image enhancement.
5. The method according to claim 1, wherein the 'enhancement + detection' network in step four consists of a low-illumination image enhancement algorithm, a cascade module, and a target detection algorithm, and comprises an image preprocessing layer, a feature fusion layer, a feature extraction layer, and a prediction layer; the low-illumination image enhancement algorithm serves as the preprocessing layer of the network to enhance image quality and comprises a histogram equalization method or a method based on Retinex theory or depth curve estimation; the cascade module is the enhancement recovery module of claim 4; the target detection algorithm comprises a two-step algorithm based on candidate regions or a single-step algorithm based on classification regression; feature fusion is performed by a network model such as CSPDarknet, VGG, or MobileNet, feature extraction by a network structure such as a feature pyramid FPN, PANet, or BiFPN, and classification regression by 3×3 and 1×1 convolution modules, which calculate the intersection-over-union and predict the probability that a target appears in a prior box.
6. The method for detecting a real-time low-illuminance image according to claim 1, wherein in the model lightweighting of step five the preprocessing part takes the downsampled small-scale image as the input of the preprocessing layer, reducing the computational cost of convolutional-layer learning; the enhanced image is then restored to the original resolution by upsampling for the subsequent stages; finally, ordinary convolution is replaced with depthwise separable convolution, reducing the parameter count to roughly one ninth of the original.
7. The method for detecting a real-time low-illuminance image according to claim 1, wherein in step six the optimization adds an attention mechanism to the feature extraction network to compensate for the accuracy loss caused by the lightweight network, so that the overall model balances accuracy and speed.
8. The method according to claim 1, wherein the model training part first randomly divides the low-illuminance image dataset into a training set, a validation set, and a test set at a ratio of 8:1:1 to generate a low-illuminance image target detection model; the detection effect is then verified by capturing real images under low-illumination conditions, feeding them into both a target detection model without image enhancement and the low-illumination target detection model of claim 1, and comparing the results.
CN202310940678.7A 2023-07-28 2023-07-28 Real-time low-illumination image target detection method based on convolutional neural network Pending CN117115616A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310940678.7A CN117115616A (en) 2023-07-28 2023-07-28 Real-time low-illumination image target detection method based on convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310940678.7A CN117115616A (en) 2023-07-28 2023-07-28 Real-time low-illumination image target detection method based on convolutional neural network

Publications (1)

Publication Number Publication Date
CN117115616A 2023-11-24

Family

ID=88811871

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310940678.7A Pending CN117115616A (en) 2023-07-28 2023-07-28 Real-time low-illumination image target detection method based on convolutional neural network

Country Status (1)

Country Link
CN (1) CN117115616A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118071752A (en) * 2024-04-24 2024-05-24 中铁电气化局集团有限公司 Contact net detection method
CN118071752B (en) * 2024-04-24 2024-07-19 中铁电气化局集团有限公司 Contact net detection method


Legal Events

Date Code Title Description
PB01 Publication