CN117095158A - Terahertz image dangerous article detection method based on multi-scale decomposition convolution

Terahertz image dangerous article detection method based on multi-scale decomposition convolution

Info

Publication number
CN117095158A
Authority
CN
China
Prior art keywords
image
feature
scale
convolution
detected
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202311063505.8A
Other languages
Chinese (zh)
Other versions
CN117095158B (en)
Inventor
吴衡
郭梓杰
罗劭娟
陈梅云
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong University of Technology filed Critical Guangdong University of Technology
Priority to CN202311063505.8A priority Critical patent/CN117095158B/en
Priority claimed from CN202311063505.8A external-priority patent/CN117095158B/en
Publication of CN117095158A publication Critical patent/CN117095158A/en
Application granted granted Critical
Publication of CN117095158B publication Critical patent/CN117095158B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/20 - Image preprocessing
    • G06V10/25 - Determination of region of interest [ROI] or a volume of interest [VOI]
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/0464 - Convolutional networks [CNN, ConvNet]
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/40 - Extraction of image or video features
    • G06V10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • G06V10/449 - Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
    • G06V10/451 - Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
    • G06V10/454 - Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/40 - Extraction of image or video features
    • G06V10/52 - Scale-space analysis, e.g. wavelet analysis
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 - Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80 - Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806 - Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 - Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 - Target detection

Abstract

The application discloses a terahertz image dangerous article detection method based on multi-scale decomposition convolution, which comprises the following steps: acquiring an image of a target object to be detected by using terahertz imaging equipment; processing the image of the object to be detected to construct an input image data set; constructing a target detection network model, inputting the image of the target object to be detected into the target detection network model, generating feature layers of different sizes, carrying out multi-scale feature fusion on the features in the feature layers, extracting multi-scale features, and identifying hidden dangerous articles, wherein the target detection network model is obtained by training on the input image data set; outputting a hidden dangerous article detection result image comprising a dangerous article detection frame, a dangerous article class number and a predicted probability value if a dangerous article exists in the image of the object to be detected, and outputting an image consistent with the input if no dangerous article exists in the image of the object to be detected.

Description

Terahertz image dangerous article detection method based on multi-scale decomposition convolution
Technical Field
The application belongs to the technical field of terahertz detection, and particularly relates to a terahertz image dangerous article detection method based on multi-scale decomposition convolution.
Background
Terahertz imaging technology uses a terahertz radiation source to illuminate an object and captures the transmitted or reflected radiation for imaging. Owing to the frequency and wavelength characteristics of terahertz radiation and its harmlessness to the human body, terahertz imaging has great application potential in the biomedical and security fields, for example in detecting flammable and explosive substances, drugs, illicit guns and other dangerous articles. However, because of factors such as the hardware of the terahertz imaging system and interference from the external environment, terahertz images suffer from severe noise, low signal-to-noise ratio and contrast, and blurring. These problems greatly affect the accuracy of security detection, and traditional target detection systems, which cannot accurately recognize and locate dangerous articles in low-quality terahertz images, suffer from low detection precision and low recognition rate. With the development of deep learning, the detection task can be completed automatically by a deep learning model trained on a large set of samples. In addition, a target detection method based on multi-scale decomposition convolution can improve accuracy while reducing model complexity, making it suitable for deployment on terminal devices. Therefore, developing a target detection algorithm that can more accurately identify the types of dangerous articles in terahertz images with high detection accuracy is very helpful for the application and development of terahertz image detection technology.
Disclosure of Invention
In order to solve the above technical problems, the application provides a terahertz image dangerous article detection method based on multi-scale decomposition convolution. By means of a deep learning target detection algorithm and a network model optimization method, the detection accuracy and recognition rate of dangerous articles in terahertz images are improved. The method is expected to be widely applied to security detection in scenes such as subways, airports and border checkpoints.
In order to achieve the above purpose, the application provides a terahertz image dangerous article detection method based on multi-scale decomposition convolution, which comprises the following steps:
acquiring an image of a target object to be detected by using terahertz imaging equipment;
processing the image of the object to be detected to construct an input image data set;
constructing a target detection network model, inputting the image of the target object to be detected into the target detection network model, generating feature layers with different sizes, carrying out multi-scale feature fusion on features in the feature layers, extracting multi-scale features, and identifying hidden dangerous goods, wherein the target detection network model is obtained by training the input image data set;
outputting a hidden dangerous article detection result image comprising a dangerous article detection frame, a dangerous article class number and a predicted probability value if a dangerous article exists in the image of the object to be detected, and outputting an image consistent with the input if no dangerous article exists in the image of the object to be detected.
Optionally, processing the image of the object to be detected includes:
converting dangerous goods in the image of the object to be detected into tag data by using a rectangular frame, and obtaining an image containing the tag data;
and performing Mosaic data enhancement, random horizontal flipping and random scaling on the image containing the tag data to complete the construction of the input image data set.
Optionally, the object detection network model includes: a feature extraction backbone network, a feature fusion network and a feature detection network;
the feature extraction backbone network performs shallow feature extraction on the image of the object to be detected through convolution operation to obtain feature layers with different sizes;
the feature fusion network performs multi-scale feature fusion on the feature layers with different sizes, extracts multi-scale features and acquires a multi-scale feature map;
and the characteristic detection network predicts the multi-scale characteristic map and outputs a prediction result map.
Optionally, a self-adaptive multi-scale large-kernel decomposition convolution module and an attention mechanism BRA (bi-level routing attention) are added into the feature extraction backbone network; the self-adaptive multi-scale large-kernel decomposition convolution module and the parameter-free 3-D local attention SimAM are added into the feature fusion network;
the self-adaptive multi-scale large-kernel decomposition convolution module carries out multi-scale decomposition and self-adaptive fusion on the input feature images.
Optionally, the adaptive multi-scale large-kernel decomposition convolution module includes a depth convolution, a depth expansion convolution, and a point-by-point convolution.
Optionally, before the adaptive multi-scale large-kernel decomposition convolution module performs multi-scale decomposition and adaptive fusion on the input feature map, the method includes: and carrying out convolution and classification operation on the input feature images to obtain a plurality of feature images.
Optionally, the performing multi-scale decomposition and self-adaptive fusion on the input feature map by the self-adaptive multi-scale large-kernel decomposition convolution module includes:
extracting one of the feature maps as a first feature map;
respectively inputting the remaining feature maps of the plurality of feature maps into different self-adaptive multi-scale large-kernel decomposition convolution modules;
setting different large convolution kernels and different expansion rates in the different self-adaptive multi-scale large-kernel decomposition convolution modules, setting depth expansion convolution, depth convolution and point-by-point convolution in the different self-adaptive multi-scale large-kernel decomposition convolution modules, performing multi-scale decomposition on the remaining feature maps, and outputting a plurality of new feature maps;
connecting the plurality of new feature maps in the channel dimension by a cascading operation to obtain a second feature map;
performing feature fusion and channel-dimension reduction on the second feature map through a convolution operation to obtain a third feature map;
performing Softmax and channel-separation operations on the third feature map to obtain spatially adaptive weights;
carrying out weighted aggregation on the plurality of new feature maps and the spatially adaptive weights to obtain a fourth feature map;
and performing cascading and convolution operations on the fourth feature map and the first feature map to obtain a fifth feature map, thereby realizing self-adaptive fusion.
Optionally, the target parameter in the adaptive multi-scale large-kernel decomposition convolution module is a preset target value.
Optionally, the mathematical model for outputting the hidden dangerous goods detection result image is:
O = O(x, y) = D(F′, Θ̂)
wherein O is the hidden dangerous article detection result image, Θ̂ denotes the optimized parameters, F′ is the image group obtained after processing by the feature extraction backbone network and the feature fusion network, D(·) denotes the target detection function, Ψ denotes the parameters of the neural network, and (x, y) denotes the pixel coordinates of the output detection frame.
Optionally, the training process of the object detection network model through the input image dataset includes:
optimizing the Loss function Loss(Θ) by adopting the SGD optimizer:
L_b = L_CIoU + L_DFL
L_DFL = -((y_{i+1} - y)log(S_i) + (y - y_i)log(S_{i+1}))
wherein N is the number of detection layers, L_b is the bounding-box regression loss function, L_c is the classification loss function, α_1 and α_2 are the weight coefficients of the loss functions, L_CIoU and L_DFL are the bounding-box loss functions, IoU is the intersection-over-union, ρ is the distance between the center points of the predicted frame and the real frame, p and g are the center points of the predicted frame and the real frame respectively, c is the diagonal distance of the minimum circumscribed rectangle of the two frames, v is the parameter measuring the consistency of the aspect ratio, y_{i+1} is the nearest integer to the right of the true value, y_i is the nearest integer to the left of the true value, n is the number of samples, B_i is the target value, and S_i is the model output value.
The application has the following technical effects: the self-adaptive multi-scale large-kernel decomposition convolution module adopted by the application effectively enlarges the receptive field while reducing complexity, and improves the capability of identifying and extracting dangerous-article features in the image. In addition, attention mechanisms are added to the feature extraction backbone network and the feature fusion layer, so that global features can be exploited to improve the detection precision and recognition rate of dangerous articles. The method is conducive to the application and research of terahertz image dangerous article detection technology.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application. In the drawings:
FIG. 1 is a schematic diagram of a network model architecture of a target detection algorithm according to an embodiment of the present application;
FIG. 2 is a schematic diagram of an adaptive multi-scale large-kernel decomposition convolution module architecture according to an embodiment of the present application;
fig. 3 is a schematic diagram of an attention mechanism BRA module architecture according to an embodiment of the present application;
fig. 4 is a flowchart of a terahertz image dangerous article detection method based on multi-scale decomposition convolution in an embodiment of the application.
Detailed Description
It should be noted that, without conflict, the embodiments of the present application and features of the embodiments may be combined with each other. The application will be described in detail below with reference to the drawings in connection with embodiments.
It should be noted that the steps illustrated in the flowcharts of the figures may be performed in a computer system such as a set of computer executable instructions, and that although a logical order is illustrated in the flowcharts, in some cases the steps illustrated or described may be performed in an order other than that illustrated herein.
As shown in fig. 4, in this embodiment, a method for detecting a terahertz image dangerous article based on multi-scale decomposition convolution is provided, including: acquiring an image of a target object to be detected by using terahertz imaging equipment;
processing the image of the object to be detected to construct an input image data set;
constructing a target detection network model, inputting the image of the target object to be detected into the target detection network model, generating feature layers with different sizes, carrying out multi-scale feature fusion on features in the feature layers, extracting multi-scale features, and identifying hidden dangerous goods, wherein the target detection network model is obtained by training the input image data set;
outputting a hidden dangerous article detection result image comprising a dangerous article detection frame, a dangerous article class number and a predicted probability value if a dangerous article exists in the image of the object to be detected, and outputting an image consistent with the input if no dangerous article exists in the image of the object to be detected.
Acquiring images of target objects to be detected by using terahertz imaging equipment, processing the images and constructing an input image data set (i.e., producing a terahertz image training data set of human bodies carrying hidden dangerous articles) comprises the following steps:
The terahertz imaging equipment is used to capture N = 3157 images of objects to be detected, each image being denoted S_i, i = 1, 2, …, 3157. The dangerous articles in the N = 3157 images are framed with rectangular boxes and converted into tag data to obtain images containing tag data; the images containing tag data are then preprocessed by Mosaic enhancement, random horizontal flipping, random scaling and the like to obtain preprocessed sub-images. After all N = 3157 images of the objects to be detected are preprocessed in the same manner, an input image set A containing N = 3157 images is obtained.
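As an illustration of this preprocessing step, the following is a minimal sketch of the described augmentations (Mosaic stitching, random left-right flipping and random scaling) using NumPy and OpenCV; the function names, the 640-pixel output size and the scale range are illustrative assumptions rather than the parameters used in the embodiment.

```python
import random
import numpy as np
import cv2  # assumed available for resizing

def random_flip_scale(img, boxes, scale_range=(0.5, 1.5)):
    """Randomly flip an image left-right and rescale it, keeping boxes (x1, y1, x2, y2) consistent."""
    h, w = img.shape[:2]
    if random.random() < 0.5:                        # random left-right flip
        img = img[:, ::-1].copy()
        boxes = boxes.copy()
        boxes[:, [0, 2]] = w - boxes[:, [2, 0]]
    s = random.uniform(*scale_range)                 # random size scaling
    img = cv2.resize(img, (int(w * s), int(h * s)))
    return img, boxes * s

def mosaic4(samples, out_size=640):
    """Mosaic enhancement: stitch four (image, boxes) samples into one training image."""
    canvas = np.full((out_size, out_size, 3), 114, dtype=np.uint8)
    half = out_size // 2
    merged = []
    for k, (img, boxes) in enumerate(samples[:4]):
        h, w = img.shape[:2]
        ox, oy = (k % 2) * half, (k // 2) * half     # quadrant offset
        canvas[oy:oy + half, ox:ox + half] = cv2.resize(img, (half, half))
        b = boxes.astype(np.float32).copy()
        b[:, [0, 2]] = b[:, [0, 2]] * half / w + ox  # rescale and shift x coordinates
        b[:, [1, 3]] = b[:, [1, 3]] * half / h + oy  # rescale and shift y coordinates
        merged.append(b)
    return canvas, np.concatenate(merged, axis=0)
```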
The target detection network model is trained by the input image set.
Inputting the image of the object to be detected into the object detection network model, and detecting the image of the object to be detected (the object detection algorithm obtains the detection image of the hidden dangerous goods carried by the human body) comprises the following steps:
As shown in fig. 1, the target detection algorithm obtains the detection image of hidden dangerous articles carried by a human body through a deep learning neural network with multi-scale decomposition convolution and adaptive fusion. The image of the object to be detected I ∈ R^{3×640×640} is input into the target detection network model, and the hidden dangerous article detection image O ∈ R^{3×640×640} is output. The mathematical model can be expressed as follows:
O = O(x, y) = Φ(I, Ψ)
In the above formula, O(x, y) represents the dangerous article detection image, Φ(·) represents the neural network model of the target detection algorithm, Ψ is the parameters of the neural network, (x, y) represents the pixel coordinates of the output detection frame, and I is the image of the object to be detected. If a dangerous article exists in the detected image, the output detection frame is drawn on the detected image, and the class number of the dangerous article and its predicted probability value are marked on the detection frame. Otherwise, if no dangerous article exists in the detected image, the output image is identical to the input image.
The object detection network model includes:
in a feature extraction backbone network, shallow feature extraction is carried out through convolution operation to obtain a feature map with the size reduced by half in sequence, three feature layers with different sizes generated by three back-layer convolution operation in the backbone network are utilized to carry out multi-scale feature fusion at a feature fusion stage, so that the multi-scale feature extraction is facilitated, and the recognition rate of hidden dangerous goods is improved.
A self-adaptive multi-scale large-kernel decomposition convolution module AMDC is designed in the feature extraction process, so that the network enlarges its receptive field while reducing complexity, and the capability of identifying and extracting dangerous-article features in the image is improved. As shown in fig. 2, the designed adaptive multi-scale large-kernel decomposition convolution module is implemented as follows: given a C×H×W = 64×160×160 feature map f, the feature map f is first convolved and split to obtain four C/4×H×W = 16×160×160 feature maps f_1, f_2, f_3 and f_4, and three of them, f_1, f_2 and f_3, are respectively input into three large-kernel decomposition convolution modules. Each large-kernel decomposition convolution module comprises three convolutions, namely a depth-wise convolution, a depth-wise dilated convolution and a point-wise convolution, wherein the depth-wise dilated convolution is realized by setting the dilation rate of a depth-wise convolution. Assuming the dilation rate is d = 3, a conventional convolution with a K×K = 9×9 large kernel can be decomposed into a depth-wise convolution with a (2d-1)×(2d-1) = 5×5 kernel, a depth-wise dilated convolution with a K/d×K/d = 3×3 kernel, and a point-wise convolution with a 1×1 kernel. Finally, in the three large-kernel decomposition convolution modules, three large convolution kernels of different sizes, K_1 = 5, K_2 = 21 and K_3 = 45, and different dilation rates d_1 = 1, d_2 = 3 and d_3 are set to perform the decomposition, thereby realizing multi-scale decomposition. The mathematical model can be expressed as follows:
D_i = P_c(D_dc(D_c(f_i))), i = 1, 2, 3
In the above formula, P_c(·) denotes the point-wise convolution function, D_dc(·) denotes the depth-wise dilated convolution function, D_c(·) denotes the depth-wise convolution function, and D_i denotes the feature map output by the i-th large-kernel decomposition convolution module.
After the multi-scale large-kernel decomposition convolution modules output the three feature maps D_i ∈ R^{16×160×160}, i = 1, 2, 3, the three feature maps are first connected in the channel dimension by a cascading operation to obtain a feature map D ∈ R^{48×160×160}. Next, feature fusion and channel-dimension reduction are performed on the feature map D through a convolution operation with kernel size 3 to obtain a feature map D′ ∈ R^{C″×H×W}, and then Softmax and channel-separation operations are performed to obtain three spatially adaptive weights Q_i, i = 1, 2, 3. The three input feature maps D_i ∈ R^{16×160×160}, i = 1, 2, 3 are respectively weighted and aggregated with the three spatially adaptive weights Q_i, i = 1, 2, 3 to obtain an output feature map D″ ∈ R^{16×160×160}. Finally, the output feature map D″ and the feature map f_4 are subjected to cascading and convolution operations to obtain a feature map f′ ∈ R^{64×160×160}, realizing the self-adaptive fusion of the features. In addition, in the feature extraction backbone part, if the parameter shortcut of the self-adaptive multi-scale large-kernel decomposition convolution module AMDC is set to True, the feature map f is passed through a convolution operation with kernel size 1 and added to the feature map f′ to realize a residual connection. The mathematical model can be expressed as follows:
D = Concat(D_1, D_2, D_3)
[Q_1, Q_2, Q_3] = S_p(S_o(C(D)))
D″ = Q_1·D_1 + Q_2·D_2 + Q_3·D_3
f′ = C(Concat(D″, f_4))
f′ = f′ + C(f), if shortcut = True
In the above formulas, Concat(·) denotes the cascading (concatenation) operation, S_p(·) denotes the channel-separation operation, S_o(·) denotes the Softmax function, and C(·) denotes the convolution operation.
As shown in fig. 1, in order to improve the global representation capability and the detection accuracy for dangerous articles in terahertz images, the attention mechanism BRA shown in fig. 3 is designed in the feature extraction backbone network. By introducing the attention mechanism BRA into the feature extraction backbone network of the target detection network model, the network can exploit global features to improve detection accuracy. Given an input feature map I ∈ R^{20×20×256}, the feature map I is divided into S×S = 4×4 mutually non-overlapping regions, and the feature vectors therein are reshaped into I_r ∈ R^{16×25×256}; at the same time, tensors Q, K, V ∈ R^{16×25×256} are derived. The mean values Q_r, K_r ∈ R^{16×256} of Q and K in each region of the feature map are calculated, the region-to-region correlation between Q_r and K_r yields an adjacency matrix A_r, and the first k connection indexes of each region are retained to obtain an index matrix X_r. Finally, K and V are respectively aggregated with the index matrix X_r to obtain tensors K_g and V_g, and an attention operation is applied to the aggregated key-value pairs to obtain the feature map P. The mathematical model can be expressed as follows:
Q = I_r·W_q, K = I_r·W_k, V = I_r·W_v
A_r = Q_r·(K_r)^T
K_g = g(K, X_r), V_g = g(V, X_r)
P = Attention(Q, K_g, V_g)
In the above formulas, W_q, W_k, W_v ∈ R^{C×C} are the projection weights of the query, key and value respectively, g(·) denotes the gather operation, and Attention(·) denotes the attention operation.
In addition, in order to better distinguish the feature differences between dangerous articles and the background, the parameter-free 3-D local attention SimAM is introduced into the feature fusion layer, which carries deep semantic information, as shown in fig. 1. Introducing SimAM attention into the feature fusion module of the network model enhances the feature representation of the target detection model and improves the accuracy of target detection. The implementation process is as follows: given an input feature map Z ∈ R^{1×128×80×80}, the 3-D attention weight of the feature map is derived by computing the energy function EI(·) and activated with the sigmoid(·) function to obtain an activation weight; the input feature map is then multiplied by the activation weight to obtain the output attention weight Q_at. The mathematical model can be expressed as follows:
Q_at = Z × sigmoid(EI(Z))
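Since the patent does not spell out EI(·), the sketch below uses the standard closed-form energy of SimAM (squared deviation from the channel mean, normalized by the channel variance); the regularization coefficient lam = 1e-4 is an assumed default.

```python
import torch

def simam(z, lam=1e-4):
    """Parameter-free 3-D attention: Q_at = Z * sigmoid(EI(Z)), with EI(.) taken as the
    standard SimAM inverse energy; z has shape (B, C, H, W)."""
    n = z.shape[2] * z.shape[3] - 1
    d = (z - z.mean(dim=(2, 3), keepdim=True)).pow(2)   # squared deviation from the channel mean
    v = d.sum(dim=(2, 3), keepdim=True) / n             # per-channel variance estimate
    e_inv = d / (4 * (v + lam)) + 0.5                   # inverse energy of each pixel
    return z * torch.sigmoid(e_inv)
```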
after feature extraction backbone network extraction features and feature fusion are carried out on the detected object image, a feature map with n=3 sizes reduced by half in sequence is obtainedFeature map O' = [ O ] by Detect module 1 ′,O 2 ′,O 3 ′]And (5) predicting and outputting a prediction result graph O. The mathematical model can be expressed as follows:
O=O(x,y)=D(O′)
in the above formula, D (·) represents an objective detection function, and O' is an array including n input feature maps.
In the training process of the deep neural network, the loss function Loss(Θ) is optimized by the SGD optimizer. The process is expressed as follows:
L_b = L_CIoU + L_DFL
L_DFL = -((y_{i+1} - y)log(S_i) + (y - y_i)log(S_{i+1}))
wherein N is the number of detection layers, L_b is the bounding-box regression loss function, L_c is the classification loss function, α_1 and α_2 are the weight coefficients of the loss functions, L_CIoU and L_DFL are the bounding-box loss functions, IoU is the intersection-over-union, ρ is the distance between the center points of the predicted frame and the real frame, p and g are the center points of the predicted frame and the real frame respectively, c is the diagonal distance of the minimum circumscribed rectangle of the two frames, v is the parameter measuring the consistency of the aspect ratio, y_{i+1} is the nearest integer to the right of the true value, y_i is the nearest integer to the left of the true value, n is the number of samples, B_i is the target value, and S_i is the model output value.
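A sketch of the bounding-box part of this loss, L_b = L_CIoU + L_DFL: the CIoU term uses torchvision's complete_box_iou_loss (assumes torchvision >= 0.15), and the DFL term is the weighted cross-entropy over the two integer bins y_i and y_{i+1} that bracket the continuous target, matching the L_DFL formula above. The classification loss L_c, the weights α_1 and α_2, and the matching of predictions to ground truth are omitted; names are illustrative.

```python
import torch
import torch.nn.functional as F
from torchvision.ops import complete_box_iou_loss   # assumes torchvision >= 0.15

def dfl_loss(pred_dist, target):
    """L_DFL = -((y_{i+1} - y) log S_i + (y - y_i) log S_{i+1}).
    pred_dist: (n, reg_max + 1) logits over discrete bins; target: (n,) values in [0, reg_max)."""
    yl = target.floor().long()                       # y_i, nearest integer to the left
    yr = yl + 1                                      # y_{i+1}, nearest integer to the right
    wl = yr.float() - target                         # weight of the left bin
    wr = target - yl.float()                         # weight of the right bin
    return (F.cross_entropy(pred_dist, yl, reduction="none") * wl
            + F.cross_entropy(pred_dist, yr, reduction="none") * wr).mean()

def box_loss(pred_boxes, true_boxes, pred_dist, dist_target):
    """L_b = L_CIoU + L_DFL for matched predicted / ground-truth boxes in (x1, y1, x2, y2) form."""
    return complete_box_iou_loss(pred_boxes, true_boxes, reduction="mean") + dfl_loss(pred_dist, dist_target)
```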
After m = 300 rounds of training, the optimized parameters Θ̂ can be obtained.
For an image F of an object to be detected captured by the terahertz imaging equipment, an image group F′ is obtained after processing by the feature extraction backbone network and the feature fusion network; the image group F′ is then input into the feature detection network to obtain the dangerous article detection result, namely O = D(F′, Θ̂), yielding the hidden dangerous article detection result image containing the dangerous article detection frame, the dangerous article class number and the predicted probability value.
The present application is not limited to the above-mentioned embodiments, and any changes or substitutions that can be easily understood by those skilled in the art within the technical scope of the present application are intended to be included in the scope of the present application. Therefore, the protection scope of the present application should be subject to the protection scope of the claims.

Claims (10)

1. A terahertz image dangerous goods detection method based on multi-scale decomposition convolution is characterized by comprising the following steps:
acquiring an image of a target object to be detected by using terahertz imaging equipment;
processing the image of the object to be detected to construct an input image data set;
constructing a target detection network model, inputting the image of the target object to be detected into the target detection network model, generating feature layers with different sizes, carrying out multi-scale feature fusion on features in the feature layers, extracting multi-scale features, and identifying hidden dangerous goods, wherein the target detection network model is obtained by training the input image data set;
outputting a hidden dangerous article detection result image comprising a dangerous article detection frame, a dangerous article class number and a predicted probability value if a dangerous article exists in the image of the object to be detected, and outputting an image consistent with the input if no dangerous article exists in the image of the object to be detected.
2. The terahertz image dangerous article detection method based on multi-scale decomposition convolution according to claim 1, wherein processing the image of the object to be detected comprises the following steps:
converting dangerous goods in the image of the object to be detected into tag data by using a rectangular frame, and obtaining an image containing the tag data;
and performing Mosaic data enhancement, random horizontal flipping and random scaling on the image containing the tag data to complete the construction of the input image data set.
3. The terahertz image dangerous article detection method based on multi-scale decomposition convolution of claim 1, wherein the target detection network model includes: a feature extraction backbone network, a feature fusion network and a feature detection network;
the feature extraction backbone network performs shallow feature extraction on the image of the object to be detected through convolution operation to obtain feature layers with different sizes;
the feature fusion network performs multi-scale feature fusion on the feature layers with different sizes, extracts multi-scale features and acquires a multi-scale feature map;
and the characteristic detection network predicts the multi-scale characteristic map and outputs a prediction result map.
4. The terahertz image dangerous article detection method based on multi-scale decomposition convolution of claim 3, wherein a self-adaptive multi-scale large-kernel decomposition convolution module and an attention mechanism BRA are added into the feature extraction backbone network; the self-adaptive multi-scale large-kernel decomposition convolution module and the parameter-free 3-D local attention SimAM are added into the feature fusion network;
the self-adaptive multi-scale large-kernel decomposition convolution module carries out multi-scale decomposition and self-adaptive fusion on the input feature images.
5. The terahertz image dangerous article detection method based on multi-scale decomposition convolution of claim 4, wherein the self-adaptive multi-scale large-kernel decomposition convolution module comprises a depth convolution, a depth expansion convolution and a point-by-point convolution.
6. The terahertz image dangerous article detection method based on multi-scale decomposition convolution of claim 5, wherein before the self-adaptive multi-scale large-kernel decomposition convolution module performs multi-scale decomposition and self-adaptive fusion on the input feature map, the method comprises: and carrying out convolution and classification operation on the input feature images to obtain a plurality of feature images.
7. The terahertz image dangerous article detection method based on multi-scale decomposition convolution of claim 6, wherein the adaptive multi-scale large-kernel decomposition convolution module performs multi-scale decomposition and adaptive fusion on the input feature map, including:
extracting one of the feature maps as a first feature map;
respectively inputting the remaining feature maps of the plurality of feature maps into different self-adaptive multi-scale large-kernel decomposition convolution modules;
setting different large convolution kernels and different expansion rates in the different self-adaptive multi-scale large-kernel decomposition convolution modules, setting depth expansion convolution, depth convolution and point-by-point convolution in the different self-adaptive multi-scale large-kernel decomposition convolution modules, performing multi-scale decomposition on the remaining feature maps, and outputting a plurality of new feature maps;
connecting the plurality of new feature maps in the channel dimension by a cascading operation to obtain a second feature map;
performing feature fusion and channel-dimension reduction on the second feature map through a convolution operation to obtain a third feature map;
performing Softmax and channel-separation operations on the third feature map to obtain spatially adaptive weights;
carrying out weighted aggregation on the plurality of new feature maps and the spatially adaptive weights to obtain a fourth feature map;
and performing cascading and convolution operations on the fourth feature map and the first feature map to obtain a fifth feature map, thereby realizing self-adaptive fusion.
8. The terahertz image dangerous article detection method based on multi-scale decomposition convolution of claim 4, wherein the target parameter in the adaptive multi-scale large-kernel decomposition convolution module is a preset target value.
9. The terahertz image dangerous article detection method based on multi-scale decomposition convolution of claim 3, wherein a mathematical model for outputting the hidden dangerous article detection result image is:
O = O(x, y) = D(F′, Θ̂)
wherein O is the hidden dangerous article detection result image, Θ̂ denotes the optimized parameters, F′ is the image group obtained after processing by the feature extraction backbone network and the feature fusion network, D(·) denotes the target detection function, Ψ denotes the parameters of the neural network, and (x, y) denotes the pixel coordinates of the output detection frame.
10. The terahertz image dangerous article detection method based on multi-scale decomposition convolution according to claim 1, wherein the training process of the target detection network model through the input image dataset comprises the following steps:
optimizing the Loss function Loss(Θ) by adopting the SGD optimizer:
L_b = L_CIoU + L_DFL
L_DFL = -((y_{i+1} - y)log(S_i) + (y - y_i)log(S_{i+1}))
wherein N is the number of detection layers, L_b is the bounding-box regression loss function, L_c is the classification loss function, α_1 and α_2 are the weight coefficients of the loss functions, L_CIoU and L_DFL are the bounding-box loss functions, IoU is the intersection-over-union, ρ is the distance between the center points of the predicted frame and the real frame, p and g are the center points of the predicted frame and the real frame respectively, c is the diagonal distance of the minimum circumscribed rectangle of the two frames, v is the parameter measuring the consistency of the aspect ratio, y_{i+1} is the nearest integer to the right of the true value, y_i is the nearest integer to the left of the true value, n is the number of samples, B_i is the target value, and S_i is the model output value.
CN202311063505.8A 2023-08-23 Terahertz image dangerous article detection method based on multi-scale decomposition convolution Active CN117095158B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311063505.8A CN117095158B (en) 2023-08-23 Terahertz image dangerous article detection method based on multi-scale decomposition convolution

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311063505.8A CN117095158B (en) 2023-08-23 Terahertz image dangerous article detection method based on multi-scale decomposition convolution

Publications (2)

Publication Number Publication Date
CN117095158A true CN117095158A (en) 2023-11-21
CN117095158B CN117095158B (en) 2024-04-26

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200012940A1 (en) * 2017-03-17 2020-01-09 Portland State University Frame interpolation via adaptive convolution and adaptive separable convolution
CN110992324A (en) * 2019-11-26 2020-04-10 南京邮电大学 Intelligent dangerous goods detection method and system based on X-ray image
WO2022042470A1 (en) * 2020-08-31 2022-03-03 浙江商汤科技开发有限公司 Image decomposition method and related apparatus and device
CN115208847A (en) * 2021-03-25 2022-10-18 国际商业机器公司 Content analysis messaging routing
CN114581330A (en) * 2022-03-14 2022-06-03 广东工业大学 Terahertz image denoising method based on multi-scale mixed attention
US11631238B1 (en) * 2022-04-13 2023-04-18 Iangxi Electric Power Research Institute Of State Grid Method for recognizing distribution network equipment based on raspberry pi multi-scale feature fusion
CN114494891A (en) * 2022-04-15 2022-05-13 中国科学院微电子研究所 Dangerous article identification device and method based on multi-scale parallel detection
CN114862837A (en) * 2022-06-02 2022-08-05 西京学院 Human body security check image detection method and system based on improved YOLOv5s
CN115393796A (en) * 2022-08-26 2022-11-25 重庆邮电大学 X-ray security inspection image dangerous article detection method based on attention mechanism
CN115393719A (en) * 2022-08-29 2022-11-25 哈尔滨理工大学 Hyperspectral image classification method combining space spectral domain self-adaption and ensemble learning
CN115546223A (en) * 2022-12-05 2022-12-30 南京天创电子技术有限公司 Method and system for detecting loss of fastening bolt of equipment under train

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
C: "HSSNet: A End-to-End Network for Detecting Tiny Targets of Apple Leaf Diseases in Complex Backgrounds", 28 July 2023 (2023-07-28), pages 1 - 24 *
HENG WU et al.: "Multi-dimensional attention fusion network for terahertz image super-resolution", SSRN, 13 July 2023 (2023-07-13), pages 1 - 13 *
SONG HUAN et al.: "Research on terahertz image target detection fused with multi-scale attention" (in Chinese), 《小型微型计算机系统》 (Journal of Chinese Computer Systems), 31 March 2022 (2022-03-31), pages 1 - 5 *
HU ZHENG: "Research on ghost imaging algorithms based on deep learning" (in Chinese), 《中国优秀硕士学位论文电子期刊》, 15 February 2023 (2023-02-15)

Similar Documents

Publication Publication Date Title
CN109584248B (en) Infrared target instance segmentation method based on feature fusion and dense connection network
Frizzi et al. Convolutional neural network for video fire and smoke detection
US7813581B1 (en) Bayesian methods for noise reduction in image processing
CN111767882A (en) Multi-mode pedestrian detection method based on improved YOLO model
CN109902715B (en) Infrared dim target detection method based on context aggregation network
CN109544563B (en) Passive millimeter wave image human body target segmentation method for security inspection of prohibited objects
Greenspan et al. Learning texture discrimination rules in a multiresolution system
CN110245675B (en) Dangerous object detection method based on millimeter wave image human body context information
Soni et al. Hybrid meta-heuristic algorithm based deep neural network for face recognition
CN112766223B (en) Hyperspectral image target detection method based on sample mining and background reconstruction
Hassan et al. Deep CMST framework for the autonomous recognition of heavily occluded and cluttered baggage items from multivendor security radiographs
Miao et al. Detection of mines and minelike targets using principal component and neural-network methods
CN116469020A (en) Unmanned aerial vehicle image target detection method based on multiscale and Gaussian Wasserstein distance
Jiang et al. Symmetry detection algorithm to classify the tea grades using artificial intelligence
Koziarski et al. Marine snow removal using a fully convolutional 3d neural network combined with an adaptive median filter
Yao et al. Robust photon-efficient imaging using a pixel-wise residual shrinkage network
Raj et al. Object detection in live streaming video using deep learning approach
CN117115675A (en) Cross-time-phase light-weight spatial spectrum feature fusion hyperspectral change detection method, system, equipment and medium
CN117095158B (en) Terahertz image dangerous article detection method based on multi-scale decomposition convolution
Jangblad Object detection in infrared images using deep convolutional neural networks
CN117095158A (en) Terahertz image dangerous article detection method based on multi-scale decomposition convolution
Menaka et al. Classification of multispectral satellite images using sparse SVM classifier
Sara et al. MC-CDPNet: multi-channel correlated detail preserving network for X-Ray-based baggage screening
CN114758231A (en) Remote sensing image occlusion processing method and device based on supervised contrast learning
Patel et al. Depthwise convolution for compact object detector in nighttime images

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant