CN117132875A - Dangerous goods identification method, dangerous goods identification device and storage medium - Google Patents
- Publication number
- CN117132875A (application number CN202211278338.4A)
- Authority
- CN
- China
- Prior art keywords
- target detection
- semantic segmentation
- result
- head
- target
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/04—Inference or reasoning models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
Abstract
The invention discloses a dangerous goods identification method, a dangerous goods identification device and a storage medium, relating to the field of artificial intelligence. The dangerous goods identification method comprises a data processing stage, a training stage and an inference stage. It exploits the strengths of target detection and fuses rich semantic information by using the semantic segmentation result to correct the target detection result. Meanwhile, channel attention and spatial attention mechanisms are introduced during feature extraction to mitigate the inter-domain difference problem, which reduces the false and missed recognition caused by downsampling in target detection and by inter-domain differences between X-ray machines, thereby improving the detection effect and the recognition accuracy.
Description
Technical Field
The invention relates to the field of artificial intelligence, in particular to a dangerous goods identification method, a dangerous goods identification device and a storage medium.
Background
In existing security inspection and recognition systems, the X-ray images generated by an X-ray machine are inspected manually. Although this removes the need to unpack and inspect luggage, it places high demands on the professional competence of the inspectors, who must also maintain good endurance and vigilance toward dangerous objects.
However, over long working periods, fatigue degrades an inspector's perception, and the risk of false and missed detections increases accordingly.
Disclosure of Invention
The invention provides a dangerous goods identification method, a dangerous goods identification device and a storage medium, aiming at the problem that existing security inspection systems carry a high risk of false and missed detection of dangerous goods.
The technical solution provided by the invention for the above technical problem is as follows:
In a first aspect, the invention provides a dangerous goods identification method applied to a security inspection system, the method comprising the following steps:
a data processing stage:
performing target detection labeling on the collected X-ray image to obtain a target detection label; performing semantic segmentation labeling on the X-ray image to obtain a semantic segmentation label;
training phase:
performing feature extraction on the X-ray image by using a backbone network to obtain a first feature map, wherein the convolution layers of the backbone network perform feature extraction by using a channel attention module and a spatial attention module;
performing feature coding on the first feature map and generating, by using a neck network, a second feature map with richer scale information;
the second feature map is respectively input into a target detection head and a semantic segmentation head to obtain a target prediction result and a semantic segmentation prediction result;
performing first loss degree calculation on the target prediction result and the target detection label to optimize a target detection head; performing second loss degree calculation on the semantic segmentation prediction result and the semantic segmentation label to optimize a semantic segmentation head;
an inference stage:
inputting an X-ray image to be identified into the optimized semantic segmentation head to obtain a semantic segmentation result, wherein the semantic segmentation result comprises target areas of different types of targets; inputting an X-ray image to be identified into an optimized target detection head to obtain a target detection result, wherein the target detection result comprises classification information and position information of a target;
and performing intersection-over-union calculation on the semantic segmentation result and the target detection result, and retaining a region whose overlap is larger than a preset value as the final target detection region for output.
Preferably, the feature extraction by the convolution layer of the backbone network using the channel attention module includes:
carrying out average pooling operation on the features of the first feature map to obtain an average channel descriptor; performing maximum pooling operation on the features of the first feature map to obtain a maximum channel descriptor;
the average channel descriptor and the maximum channel descriptor are input into a shared network to generate a one-dimensional channel attention map.
Preferably, the channel attention calculation formula satisfies:
M_c(F) = σ(MLP(AvgPool(F)) + MLP(MaxPool(F))) = σ(W_1(W_0(F^c_avg)) + W_1(W_0(F^c_max))),
wherein W_1 and W_0 respectively represent the two convolution layers of the shared network, and σ represents the sigmoid function.
Preferably, the feature extraction performed by the convolution layers of the backbone network using the spatial attention module comprises:
carrying out average pooling operation on the features of the first feature map to obtain an average space descriptor; performing maximum pooling operation on the features of the first feature map to obtain a maximum space descriptor;
the average spatial descriptor and the maximum spatial descriptor are input into a shared network to generate a two-dimensional spatial attention graph.
Preferably, the spatial attention calculation formula satisfies:
M_s(F) = σ(f^{7×7}([AvgPool(F); MaxPool(F)])) = σ(f^{7×7}([F^s_avg; F^s_max])),
wherein f^{7×7} denotes a convolution operation with a 7×7 convolution kernel, and σ represents the sigmoid function.
Preferably, the performing of the first loss degree calculation on the target prediction result and the target detection label to optimize the target detection head includes:
performing position loss calculation by an L_location_loss function and class loss calculation by an L_classes_loss function, where the L_location_loss function satisfies the following formula:
L_location_loss = 1 - (I_0/(A_p + A_g - I_0) - (A_c - (A_p + A_g - I_0))/A_c),
wherein I_0 represents the intersection of the prediction box and the ground-truth label box; A_p + A_g - I_0 represents the union of the prediction box and the ground-truth label box; A_c represents the minimum bounding rectangle of the prediction box and the ground-truth label box;
the L_classes_loss function satisfies the following formula:
L_classes_loss = -(1/N) Σ_n [y_n·log(x_n) + (1 - y_n)·log(1 - x_n)],
wherein x_n represents the prediction result of the target detection head; y_n represents the ground-truth label; N represents the number of samples.
Preferably, the performing the second loss degree calculation on the semantic segmentation prediction result and the semantic segmentation label includes:
performing semantic segmentation loss calculation by an L_seg function, which satisfies the following formula:
L_seg = -α·y·(1 - p)^γ·log(p) - (1 - α)·(1 - y)·p^γ·log(1 - p),
wherein α is used to balance the weights of positive and negative samples; y is the ground truth; p ∈ [0,1] is the predicted probability of the foreground class; the modulating factors (1 - p)^γ and p^γ weight each sample so that hard samples receive higher weights.
Preferably, the performing of the intersection-over-union calculation on the semantic segmentation result and the target detection result, and the retaining of a region whose overlap is larger than a preset value as the final target detection region for output, satisfy the following formula:
I_out = x_n1, if IoU(x_n1, x_n2) > preset value, otherwise x_n1 is discarded,
wherein I_out represents the output result; x_n1 represents the prediction result of the target detection head; x_n2 represents the detection result of the semantic segmentation head.
In a second aspect, the present invention provides a dangerous goods identification device, the device comprising:
the data processing module is used for carrying out target detection labeling on the collected X-ray images to obtain target detection labels; performing semantic segmentation labeling on the X-ray image to obtain a semantic segmentation label;
training module for:
performing feature extraction on the X-ray image by using a backbone network to obtain a first feature map, wherein the convolution layers of the backbone network perform feature extraction by using a channel attention module and a spatial attention module;
performing feature coding on the first feature map and generating, by using a neck network, a second feature map with richer scale information;
the second feature map is respectively input into a target detection head and a semantic segmentation head to obtain a target prediction result and a semantic segmentation prediction result;
performing first loss degree calculation on the target prediction result and the target detection label to optimize a target detection head; performing second loss degree calculation on the semantic segmentation prediction result and the semantic segmentation label to optimize a semantic segmentation head;
an inference module for:
inputting an X-ray image to be identified into the optimized semantic segmentation head to obtain a semantic segmentation result, wherein the semantic segmentation result comprises target areas of different types of targets; inputting an X-ray image to be identified into an optimized target detection head to obtain a target detection result, wherein the target detection result comprises classification information and position information of a target;
and performing intersection-over-union calculation on the semantic segmentation result and the target detection result, and retaining a region whose overlap is larger than a preset value as the final target detection region for output.
In a third aspect, the present invention also provides a storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the dangerous goods identification method described above.
The technical scheme provided by the embodiment of the invention has the beneficial effects that:
the dangerous goods identification method provided by the invention utilizes the advantages of target detection, and fuses rich semantic information to correct the target detection result by utilizing the semantic segmentation result. Meanwhile, a channel attention and a spatial attention mechanism are introduced in the feature extraction to optimize the inter-domain difference problem, so that the problems of false recognition and missing recognition caused by the object detection downsampling and the inter-domain difference of an X-ray machine can be reduced, the detection effect is improved, and the recognition precision is improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of a dangerous goods identification method according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of functional modules of the dangerous goods identification device according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the embodiments of the present invention will be described in further detail with reference to the accompanying drawings.
Referring to fig. 1, a flowchart of the dangerous goods identification method provided by an embodiment of the invention is shown. The dangerous goods identification method is mainly applied to a security inspection system: by processing and identifying the X-ray image, it determines whether the object currently under inspection contains dangerous goods and, if so, the type and position of the dangerous goods.
As shown in fig. 1, the dangerous goods identification method may include the following steps:
s101: in the data processing stage, in the stage, mainly, the collected X-ray image is subjected to target detection labeling to obtain a target detection label, and the X-ray image is subjected to semantic segmentation labeling to obtain a semantic segmentation label, specifically:
the X-ray image can be acquired first, and can be acquired and generated in real time by the X-ray machine or can be stored under a specified directory.
Then, the specific positions of the dangerous goods in the corresponding X-ray image are marked with a MASK according to the category information and position information of the dangerous goods, thereby obtaining the target detection label y1. Meanwhile, the X-ray image is calibrated with different pixel values, so that the categories of the dangerous goods and their specific positions in the corresponding X-ray image are segmented under different pixel values, thereby obtaining the semantic segmentation label y2.
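As an illustration of the pixel-value calibration described above, the following minimal numpy sketch builds a semantic segmentation label y2 in which each dangerous-goods category is encoded by a distinct pixel value. The class names, pixel values, and rectangular regions are hypothetical; the patent does not fix a particular encoding.

```python
import numpy as np

# Hypothetical mapping: each dangerous-goods class gets a distinct pixel value.
CLASS_PIXEL_VALUES = {"background": 0, "knife": 1, "gun": 2}

def make_segmentation_label(shape, regions):
    """Build a semantic segmentation label y2: pixels inside each annotated
    region take the pixel value assigned to that region's class.
    regions: list of (class_name, (row0, row1, col0, col1)) rectangles."""
    y2 = np.zeros(shape, dtype=np.uint8)  # background value everywhere
    for cls, (r0, r1, c0, c1) in regions:
        y2[r0:r1, c0:c1] = CLASS_PIXEL_VALUES[cls]
    return y2
```

For example, `make_segmentation_label((8, 8), [("knife", (2, 5, 2, 6))])` yields a mask whose knife region holds pixel value 1 and whose background holds 0.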
After that, the X-ray image and the corresponding target detection label y1 and semantic segmentation label y2 may be further subjected to enhancement processing, which may include cropping, affine transformation, rotation transformation, and the like; once the processing is complete, the processed X-ray image I, target detection label y1′, and semantic segmentation label y2′ are obtained.
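The enhancement processing above must transform the image and both labels consistently. The following numpy sketch illustrates this with a horizontal flip standing in for the cropping/affine/rotation transforms; the (x1, y1, x2, y2) box convention is an assumption.

```python
import numpy as np

def hflip_sample(image, boxes, mask):
    """Horizontally flip an (H, W) image, its segmentation mask, and its
    detection boxes (x1, y1, x2, y2) together, keeping labels consistent."""
    H, W = image.shape[:2]
    flipped_img = image[:, ::-1].copy()
    flipped_mask = mask[:, ::-1].copy()
    # A box's x-extent [x1, x2] maps to [W - x2, W - x1] under the flip.
    flipped_boxes = [(W - x2, y1, W - x1, y2) for (x1, y1, x2, y2) in boxes]
    return flipped_img, flipped_boxes, flipped_mask
```

Applying the same geometric transform to I, y1, and y2 in one call is what keeps y1′ and y2′ aligned with the processed image I.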
S102: in the training stage, the YOLOV5 network is mainly used for extracting and training the characteristics of the X-ray image I, and specifically:
(1) The method comprises the steps of performing feature extraction on an X-ray image I by using a backbone network to obtain a first feature map I 0 The convolution layer of the backbone network performs feature extraction by using a channel attention module and a space attention module.
In the invention, because of the problem that the inter-domain offset is easily caused by the internal factors of the X-ray machine such as imaging principle, hardware parameters, machine aging and the like, a channel attention module and a space attention module are introduced in a convolution layer in a feature extraction network for reducing the inter-domain offset so as to optimize feature extraction.
In the channel attention module of the present embodiment, the feature extraction by the convolution layer of the backbone network using the channel attention module may include:
(1) An average pooling operation is performed on the features of the first feature map to obtain an average channel descriptor, denoted F_avg; a maximum pooling operation is performed on the features of the first feature map to obtain a maximum channel descriptor, denoted F_max.
(2) The average channel descriptor and the maximum channel descriptor are input into a shared network to generate a one-dimensional channel attention map M_c ∈ R^{C×1×1}.
Here, the shared network consists of a multi-layer perceptron, and the channel attention calculation formula satisfies:
M_c(F) = σ(MLP(AvgPool(F)) + MLP(MaxPool(F))) = σ(W_1(W_0(F^c_avg)) + W_1(W_0(F^c_max))),
wherein W_1 and W_0 respectively represent the two convolution layers of the shared network, and σ represents the sigmoid function.
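A minimal numpy sketch of the channel attention formula above, assuming the shared network is a two-layer MLP with a ReLU hidden layer (the reduction ratio and hidden activation are assumptions not fixed by the text):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(F, W0, W1):
    """Channel attention: a shared MLP (W0, W1) is applied to the average- and
    max-pooled channel descriptors, the results are summed and passed through
    a sigmoid. F: (C, H, W); W0: (C // r, C); W1: (C, C // r)."""
    f_avg = F.mean(axis=(1, 2))                   # F_avg^c, shape (C,)
    f_max = F.max(axis=(1, 2))                    # F_max^c, shape (C,)
    mlp = lambda d: W1 @ np.maximum(W0 @ d, 0.0)  # shared MLP, ReLU hidden layer
    Mc = sigmoid(mlp(f_avg) + mlp(f_max))         # 1-D channel attention map, (C,)
    return Mc[:, None, None] * F                  # channel-wise reweighting
```

Because M_c lies in (0, 1), the module rescales each channel rather than replacing it.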
In the spatial attention module of the present embodiment, the feature extraction by the convolution layer of the backbone network using the spatial attention module may include:
(1) carrying out average pooling operation on the features of the first feature map to obtain an average space descriptor; performing maximum pooling operation on the features of the first feature map to obtain a maximum space descriptor;
(2) inputting the average spatial descriptor and the maximum spatial descriptor into a shared network to generate a two-dimensional spatial attention graph.
Here, the spatial attention calculation formula satisfies:
M_s(F) = σ(f^{7×7}([AvgPool(F); MaxPool(F)])) = σ(f^{7×7}([F^s_avg; F^s_max])),
wherein f^{7×7} denotes a convolution operation with a 7×7 convolution kernel.
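A corresponding numpy sketch of the spatial attention formula above, using a naive 7×7 'same'-padding convolution over the stacked average- and max-pooled spatial descriptors (the zero padding is an assumption):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def spatial_attention(F, kernel):
    """Spatial attention: channel-wise average and max pooling are stacked
    into a 2-channel descriptor and convolved with a k x k kernel, then a
    sigmoid is applied. F: (C, H, W); kernel: (2, k, k), k = 7 in the text."""
    desc = np.stack([F.mean(axis=0), F.max(axis=0)])  # [F_avg^s; F_max^s], (2, H, W)
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(desc, ((0, 0), (pad, pad), (pad, pad)))
    H, W = F.shape[1], F.shape[2]
    Ms = np.empty((H, W))
    for i in range(H):                                # naive k x k convolution
        for j in range(W):
            Ms[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return sigmoid(Ms) * F                            # spatial reweighting
```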
In this embodiment, features are preferably extracted with the channel attention module first, and then with the spatial attention module.
(2) Feature coding is performed on the first feature map, and the neck network is used to generate a second feature map with richer scale information. That is, a feature encoder first encodes the first feature map I_0 obtained through the channel attention and spatial attention feature extraction, and the neck network then processes the encoded first feature map I_0 to obtain a second feature map I_1 with richer scale information.
(3) The second feature map is input into a target detection head (head_objdet) and a semantic segmentation head (head_segment) respectively to obtain a target prediction result, denoted x_n1, and a semantic segmentation prediction result, denoted x_n2.
(4) Performing first loss degree calculation on the target prediction result and the target detection label to optimize a target detection head; and carrying out second loss degree calculation on the semantic segmentation prediction result and the semantic segmentation label so as to optimize a semantic segmentation head.
In this step, the calculating the first loss degree between the target prediction result and the target detection tag to optimize the target detection head includes:
The position loss is calculated by an L_location_loss function, and the class loss is calculated by an L_classes_loss function, where the L_location_loss function satisfies the following formula:
L_location_loss = 1 - (I_0/(A_p + A_g - I_0) - (A_c - (A_p + A_g - I_0))/A_c),
wherein I_0 represents the intersection of the prediction box and the ground-truth label box; A_p + A_g - I_0 represents the union of the prediction box and the ground-truth label box; A_c represents the minimum bounding rectangle of the prediction box and the ground-truth label box;
The L_classes_loss function satisfies the following formula:
L_classes_loss = -(1/N) Σ_n [y_n·log(x_n) + (1 - y_n)·log(1 - x_n)],
wherein x_n represents the prediction result of the target detection head; y_n represents the ground-truth label; N represents the number of samples.
In addition, an L_objectness_loss function may be used to calculate the objectness (confidence) loss; it takes the same form as the class loss:
L_objectness_loss = L_classes_loss,
while the total detection loss over all samples satisfies:
L_obj = L_classes_loss + L_objectness_loss + L_location_loss.
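The detection losses above can be sketched as follows. The GIoU-style position loss follows the formula given; the binary cross-entropy form of the class and objectness losses is an assumption inferred from the symbol definitions (x_n, y_n, N):

```python
import numpy as np

def location_loss(pred, gt):
    """L_location = 1 - (I0/(Ap+Ag-I0) - (Ac-(Ap+Ag-I0))/Ac).
    Boxes are (x1, y1, x2, y2)."""
    area = lambda b: max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])
    ix1, iy1 = max(pred[0], gt[0]), max(pred[1], gt[1])
    ix2, iy2 = min(pred[2], gt[2]), min(pred[3], gt[3])
    I0 = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)      # intersection
    union = area(pred) + area(gt) - I0                  # A_p + A_g - I_0
    Ac = ((max(pred[2], gt[2]) - min(pred[0], gt[0]))   # minimum bounding rectangle
          * (max(pred[3], gt[3]) - min(pred[1], gt[1])))
    return 1.0 - (I0 / union - (Ac - union) / Ac)

def bce_loss(x, y):
    """Assumed binary cross-entropy form of L_classes_loss / L_objectness_loss."""
    x = np.clip(x, 1e-7, 1 - 1e-7)
    return float(-np.mean(y * np.log(x) + (1 - y) * np.log(1 - x)))

def obj_loss(pred_box, gt_box, cls_pred, cls_gt, obj_pred, obj_gt):
    """L_obj = L_classes_loss + L_objectness_loss + L_location_loss."""
    return (bce_loss(cls_pred, cls_gt) + bce_loss(obj_pred, obj_gt)
            + location_loss(pred_box, gt_box))
```

For a prediction box identical to the ground-truth box, the position loss is 0, since IoU = 1 and the enclosing rectangle equals the union.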
In this step, the performing of the second loss degree calculation on the semantic segmentation prediction result and the semantic segmentation label includes:
performing semantic segmentation loss calculation by an L_seg function, which satisfies the following formula:
L_seg = -α·y·(1 - p)^γ·log(p) - (1 - α)·(1 - y)·p^γ·log(1 - p),
wherein α is used to balance the weights of positive and negative samples; y is the ground truth; p ∈ [0,1] is the predicted probability of the foreground class; the modulating factors (1 - p)^γ and p^γ weight each sample so that hard samples receive higher weights.
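A sketch of the semantic segmentation loss consistent with the description above (an α-balanced focal loss with modulating factors (1 - p)^γ and p^γ); the exact per-pixel formula is an assumption:

```python
import numpy as np

def focal_seg_loss(p, y, alpha=0.25, gamma=2.0):
    """Per-pixel focal loss: -alpha*y*(1-p)**gamma*log(p)
    - (1-alpha)*(1-y)*p**gamma*log(1-p), averaged over pixels.
    Misclassified (hard) pixels get larger modulation factors, hence
    higher weight; alpha balances positive vs negative samples."""
    p = np.clip(p, 1e-7, 1 - 1e-7)
    pos = -alpha * y * (1 - p) ** gamma * np.log(p)
    neg = -(1 - alpha) * (1 - y) * p ** gamma * np.log(1 - p)
    return float(np.mean(pos + neg))
```

An easy foreground pixel (p close to 1) contributes far less than a hard one (p close to 0), which is exactly the down-weighting of well-classified samples the text describes.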
The total loss calculation satisfies the following formula:
L = L_location_loss + L_objectness_loss + L_classes_loss + L_seg.
s103: in the reasoning stage, in this stage, the X-ray image to be identified may be input into a training-completed network, specifically:
(1) Inputting the X-ray image to be identified into the optimized semantic segmentation head to obtain a semantic segmentation result, wherein the semantic segmentation result comprises target areas of different types of targets; inputting the X-ray image to be identified into the optimized target detection head to obtain a target detection result, wherein the target detection result comprises classification information and position information of the target.
(2) The intersection over union of the semantic segmentation result and the target detection result is calculated, and a region whose overlap is larger than a preset value is retained as the final target detection region and output, satisfying the following formula:
I_out = x_n1, if IoU(x_n1, x_n2) > preset value, otherwise x_n1 is discarded,
wherein I_out represents the output result; x_n1 represents the prediction result of the target detection head; x_n2 represents the detection result of the semantic segmentation head. Here, the preset value is 0.5, i.e., regions whose overlap is larger than 0.5 are retained as the final target detection regions and output.
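The inference-stage filtering above can be sketched as follows, keeping a detection box only when its IoU with some semantic segmentation region exceeds the preset value of 0.5 (the rectangular representation of segmentation regions is a simplifying assumption):

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

def filter_detections(det_boxes, seg_regions, thresh=0.5):
    """Keep a detection result x_n1 only if it overlaps some semantic
    segmentation region x_n2 with IoU greater than the preset value."""
    return [d for d in det_boxes
            if any(iou(d, s) > thresh for s in seg_regions)]
```

This is the semantic-segmentation correction of the target detection result: boxes unsupported by the segmentation head are discarded.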
By comparison, the existing manual screening method suffers from long training times, high labor costs, and missed and false recognition of dangerous goods due to fatigue; existing deep-learning-based target detection and recognition methods typically require multiple downsampling steps during feature extraction and are prone to missed and false recognition because of inter-domain differences caused by internal factors of X-ray imaging. The dangerous goods identification method provided by the invention exploits the strengths of target detection and fuses rich semantic information by using the semantic segmentation result to correct the target detection result. Meanwhile, channel attention and spatial attention mechanisms are introduced during feature extraction to mitigate the inter-domain difference problem, which reduces the false and missed recognition caused by downsampling in target detection and by inter-domain differences between X-ray machines, thereby improving the detection effect and the recognition accuracy.
Referring to fig. 2, a functional module schematic diagram of the dangerous goods identification device according to an embodiment of the present invention is shown. The dangerous goods identification device 100 of the present embodiment may include a data processing module 11, a training module 12, and an inference module 13, wherein:
the data processing module 11 is used for carrying out target detection labeling on the collected X-ray images to obtain target detection labels; performing semantic segmentation labeling on the X-ray image to obtain a semantic segmentation label;
the training module 12 is configured to:
performing feature extraction on the X-ray image by using a backbone network to obtain a first feature map, wherein the convolution layers of the backbone network perform feature extraction by using a channel attention module and a spatial attention module;
performing feature coding on the first feature map and generating, by using a neck network, a second feature map with richer scale information;
the second feature map is respectively input into a target detection head and a semantic segmentation head to obtain a target prediction result and a semantic segmentation prediction result;
performing first loss degree calculation on the target prediction result and the target detection label to optimize a target detection head; performing second loss degree calculation on the semantic segmentation prediction result and the semantic segmentation label to optimize a semantic segmentation head;
the reasoning module 13 is configured to:
inputting an X-ray image to be identified into the optimized semantic segmentation head to obtain a semantic segmentation result, wherein the semantic segmentation result comprises target areas of different types of targets; inputting an X-ray image to be identified into an optimized target detection head to obtain a target detection result, wherein the target detection result comprises classification information and position information of a target;
and performing intersection-over-union calculation on the semantic segmentation result and the target detection result, and retaining a region whose overlap is larger than a preset value as the final target detection region for output.
It should be understood that after the corresponding modules execute the corresponding functions, the effect achieved may be the same as the aforementioned dangerous goods identification method, so that the description thereof is omitted herein.
The invention provides a computer device, which comprises a processor, wherein the processor is used for realizing the steps in the dangerous goods identification method when executing a computer program stored in a memory.
The processor may be a central processing unit (Central Processing Unit, CPU), but may also be another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor; it is the control center of the computer device, connecting the various parts of the overall computer device using various interfaces and lines.
The memory may be used to store the computer program and/or modules, and the processor implements the various functions of the computer device by running or executing the computer program and/or modules stored in the memory and invoking the data stored in the memory. The memory may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, application programs required for at least one function, and the like, and the data storage area may store data created according to the use of the device, etc. In addition, the memory may include high-speed random access memory, and may also include non-volatile memory, such as a hard disk, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a Flash Card, at least one magnetic disk storage device, a flash memory device, or another solid-state storage device.
In addition, the invention also provides a storage medium, on which a computer program is stored, which when being executed by a processor, realizes the steps in the dangerous goods identification method.
The foregoing description of the preferred embodiments of the invention is not intended to limit the invention to the precise form disclosed; any modifications, equivalent substitutions, and improvements made within the spirit and principles of the invention are intended to be included within its scope.
Claims (10)
1. A dangerous goods identification method applied to a security inspection system, characterized by comprising the following steps:
a data processing stage:
performing target detection labeling on the collected X-ray image to obtain a target detection label; performing semantic segmentation labeling on the X-ray image to obtain a semantic segmentation label;
training phase:
performing feature extraction on the X-ray image by using a backbone network to obtain a first feature map, wherein a convolution layer of the backbone network performs feature extraction by using a channel attention module and a spatial attention module;
performing feature encoding on the first feature map by using a neck network to generate a second feature map with richer scale information;
the second feature map is respectively input into a target detection head and a semantic segmentation head to obtain a target prediction result and a semantic segmentation prediction result;
performing first loss degree calculation on the target prediction result and the target detection label to optimize a target detection head; performing second loss degree calculation on the semantic segmentation prediction result and the semantic segmentation label to optimize a semantic segmentation head;
reasoning:
inputting an X-ray image to be identified into the optimized semantic segmentation head to obtain a semantic segmentation result, wherein the semantic segmentation result comprises target areas of different types of targets; inputting an X-ray image to be identified into an optimized target detection head to obtain a target detection result, wherein the target detection result comprises classification information and position information of a target;
and performing intersection-over-union calculation on the semantic segmentation result and the target detection result, and retaining a region whose overlap is larger than a preset value as the final target detection region for output.
2. The method of claim 1, wherein the feature extraction by the convolution layer of the backbone network using a channel attention module comprises:
carrying out average pooling operation on the features of the first feature map to obtain an average channel descriptor; performing maximum pooling operation on the features of the first feature map to obtain a maximum channel descriptor;
the average channel descriptor and the maximum channel descriptor are input into a shared network to generate a one-dimensional channel attention map.
3. The dangerous goods recognition method according to claim 2, wherein the channel attention calculation formula satisfies:
M_c(F) = σ(MLP(AvgPool(F)) + MLP(MaxPool(F)))
= σ(W_1(W_0(F^c_avg)) + W_1(W_0(F^c_max))),
wherein W_1 and W_0 represent the two convolution layers of the shared network, respectively, and σ represents the sigmoid function.
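The channel attention of claims 2 and 3 can be sketched in plain Python as follows. This is a minimal illustration rather than the patented implementation: the shared network is modeled as a two-layer MLP with a ReLU between the layers (a common CBAM-style choice the claim does not spell out), and feature maps are plain nested lists.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_attention(feat, w0, w1):
    """M_c(F) = sigmoid(MLP(AvgPool(F)) + MLP(MaxPool(F))).

    feat: list of C channels, each an H x W list of lists
    w0:   first shared-MLP layer, a (C//r) x C matrix (r = reduction ratio)
    w1:   second shared-MLP layer, a C x (C//r) matrix
    Returns one attention weight in (0, 1) per channel.
    """
    # Average and maximum channel descriptors (global pooling over H x W).
    avg_desc = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feat]
    max_desc = [max(max(row) for row in ch) for ch in feat]

    def mlp(d):
        # Shared two-layer network W1(ReLU(W0(d))) applied to both descriptors.
        hidden = [max(0.0, sum(w * x for w, x in zip(row, d))) for row in w0]
        return [sum(w * h for w, h in zip(row, hidden)) for row in w1]

    return [sigmoid(a + m) for a, m in zip(mlp(avg_desc), mlp(max_desc))]

# Toy example: 2 channels, 2x2 spatial grid, reduction ratio r = 2.
feat = [[[1.0, 2.0], [3.0, 4.0]],
        [[0.0, 0.0], [0.0, 8.0]]]
w0 = [[0.5, 0.5]]        # 1 x 2
w1 = [[1.0], [-1.0]]     # 2 x 1
attn = channel_attention(feat, w0, w1)
print(attn)  # one weight in (0, 1) per channel
```

In a real network the resulting vector would be broadcast-multiplied over the feature map before the spatial attention step.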
4. The method of claim 1, wherein the feature extraction by the convolution layer of the backbone network using a spatial attention module comprises:
carrying out average pooling operation on the features of the first feature map to obtain an average space descriptor; performing maximum pooling operation on the features of the first feature map to obtain a maximum space descriptor;
the average spatial descriptor and the maximum spatial descriptor are input into a shared network to generate a two-dimensional spatial attention graph.
5. The method for identifying dangerous goods according to claim 4, wherein the spatial attention calculation formula satisfies:
M_s(F) = σ(f^(7×7)([AvgPool(F); MaxPool(F)]))
= σ(f^(7×7)([F^s_avg; F^s_max])),
wherein f^(7×7) represents a convolution operation with a 7×7 convolution kernel, and σ represents the sigmoid function.
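Likewise, the spatial attention of claims 4 and 5 can be sketched as below. The two pooled maps (average and maximum over channels) are stacked and convolved into a single-channel attention map; a 3×3 kernel with zero padding is used here to keep the toy example small, whereas claim 5 specifies a 7×7 kernel.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def spatial_attention(feat, kernel, k=3):
    """M_s(F) = sigmoid(f([AvgPool(F); MaxPool(F)])) at each spatial position.

    feat:   list of C channels, each an H x W list of lists
    kernel: k x k grid of [w_avg, w_max] weight pairs for the two pooled maps
    Returns an H x W attention map with values in (0, 1).
    """
    C, H, W = len(feat), len(feat[0]), len(feat[0][0])
    # Average and maximum spatial descriptors (pooling across the channel axis).
    avg_map = [[sum(feat[c][i][j] for c in range(C)) / C for j in range(W)] for i in range(H)]
    max_map = [[max(feat[c][i][j] for c in range(C)) for j in range(W)] for i in range(H)]
    pad = k // 2
    out = [[0.0] * W for _ in range(H)]
    for i in range(H):
        for j in range(W):
            s = 0.0
            for di in range(-pad, pad + 1):
                for dj in range(-pad, pad + 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < H and 0 <= jj < W:  # zero padding at the border
                        s += kernel[di + pad][dj + pad][0] * avg_map[ii][jj]
                        s += kernel[di + pad][dj + pad][1] * max_map[ii][jj]
            out[i][j] = sigmoid(s)
    return out

feat = [[[1.0, 0.0], [0.0, 1.0]],
        [[0.0, 2.0], [2.0, 0.0]]]
kernel = [[[0.1, 0.1]] * 3 for _ in range(3)]  # uniform 3x3 kernel over both maps
attn = spatial_attention(feat, kernel)
print(len(attn), len(attn[0]))
```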
6. The method of any one of claims 1 to 5, wherein the performing a first loss degree calculation on the target prediction result and the target detection tag to optimize a target detection head includes:
performing position loss calculation by the L_locationloss function and class loss calculation by the L_classesloss function, wherein the L_locationloss function satisfies the following formula:
L_locationloss = 1 − (I_0/(A_p + A_g − I_0) − (A_c − (A_p + A_g − I_0))/A_c),
wherein I_0 represents the intersection of the prediction box and the real label box; A_p + A_g − I_0 represents their union (the sum of the areas of the prediction box and the real label box minus their intersection); and A_c represents the minimum bounding rectangle of the prediction box and the real label box;
the L_classesloss function satisfies the following formula:
L_classesloss = −(1/N) Σ_{n=1..N} [y_n·log(x_n) + (1 − y_n)·log(1 − x_n)],
wherein x_n represents the prediction result of the target detection head; y_n represents the real label; N represents the sample size.
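A minimal sketch of the two losses in claim 6, assuming axis-aligned boxes in (x1, y1, x2, y2) form: the location loss follows the GIoU-style formula above, and the class loss is the binary cross-entropy implied by the variables x_n, y_n, and N.

```python
import math

def location_loss(pred, gt):
    """GIoU-style location loss: L = 1 - (IoU - (A_c - union) / A_c).

    pred, gt: axis-aligned boxes (x1, y1, x2, y2); A_c is the smallest
    rectangle enclosing both boxes.
    """
    ix1, iy1 = max(pred[0], gt[0]), max(pred[1], gt[1])
    ix2, iy2 = min(pred[2], gt[2]), min(pred[3], gt[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)      # I_0
    area = lambda b: (b[2] - b[0]) * (b[3] - b[1])
    union = area(pred) + area(gt) - inter                  # A_p + A_g - I_0
    cx1, cy1 = min(pred[0], gt[0]), min(pred[1], gt[1])
    cx2, cy2 = max(pred[2], gt[2]), max(pred[3], gt[3])
    a_c = (cx2 - cx1) * (cy2 - cy1)                        # A_c
    return 1.0 - (inter / union - (a_c - union) / a_c)

def classes_loss(preds, labels):
    """Binary cross-entropy over N samples (x_n: predicted prob., y_n: 0/1 label)."""
    n = len(preds)
    return -sum(y * math.log(x) + (1 - y) * math.log(1 - x)
                for x, y in zip(preds, labels)) / n

# Identical boxes give zero location loss; confident correct predictions
# give a lower class loss than uncertain ones.
print(location_loss((0, 0, 2, 2), (0, 0, 2, 2)))   # 0.0
print(classes_loss([0.9, 0.1], [1, 0]))
```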
7. The method of any one of claims 1 to 5, wherein the performing a second loss degree calculation on the semantic segmentation prediction result and the semantic segmentation label includes:
performing semantic segmentation loss calculation by the L_seg function, which satisfies the following formula:
L_seg = −α·(1 − p)^r·log(p) when y = 1; L_seg = −(1 − α)·p^r·log(1 − p) when y = 0,
wherein α balances the weights of positive and negative samples, y is the ground truth, p ∈ [0, 1] is the predicted probability of the foreground class, and the modulation factors (1 − p)^r and p^r adjust the weight of each sample so that hard samples receive higher weights.
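The per-sample segmentation loss of claim 7 can be sketched as a focal-style loss; the default values alpha = 0.25 and r = 2 below are illustrative assumptions, not values fixed by the claim.

```python
import math

def focal_loss(p, y, alpha=0.25, r=2.0):
    """Focal-style segmentation loss for one pixel.

    p: predicted foreground probability in (0, 1); y: ground-truth label (0 or 1).
    alpha balances positive/negative samples; the factors (1 - p)^r and p^r
    down-weight easy samples so hard samples contribute more to the loss.
    """
    if y == 1:
        return -alpha * (1.0 - p) ** r * math.log(p)
    return -(1.0 - alpha) * p ** r * math.log(1.0 - p)

# A confidently correct pixel contributes far less than an uncertain one.
easy = focal_loss(0.95, 1)
hard = focal_loss(0.30, 1)
print(easy < hard)  # True
```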
8. The dangerous goods identification method according to any one of claims 1 to 5, wherein the intersection-over-union calculation is performed on the semantic segmentation result and the target detection result, and a region whose overlap is larger than a preset value is retained as the final target detection region and output, satisfying the following formula:
I_out = (x_n1 ∩ x_n2)/(x_n1 ∪ x_n2),
wherein I_out represents the output result; x_n1 represents the prediction result of the target detection head; x_n2 represents the detection result of the semantic segmentation head.
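The claim-8 filtering step can be sketched as follows, assuming the segmentation regions have been reduced to bounding boxes so that the overlap reduces to a box IoU; the 0.5 threshold is an illustrative preset value.

```python
def box_iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda box: (box[2] - box[0]) * (box[3] - box[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def filter_detections(det_boxes, seg_boxes, threshold=0.5):
    """Keep detection-head boxes whose overlap with any segmentation region
    exceeds the preset threshold; the rest are rejected as likely false alarms."""
    return [d for d in det_boxes
            if any(box_iou(d, s) > threshold for s in seg_boxes)]

det = [(0, 0, 10, 10), (50, 50, 60, 60)]  # detection-head results
seg = [(1, 1, 10, 10)]                    # boxes of segmented target regions
print(filter_detections(det, seg))  # [(0, 0, 10, 10)]
```

Requiring the two heads to agree in this way is what suppresses single-head false positives in cluttered X-ray imagery.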
9. A dangerous goods identification device, characterized in that the device comprises:
the data processing module is used for carrying out target detection labeling on the collected X-ray images so as to obtain target detection labels; performing semantic segmentation labeling on the X-ray image to obtain a semantic segmentation label;
a training module, the training module being configured to:
perform feature extraction on the X-ray image by using a backbone network to obtain a first feature map, wherein a convolution layer of the backbone network performs feature extraction by using a channel attention module and a spatial attention module;
perform feature encoding on the first feature map by using a neck network to generate a second feature map with richer scale information;
the second feature map is respectively input into a target detection head and a semantic segmentation head to obtain a target prediction result and a semantic segmentation prediction result;
performing first loss degree calculation on the target prediction result and the target detection label to optimize a target detection head; performing second loss degree calculation on the semantic segmentation prediction result and the semantic segmentation label to optimize a semantic segmentation head;
an inference module, the inference module being configured to:
inputting an X-ray image to be identified into the optimized semantic segmentation head to obtain a semantic segmentation result, wherein the semantic segmentation result comprises target areas of different types of targets; inputting an X-ray image to be identified into an optimized target detection head to obtain a target detection result, wherein the target detection result comprises classification information and position information of a target;
and perform intersection-over-union calculation on the semantic segmentation result and the target detection result, and retain a region whose overlap is larger than a preset value as the final target detection region for output.
10. A storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the dangerous goods identification method as claimed in any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211278338.4A CN117132875A (en) | 2022-10-19 | 2022-10-19 | Dangerous goods identification method, dangerous goods identification device and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117132875A | 2023-11-28 |
Family
ID=88855216
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211278338.4A Pending CN117132875A (en) | 2022-10-19 | 2022-10-19 | Dangerous goods identification method, dangerous goods identification device and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117132875A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||