CN117132875A - Dangerous goods identification method, dangerous goods identification device and storage medium - Google Patents


Info

Publication number
CN117132875A
CN117132875A (application CN202211278338.4A)
Authority
CN
China
Prior art keywords
target detection
semantic segmentation
result
head
target
Prior art date
Legal status
Pending
Application number
CN202211278338.4A
Other languages
Chinese (zh)
Inventor
苗应亮
付雪平
夏炉系
张浒
Current Assignee
Maxvision Technology Corp
Original Assignee
Maxvision Technology Corp
Priority date
Filing date
Publication date
Application filed by Maxvision Technology Corp filed Critical Maxvision Technology Corp
Priority to CN202211278338.4A
Publication of CN117132875A
Legal status: Pending


Classifications

    • G06V20/00 Scenes; scene-specific elements
    • G06N3/04 Neural networks: architecture, e.g. interconnection topology
    • G06N3/08 Neural networks: learning methods
    • G06N5/04 Inference or reasoning models
    • G06V10/26 Segmentation of patterns in the image field; cutting or merging of image elements to establish the pattern region; detection of occlusion
    • G06V10/40 Extraction of image or video features
    • G06V10/764 Recognition using machine-learning classification, e.g. of video objects
    • G06V10/774 Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
    • G06V10/82 Recognition using neural networks
    • G06V2201/07 Target detection

Abstract

The invention discloses a dangerous goods identification method, a dangerous goods identification device and a storage medium, relating to the field of artificial intelligence. The method comprises a data processing stage, a training stage and an inference stage. It exploits the strengths of target detection while fusing rich semantic information, using the semantic segmentation result to correct the target detection result. In addition, channel attention and spatial attention mechanisms are introduced during feature extraction to mitigate inter-domain differences. This reduces the false and missed recognitions caused by downsampling in target detection and by inter-domain differences between X-ray machines, thereby improving detection performance and recognition accuracy.

Description

Dangerous goods identification method, dangerous goods identification device and storage medium
Technical Field
The invention relates to the field of artificial intelligence, in particular to a dangerous goods identification method, a dangerous goods identification device and a storage medium.
Background
In existing security inspection systems, the X-ray images produced by an X-ray machine are inspected manually. Although this removes the need to open and search luggage, it places high demands on inspectors' professional expertise, and also requires good patience and alertness to dangerous goods.
However, over long working periods an inspector's perception degrades with fatigue, which increases the risk of false and missed detections.
Disclosure of Invention
To address the high risk of false and missed detections of dangerous goods in existing security inspection systems, the invention provides a dangerous goods identification method, a dangerous goods identification device and a storage medium.
The technical solution provided by the invention for the above problem is as follows:
in a first aspect, the invention provides a dangerous goods identification method, which is applied to a security inspection system, and the method comprises the following steps:
and a data processing stage:
performing target detection labeling on the collected X-ray image to obtain a target detection label; performing semantic segmentation labeling on the X-ray image to obtain a semantic segmentation label;
training phase:
performing feature extraction on the X-ray image using a backbone network to obtain a first feature map, wherein the convolution layers of the backbone network use a channel attention module and a spatial attention module during feature extraction;
performing feature encoding on the first feature map and using a neck network to generate a second feature map with richer multi-scale information;
the second feature map is respectively input into a target detection head and a semantic segmentation head to obtain a target prediction result and a semantic segmentation prediction result;
performing first loss degree calculation on the target prediction result and the target detection label to optimize a target detection head; performing second loss degree calculation on the semantic segmentation prediction result and the semantic segmentation label to optimize a semantic segmentation head;
reasoning:
inputting an X-ray image to be identified into the optimized semantic segmentation head to obtain a semantic segmentation result, wherein the semantic segmentation result comprises target areas of different types of targets; inputting an X-ray image to be identified into an optimized target detection head to obtain a target detection result, wherein the target detection result comprises classification information and position information of a target;
and performing an intersection-over-union calculation on the semantic segmentation result and the target detection result, keeping regions whose overlap exceeds a preset value as the final target detection regions and outputting them.
Preferably, the feature extraction by the convolution layer of the backbone network using the channel attention module includes:
carrying out average pooling operation on the features of the first feature map to obtain an average channel descriptor; performing maximum pooling operation on the features of the first feature map to obtain a maximum channel descriptor;
the average channel descriptor and the maximum channel descriptor are input into a shared network to generate a one-dimensional channel attention map.
Preferably, the channel attention calculation formula satisfies:
M_c(F) = σ(MLP(AvgPool(F)) + MLP(MaxPool(F)))
= σ(W_1(W_0(F^c_avg)) + W_1(W_0(F^c_max))),
where W_0 and W_1 denote the two layers of the shared network and σ denotes the sigmoid function.
Preferably, the feature extraction by the convolution layer of the backbone network using the spatial attention module comprises:
performing an average pooling operation along the channel axis of the first feature map to obtain an average spatial descriptor; performing a max pooling operation to obtain a maximum spatial descriptor;
inputting the average spatial descriptor and the maximum spatial descriptor into a shared network to generate a two-dimensional spatial attention map.
Preferably, the spatial attention calculation formula satisfies:
M_s(F) = σ(f^{7×7}([AvgPool(F); MaxPool(F)]))
= σ(f^{7×7}([F^s_avg; F^s_max])),
where f^{7×7} denotes a convolution operation with a 7×7 kernel and σ denotes the sigmoid function.
Preferably, the first loss calculation between the target prediction result and the target detection label to optimize the target detection head comprises:
computing a position loss with the L_location_loss function and a class loss with the L_classes_loss function, where the L_location_loss function satisfies the following formula:
L_location_loss = 1 - (I_0/(A_p + A_g - I_0) - (A_c - (A_p + A_g - I_0))/A_c),
where I_0 denotes the intersection of the prediction box and the ground-truth box; A_p + A_g - I_0 denotes their union (the two box areas summed, minus the intersection); and A_c denotes the minimum enclosing rectangle of the prediction box and the ground-truth box;
the L_classes_loss function satisfies the following binary cross-entropy formula:
L_classes_loss = -(1/N) Σ_n [y_n log(x_n) + (1 - y_n) log(1 - x_n)],
where x_n denotes the prediction of the target detection head; y_n denotes the ground-truth label; and N denotes the number of samples.
Preferably, the second loss calculation on the semantic segmentation prediction result and the semantic segmentation label comprises:
computing a semantic segmentation loss with the L_seg (focal loss) function, which satisfies the following formula:
L_seg = -α·y·(1 - p)^r·log(p) - (1 - α)·(1 - y)·p^r·log(1 - p),
where α balances the weights of positive and negative samples; y is the ground truth; p ∈ [0,1] is the prediction probability; and the factors (1 - p)^r for the foreground class and p^r for the background class modulate each sample's weight so that hard samples receive higher weight.
Preferably, the intersection-over-union calculation between the semantic segmentation result and the target detection result, keeping regions whose overlap exceeds a preset value as the final target detection regions and outputting them, satisfies the following formula:
I_out = x_n1 if IoU(x_n1, x_n2) > the preset value, and the region is discarded otherwise,
where I_out denotes the output result; x_n1 denotes the prediction of the target detection head; and x_n2 denotes the detection result of the semantic segmentation head.
In a second aspect, the present invention provides a hazardous article identification device, the device comprising:
the data processing module is used for carrying out target detection labeling on the collected X-ray images to obtain target detection labels; performing semantic segmentation labeling on the X-ray image to obtain a semantic segmentation label;
training module for:
performing feature extraction on the X-ray image using a backbone network to obtain a first feature map, wherein the convolution layers of the backbone network use a channel attention module and a spatial attention module during feature extraction;
performing feature encoding on the first feature map and using a neck network to generate a second feature map with richer multi-scale information;
the second feature map is respectively input into a target detection head and a semantic segmentation head to obtain a target prediction result and a semantic segmentation prediction result;
performing first loss degree calculation on the target prediction result and the target detection label to optimize a target detection head; performing second loss degree calculation on the semantic segmentation prediction result and the semantic segmentation label to optimize a semantic segmentation head;
an inference module for:
inputting an X-ray image to be identified into the optimized semantic segmentation head to obtain a semantic segmentation result, wherein the semantic segmentation result comprises target areas of different types of targets; inputting an X-ray image to be identified into an optimized target detection head to obtain a target detection result, wherein the target detection result comprises classification information and position information of a target;
and performing an intersection-over-union calculation on the semantic segmentation result and the target detection result, keeping regions whose overlap exceeds a preset value as the final target detection regions and outputting them.
In a third aspect, the present invention also provides a storage medium having stored thereon a computer program which, when executed by a processor, implements the steps in the hazardous article identification method as described above.
The technical scheme provided by the embodiment of the invention has the beneficial effects that:
the dangerous goods identification method provided by the invention utilizes the advantages of target detection, and fuses rich semantic information to correct the target detection result by utilizing the semantic segmentation result. Meanwhile, a channel attention and a spatial attention mechanism are introduced in the feature extraction to optimize the inter-domain difference problem, so that the problems of false recognition and missing recognition caused by the object detection downsampling and the inter-domain difference of an X-ray machine can be reduced, the detection effect is improved, and the recognition precision is improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of a dangerous goods identification method according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of functional modules of the dangerous goods identification device according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the embodiments of the present invention will be described in further detail with reference to the accompanying drawings.
Referring to FIG. 1, a flowchart of the dangerous goods identification method provided by an embodiment of the invention. The method is mainly applied to a security inspection system: by processing and recognizing X-ray images, it determines whether the currently inspected object contains dangerous goods and, if so, their type and position.
As shown in fig. 1, the dangerous goods identification method may include the following steps:
s101: in the data processing stage, in the stage, mainly, the collected X-ray image is subjected to target detection labeling to obtain a target detection label, and the X-ray image is subjected to semantic segmentation labeling to obtain a semantic segmentation label, specifically:
the X-ray image can be acquired first, and can be acquired and generated in real time by the X-ray machine or can be stored under a specified directory.
Then, marking the specific position of the dangerous goods in the corresponding X-ray image by using MASK according to the category information and the position information of the dangerous goods, thereby obtaining a target detection tag y1; meanwhile, different pixel values are used for calibrating the X-ray image, so that the types of dangerous goods and the specific positions of the dangerous goods in the corresponding X-ray image are divided in the X-ray image under the different pixel values, and the semantic segmentation label y2 is obtained.
After that, the X-ray image and the corresponding object detection tag y1 and semantic segmentation tag y2 may be further subjected to enhancement processing, where the enhancement processing may include clipping, affine transformation, rotation transformation, and the like, and the X-ray processed image I, the object detection tag y1 ', and the semantic segmentation tag y 2' are obtained after the processing is completed.
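Whatever augmentations are chosen, they must be applied consistently to the image, the detection label y1 and the segmentation label y2. The following NumPy sketch illustrates this for one augmentation, a horizontal flip; the function name and data layout are illustrative assumptions, not taken from the patent.

```python
import numpy as np

def hflip_sample(image, seg_mask, boxes):
    """Horizontally flip an image together with its segmentation
    label and target-detection boxes, keeping all three consistent.

    image:    (H, W) or (H, W, C) array
    seg_mask: (H, W) array of per-pixel class ids (label y2)
    boxes:    (N, 4) array of [x_min, y_min, x_max, y_max] (label y1)
    """
    h, w = image.shape[:2]
    flipped_img = image[:, ::-1].copy()
    flipped_mask = seg_mask[:, ::-1].copy()
    flipped_boxes = boxes.copy()
    # x coordinates mirror around the image width; x_min/x_max swap roles
    flipped_boxes[:, 0] = w - boxes[:, 2]
    flipped_boxes[:, 2] = w - boxes[:, 0]
    return flipped_img, flipped_mask, flipped_boxes
```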
S102: the training stage. In this stage, a YOLOv5 network is used to extract features from the X-ray images I and to train on them. Specifically:
(1) Feature extraction is performed on the X-ray image I using the backbone network to obtain a first feature map I_0, wherein the convolution layers of the backbone network use a channel attention module and a spatial attention module during feature extraction.
Because internal factors of the X-ray machine, such as the imaging principle, hardware parameters and machine ageing, easily cause inter-domain offsets, the channel attention module and the spatial attention module are introduced into the convolution layers of the feature extraction network to reduce these offsets and optimize feature extraction.
In the channel attention module of this embodiment, the feature extraction by the convolution layers of the backbone network using the channel attention module may include:
(1) performing an average pooling operation on the features of the first feature map to obtain an average channel descriptor, denoted F^c_avg; and performing a max pooling operation to obtain a maximum channel descriptor, denoted F^c_max;
(2) inputting the average channel descriptor and the maximum channel descriptor into a shared network to generate a one-dimensional channel attention map M_c ∈ R^{C×1×1}.
Here, the shared network consists of a multi-layer perceptron, and the channel attention calculation formula satisfies:
M_c(F) = σ(MLP(AvgPool(F)) + MLP(MaxPool(F)))
= σ(W_1(W_0(F^c_avg)) + W_1(W_0(F^c_max))),
where W_0 and W_1 denote the two layers of the shared network and σ denotes the sigmoid function.
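The channel attention formula can be sketched in NumPy as follows. The weight shapes and the ReLU between W_0 and W_1 follow the usual CBAM-style construction and are assumptions rather than details stated in the patent.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w0, w1):
    """Channel attention M_c(F) = sigmoid(MLP(AvgPool(F)) + MLP(MaxPool(F))).

    feat: (C, H, W) feature map
    w0:   (C_r, C) weights of the first shared-MLP layer (channel reduction)
    w1:   (C, C_r) weights of the second shared-MLP layer
    Returns an attention map of shape (C, 1, 1).
    """
    c = feat.shape[0]
    avg_desc = feat.mean(axis=(1, 2))   # F^c_avg, shape (C,)
    max_desc = feat.max(axis=(1, 2))    # F^c_max, shape (C,)

    def mlp(x):
        # shared MLP, ReLU between the two layers (assumed, CBAM-style)
        return w1 @ np.maximum(w0 @ x, 0.0)

    att = sigmoid(mlp(avg_desc) + mlp(max_desc))
    return att.reshape(c, 1, 1)
```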
In the spatial attention module of this embodiment, the feature extraction by the convolution layers of the backbone network using the spatial attention module may include:
(1) performing an average pooling operation along the channel axis of the first feature map to obtain an average spatial descriptor; and performing a max pooling operation to obtain a maximum spatial descriptor;
(2) concatenating the average spatial descriptor and the maximum spatial descriptor and passing them through a shared convolution layer to generate a two-dimensional spatial attention map.
Here, the spatial attention calculation formula satisfies:
M_s(F) = σ(f^{7×7}([AvgPool(F); MaxPool(F)]))
= σ(f^{7×7}([F^s_avg; F^s_max])),
where f^{7×7} denotes a convolution operation with a 7×7 kernel.
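The spatial attention formula can be sketched similarly. The naive convolution loop below is for illustration only; the kernel size is a parameter so a small kernel can stand in for the 7×7 one.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def spatial_attention(feat, kernel):
    """Spatial attention M_s(F) = sigmoid(f([AvgPool_c(F); MaxPool_c(F)])).

    feat:   (C, H, W) feature map
    kernel: (2, k, k) convolution kernel over the 2-channel descriptor
            (k = 7 in the patent; any odd k works in this sketch)
    Returns an attention map of shape (H, W).
    """
    avg_desc = feat.mean(axis=0)   # F^s_avg, shape (H, W)
    max_desc = feat.max(axis=0)    # F^s_max, shape (H, W)
    stacked = np.stack([avg_desc, max_desc])   # (2, H, W)
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(stacked, ((0, 0), (pad, pad), (pad, pad)))
    h, w = avg_desc.shape
    out = np.empty((h, w))
    # naive "same"-padded 2D convolution over the 2-channel descriptor
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return sigmoid(out)
```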
In this embodiment, features are preferably refined with the channel attention module first and with the spatial attention module second.
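This channel-then-spatial ordering can be expressed as a small composition; passing the two attention computations in as callables is an arrangement assumed here for illustration.

```python
import numpy as np

def apply_cbam(feat, channel_att_fn, spatial_att_fn):
    """Refine a (C, H, W) feature map with channel attention first,
    then spatial attention, each applied by element-wise multiplication."""
    f1 = feat * channel_att_fn(feat)   # (C, 1, 1) broadcasts over H, W
    f2 = f1 * spatial_att_fn(f1)       # (H, W) broadcasts over channels
    return f2
```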
(2) Feature encoding is performed on the first feature map, and a neck network is used to generate a second feature map with richer multi-scale information: the first feature map I_0, obtained with channel attention and spatial attention, is first encoded by a feature encoder, and the neck network then processes it to obtain a second feature map I_1 with richer multi-scale information.
(3) The second feature map is input into a target detection head (head_objdet) and a semantic segmentation head (head_segment) respectively to obtain a target prediction result, denoted x_n1, and a semantic segmentation prediction result, denoted x_n2.
(4) Performing first loss degree calculation on the target prediction result and the target detection label to optimize a target detection head; and carrying out second loss degree calculation on the semantic segmentation prediction result and the semantic segmentation label so as to optimize a semantic segmentation head.
In this step, the first loss calculation between the target prediction result and the target detection label to optimize the target detection head comprises:
computing a position loss with the L_location_loss function and a class loss with the L_classes_loss function, where the L_location_loss function satisfies the following formula:
L_location_loss = 1 - (I_0/(A_p + A_g - I_0) - (A_c - (A_p + A_g - I_0))/A_c),
where I_0 denotes the intersection of the prediction box and the ground-truth box; A_p + A_g - I_0 denotes their union (the two box areas summed, minus the intersection); and A_c denotes the minimum enclosing rectangle of the prediction box and the ground-truth box;
the L_classes_loss function satisfies the following binary cross-entropy formula:
L_classes_loss = -(1/N) Σ_n [y_n log(x_n) + (1 - y_n) log(1 - x_n)],
where x_n denotes the prediction of the target detection head; y_n denotes the ground-truth label; and N denotes the number of samples.
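A sketch of the L_location_loss formula (a GIoU-style loss), assuming axis-aligned boxes in [x_min, y_min, x_max, y_max] form:

```python
def giou_location_loss(pred_box, gt_box):
    """Position loss L_location = 1 - (IoU - (A_c - U)/A_c), i.e. 1 - GIoU.

    pred_box, gt_box: [x_min, y_min, x_max, y_max]
    """
    ax1, ay1, ax2, ay2 = pred_box
    bx1, by1, bx2, by2 = gt_box
    a_p = (ax2 - ax1) * (ay2 - ay1)          # prediction-box area
    a_g = (bx2 - bx1) * (by2 - by1)          # ground-truth-box area
    # intersection I_0
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    i0 = iw * ih
    union = a_p + a_g - i0                   # A_p + A_g - I_0
    # A_c: minimum enclosing rectangle of the two boxes
    a_c = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    giou = i0 / union - (a_c - union) / a_c
    return 1.0 - giou
```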
In addition, an L_objectness_loss function may be used to compute the objectness (confidence) loss, which takes the same binary cross-entropy form as the class loss:
L_objectness_loss = L_classes_loss (in form),
so that the total detection loss over all samples satisfies:
L_obj = L_classes_loss + L_objectness_loss + L_location_loss.
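The binary cross-entropy form shared by the class loss and the objectness loss can be sketched as follows (function and variable names are illustrative):

```python
import math

def bce_classes_loss(preds, labels):
    """Class loss L_classes = -(1/N) * sum(y*log(x) + (1-y)*log(1-x));
    the objectness loss takes the same form.

    preds:  predicted probabilities x_n in (0, 1)
    labels: ground-truth labels y_n in {0, 1}
    """
    eps = 1e-7
    n = len(preds)
    total = 0.0
    for x, y in zip(preds, labels):
        x = min(max(x, eps), 1.0 - eps)   # clip to avoid log(0)
        total += y * math.log(x) + (1.0 - y) * math.log(1.0 - x)
    return -total / n
```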
In this step, the second loss calculation on the semantic segmentation prediction result and the semantic segmentation label comprises:
computing a semantic segmentation loss with the L_seg (focal loss) function, which satisfies the following formula:
L_seg = -α·y·(1 - p)^r·log(p) - (1 - α)·(1 - y)·p^r·log(1 - p),
where α balances the weights of positive and negative samples; y is the ground truth; p ∈ [0,1] is the prediction probability; and the factors (1 - p)^r for the foreground class and p^r for the background class modulate each sample's weight so that hard samples receive higher weight.
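A NumPy sketch of this focal segmentation loss; the default values for alpha and the focusing exponent are illustrative, since the patent does not give concrete values.

```python
import numpy as np

def focal_seg_loss(p, y, alpha=0.25, r=2.0):
    """Focal loss for semantic segmentation:
    L_seg = -alpha*y*(1-p)**r*log(p) - (1-alpha)*(1-y)*p**r*log(1-p)

    p: predicted probabilities in (0, 1), any shape
    y: ground-truth labels (0 or 1), same shape
    alpha balances positive/negative samples; the (1-p)**r and p**r
    factors down-weight easy samples so hard ones dominate the loss.
    """
    p = np.clip(p, 1e-7, 1.0 - 1e-7)   # avoid log(0)
    pos = -alpha * y * (1.0 - p) ** r * np.log(p)
    neg = -(1.0 - alpha) * (1.0 - y) * p ** r * np.log(1.0 - p)
    return np.mean(pos + neg)
```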
The total loss calculation satisfies the following formula:
L = L_location_loss + L_objectness_loss + L_classes_loss + L_seg.
S103: the inference stage. In this stage, the X-ray image to be identified is input into the trained network. Specifically:
(1) Inputting the X-ray image to be identified into the optimized semantic segmentation head to obtain a semantic segmentation result, wherein the semantic segmentation result comprises target areas of different types of targets; inputting the X-ray image to be identified into the optimized target detection head to obtain a target detection result, wherein the target detection result comprises classification information and position information of the target.
(2) The intersection-over-union of the semantic segmentation result and the target detection result is calculated, and regions whose overlap exceeds a preset value are kept as the final target detection regions and output, satisfying the following formula:
I_out = x_n1 if IoU(x_n1, x_n2) > the preset value, and the region is discarded otherwise,
where I_out denotes the output result; x_n1 denotes the prediction of the target detection head; and x_n2 denotes the detection result of the semantic segmentation head. Here the preset value is 0.5, i.e. regions whose overlap ratio exceeds 0.5 are kept as the final target detection regions and output.
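The patent does not spell out how the overlap between a detection box and the segmentation result is measured; one plausible sketch, comparing each predicted box against the segmentation region of the same class, is shown below. The data layout and function name are assumptions.

```python
import numpy as np

def fuse_detections(boxes, seg_mask, iou_thresh=0.5):
    """Keep a detection only if it sufficiently overlaps the
    segmentation region of the same class.

    boxes:    list of (class_id, x_min, y_min, x_max, y_max) detections (x_n1)
    seg_mask: (H, W) array of per-pixel class ids, 0 = background (x_n2)
    Overlap is measured as |box ∩ mask| / |box ∪ mask| per class.
    """
    kept = []
    for cls, x1, y1, x2, y2 in boxes:
        box_region = np.zeros_like(seg_mask, dtype=bool)
        box_region[y1:y2, x1:x2] = True
        cls_region = seg_mask == cls
        inter = np.logical_and(box_region, cls_region).sum()
        union = np.logical_or(box_region, cls_region).sum()
        if union > 0 and inter / union > iou_thresh:
            kept.append((cls, x1, y1, x2, y2))
    return kept
```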
Existing approaches have two sets of problems. Manual screening requires long training, incurs high labor cost, and misses or misidentifies dangerous goods when inspectors are fatigued; existing deep-learning target detection methods usually downsample several times during feature extraction and suffer from inter-domain differences caused by internal factors of X-ray imaging, which easily lead to missed and false recognitions. In contrast, the dangerous goods identification method provided by the invention exploits the strengths of target detection while fusing rich semantic information, using the semantic segmentation result to correct the target detection result. In addition, channel attention and spatial attention mechanisms are introduced during feature extraction to mitigate inter-domain differences, reducing the false and missed recognitions caused by downsampling in target detection and by inter-domain differences between X-ray machines, thereby improving detection performance and recognition accuracy.
Referring to fig. 2, a functional module schematic diagram of the dangerous goods identification device according to an embodiment of the present invention is shown. The dangerous goods identification device 100 of the present embodiment may include a data processing module 11, a training module 12, and an inference module 13, wherein:
the data processing module 11 is used for carrying out target detection labeling on the collected X-ray images to obtain target detection labels; performing semantic segmentation labeling on the X-ray image to obtain a semantic segmentation label;
the training module 12 is configured to:
performing feature extraction on the X-ray image using a backbone network to obtain a first feature map, wherein the convolution layers of the backbone network use a channel attention module and a spatial attention module during feature extraction;
performing feature encoding on the first feature map and using a neck network to generate a second feature map with richer multi-scale information;
the second feature map is respectively input into a target detection head and a semantic segmentation head to obtain a target prediction result and a semantic segmentation prediction result;
performing first loss degree calculation on the target prediction result and the target detection label to optimize a target detection head; performing second loss degree calculation on the semantic segmentation prediction result and the semantic segmentation label to optimize a semantic segmentation head;
the reasoning module 13 is configured to:
inputting an X-ray image to be identified into the optimized semantic segmentation head to obtain a semantic segmentation result, wherein the semantic segmentation result comprises target areas of different types of targets; inputting an X-ray image to be identified into an optimized target detection head to obtain a target detection result, wherein the target detection result comprises classification information and position information of a target;
and performing an intersection-over-union calculation on the semantic segmentation result and the target detection result, keeping regions whose overlap exceeds a preset value as the final target detection regions and outputting them.
It should be understood that, after the corresponding modules perform their functions, the effects achieved are the same as those of the dangerous goods identification method described above, and are therefore not repeated here.
The invention provides a computer device, which comprises a processor, wherein the processor is used for realizing the steps in the dangerous goods identification method when executing a computer program stored in a memory.
The processor may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The processor is the control center of the computer device, connecting the various parts of the whole computer device using various interfaces and lines.
The memory may be used to store the computer program and/or modules; the processor implements the various functions of the computer device by running or executing the computer program and/or modules stored in the memory and invoking the data stored in the memory. The memory may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required for at least one function, and the like, and the data storage area may store data created according to the use of the device. In addition, the memory may include high-speed random access memory, and may also include non-volatile memory such as a hard disk, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash card, at least one magnetic disk storage device, a flash memory device, or another solid-state storage device.
In addition, the invention also provides a storage medium, on which a computer program is stored, which when being executed by a processor, realizes the steps in the dangerous goods identification method.
The foregoing describes preferred embodiments of the invention and is not intended to limit the invention to the precise forms disclosed; any modifications, equivalents and alternatives falling within the spirit and scope of the invention are intended to be included within its scope.

Claims (10)

1. A dangerous goods identification method applied to a security inspection system, characterized by comprising the following steps:
a data processing stage:
performing target detection labeling on the collected X-ray image to obtain a target detection label; performing semantic segmentation labeling on the X-ray image to obtain a semantic segmentation label;
training phase:
performing feature extraction on the X-ray image by using a backbone network to obtain a first feature map, wherein a convolution layer of the backbone network performs feature extraction by using a channel attention module and a spatial attention module;
performing feature encoding on the first feature map by using a neck network to generate a second feature map containing richer scale information;
inputting the second feature map into a target detection head and a semantic segmentation head respectively to obtain a target prediction result and a semantic segmentation prediction result;
performing a first loss calculation on the target prediction result and the target detection label to optimize the target detection head; performing a second loss calculation on the semantic segmentation prediction result and the semantic segmentation label to optimize the semantic segmentation head;
an inference stage:
inputting an X-ray image to be identified into the optimized semantic segmentation head to obtain a semantic segmentation result, wherein the semantic segmentation result comprises target regions of different types of targets; inputting the X-ray image to be identified into the optimized target detection head to obtain a target detection result, wherein the target detection result comprises classification information and position information of a target;
performing an intersection-over-union (IoU) calculation on the semantic segmentation result and the target detection result, and keeping a region whose overlap is larger than a preset value as the final target detection region and outputting it.
2. The dangerous goods identification method according to claim 1, wherein performing feature extraction by the convolution layer of the backbone network using a channel attention module comprises:
carrying out average pooling operation on the features of the first feature map to obtain an average channel descriptor; performing maximum pooling operation on the features of the first feature map to obtain a maximum channel descriptor;
the average channel descriptor and the maximum channel descriptor are input into a shared network to generate a one-dimensional channel attention map.
3. The dangerous goods recognition method according to claim 2, wherein the channel attention calculation formula satisfies:
M_c(F) = σ(MLP(AvgPool(F)) + MLP(MaxPool(F)))
= σ(W_1(W_0(F^c_avg)) + W_1(W_0(F^c_max))),
wherein W_0 and W_1 represent the weights of the two layers of the shared MLP, and σ represents the sigmoid function.
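As a concrete illustration of the channel attention of claims 2 and 3, the computation can be sketched in plain Python. This is a minimal toy sketch under stated assumptions, not the patented implementation: the feature map, the reduction ratio, and the shared-MLP weights `w0` and `w1` are invented for the example.

```python
import math

def channel_attention(feat, w0, w1):
    """Channel attention over feat[C][H][W]: a shared two-layer MLP is
    applied to the average-pooled and max-pooled channel descriptors,
    the results are summed, and a sigmoid yields one weight per channel."""
    avg = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feat]
    mx = [max(max(row) for row in ch) for ch in feat]

    def mlp(v):
        # W_0 with ReLU, then W_1 (weights shared between both descriptors).
        hidden = [max(0.0, sum(w * x for w, x in zip(row, v))) for row in w0]
        return [sum(w * h for w, h in zip(row, hidden)) for row in w1]

    return [1.0 / (1.0 + math.exp(-(a + b))) for a, b in zip(mlp(avg), mlp(mx))]

# Toy example: 2 channels of 2x2 features, hidden dimension 1.
feat = [[[1.0, 2.0], [3.0, 4.0]],
        [[0.0, 0.0], [0.0, 8.0]]]
w0 = [[0.1, 0.1]]      # C=2 -> hidden=1
w1 = [[0.5], [0.5]]    # hidden=1 -> C=2
att = channel_attention(feat, w0, w1)
```

The returned one-dimensional map `att` would then be multiplied channel-wise into the feature map.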
4. The dangerous goods identification method according to claim 1, wherein performing feature extraction by the convolution layer of the backbone network using a spatial attention module comprises:
carrying out average pooling operation on the features of the first feature map to obtain an average space descriptor; performing maximum pooling operation on the features of the first feature map to obtain a maximum space descriptor;
the average spatial descriptor and the maximum spatial descriptor are input into a shared network to generate a two-dimensional spatial attention graph.
5. The dangerous goods identification method according to claim 4, wherein the spatial attention calculation formula satisfies:
M_s(F) = σ(f^{7×7}([AvgPool(F); MaxPool(F)]))
= σ(f^{7×7}([F^s_avg; F^s_max])),
wherein f^{7×7} represents a convolution operation with a 7 × 7 kernel, and σ represents the sigmoid function.
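The spatial attention of claims 4 and 5 can likewise be sketched in plain Python. This is a toy illustration, not the patented implementation: the 8×8 feature map and the uniform 7×7 kernel are invented for the example.

```python
import math

def spatial_attention(feat, kernel, k=7):
    """Spatial attention over feat[C][H][W]: per-pixel average and max
    across channels give two HxW descriptors, which are concatenated and
    passed through a kxk convolution (zero padding), then a sigmoid."""
    C, H, W = len(feat), len(feat[0]), len(feat[0][0])
    avg = [[sum(feat[c][i][j] for c in range(C)) / C for j in range(W)] for i in range(H)]
    mx = [[max(feat[c][i][j] for c in range(C)) for j in range(W)] for i in range(H)]
    planes, pad = [avg, mx], k // 2
    out = [[0.0] * W for _ in range(H)]
    for i in range(H):
        for j in range(W):
            s = 0.0
            for p in range(2):          # 2 input planes: avg and max
                for di in range(k):
                    for dj in range(k):
                        ii, jj = i + di - pad, j + dj - pad
                        if 0 <= ii < H and 0 <= jj < W:
                            s += kernel[p][di][dj] * planes[p][ii][jj]
            out[i][j] = 1.0 / (1.0 + math.exp(-s))
    return out

# Toy example: 2 channels of 8x8 features, uniform 7x7 kernel weights.
feat = [[[float((i + j) % 3) for j in range(8)] for i in range(8)] for _ in range(2)]
kernel = [[[0.01] * 7 for _ in range(7)] for _ in range(2)]
att = spatial_attention(feat, kernel)
```

The resulting two-dimensional map `att` would be multiplied element-wise into every channel of the feature map.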
6. The dangerous goods identification method according to any one of claims 1 to 5, wherein the performing a first loss calculation on the target prediction result and the target detection label to optimize the target detection head comprises:
performing a position loss calculation with an L_locationloss function and a class loss calculation with an L_classesloss function, wherein the L_locationloss function satisfies the following formula:
L_locationloss = 1 - (I_0 / (A_p + A_g - I_0) - (A_c - (A_p + A_g - I_0)) / A_c),
wherein I_0 represents the intersection of the prediction box and the ground-truth label box, A_p + A_g - I_0 represents their union (the sum of the two box areas minus the intersection), and A_c represents the area of the minimum enclosing rectangle of the prediction box and the ground-truth label box;
the L_classesloss function satisfies the following formula:
L_classesloss = -(1/N) Σ_{n=1}^{N} [y_n · log(x_n) + (1 - y_n) · log(1 - x_n)],
wherein x_n represents the prediction result of the target detection head, y_n represents the ground-truth label, and N represents the sample size.
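The two losses of claim 6 can be sketched as follows. This is a minimal illustration, assuming the position loss is the GIoU-style formula above with boxes given as (x1, y1, x2, y2), and the class loss is a binary cross-entropy over the symbols x_n, y_n, N; the example boxes and probabilities are invented.

```python
import math

def giou_loss(pred, gt):
    """Position loss: 1 - (IoU - (A_c - union) / A_c) for two boxes."""
    def area(b):
        return max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])
    ix1, iy1 = max(pred[0], gt[0]), max(pred[1], gt[1])
    ix2, iy2 = min(pred[2], gt[2]), min(pred[3], gt[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)   # I_0
    union = area(pred) + area(gt) - inter               # A_p + A_g - I_0
    cx1, cy1 = min(pred[0], gt[0]), min(pred[1], gt[1])
    cx2, cy2 = max(pred[2], gt[2]), max(pred[3], gt[3])
    enclose = (cx2 - cx1) * (cy2 - cy1)                 # A_c
    return 1.0 - (inter / union - (enclose - union) / enclose)

def bce_loss(preds, labels):
    """Class loss: mean binary cross-entropy over N samples."""
    eps = 1e-7
    return -sum(y * math.log(max(x, eps)) + (1 - y) * math.log(max(1 - x, eps))
                for x, y in zip(preds, labels)) / len(preds)

perfect = giou_loss((0, 0, 10, 10), (0, 0, 10, 10))   # identical boxes
```

For identical boxes the IoU term is 1 and the enclosing-box penalty is 0, so the position loss vanishes; for disjoint boxes the penalty term pushes the loss above 1, which is what makes this variant trainable even without overlap.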
7. The dangerous goods identification method according to any one of claims 1 to 5, wherein the performing a second loss calculation on the semantic segmentation prediction result and the semantic segmentation label comprises:
performing the semantic segmentation loss calculation with an L_seg function satisfying the following formula:
L_seg = -α(1 - p)^γ · log(p) when y = 1, and L_seg = -(1 - α)p^γ · log(1 - p) when y = 0,
wherein α is used to balance the weights of positive and negative samples, y is the ground-truth label, p ∈ [0,1] is the predicted probability of the foreground class, and the factors (1 - p)^γ and p^γ modulate the weight of each sample so that hard samples receive higher weights.
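The focal-style segmentation loss of claim 7 can be sketched directly from the formula. This is an illustrative sketch; the default values α = 0.25 and γ = 2 are common choices assumed here, not taken from the patent.

```python
import math

def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Segmentation loss: alpha balances positive/negative samples;
    (1-p)^gamma and p^gamma down-weight easy samples so that hard
    samples dominate the gradient."""
    p = min(max(p, 1e-7), 1.0 - 1e-7)   # clamp for numerical safety
    if y == 1:
        return -alpha * (1.0 - p) ** gamma * math.log(p)
    return -(1.0 - alpha) * p ** gamma * math.log(1.0 - p)

easy = focal_loss(0.95, 1)   # confident, correct foreground prediction
hard = focal_loss(0.10, 1)   # badly missed foreground pixel
```

A confidently correct pixel contributes almost nothing, while a badly missed pixel keeps a large loss, which is the stated goal of the modulation factors.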
8. The dangerous goods identification method according to any one of claims 1 to 5, wherein the performing an intersection-over-union calculation on the semantic segmentation result and the target detection result, and keeping a region whose overlap is larger than a preset value as the final target detection region, satisfies the following formula:
I_out = (x_n1 ∩ x_n2) / (x_n1 ∪ x_n2),
wherein I_out represents the output result, x_n1 represents the prediction result of the target detection head, and x_n2 represents the detection result of the semantic segmentation head.
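The fusion step of claims 1 and 8 can be sketched as an IoU filter between a detection box and a segmentation mask. This is a minimal sketch under stated assumptions: the box is rasterised onto the mask grid, and the threshold of 0.5 and the toy mask are invented for the example.

```python
def fuse(det_box, seg_mask, threshold=0.5):
    """Keep a detection box (x1, y1, x2, y2) only if the IoU between its
    pixels and the segmentation foreground exceeds the threshold."""
    H, W = len(seg_mask), len(seg_mask[0])
    x1, y1, x2, y2 = det_box
    box_px = {(i, j) for i in range(H) for j in range(W)
              if y1 <= i < y2 and x1 <= j < x2}
    seg_px = {(i, j) for i in range(H) for j in range(W) if seg_mask[i][j]}
    union = box_px | seg_px
    iou = len(box_px & seg_px) / len(union) if union else 0.0
    return det_box if iou > threshold else None

# Toy 6x6 mask whose foreground fills rows 1-4, columns 1-4.
mask = [[1 if 1 <= i <= 4 and 1 <= j <= 4 else 0 for j in range(6)]
        for i in range(6)]
kept = fuse((1, 1, 5, 5), mask)       # box matches the mask, so it is kept
dropped = fuse((0, 0, 2, 2), mask)    # poor overlap, so it is discarded
```

Requiring agreement between the two heads in this way is what suppresses boxes that the segmentation branch does not confirm.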
9. A dangerous goods identification device, comprising:
a data processing module, configured to perform target detection labeling on the collected X-ray image to obtain a target detection label, and to perform semantic segmentation labeling on the X-ray image to obtain a semantic segmentation label;
a training module, configured to:
perform feature extraction on the X-ray image by using a backbone network to obtain a first feature map, wherein a convolution layer of the backbone network performs feature extraction by using a channel attention module and a spatial attention module;
perform feature encoding on the first feature map by using a neck network to generate a second feature map containing richer scale information;
input the second feature map into a target detection head and a semantic segmentation head respectively to obtain a target prediction result and a semantic segmentation prediction result;
perform a first loss calculation on the target prediction result and the target detection label to optimize the target detection head; and perform a second loss calculation on the semantic segmentation prediction result and the semantic segmentation label to optimize the semantic segmentation head; and
an inference module, configured to:
input an X-ray image to be identified into the optimized semantic segmentation head to obtain a semantic segmentation result, wherein the semantic segmentation result comprises target regions of different types of targets; input the X-ray image to be identified into the optimized target detection head to obtain a target detection result, wherein the target detection result comprises classification information and position information of a target; and
perform an intersection-over-union (IoU) calculation on the semantic segmentation result and the target detection result, and keep a region whose overlap is larger than a preset value as the final target detection region and output it.
10. A storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method for identifying a hazardous article as claimed in any one of claims 1 to 8.
CN202211278338.4A 2022-10-19 2022-10-19 Dangerous goods identification method, dangerous goods identification device and storage medium Pending CN117132875A (en)


Publications (1)

Publication Number: CN117132875A; Publication Date: 2023-11-28

Family ID: 88855216



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination