CN109740688B - Terahertz image information interpretation method, network and storage medium - Google Patents


Info

Publication number
CN109740688B
CN109740688B
Authority
CN
China
Prior art keywords
resolution
result
super
low
region
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910019673.4A
Other languages
Chinese (zh)
Other versions
CN109740688A (en)
Inventor
Cheng Lianglun (程良伦)
Liang Guangyu (梁广宇)
He Weijian (何伟健)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong University of Technology
Priority to CN201910019673.4A
Publication of CN109740688A
Application granted
Publication of CN109740688B
Status: Active

Landscapes

  • Image Analysis (AREA)
  • Investigating Or Analysing Materials By Optical Means (AREA)

Abstract

The invention discloses a terahertz image information interpretation method, network and storage medium. The method comprises the following steps: acquiring a low-resolution terahertz image; performing super-resolution reconstruction on the acquired low-resolution terahertz image to obtain a super-resolution terahertz image; extracting features from the low-resolution terahertz image using at least one convolutional layer to obtain a low-resolution feature map; extracting features from the super-resolution terahertz image using at least one convolutional layer to obtain a super-resolution feature map; performing feature-level image fusion on the low-resolution feature map and the super-resolution feature map to obtain a feature fusion map; performing region proposal on the feature fusion map to obtain candidate regions; performing region pooling on the candidate regions to obtain a region pooling result; and performing target detection and/or semantic segmentation on the region pooling result to complete the information interpretation of the terahertz image. The invention can complete the information interpretation of terahertz images efficiently and with high quality.

Description

Terahertz image information interpretation method, network and storage medium
Technical Field
The invention relates to the field of image processing, and in particular to a terahertz image information interpretation method, a terahertz image information interpretation network, and a storage medium.
Background
Terahertz technology enables non-contact automatic identification of concealed dangerous objects carried on the person, addressing the technical problem of security detection in public environments. For a long time, poor imaging definition and contrast, unclear edge contours, and speckle noise interference in low-resolution terahertz images have been among the main factors hindering the application of terahertz imaging.
Terahertz images can be identified by combining image processing techniques with the terahertz imaging principle, but when semantic segmentation is performed on a low-resolution image, although the recognition rate is high, many small but important details are lost relative to a high-resolution image.
To obtain the information an image needs to express, including scene and object-category information, a new terahertz image information interpretation method is needed to meet the requirement of high-quality interpretation of low-resolution terahertz images. Such a method would enable functions such as real-time, intelligent detection, giving terahertz security inspection technology a promising application in urban rail transit security.
Disclosure of Invention
The invention aims to provide a terahertz image information interpretation method, a network and a storage medium, which are used for interpreting a low-resolution terahertz image with high quality.
In order to achieve the purpose, the invention adopts the following technical scheme:
a terahertz image information interpretation method comprises the following steps:
performing super-resolution reconstruction on the acquired low-resolution terahertz image to obtain a super-resolution terahertz image;
performing feature extraction on the low-resolution terahertz image by using at least one convolution layer to obtain a low-resolution feature map; performing feature extraction on the super-resolution terahertz image by using at least one convolution layer to obtain a super-resolution feature map;
performing feature level image fusion on the low-resolution feature map and the super-resolution feature map to obtain a feature fusion map;
performing region proposal on the feature fusion map to obtain a candidate region; performing region pooling on the candidate region to obtain a region pooling result;
and carrying out target detection and/or semantic segmentation on the region pooling result to finish information interpretation on the terahertz image.
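The end-to-end flow of the steps above can be sketched at a shape level. This is a minimal illustration, not the patented network: every stage below is a stub — nearest-neighbour upscaling stands in for the recursive super-resolution sub-network, identity for the convolutional feature extractors, and a single whole-image region for the proposal sub-network. All function names are illustrative.

```python
import numpy as np

def super_resolve(lr, scale=2):
    # stand-in for the super-resolution reconstruction sub-network
    return np.kron(lr, np.ones((scale, scale)))

def extract_features(img):
    # stand-in for the convolutional feature-extraction layers
    return img

def fuse(f_lr, f_sr):
    # upsample the low-resolution map to the super-resolution size,
    # add element-wise, and apply ReLU (feature-level fusion)
    up = np.kron(f_lr, np.ones((f_sr.shape[0] // f_lr.shape[0],
                                f_sr.shape[1] // f_lr.shape[1])))
    return np.maximum(up + f_sr, 0.0)

def propose_regions(fused):
    # stand-in for the region-proposal sub-network: one whole-image box
    return [(0, 0, fused.shape[1], fused.shape[0])]

def roi_pool(fused, region, bins=2):
    # split the region into bins x bins intervals and max-pool each
    x0, y0, x1, y1 = region
    patch = fused[y0:y1, x0:x1]
    h, w = patch.shape[0] // bins, patch.shape[1] // bins
    return np.array([[patch[i*h:(i+1)*h, j*w:(j+1)*w].max()
                      for j in range(bins)] for i in range(bins)])

def interpret(lr_image):
    sr = super_resolve(lr_image)
    fused = fuse(extract_features(lr_image), extract_features(sr))
    return [roi_pool(fused, r) for r in propose_regions(fused)]
```

The detection and segmentation heads of the final step would then consume the pooled results.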
Optionally, the steps of: performing super-resolution reconstruction on the acquired low-resolution terahertz image to obtain a super-resolution terahertz image, which specifically comprises the following steps:
performing super-resolution reconstruction on the acquired low-resolution terahertz image by using a super-resolution reconstruction subnetwork to obtain a super-resolution terahertz image;
the super-resolution reconstruction sub-network comprises at least one recursive block, each recursive block comprising at least one residual unit; the output of the last recursive block is convolved and then added to the low-resolution terahertz image to obtain the super-resolution terahertz image.
Optionally, the steps are: performing feature level image fusion on the low-resolution feature map and the super-resolution feature map to obtain a feature fusion map, which specifically comprises the following steps:
sequentially performing deconvolution, dilated (atrous) convolution and L2 normalization on the low-resolution feature map to obtain a low-resolution processing map;
sequentially performing a 1×1 convolution and L2 normalization on the super-resolution feature map to obtain a super-resolution processing map;
and fusing the low-resolution processing map and the super-resolution processing map, then applying a ReLU activation function to obtain the feature fusion map.
Optionally, the step of performing region proposal on the feature fusion map to obtain a candidate region specifically comprises:
generating a plurality of possible regions of different sizes over an x×y sliding window using an anchor mechanism on the feature fusion map;
performing classification and bounding-box regression on the possible regions to obtain a classification result and a bounding-box regression result;
and preliminarily screening the possible regions according to the classification result, then preliminarily offsetting the screened regions according to the bounding-box regression result to obtain the candidate region.
Optionally, the steps are: performing region pooling on the candidate region to obtain a region pooling result, specifically comprising:
mapping the candidate region to a corresponding position of the feature fusion map to obtain a mapping region;
and dividing the mapping area into a plurality of intervals with the same size, and performing maximum pooling calculation on each interval to obtain an area pooling result.
Optionally, the steps of: performing target detection and/or semantic segmentation on the region pooling result to finish information interpretation of the terahertz image, specifically comprising:
deconvoluting the regional pooling result;
inputting the deconvolution result of the region pooling result into a fully connected layer for conversion to obtain a one-dimensional vector;
performing classification and bounding-box regression on the one-dimensional vector using a region classifier and a bounding-box regressor, respectively, to obtain a classification result and a bounding-box regression result;
preliminarily screening the one-dimensional vectors according to the classification result to obtain candidate boxes belonging to preset categories, and offsetting the positions of the candidate boxes according to the bounding-box regression result to obtain a target detection map;
and/or,
up-sampling the deconvolution result of the region pooling result; classifying the up-sampled result with a semantic classifier, and then up-sampling again to obtain a semantic segmentation map.
Optionally, after the step of sequentially performing deconvolution, dilated convolution and L2 normalization on the low-resolution feature map to obtain a low-resolution processing map, the method further comprises:
sequentially performing a 1×1 convolution and a Softmax function on the deconvolution result of the low-resolution feature map to obtain a loss value L1, where the Softmax function is computed against a low-resolution label guided by the cascade label;
After the step of performing classification and bounding-box regression on the one-dimensional vector using the region classifier and the bounding-box regressor to obtain a classification result and a bounding-box regression result, the method further comprises:
training the region classifier with a Softmax Loss function to obtain a loss value L21; training the bounding-box regressor with Smooth L1 Loss to obtain a loss value L22; and summing L21 and L22 to obtain a loss value L2.
After the step of classifying the up-sampled result with the semantic classifier, the method further comprises:
performing classification training on the semantic classifier using the high-resolution label guided by the cascade label to obtain a loss value L3.
After the information interpretation of the terahertz image is completed, the method further comprises:
calculating a total loss function L;
if only target detection is performed on the region pooling result, L = λ1·L1 + λ2·L2;
if only semantic segmentation is performed on the region pooling result, L = λ1·L1 + λ3·L3;
if both target detection and semantic segmentation are performed on the region pooling result, L = λ1·L1 + λ2·L2 + λ3·L3;
and minimizing the total loss function L.
A terahertz image information interpretation network comprising:
the super-resolution reconstruction sub-network is used for carrying out super-resolution reconstruction on the acquired low-resolution terahertz image;
the feature extraction sub-network is used for performing feature extraction on the low-resolution terahertz image using at least one convolutional layer to obtain a low-resolution feature map, and performing feature extraction on the super-resolution terahertz image using at least one convolutional layer to obtain a super-resolution feature map;
the feature fusion sub-network is used for carrying out feature level image fusion on the low-resolution feature map and the super-resolution feature map to obtain a feature fusion map;
the region proposing sub-network is used for carrying out region proposing on the feature fusion map to obtain a candidate region;
the region pooling sub-network is used for performing region pooling on the candidate region to obtain a region pooling result;
the object detection and semantic segmentation sub-network is used for carrying out object detection and/or semantic segmentation on the regional pooling result to finish information interpretation on the terahertz image;
the super-resolution reconstruction sub-network is connected with the feature extraction sub-network, the feature extraction sub-network is connected with the feature fusion sub-network, the feature fusion sub-network is connected with the region proposal sub-network, the region proposal sub-network is connected with the region pooling sub-network, and the region pooling sub-network is connected with the target detection and semantic segmentation sub-network.
A storage medium having stored thereon computer-executable instructions, wherein the computer-executable instructions, when executed by a computer processor, implement the steps of the terahertz image information interpretation method described above.
Compared with the prior art, the invention has the following beneficial effects:
the method realizes super-resolution reconstruction of the low-resolution terahertz image, performs feature level image fusion on the super-resolution terahertz image and the low-resolution terahertz image obtained by reconstruction, performs target detection and/or semantic segmentation on the feature fusion image, and completes information interpretation of the low-resolution terahertz image. The calculation cost of the identification of the low-resolution terahertz image can be effectively reduced, and meanwhile, the accuracy of final output cannot be reduced, so that the information interpretation of the terahertz image is efficiently completed. And can effectively overcome characteristics such as terahertz imaging low resolution, high noise. The information interpretation method is very beneficial to the research of the terahertz image processing technology and the application of the terahertz image.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without inventive exercise.
Fig. 1 is a schematic flowchart of a terahertz image information interpretation method according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of a super-resolution reconstruction sub-network according to an embodiment of the present invention.
Fig. 3 is a schematic diagram of a recursive block according to an embodiment of the present invention.
Fig. 4 is a schematic diagram of a terahertz image information interpretation network according to a second embodiment of the present invention.
Fig. 5 is a schematic diagram of a terahertz image information interpretation network according to a second embodiment of the present invention.
Fig. 6 is a schematic diagram of a feature fusion sub-network according to a second embodiment of the present invention.
Illustration of the drawings: 11. a super-resolution reconstruction subnetwork; 12. a feature extraction subnetwork; 13. a feature fusion subnetwork; 14. a regional proposal subnetwork; 15. a regional pooling subnetwork; 16. object detection and semantic segmentation sub-networks.
Detailed Description
In order to make the objects, features and advantages of the present invention more obvious and understandable, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the embodiments described below are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the description of the present invention, it is to be understood that the terms "upper", "lower", "top", "bottom", "inner", "outer", and the like, indicate orientations or positional relationships based on those shown in the drawings, and are used only for convenience in describing the present invention and for simplicity in description, and do not indicate or imply that the referenced devices or elements must have a particular orientation, be constructed and operated in a particular orientation, and thus, are not to be construed as limiting the present invention. It should be noted that when one component is referred to as being "connected" to another component, it can be directly connected to the other component or intervening components may also be present.
The technical scheme of the invention is further explained by the specific implementation mode in combination with the attached drawings.
Example one
In this embodiment, a method for interpreting terahertz image information is provided, please refer to fig. 1, which includes the following steps:
s1, performing super-resolution reconstruction on the acquired low-resolution terahertz image to obtain a super-resolution terahertz image;
s2, performing feature extraction on the low-resolution terahertz image by using at least one layer of convolution layer to obtain a low-resolution feature map; performing feature extraction on the super-resolution terahertz image by using at least one convolution layer to obtain a super-resolution feature map;
s3, performing feature level image fusion on the low-resolution feature map and the super-resolution feature map to obtain a feature fusion map;
s4, carrying out region proposing on the feature fusion image to obtain a candidate region; performing regional pooling on the candidate region to obtain a regional pooling result;
and S5, performing target detection and/or semantic segmentation on the regional pooling result to finish information interpretation on the terahertz image.
In this embodiment, the step S1 specifically includes:
performing super-resolution reconstruction on the low-resolution terahertz image by using a super-resolution reconstruction sub-network 11 to obtain a super-resolution terahertz image;
the super-resolution reconstruction sub-network 11 comprises at least one recursive block, each recursive block comprising at least one residual unit; the output of the last recursive block is convolved and then added to the low-resolution terahertz image to obtain the super-resolution terahertz image.
Fig. 2 is a schematic diagram of a super-resolution reconstruction sub-network 11 comprising n recursive blocks (n ≥ 1). Fig. 3 is a schematic diagram of a recursive block comprising two residual units.
As shown in fig. 3, the output of the previous recursive block is convolved and fed into the 1st residual unit. In the 1st residual unit, the input is convolved twice and the result is added to the input itself, then output to the 2nd residual unit; in the 2nd residual unit, the output of the 1st unit is convolved twice and added to the input of the 1st residual unit, and the result is output to the next recursive block.
In general, if a recursive block contains m residual units (m ≥ 1), then in the m-th residual unit the output of the (m-1)-th unit is convolved twice, added to the input of the 1st residual unit, and output to the next recursive block.
When m = 1, the super-resolution reconstruction sub-network 11 can be expressed as:
y = D(x) = f_conv(R_n(R_{n-1}(⋯(R_1(x))⋯))) + x
where R denotes the recursive block function (its expansion is given as an image in the source and is not reproduced here) and f_conv denotes the final convolutional layer.
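Under these definitions, the recursive forward pass can be sketched in a one-dimensional toy form. This is an illustration only: `conv` is a fixed 3-tap smoothing filter rather than a learned layer, and resolution is not changed here — in the real sub-network the input is an already-upscaled low-resolution image.

```python
import numpy as np

def conv(x, kernel=(0.25, 0.5, 0.25)):
    # placeholder for a learned convolution: fixed 3-tap filter, 'same' size
    return np.convolve(x, kernel, mode="same")

def recursive_block(x, m=2):
    u0 = conv(x)            # convolved block input, fed to the 1st residual unit
    h = u0
    for _ in range(m):      # each residual unit: two convolutions + skip from u0
        h = conv(conv(h)) + u0
    return h

def sr_network(x, n=3):
    # y = D(x) = f_conv(R_n(...R_1(x)...)) + x
    h = x
    for _ in range(n):
        h = recursive_block(h)
    return conv(h) + x      # global skip connection from the input
```

Note how every residual unit shares the same skip source u0, matching the description above.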
Step S3 specifically comprises:
sequentially performing deconvolution, dilated convolution and L2 normalization on the low-resolution feature map to obtain a low-resolution processing map;
sequentially performing a 1×1 convolution and a Softmax function on the deconvolution result of the low-resolution feature map to obtain a loss value L1, where the Softmax function is computed against a low-resolution label guided by the cascade label;
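The Softmax-based auxiliary loss L1 amounts to a per-example cross-entropy against the cascade-guided label index. A minimal sketch (the stability shift by the maximum is a standard implementation detail, not stated in the patent):

```python
import numpy as np

def softmax(z):
    # shift by the max for numerical stability before exponentiating
    e = np.exp(z - z.max())
    return e / e.sum()

def softmax_loss(z, label):
    # cross-entropy of the softmax output against the guiding label index
    return -np.log(softmax(z)[label])
```

A confident, correct prediction yields a loss near zero; a confident, wrong one yields a large loss.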
sequentially performing a 1×1 convolution and L2 normalization on the super-resolution feature map to obtain a super-resolution processing map;
and fusing the low-resolution processing map and the super-resolution processing map, then applying a ReLU activation function to obtain the feature fusion map.
Training on the obtained loss value L1 reinforces learning of the steps from the low-resolution feature map to the low-resolution processing map. To combine feature maps of different resolutions, this embodiment first up-samples the low-resolution feature map by deconvolution to enlarge the receptive field; reduces the dimension of the super-resolution feature map through a 1×1 convolution to match the dimension of the low-resolution feature map; applies L2 normalization to both feature maps and fuses them; and then applies a ReLU activation function to obtain a feature fusion map with the same resolution as the super-resolution feature map.
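The normalize-fuse-activate step can be sketched as follows, assuming channel-first (C, H, W) arrays. The patent does not fix the fusion operator, so the element-wise addition here is an assumption:

```python
import numpy as np

def l2_normalize(feat, eps=1e-12):
    # scale each spatial position's channel vector to unit L2 norm
    norm = np.sqrt((feat ** 2).sum(axis=0, keepdims=True))
    return feat / (norm + eps)

def fuse_maps(lr_proc, sr_proc):
    # L2-normalize both processing maps, fuse element-wise, apply ReLU
    return np.maximum(l2_normalize(lr_proc) + l2_normalize(sr_proc), 0.0)
```

The normalization puts both branches on the same scale before fusion, so neither branch dominates the sum.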
In step S4, performing region proposal on the feature fusion map to obtain candidate regions specifically includes:
generating a plurality of possible regions of different sizes over an x×y sliding window using an anchor mechanism on the feature fusion map;
performing classification and bounding-box regression on the possible regions to obtain a classification result and a bounding-box regression result;
and preliminarily screening the possible regions according to the classification result, then preliminarily offsetting the screened regions according to the bounding-box regression result to obtain the candidate regions.
The number of possible regions retained is generally about 300.
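Anchor generation over the sliding window can be sketched as follows. The stride, scales, and aspect ratios are illustrative defaults in the Faster R-CNN style, not values from the patent:

```python
import numpy as np

def generate_anchors(feat_h, feat_w, stride=16,
                     scales=(8, 16, 32), ratios=(0.5, 1.0, 2.0)):
    # one anchor per (position, scale, ratio): rows of (cx, cy, w, h)
    anchors = []
    for y in range(feat_h):
        for x in range(feat_w):
            cx = x * stride + stride / 2.0
            cy = y * stride + stride / 2.0
            for s in scales:
                for r in ratios:
                    w = s * stride / np.sqrt(r)   # width shrinks as ratio grows
                    h = s * stride * np.sqrt(r)   # height grows with ratio
                    anchors.append((cx, cy, w, h))
    return np.array(anchors)
```

Each feature-map position thus contributes scales × ratios anchors, and the proposal branch scores and refines all of them.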
In the step S4, performing region pooling on the candidate region to obtain a region pooling result, which specifically includes:
mapping the candidate region to a corresponding position of the feature fusion map to obtain a mapping region;
and dividing the mapping area into a plurality of intervals with the same size, and performing maximum pooling calculation on each interval to obtain an area pooling result.
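A minimal sketch of these two steps — mapping the candidate region onto the feature map by the network stride, then max-pooling equal-sized intervals. The stride and bin count are illustrative assumptions:

```python
import numpy as np

def roi_max_pool(feature, region, stride=16, bins=2):
    # map image-space (x0, y0, x1, y1) onto the feature map, then
    # split the mapped patch into bins x bins intervals and max-pool each
    x0, y0, x1, y1 = (int(round(v / stride)) for v in region)
    patch = feature[y0:max(y1, y0 + 1), x0:max(x1, x0 + 1)]
    ys = np.linspace(0, patch.shape[0], bins + 1).astype(int)
    xs = np.linspace(0, patch.shape[1], bins + 1).astype(int)
    out = np.empty((bins, bins))
    for i in range(bins):
        for j in range(bins):
            out[i, j] = patch[ys[i]:max(ys[i + 1], ys[i] + 1),
                              xs[j]:max(xs[j + 1], xs[j] + 1)].max()
    return out
```

The fixed bins × bins output lets regions of any size feed the subsequent fully connected layers.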
Step S5 specifically comprises:
S51, deconvolving the region pooling result;
S52, inputting the deconvolution result into a fully connected layer for conversion to obtain a one-dimensional vector;
performing classification and bounding-box regression on the one-dimensional vector using a region classifier and a bounding-box regressor, respectively, to obtain a classification result and a bounding-box regression result;
preliminarily screening the one-dimensional vectors according to the classification result to obtain candidate boxes belonging to preset categories, and offsetting the positions of the candidate boxes according to the bounding-box regression result to obtain a target detection map;
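The screening-and-offset step can be sketched as follows. The box parameterization and the exp transform follow the common Faster R-CNN convention; the background index and score threshold are assumptions:

```python
import numpy as np

def apply_deltas(box, deltas):
    # shift a candidate box (x, y, w, h) by regression deltas (tx, ty, tw, th)
    x, y, w, h = box
    tx, ty, tw, th = deltas
    return (x + tx * w, y + ty * h, w * np.exp(tw), h * np.exp(th))

def select_detections(boxes, class_probs, deltas, threshold=0.5):
    # keep boxes whose best non-background class clears the threshold,
    # then refine each kept box by its regression deltas
    kept = []
    for box, probs, d in zip(boxes, class_probs, deltas):
        cls = int(np.argmax(probs))
        if cls != 0 and probs[cls] >= threshold:   # index 0 = background
            kept.append((cls, apply_deltas(box, d)))
    return kept
```

Scaling the translation deltas by the box size and the size deltas exponentially keeps the regression targets roughly scale-invariant.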
wherein the region classifier is trained with a Softmax Loss function to obtain a loss value L21, the bounding-box regressor is trained with Smooth L1 Loss to obtain a loss value L22, and L21 and L22 are summed to obtain a loss value L2.
And/or,
S53, up-sampling the deconvolution result of the region pooling result; classifying the up-sampled result with a semantic classifier, and then up-sampling again to obtain a semantic segmentation map;
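A shape-level sketch of this segmentation branch, where nearest-neighbour repetition stands in for both learned up-sampling stages and a per-pixel argmax stands in for the semantic classifier:

```python
import numpy as np

def upsample(x, factor=2):
    # nearest-neighbour stand-in for learned up-sampling / deconvolution
    return x.repeat(factor, axis=-2).repeat(factor, axis=-1)

def segment(score_maps, factor=2):
    # score_maps: (classes, H, W) -> up-sample, classify per pixel,
    # then up-sample the label map again
    up = upsample(score_maps, factor)
    labels = up.argmax(axis=0)
    return upsample(labels, factor)
```

The two up-sampling stages bring the per-pixel labels back toward the super-resolution image size.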
performing classification training on the semantic classifier using the high-resolution label guided by the cascade label to obtain a loss value L3.
After region pooling, the target detection sub-network and the semantic segmentation sub-network are used to obtain a target detection map and a semantic segmentation map respectively, achieving a comprehensive interpretation of the terahertz image.
Step S6 is further included after step S5:
calculating a total loss function L;
if only target detection is performed on the region pooling result, L = λ1·L1 + λ2·L2;
if only semantic segmentation is performed on the region pooling result, L = λ1·L1 + λ3·L3;
if both target detection and semantic segmentation are performed on the region pooling result, L = λ1·L1 + λ2·L2 + λ3·L3;
and minimizing the total loss function L.
Interpretation of terahertz images can be made more accurate by minimizing the total loss function.
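The three weighting cases collapse into one small helper. The λ values and the optional-branch handling are illustrative, not specified by the patent:

```python
def total_loss(l1, l2=None, l3=None, lambdas=(1.0, 1.0, 1.0)):
    # L = λ1·L1 [+ λ2·L2 if detecting] [+ λ3·L3 if segmenting]
    lam1, lam2, lam3 = lambdas
    loss = lam1 * l1
    if l2 is not None:          # target-detection branch active
        loss += lam2 * l2
    if l3 is not None:          # semantic-segmentation branch active
        loss += lam3 * l3
    return loss
```

The fusion loss L1 is always present, so the shared fusion stage is trained regardless of which head is enabled.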
In the first embodiment of the invention, a sub-network with a plurality of recursive blocks performs super-resolution reconstruction on a low-resolution image; features are extracted from the resulting super-resolution image through at least one convolutional layer, while features are likewise extracted from the low-resolution image through at least one convolutional layer; the feature maps of the two resolutions are then fused, and a loss value L1 is obtained in the fusion process. After region proposal and region pooling are performed on the feature fusion map, deconvolution is carried out, and target detection and semantic segmentation are then performed respectively to complete the information interpretation of the low-resolution terahertz image; the target detection process yields a loss value L2 and the semantic segmentation process yields a loss value L3. The total loss function is calculated from L1, L2 and L3 and minimized.
The process effectively reduces the calculation cost of the identification of the low-resolution terahertz image, does not reduce the accuracy of final output, and efficiently finishes the information interpretation of the terahertz image. Experimental results show that the network structure effectively improves the correct classification rate of small objects of the low-resolution terahertz image, obtains a finer segmentation effect and can achieve satisfactory segmentation speed.
The method realizes super-resolution reconstruction, target detection and semantic segmentation of the low-resolution terahertz image, completing its information interpretation. Because the super-resolution branch uses far fewer convolutional layers than the low-resolution branch, the input of the super-resolution branch is processed with only a small number of convolutional layers and the fused feature map is built in a very short time. Rapid high-resolution target detection and semantic segmentation can therefore be achieved, more detail is captured, and the low-resolution, high-noise characteristics of terahertz imaging are effectively overcome. This interpretation method is highly beneficial to research on terahertz image processing and to practical applications of terahertz images.
Example two
This embodiment provides a terahertz image information interpretation network for implementing the terahertz image information interpretation method of the first embodiment, specifically comprising:
the super-resolution reconstruction sub-network 11 is used for carrying out super-resolution reconstruction on the acquired low-resolution terahertz image;
the feature extraction sub-network 12 is used for performing feature extraction on the low-resolution terahertz image by using at least one convolution layer to obtain a low-resolution feature map, and performing feature extraction on the super-resolution terahertz image by using at least one convolution layer to obtain a super-resolution feature map;
the feature fusion sub-network 13 is used for performing feature level image fusion on the low-resolution feature map and the super-resolution feature map to obtain a feature fusion map;
a region proposing sub-network 14, configured to propose a region for the feature fusion map to obtain a candidate region;
the region pooling sub-network 15 is used for performing region pooling on the candidate region to obtain a region pooling result;
and the object detection and semantic segmentation sub-network 16 is used for carrying out object detection and/or semantic segmentation on the region pooling result to finish information interpretation on the terahertz image.
The super-resolution reconstruction sub-network 11 is connected with a feature extraction sub-network 12, the feature extraction sub-network 12 is connected with a feature fusion sub-network 13, the feature fusion sub-network 13 is connected with a region proposing sub-network 14, the region proposing sub-network 14 is connected with a region pooling sub-network 15, and the region pooling sub-network 15 is connected with a target detection and semantic segmentation sub-network 16.
Fig. 2 is a schematic diagram of a super-resolution reconstruction sub-network 11 comprising n recursive blocks (n ≥ 1). Fig. 3 is a schematic diagram of a recursive block comprising two residual units.
As shown in fig. 2, the super-resolution reconstruction sub-network 11 comprises at least one recursive block, each recursive block comprising at least one residual unit; the output of the last recursive block is convolved and then added to the low-resolution terahertz image to obtain the super-resolution terahertz image.
As shown in fig. 3, the output of the previous recursive block is convolved and fed into the 1st residual unit. In the 1st residual unit, the input is convolved twice and the result is added to the input itself, then output to the 2nd residual unit; in the 2nd residual unit, the output of the 1st unit is convolved twice and added to the input of the 1st residual unit, and the result is output to the next recursive block.
In general, if a recursive block contains m residual units (m ≥ 1), then in the m-th residual unit the output of the (m-1)-th unit is convolved twice, added to the input of the 1st residual unit, and output to the next recursive block.
When m = 1, the super-resolution reconstruction sub-network 11 can be expressed as:
y = D(x) = f_conv(R_n(R_{n-1}(⋯(R_1(x))⋯))) + x
where R denotes the recursive block function (its expansion is given as an image in the source and is not reproduced here) and f_conv denotes the final convolutional layer.
Referring to fig. 5, fig. 5 is a schematic diagram of a terahertz image information interpretation network.
As shown, feature extraction subnetwork 12 is specifically configured to:
processing the low-resolution terahertz image with at least one convolutional layer to obtain a low-resolution feature map;
and processing the super-resolution terahertz image with at least one convolutional layer to obtain a super-resolution feature map.
That is, the feature extraction sub-network 12 comprises i convolutional layers for processing the low-resolution terahertz image and j convolutional layers for processing the super-resolution terahertz image, where i ≥ 1 and j ≥ 1.
Referring to fig. 6, fig. 6 is a schematic diagram of the feature fusion sub-network 13. As shown, the feature fusion sub-network 13 is specifically configured to:
deconvoluting the low-resolution feature map;
sequentially performing dilated convolution and L2 normalization on the deconvolution result of the low-resolution feature map to obtain a low-resolution processing map;
sequentially performing a 1×1 convolution and a Softmax function calculation on the deconvolution result of the low-resolution feature map to obtain a loss value L1, wherein the Softmax function is computed against low-resolution labels guided by the cascade labels;
sequentially performing a 1×1 convolution and L2 normalization on the super-resolution feature map to obtain a super-resolution processing map;
and fusing the low-resolution processing map with the super-resolution processing map, then applying a ReLU activation function to obtain the feature fusion map.
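The fusion of the two branches can be sketched as follows. This is a simplified NumPy illustration showing only the L2 normalization, the element-wise fusion, and the ReLU; the deconvolution, dilated convolution, and 1×1 convolution stages of the two branches are omitted, and the feature shapes are illustrative.

```python
import numpy as np

def l2_normalize(f, eps=1e-8):
    # channel-wise L2 normalization so both branches contribute on a common scale
    return f / (np.linalg.norm(f, axis=0, keepdims=True) + eps)

def relu(x):
    return np.maximum(x, 0.0)

def fuse(low_res_branch, super_res_branch):
    # element-wise fusion of the two normalized branches, then ReLU activation
    return relu(l2_normalize(low_res_branch) + l2_normalize(super_res_branch))
```

Normalizing before fusion prevents the higher-magnitude branch from dominating the fused feature map, which is the usual motivation for the L2 step.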
In the present embodiment, the area proposal subnetwork 14 is specifically configured to:
generating a plurality of possible regions of different sizes on an x×y sliding window over the feature fusion map by using an anchor mechanism;
performing classification calculation and border regression calculation on the multiple possible regions to obtain a classification result and a border regression result;
and preliminarily screening various possible regions according to the classification result, and preliminarily offsetting the screened possible regions according to the frame regression result to obtain candidate regions.
The number of possible regions is typically around 300.
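The anchor mechanism at one sliding-window position can be sketched as below. This is a generic illustration of scale/aspect-ratio anchor generation, assuming three scales, three ratios, and a stride of 16; these parameter values are illustrative and are not specified in the patent.

```python
import numpy as np

def generate_anchors(center, scales=(8, 16, 32), ratios=(0.5, 1.0, 2.0), stride=16):
    """Generate anchor boxes of several scales and aspect ratios at one
    sliding-window position, as (x0, y0, x1, y1) corner coordinates."""
    cx, cy = center
    anchors = []
    for s in scales:
        area = (s * stride) ** 2        # anchor area in input-image pixels
        for r in ratios:
            w = np.sqrt(area / r)       # width/height chosen so h / w == r
            h = w * r
            anchors.append([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2])
    return np.array(anchors)
```

Each window position thus proposes 9 candidate boxes; classification scores then screen them and the regression offsets shift the survivors, as described above.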
The regional pooling sub-network 15 is specifically configured to:
mapping the candidate region to a corresponding position of the feature fusion map to obtain a mapping region;
and dividing the mapping area into a plurality of intervals with the same size, and performing maximum pooling calculation on each interval to obtain an area pooling result.
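The region pooling step (mapping, equal-size partition, per-bin maximum) can be sketched as follows. This is a simplified single-channel NumPy version; the patent does not specify the output grid size, so 2×2 here is illustrative, and bin edges are rounded.

```python
import numpy as np

def roi_max_pool(feature_map, roi, out_size=(2, 2)):
    """Divide the mapped region into equal-size bins and max-pool each bin
    (a simplified RoI max pooling over a 2-D feature map)."""
    x0, y0, x1, y1 = roi                      # region already mapped onto the feature map
    region = feature_map[y0:y1, x0:x1]
    gh, gw = out_size
    h_edges = np.linspace(0, region.shape[0], gh + 1).astype(int)
    w_edges = np.linspace(0, region.shape[1], gw + 1).astype(int)
    out = np.empty(out_size)
    for i in range(gh):
        for j in range(gw):
            out[i, j] = region[h_edges[i]:h_edges[i + 1],
                               w_edges[j]:w_edges[j + 1]].max()
    return out
```

The fixed output grid is what lets candidate regions of different sizes feed the same fully-connected layers downstream.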
Referring to fig. 5, the object detection and semantic segmentation subnetwork 16 is specifically configured to:
deconvoluting the regional pooling result;
inputting the deconvolution result of the regional pooling result into a fully-connected layer for conversion to obtain a one-dimensional vector;
respectively carrying out classification calculation and frame regression calculation on the one-dimensional vectors by using a region classifier and a frame regression device to obtain a classification result and a frame regression result;
preliminarily screening the one-dimensional vectors according to the classification result to obtain candidate frames belonging to a preset category, and offsetting the positions of the candidate frames according to a frame regression result to obtain a target detection diagram;
wherein the region classifier is trained with a Softmax loss function to obtain a loss value L21, the frame regressor is trained with a Smooth L1 loss to obtain a loss value L22, and the loss values L21 and L22 are summed to obtain the loss value L2.
Up-sampling the deconvolution result of the regional pooling result; classifying the up-sampled result with a semantic classifier, and then up-sampling again to obtain a semantic segmentation map.
wherein the semantic classifier is trained for classification using high-resolution labels guided by the cascade labels, yielding the loss value L3.
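The two detection-head losses named above can be sketched as below. These are generic NumPy formulations of a Softmax (cross-entropy) loss and the Smooth L1 loss; the reduction to a sum and the beta threshold of 1.0 are common conventions, not values taken from the patent.

```python
import numpy as np

def softmax_cross_entropy(logits, label):
    # Softmax loss used to train the region classifier (single sample, integer label)
    z = logits - logits.max()                 # shift for numerical stability
    log_probs = z - np.log(np.exp(z).sum())
    return -log_probs[label]

def smooth_l1(pred, target, beta=1.0):
    # Smooth L1 loss used to train the frame regressor: quadratic near zero,
    # linear for large errors, so outlier boxes do not dominate the gradient
    d = np.abs(pred - target)
    return np.where(d < beta, 0.5 * d ** 2 / beta, d - 0.5 * beta).sum()
```

Summing the two gives the combined detection loss L2 = L21 + L22 described above.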
In this embodiment, the terahertz image information interpretation network further includes a training unit, and the training unit is configured to: calculating a total loss function L;
If only target detection is performed on the regional pooling result, the total loss function is L = λ1·L1 + λ2·L2;
if only semantic segmentation is performed on the regional pooling result, the total loss function is L = λ1·L1 + λ3·L3;
if both target detection and semantic segmentation are performed on the regional pooling result, the total loss function is L = λ1·L1 + λ2·L2 + λ3·L3.
The total loss function L is minimized.
The training unit trains the whole super-resolution fully-convolutional cascade network by minimizing the total loss function L, so as to improve the network's accuracy.
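The case analysis of the total loss function reduces to a weighted sum in which absent branches are simply omitted. A minimal sketch, where the λ weight values are illustrative:

```python
def total_loss(l1, l2=None, l3=None, lam=(1.0, 1.0, 1.0)):
    """Total loss L = λ1·L1 + λ2·L2 + λ3·L3, where L2 (detection) and/or
    L3 (segmentation) are omitted when that branch is not used."""
    loss = lam[0] * l1
    if l2 is not None:
        loss += lam[1] * l2    # detection branch active
    if l3 is not None:
        loss += lam[2] * l3    # segmentation branch active
    return loss
```

Minimizing this single scalar trains the super-resolution, fusion, and head sub-networks jointly, which is the point of the cascade design.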
EXAMPLE III
An embodiment of the present invention provides a storage medium containing computer-executable instructions which, when executed by a computer processor, perform the terahertz image information interpretation method according to the first embodiment of the present invention.
Computer storage media for embodiments of the invention may employ any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java, Smalltalk, C++, or the like, as well as conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (5)

1. A terahertz image information interpretation method is characterized by comprising the following steps:
performing super-resolution reconstruction on the acquired low-resolution terahertz image to obtain a super-resolution terahertz image;
performing feature extraction on the low-resolution terahertz image by using at least one convolution layer to obtain a low-resolution feature map; performing feature extraction on the super-resolution terahertz image by using at least one convolution layer to obtain a super-resolution feature map;
performing feature level image fusion on the low-resolution feature map and the super-resolution feature map to obtain a feature fusion map;
carrying out region proposing on the feature fusion graph to obtain a candidate region; performing region pooling on the candidate region to obtain a region pooling result;
performing target detection and/or semantic segmentation on the region pooling result to complete the information interpretation of the terahertz image; wherein the step of performing feature-level image fusion on the low-resolution feature map and the super-resolution feature map to obtain a feature fusion map specifically comprises:
sequentially performing deconvolution, dilated convolution and L2 normalization on the low-resolution feature map to obtain a low-resolution processing map;
sequentially performing a 1×1 convolution and L2 normalization on the super-resolution feature map to obtain a super-resolution processing map;
fusing the low-resolution processing map with the super-resolution processing map, then applying a ReLU activation function to obtain the feature fusion map; and wherein the step of performing target detection and/or semantic segmentation on the region pooling result to complete the information interpretation of the terahertz image specifically comprises:
deconvoluting the regional pooling result;
inputting the deconvolution result of the regional pooling result into a fully-connected layer for conversion to obtain a one-dimensional vector;
respectively carrying out classification calculation and border regression calculation on the one-dimensional vector by using a region classifier and a border regression device to obtain a classification result and a border regression result;
preliminarily screening the one-dimensional vectors according to the classification result to obtain candidate frames belonging to a preset category, and offsetting the positions of the candidate frames according to a frame regression result to obtain a target detection map;
and/or the presence of a gas in the atmosphere,
upsampling a deconvolution result of the region pooling result; classifying the up-sampled result with a semantic classifier, and then up-sampling again to obtain a semantic segmentation map; wherein the step of sequentially performing deconvolution, dilated convolution and L2 normalization on the low-resolution feature map to obtain a low-resolution processing map is followed by the step of:
sequentially performing a 1×1 convolution and a Softmax function calculation on the deconvolution result of the low-resolution feature map to obtain a loss value L1, wherein the Softmax function is computed against low-resolution labels guided by the cascade labels;
wherein the step of performing classification calculation and frame regression calculation on the one-dimensional vector by using a region classifier and a frame regressor respectively, to obtain a classification result and a frame regression result, is followed by the steps of:
training the region classifier with a Softmax loss function to obtain a loss value L21; training the frame regressor with a Smooth L1 loss to obtain a loss value L22; and summing the loss values L21 and L22 to obtain the loss value L2;
wherein the step of classifying the up-sampled result with a semantic classifier is followed by the step of:
training the semantic classifier for classification using high-resolution labels guided by the cascade labels to obtain a loss value L3;
wherein the step of completing the information interpretation of the terahertz image is followed by the steps of:
calculating a total loss function L;
if only target detection is performed on the region pooling result, the total loss function is L = λ1·L1 + λ2·L2;
if only semantic segmentation is performed on the region pooling result, the total loss function is L = λ1·L1 + λ3·L3;
if both target detection and semantic segmentation are performed on the region pooling result, the total loss function is L = λ1·L1 + λ2·L2 + λ3·L3;
Minimizing the total loss function L;
wherein the step of performing super-resolution reconstruction on the acquired low-resolution terahertz image to obtain a super-resolution terahertz image specifically comprises:
performing super-resolution reconstruction on the acquired low-resolution terahertz image by using a super-resolution reconstruction sub-network to obtain a super-resolution terahertz image;
the super-resolution reconstruction sub-network comprises at least one recursive block, and each recursive block comprises at least one residual unit; and after convolution is carried out on the output value of the last recursion block, the output value is superposed with the low-resolution terahertz image to obtain the super-resolution terahertz image.
2. The terahertz image information interpretation method according to claim 1, wherein the step of performing region proposal on the feature fusion map to obtain a candidate region specifically comprises:
generating a plurality of possible regions of different sizes on an x×y sliding window over the feature fusion map by using an anchor mechanism;
performing classification calculation and frame regression calculation on the possible regions to obtain a classification result and a frame regression result;
and preliminarily screening various possible regions according to the classification result, and preliminarily offsetting the screened possible regions according to the frame regression result to obtain candidate regions.
3. The terahertz image information interpretation method according to claim 1, wherein the step of performing region pooling on the candidate region to obtain a region pooling result specifically comprises:
mapping the candidate region to a corresponding position of the feature fusion map to obtain a mapping region;
and dividing the mapping area into a plurality of intervals with the same size, and performing maximum pooling calculation on each interval to obtain an area pooling result.
4. A terahertz image information interpretation network is characterized by comprising:
the super-resolution reconstruction sub-network is used for carrying out super-resolution reconstruction on the acquired low-resolution terahertz image to obtain a super-resolution terahertz image;
the characteristic extraction sub-network is used for carrying out characteristic extraction on the low-resolution terahertz image by utilizing at least one convolution layer to obtain a low-resolution characteristic diagram; performing feature extraction on the super-resolution terahertz image by using at least one convolution layer to obtain a super-resolution feature map;
the feature fusion sub-network is used for carrying out feature level image fusion on the low-resolution feature map and the super-resolution feature map to obtain a feature fusion map;
the region proposing sub-network is used for carrying out region proposing on the feature fusion map to obtain a candidate region;
the region pooling sub-network is used for performing region pooling on the candidate region to obtain a region pooling result;
the object detection and semantic segmentation sub-network is used for carrying out object detection and/or semantic segmentation on the regional pooling result to finish information interpretation on the terahertz image;
the super-resolution reconstruction sub-network is connected with the feature extraction sub-network, the feature extraction sub-network is connected with the feature fusion sub-network, the feature fusion sub-network is connected with the area proposal sub-network, the area proposal sub-network is connected with the area pooling sub-network, and the area pooling sub-network is connected with the target detection and semantic segmentation sub-network;
the feature fusion subnetwork is specifically configured to:
deconvoluting the low-resolution feature map;
sequentially performing dilated convolution and L2 normalization on the deconvolution result of the low-resolution feature map to obtain a low-resolution processing map;
sequentially performing a 1×1 convolution and a Softmax function calculation on the deconvolution result of the low-resolution feature map to obtain a loss value L1, wherein the Softmax function is computed against low-resolution labels guided by the cascade labels;
sequentially performing a 1×1 convolution and L2 normalization on the super-resolution feature map to obtain a super-resolution processing map;
fusing the low-resolution processing map with the super-resolution processing map, then applying a ReLU activation function to obtain the feature fusion map;
the object detection and semantic segmentation sub-network is specifically configured to:
deconvoluting the regional pooling result;
inputting the deconvolution result of the regional pooling result into a fully-connected layer for conversion to obtain a one-dimensional vector;
respectively carrying out classification calculation and frame regression calculation on the one-dimensional vectors by using a region classifier and a frame regression device to obtain a classification result and a frame regression result;
preliminarily screening the one-dimensional vectors according to the classification result to obtain candidate frames belonging to a preset category, and offsetting the positions of the candidate frames according to a frame regression result to obtain a target detection diagram;
wherein the region classifier is trained with a Softmax loss function to obtain a loss value L21, the frame regressor is trained with a Smooth L1 loss to obtain a loss value L22, and the loss values L21 and L22 are summed to obtain the loss value L2;
Up-sampling the deconvolution result of the regional pooling result; classifying the up-sampling result by using a semantic classifier, and then up-sampling to obtain a semantic segmentation graph;
wherein the semantic classifier is trained for classification using high-resolution labels guided by the cascade labels to obtain the loss value L3;
The terahertz image information interpretation network further comprises a training unit, and the training unit is used for: calculating a total loss function L;
if only target detection is performed on the regional pooling result, the total loss function is L = λ1·L1 + λ2·L2;
if only semantic segmentation is performed on the regional pooling result, the total loss function is L = λ1·L1 + λ3·L3;
if both target detection and semantic segmentation are performed on the regional pooling result, the total loss function is L = λ1·L1 + λ2·L2 + λ3·L3;
The total loss function L is minimized.
5. A storage medium having stored thereon computer-executable instructions, which, when executed by a computer processor, implement the steps of the terahertz image information interpretation method of any one of claims 1 to 3.
CN201910019673.4A 2019-01-09 2019-01-09 Terahertz image information interpretation method, network and storage medium Active CN109740688B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910019673.4A CN109740688B (en) 2019-01-09 2019-01-09 Terahertz image information interpretation method, network and storage medium


Publications (2)

Publication Number Publication Date
CN109740688A CN109740688A (en) 2019-05-10
CN109740688B true CN109740688B (en) 2023-04-18

Family

ID=66364095

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910019673.4A Active CN109740688B (en) 2019-01-09 2019-01-09 Terahertz image information interpretation method, network and storage medium

Country Status (1)

Country Link
CN (1) CN109740688B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111123266B (en) * 2019-11-22 2023-05-16 中国电子科技集团公司第四十一研究所 Terahertz wave large-area uniform illumination device and imaging method
CN111353940B (en) * 2020-03-31 2021-04-02 成都信息工程大学 Image super-resolution reconstruction method based on deep learning iterative up-down sampling
CN111784573A (en) * 2020-05-21 2020-10-16 昆明理工大学 Passive terahertz image super-resolution reconstruction method based on transfer learning
CN111709878B (en) * 2020-06-17 2023-06-23 北京百度网讯科技有限公司 Face super-resolution implementation method and device, electronic equipment and storage medium
CN112435162B (en) * 2020-11-13 2024-03-05 中国科学院沈阳自动化研究所 Terahertz image super-resolution reconstruction method based on complex domain neural network
CN116309274B (en) * 2022-12-12 2024-01-30 湖南红普创新科技发展有限公司 Method and device for detecting small target in image, computer equipment and storage medium

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR2994307B1 (en) * 2012-08-06 2015-06-05 Commissariat Energie Atomique METHOD AND DEVICE FOR RECONSTRUCTION OF SUPER-RESOLUTION IMAGES
CN104778671B (en) * 2015-04-21 2017-09-22 重庆大学 A kind of image super-resolution method based on SAE and rarefaction representation
WO2016197303A1 (en) * 2015-06-08 2016-12-15 Microsoft Technology Licensing, Llc. Image semantic segmentation
CN108428212A (en) * 2018-01-30 2018-08-21 中山大学 A kind of image magnification method based on double laplacian pyramid convolutional neural networks
CN108876792B (en) * 2018-04-13 2020-11-10 北京迈格威科技有限公司 Semantic segmentation method, device and system and storage medium
CN108830225B (en) * 2018-06-13 2021-07-06 广东工业大学 Method, device, equipment and medium for detecting target object in terahertz image
CN108875659B (en) * 2018-06-26 2022-04-22 上海海事大学 Sea chart cultivation area identification method based on multispectral remote sensing image

Also Published As

Publication number Publication date
CN109740688A (en) 2019-05-10


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant