CN109949257B - Region-of-interest compressed sensing image reconstruction method based on deep learning - Google Patents

Region-of-interest compressed sensing image reconstruction method based on deep learning

Info

Publication number
CN109949257B
CN109949257B (application CN201910166307.1A)
Authority
CN
China
Prior art keywords
image
region
convolution
layer
interest
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910166307.1A
Other languages
Chinese (zh)
Other versions
CN109949257A (en)
Inventor
谢雪梅
毛思颖
王陈业
赵至夫
石光明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University filed Critical Xidian University
Priority to CN201910166307.1A priority Critical patent/CN109949257B/en
Publication of CN109949257A publication Critical patent/CN109949257A/en
Application granted granted Critical
Publication of CN109949257B publication Critical patent/CN109949257B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a region-of-interest compressed sensing image reconstruction method based on deep learning, which addresses the low reconstruction quality that existing compressed sensing image reconstruction methods achieve for the region of interest under limited observation resources. The method comprises the following steps: (1) constructing a region-of-interest perception reconstruction network; (2) training the region-of-interest perception reconstruction network; (3) preprocessing the natural image to be reconstructed; (4) acquiring first observation information; (5) obtaining an initial recovery image; (6) acquiring a region-of-interest image; (7) acquiring second observation information; (8) reconstructing the perceptually restored image. By observing twice, the invention allocates more observation resources to the region of interest, so that the texture details of the region of interest in the reconstructed image are clear.

Description

Region-of-interest compressed sensing image reconstruction method based on deep learning
Technical Field
The invention belongs to the technical field of image processing, and further relates to a region-of-interest compressed sensing image reconstruction method based on deep learning within the technical field of image reconstruction. When reconstructing a natural image, the method can be used to obtain, at an equivalent observation rate, an image in which the region of interest has higher quality.
Background
The rapid development of information technology has led to a dramatic increase in the demand for information. Compressive sensing theory has brought a revolutionary breakthrough to signal acquisition: it shows that, under certain conditions, a signal can be sampled at a rate far below the Nyquist rate and the original signal can be reconstructed with high probability by solving a numerical optimization problem, which saves a large amount of resources. Compared with traditional optimization-based solvers, compressed sensing image reconstruction methods based on deep learning train their parameters offline, so reconstruction is fast and can be performed in real time. Existing deep-learning-based compressed sensing methods observe the whole scene uniformly, that is, they distribute observation resources evenly. However, when human eyes perceive an image, more attention is paid to regions of interest, such as objects and salient signs in a picture, pathological regions in medical images, or ground vehicles photographed by unmanned aerial vehicles. The information in these regions also plays a more important role in post-imaging processing.
Chakraborty et al., in the paper "Region of interest compressive sensing of the images and the questions compressive quality" (Proceedings of the IEEE Aerospace Conference, 2018, pp. 1-11), propose a block-based region-of-interest compressive sensing method. The image is first divided into blocks, and the blocks of interest, defined as regions with specific topographic features, are found with a traditional classification method; the designed observation matrix assigns a larger number of samples to the positions corresponding to the region of interest, and the observed values are finally reconstructed with a conventional iterative method. The method has two disadvantages: first, its reconstruction is completed by a traditional iterative algorithm, so the time complexity is high and the speed of the algorithm suffers; second, the extraction of the region of interest relies on a traditional classification algorithm and is not accurate enough.
The patent document "Compressed sensing image reconstruction method based on measurement domain block saliency detection" (application number: 201510226877.7, publication number: CN 105678699 A), filed by Xidian University, discloses a compressed sensing image reconstruction method based on measurement-domain block saliency detection. The method observes the image twice: the original image is divided into non-overlapping sub-blocks of equal size and observed a first time; the measured values of the sub-blocks are transformed with a traditional transform algorithm, and each sub-block is classified as a salient block or a non-salient block. The salient and non-salient blocks are then observed a second time at different sampling rates, and finally both are reconstructed by a traditional optimization method. Under a fixed observation budget, this method improves the utilization of observation resources by combining the resources of the two observations for reconstruction, but it has two defects: first, only the second observation information is used to reconstruct the image and the first observation information is discarded, which wastes resources; second, the method can only process grayscale images and cannot perform compressed sensing on color images.
Disclosure of Invention
The purpose of the invention is to provide a region-of-interest compressed sensing image reconstruction method based on deep learning that overcomes the above defects of the prior art. The invention achieves better reconstruction quality of the region of interest in the image under limited observation resources and has a high reconstruction speed.
The idea of the invention is to extract the region of interest from the initial recovery image obtained from a first observation, obtain a region-of-interest image, observe this image a second time, and combine the two sets of observation information for reconstruction. The two-observation scheme allocates more observation resources to the region of interest, so that a perceptually restored image with higher quality in the region of interest can be reconstructed.
The method comprises the following specific steps:
(1) constructing a region-of-interest perception reconstruction network:
(1a) establishing an interested region extracting sub-network in an interested region perception reconstruction network, wherein the sub-network comprises an eight-layer primary unified observation recovery module and a six-layer salient target region extracting module;
the structure of the first unified observation recovery module is as follows in sequence: the first convolution layer → the reverse convolution layer → the second convolution layer → the first residual block → the second residual block → the third convolution layer → the fourth convolution layer;
setting parameters of each layer of the primary unified observation recovery module;
the structure of the salient target area extraction module is as follows: the five convolutional layers are connected in sequence and followed by one pooling layer; the pooling layer is connected with the first, second, third and fourth convolution layers respectively; the fifth convolutional layer is connected with the first, second, third and fourth convolution layers respectively; the fourth convolutional layer is connected with the first and second convolution layers respectively; the third convolutional layer is connected with the first and second convolution layers respectively; and each convolutional layer and the pooling layer is followed by one softmax activation layer, forming six classifiers;
setting parameters of each layer of a saliency target area extraction module;
(1b) constructing a region-of-interest enhanced compressed sensing sub-network in the region-of-interest sensing reconstruction network:
the structure of the region-of-interest enhanced compressive sensing subnetwork sequentially comprises the following steps: the first convolution layer → the deconvolution layer → the second convolution layer → the first residual block → the second residual block → the third residual block → the fourth residual block → the fifth residual block → the sixth residual block → the seventh residual block → the third convolution layer → the fourth convolution layer, wherein the first convolution layer and the first layer convolution layer of the primary unified observation recovery module are both connected with the deconvolution layer;
setting parameters of each layer of the region-of-interest enhanced compressive sensing sub-network;
(2) training the interested region perception reconstruction network:
(2a) respectively inputting 3000 natural images into an interested region perception reconstruction network, and outputting a primary recovery image corresponding to each image through a primary unified observation recovery module; outputting an interested area image corresponding to each primary recovery image through a saliency target area extraction module; outputting a perception recovery image corresponding to each interested area image through the interested area enhancement compressed perception sub-network;
(2b) calculating the loss value of each input image and the corresponding initial recovery image by using a mean square error function;
(2c) calculating the loss value of the interesting region image and the salient region label image corresponding to the interesting region image by using a cross entropy function;
(2d) calculating loss values of each input image and the corresponding perception recovery image by using a mean square error function;
(2e) calculating a total loss value, and minimizing the total loss value by adopting a random gradient descent algorithm to obtain a trained region-of-interest perception reconstruction network;
(3) preprocessing a natural image to be reconstructed:
cutting the size of the natural image to be reconstructed into 256 × 256 pixels;
(4) acquiring first observation information:
inputting the preprocessed image into a primary unified observation recovery module, and carrying out primary observation through a first layer of convolution layer in the module to obtain primary observation information;
(5) obtaining a primary recovery image:
inputting the first observation information into a residual structure of a first unified observation recovery module for reconstruction, and outputting a first recovery image;
(6) acquiring an image of a region of interest:
inputting the primary recovery image into a saliency target region extraction module, and outputting an interested region image;
(7) acquiring second observation information:
inputting the image of the region of interest into the region of interest enhanced compressed sensing subnetwork, and obtaining second observation information through convolution operation of the first layer of convolution layer;
(8) reconstructing the perceptually restored image:
and combining the first observation information and the second observation information through concat operation, inputting the combined observation information into the residual structure of the region-of-interest enhanced compressed sensing subnetwork for reconstruction, and obtaining a sensing recovery image.
Compared with the prior art, the invention has the following advantages:
Firstly, the invention reconstructs the image with the region-of-interest enhanced compressed sensing sub-network, which avoids the high time complexity of the traditional iterative reconstruction used in the prior art; reconstruction with the invention is fast and can run in real time.
Secondly, the invention extracts the region of interest in the image with the salient target region extraction module, which overcomes the insufficient accuracy of the traditional classification algorithms used for this purpose in the prior art, so the region of interest is extracted accurately.
Thirdly, the invention combines the first observation information and the second observation information to reconstruct the perceptually restored image, which avoids the resource waste of prior-art methods that reconstruct only from the second observation information; even at a low observation rate, the invention can reconstruct a perceptually restored image with good quality in the region of interest.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a graph of simulation results at different observation rates using the present invention.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
The steps of the present invention are further described with reference to fig. 1.
Step 1, constructing a region-of-interest perception reconstruction network.
And constructing an interested region extracting sub-network in the interested region perception reconstruction network, wherein the sub-network comprises an eight-layer primary unified observation recovery module and a six-layer salient target region extracting module.
The structure of the first unified observation recovery module is as follows in sequence: first convolution layer → inverse convolution layer → second convolution layer → first residual block → second residual block → third convolution layer → fourth convolution layer.
And setting parameters of each layer of the initial unified observation recovery module.
The parameters of each layer of the initial unified observation recovery module are as follows:
the convolution kernel size of the first convolution layer is 32 × 32, the number of convolution kernels is 41, and the step size is 32.
The deconvolution kernel size of the deconvolution layer was 32 × 32, the number of convolution kernels was 1, and the step size was 32.
The convolution kernel size of the second convolution layer is 9 × 9, the number of convolution kernels is 64, and the step size is 1.
The convolution kernel size in the first, second and third residual blocks is 3 × 3, the number of convolution kernels is 64, and the step size is 1.
The convolution kernel size of the third convolution layer is 3 × 3, the number of convolution kernels is 64, and the step size is 1.
The convolution kernel size of the fourth convolution layer is 9 × 9, the number of convolution kernels is 1, and the step size is 1.
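For concreteness, the following is a minimal PyTorch sketch of an initial unified observation recovery module with the layer parameters listed above. Padding choices, activation functions, the internal layout of the residual blocks and the single-channel 256 × 256 input are assumptions that are not specified in this description; the first convolution with a 32 × 32 kernel and stride 32 plays the role of a learned block-wise observation matrix.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    # Two 3 x 3 convolutions with 64 kernels, stride 1 and a skip connection;
    # the internal layout (two convolutions + ReLU) is an assumption, the
    # description only gives kernel size, kernel count and stride.
    def __init__(self, channels=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, stride=1, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, stride=1, padding=1),
        )

    def forward(self, x):
        return x + self.body(x)

class InitialObservationRecovery(nn.Module):
    """Sketch of the eight-layer initial unified observation recovery module."""
    def __init__(self):
        super().__init__()
        # First convolution layer: 32 x 32 kernels, 41 kernels, stride 32.
        # A 1 x 256 x 256 image becomes a 41 x 8 x 8 measurement tensor.
        self.observe = nn.Conv2d(1, 41, kernel_size=32, stride=32)
        # Deconvolution layer: 32 x 32 kernel, 1 kernel, stride 32.
        self.deconv = nn.ConvTranspose2d(41, 1, kernel_size=32, stride=32)
        # Remaining layers: 9x9/64 conv, two residual blocks, 3x3/64 conv, 9x9/1 conv.
        self.recover = nn.Sequential(
            nn.Conv2d(1, 64, kernel_size=9, stride=1, padding=4),
            ResidualBlock(64),
            ResidualBlock(64),
            nn.Conv2d(64, 64, kernel_size=3, stride=1, padding=1),
            nn.Conv2d(64, 1, kernel_size=9, stride=1, padding=4),
        )

    def forward(self, x):
        y1 = self.observe(x)                      # first observation information
        init_rec = self.recover(self.deconv(y1))  # initial recovery image
        return y1, init_rec
```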
The structure of the salient target area extraction module is as follows: the five convolutional layers are sequentially connected with one pooling layer, the pooling layer is respectively connected with the first, second, third and fourth convolution layers, the fifth convolutional layer is respectively connected with the first, second, third and fourth convolution layers, the fourth convolutional layer is respectively connected with the first and second convolution layers, the third convolutional layer is respectively connected with the first and second convolution layers, and each convolutional layer and pooling layer are respectively connected with one softmax activation layer to form six classifiers.
And setting parameters of each layer of the saliency target area extraction module.
The parameters of each layer of the saliency target area extraction module are as follows:
the convolution kernel size of the first and second convolution layers is 3 × 3, the number of convolution kernels is 128, and the step size is 1.
The convolution kernel size of the third and fourth convolutional layers is 5 × 5, the number of convolution kernels is 256, and the step size is 2.
The convolution kernel size of the fifth convolution layer is 5 × 5, the number of convolution kernels is 512, and the step size is 2.
The convolution kernel size of the pooling layer was 7 × 7, the number of convolution kernels was 512, and the step size was 2.
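A loose PyTorch sketch of the salient target region extraction module is given below, reusing the imports from the previous sketch. Only the listed backbone parameters are taken from the description; the cross connections among the branches are omitted, and the 1 × 1 pixel-wise classification convolutions, the two-class softmax and the bilinear upsampling back to the input resolution are assumptions made so that the sketch yields six per-pixel classifiers.

```python
class SaliencyExtractor(nn.Module):
    """Loose sketch of the six-layer salient target region extraction module."""
    def __init__(self):
        super().__init__()
        # Backbone with the listed parameters: two 3x3/128 convs (stride 1),
        # two 5x5/256 convs (stride 2), one 5x5/512 conv (stride 2), and a
        # 7x7 pooling layer with stride 2.
        self.stages = nn.ModuleList([
            nn.Conv2d(1, 128, kernel_size=3, stride=1, padding=1),
            nn.Conv2d(128, 128, kernel_size=3, stride=1, padding=1),
            nn.Conv2d(128, 256, kernel_size=5, stride=2, padding=2),
            nn.Conv2d(256, 256, kernel_size=5, stride=2, padding=2),
            nn.Conv2d(256, 512, kernel_size=5, stride=2, padding=2),
        ])
        self.pool = nn.MaxPool2d(kernel_size=7, stride=2, padding=3)
        # One pixel-wise softmax classifier per convolutional layer plus one
        # after the pooling layer: six classifiers in total (the 1x1
        # classification convolutions are an assumption).
        self.classifiers = nn.ModuleList(
            [nn.Conv2d(c, 2, kernel_size=1) for c in (128, 128, 256, 256, 512, 512)])

    def forward(self, x):
        feats = []
        for conv in self.stages:
            x = torch.relu(conv(x))
            feats.append(x)
        feats.append(self.pool(x))
        maps = []
        for feat, cls in zip(feats, self.classifiers):
            logits = nn.functional.interpolate(
                cls(feat), size=(256, 256), mode='bilinear', align_corners=False)
            maps.append(torch.softmax(logits, dim=1))   # salient / non-salient
        return maps                                     # six saliency maps
```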
And constructing a region-of-interest enhanced compressed sensing sub-network in the region-of-interest sensing reconstruction network.
The structure of the region-of-interest enhanced compressive sensing subnetwork sequentially comprises the following steps: the first convolution layer → the deconvolution layer → the second convolution layer → the first residual block → the second residual block → the third residual block → the fourth residual block → the fifth residual block → the sixth residual block → the seventh residual block → the third convolution layer → the fourth convolution layer, wherein the first convolution layer and the first layer convolution layer of the primary unified observation recovery module are connected with the deconvolution layer.
Setting the region of interest enhances the parameters of each layer of the compressive sensing subnetwork.
The parameters of each layer of the region-of-interest enhanced compressed sensing sub-network are as follows:
the convolution kernel size of the first convolution layer is 32 × 32, the number of convolution kernels is 215, and the step size is 32.
The deconvolution kernel size of the deconvolution layer was 32 × 32, the number of convolution kernels was 1, and the step size was 32.
The convolution kernel size of the second convolution layer is 9 × 9, the number of convolution kernels is 64, and the step size is 1.
The sizes of convolution kernels in the first, second, third, fourth, fifth, sixth and seventh residual blocks are 3 x 3, the number of convolution kernels is 64, and the step size is 1.
The convolution kernel size of the third convolution layer is 3 × 3, the number of convolution kernels is 64, and the step size is 1.
The convolution kernel size of the fourth convolution layer is 9 × 9, the number of convolution kernels is 1, and the step size is 1.
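Continuing the sketch, the region-of-interest enhanced compressed sensing sub-network can be assembled as follows (reusing torch, nn and ResidualBlock from the earlier sketch). That the deconvolution layer receives the 41-channel first observation concatenated with the 215-channel second observation (256 channels in total) follows from the statement that both first convolution layers are connected to it; the padding values are assumptions.

```python
class ROIEnhancedCS(nn.Module):
    """Sketch of the region-of-interest enhanced compressed sensing sub-network."""
    def __init__(self):
        super().__init__()
        # First convolution layer: 32 x 32 kernels, 215 kernels, stride 32,
        # applied to the region-of-interest image (second observation).
        self.observe = nn.Conv2d(1, 215, kernel_size=32, stride=32)
        # The deconvolution layer receives the concatenation of the first
        # observation (41 channels) and the second observation (215 channels).
        self.deconv = nn.ConvTranspose2d(41 + 215, 1, kernel_size=32, stride=32)
        self.recover = nn.Sequential(
            nn.Conv2d(1, 64, kernel_size=9, stride=1, padding=4),
            *[ResidualBlock(64) for _ in range(7)],      # seven residual blocks
            nn.Conv2d(64, 64, kernel_size=3, stride=1, padding=1),
            nn.Conv2d(64, 1, kernel_size=9, stride=1, padding=4),
        )

    def forward(self, roi_image, y1):
        y2 = self.observe(roi_image)          # second observation information
        y = torch.cat([y1, y2], dim=1)        # concat operation
        return self.recover(self.deconv(y))   # perceptually restored image
```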
And 2, training the perception reconstruction network of the region of interest.
Respectively inputting 3000 natural images into an interested region perception reconstruction network, and outputting a primary recovery image corresponding to each image through a primary unified observation recovery module; outputting an interested area image corresponding to each primary recovery image through a saliency target area extraction module; and outputting a perception recovery image corresponding to each interested area image through the interested area enhancement compressed perception sub-network.
In the embodiment of the invention, the MSRA-B public data set, which contains 5000 natural images and 5000 corresponding salient-region label images, is downloaded; 3000 natural images and their corresponding salient-region label images are randomly selected from it as the training set, and the remaining images serve as the test set.
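A minimal sketch of such a random 3000/2000 split, assuming the image and label file paths are available as two aligned lists (the function name is illustrative):

```python
import random

def split_msra_b(image_paths, label_paths, n_train=3000, seed=0):
    """Randomly split the 5000 MSRA-B image / salient-region-label pairs
    into 3000 training pairs and 2000 test pairs."""
    pairs = list(zip(image_paths, label_paths))
    random.Random(seed).shuffle(pairs)
    return pairs[:n_train], pairs[n_train:]
```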
And calculating the loss value of each input image and the corresponding initial recovery image by using a mean square error function.
The mean square error function is as follows:
L_i = \frac{1}{n} \sum_{j=1}^{n} \left\| x_j^i - \hat{x}_j^i \right\|_2^2

wherein L_i represents the loss value of the ith input image and its corresponding initial recovery image (or of its corresponding perceptually restored image); c, w and h respectively represent the channel number, width and height of the ith input image; n represents the total number of pixels of the ith input image; j represents the serial number of a pixel in the ith input image and takes values in [1, 65536]; \sum represents the summation operation; \| \cdot \|_2 represents the two-norm operation; x_j^i represents the jth pixel value of the ith input image; and \hat{x}_j^i represents the jth pixel value of the initial recovery image (or of the perceptually restored image) corresponding to the ith input image.
And calculating the loss value of the region-of-interest image and the salient region label image corresponding to the image by using a cross entropy function.
In the embodiment of the invention, the salient region label image is from the MSRA-B public data set and is the salient region label image corresponding to the input natural image.
The cross entropy function is as follows:
L_i^{roi} = -\sum_{m=1}^{M} \alpha_m \sum_{j=1}^{n} \left[ y_j^i \log p_j^{m,1} + \left(1 - y_j^i\right) \log p_j^{m,0} \right]

wherein L_i^{roi} represents the loss value between the region-of-interest image obtained from the ith initial recovery image and the salient-region label image corresponding to that image; M represents the total number of classifiers of the salient target region extraction module; m represents the serial number of a classifier; \alpha_m represents the weight of the mth classifier; y_j^i represents the label corresponding to the jth pixel of the region-of-interest image; log represents the base-10 logarithm; p_j^{m,1} represents the activation value for label 1 of the jth pixel of the region-of-interest image output by the mth classifier; p_j^{m,0} represents the activation value for label 0 of the jth pixel of the region-of-interest image output by the mth classifier; and the inner summation runs over the pixels j of the region-of-interest image corresponding to the ith input image.
And calculating the loss value of each input image and the corresponding perception recovery image by using a mean square error function.
And calculating a total loss value, and minimizing the total loss value by adopting a random gradient descent algorithm to obtain a trained region-of-interest perception reconstruction network.
The total loss value is calculated by the following formula:
l_i = L_i^{init} + \lambda_1 L_i^{roi} + \lambda_2 L_i^{rec}

wherein l_i represents the total loss value corresponding to the ith input image; L_i^{init} represents the loss value of the ith input image and its corresponding initial recovery image; L_i^{roi} represents the loss value of the ith region-of-interest image and its corresponding salient-region label image; and L_i^{rec} represents the loss value of the ith input image and its corresponding perceptually restored image. Because the mean square error loss values and the cross entropy loss value are not of the same order of magnitude, and because the method wishes to strengthen the training of the region-of-interest enhanced compressed sensing sub-network, the invention sets \lambda_1 and \lambda_2 as the weight coefficients of the loss values L_i^{roi} and L_i^{rec}, respectively.
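Under the notation above, the three loss terms and the total loss can be written roughly as follows in PyTorch (reusing the earlier imports). The values of \lambda_1, \lambda_2 and of the classifier weights \alpha_m are not given here and are treated as hyper-parameters, and which softmax channel corresponds to label 1 is an assumption.

```python
def training_loss(x, init_rec, perc_rec, sal_maps, sal_label, alphas, lam1, lam2):
    """Total loss for one batch (sketch).

    x, init_rec, perc_rec : (N, 1, 256, 256) tensors
    sal_maps              : list of six (N, 2, 256, 256) softmax outputs
    sal_label             : (N, 256, 256) tensor of 0/1 salient-region labels
    alphas                : six classifier weights alpha_m
    lam1, lam2            : weight coefficients lambda_1, lambda_2
    """
    n = x[0].numel()                              # pixels per image
    l_init = torch.sum((x - init_rec) ** 2) / n   # step (2b): mean square error
    l_rec = torch.sum((x - perc_rec) ** 2) / n    # step (2d): mean square error
    l_roi = x.new_zeros(())                       # step (2c): cross entropy
    for a_m, p in zip(alphas, sal_maps):
        p1, p0 = p[:, 1], p[:, 0]                 # activations for labels 1 / 0
        l_roi = l_roi - a_m * torch.sum(
            sal_label * torch.log10(p1 + 1e-12)
            + (1 - sal_label) * torch.log10(p0 + 1e-12))
    return l_init + lam1 * l_roi + lam2 * l_rec   # step (2e): total loss
```

The total loss is then minimized with stochastic gradient descent, for example with an optimizer such as torch.optim.SGD over the parameters of the whole region-of-interest perception reconstruction network.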
And 3, preprocessing the natural image to be reconstructed.
The size of the natural image to be reconstructed is cropped to 256 × 256 pixels.
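A small preprocessing sketch follows; whether the 256 × 256 crop is taken from the image center and whether the image is converted to grayscale are assumptions, as the description only specifies the output size.

```python
import numpy as np
import torch
from PIL import Image

def preprocess(path):
    """Crop a natural image to 256 x 256 pixels and return a (1, 1, 256, 256) tensor."""
    img = Image.open(path).convert('L')                     # grayscale
    w, h = img.size
    left, top = (w - 256) // 2, (h - 256) // 2              # center crop
    img = img.crop((left, top, left + 256, top + 256))
    arr = np.asarray(img, dtype=np.float32) / 255.0
    return torch.from_numpy(arr).unsqueeze(0).unsqueeze(0)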
And 4, acquiring first observation information.
And inputting the preprocessed image into a primary unified observation recovery module, and carrying out primary observation through a first layer of convolution layer in the module to obtain primary observation information.
And 5, obtaining an initial recovery image.
And inputting the first observation information into the residual structure of the primary unified observation recovery module for reconstruction, and outputting a primary recovery image.
And 6, acquiring an image of the region of interest.
And inputting the initial recovery image into a saliency target region extraction module, and outputting an interested region image.
And 7, acquiring second observation information.
And inputting the image of the region of interest into the region of interest enhancement compressed sensing subnetwork, and obtaining second observation information through convolution operation of the first layer of convolution layer.
And 8, reconstructing a perception recovery image.
And combining the first observation information and the second observation information through concat operation, inputting the combined observation information into the residual structure of the region-of-interest enhanced compressed sensing subnetwork for reconstruction, and obtaining a sensing recovery image.
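Putting steps 3 to 8 together, a single forward pass might look as follows with the module classes sketched earlier. How the region-of-interest image is formed from the saliency output (here, masking the initial recovery image with the predicted salient region) and the use of the last classifier's map are assumptions; the file name is illustrative.

```python
# Steps 3-8 for one image, with the module classes sketched earlier.
net_init = InitialObservationRecovery()
net_sal = SaliencyExtractor()
net_roi = ROIEnhancedCS()

x = preprocess('example.png')                  # step 3 (illustrative file name)
y1, init_rec = net_init(x)                     # steps 4-5: first observation, initial recovery
sal = net_sal(init_rec)[-1]                    # step 6: saliency map of the last classifier
mask = sal.argmax(dim=1, keepdim=True).float()
roi_image = init_rec * mask                    # region-of-interest image (masking is assumed)
perc_rec = net_roi(roi_image, y1)              # steps 7-8: second observation, concat, reconstruction
```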
The effects of the present invention can be further illustrated by the following simulations.
1. Simulation experiment conditions are as follows:
The hardware environment used in the simulation experiments of the invention is an NVIDIA TITAN XP GPU, a CPU with a main frequency of 3.4 GHz and 128 GB of memory; the software environment is PyTorch and PyCharm 2017.
2. Simulation content:
In the simulation experiments of the invention, the MSRA-B public data set is downloaded and 6 of its images are taken as test images; the test images are reconstructed at observation rates of 8%, 11% and 15% using the method of the invention and an image reconstruction method of the prior art (a compressed sensing image reconstruction method based on deep learning), respectively.
3. Simulation result analysis:
FIG. 2(a) is a simulation of the use of the method of the present invention and the prior art at an observation rate of 8%. Wherein, fig. 2(a1) is the first image of 6 test images, fig. 2(a2) is a partial enlarged view of the region of interest in the first test image, fig. 2(a3) is a partial enlarged view of the region of interest in the reconstructed image obtained by simulating the first test image using the method of the present invention, and fig. 2(a4) is a partial enlarged view of the region of interest in the reconstructed image obtained by simulating the first test image using the method of the prior art.
Fig. 2(a5) is the second image of the 6 test images, fig. 2(a6) is a partial enlarged view of the region of interest in the second test image, fig. 2(a7) is a partial enlarged view of the region of interest in the reconstructed image obtained by simulating the second test image with the method of the present invention, and fig. 2(a8) is a partial enlarged view of the region of interest in the reconstructed image obtained by simulating the second test image with the method of the prior art.
It can be seen from fig. 2(a4) and fig. 2(a8) that the regions of interest in the images simulated with the prior art at an observation rate of 8% are blurred: the letter texture in fig. 2(a4) is not clear, and the feather texture of the bird in fig. 2(a8) lacks detail. It can be seen from fig. 2(a3) and fig. 2(a7) that the regions of interest in the images reconstructed by the method of the present invention are closer to the test images; at a sampling rate of 8%, the image quality of the region of interest in the results of the present invention is better than that of the prior art.
FIG. 2(b) is a simulation of the use of the method of the present invention and the prior art at an observation rate of 11%. Wherein, fig. 2(b1) is the third image of the 6 test images, fig. 2(b2) is a partial enlarged view of the region of interest in the third test image, fig. 2(b3) is a partial enlarged view of the region of interest in the reconstructed image obtained by simulating the third test image by the method of the present invention, and fig. 2(b4) is a partial enlarged view of the region of interest in the reconstructed image obtained by simulating the third test image by the method of the prior art.
Fig. 2(b5) is a fourth image of the 6 test images, fig. 2(b6) is a partial enlarged view of the region of interest in the fourth test image, fig. 2(b7) is a partial enlarged view of the region of interest in the reconstructed image obtained by simulation of the fourth test image using the method of the present invention, and fig. 2(b8) is a partial enlarged view of the region of interest in the reconstructed image obtained by simulation of the fourth test image using the method of the prior art.
As can be seen from fig. 2(b4) and fig. 2(b8), the regions of interest in the images simulated with the prior art at an observation rate of 11% are blurred: the pattern and text texture on the bottle in fig. 2(b4) are not clear, and the water drops on the petal in fig. 2(b8) are not clear. It can be seen from fig. 2(b3) and fig. 2(b7) that the regions of interest in the images reconstructed by the method of the present invention are closer to the test images; at a sampling rate of 11%, the image quality of the region of interest in the results of the present invention is better than that of the prior art.
FIG. 2(c) is a simulation using the method of the present invention and the prior art at an observation rate of 15%. Wherein, fig. 2(c1) is the fifth image of the 6 test images, fig. 2(c2) is a partial enlarged view of the region of interest in the fifth test image, fig. 2(c3) is a partial enlarged view of the region of interest in the reconstructed image obtained after simulation of the fifth test image by the method of the present invention, and fig. 2(c4) is a partial enlarged view of the region of interest in the reconstructed image obtained after simulation of the fifth test image by the method of the prior art.
Fig. 2(c5) is a sixth image of the 6 test images, fig. 2(c6) is a partial enlarged view of the region of interest in the sixth test image, fig. 2(c7) is a partial enlarged view of the region of interest in the reconstructed image obtained by simulation of the sixth test image using the method of the present invention, and fig. 2(c8) is a partial enlarged view of the region of interest in the reconstructed image obtained by simulation of the sixth test image using the method of the prior art.
As can be seen from fig. 2(c4) and fig. 2(c8), the regions of interest in the images simulated with the prior art at an observation rate of 15% are blurred: the text texture on the signboard in fig. 2(c4) is not clear, and the details of the stamen of the flower in fig. 2(c8) are also not clear. It can be seen from fig. 2(c3) and fig. 2(c7) that the regions of interest in the images reconstructed by the method of the present invention are closer to the test images; at a sampling rate of 15%, the image quality of the region of interest in the results of the present invention is better than that of the prior art.
In order to better compare the recovery quality of the region of interest in the reconstructed images of the present invention and the prior art, the peak signal-to-noise ratio (PSNR) of the locally enlarged image of the region of interest in the simulation result graphs of the different methods was calculated, and the final data is shown in the following table.
Method \ test image        1       2       3       4       5       6       Mean
Method of the invention    27.82   28.52   29.83   31.58   19.12   30.24   27.85
Prior art                  23.24   24.55   26.78   30.51   16.86   26.41   24.72
It can be seen from the above table that the peak signal-to-noise ratio (PSNR) of the locally enlarged region-of-interest image in the results obtained by the method of the present invention is higher than the PSNR obtained with the prior art; that is, compared with the existing reconstruction method, the present invention improves the reconstruction quality of the region of interest in the image.
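For reference, the PSNR values in the table follow the standard definition, which can be computed for the region-of-interest crops roughly as follows (generic formula, not specific to the patent):

```python
import numpy as np

def psnr(reference, reconstructed, peak=255.0):
    """Standard peak signal-to-noise ratio between two image crops."""
    ref = np.asarray(reference, dtype=np.float64)
    rec = np.asarray(reconstructed, dtype=np.float64)
    mse = np.mean((ref - rec) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)
```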

Claims (7)

1. A method for reconstructing a compressed sensing image of an interested area based on deep learning is characterized by comprising the following steps of constructing a sensing reconstruction network of the interested area, acquiring first observation information and a first restored image by using a first unified observation and restoration module, acquiring the image of the interested area by using a saliency target area extraction module, acquiring second observation information by using an interested area enhanced compressed sensing subnetwork, and reconstructing the image after the second observation information is combined with the observation information of the first unified observation and restoration module, wherein the method comprises the following specific steps:
(1) constructing a region-of-interest perception reconstruction network:
(1a) establishing an interested region extracting sub-network in an interested region perception reconstruction network, wherein the sub-network comprises an eight-layer primary unified observation recovery module and a six-layer salient target region extracting module;
the structure of the first unified observation recovery module is as follows in sequence: the first convolution layer → the reverse convolution layer → the second convolution layer → the first residual block → the second residual block → the third convolution layer → the fourth convolution layer;
setting parameters of each layer of the primary unified observation recovery module;
the structure of the salient target area extraction module is as follows: the five convolutional layers are connected in sequence and followed by one pooling layer; the pooling layer is connected with the first, second, third and fourth convolution layers respectively; the fifth convolutional layer is connected with the first, second, third and fourth convolution layers respectively; the fourth convolutional layer is connected with the first and second convolution layers respectively; the third convolutional layer is connected with the first and second convolution layers respectively; and each convolutional layer and the pooling layer is followed by one softmax activation layer, forming six classifiers;
setting parameters of each layer of a saliency target area extraction module;
(1b) constructing a region-of-interest enhanced compressed sensing sub-network in the region-of-interest sensing reconstruction network:
the structure of the region-of-interest enhanced compressive sensing subnetwork sequentially comprises the following steps: the first convolution layer → the deconvolution layer → the second convolution layer → the first residual block → the second residual block → the third residual block → the fourth residual block → the fifth residual block → the sixth residual block → the seventh residual block → the third convolution layer → the fourth convolution layer, wherein the first convolution layer and the first layer convolution layer of the primary unified observation recovery module are both connected with the deconvolution layer;
setting parameters of each layer of the region-of-interest enhanced compressive sensing sub-network;
(2) training the interested region perception reconstruction network:
(2a) respectively inputting 3000 natural images into an interested region perception reconstruction network, and outputting a primary recovery image corresponding to each image through a primary unified observation recovery module; outputting an interested area image corresponding to each primary recovery image through a saliency target area extraction module; outputting a perception recovery image corresponding to each interested area image through the interested area enhancement compressed perception sub-network;
(2b) calculating the loss value of each input image and the corresponding initial recovery image by using a mean square error function;
(2c) calculating the loss value of the interesting region image and the salient region label image corresponding to the interesting region image by using a cross entropy function;
(2d) calculating loss values of each input image and the corresponding perception recovery image by using a mean square error function;
(2e) calculating a total loss value, and minimizing the total loss value by adopting a random gradient descent algorithm to obtain a trained region-of-interest perception reconstruction network;
(3) preprocessing a natural image to be reconstructed:
cutting the size of the natural image to be reconstructed into 256 × 256 pixels;
(4) acquiring first observation information:
inputting the preprocessed image into a primary unified observation recovery module, and carrying out primary observation through a first layer of convolution layer in the module to obtain primary observation information;
(5) obtaining a primary recovery image:
inputting the first observation information into a residual structure of a first unified observation recovery module for reconstruction, and outputting a first recovery image;
(6) acquiring an image of a region of interest:
inputting the primary recovery image into a saliency target region extraction module, and outputting an interested region image;
(7) acquiring second observation information:
inputting the image of the region of interest into the region of interest enhanced compressed sensing subnetwork, and obtaining second observation information through convolution operation of the first layer of convolution layer;
(8) reconstructing the perceptually restored image:
and combining the first observation information and the second observation information through concat operation, inputting the combined observation information into the residual structure of the region-of-interest enhanced compressed sensing subnetwork for reconstruction, and obtaining a sensing recovery image.
2. The method for reconstructing a compressed sensing image of a region of interest based on deep learning according to claim 1, wherein the parameters of the layers of the primary unified observation restoration module in the step (1a) are as follows:
the convolution kernel size of the first convolution layer is 32 × 32, the number of convolution kernels is 41, and the step size is 32;
the deconvolution kernel size of the deconvolution layer is 32 × 32, the number of convolution kernels is 1, and the step size is 32;
the convolution kernel size of the second convolution layer is 9 × 9, the number of convolution kernels is 64, and the step size is 1;
the convolution kernel sizes in the first, second and third residual blocks are 3 × 3, the number of convolution kernels is 64, and the step size is 1;
the convolution kernel size of the third convolution layer is 3 × 3, the number of convolution kernels is 64, and the step size is 1;
the convolution kernel size of the fourth convolution layer is 9 × 9, the number of convolution kernels is 1, and the step size is 1.
3. The method for reconstructing a compressed sensing image of a region of interest based on deep learning according to claim 1, wherein the parameters of the layers of the saliency target region extraction module in step (1a) are as follows:
the convolution kernel size of the first and second convolution layers is 3 × 3, the number of convolution kernels is 128, and the step size is 1;
the convolution kernel size of the third and fourth convolution layers is 5 × 5, the number of convolution kernels is 256, and the step size is 2;
the convolution kernel size of the fifth convolution layer is 5 × 5, the number of convolution kernels is 512, and the step size is 2;
the convolution kernel size of the pooling layer was 7 × 7, the number of convolution kernels was 512, and the step size was 2.
4. The method for reconstructing a compressed sensing image of a region of interest based on deep learning according to claim 1, wherein the parameters of the layers of the region of interest enhanced compressed sensing sub-network in step (1b) are as follows:
the convolution kernel size of the first convolution layer is 32 × 32, the number of convolution kernels is 215, and the step size is 32;
the deconvolution kernel size of the deconvolution layer is 32 × 32, the number of convolution kernels is 1, and the step size is 32;
the convolution kernel size of the second convolution layer is 9 × 9, the number of convolution kernels is 64, and the step size is 1;
the convolution kernel sizes in the first, second, third, fourth, fifth, sixth and seventh residual blocks are 3 × 3, the number of convolution kernels is 64, and the step size is 1;
the convolution kernel size of the third convolution layer is 3 × 3, the number of convolution kernels is 64, and the step size is 1;
the convolution kernel size of the fourth convolution layer is 9 × 9, the number of convolution kernels is 1, and the step size is 1.
5. The method for reconstructing a compressed sensing image of a region of interest based on deep learning according to claim 1, wherein the mean square error functions in steps (2b) and (2d) are as follows:
L_i = \frac{1}{n} \sum_{j=1}^{n} \left\| x_j^i - \hat{x}_j^i \right\|_2^2

wherein L_i represents the loss value of the ith input image and its corresponding initial recovery image (or of its corresponding perceptually restored image); c, w and h respectively represent the channel number, width and height of the ith input image; n represents the total number of pixels of the ith input image; j represents the serial number of a pixel in the ith input image and takes values in [1, 65536]; \sum represents the summation operation; \| \cdot \|_2 represents the two-norm operation; x_j^i represents the jth pixel value of the ith input image; and \hat{x}_j^i represents the jth pixel value of the initial recovery image (or of the perceptually restored image) corresponding to the ith input image.
6. The method for reconstructing a compressed sensing image of a region of interest based on deep learning according to claim 5, wherein the cross entropy function in step (2c) is as follows:
L_i^{roi} = -\sum_{m=1}^{M} \alpha_m \sum_{j=1}^{n} \left[ y_j^i \log p_j^{m,1} + \left(1 - y_j^i\right) \log p_j^{m,0} \right]

wherein L_i^{roi} represents the loss value between the region-of-interest image obtained from the ith initial recovery image and the salient-region label image corresponding to that image; M represents the total number of classifiers of the salient target region extraction module; m represents the serial number of a classifier; \alpha_m represents the weight of the mth classifier; y_j^i represents the label corresponding to the jth pixel of the region-of-interest image; log represents the base-10 logarithm; p_j^{m,1} represents the activation value for label 1 of the jth pixel of the region-of-interest image output by the mth classifier; p_j^{m,0} represents the activation value for label 0 of the jth pixel of the region-of-interest image output by the mth classifier; and the inner summation runs over the pixels j of the region-of-interest image corresponding to the ith input image.
7. The method for reconstructing a compressed sensing image of a region of interest based on deep learning according to claim 1, wherein the total loss value in step (2e) is calculated by the following formula:
l_i = L_i^{init} + \lambda_1 L_i^{roi} + \lambda_2 L_i^{rec}

wherein l_i represents the total loss value corresponding to the ith input image; L_i^{init} represents the loss value of the ith input image and its corresponding initial recovery image; L_i^{roi} represents the loss value of the ith region-of-interest image and its corresponding salient-region label image; L_i^{rec} represents the loss value of the ith input image and its corresponding perceptually restored image; and \lambda_1 and \lambda_2 respectively represent the weight coefficients corresponding to the loss values L_i^{roi} and L_i^{rec}, set according to their different orders of magnitude.
CN201910166307.1A 2019-03-06 2019-03-06 Region-of-interest compressed sensing image reconstruction method based on deep learning Active CN109949257B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910166307.1A CN109949257B (en) 2019-03-06 2019-03-06 Region-of-interest compressed sensing image reconstruction method based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910166307.1A CN109949257B (en) 2019-03-06 2019-03-06 Region-of-interest compressed sensing image reconstruction method based on deep learning

Publications (2)

Publication Number Publication Date
CN109949257A CN109949257A (en) 2019-06-28
CN109949257B true CN109949257B (en) 2021-09-10

Family

ID=67008943

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910166307.1A Active CN109949257B (en) 2019-03-06 2019-03-06 Region-of-interest compressed sensing image reconstruction method based on deep learning

Country Status (1)

Country Link
CN (1) CN109949257B (en)

Families Citing this family (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018176000A1 (en) 2017-03-23 2018-09-27 DeepScale, Inc. Data synthesis for autonomous control systems
US10671349B2 (en) 2017-07-24 2020-06-02 Tesla, Inc. Accelerated mathematical engine
US11893393B2 (en) 2017-07-24 2024-02-06 Tesla, Inc. Computational array microprocessor system with hardware arbiter managing memory requests
US11409692B2 (en) 2017-07-24 2022-08-09 Tesla, Inc. Vector computational unit
US11157441B2 (en) 2017-07-24 2021-10-26 Tesla, Inc. Computational array microprocessor system using non-consecutive data formatting
US11561791B2 (en) 2018-02-01 2023-01-24 Tesla, Inc. Vector computational unit receiving data elements in parallel from a last row of a computational array
US11215999B2 (en) 2018-06-20 2022-01-04 Tesla, Inc. Data pipeline and deep learning system for autonomous driving
US11361457B2 (en) 2018-07-20 2022-06-14 Tesla, Inc. Annotation cross-labeling for autonomous control systems
US11636333B2 (en) 2018-07-26 2023-04-25 Tesla, Inc. Optimizing neural network structures for embedded systems
US11562231B2 (en) 2018-09-03 2023-01-24 Tesla, Inc. Neural networks for embedded devices
WO2020077117A1 (en) 2018-10-11 2020-04-16 Tesla, Inc. Systems and methods for training machine models with augmented data
US11196678B2 (en) 2018-10-25 2021-12-07 Tesla, Inc. QOS manager for system on a chip communications
US11816585B2 (en) 2018-12-03 2023-11-14 Tesla, Inc. Machine learning models operating at different frequencies for autonomous vehicles
US11537811B2 (en) 2018-12-04 2022-12-27 Tesla, Inc. Enhanced object detection for autonomous vehicles based on field view
US11610117B2 (en) 2018-12-27 2023-03-21 Tesla, Inc. System and method for adapting a neural network model on a hardware platform
US11150664B2 (en) 2019-02-01 2021-10-19 Tesla, Inc. Predicting three-dimensional features for autonomous driving
US10997461B2 (en) 2019-02-01 2021-05-04 Tesla, Inc. Generating ground truth for machine learning from time series elements
US11567514B2 (en) 2019-02-11 2023-01-31 Tesla, Inc. Autonomous and user controlled vehicle summon to a target
US10956755B2 (en) 2019-02-19 2021-03-23 Tesla, Inc. Estimating object properties using visual image data
CN111428751B (en) * 2020-02-24 2022-12-23 清华大学 Object detection method based on compressed sensing and convolutional network
CN115311174B (en) * 2022-10-10 2023-03-24 深圳大学 Training method and device for image recovery network and computer readable storage medium

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9275294B2 (en) * 2011-11-03 2016-03-01 Siemens Aktiengesellschaft Compressed sensing using regional sparsity
CN104103052A (en) * 2013-04-11 2014-10-15 北京大学 Sparse representation-based image super-resolution reconstruction method
CN104280705A (en) * 2014-09-30 2015-01-14 深圳先进技术研究院 Magnetic resonance image reconstruction method and device based on compressed sensing
CN105741252A (en) * 2015-11-17 2016-07-06 西安电子科技大学 Sparse representation and dictionary learning-based video image layered reconstruction method
CN108510464A (en) * 2018-01-30 2018-09-07 西安电子科技大学 Compressed sensing network and full figure reconstructing method based on piecemeal observation

Also Published As

Publication number Publication date
CN109949257A (en) 2019-06-28

Similar Documents

Publication Publication Date Title
CN109949257B (en) Region-of-interest compressed sensing image reconstruction method based on deep learning
WO2021184891A1 (en) Remotely-sensed image-based terrain classification method, and system
CN109410261B (en) Monocular image depth estimation method based on pyramid pooling module
CN111127374B (en) Pan-sharing method based on multi-scale dense network
CN107229918A (en) A kind of SAR image object detection method based on full convolutional neural networks
CN111986099A (en) Tillage monitoring method and system based on convolutional neural network with residual error correction fused
CN111080567A (en) Remote sensing image fusion method and system based on multi-scale dynamic convolution neural network
CN109978854B (en) Screen content image quality evaluation method based on edge and structural features
CN110969124A (en) Two-dimensional human body posture estimation method and system based on lightweight multi-branch network
CN111027590B (en) Breast cancer data classification method combining deep network features and machine learning model
CN111160114B (en) Gesture recognition method, gesture recognition device, gesture recognition equipment and computer-readable storage medium
CN109872305A (en) It is a kind of based on Quality Map generate network without reference stereo image quality evaluation method
CN112257741B (en) Method for detecting generative anti-false picture based on complex neural network
CN115953303B (en) Multi-scale image compressed sensing reconstruction method and system combining channel attention
CN109961446A (en) CT/MR three-dimensional image segmentation processing method, device, equipment and medium
CN114581347B (en) Optical remote sensing spatial spectrum fusion method, device, equipment and medium without reference image
CN112232328A (en) Remote sensing image building area extraction method and device based on convolutional neural network
CN114494015B (en) Image reconstruction method based on blind super-resolution network
CN111178121A (en) Pest image positioning and identifying method based on spatial feature and depth feature enhancement technology
CN111640116A (en) Aerial photography graph building segmentation method and device based on deep convolutional residual error network
CN114972646B (en) Method and system for extracting and modifying independent ground objects of live-action three-dimensional model
CN108256557A (en) The hyperspectral image classification method integrated with reference to deep learning and neighborhood
CN115527657A (en) Image and image multi-mode reconstruction, imaging and labeling based on medical digital imaging and communication
CN111666813A (en) Subcutaneous sweat gland extraction method based on three-dimensional convolutional neural network of non-local information
CN117474764A (en) High-resolution reconstruction method for remote sensing image under complex degradation model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant