CN114170076A - Method for extracting target object information from video based on super-resolution and application - Google Patents
- Publication number: CN114170076A
- Application number: CN202111272433.9A
- Authority: CN (China)
- Prior art keywords: target object, super-resolution, layer, video, resolution
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06T3/4053—Scaling of whole images or parts thereof based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
- G06F18/2135—Feature extraction based on approximation criteria, e.g. principal component analysis
- G06N3/044—Recurrent networks, e.g. Hopfield networks
- G06N3/045—Combinations of networks
- G06N3/08—Learning methods
- G06T3/4038—Image mosaicing, e.g. composing plane images from plane sub-images
- G06T5/70—Denoising; Smoothing
- G06T7/194—Segmentation; Edge detection involving foreground-background segmentation
- G06T7/90—Determination of colour characteristics
- G06T2207/10016—Video; Image sequence
- G06T2207/10116—X-ray image
- G06T2207/20048—Transform domain processing
- G06T2207/20081—Training; Learning
- G06T2207/30101—Blood vessel; Artery; Vein; Vascular
Abstract
The invention relates to a super-resolution-based method for extracting target object information from a video, and to an application thereof. The method comprises: acquiring a video sequence containing a target object; dividing the video sequence into sub-blocks, inputting the sub-blocks into a trained deep unfolding network model for solving, and stitching the outputs to obtain a prediction result for the target object. The deep unfolding network model is a convolutional robust principal component analysis (RPCA) deep unfolding network, constructed by combining a super-resolution module with a deep unfolding algorithm based on robust principal component analysis. Compared with the prior art, the method offers high real-time performance, interference suppression, and accurate detection; when applied to X-ray angiography video, it effectively reduces the influence of background vessel-like structures and complex mixed noise, and markedly improves the extraction of small vessels.
Description
Technical Field
The invention relates to the technical field of information extraction, in particular to video processing, and specifically to a super-resolution-based method for extracting target object information from a video and an application thereof.
Background
In the information field, it is often necessary to extract target object information from a video sequence. An X-ray angiography video sequence is one example: accurate vessel information is what the technician needs to acquire from it. Owing to the mechanism of X-ray projection imaging, such video frames contain numerous structures besides the contrast agent flowing through the vessels, such as bones, lungs, the diaphragm, and other human tissues and organs. In addition, various mixed noises inevitably arise during imaging. These background structures and mixed noises interfere with the identification of vessel information, and thereby hinder its further analysis and accurate clinical diagnosis. The background layer of the video sequence therefore needs to be separated from the vessel layer, yielding a vessel-layer video sequence from which vessel information is easier to obtain.
At present, robust principal component analysis (RPCA) is the algorithm that performs best at extracting the vessel layer of an X-ray angiography video sequence [Jin, M., Li, R., Jiang, J. and Qin, B., 2017. Extracting contrast-filled vessels in X-ray angiography by graduated RPCA with motion coherency constraint. Pattern Recognition, 63, pp. 653-666]. From the viewpoint of motion analysis, the algorithm decomposes a video sequence into a low-rank matrix and a sparse matrix: the low-rank matrix represents the background layer, which is highly self-similar and changes little between frames, while the sparse matrix represents the target object layer, which is sparsely distributed and moves substantially.
The traditional RPCA algorithm has limitations for vessel extraction from X-ray angiography video sequences. First, it requires a large number of iterative computations, so its time and space efficiency are low, which restricts clinical application. Second, the human tissues and organs in the background layer of an X-ray image are not completely static, and even slight movements of these structures strongly affect the algorithm's results. Meanwhile, X-ray images contain a great deal of complex mixed noise, which corrupts the vessel information, especially that of small vessel branches. Consequently, interference from background tissues and organs, together with complex mixed noise, prevents the traditional RPCA algorithm from accurately separating the vessel layer from the background layer.
In addition, some image segmentation techniques are used in vessel segmentation work to obtain the vessel region of an image. Common methods include image enhancement techniques, deformable models, and vessel tracking. These methods are usually based on vessel morphology or image grey values. With them, vessel-like structures in the image background and the complex Gaussian-Poisson mixed noise in the image greatly disturb the segmentation result, making foreground and background hard to distinguish in some regions. Moreover, their segmentation results focus on extracting the contour features of the vessel structure while ignoring the grey-value information of the vessels in the original image.
Therefore, existing methods cannot extract vessel information from an X-ray angiography video sequence both quickly and accurately, which impedes further diagnostic work such as quantification and functional analysis based on restoring the shape and grey values of the contrast-filled vessels. In summary, the drawbacks of existing vessel extraction algorithms are as follows:
1. low time and space efficiency of vessel extraction;
2. the extracted contrast-vessel image contains background tissue and organ structures as well as noise;
3. small vessel-branch information is not retained in the extracted contrast-vessel image.
Disclosure of Invention
The object of the invention is to overcome the above defects of the prior art and to provide a super-resolution-based method, with high real-time performance and accurate detection, for extracting target object information from a video, together with an application thereof.
The purpose of the invention can be realized by the following technical scheme:
a super-resolution-based method for extracting target object information from a video is applied to a transmission imaging video, and comprises the following steps:
acquiring a video sequence containing a target object;
dividing the video sequence into sub-blocks, inputting the sub-blocks into a trained deep unfolding network model for solving, and stitching the outputs to obtain a prediction result for the target object;
the deep unfolding network model is a convolutional robust principal component analysis deep unfolding network, constructed by combining a super-resolution module with a deep unfolding algorithm based on robust principal component analysis.
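The split-solve-stitch procedure described above can be sketched as follows. This is a minimal numpy sketch with non-overlapping square sub-blocks; `model_fn` is a placeholder stand-in for the trained deep unfolding network, not the patent's implementation:

```python
import numpy as np

def split_into_subblocks(frames, block):
    """Split each frame of a (T, H, W) sequence into non-overlapping
    block x block sub-blocks (H and W assumed divisible by `block`)."""
    T, H, W = frames.shape
    blocks = (frames
              .reshape(T, H // block, block, W // block, block)
              .transpose(1, 3, 0, 2, 4))         # (nH, nW, T, block, block)
    return blocks.reshape(-1, T, block, block)    # one temporal stack per sub-block

def stitch_subblocks(blocks, H, W):
    """Inverse of split_into_subblocks: reassemble the full frames."""
    n, T, b, _ = blocks.shape
    nH, nW = H // b, W // b
    return (blocks.reshape(nH, nW, T, b, b)
                  .transpose(2, 0, 3, 1, 4)
                  .reshape(T, H, W))

def extract_target(frames, model_fn, block=32):
    """Run a per-sub-block solver and stitch its outputs back together."""
    T, H, W = frames.shape
    outputs = np.stack([model_fn(sb) for sb in split_into_subblocks(frames, block)])
    return stitch_subblocks(outputs, H, W)
```

With `model_fn` set to the identity, the pipeline reproduces the input exactly, which confirms that splitting and stitching are lossless.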
Further, the deep unfolding network model is constructed as follows:
constructing a robust principal component analysis model based on the characteristics of the video, and converting it into its Lagrangian form;
solving the Lagrangian-form model iteratively to obtain the update formulas of each motion layer of the video;
deep-unfolding the update formulas of the motion layers of the video into a number of iteration layers;
combining the iteration layers with a super-resolution module to form the deep unfolding network model.
Further, the constructed robust principal component analysis model is as follows:
$$\min_{L,S}\;\|L\|_* + \lambda\|S\|_1 \quad \text{s.t.}\quad D = L + S$$

wherein the matrix D is the data matrix of the original video sequence, each column vector of which is one vectorized frame of the original video; L is the low-rank matrix, i.e. the data matrix of the background layer to be solved; S is the sparse matrix, i.e. the data matrix of the foreground layer to be solved; $\|L\|_*$ is the nuclear norm of L; $\|S\|_1$ is the $\ell_1$ norm of S; and $\lambda$ is a regularization parameter that adjusts the proportion of the foreground-layer component obtained by the decomposition.
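As a concrete illustration of the objective being minimized, the sketch below (plain numpy, with toy data chosen for illustration rather than taken from the patent) evaluates $\|L\|_* + \lambda\|S\|_1$ for a rank-1 "background" plus a sparse "foreground":

```python
import numpy as np

def rpca_objective(L, S, lam):
    """Objective ||L||_* + lam * ||S||_1 of the RPCA model (D = L + S)."""
    nuclear = np.linalg.svd(L, compute_uv=False).sum()  # nuclear norm: sum of singular values
    l1 = np.abs(S).sum()                                # entrywise l1 norm
    return nuclear + lam * l1

# Toy decomposition: a rank-1 background plus a sparse foreground.
rng = np.random.default_rng(0)
u, v = rng.normal(size=(50, 1)), rng.normal(size=(1, 40))
L = u @ v                                  # low-rank background layer
S = np.zeros((50, 40))
S[3, 5], S[10, 7] = 2.0, -1.5              # sparse foreground layer
D = L + S                                  # observed data matrix
```

For a rank-1 matrix the nuclear norm equals the Frobenius norm (only one non-zero singular value), which gives a quick sanity check of the implementation.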
Further, the Lagrangian-form model is:

$$\min_{L,S}\;\frac{1}{2}\|D - H_1 L - H_2 S\|_F^2 + \lambda_1\|L\|_* + \lambda_2\|S\|_{1,2}$$

wherein $H_1$ and $H_2$ are the measurement matrices of L and S respectively, here taken as $H_1 = H_2 = I$; $\|S\|_{1,2}$ is the mixed $\ell_{1,2}$ norm of the matrix S; and $\lambda_1$ and $\lambda_2$ are the regularization parameters for L and S respectively.
Further, the iterative solution is carried out with a linear inverse-problem algorithm.
Specifically, linear inverse-problem algorithms include the iterative soft-thresholding algorithm (ISTA), the fast iterative soft-thresholding algorithm (FISTA), the alternating direction method of multipliers (ADMM), and the like.
Further, each motion layer of the video comprises an approximately static background layer and a moving object layer.
Specifically, when the Lagrangian-form model is solved with the iterative soft-thresholding algorithm, the low-rank matrix L and the sparse matrix S are updated iteratively until convergence. In the (k+1)-th iteration, $L^{k+1}$ and $S^{k+1}$ are updated as:

$$L^{k+1} = \mathrm{SVT}_{\lambda_1/L_f}\left(\left(I - \tfrac{1}{L_f}H_1^H H_1\right)L^k - \tfrac{1}{L_f}H_1^H H_2 S^k + \tfrac{1}{L_f}H_1^H D\right)$$

$$S^{k+1} = \psi_{\lambda_2/L_f}\left(\left(I - \tfrac{1}{L_f}H_2^H H_2\right)S^k - \tfrac{1}{L_f}H_2^H H_1 L^k + \tfrac{1}{L_f}H_2^H D\right)$$

wherein $\mathrm{SVT}$ is the singular-value thresholding operator, $\psi$ is the soft-thresholding operator, $L_f$ is the Lipschitz constant, and $H^H$ denotes the conjugate transpose.
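The two proximal operators and one ISTA step can be sketched in numpy as below, for the special case $H_1 = H_2 = I$ (the simplified update and variable names are illustrative, not the patent's code):

```python
import numpy as np

def soft_threshold(X, tau):
    """Entrywise soft-thresholding operator psi_tau."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svt(X, tau):
    """Singular-value thresholding: soft-threshold the singular values of X."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def ista_step(D, L, S, lam1, lam2, Lf=2.0):
    """One ISTA update of (L, S); with H1 = H2 = I the gradient step
    reduces to adding the scaled residual (D - L - S) / Lf."""
    resid = (D - L - S) / Lf
    L_next = svt(L + resid, lam1 / Lf)
    S_next = soft_threshold(S + resid, lam2 / Lf)
    return L_next, S_next
```

A fixed point of the iteration with zero thresholds is any exact decomposition D = L + S, which the test below uses as a sanity check.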
Further, the deep unfolding specifically comprises: replacing the coefficient terms in the update formulas of the motion layers with convolutional layers, and replacing the matrix multiplications with convolutions.
Specifically, the coefficient-matrix terms formed from $H_1$ and $H_2$ may be replaced by convolutional layers, and the matrix multiplications by convolutions. The k-th layer of the unfolded network is then computed as:

$$L^{k+1} = \mathrm{SVT}_{\lambda_1^k}\left(P_1^k * L^k + P_2^k * S^k + P_3^k * D\right)$$

$$S^{k+1} = \psi_{\lambda_2^k}\left(P_4^k * S^k + P_5^k * L^k + P_6^k * D\right)$$

wherein $*$ denotes the convolution operator, the $P_i^k$ are convolutional layers, and $\lambda_1^k, \lambda_2^k$ are regularization parameters. Both the convolutional-layer parameters and the regularization parameters are learned during training.
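The replacement of coefficient matrices by learned convolution kernels can be sketched as below. This is a single-channel numpy illustration with hypothetical kernels `P[1..6]`; a real implementation would use a deep-learning framework with trainable multi-channel layers:

```python
import numpy as np

def conv2d_same(x, k):
    """Naive 'same' 2-D cross-correlation, standing in for the
    coefficient-matrix multiplications of the iterative algorithm."""
    kh, kw = k.shape
    xp = np.pad(x, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = (xp[i:i + kh, j:j + kw] * k).sum()
    return out

def unfolded_layer(D, L, S, P, lam1, lam2):
    """One layer of the unfolded network: learned kernels P[1..6] replace
    the coefficient matrices; thresholds lam1, lam2 are per-layer learned."""
    L_in = conv2d_same(L, P[1]) + conv2d_same(S, P[2]) + conv2d_same(D, P[3])
    S_in = conv2d_same(S, P[4]) + conv2d_same(L, P[5]) + conv2d_same(D, P[6])
    U, s, Vt = np.linalg.svd(L_in, full_matrices=False)
    L_out = U @ np.diag(np.maximum(s - lam1, 0.0)) @ Vt     # SVT
    S_out = np.sign(S_in) * np.maximum(np.abs(S_in) - lam2, 0.0)  # soft threshold
    return L_out, S_out
```

Setting the kernels to centred delta functions and the thresholds to zero makes the layer an identity map on (L, S), which is a convenient correctness check.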
Further, the super-resolution module comprises a sampling layer and a sub-block sparse-feature-selection network layer, and combining the iteration layers with the super-resolution module specifically comprises:
embedding the sampling layer at the start of the iteration layers, and embedding the sub-block sparse-feature-selection network layer at the end of the iteration layers.
The sampling layer is any network layer commonly used in neural networks for feature selection; it removes redundant information and retains effective information. Specifically, sampling layers include, but are not limited to, average pooling, max pooling, overlapping pooling, atrous spatial pyramid pooling, upsampling layers, and the like.
Further, the sub-block sparse-feature-selection network layer comprises a residual network layer and a recurrent neural network layer.
A recurrent neural network layer takes a sequence as input and has a memory function. Specifically, recurrent neural network layers include, but are not limited to, conventional recurrent neural networks, bidirectional recurrent neural networks, gated recurrent networks, long short-term memory (LSTM) networks, convolutional LSTM networks, and the like.
Further, the conventional target object extraction algorithms include an extraction method based on background completion.
Further, the background-completion extraction method comprises the following steps:
segmenting the original image to obtain a background-layer image with the target object region removed;
estimating the background grey values of the target object region by background completion, yielding an estimated background-layer image;
subtracting the estimated background-layer image from the original image to obtain the target object grey-information image.
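The three steps above can be sketched as follows. The mean-fill completion used here is a deliberate simplification standing in for whatever completion scheme is actually intended:

```python
import numpy as np

def extract_by_background_completion(image, target_mask):
    """Background-completion extraction sketch:
    1) remove the target region from the image,
    2) estimate its background grey values from the remaining background
       (here: global mean fill, a crude stand-in for real completion),
    3) subtract the completed background from the original image."""
    background = image.astype(float).copy()
    background[target_mask] = np.nan          # step 1: cut out the target region
    fill = np.nanmean(background)             # step 2: estimate the missing background
    background[target_mask] = fill
    return image - background                 # step 3: target grey-information image
```

On a flat background this recovers exactly the grey-value deficit of the target region, i.e. the attenuation signature the vessel leaves on the image.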
Further, the deep unfolding network model is trained with label samples, wherein the label samples are weakly supervised labels obtained with a conventional target object extraction algorithm or by manual annotation.
The invention also provides an application of the above super-resolution-based method for extracting target object information from a video to X-ray angiography video.
Compared with the prior art, the invention has the following beneficial effects:
First, the traditional robust principal component analysis algorithm relies on iterative solution; the number of iterations is large and consumes much time, which limits practical application. The invention constructs a convolutional RPCA deep unfolding network, and is the first to combine robust principal component analysis with deep unfolding, each layer of the network representing one iteration of the iterative algorithm. In general, a deep unfolding network achieves better results with far fewer layers than the iteration count of the traditional algorithm, so its time efficiency is greatly improved over that of the original iterative algorithm. The RPCA deep unfolding network therefore gives the method high real-time performance, making it suitable for clinical applications such as vessel extraction from X-ray sequences.
Second, the convolutional RPCA deep unfolding network constructed by the method is the first to be combined with a super-resolution module. The background of a transmission-imaging image such as an X-ray sequence frame contains overlapping, complex anatomical structures: bones, lungs, vertebrae, the diaphragm, and other human tissues and organs. Owing to respiratory motion, patient movement, and other factors, parts of the background structure move with a certain amplitude. Meanwhile, some background structures have morphological features and grey levels similar to those of vessels. These factors strongly degrade the extraction results of the traditional robust principal component analysis method and of deep unfolding networks based on RPCA alone, making the vessel component in some regions, especially the small vessels, hard to distinguish from the background. The invention embeds a super-resolution module in the network layer, comprising a sampling layer and a sub-block sparse-feature-selection network layer. The sampling layer, placed before the robust principal component analysis, screens the input features, retaining effective features, removing useless ones, and suppressing part of the interference from background vessel-like structures. The sub-block sparse-feature-selection network layer reduces the influence of complex mixed noise, extracts and enhances image detail features, and improves the detection rate of small vessels.
Third, the sub-block sparse-feature-selection network layer comprises a residual network layer and a recurrent neural network layer, and the invention is the first to propose using a recurrent neural network to select vessel features. A recurrent neural network has memory: it passes feature information between preceding and following frames of the video sequence within the network, screens the features, and improves the detection rate for the current frame.
Fourth, the method divides the video sequence into sub-blocks, solves each sub-block, and then stitches the results to obtain the prediction of the target object. Transmission-imaging images such as X-ray sequence frames suffer from complex mixed Gaussian-Poisson noise caused by quantum noise and by the thermal noise of electronic devices. Poisson noise is signal-dependent: its strength is strongly tied to the strength of the local signal, so a generic global denoising algorithm is unsuitable for it. By processing the X-ray sequence images sub-block by sub-block, the method denoises according to the mixed-noise pattern within each sub-block region and effectively removes the mixed Gaussian-Poisson noise.
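The signal dependence of Poisson noise, which motivates the per-sub-block processing, can be checked numerically; the sketch below only illustrates that the noise variance tracks the local signal strength:

```python
import numpy as np

rng = np.random.default_rng(42)

# Poisson noise is signal-dependent: its variance equals its mean, so a
# dark sub-block and a bright sub-block carry very different noise levels.
dark = rng.poisson(lam=10.0, size=100_000)      # low-signal sub-block
bright = rng.poisson(lam=1000.0, size=100_000)  # high-signal sub-block

# The noise standard deviation grows like sqrt(signal), so one global
# denoising strength cannot fit both regions; per-sub-block processing can.
```

Empirically, the sample variances land near 10 and 1000, two orders of magnitude apart, even though both are "the same" Poisson noise process.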
Drawings
FIG. 1 is a schematic flow diagram of the present invention;
FIG. 2 is a schematic diagram of the convolutional robust principal component analysis deep unfolding network constructed by the invention.
Detailed Description
The invention is described in detail below with reference to the figures and specific embodiments. The present embodiment is implemented on the premise of the technical solution of the present invention, and a detailed implementation manner and a specific operation process are given, but the scope of the present invention is not limited to the following embodiments.
The invention provides a super-resolution-based method for extracting target object information from a video, applied to transmission-imaging video, in which the target object attenuates the imaging rays during video formation. Referring to FIG. 1, the method comprises: acquiring a video sequence containing a target object; dividing the video sequence into sub-blocks, inputting the sub-blocks into a trained deep unfolding network model for solving, and stitching the outputs to obtain the prediction result for the target object. The deep unfolding network model is a convolutional robust principal component analysis (RPCA) deep unfolding network, constructed by combining a super-resolution module with a deep unfolding algorithm based on RPCA, and is trained with weakly supervised label samples acquired with a conventional target object extraction algorithm.
Specifically, the deep unfolding network model is constructed as follows: constructing a robust principal component analysis model based on the characteristics of the video, and converting it into its Lagrangian form; solving the Lagrangian-form model, with solution methods including, but not limited to, the iterative soft-thresholding algorithm (ISTA), the fast iterative soft-thresholding algorithm (FISTA), and the alternating direction method of multipliers (ADMM), to obtain the update formulas of each motion layer of the video; deep-unfolding the update formulas of the motion layers into a number of iteration layers, where deep unfolding specifically means replacing the coefficient terms in the update formulas with convolutional layers and the matrix multiplications with convolutions; and combining the iteration layers with a super-resolution module into the deep unfolding network model.
Specifically, the super-resolution module comprises a sampling layer and a sub-block sparse-feature-selection network layer, and combining the iteration layers with the super-resolution module specifically comprises: embedding the sampling layer at the start of the iteration layers and the sub-block sparse-feature-selection network layer at the end of the iteration layers. The sub-block sparse-feature-selection network layer comprises a residual network layer and a recurrent neural network layer. The sampling layer includes, but is not limited to, average pooling, max pooling, overlapping pooling, atrous spatial pyramid pooling, and upsampling layers; the recurrent neural network layer includes, but is not limited to, conventional recurrent neural networks, bidirectional recurrent neural networks, gated recurrent networks, long short-term memory (LSTM) networks, and convolutional LSTM networks.
As shown in FIG. 2, the convolutional robust principal component analysis deep unfolding network constructed by the above method comprises a pooling layer, an RPCA unfolding layer, and a super-resolution layer (SR module), where the SR module comprises a convolutional layer, a residual module, a CLSTM module (convolutional long short-term memory network), and a pixel-shuffle layer.
The conventional target object extraction algorithm is a background-completion method, specifically: segmenting the original image to obtain a background-layer image with the target object region removed; estimating the background grey values of the target object region by background completion, yielding an estimated background-layer image; and subtracting the estimated background-layer image from the original image to obtain the target object grey-information image.
With the constructed convolutional RPCA deep unfolding network, the method effectively improves both the real-time performance and the accuracy of target object information extraction.
The above method, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
Example 1
The embodiment of the present invention is based on the above method to realize accurate extraction of blood vessels of an X-ray angiography image sequence, and includes:
101) constructing a robust principal component analysis model for the X-ray angiography image sequence, namely:
min ||L||_* + λ||S||_1   s.t.   D = L + S
wherein the matrix D represents the data matrix of the original video sequence, each column of which is one vectorized frame of the original video; the matrix L represents the low-rank matrix, i.e. the data matrix of the background layer to be solved; the matrix S represents the sparse matrix, i.e. the data matrix of the foreground layer to be solved; ||L||_* denotes the nuclear norm of the matrix L, ||S||_1 denotes the l_1 norm of the matrix S, and λ is a regularization parameter used to adjust the proportion of the foreground layer component obtained by the decomposition.
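The value of this objective can be computed directly; a minimal numpy sketch (hypothetical function name), where the nuclear norm is the sum of singular values:

```python
import numpy as np

def rpca_objective(L, S, lam):
    """Evaluate ||L||_* + lam * ||S||_1 for a candidate decomposition D = L + S."""
    nuclear = np.linalg.svd(L, compute_uv=False).sum()  # nuclear norm of L
    l1 = np.abs(S).sum()                                # elementwise l1 norm of S
    return nuclear + lam * l1
```

A nearly static background gives columns of D that are close to linearly dependent (low rank), while the moving target object contributes only a few nonzero entries per frame (sparse), which is why minimizing this objective separates the two layers.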
102) Writing the robust principal component analysis model (the RPCA algorithm mathematical model) into Lagrange form:

min_{L,S} (1/2)||D − H_1 L − H_2 S||_F^2 + λ_1 ||L||_* + λ_2 ||S||_{1,2}

wherein H_1 and H_2 are the metric matrices of L and S respectively, here taken as H_1 = H_2 = I; ||S||_{1,2} denotes the l_{1,2} norm of the matrix S; λ_1 and λ_2 are the regularization parameters for L and S respectively.
103) Solving the Lagrange-form model by a soft-threshold iterative algorithm. In the iterative process, the low-rank matrix L and the sparse matrix S are updated until convergence. In the (k+1)-th iteration, L^{k+1} and S^{k+1} are updated according to:

L^{k+1} = SVT_{λ_1/L_f}( L^k + (1/L_f) H_1^H (D − H_1 L^k − H_2 S^k) )
S^{k+1} = T_{λ_2/L_f}( S^k + (1/L_f) H_2^H (D − H_1 L^k − H_2 S^k) )

wherein SVT_t(·) is the singular value thresholding operator, T_t(·) is the soft-threshold operator, L_f is a Lipschitz constant, and H_1^H and H_2^H are the conjugate transposes of H_1 and H_2.
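The soft-threshold iterative scheme can be sketched in numpy; this sketch assumes H_1 = H_2 = I as in step 102 (so the Lipschitz constant of the data-fit gradient can be taken as L_f = 2), and all function names are hypothetical:

```python
import numpy as np

def soft(X, t):
    """Elementwise soft-threshold operator T_t."""
    return np.sign(X) * np.maximum(np.abs(X) - t, 0.0)

def svt(X, t):
    """Singular value thresholding operator SVT_t."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(soft(s, t)) @ Vt

def ista_rpca(D, lam1=0.4, lam2=1.8, Lf=2.0, iters=100):
    """Soft-threshold iterative solution of the Lagrange-form model with H1 = H2 = I."""
    L = np.zeros_like(D)
    S = np.zeros_like(D)
    for _ in range(iters):
        G = D - L - S                    # residual: negative gradient of the data-fit term
        L = svt(L + G / Lf, lam1 / Lf)   # low-rank (background layer) update
        S = soft(S + G / Lf, lam2 / Lf)  # sparse (foreground layer) update
    return L, S
```

Each iteration is a gradient step on the quadratic data-fit term followed by the proximal operators of the nuclear norm (SVT) and the sparsity norm (soft threshold), matching the update equations above.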
104) Carrying out deep expansion of the robust principal component analysis solution. The coefficient matrix terms constructed from H_1 and H_2 are replaced by convolutional layers, and the matrix multiplications are replaced by convolution operations. Thus, the k-th layer of the expanded network is calculated as follows:
L^{k+1} = SVT_{λ_1^k}( P_1^k * D + P_2^k * L^k + P_3^k * S^k )
S^{k+1} = T_{λ_2^k}( P_4^k * D + P_5^k * L^k + P_6^k * S^k )

wherein * represents the convolution operator, P_i^k are the convolutional layers of the k-th iteration layer, and λ_1^k and λ_2^k are learnable regularization parameters.
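The replacement of matrix products by convolutions can be illustrated with the sparse-component update of one unfolded layer; the following is a single-channel numpy sketch with hypothetical kernel names (a trained network would use learned multi-channel convolutions):

```python
import numpy as np

def conv2d_same(x, k):
    """Minimal single-channel 'same' convolution with zero padding."""
    kh, kw = k.shape
    xp = np.pad(x, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    return np.array([[(xp[i:i + kh, j:j + kw] * k).sum()
                      for j in range(x.shape[1])]
                     for i in range(x.shape[0])])

def soft(x, t):
    """Elementwise soft-threshold operator."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def unfolded_sparse_update(D, L, S, P_d, P_l, P_s, lam):
    """S-update of one unfolded layer: the fixed coefficient-matrix terms are
    replaced by (here hand-set, in practice learned) convolution kernels."""
    pre = conv2d_same(D, P_d) + conv2d_same(L, P_l) + conv2d_same(S, P_s)
    return soft(pre, lam)
```

With the center-delta kernel each convolution reduces to the identity, which recovers the matrix-form update of step 103 as a special case.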
105) Connecting the expanded iteration layers to construct the deep expansion network. The number of iteration layers constructed in this example is 4; the convolution kernels of the first two layers have size 5 and those of the last two layers have size 3. The regularization parameter of the low-rank component, λ_1, is 0.4, and that of the sparse component, λ_2, is 1.8.
106) A mean pooling layer is embedded at the start of the iteration layers to down-sample the input.
107) A residual module and a recurrent neural network module are embedded after the robust principal component analysis module. In this embodiment, the recurrent neural network module adopts a convolutional long short-term memory network.
108) Segmenting the original blood vessel image sequence with SVS-net to obtain a blood-vessel-region segmentation map sequence and a background-region segmentation map sequence.
109) Solving the background-region segmentation map sequence with a t-TNN tensor completion model to obtain the background layer data tensor image.
110) Dividing the original image sequence data element-wise by the background layer data tensor to obtain the blood vessel data tensor image.
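In transmission imaging the detected intensity is multiplicative (Beer-Lambert law), which is why element-wise division by the background, rather than subtraction, isolates the vessel attenuation; a minimal sketch (hypothetical name):

```python
import numpy as np

def vessel_tensor(original, background, eps=1e-6):
    """Element-wise ratio of the original sequence to the estimated background.
    Values below 1 correspond to extra attenuation from the vessel."""
    return original / np.maximum(background, eps)
```

The `eps` floor only guards against division by zero in empty background estimates; it is an implementation assumption, not part of the described method.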
111) Training the robust principal component analysis deep expansion network with the blood vessel data tensor image to obtain the trained model.
112) Segmenting the image sequence of the blood vessels to be extracted into sub-blocks, inputting the sub-blocks into the trained model, and splicing the outputs to obtain the complete output image. In this embodiment, the sub-blocks have a size of 64 × 64 × 20 (length and width of 64, and 20 frames), and adjacent sub-blocks overlap by 50%.
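The sub-block splitting and splicing of step 112 can be sketched as follows, with 64 × 64 × 20 blocks, 50% spatial overlap, and averaging of overlapping outputs during splicing (function names are hypothetical):

```python
import numpy as np

def split_blocks(vol, size=64, frames=20, overlap=0.5):
    """Cut an (H, W, T) sequence into size x size x frames sub-blocks
    with the given spatial overlap between adjacent blocks."""
    step = int(size * (1 - overlap))
    H, W, T = vol.shape
    blocks, coords = [], []
    for y in range(0, H - size + 1, step):
        for x in range(0, W - size + 1, step):
            for t in range(0, T - frames + 1, frames):
                blocks.append(vol[y:y + size, x:x + size, t:t + frames])
                coords.append((y, x, t))
    return blocks, coords

def stitch_blocks(blocks, coords, shape):
    """Splice (model-output) sub-blocks back into a full sequence,
    averaging wherever blocks overlap."""
    acc = np.zeros(shape)
    cnt = np.zeros(shape)
    for b, (y, x, t) in zip(blocks, coords):
        sy, sx, st = b.shape
        acc[y:y + sy, x:x + sx, t:t + st] += b
        cnt[y:y + sy, x:x + sx, t:t + st] += 1
    return acc / np.maximum(cnt, 1)
```

Splitting a sequence and splicing the unmodified blocks back together reproduces the covered region exactly, which is a convenient sanity check before inserting the model between the two steps.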
The overall implementation of the above-described accurate extraction of vessels from an X-ray angiographic image sequence is shown in fig. 1. The structure of the deep network iteration layer in the embodiment is shown in fig. 2.
The present embodiment further illustrates the above method with 43 clinical angiography image sequences; each sequence contains 30-140 frames, each frame has a resolution of 512 × 512, each pixel represents an actual size of 0.3 mm × 0.3 mm, and the pixel bit depth is 8.
The blood vessel extraction process for the X-ray angiography image sequence effectively reduces the influence of background vessel structures and complex mixed noise, and remarkably improves the extraction of small vessels.
The foregoing detailed description of the preferred embodiments of the invention has been presented. It should be understood that numerous modifications and variations could be devised by those skilled in the art in light of the present teachings without departing from the inventive concepts. Therefore, the technical solutions available to those skilled in the art through logic analysis, reasoning and limited experiments based on the prior art according to the concept of the present invention should be within the scope of protection defined by the claims.
Claims (10)
1. A method for extracting target object information from a video based on super-resolution is applied to a transmission imaging video, and is characterized by comprising the following steps:
acquiring a video sequence containing a target object;
dividing the video sequence into sub-blocks, inputting the sub-blocks into a trained deep expansion network model for solving, and splicing the output to obtain a prediction result of a target object;
the deep expansion network model is a convolution robust principal component analysis deep expansion network and is constructed and obtained by combining a super-resolution module according to a deep expansion algorithm based on robust principal component analysis.
2. The super-resolution-based method for extracting target object information from a video according to claim 1, wherein the specific construction process of the deep-expansion network model is as follows:
constructing a robust principal component analysis model based on video characteristics, and converting the robust principal component analysis model into a Lagrange form model;
performing iterative solution on the Lagrange formal model to obtain a calculation formula of each motion layer of the video;
carrying out depth expansion on the calculation formulas of the motion layers of the video to obtain a plurality of iteration layers;
and combining a plurality of iteration layers and a super-resolution module into the deep expansion network model.
3. The super-resolution-based method for extracting target object information from a video according to claim 2, wherein the iterative solution is implemented by using a linear inverse problem solution algorithm.
4. The super-resolution-based method for extracting target object information from a video according to claim 2, wherein each motion layer of the video comprises an approximately static background layer and a moving target object layer.
5. The super-resolution-based method for extracting target object information from a video according to claim 2, wherein the performing depth expansion specifically comprises: and replacing the coefficient items in the calculation formulas of all the motion layers with convolution layers, and replacing the multiplication operation with the convolution operation.
6. The super-resolution-based method for extracting target object information from a video according to claim 2, wherein the super-resolution module comprises a sampling layer and a sub-block sparse feature selection network layer, and the combination of the plurality of iteration layers and the super-resolution module is specifically:
and embedding the sampling layer at the start position of the iteration layer, and embedding the sub-block sparse feature selection network layer at the end position of the iteration layer.
7. The super-resolution-based method for extracting target object information from a video according to claim 6, wherein the sub-block sparse feature selection network layer comprises a residual network layer and a recurrent neural network layer.
8. The super-resolution-based method for extracting target object information from a video according to claim 1, wherein the conventional target object extraction algorithm comprises a background-completion-based extraction method.
9. The super-resolution-based method for extracting target object information from a video according to claim 1, wherein the deep-expanded network model is trained with label samples, the label samples are weakly supervised label samples, and are obtained by using a traditional target object extraction algorithm or manual labeling.
10. Use of the super resolution based method of extracting target object information from video according to any of claims 1-9 in X-ray angiography video.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111272433.9A CN114170076A (en) | 2021-10-29 | 2021-10-29 | Method for extracting target object information from video based on super-resolution and application |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114170076A true CN114170076A (en) | 2022-03-11 |
Family
ID=80477516
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111272433.9A Pending CN114170076A (en) | 2021-10-29 | 2021-10-29 | Method for extracting target object information from video based on super-resolution and application |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114170076A (en) |
- 2021-10-29: CN202111272433.9A patent application filed in China (CN114170076A, status: Pending)
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||