CN114170076A - Method for extracting target object information from video based on super-resolution and application - Google Patents

Method for extracting target object information from video based on super-resolution and application

Info

Publication number
CN114170076A
CN114170076A (application CN202111272433.9A)
Authority
CN
China
Prior art keywords
target object
super
layer
video
resolution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111272433.9A
Other languages
Chinese (zh)
Inventor
秦斌杰 (Qin Binjie)
茅好好 (Mao Haohao)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiaotong University
Original Assignee
Shanghai Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jiaotong University filed Critical Shanghai Jiaotong University
Priority to CN202111272433.9A
Publication of CN114170076A
Legal status: Pending

Classifications

    • G06T 3/4053 — Scaling of whole images or parts thereof based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • G06F 18/2135 — Feature extraction based on approximation criteria, e.g. principal component analysis
    • G06N 3/044 — Recurrent networks, e.g. Hopfield networks
    • G06N 3/045 — Combinations of networks
    • G06N 3/08 — Learning methods
    • G06T 3/4038 — Image mosaicing, e.g. composing plane images from plane sub-images
    • G06T 5/70 — Denoising; Smoothing
    • G06T 7/194 — Segmentation; Edge detection involving foreground-background segmentation
    • G06T 7/90 — Determination of colour characteristics
    • G06T 2207/10016 — Video; Image sequence
    • G06T 2207/10116 — X-ray image
    • G06T 2207/20048 — Transform domain processing
    • G06T 2207/20081 — Training; Learning
    • G06T 2207/30101 — Blood vessel; Artery; Vein; Vascular


Abstract

The invention relates to a super-resolution-based method for extracting target object information from a video, and its application. The method comprises the following steps: acquiring a video sequence containing a target object; dividing the video sequence into sub-blocks, inputting the sub-blocks into a trained deep expansion network model for solving, and splicing the outputs to obtain a prediction result of the target object. The deep expansion network model is a convolutional robust principal component analysis deep expansion network, constructed by combining a super-resolution module with a deep expansion (deep unfolding) algorithm based on robust principal component analysis. Compared with the prior art, the method offers high real-time performance, interference suppression, and accurate detection; when applied to X-ray angiography video, it effectively reduces the influence of background vessel structures and complex mixed noise, and markedly improves the extraction of small blood vessels.

Description

Method for extracting target object information from video based on super-resolution and application
Technical Field
The invention relates to the technical field of information extraction, in particular to video processing, and specifically to a super-resolution-based method for extracting target object information from a video and its application.
Background
In the information field, it is often necessary to extract target object information from a video sequence. For example, in an X-ray angiography video sequence, accurate blood vessel information is what the technician needs to acquire. Due to the mechanism of X-ray projection imaging, such video images contain numerous structures besides the contrast agent flowing through the blood vessels, such as human tissues and organs like bones, lungs and the diaphragm. In addition, various mixed noises are inevitably introduced during imaging. These background structures and mixed noise interfere with the identification of vessel information, thereby affecting further analysis of the vessel information and accurate clinical diagnosis. Therefore, the background layer in the video sequence needs to be separated from the vessel layer, so as to obtain a vessel-layer video sequence from which vessel information is easier to acquire.
At present, robust principal component analysis (RPCA) is the algorithm that performs best at extracting the vessel layer of an X-ray angiography video sequence [Jin, M., Li, R., Jiang, J. and Qin, B., 2017. Extracting contrast-filled vessels in X-ray angiography by graduated RPCA with motion coherency constraint. Pattern Recognition, 63, pp. 653-666]. From the viewpoint of motion analysis, the algorithm decomposes a video sequence into a low-rank matrix and a sparse matrix: the low-rank matrix represents the background layer, which has high inter-frame similarity and little motion, while the sparse matrix represents the target object layer, which is sparsely distributed and exhibits large motion changes.
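As a toy numerical illustration of this low-rank/sparse decomposition idea (an editor's sketch, not part of the patent), one can build a synthetic "video" whose background frames form a rank-1 matrix and whose moving object forms a sparse matrix:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "video": 5 frames of a 100-pixel image, stacked as columns of D.
n_pixels, n_frames = 100, 5

# Background layer: identical across frames -> rank-1 (low-rank) matrix L.
background = rng.random(n_pixels)
L = np.outer(background, np.ones(n_frames))

# Target layer: a few moving bright pixels per frame -> sparse matrix S.
S = np.zeros((n_pixels, n_frames))
for k in range(n_frames):
    S[10 + 5 * k : 13 + 5 * k, k] = 1.0  # the "object" shifts each frame

D = L + S  # observed video sequence, per the RPCA model D = L + S

# L has rank 1; S has only 15 nonzero entries out of 500.
```

RPCA recovers L and S from D alone by exploiting exactly these two structural priors.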
The traditional robust principal component analysis algorithm has limitations for vessel extraction from X-ray angiography video sequences. First, the algorithm requires a large amount of iterative computation, so its time and space efficiency are low, which limits its clinical application. Secondly, the human tissues and organs present in the background layer of an X-ray image are not completely static, and slight movements of these structures strongly affect the algorithm's results. Meanwhile, X-ray images contain a great deal of complex mixed noise, which corrupts the vessel information, especially that of small vessel branches. Therefore, the interference from tissues and organs in the background layer and from the complex mixed noise prevents the traditional robust principal component analysis algorithm from accurately separating the vessel layer from the background layer.
In addition, some image segmentation techniques are used in vessel segmentation work to obtain the vessel-region part of an image. Common methods include image enhancement techniques, deformable models and vessel tracking. These methods are usually based on vessel morphology or image gray values. With these methods, vessel-like structures in the image background and complex Gaussian-Poisson mixed noise greatly disturb the segmentation result, making foreground and background hard to distinguish in some regions. Meanwhile, the segmentation results of these methods focus on extracting the contour features of the vessel structure and ignore the gray information of the vessels in the original image.
Therefore, existing methods cannot extract vessel information from an X-ray angiography video sequence both quickly and accurately, which makes it difficult to carry out further diagnostic work such as quantification and functional analysis based on restoring the shape and gray scale of the contrast-filled vessels. In general, the drawbacks of existing vessel extraction algorithms can be summarized as follows:
1. the time efficiency and the space efficiency of extracting the blood vessels are low;
2. the extracted contrast blood vessel image contains tissue organ structures and noise in a background layer;
3. in the extracted contrast blood vessel image, the small blood vessel branch information cannot be retained.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provide a super-resolution-based method for extracting target object information from a video and an application thereof, wherein the method has high real-time performance and accurate detection.
The purpose of the invention can be realized by the following technical scheme:
a super-resolution-based method for extracting target object information from a video is applied to a transmission imaging video, and comprises the following steps:
acquiring a video sequence containing a target object;
dividing the video sequence into sub-blocks, inputting the sub-blocks into a trained deep expansion network model for solving, and splicing the output to obtain a prediction result of a target object;
the deep expansion network model is a convolution robust principal component analysis deep expansion network and is constructed and obtained by combining a super-resolution module according to a deep expansion algorithm based on robust principal component analysis.
Further, the specific construction process of the deep expansion network model is as follows:
constructing a robust principal component analysis model based on video characteristics, and converting the robust principal component analysis model into a Lagrange form model;
performing iterative solution on the Lagrange formal model to obtain a calculation formula of each motion layer of the video;
carrying out depth expansion on the calculation formulas of the motion layers of the video to obtain a plurality of iteration layers;
and combining a plurality of iteration layers and a super-resolution module into the deep expansion network model.
Further, the constructed robust principal component analysis model is as follows:
$$\min_{L,S}\ \|L\|_* + \lambda \|S\|_1 \quad \text{s.t.} \quad D = L + S$$

wherein matrix D represents the data matrix of the original video sequence, each column of which is a vectorized frame of the original video; matrix L is the low-rank matrix, the data matrix of the background layer to be solved; matrix S is the sparse matrix, the data matrix of the foreground layer to be solved; $\|L\|_*$ denotes the nuclear norm of L, $\|S\|_1$ denotes the $\ell_1$ norm of S, and $\lambda$ is a regularization parameter used to adjust the proportion of the foreground-layer component obtained by the decomposition.
Further, the Lagrangian-form model is:

$$\min_{L,S}\ \tfrac{1}{2}\,\|D - H_1 L - H_2 S\|_F^2 + \lambda_1 \|L\|_* + \lambda_2 \|S\|_{1,2}$$

wherein $H_1$ and $H_2$ are the measurement matrices of L and S respectively, here taken as $H_1 = H_2 = I$; $\|S\|_{1,2}$ denotes the mixed $\ell_{1,2}$ norm of S; and $\lambda_1$, $\lambda_2$ are the regularization parameters of L and S respectively.
Further, the iterative solution is realized by adopting a linear inverse problem solution algorithm.
Specifically, the linear inverse problem solving algorithm includes a soft threshold iteration algorithm, a fast soft threshold iteration algorithm, an alternating direction multiplier method, and the like.
Further, each motion layer of the video comprises an approximately static background layer and a moving object layer.
Specifically, when the Lagrangian-form model is solved with the soft-threshold iterative algorithm, the low-rank matrix L and the sparse matrix S are updated iteratively until convergence. At the (k+1)-th iteration, $L^{k+1}$ and $S^{k+1}$ are updated as:

$$L^{k+1} = \mathrm{SVT}_{\lambda_1 / L_f}\!\left( \left(I - \tfrac{1}{L_f} H_1^H H_1\right) L^k - \tfrac{1}{L_f} H_1^H H_2 S^k + \tfrac{1}{L_f} H_1^H D \right)$$

$$S^{k+1} = \mathcal{T}_{\lambda_2 / L_f}\!\left( \left(I - \tfrac{1}{L_f} H_2^H H_2\right) S^k - \tfrac{1}{L_f} H_2^H H_1 L^k + \tfrac{1}{L_f} H_2^H D \right)$$

wherein $\mathrm{SVT}_{\alpha}(\cdot)$ is the singular value thresholding operator, $\mathcal{T}_{\alpha}(\cdot)$ is the soft-thresholding operator, and $L_f$ is the Lipschitz constant.
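The two threshold operators admit a compact NumPy sketch (a hedged illustration, not the patent's implementation: the entrywise $\ell_1$ soft-threshold below stands in for the mixed $\ell_{1,2}$ variant, $H_1 = H_2 = I$ is assumed so the coefficient terms collapse to scalars, and the parameter values are illustrative):

```python
import numpy as np

def soft_threshold(x, alpha):
    """Entrywise soft-threshold: sign(x) * max(|x| - alpha, 0)."""
    return np.sign(x) * np.maximum(np.abs(x) - alpha, 0.0)

def svt(x, alpha):
    """Singular value thresholding: soft-threshold the singular values of x."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    return u @ np.diag(soft_threshold(s, alpha)) @ vt

def ista_rpca(D, lam1=0.4, lam2=1.8, Lf=2.0, n_iter=50):
    """Iterate the L/S updates with H1 = H2 = I, so that H^H H = I and
    the coefficient terms reduce to the scalar (1 - 1/Lf)."""
    L = np.zeros_like(D)
    S = np.zeros_like(D)
    for _ in range(n_iter):
        G_L = (1 - 1 / Lf) * L - (1 / Lf) * S + (1 / Lf) * D
        G_S = (1 - 1 / Lf) * S - (1 / Lf) * L + (1 / Lf) * D
        L = svt(G_L, lam1 / Lf)
        S = soft_threshold(G_S, lam2 / Lf)
    return L, S
```

The default regularization values 0.4 and 1.8 mirror those used in Embodiment 1 below, but any convergent choice works for the sketch.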
Further, the performing of the deep unfolding specifically includes: and replacing the coefficient items in the calculation formulas of all the motion layers with convolution layers, and replacing the multiplication operation with the convolution operation.
In particular, the coefficient matrix terms formed from $H_1$ and $H_2$ may be replaced by convolutional layers, and the multiplications replaced by convolution operations, so that the k-th layer in the expansion network is calculated as follows:

$$L^{k+1} = \mathrm{SVT}_{\lambda_1^k}\!\left( P_1^k * L^k + P_2^k * S^k + P_3^k * D \right)$$

$$S^{k+1} = \mathcal{T}_{\lambda_2^k}\!\left( P_4^k * S^k + P_5^k * L^k + P_6^k * D \right)$$

wherein $*$ denotes the convolution operator, $P_1^k, \dots, P_6^k$ are convolutional layers, and $\lambda_1^k$, $\lambda_2^k$ are regularization parameters. Both the convolutional-layer parameters and the regularization parameters are learned during training.
Further, the super-resolution module includes a sampling layer and a sub-block sparse feature selection network layer, and the combination of the multiple iteration layers and the super-resolution module specifically includes:
and embedding the sampling layer at the start position of the iteration layer, and embedding the sub-block sparse feature selection network layer at the end position of the iteration layer.
The sampling layer is a network layer commonly used in neural networks that performs feature selection, eliminating redundant information while retaining effective information. In particular, the sampling layer includes, but is not limited to, an average pooling layer, a maximum pooling layer, an overlapping pooling layer, an atrous spatial pyramid pooling layer, an upsampling layer, and the like.
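For instance, an average pooling layer — one of the sampling layers listed above — can be sketched in a few lines of NumPy (an editor's illustration; it assumes, for simplicity, that the input height and width are divisible by the pooling factor):

```python
import numpy as np

def average_pool(x, k=2):
    """k x k average pooling of a 2D array: keep the mean of each patch.
    Assumes x.shape is divisible by k in both dimensions."""
    h, w = x.shape
    return x.reshape(h // k, k, w // k, k).mean(axis=(1, 3))

frame = np.arange(16, dtype=float).reshape(4, 4)
pooled = average_pool(frame)  # 4x4 input -> 2x2 output
```

Each output pixel summarizes a k×k neighborhood, which is the down-sampling/feature-screening role the sampling layer plays before the robust principal component analysis layers.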
Further, the sub-block sparse feature selection network layer comprises a residual network layer and a recurrent neural network layer.
The recurrent neural network layer is a network which takes a sequence as input and has a memory function. In particular, the recurrent neural network layer includes, but is not limited to, conventional recurrent neural networks, bidirectional recurrent neural networks, gated recurrent neural networks, long-short term memory networks, convolutional long-short term memory networks, and the like.
Further, the conventional target object extraction algorithm includes an extraction method based on background completion.
Further, the extraction method based on background completion comprises the following steps:
segmenting the original image to obtain a background layer image of the region where the target object is removed;
estimating a background gray value of a target object region through background completion to obtain an estimated background layer image;
and obtaining a target object gray information image by subtracting the estimated background layer image from the original image.
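The three steps above can be sketched as follows (a deliberately crude stand-in, not the patent's method: the "completion" here is just the mean of the non-target pixels, whereas a real implementation would use an inpainting or tensor-completion model such as the t-TNN model used in Embodiment 1):

```python
import numpy as np

def background_completion_extract(original, target_mask):
    """Toy sketch of background-completion extraction:
    (1) remove the target region to get a background-only image;
    (2) estimate the background gray value inside the target region
        (here: mean of the non-target pixels, a crude stand-in);
    (3) subtract the estimated background from the original image."""
    background_only = np.where(target_mask, np.nan, original)   # step 1
    fill_value = np.nanmean(background_only)                    # step 2 (toy)
    estimated_bg = np.where(target_mask, fill_value, original)
    return original - estimated_bg                              # step 3

img = np.full((4, 4), 10.0)
img[1:3, 1:3] = 4.0          # a darker "vessel" region
mask = img < 10.0
target = background_completion_extract(img, mask)
# target is negative inside the vessel region (vessels attenuate X-rays)
# and zero in the background.
```

The sign convention follows the patent's subtraction order; in transmission imaging the target attenuates the rays, so the target region comes out darker than the estimated background.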
Further, the deep expansion network model is obtained by training with label samples, wherein the label samples are weakly supervised label samples obtained using a traditional target object extraction algorithm or by manual labeling.
The invention also provides application of the method for extracting the target object information from the video based on the super-resolution in the X-ray angiography video.
Compared with the prior art, the invention has the following beneficial effects:
First, the conventional robust principal component analysis algorithm relies on iterative solution; the large number of iterations consumes considerable time, which limits it in practical applications. The invention constructs a convolutional robust principal component analysis deep expansion network and is the first to propose combining robust principal component analysis with deep unfolding, where each layer of the network represents one iteration of the iterative algorithm. In general, a deep expansion network can obtain better results with far fewer network layers than the iteration count of the traditional algorithm, so its time efficiency is greatly improved over the original iterative algorithm. The robust principal component analysis deep expansion network therefore gives the method high real-time performance, making it applicable to clinical tasks such as vessel extraction from X-ray sequences.
Secondly, the convolutional robust principal component analysis deep expansion network constructed by the method is the first to be combined with a super-resolution module. The background of a transmission imaging image, such as an X-ray sequence image, contains overlapping complex anatomical structures, such as human tissues and organs like bones, lungs, vertebrae and the diaphragm. Due to factors such as respiratory motion and patient movement, parts of the background structure move with a certain amplitude. Meanwhile, some structures in the background have morphological features and gray levels similar to those of blood vessels. These factors greatly degrade the extraction results of the traditional robust principal component analysis method and of deep expansion networks based on robust principal component analysis alone, making the vessel component in some regions, especially small vessels, difficult to distinguish from the background. In the invention, a super-resolution module comprising a sampling layer and a sub-block sparse feature selection network layer is embedded in the network. The sampling layer, placed before the robust principal component analysis, screens the input features, retaining effective features, removing useless ones and suppressing interference from background vessel-like structures. The sub-block sparse feature selection network layer reduces the influence of complex mixed noise, extracts and enhances image detail features, and improves the detection rate of small vessels.
Thirdly, the sub-block sparse feature selection network layer comprises a residual network layer and a recurrent neural network layer; the invention is the first to propose using a recurrent neural network to select vessel features. The recurrent neural network has memory: it propagates feature information between preceding and following frames of the video sequence within the network, screens the features, and improves the detection rate for the current frame.
Fourthly, the method divides the video sequence into sub-blocks, solves each sub-block, and then splices the results to obtain the prediction of the target object. Transmission imaging images, such as X-ray sequence images, contain complex mixed Gaussian-Poisson noise caused by quantum noise and by thermal noise from electronic devices. Poisson noise is signal-dependent: its strength is closely related to the strength of the local signal, so a generic global noise-reduction algorithm is not suitable for it. By processing X-ray sequence images sub-block by sub-block, the method can reduce noise according to the mixed-noise pattern within each sub-block region, effectively removing the mixed Gaussian-Poisson noise.
Drawings
FIG. 1 is a schematic flow diagram of the present invention;
FIG. 2 is a schematic diagram of a convolution robust principal component analysis deep expansion network constructed by the present invention.
Detailed Description
The invention is described in detail below with reference to the figures and specific embodiments. The present embodiment is implemented on the premise of the technical solution of the present invention, and a detailed implementation manner and a specific operation process are given, but the scope of the present invention is not limited to the following embodiments.
The invention provides a super-resolution-based method for extracting target object information from a video, applied to transmission imaging video, in which the target object attenuates the imaging rays. Referring to fig. 1, the method comprises the following steps: acquiring a video sequence containing a target object; dividing the video sequence into sub-blocks, inputting the sub-blocks into a trained deep expansion network model for solving, and splicing the output to obtain a prediction result of the target object. The deep expansion network model is a convolutional Robust Principal Component Analysis (RPCA) deep expansion network, constructed by combining a super-resolution module with a deep expansion algorithm based on RPCA, and trained with weakly supervised label samples acquired using a traditional target object extraction algorithm.
Specifically, the deep expansion network model is constructed as follows: constructing a robust principal component analysis model based on the video characteristics and converting it into a Lagrangian-form model; solving the Lagrangian-form model, where the solving methods include but are not limited to the Iterative Shrinkage-Thresholding Algorithm (ISTA), the Fast Iterative Shrinkage-Thresholding Algorithm (FISTA) and the Alternating Direction Method of Multipliers (ADMM), to obtain the calculation formula of each motion layer of the video; performing deep expansion on the calculation formulas of the motion layers to obtain a plurality of iteration layers, where the deep expansion specifically comprises replacing the coefficient terms in the calculation formulas with convolutional layers and replacing multiplications with convolution operations; and combining the plurality of iteration layers with a super-resolution module into the deep expansion network model.
Specifically, the super-resolution module includes a sampling layer and a sub-block sparse feature selection network layer, and combining the multiple iteration layers with the super-resolution module specifically comprises: embedding the sampling layer at the start of the iteration layers and embedding the sub-block sparse feature selection network layer at the end of the iteration layers. The sub-block sparse feature selection network layer comprises a residual network layer and a recurrent neural network layer. The sampling layer includes but is not limited to an average pooling layer, a maximum pooling layer, an overlapping pooling layer, an atrous spatial pyramid pooling layer, an upsampling layer, etc.; the recurrent neural network layer includes but is not limited to a conventional recurrent neural network, a bidirectional recurrent neural network, a gated recurrent neural network, a long short-term memory network, a convolutional long short-term memory network, etc.
As shown in fig. 2, the convolutional robust principal component analysis deep expansion network constructed by the above method comprises a pooling layer, a robust principal component analysis deep expansion layer (RPCA unrolling layer) and a super-resolution layer (SR module), where the SR module comprises a convolutional layer, a residual module, a CLSTM module (convolutional long short-term memory network) and a pixel shuffle (pixel reorganization) layer.
The traditional target object extraction algorithm is a background-completion-based method, and specifically comprises: segmenting the original image to obtain a background-layer image with the target object region removed; estimating the background gray value of the target object region through background completion to obtain an estimated background-layer image; and obtaining the target object gray-information image by subtracting the estimated background-layer image from the original image.
The method utilizes the constructed convolution robust principal component analysis deep expansion network, and can effectively improve the real-time performance and accuracy of target object information extraction.
The above method, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
Example 1
The embodiment of the present invention is based on the above method to realize accurate extraction of blood vessels of an X-ray angiography image sequence, and includes:
101) constructing a robust principal component analysis model for the X-ray angiography image sequence, namely:
$$\min_{L,S}\ \|L\|_* + \lambda \|S\|_1 \quad \text{s.t.} \quad D = L + S$$

wherein matrix D represents the data matrix of the original video sequence, each column of which is a vectorized frame of the original video; matrix L is the low-rank matrix, the data matrix of the background layer to be solved; matrix S is the sparse matrix, the data matrix of the foreground layer to be solved; $\|L\|_*$ denotes the nuclear norm of L, $\|S\|_1$ denotes the $\ell_1$ norm of S, and $\lambda$ is a regularization parameter used to adjust the proportion of the foreground-layer component obtained by the decomposition.
102) Write the robust principal component analysis model (the mathematical model of the RPCA algorithm) in Lagrangian form:

$$\min_{L,S}\ \tfrac{1}{2}\,\|D - H_1 L - H_2 S\|_F^2 + \lambda_1 \|L\|_* + \lambda_2 \|S\|_{1,2}$$

wherein $H_1$ and $H_2$ are the measurement matrices of L and S respectively, here taken as $H_1 = H_2 = I$; $\|S\|_{1,2}$ denotes the mixed $\ell_{1,2}$ norm of S; and $\lambda_1$, $\lambda_2$ are the regularization parameters of L and S respectively.
103) Solve the Lagrangian-form model with the soft-threshold iterative algorithm. During the iterations, the low-rank matrix L and the sparse matrix S are updated until convergence. At the (k+1)-th iteration, $L^{k+1}$ and $S^{k+1}$ are updated as:

$$L^{k+1} = \mathrm{SVT}_{\lambda_1 / L_f}\!\left( \left(I - \tfrac{1}{L_f} H_1^H H_1\right) L^k - \tfrac{1}{L_f} H_1^H H_2 S^k + \tfrac{1}{L_f} H_1^H D \right)$$

$$S^{k+1} = \mathcal{T}_{\lambda_2 / L_f}\!\left( \left(I - \tfrac{1}{L_f} H_2^H H_2\right) S^k - \tfrac{1}{L_f} H_2^H H_1 L^k + \tfrac{1}{L_f} H_2^H D \right)$$

wherein $\mathrm{SVT}_{\alpha}(\cdot)$ is the singular value thresholding operator, $\mathcal{T}_{\alpha}(\cdot)$ is the soft-thresholding operator, $L_f$ is the Lipschitz constant, and $H_1^H$ and $H_2^H$ are the conjugate transposes of $H_1$ and $H_2$.
104) Deeply unfold the robust principal component analysis solution: replace the coefficient matrix terms built from $H_1$ and $H_2$ with convolutional layers, and replace the multiplications with convolution operations. The k-th layer of the expanded network is then computed as:

$$L^{k+1} = \mathrm{SVT}_{\lambda_1^k}\!\left( P_1^k * L^k + P_2^k * S^k + P_3^k * D \right)$$

$$S^{k+1} = \mathcal{T}_{\lambda_2^k}\!\left( P_4^k * S^k + P_5^k * L^k + P_6^k * D \right)$$

wherein $*$ denotes the convolution operator, $P_1^k, \dots, P_6^k$ are convolutional layers, and $\lambda_1^k$, $\lambda_2^k$ are regularization parameters.
105) Connect the expanded iterative network layers to construct the deep expansion network. In this example, 4 iteration layers are constructed; the convolution kernels of the first two layers have size 5 and those of the last two layers have size 3. The regularization parameter $\lambda_1$ of the low-rank component is 0.4, and the regularization parameter $\lambda_2$ of the sparse component is 1.8.
106) A mean pooling layer is embedded at the start of the iteration layers to down-sample the input.
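The mean pooling of step 106) averages non-overlapping blocks of the input; a minimal pure-Python sketch (the 2 × 2 window is an assumption for illustration — the patent does not state the pooling size):

```python
def mean_pool2x2(img):
    """Down-sample a 2-D list-of-lists image by averaging 2x2 blocks."""
    h, w = len(img), len(img[0])
    return [[(img[i][j] + img[i][j + 1] + img[i + 1][j] + img[i + 1][j + 1]) / 4.0
             for j in range(0, w, 2)]
            for i in range(0, h, 2)]

pooled = mean_pool2x2([[1, 2, 3, 4],
                       [5, 6, 7, 8],
                       [9, 10, 11, 12],
                       [13, 14, 15, 16]])
# each output pixel is the mean of one 2x2 block of the input
```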
107) A residual module and a recurrent neural network module are embedded after the robust principal component analysis module. In this embodiment, the recurrent neural network module is a convolutional long short-term memory (ConvLSTM) network.
108) Segment the original blood vessel image sequence with SVS-net to obtain a blood vessel region segmentation map sequence and a background region segmentation map sequence.
109) Solve the background region segmentation map sequence with a t-TNN tensor completion model to obtain the background layer data tensor image.
110) Divide the original image sequence data by the elements at the same positions in the background layer data tensor to obtain the blood vessel data tensor image.
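The element-wise division of step 110) can be sketched as follows (a minimal illustration; the small epsilon guarding against division by zero is an assumption not specified in the patent):

```python
def divide_by_background(original, background, eps=1e-6):
    """Element-wise division of an image (list of lists) by its background layer."""
    return [[o / (b + eps) for o, b in zip(row_o, row_b)]
            for row_o, row_b in zip(original, background)]

vessel = divide_by_background([[50.0, 100.0]], [[100.0, 100.0]])
# pixels darker than the background (vessels) yield values below 1
```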
111) Train the robust principal component analysis deep expansion network with the blood vessel data tensor image to obtain a trained model.
112) Split the image sequence from which blood vessels are to be extracted into sub-blocks, input the sub-blocks into the trained model, and stitch the outputs to obtain the complete output image. In this embodiment, the sub-blocks have size 64 × 64 × 20 (length and width 64, 20 frames), and adjacent sub-blocks overlap by 50%.
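A 50% overlap means the stride is half the sub-block size; the window start indices along one axis can be computed as follows (a sketch; clamping the final window to the image border is an assumption about how partial windows are handled):

```python
def patch_starts(length, patch, overlap=0.5):
    """Start indices of overlapping windows covering [0, length)."""
    stride = max(1, int(patch * (1 - overlap)))
    starts = list(range(0, max(length - patch, 0) + 1, stride))
    if starts[-1] + patch < length:      # clamp a final window to the border
        starts.append(length - patch)
    return starts

starts = patch_starts(512, 64)   # 64-pixel patches, 50% overlap -> stride 32
```

Applied per axis (512 × 512 spatially, frame count temporally), this yields the overlapping 64 × 64 × 20 sub-blocks described above.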
The overall implementation of the above-described accurate extraction of vessels from an X-ray angiographic image sequence is shown in fig. 1. The structure of the deep network iteration layer in the embodiment is shown in fig. 2.
The present embodiment further illustrates the above method on 43 clinical angiography image sequences: each sequence contains 30-140 frames, each frame has a resolution of 512 × 512, each pixel represents an actual size of 0.3 mm × 0.3 mm, and the pixel bit depth is 8.
This blood vessel extraction process for X-ray angiography image sequences effectively reduces the influence of background vessel structures and complex mixed noise, and markedly improves the extraction of small blood vessels.
The foregoing detailed description of the preferred embodiments of the invention has been presented. It should be understood that numerous modifications and variations could be devised by those skilled in the art in light of the present teachings without departing from the inventive concepts. Therefore, the technical solutions available to those skilled in the art through logic analysis, reasoning and limited experiments based on the prior art according to the concept of the present invention should be within the scope of protection defined by the claims.

Claims (10)

1. A method for extracting target object information from a video based on super-resolution is applied to a transmission imaging video, and is characterized by comprising the following steps:
acquiring a video sequence containing a target object;
dividing the video sequence into sub-blocks, inputting the sub-blocks into a trained deep expansion network model for solving, and splicing the output to obtain a prediction result of a target object;
the deep expansion network model is a convolution robust principal component analysis deep expansion network and is constructed and obtained by combining a super-resolution module according to a deep expansion algorithm based on robust principal component analysis.
2. The super-resolution-based method for extracting target object information from a video according to claim 1, wherein the specific construction process of the deep-expansion network model is as follows:
constructing a robust principal component analysis model based on video characteristics, and converting the robust principal component analysis model into a Lagrange form model;
performing iterative solution on the Lagrange formal model to obtain a calculation formula of each motion layer of the video;
carrying out depth expansion on the calculation formulas of the motion layers of the video to obtain a plurality of iteration layers;
and combining a plurality of iteration layers and a super-resolution module into the deep expansion network model.
3. The super-resolution-based method for extracting target object information from a video according to claim 2, wherein the iterative solution is implemented by using a linear inverse problem solution algorithm.
4. The super-resolution-based method for extracting target object information from a video according to claim 2, wherein each motion layer of the video comprises an approximately static background layer and a moving target object layer.
5. The super-resolution-based method for extracting target object information from a video according to claim 2, wherein the performing depth expansion specifically comprises: and replacing the coefficient items in the calculation formulas of all the motion layers with convolution layers, and replacing the multiplication operation with the convolution operation.
6. The super-resolution-based method for extracting target object information from a video according to claim 2, wherein the super-resolution module comprises a sampling layer and a sub-block sparse feature selection network layer, and the combination of the plurality of iteration layers and the super-resolution module is specifically:
and embedding the sampling layer at the start position of the iteration layer, and embedding the sub-block sparse feature selection network layer at the end position of the iteration layer.
7. The super-resolution-based method for extracting target object information from a video according to claim 6, wherein the sub-block sparse feature selection network layer comprises a residual network layer and a recurrent neural network layer.
8. The super-resolution-based method for extracting target object information from a video according to claim 1, wherein the conventional target object extraction algorithm comprises a background-completion-based extraction method.
9. The super-resolution-based method for extracting target object information from a video according to claim 1, wherein the deep-expanded network model is trained with label samples, the label samples are weakly supervised label samples, and are obtained by using a traditional target object extraction algorithm or manual labeling.
10. Use of the super resolution based method of extracting target object information from video according to any of claims 1-9 in X-ray angiography video.
CN202111272433.9A 2021-10-29 2021-10-29 Method for extracting target object information from video based on super-resolution and application Pending CN114170076A (en)

Publications (1)

Publication Number Publication Date
CN114170076A true CN114170076A (en) 2022-03-11

Family

ID=80477516


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination