CN109168002B - Video signal measurement domain estimation method based on compressed sensing and convolutional neural network - Google Patents


Info

Publication number
CN109168002B
CN109168002B (application CN201810831091.1A)
Authority
CN
China
Prior art keywords: estimated, convolutional neural network, measurement, macroblock
Prior art date
Legal status: Active
Application number
CN201810831091.1A
Other languages
Chinese (zh)
Other versions
CN109168002A (en)
Inventor
Guo Jie (郭洁)
Lyu Junmei (吕军梅)
Song Bin (宋彬)
Yao Jipeng (姚继鹏)
Current Assignee
Xidian University
Original Assignee
Xidian University
Priority date
Filing date
Publication date
Application filed by Xidian University filed Critical Xidian University
Priority to CN201810831091.1A priority Critical patent/CN109168002B/en
Publication of CN109168002A publication Critical patent/CN109168002A/en
Application granted granted Critical
Publication of CN109168002B publication Critical patent/CN109168002B/en

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/132: Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/70: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/90: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a video signal measurement-domain estimation method based on compressed sensing and a convolutional neural network, which comprises the following steps: dividing reference frame image data into a plurality of macroblocks; randomly selecting a macroblock to be estimated, and selecting the four macroblocks adjacent to it in a first direction and a second direction; calculating the pixel value and the true measurement value of the macroblock to be estimated from the four adjacent macroblocks; constructing a convolutional neural network model and calculating a predicted measurement value of the macroblock to be estimated; and training the convolutional neural network model on the true and predicted measurement values, obtaining the optimal parameters when the loss function at the output layer of the model falls below a preset threshold. The invention overcomes the inability of the prior art to balance computational complexity and model robustness, can quickly analyze the temporal correlation of video frames, and still achieves real-time, accurate analysis of macroblocks at arbitrary positions even at very low measurement rates and under arbitrarily changing motion vectors.

Description

Video signal measurement domain estimation method based on compressed sensing and convolutional neural network
Technical Field
The invention belongs to the technical field of video processing, and particularly relates to a video signal measurement domain estimation method based on compressed sensing and a convolutional neural network.
Background
With the rapid development of information technology, the demands placed on multimedia information such as images and video keep growing, as does the pressure on signal acquisition equipment such as cameras and video cameras. The conventional Nyquist sampling theorem states that, to recover an analog signal without distortion, the sampling frequency must be no less than twice the maximum frequency of the signal. Digital signals obtained in this way have a huge data volume and substantial information redundancy, which hinders storage and transmission. Sampling techniques based on compressed sensing (CS) theory lower the required sampling frequency, enable information acquisition with low power consumption and low complexity, and support more effective acquisition, transmission, storage, and processing of information.
To process video efficiently, a compressed video sensing (CVS) codec system based on CS theory has been proposed. It provides an efficient, low-complexity way of processing video information. However, in a CVS system the original pixel-domain signal is converted to the measurement domain after a single sampling pass, and since only the measured values of the blocks are available, the temporal correlation between video frames is difficult to obtain accurately; motion information of the video sequence therefore cannot be obtained directly in the measurement domain. Motion information between frames allows the decoding end to recover the original video signal well while reducing the amount of data the encoding end must transmit, thereby further improving compressed-sampling efficiency.
On the other hand, many tools have emerged for processing visual information, speech, and natural language, and deep learning is widely used as a highly effective one. Among deep learning models, the convolutional neural network (CNN) is the most representative and is well suited to processing image and video data. In 1959, Hubel and Wiesel discovered that cells in the animal visual cortex are responsible for detecting optical signals. In the 1990s, LeCun et al. published the papers that established the modern structure of the CNN. CNNs achieved a striking result in the 2012 ImageNet competition, which directly established their importance. Through continuous improvement of structures and algorithms, the strong feature-extraction and fitting capability of CNNs has led to their wide use in many fields.
In recent years, many researchers have sought effective combinations of CS and CNNs to address the challenges posed by CVS. In the "CS-CNN" method proposed by Y. Shen, T. Han et al., the CS sampling and reconstruction process is implemented directly with a CNN structure, which increases computational complexity and ignores the temporal correlation between video frames. In addition, the article "Estimation of measurement for block-based compressed video sensing" proposes a macroblock-based CVS measurement-domain estimation model; its disadvantages are that the pseudo-inverse of the measurement matrix is approximated through singular value decomposition, the model's robustness is low, and the estimation error becomes very large when a Gaussian matrix is used as the measurement matrix.
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides a video signal measurement domain estimation method based on compressed sensing and a convolutional neural network. The technical problem to be solved by the invention is realized by the following technical scheme:
the embodiment of the invention provides a video signal measurement domain estimation method based on compressed sensing and a convolutional neural network, which comprises the following steps:
dividing a video sequence into a plurality of image groups, wherein each image group comprises at least two frames of image data, selecting a first frame of each image group as a reference frame, and dividing the reference frame image data into a plurality of macro blocks;
selecting a macroblock to be estimated at an arbitrary position in the reference frame image data, and selecting the four macroblocks adjacent to the macroblock to be estimated in a first direction and a second direction;
calculating the pixel value and the real measured value of the macro block to be estimated according to the four adjacent macro blocks;
constructing a convolutional neural network model, and calculating a predicted measurement value of a macro block to be estimated;
and training the convolutional neural network model according to the real measured value and the predicted measured value, and obtaining an optimal parameter when the loss function of the output layer of the convolutional neural network model is lower than a preset threshold value.
In an embodiment of the present invention, the calculating the pixel value of the macroblock to be estimated includes:
and carrying out modeling operation on the macro block to be estimated and the four adjacent macro blocks to obtain the pixel value of the macro block to be estimated.
In one embodiment of the present invention, the formula of the modeling operation is:
x(B') = Σ_{i=1}^{4} Γ_i x(B_i)

where x(B') represents the pixel value of the macroblock to be estimated, Σ represents the summation operation, Γ_i represents the positional relationship matrix of the four adjacent macroblocks, and x(B_i) represents the pixel values of the four adjacent macroblocks, i = 1, 2, 3, 4.
In an embodiment of the present invention, the calculating the true measurement value of the macroblock to be estimated includes:
calculating the measured values of four adjacent macro blocks according to a preset rule;
and calculating the real measured value of the macroblock to be estimated according to the measured values of the four adjacent macroblocks.
In an embodiment of the present invention, the preset rule is:
y(B_i) = Φx(B_i)
where y(B_i) denotes the measured value of a macroblock, x(B_i) denotes the pixel value of a macroblock, i denotes the index of the macroblock, and Φ denotes the measurement matrix.
In an embodiment of the present invention, the actual measurement values of the macroblock to be estimated are:
y_true = Σ_{i=1}^{4} Λ_i y(B_i) = Σ_{i=1}^{4} Λ_i Φx(B_i)

where y_true represents the true measurement value of the macroblock to be estimated, Σ represents the summation operation, Λ_i represents a weighting-coefficient matrix determined by the motion vector and the measurement matrix, y(B_i) represents the measured values of the four adjacent macroblocks, x(B_i) represents the pixel values of the four adjacent macroblocks, i = 1, 2, 3, 4, and Φ represents the measurement matrix.
In one embodiment of the present invention, the convolutional neural network model includes: four convolutional neural networks and one perceptron layer.
In an embodiment of the present invention, the prediction measurement values of the macroblock to be estimated are:
y_pred = Σ_{i=1}^{4} w_i f_i(ΦΓ_i) y(B_i)

where y_pred represents the predicted measurement value of the macroblock to be estimated, Σ represents the summation operation, w_i represents the weights of the four adjacent macroblocks, f_i represents the convolutional neural network corresponding to each adjacent macroblock, ΦΓ_i represents the input of that convolutional neural network, y(B_i) represents the measured values of the four adjacent macroblocks, x(B_i) represents the pixel values of the four adjacent macroblocks, i = 1, 2, 3, 4, and Φ represents the measurement matrix.
In one embodiment of the present invention, the loss function is measurement domain dependent noise, and the expression is:
J = ||y_pred - y_true||_2 / ||y_true||_2

where J denotes the measurement-domain correlated noise, y_true represents the true measurement value of the macroblock to be estimated, and y_pred represents the predicted measurement value of the macroblock to be estimated.
In one embodiment of the present invention, the preset threshold is 0.003.
Compared with the prior art, the invention has the beneficial effects that:
1. The invention introduces a convolutional neural network into the video compressed sensing system and uses it to train the pseudo-inverse of the measurement matrix and the block weights. This overcomes the inability of the prior art to balance computational complexity and model robustness; the accuracy and real-time performance of the method are superior to the prior art, accurate measurement-domain estimation of macroblocks at arbitrary positions is achieved, and the method can be used to quickly analyze the temporal correlation of video frames.
2. The method is based on supervised training on a labeled data set. Leveraging the advantages of the convolutional neural network structure, it fully exploits properties of the data such as locality through downsampling and guarantees a degree of invariance to displacement and deformation, so that the model acquires strong generalization capability. The invention overcomes the difficulty that, in the prior art, measurement-rate requirements make multimedia information processing hard under limited resources; even at very low measurement rates and under arbitrarily changing motion vectors, it still achieves accurate, real-time analysis of macroblocks at arbitrary positions and can effectively process image and video information in resource-limited environments.
Drawings
FIG. 1 is a flow chart of a video signal measurement domain estimation method based on compressive sensing and convolutional neural networks according to an embodiment of the present invention;
FIG. 2 is a decomposition diagram of the relative position of the macroblock to be estimated according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a convolutional neural network structure provided in an embodiment of the present invention;
FIG. 4 is an overall framework diagram of a convolutional neural network model provided by an embodiment of the present invention;
FIG. 5 is a comparison of the correlated noise for simulation experiment 1 provided by an embodiment of the present invention;
FIG. 6 is a comparison of the correlated noise and time complexity for simulation experiment 2 provided by an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to specific examples, but the embodiments of the present invention are not limited thereto.
Example one
Referring to fig. 1, fig. 1 is a flowchart of a video signal measurement domain estimation method based on compressive sensing and convolutional neural network according to an embodiment of the present invention.
The invention provides a video signal measurement domain estimation method based on compressed sensing and a convolutional neural network, which comprises the following steps:
the method comprises the steps of dividing a video sequence into a plurality of image groups, wherein each image group comprises at least two frames of image data, selecting a first frame of each image group as a reference frame, and dividing the reference frame image data into a plurality of macro blocks.
In order to collect enough samples, the first 15 frames of the video sequence are selected for analysis and modeling, and every two frames of image data form an image group.
Firstly, the reference frame image data is divided into n two-dimensional macroblocks, and a two-dimensional to one-dimensional conversion is performed to obtain a column vector x_i for each macroblock, where the macroblocks all have the same size and do not overlap, i denotes the index of a macroblock, and n is an integer greater than 1. The embodiment of the invention divides each frame image into fixed macroblocks of size 8 × 8.
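The blocking step above can be illustrated with a minimal NumPy sketch (the function name and the use of a CIF-sized frame are illustrative assumptions, not taken from the patent):

```python
import numpy as np

def divide_into_macroblocks(frame, block_size=8):
    """Divide a 2-D frame into non-overlapping block_size x block_size
    macroblocks and vectorize each into a column vector x_i."""
    h, w = frame.shape
    assert h % block_size == 0 and w % block_size == 0
    columns = []
    for r in range(0, h, block_size):
        for c in range(0, w, block_size):
            # two-dimensional to one-dimensional conversion of one macroblock
            columns.append(frame[r:r + block_size, c:c + block_size].reshape(-1))
    return np.stack(columns, axis=1)   # shape (block_size**2, n)

# Example: a CIF luma frame is 352 x 288 pixels, giving n = 44 * 36 = 1584 blocks
frame = np.random.randint(0, 256, size=(288, 352)).astype(np.float64)
X = divide_into_macroblocks(frame)     # shape (64, 1584); column i is x_i
```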
Next, a macroblock to be estimated is selected at an arbitrary position in the reference frame image data, and the four macroblocks adjacent to it in the first direction and the second direction are selected.
The decomposition of the macroblock vector to be estimated is further described with reference to fig. 2:
In fig. 2, assuming that the motion vector MV is (m, n), the pixel value of the macroblock B' to be estimated can be decomposed into a combination of the pixel values of the four adjacent macroblocks B_i, i = 1, 2, 3, 4, in the first direction and the second direction. Here, B_1' is the combination in the first direction of the pixel values of macroblocks B_i, i = 1, 2; B_2' is the combination in the first direction of the pixel values of macroblocks B_i, i = 3, 4; and B' is the combination in the second direction of the pixel values of B_i', i = 1, 2.
And calculating the pixel value and the real measurement value of the macroblock to be estimated according to the four adjacent macroblocks.
The calculating the pixel value of the macroblock to be estimated comprises the following steps:
and carrying out modeling operation on the macro block to be estimated and the four adjacent macro blocks to obtain the pixel value of the macro block to be estimated.
The formula of the modeling operation is as follows:
x(B') = Σ_{i=1}^{4} Γ_i x(B_i)

where x(B') represents the pixel value of the macroblock to be estimated, Σ represents the summation operation, Γ_i represents the positional relationship matrix of the four adjacent macroblocks, and x(B_i) represents the pixel values of the four adjacent macroblocks, i = 1, 2, 3, 4.
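For illustration, the following sketch realizes the Γ_i as 0/1 selection matrices for a motion vector MV = (m, n) with 0 ≤ m, n < 8. The 2 × 2 layout of the neighbors (B_1 top-left, B_2 top-right, B_3 bottom-left, B_4 bottom-right) and the explicit construction are assumptions; the patent only states the linear form above.

```python
import numpy as np

def gamma_matrices(mv, b=8):
    """Build the four 0/1 positional relationship matrices Gamma_i for
    motion vector mv = (m, n), 0 <= m, n < b, under the assumed layout:
    the estimated block B' is the b x b window at offset (m, n) inside
    the 2b x 2b composite of the four adjacent blocks."""
    m, n = mv
    gammas = [np.zeros((b * b, b * b)) for _ in range(4)]
    for r in range(b):
        for c in range(b):
            R, C = m + r, n + c                  # position in the composite
            q = 2 * (R >= b) + (C >= b)          # 0: B1, 1: B2, 2: B3, 3: B4
            gammas[q][r * b + c, (R % b) * b + (C % b)] = 1.0
    return gammas

# x(B') = sum_i Gamma_i x(B_i): compose the displaced block from its neighbors
b, mv = 8, (3, 1)
x_blocks = [np.random.rand(b * b) for _ in range(4)]    # x(B_i), i = 1..4
x_est = sum(G @ x for G, x in zip(gamma_matrices(mv, b), x_blocks))
```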
The calculating the true measurement value of the macro block to be estimated comprises the following steps:
and calculating the measured values of the four adjacent macro blocks according to a preset rule.
The preset rule is as follows:
y(B_i) = Φx(B_i)
where y(B_i) denotes the measured value of a macroblock, x(B_i) denotes the pixel value of a macroblock, i denotes the index of the macroblock, and Φ denotes the measurement matrix.
The specific embodiment of the invention adopts a Gaussian matrix and a partial Hadamard matrix, respectively, as the measurement matrix; the ratio of the number of rows to the number of columns of the measurement matrix is the measurement rate MR. The MR range in the specific embodiment is 0.1-0.7, which verifies the good generalization capability of the invention.
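A sketch of the two measurement matrices used in the embodiment and of the preset rule y(B_i) = Φx(B_i); the row selection and normalization conventions are assumptions, since the patent does not specify them:

```python
import numpy as np
from scipy.linalg import hadamard

def gaussian_measurement_matrix(mr, n=64, seed=0):
    """Random Gaussian measurement matrix; MR = rows / columns."""
    m = int(round(mr * n))
    rng = np.random.default_rng(seed)
    return rng.standard_normal((m, n)) / np.sqrt(m)

def partial_hadamard_matrix(mr, n=64, seed=0):
    """Randomly selected rows of the n x n Hadamard matrix
    (n must be a power of two, which holds for 8 x 8 macroblocks)."""
    m = int(round(mr * n))
    rng = np.random.default_rng(seed)
    rows = rng.choice(n, size=m, replace=False)
    return hadamard(n)[rows, :] / np.sqrt(n)

# y(B_i) = Phi x(B_i): measure one vectorized 8 x 8 macroblock at MR = 0.3
phi = gaussian_measurement_matrix(mr=0.3)   # shape (19, 64)
x_i = np.random.rand(64)                    # pixel values of one macroblock
y_i = phi @ x_i                             # measured value of the macroblock
```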
And calculating the real measured value of the macro block to be estimated according to the measured values of the four adjacent macro blocks.
The actual measurement values of the macroblock to be estimated are:
y_true = Σ_{i=1}^{4} Λ_i y(B_i) = Σ_{i=1}^{4} Λ_i Φx(B_i)

where y_true represents the true measurement value of the macroblock to be estimated, Σ represents the summation operation, Λ_i represents a weighting-coefficient matrix determined by the motion vector and the measurement matrix, y(B_i) represents the measured values of the four adjacent macroblocks, x(B_i) represents the pixel values of the four adjacent macroblocks, i = 1, 2, 3, 4, and Φ represents the measurement matrix.
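Continuing the sketches above, the true measurement can be computed exactly from the neighbors' pixel values; the closed form Λ_i ≈ ΦΓ_iΦ⁺ shown for comparison is our reading of the reference model's pseudo-inverse approximation, an assumption rather than a formula stated in the patent:

```python
import numpy as np

# Reuses gamma_matrices and gaussian_measurement_matrix from the sketches above
b, mv = 8, (3, 1)
phi = gaussian_measurement_matrix(mr=0.3)               # (19, 64)
gammas = gamma_matrices(mv, b)
x_blocks = [np.random.rand(b * b) for _ in range(4)]    # x(B_i)
y_blocks = [phi @ x for x in x_blocks]                  # y(B_i) = Phi x(B_i)

# Exact true measurement: y_true = Phi x(B') = sum_i Phi Gamma_i x(B_i)
y_true = sum(phi @ G @ x for G, x in zip(gammas, x_blocks))

# Pseudo-inverse approximation (assumption): Lambda_i ~ Phi Gamma_i pinv(Phi),
# so y_true ~ sum_i Lambda_i y(B_i), which is the role the CNN branches learn
phi_pinv = np.linalg.pinv(phi)
y_approx = sum(phi @ G @ phi_pinv @ y for G, y in zip(gammas, y_blocks))
```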
And constructing a convolutional neural network model, and calculating a prediction measurement value of the macro block to be estimated.
The convolutional neural network model comprises four convolutional neural networks and one perceptron layer. Each convolutional neural network comprises 3 convolutional layers, each composed of several two-dimensional feature-mapping planes, with the ReLU activation function y = max(0, x); in particular, because the measurement-domain vector dimension is relatively low, the convolutional neural networks contain no pooling layer. The perceptron layer adopts a fully connected network structure, and its activation function is the identity function y = x.
The convolutional neural network in the convolutional neural network model is described below with reference to fig. 3.
The input size of the convolutional neural network in fig. 3 is 64 × 64; each intermediate layer is composed of several feature maps, each feature map is composed of several neural units, and all neural units of the same feature map share one convolution kernel. The size of the first-layer convolution kernel is set to 1 × 1 with stride (1, 3); the size of the second-layer convolution kernel is 1 × 1 with stride (1, 3); the number of feature maps is set to 1 in both the first and second layers. The convolution kernel is translated over the two-dimensional plane of the image; each element of the kernel is multiplied by the corresponding position of the convolved image and the products are summed. The feature output of the next layer is obtained as the kernel moves continuously. The convolution operation can enhance the original signal features and reduce noise.
The overall framework of the convolutional neural network model is described below with reference to fig. 4.
In FIG. 4, ΦΓ_i is input to the corresponding convolutional neural network f_i, and the corresponding output, obtained through the transformations and operations of the convolutional neural network, is denoted f_i(ΦΓ_i), i = 1, 2, 3, 4. The measured values y(B_i) of the four macroblocks adjacent to the macroblock to be estimated in the first and second directions are combined with the outputs f_i(ΦΓ_i) of the corresponding convolutional neural networks, i = 1, 2, 3, 4, through a weighted linear summation with the perceptron-layer weight parameters w_i to obtain the output y_pred of the convolutional neural network model, namely:

y_pred = Σ_{i=1}^{4} w_i f_i(ΦΓ_i) y(B_i)

where y_pred represents the predicted measurement value of the macroblock to be estimated, Σ represents the summation operation, w_i represents the weights of the four adjacent macroblocks, f_i represents the convolutional neural network corresponding to each adjacent macroblock, ΦΓ_i represents the input of that convolutional neural network, y(B_i) represents the measured values of the four adjacent macroblocks, x(B_i) represents the pixel values of the four adjacent macroblocks, i = 1, 2, 3, 4, and Φ represents the measurement matrix.
The convolutional neural network thus converts the input ΦΓ_i into a weighting-coefficient matrix Λ_i, which has the same transforming effect as the pseudo-inverse of the measurement matrix Φ.
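A minimal Keras sketch of the model of fig. 3 and fig. 4 at MR = 0.3 (M = 19 measurements per block). The 1 × 1 kernels, (1, 3) strides, single feature maps, ReLU activations, absence of pooling, and identity-activation perceptron layer follow the description above; the third convolutional layer's parameters, the zero-padding of ΦΓ_i to a 64 × 64 input, and the Dense mapping from the convolutional output to the M × M matrix Λ_i are assumptions, since the patent leaves them unspecified:

```python
import tensorflow as tf

M, N = 19, 64   # M = round(0.3 * 64) measurements per block, N = 64 pixels

def build_branch():
    """One branch CNN f_i: three convolutional layers, no pooling layer."""
    inp = tf.keras.Input(shape=(N, N, 1))   # Phi * Gamma_i, zero-padded to 64 x 64
    h = tf.keras.layers.Conv2D(1, (1, 1), strides=(1, 3), activation="relu")(inp)
    h = tf.keras.layers.Conv2D(1, (1, 1), strides=(1, 3), activation="relu")(h)
    h = tf.keras.layers.Conv2D(1, (1, 1), activation="relu")(h)  # assumed params
    h = tf.keras.layers.Flatten()(h)
    lam = tf.keras.layers.Reshape((M, M))(tf.keras.layers.Dense(M * M)(h))
    return tf.keras.Model(inp, lam)          # output plays the role of Lambda_i

mat_inputs = [tf.keras.Input(shape=(N, N, 1)) for _ in range(4)]   # Phi Gamma_i
y_inputs = [tf.keras.Input(shape=(M,)) for _ in range(4)]          # y(B_i)

# f_i(Phi Gamma_i) y(B_i): each branch produces Lambda_i, applied to y(B_i)
branch_outputs = []
for mat_in, y_in in zip(mat_inputs, y_inputs):
    lam = build_branch()(mat_in)
    o = tf.keras.layers.Lambda(lambda t: tf.linalg.matvec(t[0], t[1]))([lam, y_in])
    branch_outputs.append(o)

# Perceptron layer: weighted linear sum with identity activation (weights w_i)
stacked = tf.keras.layers.Lambda(lambda ts: tf.stack(ts, axis=-1))(branch_outputs)
y_pred = tf.keras.layers.Dense(1, use_bias=False)(stacked)   # (batch, M, 1)
y_pred = tf.keras.layers.Reshape((M,))(y_pred)

model = tf.keras.Model(mat_inputs + y_inputs, y_pred)
```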
The convolutional neural network model is trained according to the true measurement value and the predicted measurement value. The specific embodiment of the invention adopts a mini-batch stochastic gradient descent optimization method with a stepwise learning-rate adjustment scheme, and the loss function is defined as the measurement-domain correlated noise J = ||y_pred - y_true||_2 / ||y_true||_2 (a training sketch is given after step (3) below).
Let X denote an input comprising: the measurement matrix Φ, the spatial position relation matrices Γ_i of the four adjacent macroblocks, and the measured values y(B_i) of the four adjacent macroblocks, i = 1, 2, 3, 4; let y denote the label, y = y_true.
The training of the convolutional neural network model according to the real measurement value and the prediction measurement value comprises the following steps:
(1) Forward propagation phase: a batch of samples X is input into the convolutional neural network model to compute the corresponding output. In this phase, information is transformed step by step from the input layer of the model; sufficiently complex nonlinear features are extracted and combined using the weight-sharing and local-connection properties of the convolutional layers and passed to the output layer of the model.
(2) Backward propagation phase: the error between the actual output y_pred of the convolutional neural network model and the sample label y_true is computed, and the error back-propagation algorithm is used to propagate it back layer by layer and adjust the weights of the model, so that the measurement-domain correlated noise decreases continuously.
(3) Steps (1) and (2) are repeated, with the next batch continually fed into the convolutional neural network model for training, until the loss function at the output layer of the model falls below 0.003; training then yields the optimal parameters w_1, w_2, w_3, w_4 and the weights and biases of the convolutional neural networks f_1, f_2, f_3, f_4.
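A hedged sketch of this training stage: the loss J, a callback that stops training once the loss falls below the preset threshold 0.003, and mini-batch SGD with a step-decay learning rate (the exact schedule is an assumption). Synthetic stand-in data replaces the Foreman samples, and `model` is the four-branch network sketched above:

```python
import numpy as np
import tensorflow as tf

def measurement_domain_noise(y_true, y_pred):
    """Loss J = ||y_pred - y_true||_2 / ||y_true||_2, computed per sample."""
    return tf.norm(y_pred - y_true, axis=-1) / tf.norm(y_true, axis=-1)

class StopBelowThreshold(tf.keras.callbacks.Callback):
    """Stop training once the loss drops below the preset threshold."""
    def __init__(self, threshold=0.003):
        super().__init__()
        self.threshold = threshold
    def on_epoch_end(self, epoch, logs=None):
        if logs and logs.get("loss", np.inf) < self.threshold:
            self.model.stop_training = True

# Step-decay learning-rate schedule (assumed form of the gradual adjustment)
schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=1e-3, decay_steps=1000, decay_rate=0.5, staircase=True)

model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=schedule),
              loss=measurement_domain_noise)

# Synthetic stand-in data: X = [Phi*Gamma_1..4 padded to 64 x 64, y(B_1)..y(B_4)]
X_train = ([np.random.rand(300, 64, 64, 1).astype("float32") for _ in range(4)]
           + [np.random.rand(300, 19).astype("float32") for _ in range(4)])
y_train = np.random.rand(300, 19).astype("float32")   # label y = y_true

model.fit(X_train, y_train, batch_size=300, epochs=100,
          callbacks=[StopBelowThreshold(0.003)])
```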
Based on the trained model, an average relative error of 0.0038 was measured given MR = 0.3 and MV = (3, 1). As the MR and MV changed, the measured average relative error fluctuated between 0.0039 and 0.0068. The convolutional neural network model thus has stronger modeling capability than the traditional method.
The effects of the present invention can be further explained by the following simulation experiments.
1. Simulation experiment conditions are as follows:
the experimental simulation environment of the invention is as follows:
operating the system: ubuntu 14.04, python2.7
An experiment platform: tensorflow-1.4
A processor: intel Core i7-7700k CPU @4.2GHZ 8
A display card: NVIDIA 1080Ti GPU
Memory: 15.6GB
The data source used in the simulation experiments is the representative Foreman CIF video sequence. The first 15 frames of the Foreman CIF sequence are compressively sampled, for a total of 11424 samples, randomly divided into a training set of 9632 and a test set of 1792. The motion vector is set to MV = (3, 1), where 3 denotes horizontal motion and 1 denotes vertical motion; the measurement rate MR is 0.3.
2. Simulation experiment contents:
the simulation experiment of the invention is divided into two simulation experiments.
Simulation experiment I: in the same simulation environment (fig. 5), with the Gaussian measurement matrix used for both the reference method and the proposed method, the correlated noise of the proposed method is almost one sixtieth that of the reference method. (For a clear comparison on one graph, the logarithmic correlated-noise error of the proposed method with the Gaussian measurement matrix is scaled down by a factor of 15.)

In the same simulation environment (fig. 5), with the partial Hadamard measurement matrix used for both the reference method and the proposed method, the correlated noise of the reference method is 2-3 times that of the proposed method. (For a clear comparison on one graph, the logarithmic correlated-noise errors of both methods are simultaneously scaled down by a factor of 15.)
Simulation experiment II: accurate macroblock estimation under different motion vectors and measurement rates.
Correlation analysis experiments were run with different MVs and MRs using the Gaussian measurement matrix and a batch size of 300. In fig. 6, with the measurement-domain correlated noise (denoted CN) and the time complexity (in seconds) as the evaluation criteria, the proposed method remains greatly superior to the reference method as MV and MR change.
3. Simulation experiment result analysis:
As can be seen from fig. 5, when the measurement matrix is a Gaussian matrix or a partial Hadamard matrix, for a given measurement rate and motion vector, the logarithmic error of the measurement-domain correlated noise of the proposed method has a great advantage over the existing approximate model. As can be seen from fig. 6, as the measurement rate varies over 0.1-0.7 with motion vectors of (2, 2) and (-2, -2) respectively, the proposed method is far superior to the existing approximate model in both the correlated-noise error and the time complexity. In summary, the invention uses a convolutional neural network to train the pseudo-inverse of the measurement matrix and the block weights, so that the estimation model for an arbitrary macroblock in a video frame is markedly improved in accuracy and robustness, in some cases by more than two orders of magnitude. Meanwhile, the proposed model is near real-time: its processing time is almost one eighth that of the existing approximate model, and it can process image and video information in resource-limited environments.
The foregoing is a more detailed description of the invention in connection with specific preferred embodiments and it is not intended that the invention be limited to these specific details. For those skilled in the art to which the invention pertains, several simple deductions or substitutions can be made without departing from the spirit of the invention, and all shall be considered as belonging to the protection scope of the invention.

Claims (3)

1. A video signal measurement domain estimation method based on compressed sensing and a convolutional neural network is characterized by comprising the following steps:
dividing a video sequence into a plurality of image groups, wherein each image group comprises at least two frames of image data, selecting a first frame of each image group as a reference frame, and dividing the reference frame image data into a plurality of macro blocks;
selecting a macro block to be estimated at any position in the reference frame image data, and selecting four adjacent macro blocks in a first direction and a second direction with the macro block to be estimated;
calculating the pixel value and the real measured value of the macro block to be estimated according to the four adjacent macro blocks; the calculating the pixel value of the macroblock to be estimated comprises the following steps: modeling operation is carried out on the macro block to be estimated and the four adjacent macro blocks to obtain a pixel value of the macro block to be estimated; the calculating the true measurement value of the macro block to be estimated comprises the following steps: calculating the measured values of four adjacent macro blocks according to a preset rule; calculating the real measured value of the macro block to be estimated according to the measured values of the four adjacent macro blocks;
constructing a convolutional neural network model, and calculating a predicted measurement value of a macro block to be estimated;
training the convolutional neural network model according to the real measured value and the predicted measured value, and obtaining an optimal parameter when a loss function of an output layer of the convolutional neural network model is lower than a preset threshold value;
the formula of the modeling operation is as follows:
x(B') = Σ_{i=1}^{4} Γ_i x(B_i)

where x(B') represents the pixel value of the macroblock to be estimated, Σ represents the summation operation, Γ_i represents the positional relationship matrix of the four adjacent macroblocks, and x(B_i) represents the pixel values of the four adjacent macroblocks, i = 1, 2, 3, 4;
the preset rule is as follows:
y(B_i) = Φx(B_i)
where y(B_i) denotes the measured value of a macroblock, x(B_i) denotes the pixel value of a macroblock, i denotes the index of the macroblock, and Φ denotes the measurement matrix;
the prediction measurement value of the macroblock to be estimated is as follows:
y_pred = Σ_{i=1}^{4} w_i f_i(ΦΓ_i) y(B_i)

where y_pred represents the predicted measurement value of the macroblock to be estimated, Σ represents the summation operation, w_i represents the weights of the four adjacent macroblocks, f_i represents the convolutional neural network corresponding to each adjacent macroblock, ΦΓ_i represents the input of that convolutional neural network, y(B_i) represents the measured values of the four adjacent macroblocks, x(B_i) represents the pixel values of the four adjacent macroblocks, i = 1, 2, 3, 4, and Φ represents the measurement matrix;
the actual measurement values of the macroblock to be estimated are:
y_true = Σ_{i=1}^{4} Λ_i y(B_i) = Σ_{i=1}^{4} Λ_i Φx(B_i)

where y_true represents the true measurement value of the macroblock to be estimated, Σ represents the summation operation, Λ_i represents a weighting-coefficient matrix determined by the motion vector and the measurement matrix, y(B_i) represents the measured values of the four adjacent macroblocks, x(B_i) represents the pixel values of the four adjacent macroblocks, i = 1, 2, 3, 4, and Φ represents the measurement matrix;
the loss function is measurement domain correlated noise, and the expression is as follows:
J = ||y_pred - y_true||_2 / ||y_true||_2

where J denotes the measurement-domain correlated noise, y_true represents the true measurement value of the macroblock to be estimated, and y_pred represents the predicted measurement value of the macroblock to be estimated.
2. The method of claim 1, wherein the convolutional neural network model comprises: four convolutional neural networks and one perceptron layer.
3. The method of claim 1, wherein the preset threshold is 0.003.
CN201810831091.1A 2018-07-26 2018-07-26 Video signal measurement domain estimation method based on compressed sensing and convolutional neural network Active CN109168002B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810831091.1A CN109168002B (en) 2018-07-26 2018-07-26 Video signal measurement domain estimation method based on compressed sensing and convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810831091.1A CN109168002B (en) 2018-07-26 2018-07-26 Video signal measurement domain estimation method based on compressed sensing and convolutional neural network

Publications (2)

Publication Number Publication Date
CN109168002A CN109168002A (en) 2019-01-08
CN109168002B true CN109168002B (en) 2020-06-12

Family

ID=64898202

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810831091.1A Active CN109168002B (en) 2018-07-26 2018-07-26 Video signal measurement domain estimation method based on compressed sensing and convolutional neural network

Country Status (1)

Country Link
CN (1) CN109168002B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111192334B (en) * 2020-01-02 2023-06-06 苏州大学 Trainable compressed sensing module and image segmentation method
CN111354051B (en) * 2020-03-03 2022-07-15 昆明理工大学 Image compression sensing method of self-adaptive optimization network

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103546749B (en) * 2013-10-14 2017-05-10 上海大学 Method for optimizing HEVC (high efficiency video coding) residual coding by using residual coefficient distribution features and bayes theorem
US10397498B2 (en) * 2017-01-11 2019-08-27 Sony Corporation Compressive sensing capturing device and method
CN107784676B (en) * 2017-09-20 2020-06-05 中国科学院计算技术研究所 Compressed sensing measurement matrix optimization method and system based on automatic encoder network

Also Published As

Publication number Publication date
CN109168002A (en) 2019-01-08

Similar Documents

Publication Publication Date Title
WO2020037965A1 (en) Method for multi-motion flow deep convolutional network model for video prediction
CN110933429B (en) Video compression sensing and reconstruction method and device based on deep neural network
CN111626245B (en) Human behavior identification method based on video key frame
CN107992938B (en) Space-time big data prediction technique and system based on positive and negative convolutional neural networks
CN104199627B (en) Gradable video encoding system based on multiple dimensioned online dictionary learning
CN107680116A (en) A kind of method for monitoring moving object in video sequences
CN110751649A (en) Video quality evaluation method and device, electronic equipment and storage medium
CN108182694B (en) Motion estimation and self-adaptive video reconstruction method based on interpolation
CN102148987A (en) Compressed sensing image reconstructing method based on prior model and 10 norms
CN109168002B (en) Video signal measurement domain estimation method based on compressed sensing and convolutional neural network
Zhao et al. Image compressive-sensing recovery using structured laplacian sparsity in DCT domain and multi-hypothesis prediction
CN102946539B (en) Method for estimating motion among video image frames based on compressive sensing
CN116152591A (en) Model training method, infrared small target detection method and device and electronic equipment
Hu et al. Optimized spatial recurrent network for intra prediction in video coding
CN110728728A (en) Compressed sensing network image reconstruction method based on non-local regularization
Liu et al. Diverse hyperspectral remote sensing image synthesis with diffusion models
Di et al. Learned compression framework with pyramidal features and quality enhancement for SAR images
Ma High-resolution image compression algorithms in remote sensing imaging
Pei et al. MobileViT-GAN: A Generative Model for Low Bitrate Image Coding
Sun et al. Video snapshot compressive imaging using residual ensemble network
CN105894485B (en) A kind of adaptive video method for reconstructing based on signal correlation
CN108769674A (en) A kind of video estimation method based on adaptive stratification motion modeling
Liu et al. An end-to-end multi-scale residual reconstruction network for image compressive sensing
CN114663307B (en) Integrated image denoising system based on uncertainty network
CN112085779A (en) Wave parameter estimation method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant