CN110234011A - Video compression method and system
- Publication number: CN110234011A
- Application number: CN201910318187.2A
- Authority: CN (China)
- Prior art keywords: data, residual error, frame, dimension, encoded
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G06N3/045: Computing arrangements based on biological models; neural networks; architecture, e.g. interconnection topology; combinations of networks
- H04N19/105: Adaptive coding of digital video signals; selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
- H04N19/176: Adaptive coding of digital video signals characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock
- H04N19/42: Coding or decoding of digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
- H04N19/625: Coding or decoding of digital video signals using transform coding using discrete cosine transform [DCT]
Abstract
The invention discloses a video compression method and system. The method includes: determining a frame to be encoded and a reference frame in a target video, and calculating residual data of the frame to be encoded relative to the reference frame; extracting a mean vector and a variance vector of the residual data; and performing normal-distribution sampling on the mean vector and the variance vector to obtain compressed data of the frame to be encoded, wherein the dimension of the compressed data is lower than the dimension of the residual data. The technical solution provided by the present application can compress video files effectively.
Description
Technical field
The present invention relates to the technical field of video processing, and in particular to a video compression method and system.
Background
As video definition continues to improve, the data volume of video files keeps growing. To save the bandwidth used to transmit video files, an efficient and stable video compression scheme is needed.
In current mainstream video compression schemes, the video data is first quantized, and the quantization result is then scanned in order to encode the video file. Specifically, the video data can be quantized with a quantization table, and the quantization result can then be scanned in ZigZag order. In this way, some of the 0 values in the video data can be discarded, which reduces the data volume of the video file.
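The quantize-then-scan step described above can be sketched as follows. This is a minimal illustration in Python with NumPy and is not taken from the patent: the function names `zigzag_indices` and `quantize_and_scan`, the 4*4 block size and the flat quantization table are all assumptions chosen for brevity.

```python
import numpy as np

def zigzag_indices(n):
    """Return the (row, col) positions of an n x n block in ZigZag order."""
    def key(rc):
        r, c = rc
        d = r + c  # index of the anti-diagonal this position lies on
        # Odd diagonals are walked with the row increasing, even ones with
        # the column increasing, which yields the usual ZigZag pattern.
        return (d, r if d % 2 else c)
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)

def quantize_and_scan(block, table):
    """Quantize a square block with a quantization table, then ZigZag-scan it."""
    q = np.round(block / table).astype(int)
    return [int(q[r, c]) for r, c in zigzag_indices(block.shape[0])]

# Hypothetical 4x4 block of values and a flat quantization table (step 10).
block = np.array([[52, 11, 3, 0],
                  [ 9,  4, 0, 0],
                  [ 2,  0, 0, 0],
                  [ 0,  0, 0, 0]], dtype=float)
scanned = quantize_and_scan(block, np.full((4, 4), 10.0))
```

After the scan, the non-zero values sit at the front of `scanned` and the trailing run of 0 values is what such a scheme can discard. A block with few 0 values after quantization leaves little to discard, which is the weakness the following paragraph describes.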
However, this prior-art scheme compresses well only for video files that contain many 0 values. For video files with few 0 values, little data is discarded and the compression effect is unsatisfactory. If the quantization step size is increased in order to compress the data further, the compressed video file suffers from a higher distortion rate. A more effective video compression scheme is therefore needed.
Summary of the invention
The purpose of the present application is to provide a video compression method and system that can compress video files effectively.
To achieve the above purpose, one aspect of the present application provides a video compression method. The method includes: determining a frame to be encoded and a reference frame in a target video, and calculating residual data of the frame to be encoded relative to the reference frame; extracting a mean vector and a variance vector of the residual data; and performing normal-distribution sampling on the mean vector and the variance vector to obtain compressed data of the frame to be encoded, wherein the dimension of the compressed data is lower than the dimension of the residual data.
To achieve the above purpose, another aspect of the present application provides a video compression system. The system includes: a residual data computing unit, configured to determine a frame to be encoded and a reference frame in a target video and to calculate residual data of the frame to be encoded relative to the reference frame; a vector extraction unit, configured to extract a mean vector and a variance vector of the residual data; and a data compression unit, configured to perform normal-distribution sampling on the mean vector and the variance vector to obtain compressed data of the frame to be encoded, wherein the dimension of the compressed data is lower than the dimension of the residual data.
To achieve the above purpose, another aspect of the present application provides a video compression apparatus. The apparatus includes a memory and a processor. The memory stores a computer program, and the video compression method described above is implemented when the computer program is executed by the processor.
It can be seen from the above that, in the technical solution provided by the present application, a reference frame can be determined in advance for a frame to be encoded in the target video. The reference frame retains the content of the full frame during compression. For the frame to be encoded, the residual data relative to the reference frame can be calculated, and subsequent encoding can operate on the residual data only, which greatly reduces the amount of data needed for encoding. To reduce this amount further, characteristic parameters that characterize the residual data can be extracted from it. In the present application, these characteristic parameters can be the mean vector and the variance vector of the residual data. The dimension of the extracted mean vector and variance vector can be lower than that of the original residual data, which achieves dimensionality reduction. Normal-distribution sampling can then be performed on the mean vector and the variance vector. On the one hand, the sampling can eliminate noise in the mean vector and the variance vector, improving the accuracy of data compression. On the other hand, the sampled data conforms to the natural distribution of data: the sampling is equivalent to a preliminary restoration of the original residual data from the mean vector and the variance vector, except that the sampled data has a lower dimension than the original residual data. In this way, the sampled data retains high fidelity while having a lower dimension, so the efficiency of data compression is improved while fidelity is preserved. The sampled data can serve as the compressed data of the frame to be encoded and can be used for subsequent transmission or decoding. The technical solution provided by the present application therefore reduces the amount of data required for video compression by using residual data and, by extracting the mean vector and the variance vector and performing normal-distribution sampling on them, compresses video effectively.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below are obviously only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic diagram of the steps of the video compression method in an embodiment of the present invention;
Fig. 2 is a schematic diagram of image processing performed in units of macroblocks in an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of the compression model in an embodiment of the present invention;
Fig. 4 is a schematic diagram of the neural network in an embodiment of the present invention;
Fig. 5 is a functional block diagram of the video compression system in an embodiment of the present invention.
Detailed description of the embodiments
To make the purpose, technical solutions and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to the drawings.
The present application provides a video compression method that can be applied to a device with data processing capability. Referring to Fig. 1, the method may include the following steps.
S1: Determine a frame to be encoded and a reference frame in a target video, and calculate residual data of the frame to be encoded relative to the reference frame.
In this embodiment, the target video can be a video to be encoded (to be compressed). A frame to be encoded and its corresponding reference frame can be determined in the target video. Specifically, the similarity between the frame to be encoded and a candidate reference frame can be calculated with an algorithm such as SATD (Sum of Absolute Transformed Differences) or SAD (Sum of Absolute Differences). When the calculated similarity reaches a specified threshold, the candidate can be used as the reference frame corresponding to the frame to be encoded. Of course, in practical applications, the reference frame and the frame to be encoded can also be selected according to other criteria depending on the scenario, and are not limited to the scheme above. The present application therefore does not limit how the reference frame and the frame to be encoded are determined.
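A minimal sketch of the SAD and SATD measures and of a threshold-based reference choice is given below. It is illustrative only: the helper names (`sad`, `satd`, `hadamard`, `pick_reference`) and the `max_cost` parameter are hypothetical, the SATD here uses an unnormalized Hadamard transform on a square block whose side is a power of two, and "similarity reaches the threshold" is interpreted as "cost is at or below a limit", since a lower SAD or SATD means a closer match.

```python
import numpy as np

def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    d = a.astype(np.int64) - b.astype(np.int64)
    return int(np.abs(d).sum())

def hadamard(n):
    """Unnormalized Hadamard matrix of size n (n must be a power of two)."""
    h = np.array([[1]], dtype=np.int64)
    base = np.array([[1, 1], [1, -1]], dtype=np.int64)
    while h.shape[0] < n:
        h = np.kron(base, h)
    return h

def satd(a, b):
    """Sum of absolute values of the Hadamard-transformed difference."""
    d = a.astype(np.int64) - b.astype(np.int64)
    h = hadamard(d.shape[0])
    return int(np.abs(h @ d @ h.T).sum())

def pick_reference(frame, candidates, max_cost, cost=sad):
    """Return the index of the first candidate whose cost is within max_cost."""
    for index, candidate in enumerate(candidates):
        if cost(frame, candidate) <= max_cost:
            return index
    return None  # no candidate is similar enough
```

Production encoders typically apply SATD to small sub-blocks and normalize the result; those details are omitted here.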
In this embodiment, after the frame to be encoded and the reference frame are determined, the difference of the frame to be encoded relative to the reference frame can be determined and encoding can be performed on that difference, which greatly reduces the amount of data the frame to be encoded requires during encoding.
Specifically, the residual data of the frame to be encoded relative to the reference frame can be calculated. To do so, the residual between the frame to be encoded and the reference frame is calculated first. The residual can consist of the differences between the pixel values of corresponding pixels in the frame to be encoded and the reference frame. For example, if the reference frame and the frame to be encoded are both 28*28 video frames, the calculated residual can be a vector with 28*28=784 elements. Because the reference frame and the frame to be encoded are highly similar, the vector that characterizes the residual contains many 0 values, and these 0 values greatly reduce the encoding burden in subsequent encoding.
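The 28*28 example above can be sketched in a few lines. This is an illustrative snippet, assuming single-channel 8-bit frames; the function name `frame_residual` and the synthetic frames are made up for the example.

```python
import numpy as np

def frame_residual(target, reference):
    """Per-pixel difference of two frames, flattened into one vector."""
    # Widen to a signed type first: subtracting uint8 arrays would wrap around.
    diff = target.astype(np.int16) - reference.astype(np.int16)
    return diff.reshape(-1)

rng = np.random.default_rng(0)
reference = rng.integers(0, 256, size=(28, 28), dtype=np.uint8)
target = reference.copy()
target[0, 0] ^= 1  # change a single pixel so the frames are almost identical

residual = frame_residual(target, reference)  # 784 elements, mostly 0
```

Adding the residual back to the reference frame recovers the target frame exactly, which is why only the residual needs to be encoded.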
In one embodiment, the reference frame and the frame to be encoded are usually each divided into a preset number of macroblocks (MacroBlock), so the residual calculation described above can be carried out in units of macroblocks. Specifically, the frame to be encoded can be divided into a preset number of target macroblocks, and the reference macroblock corresponding to each target macroblock can be determined in the reference frame. Referring to Fig. 2, the reference macroblock and the target macroblock at the two ends of a dashed line can cover the same number of pixels and an area of the same size. In this way, the frame to be encoded and the reference frame are divided into pairs of target and reference macroblocks. The local residual between each target macroblock and its corresponding reference macroblock can then be calculated. Specifically, the pixel values of corresponding pixels in the target macroblock and the reference macroblock can be subtracted to obtain the pixel-value difference at each pixel position. The set of pixel-value differences within one target macroblock serves as the local residual between that target macroblock and its reference macroblock. After the local residual of every target macroblock has been calculated, the combination of the local residuals can be used as the residual between the frame to be encoded and the reference frame.
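The macroblock-wise calculation can be sketched as below. Two simplifying assumptions are made that the text does not state: each target macroblock is paired with the co-located block of the reference frame (no motion search), and the frame size is a multiple of the block size. The function names are hypothetical.

```python
import numpy as np

def split_blocks(frame, size):
    """Map each block's top-left (row, col) position to the block itself."""
    h, w = frame.shape
    return {(r, c): frame[r:r + size, c:c + size]
            for r in range(0, h, size) for c in range(0, w, size)}

def local_residuals(target, reference, size):
    """Local residual of every target macroblock against its reference macroblock."""
    t = split_blocks(target.astype(np.int16), size)
    ref = split_blocks(reference.astype(np.int16), size)
    return {pos: t[pos] - ref[pos] for pos in t}

def assemble(blocks, shape):
    """Combine the local residuals into the residual of the whole frame."""
    out = np.zeros(shape, dtype=np.int16)
    for (r, c), blk in blocks.items():
        out[r:r + blk.shape[0], c:c + blk.shape[1]] = blk
    return out

rng = np.random.default_rng(0)
reference = rng.integers(0, 256, size=(28, 28), dtype=np.uint8)
target = rng.integers(0, 256, size=(28, 28), dtype=np.uint8)
blocks = local_residuals(target, reference, 4)   # 7 * 7 = 49 macroblocks
residual = assemble(blocks, target.shape)
```

With co-located pairing, the assembled result equals the plain frame difference; with a motion search, each reference macroblock could come from a different position in the reference frame.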
In this embodiment, to further increase the number of 0 values and thereby reduce the amount of data needed for encoding, the calculated residual can be converted from the time domain to the frequency domain. Specifically, in practical applications, the discrete cosine transform (Discrete Cosine Transform, DCT) can be applied to the calculated residual. After the DCT, the high-frequency part and the low-frequency part of the data are separated, so that the data volume is smaller and there are more 0 values. The frequency-domain residual obtained by the conversion can then be used as the residual data of the frame to be encoded relative to the reference frame.
Of course, in practical applications, the DCT can also be performed in units of macroblocks. Specifically, after the local residual of each target macroblock has been calculated, each local residual can be converted from the time domain to the frequency domain, and the combination of the resulting frequency-domain local residuals can be used as the residual data of the frame to be encoded relative to the reference frame.
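As an illustration of the transform step, the sketch below builds an orthonormal DCT-II matrix with NumPy and applies it to a square block in both directions. The function names and the orthonormal scaling are choices made for this example; the patent does not specify a particular DCT variant or block size.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis: row k holds the k-th cosine basis vector."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)  # the constant (DC) row has its own scale
    return m

def dct2(block):
    """2-D DCT of a square block (rows first, then columns)."""
    m = dct_matrix(block.shape[0])
    return m @ block @ m.T

def idct2(coeffs):
    """Inverse of dct2; exact up to floating-point error."""
    m = dct_matrix(coeffs.shape[0])
    return m.T @ coeffs @ m

# A flat local residual: all of its energy lands in the single DC coefficient.
flat = np.full((4, 4), 3.0)
coeffs = dct2(flat)
```

For smooth residuals most coefficients are close to 0, and it is usually a later quantization step that turns them into exact 0 values.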
S3: Extract the mean vector and the variance vector of the residual data.
In this embodiment, after the residual data of the frame to be encoded has been calculated, the residual data can be encoded. Specifically, the encoding can be performed in a trained compression model. Referring to Fig. 3, the compression model may include two units, an encoder and a decoder. After receiving the input residual data, the encoder can extract characteristic parameters of the residual data and use them to characterize the residual data. Specifically, the characteristic parameters can be the mean vector and the variance vector of the residual data.
As shown in Fig. 3, in practical applications the encoder may include a trained deep neural network (Deep Neural Network, DNN). The DNN can fit a data model to a large number of input samples, and the fitted model can extract the corresponding mean vector and variance vector from input residual data. Specifically, in the training stage, a large number of residual data samples can be prepared in advance, together with the actual mean vector and actual variance vector corresponding to each sample. The residual data samples can then be fed into the DNN to be trained in batches. For example, each batch can contain the residual data of 100 video frames; assuming that each piece of residual data contains 784 elements, each batch fed into the DNN can be a 100*784 residual data matrix. The DNN to be trained processes the input matrix with its initial neurons to obtain a predicted mean vector and a predicted variance vector. During training, the error between the predicted vectors and the actual mean and variance vectors may be large, so the error can be returned to the DNN as feedback, causing the DNN to adjust the weight coefficients of its internal neurons until it correctly predicts the actual mean vector and variance vector when the residual data samples are input again. A DNN trained on a large number of residual data samples in this way can predict, with reasonable accuracy, the mean vector and variance vector corresponding to the residual data of the current frame to be encoded.
With reference to Fig. 3 and Fig. 4, the trained DNN may include multiple fully connected layers (Fully Connected Layer, FCL), which extract the mean vector and variance vector and also perform dimensionality reduction. Specifically, in Fig. 4, assume that the input is the residual data of 100 frames to be encoded, so the input can take the form of a 100*784 matrix. The first fully connected layer of the deep neural network can reduce the residual data from a first dimension (100*784) to a second dimension (100*256). The dimensionality reduction can be realized by means of a convolution kernel: the kernel computes a weighted average of the pixel values in a region of the frame to be encoded and replaces the pixel values in that region with the weighted average, which achieves the reduction. The second fully connected layer and the third fully connected layer of the deep neural network can then extract, respectively, the mean vector and the variance vector of the residual data of the second dimension. Compared with the data of the second dimension, the extracted mean vector and variance vector are reduced further. The dimension of the mean vector and the variance vector (100*128) can therefore be lower than the second dimension (100*256).
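The layer sizes described above (784 to 256, then two parallel layers to 128) can be sketched as a plain NumPy forward pass. Everything beyond those sizes is an assumption for illustration: the weights are random and untrained, the names `init_encoder` and `encode` are hypothetical, ReLU is used on the first layer, and the third layer's output is passed through an exponential so that the variance stays positive, a common convention that the patent does not mention.

```python
import numpy as np

def init_encoder(rng, d_in=784, d_hidden=256, d_latent=128, scale=0.01):
    """Random, untrained weights; a real encoder would load trained values."""
    def layer(n_in, n_out):
        return rng.normal(0.0, scale, (n_in, n_out)), np.zeros(n_out)
    return {"fc1": layer(d_in, d_hidden),
            "fc_mean": layer(d_hidden, d_latent),
            "fc_var": layer(d_hidden, d_latent)}

def encode(params, x):
    """Map a batch of residual vectors to a mean vector and a variance vector."""
    w1, b1 = params["fc1"]
    h = np.maximum(x @ w1 + b1, 0.0)   # first layer: 784 -> 256 with ReLU
    w_m, b_m = params["fc_mean"]
    w_v, b_v = params["fc_var"]
    mean = h @ w_m + b_m               # second layer: 256 -> 128, mean vector
    var = np.exp(h @ w_v + b_v)        # third layer: 256 -> 128, variance > 0
    return mean, var

rng = np.random.default_rng(0)
batch = rng.normal(size=(100, 784))    # stand-in for 100 residual vectors
mean, var = encode(init_encoder(rng), batch)
```

Each row of `mean` and `var` belongs to one frame of the batch, so a 100*784 input yields two 100*128 outputs.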
S5: Perform normal-distribution sampling on the mean vector and the variance vector to obtain compressed data of the frame to be encoded, wherein the dimension of the compressed data is lower than the dimension of the residual data.
In this embodiment, so that the extracted mean vector and variance vector conform to the natural distribution of data, normal-distribution sampling can be performed on them, which amounts to a preliminary restoration of the original residual data. The only difference is that the sampled data can have a lower dimension than the original residual data.
Specifically, referring to Fig. 4, the deep neural network can also include a normal sampling layer that performs the normal-distribution sampling. The mean vector and variance vector of the second-dimension (100*256) residual data, output by the preceding fully connected layers, can be fed into the normal sampling layer for normal-distribution sampling, yielding compressed data of a third dimension (100*128). The dimension of the compressed data is thus lower than the second dimension and, because of the normal-distribution sampling, noise in the mean vector and variance vector can also be removed. Finally, the data output by the normal sampling layer can be used as the compressed data of the frame to be encoded.
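One common way to realize such a sampling layer is to draw standard normal noise and scale and shift it with the extracted vectors. The sketch below assumes that formulation; the function name `sample_normal` is hypothetical and the patent does not say how its sampling layer is implemented.

```python
import numpy as np

def sample_normal(mean, var, rng):
    """Draw one sample per element from N(mean, var)."""
    eps = rng.standard_normal(mean.shape)  # noise from N(0, 1)
    return mean + np.sqrt(var) * eps       # shift and scale the noise

rng = np.random.default_rng(0)
mean = np.zeros((100, 128))
var = np.ones((100, 128))
compressed = sample_normal(mean, var, rng)  # same 100*128 shape as the inputs
```

The output has the same shape as the mean and variance vectors, so the 100*128 result is what would be transmitted as the compressed data in this sketch.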
In one embodiment, to measure the degree of distortion introduced by the normal-distribution sampling, the relative entropy (KL divergence) of the residual data can be calculated from the mean vector and the variance vector, and the distortion after normal-distribution sampling can be characterized by the relative entropy. In an application example, the relative entropy can be expressed by the following formula:
[formula not reproduced in the source text]
where KL denotes the relative entropy, ε_i denotes the variance vector corresponding to the i-th frame to be encoded, and μ_i denotes the mean vector corresponding to the i-th frame to be encoded.
Based on the calculated relative entropy, the normal-distribution sampling process can be adjusted so that the distortion after sampling stays within a reasonable range.
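For orientation, the relative entropy between a normal distribution with a given mean and variance and the standard normal distribution has a well-known closed form, 0.5 * sum(var + mean^2 - 1 - log(var)). The sketch below implements that textbook expression. Whether the patent's own formula has exactly this form is an assumption, since the formula is missing from this text; the function name is hypothetical.

```python
import numpy as np

def kl_to_standard_normal(mean, var):
    """KL( N(mean, var) || N(0, 1) ), summed over the last axis."""
    return 0.5 * np.sum(var + mean ** 2 - 1.0 - np.log(var), axis=-1)

# One value per frame in the batch; it is 0 when mean = 0 and var = 1.
kl = kl_to_standard_normal(np.zeros((100, 128)), np.ones((100, 128)))
```

The value grows as the mean moves away from 0 or the variance moves away from 1, which is what lets it act as a distortion indicator for the sampled data.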
In this embodiment, the compressed data produced by the encoder can then be transmitted. After the compressed data is received, it can be decoded by the decoder shown in Fig. 3. Specifically, referring to Fig. 3 and Fig. 4, a decoding neural network can be built on the basis of the trained DNN, and the compressed data of the frame to be encoded can be reversely reconstructed by the decoding neural network, restoring the compressed data to decoded data whose dimension matches that of the residual data.
Specifically, the decoding neural network can be the reverse network of the trained DNN and may include two fully connected layers. As shown in Fig. 4, the compressed data of the frame to be encoded can first be fed into the first fully connected layer of the decoding neural network, which restores the compressed data from the lower dimension of 100*128 to the second dimension of 100*256. The data restored to the second dimension can then be fed into the second fully connected layer of the decoding neural network, which restores the data from the second dimension of 100*256 to the first dimension of 100*784. In this way, decoding yields data with the same dimension as the original residual data. The data restored to the first dimension can therefore be used as the decoded data whose dimension matches that of the residual data.
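The two-layer decoder can be sketched in the same style. The weights are again random placeholders and the names `init_decoder` and `decode` are hypothetical. ReLU on the first layer and Sigmoid on the second follow the activation choices the document mentions for the decoding network; the Sigmoid output implies that the residual data has been scaled to the range 0 to 1, which is an assumption of this sketch.

```python
import numpy as np

def init_decoder(rng, d_latent=128, d_hidden=256, d_out=784, scale=0.01):
    """Random, untrained weights; a real decoder would load trained values."""
    def layer(n_in, n_out):
        return rng.normal(0.0, scale, (n_in, n_out)), np.zeros(n_out)
    return {"fc1": layer(d_latent, d_hidden), "fc2": layer(d_hidden, d_out)}

def decode(params, z):
    """Restore compressed data to the dimension of the residual data."""
    w1, b1 = params["fc1"]
    w2, b2 = params["fc2"]
    h = np.maximum(z @ w1 + b1, 0.0)             # 128 -> 256 with ReLU
    return 1.0 / (1.0 + np.exp(-(h @ w2 + b2)))  # 256 -> 784 with Sigmoid

rng = np.random.default_rng(0)
compressed = rng.normal(size=(100, 128))         # stand-in for compressed data
decoded = decode(init_decoder(rng), compressed)  # shape 100*784
```

With trained weights, `decoded` would approximate the residual data that was fed into the encoder.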
In one embodiment, to evaluate the encoding and decoding performance of the encoder and decoder as a whole, after the compressed data of the frame to be encoded has been reversely reconstructed, the error between the restored decoded data and the residual data can be calculated, and the cross entropy of this error and the relative entropy can be calculated, so that the cross entropy characterizes the distortion of the decoded data relative to the residual data. In an application example, a simplified formula for the cross entropy can be as follows:
C = -log P(X' | X) + KL
where C denotes the cross entropy, X' denotes the decoded data, X denotes the residual data, -log P(X' | X) denotes the error between the decoded data and the residual data, and KL denotes the relative entropy.
In this way, the neural networks of the encoder and the decoder can be corrected using the relative entropy and the cross entropy above, so that the distortion after encoding and decoding stays within the allowed range or is minimized.
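The cost C = -log P(X' | X) + KL can be sketched numerically once a concrete form is chosen for each term. The choices below are assumptions, not the patent's definitions: the error term is taken as half the squared error, which equals -log P(X' | X) up to an additive constant under a Gaussian likelihood with unit variance, and the KL term is the closed form against a standard normal distribution. All function names are hypothetical.

```python
import numpy as np

def reconstruction_error(decoded, residual):
    """Half squared error per frame (Gaussian -log likelihood up to a constant)."""
    return 0.5 * np.sum((decoded - residual) ** 2, axis=-1)

def kl_term(mean, var):
    """KL( N(mean, var) || N(0, 1) ) per frame."""
    return 0.5 * np.sum(var + mean ** 2 - 1.0 - np.log(var), axis=-1)

def total_cost(decoded, residual, mean, var):
    """C = reconstruction error + relative entropy, one value per frame."""
    return reconstruction_error(decoded, residual) + kl_term(mean, var)

residual = np.ones((2, 784))
mean, var = np.zeros((2, 128)), np.ones((2, 128))
perfect = total_cost(residual, residual, mean, var)    # no error, no divergence
shifted = total_cost(residual + 1.0, residual, mean, var)
```

Minimizing this value over the weights of both networks is one way to keep the distortion after encoding and decoding low.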
In this embodiment, the data obtained by decoding can subsequently be further encoded with an existing coding scheme (for example CABAC or VLC); the present application does not limit this.
In practical applications, a fully connected layer needs a suitable activation function when it processes data. For example, the first fully connected layer described above can use the ReLU (Rectified Linear Unit) activation function. As another example, the first and second fully connected layers of the decoding neural network can use the ReLU activation function and the Sigmoid activation function, respectively. Of course, other activation functions can be chosen flexibly in practical applications according to the fitting effect and actual requirements. For example, the Tanh activation function can also be selected.
Referring to Fig. 5, the present application also provides a video compression system. The system includes:
a residual data computing unit, configured to determine a frame to be encoded and a reference frame in a target video, and to calculate residual data of the frame to be encoded relative to the reference frame;
a vector extraction unit, configured to extract a mean vector and a variance vector of the residual data;
a data compression unit, configured to perform normal-distribution sampling on the mean vector and the variance vector to obtain compressed data of the frame to be encoded, wherein the dimension of the compressed data is lower than the dimension of the residual data.
In one embodiment, the residual data computing unit includes:
a frequency domain conversion module, configured to calculate the residual between the frame to be encoded and the reference frame, convert the residual from the time domain to the frequency domain, and use the resulting frequency-domain residual as the residual data of the frame to be encoded relative to the reference frame.
In one embodiment, the system further includes:
A neural network input unit, configured to input the residual data into a trained deep neural network, the deep neural network including a plurality of fully connected layers;
A dimensionality reduction unit, configured to reduce the residual data from a first dimension to a second dimension through a first fully connected layer in the deep neural network;
Correspondingly, the vector extraction unit is further configured to extract the mean vector and the variance vector of the residual data of the second dimension through a second fully connected layer and a third fully connected layer in the deep neural network, respectively; wherein the dimension of the mean vector and of the variance vector is lower than the second dimension.
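The three fully connected layers described in this embodiment can be sketched as below. The sketch makes several assumptions that are not taken from the text: the layer sizes are arbitrary, the weights are random stand-ins for trained weights, ReLU is used after the first layer, and the third layer is assumed to output a log-variance that is exponentiated so that the variance stays positive. The names `make_layer`, `dense` and `encode` are hypothetical.

```python
import math
import random


def make_layer(in_dim, out_dim, rng):
    """Fully connected layer with small random weights (stand-in for trained weights)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]
    biases = [0.0] * out_dim
    return weights, biases


def dense(x, layer):
    """Apply a fully connected layer: y = W x + b."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def encode(residual_vec, fc1, fc2, fc3):
    # First fully connected layer: first dimension -> second dimension (ReLU assumed).
    hidden = [max(0.0, h) for h in dense(residual_vec, fc1)]
    # Second fully connected layer: mean vector, lower than the second dimension.
    mean = dense(hidden, fc2)
    # Third fully connected layer: assumed to output log-variance (not stated in the text).
    variance = [math.exp(v) for v in dense(hidden, fc3)]
    return mean, variance


rng = random.Random(0)
FIRST_DIM, SECOND_DIM, LATENT_DIM = 64, 16, 4  # illustrative sizes only
fc1 = make_layer(FIRST_DIM, SECOND_DIM, rng)
fc2 = make_layer(SECOND_DIM, LATENT_DIM, rng)
fc3 = make_layer(SECOND_DIM, LATENT_DIM, rng)

residual_vec = [rng.uniform(-1.0, 1.0) for _ in range(FIRST_DIM)]
mean, variance = encode(residual_vec, fc1, fc2, fc3)
```

Both output vectors have `LATENT_DIM` entries, which is smaller than `SECOND_DIM`, matching the dimension ordering required by the embodiment.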
In one embodiment, the system further includes:
A decoding unit, configured to perform reverse reconstruction on the compressed data of the frame to be encoded, so as to restore the compressed data to decoded data whose dimension matches that of the residual data.
In one embodiment, the decoding unit includes:
A decoding network input module, configured to input the compressed data of the frame to be encoded into a first fully connected layer of a decoding neural network, so as to restore the compressed data to the second dimension;
A data restoring module, configured to input the data restored to the second dimension into a second fully connected layer of the decoding neural network, so as to restore the data of the second dimension to the first dimension, and to use the data restored to the first dimension as the decoded data whose dimension matches that of the residual data.
In one embodiment, the system further includes:
A relative entropy calculation unit, configured to calculate the relative entropy of the residual data according to the mean vector and the variance vector, the relative entropy characterizing the degree of distortion after normal distribution sampling;
A cross entropy calculation unit, configured to calculate the error between the decoded data obtained by restoration and the residual data, and to calculate the cross entropy of the error and the relative entropy, the cross entropy characterizing the degree of distortion of the decoded data relative to the residual data.
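The two distortion measures can be illustrated with the sketch below. It rests on assumptions that should be read with care: the relative entropy is computed against a standard normal distribution N(0, I), which is the usual choice for variational autoencoders but is not stated in this passage; the error is taken as a mean squared error; and the final combination is a plain sum of the two terms, swapped in here because the text does not give a formula for the "cross entropy of the error and the relative entropy". All function names are hypothetical.

```python
import math


def relative_entropy(mean, variance):
    """KL divergence between N(mean, diag(variance)) and the standard normal N(0, I)."""
    return 0.5 * sum(m * m + v - math.log(v) - 1.0 for m, v in zip(mean, variance))


def reconstruction_error(decoded, residual_vec):
    """Mean squared error between the decoded data and the original residual data."""
    return sum((d - r) ** 2 for d, r in zip(decoded, residual_vec)) / len(residual_vec)


def total_distortion(decoded, residual_vec, mean, variance):
    # Assumption: the two terms are simply added, as in a standard VAE objective.
    return reconstruction_error(decoded, residual_vec) + relative_entropy(mean, variance)
```

With this formulation, a mean vector of zeros and a variance vector of ones gives a relative entropy of zero, so any remaining distortion comes from the reconstruction error alone.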
As can be seen from the above, in the technical solution provided by the present application, a reference frame can be determined in advance for a frame to be encoded in the target video. The reference frame can retain its complete picture content when compressed. For the frame to be encoded, the residual data of that frame relative to the reference frame can be calculated, and when the frame is subsequently encoded, only the residual data needs to be encoded, which considerably reduces the amount of data required for encoding. To further reduce the amount of data required for encoding, characteristic parameters that characterize the residual data can be extracted from it. In the present application, these characteristic parameters can be the mean vector and the variance vector of the residual data. The dimension of the extracted mean vector and variance vector can be lower than the dimension of the original residual data, thereby achieving dimensionality reduction. Normal distribution sampling can then be performed on the mean vector and the variance vector. The purpose of this processing is twofold. On the one hand, normal distribution sampling can eliminate noise in the mean vector and the variance vector, improving the accuracy of data compression. On the other hand, the data obtained after normal distribution sampling conforms to the natural distribution of data: the sampling is equivalent to preliminarily restoring the mean vector and the variance vector to the original residual data, except that the dimension of the sampled data is lower than that of the original residual data. In this way, the sampled data retains high fidelity while having a lower dimension, so the efficiency of data compression is improved while fidelity is guaranteed. The data obtained after normal distribution sampling can therefore serve as the compressed data of the frame to be encoded, and the compressed data can be used for subsequent transmission or decoding. In summary, the technical solution provided by the present application reduces the amount of data required for video compression by means of residual data and, in addition, compresses the video effectively by extracting the mean vector and the variance vector and performing normal distribution sampling on them.
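The normal distribution sampling step summarized above can be sketched as follows. The sketch assumes the common reparameterized form, in which each element of the compressed data is drawn as the mean plus the standard deviation times a standard normal sample; the passage itself does not spell out the sampling formula, and the name `sample_normal` is hypothetical.

```python
import math
import random


def sample_normal(mean, variance, rng):
    """Draw z = mean + sqrt(variance) * eps element-wise, with eps ~ N(0, 1)."""
    return [m + math.sqrt(v) * rng.gauss(0.0, 1.0) for m, v in zip(mean, variance)]


rng = random.Random(42)
mean = [0.5, -1.0, 0.0, 2.0]      # illustrative mean vector
variance = [0.1, 0.2, 0.05, 0.3]  # illustrative variance vector
compressed = sample_normal(mean, variance, rng)
```

The sampled vector has the same length as the mean and variance vectors, so the reduction in dimension comes from the layers that produce those vectors rather than from the sampling itself.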
From the above description of the embodiments, those skilled in the art can clearly understand that each embodiment can be implemented by means of software plus a necessary general-purpose hardware platform, and of course can also be implemented in hardware. Based on this understanding, the above technical solution, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product may be stored in a computer-readable storage medium, such as a ROM/RAM, a magnetic disk or an optical disc, and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to execute the methods described in the embodiments or in certain parts of the embodiments.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit the invention. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.
Claims (15)
1. A video compression method, characterized in that the method comprises:
Determining a frame to be encoded and a reference frame in a target video, and calculating residual data of the frame to be encoded relative to the reference frame;
Extracting a mean vector and a variance vector of the residual data, respectively;
Performing normal distribution sampling on the mean vector and the variance vector to obtain compressed data of the frame to be encoded, wherein the dimension of the compressed data is lower than the dimension of the residual data.
2. The method according to claim 1, characterized in that calculating the residual data of the frame to be encoded relative to the reference frame comprises:
Calculating the residual between the frame to be encoded and the reference frame, converting the residual from the time domain to the frequency domain, and using the converted frequency-domain residual as the residual data of the frame to be encoded relative to the reference frame.
3. The method according to claim 2, characterized in that calculating the residual between the frame to be encoded and the reference frame comprises:
Dividing the frame to be encoded into a preset number of target macroblocks, and determining the reference macroblock in the reference frame that corresponds to each target macroblock;
Calculating the local residual between each target macroblock and its corresponding reference macroblock, and using the combination of the local residuals of the target macroblocks as the residual between the frame to be encoded and the reference frame;
Correspondingly, converting each local residual from the time domain to the frequency domain, and using the combination of the converted frequency-domain local residuals as the residual data of the frame to be encoded relative to the reference frame.
4. The method according to claim 1, characterized in that, after calculating the residual data of the frame to be encoded relative to the reference frame, the method further comprises:
Inputting the residual data into a trained deep neural network, the deep neural network including a plurality of fully connected layers;
Reducing the residual data from a first dimension to a second dimension through a first fully connected layer in the deep neural network;
Correspondingly, extracting the mean vector and the variance vector of the residual data of the second dimension through a second fully connected layer and a third fully connected layer in the deep neural network, respectively; wherein the dimension of the mean vector and of the variance vector is lower than the second dimension.
5. The method according to claim 4, characterized in that the deep neural network further includes a normal sampling layer for performing normal distribution sampling;
Correspondingly, the mean vector and the variance vector of the residual data of the second dimension are input into the normal sampling layer for normal distribution sampling, to obtain compressed data of a third dimension; wherein the third dimension is lower than the second dimension.
6. The method according to claim 4, characterized in that, after obtaining the compressed data of the frame to be encoded, the method further comprises:
Performing reverse reconstruction on the compressed data of the frame to be encoded, so as to restore the compressed data to decoded data whose dimension matches that of the residual data.
7. The method according to claim 6, characterized in that performing reverse reconstruction on the compressed data of the frame to be encoded comprises:
Inputting the compressed data of the frame to be encoded into a first fully connected layer of a decoding neural network, so as to restore the compressed data to the second dimension;
Inputting the data restored to the second dimension into a second fully connected layer of the decoding neural network, so as to restore the data of the second dimension to the first dimension, and using the data restored to the first dimension as the decoded data whose dimension matches that of the residual data.
8. The method according to claim 6, characterized in that, after performing normal distribution sampling on the mean vector and the variance vector, the method further comprises:
Calculating the relative entropy of the residual data according to the mean vector and the variance vector, the relative entropy characterizing the degree of distortion after normal distribution sampling;
Correspondingly, after performing reverse reconstruction on the compressed data of the frame to be encoded, the method further comprises:
Calculating the error between the decoded data obtained by restoration and the residual data, and calculating the cross entropy of the error and the relative entropy, the cross entropy characterizing the degree of distortion of the decoded data relative to the residual data.
9. A video compression system, characterized in that the system comprises:
A residual data calculation unit, configured to determine a frame to be encoded and a reference frame in a target video, and to calculate residual data of the frame to be encoded relative to the reference frame;
A vector extraction unit, configured to extract a mean vector and a variance vector of the residual data, respectively;
A data compression unit, configured to perform normal distribution sampling on the mean vector and the variance vector to obtain compressed data of the frame to be encoded, wherein the dimension of the compressed data is lower than the dimension of the residual data.
10. The system according to claim 9, characterized in that the residual data calculation unit includes:
A frequency domain conversion module, configured to calculate the residual between the frame to be encoded and the reference frame, convert the residual from the time domain to the frequency domain, and use the converted frequency-domain residual as the residual data of the frame to be encoded relative to the reference frame.
11. The system according to claim 9, characterized in that the system further comprises:
A neural network input unit, configured to input the residual data into a trained deep neural network, the deep neural network including a plurality of fully connected layers;
A dimensionality reduction unit, configured to reduce the residual data from a first dimension to a second dimension through a first fully connected layer in the deep neural network;
Correspondingly, the vector extraction unit is further configured to extract the mean vector and the variance vector of the residual data of the second dimension through a second fully connected layer and a third fully connected layer in the deep neural network, respectively; wherein the dimension of the mean vector and of the variance vector is lower than the second dimension.
12. The system according to claim 11, characterized in that the system further comprises:
A decoding unit, configured to perform reverse reconstruction on the compressed data of the frame to be encoded, so as to restore the compressed data to decoded data whose dimension matches that of the residual data.
13. The system according to claim 12, characterized in that the decoding unit includes:
A decoding network input module, configured to input the compressed data of the frame to be encoded into a first fully connected layer of a decoding neural network, so as to restore the compressed data to the second dimension;
A data restoring module, configured to input the data restored to the second dimension into a second fully connected layer of the decoding neural network, so as to restore the data of the second dimension to the first dimension, and to use the data restored to the first dimension as the decoded data whose dimension matches that of the residual data.
14. The system according to claim 12, characterized in that the system further comprises:
A relative entropy calculation unit, configured to calculate the relative entropy of the residual data according to the mean vector and the variance vector, the relative entropy characterizing the degree of distortion after normal distribution sampling;
A cross entropy calculation unit, configured to calculate the error between the decoded data obtained by restoration and the residual data, and to calculate the cross entropy of the error and the relative entropy, the cross entropy characterizing the degree of distortion of the decoded data relative to the residual data.
15. A video compression apparatus, characterized in that the video compression apparatus includes a memory and a processor, the memory being used to store a computer program which, when executed by the processor, implements the method according to any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910318187.2A CN110234011B (en) | 2019-04-19 | 2019-04-19 | Video compression method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910318187.2A CN110234011B (en) | 2019-04-19 | 2019-04-19 | Video compression method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110234011A true CN110234011A (en) | 2019-09-13 |
CN110234011B CN110234011B (en) | 2021-09-24 |
Family
ID=67860744
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910318187.2A Expired - Fee Related CN110234011B (en) | 2019-04-19 | 2019-04-19 | Video compression method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110234011B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2021093393A1 (en) * | 2019-11-13 | 2021-05-20 | 南京邮电大学 | Video compressed sensing and reconstruction method and apparatus based on deep neural network |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20060193527A1 (en) * | 2005-01-11 | 2006-08-31 | Florida Atlantic University | System and methods of mode determination for video compression |
US20080084929A1 (en) * | 2006-10-05 | 2008-04-10 | Xiang Li | Method for video coding a sequence of digitized images |
CN102158703A (en) * | 2011-05-04 | 2011-08-17 | 西安电子科技大学 | Distributed video coding-based adaptive correlation noise model construction system and method |
CN103546749A (en) * | 2013-10-14 | 2014-01-29 | 上海大学 | Method for optimizing HEVC (high efficiency video coding) residual coding by using residual coefficient distribution features and bayes theorem |
CN104299201A (en) * | 2014-10-23 | 2015-01-21 | 西安电子科技大学 | Image reconstruction method based on heredity sparse optimization and Bayes estimation model |
CN104702961A (en) * | 2015-02-17 | 2015-06-10 | 南京邮电大学 | Code rate control method for distributed video coding |
CN109587487A (en) * | 2017-09-28 | 2019-04-05 | 上海富瀚微电子股份有限公司 | The appraisal procedure and system of the structural distortion factor of a kind of pair of RDO strategy |
Non-Patent Citations (1)
Title |
---|
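Wang Jianfu: "Research on H.265/HEVC Encoding Acceleration Algorithms" (H.265/HEVC编码加速算法研究), Outstanding Doctoral Dissertations Electronic Journal (《优秀博士论文电子期刊》) * |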
Also Published As
Publication number | Publication date |
---|---|
CN110234011B (en) | 2021-09-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Cheng et al. | Energy compaction-based image compression using convolutional autoencoder | |
US5699121A (en) | Method and apparatus for compression of low bit rate video signals | |
US20080008246A1 (en) | Optimizing video coding | |
CN110677651A (en) | Video compression method | |
CN103489203A (en) | Image coding method and system based on dictionary learning | |
CN107211133B (en) | Method and device for inverse quantization of transform coefficients and decoding device | |
WO2002096118A2 (en) | Decoding compressed image data | |
US8594189B1 (en) | Apparatus and method for coding video using consistent regions and resolution scaling | |
CN109903351B (en) | Image compression method based on combination of convolutional neural network and traditional coding | |
Zhou et al. | DCT-based color image compression algorithm using an efficient lossless encoder | |
Akbari et al. | Learned variable-rate image compression with residual divisive normalization | |
Song et al. | Compressed image restoration via artifacts-free PCA basis learning and adaptive sparse modeling | |
CN101272489A (en) | Encoding and decoding device and method for video image quality enhancement | |
CN111163314A (en) | Image compression method and system | |
CN111741300A (en) | Video processing method | |
CN116916036A (en) | Video compression method, device and system | |
Akbari et al. | Learned bi-resolution image coding using generalized octave convolutions | |
CN110234011A (en) | A kind of video-frequency compression method and system | |
Lee et al. | CNN-based approach for visual quality improvement on HEVC | |
CN110730347A (en) | Image compression method and device and electronic equipment | |
CN111161363A (en) | Image coding model training method and device | |
Selim et al. | A simplified fractal image compression algorithm | |
Putra et al. | Intra-frame based video compression using deep convolutional neural network (dcnn) | |
CN114501034B (en) | Image compression method and medium based on discrete Gaussian mixture super prior and Mask | |
CN110717948A (en) | Image post-processing method, system and terminal equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20210924 |