CN106060539B

CN106060539B - A video coding method for low transmission bandwidth

Info

Publication number: CN106060539B
Application number: CN201610428792.1A
Authority: CN
Inventors: 李永旭; 马自强; 肖子玉; 唐大钫; 田言金; 毕鹏飞; 徐圣凯; 王建鹏
Original assignee: Shenzhen Fengjing Network Technology Co Ltd
Current assignee: Shenzhen Fengjing Network Technology Co Ltd
Priority date: 2016-06-16
Filing date: 2016-06-16
Publication date: 2019-04-09
Anticipated expiration: 2036-06-16
Also published as: CN106060539A

Abstract

The present invention provides a video coding method for low transmission bandwidth, comprising the following steps: step S1, obtaining the images of the original video; step S2, preprocessing the original video to obtain key frames; step S3, performing inter-prediction coding preprocessing on the frames between two adjacent key frames; step S4, encoding the original video using adaptive frame-skipping coding; step S5, outputting the generated bitstream. By using adaptive, active frame-skipping coding, the invention can greatly reduce the number of video images that need to be encoded at the encoder, while the decoder can restore the unencoded images by temporal-interpolation frame filling, using decoded images as reference frames. This effectively improves the compression ratio of the video and substantially reduces its bitstream. The bits saved can also be used to improve the coding quality of key frames, improving the quality of the reconstructed images and the accuracy of the restored images.

Description

A video coding method for low transmission bandwidth

Technical field

The present invention relates to a video coding method, and more particularly to a video coding method for low transmission bandwidth.

Background art

Video transmission involves a huge volume of data. Transmitted without processing, it places very high demands on both transmission bandwidth and storage space. Because actual transmission bandwidth is limited, the original video must be compressed so that it occupies as little bandwidth as possible while maintaining a given reconstructed image quality, allowing the video to be transmitted effectively. Video coding should therefore develop toward higher compression ratios while keeping distortion controllable.

Summary of the invention

The technical problem to be solved by the present invention is to provide a video coding method for low transmission bandwidth that effectively improves the compression ratio of the video and substantially reduces its bitstream.

To this end, the present invention provides a video coding method for low transmission bandwidth, comprising the following steps:

Step S1, obtaining the images of the original video;

Step S2, preprocessing the original video to obtain key frames;

Step S3, performing inter-prediction coding preprocessing on the frames between two adjacent key frames;

Step S4, encoding the original video using adaptive frame-skipping coding;

Step S5, outputting the generated bitstream.

A further improvement of the present invention is that the key frame in the step S2 is the first frame of each scene.

A further improvement of the present invention is that the first key frame is taken as the reference frame, and the similarity between this key frame and each subsequent frame to be encoded is calculated in turn and compared with a preset threshold; when the similarity falls below the preset threshold, that frame is determined to be the next key frame.

A further improvement of the present invention is that the similarity is calculated as $S(z_i, z_j) = w_v S_v(z_i, z_j) + w_m S_m(z_i, z_j)$, where $S(z_i, z_j)$ is the total similarity between frame $i$ and frame $j$, $S_v(z_i, z_j)$ and $S_m(z_i, z_j)$ are the visual feature similarity and the motion feature similarity, respectively, and $w_v$ and $w_m$ are the weights of the visual and motion components, respectively.

A further improvement of the present invention is that, in the step S3, the inter-prediction coding preprocessing arranges the frames between two adjacent key frames as bidirectional difference frames and difference frames in interleaved order.

A further improvement of the present invention is that, in the step S4, the adaptive frame-skipping coding applies intra-frame compression coding to key frames, skips bidirectional difference frames, and applies difference coding relative to the previous key frame to difference frames.

A further improvement of the present invention is that, in the adaptive frame-skipping coding, when a key frame is encoded, all intra-prediction coding modes are traversed for the key frame and the mode producing the smallest amount of data is selected; when a difference frame is difference-coded, for the current macroblock, the prediction coding mode of the corresponding macroblock in the previous frame is read first, the prediction coding modes correlated with it are selected as the candidate range, motion estimation and rate-distortion cost are then computed for each mode in the candidate range, and the mode with the smallest rate-distortion cost is selected as the prediction coding mode of the difference frame.

A further improvement of the present invention is that the prediction coding mode is selected as follows. First, the input original pixels are sub-sampled 2:1, edge direction vectors are calculated for the sampled pixels, the edge direction histogram of the macroblock is generated, and candidate prediction modes are found from the histogram. Next, the edge direction histogram is checked for unimodality: if it is unimodal, the prediction coding mode with the largest amplitude in the histogram and its two adjacent prediction coding modes are selected as candidate prediction coding modes; if it is not unimodal, the DC coding mode is used. Finally, the distortion cost is calculated for each candidate prediction coding mode, and the one with the smallest distortion cost is selected as the final prediction coding mode.

A further improvement of the present invention is that the number of bidirectional difference frames to skip is determined as follows: m video frames are compressed by frame-skipping compression and by standard H.264 compression, and the total distortion of the m reconstructed frames is compared; n is initially 0 and is increased step by step; when the two conditions [formulas missing in the source text] are both satisfied, skipping i frames is optimal, where $B_c(n)$ and $B_s(n)$ denote the distortion under frame-skipping compression and under standard H.264 compression, respectively.

A further improvement of the present invention is that, in the step S4, when the input frame is a bidirectional difference frame, the frame difference between the key frames before and after it is calculated and compared with a frame-difference threshold; if the frame difference is less than the threshold, coding of that bidirectional difference frame is omitted; if the frame difference is greater than the threshold, the affected regions of the bidirectional difference frame are compensated and the compensation information is encoded. The frame difference is calculated as $C = \sum_{i,j} |A(i,j) - B(i,j)|^2 / n$, where $C$ is the frame difference, $A(i,j)$ and $B(i,j)$ are the pixels of the preceding and following key frames, respectively, and $n$ is the total number of pixels in the image.

Compared with the prior art, the beneficial effects of the present invention are as follows. By using adaptive, active frame-skipping coding, the number of video images that need to be encoded at the encoder can be greatly reduced, while the decoder can restore the unencoded images by temporal-interpolation frame filling, using decoded images as reference frames. This effectively improves the compression ratio of the video and substantially reduces its bitstream. The bits saved not only reduce the required transmission bandwidth but can also be used to improve the coding quality of key frames, improving the quality of the reconstructed images and the accuracy of the restored images.

Brief description of the drawings

Fig. 1 is a schematic workflow diagram of an embodiment of the present invention.

Detailed description of the embodiments

Preferred embodiments of the present invention are described in further detail below with reference to the accompanying drawing.

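As shown in Fig. 1, this example provides a video coding method for low transmission bandwidth, comprising the following steps:

Step S1, obtaining the images of the original video;

Step S2, preprocessing the original video to obtain key frames;

Step S3, performing inter-prediction coding preprocessing on the frames between two adjacent key frames;

Step S4, encoding the original video using adaptive frame-skipping coding;

Step S5, outputting the generated bitstream.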

This example still falls within the hybrid coding framework of the H.264 compression standard: the original video undergoes intra prediction or inter prediction, and the bitstream is generated after transform coding. In this example, the key frame in step S2 is the first frame of each scene.

After obtaining the images of the original video, this example first preprocesses the original video to find all key frames, that is, the first frame of each scene. The first key frame is taken as the reference frame, and the similarity between it and each subsequent frame to be encoded is calculated in turn and compared with a preset threshold. If the similarity is higher than the threshold, no scene change has occurred. When the similarity falls below the preset threshold, the image scene has changed, and that frame is determined to be the next key frame. The preset threshold is set in advance; it can generally be determined from the proportional change in the number of video bits, for example about 0.7 to 0.9, and can also be defined and modified according to actual requirements. The first frame of each scene is a key frame; key frames are also called I frames and are intra-prediction coded in this example. The preset threshold can be customized according to the video restoration requirements: the smaller the preset threshold, the smaller the distortion of the restored video and the larger the generated bitstream, and vice versa.
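
The key-frame search described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name `find_key_frames`, the caller-supplied `similarity` function, and the choice to make each newly found key frame the next reference are all assumptions that the text does not spell out.

```python
from typing import Callable, List, Sequence, TypeVar

Frame = TypeVar("Frame")


def find_key_frames(
    frames: Sequence[Frame],
    similarity: Callable[[Frame, Frame], float],
    threshold: float = 0.8,  # the text suggests roughly 0.7 to 0.9
) -> List[int]:
    """Return the indices of the key frames (first frame of each scene)."""
    if not frames:
        return []
    key_frames = [0]  # the first frame always starts a scene
    reference = frames[0]
    for index in range(1, len(frames)):
        if similarity(reference, frames[index]) < threshold:
            # Similarity fell below the threshold: treat it as a scene change.
            key_frames.append(index)
            # Assumption: the new key frame becomes the reference frame.
            reference = frames[index]
    return key_frames


def toy_similarity(a: float, b: float) -> float:
    """Hypothetical stand-in: frames are single numbers in [0, 1]."""
    return max(0.0, 1.0 - abs(a - b))


print(find_key_frames([0.0, 0.05, 0.1, 0.9, 0.95, 0.2], toy_similarity))
```

With the toy data, scene changes are detected at indices 3 and 5, so the result is `[0, 3, 5]`.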

The similarity measure splits scenes using the global color features and motion features of the images: frames in the same scene not only have similar visual features but also consistent motion features. The similarity is calculated as $S(z_i, z_j) = w_v S_v(z_i, z_j) + w_m S_m(z_i, z_j)$, where $S(z_i, z_j)$ is the total similarity between frame $i$ and frame $j$, and $S_v(z_i, z_j)$ and $S_m(z_i, z_j)$ are the visual feature similarity and the motion feature similarity, respectively. $S_v(z_i, z_j)$ is computed from HSV color histograms, because the HSV color space is close to human color perception, and $S_m(z_i, z_j)$ depends on the number of shots and the search range. $w_v$ and $w_m$ are the weights of the visual and motion components, that is, the weights of the visual feature similarity and the motion feature similarity, respectively. Both similarities take values from 0 to 1, but the weight of the visual feature similarity is smaller than that of the motion feature similarity: typically the visual weight is 0.2 to 0.4 and the motion weight is 0.6 to 0.8.
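
As an illustration of the weighted similarity, the sketch below builds a normalized HSV histogram with Python's standard `colorsys` module and combines the two component similarities. The histogram bin counts, the use of histogram intersection as the visual similarity measure, and the default weights of 0.3 and 0.7 are assumptions chosen for the example; the text only states that HSV color histograms are used and gives weight ranges.

```python
import colorsys
from typing import List, Sequence, Tuple

Pixel = Tuple[float, float, float]  # RGB components in [0, 1]


def hsv_histogram(pixels: Sequence[Pixel], bins=(8, 2, 2)) -> List[float]:
    """Normalized HSV histogram; the bin counts are illustrative assumptions."""
    h_bins, s_bins, v_bins = bins
    hist = [0.0] * (h_bins * s_bins * v_bins)
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r, g, b)
        hi = min(int(h * h_bins), h_bins - 1)
        si = min(int(s * s_bins), s_bins - 1)
        vi = min(int(v * v_bins), v_bins - 1)
        hist[(hi * s_bins + si) * v_bins + vi] += 1.0
    total = sum(hist) or 1.0
    return [count / total for count in hist]


def visual_similarity(pixels_i: Sequence[Pixel], pixels_j: Sequence[Pixel]) -> float:
    """Histogram intersection in [0, 1] (an assumed choice of measure)."""
    hist_i, hist_j = hsv_histogram(pixels_i), hsv_histogram(pixels_j)
    return sum(min(a, b) for a, b in zip(hist_i, hist_j))


def total_similarity(s_v: float, s_m: float, w_v: float = 0.3, w_m: float = 0.7) -> float:
    """S = w_v * S_v + w_m * S_m, with weights inside the ranges given in the text."""
    return w_v * s_v + w_m * s_m


red = [(1.0, 0.0, 0.0)] * 4
blue = [(0.0, 0.0, 1.0)] * 4
print(visual_similarity(red, red), visual_similarity(red, blue))
```

The motion feature similarity is left as an input here, because the text does not describe how it is computed beyond its dependence on the number of shots and the search range.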

The weights are calculated as follows: $w_m$ is the ratio of the variance of the motion feature similarity to the sum of the variances of the motion feature similarity and the visual feature similarity; $w_v$ is the ratio of the variance of the visual feature similarity to the same sum.
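
The variance-ratio weights can be computed as in the following sketch. It assumes the variances are population variances taken over a set of sampled similarity values, and it falls back to equal weights when both variances are zero; neither detail is stated in the text, and the name `similarity_weights` is hypothetical.

```python
from statistics import pvariance
from typing import Sequence, Tuple


def similarity_weights(
    s_v_samples: Sequence[float], s_m_samples: Sequence[float]
) -> Tuple[float, float]:
    """Return (w_v, w_m) as each variance divided by the sum of both variances."""
    var_v = pvariance(s_v_samples)
    var_m = pvariance(s_m_samples)
    total = var_v + var_m
    if total == 0.0:
        return 0.5, 0.5  # assumption: equal weights when neither feature varies
    return var_v / total, var_m / total


w_v, w_m = similarity_weights([0.5, 0.5, 0.7, 0.7], [0.2, 0.2, 0.8, 0.8])
print(round(w_v, 3), round(w_m, 3))
```

By construction the two weights sum to 1, and the feature whose similarity varies more across the samples receives the larger weight.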

Once all key frames are determined, the frames between two key frames belong to the same scene. Within a scene, apart from the first frame, which is intra-prediction coded, the remaining images are processed with inter-prediction coding. Existing inter-prediction coding offers dozens of modes; because the prediction modes of adjacent frames are highly correlated, the range of selectable prediction modes can be narrowed, reducing the complexity of the algorithm. Both intra-prediction and inter-prediction coding produce a residual, that is, the difference between the predicted image and the original image. The residual is transformed, quantized and entropy coded, then transmitted to the decoder together with the prediction information, forming the bitstream.

In this example, the prediction coding mode is selected as follows. First, the input original pixels are sub-sampled 2:1, edge direction vectors are calculated for the sampled pixels, the edge direction histogram of the macroblock is generated, and candidate prediction modes are found from the histogram. Next, the edge direction histogram is checked for unimodality: if it is unimodal, the prediction coding mode with the largest amplitude in the histogram and its two adjacent prediction coding modes are selected as candidate prediction coding modes; if it is not unimodal, the DC coding mode is used. Finally, the distortion cost is calculated for each candidate prediction coding mode, and the one with the smallest distortion cost is selected as the final prediction coding mode.

In step S3 of this example, the inter-prediction coding preprocessing arranges the frames between two adjacent key frames as bidirectional difference frames and difference frames in interleaved order.
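
The histogram-based candidate selection can be sketched as below. The code starts from an already computed edge direction histogram with one bin per directional mode. The unimodality test used here (values rise to a single peak and then fall), the handling of the histogram as a non-circular list, and the names `candidate_modes` and `select_mode` are assumptions for illustration; the text does not define how unimodality is judged.

```python
from typing import Callable, List, Sequence

DC_MODE = "DC"


def is_unimodal(hist: Sequence[float]) -> bool:
    """Assumed test: non-decreasing up to the peak, non-increasing after it."""
    peak = max(range(len(hist)), key=hist.__getitem__)
    rising = all(hist[k] <= hist[k + 1] for k in range(peak))
    falling = all(hist[k] >= hist[k + 1] for k in range(peak, len(hist) - 1))
    return rising and falling


def candidate_modes(hist: Sequence[float], mode_names: Sequence[str]) -> List[str]:
    """Peak mode plus its two neighbours if unimodal, otherwise DC only."""
    if not is_unimodal(hist):
        return [DC_MODE]
    peak = max(range(len(hist)), key=hist.__getitem__)
    picked = [k for k in (peak - 1, peak, peak + 1) if 0 <= k < len(hist)]
    return [mode_names[k] for k in picked]


def select_mode(
    hist: Sequence[float], mode_names: Sequence[str], cost: Callable[[str], float]
) -> str:
    """Pick the candidate with the smallest distortion cost."""
    return min(candidate_modes(hist, mode_names), key=cost)


modes = ["d0", "d1", "d2", "d3", "d4"]  # hypothetical directional mode labels
costs = {"d1": 9.0, "d2": 5.0, "d3": 3.0, "DC": 4.0}
print(candidate_modes([1, 3, 7, 4, 2], modes))
print(select_mode([1, 3, 7, 4, 2], modes, costs.__getitem__))
```

Restricting the cost evaluation to at most three directional modes, or to DC alone, is what saves computation compared with evaluating every mode.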

That is, the original video uses the frame structure IB...BPB...BP. The first frame of each new scene is an I frame, that is, a key frame; B frames and P frames then follow in interleaved order. A B frame is a bidirectional difference frame: it records the differences between the current frame and the frames before and after it. A P frame is a difference frame: it represents the difference between the current frame and the preceding key frame or the preceding P frame. The adaptive frame-skipping coding in step S4 actively skips the B frames without encoding them and encodes only the I frames and P frames; the decoder decodes the I frames and P frames and restores the skipped B frames. The number of B frames between each pair of I and P frames, or P and P frames, is determined adaptively so that the distortion of every B frame image stays within the range permitted by the image quality requirements. The more B frames there are, the more pronounced the compression.
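
To make the IB...BPB...BP structure concrete, the sketch below labels each frame of a sequence. It assumes a fixed number of skipped B frames per gap, whereas the text determines that number adaptively, and it forces the final frame to be a P frame so that every B frame has an anchor on both sides. Both simplifications, and the name `assign_frame_types`, are assumptions.

```python
from typing import Iterable


def assign_frame_types(num_frames: int, key_frames: Iterable[int], skipped_per_gap: int) -> str:
    """Label frames as I (key), B (skipped) or P (difference coded)."""
    key_set = set(key_frames)
    types = []
    run = 0  # number of consecutive B frames since the last I or P frame
    for index in range(num_frames):
        if index in key_set:
            types.append("I")
            run = 0
        elif run < skipped_per_gap:
            types.append("B")
            run += 1
        else:
            types.append("P")
            run = 0
    if types and types[-1] == "B":
        # Assumption: a trailing B frame has no following anchor, so encode it as P.
        types[-1] = "P"
    return "".join(types)


layout = assign_frame_types(10, key_frames=[0, 6], skipped_per_gap=2)
print(layout)
print("encoded frames:", sum(1 for t in layout if t != "B"), "of", len(layout))
```

In this toy layout only 4 of the 10 frames are encoded; the other 6 are left for the decoder to restore.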

In this example, the optimal number of bidirectional difference frames to skip is determined as follows: m video frames are compressed by frame-skipping compression and by standard H.264 compression, and the total distortion of the m reconstructed frames is compared; n is initially 0 and is increased step by step; when the two conditions [formulas missing in the source text] are both satisfied, skipping i frames is optimal, where $B_c(n)$ and $B_s(n)$ denote the distortion under frame-skipping compression and under standard H.264 compression, respectively.

In step S4 of this example, the adaptive frame-skipping coding applies intra-frame compression coding to key frames, skips bidirectional difference frames, and applies difference coding relative to the previous key frame to difference frames.

In the adaptive frame-skipping coding, when a key frame is encoded, all intra-prediction coding modes are traversed for the key frame and the mode producing the smallest amount of data is selected as the best coding mode. When a difference frame is difference-coded, for the current macroblock, the prediction coding mode of the corresponding macroblock in the previous frame is read first, the prediction coding modes correlated with it are selected as the candidate range, motion estimation and rate-distortion cost are then computed for each mode in the candidate range, and the mode with the smallest rate-distortion cost is selected as the prediction coding mode of the difference frame.
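
The candidate-range idea for difference frames can be illustrated as follows. The table `RELATED_MODES` is a hypothetical correlation table invented for the example, the `evaluate` callback stands in for motion estimation, and the cost uses the common Lagrangian form J = D + λR; the text does not specify which modes count as correlated or how the cost is formed.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical table: modes considered correlated with the previous frame's mode.
RELATED_MODES: Dict[str, List[str]] = {
    "SKIP": ["SKIP", "16x16"],
    "16x16": ["SKIP", "16x16", "16x8", "8x16"],
    "16x8": ["16x16", "16x8", "8x8"],
    "8x16": ["16x16", "8x16", "8x8"],
    "8x8": ["16x8", "8x16", "8x8"],
}


def rd_cost(distortion: float, rate: float, lagrange_multiplier: float) -> float:
    """Lagrangian rate-distortion cost J = D + lambda * R."""
    return distortion + lagrange_multiplier * rate


def choose_inter_mode(
    previous_mode: str,
    evaluate: Callable[[str], Tuple[float, float]],
    lagrange_multiplier: float = 10.0,
) -> str:
    """Evaluate only modes related to the co-located macroblock's previous mode."""
    candidates = RELATED_MODES.get(previous_mode, list(RELATED_MODES))
    best_mode, best_cost = candidates[0], float("inf")
    for mode in candidates:
        distortion, rate = evaluate(mode)  # stands in for motion estimation
        cost = rd_cost(distortion, rate, lagrange_multiplier)
        if cost < best_cost:
            best_mode, best_cost = mode, cost
    return best_mode


# Made-up (distortion, rate) measurements for one macroblock.
measured = {"SKIP": (500, 1), "16x16": (200, 20), "16x8": (150, 30),
            "8x16": (160, 28), "8x8": (100, 60)}
print(choose_inter_mode("16x16", measured.__getitem__))
print(choose_inter_mode("8x8", measured.__getitem__))
```

Only the modes in the candidate range are evaluated, which is where the reduction in complexity comes from.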

The input video image is divided into regions in units of macroblocks; depending on their role in the process, these are called coding units, prediction units and transform units. This example places higher demands on key-frame coding, so when intra-predicting a key frame the prediction units can be made smaller to improve prediction accuracy.

As shown in Figure 1, when the video sequence of input is two-way difference frame, calculating two-way difference frame in the step S4 Front and back key frame between frame it is poor；Compare frame difference and frame difference threshold value, which can generally change according to video bits number Amount accounting is foundation to determine, is such as set as 0.4 or so, can also be defined and modify according to actual requirement.If frame is poor Less than the poor threshold value of frame, then the coding to the two-way difference frame is saved；If frame difference is greater than frame difference threshold value, to the two-way difference frame Region compensates, then compensated information is encoded；The calculation formula of the frame difference is C=∑_{I, j}| A (i, j)-B (i, j)|²/ n, wherein C represents that frame is poor, and A (i, j) and B (i, j) respectively represent the pixel of former and later two key frames, and n is image institute The pixel sum for including.

Because object motion is nonlinear while restoration uses linear estimation, local regions can be severely distorted, although their extent is small. Improving just these regions can substantially raise the video image quality, so for such images only these local regions need to be encoded, and coding of the rest of the image can still be omitted. Local coding requires identifying the image regions with high distortion. The reference frames in the coded picture buffer can be used to restore the intermediate frame, and the original frame is compared with the restored frame image. If the distortion cost of the frame is small, no local compensation is needed and the frame can be restored at the decoder. If the distortion cost is large, the blocks in the image whose distortion cost exceeds the threshold are encoded. Every B frame is judged in this way.
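
Identifying the blocks that need local coding can be sketched as below. The per-block mean squared error used as the distortion cost, the block size, and the name `blocks_to_encode` are assumptions for illustration; the text only says that blocks whose distortion cost exceeds a threshold are encoded.

```python
from typing import List, Sequence, Tuple

Image = Sequence[Sequence[float]]


def blocks_to_encode(
    original: Image, restored: Image, block_size: int, threshold: float
) -> List[Tuple[int, int]]:
    """Return (top, left) corners of blocks whose mean squared error exceeds threshold."""
    height, width = len(original), len(original[0])
    flagged = []
    for top in range(0, height, block_size):
        for left in range(0, width, block_size):
            errors = [
                (original[y][x] - restored[y][x]) ** 2
                for y in range(top, min(top + block_size, height))
                for x in range(left, min(left + block_size, width))
            ]
            if sum(errors) / len(errors) > threshold:
                flagged.append((top, left))
    return flagged


original = [[0] * 4 for _ in range(4)]
restored = [[0, 0, 10, 10], [0, 0, 10, 10], [0, 0, 0, 0], [0, 0, 0, 0]]
print(blocks_to_encode(original, restored, block_size=2, threshold=25.0))
```

Only the top-right 2x2 block is flagged in this toy case, so only that region would be locally encoded while the rest of the frame is still skipped.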

Some sequences in a video change substantially, while others move relatively slowly. When a video sequence changes substantially between frames, motion estimation is not accurate enough and the quality of the restored images drops noticeably; when it changes slowly, the drop in image quality is not noticeable and visual requirements are met. Because this limitation affects the quality of image reconstruction, images that are severely distorted after restoration need local processing. The encoder marks the information for such images and transmits it to the decoder.

An uncompensated image frame is restored by taking the weighted sum of the pixel values at the corresponding positions in the adjacent preceding and following I or P frames to obtain each pixel of the restored frame. A compensated image frame is restored by finding the motion vectors of the B frame from the adjacent preceding and following I or P frames, determining a compensation range from the local coding information, and compensating the local image within this range using vector adjustment and bidirectional motion estimation, finally yielding a restored frame that meets the requirements.
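
Restoration of an uncompensated B frame by weighted summation can be sketched as follows. The text does not say how the weights are chosen; this example assumes linear weights based on temporal distance to the two anchor frames, and the function name `interpolate_frame` is hypothetical.

```python
from typing import List, Sequence

Image = Sequence[Sequence[float]]


def interpolate_frame(
    prev_anchor: Image, next_anchor: Image,
    prev_index: int, next_index: int, index: int,
) -> List[List[float]]:
    """Restore a skipped frame as a per-pixel weighted sum of its two anchors."""
    if not prev_index < index < next_index:
        raise ValueError("index must lie strictly between the two anchor frames")
    # Assumption: the nearer anchor gets the larger weight (linear in time).
    weight_next = (index - prev_index) / (next_index - prev_index)
    weight_prev = 1.0 - weight_next
    return [
        [weight_prev * a + weight_next * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(prev_anchor, next_anchor)
    ]


prev_frame = [[0.0, 100.0]]
next_frame = [[40.0, 20.0]]
print(interpolate_frame(prev_frame, next_frame, prev_index=0, next_index=4, index=1))
```

The frame one step after the first anchor takes three quarters of its value from that anchor and one quarter from the following one. The compensated case, which involves motion vectors and bidirectional motion estimation, is not covered by this sketch.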

In this example, B frames are calculated from their adjacent I or P frames, so the I and P frames affect how well the B frames are restored, and P frames are in turn obtained by inter prediction. The accuracy of inter prediction must therefore be improved; to this end, smaller macroblocks of uniform size can be used to improve prediction accuracy and compression ratio. After adaptive frame-skipping coding, the video images, together with the header information required for coding, form the bitstream, which is packetized through the network abstraction layer according to the RTSP protocol for transmission and storage, meeting the lower transmission bandwidth requirement.

By using adaptive, active frame-skipping coding, this example can greatly reduce the number of video images that need to be encoded at the encoder, while the decoder can restore the unencoded images by temporal-interpolation frame filling, using decoded images as reference frames. This effectively improves the compression ratio of the video and substantially reduces its bitstream. The bits saved not only reduce the required transmission bandwidth but can also be used to improve the coding quality of key frames, improving the quality of the reconstructed images and the accuracy of the restored images.

The above is a further detailed description of the present invention in conjunction with specific preferred embodiments, and the specific implementation of the invention should not be regarded as limited to these descriptions. A person of ordinary skill in the art to which the invention belongs may make a number of simple deductions or substitutions without departing from the inventive concept, and all of these shall be regarded as falling within the protection scope of the invention.

Claims

1. A video coding method for low transmission bandwidth, characterized by comprising the following steps:

Step S1, obtaining the images of the original video;

Step S2, preprocessing the original video to obtain key frames;

Step S3, performing inter-prediction coding preprocessing on the frames between two adjacent key frames;

Step S4, encoding the original video using adaptive frame-skipping coding;

Step S5, outputting the generated bitstream; wherein the key frame in the step S2 is the first frame of each scene; the first key frame is taken as the reference frame, and the similarity between this key frame and each subsequent frame to be encoded is calculated in turn and compared with a preset threshold; when the similarity falls below the preset threshold, that frame is determined to be the next key frame; the similarity is calculated as $S(z_i, z_j) = w_v S_v(z_i, z_j) + w_m S_m(z_i, z_j)$, where $S(z_i, z_j)$ is the total similarity between frame $i$ and frame $j$, $S_v(z_i, z_j)$ and $S_m(z_i, z_j)$ are the visual feature similarity and the motion feature similarity, respectively, and $w_v$ and $w_m$ are the weights of the visual and motion components, respectively;

in the step S3, the inter-prediction coding preprocessing arranges the frames between two adjacent key frames as B frames and P frames in interleaved order;

in the step S4, the adaptive frame-skipping coding applies intra-frame compression coding to key frames, skips B frames, and applies difference coding relative to the previous key frame to P frames.

2. The video coding method for low transmission bandwidth according to claim 1, characterized in that, in the adaptive frame-skipping coding, when a key frame is encoded, all intra-prediction coding modes are traversed for the key frame and the mode producing the smallest amount of data is selected; when a P frame is difference-coded, for the current macroblock, the prediction coding mode of the corresponding macroblock in the previous frame is read first, the prediction coding modes correlated with it are selected as the candidate range, motion estimation and rate-distortion cost are then computed for each mode in the candidate range, and the mode with the smallest rate-distortion cost is selected as the prediction coding mode of the P frame.

3. The video coding method for low transmission bandwidth according to claim 2, characterized in that the prediction coding mode is selected as follows: first, the input original pixels are sub-sampled 2:1, edge direction vectors are calculated for the sampled pixels, the edge direction histogram of the macroblock is generated, and candidate prediction modes are found from the histogram; next, the edge direction histogram is checked for unimodality, and if it is unimodal, the prediction coding mode with the largest amplitude in the histogram and its two adjacent prediction coding modes are selected as candidate prediction coding modes, whereas if it is not unimodal, the DC coding mode is used; finally, the distortion cost is calculated for each candidate prediction coding mode, and the one with the smallest distortion cost is selected as the final prediction coding mode.

4. The video coding method for low transmission bandwidth according to claim 1, characterized in that the number of B frames to skip is determined as follows: m video frames are compressed by frame-skipping compression and by standard H.264 compression, and the total distortion of the m reconstructed frames is compared; n is initially 0 and is increased step by step; when the two conditions [formulas missing in the source text] are both satisfied, skipping i frames is optimal, where $B_c(n)$ and $B_s(n)$ denote the distortion under frame-skipping compression and under standard H.264 compression, respectively.

5. The video coding method for low transmission bandwidth according to claim 1, characterized in that, in the step S4, when the input frame is a B frame, the frame difference between the key frames before and after the B frame is calculated and compared with a frame-difference threshold; if the frame difference is less than the threshold, coding of that B frame is omitted; if the frame difference is greater than the threshold, the affected regions of the B frame are compensated and the compensation information is encoded; the frame difference is calculated as $C = \sum_{i,j} |A(i,j) - B(i,j)|^2 / n$, where $C$ is the frame difference, $A(i,j)$ and $B(i,j)$ are the pixels of the preceding and following key frames, respectively, and $n$ is the total number of pixels in the image.