CN101094398A - Video conversion process method based on transform of time resolution - Google Patents
- Publication number
- CN101094398A (application number CN 200610061250)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Abstract
The method comprises: a) in temporal resolution conversion using a master control vector selection method, detecting how the macroblocks coded in inter-frame mode in the coded video stream to be converted cover the discarded frame; b) when the matching macroblock at the corresponding position in the reference frame covers two or four coded macroblocks, the covered areas of those macroblocks are close in size, and all of them are coded in inter-frame mode, determining the matching position of the current coded macroblock in the previous reference frame according to the DCT coefficients of those macroblocks, and obtaining the motion vector of the current coded macroblock from that position.
Description
Technical Field
The invention relates to computing, and in particular to a video conversion processing method based on time resolution transformation.
Background
Video transcoding converts a video stream from one compression format to another, where "format" covers both the syntax of the bitstream and the parameters carried in it, such as bit rate, spatial resolution of the video images, temporal resolution, and adaptability to network bandwidth. In essence, it converts a video stream in one compression format into a video stream in the same or a different compression format, in order to adapt to the bandwidth conditions of the transmission network or to the decoding capability of the client.
Scalable video coding is another approach: it compresses video into multiple layers and transmits as many layers as the current network bandwidth allows. The original video data is compressed into a base layer and several enhancement layers. The base layer must always be transmitted in full; the wider the bandwidth, the more enhancement layers can be transmitted and the better the quality of the reconstructed video image. The enhancement layers depend on the base layer, and without it any number of enhancement layers is useless.
Scalable video coding mainly comprises spatial layering, temporal layering, signal-to-noise ratio (SNR) layering, fine granularity scalability (FGS) and similar methods. A single encoding pass produces one stream suitable for a variety of channel conditions, so it is more flexible than video transcoding. However, scalable video requires the decoder to support multi-layer decoding, and such a complex decoding function is not easily supported on a handheld terminal. In addition, its rate control is far more complex than that of video transcoding, which raises the computational load on the streaming media server, and layering adds header information, so under the same bandwidth the reconstructed video image quality is worse than with video transcoding. Multiple description coding encodes the original video into several video streams, any one of which can be decoded and played on its own; combining several decoded streams gives better reconstructed image quality and better network adaptability, but the implementation complexity is high and the reconstructed image quality generally fluctuates widely. Therefore, compared with video transcoding, scalable video coding has high computational complexity and a limited range of application.
With the continuous improvement of wireless network bandwidth and the development of multimedia retrieval services, the demand for temporal transcoding of existing coded video streams keeps growing. Temporal resolution conversion (frame rate conversion) drops frames from the input video stream, removing the data bits of some frames in the coded stream, to match changes in network bandwidth or the decoding capability of the terminal. The simplest method is to discard bidirectionally predicted frames (B frames): because a B frame is not used as a reference by other frames during encoding, the transcoder only needs to perform syntax conversion on the original stream, and the decoder can still decode the other frames correctly. However, when the dropped frames are not limited to B frames, some motion vectors of the original stream become invalid because their reference frames are missing from the new stream, and these motion vectors must be corrected. The key problem of video temporal resolution conversion therefore comes down to fast re-estimation of motion vectors, that is, finding a scheme that reuses the motion vector information in the original stream without introducing obvious degradation of the reconstructed video image quality.
Existing processing methods fall into three main groups. The most direct temporal resolution transcoding approach fully decodes the coded video stream to be converted and then, in the required coding format, performs motion estimation again with a full search block matching algorithm (FS), without using any motion information from the coded stream. This yields a coded video stream that meets the conversion requirement with good reconstructed image quality, but the full search consumes most of the processor's computing resources, so it is generally not used. The second approach applies bilinear interpolation to the motion vectors and then corrects the interpolated vector: the search range is determined from the number of skipped frames and the accumulated motion vector magnitude, and a search within that range gives the updated motion vector.
The third method is forward dominant vector selection (FDVS), proposed by Jeongnam Youn and Ming-Ting Sun in the article "A Fast Motion Vector Composition Method for Temporal Transcoding". It requires less computation and performs better than the two methods above. As shown in FIG. 1, the four small squares that make up one large square in each frame are the coded macroblocks S1 (upper left), S2 (upper right), S3 (lower left) and S4 (lower right). For the coded macroblock in the following frame (n) that uses the discarded frame (n-1) as its reference and has motion vector I1(n), the method finds the matching macroblock MB1' at the corresponding position in the discarded frame (n-1). It then selects one master motion vector from the motion vectors of the four partially covered coded macroblocks S1, S2, S3 and S4. In FIG. 1 this is the motion vector I2(n-1) of the master macroblock S2, where the master macroblock is the one of the four that the matching macroblock covers most. However, when the areas of the four coded macroblocks covered at the matching position in the reference frame (n-1) are relatively close, as shown in FIG. 2, FDVS still simply selects the master motion vector by coverage area. The master motion vector obtained this way lacks representativeness, which degrades the quality of the reconstructed video image at the decoder.
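The area-based selection that FDVS performs can be sketched in Python as follows. This is a minimal illustration, not the cited authors' implementation: it assumes 16 × 16 macroblocks on a regular grid, integer-pixel motion vectors, and a hypothetical `mv_field` dictionary that maps a macroblock's (column, row) index in the discarded frame to its motion vector.

```python
MB = 16  # macroblock size in pixels (assumed 16x16)


def overlap_areas(x, y, mv):
    """Return {(col, row): covered_pixels} for the grid macroblocks covered by
    the macroblock at pixel position (x, y) displaced by mv = (dx, dy)."""
    px, py = x + mv[0], y + mv[1]  # top-left corner of the matching macroblock
    areas = {}
    for col in {px // MB, (px + MB - 1) // MB}:
        for row in {py // MB, (py + MB - 1) // MB}:
            w = min(px + MB, (col + 1) * MB) - max(px, col * MB)
            h = min(py + MB, (row + 1) * MB) - max(py, row * MB)
            areas[(col, row)] = w * h
    return areas


def fdvs_dominant(x, y, mv, mv_field):
    """Pick the covered macroblock with the largest overlap and return its
    index together with its motion vector (the master motion vector)."""
    areas = overlap_areas(x, y, mv)
    dominant = max(areas, key=areas.get)
    return dominant, mv_field[dominant]
```

With a displacement of (-8, -8) every covered macroblock receives 64 pixels, so the largest-area rule has no meaningful winner; this is the situation of FIG. 2.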
Disclosure of Invention
The invention aims to provide a video conversion processing method based on time resolution transformation that solves the problem in the prior-art FDVS algorithm that the motion vector selected when the coverage areas are close lacks representativeness, and thereby obtains better reconstructed video image quality.
The video conversion processing method based on time resolution transformation adopted by the invention comprises the following steps:
A. in temporal resolution conversion using a master control vector selection method, detecting how the macroblocks coded in inter-frame mode in the video stream to be transcoded cover the discarded reference frame;
B. at least when the matching macroblock covers two or four coded macroblocks at its matching position in the reference frame, the covered areas of those macroblocks are relatively close, and all of them are coded in inter-frame mode, determining the matching position of the current coded macroblock in the previous reference frame according to the Discrete Cosine Transform (DCT) direct current (DC) coefficients of those coded macroblocks, and obtaining the motion vector of the current coded macroblock from that matching position.
In the step B, if the coded macro block covered by the corresponding matching position of the matching macro block in the reference frame is four blocks, the reference value of the DCT dc coefficient calculation of the coded macro block is determined according to the respective motion vectors of the four covered coded macro blocks.
After the step B, if the reference frame of the current coding macro block is also discarded, searching forward until the reference frame of the current coding macro block is not the discarded frame, and obtaining the motion vector of the current coding macro block.
The step B comprises the following steps:
B1, if four coded macroblocks are covered, determining whether any covered portion is larger than the pixel threshold T, and performing the following operations:
B11, if a covered portion larger than the pixel threshold T exists, selecting the corresponding macroblock as the master macroblock; if the master macroblock is coded in intra-frame mode, the current macroblock is also coded in intra-frame mode and the process ends; if the master macroblock is coded in inter-frame mode, proceeding to step B2;
B12, otherwise, determining whether any covered portion is coded in intra-frame mode, and performing the following operations:
B121, if a covered portion coded in intra-frame mode exists, the current macroblock is also coded in intra-frame mode, and the process ends;
B122, otherwise, determining whether the motion vectors of the four covered coded macroblocks are all different from one another, and performing the following operations:
B1221, if the motion vectors of the four covered coded macroblocks are all different, selecting the position in the previous reference frame corresponding to the covered coded macroblock with the largest residual DCT DC coefficient as the matching position of the current coded macroblock in the previous reference frame, and continuing with step B2;
B1222, otherwise, selecting the position in the previous reference frame corresponding to the covered coded macroblock with the smallest residual DCT DC coefficient as the matching position of the current coded macroblock in the previous reference frame, and continuing with step B2;
B2, determining whether the reference frame of the current coded macroblock is also discarded, and performing the following operations:
B21, if it is not discarded, directly updating the currently obtained motion vector;
B22, otherwise, searching forward until the reference frame of the current coded macroblock is not a discarded frame, and obtaining the motion vector of the current coded macroblock.
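The four-block decision above can be sketched in Python as follows. This is a hedged illustration rather than the patented implementation: the `Covered` record, the function name and the return convention are hypothetical, the value of the threshold T is not given in the text, and "not equal" is read here as "all four vectors pairwise different".

```python
from dataclasses import dataclass


@dataclass
class Covered:
    area: int           # pixels of this macroblock covered by the matching macroblock
    intra: bool         # True if the macroblock is intra-coded
    mv: tuple           # (dx, dy) pointing into the previous reference frame
    residual_dc: float  # residual DCT DC coefficient used for the comparison


def select_four(blocks, T):
    """Return ("intra", None) or ("inter", chosen_block) for four covered blocks."""
    big = [b for b in blocks if b.area > T]
    if big:
        # A clearly dominant block exists: behave like plain FDVS.
        master = max(big, key=lambda b: b.area)
        return ("intra", None) if master.intra else ("inter", master)
    if any(b.intra for b in blocks):
        return ("intra", None)
    # Areas are close and all blocks are inter-coded: decide by residual DC.
    all_different = len({b.mv for b in blocks}) == len(blocks)
    pick = max if all_different else min
    return ("inter", pick(blocks, key=lambda b: b.residual_dc))
```

The chosen block's motion vector then gives the matching position in the previous reference frame; checking whether that frame was dropped as well is a separate step.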
The step B also comprises the following steps:
BA. If the number of covered coded macro blocks is two, judging whether a covered part larger than a pixel point threshold value T exists, and performing the following operations:
BA1, if the covered part larger than the pixel point threshold value T exists, if the covered coding macro block is coded by adopting an interframe coding mode, adopting the matching position of the covered coding macro block in the previous reference frame as the matching position of the current coding macro block in the previous reference frame, continuing the step BB, if the covered coding macro block is coded by adopting an intraframe coding mode, coding the current coding macro block by adopting the intraframe coding mode, and ending the process;
BA2, otherwise, judging whether there is covered part coded by adopting intra-frame coding mode, and carrying out the following operations:
BA21, if there is a covered part coded by adopting the intra-frame coding mode, the current macro block is also coded by adopting the intra-frame coding mode, and the process is ended;
BA22, otherwise, comparing the sum of the dc coefficients of two 8 × 8 blocks with larger coverage area in the two macroblocks, selecting the matching position of the larger one of the two covered coded macroblocks in the previous reference frame as the matching position of the current coded macroblock in the previous reference frame, and continuing with the following step BB;
BB. Judging whether the reference frame of the current coding macro block is also discarded or not, and carrying out the following operations:
BB1, if not discarded, directly updating the currently obtained motion vector;
BB2, otherwise, searches forward until the reference frame of the current coded macroblock is not a dropped frame, obtaining the motion vector of the current coded macroblock.
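The two-block comparison can be sketched in Python as follows. The dictionary layout and function names are hypothetical, the threshold T is left as a parameter because the text gives no value, and magnitudes are summed so that positive and negative DC values do not cancel, which is an assumption; the text only says "sum of the DC coefficients".

```python
def dc_sum_of_most_covered(blocks8):
    """blocks8: list of (covered_area, dc) for the four 8x8 blocks of one
    macroblock. Sum the DC magnitudes of the two most covered 8x8 blocks."""
    top_two = sorted(blocks8, key=lambda b: b[0], reverse=True)[:2]
    return sum(abs(dc) for _, dc in top_two)


def select_two(mb_a, mb_b, T):
    """mb_*: dict with keys 'area', 'intra', 'mv', 'blocks8'.
    Return ("intra", None) or ("inter", motion_vector_of_chosen_block)."""
    big = [mb for mb in (mb_a, mb_b) if mb["area"] > T]
    if big:
        master = max(big, key=lambda mb: mb["area"])
        return ("intra", None) if master["intra"] else ("inter", master["mv"])
    if mb_a["intra"] or mb_b["intra"]:
        return ("intra", None)
    best = max((mb_a, mb_b), key=lambda mb: dc_sum_of_most_covered(mb["blocks8"]))
    return ("inter", best["mv"])
```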
The step BB is followed by the step BC:
BC. Further selecting a search window to update the motion vector, said search window being updated according to the following equation:
$$SW = \pm\min\left(\mathrm{defaultSW},\ \alpha \cdot \beta^{j-1}\right)$$
wherein SW is the size of the search window required when the current motion vector is updated, defaultSW is the default search window size in the full search block matching calculation, j is the number of dropped frames between the current frame and its non-dropped reference frame, α is 1, and β is 1.25.
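As a worked illustration, reading the flattened formula as α·β^(j-1) capped at defaultSW (an assumption about the lost superscript), the parameters α = 1 and β = 1.25 give:

$$
\begin{aligned}
j = 1:&\quad \alpha\beta^{0} = 1 \\
j = 2:&\quad \alpha\beta^{1} = 1.25 \\
j = 3:&\quad \alpha\beta^{2} = 1.5625
\end{aligned}
$$

Each value is used as SW as long as it stays below defaultSW. The text does not say how fractional values are rounded to a pixel or half-pixel range.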
The step B also comprises the following steps:
Ba. If only one coded macroblock is covered and that macroblock is coded in inter-frame mode, taking the matching position of that macroblock in the previous reference frame as the matching position of the current macroblock in the previous reference frame, and proceeding to step Bb;
Bb. Determining whether the reference frame of the current coded macroblock is also discarded, and performing the following operations:
Bb1, if it is not discarded, directly updating the currently obtained motion vector;
Bb2, otherwise, searching forward until the reference frame of the current coded macroblock is not a discarded frame, and obtaining the motion vector of the current coded macroblock.
The step B also comprises the following step C:
C. and further selecting a search window to update according to the motion vector condition and the frame loss number of the four covered macro blocks.
The step C comprises the following steps:
C1, determining whether the motion vectors of the four covered coded macroblocks are all equal, and performing the following operations:
C11, if the motion vectors are equal, the motion vector does not need to be updated;
C12, otherwise, determining whether the motion vectors of the four covered coded macroblocks are all different from one another, and performing the following operations:
C121, if they are all different, selecting a larger search window for the update, the search window being given by the following formula:
$$SW = \pm\min\left(\mathrm{defaultSW},\ \alpha \cdot \beta^{j-1}\right)$$
wherein SW is the size of the search window required when the current motion vector is updated, defaultSW is the default search window size in the full search block matching calculation, j is the number of dropped frames between the current frame and its non-dropped reference frame, α is 2, and β is 1.5;
C122, otherwise, selecting a smaller search window for the update, the search window being given by the following formula:
$$SW = \pm\min\left(\mathrm{defaultSW},\ \alpha \cdot \beta^{j-1}\right)$$
wherein SW is the size of the search window required when the current motion vector is updated, defaultSW is the default search window size in the full search block matching calculation, j is the number of dropped frames between the current frame and its non-dropped reference frame, α is 1, and β is 1.25.
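The search window selection for the four-block case can be sketched in Python as follows. The function name and the default of 15 for `default_sw` are hypothetical, and the formula is read as α·β^(j-1), which is an assumption about the flattened superscript in the text.

```python
def search_window(mvs, j, default_sw=15):
    """Return the half-width SW of the refinement window; the search range is
    [-SW, +SW]. mvs: motion vectors of the four covered macroblocks;
    j: number of dropped frames."""
    distinct = len(set(mvs))
    if distinct == 1:
        return 0                      # all vectors equal: no update needed
    if distinct == len(mvs):
        alpha, beta = 2.0, 1.5        # all different: larger window
    else:
        alpha, beta = 1.0, 1.25       # partly equal: smaller window
    return min(default_sw, alpha * beta ** (j - 1))
```

The window grows with the number of dropped frames and is capped at the full-search default, so the refinement never costs more than a full search.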
The beneficial effects of the invention are as follows. The motion vectors of the transcoded video stream are re-estimated from the motion information in the coded video stream and the DC coefficients obtained by the DCT. This greatly reduces the computation required for transcoding and increases coding speed, while giving reconstructed video image quality close to that of the full search block matching method. At the same time, it effectively reduces the influence of network bandwidth changes on reconstructed video image quality, improves the utilization of network bandwidth, preserves the scalability and interactivity of the video stream, and gives users an excellent visual experience.
Drawings
FIG. 1 is a schematic diagram of a forward master vector selection method;
FIG. 2 is a diagram illustrating a special case of the forward master vector selection method;
FIG. 3 is a schematic diagram of the overall control flow of the present invention;
FIG. 4 is a schematic diagram of the basic control flow for covering four blocks in the present invention;
FIG. 5 is a diagram illustrating a specific control flow for covering four blocks according to the present invention;
FIG. 6 is a diagram illustrating a specific control flow for covering two blocks according to the present invention;
FIG. 7 is a diagram illustrating a specific control flow for covering a block according to the present invention;
Detailed Description
The invention is explained in more detail below with reference to the figures and examples:
In the international video coding standards, inter-frame predictive coding can use either frame prediction or field prediction, so each macroblock of a forward-predicted frame has one or two motion vectors. If an inter-coded macroblock in the input coded video stream (before temporal resolution transcoding) uses frame prediction, its motion vector is taken as the motion vector of the macroblock. If field prediction is used, each macroblock is divided into an odd and an even 16 × 8 block, and the motion vector of the macroblock is taken to be that of the 16 × 8 block whose DCT direct current (DC) coefficients have the larger sum. The reason is as follows. Block matching assumes that all pixels in a macroblock undergo translation with the same motion, but at object edges this condition is hard to satisfy, so block matching tends to produce large prediction errors there. The pixel values of the residual macroblock obtained by motion compensation follow a Laplacian distribution, which means that a quantized DC coefficient is more likely to be non-zero than an AC coefficient. Experiments show that the activity of a macroblock is related to the energy of its DCT coefficients, so a large DC coefficient is used as the indicator of macroblock activity.
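The field-prediction rule can be sketched in Python as follows. The function name and the tuple layout are hypothetical, and the DC magnitudes are summed so that positive and negative values do not cancel, which is an assumption; the text only refers to the larger sum of DC coefficients.

```python
def field_predicted_mv(top, bottom):
    """top / bottom: (motion_vector, dc_coefficients) for the two 16x8 field
    blocks of one macroblock. Return the motion vector of the block whose DC
    coefficients have the larger summed magnitude."""
    def dc_energy(block):
        return sum(abs(c) for c in block[1])
    return max((top, bottom), key=dc_energy)[0]
```

For a frame-predicted macroblock no choice is needed, since it carries a single motion vector.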
According to fig. 3, the overall control flow of the present invention is as follows:
1) in the time video resolution conversion adopting the master control vector selection method, the covering state of a macro block coded by adopting an interframe coding mode in a video stream to be converted and coded to a discarded reference frame is detected.
2) At least when the coding macro blocks covered by the corresponding matching positions of the matching macro blocks in the reference frame are two or four, the areas of the coding macro blocks are relatively close and the coding is carried out in an interframe coding mode, the matching positions of the current coding macro blocks in the previous reference frame are determined according to the DCT direct current coefficients related to the coding macro blocks, and the motion vectors of the current coding macro blocks are obtained according to the matching positions.
Note that the processing of the invention is performed in the pixel domain. The main reason for not working in the frequency domain is that frequency-domain video transcoding rests on four assumptions: that motion compensation is linear, that the clipping function before the frame buffer can be ignored, that the DCT/IDCT has the same arithmetic precision in the encoder and the decoder, and that each macroblock keeps the same coding mode after transcoding as before. These four assumptions are hard to satisfy in practice, so drift error arises and reduces the quality of the reconstructed video image.
As shown in fig. 4, when the coded macroblock covered by the corresponding matching position of the matching macroblock in the reference frame is four blocks, the following basic control flow is adopted:
a) Detecting how the macroblocks coded in inter-frame mode in the video stream to be transcoded cover the discarded reference frame.
b) The covered coding macro blocks are four blocks, the areas of the coding macro blocks are relatively close, and the coding macro blocks are coded in an inter-frame coding mode.
c) And determining the DCT direct current coefficient calculation reference value of the coding macro block according to the respective motion vectors of the four covered coding macro blocks.
d) Judging whether the reference frame of the current coding macro block is also discarded or not, and carrying out the following operations:
d1) if not, directly updating the currently obtained motion vector, and continuing to the following step e).
d2) Otherwise, searching forward until the reference frame of the current coded macroblock is not a discarded frame, obtaining the motion vector of the current coded macroblock, and continuing with step e) below.
e) And further selecting a search window to update according to the motion vector condition and the frame loss number of the four covered macro blocks.
As shown in fig. 5, the following detailed description is made on the specific control flow when the coded macroblock covered by the corresponding matching position of the matching macroblock in the reference frame is four blocks:
1. Detecting how the macroblocks coded in inter-frame mode in the video stream to be transcoded cover the discarded reference frame.
2. When the covered coding macro block is four blocks, judging whether a covered part larger than a pixel point threshold value T exists or not, and carrying out the following operations:
21. If a covered portion larger than the pixel threshold T exists, the corresponding macroblock is selected as the master macroblock, it is judged whether that macroblock is coded in intra-frame mode, and the following operations are carried out:
211. If it is coded in intra-frame mode, the current macroblock is also coded in intra-frame mode, and the process ends.
212. Otherwise, the currently obtained motion vector is directly updated, and the following step 4 is continued.
22. Otherwise, judging whether a covered part coded by adopting an intra-frame coding mode exists or not, and carrying out the following operations:
221. if the covered part coded by adopting the intra-frame coding mode exists, the current macro block is also coded by adopting the intra-frame coding mode, and the process is ended.
222. Otherwise, judging whether the motion vectors of the four covered coded macroblocks are all different from one another, and carrying out the following operations:
2221. If the four motion vectors of the covered coded macroblocks are all different, the position in the previous reference frame corresponding to the covered coded macroblock with the largest residual DCT DC coefficient is selected as the matching position of the current coded macroblock in the previous reference frame, and the following step 3 is continued.
In this case, the following formula applies:
2222. Otherwise, selecting the corresponding position of the covered coding macro block with the minimum residual DCT direct current coefficient in the previous reference frame as the matching position of the current coding macro block in the previous reference frame, and continuing to the following step 3.
In this case, the following formula applies:
3. Judging whether the reference frame of the current coded macroblock is also discarded, and carrying out the following operations:
31. If it is not discarded, directly updating the currently obtained motion vector, and continuing with step 4 below.
32. Otherwise, searching forward until the reference frame of the current coded macroblock is not a discarded frame, and obtaining the motion vector of the current coded macroblock with the following calculation:
$$v_{tx} = x_{n-j} - x_n$$
$$v_{ty} = y_{n-j} - y_n$$
where $(x_n, y_n)$ is the position of the current coded macroblock in the current frame, $(x_{n-j}, y_{n-j})$ is the position of the matching macroblock in the final non-discarded reference frame, $v_{tx}$ and $v_{ty}$ are the horizontal and vertical components of the motion vector of the current macroblock, and $j$ is the number of dropped frames between the current frame and its non-dropped reference frame.
The motion vector of the current coded macroblock is thereby obtained, and the following step 4 is continued.
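The forward search through dropped frames can be sketched in Python as follows. The function and the `select_mv` callback are hypothetical: the callback stands in for whatever rule picks the motion vector at the matching position inside a dropped frame, and the exact indexing of j relative to the frame numbers is left loose here.

```python
def compose_motion_vector(pos, mv, dropped, select_mv):
    """pos: (x_n, y_n) of the current macroblock; mv: its original motion vector;
    dropped: number of consecutive dropped reference frames;
    select_mv(k, position): returns the vector chosen in the k-th dropped frame.
    Returns (v_tx, v_ty) relative to the final non-dropped reference frame."""
    x, y = pos[0] + mv[0], pos[1] + mv[1]   # matching position in the first dropped frame
    for k in range(1, dropped + 1):
        dx, dy = select_mv(k, (x, y))       # follow the chosen vector one frame back
        x, y = x + dx, y + dy
    return (x - pos[0], y - pos[1])         # final position minus current position
```

For example, `compose_motion_vector((32, 32), (-5, -3), 2, lambda k, p: (-2, 1))` follows the original vector and then two hops of (-2, 1), giving (-9, -1).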
4. Judging whether the motion vectors of the four covered coding macro blocks are equal or not, and carrying out the following operations:
41. If they are equal, the motion vector does not need to be updated.
42. Otherwise, judging whether the motion vectors of the four covered coded macroblocks are all different from one another, and carrying out the following operations:
421. If they are all different, a larger search window is selected for the update, the search window being given by the following formula:
$$SW = \pm\min\left(\mathrm{defaultSW},\ \alpha \cdot \beta^{j-1}\right)$$
wherein SW is the size of the search window required when the current motion vector is updated, defaultSW is the default search window size in the full search block matching calculation, j is the number of dropped frames between the current frame and its non-dropped reference frame, α is 2, and β is 1.5.
422. Otherwise, a smaller search window is selected for the update, the search window being given by the following formula:
$$SW = \pm\min\left(\mathrm{defaultSW},\ \alpha \cdot \beta^{j-1}\right)$$
wherein SW is the size of the search window required when the current motion vector is updated, defaultSW is the default search window size in the full search block matching calculation, j is the number of dropped frames between the current frame and its non-dropped reference frame, α is 1, and β is 1.25.
Thus, through the above steps 4, 41, 42, 421, 422, the search window is further selected for updating according to the motion vector condition of the four covered macro blocks and the frame loss number.
As shown in fig. 6, when the coded macroblock covered by the corresponding matching position of the matching macroblock in the reference frame is two blocks, the following specific control flow is adopted:
I. Detecting how the macroblocks coded in inter-frame mode in the video stream to be transcoded cover the discarded reference frame.
II. If the number of covered coded macro blocks is two, judging whether a covered part larger than a pixel point threshold value T exists, and performing the following operations:
II1, if there is a covered part larger than the pixel point threshold value T, if the covered coding macro block adopts the inter-frame coding mode to code, adopting the matching position of the covered coding macro block in the previous reference frame as the matching position of the current coding macro block in the previous reference frame, and continuing the following step III; if the coverage coding macro block is coded by adopting the intra-frame coding mode, the current coding macro block is also coded by adopting the intra-frame coding mode, and the process is ended.
II2, otherwise, judging whether there is covered part coded by adopting intra-frame coding mode, and carrying out the following operations:
II2l, if there is a covered portion coded by intra-frame coding, the current macroblock is also coded by intra-frame coding, and the process ends.
II22, otherwise, comparing the sum of the dc coefficients of the two 8 × 8 blocks with larger coverage area in the two macroblocks, selecting the matching position of the larger one of the two covered coded macroblocks in the previous reference frame as the matching position of the current coded macroblock in the previous reference frame, and continuing with the following step III.
And III, judging whether the reference frame of the current coding macro block is also discarded, and performing the following operations:
III1, if not discarded, directly updating the currently obtained motion vector, and continuing with the following step IV.
III2, otherwise, searching forward until the reference frame of the current coding macro block is not the discarded frame, obtaining the motion vector of the current coding macro block, and continuing as the following step IV.
IV, further selecting a search window to update the motion vector, wherein the search window is updated according to the following formula:
SW=±min(defaultSW,α*βj-1)
wherein, SW is the size of a search window required when the current motion vector is updated, defaultSW is the default search window size in the full search block matching calculation, j is the number of dropped frames between the reference frames of the current frame and the non-dropped frames, α is 1, and β is 1.25.
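The decision made in step II for two covered macroblocks can be sketched as follows. This is an illustrative reading only: the `Covered` record, its field names, the function name and the tie-breaking rule when the two DC sums are equal are all assumptions, not part of the described method.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Covered:
    """One covered macroblock of the dropped frame (hypothetical record)."""
    area: int              # number of pixels overlapped by the matching position
    intra: bool            # True if the macroblock is intra coded
    mv: Tuple[int, int]    # its motion vector into the previous reference frame
    dc_sum: float          # sum of the DC coefficients of its two 8x8 blocks
                           # with the larger covered area


def select_two_block(a: Covered, b: Covered, threshold: int) -> Optional[Covered]:
    """Return the macroblock whose matching position is inherited.

    None means the current macroblock is intra coded and the flow ends.
    """
    dominant = max((a, b), key=lambda m: m.area)
    if dominant.area > threshold:          # step II1
        return None if dominant.intra else dominant
    if a.intra or b.intra:                 # step II21
        return None
    # Step II22: similar areas, both inter coded; the larger DC sum wins.
    # Preferring `a` on a tie is an arbitrary choice made for this sketch.
    return a if a.dc_sum >= b.dc_sum else b
```

The returned macroblock supplies the matching position used in step III; its motion vector is then refined with the window of step IV.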
As shown in fig. 6, when the matching position of the matching macroblock in the reference frame covers a single coded macroblock, the following control flow is adopted:
i. Detecting the covering state, on the discarded reference frame, of each macroblock coded in the inter-frame coding mode in the video stream to be transcoded.
ii. If one coded macroblock is covered and that macroblock is coded in the inter-frame coding mode, taking the matching position of that macroblock in the previous reference frame as the matching position of the current macroblock in the previous reference frame, and going to step iii; if that macroblock is coded in the intra-frame coding mode, the current macroblock is also coded in the intra-frame coding mode, and the process ends.
iii. Judging whether the reference frame of the current coded macroblock has also been discarded, and performing the following operations:
iii1. If it has not been discarded, directly updating the currently obtained motion vector.
iii2. Otherwise, searching forward until the reference frame of the current coded macroblock is not a discarded frame, and obtaining the motion vector of the current coded macroblock.
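The forward search of step iii2 can be pictured as following motion vectors back through consecutive dropped frames. The sketch below assumes that each frame references the immediately preceding frame (as in an IPPP structure) and that the traced vector is the sum of the vectors along the chain; the callback `dominant_mv`, which stands in for the selection flows described above, is hypothetical.

```python
from typing import Callable, Optional, Set, Tuple

Vec = Tuple[int, int]


def trace_motion_vector(frame: int,
                        pos: Vec,
                        mv: Vec,
                        dropped: Set[int],
                        dominant_mv: Callable[[int, int, int], Optional[Vec]]
                        ) -> Optional[Vec]:
    """Compose vectors until a reference frame that was not dropped is reached.

    dominant_mv(ref, x, y) is assumed to return the vector of the covered
    macroblock selected at position (x, y) of dropped frame `ref`, or None
    when that macroblock is intra coded.
    """
    x, y = pos
    total = mv
    ref = frame - 1                       # assumption: previous frame is the reference
    while ref in dropped:
        step = dominant_mv(ref, x + total[0], y + total[1])
        if step is None:
            return None                   # intra coded: current macroblock goes intra
        total = (total[0] + step[0], total[1] + step[1])
        ref -= 1                          # keep searching toward earlier frames
    return total
```

If the immediate reference frame was kept, the loop body never runs and the original vector is returned unchanged, which corresponds to step iii1.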
For example, in a temporal resolution transcoding experiment, seven CIF-format video sequences of differing motion complexity, coded at a bit rate of 384 kbit/s and a frame rate of 30 frames/s, were tested. For each sequence 240 frames were coded, the default search window size was ±7, and GOPs were constructed as IPPPPPPPPP. The transcoded video stream had a bit rate of 100 kbit/s and a frame rate of 12 frames/s (two out of every three adjacent forward-predicted frames were discarded).
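The 12 frames/s figure follows from the GOP structure: of the ten frames in IPPPPPPPPP, the I frame and three of the nine P frames survive, and 30 × 4/10 = 12. A short Python check is given below; which two frames of each group of three are discarded is not stated above, so dropping the first two of each group is an assumption of this sketch, as are the function names.

```python
from typing import List


def kept_frame_indices(gop: str = "IPPPPPPPPP",
                       drop: int = 2,
                       group: int = 3) -> List[int]:
    """Indices of the frames that survive when `drop` of every `group` P frames go."""
    kept = []
    p_count = 0
    for index, frame_type in enumerate(gop):
        if frame_type == "P":
            position_in_group = p_count % group
            p_count += 1
            if position_in_group < drop:   # assumed: the first two of each three are dropped
                continue
        kept.append(index)
    return kept


def output_frame_rate(input_fps: float,
                      gop: str = "IPPPPPPPPP",
                      drop: int = 2,
                      group: int = 3) -> float:
    """Frame rate after dropping, assuming every GOP has the same structure."""
    return input_fps * len(kept_frame_indices(gop, drop, group)) / len(gop)
```

Under these assumptions each kept P frame has two dropped frames between it and the previous kept frame.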
Table 1 below compares the PSNR of the BILINEAR, FDVS, TFMVRE and TFMVRE+R methods with that of the FS algorithm (TFMVRE is the method of the present invention, and TFMVRE+R is the further-updating method adopted in the present invention, as described in step e) or step 4)). For video sequences with slow motion, the reconstructed image quality obtained with the BILINEAR algorithm differs little from that obtained with the FDVS and TFMVRE algorithms; when the video motion is severe, however, the reconstructed quality obtained with the BILINEAR algorithm differs greatly from that of the FDVS and TFMVRE algorithms.
The reason is that when the video motion is slow, a macroblock coded in the inter-frame coding mode in the stream to be transcoded is likely, at its best matching position in the reference frame, to cover most of the area of a single macroblock of the reference frame (as in the video sequence Akiyo), so the motion vector obtained by the BILINEAR algorithm is close to the motion vectors obtained by the FDVS and TFMVRE algorithms.
However, when the video motion is severe, it becomes more likely that the best matching position in the reference frame covers four or two macroblocks with similar areas. The motion vector obtained by the BILINEAR algorithm then lacks representativeness, so the quality of the reconstructed video image is poor (as in the video sequence Football). When the FDVS algorithm is used for temporal resolution transcoding of a video sequence with slow motion, most of the motion vectors obtained are the same as those of the TFMVRE algorithm, and the reconstructed image quality is lower than that of the TFMVRE algorithm only in occasional regions of severe local motion, so the overall difference in reconstructed image quality is small. When the video motion is severe, however, the reconstructed image quality obtained by the FDVS algorithm is lower than that of the TFMVRE algorithm. For the video sequences Foreman and Paris, whose motion is relatively gentle, the FDVS algorithm obtains better reconstructed image quality than for the video sequences Tennis and Football, whose motion is severe; the reason is the same as for the degradation of the BILINEAR algorithm on video with severe motion.
The invention remedies the lack of representativeness, in the FDVS algorithm, of the motion vector selected when the areas of the covered macroblocks are close; in particular, when the video motion is severe, the reconstructed video quality obtained by the invention is considerably better than that of the FDVS algorithm. The further-updating method TFMVRE+R (refinement) of the invention can, by updating the obtained motion vector, achieve reconstructed video image quality close to that of the FS algorithm.
Table 1. PSNR (unit: dB, relative to FS) obtained by the various temporal transcoding algorithms for the different sequences
In summary, the present invention re-estimates the motion vectors of the transcoded video stream by using the motion information in the coded video stream and the DC coefficients obtained by the DCT, thereby reducing the amount of computation in the video transcoding process and increasing the coding speed.
Claims (9)
1. A video conversion processing method based on time resolution conversion is characterized in that: it comprises the following steps:
A. in time video resolution conversion by adopting a master control vector selection method, detecting the covering state of a macro block coded by adopting an interframe coding mode in a coded video stream to be converted to a discarded reference frame;
B. at least when the coding macro blocks covered by the corresponding matching positions of the matching macro blocks in the reference frame are two or four, the areas of the coding macro blocks are relatively close and the coding is carried out in an interframe coding mode, the matching positions of the current coding macro blocks in the previous reference frame are determined according to the DCT direct current coefficients related to the coding macro blocks, and the motion vectors of the current coding macro blocks are obtained according to the matching positions.
2. The temporal resolution transform-based video conversion processing method according to claim 1, wherein: in the step B, if the coded macro block covered by the corresponding matching position of the matching macro block in the reference frame is four blocks, the reference value of the DCT dc coefficient calculation of the coded macro block is determined according to the respective motion vectors of the four covered coded macro blocks.
3. The temporal resolution transform-based video conversion processing method according to claim 2, wherein: after the step B, if the reference frame of the current coding macro block is also discarded, searching forward until the reference frame of the current coding macro block is not the discarded frame, and obtaining the motion vector of the current coding macro block.
4. The method for video conversion processing based on temporal resolution transform according to any one of claims 1 to 3, wherein: the step B comprises the following steps:
B1. if four coded macro blocks are covered, determining whether there is a covered portion larger than the pixel point threshold T, and performing the following operations:
B11. if there is a covered portion larger than the pixel point threshold T, selecting the corresponding macro block as the main control macro block; if the main control macro block is coded in the intra-frame coding mode, the current macro block is also coded in the intra-frame coding mode and the process ends; if the main control macro block is coded in the inter-frame coding mode, going to step B2;
B12. otherwise, judging whether there is a covered portion coded in the intra-frame coding mode, and performing the following operations:
B121. if a covered portion coded in the intra-frame coding mode exists, the current macro block is also coded in the intra-frame coding mode, and the process ends;
B122. otherwise, judging whether the motion vectors of the four covered coded macro blocks are not equal, and performing the following operations:
B1221. if the motion vectors of the four covered coded macro blocks are not equal, selecting the corresponding position, in the previous reference frame, of the covered coded macro block with the largest residual DCT direct current coefficient as the matching position of the current coded macro block in the previous reference frame, and continuing with the following step B2;
B1222. otherwise, selecting the corresponding position, in the previous reference frame, of the covered coded macro block with the smallest residual DCT direct current coefficient as the matching position of the current coded macro block in the previous reference frame, and continuing with the following step B2;
B2. determining whether the reference frame of the current coded macro block is also discarded, and performing the following operations:
B21. if not discarded, directly updating the currently obtained motion vector;
B22. otherwise, searching forward until the reference frame of the current coded macro block is not a discarded frame, and obtaining the motion vector of the current coded macro block.
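By way of illustration only, the selection recited in steps B1 to B1222 may be sketched in Python as follows. The `Covered` record, its field names and the function name are hypothetical, and the sketch assumes that "not equal" means that the four motion vectors are pairwise different; the claim language does not settle that point.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class Covered:
    """One of the four covered macro blocks (hypothetical record)."""
    area: int              # pixels overlapped by the matching position
    intra: bool            # True if intra coded
    mv: Tuple[int, int]    # motion vector into the previous reference frame
    residual_dc: float     # residual DCT direct current coefficient


def select_four_block(blocks: Sequence[Covered],
                      threshold: int) -> Optional[Covered]:
    """Return the macro block whose matching position is inherited.

    None means the current macro block is intra coded and the process ends.
    """
    dominant = max(blocks, key=lambda b: b.area)
    if dominant.area > threshold:                      # B11
        return None if dominant.intra else dominant
    if any(b.intra for b in blocks):                   # B121
        return None
    # Assumption: "not equal" is read as "all four vectors pairwise different".
    all_differ = len({b.mv for b in blocks}) == len(blocks)
    if all_differ:                                     # B1221
        return max(blocks, key=lambda b: b.residual_dc)
    return min(blocks, key=lambda b: b.residual_dc)    # B1222
```

The macro block returned here supplies the matching position that step B2 then checks against the set of discarded frames.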
5. The method of temporal resolution transform-based video conversion processing according to claim 4, wherein: the step B also comprises the following steps:
BA. If the number of covered coded macro blocks is two, judging whether a covered part larger than a pixel point threshold value T exists, and performing the following operations:
BA1, if the covered part larger than the pixel point threshold value T exists, if the covered coding macro block is coded by adopting an interframe coding mode, adopting the matching position of the covered coding macro block in the previous reference frame as the matching position of the current coding macro block in the previous reference frame, continuing the step BB, if the covered coding macro block is coded by adopting an intraframe coding mode, coding the current coding macro block by adopting the intraframe coding mode, and ending the process;
BA2, otherwise, judging whether there is covered part coded by adopting intra-frame coding mode, and carrying out the following operations:
BA21, if there is a covered part coded by adopting the intra-frame coding mode, the current macro block is also coded by adopting the intra-frame coding mode, and the process is ended;
BA22, otherwise, comparing the sum of the dc coefficients of two 8 × 8 blocks with larger coverage area in the two macroblocks, selecting the matching position of the larger one of the two covered coded macroblocks in the previous reference frame as the matching position of the current coded macroblock in the previous reference frame, and continuing with the following step BB;
BB. Judging whether the reference frame of the current coding macro block is also discarded or not, and carrying out the following operations:
BB1, if not discarded, directly updating the currently obtained motion vector;
BB2, otherwise, searches forward until the reference frame of the current coded macroblock is not a dropped frame, obtaining the motion vector of the current coded macroblock.
6. The method of temporal resolution transform-based video conversion processing according to claim 5, wherein: the step BB is followed by the step BC:
BC. Further selecting a search window to update the motion vector, said search window being updated according to the following equation:
$SW = \pm\min(\mathrm{defaultSW},\ \alpha \cdot \beta^{j-1})$
wherein, SW is the size of a search window required when the current motion vector is updated, defaultSW is the default search window size in the full search block matching calculation, j is the number of dropped frames between the reference frames of the current frame and the non-dropped frames, α is 1, and β is 1.25.
7. The method of temporal resolution transform-based video conversion processing according to claim 4, wherein: the step B also comprises the following steps:
ba. If one coded macro block is covered and that macro block is coded by adopting an interframe coding mode, taking the matching position of the macro block in the previous reference frame as the matching position of the current macro block in the previous reference frame, and turning to step bb;
bb. Judging whether the reference frame of the current coding macro block is also discarded or not, and carrying out the following operations:
bb1, if not discarded, directly updating the currently obtained motion vector;
bb2, otherwise, searching forward until the reference frame of the current coding macro block is not the discarding frame, and obtaining the motion vector of the current coding macro block.
8. The method for video conversion processing based on temporal resolution transform according to any one of claims 1 to 3, wherein: the step B also comprises the following step C:
C. and further selecting a search window to update according to the motion vector condition and the frame loss number of the four covered macro blocks.
9. The method of temporal resolution transform-based video conversion processing according to claim 8, wherein: the step C comprises the following steps:
c1, determining whether the motion vectors of the four covered coding macro blocks are equal, and performing the following operations:
c11, if the motion vectors are equal, the motion vectors do not need to be updated;
c12, otherwise, judging whether the motion vectors of the four covered coding macro blocks are not equal, and performing the following operations:
c121, if the search window is not equal, selecting a larger search window to update the search window, wherein the search window is updated according to the following formula:
$SW = \pm\min(\mathrm{defaultSW},\ \alpha \cdot \beta^{j-1})$
wherein, SW is the size of a search window required when the current motion vector is updated, defaultSW is the default size of the search window in the full search block matching calculation, j is the number of lost frames between the current frame and a reference frame of a non-lost frame, α is 2, and β is 1.5;
c122, otherwise, selecting a larger search window to update the search window, wherein the search window is updated according to the following formula:
$SW = \pm\min(\mathrm{defaultSW},\ \alpha \cdot \beta^{j-1})$
wherein, SW is the size of a search window required when the current motion vector is updated, defaultSW is the default search window size in the full search block matching calculation, j is the number of dropped frames between the reference frames of the current frame and the non-dropped frames, α is 1, and β is 1.25.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN200610061250A CN100584006C (en) | 2006-06-20 | 2006-06-20 | Video conversion process method based on transform of time resolution |
Publications (2)
Publication Number | Publication Date |
---|---|
CN101094398A (en) | 2007-12-26 |
CN100584006C (en) | 2010-01-20 |
Family
ID=38992373
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN200610061250A Expired - Fee Related CN100584006C (en) | 2006-06-20 | 2006-06-20 | Video conversion process method based on transform of time resolution |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN100584006C (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2016202189A1 (en) * | 2015-06-14 | 2016-12-22 | 同济大学 | Image coding and decoding methods, image processing device, and computer storage medium |
CN106576170A (en) * | 2014-08-01 | 2017-04-19 | Ati科技无限责任公司 | Adaptive search window positioning for video encoding |
CN107027029A (en) * | 2017-03-01 | 2017-08-08 | 四川大学 | High-performance video coding improved method based on frame rate conversion |
CN107734342A (en) * | 2012-01-19 | 2018-02-23 | 佳能株式会社 | The method for coding and decoding the validity mapping of the residual error coefficient of change of scale |
US11159818B2 (en) | 2015-06-14 | 2021-10-26 | Zte Corporation | Image coding and decoding methods, image processing device and computer storage medium |
CN114827662A (en) * | 2022-03-18 | 2022-07-29 | 百果园技术(新加坡)有限公司 | Video resolution self-adaptive adjusting method, device, equipment and storage medium |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR100530223B1 (en) * | 2003-05-13 | 2005-11-22 | 삼성전자주식회사 | Frame interpolation method and apparatus at frame rate conversion |
WO2006007527A2 (en) * | 2004-07-01 | 2006-01-19 | Qualcomm Incorporated | Method and apparatus for using frame rate up conversion techniques in scalable video coding |
KR100644629B1 (en) * | 2004-09-18 | 2006-11-10 | 삼성전자주식회사 | Method for estimating motion based on hybrid search block matching algorithm and frame-rate converter using thereof |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10516887B2 (en) | 2012-01-19 | 2019-12-24 | Canon Kabushiki Kaisha | Method, apparatus and system for encoding and decoding the significance map for residual coefficients of a transform unit |
US10531100B2 (en) | 2012-01-19 | 2020-01-07 | Canon Kabushiki Kaisha | Method, apparatus and system for encoding and decoding the significance map for residual coefficients of a transform unit |
CN107734343B (en) * | 2012-01-19 | 2020-03-03 | 佳能株式会社 | Method and apparatus for encoding and decoding image frame data and computer readable medium |
CN107734342A (en) * | 2012-01-19 | 2018-02-23 | 佳能株式会社 | The method for coding and decoding the validity mapping of the residual error coefficient of change of scale |
CN107734343A (en) * | 2012-01-19 | 2018-02-23 | 佳能株式会社 | The method for coding and decoding the validity mapping of the residual error coefficient of change of scale |
CN107734342B (en) * | 2012-01-19 | 2020-01-21 | 佳能株式会社 | Method for coding and decoding a significance map of residual coefficients of a transform unit |
US10531101B2 (en) | 2012-01-19 | 2020-01-07 | Canon Kabushiki Kaisha | Method, apparatus and system for encoding and decoding the significance map for residual coefficients of a transform unit |
CN106576170A (en) * | 2014-08-01 | 2017-04-19 | Ati科技无限责任公司 | Adaptive search window positioning for video encoding |
CN106576170B (en) * | 2014-08-01 | 2019-06-21 | Ati科技无限责任公司 | The method and system that adaptive search-window for Video coding positions |
WO2016202189A1 (en) * | 2015-06-14 | 2016-12-22 | 同济大学 | Image coding and decoding methods, image processing device, and computer storage medium |
US11159818B2 (en) | 2015-06-14 | 2021-10-26 | Zte Corporation | Image coding and decoding methods, image processing device and computer storage medium |
US11653019B2 (en) | 2015-06-14 | 2023-05-16 | Zte Corporation | Image coding and decoding methods, image processing device and computer storage medium |
CN107027029B (en) * | 2017-03-01 | 2020-01-10 | 四川大学 | High-performance video coding improvement method based on frame rate conversion |
CN107027029A (en) * | 2017-03-01 | 2017-08-08 | 四川大学 | High-performance video coding improved method based on frame rate conversion |
CN114827662A (en) * | 2022-03-18 | 2022-07-29 | 百果园技术(新加坡)有限公司 | Video resolution self-adaptive adjusting method, device, equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN100584006C (en) | 2010-01-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP5619688B2 (en) | Hierarchical video coding with two-layer coding and single-layer decoding | |
US8073048B2 (en) | Method and apparatus for minimizing number of reference pictures used for inter-coding | |
KR100593350B1 (en) | Image encoding device, image encoding method, image decoding device, image decoding method, and communication device | |
US20050114093A1 (en) | Method and apparatus for motion estimation using variable block size of hierarchy structure | |
US8369408B2 (en) | Method of fast mode decision of enhancement layer using rate-distortion cost in scalable video coding (SVC) encoder and apparatus thereof | |
JP2008533850A5 (en) | ||
KR100977691B1 (en) | Method and apparatus for progressive channel switching | |
US7245662B2 (en) | DCT-based scalable video compression | |
US20050129125A1 (en) | Method and apparatus for pitcure compression using variable block of arbitrary size | |
KR20050078099A (en) | Video coding apparatus and method for inserting key frame adaptively | |
MXPA06002525A (en) | Coding and decoding for interlaced video. | |
JPWO2003003749A1 (en) | Image encoding device, image decoding device, image encoding method, and image decoding method | |
KR20050105271A (en) | Video encoding | |
CN100584006C (en) | Video conversion process method based on transform of time resolution | |
KR100597397B1 (en) | Method For Encording Moving Picture Using Fast Motion Estimation Algorithm, And Apparatus For The Same | |
KR100905059B1 (en) | The method and apparatus for block mode decision using predicted bit generation possibility in video coding | |
JPH08265765A (en) | Image coding system and motion compensating device for use therein | |
CN102484717A (en) | Video encoding device, video encoding method and video encoding program | |
KR100870554B1 (en) | Motion compensated temporal filtering method for efficient wavelet-based scalable video coding and record-medium for executing method thereof | |
WO2015015404A2 (en) | A method and system for determining intra mode decision in h.264 video coding | |
KR101337410B1 (en) | bit-rate control method for a macro-block | |
CN101184225A (en) | Spatial resolution transformation based video switch encoding method | |
KR20040039809A (en) | Moving picture encoder and method for coding using the same | |
JPH11205801A (en) | Dynamic image coder and its coding selection method | |
KR101072459B1 (en) | Method for Mode decision on combined scalability |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20100120 Termination date: 20150620 |
EXPY | Termination of patent right or utility model | ||