CN101651831B - Method and apparatus for improved coding mode selection - Google Patents


Info

Publication number
CN101651831B
Authority
CN
China
Prior art keywords
coding mode
values
coding
value
lagrangian
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN 200910152180
Other languages
Chinese (zh)
Other versions
CN101651831A (en)
Inventor
A. Dumitras
B. G. Haskell
A. Puri
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Apple Inc
Original Assignee
Apple Computer Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US10/614,929 (US7194035B2)
Application filed by Apple Computer Inc
Publication of CN101651831A
Application granted
Publication of CN101651831B
Anticipated expiration
Expired - Fee Related

Landscapes

  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

Some embodiments provide a method of performing mode selection in a video compression and encoding system. The method encodes with several encoding modes from a set of encoding modes. For each of these encoding modes, the method computes a distortion value, a bit rate value, and a Lagrangian value, the last of which is computed from the distortion value, the bit rate value, and a Lagrangian multiplier. The method then selects an encoding mode based on the Lagrangian values. In some embodiments, computing the distortion value includes using a function that reduces the effects of outliers. In some embodiments, the Lagrangian multiplier is a slowly varying Lagrangian multiplier that varies at a slower rate than a varying reference Lagrangian multiplier for a reference encoding mode. In yet other embodiments, the method clusters the Lagrangian values.

Description

Method and apparatus for improved coding mode selection
This application is a divisional of invention patent application No. 200480002031.5, filed January 7, 2004, entitled "Method and apparatus for improved coding mode selection".
Related applications
This patent application claims priority to U.S. Provisional Patent Application Serial No. 60/439,062, entitled "Method and apparatus for improved coding mode selection", filed January 8, 2003.
Technical field
The present invention relates to the field of multimedia compression systems. In particular, the present invention discloses methods and systems for improving coding mode selection.
Background
Digital electronic media formats are finally on the verge of largely replacing analog electronic media formats. Digital compact discs (CDs) replaced analog vinyl records long ago. Analog magnetic tapes are becoming increasingly rare. Second and third generation digital audio systems, such as MiniDisc and MP3 (MPEG Audio Layer 3), are now taking market share from the first generation CD digital audio format.
However, video media has been slower than audio to move to digital storage and transmission formats. This is largely due to the massive amount of digital information required to represent video accurately in digital form, which in turn requires very high-capacity digital storage systems and high-bandwidth transmission systems.
Video is nevertheless now moving rapidly to digital storage and transmission formats. Faster computer processors, high-density storage systems, and new efficient compression and encoding algorithms have finally made digital video practical at consumer price points. Within a few years, the DVD (Digital Versatile Disc), a digital video system, has become one of the fastest selling consumer electronics products. Because of its high video quality, high audio quality, convenience, and other features, the DVD has rapidly replaced the videocassette recorder (VCR) as the pre-recorded video playback system of choice. The antiquated analog NTSC (National Television Standards Committee) video transmission standard is finally being replaced by the digital ATSC (Advanced Television Standards Committee) video transmission system.
Computer systems have used various digital video encoding formats for many years. Among the best digital video compression and encoding systems used by computer systems are those backed by the Moving Picture Experts Group, commonly known by the acronym MPEG. The three best known and most widely used MPEG video formats are MPEG-1, MPEG-2, and MPEG-4. Video CDs and consumer-grade digital video editing systems use the early MPEG-1 format. Digital Versatile Discs (DVDs) and the Dish Network brand Direct Broadcast Satellite (DBS) television broadcast system use the MPEG-2 digital video compression and encoding system. The MPEG-4 encoding system is being rapidly adopted by the latest computer-based digital video encoders and the associated digital video players.
Summary of the invention
The present invention discloses methods and systems for improving coding mode selection. In this disclosure, a novel method is disclosed for direct mode enhancement in B-pictures and skip mode enhancement in P-pictures in the framework of H.264 (MPEG-4/Part 10).
Direct mode and skip mode enhancement are achieved by making a number of changes to existing compression systems. Specifically, the system of the present invention introduces the steps of removing outliers in the distortion values, specifying smaller values for the Lagrangian multiplier in the rate-distortion optimization for coding mode selection, and clustering the Lagrangian values before the coding mode is selected. In one embodiment, a Huber cost function is used to compute the distortion for the different coding modes in order to remove outliers. In one embodiment of the present invention, the system varies the Lagrangian multiplier as a function of the quantizer value Q more slowly than the reference H.264 (MPEG-4/Part 10) implementation does. The Lagrangian clustering favors coding mode 0, which supports bit rate reduction.
Experimental results using high quality video sequences show that the method of the present invention achieves a bit rate reduction at the cost of a small loss in peak signal-to-noise ratio (PSNR). Two different experiments verified that no subjective visual loss occurs despite the change in PSNR.
Compared with the existing rate-distortion optimization methods currently employed in the (non-normative) MPEG-4/Part 10 encoder, the method of the present invention is a simple and useful add-on. More importantly, when other options, such as further increasing the value of the quantization parameter, are unavailable because they would introduce unacceptable artifacts into the decoded pictures, the method of the present invention achieves a bit rate reduction without introducing visible distortion into the decoded sequences.
Other objects, features, and advantages of the present invention will be apparent from the accompanying drawings and from the following detailed description.
Brief description of the drawings
The objects, features, and advantages of the present invention will be apparent to one skilled in the art in view of the following detailed description, in which:
Fig. 1 graphically illustrates the Huber cost function of a variable r.
Fig. 2A illustrates the variation of the original and modified Lagrangian multiplier λ_mode as a function of the quantization parameter (Q) values in the range of interest.
Fig. 2B illustrates the variation of the original and modified Lagrangian multiplier λ_mode for B-frames as a function of the quantization parameter (Q) values in the range of interest.
Fig. 2C illustrates the variation of the original and modified Lagrangian multiplier λ_motion as a function of the quantization parameter (Q) values in the range of interest.
Fig. 3 is a flow chart illustrating how a coding mode is selected.
Detailed description
Methods and systems for improving coding mode selection are disclosed. In the following description, specific nomenclature is set forth for purposes of explanation to provide a thorough understanding of the present invention. However, it will be apparent to one skilled in the art that these specific details are not required in order to practice the present invention.
The emerging H.264 video encoding standard, also known as MPEG-4/Part 10, Joint Video Team (JVT), Advanced Video Coding (AVC), and H.26L, has been developed jointly by the Moving Picture Experts Group (MPEG) and the International Telecommunication Union (ITU) with the goal of providing higher compression of moving pictures than state-of-the-art video coding systems that comply with existing MPEG standards. Target applications of H.264, which is expected to become an international standard in 2003, include (but are not limited to) video conferencing, digital storage media, television broadcasting, Internet streaming, and communication.
Like other video encoding standards (in their main body or annexes), the H.264 standard employs a rate-distortion (RD) decision framework. In particular, the H.264 standard employs rate-distortion optimization for coding mode selection and motion estimation. In this disclosure, the primary focus is on coding mode selection within the framework of the H.264 standard.
In most video coding systems, each video frame of a video sequence is divided into subsets of pixels, called pixelblocks. In the H.264 standard, the pixelblocks may have various sizes (a pixelblock of 16 × 16 pixels is commonly referred to as a macroblock). The coding mode selection problem may be informally defined as "select the best of all possible coding methods (or coding modes) to encode each pixelblock in the video frame". A video encoder may solve the coding mode selection problem in a number of different ways. One possible method of solving it is to employ rate-distortion optimization.
There are a number of different coding modes that may be selected to encode each pixelblock in the framework of the H.264 video encoding standard. Mode 0 is known as "direct mode" in B-frames and as "skip mode" in P-frames. Other coding modes employ pixelblocks of 16 × 16, 16 × 8, and 8 × 16 pixels, as well as 8 × 8, 8 × 4, 4 × 8, and 4 × 4 pixels, in B-frames or P-frames.
In direct mode (mode 0 in B-pictures), no motion information is transmitted to the decoder. Instead, a predictive system is used to generate the motion information. Therefore, direct mode can provide significant bit rate savings for sequences that allow good motion vector prediction using neighboring spatial or temporal information. However, experimental evaluation shows that direct mode selection in H.264 does not yield as many selected pixelblocks as expected for some video sequences.
This disclosure proposes a method for enhancing direct mode (mode 0) selection in bidirectionally predicted pictures (known as B-pictures or B-frames) within the framework of the H.264 standard. When applied to P-frames, the encoding method of the present invention yields an enhancement of skip mode (also mode 0) selection. The direct mode and skip mode enhancements are achieved by clustering the Lagrangian values, removing outliers, and specifying smaller values for the Lagrangian multiplier in the rate-distortion optimization for coding mode selection.
Experimental results using high quality sample video sequences show that the bit rate of bitstreams compressed with the present invention is reduced compared with that of bitstreams compressed with the reference H.264 codec. The bit rate reduction is associated with a slight loss in the peak signal-to-noise ratio (PSNR) of the bitstream. However, two experiments verified that no subjective visual loss is associated with the change in PSNR. More importantly, when other possible solutions, such as further increasing the value of the quantization parameter, are not applicable because they would introduce unacceptable artifacts into the decoded pictures, the method of the present invention achieves a further significant bit rate reduction without introducing visible distortion into the decoded video sequences. Furthermore, although the present invention is described within the H.264 framework, the encoding method of the present invention is applicable to any video encoding system that employs rate-distortion optimization.
The remainder of this document is organized as follows. The video compression overview section first describes the basic concepts related to the rate-distortion optimization framework in the H.264 standard. The proposed direct mode enhancement method section then describes the proposed encoding method in detail. Finally, a set of experimental results and conclusions are provided in the experimental results section and the conclusion section, respectively.
Video compression overview
As set forth earlier in this document, each video frame is divided into a set of pixelblocks in the H.264 standard. These pixelblocks may be encoded using motion-compensated predictive coding. A predicted pixelblock may be an Intra (I) pixelblock (I-pixelblock) that uses no information from preceding pictures in its coding, a unidirectionally Predicted (P) pixelblock (P-pixelblock) that uses information from one preceding picture, or a Bidirectionally predicted (B) pixelblock (B-pixelblock) that uses information from one preceding picture and one following picture.
For each P-pixelblock in a P-picture, one motion vector is computed. (Note that, within each video picture, the pixelblocks may be encoded in many ways. For example, a pixelblock may be divided into smaller sub-blocks, with motion vectors computed and transmitted for each sub-block. The shape of the sub-blocks may vary and need not be square.) Using the computed motion vector, a predicted pixelblock can be formed by translating pixels in the aforementioned preceding picture. The difference between the actual pixelblock in the video picture and the predicted pixelblock is then encoded for transmission. (This difference is used to correct the minor differences between the predicted pixelblock and the actual pixelblock.)
Each motion vector may also be transmitted via predictive coding. That is, a prediction of the motion vector is formed using nearby motion vectors that have already been transmitted, and the difference between the actual motion vector and the predicted motion vector is then encoded for transmission.
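The predictive coding of motion vectors described above can be sketched as follows. This is a minimal illustration, not the encoder's actual procedure: the text only says that the prediction is formed from nearby motion vectors that have already been transmitted, so the component-wise median of three neighbors used here, and all function names, are assumptions made for the example.

```python
from statistics import median

def predict_motion_vector(neighbors):
    """Predict a motion vector from already-transmitted neighbor vectors.

    Assumption: component-wise median of the neighbors (illustrative choice).
    """
    xs = [mv[0] for mv in neighbors]
    ys = [mv[1] for mv in neighbors]
    return (median(xs), median(ys))

def motion_vector_residual(actual, neighbors):
    """Encoder side: only the difference from the prediction is transmitted."""
    px, py = predict_motion_vector(neighbors)
    return (actual[0] - px, actual[1] - py)

def reconstruct_motion_vector(residual, neighbors):
    """Decoder side: the same prediction plus the received difference."""
    px, py = predict_motion_vector(neighbors)
    return (residual[0] + px, residual[1] + py)

# Hypothetical neighbor vectors and actual vector for one pixelblock.
neighbors = [(4, -2), (6, 0), (5, 3)]
actual = (7, 1)
residual = motion_vector_residual(actual, neighbors)
decoded = reconstruct_motion_vector(residual, neighbors)
```

Because the encoder and decoder form the same prediction from the same transmitted neighbors, the decoder recovers the actual vector exactly from the residual.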
For each B-pixelblock, two motion vectors are typically computed: one for the aforementioned preceding picture and one for the following picture. (Note that within a P-picture or B-picture, some pixelblocks may be better encoded without motion compensation. Such pixelblocks may be encoded as intra-pixelblocks. Within a B-picture, some pixelblocks may be better encoded using forward or backward unidirectional motion compensation. Such pixelblocks may be encoded as forward predicted or backward predicted, depending on whether a preceding picture or a following picture is used in the prediction.) Two predicted pixelblocks are computed from the two B-pixelblock motion vectors. The two predicted pixelblocks are then combined to form a final predicted pixelblock. As above, the difference between the actual pixelblock in the video picture and the predicted pixelblock is then encoded for transmission.
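The combination of the two predicted pixelblocks can be illustrated with a short sketch. The text does not say how the two predictions are combined, so the rounded average used here is an assumption, as are the function names and the flat-list representation of a pixelblock.

```python
def combine_predictions(forward, backward):
    """Combine forward and backward predictions sample by sample.

    Assumption: rounded integer average of the two predictions.
    """
    return [(f + b + 1) // 2 for f, b in zip(forward, backward)]

def prediction_residual(actual, predicted):
    """Difference between the actual and the predicted pixelblock."""
    return [a - p for a, p in zip(actual, predicted)]

# Hypothetical sample values for a three-pixel fragment of a pixelblock.
forward = [100, 102, 98]    # prediction from the preceding picture
backward = [104, 100, 99]   # prediction from the following picture
actual = [103, 101, 97]

final_prediction = combine_predictions(forward, backward)
residual = prediction_residual(actual, final_prediction)
```

Only the residual would then be transformed, quantized, and entropy coded for transmission.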
As with P-pixelblocks, each motion vector of a B-pixelblock may be transmitted via predictive coding. That is, a predicted motion vector is formed using nearby motion vectors that have already been transmitted. The difference between the actual motion vector and the predicted motion vector is then encoded for transmission.
With B-pixelblocks, however, the opportunity also exists for interpolating the motion vectors from the motion vectors in the co-located or nearby pixelblocks of the stored pictures. (When the motion vector prediction is constructed using the motion vectors of the pixelblock co-located with the current pixelblock, the direct mode type is known as temporal direct mode. When the motion vector prediction is constructed using the spatial neighbors of the current pixelblock, the direct mode type is known as spatial direct mode.) The interpolated value may then be used as the predicted motion vector, and the difference between the actual motion vector and the predicted motion vector is encoded for transmission. Such interpolation is carried out in both the encoder and the decoder. (Note that an encoder always contains a decoder, so the encoder knows exactly how the reconstructed video picture will appear.)
In some cases, the interpolated motion vector is good enough to be used without any difference correction, in which case no motion vector data needs to be transmitted. This is referred to as direct mode in the H.264 (and H.263) standards. Direct mode selection is particularly effective when the recording camera is slowly panning across a stationary background. In fact, the motion vector interpolation may be good enough to be used as is, which means that no differential information needs to be transmitted for these B-pixelblock motion vectors. In skip mode (mode 0 in P-pictures), the motion vector prediction is constructed in the same way as in 16 × 16 direct mode, so that no motion vector bits are transmitted.
Prior to transmission, the prediction error (difference) of a pixelblock or sub-block is typically transformed, quantized, and entropy coded to reduce the number of bits. In direct mode, the prediction error, computed as the mean square error between the original pixelblock and the decoded pixelblock predicted after encoding with direct mode, is encoded. In skip mode, however, the prediction error is not encoded or transmitted. The size and shape of the sub-blocks used for the transform may differ from the size and shape of the sub-blocks used for motion compensation. For example, 8 × 8 pixels or 4 × 4 pixels are commonly used for the transform, whereas 16 × 16 pixels, 16 × 8 pixels, 8 × 16 pixels, or smaller sizes are commonly used for motion compensation. The motion compensation and transform sub-block sizes and shapes may vary from pixelblock to pixelblock.
The selection of the best coding mode for encoding each pixelblock is one of the decisions in the H.264 standard that most directly affects the bit rate R of the compressed bitstream and the distortion D in the decoded video sequence. The goal of coding mode selection is to select the coding mode M* that minimizes the distortion D(p) subject to the bit rate constraint R(p) ≤ R_max, where p is the vector of adjustable coding parameters and R_max is the maximum allowed bit rate. This constrained optimization problem can be converted into an unconstrained optimization problem using the Lagrangian equation J(p, λ), given by:
$$J(p, \lambda) = D(p) + \lambda R(p) \qquad (1)$$
where λ is the Lagrangian multiplier, which controls the rate-distortion tradeoff. The coding mode decision problem then becomes the minimization of J(p, λ), which can be expressed as:
$$\min_{\text{all } p} \{ D(p) + \lambda R(p) \} \qquad (2)$$
The Lagrangian equation above can be evaluated by performing the following steps for each admissible coding mode:
(a) compute the distortion D as the L2 norm of the error between the original pixelblock and the reconstructed pixelblock after encoding and decoding with the specific coding mode;
(b) compute the bit rate R as the total number of bits needed to encode the motion vectors and the transform coefficients;
(c) compute the Lagrangian value J using equation (1).
Finally, after the Lagrangian value J has been computed for all coding modes, the minimum Lagrangian value J obtained identifies the coding mode M* that solves the minimization expressed by equation (2).
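The evaluation steps (a) to (c) and the final minimization can be sketched as follows. This is a minimal illustration under stated assumptions: the distortion and bit counts are hypothetical numbers standing in for the results of actually encoding and decoding a pixelblock, and the names ModeResult, lagrangian, and select_min_cost_mode are invented for the example.

```python
from dataclasses import dataclass

@dataclass
class ModeResult:
    mode: int          # coding mode index (0 is direct/skip mode)
    distortion: float  # D, measured after encoding and decoding with this mode
    bits: float        # R, bits for motion vectors and transform coefficients

def lagrangian(distortion, bits, lam):
    """Equation (1): J = D + lambda * R."""
    return distortion + lam * bits

def select_min_cost_mode(results, lam):
    """Return (mode, J) for the smallest Lagrangian value, as in equation (2)."""
    costs = {r.mode: lagrangian(r.distortion, r.bits, lam) for r in results}
    best = min(costs, key=costs.get)
    return best, costs[best]

# Hypothetical measurements for three candidate coding modes.
candidates = [
    ModeResult(mode=0, distortion=120.0, bits=2.0),
    ModeResult(mode=1, distortion=90.0, bits=20.0),
    ModeResult(mode=2, distortion=70.0, bits=45.0),
]
best_mode, best_cost = select_min_cost_mode(candidates, lam=1.0)
```

With these numbers a larger multiplier shifts the choice toward the cheapest mode in bits, which is the rate-distortion tradeoff that λ controls.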
Note that in the H.264 video compression standard, the coding mode decision is performed for 8 × 8 and smaller pixelblocks before the coding mode of larger pixelblocks is decided. Note also that, in an effort to reduce the complexity of the optimization process, the minimization is performed with a fixed quantizer value Q, and the Lagrangian multiplier is often selected to be equal to (for example) 0.85 × Q/2 or 0.85 × 2^(Q/3), where Q is the quantization parameter. For multiple B-pictures, larger values are often selected. Of course, this complexity reduction also restricts the search for the minimum of the Lagrangian J in the rate-distortion plane.
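The two example multiplier choices mentioned above can be written out directly. This sketch assumes the second expression reads 0.85 × 2^(Q/3), which is how the flattened exponent in the text is interpreted here; the function names are illustrative.

```python
def lambda_linear(q):
    """Example multiplier 0.85 * Q / 2 for quantization parameter Q."""
    return 0.85 * q / 2

def lambda_exponential(q):
    """Example multiplier 0.85 * 2**(Q/3) (assumed reading of the exponent)."""
    return 0.85 * 2 ** (q / 3)

# Tabulate both choices for a few hypothetical quantization parameter values.
table = [(q, lambda_linear(q), lambda_exponential(q)) for q in (12, 18, 24, 30)]
```

The exponential form doubles every three steps of Q, so it grows far faster than the linear form over the same range.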
Proposed direct mode enhancement method
The system of the present invention proposes a method for enhancing direct mode selection in B-frames and skip mode selection in P-frames. The system employs clustering of cost values, outlier reduction, and a specification of the Lagrangian multiplier. In one embodiment, the system performs the method in four steps. The following text describes these method steps in detail with reference to Fig. 3.
First, the current pixelblock is encoded and decoded with each possible coding mode M, and the distortion D_M is computed, as set forth in steps 310 and 320. In one embodiment, the distortion D_M is computed as the sum of the Huber function values of the errors between the pixels of the original pixelblock and the pixels of the decoded pixelblock. The Huber function, illustrated in Fig. 1, is given by the following equation:
$$D_M(\chi) = \begin{cases} \frac{1}{2}\chi^{2}, & |\chi| \le \beta \\ \beta|\chi| - \frac{1}{2}\beta^{2}, & |\chi| > \beta \end{cases}$$
where χ is the error for one pixel of the pixelblock and β is a parameter. Clearly, for error values whose magnitude does not exceed β, the value of the Huber function equals the squared error term ½χ². For error values whose magnitude exceeds β, the value of the Huber function is smaller than the squared error term for the same error value.
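The Huber-based distortion can be sketched as follows. The piecewise function follows the equation above; the flat-list pixelblock representation, the sample values, and the function names are assumptions made for the example.

```python
def huber(x, beta):
    """Huber function: quadratic for |x| <= beta, linear beyond beta."""
    ax = abs(x)
    if ax <= beta:
        return 0.5 * x * x
    return beta * ax - 0.5 * beta * beta

def huber_distortion(original, decoded, beta):
    """Sum of Huber values of the per-pixel errors of a pixelblock."""
    return sum(huber(o - d, beta) for o, d in zip(original, decoded))

# Hypothetical pixel values; the third pixel is an outlier with error -10.
original = [10, 20, 30]
decoded = [11, 20, 40]
distortion = huber_distortion(original, decoded, beta=4)
```

For the outlier the Huber value is 32, compared with 50 for the half squared error, which is how the function limits the influence of large errors on the mode decision.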
Second, the bit rate R of each coding mode is computed, as set forth in step 330. In one embodiment, the system computes the bit rate R as the total number of bits needed to encode the motion vectors and the transform coefficients of the pixelblock.
Third, the system of the present invention computes the Lagrangian value of each coding mode using equation (1), as set forth in step 340. In one embodiment, the system selects a value of the Lagrangian multiplier λ that varies, as a function of the quantization parameter, more slowly than the original Lagrangian multiplier λ proposed in the non-normative part of version 4.1 of the H.264 standard. The proposed variation of the Lagrangian multiplier λ as a function of the quantizer Q is illustrated in Figs. 2A, 2B, and 2C. By making the Lagrangian multiplier λ vary more slowly than the λ of the reference implementation, the system of the present invention places less emphasis on the bit rate component R of the Lagrangian equation (1) and therefore more emphasis on the distortion component D. As a result of this change to the Lagrangian multiplier λ, small increases in the bit rate R have a smaller effect on the resulting Lagrangian value J. (This also reduces the influence of the bit rate R on the Lagrangian clustering described in the following paragraph.)
Fourth, let J* be the minimum of all the values J_M (computed using equation (1)), where M is one of the possible coding modes. Instead of selecting the coding mode M* that yields J*, the system clusters the computed Lagrangian values J_M as follows. Let S be the set of coding modes k whose computed Lagrangian values satisfy the condition:
$$S = \{\, k \mid J^{*} / J_{k} \ge \epsilon \,\} \qquad (3)$$
where epsilon (ε) is a selected error value and J* is the minimum J over all modes. If coding mode 0 is an element of the set S, the system selects coding mode 0 as the coding mode used to encode the pixelblock; otherwise, the system selects the coding mode M* corresponding to J* (the coding mode that yields the minimum J value).
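The clustering step can be sketched as follows. This is an illustration under stated assumptions: the set condition is taken to be the ratio of the minimum Lagrangian value to each mode's value compared against ε, the Lagrangian values are assumed non-negative, and the function name and the sample costs are invented for the example.

```python
def select_mode_with_clustering(costs, epsilon):
    """Select a coding mode from a dict mapping mode -> Lagrangian value J.

    Modes whose ratio J*/J_k is at least epsilon form the set S. Mode 0 is
    chosen whenever it belongs to S; otherwise the minimum-cost mode is chosen.
    Assumes non-negative Lagrangian values.
    """
    m_star = min(costs, key=costs.get)
    j_star = costs[m_star]
    # The equality check keeps the minimum itself in S and avoids dividing by 0.
    cluster = {k for k, j in costs.items() if j == j_star or j_star / j >= epsilon}
    return 0 if 0 in cluster else m_star

# Hypothetical Lagrangian values for three coding modes.
costs = {0: 122.0, 1: 110.0, 2: 115.0}
chosen_loose = select_mode_with_clustering(costs, epsilon=0.90)
chosen_strict = select_mode_with_clustering(costs, epsilon=0.95)
```

Here mode 0 is about 10% more expensive than the minimum, so it is selected when ε is 0.90 but not when ε is 0.95. A smaller ε therefore widens the cluster and favors mode 0 more strongly.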
The steps above employ components that are novel compared with the reference (non-normative) H.264 encoder. Specifically, the present invention computes the distortion with the Huber cost function, modifies the Lagrangian multiplier, and clusters the Lagrangian values.
The Huber cost function belongs to the class of robust M-estimators. An important property of these functions is their ability to reduce the influence of outliers. More specifically, if any outliers are present in a pixelblock, the Huber cost function weights them less (linearly) than the mean square error function does (quadratically), which in turn allows the coding mode selected for the pixelblock to be the same as that of the neighboring macroblocks.
The modified Lagrangian multiplier λ varies more slowly as a function of the quantization parameter Q, so that more emphasis is placed on the distortion component D of the Lagrangian value J than on the bit rate component R. (In this document, "lambda" or "λ" denotes the Lagrangian multiplier used in the coding mode decision process. The multiplier used in the motion vector selection process is different.)
Finally, the clustering of the Lagrangian values described earlier favors coding mode 0. Therefore, the system of the present invention allows more pixelblocks to be encoded using direct mode for B-pixelblocks and skip mode for P-pixelblocks.
Experimental results
The video test set used in the experiments consists of nine color video clips taken from the movies "Discovering Egypt", "Gone with the Wind", and "The English Patient". The specific characteristics of these video sequences are listed in Table 1.
Table 1: Test sequences
(The abbreviations ch. and og stand for chapter and opposite glance, respectively.)
No. | Video sequence name         | Frame size | Number of frames | Type
1   | Discovering Egypt, ch. 1    | 704 × 464  | 58               | Pan
2   | Gone with the Wind, ch. 11  | 720 × 480  | 44               | og
3   | Discovering Egypt, ch. 1    | 704 × 464  | 630              | Pan
4   | Discovering Egypt, ch. 2    | 704 × 464  | 148              | Zoom
5   | Discovering Egypt, ch. 3    | 704 × 464  | 196              | Boom
6   | Discovering Egypt, ch. 6    | 704 × 464  | 298              | Pan
7   | The English Patient, ch. 2  | 720 × 352  | 97               | Texture
8   | The English Patient, ch. 6  | 720 × 352  | 196              | og
9   | The English Patient, ch. 8  | 720 × 352  | 151              | og
The video frames are represented in YUV format, and the video frame rate is 23.976 frames per second (fps) for all video sequences. The effectiveness of the method proposed by the present invention is evaluated using the bit rate R of the compressed video sequences and the visual quality of the decoded video sequences. The latter is evaluated by visual inspection of the video sequences and by their peak signal-to-noise ratio (PSNR) values.
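For readers unfamiliar with the PSNR metric used in the evaluation, a minimal sketch of its usual definition follows. The text does not state how PSNR was computed in the experiments, so the 8-bit peak value of 255, the flat list of samples, and the function name are assumptions.

```python
import math

def psnr(original, decoded, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists.

    Assumption: 8-bit samples (peak value 255) unless another peak is given.
    """
    mse = sum((o - d) ** 2 for o, d in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf  # identical signals have unbounded PSNR
    return 10 * math.log10(peak * peak / mse)

# Hypothetical samples that differ by one level each (mean square error of 1).
value = psnr([100, 100], [101, 99])
```

A mean square error of 1 on 8-bit samples corresponds to roughly 48.13 dB, which gives a sense of scale for the losses of a few tenths of a dB reported in the experiments.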
The novel components of the encoding method of the present invention described in the proposed direct mode enhancement method section complement one another in their effect on rate and distortion. The method of the present invention yields an overall bit rate reduction and a slight reduction in PSNR. The system of the present invention was evaluated using the two experiments described in the following sections.
Same quantization parameter for all sequences
In the first experiment, the same quantization parameters were selected for all video sequences, equal to Q, Q+1, and Q+3 for I-frames, P-frames, and B-frames, respectively. As shown in Table 2, the bit rate reduction obtained with the encoding method of the present invention can be as high as 9%, with a PSNR loss of about 0.12 dB. No distortion is visible in the video sequences encoded with the encoding method of the present invention when compared with those encoded with the reference method.
Table 2: Bit rate (BR) [kbit/s] and peak signal-to-noise ratio (PSNR) [dB] of the video sequences for the reference method and the proposed method, using the same quantization parameter Q for all sequences
[Table 2 appears as an image (G2009101521804D00111) in the original publication.]
Highest quantization parameter for each sequence
A second experiment was designed and carried out to further evaluate the effectiveness of the encoding method of the present invention. When both the bit rate R and the PSNR value decrease, a common argument is that several other methods, such as pre-filtering the video sequence or increasing the value of the quantizer Q, could produce the same result. The purpose of this experiment is to show that the method of the present invention can further reduce the bit rate when those methods can no longer be applied without unacceptably degrading the quality of the video.
First, for each test video sequence, the bit rate was reduced as far as possible with the reference method by increasing the value of the quantization parameter until distortion became visible, at Q_max + 1. Next, the system encoded and decoded the video sequences using Q_max (the maximum value at which distortion is still not visible) and the reference method, yielding the bit rates and PSNR values included in Table 3. The value of Q_max differs for each sequence, and it also differs for I-frames, P-frames, and B-frames. Having thus obtained the maximum bit rate reduction achievable without visual loss, the sequences were then encoded at the same Q_max values with the encoding method of the present invention.
Table 3: Bit rate (BR) [kbit/s] and peak signal-to-noise ratio (PSNR) [dB] of the movie sequences for the reference method and the proposed method, using the highest quantization parameter
[Table 3 appears as an image (G2009101521804D00121) in the original publication.]
As shown in Table 3, the method of the present invention can further reduce the bit rate significantly, by as much as 13.3%, for a PSNR loss of about 0.29 dB. Visual inspection of the sequences at full frame rate (in order to evaluate any B-frame related artifacts) confirmed that the bit rate reduction does not introduce visible artifacts into the decoded video sequences. Note that when the method of the present invention is used, the value of the quantization parameter can be increased beyond Q_max, yielding further bit rate reduction without visual loss.
Conclusion
The present invention provides a method for enhancing Direct mode in B-pictures and Skip mode in P-pictures within the framework of the H.264 (MPEG-4/Part 10) video compression standard. The system of the present invention computes distortion using a Huber cost function, modifies the Lagrangian multiplier, and clusters the Lagrangian values in order to select the coding mode used to encode a pixel block. Tests have shown that the method of the present invention achieves significant bit rate reductions at a small peak signal-to-noise ratio (PSNR) loss and with no loss of subjective visual quality. In addition, these properties make the method particularly useful for bit rate reduction in any video coding system that uses a rate-distortion optimization framework for coding mode decisions, when other options, such as further increasing the value of the quantization parameter, are no longer applicable.
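The three ingredients named above (Huber-based distortion, a Lagrangian cost, and clustering of the Lagrangian values) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function and parameter names (`huber`, `beta`, `lam`, `ratio_threshold`) are hypothetical, the cost is taken in the usual rate-distortion form J = D + lambda * R, the error is read as a magnitude, and the clustering rule follows the wording of the claims (coding mode 0 is preferred whenever the ratio of the minimum Lagrangian value to its own Lagrangian value reaches a threshold).

```python
from typing import Dict, Sequence


def huber(error: float, beta: float) -> float:
    """Huber cost: quadratic for small errors, linear for outliers.

    `beta` is the (hypothetical) threshold separating the two regimes.
    """
    a = abs(error)  # assumption: the error is used as a magnitude
    if a <= beta:
        return a * a / 2.0
    return beta * a - beta * beta / 2.0


def huber_distortion(original: Sequence[float],
                     decoded: Sequence[float],
                     beta: float) -> float:
    """Sum of Huber costs over the pixel errors of one block."""
    return sum(huber(o - d, beta) for o, d in zip(original, decoded))


def lagrangian(distortion: float, rate: float, lam: float) -> float:
    """Rate-distortion cost J = D + lambda * R (assumed standard form)."""
    return distortion + lam * rate


def select_mode(lagrangians: Dict[int, float], ratio_threshold: float) -> int:
    """Pick mode 0 if it lies in the cluster near the minimum cost,
    otherwise pick the mode with the minimum Lagrangian value."""
    best_mode = min(lagrangians, key=lagrangians.get)
    j_min = lagrangians[best_mode]
    # A mode joins the cluster when j_min / j >= ratio_threshold.
    # Modes tied with the minimum always join (this also avoids 0 / 0).
    cluster = {
        mode for mode, j in lagrangians.items()
        if j == j_min or (j > 0 and j_min / j >= ratio_threshold)
    }
    return 0 if 0 in cluster else best_mode
```

For example, with Lagrangian values `{0: 105.0, 1: 100.0, 2: 150.0}` and a hypothetical ratio threshold of 0.9, mode 0 is selected even though mode 1 has the lowest cost, because 100 / 105 is above the threshold. With `{0: 130.0, 1: 100.0, 2: 150.0}` the ratio for mode 0 falls below 0.9 and mode 1 is selected instead.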
A method and apparatus for performing digital image enhancement have been described above. Those of ordinary skill in the art may make changes and modifications to the materials and arrangements of elements of the present invention without departing from its scope.

Claims (36)

1. A method of performing mode selection in a video compression and encoding system, the method comprising:
encoding and decoding a pixel block with each possible coding mode;
calculating a distortion value for each coding mode;
calculating a bit rate value for each coding mode;
calculating a Lagrangian value for each coding mode using the distortion value, the bit rate value, and a Lagrangian multiplier;
identifying a set of coding modes in which the ratio of the minimum calculated Lagrangian value to the Lagrangian value associated with each coding mode in the set is greater than or equal to a threshold error value;
selecting coding mode 0 when coding mode 0 belongs to the identified set of coding modes; and
selecting the coding mode associated with the minimum calculated Lagrangian value when coding mode 0 does not belong to the identified set of coding modes.
2. The method according to claim 1, wherein the distortion value is calculated as the sum of the Huber function values of the errors between pixels in the original pixel block and pixels in the decoded pixel block, wherein the Huber function reduces the influence of outliers, and the Huber function weights outliers in the pixel block less than a mean squared error function does.
3. The method according to claim 1, wherein calculating the bit rate value comprises calculating the total number of bits necessary to encode a set of motion vectors and a set of transform coefficients.
4. The method according to claim 1, wherein the Lagrangian multiplier comprises a slowly varying Lagrangian multiplier that is a function of a quantizer value.
5. A method of performing mode selection in a video compression and encoding system, the method comprising:
encoding and decoding a pixel block with each possible coding mode;
calculating a distortion value for each coding mode;
calculating a bit rate value for each coding mode;
calculating a Lagrangian value for each coding mode using the distortion value, the bit rate value, and a slowly varying Lagrangian multiplier, wherein, when the Lagrangian value is calculated, the slowly varying Lagrangian multiplier varies more slowly as a function of a quantizer value than a standard Lagrangian multiplier does, so as to emphasize the distortion value relative to the bit rate value; and
selecting a coding mode using the calculated Lagrangian values.
6. The method according to claim 5, wherein the distortion value is calculated using a function that reduces the influence of outliers on the distortion value.
7. The method according to claim 6, wherein the function comprises a Huber function.
8. The method according to claim 5, wherein calculating the bit rate value comprises calculating the total number of bits necessary to encode a set of motion vectors and a set of transform coefficients.
9. A method for selecting a coding mode from a plurality of coding modes, the method comprising:
for each coding mode from the plurality of coding modes,
calculating a distortion value for the coding mode using a function that reduces the influence of outliers on the distortion value;
calculating a bit rate value for the coding mode; and
calculating a Lagrangian value based on the distortion value, the bit rate value of the coding mode, and a Lagrangian multiplier;
identifying a set of coding modes in which the ratio of the minimum calculated Lagrangian value to the Lagrangian value associated with each coding mode in the set is greater than or equal to a threshold error value;
selecting coding mode 0 when coding mode 0 belongs to the identified set of coding modes;
selecting the coding mode associated with the minimum calculated Lagrangian value when coding mode 0 does not belong to the identified set of coding modes; and
encoding a plurality of video images using the selected coding mode.
10. The method according to claim 9, wherein the influence of outliers is reduced in order to select, for a pixel group, a coding mode that is the same as the coding mode used to encode an adjacent pixel group.
11. The method according to claim 9, wherein, when the error value between a set of original pixel values of a video image and a set of decoded pixel values of the video image is greater than a threshold error value, the distortion value equals (the error value) × (the threshold error value) − (the threshold error value)²/2.
12. The method according to claim 9, wherein, when the error value between a set of original pixel values of a video image and a set of decoded pixel values of the video image is equal to or less than a threshold error value, the distortion value equals (the error value)²/2.
13. An apparatus for selecting a coding mode from a plurality of coding modes, the apparatus comprising:
means for calculating a distortion value for each coding mode from the plurality of coding modes using a function that reduces the influence of outliers on the distortion value;
means for calculating a bit rate value for each coding mode;
means for calculating, for each coding mode from the plurality of coding modes, a Lagrangian value based on the distortion value, the bit rate value of the coding mode, and a Lagrangian multiplier;
means for identifying a set of coding modes in which the ratio of the minimum calculated Lagrangian value to the Lagrangian value associated with each coding mode in the set is greater than or equal to a threshold error value;
means for selecting coding mode 0 when coding mode 0 belongs to the identified set of coding modes, and for selecting the coding mode associated with the minimum calculated Lagrangian value when coding mode 0 does not belong to the identified set of coding modes; and
means for encoding a plurality of video images using the selected coding mode.
14. The apparatus according to claim 13, wherein the influence of outliers is reduced in order to select, for a pixel group, a coding mode that is the same as the coding mode used to encode an adjacent pixel group.
15. The apparatus according to claim 13, wherein, when the error value between a set of original pixel values of a video image and a set of decoded pixel values of the video image is greater than a threshold error value, the distortion value equals (the error value) × (the threshold error value) − (the threshold error value)²/2.
16. The apparatus according to claim 13, wherein, when the error value between a set of original pixel values of a video image and a set of decoded pixel values of the video image is equal to or less than a threshold error value, the distortion value equals (the error value)²/2.
17. A method for selecting a coding mode from a plurality of coding modes, the method comprising:
for each coding mode from the plurality of coding modes,
calculating a distortion value for the coding mode using a function that reduces the influence of outliers on the distortion value;
calculating a bit rate value for the coding mode; and
calculating a Lagrangian value based on (i) the distortion value of the coding mode, (ii) the bit rate value of the coding mode, and (iii) a slowly varying Lagrangian multiplier, wherein, when the Lagrangian value is calculated, the Lagrangian multiplier varies as a function of a quantization parameter at a slower rate than a reference Lagrangian multiplier, so as to emphasize the distortion value relative to the bit rate value;
selecting a coding mode from the plurality of coding modes based on the calculated Lagrangian values; and
encoding a plurality of video images using the selected coding mode.
18. The method according to claim 17, wherein the reference Lagrangian multiplier is specified in the H.264 standard.
19. An apparatus for selecting a coding mode from a plurality of coding modes, the apparatus comprising:
means for calculating a distortion value for each coding mode using a function that reduces the influence of outliers on the distortion value;
means for calculating a bit rate value for each coding mode;
means for calculating, for each coding mode of the plurality of coding modes, a Lagrangian value for the coding mode based on (i) the distortion value of the coding mode, (ii) the bit rate value of the coding mode, and (iii) a slowly varying Lagrangian multiplier, wherein, when the Lagrangian value is calculated, the Lagrangian multiplier varies as a function of a quantization parameter at a slower rate than a reference Lagrangian multiplier, so as to emphasize the distortion value relative to the bit rate value;
means for selecting a coding mode from the plurality of coding modes based on the calculated Lagrangian values; and
means for encoding a plurality of video images using the selected coding mode.
20. The apparatus according to claim 19, wherein the reference Lagrangian multiplier is specified in the H.264 standard.
21. A method for selecting a coding mode from a plurality of coding modes, the plurality of coding modes including coding mode 0, the method comprising:
for each coding mode from the plurality of coding modes,
calculating a distortion value for the coding mode;
calculating a bit rate value for the coding mode; and
calculating a Lagrangian value for the coding mode based on (i) the distortion value, (ii) the bit rate value, and (iii) a slowly varying Lagrangian multiplier, wherein, when the Lagrangian value is calculated, the slowly varying Lagrangian multiplier varies more slowly as a function of a quantizer value than a standard Lagrangian multiplier does, so as to emphasize the distortion value relative to the bit rate value;
determining the minimum Lagrangian value among the calculated Lagrangian values;
identifying a set of coding modes in which the ratio of the minimum Lagrangian value to the Lagrangian value associated with each coding mode in the set is greater than or equal to a threshold error value;
selecting coding mode 0 when coding mode 0 belongs to the identified set of coding modes; and
selecting the coding mode associated with the minimum Lagrangian value when coding mode 0 does not belong to the identified set of coding modes.
22. The method according to claim 21, wherein coding mode 0 is Direct mode coding.
23. The method according to claim 21, wherein coding mode 0 is Skip mode coding.
24. The method according to claim 21, wherein coding mode 0 is a coding mode that does not transmit motion vector information.
25. An apparatus for selecting a coding mode from a plurality of coding modes, the plurality of coding modes including coding mode 0, the apparatus comprising:
means for calculating a distortion value for each coding mode;
means for calculating a bit rate value for each coding mode;
means for calculating, for each coding mode of the plurality of coding modes, a Lagrangian value based on (i) the distortion value of the coding mode, (ii) a slowly varying Lagrangian multiplier, and (iii) the bit rate value of the coding mode, wherein, when the Lagrangian value is calculated, the slowly varying Lagrangian multiplier varies more slowly as a function of a quantizer value than a standard Lagrangian multiplier does, so as to emphasize the distortion value relative to the bit rate value;
means for determining the minimum Lagrangian value among the calculated Lagrangian values;
means for identifying a set of coding modes in which the ratio of the minimum Lagrangian value to the Lagrangian value calculated for each coding mode in the set is greater than or equal to a threshold error value; and
means for selecting coding mode 0 when coding mode 0 belongs to the identified set of coding modes, and for selecting the coding mode associated with the minimum Lagrangian value when coding mode 0 does not belong to the identified set of coding modes.
26. The apparatus according to claim 25, wherein coding mode 0 is Direct mode coding.
27. The apparatus according to claim 25, wherein coding mode 0 is Skip mode coding.
28. The apparatus according to claim 25, wherein coding mode 0 is a coding mode that does not transmit motion vector information.
29. An apparatus for performing mode selection in a video compression and encoding system, the apparatus comprising:
means for encoding and decoding a pixel block with each possible coding mode;
means for calculating a distortion value for each coding mode;
means for calculating a bit rate value for each coding mode;
means for calculating a Lagrangian value for each coding mode using the distortion value, the bit rate value, and a Lagrangian multiplier;
means for identifying a set of coding modes in which the ratio of the minimum calculated Lagrangian value to the Lagrangian value associated with each coding mode in the set is greater than or equal to a threshold error value; and
means for selecting coding mode 0 when coding mode 0 belongs to the identified set of coding modes, and for selecting the coding mode associated with the minimum calculated Lagrangian value when coding mode 0 does not belong to the identified set of coding modes.
30. The apparatus according to claim 29, wherein the distortion value is calculated as the sum of the Huber function values of the errors between pixels in the original pixel block and pixels in the decoded pixel block, and wherein the Huber function weights outliers in the pixel block less than a mean squared error function does.
31. The apparatus according to claim 29, wherein the means for calculating the bit rate value comprises means for calculating the total number of bits necessary to encode a set of motion vectors and a set of transform coefficients.
32. The apparatus according to claim 29, wherein the Lagrangian multiplier comprises a slowly varying Lagrangian multiplier that is a function of a quantizer value.
33. An apparatus for performing mode selection in a video compression and encoding system, the apparatus comprising:
means for encoding and decoding a pixel block with each possible coding mode;
means for calculating a distortion value for each coding mode;
means for calculating a bit rate value for each coding mode;
means for calculating a Lagrangian value for each coding mode using the distortion value, the bit rate value, and a slowly varying Lagrangian multiplier, wherein, when the Lagrangian value is calculated, the slowly varying Lagrangian multiplier varies more slowly as a function of a quantizer value than a standard Lagrangian multiplier does, so as to emphasize the distortion value relative to the bit rate value; and
means for selecting a coding mode using the calculated Lagrangian values.
34. The apparatus according to claim 33, wherein the distortion value is calculated using a function that reduces the influence of outliers on the distortion value.
35. The apparatus according to claim 34, wherein the function comprises a Huber function.
36. The apparatus according to claim 33, wherein the means for calculating the bit rate value comprises means for calculating the total number of bits necessary to encode a set of motion vectors and a set of transform coefficients.
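
For reference, the two-case distortion definition recited in claims 11 and 12 (and again in claims 15 and 16) can be written as a single piecewise function. The symbols are introduced here only for illustration and do not appear in the claims: e denotes the error value and β the threshold error value, and reading the error as a magnitude |e| is an assumption.

```latex
D(e) =
\begin{cases}
  \dfrac{e^{2}}{2}, & |e| \le \beta \\[6pt]
  \beta\,|e| - \dfrac{\beta^{2}}{2}, & |e| > \beta
\end{cases}
```

Both branches give β²/2 at |e| = β, so the function is continuous at the threshold; beyond it the cost grows linearly rather than quadratically, which is why outliers are weighted less than under a squared-error measure.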
CN 200910152180 2003-01-08 2004-01-07 Method and apparatus for improved coding mode selection Expired - Fee Related CN101651831B (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US43906203P 2003-01-08 2003-01-08
US60/439,062 2003-01-08
US10/614,929 2003-07-07
US10/614,929 US7194035B2 (en) 2003-01-08 2003-07-07 Method and apparatus for improved coding mode selection

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN 200480002031 Division CN100536571C (en) 2003-01-08 2004-01-07 Method and apparatus for improved coding mode selection

Publications (2)

Publication Number Publication Date
CN101651831A CN101651831A (en) 2010-02-17
CN101651831B true CN101651831B (en) 2013-07-17

Family

ID=36680333

Family Applications (2)

Application Number Title Priority Date Filing Date
CN 200910152180 Expired - Fee Related CN101651831B (en) 2003-01-08 2004-01-07 Method and apparatus for improved coding mode selection
CN 200480002031 Expired - Fee Related CN100536571C (en) 2003-01-08 2004-01-07 Method and apparatus for improved coding mode selection

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN 200480002031 Expired - Fee Related CN100536571C (en) 2003-01-08 2004-01-07 Method and apparatus for improved coding mode selection

Country Status (1)

Country Link
CN (2) CN101651831B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8503536B2 (en) * 2006-04-07 2013-08-06 Microsoft Corporation Quantization adjustments for DC shift artifacts
CN101198062A (en) * 2006-12-04 2008-06-11 华为技术有限公司 Method and system for estimating compression encoding
JP5409640B2 (en) * 2007-10-16 2014-02-05 トムソン ライセンシング Method and apparatus for artifact removal for bit depth scalability
CN101552924B (en) * 2008-03-31 2011-08-03 深圳市融创天下科技发展有限公司 Spatial prediction method for video coding
US8934548B2 (en) * 2009-05-29 2015-01-13 Mitsubishi Electric Corporation Image encoding device, image decoding device, image encoding method, and image decoding method
CN102256126A (en) * 2011-07-14 2011-11-23 北京工业大学 Method for coding mixed image
CN104320657B (en) * 2014-10-31 2017-11-03 中国科学技术大学 The predicting mode selecting method of HEVC lossless video encodings and corresponding coding method
CN106296754B (en) * 2015-05-20 2019-06-18 上海和辉光电有限公司 Show data compression method and display data processing system
CN108696750A (en) * 2017-04-05 2018-10-23 深圳市中兴微电子技术有限公司 A kind of decision method and device of prediction mode

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1157080A (en) * 1995-04-25 1997-08-13 菲利浦电子有限公司 Device and method for coding video pictures
CN1292978A (en) * 1999-01-15 2001-04-25 皇家菲利浦电子有限公司 Coding and noise filtering image sequence

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE69609137T2 (en) * 1995-04-25 2001-03-01 Koninkl Philips Electronics Nv DEVICE AND METHOD FOR ENCODING VIDEO IMAGES.

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Mei-Yin Shen et al. Fast Compression Artifact Reduction Technique Based On Nonlinear Filtering. Proceedings of the 1999 IEEE International Symposium on Circuits and Systems. 1999, Vol. 4, pp. 179-182. *
Thomas Wiegand et al. Lagrange Multiplier Selection in Hybrid Video Coder Control. Proceedings 2001 International Conference on Image Processing. 2001, Vol. 3, pp. 542-545. *

Also Published As

Publication number Publication date
CN1754389A (en) 2006-03-29
CN101651831A (en) 2010-02-17
CN100536571C (en) 2009-09-02

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20130717

Termination date: 20160107