CN104662905B - Inter prediction method using multiple hypothesis estimators, and apparatus therefor - Google Patents

Info

Publication number
CN104662905B
Authority
CN
China
Prior art keywords
sub
pixel
unit
coding
hypothesis
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201380049000.4A
Other languages
Chinese (zh)
Other versions
CN104662905A (en
Inventor
金壹求
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Samsung Electronics Co Ltd
Original Assignee
Samsung Electronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Samsung Electronics Co Ltd
Publication of CN104662905A
Application granted
Publication of CN104662905B
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/90 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
    • H04N19/96 Tree coding, e.g. quad-tree coding

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The present disclosure relates to video encoding and video decoding, and more particularly to a motion estimation method and a motion compensation method that determine a motion vector by using multiple hypothesis estimator pixels in units of sub-pixels, wherein the motion estimation method and the motion compensation method are performed for the inter prediction carried out during video encoding and decoding. A motion compensation method using a motion vector estimator may include the following operations: obtaining a motion vector of a prediction unit included in a coding unit, and obtaining hypothesis estimation mode information of the coding unit; determining, based on the hypothesis estimation mode information, a combination of a predetermined sub-pixel distance and a predetermined straight-line direction, wherein the predetermined sub-pixel distance is selected from among two or more sub-pixel distances and the predetermined straight-line direction is selected from among two or more straight-line directions; and determining a reference block by using blocks that respectively include two hypothesis estimator pixels located at the predetermined sub-pixel distance from a current estimator pixel in the predetermined straight-line direction, wherein the current estimator pixel is indicated by a current motion vector.

Description

Inter prediction method using multiple hypothesis estimators, and apparatus therefor
Technical field
The present disclosure relates to video encoding and decoding, and more particularly to a method and apparatus for performing motion estimation and motion compensation in video encoding and decoding.
Background art
As hardware for reproducing and storing high-resolution or high-quality video content is developed and supplied, demand is increasing for a video codec that effectively encodes or decodes high-resolution or high-quality video content. According to a conventional video codec, video is encoded by a limited coding method based on macroblocks of a predetermined size.
Image data in the spatial domain is transformed into coefficients in the frequency domain via frequency transformation. For fast computation of the frequency transformation, a video codec splits an image into blocks of a predetermined size, performs a discrete cosine transform (DCT) on each block, and encodes the frequency coefficients in units of blocks. Compared with image data in the spatial domain, coefficients in the frequency domain are easily compressed. In particular, since the pixel values of an image in the spatial domain are expressed as prediction errors through the inter prediction or intra prediction of the video codec, a large amount of data may be transformed to 0 when the frequency transformation is performed on the prediction errors. A video codec reduces the amount of data by replacing data that is generated continuously and repeatedly with data of a small size.
Summary of the invention
Technical problem
The present disclosure relates to video encoding and decoding, and more particularly to a motion estimation method and a motion compensation method that determine a reference block by using multiple hypothesis estimator (hypothetical estimator) pixels in units of sub-pixels, and that identify the hypothesis estimator pixels by using a minimum amount of information, wherein the motion estimation method and the motion compensation method are performed for the inter prediction carried out during video encoding and decoding.
Solution
A motion estimation method according to an embodiment of the present disclosure involves: determining a motion vector by using not only a motion vector in units of integer pixels but also multiple hypothesis estimators in units of sub-pixels; and performing entropy encoding on information indicating an optimal hypothesis estimator selected from among the multiple hypothesis estimators. A motion compensation method according to an embodiment of the present disclosure involves: determining a hypothesis estimator in units of sub-pixels by performing entropy decoding on information indicating the hypothesis estimator; and performing motion compensation by using a final reference block determined by combining a current motion vector with the hypothesis estimator.
Beneficial effects
The present disclosure provides one or more embodiments of a motion estimation method that improves the accuracy of inter prediction by determining a reference block additionally using hypothesis estimators located at a sub-pixel distance from a current estimator pixel. The motion estimation method allows only combinations with a high probability as the combinations of the direction and the sub-pixel distance at which a hypothesis estimator pixel is located relative to the current estimator pixel, so that the hypothesis estimator pixel can be selected quickly. In addition, the number of bits transmitted for the information about the selected hypothesis estimator pixel is reduced to a minimum, so that the bit rate of the coded symbols including the hypothesis estimation mode information can be improved.
Description of the drawings
Fig. 1 is a block diagram of a video encoding apparatus based on coding units according to a tree structure, according to an embodiment of the present disclosure.
Fig. 2 is a block diagram of a video decoding apparatus based on coding units according to a tree structure, according to an embodiment of the present disclosure.
Fig. 3 is a diagram for describing the concept of coding units according to an embodiment of the present disclosure.
Fig. 4 is a block diagram of an image encoder based on coding units, according to an embodiment of the present disclosure.
Fig. 5 is a block diagram of an image decoder based on coding units, according to an embodiment of the present disclosure.
Fig. 6 is a diagram showing deeper coding units according to depths, and partitions, according to an embodiment of the present disclosure.
Fig. 7 is a diagram for describing the relationship between a coding unit and transformation units, according to an embodiment of the present disclosure.
Fig. 8 is a diagram for describing encoding information of coding units corresponding to a coded depth, according to an embodiment of the present disclosure.
Fig. 9 is a diagram of deeper coding units according to depths, according to an embodiment of the present disclosure.
Fig. 10 to Fig. 12 are diagrams for describing the relationship between coding units, prediction units, and transformation units, according to an embodiment of the present disclosure.
Fig. 13 is a diagram for describing the relationship between a coding unit, a prediction unit, and a transformation unit, according to the encoding mode information of Table 1.
Fig. 14 is a block diagram of a motion estimation apparatus according to an embodiment of the present disclosure.
Fig. 15 is a block diagram of a motion compensation apparatus according to an embodiment of the present disclosure.
Fig. 16a and Fig. 16b show types of hypothesis estimation modes according to an embodiment of the present disclosure.
Fig. 17 shows combinations of direction, symbol value, and distance indicated by the hypothesis estimation modes, according to an embodiment of the present disclosure.
Fig. 18 shows hypothesis estimation modes that are test targets with respect to rate-distortion (RD) cost, according to an embodiment of the present disclosure.
Fig. 19 is a flowchart of a motion estimation method according to an embodiment of the present disclosure.
Fig. 20 is a flowchart of a motion compensation method according to an embodiment of the present disclosure.
Fig. 21 shows the physical structure of a disc storing a program, according to an embodiment of the present disclosure.
Fig. 22 shows a disc drive for recording and reading a program by using the disc.
Fig. 23 is a diagram of the overall structure of a content supply system for providing a content distribution service.
Fig. 24 and Fig. 25 show the external structure and the internal structure of a mobile phone to which a video encoding method and a video decoding method are applied, according to an embodiment of the present disclosure.
Fig. 26 shows a digital broadcasting system to which a communication system is applied, according to an embodiment of the present disclosure.
Fig. 27 shows the network structure of a cloud computing system using a video encoding apparatus and a video decoding apparatus, according to an embodiment of the present disclosure.
Best mode
A motion estimation method according to an embodiment of the present disclosure involves: determining a motion vector by using not only a motion vector in units of integer pixels but also multiple hypothesis estimators in units of sub-pixels; and performing entropy encoding on information indicating an optimal hypothesis estimator selected from among the multiple hypothesis estimators. A motion compensation method according to an embodiment of the present disclosure involves: determining a hypothesis estimator in units of sub-pixels by performing entropy decoding on information indicating the hypothesis estimator; and performing motion compensation by using a final reference block determined by combining a current motion vector with the hypothesis estimator.
According to an aspect of the present disclosure, there is provided a motion compensation method using a motion vector estimator, the motion compensation method including the following operations: obtaining a motion vector of a prediction unit included in a coding unit, and obtaining hypothesis estimation mode information of the coding unit; determining, based on the hypothesis estimation mode information, a combination of a predetermined sub-pixel distance and a predetermined straight-line direction, wherein the predetermined sub-pixel distance is selected from among two or more sub-pixel distances and the predetermined straight-line direction is selected from among two or more straight-line directions; and determining a reference block by using blocks that respectively include two hypothesis estimator pixels located at the predetermined sub-pixel distance from a current estimator pixel in the predetermined straight-line direction, wherein the current estimator pixel is indicated by a current motion vector.
The operation of obtaining the hypothesis estimation mode information may include the following operations: obtaining the hypothesis estimation mode information and motion vector difference information, wherein the motion vector difference information indicates a difference between the current motion vector and a motion vector of a prediction unit encoded before the current prediction unit; and obtaining residual data between the current prediction unit and the reference block, wherein the operation of determining the reference block may include generating a restored block of the current prediction unit by combining the residual data with the reference block.
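As a minimal, non-limiting sketch of the two combinations described above, the following Python code adds a motion vector difference to the motion vector of an earlier-coded prediction unit, and adds residual data to a reference block. The function names, the list-of-rows block layout, and the clipping of restored samples to an 8-bit range are assumptions made for illustration and are not stated in the disclosure.

```python
def reconstruct_motion_vector(earlier_mv, mv_difference):
    """Current motion vector = earlier-coded motion vector + signalled difference."""
    return (earlier_mv[0] + mv_difference[0], earlier_mv[1] + mv_difference[1])


def restore_block(reference_block, residual, bit_depth=8):
    """Combine residual data with the reference block, sample by sample.

    Clipping to [0, 2**bit_depth - 1] is an assumption of this sketch.
    """
    max_value = (1 << bit_depth) - 1
    return [
        [min(max(ref + res, 0), max_value) for ref, res in zip(ref_row, res_row)]
        for ref_row, res_row in zip(reference_block, residual)
    ]
```

For example, an earlier motion vector of (4, -2) and a difference of (1, 3) give a current motion vector of (5, 1).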
The operation of obtaining the hypothesis estimation mode information may include the following operation: obtaining hypothesis estimation mode information that is determined in common for the prediction units included in a current coding unit.
The two or more sub-pixel distances may include a 1/4-pixel distance and a 1/2-pixel distance, the two or more straight-line directions may include directions at angles of 0 degrees, 90 degrees, 135 degrees, and 45 degrees, and the hypothesis estimation mode information may represent eight combinations of a sub-pixel distance selected from among the two or more sub-pixel distances and a straight-line direction selected from among the two or more straight-line directions.
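As a minimal sketch of the eight combinations described above, the following Python code enumerates two sub-pixel distances and four straight-line directions, and derives the two offsets that lie on either side of the current estimator pixel. The order of the modes, the axis orientation, and the rule that a diagonal step moves by the sub-pixel distance along each axis are assumptions made for illustration; the disclosure does not fix them in this passage.

```python
from fractions import Fraction
from itertools import product

# Sub-pixel distances (in pixels) and directions (in degrees) named in the text.
DISTANCES = (Fraction(1, 4), Fraction(1, 2))
ANGLES = (0, 90, 135, 45)

# Hypothetical mode table: the index assigned to each combination is assumed.
MODES = list(product(DISTANCES, ANGLES))

# Unit steps per direction; the axis orientation is an assumption of this sketch.
STEP = {0: (1, 0), 90: (0, 1), 135: (-1, 1), 45: (1, 1)}


def hypothesis_offsets(mode_index):
    """Return the two (dx, dy) offsets of the hypothesis estimator pixels.

    The offsets are symmetric about the current estimator pixel, which sits
    at (0, 0) in this coordinate system.
    """
    distance, angle = MODES[mode_index]
    step_x, step_y = STEP[angle]
    forward = (step_x * distance, step_y * distance)
    backward = (-step_x * distance, -step_y * distance)
    return forward, backward
```

With this ordering, `len(MODES)` is 8, matching the eight combinations of two distances and four directions.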
The operation of determining the combination may include the following operations: determining a context model of the hypothesis estimation mode information according to the depth of the coding unit; performing entropy decoding on the hypothesis estimation mode information by using four context models corresponding to the depth of the current coding unit; and determining the combination of the sub-pixel distance and the straight-line direction for the current motion vector, based on the entropy-decoded hypothesis estimation mode information.
According to another aspect of the present disclosure, there is provided a motion estimation method using a motion vector estimator, the motion estimation method including the following operations: determining, from among prediction units included in a coding unit, a current motion vector for inter prediction of a current prediction unit; determining a reference block by using blocks that respectively include two hypothesis estimator pixels centered on a current estimator pixel in a predetermined straight-line direction, wherein the two hypothesis estimator pixels are from among multiple hypothesis estimator pixels located at a predetermined sub-pixel distance from the current estimator pixel, and wherein the current estimator pixel is indicated by the current motion vector; and outputting hypothesis estimation mode information of the coding unit and outputting motion vector difference information of the prediction unit, wherein the hypothesis estimation mode information indicates a combination of the predetermined sub-pixel distance and the predetermined straight-line direction, wherein the predetermined sub-pixel distance is selected from among two or more sub-pixel distances and the predetermined straight-line direction is selected from among two or more straight-line directions.
The operation of outputting the hypothesis estimation mode information and outputting the motion vector difference information may include the following operations: outputting the hypothesis estimation mode information and the motion vector difference information, wherein the motion vector difference information indicates a difference between the current motion vector and a motion vector of a prediction unit encoded before the current prediction unit; and outputting residual data between the current prediction unit and the reference block.
The operation of outputting the hypothesis estimation mode information may include the following operation: outputting hypothesis estimation mode information that is determined in common for the prediction units included in a current coding unit.
The two or more sub-pixel distances may include a 1/4-pixel distance and a 1/2-pixel distance, the two or more straight-line directions may include directions at angles of 0 degrees, 90 degrees, 135 degrees, and 45 degrees, and the hypothesis estimation mode information may represent eight combinations of a sub-pixel distance selected from among the two or more sub-pixel distances and a straight-line direction selected from among the two or more straight-line directions.
The operation of outputting the hypothesis estimation mode information may include: determining a context model of the hypothesis estimation mode information according to the depth of the coding unit; and performing entropy encoding on the hypothesis estimation mode information by using four context models corresponding to the depth of the current coding unit.
The operation of determining the reference block may include the following operations: calculating rate-distortion (RD) costs by using hypothesis estimator pixels located at a 1/4-pixel distance from the current estimator pixel in each of the straight-line directions at angles of 0 degrees, 90 degrees, 135 degrees, and 45 degrees; calculating an RD cost by using hypothesis estimator pixels located at a 1/2-pixel distance from the current estimator pixel in the direction that yields the minimum RD cost among the calculated RD costs; determining the straight-line direction and the sub-pixel distance that yield the minimum RD cost among the calculated RD costs; and determining the reference block, wherein the reference block is an average block of the blocks that respectively include the hypothesis estimator pixels determined based on the straight-line direction and the sub-pixel distance that yield the minimum RD cost.
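The two-stage search described above can be sketched as follows, under stated assumptions. The cost function `rd_cost(distance, angle)` is a hypothetical callable supplied by the caller, the function names are illustrative, and the rounding used in the average block is an assumption; none of these details are specified in this passage.

```python
def select_hypothesis_mode(rd_cost, angles=(0, 90, 135, 45)):
    """Pick a (distance, angle, cost) triple with five RD evaluations.

    Stage 1 tests the 1/4-pixel distance in every direction.
    Stage 2 tests the 1/2-pixel distance only in the best stage-1 direction.
    """
    quarter_costs = {angle: rd_cost(0.25, angle) for angle in angles}
    best_angle = min(quarter_costs, key=quarter_costs.get)
    half_cost = rd_cost(0.5, best_angle)
    if half_cost < quarter_costs[best_angle]:
        return 0.5, best_angle, half_cost
    return 0.25, best_angle, quarter_costs[best_angle]


def average_block(block_a, block_b):
    """Average two equally sized blocks; rounding half up is an assumption."""
    return [
        [(a + b + 1) >> 1 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(block_a, block_b)
    ]
```

Under this sketch only five of the eight combinations are evaluated, which is one way to read the statement that the hypothesis estimator pixel can be selected quickly.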
According to another aspect of the present disclosure, there is provided a motion compensation apparatus using a motion vector estimator, the motion compensation apparatus including: an information obtainer that obtains residual data and a current motion vector of a current prediction unit from among prediction units included in a coding unit, and obtains hypothesis estimation mode information of the coding unit; a hypothesis estimation mode determiner that determines, based on the hypothesis estimation mode information, a combination of a predetermined sub-pixel distance and a predetermined straight-line direction, wherein the predetermined sub-pixel distance is selected from among two or more sub-pixel distances and the predetermined straight-line direction is selected from among two or more straight-line directions; and a motion compensator that determines a reference block by using blocks that respectively include two hypothesis estimator pixels located at the predetermined sub-pixel distance from a current estimator pixel in the predetermined straight-line direction, and generates a restored block of the current prediction unit by combining the residual data with the reference block, wherein the current estimator pixel is indicated by the current motion vector.
According to another aspect of the present disclosure, there is provided a motion estimation apparatus using a motion vector estimator, the motion estimation apparatus including: a motion estimator that determines, from among prediction units included in a coding unit, a current motion vector for inter prediction of a current prediction unit, and determines a reference block by using blocks that respectively include two hypothesis estimator pixels centered on a current estimator pixel in a predetermined straight-line direction, wherein the two hypothesis estimator pixels are from among multiple hypothesis estimator pixels located at a predetermined sub-pixel distance from the current estimator pixel, and wherein the current estimator pixel is indicated by the current motion vector; and an information output unit that outputs hypothesis estimation mode information of the coding unit and outputs motion vector difference information of the prediction unit, wherein the hypothesis estimation mode information indicates a combination of the predetermined sub-pixel distance and the predetermined straight-line direction, wherein the predetermined sub-pixel distance is selected from among two or more sub-pixel distances and the predetermined straight-line direction is selected from among two or more straight-line directions.
According to another aspect of the present disclosure, there is provided a computer-readable recording medium having recorded thereon a computer program for executing the motion compensation method.
According to another aspect of the present disclosure, there is provided a computer-readable recording medium having recorded thereon a computer program for executing the motion estimation method.
Detailed description of embodiments
Hereinafter, a video encoding and decoding scheme based on coding units having a tree structure will be described with reference to Fig. 1 to Fig. 13. Hereinafter, the term "image" may denote a still image or a moving picture (that is, video itself). In addition, a method and apparatus for performing motion estimation and motion compensation by using multiple hypotheses will be described with reference to Fig. 14 to Fig. 20, wherein the motion estimation and motion compensation are used for the inter prediction performed in the video encoding and decoding methods based on coding units having a tree structure.
First, the video encoding and decoding scheme based on coding units having a tree structure will be described with reference to Fig. 1 to Fig. 13.
Fig. 1 is a block diagram of a video encoding apparatus 100 based on coding units according to a tree structure, according to an embodiment of the present disclosure.
The video encoding apparatus 100, which performs video prediction based on coding units according to a tree structure, includes a coding unit determiner 120 and an output unit 130. Hereinafter, for convenience of description, the video encoding apparatus 100 that performs video prediction based on coding units according to a tree structure is referred to as the "video encoding apparatus 100".
The coding unit determiner 120 may split a current picture based on a maximum coding unit of the current picture of an image. If the current picture is larger than the maximum coding unit, the image data of the current picture may be split into at least one maximum coding unit. A maximum coding unit according to an embodiment of the present disclosure may be a data unit of size 32 × 32, 64 × 64, 128 × 128, 256 × 256, or the like, wherein the shape of the data unit is a square whose width and length are each a power of 2. The image data may be output to the coding unit determiner 120 in units of the at least one maximum coding unit.
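As a minimal, non-limiting sketch of the splitting described above, the following Python function counts the maximum coding units needed to cover a picture. The function name, the default size of 64, and the treatment of partial units at the right and bottom picture edges (counted as whole units) are assumptions made for illustration and are not taken from the disclosure.

```python
def count_max_coding_units(width, height, lcu_size=64):
    """Number of square maximum coding units needed to cover a picture."""
    if lcu_size <= 0 or lcu_size & (lcu_size - 1):
        raise ValueError("maximum coding unit size must be a power of 2")
    columns = -(-width // lcu_size)  # ceiling division
    rows = -(-height // lcu_size)
    return columns * rows
```

For a 1920 × 1080 picture and a 64 × 64 maximum coding unit, this gives 30 columns and 17 rows.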
A coding unit according to an embodiment of the present disclosure may be characterized by a maximum size and a depth. The depth denotes the number of times the coding unit is spatially split from the maximum coding unit, and as the depth deepens, deeper coding units according to depths may be split from the maximum coding unit down to a minimum coding unit. The depth of the maximum coding unit is the uppermost depth, and the depth of the minimum coding unit is the lowermost depth. Since the size of the coding unit corresponding to each depth decreases as the depth of the maximum coding unit deepens, a coding unit corresponding to an upper depth may include a plurality of coding units corresponding to lower depths.
As described above, the image data of the current picture is split into maximum coding units according to the maximum size of the coding unit, and each maximum coding unit may include deeper coding units that are split according to depths. Since the maximum coding unit according to an embodiment of the present disclosure is split according to depths, the image data of the spatial domain included in the maximum coding unit may be hierarchically classified according to depths.
A maximum depth and a maximum size of a coding unit may be predetermined, wherein the maximum depth and the maximum size limit the total number of times the height and width of the maximum coding unit are hierarchically split.
The coding unit determiner 120 encodes at least one split region obtained by splitting a region of the maximum coding unit according to depths, and determines, for each of the at least one split region, a depth at which the finally encoded image data is to be output. In other words, the coding unit determiner 120 determines a coded depth by encoding the image data in deeper coding units according to depths, for each maximum coding unit of the current picture, and selecting the depth having the minimum encoding error. Accordingly, the encoded image data of the coding unit corresponding to the determined coded depth is finally output. In addition, the coding unit corresponding to the coded depth may be regarded as an encoded coding unit. The determined coded depth and the image data encoded according to the determined coded depth are output to the output unit 130.
The image data in the maximum coding unit is encoded based on the deeper coding units corresponding to at least one depth equal to or less than the maximum depth, and the results of encoding the image data are compared based on each of the deeper coding units. After the encoding errors of the deeper coding units are compared, the depth having the minimum encoding error may be selected. At least one coded depth may be selected for each maximum coding unit.
The size of the maximum coding unit is split as coding units are hierarchically split according to depths and as the number of coding units increases. In addition, even if coding units correspond to the same depth in one maximum coding unit, whether to split each of the coding units corresponding to the same depth to a lower depth is determined by separately measuring the encoding error of the image data of each coding unit. Accordingly, even when image data is included in one maximum coding unit, the image data is split into regions according to depths, and the encoding errors may differ by region within the one maximum coding unit, so the coded depths may differ by region in the image data. Therefore, one or more coded depths may be determined in one maximum coding unit, and the image data of the maximum coding unit may be divided according to coding units of at least one coded depth.
Accordingly, the coding unit determiner 120 may determine coding units having a tree structure included in the maximum coding unit. The "coding units having a tree structure" according to an embodiment of the present disclosure include, from among all deeper coding units included in the maximum coding unit, the coding units corresponding to the depth determined to be the coded depth. A coding unit of a coded depth may be hierarchically determined according to depths in the same region of the maximum coding unit, and may be independently determined in different regions. Similarly, a coded depth in a current region may be determined independently of a coded depth in another region.
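The region-by-region choice of coded depth described above can be sketched as a recursive comparison, under stated assumptions. The callable `cost(x, y, size)` is a hypothetical stand-in for the measured encoding error of one coding unit, and the exhaustive recursion and the tie-breaking rule (keep the unsplit unit when costs are equal) are assumptions of this sketch rather than the method of the disclosure.

```python
def choose_coded_depths(x, y, size, depth, max_depth, cost):
    """Return (total_cost, leaves) for the region at (x, y) of the given size.

    Each leaf is (x, y, size, depth) and marks a coding unit of a coded depth.
    A region is split into four only if the split yields a lower total cost.
    """
    own_cost = cost(x, y, size)
    if depth == max_depth:
        return own_cost, [(x, y, size, depth)]

    half = size // 2
    split_cost, split_leaves = 0, []
    for dy in (0, half):
        for dx in (0, half):
            sub_cost, sub_leaves = choose_coded_depths(
                x + dx, y + dy, half, depth + 1, max_depth, cost
            )
            split_cost += sub_cost
            split_leaves += sub_leaves

    if split_cost < own_cost:
        return split_cost, split_leaves
    return own_cost, [(x, y, size, depth)]
```

Because each quadrant is decided separately, different regions of one maximum coding unit can end at different depths, which corresponds to the tree structure described above.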
A maximum depth according to an embodiment of the present disclosure is an index related to the number of splits performed from a maximum coding unit to a minimum coding unit. A first maximum depth according to an embodiment of the present disclosure may denote the total number of splits performed from the maximum coding unit to the minimum coding unit. A second maximum depth according to an embodiment of the present disclosure may denote the total number of depth levels from the maximum coding unit to the minimum coding unit. For example, when the depth of the maximum coding unit is 0, the depth of a coding unit obtained by splitting the maximum coding unit once may be set to 1, and the depth of a coding unit obtained by splitting the maximum coding unit twice may be set to 2. Here, if the minimum coding unit is a coding unit obtained by splitting the maximum coding unit four times, five depth levels of depths 0, 1, 2, 3, and 4 exist, and thus the first maximum depth may be set to 4 and the second maximum depth may be set to 5.
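The relationship between the two maximum depths can be checked with a short sketch. The function name and the assumption that each split halves the height and width of a square coding unit are illustrative choices for this example.

```python
def depth_table(max_cu_size, min_cu_size):
    """Return (sizes_by_depth, first_max_depth, second_max_depth).

    sizes_by_depth[d] is the coding unit size at depth d, assuming that
    every split halves both the height and the width.
    """
    sizes = []
    size = max_cu_size
    while size >= min_cu_size:
        sizes.append(size)
        size //= 2
    first_max_depth = len(sizes) - 1   # total number of splits
    second_max_depth = len(sizes)      # total number of depth levels
    return sizes, first_max_depth, second_max_depth
```

For a 64 × 64 maximum coding unit and a 4 × 4 minimum coding unit, the sizes by depth are 64, 32, 16, 8, and 4, so the first maximum depth is 4 and the second maximum depth is 5, in agreement with the example in the text.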
Predictive coding and transformation can be executed according to maximum coding unit.Also according to maximum coding unit, it is equal to based on basis Or predictive coding and transformation are executed less than the deeper coding unit of the depth of depth capacity.It can be according to orthogonal transformation or integer The method of transformation executes transformation.
Since whenever being divided according to depth to maximum coding unit, the quantity of deeper coding unit increases, because This will execute the coding including predictive coding and transformation to all deeper coding units generated with depth down.In order to Convenient for explaining, in maximum coding unit, predictive coding and transformation will be described based on the coding unit of current depth now.
Video encoder 100 can differently select the size or shape of the data cell for being encoded to image data Shape.In order to encode to image data, the operation of such as predictive coding, transformation and entropy coding is executed, at this point, can be for all Identical data cell is operated with, or can be directed to and each operate with different data cells.
For example, video encoder 100 is not only alternatively used for the coding unit encoded to image data, it is also optional The data cell different from coding unit is selected, to execute predictive coding to the image data in coding unit.
It, can be based on coding unit corresponding with coding depth (that is, base in order to execute predictive coding in maximum coding unit In the coding unit for again not being divided into coding unit corresponding with more low depth) Lai Zhihang predictive coding.Hereinafter, no longer being drawn Divide and the coding unit for becoming the basic unit for predictive coding will be referred to as " predicting unit " now.It is single by dividing prediction The subregion that member obtains may include predicting unit and be divided by least one of height to predicting unit and width And the data cell obtained.Subregion be carry out being obtained by dividing data cell by the predicting unit to coding unit, and Predicting unit can be the subregion with size identical with coding unit.
For example, when a coding unit of 2N × 2N (where N is a positive integer) is no longer divided and becomes a predicting unit of 2N × 2N, the size of a subregion may be 2N × 2N, 2N × N, N × 2N or N × N. Examples of divisional types include symmetric subregions obtained by symmetrically dividing the height or width of the predicting unit, subregions obtained by asymmetrically dividing the height or width of the predicting unit (such as 1:n or n:1), subregions obtained by geometrically dividing the predicting unit, and subregions having arbitrary shapes.
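The symmetric and asymmetric divisions described above may be illustrated by the following minimal Python sketch. The function names and the labels "2Nx2N", "2NxN", "Nx2N" and "NxN" are hypothetical and chosen only for illustration; geometric and arbitrarily shaped subregions are not covered.

```python
def symmetric_subregion_sizes(n):
    """Return the (width, height) of one subregion for each symmetric
    divisional type of a 2N x 2N predicting unit (labels are illustrative)."""
    if n <= 0:
        raise ValueError("N must be a positive integer")
    return {
        "2Nx2N": (2 * n, 2 * n),  # predicting unit kept whole
        "2NxN": (2 * n, n),       # height halved: two subregions
        "Nx2N": (n, 2 * n),       # width halved: two subregions
        "NxN": (n, n),            # both halved: four subregions
    }


def asymmetric_split(length, a, b):
    """Divide a height or width of `length` samples in the ratio a:b."""
    first = length * a // (a + b)
    return first, length - first


sizes = symmetric_subregion_sizes(16)   # a 32 x 32 predicting unit
quarter, rest = asymmetric_split(64, 1, 3)
```

With N = 16, the 2N × N entry is a 32 × 16 subregion, and a 1:3 division of a length of 64 yields parts of 16 and 48.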
The prediction mode of a predicting unit may be at least one of an intra mode, an inter mode and a skip mode. For example, the intra mode or the inter mode may be performed on subregions of 2N × 2N, 2N × N, N × 2N or N × N, whereas the skip mode may be performed only on a subregion of 2N × 2N. Encoding may be performed independently on each predicting unit in a coding unit, so that the prediction mode with the minimum encoding error is selected.
Video encoder 100 may perform transformation on the image data in a coding unit based not only on the coding unit used to encode the image data but also on a converter unit different from the coding unit. To perform transformation in a coding unit, the transformation may be performed based on a data unit having a size less than or equal to that of the coding unit. For example, the converter units used for transformation may include a data unit for the intra mode and a data unit for the inter mode.
Similarly to the coding units of the tree structure according to the present embodiment, a converter unit in a coding unit may be recursively divided into smaller regions, and the residual data in the coding unit may be divided according to converter units having a tree structure, according to transformation depth.
According to an embodiment of the present disclosure, a transformation depth may also be set in a converter unit, where the transformation depth indicates the number of times the height and width of the coding unit are divided to reach the converter unit. For example, in a current coding unit of 2N × 2N, the transformation depth may be set to 0 when the size of the converter unit is 2N × 2N, to 1 when the size of the converter unit is N × N, and to 2 when the size of the converter unit is N/2 × N/2. That is, converter units according to a tree structure may also be set according to transformation depth.
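The relationship between the size of the coding unit, the size of the converter unit and the transformation depth may be sketched as follows. The function name is hypothetical, and square units whose sides are related by repeated halving are assumed.

```python
def transformation_depth(cu_size, tu_size):
    """Count how many times the side of a coding unit must be halved
    to reach the side of the converter unit."""
    if cu_size <= 0 or tu_size <= 0 or tu_size > cu_size:
        raise ValueError("converter unit must be positive and no larger than the coding unit")
    depth = 0
    size = cu_size
    while size > tu_size:
        size //= 2
        depth += 1
    if size != tu_size:
        raise ValueError("converter unit size is not reachable by halving")
    return depth


# For a 2N x 2N coding unit with 2N = 32:
depths = [transformation_depth(32, s) for s in (32, 16, 8)]
```

For 2N = 32, converter units of 32 × 32, 16 × 16 and 8 × 8 give transformation depths 0, 1 and 2, matching the 2N × 2N, N × N and N/2 × N/2 cases above.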
Encoding information for a coding unit corresponding to a coding depth requires not only information about the coding depth but also information related to predictive coding and transformation. Accordingly, coding unit determiner 120 determines not only the coding depth having the minimum encoding error, but also the divisional type in the predicting unit, the prediction mode of each predicting unit, and the size of the converter unit for transformation.
Coding units according to a tree structure in a maximum coding unit, predicting units/subregions, and a method of determining a converter unit, according to embodiments of the present disclosure, are described in detail later with reference to Fig. 10 to Fig. 21.
Coding unit determiner 120 may measure the encoding error of the deeper coding units according to depth by using rate-distortion optimization based on Lagrangian multipliers.
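Rate-distortion optimization based on a Lagrangian multiplier is commonly expressed as minimizing a cost J = D + λ·R, where D is the distortion, R is the bit rate and λ is the multiplier. The following sketch assumes this common form; the class, the function names and the numeric values are hypothetical and are not taken from the present disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One encoding option, for example a depth or a divisional type."""
    name: str
    distortion: float  # D, e.g. a sum of squared differences
    rate_bits: float   # R, the number of bits needed to encode the option


def rd_cost(candidate, lagrange_multiplier):
    """Lagrangian cost J = D + lambda * R."""
    return candidate.distortion + lagrange_multiplier * candidate.rate_bits


def best_candidate(candidates, lagrange_multiplier):
    """Return the candidate with the minimum Lagrangian cost."""
    return min(candidates, key=lambda c: rd_cost(c, lagrange_multiplier))


options = [Candidate("no_split", 100.0, 10.0), Candidate("split", 60.0, 40.0)]
low_lambda_choice = best_candidate(options, 1.0)   # costs 110 and 100
high_lambda_choice = best_candidate(options, 2.0)  # costs 120 and 140
```

A small multiplier favors the option with lower distortion, while a larger multiplier favors the option that spends fewer bits.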
Output unit 130 outputs, in a bitstream, the image data of the maximum coding unit, which is encoded based on the at least one coding depth determined by coding unit determiner 120, and information about the coding mode according to coding depth.
The encoded image data may be obtained by encoding the residual data of the image.
The information about the coding mode according to coding depth may include information about the coding depth, information about the divisional type in the predicting unit, information about the prediction mode, and information about the size of the converter unit.
The information about the coding depth may be defined by using division information according to depth, which indicates whether encoding is performed on coding units of a lower depth instead of the current depth. If the current depth of the current coding unit is the coding depth, the image data in the current coding unit is encoded and output, so the division information may be defined so as not to divide the current coding unit to a lower depth. Alternatively, if the current depth of the current coding unit is not the coding depth, encoding is performed on coding units of the lower depth, so the division information may be defined so as to divide the current coding unit to obtain coding units of the lower depth.
If the current depth is not the coding depth, encoding is performed on the coding units obtained by dividing the current coding unit into coding units of the lower depth. Since at least one coding unit of the lower depth exists in one coding unit of the current depth, encoding is repeated on each coding unit of the lower depth, and encoding may thus be performed recursively on coding units having the same depth.
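The recursive encoding according to depth may be sketched as a quadtree decision in which the cost of encoding a coding unit at the current depth is compared with the summed cost of its four coding units of the lower depth. This is only an illustrative sketch: `toy_cost` is a made-up stand-in for a real encoding error measurement, and the maximum depth is assumed to be the total number of divisions allowed.

```python
def decide_split(x, y, size, depth, max_depth, cost_fn):
    """Return (cost, tree) for the block at (x, y); tree records the division information."""
    own_cost = cost_fn(x, y, size)
    node = {"x": x, "y": y, "size": size, "depth": depth, "split": False}
    if depth >= max_depth:
        return own_cost, node  # no further division is allowed
    half = size // 2
    results = [
        decide_split(x + dx, y + dy, half, depth + 1, max_depth, cost_fn)
        for dy in (0, half)
        for dx in (0, half)
    ]
    split_cost = sum(cost for cost, _ in results)
    if split_cost < own_cost:
        node["split"] = True
        node["children"] = [tree for _, tree in results]
        return split_cost, node
    return own_cost, node


def toy_cost(x, y, size):
    """Hypothetical cost: only the block containing pixel (0, 0) is hard to encode."""
    return size * size if (x == 0 and y == 0) else 1


total_cost, tree = decide_split(0, 0, 64, 0, 2, toy_cost)
```

In this toy case the detailed corner is divided down to 16 × 16, while the three flat 32 × 32 coding units remain undivided.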
Since coding units having a tree structure are determined for one maximum coding unit, and information about at least one coding mode is determined for each coding unit of a coding depth, information about at least one coding mode may be determined for one maximum coding unit. In addition, since the image data is hierarchically divided according to depth, the coding depth of the image data of the maximum coding unit may differ according to position, so information about the coding depth and the coding mode may be set for the image data.
Accordingly, output unit 130 may assign encoding information about the corresponding coding depth and coding mode to at least one of the coding unit, the predicting unit and the minimum unit included in the maximum coding unit.
A minimum unit according to an embodiment of the present disclosure is a rectangular data unit obtained by dividing the minimum coding unit constituting the lowermost depth into 4 parts. Alternatively, the minimum unit may be the largest rectangular data unit that can be included in all of the coding units, predicting units, subregion units and converter units included in the maximum coding unit.
For example, the encoding information output by output unit 130 may be classified into encoding information according to coding unit and encoding information according to predicting unit. The encoding information according to coding unit may include information about the prediction mode and information about the subregion size. The encoding information according to predicting unit may include information about the estimated direction of the inter mode, information about the reference picture index of the inter mode, information about the motion vector, information about the chroma component of the intra mode, and information about the interpolation method of the intra mode.
In addition, information about the maximum size of the coding unit and information about the maximum depth, defined per picture, slice or GOP, may be inserted into a header of the bitstream, a sequence parameter set (SPS) or a picture parameter set (PPS).
In addition, information about the maximum size and the minimum size of the converter unit permitted for the current video may also be output via a header of the bitstream, an SPS or a PPS.
In video encoder 100, a deeper coding unit may be a coding unit obtained by dividing the height or width of a coding unit of an upper depth (one layer above) into two. In other words, when the size of the coding unit of the current depth is 2N × 2N, the size of the coding unit of the lower depth is N × N. In addition, a coding unit of the current depth having a size of 2N × 2N may include at most four coding units of the lower depth.
Accordingly, video encoder 100 may form the coding units having a tree structure by determining, for each maximum coding unit, coding units having an optimum shape and an optimum size, based on the size of the maximum coding unit and the maximum depth determined in consideration of the features of the current picture. In addition, since encoding may be performed on each maximum coding unit by using any one of various prediction modes and transformations, an optimum coding mode may be determined in consideration of the features of coding units of various image sizes.
Thus, if an image having a high resolution or a large amount of data is encoded in conventional macroblocks, the number of macroblocks per picture increases excessively. The number of pieces of compressed information generated for each macroblock therefore increases, making it difficult to transmit the compressed information and reducing data compression efficiency. By using video encoder 100, however, image compression efficiency may be improved, because the maximum size of the coding unit is increased in consideration of the size of the image while the coding unit is adjusted in consideration of the features of the image.
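The effect of the unit size on the number of units per picture may be illustrated by a short calculation. The sketch assumes a conventional macroblock of 16 × 16 samples, which is not stated in the present disclosure, and counts partially covered units at the picture edges as whole units.

```python
def units_per_picture(width, height, unit_size):
    """Number of unit_size x unit_size blocks needed to cover a picture."""
    columns = (width + unit_size - 1) // unit_size  # ceiling division
    rows = (height + unit_size - 1) // unit_size
    return columns * rows


macroblocks = units_per_picture(1920, 1080, 16)        # assumed 16 x 16 macroblocks
max_coding_units = units_per_picture(1920, 1080, 64)   # 64 x 64 maximum coding units
```

For a 1920 × 1080 picture this gives 8160 macroblocks of 16 × 16 against 510 maximum coding units of 64 × 64, so far fewer top-level units carry per-unit information.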
Fig. 2 is a block diagram of video decoding apparatus 200 based on coding units of a tree structure, according to an embodiment of the present disclosure.
Video decoding apparatus 200, which is based on coding units according to a tree structure, includes symbol acquisition device 220 and image data decoder 230. Hereinafter, for convenience of description, video decoding apparatus 200 that uses video prediction based on coding units according to a tree structure is referred to as "video decoding apparatus 200".
The definitions of the various terms used for the decoding operations of video decoding apparatus 200 (such as coding unit, depth, predicting unit, converter unit, and information about various coding modes) are the same as those described with reference to Fig. 1 and video encoder 100.
Symbol acquisition device 220 receives and parses the bitstream of an encoded video. From the parsed bitstream, symbol acquisition device 220 extracts the encoded image data for each coding unit, where the coding units have a tree structure according to each maximum coding unit, and outputs the extracted image data to image data decoder 230. Symbol acquisition device 220 may extract information about the maximum size of the coding unit of the current picture from a header, SPS or PPS for the current picture.
In addition, symbol acquisition device 220 extracts from the parsed bitstream, for each maximum coding unit, information about the coding depth and coding mode of the coding units having a tree structure. The extracted information about the coding depth and coding mode is output to image data decoder 230. In other words, the image data in the bitstream is divided into maximum coding units so that image data decoder 230 decodes the image data for each maximum coding unit.
The information about the coding depth and coding mode according to maximum coding unit may be set for information about at least one coding unit corresponding to the coding depth, and the information about the coding mode may include information about the divisional type of the corresponding coding unit corresponding to the coding depth, information about the prediction mode, and information about the size of the converter unit. In addition, division information according to depth may be extracted as the information about the coding depth.
The information about the coding depth and coding mode according to each maximum coding unit extracted by symbol acquisition device 220 is information about a coding depth and coding mode determined to generate the minimum encoding error when an encoder (such as video encoder 100) repeatedly performs encoding on each deeper coding unit according to depth for each maximum coding unit. Accordingly, video decoding apparatus 200 may restore an image by decoding the image data according to the coding depth and coding mode that generate the minimum encoding error.
Since encoding information about the coding depth and coding mode may be assigned to a predetermined data unit among the corresponding coding unit, predicting unit and minimum unit, symbol acquisition device 220 may extract the information about the coding depth and coding mode according to the predetermined data units. Predetermined data units to which the same information about the coding depth and coding mode is assigned may be inferred to be data units included in the same maximum coding unit.
Image data decoder 230 restores the current picture by decoding the image data in each maximum coding unit based on the information about the coding depth and coding mode according to maximum coding unit. In other words, image data decoder 230 may decode the encoded image data based on the extracted information about the divisional type, prediction mode and converter unit of each coding unit among the coding units having a tree structure included in each maximum coding unit. The decoding process may include prediction (including intra prediction and motion compensation) and inverse transformation. The inverse transformation may be performed according to an inverse orthogonal transform or an inverse integer transform.
Image data decoder 230 may perform intra prediction or motion compensation according to the subregion and prediction mode of each coding unit, based on the information about the divisional type and prediction mode of the predicting unit of the coding unit according to coding depth.
In addition, for inverse transformation of each maximum coding unit, image data decoder 230 may read converter unit information according to a tree structure for each coding unit to determine the converter units of each coding unit, and may perform inverse transformation based on the converter units of each coding unit. Through the inverse transformation, pixel values of the spatial domain of the coding unit may be restored.
Image data decoder 230 may determine at least one coding depth of the current maximum coding unit by using the division information according to depth. If the division information indicates that the image data is no longer divided at the current depth, the current depth is a coding depth. Accordingly, image data decoder 230 may decode the encoded data of at least one coding unit corresponding to each coding depth in the current maximum coding unit by using the information about the divisional type of the predicting unit, the prediction mode and the size of the converter unit for each coding unit corresponding to the coding depth, and may output the image data of the current maximum coding unit.
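Determining coding depths from division information may be sketched as a recursive walk over division flags. The flag ordering (depth-first, row by row within a coding unit), the absence of a flag at the maximum depth, and the function name are assumptions made for this illustration and do not describe the actual bitstream syntax.

```python
def parse_coding_tree(flags, x, y, size, depth, max_depth):
    """Consume division flags (1 = divide, 0 = coding depth reached) and
    return the resulting coding units as (x, y, size, depth) tuples."""
    if depth < max_depth and next(flags) == 1:
        half = size // 2
        units = []
        for dy in (0, half):
            for dx in (0, half):
                units.extend(
                    parse_coding_tree(flags, x + dx, y + dy, half, depth + 1, max_depth)
                )
        return units
    return [(x, y, size, depth)]


# One 64 x 64 maximum coding unit with a maximum depth of 2 (assumed values).
division_flags = iter([1, 1, 0, 0, 0])
coding_units = parse_coding_tree(division_flags, 0, 0, 64, 0, 2)
```

Here the first 32 × 32 coding unit is divided into four 16 × 16 coding units of depth 2, and the remaining three 32 × 32 coding units have a coding depth of 1.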
In other words, data units containing encoding information that includes the same division information may be gathered by observing the encoding information set assigned to the predetermined data unit among the coding unit, predicting unit and minimum unit, and the gathered data units may be regarded as one data unit to be decoded by image data decoder 230 in the same coding mode. For each coding unit determined in this way, information about the coding mode may be obtained in order to decode the current coding unit.
Video decoding apparatus 200 may obtain information about at least one coding unit that generates the minimum encoding error when encoding is performed recursively for each maximum coding unit, and may use the information to decode the current picture. In other words, the coding units having a tree structure determined to be the optimum coding units in each maximum coding unit may be decoded. In addition, the maximum size of the coding unit is determined in consideration of the resolution and the amount of image data.
Accordingly, even if image data has a high resolution and a large amount of data, the image data may be efficiently decoded and restored by using the size of the coding unit and the coding mode, which are adaptively determined according to the features of the image data by using information about the optimum coding mode received from the encoder.
Fig. 3 is a diagram for describing the concept of coding units according to an embodiment of the present disclosure.
The size of a coding unit may be expressed as width × height, and may be 64 × 64, 32 × 32, 16 × 16 or 8 × 8. A coding unit of 64 × 64 may be divided into subregions of 64 × 64, 64 × 32, 32 × 64 or 32 × 32; a coding unit of 32 × 32 may be divided into subregions of 32 × 32, 32 × 16, 16 × 32 or 16 × 16; a coding unit of 16 × 16 may be divided into subregions of 16 × 16, 16 × 8, 8 × 16 or 8 × 8; and a coding unit of 8 × 8 may be divided into subregions of 8 × 8, 8 × 4, 4 × 8 or 4 × 4.
In video data 310, the resolution is 1920 × 1080, the maximum size of the coding unit is 64, and the maximum depth is 2. In video data 320, the resolution is 1920 × 1080, the maximum size of the coding unit is 64, and the maximum depth is 3. In video data 330, the resolution is 352 × 288, the maximum size of the coding unit is 16, and the maximum depth is 1. The maximum depth shown in Figure 10 indicates the total number of divisions from the maximum coding unit to the minimum coding unit.
If the resolution is high or the amount of data is large, the maximum size of the coding unit may be large so as not only to improve coding efficiency but also to accurately reflect the features of the image. Accordingly, the maximum size of the coding unit of video data 310 and 320, which have a higher resolution than video data 330, may be 64.
Since the maximum depth of video data 310 is 2, the depth is deepened by two layers by dividing the maximum coding unit twice, so coding units 315 of video data 310 may include a maximum coding unit having a long-axis size of 64 and coding units having long-axis sizes of 32 and 16. Meanwhile, since the maximum depth of video data 330 is 1, the depth is deepened by one layer by dividing the maximum coding unit once, so coding units 335 of video data 330 may include a maximum coding unit having a long-axis size of 16 and coding units having a long-axis size of 8.
Since the maximum depth of video data 320 is 3, the depth is deepened by three layers by dividing the maximum coding unit three times, so coding units 325 of video data 320 may include a maximum coding unit having a long-axis size of 64 and coding units having long-axis sizes of 32, 16 and 8. As the depth deepens, details may be expressed more precisely.
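The long-axis sizes listed for video data 310, 320 and 330 follow from halving the maximum size once per depth. A minimal sketch, with a hypothetical function name:

```python
def long_axis_sizes(max_size, max_depth):
    """Long-axis size of the coding units at depths 0 .. max_depth,
    where each division halves the size."""
    return [max_size >> depth for depth in range(max_depth + 1)]


sizes_310 = long_axis_sizes(64, 2)  # video data 310
sizes_320 = long_axis_sizes(64, 3)  # video data 320
sizes_330 = long_axis_sizes(16, 1)  # video data 330
```

The results are 64, 32, 16 for video data 310; 64, 32, 16, 8 for video data 320; and 16, 8 for video data 330.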
Fig. 4 is a block diagram of image encoder 400 based on coding units, according to an embodiment of the present disclosure.
Image encoder 400 performs the operations of coding unit determiner 120 of video encoder 100 to encode image data. In other words, intra predictor 410 performs intra prediction on coding units in the intra mode in current frame 405, and motion estimator 420 and motion compensator 425 perform inter estimation and motion compensation on coding units in the inter mode in current frame 405 by using current frame 405 and reference frame 495.
The data output from intra predictor 410, motion estimator 420 and motion compensator 425 is output as quantized transformation coefficients through converter 430 and quantizer 440. The quantized transformation coefficients are restored to data in the spatial domain through inverse quantizer 460 and inverse converter 470, and the restored data in the spatial domain is output as reference frame 495 after being post-processed by deblocking unit 480 and loop filtering unit 490. The quantized transformation coefficients may be output as bitstream 455 through entropy coder 450.
For image encoder 400 to be applied in video encoder 100, all elements of image encoder 400 (that is, intra predictor 410, motion estimator 420, motion compensator 425, converter 430, quantizer 440, entropy coder 450, inverse quantizer 460, inverse converter 470, deblocking unit 480 and loop filtering unit 490) perform operations based on each coding unit among the coding units having a tree structure, while considering the maximum depth of each maximum coding unit.
Specifically, intra predictor 410, motion estimator 420 and motion compensator 425 determine the subregion and prediction mode of each coding unit among the coding units having a tree structure while considering the maximum size and maximum depth of the current maximum coding unit, and converter 430 determines the size of the converter unit in each coding unit among the coding units having a tree structure.
Fig. 5 is a block diagram of image decoder 500 based on coding units, according to an embodiment of the present disclosure.
Parser 510 parses, from bitstream 505, the encoded image data to be decoded and the information about encoding required for decoding. The encoded image data is output as inverse-quantized data through entropy decoder 520 and inverse quantizer 530, and the inverse-quantized data is restored to image data in the spatial domain through inverse converter 540.
For the image data in the spatial domain, intra predictor 550 performs intra prediction on coding units in the intra mode, and motion compensator 560 performs motion compensation on coding units in the inter mode by using reference frame 585.
The image data in the spatial domain that has passed through intra predictor 550 and motion compensator 560 may be output as restored frame 595 after being post-processed by deblocking unit 570 and loop filtering unit 580. In addition, the image data post-processed by deblocking unit 570 and loop filtering unit 580 may be output as reference frame 585.
To decode the image data in image data decoder 230 of video decoding apparatus 200, image decoder 500 may perform the operations that are performed after the operation of parser 510.
For image decoder 500 to be applied in video decoding apparatus 200, all elements of image decoder 500 (that is, parser 510, entropy decoder 520, inverse quantizer 530, inverse converter 540, intra predictor 550, motion compensator 560, deblocking unit 570 and loop filtering unit 580) perform operations based on coding units having a tree structure for each maximum coding unit.
Specifically, intra predictor 550 and motion compensator 560 perform operations based on the subregion and prediction mode of each coding unit having a tree structure, and inverse converter 540 performs operations based on the size of the converter unit of each coding unit.
Fig. 6 is a diagram showing deeper coding units according to depth, and subregions, according to an embodiment of the present disclosure.
Video encoder 100 and video decoding apparatus 200 use hierarchical coding units in order to consider the features of an image. The maximum height, maximum width and maximum depth of the coding units may be adaptively determined according to the features of the image, or may be set differently by a user. The sizes of the deeper coding units according to depth may be determined according to the predetermined maximum size of the coding unit.
According to an embodiment of the present disclosure, in layered structure 600 of coding units, the maximum height and maximum width of the coding units are each 64, and the maximum depth is 3. In this case, the maximum depth refers to the total number of times the coding unit is divided from the maximum coding unit to the minimum coding unit. Since the depth deepens along the vertical axis of layered structure 600, the height and width of each deeper coding unit are divided. In addition, predicting units and subregions, which are the basis for predictive coding of each deeper coding unit, are shown along the horizontal axis of layered structure 600.
In other words, in layered structure 600, coding unit 610 is the maximum coding unit, where the depth is 0 and the size (that is, height by width) is 64 × 64. The depth deepens along the vertical axis, and there are coding unit 620 having a size of 32 × 32 and a depth of 1, coding unit 630 having a size of 16 × 16 and a depth of 2, and coding unit 640 having a size of 8 × 8 and a depth of 3. Coding unit 640, having a size of 8 × 8 and a depth of 3, is the minimum coding unit with the lowermost depth.
The predicting units and subregions of the coding units are arranged along the horizontal axis according to each depth. In other words, if coding unit 610 having a size of 64 × 64 and a depth of 0 is a predicting unit, the predicting unit may be divided into subregions included in coding unit 610, that is, subregion 610 having a size of 64 × 64, subregions 612 having a size of 64 × 32, subregions 614 having a size of 32 × 64, or subregions 616 having a size of 32 × 32.
Similarly, the predicting unit of coding unit 620 having a size of 32 × 32 and a depth of 1 may be divided into subregions included in coding unit 620, that is, subregion 620 having a size of 32 × 32, subregions 622 having a size of 32 × 16, subregions 624 having a size of 16 × 32, and subregions 626 having a size of 16 × 16.
Similarly, the predicting unit of coding unit 630 having a size of 16 × 16 and a depth of 2 may be divided into subregions included in coding unit 630, that is, a subregion having a size of 16 × 16 included in coding unit 630, subregions 632 having a size of 16 × 8, subregions 634 having a size of 8 × 16, and subregions 636 having a size of 8 × 8.
Similarly, the predicting unit of coding unit 640 having a size of 8 × 8 and a depth of 3 may be divided into subregions included in coding unit 640, that is, a subregion having a size of 8 × 8 included in coding unit 640, subregions 642 having a size of 8 × 4, subregions 644 having a size of 4 × 8, and subregions 646 having a size of 4 × 4.
To determine at least one coding depth of the coding units constituting maximum coding unit 610, coding unit determiner 120 of video encoder 100 performs encoding on the coding units corresponding to each depth included in maximum coding unit 610.
As the depth deepens, the number of deeper coding units according to depth that cover data of the same range and size increases. For example, four coding units corresponding to depth 2 are required to cover the data included in one coding unit corresponding to depth 1. Accordingly, in order to compare the results of encoding the same data according to depth, the coding unit corresponding to depth 1 and the four coding units corresponding to depth 2 are each encoded.
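Because each division halves both the height and the width, the number of deeper coding units needed to cover one coding unit of an upper depth grows by a factor of four per depth. A minimal sketch, with a hypothetical function name:

```python
def units_needed(upper_depth, lower_depth):
    """Number of coding units of lower_depth covering one coding unit of upper_depth."""
    if lower_depth < upper_depth:
        raise ValueError("lower_depth must not be smaller than upper_depth")
    return 4 ** (lower_depth - upper_depth)


depth1_to_depth2 = units_needed(1, 2)
depth0_to_depth3 = units_needed(0, 3)
```

One coding unit of depth 1 is covered by four coding units of depth 2, and one 64 × 64 coding unit of depth 0 is covered by 64 coding units of 8 × 8 at depth 3.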
To perform encoding for a current depth among the depths, a minimum encoding error may be selected for the current depth by performing encoding on each predicting unit in the coding units corresponding to the current depth, along the horizontal axis of layered structure 600. Alternatively, the minimum encoding error may be searched for by performing encoding for each depth as the depth deepens along the vertical axis of layered structure 600 and comparing the minimum encoding errors according to depth. The depth and subregion having the minimum encoding error in coding unit 610 may be selected as the coding depth and divisional type of coding unit 610.
Fig. 7 is a diagram for describing the relationship between coding unit 710 and converter units 720, according to an embodiment of the present disclosure.
Video encoder 100 or video decoding apparatus 200 encodes or decodes an image, for each maximum coding unit, according to coding units having a size less than or equal to the maximum coding unit. The size of the converter unit used for transformation during encoding may be selected based on a data unit that is no larger than the corresponding coding unit.
For example, in video encoder 100 or video decoding apparatus 200, if the size of coding unit 710 is 64 × 64, transformation may be performed by using converter unit 720 having a size of 32 × 32.
In addition, the data of coding unit 710 having a size of 64 × 64 may be encoded by performing transformation on each of the converter units having sizes of 32 × 32, 16 × 16, 8 × 8 and 4 × 4, which are smaller than 64 × 64, and the converter unit having the minimum encoding error may then be selected.
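Selecting the converter unit size may be sketched as trying each candidate size and keeping the one with the smallest encoding error. The default limits of 32 and 4 are taken from the example sizes above and are otherwise assumptions; `error_fn` is a hypothetical stand-in for a real measurement of the encoding error.

```python
def candidate_converter_sizes(cu_size, max_size=32, min_size=4):
    """Candidate converter unit sizes for a coding unit, largest first."""
    sizes = []
    size = min(cu_size, max_size)
    while size >= min_size:
        sizes.append(size)
        size //= 2
    return sizes


def select_converter_size(cu_size, error_fn):
    """Return the candidate size for which error_fn reports the minimum error."""
    return min(candidate_converter_sizes(cu_size), key=error_fn)


candidates = candidate_converter_sizes(64)
# Made-up error measure whose minimum lies at 16, for illustration only.
chosen = select_converter_size(64, lambda size: abs(size - 16))
```

For a 64 × 64 coding unit the candidates are 32, 16, 8 and 4, and with the made-up error measure the 16 × 16 converter unit is chosen.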
Fig. 8 is a diagram for describing encoding information of coding units corresponding to a coding depth, according to an embodiment of the present disclosure.
Output unit 130 of video encoder 100 may encode, for each coding unit corresponding to a coding depth, information 800 about the divisional type, information 810 about the prediction mode and information 820 about the converter unit size, and may transmit information 800, information 810 and information 820 as the information about the coding mode.
Information 800 indicates information about the shape of a subregion obtained by dividing the predicting unit of the current coding unit, where the subregion is a data unit for predictive coding of the current coding unit. For example, current coding unit CU_0 having a size of 2N × 2N may be divided into any one of subregion 802 having a size of 2N × 2N, subregion 804 having a size of 2N × N, subregion 806 having a size of N × 2N and subregion 808 having a size of N × N. Here, information 800 about the divisional type is set to indicate one of subregion 804 having a size of 2N × N, subregion 806 having a size of N × 2N and subregion 808 having a size of N × N.
Information 810 indicates the prediction mode of each subregion. For example, information 810 may indicate the mode of predictive coding performed on the subregion indicated by information 800, that is, intra mode 812, inter mode 814 or skip mode 816.
Information 820 indicates the converter unit on which transformation of the current coding unit is based. For example, the converter unit may be first intra converter unit 822, second intra converter unit 824, first inter converter unit 826 or second inter converter unit 828.
Symbol acquisition device 220 of video decoding apparatus 200 may extract and use information 800, 810 and 820 for decoding, according to each deeper coding unit.
Fig. 9 is a diagram of deeper coding units according to depth, according to an embodiment of the present disclosure.
Division information may be used to indicate a change of depth. The division information indicates whether a coding unit of the current depth is divided into coding units of a lower depth.
Predicting unit 910 for predictive coding of coding unit 900 having a depth of 0 and a size of 2N_0 × 2N_0 may include subregions of the following divisional types: divisional type 912 having a size of 2N_0 × 2N_0, divisional type 914 having a size of 2N_0 × N_0, divisional type 916 having a size of N_0 × 2N_0 and divisional type 918 having a size of N_0 × N_0. Fig. 9 shows only divisional types 912 to 918 obtained by symmetrically dividing predicting unit 910, but the divisional types are not limited thereto, and the subregions of predicting unit 910 may include asymmetric subregions, subregions having a predetermined shape and subregions having a geometric shape.
According to each divisional type, predictive coding is repeatedly performed on one subregion having a size of 2N_0 × 2N_0, two subregions having a size of 2N_0 × N_0, two subregions having a size of N_0 × 2N_0 and four subregions having a size of N_0 × N_0. Predictive coding in the intra mode and the inter mode may be performed on the subregions having sizes of 2N_0 × 2N_0, N_0 × 2N_0, 2N_0 × N_0 and N_0 × N_0. Predictive coding in the skip mode is performed only on the subregion having a size of 2N_0 × 2N_0.
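The combinations of divisional type, subregion and prediction mode that are tried for one predicting unit may be enumerated as in the following sketch. The string labels and function names are hypothetical.

```python
# Number of subregions produced by each symmetric divisional type.
SUBREGION_COUNT = {"2Nx2N": 1, "2NxN": 2, "Nx2N": 2, "NxN": 4}


def allowed_modes(divisional_type):
    """Intra and inter modes apply to every type; skip mode only to 2N x 2N."""
    modes = ["intra", "inter"]
    if divisional_type == "2Nx2N":
        modes.append("skip")
    return modes


def prediction_trials():
    """List every (divisional type, subregion index, mode) that is encoded."""
    return [
        (divisional_type, index, mode)
        for divisional_type, count in SUBREGION_COUNT.items()
        for index in range(count)
        for mode in allowed_modes(divisional_type)
    ]


trials = prediction_trials()
```

Under these assumptions there are 19 trials in total: 3 for 2N × 2N, 4 each for 2N × N and N × 2N, and 8 for N × N.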
The errors of the encoding, including the predictive coding, in the divisional types 912 to 918 are compared, and the minimum encoding error among the divisional types is determined. If the encoding error is smallest in one of the divisional types 912 to 916, the predicting unit 910 may not be divided into a lower depth.
If the encoding error is smallest in the divisional type 918, the depth is changed from 0 to 1 to divide the divisional type 918 in operation 920, and encoding is repeatedly performed on the coding unit 930 having a depth of 1 and a size of N_0 × N_0 to search for the minimum encoding error.
The predicting unit 940 for predictive coding of the coding unit 930 having a depth of 1 and a size of 2N_1 × 2N_1 (= N_0 × N_0) may include subregions of the following divisional types: a divisional type 942 having a size of 2N_1 × 2N_1, a divisional type 944 having a size of 2N_1 × N_1, a divisional type 946 having a size of N_1 × 2N_1, and a divisional type 948 having a size of N_1 × N_1.
If the encoding error is smallest in the divisional type 948, the depth is changed from 1 to 2 to divide the divisional type 948 in operation 950, and encoding is repeatedly performed on the coding unit 960 having a depth of 2 and a size of N_2 × N_2 to search for the minimum encoding error.
When the maximum depth is d, the division operation according to each depth may be performed until the depth becomes d-1, and division information may be encoded for the depths of 0 to d-2. In other words, when encoding is performed up to a depth of d-1 after the coding unit corresponding to a depth of d-2 is divided in operation 970, the predicting unit 990 for predictive coding of the coding unit 980 having a depth of d-1 and a size of 2N_(d-1) × 2N_(d-1) may include subregions of the following divisional types: a divisional type 992 having a size of 2N_(d-1) × 2N_(d-1), a divisional type 994 having a size of 2N_(d-1) × N_(d-1), a divisional type 996 having a size of N_(d-1) × 2N_(d-1), and a divisional type 998 having a size of N_(d-1) × N_(d-1).
Among the divisional types 992 to 998, predictive coding may be repeatedly performed on one subregion having a size of 2N_(d-1) × 2N_(d-1), two subregions having a size of 2N_(d-1) × N_(d-1), two subregions having a size of N_(d-1) × 2N_(d-1), and four subregions having a size of N_(d-1) × N_(d-1), to search for the divisional type having the minimum encoding error.
Even when the divisional type 998 has the minimum encoding error, since the maximum depth is d, the coding unit CU_(d-1) having a depth of d-1 is no longer divided into a lower depth. The coding depth of the coding units constituting the current maximum coding unit 900 is determined to be d-1, and the divisional type of the current maximum coding unit 900 may be determined to be N_(d-1) × N_(d-1). In addition, since the maximum depth is d and the minimum coding unit 980 having the lowest depth of d-1 is no longer divided into a lower depth, division information is not set for the minimum coding unit 980.
The data unit 999 may be a "minimum unit" for the current maximum coding unit. The minimum unit according to an embodiment of the present disclosure may be a rectangular data unit obtained by dividing the minimum coding unit 980 into 4 parts. By repeatedly performing encoding, the video encoder 100 can compare the encoding errors according to the depths of the coding unit 900, select the depth having the minimum encoding error to determine the coding depth, and set the corresponding divisional type and prediction mode as the coding mode of the coding depth.
In this way, the minimum encoding errors according to depth are compared across all the depths of 1 to d, and the depth having the minimum encoding error may be determined as the coding depth. The coding depth, the divisional type of the predicting unit, and the prediction mode may be encoded and transmitted as information about the coding mode. In addition, since the coding unit is divided from a depth of 0 down to the coding depth, only the division information of the coding depth is set to 0, and the division information of the depths other than the coding depth is set to 1.
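The signalling of division information described above can be sketched in Python. This is only an illustration under stated assumptions, not the encoder's actual syntax: the function names `division_flags` and `coding_depth_from_flags` are hypothetical, and the sketch follows a single path from depth 0 down to one coding unit of the coding depth.

```python
def division_flags(coding_depth, max_depth):
    """Division information along the path from depth 0 to the coding depth.

    Depths above the coding depth carry 1 (divided further); the coding
    depth carries 0. A coding unit at depth max_depth - 1 cannot be divided
    any further, so no flag is written for it.
    """
    if not 0 <= coding_depth <= max_depth - 1:
        raise ValueError("coding depth must lie in 0..max_depth-1")
    flags = [1] * coding_depth
    if coding_depth < max_depth - 1:
        flags.append(0)
    return flags


def coding_depth_from_flags(flags, max_depth):
    """Decoder side: the first depth whose flag is 0 is the coding depth."""
    for depth, flag in enumerate(flags):
        if flag == 0:
            return depth
    # Every flag was 1, so the path ended at the minimum coding unit.
    return max_depth - 1
```

With a maximum depth of 4, a coding depth of 2 yields the flags 1, 1, 0, while a coding depth of 3 yields 1, 1, 1 because the minimum coding unit carries no division information.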
The symbol acquisition device 220 of the video decoding apparatus 200 may extract and use the information about the coding depth and the predicting unit of the coding unit 900 to decode the subregion 912. The video decoding apparatus 200 may use the division information according to depth to determine the depth whose division information is 0 as the coding depth, and may perform decoding by using the information about the coding mode of that depth.
Figures 10 to 12 are diagrams for describing the relationship between coding units 1010, predicting units 1060, and converter units 1070, according to an embodiment of the present disclosure.
The coding units 1010 are coding units having a tree structure in a maximum coding unit, corresponding to the coding depths determined by the video encoder 100. The predicting units 1060 are subregions of the predicting units of each of the coding units 1010, and the converter units 1070 are the converter units of each of the coding units 1010.
When the depth of the maximum coding unit among the coding units 1010 is 0, the depths of coding units 1012 and 1054 are 1, the depths of coding units 1014, 1016, 1018, 1028, 1050 and 1052 are 2, the depths of coding units 1020, 1022, 1024, 1026, 1030, 1032 and 1048 are 3, and the depths of coding units 1040, 1042, 1044 and 1046 are 4.
In the predicting units 1060, some coding units 1014, 1016, 1022, 1032, 1048, 1050, 1052 and 1054 are obtained by dividing coding units among the coding units 1010. In other words, the divisional types in coding units 1014, 1022, 1050 and 1054 have a size of 2N × N, the divisional types in coding units 1016, 1048 and 1052 have a size of N × 2N, and the divisional type of coding unit 1032 has a size of N × N. The predicting units and subregions of the coding units 1010 are smaller than or equal to each coding unit.
In the converter units 1070, transformation or inverse transformation is performed on the image data of the coding unit 1052 in a data unit smaller than the coding unit 1052. In addition, the coding units 1014, 1016, 1022, 1032, 1048, 1050 and 1052 in the converter units 1070 differ in size and shape from the coding units 1014, 1016, 1022, 1032, 1048, 1050 and 1052 in the predicting units 1060. In other words, the video encoder 100 and the video decoding apparatus 200 may perform intra prediction, motion estimation, motion compensation, transformation and inverse transformation independently on data units in the same coding unit.
Therefore, encoding is performed recursively on each of the coding units having a hierarchical structure in each region of the maximum coding unit to determine the optimum coding unit, so that coding units having a recursive tree structure can be obtained. The encoded information may include division information about the coding unit, information about the divisional type, information about the prediction mode, and information about the size of the converter unit. Table 1 shows the encoded information that may be set by the video encoder 100 and the video decoding apparatus 200.
[table 1]
The output unit 130 of the video encoder 100 may output the encoded information about the coding units having a tree structure, and the symbol acquisition device 220 of the video decoding apparatus 200 may extract the encoded information about the coding units having a tree structure from a received bitstream.
Division information indicates whether the current coding unit is divided into coding units of a lower depth. If the division information of the current depth d is 0, the depth at which the current coding unit is no longer divided into a lower depth is the coding depth, so the information about the divisional type, the prediction mode, and the size of the converter unit may be defined for that coding depth. If the current coding unit is further divided according to the division information, encoding is performed independently on the four divided coding units of the lower depth.
The prediction mode may be one of an intra mode, an inter mode, and a skip mode. The intra mode and the inter mode may be defined for all divisional types, and the skip mode is defined only for the divisional type having a size of 2N × 2N.
The information about the divisional type may indicate symmetric divisional types having sizes of 2N × 2N, 2N × N, N × 2N and N × N, which are obtained by symmetrically dividing the height or width of the predicting unit, and asymmetric divisional types having sizes of 2N × nU, 2N × nD, nL × 2N and nR × 2N, which are obtained by asymmetrically dividing the height or width of the predicting unit. The asymmetric divisional types having sizes of 2N × nU and 2N × nD may be obtained by dividing the height of the predicting unit in ratios of 1:3 and 3:1, respectively, and the asymmetric divisional types having sizes of nL × 2N and nR × 2N may be obtained by dividing the width of the predicting unit in ratios of 1:3 and 3:1, respectively.
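The subregion dimensions implied by these divisional types can be tabulated with a short Python sketch. The function name `subregion_sizes` and the string labels are hypothetical; only the 1:1, 1:3 and 3:1 ratios come from the description above.

```python
def subregion_sizes(two_n, divisional_type):
    """Return the (width, height) of each subregion of a 2N x 2N predicting unit."""
    n = two_n // 2
    quarter = two_n // 4  # the short side of a 1:3 or 3:1 division
    table = {
        "2Nx2N": [(two_n, two_n)],
        "2NxN": [(two_n, n)] * 2,
        "Nx2N": [(n, two_n)] * 2,
        "NxN": [(n, n)] * 4,
        "2NxnU": [(two_n, quarter), (two_n, two_n - quarter)],  # height 1:3
        "2NxnD": [(two_n, two_n - quarter), (two_n, quarter)],  # height 3:1
        "nLx2N": [(quarter, two_n), (two_n - quarter, two_n)],  # width 1:3
        "nRx2N": [(two_n - quarter, two_n), (quarter, two_n)],  # width 3:1
    }
    return table[divisional_type]
```

For a 32 × 32 predicting unit, the type 2N × nU gives an upper subregion of 32 × 8 and a lower subregion of 32 × 24. In every type the subregions cover the whole predicting unit.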
The size of the converter unit may be set to two types in the intra mode and two types in the inter mode. In other words, if the division information of the converter unit is 0, the size of the converter unit may be 2N × 2N, which is the size of the current coding unit. If the division information of the converter unit is 1, the converter unit may be obtained by dividing the current coding unit. In addition, if the divisional type of the current coding unit having a size of 2N × 2N is a symmetric divisional type, the size of the converter unit may be N × N, and if the divisional type of the current coding unit is an asymmetric divisional type, the size of the converter unit may be N/2 × N/2.
The encoded information about the coding units having a tree structure may include at least one of a coding unit corresponding to the coding depth, a predicting unit, and a minimum unit. The coding unit corresponding to the coding depth may include at least one of a predicting unit and a minimum unit containing the same encoded information.
Therefore, whether adjacent data units are included in the same coding unit corresponding to the coding depth is determined by comparing the encoded information of the adjacent data units. In addition, the coding unit corresponding to a coding depth is determined by using the encoded information of a data unit, and thus the distribution of the coding depths in the maximum coding unit can be determined.
Therefore, if the current coding unit is predicted based on the encoded information of adjacent data units, the encoded information of the data units in the deeper coding units adjacent to the current coding unit may be directly referenced and used.
Alternatively, if the current coding unit is predicted based on the encoded information of adjacent data units, the data units adjacent to the current coding unit are searched for by using the encoded information of the data units, and the adjacent coding units that are found may be referenced to predict the current coding unit.
Figure 13 is a diagram for describing the relationship between a coding unit, a predicting unit or subregion, and a converter unit, according to the coding mode information of Table 1.
The maximum coding unit 1300 includes coding units 1302, 1304, 1306, 1312, 1314, 1316 and 1318 of multiple coding depths. Here, since the coding unit 1318 is a coding unit of a coding depth, its division information may be set to 0. The information about the divisional type of the coding unit 1318 having a size of 2N × 2N may be set to one of the following divisional types: a divisional type 1322 having a size of 2N × 2N, a divisional type 1324 having a size of 2N × N, a divisional type 1326 having a size of N × 2N, a divisional type 1328 having a size of N × N, a divisional type 1332 having a size of 2N × nU, a divisional type 1334 having a size of 2N × nD, a divisional type 1336 having a size of nL × 2N, and a divisional type 1338 having a size of nR × 2N.
The division information of the converter unit (the TU (converter unit) dimension mark) is a type of transformation index. The size of the converter unit corresponding to the transformation index may change according to the predicting unit type or the divisional type of the coding unit.
For example, when the divisional type is set to be symmetric (that is, divisional type 1322, 1324, 1326 or 1328), a converter unit 1342 having a size of 2N × 2N is set if the division information of the converter unit (TU dimension mark) is 0, and a converter unit 1344 having a size of N × N is set if the TU dimension mark is 1.
When the divisional type is set to be asymmetric (that is, divisional type 1332, 1334, 1336 or 1338), a converter unit 1352 having a size of 2N × 2N is set if the TU dimension mark is 0, and a converter unit 1354 having a size of N/2 × N/2 is set if the TU dimension mark is 1.
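The rule for a one-bit TU dimension mark can be written as a small Python lookup. The function name `converter_unit_size` and the string labels for the divisional types are hypothetical; the sizes follow the two cases just described.

```python
SYMMETRIC_TYPES = {"2Nx2N", "2NxN", "Nx2N", "NxN"}
ASYMMETRIC_TYPES = {"2NxnU", "2NxnD", "nLx2N", "nRx2N"}


def converter_unit_size(coding_unit_size, divisional_type, tu_dimension_mark):
    """Side length of the converter unit for a 2N x 2N coding unit."""
    if divisional_type not in SYMMETRIC_TYPES | ASYMMETRIC_TYPES:
        raise ValueError("unknown divisional type")
    if tu_dimension_mark == 0:
        return coding_unit_size  # 2N x 2N
    if tu_dimension_mark != 1:
        raise ValueError("this sketch covers only a one-bit mark")
    if divisional_type in SYMMETRIC_TYPES:
        return coding_unit_size // 2  # N x N
    return coding_unit_size // 4  # N/2 x N/2
```

For a 32 × 32 coding unit with the TU dimension mark set to 1, a symmetric divisional type gives a 16 × 16 converter unit and an asymmetric one gives 8 × 8.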
Referring to Figure 20, the TU dimension mark is a flag having a value of 0 or 1, but the TU dimension mark is not limited to 1 bit, and the converter unit may be hierarchically divided in a tree structure as the TU dimension mark increases from 0. The division information of the converter unit (TU dimension mark) may be an example of a transformation index.
In this case, according to an embodiment of the present disclosure, the size of the converter unit actually used may be expressed by using the TU dimension mark of the converter unit together with the maximum size and the minimum size of the converter unit. According to an embodiment of the present disclosure, the video encoder 100 may encode maximum converter unit size information, minimum converter unit size information, and a maximum TU dimension mark. The result of encoding the maximum converter unit size information, the minimum converter unit size information, and the maximum TU dimension mark may be inserted into an SPS. According to an embodiment of the present disclosure, the video decoding apparatus 200 may decode the video by using the maximum converter unit size information, the minimum converter unit size information, and the maximum TU dimension mark.
For example, (a) if the size of the current coding unit is 64 × 64 and the maximum converter unit size is 32 × 32, then (a-1) the size of the converter unit may be 32 × 32 when the TU dimension mark is 0, (a-2) 16 × 16 when the TU dimension mark is 1, and (a-3) 8 × 8 when the TU dimension mark is 2.
As another example, (b) if the size of the current coding unit is 32 × 32 and the minimum converter unit size is 32 × 32, then (b-1) the size of the converter unit may be 32 × 32 when the TU dimension mark is 0. Here, since the size of the converter unit cannot be smaller than 32 × 32, the TU dimension mark cannot be set to a value other than 0.
As another example, (c) if the size of the current coding unit is 64 × 64 and the maximum TU dimension mark is 1, then the TU dimension mark may be 0 or 1. Here, the TU dimension mark cannot be set to a value other than 0 or 1.
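Examples (a) to (c) can be reproduced with the following Python sketch. It assumes that the converter unit size for a TU dimension mark of 0 is the smaller of the coding unit size and the maximum converter unit size, and that each increment of the mark halves the side length; the function name `allowed_converter_unit_sizes` and its parameter names are hypothetical.

```python
def allowed_converter_unit_sizes(cu_size, max_tu_size, min_tu_size, max_tu_mark):
    """Map each permitted TU dimension mark to a converter unit side length."""
    root = min(cu_size, max_tu_size)  # size when the mark is 0 (assumption)
    sizes = {}
    for mark in range(max_tu_mark + 1):
        size = root >> mark  # halve the side once per increment of the mark
        if size < min_tu_size:
            break  # smaller converter units are not allowed
        sizes[mark] = size
    return sizes


# Example (a): 64 x 64 coding unit, maximum converter unit size 32 x 32.
example_a = allowed_converter_unit_sizes(64, 32, 4, 2)
# Example (b): 32 x 32 coding unit, minimum converter unit size 32 x 32.
example_b = allowed_converter_unit_sizes(32, 32, 32, 2)
# Example (c): 64 x 64 coding unit, maximum TU dimension mark 1.
example_c = allowed_converter_unit_sizes(64, 64, 4, 1)
```

The values not given in the examples (a minimum size of 4 in (a) and (c), a maximum mark of 2 in (a) and (b), a maximum size of 64 in (c)) are placeholders chosen only so that the sketch runs.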
Therefore, if the maximum TU dimension mark is defined as "MaxTransformSizeIndex", the minimum converter unit size as "MinTransformSize", and the converter unit size when the TU dimension mark is 0 as "RootTuSize", then the current minimum converter unit size "CurrMinTuSize" that can be determined in the current coding unit may be defined by equation (1):
CurrMinTuSize = max(MinTransformSize, RootTuSize / (2^MaxTransformSizeIndex)) … (1)
Compared with the current minimum converter unit size "CurrMinTuSize" that can be determined in the current coding unit, the converter unit size "RootTuSize" when the TU dimension mark is 0 may indicate the maximum converter unit size that can be selected in the system. In equation (1), "RootTuSize / (2^MaxTransformSizeIndex)" indicates the converter unit size obtained when "RootTuSize", the converter unit size when the TU dimension mark is 0, is divided the number of times corresponding to the maximum TU dimension mark, and "MinTransformSize" indicates the minimum transform size. Therefore, the larger value among "RootTuSize / (2^MaxTransformSizeIndex)" and "MinTransformSize" may be the current minimum converter unit size "CurrMinTuSize" that can be determined in the current coding unit.
According to an embodiment of the present disclosure, the maximum converter unit size "RootTuSize" may vary according to the type of prediction mode.
For example, if the current prediction mode is an inter mode, "RootTuSize" may be determined by using equation (2) below. In equation (2), "MaxTransformSize" indicates the maximum converter unit size, and "PUSize" indicates the current predicting unit size:
RootTuSize = min(MaxTransformSize, PUSize) … (2)
That is, if the current prediction mode is an inter mode, the converter unit size "RootTuSize" when the TU dimension mark is 0 may be the smaller value among the maximum converter unit size and the current predicting unit size.
If the prediction mode of the current partition unit is an intra mode, "RootTuSize" may be determined by using equation (3) below. In equation (3), "PartitionSize" indicates the size of the current partition unit:
RootTuSize = min(MaxTransformSize, PartitionSize) … (3)
That is, if the current prediction mode is an intra mode, the converter unit size "RootTuSize" when the TU dimension mark is 0 may be the smaller value among the maximum converter unit size and the size of the current partition unit.
However, the current maximum converter unit size "RootTuSize" that varies according to the type of prediction mode in a partition unit is only an example, and one or more embodiments of the present disclosure are not limited thereto.
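Equations (1) to (3) can be checked numerically with a few lines of Python. The function names are hypothetical; the parameters mirror the symbols of the equations, and sizes are side lengths in pixels that are assumed to be powers of two.

```python
def root_tu_size(max_transform_size, unit_size):
    """Equations (2) and (3): unit_size is PUSize (inter) or PartitionSize (intra)."""
    return min(max_transform_size, unit_size)


def curr_min_tu_size(min_transform_size, root_size, max_transform_size_index):
    """Equation (1): the smallest converter unit size usable in the coding unit."""
    return max(min_transform_size, root_size // (2 ** max_transform_size_index))


# A 64 x 64 predicting unit with a maximum converter unit size of 32 x 32.
root = root_tu_size(32, 64)
smallest = curr_min_tu_size(4, root, 2)
```

Here `root` is 32 and `smallest` is 8. If the minimum transform size were raised to 16, the max in equation (1) would hold the result at 16 instead of 8.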
According to the video encoding method based on coding units having a tree structure described with reference to Figures 1 to 13, the image data of the spatial domain is encoded for each coding unit of the tree structure. According to the video decoding method based on coding units having a tree structure, decoding is performed for each maximum coding unit to restore the image data of the spatial domain. Thus, pictures and a video, which is a sequence of pictures, can be restored. The restored video may be reproduced by a reproducing apparatus, stored in a storage medium, or transmitted over a network.
Hereinafter, with reference to Figures 14 to 20, a motion estimation method and a motion compensation method using multiple hypotheses will be described, where the motion estimation method and the motion compensation method are used for the inter-prediction performed in the video decoding method and the video encoding method based on coding units having a tree structure.
Inter-prediction uses the similarity between a current image and another image. A reference region similar to a current region of the current image is detected from a reference image that was restored before the current image. The distance in coordinates between the current region and the reference region is expressed as a motion vector, and the difference between the pixel values of the current region and the pixel values of the reference region is expressed as residual data. Therefore, by performing inter-prediction on the current region, an index indicating the reference image, a motion vector, and residual data can be output instead of directly outputting the image information of the current region.
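The relationship between a current region, a reference region, the motion vector, and the residual data can be illustrated in Python. This is a toy sketch on nested lists with made-up pixel values; the function names are hypothetical, and interpolation and the search for the reference region are left out.

```python
def motion_vector(current_pos, reference_pos):
    """Coordinate offset from the current region to the reference region."""
    return (reference_pos[0] - current_pos[0], reference_pos[1] - current_pos[1])


def residual(current_block, reference_block):
    """Per-pixel difference between the current region and the reference region."""
    return [[c - r for c, r in zip(cur_row, ref_row)]
            for cur_row, ref_row in zip(current_block, reference_block)]


def reconstruct(reference_block, residual_block):
    """Motion compensation: reference region plus residual restores the region."""
    return [[r + d for r, d in zip(ref_row, res_row)]
            for ref_row, res_row in zip(reference_block, residual_block)]


current = [[52, 55], [61, 59]]
reference = [[50, 56], [60, 60]]
difference = residual(current, reference)
```

Only the motion vector, the reference index and `difference` need to be transmitted; adding `difference` back to the reference region restores the current region exactly.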
The operations for inter-prediction can be broadly classified into a motion estimation operation and a motion compensation operation, where the motion estimation operation determines the reference image, the motion vector, and the residual data for the current image, and the motion compensation operation restores the current image by using the reference image, the motion vector, and the residual data. The operation of the motion estimation apparatus 1400 will be described with reference to Figure 14, and the operation of the motion compensation equipment 1500 will be described with reference to Figure 15.
The motion estimation apparatus 1400 and the motion compensation equipment 1500 may perform inter-prediction for each block of each image. The shape of a block may be a square, a rectangle, or any geometric figure. The type of block is not limited to a data unit having a predetermined size.
The data block for inter-prediction may be a predicting unit (or a subregion). As described above, among the coding units having a tree structure, each maximum coding unit may be divided into multiple coding units, and each of the multiple coding units may be divided into one or more predicting units. Even when one coding unit includes multiple predicting units, motion estimation may be performed for each predicting unit, so that a motion vector, residual data, and so on may be determined for each predicting unit.
Therefore, the operations performed by the motion estimation apparatus 1400 and the motion compensation equipment 1500, which perform inter-prediction on the coding units having a tree structure and on the predicting units, will now be described in detail with reference to Figures 14 and 15.
Figure 14 is the block diagram of motion estimation apparatus 1400 according to an embodiment of the present disclosure.
The motion estimation apparatus 1400 includes a motion estimator 1410 and an information output unit 1420.
The motion estimator 1410 may perform motion estimation for each predicting unit included in each coding unit having a tree structure of an image.
The motion estimator 1410 may determine a current motion vector in order to perform inter-prediction on a current predicting unit among the predicting units included in a coding unit. To determine the motion vector during the motion estimation process, an operation for searching for the block most similar to the current predicting unit may be performed. The motion vector may be determined as a vector indicating the difference between the position of the block that is found and the position of the current predicting unit.
To determine more accurately the motion vector relating to the current estimation factor pixel indicated by the current motion vector, the information output unit 1420 may use hypothesis estimation factor pixels in addition to the current estimation factor pixel.
In the present embodiment, the current estimation factor pixel indicated by the current motion vector may be determined in units of pixels having a predetermined precision. When sub-pixels of the reference image are interpolated for inter-prediction, the motion vector may be determined in sub-pixel units. When the current estimation factor pixel is a sub-pixel, the hypothesis estimation factor pixels may be determined among the sub-pixels adjacent to the current estimation factor pixel. Two or more hypothesis estimation factor pixels centered on the current estimation factor pixel in a predetermined direction may be selected.
For example, when the width and height of the reference image are interpolated with a precision of 1/4 pixel units, the current estimation factor pixel may indicate a sub-pixel at any 1/4 pixel position of the interpolated reference image. Therefore, the x coordinate value and the y coordinate value of the current estimation factor pixel may be determined in 1/4 pixel units. In the present embodiment, a hypothesis estimation factor pixel may be determined as a sub-pixel at a 1/4 pixel distance or a 1/2 pixel distance from the current estimation factor pixel. By using hypothesis estimation factor pixels at a sub-pixel distance from the current estimation factor pixel (where the motion vector points into the reference image interpolated in sub-pixel units), the most similar reference block can be determined from the reference image interpolated in sub-pixel units.
The hypothesis estimation factor pixel indicated by the motion vector of the current predicting unit may be a representative pixel of a reference block. For example, the hypothesis estimation factor pixel may be the upper-left pixel of the reference block. Therefore, when a hypothesis estimation factor pixel is determined, a reference block that includes the hypothesis estimation factor pixel and has the same size as the predicting unit can be determined. Thus, when hypothesis estimation factor pixels are determined for the current estimation factor pixel, the reference blocks that include the hypothesis estimation factor pixels can be determined.
In addition, the motion estimator 1410 may determine an estimation factor pixel from among multiple hypothesis estimation factor pixels at a sub-pixel distance from the current estimation factor pixel. The motion estimator 1410 may also generate one estimation factor pixel by merging multiple hypothesis estimation factor pixels.
From among the hypothesis estimation factor pixels at a sub-pixel distance from the current estimation factor pixel, the motion estimator 1410 may select hypothesis estimation factor pixels that lie opposite each other on a straight line centered on the current estimation factor pixel.
For example, from among the hypothesis estimation factor pixels at a sub-pixel distance from the current estimation factor pixel, the motion estimator 1410 may select two hypothesis estimation factor pixels that lie on a straight line centered on the current estimation factor pixel, and may determine the estimation factor pixel by using the two selected hypothesis estimation factor pixels.
The motion estimator 1410 may determine an estimation factor pixel that indicates the average position of the motion vectors of the two selected hypothesis estimation factor pixels. The first selected hypothesis estimation factor pixel indicates a first reference block that includes the first hypothesis estimation factor pixel, and the second selected hypothesis estimation factor pixel indicates a second reference block that includes the second hypothesis estimation factor pixel. Therefore, the reference block indicated by the estimation factor pixel may be a block formed from the average of the pixel values at each pixel position of the first reference block and the second reference block.
Therefore, the motion estimator 1410 may finally determine the reference block by using two hypothesis estimation factor pixels at a sub-pixel distance from the current estimation factor pixel in a predetermined straight-line direction.
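Forming the final reference block from the two hypothesis reference blocks can be sketched in Python as a per-pixel average. The text above only says that the pixel values are averaged; the round-half-up integer rounding used here is an assumption, and the function name `average_block` is hypothetical.

```python
def average_block(first_block, second_block):
    """Average two equally sized reference blocks pixel by pixel.

    The (a + b + 1) >> 1 rounding is an assumption of this sketch, not
    something stated in the description.
    """
    return [[(a + b + 1) >> 1 for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(first_block, second_block)]


first_reference = [[10, 20], [30, 40]]
second_reference = [[12, 21], [30, 43]]
final_reference = average_block(first_reference, second_reference)
```

Each pixel of `final_reference` lies between the corresponding pixels of the two hypothesis blocks, which is why the averaged block can track the current predicting unit more closely than either block alone.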
The motion estimator 1410 may generate residual data between the current predicting unit and the determined reference block.
The information output unit 1420 may output motion vector difference information, which indicates the difference between the current motion vector and the motion vector of a predicting unit encoded before the current predicting unit. The motion vector difference information may be output for each predicting unit. The information output unit 1420 may also output, for each predicting unit, the residual data between the current predicting unit and the reference block.
The information output unit 1420 may output not only the motion vector difference information of the predicting units included in a coding unit but also the hypothesis estimation mode information of the coding unit.
The hypothesis estimation mode information may include information indicating the two hypothesis estimation factor pixels selected from among the hypothesis estimation factor pixels centered on the current estimation factor pixel.
Since the information output unit 1420 selects hypothesis estimation factor pixels at a sub-pixel distance from the current estimation factor pixel in a predetermined straight-line direction, the information output unit 1420 may generate information indicating the combination of the straight-line direction and the sub-pixel distance selected to determine the estimation factor pixel.
For example, hypothesis estimation mode information may be generated and output that indicates a combination of a predetermined sub-pixel distance selected from two or more sub-pixel distances and a predetermined straight-line direction selected from two or more straight-line directions.
The information output unit 1420 may determine the hypothesis estimation mode for each coding unit. The hypothesis estimation mode information may be applied in common to the predicting units included in the coding unit. Thus, the same hypothesis estimation mode may be applied to each predicting unit included in the coding unit.
Therefore, the information output unit 1420 may output motion vector difference information for each predicting unit and hypothesis estimation mode information for each coding unit.
For example, when a coding unit includes a first predicting unit and a second predicting unit, the information output unit 1420 may first output the motion vector difference information of the first predicting unit, then the hypothesis estimation mode information of the coding unit, and then the motion vector difference information of the second predicting unit.
As another example, after the hypothesis estimation mode information of the coding unit is output, the two pieces of motion vector difference information for the first predicting unit and the second predicting unit may be output in sequence.
The two or more sub-pixel distances may include a 1/4 pixel distance and a 1/2 pixel distance. The 1/4 pixel distance and the 1/2 pixel distance may respectively denote the minimum distance between the points obtained by dividing the distance between two adjacent integer pixels into four equal parts, and the minimum distance between the points obtained by dividing that distance into two equal parts. The two or more straight-line directions may include angles of 0 degrees, 90 degrees, 135 degrees and 45 degrees.
Since the hypothesis estimation mode is formed by a combination of a sub-pixel distance selected from the two sub-pixel distances and a straight-line direction selected from the four straight-line directions, the hypothesis estimation mode may include 8 combinations in total.
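The eight combinations, and the two hypothesis estimation factor pixels that each combination selects, can be enumerated in Python. Coordinates are in 1/4 pixel units. The per-direction steps in `DIRECTION_STEPS` are an assumption of this sketch (in particular, a diagonal step moves by the sub-pixel distance along both axes), and all names are hypothetical.

```python
from itertools import product

SUBPEL_DISTANCES = (1, 2)  # 1/4 and 1/2 pixel, in quarter-pixel units
DIRECTION_STEPS = {0: (1, 0), 90: (0, 1), 135: (-1, 1), 45: (1, 1)}


def hypothesis_pair(current, angle, distance):
    """Two positions opposite each other on a line through `current`."""
    step_x, step_y = DIRECTION_STEPS[angle]
    x, y = current
    first = (x + step_x * distance, y + step_y * distance)
    second = (x - step_x * distance, y - step_y * distance)
    return first, second


# Every (distance, direction) combination: 2 distances x 4 directions.
HYPOTHESIS_MODES = list(product(SUBPEL_DISTANCES, DIRECTION_STEPS))
```

For a current estimation factor pixel at (10, 6), the horizontal direction at a 1/4 pixel distance selects (11, 6) and (9, 6). In every mode the midpoint of the pair is the current estimation factor pixel itself.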
The information output unit 1420 may perform entropy coding on the hypothesis estimation mode information. For example, to perform entropy coding by using a context-based adaptive binary arithmetic coding (CABAC) method, the information output unit 1420 may determine a context model for each bin of the hypothesis estimation mode information. For example, if the hypothesis estimation mode information has 4 bits, a context model may be determined for each of the 4 bins, so that 4 context models may be determined.
In addition, the information output unit 1420 may determine the context models of the hypothesis estimation mode information according to the depth of the current coding unit. For example, if there are coding units having 3 depths, each of the 3 depths requires 4 context models for the hypothesis estimation mode information, so the information output unit 1420 may determine 12 context models in total.
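The count of context models follows from one model per bin and per depth, which a short Python sketch makes explicit. The flat indexing scheme in `context_index` is only one possible layout and is an assumption, as are the function names.

```python
def context_count(num_depths, bins_per_symbol=4):
    """One context model per bin for every coding unit depth."""
    return num_depths * bins_per_symbol


def context_index(depth, bin_index, bins_per_symbol=4):
    """Position of a context model in a flat table (hypothetical layout)."""
    return depth * bins_per_symbol + bin_index


total_models = context_count(3)
```

With 3 depths and 4 bins, `total_models` is 12, and `context_index` assigns each (depth, bin) pair a distinct slot from 0 to 11.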
The information output unit 1420 may select 3 context models and perform entropy coding on the hypothesis estimation mode information by using the selected context models, where the 3 context models correspond to the depth of the current coding unit and are among the context models previously determined based on the previous context of the hypothesis estimation mode information.
As described above, the information output unit 1420 may perform entropy encoding on the symbols generated by encoding the video, and may thereby generate and output a bitstream.
The motion estimator 1410 may calculate a rate-distortion (RD) cost for each of the 0°, 90°, 135°, and 45° straight-line directions by using the hypothesis estimator pixels located at the 1/4-pixel distance from the current estimator pixel in that direction. The motion estimator 1410 may then calculate one more RD cost by using the hypothesis estimator pixels located at the 1/2-pixel distance from the current estimator pixel in the direction that produced the smallest of the calculated RD costs. Because 4 RD costs are calculated at the 1/4-pixel distance, one per straight-line direction, and one RD cost is calculated at the 1/2-pixel distance in the selected direction, 5 RD costs are calculated in total.
The minimum RD cost may be determined from among the RD costs calculated for the combinations of direction and sub-pixel distance, and the combination of straight-line direction and sub-pixel distance that corresponds to the minimum RD cost may be selected. The two hypothesis estimator pixels located at the selected sub-pixel distance from the current estimator pixel in the selected straight-line direction may then be determined. The average block of the blocks that respectively include these two hypothesis estimator pixels may be determined as the reference block.
The motion estimation apparatus 1400 may include a central processor (not shown) that generally controls the motion estimator 1410 and the information output unit 1420. Alternatively, the motion estimator 1410 and the information output unit 1420 may each be driven by their own processor (not shown), and the motion estimation apparatus 1400 may operate as these processors interoperate. Alternatively, the motion estimator 1410 and the information output unit 1420 may be controlled by a processor (not shown) external to the motion estimation apparatus 1400.
The motion estimation apparatus 1400 may include at least one data storage unit (not shown) that stores input data and output data of the motion estimator 1410 and the information output unit 1420. The motion estimation apparatus 1400 may also include a memory control unit (not shown) that controls data input to and output from the at least one data storage unit.
As described above, the motion estimation apparatus 1400 determines the reference block by using not only the current motion vector but also hypothesis estimator pixels located at a sub-pixel distance from the current estimator pixel, so the precision of inter prediction can be improved. In addition, the motion estimation apparatus 1400 allows only combinations with a high probability to be used as the combination of direction and sub-pixel distance at which the hypothesis estimator pixels are located relative to the current estimator pixel, so the hypothesis estimator pixels can be selected quickly. Furthermore, the number of bits transmitted for the information about the selected hypothesis estimator pixels is kept to a minimum, so the bit rate of the encoded symbols that include the hypothesis estimation mode information can be improved.
Figure 15 is a block diagram of a motion compensation apparatus 1500 according to an embodiment of the present disclosure.
The motion compensation apparatus 1500 includes an information obtainer 1510, a hypothesis estimation mode determiner 1520, and a motion compensator 1530.
The information obtainer 1510 may obtain, for each prediction unit included in a coding unit, the current motion vector and the residual data of the current prediction unit. Motion vector difference information, rather than the current motion vector, may be obtained for each prediction unit.
When the information obtainer 1510 receives a bitstream of encoded symbols, it may obtain a plurality of pieces of encoding information by parsing the symbols from the bitstream. By parsing the bitstream, the information obtainer 1510 may obtain the hypothesis estimation mode information of the coding unit.
The motion compensation apparatus 1500 may receive an entropy-encoded bitstream. In this case, the information obtainer 1510 may perform entropy decoding on the bitstream to obtain the hypothesis estimation mode information of the coding unit and the motion vector difference information and residual data of the prediction units. Here, the residual data obtained from the bitstream may be restored to residual data in the spatial domain through inverse quantization and inverse transformation.
In another example, in order to generate a reference image for the motion estimation of another image, if the hypothesis estimation mode information of a coding unit and the motion vector difference information of a prediction unit of a previously encoded image are stored in a memory, the motion compensation apparatus 1500 may obtain the hypothesis estimation mode information of the coding unit and the motion vector difference information of the prediction unit from the memory. The residual data may be stored in the form obtained by performing inverse quantization and inverse transformation on the quantized transformation coefficients.
The motion compensation apparatus 1500 obtains one piece of hypothesis estimation mode information for the current coding unit, so that the hypothesis estimation mode information can be applied in common to the prediction units included in the current coding unit.
The hypothesis estimation mode determiner 1520 may determine, based on the obtained hypothesis estimation mode information, a combination of a predetermined sub-pixel distance selected from two or more sub-pixel distances and a predetermined straight-line direction selected from two or more straight-line directions.
The sub-pixel distances include the 1/4-pixel distance and the 1/2-pixel distance, and the straight-line directions include the directions at angles of 0°, 90°, 135°, and 45°, so one of 8 combinations of sub-pixel distance and straight-line direction can be obtained from the hypothesis estimation mode information.
Accordingly, by referring to the hypothesis estimation mode information, the hypothesis estimation mode determiner 1520 may determine one of the following combinations: (1/4-pixel distance, 0° direction), (1/4-pixel distance, 90° direction), (1/4-pixel distance, 135° direction), (1/4-pixel distance, 45° direction), (1/2-pixel distance, 0° direction), (1/2-pixel distance, 90° direction), (1/2-pixel distance, 135° direction), and (1/2-pixel distance, 45° direction).
In addition, the hypothesis estimation mode determiner 1520 may read the hypothesis estimation mode information by performing entropy decoding on the entropy-encoded hypothesis estimation mode information.
The hypothesis estimation mode determiner 1520 may perform entropy decoding on the hypothesis estimation mode information by using the CABAC method, and may thereby interpret, from the hypothesis estimation mode information, the current combination of sub-pixel distance and straight-line direction.
The hypothesis estimation mode determiner 1520 may determine a context model for each bit of the hypothesis estimation mode information. Accordingly, 4 context models may be determined for hypothesis estimation mode information having 4 bits.
The hypothesis estimation mode determiner 1520 may perform entropy decoding on the hypothesis estimation mode information by using the 4 context models corresponding to the depth of the current coding unit. Based on the entropy-decoded hypothesis estimation mode information, the hypothesis estimation mode determiner 1520 may determine the combination of sub-pixel distance and straight-line direction for the current motion vector.
According to the determined combination, a predetermined straight-line direction and a predetermined sub-pixel distance may be selected from the 4 straight-line directions and the 2 sub-pixel distances. The motion compensator 1530 may determine two hypothesis estimator pixels located at the predetermined sub-pixel distance from the current estimator pixel in the predetermined straight-line direction, where the current estimator pixel is indicated by the current motion vector. The motion compensator 1530 may determine the reference block by using the blocks that respectively include the two hypothesis estimator pixels. The motion compensator 1530 may generate a restored block of the current prediction unit by combining the obtained residual data with the determined reference block. The coding unit may be restored by using the restored blocks of its prediction units.
The motion compensation apparatus 1500 may determine the reference block by additionally using hypothesis estimator pixels located at a sub-pixel distance from the pixel indicated by the current motion vector, so the precision of motion compensation can be improved. In addition, the motion compensation apparatus 1500 can quickly select the hypothesis estimator pixels according to the combination of direction and sub-pixel distance indicated by the hypothesis estimation mode information.
The video encoding apparatus 100 and the video decoding apparatus 200, which are based on coding units having a tree structure and are described above with reference to Figures 1 to 13, may include the operations performed by the motion estimation apparatus 1400 and the operations performed by the motion compensation apparatus 1500.
The coding unit determiner 120 of the video encoding apparatus 100 of Figure 1 may perform the operations of the motion estimator 1410 of the motion estimation apparatus 1400 and the operations of the hypothesis estimation mode determiner 1520 and the motion compensator 1530 of the motion compensation apparatus 1500. The output unit 130 of the video encoding apparatus 100 may perform the operations of the information output unit 1420 of the motion estimation apparatus 1400.
The symbol obtainer 220 of the video decoding apparatus 200 shown in Figure 2 may perform the operations of the information obtainer 1510 and the hypothesis estimation mode determiner 1520 of the motion compensation apparatus 1500, and the image data decoder 230 of the video decoding apparatus 200 may perform the operations of the motion compensator 1530 of the motion compensation apparatus 1500.
The motion estimator 420 of the image encoder 400 shown in Figure 4 may perform the operations of the motion estimator 1410 of the motion estimation apparatus 1400, and the motion compensator 425 of the image encoder 400 may perform the operations of the hypothesis estimation mode determiner 1520 and the motion compensator 1530 of the motion compensation apparatus 1500. The entropy encoder 450 of the image encoder 400 may perform the operations of the information output unit 1420 of the motion estimation apparatus 1400.
The parser 510 of the image decoder 500 shown in Figure 5 may perform the operations of the information obtainer 1510 of the motion compensation apparatus 1500, and the entropy decoder 520 may perform the entropy decoding operations of the information obtainer 1510. The motion compensator 560 of the image decoder 500 may perform the operations of the hypothesis estimation mode determiner 1520 to interpret the hypothesis estimation mode information, and may perform the motion compensation operations of the motion compensator 1530.
Figures 16a and 16b show types of hypothesis estimation modes according to an embodiment of the present disclosure.
The current motion vector of the current prediction unit indicates the current estimator pixel 1600. The motion estimation apparatus 1400 and the motion compensation apparatus 1500 according to an embodiment of the present disclosure determine the reference block in sub-pixel units by using only mode information in addition to the current motion vector, where the mode information indicates the direction in which the hypothesis estimator pixels are located relative to the pixel indicated by the current motion vector and the sub-pixel distance by which they are separated from it. The operations that the motion estimation apparatus 1400 and the motion compensation apparatus 1500 perform to determine the hypothesis estimator pixels are described below.
In this example, a hypothesis estimator pixel may indicate a sub-pixel located at the 1/2-pixel distance or the 1/4-pixel distance from the current estimator pixel 1600 in a straight-line direction. In addition, a hypothesis estimator pixel may indicate a sub-pixel separated from the current estimator pixel 1600 in the direction at an angle of 0°, 45°, 90°, or 135°.
Therefore, in the present embodiment, the pixels indicated by the hypothesis estimators, that is, the hypothesis estimator pixels, may be determined from among the sub-pixels 1611, 1612, 1621, 1622, 1631, 1632, 1641, 1642, 1651, 1652, 1661, 1662, 1671, 1672, 1681, and 1682, which are located at the 1/2-pixel distance and the 1/4-pixel distance from the current estimator pixel 1600 in the straight-line directions at angles of 0°, 45°, 90°, and 135°.
To determine the reference block, a plurality of sub-pixels from among the sub-pixels 1611, 1612, 1621, 1622, 1631, 1632, 1641, 1642, 1651, 1652, 1661, 1662, 1671, 1672, 1681, and 1682 may be determined as hypothesis estimator pixels.
For example, a pair of hypothesis estimator pixels may be selected from the sub-pixels 1611, 1612, 1621, 1622, 1631, 1632, 1641, 1642, 1651, 1652, 1661, 1662, 1671, 1672, 1681, and 1682. For example, the selected pair may consist of two of these sub-pixels that are opposite each other with the current estimator pixel 1600 at the center, both lying at the 1/2-pixel distance or both at the 1/4-pixel distance from the current estimator pixel 1600 in one of the straight-line directions at angles of 0°, 45°, 90°, and 135°.
In more detail, a first group 1610 may be determined as a hypothesis estimator pixel group, where the first group 1610 includes the sub-pixels 1611 and 1612, which are located at the 1/4-pixel distance in the straight-line direction at the 0° angle and are opposite each other with the current estimator pixel 1600 at the center. Similarly, a second group 1620, a third group 1630, a fourth group 1640, a fifth group 1650, a sixth group 1660, a seventh group 1670, and an eighth group 1680 may be determined as hypothesis estimator pixel groups. The second group 1620 includes the sub-pixels 1621 and 1622 located at the 1/2-pixel distance in the straight-line direction at the 0° angle. The third group 1630 includes the sub-pixels 1631 and 1632 located at the 1/4-pixel distance in the straight-line direction at the 90° angle. The fourth group 1640 includes the sub-pixels 1641 and 1642 located at the 1/2-pixel distance in the straight-line direction at the 90° angle. The fifth group 1650 includes the sub-pixels 1651 and 1652 located at the 1/4-pixel distance in the straight-line direction at the 135° angle. The sixth group 1660 includes the sub-pixels 1661 and 1662 located at the 1/2-pixel distance in the straight-line direction at the 135° angle. The seventh group 1670 includes the sub-pixels 1671 and 1672 located at the 1/4-pixel distance in the straight-line direction at the 45° angle. The eighth group 1680 includes the sub-pixels 1681 and 1682 located at the 1/2-pixel distance in the straight-line direction at the 45° angle.
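The geometry of these groups can be illustrated with a short sketch. The following Python is only an illustration under stated assumptions, not the patented implementation: positions are expressed in quarter-pixel units, the y axis grows downward, the 135° direction is taken to run toward the lower right and the 45° direction toward the lower left, and a diagonal step is assumed to move the same amount along both axes. The names `hypothesis_pair`, `QUARTER_PEL_STEPS`, and `DIRECTION_STEP` are hypothetical.

```python
# Number of quarter-pel steps for each sub-pixel distance named in the text.
QUARTER_PEL_STEPS = {"1/4": 1, "1/2": 2}

# Unit step per direction as (dx, dy), with y growing downward.
# ASSUMPTION: 135 deg runs toward the lower right, 45 deg toward the lower
# left, and a diagonal step moves the same amount on both axes.
DIRECTION_STEP = {0: (1, 0), 90: (0, 1), 135: (1, 1), 45: (-1, 1)}


def hypothesis_pair(center, direction_deg, distance):
    """Return the two positions opposite each other around `center`.

    `center` is the (x, y) position of the current estimator pixel,
    expressed in quarter-pel units.
    """
    step = QUARTER_PEL_STEPS[distance]
    ux, uy = DIRECTION_STEP[direction_deg]
    cx, cy = center
    first = (cx + ux * step, cy + uy * step)
    second = (cx - ux * step, cy - uy * step)
    return first, second
```

For any direction and distance, the two returned positions are symmetric about `center`, which mirrors the description of each group as a pair of sub-pixels opposite each other around the current estimator pixel.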
The motion estimation apparatus 1400 and the motion compensation apparatus 1500 may finally determine, as the reference block, the average block of the reference blocks indicated by the sub-pixels included in the hypothesis estimator pixel group.
For example, when the seventh group 1670 is selected as the hypothesis estimator pixel group, if the sub-pixel 1671 indicates a first reference block and the sub-pixel 1672 indicates a second reference block, an average block may be generated from the averages of the pixel values at each pixel position of the first reference block and the second reference block, and the average block may then be determined as the reference block.
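The per-position averaging described above can be sketched as follows. This is a minimal illustration that assumes integer sample values and round-half-up averaging; the text only says "average value" and does not specify a rounding rule. The function name `average_block` is hypothetical.

```python
def average_block(block_a, block_b):
    """Average two equally sized blocks pixel by pixel.

    Blocks are lists of rows of integer sample values.
    ASSUMPTION: round-half-up via (a + b + 1) >> 1; the source text does
    not state how the average is rounded.
    """
    if len(block_a) != len(block_b):
        raise ValueError("blocks must have the same number of rows")
    averaged = []
    for row_a, row_b in zip(block_a, block_b):
        if len(row_a) != len(row_b):
            raise ValueError("rows must have the same length")
        averaged.append([(a + b + 1) >> 1 for a, b in zip(row_a, row_b)])
    return averaged
```

Here `block_a` and `block_b` stand for the first and second reference blocks indicated by the two sub-pixels of the selected group.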
In this example, the hypothesis estimation mode information may indicate the group selected from among the hypothesis estimator pixel groups (that is, the first to eighth groups 1610, 1620, 1630, 1640, 1650, 1660, 1670, and 1680). Furthermore, because a hypothesis estimator pixel group includes the sub-pixels located at a particular sub-pixel distance from the current estimator pixel 1600 in a particular straight-line direction, the hypothesis estimation mode information may indicate the combination of that straight-line direction and that sub-pixel distance.
For example, the hypothesis estimation mode information may be expressed as "mode N". Mode 1 corresponds to the first group 1610, so mode 1 may indicate the combination of the 0° direction and the 1/4-pixel distance. Similarly, mode 2 corresponds to the third group 1630, so mode 2 may indicate the combination of the 90° direction and the 1/4-pixel distance. Mode 3 corresponds to the fifth group 1650, so mode 3 may indicate the combination of the 135° direction and the 1/4-pixel distance. Mode 4 corresponds to the seventh group 1670, so mode 4 may indicate the combination of the 45° direction and the 1/4-pixel distance. Mode 5 corresponds to the second group 1620, so mode 5 may indicate the combination of the 0° direction and the 1/2-pixel distance. Mode 6 corresponds to the fourth group 1640, so mode 6 may indicate the combination of the 90° direction and the 1/2-pixel distance. Mode 7 corresponds to the sixth group 1660, so mode 7 may indicate the combination of the 135° direction and the 1/2-pixel distance. Mode 8 corresponds to the eighth group 1680, so mode 8 may indicate the combination of the 45° direction and the 1/2-pixel distance.
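The mode assignment listed above amounts to a small lookup table. The sketch below restates it in Python; the names `MODE_TABLE`, `mode_to_combination`, and `combination_to_mode` are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

# (direction in degrees, sub-pixel distance in pixels) for modes 1..8,
# following the "mode N" assignment given in the text.
MODE_TABLE = {
    1: (0, Fraction(1, 4)),
    2: (90, Fraction(1, 4)),
    3: (135, Fraction(1, 4)),
    4: (45, Fraction(1, 4)),
    5: (0, Fraction(1, 2)),
    6: (90, Fraction(1, 2)),
    7: (135, Fraction(1, 2)),
    8: (45, Fraction(1, 2)),
}


def mode_to_combination(mode):
    """Return (direction_deg, distance) for a mode number 1..8."""
    return MODE_TABLE[mode]


def combination_to_mode(direction_deg, distance):
    """Return the mode number for a (direction, distance) combination."""
    for mode, combination in MODE_TABLE.items():
        if combination == (direction_deg, distance):
            return mode
    raise ValueError("no mode for this combination")
```

One property visible in the table is that the 1/2-pixel mode for a given direction is always the corresponding 1/4-pixel mode plus 4.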
Figure 17 shows the combinations of direction and distance indicated by the hypothesis estimation modes, together with their symbol values, according to an embodiment of the present disclosure.
For example, Figure 17 shows the combinations of direction and sub-pixel distance of the hypothesis estimator pixels, where each combination corresponds to one mode of the hypothesis estimation mode information, and Figure 17 also shows the bit symbol corresponding to each mode.
Mode 0 does not correspond to any value of direction or sub-pixel distance. Mode 0 may be applied when hypothesis estimator pixels in sub-pixel units are not used to determine the reference block. Modes 1 to 4 may be applied to the groups of hypothesis estimator pixels whose sub-pixel distance is the 1/4-pixel distance, and modes 5 to 8 may be applied to the groups of hypothesis estimator pixels whose sub-pixel distance is the 1/2-pixel distance.
In addition, among the pieces of hypothesis estimation mode information that indicate the same sub-pixel distance, the mode value may increase in the order of the 0° angle (horizontal), the 90° angle (vertical), the 135° angle (toward the lower right), and the 45° angle (toward the lower left).
In the table shown in Figure 17, the symbol values applied to the modes other than mode 0 are each defined as 4 bits. The bits of a mode symbol value are described here starting from the leftmost bit. When the first bit of the mode symbol value is 1, the symbol value may indicate mode 0. When the first bit of the mode symbol value is 0, the symbol value may indicate a mode other than mode 0. The second bit of the mode symbol value may indicate whether the sub-pixel distance is the 1/4-pixel distance or the 1/2-pixel distance. The third bit of the mode symbol value may indicate whether the direction is a diagonal direction or a non-diagonal direction. The fourth bit of the mode symbol value may indicate whether the direction is the horizontal direction or the vertical direction, or whether the diagonal direction is at the 135° angle or the 45° angle.
Therefore, when the information output unit 1420 of the motion estimation apparatus 1400 outputs the hypothesis estimation mode information, the information output unit 1420 may determine the mode value of the hypothesis estimation mode information according to the hypothesis estimator pixels determined by the motion estimator 1410, and may output, according to the table of Figure 17, the bit string of the symbol value corresponding to the determined mode value.
In addition, the hypothesis estimation mode determiner 1520 of the motion compensation apparatus 1500 may sequentially read the first to fourth bits of the bit string of the hypothesis estimation mode information parsed by the information obtainer 1510. From the results of reading the first to fourth bits in order, it can be determined whether the hypothesis estimation mode is mode 0, whether the sub-pixel distance is the 1/4-pixel distance or the 1/2-pixel distance, whether the direction is a diagonal direction, and whether the direction is horizontal or vertical or, for a diagonal direction, whether it is at the 135° angle or the 45° angle.
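The sequential bit reading described above can be sketched as a small decoder. The table of Figure 17 is not reproduced in this text, so the polarity of the second to fourth bits is an assumption made here purely for illustration: 0 is taken to mean the 1/4-pixel distance, a non-diagonal direction, and the first direction of each pair (0° or 135°). Only the meaning of the first bit (1 indicates mode 0) comes from the description. The function name `decode_mode_symbol` is hypothetical.

```python
def decode_mode_symbol(bits):
    """Decode one hypothesis estimation mode symbol from an iterable of bits.

    Returns (mode, distance, direction_deg); distance and direction are None
    for mode 0. ASSUMPTION: bit polarities for bits 2 to 4 are illustrative
    only, since Figure 17 is not available in this text.
    """
    it = iter(bits)
    if next(it) == 1:
        # First bit 1: mode 0, no hypothesis estimator pixels are used.
        return 0, None, None
    half_pel = next(it)   # assumed: 0 = 1/4-pixel, 1 = 1/2-pixel
    diagonal = next(it)   # assumed: 0 = non-diagonal, 1 = diagonal
    second = next(it)     # assumed: 0 = 0 or 135 deg, 1 = 90 or 45 deg
    distance = "1/2" if half_pel else "1/4"
    if diagonal:
        direction = 45 if second else 135
    else:
        direction = 90 if second else 0
    # With these polarities the three bits map onto modes 1..8 in the
    # order 0, 90, 135, 45 degrees within each distance.
    mode = 1 + 4 * half_pel + 2 * diagonal + second
    return mode, distance, direction
```

Under these assumed polarities the decoded mode numbers agree with the "mode N" assignment given earlier, for example the bits 0, 1, 1, 0 yield mode 7 (135°, 1/2-pixel distance).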
The information output unit 1420 of the motion estimation apparatus 1400 may perform entropy encoding on the hypothesis estimation mode information. For example, the hypothesis estimation mode information may be entropy-encoded by using the CABAC method. To perform the CABAC method, a bit string may be generated by binarizing the hypothesis estimation mode information, and a context model may be determined for each bin of the bit string. Accordingly, 4 context models may be determined for hypothesis estimation mode information having 4 bits. The hypothesis estimation mode information may be determined for each coding unit. In addition, the context models may be determined according to the depth of the coding unit. For the pieces of hypothesis estimation mode information of coding units having the same depth, entropy encoding may be performed by using the same context models.
The motion estimation apparatus 1400 may perform inter prediction on coding units of sizes 64 × 64, 32 × 32, and 16 × 16. In this case, the coding units have 3 depths, and 4 context models are determined for the hypothesis estimation mode information at each of the 3 depths, so the motion estimation apparatus 1400 may determine 12 context models for the pieces of hypothesis estimation mode information.
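The count of 12 context models follows from 3 depths times 4 bins, and one way to organize them is a flat index per (depth, bin) pair. The sketch below is an illustrative layout only, assuming that depth 0 corresponds to the 64 × 64 coding unit; the names `context_index`, `CODING_UNIT_SIZES`, and `BINS_PER_SYMBOL` are hypothetical.

```python
BINS_PER_SYMBOL = 4                # bins of the hypothesis estimation mode symbol
CODING_UNIT_SIZES = (64, 32, 16)   # ASSUMPTION: depth 0 is the 64x64 coding unit

NUM_CONTEXT_MODELS = len(CODING_UNIT_SIZES) * BINS_PER_SYMBOL


def context_index(cu_size, bin_index):
    """Map (coding unit size, bin position) to a flat context model index."""
    depth = CODING_UNIT_SIZES.index(cu_size)  # raises ValueError if unknown
    if not 0 <= bin_index < BINS_PER_SYMBOL:
        raise ValueError("bin_index must be in 0..3")
    return depth * BINS_PER_SYMBOL + bin_index
```

With this layout, coding units of the same depth share the same 4 context models, and each bin of the symbol uses a different model.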
The motion compensation apparatus 1500 may restore the symbol value by performing entropy decoding on the parsed hypothesis estimation mode information. In this case, context models may be used separately according to the depth of the coding unit, and a different context model may be used for each bin of the hypothesis estimation mode information.
As described above, one piece of hypothesis estimation mode information may be determined for one coding unit. For example, the information output unit 1420 of the motion estimation apparatus 1400 may first transmit the motion vector difference information of a first prediction unit among the prediction units of the current coding unit, may then transmit the hypothesis estimation mode information of the current coding unit, and may then transmit the motion vector difference information of a second prediction unit. In this case, the information obtainer 1510 of the motion compensation apparatus 1500 may first parse the motion vector difference information of the first prediction unit of the current coding unit, may then parse the hypothesis estimation mode information of the current coding unit, and may then parse the motion vector difference information of the second prediction unit.
However, the transmission of the hypothesis estimation mode information assigned to a coding unit is not limited to the manner described above.
Figure 18 shows the hypothesis estimation modes that are test targets for RD cost according to an embodiment of the present disclosure.
The motion estimator 1410 of the motion estimation apparatus 1400 may select, from among a plurality of hypothesis estimator pixel groups, the hypothesis estimator pixel group that produces the minimum RD cost, and may then determine the hypothesis estimation mode according to the combination of direction and sub-pixel distance of the selected hypothesis estimator pixel group.
For example, the RD costs generated in the encoding operations that involve motion estimation using each group may be determined, from the first group 1610 of hypothesis estimator pixels corresponding to mode 1 through the eighth group 1680 of hypothesis estimator pixels corresponding to mode 8. The RD costs may then be compared across the modes, and the mode that produces the minimum RD cost may be selected.
As described above, a combination of hypothesis estimator pixels may be determined for each of the sub-pixel distances and each of the directions at the 4 angles, so the RD costs may be compared for 8 hypothesis estimator pixel groups.
In another embodiment, the motion estimation apparatus 1400 may first determine RD costs for the groups of hypothesis estimator pixels that are close to the current estimator pixel 1600, at the 1/4-pixel distance from it. Then, in the direction of the mode that produces the minimum RD cost among the modes with the 1/4-pixel distance, the motion estimation apparatus 1400 may additionally determine an RD cost for the group of hypothesis estimator pixels located at the 1/2-pixel distance.
Referring to the table of Figure 18, when determining and comparing the RD costs of modes 1, 2, 3, and 4 at the 1/4-pixel distance shows that mode 1 produces the minimum RD cost, an RD cost may additionally be determined for mode 5, which is the group of hypothesis estimator pixels located at the 1/2-pixel distance in the same direction as mode 1. The mode that produces the smaller RD cost may then be finally selected from between mode 1 and mode 5.
Similarly, when mode 2 is selected as a result of comparing the RD costs at the 1/4-pixel distance, the RD cost of mode 6 may additionally be determined and the comparison performed again. When mode 3 is selected first, the RD cost of mode 7 may further be compared with the RD cost of mode 3. When mode 4 is selected first, the RD cost of mode 8 may further be compared with the RD cost of mode 4.
Accordingly, the motion estimation apparatus 1400 according to another embodiment may determine and compare the RD costs of 5 hypothesis estimator pixel groups, namely the 4 modes at the 1/4-pixel distance and one further mode at the 1/2-pixel distance, and may thereby determine the best hypothesis estimator pixel group and the best hypothesis estimation mode.
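The two-stage comparison can be sketched as follows. This is a minimal illustration, not the actual encoder logic: `rd_cost` is a hypothetical callable that returns the RD cost of encoding with a given mode, the search ignores mode 0, and a tie between the two finalists is resolved in favor of the 1/4-pixel mode, which the text does not specify.

```python
def select_mode_two_stage(rd_cost):
    """Pick a hypothesis estimation mode with 5 RD cost evaluations.

    `rd_cost` is a hypothetical callable: mode number -> RD cost.
    Stage 1 evaluates modes 1..4 (1/4-pixel distance); stage 2 evaluates
    only the 1/2-pixel mode in the best stage-1 direction (mode + 4).
    """
    quarter_costs = {mode: rd_cost(mode) for mode in (1, 2, 3, 4)}
    best_quarter = min(quarter_costs, key=quarter_costs.get)

    half_mode = best_quarter + 4
    half_cost = rd_cost(half_mode)

    # ASSUMPTION: on a tie, keep the 1/4-pixel mode.
    if half_cost < quarter_costs[best_quarter]:
        return half_mode, half_cost
    return best_quarter, quarter_costs[best_quarter]
```

Compared with evaluating all 8 modes, this evaluates only 5, at the price of never testing the 1/2-pixel modes in the three directions that lost at the 1/4-pixel distance.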
Figure 19 shows a flowchart of a motion estimation method according to an embodiment of the present disclosure.
In operation 1910, a current motion vector may be determined for the inter prediction of a current prediction unit from among the prediction units included in a coding unit. A previously determined motion vector may also be obtained.
In operation 1920, two hypothesis estimator pixels may be determined from among the hypothesis estimator pixels located at a predetermined sub-pixel distance from the current estimator pixel, the two being located in a predetermined straight-line direction with the current estimator pixel at the center, where the current estimator pixel is indicated by the current motion vector.
One sub-pixel distance may be selected from the 1/4-pixel distance and the 1/2-pixel distance, and one direction may be selected from the straight-line directions at angles of 0°, 90°, 45°, and 135°. The two hypothesis estimator pixels may be determined according to the combination of the selected sub-pixel distance and the selected direction.
First, RD costs may be calculated by using the hypothesis estimator pixels located at the 1/4-pixel distance from the current estimator pixel in each of the straight-line directions at angles of 0°, 90°, 45°, and 135°. An RD cost may then be additionally calculated by using the hypothesis estimator pixels located at the 1/2-pixel distance in the direction that produced the minimum RD cost among the RD costs calculated at the 1/4-pixel distance. The straight-line direction and the sub-pixel distance of the hypothesis estimator pixels that yield the minimum RD cost among the 5 calculated RD costs may be finally determined.
The reference block may be determined by using the blocks that respectively include the two determined hypothesis estimator pixels. A motion vector indicating the positional difference between the current prediction unit and the reference block may be determined. Residual data, which is the difference between the pixel values of the current prediction unit and the pixel values of the reference block, may be determined.
In operation 1930, the hypothesis estimation mode information of the coding unit may be output, where the hypothesis estimation mode information indicates the combination of the predetermined sub-pixel distance selected from two or more sub-pixel distances and the predetermined straight-line direction selected from two or more straight-line directions.
In this example, a hypothesis estimation mode is a combination of one sub-pixel distance selected from two sub-pixel distances and one straight-line direction selected from four straight-line directions, so the hypothesis estimation mode may be selected from 8 combinations.
In another embodiment, motion vector difference information may be output instead of the motion vector, where the motion vector difference information indicates the difference between the current motion vector and the motion vector of a prediction unit encoded before the current prediction unit.
In another embodiment, the hypothesis estimation mode information of the coding unit may be output together with the motion vector difference information and the residual data of the prediction units included in the coding unit.
In order to determine hypothesis estimation mould according to the depth of coding unit to estimation model information execution entropy coding is assumed The context model of formula information.In addition, can be by using 4 context models corresponding with the depth of current coded unit to vacation If estimation model information carries out entropy coding.
Figure 20 shows a flowchart of a motion compensation method according to an embodiment of the present disclosure.
In operation 2010, a motion vector of a prediction unit included in a coding unit and hypothesis estimation mode information of the coding unit may be obtained.
In another embodiment, motion vector difference information and the hypothesis estimation mode information of the coding unit may be obtained, wherein the motion vector difference information indicates the difference between the current motion vector and the motion vector of a prediction unit that was encoded before the current prediction unit. In addition, residual data between each prediction unit and its reference block may be obtained for each prediction unit. Quantized transformation coefficients may be obtained for each prediction unit, and inverse quantization and inverse transformation may then be performed on the quantized transformation coefficients, so that the residual data is obtained.
In the present embodiment, a context model of the hypothesis estimation mode information may be determined according to the depth of the coding unit: 4 context models corresponding to the depth of the current coding unit may be determined, and entropy decoding may be performed on each bin of the 4-bit hypothesis estimation mode information by using the respective context model.
In operation 2020, a combination of a predetermined sub-pixel distance selected from two or more sub-pixel distances and a predetermined linear direction selected from two or more linear directions may be determined according to the hypothesis estimation mode information.
According to the hypothesis estimation mode information, a combination of a sub-pixel distance selected from a 1/4-pixel distance and a 1/2-pixel distance and a direction selected from the linear directions having angles of 0°, 90°, 135°, and 45° may be determined.
In operation 2030, a reference block may be determined by using blocks that include two hypothesis estimator sub-pixels located, in the predetermined linear direction, at the predetermined sub-pixel distance from the current estimator sub-pixel, wherein the current estimator sub-pixel is indicated by the current motion vector.
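The geometry of operation 2030 can be illustrated with a short sketch that places the two hypothesis estimator sub-pixels symmetrically about the position indicated by the current motion vector. Several details here are assumptions not stated in the text: positions are integers in quarter-pixel units, the vertical axis points downward, and for the diagonal directions the selected distance is applied to each axis.

```python
from typing import Tuple

Position = Tuple[int, int]  # (x, y) in quarter-pixel units (assumed)

# Unit steps per linear direction; the y axis is assumed to point downward.
STEP = {0: (1, 0), 90: (0, 1), 45: (1, -1), 135: (-1, -1)}


def hypothesis_positions(
    current: Position, direction_deg: int, distance_qpel: int
) -> Tuple[Position, Position]:
    """Return two positions opposite each other about `current`.

    distance_qpel is 1 for a 1/4-pixel distance and 2 for a 1/2-pixel distance.
    """
    dx, dy = STEP[direction_deg]
    ox, oy = dx * distance_qpel, dy * distance_qpel
    x, y = current
    return (x + ox, y + oy), (x - ox, y - oy)
```

Because the two offsets are equal and opposite, the midpoint of the returned positions is always the current estimator sub-pixel itself.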
A restored block of the current prediction unit may be generated by combining the residual data obtained in operation 2010 with the reference block determined in operation 2030.
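A minimal sketch of this combining step is shown below, assuming blocks are given as nested lists of integers and that restored samples are clipped to the valid range for an assumed bit depth of 8. Neither the data layout nor the clipping is stated in the text.

```python
from typing import List

Block = List[List[int]]


def reconstruct_block(residual: Block, reference: Block, bit_depth: int = 8) -> Block:
    """Add residual samples to reference samples and clip to the sample range.

    The clipping and the default bit depth are assumptions for illustration.
    """
    max_val = (1 << bit_depth) - 1
    return [
        [min(max(r + p, 0), max_val) for r, p in zip(res_row, ref_row)]
        for res_row, ref_row in zip(residual, reference)
    ]
```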
According to the video encoding method based on coding units having a tree structure, described with reference to Figures 6 to 19, image data of the spatial domain is encoded for each coding unit of the tree structure. According to the video decoding method based on coding units having a tree structure, decoding is performed for each maximum coding unit to restore the image data of the spatial domain. Thus, a picture and a video, which is a picture sequence, may be restored. The restored video may be reproduced by a reproducing apparatus, stored in a storage medium, or transmitted via a network.
Embodiments of the present disclosure may be written as computer programs and may be implemented in general-purpose digital computers that execute the programs by using a computer-readable recording medium. Examples of the computer-readable recording medium include magnetic storage media (e.g., ROM, floppy disks, hard disks, etc.) and optical recording media (e.g., CD-ROMs or DVDs).
While the present disclosure has been particularly shown and described with reference to exemplary embodiments thereof, it will be understood by those of ordinary skill in the art that various changes in form and details may be made in the exemplary embodiments without departing from the spirit and scope of the present disclosure as defined by the claims.
A program for executing one or more embodiments of each of the multi-view motion estimation method, the motion compensation method, the video encoding apparatus, and the video decoding apparatus described with reference to Figures 1 to 20 is stored in a computer-readable storage medium, so that an independent computer system can easily perform the operations according to the embodiments stored in the storage medium.
For convenience of description, the video encoding method including the motion estimation method and the motion compensation method described with reference to Figures 1 to 20 will be collectively referred to as "the video encoding method according to the present disclosure". In addition, the video decoding method including the motion compensation method described with reference to Figures 1 to 20 will be referred to as "the video decoding method according to the present disclosure".
A video encoding apparatus including the video encoding apparatus 100, the image encoder 400, the motion estimation apparatus 1400, or the motion compensation apparatus 1500, described with reference to Figures 1 to 20, will be referred to as "the video encoding apparatus according to the present disclosure". In addition, a video decoding apparatus including the video decoding apparatus 200, the image decoder 500, or the motion compensation apparatus 1500, described with reference to Figures 1 to 18, will be referred to as "the video decoding apparatus according to the present disclosure".
A computer-readable recording medium storing a program according to an embodiment of the present disclosure (for example, a disc 26000) will now be described in detail.
Figure 21 shows the physical structure of the disc 26000 storing a program according to an embodiment of the present disclosure. The disc 26000, which is a storage medium, may be a hard drive, a compact disc read-only memory (CD-ROM) disc, a Blu-ray disc, or a digital versatile disc (DVD). The disc 26000 includes a plurality of concentric tracks Tr, each of which is divided into a specific number of sectors Se in the circumferential direction of the disc 26000. In a specific region of the disc 26000, a program that executes the motion estimation method, the motion compensation method, the video encoding method, and the video decoding method described above may be assigned and stored.
A computer system implemented by using a storage medium that stores a program for executing the video encoding method and the video decoding method described above will now be described with reference to Figure 22.
Figure 22 shows a disc drive 26300 for recording and reading a program by using the disc 26000. A computer system 26500 may store, in the disc 26000 via the disc drive 26300, a program that executes at least one of the video encoding method and the video decoding method according to an embodiment of the present disclosure. To run the program stored in the disc 26000 in the computer system 26500, the program may be read from the disc 26000 and transmitted to the computer system 26500 by using the disc drive 26300.
A program that executes at least one of the video encoding method and the video decoding method according to an embodiment of the present disclosure may be stored not only in the disc 26000 shown in Figures 21 and 22 but also in a memory card, a ROM cassette, or a solid-state drive (SSD).
A system to which the video encoding method and the video decoding method described above are applied will be described below.
Figure 23 shows the overall structure of a content supply system 11000 for providing a content distribution service. The service area of a communication system is divided into cells of a predetermined size, and wireless base stations 11700, 11800, 11900, and 12000 are installed in these cells, respectively.
The content supply system 11000 includes a plurality of independent devices. For example, independent devices such as a computer 12100, a personal digital assistant (PDA) 12200, a video camera 12300, and a mobile phone 12500 are connected to the Internet 11100 via an Internet service provider 11200, a communication network 11400, and the wireless base stations 11700, 11800, 11900, and 12000.
However, the content supply system 11000 is not limited to the structure shown in Figure 23, and devices may be selectively connected to it. The plurality of independent devices may also be connected directly to the communication network 11400, rather than via the wireless base stations 11700, 11800, 11900, and 12000.
The video camera 12300 is an imaging device capable of capturing video images, for example, a digital video camera. The mobile phone 12500 may employ at least one communication method from among various protocols (for example, Personal Digital Communications (PDC), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (W-CDMA), Global System for Mobile Communications (GSM), and Personal Handyphone System (PHS)).
The video camera 12300 may be connected to a streaming server 11300 via the wireless base station 11900 and the communication network 11400. The streaming server 11300 allows content received from a user via the video camera 12300 to be streamed via a real-time broadcast. The content received from the video camera 12300 may be encoded by the video camera 12300 or the streaming server 11300. Video data captured by the video camera 12300 may be transmitted to the streaming server 11300 via the computer 12100.
Video data captured by a camera 12600 may also be transmitted to the streaming server 11300 via the computer 12100. Like a digital camera, the camera 12600 is an imaging device capable of capturing both still images and video images. The video data captured by the camera 12600 may be encoded by the camera 12600 or the computer 12100. Software that performs encoding and decoding of video may be stored in a computer-readable recording medium that can be accessed by the computer 12100 (for example, a CD-ROM disc, a floppy disk, a hard disk drive, an SSD, or a memory card).
If video data is captured by a camera built into the mobile phone 12500, the video data may be received from the mobile phone 12500.
The video data may also be encoded by a large-scale integrated circuit (LSI) system installed in the video camera 12300, the mobile phone 12500, or the camera 12600.
According to an embodiment of the present disclosure, the content supply system 11000 may encode content data recorded by a user using the video camera 12300, the camera 12600, the mobile phone 12500, or another imaging device (for example, content recorded during a concert), and may transmit the encoded content data to the streaming server 11300. The streaming server 11300 may transmit the encoded content data, in the form of streaming content, to other clients that request the content data.
The clients are devices capable of decoding the encoded content data, for example, the computer 12100, the PDA 12200, the video camera 12300, or the mobile phone 12500. Thus, the content supply system 11000 allows the clients to receive and reproduce the encoded content data. In addition, the content supply system 11000 allows the clients to receive the encoded content data in real time and to decode and reproduce it, thereby enabling personal broadcasting.
The encoding and decoding operations of the plurality of independent devices included in the content supply system 11000 may be similar to those of the video encoding apparatus and the video decoding apparatus according to embodiments of the present disclosure.
The mobile phone 12500 included in the content supply system 11000 according to an embodiment of the present disclosure will now be described in greater detail with reference to Figures 24 and 25.
Figure 24 shows the external structure of the mobile phone 12500 to which the video encoding method and the video decoding method according to an embodiment of the present disclosure are applied. The mobile phone 12500 may be a smart phone, the functions of which are not limited and most of the functions of which may be changed or extended.
The mobile phone 12500 includes an internal antenna 12510 via which radio-frequency (RF) signals may be exchanged with the wireless base station 12000 of Figure 25, and includes a display screen 12520 (for example, a liquid crystal display (LCD) or an organic light-emitting diode (OLED) screen) for displaying images captured by a camera 12530 or images received via the antenna 12510 and decoded. The mobile phone 12500 includes an operation panel 12540 including control buttons and a touch panel. If the display screen 12520 is a touch screen, the operation panel 12540 further includes a touch-sensing panel of the display screen 12520. The mobile phone 12500 includes a speaker 12580 or another type of sound output unit for outputting voice and sound, and a microphone 12550 or another type of sound input unit for inputting voice and sound. The mobile phone 12500 further includes the camera 12530, such as a charge-coupled device (CCD) camera, for capturing video and still images. The mobile phone 12500 may further include a storage medium 12570 for storing encoded/decoded data (for example, video or still images) captured by the camera 12530, received via email, or obtained in various other ways, and a slot 12560 via which the storage medium 12570 is loaded into the mobile phone 12500. The storage medium 12570 may be a flash memory, for example, a secure digital (SD) card or an electrically erasable and programmable read-only memory (EEPROM) included in a plastic case.
Figure 25 shows the internal structure of the mobile phone 12500 according to an embodiment of the present disclosure. In order to systematically control the parts of the mobile phone 12500, including the display screen 12520 and the operation panel 12540, a power supply circuit 12700, an operation input controller 12640, an image encoding unit 12720, a camera interface 12630, an LCD controller 12620, an image decoding unit 12690, a multiplexer/demultiplexer 12680, a recording/reading unit 12670, a modulation/demodulation unit 12660, and a sound processor 12650 are connected to a central controller 12710 via a synchronization bus 12730.
If a user operates a power button to switch from a "power off" state to a "power on" state, the power supply circuit 12700 supplies power from a battery pack to all the parts of the mobile phone 12500, thereby setting the mobile phone 12500 to an operation mode.
Central controller 12710 includes central processing unit (CPU), ROM and random access memory (RAM).
While the mobile phone 12500 transmits communication data to the outside, a digital signal is generated in the mobile phone 12500 under the control of the central controller 12710. For example, the sound processor 12650 may generate a digital sound signal, the image encoding unit 12720 may generate a digital image signal, and text data of a message may be generated via the operation panel 12540 and the operation input controller 12640. When the digital signal is transmitted to the modulation/demodulation unit 12660 under the control of the central controller 12710, the modulation/demodulation unit 12660 modulates the frequency band of the digital signal, and a communication circuit 12610 performs digital-to-analog conversion (DAC) and frequency conversion on the band-modulated digital sound signal. A transmission signal output from the communication circuit 12610 may be transmitted to a voice communication base station or the wireless base station 12000 via the antenna 12510.
For example, when the mobile phone 12500 is in a call mode, a sound signal obtained via the microphone 12550 is transformed into a digital sound signal by the sound processor 12650 under the control of the central controller 12710. The digital sound signal may be transformed into a transmission signal via the modulation/demodulation unit 12660 and the communication circuit 12610, and may be transmitted via the antenna 12510.
When a text message (for example, an email) is transmitted in a data communication mode, the text data of the text message is input via the operation panel 12540 and is transmitted to the central controller 12710 via the operation input controller 12640. Under the control of the central controller 12710, the text data is transformed into a transmission signal via the modulation/demodulation unit 12660 and the communication circuit 12610, and is transmitted to the wireless base station 12000 via the antenna 12510.
In order to transmit image data in the data communication mode, image data captured by the camera 12530 is provided to the image encoding unit 12720 via the camera interface 12630. The captured image data may be displayed directly on the display screen 12520 via the camera interface 12630 and the LCD controller 12620.
The structure of the image encoding unit 12720 may correspond to that of the video encoding apparatus 100 described above. The image encoding unit 12720 may transform the image data received from the camera 12530 into compressed and encoded image data according to the video encoding method used by the video encoding apparatus 100 or the image encoder 400 described above, and may then output the encoded image data to the multiplexer/demultiplexer 12680. During a recording operation of the camera 12530, a sound signal obtained by the microphone 12550 of the mobile phone 12500 may be transformed into digital sound data via the sound processor 12650, and the digital sound data may be transmitted to the multiplexer/demultiplexer 12680.
The multiplexer/demultiplexer 12680 multiplexes the encoded image data received from the image encoding unit 12720 together with the sound data received from the sound processor 12650. The result of multiplexing the data may be transformed into a transmission signal via the modulation/demodulation unit 12660 and the communication circuit 12610, and may then be transmitted via the antenna 12510.
When the mobile phone 12500 receives communication data from the outside, frequency recovery and analog-to-digital conversion (ADC) may be performed on a signal received via the antenna 12510 to transform the signal into a digital signal. The modulation/demodulation unit 12660 modulates the frequency band of the digital signal. The band-modulated digital signal is transmitted to the image decoding unit 12690, the sound processor 12650, or the LCD controller 12620, according to the type of the digital signal.
In the call mode, the mobile phone 12500 amplifies a signal received via the antenna 12510 and obtains a digital sound signal by performing frequency conversion and ADC on the amplified signal. Under the control of the central controller 12710, the received digital sound signal is transformed into an analog sound signal via the modulation/demodulation unit 12660 and the sound processor 12650, and the analog sound signal is output via the speaker 12580.
When data of a video file accessed on an Internet website is received in the data communication mode, a signal received from the wireless base station 12000 via the antenna 12510 is output as multiplexed data via the modulation/demodulation unit 12660, and the multiplexed data is transmitted to the multiplexer/demultiplexer 12680.
In order to decode the multiplexed data received via the antenna 12510, the multiplexer/demultiplexer 12680 demultiplexes the multiplexed data into an encoded video data stream and an encoded audio data stream. Via the synchronization bus 12730, the encoded video data stream and the encoded audio data stream are provided to the image decoding unit 12690 and the sound processor 12650, respectively.
The structure of image decoding unit 12690 can be corresponding to the structure of video decoding apparatus 200 described above.Image solution Code unit 12690 can be according to the video decoding side used by video decoding apparatus 200 described above or image decoder 500 Method is decoded the video data after coding to obtain the video data of recovery, and will restore via LCD controller 12620 Video data be supplied to display screen 12520.
Therefore, the data of the video file accessed on internet site can be shown on display screen 12520.Meanwhile Audio data can be transformed into analoging sound signal by Sound Processor Unit 12650, and analoging sound signal is supplied to loudspeaker 12580.Therefore, the audio number for including in the video file accessed on internet site can also be reproduced in via loudspeaker 12580 According to.
The mobile phone 12500 or another type of communication terminal may be a transceiving terminal including both a video encoding apparatus and a video decoding apparatus according to an embodiment of the present disclosure, a transceiving terminal including only the video encoding apparatus, or a transceiving terminal including only the video decoding apparatus.
A communication system according to the present disclosure is not limited to the communication system described above with reference to Figure 24. For example, Figure 26 shows a digital broadcasting system employing a communication system, according to an embodiment of the present disclosure. The digital broadcasting system of Figure 26 may receive a digital broadcast transmitted via a satellite or a terrestrial network by using the video encoding apparatus and the video decoding apparatus according to embodiments of the present disclosure.
In more detail, a broadcasting station 12890 transmits a video data stream to a communication satellite or a broadcasting satellite 12900 by using radio waves. The broadcasting satellite 12900 transmits a broadcast signal, and the broadcast signal is transmitted to a satellite broadcast receiver via a household antenna 12860. In each house, the encoded video stream may be decoded and reproduced by a TV receiver 12810, a set-top box 12870, or another device.
When a video decoding apparatus according to an embodiment of the present disclosure is implemented in a reproducing apparatus 12830, the reproducing apparatus 12830 may parse and decode an encoded video stream recorded on a storage medium 12820 (such as a disc or a memory card) to restore digital signals. Thus, the restored video signal may be reproduced, for example, on a monitor 12840.
A video decoding apparatus according to an embodiment of the present disclosure may be installed in the set-top box 12870 connected to the antenna 12860 for satellite/terrestrial broadcasting or to a cable antenna 12850 for receiving cable television (TV) broadcasts. Data output from the set-top box 12870 may also be reproduced on a TV monitor 12880.
As another example, video decoding apparatus according to an embodiment of the present disclosure can be mounted in TV receiver 12810, Rather than in set-top box 12870.
Automobile 12920 including appropriate antenna 12910 can receive the letter sent from satellite 12900 or wireless base station 11700 Number.Decoded video can be reproduced on the display screen for the auto-navigation system 12930 being mounted in automobile 12920.
A video signal may be encoded by a video encoding apparatus according to an embodiment of the present disclosure and may then be stored in a storage medium. Specifically, an image signal may be stored in a DVD disc 12960 by a DVD recorder, or may be stored in a hard disk by a hard disk recorder 12950. As another example, the video signal may be stored in an SD card 12970. If the hard disk recorder 12950 includes a video decoding apparatus according to an embodiment of the present disclosure, a video signal recorded on the DVD disc 12960, the SD card 12970, or another storage medium may be reproduced on the TV monitor 12880.
The automobile navigation system 12930 may not include the camera 12530, the camera interface 12630, and the image encoding unit 12720 of Figure 23. For example, the computer 12100 and the TV receiver 12810 may also not include the camera 12530, the camera interface 12630, and the image encoding unit 12720 of Figure 23.
Figure 27 shows the cloud computing system according to an embodiment of the present disclosure using video encoder and video decoding apparatus The network structure of system.
Cloud computing system may include cloud computing server 14000, customer data base (DB) 14100, multiple computing resources 14200 and user terminal.
In response to a request from a user terminal, the cloud computing system provides an on-demand outsourcing service of the plurality of computing resources 14200 via a data communication network (for example, the Internet). In a cloud computing environment, a service provider provides users with desired services by using virtualization technology to combine computing resources located at data centers in physically different locations. A service user does not have to install computing resources (for example, applications, storage, an operating system (OS), and security software) on his or her own terminal in order to use them, but may select and use desired services from among the services in a virtual space generated through the virtualization technology, at a desired point in time.
A user terminal of a specified service user is connected to the cloud computing server 14000 via a data communication network including the Internet and a mobile communication network. Cloud computing services, and in particular a video reproduction service, may be provided to user terminals from the cloud computing server 14000. The user terminals may be various types of electronic devices capable of connecting to the Internet, for example, a desktop PC 14300, a smart TV 14400, a smart phone 14500, a notebook computer 14600, a portable media player (PMP) 14700, a tablet PC 14800, and the like.
The cloud computing server 14000 may combine the plurality of computing resources 14200 distributed in a cloud network and may provide the result of the combination to user terminals. The plurality of computing resources 14200 may include various data services and may include data uploaded from user terminals. As described above, the cloud computing server 14000 may provide desired services to user terminals by combining, according to the virtualization technology, video databases distributed in different regions.
User information about users who have subscribed to the cloud computing service is stored in the user DB 14100. The user information may include login information, addresses, names, and personal credit information of the users. The user information may also include indexes of videos. Here, the indexes may include a list of videos that have already been reproduced, a list of videos that are being reproduced, a pausing point of a video that was being reproduced, and the like.
Information about a video stored in the user DB 14100 may be shared between user devices. For example, when a video service is provided to the notebook computer 14600 in response to a request from the notebook computer 14600, the reproduction history of the video service is stored in the user DB 14100. When a request to reproduce this video service is received from the smart phone 14500, the cloud computing server 14000 searches for and reproduces this video service based on the user DB 14100. When the smart phone 14500 receives a video data stream from the cloud computing server 14000, the process of reproducing video by decoding the video data stream is similar to the operation of the mobile phone 12500 described above with reference to Figure 27.
The cloud computing server 14000 may refer to the reproduction history of a desired video service stored in the user DB 14100. For example, the cloud computing server 14000 receives, from a user terminal, a request to reproduce a video stored in the user DB 14100. If this video was being reproduced before, the method of streaming this video performed by the cloud computing server 14000 may vary according to the request from the user terminal, that is, according to whether the video is to be reproduced from its start or from its pausing point. For example, if the user terminal requests reproduction of the video from its start, the cloud computing server 14000 transmits streaming data of the video starting from its first frame to the user terminal. If the user terminal requests reproduction of the video from its pausing point, the cloud computing server 14000 transmits streaming data of the video starting from the frame corresponding to the pausing point to the user terminal.
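The choice of the first frame to stream can be sketched as below. This is only an illustration of the branching described above: the function name, the zero-based frame index, and the use of `None` for a video that has no stored pausing point are all assumptions.

```python
from typing import Optional


def first_frame_to_stream(from_pause_point: bool, pause_frame: Optional[int]) -> int:
    """Pick the frame index at which streaming should begin.

    from_pause_point: True if the terminal asked to resume from the pausing point.
    pause_frame: stored pausing point as a frame index, or None if there is none.
    """
    if from_pause_point and pause_frame is not None:
        return pause_frame
    return 0  # start of the video (first frame, zero-based index assumed)
```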
In this case, the user terminal may include the video decoding apparatus described above with reference to Figures 1 to 20. As another example, the user terminal may include the video encoding apparatus described above with reference to Figures 1 to 20. Alternatively, the user terminal may include both the video decoding apparatus and the video encoding apparatus described above with reference to Figures 1 to 20.
Various applications of the video encoding method, the video decoding method, the video encoding apparatus, and the video decoding apparatus according to embodiments of the present disclosure, described above with reference to Figures 1 to 20, have been described above with reference to Figures 21 to 27. However, methods of storing the video encoding method and the video decoding method in a storage medium, or methods of implementing the video encoding apparatus and the video decoding apparatus in a device, according to various embodiments of the present disclosure, are not limited to the embodiments described above with reference to Figures 21 to 27.

Claims (13)

1. A motion compensation method using motion vector estimators, the motion compensation method comprising:
obtaining a current motion vector of a prediction unit included in a coding unit, and obtaining hypothesis estimation mode information of the coding unit;
determining a combination of a sub-pixel distance and a linear direction based on the hypothesis estimation mode information, wherein the sub-pixel distance is selected from two or more predetermined sub-pixel distances, and the linear direction is selected from two or more predetermined linear directions; and
determining a reference block by using two blocks respectively including two hypothesis estimator sub-pixels,
wherein the two hypothesis estimator sub-pixels are opposite to each other on a straight line in the selected linear direction centered on a current estimator sub-pixel, and are each located at the selected sub-pixel distance from the current estimator sub-pixel,
the current estimator sub-pixel is indicated by the current motion vector,
wherein the hypothesis estimation mode information indicates one combination from among a plurality of combinations of a sub-pixel distance selected from the two or more predetermined sub-pixel distances and a linear direction selected from the two or more predetermined linear directions,
the obtaining of the hypothesis estimation mode information of the coding unit comprises: performing entropy decoding on the hypothesis estimation mode information by using a context model corresponding to a depth of the coding unit, and
the determining of the combination of the sub-pixel distance and the linear direction comprises: determining, for the current motion vector, the combination of the sub-pixel distance and the linear direction based on the entropy-decoded hypothesis estimation mode information.
2. The motion compensation method of claim 1, wherein the obtaining of the hypothesis estimation mode information comprises:
obtaining the hypothesis estimation mode information and motion vector difference information, wherein the motion vector difference information indicates a difference between the current motion vector and a motion vector of a prediction unit encoded before the current prediction unit; and
obtaining residual data between the current prediction unit and the reference block,
wherein the motion compensation method further comprises: generating a restored block of the current prediction unit by combining the residual data with the reference block.
3. The motion compensation method of claim 1, wherein the obtaining of the hypothesis estimation mode information comprises obtaining hypothesis estimation mode information that is determined in common for the prediction units included in a current coding unit.
4. The motion compensation method of claim 1, wherein the two or more predetermined sub-pixel distances include a 1/4-pixel distance and a 1/2-pixel distance, and the two or more predetermined straight-line directions include directions at angles of 0 degrees, 90 degrees, 135 degrees, and 45 degrees,
wherein the hypothesis estimation mode information indicates one of eight combinations of a sub-pixel distance selected from among the two or more predetermined sub-pixel distances and a straight-line direction selected from among the two or more predetermined straight-line directions.
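The eight combinations of claim 4 (two distances times four directions) can be enumerated in a small lookup table. The sketch below is only illustrative: the ordering of the table, and therefore which index maps to which combination, is an assumption, as are the names `MODE_TABLE`, `decode_mode`, and `encode_mode`.

```python
from itertools import product

DISTANCES_QPEL = (1, 2)      # 1/4-pixel and 1/2-pixel, in quarter-pel units
ANGLES_DEG = (0, 90, 135, 45)

# Assumed ordering: all directions at 1/4 pixel first, then at 1/2 pixel.
MODE_TABLE = tuple(product(DISTANCES_QPEL, ANGLES_DEG))


def decode_mode(mode_index):
    """Map a mode index (0..7) to a (distance_qpel, angle_deg) pair."""
    return MODE_TABLE[mode_index]


def encode_mode(distance_qpel, angle_deg):
    """Map a (distance_qpel, angle_deg) pair back to its mode index."""
    return MODE_TABLE.index((distance_qpel, angle_deg))
```

Because there are exactly eight entries, a mode index fits in three bins before entropy coding, although the actual binarization is not stated in the claims.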
5. The motion compensation method of claim 1, wherein the obtaining of the hypothesis estimation mode information of the coding unit comprises:
determining a context model for the hypothesis estimation mode information according to the depth of the coding unit; and
performing entropy decoding on the hypothesis estimation mode information by using four context models corresponding to the depth of the current coding unit.
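One way to picture depth-dependent context selection is a set of four independently adapting probability models, one of which is picked from the coding unit depth. The sketch below is a simplified stand-in and not the arithmetic coder of any real codec: the clamping of depths above 3, the counting-based probability estimate, and the names `context_index_for_depth` and `AdaptiveBinModel` are all assumptions.

```python
NUM_CONTEXTS = 4


def context_index_for_depth(depth):
    """Pick one of four context models from the coding unit depth.

    Assumed mapping: depths 0..3 map to themselves, deeper units share
    the last model.
    """
    if depth < 0:
        raise ValueError("depth must be non-negative")
    return min(depth, NUM_CONTEXTS - 1)


class AdaptiveBinModel:
    """Toy adaptive model: estimates P(bin == 1) from running counts."""

    def __init__(self):
        self.ones = 1
        self.total = 2

    def probability_of_one(self):
        return self.ones / self.total

    def update(self, bin_value):
        self.ones += bin_value
        self.total += 1


# One model per depth class; each adapts only to bins coded at that depth.
contexts = [AdaptiveBinModel() for _ in range(NUM_CONTEXTS)]
```

Keeping a separate model per depth lets the statistics of the mode information at large coding units differ from those at small ones, which is the apparent motivation for tying the context to depth.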
6. A motion estimation method using motion vector estimators, the motion estimation method comprising:
determining a current motion vector for inter prediction of a current prediction unit among prediction units included in a coding unit;
determining a reference block by using two blocks that respectively include two hypothesis estimator pixels, wherein the two hypothesis estimator pixels lie opposite each other on a straight line in a straight-line direction selected from among two or more predetermined straight-line directions, centered on a current estimator pixel, the two hypothesis estimator pixels are among a plurality of hypothesis estimator pixels spaced apart from the current estimator pixel by a sub-pixel distance selected from among two or more predetermined sub-pixel distances, and the current estimator pixel is indicated by the current motion vector; and
outputting hypothesis estimation mode information of the coding unit and outputting motion vector difference information of the current prediction unit, wherein the hypothesis estimation mode information indicates a combination of the sub-pixel distance selected from among the two or more predetermined sub-pixel distances and the straight-line direction selected from among the two or more predetermined straight-line directions,
wherein the hypothesis estimation mode information indicates one combination among a plurality of combinations of a sub-pixel distance selected from among the two or more predetermined sub-pixel distances and a straight-line direction selected from among the two or more predetermined straight-line directions, and
wherein the outputting of the hypothesis estimation mode information of the coding unit comprises performing entropy encoding on the hypothesis estimation mode information by using a context model corresponding to a depth of the coding unit.
7. The motion estimation method of claim 6, wherein the outputting of the hypothesis estimation mode information and the outputting of the motion vector difference information comprise:
outputting the hypothesis estimation mode information and the motion vector difference information, wherein the motion vector difference information indicates a difference between the current motion vector and a motion vector of a prediction unit encoded before the current prediction unit; and
outputting residual data between the current prediction unit and the reference block.
8. The motion estimation method of claim 6, wherein the outputting of the hypothesis estimation mode information comprises outputting hypothesis estimation mode information that is determined in common for the prediction units included in a current coding unit.
9. The motion estimation method of claim 6, wherein the two or more predetermined sub-pixel distances include a 1/4-pixel distance and a 1/2-pixel distance, and the two or more predetermined straight-line directions include directions at angles of 0 degrees, 90 degrees, 135 degrees, and 45 degrees,
wherein the hypothesis estimation mode information indicates one of eight combinations of a sub-pixel distance selected from among the two or more predetermined sub-pixel distances and a straight-line direction selected from among the two or more predetermined straight-line directions.
10. The motion estimation method of claim 6, wherein the outputting of the hypothesis estimation mode information comprises:
determining a context model for the hypothesis estimation mode information according to the depth of the coding unit; and
performing entropy encoding on the hypothesis estimation mode information by using four context models corresponding to the depth of the current coding unit.
11. The motion estimation method of claim 6, wherein the determining of the reference block comprises:
calculating rate-distortion (RD) costs by using hypothesis estimator pixels spaced apart from the current estimator pixel by a 1/4-pixel distance in each of the straight-line directions at angles of 0 degrees, 90 degrees, 135 degrees, and 45 degrees;
calculating an RD cost by using hypothesis estimator pixels spaced apart from the current estimator pixel by a 1/2-pixel distance in the direction that yields the minimum RD cost among the RD costs;
determining the straight-line direction and the sub-pixel distance that yield the minimum RD cost among the RD costs; and
determining the reference block, wherein the reference block is an average block of the blocks that respectively include the hypothesis estimator pixels determined based on the straight-line direction and the sub-pixel distance that yield the minimum RD cost among the RD costs.
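The search order of claim 11 can be sketched as a two-stage procedure: evaluate the four directions at a 1/4-pixel distance, re-evaluate only the best direction at a 1/2-pixel distance, keep whichever is cheaper, and average the two corresponding blocks. In this minimal Python sketch the cost function is supplied by the caller, and the names `two_stage_search` and `average_block`, the tie-breaking toward the 1/4-pixel distance, and the rounding in the average are assumptions for illustration.

```python
ANGLES_DEG = (0, 90, 135, 45)


def two_stage_search(rd_cost):
    """Return (angle_deg, distance_qpel, cost) with the lowest RD cost.

    rd_cost -- callable (angle_deg, distance_qpel) -> numeric cost, where
               distance_qpel is 1 for 1/4 pixel and 2 for 1/2 pixel.
    """
    # Stage 1: all four directions at the 1/4-pixel distance.
    quarter_costs = {angle: rd_cost(angle, 1) for angle in ANGLES_DEG}
    best_angle = min(quarter_costs, key=quarter_costs.get)

    # Stage 2: only the best direction at the 1/2-pixel distance.
    half_cost = rd_cost(best_angle, 2)
    if half_cost < quarter_costs[best_angle]:
        return best_angle, 2, half_cost
    return best_angle, 1, quarter_costs[best_angle]


def average_block(block_a, block_b):
    """Sample-wise average of two blocks, rounding half up (assumed)."""
    return [
        [(a + b + 1) >> 1 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(block_a, block_b)
    ]
```

The two-stage order needs five cost evaluations instead of the eight that an exhaustive search over all combinations would require.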
12. A motion compensation apparatus using motion vector estimators, the motion compensation apparatus comprising:
an information obtainer that obtains residual data and a current motion vector of a current prediction unit among prediction units included in a coding unit, and obtains hypothesis estimation mode information of the coding unit;
a hypothesis estimation mode determiner that determines a combination of a sub-pixel distance and a straight-line direction based on the hypothesis estimation mode information, wherein the sub-pixel distance is selected from among two or more predetermined sub-pixel distances and the straight-line direction is selected from among two or more predetermined straight-line directions; and
a motion compensator that determines a reference block by using two blocks that respectively include two hypothesis estimator pixels, and generates a restored block of the current prediction unit by combining the residual data with the reference block,
wherein the two hypothesis estimator pixels lie opposite each other on a straight line in the selected straight-line direction, centered on a current estimator pixel, and are each spaced apart from the current estimator pixel by the selected sub-pixel distance,
wherein the current estimator pixel is indicated by the current motion vector,
wherein the hypothesis estimation mode information indicates one combination among a plurality of combinations of a sub-pixel distance selected from among the two or more predetermined sub-pixel distances and a straight-line direction selected from among the two or more predetermined straight-line directions,
wherein the information obtainer performs entropy decoding on the hypothesis estimation mode information by using a context model corresponding to a depth of the coding unit, and
wherein the hypothesis estimation mode determiner determines, based on the entropy-decoded hypothesis estimation mode information, the combination of the sub-pixel distance and the straight-line direction for the current motion vector.
13. A motion estimation apparatus using motion vector estimators, the motion estimation apparatus comprising:
a motion estimator that determines a current motion vector for inter prediction of a current prediction unit among prediction units included in a coding unit, and determines a reference block by using two blocks that respectively include two hypothesis estimator pixels, wherein the two hypothesis estimator pixels lie opposite each other on a straight line in a straight-line direction selected from among two or more predetermined straight-line directions, centered on a current estimator pixel, the two hypothesis estimator pixels are among a plurality of hypothesis estimator pixels spaced apart from the current estimator pixel by a sub-pixel distance selected from among two or more predetermined sub-pixel distances, and the current estimator pixel is indicated by the current motion vector; and
an information output unit that outputs hypothesis estimation mode information of the coding unit and outputs motion vector difference information of the current prediction unit, wherein the hypothesis estimation mode information indicates a combination of the sub-pixel distance selected from among the two or more predetermined sub-pixel distances and the straight-line direction selected from among the two or more predetermined straight-line directions,
wherein the hypothesis estimation mode information indicates one combination among a plurality of combinations of a sub-pixel distance selected from among the two or more predetermined sub-pixel distances and a straight-line direction selected from among the two or more predetermined straight-line directions, and
wherein the information output unit performs entropy encoding on the hypothesis estimation mode information by using a context model corresponding to a depth of the coding unit.
CN201380049000.4A 2013-07-08 2013-07-08 Inter prediction method using multiple hypothesis estimators and device therefor Expired - Fee Related CN104662905B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/KR2013/006039 WO2015005507A1 (en) 2013-07-08 2013-07-08 Inter prediction method using multiple hypothesis estimators and device therefor

Publications (2)

Publication Number Publication Date
CN104662905A CN104662905A (en) 2015-05-27
CN104662905B CN104662905B (en) 2019-06-11

Family

ID=52280175

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201380049000.4A Expired - Fee Related CN104662905B (en) 2013-07-08 2013-07-08 Inter prediction method using multiple hypothesis estimators and device therefor

Country Status (2)

Country Link
CN (1) CN104662905B (en)
WO (1) WO2015005507A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11218721B2 (en) 2018-07-18 2022-01-04 Mediatek Inc. Method and apparatus of motion compensation bandwidth reduction for video coding system utilizing multi-hypothesis

Citations (1)

Publication number Priority date Publication date Assignee Title
CN101026758A (en) * 2006-02-24 2007-08-29 Samsung Electronics Co Ltd Video transcoding method and apparatus

Family Cites Families (6)

Publication number Priority date Publication date Assignee Title
US8559514B2 (en) * 2006-07-27 2013-10-15 Qualcomm Incorporated Efficient fetching for motion compensation video decoding process
KR101403343B1 (en) * 2007-10-04 2014-06-09 Samsung Electronics Co Ltd Method and apparatus for inter prediction encoding/decoding using sub-pixel motion estimation
KR101386891B1 (en) * 2007-12-13 2014-04-18 Samsung Electronics Co Ltd Method and apparatus for interpolating image
KR101505815B1 (en) * 2009-12-09 2015-03-26 Industry-University Cooperation Foundation, Hanyang University Motion estimation method and appartus providing sub-pixel accuracy, and video encoder using the same
KR101847072B1 (en) * 2010-04-05 2018-04-09 Samsung Electronics Co Ltd Method and apparatus for video encoding, and method and apparatus for video decoding
TR201819237T4 (en) * 2011-09-14 2019-01-21 Samsung Electronics Co Ltd A Unit of Prediction (TB) Decoding Method Depending on Its Size

Also Published As

Publication number Publication date
CN104662905A (en) 2015-05-27
WO2015005507A1 (en) 2015-01-15

Similar Documents

Publication Publication Date Title
CN104754356B Method and apparatus for motion vector determination in video encoding or decoding
CN104365101B Method and apparatus for determining reference pictures for inter prediction
CN104488272B Method and apparatus for predicting a motion vector for encoding or decoding video
CN104081779B Method and device for inter prediction, and method and device for motion compensation
CN105308966B Video encoding method and apparatus, and video decoding method and apparatus
CN103918255B Method and apparatus for encoding a depth map of multi-view video data, and method and apparatus for decoding the encoded depth map
CN103931184B Method and apparatus for encoding and decoding video
CN104471938B Method and apparatus for encoding video by sharing SAO parameters according to color component
CN105144713B Method and device for encoding video for decoder setting, and method and device for decoding video based on decoder setting
CN104869413B Video decoding apparatus
CN105325004B Video encoding method and apparatus, and video decoding method and apparatus, based on signaling of sample adaptive offset (SAO) parameters
CN105103552B Method and device for encoding inter-layer video to compensate for luminance difference, and method and device for decoding video
CN104365104B Method and apparatus for multi-view video encoding and decoding
CN105594212B Method for determining a motion vector and apparatus therefor
CN106031175B Inter-layer video encoding method using luminance compensation and device therefor, and video decoding method and device therefor
CN106416256B Method and apparatus for encoding or decoding a depth image
CN105308961B Inter-layer video encoding method and apparatus, and inter-layer video decoding method and apparatus, for compensating for luminance difference
CN105308970B Method and apparatus for encoding and decoding video with respect to the position of an integer pixel
CN105532005B Method and apparatus for inter-layer encoding, and method and apparatus for inter-layer decoding of video using residual prediction
CN105340273B Method for predicting a disparity vector for inter-layer video decoding, and encoding method and apparatus
CN104662905B Inter prediction method using multiple hypothesis estimators and device therefor

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20190611

Termination date: 20210708