CN104662905B - Inter prediction method using multiple hypothesis estimators, and device therefor - Google Patents
Inter prediction method using multiple hypothesis estimators, and device therefor
- Publication number
- CN104662905B CN104662905B CN201380049000.4A CN201380049000A CN104662905B CN 104662905 B CN104662905 B CN 104662905B CN 201380049000 A CN201380049000 A CN 201380049000A CN 104662905 B CN104662905 B CN 104662905B
- Authority
- CN
- China
- Prior art keywords
- sub
- pixel
- unit
- coding
- hypothesis
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/90—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
- H04N19/96—Tree coding, e.g. quad-tree coding
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
The present disclosure relates to video encoding and video decoding, and more particularly, to a motion estimation method and a motion compensation method that determine a motion vector by using multiple hypothesis estimator pixels in units of sub-pixels, wherein the motion estimation method and the motion compensation method are performed for inter prediction executed during video encoding and decoding. A motion compensation method using a motion-vector estimator may include the operations of: obtaining a motion vector of a prediction unit included in a coding unit, and obtaining hypothesis estimation mode information of the coding unit; determining a combination of a predetermined sub-pixel distance and a predetermined straight-line direction based on the hypothesis estimation mode information, wherein the predetermined sub-pixel distance is selected from among two or more sub-pixel distances, and the predetermined straight-line direction is selected from among two or more straight-line directions; and determining a reference block by using blocks that each include one of two hypothesis estimator pixels that are apart, in the predetermined straight-line direction, from a current estimator pixel by the predetermined sub-pixel distance, wherein the current estimator pixel is indicated by a current motion vector.
Description
Technical field
The present disclosure relates to video encoding and decoding, and more particularly, to a method and apparatus for performing motion estimation and motion compensation in video encoding and decoding.
Background Art
As hardware for reproducing and storing high-resolution or high-quality video content is being developed and supplied, the need for a video codec that effectively encodes or decodes the high-resolution or high-quality video content is increasing. According to a conventional video codec, a video is encoded according to a limited encoding method based on a macroblock having a predetermined size.
Image data of a spatial domain is transformed into coefficients of a frequency domain via frequency transformation. According to a video codec, an image is split into blocks having a predetermined size, a discrete cosine transform (DCT) is performed on each block, and frequency coefficients are encoded in units of blocks, for rapid calculation of the frequency transformation. Compared with the image data of the spatial domain, the coefficients of the frequency domain are easily compressed. In particular, since an image pixel value of the spatial domain is expressed as a prediction error via inter prediction or intra prediction of the video codec, a large amount of data may be transformed to 0 when the frequency transformation is performed on the prediction error. According to a video codec, the amount of data may be reduced by replacing data that occurs consecutively and repeatedly with data of a small size.
Summary of the invention
Technical problem
The present disclosure relates to video encoding and decoding, and more particularly, to a motion estimation method and a motion compensation method which determine a reference block by using multiple hypothesis estimator (hypothetical estimator) pixels in units of sub-pixels, and which determine a hypothesis estimator pixel by using a minimum amount of information, wherein the motion estimation method and the motion compensation method are performed for inter prediction executed during video encoding and decoding.
Solution
A motion estimation method according to an embodiment of the present disclosure involves: determining a motion vector by using not only a motion vector in units of integer pixels but also multiple hypothesis estimators in units of sub-pixels; and performing entropy encoding on information indicating an optimal hypothesis estimator selected from among the multiple hypothesis estimators. A motion compensation method according to an embodiment of the present disclosure involves: determining a hypothesis estimator in units of sub-pixels by performing entropy decoding on information indicating the hypothesis estimator; and performing motion compensation by using a final reference block that is determined by combining a current motion vector with the hypothesis estimator.
Advantageous Effects
The present disclosure provides one or more embodiments of a motion estimation method that improves the accuracy of inter prediction by additionally using hypothesis estimator pixels that are apart from a current estimator pixel by a sub-pixel distance in determining a reference block. The motion estimation method allows only combinations with a high probability, from among combinations of directions and sub-pixel distances at which a hypothesis estimator pixel may be located relative to the current estimator pixel, so that a hypothesis estimator pixel can be rapidly selected. In addition, the number of transmission bits of the information about the selected hypothesis estimator pixel is reduced to a minimum, so that a bit rate of encoded symbols including hypothesis estimation mode information can be improved.
Brief Description of the Drawings
Fig. 1 is a block diagram of a video encoding apparatus based on coding units according to a tree structure, according to an embodiment of the present disclosure.
Fig. 2 is a block diagram of a video decoding apparatus based on coding units according to a tree structure, according to an embodiment of the present disclosure.
Fig. 3 is a diagram for describing a concept of coding units, according to an embodiment of the present disclosure.
Fig. 4 is a block diagram of an image encoder based on coding units, according to an embodiment of the present disclosure.
Fig. 5 is a block diagram of an image decoder based on coding units, according to an embodiment of the present disclosure.
Fig. 6 is a diagram illustrating deeper coding units according to depths, and partitions, according to an embodiment of the present disclosure.
Fig. 7 is a diagram for describing a relationship between a coding unit and transformation units, according to an embodiment of the present disclosure.
Fig. 8 is a diagram for describing encoding information of coding units corresponding to a coded depth, according to an embodiment of the present disclosure.
Fig. 9 is a diagram of deeper coding units according to depths, according to an embodiment of the present disclosure.
Figs. 10 to 12 are diagrams for describing a relationship between coding units, prediction units, and transformation units, according to an embodiment of the present disclosure.
Fig. 13 is a diagram for describing a relationship between a coding unit, a prediction unit, and a transformation unit, according to encoding mode information of Table 1.
Fig. 14 is a block diagram of a motion estimation apparatus, according to an embodiment of the present disclosure.
Fig. 15 is a block diagram of a motion compensation apparatus, according to an embodiment of the present disclosure.
Figs. 16a and 16b illustrate types of hypothesis estimation modes, according to an embodiment of the present disclosure.
Fig. 17 illustrates combinations of directions, symbol values, and distances that are indicated by hypothesis estimation modes, according to an embodiment of the present disclosure.
Fig. 18 illustrates hypothesis estimation modes that are test targets with respect to a rate-distortion (RD) cost, according to an embodiment of the present disclosure.
Fig. 19 is a flowchart of a motion estimation method, according to an embodiment of the present disclosure.
Fig. 20 is a flowchart of a motion compensation method, according to an embodiment of the present disclosure.
Fig. 21 illustrates a physical structure of a disc that stores a program, according to an embodiment of the present disclosure.
Fig. 22 illustrates a disc drive for recording and reading a program by using the disc.
Fig. 23 is a diagram illustrating an overall structure of a content supply system for providing a content distribution service.
Figs. 24 and 25 illustrate external and internal structures of a mobile phone to which a video encoding method and a video decoding method are applied, according to an embodiment of the present disclosure.
Fig. 26 illustrates a digital broadcasting system to which a communication system is applied, according to an embodiment of the present disclosure.
Fig. 27 illustrates a network structure of a cloud computing system using a video encoding apparatus and a video decoding apparatus, according to an embodiment of the present disclosure.
Best Mode
A motion estimation method according to an embodiment of the present disclosure involves: determining a motion vector by using not only a motion vector in units of integer pixels but also multiple hypothesis estimators in units of sub-pixels; and performing entropy encoding on information indicating a best hypothesis estimator selected from among the multiple hypothesis estimators. A motion compensation method according to an embodiment of the present disclosure involves: determining a hypothesis estimator in units of sub-pixels by performing entropy decoding on information indicating the hypothesis estimator; and performing motion compensation by using a final reference block that is determined by combining a current motion vector with the hypothesis estimator.
According to an aspect of the present disclosure, there is provided a motion compensation method using a motion-vector estimator, the motion compensation method including the operations of: obtaining a motion vector of a prediction unit included in a coding unit, and obtaining hypothesis estimation mode information of the coding unit; determining a combination of a predetermined sub-pixel distance and a predetermined straight-line direction based on the hypothesis estimation mode information, wherein the predetermined sub-pixel distance is selected from among two or more sub-pixel distances, and the predetermined straight-line direction is selected from among two or more straight-line directions; and determining a reference block by using blocks that each include one of two hypothesis estimator pixels that are apart, in the predetermined straight-line direction, from a current estimator pixel by the predetermined sub-pixel distance, wherein the current estimator pixel is indicated by a current motion vector.
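The reference block described here is built from the two blocks that contain the hypothesis estimator pixels lying on either side of the position indicated by the current motion vector. The following is a minimal sketch of that combination step; the function names, the quarter-pel coordinate convention, and the rounding rule are illustrative assumptions, not taken from the patent text.

```python
# Hypothetical sketch of multi-hypothesis motion compensation. The current motion
# vector points at an estimator pixel; two hypothesis estimator pixels lie on
# opposite sides of it, along one straight-line direction, at one sub-pixel
# distance. The reference block is the rounded average of the two blocks
# predicted at those hypothesis positions.

# Unit step vectors (dx, dy) for the four straight-line directions, in degrees.
DIRECTIONS = {0: (1, 0), 90: (0, 1), 135: (-1, 1), 45: (1, 1)}

def hypothesis_positions(mv, angle, subpel_distance):
    """Return the two hypothesis estimator positions around the MV position.

    mv              -- (x, y) in quarter-pel units, as indicated by the motion vector
    angle           -- one of 0, 90, 135, 45 degrees
    subpel_distance -- 0.25 or 0.5 (quarter- or half-pel), in pel units
    """
    dx, dy = DIRECTIONS[angle]
    step = int(subpel_distance * 4)  # convert the pel distance to quarter-pel units
    x, y = mv
    return (x + dx * step, y + dy * step), (x - dx * step, y - dy * step)

def compensate(predict_block, mv, angle, subpel_distance):
    """Average the two blocks predicted at the hypothesis positions."""
    p0, p1 = hypothesis_positions(mv, angle, subpel_distance)
    b0, b1 = predict_block(p0), predict_block(p1)
    return [[(a + b + 1) // 2 for a, b in zip(r0, r1)] for r0, r1 in zip(b0, b1)]
```

With the 0-degree direction and a 1/4 pixel distance, for example, the two hypothesis positions are one quarter-pel step to either side of the motion-vector position, and the compensated block is their rounded average.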
The obtaining of the hypothesis estimation mode information may include the operations of: obtaining the hypothesis estimation mode information and motion vector difference information, wherein the motion vector difference information indicates a difference between the current motion vector and a motion vector of a prediction unit encoded before the current prediction unit; and obtaining residual data between the current prediction unit and the reference block, wherein the determining of the reference block may include generating a reconstructed block of the current prediction unit by combining the residual data with the reference block.
The obtaining of the hypothesis estimation mode information may include obtaining the hypothesis estimation mode information that is commonly determined for prediction units included in a current coding unit.
The two or more sub-pixel distances may include a 1/4 pixel distance and a 1/2 pixel distance, the two or more straight-line directions may include directions at angles of 0 degrees, 90 degrees, 135 degrees, and 45 degrees, and the hypothesis estimation mode information may indicate one of 8 combinations of a sub-pixel distance selected from among the two or more sub-pixel distances and a straight-line direction selected from among the two or more straight-line directions.
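The 8 combinations above (2 sub-pixel distances times 4 straight-line directions) can be enumerated directly; the mode-index assignment below is an assumption made for illustration, since the text fixes only the set of combinations, not their coding order.

```python
# Illustrative enumeration of the 8 hypothesis estimation modes
# (2 sub-pixel distances x 4 straight-line directions).
from itertools import product

DISTANCES = (0.25, 0.5)    # 1/4-pel and 1/2-pel distances
ANGLES = (0, 90, 135, 45)  # straight-line directions, in degrees

# 8 (distance, angle) combinations, in an assumed fixed order.
MODES = list(product(DISTANCES, ANGLES))

def mode_index(distance, angle):
    """Map a (distance, angle) combination to its assumed mode index."""
    return MODES.index((distance, angle))
```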
The determining of the combination may include the operations of: determining a context model of the hypothesis estimation mode information according to a depth of the coding unit; performing entropy decoding on the hypothesis estimation mode information by using one of 4 context models that corresponds to the depth of a current coding unit; and determining, for the current motion vector, a combination of a sub-pixel distance and a straight-line direction based on the entropy-decoded hypothesis estimation mode information.
According to another aspect of the present disclosure, there is provided a motion estimation method using a motion-vector estimator, the motion estimation method including the operations of: determining, from among prediction units included in a coding unit, a current motion vector for inter prediction of a current prediction unit; determining a reference block by using blocks that each include one of two hypothesis estimator pixels centered, in a predetermined straight-line direction, on a current estimator pixel, wherein the two hypothesis estimator pixels are from among multiple hypothesis estimator pixels that are apart from the current estimator pixel by a predetermined sub-pixel distance, and the current estimator pixel is indicated by the current motion vector; and outputting hypothesis estimation mode information of the coding unit, and outputting motion vector difference information of the prediction unit, wherein the hypothesis estimation mode information indicates a combination of the predetermined sub-pixel distance and the predetermined straight-line direction, the predetermined sub-pixel distance is selected from among two or more sub-pixel distances, and the predetermined straight-line direction is selected from among two or more straight-line directions.
The outputting of the hypothesis estimation mode information and the outputting of the motion vector difference information may include the operations of: outputting the hypothesis estimation mode information and the motion vector difference information, wherein the motion vector difference information indicates a difference between the current motion vector and a motion vector of a prediction unit encoded before the current prediction unit; and outputting residual data between the current prediction unit and the reference block.
The outputting of the hypothesis estimation mode information may include outputting the hypothesis estimation mode information that is commonly determined for prediction units included in a current coding unit.
The two or more sub-pixel distances may include a 1/4 pixel distance and a 1/2 pixel distance, the two or more straight-line directions may include directions at angles of 0 degrees, 90 degrees, 135 degrees, and 45 degrees, and the hypothesis estimation mode information may indicate one of 8 combinations of a sub-pixel distance selected from among the two or more sub-pixel distances and a straight-line direction selected from among the two or more straight-line directions.
The outputting of the hypothesis estimation mode information may include: determining a context model of the hypothesis estimation mode information according to a depth of the coding unit; and performing entropy encoding on the hypothesis estimation mode information by using one of 4 context models that corresponds to the depth of a current coding unit.
The determining of the reference block may include the operations of: calculating rate-distortion (RD) costs by using hypothesis estimator pixels that are apart from the current estimator pixel by a 1/4 pixel distance in each of the straight-line directions at angles of 0 degrees, 90 degrees, 135 degrees, and 45 degrees; calculating an RD cost by using hypothesis estimator pixels that are apart from the current estimator pixel by a 1/2 pixel distance in the direction that produces a minimum RD cost from among the RD costs; determining a straight-line direction and a sub-pixel distance at which the minimum RD cost from among the RD costs is produced; and determining the reference block, wherein the reference block is an average block of blocks that each include one of the hypothesis estimator pixels determined based on the straight-line direction and the sub-pixel distance at which the minimum RD cost is produced.
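The two-stage selection just described (all four directions at the 1/4 pixel distance first, then the 1/2 pixel distance only along the winning direction) can be sketched as follows; `rd_cost` stands in for the encoder's actual RD-cost evaluation and is a hypothetical callback.

```python
# Hedged sketch of the two-stage hypothesis-mode search: test the 1/4-pel
# distance in all four directions first, then test the 1/2-pel distance only
# along the winner, and keep the overall minimum-cost combination.

ANGLES = (0, 90, 135, 45)  # straight-line directions, in degrees

def search_hypothesis_mode(rd_cost):
    """Return the (angle, distance) combination with the lowest RD cost found."""
    # Stage 1: 1/4-pel distance in every straight-line direction.
    costs = {(a, 0.25): rd_cost(a, 0.25) for a in ANGLES}
    best_angle = min(ANGLES, key=lambda a: costs[(a, 0.25)])
    # Stage 2: 1/2-pel distance only along the best direction so far.
    costs[(best_angle, 0.5)] = rd_cost(best_angle, 0.5)
    return min(costs, key=costs.get)
```

This evaluates 5 candidate combinations rather than all 8, which is the source of the speed-up claimed for the mode selection.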
According to another aspect of the present disclosure, there is provided a motion compensation apparatus using a motion-vector estimator, the motion compensation apparatus including: an information obtainer which obtains residual data and a current motion vector of a current prediction unit from among prediction units included in a coding unit, and obtains hypothesis estimation mode information of the coding unit; a hypothesis estimation mode determiner which determines a combination of a predetermined sub-pixel distance and a predetermined straight-line direction based on the hypothesis estimation mode information, wherein the predetermined sub-pixel distance is selected from among two or more sub-pixel distances, and the predetermined straight-line direction is selected from among two or more straight-line directions; and a motion compensator which determines a reference block by using blocks that each include one of two hypothesis estimator pixels that are apart, in the predetermined straight-line direction, from a current estimator pixel by the predetermined sub-pixel distance, and generates a reconstructed block of the current prediction unit by combining the residual data with the reference block, wherein the current estimator pixel is indicated by the current motion vector.
According to another aspect of the present disclosure, there is provided a motion estimation apparatus using a motion-vector estimator, the motion estimation apparatus including: a motion estimator which determines, from among prediction units included in a coding unit, a current motion vector for inter prediction of a current prediction unit, and determines a reference block by using blocks that each include one of two hypothesis estimator pixels centered, in a predetermined straight-line direction, on a current estimator pixel, wherein the two hypothesis estimator pixels are from among multiple hypothesis estimator pixels that are apart from the current estimator pixel by a predetermined sub-pixel distance, and the current estimator pixel is indicated by the current motion vector; and an information outputter which outputs hypothesis estimation mode information of the coding unit and outputs motion vector difference information of the prediction unit, wherein the hypothesis estimation mode information indicates a combination of the predetermined sub-pixel distance and the predetermined straight-line direction, the predetermined sub-pixel distance is selected from among two or more sub-pixel distances, and the predetermined straight-line direction is selected from among two or more straight-line directions.
According to another aspect of the present disclosure, there is provided a computer-readable recording medium having recorded thereon a computer program for executing the motion compensation method.
According to another aspect of the present disclosure, there is provided a computer-readable recording medium having recorded thereon a computer program for executing the motion estimation method.
Mode for the Invention
Hereinafter, with reference to Figs. 1 to 13, a video encoding scheme and a video decoding scheme based on coding units having a tree structure will be described. Hereinafter, the term "image" may refer to a still image or a moving picture (that is, a video itself). In addition, with reference to Figs. 14 to 20, a method and apparatus for performing motion estimation and motion compensation by using multiple hypotheses will be described, wherein the motion estimation and the motion compensation are used for inter prediction performed in the video encoding and decoding methods based on coding units having a tree structure.
First, with reference to Figs. 1 to 13, the video encoding scheme and the video decoding scheme based on coding units having a tree structure will be described.
Fig. 1 is a block diagram of a video encoding apparatus 100 based on coding units according to a tree structure, according to an embodiment of the present disclosure.
The video encoding apparatus 100, which performs video prediction based on coding units according to a tree structure, includes a coding unit determiner 120 and an output unit 130. Hereinafter, for convenience of description, the video encoding apparatus 100 that performs video prediction based on coding units according to a tree structure is referred to as the "video encoding apparatus 100".
The coding unit determiner 120 may split a current picture based on a maximum coding unit of the current picture of an image. If the current picture is larger than the maximum coding unit, image data of the current picture may be split into at least one maximum coding unit. The maximum coding unit according to an embodiment of the present disclosure may be a data unit having a size of 32×32, 64×64, 128×128, 256×256, or the like, wherein the shape of the data unit is a square whose width and length are powers of 2. The image data may be output to the coding unit determiner 120 according to the at least one maximum coding unit.
A coding unit according to an embodiment of the present disclosure may be characterized by a maximum size and a depth. The depth denotes the number of times the coding unit is spatially split from the maximum coding unit, and as the depth deepens, deeper coding units according to depths may be split from the maximum coding unit to a minimum coding unit. A depth of the maximum coding unit is an uppermost depth, and a depth of the minimum coding unit is a lowermost depth. Since the size of a coding unit corresponding to each depth decreases as the depth of the maximum coding unit deepens, a coding unit corresponding to an upper depth may include a plurality of coding units corresponding to lower depths.
As described above, the image data of the current picture is split into maximum coding units according to the maximum size of the coding unit, and each of the maximum coding units may include deeper coding units that are split according to depths. Since the maximum coding unit according to an embodiment of the present disclosure is split according to depths, the image data of a spatial domain included in the maximum coding unit may be hierarchically classified according to depths.
A maximum depth and a maximum size of a coding unit, which limit the total number of times the height and width of the maximum coding unit are hierarchically split, may be predetermined.
The coding unit determiner 120 encodes at least one split region obtained by splitting a region of the maximum coding unit according to depths, and determines a depth at which finally encoded image data is to be output, according to the at least one split region. In other words, the coding unit determiner 120 determines a coded depth by encoding the image data in the deeper coding units according to depths, according to the maximum coding unit of the current picture, and selecting a depth having a minimum encoding error. Accordingly, the encoded image data of the coding unit corresponding to the determined coded depth is finally output. Also, the coding units corresponding to the coded depth may be regarded as encoded coding units. The determined coded depth and the image data encoded according to the determined coded depth are output to the output unit 130.
The image data in the maximum coding unit is encoded based on the deeper coding units corresponding to at least one depth equal to or below the maximum depth, and results of encoding the image data based on each of the deeper coding units are compared. A depth having the minimum encoding error may be selected after comparing the encoding errors of the deeper coding units. At least one coded depth may be selected for each maximum coding unit.
The size of the maximum coding unit is split as a coding unit is hierarchically split according to depths and as the number of coding units increases. Also, even if coding units correspond to the same depth in one maximum coding unit, whether to split each of the coding units corresponding to the same depth to a lower depth is determined by separately measuring an encoding error of the image data of each coding unit. Accordingly, even when image data is included in one maximum coding unit, the image data is split into regions according to the depths, the encoding errors may differ according to regions in the one maximum coding unit, and thus the coded depths may differ according to regions in the image data. Therefore, one or more coded depths may be determined in one maximum coding unit, and the image data of the maximum coding unit may be divided according to coding units of the at least one coded depth.
Accordingly, the coding unit determiner 120 may determine coding units having a tree structure that are included in the maximum coding unit. The "coding units having a tree structure" according to an embodiment of the present disclosure include coding units corresponding to a depth determined to be the coded depth, from among all deeper coding units included in the maximum coding unit. A coding unit of a coded depth may be hierarchically determined according to depths in the same region of the maximum coding unit, and coding units of a coded depth may be independently determined in different regions. Similarly, a coded depth in a current region may be determined independently from a coded depth in another region.
A maximum depth according to an embodiment of the present disclosure is an index related to the number of splitting times from a maximum coding unit to a minimum coding unit. A first maximum depth according to an embodiment of the present disclosure may denote the total number of splitting times from the maximum coding unit to the minimum coding unit. A second maximum depth according to an embodiment of the present disclosure may denote the total number of depth levels from the maximum coding unit to the minimum coding unit. For example, when a depth of the maximum coding unit is 0, a depth of a coding unit in which the maximum coding unit is split once may be set to 1, and a depth of a coding unit in which the maximum coding unit is split twice may be set to 2. Here, if the minimum coding unit is a coding unit obtained by splitting the maximum coding unit four times, 5 depth levels of depths 0, 1, 2, 3, and 4 exist, and thus the first maximum depth may be set to 4, and the second maximum depth may be set to 5.
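The depth arithmetic in this example can be checked with a few lines of code, under the stated assumption that each split halves the width and height of a coding unit.

```python
# Quick check of the depth bookkeeping described above, assuming each split
# halves the coding-unit side length.

def depth_of(max_cu_size, cu_size):
    """Number of times a max_cu_size unit is split to reach cu_size."""
    depth = 0
    while max_cu_size > cu_size:
        max_cu_size //= 2
        depth += 1
    return depth

# A 64x64 maximum coding unit split down to a 4x4 minimum coding unit:
first_max_depth = depth_of(64, 4)       # total number of splits -> 4
second_max_depth = first_max_depth + 1  # total number of depth levels -> 5
```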
Prediction encoding and transformation may be performed according to the maximum coding unit. The prediction encoding and the transformation are also performed based on the deeper coding units according to a depth equal to or less than the maximum depth, according to the maximum coding unit. The transformation may be performed according to a method of orthogonal transformation or integer transformation.
Since the number of deeper coding units increases whenever the maximum coding unit is split according to depths, encoding including the prediction encoding and the transformation is performed on all of the deeper coding units generated as the depth deepens. For convenience of description, the prediction encoding and the transformation will now be described based on a coding unit of a current depth, in a maximum coding unit.
The video encoding apparatus 100 may variously select a size or shape of a data unit for encoding the image data. In order to encode the image data, operations such as prediction encoding, transformation, and entropy encoding are performed, and at this time, the same data unit may be used for all operations or different data units may be used for each operation.
For example, the video encoding apparatus 100 may select not only a coding unit for encoding the image data, but also a data unit different from the coding unit, so as to perform the prediction encoding on the image data in the coding unit.
It, can be based on coding unit corresponding with coding depth (that is, base in order to execute predictive coding in maximum coding unit
In the coding unit for again not being divided into coding unit corresponding with more low depth) Lai Zhihang predictive coding.Hereinafter, no longer being drawn
Divide and the coding unit for becoming the basic unit for predictive coding will be referred to as " predicting unit " now.It is single by dividing prediction
The subregion that member obtains may include predicting unit and be divided by least one of height to predicting unit and width
And the data cell obtained.Subregion be carry out being obtained by dividing data cell by the predicting unit to coding unit, and
Predicting unit can be the subregion with size identical with coding unit.
For example, when a coding unit of 2N×2N (where N is a positive integer) is no longer split and becomes a prediction unit of 2N×2N, a size of a partition may be 2N×2N, 2N×N, N×2N, or N×N. Examples of a partition type include symmetrical partitions obtained by symmetrically splitting a height or a width of the prediction unit, partitions obtained by asymmetrically splitting the height or the width of the prediction unit, such as in a ratio of 1:n or n:1, partitions obtained by geometrically splitting the prediction unit, and partitions having arbitrary shapes.
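As a minimal illustration of the symmetric partition sizes just described (not part of the disclosed apparatus; the function name is hypothetical, and asymmetric and geometric partitions are omitted), the four symmetric partitions of a 2N×2N prediction unit can be enumerated as follows:

```python
def symmetric_partitions(n):
    """Symmetric partition sizes of a 2N x 2N prediction unit:
    2Nx2N (unsplit), 2NxN, Nx2N, and NxN. Asymmetric 1:n splits
    and geometric partitions are omitted from this sketch."""
    two_n = 2 * n
    return [(two_n, two_n), (two_n, n), (n, two_n), (n, n)]

# e.g. a 64x64 prediction unit (N = 32)
print(symmetric_partitions(32))
```

For N = 32 this yields the 64×64, 64×32, 32×64, and 32×32 partitions of a 64×64 coding unit.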
A prediction mode of the prediction unit may be at least one of an intra mode, an inter mode, and a skip mode. For example, the intra mode or the inter mode may be performed on a partition of 2N×2N, 2N×N, N×2N, or N×N. Also, the skip mode may be performed only on a partition of 2N×2N. Encoding is independently performed on each prediction unit in a coding unit, so that a prediction mode having a least encoding error is selected.
The video encoding apparatus 100 may also perform transformation on the image data in a coding unit based not only on the coding unit for encoding the image data, but also on a transformation unit that is different from the coding unit. In order to perform the transformation in the coding unit, the transformation may be performed based on a data unit having a size smaller than or equal to the coding unit. For example, the transformation unit for the transformation may include a transformation unit for an intra mode and a data unit for an inter mode.
Similarly to the coding units according to a tree structure according to the present embodiment, the transformation unit in the coding unit may be recursively split into smaller-sized regions, and residual data in the coding unit may be divided based on the transformation units having the tree structure according to transformation depths.
According to an embodiment of the present disclosure, a transformation depth may also be set in the transformation unit, wherein the transformation depth indicates the number of splitting times performed to reach the transformation unit by splitting the height and the width of the coding unit. For example, when the size of a transformation unit of a current coding unit is 2N×2N, the transformation depth may be set to 0; when the size of the transformation unit is N×N, the transformation depth may be set to 1; and when the size of the transformation unit is N/2×N/2, the transformation depth may be set to 2. In other words, the transformation units according to the tree structure may be set according to the transformation depths.
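The 2N×2N → 0, N×N → 1, N/2×N/2 → 2 relation above is just the number of halvings from the coding-unit size to the transformation-unit size. A minimal sketch of that arithmetic (the function name is an assumption for illustration only):

```python
import math

def transform_depth(cu_size, tu_size):
    """Transformation depth: how many times the coding unit's height and
    width are halved to reach the transformation unit. For a 2N x 2N
    coding unit this gives 0 for a 2Nx2N unit, 1 for NxN, 2 for N/2xN/2."""
    return int(math.log2(cu_size // tu_size))

print(transform_depth(64, 64))  # depth 0
print(transform_depth(64, 32))  # depth 1
print(transform_depth(64, 16))  # depth 2
```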
Encoding information according to coding units corresponding to a coded depth requires not only information about the coded depth, but also information related to the prediction encoding and the transformation. Accordingly, the coding unit determiner 120 not only determines a coded depth having a least encoding error, but also determines a partition type in a prediction unit, a prediction mode according to prediction units, and a size of a transformation unit for transformation.
Coding units according to a tree structure in a maximum coding unit, a prediction unit/partition, and a method of determining a transformation unit, according to embodiments of the present disclosure, will be described in detail later with reference to FIGS. 10 through 21.
The coding unit determiner 120 may measure an encoding error of deeper coding units according to depths by using rate-distortion optimization based on Lagrangian multipliers.
The output unit 130 outputs, in bitstreams, the image data of the maximum coding unit, which is encoded based on the at least one coded depth determined by the coding unit determiner 120, and information about the encoding mode according to the coded depth.
The encoded image data may be obtained by encoding residual data of an image.
The information about the encoding mode according to the coded depth may include information about the coded depth, information about the partition type in the prediction unit, information about the prediction mode, and information about the size of the transformation unit.
The information about the coded depth may be defined by using split information according to depths, which indicates whether encoding is performed on coding units of a lower depth instead of a current depth. If the current depth of the current coding unit is the coded depth, the image data in the current coding unit is encoded and output, and thus the split information may be defined not to split the current coding unit into a lower depth. Alternatively, if the current depth of the current coding unit is not the coded depth, encoding is performed on the coding units of the lower depth, and thus the split information may be defined to split the current coding unit to obtain the coding units of the lower depth.
If the current depth is not the coded depth, encoding is performed on the coding unit that is split into coding units of the lower depth. Since at least one coding unit of the lower depth exists in one coding unit of the current depth, encoding is repeatedly performed on each coding unit of the lower depth, and thus the encoding may be recursively performed on the coding units having the same depth.
Since the coding units having a tree structure are determined for one maximum coding unit, and information about at least one encoding mode is determined for the coding units of the coded depths, information about at least one encoding mode may be determined for one maximum coding unit. Also, since the image data is hierarchically split according to depths, the coded depth of the image data of the maximum coding unit may differ according to locations, and thus the information about the coded depth and the encoding mode may be set for the image data.
Accordingly, the output unit 130 may assign encoding information about a corresponding coded depth and encoding mode to at least one of the coding unit, the prediction unit, and a minimum unit included in the maximum coding unit.
The minimum unit according to an embodiment of the present disclosure is a rectangular data unit obtained by splitting the minimum coding unit constituting the lowermost depth by 4. Alternatively, the minimum unit may be a maximum-size rectangular data unit that may be included in all of the coding units, prediction units, partition units, and transformation units included in the maximum coding unit.
For example, the encoding information output by the output unit 130 may be classified into encoding information according to coding units and encoding information according to prediction units. The encoding information according to the coding units may include information about the prediction mode and about the size of the partitions. The encoding information according to the prediction units may include information about an estimated direction of an inter mode, about a reference image index of the inter mode, about a motion vector, about a chroma component of an intra mode, and about an interpolation method of the intra mode.
Also, information about a maximum size of the coding unit defined according to pictures, slices, or GOPs, and information about a maximum depth, may be inserted into a header of a bitstream, a sequence parameter set (SPS), or a picture parameter set (PPS).
Also, information about a maximum size of a transformation unit allowed for a current video, and information about a minimum size of the transformation unit, may also be output through a header of a bitstream, an SPS, or a PPS.
In the video encoding apparatus 100, the deeper coding unit may be a coding unit obtained by dividing a height or a width of a coding unit of an upper depth, which is one layer above, by two. In other words, when the size of the coding unit of the current depth is 2N×2N, the size of the coding unit of the lower depth is N×N. Also, the coding unit of the current depth having the size of 2N×2N may include a maximum of four coding units of the lower depth.
Accordingly, the video encoding apparatus 100 may form the coding units having the tree structure by determining coding units having an optimum shape and an optimum size for each maximum coding unit, based on the size of the maximum coding unit and the maximum depth determined in consideration of characteristics of the current picture. Also, since encoding may be performed on each maximum coding unit by using any one of various prediction modes and transformations, an optimum encoding mode may be determined in consideration of characteristics of coding units of various image sizes.
Thus, if an image having a high resolution or a large data amount is encoded in conventional macroblocks, the number of macroblocks per picture excessively increases. Accordingly, the number of pieces of compressed information generated for each macroblock increases, so it is difficult to transmit the compressed information and data compression efficiency decreases. However, by using the video encoding apparatus 100, image compression efficiency may be increased, since the maximum size of a coding unit is increased in consideration of the size of the image while the coding unit is adjusted in consideration of characteristics of the image.
FIG. 2 is a block diagram of a video decoding apparatus 200 based on coding units according to a tree structure, according to an embodiment of the present disclosure.
The video decoding apparatus 200 based on coding units according to a tree structure includes a symbol obtainer 220 and an image data decoder 230. Hereinafter, for convenience of description, the video decoding apparatus 200 using video prediction based on coding units according to a tree structure will be referred to as the "video decoding apparatus 200".
Definitions of various terms, such as a coding unit, a depth, a prediction unit, a transformation unit, and information about various encoding modes, for decoding operations of the video decoding apparatus 200 are identical to those described with reference to FIG. 1 and the video encoding apparatus 100.
The symbol obtainer 220 receives and parses a bitstream of an encoded video. The symbol obtainer 220 extracts encoded image data for each coding unit from the parsed bitstream, wherein the coding units have a tree structure according to each maximum coding unit, and outputs the extracted image data to the image data decoder 230. The symbol obtainer 220 may extract information about a maximum size of a coding unit of a current picture from a header about the current picture, an SPS, or a PPS.
Also, the symbol obtainer 220 extracts, from the parsed bitstream, information about a coded depth and an encoding mode for the coding units having a tree structure according to each maximum coding unit. The extracted information about the coded depth and the encoding mode is output to the image data decoder 230. In other words, the image data in the bitstream is split into the maximum coding units so that the image data decoder 230 decodes the image data for each maximum coding unit.
The information about the coded depth and the encoding mode according to the maximum coding unit may be set for information about at least one coding unit corresponding to the coded depth, and the information about the encoding mode may include information about a partition type of a corresponding coding unit corresponding to the coded depth, information about a prediction mode, and information about a size of a transformation unit. Also, split information according to depths may be extracted as the information about the coded depth.
The information about the coded depth and the encoding mode according to each maximum coding unit extracted by the symbol obtainer 220 is information about a coded depth and an encoding mode determined to generate a minimum encoding error when an encoder, such as the video encoding apparatus 100, repeatedly performs encoding on each deeper coding unit according to depths for each maximum coding unit. Accordingly, the video decoding apparatus 200 may restore an image by decoding the image data according to the coded depth and the encoding mode that generate the minimum encoding error.
Since the encoding information about the coded depth and the encoding mode may be assigned to a predetermined data unit from among a corresponding coding unit, a prediction unit, and a minimum unit, the symbol obtainer 220 may extract the information about the coded depth and the encoding mode according to the predetermined data units. The predetermined data units to which the same information about the coded depth and the encoding mode is assigned may be inferred to be the data units included in the same maximum coding unit.
The image data decoder 230 restores the current picture by decoding the image data in each maximum coding unit based on the information about the coded depth and the encoding mode according to the maximum coding units. In other words, the image data decoder 230 may decode the encoded image data based on the extracted information about the partition type, the prediction mode, and the transformation unit of each coding unit from among the coding units having the tree structure included in each maximum coding unit. The decoding process may include prediction, including intra prediction and motion compensation, and inverse transformation. The inverse transformation is performed according to an inverse orthogonal transformation or an inverse integer transformation.
The image data decoder 230 may perform intra prediction or motion compensation according to the partition and the prediction mode of each coding unit, based on the information about the partition type and the prediction mode of the prediction unit of the coding unit according to coded depths.
In addition, for inverse transformation of each maximum coding unit, the image data decoder 230 may read transformation unit information according to a tree structure for each coding unit so as to determine the transformation units of each coding unit, and may perform inverse transformation based on the transformation units of each coding unit. Via the inverse transformation, pixel values of the spatial domain of the coding unit may be restored.
The image data decoder 230 may determine at least one coded depth of a current maximum coding unit by using the split information according to depths. If the split information indicates that the image data is no longer split at the current depth, the current depth is a coded depth. Accordingly, the image data decoder 230 may decode the encoded data of at least one coding unit corresponding to each coded depth in the current maximum coding unit by using the information about the partition type of the prediction unit, the prediction mode, and the size of the transformation unit for each coding unit corresponding to the coded depth, and may output the image data of the current maximum coding unit.
In other words, data units containing encoding information including the same split information may be gathered by observing the encoding information sets assigned to the predetermined data units from among the coding unit, the prediction unit, and the minimum unit, and the gathered data units may be considered to be one data unit to be decoded by the image data decoder 230 in the same encoding mode. For each coding unit determined as described above, information about the encoding mode may be obtained so as to decode the current coding unit.
The video decoding apparatus 200 may obtain information about at least one coding unit that generates the minimum encoding error when encoding is recursively performed for each maximum coding unit, and may use the information to decode the current picture. In other words, the coding units having the tree structure determined to be the optimum coding units in each maximum coding unit may be decoded. Also, the maximum size of the coding unit is determined in consideration of the resolution and the amount of image data.
Accordingly, even if the image data has a high resolution and a large amount of data, the image data may be efficiently decoded and restored by using the size of a coding unit and an encoding mode, which are adaptively determined according to characteristics of the image data, by using information about an optimum encoding mode received from an encoder.
FIG. 3 is a diagram for describing a concept of coding units, according to an embodiment of the present disclosure.
A size of a coding unit may be expressed as width×height, and may be 64×64, 32×32, 16×16, or 8×8. A coding unit of 64×64 may be split into partitions of 64×64, 64×32, 32×64, or 32×32; a coding unit of 32×32 may be split into partitions of 32×32, 32×16, 16×32, or 16×16; a coding unit of 16×16 may be split into partitions of 16×16, 16×8, 8×16, or 8×8; and a coding unit of 8×8 may be split into partitions of 8×8, 8×4, 4×8, or 4×4.
In video data 310, the resolution is 1920×1080, the maximum size of a coding unit is 64, and the maximum depth is 2. In video data 320, the resolution is 1920×1080, the maximum size of a coding unit is 64, and the maximum depth is 3. In video data 330, the resolution is 352×288, the maximum size of a coding unit is 16, and the maximum depth is 1. The maximum depth shown in FIG. 10 denotes the total number of splits from a maximum coding unit to a minimum coding unit.
If a resolution is high or a data amount is large, the maximum size of a coding unit may be large so as to not only increase encoding efficiency but also accurately reflect characteristics of an image. Accordingly, the maximum size of the coding unit of the video data 310 and 320 having a higher resolution than the video data 330 may be 64.
Since the maximum depth of the video data 310 is 2, the coding units 315 of the video data 310 may include a maximum coding unit having a long-axis size of 64, and coding units having long-axis sizes of 32 and 16, since the depth is deepened to two layers by splitting the maximum coding unit twice. Meanwhile, since the maximum depth of the video data 330 is 1, the coding units 335 of the video data 330 may include a maximum coding unit having a long-axis size of 16, and coding units having a long-axis size of 8, since the depth is deepened to one layer by splitting the maximum coding unit once.
Since the maximum depth of the video data 320 is 3, the coding units 325 of the video data 320 may include a maximum coding unit having a long-axis size of 64, and coding units having long-axis sizes of 32, 16, and 8, since the depth is deepened to three layers by splitting the maximum coding unit three times. As the depth deepens, detailed information may be expressed precisely.
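The long-axis sizes listed for the three examples follow directly from halving the maximum coding-unit size once per depth layer. A minimal sketch of that relation (the function name is hypothetical):

```python
def long_axis_sizes(max_size, max_depth):
    """Long-axis sizes of the coding units obtained by splitting the
    maximum coding unit `max_depth` times; each split halves the size."""
    return [max_size >> d for d in range(max_depth + 1)]

# Matches the three examples above:
print(long_axis_sizes(64, 2))   # video data 310 -> [64, 32, 16]
print(long_axis_sizes(64, 3))   # video data 320 -> [64, 32, 16, 8]
print(long_axis_sizes(16, 1))   # video data 330 -> [16, 8]
```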
FIG. 4 is a block diagram of an image encoder 400 based on coding units, according to an embodiment of the present disclosure.
The image encoder 400 performs the operations of the coding unit determiner 120 of the video encoding apparatus 100 to encode image data. In other words, an intra predictor 410 performs intra prediction on coding units in an intra mode in a current frame 405, and a motion estimator 420 and a motion compensator 425 perform inter estimation and motion compensation on coding units in an inter mode in the current frame 405 by using the current frame 405 and a reference frame 495.
Data output from the intra predictor 410, the motion estimator 420, and the motion compensator 425 is output as quantized transformation coefficients through a transformer 430 and a quantizer 440. The quantized transformation coefficients are restored to data in the spatial domain through an inverse quantizer 460 and an inverse transformer 470, and the restored data in the spatial domain is output as the reference frame 495 after being post-processed through a deblocking unit 480 and a loop filtering unit 490. The quantized transformation coefficients may be output as a bitstream 455 through an entropy encoder 450.
For the image encoder 400 to be applied in the video encoding apparatus 100, all elements of the image encoder 400 (i.e., the intra predictor 410, the motion estimator 420, the motion compensator 425, the transformer 430, the quantizer 440, the entropy encoder 450, the inverse quantizer 460, the inverse transformer 470, the deblocking unit 480, and the loop filtering unit 490) perform operations based on each coding unit from among the coding units having a tree structure while considering the maximum depth of each maximum coding unit.
Specifically, the intra predictor 410, the motion estimator 420, and the motion compensator 425 determine the partitions and the prediction mode of each coding unit from among the coding units having a tree structure while considering the maximum size and the maximum depth of the current maximum coding unit, and the transformer 430 determines the size of the transformation unit in each coding unit from among the coding units having a tree structure.
FIG. 5 is a block diagram of an image decoder 500 based on coding units, according to an embodiment of the present disclosure.
A parser 510 parses, from a bitstream 505, encoded image data to be decoded and information about encoding required for the decoding. The encoded image data is output as inverse-quantized data through an entropy decoder 520 and an inverse quantizer 530, and the inverse-quantized data is restored to image data in the spatial domain through an inverse transformer 540.
For the image data in the spatial domain, an intra predictor 550 performs intra prediction on coding units in an intra mode, and a motion compensator 560 performs motion compensation on coding units in an inter mode by using a reference frame 585.
The image data in the spatial domain that passed through the intra predictor 550 and the motion compensator 560 may be output as a restored frame 595 after being post-processed through a deblocking unit 570 and a loop filtering unit 580. Also, the image data post-processed through the deblocking unit 570 and the loop filtering unit 580 may be output as the reference frame 585.
In order to decode the image data in the image data decoder 230 of the video decoding apparatus 200, the image decoder 500 may perform the operations that are performed after the parser 510.
For the image decoder 500 to be applied in the video decoding apparatus 200, all elements of the image decoder 500 (i.e., the parser 510, the entropy decoder 520, the inverse quantizer 530, the inverse transformer 540, the intra predictor 550, the motion compensator 560, the deblocking unit 570, and the loop filtering unit 580) perform operations based on the coding units having a tree structure for each maximum coding unit.
Specifically, the intra predictor 550 and the motion compensator 560 perform operations based on the partitions and the prediction mode for each of the coding units having a tree structure, and the inverse transformer 540 performs operations based on the size of the transformation unit for each coding unit.
FIG. 6 is a diagram illustrating deeper coding units according to depths, and partitions, according to an embodiment of the present disclosure.
The video encoding apparatus 100 and the video decoding apparatus 200 use hierarchical coding units so as to consider characteristics of an image. The maximum height, the maximum width, and the maximum depth of coding units may be adaptively determined according to the characteristics of the image, or may be differently set by a user. The sizes of deeper coding units according to depths may be determined according to a predetermined maximum size of the coding unit.
According to an embodiment of the present disclosure, in a hierarchical structure 600 of coding units, the maximum height and the maximum width of the coding units are each 64, and the maximum depth is 3. In this case, the maximum depth refers to the total number of times the coding unit is split from the maximum coding unit to the minimum coding unit. Since the depth deepens along the vertical axis of the hierarchical structure 600, the height and the width of the deeper coding units are each split. Also, prediction units and partitions, which are the bases for prediction encoding of each deeper coding unit, are shown along the horizontal axis of the hierarchical structure 600.
In other words, in the hierarchical structure 600, the coding unit 610 is a maximum coding unit where the depth is 0 and the size (i.e., height by width) is 64×64. The depth deepens along the vertical axis, and there exist a coding unit 620 having a size of 32×32 and a depth of 1, a coding unit 630 having a size of 16×16 and a depth of 2, and a coding unit 640 having a size of 8×8 and a depth of 3. The coding unit 640 having the size of 8×8 and the depth of 3 is a minimum coding unit having the lowermost depth.
The prediction unit and the partitions of each coding unit are arranged along the horizontal axis according to each depth. In other words, if the coding unit 610 having the size of 64×64 and the depth of 0 is a prediction unit, the prediction unit may be split into partitions included in the coding unit 610, i.e., a partition 610 having a size of 64×64, partitions 612 having a size of 64×32, partitions 614 having a size of 32×64, or partitions 616 having a size of 32×32.
Similarly, a prediction unit of the coding unit 620 having the size of 32×32 and the depth of 1 may be split into partitions included in the coding unit 620, i.e., a partition 620 having a size of 32×32, partitions 622 having a size of 32×16, partitions 624 having a size of 16×32, and partitions 626 having a size of 16×16.
Similarly, a prediction unit of the coding unit 630 having the size of 16×16 and the depth of 2 may be split into partitions included in the coding unit 630, i.e., a partition having a size of 16×16 included in the coding unit 630, partitions 632 having a size of 16×8, partitions 634 having a size of 8×16, and partitions 636 having a size of 8×8.
Similarly, a prediction unit of the coding unit 640 having the size of 8×8 and the depth of 3 may be split into partitions included in the coding unit 640, i.e., a partition having a size of 8×8 included in the coding unit 640, partitions 642 having a size of 8×4, partitions 644 having a size of 4×8, and partitions 646 having a size of 4×4.
In order to determine the at least one coded depth of the coding units constituting the maximum coding unit 610, the coding unit determiner 120 of the video encoding apparatus 100 performs encoding on the coding units corresponding to each depth included in the maximum coding unit 610.
The number of deeper coding units according to depths that include data of the same range and the same size increases as the depth deepens. For example, four coding units corresponding to a depth of 2 are required to cover the data included in one coding unit corresponding to a depth of 1. Accordingly, in order to compare encoding results of the same data according to depths, the coding unit corresponding to the depth of 1 and the four coding units corresponding to the depth of 2 are each encoded.
In order to perform encoding for a current depth from among the depths, a least encoding error may be selected for the current depth by performing encoding on each prediction unit in the coding units corresponding to the current depth, along the horizontal axis of the hierarchical structure 600. Alternatively, the minimum encoding error may be searched for by comparing the least encoding errors according to depths, by performing encoding for each depth as the depth deepens along the vertical axis of the hierarchical structure 600. The depth and the partition having the minimum encoding error in the coding unit 610 may be selected as the coded depth and the partition type of the coding unit 610.
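The final selection step (keep the depth and partition combination with the smallest measured error) can be sketched as below. This is an illustrative simplification, not the disclosed determiner: the candidate list, error values, and function name are all hypothetical stand-ins for the real rate-distortion search.

```python
def select_coded_depth(candidates):
    """Choose the (depth, partition type) combination with the smallest
    encoding error, mirroring the horizontal (partition) and vertical
    (depth) searches through the hierarchical structure."""
    return min(candidates, key=lambda c: c["error"])

# Hypothetical errors measured for one maximum coding unit:
search = [
    {"depth": 0, "partition": "64x64", "error": 9.1},
    {"depth": 1, "partition": "32x32", "error": 4.5},
    {"depth": 2, "partition": "16x16", "error": 6.0},
]
print(select_coded_depth(search))
```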
FIG. 7 is a diagram for describing a relationship between a coding unit 710 and transformation units 720, according to an embodiment of the present disclosure.
The video encoding apparatus 100 or the video decoding apparatus 200 encodes or decodes an image according to coding units having sizes smaller than or equal to the maximum coding unit, for each maximum coding unit. The sizes of transformation units for transformation during encoding may be selected based on data units that are not larger than the corresponding coding unit.
For example, in the video encoding apparatus 100 or the video decoding apparatus 200, if the size of the coding unit 710 is 64×64, transformation may be performed by using the transformation units 720 having a size of 32×32.
Also, the data of the coding unit 710 having the size of 64×64 may be encoded by performing transformation on each of the transformation units having sizes of 32×32, 16×16, 8×8, and 4×4, which are smaller than 64×64, and then the transformation unit having the least coding error may be selected.
FIG. 8 is a diagram for describing encoding information of coding units corresponding to a coded depth, according to an embodiment of the present disclosure.
The output unit 130 of the video encoding apparatus 100 may encode information 800 about a partition type, information 810 about a prediction mode, and information 820 about a transformation unit size for each coding unit corresponding to a coded depth, and may transmit the information 800, the information 810, and the information 820 as information about an encoding mode.
The information 800 indicates information about the shape of a partition obtained by splitting a prediction unit of a current coding unit, wherein the partition is a data unit for prediction-encoding the current coding unit. For example, a current coding unit CU_0 having a size of 2N×2N may be split into any one of a partition 802 having a size of 2N×2N, a partition 804 having a size of 2N×N, a partition 806 having a size of N×2N, and a partition 808 having a size of N×N. Here, the information 800 about the partition type is set to indicate one of the partition 804 having a size of 2N×N, the partition 806 having a size of N×2N, and the partition 808 having a size of N×N.
The information 810 indicates a prediction mode of each partition. For example, the information 810 may indicate the mode of prediction encoding performed on the partition indicated by the information 800, i.e., an intra mode 812, an inter mode 814, or a skip mode 816.
The information 820 indicates the transformation unit on which transformation is based when the transformation is performed on the current coding unit. For example, the transformation unit may be a first intra transformation unit 822, a second intra transformation unit 824, a first inter transformation unit 826, or a second inter transformation unit 828.
The symbol obtainer 220 of the video decoding apparatus 200 may extract and use the information 800, 810, and 820 for decoding, according to each deeper coding unit.
FIG. 9 is a diagram of deeper coding units according to depths, according to an embodiment of the present disclosure.
Split information may be used to indicate a change of depth. The split information indicates whether a coding unit of a current depth is split into coding units of a lower depth.
A prediction unit 910 for prediction-encoding a coding unit 900 having a depth of 0 and a size of 2N_0×2N_0 may include partitions of the following partition types: a partition type 912 having a size of 2N_0×2N_0, a partition type 914 having a size of 2N_0×N_0, a partition type 916 having a size of N_0×2N_0, and a partition type 918 having a size of N_0×N_0. FIG. 9 only illustrates the partition types 912 through 918, which are obtained by symmetrically splitting the prediction unit 910, but the partition types are not limited thereto, and the partitions of the prediction unit 910 may include asymmetrical partitions, partitions having a predetermined shape, and partitions having a geometrical shape.
According to each partition type, prediction encoding is repeatedly performed on one partition having a size of 2N_0×2N_0, two partitions having a size of 2N_0×N_0, two partitions having a size of N_0×2N_0, and four partitions having a size of N_0×N_0. Prediction encoding in an intra mode and an inter mode may be performed on the partitions having the sizes of 2N_0×2N_0, N_0×2N_0, 2N_0×N_0, and N_0×N_0. Prediction encoding in a skip mode is performed only on the partition having the size of 2N_0×2N_0.
The errors of encoding, including the prediction encoding, in the partition types 912 through 918 are compared, and the least encoding error is determined from among the partition types. If the encoding error is smallest in one of the partition types 912 through 916, the prediction unit 910 may not be split into a lower depth.
If encoding error is minimum in divisional type 918, depth changes to 1 from 0 to divide subregion in operation 920
Type 918, and be 2 to depth and coding unit 930 having a size of N_0 × N_0 is repeatedly carried out coding and searches for minimum code
Error.
A prediction unit 940 for prediction encoding the coding unit 930 having a depth of 1 and a size of 2N_1×2N_1 (=N_0×N_0) may include partitions of the following partition types: a partition type 942 having a size of 2N_1×2N_1, a partition type 944 having a size of 2N_1×N_1, a partition type 946 having a size of N_1×2N_1, and a partition type 948 having a size of N_1×N_1.
If an encoding error is the smallest in the partition type 948, a depth is changed from 1 to 2 to split the partition type 948 in operation 950, and encoding is repeatedly performed on coding units 960 having a depth of 2 and a size of N_2×N_2 to search for a least encoding error.
When a maximum depth is d, a split operation according to each depth may be performed up to when a depth becomes d−1, and split information may be encoded for depths of 0 through d−2. In other words, when encoding is performed up to when the depth is d−1 after a coding unit corresponding to a depth of d−2 is split in operation 970, a prediction unit 990 for prediction encoding a coding unit 980 having a depth of d−1 and a size of 2N_(d−1)×2N_(d−1) may include partitions of the following partition types: a partition type 992 having a size of 2N_(d−1)×2N_(d−1), a partition type 994 having a size of 2N_(d−1)×N_(d−1), a partition type 996 having a size of N_(d−1)×2N_(d−1), and a partition type 998 having a size of N_(d−1)×N_(d−1).
Prediction encoding may be repeatedly performed on one partition having a size of 2N_(d−1)×2N_(d−1), two partitions having a size of 2N_(d−1)×N_(d−1), two partitions having a size of N_(d−1)×2N_(d−1), and four partitions having a size of N_(d−1)×N_(d−1) from among the partition types 992 through 998, so as to search for a partition type having a least encoding error.
Even when the partition type 998 has the least encoding error, since the maximum depth is d, a coding unit CU_(d−1) having a depth of d−1 is no longer split to a lower depth, a coded depth for the coding units constituting a current maximum coding unit 900 is determined to be d−1, and a partition type of the current maximum coding unit 900 may be determined to be N_(d−1)×N_(d−1). Also, since the maximum depth is d and the minimum coding unit 980 having a lowermost depth of d−1 is no longer split to a lower depth, split information for the minimum coding unit 980 is not set.
A data unit 999 may be a "minimum unit" for the current maximum coding unit. A minimum unit according to an embodiment of the present disclosure may be a square data unit obtained by splitting the minimum coding unit 980 by 4. By performing the encoding repeatedly, the video encoding apparatus 100 may determine a coded depth by comparing encoding errors according to depths of the coding unit 900 and selecting a depth having the least encoding error, and may set a corresponding partition type and prediction mode as an encoding mode of the coded depth.
As such, the minimum encoding errors according to depths are compared in all of the depths of 1 through d, and a depth having the least encoding error may be determined as a coded depth. The coded depth, the partition type of the prediction unit, and the prediction mode may be encoded and transmitted as information about an encoding mode. Also, since a coding unit is split from a depth of 0 to the coded depth, only split information of the coded depth is set to 0, and split information of depths excluding the coded depth is set to 1.
The symbol obtainer 220 of the video decoding apparatus 200 may extract and use the information about the coded depth and the prediction unit of the coding unit 900 so as to decode the partition 912. The video decoding apparatus 200 may determine a depth, in which split information is 0, as a coded depth by using split information according to depths, and may use information about an encoding mode of the corresponding depth for decoding.
FIGS. 10 through 12 are diagrams for describing a relationship between coding units 1010, prediction units 1060, and transformation units 1070, according to an embodiment of the present disclosure.
The coding units 1010 are coding units having a tree structure in a maximum coding unit, corresponding to coded depths determined by the video encoding apparatus 100. The prediction units 1060 are partitions of prediction units of each of the coding units 1010, and the transformation units 1070 are transformation units of each of the coding units 1010.
When a depth of a maximum coding unit is 0 in the coding units 1010, depths of coding units 1012 and 1054 are 1, depths of coding units 1014, 1016, 1018, 1028, 1050, and 1052 are 2, depths of coding units 1020, 1022, 1024, 1026, 1030, 1032, and 1048 are 3, and depths of coding units 1040, 1042, 1044, and 1046 are 4.
In the prediction units 1060, some coding units 1014, 1016, 1022, 1032, 1048, 1050, 1052, and 1054 are obtained by splitting the coding units in the coding units 1010. In other words, partition types in the coding units 1014, 1022, 1050, and 1054 have a size of 2N×N, partition types in the coding units 1016, 1048, and 1052 have a size of N×2N, and a partition type of the coding unit 1032 has a size of N×N. The prediction units and partitions of the coding units 1010 are smaller than or equal to each coding unit.
Transformation or inverse transformation is performed on image data of the coding unit 1052 in the transformation units 1070 in a data unit that is smaller than the coding unit 1052. Also, in terms of sizes and shapes, the coding units 1014, 1016, 1022, 1032, 1048, 1050, and 1052 in the transformation units 1070 are different from those in the prediction units 1060. In other words, the video encoding apparatus 100 and the video decoding apparatus 200 may individually perform intra prediction, motion estimation, motion compensation, transformation, and inverse transformation on a data unit in the same coding unit.
Accordingly, encoding is recursively performed on each of coding units having a hierarchical structure in each region of a maximum coding unit to determine an optimum coding unit, and thus coding units having a recursive tree structure may be obtained. Encoding information may include split information about a coding unit, information about a partition type, information about a prediction mode, and information about a size of a transformation unit. Table 1 shows the encoding information that may be set by the video encoding apparatus 100 and the video decoding apparatus 200.
[table 1]
The output unit 130 of the video encoding apparatus 100 may output the encoding information about the coding units having a tree structure, and the symbol obtainer 220 of the video decoding apparatus 200 may extract the encoding information about the coding units having a tree structure from a received bitstream.
Split information indicates whether a current coding unit is split into coding units of a lower depth. If split information of a current depth d is 0, a depth in which the current coding unit is no longer split into a lower depth is a coded depth, and thus information about a partition type, a prediction mode, and a size of a transformation unit may be defined for the coded depth. If the current coding unit is further split according to the split information, encoding is independently performed on four split coding units of a lower depth.
A prediction mode may be one of an intra mode, an inter mode, and a skip mode. The intra mode and the inter mode may be defined in all partition types, and the skip mode is defined only in a partition type having a size of 2N×2N.
The information about the partition type may indicate symmetrical partition types having sizes of 2N×2N, 2N×N, N×2N, and N×N, which are obtained by symmetrically splitting a height or a width of a prediction unit, and asymmetrical partition types having sizes of 2N×nU, 2N×nD, nL×2N, and nR×2N, which are obtained by asymmetrically splitting the height or width of the prediction unit. The asymmetrical partition types having the sizes of 2N×nU and 2N×nD may be respectively obtained by splitting the height of the prediction unit in 1:3 and 3:1, and the asymmetrical partition types having the sizes of nL×2N and nR×2N may be respectively obtained by splitting the width of the prediction unit in 1:3 and 3:1.
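The symmetric and asymmetric partition geometry described above can be summarized in a short sketch. The `partition_sizes` helper and its string labels are hypothetical and only illustrate the 1:3/3:1 splits; they are not part of the disclosure:

```python
def partition_sizes(part_type, n):
    """Return (width, height) of each partition of a 2Nx2N coding unit.

    Symmetric types split the height or width in half; asymmetric types
    (2NxnU, 2NxnD, nLx2N, nRx2N) split it in 1:3 or 3:1 as described above.
    """
    s = 2 * n  # side length of the 2Nx2N coding unit
    table = {
        "2Nx2N": [(s, s)],
        "2NxN":  [(s, n)] * 2,
        "Nx2N":  [(n, s)] * 2,
        "NxN":   [(n, n)] * 4,
        "2NxnU": [(s, s // 4), (s, 3 * s // 4)],   # height split 1:3
        "2NxnD": [(s, 3 * s // 4), (s, s // 4)],   # height split 3:1
        "nLx2N": [(s // 4, s), (3 * s // 4, s)],   # width split 1:3
        "nRx2N": [(3 * s // 4, s), (s // 4, s)],   # width split 3:1
    }
    return table[part_type]
```

For any partition type, the partition areas sum back to the 2N×2N coding-unit area.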
The size of the transformation unit may be set to be two types in the intra mode and two types in the inter mode. In other words, if split information of the transformation unit is 0, the size of the transformation unit may be 2N×2N, which is the size of the current coding unit. If the split information of the transformation unit is 1, transformation units may be obtained by splitting the current coding unit. Also, if a partition type of the current coding unit having a size of 2N×2N is a symmetrical partition type, a size of the transformation unit may be N×N, and if the partition type of the current coding unit is an asymmetrical partition type, the size of the transformation unit may be N/2×N/2.
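This rule can be captured directly, assuming a side-length representation of the 2N×2N coding unit (the helper name is illustrative only):

```python
def tu_size(cu_size, tu_split_flag, symmetric_partition):
    """Transformation-unit side length per the rule above.

    tu_split_flag == 0 -> TU equals the 2Nx2N coding unit;
    tu_split_flag == 1 -> NxN for symmetric partitions, N/2 x N/2 otherwise.
    """
    if tu_split_flag == 0:
        return cu_size            # 2N x 2N
    if symmetric_partition:
        return cu_size // 2       # N x N
    return cu_size // 4           # N/2 x N/2
```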
The encoding information about coding units having a tree structure may be assigned to at least one of a coding unit corresponding to a coded depth, a prediction unit, and a minimum unit. The coding unit corresponding to the coded depth may include at least one of a prediction unit and a minimum unit containing the same encoding information.
Accordingly, it may be determined whether adjacent data units are included in the same coding unit corresponding to the coded depth by comparing encoding information of the adjacent data units. Also, a corresponding coding unit corresponding to a coded depth is determined by using encoding information of a data unit, and thus a distribution of coded depths in a maximum coding unit may be determined.
Accordingly, if a current coding unit is predicted based on encoding information of adjacent data units, encoding information of data units in deeper coding units adjacent to the current coding unit may be directly referred to and used.
Alternatively, if a current coding unit is predicted based on encoding information of adjacent data units, data units adjacent to the current coding unit may be searched by using the encoding information of the data units, and the searched adjacent coding units may be referred to for predicting the current coding unit.
FIG. 13 is a diagram for describing a relationship between a coding unit, a prediction unit or a partition, and a transformation unit, according to the encoding mode information of Table 1.
A maximum coding unit 1300 includes coding units 1302, 1304, 1306, 1312, 1314, 1316, and 1318 of coded depths. Here, since the coding unit 1318 is a coding unit of a coded depth, split information may be set to 0. Information about a partition type of the coding unit 1318 having a size of 2N×2N may be set to be one of the following partition types: a partition type 1322 having a size of 2N×2N, a partition type 1324 having a size of 2N×N, a partition type 1326 having a size of N×2N, a partition type 1328 having a size of N×N, a partition type 1332 having a size of 2N×nU, a partition type 1334 having a size of 2N×nD, a partition type 1336 having a size of nL×2N, and a partition type 1338 having a size of nR×2N.
Split information (a TU (transformation unit) size flag) of a transformation unit is a type of a transformation index. The size of a transformation unit corresponding to the transformation index may be changed according to a prediction unit type or a partition type of the coding unit.
For example, when the partition type is set to be symmetrical, i.e., the partition type 1322, 1324, 1326, or 1328, a transformation unit 1342 having a size of 2N×2N is set if the split information (TU size flag) of the transformation unit is 0, and a transformation unit 1344 having a size of N×N is set if the TU size flag is 1.
When the partition type is set to be asymmetrical, i.e., the partition type 1332, 1334, 1336, or 1338, a transformation unit 1352 having a size of 2N×2N is set if the TU size flag is 0, and a transformation unit 1354 having a size of N/2×N/2 is set if the TU size flag is 1.
Referring to FIG. 20, the TU size flag is a flag having a value of 0 or 1, but the TU size flag is not limited to 1 bit, and a transformation unit may be hierarchically split into a tree structure while the TU size flag increases from 0. The split information (TU size flag) of a transformation unit may be an example of a transformation index.
In this case, according to an embodiment of the present disclosure, the size of a transformation unit that has been actually used may be expressed by using the TU size flag of the transformation unit together with a maximum size and a minimum size of the transformation unit. According to an embodiment of the present disclosure, the video encoding apparatus 100 may encode maximum transformation unit size information, minimum transformation unit size information, and a maximum TU size flag. The result of encoding the maximum transformation unit size information, the minimum transformation unit size information, and the maximum TU size flag may be inserted into an SPS. According to an embodiment of the present disclosure, the video decoding apparatus 200 may decode video by using the maximum transformation unit size information, the minimum transformation unit size information, and the maximum TU size flag.
For example, (a) if the size of a current coding unit is 64×64 and a maximum transformation unit size is 32×32, then (a-1) when the TU size flag is 0, the size of the transformation unit may be 32×32, (a-2) when the TU size flag is 1, the size of the transformation unit may be 16×16, and (a-3) when the TU size flag is 2, the size of the transformation unit may be 8×8.
As another example, (b) if the size of the current coding unit is 32×32 and a minimum transformation unit size is 32×32, then (b-1) when the TU size flag is 0, the size of the transformation unit may be 32×32. Here, since the size of the transformation unit cannot be less than 32×32, the TU size flag cannot be set to a value other than 0.
As another example, (c) if the size of the current coding unit is 64×64 and a maximum TU size flag is 1, then the TU size flag may be 0 or 1. Here, the TU size flag cannot be set to a value other than 0 or 1.
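Examples (a) and (b) follow one pattern: each increment of the TU size flag halves the transformation-unit side, starting from the smaller of the coding-unit size and the maximum transformation-unit size, clipped below by the minimum. A minimal sketch under that reading (the helper name and parameterization are assumptions):

```python
def tu_size_for_flag(cu_size, max_tu, min_tu, tu_flag):
    """Transformation-unit side length implied by a TU size flag value,
    reproducing examples (a) and (b) above (flag-range checks omitted)."""
    root = min(cu_size, max_tu)     # size when the TU size flag is 0
    return max(min_tu, root >> tu_flag)
```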
Accordingly, if it is defined that the maximum TU size flag is "MaxTransformSizeIndex", a minimum transformation unit size is "MinTransformSize", and a transformation unit size when the TU size flag is 0 is "RootTuSize", then a current minimum transformation unit size "CurrMinTuSize" that can be determined in a current coding unit may be defined by Equation (1):
CurrMinTuSize = max(MinTransformSize, RootTuSize/(2^MaxTransformSizeIndex)) … (1)
Compared to the current minimum transformation unit size "CurrMinTuSize" that can be determined in the current coding unit, the transformation unit size "RootTuSize" when the TU size flag is 0 may denote a maximum transformation unit size that can be selected in the system. In Equation (1), "RootTuSize/(2^MaxTransformSizeIndex)" denotes a transformation unit size when the transformation unit size "RootTuSize", obtained when the TU size flag is 0, is split a number of times corresponding to the maximum TU size flag, and "MinTransformSize" denotes a minimum transformation size. Thus, consistent with Equation (1), the greater value from among "RootTuSize/(2^MaxTransformSizeIndex)" and "MinTransformSize" may be the current minimum transformation unit size "CurrMinTuSize" that can be determined in the current coding unit.
According to an embodiment of the present disclosure, the maximum transformation unit size RootTuSize may vary according to the type of a prediction mode.
For example, if a current prediction mode is an inter mode, "RootTuSize" may be determined by using Equation (2) below. In Equation (2), "MaxTransformSize" denotes a maximum transformation unit size, and "PUSize" denotes a current prediction unit size:
RootTuSize = min(MaxTransformSize, PUSize) … (2)
That is, if the current prediction mode is the inter mode, the transformation unit size "RootTuSize" when the TU size flag is 0 may be the smaller value from among the maximum transformation unit size and the current prediction unit size.
If a prediction mode of a current partition unit is an intra mode, "RootTuSize" may be determined by using Equation (3) below. In Equation (3), "PartitionSize" denotes the size of the current partition unit:
RootTuSize = min(MaxTransformSize, PartitionSize) … (3)
That is, if the current prediction mode is the intra mode, the transformation unit size "RootTuSize" when the TU size flag is 0 may be the smaller value from among the maximum transformation unit size and the size of the current partition unit.
However, the current maximum transformation unit size "RootTuSize" that varies according to the type of a prediction mode in a partition unit is merely an example, and the one or more embodiments of the present disclosure are not limited thereto.
According to the video encoding method based on coding units having a tree structure as described with reference to FIGS. 1 through 13, image data of a spatial domain is encoded for each coding unit of the tree structure. According to the video decoding method based on coding units having a tree structure, decoding is performed for each maximum coding unit to restore the image data of the spatial domain. Thus, a picture, and a video that is a picture sequence, may be restored. The restored video may be reproduced by a reproducing apparatus, stored in a storage medium, or transmitted through a network.
Hereinafter, with reference to FIGS. 14 through 20, a motion estimation method and a motion compensation method using multiple hypotheses will be described, wherein the motion estimation method and the motion compensation method are used in inter prediction performed in the video decoding method and the video encoding method based on coding units having a tree structure.
Inter prediction uses a similarity between a current image and another image. A reference region similar to a current region of the current image is detected from a reference image that is restored before the current image. A distance on coordinates between the current region and the reference region is expressed as a motion vector, and a difference between pixel values of the current region and the reference region is expressed as residual data. Thus, by performing inter prediction on the current region, an index indicating the reference image, the motion vector, and the residual data may be output as image information, instead of directly outputting the current region.
Operations for inter prediction may be generally classified into a motion estimation operation and a motion compensation operation, wherein the motion estimation operation is performed to determine a reference image, a motion vector, and residual data for a current image, and the motion compensation operation is performed to restore the current image by using the reference image, the motion vector, and the residual data. An operation of a motion estimation apparatus 1400 will be described with reference to FIG. 14, and an operation of a motion compensation apparatus 1500 will be described with reference to FIG. 15.
The motion estimation apparatus 1400 and the motion compensation apparatus 1500 may perform inter prediction for each block of each image. A type of block may be a square, a rectangle, or any geometrical shape, and the type of block is not limited to a data unit having a predetermined size.
A block for inter prediction may be a prediction unit (or a partition). As described above, in coding units having a tree structure, each maximum coding unit may be split into a plurality of coding units, and each of the plurality of coding units may be split into one or more prediction units. Although one coding unit includes a plurality of prediction units, motion estimation may be performed for each prediction unit, so that a motion vector, residual data, and the like may be determined for each prediction unit.
Accordingly, operations performed by the motion estimation apparatus 1400 and the motion compensation apparatus 1500, which perform inter prediction on coding units having a tree structure and prediction units, will now be described in detail with reference to FIGS. 14 and 15.
FIG. 14 is a block diagram of the motion estimation apparatus 1400, according to an embodiment of the present disclosure.
The motion estimation apparatus 1400 includes a motion estimator 1410 and an information output unit 1420.
The motion estimator 1410 may perform motion estimation according to the prediction units included in each of the coding units having a tree structure of an image.
The motion estimator 1410 may determine a current motion vector so as to perform inter prediction on a current prediction unit from among the prediction units included in a coding unit. To determine the motion vector during a motion estimation process, an operation of searching for a block that is most similar to the current prediction unit may be performed. The motion vector may be determined as a vector indicating a difference between a position of the found block and a position of the current prediction unit.
In order to more exactly determine the motion vector with respect to a current estimation factor pixel indicated by the current motion vector, the information output unit 1420 may use hypothesis estimation factor pixels in addition to the current estimation factor pixel.
In the present embodiment, the current estimation factor pixel indicated by the current motion vector may be determined in units of pixels having predetermined precision. When sub-pixels of a reference image are interpolated for inter prediction, the motion vector may be determined in units of sub-pixels. When the current estimation factor pixel is a sub-pixel, hypothesis estimation factor pixels may be determined from among sub-pixels adjacent to the current estimation factor pixel. Two or more hypothesis estimation factor pixels centered on the current estimation factor pixel may be selected in a predetermined direction.
For example, when a width and a height of the reference image are interpolated according to precision of a 1/4 pixel unit, the current estimation factor pixel may indicate a sub-pixel at every 1/4 pixel distance of the interpolated reference image. Thus, an x-coordinate value and a y-coordinate value of the current estimation factor pixel may be determined according to the 1/4 pixel unit. In the present embodiment, a hypothesis estimation factor pixel may be determined as a sub-pixel that is distant from the current estimation factor pixel by a 1/4 pixel distance or a 1/2 pixel distance. By using a hypothesis estimation factor pixel that is distant by a sub-pixel distance from the motion vector (here, the motion vector indicates the reference image interpolated according to the sub-pixel unit), a most similar reference block may be determined from the reference image interpolated according to the sub-pixel unit.
The hypothesis estimation factor pixel indicated by the motion vector of the current prediction unit may be a representative pixel of a reference block. For example, the hypothesis estimation factor pixel may be an upper-left pixel of the reference block. Thus, when the hypothesis estimation factor pixel is determined, a reference block that includes the hypothesis estimation factor pixel and has the same size as the prediction unit may be determined. Accordingly, when hypothesis estimation factor pixels are determined for the current estimation factor pixel, reference blocks that include the hypothesis estimation factor pixels may be determined.
Also, the motion estimator 1410 may determine an estimation factor pixel from among the plurality of hypothesis estimation factor pixels that are distant from the current estimation factor pixel by a sub-pixel distance. Also, the motion estimator 1410 may generate one estimation factor pixel by merging a plurality of hypothesis estimation factor pixels.
The motion estimator 1410 may select hypothesis estimation factor pixels that face each other on a straight line with respect to the current estimation factor pixel as a center, from among the hypothesis estimation factor pixels that are distant from the current estimation factor pixel by the sub-pixel distance.
For example, the motion estimator 1410 may select two hypothesis estimation factor pixels that are positioned on a straight line with respect to the current estimation factor pixel as a center, from among the hypothesis estimation factor pixels that are distant from the current estimation factor pixel by the sub-pixel distance, and the motion estimator 1410 may determine the estimation factor pixel by using the two selected hypothesis estimation factor pixels.
The motion estimator 1410 may determine an estimation factor pixel that indicates an average position of motion vectors of the two selected hypothesis estimation factor pixels. Here, the first selected hypothesis estimation factor pixel indicates a first reference block including the first hypothesis estimation factor pixel, and the second selected hypothesis estimation factor pixel indicates a second reference block including the second hypothesis estimation factor pixel. Thus, a reference block indicated by the estimation factor pixel may be a block formed of average values of the pixel values at corresponding pixel positions of the first reference block and the second reference block.
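The per-position averaging of the two reference blocks can be sketched as follows; a minimal illustration assuming an already-interpolated reference picture array, hypothetical function and argument names, and rounded integer averaging (the rounding convention is an assumption, not stated in the disclosure):

```python
import numpy as np

def merged_reference_block(ref, top_left_a, top_left_b, height, width):
    """Average, pixel position by pixel position, the two reference blocks
    addressed by the upper-left positions of the two hypothesis estimation
    factor pixels, as described above."""
    ya, xa = top_left_a
    yb, xb = top_left_b
    block_a = ref[ya:ya + height, xa:xa + width].astype(np.int32)
    block_b = ref[yb:yb + height, xb:xb + width].astype(np.int32)
    return (block_a + block_b + 1) >> 1   # rounded per-pixel average (assumed)
```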
Thus, the motion estimator 1410 may finally determine the reference block by using the two hypothesis estimation factor pixels that are distant from the current estimation factor pixel by the sub-pixel distance in a predetermined straight-line direction.
The motion estimator 1410 may generate residual data between the current prediction unit and the determined reference block.
The information output unit 1420 may output motion vector difference information, wherein the motion vector difference information indicates a difference between the current motion vector and a motion vector of a prediction unit that was encoded before the current prediction unit. The motion vector difference information may be output for each prediction unit. The information output unit 1420 may output the residual data between the current prediction unit and the reference block, according to prediction units.
The information output unit 1420 may output not only the motion vector difference information of the prediction units included in a coding unit but may also output hypothesis estimation mode information of the coding unit.
The hypothesis estimation mode information may include information indicating the two hypothesis estimation factor pixels that are selected from among the hypothesis estimation factor pixels centered on the current estimation factor pixel.
Since the information output unit 1420 selects the hypothesis estimation factor pixels that are distant from the current estimation factor pixel by the sub-pixel distance in the predetermined straight-line direction, the information output unit 1420 may generate information indicating the combination of the straight-line direction and the sub-pixel distance that are selected to determine the estimation factor pixel.
For example, hypothesis estimation mode information may be generated and output, wherein the hypothesis estimation mode information indicates a combination of a predetermined sub-pixel distance selected from among two or more sub-pixel distances and a predetermined straight-line direction selected from among two or more straight-line directions.
The information output unit 1420 may determine a hypothesis estimation mode according to coding units. The hypothesis estimation mode information may be commonly applied to the prediction units included in the coding unit. Thus, the hypothesis estimation mode may be equally applied to each of the prediction units included in the coding unit.
Thus, the information output unit 1420 may output the motion vector difference information according to prediction units, and may output the hypothesis estimation mode information according to coding units.
For example, when a coding unit includes a first prediction unit and a second prediction unit, the information output unit 1420 may first output motion vector difference information about the first prediction unit, may then output hypothesis estimation mode information about the coding unit, and may then output motion vector difference information about the second prediction unit.
As another example, after the hypothesis estimation mode information about the coding unit is output, two pieces of motion vector difference information may be sequentially output for the first prediction unit and the second prediction unit.
The two or more sub-pixel distances may include a 1/4 pixel distance and a 1/2 pixel distance. The 1/4 pixel distance and the 1/2 pixel distance may respectively indicate a minimum distance between points obtained by quartering a distance between two adjacent integer pixels, and a minimum distance between points obtained by halving the distance between the two adjacent integer pixels. The two or more straight-line directions may include angles of 0 degrees, 90 degrees, 135 degrees, and 45 degrees.
Since the hypothesis estimation mode is formed of a combination of the sub-pixel distance selected from among the two sub-pixel distances and the straight-line direction selected from among the four straight-line directions, the hypothesis estimation modes may include a total of 8 combinations.
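The 8-mode combination space, and the paired offsets of the two facing hypothesis estimation factor pixels, can be enumerated as follows. The direction-to-vector mapping and the sub-pixel grid convention are assumed for illustration only:

```python
from itertools import product

# The two candidate sub-pixel distances and four straight-line directions above.
SUBPEL_DISTANCES = ("1/4", "1/2")
DIRECTIONS_DEG = (0, 90, 135, 45)

# One hypothesis estimation mode = one (distance, direction) combination.
HYPOTHESIS_MODES = list(product(SUBPEL_DISTANCES, DIRECTIONS_DEG))

def hypothesis_offsets(direction_deg, step):
    """Offsets, in sub-pixel grid units, of the two hypothesis estimation
    factor pixels facing each other across the current estimation factor
    pixel. The axis convention here is an assumption."""
    unit = {0: (1, 0), 90: (0, 1), 45: (1, 1), 135: (-1, 1)}[direction_deg]
    dx, dy = unit[0] * step, unit[1] * step
    return (dx, dy), (-dx, -dy)
```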
The information output unit 1420 may perform entropy encoding on the hypothesis estimation mode information. For example, in order to perform entropy encoding by using a context-based adaptive binary arithmetic coding (CABAC) method, the information output unit 1420 may determine a context model for each binary digit (bin) of the hypothesis estimation mode information. For example, if the hypothesis estimation mode information has 4 bits, a context model may be determined for each of the 4 bins, so that 4 context models may be determined.
Also, the information output unit 1420 may determine the context models of the hypothesis estimation mode information according to a depth of a current coding unit. For example, if there are coding units having 3 depths, 4 context models for the hypothesis estimation mode information are required for each of the 3 depths, so that the information output unit 1420 may determine a total of 12 context models for the hypothesis estimation mode information.
The information output unit 1420 may select 3 context models, which correspond to the depth of the current coding unit and are from among the context models previously determined based on a previous context of the hypothesis estimation mode information, and may perform entropy encoding on the hypothesis estimation mode information by using the selected context models.
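The text above sizes the context-model pool as 4 bin contexts per depth over 3 depths (12 in total). One plausible flat indexing of that pool can be sketched as follows; the depth-major layout and the `context_index` helper are assumptions, not taken from the disclosure:

```python
def context_index(depth, bin_idx, bins_per_depth=4):
    """Hypothetical flat index into the 12 CABAC context models described
    above: one set of 4 bin contexts per coding-unit depth (3 depths)."""
    assert 0 <= depth < 3 and 0 <= bin_idx < bins_per_depth
    return depth * bins_per_depth + bin_idx
```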
As described above, the information output unit 1420 may perform entropy encoding on the symbols generated by encoding the video, so that the information output unit 1420 may generate and output a bitstream.
The motion estimator 1410 may calculate rate-distortion (RD) costs by using the hypothesis estimation factor sub-pixels that are apart from the current estimation factor sub-pixel by a 1/4-pixel distance in each of the rectilinear directions with 0-degree, 90-degree, 135-degree, and 45-degree angles. The motion estimator 1410 may also calculate an RD cost by using the hypothesis estimation factor sub-pixels that are apart from the current estimation factor sub-pixel by a 1/2-pixel distance in the direction generating the smallest RD cost among the calculated RD costs. Because 4 RD costs are calculated at the 1/4-pixel distance, one in each rectilinear direction, and one RD cost is calculated at the 1/2-pixel distance in the predetermined rectilinear direction, 5 RD costs may be calculated.
A minimum RD cost may be determined from among the RD costs calculated according to the combinations of direction and sub-pixel distance, and the combination of the rectilinear direction and the sub-pixel distance corresponding to the minimum RD cost may be selected. Then, the two hypothesis estimation factor sub-pixels that are apart from the current estimation factor sub-pixel by the sub-pixel distance of the selected combination in the rectilinear direction of the selected combination may be determined as the hypothesis estimation factor sub-pixels. The average block of the blocks that respectively include the two hypothesis estimation factor sub-pixels apart from the current estimation factor sub-pixel by the sub-pixel distance in the predetermined rectilinear direction may be determined as the reference block.
The motion estimation apparatus 1400 may include a central processor (not shown) that generally controls the motion estimator 1410 and the information output unit 1420. Alternatively, the motion estimator 1410 and the information output unit 1420 may each be driven by its own processor (not shown), and the motion estimation apparatus 1400 may operate as the processors (not shown) interact with each other. Alternatively, the motion estimator 1410 and the information output unit 1420 may be controlled by a processor (not shown) external to the motion estimation apparatus 1400.
The motion estimation apparatus 1400 may include at least one data storage unit (not shown) that stores the input data and output data of the motion estimator 1410 and the information output unit 1420. The motion estimation apparatus 1400 may also include a memory control unit (not shown) that controls the data input to and output from the at least one data storage unit.
As described above, the motion estimation apparatus 1400 determines the reference block by using not only the current motion vector but also the hypothesis estimation factor sub-pixels that are apart from the current estimation factor sub-pixel by a sub-pixel distance, so that the precision of inter prediction can be improved. In addition, the motion estimation apparatus 1400 allows only combinations with a high probability to be used as the combinations of the direction and sub-pixel distance at which a hypothesis estimation factor sub-pixel is located relative to the current estimation factor sub-pixel, so that the hypothesis estimation factor sub-pixels can be rapidly selected. Moreover, the number of bits transmitted for the information about the selected hypothesis estimation factor sub-pixels is reduced to a minimum, so that the bit rate of the coded symbols including the hypothesis estimation mode information can be improved.
Figure 15 is a block diagram of a motion compensation apparatus 1500 according to an embodiment of the present disclosure.
The motion compensation apparatus 1500 includes an information obtainer 1510, a hypothesis estimation mode determiner 1520, and a motion compensator 1530.
The information obtainer 1510 may obtain, for each prediction unit included in a coding unit, the current motion vector and the residual data of the current prediction unit. Motion vector difference information may be obtained for each prediction unit instead of the current motion vector.
When the information obtainer 1510 receives a bitstream of coded symbols, the information obtainer 1510 may obtain a plurality of pieces of encoded information by parsing the symbols from the bitstream. The information obtainer 1510 may obtain the hypothesis estimation mode information of the coding unit by parsing the bitstream.
The motion compensation apparatus 1500 may receive an entropy-encoded bitstream. In this case, the information obtainer 1510 may perform entropy decoding on the bitstream, so that the information obtainer 1510 may obtain the hypothesis estimation mode information of the coding unit, and may obtain the motion vector difference information and the residual data of the prediction unit. Here, the residual data obtained from the bitstream may be restored to residual data of the spatial domain via inverse quantization and inverse transform operations.
In another example, in order to generate a reference image for the motion estimation of another image, if the following information of a previously encoded image is stored in a memory, namely the hypothesis estimation mode information of the coding unit and the motion vector difference information of the prediction unit, then the motion compensation apparatus 1500 may obtain the hypothesis estimation mode information of the coding unit and the motion vector difference information of the prediction unit from the memory. The residual data may be stored in a form obtained by performing inverse quantization and inverse transformation on the quantized transform coefficients.
The motion compensation apparatus 1500 obtains one piece of hypothesis estimation mode information for the current coding unit, so that the hypothesis estimation mode information may be commonly used for the prediction units included in the current coding unit.
The hypothesis estimation mode determiner 1520 may determine, based on the obtained hypothesis estimation mode information, the combination of a predetermined sub-pixel distance selected from two or more sub-pixel distances and a predetermined rectilinear direction selected from two or more rectilinear directions.
The sub-pixel distances include the 1/4-pixel distance and the 1/2-pixel distance, and the rectilinear directions include the 0-degree, 90-degree, 135-degree, and 45-degree angles, so that 8 combinations of sub-pixel distance and rectilinear direction may be obtained from the hypothesis estimation mode information.
Thus, the hypothesis estimation mode determiner 1520 may determine, by referring to the hypothesis estimation mode information, one combination from among the following combinations: (1/4-pixel distance, 0-degree direction), (1/4-pixel distance, 90-degree direction), (1/4-pixel distance, 135-degree direction), (1/4-pixel distance, 45-degree direction), (1/2-pixel distance, 0-degree direction), (1/2-pixel distance, 90-degree direction), (1/2-pixel distance, 135-degree direction), and (1/2-pixel distance, 45-degree direction).
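For illustration only, the eight combinations above can be written as a lookup table. This is a hypothetical sketch, not part of the disclosed apparatus; the names `HYPOTHESIS_MODES` and `decode_mode` are invented here, and the mode numbering follows the mode-to-group mapping described later with reference to Figures 16a, 16b, and 17.

```python
# Hypothetical lookup table for the 8 hypothesis estimation modes:
# mode -> (sub-pixel distance, rectilinear direction in degrees).
HYPOTHESIS_MODES = {
    1: (0.25, 0),    # 1/4-pixel distance, 0-degree (horizontal) direction
    2: (0.25, 90),   # 1/4-pixel distance, 90-degree (vertical) direction
    3: (0.25, 135),
    4: (0.25, 45),
    5: (0.5, 0),     # 1/2-pixel distance, same four directions
    6: (0.5, 90),
    7: (0.5, 135),
    8: (0.5, 45),
}

def decode_mode(mode: int):
    """Return the (sub-pixel distance, direction) pair for a mode number."""
    return HYPOTHESIS_MODES[mode]
```

For example, `decode_mode(4)` yields `(0.25, 45)`, the combination of the 1/4-pixel distance and the 45-degree direction.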
Moreover, the hypothesis estimation mode determiner 1520 may read the hypothesis estimation mode information by performing entropy decoding on the entropy-encoded hypothesis estimation mode information.
The hypothesis estimation mode determiner 1520 may perform entropy decoding on the hypothesis estimation mode information by using a CABAC method, so that the hypothesis estimation mode determiner 1520 may interpret the current combination of sub-pixel distance and rectilinear direction from the hypothesis estimation mode information.
The hypothesis estimation mode determiner 1520 may determine a context model for each bit of the hypothesis estimation mode information. Therefore, 4 context models may be determined for hypothesis estimation mode information having 4 bits.
The hypothesis estimation mode determiner 1520 may perform entropy decoding on the hypothesis estimation mode information by using the 4 context models corresponding to the depth of the current coding unit. The hypothesis estimation mode determiner 1520 may determine the combination of the sub-pixel distance and rectilinear direction for the current motion vector based on the entropy-decoded hypothesis estimation mode information.
According to the determined combination, a predetermined rectilinear direction may be selected from the 4 rectilinear directions, and a predetermined sub-pixel distance may be selected from the 2 sub-pixel distances. The motion compensator 1530 may determine the two hypothesis estimation factor sub-pixels that are apart from the current estimation factor sub-pixel by the predetermined sub-pixel distance in the predetermined rectilinear direction, where the current estimation factor sub-pixel is indicated by the current motion vector. The motion compensator 1530 may determine the reference block by using the blocks that respectively include the two hypothesis estimation factor sub-pixels. The motion compensator 1530 may generate a recovery block of the current prediction unit by combining the obtained residual data with the determined reference block. The coding unit may be restored with the recovery blocks of the prediction units.
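The recovery step above, combining residual data with the determined reference block, can be sketched as a pixel-wise sum with clipping to the sample range. This is a minimal illustrative helper under the assumption of 2-D lists of integer samples; the function name and the clipping convention are assumptions, not part of the disclosure.

```python
def reconstruct_block(reference, residual, bit_depth=8):
    """Combine a reference block and residual data into a recovery block.

    Each output sample is the sum of the co-located reference and residual
    samples, clipped to the valid range for the given bit depth (an assumed
    convention; the patent text only states that the two are combined)."""
    max_val = (1 << bit_depth) - 1
    return [[min(max_val, max(0, r + d))
             for r, d in zip(ref_row, res_row)]
            for ref_row, res_row in zip(reference, residual)]
```

For example, with 8-bit samples, a reference sample of 200 plus a residual of 100 clips to 255.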
The motion compensation apparatus 1500 may determine the reference block by additionally using the hypothesis estimation factor sub-pixels that are apart from the sub-pixel indicated by the current motion vector by a sub-pixel distance, so that the precision of motion compensation can be improved. In addition, the motion compensation apparatus 1500 can rapidly select the hypothesis estimation factor sub-pixels according to the combination of direction and sub-pixel distance indicated by the hypothesis estimation mode information.
The video encoding apparatus 100 and the video decoding apparatus 200 based on coding units having a tree structure, described above with reference to Figs. 1 to 13, may include the operations performed by the motion estimation apparatus 1400 and the operations performed by the motion compensation apparatus 1500.
The coding unit determiner 120 of the video encoding apparatus 100 of Fig. 1 may perform the operations performed by the motion estimator 1410 of the motion estimation apparatus 1400, and the operations performed by the hypothesis estimation mode determiner 1520 and the motion compensator 1530 of the motion compensation apparatus 1500. The output unit 130 of the video encoding apparatus 100 may perform the operations performed by the information output unit 1420 of the motion estimation apparatus 1400.
The symbol obtainer 220 of the video decoding apparatus 200 shown in Fig. 2 may perform the operations performed by the information obtainer 1510 and the hypothesis estimation mode determiner 1520 of the motion compensation apparatus 1500, and the image data decoder 230 of the video decoding apparatus 200 may perform the operations performed by the motion compensator 1530 of the motion compensation apparatus 1500.
The motion estimator 420 of the image encoder 400 shown in Fig. 4 may perform the operations performed by the motion estimator 1410 of the motion estimation apparatus 1400, and the motion compensator 425 of the image encoder 400 may perform the operations performed by the hypothesis estimation mode determiner 1520 and the motion compensator 1530 of the motion compensation apparatus 1500. The entropy encoder 450 of the image encoder 400 may perform the operations performed by the information output unit 1420 of the motion estimation apparatus 1400.
The parser 510 of the image decoder 500 shown in Fig. 5 may perform the operations performed by the information obtainer 1510 of the motion compensation apparatus 1500, and the entropy decoder 520 may perform the entropy decoding operations performed by the information obtainer 1510. The motion compensator 560 of the image decoder 500 may perform the operations performed by the hypothesis estimation mode determiner 1520 to interpret the hypothesis estimation mode information, and may perform the motion compensation operations performed by the motion compensator 1530.
Figures 16a and 16b show types of hypothesis estimation modes according to an embodiment of the present disclosure.
The current motion vector of the current prediction unit indicates a current estimation factor sub-pixel 1600. The motion estimation apparatus 1400 and the motion compensation apparatus 1500 according to embodiments of the present disclosure determine the reference block in units of sub-pixels by using mode information in addition to the current motion vector, where the mode information indicates the direction in which a hypothesis estimation factor sub-pixel is apart from the sub-pixel indicated by the current motion vector, and indicates the sub-pixel distance by which it is apart. Hereinafter, the operations performed by the motion estimation apparatus 1400 and the motion compensation apparatus 1500 to determine the hypothesis estimation factor sub-pixels will now be described.
In this example, a hypothesis estimation factor sub-pixel may refer to a sub-pixel that is apart from the current estimation factor sub-pixel 1600 by a 1/2-pixel distance or a 1/4-pixel distance in a rectilinear direction. Moreover, a hypothesis estimation factor sub-pixel may refer to a sub-pixel that is separated from the current estimation factor sub-pixel 1600 in a direction with a 0-degree, 45-degree, 90-degree, or 135-degree angle.
Therefore, in the present embodiment, the pixels indicated by the hypothesis estimation factors, that is, the hypothesis estimation factor sub-pixels, may be determined from among the sub-pixels 1611, 1612, 1621, 1622, 1631, 1632, 1641, 1642, 1651, 1652, 1661, 1662, 1671, 1672, 1681, and 1682 that are apart from the current estimation factor sub-pixel 1600 by the 1/2-pixel distance or the 1/4-pixel distance in the rectilinear directions with 0-degree, 45-degree, 90-degree, and 135-degree angles.
In order to determine the reference block, a plurality of sub-pixels among the sub-pixels 1611, 1612, 1621, 1622, 1631, 1632, 1641, 1642, 1651, 1652, 1661, 1662, 1671, 1672, 1681, and 1682 may be determined as the hypothesis estimation factor sub-pixels.
For example, a pair of hypothesis estimation factor sub-pixels may be selected from the sub-pixels 1611, 1612, 1621, 1622, 1631, 1632, 1641, 1642, 1651, 1652, 1661, 1662, 1671, 1672, 1681, and 1682. For example, such a pair of hypothesis estimation factor sub-pixels may be selected that faces each other with the current estimation factor sub-pixel 1600 at the center, from among the sub-pixels 1611, 1612, 1621, 1622, 1631, 1632, 1641, 1642, 1651, 1652, 1661, 1662, 1671, 1672, 1681, and 1682 that are apart from the current estimation factor sub-pixel 1600 by the 1/2-pixel distance or the 1/4-pixel distance in the rectilinear directions with 0-degree, 45-degree, 90-degree, and 135-degree angles.
In more detail, a first group 1610 may be determined as a hypothesis estimation factor sub-pixel group, where the first group 1610 includes the sub-pixels 1611 and 1612 that are located at the 1/4-pixel distance in the rectilinear direction with the 0-degree angle and face each other with the current estimation factor sub-pixel 1600 at the center. Similarly, a second group 1620, a third group 1630, a fourth group 1640, a fifth group 1650, a sixth group 1660, a seventh group 1670, and an eighth group 1680 may be determined as hypothesis estimation factor sub-pixel groups, where the second group 1620 includes the sub-pixels 1621 and 1622 located at the 1/2-pixel distance in the rectilinear direction with the 0-degree angle, the third group 1630 includes the sub-pixels 1631 and 1632 located at the 1/4-pixel distance in the rectilinear direction with the 90-degree angle, the fourth group 1640 includes the sub-pixels 1641 and 1642 located at the 1/2-pixel distance in the rectilinear direction with the 90-degree angle, the fifth group 1650 includes the sub-pixels 1651 and 1652 located at the 1/4-pixel distance in the rectilinear direction with the 135-degree angle, the sixth group 1660 includes the sub-pixels 1661 and 1662 located at the 1/2-pixel distance in the rectilinear direction with the 135-degree angle, the seventh group 1670 includes the sub-pixels 1671 and 1672 located at the 1/4-pixel distance in the rectilinear direction with the 45-degree angle, and the eighth group 1680 includes the sub-pixels 1681 and 1682 located at the 1/2-pixel distance in the rectilinear direction with the 45-degree angle.
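The geometry described above, two sub-pixels facing each other across the current estimation factor sub-pixel on a straight line, can be sketched in quarter-pel coordinates, where a 1/4-pixel distance is 1 unit and a 1/2-pixel distance is 2 units. The unit offsets chosen for the four directions are one plausible reading of the figure (0 degrees horizontal, 90 vertical, 135 and 45 diagonal); the exact orientation of the diagonals is an assumption, and the helper names are invented for illustration.

```python
# Assumed unit offsets per direction, in quarter-pel (x, y) coordinates.
DIRECTION_OFFSETS = {0: (1, 0), 90: (0, 1), 135: (1, 1), 45: (-1, 1)}

def hypothesis_pixels(cur_x, cur_y, direction, dist_in_quarter_pel):
    """Return the two hypothesis estimation factor sub-pixels that face each
    other across the current factor sub-pixel (cur_x, cur_y) on the straight
    line with the given direction, at the given sub-pixel distance."""
    dx, dy = DIRECTION_OFFSETS[direction]
    step = dist_in_quarter_pel
    return ((cur_x + dx * step, cur_y + dy * step),
            (cur_x - dx * step, cur_y - dy * step))
```

For example, at a 1/2-pixel distance (2 quarter-pel units) in the 90-degree direction, the pair sits two units above and below the current factor sub-pixel.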
The reference block may be finally determined by the motion estimation apparatus 1400 and the motion compensation apparatus 1500 as the average block of the reference blocks indicated by the sub-pixels included in a hypothesis estimation factor sub-pixel group.
For example, when the seventh group 1670 is selected as the hypothesis estimation factor sub-pixel group, if the sub-pixel 1671 indicates a first reference block and the sub-pixel 1672 indicates a second reference block, then an average block may be generated that is formed of the averages of the pixel values of the first reference block and the second reference block at each pixel position, and the average block may be determined as the reference block of the current prediction unit.
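The pixel-wise averaging of the two reference blocks can be sketched as follows. This is a minimal illustrative helper over 2-D lists of integer samples; the function name and the rounding rule (round half up via `(a + b + 1) // 2`) are assumptions, since the text only specifies averaging per pixel position.

```python
def average_block(block_a, block_b):
    """Average two equally sized reference blocks pixel-by-pixel to form
    the final reference block (rounding convention is an assumption)."""
    return [[(a + b + 1) // 2 for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(block_a, block_b)]
```

For example, averaging co-located samples 10 and 30 gives 20.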
In this example, the hypothesis estimation mode information may indicate the group selected from among the hypothesis estimation factor sub-pixel groups, that is, the first through eighth groups 1610, 1620, 1630, 1640, 1650, 1660, 1670, and 1680. Further, because a hypothesis estimation factor sub-pixel group includes the sub-pixels that are apart from the current estimation factor sub-pixel 1600 by a particular sub-pixel distance in a particular rectilinear direction, the hypothesis estimation mode information may indicate the combination of the particular rectilinear direction and the particular sub-pixel distance.
For example, the hypothesis estimation mode information may be expressed as "mode N". Mode 1 corresponds to the first group 1610, so that mode 1 may indicate the combination of the 0-degree direction and the 1/4-pixel distance. Similarly, mode 2 corresponds to the third group 1630, so that mode 2 may indicate the combination of the 90-degree direction and the 1/4-pixel distance. Mode 3 corresponds to the fifth group 1650, so that mode 3 may indicate the combination of the 135-degree direction and the 1/4-pixel distance. Mode 4 corresponds to the seventh group 1670, so that mode 4 may indicate the combination of the 45-degree direction and the 1/4-pixel distance. Mode 5 corresponds to the second group 1620, so that mode 5 may indicate the combination of the 0-degree direction and the 1/2-pixel distance. Mode 6 corresponds to the fourth group 1640, so that mode 6 may indicate the combination of the 90-degree direction and the 1/2-pixel distance. Mode 7 corresponds to the sixth group 1660, so that mode 7 may indicate the combination of the 135-degree direction and the 1/2-pixel distance. Mode 8 corresponds to the eighth group 1680, so that mode 8 may indicate the combination of the 45-degree direction and the 1/2-pixel distance.
Figure 17 shows combinations of direction, symbol value, and distance indicated by hypothesis estimation modes according to an embodiment of the present disclosure.
For example, Figure 17 shows the combinations of the direction and sub-pixel distance of the hypothesis estimation factor sub-pixels, where the combinations respectively correspond to the modes of the hypothesis estimation mode information, and Figure 17 also shows the bit symbol values respectively corresponding to the modes.
Mode 0 does not correspond to any combination. Mode 0 may be applied to the case where hypothesis estimation factor sub-pixels in units of sub-pixels are not used for determining the reference block. Modes 1 to 4 may be applied to the hypothesis estimation factor sub-pixel groups whose sub-pixel distance is the 1/4-pixel distance, and modes 5 to 8 may be applied to the hypothesis estimation factor sub-pixel groups whose sub-pixel distance is the 1/2-pixel distance.
In addition, among the pieces of hypothesis estimation mode information indicating the same sub-pixel distance, the mode value may increase sequentially in the order of the 0-degree angle (horizontal), the 90-degree angle (vertical), the 135-degree angle (down-right), and the 45-degree angle (down-left).
Other than mode 0, the symbol values applied to the respective modes in the table shown in Figure 17 are defined as 4 bits. The bits of the mode symbol value, starting from the leftmost side, will now be described. In the case where the first bit of the mode symbol value indicates 1, the mode may be mode 0. In the case where the first bit of the mode symbol value indicates 0, the mode may be a mode other than mode 0. The second bit of the mode symbol value may indicate whether the sub-pixel distance is the 1/4-pixel distance or the 1/2-pixel distance. The third bit of the mode symbol value may indicate whether the direction is a diagonal direction or a non-diagonal direction. The fourth bit of the mode symbol value may indicate the determination between the horizontal direction and the vertical direction, or may indicate whether the diagonal direction has the 135-degree angle or the 45-degree angle.
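A decoder for these symbol values might look as follows. Only the semantics of each bit position are stated in the text; the exact 0/1 polarity of bits 2 through 4 is an assumption here (the table of Figure 17 is not reproduced), so this is a hedged sketch rather than the actual symbol table.

```python
def decode_mode_symbol(bits):
    """Decode a mode symbol value given as a bit string such as '0101'.

    Returns 0 for mode 0, otherwise a (sub-pixel distance, direction) pair.
    Bit polarities below are assumed, consistent with the stated semantics:
    bit 1: 1 means mode 0; bit 2: distance; bit 3: diagonal or not;
    bit 4: horizontal/vertical, or 135-degree/45-degree diagonal."""
    if bits[0] == '1':
        return 0                      # mode 0: no hypothesis estimation
    distance = 0.25 if bits[1] == '0' else 0.5
    if bits[2] == '0':                # non-diagonal direction
        direction = 0 if bits[3] == '0' else 90
    else:                             # diagonal direction
        direction = 135 if bits[3] == '0' else 45
    return (distance, direction)
```

Under these assumed polarities, '0000' would decode to the 1/4-pixel distance with the horizontal direction, and '0111' to the 1/2-pixel distance with the 45-degree diagonal.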
Therefore, when the information output unit 1420 of the motion estimation apparatus 1400 outputs the hypothesis estimation mode information, the information output unit 1420 may determine the mode value of the hypothesis estimation mode information according to the hypothesis estimation factor sub-pixels determined by the motion estimator 1410, and may output a bitstream of the symbol value corresponding to the determined mode value according to the table of Figure 17.
In addition, the hypothesis estimation mode determiner 1520 of the motion compensation apparatus 1500 may sequentially read the first through fourth bits of the bitstream included in the hypothesis estimation mode information parsed by the information obtainer 1510. According to the sequence of results of reading the first through fourth bits, it may be determined whether the hypothesis estimation mode is mode 0, whether the sub-pixel distance is the 1/4-pixel distance or the 1/2-pixel distance, whether the direction is a diagonal direction, and whether the direction is the horizontal direction or the vertical direction, or whether the direction is the diagonal direction with the 135-degree angle or the diagonal direction with the 45-degree angle.
The information output unit 1420 of the motion estimation apparatus 1400 may perform entropy encoding on the hypothesis estimation mode information. For example, entropy encoding may be performed on the hypothesis estimation mode information by using a CABAC method. In order to perform the CABAC method, a bit string may be generated by binarizing the hypothesis estimation mode information, and a context model may be determined for each binary digit of the bit string. Therefore, 4 context models may be determined for hypothesis estimation mode information having 4 bits. The hypothesis estimation mode information may be determined for each coding unit. In addition, the context models may be determined according to the depth of the coding unit. Entropy encoding may be performed by using the same context models for the pieces of hypothesis estimation mode information of coding units having the same depth.
The motion estimation apparatus 1400 may perform inter prediction on coding units of 64x64, 32x32, and 16x16. In this case, there are 3 depths of coding units, and 4 context models are determined for the hypothesis estimation mode information according to each of the 3 depths, so that the motion estimation apparatus 1400 may determine 12 context models for the pieces of hypothesis estimation mode information.
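The bookkeeping above can be sketched as a simple index into a pool of 12 context models. The mapping of coding-unit sizes 64x64, 32x32, and 16x16 to depths 0, 1, and 2, and the flat `depth * 4 + bin` layout, are assumptions made for illustration.

```python
NUM_BINS = 4      # one context model per bin of the 4-bit mode information
NUM_DEPTHS = 3    # assumed: 64x64 -> depth 0, 32x32 -> 1, 16x16 -> 2

def context_model_index(depth, bin_index):
    """Index into the pool of NUM_DEPTHS * NUM_BINS = 12 context models,
    so that coding units of the same depth share the same 4 models."""
    assert 0 <= depth < NUM_DEPTHS and 0 <= bin_index < NUM_BINS
    return depth * NUM_BINS + bin_index
```

All (depth, bin) pairs then map to 12 distinct indices, matching the count stated above.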
The motion compensation apparatus 1500 may restore the symbol value by performing entropy decoding on the parsed hypothesis estimation mode information. In this case, the context models may be used separately according to the depth of the coding unit, and different context models may be used for the respective binary digits of the hypothesis estimation mode information.
As described above, one piece of hypothesis estimation mode information may be determined for one coding unit. For example, the information output unit 1420 of the motion estimation apparatus 1400 may first transmit the motion vector difference information of a first prediction unit among the prediction units of the current coding unit, may then transmit the hypothesis estimation mode information of the current coding unit, and may then transmit the motion vector difference information of a second prediction unit. In this case, the information obtainer 1510 of the motion compensation apparatus 1500 may first parse the motion vector difference information of the first prediction unit of the current coding unit, may then parse the hypothesis estimation mode information of the current coding unit, and may then parse the motion vector difference information of the second prediction unit.
However, the transmission of the hypothesis estimation mode information assigned to the coding unit is not limited to the above manner.
Figure 18 shows hypothesis estimation modes as test targets for RD costs, according to an embodiment of the present disclosure.
The motion estimator 1410 of the motion estimation apparatus 1400 may select, from among the plurality of hypothesis estimation factor sub-pixel groups, the hypothesis estimation factor sub-pixel group that generates the minimum RD cost, and then the motion estimator 1410 may determine the hypothesis estimation mode according to the combination of the direction and sub-pixel distance of the selected hypothesis estimation factor sub-pixel group.
For example, the RD costs may be determined, from the RD cost generated in an encoding operation involving motion estimation using the first group 1610 of hypothesis estimation factor sub-pixels corresponding to mode 1, to the RD cost generated in an encoding operation involving motion estimation using the eighth group 1680 of hypothesis estimation factor sub-pixels corresponding to mode 8, and then the RD costs may be compared according to the modes, so that the mode generating the minimum RD cost may be selected.
As described above, the combinations of hypothesis estimation factor sub-pixels may be determined according to each sub-pixel distance and each of the directions with the 4 angles, so that the RD costs may be compared for the 8 hypothesis estimation factor sub-pixel groups.
In another embodiment, the motion estimation apparatus 1400 may first determine RD costs for the groups of hypothesis estimation factor sub-pixels that are close to the current estimation factor sub-pixel 1600, that is, apart from the current estimation factor sub-pixel 1600 by the 1/4-pixel distance. Thereafter, according to the direction of the mode generating the minimum RD cost among the modes with the 1/4-pixel distance, the motion estimation apparatus 1400 may additionally determine an RD cost for the group of hypothesis estimation factor sub-pixels located at the 1/2-pixel distance.
Referring to the table of Figure 18, as a result of determining and comparing the RD costs of modes 1, 2, 3, and 4 at the 1/4-pixel distance, when mode 1 generates the minimum RD cost, an RD cost may be additionally determined for mode 5, where mode 5 is the group of hypothesis estimation factor sub-pixels located at the 1/2-pixel distance in the same direction as mode 1. Therefore, the mode generating the smaller RD cost between the RD cost of mode 1 and the RD cost of mode 5 may be finally selected.
Similarly, as a result of comparing the RD costs at the 1/4-pixel distance, when mode 2 is selected, the RD cost of mode 6 may be additionally determined, and the comparison may be performed again. When mode 3 is selected first, the RD cost of mode 7 may be further compared with the RD cost of mode 3. When mode 4 is selected first, the RD cost of mode 8 may be further compared with the RD cost of mode 4.
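This two-stage mode decision can be sketched compactly. Here `rd_cost(mode)` is a hypothetical callback returning the rate-distortion cost of encoding with a given hypothesis estimation mode; modes 1 through 4 use the 1/4-pixel distance, and mode n pairs with mode n + 4, which reuses the same direction at the 1/2-pixel distance.

```python
def choose_hypothesis_mode(rd_cost):
    """Two-stage hypothesis estimation mode decision.

    Stage 1: compare the 4 modes at the 1/4-pixel distance.
    Stage 2: evaluate only the 1/2-pixel mode sharing the winning direction.
    Exactly 5 RD costs are evaluated in total (4 + 1)."""
    quarter_pel_modes = [1, 2, 3, 4]
    best_quarter = min(quarter_pel_modes, key=rd_cost)
    half_pel_mode = best_quarter + 4   # same direction, 1/2-pixel distance
    return min(best_quarter, half_pel_mode, key=rd_cost)
```

For example, if mode 1 wins at the 1/4-pixel distance, only mode 5 is additionally tested, and the cheaper of the two is returned.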
Therefore, the motion estimation apparatus 1400 according to another embodiment may determine and compare the RD costs of 5 hypothesis estimation factor sub-pixel groups, namely the 4 modes at the 1/4-pixel distance and one further mode at the 1/2-pixel distance, so that the motion estimation apparatus 1400 may determine the best hypothesis estimation factor sub-pixel group and the best hypothesis estimation mode.
Figure 19 shows a flowchart of a motion estimation method according to an embodiment of the present disclosure.
In operation 1910, a current motion vector may be determined for the inter prediction of a current prediction unit among a plurality of prediction units included in a coding unit. A previously determined motion vector may be obtained.
In operation 1920, two hypothesis estimation factor sub-pixels may be determined, such that the two hypothesis estimation factor sub-pixels face each other with the current estimation factor sub-pixel at the center in a predetermined rectilinear direction, from among the hypothesis estimation factor sub-pixels apart from the current estimation factor sub-pixel by a predetermined sub-pixel distance, where the current estimation factor sub-pixel is indicated by the current motion vector.
One sub-pixel distance may be selected from the 1/4-pixel distance and the 1/2-pixel distance, and one direction may be selected from the rectilinear directions with the 0-degree, 90-degree, 45-degree, and 135-degree angles. The two hypothesis estimation factor sub-pixels may be determined according to the combination of the selected sub-pixel distance and the selected direction.
First, RD costs may be calculated by using the hypothesis estimation factor sub-pixels that are respectively apart from the current estimation factor sub-pixel by the 1/4-pixel distance in the rectilinear directions with the 0-degree, 90-degree, 45-degree, and 135-degree angles. Thereafter, an RD cost may be additionally calculated by using the hypothesis estimation factor sub-pixels located at the 1/2-pixel distance in the direction generating the minimum RD cost among the RD costs calculated at the 1/4-pixel distance. Finally, the rectilinear direction and sub-pixel distance of the hypothesis estimation factor sub-pixels generating the minimum RD cost among the 5 calculated RD costs may be determined.
The reference block may be determined by using the blocks that respectively include the two determined hypothesis estimation factor sub-pixels. A motion vector indicating the position difference between the current prediction unit and the reference block may be determined. Residual data may be determined as the difference between the pixel values of the current prediction unit and the pixel values of the reference block.
In operation 1930, the hypothesis estimation mode information of the coding unit may be output, where the hypothesis estimation mode information indicates the combination of the predetermined sub-pixel distance selected from the two or more sub-pixel distances and the predetermined rectilinear direction selected from the two or more rectilinear directions.
In this example, the hypothesis estimation mode is the combination of one sub-pixel distance selected from the two sub-pixel distances and one rectilinear direction selected from the rectilinear directions, so that the hypothesis estimation mode may be selected from the 8 combinations.
In another embodiment, motion vector difference information may be output instead of the motion vector, where the motion vector difference information indicates the difference between the current motion vector and the motion vector of a prediction unit encoded before the current prediction unit.
In another embodiment, the hypothesis estimation mode information of the coding unit, and the motion vector difference information and residual data of the prediction units included in the coding unit, may be output.
In order to perform entropy encoding on the hypothesis estimation mode information, a context model for the hypothesis estimation mode information may be determined according to the depth of the coding unit. In addition, entropy encoding may be performed on the hypothesis estimation mode information by using four context models corresponding to the depth of the current coding unit.
Figure 20 shows a flowchart of a motion compensation method according to an embodiment of the present disclosure.
In operation 2010, the motion vector of a prediction unit included in a coding unit and the hypothesis estimation mode information of the coding unit may be obtained.
In another embodiment, motion vector difference information and the hypothesis estimation mode information of the coding unit may be obtained, wherein the motion vector difference information indicates the difference between the current motion vector and the motion vector of a prediction unit encoded before the current prediction unit. In addition, residual data between each prediction unit and its reference block may be obtained according to prediction units. Quantized transform coefficients may be obtained according to prediction units, and inverse quantization and inverse transformation may then be performed on the quantized transform coefficients, so that the residual data may be obtained.
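The last step, recovering residual data from quantized transform coefficients, can be sketched as follows. A 1-D orthonormal DCT-II is used here only as an illustrative stand-in; the actual transform and quantization scheme of the codec are not specified by this passage.

```python
import math

def inverse_quantize(levels, qstep):
    # Inverse quantization: scale each quantized level by the
    # quantization step (uniform quantizer assumed for illustration).
    return [level * qstep for level in levels]

def inverse_dct(coeffs):
    # Inverse of a 1-D orthonormal DCT-II (stand-in for the codec's
    # actual inverse transform).
    n = len(coeffs)
    out = []
    for x in range(n):
        s = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            s += scale * c * math.cos(math.pi * (2 * x + 1) * k / (2 * n))
        out.append(s)
    return out

def residual_from_coefficients(levels, qstep):
    # Residual data = inverse transform of the dequantized coefficients.
    return inverse_dct(inverse_quantize(levels, qstep))
```

A DC-only coefficient vector, for instance, dequantizes and inverse-transforms into a flat residual row, mirroring how a constant prediction error would round-trip through this pipeline.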
In the present embodiment, four context models corresponding to the depth of the current coding unit may be determined by using the context model of the hypothesis estimation mode information determined according to the depth of the coding unit, and entropy decoding may be performed on each bin of the 4-bit hypothesis estimation mode information by using the respective context models.
In operation 2020, a combination of a predetermined sub-pixel distance selected from among two or more sub-pixel distances and a predetermined straight-line direction selected from among two or more straight-line directions may be determined according to the hypothesis estimation mode information.
According to the hypothesis estimation mode information, a combination of a sub-pixel distance selected from between the 1/4-pixel distance and the 1/2-pixel distance and a direction selected from among the straight-line directions at angles of 0°, 45°, 90°, and 135° may be determined.
In operation 2030, a reference block may be determined by using two blocks that respectively include two hypothesis estimation factor sub-pixels located at the predetermined sub-pixel distance from the current estimation factor sub-pixel in the predetermined straight-line direction, wherein the current estimation factor sub-pixel is indicated by the current motion vector.
A recovery block of the current prediction unit may be generated by combining the residual data obtained in operation 2010 with the reference block determined in operation 2030.
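Operations 2030 and the reconstruction step can be sketched as follows. Two assumptions are made for illustration: the sub-pixel offsets are derived from the distance and angle by simple trigonometry, and the reference block is formed by averaging the two hypothesis blocks. The disclosure states only that the reference block is determined using both blocks, so the averaging rule is one plausible combination, not the mandated one.

```python
import math

def hypothesis_positions(mv_x, mv_y, distance, angle_deg):
    """Place the two hypothesis factor sub-pixels opposite each other on
    the selected straight line, centered on the current estimation factor
    sub-pixel (mv_x, mv_y), at the selected sub-pixel distance."""
    dx = distance * math.cos(math.radians(angle_deg))
    dy = distance * math.sin(math.radians(angle_deg))
    return (mv_x + dx, mv_y + dy), (mv_x - dx, mv_y - dy)

def reconstruct(hyp_block_a, hyp_block_b, residual):
    """Recovery block = reference block (here: the average of the two
    hypothesis blocks, an illustrative assumption) plus the residual."""
    return [[(a + b) / 2 + r for a, b, r in zip(row_a, row_b, row_r)]
            for row_a, row_b, row_r in zip(hyp_block_a, hyp_block_b, residual)]
```

For an angle of 0° and a 1/4-pixel distance, the two hypothesis sub-pixels land a quarter pixel to the left and right of the position the motion vector points at.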
According to the video encoding method based on coding units having a tree structure described above with reference to Figures 6 through 19, image data of the spatial domain is encoded for each coding unit of the tree structure. According to the video decoding method based on coding units having a tree structure, decoding is performed for each maximum coding unit to restore the image data of the spatial domain. Thus, pictures, and video that is a sequence of pictures, may be restored. The restored video may be reproduced by a reproducing apparatus, stored in a storage medium, or transmitted through a network.
The embodiments of the present disclosure may be written as computer programs and may be implemented in general-purpose digital computers that execute the programs by using a computer-readable recording medium. Examples of the computer-readable recording medium include magnetic storage media (e.g., ROM, floppy disks, hard disks, etc.) and optical recording media (e.g., CD-ROMs or DVDs).
While the present disclosure has been particularly shown and described with reference to exemplary embodiments thereof, it will be understood by those of ordinary skill in the art that various changes in form and details may be made in the exemplary embodiments without departing from the spirit and scope of the present disclosure as defined by the appended claims.
Programs for executing one or more embodiments of each of the multi-view motion estimation method, the motion compensation method, the video encoding apparatus, and the video decoding apparatus described with reference to Figures 1 through 20 may be stored in a computer-readable storage medium, so that a stand-alone computer system may easily perform operations according to the embodiments stored in the storage medium.
For convenience of description, the video encoding method including the motion estimation method and the motion compensation method described with reference to Figures 1 through 20 will be collectively referred to as the "video encoding method according to the present disclosure". In addition, the video decoding method including the motion compensation method described with reference to Figures 1 through 20 will be referred to as the "video decoding method according to the present disclosure".
The video encoding apparatus including the video encoding apparatus 100, the image encoder 400, the motion estimation apparatus 1400, or the motion compensation apparatus 1500 described with reference to Figures 1 through 20 will be referred to as the "video encoding apparatus according to the present disclosure". In addition, the video decoding apparatus including the video decoding apparatus 200, the image decoder 500, or the motion compensation apparatus 1500 described with reference to Figures 1 through 18 will be referred to as the "video decoding apparatus according to the present disclosure".
A computer-readable recording medium storing a program according to an embodiment of the present disclosure (e.g., a disc 26000) will now be described in detail.
Figure 21 shows a physical structure of the disc 26000 in which a program is stored, according to an embodiment of the present disclosure. The disc 26000, which is a storage medium, may be a hard drive, a compact disc read-only memory (CD-ROM) disc, a Blu-ray disc, or a digital versatile disc (DVD). The disc 26000 includes a plurality of concentric tracks Tr, each of which is divided into a specific number of sectors Se along a circumferential direction of the disc 26000. In a specific region of the disc 26000, a program that executes the motion estimation method, the motion compensation method, the video encoding method, and the video decoding method described above may be assigned and stored.
A computer system embodied by using a storage medium that stores a program for executing the video encoding method and the video decoding method as described above will now be described with reference to Figure 22.
Figure 22 shows a disc drive 26300 for recording and reading a program by using the disc 26000. A computer system 26500 may store, in the disc 26000 via the disc drive 26300, a program that executes at least one of the video encoding method and the video decoding method according to an embodiment of the present disclosure. To run the program stored in the disc 26000 in the computer system 26500, the program may be read from the disc 26000 and transmitted to the computer system 26500 by using the disc drive 26300.
The program that executes at least one of the video encoding method and the video decoding method according to an embodiment of the present disclosure may be stored not only in the disc 26000 shown in Figures 21 and 22 but also in a memory card, a ROM cassette, or a solid-state drive (SSD).
A system to which the video encoding method and the video decoding method described above are applied will be described below.
Figure 23 shows an overall structure of a content supply system 11000 that provides a content distribution service. A service area of a communication system is divided into cells of predetermined size, and wireless base stations 11700, 11800, 11900, and 12000 are respectively installed in these cells.
The content supply system 11000 includes a plurality of independent devices. For example, the plurality of independent devices, such as a computer 12100, a personal digital assistant (PDA) 12200, a video camera 12300, and a mobile phone 12500, are connected to the Internet 11100 via an Internet service provider 11200, a communication network 11400, and the wireless base stations 11700, 11800, 11900, and 12000.
However, the content supply system 11000 is not limited to the structure shown in Figure 23, and devices may be selectively connected to the content supply system 11000. The plurality of independent devices may be directly connected to the communication network 11400, rather than via the wireless base stations 11700, 11800, 11900, and 12000.
The video camera 12300 is an imaging device capable of capturing video images, for example, a digital video camera. The mobile phone 12500 may employ at least one communication method from among various protocols, e.g., Personal Digital Communications (PDC), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (W-CDMA), Global System for Mobile Communications (GSM), and Personal Handyphone System (PHS).
The video camera 12300 may be connected to a streaming server 11300 via the wireless base station 11900 and the communication network 11400. The streaming server 11300 allows content received from a user via the video camera 12300 to be streamed via a real-time broadcast. The content received from the video camera 12300 may be encoded by using the video camera 12300 or the streaming server 11300. Video data captured by the video camera 12300 may be transmitted to the streaming server 11300 via the computer 12100.
Video data captured by a camera 12600 may also be transmitted to the streaming server 11300 via the computer 12100. Like a digital camera, the camera 12600 is an imaging device capable of capturing both still images and video images. The video data captured by the camera 12600 may be encoded by using the camera 12600 or the computer 12100. Software that performs video encoding and decoding may be stored in a computer-readable recording medium (e.g., a CD-ROM disc, a floppy disk, a hard disk drive, an SSD, or a memory card) that may be accessed by the computer 12100.
If video data is captured by a camera built into the mobile phone 12500, the video data may be received from the mobile phone 12500.
The video data may also be encoded by a large-scale integrated circuit (LSI) system installed in the video camera 12300, the mobile phone 12500, or the camera 12600.
According to an embodiment of the present disclosure, the content supply system 11000 may encode content data recorded by a user using the video camera 12300, the camera 12600, the mobile phone 12500, or another imaging device (e.g., content recorded during a concert), and may transmit the encoded content data to the streaming server 11300. The streaming server 11300 may transmit the encoded content data, in the form of streaming content, to other clients that request the content data.
The clients are devices capable of decoding the encoded content data, e.g., the computer 12100, the PDA 12200, the video camera 12300, or the mobile phone 12500. Thus, the content supply system 11000 allows the clients to receive and reproduce the encoded content data. In addition, the content supply system 11000 allows the clients to receive the encoded content data in real time and to decode and reproduce it, thereby enabling personal broadcasting.
Encoding and decoding operations of the plurality of independent devices included in the content supply system 11000 may be similar to those of the video encoding apparatus and the video decoding apparatus according to an embodiment of the present disclosure.
The mobile phone 12500 included in the content supply system 11000 according to an embodiment of the present disclosure will now be described in greater detail with reference to Figures 24 and 25.
Figure 24 shows an external structure of the mobile phone 12500 to which the video encoding method and the video decoding method are applied, according to an embodiment of the present disclosure. The mobile phone 12500 may be a smart phone, the functions of which are not limited and a large number of the functions of which may be changed or expanded.
The mobile phone 12500 includes an internal antenna 12510 via which a radio-frequency (RF) signal may be exchanged with the wireless base station 12000 of Figure 25, and includes a display screen 12520 (e.g., a liquid crystal display (LCD) or an organic light-emitting diode (OLED) screen) for displaying images captured by a camera 12530 or images received via the antenna 12510 and decoded. The smart phone 12500 includes an operation panel 12540 including a control button and a touch panel. If the display screen 12520 is a touch screen, the operation panel 12540 further includes a touch-sensing panel of the display screen 12520. The mobile phone 12500 includes a speaker 12580 or another type of sound output unit for outputting voice and sound, and a microphone 12550 or another type of sound input unit for inputting voice and sound. The mobile phone 12500 further includes the camera 12530, such as a charge-coupled device (CCD) camera, for capturing video and still images. The mobile phone 12500 may further include: a storage medium 12570 for storing encoded/decoded data (e.g., video or still images) captured by the camera 12530, received via e-mail, or obtained in various ways; and a slot 12560 via which the storage medium 12570 is loaded into the mobile phone 12500. The storage medium 12570 may be a flash memory, e.g., a secure digital (SD) card or an electrically erasable and programmable read-only memory (EEPROM) included in a plastic case.
Figure 25 shows an internal structure of the mobile phone 12500, according to an embodiment of the present disclosure. To systematically control the parts of the mobile phone 12500 including the display screen 12520 and the operation panel 12540, a power supply circuit 12700, an operation input controller 12640, an image encoding unit 12720, a camera interface 12630, an LCD controller 12620, an image decoding unit 12690, a multiplexer/demultiplexer 12680, a recording/reading unit 12670, a modulation/demodulation unit 12660, and a sound processor 12650 are connected to a central controller 12710 via a synchronization bus 12730.
If a user operates a power button and sets it from a "power off" state to a "power on" state, the power supply circuit 12700 supplies power to all the parts of the mobile phone 12500 from a battery pack, thereby setting the mobile phone 12500 in an operation mode.
The central controller 12710 includes a central processing unit (CPU), a ROM, and a random access memory (RAM).
While the mobile phone 12500 transmits communication data to the outside, a digital signal is generated by the mobile phone 12500 under control of the central controller. For example, the sound processor 12650 may generate a digital sound signal, the image encoding unit 12720 may generate a digital image signal, and text data of a message may be generated via the operation panel 12540 and the operation input controller 12640. When a digital signal is transmitted to the modulation/demodulation unit 12660 under control of the central controller 12710, the modulation/demodulation unit 12660 modulates a frequency band of the digital signal, and a communication circuit 12610 performs digital-to-analog conversion (DAC) and frequency conversion on the frequency-band-modulated digital sound signal. A transmission signal output from the communication circuit 12610 may be transmitted to a voice communication base station or the wireless base station 12000 via the antenna 12510.
For example, when the mobile phone 12500 is in a conversation mode, a sound signal obtained via the microphone 12550 is transformed into a digital sound signal by the sound processor 12650 under control of the central controller 12710. The digital sound signal may be transformed into a transmission signal via the modulation/demodulation unit 12660 and the communication circuit 12610, and may be transmitted via the antenna 12510.
When a text message, e.g., e-mail, is transmitted in a data communication mode, text data of the text message is input via the operation panel 12540 and is transmitted to the central controller 12710 via the operation input controller 12640. Under control of the central controller 12710, the text data is transformed into a transmission signal via the modulation/demodulation unit 12660 and the communication circuit 12610, and is transmitted to the wireless base station 12000 via the antenna 12510.
To transmit image data in the data communication mode, image data captured by the camera 12530 is provided to the image encoding unit 12720 via the camera interface 12630. The captured image data may be directly displayed on the display screen 12520 via the camera interface 12630 and the LCD controller 12620.
A structure of the image encoding unit 12720 may correspond to that of the video encoding apparatus 100 described above. The image encoding unit 12720 may transform the image data received from the camera 12530 into compressed and encoded image data according to the video encoding method employed by the video encoding apparatus 100 or the image encoder 400 described above, and may then output the encoded image data to the multiplexer/demultiplexer 12680. During a recording operation of the camera 12530, a sound signal obtained by the microphone 12550 of the mobile phone 12500 may be transformed into digital sound data via the sound processor 12650, and the digital sound data may be transmitted to the multiplexer/demultiplexer 12680.
The multiplexer/demultiplexer 12680 multiplexes the encoded image data received from the image encoding unit 12720 together with the sound data received from the sound processor 12650. A result of multiplexing the data may be transformed into a transmission signal via the modulation/demodulation unit 12660 and the communication circuit 12610, and may then be transmitted via the antenna 12510.
While the mobile phone 12500 receives communication data from the outside, frequency recovery and ADC may be performed on a signal received via the antenna 12510 to transform the signal into a digital signal. The modulation/demodulation unit 12660 modulates a frequency band of the digital signal. The frequency-band-modulated digital signal is transmitted to the video decoding unit 12690, the sound processor 12650, or the LCD controller 12620, according to the type of the digital signal.
In the conversation mode, the mobile phone 12500 amplifies a signal received via the antenna 12510 and obtains a digital sound signal by performing frequency conversion and ADC on the amplified signal. Under control of the central controller 12710, the received digital sound signal is transformed into an analog sound signal via the modulation/demodulation unit 12660 and the sound processor 12650, and the analog sound signal is output via the speaker 12580.
When, in the data communication mode, data of a video file accessed on an Internet website is received, a signal received from the wireless base station 12000 via the antenna 12510 is output as multiplexed data via the modulation/demodulation unit 12660, and the multiplexed data is transmitted to the multiplexer/demultiplexer 12680.
To decode the multiplexed data received via the antenna 12510, the multiplexer/demultiplexer 12680 demultiplexes the multiplexed data into an encoded video data stream and an encoded audio data stream. Via the synchronization bus 12730, the encoded video data stream and the encoded audio data stream are provided to the video decoding unit 12690 and the sound processor 12650, respectively.
A structure of the image decoding unit 12690 may correspond to that of the video decoding apparatus 200 described above. The image decoding unit 12690 may decode the encoded video data to obtain restored video data according to the video decoding method employed by the video decoding apparatus 200 or the image decoder 500 described above, and may provide the restored video data to the display screen 12520 via the LCD controller 12620.
Thus, the data of the video file accessed on the Internet website may be displayed on the display screen 12520. At the same time, the sound processor 12650 may transform audio data into an analog sound signal and may provide the analog sound signal to the speaker 12580. Thus, audio data contained in the video file accessed on the Internet website may also be reproduced via the speaker 12580.
The mobile phone 12500 or another type of communication terminal may be a transceiving terminal including both a video encoding apparatus and a video decoding apparatus according to an embodiment of the present disclosure, may be a transceiving terminal including only the video encoding apparatus, or may be a transceiving terminal including only the video decoding apparatus.
A communication system according to the present disclosure is not limited to the communication system described above with reference to Figure 24. For example, Figure 26 shows a digital broadcasting system employing a communication system, according to an embodiment of the present disclosure. The digital broadcasting system of Figure 26 may receive a digital broadcast transmitted via a satellite or a terrestrial network by using a video encoding apparatus and a video decoding apparatus according to an embodiment of the present disclosure.
More specifically, a broadcasting station 12890 transmits a video data stream to a communication satellite or a broadcasting satellite 12900 by using radio waves. The broadcasting satellite 12900 transmits a broadcast signal, and the broadcast signal is transmitted to a satellite broadcast receiver via a household antenna 12860. In every house, an encoded video stream may be decoded and reproduced by a TV receiver 12810, a set-top box 12870, or another device.
When a video decoding apparatus according to an embodiment of the present disclosure is implemented in a reproducing apparatus 12830, the reproducing apparatus 12830 may parse and decode an encoded video stream recorded on a storage medium 12820, such as a disc or a memory card, to restore digital signals. Thus, the restored video signal may be reproduced, for example, on a monitor 12840.
A video decoding apparatus according to an embodiment of the present disclosure may be installed in the set-top box 12870 connected to the antenna 12860 for a satellite/terrestrial broadcast or to a cable antenna 12850 for receiving a cable television (TV) broadcast. Data output from the set-top box 12870 may also be reproduced on a TV monitor 12880.
As another example, a video decoding apparatus according to an embodiment of the present disclosure may be installed in the TV receiver 12810 itself, instead of the set-top box 12870.
An automobile 12920 including an appropriate antenna 12910 may receive a signal transmitted from the satellite 12900 or the wireless base station 11700. A decoded video may be reproduced on a display screen of an automobile navigation system 12930 installed in the automobile 12920.
A video signal may be encoded by a video encoding apparatus according to an embodiment of the present disclosure and may then be stored in a storage medium. Specifically, an image signal may be stored in a DVD disc 12960 by a DVD recorder, or may be stored in a hard disk by a hard disk recorder 12950. As another example, the video signal may be stored in an SD card 12970. If the hard disk recorder 12950 includes a video decoding apparatus according to an embodiment of the present disclosure, a video signal recorded on the DVD disc 12960, the SD card 12970, or another storage medium may be reproduced on the TV monitor 12880.
The automobile navigation system 12930 may not include the camera 12530, the camera interface 12630, and the image encoding unit 12720 of Figure 23. For example, the computer 12100 and the TV receiver 12810 may not include the camera 12530, the camera interface 12630, and the image encoding unit 12720 of Figure 23.
Figure 27 shows a network structure of a cloud computing system using a video encoding apparatus and a video decoding apparatus, according to an embodiment of the present disclosure.
The cloud computing system may include a cloud computing server 14000, a user database (DB) 14100, a plurality of computing resources 14200, and a user terminal.
In response to a request from the user terminal, the cloud computing system provides an on-demand outsourcing service of the plurality of computing resources 14200 via a data communication network, e.g., the Internet. Under a cloud computing environment, a service provider provides users with desired services by combining, through virtualization technology, computing resources located at data centers in physically different locations. A service user does not have to install computing resources (e.g., applications, storage, an operating system (OS), and security software) in his/her own terminal in order to use them, but may select and use desired services from among services in a virtual space generated through the virtualization technology, at a desired point in time.
A user terminal of a specified service user is connected to the cloud computing server 14000 via a data communication network including the Internet and a mobile telecommunication network. User terminals may be provided with cloud computing services, and particularly video reproduction services, from the cloud computing server 14000. The user terminal may be any of various types of electronic devices capable of connecting to the Internet, e.g., a desktop PC 14300, a smart TV 14400, a smart phone 14500, a notebook computer 14600, a portable media player (PMP) 14700, a tablet PC 14800, and the like.
The cloud computing server 14000 may combine the plurality of computing resources 14200 distributed in a cloud network and may provide user terminals with a result of the combining. The plurality of computing resources 14200 may include various data services, and may include data uploaded from user terminals. As described above, the cloud computing server 14000 may provide user terminals with desired services by combining video databases distributed in different regions according to the virtualization technology.
User information about users who have subscribed to the cloud computing service is stored in the user DB 14100. The user information may include registration information, addresses, names, and personal credit information of the users. The user information may further include indexes of videos. Here, the indexes may include a list of videos that have already been reproduced, a list of videos that are being reproduced, a pause point of a video that was being reproduced, and the like.
Information about a video stored in the user DB 14100 may be shared between user devices. For example, when a video service is provided to the notebook computer 14600 in response to a request from the notebook computer 14600, a reproduction history of the video service is stored in the user DB 14100. When a request to reproduce this video service is received from the smart phone 14500, the cloud computing server 14000 searches for and reproduces this video service based on the user DB 14100. When the smart phone 14500 receives a video data stream from the cloud computing server 14000, a process of reproducing video by decoding the video data stream is similar to the operation of the mobile phone 12500 described above with reference to Figures 24 and 25.
The cloud computing server 14000 may refer to a reproduction history of a desired video service stored in the user DB 14100. For example, the cloud computing server 14000 receives, from a user terminal, a request to reproduce a video stored in the user DB 14100. If this video was reproduced before, the method of streaming this video performed by the cloud computing server 14000 may vary according to the request from the user terminal, i.e., according to whether the video is to be reproduced from its start point or from its pause point. For example, if the user terminal requests to reproduce the video from its start point, the cloud computing server 14000 transmits streaming data of the video, starting from a first frame thereof, to the user terminal. If the user terminal requests to reproduce the video from its pause point, the cloud computing server 14000 transmits streaming data of the video, starting from a frame corresponding to the pause point, to the user terminal.
In this case, the user terminal may include a video decoding apparatus as described above with reference to Figures 1 through 20. As another example, the user terminal may include a video encoding apparatus as described above with reference to Figures 1 through 20. Alternatively, the user terminal may include both the video decoding apparatus and the video encoding apparatus as described above with reference to Figures 1 through 20.
Various applications of the video encoding method, the video decoding method, the video encoding apparatus, and the video decoding apparatus according to embodiments of the present disclosure described above with reference to Figures 1 through 20 have been described above with reference to Figures 21 through 27. However, methods of storing the video encoding method and the video decoding method in a storage medium, or methods of implementing the video encoding apparatus and the video decoding apparatus in a device, according to various embodiments of the present disclosure, are not limited to the embodiments described above with reference to Figures 21 through 27.
Claims (13)
1. A motion compensation method using a motion vector estimation factor, the motion compensation method comprising:
obtaining a current motion vector of a prediction unit included in a coding unit, and obtaining hypothesis estimation mode information of the coding unit;
determining a combination of a sub-pixel distance and a straight-line direction based on the hypothesis estimation mode information, wherein the sub-pixel distance is selected from among two or more predetermined sub-pixel distances, and the straight-line direction is selected from among two or more predetermined straight-line directions; and
determining a reference block by using two blocks that respectively include two hypothesis estimation factor sub-pixels,
wherein the two hypothesis estimation factor sub-pixels face each other on a straight line in the selected straight-line direction, centered on a current estimation factor sub-pixel, and are located at the selected sub-pixel distance from the current estimation factor sub-pixel,
the current estimation factor sub-pixel is indicated by the current motion vector,
the hypothesis estimation mode information indicates one combination from among a plurality of combinations of the sub-pixel distance selected from among the two or more predetermined sub-pixel distances and the straight-line direction selected from among the two or more predetermined straight-line directions,
the obtaining of the hypothesis estimation mode information of the coding unit comprises performing entropy decoding on the hypothesis estimation mode information by using a context model corresponding to a depth of the coding unit, and
the determining of the combination of the sub-pixel distance and the straight-line direction comprises determining the combination of the sub-pixel distance and the straight-line direction for the current motion vector, based on the entropy-decoded hypothesis estimation mode information.
2. The motion compensation method of claim 1, wherein the step of obtaining the hypothesis estimation mode information comprises:
obtaining the hypothesis estimation mode information and motion vector difference information, wherein the motion vector difference information indicates a difference between the current motion vector and a motion vector of a prediction unit encoded before the current prediction unit; and
obtaining residual data between the current prediction unit and the reference block,
wherein the motion compensation method further comprises: generating a restoration block of the current prediction unit by merging the residual data with the reference block.
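The restoration step of claim 2 can be sketched elementwise. This assumes the reference block is the average of the two hypothesis estimator blocks (as in the averaging of claim 11) and that merging means adding the residual; integer averaging without clipping is a simplification, and all names are hypothetical.

```python
def restore_block(residual, hyp_block_a, hyp_block_b):
    """Average the two hypothesis estimator blocks into a reference block,
    then merge it with the residual data by elementwise addition."""
    return [
        [(a + b) // 2 + r for a, b, r in zip(row_a, row_b, row_r)]
        for row_a, row_b, row_r in zip(hyp_block_a, hyp_block_b, residual)
    ]
```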
3. The motion compensation method of claim 1, wherein the step of obtaining the hypothesis estimation mode information comprises:
obtaining the hypothesis estimation mode information determined jointly for the prediction units included in the current coding unit.
4. The motion compensation method of claim 1, wherein the two or more predetermined sub-pixel distances comprise a 1/4-pixel distance and a 1/2-pixel distance, and the two or more predetermined rectilinear directions comprise directions at angles of 0 degrees, 90 degrees, 135 degrees, and 45 degrees,
wherein the hypothesis estimation mode information indicates one of 8 combinations of a sub-pixel distance selected from the two or more predetermined sub-pixel distances and a rectilinear direction selected from the two or more predetermined rectilinear directions.
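The 8 combinations named in claim 4 are simply the Cartesian product of the two distances and the four directions; a minimal sketch (the string and tuple encodings are assumptions):

```python
from itertools import product

# The claimed sub-pixel distances (1/4 pel and 1/2 pel) and direction angles.
DISTANCES = ("quarter", "half")
ANGLES = (0, 90, 135, 45)

# All distance/direction combinations a hypothesis estimation mode can signal.
MODES = tuple(product(DISTANCES, ANGLES))
```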
5. The motion compensation method of claim 1, wherein the step of obtaining the hypothesis estimation mode information of the coding unit comprises:
determining a context model for the hypothesis estimation mode information according to the depth of the coding unit; and
performing entropy decoding on the hypothesis estimation mode information by using 4 context models corresponding to the depth of the current coding unit.
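Claim 5 ties the context model to the coding-unit depth, with 4 context models in total. One plausible mapping is sketched below; clamping depths beyond the last model is an assumption the claim does not state, and the function name is hypothetical.

```python
NUM_CONTEXT_MODELS = 4  # four context models indexed by coding-unit depth

def context_index(cu_depth):
    """Select the context model for the hypothesis estimation mode
    information from the coding-unit depth (deeper depths clamp to the
    last model -- an assumed convention)."""
    return min(cu_depth, NUM_CONTEXT_MODELS - 1)
```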
6. A motion estimation method using a motion vector estimator, the motion estimation method comprising:
determining, among prediction units included in a coding unit, a current motion vector for inter prediction of a current prediction unit;
determining a reference block by using two blocks that respectively include two hypothesis estimator sub-pixels, wherein the two hypothesis estimator sub-pixels are opposite each other on a straight line that runs, in a rectilinear direction selected from two or more predetermined rectilinear directions, through a current estimator sub-pixel, the two hypothesis estimator sub-pixels are among multiple hypothesis estimator sub-pixels located at a sub-pixel distance, selected from two or more predetermined sub-pixel distances, from the current estimator sub-pixel, and wherein the current estimator sub-pixel is indicated by the current motion vector; and
outputting hypothesis estimation mode information of the coding unit and outputting motion vector difference information of the current prediction unit, wherein the hypothesis estimation mode information indicates the combination of the sub-pixel distance selected from the two or more predetermined sub-pixel distances and the rectilinear direction selected from the two or more predetermined rectilinear directions,
wherein the hypothesis estimation mode information indicates one combination among multiple combinations of a sub-pixel distance selected from the two or more predetermined sub-pixel distances and a rectilinear direction selected from the two or more predetermined rectilinear directions, and
the step of outputting the hypothesis estimation mode information of the coding unit comprises: performing entropy coding on the hypothesis estimation mode information by using a context model corresponding to a depth of the coding unit.
7. The motion estimation method of claim 6, wherein the step of outputting the hypothesis estimation mode information and outputting the motion vector difference information comprises:
outputting the hypothesis estimation mode information and the motion vector difference information, wherein the motion vector difference information indicates a difference between the current motion vector and a motion vector of a prediction unit encoded before the current prediction unit; and
outputting residual data between the current prediction unit and the reference block.
8. The motion estimation method of claim 6, wherein the step of outputting the hypothesis estimation mode information comprises:
outputting the hypothesis estimation mode information determined jointly for the prediction units included in the current coding unit.
9. The motion estimation method of claim 6, wherein the two or more predetermined sub-pixel distances comprise a 1/4-pixel distance and a 1/2-pixel distance, and the two or more predetermined rectilinear directions comprise directions at angles of 0 degrees, 90 degrees, 135 degrees, and 45 degrees,
wherein the hypothesis estimation mode information indicates one of 8 combinations of a sub-pixel distance selected from the two or more predetermined sub-pixel distances and a rectilinear direction selected from the two or more predetermined rectilinear directions.
10. The motion estimation method of claim 6, wherein the step of outputting the hypothesis estimation mode information comprises:
determining a context model for the hypothesis estimation mode information according to the depth of the coding unit; and
performing entropy coding on the hypothesis estimation mode information by using 4 context models corresponding to the depth of the current coding unit.
11. The motion estimation method of claim 6, wherein the step of determining the reference block comprises:
calculating rate-distortion (RD) costs by using hypothesis estimator sub-pixels at the 1/4-pixel distance from the current estimator sub-pixel in each of the rectilinear directions at angles of 0 degrees, 90 degrees, 135 degrees, and 45 degrees;
calculating an RD cost by using hypothesis estimator sub-pixels at the 1/2-pixel distance from the current estimator sub-pixel in the direction that produces the minimum of the RD costs;
determining the rectilinear direction and the sub-pixel distance at which the minimum of the RD costs is produced; and
determining the reference block, wherein the reference block is an average block of the blocks that respectively include the hypothesis estimator sub-pixels determined based on the rectilinear direction and the sub-pixel distance at which the minimum RD cost is produced.
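The two-stage search of claim 11 can be sketched as follows. The `rd_cost(angle_deg, distance)` callable stands in for the encoder's rate-distortion evaluation and is a hypothetical interface, not an API from the patent.

```python
def two_stage_mode_search(rd_cost):
    """Claim-11-style search: stage 1 evaluates all four directions at the
    1/4-pel distance; stage 2 evaluates the 1/2-pel distance only along the
    best stage-1 direction; the (angle, distance) pair with the overall
    minimum RD cost is returned."""
    angles = (0, 90, 135, 45)
    best_angle = min(angles, key=lambda a: rd_cost(a, "quarter"))
    candidates = [(best_angle, "quarter"), (best_angle, "half")]
    return min(candidates, key=lambda c: rd_cost(*c))
```

With a cost table in which the 90-degree direction wins stage 1 and its 1/2-pel refinement is cheaper still, the search settles on the 90-degree, 1/2-pel combination.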
12. A motion compensation apparatus using a motion vector estimator, the motion compensation apparatus comprising:
an information obtainer that obtains residual data and a current motion vector of a current prediction unit among prediction units included in a coding unit, and obtains hypothesis estimation mode information of the coding unit;
a hypothesis estimation mode determiner that determines a combination of a sub-pixel distance and a rectilinear direction based on the hypothesis estimation mode information, wherein the sub-pixel distance is selected from two or more predetermined sub-pixel distances and the rectilinear direction is selected from two or more predetermined rectilinear directions; and
a motion compensator that determines a reference block by using two blocks that respectively include two hypothesis estimator sub-pixels, and generates a restoration block of the current prediction unit by merging the residual data with the reference block,
wherein the two hypothesis estimator sub-pixels are opposite each other on a straight line that runs in the selected rectilinear direction through a current estimator sub-pixel, and are each at the selected sub-pixel distance from the current estimator sub-pixel,
the current estimator sub-pixel is indicated by the current motion vector,
wherein the hypothesis estimation mode information indicates one combination among multiple combinations of a sub-pixel distance selected from the two or more predetermined sub-pixel distances and a rectilinear direction selected from the two or more predetermined rectilinear directions,
the information obtainer performs entropy decoding on the hypothesis estimation mode information by using a context model corresponding to a depth of the coding unit, and
the hypothesis estimation mode determiner determines, for the current motion vector, the combination of the sub-pixel distance and the rectilinear direction based on the entropy-decoded hypothesis estimation mode information.
13. A motion estimation apparatus using a motion vector estimator, the motion estimation apparatus comprising:
a motion estimator that determines, among prediction units included in a coding unit, a current motion vector for inter prediction of a current prediction unit, and determines a reference block by using two blocks that respectively include two hypothesis estimator sub-pixels, wherein the two hypothesis estimator sub-pixels are opposite each other on a straight line that runs, in a rectilinear direction selected from two or more predetermined rectilinear directions, through a current estimator sub-pixel, the two hypothesis estimator sub-pixels are among multiple hypothesis estimator sub-pixels located at a sub-pixel distance, selected from two or more predetermined sub-pixel distances, from the current estimator sub-pixel, and wherein the current estimator sub-pixel is indicated by the current motion vector; and
an information output unit that outputs hypothesis estimation mode information of the coding unit and outputs motion vector difference information of the current prediction unit, wherein the hypothesis estimation mode information indicates the combination of the sub-pixel distance selected from the two or more predetermined sub-pixel distances and the rectilinear direction selected from the two or more predetermined rectilinear directions,
wherein the hypothesis estimation mode information indicates one combination among multiple combinations of a sub-pixel distance selected from the two or more predetermined sub-pixel distances and a rectilinear direction selected from the two or more predetermined rectilinear directions, and
the information output unit performs entropy coding on the hypothesis estimation mode information by using a context model corresponding to a depth of the coding unit.
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/KR2013/006039 WO2015005507A1 (en) | 2013-07-08 | 2013-07-08 | Inter prediction method using multiple hypothesis estimators and device therefor |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104662905A CN104662905A (en) | 2015-05-27 |
CN104662905B true CN104662905B (en) | 2019-06-11 |
Family
ID=52280175
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201380049000.4A Expired - Fee Related CN104662905B (en) | 2013-07-08 | 2013-07-08 | Inter prediction method using multiple hypothesis estimators and device therefor
Country Status (2)
Country | Link |
---|---|
CN (1) | CN104662905B (en) |
WO (1) | WO2015005507A1 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11218721B2 (en) | 2018-07-18 | 2022-01-04 | Mediatek Inc. | Method and apparatus of motion compensation bandwidth reduction for video coding system utilizing multi-hypothesis |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101026758A (en) * | 2006-02-24 | 2007-08-29 | 三星电子株式会社 | Video transcoding method and apparatus |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8559514B2 (en) * | 2006-07-27 | 2013-10-15 | Qualcomm Incorporated | Efficient fetching for motion compensation video decoding process |
KR101403343B1 (en) * | 2007-10-04 | 2014-06-09 | 삼성전자주식회사 | Method and apparatus for inter prediction encoding/decoding using sub-pixel motion estimation |
KR101386891B1 (en) * | 2007-12-13 | 2014-04-18 | 삼성전자주식회사 | Method and apparatus for interpolating image |
KR101505815B1 (en) * | 2009-12-09 | 2015-03-26 | 한양대학교 산학협력단 | Motion estimation method and appartus providing sub-pixel accuracy, and video encoder using the same |
KR101847072B1 (en) * | 2010-04-05 | 2018-04-09 | 삼성전자주식회사 | Method and apparatus for video encoding, and method and apparatus for video decoding |
TR201819237T4 (en) * | 2011-09-14 | 2019-01-21 | Samsung Electronics Co Ltd | A Unit of Prediction (TB) Decoding Method Depending on Its Size |
2013
- 2013-07-08 WO PCT/KR2013/006039 patent/WO2015005507A1/en active Application Filing
- 2013-07-08 CN CN201380049000.4A patent/CN104662905B/en not_active Expired - Fee Related
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101026758A (en) * | 2006-02-24 | 2007-08-29 | 三星电子株式会社 | Video transcoding method and apparatus |
Also Published As
Publication number | Publication date |
---|---|
CN104662905A (en) | 2015-05-27 |
WO2015005507A1 (en) | 2015-01-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104754356B (en) | Method and apparatus for motion vector determination in video encoding or decoding | |
CN104365101B (en) | Method and apparatus for determining reference pictures for inter prediction | |
CN104488272B (en) | Method and apparatus for predicting motion vectors for encoding or decoding video | |
CN104081779B (en) | Method for inter prediction and device therefor, and method for motion compensation and device therefor | |
CN105308966B (en) | Video encoding method and apparatus thereof, and video decoding method and apparatus thereof | |
CN103918255B (en) | Method and apparatus for encoding a depth map of multi-view video data, and method and apparatus for decoding the encoded depth map | |
CN103931184B (en) | Method and apparatus for encoding and decoding video | |
CN104471938B (en) | Method and apparatus for encoding video by sharing SAO parameters according to chroma component | |
CN105144713B (en) | Method and device for encoding video for decoder setting, and method and device for decoding video based on decoder setting | |
CN104869413B (en) | Video decoding apparatus | |
CN105325004B (en) | Video encoding method and apparatus, and video decoding method and apparatus, based on signaling of sample adaptive offset (SAO) parameters | |
CN105103552B (en) | Method and device for encoding inter-layer video to compensate for luminance difference, and method and device for decoding video | |
CN104365104B (en) | Method and apparatus for multi-view video encoding and decoding | |
CN105594212B (en) | Method for determining a motion vector and apparatus therefor | |
CN106031175B (en) | Inter-layer video encoding method using luminance compensation and device therefor, and video decoding method and device therefor | |
CN106416256B (en) | Method and apparatus for encoding or decoding a depth image | |
CN105308961B (en) | Inter-layer video encoding method and apparatus and inter-layer video decoding method and apparatus for compensating for luminance difference | |
CN105308970B (en) | Method and apparatus for encoding and decoding video with respect to the position of an integer pixel | |
CN105532005B (en) | Method and apparatus for inter-layer encoding, and method and apparatus for inter-layer decoding of video using residual prediction | |
CN105340273B (en) | Method and apparatus for predicting a disparity vector for inter-layer video decoding and encoding | |
CN104662905B (en) | Inter prediction method using multiple hypothesis estimators and device therefor | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | ||
Granted publication date: 20190611; Termination date: 20210708