CN103270753A

CN103270753A - Coding and decoding utilizing context model selection with adaptive scan pattern

Info

Publication number: CN103270753A
Application number: CN2011800494129A
Authority: CN
Inventors: Jian Lou; Krit Panusopone; Limin Wang
Original assignee: General Instrument Corp
Current assignee: General Instrument Holdings Ltd; Motorola Mobility LLC
Priority date: 2010-10-14
Filing date: 2011-10-05
Publication date: 2013-08-28
Also published as: EP2606645A1; WO2012051025A1; MX2013004135A; KR20130054435A; CA2812252A1

Abstract

A coding is disclosed. The coding may include preparing video compression data based on source pictures. The preparing may include partitioning the source pictures into coding units and/or generating a transform unit having a transform array. The preparing may also include processing the generated transform unit. The processing may include generating a significance map having a significance map array with y-x locations corresponding to the y-x locations of the transform array. The processing may also include determining, utilizing a scanning pattern, a context model for coding a significance map element of a plurality of significance map elements, based on a value associated with at least one coded neighbor significance map element in the significance map array. A decoding is also disclosed, which includes processing video compression data generated in the coding.

Description

Coding and decoding utilizing context model selection with adaptive scan pattern

Priority

The present application claims the benefit of priority to U.S. Provisional Patent Application No. 61/393,198, entitled "Context Selection for Adaptive Scanning Pattern", by Jian Lou, et al., filed on October 14, 2010, the disclosure of which is hereby incorporated by reference in its entirety.

The present application also claims the benefit of priority to U.S. Provisional Patent Application No. 61/389,932, entitled "Adaptive Context Selection for Zigzag Scan", by Jian Lou, et al., filed on October 5, 2010, the disclosure of which is hereby incorporated by reference in its entirety.

Cross-reference to related applications

The present application is related to U.S. Patent Application No. 13/253,385, entitled "Coding and Decoding Utilizing Adaptive Context Model Selection with Zigzag Scan", by Jian Lou, et al., filed on October 5, 2011, which claims the benefit of priority to U.S. Provisional Patent Application No. 61/389,932, entitled "Adaptive Context Selection for Zigzag Scan", by Jian Lou, et al., filed on October 5, 2010, the disclosure of which is hereby incorporated by reference in its entirety.

Background

Video compression utilizes block processing for many operations. In block processing, a block of neighboring pixels is grouped into a coding unit, and compression operations treat this group of pixels as one unit in order to exploit correlations among neighboring pixels within the coding unit. Block-based processing often includes prediction coding and transform coding. Transform coding with quantization is a type of data compression that is commonly "lossy", because the quantization of a transform block taken from a source picture often discards data associated with that transform block. This lowers its bandwidth requirement, but often also results in quality loss when the original transform block is reproduced from the source picture.

MPEG-4 AVC, also known as H.264, is an established video compression standard that utilizes transform coding in block processing. In H.264, a picture is divided into macroblocks (MBs) of 16x16 pixels. Each MB is often further divided into smaller blocks. Blocks equal in size to or smaller than an MB are predicted using intra-/inter-picture prediction, and a spatial transform along with quantization is applied to the prediction residuals. The quantized transform coefficients of the residuals are commonly encoded using entropy coding methods (i.e., variable length coding or arithmetic coding). Context Adaptive Binary Arithmetic Coding (CABAC) was introduced in H.264 to provide a substantially lossless compression efficiency by combining an adaptive binary arithmetic coding technique with a set of context models. Context model selection plays a role in CABAC in providing a degree of adaptation and redundancy reduction. H.264 specifies two kinds of scan patterns over 2D blocks. A zigzag scan is used for pictures coded with progressive video compression techniques, and an alternative scan is used for pictures coded with interlaced video compression techniques.

H.264 uses 2D block-based transforms of block sizes 2x2, 4x4 and 8x8. A block-based transform converts a block of pixels in the spatial domain into a block of coefficients in the transform domain. Quantization then maps the transform coefficients into a finite set. After quantization, many high frequency coefficients become zero. For a block having at least one non-zero coefficient after the 2D transform and quantization operations, a significance map is developed, which specifies the positions of the non-zero quantized coefficients within the 2D transform domain. Specifically, given a quantized 2D transform block, if the value of the quantized coefficient at a position (y, x) is non-zero, it is considered significant and a "1" is assigned to the position (y, x) in the associated significance map. Otherwise, a "0" is assigned to the position (y, x) in the significance map. In H.264, CABAC is used for coding and decoding each element of the significance map.
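The quantize-then-mark rule described above can be sketched in a few lines of Python. This is only an illustration: the uniform quantizer that truncates toward zero, the step size of 8 and the function names are assumptions made for the example, and they do not reproduce the integer quantizer that H.264 actually specifies.

```python
def quantize(block, step):
    """Toy uniform quantizer (assumption): divide by step, truncate toward zero."""
    return [[int(c / step) for c in row] for row in block]


def significance_map(quantized):
    """Assign 1 to each (y, x) holding a non-zero quantized coefficient, else 0."""
    return [[1 if c != 0 else 0 for c in row] for row in quantized]


# Hypothetical 4x4 transform block; energy is concentrated at low frequencies.
block = [
    [52, -9, 3, 0],
    [-12, 7, 0, 0],
    [4, 0, 0, 0],
    [0, 0, 0, 0],
]
quantized = quantize(block, step=8)
sig = significance_map(quantized)
```

The significance map keeps the y-x layout of the transform block, so an entry of `sig` can be looked up with the same indices as the coefficient it describes.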

HEVC (High Efficiency Video Coding), the international video coding standard being developed to succeed H.264, extends transform block sizes to 16x16 and 32x32 pixels to benefit high definition (HD) video coding. In the model under consideration for HEVC, a set of transform coefficient coding and decoding tools can be enabled for entropy coding and decoding. Among these tools is an adaptive split zigzag scan scheme, which is applied for significance map coding and decoding. If the significance map array size is larger than 8x8, this scheme adaptively switches between two scan patterns for coding and decoding the significance map.

The adaptive split zigzag scan scheme directs the scan order for coding and decoding a significance map by switching, per diagonal line, between two predefined scan patterns: diagonally from bottom-left to top-right, or diagonally from top-right to bottom-left. The switching occurs at the end of each diagonal sub-scan and is controlled by two counters. The first counter, c₁, tracks the number of coded significant transform coefficients located in the bottom-left half of a transform block. The second counter, c₂, tracks the number of coded significant transform coefficients located in the top-right half of a transform block. Implementing the model under consideration for HEVC with two scan patterns and two counters introduces substantial computational complexity and additional memory requirements. These complexities include tracking the count of coded significant transform coefficients located in the bottom-left half and in the top-right half of a transform, performing branch operations, and making scan selections for coefficients in significance map coding and decoding. On the other hand, the adaptive split zigzag scan scheme achieves only a negligible performance gain. In other words, it provides no substantial gain in reducing the bandwidth requirements of the compressed data associated with transform processing.

Summary of the invention

According to principles of the invention, there are systems, methods, and computer readable mediums (CRMs) which provide for coding and decoding utilizing context model selection with adaptive scan pattern. By utilizing context model selection with adaptive scan pattern, inefficiencies in transform processing are reduced. These include inefficiencies based on overhead otherwise associated with computational complexities, including tracking the count of coded significant transform coefficients located in the bottom-left half and in the top-right half of a transform, performing branch operations, and making scan selections for coefficients in significance map coding.

According to a first principle of the invention, there is a system for coding. The system may include a processor configured to prepare video compression data based on source pictures. The preparing may include partitioning the source pictures into coding units. The preparing may also include generating at least one transform unit based on residual measures associated with the coding units, the at least one transform unit having a transform array that includes transform coefficients assigned as entries to y-x locations of the transform array. The preparing may also include processing the generated transform unit. The processing may include generating a significance map. The significance map may have a significance map array with y-x locations corresponding to the y-x locations of the transform array. The processing may also include determining, utilizing a scan pattern, a context model for coding a significance map element of a plurality of significance map elements. The determining may be based on one or more values associated with one or more coded neighbor significance map elements of the significance map element in the significance map array.

According to a second principle of the invention, there is a method for coding. The method may include preparing video compression data based on source pictures. The preparing may include partitioning the source pictures into coding units. The preparing may also include generating at least one transform unit based on residual measures associated with the coding units, the at least one transform unit having a transform array that includes transform coefficients assigned as entries to y-x locations of the transform array. The preparing may also include processing the generated transform unit. The processing may include generating a significance map. The significance map may have a significance map array with y-x locations corresponding to the y-x locations of the transform array. The processing may also include determining, utilizing a scan pattern, a context model for coding a significance map element of a plurality of significance map elements. The determining may be based on one or more values associated with one or more coded neighbor significance map elements of the significance map element in the significance map array.

According to a third principle of the invention, there is a non-transitory CRM storing computer readable instructions which, when executed by a computer system, perform a method for coding. The method may include preparing video compression data based on source pictures. The preparing may include partitioning the source pictures into coding units. The preparing may also include generating at least one transform unit based on residual measures associated with the coding units, the at least one transform unit having a transform array that includes transform coefficients assigned as entries to y-x locations of the transform array. The preparing may also include processing the generated transform unit. The processing may include generating a significance map. The significance map may have a significance map array with y-x locations corresponding to the y-x locations of the transform array. The processing may also include determining, utilizing a scan pattern, a context model for coding a significance map element of a plurality of significance map elements. The determining may be based on one or more values associated with one or more coded neighbor significance map elements of the significance map element in the significance map array.

According to a fourth principle of the invention, there is a system for decoding. The system may include an interface configured to receive video compression data. The system may also include a processor configured to process the received video compression data. The received video compression data may be based on processed transform units, and the processed transform units may be based on source pictures. The processed transform units may be prepared by partitioning the source pictures into coding units and/or generating one or more transform units. A generated transform unit may have a transform array that includes transform coefficients assigned as entries to y-x locations of the transform array. The transform coefficients may be based on residual measures associated with the coding units. The processed transform units may also be prepared by processing the generated transform unit. The processing may include generating a significance map. The significance map may have a significance map array with y-x locations that may correspond to the y-x locations of the transform array. The processing may also include determining, utilizing a scan pattern, a context model for coding a significance map element of a plurality of significance map elements. The determining may be based on one or more values associated with one or more coded neighbor significance map elements of the significance map element in the significance map array.

According to a fifth principle of the invention, there is a method for decoding. The method may include receiving video compression data. The method may also include processing the received video compression data. The received video compression data may be based on processed transform units, and the processed transform units may be based on source pictures. The processed transform units may be prepared by partitioning the source pictures into coding units and/or generating one or more transform units. A generated transform unit may have a transform array that includes transform coefficients assigned as entries to y-x locations of the transform array. The transform coefficients may be based on residual measures associated with the coding units. The processed transform units may also be prepared by processing the generated transform unit. The processing may include generating a significance map. The significance map may have a significance map array with y-x locations that may correspond to the y-x locations of the transform array. The processing may also include determining, utilizing a scan pattern, a context model for coding a significance map element of a plurality of significance map elements. The determining may be based on one or more values associated with one or more coded neighbor significance map elements of the significance map element in the significance map array.

According to a sixth principle of the invention, there is a CRM storing computer readable instructions which, when executed by a computer system, perform a method for decoding. The method may include receiving video compression data. The method may also include processing the received video compression data. The received video compression data may be based on processed transform units, and the processed transform units may be based on source pictures. The processed transform units may be prepared by partitioning the source pictures into coding units and/or generating one or more transform units. A generated transform unit may have a transform array that includes transform coefficients assigned as entries to y-x locations of the transform array. The transform coefficients may be based on residual measures associated with the coding units. The processed transform units may also be prepared by processing the generated transform unit. The processing may include generating a significance map. The significance map may have a significance map array with y-x locations that may correspond to the y-x locations of the transform array. The processing may also include determining, utilizing a scan pattern, a context model for coding a significance map element of a plurality of significance map elements. The determining may be based on one or more values associated with one or more coded neighbor significance map elements of the significance map element in the significance map array.

These and other objects are accomplished in accordance with the principles of the invention by providing systems, methods and CRMs which code and decode utilizing context model selection with adaptive scan pattern. Further features, their nature and various advantages will be more apparent from the accompanying drawings and the following detailed description of the preferred embodiments.

Brief description of the drawings

Features of the examples and disclosure are apparent to those skilled in the art from the following description with reference to the figures, in which:

Fig. 1 is a block diagram illustrating a coding system and a decoding system utilizing context model selection with adaptive scan pattern, according to an example;

Fig. 2A is a scan pattern illustrating a zigzag scan for significance map coding in transform processing, according to an example;

Fig. 2B is a scan pattern illustrating a diagonal down-left scan for significance map coding in transform processing, according to an example;

Fig. 2C is a scan pattern illustrating a diagonal top-right scan for significance map coding in transform processing, according to an example;

Fig. 2D is a scan pattern illustrating a vertical scan for significance map coding in transform processing, according to an example;

Fig. 2E is a scan pattern illustrating a horizontal scan for significance map coding in transform processing, according to an example;

Fig. 3 is a model illustrating context model selection with adaptive scan pattern in significance map coding, according to an example;

Fig. 4A is a model illustrating fixed model selection in significance map coding and decoding of a 2x2 array, according to an example;

Fig. 4B is a model illustrating fixed model selection in significance map coding and decoding of a 4x4 array, according to an example;

Fig. 4C is a model illustrating fixed model selection in significance map coding and decoding of an 8x8 array, according to an example;

Fig. 5 is a flow diagram illustrating a method for preparing a coded significance map utilizing context model selection with adaptive scan pattern, according to an example;

Fig. 6 is a flow diagram illustrating a method for coding utilizing context model selection with adaptive scan pattern, according to an example;

Fig. 7 is a flow diagram illustrating a method for decoding utilizing context model selection with adaptive scan pattern, according to an example; and

Fig. 8 is a block diagram illustrating a computer system that provides a platform for a system for coding and/or a system for decoding utilizing context model selection with adaptive scan pattern, according to examples.

Detailed description

For simplicity and illustrative purposes, the present invention is described by referring mainly to embodiments, principles and examples thereof. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the examples. It is readily apparent, however, that the embodiments may be practiced without limitation to these specific details. In other instances, some methods and structures have not been described in detail so as not to unnecessarily obscure the description. Furthermore, different embodiments are described below. The embodiments may be used or performed together in different combinations.

As used herein, the term "includes" means "includes at least" and is not limited to "includes only". The term "based on" means "based at least in part on". The term "picture" means a picture which is either equivalent to a frame or equivalent to a field associated with a frame, such as a field which is one of two sets of interlaced lines of an interlaced video frame. The term "bitstream" refers to a digital data stream. The term "coding" may refer to the encoding of an uncompressed video sequence. The term "coding" may also refer to the transcoding of a compressed video bitstream from one compressed format to another. The term "decoding" may refer to the decoding of a compressed video bitstream.

As demonstrated in the following examples and embodiments, there are systems, methods, and machine readable instructions stored on computer-readable media (e.g., CRMs) for coding and decoding utilizing context model selection with adaptive scan pattern. Referring to Fig. 1, there is disclosed a content distribution system 100 including a coding system 110 and a decoding system 140 that utilize context model selection with adaptive scan pattern.

In the coding system 110, context model selection with adaptive scan pattern is associated with preparing video compression data based on source pictures by partitioning the source pictures into coding units and processing transform units based on the coding units.

In the decoding system 140, context model selection with adaptive scan pattern is associated with decoding received video compression information. That information is prepared, utilizing context model selection with adaptive scan pattern, by preparing video compression data based on source pictures: the source pictures are partitioned into coding units, and transform units are processed based on the coding units.

Coding for transform units may include three aspects: (1) significance map coding, (2) non-zero coefficient level coding, and (3) non-zero coefficient sign coding. According to an example, transform units may be processed in generating video compression data by generating a transform unit based on residual measures associated with a coding unit, the transform unit having a transform array that includes transform coefficients assigned as entries to y-x locations of the transform array. The processing of the generated transform unit may include generating a significance map having a significance map array with y-x locations corresponding to the y-x locations of the transform array.
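As a rough illustration of the three aspects listed above, the following Python sketch splits a quantized transform array into a significance map, a list of non-zero levels and a list of sign flags, and then rebuilds the array from them. The row-major ordering of levels and signs, the convention that a sign flag of 1 means negative, and the function names are assumptions made for the example, not details taken from the description.

```python
def decompose(coeffs):
    """Split a quantized array into significance map, levels and sign flags."""
    sig = [[1 if c != 0 else 0 for c in row] for row in coeffs]
    nonzero = [c for row in coeffs for c in row if c != 0]  # row-major (assumption)
    levels = [abs(c) for c in nonzero]
    signs = [1 if c < 0 else 0 for c in nonzero]  # 1 = negative (assumption)
    return sig, levels, signs


def recompose(sig, levels, signs):
    """Rebuild the coefficient array from the three parts."""
    values = iter(-lv if s else lv for lv, s in zip(levels, signs))
    return [[next(values) if bit else 0 for bit in row] for row in sig]


coeffs = [[3, -1], [0, 2]]
sig, levels, signs = decompose(coeffs)
```

Because the significance map records where the non-zero coefficients sit, levels and signs only need to be carried for those positions.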

Generating the significance map may include checking the transform coefficients within the generated transform unit and coding the significance map elements in a significance map array corresponding to the array of the transform unit. An adaptive scan pattern may be utilized in coding the significance map elements. The adaptive scan pattern is the scan pattern determined for scanning the generated significance map in order to determine the context models used to code the significance map elements. The determination of which scan pattern is used as the adaptive scan may be based on one or more criteria, such as an efficiency goal, array size, a benchmark, etc. The adaptive scan pattern is used to scan a significance map having significance map elements in a significance map array. When the significance map elements are scanned with the adaptive scan pattern for coding, the significance map elements neighboring the element to be coded may influence which context model is selected for coding that element. One or more values associated with a neighboring significance map element, such as a sign value or a magnitude parameter, may be used as criteria for selecting the context model for coding a significance map element. In addition, which neighbor significance map elements contribute their associated values to the context model selection may be a function of a neighbor selection criterion or of the scan pattern used for the significance map. The neighbor selection criterion may vary, for example according to whether a neighbor significance map element is above and/or to the left of the significance map element in the significance map array. The scan pattern used as the adaptive scan may also influence which neighbor significance map elements contribute values to the context model selection under a given neighbor selection criterion. Coding the significance map may include coding a plurality of significance map elements in the significance map array utilizing the adaptive scan pattern. The coding system 110 and the decoding system 140 are described in greater detail below, after the following detailed description of context model selection with adaptive scan pattern.

Context model selection with adaptive scan pattern plays an important role in significance map coding and decoding. Video compression data at higher efficiency may be achieved through various mechanisms of context model selection. In one mechanism, context model selection with adaptive scan pattern takes into account the possibility that different quantization steps may be applied to different transform units of the same size. For example, the statistics of transform coefficients quantized with different quantization steps may differ. Context model selection with adaptive scan pattern overcomes this by relying on the relationships among the significance map elements within a significance map. Given a transform unit associated with a coding unit, the significance map is coded based on the transform unit following a scan pattern.

The context model for coding a significance map element may be determined based on a value associated with at least one coded neighbor significance map element of the significance map element in the significance map array, and/or based on an analysis of whether the significance map element is at a high frequency or low frequency position in the significance map array.

For example, the context model for coding a significance map element at a high frequency or low frequency position in the significance map array may be determined based on the values (0 or 1) of the element's coded neighbors (i.e., significance map elements in the same significance map array) and on the scan pattern utilized, such as zigzag, horizontal, etc.
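A minimal Python sketch of this idea follows. It walks a significance map in a given scan order and, for each element, sums the values of candidate neighbors that have already been coded. The particular neighbor offsets (left, above, below-left) and the use of the sum as a context index are assumptions chosen to show how the scan order changes which neighbors are available; the description does not fix either choice.

```python
def neighbor_contexts(sig, order, offsets=((0, -1), (-1, 0), (1, -1))):
    """Map each (y, x) to the sum of its already-coded neighbor values.

    offsets are (dy, dx) candidates: left, above, below-left (assumption).
    """
    coded = set()
    contexts = {}
    for (y, x) in order:
        candidates = [(y + dy, x + dx) for dy, dx in offsets]
        # Only neighbors visited earlier in the scan can contribute.
        contexts[(y, x)] = sum(sig[ny][nx] for ny, nx in candidates
                               if (ny, nx) in coded)
        coded.add((y, x))
    return contexts


sig = [[0, 1],
       [1, 0]]
# Two hand-written scan orders for a 2x2 array.
up_right = [(0, 0), (1, 0), (0, 1), (1, 1)]   # bottom-left first on each diagonal
down_left = [(0, 0), (0, 1), (1, 0), (1, 1)]  # top-right first on each diagonal
ctx_up = neighbor_contexts(sig, up_right)
ctx_down = neighbor_contexts(sig, down_left)
```

For the element at (0, 1), the below-left neighbor (1, 0) has already been coded in the first order but not in the second, so the two scans yield different context indices for the same significance map.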

In another example, the context model for coding a significance map element may be determined based on whether the element is at a high frequency or low frequency position in the significance map array. In this case, a benchmark is used to distinguish between high and low frequency positions, such as predefined y-x locations applied to the significance map elements in a significance map. A significance map element at a low frequency position in a significance map array may share the same context model with significance map elements in other significance map arrays that are at the same frequency position. This may be associated with the potentially high correlation among significance map elements at the same frequency position. Low frequency may be generally defined as the low frequency components of a spatial signal. Low frequency positions in a significance map may be defined by the y-x locations of the significance map elements. For example, the (0, 0) frequency location is commonly taken as a low frequency position.
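The sharing of a context model across arrays can be illustrated with a toy adaptive model in Python. Everything here is a simplified assumption: the count-based probability estimate stands in for the finite-state estimator used by real CABAC implementations, only (0, 0) is treated as a low frequency position, and all remaining positions are lumped into a single "high" context purely to keep the example short.

```python
class ContextModel:
    """Toy adaptive model: estimates P(bit == 1) from smoothed counts."""

    def __init__(self):
        self.counts = [1, 1]  # [zeros, ones], initialised to 1 each

    def p_one(self):
        return self.counts[1] / (self.counts[0] + self.counts[1])

    def update(self, bit):
        self.counts[bit] += 1


LOW_FREQ = {(0, 0)}  # predefined low frequency positions (assumption)


def context_key(y, x, low_freq=LOW_FREQ):
    """Low frequency elements are keyed by position; others share one key."""
    return ("low", y, x) if (y, x) in low_freq else ("high",)


def train(sig_maps):
    """Update one shared set of models over several significance maps."""
    models = {}
    for sig in sig_maps:
        for y, row in enumerate(sig):
            for x, bit in enumerate(row):
                models.setdefault(context_key(y, x), ContextModel()).update(bit)
    return models


maps = [[[1, 0], [0, 0]],
        [[1, 1], [0, 0]]]
models = train(maps)
```

Because the key for a low frequency element depends only on its y-x location, the element at (0, 0) in every array updates the same model, which is one way to exploit the correlation among elements at the same frequency position.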

The scan pattern for significance map coding and decoding may be predetermined for a current transform unit, a current coding unit, a current slice, a current picture and a current sequence. The scan pattern may also vary depending on the current transform unit, current coding unit, current slice, current picture and current sequence. Where the scan pattern varies, an analysis for identifying a more efficient or otherwise desirable scan pattern for significance map coding and decoding (such as a classification of the pictures, a picture analysis or some other criterion) may be used to determine the scan pattern for the significance map array for the current transform unit, current coding unit, current slice, current picture and/or current sequence. The scan pattern may be one of a plurality of scan patterns available for the current transform unit, current coding unit, current slice, current picture or current sequence. The scan pattern utilized in context model selection with adaptive scan pattern is not limited and may be, for example: a zigzag scan, such as the zigzag scan 200 shown in Fig. 2A; a diagonal down-left scan, such as the diagonal down-left scan 210 shown in Fig. 2B; a diagonal top-right scan, such as the diagonal top-right scan 220 shown in Fig. 2C; a vertical scan, such as the vertical scan 230 shown in Fig. 2D; or a horizontal scan, such as the horizontal scan 240 shown in Fig. 2E. Given a significance map associated with a transform unit, the significance map elements may be coded following the scan pattern based on the significance map array of the significance map.
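The five scan patterns named above can be expressed as orderings of (y, x) positions. The Python sketch below is one plausible reading: the figures are not reproduced here, so the starting corner and the direction along each diagonal are assumptions, as are the pattern names used as string keys.

```python
PATTERNS = ("zigzag", "diagonal_down_left", "diagonal_top_right",
            "vertical", "horizontal")


def anti_diagonals(n):
    """Yield each anti-diagonal of an n x n array, top-right element first."""
    for d in range(2 * n - 1):
        yield [(y, d - y) for y in range(n) if 0 <= d - y < n]


def scan_order(n, pattern):
    """Return the list of (y, x) positions visited by the named scan."""
    if pattern not in PATTERNS:
        raise ValueError(f"unknown scan pattern: {pattern}")
    if pattern == "horizontal":
        return [(y, x) for y in range(n) for x in range(n)]
    if pattern == "vertical":
        return [(y, x) for x in range(n) for y in range(n)]
    order = []
    for d, diag in enumerate(anti_diagonals(n)):
        if pattern == "diagonal_down_left":
            order.extend(diag)                  # top-right to bottom-left
        elif pattern == "diagonal_top_right":
            order.extend(reversed(diag))        # bottom-left to top-right
        else:                                   # zigzag alternates direction
            order.extend(diag if d % 2 else reversed(diag))
    return order
```

For example, `scan_order(3, "zigzag")` starts at (0, 0), moves to (0, 1), then (1, 0), and continues to alternate direction on each successive diagonal. Every pattern visits each position exactly once, so any of them can be substituted as the adaptive scan without changing the rest of the coding loop.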

Fig. 2A is an example of a zigzag scan 200 for significance map coding and decoding of a transform unit (that is, a transform unit having a transform array selected for adaptive context modeling). As an example, Fig. 2A shows the zigzag scan 200 for a 16x16 block. The zigzag scan is used together with context model selection to determine the order in which transform elements, such as transform coefficients, are processed. According to an example, the context model can be determined using the pattern of the zigzag scan 200. The context model can be selected based on one or more values associated with at least one coded neighbor significance map element of the significance map element in the significance map array. By comparison, a model under consideration for HEVC uses an adaptive split zigzag scan, which is described in detail below.
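A zigzag order of the kind described can be sketched as below. The function name `zigzag_scan` is hypothetical, and the starting direction (first step to the right, then alternating along anti-diagonals) is an assumption chosen to be consistent with the neighbor positions listed later in this description; Fig. 2A itself is not reproduced here.

```python
def zigzag_scan(height, width):
    """Return (y, x) positions in zigzag order.

    Assumed orientation: (0,0), (0,1), (1,0), (2,0), (1,1), (0,2), ...
    Odd anti-diagonals run toward the bottom-left, even ones toward
    the top-right.
    """
    order = []
    for d in range(height + width - 1):
        # Positions with y + x == d, listed from top-right to bottom-left.
        diag = [(y, d - y)
                for y in range(max(0, d - width + 1), min(d, height - 1) + 1)]
        if d % 2 == 0:
            diag.reverse()  # even diagonals are walked bottom-left to top-right
        order.extend(diag)
    return order

# Example: the full order for a 16x16 block has 256 entries.
order_16 = zigzag_scan(16, 16)
```

Because the order is fixed once the array size is known, it can be computed once and reused for every transform unit of that size.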

In an example according to the principles of the invention, the zigzag scan 200 can be used in adaptive context selection for a zigzag scan. In other examples, another scan can be used, such as the diagonal down-left scan 210 or the diagonal top-right scan 220, for significance map coding and decoding for all array sizes. As noted above, the scan pattern used for the adaptive scan pattern can be predetermined or selected based on a criterion. The context model for an element in the significance map is determined based on the values (0 or 1) of the element's coded neighbors. As one example of adaptive context model determination, given a significance map, the context model for an element in the significance map can be determined as shown in Fig. 3, which demonstrates an adaptive scanning criterion 300 for determining a context model for coding and decoding, including processing a transform unit. The processing can include generating a significance map having an array that corresponds to the array of the transform unit, such as a significance map array of greater than 8x8 entries. The significance map array can include significance map elements assigned as entries to y-x positions of the significance map array, based on residual measures associated with coding units based on the source pictures. For a significance map element at position (0, 0), (0, 1), or (1, 0) in an array as shown in Fig. 3, a unique context model can be assigned.

If the scan pattern is the zigzag scan 200, then for a significance map element at position (0, x > 1) in an array as shown in Fig. 3, if x is even, the context model can be selected based on the values (0 or 1) of the element's neighbors at positions (0, x-1), (0, x-2), (1, x-2), and (1, x-1). Other criteria can be used instead with the zigzag scan 200 or with another scan pattern.

If the scan pattern is the zigzag scan 200, then for a significance map element at position (y > 1, 0) in an array as shown in Fig. 3, if y is odd, the context model can be selected based on the values (0 or 1) of the element's neighbors at positions (y-1, 0), (y-2, 0), (y-2, 1), and (y-1, 1). Other criteria can be used instead with the zigzag scan 200 or with another scan pattern.

If the scan pattern is the zigzag scan 200, then for a significance map element at position (y > 0, x > 0) in an array as shown in Fig. 3, the context model can be selected based on the values (0 or 1) of the element's neighbors at positions (y-1, x-1), (y-1, x), and (y, x-1); at (y-1, x-2) and (y, x-2) if x is greater than 1; at (y+1, x-2) if x is greater than 1 and y is less than height-1; at (y-2, x-1) and (y-2, x) if y is greater than 1; at (y-2, x+1) if y is greater than 1 and x is less than width-1; at (y-1, x+1) if the sum of x and y is odd and x is less than width-1; and at (y+1, x-1) if the sum of x and y is even and y is less than height-1. Other criteria can be used instead with the zigzag scan 200 or with another scan pattern.
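The neighbor rules for the zigzag scan 200 given in the preceding paragraphs can be collected into one function. This is a minimal sketch: the name `neighbor_positions` is hypothetical, and the function returns `None` for the cases the text leaves unspecified (odd x on the top row, even y in the left column) rather than guessing a rule. How the neighbor values are combined into a context index is not stated in this description, so the sketch stops at the list of positions.

```python
def neighbor_positions(y, x, height, width):
    """Return the neighbor positions used for context selection at (y, x).

    [] means the position gets its own unique context model;
    None means the description gives no rule for this position.
    """
    if (y, x) in ((0, 0), (0, 1), (1, 0)):
        return []
    if y == 0:  # top row, x > 1
        if x % 2 == 0:
            return [(0, x - 1), (0, x - 2), (1, x - 2), (1, x - 1)]
        return None
    if x == 0:  # left column, y > 1
        if y % 2 == 1:
            return [(y - 1, 0), (y - 2, 0), (y - 2, 1), (y - 1, 1)]
        return None
    # General case: y > 0 and x > 0.
    result = [(y - 1, x - 1), (y - 1, x), (y, x - 1)]
    if x > 1:
        result += [(y - 1, x - 2), (y, x - 2)]
        if y < height - 1:
            result.append((y + 1, x - 2))
    if y > 1:
        result += [(y - 2, x - 1), (y - 2, x)]
        if x < width - 1:
            result.append((y - 2, x + 1))
    if (x + y) % 2 == 1 and x < width - 1:
        result.append((y - 1, x + 1))
    if (x + y) % 2 == 0 and y < height - 1:
        result.append((y + 1, x - 1))
    return result
```

For example, `neighbor_positions(1, 1, 16, 16)` yields the three adjacent positions plus (2, 0), which a zigzag scan that runs up an even diagonal would already have visited.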

For significance maps based on transform units having a transform array of 8x8 entries or fewer, a fixed criteria model based on the position in the significance map array can be used. Figs. 4A to 4C show context models for 2x2, 4x4, and 8x8 significance map arrays. They are position dependent and are designed on the assumption that, for arrays of the same size, the value (0 or 1) at a specific position in the significance map follows a similar statistical model. The context selection schemes depicted in Fig. 4A, Fig. 4B, and Fig. 4C use the array position as the context selection criterion. For larger array sizes, however, the increased number of array positions can substantially increase the number of possible context selections, which in turn requires more memory. Applying context model selection with an adaptive scan pattern can keep the number of context selections for arrays larger than 8x8 within practical limits.
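The growth in context selections with array size can be made concrete with a small calculation. The sketch assumes the worst case of one context model per array position; the actual assignments in Figs. 4A to 4C are not reproduced here and may share models between positions, so these figures are an upper bound only. The function name is hypothetical.

```python
def per_position_context_count(size):
    """Upper bound on position-based contexts for a size x size array,
    assuming one context model per (y, x) position."""
    return size * size

# Print the bound for the sizes discussed above and two larger sizes.
for size in (2, 4, 8, 16, 32):
    print(f"{size}x{size}: up to {per_position_context_count(size)} contexts")
```

The count quadruples each time the side length doubles, which is the memory pressure that motivates neighbor-based selection for arrays larger than 8x8.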

As a comparative example, TMuC0.7, a model under consideration for HEVC, enables a set of transform coefficient coding and decoding tools. The set is switched on by default when the entropy coding option is CABAC/PIPE. Among these tools, an adaptive split zigzag scan pattern is applied for significance map coding and decoding. Experimental results indicate that this adaptive split zigzag scan scheme achieves only a negligible performance gain, while introducing additional memory and computational complexity compared with the zigzag scan 200 shown in Fig. 2A.

In TMuC0.7, by default, the entropy coding is set to use the CABAC/PIPE option, which incorporates a set of transform coefficient coding and decoding tools. The scanning order for coding and decoding the significance map is allowed to switch between two predefined scan patterns: diagonally from bottom-left to top-right, or diagonally from top-right to bottom-left. The switching occurs at the end of each diagonal sub-scan and is controlled by two counters: c₁, the number of coded significant transform coefficients located in the bottom-left half of the transform block; and c₂, the number of coded significant transform coefficients located in the top-right half of the transform block.

In the previous implementation of significance map coding and decoding in TMuC0.7, the adaptive split zigzag scan requires additional memory for two scan patterns and two counters, c₁ and c₂, compared with a single zigzag scan pattern. It also introduces additional computational complexity from counting the coded significant transform coefficients located in the bottom-left and top-right halves, from branch operations, and from scan selection for each coefficient before the last significant coefficient. The context model of an element in the significance map is selected based on the coded neighboring elements in the significance map. Because a diagonal scan can run in either direction, it is necessary to check whether the top-right or the bottom-left element is available for a given current element in significance map coding and decoding. This causes additional branch operations. Experimental results indicate that this adaptive split zigzag scan scheme achieves only a negligible performance gain, at the cost of additional memory requirements and increased computational complexity.
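The bookkeeping that the two counters impose can be sketched as follows. This is not the TMuC0.7 source: the function name `count_halves` is hypothetical, the split of the block into halves along the main diagonal (with diagonal elements counted in neither half) is an assumption, and the rule that maps the counters to a scan direction is not given in this description, so the sketch stops at the counters.

```python
def count_halves(sig_map, coded_positions):
    """Return (c1, c2) for the positions coded so far.

    c1 counts significant elements in the bottom-left half (y > x),
    c2 counts significant elements in the top-right half (x > y).
    Elements on the main diagonal are skipped (an assumption).
    """
    c1 = c2 = 0
    for y, x in coded_positions:
        if sig_map[y][x]:
            if y > x:        # one branch per coded coefficient
                c1 += 1
            elif x > y:      # and possibly a second
                c2 += 1
    return c1, c2

sig = [[1, 1, 0],
       [1, 0, 1],
       [1, 0, 0]]
coded = [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2)]
c1, c2 = count_halves(sig, coded)
```

Even in this reduced form, every coded coefficient costs a significance test and up to two comparisons, which is the kind of per-coefficient branching that a fixed zigzag scan avoids.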

In an example according to the principles of the invention, when CABAC/PIPE is selected, the zigzag scan 200, a full zigzag scan, can be used for significance map coding and decoding. TMuC0.7 can be modified to replace the adaptive split zigzag scan of the previous implementation of significance map coding and decoding with the zigzag scan 200 for larger transform units (that is, transform units having an array larger than 8x8). As an example, Fig. 2A shows the zigzag scan 200 for a 16x16 array. Because the scan pattern is fixed, the neighborhood used for context selection is also fixed. The additional memory requirements and computational complexity associated with the adaptive split zigzag scan of the previous implementation in TMuC0.7 no longer exist, and adaptive context selection can be used, such as the context model in the adaptive scanning criterion 300 shown in Fig. 3 and described above.

Utilizing context model selection with an adaptive scan pattern improves coding efficiency because inefficiencies in transform processing are reduced. These include inefficiencies based on overhead otherwise associated with computational complexity, including: tracking the count of coded significant transform coefficients located in the bottom-left half and in the top-right half of a transform; performing branch operations; and making scan selections for coefficients in significance map coding and decoding.

Referring again to Fig. 1, the coding system 110 includes an input interface 130, a controller 111, a counter 112, a frame memory 113, an encoding unit 114, a transmitter buffer 115, and an output interface 135. The decoding system 140 includes a receiver buffer 150, a decoding unit 151, a frame memory 152, and a controller 153. The coding system 110 and the decoding system 140 are coupled to each other via a transmission path that carries a compressed bitstream 105. The controller 111 of the coding system 110 controls the amount of data to be transmitted based on the capacity of the receiver buffer 150, and can take into account other parameters, such as the amount of data per unit of time. The controller 111 controls the encoding unit 114 to prevent a failure of the received-signal decoding operation of the decoding system 140. The controller 111 can be a processor or can include, for example, a microcomputer having a processor, a random access memory, and a read-only memory.

The source pictures 120, supplied for example by a content provider, can include a video sequence of frames that includes source pictures in the video sequence. The source pictures 120 can be uncompressed or compressed. If the source pictures 120 are uncompressed, the coding system 110 can be associated with an encoding function. If the source pictures 120 are compressed, the coding system 110 can be associated with a transcoding function. Coding units can be derived from the source pictures using the controller 111. The frame memory 113 can have a first area, which can be used for storing the incoming source pictures from the source pictures 120, and a second area, which can be used for reading out the source pictures and outputting them to the encoding unit 114. The controller 111 can output an area switching control signal 123 to the frame memory 113. The area switching control signal 123 can indicate whether the first area or the second area is to be used.
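One way to picture a frame memory with two areas and an area switching signal is a simple double buffer. The sketch below is an assumption-laden illustration, not the described hardware: the class name `TwoAreaFrameMemory` and its methods are hypothetical, and it assumes the two areas swap roles each time the switching signal is applied.

```python
class TwoAreaFrameMemory:
    """Hypothetical two-area memory: one area is written while the other is read."""

    def __init__(self):
        self.areas = [None, None]
        self.write_index = 0  # area currently used for storing incoming pictures

    def store(self, picture):
        """Store an incoming picture in the current write area."""
        self.areas[self.write_index] = picture

    def switch(self):
        """Apply the area switching signal: the two areas swap roles (assumed)."""
        self.write_index ^= 1

    def read(self):
        """Read the picture held in the area not being written."""
        return self.areas[self.write_index ^ 1]

memory = TwoAreaFrameMemory()
memory.store("picture 0")
memory.switch()             # picture 0 becomes readable
memory.store("picture 1")   # the next picture is stored without disturbing it
```

With this arrangement the encoding unit can read one picture while the next one arrives.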

The controller 111 outputs an encoding control signal 124 to the encoding unit 114. The encoding control signal 124 causes the encoding unit 114 to start an encoding operation, such as preparing coding units based on the source pictures. In response to the encoding control signal 124 from the controller 111, the encoding unit 114 starts to read out the prepared coding units to a high-efficiency encoding process, such as a prediction coding process or a transform coding process, which processes the prepared coding units to generate video compression data based on the source pictures associated with the coding units.

The encoding unit 114 can package the generated video compression data in a packetized elementary stream (PES) that includes video packets. The encoding unit 114 can map the video packets into an encoded video signal 122 using control information and a program time stamp (PTS), and the encoded video signal 122 can be signaled to the transmitter buffer 115.

The encoded video signal 122, including the generated video compression data, can be stored in the transmitter buffer 115. The information amount counter 112 is incremented to indicate the total amount of data in the transmitter buffer 115. As data is retrieved and removed from the buffer, the counter 112 can be decremented to reflect the amount of data in the transmitter buffer 115. An occupied area information signal 126 can be transmitted to the counter 112 to indicate whether data from the encoding unit 114 has been added to or removed from the transmitter buffer 115, so that the counter 112 can be incremented or decremented. The controller 111 can control the production of video packets by the encoding unit 114 based on the occupied area information 126, which can be communicated in order to prevent an overflow or underflow from occurring in the transmitter buffer 115.

The information amount counter 112 can be reset in response to a preset signal 128 generated and output by the controller 111. After the information amount counter 112 is reset, it can count the data output by the encoding unit 114 and obtain the amount of video compression data and/or video packets that have been generated. The information amount counter 112 can then supply the controller 111 with an information amount signal 129 representing the amount of information obtained. The controller 111 can control the encoding unit 114 so that no overflow occurs at the transmitter buffer 115.
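The interplay between the counter and the controller described above can be sketched in a few lines. The class and function names are hypothetical, amounts are in arbitrary units, and the overflow check shown is one plausible policy assumed for illustration rather than the controller's stated algorithm.

```python
class InformationAmountCounter:
    """Tracks how much data currently sits in the transmitter buffer."""

    def __init__(self):
        self.amount = 0

    def added(self, size):
        """Data from the encoding unit entered the buffer."""
        self.amount += size

    def removed(self, size):
        """Data was retrieved and removed from the buffer."""
        self.amount -= size

    def reset(self):
        """Response to the preset signal."""
        self.amount = 0

def may_generate_packet(counter, packet_size, buffer_capacity):
    """Assumed controller policy: allow a packet only if it fits in the buffer."""
    return counter.amount + packet_size <= buffer_capacity

counter = InformationAmountCounter()
counter.added(700)
fits_before = may_generate_packet(counter, 400, buffer_capacity=1000)
counter.removed(200)
fits_after = may_generate_packet(counter, 400, buffer_capacity=1000)
```

In this run the second packet is held back until enough data has left the buffer, which is the overflow-prevention behavior the paragraph describes.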

The decoding system 140 includes an input interface 170, a receiver buffer 150, a controller 153, a frame memory 152, a decoding unit 151, and an output interface 175. The receiver buffer 150 of the decoding system 140 can temporarily store the compressed bitstream 105, which includes the received video compression data and video packets based on the source pictures from the source pictures 120. The decoding system 140 can read the control information and presentation time stamp information associated with the video packets in the received data, and can output a frame number signal 163 that is applied to the controller 153. The controller 153 can supervise the counted number of frames at a predetermined interval, for example, each time the decoding unit 151 completes a decoding operation.

When the frame number signal 163 indicates that the receiver buffer 150 is at a predetermined capacity, the controller 153 can output a decoding start signal 164 to the decoding unit 151. When the frame number signal 163 indicates that the receiver buffer 150 is at less than the predetermined capacity, the controller 153 can wait until the counted number of frames becomes equal to the predetermined amount, and can then output the decoding start signal 164. The encoded video packets and video compression data can be decoded in a monotonic order (that is, increasing or decreasing) based on the presentation time stamps associated with the encoded video packets.
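The start condition can be illustrated with a short simulation. The function name `arrival_index_at_start` is hypothetical, and treating "at a predetermined capacity" as "counted frames greater than or equal to the predetermined amount" is an assumption made for the sketch.

```python
def arrival_index_at_start(frames_per_arrival, predetermined_amount):
    """Return the index of the arrival at which decoding would start,
    or None if the predetermined amount is never reached."""
    counted = 0
    for index, frames in enumerate(frames_per_arrival):
        counted += frames
        if counted >= predetermined_amount:
            return index  # the controller would output the decoding start signal here
    return None  # the controller keeps waiting

start = arrival_index_at_start([1, 1, 1, 1], predetermined_amount=3)
never = arrival_index_at_start([1, 1], predetermined_amount=5)
```

Here decoding would begin on the third arrival in the first case, while in the second case the controller would continue to wait.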

In response to the decoding start signal 164, the decoding unit 151 can decode data amounting to one picture associated with a frame, together with the compressed video data associated with the picture associated with the video packets from the receiver buffer 150. The decoding unit 151 can write a decoded video signal 162 into the frame memory 152. The frame memory 152 can have a first area, into which the decoded video signal is written, and a second area, which is used for reading out the decoded pictures 160 to the output interface 175.

According to different examples, the coding system 110 can be incorporated in or otherwise associated with a transcoder or an encoding apparatus at a headend, and the decoding system 140 can be incorporated in or otherwise associated with a downstream device, such as a mobile device, a set-top box, or a transcoder. These can be used separately or together in methods of coding and/or decoding utilizing context model selection with an adaptive scan pattern. Various ways in which the coding system 110 and the decoding system 140 can be implemented are described in more detail below with reference to Figs. 5, 6, and 7, which depict flow diagrams of methods 500, 600, and 700.

Method 500 is a method for preparing a coded significance map utilizing context model selection with an adaptive scan pattern. Method 600 is a method for coding utilizing coding units and coded significance maps, where the coded significance maps are prepared utilizing transform units processed using context model selection with an adaptive scan pattern. It is apparent to those of ordinary skill in the art that the methods 500, 600, and 700 represent generalized illustrations, and that other steps can be added and existing steps can be removed, modified, or rearranged without departing from the scope of the methods 500, 600, and 700. The methods 500, 600, and 700 are described with particular reference to the coding system 110 and the decoding system 140 depicted in Fig. 1. It should be understood, however, that the methods 500, 600, and 700 can be implemented in systems and/or devices that differ from the coding system 110 and the decoding system 140 without departing from the scope of the methods 500, 600, and 700.

With reference to the method 500 in Fig. 5, at step 501, the controller 111 associated with the coding system 110 partitions the source pictures into coding units, such as by a quadtree format.

At step 502, the controller 111 generates transform units, including at least one transform unit having a transform array, based on residual measures associated with the coding units. The transform array includes transform elements assigned as entries to y-x positions of the transform array. The transform units can be generated following a prediction process that is also used in generating the video compression data.

At step 503, the controller 111 processes the generated transform unit by generating a significance map having a significance map array with y-x positions corresponding to the y-x positions in the transform array. Step 503 can be subdivided into steps 503A to 503C, as follows.

At step 503A, the controller 111 and the encoding unit 114 scan a plurality of significance map elements in the significance map array utilizing a scan pattern. The scan pattern is not limited, and can be, for example: a zigzag scan, such as the zigzag scan 200 shown in Fig. 2A; a diagonal down-left scan, such as the diagonal down-left scan 210 shown in Fig. 2B; a diagonal top-right scan, such as the diagonal top-right scan 220 shown in Fig. 2C; a vertical scan, such as the vertical scan 230 shown in Fig. 2D; or a horizontal scan, such as the horizontal scan 240 shown in Fig. 2E.

At step 503B, the controller 111 determines a context model for coding a significance map element of the plurality of significance map elements, based on a value associated with at least one neighbor significance map element of that significance map element in the significance map array. In addition, if the significance map element is at a low-frequency position in the significance map array, the context model can be determined based on a low-frequency position benchmark and the low-frequency position in the significance map array. These criteria for determining the context model can be used independently of one another or in combination.

At step 503C, the controller 111 and the encoding unit 114 code the significance map element utilizing the determined context model, to form a coded significance map element of the significance map. This coding process can be an entropy coding process that reduces the y-x array of the significance map to a simpler matrix.
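Steps 503 and 503A to 503C can be sketched end to end as below. This is a simplified illustration under several assumptions: all function names are hypothetical; a significance map element is taken to be 1 where the transform coefficient is nonzero; the zigzag orientation is assumed; the neighborhood is reduced to the left, upper, and upper-left positions; and the context index is taken to be the sum of the coded neighbor values, a rule this description does not state. The entropy coder itself is omitted, so the output is the sequence of (position, bit, context) triples that such a coder would consume.

```python
def significance_map(transform_array):
    """Step 503: 1 where the coefficient is nonzero, else 0 (assumed definition)."""
    return [[1 if c != 0 else 0 for c in row] for row in transform_array]

def zigzag_scan(height, width):
    """Step 503A: zigzag order over (y, x); orientation is an assumption."""
    order = []
    for d in range(height + width - 1):
        diag = [(y, d - y)
                for y in range(max(0, d - width + 1), min(d, height - 1) + 1)]
        order.extend(reversed(diag) if d % 2 == 0 else diag)
    return order

def code_significance_map(transform_array):
    """Steps 503B and 503C, without the entropy coder.

    Returns a list of ((y, x), bit, context) triples in scan order.
    """
    sig = significance_map(transform_array)
    height, width = len(sig), len(sig[0])
    coded = set()
    symbols = []
    for y, x in zigzag_scan(height, width):
        # Simplified, hypothetical neighborhood: left, above, above-left.
        neighbors = [(y, x - 1), (y - 1, x), (y - 1, x - 1)]
        # Only neighbors that have already been coded contribute.
        context = sum(sig[ny][nx] for ny, nx in neighbors if (ny, nx) in coded)
        symbols.append(((y, x), sig[y][x], context))
        coded.add((y, x))
    return symbols

symbols = code_significance_map([[5, 0],
                                 [-1, 0]])
```

Because the scan is fixed, the set of already-coded neighbors at each position is known in advance, so the membership test in the loop could be replaced by a precomputed table in a real implementation.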

With reference to the method 600 in Fig. 6, at step 601, the interface 130 and the frame memory 113 of the coding system 110 receive the source pictures 120, which include source pictures.

At step 602, the controller 111 prepares coding units and transform units, including a transform unit, based on the source pictures. The preparing can be performed as described above with reference to the method 500.

At step 603, the controller 111 and the encoding unit 114 process the prepared transform units based on the coding units to generate video compression data.

At step 604, the controller 111 and the encoding unit 114 package the generated video compression data.

At step 605, the controller 111 and the transmitter buffer 115 transmit the packaged video compression data in the compressed bitstream 105 via the interface 135.

With reference to the method 700 in Fig. 7, at step 701, the decoding system 140 receives the compressed bitstream 105, which includes video compression data, via the interface 170 and the receiver buffer 150.

At step 702, the decoding system 140 receives residual pictures associated with the video compression data via the interface 170 and the receiver buffer 150.

At step 703, the decoding unit 151 and the controller 153 process the received video compression data.

At step 704, the decoding unit 151 and the controller 153 generate reconstructed pictures based on the processed video compression data and the received residual pictures.

At step 705, the decoding unit 151 and the controller 153 package the generated reconstructed pictures and signal them to the frame memory 152.

At step 706, the controller 153 signals the generated reconstructed pictures in the decoded signal 180 via the interface 175.

Some or all of the methods and operations described above can be provided as machine-readable instructions, such as a utility, a computer program, and the like, stored on a computer-readable storage medium, which can be non-transitory, such as a hardware storage device or another type of storage device. For example, they can exist as programs made up of program instructions in source code, object code, executable code, or other formats.

Examples of computer-readable storage media include conventional computer system RAM, ROM, EPROM, EEPROM, and magnetic or optical disks or tapes. Concrete examples of the foregoing include distribution of the programs on a CD-ROM. It is therefore to be understood that any electronic device capable of executing the above-described functions can perform the functions enumerated above.

Referring to Fig. 8, a platform 800 is shown that can be used as a computing device in a system for coding or decoding utilizing context model selection with adaptive scanning, such as the coding system 110 and/or the decoding system 140. The platform 800 can also be used for an upstream encoding apparatus, a transcoder, or a downstream device, such as a set-top box, a mobile phone or other mobile device, a transcoder, and other devices and apparatuses that can utilize context model selection with an adaptive scan pattern and the associated coding units and transform units processed using it. It is understood that the illustration of the platform 800 is a generalized illustration, that the platform 800 can include additional components, and that some of the components described can be removed and/or modified without departing from the scope of the platform 800.

The platform 800 includes: a processor 801, such as a central processing unit; a display 802, such as a monitor; an interface 803, such as a simple input interface and/or a network interface to a Local Area Network (LAN), a wireless 802.11x LAN, a 3G or 4G mobile WAN, or a WiMax WAN; and a computer-readable medium 804. Each of these components can be operatively coupled to a bus 808. For example, the bus 808 can be an EISA, PCI, USB, FireWire, NuBus, or PDS bus.

A computer-readable medium (CRM), such as the CRM 804, can be any suitable medium that participates in providing instructions to the processor 801 for execution. For example, the CRM 804 can be: non-volatile media, such as an optical or magnetic disk; volatile media, such as memory; and transmission media, such as coaxial cables, copper wire, and fiber optics. Transmission media can also take the form of acoustic, light, or radio-frequency waves. The CRM 804 can also store other instructions or instruction sets, including word processors, browsers, email, instant messaging, media players, and telephony code.

The CRM 804 can also store: an operating system 805, such as MAC OS, MS WINDOWS, UNIX, or LINUX; applications 806, such as network applications, word processors, spreadsheet applications, browsers, email, instant messaging, and media players such as games or mobile applications (e.g., "apps"); and a data structure managing application 807. The operating system 805 can be multi-user, multiprocessing, multitasking, multithreading, real-time, and the like. The operating system 805 can also perform basic tasks, such as: recognizing input from the interface 803, including from input devices such as a keyboard or a keypad; sending output to the display 802; keeping track of files and directories on the CRM 804; controlling peripheral devices, such as disk drives, printers, and image capture devices; and managing traffic on the bus 808. The applications 806 can include various components for establishing and maintaining network connections, such as code or instructions for implementing communication protocols including TCP/IP, HTTP, Ethernet, USB, and FireWire.

A data structure managing application, such as the data structure managing application 807, provides various code components for building/updating a computer-readable system (CRS) architecture for a non-volatile memory, as described above. In certain examples, some or all of the processes performed by the data structure managing application 807 can be integrated into the operating system 805. In certain examples, the processes can be at least partially implemented in digital electronic circuitry, in computer hardware, firmware, code, instruction sets, or any combination thereof.

According to the principles of the invention, there are systems, methods, and computer-readable media (CRMs) that provide coding and decoding utilizing context model selection with an adaptive scan pattern. By utilizing context model selection with an adaptive scan pattern, inefficiencies in transform processing are reduced. These include inefficiencies based on overhead otherwise associated with computational complexity, including: tracking the count of coded significant transform coefficients located in the bottom-left half and in the top-right half of a transform; performing branch operations; and making scan selections for coefficients in significance map coding.

Although described specifically throughout the entirety of the disclosure, representative examples have utility over a wide range of applications, and the above discussion is not intended to be, and should not be construed as, limiting. The terms, descriptions, and figures used herein are set forth by way of illustration only and are not meant as limitations. Those skilled in the art will recognize that many variations are possible within the spirit and scope of the examples. While the examples have been described with reference to specific examples, those skilled in the art can make various modifications to the described examples without departing from the scope of the examples as described in the following claims and their equivalents.

Claims

1. A system for coding, the system comprising:

a processor configured to prepare video compression data based on source pictures, the preparing including

partitioning the source pictures into coding units;

generating at least one transform unit based on residual measures associated with the coding units, the at least one transform unit having a transform array including transform coefficients assigned as entries to y-x positions of the transform array; and

processing the generated transform unit, the processing including:

generating a significance map having a significance map array with y-x positions corresponding to the y-x positions of the transform array; and

determining, utilizing a scan pattern, a context model for coding a significance map element of the plurality of significance map elements,

wherein the determining is based on at least one value associated with at least one coded neighbor significance map element of the significance map element of the plurality of significance map elements in the significance map array.

2. The system of claim 1, wherein processing the generated transform unit includes coding the significance map element utilizing the scan pattern, a neighbor selection criterion, and values of a plurality of significance map elements in the significance map array.

3. The system of claim 1, wherein

if the significance map element is at a low-frequency position in the significance map array, based on a low-frequency position benchmark, then

the context model is determined based on the low-frequency position in the transform array.

4. The system of claim 1, wherein the scan pattern is one of a zigzag scan, a diagonal down-left scan, a diagonal top-right scan, a vertical scan, and a horizontal scan.

5. The system of claim 1, wherein the scan pattern is determined using an analysis for identifying a scan pattern that satisfies an efficiency target or a predetermined criterion.

6. The system of claim 1, wherein the processing of the generated transform unit includes coding the significance map element utilizing the determined context model.

7. The system of claim 1, wherein the residual measures associated with the coding units are generated based on a prediction process performed on prediction units based on the coding units.

8. The system of claim 1, wherein the significance map is associated with a significance map array size of one of 8x8, 16x16, 32x32, 64x64, 128x128, and 256x256 entries.

9. A method for coding, the method comprising:

preparing, utilizing a processor, video compression data based on source pictures, the preparing including

partitioning the source pictures into coding units;

processing the generated transform unit, the processing including:

10. A non-transitory computer-readable medium storing computer-readable instructions that, when executed by a computer system, perform a method for coding, the method comprising:

partitioning the source pictures into coding units;

processing the generated transform unit, the processing including:

11. A system for decoding, the system comprising:

an interface configured to receive video compression data; and

a processor configured to process the received video compression data,

wherein the received video compression data is based on processed transform units, the processed transform units are based on source pictures, and the processed transform units are prepared by steps comprising:

partitioning the source pictures into coding units;

processing the generated transform unit, the processing comprising:

12. The system according to claim 11, wherein processing the generated transform unit comprises: coding the significance map element utilizing the scanning pattern, a neighbor selection criterion, and values of a plurality of significance map elements in the significance map array.

13. The system according to claim 11, wherein

14. The system according to claim 11, wherein the scanning pattern is one of a zigzag scan, a diagonal down-left scan, a diagonal top-right scan, a vertical scan, and a horizontal scan.

15. The system according to claim 11, wherein the scanning pattern is determined using an analysis that identifies a scanning pattern satisfying an efficiency goal or a predetermined criterion.

16. The system according to claim 11, wherein the processing of the generated transform unit comprises: coding the significance map element utilizing the determined context model.

17. The system according to claim 11, wherein a residual measure associated with the coding unit is generated according to a prediction process performed on a prediction unit based on the coding unit.

18. The system according to claim 11, wherein the significance map is associated with a significance map array size that is one of 8x8, 16x16, 32x32, 64x64, 128x128, and 256x256 entries.

19. A method for decoding, the method comprising:

receiving video compression data; and

processing, utilizing a processor, the received video compression data,

partitioning the source pictures into coding units;

processing the generated transform unit, the processing comprising:

generating a significance map having a significance map array with y-x locations corresponding to the y-x locations of the transform array;
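The correspondence between the two arrays can be sketched as below, assuming the usual convention that a significance map element is 1 where the transform coefficient at the same y-x location is non-zero and 0 otherwise. The function name `significance_map` is hypothetical.

```python
from typing import List, Sequence


def significance_map(transform_array: Sequence[Sequence[int]]) -> List[List[int]]:
    """Build a binary map with the same y-x layout as the transform array."""
    # Each output element flags whether the co-located coefficient is non-zero.
    return [[1 if coeff != 0 else 0 for coeff in row] for row in transform_array]
```

For example, the 2x2 transform array `[[5, 0], [-1, 0]]` yields the map `[[1, 0], [1, 0]]`.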

20. A non-transitory computer-readable medium storing computer-readable instructions that, when executed by a computer system, perform a method for decoding, the method comprising:

receiving video compression data; and

processing, utilizing a processor, the received video compression data,

partitioning the source pictures into coding units;

processing the generated transform unit, the processing comprising: