CN101578865A - Techniques for content adaptive video frame slicing and non-uniform access unit coding - Google Patents

Publication number
CN101578865A
Authority
CN
China
Prior art keywords
section
vau
frame
encoded
coding
Prior art date
Legal status
Pending
Application number
CN 200780047187
Other languages
Chinese (zh)
Inventor
塞伊富拉·哈利特·奥古兹
Current Assignee
Qualcomm Inc
Original Assignee
Qualcomm Inc
Priority date
Filing date
Publication date

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/129: Scanning of coding units, e.g. zig-zag scan of transform coefficients or flexible macroblock ordering [FMO]

Abstract

Techniques for content adaptive video frame slicing and non-uniform access unit coding for improved coding efficiency are provided. An encoder and decoder are disclosed to process (encode or decode) a single non-uniform video access unit (VAU) employing flexible macroblock ordering (FMO) in conjunction with different slice coding types in response to global motion detection of a camera pan or a scroll within the single VAU.

Description

Techniques for content adaptive video frame slicing and non-uniform access unit coding
Cross-reference to related applications
This application claims the benefit of U.S. Provisional Application No. 60/876,920, filed December 22, 2006, the entire content of which is incorporated herein by reference.
Technical field
The present disclosure relates generally to video coding and, more particularly, to techniques for content adaptive video frame slicing and non-uniform access unit coding with improved coding efficiency.
Background technology
In all current video compression standards, the coded representation of a video frame, the so-called video access unit (VAU), consists of slices as the next lower layer in the coding hierarchy. The slice layer allows a functional grouping of (the data of) an integer number of macroblocks in the video frame, and this grouping serves as a resynchronization unit within the coded representation of the frame. To serve as proper resynchronization points, all predictive coding schemes/dependencies, such as intra prediction (based on neighboring pixels) and motion vector prediction, are disabled across all slice boundaries.
Prior to H.264 (and excluding the optional rectangular slices submode of "Annex K: Slice Structured Mode" of H.263+), earlier video compression standards such as H.261, MPEG-1, MPEG-2/H.262, H.263, and MPEG-4 supported slice structures that essentially consist of an integer number of macroblocks that are consecutive in raster scan order, with minor differences in how slice sizes are constrained.
The H.264 standard introduced the concept of "slice groups", which makes it possible to partition the macroblocks of a frame into slice groups, and into multiple slices within a slice group, in a fully arbitrary manner, free of the constraint of being consecutive in raster scan order. This arbitrary decomposition is described by a so-called "slice group map", which is transmitted to the decoder in addition to the compressed data of the frame. This provision is known as flexible macroblock ordering (FMO).
There is therefore a need for techniques for content adaptive video frame slicing and non-uniform access unit coding with improved coding efficiency.
Summary of the invention
Techniques for content adaptive video frame slicing and non-uniform access unit coding with improved coding efficiency are provided herein. A device is provided that includes a processor operative to perform content adaptive frame partitioning into slice groups and slices and to perform non-uniform video access unit (VAU) coding within a single VAU using one or more slice coding types. In various embodiments, a memory is coupled to the processor.
In one aspect, an encoding apparatus including an encoding engine is provided. The encoding engine is operable to employ flexible macroblock ordering (FMO) in conjunction with different slice coding types within a single video access unit (VAU) in response to global motion detection of a camera pan or a scroll.
In another aspect, an encoding apparatus including an encoding engine is operable to employ FMO in conjunction with different slice coding types within a single VAU in response to one or more changes in a composite scene, where the one or more changes affect one or more portions of the video frame rather than the entire video frame. The one or more changes may include a cut scene change, a cross-fade, a fade-in or fade-out, a zoom-in or zoom-out, and global motion varieties such as a pan or a scroll.
In another aspect, a decoding apparatus including a decoding engine is provided. The decoding engine is operable to decode a single non-uniformly coded VAU by employing FMO in conjunction with the different slice coding types within that VAU.
In another configuration, a computer program product is provided that includes a computer-readable medium having instructions for processing multimedia data. The instructions cause a computer to perform content adaptive frame partitioning of a frame into slice groups and slices using FMO. The instructions also cause the computer to perform non-uniform VAU coding on the partitioned frame using one or more slice coding types.
In another configuration, a computer program product is provided that includes a computer-readable medium having instructions for processing multimedia data. The instructions cause a computer to decode a single non-uniformly coded VAU by employing FMO in conjunction with the different slice coding types within that VAU.
The techniques described herein provide a way to code a video access unit using multiple slice types in order to obtain enhanced coding efficiency.
Additional aspects will become more readily apparent from the detailed description, particularly when taken together with the accompanying drawings.
Description of drawings
Aspects and configurations of the disclosure will be more readily understood from the detailed description set forth below when taken in conjunction with the drawings, in which like reference characters identify corresponding elements throughout.
Fig. 1 illustrates a block diagram of an exemplary multimedia communications system according to certain configurations.
Fig. 2A illustrates a block diagram of an exemplary encoder device that may be used in the system of Fig. 1.
Fig. 2B illustrates a block diagram of an exemplary decoder device that may be used in the system of Fig. 1.
Fig. 3A illustrates a first exemplary frame with a sample slice partitioning, according to certain configurations using the slice structure of the MPEG-1 standard.
Fig. 3B illustrates a second exemplary frame with a sample slice partitioning, according to certain configurations using the slice structure of the MPEG-2 standard.
Fig. 4A illustrates a sample frame with a partitioning based on type 2 flexible macroblock ordering (FMO) according to the H.264/AVC standard.
Fig. 4B illustrates a sample frame with a partitioning based on type 1 FMO according to the H.264/AVC standard.
Fig. 5A illustrates an I type coded VAU (frame #0).
Fig. 5B illustrates a P type coded VAU (frame #3) and a B type VAU between frame #0 and frame #3 when the camera pans to the left.
Fig. 6A illustrates a single frame having a group of intra-coded macroblocks in a vertical strip structure and a plurality of parallel, horizontally extending inter-coded slices, where the vertical strip of intra-coded macroblocks starts at the left boundary of the frame.
Fig. 6B illustrates a single frame having a group of intra-coded macroblocks in a vertical strip structure and a plurality of parallel, horizontally extending inter-coded slices, where the vertical strip of intra-coded macroblocks starts at the right boundary of the frame.
Fig. 6C illustrates a single frame having a group of intra-coded macroblocks in a single horizontal strip structure and a plurality of parallel, horizontally extending inter-coded slices, where the group of intra-coded macroblocks of the horizontal strip structure starts at the bottom boundary of the frame.
Fig. 6D illustrates a single frame having a group of intra-coded macroblocks in a single horizontal strip structure and a plurality of parallel, horizontally extending inter-coded slices, where the group of intra-coded macroblocks of the horizontal strip structure starts at the top boundary of the frame.
Fig. 7 illustrates an encoding engine for performing content adaptive frame partitioning (into slice groups and slices) and non-uniform VAU coding.
Fig. 8 illustrates a flowchart of a process for performing content dependent frame partitioning and non-uniform video access unit coding according to certain configurations.
Figs. 9A-9D illustrate examples of non-uniform VAUs (with respect to both slice geometry and coding type) resulting from content adaptive video frame slicing according to certain configurations.
Fig. 10 illustrates a multi-region composite scene VAU having multiple semantically different segments.
Fig. 11 illustrates a flowchart of a process for non-uniform VAU decoding.
The images in the drawings are simplified for illustrative purposes and are not drawn to scale. To facilitate understanding, identical reference numerals have been used, where possible, to designate identical elements common to the figures, except that suffixes may be added where appropriate to differentiate such elements.
The appended drawings illustrate exemplary configurations of the invention and, as such, should not be considered as limiting the scope of the invention, which may admit to other equally effective configurations. It is contemplated that features or steps of one configuration may be beneficially incorporated in other configurations without further recitation.
Detailed description
The word "exemplary" is used herein to mean "serving as an example, instance, or illustration." Any configuration or design described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other configurations or designs, and the terms "core", "engine", "machine", "processor", and "processing unit" are used interchangeably.
The following detailed description is directed to certain sample configurations. However, the disclosure can be embodied in a multitude of different ways as defined and covered by the claims. In this description, reference is made to the drawings, in which like parts are designated with like numerals throughout.
Video signals may be characterized in terms of a series of pictures, frames, and/or fields, any of which may further include one or more slices. As used herein, the term "frame" is a broad term that may encompass one or more of frames, fields, pictures, and/or slices.
Configurations include systems and methods that facilitate channel switching in a multimedia transmission system. Multimedia data may include one or more of motion video, audio, still images, text, or any other suitable type of audio-visual data.
Multimedia processing systems, such as video encoders, may encode multimedia data using encoding methods based on international standards such as the Moving Picture Experts Group (MPEG)-1, -2, and -4 standards, the International Telecommunication Union (ITU)-T H.263 standard, and the ITU-T H.264 standard and its counterpart, ISO/IEC MPEG-4 Part 10, i.e., Advanced Video Coding (AVC), each of which is fully incorporated herein by reference for all purposes. Such encoding, and by extension decoding, methods are generally directed to compressing the multimedia data for transmission and/or storage. Compression can be broadly thought of as the process of removing redundancy from the multimedia data.
A video signal may be described in terms of a sequence of pictures, which include frames (an entire picture) or fields (e.g., an interlaced video stream comprises fields of alternating odd or even lines of a picture). Further, each frame or field may include one or more slices, or sub-portions of the frame or field. As used herein, either alone or in combination with other words, the term "frame" may refer to a picture, a frame, a field, or a slice thereof. Video encoding methods compress video signals by using lossless or lossy compression algorithms to compress each frame. Intra-frame coding (also referred to herein as intra coding) refers to encoding a frame using only that frame. Inter-frame coding (also referred to herein as inter coding) refers to encoding a frame based on other, "reference", frames. For example, video signals often exhibit temporal redundancy, in which frames near each other in the temporal sequence have portions that match, or at least partially match, each other.
Multimedia processors, such as video encoders, may encode a frame by partitioning it into subsets of pixels. These subsets of pixels may be referred to as blocks or macroblocks (MB) and may include, for example, 16x16 pixels. The encoder may further partition each 16x16 macroblock into subblocks. Each subblock may further comprise additional subblocks. For example, the subblocks of a 16x16 macroblock may include 16x8 and 8x16 subblocks. Each of the 16x8 and 8x16 subblocks may include, for example, 8x8 subblocks, which themselves may include, for example, 4x4, 4x8, and 8x4 subblocks, and so forth. As used herein, the term "block" may refer to either a macroblock or a subblock of any size.
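By way of illustration only, the partitioning of a frame into 16x16 macroblocks can be sketched as follows. The function names and the 176x144 example frame size are assumptions made for this sketch and do not appear in the disclosure; frames whose dimensions are not multiples of 16 are assumed to be padded up to a whole number of macroblocks.

```python
import math

MB_SIZE = 16  # macroblock edge in pixels, following the 16x16 example in the text


def macroblock_grid(width, height, mb_size=MB_SIZE):
    """Return (columns, rows) of macroblocks needed to cover a frame.

    Partial macroblocks at the right and bottom edges are counted as whole
    ones (an assumption: the frame is padded).
    """
    cols = math.ceil(width / mb_size)
    rows = math.ceil(height / mb_size)
    return cols, rows


def macroblock_origin(mb_index, width, mb_size=MB_SIZE):
    """Top-left pixel (x, y) of the macroblock with the given raster-scan index."""
    cols = math.ceil(width / mb_size)
    row, col = divmod(mb_index, cols)
    return col * mb_size, row * mb_size


if __name__ == "__main__":
    # Hypothetical 176x144 frame, chosen only as an example size.
    print(macroblock_grid(176, 144))
    print(macroblock_origin(12, 176))
```

The raster-scan index used here is the same ordering that constrains slices in the pre-H.264 standards discussed in this description.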
Encoders take advantage of temporal redundancy between sequential frames using inter-coding motion compensation based algorithms. Motion compensation algorithms identify portions of one or more reference frames that at least partially match a block. The block may be shifted in the frame relative to the matching portion of the reference frame(s). This shift is characterized by one or more motion vectors. Any differences between the block and the partially matching portion of the reference frame(s) may be characterized in terms of one or more residuals. The encoder may encode a frame as data that comprises one or more of the motion vectors and residuals for a particular partitioning of the frame. A particular partition of blocks for encoding a frame may be selected by approximately minimizing a cost function that, for example, balances encoding size with the distortion, or perceived distortion, of the frame content resulting from the encoding.
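As a minimal sketch of the motion compensation idea described above, the following example performs an exhaustive block-matching search and returns a motion vector and a residual for one block. The sum of absolute differences (SAD) cost, the full-search strategy, and all function names are assumptions chosen for illustration; the disclosure does not prescribe a particular search or cost function.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def best_motion_vector(cur, ref, bx, by, n, search):
    """Full search over displacements in [-search, search] in both directions.

    Returns ((dx, dy), residual), where residual is the n x n difference
    between the current block and its best match in the reference frame.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the reference frame.
            if bx + dx < 0 or by + dy < 0:
                continue
            if bx + dx + n > width or by + dy + n > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    _, dx, dy = best
    residual = [
        [cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x] for x in range(n)]
        for y in range(n)
    ]
    return (dx, dy), residual
```

A practical encoder would weigh this matching cost against the bits needed to code the motion vector and residual, which is the cost-function trade-off mentioned in the paragraph above.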
Inter coding enables more compression efficiency than intra coding. However, inter coding can create problems when reference data (e.g., reference frames or reference fields) are lost due to channel errors and the like. In addition to loss of reference data due to errors, reference data may also be unavailable due to initial acquisition or reacquisition of the video signal at an inter-coded frame. In these cases, decoding of inter-coded data may not be possible, or may result in undesired artifacts and errors that can propagate. These scenarios can result in an unpleasant user experience for an extended period of time.
An independently decodable intra-coded frame is the most common form of frame that enables synchronization/resynchronization of the video signal. The MPEG-x and H.26x standards use what is known as a group of pictures (GOP), which comprises an intra-coded frame (also called an I-frame) and temporally predicted P-frames or bi-directionally predicted B-frames that reference the I-frame and/or other P and/or B frames within the GOP. Longer GOPs are desirable for increased compression rates, but shorter GOPs allow for quicker acquisition and synchronization/resynchronization. Increasing the number of I-frames permits quicker acquisition and synchronization/resynchronization, but at the expense of lower compression.
Fig. 1 illustrates a block diagram of an exemplary multimedia communications system 100 according to certain configurations. The system 100 includes an encoder device 110 in communication with a decoder device 150 via a network 140. In one example, the encoder device 110 receives a multimedia signal from an external source 102 and encodes that signal for transmission over the network 140.
In this example, the encoder device 110 includes a processor 112 coupled to a memory 114 and a transceiver 116. The processor 112 encodes data from the multimedia data source and provides it to the transceiver 116 for communication over the network 140. In this example, the decoder device 150 includes a processor 152 coupled to a memory 154 and a transceiver 156. The processor 152 may include one or more of a general purpose processor and/or a digital signal processor. The memory 154 may include one or more of solid state or disk based storage. The transceiver 156 is configured to receive multimedia data over the network 140 and provide it to the processor 152 for decoding. In one example, the transceiver 156 includes a wireless transceiver. The network 140 may comprise one or more of a wired or wireless communication system, including one or more of an Ethernet, telephone (e.g., POTS), cable, power-line, and fiber optic system, and/or a wireless system comprising one or more of a code division multiple access (CDMA or CDMA2000) communication system, a frequency division multiple access (FDMA) system, an orthogonal frequency division multiplexing (OFDM) system, a time division multiple access (TDMA) system such as GSM/GPRS (General Packet Radio Service)/EDGE (Enhanced Data GSM Environment), a TETRA (Terrestrial Trunked Radio) mobile telephone system, a wideband code division multiple access (WCDMA) system, a high data rate (1xEV-DO or 1xEV-DO Gold Multicast) system, an IEEE 802.11 system, a MediaFLO system, a DMB system, a DVB-H system, and the like.
Fig. 2A illustrates a block diagram of an exemplary encoder device 110 that may be used in the system 100 of Fig. 1, according to certain configurations. In this configuration, the encoder 110 comprises an inter-coding encoder element 118, an intra-coding encoder element 120, a reference data generator element 122, and a transmitter element 124. The inter-coding encoder 118 encodes inter-coded portions of video that are predicted temporally (e.g., using motion compensated prediction) with reference to other portions of video data located in other temporal frames. The intra-coding encoder 120 encodes intra-coded portions of video that can be decoded independently, without reference to other temporally located video data. In some configurations, the intra-coding encoder 120 may use spatial prediction to take advantage of redundancy in other video data located in the same temporal frame.
In one aspect, the reference data generator 122 generates data indicating where the intra-coded and inter-coded video data generated by the encoders 120 and 118, respectively, are located. For example, the reference data may include identifiers of subblocks and/or macroblocks that a decoder uses to locate a position within a frame. The reference data may also include a frame sequence number used to locate a frame within a video frame sequence.
The transmitter 124 transmits the inter-coded data, the intra-coded data, and, in some configurations, the reference data over a network such as the network 140 of Fig. 1. The data may be transmitted over one or more communication links. The term "communication link" is used in a general sense and can include any channel of communication including, but not limited to, wired or wireless networks, virtual channels, optical links, and the like. In some configurations, the intra-coded data is transmitted on a base layer communication link and the inter-coded data is transmitted over an enhancement layer communication link. In some configurations, the intra-coded data and the inter-coded data are transmitted over the same communication link. In some configurations, one or more of the inter-coded data, the intra-coded data, and the reference data may be transmitted over a sideband communication link. For example, a sideband communication link such as the Supplemental Enhancement Information (SEI) messages of H.264 or the user_data messages of MPEG-2 may be used. In some configurations, one or more of the intra-coded data, the inter-coded data, and the reference data are transmitted over a virtual channel. A virtual channel may comprise data packets containing an identifiable packet header that identifies the data packet as belonging to the virtual channel. Other forms of identifying a virtual channel are known in the art, such as frequency division, time division, code spreading, etc.
Fig. 2B illustrates a block diagram of an exemplary decoder device 150 that may be used in the system 100 of Fig. 1, according to certain configurations. In this configuration, the decoder 150 comprises a receiver element 158, a selective decoder element 160, a reference data determiner element 162, and one or more reference data availability detectors, such as a channel switch detector element 164 and an error detector element 166.
The receiver 158 receives encoded video data (e.g., data encoded by the encoder 110 of Figs. 1 and 2A). The receiver 158 may receive the encoded data over a wired or wireless network such as the network 140 of Fig. 1. The data may be received over one or more communication links. In some configurations, the intra-coded data is received on a base layer communication link and the inter-coded data is received over an enhancement layer communication link. In some configurations, the intra-coded data and the inter-coded data are received over the same communication link. In some configurations, one or more of the inter-coded data, the intra-coded data, and the reference data may be received over a sideband communication link. For example, a sideband communication link such as the Supplemental Enhancement Information (SEI) messages of H.264 or the user_data messages of MPEG-2 may be used. In some configurations, one or more of the intra-coded data, the inter-coded data, and the reference data are received over a virtual channel. A virtual channel may comprise data packets containing an identifiable packet header that identifies the data packet as belonging to the virtual channel. Other forms of identifying a virtual channel are known in the art.
The selective decoder 160 decodes the received inter-coded and intra-coded video data. In some configurations, the received data comprises an inter-coded version of a portion of video data and an intra-coded version of the same portion of video data. Inter-coded data can be decoded after the reference data upon which it was predicted has been decoded. For example, data encoded using motion compensated prediction comprises a motion vector and a frame identifier identifying the location of the reference data. If the portion of the frame identified by the motion vector and the frame identifier of the inter-coded version is available (e.g., already decoded), then the selective decoder 160 can decode the inter-coded version. If, however, the reference data is not available, then the selective decoder 160 can decode the intra-coded version.
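The selection rule followed by the selective decoder 160 can be sketched as below. This is only an illustration under stated assumptions: the `InterCodedPortion` type, the `select_version` function, and the representation of already-decoded frames as a set of frame identifiers are hypothetical and are not part of the disclosure.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple


@dataclass
class InterCodedPortion:
    """Hypothetical container for the inter-coded version of a video portion."""
    ref_frame_id: int                 # frame identifier of the reference data
    motion_vector: Tuple[int, int]    # (dx, dy); carried along, not used in the choice


def select_version(inter: Optional[InterCodedPortion],
                   has_intra: bool,
                   decoded_frame_ids: Set[int]) -> Optional[str]:
    """Pick which version of a portion to decode.

    Prefer the inter-coded version when its reference frame has already been
    decoded; otherwise fall back to the intra-coded version if one was received.
    Returns "inter", "intra", or None when neither can be decoded.
    """
    if inter is not None and inter.ref_frame_id in decoded_frame_ids:
        return "inter"
    if has_intra:
        return "intra"
    return None
```

For example, immediately after a channel switch the set of decoded frame identifiers would be empty, so a portion that arrives with both versions would be decoded from its intra-coded version.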
In one aspect, the reference data determiner 162 identifies received reference data that indicates where the intra-coded and inter-coded video data are located in the received encoded video data. For example, the reference data may include identifiers of subblocks and/or macroblocks that the selective decoder 160 uses to locate a position within a frame. The reference data may also include a frame sequence number used to locate a frame within a video frame sequence. Using this received reference data enables the decoder to determine whether the reference data upon which inter-coded data depends is available.
Reference data availability can be affected by a user switching a channel of a multi-channel communication system. For example, multiple video broadcasts may be available to the receiver 158 over one or more communication links. If a user commands the receiver 158 to change to a different broadcast channel, the reference data for the inter-coded data on the new channel may not be immediately available. The channel switch detector 164 detects that a channel switch command has been issued and signals the selective decoder 160. The selective decoder 160 can then use information obtained from the reference data determiner to identify whether the reference data of the inter-coded version is unavailable, then identify the location of the nearest intra-coded version, and selectively decode the identified intra-coded version.
Reference data availability can also be affected by errors in the received video data. The error detector 166 can utilize error detection techniques (e.g., forward error correction) to identify uncorrectable errors in the bitstream. If there are uncorrectable errors in the reference data upon which the inter-coded version depends, the error detector 166 can signal the selective decoder 160, identifying which video data are affected by the errors. The selective decoder 160 can then determine whether to decode the inter-coded version (e.g., if the reference data is available) or the intra-coded version (e.g., if the reference data is not available).
In certain configurations, one or more of the elements of the encoder 110 of Fig. 2A may be rearranged and/or combined. The elements of the encoder 110 may be implemented by hardware, software, firmware, middleware, microcode, or any combination thereof. In certain configurations, one or more of the elements of the decoder 150 of Fig. 2B may be rearranged and/or combined. The elements of the decoder 150 may be implemented by hardware, software, firmware, middleware, microcode, or any combination thereof.
Certain configurations of this disclosure can be implemented using, for example, MediaFLO™ video coding for delivering real-time video services in TM3 systems using the FLO Air Interface Specification, "Forward Link Only [FLO] Air Interface Specification for Terrestrial Mobile Multimedia Multicast", published as Technical Standard TIA-1099, August 2006, which is fully incorporated herein by reference for all purposes.
Raster scan order inevitably imposes a horizontal nature on slice partitions. Two sample slice partitionings, for MPEG-1 and MPEG-2, are illustrated in Figs. 3A and 3B, respectively.
Fig. 3A illustrates a first exemplary sample slice partitioning of a frame 200 according to certain configurations using the slice structure of the MPEG-1 standard. The different slice partitions are denoted by different cross-hatching. In this example, the macroblocks of some slices occupy two adjacent horizontal rows. In this frame 200, a slice structure includes macroblocks 202 on a first horizontal row and macroblocks 204 on a second horizontal row. In these structures, not all macroblocks in a slice group need to be directly adjacent.
Fig. 3B illustrates a second exemplary sample slice partitioning of a frame 210 according to certain configurations using the slice structure of the MPEG-2 standard. In Fig. 3B, the slice structures are individually denoted by A-Q. These slice structures are arranged horizontally, row by row. For example, slice structure A extends across the entire first horizontal row R1. Likewise, slice structure B extends across the entire second horizontal row R2. In the third horizontal row R3, however, the row is shared by slice structures C and D. The arrangement of slice structures in horizontal rows R1, R2, R3, R4, R5, R6, R7, R8, R9, R10, R11, and R12 is exemplary and is not intended to be limiting. However, each slice can occupy at most one horizontal row, and the right boundary of the frame marks the end of a slice.
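The row constraint described for Fig. 3B can be checked with a short routine such as the following sketch. It assumes, purely for illustration, that a partitioning is given as a list of slice lengths (in macroblocks) in raster scan order; the function name and this representation are not taken from the disclosure or from the MPEG-2 bitstream syntax.

```python
def slices_respect_rows(slice_lengths, mb_cols):
    """Return True if every slice stays within a single macroblock row.

    `slice_lengths` lists the number of macroblocks in each slice, in raster
    scan order, for a frame that is `mb_cols` macroblocks wide.
    """
    position = 0  # raster-scan index of the first macroblock of the next slice
    for length in slice_lengths:
        if length <= 0:
            return False
        start_col = position % mb_cols
        # A slice that would run past the right frame boundary spans two rows.
        if start_col + length > mb_cols:
            return False
        position += length
    return True
```

With a frame four macroblocks wide, the lengths `[4, 4, 2, 2]` mirror rows R1 to R3 of Fig. 3B (slices A and B each fill a row, while C and D share one) and pass the check, whereas `[6, 2]` describes a slice spilling into a second row, as permitted in the MPEG-1 style partitioning of Fig. 3A, and fails it.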
Fig. 4A illustrates a sample frame 250 partitioned using type 2 flexible macroblock ordering (FMO) according to the H.264/AVC standard. In Fig. 4A, frame 250 includes a background 254, denoted by slice group #2, and two (2) foreground partitions 256 and 258 for region-of-interest (ROI) coding, denoted by slice groups #0 and #1 respectively. As seen in the figure, foreground partition 256 is a box-shaped structure enclosing a region of interest in the frame; it includes macroblocks that are adjacent across a group of rows and a group of columns. Foreground partition 258 likewise includes macroblocks arranged as a subset of vertically and horizontally adjacent macroblocks. Partitions 256 and 258 are marked as rectangular regions to indicate the ROI of each partition. The top-left and bottom-right coordinates of these rectangles are therefore necessary, and are sent from encoder device 110 to decoder device 150.
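The rectangle-based partitioning described above can be sketched as a slice group map built from top-left and bottom-right macroblock coordinates. This is a minimal illustration, not the encoder of the disclosure: the function name, the frame size and the ROI coordinates are hypothetical, and the rule that a lower-numbered slice group wins where rectangles overlap is an assumption modeled on type 2 FMO.

```python
from typing import List, Tuple

# (top, left, bottom, right) in macroblock units, all inclusive.
Rect = Tuple[int, int, int, int]


def foreground_leftover_map(width_mbs: int, height_mbs: int,
                            rois: List[Rect]) -> List[List[int]]:
    """Return a per-macroblock slice group map.

    ROI k is assigned to slice group k; every macroblock outside all
    ROIs falls into the last group (the background). Rectangles are
    painted from the highest group down so that, where they overlap,
    the lower-numbered group wins (an assumption of this sketch).
    """
    background = len(rois)
    sgm = [[background] * width_mbs for _ in range(height_mbs)]
    for group in reversed(range(len(rois))):
        top, left, bottom, right = rois[group]
        for y in range(top, bottom + 1):
            for x in range(left, right + 1):
                sgm[y][x] = group
    return sgm


if __name__ == "__main__":
    # Hypothetical 8x6-macroblock frame with two ROIs, loosely like Fig. 4A.
    for row in foreground_leftover_map(8, 6, [(1, 1, 2, 3), (3, 4, 4, 6)]):
        print("".join(str(group) for group in row))
```

Only the rectangle corners need to be conveyed to rebuild such a map, which is why the coordinates are sent to the decoder.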
Fig. 4B illustrates a sample frame 300 partitioned using type 1 FMO according to the H.264/AVC standard. In Fig. 4B, frame 300 includes a checkerboard pattern for improved error resilience and concealment. For illustration, the macroblocks depicted in white are associated with slice group #0, and the macroblocks depicted in black are associated with slice group #1. An alternating pattern can thus be produced by using FMO. This is permitted because FMO no longer requires slices to consist of neighboring macroblocks. The checkerboard pattern therefore yields slices that are, in effect, dispersed.
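As a small illustration of why the checkerboard helps concealment, the sketch below builds a two-group checkerboard map and checks that no macroblock has a horizontally or vertically adjacent macroblock in its own slice group. The function names and frame size are hypothetical, and the parity rule used here is simply one way to produce a checkerboard, not a formula quoted from the standard.

```python
def checkerboard_map(width_mbs, height_mbs):
    """Two slice groups alternating like a checkerboard (0 = white, 1 = black)."""
    return [[(x + y) % 2 for x in range(width_mbs)] for y in range(height_mbs)]


def same_group_neighbors(sgm, x, y):
    """Count the left/right/up/down neighbours that share the slice group of (x, y)."""
    height, width = len(sgm), len(sgm[0])
    count = 0
    for dx, dy in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < width and 0 <= ny < height and sgm[ny][nx] == sgm[y][x]:
            count += 1
    return count


if __name__ == "__main__":
    sgm = checkerboard_map(6, 4)
    worst = max(same_group_neighbors(sgm, x, y)
                for y in range(4) for x in range(6))
    # If one slice group is lost, every lost macroblock still has all of
    # its in-frame horizontal and vertical neighbours in the other group.
    print("max same-group neighbours:", worst)
```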
The FMO of the H.264/AVC standard includes seven different types, labeled type 0 through type 6. For illustration, however, only types 1 and 2 are described here as examples of slice structures. FMO used for error resilience allows macroblocks to be ordered such that no macroblock is surrounded by other macroblocks from the same slice group. Thus, when an error occurs (for example, a slice is lost during transmission), reconstruction of the lost blocks can rely on information from the available surrounding macroblocks. Type 6 FMO is the most arbitrary type and gives the user full flexibility. The other FMO types, types 0 through 5, are restricted in that they must follow particular patterns.
Although FMO assignments can support different purposes, FMO is at present regarded and promoted mainly as an error resilience tool.
In video compression standards prior to H.264, the coding type of each VAU had to be uniform across the entire extent of the video frame. The slices composing a frame therefore had to be coded with the same coding type: I (intra), P (predicted) or B (bi-predicted or bidirectionally predicted). With the introduction of the H.264 standard, this restriction was removed. The H.264 standard allows different coding types to be used within a VAU. The slices of a VAU can therefore, in general, have different (coding) types, resulting in a non-uniformly coded VAU. H.264 also still makes it possible to produce a VAU with a coding type that is uniform across the whole video frame, for example an I-type VAU, a P-type VAU or a B-type VAU.
The current configuration provides a coding engine 500 (Fig. 7) that combines the possibility of using different slice (coding) types within a VAU with the FMO provisions of H.264, in order to achieve improved coding efficiency in the (common and therefore most important) global motion cases of camera pan or scroll, and in cases where a scene is composed of semantically different segments.
Fig. 10 illustrates a composite (i.e., multi-region) scene VAU 900, for example business news, having semantically different segments 902, 904, 906, 908 and 910. In the upper left of the scene, denoted by segment numeral 902, there is live video (a news or advertising clip). In the upper right, denoted by segment numerals 904 and 906, there are financial indices rendered as large-font text and graphics. In the lower part of multi-region scene VAU 900, there are ticker symbols and quotes flowing from right to left, rendered as small-font text and graphics and denoted by segment numeral 908, and news flashes rendered as text and denoted by segment numeral 910. In such multi-region scene compositions, the different scene segments, because of their semantic and content differences, will experience asynchronous changes such as "cut" scene changes, cross-fades, fade-ins and fade-outs, and zoom-ins and zoom-outs. For instance, if the content of segment 902 changes abruptly because of a "cut" scene change, it is most efficient to code only the macroblocks in segment 902 as intra-coded macroblocks, and to code the remaining macroblocks as inter-coded macroblocks so as to exploit the temporal correlation in segments 904, 906, 908 and 910, whose content is unchanged at that moment. Accordingly, the content adaptive frame partitioning unit 510 operates to detect one or more of cut scene changes, cross-fades, fade-ins or fade-outs, zoom-ins or zoom-outs, and the various kinds of global motion.
Fig. 5A illustrates an I-type coded VAU (frame #0) 350. Fig. 5B illustrates a P-type coded VAU (frame #3) 370, where two B-type VAUs (frames #1 and #2, not shown) lie between the I-type and P-type VAUs, and where the camera begins panning to the left as the video signal of frame #0 is acquired. The content of the video frames depicted in Figs. 5A and 5B is substantially the same, except that in Fig. 5B the scene has panned to the left, as indicated by the dashed vertical line on the left of VAU 370. The dashed vertical line marks the boundary between new scene detail that was not visible in frame #0 and old scene detail that was visible in frame #0 and is therefore available for prediction. For the purposes of this sample illustration, it can be assumed that a camera pointed at a particular scene with considerable detail is undergoing an approximately pure pan to the left.
Returning to Fig. 5A, frame 0 and frame 3 respectively illustrate the initial I-frame VAU 350 and the subsequent P-frame VAU 370 of a GOP with structure IBBPBBP... captured under these circumstances. In the P-frame, the inter-coded (i.e., temporally predicted) macroblocks 372 (the squares with identified boundary lines 374) and their corresponding motion vectors (the small arrows 377 within the identified square regions) are shown. Macroblock 376 denotes a macroblock immediately adjacent to macroblock 372 in the horizontal direction. The boundary between horizontally adjacent macroblocks 372 and 376 is marked by boundary line 374. The remaining macroblocks of the P-frame (whose boundaries are not identified) are intra-coded; most of them lie along the left border of the frame, where, owing to the nature of the motion, new detail enters the scene. This macroblock type distribution and motion vector field structure is very typical of a camera pan (here, to the left). Depending on the pan speed of the camera and the temporal distance between P-type VAU 370 and its reference frame, the vertical strip of intra-coded macroblocks, which lies mainly along the left border of the frame, can span one or more macroblock columns. For illustration, the region along the left border of Fig. 5B that shows no squares represents the intra-coded macroblocks.
The above observations generalize in a straightforward manner to other cases of camera pan and scroll, and to more complex cases of global translational motion in a scene.
Table 1. Coded representation of mb_type for Intra_4x4 coded MBs in different slice types.
slice_type | mb_type value of Intra_4x4 MB | Coded representation of mb_type value, i.e., ue(v) codeword | Codeword length
2 or 7 (I) | 0 | 1 | 1
0 or 5 (P) | 5 | 00110 | 5
1 or 6 (B) | 23 | 000011000 | 9
In all video compression standards, the coding type (mode) of each MB (except skipped MBs) is signaled up front in the bitstream, so that the decoder's parsing and entropy decoding processes can anticipate the proper syntax for each MB's data and interpret the bitstream correctly. In P-type coded slices/VAUs, inter-coded (i.e., temporally predicted) MBs define the preferred compression mode, and their frequency of occurrence is significantly greater than that of intra-coded MBs. This leads to the following observation. Assuming that the context adaptive variable length coding (CAVLC) mode of H.264 is used to represent the MB type syntax element "mb_type", the binary representations of the Intra_4x4 coded MB type in the different slice types can be summarized as in Table 1.
As can be seen, the use and signaling of Intra_4x4 coded MBs in P and B slices incurs an overhead of 4 extra bits and 8 extra bits respectively. The situation is similar for the Intra_16x16 coded MB variants, although the details are not given here. Hence, all else being equal, it is most efficient to instantiate intra-coded MBs in I slices.
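The codewords and lengths in Table 1 can be reproduced with a short unsigned Exp-Golomb, ue(v), encoder. The sketch below is only a check of the arithmetic; the function name `ue` and the dictionary of mb_type values (copied from Table 1) are illustrative names, not part of the disclosure.

```python
def ue(code_num: int) -> str:
    """Unsigned Exp-Golomb codeword for code_num, returned as a bit string."""
    if code_num < 0:
        raise ValueError("ue(v) is defined for non-negative values only")
    value = code_num + 1
    prefix_len = value.bit_length() - 1  # number of leading zero bits
    return "0" * prefix_len + format(value, "b")


# mb_type value of an Intra_4x4 MB per slice type, as listed in Table 1.
INTRA_4X4_MB_TYPE = {"I": 0, "P": 5, "B": 23}

if __name__ == "__main__":
    baseline = len(ue(INTRA_4X4_MB_TYPE["I"]))
    for slice_type, mb_type in INTRA_4X4_MB_TYPE.items():
        codeword = ue(mb_type)
        print(slice_type, mb_type, codeword, len(codeword),
              "overhead:", len(codeword) - baseline)
```

The overhead column is the codeword length minus the 1-bit length used in an I slice, which gives the 4-bit and 8-bit figures quoted above.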
Temporally predicted frames (i.e., P-type and B-type coded VAUs) provide the most significant contribution to coding efficiency, and their sizes should desirably be small. Since intra coding is the least efficient of the three coding types, an increased number of intra-coded MBs in a P-type or B-type VAU is an undesirable situation. Nevertheless, when this situation does occur, for instance because of complex motion-deformation dynamics in a P-type or B-type VAU or because new objects enter the scene in a P-type VAU, the task of the encoder is to code these intra MBs in the most efficient manner possible.
Fig. 6A illustrates a single frame 400 having an intra-coded macroblock group 410 in a single vertical strip structure and a plurality of parallel, horizontally extending inter-coded slices 415, where the vertical strip of intra-coded macroblocks starts at the left border of frame 400. In this example, slices 1-5 are parallel and each occupies the same number of rows. The vertical strip of intra-coded macroblock group 410 partially overlaps, and lies within, inter-coded slices 1-5, as indicated by the varying hatching of macroblocks 410; it is confined to a single VAU and extends fully from the top border to the bottom border of frame 400.
Fig. 6B illustrates a single frame 420 having an intra-coded macroblock group 425 in a single vertical strip structure. Frame 420 further includes a plurality of parallel, horizontally extending inter-coded slices 430, where the vertical strip 425 of intra-coded macroblocks starts at the right border of frame 420. As indicated by the varying hatching of macroblocks 425, the vertical strip 425 of intra-coded macroblocks partially overlaps, and lies within, inter-coded slices 1-5.
Fig. 6C illustrates a single frame 450 having an intra-coded macroblock group 460 in a single horizontal strip structure. Frame 450 further includes a plurality of parallel, horizontally extending inter-coded slices 455, where the horizontal strip of intra-coded macroblock group 460 starts at the bottom border of frame 450 and extends fully from the left side to the right side of frame 450.
Fig. 6D illustrates a single frame 470 having an intra-coded macroblock group 475 in a single horizontal strip structure and a plurality of parallel, horizontally extending inter-coded slices 480, where the horizontal strip of intra-coded macroblock group 475 starts at the top border of frame 470 and extends fully from the left side to the right side of frame 470.
When intra coding must be used in a considerable portion of a frame and the slice structure is not carefully tailored to the geometry of that region, slice boundaries inside the region to be intra-coded will reduce intra coding efficiency. This is essentially because neighbors that would otherwise be used for intra prediction are unavailable across slice boundaries, and because certain intra prediction modes consequently become unavailable. Figs. 6A-6D illustrate, according to certain configurations, situations in which a common slice structure further partitions the region to be intra-coded and thereby conflicts with intra coding efficiency.
Fig. 7 illustrates a coding engine 500 for performing content adaptive frame partitioning (into slice groups and slices) and non-uniform VAU coding. Coding engine 500 includes a content adaptive frame partitioning unit 510 and a non-uniform video access unit (VAU) coding unit 520. Content adaptive frame partitioning unit 510 includes a shot boundary detector 512, a motion field calculator 514 and a frame segmentor 516. Content adaptive frame partitioning unit 510 further includes a slice group determination and assignment module 518.
Shot boundary detector 512 detects one or more shot boundaries of one or more frames. In one aspect, detecting shot boundaries includes detecting scene changes. Detecting scene changes and shot boundaries is important because these events imply interruptions in the continuity of motion fields and changes in the composition of the scene. Motion field calculator 514 calculates the motion fields of one or more frames, such as I-frames, P-frames and B-frames. In one aspect, the operations detected include global motion operations in P and B types, such as camera pan or scroll and zoom-in or zoom-out, as well as instances of complex motion deformation in B and P types that would necessitate the use of intra coding within these temporally predicted access units. Because the motion field has been determined, camera pan or scroll and zoom-in or zoom-out can be determined, so that the VAU can be non-uniformly coded accordingly. In one embodiment, information about distinctly different motion field segments within a frame (for example, differing in the direction and strength of the motion vectors they contain) can be provided to the frame segmentor unit as a hint to facilitate its segmentation task.
Frame segmentor 516 segments one or more frames. Frame segmentor 516 segments or divides a frame into one or more macroblock groups, for example the macroblock groups associated with the slice group #0 and slice group #1 structures shown in any of Figs. 9A-9D.
Slice group determination and assignment module 518 analyzes the output of the frame segmentor unit for the purpose of associating the one or more identified macroblock groups with one or more slice groups and with one or more slices in each slice group. Module 518 analyzes the sizes and geometries of the identified macroblock groups and their predictability attributes, such as inter-predictable or intra-predictable, assigns the macroblock groups to one or more slice groups, and determines the sizes of one or more slices in those slice groups (for example, the number of rows occupied by any slice in any slice group). Slice group determination and assignment module 518 determines the slice groups, slices and/or slice types of one or more frames. Non-uniform video access unit (VAU) coding unit 520 non-uniformly codes the macroblocks associated with the determined types.
Referring again to Fig. 2A, based on the slice group and/or slice type, the slices are coded using inter coding or intra coding by inter-coding encoder 118 or intra-coding encoder 120 respectively. Encoding device 100 thus codes the slices according to inter coding or intra coding techniques based at least in part on the determined slice groups and/or slices and/or slice types.
Content adaptive frame partitioning (into slice groups and slices) and non-uniform VAU coding address the reduced coding efficiency caused by the mechanisms described above. Global motion operations in P-type and B-type coded VAUs, such as camera pan or scroll and zoom-out or zoom-in, and instances of complex motion deformation in B-type and P-type coded VAUs for which the rigid-body translational motion model is inadequate (for example, objects undergoing rotational motion), will necessitate the use of intra coding. Non-uniform VAU coding uses intra coding with increased efficiency within temporally predicted access units.
To meet this requirement efficiently, an encoder can adopt a processing flow similar to the one illustrated in Fig. 8. As a result of this enhanced processing, the slice partitioning structures of the sample cases illustrated in Figs. 6A-6D should be modified into the structures illustrated in Figs. 9A-9D. It should be understood that the sample cases in Figs. 6A-6D and Figs. 9A-9D are provided for illustration only, and that segmenting/partitioning a frame into regions (macroblock groups) based on predictability attributes, and into slice groups and slices, can be done in a completely flexible manner based on the provisions of FMO.
In the various configurations below, flowchart blocks are performed in the depicted order, or these blocks or portions thereof may be performed simultaneously, in parallel, or in a different order.
Fig. 8 illustrates a flowchart of a process 600 for achieving content dependent frame partitioning and non-uniform video access unit coding according to certain configurations. Process 600 begins at block 602 with shot boundary/scene change detection. Detecting scene changes identifies basic units of high spatio-temporal similarity and hence increased consistency; scene changes interrupt spatio-temporal similarity and mark the boundaries of such basic units. Block 602 is followed by block 604, where motion fields are calculated. In one aspect, bidirectional and unidirectional calculations are used to: identify the predictability attributes of a frame or of one or more regions within a frame (for example, inter-predictable or not); detect global motion operations such as camera pan or scroll and zoom-out or zoom-in; and identify regions (macroblock groups) within the frame that have distinctly different motion characteristics, for example static (no change), uniformly moving, and non-uniformly moving regions.
It should be noted that, within a shot segment, motion fields are calculated for all frames except the first frame of the video segment. A video sequence will generally comprise multiple shot segments, that is, semantically consistent groups of consecutive video frames separated by scene changes. An IBP... arrangement is more accurately called a "GOP structure". Although it is desirable to align I-frames with scene changes, this is not required, and there are other reasons for inserting uniformly spaced I-frames that are not necessarily aligned with scene changes (for example, enabling random access with an upper bound on delay performance). For instance, frame 350 of Fig. 5A is an I-frame and will not undergo motion field calculation. Motion field calculation will, however, be performed for frame 370 in Fig. 5B, which is a P-type frame. The first frame, or I-frame, will be entirely intra-coded.
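One simple way to flag a global translational motion such as a pan from a computed motion field is to test whether most macroblock motion vectors agree with their median. The disclosure does not specify a detection rule, so the sketch below is only an illustration: the function name, the tolerance `tol` and the agreement threshold `min_fraction` are all hypothetical choices.

```python
from statistics import median


def detect_global_translation(vectors, tol=1.0, min_fraction=0.8):
    """Return the dominant (dx, dy) if the field looks like a global translation.

    vectors: list of (dx, dy) motion vectors, one per inter-predictable
    macroblock. Returns None when the field is empty, static, or when
    too few vectors agree with the median within `tol`.
    """
    if not vectors:
        return None
    mx = median(dx for dx, _ in vectors)
    my = median(dy for _, dy in vectors)
    agreeing = sum(1 for dx, dy in vectors
                   if abs(dx - mx) <= tol and abs(dy - my) <= tol)
    if agreeing / len(vectors) < min_fraction:
        return None  # motion is not uniform enough to call it global
    if mx == 0 and my == 0:
        return None  # static scene, no global motion
    return (mx, my)


if __name__ == "__main__":
    # Nine macroblocks share one displacement and one outlier differs.
    field = [(8, 0)] * 9 + [(0, 3)]
    print(detect_global_translation(field))
```

A result other than None would suggest that new detail enters along one frame border, which is where an intra-coded strip would be placed.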
Block 604 is followed by block 606, where the frame is segmented. The segmentation of the frame is based essentially on temporal predictability and motion field attributes. Block 606 is followed by block 608, where slice group determination and assignment takes place. Decisions are made regarding the slice groups, slices and slice (coding) types of each frame. At block 608, the absolute address of the first macroblock in each slice (first_mb_in_slice) and/or the information for inverse scanning the macroblocks in each slice can be identified. In the particular case of Fig. 9A, frame 700 is shown. Here, slice 6 is identified as having I-type coding. Slice boundaries are determined, for example the number of vertical macroblock columns to be included in slice 6. In one aspect, slice 6 is associated with a vertical strip of macroblocks that requires intra coding because of a camera pan to the left. Slices 1, 2, 3, 4 and 5 are also determined here. In this particular frame, slices 1-5 are P-type slices. In general, therefore, slice 6 will be I-type, and the remaining slices will be either all P-type or all B-type.
At block 608, the encoder engine can also incorporate additional constraints, for example for error resilience. Block 608 is followed by block 610, where the slices are coded based on the identified slice coding types, for example by intra coding and inter coding. Block 610 ends process 600. At block 612, the output of process 600 is sent as a bitstream to a file in memory 114 and/or to transceiver 116 for delivery over network 140 to decoder device 150.
Depending on the particular standard or other non-standard video compression algorithm in use, the output of process 600 will also contain information on the association of macroblocks with slice groups and slices.
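The decisions made at block 608 for a left-pan case like Fig. 9A can be sketched as follows. This is a minimal illustration under stated assumptions: the function and field names are hypothetical, the intra strip is placed in slice group 1 and the inter-coded slices in slice group 0, and first_mb_in_slice is taken to be the raster-scan address of the first macroblock of each slice.

```python
def left_strip_layout(width_mbs, height_mbs, strip_cols, rows_per_slice):
    """Return (sgm, slices) for a layout like Fig. 9A.

    sgm[y][x] is 1 inside the intra strip at the left border and 0
    elsewhere. The list of slices holds the inter-coded horizontal
    slices of group 0 followed by the single intra-coded strip slice.
    """
    if not 0 < strip_cols < width_mbs:
        raise ValueError("strip must cover at least one but not all columns")
    sgm = [[1 if x < strip_cols else 0 for x in range(width_mbs)]
           for _ in range(height_mbs)]
    slices = []
    for top_row in range(0, height_mbs, rows_per_slice):
        slices.append({
            "slice_group": 0,
            "type": "P",
            # First macroblock to the right of the strip on the slice's top row.
            "first_mb_in_slice": top_row * width_mbs + strip_cols,
        })
    slices.append({"slice_group": 1, "type": "I", "first_mb_in_slice": 0})
    return sgm, slices


if __name__ == "__main__":
    sgm, slices = left_strip_layout(10, 10, 2, 2)
    for number, info in enumerate(slices, start=1):
        print("slice", number, info)
```

With a 10x10-macroblock frame, a 2-column strip and 2 rows per slice, this yields five P-type slices and one I-type slice, matching the slice count of Fig. 9A.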
Fig. 11 illustrates a flowchart of a process 1000 for performing non-uniform VAU decoding when flexible macroblock ordering (FMO) is in use. The FMO provisions of the H.264 standard make it possible to partition or group the macroblocks composing a video frame, with full flexibility (that is, without any implied contiguous raster scan order constraint), into one or more slice groups and into one or more slices within each slice group. The partitioning of the macroblocks into slice groups and into slices within each slice group is determined by the encoder, and the resulting association of macroblocks with slice groups and slices should be provided to the decoder. For instance, in the H.264 standard this association is signaled through the slice group map (SGM) in the picture parameter set (PPS). The decoding operation uses the provided SGM information, together with the syntax element in each slice header that gives the absolute address of the first macroblock in the slice, "first_mb_in_slice", to inverse scan the macroblock information of each slice from its non-raster scan order into its correct spatial position. Decoding and pixel reconstruction then proceed according to the slice coding type, which is likewise signaled in the slice header. Decoder device 150 performs non-uniform VAU decoding using the SGM, which the encoder generates and writes into the bitstream when the FMO provisions are used.
In the various configurations below, flowchart blocks are performed in the order described, or these blocks or portions thereof may be performed simultaneously, in parallel, or in a different order.
Process 1000 begins at block 1002, where decoder device 150 receives the PPS and determines the SGM generated by encoder device 110. At block 1004, decoder device 150 also receives the syntax elements signaled in the slice header of each slice in the non-uniformly coded VAU. Block 1004 is followed by block 1006, where the absolute address of the first macroblock in each slice (first_mb_in_slice) is determined. Block 1006 is followed by block 1008, where an inverse scan operation places the macroblock information of each slice, which arrives in non-raster scan order, into its correct spatial position. Block 1008 is followed by block 1010, where the non-uniformly coded VAU is decoded and its pixels are reconstructed according to the slice coding types signaled in the slice headers.
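The inverse scan of blocks 1006 and 1008 can be sketched as a walk over the slice group map: starting at first_mb_in_slice, each next macroblock is the next raster address that belongs to the same slice group. This is an illustrative sketch only; the function names are hypothetical, the map is held as a flat list in raster order, and the number of macroblocks in the slice is assumed to be known, whereas a real decoder learns where a slice ends from the bitstream.

```python
def slice_mb_addresses(sgm_flat, first_mb_in_slice, num_mbs):
    """Raster addresses covered by a slice, in decoding order.

    sgm_flat[addr] is the slice group of the macroblock at raster
    address addr. The walk begins at first_mb_in_slice and keeps
    stepping to the next address that lies in the same slice group.
    """
    group = sgm_flat[first_mb_in_slice]
    addresses = []
    addr = first_mb_in_slice
    while len(addresses) < num_mbs and addr < len(sgm_flat):
        addresses.append(addr)
        addr += 1
        while addr < len(sgm_flat) and sgm_flat[addr] != group:
            addr += 1
    return addresses


def inverse_scan(sgm_flat, width_mbs, first_mb_in_slice, payloads, frame):
    """Place each decoded macroblock payload at its spatial position in frame."""
    addresses = slice_mb_addresses(sgm_flat, first_mb_in_slice, len(payloads))
    for addr, payload in zip(addresses, payloads):
        frame[addr // width_mbs][addr % width_mbs] = payload
    return frame


if __name__ == "__main__":
    # Hypothetical 4x2-macroblock frame: column 0 is group 1, the rest group 0.
    sgm_flat = [1, 0, 0, 0,
                1, 0, 0, 0]
    frame = [[None] * 4 for _ in range(2)]
    inverse_scan(sgm_flat, 4, 0, ["I0", "I1"], frame)
    inverse_scan(sgm_flat, 4, 1, ["P0", "P1", "P2", "P3", "P4", "P5"], frame)
    for row in frame:
        print(row)
```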
Figs. 9A-9D illustrate exemplary non-uniformly coded VAUs (non-uniform with respect to both slice geometry and slice coding type) obtained from content adaptive video frame slicing according to certain configurations. In Fig. 9A, non-uniformly coded VAU 700 includes a vertical slice #6, which forms a vertical strip 715 of macroblocks on the left side of the VAU and is designated I-type, to be intra-coded. Vertical strip 715 starts at the left edge or border of VAU 700 and extends from there by one or more macroblock columns to the right-hand edge or border that bounds the slice. Vertical strip 715 extends from the top to the bottom of the video frame. The remaining slices 1-5, designated at 710, are inter-coded as P-type slices. Slices 1-5 are parallel horizontal structures grouped in slice group #0. The left border of slice group #0 starts at the rightmost edge or border of slice #6 and extends to the right border of non-uniformly coded VAU 700. The absolute address of the first macroblock in each of slices 1-5 is set accordingly.
In Fig. 9B, non-uniformly coded VAU 730 includes a vertical slice #6, which forms a vertical strip 740 of macroblocks on the right side of the VAU and is designated I-type, to be intra-coded. Vertical strip 740 starts one or more macroblock columns away from the right edge or border of VAU 730 and extends across those columns to the right edge or border of VAU 730. Vertical strip 740 also extends from the top to the bottom of VAU 730. The remaining slices 1-5, designated at 735, are inter-coded as P-type slices. Slices 1-5 are parallel horizontal structures grouped in slice group #0. In this case, the left border of slice group #0 starts at the left edge or border of VAU 730, and the slice group extends to the left border of vertical strip 740. The absolute address of the first macroblock in each of slices 1-6 is set accordingly.
In Fig. 9C, VAU 750 includes a single horizontal slice #1, which forms a horizontal strip 755 at the top of VAU 750 and is designated I-type, to be intra-coded. The remaining slices 2-7, designated at 760, are inter-coded as P-type slices. In this case, the right and left borders of horizontal strip 755 coincide with the right and left borders or edges of VAU 750. The top of horizontal strip 755 coincides with the top of VAU 750. The bottom edge of horizontal strip 755, however, lies one or more rows below the top edge of the VAU. Slices 2-7 are parallel and arranged horizontally in slice group #1. In this case, the right and left borders of each slice in slice group #1 also coincide with the right and left borders or edges of VAU 750. In this arrangement, the numbers of rows in slices 1-7, and hence their sizes, may be unequal. The last slice of slice group #1 (slice #7 in this case) has a bottom border that coincides with the bottom border of VAU 750.
In Fig. 9D, VAU 800 includes a single horizontal slice #8, which forms a horizontal strip 804 at the bottom of VAU 800 and is designated I-type, to be intra-coded. The bottom edge of horizontal strip 804 coincides with the bottom edge of VAU 800. Horizontal strip 804 extends upward from the bottom edge of VAU 800 by one or more macroblock rows. The remaining slices 1-7, designated at 802, are inter-coded, for example as P-type slices. Slices 1-7 are parallel and arranged horizontally in slice group #0. In this arrangement, the numbers of rows in slices 1-8, and hence their sizes, may be unequal. The first slice of slice group #0 (slice #1 in this case) has a top border that coincides with the top border of VAU 800. The right and left borders of all slices coincide with the right and left borders of VAU 800.
Those of ordinary skill in the art will understand that information and signals may be represented using any of a variety of different technologies and techniques. For example, data, instructions, commands, information, signals, bits, symbols and chips that may be referenced throughout the above description may be represented by voltages, currents, electromagnetic waves, magnetic fields or particles, optical fields or particles, or any combination thereof.
Those skilled in the art will further appreciate that the various illustrative logical blocks, modules and algorithm steps described in connection with the examples disclosed herein may be implemented as electronic hardware, firmware, computer software, middleware, microcode, or combinations thereof. To illustrate clearly this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends on the particular application and the design constraints imposed on the overall system. Skilled artisans may implement the described functionality in different ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the disclosed methods.
The various illustrative logical blocks, components, modules and circuits described in connection with the examples disclosed herein may be implemented or performed with a general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller or state machine. A processor may also be implemented as a combination of computing devices, for example, a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.
The steps of a method or algorithm described in connection with the examples disclosed herein may be embodied directly in hardware, in one or more software modules executed by one or more processing elements, or in a combination of the two. A software module may reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, a hard disk, a removable disk, a CD-ROM, or any other form or combination of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an application specific integrated circuit (ASIC). The ASIC may reside in a wireless modem. In the alternative, the processor and the storage medium may reside as discrete components in the wireless modem.
The above description of the disclosed examples is provided to enable any person skilled in the art to make or use the disclosed methods and apparatus. Various modifications to these examples will be readily apparent to those skilled in the art, and the principles defined herein may be applied to other examples, and additional elements may be added.

Claims (53)

1. A device, comprising:
a processor operative to perform content-adaptive frame partitioning into slice groups and slices, and to perform non-uniform video access unit (VAU) coding in a single VAU using one or more slice coding types; and
a memory coupled to the processor.
2. The device of claim 1, wherein the processor, when performing the content-adaptive frame partitioning, is operative to: detect one or more shot boundaries of one or more frames, compute motion fields for the one or more frames, segment the one or more frames, and determine the slice groups, the slices, or slice types for the one or more frames.
3. The device of claim 2, wherein the processor, when performing the non-uniform VAU coding, encodes the one or more frames based at least in part on the determined slice groups, the slices, or the slice types.
4. The device of claim 3, wherein the processor, when performing the non-uniform VAU coding, encodes a respective frame based at least in part on the determined slice groups, with a P-type or B-type slice group and a single I-type slice group.
5. The device of claim 4, wherein the processor, when performing the non-uniform VAU coding, encodes a respective frame into a first group having a single slice and a second group having a plurality of parallel, horizontally extending slices.
6. The device of claim 3, wherein a first slice group comprises a single strip of intra-coded macroblocks arranged in the single slice, the strip having three edges that coincide or partially coincide with three border edges of a respective frame, and a second slice group comprises a plurality of parallel, horizontally extending slices each having inter-coded macroblocks.
7. The device of claim 6, wherein the processor, when performing the content-adaptive frame partitioning, detects a motion field that includes global motion and assigns those macroblocks associated with the global motion as the intra-coded macroblocks.
8. The device of claim 7, wherein the global motion comprises a camera pan left, a camera pan right, a camera scroll up, or a camera scroll down.
9. The device of claim 1, wherein the processor is operative to employ flexible macroblock ordering (FMO) during partitioning in response to one or more changes in a composite scene, and to use different slice coding types within the single VAU during coding, the one or more changes affecting one or more portions of the VAU but not the entire VAU.
10. A multimedia system, comprising:
a content-adaptive frame partitioning unit operative to partition a frame into slice groups and slices; and
a non-uniform video access unit (VAU) coding unit operative to encode a first portion of the frame as a single strip of intra-coded macroblocks and a second portion of the frame as a plurality of parallel, horizontally extending inter-coded slices.
11. The system of claim 10, wherein the content-adaptive frame partitioning unit comprises:
a detector for detecting one or more shot boundaries of one or more frames;
a calculator for computing motion fields for the one or more frames;
a segmenter for segmenting the one or more frames; and
a determiner for determining the slice groups and the slices or slice types for the one or more frames.
12. The system of claim 11, wherein the non-uniform VAU coding unit comprises an encoder operative to encode the one or more frames based at least in part on the determined slice groups, the slices, or the slice types.
13. The system of claim 12, wherein the encoder is operative to encode a respective frame based at least in part on the determined slice groups, with a P-type slice group and a single I-type slice group.
14. The system of claim 13, wherein the encoder is operative to encode a respective frame into a first slice group having the single strip and a second group having a plurality of parallel, horizontally extending inter-coded slices.
15. The system of claim 14, wherein the first slice group comprises a plurality of intra-coded macroblocks arranged in the single strip, the strip having three edges that coincide or partially coincide with three border edges of a respective frame, and the second slice group comprises a plurality of parallel, horizontally extending inter-coded slices.
16. The system of claim 15, wherein the calculator detects a motion field that includes global motion, for assigning those macroblocks associated with the global motion as the intra-coded macroblocks.
17. The system of claim 15, wherein the global motion comprises a camera pan left, a camera pan right, a camera scroll up, or a camera scroll down.
18. A method for processing multimedia data, comprising:
content-adaptive frame partitioning of a frame into slice groups and slices; and
non-uniform video access unit (VAU) coding that encodes a first portion of the frame as a single strip of intra-coded macroblocks and a second portion of the frame as a plurality of parallel, horizontally extending inter-coded slices.
19. The method of claim 18, further comprising detecting global motion.
20. The method of claim 18, wherein the content-adaptive frame partitioning comprises: detecting one or more shot boundaries of one or more frames; computing motion fields for the one or more frames; segmenting the one or more frames; and determining the slice groups, the slices, or slice types for the one or more frames.
21. The method of claim 18, wherein the non-uniform VAU coding comprises encoding the one or more frames based at least in part on the determined slice groups, the slices, or the slice types.
22. The method of claim 21, wherein the non-uniform VAU coding comprises encoding a respective frame based at least in part on the determined slice groups, with a P-type slice group and a single I-type slice group.
23. The method of claim 22, wherein the non-uniform VAU coding comprises: encoding a respective frame into a first slice group having the single strip, the strip having three edges that coincide or partially coincide with three border edges of the respective frame; and encoding a second group having a plurality of parallel, horizontally extending inter-coded slices.
24. The method of claim 23, wherein the encoding of the first slice group comprises: associating a horizontal or vertical border edge of a respective frame with the single strip of intra-coded macroblocks; and encoding a second slice group comprising a plurality of parallel, horizontally extending inter-coded slices.
25. The method of claim 24, wherein the content-adaptive frame partitioning comprises detecting a motion field that includes global motion, for assigning those macroblocks associated with the global motion as the intra-coded macroblocks.
26. The method of claim 25, wherein the detecting of the global motion comprises detecting one of a camera pan left, a camera pan right, a camera scroll up, or a camera scroll down.
27. An encoding apparatus, comprising: a coding engine operable to employ flexible macroblock ordering (FMO) in conjunction with different slice coding types within a single video access unit (VAU) in response to detection of global motion of a camera pan or scroll.
28. The encoding apparatus of claim 27, further comprising:
means for detecting one or more shot boundaries of one or more frames;
means for computing motion fields, including the global motion, for the one or more frames;
means for segmenting the one or more frames; and
means for determining the slice groups and the slices or slice types for the one or more frames.
29. The encoding apparatus of claim 28, further comprising means for encoding the one or more frames based at least in part on the determined slice groups and the slices or the slice types.
30. A computer program product comprising a computer-readable medium, the computer-readable medium comprising instructions for processing multimedia data, wherein the instructions cause a computer to:
perform content-adaptive frame partitioning of a frame into slice groups and slices using flexible macroblock ordering (FMO); and
perform non-uniform VAU coding on the partitioned frame using one or more slice coding types.
31. The computer program product of claim 30, wherein the instructions to perform the content-adaptive frame partitioning comprise instructions to cause the computer to:
detect one or more shot boundaries of one or more frames;
compute motion fields for the one or more frames;
segment the one or more frames; and
determine the slice groups and the slices or slice types for the one or more frames.
32. The computer program product of claim 31, wherein the instructions to perform the non-uniform VAU coding comprise instructions to cause the computer to:
encode the one or more frames based at least in part on the determined slice groups and the slices or the slice types.
33. The computer program product of claim 31, wherein the instructions to compute the motion fields comprise instructions to cause the computer to detect global motion.
34. The computer program product of claim 30, wherein the VAU is a multi-region scene having semantically different segments, and the instructions to perform the content-adaptive frame partitioning comprise instructions to cause the computer to partition the VAU into the semantically different segments.
35. The computer program product of claim 34, wherein the instructions to partition further comprise instructions to cause the computer to determine any one or more changes among a cut scene change, a cross-fade, a fade-in or fade-out, a zoom-in or zoom-out, and global motion types.
36. The computer program product of claim 31, wherein the instructions to perform the non-uniform VAU coding comprise instructions to cause the computer to encode a respective frame based at least in part on the determined slice groups, with a P-type or B-type slice group and a single I-type slice group.
37. The computer program product of claim 36, wherein the instructions to perform the non-uniform VAU coding comprise instructions to cause the computer to: encode a respective frame into a first group having a single strip of intra-coded macroblocks, the strip having three edges that coincide or partially coincide with three border edges of the respective frame; and encode a second group having a plurality of parallel, horizontally extending inter-coded slices.
38. The computer program product of claim 31, wherein the instructions to compute comprise instructions to cause the computer to detect global motion, for assigning those macroblocks associated with the global motion as intra-coded macroblocks.
39. The computer program product of claim 38, wherein the instructions to detect the global motion comprise instructions to cause the computer to detect one of a camera pan left, a camera pan right, a camera scroll up, or a camera scroll down.
40. An apparatus for processing multimedia data, comprising:
means for performing content-adaptive frame partitioning of a frame into slice groups and slices; and
means for performing non-uniform video access unit (VAU) coding of the partitioned frame using one or more slice coding types.
41. The apparatus of claim 40, wherein the means for performing the content-adaptive frame partitioning comprises: means for detecting one or more shot boundaries of one or more frames; means for computing motion fields for the one or more frames; means for segmenting the one or more frames; and means for determining the slice groups, the slices, or slice types for the one or more frames.
42. The apparatus of claim 40, wherein the means for performing the non-uniform VAU coding comprises means for encoding the one or more frames based at least in part on the determined slice groups, the slices, or the slice types.
43. The apparatus of claim 40, wherein the means for performing the non-uniform VAU coding comprises means for encoding a respective frame based at least in part on the determined slice groups, with a P-type or B-type slice group and a single I-type slice group.
44. The apparatus of claim 40, wherein the means for performing the non-uniform VAU coding comprises means for encoding a respective frame into: a first group having a single strip of intra-coded macroblocks, the strip having three edges that coincide or partially coincide with three border edges of the respective frame; and a second group having a plurality of parallel, horizontally extending inter-coded slices.
45. The apparatus of claim 40, wherein the VAU is a multi-region scene having semantically different segments, and the means for performing the content-adaptive frame partitioning comprises means for partitioning the VAU into the semantically different segments.
46. A decoding apparatus, comprising: a decoding engine operable to decode a single non-uniformly coded video access unit (VAU) by employing flexible macroblock ordering (FMO) in conjunction with different slice coding types within the single VAU.
47. The decoding apparatus of claim 46, wherein the decoding engine receives, via the FMO, a picture parameter set and an absolute address of a first macroblock of each different coding type.
48. The decoding apparatus of claim 46, wherein the decoding engine comprises a selective decoder to: decode a first slice group having a single strip of intra-coded macroblocks, the strip having three edges that coincide or partially coincide with three border edges of the single VAU; and decode a second slice group having a plurality of parallel, horizontally extending inter-coded slices.
49. A computer program product comprising a computer-readable medium, the computer-readable medium comprising instructions for processing multimedia data, wherein the instructions cause a computer to:
decode a single non-uniformly coded video access unit (VAU) by employing flexible macroblock ordering (FMO) in conjunction with different slice coding types within the single VAU.
50. The computer program product of claim 49, wherein the instructions to decode comprise instructions to cause the computer to receive, via the FMO, a picture parameter set and an absolute address of a first macroblock of each different coding type.
51. The computer program product of claim 49, wherein the instructions to decode comprise instructions to cause the computer to: decode a first slice group having a single strip of intra-coded macroblocks, the strip having three edges that coincide or partially coincide with three border edges of the single VAU; and decode a second slice group having a plurality of parallel, horizontally extending inter-coded slices.
52. An encoding apparatus, comprising: a coding engine operable to employ flexible macroblock ordering (FMO) in conjunction with different slice coding types within a single video access unit (VAU) in response to one or more changes in a composite scene, the one or more changes affecting one or more portions of the VAU but not all portions of the VAU.
53. The encoding apparatus of claim 52, wherein the one or more changes comprise any one or more changes among a cut scene change, a cross-fade, a fade-in or fade-out, a zoom-in or zoom-out, and global motion types.
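
Illustrative sketch (not part of the claims). The frame partition recited in claims 6 to 8, a single strip of intra-coded macroblocks along one frame border plus a remainder split into parallel, horizontally extending inter-coded slices, can be pictured as a macroblock-to-group map. The Python below is a minimal sketch under stated assumptions: the function names, the group numbering, the motion labels, and the choice of which border receives the strip for each motion (left border for a pan left, top border for a scroll up, and so on) are all hypothetical and are not taken from the patent text.

```python
from typing import List, Tuple

INTRA_GROUP = 0  # hypothetical id: group holding the single intra-coded strip
INTER_GROUP = 1  # hypothetical id: group holding the inter-coded remainder

MOTIONS = ("pan_left", "pan_right", "scroll_up", "scroll_down")


def slice_group_map(mb_cols: int, mb_rows: int, motion: str, strip: int) -> List[List[int]]:
    """Build a macroblock-to-group map for one frame.

    `strip` is the strip thickness in macroblocks. The strip is placed on
    the border assumed to reveal new content for the given motion, so three
    of its edges coincide with three border edges of the frame.
    """
    if motion not in MOTIONS:
        raise ValueError(f"unknown motion: {motion}")
    limit = mb_cols if motion.startswith("pan") else mb_rows
    if not 0 < strip < limit:
        raise ValueError("strip must leave room for the inter-coded group")

    grid = []
    for r in range(mb_rows):
        row = []
        for c in range(mb_cols):
            in_strip = {
                "pan_left": c < strip,
                "pan_right": c >= mb_cols - strip,
                "scroll_up": r < strip,
                "scroll_down": r >= mb_rows - strip,
            }[motion]
            row.append(INTRA_GROUP if in_strip else INTER_GROUP)
        grid.append(row)
    return grid


def inter_slice_rows(grid: List[List[int]], rows_per_slice: int) -> List[Tuple[int, int]]:
    """Split the rows that contain inter-coded macroblocks into horizontal
    slices, returned as (first_row, end_row_exclusive) pairs."""
    if rows_per_slice < 1:
        raise ValueError("rows_per_slice must be at least 1")
    rows = [r for r, row in enumerate(grid) if INTER_GROUP in row]
    first, end = rows[0], rows[-1] + 1
    return [(s, min(s + rows_per_slice, end)) for s in range(first, end, rows_per_slice)]


if __name__ == "__main__":
    # A QCIF frame (176x144 pixels) holds 11x9 macroblocks of 16x16 pixels.
    grid = slice_group_map(11, 9, "pan_left", strip=2)
    for row in grid:
        print("".join(str(g) for g in row))
    print(inter_slice_rows(grid, rows_per_slice=3))
```

For a pan, every macroblock row still contains inter-coded macroblocks, so the horizontal slices span the full frame height; for a scroll, the rows occupied by the strip are excluded from the inter-coded slices.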
CN 200780047187 2006-12-22 2007-12-21 Techniques for content adaptive video frame slicing and non-uniform access unit coding Pending CN101578865A (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US87692006P 2006-12-22 2006-12-22
US60/876,920 2006-12-22
US11/961,647 2007-12-20

Publications (1)

Publication Number Publication Date
CN101578865A true CN101578865A (en) 2009-11-11

Family

ID=41272868

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 200780047187 Pending CN101578865A (en) 2006-12-22 2007-12-21 Techniques for content adaptive video frame slicing and non-uniform access unit coding

Country Status (2)

Country Link
CN (1) CN101578865A (en)
TW (1) TW200840368A (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102378004A (en) * 2010-08-10 2012-03-14 Sony Corporation Moving image processing apparatus, moving image processing method, and program
CN103975596A (en) * 2011-10-24 2014-08-06 Qualcomm Incorporated Grouping of tiles for video coding
US9584819B2 (en) 2011-10-24 2017-02-28 Qualcomm Incorporated Grouping of tiles for video coding
CN103975596B (en) * 2011-10-24 2017-06-09 Qualcomm Incorporated Grouping of tiles for video coding
TWI620435B (en) * 2012-09-18 2018-04-01 Vid Scale, Inc. Method and apparatus for region of interest video coding using tiles and tile groups
US10057570B2 (en) 2012-09-18 2018-08-21 Vid Scale, Inc. Method and apparatus for region of interest video coding using tiles and tile groups
CN106471578A (en) * 2014-05-16 2017-03-01 Qualcomm Incorporated Cross fading between higher order ambisonic signals
CN106471578B (en) * 2014-05-16 2020-03-31 Qualcomm Incorporated Method and apparatus for cross-fade between higher order ambisonic signals

Also Published As

Publication number Publication date
TW200840368A (en) 2008-10-01

Similar Documents

Publication Publication Date Title
US8428125B2 (en) Techniques for content adaptive video frame slicing and non-uniform access unit coding
KR101773693B1 (en) Disparity vector derivation in 3d video coding for skip and direct modes
AU2011354441B2 (en) Method and apparatus of improved intra luma prediction mode coding
KR102185025B1 (en) Simplifications on disparity vector derivation and motion vector prediction in 3d video coding
CN103026709B (en) Coding of inter prediction modes and of reference picture list indexes for video coding
JP6513685B2 (en) Improved Inference of NoOutputOfPriorPicsFlag in Video Coding
US9288502B2 (en) Methods and apparatus for the use of slice groups in decoding multi-view video coding (MVC) information
KR101708586B1 (en) Neighbor block-based disparity vector derivation in 3d-avc
US20120106634A1 (en) Method and apparatus for processing multi-view video signal
US20140119439A1 (en) Method and apparatus of intra mode coding
KR101958055B1 (en) Disparity vector refinement in video coding
EP3275179B1 (en) Device and method for processing video data
KR101861906B1 (en) Device and method for scalable coding of video information based on high efficiency video coding
KR20150105372A (en) Scalable hevc device and method generating adapted motion vector candidate lists for motion prediction in the enhancement layer
WO2015013137A1 (en) Device and method for scalable coding of video information
US20140098881A1 (en) Motion field upsampling for scalable coding based on high efficiency video coding
JP5502798B2 (en) Channel switching frame
CN104956676A (en) Inter-layer syntax prediction control
CN101578865A (en) Techniques for content adaptive video frame slicing and non-uniform access unit coding
US10194146B2 (en) Device and method for scalable coding of video information
CN101491097A (en) Video coding with fine granularity scalability using cycle-aligned fragments

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Open date: 20091111