CN101491103B - Method and apparatus for encoder assisted pre-processing - Google Patents


Info

Publication number
CN101491103B
CN101491103B (application CN200780027205.7A)
Authority
CN
China
Prior art keywords
medium data
pixel intensity
data
intensity range
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN200780027205.7A
Other languages
Chinese (zh)
Other versions
CN101491103A (en)
Inventor
Vijayalakshmi R. Raveendran
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qualcomm Inc
Original Assignee
Qualcomm Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US 11/779,869 (external priority; granted as US8253752B2)
Application filed by Qualcomm Inc
Publication of CN101491103A
Application granted
Publication of CN101491103B

Landscapes

  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

This application describes devices and methods for processing multimedia data to generate enhanced-quality multimedia data at the receiver based on encoder-assisted post-processing. In one aspect, processing multimedia data includes identifying at least one pixel intensity range in at least one image of the multimedia data; modifying at least a portion of the multimedia data to reduce the at least one pixel intensity range; and encoding the modified multimedia data to form encoded multimedia data.

Description

Method and apparatus for encoder-assisted pre-processing
The present application claims priority to U.S. Provisional Application No. 60/832,348, entitled "Method and apparatus for encoder assisted post-processing," filed July 20, 2006, which is assigned to the assignee hereof and is fully incorporated herein by reference for all purposes.
Technical field
The present application relates generally to multimedia data processing and, more particularly, to encoding video based on decoder processing techniques.
Background
There is an ever-increasing demand for transmitting high-resolution multimedia data to display devices, for example the display devices of cellular phones, computers, and PDAs. High resolution (the term is used herein to denote the resolution needed to discern particular desired details and features) is needed to best view certain multimedia data, for example sports, video, television broadcast feeds, and other such images. Providing high-resolution multimedia data typically requires increasing the amount of data sent to the display device, which in turn requires more communication resources and transmission bandwidth.
Spatial scalability is a typical method for enhancing resolution, in which high-resolution information (in particular, high-frequency data) is encoded and transmitted as an enhancement layer to a base layer of lower-resolution data. However, spatial scalability is inefficient, because such data has noise-like statistical characteristics and poor coding efficiency. Moreover, spatial scalability is highly restrictive, because the upsampled resolution is predetermined when the enhancement layer is created/encoded. Accordingly, other methods are needed to overcome the deficiencies of spatial scalability and other resolution enhancement methods known in the art.
Summary of the invention
Each of the apparatuses and methods described herein has several aspects, no single one of which is solely responsible for its desirable attributes. Without limiting the scope of this disclosure, its more prominent features will now be discussed briefly. After considering this discussion, and particularly after reading the section entitled "Detailed description," one will understand how the features of this disclosure provide improvements to multimedia data processing apparatuses and methods.
In one embodiment, a method of processing multimedia data comprises: identifying at least one pixel intensity range in at least one image of the multimedia data; modifying at least a portion of the multimedia data to reduce the at least one pixel intensity range; and encoding the modified multimedia data to form encoded multimedia data. Modifying the at least one pixel intensity range can comprise an inverse histogram equalization operation, gamma correction, or modifying the at least one pixel intensity range based at least in part on a detected range of pixel values and a threshold defining a limit of the range of pixel intensity values. The method can further comprise transmitting the encoded multimedia data to a terminal device.
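The patent describes the range detection and reduction only in prose. The following is a minimal Python sketch of the idea, assuming an 8-bit grayscale image stored as a flat list; the function names and the (16, 235) target levels are illustrative choices, not taken from the patent:

```python
def find_intensity_range(pixels):
    """Detected range of 8-bit luma values present in the image."""
    return min(pixels), max(pixels)

def remap_range(pixels, src_lo, src_hi, dst_lo, dst_hi):
    """Linearly remap [src_lo, src_hi] onto a narrower
    [dst_lo, dst_hi] before encoding; the decoder's post-processing
    (e.g. histogram equalization) is then expected to re-expand it."""
    scale = (dst_hi - dst_lo) / max(1, src_hi - src_lo)
    return [round(dst_lo + (p - src_lo) * scale) for p in pixels]

lo, hi = find_intensity_range([0, 64, 128, 255])
narrowed = remap_range([0, 64, 128, 255], lo, hi, 16, 235)
```

Narrowing the intensity range this way reduces the dynamic range the encoder must represent; whether this helps depends on the decoder actually applying the inverse expansion.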
The post-processing techniques used in the decoder can comprise histogram equalization, gamma correction, contrast enhancement processing, or another pixel-intensity remapping operation. The method can comprise maintaining an indicator of the modification made to the multimedia data to reduce the at least one pixel intensity range, and encoding the indicator for transmission to a terminal device. In some embodiments, the terminal device can be configured to use the indicator to adjust the at least one pixel intensity range of the multimedia data. The method can also comprise transmitting the indicator with the multimedia data to the terminal device. An indicator of the post-processing technique used in the decoder of the terminal device to modify a pixel intensity range can also be stored, and the at least one pixel intensity range of the multimedia data can be modified before encoding based on the indicator.
In another embodiment, a system for processing multimedia data comprises: an image processing module configured to identify a pixel intensity range of a portion of multimedia data, the image processing module further configured to modify the multimedia data to reduce the pixel intensity range; and an encoder configured to encode the modified multimedia data to form encoded multimedia data. The image processing module can generate an indicator of the modification made to the multimedia data to reduce the pixel intensity range, and the encoder can be configured to encode the indicator. In some embodiments, the indicator is transmitted with the encoded multimedia data for use in decoding the encoded multimedia data. The system can further comprise a storage device configured to store an indicator of the post-processing technique used in a decoder of a terminal device to modify a pixel intensity range, the at least one pixel intensity range of the multimedia data being modified before encoding based on the indicator.
In another embodiment, a system for processing multimedia data comprises: means for identifying at least one pixel intensity range in at least one image of the multimedia data; means for modifying at least a portion of the multimedia data to reduce the at least one pixel intensity range; and means for encoding the modified multimedia data to form encoded multimedia data.
In another embodiment, a machine-readable medium comprises instructions for processing multimedia data that, when executed, cause a machine to identify at least one pixel intensity range in at least one image of the multimedia data, modify at least a portion of the multimedia data to reduce the at least one pixel intensity range, and encode the modified multimedia data to form encoded multimedia data.
Brief description of the drawings
Fig. 1 is a block diagram illustrating a communication system for transmitting multimedia data.
Fig. 2 is a block diagram illustrating certain components of a communication system for encoding multimedia data.
Fig. 3 is a block diagram illustrating another embodiment of certain components of a communication system for encoding multimedia data.
Fig. 4 is a block diagram illustrating another embodiment of certain components for encoding multimedia data.
Fig. 5 is a block diagram illustrating an encoding device having a processor configured to encode multimedia data.
Fig. 6 is a block diagram illustrating another embodiment of an encoding device having a processor configured to encode multimedia data.
Fig. 7 is a flowchart illustrating a process for encoding multimedia data.
Fig. 8 is a table illustrating an example of interpolation filter coefficient factors.
Fig. 9 is a table illustrating indicators that specify the type of post-processing operation to be performed at the decoder and its parameters.
Fig. 10 is a flowchart illustrating a process for encoding multimedia data by remapping pixel luminance values of at least a portion of the multimedia data.
Fig. 11 is a block diagram illustrating an encoding device having a preprocessor configured to modify multimedia data before encoding.
Detailed description
In the following description, specific details are given to provide a thorough understanding of the described aspects. However, it will be understood by one of ordinary skill in the art that the aspects may be practiced without these specific details. For example, circuits may be shown in block diagrams in order not to obscure the aspects in unnecessary detail. In other instances, well-known circuits, structures, and techniques may not be shown in detail in order not to obscure the aspects.
Reference herein to "an aspect," "one aspect," "some aspects," or "certain aspects," and similar phrases using the terms "embodiment" or "embodiments," means that a particular feature, structure, or characteristic described in connection with one or more aspects is included in at least one aspect. The appearances of such phrases in various places in this specification are not necessarily all referring to the same aspect, nor are they separate or alternative aspects mutually exclusive of other aspects. Moreover, various features are described which may be exhibited by some aspects and not by others. Similarly, various requirements are described which may be requirements for some aspects but not for other aspects.
"Multimedia data" or simply "multimedia," as used herein, is a broad term that includes video data (which can include audio data), audio data, or both video and audio data, and can also include graphics data. "Video data" or "video," as used herein, is a broad term referring to a sequence of images containing text or image information and/or audio data.
To provide the desired high-resolution multimedia data to one or more display devices, spatial scalability and upsampling algorithms typically include image or edge enhancement techniques that employ edge detection followed by linear or adaptive (sometimes nonlinear) filtering. However, critical and fine detail edges lost between compression and downsampling cannot be detected at the encoder by these mechanisms with a high percentage of confidence, nor can such edges be effectively recreated between decoding and upsampling. Certain features of the methods and systems described herein include processes for identifying information about details of the multimedia data that are lost due to compression. Other features relate to recovering such details in the decoded multimedia data by using this information. Such systems and methods are further described and illustrated with reference to Figs. 1 to 7. In one exemplary embodiment, to facilitate the process of encoding multimedia data, an encoding method can use information about a post-processing or decoding process (for example, at a display device) to encode the multimedia data so as to account for data differences produced by particular encoding and/or decoding processes (for example, downsampling algorithms implemented in the encoder and/or upsampling algorithms implemented in the decoder).
In one example, the multimedia data is first encoded (for example, downsampled and compressed) to form compressed data that will subsequently be transmitted to at least one display device. A copy of the encoded data is decompressed and upsampled using known decoder decoding and upsampling algorithms, and the resulting data is compared to the originally received (uncompressed) multimedia data. The difference between the original multimedia data and the decompressed, upsampled data is denoted "difference information." An enhancement process incorporated into the post-processing techniques (for example, downsampling and upsampling filters) can remove noise, enhance features (for example, skin, facial features, or fast-changing regions in the data indicative of "fast-moving" objects), or reduce entropy in the resulting difference information. The difference information is encoded as "assist information." The assist information is also transmitted to the decoder, where it is used to enhance details of the decoded image that may have been degraded during encoding. The enhanced image can then be presented on the display device.
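As a rough structural illustration of this encode-decode-compare loop, per-pixel difference information for one grayscale frame could be computed at the encoder and reapplied at the decoder like this (a toy sketch that assumes the residual itself survives transmission losslessly; all names are invented for illustration):

```python
def difference_information(original, reconstructed):
    """Encoder side: per-pixel residual between the source frame and
    the copy that was encoded, then decoded and upsampled locally."""
    return [o - r for o, r in zip(original, reconstructed)]

def apply_assist_info(reconstructed, residual):
    """Decoder side: add the transmitted residual back to restore
    detail lost in the encode/decode round trip."""
    return [r + d for r, d in zip(reconstructed, residual)]
```

In the patent's scheme the residual would additionally be denoised, enhanced, and entropy-coded before transmission rather than sent raw.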
Fig. 1 is a block diagram of a communication system 10 for transmitting streaming or other types of multimedia data. This technique can be applied in a digital transmission facility 12 that transmits digitally compressed multimedia data to numerous display devices or terminals 16. The multimedia data received by the transmission facility 12 can come from a digital video source, for example a digital cable feed or a digitized analog high signal-to-noise source. The video source is processed in the transmission facility 12 and modulated onto a carrier for transmission over a network 14 to one or more terminals 16.
The network 14 can be any type of wired or wireless network suitable for transmitting data, including one or more of Ethernet, telephone (for example, POTS), cable, power line, and fiber-optic systems, and/or a wireless system comprising one or more of the following: a code division multiple access (CDMA or CDMA2000) communication system, a frequency division multiple access (FDMA) system, an orthogonal frequency division multiplexing (OFDM) system, a time division multiple access (TDMA) system such as GSM/GPRS (General Packet Radio Service)/EDGE (Enhanced Data GSM Environment), a TETRA (Terrestrial Trunked Radio) mobile telephone system, a wideband code division multiple access (WCDMA) system, a high data rate (1xEV-DO or 1xEV-DO Gold Multicast) system, an IEEE 802.11 system, a MediaFLO™ system, a DMB system, or a DVB-H system. For example, the network can be a cellular telephone network; a global computer communication network such as the Internet; a wide area network; a metropolitan area network; a local area network; or a satellite network; as well as portions or combinations of these and other types of networks.
Each terminal 16 that receives encoded multimedia data from the network 14 can be any type of communication device, including, but not limited to, a wireless telephone, a personal digital assistant (PDA), a personal computer, a television, a set-top box, a desktop, laptop, or palmtop computer, or a video storage device (for example, a video cassette recorder (VCR) or a digital video recorder (DVR)), as well as portions or combinations of these and other devices.
Fig. 2 is a block diagram illustrating certain components of a communication system for encoding multimedia in the digital transmission facility 12. The transmission facility 12 comprises a multimedia source 26 configured to receive multimedia data, for example by receiving it or otherwise accessing it from a storage device, and to provide the multimedia data to an encoding device 20. The encoding device 20 encodes the multimedia data based, at least in part, on information about a decoding algorithm that is or can subsequently be used in a downstream receiving device, for example a terminal 16.
The encoding device 20 comprises a first encoder 21 for encoding the multimedia data. The first encoder 21 provides the encoded multimedia data to a communication module 25 for transmission to one or more of the terminals 16. The first encoder 21 also provides a copy of the encoded data to a decoder 22. The decoder 22 is configured to decode the encoded data and to apply post-processing techniques that are preferably also used in the decoding process of the receiving device. The decoder 22 provides the decoded data to a comparator 23.
An indicator, indicating the post-processing technique, is identified for use by the decoder 22. "Identified," as used in the preceding sentence, means that the decoder maintains, stores, selects, or accesses the indicator. In some embodiments, the indicator can be maintained or stored in a memory device of the decoder 22, or maintained or stored in another device in communication with the decoder 22. In some embodiments, the indicator can be selected from a plurality of indicators, each indicating a post-processing technique. In some embodiments, where the specific processing technique used by the decoder in the receiving device is not known, the decoder 22 can also use other known or typical processing techniques.
The decoder 22 can be configured to perform one or more post-processing techniques. In some embodiments, the decoder 22 is configured to use one of multiple post-processing techniques based on an input indicating which technique to employ. Typically, as a result of the compression and downsampling processes used by the first encoder 21 to encode the multimedia data, and the decompression and upsampling processes used by the decoder 22 to decode the multimedia data, the decoded data may be at least somewhat different from the original multimedia data (and degraded relative to it). The comparator 23 is configured to receive and compare the original multimedia data and the decoded multimedia data, and to determine comparison information. The comparison information can comprise any information determined by comparing the original multimedia data and the decoded multimedia data. In some embodiments, the comparison data comprises the difference between the two data sets and is referred to as "difference information." For example, difference information can be generated on a frame-by-frame basis. The comparison can also be performed block by block. A block, as referred to herein, can vary from a "block" of one pixel (1x1) to a "block" of MxN pixels of arbitrary size. The shape of a block need not be square.
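The block-by-block comparison described above can be sketched as follows; this toy version, with invented names, scores each tile by mean absolute difference, assuming a grayscale frame stored row-major as a flat list (the patent does not prescribe a particular block metric):

```python
def block_differences(orig, recon, width, block):
    """Mean absolute difference per (block x block) tile of a
    width-wide grayscale frame; tiles at the right/bottom edges
    may be smaller than block x block."""
    height = len(orig) // width
    out = {}
    for by in range(0, height, block):
        for bx in range(0, width, block):
            acc = n = 0
            for y in range(by, min(by + block, height)):
                for x in range(bx, min(bx + block, width)):
                    acc += abs(orig[y * width + x] - recon[y * width + x])
                    n += 1
            out[(by, bx)] = acc / n  # keyed by tile's top-left corner
    return out
```

An encoder could use such per-block scores to decide which regions warrant assist information at all, rather than encoding the full-frame residual.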
The "difference information" represents the image degradation, resulting from the encoding/decoding process, that would be seen in the multimedia data displayed at the terminal 16. The comparator 23 provides the comparison information to a second encoder 24. The comparison information is encoded in the second encoder 24, and the encoded "assist information" is provided to the communication module 25. The communication module 25 can transmit data 18, comprising the encoded multimedia and the encoded assist information, to a terminal device 16 (Fig. 1). A decoder in the terminal device uses the "assist information" to add enhancement (for example, to add detail) to the decoded multimedia data that was affected or degraded during encoding or decoding. This enhances the image quality of the received encoded multimedia data, so that a high-resolution decoded image can be presented on the display device. In some embodiments, the first encoder 21 and the second encoder 24 can be implemented as a single encoder.
The post-processing techniques can comprise one or more techniques that enhance certain features in the multimedia data (for example, skin and facial features). The encoded difference information is transmitted to the receiving device. The receiving device uses the assist information to add detail to the decoded image to compensate for details affected during encoding and decoding. Accordingly, a higher-resolution and/or better-quality image can be presented on the receiving device.
The difference information is identified as assist information in the primary encoded bitstream. User data or "filler" packets, which are available to make the size of the encoded data fit the packet size of the protocol used to transmit the encoded media data (for example, an IP datagram or MTU), can carry the assist information. In some embodiments, the difference information can be identified as a set of relationships to information already present in the low-resolution encoded data (for example, equations, decision logic, the number and position of quantized residual coefficients, or fuzzy logic rules), and an index to such a relationship can be encoded as the assist information. Because not all difference information has to be encoded, and the form of this information can be reduced to an index into a lookup table of relationships, encoder-assisted upsampling encodes the metadata more efficiently and exploits information already in the receiving device to reduce the entropy of the transmitted information.
Other configurations of the described encoding device 20 are also contemplated. For example, Fig. 3 illustrates an alternative embodiment, an encoding device 30 that uses one encoder 31 in place of two encoders (as shown in Fig. 2). In this embodiment, the comparator 23 provides the difference information to the single encoder 31 for encoding. The encoder 31 provides the encoded multimedia data (for example, first encoded data) and the encoded assist information (for example, second encoded data) to the communication module 25 for transmission to the terminals 16.
Fig. 4 is a block diagram illustrating an example of a portion of the system shown in Figs. 2 and 3, in particular the encoder 21, a decoder 40, and the comparator 23. The decoder 40 is configured to decode encoded multimedia data and to apply the post-processing techniques used in a receiving terminal 16 (Fig. 1). The functionality of the decoder 40 can be implemented in the encoders described herein, for example in the decoder 22 illustrated in Figs. 2 and 3. The decoder 22 receives encoded multimedia data from the encoder 21. A decoder module 41 in the decoder 40 decodes the encoded multimedia data and provides the decoded data to post-processing modules in the decoder 40. In this example, the post-processing modules comprise a noise suppressor module 42 and a data enhancer module 43.
The noise in a video sequence is usually assumed to be additive white Gaussian. However, a video signal is highly correlated in both time and space. Therefore, noise can be partially removed from the signal by exploiting its whiteness both temporally and spatially. In some embodiments, the noise suppressor module 42 comprises temporal noise suppression, for example a Kalman filter. The noise suppressor module 42 can comprise other noise suppression methods, for example a wavelet shrinkage filter and/or a wavelet Wiener filter. Wavelets are a class of functions used to localize a given signal in both the spatial and scaling domains. The underlying idea of wavelets is to analyze the signal at different scales or resolutions, such that small changes in the wavelet representation produce correspondingly small changes in the original signal. Wavelet shrinkage or a wavelet Wiener filter can also be applied as the noise suppressor 42. Wavelet shrinkage noise suppression can involve shrinking in the wavelet transform domain and typically comprises three steps: a linear forward wavelet transform, a nonlinear shrinkage noise suppression, and a linear inverse wavelet transform. The Wiener filter is an MSE-optimal linear filter that can be used to improve images degraded by additive noise and blurring. In some aspects, the noise suppression filter is based on an aspect of a (4, 2) biorthogonal cubic B-spline wavelet filter.
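The three-step wavelet shrinkage structure described above (linear forward transform, nonlinear shrinkage, linear inverse transform) can be illustrated with a single-level Haar transform and soft thresholding. The patent's filter is a (4, 2) biorthogonal cubic B-spline wavelet, not Haar, so this is only a structural sketch with invented names:

```python
def haar_forward(x):
    """Step 1, linear forward transform: one level of the
    orthonormal Haar wavelet (even-length input assumed)."""
    s2 = 2 ** 0.5
    avg = [(x[i] + x[i + 1]) / s2 for i in range(0, len(x), 2)]
    det = [(x[i] - x[i + 1]) / s2 for i in range(0, len(x), 2)]
    return avg, det

def soft_threshold(coeffs, t):
    """Step 2, nonlinear shrinkage: pull small (noise-like)
    detail coefficients toward zero."""
    return [max(abs(c) - t, 0.0) * (1 if c >= 0 else -1) for c in coeffs]

def haar_inverse(avg, det):
    """Step 3, linear inverse transform."""
    s2 = 2 ** 0.5
    out = []
    for a, d in zip(avg, det):
        out.extend([(a + d) / s2, (a - d) / s2])
    return out

def wavelet_shrink(x, t):
    avg, det = haar_forward(x)
    return haar_inverse(avg, soft_threshold(det, t))
```

With threshold zero the round trip is the identity; with a larger threshold, small pixel-to-pixel fluctuations (treated as noise) are flattened toward local averages.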
The noise suppressor module 42 provides the noise-suppressed decoded data to the data enhancer module 43. The data enhancer module 43 can be configured to enhance data for certain features considered desirable to view, for example skin, facial features, and rapidly changing data (for example, for multimedia data associated with sporting events). The main function of the data enhancer module is to provide image or video enhancement during the rendering or display stage of the data. Typical image enhancements include sharpening, color gamut/saturation/hue improvement, contrast improvement, histogram equalization, and high-frequency emphasis. With respect to enhancing skin features, several skin-tone detection methods exist. Once a region of an image containing skin tone has been identified, the chroma components corresponding to that region can be modified to improve the hue so as to fit a desired palette.
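Histogram equalization, named above as a typical enhancement, can be sketched for an 8-bit grayscale image as follows (a textbook CDF-based version, not code from the patent):

```python
def equalize_histogram(pixels):
    """Classic histogram equalization: build the cumulative
    distribution of 8-bit luma values and use it as a lookup table
    that spreads the occupied intensity range over [0, 255]."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    cdf, run = [], 0
    for h in hist:
        run += h
        cdf.append(run)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(pixels)
    if n == cdf_min:  # single-valued (flat) image: nothing to spread
        return pixels[:]
    lut = [round((c - cdf_min) / (n - cdf_min) * 255) for c in cdf]
    return [lut[p] for p in pixels]
```

This is also the decoder-side operation that the encoder's "inverse histogram equalization" pre-processing would anticipate.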
With respect to improving facial features, if ringing noise is detected in facial features, for example identified via skin-tone detection, a de-ringing filter and/or an appropriate smoothing/noise-reduction filter can be applied to minimize these artifacts and to perform context/content-selective image enhancement. Video enhancements include flicker reduction, frame rate improvement, and the like. Transmitting an indicator of mean luminance can assist the decoder/post-processor with flicker reduction over a group of frames in the video. Flicker is often caused by DC quantization, resulting in fluctuating average luminance levels in the reconstructed video over frames that originally had the same lighting conditions/luminance. Flicker reduction typically involves computing the mean luminance (for example, a DC histogram) of neighboring frames and applying an equalization filter over the frames in question so that the mean luminance of each frame is restored to the computed mean. In this case, the difference information can be a precomputed mean-luminance offset to be applied to each frame. The data enhancer module 43 provides the enhanced decoded multimedia data to the comparator 23.
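The mean-luminance equalization step of flicker reduction can be sketched as follows; the sliding-window choice and the function names are illustrative assumptions, and the per-frame offsets computed here play the role of the precomputed difference information mentioned above:

```python
def mean_luma(frame):
    """Per-frame average luma (the frame's 'DC' level)."""
    return sum(frame) / len(frame)

def deflicker(frames, window=3):
    """Offset each frame so that its mean luma matches the average
    mean luma of a sliding window of neighbouring frames, damping
    frame-to-frame DC fluctuation (flicker)."""
    means = [mean_luma(f) for f in frames]
    out = []
    for i, frame in enumerate(frames):
        lo = max(0, i - window // 2)
        hi = min(len(frames), i + window // 2 + 1)
        target = sum(means[lo:hi]) / (hi - lo)
        offset = target - means[i]  # the transmissible per-frame offset
        out.append([p + offset for p in frame])
    return out
```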
Fig. 5 is a block diagram illustrating an example of an encoding device 50 having a processor 51 configured to encode multimedia data. The encoding device 50 can be implemented in a transmission facility, for example the digital transmission facility 12 (Fig. 1). The encoding device 50 comprises a storage medium 58 configured to communicate with the processor 51, which in turn is configured to communicate with a communication module 59. In some embodiments, the processor 51 is configured to encode multimedia data in a manner similar to the encoding device 20 illustrated in Fig. 2. The processor 51 encodes the received multimedia data using a first encoder module 52. The encoded multimedia data is then decoded using a decoder module 53, which is configured to decode the multimedia data using at least one post-processing technique implemented in the terminals 16 (Fig. 1). The processor 51 removes noise in the decoded multimedia data using a noise suppressor module 55. The processor 51 can comprise a data enhancer module 56, which is configured to enhance the decoded multimedia data for predetermined features, for example facial features or skin.
A comparator module 54 determines the difference between the decoded (and enhanced) multimedia data and the original multimedia data, producing difference information representing that difference. The enhanced difference information is encoded by a second encoder 57. The second encoder 57 produces the encoded assist information, which is provided to the communication module 59. The encoded multimedia data is also provided to the communication module 59. Both the encoded multimedia data and the assist information can be sent to a display device (for example, a terminal 16 in Fig. 1), which uses the assist information in decoding the multimedia data to produce enhanced multimedia data.
Fig. 6 is a block diagram illustrating another embodiment of an encoding device 60 having a processor 61 configured to encode multimedia data. This embodiment can encode multimedia data similarly to Fig. 5, except that the processor 61 contains a single encoder 62 that encodes both the multimedia data and the difference information. The encoded multimedia data and assist information are then sent by the communication module 59 to a display device (for example, a terminal 16 in Fig. 1). A decoder in the display device then uses the assist information in decoding the multimedia data to produce enhanced-resolution data and display it.
Examples of some post-processing techniques that may be implemented in a decoder are listed below; however, the description of these examples is not meant to limit the disclosure to only the techniques described. As noted above, the decoder 22 can implement any of numerous post-processing techniques to identify difference information and produce the corresponding assist information.
Chroma processing
One example of a post-processing technique is chroma processing, which involves operations on the chroma of the multimedia data to be displayed. Color space conversion is one such example. Typical compression operations (decoding, deblocking, etc.) and some post-processing operations (for example, intensity modification functions that operate on the luma or Y component independently of chroma, such as histogram equalization) occur in the YCbCr or YUV domain or color space, whereas displays typically operate in the RGB color space. Color space conversion is performed in the preprocessor and the display processor to resolve this difference. If the same bit depth is maintained, conversion between RGB and YCC/YUV can result in data compression, because when the intensity information in R, G, and B is transformed into the Y component, the redundancy among them is reduced, leading to considerable compression of the source signal. Accordingly, any post-processing-based compression will potentially operate in the YCC/YUV domain.
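One common pair of conversion matrices for the RGB-to-YCbCr round trip is the full-range (JFIF-style) BT.601 variant sketched below; the patent does not specify which variant (full-range vs. studio-swing, BT.601 vs. BT.709) is assumed, so this is an illustrative choice:

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 (JFIF) RGB -> YCbCr conversion.
    Chroma is centred on 128 for 8-bit components."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def ycbcr_to_rgb(y, cb, cr):
    """Inverse of the matrix above (display-side conversion)."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return r, g, b
```

Note that a neutral gray maps to Cb = Cr = 128, which is why luma-only operations such as histogram equalization can run in this domain without shifting colors.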
Chroma subsampling is the practice of implementing more resolution for luminance (the quantity representing brightness) than for color (the quantity representing chrominance). It is used in many video encoding schemes (analog and digital) and also in JPEG encoding. In chroma subsampling, the luma and chroma components are formed as weighted sums of gamma-corrected (tristimulus) R'G'B' components, rather than of linear (tristimulus) RGB components. The subsampling scheme is usually expressed as a three-part ratio (for example, 4:2:2), but sometimes as four parts (for example, 4:2:2:4). The parts are, in their respective order: first, the luma horizontal sampling reference (originally, a multiple of 3.579 MHz in the NTSC television system); second, the Cb and Cr (chroma) horizontal factor (relative to the first digit); third, a digit that is either the same as the second or zero (zero indicating that Cb and Cr are vertically subsampled 2:1); and fourth, if present, a digit equal to the luma digit (indicating an alpha "key" component). Post-processing techniques can include chroma upsampling (for example, converting 4:2:0 data to 4:2:2 data) or downsampling (for example, converting 4:4:4 data to 4:2:0 data). Low-to-medium bitrate compression is usually performed on 4:2:0 video. If the source multimedia data has chroma higher than 4:2:0 (for example, 4:4:4 or 4:2:2), it can be downsampled to 4:2:0, encoded, transmitted, decoded, and then upsampled back to the original chroma during post-processing. At the display device, the chroma is restored to its full 4:4:4 ratio when converting to RGB for display. Decoder 22 can be configured with this type of post-processing operation to replicate the decoding/processing operations that will take place downstream at the display device.
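The 4:4:4 to 4:2:0 round trip described above can be sketched as follows. This is a minimal illustration using 2x2 block averaging for downsampling and nearest-neighbour replication for upsampling; these are common filter choices, but the patent does not prescribe a specific filter, so treat them as assumptions.

```python
def chroma_420_down(plane):
    # average each 2x2 block of a chroma plane (list of rows);
    # assumes even dimensions for simplicity
    h, w = len(plane), len(plane[0])
    return [[(plane[y][x] + plane[y][x + 1] +
              plane[y + 1][x] + plane[y + 1][x + 1]) / 4.0
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]

def chroma_420_up(plane):
    # nearest-neighbour replication back to full resolution
    return [[v for v in row for _ in (0, 1)]
            for row in plane for _ in (0, 1)]
```

A flat chroma region survives the round trip exactly; detailed chroma is what actually loses resolution, which is why the downstream upsampling step must be replicated at the encoder to compute accurate difference information.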
Graphics Operations
Post-processing techniques related to graphics processing may also be implemented in decoder 22. Some display devices include a graphics processor, for example, display devices that support multimedia and 2D or 3D gaming. The functionality of a graphics processor includes pixel-processing operations, some (or all) of which can be applied, where appropriate, to improve video quality or potentially be incorporated into video processing that includes compression/decompression.
Alpha Blending
Alpha blending, commonly used for a transition between two scenes or for overlaying video onto an existing on-screen GUI, is one example of a pixel-operation post-processing technique that may be implemented in decoder 22. In alpha blending, the alpha value in the color code ranges from 0.0 to 1.0, where 0.0 represents a fully transparent color and 1.0 represents a fully opaque color. To "blend," a pixel read from the picture buffer is multiplied by "alpha," a pixel read from the display buffer is multiplied by one minus alpha, and the two products are added together and the result displayed. Video content contains various forms of transition effects, including: fade transitions from/to black or another uniform/constant color, cross fades between scenes, and junctions between content types (for example, animation to commercial video, etc.). The H.264 standard has provisions for transmitting alpha values along with the frame numbers or POCs (picture order counts) of the transition, with indicators for its start and stop points. A uniform color to be used for the transition can also be specified.
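The per-pixel blend described above (picture-buffer pixel times alpha, display-buffer pixel times one minus alpha) can be sketched as follows; the linear alpha ramp used for the cross-fade is an assumption for illustration, since other ramps are possible.

```python
def alpha_blend(picture_px, display_px, alpha):
    # alpha in [0.0, 1.0]: 0.0 is a fully transparent overlay,
    # 1.0 a fully opaque overlay
    return alpha * picture_px + (1.0 - alpha) * display_px

def cross_fade(frame_a_px, frame_b_px, frame_idx, n_frames):
    # sweep alpha linearly from 0 to 1 across the transition,
    # switching gradually from image A to image B
    alpha = frame_idx / float(n_frames - 1)
    return alpha_blend(frame_b_px, frame_a_px, alpha)
```

Signaling only `alpha` (plus the start/stop frames) as metadata is far cheaper than encoding the large residuals such a gradual switch would otherwise produce.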
Transition regions can be difficult to encode because a transition is not an abrupt scene change, in which the beginning (first frame) of the new scene can be encoded as an I-frame and subsequent frames as predicted frames. In a fade, owing to the nature of the motion-estimation/compensation techniques typically used, motion can be tracked block by block while a constant luma offset is absorbed into the residual (weighted prediction solves this to some extent). Cross fades pose a bigger problem, because the change in luminance and the apparent motion being tracked are not real motion but a gradual switch from one image to another, which produces larger residuals. After quantization (at low bitrates), these larger residuals cause extensive motion and blocking artifacts. Relative to the case that induces blocking artifacts, encoding the complete images that bound the transition region and specifying an alpha-blend configuration to effect the fade/cross fade yields artifact-free playback and, for similar or better perceptual/visual quality, an improvement in compression efficiency/ratio or a reduction in bitrate.
Knowing the alpha-blending capability of the decoder at the encoder allows transition effects to be encoded as metadata, rather than spending bits on large residuals through conventional encoding. Besides alpha values, some examples of such metadata include an index into a set of transition effects supported at the decoder/post-processor (for example, zoom, rotate, fade-out, and dissolve).
Transparency
" transparency " is another the simple relatively reprocessing pixel operation in the decoder 22 that can be included in code device 20.In transparency process, read pixel value from display buffer, and read another pixel value (frame that will show) from picture buffer.If the value coupling transparence value from picture buffer is read then will write display from the value that display buffer reads.Otherwise, will write display from the value that picture buffer reads.
Video Scaling (x2, /2, /4, arbitrary ratios)
The intent of video scaling ("upscaling" or "downscaling") is generally to preserve as much of the original signal information and quality as possible when moving the conveyed information from one signal format or resolution to another. Downscaling works at factors of two (2) or four (4) and is performed via simple averaging of pixel values. Upscaling involves interpolation filters and can be performed along both axes. Bicubic interpolation is applied to the Y values, and nearest-neighbor filtering to the chroma values.
For instance, the interpolated Y values can be computed by the following equations:

Y[i, j] = (−Y[i−3, j] + 9·Y[i−1, j] + 9·Y[i+1, j] − Y[i+3, j]) / 16    (Equation 1)

for each interpolated Y within a row, and

Y[i, j] = (−Y[i, j−3] + 9·Y[i, j−1] + 9·Y[i, j+1] − Y[i, j+3]) / 16    (Equation 2)

for each interpolated Y within a column.
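The two equations above can be exercised directly. The sketch below doubles a row of Y samples using the (−1, 9, 9, −1)/16 tap set, inserting one interpolated sample between each pair of originals; the clamped border handling is an assumption, since the text does not specify it.

```python
def interp_luma(a, b, c, d):
    # 4-tap filter from Equations 1/2: taps (-1, 9, 9, -1) / 16
    return (-a + 9 * b + 9 * c - d) / 16.0

def upscale_row_2x(row):
    # double a row of Y samples; taps outside the row are clamped
    # to the nearest edge sample (assumed border handling)
    out = []
    n = len(row)
    for i in range(n):
        out.append(row[i])
        a = row[max(i - 1, 0)]
        b = row[i]
        c = row[min(i + 1, n - 1)]
        d = row[min(i + 2, n - 1)]
        out.append(interp_luma(a, b, c, d))
    return out
```

Note the negative outer taps: a flat region is reproduced exactly, while an edge is slightly overshot, which is the sharpening behavior that distinguishes bicubic from bilinear interpolation below.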
In side-by-side comparisons, the bilinear and bicubic interpolation schemes show minimal visible difference; bicubic interpolation yields a slightly sharper image. Larger line buffers must be provisioned in order to perform bicubic interpolation. All the bicubic filters are one-dimensional, and the coefficients depend only on the scaling ratio. In one example, 8 bits are sufficient to encode the coefficients and guarantee picture quality. The coefficients need only be encoded unsigned, since encoding the sign in circuitry can be difficult; for bicubic interpolation, the signs of the coefficients are always [−, +, +, −].
Fig. 8 shows various filter choices for given scale factors. The scale factors listed in Fig. 8 are examples of those most often encountered on mobile devices. For each scale factor, different filter phases can be selected based on the type of edge detected and the desired roll-off characteristic. For some textures and edge regions, certain filters work better than others. The filter taps were derived based on experimental results and visual evaluation. In some embodiments, a moderately sophisticated scaler at the receiver (decoder/display driver) can select among the filters adaptively on a per-block/tile basis. An encoder that understands the characteristics of the receiver's scaler can indicate (based on comparison against the original) which filter to select for each block (for example, by providing a table of filter indices). This method can be an alternative to the decoder deciding on the appropriate filter via edge detection. It minimizes processing cycles and decoder power, because the decoder need not run the decision logic associated with edge detection (for example, the clipping and directional operations that consume many processor cycles).
Gamma correction
Gamma correction, gamma nonlinearity, gamma encoding, or often simply gamma, is the name of a nonlinear operation used to code and decode luminance or tristimulus values in video or still-image systems, and it is another post-processing technique that can be implemented in decoder 22. Gamma correction controls the overall brightness of an image. Images that are not properly corrected can look washed out or too dark. Accurately reproducing colors also requires some knowledge of gamma correction: varying the amount of gamma correction changes not only the brightness but also the red:green:blue ratio. In the simplest case, gamma correction is defined by the following power-law expression:
V_out = V_in^γ    (Equation 3)
where the input and output values are non-negative real values, typically in a predetermined range such as 0 to 1. The case γ < 1 is usually called gamma compression, and γ > 1 gamma expansion. In embodiments where the decoder post-processing includes gamma correction, a corresponding gamma post-processing technique can be implemented in decoder 22. Gamma correction is usually performed in the analog domain within the LCD panel. Usually gamma correction is followed by dithering, but in some cases dithering is performed first.
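Equation 3 can be written directly in code, with values normalized to the 0-to-1 range mentioned above:

```python
def gamma_correct(v_in, gamma):
    # Equation 3: V_out = V_in ** gamma, for normalized v_in in [0, 1].
    # gamma < 1 is gamma compression; gamma > 1 is gamma expansion.
    return v_in ** gamma
```

Because the curve is anchored at 0 and 1, gamma compression brightens mid-tones while gamma expansion darkens them, which is how the operation shifts overall image brightness without clipping the extremes.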
Histogram equalization
Histogram equalization is a method of modifying the dynamic range of the pixels in an image using a histogram of the pixel values. Usually, the information in an image is not evenly distributed across the range of possible values. The frequency distribution of an image's pixel brightness can be depicted by plotting the number of pixels (y-axis) against the brightness of each pixel (x-axis; for example, 0 to 255 for an 8-bit monochrome image), forming an image histogram. The image histogram is a graphical representation of the number of pixels in the image that fall within the various brightness-level bounds. The dynamic range is a measure of the width of the occupied portion of the histogram. Usually, an image with a small dynamic range also has low contrast, and an image with a large dynamic range has high contrast. A mapping operation (for example, histogram equalization, a contrast or gamma adjustment, or another remapping operation) can be used to change the dynamic range of an image. When the dynamic range of an image has been reduced, the resulting "flattened" image can be represented (and encoded) with fewer bits.
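The histogram and the dynamic-range measurement described above can be computed directly, for example:

```python
def image_histogram(pixels, levels=256):
    # count of pixels (y-axis) at each brightness value (x-axis)
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    return hist

def dynamic_range(hist):
    # width of the occupied portion of the histogram
    occupied = [i for i, count in enumerate(hist) if count > 0]
    if not occupied:
        return 0
    return occupied[-1] - occupied[0] + 1
```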
Dynamic-range adjustment can be performed on a pixel intensity range (for example, a range of pixel brightness values). Although it is usually applied to an entire image, dynamic-range adjustment can also be applied to an identified portion of an image (for example, a pixel intensity range representing a part of the image). In some embodiments, an image can have two or more identified portions (for example, distinguished by different image subject-matter content, by spatial location, or by different portions of the image histogram), and the dynamic range of each portion can be adjusted individually.
Histogram equalization can be used to increase the local contrast of an image, especially when the usable data of the image is represented by closely clustered contrast values. Through the adjustment, the intensities can be better distributed across the histogram. This allows areas of low local contrast to gain higher contrast without affecting the global contrast. Histogram equalization accomplishes this by effectively spreading out the pixel intensity values. The method is useful for images whose backgrounds and foregrounds are both bright or both dark.
Although histogram equalization improves contrast, it also reduces the compression efficiency of the image. In some encoding methods, the "reverse" of histogram equalization can be applied before encoding to substantially improve compression efficiency. In the inverse histogram equalization process, the pixel brightness values are remapped to reduce contrast; the resulting image histogram has a smaller (compressed) dynamic range. In some embodiments of this process, the histogram of each image can be derived before the image is encoded. The brightness range of the pixels in a multimedia image can be scaled to effectively compress the image histogram into a narrower range of brightness values, thereby reducing the contrast of the image. When this image is compressed, the low/narrow range of brightness values makes the coding efficiency higher than it would be without histogram compression. When the image is decoded at the terminal device, a histogram equalization process running on the terminal device restores the contrast of the image to its original distribution. In some embodiments, the encoder can maintain (or receive) an indicator identifying the histogram equalization algorithm used by the decoder at the terminal device. In that case, the encoder can use the inverse of that histogram equalization algorithm to improve compression efficiency, and then provide enough information to the decoder for the contrast to be restored.
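A minimal linear version of the compress-then-restore round trip described above can be sketched as follows. A real implementation would invert the decoder's actual equalization mapping; the straight linear remap here is a simplifying assumption for illustration.

```python
def compress_range(pixels, lo, hi):
    # linearly remap the occupied brightness range of the image
    # into the narrower [lo, hi] interval before encoding
    mn, mx = min(pixels), max(pixels)
    if mx == mn:
        return [float(lo)] * len(pixels)
    return [lo + (p - mn) * (hi - lo) / float(mx - mn) for p in pixels]

def expand_range(pixels, orig_lo, orig_hi):
    # decoder-side inverse: stretch the decoded values back out
    # to the original range before display
    return compress_range(pixels, orig_lo, orig_hi)
```

The narrower intermediate range is what the encoder actually quantizes, which is where the coding-efficiency gain comes from; the expansion at the terminal device restores the visible contrast.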
Figure 11 illustrates an embodiment of an encoding device 1120 that can reduce the dynamic range of multimedia data before encoding it, so that fewer bits are used to encode the multimedia data. In Figure 11, a multimedia source 1126 provides multimedia data to the encoding device 1120. The encoding device 1120 comprises a pre-processor 1118, which receives the multimedia data and reduces the dynamic range of at least one image contained in the multimedia data. The resulting "compressed" data reduces the size of the multimedia data, and correspondingly reduces the amount of multimedia data that needs to be encoded. The resulting data is provided to encoder 1121.
Encoder 1121 encodes the adjusted multimedia data and provides the encoded data to communication module 1125 for transmission to a terminal device 16 (for example, a handset) as illustrated in Fig. 1. In some embodiments, information associated with the dynamic-range adjustment is also provided to encoder 1121. This information can be maintained in the encoding device 1120 as an indicator of the modification performed on the pixel intensity range. If the information (or indicator) associated with the dynamic-range adjustment is provided, encoder 1121 can also encode it and provide it to communication module 1125 for transmission to terminal device 16. Terminal device 16 then remaps (expands) the dynamic range of the image before displaying it. In some embodiments, an encoder such as encoder 21 of Fig. 2 can be configured to perform this pre-processing dynamic-range adjustment. In some embodiments, the pre-processing dynamic-range adjustment can be performed in addition to other encoding embodiments (including, for example, the encoding embodiments described with reference to Figs. 1 through 9).
The metadata (or indicators) used to specify the types of post-processing operations to be performed at the decoder, and their parameters, are depicted in Fig. 9. The scaling options described in Fig. 9 are different sets of coefficients for the interpolation filter. The function indicator is an index into the set of post-processing functions listed in the second column of the table illustrated in Fig. 9. The encoder selects from this set the function that produces the minimum entropy of the difference information to be encoded (on a per-block basis). Optionally, the selection criterion can instead be best quality, measured for some target devices by a quality metric (for example, PSNR, SSIM, PQR, etc.). In addition, for each specified function, a set of options is provided based on the methods available for that function. For instance, edge enhancement can occur outside the loop by applying an edge-detection method (for example, a set of Sobel filters, or 3x3 or 5x5 Gaussian masks) followed by high-frequency emphasis. In some embodiments, edge enhancement can occur inside the loop by using the in-loop deblocker circuitry. In the latter case, the edge-detection method used during in-loop deblocking identifies the edges, and a sharpening filter that enhances the edges serves as an auxiliary function to the conventional low-pass filtering performed by the deblocking filter. Similarly, histogram equalization has options for equalizing over the full range of intensity levels or only part of it, and gamma correction has an option for dithering.
Fig. 7 illustrates an example of a process 70 for encoding multimedia data by an encoding structure (for example, encoding device 20 (Fig. 2), encoding device 30 (Fig. 3), encoding device 40 (Fig. 4), or encoding device 50 (Fig. 5)). At state 71, the process maintains an indicator of a post-processing technique. For instance, the post-processing technique can be one used in the decoder of a display device (for example, terminal 16 (Fig. 1)). The metadata can also indicate well-known or generic processing techniques without specifically knowing which post-processing technique (if any) is performed at the receiving display device. At state 72, first multimedia data that has been received is first encoded to form first encoded multimedia data.
At state 73, process 70 produces second multimedia data by decoding the first encoded multimedia data and applying the post-processing technique identified by the indicator. The post-processing technique can be one of the post-processing techniques described herein, or another. At state 74, process 70 compares the second multimedia data with the first multimedia data to determine comparison information. The comparison information can be difference information indicating the difference between the second multimedia data and the first multimedia data. At state 75, process 70 then encodes the comparison information to form assistance information (second encoded data). The assistance information and the encoded multimedia data can subsequently be sent to a display device, which can use the assistance information in decoding the multimedia data.
Figure 10 is a flowchart illustrating a process 1000 (for example, performed by the encoding device 1120 of Figure 11) for encoding multimedia data by reducing the pixel luminance intensity range of at least a portion of the multimedia data before encoding it. At state 1005, process 1000 identifies a pixel luminance intensity range in the multimedia data. For instance, if the multimedia data comprises an image, process 1000 can identify or determine the pixel intensity range of that image. If the multimedia data comprises a sequence of images (for example, video), one or more pixel intensity ranges in the images can be identified. For instance, a pixel intensity range can be a range of brightness values that contains the brightness values of 90% of the pixels in the image (or, for example, 95% or 99%). In some embodiments, if the images in a sequence are similar, the same pixel intensity range can be identified for all (or many) of the images in the sequence. In some embodiments, the pixel luminance intensity range can be identified by averaging over two or more images.
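One simple way to identify a pixel intensity range containing 90% (or 95%, 99%) of the pixels is to trim equally from both tails of the sorted brightness values; the equal-tail split here is an assumption, since the text only specifies the coverage fraction.

```python
def intensity_range_covering(pixels, fraction=0.90):
    # smallest symmetric [lo, hi] brightness interval containing
    # `fraction` of the pixels, trimming equally from both tails
    ordered = sorted(pixels)
    n = len(ordered)
    trim = int(round(n * (1.0 - fraction) / 2.0))  # pixels dropped per tail
    return ordered[trim], ordered[n - 1 - trim]
```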
At state 1010, process 1000 modifies a portion of the multimedia data to reduce the pixel luminance intensity range. Usually, the pixel brightness values of an image are concentrated in one part of the available intensity range. Reducing (or remapping) the pixel values to cover a smaller range can greatly reduce the amount of data in the image, which facilitates more efficient data encoding and transmission. Examples of reducing the pixel luminance intensity range include "inverse" histogram equalization, gamma correction, or remapping the brightness values from the "full" range (for example, 0-255 for 8-bit images) to a reduced range covering only a portion of the full intensity range.
At state 1015, process 1000 encodes the modified multimedia data to form encoded data. The encoded data can be transmitted to a terminal device 16 (Fig. 1) that decodes the encoded data. A decoder in the terminal device performs a process for expanding the intensity range of the multimedia data. For instance, in some embodiments, the decoder performs histogram equalization, gamma correction, or another remapping process to expand the pixel values of the multimedia data across a pixel intensity range. The resulting range-expanded multimedia data may look similar to its original appearance, or at least be pleasing to view on the display of the terminal device. In some embodiments, an indicator signaling the intensity-range reduction can be produced, encoded, and transmitted to the terminal device. A decoder in the terminal device can use the indicator as assistance information for decoding the received multimedia data.
It should be noted that the aspects described may be presented as a process depicted as a flowchart, a flow diagram, a structure diagram, or a block diagram. Although a flowchart may describe the operations as a sequential process, many of the operations can be performed in parallel or concurrently. In addition, the order of the operations may be rearranged. A process is terminated when its operations are completed. A process may correspond to a method, a function, a procedure, a routine, a subroutine, and so on. When a process corresponds to a function, its termination corresponds to the function returning to the calling function or the main function.
Those skilled in the art will also appreciate that one or more elements of the devices disclosed herein may be rearranged without affecting the operation of the device. Similarly, one or more elements of the devices disclosed herein may be combined without affecting the operation of the device. Those of ordinary skill in the art will understand that information and signals may be represented using any of a variety of different technologies and techniques. Those skilled in the art will further appreciate that the various illustrative logical blocks, modules, and algorithm steps described in connection with the examples disclosed herein may be implemented as electronic hardware, firmware, computer software, middleware, microcode, or combinations thereof. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and the design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the disclosed methods.
The steps of a method or algorithm described in connection with the examples disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an application-specific integrated circuit (ASIC). The ASIC may reside in a wireless modem. In the alternative, the processor and the storage medium may reside as discrete components in a wireless modem.
In addition, the various illustrative logical blocks, components, modules, and circuits described in connection with the examples disclosed herein may be implemented or performed with a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general-purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, for example, a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.
The previous description of the disclosed examples is provided to enable any person skilled in the art to make or use the disclosed methods and apparatus. Various modifications to these examples will be readily apparent to those skilled in the art, and the principles defined herein may be applied to other examples, and additional elements may be added, without departing from the spirit or scope of the disclosed methods and apparatus. The description of the aspects is intended to be illustrative, and not to limit the scope of the claims.

Claims (13)

1. A method of processing multimedia data, the method comprising:
identifying at least one pixel intensity range in at least one image of multimedia data;
modifying at least a portion of the multimedia data to reduce the at least one pixel intensity range; and
encoding the modified multimedia data to form encoded multimedia data;
at an encoder for a terminal device, identifying an indicator of one of a plurality of post-processing techniques used by a decoder of the terminal device to modify the pixel intensity range;
at the encoder, decoding the encoded multimedia data to form decoded multimedia data;
at the encoder, applying the post-processing technique identified by the indicator to the decoded multimedia data to form second multimedia data;
at the encoder, comparing the multimedia data with the second multimedia data to produce assistance information; and
transmitting the assistance information to the terminal device.
2. The method according to claim 1, further comprising transmitting the encoded multimedia data to the terminal device.
3. The method according to claim 1, further comprising:
maintaining an indicator indicating the modification performed on the multimedia data to reduce the at least one pixel intensity range; and
encoding the indicator for transmission to the terminal device.
4. The method according to claim 3, further comprising transmitting the indicator and the multimedia data to the terminal device.
5. The method according to claim 1, wherein the post-processing technique comprises an inverse histogram equalization or gamma correction operation.
6. The method according to claim 1, wherein the at least one pixel intensity range is modified based at least in part on a detected range of pixel values and a threshold defining a limit of the range of pixel intensity values.
7. The method according to claim 1, wherein said identifying at least one pixel intensity range in at least one image of the multimedia data comprises identifying two or more pixel intensity ranges in at least one image of the multimedia data, and wherein said modifying comprises modifying the multimedia data to reduce the two or more pixel intensity ranges.
8. The method according to claim 7, wherein the two or more pixel intensity ranges represent different image subject matter of the at least one image or different spatial locations of the pixels of the at least one image.
9. The method according to claim 7, wherein the two or more pixel intensity ranges represent different portions of an image histogram of the at least one image.
10. A system for processing multimedia data, comprising:
means for identifying at least one pixel intensity range in at least one image of multimedia data;
means for modifying at least a portion of the multimedia data to reduce the at least one pixel intensity range; and
means for encoding the modified multimedia data to form encoded multimedia data;
means for identifying an indicator of one of a plurality of post-processing techniques used by a decoder in a terminal device to modify the pixel intensity range;
means for decoding the encoded multimedia data to form decoded multimedia data;
means for applying the post-processing technique identified by the indicator to the decoded multimedia data to form second multimedia data;
means for comparing the multimedia data with the second multimedia data to produce assistance information; and
means for transmitting the assistance information to the terminal device.
11. The system according to claim 10, further comprising means for transmitting the encoded multimedia data to the terminal device.
12. The system according to claim 10, further comprising:
means for maintaining an indicator indicating the modification performed on the multimedia data to reduce the at least one pixel intensity range; and
means for encoding the indicator for transmission to the terminal device.
13. The system according to claim 10, wherein the means for modifying the at least one pixel intensity range comprises means for performing an inverse histogram equalization operation.
CN200780027205.7A 2006-07-20 2007-07-19 Method and apparatus for encoder assisted pre-processing Expired - Fee Related CN101491103B (en)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US83234806P 2006-07-20 2006-07-20
US60/832,348 2006-07-20
US11/779,869 US8253752B2 (en) 2006-07-20 2007-07-18 Method and apparatus for encoder assisted pre-processing
US11/779,869 2007-07-18
PCT/US2007/073854 WO2008011502A2 (en) 2006-07-20 2007-07-19 Dynamic range reduction before video encoding

Publications (2)

Publication Number Publication Date
CN101491103A CN101491103A (en) 2009-07-22
CN101491103B true CN101491103B (en) 2011-07-27

Family

ID=40892192

Family Applications (2)

Application Number Title Priority Date Filing Date
CN200780027133.6A Expired - Fee Related CN101491102B (en) 2006-07-20 2007-07-19 Video coding considering postprocessing to be performed in the decoder
CN200780027205.7A Expired - Fee Related CN101491103B (en) 2006-07-20 2007-07-19 Method and apparatus for encoder assisted pre-processing

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN200780027133.6A Expired - Fee Related CN101491102B (en) 2006-07-20 2007-07-19 Video coding considering postprocessing to be performed in the decoder

Country Status (1)

Country Link
CN (2) CN101491102B (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011123587A (en) * 2009-12-09 2011-06-23 Seiko Epson Corp Image processing apparatus, image display device and image processing method
CN102215318A (en) * 2010-04-08 2011-10-12 苏州尚嘉信息技术有限公司 Processing method for mobile video display
HUE044124T2 (en) * 2011-04-22 2019-09-30 Dolby Int Ab Method and device for lossy compress-encoding data
US9148663B2 (en) 2011-09-28 2015-09-29 Electronics And Telecommunications Research Institute Method for encoding and decoding images based on constrained offset compensation and loop filter, and apparatus therefor
KR20130034566A (en) 2011-09-28 2013-04-05 한국전자통신연구원 Method and apparatus for video encoding and decoding based on constrained offset compensation and loop filter
US9204148B1 (en) 2011-09-28 2015-12-01 Electronics And Telecommunications Research Institute Method for encoding and decoding images based on constrained offset compensation and loop filter, and apparatus therefor
US9204171B1 (en) 2011-09-28 2015-12-01 Electronics And Telecommunications Research Institute Method for encoding and decoding images based on constrained offset compensation and loop filter, and apparatus therefor
US9197904B2 (en) * 2011-12-15 2015-11-24 Flextronics Ap, Llc Networked image/video processing system for enhancing photos and videos
US9137548B2 (en) * 2011-12-15 2015-09-15 Flextronics Ap, Llc Networked image/video processing system and network site therefor
CN105791848B (en) * 2015-01-09 2019-10-01 安华高科技股份有限公司 Method for improving inexpensive video/image compression
US10455230B2 (en) 2015-01-09 2019-10-22 Avago Technologies International Sales Pte. Limited Methods for improving low-cost video/image compression
CN108171195A * 2018-01-08 2018-06-15 Shenzhen Benyuan Weishi Technology Co., Ltd. Identity-credential-based face recognition method, apparatus and access control system
CN110838236A * 2019-04-25 2020-02-25 Shao Wei Mechanical driving platform of electronic equipment
CN111686435A * 2019-12-30 2020-09-22 Song Tongyun Platform and method for identifying out-of-bounds personnel in a match
CN113408705A * 2021-06-30 2021-09-17 Industrial and Commercial Bank of China Ltd. Neural network model training method and device for image processing

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1215909A1 (en) * 1999-09-21 2002-06-19 Sharp Kabushiki Kaisha Image encoding device
US6518970B1 (en) * 2000-04-20 2003-02-11 Ati International Srl Graphics processing device with integrated programmable synchronization signal generation
CN1440617A * 2000-07-03 2003-09-03 Imax Corp. Equipment and techniques for increasing dynamic range of projection system
US6909745B1 (en) * 2001-06-05 2005-06-21 At&T Corp. Content adaptive video encoder

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1333680A3 (en) * 2002-01-16 2007-06-13 Koninklijke Philips Electronics N.V. Digital image processing method
KR20050085554A (en) * 2002-12-10 2005-08-29 코닌클리케 필립스 일렉트로닉스 엔.브이. Joint resolution or sharpness enhancement and artifact reduction for coded digital video

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1215909A1 (en) * 1999-09-21 2002-06-19 Sharp Kabushiki Kaisha Image encoding device
US6518970B1 (en) * 2000-04-20 2003-02-11 Ati International Srl Graphics processing device with integrated programmable synchronization signal generation
CN1440617A * 2000-07-03 2003-09-03 Imax Corp. Equipment and techniques for increasing dynamic range of projection system
US6909745B1 (en) * 2001-06-05 2005-06-21 At&T Corp. Content adaptive video encoder

Also Published As

Publication number Publication date
CN101491102A (en) 2009-07-22
CN101491103A (en) 2009-07-22
CN101491102B (en) 2011-06-08

Similar Documents

Publication Publication Date Title
CN101491103B (en) Method and apparatus for encoder assisted pre-processing
US8155454B2 (en) Method and apparatus for encoder assisted post-processing
US8253752B2 (en) Method and apparatus for encoder assisted pre-processing
US20210377547A1 (en) Adaptive chroma downsampling and color space conversion techniques
CN101371583B (en) Method and device of high dynamic range coding / decoding
JP6278972B2 (en) Method, apparatus and processor readable medium for processing of high dynamic range images
CN110301134A (en) The image shaping of integration and Video coding
US11257195B2 (en) Method and device for decoding a high-dynamic range image
WO2003061295A3 (en) Sharpness enhancement in post-processing of digital video signals using coding information and local spatial features
US20230351561A1 (en) Method and device for obtaining a second image from a first image when the dynamic range of the luminance of the first image is greater than the dynamic range of the luminance of the second image
EP1613092A2 (en) Fixed budget frame buffer compression using block-adaptive spatio-temporal dispersed dither
EP3107300A1 (en) Method and device for encoding both a high-dynamic range frame and an imposed low-dynamic range frame
Lauga et al. Segmentation-based optimized tone mapping for high dynamic range image and video coding
JP2003264830A (en) Image encoder and image decoder
EP3367684A1 (en) Method and device for decoding a high-dynamic range image
JP6584538B2 (en) High dynamic range image processing
EP4236316A1 (en) Improving low-bitrate encoding of high dynamic range content
Okuda et al. Raw image encoding based on polynomial approximation
JP2001008213A (en) Image coder and image decoder

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20110727

Termination date: 20190719