WO2009109936A1 - Arrangement and approach for video data up-conversion - Google Patents

Arrangement and approach for video data up-conversion

Info

Publication number
WO2009109936A1
WO2009109936A1 PCT/IB2009/050912
Authority
WO
WIPO (PCT)
Prior art keywords
pixel block
block
input
pixel
target pixel
Prior art date
Application number
PCT/IB2009/050912
Other languages
French (fr)
Inventor
Erwin Bellers
Jacobus Willem Van Gurp
Original Assignee
Nxp B.V.
Priority date
Filing date
Publication date
Application filed by Nxp B.V. filed Critical Nxp B.V.
Publication of WO2009109936A1 publication Critical patent/WO2009109936A1/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 5/00 Details of television systems
    • H04N 5/14 Picture signal circuitry for video frequency region
    • H04N 5/144 Movement detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00 Geometric image transformations in the plane of the image
    • G06T 3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N 19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N 19/51 Motion estimation or motion compensation
    • H04N 19/513 Processing of motion vectors
    • H04N 19/521 Processing of motion vectors for estimating the reliability of the determined motion vectors or motion vector field, e.g. for smoothing the motion vector field or for correcting motion vectors
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/85 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
    • H04N 19/86 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression involving reduction of coding artifacts, e.g. of blockiness
    • H04N 19/865 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression involving reduction of coding artifacts, e.g. of blockiness with detection of the former encoding block subdivision in decompressed video
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N 21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N 21/4402 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N 21/440281 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by altering the temporal resolution, e.g. by frame skipping
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 7/00 Television systems
    • H04N 7/01 Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level
    • H04N 7/0135 Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level involving interpolation processes
    • H04N 7/0137 Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level involving interpolation processes dependent on presence/absence of motion, e.g. of motion zones
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 7/00 Television systems
    • H04N 7/01 Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level
    • H04N 7/0135 Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level involving interpolation processes
    • H04N 7/014 Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level involving interpolation processes involving the use of motion vectors
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G 2320/00 Control of display operating conditions
    • G09G 2320/02 Improving the quality of display appearance
    • G09G 2320/0261 Improving the quality of display appearance in the context of movement of objects on the screen or movement of the observer relative to the screen
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G 2320/00 Control of display operating conditions
    • G09G 2320/10 Special adaptations of display systems for operation with variable images
    • G09G 2320/106 Determination of movement vectors or equivalent parameters within the image


Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Television Systems (AREA)
  • Control Of Indicators Other Than Cathode Ray Tubes (AREA)

Abstract

Video data is up-converted in a manner that addresses detected or expected artefact conditions. According to an example embodiment, input video data is processed on a pixel block basis by, for each input pixel block (e.g., 120) expected to exhibit undesirable artefact conditions, generating a target pixel block (e.g., 140) using motion characteristics of the input pixel block and different noise factors for different pixels in the target block (e.g., 112, 114, 116).

Description

ARRANGEMENT AND APPROACH FOR VIDEO DATA UP-CONVERSION
The present invention relates generally to image applications, and more specifically, to circuits and methods for processing image data for display.
In many video display applications, it is desirable to process an incoming video stream to generate an output stream that is accurate and pleasing and/or meets certain conditions relative to the communication, processing or display of video. Various processing approaches have been used in connection with the presentation of video on devices such as computer displays, televisions, movie screens or hand-held devices such as mobile telephones and PDAs.
One approach to video processing involves increasing the frame rate of an input video stream (up-converting) to generate an output video stream having a higher frame rate than that of the input video stream. For example, temporal up-conversion has been used in systems that convert a picture rate X into a picture rate Y, where the rate Y is generally greater than the rate X (e.g., for converting a 50Hz input video sequence to a 100Hz output video sequence, or converting a 24Hz input video to a 60Hz output sequence).
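By way of a worked example, the temporal positions that must be interpolated follow directly from the ratio of the two picture rates. The short Python sketch below is an illustrative helper whose name and interface are assumptions rather than part of the described arrangement; it lists the position of each output picture in units of input-picture periods, and the fractional positions identify the pictures that have to be interpolated.

```python
from fractions import Fraction

def interpolation_positions(rate_in, rate_out, n_out):
    """List the temporal position of the first n_out output pictures, expressed
    in input-picture periods. Integer positions coincide with input pictures;
    fractional positions are the ones that must be temporally interpolated."""
    step = Fraction(rate_in, rate_out)  # input periods advanced per output picture
    return [k * step for k in range(n_out)]

# Example: a 24Hz -> 60Hz conversion advances 2/5 of an input period per output
# picture, giving positions 0, 2/5, 4/5, 6/5, 8/5, 2, ...
print(interpolation_positions(24, 60, 6))
```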
In many applications, temporal interpolation is carried out on a block basis (e.g., 8x8, 4x4 or 2x2 pixel blocks) with motion compensation using a motion vector that is constant for each pixel in the block. However, if the motion vector for a particular block is incorrect or inaccurate, an error is generally made for all the pixels in the block and can result in a visible blocking artefact. Such errors may occur, for example, in pixel blocks that are located at the boundaries of moving objects. Certain approaches to address errors such as those described above have been challenging to implement. For example, the block size can be reduced (e.g., from 8x8 to 2x2 pixels) to reduce the visibility of generated artefacts. However, reducing the block sizes generally increases the computational complexity of the up-conversion, which can raise issues with video processing capabilities or may require undesirable cost increases to video processing and/or display equipment.
These and other issues continue to present challenges to the processing and display of image data.
Various aspects of the present invention are directed to arrangements for and methods of processing video data in a manner that addresses and overcomes the above-mentioned issues and other issues as directly and indirectly addressed in the detailed description that follows.
According to an example embodiment of the present invention, a video processing arrangement up-converts input video data having pixel blocks by selectively using a noise factor to mitigate the effects of artefacts. The arrangement includes an up-conversion circuit that is responsive to an artefact condition that exceeds an artefact threshold for an input pixel block by generating a target pixel block from the input pixel block using a motion characteristic of the input pixel block and a noise component that is different for different pixels in the target pixel block. The circuit is responsive to an artefact condition that does not exceed the artefact threshold by generating a target pixel block from the input pixel block using a motion characteristic of the pixel block (e.g., without using a noise factor). The up-converted video data that includes the target pixel block generated for each input pixel block is output for use in generating a video display.
The above summary of the present invention is not intended to describe each embodiment or every implementation of the present invention. Other aspects of the invention will become apparent and appreciated by referring to the following detailed description and claims taken in conjunction with the accompanying drawings.
The invention may be more completely understood in consideration of the following detailed description of various embodiments of the invention in connection with the accompanying drawings, in which:
FIG. 1 shows a video processing and display arrangement, according to an example embodiment of the present invention;
FIG. 2 shows a flow diagram for processing video data, according to another example embodiment of the present invention; and
FIG. 3 shows a circuit arrangement for up-converting video data, according to another example embodiment of the present invention.
While the invention is amenable to various modifications and alternative forms, specifics thereof have been shown by way of example in the drawings and will be described in detail. It should be understood, however, that the intention is not to limit the invention to the particular embodiments described. On the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the scope of the invention as defined by the appended claims.
The present invention is believed to be applicable to a variety of arrangements and approaches for image data processing. While the present invention is not necessarily limited to such applications, an appreciation of various aspects of the invention is best gained through a discussion of examples in such an environment.
According to an example embodiment of the present invention, target pixels of target pixel blocks are generated using input pixels in input pixel blocks of a video signal, a motion characteristic (e.g., a motion vector) for the input pixel blocks and decision noise that is different for different pixels in each block. In some applications, each target pixel in a particular target pixel block is generated using decision noise that is different than decision noise used to generate other target pixels in the target pixel block. In other applications, some target pixels in a particular target pixel block are generated using different decision noise, while other target pixels in the target pixel block are generated using the same decision noise.
This noise-masking approach is selectively implemented to generate an output video signal (using the target pixel blocks) that promotes the display of images that is visually pleasing under a variety of conditions, such as those involving the up-conversion of the input video signal and others as susceptible to artefacts. This approach is therefore useful, for example, to address issues including those discussed in the background above, to mitigate the effect of artefacts in pixel blocks. In addition, this approach to pixel block processing is adaptive, in that the noise-masking processing can be used selectively, depending upon the nature of the pixels and other conditions relating to the video being processed.
In some applications, a noise factor is used to control the combination of interpolated pixel values that are generated using two or more processing approaches (i.e., algorithms) for up-converting a video stream, with at least one of the approaches using motion compensation to predict the movement of a video image. These interpolated pixel values are mixed, using the noise factor to set the ratio for mixing the values, and the mixed output is provided as a target output pixel to be displayed. In some applications, this mix is a linear mix. In other applications, the mix is a non-linear mix such as those using median filtering. Still other applications involve the use of both linear and non-linear mixing approaches. In the context of the above discussion, the noise factor (i.e., as decision noise) is different for different pixels or different sets of pixels in each target pixel block.
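As a minimal sketch of such mixing, the following Python fragment shows a linear mix and a simple median-based non-linear mix of two interpolated pixel values; the function names, the clipping range and the choice of median fallback are assumptions for illustration rather than a prescribed implementation.

```python
import numpy as np

def linear_mix(mc_value, non_mc_value, beta, eps):
    """Linear mix of two interpolated values: beta is the block-level mix factor,
    eps the per-pixel decision noise; the sum is clipped so the result remains a
    valid convex combination of the two interpolations."""
    weight = np.clip(beta + eps, 0.0, 1.0)
    return weight * mc_value + (1.0 - weight) * non_mc_value

def median_mix(mc_value, non_mc_value, fallback_value):
    """Non-linear alternative: a 3-tap median of the two interpolated values and
    a fallback (e.g. a temporal average), in the spirit of median-filter mixing."""
    return np.median(np.stack([mc_value, non_mc_value, fallback_value]), axis=0)
```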
In some embodiments, the noise-masking approaches described herein are implemented on a selective basis for pixel blocks in a particular video stream, according to artefact conditions (e.g., an artefact threshold) for the pixel block being processed. For example, noise-masking approaches to generating target pixels can be carried out in applications in which artefacts are present in a particular pixel block, or in applications where the strength (or severity) of artefacts in a particular pixel block is relatively high.
In some embodiments, a similar artefact-masking approach is carried out using an estimated likelihood of the presence and/or strength of artefacts, and basing the use of noise-masking approaches for processing a particular pixel block upon the estimated likelihood. Such a likelihood may be determined, for example, based upon the motion vector field local to the pixel block being processed, where a consistent motion vector field can be generally interpreted as presenting a low likelihood of artefacts and an inconsistent motion vector field can be generally interpreted as presenting a high likelihood of artefacts. Another approach to estimating the likelihood of an artefact occurrence involves the detection of occlusion areas, with the likelihood of an artefact being higher in occlusion areas and lower elsewhere.
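A minimal sketch of such a likelihood estimate follows; the 3x3 neighbourhood, the normalisation constant and the occlusion floor are illustrative assumptions rather than a prescribed method. The intent is simply that a large spread among neighbouring block motion vectors (an inconsistent field), or an occluded block, yields a high likelihood.

```python
import numpy as np

def artefact_likelihood(mv_field, block_pos, occlusion_map=None):
    """Estimate how likely a block is to show an up-conversion artefact from the
    consistency of the surrounding block motion vectors: a large spread suggests
    a high likelihood, a small spread a low one; occluded blocks are also treated
    as high-likelihood. mv_field has shape (rows, cols, 2) of per-block vectors."""
    by, bx = block_pos
    nbhd = mv_field[max(by - 1, 0):by + 2, max(bx - 1, 0):bx + 2].reshape(-1, 2)
    spread = float(np.mean(np.linalg.norm(nbhd - nbhd.mean(axis=0), axis=1)))
    likelihood = min(spread / 8.0, 1.0)        # map vector spread (pixels) to [0, 1]
    if occlusion_map is not None and occlusion_map[by, bx]:
        likelihood = max(likelihood, 0.9)      # occlusion areas: artefacts likely
    return likelihood
```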
In addition to the above, a mix of actual and estimated artefact conditions can also be used in accordance with these and/or other example embodiments. This mix may be applied, for example, at different locations in a particular video frame being processed, or at each pixel block, depending upon conditions such as the nature of the image in the video frame or pixel blocks.
In connection with the above and other example embodiments, an artefact threshold may be set and determined using one or more of a variety of approaches, depending upon a variety of conditions such as those relating to the type of video being processed, a video display used to display the processed video, and the video source. This threshold is then used as a determination of whether to process incoming pixel blocks using a noise-masking approach or using another approach. For example, it may be desirable to set an artefact threshold relating to one or both of artefact quantity and size that are known to result in the generation of undesirable video data. When artefact conditions for a particular pixel block being processed meet or exceed such a threshold, noise-masking processing is implemented in processing the pixel block to mask or hide the artefacts, or otherwise to make the resulting video pleasing. In this context, the noise-masking approach can be turned on or off, relative to another approach for generating an output (target) pixel block.
Turning now to the figures, FIG. 1 shows a video processing and display arrangement 100 for processing video data with selective noise masking, according to another example embodiment of the present invention. The arrangement 100 includes a video display 105 and video processing circuitry including an up-converter 110 that generates video data to be displayed on the video display. In some applications, the up-converter 110 is implemented as part of the video display 105, such as in a video data processing chip or circuit operating to generate an output for the display of video images. The up-converter 110 includes interpolation functions A and B at 112 and 114 that interpolate pixel data from an input pixel block 120. A mixer 116 combines the interpolated pixel data to generate an up-converted pixel block 140. The up-converted pixel block 140 can be added to an input video stream or otherwise presented with other pixel blocks for a video stream to provide an up-converted output having a desirably higher frame rate than the input video stream. A variety of such input pixel blocks (120) are thus processed to generate a video image for the display 105.
The mixer 116 uses a mix factor β and selectively uses a noise factor ε in mixing the interpolated pixel data, according to an artefact characteristic 130, such as an estimated or actual artefact condition as described above. For example, as described above, a larger noise factor is used when an artefact characteristic 130 pertaining to the input pixel block 120 indicates that the likelihood of an artefact being present in the pixel data is high. When the artefact characteristic 130 for a particular pixel block indicates that the likelihood of an artefact being present in the pixel data is low, the mixer 116 processes the interpolated pixel data without using a noise factor. In some embodiments, the up-converter 110 determines or estimates artefact conditions to set the artefact characteristic 130. For instance, the up-converter 110 may detect inconsistent motion vectors in neighboring pixels relative to the input pixel block 120, and use the detection of the inconsistent motion vectors as an indicator of an artefact condition that presents a relatively high likelihood of artefact occurrence.
In one embodiment, the mixer 116 mixes data generated using two interpolation functions for estimating video data as follows. For an input block of pixels 120, the target output block (up-converted pixel block 140) is characterized in the following:
Foul (x, n - a) = (β + ε)G(F(x, n - 1), F(x, «)) + (1 - (β + ε))H(F(x, n - 1), F(x, «))
where
F(x,n) is the pixel value at position x = (x,y)τ (T for transpose), n is the picture number, the functions G and H are two methods for interpolation, β is the mix factor (e.g., the decision factor), ε is a noise component (factor) that can be both positive as well as negative, and o < Off + £) < i ,
where β is constant for a complete block, and ε is different for different pixels in a pixel block.
The mix factor β can be maintained as constant for a block of pixels. By adding the noise component ε to the mix factor β for every pixel in the block, and by selectively changing the noise component, different pixels in a particular pixel block are mixed slightly differently. For example, as relevant to the above discussion, each pixel in a pixel block is mixed using a different noise component for certain applications. In other applications, two or more pixels in a pixel block are mixed using the same noise component, with other pixels in the pixel block being mixed using a different noise component. With this approach, blocking artefacts are masked and become less visible.
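By way of illustration, the per-pixel mixing described by the formula above could be sketched in Python as follows; the uniform noise distribution, the noise_amp bound and the clipping to [0, 1] anticipate the noise-generation discussion below and are assumptions for the example, not features of any particular embodiment.

```python
import numpy as np

def up_convert_block(g_block, h_block, beta, noise_amp=0.25, rng=None):
    """Per-pixel mix of two interpolated blocks, following
    Fout = (beta + eps)G + (1 - (beta + eps))H, with beta constant over the
    block and eps drawn independently for every pixel. noise_amp bounds the
    perturbation and could be reduced when local motion vectors are reliable."""
    rng = rng or np.random.default_rng()
    eps = rng.uniform(-noise_amp, noise_amp, size=g_block.shape)
    weight = np.clip(beta + eps, 0.0, 1.0)   # keep 0 <= beta + eps <= 1
    return weight * g_block + (1.0 - weight) * h_block
```

For a block whose artefact condition does not call for masking, the same routine with noise_amp set to zero reduces to the plain block-constant mix.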
The noise component ε is generated using one or more of a variety of approaches. In some embodiments, the noise is generated using a random number. For instance, as related to the above discussion, a random number (e.g., -0.25 to 0.25) can be added to a mix factor value β in [0 ... 1], and the new β can be clipped to [0, 1]. The random addition can be at least partly controlled using the differences and/or reliability of motion vectors in the direct neighborhood of the current pixel block being processed (for which the noise component ε is generated). For instance, if the reliability of one or more local motion vectors is high, a small noise component can be added, and if the reliability is low, larger noise components can be added.
For general information regarding video processing, and for specific information regarding video processing and mixing approaches for up-conversion as may be implemented in connection with FIG. 1 and/or various other example embodiments, reference may be made to A. Ojo and G. de Haan, "Robust motion-compensated video up-conversion," IEEE Trans. on Consumer Electronics, vol. 43, no. 4, pp. 1045-1056, November 1997; reference may also be made to PCT Patent Application No. PCT/IB1998/001241, entitled "Motion Estimation and Motion-Compensated Interpolation" (listing Gerard de Haan as an inventor); both of these references are fully incorporated herein by reference.
FIG. 2 shows a flow diagram for processing video data, according to another example embodiment of the present invention. An input pixel block 200 for an input video stream is processed at block 210 to detect an artefact condition relative to the input pixel block (e.g., as may be relative to the input pixel block and nearby pixel blocks in the video stream). At blocks 220 and 230, different positions for each pixel in the input pixel block are interpolated respectively using first and second interpolation functions. In some applications, one of the interpolation functions uses a motion-compensated interpolation function and the other uses a non-motion-compensated interpolation function. Other applications are directed to the use of other combinations of motion-compensated and non-motion-compensated functions. Motion-compensated interpolation involves, for example, estimating the location of the pixel relative to an image after a particular time period, in consideration of a motion vector or vectors relating to the pixel.
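The following sketch illustrates one possible pair of interpolation functions of the kind just described, usable as G and H in the mixing formula above; the motion-vector sign convention, the whole-frame np.roll shift and the integer rounding are simplifying assumptions for illustration only.

```python
import numpy as np

def mc_interpolate(prev, curr, mv, alpha):
    """Motion-compensated interpolation (a candidate G): fetch along the block
    motion vector mv = (dy, dx), taken as the displacement from picture n-1 to
    picture n, and average. alpha is the fractional distance of the output
    picture back from picture n (output time n - alpha)."""
    dy, dx = mv
    # content at x - (1 - alpha)*mv in the previous picture maps to position x
    from_prev = np.roll(prev, shift=(int(round((1 - alpha) * dy)),
                                     int(round((1 - alpha) * dx))), axis=(0, 1))
    # content at x + alpha*mv in the current picture maps to position x
    from_curr = np.roll(curr, shift=(-int(round(alpha * dy)),
                                     -int(round(alpha * dx))), axis=(0, 1))
    return 0.5 * (from_prev + from_curr)

def non_mc_interpolate(prev, curr, alpha):
    """Non-motion-compensated interpolation (a candidate H): a plain temporal
    blend, weighting each neighbouring picture by its temporal proximity."""
    return alpha * prev + (1.0 - alpha) * curr
```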
If an artefact condition exceeds a threshold at block 240, the input pixel block 200 is processed using a noise factor. At block 250, such a noise factor is generated for each pixel in the input pixel block, with the noise factor for each pixel being different than the noise factor for other pixels in the pixel block. As described above, this approach at block 250 may be carried out with sets of pixels (i.e., where two or more pixels are processed using the same noise factor, and other pixels are processed using a different noise factor or different noise factors). At block 252, the interpolated values for each pixel in the pixel block 200 are mixed using a mixing factor and the noise factor to generate a target pixel for each pixel in the input pixel block. The result of the mixing is output as an up-converted pixel block (i.e., as part of an up-converted video stream from the input video stream) at block 270.
If an artefact condition does not exceed the threshold at block 240, the input pixel block 200 is processed without a noise factor. At block 260, the interpolated values for each pixel in the pixel block 200 are mixed using a mixing factor to generate a target pixel for each pixel in the input pixel block. The result of the mixing is output as an up-converted pixel block (i.e., as part of an up-converted video stream from the input video stream) at block 270.
The steps characterized at blocks 210-270 are carried out for all input pixel blocks in an incoming video stream and used to generate an output that is up-converted to include additional frames. Motion estimation used to interpolate additional frames is carried out using a mixing approach that is sensitive to artefacts, using noise to render the artefacts less detectable by the human eye when displayed on a video display.
FIG. 3 shows an up-conversion circuit 300 for up-converting incoming video data at IN, according to another example embodiment of the present invention. The up-conversion circuit 300 processes incoming pixel blocks to generate up-converted video data from the incoming video data by interpolating a position for each pixel in the pixel blocks.
Interpolators 310 and 320 carry out interpolation functions A and B, respectively, interpolating a pixel (position) for each incoming pixel and generating outputs A and B. A noise factor generator 330 generates a noise factor ε as a function of the pixel being processed, and possibly also as a function of neighboring pixels or pixel blocks. A mixing factor generator 340 generates a mixing factor β as a function of a motion vector (MV).
A multiplier 350 multiplies the output of the interpolator 310 by the sum of the mixing factor β and the noise factor ε. Similarly, a multiplier 360 multiplies the output of the interpolator 320 by one (1) less the sum of the mixing factor β and the noise factor ε. An adder 370 then adds the products generated by multipliers 350 and 360 to provide an output pixel at OUT that is generated with the selective and disparate use of the noise factor ε to provide a target pixel block that mitigates visual artefacts. In this regard, the output can generally be considered as OUT = ((β + ε) x A) + ((1 - (β + ε)) x B). In some applications, the up-conversion circuit 300 is operated so that the noise factor ε is used primarily when a certain artefact condition (e.g., threshold) is met. For instance, when neighboring motion vectors are consistent and artefacts are unlikely, the noise factor generator 330 may generate a noise factor ε that is small or about zero. In this regard, the output may be considered as OUT = (β x A) + ((1 - β) x B).
The display approaches and embodiments described herein are amenable to use with a multitude of different types of display systems and arrangements, and can be arranged and/or programmed into a variety of different circuits and controllers. For example, certain embodiments involve processing approaches that are carried out in a video processing circuit pipeline for video or television (TV) systems. One such embodiment involves the implementation of one or more of the above up-conversion approaches with a backend video scaler integrated circuit, such as those used on the signal board of an LCD display or television. These applications are implemented using artefact-masking processing of video data to be displayed in a manner that mitigates undesirable display characteristics, such as those described in the background above.
In addition to the above, the various processing approaches described herein can be implemented using a variety of devices and methods including general purpose processors implementing specialized software, digital signal processors, programmable logic arrays, discrete logic components and fully-programmable and semi-programmable circuits such as PLAs (programmable logic arrays). For instance, the up-converter 110 in FIG. 1 can be implemented using one or more of these approaches and in one embodiment, is a microcomputer (e.g., microprocessor) programmed to carry out the described functions.
The various embodiments described above and shown in the figures are provided by way of illustration only and should not be construed to limit the invention. Based on the above discussion and illustrations, those skilled in the art will readily recognize that various modifications and changes may be made to the present invention without strictly following the exemplary embodiments and applications illustrated and described herein. For example, various image data processing approaches may be amenable to use with various display types, relating to projection displays, flat-panel displays, LCD displays (including those described) involving flat-panel or projection display approaches, handheld device displays, and other digital light processing display approaches. Such modifications and changes do not depart from the true scope of the present invention that is set forth in the following claims.

Claims

What is claimed is:
1. A video processing arrangement for up-converting input video data having pixel blocks, the arrangement comprising: an up-conversion circuit (e.g., 110) to: in response to an artefact condition (e.g., 130) that exceeds an artefact threshold for an input pixel block, generate a target pixel block from the input pixel block using a motion characteristic of the input pixel block and a noise component that is different for different pixels in the target pixel block (e.g., 112, 114, 116); in response to an artefact condition that does not exceed the artefact threshold, generate a target pixel block from the input pixel block using a motion characteristic of the pixel block (e.g., 112, 114, 116); and output up-converted video data that includes the target pixel block generated for each input pixel block (e.g., 140).
2. The arrangement of claim 1, wherein the up-conversion circuit generates a target pixel block from the input pixel block using a motion characteristic of the input pixel block and a noise component that is different for different pixels in the target pixel block by using a noise component that is different for each pixel in the target pixel block.
3. The arrangement of claim 1, wherein the up-conversion circuit generates a target pixel block from the input pixel block using a motion characteristic of the input pixel block and a noise component that is different for different pixels in the target pixel block by generating interpolation values for each pixel in the target pixel block using a motion characteristic of the input pixel block, and generating each pixel in the target pixel block by mixing the generated interpolation values for each pixel using the noise component for that pixel.
4. The arrangement of claim 1, wherein the up-conversion circuit generates a target pixel block from the input pixel block using a motion characteristic of the input pixel block and a noise component that is different for different pixels in the target pixel block by generating interpolation values for each pixel in the target pixel block using a motion characteristic of the input pixel block, and generating each pixel in the target pixel block by linearly combining the generated interpolation values for each pixel using the noise component for that pixel.
5. The arrangement of claim 1, wherein the up-conversion circuit generates a target pixel block from the input pixel block using a motion characteristic of the input pixel block and a noise component that is different for different pixels in the target pixel block by generating interpolation values for each pixel in the target pixel block using a motion characteristic of the input pixel block, and generating each pixel in the target pixel block by non-linearly combining the generated interpolation values for each pixel using the noise component for that pixel.
6. The arrangement of claim 1, wherein the up-conversion circuit generates interpolation values for each pixel in the target pixel block using a motion characteristic of the input pixel block, sets a mixing factor value as a function of the interpolation value generation, and generates a target pixel block from the input pixel block by mixing the generated interpolation values for each pixel using the mixing factor.
7. The arrangement of claim 1, wherein the up-conversion circuit generates interpolation values for each pixel in the target pixel block using a motion characteristic of the input pixel block, sets a mixing factor value as a function of the interpolation value generation, and generates a target pixel block from the input pixel block by linearly mixing the generated interpolation values for each pixel using the mixing factor.
8. The arrangement of claim 1, wherein the up-conversion circuit generates interpolation values for each pixel in the target pixel block using a motion characteristic of the input pixel block, sets a mixing factor value as a function of the interpolation value generation, and generates a target pixel block from the input pixel block by non-linearly mixing the generated interpolation values for each pixel using the mixing factor.
9. The arrangement of claim 1, wherein the artefact threshold is set as a function of the strength of an artefact in the pixel.
10. The arrangement of claim 1, wherein the artefact threshold is set as a function of the likelihood that the pixel includes an artefact.
11. The arrangement of claim 1, wherein the artefact threshold is set as a function of a likelihood that the pixel includes an artefact as determined in response to the consistency of a local motion vector field for a particular pixel.
12. The arrangement of claim 1, wherein the artefact threshold is set as a function of a likelihood that the pixel includes an artefact as determined in response to the presence of an occlusion area for a particular pixel.
13. The arrangement of claim 1, wherein the up-conversion circuit generates the noise component for a particular pixel using a random number.
14. The arrangement of claim 1, wherein the up-conversion circuit generates the noise component for a particular pixel as a function of a motion vector near the pixel block including the particular pixel by: in response to the motion vector having a high reliability, generating a relatively small noise component; and in response to the motion vector having a low reliability, generating a relatively larger noise component.
15. The arrangement of claim 1, further including a video display to display the output up-converted video data.
16. The arrangement of claim 1, further including a video display device to display the output up-converted video data, and wherein the video display device includes the up-conversion circuit.
17. A method for up-converting input video data having pixel blocks, the method comprising: for each input pixel block exhibiting an artefact condition that exceeds an artefact threshold, generating a target pixel block from the input pixel block using a motion characteristic of the input pixel block and a noise component that is different for different pixels in the target pixel block (e.g., 220, 230, 250, 252); for each input pixel block that does not exhibit an artefact condition that exceeds the artefact threshold, generating a target pixel block from the input pixel block using a motion characteristic of the pixel block (e.g., 220, 230, 260); and outputting up-converted video data that includes the target pixel block generated for each input pixel block (e.g., 270).
18. The method of claim 17, wherein generating a target pixel block from the input pixel block using a motion characteristic of the input pixel block and a noise component that is different for different pixels in the target pixel block includes generating interpolation values for each pixel in the target pixel block using a motion characteristic of the input pixel block, and generating each pixel in the target pixel block by mixing the generated interpolation values for each pixel using the noise component for that pixel.
19. The method of claim 17, further including determining an amount of expected block artefacts for different image portions of the video data, wherein generating a target pixel block from the input pixel block using a motion characteristic of the input pixel block and a noise component that is different for different pixels in the target pixel block includes generating a target pixel block for each input pixel block for which a determined amount of expected block artefacts exceeds a threshold amount of block artefacts.
20. The method of claim 17, further including determining a severity of expected block artefacts for different image portions of the video data, wherein generating a target pixel block from the input pixel block using a motion characteristic of the input pixel block and a noise component that is different for different pixels in the target pixel block includes generating a target pixel block for each input pixel block for which a determined severity of expected block artefacts exceeds a threshold severity of block artefacts.
PCT/IB2009/050912 2008-03-05 2009-03-05 Arrangement and approach for video data up-conversion WO2009109936A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US3389308P 2008-03-05 2008-03-05
US61/033,893 2008-03-05

Publications (1)

Publication Number Publication Date
WO2009109936A1 (en)

Family

ID=40823332

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2009/050912 WO2009109936A1 (en) 2008-03-05 2009-03-05 Arrangement and approach for video data up-conversion

Country Status (1)

Country Link
WO (1) WO2009109936A1 (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040252764A1 (en) * 2003-06-16 2004-12-16 Hur Bong-Soo Motion vector generation apparatus and method
US20040252895A1 (en) * 2003-06-16 2004-12-16 Hur Bong-Soo Pixel-data selection device to provide motion compensation, and a method thereof
WO2005032142A1 (en) * 2003-09-23 2005-04-07 Thomson Licensing S.A. Video comfort noise addition technique
EP1784023A2 (en) * 2005-11-03 2007-05-09 Tandberg Television ASA Blocking artefacts reduction with optional dithering

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
BLUME H: "NEW ALGORITHM FOR NONLINEAR VECTOR-BASED UPCONVERSION WITH CENTER WEIGHTED MEDIANS", 1 July 1997, JOURNAL OF ELECTRONIC IMAGING, SPIE / IS & T, PAGE(S) 368 - 378, ISSN: 1017-9909, XP000704802 *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2701386A1 (en) * 2012-08-21 2014-02-26 MediaTek, Inc Video processing apparatus and method

Similar Documents

Publication Publication Date Title
KR100393066B1 Apparatus and method for adaptive motion compensated de-interlacing video data using adaptive compensated interpolation and method thereof
JP4997281B2 (en) Method for determining estimated motion vector in image, computer program, and display device
US8396129B2 (en) Apparatus and method for single-pass, gradient-based motion compensated image rate conversion
EP1387577A2 (en) Deinterlacing apparatus and method
US20080151106A1 (en) Reducing Artefacts In Scan-Rate Conversion Of Image Signals By Combining Interpolation And Extrapolation Of Images
TW200945904A (en) Image interpolation with halo reduction
US20120093231A1 (en) Image processing apparatus and image processing method
EP2057840B1 (en) Motion-compensated processing of image signals
US20080174694A1 (en) Method and apparatus for video pixel interpolation
US8406305B2 (en) Method and system for creating an interpolated image using up-conversion vector with uncovering-covering detection
US20060045365A1 (en) Image processing unit with fall-back
KR20060135742A (en) Motion compensated de-interlacing with film mode adaptation
JP5490236B2 (en) Image processing apparatus and method, image display apparatus and method
JP4355577B2 (en) Up-conversion by noise suppression diagonal direction correction
TWI471010B (en) A motion compensation deinterlacing image processing apparatus and method thereof
WO2009109936A1 (en) Arrangement and approach for video data up-conversion
US8233085B1 (en) Method and system for interpolating a pixel value of a pixel located at an on-screen display
EP1356670A1 (en) Motion compensated de-interlacing in video signal processing
KR100594780B1 (en) Apparatus for converting image and method the same
US10587840B1 (en) Image processing method capable of deinterlacing the interlacing fields
JP2011023843A (en) Image processing apparatus and method
JPH08186802A (en) Interpolation picture element generating method for interlace scanning image
CN116016831B (en) Low-time-delay image de-interlacing method and device
KR101174589B1 (en) Methods of deinterlacing based on local complexity and image processing devices using the same
KR100710236B1 (en) Apparatus and method for pixel interpolation

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 09717950

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 09717950

Country of ref document: EP

Kind code of ref document: A1