EP2705494A1

EP2705494A1 - Method for scaling video data, and an arrangement for carrying out the method

Info

Publication number: EP2705494A1
Application number: EP12721212.4A
Authority: EP
Inventors: Alexander NEUBECK
Original assignee: Robert Bosch GmbH
Current assignee: Robert Bosch GmbH
Priority date: 2011-05-04
Filing date: 2012-04-30
Publication date: 2014-03-12
Also published as: DE102011075261A1; WO2012150203A1; US20140055498A1

Abstract

The invention presents a method for editing video data (60), a method for presenting video data (60) and an arrangement for carrying out the method for editing video data (60). The method for editing video data (60) involves the video data (60) being smoothed and sampled in a CPU (50) for prescaling and then transmitted to a graphics card having a graphics processor (52), in which the prescaled video data (64) are subjected to an edge sharpening operation.

Description

METHOD FOR SCALING VIDEO DATA, AND AN ARRANGEMENT FOR CARRYING OUT THE METHOD

The invention relates to a method for processing video data, to a method for displaying video data and to an arrangement for carrying out the method for processing video data.

State of the art

Graphics cards are used to control the screen display in computer systems. Usually, when a program is executed, the processor (CPU) of the computer calculates the data and forwards it to the graphics card. The graphics card converts the data so that the monitor can render it as an image.

Usually, video data is either pre-scaled completely on the central processing unit (CPU) and then transferred to the graphics card, or the complete video image is transferred to the graphics card where it is scaled to the target resolution.

In video surveillance systems, many video sequences must be displayed in parallel on a monitor. If, for example, 25 HD videos (resolution e.g. 1280 x 720) are to be displayed at 60 Hz, then 1500 textures per second must be transferred to the graphics card, which corresponds to a transfer rate of 2 GB/s. To be able to display the 25 HD videos in full resolution, the monitor would need a total resolution of 6400 x 3600.

However, even high-resolution monitors usually have a much lower maximum resolution of, for example, 1920 x 1200 pixels. With the classic approach of scaling the video material on the CPU, the transfer rate would be 207 MB/s. However, this approach has the disadvantage that the CPU, which is already busy with decoding and other tasks, also has to perform the video scaling, and high-quality image scaling by an arbitrary scaling factor is a complex operation for the CPU.
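The transfer rates quoted above can be reproduced with a short calculation. The following is a minimal sketch that assumes 1.5 bytes per pixel (as in a YUV 4:2:0 frame); the text does not state the pixel format, and this value is chosen only because it matches the quoted figures. The function name is illustrative.

```python
def transfer_rate_bytes_per_s(width, height, streams, fps, bytes_per_pixel=1.5):
    """Bytes per second needed to upload `streams` videos at `fps` frames each.

    bytes_per_pixel=1.5 is an assumption (YUV 4:2:0), not stated in the text.
    """
    return width * height * bytes_per_pixel * streams * fps

# 25 HD streams uploaded unscaled: 25 * 60 = 1500 textures per second.
full = transfer_rate_bytes_per_s(1280, 720, streams=25, fps=60)

# Everything scaled on the CPU to one 1920 x 1200 monitor image per refresh.
scaled = transfer_rate_bytes_per_s(1920, 1200, streams=1, fps=60)

print(f"unscaled upload:   {full / 1e9:.2f} GB/s")
print(f"CPU-scaled upload: {scaled / 1e6:.0f} MB/s")
```

Under this assumption the two results come out at roughly 2 GB/s and 207 MB/s, in line with the figures given in the text.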

The document DE 10 2005 046 664 A1 describes a method for generating a flexible display area for a video surveillance system. The display area comprises a main window into which a number of information windows can be overlaid. Switching by an operator is done by selecting and resizing an information window. With the described method, a man-machine interface is realized that offers a clear presentation together with good adaptability to the respective application. Video information is prepared, arranged and displayed in such a way that optimal transfer of the information to the human operator is possible.

When displaying the video data, the so-called scaling problem has to be considered. In digital image editing, scaling refers to resizing a digital image, distinguishing between raster graphics and vector graphics. The scaling of raster graphics is a sample rate conversion, namely the conversion of a discrete signal of one sample rate into a discrete signal of another sample rate.

Disclosure of the invention

Against this background, a method for editing video data with the features of claim 1, a method for displaying video data according to claim 5 and an arrangement according to claim 6 are presented. Embodiments result from the dependent claims and the description.

With the presented method, it is thus possible to distribute the load as desired between CPU and graphics card so that the available resources can be used optimally. The presented method is based on decomposing the general scaling problem into two steps: a smoothing operation including sampling, and a subsequent edge sharpening operation. The former can be implemented particularly efficiently on the CPU by means of SSE instructions (SSE: Streaming SIMD Extensions) when scaling by a power of 2 or at least by an integer factor. In the above example, the image material is ideally pre-scaled by the CPU by a factor of 2 in both directions, i.e., smoothed and sampled accordingly. This reduces the amount of data to be transferred by a factor of 4. After that, the graphics card performs the edge sharpening and, typically, scales the resulting image to the final target resolution by the remaining factor of 5/3.
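The split of the overall scaling factor into a CPU part and a graphics card part can be illustrated as follows. This is only a sketch: the rule of taking the largest power of two that does not exceed the total factor is an assumption made for illustration, and `split_scale` is a hypothetical helper, not part of the described method.

```python
from fractions import Fraction

def split_scale(source_width, target_width):
    """Split a downscale factor into a power-of-two CPU prescale and a GPU remainder.

    Assumption: the CPU takes the largest power of two not exceeding the total factor.
    """
    total = Fraction(source_width, target_width)
    prescale = 1
    while prescale * 2 <= total:
        prescale *= 2
    return prescale, total / prescale

# 5 x 5 grid of 1280-pixel-wide videos shown on a 1920-pixel-wide monitor.
cpu, gpu = split_scale(5 * 1280, 1920)
print(cpu, gpu, cpu * cpu)   # 2 5/3 4
```

The last printed value is the reduction in transferred data, since the prescale is applied in both directions.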

In the mathematical context, a smoothing operation is understood to be an operation by which a curve is converted into a curve of lesser curvature. This curve of lesser curvature should deviate as little as possible from the original curve.

Since the decoded image usually has to be kept in main memory as a reference image, the image must be copied into a special texture memory for transfer. Ideally, the image smoothing including sampling is performed in place of the simple copy operation.

It should be noted that modern compression standards reference other frames when coding a frame. These frames must be efficiently accessible by the decoder and, when the decoder is executed on the CPU, reside in the main memory of the computer. In order for the graphics card driver to copy the frame efficiently from main memory to the graphics card memory using DMA, the frame must be located in a special non-pageable, i.e. pinned, memory area.

An application should use pinned memory very sparingly because it cannot be swapped out by the operating system to make room for other applications or for the operating system itself. Thus, to serve both areas, the decoder and the texture upload area, a copy operation from normal memory to a pinned memory area is necessary. The data is then copied from the pinned memory area via DMA transfer to the memory of the graphics card, where it can be accessed for further calculations and for display. This is referred to herein as a "simple" copy operation since no transformation takes place other than copying. However, this copy operation is carried out by the CPU, i.e., the processor reads a small data area into its registers and writes its contents to another address. The processor is thus not heavily loaded and can also calculate simple transformations, such as smoothing and subsampling the image.
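The idea of replacing the plain copy by a copy that smooths and subsamples on the way can be sketched as below. The 2 x 2 averaging (box) filter is an assumption chosen for brevity, since the text does not fix the smoothing kernel at this point, and the `bytearray` merely stands in for the pinned upload buffer, which in practice is provided by the graphics driver.

```python
def copy_with_prescale(src, width, height, dst):
    """Copy a grayscale frame into dst while averaging each 2 x 2 block.

    src: bytes-like, row-major, width * height samples (width and height even).
    dst: writable buffer of (width // 2) * (height // 2) bytes; a stand-in
         for the pinned upload buffer (assumption for illustration).
    """
    out_w = width // 2
    for oy in range(height // 2):
        top = 2 * oy * width          # start of the upper source row
        bottom = top + width          # start of the lower source row
        for ox in range(out_w):
            x = 2 * ox
            total = (src[top + x] + src[top + x + 1]
                     + src[bottom + x] + src[bottom + x + 1])
            dst[oy * out_w + ox] = (total + 2) // 4   # rounded mean of the block
    return dst

frame = bytes([10, 20, 30, 40,
               10, 20, 30, 40,
               0, 0, 200, 200,
               0, 0, 200, 200])
staging = bytearray(4)
copy_with_prescale(frame, 4, 4, staging)
print(list(staging))   # [15, 35, 0, 200]
```

Each source sample is read once and a quarter as many samples are written, so the loop has the same access pattern as the plain copy it replaces.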

With appropriate optimization and selection of the scaling method, the cost of the additional operations is negligible, so that the conversion creates no further load for the CPU. The subsequent texture transfer relieves the graphics card and the memory bus in the above example by a factor of 4, i.e., only 500 MB/s are transmitted.

The remaining edge sharpening and rescaling can be parallelized very well line by line or column by column, which makes them particularly suitable for processing on modern graphics cards using CUDA (Compute Unified Device Architecture) or Compute Shaders. CUDA is a technique by which program parts can be developed that are processed by the graphics processor (GPU) on the graphics card.
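The line-by-line independence mentioned above can be illustrated on the CPU with a thread pool, where each row is handed to a worker much as each row could be handed to a GPU thread. This is a sketch only: the 3-tap kernel is a placeholder sharpening filter chosen for illustration, and Python threads demonstrate the independence of the rows, not GPU performance.

```python
from concurrent.futures import ThreadPoolExecutor

def sharpen_row(row):
    """Apply a placeholder 3-tap sharpening kernel [-1/4, 3/2, -1/4], mirrored borders."""
    n = len(row)
    out = []
    for i in range(n):
        left = row[i - 1] if i > 0 else row[min(1, n - 1)]
        right = row[i + 1] if i < n - 1 else row[max(n - 2, 0)]
        out.append(1.5 * row[i] - 0.25 * (left + right))
    return out

def sharpen_rows(image, workers=4):
    """Sharpen every row independently; rows share no state, so any order works."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(sharpen_row, image))

image = [[10.0, 10.0, 10.0, 10.0],
         [0.0, 0.0, 100.0, 100.0]]
print(sharpen_rows(image))
```

A flat row is left unchanged because the kernel weights sum to one, while the step in the second row is overshot on both sides, which is the visible effect of sharpening.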

In this case, the total load compared to full scaling on the graphics card has dropped despite the additional edge sharpening, since the amount of data has already been reduced by a factor of 4 by the CPU. The advantage becomes even clearer with larger scaling factors, for example with Full HD (1920 x 1080) sources, since the first scaling step on the CPU then already enables a data reduction by a factor of 16 without generating a significant load on the CPU.

Further advantages and embodiments of the invention will become apparent from the description and the accompanying drawings. It is understood that the features mentioned above and those yet to be explained below can be used not only in the particular combination indicated, but also in other combinations or in isolation, without departing from the scope of the present invention.

Brief description of the drawings

FIG. 1 shows an embodiment of the described arrangement.

FIG. 2 shows an embodiment of the presented method.

Embodiments of the invention

The invention is schematically illustrated by means of embodiments in the drawings and will be described in detail below with reference to the drawings.

FIG. 1 shows a schematic representation of an embodiment of the presented arrangement, which is designated overall by the reference numeral 10 and is integrated in a computer system or computer 12. This arrangement 10 is connected to a monitor 14 on which video data is to be displayed.

The arrangement 10 comprises a CPU 20, a main memory 22, a graphics card 24 and a texture memory 26. The graphics card 24 in turn contains a graphics processor 30 and a graphics memory 32. In one embodiment of the arrangement 10, the graphics memory 32 serves as the texture memory 26.

FIG. 2 shows the presented method based on the conversion of the video data intended for display. The illustration shows a CPU 50, a graphics processor 52 and a monitor 54 for this purpose.

Initially, the video data 60 to be displayed is present in the CPU 50. In a first step 62, the video data 60 is smoothed and sampled so that pre-scaled video data 64 is present. This pre-scaled video data 64 is transferred to the graphics processor 52 in a further step 66 by texture transfer. In the graphics processor 52, the pre-scaled video data 64 is subjected to an edge sharpening operation in a further step 68 so that edge-sharpened pre-scaled video data 70 is present. The edge-sharpened pre-scaled video data 70 is scaled to the final resolution in a further step 72, yielding finally scaled video data 74. This finally scaled video data 74 is transmitted to the monitor 54 for display in a final step 76.

It was recognized that the general scaling problem corresponds to a change of basis, i.e., the original image, that is, the original video data 60, is projected into the target space with as little loss as possible. When working with discrete input signals, "losslessness" can be measured as a reproduction error, i.e., the original pixels are reproduced from the scaled image by interpolation and the deviation from the original pixels is measured. In the continuous case, a measure is defined which directly compares the two continuous signals, e.g.:

$$\langle f, g \rangle = \int f(x)\, g(x)\, dx$$

For the sake of simplicity, the method will be described below with reference to the discrete variant. However, the presented method can be applied equally to the continuous case.

In many cases, due to the complexity, the scaling task is replaced by a simple interpolation task, and the resulting massive aliasing or blurring is accepted in favor of performance. The described method shows how the correct scaling can be performed without compromising performance.

Mathematically, the discrete scaling problem can be formulated as follows:

$$\|A y - x\|^2 \to \min$$

where x represents the original image, y the scaled image and A an interpolation matrix that maps the scaled image back to the original image. This scaling problem has the solution:

$$y = (A^T A)^{-1} A^T x$$

In the above formulation, $A^T$ corresponds to the smoothing operation with sampling and $(A^T A)^{-1}$ to the edge sharpening.
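The least-squares formulation can be checked on a small one-dimensional example. The sketch below assumes linear interpolation by a factor of 2 as the interpolation matrix A (the text does not prescribe a particular A) and solves the normal equations with a plain Gaussian elimination; all function names are illustrative.

```python
def interpolation_matrix(n):
    """(2n-1) x n matrix A: linear interpolation from n coarse to 2n-1 fine samples."""
    rows = []
    for k in range(2 * n - 1):
        row = [0.0] * n
        if k % 2 == 0:
            row[k // 2] = 1.0            # fine sample coincides with a coarse one
        else:
            row[k // 2] = 0.5            # fine sample halfway between two coarse ones
            row[k // 2 + 1] = 0.5
        rows.append(row)
    return rows

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)] for row in a]

def solve(m, rhs):
    """Solve m * y = rhs by Gaussian elimination with partial pivoting."""
    n = len(m)
    aug = [row[:] + [r] for row, r in zip(m, rhs)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    y = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][k] * y[k] for k in range(r + 1, n))
        y[r] = (aug[r][n] - s) / aug[r][r]
    return y

def scale_down(x):
    """Return y = (A^T A)^-1 A^T x for an input of odd length."""
    n = (len(x) + 1) // 2
    A = interpolation_matrix(n)
    At = transpose(A)
    smoothed = [row[0] for row in matmul(At, [[v] for v in x])]   # A^T x
    return solve(matmul(At, A), smoothed)                          # sharpening step

x = [0.0, 1.0, 2.0, 3.0, 4.0, 3.0, 2.0]   # equals A * [0, 2, 4, 2] exactly
print([round(v, 6) for v in scale_down(x)])
```

Because the chosen input lies exactly in the range of A, the reproduction error is zero and the coarse signal is recovered up to rounding.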

For scaling by an integer factor and suitable boundary conditions, e.g. reflection at the image borders, A corresponds to a convolution matrix, so that its application can be calculated by means of a simple convolution. Thus, $A^T$ and $(A^T A)^{-1}$ are also convolution matrices. The latter can be calculated with simple recursive filters obtained by z-transformation or matrix decomposition.
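How the two operators reduce to a convolution and a pair of recursions can be sketched for one concrete case. The example assumes linear interpolation by a factor of 2 with truncated (not reflected) borders, which keeps the code short; under that assumption the smoothing kernel is [1/2, 1, 1/2] and the matrix to be inverted is tridiagonal, so it can be solved with one forward and one backward recursion (the Thomas algorithm, a matrix decomposition). The function names are illustrative.

```python
def smooth_and_sample(x):
    """A^T x: convolve with [1/2, 1, 1/2] and keep every second sample."""
    n = (len(x) + 1) // 2
    out = []
    for i in range(n):
        k = 2 * i
        left = x[k - 1] if k > 0 else 0.0
        right = x[k + 1] if k + 1 < len(x) else 0.0
        out.append(0.5 * left + x[k] + 0.5 * right)
    return out

def sharpen(b):
    """Solve (A^T A) y = b with two first-order recursions; needs len(b) >= 2.

    Here A^T A is tridiagonal: 1/4 off the diagonal, 3/2 on it (5/4 at both ends).
    """
    n = len(b)
    diag = [1.5] * n
    diag[0] = diag[-1] = 1.25
    off = 0.25
    c = [0.0] * n          # modified upper diagonal
    d = [0.0] * n          # modified right-hand side
    c[0] = off / diag[0]
    d[0] = b[0] / diag[0]
    for i in range(1, n):                      # forward (causal) recursion
        denom = diag[i] - off * c[i - 1]
        c[i] = off / denom
        d[i] = (b[i] - off * d[i - 1]) / denom
    y = [0.0] * n
    y[-1] = d[-1]
    for i in range(n - 2, -1, -1):             # backward (anti-causal) recursion
        y[i] = d[i] - c[i] * y[i + 1]
    return y

fine = [0.0, 1.0, 2.0, 3.0, 4.0, 3.0, 2.0]     # linear interpolation of [0, 2, 4, 2]
prescaled = smooth_and_sample(fine)             # the step suited to the CPU
coarse = sharpen(prescaled)                     # the step suited to the graphics card
print(prescaled)
print([round(v, 6) for v in coarse])
```

Both functions touch each sample a constant number of times, so the cost grows linearly with the signal length and no matrix is ever formed explicitly.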

Since image scaling in the x- and y-directions is independent, the corresponding operations can be carried out in any order. It is therefore advisable to perform first the operations that reduce the amount of data the most, i.e., in the case of image scaling, first perform the smoothing with sampling in the x and y directions before starting the edge sharpening. In combination with a graphics card, this means that the CPU handles the simple convolution with the corresponding data reduction in both the x and y directions, while the graphics card performs the post-processing needed to arrive at the final scaled image. This scheme automatically minimizes data transfer between main memory and graphics card.
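The independence of the two directions can be demonstrated with a small separable example. The sketch assumes a 2-tap averaging filter as the one-dimensional smoothing-with-sampling step, chosen only for brevity; the helper names are illustrative.

```python
def smooth_and_sample_1d(x):
    """Average neighbouring pairs: a 2-tap box filter followed by sampling by 2."""
    return [(x[i] + x[i + 1]) / 2 for i in range(0, len(x) - 1, 2)]

def along_rows(image, op):
    """Apply a 1-D operation to every row (x-direction)."""
    return [op(row) for row in image]

def along_columns(image, op):
    """Apply a 1-D operation to every column (y-direction)."""
    columns = [op(list(col)) for col in zip(*image)]
    return [list(row) for row in zip(*columns)]

image = [[float(4 * r + c) for c in range(4)] for r in range(4)]

x_then_y = along_columns(along_rows(image, smooth_and_sample_1d), smooth_and_sample_1d)
y_then_x = along_rows(along_columns(image, smooth_and_sample_1d), smooth_and_sample_1d)

print(x_then_y == y_then_x)   # the order of the two directions does not matter
print(len(image) * len(image[0]), "->", len(x_then_y) * len(x_then_y[0]))
```

Performing both reducing passes first leaves a quarter of the samples for every later stage, which is why the text places them on the CPU ahead of the transfer.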

Claims

1. Method for processing video data (60) to be displayed on a monitor (14, 54), in which the video data (60) are smoothed and sampled in a CPU (20, 50) for prescaling and subsequently transmitted to a graphics card (24) with a graphics processor (30, 52), in which the pre-scaled video data (64) are subjected to an edge sharpening operation.

2. Method according to claim 1, in which the pre-scaled video data (64) are transmitted by texture transfer.

3. Method according to claim 1 or 2, in which a final scaling is performed in the graphics processor (30, 52) in addition to the edge sharpening operation.

4. Method according to claim 3, in which the edge sharpening operation and the final scaling are parallelized line by line or column by column.

5. Method for displaying video data (60) on a monitor (14, 54), in which the video data (60) are processed prior to display by a method according to any one of claims 1 to 4.

6. Arrangement for processing video data (60), which has a CPU (20, 50) and a graphics card (24) with a graphics processor (30, 52), wherein the CPU (20, 50) is designed to prescale the video data (60) by smoothing and sampling the video data (60), and the graphics processor (30, 52) is designed to perform an edge sharpening operation on the pre-scaled video data (64).

7. Arrangement according to claim 6, wherein a texture memory (26) is provided for the transmission of the pre-scaled video data (64).

8. Arrangement according to claim 7, wherein the texture memory (26) is a graphics memory (32) of the graphics card (24).

9. Arrangement according to one of claims 6 to 8, wherein an SSE instruction set is stored in the CPU (20, 50).

10. Arrangement according to one of claims 6 to 9, wherein the CPU (20, 50) and the graphics card (24) are integrated in a computer system (12).