WO2002043387A2

WO2002043387A2 - Devices and method for processing images

Info

Publication number: WO2002043387A2
Application number: PCT/EP2001/013761
Authority: WO
Inventors: Wes Bell
Original assignee: Voxar Ag
Priority date: 2000-11-24
Filing date: 2001-11-26
Publication date: 2002-05-30
Also published as: AU2002221896A1; WO2002043387A3

Abstract

The invention relates to an image processing device, comprising user image data input devices for inputting current user image data, image data editing devices for generating edited user image data from the current user image data and image data outputting devices for outputting user image data. The image data editing devices contain processing devices which are configured to determine the largest moving object and identify this as the user or available parts of the user, and localise and track the same. An inventive image processing method comprising the acquisition, editing and outputting of current user image data also provides that the largest moving object is determined and is identified as the user or available parts of the user and localised and tracked.

Description

Image processing equipment and processes

The invention relates to image processing devices and methods. In particular, the invention finds application in the field of so-called videophones or video telephones.

Video communication devices, video communication systems and video communication methods that offer, in addition to the auditory channel, visual media or channels for transmitting sound and image information are known, but have not yet become widely used in the general population. A major disadvantage of the prior art is that the associated transmission of image information to at least one other communication participant often interferes with the privacy of the user. Depending on who initiates the communication contact, the user and/or his communication partner may or may not wish to transmit certain visual information. The communication participants would prefer to transmit an optimal “desired appearance” tailored to the respective communication partner. This includes not only a suitable background, but also suitable clothing and an otherwise advantageous appearance.


The invention presupposes the existence of audiovisual communication media. General features of audiovisual communication media are a microphone and loudspeaker, a video camera and screen, a control unit, an outgoing processing unit for processing audio and video signals, an incoming processing unit for processing audio and video signals, and a compression unit for optimal use of the available line bandwidth, e.g. via analog and digital telephone networks, packet-switched communication via the Internet, internal computer networks, etc.

In particular, the invention relates to image processing devices, for example in telecommunications or video communication devices, with user image data input devices for inputting current user image data, image data editing devices for generating edited user image data from the current user image data, and image data output devices for outputting user image data to, for example, at least one further communication subscriber.

The present invention is also based on an image processing method for, e.g., telecommunication or video communication, in which at least one communication subscriber, or a user in general, is identified by means of identification devices; current user image data are entered via user image data input devices; an edit selection controller routes the current user image data to image data editing devices depending on the identification result of the identification devices; the image data editing devices, when they have received the current user image data, generate edited user image data from them; and finally either the unedited current user image data or, if present, the edited user image data are output by means of image data output devices. The invention is furthermore based on an appropriately equipped or functioning telecommunications or video communication system.

Such devices, methods and systems are disclosed in the international patent application PCT / DE 00/00442 with the filing date February 16, 2000 and the priority dates February 16 and October 8, 1999 by the same applicant. The full disclosure content of this application is hereby incorporated in its entirety in the current documents to avoid merely repetitive reproduction by means of the present reference. In particular, this reference applies to editing options and controls.

The basic principle of this technology is that the image taken by a user is broken down into, for example, three levels, the face or head level, the body level and the background level, which can each be edited individually. The device and method configurations and features of the present invention are also based on this.

The devices, methods and systems, in particular according to PCT / DE 00/00442, are intended to be further improved with the present invention.

Insofar as reference is made below to telecommunications or video communication devices or to telecommunications or video communication methods, this is done only by way of example and not by way of limitation.

To this end, the present invention provides an image processing device with user image data input devices for entering current user image data, image data editing devices for generating edited user image data from the current user image data, and image data output devices for outputting user image data, the image data editing devices containing processing devices that are designed to determine the largest moving object and to identify, locate and track it as the user or as available parts of the user.

In an image processing device according to the invention, it is also preferably provided that the processing devices segment the largest moving object in order to feed the received data to further processing.

The object of the invention is also achieved by means of an image processing method in which current user image data are acquired, edited and output, the largest moving object being determined and identified and located and tracked as the user or available parts of the user.

In an image processing method according to the invention, it is preferably further provided that the largest moving object is segmented and that its data are fed to further processing.

Further advantageous and preferred embodiments of the invention result from the entirety of the present documents including the disclosure content of PCT / DE 00/00442 as well as the state of the art and the professional knowledge.

From the content of the disclosure of the present documents, further aspects of the invention that can be protected and are worthy of protection are obtained on their own, even without a combination with the aspects of the invention formulated in the claims and explained above. In this respect, the invention relates to any configurations of image processing devices and methods determined by the features and combinations of features disclosed in the present documents, which can be made both the subject of the present application and the subject of divisional applications.

The invention is explained in more detail below merely by way of example using an exemplary embodiment; the invention is not, however, restricted to this embodiment or to its features and combinations of features. The present invention can be regarded as an improved blue screen technology that is particularly suitable for, but not limited to, video conferencing applications or videophony.

Segmentation of actors in film and video sequences has been an important topic in the film and television industries for many years. For segmentation in these industries, an approach has typically been chosen that requires the actor to be recorded in front of a plain, single-color background. This known single color is then used as a key to determine which parts of each frame are background and which parts are the actor. After segmentation, the actor can be inserted into, or overlaid onto, a new, possibly composite frame.

With the advent of video conferencing applications on desktop computers and the upcoming handheld and portable device market, segmentation using low cost cameras and low power consumption devices has become an issue.

In the following, a solution for real-time segmentation is presented, in particular for video conferencing and videophone applications, preferably on desktop or hand-held and portable devices.

The key idea is to take advantage of the fact that a person is typically the largest figure in view during a videoconferencing session or long-distance video conversation. The new technique eliminates the need for the person to be positioned against a specifically colored background. In addition, it is shown that moving objects can exist in the background and how these objects can be automatically found and removed from the list of possibly interesting objects.

Applications for real-time segmentation of humans, e.g. in video conferencing applications, include replacing a background. In addition, segmentation is used as a first step in face detection and recognition, and as a first step in changing any part of the segmented person.

Many approaches to segmenting actors in video sequences have existed for many years. These approaches are often based on color (Ruzon00, Wojdala98, Szeliski98, Nachshon97, Hall97, Shimoda89) and some on shape (Panusopone99, Panusopone00, Qian99, Poth90, Ben-Ezra00). Recent approaches use depth maps, with prior information obtained from multiple cameras (Kompatsiaris98) or from prior background knowledge (Kompatsiaris00).

The present approach works well for semi-constrained scenes. It has proven to be an effective solution not only for desktop video conferencing applications, but also for the emerging market of handheld video conferencing or videophone devices and other similar applications. Inexpensive desktop video cameras or a digital video camera card were used in all tested applications.

This solution runs in real time and builds on and extends the traditional idea of using a blue screen or other uniform color, such as green, as the background.

The following aspects are discussed below:

"Semi constrained scenes" section: The characteristic shapes of people in video conferencing applications.

"Advanced Blue Screen" section: The segmentation algorithm used.

"Use in video conferencing applications" section: Applications for the use of a detected face.

"Conclusion" section: Conclusions.

Semi constrained scenes

The approach described here works with so-called "semi-constrained" scenes. "Semi-constrained" scenes are defined as scenes that meet the following criteria. First, the border colors of a video conference participant's head and clothing must contrast with the color of the background. Second, no other scene objects may overlap with humans.

The first criterion is typically easily met because, for example, most office environments have solid, light walls and video conference participants typically wear dark clothing. It is not necessary for walls to be light and for video conference participants to wear dark clothes, only that there is a slight contrast between the two. The second criterion is also easy to meet, since most office environments typically have objects on the wall, such as clocks and framed works of art, between which videoconferencing participants can easily be positioned so that these objects do not intersect their heads or bodies.

Advanced Blue Screen Segmentation Algorithm

Using a live video feed from an inexpensive digital video camera, this approach performs several consecutive steps to process each video frame. The first step extracts 8-bit luminance data from the full-color data captured by the camera. On this luminance channel, salt-and-pepper noise is removed using a median filter. Then edges are detected using an operator such as Sobel. A cutoff thresholding then removes all edges whose "edge sharpness" (edginess) is below a specified numerical value. The heart of this segmentation algorithm is a multi-stage vectorization process, which is referred to as advanced blue screen (ABS).
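The edge detection and cutoff thresholding steps can be sketched as follows. This is a minimal illustration, not the applicant's code: the function name `sobel_threshold`, the use of |gx| + |gy| as the edge magnitude, the clamping to 255 and the zeroed one-pixel border are all assumptions of this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Sobel edge magnitude followed by a cutoff threshold.
 * gray and edges are row-major buffers of width * height bytes.
 * Assumptions of this sketch: magnitude is approximated by |gx| + |gy|,
 * clamped to 255, and the one-pixel image border is left at zero. */
void sobel_threshold(const uint8_t *gray, uint8_t *edges,
                     int width, int height, int cutoff)
{
    assert(gray != edges);               /* in-place operation is not supported */

    for (int i = 0; i < width * height; i++)
        edges[i] = 0;

    for (int y = 1; y < height - 1; y++) {
        for (int x = 1; x < width - 1; x++) {
            const uint8_t *p = gray + y * width + x;
            /* horizontal and vertical Sobel responses */
            int gx = -p[-width - 1] + p[-width + 1]
                     - 2 * p[-1]    + 2 * p[1]
                     - p[width - 1] + p[width + 1];
            int gy = -p[-width - 1] - 2 * p[-width] - p[-width + 1]
                     + p[width - 1] + 2 * p[width]  + p[width + 1];
            int mag = abs(gx) + abs(gy);
            if (mag > 255)
                mag = 255;
            /* cutoff thresholding: drop edges whose "edginess" is too low */
            edges[y * width + x] = (mag >= cutoff) ? (uint8_t)mag : 0;
        }
    }
}
```

A higher `cutoff` keeps only strong edges; the appropriate value depends on camera noise and scene contrast and is not given in the text.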

ABS begins the search process row by row in the lower left corner of each video frame. Whenever an edge pixel is found, ABS traces the edge in a clockwise direction until the original pixel is reached again. While an edge is being traced, a secondary mask, known as the segmentation mask, is created. A numbered boundary is written into the segmentation mask during the edge tracing process.

As soon as the current figure has been traced, an inverted flood fill is performed within the bounding rectangle of the newly traced figure or shape. The boundary data and the inverted flood fill data are then used to "flood fill" this newly traced figure or shape, whereby the newly found interior of the figure is uniquely numbered and the surface area of the figure or shape is calculated.

This process of finding, tracing, and flood filling each closed figure in the current video frame continues until all figures or shapes have been found. It should be noted that the boundary pixels are uniquely numbered for each figure or shape found, and likewise all interior pixels of each figure are numbered differently and uniquely. In the last pass through the video frame data, all figures except the largest figure are automatically removed. This completes the segmentation process, which is accurate to the boundary pixel.

Use in video conferencing applications

The segmentation data generated by this approach can be used for countless purposes. The first use is real-time background replacement. A second use is to detect the user's face in order to perform face recognition or processing. A third use is to hide the face of the user.

Conclusion

An advance on historical blue screen film and video segmentation technology has been described above. This advanced blue screen technique has been described for use in, e.g., desktop videoconferencing and videophone applications, and it is particularly suitable for future videoconferencing and videophone applications using, e.g., digital video camera cards in hand-held and portable devices.

The term semi constrained scene was defined and an example of semi constrained scenes was shown. The advanced blue screen technique is based on these semi constrained scenes that are given as input.

Possible applications of the technology have been described and many more are conceivable.

In the following, a further concrete version of the invention is described in detail but only by way of example using method steps.

The exemplary embodiment of the image processing method according to the invention described in more detail below relates to image processing and to methods that use algorithms, in particular the blue screen technique, to improve the image, preferably in videoconferencing and videophone applications, but also in other comparable applications. As with the other parts of the present documents, the features of a corresponding image processing device follow analogously for a person skilled in the art.

General description

The segmentation approach of the present invention assumes the following conditions.

A. The foreground and background do not share borders.

B. In the foreground image area, the background contrasts with the foreground.

The procedure contains the basic steps:

1. Convert 24-bit color to 8-bit grayscale using the "Y" component of YIQ.

2. Perform median filtering to remove "salt and pepper" noise.

3. Calculate edge data.

4. Carry out cutoff thresholding to remove edges that are barely present; in other words, remove all detected edges whose "edginess", i.e. edge value or edge sharpness, is very low.

5. Vectorize the image data. This will result in zero or more defined figures. Discard all figures or shapes except the one with the largest surface area. This removes background edges and figures, including any moving figures in the background, that have no boundary in common with the foreground figure.

6. Feed resulting segmentation data to the modification code to get the composite image.
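Steps 1 and 2 of the list above can be sketched as follows. The luminance weights are the standard Y coefficients of YIQ (0.299, 0.587, 0.114); the 3x3 window size, the integer rounding and the function names `luma_y` and `median3x3` are assumptions made for this illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Step 1: "Y" component of YIQ, Y = 0.299 R + 0.587 G + 0.114 B,
 * computed in integer arithmetic (weights scaled by 1000, rounded). */
uint8_t luma_y(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint8_t)((299 * r + 587 * g + 114 * b + 500) / 1000);
}

/* Step 2: 3x3 median filter against salt-and-pepper noise.
 * Assumption of this sketch: border pixels are copied unchanged. */
void median3x3(const uint8_t *src, uint8_t *dst, int width, int height)
{
    assert(src != dst);                  /* in-place filtering is not supported */

    for (int i = 0; i < width * height; i++)
        dst[i] = src[i];

    for (int y = 1; y < height - 1; y++) {
        for (int x = 1; x < width - 1; x++) {
            uint8_t w[9];
            int n = 0;
            for (int dy = -1; dy <= 1; dy++)
                for (int dx = -1; dx <= 1; dx++)
                    w[n++] = src[(y + dy) * width + (x + dx)];
            /* insertion sort of the nine window values */
            for (int i = 1; i < 9; i++) {
                uint8_t v = w[i];
                int j = i - 1;
                while (j >= 0 && w[j] > v) {
                    w[j + 1] = w[j];
                    j--;
                }
                w[j + 1] = v;
            }
            dst[y * width + x] = w[4];   /* middle element is the median */
        }
    }
}
```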

definitions

Segmentation mask:

- Array of unsigned 1-byte values, height x width

- Input: edge data

- Output: anti-aliasing data for the modification logic

- Interpretation of anti-aliasing values

0 ≡ 0% opaque foreground

255 ≡ 100% opaque foreground

1 to 254 ≡ anti-aliasing level X, so that the resulting pixel value for display and transmission is calculated, color component by color component, as

resulting pixel = ((foreground color component * X) + (background color component * (255 - X))) / 255
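The per-component blending rule above can be written directly in integer arithmetic. The function name `blend_component` is hypothetical; the arithmetic follows the formula as given, with truncating integer division.

```c
#include <assert.h>
#include <stdint.h>

/* Blends one color component of a foreground and a background pixel.
 * x is the segmentation mask value: 0 = background only,
 * 255 = foreground only, 1..254 = anti-aliasing level. */
uint8_t blend_component(uint8_t fg, uint8_t bg, int x)
{
    assert(x >= 0 && x <= 255);
    return (uint8_t)((fg * x + bg * (255 - x)) / 255);
}
```

For an RGB pixel the function would be called once per component with the same mask value.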

Shape mask:

- Array of unsigned values, height x width

- May be an array of 1-byte values, 2-byte values or larger, depending on the maximum number of supported shapes or figures

- Contains figure boundary and figure interior information

- Interpretation of values

0 ≡ outside of all figures

1, 3, 5, ... ≡ figure boundary pixel; the value is the figure's boundary number

2, 4, 6, ... ≡ figure interior pixel; the figure's boundary number is obtained by subtracting 1 from the value
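The odd/even numbering scheme of the shape mask can be captured in a few helper functions. The helper names and the 2-byte cell type are assumptions of this sketch; the index expression (n - 1) / 2 is the one used for pVectShape in the vectorize routine.

```c
#include <assert.h>
#include <stdint.h>

typedef uint16_t shape_value;            /* assumed cell type: 2-byte values */

/* Odd values mark boundary pixels. */
int is_boundary(shape_value v) { return (v & 1) != 0; }

/* Even non-zero values mark interior pixels. */
int is_interior(shape_value v) { return v != 0 && (v & 1) == 0; }

/* Odd boundary number identifying the figure a non-zero value belongs to. */
shape_value boundary_number(shape_value v)
{
    assert(v != 0);                      /* 0 means "outside of all figures" */
    return is_interior(v) ? (shape_value)(v - 1) : v;
}

/* Zero-based index of the figure in the per-figure data array. */
int shape_index(shape_value v)
{
    return (boundary_number(v) - 1) / 2;
}
```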

vectorize

1. Initialize the following variables:
a. Allocate memory for an array of unsigned values for a shape mask (ShapeMask) whose width and height are the same as those of the segmentation mask (SegMask), and initialize ShapeMask to zero.
b. Allocate memory for the array, called pVectShape, that holds the surface area value and the tightest bounding box coordinates for each figure or shape.
c. CurrentShapeBoundaryNumber (current figure boundary number) = 1
d. X = 1
e. Y = 1

2. Starting at position (X, Y), scan from left to right and from bottom to top until a non-zero SegMask pixel is found whose corresponding ShapeMask pixel is still zero.

If this condition is met, go to the next step. Exit the loop and go to step 6 if

((X >= (iWidth - 2)) && (Y >= (iHeight - 2)))

The reason for scanning from left to right and from bottom to top is as follows:
a. The origin of the image is in the lower left corner
b. The positive X axis points to the right
c. The positive Y axis points upwards
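The scan in step 2 can be sketched as a small search routine. The function name `find_next_seed`, the row-major layout with row 0 as the bottom scan line, and the one-pixel border that is never examined are assumptions of this sketch, chosen to match the start position (1, 1) and the exit condition above.

```c
#include <assert.h>
#include <stdint.h>

/* Continues the left-to-right, bottom-to-top scan from (*x, *y) and looks for
 * an edge pixel (non-zero in seg) that is not yet labelled (zero in shape).
 * Row 0 is the bottom scan line, as in bottom-up BITMAP data.
 * Returns 1 and updates (*x, *y) on success, 0 when the scan is exhausted. */
int find_next_seed(const uint8_t *seg, const uint16_t *shape,
                   int width, int height, int *x, int *y)
{
    assert(*x >= 1 && *y >= 1);          /* the one-pixel border is skipped */

    int cx = *x;
    for (int cy = *y; cy <= height - 2; cy++) {
        for (; cx <= width - 2; cx++) {
            int i = cy * width + cx;
            if (seg[i] != 0 && shape[i] == 0) {
                *x = cx;
                *y = cy;
                return 1;
            }
        }
        cx = 1;                          /* next scan line starts at the left */
    }
    return 0;
}
```

Because already labelled pixels are skipped, the caller can simply call the routine again from the last position after each figure has been traced.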

3. Call the routine "Trace Around Shape", which updates the ShapeMask and returns the SurfaceArea value and the coordinates of the tightest bounding box around the figure.

4. If (SurfaceArea > 0), save the returned SurfaceArea and bounding box coordinate values (a SHAPE_DATA record) and increment to get the next available figure number:

pVectShape->sd[(CurrentShapeBoundaryNumber - 1) / 2] = SHAPE_DATA;
CurrentShapeBoundaryNumber += 2;

5. Go to step 2

6. Call routine "Cull Shapes" (sort out shapes or figures)

7. Call routine "Opaque Foreground"

8. Free the memory allocated for ShapeMask and pVectShape.

Trace around shape

1. As input, this routine requires
a. the current figure pixel location (X, Y),
b. the current figure boundary number (CurrentShapeBoundaryNumber), and
c. the assumption that we are currently on the lowest, leftmost figure boundary pixel.

The fact that data is scanned from left to right and from bottom to top in the pixel array means that the trace begins at the lowest, leftmost figure boundary pixel. The reason for this is that the BITMAP data supplied is stored bottom-up. In other words, although the scan lines are oriented left to right, the scan lines are ordered from the bottom up, which is the standard BITMAP order. The image origin is in the lower left corner with the positive X axis pointing to the right and the positive Y axis pointing upwards.

2. Save the location of this first figure boundary pixel found.
a. XFirstShapeBoundaryPixel = X
b. YFirstShapeBoundaryPixel = Y

3. Initialize the minimum and maximum x and y values for the tightest bounding box that encloses this figure.
a. XMin = X
b. YMin = Y
c. XMax = X
d. YMax = Y

4. Besides the option of staying where you are, there are eight adjacent directions in which you can move. These eight adjacent directions are:

5 6 7

4 current 0

3 2 1

These values 0 to 7 correspond to the indexes of the array aCW_Offsets, which is defined below.

The positive X axis points to the right, while the positive Y axis points upwards. This holds as long as Windows BITMAP data is used. In other words, for standard Windows BITMAPs the scan lines are oriented from left to right and are arranged from bottom to top, which is the opposite of the usual screen orientation, in which the image origin is in the top left corner.

5. Facing the direction in which we have just moved, we start at a direction that is 135° CCW from it and then continue turning in 45° increments in the CW direction until we either
a. run out of pixels to examine, or
b. encounter the initial figure boundary pixel (XFirstShapeBoundaryPixel, YFirstShapeBoundaryPixel), which is the natural termination condition for this algorithm once we have moved past the first figure boundary pixel, or
c. find the next figure boundary pixel by finding the next edge pixel (i.e. a non-zero SegMask pixel) while turning in the CW direction around the last figure boundary pixel found, or
d. continue by looking at the next adjacent pixel after turning 45° in the CW direction.

6. From the assumptions made in (1) above, we know the direction in which we were scanning when we found the initial figure boundary pixel. Since we know that we start our CW sweep at a direction that is 135° CCW from the direction in which we last moved, we can initialize as follows:

ArrayIndex   0       1        2        3         4        5        6       7
aCW_Offsets  (1; 0)  (1; -1)  (0; -1)  (-1; -1)  (-1; 0)  (-1; 1)  (0; 1)  (1; 1)

ArrayIndex = 5;

7. Set the "run out of pixels to examine" condition:

LastArrayIndex = (ArrayIndex + 7) mod 8;

8. CASE 5.a - If we have run out of pixels to examine (i.e. if LastArrayIndex = ArrayIndex), a figure consisting of only one pixel was found. Do the following:
a. Set the single pixel found to zero in the SegMask in order to remove it, and
b. Return with the value 0 for the SurfaceArea.
This particular case should be handled before entering the loop.

9. CASE 5.b - If

(X + aCW_Offsets[ArrayIndex].x = XFirstShapeBoundaryPixel) and (Y + aCW_Offsets[ArrayIndex].y = YFirstShapeBoundaryPixel),

then we have completed the trace around the figure boundary. Do the following:
a. Set the pixel at the current (X, Y) position to the value of the current figure boundary number.
b. Call routine "Inverted Boundary Fill" to fill the inside of the current figure and calculate the SurfaceArea value for that figure.

10. CASE 5.c - If

(SegMask[X + aCW_Offsets[ArrayIndex].x, Y + aCW_Offsets[ArrayIndex].y] != 0)

then we have found the next figure boundary pixel by finding the next edge pixel (i.e. a non-zero SegMask pixel) while turning in the CW direction around the last figure boundary pixel found. Do the following:
a. Set the pixel at the current (X, Y) position to the value of the current figure boundary number.
b. Update the current (X, Y) position as follows:
X += aCW_Offsets[ArrayIndex].x;
Y += aCW_Offsets[ArrayIndex].y;
c. Update the figure bounding box coordinates:
1. If (XMin > X) XMin = X;
2. If (XMax < X) XMax = X;
3. If (YMin > Y) YMin = Y;
4. If (YMax < Y) YMax = Y;
d. Update the starting direction to check:
ArrayIndex = ((ArrayIndex + 6) mod 8);
e. Go to step 7

11. CASE 5.d - We are still turning in 45° increments around the last figure boundary pixel found, so
a. ArrayIndex = (ArrayIndex + 1) mod 8;
b. Go to step 8
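The tracing loop of steps 5 to 11 can be condensed into the following sketch. It uses the aCW_Offsets table and the initial ArrayIndex of 5 from step 6 and the restart rule (ArrayIndex + 6) mod 8 from step 10.d. The function name `trace_boundary`, the `offset` type, the returned step count and the omission of the bounding box update are simplifications assumed here; the sketch also assumes a zero-valued border around SegMask and is not hardened against degenerate shapes.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { int x, y; } offset;     /* hypothetical helper type */

/* Clockwise neighbour offsets; index 0 points east, Y axis points up. */
static const offset aCW_Offsets[8] = {
    {1, 0}, {1, -1}, {0, -1}, {-1, -1}, {-1, 0}, {-1, 1}, {0, 1}, {1, 1}
};

/* Traces the figure boundary starting at (x, y), writes `number` into shape
 * for every boundary pixel left behind, and returns the number of moves made.
 * Returns 0 for an isolated pixel (CASE 5.a); removing that pixel from seg is
 * left to the caller in this sketch. */
int trace_boundary(const uint8_t *seg, uint16_t *shape, int width,
                   int x, int y, uint16_t number)
{
    assert(seg[y * width + x] != 0);     /* must start on an edge pixel */

    const int x0 = x, y0 = y;
    int dir = 5;                         /* initial direction from step 6 */
    int steps = 0;

    for (;;) {
        int found = 0;
        for (int k = 0; k < 8; k++) {    /* sweep clockwise in 45° steps */
            int d = (dir + k) % 8;
            int nx = x + aCW_Offsets[d].x;
            int ny = y + aCW_Offsets[d].y;
            if (seg[ny * width + nx] != 0) {
                shape[y * width + x] = number;   /* label pixel we leave */
                x = nx;
                y = ny;
                dir = (d + 6) % 8;       /* restart rule of step 10.d */
                steps++;
                found = 1;
                break;
            }
        }
        if (!found)
            return 0;                    /* CASE 5.a: single-pixel figure */
        if (x == x0 && y == y0)
            return steps;                /* CASE 5.b: back at the first pixel */
    }
}
```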

Inverted boundary fill

1. The inverted boundary fill algorithm fills from all four sides of the tightest bounding box that encloses the current figure. First, a pseudo-fill is performed from each of the four bounding box sides up to, but not including, the current figure boundary. Second, the pseudo-fill is inverted in order to obtain the desired figure boundary fill.

Only one of the four side pseudo-flood fills is described. The others follow analogously and are actually implemented as a single routine in the code.

2. Scan from left to right along the bottom of the figure bounding box, looking for a pixel that is not a current figure boundary pixel and has not already been marked as a current figure outside pixel. If one is found, go to step 3. Otherwise, continue checking the pixels along the current scan line. Once all of the pixels in the current scan line have been checked, go to step 5.

3. If the current pixel in the scan line has not already been marked as a boundary or interior pixel of a previously found figure, mark the current pixel as a current figure outside pixel. If there is another scan line above the current scan line that is within the current figure bounding box, and the pixel directly above the current scan line pixel is not a current figure boundary pixel, then
a. If the pixel in the scan line above the current scan line pixel has not yet been marked as a boundary or interior pixel of a previously found figure, then mark the current pixel as a current figure outside pixel.
b. If we are not in a span at the moment,
i. Increment the number of spans found
ii. Save the beginning pixel offset for the current span
iii. Set the "in a span" flag to indicate that we are currently in a span
c. Increment the number of pixels in the current span.
Otherwise - when we encounter a current figure boundary pixel -
a. Reset the "in a span" flag.
Go to step 2.

5. If there are one or more spans in the next scan line, go to step 6. Otherwise return.

6. Extend the current set of spans by checking the pixels to the right and left of each span for span membership. Increment the "PrecedingRunLength" values in the PIXEL_SPAN data structure, which is maintained for each span and whose data members are as follows:

BegPixelOffset
RunLength
PrecedingRunLength
SubsequentRunLength

7. Repeat from step 2 until no more spans are generated, using the span data instead of checking the pixel data directly, since the pixel data has already been checked and updated as needed when each span was generated.

8. Call routine "Invert Fill Regions" to complete the inverted boundary fill algorithm.

Invert fill regions

1. Scan from left to right and from bottom to top within the current figure bounding box.

2. Do one of the following for each ShapeMask pixel checked:

a. If the current ShapeMask pixel is a boundary or interior pixel of a previous figure, ignore it and continue.
b. If the current ShapeMask pixel is a current figure outside pixel, reset it to the value zero.
c. If the current ShapeMask pixel is a current figure boundary pixel, increment the surface area counter by one.
d. If the current ShapeMask pixel is a current figure interior pixel,
i. set the current pixel to CurrentShapeBoundaryNumber + 1, and
ii. increment the surface area counter by one.
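The two phases described above (pseudo-fill from the bounding box sides, then inversion) can be sketched as follows. Note that this sketch deliberately swaps in a simple stack-based 4-connected flood fill and a temporary `outside` buffer in place of the span-based fill and in-mask marking described in the text; the function name, the `bbox` type and the error return value are likewise assumptions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct { int xmin, ymin, xmax, ymax; } bbox;   /* hypothetical type */

/* Labels the interior of the figure whose boundary pixels carry `number`
 * with number + 1 and returns its surface area (boundary plus interior),
 * or -1 if memory could not be allocated. */
int inverted_boundary_fill(uint16_t *shape, int width, bbox b, uint16_t number)
{
    int bw = b.xmax - b.xmin + 1, bh = b.ymax - b.ymin + 1;
    assert(bw > 0 && bh > 0);

    uint8_t *outside = calloc((size_t)bw * bh, 1);
    int *stack = malloc(sizeof(int) * (size_t)bw * bh);
    int top = 0, area = 0;
    if (!outside || !stack) { free(outside); free(stack); return -1; }

    /* Phase 1a: seed the pseudo-fill from all four sides of the box. */
    for (int y = 0; y < bh; y++)
        for (int x = 0; x < bw; x++) {
            if (x != 0 && y != 0 && x != bw - 1 && y != bh - 1) continue;
            if (shape[(b.ymin + y) * width + b.xmin + x] == number) continue;
            outside[y * bw + x] = 1;
            stack[top++] = y * bw + x;
        }

    /* Phase 1b: grow the outside region up to, but not across, the boundary. */
    while (top > 0) {
        static const int dx[4] = {1, -1, 0, 0}, dy[4] = {0, 0, 1, -1};
        int i = stack[--top], x = i % bw, y = i / bw;
        for (int k = 0; k < 4; k++) {
            int nx = x + dx[k], ny = y + dy[k];
            if (nx < 0 || ny < 0 || nx >= bw || ny >= bh) continue;
            if (outside[ny * bw + nx]) continue;
            if (shape[(b.ymin + ny) * width + b.xmin + nx] == number) continue;
            outside[ny * bw + nx] = 1;
            stack[top++] = ny * bw + nx;
        }
    }

    /* Phase 2: invert. Unlabelled pixels not reached from outside are interior;
     * pixels of previously found figures are left untouched. */
    for (int y = 0; y < bh; y++)
        for (int x = 0; x < bw; x++) {
            uint16_t *p = &shape[(b.ymin + y) * width + b.xmin + x];
            if (*p == number) {
                area++;
            } else if (!outside[y * bw + x] && *p == 0) {
                *p = (uint16_t)(number + 1);
                area++;
            }
        }

    free(outside);
    free(stack);
    return area;
}
```

A 4-connected fill is used on purpose: it cannot leak through the diagonal steps of an 8-connected boundary.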

Sorting shapes or figures (cull shapes)

1. Search pVectShape for the largest SurfaceArea value and save its array index in the variable ShapeAreaIdx.

2. If (ShapeAreaIdx = 0) return;

3. CurrentShapeBoundaryNumber = (ShapeAreaIdx * 2) + 1

4. Using the tightest bounding box coordinates stored in pVectShape at index ShapeAreaIdx, scan ShapeMask from left to right and from top to bottom for non-zero values.

5. For every non-zero ShapeMask pixel that is equal neither to CurrentShapeBoundaryNumber (a boundary pixel of the figure with the largest SurfaceArea) nor to CurrentShapeBoundaryNumber + 1 (an interior pixel of that figure), set the pixel in SegMask at the current (X, Y) position to zero to indicate that this SegMask pixel is transparent foreground.
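The culling logic can be sketched as two small functions. For simplicity this sketch scans the entire mask rather than only the bounding box of the largest figure, and it does not reproduce the early return of step 2; both are deviations assumed here. The function names and the plain `int` area array (standing in for pVectShape) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the index of the largest surface area, or -1 if there are no figures. */
int largest_shape(const int *areas, int count)
{
    int best = -1;
    for (int i = 0; i < count; i++)
        if (best < 0 || areas[i] > areas[best])
            best = i;
    return best;
}

/* Clears every SegMask pixel that belongs to a figure other than keep_idx.
 * Pixels outside all figures (shape value 0) are left unchanged here. */
void cull_shapes(uint8_t *seg, const uint16_t *shape, int n_pixels, int keep_idx)
{
    assert(keep_idx >= 0);
    const uint16_t boundary = (uint16_t)(keep_idx * 2 + 1);

    for (int i = 0; i < n_pixels; i++) {
        uint16_t v = shape[i];
        if (v != 0 && v != boundary && v != boundary + 1)
            seg[i] = 0;                  /* transparent foreground */
    }
}
```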

Opaque foreground

The behavior of this routine depends on the level of functionality, as listed below under "level of functionality".

level of functionality

How do we make the foreground opaque?

1. We can make only the boundaries visible by
a. setting all figure boundary pixels to the value 255 in the segmentation mask, while
b. all other segmentation mask pixels are set to the value 0.

2. We can make boundaries and interiors visible by
a. setting all figure boundary pixels to the value 128 in the segmentation mask, and
b. setting all figure interior pixels to the value 255 in the segmentation mask, while
c. all other segmentation mask pixels are set to the value 0.

3. We can make the whole foreground 100% opaque by
a. setting all figure boundary pixels to the value 255 in the segmentation mask, and
b. setting all figure interior pixels to the value 255 in the segmentation mask, while
c. all other segmentation mask pixels are set to the value 0.

4. We can do boundary blurring by
a. setting all figure boundary pixels to the value 128 in the segmentation mask, and
b. setting all figure interior pixels to the value 255 in the segmentation mask, while
c. all other segmentation mask pixels are set to the value 0, and
d. the modification logic is made to blur all resulting pixels that have the value 128 in the segmentation mask.

5. We can do border anti-aliasing by
a. setting to 64 the segmentation mask pixels that lie one pixel outside each segmentation mask boundary pixel, and
b. setting all segmentation mask boundary pixels to the value 128, and
c. setting to 192 the segmentation mask pixels that lie one pixel inside each segmentation mask boundary pixel, and
d. setting all other figure interior pixels to the value 255 in the segmentation mask, while
e. all other segmentation mask pixels are set to the value 0.

6. We can perform border anti-aliasing with blurring by
a. carrying out all the steps described under "border anti-aliasing" above, and then
b. causing the modification logic to blur all resulting pixels that have the value 128 in the segmentation mask.

7. We can do advanced border anti-aliasing by
a. taking linearly stretched boundary pixel values as input from the edge detection algorithm (i.e. pixels that are the sharpest edges within their local surroundings are set to the value 255 by the edge detection algorithm, and pixels that are also edge pixels but not the sharpest edges within their local surroundings are linearly stretched to values between 1 and 255), and
b. setting all other figure interior pixels to the value 255 in the segmentation mask, while
c. all other segmentation mask pixels are set to the value 0.

8. We can do advanced border anti-aliasing with blurring by
a. performing the steps described under "advanced border anti-aliasing" above, and then
b. causing the modification logic to blur any resulting pixels that have values in the range 1 to 254 in the segmentation mask.

The invention is not restricted to the features and combinations of features of the exemplary embodiments described above and shown in the drawing. The individual aspects, features and combinations of features of the present invention can be implemented individually and in combination and are worthy of protection. In addition to the general and concrete information on the implementation of the invention contained in the present documents, its scope also includes all variations, modifications, substitutions and combinations which the person skilled in the art can readily recognize from the documents themselves and / or with the help of his specialist knowledge.
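By way of illustration only, functionality levels 1 to 3 of the "Opaque Foreground" routine described above can be sketched as a single pass over the masks. The function name, the `level` parameter and the flat pixel count are assumptions of this sketch; the blurring and anti-aliasing levels are not covered.

```c
#include <assert.h>
#include <stdint.h>

/* Writes the final segmentation mask values for the figure whose boundary
 * pixels carry `boundary` (and whose interior pixels carry boundary + 1).
 * level 1: boundary 255, everything else 0
 * level 2: boundary 128, interior 255, everything else 0
 * level 3: boundary 255, interior 255, everything else 0 */
void opaque_foreground(uint8_t *seg, const uint16_t *shape, int n_pixels,
                       uint16_t boundary, int level)
{
    assert(level >= 1 && level <= 3);

    for (int i = 0; i < n_pixels; i++) {
        if (shape[i] == boundary)
            seg[i] = (level == 2) ? 128 : 255;
        else if (shape[i] == boundary + 1)
            seg[i] = (level == 1) ? 0 : 255;
        else
            seg[i] = 0;
    }
}
```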

Claims


1. Image processing device with user image data input devices for entering current user image data, image data editing devices for generating edited user image data from the current user image data, and image data output devices for outputting user image data, characterized in that the image data editing devices contain processing devices which are designed to determine the largest moving object and to identify, locate and track it as the user or as available parts of the user.

2. Image processing device according to claim 1, characterized in that the processing devices segment the largest moving object in order to supply the data obtained for further processing.

3. Image processing method in which current user image data are acquired, edited and output, characterized in that the largest moving object is determined and identified and located and tracked as the user or available parts of the user.

4. Image processing method according to claim 3, characterized in that the largest moving object is segmented and its data are fed to further processing.