WO2014154574A1 - Method, apparatus and computer program for image processing - Google Patents

Publication number
WO2014154574A1
Authority
WO
WIPO (PCT)
Prior art keywords
picture frames
correlation
picture
pixel
blocks
Prior art date
Application number
PCT/EP2014/055687
Other languages
French (fr)
Inventor
Piergiorgio Sartor
Francesco Michielin
Original Assignee
Sony Corporation
Sony Deutschland Gmbh
Priority date
Filing date
Publication date
Application filed by Sony Corporation, Sony Deutschland Gmbh
Publication of WO2014154574A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/223 Analysis of motion using block-matching
    • G06T7/231 Analysis of motion using block-matching using full search

Definitions

  • the present disclosure relates to a method for image processing.
  • the invention also relates to an apparatus for image processing, a computer program as well as a non-transitory computer-readable recording medium.
  • a method for image processing in particular for motion and/or disparity estimation, comprising:
  • an apparatus for image processing of a number of spatially and/or temporally separated picture frames, in particular for motion and/or disparity estimation comprising:
  • a transformation unit adapted to compress the picture frames at least partially and to provide binary pixel blocks of the picture frames
  • a correlation unit adapted to correlate the binary pixel blocks
  • a detection unit for detecting correlated picture blocks in the set of picture frames on the basis of the correlation of the binary pixel blocks.
  • a computer program comprising program means for causing a computer to carry out the steps of the method disclosed herein, when said computer program is carried out on a computer, as well as a non-transitory computer-readable recording medium that stores therein a computer program product, which, when executed by a processor, causes the method disclosed herein to be performed are provided.
  • One of the aspects of the present disclosure is to implement a motion and/or a disparity estimation using multiple frames in time and/or space.
  • the respective picture frames are binarized at least partially to provide respective binary pixel blocks of the pixel frames.
  • the pixel blocks which are estimated to be correlated are binarized and compressed to binary pixel blocks in order to reduce the technical effort and the time to process those pixel blocks.
  • the binary pixel blocks are correlated in order to determine the correlated picture blocks in the set of picture frames on the basis of the correlation of the binary pixel blocks.
  • Fig. 1 shows a schematic block diagram illustrating an embodiment of the method for image processing
  • Fig. 2 shows a block diagram illustrating the correlation of pixel blocks in a set of picture frames
  • Fig. 3 shows picture frames and binarized picture frames for illustrating the binary correlation
  • Fig. 4 shows a diagram for illustrating the cross-correlation of the binarized pixel blocks of Fig. 3;
  • Fig. 5 shows a schematic block diagram of an apparatus for image processing.
  • FIG. 1 shows a schematic block diagram of an embodiment of a method for image processing which is generally denoted by 10.
  • a video stream comprising different picture frames is provided as indicated by 12, wherein the picture frames are spatially and/or temporally separated picture frames from which a certain motion or a certain disparity should be estimated and/or certain frames should be interpolated.
  • the picture frames are preprocessed.
  • the preprocessing step 14 may include filtering, scaling in order to reduce the pixel size and/or the resolution of the picture frames, contrast enhancement of the picture frames and/or frequency manipulation such as low-pass filtering or high-pass filtering which may be applied depending on the noise level of the picture frames.
  • the so-preprocessed image frames are binarized in a further step as indicated by 16. This binarization performs a binary compression of the image frames, i.e. the input image is reduced from 8 or 10 or 12 bit per pixel to one bit per pixel.
  • the binarization may be performed by calculating an average level of the pixels of the image frame or an average level of a certain pixel block of the image frame and by comparing each pixel value of the image frame or the certain image block with the respective average value. Depending on the pixel value with respect to the average value, a 1 or a 0 is assigned to the pixel in order to provide the binary image frame or the binary pixel block. If the pixel value is equal to the average value, a 1 or a 0 can be assigned to the respective pixel and it is integrated in a comparison.
  • each pixel value can be compared with a central value and a 1 or a 0 is assigned to the respective pixel depending on the comparison.
  • the pixel values of the image frame or of a certain block are compared with a certain value and a 1 or a 0 is assigned to the respective pixel to binarize the image frames.
  • the binarized image frames or pixel blocks are blockwise cross-correlated as indicated by 18.
  • the cross correlation is performed between a binarized source block of a first picture frame and a search block or search area of a second picture frame.
  • the search block may be larger than the source block in order to determine the correlated pixel blocks of the two different image frames.
  • the search block or the search area is determined on the basis of an estimation process in order to reduce the search effort to find the correlated pixel blocks.
  • the estimation process selects from a set of vectors a plurality of candidate vectors which are assigned to the source block and point to the search area, where the pixel block correlated to the source block is estimated.
  • the estimation may be a motion estimation process.
  • the cross-correlation 18 provides a cross-correlation value by multiplying each pixel of the source block with a corresponding pixel of the search area. The sum of the multiplication results of the pixels forms the cross-correlation result, wherein a high value indicates that most of the compared pixels have an identical value (0 or 1) and have a high degree of correlation.
  • the cross correlation is performed for different positions of the source block in the search area so that a matching can be found if a correlated pixel block is present in the search area. If the correlated pixel blocks are found, a respective vector from the source block to the correlated pixel block in the search area is determined and the determined vector is provided as an output vector as indicated by 20. Further, the so-determined vector is fed back to the cross-correlation 18 in order to select the candidate vectors from the field of vectors for a cross correlation of the next picture frame. The step of feeding the output vector back to the cross correlation in order to select candidate vectors is called predictor-setup and is denoted by 22.
  • the multiplication of the pixels of the binary pixel blocks may be implemented as a simple XNOR function.
  • the cross-correlation may be implemented as an XOR function, wherein in that case the lowest value of the cross-correlation result indicates that most of the compared pixels have an identical value (0 or 1) and have a high degree of correlation.
  • the method 10 is usually carried out as 3D or 2D recursive motion estimation, wherein the highest correlation value is defined as a match of the correlated pixel blocks.
  • Fig. 2 shows an example of two picture frames, which are indicated with the reference numerals 24, 26.
  • the picture frames 24, 26 are captured e.g. by two cameras as left and right frames for three-dimensional imaging or as two time instances t, t+1 of a sequence of consecutive images captured by one camera.
  • a frame is built up of pixels, each pixel carrying information for example on colour etc.
  • the picture frames 24, 26 are built of an array of pixel blocks 28 each comprising an array of pixels.
  • a source pixel block 30 is schematically shown in a first position within the picture frame 24.
  • the pixel block corresponding to the source pixel block 30 is located at a different position, wherein the difference between both positions characterizes a movement of the respective pattern or the disparity between both corresponding pixel blocks.
  • the present method is provided to identify the position of the pixel block corresponding to the source pixel block 30 in the second picture frame 26 in order to determine the disparity between both corresponding pixel blocks and/or the movement of the respective corresponding pixel blocks from one picture frame 24 to the other picture frame 26.
  • a plurality of estimation vectors 32, 34, 36, 38, 40 are determined or selected from a field of vectors estimating a movement or a disparity of the corresponding pixel block and pointing from the position of the source pixel block 30 to an estimated position 42, 44, 46, 48, 50 of the corresponding pixel block in the second picture frame 26.
  • the estimated positions 42-50 are coarse estimations of the position of the corresponding pixel block, estimated e.g. on the basis of previously determined motion vectors and/or disparity vectors, and the correct position of the corresponding pixel block in the second picture frame 26 has to be determined or verified by a following image processing step.
  • a search area 52, 54, 56, 58, 60 or a search pixel block 52-60 is determined for each of the estimated positions 42-50.
  • the search areas 52-60 are usually larger than the source block 30 and usually surround the estimated positions 42-50.
  • the source pixel block 30 and the pixel blocks of the search areas 52-60 are binarized to a binary pixel block and compared by means of a cross-correlation.
  • the source pixel block 30 is successively compared to different positions in the search areas 52-60 and a cross-correlation value is determined for each position by multiplying the binary pixel blocks of the source block 30 and the search area 52-60 or by means of an XNOR operator, wherein the different values of the pixel by pixel comparison are added to determine the cross-correlation value.
  • the best matching of the source block 30 and the search areas 52-60 leads to the highest cross-correlation value and, therefore, the pixel block in the search area 52-60 corresponding to the source pixel block 30 can be determined in a fast processing with a high reliability.
  • the pixel block in the search area 52-60 corresponding to the source pixel block 30 can be determined also on the basis of the XOR operation with a high reliability.
  • the pixel block corresponding to the source pixel block 30 determined in the second picture frame 26 by means of the cross-correlation is indicated as an example by reference numeral 62.
  • a motion vector or disparity vector 64 pointing from the source pixel block 30 to the corresponding pixel block 62 in the second picture frame 26 is determined and provided as the image processing result.
  • the so-determined output vector 64 is used to determine the estimated vectors 32-40, e.g. by selecting the most probable vectors from a field of estimated vectors 42-50 as candidate vectors for the next search and cross-correlation step in the following picture frame.
  • the determination of the corresponding pixel block 62 is based on the results of the previous image processing steps or the analysis of the previous picture frames and fed-back as mentioned above by means of the predictor set-up 22.
  • the size of the search areas 52-60 is variable and is changed on the basis of a reliability of the estimation of the estimation vectors 32-40. If the reliability of the estimation vectors 32-40 is high so that the estimated positions 42-50 are expected to be very close to the correlated pixel block 62 to be determined, the size of the search area 52-60 is reduced. If the reliability of the estimation of the estimation vectors 32-40 is low, the search area 52-60 is increased so that the correlated pixel block 62 can be found within the larger search areas 52-60 with a high probability.
  • the reliability of the estimation vectors 32-40 can be determined on the basis of an amount of the estimated vectors 32-40, wherein a large amount of estimation vectors 32-40 or candidate vectors 32-40 indicates that the estimation of the estimated positions 42-50 has a low reliability so that the size of the search areas 52-60 should be increased in order to find the correlated pixel block 62 with a higher probability.
  • the reliability estimation can be based on a uniformity of the estimation vectors 32-40, wherein the estimation is considered to have a poor reliability if the vector candidates 32-40 point in many different directions, i.e. have a low uniformity, and the estimation is considered to have a high reliability if the estimation vectors 32-40 point in one direction, i.e. have a high uniformity.
  • the size of the search area 52-60 can be reduced. Additionally, the reliability estimation can be based on a correlation result of the binary pixel blocks of different picture frames determined in a previous processing step. In other words, if a correlation value of the binary pixel blocks is high for the XNOR operation, the correlated pixel block 62 is determined with a high reliability. Alternatively, if the XOR operation is used, the correlated pixel block 62 is determined with a high reliability if the correlation value of the binary pixel blocks is low. If the output vector 64 of the previous processing step which is used to determine the candidate vectors or the estimation vectors has a high reliability, the estimation of the candidate vectors also has a high reliability. In that case, the size of the search area can be reduced since the reliability of the estimation vector is considered to be high.
  • the plurality of estimation vectors 32-40 are usually selected from a field of estimation vectors and are selected on the basis of a motion and/or disparity relation of the different picture frames.
  • the estimated vectors 32-40 can be selected from the field of estimation vectors on the basis of a general movement of the image pattern within the picture frames. In other words, if a general movement e.g. of the background of the image is determined in the picture frames, the estimated vectors 32-40 are selected on the basis of this detected general movement of the image pattern.
  • the general movement can also be determined on the basis of a movement sensor within the used camera which determines the movement of the camera independently and wherein the general movement of the image pattern within the picture frames is estimated on the basis of the measurement of the motion sensor.
  • estimation vectors 32-40 can be determined on the basis of scaled picture frames of the set of picture frames, wherein the pixel size of the scaled picture frames is reduced in order to provide a fast movement detection within the scaled picture frame so that the coarse position of the correlated pixel block 62 can be determined quickly on the basis of a coarse picture calculation.
  • the size of the search area 52-60 can be adapted independently in X-direction and in Y-direction depending on the estimation vectors 32-40.
  • Fig. 3 shows two picture frames as captured and two corresponding binarized picture frames and the respective source blocks for illustrating the cross-correlation process 18 in order to determine the corresponding pixel block 62.
  • Fig. 3a shows a first picture frame 66 corresponding to the first picture frame 24 and a second picture frame 68 corresponding to the second picture frame 26 shown in Fig. 2. Further, a source block 70 as part of the first picture frame 66 is shown corresponding to the source block 30 shown in Fig. 2.
  • the picture frames 66, 68 and the source block 70 shown in Fig. 3a are monochrome or coloured pictures as captured by the respective camera.
  • Fig. 3b shows a first binarized picture frame 72 and a second binarized picture frame 74 corresponding to the first picture frame 66 and the second picture frame 68 shown in Fig. 3a. Further, a binarized source block 76 corresponding to the source block 70 shown in Fig. 3a is shown in Fig. 3b.
  • the binarization of the picture frames 66, 68 is performed as described above, wherein the pixel value of each pixel of the picture frames 66, 68 is compared to a certain pixel threshold level and depending on the comparison result a 0 or a 1 is assigned as a binary pixel value to the respective pixel in order to provide the binarized picture frames 72, 74.
  • the threshold value can be predefined or can be determined as a medium value of all pixels of the picture frame 66, 68 or a certain block or area of the picture frames 66, 68.
  • the binarized source block 76 is determined in order to determine the respective corresponding pixel block 62 in the binarized second picture frame 74.
  • the source block 70 extracted from the captured first picture frame 66 can be binarized in order to reduce the image processing effort.
  • the search area 52-60 is determined as mentioned above and the binarized source block 76 is correlated to the search area 52-60 of the binarized second picture frame 74.
  • the correlation is performed by cross-correlation as mentioned above, wherein a pixel by pixel XNOR (or alternatively XOR) operation or a multiplication of the pixel values is performed for different search positions of the binarized source block 76 within the search area 52-60.
  • the results of the pixel by pixel XNOR (or XOR) or multiplication operation are added in order to determine a sum as a single cross-correlation value.
  • the position of the binarized source block 76 within the search area 52-60 providing the highest cross-correlation value is defined as a matching position in the case of the XNOR operation and is determined as corresponding pixel block 62 which corresponds to the source pixel block 70.
  • the position of the binarized source block 76 within the search area 52-60 providing the lowest cross-correlation value is defined as the matching position.
  • Fig. 4 shows a map of cross-correlation values determined for different positions of the binarized source block 76 within the search area 52-60 by means of an XNOR operation.
  • the cross-correlation value is generally denoted by A.
  • the map shown in Fig. 4 shows different values depending on the X-position and the Y-position of the binarized source block 76 within the search area 52-60.
  • the cross-correlation value map shows different low peaks and a single high peak, which is generally denoted by 78.
  • the high peak 78 is obviously the highest cross-correlation value A for the search shown in Fig. 4 so that this respective X-position and Y-position is defined as the position of the corresponding pixel block 62 for this XNOR operation.
  • the cross-correlation value map comprises different high peaks and a single low peak.
  • the low peak indicates the best correlation for the search so that this respective X-position and Y-position is defined as the matching position of the corresponding pixel block 62. Due to the cross-correlation and the so-determined cross-correlation value A, a fast matching can be achieved since the cross correlation in general provides a fast convergence of the calculation process.
  • the correlation of the binarized source block 76 and the binarized second picture frame 74 is performed by phase correlation, wherein edges in the source block 76 and the second binarized picture frame are correlated in order to identify the correlated pixel block 62.
  • FIG. 5 shows a schematic block diagram of an apparatus for image processing on the basis of the image processing method 10 described above.
  • the apparatus is generally denoted by 80.
  • the apparatus 80 receives a plurality of image frames 66, 68 from an imaging device 82 such as a camera.
  • the picture frames 66, 68 are preprocessed by means of a pre-processing unit 84 which comprises a filter unit, a scaling unit and/or a contrast-enhancement unit for performing the pre-processing step 14 shown in Fig. 1.
  • the pre-processed picture frames 86 are provided to a binarization unit 88.
  • the binarization unit 88 performs a binary compression of the pre-processed picture frames 86 as mentioned above and provides the binarized picture frames 72, 74 and the binarized source block 76 to a cross-correlation unit 90.
  • the cross-correlation unit 90 performs the cross-correlation step 18 as mentioned above and provides the cross-correlation value A as a result of the cross-correlation 18 to an evaluation unit 92 which evaluates the position corresponding to the highest peak 78 for an XNOR operator (or the lowest peak for an XOR operator) of the cross-correlation value A and determines the output vector 64 as a result of the image processing pointing from the source block 30 to the correlated pixel block 62.
  • the output vector 64 is also fed back to the cross-correlation unit 90 in order to determine the estimation vectors on the basis of the output vector.
  • the feedback loop is called predictors setup.
  • the method 10 for image processing and the apparatus 80 for image processing have the advantage that a fast convergence can be provided so that the image processing can be performed with low time effort and the technical effort to perform this method is reduced.
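
The adaptation of the search area size described in the list above (smaller area for uniform candidate vectors, larger area for scattered ones, chosen independently in X-direction and Y-direction) could be sketched as follows. This is only an illustration: the function name `search_margins`, the use of the per-axis spread as a uniformity measure and the clamping limits are assumptions for this example and are not taken from the disclosure.

```python
def search_margins(candidates, min_margin=1, max_margin=8):
    """Pick per-axis search margins (in pixels) from the spread of candidate vectors.

    candidates: list of (vx, vy) candidate vectors for one source block.
    A tight cluster (high uniformity) yields a small margin, a wide
    spread (low uniformity) a large one. All constants are illustrative
    and not taken from the disclosure.
    """
    xs = [vx for vx, _ in candidates]
    ys = [vy for _, vy in candidates]

    def clamp(value):
        # Keep the margin inside the hypothetical [min_margin, max_margin] range.
        return max(min_margin, min(max_margin, value))

    # The spread is evaluated separately for X and Y, so the search
    # area can grow along one axis only.
    return clamp(max(xs) - min(xs)), clamp(max(ys) - min(ys))


uniform = [(4, 0), (4, 1), (5, 0)]       # candidates agree: small search area
scattered = [(-6, 0), (4, 1), (9, -1)]   # candidates disagree in X: wide in X only

print(search_margins(uniform))
print(search_margins(scattered))
```

Other reliability cues named in the list, such as the number of candidate vectors or the previous correlation value, could feed the same per-axis margins; how they would be weighted is left open here.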

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

Method for image processing, in particular for motion and/or disparity estimation, comprising: providing a set of temporal or spatial related picture frames containing correlated blocks, transforming the picture frames by binary compressing the picture frames at least partially to provide binary pixel blocks of the pixel frames, correlating the binary pixel blocks, and determining the correlated picture blocks in the set of picture frames on the basis of the correlation of the binary pixel blocks.

Description

METHOD, APPARATUS AND COMPUTER PROGRAM FOR IMAGE PROCESSING
BACKGROUND
Field of the Disclosure
[0001] The present disclosure relates to a method for image processing. The invention also relates to an apparatus for image processing, a computer program as well as a non-transitory computer-readable recording medium.
Description of Related Art
[0002] There is an increased demand for 2D/3D and multiple view applications, all of them requiring image processing, like motion estimation, disparity estimation or picture frame interpolation. In the art several methods for estimating motion and disparity are known. Most of them are working independently of each other and do not use spatial and temporal information between multiple picture frames captured by e.g. two or more cameras. The same applies to frame interpolation, i.e. creating new frames based on other frames and estimation of vector information. In order to provide the motion estimation, the disparity estimation or the picture frame interpolation it is necessary to identify correlated pixel blocks in different image frames.
[0003] Therefore there is a demand for providing a reliable and effective image processing to identify correlated pixel blocks in different image frames.
[0004] In order to determine a motion in a set of images, US 2008/0037869 A1 suggests calculating an absolute difference between pixels of a certain pixel region in the image frames and determining the correlated pixel blocks on the basis of the absolute difference between the pixels.
[0005] However, the known methods for identifying correlated pixel blocks in a set of image frames are complex, require a long processing time and a fast processing unit to calculate the respective motion vectors.
[0006] The "background" description provided herein is for the purpose of generally presenting the context of the disclosure. Work of the presently named inventor(s), to the extent it is described in this background section, as well as aspects of the description which may not otherwise qualify as prior art at the time of filing, are neither expressly nor impliedly admitted as prior art against the present invention.
SUMMARY
[0007] It is an object to provide a method for image processing which achieves an improved fast motion and/or disparity estimation with low technical effort. It is a further object to provide an apparatus for image processing which achieves an improved fast motion and/or disparity estimation with low technical effort, as well as a corresponding computer program for implementing the method and a non-transitory computer-readable recording medium for implementing the method.
[0008] According to an aspect there is provided a method for image processing, in particular for motion and/or disparity estimation, comprising:
providing a set of temporal and/or spatial related picture frames containing correlated picture blocks,
transforming the picture frames by binary compressing the picture frames at least partially to provide binary pixel blocks of the pixel frames,
correlating the binary pixel blocks, and
determining the correlated picture blocks in the set of picture frames on the basis of the correlation of the binary pixel blocks.
[0009] According to a further aspect there is provided an apparatus for image processing of a number of spatially and/or temporally separated picture frames, in particular for motion and/or disparity estimation, comprising:
a transformation unit adapted to compress the picture frames at least partially and to provide binary pixel blocks of the picture frames,
a correlation unit adapted to correlate the binary pixel blocks, and
a detection unit for detecting correlated picture blocks in the set of picture frames on the basis of the correlation of the binary pixel blocks.
[0010] According to still further aspects a computer program comprising program means for causing a computer to carry out the steps of the method disclosed herein, when said computer program is carried out on a computer, as well as a non-transitory computer-readable recording medium that stores therein a computer program product, which, when executed by a processor, causes the method disclosed herein to be performed are provided.
[0011] Preferred embodiments are defined in the dependent claims. It shall be understood that the claimed apparatus, the claimed computer program and the claimed computer-readable recording medium have similar and/or identical preferred embodiments as the claimed method and as defined in the dependent claims.
[0012] One of the aspects of the present disclosure is to implement a motion and/or a disparity estimation using multiple frames in time and/or space. In order to determine correlated pixel blocks in the temporal or spatial related picture frames, the respective picture frames are binarized at least partially to provide respective binary pixel blocks of the pixel frames. Preferably, the pixel blocks which are estimated to be correlated are binarized and compressed to binary pixel blocks in order to reduce the technical effort and the time to process those pixel blocks. After the binarization, the binary pixel blocks are correlated in order to determine the correlated picture blocks in the set of picture frames on the basis of the correlation of the binary pixel blocks.
[0013] Since the picture frames are compressed by means of the binarization, the technical effort and the time for the image correlation is reduced and the correlated picture blocks can be determined with low technical effort by binary pixel block correlation.
[0014] It is to be understood that both the foregoing general description of the invention and the following detailed description are exemplary, but are not restrictive, of the invention.
BRIEF DESCRIPTION OF THE DRAWINGS
[0015] A more complete appreciation of the disclosure and many of the attendant advantages thereof will be readily obtained as the same becomes better understood by reference to the following detailed description when considered in connection with the accompanying drawings, wherein:
Fig. 1 shows a schematic block diagram illustrating an embodiment of the method for image processing;
Fig. 2 shows a block diagram illustrating the correlation of pixel blocks in a set of picture frames;
Fig. 3 shows picture frames and binarized picture frames for illustrating the binary correlation;
Fig. 4 shows a diagram for illustrating the cross-correlation of the binarized pixel blocks of Fig. 3; and
Fig. 5 shows a schematic block diagram of an apparatus for image processing.
DESCRIPTION OF THE EMBODIMENTS
[0016] Referring now to the drawings, wherein like reference numerals designate identical or corresponding parts throughout the several views, Fig. 1 shows a schematic block diagram of an embodiment of a method for image processing which is generally denoted by 10.
[0017] A video stream comprising different picture frames is provided as indicated by 12, wherein the picture frames are spatially and/or temporally separated picture frames from which a certain motion or a certain disparity should be estimated and/or certain frames should be interpolated. In a following step as indicated by 14, the picture frames are preprocessed. The preprocessing step 14 may include filtering, scaling in order to reduce the pixel size and/or the resolution of the picture frames, contrast enhancement of the picture frames and/or frequency manipulation such as low-pass filtering or high-pass filtering which may be applied depending on the noise level of the picture frames. The so-preprocessed image frames are binarized in a further step as indicated by 16. This binarization performs a binary compression of the image frames, i.e. the input image is reduced from 8 or 10 or 12 bit per pixel to one bit per pixel. The binarization may be performed by calculating an average level of the pixels of the image frame or an average level of a certain pixel block of the image frame and by comparing each pixel value of the image frame or the certain image block with the respective average value. Depending on the pixel value with respect to the average value, a 1 or a 0 is assigned to the pixel in order to provide the binary image frame or the binary pixel block. If the pixel value is equal to the average value, a 1 or a 0 can be assigned to the respective pixel and it is integrated in a comparison.
[0018] Alternatively each pixel value can be compared with a central value and a 1 or a 0 is assigned to the respective pixel depending on the comparison. In general, the pixel values of the image frame or of a certain block are compared with a certain value and a 1 or a 0 is assigned to the respective pixel to binarize the image frames.
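The binarization of paragraphs [0017] and [0018] can be illustrated with a minimal Python sketch. The function name `binarize_block`, the list-of-rows block layout and the tie rule (a pixel equal to the threshold becomes 1) are assumptions made for this example; the disclosure leaves the tie rule and the choice of threshold open.

```python
from statistics import mean


def binarize_block(block, threshold=None):
    """Reduce a block of multi-bit pixel values to one bit per pixel.

    block: list of rows, each a list of integer pixel values (e.g. 8 bit).
    threshold: comparison level; defaults to the average of the block.
    """
    if threshold is None:
        threshold = mean(v for row in block for v in row)
    # Tie rule (pixel == threshold -> 1) is an arbitrary choice for this sketch.
    return [[1 if v >= threshold else 0 for v in row] for row in block]


block = [
    [10, 200, 30],
    [220, 40, 210],
    [20, 230, 50],
]
binary = binarize_block(block)
for row in binary:
    print(row)
```

Passing an explicit `threshold` corresponds to the predefined or central comparison value mentioned in [0018]; with 8-bit input the binary block needs one eighth of the original bits.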
[0019] The binarized image frames or pixel blocks are blockwise cross-correlated as indicated by 18. The cross correlation is performed between a binarized source block of a first picture frame and a search block or search area of a second picture frame. The search block may be larger than the source block in order to determine the correlated pixel blocks of the two different image frames.
[0020] The search block or the search area is determined on the basis of an estimation process in order to reduce the search effort to find the correlated pixel blocks. The estimation process selects from a set of vectors a plurality of candidate vectors which are assigned to the source block and point to the search area, where the pixel block correlated to the source block is estimated. The estimation may be a motion estimation process.
[0021] The cross-correlation 18 provides a cross-correlation value by multiplying each pixel of the source block with a corresponding pixel of the search area. The sum of the multiplication results of the pixels forms the cross-correlation result, wherein a high value indicates that most of the compared pixels have an identical value (0 or 1) and have a high degree of correlation. The cross correlation is performed for different positions of the source block in the search area so that a matching can be found if a correlated pixel block is present in the search area. If the correlated pixel blocks are found, a respective vector from the source block to the correlated pixel block in the search area is determined and the determined vector is provided as an output vector as indicated by 20. Further, the so-determined vector is fed back to the cross-correlation 18 in order to select the candidate vectors from the field of vectors for a cross correlation of the next picture frame. The step of feeding the output vector back to the cross correlation in order to select candidate vectors is called predictor setup and is denoted by 22.
[0022] The multiplication of the pixels of the binary pixel blocks may be implemented as a simple XNOR function. Alternatively, the cross-correlation may be implemented as an XOR function, wherein in that case the lowest value of the cross-correlation result indicates that most of the compared pixels have an identical value (0 or 1) and have a high degree of correlation.
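As a rough sketch of the two alternatives, the following Python functions sum the pixel-by-pixel XNOR or XOR results of two equally sized binary blocks. The names `xnor_score` and `xor_score` and the list-of-rows layout are illustrative assumptions, not taken from the disclosure.

```python
def xnor_score(a, b):
    """Sum of pixel-wise XNOR: counts positions where both blocks agree."""
    return sum(1 for row_a, row_b in zip(a, b)
               for pa, pb in zip(row_a, row_b) if pa == pb)


def xor_score(a, b):
    """Sum of pixel-wise XOR: counts positions where the blocks differ."""
    return sum(pa ^ pb for row_a, row_b in zip(a, b)
               for pa, pb in zip(row_a, row_b))


source = [[1, 0], [0, 1]]
candidate = [[1, 0], [1, 1]]
agree = xnor_score(source, candidate)    # 3 of 4 pixels agree
differ = xor_score(source, candidate)    # 1 of 4 pixels differs
```

For blocks of N pixels the two sums always add up to N, which is why the highest XNOR value and the lowest XOR value identify the same match.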
[0023] The method 10 is usually carried out as 3D or 2D recursive motion estimation, wherein the highest correlation value is defined as a match of the correlated pixel blocks.
[0024] Fig. 2 shows an example of two picture frames, which are in Fig. 1 indicated with the reference numerals 24, 26. The picture frames 24, 26 are captured e.g. by two cameras as left and right frames for three-dimensional imaging or as two time instances t, t+1 of a sequence of consecutive images captured by one camera. As generally known, in a digital electronics environment, a frame is built up of pixels, each pixel carrying information for example on colour etc. In Fig. 1 the picture frames 24, 26 are built of an array of pixel blocks 28 each comprising an array of pixels.
[0025] In Fig. 2 a source pixel block 30 is schematically shown in a first position within the picture frame 24. In the second picture frame 26, the pixel block corresponding to the source pixel block 30 is located at a different position, wherein the difference between both positions characterizes a movement of the respective pattern or the disparity between both corresponding pixel blocks. The present method is provided to identify the position of the pixel block corresponding to the source pixel block 30 in the second picture frame 26 in order to determine the disparity between both corresponding pixel blocks and/or the movement of the respective corresponding pixel blocks from one picture frame 24 to the other picture frame 26.
[0026] In order to identify the pixel block corresponding to the source pixel block 30 in the second picture frame 26, a plurality of estimation vectors 32, 34, 36, 38, 40 are determined or selected from a field of vectors, each estimating a movement or a disparity of the corresponding pixel block and pointing from the position of the source pixel block 30 to an estimated position 42, 44, 46, 48, 50 of the corresponding pixel block in the second picture frame 26. It should be noted that the estimated positions 42-50 are coarse estimations of the position of the corresponding pixel block, estimated e.g. on the basis of previously determined motion vectors and/or disparity vectors, and that the correct position of the corresponding pixel block in the second picture frame 26 has to be determined or verified by a following image processing step.
[0027] Departing from the estimated positions 42-50, a search area 52, 54, 56, 58, 60 or a search pixel block 52-60 is determined for each of the estimated positions 42-50. The search areas 52-60 are usually larger than the source block 30 and usually surround the estimated positions 42-50. In order to identify the correct position of the corresponding pixel block, the source pixel block 30 and the pixel blocks of the search areas 52-60 are binarized to binary pixel blocks and compared by means of a cross-correlation. Since the search areas 52-60 are larger than the source pixel block 30, the source pixel block 30 is successively compared to different positions in the search areas 52-60 and a cross-correlation value is determined for each position by multiplying the binary pixel blocks of the source block 30 and the search area 52-60 or by means of an XNOR operator, wherein the results of the pixel-by-pixel comparison are added to determine the cross-correlation value. Hence, the best matching of the source block 30 and the search areas 52-60 leads to the highest cross-correlation value and, therefore, the pixel block in the search area 52-60 corresponding to the source pixel block 30 can be determined in a fast processing with a high reliability. If the alternative XOR function is used for the cross-correlation of the source block 30 and the search area 52-60 and the cross-correlation value is correspondingly determined by adding the results of each of the XOR operations, the best matching of the source block 30 and the search areas 52-60 leads to the lowest cross-correlation value. Hence, the pixel block in the search area 52-60 corresponding to the source pixel block 30 can be determined also on the basis of the XOR operation with a high reliability. The pixel block corresponding to the source pixel block 30 determined in the second picture frame 26 by means of the cross-correlation is indicated as an example by reference numeral 62.
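The successive comparison of the source block with different positions inside one search area can be sketched as follows. This is a simplified illustration for the XNOR variant; the function name `best_match`, the exhaustive scan order and the rule that the first position wins on ties are assumptions.

```python
def best_match(source, search_area):
    """Slide the binary source block over the binary search area.

    Returns (score, dx, dy), where (dx, dy) is the offset of the best
    position from the top-left corner of the search area and score is
    the summed XNOR result at that position.
    """
    sh, sw = len(source), len(source[0])
    ah, aw = len(search_area), len(search_area[0])
    best = None
    for dy in range(ah - sh + 1):
        for dx in range(aw - sw + 1):
            score = sum(
                1
                for y in range(sh)
                for x in range(sw)
                if source[y][x] == search_area[dy + y][dx + x]
            )
            # Assumption: on equal scores the first position found is kept.
            if best is None or score > best[0]:
                best = (score, dx, dy)
    return best


source = [[1, 0],
          [0, 1]]
search_area = [[0, 0, 0, 0],
               [0, 1, 0, 0],
               [0, 0, 1, 0],
               [0, 0, 0, 0]]
result = best_match(source, search_area)   # (4, 1, 1): all four pixels agree
```

For the XOR variant the comparison `score > best[0]` would become `score < best[0]` with the count of differing pixels as the score.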
[0028] On the basis of the so-determined corresponding pixel block 62, a motion vector or disparity vector 64 pointing from the source pixel block 30 to the corresponding pixel block 62 in the second picture frame 26 is determined and provided as the image processing result. The so-determined output vector 64 is used to determine the estimated vectors 32-40, e.g. by selecting the most probable vectors from a field of estimated vectors 42-50 as candidate vectors for the next search and cross-correlation step in the following picture frame. In other words, the determination of the corresponding pixel block 62 is based on the results of the previous image processing steps or the analysis of the previous picture frames, which are fed back as mentioned above by means of the predictor setup 22.
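How an output vector might be derived from a candidate vector and then reused as a predictor can be sketched as below. All names and conventions are hypothetical: the search area is assumed to extend `margin` pixels on every side of the estimated position, `(best_dx, best_dy)` is assumed to be the offset of the match from the top-left corner of that search area, and the candidate list is assumed to hold at most five entries, mirroring the five vectors 32-40 drawn in Fig. 2.

```python
def output_vector(candidate, margin, best_dx, best_dy):
    """Combine a candidate vector with the match offset in its search area.

    Offset (margin, margin) inside the search area is the estimated
    position itself, so it reproduces the candidate vector unchanged.
    """
    return (candidate[0] + best_dx - margin,
            candidate[1] + best_dy - margin)


def next_candidates(previous_output, neighbour_outputs, limit=5):
    """Predictor setup sketch: reuse earlier output vectors as candidates.

    Assumption: the previous output vector and the zero vector are always
    tried first, followed by output vectors of neighbouring blocks.
    """
    ordered = [previous_output, (0, 0)] + list(neighbour_outputs)
    unique = []
    for v in ordered:
        if v not in unique:
            unique.append(v)
    return unique[:limit]


vec = output_vector((5, -2), 3, 4, 1)                       # (6, -4)
cands = next_candidates(vec, [(6, -4), (5, -2), (0, 0)])    # [(6, -4), (0, 0), (5, -2)]
```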
[0029] In a preferred embodiment, the size of the search areas 52-60 is variable and is changed on the basis of a reliability of the estimation of the estimation vectors 32-40. If the reliability of the estimation vectors 32-40 is high so that the estimated positions 42-50 are expected to be very close to the correlated pixel block 62 to be determined, the size of the search areas 52-60 is reduced. If the reliability of the estimation of the estimation vectors 32-40 is low, the size of the search areas 52-60 is increased so that the correlated pixel block 62 can be found within the larger search areas 52-60 with a high probability.
[0030] The reliability of the estimation vectors 32-40 can be determined on the basis of an amount of the estimated vectors 32-40, wherein a large amount of estimation vectors 32-40 or candidate vectors 32-40 indicates that the estimation of the estimated positions 42-50 has a low reliability so that the size of the search areas 52-60 should be increased in order to find the correlated pixel block 62 with a higher probability. Alternatively, the reliability estimation can be based on a uniformity of the estimation vectors 32-40, wherein the estimation is considered to have a poor reliability if the vector candidates 32-40 point in many different directions, i.e. have a low uniformity, and to have a high reliability if the estimation vectors 32-40 point in one direction, i.e. have a high uniformity. If the uniformity of the candidate vectors 32-40 is high, the size of the search area 52-60 can be reduced. Additionally, the reliability estimation can be based on a correlation result of the binary pixel blocks of different picture frames determined in a previous processing step. In other words, if a correlation value of the binary pixel blocks is high for the XNOR operation, the correlated pixel block 62 is determined with a high reliability. Alternatively, if the XOR operation is used, the correlated pixel block 62 is determined with a high reliability if the correlation value of the binary pixel blocks is low. If the output vector 64 of the previous processing step, which is used to determine the candidate vectors or the estimation vectors, has a high reliability, the estimation of the candidate vectors also has a high reliability. In that case, the size of the search area can be reduced since the reliability of the estimation vector is considered to be high.
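One conceivable way to turn the correlation result of a previous processing step into a search-area size is sketched below for the XNOR case. The function name, the linear mapping, the margin limits of 2 and 16 pixels and the idea of treating 50 % agreement as chance level are all assumptions for illustration; the description does not prescribe any particular formula.

```python
def margin_from_previous_match(score, num_pixels, min_margin=2, max_margin=16):
    """Map the XNOR score of the previous match to a search margin in pixels.

    score: summed XNOR result of the previous match.
    num_pixels: number of pixels in the source block.
    """
    agreement = score / num_pixels                  # 1.0 means every pixel agreed
    # Assumption: about half of all pixels agree by chance for binary data,
    # so agreement of 0.5 or less is treated as no confidence at all.
    confidence = max(0.0, 2.0 * agreement - 1.0)
    return round(max_margin - confidence * (max_margin - min_margin))


small = margin_from_previous_match(64, 64)   # perfect match: margin 2
large = margin_from_previous_match(32, 64)   # chance-level match: margin 16
```

With an XOR score the same mapping could be applied after converting it via `num_pixels - xor_score`.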
[0031] The plurality of estimation vectors 32-40 are usually selected from a field of estimation vectors and are selected on the basis of a motion and/or disparity relation of the different picture frames. The estimated vectors 32-40 can be selected from the field of estimation vectors on the basis of a general movement of the image pattern within the picture frames. In other words, if a general movement e.g. of the background of the image is determined in the picture frames, the estimated vectors 32-40 are selected on the basis of this detected general movement of the image pattern. The general movement can also be determined on the basis of a movement sensor within the used camera which determines the movement of the camera independently and wherein the general movement of the image pattern within the picture frames is estimated on the basis of the measurement of the motion sensor. Further, the estimation vectors 32-40 can be determined on the basis of scaled picture frames of the set of picture frames, wherein the pixel size of the scaled picture frames is reduced in order to provide a fast movement detection within the scaled picture frame so that the coarse position of the correlated pixel block 62 can be determined quickly on the basis of a coarse picture calculation.
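The use of scaled picture frames for a coarse estimate can be illustrated with a small sketch. The 2x2 averaging, the scale factor of 2 and the function names are assumptions; any vector found on the reduced frame would only serve as a candidate vector that still has to be verified at full resolution.

```python
def downscale_2x(frame):
    """Halve the resolution by averaging each 2x2 group of pixels."""
    h, w = len(frame) // 2, len(frame[0]) // 2
    return [
        [
            (frame[2 * y][2 * x] + frame[2 * y][2 * x + 1]
             + frame[2 * y + 1][2 * x] + frame[2 * y + 1][2 * x + 1]) / 4
            for x in range(w)
        ]
        for y in range(h)
    ]


def upscale_vector(vector, factor=2):
    """Convert a vector found on the scaled frame to full-resolution pixels."""
    return (vector[0] * factor, vector[1] * factor)


coarse = downscale_2x([[0, 4, 8, 8],
                       [4, 8, 8, 8]])    # [[4.0, 8.0]]
candidate = upscale_vector((3, -1))      # (6, -2)
```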
[0032] In a further embodiment, the size of the search area 52-60 can be adapted independently in X-direction and in Y-direction depending on the estimation vectors 32-40.
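A per-axis adaptation could, for example, be driven by how strongly the candidate vectors scatter in each direction. The following sketch uses the mean absolute deviation per axis as a stand-in for the uniformity measure; this measure, the function name and the margin limits are assumptions and are not specified in the description.

```python
import math


def per_axis_margins(vectors, min_margin=2, max_margin=16):
    """Return (margin_x, margin_y) from the scatter of the candidate vectors."""
    n = len(vectors)
    mean_x = sum(vx for vx, _ in vectors) / n
    mean_y = sum(vy for _, vy in vectors) / n
    # Mean absolute deviation per axis: 0 when all vectors agree on that axis.
    spread_x = sum(abs(vx - mean_x) for vx, _ in vectors) / n
    spread_y = sum(abs(vy - mean_y) for _, vy in vectors) / n

    def clamp(margin):
        return max(min_margin, min(max_margin, margin))

    return (clamp(min_margin + math.ceil(spread_x)),
            clamp(min_margin + math.ceil(spread_y)))


# Candidates that disagree only horizontally widen the search in X only.
margins = per_axis_margins([(4, 0), (6, 0), (5, 0), (5, 0)])   # (3, 2)
```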
[0033] Fig. 3 shows two picture frames as captured and two corresponding binarized picture frames and the respective source blocks for illustrating the cross-correlation process 18 in order to determine the corresponding pixel block 62. Fig. 3a shows a first picture frame 66 corresponding to the first picture frame 24 and a second picture frame 68 corresponding to the second picture frame 26 shown in Fig. 2. Further, a source block 70 as part of the first picture frame 66 is shown corresponding to the source block 30 shown in Fig. 2. The picture frames 66, 68 and the source block 70 shown in Fig. 3a are monochrome or coloured pictures as captured by the respective camera.
[0034] Fig. 3b shows a first binarized picture frame 72 and a second binarized picture frame 74 corresponding to the first picture frame 66 and the second picture frame 68 shown in Fig. 3a. Further, a binarized source block 76 corresponding to the source block 70 shown in Fig. 3a is shown in Fig. 3b.
[0035] The binarization of the picture frames 66, 68 is performed as described above, wherein the pixel value of each pixel of the picture frames 66, 68 is compared to a certain pixel threshold level and depending on the comparison result a 0 or a 1 is assigned as a binary pixel value to the respective pixel in order to provide the binarized picture frames 72, 74. The threshold value can be predefined or can be determined as a medium value of all pixels of the picture frames 66, 68 or of a certain block or area of the picture frames 66, 68. On the basis of the binarized first picture frame 72, the binarized source block 76 is determined in order to determine the respective corresponding pixel block 62 in the binarized second picture frame 74. Alternatively, merely the source block 70 extracted from the captured first picture frame 66 can be binarized in order to reduce the image processing effort.
[0036] In the binarized second picture frame 74, the search area 52-60 is determined as mentioned above and the binarized source block 76 is correlated to the search area 52-60 of the binarized second picture frame 74. The correlation is performed by cross-correlation as mentioned above, wherein a pixel by pixel XNOR (or alternatively XOR) operation or a multiplication of the pixel values is performed for different search positions of the binarized source block 76 within the search area 52-60. The results of the pixel by pixel XNOR (or XOR) or multiplication operation are added in order to determine a sum as a single cross-correlation value. The position of the binarized source block 76 within the search area 52-60 providing the highest cross-correlation value is defined as a matching position in the case of the XNOR operation and is determined as corresponding pixel block 62 which corresponds to the source pixel block 70. Alternatively, in the case of the XOR operation, the position of the binarized source block 76 within the search area 52-60 providing the lowest cross-correlation value is defined as the matching position.
[0037] Fig. 4 shows a map of cross-correlation values determined for different positions of the binarized source block 76 within the search area 52-60 by means of an XNOR operation. The cross-correlation value is generally denoted by A. The map shown in Fig. 4 shows different values depending on the X-position and the Y-position of the binarized source block 76 within the search area 52-60. The cross-correlation value map shows different low peaks and a single high peak, which is generally denoted by 78. The high peak 78 is obviously the highest cross-correlation value A for the search shown in Fig. 4 so that this respective X-position and Y-position is defined as the position of the corresponding pixel block 62 for this XNOR operation. In the alternative case of the use of the XOR operation for the cross-correlation, the cross-correlation value map comprises different high peaks and a single low peak. In that case, the low peak indicates the best correlation for the search so that this respective X-position and Y-position is defined as the matching position of the corresponding pixel block 62. Due to the cross-correlation and the so-determined cross-correlation value A, a fast matching can be achieved since the cross correlation in general provides a fast convergence of the calculation process.
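The cross-correlation value A plotted in such a map can be written compactly as follows. The symbols S (binarized source block of H rows and W columns), R (binarized search area), and the indices i, j are notation introduced here for illustration only; the description itself defines only the value A and the X- and Y-position.

```latex
% XNOR variant: count of agreeing pixels at search position (x, y)
A_{\mathrm{XNOR}}(x, y) = \sum_{i=0}^{H-1} \sum_{j=0}^{W-1}
    \overline{S(i, j) \oplus R(y + i,\; x + j)}

% XOR variant: count of differing pixels at the same position
A_{\mathrm{XOR}}(x, y) = \sum_{i=0}^{H-1} \sum_{j=0}^{W-1}
    S(i, j) \oplus R(y + i,\; x + j)
  = H\,W - A_{\mathrm{XNOR}}(x, y)

% Matching position
(x^{*}, y^{*}) = \arg\max_{(x, y)} A_{\mathrm{XNOR}}(x, y)
               = \arg\min_{(x, y)} A_{\mathrm{XOR}}(x, y)
```

Because the two sums always add up to the number of pixels H·W, the single high peak of the XNOR map and the single low peak of the XOR map fall on the same position.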
[0038] In an alternative embodiment of the present invention, the correlation of the binarized source block 76 and the binarized second picture frame 74 is performed by phase correlation, wherein edges in the source block 76 and the second binarized picture frame 74 are correlated in order to identify the correlated pixel block 62.
[0039] Fig. 5 shows a schematic block diagram of an apparatus for image processing on the basis of the image processing method 10 described above. The apparatus is generally denoted by 80.
[0040] As an input, the apparatus 80 receives a plurality of image frames 66, 68 from an imaging device 82 such as a camera. The picture frames 66, 68 are preprocessed by means of a pre-processing unit 84 which comprises a filter unit, a scaling unit and/or a contrast-enhancement unit for performing the pre-processing step 14 shown in Fig. 1. The pre-processed picture frames 86 are provided to a binarization unit 88. The binarization unit 88 performs a binary compression of the pre-processed picture frames 86 as mentioned above and provides the binarized picture frames 72, 74 and the binarized source block 76 to a cross-correlation unit 90. The cross-correlation unit 90 performs the cross-correlation step 18 as mentioned above and provides the cross-correlation value A as a result of the cross-correlation 18 to an evaluation unit 92, which evaluates the position corresponding to the highest peak 78 for an XNOR operator (or the lowest peak for an XOR operator) of the cross-correlation value A and determines the output vector 64 as a result of the image processing, pointing from the source block 30 to the correlated pixel block 62. The output vector 64 is also fed back to the cross-correlation unit 90 in order to determine the estimation vectors on the basis of the output vector. The feedback loop is called predictor setup.
[0041] The method 10 for image processing and the apparatus 80 for image processing have the advantage that a fast convergence can be provided, so that the image processing can be performed with low time effort and the technical effort to perform this method is reduced.
[0042] Obviously, numerous modifications and variations of the present disclosure are possible in light of the above teachings. It is therefore to be understood that within the scope of the appended claims, the invention may be practiced otherwise than as specifically described herein.
[0043] In the claims, the word "comprising" does not exclude other elements or steps, and the indefinite article "a" or "an" does not exclude a plurality. A single element or other unit may fulfill the functions of several items recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.
[0044] In so far as embodiments of the invention have been described as being implemented, at least in part, by software-controlled data processing apparatus, it will be appreciated that a non-transitory machine-readable medium carrying such software, such as an optical disk, a magnetic disk, semiconductor memory or the like, is also considered to represent an embodiment of the present invention. Further, such software may also be distributed in other forms, such as via the Internet or other wired or wireless telecommunication systems.
[0045] Any reference signs in the claims should not be construed as limiting the scope.

Claims

1. Method for image processing, in particular for motion and/or disparity estimation, comprising:
providing a set of temporal or spatial related picture frames containing correlated picture blocks,
transforming the picture frames by binary compressing the picture frames at least partially to provide binary pixel blocks of the pixel frames,
correlating the binary pixel blocks, and
determining the correlated picture blocks in the set of picture frames on the basis of the correlation of the binary pixel blocks.
2. Method according to claim 1, wherein the correlating of the binary pixel blocks comprises a cross correlation of the binary pixel blocks.
3. Method as claimed in claim 1, wherein a source pixel block is defined and binarized in a first of the picture frames and a search pixel block is defined and binarized in a second of the picture frames.
4. Method as claimed in claim 3, wherein a cross-correlation function is calculated by cross-correlation of the source pixel block and the search pixel block and wherein the correlated pixel blocks are determined on the basis of an absolute value of the cross-correlation function.
5. Method as claimed in claim 3, wherein the search pixel block comprises a larger amount of pixels than the source pixel block.
6. Method as claimed in claim 1, further comprising the step of providing a plurality of vector candidates for each picture frame indicating at least one estimated motion and/or disparity relation between two of the picture frames.
7. Method as claimed in claim 6, wherein the search pixel blocks are determined on the basis of the vector candidates.
8. Method as claimed in claim 6, wherein the vector candidates are determined on the basis of a motion and/or a disparity relation of correlated pixel blocks of a plurality of picture frames.
9. Method as claimed in claim 6, wherein the vector candidates are determined on the basis of a general movement of image pattern within the picture frames.
10. Method as claimed in claim 6, wherein the vector candidates are determined on the basis of scaled picture frames of the set of picture frames having a reduced pixel size.
11. Method as claimed in claim 6, wherein a size of the search pixel block is adapted on the basis of a reliability estimation of the search pixel block determination.
12. Method as claimed in claim 11, wherein the reliability estimation is based on an amount of the vector candidates estimated for the image frame.
13. Method as claimed in claim 11, wherein the reliability estimation is based on a uniformity of the estimated vector candidates.
14. Method as claimed in claim 11, wherein the reliability estimation is based on a correlation of the binary pixel blocks of different picture frames determined in a previous processing step.
15. Method as claimed in claim 1, further comprising a step of preprocessing in advance of the transformation including filtering, scaling and/or contrast enhancement of the picture frames.
16. Method as claimed in claim 1, wherein the correlation of the binary pixel blocks comprises a phase correlation of the binary pixel blocks.
17. Apparatus for image processing of a number of spatially and/or temporally separated picture frames, in particular for motion and/or disparity estimation, comprising:
a transformation unit adapted to compress the picture frames at least partially and to provide binary pixel blocks of the picture frames,
a correlation unit adapted to correlate the binary pixel blocks, and
a detection unit for detecting correlated picture blocks in the set of picture frames on the basis of the correlation of the binary pixel blocks.
18. A computer program comprising program code means for causing a computer to perform the steps of said method as claimed in claim 1 when said computer program is carried out on a computer.
19. A non-transitory computer-readable recording medium that stores therein a computer program product, which, when executed by a processor, causes the method according claim 1 to be performed.
PCT/EP2014/055687 2013-03-25 2014-03-21 Method, apparatus and computer program for image processing WO2014154574A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP13160838 2013-03-25
EP13160838.2 2013-03-25

Publications (1)

Publication Number Publication Date
WO2014154574A1 (en) 2014-10-02

Family

ID=47913273

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2014/055687 WO2014154574A1 (en) 2013-03-25 2014-03-21 Method, apparatus and computer program for image processing

Country Status (1)

Country Link
WO (1) WO2014154574A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1996032807A2 (en) * 1995-04-04 1996-10-17 Chromatic Research, Inc. Method and structure for performing motion estimation using reduced precision pixel intensity values
WO1998043436A1 (en) * 1997-03-25 1998-10-01 Level One Communications, Inc. Method for simplying block matching motion estimation
US20110229056A1 (en) * 2010-03-19 2011-09-22 Sony Corporation Method for highly accurate estimation of motion using phase correlation

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JIAN FENG ET AL: "Adaptive block matching motion estimation algorithm using bit-plane matching", PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING. (ICIP). WASHINGTON, OCT. 23 - 26, 1995; [PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING. (ICIP)], LOS ALAMITOS, IEEE COMP. SOC. PRESS, US, vol. 3, 23 October 1995 (1995-10-23), pages 496 - 499, XP010197230, ISBN: 978-0-7803-3122-8, DOI: 10.1109/ICIP.1995.537680 *
YONGTAE KIM ET AL: "Fast Disparity and Motion Estimation for Multi-view Video Coding", IEEE TRANSACTIONS ON CONSUMER ELECTRONICS, IEEE SERVICE CENTER, NEW YORK, NY, US, vol. 53, no. 2, 1 May 2007 (2007-05-01), pages 712 - 719, XP011186798, ISSN: 0098-3063, DOI: 10.1109/TCE.2007.381750 *

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14711755

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14711755

Country of ref document: EP

Kind code of ref document: A1