WO2023174546A1 - Method and image processor unit for processing image data - Google Patents


Publication number
WO2023174546A1
WO2023174546A1
Authority
WO
WIPO (PCT)
Prior art keywords
motion vector
frame
motion
block
frames
Application number
PCT/EP2022/057016
Other languages
English (en)
Inventor
Gregor Schewior
Noha El-Yamany
Original Assignee
Dream Chip Technologies Gmbh
Application filed by Dream Chip Technologies GmbH
Priority to PCT/EP2022/057016 (WO2023174546A1)
Priority to TW111149959 (TW202338734)
Publication of WO2023174546A1


Classifications

    • G06T5/90 Dynamic range modification of images or parts thereof
    • G06T5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G06T7/223 Analysis of motion using block-matching
    • G06T2207/10016 Video; Image sequence
    • G06T2207/20021 Dividing image into blocks, subimages or windows
    • G06T2207/20221 Image fusion; Image merging
    • H04N19/521 Processing of motion vectors for estimating the reliability of the determined motion vectors or motion vector field, e.g. for smoothing the motion vector field or for correcting motion vectors
    • H04N5/145 Movement estimation

Definitions

  • The invention relates to a method for processing image data of an image sensor, wherein the image data comprise a raw matrix of pixels per image, i.e. raw image data.
  • The invention further relates to an image processor unit for processing raw image data provided by an image sensor, said image sensor comprising a sensor array that provides a raw matrix of pixels per image.
  • the invention relates to a computer program arranged to carry out the steps of the aforementioned method.
  • Digital imagers are widely used in everyday consumer products, such as smartphones, tablets, notebooks, cameras, cars and wearables.
  • the use of small imaging sensors is becoming a trend to maintain a small, light-weight product form factor as well as to reduce the production cost.
  • The use of colour filter arrays, such as the common Bayer colour filter array, limits or reduces the spatial resolution, as the full-colour image is produced via interpolation (demosaicing) of undersampled colour channels.
  • In burst image signal processors (BISP), a burst of frames is captured, preferably with pre-defined (e.g. programmable) settings, and the frames are fused together to achieve a variety of goals.
  • each 2x2 block (quad) in the RGGB Bayer raw data is averaged to create a down-sized grayscale image (1/4 the resolution of the raw CFA data).
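The quad-averaging step described above can be sketched in a few lines of NumPy; `bayer_to_grayscale` is a hypothetical helper name, assuming a single-channel RGGB mosaic stored as a 2-D array:

```python
import numpy as np

def bayer_to_grayscale(raw: np.ndarray) -> np.ndarray:
    """Average each 2x2 RGGB quad of a raw Bayer frame into one
    grayscale pixel, yielding an image with 1/4 the pixel count."""
    h, w = raw.shape
    assert h % 2 == 0 and w % 2 == 0, "Bayer frame must have even dimensions"
    # group the frame into (h/2, w/2) quads of shape 2x2 and average them
    quads = raw.reshape(h // 2, 2, w // 2, 2).astype(np.float64)
    return quads.mean(axis=(1, 3))

# A 4x4 raw frame becomes a 2x2 grayscale image.
raw = np.arange(16, dtype=np.float64).reshape(4, 4)
gray = bayer_to_grayscale(raw)
```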
  • Multi-scale (pyramidal) motion estimation is performed on the down-sized grayscale image.
  • Block matching is pursued for motion estimation at each of the scales, with an L2 cost function being minimized at all levels of the image pyramid, except at the finest scale (down-sized grayscale image), where an L1 cost function is minimized.
  • Subpixel-level accuracy is sought during the motion estimation at all scales of the pyramid, except at the finest scale, where pixel-level accuracy is sought.
  • This strategy effectively limits the pixel displacements between the raw frames to be multiples of 2. This constraint is considered to be sufficient for the purpose of multi-frame denoising and high-dynamic-range (HDR) fusion applications.
  • Each frame’s contribution at every pixel is estimated through kernel regression and these contributions are accumulated separately per colour channel.
  • the kernel shapes are adjusted based on the estimated local gradients and the sample contributions are weighted based on a robustness model, which computes a per-pixel weight for every frame using the alignment field and local statistics gathered from the neighborhood around each pixel.
  • The final merged RGB image is obtained by normalizing the accumulated results per channel. This process of merging a burst of frames has the effect of boosting the perceived resolution; it also enables choosing the best shot or serves zero-shutter-lag use cases.
  • An additional step of three iterations of Lucas-Kanade optical flow image warping is proposed, which achieves subpixel-level accuracy, since pixel-level accuracy is not sufficient for the purpose of multi-frame super-resolution algorithms.
  • In digital single-lens reflex (SLR) cameras, a burst mode is supported to enable selecting the best shot, or to perform sophisticated multi-frame processing, such as a multi-frame super-resolution feature.
  • The burst rate, i.e. how many frames are taken in fast succession, varies but is increasing with advancing camera technologies.
  • An essential step in the development of many multi-frame (burst) processing solutions is image alignment.
  • the motion between the captured frames or the motion of selected frames with respect to an anchor (reference) frame is estimated, and the frames are subsequently aligned or registered to compensate for the global motion and/or local motions.
  • the aligned frames can then be fused in a variety of ways, depending on the intended feature.
  • Image alignment can take place in the raw colour-filter-array (CFA) domain or in the full-colour (e.g. RGB) domain, can be designed to achieve pixel-level accuracy or subpixel-level accuracy, and can be tailored to fit a variety of motion models, such as the translational, similarity, affine, and projective motion models.
  • Image alignment includes the estimation of motion in the burst of frames.
  • Techniques for motion estimation, ranging from pixel-based to feature-based solutions and supporting global and local motion estimation as well as a variety of motion models, are well known in the art.
  • Block Matching via exhaustive search is quite expensive computationally.
  • Significant speedup of the search can be achieved, for example via diamond search or other fast search algorithms.
  • the object of the present invention is to provide an improved method and an image processor unit for processing image data of an image sensor.
  • The method comprises the steps of:
  • step A) Determining forward and backward motion vectors by block matching between the frames of a burst of frames, wherein each forward motion vector in forward direction represents the displacement of a block in a selected anchor frame of the burst of frames to the best-matching block in a respective alternate frame of the burst of frames, and each backward motion vector in backward direction represents the displacement of a block in a respective alternate frame of the burst of frames to the best-matching block in the selected anchor frame of the burst of frames;
  • step B) Determining reliability factors for the motion vectors determined in step A) to assign the reliability of the respective motion vector for a given block by use of the difference between the forward motion vector and the related backward motion vector, wherein the reliability increases with decreasing difference;
  • step C) Aligning the frames of the burst of frames for an image, wherein the motion is compensated by use of weighted motion vectors, wherein the motion vectors are weighted with the respective reliability factor determined in step B).
  • the alignment can be used to register frames or parts thereof to respective image positions in related frames or parts thereof.
  • A frame refers to a full image captured by the image sensor or to a part of a full image, i.e. a frame can have a full or a reduced size.
  • The alignment of frames can be performed e.g. by alignment of a complete frame or full image by use of a global motion vector for the complete frame or image, or by alignment of a plurality of blocks of a frame by use of respective local motion vectors for the related blocks.
  • The frames considered for steps A) to C) can be a selected region of interest (ROI) in a larger frame captured, or able to be captured, by the image sensor.
  • Using weighted motion vectors has the effect of improving the quality of the motion compensation by overweighting the more reliable motion vectors and underweighting the less reliable ones. Weighting in this sense also includes the use of only two weighting factors, ZERO and ONE, to fully consider the motion vectors having a reliability above a given threshold and to fully disregard the motion vectors having a reliability below that threshold.
  • The method can be performed by estimating the motion directly from the raw colour filter array (CFA) image data of the image sensor.
  • the bi-directional motion estimation allows adaptive selection of anchor (reference) frames.
  • the motion estimation via explicit identification of unreliable motion estimates leads to improved robustness.
  • The motion can be estimated with arbitrary accuracy, e.g. pixel-level and/or subpixel-level accuracy. Subpixel-level motion estimation with arbitrary accuracy is achievable without the use of any iterative or multi-scale procedures.
  • the estimated subpixel-level motion vectors can be used to derive the parameters of a variety of motion models, supporting global and/or local motions. The method supports global motion as well as local motion, and different motion models.
  • The method can be implemented to perform real and direct image registration (i.e. alignment) in the raw domain for a chosen colour channel that possesses the highest sampling/strongest response. Since the offsets of the other colour channels in the CFA with respect to the chosen one are known and fixed, registration of those colour channels is done implicitly without any additional computational effort.
  • the method explicitly detects unreliable motion vectors by means of the reliability criteria. Thanks to the availability of forward and backward motion vectors, there is the possibility of adaptive/variable selection of the reference frame.
  • a number of different motion vectors reliability rules can be applied to obtain at least one reliability factor for a respective frame or part thereof, e.g. a set of reliability factors for a block with each reliability factor of the set being obtained by a related reliability rule so that the set includes reliability factors obtained by different predefined reliability rules.
  • A reliability factor of Zero (0) can be assigned to a set of forward motion vector and related backward motion vector, e.g. to a motion vector determined for the respective block or frame and used for further processing, in case the difference between the forward motion vector and the related backward motion vector exceeds a predefined threshold value, so that the motion vectors are disregarded in the further processing.
  • the motion vector determined for the respective block or frame and used for further processing can be for example the forward motion vector only.
  • A reliability factor of One (1) can be assigned to a set of forward motion vector and related backward motion vector, e.g. to a motion vector determined for the respective block or frame and used for further processing, in case the difference between the forward motion vector and the related backward motion vector is below a predefined motion consistency threshold value, so that the motion vectors are fully considered in the further processing.
  • the motion vector determined for the respective block or frame and used for further processing can be for example the forward motion vector only.
  • According to a second rule, a reliability factor of Zero (0) can be assigned to a motion vector in case the magnitude of at least one of the x-component in x-direction or the y-component in y-direction of the motion vector is equal to the block matching search size.
  • Otherwise, the reliability factor related to this second rule can be set to One (1).
  • a reliability factor can be assigned to a motion vector for a given block by use of the difference of the respective motion vector and the motion vectors of neighbouring blocks. This implicitly imposes a smoothness constraint on the motion vector field of a frame.
  • the sum of absolute differences can be calculated.
  • A fourth rule can be implemented with the purpose to identify only high-quality motion vectors as reliable.
  • A fourth-rule reliability factor can be assigned to the motion vector of a given block by use of the sum of absolute differences (SAD) that corresponds to the motion vector for the best-matching block.
  • The SAD value that corresponds to the best-match motion vector is compared to a pre-defined threshold; if it is above that threshold, the corresponding motion vector is considered to be unreliable, i.e. a reliability factor of Zero. Otherwise, the motion vector is considered reliable, so that the related reliability factor can be set to One.
  • A reliability factor of Zero (0) can be assigned in case the sum of absolute differences (SAD) that corresponds to the motion vector for the best-matching block exceeds a predefined threshold value. In that case, the motion vector is fully disregarded for the further processing of the image.
  • a total reliability factor can be calculated by combining a set of reliability factors assigned to a motion vector for a given block. This can simply be calculated by multiplying the set of reliability factors, which are determined by use of different rules for a related motion vector.
  • the motion vectors can be weighted e.g. with the respective total reliability factor in step C).
  • the method can be performed by use of any applicable strategy for evaluating matching of parts of images, in particular block matching.
  • The best-matching block can, for example, be determined by calculating the sum of absolute differences (SAD) between the selected block in a first frame and the block under search in a second frame, for a number of blocks under search having different relative positions in the second frame, and determining the best-matching block as the block having the smallest SAD.
  • A global motion vector can be determined for a frame by calculating an average motion vector from the set of weighted motion vectors for the frame, each weighted with the respective total reliability factor or with at least one of the reliability factors determined for the motion vectors of given blocks of the frame.
  • Estimation of motion with subpixel-level accuracy can be achieved as follows: For each block of a frame, a respective local subpixel-level motion vector can be determined. At least one reliability factor can be determined for each local subpixel-level motion vector, and the blocks in the alternate frame can be aligned blockwise (i.e. separately for each block of a frame, block by block) to the anchor frame by use of the local subpixel-level motion vectors determined for the respective blocks of the alternate frame.
  • the motion estimation process can be supported by such information, and the motion vector quality could be further improved.
  • Additional sensor signals could be used for each of the anchor and alternate frames. This can be used to address e.g. the issue that the motion model is valid for global motions and might not be fully accurate for objects that are not at the same distance from the imaging system.
  • The image sensor may capture image data such that a plurality of colour channels is provided. This might also include a white or grey colour channel.
  • For each frame, i.e. image, estimation of motion vectors is performed on a selected channel which has the highest information density, i.e. the highest-sampling colour channel. In the case of the Bayer colour filter array, this is the green channel, due to the R-G-G-B pattern.
  • The method therefore comprises the steps of selecting the colour channel having the highest information density compared to the information density of the other channels of the plurality of channels, aligning the raw matrix of pixels of the frame for the selected colour channel by interpolating missing pixels in the selected colour channel, and proceeding with the steps A) to C) on the interpolated pixels of this selected colour channel.
  • an image processor unit for processing raw image data provided by an image sensor.
  • Said image sensor comprises a sensor array providing a burst of frames captured by the image sensor per image, each frame comprising a raw matrix of pixels.
  • the image processor unit is arranged to:
  • step A) Determine forward and backward motion vectors by block matching between the frames of the burst of frames, wherein each forward motion vector in forward direction represents the displacement of a block in a selected anchor frame of the burst of frames to the best-matching block in a respective alternate frame of the burst of frames, and each backward motion vector in backward direction represents the displacement of a block in a respective alternate frame of the burst of frames to the best-matching block in the selected anchor frame of the burst of frames;
  • step B) Determine reliability factors for the motion vectors determined in step A) to assign the reliability of the respective motion vector for a given block by use of the difference between the forward motion vector and the related backward motion vector, wherein the reliability increases with decreasing difference;
  • step C) Align the frames of the burst of frames for an image, wherein the motion is compensated by use of weighted motion vectors, wherein the motion vectors are weighted with the respective reliability factor determined in step B).
  • the image processor unit is arranged for processing image data by performing the aforementioned method steps.
  • The object is further solved by the image processor unit comprising the features of one of the claims 18, 19 or 20.
  • the object is further solved by a computer program comprising instructions which, when the program is executed by a processing unit, cause the processing unit to carry out the steps of the aforementioned method.
  • Figure 1 Block diagram of an electronic device comprising a camera, an image processor unit and a mechanical actuator;
  • Figure 2 Flow diagram of the method for processing image data of an image sensor
  • Figure 3 Schematic diagram of block matching related to macroblock displacement in the reference (anchor) and alternate frame within a search window in the alternate frame;
  • Figure 4 Exemplary estimated motion vector field for a frame in pixel-level accuracy
  • Figure 5 Exemplary estimated motion vector field for a frame according to figure 4 with weighted motion vectors disregarding the unreliable motion vectors;
  • Figure 6 Exemplary estimated motion vector field for a frame according to figure 5 with weighted motion vectors in sub-pixel-level accuracy
  • Figure 7 Schematic diagram of the method adapted to higher-order motion models.
  • Figure 1 is an exemplary block diagram of an electronic device 1 comprising a camera 2 and an image processor unit 3 for processing raw image data IMGRAW provided by an image sensor 4 of the camera 2.
  • the image sensor 4 comprises an array of pixels so that the raw image IMGRAW is a data set in a raw matrix of pixels per image.
  • a colour filter array CFA is provided in the optical path in front of the image sensor 4.
  • the camera comprises an opto-mechanical lens system 5, e.g. a fixed uncontrolled lens.
  • the electronic device 1 further comprises a mechanical actuator 6.
  • the mechanical actuator 6 is provided in the electronic device primarily for other purposes, e.g. for signalling to a user. This is a well-known feature of smartphones for signalling incoming new messages or calls.
  • the electronic device 1 can be a handheld device like a smartphone, a tablet, wearables or a camera or the like.
  • The image processor unit 3 is arranged for processing image data IMGRAW from the image sensor 4 capturing a burst of images/frames F1...N for one image.
  • The image processor unit 3 is arranged to process estimated motion vectors MV, to align the matrices of pixels of the bursts of captured images to one specific alignment, and to combine the burst of images into a resulting image IMGFIN by use of the plurality of pixels of the raw matrices available for each pixel position of the matrix of the resulting image.
  • FIG. 2 shows a flow diagram of the method for processing raw image data IMGRAW of an image sensor 4.
  • the raw image data IMGRAW comprises a burst of images, or a burst of frames FREF, FALT_1, FALT_2, ... , FALT_N captured for one image.
  • the raw image data IMGRAW are processed by automatically performing at least the steps a) to h) e.g. on a respectively arranged digital signal processor or by software running on an image processor.
  • The first step a), or set of steps a0), a1), a2), ..., aN) for each frame of a burst of frames for an image, is designed for Colour Channel Interpolation CCI.
  • the colours are under-sampled, e.g. 50% green, 25% blue and 25% red in the standard RGGB Bayer sensor. It is proposed to perform alignment on the colour channel that has the highest sampling in the CFA and/or the channel that offers more details/higher response.
  • An example of such a channel is the green channel in the standard RGGB Bayer CFA or the white channel in the RGBW Bayer CFA.
  • the colour channels in the CFA are typically under-sampled.
  • The first step in the image alignment pipeline is to interpolate the selected colour channel to fill in the gaps due to the missing samples. Interpolation is performed only at the missing sample positions, in order to generate full-resolution colour channel data. Detail-preserving interpolation, such as bi-cubic/bilateral interpolation, could be pursued. However, if computational complexity is of concern, simpler interpolation, such as bilinear interpolation, could be done.
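For the green channel of a standard RGGB Bayer mosaic (green samples at positions where row + column is odd), the bilinear variant of this interpolation might be sketched as follows; the edge-replication border handling is an illustrative assumption, not specified by the patent:

```python
import numpy as np

def interpolate_green(raw: np.ndarray) -> np.ndarray:
    """Fill in the missing green samples of an RGGB Bayer frame by
    bilinear interpolation: each R/B site takes the average of its four
    green neighbours (green sites are where row + col is odd)."""
    h, w = raw.shape
    green = raw.astype(np.float64).copy()
    padded = np.pad(green, 1, mode="edge")  # replicate borders
    for r in range(h):
        for c in range(w):
            if (r + c) % 2 == 0:  # R or B site: green sample missing here
                green[r, c] = (padded[r, c + 1] + padded[r + 2, c + 1] +
                               padded[r + 1, c] + padded[r + 1, c + 2]) / 4.0
    return green
```

Only the missing positions are written; the original green samples pass through unchanged, as the text above requires.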
  • Motion estimation will then be performed on the interpolated, full-resolution colour channel. Since the positions of the other colour channels in the CFA with respect to the selected channel are known, motion estimation is implicitly performed for those channels as well, without additional computational effort. For simplicity, the selected interpolated, full-resolution colour channel is referred to as C.
  • The anchor (reference) frame/frame region of interest is denoted by C_anchor and the alternate frame/frame region of interest (ROI) is denoted by C_alternate.
  • In step b), or the division of the routine into steps b0), b1), b2), ..., bN), each for a respective frame, Colour Channel Smoothing CCS is performed.
  • This step b) is designed to lowpass-filter both selected colour channels C_anchor and C_alternate. This is performed in order to make the motion estimation operations robust against noise in the raw data.
  • Lowpass filtering is achieved by convolving C_anchor and C_alternate with a smoothing filter, such as e.g. a two-dimensional Gaussian filter.
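A minimal separable implementation of such a Gaussian smoothing filter might look as follows; the sigma and radius values are illustrative defaults, not values from the patent:

```python
import numpy as np

def gaussian_smooth(channel: np.ndarray, sigma: float = 1.0,
                    radius: int = 2) -> np.ndarray:
    """Lowpass-filter a colour channel with a 2-D Gaussian, applied as
    two separable 1-D convolutions (rows, then columns)."""
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    kernel /= kernel.sum()  # normalize so flat regions are preserved
    smoothed = channel.astype(np.float64)
    for axis in (0, 1):
        pad = [(radius, radius) if a == axis else (0, 0) for a in (0, 1)]
        padded = np.pad(smoothed, pad, mode="edge")
        smoothed = sum(
            k * np.take(padded, range(i, i + smoothed.shape[axis]), axis=axis)
            for i, k in enumerate(kernel)
        )
    return smoothed
```

Because the kernel is normalized, flat image regions pass through unchanged while noise around edges is attenuated before block matching.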
  • step c) (reflected by step A) in the claims) is designed for estimating motion vectors in forward and backward direction for the alternate frames (including the option of alternate ROIs of a frame) with respect to the anchor (reference) frame (or ROI of an anchor frame).
  • Block matching is well known and by far the most popular motion estimation method used in various image/video processing applications.
  • The image can be divided e.g. into M x N rectangular blocks, each of size L_x x L_y, and for each block in the current (alternate) frame, a search is performed in a window in the anchor (reference) frame to find the best-matching block, which minimizes some pre-defined block distortion metric (e.g. the so-called Displaced Frame Difference (DFD)).
  • the matching search can be performed within ⁇ P pixels, i.e. supporting up to P pixels in the ⁇ x and ⁇ y directions, as depicted in Figure 3.
  • Block matching BM via exhaustive search is quite expensive computationally. Significant speedup of the search can be achieved, for example via diamond search or other fast search algorithms known in the prior art.
  • the motion vector MV is computed by finding the best match, which achieves the minimum block distortion metric in the search window ( ⁇ P pixels); the motion vector MV for a given block is calculated as the displacement of the reference frame block to the alternate frame best-matching block.
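This exhaustive SAD-based search within a ±P window can be sketched as follows; the function name and the skip-out-of-frame border handling are illustrative assumptions:

```python
import numpy as np

def best_match(anchor_block: np.ndarray, alternate: np.ndarray,
               top: int, left: int, P: int):
    """Exhaustive block matching: find the displacement (dy, dx) within
    +/-P pixels that minimises the SAD between the anchor block (whose
    top-left corner is at (top, left)) and the candidate block in the
    alternate frame."""
    Ly, Lx = anchor_block.shape
    best, best_sad = (0, 0), np.inf
    for dy in range(-P, P + 1):
        for dx in range(-P, P + 1):
            y, x = top + dy, left + dx
            if (y < 0 or x < 0 or
                    y + Ly > alternate.shape[0] or x + Lx > alternate.shape[1]):
                continue  # candidate block falls outside the frame
            sad = np.abs(anchor_block - alternate[y:y + Ly, x:x + Lx]).sum()
            if sad < best_sad:
                best_sad, best = sad, (dy, dx)
    return best, best_sad
```

A diamond search would visit far fewer candidates; the exhaustive loop above is only the reference behaviour that such fast search strategies approximate.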
  • the implicit underlying assumption in block matching BM is that within a small block, the motion can be modeled as translational.
  • the brightness is assumed to be constant. While this latter assumption may not hold, for example in frames taken with different exposure times, a pre-processing photometric alignment can be performed prior to block matching BM, in order to find the motion vectors MVs under the brightness constancy assumption.
  • The third step c) in the motion estimation pipeline is designed to perform block matching BM to estimate the motion between the smoothed colour channels C_anchor and C_alternate. Block matching is performed in both the forward direction (matching C_anchor blocks towards C_alternate) and the backward direction (matching C_alternate blocks towards C_anchor).
  • The bi-directional block matching BM allows estimation of a reliability factor for the respective motion vectors, as explained later with respect to step d). It is also worth mentioning that having both the forward and backward motion estimation results available could facilitate adaptive selection of the anchor (reference) frame in subsequent multi-frame fusion operations.
  • Fast block matching search can be achieved by employing the diamond search strategy. At this stage, only pixel-level accuracy is sought.
  • The block matching/distortion criterion used is the sum of absolute differences (SAD) between the block in the anchor frame and the block under search in the alternate frame (in the forward mode, and vice versa in the backward mode), because of its low computational complexity.
  • Additional pixels can be included around the borders of the blocks to form an enlarged area of size WL_x x WL_y, called a wide block, where WL_x > L_x and WL_y > L_y.
  • In step d) (reflected by step B) in the claims) of the proposed motion estimation pipeline, it is necessary to be able to detect unreliable motion vectors MVs, so that their values can be underweighted, e.g. fully excluded, in the later analysis for finding the frame global motion parameters (or local motions, if needed).
  • a number of different motion reliability rules are applied to estimate respective reliability factors assigned to a motion vector for a given frame or part thereof (e.g. block).
  • the reliability factors of the set of reliability factors assigned to one common motion vector for a given frame or part thereof can be combined to achieve a total reliability factor assigned to the motion vector for the given frame, i.e. full frame, block or the like.
  • For a reliable motion estimate, the forward motion vector MV should be the opposite of the backward motion vector MV, i.e. the difference between the respective x and y components of the forward and backward motion vectors should be zero.
  • This allows detection of unreliable motion estimates due to occlusion, saturated pixels, motion blur, shadows, reflections and other local variations in the scene.
  • If this difference exceeds a pre-defined threshold T_MV, the motion vector MV for that given block is declared unreliable. The first-rule reliability factor for block (i, j) is denoted by R1(i, j).
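The first-rule consistency check can be sketched as follows; the per-component form of the test and the default threshold are assumptions consistent with the description above:

```python
def rule1_consistency(mv_fwd, mv_bwd, t_mv=1.0):
    """Rule 1: for a reliable estimate the backward motion vector should
    be the opposite of the forward one, so the component-wise sum should
    be (close to) zero.  Returns the binary reliability factor R1."""
    du = abs(mv_fwd[0] + mv_bwd[0])  # x-component deviation from opposition
    dv = abs(mv_fwd[1] + mv_bwd[1])  # y-component deviation from opposition
    return 1 if du <= t_mv and dv <= t_mv else 0
```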
  • As a second rule, an additional binary motion vector MV reliability factor R2(i, j) is calculated: if the magnitude of the x- or y-component of the motion vector is equal to the block matching search size, the vector most likely hit the border of the search window, and the second-rule reliability factor R2(i, j) for block (i, j) is set to Zero; otherwise it is set to One.
  • u_avg and v_avg are the average x and y components of the motion vectors MVs of the neighboring blocks, e.g. in the 3x3 block neighborhood around the block under consideration.
  • T_u and T_v are tunable parameters that control the strength of the motion vector MV smoothness constraint.
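For an interior block, this third (smoothness) rule could be sketched like this, with the 8-neighbour average and the thresholds T_u, T_v as described above (the function name is an illustrative choice):

```python
import numpy as np

def rule3_smoothness(mv_u, mv_v, i, j, t_u=1.0, t_v=1.0):
    """Rule 3: compare the motion vector of interior block (i, j) with
    the average over its 3x3 block neighborhood (8 neighbours, centre
    excluded); large deviations violate the smoothness constraint and
    mark the vector as unreliable (R3 = 0)."""
    u_avg = (mv_u[i - 1:i + 2, j - 1:j + 2].sum() - mv_u[i, j]) / 8.0
    v_avg = (mv_v[i - 1:i + 2, j - 1:j + 2].sum() - mv_v[i, j]) / 8.0
    if abs(mv_u[i, j] - u_avg) > t_u or abs(mv_v[i, j] - v_avg) > t_v:
        return 0
    return 1
```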
  • The aforementioned three reliability metrics are further complemented by a fourth one, whose purpose is to identify only high-quality motion vector MV estimates as reliable.
  • The sum of absolute differences (SAD) value that corresponds to the best-match motion vector MV is compared to a pre-defined threshold; if it is above that threshold, the corresponding motion vector MV is considered to be unreliable, i.e. the fourth-rule reliability factor R4(i, j) is Zero.
  • Otherwise, the corresponding motion vector MV is considered to be reliable, i.e. the fourth-rule reliability factor R4(i, j) is One.
  • The total binary motion vector MV reliability is then calculated as R_MV(i, j) = R1(i, j) · R2(i, j) · R3(i, j) · R4(i, j), wherein R_MV(i, j) denotes the total reliability factor assigned to the motion vector of a given block.
  • The reliability factors R1(i, j), R2(i, j), R3(i, j), R4(i, j) are set to On/Off states by digital Zero or One values.
  • The total reliability factor R_MV(i, j) is Zero once at least one of the reliability factors R1(i, j), R2(i, j), R3(i, j), R4(i, j) is Zero. This can be implemented simply by use of a logical AND gate.
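With binary per-rule maps, the AND combination is just an element-wise product; stacking the four rule maps along the first axis is an illustrative layout choice:

```python
import numpy as np

def total_reliability(factors: np.ndarray) -> np.ndarray:
    """Combine per-rule binary reliability maps R1..R4 (stacked along
    the first axis) into the total map R_MV by multiplication, which
    for 0/1 values acts as a logical AND: one failing rule already
    disqualifies the motion vector."""
    return np.prod(factors, axis=0)

# 2x2 block grid: four rule maps R1..R4, one 0/1 value per block.
r = np.array([[[1, 1], [1, 0]],   # R1
              [[1, 1], [1, 1]],   # R2
              [[1, 0], [1, 1]],   # R3
              [[1, 1], [1, 1]]])  # R4
r_total = total_reliability(r)
```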
  • although translational motion vectors MVs are calculated by the block matching BM strategy, this does not mean that the global motion of the frame needs to be modeled as translational.
  • an affine motion model can be assumed for the whole frame, and its parameters could be calculated from the reliable translational motion vectors. So, depending on the chosen motion model, the derivation of the global motion parameters will follow, as described shortly.
  • Motion estimation can be performed with pixel-level accuracy or subpixel-level accuracy in mind, depending on the target application.
  • Block matching BM would be considerably more computationally complex if it were performed with subpixel accuracy. Therefore, block matching BM with pixel-level accuracy is preferably performed, followed by a non-iterative, non-interpolative step of motion estimation with arbitrary subpixel accuracy. This way, motion vectors MVs with arbitrary subpixel accuracy are calculated, while still keeping the overall computational complexity low.
  • A global, robust translation/displacement between the anchor frame and the alternate frame is calculated from the reliable motion vectors MVs in Ω_MV.
  • One possible robust estimate is a robust average value, e.g. the median, of all those reliable motion vectors MVs, as shown below.
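A minimal sketch of this robust-estimate step, taking the component-wise median over the reliable motion vectors; function name and data layout are illustrative assumptions:

```python
import numpy as np

def global_motion_vector(motion_vectors, reliability):
    """Component-wise median of the reliable motion vectors.

    motion_vectors: sequence of (u, v) pairs, one per block.
    reliability:    matching sequence of 0/1 total reliability factors.
    Returns (u_global, v_global).
    """
    # Keep only the vectors whose total reliability factor is 1.
    reliable = np.array([mv for mv, r in zip(motion_vectors, reliability)
                         if r == 1], dtype=float)
    # The median is robust against remaining outlier vectors.
    u_global = float(np.median(reliable[:, 0]))
    v_global = float(np.median(reliable[:, 1]))
    return u_global, v_global
```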
  • the result of this operation is a global motion vector MV, denoted by [u_global, v_global].

f) Alternate Frame Alignment including Calculation of Subpixel-Level Global Motion Parameters
  • Subpixel accuracy is key to the operation of various multi-frame processing features, such as multi-frame super-resolution.
  • Arbitrary accuracy is essential to achieving high-quality image registration, and thereby high-quality subsequent multi-frame processing.
  • In step C), a non-iterative, interpolation-free approach is employed, similar to the one described in S. Chan et al.: “Subpixel Motion Estimation Without Interpolation” referenced in the above description of the prior art. The alignment is based on Taylor approximations.
  • the alternate frame is aligned to the reference frame based on the estimated global pixel-level motion vector MV.
  • B_alternate(i,j) and B_anchor(i,j) denote the (i,j)-th block in the alternate and anchor frames, respectively, after global alignment, and (i,j) ∈ Ω_MV.
  • Subpixel-level motion estimation of the aligned alternate frames can then be calculated as follows:
  • a subpixel-level forward motion vector [u_subpixel(i,j), v_subpixel(i,j)]
  • a global, robust subpixel-level motion vector MV can be calculated from the subpixel-level motion vectors MVs in Ω_MV.
  • One possible robust estimate is the median of all the subpixel-level motion vectors MVs, as shown below. The result of this operation is a global subpixel-level motion vector MV, denoted by
  • the total global subpixel-level motion vector can then be calculated as follows
  • the alternate frame F_ALT can be aligned to the anchor frame F_REF.
  • the alignment error signal is expected to be high in areas where the motion vectors MVs were not reliable, and that is very useful information for subsequent multi-frame processing.
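The interpolation-free, first-order Taylor refinement described above can be sketched as a small least-squares problem per pre-aligned block pair. The gradient operator, sign convention, and function name below are assumptions for illustration, not the patent's normative implementation:

```python
import numpy as np

def subpixel_shift(block_anchor, block_alt):
    """Least-squares subpixel shift between two pre-aligned blocks.

    Uses the first-order Taylor expansion
        B_alt(x, y) ~ B_anchor(x, y) + gx * u + gy * v
    and solves for (u, v) directly, without interpolation or iteration,
    in the spirit of Chan et al.'s interpolation-free estimation.
    """
    # Image gradients of the anchor block (axis 0 = y, axis 1 = x).
    gy, gx = np.gradient(block_anchor.astype(float))
    # Temporal difference between the two blocks.
    dt = (block_alt - block_anchor).astype(float).ravel()
    # Overdetermined system: one equation per pixel, two unknowns.
    A = np.stack([gx.ravel(), gy.ravel()], axis=1)
    (u, v), *_ = np.linalg.lstsq(A, dt, rcond=None)
    return u, v
```

Because the system is solved in one shot per block, the subpixel accuracy is arbitrary (not limited to, e.g., quarter-pixel grids), which matches the "arbitrary subpixel accuracy" goal stated above.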
  • Local variations are present in the scene because of an object, e.g. a car in the captured image, moving on the right, along with its shadow, as well as tree-leaf motion and other reflections and small variations in the scene.
  • Figure 5 presents the estimated weighted motion vector field with pixel-level accuracy component in horizontal X- and vertical Y-direction of the frame.
  • the unreliable motion vectors are disregarded.
  • the reliable local motion vectors MVs are shown after excluding the motion vectors MVs whose total reliability factor is zero.
  • the aforementioned exemplary embodiment is based on translational motion models.
  • the method is not limited to the translational motion model, and can be adapted to other higher-order models, in particular non-translational motion models.
  • the parameters t_x and t_y capture the two-dimensional translation in the x and y directions, respectively.
  • the rotation is captured by the parameters a_1, a_2, a_3 and a_4, wherein the zooming is captured by the zooming parameters a_1 and a_4, and wherein shear is captured by the shear parameters a_2 and a_3.
  • the affine motion model has six parameters, which can be estimated from the already calculated translational motion vectors. One exemplary approach is explained in the following.
  • the reliability of the calculated motion vectors is calculated in a similar manner (employing proper thresholds for rules #1 to #3) in order to identify the reliable motion vectors MVs in the motion vector field achieved in the above-mentioned step 1.
  • A is a stack of a number Q of 2x6 matrices; b is a stack of Q 2x1 matrices.
  • the motion model is valid for global motions and is not fully accurate for objects that are not at the same distance to the imaging system. If the information of some motion-related auxiliary sensors, such as an accelerometer/gyroscope and depth sensors, is available for each of the anchor and alternate frames, the motion estimation process could be supported by such information, and the motion vector quality could be further improved.
  • Figure 7 shows a schematic diagram of the method adapted to higher-order motion models supporting non-translational transformations, such as the similarity, affine and projective (2D homography) transformations described in R. Hartley and A. Zisserman, “Multiple View Geometry in Computer Vision”. University Press, Cambridge, 2004 referenced in the first part of the description related to the prior art. It should be noted that the method shown in Figure 7 also supports the translational motion model, as described later.
  • subpixel refinement is performed on a block basis between the reference and anchor frames/frame ROIs, without any explicit intermediate alignment.
  • the result is a subpixel-level motion vector MV field which is then transformed into a list of 2D correspondences.
  • the correspondences are then used to estimate the global motion model parameters. Once the motion parameters are calculated, alignment of the frames can be performed.
  • Steps a) to d) correspond to the exemplary method described above with reference to Figure 2.
  • Former step e) is modified, denoted step i), and is designed for Block-Based Subpixel-Level Refinement BB-ME by use of the reliability factors derived in step d) and the interpolated and smoothed low-pass filtered raw CFA data of the alternate frame or region of interest ROI of an alternate frame derived in steps b1), b2), ..., bN).
  • step j) designed for Two-Dimensional Correspondences
  • step k) designed for Transformation Matrix Calculation.
  • the result of step k) is a set of estimated Motion Parameters.
  • the output of step a), i.e. the divided steps a0), a1), a2), ..., aN), is C_r and C_a for each frame or region of interest ROI of a frame.
  • step b), i.e. the divided steps b0), b1), b2), ..., bN)
  • the output from step d) is the M x N binary motion vector reliability map,
  • In step i), block-based subpixel refinement is pursued only for the reliable motion vectors. No explicit global intermediate alignment is performed. Hence, for each block B_r in the reference frame, the matching block B_a in the alternate frame is identified based on the motion vector MV. Then, for each block pair, the subpixel motion vector is calculated by solving the system of equations in section (3.7). The result is a subpixel-level forward motion vector field. Adding the pixel-level and subpixel-level motion vector fields, the total subpixel-level motion vector field is obtained.
  • Ω_R denotes the collection of reliable motion vectors.
  • Q is the number of reliable motion vectors in Ω_R.
  • each source point [S_x, S_y] is defined by the block position, and the corresponding destination point [D_x, D_y] is defined by that position displaced by the total subpixel-level motion vector.
  • the set of 2D correspondences, Ω, is then used for estimating the parameters of the underlying global motion model, as described in what follows.
  • the motion model parameters are estimated via direct linear transform (DLT).
  • the Direct Linear Transform is described in more detail in G. H. Golub and C. F. Van Loan: “Matrix Computations”, second edition, 1989, and in R. Hartley and A. Zisserman: “Multiple View Geometry in Computer Vision”, University Press, Cambridge, 2004.
  • the 2D transformation of homogeneous image point coordinates from the source to the destination can be defined as follows, where the 3x3 transformation matrix T_motion captures the motion model parameters.
  • the translational motion model can be represented by a matrix where t_x and t_y represent the translation in the x and y directions, respectively.
  • the translational model parameters (t_x, t_y) can be estimated from the set of Q correspondence pairs as follows.
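The estimation formula for the translational case (elided in this excerpt) is presumably the mean displacement over the Q correspondence pairs, which is the least-squares solution for a pure translation; a sketch under that assumption:

```python
import numpy as np

def estimate_translation(src, dst):
    """Least-squares translation from Q correspondence pairs.

    src, dst: (Q, 2) arrays of [x, y] points.  For a pure translation
    model, the least-squares estimate of (t_x, t_y) is simply the mean
    displacement between destination and source points.
    """
    d = np.asarray(dst, dtype=float) - np.asarray(src, dtype=float)
    t_x, t_y = d.mean(axis=0)
    return t_x, t_y
```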
  • the similarity transformation motion model can be represented by a matrix where θ is the angle of rotation and s is the scaling factor of the x/y coordinates.
  • the corresponding DLT system of equations can be defined in terms of the pairs of correspondences and the motion model parameters.
  • the unknown motion model parameters vector X_4x1 can be estimated by solving the DLT system with SVD:
  • the affine transformation motion model can be represented by a matrix where t_x and t_y represent the translation in the x and y directions, respectively, and the parameters a, b, c and d are a combination of scaling, rotation, and shear.
  • the unknown motion model parameters vector X_6x1 can be estimated by solving the DLT system with SVD:
  • the projective (2D homography) transformation motion model can be represented by
  • the DLT system of equations can be defined as
  • the unknown motion model parameters vector X_9x1 can be estimated by solving the DLT system with SVD, where the last column of V represents X_9x1, as shown below.
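A compact sketch of the homography DLT described above: build the standard 2Q x 9 system A·x = 0 from the correspondences and take x as the right-singular vector for the smallest singular value (the last column of V, i.e. the last row of Vᵀ as NumPy returns it), following Hartley & Zisserman. Variable names are illustrative:

```python
import numpy as np

def estimate_homography_dlt(src, dst):
    """Direct Linear Transform for the projective (2D homography) model.

    src, dst: (Q, 2) point arrays with Q >= 4 (no 3 points collinear).
    Returns the 3x3 matrix H normalised so that H[2, 2] == 1.
    """
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Two equations per correspondence pair.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.asarray(rows, dtype=float)
    # The null-space vector of A (smallest singular value) holds X_9x1.
    _, _, Vt = np.linalg.svd(A)
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]
```

With noisy correspondences the same call yields the least-squares solution; in practice it is typically wrapped in a robust scheme such as RANSAC (cited above via Fischler & Bolles) to reject outlier pairs.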

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Color Television Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a method for processing image data (IMGRAW) of an image sensor (4), the image data comprising a burst of frames captured by the image sensor (4) per image, each frame comprising a raw array of pixels. The method comprises the steps of: A) determining motion vectors (MV1…N), each forward motion vector in a forward direction representing the displacement of a block in a selected anchor frame of the burst of frames to the best-matching block in a respective other frame of the burst of frames, and each backward motion vector in the backward direction representing the displacement of a block in a respective other frame of the burst of frames to the best-matching block in a selected anchor frame of the burst of frames; B) determining reliability factors for the motion vectors determined in step A) in order to assign the reliability of the respective motion vector for a given block by use of the difference between the forward motion vector and the associated backward motion vector, the reliability increasing as the difference decreases; C) aligning the frames of the burst of frames for an image, the motion being compensated by use of weighted motion vectors, the motion vectors being weighted with the respective reliability factor determined in step B).
PCT/EP2022/057016 2022-03-17 2022-03-17 Procédé et unité de processeur d'image pour traitement de données d'image WO2023174546A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/EP2022/057016 WO2023174546A1 (fr) 2022-03-17 2022-03-17 Procédé et unité de processeur d'image pour traitement de données d'image
TW111149959A TW202338734A (zh) 2022-03-17 2022-12-26 用於處理影像資料的方法及影像處理器單元

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/EP2022/057016 WO2023174546A1 (fr) 2022-03-17 2022-03-17 Procédé et unité de processeur d'image pour traitement de données d'image

Publications (1)

Publication Number Publication Date
WO2023174546A1 true WO2023174546A1 (fr) 2023-09-21

Family

ID=80978767

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2022/057016 WO2023174546A1 (fr) 2022-03-17 2022-03-17 Procédé et unité de processeur d'image pour traitement de données d'image

Country Status (2)

Country Link
TW (1) TW202338734A (fr)
WO (1) WO2023174546A1 (fr)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070002058A1 (en) * 2003-09-02 2007-01-04 Koninklijke Philips Electronics N.V. Temporal interpolation of a pixel on basis of occlusion detection
US20090147853A1 (en) * 2007-12-10 2009-06-11 Qualcomm Incorporated Resource-adaptive video interpolation or extrapolation
US20100225768A1 (en) * 2009-03-05 2010-09-09 Sony Corporation Method and system for providing reliable motion vectors

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070002058A1 (en) * 2003-09-02 2007-01-04 Koninklijke Philips Electronics N.V. Temporal interpolation of a pixel on basis of occlusion detection
US20090147853A1 (en) * 2007-12-10 2009-06-11 Qualcomm Incorporated Resource-adaptive video interpolation or extrapolation
US20100225768A1 (en) * 2009-03-05 2010-09-09 Sony Corporation Method and system for providing reliable motion vectors

Non-Patent Citations (13)

* Cited by examiner, † Cited by third party
Title
ADITHYA POTHAN RAJ V ET AL: "Detection of small moving objects based on motion vector processing using BDD method", ADVANCED COMPUTING (ICOAC), 2011 THIRD INTERNATIONAL CONFERENCE ON, IEEE, 14 December 2011 (2011-12-14), pages 229 - 234, XP032136815, ISBN: 978-1-4673-0670-6, DOI: 10.1109/ICOAC.2011.6165180 *
B. Wronski, I. Garcia-Dorado, M. Ernst, D. Kelly, M. Krainin, C. K. Liang, M. Levoy, P. Milanfar: "Handheld Multi-Frame Super-Resolution", ACM Transactions on Graphics, vol. 38, no. 4, July 2019 (2019-07-01)
G. H. Golub, C. F. Van Loan: "Matrix Computations", second edition, 1989
L. C. Manikandan, R. K. Selvakumar: "A Study on Block Matching Algorithms for Motion Estimation in Video Coding", International Journal of Scientific & Engineering Research, vol. 5, July 2014 (2014-07-01)
M. A. Fischler, R. C. Bolles: "Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography", Communications of the ACM, vol. 24, no. 6, 1981, pages 381-395
R. Hartley, A. Zisserman: "Multiple View Geometry in Computer Vision", University Press, Cambridge, 2004
R. Yaakob, A. Aryanfar, A. A. Halin, N. Sulaiman: "A Comparison of Different Block Matching Algorithms for Motion Estimation", The 4th International Conference on Electrical Engineering and Informatics (ICEEI 2013)
S. Chan et al.: "Subpixel Motion Estimation Without Interpolation"
S. Hasinoff, D. Sharlet, R. Geiss, A. Adams, J. T. Barron, F. Kainz, J. Chen, M. Levoy: "Burst photography for high dynamic range and low-light imaging on mobile cameras", ACM Transactions on Graphics, vol. 35, no. 6, November 2016 (2016-11-01)
SAMUEL W HASINOFF ET AL: "Burst photography for high dynamic range and low-light imaging on mobile cameras", ACM TRANSACTIONS ON GRAPHICS, ACM, NY, US, vol. 35, no. 6, 11 November 2016 (2016-11-11), pages 1 - 12, XP058306351, ISSN: 0730-0301, DOI: 10.1145/2980179.2980254 *
SING BING KANG ET AL: "High dynamic range video", ACM TRANSACTIONS ON GRAPHICS, ACM, NY, US, vol. 22, no. 3, 1 July 2003 (2003-07-01), pages 319 - 325, XP058249526, ISSN: 0730-0301, DOI: 10.1145/882262.882270 *
WONSANG YOU ET AL: "Moving Object Tracking in H.264/AVC Bitstream", 30 June 2007, MULTIMEDIA CONTENT ANALYSIS AND MINING; [LECTURE NOTES IN COMPUTER SCIENCE;;LNCS], SPRINGER BERLIN HEIDELBERG, BERLIN, HEIDELBERG, PAGE(S) 483 - 492, ISBN: 978-3-540-73416-1, XP019064716 *

Also Published As

Publication number Publication date
TW202338734A (zh) 2023-10-01

Similar Documents

Publication Publication Date Title
US6285804B1 (en) Resolution improvement from multiple images of a scene containing motion at fractional pixel values
US8768069B2 (en) Image enhancement apparatus and method
JP4623111B2 (ja) 画像処理装置、画像処理方法及びプログラム
US8290212B2 (en) Super-resolving moving vehicles in an unregistered set of video frames
US9202263B2 (en) System and method for spatio video image enhancement
WO2019135916A1 (fr) Simulation de flou de mouvement
US8279341B1 (en) Enhancing the resolution and quality of sequential digital images
US9055217B2 (en) Image compositing apparatus, image compositing method and program recording device
US20120162449A1 (en) Digital image stabilization device and method
US20110085049A1 (en) Method and apparatus for image stabilization
AU2011205087B2 (en) Multi-hypothesis projection-based shift estimation
US20090028462A1 (en) Apparatus and program for producing a panoramic image
KR20100139030A (ko) 이미지들의 수퍼 해상도를 위한 방법 및 장치
WO2001031568A1 (fr) Systemes et procede permettant produire des images haute resolution a partir d'une sequence video d'images de plus basse resolution
JP5107409B2 (ja) 動き領域の非線形スムージングを用いた動き検出方法およびフィルタリング方法
CN111510691B (zh) 颜色插值方法及装置、设备、存储介质
GB2536430B (en) Image noise reduction
US20100182511A1 (en) Image Processing
US11334961B2 (en) Multi-scale warping circuit for image fusion architecture
US11816858B2 (en) Noise reduction circuit for dual-mode image fusion architecture
WO2020033432A1 (fr) Super-résolution au moyen d'un mouvement manuel naturel appliqué à un dispositif d'utilisateur
Choi et al. Motion-blur-free camera system splitting exposure time
Al Ismaeil et al. Dynamic super resolution of depth sequences with non-rigid motions
US8571356B2 (en) Image processing apparatus, image processing method, and image processing program
WO2023174546A1 (fr) Procédé et unité de processeur d'image pour traitement de données d'image

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22713678

Country of ref document: EP

Kind code of ref document: A1