WO2016111239A1 - Image processing device, image processing method and program recording medium


Info

Publication number
WO2016111239A1
Authority
WO
WIPO (PCT)
Prior art keywords
frame image
image
frame
movement amount
interest
Prior art date
Application number
PCT/JP2016/000013
Other languages
French (fr)
Japanese (ja)
Inventor
真澄 石川
仁 河村
Original Assignee
NEC Corporation
Priority date
Filing date
Publication date
Application filed by NEC Corporation
Priority to JP2016568360A (JP6708131B2)
Publication of WO2016111239A1


Classifications

    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G5/00 Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G5/00 Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
    • G09G5/36 Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators characterised by the display of a graphic pattern, e.g. using an all-points-addressable [APA] memory

Definitions

  • The present invention relates to a video processing device, a video processing method, and a program recording medium.
  • A photosensitive seizure is an abnormal response to light stimulation that produces symptoms similar to epilepsy, such as convulsions and disturbance of consciousness.
  • In order to suppress such effects, attempts are being made to restrict the distribution of video content that has a negative effect on the human body (Non-Patent Document 1).
  • ITU (International Telecommunication Union)
  • In Japan, the Japan Broadcasting Corporation (NHK) and the Japan Commercial Broadcasters Association have established guidelines, particularly for animation production, and demand compliance from those involved in broadcasting (Non-Patent Document 2).
  • One type of video that contains much of the flicker that can trigger photosensitive seizures is footage of a press conference containing many flashes from news photographers' cameras. In such video, each flash produces a bright region for a short time, and the repetition of these flashes produces frequent blinking.
  • Patent Documents 1 to 3 disclose related techniques for detecting and correcting video content that has an adverse effect on the human body.
  • Patent Document 1 discloses a technique for detecting a scene (image) that induces a photosensitive seizure on a liquid crystal display and reducing the luminance of the backlight unit for the detected scene. This technique reduces the effect of photosensitive seizures on viewers.
  • Patent Document 2 discloses a technique for correcting the dynamic range of the (n+1)-th frame image by gamma correction or tone-curve correction based on a comparison of the histograms of the n-th frame image and the (n+1)-th frame image. This technique relieves strong blinking and reduces eye strain or poor physical condition.
  • Patent Document 3 discloses a technique for correcting a motion vector.
  • Non-Patent Document 3 and Non-Patent Document 4 disclose optical flow calculation methods described later.
  • However, the related technologies have the following problem. Large changes in luminance or saturation that can trigger photosensitive seizures may occur only in some areas of an image rather than across the entire image. Because the techniques disclosed in the related art uniformly correct the entire image without making this distinction, they also reduce the contrast and brightness of areas that do not need correction and do not cause blinking, and may therefore degrade the image quality of those areas.
  • An object of the present invention is to provide a technique capable of generating a natural video in which fluctuations in luminance or saturation are suppressed.
  • An image processing device according to one aspect of the present invention includes: determination means for determining whether any of a plurality of temporally continuous frame images is a frame image of interest including a blinking region whose luminance or saturation differs by a predetermined level or more from the preceding and following frame images; motion estimation means for estimating, based on a pair of frame images selected from the frame images before and after the frame image of interest on the basis of their difference in luminance or saturation from the frame image of interest, a first movement amount caused by the movement of the camera and/or a second movement amount caused by the movement of the subject; image generation means for generating a corrected frame image corresponding to a frame image at the shooting time of the frame image of interest, based on the selected pair and the estimated first movement amount and/or second movement amount; and image synthesis means for synthesizing the frame image of interest and the corrected frame image.
  • An image processing method according to one aspect of the present invention includes: determining whether any of a plurality of temporally continuous frame images is a frame image of interest including a blinking region whose luminance or saturation differs by a predetermined level or more from the preceding and following frame images; estimating, based on a pair of frame images selected from the frame images before and after the frame image of interest on the basis of their difference in luminance or saturation from the frame image of interest, a first movement amount caused by the movement of the camera and/or a second movement amount caused by the movement of the subject; generating, based on the selected pair and the estimated first movement amount and/or second movement amount, a corrected frame image corresponding to a frame image at the shooting time of the frame image of interest; and synthesizing the frame image of interest and the corrected frame image.
  • A program recording medium according to one aspect of the present invention stores a program that causes a computer to execute: a process of determining whether any of a plurality of temporally continuous frame images is a frame image of interest including a blinking region whose luminance or saturation differs by a predetermined level or more from the preceding and following frame images; a process of estimating, based on a pair of frame images selected from the frame images before and after the frame image of interest on the basis of their difference in luminance or saturation from the frame image of interest, a first movement amount caused by the movement of the camera and/or a second movement amount caused by the movement of the subject; a process of generating, based on the selected pair and the estimated first movement amount and/or second movement amount, a corrected frame image corresponding to a frame image at the shooting time of the frame image of interest; and a process of synthesizing the frame image of interest and the corrected frame image.
  • An image processing device according to another aspect of the present invention includes: selection means for selecting a first frame image and a second frame image from a plurality of temporally continuous frame images; first estimation means for calculating a geometric transformation parameter based on the positional relationship between corresponding points or corresponding regions detected between the first frame image and the second frame image, and for estimating a first movement amount caused by the movement of the camera; and second estimation means for detecting a subject region from the first frame image and the second frame image by subtracting the first movement amount based on the geometric transformation parameter, and for estimating, based on the detected subject region, a second movement amount caused by the movement of the subject.
  • FIG. 1 is a block diagram of a video processing apparatus according to the first embodiment.
  • FIG. 2 is a schematic diagram showing a rectangular area luminance calculation method.
  • FIG. 3 is a block diagram of the motion estimator in the first embodiment.
  • FIG. 4 is a schematic diagram illustrating a method of selecting a frame image that does not include a bright region.
  • FIG. 5 is a diagram illustrating a method for selecting a motion estimation frame.
  • FIG. 6 is a diagram illustrating a method for selecting a motion estimation frame.
  • FIG. 7 is a diagram illustrating an example of a method for selecting a motion estimation frame pair.
  • FIG. 8 is a block diagram of a correction frame generation unit in the first embodiment.
  • FIG. 9 is a graph showing an example of a method for setting the value of the rate of change in local area luminance in the output frame image.
  • FIG. 10 is a flowchart showing the operation of the video processing apparatus according to the first embodiment.
  • FIG. 11 is a block diagram illustrating a hardware configuration of the computer apparatus.
  • FIG. 1 is a block diagram showing a configuration of a video processing apparatus 100 according to the first embodiment of the present invention. Note that the arrows described in FIG. 1 (and the subsequent block diagrams) merely show an example of the data flow, and are not intended to limit the data flow.
  • the video processing apparatus 100 includes a determination unit 11, a motion estimation unit 12, an image generation unit 13, and an image synthesis unit 14.
  • The determination unit 11 determines whether or not a frame image includes a region that may induce a photosensitive seizure. Specifically, using a preset number of frame images, the determination unit 11 determines whether a specific frame image (hereinafter referred to as the "frame image of interest") includes a region that blinks (whose luminance changes greatly) due to a flash or the like. In the following, a region determined in this way (a region whose luminance changes greatly) is referred to as a "blinking region". For example, when the determination unit 11 receives time-sequential frame images for (2m+1) frames taken from time (t-m) to time (t+m) as input, the determination unit 11 takes the frame image at time t as the frame image of interest and determines whether that frame image includes a blinking region.
  • the motion estimation unit 12, the image generation unit 13, and the image synthesis unit 14 synthesize a frame image in which the movement of the image due to the displacement of the camera or the subject is corrected.
  • the motion estimation unit 12, the image generation unit 13, and the image synthesis unit 14 can output a frame image in which the influence of blinking is reduced by appropriately suppressing the luminance change in the blinking region in this way.
  • the blinking region includes a bright region in which the luminance of the frame image of interest is greatly improved (becomes brighter) and a dark region in which the luminance of the frame image of interest is greatly lowered (becomes dark).
  • For simplicity, only the bright region is described below.
  • the determination unit 11 determines whether the target frame image is a frame image including a blinking region.
  • One method for determining whether a target frame image is a frame including a blinking region is a method using a change rate of local region luminance between the target frame image and another input frame image.
  • The local region luminance represents, for each pixel of the input frame images, the luminance value of a region consisting of that pixel and a predetermined number of surrounding pixels.
  • the determination unit 11 first converts color information described in an RGB color system or the like into luminance information (luminance value) representing brightness for each pixel of the input plurality of frame images. Thereafter, the determination unit 11 performs a smoothing process using pixels around the target pixel on the converted luminance information, thereby calculating a luminance value in the pixel peripheral region.
  • the method for converting color information into luminance information is, for example, a method for calculating a Y value representing the luminance of the YUV (YCbCr, YPbPr) color system used for broadcasting, or a Y value representing the luminance of the XYZ color system.
  • the color systems describing luminance information are not limited to these color systems.
  • the determination unit 11 may convert the color information into another index representing luminance such as the V value of the HSV color system.
  • The determination unit 11 may also convert the color information back into the pre-correction color information by inverse gamma correction before converting it into luminance information.
  • An example of the smoothing method is to take the average of the luminance information of the (2p+1) × (2q+1) pixels consisting of the target pixel and the q pixels above and below it and the p pixels to its left and right.
  • The local region luminance l_t(x, y) of the pixel at position (x, y) in the frame image at time t is expressed by Equation (1) using the luminance information Y_t of the frame image.
  • The determination unit 11 may also calculate the local region luminance l_t(x, y) using a weighted average with preset weights w, as in Equation (2).
  • For example, the determination unit 11 calculates Gaussian weights w(i, j) using Equation (3) with a preset parameter σ.
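• As an illustration, the following is a minimal NumPy sketch of the local region luminance described above. Equations (1) to (3) are not reproduced on this page, so the box average and the Gaussian weighting are written in their standard forms; the function name and the parameters p, q, and sigma are illustrative only.

```python
import numpy as np

def local_region_luminance(Y, p=2, q=2, sigma=None):
    """Local region luminance l_t(x, y): average of the luminance Y over a
    (2p+1) x (2q+1) window around each pixel (box average), or a
    Gaussian-weighted average when sigma is given."""
    # Build the weights w(i, j): uniform (box average) or Gaussian.
    i = np.arange(-q, q + 1)[:, None]   # vertical offsets
    j = np.arange(-p, p + 1)[None, :]   # horizontal offsets
    if sigma is None:
        w = np.ones((2 * q + 1, 2 * p + 1))
    else:
        w = np.exp(-(i ** 2 + j ** 2) / (2.0 * sigma ** 2))
    w /= w.sum()

    # Weighted sum over the window, with edge padding so the output
    # has the same shape as Y.
    Ypad = np.pad(Y.astype(np.float64), ((q, q), (p, p)), mode="edge")
    l = np.zeros_like(Y, dtype=np.float64)
    for di in range(2 * q + 1):
        for dj in range(2 * p + 1):
            l += w[di, dj] * Ypad[di:di + Y.shape[0], dj:dj + Y.shape[1]]
    return l
```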
  • the local area luminance change rate represents the ratio of the local area luminance change between the pixel of the target frame image and the pixel of another input frame image at the same position.
  • The determination unit 11 calculates, using Equation (4), the local region luminance change rate r_{t→t+k}(x, y) of the pixel at each position (x, y) between the frame image of interest at time t and the frame image at time (t+k).
  • Based on the calculated change rate, the determination unit 11 determines whether the frame image of interest includes a region that is brighter than the other frame image by a predetermined level or more. When the frame image of interest includes such a region with respect to frame images both before and after it in time, the determination unit 11 determines that the frame image of interest is a frame image including a bright region caused by blinking.
  • Specifically, the determination unit 11 makes this determination using a preset change-rate threshold α and a preset area-rate threshold β, according to whether the area rate of the region in which the change rate r_{t→t+k} exceeds the threshold α itself exceeds the threshold β.
  • The determination unit 11 sets the determination flag flag_{t→t+k} to "1" when it determines that the frame image of interest at time t includes a region that is brighter than the frame image at time (t+k) by the predetermined level or more, and sets it to "0" otherwise. The determination unit 11 similarly calculates a determination flag for the combination of the frame image of interest with every other input frame image, and determines whether a frame image whose determination flag is "1" exists both before and after the frame image of interest. When such frame images exist, the determination unit 11 determines that the frame image of interest is a frame image including a bright region.
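• A sketch of this determination follows. Since Equation (4) is not reproduced here, the change rate r_{t→t+k} is assumed to be the ratio of the local region luminances of the frame of interest and the other frame; the thresholds alpha and beta and all names are placeholders.

```python
import numpy as np

def is_bright_blinking_frame(l_t, l_others, k_offsets, alpha=2.0, beta=0.01):
    """Decide whether the frame of interest (local region luminance map l_t)
    contains a bright region caused by blinking.
    l_others: local-region-luminance maps of the other frames,
    k_offsets: their time offsets k relative to the frame of interest.
    Assumption: r_{t->t+k} is taken as l_t / l_{t+k}; alpha and beta are the
    change-rate and area-rate thresholds of the text (placeholder values)."""
    flags = {}
    for l_k, k in zip(l_others, k_offsets):
        r = l_t / np.maximum(l_k, 1e-6)           # per-pixel change rate
        area_rate = np.mean(r > alpha)            # fraction of pixels exceeding alpha
        flags[k] = 1 if area_rate > beta else 0   # determination flag flag_{t->t+k}
    # A bright region due to blinking requires a flag-1 frame both before and after.
    before = any(f == 1 for k, f in flags.items() if k < 0)
    after = any(f == 1 for k, f in flags.items() if k > 0)
    return (before and after), flags
```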
  • the determination unit 11 may use a method of using the change rate of the rectangular area luminance as another method of determining whether the frame image of interest is a frame image including a blinking area.
  • the rectangular area luminance represents an average value of luminance for each rectangular area set in advance in each frame image.
  • the rectangular area luminance when a 10 ⁇ 10 block rectangular area is set in the frame image is an average value of the luminance values of the pixels included in each rectangular area.
  • As the luminance value, the Y value of the YUV color system, the Y value of the XYZ color system, the V value of the HSV color system, or the like can be used, as in the calculation of the local region luminance.
  • the change rate of the rectangular area luminance represents the ratio of the difference between the rectangular area luminance of the block of interest in the target frame image and the rectangular area luminance of the block at the same position in the other input frame image.
  • The determination unit 11 calculates, using Equation (5), the change rate R_{t→t+k}(i, j) between the rectangular region luminance L_t(i, j) of the block at position (i, j) of the frame image of interest at time t and the rectangular region luminance L_{t+k}(i, j) of the frame image at time (t+k).
  • the determination using the change rate of the rectangular area luminance is performed in the same manner as the determination using the change ratio of the local area luminance.
  • For the combination of the frame image of interest at time t with each of the other input frame images, the determination unit 11 determines whether the frame image of interest includes a region that is brighter than the other frame image, and sets the value of the determination flag accordingly.
  • the determination unit 11 determines that the frame image of interest is a frame image including a blinking area when there are frame images having the determination flag “1” at each of the times before and after the frame of interest image.
  • As in the case of using the local region luminance change rate, one method of setting the determination flag is to set it to "1" or "0" according to whether the area rate of the regions in which the change rate exceeds a preset change-rate threshold α itself exceeds a preset area-rate threshold β.
  • The determination unit 11 outputs the determination flags between the frame image of interest and the other input frame images as analysis information, together with the determination result. In addition, by performing the same processing, the determination unit 11 may output the determination flags calculated between frame images other than the frame image of interest as auxiliary information.
  • The determination unit 11 may also output, as analysis information, the change rate of the rectangular region luminance calculated between each rectangular region of the frame image of interest and the rectangular region at the same position in the other frame images.
  • FIG. 3 is a block diagram illustrating a configuration of the motion estimation unit 12.
  • the motion estimation unit 12 includes a selection unit 12A, a first estimation unit 12B, and a second estimation unit 12C.
  • The motion estimation unit 12 receives the frame images and the determination result and analysis information output from the determination unit 11 as inputs. When the frame image of interest is determined to be a frame image including a bright region, the motion estimation unit 12 selects a plurality of frame images to be used for motion estimation from the input frame images and estimates, between the selected frame images, the amount of image movement caused by the movement of the camera and of the subject.
  • the selection unit 12A selects a frame image used for estimation of the movement amount from frame images other than the target frame image, and acquires a pair of frame images including the selected frame image.
  • the selection unit 12A selects these frame images (hereinafter referred to as “motion estimation frame images”), for example, by the following method.
  • the selection unit 12A may select one frame image as a motion estimation frame image from before and after the target frame image based on the luminance difference between the target frame image and the input other frame image. In this case, the selection unit 12A acquires one frame image before and after each frame image of interest and uses it as a pair of motion estimation frame images. Specifically, the selection unit 12A may select the motion estimation frame image using the determination flag calculated by the determination unit 11.
  • In this case, among the frame images that do not include a bright region, the frame images closest to the frame image of interest before and after it are selected as the motion estimation frame images.
  • FIG. 4 is a schematic diagram showing a method for selecting a frame image that does not include a bright region.
  • FIG. 4 illustrates, for four cases (case 1 to case 4), the determination flags (flag) obtained by comparing the frame image of interest at time t with the other frame images from time (t-2) to time (t+2). Note that in FIG. 4 (and similar figures thereafter), frame images that do not include a bright region are shown hatched, and unhatched frame images represent frame images that include a bright region.
  • In case 1, the selection unit 12A selects the frame images at times (t-1) and (t+1). Similarly, it selects the frame images at times (t-2) and (t+1) in case 2, the frame images at times (t-1) and (t+2) in case 3, and the frame images at times (t-2) and (t+2) in case 4.
  • The selection unit 12A may correct the selection result of the motion estimation frames using the determination flags between frame images other than the frame image of interest, which are input as auxiliary information. When the frame image at time (t+k) has been selected as a motion estimation frame in the selection using the determination flags between the frame image of interest and the other frame images, the selection unit 12A may correct the selection result as follows.
  • When both the determination flag flag_{t→t+k+1} between the frame image of interest and the frame image at time (t+k+1) and the determination flag flag_{t+k+1→t+k} between the frame image at time (t+k+1) and the frame image at time (t+k) are "1", it is considered that there is also a large luminance change between the frame image at time (t+k+1) and the frame image at time (t+k). In this case, the selection unit 12A may therefore change (correct) the motion estimation frame image to the frame image at time (t+k+1)."
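• A small sketch of the nearest-frame selection using the determination flags follows; the dictionary-of-offsets representation and the function name are illustrative only, and the auxiliary-flag correction described above is omitted.

```python
def select_motion_estimation_pair(flags):
    """Select one motion estimation frame before and one after the frame of
    interest: among the frames whose determination flag flag_{t->t+k} is 1
    (frames relative to which the frame of interest is brighter), take the
    ones closest in time.  flags maps offset k (k != 0) to the flag value."""
    before = [k for k, f in flags.items() if k < 0 and f == 1]
    after = [k for k, f in flags.items() if k > 0 and f == 1]
    if not before or not after:
        return None  # no valid pair; the frame of interest is left unchanged
    return max(before), min(after)   # e.g. (-1, +1) in case 1 of FIG. 4

# Example corresponding to case 2 of FIG. 4 (flag is 0 at t-1):
print(select_motion_estimation_pair({-2: 1, -1: 0, 1: 1, 2: 1}))  # -> (-2, 1)
```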
  • The selection unit 12A may select a plurality of frame images as the motion estimation frame images from before and after the frame image of interest based on the luminance change between the frame image of interest and the other input frame images. In this case, the selection unit 12A acquires a plurality of pairs of frame images. Specifically, the selection unit 12A may select, from among the frame images near the frame image of interest, a predetermined number of frame images whose determination flag calculated by the determination unit 11 is "1".
  • FIG. 5 is a schematic diagram showing an example of selecting a plurality (two pairs in this case) of motion estimation frame images.
  • For example, when the frame images at times (t-2), (t-1), (t+1), and (t+2) do not include a bright region, the selection unit 12A selects all of these frame images as motion estimation frames.
  • the determination flag in the example of FIG. 5 is equal to the determination flag in the case 1 of FIG.
  • In this case, the selection unit 12A selects not only the frame images at times (t-1) and (t+1) but also the frame images at times (t-2) and (t+2) as motion estimation frames.
  • This selection method selectively uses regions that are less affected by the flickering of light from multiple frame images when frequent blinking occurs within a short time or when a flash band occurs, so the accuracy of motion estimation can be increased (see, for example, FIG. 7).
  • A flash band is a large change (step) in signal intensity that occurs in a rolling-shutter imaging device, such as a CMOS (Complementary Metal-Oxide-Semiconductor) sensor, because each line has a different exposure period when short-duration light emission such as flash light occurs.
  • Based on the luminance difference between the frame image of interest and the other input frame images, the selection unit 12A may select, as the motion estimation frame images, the frame image of interest and one frame image from either before or after it. Specifically, the selection unit 12A may select the frame image closest to the frame image of interest from among the frame images whose determination flag calculated by the determination unit 11 is "1". When such frame images exist both before and after the frame image of interest, the selection unit 12A selects only one of them, on a preset side.
  • FIG. 6 shows an example of a case where a frame image at a time earlier than the target frame image is selected. In this case, the selection unit 12A uses the frame image thus selected and the frame image of interest as a pair of motion estimation frame images.
  • the number of images to be processed by the motion estimation unit 12 and the image generation unit 13 is reduced, so that high-speed processing can be realized.
  • this selection method is based on the assumption that corresponding points can be detected in the frame image of interest.
  • the first estimation unit 12B estimates pixel motion caused by camera or subject motion between a pair of motion estimation frame images. Motion estimation is performed on a combination (pair) of any two frame images of the motion estimation frame images. The first estimation unit 12B performs motion estimation on at least one set of one or a plurality of pairs.
  • the first estimation unit 12B performs motion estimation on a pair of two frame images selected one by one from before and after the target frame image.
  • the first estimation unit 12B may perform motion estimation on a pair composed of the target frame image and one of the frame images selected from before and after.
  • When a plurality of frame images are selected from before and after the frame image of interest, the first estimation unit 12B compares the rectangular region luminance of the frame image of interest with that of the rectangular region at the same position in each selected frame image, and detects the regions in which the change rate of the rectangular region luminance exceeds a threshold γ. The first estimation unit 12B then forms pairs of frame images that share a common region in which the change rate exceeds the threshold γ, and performs motion estimation on the common region of each pair (the region surrounded by the dotted line in FIG. 7).
  • the threshold value ⁇ may be a preset value, but an appropriate value may be dynamically set so that motion estimation can be performed in a certain area.
  • Alternatively, using the determination flags between frame images other than the frame image of interest, which are input from the determination unit 11, the first estimation unit 12B may perform motion estimation on a pair of frame images whose mutual determination flag is "0".
  • The first estimation unit 12B may also perform motion estimation on a pair consisting of the frame image of interest and a frame image selected from either before or after it.
  • Because camera motion causes global motion of the entire screen, the image motion caused by the camera between a pair of motion estimation frame images can be expressed by an affine transformation.
  • Affine transformation is a geometric transformation that combines translation between two images and linear transformation (enlargement / reduction, rotation, skew).
  • This affine transformation is written as Equation (6). By obtaining the QR decomposition of the linear transformation matrix of Equation (6), Equation (6) can be rewritten as Equation (7).
  • The affine transformation parameters can be calculated by detecting, for three or more pixels on the image I, the corresponding points on the image I′, and substituting each pair of coordinates into Equation (7).
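• Since Equations (6) and (7) are not reproduced on this page, the following sketch estimates a standard six-parameter affine transformation from three or more corresponding points by least squares; the QR factorization into a rotation and an upper-triangular matrix mentioned above can then be applied to the resulting linear part. Function and variable names are illustrative.

```python
import numpy as np

def fit_affine(src_pts, dst_pts):
    """Least-squares fit of a 2-D affine transform mapping points on image I
    (src_pts, shape (N, 2), N >= 3) to corresponding points on image I'
    (dst_pts).  Returns the 2x2 linear matrix A and translation t such that
    dst ~= src @ A.T + t."""
    src = np.asarray(src_pts, dtype=np.float64)
    dst = np.asarray(dst_pts, dtype=np.float64)
    ones = np.ones((src.shape[0], 1))
    M = np.hstack([src, ones])                        # (N, 3) design matrix
    params, *_ = np.linalg.lstsq(M, dst, rcond=None)  # (3, 2) solution
    A = params[:2, :].T                               # 2x2 linear part
    t = params[2, :]                                  # translation
    return A, t

# The factorization used in Equation (7) corresponds to A = Q @ R with Q a
# rotation (QR decomposition), yielding the parameters (theta, a, b, d).
```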
  • the first estimation unit 12B can detect corresponding points by the following method, for example.
  • the first estimation unit 12B calculates an optical flow for the pixel P on the image I, and sets the pixel P ′ to which the pixel P is moved as a corresponding point.
  • a method based on the Lucas-Kanade method or the Horn-Schunck method can be cited.
  • the Lucas-Kanade method is a method for calculating the amount of movement of an image based on a constraint condition in which pixel values are approximately the same before and after movement (Non-Patent Document 3).
  • the Horn-Schunck method is a method for calculating the amount of movement of an image by minimizing the error function of the entire image while taking into account the smoothness between adjacent optical flows (Non-Patent Document 4).
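• A sketch of corresponding-point detection by the Lucas-Kanade method follows, assuming OpenCV's pyramidal implementation is available; the patent itself does not prescribe a particular library, so this is only one possible realization.

```python
import cv2
import numpy as np

def corresponding_points_lk(img_I, img_I_dash, pts):
    """Detect corresponding points P' on image I' for pixels P on image I
    using pyramidal Lucas-Kanade optical flow.  img_I and img_I_dash are
    8-bit grayscale frames; pts is an (N, 2) float array of P coordinates."""
    p0 = pts.reshape(-1, 1, 2).astype(np.float32)
    p1, status, _err = cv2.calcOpticalFlowPyrLK(img_I, img_I_dash, p0, None)
    ok = status.ravel() == 1                      # keep points that were tracked
    return p0[ok].reshape(-1, 2), p1[ok].reshape(-1, 2)
```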
  • Alternatively, the first estimation unit 12B may identify the region R′ on the image I′ that corresponds to the region R on the image I, and take the pixel P′ corresponding to the center coordinates of the region R′ as the corresponding point of the pixel P at the center coordinates of the region R.
  • The regions R and R′ may be rectangular regions obtained by dividing the images I and I′ into a grid of a predetermined size, or may be clusters generated by clustering pixels based on image features such as color and texture.
  • the first estimation unit 12B can detect the region R ′ by template matching using the region R as a template.
  • As the similarity index used for template matching, the first estimation unit 12B may use, for example, SSD (Sum of Squared Differences), SAD (Sum of Absolute Differences), or normalized cross-correlation (ZNCC: Zero-mean Normalized Cross-Correlation).
  • The normalized cross-correlation (R_ZNCC) is calculated from the template and image luminance values (T(i, j) and I(i, j)) and their average values (T_ave and I_ave), as shown in Equation (8).
  • Because the averages are subtracted, the similarity can be evaluated stably even when the brightness varies. Therefore, by using normalized cross-correlation, the first estimation unit 12B can detect the region R′ more stably than with the other indices, even when there is a luminance difference between the pair of motion estimation frame images due to flash light.
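• A NumPy sketch of the zero-mean normalized cross-correlation of Equation (8), and of a brute-force template search based on it, follows; the exhaustive scan and the function names are illustrative only.

```python
import numpy as np

def zncc(template, patch):
    """Zero-mean normalized cross-correlation between a template T and an
    equally sized image patch I: luminance values are centred on their
    averages (T_ave, I_ave) before correlation, which makes the score
    insensitive to brightness differences such as those caused by flash
    light.  Returns a value in [-1, 1]."""
    T = template.astype(np.float64) - template.mean()
    I = patch.astype(np.float64) - patch.mean()
    denom = np.sqrt((T ** 2).sum() * (I ** 2).sum())
    return float((T * I).sum() / denom) if denom > 0 else 0.0

def match_template_zncc(image, template):
    """Find the top-left corner of the region R' in `image` that maximizes
    the ZNCC score with `template` (the region R), by exhaustive scan."""
    h, w = template.shape
    H, W = image.shape
    scores = np.full((H - h + 1, W - w + 1), -1.0)
    for y in range(H - h + 1):
        for x in range(W - w + 1):
            scores[y, x] = zncc(template, image[y:y + h, x:x + w])
    return np.unravel_index(np.argmax(scores), scores.shape)
```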
  • The first estimation unit 12B may also detect the pixel P′ corresponding to the pixel P at the center coordinates of the region R using the optical flow. For example, the first estimation unit 12B may use the representative value (weighted average or median) of the optical flow estimated for each pixel in the region R as the movement amount of the region R, and take the pixel P′ to which the pixel P is moved by this movement amount as the corresponding point.
  • Alternatively, the first estimation unit 12B may extract pixels P corresponding to feature points from the image I and take, as corresponding points, the pixels P′ on the image I′ to which those pixels have moved.
  • The first estimation unit 12B may use, for example, corner points detected by the Harris corner detection algorithm as feature points. The Harris corner detection algorithm is based on the observation that at a point on an edge the first derivative (difference) is large in only one direction, whereas at a corner point it is large in multiple directions; it extracts points at which the Harris operator dst(x, y), which expresses this property, takes a large positive local maximum.
  • fx and fy mean primary differential values (differences) in the x and y directions, respectively.
  • G ⁇ means smoothing by a Gaussian distribution with a standard deviation ⁇ .
  • k is a constant, and a value from 0.04 to 0.15 is empirically used.
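• A sketch of the Harris operator described above follows, assuming SciPy's Gaussian filter for the smoothing G_sigma; the values of sigma and k are illustrative (k is typically chosen between 0.04 and 0.15).

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def harris_response(gray, sigma=1.0, k=0.06):
    """Harris operator dst(x, y): first derivatives fx and fy, their products
    smoothed by a Gaussian G_sigma, and the response det(M) - k * trace(M)^2.
    Corner candidates are points where the response has a large positive
    local maximum."""
    img = gray.astype(np.float64)
    fy, fx = np.gradient(img)                 # first differential values
    Sxx = gaussian_filter(fx * fx, sigma)     # smoothed structure-tensor entries
    Syy = gaussian_filter(fy * fy, sigma)
    Sxy = gaussian_filter(fx * fy, sigma)
    det = Sxx * Syy - Sxy ** 2
    trace = Sxx + Syy
    return det - k * trace ** 2
```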
  • the first estimation unit 12B may identify the corresponding point based on the optical flow detected at the feature point.
  • Alternatively, when the image feature value (for example, a SIFT (Scale-Invariant Feature Transform) feature value) extracted from an image patch containing a feature point of the image I is similar to the image feature value extracted from an image patch of the image I′, the first estimation unit 12B may take the center of that patch of the image I′ as the corresponding point P′.
  • The first estimation unit 12B may calculate the affine transformation parameters from three reliable combinations of corresponding points among the corresponding points detected by the above methods, or may calculate them by the least-squares method based on a larger number of combinations of corresponding points. Alternatively, the first estimation unit 12B may calculate the affine transformation parameters using a robust estimation method such as RANSAC (RANdom SAmple Consensus). RANSAC is a method that repeatedly computes tentative affine transformation parameters from three randomly selected combinations of corresponding points and adopts, as the true affine transformation parameters, tentative parameters that are consistent with many of the remaining combinations of corresponding points.
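• A sketch of the RANSAC procedure just described follows: tentative affine parameters are computed from three randomly chosen correspondences and the parameters supported by the most correspondences are kept. The iteration count, inlier threshold, and names are illustrative.

```python
import numpy as np

def ransac_affine(src_pts, dst_pts, n_iter=500, inlier_thresh=3.0, rng=None):
    """RANSAC estimation of affine parameters from corresponding points.
    src_pts/dst_pts: (N, 2) arrays of corresponding points (N >= 3).
    Returns the (3, 2) parameter matrix mapping (x, y, 1) -> (x', y') and
    the number of supporting correspondences."""
    rng = np.random.default_rng(rng)
    src = np.asarray(src_pts, float)
    dst = np.asarray(dst_pts, float)
    M = np.hstack([src, np.ones((len(src), 1))])         # (N, 3)
    best_params, best_inliers = None, -1
    for _ in range(n_iter):
        idx = rng.choice(len(src), size=3, replace=False)
        try:
            params = np.linalg.solve(M[idx], dst[idx])   # exact 3-point fit
        except np.linalg.LinAlgError:
            continue                                     # degenerate (collinear) sample
        residuals = np.linalg.norm(M @ params - dst, axis=1)
        n_inliers = int((residuals < inlier_thresh).sum())
        if n_inliers > best_inliers:
            best_inliers, best_params = n_inliers, params
    return best_params, best_inliers
```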
  • the first estimation unit 12B may exclude a specific image region from the calculation target of the affine transformation parameter.
  • Examples of such image regions are regions where the corresponding-point detection accuracy is known to be low, such as the edge portions of an image, which are likely to fall outside the shooting range when the camera moves, or flat portions with little luminance difference from adjacent pixels.
  • Other examples are image regions whose pixel values change due to factors other than camera movement, such as a region in the center of the screen where a moving subject is likely to appear, or a portion illuminated by fixed lighting whose color changes.
  • The combination of the methods (12B-1), (12B-2), and (12B-3) with the methods (12A-1), (12A-2), and (12A-3) described above is not particularly limited. That is, the first estimation unit 12B may execute any of (12B-1), (12B-2), and (12B-3) on the motion estimation frame images selected by any of the methods (12A-1), (12A-2), and (12A-3).
  • the first estimation unit 12B may use camera motion information acquired by a measuring instrument (gyroscope, depth sensor, etc.) mounted on the camera in addition to the motion estimation by the image processing described above.
  • Second estimation unit 12C: Regarding the image motion caused by the motion of the subject, the second estimation unit 12C obtains it by detecting the subject region from one of the pair of motion estimation frame images and estimating the corresponding region (the region corresponding to the subject region) in the other. Alternatively, the second estimation unit 12C may generate converted images by applying the affine transformation to one or both of the pair of motion estimation frame images, and may detect the subject region from one of the pair of motion estimation frame images or from its converted image. In that case, the second estimation unit 12C may obtain the image motion caused by the motion of the subject by estimating the corresponding region in the other frame image of the pair or in its converted image.
  • In other words, the second estimation unit 12C detects a pair consisting of the subject region and its corresponding region by subtracting, based on the affine transformation parameters, the image movement caused by the camera motion from the pair of motion estimation frame images. Based on this pair of regions, the second estimation unit 12C estimates the amount of image movement caused by the motion of the subject.
  • Examples of the subject area detection method include the following methods.
  • the second estimation unit 12C detects an image (a set of pixels) that moves differently from the movement amount estimated by the affine transformation parameter from one of the pair of motion estimation frame images as a subject area.
  • Specifically, for each pixel P of the image I, the second estimation unit 12C uses Equation (7) and the affine transformation parameters calculated between the image I and the image I′ to calculate a prediction vector (u, v) from the image I to the image I′.
  • The second estimation unit 12C then selects the pixel P as a candidate point when the difference between the vector (x′ - x, y′ - y) from the pixel P to its corresponding pixel P′ and the prediction vector (u, v) is equal to or greater than a certain value.
  • calculating the vector difference means subtracting the amount of movement of the image due to the movement of the camera.
  • the second estimation unit 12C detects the set of candidate points as the subject area of the image I.
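• A sketch of this subject-region detection, which subtracts the camera-motion prediction from the observed per-pixel motion, follows; it assumes a dense flow field is available, and the residual threshold and names are illustrative.

```python
import numpy as np

def detect_subject_region(flow, A, t, min_residual=2.0):
    """Detect the subject region as the set of pixels whose motion differs
    from the motion predicted by the affine (camera-motion) parameters.
    flow: (H, W, 2) per-pixel motion vectors from image I to image I'
    (e.g. dense optical flow); A (2x2) and t (2,) are the affine parameters.
    The prediction vector (u, v) = A @ (x, y) + t - (x, y) is subtracted
    from the observed vector; pixels whose residual is at least
    `min_residual` pixels become candidate points of the subject region."""
    H, W = flow.shape[:2]
    ys, xs = np.mgrid[0:H, 0:W]
    coords = np.stack([xs, ys], axis=-1).astype(np.float64)   # (x, y) per pixel
    predicted = coords @ A.T + t - coords                     # prediction vectors (u, v)
    residual = np.linalg.norm(flow - predicted, axis=-1)
    return residual >= min_residual                           # boolean subject-region mask
```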
  • Alternatively, the second estimation unit 12C may calculate the difference between a converted image generated by affine transformation of one of the pair of motion estimation frame images and a converted image generated by affine transformation (inverse transformation) of the other, and detect regions with a large difference as subject regions in both converted images.
  • Specifically, using Equation (7) and the affine transformation parameters calculated between the image I and the image I′, the second estimation unit 12C generates a predicted image I_p at an arbitrary time t from the image I.
  • Similarly, the second estimation unit 12C generates a predicted image I_p′ at time t from the image I′ based on the affine transformation parameters calculated between the image I and the image I′.
  • The second estimation unit 12C calculates the difference between the predicted images I_p and I_p′, and detects the set of pixels whose absolute difference is equal to or larger than a certain value as the subject region in each of the predicted images I_p and I_p′.
  • The second estimation unit 12C can generate the pixel (x_p, y_p) on the predicted image I_p by substituting the pixel (x, y) of the image I into Equation (9).
  • Here, the affine transformation parameters between the image I and the image I_p are (θ_p, a_p, b_p, d_p, t_px, t_py).
  • (θ_p, a_p, b_p, d_p, t_px, t_py) can be calculated by the relational expressions that follow, where the affine transformation parameters from the image I to the image I′ are (θ, a, b, d, t_x, t_y), the time difference between the image I and the image I′ is T, and the time difference between the image I and the image I_p is T_p.
  • The second estimation unit 12C may also calculate (θ_p, a_p, b_p, d_p, t_px, t_py) by weighting the rate of change.
  • Likewise, the second estimation unit 12C can generate the pixel (x_p′, y_p′) of the predicted image I_p′ by substituting the pixel (x′, y′) of the image I′ into Equation (10).
  • Here, the affine transformation parameters between the image I′ and the image I_p′ are (θ_p′, a_p′, b_p′, d_p′, t_px′, t_py′), and they are obtained by similar relational expressions, where the affine transformation parameters from the image I′ to the image I are (θ′, a′, b′, d′, t_x′, t_y′), the time difference between the image I and the image I′ is T, and the time difference between the image I′ and the image I_p′ is T_p′.
  • The second estimation unit 12C may also detect a region having a large difference between the converted image generated by affine transformation of one of the pair of motion estimation frame images and the other frame image, as the subject region in each of the converted image and the frame image.
  • This detection method is a derivative of (12C-1-2).
  • Specifically, using Equation (7) and the affine transformation parameters calculated between the image I and the image I′, the second estimation unit 12C generates a predicted image at time t+k from the image I and calculates its difference from the image I′.
  • When the second estimation unit 12C has detected the subject region, it estimates the corresponding region for the detected subject region. Examples of methods for estimating the corresponding region of the subject region include the following; the second estimation unit 12C may use each method alone or in combination.
  • The second estimation unit 12C calculates, for all pixels in the subject region detected in one of the pair of motion estimation frame images, the optical flow with respect to the other frame image, and detects, as the corresponding region, the destination obtained by moving the subject region by the weighted average of the optical flow.
  • Alternatively, the second estimation unit 12C may calculate, for all pixels in the subject region detected in the converted image generated by affine transformation of one of the pair, the optical flow with respect to the other frame image or its converted image.
  • the second estimation unit 12C may give a high weight to the optical flow of the pixels close to the center of gravity of the subject region as the weight used in the calculation of the weighted average of the optical flow.
  • The second estimation unit 12C may also give a high weight to the optical flow of pixels in the subject region that have a large luminance gradient with respect to their surroundings, or to the optical flow of pixels for which the variance of the direction or magnitude relative to the optical flows calculated for the surrounding pixels is small.
  • The second estimation unit 12C may also exclude, as outliers, flows whose magnitude is greater than or equal to a certain value or less than or equal to another value among the optical flows of the subject region, and weight the remaining optical flows equally.
  • By setting the weights based on the luminance gradient and on the variance of the direction or magnitude of the optical flow, the second estimation unit 12C can estimate the position of the corresponding region based on highly reliable optical flows.
  • The second estimation unit 12C uses, as a template, the subject region detected in one of the pair of motion estimation frame images or in its affine-transformed converted image, and detects the corresponding region by template matching while scanning the other frame image or its affine-transformed converted image.
  • the second estimation unit 12C may use any of the indices described in (12B-2) as a similarity index used for template matching, or may use another method.
  • Alternatively, the second estimation unit 12C may detect the corresponding region based on the distance (for example, the Euclidean distance) between image feature values expressing color and texture. For example, the second estimation unit 12C may extract an image feature value from the subject region detected in one of the pair of motion estimation frame images and detect, as the corresponding region, a region of the other frame image whose image feature value is at a small distance from the extracted one.
  • The second estimation unit 12C may also roughly estimate the position of the corresponding region by template matching using the entire subject region as a template, and then determine the corresponding region by searching again around each partial region generated by dividing the subject region.
  • The second estimation unit 12C detects feature points in the subject region detected in one of the pair of motion estimation frame images or in its affine-transformed converted image, and detects the optical flow by detecting the points corresponding to those feature points in the other frame image or its converted image.
  • the second estimator 12C detects, as a corresponding region, a destination that has moved the subject region by the weighted average of the detected optical flow.
  • the second estimation unit 12C may use, for example, a Harris corner point as a feature point, or may use a feature point detected by another method.
  • The second estimation unit 12C may execute any of (12C-2-1), (12C-2-2), and (12C-2-3) on the subject region detected by any of the methods (12C-1-1), (12C-1-2), and (12C-1-3).
  • After detecting the subject region and estimating its corresponding region, the second estimation unit 12C estimates the motion of the subject. Examples of methods for estimating the motion of the subject include the following.
  • (12C-3-1) Motion estimation method 1: When the second estimation unit 12C has detected, as the subject region, a set of pixels in one of the pair of motion estimation frame images that move differently from the movement amount estimated by the affine transformation parameters (12C-1-1), it estimates the motion of the subject by the following method.
  • the second estimation unit 12C calculates the difference between the position information (coordinates) representing the position of the subject area and the position information of the corresponding area, and uses this as a temporary movement vector of the subject area.
  • the second estimation unit 12C calculates a difference between the temporary movement vector and the movement vector of the image due to the camera movement in the pair of motion estimation frame images, and sets the difference as the true movement vector of the subject area between the pair. .
  • (12C-3-2) Motion estimation method 2: When the second estimation unit 12C has detected, as subject regions in both converted images, regions having a large difference between the converted images generated by affine transformation of each of the pair of motion estimation frame images (12C-1-2), it estimates the motion of the subject by the following method.
  • The second estimation unit 12C calculates the difference between the position information of the subject region in one converted image and the position information of the corresponding region detected in the other converted image, and takes this difference as the true movement vector of the subject between the pair of motion estimation frame images.
  • (12C-3-3) Motion estimation method 3: When the second estimation unit 12C has detected, as the subject region, a region having a large difference between the converted image generated by affine transformation of one of the pair of motion estimation frame images and the other frame image (12C-1-3), it estimates the motion of the subject by the following method.
  • The second estimation unit 12C calculates the difference between the position information of the subject region in the converted image and the position information of the corresponding region detected in the other frame image, and takes this difference as the true movement vector of the subject between the pair of motion estimation frame images. This estimation method is a derivative form of (12C-3-2) described above.
  • the motion estimation unit 12 outputs the estimated motion information to the image generation unit 13.
  • the motion information includes at least one of motion information caused by camera motion and motion information caused by subject motion.
  • the motion estimation unit 12 outputs the time of each frame of the pair of motion estimation frame images used for motion estimation and the affine transformation parameters calculated between the pair as motion information resulting from the motion of the camera.
  • the motion estimation unit 12 outputs motion information resulting from the motion of the camera by the number of pairs of motion estimation frame images that have undergone motion estimation.
  • The motion estimation unit 12 outputs, as motion information resulting from the motion of the subject, each frame image of the motion estimation frame image pair used for estimating the motion of the subject and its time, the position information of the subject region, the position information of the corresponding region of the subject region, and the true movement vector of the subject.
  • the position information of the subject region represents one coordinate of the pair of motion estimation frame images.
  • the position information of the corresponding region represents the other coordinate of the pair of motion estimation frame images.
  • When the motion estimation unit 12 detects the subject region and estimates its corresponding region in the converted images generated by affine transformation of the pair of motion estimation frame images, the motion estimation unit 12 outputs the motion information resulting from the motion of the subject as follows.
  • The motion estimation unit 12 outputs the time of each frame of the pair of motion estimation frame images used for the motion estimation of the subject, the position information of the subject region, the position information of the corresponding region of the subject region, and the true movement vector of the subject.
  • the position information of the subject region represents coordinates in a converted image generated by affine transformation of one of the pair of motion estimation frame images.
  • the position information of the corresponding region represents coordinates in a converted image generated by affine transformation of the other of the pair of motion estimation frame images.
  • the motion estimation unit 12 outputs motion information resulting from the motion of the subject for the number of pairs of motion estimation frame images for which motion estimation has been performed.
  • FIG. 8 is a block diagram illustrating a configuration of the image generation unit 13.
  • the image generation unit 13 includes a first correction unit 13A, a second correction unit 13B, and a synthesis unit 13C.
  • The image generation unit 13 receives a plurality of frame images, the analysis information from the determination unit 11, and the motion information from the motion estimation unit 12 as inputs. When the frame image of interest is determined to be a frame image including a bright region caused by the blinking of light, the image generation unit 13 corrects the motion estimation frame images to images at the time of the frame image of interest and outputs them as corrected frame images.
  • the first correction unit 13A first generates a first corrected image by correcting the motion of the camera for each motion estimation frame image.
  • the second correction unit 13B generates a second corrected image by correcting the motion of the subject for each motion estimation frame image.
  • The synthesizing unit 13C combines the second corrected images generated for the respective motion estimation frame images to generate a corrected frame image.
  • the first correction unit 13A corrects the camera motion by, for example, the following method based on the image data of the pair of motion estimation frame images and the affine transformation parameters calculated between the pair.
  • the first correction unit 13A determines that there is no camera movement when each value of the affine transformation parameter is smaller than a preset threshold value, and does not need to correct the camera movement. In this case, the first correction unit 13A regards the uncorrected motion estimation frame image as the first corrected image.
  • When the frame images closest to the frame image of interest that do not include a bright region are selected one each from before and after it as the motion estimation frame images (12A-1), the first correction unit 13A generates the first corrected images by the following method. The first correction unit 13A uses the affine transformation parameters calculated between the two selected frame images to generate a corrected image from each of these frame images.
  • Specifically, taking one of the motion estimation frame images as the image I and the other as the image I′, the first correction unit 13A generates the predicted images I_p and I_p′ at the time t of the frame image of interest as the first corrected images.
  • When a plurality of frame images are selected as the motion estimation frame images from before and after the frame image of interest (12A-2), the first correction unit 13A generates the first corrected images by the following method.
  • the first correction unit 13A generates a first correction image from each pair based on each affine transformation parameter calculated from a plurality of pairs of motion estimation frame images.
  • For each pair, taking one of the motion estimation frame images as the image I and the other as the image I′, the first correction unit 13A generates the predicted images I_p and I_p′ at time t as first corrected images. For example, as illustrated in FIG. 7, when two frame images are selected from before and after the frame image of interest and motion estimation is performed for two pairs, the four predicted images generated for the selected frames at the time of the frame image of interest are used as the first corrected images.
  • When the frame image of interest and one frame image from before or after it are selected as the motion estimation frame images (12A-3), the first correction unit 13A generates the first corrected image by the following method: it generates the first corrected image from the selected frame image based on the affine transformation parameters calculated between the frame image of interest and the selected frame image.
  • Specifically, taking the frame image selected as the motion estimation frame image as the image I, the first correction unit 13A generates the predicted image I_p at the time t of the frame image of interest as the first corrected image.
  • The second correction unit 13B corrects the movement of the subject by updating the pixel information at the position of the subject at the time of the frame image of interest.
  • the second correction unit 13B can correct the movement of the subject by the following method.
  • The second correction unit 13B may determine that there is no movement of the subject when each value of the true movement vector of the subject is smaller than a preset threshold, and in that case it does not correct the movement of the subject and regards the first corrected image as the second corrected image.
  • Based on the true movement vector of the subject between the pair of motion estimation frame images and on the time information of the pair and of the frame image of interest, the second correction unit 13B determines the true movement vector of the subject between each frame image of the pair and the frame image of interest.
  • Using the pixel values of the subject region identified in the first corrected image, the second correction unit 13B updates the pixel values at the coordinates reached by moving from the coordinates of that subject region by the true movement vector, and the pixel values at the coordinates of the subject region identified in the first corrected image. The second correction unit 13B thereby generates a second corrected image.
  • For example, the second correction unit 13B may update the pixel value at the destination by replacing it with the pixel value of the subject region, by replacing it with a weighted average of the destination pixel value and the pixel value of the subject region, or by replacing it with a weighted average based on the pixel values around the destination and the pixel values of the subject region.
• the second correction unit 13B may replace the pixel value at the coordinates of the subject area with the pixel value at the position shifted by the inverse vector of the true movement vector.
• the second correction unit 13B may also replace the pixel value at the coordinates of the subject area with a weighted average with the pixel value at the position shifted by the inverse vector of the true movement vector, or with a weighted average of that pixel value and its surrounding pixels.
  • the true movement vector of the subject between each frame image of the pair of frame images for motion estimation and the target frame image is obtained by the following equation.
  • the true movement vector of the subject area between the frame images I1 and I2 constituting the pair of motion estimation frame images is V
  • the times of the frame images I1 and I2 are T1 and T2, respectively
• the time of the frame image of interest is T3 (T1 < T3 < T2).
• the second correction unit 13B regards a pixel of the first corrected image that corresponds to a pixel determined to be in the subject area in the motion estimation frame image as a pixel of the subject area, and can thereby specify the subject area in the first corrected image.
  • the combining unit 13C can generate a corrected frame image by combining a plurality of second corrected images.
• the combining unit 13C can generate the corrected frame image Ic using equation (13).
  • the number of second correction images is N
  • the weight is wi.
• the weight wi is set larger as the absolute value of Di (the time difference between the frame image of interest and the frame image from which the i-th second corrected image was generated) becomes smaller.
• the combining unit 13C may calculate wi based on a function that decreases linearly as |Di| increases.
  • the image synthesizing unit 14 synthesizes the frame image of interest and the correction frame image to generate and output a frame image (hereinafter referred to as “output frame image”) in which blinking due to flash or the like is suppressed.
• when the frame image of interest is determined to include a blinking region, the image composition unit 14 calculates a composition ratio for each pixel and generates an output image by composition processing. In other cases, the image composition unit 14 uses the input frame image of interest as the output frame image as it is. Given the composition ratio u(x, y) at the target pixel It(x, y) at the position (x, y), the image composition unit 14 calculates the value Iout(x, y) of the output frame image at the same position as shown in equation (14) (a sketch of this composition and of the weighted combination above appears after this list).
  • the image composition unit 14 can calculate the composition ratio using the change rate of the local area luminance between the target frame image and the corrected frame image.
• the image composition unit 14 can calculate the local region luminance change rate rt-es between the frame image of interest and the corrected frame image using a method similar to that by which the determination unit 11 calculates the local region luminance change rate.
• the image composition unit 14 determines the composition ratio u(x, y) at the target pixel at the position (x, y) according to the change rate rt-es(x, y) of the local region luminance at the same position (x, y).
• the image composition unit 14 calculates the composition ratio u(x, y) so that the change rate of the local area luminance in the output frame image becomes rtar(x, y).
• the image composition unit 14 may calculate the composition ratio using the change rate of the rectangular area luminance. Specifically, the image composition unit 14 first calculates the composition ratio U for each rectangular area from the rectangular area luminance change rate Rt-es, computed with the same method as the determination unit 11, and the change rate of the rectangular area luminance of the output frame image that is preset in correspondence with Rt-es. Next, the image composition unit 14 obtains the composition ratio u for each pixel from the composition ratio U for each rectangular area using linear interpolation or bicubic interpolation.
  • the determination unit 11 determines whether or not the frame image of interest at time t is a frame image including a bright region caused by blinking of light due to flash or the like that may induce a photosensitivity seizure (S11).
• the motion estimation unit 12 selects a motion estimation frame image from a plurality of frame images including the frame image of interest, and estimates the amount of image movement due to the motion of the camera and the subject between the motion estimation frame images (S12).
• the image generation unit 13 estimates the image movement caused by the camera and the subject between each motion estimation frame image and the target frame image, based on the pixel movement amounts due to the camera and subject motion estimated between the motion estimation frame images. In addition, the image generation unit 13 converts each motion estimation frame image into an image at the time of the target frame image, and generates a corrected frame image by synthesizing the converted images (S13).
  • the image synthesizing unit 14 synthesizes the attention frame image and the correction frame image, and generates and outputs an output frame image in which blinking due to flash or the like is suppressed (S14).
• the video processing apparatus 100 can generate a natural video in which variation in luminance is suppressed, even for a video including large luminance changes that may induce a photosensitivity seizure.
• the video processing apparatus 100 synthesizes, onto a target frame image including a region with a large luminance change, a frame image without that luminance change estimated from other frame images, while changing the weight for each pixel. As a result, the video processing apparatus 100 can correct only an area where there is a large luminance change and restore information lost due to blinking or the like.
• blinking by flash or the like occurs, for example, at a press conference.
• a subject (a participant in the conference) sits down at the conference seat and leaves after the conference.
  • the camera follows the subject as the subject moves. In this case, the shooting range of the camera moves following the subject.
• because the video processing apparatus 100 corrects the image by estimating the movement of the camera and the subject, it can suppress blurring and smearing of contours and generate a smooth video.
  • the video processing apparatus 100 can be similarly applied to a case where the blinking region is a dark region that is darker than the other frame images by a predetermined level or more (the luminance is reduced) in the frame image of interest.
• the determination unit 11 determines whether the frame image of interest at time t contains an area that is darker, by a predetermined level or more, than the frame image at time (t + k) among the plurality of input frame images. For example, the determination unit 11 uses the preset threshold value α′ of the luminance fluctuation rate and the threshold value β′ of the area rate, and makes the judgment based on whether the area rate of the region where the local region luminance change rate rt−t+k is lower than the threshold value α′ exceeds the threshold β′.
• when it is determined that the frame image of interest at time t contains a region that is darker, by a predetermined level or more, than the input frame image at time (t + k), the determination unit 11 sets the determination flag flagt-t+k to “1”. Otherwise, the determination unit 11 sets the determination flag flagt-t+k to “0”. The determination unit 11 calculates a determination flag for the combination of the target frame image and every other input frame image, and determines that the target frame image is a frame image including a dark region due to blinking of light when there is a frame image whose determination flag is “1” at each of the times before and after the target frame image.
• the determination unit 11 may instead use a method based on the change rate of the rectangular area luminance. For example, using the preset threshold value α′ for the luminance fluctuation rate and the threshold value β′ for the area ratio, the determination unit 11 sets the determination flag flagt-t+k to “1” or “0” depending on whether or not the area ratio of the region where the change rate of the rectangular area luminance is lower than the threshold value α′ exceeds the threshold β′.
  • the video processing apparatus 100 can be similarly applied to a change in saturation such as red flash. Therefore, the above-described embodiment may include a mode in which “luminance” is replaced with “saturation” or “luminance or saturation”.
  • the embodiment according to the present invention can be applied to a video editing system for editing video recorded on a hard disk or the like.
  • the embodiment according to the present invention can be applied to a video camera, a display terminal, and the like by using a frame image held in a memory.
  • the embodiment according to the present invention can be configured by hardware, but can also be realized by a computer program.
  • the video processing apparatus 100 realizes the same functions and operations as those in the above-described embodiment by a processor that operates according to a program stored in the program memory.
  • only a part of the functions can be realized by a computer program.
  • FIG. 11 is a block diagram illustrating a hardware configuration of the computer apparatus 200 that implements the video processing apparatus 100.
• the computer apparatus 200 includes a CPU (Central Processing Unit) 201, a ROM (Read Only Memory) 202, a RAM (Random Access Memory) 203, a storage device 204, a drive device 205, a communication interface 206, and an input/output interface 207.
  • the video processing apparatus 100 can be realized by the configuration (or part thereof) shown in FIG.
  • the CPU 201 executes the program 208 using the RAM 203.
  • the program 208 may be stored in the ROM 202.
  • the program 208 may be recorded on a recording medium 209 such as a flash memory and read by the drive device 205 or transmitted from an external device via the network 210.
  • the communication interface 206 exchanges data with an external device via the network 210.
  • the input / output interface 207 exchanges data with peripheral devices (such as an input device and a display device).
  • the communication interface 206 and the input / output interface 207 can function as means for acquiring or outputting data.
  • the video processing apparatus 100 may be configured by a single circuit (such as a processor) or a combination of a plurality of circuits.
  • the circuit here may be either dedicated or general purpose.
• (Appendix 1) A video processing apparatus comprising: determination means for determining whether any of a plurality of temporally continuous frame images is a frame image of interest including a blinking region whose luminance or saturation differs by a predetermined level or more from the preceding and following frame images; motion estimation means for estimating a first movement amount caused by movement of the camera and/or a second movement amount caused by movement of the subject, based on a pair of frame images selected, on the basis of a difference in luminance or saturation, from the frame image of interest and the frame images before and after it; image generation means for generating a correction frame image corresponding to a frame image at the shooting time of the frame image of interest, based on the selected pair and the estimated first movement amount and/or second movement amount; and image synthesizing means for synthesizing the frame image of interest and the correction frame image.
• (Appendix 2) The video processing apparatus according to appendix 1, wherein the motion estimation means includes selection means for selecting at least one of the pair from frame images other than the frame image of interest.
• (Appendix 3) The video processing apparatus according to appendix 2, wherein the motion estimation means includes first estimation means for calculating a geometric transformation parameter based on a positional relationship between corresponding points or corresponding regions detected between the pair of frame images, and estimating the first movement amount.
• the motion estimation means may include second estimation means for detecting a subject area from one frame image of the pair based on the first movement amount, detecting a corresponding area corresponding to the subject area from the other frame image of the pair, and estimating the second movement amount based on the subject area and the corresponding area.
• the motion estimation means may include second estimation means for detecting a subject area from each frame image of the pair by subtracting the first movement amount based on the geometric transformation parameter, and estimating the second movement amount based on the detected subject area.
• the image generation means includes: first correction means for generating a first corrected image from each frame image of the pair based on the first movement amount; second correction means for generating a second corrected image from each of the first corrected images based on the second movement amount; and synthesis means for combining the second corrected images (the video processing apparatus according to any one of appendices 1 to 5).
• the determination means determines, as the frame image of interest, a frame image in which a region whose change rate of luminance or saturation with respect to another frame image is equal to or greater than a specified value occupies at least a specified area.
• the image composition means calculates the composition ratio for combining the frame image of interest and the correction frame image based on a predetermined function (the video processing apparatus according to any one of appendices 1 to 7).
• the image composition means sets, as the composition ratio for compositing the frame image of interest and the correction frame image, a larger ratio of the correction frame image for an area where the rate of change between the frame image of interest and the correction frame image is large (the video processing device according to any one of appendices 1 to 8).
• a subject area is detected from one frame image of the pair based on the first movement amount, a corresponding area corresponding to the subject area is detected from the other frame image of the pair, and the second movement amount is estimated based on the subject area and the corresponding area.
• a subject area is detected from each frame image of the pair by subtracting the first movement amount based on the geometric transformation parameter, and the second movement amount is estimated based on the detected subject area.
• (Appendix 17) The video processing method according to any one of appendices 10 to 16, wherein a synthesis ratio for synthesizing the frame image of interest and the correction frame image is calculated based on a predetermined function.
• the composition ratio of the correction frame image is increased for an area where the rate of change between the frame image of interest and the correction frame image is large.
• (Appendix 22) In the estimation process, a subject area is detected from each frame image of the pair by subtracting the first movement amount based on the geometric transformation parameter, and the second movement amount is estimated based on the detected subject area.
• a frame image that includes a region whose change rate of luminance or saturation with respect to another frame image is equal to or greater than a specified value over at least a specified area is determined to be the frame image of interest.
• the composition ratio of the correction frame image is set large for an area where the rate of change between the frame image of interest and the correction frame image is large.
• A video processing apparatus comprising: selection means for selecting a first frame image and a second frame image from a plurality of temporally continuous frame images; first estimation means for calculating a geometric transformation parameter based on a positional relationship between corresponding points or corresponding regions detected between the first frame image and the second frame image, and estimating a first movement amount caused by camera movement; and second estimation means for detecting a subject area from the first frame image and the second frame image by subtracting the first movement amount based on the geometric transformation parameter, and estimating a second movement amount caused by subject movement based on the detected subject area.
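To make the combination and composition steps listed above concrete, the sketch below shows one way they could be implemented. The exact forms of equations (13) and (14) are only available as images in the original, so the inverse-distance weighting over the time offsets Di and the linear blending by the composition ratio u(x, y) are illustrative assumptions, and the function names are hypothetical.

```python
import numpy as np

def combine_corrected_images(corrected, time_offsets, eps=1e-6):
    """Blend N second corrected images into one corrected frame image.

    corrected    : list of N float arrays (H, W[, C]) warped to the time of the frame of interest
    time_offsets : list of N time differences Di between each source frame and the
                   frame of interest (assumed definition of Di)
    Weights favour images generated from temporally closer frames (assumption;
    a stand-in for equation (13)).
    """
    w = np.array([1.0 / (abs(d) + eps) for d in time_offsets])
    w /= w.sum()
    stack = np.stack(corrected, axis=0).astype(np.float64)
    # weighted sum over the N second corrected images
    return np.tensordot(w, stack, axes=1)

def compose_output_frame(frame_t, corrected_frame, u):
    """Per-pixel composition of the frame of interest with the corrected frame.

    u : composition ratio in [0, 1] per pixel, larger where the luminance change
        between the two frames is larger. Stand-in for equation (14), assumed to
        be linear blending.
    """
    if frame_t.ndim == 3 and u.ndim == 2:
        u = u[..., None]  # broadcast the ratio over colour channels
    return (1.0 - u) * frame_t + u * corrected_frame
```

With this kind of blending, a ratio u close to 1 in flash-brightened areas replaces them with the corrected frame, while u close to 0 leaves unaffected areas unchanged, which matches the per-pixel weighting described in the items above.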

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)
  • Studio Devices (AREA)
  • Controls And Circuits For Display Device (AREA)

Abstract

The present invention provides an image processing device capable of generating a natural image in which flickering is reduced. An image processing device 100 is provided with a determining unit 11, a motion estimation unit 12, an image generating unit 13, and an image synthesizing unit 14. The determining unit 11 determines which one of a plurality of temporally consecutive frame images is a target frame image including a flickering region in which the brightness or chroma is different from that of the preceding or subsequent frame image by a predetermined level or higher. The motion estimation unit 12 estimates a first movement amount caused by a movement of a camera and/or a second movement amount caused by a movement of a subject on the basis of a pair of frame images selected on the basis of a difference in brightness or chroma between the target frame image and the preceding or subsequent frame image. On the basis of the selected pair and the estimated first movement amount and/or second movement amount, the image generating unit 13 generates a correction frame image corresponding to a frame image at the time when the target frame image is captured. The image synthesizing unit 14 synthesizes the target frame image and the correction frame image.

Description

映像処理装置、映像処理方法及びプログラム記録媒体Video processing apparatus, video processing method, and program recording medium
 本発明は、映像処理装置、映像処理方法及びプログラム記録媒体に関する。 The present invention relates to a video processing device, a video processing method, and a program recording medium.
 一部の映像コンテンツは、視聴者に対して、生理的に悪影響を与える可能性がある。このような影響の一つは、光過敏性発作の発症である。光過敏性発作は、光刺激に対する異常反応の症状の一つであり、痙攣や意識障害などの癲癇(てんかん)に似た症状を示す発作である。 一部 Some video content may have a physiological adverse effect on viewers. One such effect is the development of a photosensitivity attack. Photosensitivity seizure is one of the symptoms of an abnormal response to light stimulation, and is a seizure showing symptoms similar to epilepsy such as convulsions and disturbance of consciousness.
 このような影響の発生を抑制するために、人体に悪影響がある映像コンテンツの配信を抑制する試みが実施されている。例えば、国際電気通信連合(ITU)は、映像コンテンツは光過敏性発作を発生させる危険性があることを映像配信組織が映像コンテンツ製作者に対して周知するよう勧告している(非特許文献1)。また、日本においては、日本放送協会と日本民間放送連盟が、特にアニメーションの製作に関しガイドラインを制定し、放送に携わる者に遵守するよう求めている(非特許文献2)。 In order to suppress the occurrence of such effects, attempts are being made to suppress the distribution of video content that has a negative effect on the human body. For example, the International Telecommunication Union (ITU) recommends that video distribution organizations inform video content producers that video content is at risk of causing photosensitivity attacks (Non-Patent Document 1). ). In Japan, the Japan Broadcasting Corporation and the Japan Broadcasting Corporation have established guidelines for animation production in particular, and are demanding compliance with those who are involved in broadcasting (Non-Patent Document 2).
 しかし、報道映像のような速報性が求められる映像コンテンツを生放送する際に、映像が光過敏性発作を誘発する可能性のある多くの明滅を含んでいる場合には、人体に悪影響がある映像コンテンツの配信を抑制することが困難である。このような場合、現状では、テロップ等で視聴者に事前に注意を喚起する対策がとられている。光過敏性発作を誘発する可能性のある多くの明滅を含んでいる映像の一つに、記者会見等で報道カメラマンから撮影時に発せられるフラッシュが多く含まれる映像が挙げられる。このような映像では、カメラから発せられるフラッシュによる短時間の明領域が発生し、これが繰り返されることで多くの明滅が発生することになる。 However, when broadcasting live video content that requires promptness, such as news footage, if the video contains many blinks that can trigger a photosensitivity attack, the video will have a negative effect on the human body. It is difficult to suppress content distribution. In such a case, at present, measures are taken to alert the viewer in advance with a telop or the like. One video that contains many flickers that can trigger photosensitivity seizures is a video that contains a lot of flash emitted from a news photographer during a press conference. In such an image, a short-time bright region is generated by a flash emitted from the camera, and many blinks are generated by repeating this.
 人体に悪影響がある映像コンテンツを検出して補正する関連技術が特許文献1~3に開示されている。 Patent Documents 1 to 3 disclose related techniques for detecting and correcting video content that has an adverse effect on the human body.
 特許文献1は、液晶ディスプレイにおいて、光過敏性発作を誘発するシーン(画像)を検出し、検出されたシーンに対してバックライトユニットの輝度を低下させる技術を開示している。この技術は、視聴者への光過敏性発作の影響を未然に防止する。 Patent Document 1 discloses a technique for detecting a scene (image) that induces a light-sensitive seizure in a liquid crystal display and reducing the luminance of a backlight unit with respect to the detected scene. This technology obviates the effects of photosensitivity attacks on viewers.
 特許文献2は、第nフレーム画像と第(n+1)フレーム画像のヒストグラムの比較結果に基づいて、ガンマ補正又はトーンカーブ補正によって第(n+1)フレーム画像のダイナミックレンジを狭める補正をする技術を開示している。この技術は、強い明滅を緩和し、眼精疲労又は体調不良を低減させる。また、特許文献3は、動きベクトルを補正する技術を開示している。 Patent Document 2 corrects the dynamic range of the (n + 1) th frame image by gamma correction or tone curve correction based on the comparison result of the histograms of the nth frame image and the (n + 1) th frame image. The technology is disclosed. This technique relieves strong blinking and reduces eye strain or poor health. Patent Document 3 discloses a technique for correcting a motion vector.
 なお、非特許文献3及び非特許文献4は、後述するオプティカルフローの算出方法を開示する。 Note that Non-Patent Document 3 and Non-Patent Document 4 disclose optical flow calculation methods described later.
特開2008-301150号公報JP 2008-301150 A 特開2010-035148号公報JP 2010-035148 A 特開2008-124956号公報JP 2008-124956 A
 しかしながら、関連技術には下記の課題がある。光過敏性発作を誘発する可能性のある輝度又は彩度の大きな変化は、画像全体ではなく、画像の一部の領域に発生する場合がある。関連技術に開示された手法は、これらの判別を行わず画像全体を一律に補正するため、本来補正する必要がない明滅が発生していない領域のコントラストや明度を低下させ、その領域の画質を劣化させる場合がある。 However, the related technology has the following problems. Large changes in brightness or saturation that can trigger photosensitivity seizures may occur in some areas of the image, not in the entire image. Since the technique disclosed in the related art uniformly corrects the entire image without making these determinations, it reduces the contrast and brightness of areas that do not need to be corrected and does not cause blinking, and reduces the image quality of those areas. May deteriorate.
 また、フラッシュ等による明滅の場合には、フラッシュによって明るくなった領域の画素の一部の色情報がカメラのダイナミックレンジを超えている(すなわち飽和している)場合がある。色情報が飽和した画素は、本来の情報が失われている。そのため、色情報が飽和した画素を含むフレーム画像のみを用いた補正処理のみでは、彩度が過大又は過小な画素を発生させる場合があり、色味の変動を抑制することができない。それゆえ、このような補正処理では、明滅を自然に緩和することは困難である。 In addition, in the case of blinking due to flash or the like, color information of a part of pixels in an area brightened by the flash may exceed the dynamic range of the camera (ie, be saturated). Pixels with saturated color information lose their original information. For this reason, only correction processing using only a frame image including pixels with saturated color information may generate pixels with excessive or undersaturation, and fluctuations in color cannot be suppressed. Therefore, it is difficult to naturally mitigate flicker by such correction processing.
 本発明の目的は、輝度又は彩度の変動が抑制された自然な映像を生成することができる技術を提供することにある。 An object of the present invention is to provide a technique capable of generating a natural video in which fluctuations in luminance or saturation are suppressed.
 本発明の一態様に係る映像処理装置は、
 時間的に連続する複数のフレーム画像のいずれかが、輝度又は彩度が前後のフレーム画像に対して所定のレベル以上異なる明滅領域を含む注目フレーム画像であるか判定する判定手段と、
 前記注目フレーム画像及びその前後のフレーム画像から輝度又は彩度の差に基づいて選択されたフレーム画像のペアに基づいて、カメラの動きに起因する第1の移動量及び/又は被写体の動きに起因する第2の移動量を推定する動き推定手段と、
 前記選択されたペアと、前記推定された第1の移動量及び/又は第2の移動量とに基づいて、前記注目フレーム画像の撮影時刻におけるフレーム画像に相当する補正フレーム画像を生成する画像生成手段と、
 前記注目フレーム画像と前記補正フレーム画像とを合成する画像合成手段と
 を備える。
An image processing device according to an aspect of the present invention includes:
Determination means for determining whether any of a plurality of temporally continuous frame images is a noticed frame image including a blinking region whose luminance or saturation differs by a predetermined level or more with respect to the preceding and following frame images;
Based on a pair of frame images selected based on the difference in brightness or saturation from the frame image of interest and the frame images before and after the frame image of interest and / or the movement of the subject Motion estimation means for estimating a second movement amount to be
Image generation for generating a correction frame image corresponding to a frame image at the shooting time of the frame image of interest based on the selected pair and the estimated first movement amount and / or second movement amount Means,
Image synthesizing means for synthesizing the frame image of interest and the correction frame image.
 本発明の一態様に係る映像処理方法は、
 時間的に連続する複数のフレーム画像のいずれかが、輝度又は彩度が前後のフレーム画像に対して所定のレベル以上異なる明滅領域を含む注目フレーム画像であるか判定し、
 前記注目フレーム画像及びその前後のフレーム画像から輝度又は彩度の差に基づいて選択されたフレーム画像のペアに基づいて、カメラの動きに起因する第1の移動量及び/又は被写体の動きに起因する第2の移動量を推定し、
 前記選択されたペアと、前記推定された第1の移動量及び/又は第2の移動量とに基づいて、前記注目フレーム画像の撮影時刻におけるフレーム画像に相当する補正フレーム画像を生成し、
 前記注目フレーム画像と前記補正フレーム画像とを合成する。
An image processing method according to an aspect of the present invention includes:
It is determined whether any of a plurality of temporally continuous frame images is an attention frame image including a blinking region whose luminance or saturation is different from a preceding frame image by a predetermined level or more.
Based on a pair of frame images selected based on the difference in brightness or saturation from the frame image of interest and the frame images before and after the frame image of interest and / or the movement of the subject Estimating a second movement amount to be
Based on the selected pair and the estimated first movement amount and / or second movement amount, a corrected frame image corresponding to a frame image at the shooting time of the frame image of interest is generated,
The attention frame image and the correction frame image are synthesized.
 本発明の一態様に係るプログラム記録媒体は、
 コンピュータに、
 時間的に連続する複数のフレーム画像のいずれかが、輝度又は彩度が前後のフレーム画像に対して所定のレベル以上異なる明滅領域を含む注目フレーム画像であるか判定する処理と、
 前記注目フレーム画像及びその前後のフレーム画像から輝度又は彩度の差に基づいて選択されたフレーム画像のペアに基づいて、カメラの動きに起因する第1の移動量及び/又は被写体の動きに起因する第2の移動量を推定する処理と、
 前記選択されたペアと、前記推定された第1の移動量及び/又は第2の移動量とに基づいて、前記注目フレーム画像の撮影時刻におけるフレーム画像に相当する補正フレーム画像を生成する処理と、
 前記注目フレーム画像と前記補正フレーム画像とを合成する処理と
 を実行させる。
A program recording medium according to an aspect of the present invention includes:
On the computer,
A process of determining whether any of a plurality of temporally continuous frame images is a noticed frame image including a blinking region that differs in luminance or saturation by a predetermined level or more with respect to the preceding and following frame images;
Based on a pair of frame images selected based on the difference in brightness or saturation from the frame image of interest and the frame images before and after the frame image of interest and / or the movement of the subject A process of estimating a second movement amount to be performed;
Processing for generating a corrected frame image corresponding to a frame image at the photographing time of the frame image of interest based on the selected pair and the estimated first movement amount and / or second movement amount; ,
And a process of combining the frame image of interest and the correction frame image.
 本発明の一態様に係る映像処理装置は、
 時間的に連続する複数のフレーム画像から第1のフレーム画像と第2のフレーム画像とを選択する選択手段と、
 前記第1のフレーム画像と前記第2のフレーム画像の間において検出された対応点又は対応領域の位置関係に基づいて幾何変換パラメーターを算出し、カメラの動きに起因する第1の移動量を推定する第1の推定手段と、
 前記幾何変換パラメーターに基づいて、前記第1の移動量を減算することで、前記第1のフレーム画像及び前記第2のフレーム画像から被写体領域を検出し、前記検出された被写体領域に基づいて被写体の動きに起因する第2の移動量を推定する第2の推定手段と
 を備える。
An image processing device according to an aspect of the present invention includes:
Selection means for selecting a first frame image and a second frame image from a plurality of temporally continuous frame images;
A geometric transformation parameter is calculated based on a positional relationship between corresponding points or corresponding regions detected between the first frame image and the second frame image, and a first movement amount due to camera movement is estimated. First estimating means for:
A subject area is detected from the first frame image and the second frame image by subtracting the first movement amount based on the geometric transformation parameter, and a subject is detected based on the detected subject area. And second estimation means for estimating a second movement amount resulting from the movement of.
 本発明によれば、輝度又は彩度の変動が抑制された自然な映像を生成することができる。 According to the present invention, it is possible to generate a natural video in which fluctuations in luminance or saturation are suppressed.
図1は第1の実施の形態における映像処理装置のブロック図である。FIG. 1 is a block diagram of a video processing apparatus according to the first embodiment.
図2は矩形領域輝度の算出方式を表す模式図である。FIG. 2 is a schematic diagram showing a rectangular area luminance calculation method.
図3は第1の実施の形態における動き推定部のブロック図である。FIG. 3 is a block diagram of the motion estimator in the first embodiment.
図4は明領域を含まないフレーム画像の選択方法を表す模式図である。FIG. 4 is a schematic diagram illustrating a method of selecting a frame image that does not include a bright region.
図5は動き推定用フレームの選択方法を表す図である。FIG. 5 is a diagram illustrating a method for selecting a motion estimation frame.
図6は動き推定用フレームの選択方法を表す図である。FIG. 6 is a diagram illustrating a method for selecting a motion estimation frame.
図7は動き推定用フレーム対の選択方法の一例を示す図である。FIG. 7 is a diagram illustrating an example of a method for selecting a motion estimation frame pair.
図8は第1の実施の形態における補正フレーム生成部のブロック図である。FIG. 8 is a block diagram of a correction frame generation unit in the first embodiment.
図9は出力フレーム画像における局所領域輝度の変化率の値の設定方法の一例を示すグラフである。FIG. 9 is a graph showing an example of a method for setting the value of the rate of change in local area luminance in the output frame image.
図10は第1の実施の形態における映像処理装置の動作を示すフローチャートである。FIG. 10 is a flowchart showing the operation of the video processing apparatus according to the first embodiment.
図11はコンピュータ装置のハードウェア構成を例示するブロック図である。FIG. 11 is a block diagram illustrating a hardware configuration of the computer apparatus.
 [構成]
 図1は、本発明による第1の実施の形態に係る映像処理装置100の構成を示すブロック図である。なお、図1(及び以降のブロック図)に記載された矢印は、データの流れの一例を示すにすぎず、データの流れを限定することを意図したものではない。
[Configuration]
FIG. 1 is a block diagram showing a configuration of a video processing apparatus 100 according to the first embodiment of the present invention. Note that the arrows described in FIG. 1 (and the subsequent block diagrams) merely show an example of the data flow, and are not intended to limit the data flow.
 この映像処理装置100は、判定部11と、動き推定部12と、画像生成部13と、画像合成部14とを備える。 The video processing apparatus 100 includes a determination unit 11, a motion estimation unit 12, an image generation unit 13, and an image synthesis unit 14.
 判定部11は、フレーム画像に光過敏性発作を誘発する可能性がある領域が含まれるか否かを判定する。具体的には、判定部11は、予め設定されたフレーム数のフレーム画像を用いて、特定のフレーム画像(以下「注目フレーム画像」という。)がフラッシュ等により明滅する(輝度が大きく変化する)領域を含むフレーム画像であるかを判定する。以下においては、このようにして判定された領域(輝度が大きく変化する領域)のことを、「明滅領域」という。例えば、判定部11は、時刻(t-m)から時刻(t+m)までに撮影された(2m+1)フレーム分の時間的に連続するフレーム画像の入力を受け付けると、時刻tのフレーム画像を注目フレーム画像とし、当該フレーム画像が明滅領域を含むかを判定する。 The determination unit 11 determines whether or not the frame image includes a region that may induce a photosensitivity seizure. Specifically, the determination unit 11 uses a frame image having a preset number of frames to blink a specific frame image (hereinafter referred to as “target frame image”) by flash or the like (the luminance changes greatly). It is determined whether the frame image includes a region. In the following, a region determined in this way (a region where the luminance changes greatly) is referred to as a “blinking region”. For example, when the determination unit 11 receives an input of time-sequential frame images for (2m + 1) frames taken from time (tm) to time (t + m), the determination unit 11 selects the frame image at time t. A frame image of interest is determined, and it is determined whether the frame image includes a blink region.
 注目フレーム画像に明滅領域が含まれる場合、動き推定部12、画像生成部13及び画像合成部14は、カメラや被写体の変位に起因した画像の移動を補正したフレーム画像を合成する。動き推定部12、画像生成部13及び画像合成部14は、このようにして明滅領域の輝度変化を適切に抑制することで、明滅の影響を低減させたフレーム画像を出力することができる。 When the blinking region is included in the attention frame image, the motion estimation unit 12, the image generation unit 13, and the image synthesis unit 14 synthesize a frame image in which the movement of the image due to the displacement of the camera or the subject is corrected. The motion estimation unit 12, the image generation unit 13, and the image synthesis unit 14 can output a frame image in which the influence of blinking is reduced by appropriately suppressing the luminance change in the blinking region in this way.
 尚、明滅領域には、注目フレーム画像の輝度が前後のフレーム画像の輝度と比較して大きく向上する(明るくなる)明領域と、大きく低下する(暗くなる)暗領域とがある。しかし、説明の簡略化のため、以下においては明領域についてのみ説明する。 Note that the blinking region includes a bright region in which the luminance of the frame image of interest is greatly improved (becomes brighter) and a dark region in which the luminance of the frame image of interest is greatly lowered (becomes dark). However, for the sake of simplicity, only the bright area will be described below.
 <判定部11>
 判定部11は、複数のフレーム画像の入力を受け付けると、注目フレーム画像が明滅領域を含むフレーム画像であるか判定する。
<Determining unit 11>
When receiving the input of a plurality of frame images, the determination unit 11 determines whether the target frame image is a frame image including a blinking region.
 注目フレーム画像が明滅領域を含むフレームであるかを判別する方法の一つは、注目フレーム画像と他の入力されたフレーム画像との間の局所領域輝度の変化率を用いる方法である。 One method for determining whether a target frame image is a frame including a blinking region is a method using a change rate of local region luminance between the target frame image and another input frame image.
 ここにおいて、局所領域輝度は、入力された複数のフレーム画像の各画素における、当該画素とその周辺の所定数の画素を含む領域の輝度値を表す。判定部11は、まず、入力された複数のフレーム画像の各画素について、RGB表色系などで記述された色情報を明るさを表す輝度情報(輝度値)に変換する。その後、判定部11は、変換された輝度情報に対して注目画素周辺の画素を用いた平滑化処理を施すことで、画素周辺領域の輝度値を算出する。 Here, the local area luminance represents a luminance value of an area including the pixel and a predetermined number of pixels around the pixel in each pixel of the input plurality of frame images. The determination unit 11 first converts color information described in an RGB color system or the like into luminance information (luminance value) representing brightness for each pixel of the input plurality of frame images. Thereafter, the determination unit 11 performs a smoothing process using pixels around the target pixel on the converted luminance information, thereby calculating a luminance value in the pixel peripheral region.
色情報を輝度情報に変換する方法は、例えば、放送用に用いられるYUV(YCbCr,YPbPr)表色系の輝度を表すY値を算出する方法や、XYZ表色系の輝度を表すY値を算出する方法がある。ただし、輝度情報を記述する表色系は、これらの表色系に限定されない。例えば、判定部11は、HSV表色系のV値等、輝度を表す他の指標に色情報を変換してもよい。また、判定部11は、入力されたフレーム画像に予めガンマ補正が施されている場合には、輝度情報への変換の前に、色情報を逆ガンマ補正により補正前の色情報に変換してもよい。 The method for converting color information into luminance information is, for example, a method of calculating the Y value representing luminance in the YUV (YCbCr, YPbPr) color system used for broadcasting, or the Y value representing luminance in the XYZ color system. However, the color systems describing luminance information are not limited to these color systems. For example, the determination unit 11 may convert the color information into another index representing luminance, such as the V value of the HSV color system. In addition, when the input frame image has been subjected to gamma correction in advance, the determination unit 11 may convert the color information into the pre-correction color information by inverse gamma correction before the conversion to luminance information.
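For illustration, the sketch below shows one common luminance conversion mentioned above (the Y value of the YCbCr color system, using the BT.601 weights); the function name and the assumption of already-linear RGB input are illustrative assumptions rather than part of the embodiment.

```python
import numpy as np

def rgb_to_luma_bt601(rgb):
    """Convert an (H, W, 3) RGB array to a luminance (Y) map.

    Uses the BT.601 weights (0.299, 0.587, 0.114), one of the conversions
    mentioned above; other choices such as the XYZ Y value or the HSV V value
    would also fit the description.
    """
    rgb = rgb.astype(np.float64)
    return 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]
```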
 平滑化処理の方法は、例えば、注目画素の周辺にある画素のうちの上下それぞれのq画素と左右それぞれのp画素、すなわち(2p+1)×(2q+1)画素の輝度情報の平均値を算出する方法がある。この場合、時刻tのフレーム画像のうちの位置(x,y)にある画素の局所領域輝度lt(x,y)は、フレーム画像の輝度情報Ytを用いて、式(1)のように表すことができる。
[Equation (1)]
The smoothing method is, for example, a method of calculating the average value of the luminance information of the pixels around the target pixel, namely the upper and lower q pixels and the left and right p pixels, that is, (2p + 1) × (2q + 1) pixels. In this case, the local region luminance l t (x, y) of the pixel at the position (x, y) in the frame image at time t can be expressed as in Equation (1), using the luminance information Y t of the frame image.
[Equation (1)]
 また、判定部11は、式(2)のように、予め設定された重みwを用いた重み付き平均を用いて局所領域輝度lt(x,y)を算出してもよい。
[Equation (2)]
Further, the determination unit 11 may calculate the local region luminance l t (x, y) using a weighted average using a preset weight w as in Expression (2).
[Equation (2)]
 重みの設定方法としては、例えば、ガウシアン重みを用いる方法がある。判定部11は、あらかじめ設定されたパラメーターσを用いて、式(3)によりガウシアン重みw(i,j)を算出する。
[Equation (3)]
As a weight setting method, for example, there is a method using Gaussian weight. The determination unit 11 calculates a Gaussian weight w (i, j) using Equation (3) using a preset parameter σ.
[Equation (3)]
 局所領域輝度の変化率は、注目フレーム画像の画素と、同位置の他の入力フレーム画像の画素との間の局所領域輝度の変化の比率を表す。判定部11は、時刻tにおける注目フレーム画像と時刻(t+k)におけるフレーム画像のそれぞれの位置(x,y)にある画素の局所領域輝度の変化率rt-t+k(x,y)を、式(4)を用いて算出する。
[Equation (4)]
The local area luminance change rate represents the ratio of the change in local area luminance between a pixel of the target frame image and the pixel of another input frame image at the same position. The determination unit 11 calculates the local area luminance change rate r t-t+k (x, y) of the pixel at each position (x, y) between the frame image of interest at time t and the frame image at time (t + k) using equation (4).
[Equation (4)]
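A minimal sketch of the quantities just described, assuming the simple box average for equation (1) and a relative-difference form for the change rate of equation (4); the actual equation images are not reproduced above, so these forms and the threshold values α = 0.1 and β = 0.25 (taken from the guideline example given below) are assumptions made for illustration.

```python
import numpy as np

def local_region_luminance(Y, p=2, q=2):
    """Equation (1) style smoothing: mean of Y over a (2p+1) x (2q+1) window.

    Y : (H, W) luminance map. A zero-cost edge-padded box average is used here;
    the description also allows weighted (e.g. Gaussian) averages.
    """
    H, W = Y.shape
    padded = np.pad(Y, ((q, q), (p, p)), mode="edge")
    out = np.zeros((H, W), dtype=np.float64)
    for dy in range(-q, q + 1):
        for dx in range(-p, p + 1):
            out += padded[q + dy:q + dy + H, p + dx:p + dx + W]
    return out / ((2 * p + 1) * (2 * q + 1))

def local_change_rate(l_t, l_tk, eps=1e-6):
    """Assumed form of equation (4): relative change of the local region luminance
    between the frame of interest (l_t) and another frame (l_tk)."""
    return (l_t - l_tk) / (l_tk + eps)

def brighter_region_flag(rate, alpha=0.1, beta=0.25):
    """Determination flag: 1 when the area ratio of pixels whose change rate
    exceeds alpha is larger than beta (the guideline example uses 0.1 and 0.25)."""
    return int((rate > alpha).mean() > beta)
```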
判定部11は、算出された変化率に基づき、注目フレーム画像に他のフレーム画像より所定のレベル以上明るくなる領域が含まれているか否か判定する。その結果、時間的に前後にある他のフレーム画像に対して所定のレベル以上明るくなる領域が注目フレーム画像に含まれている場合には、判定部11は、注目フレーム画像が明滅による明領域を含むフレーム画像であると判定する。 The determination unit 11 determines, based on the calculated change rate, whether or not the frame image of interest includes an area that is brighter than other frame images by a predetermined level or more. As a result, when the frame image of interest includes an area that is brighter, by a predetermined level or more, than other frame images before and after it in time, the determination unit 11 determines that the frame image of interest is a frame image including a bright region due to blinking.
 判定部11は、予め設定された変化率の閾値αと面積率の閾値βを用いて、変化率rt-t+kが閾値αを超える領域の面積率が閾値βを超えるか否かによって判定する方法を用いることもできる。例えば、避けるべき点滅映像の判断基準の一つとして、日本放送協会と日本民間放送連盟によるガイドラインには、「点滅が同時に起こる面積が画面の1/4を超え、かつ、輝度変化が10%以上の場合」が規定されている。上記の判定方法において、この判断基準を満たすためには、判定部11は、α=0.1、β=0.25を設定する。 The determination unit 11 uses the threshold value α of the change rate and the threshold value β of the area rate, which are set in advance, depending on whether the area rate of the region where the change rate r t-t + k exceeds the threshold value α exceeds the threshold value β. A determination method can also be used. For example, as one of the criteria for determining blinking video to be avoided, the guidelines by the Japan Broadcasting Corporation and the Japan Broadcasting Corporation include: `` The area where blinking occurs simultaneously exceeds 1/4 of the screen and the luminance change is 10% or more. "In the case of". In the above determination method, in order to satisfy this determination criterion, the determination unit 11 sets α = 0.1 and β = 0.25.
 判定部11は、時刻tにおける注目フレーム画像に時刻(t+k)のフレーム画像より所定のレベル以上明るくなる領域があると判定した場合、判定フラグflagt-t+kを「1」とする。また、判定部11は、このような領域がないと判定した場合には、判定フラグflagt-t+kを「0」とする。判定部11は、注目フレーム画像と、入力された他のフレーム画像の全てとの組み合わせに関して同様に判定フラグを算出し、注目フレーム画像の前後の時刻それぞれについて判定フラグが「1」となるフレーム画像が存在するか否かを判断する。このようなフレーム画像が存在する場合、判定部11は、注目フレーム画像が明領域を含むフレーム画像であると判定する。 The determination unit 11 sets the determination flag flag t-t + k to “1” when it is determined that the frame image of interest at time t includes a region that is brighter than a predetermined level by the frame image at time (t + k). . If the determination unit 11 determines that there is no such area, the determination flag flag t-t + k is set to “0”. The determination unit 11 similarly calculates a determination flag for the combination of the target frame image and all the other input frame images, and the frame image for which the determination flag is “1” for each of the times before and after the target frame image. It is determined whether or not exists. When such a frame image exists, the determination unit 11 determines that the frame image of interest is a frame image including a bright region.
 判定部11は、注目フレーム画像が明滅領域を含むフレーム画像であるかを判別する別の方法として、矩形領域輝度の変化率を用いる方法を利用してもよい。ここにおいて、矩形領域輝度は、各フレーム画像における予め設定された矩形領域毎の輝度の平均値を表す。例えば、図2に示されているように、フレーム画像に10×10ブロックの矩形領域を設定した場合の矩形領域輝度は、矩形領域のそれぞれに含まれる画素の輝度値の平均値である。輝度値としては、局所領域輝度を算出する場合と同様に、YUV表色系のY値、XYZ表色系のY値、HSV表色系のV値等を用いることができる。 The determination unit 11 may use a method of using the change rate of the rectangular area luminance as another method of determining whether the frame image of interest is a frame image including a blinking area. Here, the rectangular area luminance represents an average value of luminance for each rectangular area set in advance in each frame image. For example, as shown in FIG. 2, the rectangular area luminance when a 10 × 10 block rectangular area is set in the frame image is an average value of the luminance values of the pixels included in each rectangular area. As the luminance value, the Y value of the YUV color system, the Y value of the XYZ color system, the V value of the HSV color system, etc. can be used as in the case of calculating the local area luminance.
 矩形領域輝度の変化率は、注目フレーム画像の注目しているブロックの矩形領域輝度と、入力された他のフレーム画像における同じ位置のブロックの矩形領域輝度の差の比率を表す。判定部11は、注目フレーム画像の位置(i,j)にあるブロックの時刻tにおける矩形領域輝度Lt(i,j)と時刻(t+k)のフレーム画像の矩形領域輝度Lt+k(i,j)の変化率Rt-t+k (i,j)を、式(5)を用いて算出する。
[Equation (5)]
The change rate of the rectangular area luminance represents the ratio of the difference between the rectangular area luminance of the block of interest in the target frame image and the rectangular area luminance of the block at the same position in another input frame image. The determination unit 11 calculates, using equation (5), the change rate R t-t+k (i, j) between the rectangular area luminance L t (i, j) of the block at the position (i, j) of the target frame image at time t and the rectangular area luminance L t+k (i, j) of the frame image at time (t + k).
[Equation (5)]
 矩形領域輝度の変化率を用いた判定は、局所領域輝度の変化率を用いた判定と同様に行われる。判定部11は、時刻tにおける注目フレーム画像と入力された他の全てのフレーム画像との組み合わせにおいて、注目フレーム画像に他のフレーム画像より大きく明るくなる領域が含まれているかどうかを判定することで判定フラグの値を設定する。判定部11は、注目フレーム画像の前後の時刻それぞれに判定フラグが「1」となるフレーム画像が存在する場合、注目フレーム画像が明滅領域を含むフレーム画像であると判定する。 The determination using the change rate of the rectangular area luminance is performed in the same manner as the determination using the change ratio of the local area luminance. The determination unit 11 determines whether or not the attention frame image includes a region that is brighter than the other frame images in the combination of the attention frame image at time t and all the other input frame images. Set the value of the judgment flag. The determination unit 11 determines that the frame image of interest is a frame image including a blinking area when there are frame images having the determination flag “1” at each of the times before and after the frame of interest image.
判定フラグの値の設定方法には、局所領域輝度の変化率を用いる場合と同様に、予め設定された変化率の閾値αと面積率の閾値βを用いて、変化率が閾値αを超える画素の面積率が閾値βを超えるか否かによって、「1」又は「0」を設定する方法がある。 As a method for setting the value of the determination flag, as in the case of using the local area luminance change rate, there is a method of setting “1” or “0” depending on whether or not the area ratio of the pixels whose change rate exceeds the preset change rate threshold α exceeds the preset area rate threshold β.
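The block-based variant can be sketched in the same spirit. The 10×10 grid follows the FIG. 2 example, and the relative-difference form assumed for equation (5) mirrors the assumption made for equation (4) above; the function names are illustrative.

```python
import numpy as np

def rectangular_area_luminance(Y, blocks=(10, 10)):
    """Mean luminance of each rectangular area when the frame is divided into
    blocks (10 x 10 blocks in the FIG. 2 example). Returns a (rows, cols) array."""
    H, W = Y.shape
    rows, cols = blocks
    L = np.zeros((rows, cols), dtype=np.float64)
    for i in range(rows):
        for j in range(cols):
            y0, y1 = i * H // rows, (i + 1) * H // rows
            x0, x1 = j * W // cols, (j + 1) * W // cols
            L[i, j] = Y[y0:y1, x0:x1].mean()
    return L

def block_determination_flag(L_t, L_tk, alpha=0.1, beta=0.25, eps=1e-6):
    """Determination flag from the rectangular-area change rate (equation (5) analogue):
    1 when the fraction of blocks whose change rate exceeds alpha is larger than beta."""
    R = (L_t - L_tk) / (L_tk + eps)
    return int((R > alpha).mean() > beta)
```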
 判定部11は、判定結果と共に、注目フレーム画像と入力された他のフレーム画像との判定フラグを解析情報として出力する。また、判定部11は、同様の処理を実行することにより注目フレーム画像以外のフレーム画像間で算出された判定フラグを補助情報として出力してもよい。 The determination unit 11 outputs a determination flag between the frame image of interest and another input frame image as analysis information together with the determination result. Moreover, the determination part 11 may output the determination flag calculated between frame images other than an attention frame image as auxiliary information by performing the same process.
 また、判定部11は、入力されたフレーム画像間での判定フラグに加えて、注目フレーム画像の各矩形領域について他のフレーム画像の同一の位置の矩形領域との間で算出された矩形領域輝度の変化率を解析情報として出力してもよい。 In addition to the determination flag between the input frame images, the determination unit 11 calculates the rectangular area luminance calculated between each rectangular area of the target frame image and the rectangular area at the same position of the other frame image. May be output as analysis information.
 <動き推定部12>
 図3は、動き推定部12の構成を示すブロック図である。
<Motion estimation unit 12>
FIG. 3 is a block diagram illustrating a configuration of the motion estimation unit 12.
 動き推定部12は、選択部12Aと、第1推定部12Bと、第2推定部12Cとを有する。 The motion estimation unit 12 includes a selection unit 12A, a first estimation unit 12B, and a second estimation unit 12C.
 動き推定部12は、フレーム画像と、判定部11から出力された判定結果及び解析情報とを入力として受け付ける。動き推定部12は、注目フレーム画像が明領域を含むフレーム画像であると判定された場合に、入力されたフレーム画像から動き推定に用いるフレーム画像を複数選択し、選択されたフレーム画像の間でのカメラ及び被写体の動きに起因する画像の移動量を推定する。 The motion estimation unit 12 receives the frame image and the determination result and analysis information output from the determination unit 11 as inputs. When it is determined that the target frame image is a frame image including a bright region, the motion estimation unit 12 selects a plurality of frame images to be used for motion estimation from the input frame images, and selects between the selected frame images. The movement amount of the image due to the movement of the camera and the subject is estimated.
 選択部12A
 選択部12Aは、注目フレーム画像以外のフレーム画像から、移動量の推定に用いるフレーム画像を選択し、選択されたフレーム画像を含む1対のフレーム画像を取得する。選択部12Aは、これらのフレーム画像(以下「動き推定用フレーム画像」という。)を、例えば以下の方法によって選択する。
Selection unit 12A
The selection unit 12A selects a frame image used for estimation of the movement amount from frame images other than the target frame image, and acquires a pair of frame images including the selected frame image. The selection unit 12A selects these frame images (hereinafter referred to as “motion estimation frame images”), for example, by the following method.
 ・(12A-1)選択方法1
 選択部12Aは、注目フレーム画像と入力された他のフレーム画像との輝度差に基づいて、注目フレーム画像の前後からそれぞれ1つのフレーム画像を動き推定用フレーム画像として選択してもよい。この場合、選択部12Aは、注目フレーム画像の前後それぞれのフレーム画像を1つずつ取得して1対の動き推定用フレーム画像として用いる。具体的には、選択部12Aは、判定部11で算出された判定フラグを用いて動き推定用フレーム画像を選択してもよい。
(12A-1) Selection method 1
The selection unit 12A may select one frame image as a motion estimation frame image from before and after the target frame image based on the luminance difference between the target frame image and the input other frame image. In this case, the selection unit 12A acquires one frame image before and after each frame image of interest and uses it as a pair of motion estimation frame images. Specifically, the selection unit 12A may select the motion estimation frame image using the determination flag calculated by the determination unit 11.
 この方法では、判定フラグが「1」となるフレーム画像のうち、注目フレーム画像に最も近い前後それぞれのフレーム画像が動き推定用フレーム画像として選択される。 In this method, out of the frame images whose determination flag is “1”, the frame images before and after the closest to the target frame image are selected as the motion estimation frame images.
 図4は、明領域を含まないフレーム画像の選択方法を表す模式図である。図4は、case1~case4の4種類のケースについて、時刻(t-2)から時刻(t+2)までのフレーム画像と、時刻tのフレーム画像に対して他のフレーム画像を比較した場合の判定フラグ(flag)とを例示している。なお、図4(及び以降の同様の図)において、明領域を含まないフレーム画像は、ハッチングを付して示されている。ハッチングされていないフレーム画像は、明領域を含むフレーム画像を表す。 FIG. 4 is a schematic diagram showing a method for selecting a frame image that does not include a bright region. Figure 4 shows the case of comparing the frame image from time (t-2) to time (t + 2) with other frame images for the frame image at time t for the four types of cases 1 to 4 The determination flag (flag) is illustrated. Note that in FIG. 4 (and similar figures thereafter), frame images that do not include a bright region are shown with hatching. An unhatched frame image represents a frame image including a bright region.
 例えば、明領域を含む時刻tのフレーム画像に対して、選択部12Aは、case1の場合には、時刻(t-1)と時刻(t+1)のフレーム画像を選択する。同様に、選択部12Aは、case2の場合には時刻(t-2)と時刻(t+1)のフレーム画像、case3の場合には時刻(t-1)と時刻(t+2)のフレーム画像、case4の場合には時刻(t-2)と時刻(t+2)のフレーム画像をそれぞれ選択する。 For example, for a frame image at time t including a bright region, the selection unit 12A selects a frame image at time (t−1) and time (t + 1) in case 1. Similarly, the selection unit 12A displays frame images at time (t-2) and time (t + 1) in case 2, and frames at time (t-1) and time (t + 2) in case 3. In the case of an image, case4, frame images at time (t-2) and time (t + 2) are selected.
 また、選択部12Aは、補助情報として入力された注目フレーム画像以外のフレーム画像間の判定フラグを用いて、動き推定用フレームの選択結果を修正してもよい。注目フレーム画像と他のフレーム画像との判定フラグを用いた選択において、動き推定用フレームとして時刻(t+k)のフレームが選択された場合、選択部12Aは、次のように選択結果を修正してもよい。例えば、注目フレームと時刻(t+k+1)のフレーム画像の判定フラグflagt-t+k及び時刻(t+k+1)のフレーム画像と時刻(t+k)のフレーム画像の判定フラグflagt+k+1-t+kの値が共に「1」の場合、時刻(t+k+1)のフレーム画像と時刻(t+k)のフレーム画像との間にも大きな輝度変化があると考えられる。そのため、選択部12Aは、この場合、動き推定用フレーム画像を時刻(t+k+1)のフレーム画像に変更(修正)してもよい。 Further, the selection unit 12A may correct the selection result of the motion estimation frame using the determination flag between the frame images other than the target frame image input as the auxiliary information. In the selection using the determination flag between the frame image of interest and another frame image, when the frame at time (t + k) is selected as the motion estimation frame, the selection unit 12A corrects the selection result as follows. May be. For example, the determination flag flag t-t + k of the frame image at the time (t + k + 1) and the determination flag of the frame image at the time (t + k + 1) and the frame image at the time (t + k) When both flag t + k + 1-t + k values are `` 1 '', there is also a large luminance change between the frame image at time (t + k + 1) and the frame image at time (t + k). It is believed that there is. Therefore, in this case, the selection unit 12A may change (correct) the motion estimation frame image to a frame image at time (t + k + 1).
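As an illustration of selection method 1, the sketch below picks, from the frames whose determination flag against the frame of interest is “1”, the nearest one before and the nearest one after the frame of interest. The dictionary layout of the flags and the function name are assumptions made for the example.

```python
def select_motion_estimation_pair(flags):
    """flags: dict mapping a time offset k (k != 0) to the determination flag
    computed between the frame of interest at time t and the frame at time t+k.
    Returns the offsets of the nearest preceding and following frames whose
    flag is 1, i.e. the pair used for motion estimation in selection method 1."""
    before = [k for k, f in flags.items() if k < 0 and f == 1]
    after = [k for k, f in flags.items() if k > 0 and f == 1]
    if not before or not after:
        return None  # no usable pair on one side
    return max(before), min(after)

# Example with flags resembling case 2 of FIG. 4 (offsets -2..+2)
print(select_motion_estimation_pair({-2: 1, -1: 0, 1: 1, 2: 1}))  # -> (-2, 1)
```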
 ・(12A-2)選択方法2
 選択部12Aは、注目フレーム画像と入力された他のフレーム画像の間の輝度変化に基づいて、注目フレーム画像の前後それぞれから複数のフレーム画像を動き推定用フレーム画像として選択してもよい。この場合、選択部12Aは、フレーム画像の対(ペア)を複数取得する。具体的には、選択部12Aは、注目フレーム画像の近隣のフレーム画像のうち、判定部11で算出された判定フラグが1のフレーム画像を予め定められた数選択してもよい。
(12A-2) Selection method 2
The selection unit 12A may select a plurality of frame images as the motion estimation frame images from before and after the target frame image, based on the luminance change between the target frame image and the other input frame images. In this case, the selection unit 12A acquires a plurality of pairs of frame images. Specifically, the selection unit 12A may select, from the frame images neighboring the target frame image, a predetermined number of frame images whose determination flag calculated by the determination unit 11 is “1”.
 図5は、複数(ここでは2対)の動き推定用フレーム画像を選択する場合の例を示す模式図である。図5に例示されているように、時刻(t-2)、(t-1)、(t+1)及び(t+2)におけるフレーム画像が明領域を含まない場合、選択部12Aは、これらのフレーム画像の全てを動き推定用フレームとして選択する。図5の例における判定フラグは、図4のcase1の例における判定フラグと等しい。しかし、この選択方法においては、選択部12Aは、時刻(t-1)と時刻(t+1)におけるフレーム画像だけでなく、時刻(t-2)と時刻(t+2)におけるフレーム画像も動き推定用フレームとして選択する。 FIG. 5 is a schematic diagram showing an example of selecting a plurality (two pairs in this case) of motion estimation frame images. As illustrated in FIG. 5, when the frame images at times (t−2), (t−1), (t + 1), and (t + 2) do not include a bright region, the selection unit 12A All of these frame images are selected as motion estimation frames. The determination flag in the example of FIG. 5 is equal to the determination flag in the case 1 of FIG. However, in this selection method, the selection unit 12A not only displays frame images at time (t-1) and time (t + 1) but also frame images at time (t-2) and time (t + 2). Select as a frame for motion estimation.
 この選択方法は、短時間に頻繁に明滅が発生する場合やフラッシュバンドが発生した場合に、複数のフレーム画像から光の明滅の影響が少ない領域を選択的に利用し、フレーム間の動き推定の精度を高めることが可能である(例えば図7参照)。ここにおいて、フラッシュバンドとは、CMOS(Complementary metal-oxide-semiconductor)センサなどのローリングシャッタ方式の撮像素子において、フラッシュ光のような短時間の発光が生じた際にライン毎の露光期間の違いによって生じる信号強度の大きな変化(ずれ)のことである。フラッシュバンドが発生したフレーム画像は、例えば、その上半分又は下半分のみが発光時の画像(明領域)となり、残りの部分が発光直前又は直後の相対的に暗い画像となる。 This selection method selectively uses an area that is less affected by light flickering from multiple frame images when frequent flickering occurs in a short time or when a flash band occurs. The accuracy can be increased (see, for example, FIG. 7). Here, the flash band refers to the difference in exposure period for each line when light emission in a short time such as flash light occurs in a rolling shutter type imaging device such as a CMOS (Complementary Metal-Oxide-Semiconductor) sensor. This is a large change (shift) in the signal intensity that occurs. In the frame image in which the flash band is generated, for example, only the upper half or the lower half is an image at the time of light emission (bright region), and the remaining part is a relatively dark image immediately before or after the light emission.
 ・(12A-3)選択方法3
 選択部12Aは、注目フレーム画像と入力された他のフレーム画像の間の輝度差に基づいて、注目フレーム画像の前後どちらか一方のフレーム画像と注目フレーム画像とを動き推定用フレーム画像として選択してもよい。具体的には、選択部12Aは、判定部11で算出された判定フラグが「1」のフレームのうち、注目フレーム画像に最も近接するフレーム画像を選択してもよい。注目フレーム画像の前後いずれも判定フラグが「1」である場合には、選択部12Aは、予め設定された一方のフレームのみを選択する。図6は、注目フレーム画像よりも前の時刻のフレーム画像を選択した場合の一例を示す。この場合、選択部12Aは、このように選択されたフレーム画像と注目フレーム画像とを1対の動き推定用フレーム画像として用いる。
(12A-3) Selection method 3
Based on the luminance difference between the target frame image and the other input frame images, the selection unit 12A may select, as the motion estimation frame images, the target frame image itself and one of the frame images before or after it. Specifically, the selection unit 12A may select the frame image closest to the target frame image from among the frames whose determination flag calculated by the determination unit 11 is “1”. When the determination flag is “1” both before and after the frame image of interest, the selection unit 12A selects only the one frame on a preset side. FIG. 6 shows an example of a case where a frame image at a time earlier than the target frame image is selected. In this case, the selection unit 12A uses the frame image thus selected and the frame image of interest as a pair of motion estimation frame images.
 この選択方法によれば、選択方法1及び2と比較して、動き推定部12及び画像生成部13が処理対象とする画像の数が少なくなるため、高速な処理が実現できる。 According to this selection method, compared with the selection methods 1 and 2, the number of images to be processed by the motion estimation unit 12 and the image generation unit 13 is reduced, so that high-speed processing can be realized.
 なお、この選択方法は、注目フレーム画像において対応点の検出が可能であることを前提とする。 Note that this selection method is based on the assumption that corresponding points can be detected in the frame image of interest.
 第1推定部12B
 第1推定部12Bは、動き推定用フレーム画像のペア間におけるカメラ又は被写体の動きに起因した画素の動きを推定する。動き推定は、動き推定用フレーム画像のうちの任意の2つのフレーム画像の組み合わせ(ペア)に対して行う。第1推定部12Bは、1又は複数のペアのうち少なくとも1組に対して動き推定を行う。
First estimation unit 12B
The first estimation unit 12B estimates pixel motion caused by camera or subject motion between a pair of motion estimation frame images. Motion estimation is performed on a combination (pair) of any two frame images of the motion estimation frame images. The first estimation unit 12B performs motion estimation on at least one set of one or a plurality of pairs.
 例えば、上述した選択方法1(12A-1)の場合、第1推定部12Bは、注目フレーム画像の前後から1つずつ選択された2つのフレーム画像から成るペアに対して動き推定を行う。これに加えて、第1推定部12Bは、注目フレーム画像とその前後から選択されたフレーム画像のうち一方とから成るペアに対して動き推定を行ってもよい。 For example, in the case of the selection method 1 (12A-1) described above, the first estimation unit 12B performs motion estimation on a pair of two frame images selected one by one from before and after the target frame image. In addition to this, the first estimation unit 12B may perform motion estimation on a pair composed of the target frame image and one of the frame images selected from before and after.
In the case of selection method 2 (12A-2), as shown in FIG. 7, the first estimation unit 12B compares, for each of the frame images selected from before and after the frame image of interest, the rectangular-region luminance between each rectangular region of the frame image of interest and the rectangular region at the same position. The first estimation unit 12B then detects regions in which the change rate of the rectangular-region luminance exceeds a threshold γ. The first estimation unit 12B pairs frame images that share regions in which the change rate exceeds the threshold γ, and performs motion estimation for each pair on that common region (the region enclosed by the dotted line in FIG. 7). The threshold γ may be a preset value, or an appropriate value may be set dynamically so that motion estimation can be performed over a constant area. Alternatively, based on the determination flags between frame images other than the frame image of interest input from the determination unit 11, the first estimation unit 12B may perform motion estimation on pairs of frame images for which the determination flag between them is "0".
In the case of selection method 3 (12A-3), the first estimation unit 12B performs motion estimation on the pair consisting of the frame image of interest and the frame selected from either before or after it.
Since the image motion caused by camera motion is a global motion of the screen, it can be expressed by an affine transformation between the pair of motion estimation frame images. An affine transformation is a geometric transformation that combines a translation and a linear transformation (scaling, rotation, skew) between two images. Let the pair of motion estimation frame images be an image I and an image I', and let a pixel P(x, y) on the image I correspond to a pixel P'(x', y') on the image I'. The affine transformation from the image I to the image I' is then expressed by Equation (6):

    (x', y')ᵀ = A·(x, y)ᵀ + (tx, ty)ᵀ, where A is a 2×2 linear transformation matrix   (6)
By QR decomposition, the linear transformation matrix of Equation (6) can be decomposed into a rotation component and an upper-triangular component. Using these, Equation (6) can be expressed as Equation (7):

    (x', y')ᵀ = R(θ)·[[a', b'], [0, d']]·(x, y)ᵀ + (tx, ty)ᵀ,
    where R(θ) is the rotation matrix [[cosθ, −sinθ], [sinθ, cosθ]]   (7)
The affine transformation parameters (θ, a', b', d', tx, ty) can be calculated by detecting, for three or more pixels on the image I, the corresponding points on the image I' and substituting the coordinates into Equation (7). The first estimation unit 12B can detect the corresponding points by, for example, the following methods.
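For illustration only (not part of the original specification), the parameters of the affine map can be fitted to three or more correspondences by linear least squares; the sketch below uses Python with NumPy, and the function name and input arrays are hypothetical.

    import numpy as np

    def fit_affine(src_pts, dst_pts):
        # Least-squares fit of one common parameterization of Equation (6),
        # x' = a*x + b*y + tx, y' = c*x + d*y + ty,
        # from N >= 3 correspondences between image I (src) and image I' (dst).
        src = np.asarray(src_pts, dtype=np.float64)   # shape (N, 2)
        dst = np.asarray(dst_pts, dtype=np.float64)   # shape (N, 2)
        n = src.shape[0]
        A = np.zeros((2 * n, 6))
        rhs = dst.reshape(-1)                          # [x0', y0', x1', y1', ...]
        A[0::2, 0] = src[:, 0]; A[0::2, 1] = src[:, 1]; A[0::2, 4] = 1.0   # rows for x'
        A[1::2, 2] = src[:, 0]; A[1::2, 3] = src[:, 1]; A[1::2, 5] = 1.0   # rows for y'
        params, *_ = np.linalg.lstsq(A, rhs, rcond=None)
        a, b, c, d, tx, ty = params
        return np.array([[a, b, tx], [c, d, ty]])      # 2x3 affine matrix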
(12B-1) Detection method 1
The first estimation unit 12B calculates the optical flow for a pixel P on the image I and takes the pixel P' to which the pixel P moves as the corresponding point. Typical methods for calculating the optical flow include methods based on the Lucas-Kanade method and the Horn-Schunck method. The Lucas-Kanade method calculates the amount of image movement based on the constraint that pixel values are almost unchanged before and after the movement (Non-Patent Document 3). The Horn-Schunck method calculates the amount of image movement by minimizing an error function over the entire image while taking into account the smoothness between neighboring optical flows (Non-Patent Document 4).
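As an illustrative sketch only (not part of the original specification), the pyramidal Lucas-Kanade implementation in OpenCV can be used to obtain the destinations P' of pixels P; the file names and parameter values below are placeholders.

    import cv2

    # Pair of motion estimation frame images (grayscale); file names are placeholders.
    img_I  = cv2.imread("frame_I.png", cv2.IMREAD_GRAYSCALE)
    img_Ip = cv2.imread("frame_I_prime.png", cv2.IMREAD_GRAYSCALE)

    # Pixels P on image I for which the flow is computed.
    pts = cv2.goodFeaturesToTrack(img_I, maxCorners=500, qualityLevel=0.01, minDistance=7)

    # Pyramidal Lucas-Kanade flow: assumes pixel values are nearly unchanged between frames.
    next_pts, status, err = cv2.calcOpticalFlowPyrLK(img_I, img_Ip, pts, None,
                                                     winSize=(21, 21), maxLevel=3)
    P = pts[status.ravel() == 1].reshape(-1, 2)              # pixels P on image I
    P_prime = next_pts[status.ravel() == 1].reshape(-1, 2)   # corresponding points P' on image I'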
(12B-2) Detection method 2
The first estimation unit 12B specifies a region R' on the image I' corresponding to a region R on the image I, and takes, as the corresponding point of the pixel P at the center coordinates of the region R, the pixel P' at the center coordinates of the region R'. The regions R and R' may be rectangular regions obtained by dividing the images I and I' into a grid of a prescribed size, or may be clusters generated by clustering pixels based on image features such as color and texture.
The first estimation unit 12B can detect the region R', for example, by template matching using the region R as a template. As the similarity measure for template matching, the first estimation unit 12B may use the sum of squared differences (SSD), the sum of absolute differences (SAD), zero-mean normalized cross-correlation (ZNCC), or the like, which are based on differences of pixel values. In particular, the normalized cross-correlation RZNCC is computed, as shown in Equation (8), by subtracting the respective averages (Tave and Iave) from the luminance values of the template and the image (T(i, j) and I(i, j)), and is therefore a measure that evaluates the similarity stably even when the brightness fluctuates. By using the normalized cross-correlation, the first estimation unit 12B can therefore detect the region R' more stably than with other measures, even when there is a luminance difference between the pair of motion estimation frame images due to the influence of flash light.

    RZNCC = Σ(T(i,j) − Tave)(I(i,j) − Iave) / sqrt( Σ(T(i,j) − Tave)² · Σ(I(i,j) − Iave)² )   (8)
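For reference, OpenCV's TM_CCOEFF_NORMED mode of matchTemplate computes a zero-mean normalized correlation of this kind. The sketch below is illustrative only; the file names and region coordinates are placeholders.

    import cv2

    img_I  = cv2.imread("frame_I.png", cv2.IMREAD_GRAYSCALE)
    img_Ip = cv2.imread("frame_I_prime.png", cv2.IMREAD_GRAYSCALE)
    x, y, w, h = 120, 80, 32, 32                 # rectangular region R on image I (placeholder)
    template = img_I[y:y + h, x:x + w]

    # TM_CCOEFF_NORMED subtracts the template and window means, so the score is tolerant
    # to the brightness offset caused by flash light, as described above.
    score = cv2.matchTemplate(img_Ip, template, cv2.TM_CCOEFF_NORMED)
    _, max_val, _, max_loc = cv2.minMaxLoc(score)
    region_R_prime = (max_loc[0], max_loc[1], w, h)  # best-matching region R' on image I'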
Alternatively, the first estimation unit 12B may detect the pixel P' corresponding to the pixel P at the center coordinates of the region R by using the optical flow. For example, the first estimation unit 12B takes a representative value (a weighted mean or a median) of the optical flows estimated at the pixels in the region R as the movement amount of the region R, and takes the pixel P' reached by moving the pixel P by the movement amount of the region R as the corresponding point.
(12B-3) Detection method 3
The first estimation unit 12B extracts a pixel P corresponding to a feature point from the image I, and takes the pixel P' on the image I' corresponding to the destination of the pixel P as the corresponding point. The first estimation unit 12B may use, for example, corner points detected by the Harris corner detection algorithm as the feature points. The Harris corner detection algorithm extracts points at which the positive local maximum of the Harris operator dst(x, y), shown below, is large, based on the knowledge that "at a point on an edge the first derivative (difference) is large in only one direction, whereas at a point on a corner the first derivative is large in several directions":

    dst(x, y) = Gσ(fx²)·Gσ(fy²) − Gσ(fx·fy)² − k·(Gσ(fx²) + Gσ(fy²))²
Here, fx and fy denote the first derivatives (differences) in the x and y directions, respectively, and Gσ denotes smoothing with a Gaussian distribution of standard deviation σ. The constant k is set empirically to a value between 0.04 and 0.15.
The first estimation unit 12B may identify the corresponding point based on the optical flow detected at the feature point. Alternatively, when an image feature (for example, a SIFT (Scale-Invariant Feature Transform) feature) extracted from an image patch containing a feature point of the image I is similar to an image feature extracted from some image patch of the image I', the first estimation unit 12B may take the center of that image patch as the corresponding point P'.
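As an illustrative sketch (not part of the original specification), Harris-type feature points can be extracted and tracked with OpenCV; file names and parameter values are placeholders.

    import cv2

    img_I  = cv2.imread("frame_I.png", cv2.IMREAD_GRAYSCALE)
    img_Ip = cv2.imread("frame_I_prime.png", cv2.IMREAD_GRAYSCALE)

    # Feature points P on image I detected with the Harris criterion (k = 0.04, an empirical value).
    corners = cv2.goodFeaturesToTrack(img_I, maxCorners=300, qualityLevel=0.01,
                                      minDistance=5, useHarrisDetector=True, k=0.04)

    # Corresponding points P' on image I' from the optical flow at the feature points.
    next_pts, status, _ = cv2.calcOpticalFlowPyrLK(img_I, img_Ip, corners, None)
    P = corners[status.ravel() == 1].reshape(-1, 2)
    P_prime = next_pts[status.ravel() == 1].reshape(-1, 2)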
The first estimation unit 12B may calculate the affine transformation parameters from the three most reliable combinations of corresponding points among those detected by the above methods, or may calculate them by the least squares method from three or more combinations of corresponding points. Alternatively, the first estimation unit 12B may calculate the affine transformation parameters using a robust estimation method such as RANSAC (RANdom SAmple Consensus). In RANSAC, three combinations are selected at random from the many combinations of corresponding points to compute provisional affine transformation parameters, and when a large number of the remaining combinations are consistent with the provisional parameters, those parameters are adopted as the true affine transformation parameters. The first estimation unit 12B may also exclude specific image regions from the calculation of the affine transformation parameters. Such image regions are, for example, regions in which the detection accuracy of corresponding points is known to be low, such as the edges of the image, which are likely to fall outside the shooting range when the camera moves, or flat portions whose luminance differs little from neighboring pixels. Alternatively, such image regions are regions in which pixel values change due to factors other than camera motion, such as the central area of the screen, where a moving subject is likely to appear, or portions illuminated by fixed lighting whose color changes.
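As an illustrative sketch of robust affine estimation (not part of the original specification), OpenCV's estimateAffine2D can fit an affine model with RANSAC while rejecting outlier correspondences; the synthetic data below only serves to make the snippet self-contained.

    import cv2
    import numpy as np

    # Synthetic correspondences: points P on image I, points P' generated by a known affine
    # motion, plus a few outliers standing in for a moving subject or mismatches.
    rng = np.random.default_rng(0)
    P = rng.uniform(0, 640, size=(100, 2)).astype(np.float32)
    true_M = np.array([[1.0, 0.02, 5.0], [-0.02, 1.0, -3.0]], dtype=np.float32)
    Pp = P @ true_M[:, :2].T + true_M[:, 2]
    Pp[:10] += rng.uniform(-40, 40, size=(10, 2)).astype(np.float32)   # outliers

    # RANSAC: random minimal subsets are tried and the affine model with the largest inlier
    # support is adopted, as in the provisional-parameter procedure described above.
    M, inliers = cv2.estimateAffine2D(P, Pp, method=cv2.RANSAC, ransacReprojThreshold=3.0)
    # M is the 2x3 matrix of the camera-motion affine transform; inliers marks consistent pairs.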
The combinations of (12B-1), (12B-2), (12B-3) with (12A-1), (12A-2), (12A-3) described above are not particularly limited. That is, the first estimation unit 12B may apply (12B-1), (12B-2) or (12B-3) to the motion estimation frame images selected by any of the methods (12A-1), (12A-2) and (12A-3). In addition to the motion estimation based on the image processing described above, the first estimation unit 12B may use camera motion information acquired by a measuring device mounted on the camera (a gyroscope, a depth sensor, or the like).
Second estimation unit 12C
The second estimation unit 12C obtains the image motion caused by the motion of the subject by detecting a subject region from one of the pair of motion estimation frame images and estimating the corresponding region (the region corresponding to the subject region) from the other. Alternatively, the second estimation unit 12C may generate a converted image by applying an affine transformation to one or both of the pair of motion estimation frame images, and detect the subject region from one frame image of the pair or from its converted image. In this case, the second estimation unit 12C may obtain the image motion caused by the motion of the subject by estimating the corresponding region in the other frame image of the pair or in its converted image.
That is, the second estimation unit 12C detects the pair consisting of the subject region and the corresponding region by subtracting the image movement amount caused by camera motion, based on the affine transformation parameters and the pair of motion estimation frame images. Based on this pair, the second estimation unit 12C estimates the image movement amount caused by the motion of the subject.
Examples of methods for detecting the subject region include the following.
(12C-1-1) Detection method 1
The second estimation unit 12C detects, from one of the pair of motion estimation frame images, an image (a set of pixels) that moves differently from the movement amount estimated by the affine transformation parameters, as the subject region.
Specifically, using Equation (7), the second estimation unit 12C calculates, for a pixel P of the image I, the prediction vector (u, v) from the image I to the image I' based on the affine transformation parameters calculated between the image I and the image I'. When the difference between the vector (x'−x, y'−y) from the pixel P to the pixel P' and the vector (u, v) is equal to or larger than a certain value, the second estimation unit 12C takes the pixel P as a candidate point. Here, calculating the difference between the vectors corresponds to subtracting the image movement amount caused by camera motion. The second estimation unit 12C detects the set of candidate points as the subject region of the image I.
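A minimal sketch of this candidate-point test (not part of the original specification) is shown below; it compares a dense optical flow with the displacement predicted by the affine parameters, and the file names, the example matrix M and the threshold value are placeholders.

    import cv2
    import numpy as np

    img_I  = cv2.imread("frame_I.png", cv2.IMREAD_GRAYSCALE)
    img_Ip = cv2.imread("frame_I_prime.png", cv2.IMREAD_GRAYSCALE)
    M = np.array([[1.0, 0.0, 4.0], [0.0, 1.0, 0.0]])     # camera-motion affine parameters (example)

    h, w = img_I.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float32)

    # Displacement (u, v) predicted from the affine parameters (camera motion only).
    pred_u = M[0, 0] * xs + M[0, 1] * ys + M[0, 2] - xs
    pred_v = M[1, 0] * xs + M[1, 1] * ys + M[1, 2] - ys

    # Observed dense optical flow between the pair.
    flow = cv2.calcOpticalFlowFarneback(img_I, img_Ip, None, 0.5, 3, 15, 3, 5, 1.2, 0)

    # Pixels whose observed motion deviates from the camera-motion prediction by more than a
    # threshold become candidate points; their set is the subject region.
    residual = np.hypot(flow[..., 0] - pred_u, flow[..., 1] - pred_v)
    subject_mask = residual > 2.0                        # threshold in pixels (illustrative)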
(12C-1-2) Detection method 2
For the pair of motion estimation frame images, the second estimation unit 12C detects, as the subject region in both converted images, the regions in which the difference is large between the converted image generated by applying an affine transformation to one of the pair and the converted image generated by applying an affine transformation (inverse transformation) to the other frame image.
Specifically, using Equation (7), the second estimation unit 12C generates a predicted image Ip at an arbitrary time t from the image I based on the affine transformation parameters calculated between the image I and the image I'. Similarly, the second estimation unit 12C generates a predicted image Ip' at the time t from the image I' based on the affine transformation parameters calculated between the image I and the image I'. The second estimation unit 12C calculates the difference between the predicted images Ip and Ip', and detects the set of pixels for which the absolute value of the difference is equal to or larger than a certain value as the subject region in each of the predicted images Ip and Ip'.
Note that the second estimation unit 12C can generate a pixel (xp, yp) of the predicted image Ip by substituting the pixel (x, y) of the image I into Equation (9), where the affine transformation parameters between the image I and the image Ip are (θp, ap, bp, dp, tpx, tpy):

    (xp, yp)ᵀ = R(θp)·[[ap, bp], [0, dp]]·(x, y)ᵀ + (tpx, tpy)ᵀ   (9)
Here, (θp, ap, bp, dp, tpx, tpy) can be calculated by relational expressions from the affine transformation parameters (θ, a, b, d, tx, ty) from the image I to the image I', using the time difference T between the image I and the image I' and the time difference Tp between the image I and the image Ip.
Note, however, that the above relational expressions assume that the camera motion is at constant velocity. When the rate of change of the camera motion is known, the second estimation unit 12C may calculate (θp, ap, bp, dp, tpx, tpy) by weighting with that rate of change.
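Because the relational expressions themselves are not reproduced here, the sketch below simply interpolates the 2x3 affine matrix toward the identity by the time ratio Tp/T; this linear interpolation is an assumption made for illustration under the constant-velocity premise, not the exact formula of the specification.

    import cv2
    import numpy as np

    def predict_at_time(img_I, M_I_to_Iprime, T, Tp):
        # Warp image I to the time of the frame of interest, assuming constant camera motion
        # and approximating the intermediate transform by linear interpolation of the matrix.
        s = float(Tp) / float(T)
        identity = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
        Mp = (1.0 - s) * identity + s * np.asarray(M_I_to_Iprime, dtype=np.float64)
        h, w = img_I.shape[:2]
        return cv2.warpAffine(img_I, Mp, (w, h))   # predicted image Ip at the intermediate time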
The second estimation unit 12C can likewise generate a pixel (xp', yp') of the predicted image Ip' by substituting the pixel (x', y') of the image I' into Equation (10), where the affine transformation parameters between the image I' and the image Ip' are (θp', ap', bp', dp', tpx', tpy'):

    (xp', yp')ᵀ = R(θp')·[[ap', bp'], [0, dp']]·(x', y')ᵀ + (tpx', tpy')ᵀ   (10)
Here, (θp', ap', bp', dp', tpx', tpy') are obtained by relational expressions from the affine transformation parameters (θ', a', b', d', tx', ty') from the image I' to the image I, using the time difference T between the image I and the image I' and the time difference Tp' between the image I and the image Ip'.
(12C-1-3) Detection method 3
The second estimation unit 12C may detect, as the subject region in each of the converted image and the frame image, the regions in which the difference is large between the converted image generated by applying an affine transformation to one of the pair of motion estimation frame images and the other frame image. This detection method is a variant of (12C-1-2).
Specifically, using Equation (7), the second estimation unit 12C generates a predicted image at time t+k from the image I based on the affine transformation parameters calculated between the image I and the image I', and calculates the difference from the image I'.
After detecting the subject region, the second estimation unit 12C estimates the corresponding region corresponding to the detected subject region. Examples of methods for estimating the corresponding region of the subject region include the following. The second estimation unit 12C may use each method alone or in combination.
(12C-2-1) Estimation method 1
The second estimation unit 12C calculates, for all pixels of the subject region detected from one of the pair of motion estimation frame images, the optical flow with respect to the other frame image, and detects the region reached by moving the subject region by the weighted average of the optical flow as the corresponding region. Alternatively, the second estimation unit 12C may calculate the optical flow, for all pixels of the subject region detected from the converted image generated by applying an affine transformation to one of the pair, with respect to the other frame image or its converted image.
As the weights used in calculating the weighted average of the optical flow, the second estimation unit 12C may give a higher weight to the optical flow of pixels close to the center of gravity of the subject region. The second estimation unit 12C may give a higher weight to the optical flow of pixels within the subject region whose luminance gradient with respect to their surroundings is large, or to the optical flow of pixels for which the variance of the direction or magnitude with respect to the optical flows calculated at surrounding pixels is small. Alternatively, the second estimation unit 12C may exclude, as outliers, a certain number of the optical flows of the subject region whose magnitude is above or below a certain value, and give equal weights to the remaining optical flows. By setting the weights based on the luminance gradient or on the variance of the direction or magnitude of the optical flow, the second estimation unit 12C can estimate the position of the corresponding region based on highly reliable optical flows.
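A minimal sketch of the weighted-average computation (not part of the original specification), using the luminance-gradient magnitude as the weight; the array names are hypothetical.

    import numpy as np

    def region_displacement(flow, subject_mask, grad_mag):
        # flow: HxWx2 per-pixel optical flow, subject_mask: boolean HxW subject region,
        # grad_mag: HxW luminance-gradient magnitude used as the weight (pixels with a
        # stronger gradient receive a higher weight, as described above).
        w = grad_mag[subject_mask].astype(np.float64)
        w = w / (w.sum() + 1e-9)
        du = float((flow[..., 0][subject_mask] * w).sum())
        dv = float((flow[..., 1][subject_mask] * w).sum())
        return du, dv   # the corresponding region is the subject region shifted by (du, dv)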
(12C-2-2) Estimation method 2
The second estimation unit 12C detects the corresponding region by template matching in which the subject region detected in one of the pair of motion estimation frame images, or in its affine-transformed converted image, is used as a template to scan the other frame image or its affine-transformed converted image. As the similarity measure used for template matching, the second estimation unit 12C may use any of the measures described in (12B-2), or may use another method.
Alternatively, the second estimation unit 12C may detect the corresponding region based on the distance (Euclidean distance) between image features representing color or texture. For example, the second estimation unit 12C may extract an image feature from the subject region detected in one of the pair of motion estimation frame images, and detect, as the corresponding region, a region of the other frame image whose image feature has a short distance to it.
Alternatively, the second estimation unit 12C may roughly estimate the position of the corresponding region by template matching using the entire subject region as a template, and then search the surroundings again for each partial region generated by dividing the subject region to determine the corresponding region.
(12C-2-3) Estimation method 3
The second estimation unit 12C detects feature points from the subject region detected in one of the pair of motion estimation frame images or in its affine-transformed converted image, and detects the optical flow by detecting the points corresponding to those feature points in the other frame image or its converted image. The second estimation unit 12C detects, as the corresponding region, the region reached by moving the subject region by the weighted average of the detected optical flows. The second estimation unit 12C may use, for example, Harris corner points as the feature points, or may use feature points detected by another method.
The combinations of (12C-2-1), (12C-2-2), (12C-2-3) with (12C-1-1), (12C-1-2), (12C-1-3) described above are not particularly limited. That is, the second estimation unit 12C may apply (12C-2-1), (12C-2-2) or (12C-2-3) to the subject region detected by any of the methods (12C-1-1), (12C-1-2) and (12C-1-3).
After detecting the subject region and estimating the corresponding region, the second estimation unit 12C estimates the motion of the subject. Examples of methods for estimating the motion of the subject include the following.
(12C-3-1) Motion estimation method 1
When the second estimation unit 12C has detected, from one of the pair of motion estimation frame images, a set of pixels that move differently from the movement amount estimated by the affine transformation parameters as the subject region (12C-1-1), it estimates the motion of the subject by the following method. The second estimation unit 12C calculates the difference between the position information (coordinates) representing the position of the subject region and the position information of the corresponding region, and takes this as a provisional movement vector of the subject region. The second estimation unit 12C then calculates the difference between the provisional movement vector and the image movement vector caused by camera motion in the pair of motion estimation frame images, and takes this as the true movement vector of the subject region between the pair.
(12C-3-2) Motion estimation method 2
When the second estimation unit 12C has detected, as the subject region in both converted images, the regions in which the difference is large between the converted images generated by applying an affine transformation to each of the pair of motion estimation frame images (12C-1-2), it estimates the motion of the subject by the following method. The second estimation unit 12C calculates the difference between the position information of the subject region in one converted image and the position information of the corresponding region detected from the other converted image, and takes this as the true movement vector of the subject between the pair of motion estimation frame images.
(12C-3-3) Motion estimation method 3
When the second estimation unit 12C has detected, as the subject region in both, the regions in which the difference is large between the converted image generated by applying an affine transformation to one of the pair of motion estimation frame images and the other (12C-1-3), it estimates the motion of the subject by the following method. The second estimation unit 12C calculates the difference between the position information of the subject region in the converted image and the position information of the corresponding region detected from the other frame image, and takes this as the true movement vector of the subject between the pair of motion estimation frame images. This estimation method is a variant of (12C-3-2) described above.
The motion estimation unit 12 outputs the estimated motion information to the image generation unit 13. The motion information includes at least one of the motion information caused by camera motion and the motion information caused by subject motion.
When the camera is fixed, the motion information caused by camera motion is unnecessary. When the subject is fixed, the motion information caused by subject motion is unnecessary.
The motion estimation unit 12 outputs, as the motion information caused by camera motion, the time of each frame of the pair of motion estimation frame images used for motion estimation and the affine transformation parameters calculated between the pair. The motion estimation unit 12 outputs the motion information caused by camera motion for each of the pairs of motion estimation frame images on which motion estimation was performed.
The motion estimation unit 12 outputs, as the motion information caused by subject motion, each frame image of the pair of motion estimation frame images used for estimating the motion of the subject and its time, the position information of the subject region, the position information of the region corresponding to the subject region, and the true movement vector of the subject. The position information of the subject region represents coordinates in one of the pair of motion estimation frame images, and the position information of the corresponding region represents coordinates in the other of the pair.
When the detection of the subject region and the estimation of the corresponding region were performed on converted images generated by applying affine transformations to the pair of motion estimation frame images, the motion estimation unit 12 outputs the motion information caused by subject motion as follows. The motion estimation unit 12 outputs the time of each frame of the pair of motion estimation frame images used for estimating the motion of the subject, the position information of the subject region, the position information of the region corresponding to the subject region, and the true movement vector of the subject. The position information of the subject region represents coordinates in the converted image generated by applying an affine transformation to one of the pair of motion estimation frame images, and the position information of the corresponding region represents coordinates in the converted image generated by applying an affine transformation to the other of the pair.
The motion estimation unit 12 outputs the motion information caused by subject motion for each of the pairs of motion estimation frame images on which motion estimation was performed.
<Image generation unit 13>
FIG. 8 is a block diagram showing the configuration of the image generation unit 13.
The image generation unit 13 includes a first correction unit 13A, a second correction unit 13B, and a synthesis unit 13C.
The image generation unit 13 receives, as inputs, the plurality of frame images, the analysis information from the determination unit 11, and the motion information from the motion estimation unit 12. When the frame of interest is determined to be a frame image including a bright region caused by the blinking of light, the image generation unit 13 corrects each motion estimation frame image to an image at the time of the frame image of interest, combines them, and outputs the result as a corrected frame image.
The first correction unit 13A first generates a first corrected image for each motion estimation frame image by correcting the camera motion. The second correction unit 13B then generates a second corrected image for each motion estimation frame image by correcting the subject motion. The synthesis unit 13C combines the second corrected images generated for the motion estimation frame images to generate the corrected frame image.
The first correction unit 13A corrects the camera motion by, for example, the following methods, based on the image data of the pair of motion estimation frame images and the affine transformation parameters calculated between the pair.
Note that when each value of the affine transformation parameters is smaller than a preset threshold, the first correction unit 13A determines that there was no camera motion and need not correct the camera motion. In this case, the first correction unit 13A regards the uncorrected motion estimation frame image as the first corrected image.
(13A-1) Camera motion correction method 1
When the frame images closest to the frame image of interest and containing no bright region, one before and one after, have been selected as the motion estimation frame images (12A-1), the first correction unit 13A generates the first corrected images by the following method. Using the affine transformation parameters calculated between the two selected frame images, the first correction unit 13A generates a corrected frame image from each of these frame images.
Specifically, as described in (12C-1-2), with one of the motion estimation frame images as the image I and the other as the image I', the first correction unit 13A generates the predicted images Ip and Ip' at the time t of the frame image of interest as the first corrected images.
(13A-2) Camera motion correction method 2
When a plurality of frame images have been selected as the motion estimation frame images from each of before and after the frame image of interest (12A-2), the first correction unit 13A generates the first corrected images by the following method. Based on the affine transformation parameters calculated for each pair of motion estimation frame images, the first correction unit 13A generates a first corrected image from each pair.
Specifically, as described in (12C-1-2), with one frame of each motion estimation pair as the image I and the other as the image I', the first correction unit 13A generates the predicted images Ip and Ip' at the time t of the frame image of interest as the first corrected images. For example, when two frames are selected from each of before and after the frame image of interest and motion estimation is performed on two pairs, as shown in FIG. 7, the first correction unit 13A takes the four predicted images at the time of the frame image of interest generated for the selected frames as the first corrected images.
(13A-3) Camera motion correction method 3
When the frame image of interest and one frame image either before or after it have been selected as the motion estimation frame images (12A-3), the first correction unit 13A generates the first corrected image by the following method. Based on the affine transformation parameters calculated between the frame image of interest and the selected frame image, the first correction unit 13A generates the first corrected image from the selected frame image.
Specifically, as described in (12C-1-2), with the frame image selected as the motion estimation frame image as the image I, the first correction unit 13A generates the predicted image Ip at the time t of the frame image of interest as the first corrected image.
The second correction unit 13B corrects the subject motion by updating the pixel information at the position of the subject in the frame image of interest, based on the first corrected image and the true movement vector input from the motion estimation unit 12. Specifically, the second correction unit 13B can correct the subject motion by the following methods.
Note that when each value of the true movement vector of the subject is smaller than a preset threshold, the second correction unit 13B determines that there was no subject motion and need not correct the subject motion. In this case, the second correction unit 13B regards the first corrected image as the second corrected image.
The second correction unit 13B obtains the true movement vector of the subject between each frame image of the pair of motion estimation frame images and the frame image of interest, based on the true movement vector of the subject between the pair and the time information of the pair and of the frame image of interest.
Using the pixel values of the subject region specified in the first corrected image, the second correction unit 13B updates the pixel values at the position reached by moving from the coordinates of the subject region specified in the first corrected image by the true movement vector, and the pixel values at the coordinates of the subject region specified in the first corrected image. The second correction unit 13B thereby generates the second corrected image.
The second correction unit 13B may update the pixel values by replacing the pixel values at the destination with the pixel values of the subject region. The second correction unit 13B may also replace the pixel values at the destination with a weighted average of those pixel values and the pixel values of the subject region, or with a weighted average of the pixel values around the destination and the pixel values of the subject region.
The second correction unit 13B may also replace the pixel values at the coordinates of the subject region with the pixel values at the position reached by moving by the inverse of the true movement vector. The second correction unit 13B may replace the pixel values at the coordinates of the subject region with a weighted average of those values and the pixel values at the position reached by moving by the inverse of the true movement vector, or with a weighted average of the pixel values at that position and its surrounding pixels.
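A minimal sketch of the pixel update (not part of the original specification): the subject-region pixel values are copied to the position shifted by the true movement vector, using simple replacement rather than the weighted-averaging variants also described above; the names are hypothetical.

    import numpy as np

    def update_subject_pixels(corrected, subject_mask, move_vec):
        # corrected: first corrected image (2D array), subject_mask: boolean subject region,
        # move_vec: (dx, dy) true movement vector toward the time of the frame of interest.
        out = corrected.copy()
        dx, dy = int(round(move_vec[0])), int(round(move_vec[1]))
        ys, xs = np.nonzero(subject_mask)
        ty, tx = ys + dy, xs + dx
        ok = (ty >= 0) & (ty < out.shape[0]) & (tx >= 0) & (tx < out.shape[1])
        out[ty[ok], tx[ok]] = corrected[ys[ok], xs[ok]]   # write subject pixels at the destination
        return out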
Note that the true movement vector of the subject between each frame image of the pair of motion estimation frame images and the frame image of interest is obtained by the following expressions, where V is the true movement vector of the subject region between the frame images I1 and I2 constituting the pair of motion estimation frame images, T1 and T2 are the times of the frame images I1 and I2, respectively, and T3 is the time of the frame of interest (T1 < T3 < T2).
True movement vector of the subject from the frame image I1 to the frame image of interest:
    V・(T3−T1)/(T2−T1)   (Equation 11)
True movement vector of the subject from the frame image I2 to the frame image of interest:
    −V・(T2−T3)/(T2−T1)   (Equation 12)
The second correction unit 13B can also specify the subject image in the first corrected image by determining that the pixels of the first corrected image corresponding to the pixels determined to belong to the subject region in the motion estimation frame image are pixels of the subject region.
The synthesis unit 13C can generate the corrected frame image by combining a plurality of second corrected images. For example, the synthesis unit 13C can generate the corrected frame image Ic by Equation (13), where N is the number of second corrected images, Ii (i = 1, ..., N) are the second corrected images, and wi are the weights. The weight wi is larger as the absolute value |Di| of the time difference Di between the motion estimation frame image corresponding to the second corrected image and the frame image of interest is smaller.

    Ic = Σ(i=1..N) wi·Ii / Σ(i=1..N) wi   (13)
Note that the synthesis unit 13C may calculate wi using a function that increases linearly as |Di| decreases.
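A minimal sketch of the combination step (not part of the original specification): since the exact weight function is not reproduced here, a simple 1/(1+|Di|) weighting is used as a stand-in that likewise grows as |Di| shrinks.

    import numpy as np

    def blend_corrected_images(corrected_images, time_diffs):
        # corrected_images: list of second corrected images Ii, time_diffs: list of Di values
        # (time difference between each motion estimation frame image and the frame of interest).
        imgs = [np.asarray(im, dtype=np.float64) for im in corrected_images]
        w = np.array([1.0 / (1.0 + abs(d)) for d in time_diffs], dtype=np.float64)
        w = w / w.sum()                      # normalized weights, larger for smaller |Di|
        out = np.zeros_like(imgs[0])
        for wi, im in zip(w, imgs):
            out += wi * im
        return out                           # corrected frame image Ic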
<Image composition unit 14>
The image composition unit 14 combines the frame image of interest and the corrected frame image, and generates and outputs a frame image in which blinking due to flash or the like is suppressed (hereinafter referred to as the "output frame image").
When the frame image of interest has been determined to be an image including a bright region and a corrected frame image has been generated, the image composition unit 14 calculates a composition ratio for each pixel and generates the output image by composition processing. Otherwise, the image composition unit 14 outputs the input frame image of interest as the output frame image as it is. Given the composition ratio u(x, y) at the pixel of interest It(x, y) at the position (x, y), the image composition unit 14 calculates the value Iout(x, y) of the output frame image at the same position by Equation (14), as a combination of It(x, y) and the corresponding pixel of the corrected frame image weighted according to u(x, y).
The image composition unit 14 can calculate the composition ratio using the change rate of the local-region luminance between the frame image of interest and the corrected frame image. The image composition unit 14 can calculate this change rate rt-es by a method similar to the one with which the determination unit 11 calculates the change rate of the local-region luminance. The image composition unit 14 can then calculate the composition ratio u(x, y) at the pixel of interest at the position (x, y) by Equation (15), using the change rate rt-es(x, y) of the local-region luminance at the same position (x, y) and a preset value rtar(x, y) of the change rate of the local-region luminance in the output frame image corresponding to the value of rt-es(x, y). The image composition unit 14 calculates the composition ratio u(x, y) so that the change rate of the local-region luminance in the output frame image becomes rtar(x, y).
One example of a method of setting the value rtar of the change rate of the local-region luminance in the output frame image is, as shown in the graph of FIG. 9, to set rtar = rt-es for relatively small values of rt-es and, for large values of rt-es, to hold rtar at a predetermined maximum value so that it does not exceed that value.
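A minimal sketch of the composition step (not part of the original specification): since Equations (14) and (15) are not reproduced here, this sketch assumes that the output is the convex combination u·Ic + (1 − u)·It and chooses u so that the blended local luminance reaches the capped target rate rtar of FIG. 9; these assumptions and the parameter values are illustrative only.

    import numpy as np

    def compose_output(I_t, I_c, r_t_es, r_max=1.2):
        # I_t, I_c: luminance (single-channel) arrays of the frame of interest and the corrected
        # frame image; r_t_es: per-pixel local-region luminance change rate of I_t relative to I_c.
        I_t = np.asarray(I_t, dtype=np.float64)
        I_c = np.asarray(I_c, dtype=np.float64)
        r_tar = np.minimum(r_t_es, r_max)            # cap the target change rate as in FIG. 9
        u = np.zeros_like(r_t_es, dtype=np.float64)
        exceed = r_t_es > r_max
        # With I_out = u*I_c + (1-u)*I_t, the blended change rate is u + (1-u)*r; choosing
        # u = (r - r_tar) / (r - 1) where the cap is exceeded brings it down to r_tar.
        u[exceed] = (r_t_es[exceed] - r_tar[exceed]) / (r_t_es[exceed] - 1.0)
        return u * I_c + (1.0 - u) * I_t             # output frame image I_out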
The image composition unit 14 may instead calculate the composition ratio using the change rate of the rectangular-region luminance. Specifically, the image composition unit 14 first calculates the composition ratio U for each rectangular region from the change rate Rt-es of the rectangular-region luminance, calculated by a method similar to that of the determination unit 11, and the preset change rate of the rectangular-region luminance of the output frame image corresponding to the value of Rt-es. The image composition unit 14 then obtains the composition ratio u for each pixel from the composition ratio U for each rectangular region using linear interpolation or bicubic interpolation.
[Operation]
Next, the operation of this exemplary embodiment will be described with reference to FIGS. 1 and 10.
The determination unit 11 determines whether the frame image of interest at time t is a frame image including a bright region caused by the blinking of light from a flash or the like that may induce a photosensitivity seizure (S11).
The motion estimation unit 12 selects motion estimation frame images from a plurality of frame images including the frame image of interest, and estimates the amount of image movement caused by the motion of the camera and the subject between the motion estimation frame images (S12).
Based on the pixel movement amounts caused by the motion of the camera and the subject estimated between the motion estimation frame images, the image generation unit 13 estimates the amount of image movement caused by the camera and the subject between each motion estimation frame image and the frame image of interest. The image generation unit 13 then converts each motion estimation frame image into an image at the time of the frame image of interest and combines the converted images to generate a corrected frame image (S13).
The image composition unit 14 combines the frame image of interest and the corrected frame image, and generates and outputs an output frame image in which blinking due to flash or the like is suppressed (S14).
[Effects]
The video processing device 100 according to this exemplary embodiment can generate, for a video containing large luminance changes that may induce a photosensitivity seizure, a natural video in which the luminance fluctuation is suppressed.
The reason is that, for a frame image of interest including a region with a large luminance change, the video processing device 100 combines a frame image without the luminance change, estimated from other frame images, while changing the weight for each pixel. The video processing device 100 can thereby correct only the regions with a large luminance change and restore information lost due to blinking or the like.
Blinking caused by flashes occurs, for example, at press conferences. At a press conference, the subject (the person giving the conference) walks to the seat, sits down, and leaves after the conference. The camera follows the subject through this series of actions, so the shooting range of the camera moves to track the subject.
If images are combined without taking the motion of the camera or the subject into account, blurring or smearing of contours occurs. When such a video is played back, only the frames in which the luminance has been suppressed look as if the contour of the subject has thickened and expanded due to the blur, and the smoothness of the motion is impaired.
Because the video processing device 100 corrects the images by estimating the motion of the camera and the subject, it can suppress blurring and smearing of contours and generate a smooth video.
[Another embodiment]
In the exemplary embodiment described above, an example was described in which the blinking region is a bright region that becomes brighter (its luminance increases) in the frame image of interest than in the other frame images by a predetermined level or more. However, the video processing device 100 can be applied in the same way when the blinking region is a dark region that becomes darker (its luminance decreases) in the frame image of interest than in the other frame images by a predetermined level or more.
When flashes are fired sporadically, bright regions such as those described above occur. On the other hand, as the number of flashes increases, the overall luminance increases, and when many flashes are fired intermittently, dark regions occur momentarily.
The determination unit 11 determines whether there is a region in which the frame image of interest at time t becomes darker by a predetermined level or more than the frame image at time (t+k) among the plurality of input frame images. For example, using a preset luminance-variation-rate threshold α' and area-rate threshold β', the determination unit 11 makes the determination according to whether the area rate of the regions in which the change rate rt-t+k of the local-region luminance falls below the threshold α' exceeds the threshold β'.
When it is determined that there is a region in which the frame image of interest at time t becomes significantly darker than the frame image at time (t+k), the determination unit 11 may set the determination flag flagt-t+k to "1"; otherwise, the determination unit 11 may set the determination flag flagt-t+k to "0". The determination unit 11 calculates the determination flag for the combinations of the frame image of interest and all the other input frame images. When frame images with the determination flag "1" exist at times both before and after the frame image of interest, the determination unit 11 determines that the frame image of interest is a frame image including a dark region caused by the blinking of light.
 As another method, the determination unit 11 may use the change rate of the rectangular-region luminance. For example, using the preset luminance variation rate threshold α' and area rate threshold β', the determination unit 11 sets the determination flag flag_t-t+k to "1" or "0" according to whether the area rate of the regions whose rectangular-region luminance change rate falls below the threshold α' exceeds the threshold β'.
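 Purely as an illustrative sketch of this judgment, the following Python fragment computes such a flag with NumPy. The function name dark_region_flag, the block-based averaging, the ratio-of-means definition of the change rate, and the default values of α' and β' are assumptions made for the example, not values fixed by the embodiment.

```python
import numpy as np

def dark_region_flag(luma_t, luma_tk, alpha_dash=0.5, beta_dash=0.1, block=16):
    """Return 1 if the frame at time t is darker than the frame at time t+k
    over a sufficiently large area, 0 otherwise (illustrative sketch)."""
    h, w = luma_t.shape
    dark_blocks = 0
    total_blocks = 0
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            m_t = luma_t[y:y + block, x:x + block].mean()
            m_tk = luma_tk[y:y + block, x:x + block].mean()
            # local-region luminance change rate r_t-t+k (assumed here: ratio of mean luminances)
            r = m_t / (m_tk + 1e-6)
            if r < alpha_dash:          # darker than the threshold alpha'
                dark_blocks += 1
            total_blocks += 1
    area_rate = dark_blocks / max(total_blocks, 1)
    return 1 if area_rate > beta_dash else 0   # compare with the area-rate threshold beta'
```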
 Furthermore, the embodiment described above dealt with variations in luminance, that is, with general (white) flashes. However, the video processing apparatus 100 can be applied in the same way to variations in saturation, such as red flashes. Accordingly, the embodiment described above may include modes in which "luminance" is replaced with "saturation" or with "luminance or saturation".
 [Others]
 The embodiments according to the present invention can be applied to a video editing system that edits video recorded on a hard disk or the like. By operating on frame images held in memory, the embodiments can also be applied to video cameras, display terminals, and similar devices.
 As is also clear from the description above, each unit of the embodiments according to the present invention can be implemented in hardware, but can also be realized by a computer program. In that case, the video processing apparatus 100 realizes the same functions and operations as in the embodiments described above by means of a processor that operates according to a program stored in a program memory. It is also possible to realize only some of the functions of the embodiments described above by a computer program.
 FIG. 11 is a block diagram illustrating a hardware configuration of a computer apparatus 200 that implements the video processing apparatus 100. The computer apparatus 200 includes a CPU (Central Processing Unit) 201, a ROM (Read Only Memory) 202, a RAM (Random Access Memory) 203, a storage device 204, a drive device 205, a communication interface 206, and an input/output interface 207. The video processing apparatus 100 can be realized by the configuration shown in FIG. 11 (or a part of it).
 The CPU 201 executes a program 208 using the RAM 203. The program 208 may be stored in the ROM 202. The program 208 may also be recorded on a recording medium 209 such as a flash memory and read by the drive device 205, or transmitted from an external device via a network 210. The communication interface 206 exchanges data with external devices via the network 210. The input/output interface 207 exchanges data with peripheral devices (an input device, a display device, and the like). The communication interface 206 and the input/output interface 207 can function as means for acquiring or outputting data.
 Note that the video processing apparatus 100 may be configured by a single circuit (such as a processor) or by a combination of a plurality of circuits. The circuitry here may be either dedicated or general-purpose.
 Part or all of the embodiments described above can also be described as in the following supplementary notes, but are not limited to the following.
 (Appendix 1)
 A video processing apparatus comprising:
 determination means for determining whether any of a plurality of temporally consecutive frame images is a frame image of interest that includes a blinking region whose luminance or saturation differs from the preceding and succeeding frame images by a predetermined level or more;
 motion estimation means for estimating a first movement amount caused by camera motion and/or a second movement amount caused by subject motion, based on a pair of frame images selected, on the basis of a difference in luminance or saturation, from the frame image of interest and the frame images before and after it;
 image generation means for generating a corrected frame image corresponding to a frame image at the shooting time of the frame image of interest, based on the selected pair and the estimated first movement amount and/or second movement amount; and
 image composition means for compositing the frame image of interest and the corrected frame image.
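 Read as a processing pipeline, the means of Appendix 1 could be wired together roughly as sketched below. The stage functions are passed in as parameters because the appendix does not prescribe their internals; every name here is a placeholder introduced for the example, not part of the disclosed apparatus.

```python
from typing import Callable, List, Sequence, Tuple
import numpy as np

def suppress_blinking(
    frames: Sequence[np.ndarray],
    is_frame_of_interest: Callable[[Sequence[np.ndarray], int], bool],
    select_pair: Callable[[Sequence[np.ndarray], int], Tuple[np.ndarray, np.ndarray]],
    estimate_motion: Callable[[Tuple[np.ndarray, np.ndarray]], tuple],
    generate_corrected: Callable[[Tuple[np.ndarray, np.ndarray], tuple, int], np.ndarray],
    composite: Callable[[np.ndarray, np.ndarray], np.ndarray],
) -> List[np.ndarray]:
    """Wire the determination, estimation, generation, and composition stages together."""
    output = []
    for t, frame in enumerate(frames):
        if not is_frame_of_interest(frames, t):
            output.append(frame)              # unaffected frames pass through unchanged
            continue
        pair = select_pair(frames, t)         # pair chosen by luminance/saturation difference
        motion = estimate_motion(pair)        # (first movement amount, second movement amount)
        corrected = generate_corrected(pair, motion, t)
        output.append(composite(frame, corrected))
    return output
```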
 (Appendix 2)
 The video processing apparatus according to Appendix 1, wherein the motion estimation means includes selection means for selecting at least one frame image of the pair from among frame images other than the frame image of interest.
 (Appendix 3)
 The video processing apparatus according to Appendix 2, wherein the motion estimation means includes first estimation means for calculating a geometric transformation parameter based on the positional relationship of corresponding points or corresponding regions detected between the frame images of the pair, and estimating the first movement amount.
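 One conceivable concrete form of such a first estimation means, offered only as an assumption and not prescribed by the appendix, is a RANSAC-fitted homography over tracked feature correspondences, for example with OpenCV:

```python
import cv2
import numpy as np

def estimate_camera_motion(gray_a, gray_b):
    """Estimate a geometric transformation (here: a 3x3 homography) between two
    grayscale frames from tracked corresponding points (illustrative sketch)."""
    pts_a = cv2.goodFeaturesToTrack(gray_a, maxCorners=500, qualityLevel=0.01, minDistance=8)
    pts_b, status, _ = cv2.calcOpticalFlowPyrLK(gray_a, gray_b, pts_a, None)
    good_a = pts_a[status.flatten() == 1]
    good_b = pts_b[status.flatten() == 1]
    # Robust fit, so that correspondences on the moving subject are treated as outliers
    H, _ = cv2.findHomography(good_a, good_b, cv2.RANSAC, ransacReprojThreshold=3.0)
    return H
```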
 (Appendix 4)
 The video processing apparatus according to Appendix 3, wherein the motion estimation means includes second estimation means for detecting a subject region from one frame image of the pair based on the first movement amount, detecting a corresponding region corresponding to that subject region from the other frame image of the pair, and estimating the second movement amount based on the subject region and the corresponding region.
 (Appendix 5)
 The video processing apparatus according to Appendix 3, wherein the motion estimation means includes second estimation means for detecting a subject region from each frame image of the pair by subtracting the first movement amount based on the geometric transformation parameter, and estimating the second pixel movement amount based on the detected subject regions.
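 A minimal sketch of this idea, under the assumptions that the geometric transformation is a homography H, that frames are grayscale, and that the subject displacement can be summarized as the median optical flow inside the residual region, might look like the following; the thresholds and kernel size are illustrative only.

```python
import cv2
import numpy as np

def estimate_subject_motion(gray_a, gray_b, H, diff_thresh=25):
    """Cancel camera motion with the geometric transformation H, take the residual
    difference as the subject region, and estimate the subject displacement as the
    median optical flow inside that region (illustrative sketch)."""
    h, w = gray_b.shape
    aligned_a = cv2.warpPerspective(gray_a, H, (w, h))      # subtract the first movement amount
    diff = cv2.absdiff(aligned_a, gray_b)
    mask = (diff > diff_thresh).astype(np.uint8)            # remaining change = subject region
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
    flow = cv2.calcOpticalFlowFarneback(aligned_a, gray_b, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    if mask.any():
        dx = float(np.median(flow[..., 0][mask == 1]))
        dy = float(np.median(flow[..., 1][mask == 1]))
    else:
        dx = dy = 0.0                                       # no detectable subject motion
    return mask, (dx, dy)
```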
 (Appendix 6)
 The video processing apparatus according to any one of Appendixes 1 to 5, wherein the image generation means includes:
 first correction means for generating a first corrected image from each frame image of the pair based on the first movement amount;
 second correction means for generating a second corrected image from each of the first corrected images based on the second movement amount; and
 composition means for compositing each of the second corrected images.
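 The following sketch illustrates such two-stage correction under strong simplifying assumptions: the first movement amount is a homography per pair frame, the second movement amount is a single translation applied inside a subject mask, frames are grayscale, and the pair is composited by averaging. None of these choices is mandated by the appendix.

```python
import cv2
import numpy as np

def generate_corrected_frame(pair_frames, homographies, subj_shifts, masks):
    """Two-stage correction: warp each pair frame by its camera-motion homography
    (first corrected image), then shift its subject region by the subject motion
    (second corrected image), and average the results (illustrative sketch)."""
    corrected = []
    h, w = pair_frames[0].shape[:2]
    for frame, H, (dx, dy), mask in zip(pair_frames, homographies, subj_shifts, masks):
        first = cv2.warpPerspective(frame, H, (w, h))       # first corrected image (camera motion)
        M = np.float32([[1, 0, dx], [0, 1, dy]])
        shifted = cv2.warpAffine(first, M, (w, h))          # apply the second movement amount
        second = np.where(mask > 0, shifted, first)         # second corrected image (subject motion)
        corrected.append(second.astype(np.float32))
    return np.mean(corrected, axis=0).astype(pair_frames[0].dtype)  # composite the pair
```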
 (Appendix 7)
 The video processing apparatus according to any one of Appendixes 1 to 6, wherein the determination means determines, as the frame image of interest, a frame image in which regions whose rate of change in luminance or saturation relative to another frame image is equal to or greater than, or less than, a specified value occupy a specified area or more.
 (Appendix 8)
 The video processing apparatus according to any one of Appendixes 1 to 7, wherein the image composition means calculates the composition ratio for compositing the frame image of interest and the corrected frame image based on a predetermined function.
 (Appendix 9)
 The video processing apparatus according to any one of Appendixes 1 to 8, wherein, as the composition ratio for compositing the frame image of interest and the corrected frame image, the image composition means sets a larger composition ratio for the corrected frame image in regions where the rate of change between the frame image of interest and the corrected frame image is large.
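 For illustration only, such a change-rate-dependent composition ratio could be realized per pixel with a smooth weighting function; the logistic shape and the constants k and r0 below are assumptions made for the example, not values given by the appendix.

```python
import numpy as np

def composite_with_adaptive_ratio(frame_of_interest, corrected, k=10.0, r0=0.3):
    """Blend the frame of interest with the corrected frame, giving the corrected
    frame a larger weight where the per-pixel rate of change is large
    (illustrative sketch; grayscale arrays with values in [0, 255] assumed)."""
    f = frame_of_interest.astype(np.float32)
    c = corrected.astype(np.float32)
    change_rate = np.abs(f - c) / (c + 1e-6)                # per-pixel rate of change
    w = 1.0 / (1.0 + np.exp(-k * (change_rate - r0)))       # weight of the corrected frame
    blended = w * c + (1.0 - w) * f
    return blended.astype(frame_of_interest.dtype)
```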
 (Appendix 10)
 A video processing method comprising:
 estimating a first movement amount caused by camera motion and/or a second movement amount caused by subject motion, based on a pair of frame images selected, on the basis of a difference in luminance or saturation, from a frame image of interest and the frame images before and after it;
 generating a corrected frame image corresponding to a frame image at the shooting time of the frame image of interest, based on the selected pair and the estimated first movement amount and/or second movement amount; and
 compositing the frame image of interest and the corrected frame image.
 (Appendix 11)
 The video processing method according to Appendix 10, wherein at least one frame image of the pair is selected from among frame images other than the frame image of interest.
 (Appendix 12)
 The video processing method according to Appendix 11, wherein a geometric transformation parameter is calculated based on the positional relationship of corresponding points or corresponding regions detected between the frame images of the pair, and the first movement amount is estimated.
 (Appendix 13)
 The video processing method according to Appendix 12, wherein a subject region is detected from one frame image of the pair based on the first movement amount, a corresponding region corresponding to that subject region is detected from the other frame image of the pair, and the second movement amount is estimated based on the subject region and the corresponding region.
 (Appendix 14)
 The video processing method according to Appendix 12, wherein a subject region is detected from each frame image of the pair by subtracting the first movement amount based on the geometric transformation parameter, and the second pixel movement amount is estimated based on the detected subject regions.
 (Appendix 15)
 The video processing method according to any one of Appendixes 10 to 14, comprising:
 generating a first corrected image from each frame image of the pair based on the first movement amount;
 generating a second corrected image from each of the first corrected images based on the second movement amount; and
 compositing each of the second corrected images.
 (Appendix 16)
 The video processing method according to any one of Appendixes 10 to 15, wherein a frame image in which regions whose rate of change in luminance or saturation relative to another frame image is equal to or greater than, or less than, a specified value occupy a specified area or more is determined to be the frame image of interest.
 (Appendix 17)
 The video processing method according to any one of Appendixes 10 to 16, wherein the composition ratio for compositing the frame image of interest and the corrected frame image is calculated based on a predetermined function.
 (Appendix 18)
 The video processing method according to any one of Appendixes 10 to 17, wherein, as the composition ratio for compositing the frame image of interest and the corrected frame image, a larger composition ratio is set for the corrected frame image in regions where the rate of change between the frame image of interest and the corrected frame image is large.
 (Appendix 19)
 A video processing program for causing a computer to execute:
 a process of determining whether any of a plurality of temporally consecutive frame images is a frame image of interest that includes a blinking region whose luminance or saturation differs from the preceding and succeeding frame images by a predetermined level or more;
 a process of estimating a first movement amount caused by camera motion and/or a second movement amount caused by subject motion, based on a pair of frame images selected, on the basis of a difference in luminance or saturation, from the frame image of interest and the frame images before and after it;
 a process of generating a corrected frame image corresponding to a frame image at the shooting time of the frame image of interest, based on the selected pair and the estimated first movement amount and/or second movement amount; and
 a process of compositing the frame image of interest and the corrected frame image.
 (Appendix 20)
 The video processing program according to Appendix 19, wherein, in the estimating process, at least one frame image of the pair is selected from among frame images other than the frame image of interest.
 (Appendix 21)
 The video processing program according to Appendix 20, wherein, in the estimating process, a geometric transformation parameter is calculated based on the positional relationship of corresponding points or corresponding regions detected between the frame images of the pair, and the first movement amount is estimated.
 (Appendix 22)
 The video processing program according to Appendix 21, wherein, in the estimating process, a subject region is detected from each frame image of the pair by subtracting the first movement amount based on the geometric transformation parameter, and the second pixel movement amount is estimated based on the detected subject regions.
 (Appendix 23)
 The video processing program according to Appendix 21, wherein, in the estimating process, a subject region is detected from one frame image of the pair based on the first movement amount, a corresponding region corresponding to that subject region is detected from the other frame image of the pair, and the second movement amount is estimated based on the subject region and the corresponding region.
 (Appendix 24)
 The video processing program according to any one of Appendixes 19 to 23, wherein, in the process of generating the corrected frame image:
 a first corrected image is generated from each frame image of the pair based on the first movement amount;
 a second corrected image is generated from each of the first corrected images based on the second movement amount; and
 each of the second corrected images is composited.
 (Appendix 25)
 The video processing program according to any one of Appendixes 19 to 24, wherein, in the determining process, a frame image in which regions whose rate of change in luminance or saturation relative to another frame image is equal to or greater than, or less than, a specified value occupy a specified area or more is determined to be the frame image of interest.
 (Appendix 26)
 The video processing program according to any one of Appendixes 19 to 25, wherein, in the compositing process, the composition ratio for compositing the frame image of interest and the corrected frame image is calculated based on a predetermined function.
 (Appendix 27)
 The video processing program according to any one of Appendixes 19 to 26, wherein, in the compositing process, as the composition ratio for compositing the frame image of interest and the corrected frame image, a larger composition ratio is set for the corrected frame image in regions where the rate of change between the frame image of interest and the corrected frame image is large.
 (Appendix 28)
 A video processing apparatus comprising:
 selection means for selecting a first frame image and a second frame image from a plurality of temporally consecutive frame images;
 first estimation means for calculating a geometric transformation parameter based on the positional relationship of corresponding points or corresponding regions detected between the first frame image and the second frame image, and estimating a first movement amount caused by camera motion; and
 second estimation means for detecting a subject region from the first frame image and the second frame image by subtracting the first movement amount based on the geometric transformation parameter, and estimating a second movement amount caused by subject motion based on the detected subject region.
 (Appendix 29)
 A video processing method comprising:
 selecting a first frame image and a second frame image from a plurality of temporally consecutive frame images;
 calculating a geometric transformation parameter based on the positional relationship of corresponding points or corresponding regions detected between the first frame image and the second frame image, and estimating a first movement amount caused by camera motion; and
 detecting a subject region from the first frame image and the second frame image by subtracting the first movement amount based on the geometric transformation parameter, and estimating a second movement amount caused by subject motion based on the detected subject region.
 (Appendix 30)
 A program for causing a computer to execute:
 a process of selecting a first frame image and a second frame image from a plurality of temporally consecutive frame images;
 a process of calculating a geometric transformation parameter based on the positional relationship of corresponding points or corresponding regions detected between the first frame image and the second frame image, and estimating a first movement amount caused by camera motion; and
 a process of detecting a subject region from the first frame image and the second frame image by subtracting the first movement amount based on the geometric transformation parameter, and estimating a second movement amount caused by subject motion based on the detected subject region.
 Although the present invention has been described above with reference to preferred embodiments, the present invention is not necessarily limited to the above embodiments, and can be modified and implemented in various ways within the scope of its technical idea.
 This application claims priority based on Japanese Patent Application No. 2015-000630 filed on January 6, 2015, the entire disclosure of which is incorporated herein.
 [Description of Symbols]
 11  Determination unit
 12  Motion estimation unit
 12A  Selection unit
 12B  First estimation unit
 12C  Second estimation unit
 13  Image generation unit
 13A  First correction unit
 13B  Second correction unit
 13C  Composition unit
 14  Image composition unit

Claims (10)

  1.  A video processing apparatus comprising:
      determination means for determining whether any of a plurality of temporally consecutive frame images is a frame image of interest that includes a blinking region whose luminance or saturation differs from the preceding and succeeding frame images by a predetermined level or more;
      motion estimation means for estimating a first movement amount caused by camera motion and/or a second movement amount caused by subject motion, based on a pair of frame images selected, on the basis of a difference in luminance or saturation, from the frame image of interest and the frame images before and after it;
      image generation means for generating a corrected frame image corresponding to a frame image at the shooting time of the frame image of interest, based on the selected pair and the estimated first movement amount and/or second movement amount; and
      image composition means for compositing the frame image of interest and the corrected frame image.
  2.  The video processing apparatus according to claim 1, wherein the motion estimation means includes selection means for selecting at least one frame image of the pair from among frame images other than the frame image of interest.
  3.  The video processing apparatus according to claim 2, wherein the motion estimation means includes first estimation means for calculating a geometric transformation parameter based on the positional relationship of corresponding points or corresponding regions detected between the frame images of the pair, and estimating the first movement amount.
  4.  The video processing apparatus according to claim 3, wherein the motion estimation means includes second estimation means for detecting a subject region from one frame image of the pair based on the first movement amount, detecting a corresponding region corresponding to that subject region from the other frame image of the pair, and estimating the second movement amount based on the subject region and the corresponding region.
  5.  The video processing apparatus according to claim 3, wherein the motion estimation means includes second estimation means for detecting a subject region from each frame image of the pair by subtracting the first movement amount based on the geometric transformation parameter, and estimating the second pixel movement amount based on the detected subject regions.
  6.  The video processing apparatus according to any one of claims 1 to 5, wherein the image generation means includes:
      first correction means for generating a first corrected image from each frame image of the pair based on the first movement amount;
      second correction means for generating a second corrected image from each of the first corrected images based on the second movement amount; and
      composition means for compositing each of the second corrected images.
  7.  The video processing apparatus according to any one of claims 1 to 6, wherein the image composition means composites each pixel of the frame image of interest and the corrected frame image at a ratio corresponding to the difference in luminance between those pixels.
  8.  A video processing method comprising:
      determining whether any of a plurality of temporally consecutive frame images is a frame image of interest that includes a blinking region whose luminance or saturation differs from the preceding and succeeding frame images by a predetermined level or more;
      estimating a first movement amount caused by camera motion and/or a second movement amount caused by subject motion, based on a pair of frame images selected, on the basis of a difference in luminance or saturation, from the frame image of interest and the frame images before and after it;
      generating a corrected frame image corresponding to a frame image at the shooting time of the frame image of interest, based on the selected pair and the estimated first movement amount and/or second movement amount; and
      compositing the frame image of interest and the corrected frame image.
  9.  A recording medium recording a program for causing a computer to execute:
      a process of determining whether any of a plurality of temporally consecutive frame images is a frame image of interest that includes a blinking region whose luminance or saturation differs from the preceding and succeeding frame images by a predetermined level or more;
      a process of estimating a first movement amount caused by camera motion and/or a second movement amount caused by subject motion, based on a pair of frame images selected, on the basis of a difference in luminance or saturation, from the frame image of interest and the frame images before and after it;
      a process of generating a corrected frame image corresponding to a frame image at the shooting time of the frame image of interest, based on the selected pair and the estimated first movement amount and/or second movement amount; and
      a process of compositing the frame image of interest and the corrected frame image.
  10.  A video processing apparatus comprising:
      selection means for selecting a first frame image and a second frame image from a plurality of temporally consecutive frame images;
      first estimation means for calculating a geometric transformation parameter based on the positional relationship of corresponding points or corresponding regions detected between the first frame image and the second frame image, and estimating a first movement amount caused by camera motion; and
      second estimation means for detecting a subject region from the first frame image and the second frame image by subtracting the first movement amount based on the geometric transformation parameter, and estimating a second movement amount caused by subject motion based on the detected subject region.