WO2015159585A1

WO2015159585A1 - Image-processing device, imaging device, image-processing method, and program

Info

Publication number: WO2015159585A1
Application number: PCT/JP2015/054971
Authority: WO
Inventors: ケツテイチョウ; 直規葛谷
Original assignee: ソニー株式会社
Priority date: 2014-04-16
Filing date: 2015-02-23
Publication date: 2015-10-22

Abstract

This invention infers object regions accurately. A comparison-region determination unit determines a comparison region in which to compare an input image and a given reference image. For each of a plurality of pixels of interest in the comparison region of the input image, an input-image feature-quantity acquisition unit acquires an input-image feature quantity consisting of a value that is a function of the differences between the value of that pixel of interest and the values of surrounding pixels surrounding that pixel of interest. A reference-image feature-quantity acquisition unit acquires reference-image feature quantities each consisting of a value that is a function of the differences between the value of a corresponding pixel in the reference image that has the same coordinates as a pixel of interest and the values of pixels that have the same coordinates as the surrounding pixels surrounding said pixel of interest. On the basis of the input-image feature quantities and the reference-image feature quantities, a similarity-degree acquisition unit acquires a region similarity degree consisting of the degree of similarity between the comparison region of the input image and the comparison region of the reference image. An object inference unit infers that comparison regions of the input image that region similarity degrees indicate are not similar to the reference image are object regions.

Description

Image processing apparatus, imaging apparatus, image processing method, and program

The present technology relates to an image processing device, an imaging device, an image processing method, and a program. More specifically, the present invention relates to an image processing device, an imaging device, a processing method in these, and a program for causing a computer to execute the method.

Conventionally, for the purpose of monitoring or measuring, a surveillance camera that estimates an object region by image processing has been used. For example, a surveillance camera that estimates an object region by a background difference method for obtaining a difference between a background image and an input image has been proposed (see, for example, Patent Document 1). This surveillance camera obtains a difference area in which the pixel value difference is larger than the threshold value in the input image and the background image by the background difference method, and obtains the similarity between these difference areas by normalized cross-correlation matching. When the similarity is higher than the threshold, the monitoring camera estimates that the difference area is a disturbance area such as light or shadow, and otherwise the difference area is an object area such as a suspicious object or a suspicious person. Estimated.

JP 2007-201933 A

However, in the above-described conventional technique, when noise occurs in the input image, the similarity of the difference area is lowered due to the influence of the noise. For this reason, there is a possibility that the surveillance camera erroneously estimates that the disturbance is a suspicious object or the like. As described above, the above-described surveillance camera has a problem that the area of the object cannot be accurately estimated.

This technology was created in view of such a situation, and aims to accurately estimate an object region.

The present technology has been made to solve the above-described problems, and a first aspect of the present technology is a comparison region determination unit that determines a comparison region to be compared in each of an input image and a predetermined reference image. An input image feature that acquires a value corresponding to a difference between pixel values of surrounding pixels around the target pixel and the target pixel for each of the plurality of target pixels in the comparison region in the input image as an input image feature amount A reference image feature that acquires a value corresponding to a difference between pixel values of a quantity acquisition unit and a corresponding pixel whose coordinates match the pixel of interest in the reference image and a pixel whose coordinates match the surrounding pixel as a reference image feature quantity Based on the input image feature quantity and the reference image feature quantity, the similarity between the comparison area in the input image and the comparison area in the input image and the comparison area in the reference image is set as the area similarity. An image processing apparatus comprising: a similarity acquisition unit to be obtained; an object estimation unit that estimates the comparison region that is not similar based on the region similarity in the input image as a region of the object; A program for causing a computer to execute the method. This brings about the effect that a comparison region that is not similar is estimated as the region of the object.

Also, in the first aspect, the input image feature amount acquisition unit may detect a plurality of corners in the comparison region and set the pixel of interest. This brings about the effect that a plurality of corners are detected as the target pixel in the comparison region.

Also, in this first aspect, the input image feature quantity acquisition unit may generate a random number corresponding to any of the pixels in the comparison region and set the pixel corresponding to the random number as the target pixel. This brings about the effect that the pixel corresponding to the random number is set as the target pixel.

Further, in the first aspect, the input image feature quantity acquisition unit may extract pixels within a predetermined distance from the pixel of interest as the surrounding pixels. This brings about the effect that pixels within a predetermined distance from the target pixel are extracted as surrounding pixels.

Further, in the first aspect, the input image feature amount acquisition unit may extract a pixel whose pixel value is not similar to the target pixel from the pixels in the comparison area, and use the extracted pixel as the surrounding pixel. This brings about the effect that pixels whose pixel values are not similar to the target pixel are extracted as surrounding pixels.

In the first aspect, the object estimation unit acquires the input image feature amount acquired for each of the plurality of target pixels and the corresponding pixels whose coordinates coincide with the target pixel. Depending on the number of times it is determined for each pixel of interest whether the local similarity that is the similarity to the reference image feature amount is higher than a predetermined local determination threshold, and the local similarity is higher than the local determination threshold The obtained value may be acquired as the region similarity. Accordingly, an effect is obtained in which a value corresponding to the number of times that the local similarity between the input image feature quantity and the reference image feature quantity is determined to be higher than the local determination threshold is acquired as the area similarity.

Further, in the first aspect, pixels having similar pixel values in the input image and the reference image may be detected, and an area including the detected pixels may be determined as the comparison area. This brings about the effect that an area composed of pixels whose pixel values are not similar in the input image and the reference image is determined as the comparison area.

Further, in the first aspect, the comparison area is determined in each of the two input images and the reference image, and a vector from one comparison area to the other comparison area of the two input images is moved. It may be detected as a vector. This brings about the effect that a vector from one comparison area of the two input images to the other comparison area is detected as a movement vector.

A second aspect of the present technology includes an imaging unit that captures an input image, a comparison region determination unit that determines a comparison region to be compared in each of the input image and a predetermined reference image, and the input image An input image feature amount acquisition unit that acquires, as an input image feature amount, a value corresponding to a difference between pixel values of surrounding pixels around the target pixel and the target pixel for each of the plurality of target pixels in the comparison region; A reference image feature amount acquisition unit that acquires, as a reference image feature amount, a value corresponding to a difference in pixel value between a corresponding pixel whose coordinate matches the pixel of interest in a reference image and a pixel whose coordinate matches the surrounding pixel; A similarity acquisition unit that acquires the similarity between the comparison region in the input image and the comparison region in the reference image as a region similarity based on the input image feature amount and the reference image feature amount An imaging apparatus including an object estimation unit for estimating the comparison area in the input image is not similar based on the region similarity as an area of the object. This brings about the effect that a comparison region that is not similar is estimated as the region of the object.

According to the present technology, an excellent effect that the area of the object can be accurately estimated can be achieved. Note that the effects described here are not necessarily limited, and may be any of the effects described in the present disclosure.

1 is a block diagram illustrating a configuration example of an imaging apparatus according to a first embodiment. It is a block diagram which shows one structural example of the image process part in 1st Embodiment. It is a block diagram which shows one structural example of the difference area | region detection part in 1st Embodiment. It is a figure which shows an example of the reference | standard image in 1st Embodiment. It is a figure which shows an example of the input image in 1st Embodiment. It is a figure which shows an example of the difference image in 1st Embodiment. It is a figure which shows an example of the labeling image in 1st Embodiment. It is a block diagram which shows the example of 1 structure of the input image feature-value acquisition part in 1st Embodiment. It is a figure which shows an example of the attention pixel and surrounding pixel in the input image in 1st Embodiment. It is a figure which shows an example of the attention pixel and surrounding pixel in the reference | standard image in 1st Embodiment. It is a figure for demonstrating the acquisition method of the local feature vector in 1st Embodiment. It is a figure which shows an example of the local feature vector in 1st Embodiment. It is a figure which shows an example of the similarity in 1st Embodiment. It is a figure which shows an example of the estimation result in 1st Embodiment. 3 is a flowchart illustrating an example of an operation of the imaging apparatus according to the first embodiment. It is a flowchart which shows an example of the difference area | region detection process in 1st Embodiment. It is a flowchart which shows an example of the feature-value acquisition process in 1st Embodiment. It is a block diagram which shows the example of 1 structure of the input image feature-value acquisition part in 2nd Embodiment. It is a block diagram which shows the example of 1 structure of the reference | standard image feature-value acquisition part in 2nd Embodiment. It is a figure for demonstrating the extraction method of the pixel pair in 2nd Embodiment. It is the figure which expanded a part of difference area | region of the reference | standard image in 2nd Embodiment. It is the figure which expanded a part of difference area of the input image in 2nd Embodiment. It is a figure which shows an example of the local feature-value in 2nd Embodiment. It is a flowchart which shows an example of the feature-value acquisition process in 2nd Embodiment. It is a block diagram which shows the example of 1 structure of the difference area | region detection part in 3rd Embodiment. It is a block diagram which shows the example of 1 structure of the movement vector in 3rd Embodiment, and a stay period. It is a flowchart which shows an example of the difference area | region detection process in 3rd Embodiment.

Hereinafter, modes for carrying out the present technology (hereinafter referred to as embodiments) will be described. The description will be made in the following order.
1. First embodiment (example of obtaining similarity of region from feature amount)
2. Second embodiment (an example of selecting a pixel of interest randomly when obtaining the similarity of a region from a feature amount)
3. Third Embodiment (Example in which a movement vector is detected before the similarity of a region is obtained from a feature amount)

<1. First Embodiment>
[Configuration example of imaging device]
FIG. 1 is a block diagram illustrating a configuration example of the imaging apparatus 100 according to the first embodiment. The imaging apparatus 100 captures an image and includes an imaging lens 110, an imaging element 120, a recording unit 130, a control unit 140, and an image processing unit 200.

The imaging lens 110 collects light and guides it to the imaging device 120. The image sensor 120 converts the light from the imaging lens 110 into an electrical signal and captures an image under the control of the control unit 140. Each time the image sensor 120 captures an image, the image sensor 120 supplies the image as an input image to the image processing unit 200 via the signal line 129. The imaging element 120 is an example of an imaging unit described in the claims.

The image processing unit 200 estimates an object area in the input image. The image processing unit 200 supplies the estimation result to the control unit 140 via the signal line 209. Further, the image processing unit 200 supplies the input image to the recording unit 130 via the signal line 208.

The recording unit 130 records an input image and a reference image. For example, an image captured in advance before the input image is captured at the monitoring target location is used as the reference image.

The control unit 140 controls the entire imaging apparatus 100. The control unit 140 generates a control signal for instructing imaging in accordance with a user operation or the like, and supplies the control signal to the image sensor 120 via the signal line 149. Further, the control unit 140 receives the estimation result from the image processing unit 200, and if any region is estimated to be an object region, the control unit 140 sends an alarm signal to that effect to the outside of the imaging device 100 or the like. Output.

The image processing unit 200 is provided in the imaging device 100, but may be provided in an image processing device different from the imaging device 100. In this configuration, the imaging apparatus 100 supplies an input image to the image processing apparatus, and the image processing apparatus estimates an object region and supplies an estimation result to the imaging apparatus 200 or the like.

[Configuration example of image processing unit]
FIG. 2 is a block diagram illustrating a configuration example of the image processing unit 200 according to the first embodiment. The image processing unit 200 includes a noise removal unit 210, a difference area detection unit 220, an input image feature amount acquisition unit 230, a reference image feature amount acquisition unit 240, a similarity acquisition unit 250, and an object estimation unit 260.

The noise removing unit 210 performs processing for removing noise from the input image. For example, the noise removing unit 210 removes noise by passing a low-pass filter that suppresses high frequency components higher than a predetermined cutoff frequency. This low-pass filter is realized by, for example, an IIR (Infinite Impulse Response) filter or an FIR (Finite Impulse Response) filter. The noise removal unit 210 supplies the input image from which noise has been removed to the recording unit 130, the difference area detection unit 220, and the input image feature amount acquisition unit 230.

The difference area detection unit 220 determines an area to be compared in the input image and the reference image. For example, the difference area detection unit 220 detects pixels whose pixel values are not similar in the input image and the reference image, and sets an area including these pixels as a comparison area. First, the difference area detection unit 220 detects the absolute value of the difference between pixel values for each pixel having the same coordinate, generates a difference image composed of these differences, and converts the difference image into a binary image using a predetermined binarization threshold. Convert to value.

And the difference area detection part 220 is the identification information which identifies the area | region in each of the area | region where the pixel of the pixel value which shows that a difference absolute value is larger than a binarization threshold value in the binarized difference image. Labeling process to allocate as a label. In this labeling process, for example, a 4-connection algorithm that connects pixels that are continuous in the horizontal direction and the vertical direction, an 8-connection algorithm that further connects pixels that are continuous in an oblique direction in addition to those directions, and the like are used. It is done. Hereinafter, an area in the input image and the reference image indicated by the labeled area is referred to as a “difference area”. These difference areas are compared as areas to be compared in the input image and the reference image. The difference area detection unit 220 supplies the image subjected to the labeling process to the input image feature amount acquisition unit 230 and the reference image feature amount acquisition unit 240 as a labeling image.

The difference area detection unit 220 is an example of a comparison area determination unit described in the claims.

The input image feature amount acquisition unit 230 acquires a feature amount in a difference area in the input image. The input image feature amount acquisition unit 230 detects a plurality of target pixels in the difference region, obtains a difference between pixel values of surrounding pixels around the target pixel and the target pixel for each target pixel, and according to the difference The obtained value is acquired as a local feature amount. Here, the pixel value compared when obtaining the local feature amount is, for example, a luminance value or a color difference. The input image feature amount acquisition unit 230 supplies each of the local feature amounts to the similarity acquisition unit 250.

The reference image feature amount acquisition unit 240 acquires feature amounts in the difference area in the reference image. The reference image feature amount acquisition unit 240 receives the coordinates of the target pixel from the input image feature amount acquisition unit 230, and obtains a difference in pixel value between the corresponding pixel whose coordinates match the target pixel and the surrounding surrounding pixels. Then, the reference image feature amount acquisition unit 240 acquires a value corresponding to the difference as a local feature amount, and supplies it to the similarity acquisition unit 250.

The similarity acquisition unit 250 calculates the similarity between the difference areas of the input image and the reference image based on the local feature acquired by the input image feature acquisition unit 230 and the reference image feature acquisition unit 240. Is what you want. For example, the region similarity is higher as the degree of similarity between regions is higher. Details of the region similarity acquisition method will be described later. The similarity acquisition unit 250 supplies the obtained region similarity to the object estimation unit 260.

The object estimation unit 260 estimates a region having a region similarity lower than a predetermined region determination threshold (in other words, a region that is not similar) as a region of a suspicious object or a suspicious object that is not in the reference image. The object estimation unit 260 supplies the estimation result to the control unit 140.

The image processing unit 200 performs noise removal on the input image, but may perform processing other than noise removal as long as the image quality is improved. For example, the image processing unit 200 may perform contrast enhancement or electronic camera shake correction instead of noise removal. Electronic camera shake correction is also called stabilization processing. Alternatively, the image processing unit 200 may perform contrast enhancement and electronic camera shake correction in addition to noise removal. As described above, by performing the process of improving the image quality, it is possible to improve the estimation accuracy of the object.

[Configuration example of difference area detection unit]
FIG. 3 is a block diagram illustrating a configuration example of the difference area detection unit 220 according to the first embodiment. The difference area detection unit 220 includes a difference image generation unit 221 and a labeling processing unit 222.

The difference image generation unit 221 generates a difference image between the input image and the reference image. The difference image generation unit 221 binarizes the generated difference image and supplies it to the labeling processing unit 222.

The labeling processing unit 222 performs a labeling process on the binarized difference image. The labeling processing unit 222 supplies the image subjected to the labeling process to the input image feature amount acquisition unit 230 and the reference image feature amount acquisition unit 240 as a labeling image.

FIG. 4 is a diagram illustrating an example of the reference image 500 according to the first embodiment. As shown in the figure, the reference image 500 includes only backgrounds such as houses and trees.

FIG. 5 is a diagram illustrating an example of the input image 510 according to the first embodiment. As shown in the figure, the input image 510 includes

subjects

511, 512 and 513 in addition to the background. The subject 511 is, for example, a suspicious object or a suspicious person. The subject 512 is light such as search light, for example. The subject 513 is, for example, a cloud shadow.

FIG. 6 is a diagram illustrating an example of the difference image 520 according to the first embodiment. The difference image 520 is obtained by binarizing the difference image between the reference image 500 and the input image 510. As shown in the figure, the difference image 520 includes

difference areas

521, 522, and 523 corresponding to the

subjects

511, 512, and 513. These difference areas are areas in the input image 510 where the absolute value of the pixel value difference with respect to the reference image 500 is larger than a predetermined binarization threshold.

The imaging apparatus 100 generates a difference image from the entire input image and reference image, but is not limited to this configuration. For example, when performing an electronic zoom or the like, the imaging apparatus 100 may generate a difference image from an extraction range of an input image and a reference image using a part of the image as an extraction range.

FIG. 7 is a diagram illustrating an example of a labeling image 530 according to the first embodiment. The labeling image 530 includes

difference areas

531, 532, and 533 corresponding to the

subjects

511, 512, and 513. For example, a label “1” is assigned to the difference area 531. Further, for example, a label “2” is assigned to the difference area 532, and a label “3” is assigned to the difference area 533, for example.

FIG. 8 is a block diagram illustrating a configuration example of the input image feature amount acquisition unit 230 according to the first embodiment. The input image feature amount acquisition unit 230 includes a corner detection unit 231 and a local feature amount acquisition unit 232.

The corner detection unit 231 detects a corner as a target pixel in the difference area. The corner detection unit 231 detects an edge using a canny filter or the like in each difference region in the input image, and detects a corner that is an intersection of the detected edges as a target pixel. The corner detection unit 231 supplies the coordinates of the target pixel to the local feature amount acquisition unit 232 and the reference image feature amount acquisition unit 240.

The local feature amount acquisition unit 232 acquires a local feature amount in the input image. Here, a value corresponding to the difference between the target pixel and surrounding pixels around the target pixel is obtained as the local feature amount. The local feature amount acquisition unit 232 calculates, for example, a difference obtained by subtracting the pixel value of the target pixel from the pixel value of the surrounding pixel for each of the eight surrounding pixels whose coordinate Euclidean distance is 2 ^1/2 or less, A value corresponding to the difference is obtained as a local feature amount. Then, the local feature amount acquisition unit 232 generates a local feature vector including the local feature amount for each group including the target pixel and the eight surrounding pixels, and supplies the local feature vector to the similarity acquisition unit 250.

The configuration of the reference image feature quantity acquisition unit 240 is the same as that of the local feature quantity acquisition unit 232 except that a local feature vector is generated from the reference image.

FIG. 9 is a diagram illustrating an example of a pixel of interest and surrounding pixels in the input image 510 according to the first embodiment. In each of the

difference areas

511, 512, and 513, the input image feature amount acquisition unit 230 detects a corner as the target pixel 611. Then, the input image feature amount acquisition unit 230 extracts eight pixels around the pixel of interest 611 as surrounding pixels. In the figure, a group 612 surrounded by a dotted line is a group including a pixel of interest 611 and surrounding pixels around it.

FIG. 10 is a diagram illustrating an example of a pixel of interest and surrounding pixels in the reference image 500 according to the first embodiment. The reference image feature amount acquisition unit 240 extracts a corresponding pixel 601 having the same coordinates as the target pixel 611 and eight surrounding pixels around the corresponding pixel 601. In the figure, a group 602 surrounded by a dotted line is a group including a corresponding pixel 601 and surrounding pixels around it.

FIG. 11 is a diagram for explaining a local feature vector acquisition method according to the first embodiment. The input image feature amount acquisition unit 230 extracts the pixel of interest 611 and eight surrounding pixels. Then, the input image feature amount acquisition unit 230 calculates, for each peripheral pixel, a difference obtained by subtracting the pixel value of the target pixel 611 from the pixel value of the peripheral pixel. For example, when the difference is within a certain range of +30 to −30, the input image feature quantity acquisition unit 230 acquires a value of “0” as a local feature quantity. Further, the input image feature quantity acquisition unit 230 acquires the value “1” as the local feature quantity when the difference is greater than +30, and the value “−1” when the difference is less than −30. Acquired as a feature value. A local feature vector including these local feature amounts is obtained for each group 612. When the number of surrounding pixels is 8, each of the local feature vectors includes 8 local feature amounts.

For example, the pixel value of the target pixel is “65”, and the pixel values of the eight surrounding pixels are “30”, “60”, “65”, “60”, “140”, “60”, “200”, respectively. And “60”. In this case, “−1”, “0”, “0”, “0”, “1”, “0”, “1”, and “0” are obtained as local feature amounts.

As illustrated in FIG. 11, the imaging apparatus 100 obtains a local feature amount corresponding to a difference in pixel value, and obtains a region similarity from the local feature amount, so noise resistance is increased. For example, when the pixel value of the target pixel in the input image without noise is “65” and the pixel value of a certain surrounding pixel is “30”, the difference between them is lower than −30. −1 ”. Here, even if the pixel value of the target pixel fluctuates to “67” due to noise that cannot be removed by the noise removing unit 210, the difference is still lower than −30, and thus the local feature amount is “−1”. The value does not change. For this reason, the fluctuation | variation of area | region similarity can be suppressed.

FIG. 12 is a diagram showing an example of local feature vectors in the first embodiment. The input image feature quantity acquisition unit 230 extracts a plurality of groups for each difference area corresponding to the label in the input image, and obtains a local feature vector for each of these groups. Further, the reference image feature amount acquisition unit 240 extracts a plurality of groups for each label (difference area) in the reference image, and obtains a local feature vector for each of these groups.

FIG. 13 is a diagram illustrating an example of the degree of similarity according to the first embodiment. The similarity acquiring unit 250 obtains the similarity between the local feature acquired in the input image and the local feature acquired in the reference image for each group as the local similarity. The local similarity is obtained by, for example, normalized cross correlation matching. Then, the similarity acquisition unit 250 determines whether or not the local similarity is higher than a predetermined local determination threshold for each group, and sets a value corresponding to the number of times the local similarity is determined to be higher than the local determination threshold as a region. Calculate as similarity. For example, a value obtained by dividing the number of times that the local similarity is higher than the local determination threshold in the difference area by the number of groups in the difference area is obtained for each label (difference area) as the area similarity.

Note that the similarity acquisition unit 250 may determine the number of times the local similarity is determined to be higher than the local determination threshold as the local similarity. In this case, the corner detection part 231 should just make the number of each attention pixels of a difference area the same. For example, the corner detection unit 231 detects all corners for each difference area, and reduces the target pixel in other areas in accordance with the difference area with the smallest number of detections.

Further, although the similarity acquisition unit 250 obtains the local similarity by the normalized cross correlation matching, it is not limited to this configuration. For example, the similarity obtaining unit 250 may obtain SAD (Sum of Absolute Differences), which is the sum of absolute values of differences, as the local similarity. The similarity acquisition unit 250 may obtain SSD (SumSof Squared Differences), which is the sum of squares of differences, as the local similarity.

FIG. 14 is a diagram illustrating an example of an estimation result in the first embodiment. The object estimation unit 260 estimates a label (difference area) whose area similarity is lower than a predetermined area determination threshold as an area of a suspicious object or a suspicious person. On the other hand, a label whose region similarity is equal to or higher than a predetermined region determination threshold indicates a region where the texture of the region is not changed and the brightness is merely changed as a whole. Such a region is determined as a region where disturbance such as light or shadow has occurred. In other words, it is determined that it is not a suspicious object.

[Operation example of imaging device]
FIG. 15 is a flowchart illustrating an example of the operation of the imaging apparatus 100 according to the first embodiment. This operation is executed every time an input image is captured, for example.

First, the imaging apparatus 100 executes a difference area detection process for detecting a difference area between the input image and the reference image (step S910). Then, the imaging apparatus 100 executes a feature amount acquisition process for acquiring a feature amount in the difference area (step S920).

Further, the imaging apparatus 100 selects any one of the difference areas where the object is not estimated (step S901), and acquires the area similarity of the difference area based on the feature amount (step S902). The imaging apparatus 100 determines whether or not the region similarity is lower than the region determination threshold (step S903).

If the area similarity is lower than the area determination threshold (step S903: Yes), the imaging apparatus 100 determines that the area is an area such as a suspicious object (step S904). On the other hand, when the region similarity is equal to or greater than the region determination threshold (step S903: No), the imaging apparatus 100 determines that the region is a region such as a disturbance and is not a region such as a suspicious object (step S905). ).

After step S904 or S905, the imaging apparatus 100 determines whether the object has been estimated in all the difference areas (step S906). If the object is not estimated in all the difference areas (step S906: No), the imaging apparatus 100 returns to step S901. On the other hand, when the object is estimated in all the difference areas (step S906: Yes), the imaging apparatus 100 ends the process on the input image (step S906).

FIG. 16 is a flowchart illustrating an example of a difference area detection process according to the first embodiment. The imaging device 100 generates a difference image between the input image and the reference image (step S911). Then, the imaging apparatus 100 performs a labeling process on the difference image to generate a labeling image (step S912). After step S912, the imaging apparatus 100 ends the difference area detection process.

FIG. 17 is a flowchart illustrating an example of a feature amount acquisition process according to the first embodiment. The imaging apparatus 100 detects a plurality of corners as the target pixel in the difference area of the input image (step S921). Then, the imaging apparatus 100 extracts a group composed of the pixel of interest and surrounding pixels around it from the input image, and acquires a local feature vector composed of a local feature amount for each group (step S922). In addition, the imaging apparatus 100 extracts a group including a pixel whose coordinates match the target image and surrounding pixels around the target image in the input image, and acquires a local feature vector including a local feature amount for each group (step). S923). After step S923, the imaging apparatus 100 ends the feature amount acquisition process.

As described above, according to the first embodiment of the present technology, in order to estimate the region of the object based on the region similarity acquired from the feature amount according to the difference between the pixel values of the target pixel and the surrounding pixels, It is possible to accurately estimate the region of the object while suppressing the variation of the region similarity due to noise.

<2. Second Embodiment>
In the first embodiment, the imaging apparatus 100 detects the corner as the target pixel, but may select a randomly obtained pixel as the target pixel. In addition, the imaging apparatus 100 extracts pixels within a certain distance from the target pixel as surrounding pixels, but may extract pixels whose pixel values are not similar to the target pixel as surrounding pixels. The imaging apparatus 100 according to the second embodiment selects the randomly obtained pixel as a target pixel, and extracts pixels whose pixel values are not similar to the target pixel as surrounding pixels. Different from form.

FIG. 18 is a block diagram illustrating a configuration example of the input image feature quantity acquisition unit 230 according to the second embodiment. The input image feature amount acquisition unit 230 according to the second embodiment includes a random number generation unit 233, a surrounding pixel extraction unit 234, and a local feature amount acquisition unit 235.

The random number generation unit 233 generates a random number corresponding to any coordinate in the difference area. For example, the minimum value and the maximum value of the x coordinate of the pixels in the difference area are set to x _min and x _max, and the minimum value and the maximum value of the y coordinate are set to y _min and y _max . The random number generation unit 233 generates random numbers x _r and y _r by the following formula.
w = x _max -x _min Equation 1
h = y _max −y _min Expression 2
x _r = rand (w) Equation 3
y _r = rand (h) Expression 4
In

Equations

3 and 4, rand (A) is a function that returns a random number from 0 to A-1 using a linear congruential method or the like.

If the relative coordinates (x _r , y _r ) based on (x _min , y _min ) are not coordinates in the difference area, or if they are already generated coordinates, the random number generator 233 again generates a random number. Generate. On the other hand, when the relative coordinates (x _r , y _r ) are new coordinates in the difference area, the random number generation unit 233 supplies the coordinates to the surrounding pixel extraction unit 234 as the coordinates of the target pixel. Then, the random number generation unit 233 repeats generation of random numbers until the number of generated coordinates of the target pixel reaches a certain number.

The surrounding pixel extraction unit 234 extracts surrounding pixels whose pixel values are not similar to the target pixel. The surrounding pixel extraction unit 234 obtains a pixel value difference with respect to the target pixel in order from a pixel close to the target pixel among pixels around the target pixel in the input image, and the absolute value of the difference is a predetermined difference threshold (for example, 30) It is determined whether or not it is larger. Then, the surrounding pixel extraction unit 234 extracts, as surrounding pixels corresponding to the target pixel, a pixel that is first determined that the absolute value of the difference is larger than the difference threshold (in other words, the pixel values are not similar). Each time the surrounding pixel is extracted, the surrounding pixel extraction unit 234 supplies the coordinates of the surrounding pixel and the pixel pair of the surrounding pixel to the reference image feature amount acquisition unit 240. In addition, the surrounding pixel extraction unit 234 supplies the difference to the local feature amount acquisition unit 235.

The local feature amount acquisition unit 235 acquires a local feature amount corresponding to the difference for each pixel pair. For example, the local feature quantity acquisition unit 235 acquires a value of “1” as the local feature quantity when the sign of the difference is + and a value of “0” when the sign of the difference is −. The local feature amount acquisition unit 235 supplies the local feature amount of each pixel pair to the similarity acquisition unit 250.

Note that the input image feature quantity acquisition unit 230 randomly selects a pixel of interest, but may detect a corner as a pixel of interest as in the first embodiment. In this case, every time a corner is detected, surrounding pixels corresponding to the corner are extracted, and a pixel pair is generated.

FIG. 19 is a block diagram illustrating a configuration example of the reference image feature amount acquisition unit 240 according to the second embodiment. The reference image feature amount acquisition unit 240 includes a difference acquisition unit 241 and a local feature amount acquisition unit 242. The difference acquisition unit 241 acquires a pixel value difference between a pixel whose coordinates match the target pixel in the reference image and a pixel whose coordinates match the surrounding pixels. The difference acquisition unit 241 supplies the difference to the local feature amount acquisition unit 242. The configuration of the local feature quantity acquisition unit 242 is the same as that of the local feature quantity acquisition unit 235.

FIG. 20 is a diagram for explaining a pixel pair extraction method according to the second embodiment. In the figure, a black pixel is a pixel of interest selected at random. Also, the hatched pixels are surrounding pixels associated with the target pixel. A pixel pair whose both ends are connected by a line segment indicated by an arrow is a pixel pair of a corresponding target pixel and surrounding pixels.

The surrounding pixel extraction unit 234, for example, pays attention to the pixels in order along a spiral path that moves away from the target pixel 711 as the vehicle turns, obtains a pixel value difference with respect to the target pixel 711 for the target pixel, and the absolute value of the difference is It is determined whether or not the difference threshold value is greater. Then, the surrounding pixel extraction unit 234 extracts, as the corresponding surrounding pixel 712, a pixel that is first determined that the absolute value of the difference is larger than the difference threshold.

Note that the surrounding pixel extraction unit 234 extracts surrounding pixels by paying attention to the pixels in order along the spiral path. However, as in the first embodiment, pixels within a certain distance are used as surrounding pixels. It may be extracted. In this case, for each pixel of interest selected at random, for example, surrounding 8 pixels are extracted as surrounding pixels, and a local feature vector is obtained.

FIG. 21 is an enlarged view of a part of the difference area of the reference image in the second embodiment. The number of pixels in the enlarged part is 200 pixels of 20 × 10. In this portion, it is assumed that there are 139 pixels having a pixel value “40” and 61 pixels having a pixel value “200”.

FIG. 22 is an enlarged view of a part of the input image in the second embodiment. The enlarged part is a part corresponding to the part enlarged in the input image in FIG. This part is a part of a cloud shadow or the like, and it is assumed that the average pixel value is about 40 lower than the reference image. As a result, the pixel value of the pixel having the same coordinate as the pixel having the pixel value “40” in the reference image is around “0” in the input image, and the pixel value having the same coordinate as the pixel having the pixel value “200” in the reference image. The pixel value is around “160” in the input image.

Here, it is assumed that noise is generated in the input image that cannot be completely removed by the noise removal unit 210. It is assumed that the pixel value has a variation of about a variance value “5” due to the influence of noise. As a result, for example, there are 95 pixels of “0”, 13 pixels of pixel value “2”, and 31 pixels of pixel value “4”. Also, there are 24 pixels with the pixel value “155”, 26 pixels with the pixel value “160”, and 11 pixels with the pixel value “162”.

If the feature similarity is not obtained and the region similarity of the enlarged portion of the input image and the reference image is obtained by normalized cross-correlation matching, the region similarity R is calculated by the following equation.

In the above equation, N is the number of pixels in the x direction of the difference area, and M is the number of pixels in the Y direction. I (i, j) is a pixel value of a pixel in the difference area of the reference image. T (i, j) is a pixel value of a pixel in the difference area of the input image.

When the pixel value is actually input to Equation 5, the following equation is obtained.

As exemplified in Equation 6, when the region similarity R is obtained by normalized cross-correlation matching or the like, the region similarity R may be lowered due to the influence of noise.

FIG. 23 is a diagram illustrating an example of local feature amounts according to the second embodiment. A in the same figure is an example of the local feature-value of the expansion part of an input image. B in the figure is an example of the local feature amount of the enlarged portion of the reference image. As illustrated in the figure, the local feature values for each pixel pair in the input image and the reference image all match. For this reason, the region similarity is the maximum value “1”. Thus, the region similarity is unlikely to decrease due to the influence of noise.

FIG. 24 is a flowchart illustrating an example of a feature amount acquisition process according to the second embodiment. The imaging apparatus 100 selects a target pixel at random in the difference area of the input image (step S925). Then, the imaging apparatus 100 pays attention to the pixels in order along the spiral path around the pixel of interest, and extracts surrounding pixels in which the absolute value of the pixel value difference is greater than the difference threshold (step S926). The imaging apparatus 100 acquires the local feature amount of the pixel pair of the target pixel and the surrounding pixels (step S927). The imaging apparatus 100 determines whether or not the number of pixel pairs is equal to or greater than a set value (step S928). When the number of pixel pairs is less than the set value (step S928: No), the imaging apparatus 100 returns to step S925.

On the other hand, when the number of pixel pairs is equal to or larger than the set value (step S928: Yes), the imaging apparatus 100 acquires a local feature amount in the reference image for each pixel pair (step S929). After step S929, the imaging apparatus 100 ends the feature amount acquisition process.

As described above, according to the second embodiment, since the imaging apparatus 100 generates a random number and selects a pixel corresponding to the random number as a target pixel, a difference region (for example, a density gradient that is difficult to detect a corner) is selected. In a small area, the target pixel can be easily selected.

<3. Third Embodiment>
In the first embodiment, it is not assumed that the object moves. However, the movement vector of the object may be further detected on the assumption that the object moves. The imaging apparatus 100 according to the third embodiment is different from the first embodiment in that the movement vector is further detected.

FIG. 25 is a block diagram illustrating a configuration example of the difference area detection unit 220 according to the third embodiment. The difference area detection unit 220 of the third embodiment differs from the first embodiment in that it further includes a data buffer 223 and a movement vector detection unit 224.

The data buffer 223 holds a labeling image, a detection vector, and a stay period. The detection vector and the stay period will be described later.

The labeling processing unit 222 according to the third embodiment supplies a labeling image to the data buffer 223 and the movement vector detection unit 224.

The movement vector detection unit 224 detects a movement vector for each difference area. The movement vector detection unit 224 acquires the current labeling image and the past labeling image and movement vector held in the data buffer 223. The movement vector detection unit 224 pays attention to one of the difference areas in the past input image in order, and searches the current difference area corresponding to the difference area. For example, the movement vector detection unit 224 calculates, as the current movement vector, a vector from the reference coordinates of the focused past difference area to the respective reference coordinates of the current difference area. Here, the reference coordinates are, for example, the center and the center of gravity of the difference area. In addition, the movement vector detection unit 224 calculates the area of the past difference area of interest and the area of each of the current difference areas. Then, the movement vector detection unit 224 gives priority to a difference area with a small variation in area and movement vector and acquires it as a corresponding area. For example, the movement vector detection unit 224 obtains an evaluation value for each current difference area by the following formula, and acquires a difference image having the highest evaluation value as a corresponding area.

Fv (k) = | v_k−v_k ′ |

F (k) = Fv (k) −Fs (k) Equation 9
In the above equation, v_k is a movement vector of the difference area corresponding to the current label k. v_k ′ is a movement vector of the past difference region of interest. S_k is the area of the difference area corresponding to the current label k. S_k ′ is the area of the past difference region of interest. F (k) is an evaluation value of the difference area corresponding to the label k. In the first input image, the movement vector is set to an initial value.

The movement vector detection unit 224 updates the label of the current difference area to the same label as the corresponding past difference area. The movement vector detection unit 224 counts up the stay period of the updated label by one frame. In the first input image, the stay period of each label is set to an initial value. Then, the movement vector detection unit supplies the labeling image with the updated label to the data buffer 223, the input image feature amount acquisition unit 230, and the reference image feature amount acquisition unit 240, and the movement vector and the stay period are displayed in the data buffer 223 and This is supplied to the control unit 140. As described above, by associating the past difference area and the current difference area based on the area and the movement vector, it is possible to track the moving body moving at a constant speed.

FIG. 26 is a block diagram illustrating a configuration example of the movement vector and the stay period in the third embodiment. As illustrated at the same time, the movement vector detection unit 224 obtains a movement vector and a stay period for each label.

FIG. 27 is a flowchart illustrating an example of a difference area detection process according to the third embodiment. The difference area detection process of the third embodiment is different from that of the first embodiment in that steps S913 and S914 are further executed. The difference area detection unit 220 generates a labeling image (step S912) and detects a movement vector for each label (step S913). Further, a stay period is obtained for each label (step S914). After step S914, the difference area detection unit 220 ends the difference area detection process.

As described above, according to the third embodiment, the imaging apparatus 100 can easily track the moving body in order to obtain the movement vector of the difference area. Thereby, the imaging device 100 can perform a useful process for crime prevention, such as obtaining the stay period of the moving body and reproducing only the input image of the stay period.

The above-described embodiment shows an example for embodying the present technology, and the matters in the embodiment and the invention-specific matters in the claims have a corresponding relationship. Similarly, the invention specific matter in the claims and the matter in the embodiment of the present technology having the same name as this have a corresponding relationship. However, the present technology is not limited to the embodiment, and can be embodied by making various modifications to the embodiment without departing from the gist thereof.

Further, the processing procedure described in the above embodiment may be regarded as a method having a series of these procedures, and a program for causing a computer to execute these series of procedures or a recording medium storing the program. You may catch it. As this recording medium, for example, a CD (Compact Disc), an MD (MiniDisc), a DVD (Digital Versatile Disc), a memory card, a Blu-ray disc (Blu-ray (registered trademark) Disc), or the like can be used.

It should be noted that the effects described here are not necessarily limited, and may be any of the effects described in the present disclosure.

In addition, this technique can also take the following structures.
(1) a comparison area determination unit that determines a comparison area to be compared in each of the input image and the predetermined reference image;
An input image feature value that acquires, as an input image feature value, a value corresponding to a pixel value difference between surrounding pixels around the target pixel and the target pixel for each of the plurality of target pixels in the comparison region in the input image An acquisition unit;
A reference image feature amount acquisition unit that acquires, as a reference image feature amount, a value corresponding to a difference in pixel value between a corresponding pixel whose coordinates match the target pixel in the reference image and a pixel whose coordinates match the surrounding pixel;
A similarity acquisition unit that acquires a similarity between the comparison region in the input image and the comparison region in the reference image as a region similarity based on the input image feature and the reference image feature;
An image processing apparatus comprising: an object estimation unit configured to estimate the comparison region that is not similar based on the region similarity in the input image as an object region.
(2) The image processing apparatus according to (1), wherein the input image feature amount acquisition unit detects a plurality of corners in the comparison region and sets the target pixel.
(3) The image according to (1), wherein the input image feature amount acquisition unit generates a random number corresponding to any one of the pixels in the comparison area and sets the pixel corresponding to the random number as the pixel of interest. Processing equipment.
(4) The image processing device according to any one of (1) to (3), wherein the input image feature quantity acquisition unit extracts a pixel within a predetermined distance from the target pixel as the surrounding pixel.
(5) The input image feature quantity acquisition unit extracts a pixel whose pixel value is not similar to the target pixel from the pixels in the comparison area, and sets the pixel as the surrounding pixel. An image processing apparatus according to any one of the above.
(6) The object estimation unit may calculate the input image feature amount acquired for each of the plurality of target pixels and the reference image feature amount acquired for each of the corresponding pixels whose coordinates coincide with the target pixel. Whether the local similarity that is the similarity is higher than a predetermined local determination threshold value is determined for each pixel of interest, and a value corresponding to the number of times that the local similarity is determined to be higher than the local determination threshold value is the region similarity The image processing apparatus according to any one of (1) to (5), acquired as a degree.
(7) The image processing according to any one of (1) to (6), wherein a pixel whose pixel value is not similar in the input image and the reference image is detected and an area including the detected pixel is used as the comparison area. apparatus.
(8) The comparison area determination unit determines the comparison area in each of the two input images and the reference image, and calculates a vector from one comparison area to the other comparison area of the two input images. The image processing device according to any one of (1) to (7), wherein the image processing device is detected as a movement vector.
(9) an imaging unit that captures an input image;
A comparison area determination unit for determining a comparison area to be compared in each of the input image and the predetermined reference image;
An input image feature value that acquires, as an input image feature value, a value corresponding to a pixel value difference between surrounding pixels around the target pixel and the target pixel for each of the plurality of target pixels in the comparison region in the input image An acquisition unit;
A reference image feature amount acquisition unit that acquires, as a reference image feature amount, a value corresponding to a difference in pixel value between a corresponding pixel whose coordinates match the target pixel in the reference image and a pixel whose coordinates match the surrounding pixel;
A similarity acquisition unit that acquires a similarity between the comparison region in the input image and the comparison region in the reference image as a region similarity based on the input image feature and the reference image feature;
An imaging apparatus comprising: an object estimation unit configured to estimate the comparison area that is not similar based on the area similarity in the input image as an object area.
(10) a comparison region determination procedure in which the comparison region determination unit determines a comparison region to be compared in each of the input image and the predetermined reference image;
The input image feature amount acquisition unit inputs a value corresponding to a difference between pixel values of surrounding pixels around the target pixel and the target pixel for each of the plurality of target pixels in the comparison region in the input image. Input image feature amount acquisition procedure to be acquired as a quantity;
A reference image feature amount acquisition unit acquires, as a reference image feature amount, a value corresponding to a difference in pixel value between a corresponding pixel whose coordinates match the target pixel and a pixel whose coordinates match the surrounding pixel in the reference image. A reference image feature acquisition procedure;
Similarity acquired by the similarity acquisition unit based on the input image feature quantity and the reference image feature quantity as the similarity between the comparison area in the input image and the comparison area in the reference image Acquisition procedure;
An object processing method, comprising: an object estimation unit configured to estimate, as an object region, the comparison region that is not similar based on the region similarity in the input image.
(11) a comparison region determination procedure in which the comparison region determination unit determines a comparison region to be compared in each of the input image and the predetermined reference image;
The input image feature amount acquisition unit inputs a value corresponding to a difference between pixel values of surrounding pixels around the target pixel and the target pixel for each of the plurality of target pixels in the comparison region in the input image. Input image feature amount acquisition procedure to be acquired as a quantity;
A reference image feature amount acquisition unit acquires, as a reference image feature amount, a value corresponding to a difference in pixel value between a corresponding pixel whose coordinates match the target pixel and a pixel whose coordinates match the surrounding pixel in the reference image. A reference image feature acquisition procedure;
Similarity acquired by the similarity acquisition unit based on the input image feature quantity and the reference image feature quantity as the similarity between the comparison area in the input image and the comparison area in the reference image Acquisition procedure;
A program for causing an object estimation unit to cause a computer to execute an object estimation procedure for estimating a comparison area that is not similar based on the area similarity in the input image as an object area.

DESCRIPTION OF SYMBOLS 100 Image pick-up device 110 Imaging lens 120 Image pick-up element 130 Recording part 140 Control part 200 Image processing part 210 Noise removal part 220 Difference area detection part 221 Difference image generation part 222 Labeling process part 223 Data buffer 224 Movement vector detection part 230 Input image Feature amount acquisition unit 231

Corner detection unit

232, 235, 242 Local feature amount acquisition unit 233 Random number generation unit 234 Surrounding pixel extraction unit 240 Reference image feature amount acquisition unit 241 Difference acquisition unit 250 Similarity acquisition unit 260 Object estimation unit

Claims

A comparison area determination unit for determining a comparison area to be compared in each of the input image and the predetermined reference image;
An input image feature value that acquires, as an input image feature value, a value corresponding to a pixel value difference between surrounding pixels around the target pixel and the target pixel for each of the plurality of target pixels in the comparison region in the input image An acquisition unit;
A reference image feature amount acquisition unit that acquires, as a reference image feature amount, a value corresponding to a difference in pixel value between a corresponding pixel whose coordinates match the target pixel in the reference image and a pixel whose coordinates match the surrounding pixel;
A similarity acquisition unit that acquires a similarity between the comparison region in the input image and the comparison region in the reference image as a region similarity based on the input image feature and the reference image feature;
An image processing apparatus comprising: an object estimation unit configured to estimate the comparison region that is not similar based on the region similarity in the input image as an object region.
The image processing apparatus according to claim 1, wherein the input image feature amount acquisition unit detects a plurality of corners in the comparison region as the target pixel.
The image processing apparatus according to claim 1, wherein the input image feature amount acquisition unit generates a random number corresponding to any one of the pixels in the comparison region and sets the pixel corresponding to the random number as the target pixel.
The image processing apparatus according to claim 1, wherein the input image feature amount acquisition unit extracts pixels within a predetermined distance from the target pixel as the surrounding pixels.
The image processing apparatus according to claim 1, wherein the input image feature amount acquisition unit extracts a pixel whose pixel value is not similar to the target pixel from the pixels in the comparison region, and sets the pixel as the surrounding pixel.
The object estimation unit is based on a similarity between the input image feature amount acquired for each of the plurality of target pixels and the reference image feature amount acquired for each of the corresponding pixels whose coordinates coincide with the target pixel. Whether or not a certain local similarity is higher than a predetermined local determination threshold is determined for each pixel of interest, and a value corresponding to the number of times the local similarity is determined to be higher than the local determination threshold is acquired as the region similarity The image processing apparatus according to claim 1.
The image processing apparatus according to claim 1, wherein the comparison area determination unit detects pixels whose pixel values are not similar in the input image and the reference image, and sets an area including the detected pixels as the comparison area.
The comparison area determination unit determines the comparison area in each of the two input images and the reference image, and uses a vector from one comparison area to the other comparison area of the two input images as a movement vector. The image processing apparatus according to claim 1 for detection.
An imaging unit that captures an input image;
A comparison area determination unit for determining a comparison area to be compared in each of the input image and the predetermined reference image;
An input image feature value that acquires, as an input image feature value, a value corresponding to a pixel value difference between surrounding pixels around the target pixel and the target pixel for each of the plurality of target pixels in the comparison region in the input image An acquisition unit;
A reference image feature amount acquisition unit that acquires, as a reference image feature amount, a value corresponding to a difference in pixel value between a corresponding pixel whose coordinates match the target pixel in the reference image and a pixel whose coordinates match the surrounding pixel;
A similarity acquisition unit that acquires a similarity between the comparison region in the input image and the comparison region in the reference image as a region similarity based on the input image feature and the reference image feature;
An imaging apparatus comprising: an object estimation unit configured to estimate the comparison area that is not similar based on the area similarity in the input image as an object area.
A comparison region determination procedure in which a comparison region determination unit determines a comparison region to be compared in each of the input image and the predetermined reference image;
The input image feature amount acquisition unit inputs a value corresponding to a difference between pixel values of surrounding pixels around the target pixel and the target pixel for each of the plurality of target pixels in the comparison region in the input image. Input image feature amount acquisition procedure to be acquired as a quantity;
A reference image feature amount acquisition unit acquires, as a reference image feature amount, a value corresponding to a difference in pixel value between a corresponding pixel whose coordinates match the target pixel and a pixel whose coordinates match the surrounding pixel in the reference image. A reference image feature acquisition procedure;
Similarity acquired by the similarity acquisition unit based on the input image feature quantity and the reference image feature quantity as the similarity between the comparison area in the input image and the comparison area in the reference image Acquisition procedure;
An object processing method, comprising: an object estimation unit configured to estimate, as an object region, the comparison region that is not similar based on the region similarity in the input image.
A comparison region determination procedure in which a comparison region determination unit determines a comparison region to be compared in each of the input image and the predetermined reference image;
The input image feature amount acquisition unit inputs a value corresponding to a difference between pixel values of surrounding pixels around the target pixel and the target pixel for each of the plurality of target pixels in the comparison region in the input image. Input image feature amount acquisition procedure to be acquired as a quantity;
A reference image feature amount acquisition unit acquires, as a reference image feature amount, a value corresponding to a difference in pixel value between a corresponding pixel whose coordinates match the target pixel and a pixel whose coordinates match the surrounding pixel in the reference image. A reference image feature acquisition procedure;
Similarity acquired by the similarity acquisition unit based on the input image feature quantity and the reference image feature quantity as the similarity between the comparison area in the input image and the comparison area in the reference image Acquisition procedure;
A program for causing an object estimation unit to cause a computer to execute an object estimation procedure for estimating a comparison area that is not similar based on the area similarity in the input image as an object area.