WO2011086901A1 - Image processing device, image capture device, and image processing program - Google Patents

Image processing device, image capture device, and image processing program

Info

Publication number
WO2011086901A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
value
block
unit
blocks
Prior art date
Application number
PCT/JP2011/000098
Other languages
French (fr)
Japanese (ja)
Inventor
河内亮太
森屋健太郎
大元憲英
Original Assignee
株式会社ニコンシステム
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 株式会社ニコンシステム
Publication of WO2011086901A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/10 - Segmentation; Edge detection
    • G06T 7/11 - Region-based segmentation
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/10 - Segmentation; Edge detection
    • G06T 7/136 - Segmentation; Edge detection involving thresholding
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/10 - Segmentation; Edge detection
    • G06T 7/194 - Segmentation; Edge detection involving foreground-background segmentation
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/40 - Extraction of image or video features
    • G06V 10/50 - Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/161 - Detection; Localisation; Normalisation
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 23/00 - Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N 23/60 - Control of cameras or camera modules
    • H04N 23/61 - Control of cameras or camera modules based on recognised objects
    • H04N 23/611 - Control of cameras or camera modules based on recognised objects where the recognised objects include parts of the human body
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 23/00 - Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N 23/60 - Control of cameras or camera modules
    • H04N 23/63 - Control of cameras or camera modules by using electronic viewfinders
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 23/00 - Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N 23/60 - Control of cameras or camera modules
    • H04N 23/64 - Computer-aided capture of images, e.g. transfer from script file into camera, check of taken image quality, advice or proposal for image composition or decision on when to take image
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 - Image acquisition modality
    • G06T 2207/10024 - Color image
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 - Special algorithmic details
    • G06T 2207/20016 - Hierarchical, coarse-to-fine, multiscale or multiresolution image processing; Pyramid transform
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 - Special algorithmic details
    • G06T 2207/20021 - Dividing image into blocks, subimages or windows
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 - Special algorithmic details
    • G06T 2207/20048 - Transform domain processing
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 - Subject of image; Context of image processing
    • G06T 2207/30196 - Human being; Person
    • G06T 2207/30201 - Face

Definitions

  • The present invention relates to an image processing device, an imaging device, and an image processing program.
  • Patent Document 1 discloses a technique for locating a main subject region by computing the degree of difference between the features of one part of an image and the features of the parts surrounding it.
  • The conventional technique extracts the main subject region based on high-frequency components. Consequently, when a portion other than the main subject region (the background region) also contains high-frequency components, that portion is extracted as part of the main subject region, and a preferable result cannot be obtained. Techniques that extract the main subject region based on an empirically assumed composition have also been considered: for example, extraction that emphasizes the central part of the image on the assumption that the main subject lies there, or extraction according to the rule of thirds ("the composition is balanced when background lines are placed on the horizontal and vertical one-third lines, or the subject is placed at their intersections"). Such techniques may fail to perform preferable extraction on images whose composition differs from the assumed one.
  • An object of the present invention is therefore to provide a means of extracting the main subject region by a method suited to the target image, without depending on high-frequency components or assuming an empirical composition.
  • An image processing apparatus includes: an acquisition unit that acquires information on a target image to be processed; a region division unit that divides the target image into a plurality of blocks; a setting unit that sets one or more templates based on the images of one or more blocks located on the outer periphery of the target image; a calculation unit that calculates a representative value for each of the plurality of blocks obtained by dividing the target image; a matching unit that performs matching for each of the plurality of blocks by comparing the representative value of the matching target block with the representative values of the one or more templates; and a creation unit that creates a map indicating the distribution of the subject in the target image based on the matching results.
  • The image processing apparatus may further include a generation unit that generates at least one image having a lower resolution than the target image. In that case, the region division unit divides both the target image and the low-resolution image into a plurality of blocks, the setting unit sets the one or more templates for each of the target image and the low-resolution image, the matching unit performs the matching for each of the target image and the low-resolution image, and the creation unit creates a map showing the distribution of the subject for each of the target image and the low-resolution image and then combines the plurality of created maps by computation to produce a map showing the distribution of the subject in the target image.
  • The generation unit may generate at least one low-resolution image by applying, to the target image, processing that suppresses or transmits a specific frequency band.
  • The generation unit may generate at least one low-resolution image by applying at least one of low-pass processing and resizing processing to the target image.
  • The generation unit may generate at least one low-resolution image by applying band-pass filter processing to the target image.
  • The setting unit may set the one or more templates based on the images of all blocks on the outer periphery of the target image. Alternatively, it may set the one or more templates based on the images of all blocks on the three sides excluding the lower side of the target image, based on the images of all blocks on those three sides together with the images of some predetermined blocks on the lower side, or based on the images of all blocks on the left and right sides.
  • The acquisition unit may further acquire posture information of the imaging device at the time the target image was captured; the setting unit then selects one or more blocks from the plurality of blocks based on the posture information and sets the one or more templates based on the images of the selected blocks.
  • The setting unit may select some blocks from all the blocks on the outer periphery of the target image based on the position of the matching target block within the target image, and set the one or more templates based on the images of the selected blocks.
  • The calculation unit may calculate, as the representative values, the pixel values of the pixels included in a block, and the matching unit may derive the evaluation value for the matching target block based on the differences between pixel values in the matching target block and the corresponding pixel values in the templates.
  • The calculation unit may calculate, as the representative values, the pixel values of the pixels included in a block. The matching unit then obtains, for each of the one or more templates, the sum of absolute differences: for every pixel in the matching target block, the absolute value of the difference between its pixel value and the pixel value at the corresponding position in the template block is computed, and these absolute values are added. The minimum of the obtained sums of absolute differences may be used as the evaluation value for the matching target block.
  • Alternatively, the calculation unit may calculate the representative values by first obtaining the pixel values of the pixels included in a block and then applying an image transform to the frequency domain to those pixel values. The matching unit then obtains, for each of the one or more templates, the sum of absolute differences between each representative value in the matching target block and the corresponding representative value in the template block, and the minimum of the obtained sums of absolute differences may be used as the evaluation value for the matching target block.
  • The calculation unit may also compute, for the pixel values included in a block, a plurality of representative colors and their weights by clustering or the like, and the matching may then use a distance measure that takes the representative colors and their weights into account, such as the EMD (Earth Mover's Distance). In that case, the number of representative colors may differ from block to block.
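  • As a rough illustration of this variant (the publication contains no code), the sketch below clusters a block's pixels into weighted representative colors and compares two blocks with an EMD. The use of scikit-learn's KMeans and OpenCV's cv2.EMD, the function names, and the choice of n_colors are all assumptions for the sketch, not the patent's own method.

```python
# Hypothetical sketch: weighted representative colors per block + EMD comparison.
import numpy as np
import cv2
from sklearn.cluster import KMeans

def block_signature(block_rgb, n_colors=3):
    """Cluster a block's pixels into representative colors with weights.

    Returns an OpenCV EMD 'signature': each row is (weight, R, G, B).
    n_colors may differ from block to block, as the description allows.
    """
    pixels = block_rgb.reshape(-1, 3).astype(np.float32)
    km = KMeans(n_clusters=n_colors, n_init=5, random_state=0).fit(pixels)
    weights = np.bincount(km.labels_, minlength=n_colors) / len(pixels)
    return np.hstack([weights[:, None], km.cluster_centers_]).astype(np.float32)

def block_distance(sig_a, sig_b):
    """Earth Mover's Distance between two color signatures."""
    emd, _, _ = cv2.EMD(sig_a, sig_b, cv2.DIST_L2)
    return emd
```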
  • The calculation unit may calculate the representative values by applying at least one of a Fourier transform, a discrete cosine transform, and a wavelet transform to the pixel values included in the block.
  • The calculation unit may calculate, as the representative value for each of the plurality of blocks, a value indicating a color feature based on the distribution of the plurality of color components constituting the target image. The matching unit then obtains, for each of the one or more templates, the sum of absolute differences between the representative value of the matching target block and the representative value of the template block, and at least one of the minimum, the maximum, and the average of the obtained sums of absolute differences may be used as the evaluation value for the matching target block.
  • The calculation unit may calculate, as the representative value, at least one of a value indicating a representative color based on a histogram and a value indicating a feature amount based on a relative histogram.
  • The creation unit may compare the evaluation value with a threshold determined according to the range of values the evaluation value can take, and create the map based on the comparison result.
  • An additional setting unit may further be provided that compares the values in the map created by the creation unit with a predetermined threshold and, based on the comparison result, newly adds one or more templates. In that case, the matching unit performs matching for each of the plurality of blocks by comparing the representative value of the matching target block with each of the representative values in the one or more added templates, and the creation unit creates a map showing the distribution of the subject in the target image based on the results of this matching.
  • A processing unit may further be provided that applies at least one of a labeling process and a clustering-based grouping process to the created map so that a plurality of subjects can be individually identified.
  • A display unit may visibly display the region of the target image corresponding to the region of the map whose values exceed a predetermined threshold.
  • An image processing unit may further be provided that performs trimming processing on the region of the target image corresponding to the region of the map whose values exceed a predetermined threshold.
  • An imaging apparatus includes an imaging unit that captures an image of a subject and any of the image processing apparatuses described above; the acquisition unit acquires the information on the target image from the imaging unit.
  • An image processing unit may further be provided that performs trimming processing on the region of the target image corresponding to the region of the map whose values exceed a predetermined threshold.
  • A control unit may further be provided that performs at least one of focus adjustment control and exposure control during imaging by the imaging unit, based on the map.
  • A control unit may further be provided that monitors at least one of the size and position of the main subject based on the map and starts imaging by the imaging unit according to the monitoring result.
  • The imaging unit may have at least one of an optical zoom function and an electronic zoom function, and at least one of these zoom functions may be executed based on the map.
  • A program that causes a computer to operate as the image processing apparatus of the above aspect, a storage medium storing the program, and the operation of the image processing apparatus expressed as a method are also effective as specific embodiments of the present invention.
  • A block diagram showing a configuration example of the image processing apparatus in the first embodiment.
  • A flowchart showing an operation example of the image processing apparatus in the first embodiment.
  • Another flowchart showing an operation example of the image processing apparatus in the first embodiment.
  • A figure showing an example of block division in the first embodiment.
  • A figure showing an example of template setting in the first embodiment.
  • A figure showing an example of automatic cropping in the first embodiment.
  • A figure showing an example of template setting in the third embodiment.
  • A figure showing an example of template setting in the fourth embodiment.
  • A figure showing an example of block division.
  • A flowchart showing an operation example of the image processing apparatus in the seventh embodiment.
  • A block diagram showing a configuration example of the electronic camera in the eighth embodiment.
  • FIG. 1 is a block diagram illustrating a configuration example of an image processing apparatus according to the first embodiment.
  • The image processing apparatus according to the first embodiment is a personal computer in which an image processing program is installed that creates a map indicating the distribution of the subject in a processing target image (target image) captured by an imaging apparatus.
  • The computer 11 shown in FIG. 1 includes a data reading unit 12, a storage device 13, a CPU 14, a memory 15, an input/output I/F 16, and a bus 17. The data reading unit 12, the storage device 13, the CPU 14, the memory 15, and the input/output I/F 16 are connected to one another via the bus 17.
  • An input device 18 (keyboard, pointing device, etc.) and a monitor 19 are connected to the computer 11 via the input/output I/F 16.
  • The input/output I/F 16 receives various inputs from the input device 18 and outputs display data to the monitor 19.
  • The data reading unit 12 is used to read the target image data and the image processing program from the outside. It consists of a reading device (an optical disk, magnetic disk, or magneto-optical disk drive, etc.) that acquires data from a removable storage medium, or a communication device (a USB interface, LAN module, wireless LAN module, etc.) that communicates with an external device in accordance with a known communication standard.
  • The storage device 13 is constituted by a storage medium such as a hard disk or a nonvolatile semiconductor memory. The storage device 13 includes an image storage unit 21 that records images and a map storage unit 22 that records the maps described later. The storage device 13 also stores the image processing program and the various data necessary for executing the program, and can store the target image data read via the data reading unit 12.
  • The CPU 14 is a processor that comprehensively controls each part of the computer 11.
  • The CPU 14 functions as a low-pass image generation unit 23, a region division unit 24, a template setting unit 25, a matching processing unit 26, and a map creation unit 27 by executing the above-described image processing program (the operations of these units are described later).
  • The memory 15 temporarily stores various calculation results (variables, flag values, etc.) of the image processing program; it is composed of, for example, volatile SDRAM.
  • Step S101: The CPU 14 obtains the data of the target image designated by the user from the outside via the data reading unit 12. When the target image data is stored in advance in the image storage unit 21 of the storage device 13 or the like, the CPU 14 may omit the process of S101. In the following, the acquired target image is denoted Img[1].
  • Step S102: The CPU 14 causes the low-pass image generation unit 23 to generate the low-pass images Img[2] and Img[3] based on the image data of the target image Img[1] acquired in step S101. The low-pass image Img[2] is generated by the following Equations 1 and 2, and the low-pass image Img[3] by the following Equations 3 and 4.
  • In Equation 1, the data of the target image Img[1] is orthogonally transformed into a frequency-domain representation by the Fourier transform. (ωx, ωy) in Equation 1 denotes coordinates in frequency space, and fq1 denotes a predetermined threshold (details of fq are described later). In Equation 2, an inverse Fourier transform is applied to F(LImg[2]) obtained by Equation 1 to generate the band-limited low-pass image Img[2]. Similarly, the low-pass image Img[3] is generated by applying the Fourier transform and inverse Fourier transform to the data of the target image Img[1] using Equations 3 and 4. fq1 in Equation 1 and fq2 in Equation 3 are thresholds determined in advance based on the height, width, diagonal, and the like of the target image; fq1 and fq2 may be the same value or different values.
  • In this example, the low-pass images Img[2] and Img[3] are both generated from the image data of the target image Img[1]. Alternatively, the low-pass image Img[2] may be generated from the target image Img[1], and the low-pass image Img[3] may then be generated from the image data of the generated low-pass image Img[2].
  • Here, low-pass processing is performed to generate the low-pass images Img[2] and Img[3], which are lower-resolution images than the target image Img[1]; a similar image can also be created by applying, to the target image Img[1], processing that suppresses or transmits a specific frequency band.
  • For example, band-pass filter processing may be applied to the target image Img[1] using the following Equation 5 together with Equation 2 described above. Equation 5 represents a band-pass filter in the frequency domain: the data of the target image Img[1] is orthogonally transformed into a frequency-domain representation by the Fourier transform, (ωx, ωy) denotes coordinates in frequency space, and fq3 and fq4 denote predetermined thresholds. Like fq1 and fq2 described above, fq3 and fq4 are determined in advance based on the height, width, diagonal, and the like of the target image; either of fq3 and fq4 may equal fq1 or fq2, or they may all differ. An inverse Fourier transform is then applied, using Equation 2, to F(BImg[2]) obtained by Equation 5 to generate the band-limited band-pass image Img[2].
  • As with the low-pass images, the band-pass images Img[2] and Img[3] may both be generated from the image data of the target image Img[1], or the band-pass image Img[2] may be generated from the target image Img[1] and the band-pass image Img[3] from the image data of the generated band-pass image Img[2]. When band-pass filtering is used instead of low-pass processing, for example when the target image Img[1] has gentle gradations such as a sunset sky, it is preferable to apply a band-pass filter that suppresses or transmits only a specific frequency band so that the matching processing described later functions effectively.
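  • The equations themselves appear only as images in the publication; the following is a hedged reconstruction of the ideal frequency-domain masking they describe (hard thresholds fq on the radial frequency), covering both the low-pass case of Equations 1-4 and the band-pass case of Equation 5. The function name, grayscale input, and fftshift-based layout are assumptions.

```python
# Hedged reconstruction of Equations 1-5: an ideal (hard-threshold) mask on
# radial frequency, applied via FFT / inverse FFT.
import numpy as np

def freq_filter(img, fq_low=None, fq_high=None):
    """Suppress or transmit a radial frequency band of a grayscale image.

    fq_low=None, fq_high=fq1 -> low-pass   (Equations 1 and 2)
    fq_low=fq3,  fq_high=fq4 -> band-pass  (Equations 5 and 2)
    """
    F = np.fft.fftshift(np.fft.fft2(img))        # Fourier transform, DC at center
    h, w = img.shape
    wy, wx = np.mgrid[-h // 2:h - h // 2, -w // 2:w - w // 2]
    radius = np.sqrt(wx ** 2 + wy ** 2)          # radial coordinate in frequency space
    mask = np.ones_like(radius, dtype=bool)
    if fq_low is not None:
        mask &= radius >= fq_low                 # suppress below fq_low
    if fq_high is not None:
        mask &= radius <= fq_high                # suppress above fq_high
    return np.real(np.fft.ifft2(np.fft.ifftshift(F * mask)))  # inverse transform

# e.g. Img2 = freq_filter(Img1, fq_high=fq1); Img3 = freq_filter(Img1, fq_high=fq2)
```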
  • Step S103: The CPU 14 causes the region division unit 24 to divide each of the target image Img[1] acquired in step S101 and the low-pass images Img[2] and Img[3] generated in step S102 into a plurality of equally spaced blocks. As an example, the region division unit 24 in the first embodiment divides each of the target image Img[1] and the low-pass images Img[2] and Img[3] into 10 × 10 equal blocks.
  • In the following, each block is denoted block B[n](i, j), where [n] indicates the type of image, i indicates the horizontal position, and j indicates the vertical position. The block at the upper-left corner of the target image Img[1] is the start point and is denoted block B[1](1,1), and the block at the lower-right corner is the end point and is denoted block B[1](10,10) (see FIG. 4).
  • The number of block divisions is determined in advance in consideration of processing accuracy and processing speed; the numbers of divisions of the target image Img[1] and the low-pass images Img[2] and Img[3] may be the same or different.
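  • A minimal sketch of this block division, assuming a grayscale array whose dimensions the block count divides evenly (the function name and nested-list representation are illustrative):

```python
# Sketch of step S103: divide an image into an n x n grid of equal blocks.
import numpy as np

def divide_into_blocks(img, n=10):
    """Return blocks[j][i] corresponding to B(i+1, j+1) in the patent's notation."""
    h, w = img.shape[:2]
    bh, bw = h // n, w // n
    return [[img[j * bh:(j + 1) * bh, i * bw:(i + 1) * bw] for i in range(n)]
            for j in range(n)]
```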
  • Step S104: The CPU 14 causes the template setting unit 25 to select, from the plurality of blocks obtained by the division in S103, the blocks on the outermost periphery and set them as templates. The template setting unit 25 performs this template setting for each of the target image Img[1] and the low-pass images Img[2] and Img[3].
  • For example, the template setting unit 25 selects the blocks on the outermost periphery of the target image Img[1] (the hatched blocks in FIG. 5) and sets them as templates. That is, among the plurality of blocks shown in FIG. 4, a total of 36 blocks are set as templates: the 10 blocks B[1](1,1) to B[1](10,1) on the upper side, the 8 blocks B[1](1,2) to B[1](1,9) on the left side, the 8 blocks B[1](10,2) to B[1](10,9) on the right side, and the 10 blocks B[1](1,10) to B[1](10,10) on the lower side.
  • Numbering the templates N = 1 to 36 from the upper left to the lower right, each template is denoted T[1]{N}, so templates T[1]{1} to T[1]{36} are set for the target image Img[1]. The template setting unit 25 performs the same processing for each of the low-pass images Img[2] and Img[3]: templates T[2]{1} to T[2]{36} are set for the low-pass image Img[2], and templates T[3]{1} to T[3]{36} are set for the low-pass image Img[3].
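  • A minimal sketch of this outer-periphery template selection, following the block-division sketch above (the function name and row-major numbering are assumptions):

```python
# Sketch of step S104: take every block on the outermost periphery of the
# n x n grid (36 blocks when n = 10), ordered upper-left to lower-right.
def peripheral_templates(blocks):
    n = len(blocks)
    templates = []
    for j in range(n):
        for i in range(n):
            if i == 0 or i == n - 1 or j == 0 or j == n - 1:
                templates.append(blocks[j][i])   # T{1} .. T{4n-4}
    return templates
```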
  • Step S105: The CPU 14 causes the matching processing unit 26 to perform matching processing on each of the target image Img[1] and the low-pass images Img[2] and Img[3].
  • For example, the matching processing unit 26 compares the image data of block B[1](2,2) with each of the templates set in step S104 (templates T[1]{1} to T[1]{36}) and obtains the sum of absolute differences SAD[1](i, j){N} shown in Equation 6. The sum of absolute differences SAD[1](i, j){N} obtained by Equation 6 decreases as the degree of matching between the matching target block and the template increases.
  • The matching processing unit 26 performs this processing for block B[1](2,2) against each of templates T[1]{1} to T[1]{36}, obtaining the sums SAD[1](2,2){1} to SAD[1](2,2){36}. Then, the evaluation value SAD[1](2,2) for block B[1](2,2) is obtained using Equation 7, where min(X) on the right side returns the minimum value of X; that is, the minimum of SAD[1](2,2){1} to SAD[1](2,2){36} becomes the evaluation value SAD[1](2,2).
  • The matching processing unit 26 performs the above processing for blocks B[1](3,2) to B[1](9,9) as well, obtaining the evaluation values SAD[1](3,2) to SAD[1](9,9); in this way the evaluation value SAD[1](i, j) is determined for every matching target block. The matching processing unit 26 performs the same processing for each of the low-pass images Img[2] and Img[3], obtaining evaluation values SAD[2](2,2) to SAD[2](9,9) for the low-pass image Img[2] and evaluation values SAD[3](2,2) to SAD[3](9,9) for the low-pass image Img[3].
  • In this example, matching is not performed for the blocks set as templates in step S104 (their evaluation value SAD[1](i, j) is treated as 0), but matching may be performed for those blocks in the same manner.
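  • A hedged sketch of Equations 6 and 7 as described above: per-block SAD against every template, with the minimum over templates used as the block's evaluation value (template blocks are left at 0, as in this embodiment; grayscale blocks of equal size are assumed):

```python
# Sketch of step S105: SAD matching (Equation 6) and min over templates (Equation 7).
import numpy as np

def evaluation_map(blocks, templates):
    n = len(blocks)
    sad_eval = np.zeros((n, n))
    for j in range(n):
        for i in range(n):
            if i in (0, n - 1) or j in (0, n - 1):
                continue                      # template blocks: evaluation stays 0
            block = blocks[j][i].astype(np.float64)
            # Equation 6: sum of absolute pixel differences against template N
            sads = [np.abs(block - t.astype(np.float64)).sum() for t in templates]
            sad_eval[j, i] = min(sads)        # Equation 7: minimum over all templates
    return sad_eval
```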
  • Step S106: The CPU 14 causes the map creation unit 27 to create a map for each of the target image Img[1] and the low-pass images Img[2] and Img[3].
  • Based on the evaluation values SAD[1](2,2) to SAD[1](9,9) obtained in step S105 for the target image Img[1], the map creation unit 27 creates the map Sal[1] for the target image Img[1]. Specifically, the map creation unit 27 compares each of the evaluation values SAD[1](2,2) to SAD[1](9,9) with a threshold TH and decides the pixel values of each block accordingly. For example, for block B[1](2,2), the map creation unit 27 compares the evaluation value SAD[1](2,2) with the threshold TH; if SAD[1](2,2) > TH, the pixel values of all pixels in block B[1](2,2) are replaced with the value of SAD[1](2,2). Conversely, if SAD[1](2,2) ≤ TH, the map creation unit 27 sets the pixel values of all pixels in block B[1](2,2) to 0.
  • The threshold TH is determined according to the range of values that the evaluation value SAD[n](i, j) obtained in step S105 can take (for example, about 10% of that range from its lower end). The smaller the threshold TH, the more readily a block is estimated to contain the main subject (i.e., not to be a background region); the larger the threshold TH, the more readily a block is estimated not to contain the main subject.
  • The case SAD[1](2,2) > TH is the case where it can be estimated that the main subject exists in block B[1](2,2), that is, that block B[1](2,2) is not a background region; pixel values corresponding to the main subject are then assigned by replacing the pixel values of all pixels in block B[1](2,2) with the value of SAD[1](2,2). The case SAD[1](2,2) ≤ TH is the case where it can be estimated that the main subject does not exist in block B[1](2,2), that is, that block B[1](2,2) is a background region; pixel values corresponding to the background region are then assigned by setting the pixel values of all pixels in block B[1](2,2) to 0.
  • For each block set as a template in step S104 (the 36 blocks corresponding to templates T[1]{1} to T[1]{36}, see FIG. 5), the map creation unit 27 sets all pixel values in the block to 0, since these blocks can be estimated to be background regions without comparison against the threshold TH.
  • The map in which a new pixel value has been assigned to every block by the above processing is the map Sal[1] for the target image Img[1]. The map creation unit 27 performs the same processing for each of the low-pass images Img[2] and Img[3], creating the map Sal[2] for the low-pass image Img[2] and the map Sal[3] for the low-pass image Img[3]. In creating the maps for the low-pass images Img[2] and Img[3], the same threshold TH used for the map of the target image Img[1] may be used, or a different threshold may be used.
  • Next, based on the map Sal[1] for the target image Img[1], the map Sal[2] for the low-pass image Img[2], and the map Sal[3] for the low-pass image Img[3], the map creation unit 27 creates the final map Sal[T] using Equation 8. w1, w2, and w3 in Equation 8 are the weights of the respective maps: the map creation unit 27 weights the maps Sal[1], Sal[2], and Sal[3] and adds the pixel values of corresponding pixels in each map to create the map Sal[T].
  • The weights w1, w2, and w3 are determined from the frequency components in the templates set in step S104. For example, when the templates are noisy, the weight w2 of the map Sal[2] for the low-pass image Img[2] and the weight w3 of the map Sal[3] for the low-pass image Img[3] may be made relatively large; when noise is low, the weight w1 of the map Sal[1] for the target image Img[1] may be made relatively large. Alternatively, the weights w1, w2, and w3 may be determined based on, for example, the shooting mode set on the imaging device when the target image Img[1] was captured ("portrait mode", "landscape mode", etc.), the result of subject recognition, and so on.
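  • A hedged sketch of this map creation and the weighted combination of Equation 8. For brevity the map is kept at block granularity rather than expanded to per-pixel values; the 10%-of-range stand-in for TH is an assumption based on the description above.

```python
# Sketch of step S106: threshold evaluation values into maps Sal[n], then
# combine the maps with weights w1..w3 (Equation 8).
import numpy as np

def make_map(sad_eval, th):
    """Sal[n] at block granularity: keep values above TH, zero likely background."""
    return np.where(sad_eval > th, sad_eval, 0.0)

def combine_maps(maps, weights):
    """Sal[T] = w1*Sal[1] + w2*Sal[2] + w3*Sal[3] (maps of equal size)."""
    return sum(w * m for w, m in zip(weights, maps))

# TH is taken from the low end of the attainable evaluation range; one simple
# stand-in consistent with the description: th = 0.1 * sad_eval.max()
```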
  • FIG. 6 shows an example of the map Sal[T] created in this way. FIG. 6A shows the target image Img[1] acquired in step S101, and FIG. 6B shows the map Sal[T] created in step S106. In FIG. 6A, branches, leaves, a wire net, and the like appear in the background. With the conventional method, such portions would be recognized as the main subject region despite being background. In the present embodiment, because the templates are set using the images of these portions, these portions are not misrecognized as the main subject region, as shown in FIG. 6B; only the bird portion, which is the main subject, remains in the map Sal[T].
  • Step S107: The CPU 14 records the map Sal[T] obtained in step S106 in association with the target image Img[1]. The map Sal[T] may be recorded as supplementary information of the target image Img[1], or identification information indicating that it relates to the target image Img[1] may be attached to the map Sal[T].
  • Step S108: Based on the map Sal[T] obtained in step S106, the CPU 14 displays the target image Img[1] on the monitor 19 with a marker indicating the main subject region superimposed. The CPU 14 first extracts the main subject region based on the map Sal[T]: it compares the value of each pixel of the map Sal[T] with a predetermined threshold TR and obtains the minimum rectangular range containing all pixels exceeding the threshold TR. Note that the minimum rectangular range may also be obtained under a fixed aspect ratio.
  • The threshold TR is determined according to the range of values that each pixel of the map Sal[T] can take: the smaller the threshold TR, the wider the region extracted as the main subject region tends to be, and the larger the threshold TR, the narrower it tends to be.
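  • A minimal sketch of this extraction, operating on the map and returning the minimum bounding rectangle of the cells exceeding TR (the function name and coordinate convention are illustrative):

```python
# Sketch of step S108: threshold the map with TR and take the minimum
# rectangle containing everything above the threshold.
import numpy as np

def main_subject_rect(sal_map, tr):
    ys, xs = np.nonzero(sal_map > tr)
    if len(xs) == 0:
        return None                                      # no main subject found
    return xs.min(), ys.min(), xs.max(), ys.max()        # left, top, right, bottom
```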
  • FIG. 7 shows an example of the main subject area extracted in this way.
  • FIG. 7A shows a diagram in which the frame Fa indicating the main subject area described above is superimposed on the map Sal [T] shown in FIG. 6B.
  • FIG. 7B shows a diagram in which the frame Fb indicating the main subject region described above is superimposed on the target image Img [1] shown in FIG. 6A.
  • The CPU 14 then superimposes, on the monitor 19, the target image Img[1] and the frame Fb as a marker indicating the main subject region. As shown in FIG. 7B, the frame Fb indicating the main subject region is displayed on a portion that excludes the branches, leaves, wire net, and the like appearing in the background.
  • FIG. 7 shows an example in which the main subject region is extracted as a rectangle, but the present invention is not limited to this example; any shape may be used, such as an ellipse, a polygon, or an irregular shape following the outline of the main subject region. The manner of display is also not limited to this example as long as the main subject region is visible: the frame may blink, a frame of a predetermined color may be displayed, or the brightness and color of the main subject region and the other regions may be changed for display.
  • The CPU 14 then ends the series of processes. In the above example, the series of processes is executed in accordance with a program execution instruction from the user, but maps for a plurality of images may be created in response to a single user instruction, and the series of processes may be executed automatically every time image data is read from the outside via the data reading unit 12. An imaging apparatus including the image processing apparatus described in this embodiment may be configured to execute the series of processes when capturing an image, and a reproduction apparatus including the image processing apparatus may be configured to execute the series of processes when reproducing an image.
  • (A) Use for automatic zooming: At the time of shooting, the map Sal[T] can be created based on a so-called through image used for composition confirmation, and zoom control can be performed based on it. For example, when the main subject region is smaller than a predetermined size, appropriate shooting can be achieved by automatically performing optical zoom or electronic zoom centered on the main subject region. Such processing can be performed in the same way when capturing moving images. The degree of zooming can also be limited so that the main subject region does not protrude from the finder (the shootable range). By performing such automatic zooming, the main subject that the user wants to photograph can easily be captured at an appropriate zoom magnification.
  • (B) Use for AE, AF, and AWB: At the time of shooting, the map Sal[T] can be created based on a through image for composition confirmation, and AF, AE, AWB, and the like can be controlled suitably based on it. The information of the map Sal[T] may also be used for conventional subject recognition. Such AF, AE, and AWB processing can be performed in the same way when capturing moving images. For example, the main subject region may be detected based on the map Sal[T], and AF, AE, AWB, and the like may be performed centered on the center of gravity of the main subject region; AF, AE, AWB, and the like suited to the main subject can then be executed while tracking the movement of the main subject region.
  • Automatic shutter control can also be performed by creating the map Sal[T] based on a through image at the time of shooting. For example, the map Sal[T] is created for through images generated continuously at fixed time intervals, the main subject region is detected, and at least one of the size and position of the detected main subject region is monitored. When the monitored size or position satisfies a predetermined appropriate condition (which may be set in advance or set by the user), the shutter is released automatically. Such automatic shutter control can be performed in the same way when capturing moving images, so that the main subject can be imaged automatically while its movement is tracked. Moreover, the control conditions of up to several frames earlier may be stored, and if the control conditions in the current frame differ significantly from those in the preceding frames, tracking may be prohibited.
  • (C) Determination of the zoom center in a slide show: In a slide show in which a plurality of images is reproduced and displayed in succession, zoom processing is often applied as a display effect when switching images. In such a case, by extracting the main subject region based on the map Sal[T], the center of the main subject region can be used as the zoom center. This makes it possible to produce a display that matches the purpose of the zoom effect, namely making the main subject region (the region of interest) stand out.
  • FIG. 8A is an example of a conventional list display. FIG. 8B shows an example in which, for such a display, automatic cropping is performed based on the map Sal[T] and only the main subject regions are displayed in the list. Such automatic cropping can also be applied to the confirmation image (the so-called freeze image) displayed immediately after shooting; the user can then easily check the focus, camera shake, and the like within the main subject region. When the user gives an enlargement display instruction during image reproduction, the same effect can be obtained by performing similar processing. When performing automatic cropping, the main subject region may first be stretched vertically or horizontally so that the cropped image has an appropriate aspect ratio, and the cropping process may then be performed.
  • As described above, the image processing apparatus according to the first embodiment divides the target image into a plurality of blocks and sets templates based on the images of the blocks located on the outer periphery of the target image. It then calculates a representative value for each of the plurality of blocks, performs matching for each block by comparing the representative value of the matching target block with the representative values of the templates, and creates a map showing the distribution of the subject in the target image based on the matching results.
  • The outer peripheral portion of the target image mentioned above can be considered to be, for example, the range within about 30% of the image height from the upper and lower edges of the target image and within about 30% of the image width from the left and right edges.
  • With the configuration of the first embodiment, using the outer periphery of the target image as templates makes it possible to detect the background region reliably. Therefore, by extracting the main subject region using the created map, the main subject region can be extracted by a method suited to the target image, without depending on high-frequency components or assuming an empirical composition.
  • In addition, unlike conventional face recognition technology, which specializes in recognizing a face as the main subject, the main subject region can be preferably extracted even when the main subject is not a face. Furthermore, the main subject region can be extracted from the target image without requiring various designations or settings by the user.
  • Moreover, at least one image with a lower resolution than the target image is generated, a map indicating the distribution of the subject is created for each of the target image and the low-resolution image, and the map showing the distribution of the subject in the target image is created by computation over these multiple maps. A suitable template can therefore be set even for images in which something other than the main subject appears on the outer periphery of the target image, and the main subject region can be extracted by a method suited to the target image even when high-frequency components lie on its outer periphery.
  • In the first embodiment, the two low-pass images Img[2] and Img[3] are generated from the image data of the target image Img[1] acquired in step S101, but the invention is not limited to this example. For example, three or more low-pass images may be generated; in that case, a map Sal[n] is created for each of the generated low-pass images, and the map Sal[T] can be created as in the present embodiment by weighting and adding the created maps Sal[n] appropriately. Conversely, the low-pass images Img[2] and Img[3] need not be generated at all: only the target image Img[1] acquired in step S101 is processed in steps S103 to S105, and the map Sal[1] for the target image Img[1] described in step S106 may be used directly as the map Sal[T].
  • The second embodiment is a modification of the process of S102 in the first embodiment; description of the parts of the image processing apparatus in common with the first embodiment is omitted.
  • Step S102: The CPU 14 causes the region division unit 24 to generate the resized images Img[2] and Img[3] based on the image data of the target image Img[1] acquired in step S101. The resized image Img[2] is generated according to the following Equation 9 and the resized image Img[3] according to the following Equation 10, where Resize(X, Y) on the right side denotes resizing X by the magnification Y. That is, the region division unit 24 generates the resized image Img[2] by resizing the target image Img[1] by the magnification rt1 as in Equation 9, and generates the resized image Img[3] by resizing the target image Img[1] by the magnification rt2 as in Equation 10. The magnifications rt1 and rt2 are predetermined, both less than 1, with rt1 > rt2.
  • In this example, the resized images Img[2] and Img[3] are both generated from the image data of the target image Img[1]; alternatively, the resized image Img[2] may be generated from the target image Img[1], and the resized image Img[3] may then be generated from the image data of the generated resized image Img[2].
  • Thereafter, the CPU 14 performs the same processing as in the first embodiment, using the resized images Img[2] and Img[3] in place of the low-pass images Img[2] and Img[3]. In step S106, however, Equation 11 is used instead of Equation 8. rt1 and rt2 in Equation 11 are the magnifications used in the resizing described above. Because of the resizing, the resized images Img[2] and Img[3] are smaller than the target image Img[1], so the map Sal[2] for the resized image Img[2] and the map Sal[3] for the resized image Img[3] are smaller than the map Sal[1] for the target image Img[1]. Therefore, when creating the map Sal[T] in step S106, the weighting and addition are performed after the sizes of the maps are adjusted by scaling them by the reciprocals of the magnifications rt1 and rt2.
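  • A hedged sketch of this resize-based variant (Equations 9-11): cv2.resize and the illustrative magnifications are assumptions, and each map is scaled back to a common size (the reciprocal of its magnification) before the weighted sum.

```python
# Sketch of the second embodiment: resized pyramid (Equations 9 and 10) and
# size-adjusted weighted combination (Equation 11).
import cv2

def resized_pyramid(img1, rt1=0.5, rt2=0.25):        # rt1 > rt2, both < 1
    img2 = cv2.resize(img1, None, fx=rt1, fy=rt1)    # Resize(Img[1], rt1)
    img3 = cv2.resize(img1, None, fx=rt2, fy=rt2)    # Resize(Img[1], rt2)
    return img2, img3

def combine_resized_maps(sal1, sal2, sal3, weights):
    h, w = sal1.shape
    sal2_up = cv2.resize(sal2, (w, h))               # scale by 1/rt1 back to Sal[1] size
    sal3_up = cv2.resize(sal3, (w, h))               # scale by 1/rt2 back to Sal[1] size
    w1, w2, w3 = weights
    return w1 * sal1 + w2 * sal2_up + w3 * sal3_up
```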
  • As described above, the image processing apparatus according to the second embodiment performs resizing, a resolution-lowering (band-limiting) process similar to low-pass filtering, instead of the low-pass processing of the first embodiment, and can therefore obtain substantially the same effects as the configuration of the first embodiment. In addition, by performing resizing instead of low-pass processing, the image processing apparatus according to the second embodiment can be expected to run faster.
  • The third embodiment is a modification of the process of S104 in the first and second embodiments; description of the parts of the image processing apparatus in common with those embodiments is omitted.
  • Step S104: The CPU 14 causes the template setting unit 25 to select, from the plurality of blocks obtained by the division in S103, all blocks on the three sides excluding the lower side and set them as templates. The template setting unit 25 performs this template setting for each of the target image Img[1] and the low-pass images Img[2] and Img[3].
  • For example, the template setting unit 25 selects the blocks on the three sides excluding the lower side of the target image Img[1] (the hatched blocks in FIG. 9A) and sets them as templates. That is, among the plurality of blocks shown in FIG. 4, a total of 28 blocks are set as templates: the 10 blocks B[1](1,1) to B[1](10,1) on the upper side, the 9 blocks B[1](1,2) to B[1](1,10) on the left side, and the 9 blocks B[1](10,2) to B[1](10,10) on the right side. Numbering these as before, the 28 templates T[1]{1} to T[1]{28} are set, as shown in FIG. 9.
  • The reason all blocks on the three sides excluding the lower side are set as templates is that the main subject, or an extension of the main subject, often appears along the lower side of an image, so the blocks on the lower side are not suitable as templates. If a block containing the main subject were set as a template, blocks containing the main subject would be extracted as background regions. To avoid this problem, all blocks on the three sides excluding the lower side are set as templates.
  • The top and bottom of the image can be recognized based on the posture information of the imaging device at the time the target image Img[1] was captured, or based on the results of automatic subject recognition, face recognition, and the like. For example, for an image in the horizontal position shown in FIG. 9B, the blocks on the three sides excluding the left side of the target image Img[1] (the hatched blocks in FIG. 9B) may be selected and set as templates.
  • The template setting unit 25 performs the same processing for each of the low-pass images Img[2] and Img[3]: templates T[2]{1} to T[2]{28} are set for the low-pass image Img[2], and templates T[3]{1} to T[3]{28} are set for the low-pass image Img[3].
  • In step S105 and later, the CPU 14 performs the same processing as in the first embodiment, except that the templates for which the sum of absolute differences SAD[1](i, j){N} is calculated are the 28 templates T[1]{1} to T[1]{28}.
  • As described above, the image processing apparatus according to the third embodiment sets a plurality of templates based on the images of all blocks on the three sides of the target image excluding the lower side. A suitable template can therefore be set even for an image in which the main subject extends to the lower side of the target image, and the smaller number of templates also increases processing speed.
  • In the third embodiment, a plurality of templates is set based on the images of all blocks on the three sides excluding the lower side, but a plurality of templates may instead be set based on the images of all blocks on the left and right sides. Furthermore, instead of using the images of all blocks on the target sides (all four sides in the first embodiment; three or two sides in the third embodiment), a plurality of templates may be set based on the images of only some of those blocks. For example, templates may be set based on the images of all blocks on the three sides excluding the lower side together with the images of some predetermined blocks on the lower side, or a plurality of templates may be set based on the images of the blocks at the four corners.
  • The fourth embodiment, like the third, is a modification of the process of S104 in the first and second embodiments; description of the parts of the image processing apparatus in common with those embodiments is again omitted.
  • Step S104: Based on the position of the matching target block in the image, the CPU 14 causes the template setting unit 25 to select some of the blocks on the outermost periphery of the image from the plurality of blocks obtained by the division in S103 and set them as templates. The template setting unit 25 performs this template setting for each of the target image Img[1] and the low-pass images Img[2] and Img[3].
  • For example, for the matching target block B[1](6,4), the template setting unit 25 selects blocks B[1](5,1), B[1](6,1), and B[1](7,1) on the upper side, blocks B[1](1,3), B[1](1,4), and B[1](1,5) on the left side, blocks B[1](10,3), B[1](10,4), and B[1](10,5) on the right side, and blocks B[1](5,10), B[1](6,10), and B[1](7,10) on the lower side, and sets them as templates. That is, the template setting unit 25 sets a total of 12 of the blocks shown in FIG. 4 as templates. Each template is denoted T[1](i, j){N}, where (i, j) indicates that the template was set for the matching target block B[1](i, j).
  • The template setting unit 25 performs the same processing for each of the low-pass images Img[2] and Img[3]: templates T[2](i, j){1} to T[2](i, j){12} are set for each of the blocks B[2](2,2) to B[2](9,9), and templates T[3](i, j){1} to T[3](i, j){12} are set for each of the blocks B[3](2,2) to B[3](9,9).
  • In step S105 and later, the CPU 14 performs the same processing as in the first embodiment, except that the sum of absolute differences SAD[n](i, j){N} is obtained using a different set of templates for each block: for each block, the 12 templates T[n](i, j){1} to T[n](i, j){12}.
  • In the example above, based on the position of block B[1](i, j) in the target image Img[1], the template setting unit 25 selects three of the blocks on the outermost periphery per side and sets them as templates, but (2a + 1) blocks may be selected per side using a variable a (where a is an integer of 0 or more). That is, blocks B[1](i-a, 1) to B[1](i+a, 1) on the upper side, blocks B[1](1, j-a) to B[1](1, j+a) on the left side, blocks B[1](10, j-a) to B[1](10, j+a) on the right side, and blocks B[1](i-a, 10) to B[1](i+a, 10) on the lower side may be selected and set as templates. The variable a may be determined according to the number of divisions used in the block division performed in step S103.
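  • A hedged sketch of this position-dependent selection; clamping the indices at the image edges is an assumption, since the publication does not specify the behavior there (a = 1 reproduces the 12-block example above, using the nested-list grid of the earlier sketches with 0-based indices):

```python
# Sketch of the fourth embodiment: for matching target block B(i, j), take
# (2a + 1) peripheral blocks per side, centered on the block's row/column.
def positional_templates(blocks, i, j, a=1):
    n = len(blocks)
    clamp = lambda k: max(0, min(n - 1, k))          # edge handling: an assumption
    templates = []
    for d in range(-a, a + 1):
        templates.append(blocks[0][clamp(i + d)])        # upper side
        templates.append(blocks[clamp(j + d)][0])        # left side
        templates.append(blocks[clamp(j + d)][n - 1])    # right side
        templates.append(blocks[n - 1][clamp(i + d)])    # lower side
    return templates
```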
  • As described above, the image processing apparatus according to the fourth embodiment selects some blocks from all the blocks on the outermost periphery of the target image based on the position of the matching target block within the target image, and sets a plurality of templates based on the images of the selected blocks. As in the third embodiment, this reduces the number of templates and increases processing speed.
  • In the fourth embodiment, a plurality of templates is set based on the images of blocks on the outermost periphery, but the present invention is not limited to this example. For example, a plurality of templates may be set based on the images of blocks one ring inside the outermost periphery; a total of 22 blocks, including the 7 blocks up to B[1](9,9), may be set as templates. Such a configuration is effective when the composition is fixed to some extent, for example when the target image Img[1] is a picture with a frame border. Moreover, the number of block rows used need not be the same on every side: for example, a plurality of templates may be set based on two rows of blocks for some sides, while for the left and right sides they may be set based on one row of blocks.
The fifth embodiment is a modification of the processing of S105 in the first to fourth embodiments described above. Therefore, a redundant description of the configuration of the image processing apparatus common to the first to fourth embodiments is omitted.
(Step S105)
The CPU 14 performs the matching processing on each of the target image Img[1] and the low-pass images Img[2] and Img[3] using the matching processing unit 26.
First, the matching processing unit 26 obtains the frequency feature quantities fT[1]{1} to fT[1]{36} for the templates and the frequency feature quantities f[1](2,2) to f[1](9,9) for the blocks, and then obtains an evaluation value fSAD[1](i,j) for each block of the portion of the target image Img[1] excluding the templates set in step S104. For example, the matching processing unit 26 compares the frequency feature quantity f[1](2,2) of the block B[1](2,2) with each of the frequency feature quantities fT[1]{1} to fT[1]{36} of the templates (template T[1]{1} to template T[1]{36}) set in step S104, and obtains the sums of absolute differences fSAD[1](i,j){N}.
The right side of Expression 14 takes an arbitrary pixel in the matching target block (block B[n](i,j)), calculates the absolute value of the difference between the value of the frequency feature quantity f[n](i,j) corresponding to that pixel and the value of the frequency feature quantity fT[n]{N} corresponding to the pixel at the corresponding position in the block of the template T[n]{N}, and sums these absolute values. The matching processing is performed in this way.
The matching processing unit 26 performs the same processing for the block B[1](2,2) against each of the templates T[1]{2} to T[1]{36}, obtaining the sums of absolute differences fSAD[1](2,2){2} to fSAD[1](2,2){36}. Then, the evaluation value fSAD[1](2,2) for the block B[1](2,2) is obtained using the following equation.
min(X) on the right side of Expression 15 returns the minimum value of X. That is, the minimum of the sums of absolute differences fSAD[1](2,2){1} to fSAD[1](2,2){36} is taken as the evaluation value fSAD[1](2,2).
The matching processing unit 26 performs the above processing for the blocks B[1](3,2) to B[1](9,9) as well, obtaining the evaluation values fSAD[1](3,2) to fSAD[1](9,9). In this way, the evaluation value fSAD[1](i,j) is determined for every block to be matched.
Furthermore, the matching processing unit 26 performs the same processing for each of the low-pass images Img[2] and Img[3], obtaining the evaluation values fSAD[2](2,2) to fSAD[2](9,9) for the low-pass image Img[2] and the evaluation values fSAD[3](2,2) to fSAD[3](9,9) for the low-pass image Img[3].
Thereafter, using the evaluation values fSAD[1](2,2) to fSAD[1](9,9) in place of the evaluation values SAD[1](2,2) to SAD[1](9,9), the CPU 14 creates the map Sal[1] relating to the target image Img[1]. The same applies to the low-pass images Img[2] and Img[3].
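The following sketch illustrates this frequency-domain matching, assuming grayscale blocks and templates of equal size stored as NumPy arrays; taking the magnitude of the 2-D FFT as the frequency feature quantity is an assumption of this sketch, since Expressions 14 and 15 are not reproduced in this text.

```python
import numpy as np

def frequency_feature(block):
    """Frequency feature quantity of a block: here, the magnitude of its 2-D DFT.

    The embodiment only requires a conversion to the frequency domain;
    using the magnitude spectrum is one plausible choice.
    """
    return np.abs(np.fft.fft2(block))

def fsad_evaluation(block, template_blocks):
    """Evaluation value fSAD for one matching target block.

    For each template, the absolute differences between corresponding
    frequency feature values are summed; the evaluation value is the
    minimum of these sums over all templates (cf. Expressions 14 and 15).
    """
    f = frequency_feature(block)
    sums = [np.sum(np.abs(f - frequency_feature(t))) for t in template_blocks]
    return min(sums)
```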
As described above, the image processing apparatus of the fifth embodiment calculates the representative values by calculating the pixel value of each pixel in a block and then applying the Fourier transform to the pixel values in the block. It then obtains, for each of the plurality of templates, the sum of absolute differences between each representative value in the matching target block and the corresponding representative value in the block of the template, taken over the representative values corresponding to all the pixels in the matching target block, and uses the minimum of the obtained sums of absolute differences as the evaluation value for the matching target block. Therefore, substantially the same effect as the configuration of the first embodiment can be obtained.
In the fifth embodiment, an example in which the Fourier transform is performed in step S105 has been described, but the present invention is not limited to this example; any conversion process may be used as long as the image is converted to the frequency domain. For example, the discrete cosine transform or the wavelet transform may be performed, and the conversion to the frequency domain may also combine a plurality of methods. Also, although an example has been shown in which the minimum sum of absolute differences is used as the evaluation value for the matching target block, the maximum value or the average value may be used instead.
The sixth embodiment is, like the fifth embodiment described above, a modification of the processing of S105 in the first to fourth embodiments. Therefore, as in the fifth embodiment, a redundant description of the configuration of the image processing apparatus common to the first to fourth embodiments is omitted.
(Step S105)
The CPU 14 performs the matching processing on each of the target image Img[1] and the low-pass images Img[2] and Img[3] using the matching processing unit 26.
First, for the target image Img[1], the matching processing unit 26 obtains a representative color feature quantity for each of the blocks (block B[1](1,1) to block B[1](10,10)) obtained by the block division in step S103, using the following expressions. For the blocks corresponding to the templates set in step S104 (template T[1]{1} to template T[1]{36}), the representative color feature quantities CL[1]{N} are obtained using Expression 16, and for the portion of the target image Img[1] excluding the templates (the 8 × 8 = 64 blocks from block B[1](2,2) to block B[1](9,9)), the representative color feature quantities CL[1](i,j) are obtained using Expression 17.
CLr[n]{N}, CLg[n]{N}, and CLb[n]{N} on the right side of Expression 16 indicate the mode values of the pixel values of the R, G, and B colors in the template T[n]{N}, respectively. When there are a plurality of mode values, the mode corresponding to the smallest pixel value may be employed, or their average value may be employed.
Similarly, CLr[n](i,j), CLg[n](i,j), and CLb[n](i,j) on the right side of Expression 17 indicate the mode values of the pixel values of the R, G, and B colors in the block B[n](i,j), respectively.
Next, the matching processing unit 26 obtains a second representative color feature quantity for each of the blocks (block B[1](1,1) to block B[1](10,10)) obtained by the block division in step S103, using the following equations. For the blocks corresponding to the templates T[1]{1} to T[1]{36}, the second representative color feature quantities Q[1]{N} are obtained using Expressions 18 to 21, and for the portion of the target image Img[1] excluding the templates set in step S104 (the 8 × 8 = 64 blocks from block B[1](2,2) to block B[1](9,9)), the second representative color feature quantities Q[1](i,j) are obtained using Expressions 22 to 25.
Pr[n]{N} on the right side of Expression 19 represents the relative histogram of the R component in the template T[n]{N}, Pg[n]{N} on the right side of Expression 20 represents the relative histogram of the G component in the template T[n]{N}, and Pb[n]{N} on the right side of Expression 21 represents the relative histogram of the B component in the template T[n]{N}.
Likewise, Pr[n](i,j) on the right side of Expression 23 represents the relative histogram of the R component in the block B[n](i,j), Pg[n](i,j) on the right side of Expression 24 represents the relative histogram of the G component in the block B[n](i,j), and Pb[n](i,j) on the right side of Expression 25 represents the relative histogram of the B component in the block B[n](i,j).
The range of the summation Σ in these expressions is determined by the number of bins (the number of divisions) of each relative histogram.
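A sketch of the two color features follows, assuming 8-bit RGB blocks as NumPy arrays of shape (h, w, 3); since Expressions 16 to 25 are not reproduced in this text, the histogram bin count and the exact normalization below are assumptions.

```python
import numpy as np

def representative_color(block):
    """Representative color feature CL: the per-channel mode of the pixel values.

    When several values tie for the mode, the smallest pixel value is taken,
    which matches one of the options described for Expression 16.
    """
    modes = []
    for ch in range(3):  # R, G, B
        counts = np.bincount(block[..., ch].ravel(), minlength=256)
        modes.append(int(np.flatnonzero(counts == counts.max()).min()))
    return np.array(modes)

def relative_histograms(block, bins=16):
    """Second representative color feature Q: per-channel relative histograms.

    Each histogram is normalized by the number of pixels so that blocks of
    different sizes remain comparable; the bin count is a free parameter.
    """
    n_pixels = block.shape[0] * block.shape[1]
    return [np.histogram(block[..., ch], bins=bins, range=(0, 256))[0] / n_pixels
            for ch in range(3)]
```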
In this way, the matching processing unit 26 obtains the representative color feature quantities CL[1]{1} to CL[1]{36} and CL[1](2,2) to CL[1](9,9), and the second representative color feature quantities Q[1]{1} to Q[1]{36} and Q[1](2,2) to Q[1](9,9). Then, an evaluation value V[1](i,j) is obtained for each block of the portion of the target image Img[1] excluding the templates set in step S104.
For example, the matching processing unit 26 compares the representative color feature quantity CL[1](2,2) and the second representative color feature quantity Q[1](2,2) of the block B[1](2,2) with the representative color feature quantities CL[1]{1} to CL[1]{36} and the second representative color feature quantities Q[1]{1} to Q[1]{36} of the templates set in step S104, respectively, and obtains the sums of absolute differences V[1](i,j){N}.
The matching processing unit 26 performs the same processing for the block B[1](2,2) against each of the templates T[1]{2} to T[1]{36}, obtaining the sums of absolute differences V[1](2,2){2} to V[1](2,2){36}. Then, the evaluation value V[1](2,2) for the block B[1](2,2) is obtained using the following equation.
min(X) on the right side of Expression 27 returns the minimum value of X. That is, the minimum of the sums of absolute differences V[1](2,2){1} to V[1](2,2){36} is taken as the evaluation value V[1](2,2).
The matching processing unit 26 performs the above processing for the blocks B[1](3,2) to B[1](9,9) as well, obtaining the evaluation values V[1](3,2) to V[1](9,9). In this way, the evaluation value V[1](i,j) is determined for every block to be matched.
Furthermore, the matching processing unit 26 performs the same processing for each of the low-pass images Img[2] and Img[3], obtaining the evaluation values V[2](2,2) to V[2](9,9) for the low-pass image Img[2] and the evaluation values V[3](2,2) to V[3](9,9) for the low-pass image Img[3].
Thereafter, using the evaluation values V[1](2,2) to V[1](9,9) in place of the evaluation values SAD[1](2,2) to SAD[1](9,9), the CPU 14 creates the map Sal[1] relating to the target image Img[1]. The same applies to the low-pass images Img[2] and Img[3].
As described above, the image processing apparatus of the sixth embodiment calculates, as the representative value of each of the plurality of blocks, a value indicating a color feature based on the distribution of the plurality of color components constituting the target image. It then obtains, for each of the plurality of templates, the sum of absolute differences between the representative values of the matching target block and those of the block of the template, and uses the minimum of the obtained sums of absolute differences as the evaluation value for the matching target block. Therefore, substantially the same effect as the configuration of the first embodiment can be obtained.
In the sixth embodiment, the representative color feature quantity and the second representative color feature quantity have been described as examples of values indicating a color feature, but only one of them may be used. Further, when obtaining the sum of absolute differences V[n](i,j){N} shown in Expression 26, the representative color feature quantity and the second representative color feature quantity may be appropriately weighted.
(Step S103)
In a modification of the sixth embodiment, the CPU 14 causes the area dividing unit 24 to divide the target image Img[1] acquired in step S101 and the low-pass images Img[2] and Img[3] generated in step S102 into a plurality of blocks as follows. The area dividing unit 24 divides the target image Img[1] into a single block B[1](1,1) occupying the outer peripheral portion and blocks B[1](2,2) to B[1](9,9) arranged in an 8 × 8 matrix inside it. The area dividing unit 24 divides the low-pass images Img[2] and Img[3] in the same manner.
(Step S104)
The CPU 14 causes the template setting unit 25 to select the block B[1](1,1) existing on the outer periphery from the plurality of blocks obtained by the division processing of S103 and to set it as a template. The template setting unit 25 sets a template for each of the target image Img[1] and the low-pass images Img[2] and Img[3]. That is, in this modification, the template setting unit 25 sets this single block as the template T[1]{1}.
(Step S105)
The CPU 14 performs the matching processing on each of the target image Img[1] and the low-pass images Img[2] and Img[3] using the matching processing unit 26.
First, for the target image Img[1], the matching processing unit 26 obtains the second representative color feature quantity described above for each of the blocks divided in step S103 (block B[1](1,1) and blocks B[1](2,2) to B[1](9,9)). For the block corresponding to the template set in step S104 (template T[1]{1}), the second representative color feature quantity Q[1]{1} is obtained, and for the portion of the target image Img[1] excluding the template, the second representative color feature quantities Q[1](i,j) are obtained.
In this way, the matching processing unit 26 obtains the representative color feature quantity CL[1]{1} and the representative color feature quantities CL[1](2,2) to CL[1](9,9), as well as the second representative color feature quantity Q[1]{1} and the second representative color feature quantities Q[1](2,2) to Q[1](9,9). Then, an evaluation value V[1](i,j) is obtained for each block of the portion of the target image Img[1] excluding the template set in step S104.
For example, the matching processing unit 26 compares the representative color feature quantity CL[1](2,2) and the second representative color feature quantity Q[1](2,2) of the block B[1](2,2) with the representative color feature quantity CL[1]{1} and the second representative color feature quantity Q[1]{1} of the template (template T[1]{1}) set in step S104, respectively, and obtains the sum of absolute differences V[1](2,2){1}.
The matching processing unit 26 performs the above processing for the blocks B[1](3,2) to B[1](9,9) as well, obtaining the evaluation values V[1](3,2) to V[1](9,9). In this way, the evaluation value V[1](i,j) is determined for every block to be matched.
Furthermore, the matching processing unit 26 performs the same processing for each of the low-pass images Img[2] and Img[3], obtaining the evaluation values V[2](2,2) to V[2](9,9) for the low-pass image Img[2] and the evaluation values V[3](2,2) to V[3](9,9) for the low-pass image Img[3].
Note that, instead of dividing the target image Img[1] into equal blocks, the image may be divided as described above so that the block B[1](1,1), the only peripheral block resulting from the division, is set as the template T[1]{1}. In this case as well, the same effect as that of the sixth embodiment can be obtained.
The division example shown in FIG. 11 and the template setting example shown in FIG. 12 are merely examples, and the present invention is not limited to them; the division may be performed in any shape, and the templates may be set based on the images of a plurality of blocks.
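A sketch of this modification's division follows, assuming the outer peripheral ring is treated as a single template "block" via a boolean mask; the ring width used below is an assumption, since the proportions of FIG. 11 are not given in this text.

```python
import numpy as np

def ring_template_pixels(img, border_frac=0.2):
    """Collect the pixels of the single outer peripheral block B[1](1,1).

    The ring width is taken here as a fraction of the image size. The
    returned pixel set can be fed to the color features of the sixth
    embodiment, and each inner block's evaluation value is then simply
    its sum of absolute differences against this single template.
    """
    h, w = img.shape[:2]
    bh, bw = int(h * border_frac), int(w * border_frac)
    mask = np.ones((h, w), dtype=bool)
    mask[bh:h - bh, bw:w - bw] = False  # hollow out the interior region
    return img[mask]                    # pixels of the ring template
```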
The seventh embodiment is a modification of the map creation in the first to sixth embodiments. Therefore, a redundant description of the configuration of the image processing apparatus common to the first to sixth embodiments is omitted.
In the seventh embodiment, a process for adding templates is incorporated into the map creation processing of the first to sixth embodiments. FIG. 13 is a flowchart showing a modification of the flowchart shown in FIG. 2 of the first embodiment. In steps S201 to S206, the CPU 14 performs the same processing as steps S101 to S106 of the flowchart shown in FIG. 2.
(Step S207)
The CPU 14 determines whether or not there is a template to be added, based on the map Sal[T] obtained in step S206. When it determines that there is a template to be added, the CPU 14 returns to step S204 and sets the templates again. When it determines that there is no template to be added, the CPU 14 proceeds to step S208.
Specifically, the CPU 14 calculates a representative value of the map Sal[T] for each block that has not been set as a template. The representative value may be obtained by any method, such as the average value or the median value of the map values in the block. If there is a block whose calculated representative value is smaller than a predetermined threshold, the CPU 14 determines that there is a template to be added. A block whose representative value is smaller than the predetermined threshold can be assumed to belong to the background area; therefore, a more accurate map can be created by adding such a block as a template.
The CPU 14 repeats the processing of steps S204 to S207 until it determines that there is no template to be added.
Adding templates in this way is useful when the background of the target image has a gradation. For example, when the background has a gradation in which the color becomes gradually lighter (or darker) from the outer edge toward the center, blocks closer to the center are more likely to differ from the blocks existing in the outer peripheral portion. Such blocks therefore take large map values and are likely to be falsely detected as main subject areas. By gradually adding adjacent blocks with little difference as templates, blocks belonging to the background region can be reliably set as templates, and a more accurate map can be created by this template addition.
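A sketch of this template-addition loop (steps S204 to S207) follows; the helpers set_templates, run_matching, and create_map stand in for steps S204 to S206 and are hypothetical, as are the use of the median as the representative value and the dict-like map keyed by block position.

```python
import numpy as np

def create_map_with_additions(blocks, initial_templates, threshold,
                              set_templates, run_matching, create_map):
    """Repeat S204-S207: re-run the matching while low-valued blocks remain.

    `blocks` is an iterable of block positions. A non-template block whose
    representative map value falls below `threshold` is assumed to be
    background and is promoted to a template on the next iteration; the
    loop ends when no block is added.
    """
    templates = set(initial_templates)
    while True:
        set_templates(templates)                       # S204
        evaluations = run_matching(blocks, templates)  # S205
        sal = create_map(evaluations)                  # S206
        added = False                                  # S207
        for pos in blocks:
            if pos in templates:
                continue
            if np.median(sal[pos]) < threshold:  # representative value
                templates.add(pos)
                added = True
        if not added:
            return sal
```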
FIG. 14 is a block diagram illustrating a configuration example of the electronic camera according to the eighth embodiment.
The electronic camera 31 includes an imaging optical system 32, an image sensor 33, an image processing engine 34, a ROM 35, a main memory 36, a recording I/F 37, an operation unit 38 that receives user operations, and a display unit 39 having a monitor. The image sensor 33, the ROM 35, the main memory 36, the recording I/F 37, the operation unit 38, and the display unit 39 are each connected to the image processing engine 34. The image sensor 33 is an imaging device that captures an image of the subject formed by the imaging optical system 32 and generates an image signal of the captured image. The image signal output from the image sensor 33 is input to the image processing engine 34 via an A/D conversion circuit (not shown).
The image processing engine 34 is a processor that comprehensively controls the operation of the electronic camera 31. The image processing engine 34 performs various types of image processing (color interpolation processing, gradation conversion processing, contour enhancement processing, white balance adjustment, color conversion processing, and the like) on the data of the captured image. Furthermore, by executing a program, the image processing engine 34 functions as the image processing apparatus of any one of the first to seventh embodiments (the CPU 14, the low-pass image generation unit 23, the region division unit 24, the template setting unit 25, the matching processing unit 26, and the map creation unit 27).
The ROM 35 stores the program executed by the image processing engine 34, and the main memory 36 temporarily stores image data before and after the image processing.
The recording I/F 37 has a connector for connecting a nonvolatile storage medium 40 and executes data writing and reading on the storage medium 40 connected to the connector. The storage medium 40 is composed of, for example, a hard disk or a memory card incorporating a semiconductor memory; FIG. 14 illustrates a memory card as an example of the storage medium 40.
The display unit 39 displays images acquired from the image processing engine 34 and performs the display described in step S108 of the first embodiment.
The electronic camera 31 of the eighth embodiment acquires the image captured by the image sensor 33 as the target image in an imaging process triggered by a user's imaging instruction, and creates the map Sal[T] by the same processing as the image processing apparatus of any of the above embodiments. The image processing engine 34 may record the map Sal[T] as supplementary information in an image file containing the data of the target image. The same processing may also be performed using an image recorded in the main memory 36 or the like as the target image. In this way, the electronic camera 31 of the eighth embodiment can obtain substantially the same effects as those of the above embodiments.
Each variable, coefficient, threshold value, and the like described in the above embodiments is merely an example, and the present invention is not limited to them. For example, the block division in step S103 of the first embodiment uses 10 × 10 as an example, but another number of divisions may be used. Also, although the block division in step S103 of the first embodiment has been described as producing no overlapping portions among the 10 × 10 regions, the division may be performed so that the blocks overlap. Without overlap, the accuracy of the map may be lowered when the main subject region spans a plurality of blocks; dividing with overlapping portions can therefore be expected to improve the accuracy of the map in such cases. Furthermore, block division may be performed while excluding a part of the region from the beginning, for example by regarding the central portion as the main subject region.
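A sketch of block division with overlapping portions follows, assuming a NumPy image and a symmetric extension of each block; the amount of overlap is a free parameter, as the embodiments leave it open.

```python
import numpy as np

def divide_with_overlap(img, n=10, overlap=0.5):
    """Divide an image into an n x n grid of blocks whose neighbors overlap.

    With overlap = 0 this reduces to the equal division of step S103; a
    positive overlap enlarges each block on every side so that a main
    subject spanning a block boundary still falls inside some block.
    """
    h, w = img.shape[:2]
    bh, bw = h // n, w // n
    eh, ew = int(bh * overlap), int(bw * overlap)  # extension per side
    blocks = {}
    for j in range(n):
        for i in range(n):
            y0, x0 = max(0, j * bh - eh), max(0, i * bw - ew)
            y1, x1 = min(h, (j + 1) * bh + eh), min(w, (i + 1) * bw + ew)
            blocks[(i + 1, j + 1)] = img[y0:y1, x0:x1]  # 1-based (i, j)
    return blocks
```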
For example, the map Sal[T] shown in FIG. 15B can be created by performing the processing described in the above embodiments on the target image Img[1] shown in FIG. 15A, and a main subject area can be extracted based on the created map Sal[T]. When a plurality of main subject areas can be extracted from the map Sal[T], as shown in FIG. 16A, a plurality of main subject areas can be extracted from the target image Img[1].
In this case, the plurality of main subject areas can be recognized separately by using a known labeling technique or a grouping process by clustering.
As a method of selecting any one of the plurality of main subject areas, for example, selecting the main subject area having the largest area, or selecting the main subject area having the highest sum of the map values corresponding to its constituent pixels, can be considered. With such selection, the possibility of eliminating portions corresponding to noise also increases. Alternatively, any one of the plurality of main subject areas may be selected based on a user operation.
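A sketch of separating and selecting among multiple main subject areas follows, using scipy.ndimage.label as the known labeling technique; the threshold and the two selection criteria follow the methods mentioned above.

```python
import numpy as np
from scipy import ndimage

def select_main_subject(sal, threshold, by="area"):
    """Label connected regions of the thresholded map and pick one.

    by="area" selects the region with the most pixels; by="map_sum"
    selects the region whose map values sum highest. Small regions that
    correspond to noise are naturally discarded by either criterion.
    """
    labels, count = ndimage.label(sal > threshold)
    if count == 0:
        return None
    scores = []
    for k in range(1, count + 1):
        region = labels == k
        scores.append(region.sum() if by == "area" else sal[region].sum())
    best = int(np.argmax(scores)) + 1
    return labels == best  # boolean mask of the selected main subject area
```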
When one main subject area is selected in this way, the automatic zooming to the main subject area described in (a) of the first embodiment zooms to the selected main subject area, with the center of the selected main subject area determined as the zoom center. In the uses for AF, AE, and AWB described in (b) of the first embodiment, AF, AE, and AWB control and automatic shutter control are performed on the selected main subject area. In particular, in automatic shutter control, if the information of a plurality of main subject areas were combined and treated as a single main subject area, there would be a risk of performing AF control on a portion not included in any main subject area. Likewise, automatic cropping is performed on the selected main subject area.
The image processing apparatus of the present invention is not limited to the personal computer example of the above embodiments. For example, the image processing apparatus of the present invention may be an electronic device having a digital image reproduction and display function and a retouch function (for example, a photo viewer, a digital photo frame, or a photo printing apparatus). The imaging apparatus of the present invention may also be implemented as a camera module of a mobile phone terminal.
The matching processing methods of the above embodiments are examples, and the present invention is not limited to them. For example, the matching processing may be performed using normalized correlation in order to be robust against illumination changes, or using various image feature quantities defined by MPEG-7 (such as the Edge Histogram or Scalable Color descriptors). Distances such as the EMD (Earth Mover's Distance) may also be used.
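A sketch of the normalized-correlation alternative follows; note that, unlike the sums of absolute differences used above, a higher score means a better match, so the evaluation rule must be inverted accordingly.

```python
import numpy as np

def normalized_correlation(block, template):
    """Zero-mean normalized cross-correlation of two equally sized blocks.

    Subtracting the means and dividing by the norms makes the score
    insensitive to uniform illumination changes, which is the motivation
    given above for this alternative.
    """
    a = block.astype(float) - block.mean()
    b = template.astype(float) - template.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom else 0.0
```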
In the above embodiments, the processes of the low-pass image generation unit 23, the region division unit 24, the template setting unit 25, the matching processing unit 26, and the map creation unit 27 are realized by software, but each of these processes may instead be realized by hardware using an ASIC.

Abstract

The disclosed image processing device is provided with: an acquisition unit for acquiring information of an object image to be processed; a region partitioning unit for partitioning the object image into a plurality of blocks; a setting unit for setting one or more templates on the basis of the images of one or more blocks existing on an outer peripheral portion of the object image among the plurality of blocks; a calculation unit for calculating a representative value for each of the plurality of blocks into which the object image has been partitioned; a matching unit for performing matching for each of the plurality of blocks by comparing the representative value of a block to be matched with the representative values of the one or more templates; and a creation unit for creating a map indicating the distribution of a subject in the object image on the basis of the results of the matching by the matching unit. Extraction of a main subject region is thereby performed by a method adapted to the object image, without depending on high-frequency components or assuming an empirical composition.

Description

Image processing apparatus, imaging apparatus, and image processing program
The present invention relates to an image processing apparatus, an imaging apparatus, and an image processing program.
Conventionally, various techniques for extracting a main subject area included in a scene have been proposed. As an example, Patent Document 1 discloses a technique for extracting a main subject region by obtaining the degree to which the features of a certain part of an image differ from the features of the parts located around it.
Patent Document 1: JP 2009-246920 A
However, the conventional technique extracts the main subject area depending on high-frequency components. Therefore, when a portion other than the main subject region (the background region) contains high-frequency components, that portion is also extracted as the main subject region, and a preferable result cannot be obtained. Furthermore, techniques that extract the main subject region based on an empirically assumed composition have also been considered. For example, extraction may be performed with emphasis on the central part of the image, assuming that the main subject is in the center, or according to the rule of thirds, which holds that placing background lines on the horizontal and vertical one-third lines, or placing the subject at their intersections, yields a well-balanced composition. Such techniques have the problem that preferable extraction may not be performed on images whose composition differs from the assumed one.
Therefore, an object of the present invention is to provide a means for extracting a main subject region by a method adapted to the target image, without depending on high-frequency components or assuming an empirical composition.
An image processing apparatus according to one aspect includes: an acquisition unit that acquires information on a target image to be processed; a region division unit that divides the target image into a plurality of blocks; a setting unit that sets one or more templates based on the images of one or more blocks existing in the outer peripheral portion of the target image among the plurality of blocks; a calculation unit that calculates a representative value for each of the plurality of blocks into which the target image has been divided; a matching unit that performs matching for each of the plurality of blocks by comparing the representative value of a matching target block with the representative values of the one or more templates; and a creation unit that creates a map indicating the distribution of a subject in the target image based on the results of the matching by the matching unit.
The image processing apparatus may further include a generation unit that generates at least one image having a lower resolution than the target image. In this case, the region division unit divides the target image and each low-resolution image into a plurality of blocks; the setting unit sets the one or more templates for each of the target image and the low-resolution images; the matching unit performs the matching for each of the target image and the low-resolution images; and the creation unit creates a map indicating the distribution of the subject for each of the target image and the low-resolution images and creates the map indicating the distribution of the subject in the target image by performing a calculation based on the plurality of created maps.
Further, the generation unit may generate the at least one low-resolution image by performing a process of suppressing or transmitting a specific frequency band on the target image.

Further, the generation unit may generate the at least one low-resolution image by performing at least one of low-pass processing and resizing processing on the target image.

Further, the generation unit may generate the at least one low-resolution image by performing band-pass filter processing on the target image.
Further, the setting unit may set the one or more templates based on the images of all the blocks existing on the outermost periphery of the target image.

Further, the setting unit may set the one or more templates based on the images of all the blocks existing on the three sides of the target image excluding the lower side; or based on the images of all the blocks existing on those three sides together with the images of predetermined blocks existing on the lower side; or based on the images of all the blocks existing on the left side and the right side.
Further, the acquisition unit may further acquire posture information of the imaging apparatus at the time of capturing the target image, and the setting unit may select one or more blocks from the plurality of blocks based on the posture information and set the one or more templates based on the images of the selected blocks.

Further, the setting unit may select some blocks from all the blocks existing on the outer periphery of the target image based on the position of the matching target block within the target image, and set the one or more templates based on the images of the selected blocks.
Further, the calculation unit may calculate, as the representative values, the pixel value of each pixel included in a block, and the matching unit may determine an evaluation value for the matching target block based on the differences between the pixel values of pixels in the matching target block and the corresponding pixel values in the templates.

Further, the calculation unit may calculate, as the representative values, the pixel value of each pixel included in a block, and the matching unit may obtain, for each of the one or more templates, a sum of absolute differences, which is a value obtained by adding, over all the pixels in the matching target block, the absolute value of the difference between the pixel value of each pixel in the matching target block and the pixel value of the pixel at the corresponding position in the block of the template, and may use the minimum of the obtained sums of absolute differences as the evaluation value for the matching target block.

Further, the calculation unit may calculate the representative values by calculating the pixel value of each pixel included in a block and then performing an image conversion of the pixel values in the block to the frequency domain, and the matching unit may obtain, for each of the one or more templates, a sum of absolute differences between each representative value in the matching target block and the corresponding representative value in the block of the template, taken over the representative values corresponding to all the pixels in the matching target block, and may use the minimum of the obtained sums as the evaluation value for the matching target block.
Further, the calculation unit may calculate a plurality of representative colors and their weights for the pixel values included in a block by clustering or the like, and may calculate a distance that takes the representative colors and their weights into account, such as the EMD (Earth Mover's Distance). In that case, the number of representative colors may differ from block to block.

Further, the calculation unit may calculate the representative values by performing at least one of a Fourier transform, a discrete cosine transform, and a wavelet transform on the pixel values included in a block.
Further, the calculation unit may calculate, as the representative value of each of the plurality of blocks, a value indicating a color feature based on the distribution of the plurality of color components constituting the target image, and the matching unit may obtain, for each of the one or more templates, a sum of absolute differences, which is a value obtained by adding the differences between the representative values of the matching target block and the representative values of the block of the template, and may use at least one of the minimum value, the maximum value, and the average value of the obtained sums of absolute differences as the evaluation value for the matching target block.

Further, the calculation unit may calculate, as the representative value, at least one of a value indicating a representative color based on a histogram and a value indicating a feature quantity based on a relative histogram.
Further, the creation unit may compare the evaluation value with a threshold determined according to the range of values that the evaluation value can take, and create the map based on the comparison result.

Further, the image processing apparatus may include an additional setting unit that, for each of the blocks divided by the region division unit that has not been set as a template by the setting unit, compares the value in the map created by the creation unit with a predetermined threshold and newly adds one or more templates based on the comparison result. In this case, the matching unit performs matching for each of the plurality of blocks by comparing the representative value of the matching target block with the representative values of the one or more templates added by the additional setting unit, and the creation unit creates a map indicating the distribution of the subject in the target image based on the results of the matching by the matching unit.
Further, the image processing apparatus may include a processing unit that, when the target image includes a plurality of subject images, makes the plurality of subjects identifiable by performing at least one of a labeling process and a grouping process by clustering on the map created by the creation unit.

Further, when displaying the target image, the display unit may visibly display the region on the target image corresponding to the region where the values in the map exceed a predetermined threshold.

Further, the image processing apparatus may include an image processing unit that performs a trimming process on the region of the target image corresponding to the region where the values in the map exceed a predetermined threshold.
An imaging apparatus according to one aspect includes an imaging unit that captures an image of a subject and any of the image processing apparatuses described above, and the acquisition unit acquires the information of the target image from the imaging unit.

The imaging apparatus may further include an image processing unit that performs a trimming process on the region of the target image corresponding to the region where the values in the map exceed a predetermined threshold.
Further, the imaging apparatus may include a control unit that performs, based on the map, at least one of focus adjustment control and exposure control during imaging by the imaging unit.

Further, the imaging apparatus may include a control unit that monitors at least one of the size and the position of the main subject based on the map and starts imaging by the imaging unit according to the monitoring result.

Further, the imaging unit may have at least one of an optical zoom function and an electronic zoom function, and at least one of the optical zoom function and the electronic zoom function may be executed by the imaging unit based on the map.
Note that a program that causes a computer to operate as the image processing apparatus of the one aspect, a storage medium storing the program, and an expression of the operation of the image processing apparatus of the one aspect in the category of a method are also effective as specific aspects of the present invention.
FIG. 1 is a block diagram illustrating a configuration example of the image processing apparatus in the first embodiment.
FIG. 2 is a flowchart illustrating an operation example of the image processing apparatus in the first embodiment.
FIG. 3 is another flowchart illustrating an operation example of the image processing apparatus in the first embodiment.
FIG. 4 is a diagram showing an example of block division in the first embodiment.
FIG. 5 is a diagram showing an example of template setting in the first embodiment.
FIG. 6 is a diagram showing an example of the map Sal[T] in the first embodiment.
FIG. 7 is a diagram showing an example of extraction of the main subject area in the first embodiment.
FIG. 8 is a diagram showing an example of automatic cropping in the first embodiment.
FIG. 9 is a diagram showing an example of template setting in the third embodiment.
FIG. 10 is a diagram showing an example of template setting in the fourth embodiment.
FIG. 11 is a diagram showing an example of block division in a modification of the sixth embodiment.
FIG. 12 is a diagram showing an example of template setting in a modification of the sixth embodiment.
FIG. 13 is a flowchart illustrating an operation example of the image processing apparatus in the seventh embodiment.
FIG. 14 is a block diagram illustrating a configuration example of the electronic camera in the eighth embodiment.
FIG. 15 is a diagram showing an example of another map Sal[T].
FIG. 16 is a diagram showing an example of extraction of a plurality of main subject areas.
<Description of First Embodiment>
FIG. 1 is a block diagram illustrating a configuration example of the image processing apparatus in the first embodiment. The image processing apparatus of the first embodiment is configured as a personal computer in which an image processing program is installed that creates a map indicating the distribution of a subject for a processing target image (target image) captured by an imaging apparatus.
The computer 11 shown in FIG. 1 includes a data reading unit 12, a storage device 13, a CPU 14, a memory 15, an input/output I/F 16, and a bus 17. The data reading unit 12, the storage device 13, the CPU 14, the memory 15, and the input/output I/F 16 are connected to one another via the bus 17. Furthermore, an input device 18 (a keyboard, a pointing device, or the like) and a monitor 19 are connected to the computer 11 via the input/output I/F 16. The input/output I/F 16 receives various inputs from the input device 18 and outputs display data to the monitor 19.
The data reading unit 12 is used when reading the data of the target image and the image processing program from the outside. For example, the data reading unit 12 is configured by a reading device that acquires data from a removable storage medium (a reading device for optical disks, magnetic disks, magneto-optical disks, or the like) or a communication device that communicates with an external apparatus in accordance with a known communication standard (a USB interface, a LAN module, a wireless LAN module, or the like).

The storage device 13 is configured by a storage medium such as a hard disk or a nonvolatile semiconductor memory. The storage device 13 includes an image storage unit 21 that records images and a map storage unit 22 that records the maps described later. The storage device 13 stores the image processing program and the various data necessary for executing the program, and it can also store the data of the target image read from the data reading unit 12.
The CPU 14 is a processor that comprehensively controls each part of the computer 11. By executing the image processing program, the CPU 14 functions as the low-pass image generation unit 23, the region division unit 24, the template setting unit 25, the matching processing unit 26, and the map creation unit 27 (the operations of these units will be described later).

The memory 15 temporarily stores the various calculation results of the image processing program (the values of variables, flags, and the like). The memory 15 is configured by, for example, a volatile SDRAM.
<Operation Example of First Embodiment>
Hereinafter, an operation example of the image processing apparatus in the first embodiment will be described with reference to the flowcharts of FIGS. 2 and 3. The processing of the flowcharts of FIGS. 2 and 3 is started when the CPU 14 executes the image processing program in response to a program execution instruction from the user. The step numbers in FIGS. 2 and 3 correspond to each other.
(Step S101)
The CPU 14 acquires the data of the target image designated by the user from the outside via the data reading unit 12. When the data of the target image is stored in advance in the image storage unit 21 of the storage device 13 or the like, the CPU 14 may omit the processing of S101. Hereinafter, in the examples of this specification, the acquired target image is denoted Img[1].
(Step S102)
The CPU 14 causes the low-pass image generation unit 23 to generate the low-pass images Img[2] and Img[3] based on the image data of the target image Img[1] acquired in step S101. The low-pass image Img[2] is generated using Expressions 1 and 2 below, and the low-pass image Img[3] is generated using Expressions 3 and 4 below.
[Equation images: Expressions 1 to 4]
First, using Expression 1, the data of the target image Img[1] is orthogonally transformed into a frequency-domain representation by the Fourier transform. In Expression 1, (ωx, ωy) denotes coordinates in the frequency space, and fq1 denotes a predetermined threshold (the details of fq1 are described below). Next, using Expression 2, the inverse Fourier transform is applied to F(LImg[2]) obtained by Expression 1 to generate the band-limited low-pass image Img[2]. Similarly, for the low-pass image Img[3], the Fourier transform and the inverse Fourier transform are applied to the data of the target image Img[1] using Expressions 3 and 4 to generate the low-pass image Img[3].
Note that fq1 in Expression 1 and fq2 in Expression 3 are thresholds determined in advance based on the height, width, diagonal, or the like of the target image; fq1 and fq2 may be the same value or different values. In the above example, the low-pass images Img[2] and Img[3] are both generated from the image data of the target image Img[1]; alternatively, the low-pass image Img[2] may be generated from the image data of the target image Img[1], and the low-pass image Img[3] may then be generated from the image data of the generated low-pass image Img[2].

In the above example, the low-pass images Img[2] and Img[3], which have lower resolution than the target image Img[1], are generated by low-pass processing; however, similar images can be created by applying processing that suppresses or transmits a specific frequency band to the target image Img[1]. For example, band-pass filter processing may be applied to the target image Img[1] using Expression 5 below together with Expression 2 described above.
[Equation image: Expression 5]
Expression 5 represents a band-pass filter in the frequency domain. Using Expression 5, the data of the target image Img[1] is orthogonally transformed into a frequency-domain representation by the Fourier transform. In Expression 5, (ωx, ωy) denotes coordinates in the frequency space, and fq3 and fq4 denote predetermined thresholds. Like fq1 and fq2 described above, fq3 and fq4 are determined in advance based on the height, width, diagonal, or the like of the target image; each of them may be equal to or different from fq1 and fq2. Next, using Expression 2 described above, the inverse Fourier transform is applied to F(BImg[2]) obtained by Expression 5 to generate the band-limited band-pass image Img[2]. The same applies to the band-pass image Img[3]. As in the case of low-pass processing, the band-pass images Img[2] and Img[3] may both be generated from the image data of the target image Img[1], or the band-pass image Img[2] may be generated from the target image Img[1] and the band-pass image Img[3] may then be generated from the image data of the generated band-pass image Img[2].
When band-pass filter processing is used instead of low-pass processing, for example when the target image Img[1] has a background with a gentle gradation such as a sunset sky, it is preferable to apply a band-pass filter that suppresses or transmits only a specific frequency band so that the matching processing described later functions effectively.
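A sketch of this band-limited image generation follows, assuming a grayscale image as a NumPy array and a radial frequency cutoff; the exact forms of Expressions 1 to 5 and the threshold values are not reproduced here, so the normalized frequency units below are an assumption.

```python
import numpy as np

def band_limit(img, fq_low=0.0, fq_high=np.inf):
    """Keep frequency components with fq_low <= sqrt(wx^2 + wy^2) <= fq_high.

    fq_low = 0 gives a low-pass image such as Img[2] or Img[3]
    (cf. Expressions 1 to 4); a positive fq_low gives the band-pass
    variant (cf. Expression 5).
    """
    F = np.fft.fft2(img)                        # forward transform
    wy = np.fft.fftfreq(img.shape[0])[:, None]  # normalized frequencies
    wx = np.fft.fftfreq(img.shape[1])[None, :]
    radius = np.sqrt(wx ** 2 + wy ** 2)
    F[(radius < fq_low) | (radius > fq_high)] = 0  # band limitation
    return np.real(np.fft.ifft2(F))             # inverse transform
```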
(Step S103)
The CPU 14 causes the region division unit 24 to divide the target image Img[1] acquired in step S101 and the low-pass images Img[2] and Img[3] generated in step S102 into a plurality of blocks at equal intervals. As an example, the region division unit 24 in the first embodiment equally divides each of the target image Img[1] and the low-pass images Img[2] and Img[3] into a 10 × 10 matrix of blocks. In the following, each block is denoted B[n](i,j), where [n] indicates the type of image and (i,j) indicates the position of the block: n = 1 indicates the target image Img[1] acquired in step S101, and n = 2 and n = 3 indicate the low-pass images Img[2] and Img[3] generated in step S102; i indicates the horizontal position and j the vertical position. For example, the block at the upper left corner of the target image Img[1] is the start point and is denoted block B[1](1,1), and the block at the lower right corner is the end point and is denoted block B[1](10,10) (see FIG. 4). The same applies to the low-pass images Img[2] and Img[3].
The number of block divisions is determined in advance as a balance between processing accuracy and processing speed. The numbers of divisions of the target image Img[1] and the low-pass images Img[2] and Img[3] may be the same or different.

When dividing each image, a remainder may arise from the balance between the number of pixels and the number of divisions. In such a case, the remaining pixels may be assigned to the outermost peripheral portion of each image. For example, when 10 pixels remain in the vertical direction, the 10 remaining pixels may be placed in the outermost peripheral portion of the upper side or the lower side, or divided between the upper and lower sides (for example, 5 pixels on the upper side and 5 pixels on the lower side).
 (Step S104)
 The CPU 14 uses the template setting unit 25 to select, from the plurality of blocks produced by the division processing in S103, the blocks existing on the outermost periphery and set them as templates. The template setting unit 25 performs this template setting for each of the target image Img[1] and the low-pass images Img[2] and Img[3].
 In the following, the setting of templates in the target image Img[1] is described as an example. As shown in FIG. 5, the template setting unit 25 selects the blocks existing on the outermost periphery of the target image Img[1] (the hatched blocks in FIG. 5) and sets them as templates. That is, among the plurality of blocks shown in FIG. 4, a total of 36 blocks are set as templates: the 10 blocks B[1](1,1) to B[1](10,1) on the upper side, the 8 blocks B[1](1,2) to B[1](1,9) on the left side, the 8 blocks B[1](10,2) to B[1](10,9) on the right side, and the 10 blocks B[1](1,10) to B[1](10,10) on the lower side.
 However, as described above, when a remainder arises in dividing an image, the blocks set as templates in step S104 are "the blocks that exist on the outermost periphery among the blocks".
 In the following, the templates are numbered N from the upper left toward the lower right, and each template is denoted template T[1]{N}. As described above, when the 36 blocks existing on the outermost periphery of the target image Img[1] are selected, 36 templates, template T[1]{1} to template T[1]{36}, are set as shown in FIG. 5.
 The template setting unit 25 performs the same processing for each of the low-pass images Img[2] and Img[3], setting templates T[2]{1} to T[2]{36} for the low-pass image Img[2] and templates T[3]{1} to T[3]{36} for the low-pass image Img[3].
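 As a purely illustrative aid, the template selection of step S104 might be sketched as follows, reusing the blocks dictionary of the earlier sketch; the helper name and the listing order are assumptions of this sketch.

def outer_ring_indices(n=10):
    # Indices (i, j) of the blocks on the outermost periphery, listed
    # from the upper left toward the lower right (36 blocks for n = 10).
    return [(i, j) for j in range(1, n + 1) for i in range(1, n + 1)
            if i in (1, n) or j in (1, n)]

templates = [blocks[idx] for idx in outer_ring_indices()]   # T{1} ... T{36}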
 (Step S105)
 The CPU 14 uses the matching processing unit 26 to perform matching processing on each of the target image Img[1] and the low-pass images Img[2] and Img[3].
 In the following, the matching processing in the target image Img[1] is described as an example. The matching processing unit 26 obtains an evaluation value SAD[1](i,j) for each block of the target image Img[1] excluding the templates set in step S104 (the 8 × 8 = 64 blocks B[1](2,2) to B[1](9,9)).
 As an example, consider obtaining the evaluation value SAD[1](2,2) for block B[1](2,2) among the blocks to be matched. The matching processing unit 26 compares the image data of the image of block B[1](2,2) with each of the templates set in step S104 (template T[1]{1} to template T[1]{36}) and obtains the sum of absolute differences SAD[1](i,j){N} shown in the following equation.
 \mathrm{SAD}[n](i,j)\{N\} \;=\; \sum_{(x,y)\,\in\,B[n](i,j)} \Bigl|\, B[n](i,j)(x,y) \;-\; T[n]\{N\}(x,y) \,\Bigr| \qquad \text{(Equation 6)}
 [n] on the left side of Equation 6 indicates the type of image (here, n = 1), (i,j) indicates the position of the block (here, i = 2, j = 2), and {N} indicates the template number (here, 1 to 36). The right side of Equation 6 indicates that, for every pixel in the block to be matched (block B[n](i,j)), the absolute value of the difference between the pixel value of that pixel and the pixel value of the pixel at the corresponding position in the template block (template T[n]{N}) is obtained, and these absolute values are summed over all the pixels in the block to be matched.
 The sum of absolute differences SAD[1](i,j){N} obtained by Equation 6 becomes smaller as the degree of match between the block to be matched and the template becomes higher.
 For example, when obtaining the sum of absolute differences SAD[1](2,2){1} for the above-described block B[1](2,2) and template T[1]{1}, the matching processing unit 26 obtains, for every pixel in block B[1](2,2), the absolute value of the difference between the pixel value of that pixel and the pixel value of the pixel at the corresponding position in the template block (template T[1]{1} = block B[1](1,1)), and adds these values over all the pixels in block B[1](2,2) to obtain the sum of absolute differences SAD[1](2,2){1}.
 The matching processing unit 26 performs the same processing for block B[1](2,2) with each of templates T[1]{2} to T[1]{36}, obtaining the sums of absolute differences SAD[1](2,2){2} to SAD[1](2,2){36}. Then, the evaluation value SAD[1](2,2) for block B[1](2,2) is obtained using the following equation.
 \mathrm{SAD}[n](i,j) \;=\; \min_{N}\ \mathrm{SAD}[n](i,j)\{N\} \qquad \text{(Equation 7)}
 min(X) on the right side of Equation 7 returns the minimum value of X; in the above example, the minimum of the sums of absolute differences SAD[1](2,2){1} to SAD[1](2,2){36} becomes the evaluation value SAD[1](2,2).
 The matching processing unit 26 performs the above processing also for blocks B[1](3,2) to B[1](9,9), obtaining the evaluation values SAD[1](3,2) to SAD[1](9,9). As a result, the evaluation value SAD[1](i,j) is obtained for each of the blocks to be matched in the target image Img[1] (the 8 × 8 = 64 blocks B[1](2,2) to B[1](9,9)).
 The matching processing unit 26 also performs the same processing for each of the low-pass images Img[2] and Img[3], obtaining the evaluation values SAD[2](2,2) to SAD[2](9,9) for the low-pass image Img[2] and the evaluation values SAD[3](2,2) to SAD[3](9,9) for the low-pass image Img[3].
 In the example described above, matching processing is not performed for the blocks set as templates in step S104, but matching processing may be performed for those blocks in the same way. In that case, the value of the evaluation value SAD[1](i,j) for such a block is 0.
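 As a purely illustrative aid, the matching processing of step S105 might be sketched as follows, building on the blocks and templates of the earlier sketches; the float conversion and all names are assumptions of this sketch.

import numpy as np

def sad(block, template):
    # Equation 6: sum over all pixels of the absolute pixel-value difference.
    return np.abs(block.astype(np.float64) - template.astype(np.float64)).sum()

def evaluation_value(block, templates):
    # Equation 7: the minimum SAD over all templates.
    return min(sad(block, t) for t in templates)

evals = {idx: evaluation_value(blk, templates)      # template blocks excluded
         for idx, blk in blocks.items()
         if idx not in set(outer_ring_indices())}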
 (Step S106)
 The CPU 14 uses the map creation unit 27 to create a map for each of the target image Img[1] and the low-pass images Img[2] and Img[3].
 In the following, the creation of the map for the target image Img[1] is described as an example. The map creation unit 27 creates a map Sal[1] for the target image Img[1] based on the evaluation values SAD[1](2,2) to SAD[1](9,9) obtained in step S105.
 The map creation unit 27 compares each of the evaluation values SAD[1](2,2) to SAD[1](9,9) obtained in step S105 with a threshold TH to determine the pixel value to assign to each block. For example, for the above-described block B[1](2,2), the map creation unit 27 compares the evaluation value SAD[1](2,2) with the threshold TH; if SAD[1](2,2) > TH, it replaces the pixel values of all the pixels in block B[1](2,2) with the value of SAD[1](2,2). On the other hand, if SAD[1](2,2) ≤ TH, the map creation unit 27 sets the pixel values of all the pixels in block B[1](2,2) to 0.
 The threshold TH is a threshold determined according to the range that the evaluation values SAD[n](i,j) obtained in step S105 can take (it depends on, for example, the number of divisions in step S103, and may be set, for example, at about the bottom 10% of the range that SAD[n](i,j) can take). The smaller this threshold TH, the higher the possibility that the block corresponding to the compared evaluation value is estimated to contain the main subject (estimated not to be a background region); the larger this threshold TH, the higher the possibility that the block is estimated not to contain the main subject (estimated to be a background region).
 The case SAD[1](2,2) > TH is a case where it can be estimated that the main subject exists in block B[1](2,2), that is, that block B[1](2,2) is not a background region. In such a case, replacing the pixel values of all the pixels in block B[1](2,2) with the value of SAD[1](2,2) assigns a pixel value corresponding to the main subject. On the other hand, the case SAD[1](2,2) ≤ TH is a case where it can be estimated that the main subject does not exist in block B[1](2,2), that is, that block B[1](2,2) is a background region. In such a case, setting the pixel values of all the pixels in block B[1](2,2) to 0 assigns a pixel value corresponding to the background region.
 Further, for each of the blocks set as templates in step S104 (the 36 blocks corresponding to templates T[1]{1} to T[1]{36}; see FIG. 5), the map creation unit 27 sets all the pixel values in the block to 0. This is because these blocks can be estimated to be background regions without comparison against the threshold TH described above.
 The whole image in which a new pixel value has been assigned to every block by the above processing is taken as the map Sal[1] for the target image Img[1].
 The map creation unit 27 also performs the same processing for each of the low-pass images Img[2] and Img[3], creating a map Sal[2] for the low-pass image Img[2] and a map Sal[3] for the low-pass image Img[3]. In creating the maps for the low-pass images Img[2] and Img[3], the same threshold TH used in creating the map for the target image Img[1] may be used, or different thresholds may be used.
 Finally, the map creation unit 27 creates the final map Sal[T] from the map Sal[1] for the target image Img[1], the map Sal[2] for the low-pass image Img[2], and the map Sal[3] for the low-pass image Img[3], using the following equation.
 \mathrm{Sal}[T] \;=\; w_1\,\mathrm{Sal}[1] \;+\; w_2\,\mathrm{Sal}[2] \;+\; w_3\,\mathrm{Sal}[3] \qquad \text{(Equation 8)}
 w1, w2, and w3 in Equation 8 are the weights of the respective maps. The map creation unit 27 weights the maps Sal[1], Sal[2], and Sal[3] and adds the pixel values of the corresponding pixels of the maps, thereby creating the map Sal[T].
 The weights w1, w2, and w3 are determined according to, for example, the frequency components in the templates set in step S104. For example, when the templates described above are noisy, the weight w2 of the map Sal[2] for the low-pass image Img[2] and the weight w3 of the map Sal[3] for the low-pass image Img[3] may be made relatively large; when there is little noise, the weight w1 of the map Sal[1] for the target image Img[1] may be made relatively large. Alternatively, the weights w1, w2, and w3 may be determined based on, for example, the shooting mode set in the imaging device when the target image Img[1] was captured ("portrait mode", "landscape mode", and so on) or on the results of subject recognition.
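 As a purely illustrative aid, the map creation of step S106 might be sketched as follows; the assumption of no remainder pixels and all names are simplifications of this sketch.

import numpy as np

def make_map(shape, evals, th, n=10):
    # Blocks whose evaluation value exceeds TH receive that value as the
    # pixel value of every pixel in the block; all other blocks, including
    # the template blocks (absent from evals), are set to 0.
    sal = np.zeros(shape)
    bh, bw = shape[0] // n, shape[1] // n    # assumes no remainder pixels
    for (i, j), v in evals.items():
        if v > th:
            sal[(j - 1) * bh:j * bh, (i - 1) * bw:i * bw] = v
    return sal

# Final map of Equation 8, with illustrative weights w1, w2, w3:
# sal_t = w1 * sal1 + w2 * sal2 + w3 * sal3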
 FIG. 6 shows an example of a map Sal[T] created in this way. FIG. 6A shows the target image Img[1] acquired in step S101, and FIG. 6B shows the map Sal[T] created in step S106. In FIG. 6A, branches, leaves, a wire mesh, and the like appear in the background portion. With conventional methods, such portions were recognized as the main subject region even though they are background. According to the present embodiment, however, since the templates are set using the images of these portions, these portions are not misrecognized as the main subject region, and only the bird, which is the main subject, remains in the map Sal[T], as shown in FIG. 6B.
 (Step S107)
 The CPU 14 records the map Sal[T] obtained in step S106 in association with the target image Img[1]. For example, the map Sal[T] may be recorded as supplementary information of the target image Img[1], or identification information indicating that it relates to the target image Img[1] may be attached to the map Sal[T].
 (Step S108)
 Based on the map Sal[T] obtained in step S106, the CPU 14 displays the target image Img[1] and a marker indicating the main subject region superimposed on the monitor 19.
 The CPU 14 first extracts the main subject region based on the map Sal[T]. The CPU 14 compares the value of each pixel of the map Sal[T] with a predetermined threshold TR and obtains the minimum rectangular range that includes all the pixels exceeding the threshold TR, thereby extracting the main subject region. When obtaining the minimum rectangular range, it may be obtained with a fixed aspect ratio.
 The threshold TR is a threshold determined according to the range of values that each pixel included in the map Sal[T] can take. The smaller this threshold TR, the higher the possibility that the region extracted as the main subject region becomes wider; the larger this threshold TR, the higher the possibility that the region extracted as the main subject region becomes narrower.
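 As a purely illustrative aid, the extraction of the minimum rectangle in step S108 might be sketched as follows; the return convention (left, top, right, bottom) is an assumption of this sketch, and the fixed-aspect-ratio variant mentioned above is omitted.

import numpy as np

def main_subject_rect(sal_t, tr):
    # Minimum rectangle containing every pixel of Sal[T] that exceeds TR.
    ys, xs = np.nonzero(sal_t > tr)
    if ys.size == 0:
        return None                      # no pixel exceeds the threshold
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())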
 FIG. 7 shows an example of a main subject region extracted in this way. FIG. 7A shows the map Sal[T] of FIG. 6B with a frame Fa indicating the above-described main subject region superimposed on it. FIG. 7B shows the target image Img[1] of FIG. 6A with a frame Fb indicating the above-described main subject region superimposed on it.
 In the map Sal[T], as shown in FIG. 7A, the branches, leaves, wire mesh, and the like appearing in the background portion are excluded from the main subject region, so that suitable extraction of the main subject region can be realized.
 Next, as shown in FIG. 7B, the CPU 14 displays the target image Img[1] and the frame Fb, a marker indicating the main subject region, superimposed on the monitor 19. In the target image Img[1], as shown in FIG. 7B, the frame Fb indicating the main subject region is displayed on the portion excluding the branches, leaves, wire mesh, and the like appearing in the background.
 The example of FIG. 7 shows the main subject region extracted as a rectangle, but the present invention is not limited to this example. Any shape may be used, for example, an ellipse, a polygon, or an irregular shape along the outline of the main subject region.
 Further, the example of FIG. 7 shows a frame indicating the main subject region being displayed, but the present invention is not limited to this example as long as the main subject region is visible. For example, the frame may blink, or a frame of a predetermined color may be displayed. The main subject region and the other regions may also be displayed with different brightness or color.
 Upon performing the display described above, the CPU 14 ends the series of processes.
 In the example described above, the series of processes is executed in response to a program execution instruction from the user, but the present invention is not limited to this example. For example, maps for a plurality of images may be created in response to a single user instruction. A series of processes may also be executed automatically every time image data is read from the outside via the data reading unit 12. In an imaging device provided with the image processing device described in this embodiment, the series of processes may be executed when capturing an image. In a playback device provided with the image processing device described in this embodiment, the series of processes may be executed when reproducing an image.
 Next, methods of using the map Sal[T] and the extracted main subject region are described with examples. As described above, extracting the main subject region based on the map Sal[T] allows it to be used both when capturing and when reproducing images.
 At the time of shooting, the following usage methods (a) and (b) are conceivable.
 (a) Automatic zoom to the main subject region (region of interest)
 At the time of shooting, by creating the map Sal[T] based on a so-called through image used for composition confirmation, automatic zoom to the main subject region can be performed during shooting. For example, when the main subject region is smaller than a predetermined size, appropriate shooting can be performed by automatically performing optical zoom or electronic zoom centered on the main subject region.
 Such processing can be performed in the same manner when capturing moving images. In either case, the degree of zoom can be kept such that the main subject region does not protrude from the finder (the capturable range). By performing automatic zoom, the main subject that the user presumably wants to photograph can easily be captured at an appropriate zoom magnification.
 (b) Use for AE, AF, and AWB
 At the time of shooting, by creating the map Sal[T] based on a so-called through image used for composition confirmation, the range over which AE, AF, and AWB are performed during shooting can be suitably controlled. The information of the map Sal[T] may also be used for conventionally performed subject recognition and the like.
 Such AF, AE, AWB, and similar processing can be performed in the same manner when capturing moving images. In either case, the main subject region may be detected based on the map Sal[T], and AF, AE, AWB, and so on may be performed centered on the position of the center of gravity of the main subject region. By such processing, AF, AE, AWB, and so on suited to the main subject can be executed while tracking the movement of the main subject region.
 Further, at the time of shooting, automatic shutter control can be performed by creating the map Sal[T] based on the through image. For example, for continuously generated through images, the map Sal[T] is created at fixed time intervals to detect the main subject region, and at least one of the size and the position of the detected main subject region is monitored. Then, when at least one of the size and the position of the main subject region satisfies a predetermined appropriate condition (which may be preset or set by the user), automatic shutter control is performed.
 Such automatic shutter control can be performed in the same manner when capturing moving images. In either case, an image can be captured automatically in a state suitable for the main subject while tracking the movement of the main subject region.
 When performing the above-described AF, AE, AWB, and similar processing and automatic shutter control, the control conditions up to several frames before may be stored, and when the control conditions in the current frame differ significantly from the control conditions in the previous frames, tracking may be prohibited.
 At the time of reproduction, the following usage methods (c) and (d) are conceivable.
 (c) Determining the zoom center in a slide show
 In a slide show that continuously reproduces and displays a plurality of images, zoom processing is often performed as a display effect, for example when switching images. In such a case, by extracting the main subject region based on the map Sal[T], the center of the main subject region can be used as the zoom center. As a result, display in line with the purpose of the zoom processing, namely "making the main subject region (region of interest) stand out", can be performed.
 (d) Automatic cropping of the main subject region (region of interest)
 When displaying a list of a plurality of images, it is common to display only the main subject region by automatic cropping, which cuts out and enlarges a part of each image. By performing such display, a large number of images can be displayed on one screen without displaying extra information and while maintaining the overview, when displaying a list of a plurality of images. An example of such display is shown in FIG. 8. FIG. 8A is an example of a conventional list display. FIG. 8B shows an example in which automatic cropping is performed based on the map Sal[T] and only the main subject regions are displayed as a list.
 Such automatic cropping can also be applied to the confirmation image displayed immediately after shooting (a so-called freeze image). By performing automatic cropping when displaying the confirmation image, the user can easily confirm the focus, camera shake, and so on in the main subject region. The same effect can also be obtained by performing the same processing when the user instructs enlarged display during image reproduction.
 When performing automatic cropping, the cropping processing may be performed after stretching the main subject region in the vertical or horizontal direction so that the cropped image has an appropriate aspect ratio. By performing cropping processing that maintains the aspect ratio in this way, the aspect ratio of the image before cropping (the target image) can be maintained even when the cropped image is output to an external device with a fixed aspect ratio or the like.
 As described above, the image processing device of the first embodiment divides the target image into a plurality of blocks and sets a plurality of templates based on the images of the blocks, among the plurality of blocks, that exist on the outer peripheral portion of the target image. Then, a representative value is calculated for each of the plurality of blocks into which the target image is divided, matching is performed for each of the plurality of blocks by comparing the representative value of the block to be matched with the representative values of the plurality of templates, and a map indicating the distribution of the subject in the target image is created based on the matching results.
 The above-described outer peripheral portion of the target image can be considered to be, for example, a range of about 30% of the height of the target image from each of the upper and lower ends of the target image, and a range of about 30% of the width of the target image from each of the left and right ends of the target image.
 Therefore, according to the configuration of the first embodiment, by using the outer peripheral portion of the target image as templates, the regions that are background can be reliably detected. By extracting the main subject region using the created map, the main subject region can therefore be extracted by a method adapted to the target image, without depending on high-frequency components or assuming an empirical composition.
 In particular, according to the configuration of the first embodiment, in contrast to conventionally considered face recognition technology, which is specialized for recognizing a face as the main subject, the main subject region can be suitably extracted even when the main subject is not a face. Furthermore, the main subject region can be extracted from the target image without requiring various designations or settings by the user.
 Further, according to the configuration of the first embodiment, at least one image with a lower resolution than the target image is generated, a map indicating the distribution of the subject is created for each of the target image and the low-resolution images, and a map indicating the distribution of the subject in the target image is created by performing a calculation based on the plurality of created maps. Therefore, suitable templates can be set even for an image in which subjects other than the main subject appear in the outer peripheral portion of the target image. Consequently, even if high-frequency components are present in the outer peripheral portion of the target image, the main subject region can be extracted by a method adapted to the target image.
 In the first embodiment, an example is shown in which the low-pass images Img[2] and Img[3] are generated based on the image data of the target image Img[1] acquired in step S101, but the present invention is not limited to this example. For example, three or more low-pass images may be generated. In this case, a map Sal[n] is created for each of the plurality of generated low-pass images, and the created maps Sal[n] are appropriately weighted and added, whereby the map Sal[T] can be created in the same manner as in the present embodiment.
 Also, in the first embodiment, an example is shown in which the low-pass images Img[2] and Img[3] are generated based on the image data of the target image Img[1] acquired in step S101, but the low-pass images Img[2] and Img[3] need not be generated. That is, the processing of steps S103 to S105 may be performed only on the target image Img[1] acquired in step S101, and the map Sal[1] for the target image Img[1] described in step S106 may be used as the map Sal[T] as it is.
 <Description of Second Embodiment>
 An operation example of the image processing device of the second embodiment is described below. The second embodiment is a modification of the processing of S102 in the first embodiment. In this specification, in the description of the following embodiments, redundant description of the configuration of the image processing device common to the first embodiment is omitted.
 In the example of the second embodiment, the following processing is performed instead of the processing of S102 in the first embodiment.
 (Step S102)
 The CPU 14 uses the region dividing unit 24 to generate resized images Img[2] and Img[3] based on the image data of the target image Img[1] acquired in step S101. The resized image Img[2] is generated by Equation 9 below, and the resized image Img[3] is generated by Equation 10 below.
 \mathrm{Img}[2] \;=\; \mathrm{Resize}(\mathrm{Img}[1],\, rt_1) \qquad \text{(Equation 9)}
 \mathrm{Img}[3] \;=\; \mathrm{Resize}(\mathrm{Img}[1],\, rt_2) \qquad \text{(Equation 10)}
 Resize(X, Y) on the right side of Equations 9 and 10 denotes resizing X with magnification Y. The region dividing unit 24 generates the resized image Img[2] by resizing the target image Img[1] with magnification rt1, as shown in Equation 9, and generates the resized image Img[3] by resizing the target image Img[1] with magnification rt2, as shown in Equation 10. The magnifications rt1 and rt2 are predetermined magnifications, both less than 1, with rt1 ≠ rt2.
 In the example described above, the resized images Img[2] and Img[3] are generated based on the image data of the target image Img[1], but the resized image Img[2] may be generated based on the image data of the target image Img[1], and the resized image Img[3] may then be generated based on the image data of the generated resized image Img[2].
 In the processing from step S103 onward, the CPU 14 uses the resized images Img[2] and Img[3] in place of the low-pass images Img[2] and Img[3] and performs the same processing as in the first embodiment. However, when creating the map Sal[T] in step S106, the following equation is used instead of Equation 8.
 \mathrm{Sal}[T] \;=\; w_1\,\mathrm{Sal}[1] \;+\; w_2\,\mathrm{Resize}(\mathrm{Sal}[2],\, 1/rt_1) \;+\; w_3\,\mathrm{Resize}(\mathrm{Sal}[3],\, 1/rt_2) \qquad \text{(Equation 11)}
 rt1 and rt2 in Equation 11 are the magnifications rt1 and rt2 used in the resizing processing described above. Because of the resizing, the resized images Img[2] and Img[3] are smaller in size than the target image Img[1], so the map Sal[2] for the resized image Img[2] and the map Sal[3] for the resized image Img[3] are also smaller in size than the map Sal[1] for the target image Img[1]. Therefore, when creating the map Sal[T] in step S106, the weighting and addition processing is performed after the sizes are made uniform by multiplying by the reciprocals of the magnifications rt1 and rt2 used in the resizing processing.
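 As a purely illustrative aid, the resizing of the second embodiment might be sketched as follows; the magnification values and the use of scipy.ndimage.zoom are assumptions of this sketch, and rounding may require a final crop or pad so that the upscaled maps match Sal[1] exactly.

import numpy as np
from scipy.ndimage import zoom

rt1, rt2 = 0.5, 0.25                 # illustrative magnifications, both < 1
img1 = np.random.rand(480, 640)      # stand-in for the target image Img[1]
img2 = zoom(img1, rt1)               # resized image Img[2] (Equation 9)
img3 = zoom(img1, rt2)               # resized image Img[3] (Equation 10)

# After the maps are created at their own sizes, Sal[2] and Sal[3] are
# scaled back by the reciprocal magnifications before the weighted sum:
# sal_t = w1 * sal1 + w2 * zoom(sal2, 1 / rt1) + w3 * zoom(sal3, 1 / rt2)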
 As described above, the image processing device of the second embodiment performs resizing processing, a resolution-lowering processing (band limiting processing) similar to low-pass processing, instead of the low-pass processing of the first embodiment. Therefore, substantially the same effects as the configuration of the first embodiment can be obtained. In addition, the image processing device of the second embodiment can be expected to speed up the processing by performing resizing processing instead of low-pass processing.
 <Description of Third Embodiment>
 An operation example of the image processing device of the third embodiment is described below. The third embodiment is a modification of the processing of S104 in the first and second embodiments. In this specification, in the description of the following embodiments, redundant description of the configuration of the image processing device common to the first and second embodiments is omitted.
 In the example of the third embodiment, the following processing is performed instead of the processing of S104 in the first embodiment.
 (Step S104)
 The CPU 14 uses the template setting unit 25 to select, from the plurality of blocks produced by the division processing in S103, all the blocks existing on the three sides excluding the lower side and set them as templates. The template setting unit 25 performs this template setting for each of the target image Img[1] and the low-pass images Img[2] and Img[3].
 In the following, the setting of templates in the target image Img[1] is described as an example. As shown in FIG. 9A, the template setting unit 25 selects the blocks existing on the three sides excluding the lower side of the target image Img[1] (the hatched blocks in FIG. 9A) and sets them as templates. That is, among the plurality of blocks shown in FIG. 4, a total of 28 blocks are set as templates: the 10 blocks B[1](1,1) to B[1](10,1) on the upper side, the 9 blocks B[1](1,2) to B[1](1,10) on the left side, and the 9 blocks B[1](10,2) to B[1](10,10) on the right side. In the following, the templates are numbered N from the upper left toward the lower right, and each template is denoted template T[1]{N}. As described above, when the 28 blocks existing on the three sides excluding the lower side of the target image Img[1] are selected, 28 templates, template T[1]{1} to template T[1]{28}, are set as shown in FIG. 9.
 All the blocks existing on the three sides excluding the lower side are set as templates so that the blocks existing on the lower side are not set as templates. This is because the main subject (or an extension of the main subject) may exist on the lower side of the image, as in a bust-up portrait like the one shown in FIG. 9A, and in such a case the blocks existing on the lower side are not suitable as templates. If a block in which the main subject exists were set as a template, even blocks in which the main subject exists would be extracted as background regions. To deal with this problem, all the blocks existing on the three sides excluding the lower side are set as templates.
 The top and bottom of the image can be recognized based on, for example, the orientation information of the imaging device at the time the target image Img[1] was captured. The top and bottom of the image may also be recognized based on the results of automatic subject recognition, face recognition, and the like. For example, for a horizontally oriented image as shown in FIG. 9B, the blocks existing on the three sides excluding the left side of the target image Img[1] (the hatched blocks in FIG. 9B) may be selected and set as templates.
 The template setting unit 25 performs the same processing for each of the low-pass images Img[2] and Img[3], setting templates T[2]{1} to T[2]{28} for the low-pass image Img[2] and templates T[3]{1} to T[3]{28} for the low-pass image Img[3].
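 As a purely illustrative aid, the three-side template selection of the third embodiment might be sketched as follows; the treatment of the two corner blocks as belonging to the adjacent sides follows the block counts given above, and the parameter names are assumptions of this sketch.

def three_side_indices(n=10, exclude='bottom'):
    # Outer-ring indices minus the interior blocks of the excluded side;
    # the corner blocks stay with the adjacent sides (28 blocks for n = 10).
    ring = [(i, j) for j in range(1, n + 1) for i in range(1, n + 1)
            if i in (1, n) or j in (1, n)]
    if exclude == 'bottom':
        return [(i, j) for (i, j) in ring if not (j == n and 1 < i < n)]
    if exclude == 'left':                # the FIG. 9B case
        return [(i, j) for (i, j) in ring if not (i == 1 and 1 < j < n)]
    return ring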
 In the processing from step S105 onward, the CPU 14 performs the same processing as in the first embodiment. However, in step S105, the templates for which the sums of absolute differences SAD[1](i,j){N} are calculated are the 28 templates T[1]{1} to T[1]{28}.
 As described above, the image processing device of the third embodiment sets a plurality of templates based on the images of all the blocks existing on the three sides of the target image excluding the lower side. Therefore, a suitable template can be set even for an image in which the main subject exists on the lower side of the target image. In addition, reducing the number of templates can be expected to speed up the processing.
 In the third embodiment, an example is shown in which a plurality of templates is set based on the images of all the blocks existing on the three sides excluding the lower side, but a plurality of templates may instead be set based on the images of all the blocks existing on the left side and the right side.
 Also, in the first and third embodiments, examples are shown in which a plurality of templates is set based on the images of all the blocks existing on the target sides (all four sides in the first embodiment; three sides or two sides in the third embodiment), but a plurality of templates may be set based on the images of only some of those blocks. For example, templates may be set based on the images of all the blocks existing on the three sides excluding the lower side and the images of some predetermined blocks existing on the lower side. Alternatively, a plurality of templates may be set based on the images of the blocks existing at the four corners.
 <Description of Fourth Embodiment>
 An operation example of the image processing device of the fourth embodiment is described below. The fourth embodiment, like the third embodiment described above, is a modification of the processing of S104 in the first and second embodiments. Therefore, as in the third embodiment, in the description of the following embodiment, redundant description of the configuration of the image processing device common to the first and second embodiments is omitted.
 In the example of the fourth embodiment, the following processing is performed instead of the processing of S104 in the first embodiment.
 (Step S104)
 The CPU 14 uses the template setting unit 25 to select, based on the position of the block to be matched within the image, some of the blocks from among all the blocks existing on the outermost periphery of the image among the plurality of blocks produced by the division processing in S103, and set them as templates. The template setting unit 25 performs this template setting for each of the target image Img[1] and the low-pass images Img[2] and Img[3].
 In the following, the setting of templates in the target image Img[1] is described as an example. The template setting unit 25 sets templates individually for each of the blocks other than the blocks existing on the outermost periphery (the 8 × 8 = 64 blocks B[1](2,2) to B[1](9,9)). Based on the position of block B[1](i,j) within the target image Img[1], the template setting unit 25 selects, from among all the blocks existing on the outermost periphery of the target image Img[1], blocks B[1](i-1,1), B[1](i,1), and B[1](i+1,1) on the upper side, blocks B[1](1,j-1), B[1](1,j), and B[1](1,j+1) on the left side, blocks B[1](10,j-1), B[1](10,j), and B[1](10,j+1) on the right side, and blocks B[1](i-1,10), B[1](i,10), and B[1](i+1,10) on the lower side, and sets them as templates.
 For example, as shown in FIG. 10, when setting templates for block B[1](6,4), blocks B[1](5,1), B[1](6,1), and B[1](7,1) on the upper side, blocks B[1](1,3), B[1](1,4), and B[1](1,5) on the left side, blocks B[1](10,3), B[1](10,4), and B[1](10,5) on the right side, and blocks B[1](5,10), B[1](6,10), and B[1](7,10) on the lower side are selected and set as templates. That is, the template setting unit 25 sets a total of 12 blocks among the plurality of blocks shown in FIG. 4 as templates.
 In the following, the templates are numbered N from the upper left toward the lower right, and each template is denoted template T[1](i,j){N}. The (i,j) in template T[1](i,j){N} indicates that it is a template set for block B[1](i,j).
 As described above, when 12 blocks are selected based on the position of block B[1](6,4), 12 templates, template T[1](6,4){1} to template T[1](6,4){12}, are set as shown in FIG. 10.
 The template setting unit 25 performs the same processing for each of the low-pass images Img[2] and Img[3]. For the low-pass image Img[2], templates T[2](i,j){1} to T[2](i,j){12} are set for each of the blocks B[2](2,2) to B[2](9,9). For the low-pass image Img[3], templates T[3](i,j){1} to T[3](i,j){12} are set for each of the blocks B[3](2,2) to B[3](9,9).
 In the processing from step S105 onward, the CPU 14 performs the same processing as in the first embodiment. However, when performing the matching processing in step S105, the sums of absolute differences SAD[n](i,j){N} are obtained using templates that differ for each block. The templates for which the sums of absolute differences SAD[n](i,j){N} are calculated are, for each block, the 12 templates T[n](i,j){1} to T[n](i,j){12}.
 In the example described above, the template setting unit 25 selects three blocks per side from among all the blocks existing on the outermost periphery of the target image Img[1], based on the position of block B[1](i,j) within the target image Img[1], and sets them as templates. However, using a variable a (where a is an integer of 0 or more), (2a+1) blocks may be selected per side. That is, blocks B[1](i-a,1) to B[1](i+a,1) on the upper side, blocks B[1](1,j-a) to B[1](1,j+a) on the left side, blocks B[1](10,j-a) to B[1](10,j+a) on the right side, and blocks B[1](i-a,10) to B[1](i+a,10) on the lower side may be selected and set as templates. The variable a may be determined according to, for example, the number of divisions used in the block division performed in step S103.
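 As a purely illustrative aid, the position-dependent template selection of the fourth embodiment might be sketched as follows; the listing order and all names are assumptions of this sketch.

def position_templates(i, j, a=1, n=10):
    # The (2a + 1) nearest outer-ring blocks on each of the four sides,
    # relative to the matched block (i, j); a = 1 gives 12 templates.
    # Valid for matched blocks with a + 1 <= i, j <= n - a, as in the text.
    cols = range(i - a, i + a + 1)
    rows = range(j - a, j + a + 1)
    return ([(c, 1) for c in cols] + [(1, r) for r in rows] +
            [(n, r) for r in rows] + [(c, n) for c in cols])

print(position_templates(6, 4))      # the 12 template indices for B(6,4)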
 As described above, the image processing device of the fourth embodiment selects some of the blocks from among all the blocks existing on the outermost periphery of the target image, based on the position of the block to be matched within the target image, and sets a plurality of templates based on the images of the selected blocks.
 Therefore, with the configuration of the fourth embodiment as well, the number of templates can be reduced and speedup of the processing can be expected, as in the third embodiment.
 In the first to fourth embodiments, examples are shown in which a plurality of templates is set based on the images of the blocks existing on the outermost periphery, but the present invention is not limited to these examples. For example, a plurality of templates may be set based on the images of the blocks one ring inside the outermost periphery. In the example of FIG. 4, a total of 22 blocks may be set as templates: the 8 blocks B[1](2,2) to B[1](9,2) one ring inside the upper side, the 7 blocks B[1](2,3) to B[1](2,9) one ring inside the left side, and the 7 blocks B[1](9,3) to B[1](9,9) one ring inside the right side. With such a setting, cases in which the composition is fixed to some extent, for example when the target image Img[1] is a picture in a frame, can also be suitably handled.
 The number of blocks on each side does not have to be the same. For example, for the upper side, a plurality of templates may be set based on the images of two lines of blocks, while for the left side and the right side, a plurality of templates may be set based on the images of one line of blocks.
 <Description of Fifth Embodiment>
 An operation example of the image processing device of the fifth embodiment is described below. The fifth embodiment is a modification of the processing of S105 in the first to fourth embodiments described above. In this specification, in the description of the following embodiments, redundant description of the configuration of the image processing device common to the first to fourth embodiments is omitted.
In the fifth embodiment, the following process is performed instead of the process of step S105 in the first embodiment.
(Step S105)
The CPU 14 causes the matching processing unit 26 to perform matching processing on each of the target image Img[1] and the low-pass images Img[2] and Img[3].
In the following, the matching processing on the target image Img[1] is described as an example. For the target image Img[1], the matching processing unit 26 obtains a frequency feature quantity, using the following equations, for each of the blocks divided in step S103 (block B[1](1,1) to block B[1](10,10)). For the blocks corresponding to the templates set in step S104 (template T[1]{1} to template T[1]{36}), the frequency feature quantity fT[1]{N} is obtained using Equation 12; for each block of the target image Img[1] excluding the templates set in step S104 (the 8×8 = 64 blocks from B[1](2,2) to B[1](9,9)), the frequency feature quantity f[1](i,j) is obtained using Equation 13.
fT[n]{N} = F( T[n]{N} )   (Equation 12)
f[n](i,j) = F( B[n](i,j) )   (Equation 13)
where F denotes the Fourier transform described in step S102.
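As a rough illustration of Equations 12 and 13, a per-block frequency feature could be computed as follows; the patent does not fix the exact spectral representation, so taking the magnitude of a 2-D FFT is an assumption of this sketch:

```python
import numpy as np

def block_frequency_feature(block):
    """Frequency feature of one block: magnitude of the 2-D Fourier
    transform of its (grayscale) pixel values. A sketch of the idea in
    Equations 12/13; using the magnitude spectrum is an assumption."""
    return np.abs(np.fft.fft2(block.astype(np.float64)))

# Example: a 10x10 grid of 16x16-pixel blocks cut from a 160x160 image.
img = np.random.rand(160, 160)
blocks = img.reshape(10, 16, 10, 16).swapaxes(1, 2)  # (10, 10, 16, 16)
f = np.array([[block_frequency_feature(blocks[i, j])
               for j in range(10)] for i in range(10)])
print(f.shape)  # (10, 10, 16, 16)
```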
The details of the Fourier transform are the same as described in step S102 of the first embodiment. The matching processing unit 26 obtains the frequency feature quantities fT[1]{1} to fT[1]{36} and f[1](2,2) to f[1](9,9), and then obtains an evaluation value fSAD[1](i,j) for each block of the target image Img[1] excluding the templates set in step S104.
As an example, consider obtaining the evaluation value fSAD[1](2,2) for block B[1](2,2) among the matching target blocks. The matching processing unit 26 compares the frequency feature quantity f[1](2,2) of block B[1](2,2) with each of the frequency feature quantities fT[1]{1} to fT[1]{36} of the templates set in step S104 (template T[1]{1} to template T[1]{36}), and obtains the sum of absolute differences fSAD[1](i,j){N} given by the following equation.
fSAD[n](i,j){N} = Σ_(x,y)∈B[n](i,j) | f[n](i,j)(x,y) − fT[n]{N}(x,y) |   (Equation 14)
In Equation 14, [n] on the left side indicates the type of image (here n = 1), (i,j) indicates the position of the block (here i = 2, j = 2), and {N} indicates the template number (here 1 to 36). The right side of Equation 14 indicates that the absolute value of the difference between the value of the frequency feature quantity f[n](i,j) corresponding to an arbitrary pixel in the matching target block (block B[n](i,j)) and the value of the frequency feature quantity fT[n]{N} corresponding to the pixel at the corresponding position in an arbitrary template block (template T[n]{N}) is obtained and summed over the region corresponding to all pixels in the matching target block (the support of B[n](i,j)).
The sum of absolute differences fSAD[1](i,j){N} obtained by Equation 14 becomes smaller as the degree of match between the matching target block and the template becomes higher.
For example, to obtain the sum of absolute differences fSAD[1](2,2){1} for block B[1](2,2) and template T[1]{1}, the matching processing unit 26 obtains the absolute value of the difference between the value of the frequency feature quantity f[1](2,2) corresponding to an arbitrary pixel in block B[1](2,2) and the value of the frequency feature quantity fT[1]{1} corresponding to the pixel at the corresponding position in the template block (template T[1]{1} = block B[1](1,1)), sums these over the region corresponding to all pixels in block B[1](2,2) (the support of B[1](2,2)), and thereby obtains the sum of absolute differences fSAD[1](2,2){1}.
The matching processing unit 26 performs the same processing for block B[1](2,2) with each of templates T[1]{2} to T[1]{36}, obtaining the sums of absolute differences fSAD[1](2,2){2} to fSAD[1](2,2){36}. Then, the evaluation value fSAD[1](2,2) for block B[1](2,2) is obtained using the following equation.
fSAD[n](i,j) = min_N fSAD[n](i,j){N}   (Equation 15)
In Equation 15, min(X) on the right side returns the minimum value of X; in the above example, the minimum of the sums of absolute differences fSAD[1](2,2){1} to fSAD[1](2,2){36} is taken as the evaluation value fSAD[1](2,2).
The matching processing unit 26 performs the above processing for blocks B[1](3,2) to B[1](9,9) as well, obtaining the evaluation values fSAD[1](3,2) to fSAD[1](9,9). As a result, an evaluation value fSAD[1](i,j) is obtained for each of the matching target blocks in the target image Img[1] (the 8×8 = 64 blocks from B[1](2,2) to B[1](9,9)).
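The matching of Equations 14 and 15 then reduces, per block, to a sum of absolute differences followed by a minimum over templates; a sketch under the assumption that all features share one array shape:

```python
import numpy as np

def evaluation_fsad(block_feat, template_feats):
    """Evaluation value of one matching target block (Equations 14/15):
    the minimum, over all templates, of the sum of absolute differences
    between the block's frequency feature and each template's feature."""
    sads = [np.sum(np.abs(block_feat - t)) for t in template_feats]
    return min(sads)

# Example with random stand-in features (16x16 spectra, 36 templates).
rng = np.random.default_rng(0)
templates = [rng.random((16, 16)) for _ in range(36)]
block = rng.random((16, 16))
print(evaluation_fsad(block, templates))
```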
The matching processing unit 26 also performs the same processing for each of the low-pass images Img[2] and Img[3], obtaining the evaluation values fSAD[2](2,2) to fSAD[2](9,9) for the low-pass image Img[2] and the evaluation values fSAD[3](2,2) to fSAD[3](9,9) for the low-pass image Img[3].
In the processing from step S106 onward, for the target image Img[1], the CPU 14 creates the map Sal[1] for the target image Img[1] based on the evaluation values fSAD[1](2,2) to fSAD[1](9,9) instead of the evaluation values SAD[1](2,2) to SAD[1](9,9). The same applies to the low-pass images Img[2] and Img[3].
As described above, the image processing apparatus of the fifth embodiment calculates the pixel value of each pixel included in a block, and then calculates representative values by applying a Fourier transform to the pixel values included in the block. Then, for each of the plurality of templates, it obtains the sum of absolute differences, which is the value obtained by summing, over the representative values corresponding to all pixels in the matching target block, the absolute value of the difference between an arbitrary representative value in the matching target block and the corresponding representative value in an arbitrary template block; among the obtained sums of absolute differences, the minimum value is taken as the evaluation value for the matching target block. Therefore, substantially the same effects as the configuration of the first embodiment can be obtained.
In the fifth embodiment, an example of performing a Fourier transform in step S105 was shown, but any conversion processing may be performed as long as it converts the image into the frequency domain. For example, a discrete cosine transform or a wavelet transform may be performed. Furthermore, a plurality of methods may be combined to convert the image into the frequency domain.
Also, in the fifth embodiment, as described above, an example was shown in which the minimum of the obtained sums of absolute differences is used as the evaluation value for the matching target block; however, the maximum value or the average value may be used as the evaluation value.
<Description of Sixth Embodiment>
Hereinafter, an operation example of the image processing apparatus of the sixth embodiment will be described. Like the fifth embodiment described above, the sixth embodiment is a modification of the process of step S105 in the first to fourth embodiments. Accordingly, as in the fifth embodiment, redundant description of the parts of the image processing apparatus configuration shared with the first to fourth embodiments is omitted in the following description.
In the sixth embodiment, the following process is performed instead of the process of step S105 in the first embodiment.
(Step S105)
The CPU 14 causes the matching processing unit 26 to perform matching processing on each of the target image Img[1] and the low-pass images Img[2] and Img[3].
In the following, the matching processing on the target image Img[1] is described as an example. For the target image Img[1], the matching processing unit 26 obtains a representative color feature quantity, using the following equations, for each of the blocks divided in step S103 (block B[1](1,1) to block B[1](10,10)). For the blocks corresponding to the templates set in step S104 (template T[1]{1} to template T[1]{36}), the representative color feature quantity CL[1]{N} is obtained using Equation 16; for each block of the target image Img[1] excluding the templates set in step S104 (the 8×8 = 64 blocks from B[1](2,2) to B[1](9,9)), the representative color feature quantity CL[1](i,j) is obtained using Equation 17.
CL[n]{N} = ( CLr[n]{N}, CLg[n]{N}, CLb[n]{N} )   (Equation 16)
CL[n](i,j) = ( CLr[n](i,j), CLg[n](i,j), CLb[n](i,j) )   (Equation 17)
CLr[n]{N}, CLg[n]{N}, and CLb[n]{N} on the right side of Equation 16 indicate the mode of the pixel values of the R, G, and B colors, respectively, within template T[n]{N}. When there are a plurality of modes, the mode corresponding to the smallest pixel value may be adopted, or their average value may be adopted.
Similarly, CLr[n](i,j), CLg[n](i,j), and CLb[n](i,j) on the right side of Equation 17 indicate the mode of the pixel values of the R, G, and B colors, respectively, within block B[n](i,j).
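A sketch of the per-channel mode computation of Equations 16 and 17, with ties broken toward the smallest pixel value (one of the two options the text allows):

```python
import numpy as np

def representative_color(block):
    """Representative color feature of one block (Equations 16/17):
    the mode of the pixel values of each RGB channel. `block` is an
    (h, w, 3) uint8 array; ties go to the smallest pixel value."""
    modes = []
    for c in range(3):
        values, counts = np.unique(block[..., c], return_counts=True)
        modes.append(int(values[np.argmax(counts)]))  # first max = smallest tie
    return tuple(modes)  # (CLr, CLg, CLb)

# Example on a random 16x16 RGB block.
rng = np.random.default_rng(0)
print(representative_color(rng.integers(0, 256, (16, 16, 3), dtype=np.uint8)))
```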
Next, for the target image Img[1], the matching processing unit 26 obtains a second representative color feature quantity, using the following equations, for each of the blocks divided in step S103 (block B[1](1,1) to block B[1](10,10)). For the blocks corresponding to the templates set in step S104 (template T[1]{1} to template T[1]{36}), the second representative color feature quantity Q[1]{N} is obtained using Equations 18 to 21; for each block of the target image Img[1] excluding the templates set in step S104 (the 8×8 = 64 blocks from B[1](2,2) to B[1](9,9)), the second representative color feature quantity Q[1](i,j) is obtained using Equations 22 to 25.
(Equations 18 to 25, given in the original as image JPOXMLDOC01-appb-M000012, define the second representative color feature quantities Q[n]{N} and Q[n](i,j) as sums computed over the relative histograms Pr, Pg, and Pb of the R, G, and B components described below.)
Pr[n]{N} on the right side of Equation 19 indicates the relative histogram of the R component within template T[n]{N}. Similarly, Pg[n]{N} on the right side of Equation 20 indicates the relative histogram of the G component within template T[n]{N}, and Pb[n]{N} on the right side of Equation 21 indicates the relative histogram of the B component within template T[n]{N}. Also, Pr[n](i,j) on the right side of Equation 23 indicates the relative histogram of the R component within block B[n](i,j). Similarly, Pg[n](i,j) on the right side of Equation 24 indicates the relative histogram of the G component within block B[n](i,j), and Pb[n](i,j) on the right side of Equation 25 indicates the relative histogram of the B component within block B[n](i,j). In Equations 19 to 21 and 23 to 25, the range of the summation Σ is determined by the number of bins (number of divisions) of each relative histogram.
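The relative histograms of Equations 19 to 25 are normalized per-channel histograms; a sketch with an arbitrary bin count, since the embodiment leaves the number of bins open:

```python
import numpy as np

def relative_histogram(channel, bins=16):
    """Relative (normalized) histogram of one color channel of a block,
    as used in Equations 19-25. The bin count is a free parameter of
    the embodiment; 16 here is an arbitrary choice for illustration."""
    hist, _ = np.histogram(channel, bins=bins, range=(0, 256))
    return hist / hist.sum()  # relative frequencies summing to 1

# Example: per-channel relative histograms of a random RGB block.
rng = np.random.default_rng(0)
block = rng.integers(0, 256, (16, 16, 3), dtype=np.uint8)
Pr, Pg, Pb = (relative_histogram(block[..., c]) for c in range(3))
print(Pr.sum(), len(Pr))  # 1.0 16
```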
The matching processing unit 26 obtains the representative color feature quantities CL[1]{1} to CL[1]{36} and CL[1](2,2) to CL[1](9,9), as well as the second representative color feature quantities Q[1]{1} to Q[1]{36} and Q[1](2,2) to Q[1](9,9). Then, for each block of the target image Img[1] excluding the templates set in step S104, the evaluation value V[1](i,j) is obtained block by block.
As an example, consider obtaining the evaluation value V[1](2,2) for block B[1](2,2) among the matching target blocks. The matching processing unit 26 compares the representative color feature quantity CL[1](2,2) and the second representative color feature quantity Q[1](2,2) of block B[1](2,2) with the representative color feature quantities CL[1]{1} to CL[1]{36} and the second representative color feature quantities Q[1]{1} to Q[1]{36} of the templates set in step S104 (template T[1]{1} to template T[1]{36}), and obtains the sum of absolute differences V[1](i,j){N} given by the following equation.
V[n](i,j){N} = |CLr[n](i,j) − CLr[n]{N}| + |CLg[n](i,j) − CLg[n]{N}| + |CLb[n](i,j) − CLb[n]{N}| + |Q[n](i,j) − Q[n]{N}|   (Equation 26)
In Equation 26, [n] on the left side indicates the type of image (here n = 1), (i,j) indicates the position of the block (here i = 2, j = 2), and {N} indicates the template number (here 1 to 36). The right side of Equation 26 indicates that the absolute value of the difference between each feature quantity of the matching target block (block B[n](i,j)) and the corresponding feature quantity of an arbitrary template block (template T[n]{N}) is obtained and then summed.
The sum of absolute differences V[1](i,j){N} obtained by Equation 26 becomes smaller as the degree of match between the matching target block and the template becomes higher.
The matching processing unit 26 performs the same processing for block B[1](2,2) with each of templates T[1]{2} to T[1]{36}, obtaining the sums of absolute differences V[1](2,2){2} to V[1](2,2){36}. Then, the evaluation value V[1](2,2) for block B[1](2,2) is obtained using the following equation.
V[n](i,j) = min_N V[n](i,j){N}   (Equation 27)
In Equation 27, min(X) on the right side returns the minimum value of X; in the above example, the minimum of the sums of absolute differences V[1](2,2){1} to V[1](2,2){36} is taken as the evaluation value V[1](2,2).
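Putting Equations 26 and 27 together, the evaluation value of a block could be computed as below; the equal weighting of the two feature quantities is an assumption, since the text notes they may be weighted:

```python
import numpy as np

def evaluation_v(cl_block, q_block, cl_templates, q_templates):
    """Evaluation value of one block in the sixth embodiment
    (Equations 26/27): for each template, sum the absolute differences
    of the two color features, then take the minimum over templates."""
    sums = [np.sum(np.abs(cl_block - cl_t)) + np.sum(np.abs(q_block - q_t))
            for cl_t, q_t in zip(cl_templates, q_templates)]
    return min(sums)

# Example with stand-in features: 3-vector CL (RGB modes), 16-bin Q.
rng = np.random.default_rng(0)
cl_t = [rng.integers(0, 256, 3) for _ in range(36)]
q_t = [rng.random(16) for _ in range(36)]
print(evaluation_v(rng.integers(0, 256, 3), rng.random(16), cl_t, q_t))
```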
The matching processing unit 26 performs the above processing for blocks B[1](3,2) to B[1](9,9) as well, obtaining the evaluation values V[1](3,2) to V[1](9,9). As a result, an evaluation value V[1](i,j) is obtained for each of the matching target blocks in the target image Img[1] (the 8×8 = 64 blocks from B[1](2,2) to B[1](9,9)).
The matching processing unit 26 also performs the same processing for each of the low-pass images Img[2] and Img[3], obtaining the evaluation values V[2](2,2) to V[2](9,9) for the low-pass image Img[2] and the evaluation values V[3](2,2) to V[3](9,9) for the low-pass image Img[3].
In the processing from step S106 onward, for the target image Img[1], the CPU 14 creates the map Sal[1] for the target image Img[1] based on the evaluation values V[1](2,2) to V[1](9,9) instead of the evaluation values SAD[1](2,2) to SAD[1](9,9). The same applies to the low-pass images Img[2] and Img[3].
As described above, the image processing apparatus of the sixth embodiment calculates, as representative values, values indicating color-related features for each of the plurality of blocks, based on the distribution of the plurality of color components constituting the target image. Then, for each of the plurality of templates, it obtains the sum of absolute differences, which is the value obtained by adding the differences between the representative values of the matching target block and the representative values of an arbitrary template block; among the obtained sums of absolute differences, the minimum value is taken as the evaluation value for the matching target block. Therefore, substantially the same effects as the configuration of the first embodiment can be obtained.
In the sixth embodiment, the representative color feature quantity and the second representative color feature quantity were described as examples of values indicating color-related features, but only one of them may be used. Also, when obtaining the sum of absolute differences V[n](i,j){N} shown in Equation 26, the representative color feature quantity and the second representative color feature quantity may be appropriately weighted.
<Description of Modified Example of Sixth Embodiment>
The processing from step S103 to step S105 of the sixth embodiment may be modified as follows.
(Step S103)
The CPU 14 causes the area dividing unit 24 to divide the target image Img[1] acquired in step S101 and the low-pass images Img[2] and Img[3] generated in step S102 into a plurality of blocks. However, as shown in FIG. 11, the area dividing unit 24 divides the target image Img[1] into a block B[1](1,1) existing on the outer periphery and the 8×8 matrix of blocks B[1](2,2) to B[1](9,9) existing inside it. The area dividing unit 24 divides the low-pass images Img[2] and Img[3] in the same manner.
(Step S104)
The CPU 14 causes the template setting unit 25 to select, from the plurality of blocks produced by the division processing of step S103, the block B[1](1,1) existing on the outer periphery, and to set it as a template. As shown in FIG. 12, the template setting unit 25 sets a template for each of the target image Img[1] and the low-pass images Img[2] and Img[3]. That is, in this modified example, the template setting unit 25 sets one block as the template T[1]{1}.
(Step S105)
The CPU 14 causes the matching processing unit 26 to perform matching processing on each of the target image Img[1] and the low-pass images Img[2] and Img[3].
In the following, the matching processing on the target image Img[1] is described as an example. For the target image Img[1], the matching processing unit 26 obtains the representative color feature quantity described above for each of the blocks divided in step S103 (blocks B[1](1,1) and B[1](2,2) to B[1](9,9)). For the block corresponding to the template set in step S104 (template T[1]{1}), the representative color feature quantity CL[1]{1} described above is obtained; for each block of the target image Img[1] excluding the template set in step S104 (the 8×8 = 64 blocks from B[1](2,2) to B[1](9,9)), the representative color feature quantity CL[1](i,j) described above is obtained.
Next, for the target image Img[1], the matching processing unit 26 obtains the second representative color feature quantity described above for each of the blocks divided in step S103 (blocks B[1](1,1) and B[1](2,2) to B[1](9,9)). For the block corresponding to the template set in step S104 (template T[1]{1}), the second representative color feature quantity Q[1]{1} described above is obtained; for each block of the target image Img[1] excluding the template set in step S104 (the 8×8 = 64 blocks from B[1](2,2) to B[1](9,9)), the second representative color feature quantity Q[1](i,j) described above is obtained.
The matching processing unit 26 obtains the representative color feature quantities CL[1]{1} and CL[1](2,2) to CL[1](9,9), as well as the second representative color feature quantities Q[1]{1} and Q[1](2,2) to Q[1](9,9). Then, for each block of the target image Img[1] excluding the template set in step S104, the evaluation value V[1](i,j) is obtained block by block.
As an example, consider obtaining the evaluation value V[1](2,2) for block B[1](2,2) among the matching target blocks. The matching processing unit 26 compares the representative color feature quantity CL[1](2,2) and the second representative color feature quantity Q[1](2,2) of block B[1](2,2) with the representative color feature quantity CL[1]{1} and the second representative color feature quantity Q[1]{1} of the template set in step S104 (template T[1]{1}), and obtains the sum of absolute differences V[1](2,2){1} described above.
The matching processing unit 26 performs the above processing for blocks B[1](3,2) to B[1](9,9) as well, obtaining the evaluation values V[1](3,2) to V[1](9,9). As a result, an evaluation value V[1](i,j) is obtained for each of the matching target blocks in the target image Img[1] (the 8×8 = 64 blocks from B[1](2,2) to B[1](9,9)).
The matching processing unit 26 also performs the same processing for each of the low-pass images Img[2] and Img[3], obtaining the evaluation values V[2](2,2) to V[2](9,9) for the low-pass image Img[2] and the evaluation values V[3](2,2) to V[3](9,9) for the low-pass image Img[3].
As described above, the target image Img[1] may be divided non-uniformly, and the block B[1](1,1), which as a result of the division becomes the sole template, may be set as the template T[1]{1}. In this case as well, the same effects as the sixth embodiment can be obtained.
The division example shown in FIG. 11 and the template setting example shown in FIG. 12 are merely examples, and the present invention is not limited to them. The division may be performed in any shape, and a template may be set based on the images of a plurality of blocks.
<Description of Seventh Embodiment>
Hereinafter, an operation example of the image processing apparatus of the seventh embodiment will be described. The seventh embodiment is a modification of the map creation in the first to sixth embodiments. Accordingly, redundant description of the parts of the image processing apparatus configuration shared with the first to sixth embodiments is omitted in the following description.
In the seventh embodiment, a process of adding templates is performed in the map creation processing of the first to sixth embodiments.
FIG. 13 is a flowchart showing a modification of the flowchart shown in FIG. 2 of the first embodiment.
(Steps S201 to S206)
The CPU 14 performs the same processing as steps S101 to S106 of the flowchart shown in FIG. 2.
(Step S207)
The CPU 14 determines, based on the map Sal[T] obtained in step S206, whether or not there is a template to be added. When the CPU 14 determines that there is a template to be added, it returns to step S204 and sets templates again. On the other hand, when the CPU 14 determines that there is no template to be added, it proceeds to step S208.
Among the plurality of blocks produced by the division processing of step S203, the CPU 14 obtains, for each block not set as a template, a representative value of the values of the pixels of the map Sal[T]. The representative value may be obtained by any method, such as the average value or the median value. If there is a block whose obtained representative value is smaller than a predetermined threshold, the CPU 14 determines that there is a template to be added. A block whose representative value is smaller than the predetermined threshold is a block that can be presumed to belong to the background region; therefore, by adding such a block as a template, a more accurate map can be created.
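A sketch of the step S207 decision, using the mean as the representative value (the text also allows the median); all helper names here are hypothetical:

```python
import numpy as np

def blocks_to_add(sal_map, block_slices, template_ids, threshold):
    """Among blocks not yet set as templates, return those whose
    representative map value falls below a threshold, i.e. blocks
    presumed to be background. `block_slices` maps a block id to its
    (row, col) slices into the map."""
    added = []
    for bid, (rs, cs) in block_slices.items():
        if bid in template_ids:
            continue
        if np.mean(sal_map[rs, cs]) < threshold:
            added.append(bid)
    return added  # empty list -> no template to add, go to step S208

# Example on a 4x4-block map where the outer ring is already template.
sal = np.random.rand(64, 64)
slices = {(i, j): (slice(i * 16, (i + 1) * 16), slice(j * 16, (j + 1) * 16))
          for i in range(4) for j in range(4)}
outer = {k for k in slices if 0 in k or 3 in k}
print(blocks_to_add(sal, slices, outer, threshold=0.3))
```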
The CPU 14 repeats the processing from step S204 to step S207 until it determines that there is no template to be added. By repeating such processing, blocks in portions that are not on the outer periphery (portions close to the center) may also end up being set as templates. Therefore, even for a target image whose composition does not place the main subject region at the center, templates can be set appropriately and an accurate map can be created.
Adding templates is also useful, for example, when the background portion of the target image contains a gradation. When a gradation is applied in which the color becomes lighter (or brighter) from the outside toward the inside of the background portion, blocks closer to the center differ more easily from the blocks on the outer periphery. These blocks therefore take large map values and are likely to be detected as the main subject region. However, when templates are added as described above, adjacent blocks with little difference are gradually added as templates, so blocks belonging to the background region can be reliably set as templates.
In the invention of Japanese Patent No. 4334981, a technique is disclosed that selects the peripheral portion of an image as a background template and weights each block by its physical distance, thereby accurately detecting the background in the interior of the image as well. In that invention, however, due to the weighting by physical distance, even the main subject region tends to be detected as background the closer it is to the central portion. In contrast, according to the template-adding method described above, no weighting by physical distance is used; therefore, regardless of the physical position of the main subject, a portion is unlikely to be detected as a background region unless it resembles the portions corresponding to the blocks set as templates.
(Steps S208 to S209)
The CPU 14 performs the same processing as steps S107 to S108 of the flowchart shown in FIG. 2.
As described above, by adding templates, a more accurate map can be created.
<Description of Eighth Embodiment>
FIG. 14 is a block diagram showing a configuration example of an electronic camera according to the eighth embodiment. The electronic camera 31 includes an imaging optical system 32, an image sensor 33, an image processing engine 34, a ROM 35, a main memory 36, a recording I/F 37, an operation unit 38 that receives user operations, and a display unit 39 provided with a monitor (not shown). Here, the image sensor 33, the ROM 35, the main memory 36, the recording I/F 37, the operation unit 38, and the display unit 39 are each connected to the image processing engine 34.
The image sensor 33 is an imaging device that captures an image of the subject formed by the imaging optical system 32 and generates an image signal of the captured image. The image signal output from the image sensor 33 is input to the control unit via an A/D conversion circuit (not shown).
The image processing engine 34 is a processor that comprehensively controls the operation of the electronic camera 31. For example, the image processing engine 34 performs various kinds of image processing (color interpolation processing, gradation conversion processing, contour enhancement processing, white balance adjustment, color conversion processing, and the like) on the data of captured images. Also, by executing a program, the image processing engine 34 functions as the image processing apparatus of any of the first to seventh embodiments described above (the CPU 14, the low-pass image generation unit 23, the area dividing unit 24, the template setting unit 25, the matching processing unit 26, and the map creation unit 27).
The ROM 35 stores the program executed by the image processing engine 34. The main memory 36 temporarily stores image data in the pre- and post-processing stages of image processing.
The recording I/F 37 has a connector for connecting a non-volatile storage medium 40, and writes and reads data to and from the storage medium 40 connected to the connector. The storage medium 40 is constituted by a hard disk, a memory card incorporating a semiconductor memory, or the like. In FIG. 14, a memory card is illustrated as an example of the storage medium 40.
The display unit 39 displays the image data acquired from the image processing engine 34 and performs the display described in step S108 of the first embodiment.
In an imaging process triggered by a user's imaging instruction, the electronic camera 31 of the eighth embodiment acquires an image captured by the image sensor 33 as the target image and creates the map Sal[T] by the same processing as the image processing apparatus of any of the above embodiments. The image processing engine 34 may record the map Sal[T] as supplementary information in the image file containing the data of the target image. The same processing may also be performed using an image recorded in the main memory 36 or the like as the target image. As described above, the electronic camera 31 of the eighth embodiment can obtain substantially the same effects as the above embodiments.
<Supplementary Items of the Embodiments>
(1) In each of the above embodiments, an example was described in which the data of the target image is in RGB format, but the present invention is not limited to the configuration of the above embodiments. The image processing apparatus of the present invention can also be applied to image data in other color spaces, such as the YCbCr color space or the L*a*b* color space.
(2) The variables, coefficients, thresholds, and the like described in the above embodiments are examples, and the present invention is not limited to them. For example, the block division in step S103 of the first embodiment was shown as 10×10, but other numbers of divisions may be used.
Also, for example, the block division in step S103 of the first embodiment was shown as being performed so that no overlapping portions occur among the 10×10 regions, but the division may instead be performed so that the blocks have overlapping portions. When there is no overlap and the main subject region straddles a plurality of blocks, the accuracy of the map may decrease. By dividing so as to include overlapping portions, an improvement in map accuracy can therefore be expected even when the main subject region straddles a plurality of blocks. Furthermore, block division may be performed by excluding some regions from the start, for example by regarding the central portion as the main subject region.
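A sketch of such an overlapping division; the overlap ratio is left open by the text, so the value used here is illustrative only:

```python
import numpy as np

def overlapping_blocks(img, n=10, overlap=0.5):
    """Divide an image into an n-x-n grid of blocks whose neighbors
    overlap by the given fraction, as suggested in supplementary item
    (2). Rounding may trim a few pixels at the right/bottom edges."""
    h, w = img.shape[:2]
    bh = int(h / (n - (n - 1) * overlap))
    bw = int(w / (n - (n - 1) * overlap))
    step_y, step_x = int(bh * (1 - overlap)), int(bw * (1 - overlap))
    return [[img[i * step_y:i * step_y + bh, j * step_x:j * step_x + bw]
             for j in range(n)] for i in range(n)]

img = np.random.rand(160, 160)
blocks = overlapping_blocks(img)
print(blocks[0][0].shape)  # each block overlaps its neighbors by half
```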
(3) In each of the above embodiments, as shown in FIG. 6, the case where there is one main subject region was described as an example, but the present invention is not limited to this example. For example, by performing the processing described in the above embodiments on the target image Img[1] shown in FIG. 15A, the map Sal[T] shown in FIG. 15B can be created. In such a case, when main subject regions are extracted based on the created map Sal[T], a plurality of main subject regions can be extracted from the map Sal[T], as shown in FIG. 16A. Furthermore, as shown in FIG. 16B, a plurality of main subject regions can be extracted from the target image Img[1]. When a plurality of main subject regions are extracted, the regions can be recognized separately by using a known labeling technique or grouping processing by clustering, as sketched below.
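A sketch of such grouping using connected-component labeling (one of the known techniques mentioned; the threshold is a free parameter of this illustration):

```python
import numpy as np
from scipy import ndimage

def extract_subject_regions(sal_map, threshold):
    """Separate multiple main subject regions from a map by thresholding
    followed by connected-component labeling. Returns one boolean mask
    per region."""
    labels, count = ndimage.label(sal_map > threshold)
    return [labels == k for k in range(1, count + 1)]

# Example: a map with two bright areas yields two region masks.
sal = np.zeros((64, 64))
sal[8:20, 8:20] = 1.0
sal[40:56, 30:50] = 0.8
print(len(extract_subject_regions(sal, threshold=0.5)))  # 2
```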
As described above, when a plurality of main subject regions are extracted, the processing described in the above embodiments may be performed for each main subject region, either by selecting one of the plurality of main subject regions or by integrating the information of the plurality of main subject regions.
When selecting one of a plurality of main subject regions, conceivable methods include, for example, selecting the main subject region with the largest area, or selecting the main subject region with the highest sum of the map values corresponding to the pixels constituting that region. Conversely, by excluding main subject regions with a low sum of map values or a small area, the possibility of removing portions corresponding to noise increases. One of the plurality of main subject regions may also be selected based on a user operation.
As described above, when one of the plurality of main subject regions is selected, the automatic zoom to the main subject region described in (a) of the first embodiment zooms to the selected main subject region. In the use for AF, AE, and AWB described in (b) of the first embodiment, AF, AE, and AWB control and automatic shutter control are performed according to the selected main subject region. In particular, in automatic shutter control, if the information of the plurality of main subject regions is integrated and the whole is regarded as one main subject region, AF control may end up being performed on a portion that is not included in any of the main subject regions. However, by selecting one of the plurality of main subject regions, AF control can be reliably performed on one of them. In the determination of the zoom center in the slide show described in (c) of the first embodiment, the center of the selected main subject region is determined as the zoom center. In the automatic cropping of the main subject region described in (d) of the first embodiment, automatic cropping is performed on the selected main subject region.
(4) The image processing apparatus of the present invention is not limited to the example of the personal computer of the above embodiments. The image processing apparatus of the present invention may be an electronic device having a digital image reproduction/display function or a retouching function (for example, a photo viewer, a digital photo frame, or a photo printing apparatus). The imaging apparatus of the present invention may also be implemented as a camera module of a mobile phone terminal.
(5) The matching processing methods of the above embodiments are examples, and the present invention is not limited to them. The present invention has basically been described using examples based on the sum of absolute differences; however, for example, to be robust against illumination changes, the matching processing may be performed using normalized correlation. Also, for example, the matching processing may be performed using various image feature quantities defined in MPEG-7 (such as Edge Histogram or Scalable Color). In the matching processing, the distance between representative colors is calculated in consideration of weights, but the numbers of representative colors may differ from each other; in this case, a technique such as the EMD (Earth Mover's Distance) may be used.
(6) In each of the above embodiments, an example was described in which the processing of the low-pass image generation unit 23, the area dividing unit 24, the template setting unit 25, the matching processing unit 26, and the map creation unit 27 is realized in software; of course, these processes may instead be realized in hardware by an ASIC.
From the above detailed description, the features and advantages of the embodiments will become apparent. The claims are intended to extend to the features and advantages of the embodiments described above without departing from their spirit and scope of rights. Further, a person having ordinary knowledge in the technical field should be able to easily arrive at all manner of improvements and modifications; there is no intention to limit the scope of the inventive embodiments to what is described above, and appropriate improvements and equivalents within the scope disclosed in the embodiments may also be relied upon.
11: computer; 14: CPU; 23: low-pass image generation unit; 24: area dividing unit; 25: template setting unit; 26: matching processing unit; 27: map creation unit; 31: electronic camera; 33: image sensor; 34: image processing engine; 35: ROM; 37: recording I/F; 39: display unit; 40: storage medium

Claims (26)

1. An image processing apparatus comprising:
an acquisition unit that acquires information of a target image to be processed;
an area dividing unit that divides the target image into a plurality of blocks;
a setting unit that sets one or more templates based on the images of one or more blocks existing on the outer peripheral portion of the target image among the plurality of blocks;
a calculation unit that calculates a representative value for each of the plurality of blocks into which the target image is divided;
a matching unit that performs matching for each of the plurality of blocks by comparing the representative value of a matching target block with the representative values of the one or more templates; and
a creation unit that creates a map indicating the distribution of a subject in the target image based on the result of the matching by the matching unit.
2. The image processing apparatus according to claim 1, further comprising:
a generation unit that generates at least one image of lower resolution than the target image,
wherein the area dividing unit divides each of the target image and the lower-resolution image into a plurality of blocks,
the setting unit sets the one or more templates for each of the target image and the lower-resolution image,
the matching unit performs the matching for each of the target image and the lower-resolution image, and
the creation unit creates a map indicating the distribution of the subject for each of the target image and the lower-resolution image, and creates the map indicating the distribution of the subject in the target image by performing a calculation based on the plurality of created maps.
3. The image processing apparatus according to claim 2, wherein the generation unit generates the at least one lower-resolution image by applying, to the target image, processing that suppresses or passes a specific frequency band.
4. The image processing apparatus according to claim 3, wherein the generation unit generates the at least one lower-resolution image by applying at least one of low-pass processing and resizing processing to the target image.
5. The image processing apparatus according to claim 3, wherein the generation unit generates the at least one lower-resolution image by applying band-pass filter processing to the target image.
6. The image processing apparatus according to any one of claims 1 to 5, wherein the setting unit sets the one or more templates based on the images of all blocks existing on the outer periphery of the target image.
7. The image processing apparatus according to any one of claims 1 to 5, wherein the setting unit sets the one or more templates based on the images of all blocks existing on the three sides of the target image excluding the lower side, or based on the images of all blocks existing on the three sides and the images of predetermined blocks existing on the lower side, or based on the images of all blocks existing on the left side and the right side.
8. The image processing apparatus according to claim 7, wherein the acquisition unit further acquires posture information of the imaging apparatus at the time the target image was captured, and the setting unit selects one or more blocks from the plurality of blocks based on the posture information and sets the one or more templates based on the images of the selected blocks.
9. The image processing apparatus according to any one of claims 1 to 5, wherein the setting unit selects some blocks from all the blocks existing on the outer periphery of the target image based on the position of the matching target block within the target image, and sets the one or more templates based on the images of the selected plurality of blocks.
10. The image processing apparatus according to any one of claims 1 to 9, wherein the calculation unit calculates, as the representative values, a pixel value for each pixel included in a block, and the matching unit sets an evaluation value for the matching target block based on differences between pixel values of arbitrary pixels in the matching target block.
11. The image processing apparatus according to claim 10, wherein the calculation unit calculates, as the representative value, a pixel value for each pixel included in a block, and the matching unit obtains, for each of the one or more templates, a sum of absolute differences, namely the value obtained by taking the absolute value of the difference between the pixel value of each pixel in the block to be matched and the pixel value of the pixel at the corresponding position in the template block and summing these absolute values over all pixels in the block to be matched, and uses the smallest of the sums of absolute differences so obtained as the evaluation value for the block to be matched.
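The minimum sum of absolute differences (SAD) of claim 11, sketched in NumPy (sad_evaluation is a hypothetical name; the block and every template are assumed to share one shape):

```python
import numpy as np

def sad_evaluation(block: np.ndarray, templates: list) -> float:
    """The evaluation value of a block: the minimum, over all templates,
    of the sum of absolute pixel-wise differences (SAD)."""
    block = block.astype(np.float64)
    sads = [np.abs(block - t.astype(np.float64)).sum() for t in templates]
    return min(sads)

# A block whose minimum SAD is large resembles none of the (background)
# templates, so it is likely to belong to the subject.
```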
12. The image processing apparatus according to claim 10, wherein the calculation unit calculates the representative values by first calculating a pixel value for each pixel included in a block and then applying an image transform into the frequency domain to the pixel values in the block, and the matching unit obtains, for each of the one or more templates, a sum of absolute differences, namely the value obtained by taking the absolute value of the difference between each representative value in the block to be matched and the corresponding representative value in the template block and summing these absolute values over the representative values corresponding to all pixels in the block to be matched, and uses the smallest of the sums of absolute differences so obtained as the evaluation value for the block to be matched.
13. The image processing apparatus according to claim 12, wherein the calculation unit calculates the representative values by applying at least one of a Fourier transform, a discrete cosine transform, and a wavelet transform to the pixel values included in the block.
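Of the three transforms claim 13 lists, the discrete cosine transform is the easiest to sketch; below, an orthonormal DCT-II for square blocks written directly in NumPy (dct2 is a hypothetical helper, not the patent's implementation):

```python
import numpy as np

def dct2(block: np.ndarray) -> np.ndarray:
    """2-D DCT-II of a square block via an orthonormal DCT matrix: one of the
    three transforms listed in claim 13 as a source of representative values."""
    n = block.shape[0]
    k, i = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    C[0, :] = np.sqrt(1.0 / n)           # DC row normalisation
    return C @ block.astype(np.float64) @ C.T

coeffs = dct2(np.random.rand(8, 8))      # 64 frequency-domain representative values
```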
14. The image processing apparatus according to claim 10, wherein the calculation unit calculates, as the representative value of each of the plurality of blocks, a value indicating a color-related feature based on the distribution of the plurality of color components constituting the target image, and the matching unit obtains, for each of the one or more templates, a sum of absolute differences, namely the value obtained by summing the differences between the representative value of the block to be matched and the representative value of the template block, and uses at least one of the smallest sum of absolute differences, the largest sum of absolute differences, and the average of the sums of absolute differences so obtained as the evaluation value for the block to be matched.
15. The image processing apparatus according to claim 14, wherein the calculation unit calculates, as the representative value, at least one of a value indicating a representative color based on a histogram and a value indicating a feature amount based on a relative histogram.
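One plausible reading of the histogram-based representative color of claim 15 (illustrative; representative_color, the bin count, and the 8-bit value range are assumptions):

```python
import numpy as np

def representative_color(block: np.ndarray, bins: int = 16) -> np.ndarray:
    """Per-channel representative colour from a histogram: the centre of the
    most populated bin of each colour plane."""
    reps = []
    for ch in range(block.shape[2]):                 # e.g. R, G, B planes
        hist, edges = np.histogram(block[..., ch], bins=bins, range=(0, 256))
        peak = int(np.argmax(hist))
        reps.append((edges[peak] + edges[peak + 1]) / 2.0)
    return np.array(reps)

rep = representative_color(np.random.randint(0, 256, (8, 8, 3)))
```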
16. The image processing apparatus according to any one of claims 10 to 15, wherein the creation unit compares the evaluation value with a threshold determined according to the range of values the evaluation value can take, and creates the map based on the result of the comparison.
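Claim 16, sketched: the threshold is tied to the range the evaluation value can take; both the [0, eval_max] range and the 0.25 scale factor below are assumptions:

```python
import numpy as np

def make_map(eval_values: np.ndarray, eval_max: float) -> np.ndarray:
    """Binary subject map from per-block evaluation values. The threshold is
    derived from the assumed range [0, eval_max] of the evaluation value."""
    return (eval_values > 0.25 * eval_max).astype(np.uint8)   # 1 = subject
```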
17. The image processing apparatus according to any one of claims 1 to 16, further comprising an additional setting unit that, for those of the plurality of blocks divided by the region dividing unit that have not been set as templates by the setting unit, compares the corresponding values in the map created by the creation unit with a predetermined threshold and newly adds one or more templates based on the result of the comparison, wherein the matching unit performs matching for each of the plurality of blocks by comparing the representative value of the block to be matched with the representative values of the one or more templates added by the additional setting unit, and the creation unit creates a map indicating the distribution of subjects in the target image based on the result of the matching performed by the matching unit.
18. The image processing apparatus according to any one of claims 1 to 17, further comprising a processing unit that, when the target image contains a plurality of subject images, makes the plurality of subjects individually identifiable by applying at least one of a labeling process and a clustering-based grouping process to the map created by the creation unit.
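The labeling process of claim 18 corresponds to connected-component labeling; a sketch using scipy.ndimage.label on a toy binary map (the map values are invented for the example):

```python
import numpy as np
from scipy import ndimage

subject_map = np.array([[0, 1, 1, 0, 0],
                        [0, 1, 1, 0, 1],
                        [0, 0, 0, 0, 1]], dtype=np.uint8)

labels, n_subjects = ndimage.label(subject_map)   # 4-connectivity by default
print(n_subjects)   # 2 -- two subjects can now be told apart on the map
print(labels)       # each subject's blocks share one integer label
```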
19. The image processing apparatus according to any one of claims 1 to 18, further comprising a display unit that displays images, wherein, when displaying the target image, the display unit renders visibly identifiable the region of the target image corresponding to the region of the map whose values exceed a predetermined threshold.
20. The image processing apparatus according to any one of claims 1 to 19, further comprising an image processing unit that applies trimming processing to the region of the target image corresponding to the region of the map whose values exceed a predetermined threshold.
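The trimming of claims 20 and 22 can be sketched as a crop to the bounding box of the above-threshold map region (trim_to_subject is a hypothetical helper; the map is assumed already binarised, one cell per block):

```python
import numpy as np

def trim_to_subject(img: np.ndarray, subject_map: np.ndarray, block: int) -> np.ndarray:
    """Crop the image to the bounding box of the above-threshold map region.
    subject_map holds one binary cell per block; block is the block edge in pixels."""
    rows, cols = np.nonzero(subject_map)
    return img[rows.min() * block : (rows.max() + 1) * block,
               cols.min() * block : (cols.max() + 1) * block]
```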
21. An imaging apparatus comprising: an imaging unit that captures an image of a subject; and the image processing apparatus according to any one of claims 1 to 20, wherein the acquisition unit acquires the information of the target image from the imaging unit.
22. The imaging apparatus according to claim 21, further comprising an image processing unit that applies trimming processing to the region of the target image corresponding to the region of the map whose values exceed a predetermined threshold.
23. The imaging apparatus according to claim 21, further comprising a control unit that performs, based on the map, at least one of focus adjustment control and exposure control when an image is captured by the imaging unit.
24. The imaging apparatus according to claim 21, further comprising a control unit that monitors at least one of the size and the position of a main subject based on the map and causes the imaging unit to start imaging according to the result of the monitoring.
25. The imaging apparatus according to claim 21, wherein the imaging unit has at least one of an optical zoom function and an electronic zoom function, and at least one of the optical zoom function and the electronic zoom function of the imaging unit is executed based on the map.
26. An image processing program that causes a computer to execute: an acquisition process of acquiring information of a target image to be processed; a region dividing process of dividing the target image into a plurality of blocks; a setting process of setting one or more templates based on the images of one or more blocks located on the outer periphery of the target image among the plurality of blocks; a calculation process of calculating a representative value for each of the plurality of blocks into which the target image has been divided; a matching process of performing matching for each of the plurality of blocks by comparing the representative value of the block to be matched with the representative values of the one or more templates; and a creation process of creating, based on the result of the matching process, a map indicating the distribution of subjects in the target image.
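To tie the six processes of claim 26 together, a compact end-to-end sketch (illustrative only: greyscale input, raw pixel values as the representative values, minimum SAD as the evaluation value; BLOCK, subject_map, and the relative 0.25 threshold are all assumptions):

```python
import numpy as np

BLOCK = 8   # block edge in pixels; the claims fix no particular size

def subject_map(img: np.ndarray) -> np.ndarray:
    """All six processes of claim 26 in sequence, for a greyscale image."""
    # Region dividing process: crop to a whole number of blocks, then split.
    h = (img.shape[0] // BLOCK) * BLOCK
    w = (img.shape[1] // BLOCK) * BLOCK
    img = img[:h, :w].astype(np.float64)
    blocks = [[img[r:r + BLOCK, c:c + BLOCK] for c in range(0, w, BLOCK)]
              for r in range(0, h, BLOCK)]
    n_rows, n_cols = len(blocks), len(blocks[0])
    # Setting process: templates from the blocks on the outer periphery.
    templates = [blocks[r][c] for r in range(n_rows) for c in range(n_cols)
                 if r in (0, n_rows - 1) or c in (0, n_cols - 1)]
    # Calculation + matching processes: the representative values are the raw
    # pixel values; each block's evaluation is its minimum SAD over templates.
    evals = np.array([[min(np.abs(blocks[r][c] - t).sum() for t in templates)
                       for c in range(n_cols)] for r in range(n_rows)])
    # Creation process: threshold into a binary subject-distribution map.
    return (evals > 0.25 * evals.max()).astype(np.uint8)

demo = np.zeros((64, 64)); demo[24:40, 24:40] = 255.0   # bright square "subject"
print(subject_map(demo))                                 # 1s where the square sits
```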
PCT/JP2011/000098 2010-01-13 2011-01-12 Image processing device, image capture device, and image processing program WO2011086901A1 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
JP2010-005072 2010-01-13
JP2010-199718 2010-09-07
JP2010-285095 2010-12-21

Publications (1)

Publication Number Publication Date
WO2011086901A1 (en) 2011-07-21

Family

ID=44304179

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2011/000098 WO2011086901A1 (en) 2010-01-13 2011-01-12 Image processing device, image capture device, and image processing program

Country Status (2)

Country Link
JP (1) JP5487126B2 (en)
WO (1) WO2011086901A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2014044580A (en) * 2012-08-27 2014-03-13 Casio Comput Co Ltd Pattern generating device, pattern generation method, program, pattern generating system, and server

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH11102437A (en) * 1997-09-25 1999-04-13 Canon Inc Image-pickup device and its method
JP2000172849A (en) * 1998-12-09 2000-06-23 Fujitsu Ltd Picture processor and pattern extracting device
JP2005267480A (en) * 2004-03-22 2005-09-29 Canon Inc Recognition object segmentation device and method
JP2007158941A (en) * 2005-12-07 2007-06-21 Sony Corp Object area detection apparatus and method, imaging device and program
JP2008131580A (en) * 2006-11-24 2008-06-05 Olympus Imaging Corp Imaging apparatus, and image processing method

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2642001B2 (en) * 1991-05-17 1997-08-20 Dainippon Screen Mfg. Co., Ltd. Image processing method
JPH09197575A (en) * 1996-01-19 1997-07-31 Fuji Photo Film Co Ltd Exposure deciding method and exposure controller
JP2007122101A (en) * 2005-10-24 2007-05-17 Matsushita Electric Ind Co Ltd Image processor and image processing method

Also Published As

Publication number Publication date
JP2012146179A (en) 2012-08-02
JP5487126B2 (en) 2014-05-07

Legal Events

Date Code Title Description
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 11732776; Country of ref document: EP; Kind code of ref document: A1)