WO2011013610A1 - Image processing apparatus and method, data processing apparatus and method, program, and recording medium - Google Patents
Image processing apparatus and method, data processing apparatus and method, program, and recording medium
- Publication number
- WO2011013610A1 (PCT/JP2010/062510)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- projection
- tensor
- frequency component
- sub
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformation in the plane of the image
- G06T3/40—Scaling the whole image or part thereof
- G06T3/4053—Super resolution, i.e. output image resolution higher than sensor resolution
Definitions
- The present invention relates to an image processing apparatus and method, a data processing apparatus and method, a program, and a recording medium, and more particularly to an image processing technique suitable for restoring, interpolating, enlarging, and encoding high-quality information that does not exist in unprocessed (low-quality) image data.
- As a method for generating a high-resolution output image from a low-resolution input image, a technique has been proposed in which pairs of low-resolution and high-resolution images are learned in advance for a large number of image contents, a conversion (projection) relationship from low-resolution information to high-resolution information is obtained, and an image containing high-resolution information is generated (restored) from a low-resolution input image using this projection relationship (Non-Patent Document 1).
- Such a conventional method can be divided into a learning step and a restoration step.
- In the learning step, the projection relationship between low-resolution information and high-resolution information is learned in advance by tensor singular value decomposition (TSVD) from a group of pairs of low-resolution images and high-resolution images referred to as a "learning image set".
- In the restoration step, an arbitrary low-resolution input image, whether or not it belongs to the learning image set, is projected onto a high-resolution image using the learned tensor.
- The modalities of the projective transformation (individual differences between persons, facial expression, image resolution, face orientation, illumination change, race, and so on) can be expressed as variations of the tensor, and restoration can be performed with high accuracy when the image is projected under conditions that satisfy the input conditions.
- However, the conventional technique has the problem that the input conditions for the projective transformation are strict; in particular, since the allowable range for illumination variation is narrow, if an image that does not satisfy the conditions is input, the restored image quality after projection deteriorates.
- One conceivable countermeasure is to add illumination variation as a modality of the projective transformation. However, adding a modality enlarges the projection function that defines the projection relationship and increases the processing time of the projective transformation.
- Such a problem is not limited to image processing; it also arises in various kinds of data processing that use the same projective transformation, such as speech recognition, language data processing, biological information processing, and natural/physical information processing.
- For example, in speech recognition, the sampling frequency of the speech data and the quantization bit depth can be modalities, but a learning eigenspace must then be prepared for each sampling frequency (48 kHz, 44.1 kHz, 32 kHz, and so on) and for each bit depth (16 bits, 8 bits, and so on).
- The present invention has been made in view of such circumstances, and an object thereof is to provide a highly robust image processing apparatus and method, and a program, that relax the input conditions on the conversion-source image and can obtain a good converted image even from an image in which illumination variation has occurred. Another object is to provide an image processing technique that can reduce the memory capacity used and increase the processing speed by reducing the processing load. A further object is to provide a data processing apparatus and method, a program, and a recording medium in which this image processing technique is extended to general data processing.
- An image processing apparatus according to a first aspect of the present invention comprises: eigenprojection matrix creating means for generating an eigenprojection matrix by a projection operation from a learning image group including at least one of an image pair in which high-frequency components of a first image quality image and a second image quality image having different image qualities are paired and an image pair in which high-frequency and medium-frequency components of those images are paired; projection kernel tensor creating means for creating a projection kernel tensor from the learning image group and the eigenprojection matrix; first sub-kernel tensor creating means for creating, from the projection kernel tensor, a first sub-kernel tensor corresponding to a condition specified by a first setting; second sub-kernel tensor creating means for creating, from the projection kernel tensor, a second sub-kernel tensor corresponding to a condition specified by a second setting; filter means for generating a low-frequency-component-suppressed image in which the high-frequency component, or the high-frequency and medium-frequency components, of an input image to be processed are extracted; first sub-tensor projection means for calculating a coefficient vector in an intermediate eigenspace by projecting the low-frequency-component-suppressed image by a first projection operation using the eigenprojection matrix and the first sub-kernel tensor; second sub-tensor projection means for generating a projected image from the low-frequency-component-suppressed image by projecting the calculated coefficient vector by a second projection operation using the second sub-kernel tensor and the eigenprojection matrix; image conversion means for generating, from the input image, a converted image having a different image quality; and adding means for adding the projected image and the converted image.
- According to this aspect, a high-quality output image can be obtained from a low-quality input image. Since the low-frequency component of the input image is suppressed before the quality-enhancement processing by tensor projection, the influence of image degradation caused by disturbances such as illumination fluctuation and noise contained in the low-frequency component can be removed from the output image, and robustness against such low-frequency disturbances (disturbance, noise, and the like) is increased.
- Furthermore, all of the eigenspaces that can be used for the learning image group can be assigned to the high-frequency components, or to the medium-frequency and high-frequency components.
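The following is a minimal sketch of the restoration flow claimed above; the helper callables (`high_pass`, `project_to_intermediate`, `project_to_high`, `upscale`) are illustrative names assumed for this sketch, not terms from the patent:

```python
def restore(input_image, high_pass, project_to_intermediate, project_to_high, upscale):
    """Hedged sketch of the claimed restoration flow; all callables are assumed."""
    # Filter means: suppress the low-frequency components (illumination, noise).
    low_freq_suppressed = high_pass(input_image)
    # First sub-tensor projection: low-quality pixels -> intermediate eigenspace.
    coeff_vector = project_to_intermediate(low_freq_suppressed)
    # Second sub-tensor projection: intermediate eigenspace -> high-quality pixels.
    projected_image = project_to_high(coeff_vector)
    # Image conversion means: conventional enlargement of the same input image.
    converted_image = upscale(input_image)
    # Adding means: combine the tensor-projected detail with the converted base.
    return converted_image + projected_image
```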
- A mode is preferable in which storage means stores the acquired eigenprojection matrix and projection kernel tensor. The storage means may be non-volatile storage such as a hard disk, optical disc, or memory card, temporary storage such as a RAM, or a combination thereof.
- The first setting can specify a projection relationship for projecting the first image quality image onto the intermediate eigenspace, and the second setting can specify a projection relationship for projecting the second image quality image onto the intermediate eigenspace.
- An image processing apparatus according to a second aspect of the present invention comprises: information acquisition means for acquiring an eigenprojection matrix generated by a projection operation from a learning image group including at least one of an image pair in which high-frequency components of a first image quality image and a second image quality image having different image qualities are paired and an image pair in which high-frequency and medium-frequency components of those images are paired, a first sub-kernel tensor corresponding to a condition specified by a first setting and generated using a projection kernel tensor created from the learning image group and the eigenprojection matrix, and a second sub-kernel tensor corresponding to a condition specified by a second setting and generated using the projection kernel tensor; filter means for generating a low-frequency-component-suppressed image in which the high-frequency component, or the high-frequency and medium-frequency components, of an input image to be processed are extracted; first sub-tensor projection means for calculating a coefficient vector in an intermediate eigenspace by projecting the low-frequency-component-suppressed image by a first projection operation using the eigenprojection matrix and the first sub-kernel tensor; second sub-tensor projection means for generating a projected image from the low-frequency-component-suppressed image by projecting the calculated coefficient vector by a second projection operation using the second sub-kernel tensor and the eigenprojection matrix; image conversion means for generating, from the input image, a converted image having a different image quality; and adding means for adding the projected image and the converted image.
- An image processing apparatus according to a third aspect is the image processing apparatus according to the first or second aspect, wherein the information acquisition means acquires an eigenprojection matrix generated by a projection operation from a learning image group including an image pair in which high-frequency components of the first image quality image and the second image quality image are paired, together with a projection kernel tensor created from the learning image group and the eigenprojection matrix; the filter means generates, as the low-frequency-component-suppressed image, a high-frequency component image obtained by extracting the high-frequency component of the input image; and the first sub-tensor projection means, which calculates a coefficient vector in the intermediate eigenspace by projecting the low-frequency-component-suppressed image by the first projection operation using the eigenprojection matrix and the first sub-kernel tensor, and the second sub-tensor projection means generate a high-frequency-component projected image from the high-frequency component image, thereby generating image information in a frequency region higher than the frequency region expressed in the input image.
- An image processing apparatus according to a fourth aspect of the present invention comprises: eigenprojection matrix generating means for generating an eigenprojection matrix by a projection operation from a learning image group including at least one of an image pair in which high-frequency components of a first image quality image and a second image quality image having different image qualities are paired and an image pair in which high-frequency and medium-frequency components of those images are paired; projection kernel tensor generating means for generating a projection kernel tensor that defines the correspondence between the high-frequency component, or the high-frequency and medium-frequency components, of the first image quality image and an intermediate eigenspace and the correspondence between the high-frequency component, or the high-frequency and medium-frequency components, of the second image quality image and the intermediate eigenspace; first sub-kernel tensor acquisition means for generating, from the generated projection kernel tensor, a first sub-kernel tensor corresponding to the condition specified by a first setting; second sub-kernel tensor acquisition means for generating, from the generated projection kernel tensor, a second sub-kernel tensor corresponding to the condition specified by a second setting; filter means for generating a low-frequency-component-suppressed image in which the high-frequency component, or the high-frequency and medium-frequency components, of an input image to be processed are extracted; first sub-tensor projection means for calculating a coefficient vector in the intermediate eigenspace by projecting the low-frequency-component-suppressed image by a first projection operation using the eigenprojection matrix and the first sub-kernel tensor; second sub-tensor projection means for generating a projected image from the low-frequency-component-suppressed image by projecting the calculated coefficient vector by a second projection operation using the second sub-kernel tensor and the eigenprojection matrix; image conversion means for generating, from the input image, a converted image having a different image quality; and adding means for adding the projected image and the converted image.
- An image processing apparatus according to a fifth aspect is the image processing apparatus according to the fourth aspect, wherein the eigenprojection matrix generating means generates the eigenprojection matrix by a projection operation from a learning image group including an image pair in which high-frequency components of the first image quality image and the second image quality image are paired; the projection kernel tensor generating means generates the projection kernel tensor from the learning image group and the eigenprojection matrix; the filter means generates a high-frequency component image obtained by extracting the high-frequency component of the input image; and the first sub-tensor projection means, which calculates a coefficient vector in the intermediate eigenspace by projecting the low-frequency-component-suppressed image by the first projection operation using the eigenprojection matrix and the first sub-kernel tensor, and the second sub-tensor projection means generate a high-frequency-component projected image from the high-frequency component image, thereby generating image information in a frequency region higher than the frequency region expressed in the input image.
- An image processing apparatus according to a sixth aspect is the image processing apparatus according to any one of the first to fifth aspects, wherein the high-frequency component and the medium-frequency component of the first image quality image are extracted by applying to the first image quality image the same processing as that of the filter means, and the high-frequency component and the medium-frequency component of the second image quality image are extracted by applying to the second image quality image the same processing as that of the filter means.
- According to this aspect, since the high-frequency or medium-frequency components of the learning image group used to generate the eigenprojection matrix and the first and second sub-kernel tensors, and the high-frequency or medium-frequency components of the input image to which these are applied, are extracted by the same processing, a projected image and a converted image suitable for addition by the adding means are generated.
- An image processing apparatus according to a seventh aspect is the image processing apparatus according to any one of the first to sixth aspects, further comprising weighting factor determining means for determining weighting factors for the projected image and the converted image to be added by the adding means.
- A mode in which the weighting factors are determined according to the reliability of restoration of the tensor projection processing is preferable.
- An image processing apparatus according to an eighth aspect is the image processing apparatus according to any one of the first to seventh aspects, wherein the filter means extracts, from the input image, components having frequencies equal to or higher than a frequency determined on the basis of the Nyquist frequency.
- In this case, the filter means functions as a high-frequency pass filter (high-pass filter).
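As an illustration of such filter means, a simple FFT-based sketch follows; the cutoff fraction relative to the Nyquist frequency is an assumed parameter, not a value given in the patent:

```python
import numpy as np

def fft_high_pass(image, cutoff_fraction=0.9):
    """Keep spatial frequencies at or above cutoff_fraction * Nyquist.

    cutoff_fraction is illustrative; the patent only says the threshold f1
    is set slightly below the Nyquist frequency.
    """
    f = np.fft.fftshift(np.fft.fft2(image))
    rows, cols = image.shape
    cy, cx = rows // 2, cols // 2
    y, x = np.ogrid[:rows, :cols]
    # Normalized radial frequency: 1.0 corresponds to the Nyquist frequency.
    radius = np.sqrt(((y - cy) / cy) ** 2 + ((x - cx) / cx) ** 2)
    f[radius < cutoff_fraction] = 0  # suppress the low-frequency region
    return np.real(np.fft.ifft2(np.fft.ifftshift(f)))
```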
- An image processing apparatus according to a ninth aspect is the image processing apparatus according to any one of the first to eighth aspects, wherein the first image quality image is the relatively low-quality image of the image pair, the second image quality image is the relatively high-quality image of the image pair, and the converted image is of higher quality than the input image.
- An image processing apparatus according to a tenth aspect is the image processing apparatus according to any one of the first to ninth aspects, wherein the first setting specifies a projection relationship for projecting the first image quality image onto the intermediate eigenspace, and the second setting specifies a projection relationship for projecting the second image quality image onto the intermediate eigenspace.
- An image processing apparatus according to an eleventh aspect is the image processing apparatus according to any one of the first to tenth aspects, wherein the projection operation is one of locality preserving projection (LPP), locally linear embedding (LLE), and local tangent space alignment (LTSA).
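A textbook sketch of locality preserving projection, one of the projection operations named above, follows; the neighborhood size, heat-kernel parameter, and regularization term are illustrative assumptions, not values from the patent:

```python
import numpy as np
from scipy.linalg import eigh
from scipy.spatial.distance import cdist

def lpp(X, n_components, n_neighbors=5, t=1.0):
    """Standard LPP sketch: X is (n_samples, n_features); returns the
    eigenprojection matrix U of shape (n_features, n_components)."""
    dist = cdist(X, X, "sqeuclidean")
    # k-nearest-neighbor affinity graph with a heat kernel.
    W = np.zeros_like(dist)
    idx = np.argsort(dist, axis=1)[:, 1:n_neighbors + 1]
    for i, neighbors in enumerate(idx):
        W[i, neighbors] = np.exp(-dist[i, neighbors] / t)
    W = np.maximum(W, W.T)  # symmetrize
    D = np.diag(W.sum(axis=1))
    L = D - W                # graph Laplacian
    # Generalized eigenproblem X^T L X u = lambda X^T D X u; locality is
    # preserved by the eigenvectors of the smallest eigenvalues.
    A = X.T @ L @ X
    B = X.T @ D @ X + 1e-6 * np.eye(X.shape[1])  # regularize for stability
    eigvals, U = eigh(A, B)
    return U[:, :n_components]
```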
- An image processing apparatus according to a twelfth aspect is the image processing apparatus according to any one of the first to eleventh aspects, wherein the learning image group includes image pairs targeting human faces, and the intermediate eigenspace is an individual difference eigenspace.
- An image processing apparatus according to a thirteenth aspect is the image processing apparatus according to any one of the first to twelfth aspects, further comprising first feature region specifying means for specifying a first feature region from the input image, and compression means for compressing the image portion of the first feature region of the input image at a first compression strength while compressing image portions other than the feature region at a second compression strength higher than the first compression strength.
- An image processing apparatus according to a fourteenth aspect is the image processing apparatus according to any one of the first to thirteenth aspects, wherein the projection operation includes a projection operation that uses local relationships.
- In this case, medium-frequency or high-frequency components that are easily lost in global methods such as PCA are more readily preserved, which yields the additional effect that the restored image quality may be further improved.
- An image processing method according to a fifteenth aspect of the present invention comprises: an eigenprojection matrix generating step of generating an eigenprojection matrix by a projection operation from a learning image group including at least one of an image pair in which high-frequency components of a first image quality image and a second image quality image having different image qualities are paired and an image pair in which high-frequency and medium-frequency components of those images are paired; a projection kernel tensor creating step of creating a projection kernel tensor from the learning image group and the eigenprojection matrix; a first sub-kernel tensor generating step of generating, from the projection kernel tensor, a first sub-kernel tensor corresponding to the condition specified by a first setting; a second sub-kernel tensor generating step of generating, from the projection kernel tensor, a second sub-kernel tensor corresponding to the condition specified by a second setting; a filtering step of generating a low-frequency-component-suppressed image in which the high-frequency component, or the high-frequency and medium-frequency components, of an input image to be processed are extracted; a first sub-tensor projection step of calculating a coefficient vector in an intermediate eigenspace by projecting the low-frequency-component-suppressed image by a first projection operation using the eigenprojection matrix and the first sub-kernel tensor; a second sub-tensor projection step of generating a projected image from the low-frequency-component-suppressed image by projecting the calculated coefficient vector by a second projection operation using the second sub-kernel tensor and the eigenprojection matrix; an image conversion step of generating, from the input image, a converted image having a different image quality; and an addition step of adding the projected image and the converted image.
- An image processing method according to a sixteenth aspect of the present invention comprises: an information acquisition step of acquiring an eigenprojection matrix generated by a projection operation from a learning image group including at least one of an image pair in which high-frequency components of a first image quality image and a second image quality image having different image qualities are paired and an image pair in which high-frequency and medium-frequency components of those images are paired, a first sub-kernel tensor corresponding to a condition specified by a first setting and generated using a projection kernel tensor created from the learning image group and the eigenprojection matrix, and a second sub-kernel tensor corresponding to a condition specified by a second setting and generated using the projection kernel tensor; together with filtering, first and second sub-tensor projection, image conversion, and addition steps corresponding to those of the fifteenth aspect.
- An image processing method according to a seventeenth aspect of the present invention comprises: an eigenprojection matrix generating step of generating an eigenprojection matrix by a projection operation from a learning image group including at least one of an image pair in which high-frequency components of a first image quality image and a second image quality image having different image qualities are paired and an image pair in which high-frequency and medium-frequency components of those images are paired; a projection kernel tensor generating step of generating a projection kernel tensor that defines the correspondence between the high-frequency component of the first image quality image and an intermediate eigenspace and the correspondence between the high-frequency component of the second image quality image and the intermediate eigenspace; a first sub-kernel tensor acquisition step of generating, from the generated projection kernel tensor, a first sub-kernel tensor corresponding to the condition specified by a first setting; a second sub-kernel tensor acquisition step of generating, from the generated projection kernel tensor, a second sub-kernel tensor corresponding to the condition specified by a second setting; a filtering step of generating a low-frequency-component-suppressed image in which the high-frequency component, or the high-frequency and medium-frequency components, of an input image to be processed are extracted; a first sub-tensor projection step of calculating a coefficient vector in the intermediate eigenspace by projecting the low-frequency-component-suppressed image by a first projection operation using the eigenprojection matrix and the first sub-kernel tensor; a second sub-tensor projection step of generating a projected image from the low-frequency-component-suppressed image by projecting the calculated coefficient vector by a second projection operation using the second sub-kernel tensor and the eigenprojection matrix; an image conversion step of generating, from the input image, a converted image having a different image quality; and an addition step of adding the projected image and the converted image.
- An image processing method according to an eighteenth aspect of the present invention is the image processing method according to any one of the fifteenth to seventeenth aspects, wherein the projection operation includes a projection operation that uses local relationships.
- A program according to a nineteenth aspect of the present invention causes a computer to function as: eigenprojection matrix creating means for generating an eigenprojection matrix by a projection operation from a learning image group including at least one of an image pair in which high-frequency components of a first image quality image and a second image quality image having different image qualities are paired and an image pair in which high-frequency and medium-frequency components of those images are paired; projection kernel tensor creating means for creating a projection kernel tensor from the learning image group and the eigenprojection matrix; first and second sub-kernel tensor creating means for creating, from the projection kernel tensor, first and second sub-kernel tensors corresponding to the conditions specified by first and second settings; filter means for generating a low-frequency-component-suppressed image from the input image to be processed; first sub-tensor projection means for calculating a coefficient vector in an intermediate eigenspace by a first projection operation; second sub-tensor projection means for generating a projected image from the low-frequency-component-suppressed image by projecting by a second projection operation; image conversion means for generating, from the input image, a converted image having a different image quality; and adding means for adding the projected image and the converted image.
- A program according to a twentieth aspect of the present invention causes a computer to function as: information acquisition means for acquiring an eigenprojection matrix generated by a projection operation from a learning image group including at least one of an image pair in which high-frequency components of a first image quality image and a second image quality image having different image qualities are paired and an image pair in which high-frequency and medium-frequency components of those images are paired, a first sub-kernel tensor corresponding to a condition specified by a first setting and generated using a projection kernel tensor created from the learning image group and the eigenprojection matrix, and a second sub-kernel tensor corresponding to a condition specified by a second setting and generated using the projection kernel tensor; filter means for generating a low-frequency-component-suppressed image in which the high-frequency component, or the high-frequency and medium-frequency components, of an input image to be processed are extracted; first sub-tensor projection means for calculating a coefficient vector in an intermediate eigenspace by projecting the low-frequency-component-suppressed image by a first projection operation using the eigenprojection matrix and the first sub-kernel tensor; second sub-tensor projection means for generating a projected image from the low-frequency-component-suppressed image by projecting the calculated coefficient vector by a second projection operation using the second sub-kernel tensor and the eigenprojection matrix; image conversion means for generating, from the input image, a converted image having a different image quality; and adding means for adding the projected image and the converted image.
- A program according to a twenty-first aspect of the present invention causes a computer to function as: eigenprojection matrix generating means for generating an eigenprojection matrix by a projection operation from a learning image group including at least one of an image pair in which high-frequency components of a first image quality image and a second image quality image having different image qualities are paired and an image pair in which high-frequency and medium-frequency components of those images are paired; projection kernel tensor generating means for generating a projection kernel tensor that defines the correspondence between the high-frequency component, or the high-frequency and medium-frequency components, of the first image quality image and an intermediate eigenspace and the correspondence between the high-frequency component, or the high-frequency and medium-frequency components, of the second image quality image and the intermediate eigenspace; first sub-kernel tensor acquisition means for generating, from the generated projection kernel tensor, a first sub-kernel tensor corresponding to the condition specified by a first setting; second sub-kernel tensor acquisition means for generating, from the generated projection kernel tensor, a second sub-kernel tensor corresponding to the condition specified by a second setting; filter means for generating a low-frequency-component-suppressed image in which the high-frequency component, or the high-frequency and medium-frequency components, of an input image to be processed are extracted; first sub-tensor projection means for calculating a coefficient vector in the intermediate eigenspace by projecting the low-frequency-component-suppressed image by a first projection operation using the eigenprojection matrix and the first sub-kernel tensor; second sub-tensor projection means for generating a projected image from the low-frequency-component-suppressed image by projecting the calculated coefficient vector by a second projection operation using the second sub-kernel tensor and the eigenprojection matrix; image conversion means for generating, from the input image, a converted image having a different image quality; and adding means for adding the projected image and the converted image.
- A program according to a twenty-second aspect of the present invention is the program according to any one of the nineteenth to twenty-first aspects, wherein the projection operation includes a projection operation that uses local relationships.
- A data processing apparatus according to a twenty-third aspect of the present invention comprises: information acquisition means for acquiring an eigenprojection matrix generated by a projection operation from a learning data group including a data pair in which at least medium-frequency or high-frequency components of data of a first condition and data of a second condition, the conditions being different, are paired, and a first sub-kernel tensor created, as corresponding to the condition specified by a first setting, from a projection kernel tensor that is created from the learning data group and the eigenprojection matrix and that defines the correspondence between the data of the first condition and an intermediate eigenspace and the correspondence between the data of the second condition and the intermediate eigenspace; filter means for generating low-frequency-component-suppressed input data in which the high-frequency components, or the high-frequency and medium-frequency components, of input data to be processed are extracted; and first sub-tensor projection means for calculating a coefficient vector in the intermediate eigenspace by projecting the low-frequency-component-suppressed input data by a first projection operation using the eigenprojection matrix acquired by the information acquisition means and the first sub-kernel tensor.
- A data processing apparatus according to a twenty-fourth aspect likewise comprises: information acquisition means for acquiring an eigenprojection matrix generated by a projection operation from a learning data group including a data pair in which at least medium-frequency or high-frequency components of data of a first condition and data of a second condition, the conditions being different, are paired, and a first sub-kernel tensor created, as corresponding to the condition specified by a first setting, from a projection kernel tensor that defines the correspondence between the data of the first condition and an intermediate eigenspace and the correspondence between the data of the second condition and the intermediate eigenspace; filter means for generating low-frequency-component-suppressed input data in which the high-frequency components, or the high-frequency and medium-frequency components, of input data to be processed are extracted; and first sub-tensor projection means for calculating a coefficient vector in the intermediate eigenspace by projecting the low-frequency-component-suppressed input data by a first projection operation using the eigenprojection matrix acquired from the information acquisition means and the first sub-kernel tensor.
- A data processing apparatus according to a twenty-fifth aspect is the data processing apparatus according to the twenty-third or twenty-fourth aspect, wherein the projection operation includes a projection operation that uses local relationships.
- A data processing method according to a twenty-sixth aspect of the present invention comprises: an information acquisition step of acquiring an eigenprojection matrix generated by a projection operation from a learning data group including a data pair in which at least medium-frequency or high-frequency components of data of a first condition and data of a second condition, the conditions being different, are paired, and a first sub-kernel tensor created, as corresponding to the condition specified by a first setting, from a projection kernel tensor that is created from the learning data group and the eigenprojection matrix and that defines the correspondence between the data of the first condition and an intermediate eigenspace and the correspondence between the data of the second condition and the intermediate eigenspace; a filtering step of generating low-frequency-component-suppressed input data in which the high-frequency components, or the high-frequency and medium-frequency components, of input data to be processed are extracted; and a first sub-tensor projection step of calculating a coefficient vector in the intermediate eigenspace by projecting the low-frequency-component-suppressed input data by a first projection operation using the eigenprojection matrix acquired in the information acquisition step and the first sub-kernel tensor.
- A data processing method according to a twenty-seventh aspect likewise comprises: an information acquisition step of acquiring an eigenprojection matrix generated by a projection operation from a learning data group including a data pair in which at least medium-frequency or high-frequency components of data of a first condition and data of a second condition, the conditions being different, are paired, and a first sub-kernel tensor created, as corresponding to the condition specified by a first setting, from a projection kernel tensor that defines the correspondence between the data of the first condition and an intermediate eigenspace and the correspondence between the data of the second condition and the intermediate eigenspace; a filtering step of generating low-frequency-component-suppressed input data in which the high-frequency components, or the high-frequency and medium-frequency components, of input data to be processed are extracted; and a first sub-tensor projection step of calculating a coefficient vector in the intermediate eigenspace by projecting the low-frequency-component-suppressed input data by a first projection operation using the eigenprojection matrix acquired in the information acquisition step and the first sub-kernel tensor.
- A data processing method according to a twenty-eighth aspect is the data processing method according to the twenty-sixth or twenty-seventh aspect, wherein the projection operation includes a projection operation that uses local relationships.
- A program according to a twenty-ninth aspect of the present invention causes a computer to function as: information acquisition means for acquiring an eigenprojection matrix generated by a projection operation from a learning data group including a data pair in which at least medium-frequency or high-frequency components of data of a first condition and data of a second condition, the conditions being different, are paired, and a first sub-kernel tensor created, as corresponding to the condition specified by a first setting, from a projection kernel tensor that is created from the learning data group and the eigenprojection matrix and that defines the correspondence between the data of the first condition and an intermediate eigenspace and the correspondence between the data of the second condition and the intermediate eigenspace; filter means for generating low-frequency-component-suppressed input data in which the high-frequency components, or the high-frequency and medium-frequency components, of input data to be processed are extracted; and first sub-tensor projection means for calculating a coefficient vector in the intermediate eigenspace by projecting the low-frequency-component-suppressed input data by a first projection operation using the eigenprojection matrix acquired from the information acquisition means and the first sub-kernel tensor.
- A program according to a thirtieth aspect likewise causes a computer to function as such information acquisition means, filter means, and first sub-tensor projection means for a learning data group including a data pair in which at least medium-frequency or high-frequency components of data of a first condition and data of a second condition, the conditions being different, are paired.
- A program according to a thirty-first aspect of the present invention is the program according to the twenty-ninth or thirtieth aspect, wherein the projection operation includes a projection operation that uses local relationships.
- According to these aspects, because locality is preserved when projecting from the first eigenspace (for example, the pixel eigenspace), via an "orientation" modality having one or more conditions, onto the common second eigenspace (the "intermediate eigenspace", for example an individual difference eigenspace), the projection results have the property of gathering at approximately one point in the second eigenspace.
- Consequently, the conditions that determine the positional relationship ("closeness") between a learning sample and an input sample in the second eigenspace need not be prepared for each orientation condition (front, left, right, and so on); one or more such conditions can be handled by a single criterion, and robustness can be achieved by suppressing specific components, such as the low-frequency components, that contain disturbances and noise. Highly accurate and robust processing can therefore be performed, and effects such as higher processing speed and reduced memory capacity are obtained.
- a recording medium according to a thirty-second aspect of the present invention is a recording medium on which a program according to any of the nineteenth to twenty-second aspects and the twenty-ninth to thirty-first aspects is recorded.
- For each means (step), such as the filter means (step), in the data processing apparatus, method, and program according to the twenty-third to thirty-first aspects, means (steps) similar to the corresponding ones in the image processing apparatus, method, and program according to the first to twenty-second aspects can be applied. Aspects that add means similar to those of the fourth to thirteenth aspects, or steps corresponding to those means, are also possible, and the program recorded on the recording medium may likewise be an aspect to which such means are added.
- According to the present invention, a high-quality output image can be obtained from a low-quality input image. Since the quality-enhancement processing by tensor projection is performed while the low-frequency component of the input image is suppressed, the influence of image degradation caused by disturbances such as illumination fluctuation and noise contained in the low-frequency component can be removed from the output image, and robustness against such low-frequency disturbances (disturbance, noise, and the like) can be improved.
- Furthermore, all of the eigenspaces that can be used for the learning image group can be assigned to the high-frequency components, or to the medium-frequency and high-frequency components, making it possible to obtain a highly accurate and robust restored image with fewer learning samples.
- FIG. 1 is a conceptual diagram of tensor projection
- FIG. 2 is an explanatory diagram of the principle of applying tensor projection to super-resolution image conversion
- FIG. 3A is a block diagram showing an outline of processing in the image processing apparatus according to an embodiment of the present invention
- FIG. 3B shows the frequency characteristics of the input image
- FIG. 3C shows the frequency characteristics of the input image after passing through the high-pass filter
- FIG. 3D shows the frequency characteristics of the output image
- FIG. 4 is an explanatory diagram exemplifying that the change on the LPP eigenspace (here, the individual difference eigenspace) has a property close to linearity
- FIG. 5A is an example of an LPP projection distribution of an image sample (low resolution) represented in a two-dimensional subspace
- FIG. 5B is an example of an LPP projection distribution of an image sample (high resolution) represented in a two-dimensional subspace;
- FIG. 6 is a block diagram showing the configuration of the image processing apparatus according to the embodiment of the present invention;
- FIG. 7A is a conceptual diagram of projection by principal component analysis (PCA);
- FIG. 7B is a conceptual diagram of projection by singular value decomposition (SVD);
- FIG. 8 is a conceptual diagram showing the effect of redundancy deletion by learning set representative value conversion;
- FIG. 9 is a diagram illustrating an example of weights determined in association with the distance from the concealment candidate position;
- FIG. 10 is a conceptual diagram showing the relationship between a learning image vector group and an unknown image vector on an individual difference eigenspace;
- FIG. 11 is a diagram showing an example of weights determined in association with the distance from the learning set;
- FIG. 12 is a block diagram showing the configuration of an image processing apparatus according to another embodiment of the present invention;
- FIG. 13 is a block diagram showing an example of an image processing system according to an embodiment of the present invention;
- FIG. 14 is a block diagram showing a configuration example of the image processing apparatus 220 in FIG. 13;
- FIG. 15 is a block diagram showing a configuration example of the feature region specifying unit 226 in FIG. 14;
- FIG. 16 is an explanatory diagram showing an example of processing for specifying a feature region from within an image;
- FIG. 17 is an explanatory diagram showing another example of processing for specifying a feature region from within an image;
- FIG. 18 is an explanatory diagram showing an example of a feature region determination process by the second feature region specifying unit 620 in FIG. 15;
- FIG. 19 is a block diagram showing a configuration example of the compression unit 232 in FIG. 14;
- FIG. 20 is a block diagram showing another configuration example of the compression unit 232;
- FIG. 21 is a block diagram illustrating a configuration example of the image processing apparatus 250 in FIG. 13;
- FIG. 22 is a block diagram illustrating a configuration example of the image processing unit 330 in FIG. 21;
- FIG. 23 is a diagram showing an example of parameters stored in the parameter storage unit 1010 in FIG. 22 in a table format;
- FIG. 24 is a diagram showing an example of weighting of specific parameters
- FIG. 25 is a block diagram showing a configuration example of the display device 260 in FIG. 13
- FIG. 26 is a diagram showing an example of an image display area
- FIG. 27 is a configuration diagram illustrating an example of an image processing system according to another embodiment.
- the present invention can be applied to various uses.
- a case where a human face image is handled and a high-quality image is restored from a low-quality input image will be described as an example.
- As a "learning image set", a learning image group is prepared in which low-resolution and high-resolution face images of a plurality of persons (for example, 60 persons) are paired.
- The learning image set used here employs, as the low-resolution learning images, images obtained by reducing information from the high-resolution learning images under fixed conditions, for example by thinning out pixels at a fixed rate. By learning in advance the correspondence between the low-resolution learning images generated by this information reduction and the original high-resolution learning images (images of the same person with the same content), a conversion function (a tensor defining the projection) is generated.
- The size (number of pixels) of the target image and the gradation representing its density are not particularly limited.
- Here, a case will be described in which the number of pixels of the high-resolution image (hereinafter sometimes abbreviated as "H image") is 64 × 48, the number of pixels of the low-resolution image (hereinafter sometimes abbreviated as "L image") is 32 × 24, and each pixel has an 8-bit density value (pixel value) of 0 to 255 gradations.
- The learning data of the L images are used after being enlarged by an appropriate method to match the number of pixels of the H images. In this way, since the L image and the H image have the same number of pixels, the correspondence (positional relationship) of the pixels can be determined in a one-to-one relationship, and the input space and the output space can be handled as the same space (coordinate axes), which is convenient for calculation.
- the learning image set can include images of various modalities.
- the face direction is assumed to be the front and the facial expression is assumed to be a standard expressionless expression (“normal”).
- In this embodiment, one image is divided into squares in region units of a predetermined number of pixels (for example, 8 × 8 pixels), and calculation processing is performed for each of these divided blocks (hereinafter referred to as "patches"). That is, the number of pixels per patch × the number of patches (the number of divisions) is the total number of pixels to be processed for one image.
- For example, a 64 × 48 pixel image is divided in units of 8 × 8 pixels into 8 × 6 = 48 patches.
- The patch size, the number of divisions, and the division form are not particularly limited; a mode in which adjacent patches are divided with a predetermined amount of pixel overlap is possible, and a mode in which processing is performed in units of one whole image without patch division is also possible.
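A minimal sketch of such patch division, assuming a row-major grayscale array (so a 64 × 48 pixel image has shape (48, 64)); the overlap parameter is illustrative:

```python
import numpy as np

def divide_into_patches(image, patch=8, overlap=0):
    """Divide an image into square patches; e.g. a 64x48 image (shape (48, 64))
    with patch=8 and no overlap yields 6 x 8 = 48 patches."""
    step = patch - overlap
    patches = []
    for top in range(0, image.shape[0] - patch + 1, step):
        for left in range(0, image.shape[1] - patch + 1, step):
            patches.append(image[top:top + patch, left:left + patch])
    return np.stack(patches)
```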
- In addition to resolution, the learning image set can include various other modalities: for example, the face orientation may have 10 patterns in which the direction is changed from right through front to left, the facial expression may have 4 patterns (normal, smile, anger, crying), and the illumination direction may have 5 patterns in which the direction is changed in 45-degree steps (see Table 2).
- Tables 1 and 2 are merely examples, and other modalities such as race, gender, and age may be added or replaced with other modalities.
- The number of modality types corresponds to the order (number of modes) of the kernel tensor G that defines the projection relationship described later (in the case of Table 1, a fourth-order tensor), and the product of the numbers of dimensions of the modalities is the number of components of the kernel tensor G. In the case of Table 1, the number of components (size) of the kernel tensor G is 8 × 8 × 2 × 48 × 60.
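A quick check of that component count:

```python
# Modality dimensions from Table 1: 8x8 pixels per patch, 2 resolutions (L/H),
# 48 patch positions, 60 persons.
print(8 * 8 * 2 * 48 * 60)  # 368640 components in the kernel tensor G
```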
- FIG. 1 is a conceptual diagram of tensor projection.
- Here, movement (projection) between a real space R and a plurality of eigenspaces (also referred to as "feature spaces") A, B, and C is considered.
- The projection relationship from the real space R to the eigenspace A is represented by a tensor U;
- the projection relationship between the eigenspaces A and B is represented by a tensor G1 or G1^(-1);
- the projection relationship between the eigenspaces B and C is represented by a tensor G2 or G2^(-1);
- the projection relationship between the eigenspaces C and A is represented by a tensor G3 or G3^(-1).
- FIG. 2 shows the conversion (restoration) of a low-resolution image into a high-resolution image using projections among the pixel real space, the pixel eigenspace, and the individual difference (person-feature) eigenspace.
- Image data has a numerical value (pixel value) representing the density of each pixel, and can therefore be grasped as a coefficient vector in a multidimensional space whose axes represent the density value (pixel value) at each pixel position.
- For example, the low-resolution face image data of a certain Mr. A is plotted as a point P_LA in the pixel real space: the coefficient vector (x1, x2, x3, ...) of Mr. A's low-resolution face image data takes a value x1 between 0 and 255 on the axis of the first basis component e1, and similarly takes values x2 and x3 between 0 and 255 on the axes of the second basis component e2 and the third basis component e3, so the image data is represented as the point P_LA in the pixel real space. Similarly, Mr. A's high-resolution face image data is plotted as a certain point P_HA in the pixel real space.
- The purpose of the conversion here is to take the point of a certain low-resolution image in the pixel real space (for example, the low-resolution point P_LA) and move it to a high-resolution point (P_HA′).
- The conversion process first projects the image from the pixel real space R of FIG. 2(a) to the pixel eigenspace A by U_pixels^(-1), using the linear eigenprojection matrix U_pixels obtained by a dimensionality reduction technique typified by locality preserving projection (LPP) (FIG. 2(b)).
- The axes (bases) of the pixel eigenspace A correspond to the feature axes (eigenvectors) of the dimensionality reduction method, and this projection can be grasped as a rotation of the coordinate system that converts the axes of the pixel real space R into the axes of the pixel eigenspace A.
- Next, the point is projected from the pixel eigenspace A onto the individual difference eigenspace B using the projection function G_L^(-1), which defines the correspondence between the low-resolution image and the individual difference eigenspace (FIG. 2(c)).
- In the individual difference eigenspace, the point of the low-resolution image and the point of the high-resolution image of the same person can be plotted at substantially the same position.
- Utilizing this property, the projection function G_H, which defines the correspondence between the high-resolution image and the individual difference eigenspace, is used for the return path (FIG. 2(d)), followed by U_pixels back to the pixel real space (FIG. 2(e)).
- That is, for a low-resolution pixel vector L, the high-resolution pixel vector H in the pixel real space is obtained by the following equation: H = U_pixels · G_H · G_L^(-1) · U_pixels^(-1) · L.
- In the learning step, the projection function (the eigenprojection matrix U_pixels) is obtained from a learning image set consisting of pairs of low-resolution and high-resolution images using locality preserving projection (LPP), and based on this, the projection functions G_L and G_H are obtained so that the L image point and the H image point of the same person substantially coincide in the individual difference eigenspace.
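Under the simplifying assumption that each learned projection is available as a matrix (the patent uses sub-kernel tensors, so this is a sketch of the route of FIG. 2, not the exact formulation):

```python
import numpy as np

def project_L_to_H(L_vec, U_pixels, G_L, G_H):
    """Follow the projection route of FIG. 2 for one pixel vector.

    L_vec:    low-resolution pixel vector in the pixel real space
    U_pixels: eigenprojection matrix (pixel real space <-> pixel eigenspace)
    G_L, G_H: projections between the pixel eigenspace and the individual
              difference eigenspace for the L and H images, respectively
    """
    pixel_eigen = np.linalg.pinv(U_pixels) @ L_vec   # (a) -> (b)
    person_vec = np.linalg.pinv(G_L) @ pixel_eigen   # (b) -> (c)
    high_eigen = G_H @ person_vec                    # (c) -> (d)
    return U_pixels @ high_eigen                     # (d) -> (e)
```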
- In this embodiment, LPP projection will be described as an example, but other projection methods such as principal component analysis (PCA) can also be applied instead of LPP projection.
- FIG. 3A is a block diagram showing an outline of processing in the embodiment of the present invention. As illustrated, the processing according to the present embodiment can be broadly divided into a learning step and a restoration step.
- In the learning step, a learning image group in which low-quality images and high-quality images are paired is input (#10), and a high-pass filter is applied to this image group to extract the high-frequency components of the learning image set (low-quality and high-quality images) (#11).
- Then, a projection tensor is generated by applying a dimensionality reduction technique such as locality preserving projection (LPP) to the high-frequency components of the input images (#12).
- The medium-frequency components may be extracted together with the high-frequency components; that is, the high-frequency components, or the high-frequency and medium-frequency components, of the input learning image set are extracted to obtain a learning image set in which the low-frequency components are suppressed.
- By this projection tensor generation, an eigenprojection matrix (#14) is generated, and a projection kernel tensor (#16) defining the correspondence between the low-quality images and the intermediate eigenspace and the correspondence between the high-quality images and the intermediate eigenspace is generated.
- LPP performs a coordinate transformation that preserves the local closeness of samples (information on the geometric distance between neighboring samples) in the original space (here, the pixel real space); the coordinate axes are determined so that samples that are close in the original space are embedded close to each other in the projection-destination space (eigenspace).
- In the case of Table 1, LPP eigenprojection matrices U_j = {U_1, U_2, U_3, ..., U_64} corresponding to the patch-position dimension (64 dimensions in the case of Table 1) are obtained.
- In this way, an eigenprojection matrix U is obtained from the viewpoint of each modality, such as pixel, resolution, and patch position, and each component of the projection kernel tensor G is obtained using these matrices U; the set of these constitutes the projection kernel tensor G.
- In LPP, the feature axes are arranged in ascending order of eigenvalue; therefore, by using only the top feature axes with a high degree of influence, the dimensionality can be reduced and the size of the kernel tensor can be greatly reduced.
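A sketch of how the kernel tensor could be assembled from the modality eigenprojection matrices via mode-n products (a generic HOSVD-style core-tensor computation, not necessarily the patent's exact procedure):

```python
import numpy as np

def mode_n_product(tensor, matrix, mode):
    """Mode-n product: multiply `matrix` into the given mode of `tensor`."""
    t = np.moveaxis(tensor, mode, 0)
    shape = t.shape
    t = matrix @ t.reshape(shape[0], -1)
    return np.moveaxis(t.reshape((matrix.shape[0],) + shape[1:]), 0, mode)

def kernel_tensor(data_tensor, eigenprojection_matrices):
    """Project the learning-data tensor onto every modality eigenspace.

    eigenprojection_matrices[k] has shape (dim_of_mode_k, n_components_k);
    truncating its columns to the top feature axes shrinks the core tensor G.
    """
    G = data_tensor
    for mode, U in enumerate(eigenprojection_matrices):
        G = mode_n_product(G, U.T, mode)
    return G
```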
- In the restoration step, a low-quality image to be converted is input (#20), and information specifying the patch position to be processed and information setting the distinction between the L image and the H image are given (#22).
- The projection kernel tensor (#16) is created based on all the eigenvectors corresponding to each modality and is an aggregate including projection components for all modalities, so the components to be used for restoration processing must be extracted from the tensor. For example, by fixing the condition that the "individual difference" eigenspace is used as the intermediate eigenspace (the space at the turning point of the projection route described in FIG. 2), the corresponding sub-kernel tensors G_L and G_H can be extracted. The processing up to the generation of the sub-kernel tensors actually used may be included in the "learning step".
- The input low-quality image (#20) is subjected to high-frequency component extraction processing using a high-pass filter (#21). This applies the same processing as the high-frequency component extraction step (#11) in the learning step; for example, the same frequency components as those extracted from the learning image set are extracted from the input image. That is, the high-frequency component extraction step in the restoration step extracts the same frequency components as those of the learning image set on which the eigenprojection matrix and the projection kernel tensor are based.
- The characteristic denoted by reference numeral 20 in FIG. 3B illustrates the relationship between spatial frequency and response (gain) in the input image (the frequency characteristic of the input image).
- The input image has spatial frequencies up to f2, and an illumination variation factor is contained in the low-frequency region (for example, the region below f1).
- The characteristic denoted by reference numeral 21 in FIG. 3C is the frequency characteristic of the low-frequency-component-suppressed image obtained by extracting the high-frequency components from the input image (#20 in FIG. 3A); here, processing that cuts frequency components below f1 has been applied to an input image having the frequency characteristic shown in FIG. 3B.
- In the first sub-tensor projection step (#30), the low-frequency-component-suppressed image is projected using the eigenprojection matrix and the first sub-kernel tensor to calculate a coefficient vector in the intermediate eigenspace; this corresponds to the projection along the path (a) → (b) → (c) of FIG. 2.
- The calculated coefficient vector is then projected in the second sub-tensor projection step (#34) using the second sub-kernel tensor and the eigenprojection matrix; this corresponds to the projection along the path (c) → (d) → (e) of FIG. 2.
- Meanwhile, the input image is enlarged to the output size by a conventional enlargement process; the frequency characteristic of this enlarged image is shown in FIG. 3D.
- Then, the enlarged image and the projected image generated by the tensor projection are added: by adding to the enlarged image the projected image, in which the high-frequency components of the input image have been converted to high quality by tensor projection, a restored image (high-quality image, #36) is generated.
- FIG. 3D illustrates an example of the frequency characteristic of the high-quality image denoted by #36 in FIG. 3A. The characteristic indicated by reference numeral 20′ is the frequency characteristic of the enlarged image, and the characteristic indicated by reference numeral 35 is that of the projected image; by adding the two, an output image (high-quality image, #36) having the frequency characteristic shown by the solid line is obtained.
- Here, f1′ is the frequency in the output image corresponding to the threshold f1 in the input image, and one method is to set f1 on the basis of the Nyquist frequency of the sampling theorem. That is, by performing the high-frequency component extraction processing on the input image with a threshold f1 slightly lower than the Nyquist frequency, the image-quality degradation factors contained in the low-frequency components of the input image can be removed and a preferable high-quality image can be restored.
- The boundary of the frequency region extracted from the input image (and the learning image set) may be the so-called cutoff frequency (the frequency at which the response is −3 dB), or may be set as appropriate according to the input image or the output image.
- It is preferable that the enlarged image and the projected image are weighted, using weighting factors determined with the reliability of the projected image as an index, and then added. For example, when the reliability of the projected image is low, the weighting factors may be determined so as to increase the adoption ratio of the enlarged image; it is still more preferable to determine the weighting factors in consideration of the frequency characteristics.
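A minimal sketch of such reliability-weighted addition, treating the projected image as a detail layer and `reliability` as a hypothetical scalar or per-pixel confidence in [0, 1] (the patent does not specify a formula):

```python
import numpy as np

def blend(converted_image, projected_image, reliability):
    """Add the tensor-projected detail, scaled by its estimated reliability;
    low reliability effectively increases the adoption ratio of the
    enlarged (converted) image."""
    w = np.clip(reliability, 0.0, 1.0)
    return converted_image + w * projected_image
```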
- The storage means may be a semiconductor storage element such as a memory, or any of various storage media (elements) such as a magnetic storage medium (for example, an HDD) or an optical storage medium; it may be incorporated in the apparatus, or removable from the apparatus like a memory card.
- The step of generating the projection tensor in FIG. 3A (#12) and its calculation means correspond to the "eigenprojection matrix generating means (step)" and the "projection kernel tensor creating means (step)". The step of generating the first sub-kernel tensor (#24) and its calculation means correspond to the "first sub-kernel tensor creating means (step)", and the step of generating the second sub-kernel tensor (#26) and its calculation means correspond to the "second sub-kernel tensor creating means (step)".
- The low-quality image to be converted (#20) corresponds to the "input image", and the high-frequency component extraction step using the high-pass filter (#21) corresponds to the "filter means (step)".
- The first sub-tensor projection step (#30) and its calculation means correspond to the "first sub-tensor projection means (step)", the second sub-tensor projection step (#34) and its calculation means correspond to the "second sub-tensor projection means (step)", and the high-frequency projected image obtained by the second sub-tensor projection (#34) corresponds to the "projected image".
- the adding step (# 60) of adding the enlarged image and the projected image corresponds to “adding means (step)”.
- In the above description, image processing has been described that removes from the output image the degradation factor of the restored image caused by illumination variation contained in the low-frequency components of the input image; however, this image processing method can also be applied to factors other than illumination variation.
- For example, by suppressing the medium-frequency region of the input image, applying to that region quality-enhancement processing by a method other than tensor projection (for example, enlargement processing), applying quality-enhancement processing by the tensor projection method to the other frequency regions, and adding the two images generated by these processes, image-quality degradation factors present in a given frequency region can be removed from the output image.
- FIG. 4 shows an example in which a change in a modality (here, individual difference) on the LPP eigenspace has a property close to linear.
- When face image samples of Mr. A, Mr. B, Mr. C, and Mr. D are converted by LPP, the change between Mr. A and Mr. B in the figure is almost linear, changing smoothly (continuously) in the individual difference eigenspace.
- Therefore, an unknown input image other than the learning image samples can be expressed approximately well using the vector group of the learning image samples in the LPP eigenspace (the projection scheme combining LPP with high-order singular value decomposition is referred to below as LPP_HOSVD). This is one of the advantages of using the LPP projective transformation system (Advantage 1).
- FIG. 5A shows the LPP projection distribution of a low-resolution image sample in a two-dimensional subspace
- FIG. 5B shows the LPP projection distribution of a high-resolution image sample in a two-dimensional subspace.
- It is known that the topology of the low-resolution distribution (FIG. 5A) and the topology of the high-resolution distribution (FIG. 5B) of the learning image sample vector group on the LPP eigenspace remain highly correlated even though the two eigenspaces are learned separately.
- FIG. 6 is a block diagram showing the configuration of the image processing apparatus 100 according to the embodiment of the present invention.
- the blocks of the processing units that contribute to the processing of each step are illustrated along the flow of processing, divided into a learning step and a restoration step.
- The image processing apparatus 100 includes a low-resolution enlargement processing unit 102, a high-pass filter 104, a patch division unit 106, an LPP projection tensor generation unit 108, a learning representative number acquisition unit 110, a learning set representative value processing unit 112, and the further processing units described below.
- the means for performing processing of each processing unit is realized by a dedicated electronic circuit (hardware), software, or a combination thereof.
- The first LPP_HOSVD projection processing unit 130 is means for performing the projection path processing described in FIGS. 2A to 2C. As shown in the figure, it has an “L pixel → eigenspace projection unit 132” that projects the L image from the pixel real space to the pixel eigenspace, and an “[L pixel → individual difference] eigenspace projection unit 134” that projects the L image from the pixel eigenspace to the individual difference eigenspace.
- Hereinafter, a pixel value in the L image is referred to as an L pixel, and a pixel value in the H image is referred to as an H pixel.
- the second LPP_HOSVD projection processing unit 150 is a means for performing the projection path processing of FIG. 2 (c) ⁇ (d) ⁇ (e), and projects the H image from the individual difference eigenspace to the pixel eigenspace.
- the low resolution enlargement processing unit 102 performs processing for enlarging the input low resolution image to a predetermined size.
- the enlargement method is not particularly limited, and various methods such as bicubic, B-spline, bilinear, and nearest neighbor can be used.
- the low resolution image of the input learning image set is expanded to the same number of pixels as the high resolution image.
- In the restoration step, the input low-resolution image is enlarged to the same number of pixels as the output (in this example, the same size as the high-resolution image of the learning image set). This is because the numbers of input and output dimensions must be made uniform, as already described.
- the high-pass filter 104 applies a filter that suppresses low frequencies to the input image.
- An unsharp mask, Laplacian, gradient, or the like can be used for the filter. Since most of the effects of illumination fluctuations in the face image are present in the low frequency range, the influence of the illumination fluctuations can be removed by suppressing the low band by the high-pass filter 104, and the robustness against the illumination fluctuations can be improved.
- a highly accurate and robust restoration can be expected with fewer learning samples due to the synergistic effect of the high-pass filter 104 and the LPP_HOSVD projection.
- A process of extracting a high-frequency component (a frequency component of f1 or higher in FIGS. 3B to 3D) is shown as an example of suppressing the low-frequency component that contains the illumination variation factor.
- However, the band to be extracted is not limited to the high-frequency component; a middle-frequency component may be extracted instead.
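- As a concrete illustration of this band separation, the following is a minimal Python sketch assuming a Gaussian blur as the low-pass estimate of the illumination component; the function names and sigma values are illustrative assumptions, not part of the embodiment.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def extract_high_frequency(image, sigma=3.0):
    """Suppress the low-frequency band (illumination component) of a face image.

    Unsharp-mask style high-pass: subtract a Gaussian low-pass estimate of the
    slowly varying illumination, keeping components above the cutoff implied
    by `sigma`.
    """
    low = gaussian_filter(image.astype(np.float64), sigma=sigma)
    return image - low  # high-frequency residual fed to the tensor projection

def extract_mid_frequency(image, sigma_fine=1.0, sigma_coarse=3.0):
    """Band-pass variant: difference of two Gaussian low-pass results."""
    img = image.astype(np.float64)
    return gaussian_filter(img, sigma_fine) - gaussian_filter(img, sigma_coarse)
```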
- The patch dividing unit 106 divides the input image into a grid of patches, like a game board. In both the learning step and the restoration step, signal processing is performed in units of patches. By limiting the processing target to a local part of the image, the projection target can be handled in low dimensions, so the processing can be made robust to changes in individual differences while achieving high image quality. Therefore, in implementing the present invention, a configuration including means for patch division is a preferred mode.
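- A minimal sketch of such a grid division is shown below; the 8-pixel patch size is a hypothetical choice, and an actual implementation would also record the patch positions used to select the per-patch projections.

```python
import numpy as np

def divide_into_patches(image, patch=8):
    """Divide an aligned face image into a regular grid of non-overlapping patches.

    Returns an array of shape (n_patches, patch*patch); each row is the pixel
    vector of one patch, processed independently in the learning and
    restoration steps.
    """
    h, w = image.shape
    assert h % patch == 0 and w % patch == 0, "size must be a multiple of the patch size"
    return (image.reshape(h // patch, patch, w // patch, patch)
                 .swapaxes(1, 2)
                 .reshape(-1, patch * patch))
```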
- the LPP projection tensor generation unit 108 performs local storage projection from the input learning image set (low resolution image and high resolution image pair group) that has undergone preprocessing such as low resolution enlargement, high pass filter, and patch division. Apply (LPP) to generate an LPP projection tensor.
- LPP performs a coordinate transformation that preserves the local proximity (information on the geometric distances of neighboring samples) in the original linear space (here, the pixel real space); the coordinate axes are determined so that samples that are neighbors in the original space are embedded as neighbors in the projection destination space (eigenspace).
- Specifically, an LPP eigenprojection matrix U pixels is first generated by LPP based on this set, and then, as in singular value decomposition (SVD), an LPP projection kernel tensor G is generated.
- LPP (locality preserving projection) finds axes (feature axes) that keep samples with similar values close to each other and thereby preserves the local structure, using the distances between neighboring sample values. A similarity measure is introduced that is large between samples with close values and small between samples with different values, and the projection brings samples with high similarity close to one another.
- LPP is used for the purpose of maintaining local proximity while reducing the linear dimension, and has the feature that the local geometry is preserved and the projection can be performed easily by a linear transformation alone. However, the resulting basis is generally not orthogonal; orthogonal LPP has also been proposed, and its use is desirable.
- (Step 1) First, the eigenvector corresponding to the minimum eigenvalue of the matrix (XDX^t)^(-1) XLX^t is set to u_1.
- (Step 2) Next, the k-th eigenvector is obtained: the eigenvector corresponding to the minimum eigenvalue of the matrix M(k) shown in [Expression 4] is set to u_k.
- By repeating this, the orthogonal LPP projection matrix W_OLPP = {u_1, ..., u_r} is obtained.
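- For reference, a minimal sketch of plain (non-orthogonal) LPP is shown below, assuming a k-nearest-neighbor heat-kernel affinity; the orthogonal variant would additionally apply the Step 1 / Step 2 recursion above. All parameter values are illustrative.

```python
import numpy as np
from scipy.linalg import eigh

def lpp(X, n_components, t=1.0, k=5):
    """Locality preserving projection sketch.

    X: (d, n) matrix whose columns are sample vectors (e.g. patch pixel vectors).
    Builds a heat-kernel affinity W over k nearest neighbors, forms the graph
    Laplacian L = D - W, and solves X L X^T u = lambda X D X^T u for the
    eigenvectors with the smallest eigenvalues.
    """
    d, n = X.shape
    sq = ((X[:, :, None] - X[:, None, :]) ** 2).sum(axis=0)  # pairwise squared distances
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(sq[i])[1:k + 1]                    # k nearest neighbors
        W[i, nbrs] = np.exp(-sq[i, nbrs] / t)                # large similarity for close samples
    W = np.maximum(W, W.T)                                   # symmetrize
    D = np.diag(W.sum(axis=1))
    L = D - W
    A = X @ L @ X.T
    B = X @ D @ X.T + 1e-8 * np.eye(d)                       # slight regularization
    _, vecs = eigh(A, B)                                     # generalized eigenproblem, ascending
    return vecs[:, :n_components]                            # LPP eigenprojection matrix
```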
- PCA (principal component analysis) only provides a projection function between a real-space vector and an eigen (feature) space vector, as shown in the figure.
- In contrast, singular value decomposition (SVD) also provides a projection function Σ between the vectors of eigenspace A and eigenspace B. That is, SVD corresponds to a decomposed representation of the feature vectors in PCA.
- In the singular value decomposition M = UΣV*, U is the output orthonormal matrix, V is the input orthonormal matrix, Σ is the diagonal matrix of the singular values σ_i, and V* represents the adjoint matrix of V. That is, the V-projection eigenspace and the U-projection eigenspace are uniquely and linearly related, with each coordinate i scaled by a factor of σ_i (> 0).
- a tensor SVD (TSVD) is obtained by making this matrix SVD multidimensional (multimodality), that is, by tensoring.
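- The relationship M = UΣV* can be checked numerically as below; the matrix M here is merely a stand-in for a learned linear map between two feature spaces.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((6, 4))                   # hypothetical learned linear map

U, s, Vh = np.linalg.svd(M, full_matrices=False)  # M = U @ diag(s) @ Vh

x = rng.standard_normal(4)                        # input-space vector
a = Vh @ x                                        # coordinates in the V-projection eigenspace
b = s * a                                         # each axis i scaled by sigma_i
y = U @ b                                         # result in the output space

assert np.allclose(y, M @ x)                      # identical to applying M directly
```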
- The learning image set in Table 1 will be described as an example. For each patch position, the H images and L images of 60 people are plotted in the pixel real space, LPP is applied to the distribution of these 120 points, and feature axes are obtained that focus on samples with near values (those that change similarly).
- the learning image set including a pair group of low-quality images and high-quality images is used.
- In this way, a provisional LPP eigenprojection matrix Uj = {U1, U2, U3, ..., U200} corresponding to the patch position dimension (200 dimensions in the case of Table 1) is obtained. Further, by using the provisional LPP eigenprojection matrix Uj, a provisional projection kernel tensor G that defines the conversion between the pixel eigenspace and the individual difference eigenspace for the L image and the H image is generated by tensor singular value decomposition.
- This provisional projection kernel tensor G includes sub-nucleus tensors GHj = {GH1, GH2, GH3, ..., GH200} that associate the image pixels (H pixels) with the individual difference eigenspace.
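- The following loose sketch shows the idea of factoring a learning tensor into an eigenprojection matrix and a kernel tensor with sub-tensors for the L and H settings. It is simplified to a single patch position and uses a plain SVD basis instead of the LPP eigenprojection matrix of the embodiment; all shapes are illustrative.

```python
import numpy as np

def unfold(T, mode):
    """Mode-n unfolding of a 3-way tensor into a matrix."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def mode_dot(T, M, mode):
    """Multiply tensor T by matrix M along the given mode."""
    return np.moveaxis(np.tensordot(M, np.moveaxis(T, mode, 0), axes=(1, 0)), 0, mode)

# Hypothetical learning tensor for one patch position:
# axes = (pixel, person, resolution), resolution index 0 = L, 1 = H.
rng = np.random.default_rng(1)
D = rng.standard_normal((64, 60, 2))

# Pixel-mode basis (the embodiment would use the LPP eigenprojection matrix here).
U_pixels, _, _ = np.linalg.svd(unfold(D, 0), full_matrices=False)

# Projection kernel tensor: what remains after factoring out the pixel basis.
G = mode_dot(D, U_pixels.T, 0)

# Sub-nucleus tensors: slices fixing the resolution mode, relating the pixel
# eigenspace and the individual difference mode for L and for H images.
G_L = G[:, :, 0]
G_H = G[:, :, 1]
```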
- the learning image is narrowed down in order to select an appropriate sample when determining the projection function.
- The number of learning image pairs to be finally used (the number of samples) is referred to as the “learning representative number”, and information on this learning representative number is acquired from the outside.
- the learning representative number acquisition unit 110 in FIG. 6 is means for taking in the learning representative number from the outside.
- The learning set representative value processing unit 112 performs processing for obtaining an individual difference eigenspace coefficient vector group from the preprocessed input learning image set (at least one of the low-resolution images and the high-resolution images). That is, the same processing as in the first LPP_HOSVD projection processing unit 130 of the restoration step, namely the processing up to the L pixel → eigenspace projection (reference numeral 132) and the [L pixel → individual difference] eigenspace projection (reference numeral 134), is performed on the input learning image set, and the coefficient vectors of the individual difference eigenspace are obtained.
- N representative individual difference eigenspace coefficient vectors are obtained according to the learning representative number N obtained from the learning representative number acquisition unit 110.
- the representative vector is obtained using a k-means method, an EM algorithm, a variational Bayes method, a Markov chain Monte Carlo method, or the like. Alternatively, a plurality of these methods may be combined. For example, the initial candidate is obtained by the k-means method, and the representative vector is finally obtained by the EM algorithm, so that it can be obtained with high accuracy in a relatively short time.
- The representative vector group on the individual difference eigenspace obtained in this way may be used as it is, but an embodiment is preferred in which the N samples of the preprocessed input learning image set closest to the vectors of the obtained representative vector group (that is, the sample points located in their neighborhood in the individual difference eigenspace) are adopted. In the former case the representative vectors are synthesized from sample points, whereas in the latter case actual sample points are adopted, so blurring caused by synthesizing representative points can be avoided.
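- A sketch of this narrowing-down, assuming k-means for the candidate representatives followed by adoption of the nearest actual samples:

```python
import numpy as np
from sklearn.cluster import KMeans

def select_representatives(coeff_vectors, n_representatives):
    """Pick N representative learning samples in the individual difference eigenspace.

    coeff_vectors: (n_samples, dim) coefficient vectors of the learning set.
    K-means yields candidate representative vectors; the actual sample nearest
    to each centroid is then adopted, avoiding the blurring that synthesized
    representative points would cause.
    """
    km = KMeans(n_clusters=n_representatives, n_init=10).fit(coeff_vectors)
    nearest = [np.argmin(((coeff_vectors - c) ** 2).sum(axis=1))
               for c in km.cluster_centers_]
    return sorted(set(nearest))   # indices of the representative learning images
```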
- the reprojection tensor generation unit 114 performs the same processing as the LPP projection tensor generation unit 108 on the N representative learning image sets obtained by the learning set representative value processing unit 112, and calculates the LPP eigenprojection matrix and the LPP projection kernel tensor. Regenerate. Thus, based on the representative learning image set, an LPP eigenprojection matrix (U pixels ) 115 and an LPP projection kernel tensor (G) 116 used in a restoration step described later are obtained.
- the LPP projection tensor generation unit 108 and the reprojection tensor generation unit 114 are shown as separate blocks. However, the same processing block may be used to loop the processing.
- FIG. 8 is a conceptual diagram schematically showing how the learning set redundancy is deleted by the learning set representative value processing.
- the number of learning samples is set to “5” and is shown in a two-dimensional space.
- In the illustrated example, the samples of Mr. A, Mr. C, and Mr. D are represented by Mr. C, and the samples of Mr. A and Mr. D are deleted.
- the LPP eigenprojection matrix U pixels and the LPP projection kernel tensor G are recalculated by the reprojection tensor generation unit 114 based on the data of the three persons B, C, and E.
- The learning set representative value processing thus reduces the redundancy of the learning image set, so the rank of each mode of the projection tensor can be reduced while the restoration performance and robustness are maintained. This contributes to suppressing memory growth and to speeding up the processing.
- the low-resolution enlargement processing unit 102, the high-pass filter 104, and the patch division unit 106 described in the learning step in FIG. 6 are similarly used for the input image (low-quality image) in the restoration step. That is, in the restoration step, for each high-pass component of the input image, “L pixel ⁇ eigenspace projection” (reference numeral 132), “[L pixel ⁇ individual difference] eigenspace projection” (reference numeral 134), “ [Individual Difference ⁇ H Pixel] Eigenspace Projection ”(reference numeral 152) and“ Eigenspace ⁇ H Pixel Projection ”(reference numeral 154) are performed.
- The setting value acquisition unit 120 acquires, from the outside, information on the patch position to be processed and information specifying the L and H settings, and supplies this information to the “first sub-nucleus tensor generation unit 122”, the “second sub-nucleus tensor generation unit 124”, the “L pixel → eigenspace projection unit 132”, and the “eigenspace → H pixel projection unit 154”.
- Alternatively, means may be provided that associates the patch positions of the image after patch division with the “first sub-nucleus tensor generation unit 122”, the “second sub-nucleus tensor generation unit 124”, the “L pixel → eigenspace projection unit 132”, and the “eigenspace → H pixel projection unit 154”.
- the means may be performed in the learning step together with the “first sub-nucleus tensor generation unit 122” and the “second sub-nucleus tensor generation unit 124”.
- The first sub-nucleus tensor generation unit 122 is given the patch position output from the setting value acquisition unit 120 and the L setting condition, and thereby generates the sub-nucleus tensor GL for the low-resolution image from the LPP projection kernel tensor 116 output by the reprojection tensor generation unit 114. Note that this generation may be performed in the learning step; instead of, or in combination with, the aspect of storing the LPP projection kernel tensor 116, the sub-nucleus tensor GL may be generated and stored in the learning step.
- a memory for storing the sub-nucleus tensor is required, but there is an advantage that the processing time of the restoration step can be shortened.
- The “L pixel → eigenspace projection unit 132” in the first LPP_HOSVD projection processing unit 130 obtains the LPP eigenprojection matrix 115 (U pixels) based on the patch position given from the setting value acquisition unit 120, and performs, on the image output from the patch dividing unit 106, the U pixels^(-1) projection to the pixel eigenspace described in FIG. 2 (a) → (b). Here, U pixels^(-1) represents the inverse matrix of U pixels.
- The coefficient vector correction processing unit 140 uses the individual difference eigenspace coefficient vector group for all the patches, obtained by the [L pixel → individual difference] eigenspace projection unit 134 in FIG. 6, to generate the correction coefficient vector group to be given to the [individual difference → H pixel] eigenspace projection unit 152 of the second LPP_HOSVD projection processing unit 150.
- This correction operation exploits the multilinear projection framework that characterizes tensor projection. That is, as described with reference to FIG. 2, when the learned LPP eigenprojection matrix and LPP projection nucleus tensor are used, the pixel vectors of the patch group obtained by dividing the face image of one person (for example, Mr. A's face image) gather at almost one point in the individual difference eigenspace. Therefore, by converting to the same rank of the tensor space, the high cross-correlation between patches can be exploited.
- On the other hand, the pixel vector of a patch in which concealment exists appears at a position away from the area where the pixel vectors of the other, unconcealed patches gather in the individual difference eigenspace. The pixel vector of a patch with concealment can therefore be detected and corrected to a vector without concealment (a correction coefficient vector).
- (Example A-1-1) By using a representative value such as the average value, median, maximum value, or minimum value of the coefficient vector group of the patch group related to the same person in the individual difference eigenspace as the value of the correction coefficient vector group, noise (the influence of partial concealment such as glasses, a mask, or a door) is removed from the individual difference eigenspace coefficient vector group.
- (Example A-1-2) Noise may be removed further: centering on a representative value (average value, median, maximum value, minimum value, etc.) in the histogram of the coefficient vector group of the patches related to the same person in the individual difference eigenspace, a representative value (average value, median, maximum value, minimum value, etc.) of the coefficient vector group in that neighborhood is used as the value of the correction coefficient vector group.
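- A minimal sketch of Example A-1-1 with the median as the representative value; the array layout is a simplifying assumption.

```python
import numpy as np

def correct_coefficient_vectors(patch_coeffs):
    """Correct the per-patch individual difference eigenspace coefficient vectors.

    patch_coeffs: (n_patches, dim). For one person the patch vectors gather
    near one point, so a robust representative (here the median) suppresses
    outlier patches affected by partial concealment such as glasses or a mask;
    every patch is then assigned the representative vector.
    """
    representative = np.median(patch_coeffs, axis=0)
    return np.broadcast_to(representative, patch_coeffs.shape).copy()
```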
- a mode is also possible in which when a region where a concealment exists is detected, the region is converted with a dedicated tensor.
- (Example A-2-1) Since the relative positions of glasses (upper, horizontal) and a mask (lower center) in the face are roughly known in advance, the representative value of the individual difference eigenspace coefficient vector group of the patches in the corresponding area is compared with the representative value of the individual difference eigenspace coefficient vector group of the patches of the entire face (or of the face region excluding the concealment candidate area). If the two are similar (if the distance is small), it is determined that the probability of no concealment is high; conversely, if the distance between the two is large, it is determined that the probability that concealment exists is high.
- The representative value may be obtained with a weighting that depends on the distance from the concealment candidate position.
- Using a representative value weighted according to the patch position takes into account the uncertainty of the size of the concealment. For example, since glasses come in various sizes, an adjacent patch may or may not be covered depending on the size of the glasses. Considered probabilistically, an area closer to the center of the eye is more strongly influenced by the glasses, and the influence decreases with distance (toward the periphery); the weight is therefore determined as a function of the distance from the center position.
- The weight may be calculated from a predetermined function, or a lookup table (LUT) prepared and stored in advance may be used.
- Restoration according to the method of the present invention (restoration using tensor projection) prepared for the concealing object (glasses, mask, etc.) is then performed on the concealed area.
- (Example A-2-2) In “Example A-2-1”, concealment is detected by paying attention to the distance from the representative value, but it can also be detected from the spread of the distribution of the coefficient vector group. That is, as another example of Example A-2-1, an aspect is possible in which it is determined that the probability of concealment is high if the distribution of the individual difference eigenspace coefficient vector group of the patches in the concealment candidate region is wide, for example wider than the corresponding distribution over the entire face.
- (Example A-2-3) As another embodiment, the distribution shape of the individual difference eigenspace coefficient vector group of correct answers (images not included in the learning set) may be obtained in advance. In this case, if the individual difference eigenspace coefficient vector group is similar to this prior distribution shape, it is determined that the probability of no concealment is high.
- (Example A-3-1) An aspect is also possible in which the same detection as in “Example A-2-1” is performed, and the concealment area is restored by another conversion method such as bicubic interpolation or the “general-purpose super-resolution processing unit 164” (see FIG. 6).
- (Example A-4-1) Example of predicting and restoring the coefficient vectors outside a specific area from that area of the face: using the high correlation in the individual difference eigenspace among the pixel vectors of the patch group obtained by dividing the face image of the same person, a correction coefficient vector group for the entire face may be obtained from the individual difference eigenspace coefficient vector group of only a part of the face (for example, the areas of the eyes, nose, and mouth).
- (Example A-4-1-1) For example, a representative value such as the average value, median, maximum value, or minimum value of the individual difference eigenspace coefficient vector group of a part of the face is used as the value of the correction coefficient vector group of the entire face.
- (Example A-4-1-2) Instead of “Example A-4-1-1”, the distribution of the individual difference eigenspace coefficient vector group is obtained for a plurality of patches in the central portion of the face, and a correction coefficient vector group outside the central portion is then obtained from that distribution by extrapolation. For example, the coefficient vector group distribution is obtained for the 3 × 3 = 9 patches at the center of the face, and the coefficient vectors at positions outside these 9 patches are obtained from this distribution by extrapolation.
- (Example A-4-1-3) The distribution of the individual difference eigenspace coefficient vector group is obtained only for patches thinned out in the horizontal and vertical directions of the face, and the correction coefficient vector group of the patches for which individual difference eigenspace coefficient vectors were not obtained is derived by interpolating the distribution. For example, the distribution of the coefficient vector group is obtained only for the even-numbered patch positions, and the remaining odd-numbered patches are obtained by interpolation.
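- A sketch of the interpolation in Example A-4-1-3, assuming for brevity a one-dimensional ordering of patch positions and per-dimension linear interpolation:

```python
import numpy as np

def interpolate_patch_coeffs(known_idx, known_coeffs, n_patches):
    """Fill in coefficient vectors for the thinned-out patches.

    known_idx: patch indices (e.g. the even-numbered positions) whose
    individual difference eigenspace coefficient vectors were computed.
    known_coeffs: (len(known_idx), dim). The remaining patches are obtained
    by linear interpolation over the patch index, dimension by dimension.
    """
    all_idx = np.arange(n_patches)
    return np.column_stack([np.interp(all_idx, known_idx, known_coeffs[:, d])
                            for d in range(known_coeffs.shape[1])])

# e.g. coefficients computed only at even patch positions, odd ones interpolated:
# coeffs = interpolate_patch_coeffs(np.arange(0, 200, 2), even_coeffs, 200)
```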
- According to “Example A-4-1”, the number of processing operations of the first sub-nucleus tensor generation unit 122 and the [L pixel → individual difference] eigenspace projection unit 134 described in FIG. 6 is reduced, so the processing speed can be increased.
- A low-pass filter (for example, an average filter) may be further applied to the correction coefficient vector group of the patch to be processed and its surrounding patches.
- a maximum value, a minimum value, and a median filter may be applied instead of the average filter.
- The second sub-nucleus tensor generation unit 124 is given the patch position and the H setting condition output from the setting value acquisition unit 120, and generates the sub-nucleus tensor GH from the LPP projection kernel tensor 116.
- This generation may be performed in the learning step instead of in the restoration step as illustrated. In that case, the processing time of the restoration step can be shortened, although a memory for storing the sub-nucleus tensor GH is required.
- The [individual difference → H pixel] eigenspace projection unit 152 obtains GH from the second sub-nucleus tensor generation unit 124 and performs, on the correction coefficient vectors output from the coefficient vector correction processing unit 140, the GH projection described in FIG. 2 (c) → (d).
- The eigenspace → H pixel projection unit 154 obtains the LPP eigenprojection matrix U pixels based on the patch position from the setting value acquisition unit 120 and performs, on the coefficient vectors output from the [individual difference → H pixel] eigenspace projection unit 152, the U pixels projection described in FIGS. 2D to 2E, thereby obtaining a high-resolution image.
- the addition unit 160 outputs the sum of the input from the eigenspace ⁇ H pixel projection unit 154 (high-frequency component restoration information) and the input from the low-resolution enlargement processing unit 102 (original low-resolution enlarged image).
- the adding unit 160 adds and integrates all patches to generate one face image (high resolution image).
- The restoration information of the high-frequency component may also be added after a predetermined filtering process has been applied to the original low-resolution enlarged image.
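- The addition step could look like the following sketch; the optional smoothing of the base image stands in for the “predetermined filtering process”, and `smooth_sigma` is a hypothetical parameter.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def add_restored_high_frequency(enlarged_low, restored_high, smooth_sigma=None):
    """Addition unit sketch: enlarged low-resolution image + restored high band."""
    base = enlarged_low.astype(np.float64)
    if smooth_sigma is not None:
        base = gaussian_filter(base, smooth_sigma)   # optional pre-filtering
    return np.clip(base + restored_high, 0, 255)     # assumes an 8-bit value range
```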
- In addition to the super-resolution processing means using the LPP projection tensor described above (reference numerals 100A and 100B in FIG. 6), the apparatus includes general-purpose super-resolution processing means (the general-purpose super-resolution processing unit 164 in FIG. 6), a weight calculation unit 162, and a synthesis unit 166.
- the general-purpose super-resolution processing unit 164 super-enlarges the input low-resolution image to the same size as the output.
- the enlargement method is not particularly limited.
- For example, a clustering method can be used (Atkins, C. B.; Bouman, C. A.; Allebach, J. P., “Optimal image scaling using pixel classification”, Proceedings of the 2001 International Conference on Image Processing, IEEE, Vol. 3, 2001, pp. 864-867).
- Since the clustering method uses a mixture model, it can handle super-resolution of various patterns by combining multiple models.
- Here, z denotes the low-resolution image and x the high-resolution image.
- The probability w_i used as a weight is determined dynamically at restoration time from the difference vector y between an unknown pixel and its surroundings.
- A_i, B_i, π_i, and Σ_i are obtained as follows, for example.
- First, the difference (cluster) vectors are classified by K-means: the centroid of each of the 100 classes is obtained, and an initial distribution state is created.
- the likelihood function is maximized with the current conditional probability, and the next conditional probability is obtained.
- That is, the conditional probabilities are estimated in the E step, and the M step maximizes the likelihood function using the estimates of the E step. The E step / M step loop is continued until the output of the likelihood function stabilizes. For example, to learn 100,000 pixels in 100 classes, learning is iterated about 10,000 times (with a convergence condition of e^(-10)).
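- A compact way to reproduce this K-means initialization followed by EM is a Gaussian mixture fit, as in the sketch below; scikit-learn is used for brevity, and the data shapes and tolerance are illustrative stand-ins for the values quoted above.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(2)
y = rng.standard_normal((100_000, 9))       # difference (cluster) vectors from training

# K-means provides the initial 100-class centroids (the initial distribution).
km = KMeans(n_clusters=100, n_init=1, random_state=0).fit(y)

# EM loop: the E step estimates conditional class probabilities, the M step
# maximizes the likelihood; iteration continues until the likelihood stabilizes.
gmm = GaussianMixture(n_components=100, means_init=km.cluster_centers_,
                      max_iter=10_000, tol=1e-10, random_state=0).fit(y)

w = gmm.predict_proba(y[:5])                # per-class weights w_i at restoration time
```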
- the enlargement method described in the low-resolution enlargement processing unit 102 may be used.
- the weight calculation unit 162 is a means for obtaining the weight w1 used by the synthesis unit 166 so as to increase or decrease the adoption rate of the general-purpose super-resolution method by the general-purpose super-resolution processing unit 164 according to the degree of deviation of the input condition.
- the weight w1 is determined so that the adoption rate of the general-purpose super-resolution method is lowered when the degree of deviation of the input condition is low, and the adoption rate of the general-purpose super-resolution method is increased as the degree of deviation of the input condition is high.
- (Example B-1-1) The tensor-projection super-resolution means described above (reference numerals 100A and 100B in FIG. 6) has the characteristic that the farther the individual difference eigenspace coefficient vector is from the coefficient vectors of the learning set on the individual difference eigenspace, the worse the restorability (feature [1]).
- FIG. 10 is a conceptual diagram showing the feature [1].
- the eigenspace of the tensor is represented by a three-dimensional space, and each learning image vector is represented by small points SL 1 , SL 2 ... SL i .
- The outer edge of the distribution range of the learning images is represented by reference numeral 170, and the center of gravity P_G of the learning image vectors is shown as a black circle.
- Unknown image vectors IM 1 , IM 2 ... Other than the learning image vector are indicated by white circles.
- The proximity of an unknown image vector to the learning image vector group is determined from the distance to the learning image vectors (nearest neighbor, center of gravity, surrounding boundary points) and from an inside/outside determination with respect to the sample group (class).
- For example, IM 1 and IM 2 are determined to be close to the learning image samples, and the restoration of these unknown image vectors is very good.
- IM 3 and IM 4 exist inside the class of the sample group, and are a little apart from each other compared to IM 1 and IM 2 , and can be said to be at a “slightly close” level. These can be restored relatively well.
- IM 5 and IM 6 exist outside the sample group and are far from the learning set. Restorability when these unknown image vectors IM 5 and IM 6 are restored decreases. As described above, the closer to the learning set, the better the restoration is possible, and the longer the distance, the worse the restoration.
- the weight w1 is obtained as follows.
- In the learning step, the processing up to the “[L pixel → individual difference] eigenspace projection unit 134” of the restoration step is performed on the representative learning image set, and the representative individual difference eigenspace coefficient vector group is obtained in advance.
- In the restoration step, the closest distance between the representative individual difference eigenspace coefficient vector group and the individual difference eigenspace coefficient vector obtained by the “[L pixel → individual difference] eigenspace projection unit 134” is sought, and w1 is obtained by an LUT as shown in FIG. 11 or by a function such as ∝1/x, ∝1/x^2, or exp(-γ1·x).
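- For example, a sketch of the distance-to-weight mapping using the exponential form; `gamma` is a hypothetical tuning constant, and a precomputed LUT could replace the analytic function.

```python
import numpy as np

def weight_from_distance(dist, gamma=1.0):
    """Map the nearest distance in the individual difference eigenspace to w1.

    Monotone candidates from the text: 1/x, 1/x**2, exp(-gamma * x).
    """
    return float(np.exp(-gamma * dist))
```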
- (Example B-1-2) Alternatively, w1 is increased as the direction of the individual difference eigenspace coefficient vector becomes more similar to that of the coefficient vectors of the learning set.
- (Example B-2-1) Further, the tensor-projection super-resolution means described above (reference numerals 100A and 100B in FIG. 6) has the characteristic that the more widely the “distribution whose samples are the patches” of the individual difference eigenspace coefficient vectors is scattered on the individual difference eigenspace, the worse the restoration performance (feature [2]).
- the weight w1 is reduced when the distribution of the distance or orientation between the coefficient vector of the representative learning set and the individual difference eigenspace coefficient vector for each patch is wide for the patch sample.
- a lookup table indicating the correspondence between the distribution spread and the weight w1 may be created in advance, or may be calculated using a function that defines the correspondence.
- Since the method according to the present invention performs this evaluation on the individual difference eigenspace of the tensor (the person eigenspace of FIG. 2C) rather than on the pixel eigenspace of the tensor (the image eigenspace of FIG. 2B), the tensor projection feature [1] makes it possible to evaluate all patches with the same index (all patches gather at almost one point). A new effect is thus obtained: the spread of the distribution can be evaluated as a reliable measure. Therefore, the weight calculation accuracy is improved.
- Example B-2-2 In the distribution with respect to the patch sample of “Example B-2-1”, w1 is reduced as the patch sample has a smaller number of samples (or farther from the representative value). That is, the weight is changed according to the frequency on the histogram. In this case, there is an effect that the weight can be controlled for each patch.
- Example B-3 In the distribution for the patch sample of “Example B-2-1”, the weight may be increased as the distribution shape is similar. For example, the weight is changed depending on whether the distribution shape of the distribution of the input image (unknown image) is similar to the distribution of Mr. A grasped in the learning step.
- (Example B-Common-1) In common with “Example B-1-1”, “Example B-1-2”, “Example B-2-1”, “Example B-2-2”, and “Example B-3” described above, the evaluation of “Example B-1-1” or “Example B-1-2” may be performed for individual patches (for example, patch by patch within the face of Mr. A). In that case, the distance of each patch from the representative value of the distribution over the patch samples is used: the farther a patch is from the representative value, the less appropriate it is as a correct answer.
- Since the method according to the present invention performs this evaluation on the individual difference eigenspace of the tensor (the person eigenspace of FIG. 2C) rather than on the pixel eigenspace of the tensor (the image eigenspace of FIG. 2B), the tensor projection feature [1] allows all patches to be evaluated with the same index (all patches gather at almost one point). A new effect is thus obtained: the reliability of the learning samples themselves, which are defined as provisional correct answers, can be included in the evaluation. Therefore, the weight calculation accuracy is improved.
- (Example B-Common-2) In common with “Example B-1-1”, “Example B-1-2”, “Example B-2-1”, “Example B-2-2”, and “Example B-3” described above, the average, median, maximum, minimum, or the like may be used as the representative value.
- (Example B-Common-3) In common with the above examples, the variance, standard deviation, or the like may be used as the spread (variation) of the distribution.
- (Example B-Common-4) The weight w1 is increased as the distance between the individual difference eigenspace coefficient vector and a representative value of the learning set, such as the center of gravity or the surrounding boundary points, becomes smaller, or as their directions become more similar. According to this aspect, the number of distance and direction calculations can be reduced and the processing speeded up.
- Example B-Common-5 For the calculation of the “distance” in each example described above, the Euclidean distance, the Mahalanobis distance, the KL distance, etc. may be used.
- Example B-Common-6 For the calculation of the “direction” in each example described above, a vector angle, an inner product, an outer product, or the like may be used.
- (Example B-Common-7) In the “learning step” described with reference to FIG. 4, the relationship between the distance, direction, representative value, distribution spread, or distribution shape and the restoration error is obtained in advance using a correct answer set.
- The restoration error is the difference between the image restored by the projection function obtained from the learning image set and the correct image, and is expressed by, for example, the mean square error with respect to the correct image or the PSNR (peak signal-to-noise ratio).
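- A sketch of both error measures, assuming 8-bit images with peak value 255:

```python
import numpy as np

def restoration_error(restored, correct):
    """Mean square error and PSNR between a restored image and the correct image."""
    mse = float(np.mean((restored.astype(np.float64) - correct.astype(np.float64)) ** 2))
    psnr = float("inf") if mse == 0 else 10.0 * np.log10(255.0 ** 2 / mse)
    return mse, psnr
```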
- The relationship between at least one element of “distance, direction, representative value, distribution spread, and distribution shape” and the “restoration error”, and the relationship between the “restoration error” and the “weight w1”, are defined in advance by an LUT or a function.
- In the restoration step, the “weight w1” is obtained using the above LUT or function from the similarity of at least one of the “distance, direction, representative value, distribution spread, and distribution shape” between the “learning step” and the “restoration step”.
- ⁇ Processing at the learning step> The relationship between at least one of “distance, direction, representative value, distribution spread, distribution shape” and “restoration error” is obtained in advance. For example, it is obtained as “distance-restoration error characteristics”. A characteristic with a reliability probability proportional to the frequency may be used.
- weight is obtained from the relationship of the following equation ([Formula 6]).
- the “weight” is increased as the “restoration error” is smaller.
- (Example B-Common-8) The function defining the correlation between the “weight” and at least one of the “distance, direction, representative value, distribution spread, and distribution shape” of the incorrect answer set in the individual difference eigenspace of “Example B-Common-7” may be obtained by the (regularized) least squares method, multiple regression analysis, SVM (regression), AdaBoost (regression), nonparametric Bayes, maximum likelihood estimation, the EM algorithm, the variational Bayes method, the Markov chain Monte Carlo method, or the like ([Formula 5]).
- (Example B-Common-9) In each of the above examples (“Example B-1-1” to “Example B-Common-8”), a low-pass (average) filter may be further applied to the weights of the patch to be processed and its surrounding patches. According to this aspect, the obtained weights are smoothed spatially and noise is removed. A maximum value filter, a minimum value filter, or a median filter may be applied instead.
- The methods of “Example B-Common-1” to “Example B-Common-9” can also be applied to the weighting in the coefficient vector correction processing unit 140 described above.
- In the synthesis unit 166, the image given from the addition unit 160 (input image 1) and the image given from the general-purpose super-resolution processing unit 164 (input image 2) are combined, or one of them is selected, in accordance with the weight w1 obtained by the weight calculation unit 162.
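- The synthesis itself can be as simple as the following blend; here w1 is taken as the adoption rate of the general-purpose result, and w1 = 1 or w1 = 0 amounts to selecting one path.

```python
import numpy as np

def synthesize(tensor_sr, generic_sr, w1):
    """Blend the tensor-projection result (input image 1) with the
    general-purpose super-resolution result (input image 2) by the weight w1."""
    return (1.0 - w1) * tensor_sr.astype(np.float64) \
           + w1 * generic_sr.astype(np.float64)
```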
- a high-quality image can be obtained from a low-quality input image.
- the tolerance for input conditions is wide, and robust image quality enhancement processing can be realized.
- One or a plurality of image quality enhancement processing units based on other methods may be provided, and these may be used selectively or synthesized with appropriate weighting.
- Depending on the input, the reliability of the super-resolution restoration process may be extremely low; in such a case it may be preferable to output an image that makes use of the information of the original input image rather than a failed restoration of low reliability. Therefore, instead of, or in combination with, the general-purpose super-resolution processing unit 164, a processing unit that simply enlarges the input image may be provided, and the image enlarged by this enlargement processing unit (an image not subjected to super-resolution restoration processing) may be supplied to the synthesis unit 166.
- FIG. 12 is a block diagram showing another embodiment. In FIG. 12, elements that are the same as or similar to those in the configuration of FIG. 7 are given the same reference numerals, and their descriptions are omitted.
- the form shown in FIG. 12 is an aspect in which the first sub-nucleus tensor 123 and the second sub-nucleus tensor 125 are generated and stored in storage means such as a memory in the learning step.
- Since the LPP eigenprojection matrix U and the projection kernel tensor G (and the first sub-nucleus tensor 123 and the second sub-nucleus tensor 125 generated from them) can be created once, stored, and then used repeatedly, a mode is preferable in which these matrices and tensors are parameterized for each learning image set and appropriate projection matrices and tensors are set as needed according to the contents of the input image in the restoration step.
- For example, a projection matrix and tensor set generated based on a Japanese face learning image set, a projection matrix and tensor set generated based on a Western face learning image set, and so on are prepared; the projective transformation sets for the respective regions are thus parameterized and used as necessary.
- the projection matrix and the tensor set may be switched according to the use of the processing, not limited to the super-resolution restoration processing of the face image.
- That is, the learning image set is changed according to the application, such as for endoscopic images or vehicle images, and the LPP eigenprojection matrix U and the projection kernel tensor G (and the first sub-nucleus tensor 123 and the second sub-nucleus tensor 125) are generated for each; the generated projection matrices and tensors are stored in a nonvolatile memory, a magnetic disk, or other storage means. Then, by reading and setting the projection matrix and tensor corresponding to the application, various kinds of image processing can be performed with the same algorithm.
- <Modification 2 of the embodiment> Although FIGS. 6 and 12 show configurations in which the learning step and the restoration step can be performed by one image processing apparatus, it is also possible to adopt a device configuration in which the image processing apparatus that performs the learning step and the image processing apparatus that performs the restoration step are provided separately. In this case, it is desirable that the image processing apparatus responsible for the restoration step be configured so that it can acquire, from the outside, the separately created projection-related information (eigenprojection matrix, projection tensor). As such information acquisition means, a media interface or a communication interface for an optical disk or another removable storage medium can be applied.
- In the embodiments described above, LPP is exemplified as a projection that uses local relationships, but the projection is not limited to LPP; for example, locally linear embedding (LLE), local tangent space alignment (LTSA), neighborhood preserving embedding (NPE), and the like can also be applied.
- the technique for obtaining the representative learning image group of the present invention is not limited to projection using local relationships, but can also be applied to tensor singular value decomposition (TSVD) and the like.
- <Modification 4 of the embodiment> In the embodiment described with reference to FIG. 6, to simplify the description, conditions are set for the four types of modalities listed in Table 1, with the patch and resolution modalities known; paying attention to the modalities of “pixel value” and “individual difference”, a projection route is designed from the pixel real space through the pixel eigenspace and the individual difference eigenspace.
- the design of the projection route is not limited to this example when implementing the present invention.
- various eigenspaces can be selected as eigenspaces passing through the projection route.
- The conversion-source image input to the restoration step may be an image area partially extracted from a certain image before entering the processing procedure described with reference to FIGS. 6 and 12. For example, a process of extracting a human face portion from the original image is performed, and the extracted face image area can be handled as the input image data of the restoration step.
- processing means for performing a synthesis process for replacing the extracted area with the restored output high-resolution image and fitting it into the original image may be added.
- the enlargement magnification is adjusted in accordance with the size of the final output image (or the size of the background to be synthesized).
- the “target” image may be a region including a part of a human body such as a head or a person's hand, or at least a part of a living body other than a human body, in addition to a face.
- the living body includes a specific tissue existing inside the living body such as a blood vessel inside the living body.
- a tumor tissue inside a living body may be included in the concept of “living body” and can be a “target”.
- It is also possible to target money, cards such as cash cards, vehicles, or vehicle license plates, as well as characters, drawings, tables, and photographs of documents scanned by a scanner device such as a copying machine.
- Modalities can include the subject's orientation, size, position, lighting conditions, and the like. Other modality types include race, age, and gender. As attributes of the subject image, the facial expression of the imaged person, the gesture of the imaged person, the posture of the imaged person, the items worn by the imaged person, and so on can also be exemplified as “modalities”. Worn items include glasses, sunglasses, masks, hats, and the like.
- Image processing to which the present invention can be applied includes not only super-resolution but also, for example, reduction processing with reduced aliasing components, increasing the number of colors, increasing the number of gradations, noise reduction, and reduction of artifacts such as block noise and mosquito noise.
- In the case of noise reduction, for example, the projection relationship is learned using pairs of a noisy image (corresponding to the “low-quality image”) and a noise-free image (corresponding to the “high-quality image”).
- the present invention is not limited to still images, but can be similarly applied to frame images (or field images) constituting a moving image.
- FIG. 13 shows an example of an image processing system 200 according to the embodiment of the present invention.
- An image processing system 200 described below can function as a monitoring system as an example.
- The image processing system 200 includes a plurality of imaging devices 210a-d that image the monitoring target space 202, an image processing device 220 that processes the captured images captured by these imaging devices 210a-d, a communication network 240, an image processing device 250, an image database (DB) 255, and a plurality of display devices 260a-e.
- the image processing device 250 can be installed in a space 205 (for example, a place far away from the monitoring target space 202) different from the monitoring target space 202, and the display devices 260a-e also include the monitoring target space 202 and the image processing device. It can be provided in a space 206 different from 250 installation spaces 205.
- the imaging device 210a includes an imaging unit 212a and a captured image compression unit 214a.
- the imaging unit 212a captures a plurality of captured images by continuously capturing the monitoring target space 202.
- the captured image obtained by the imaging unit 212a may be a RAW format captured image.
- the captured image compression unit 214a synchronizes the RAW format captured images captured by the imaging unit 212a and compresses a moving image including a plurality of captured images obtained by the synchronization using MPEG encoding or another encoding method. Generate video data.
- the imaging device 210 a outputs the generated moving image data to the image processing device 220.
- imaging devices 210b, 210c, and 210d also have the same configuration as the imaging device 210a, and the moving image data generated by each imaging device 210a-d is sent to the image processing device 220.
- the imaging devices 210a-d may be collectively referred to as the imaging device 210.
- Similarly, the display devices 260a-e may be collectively referred to as the display device 260. In the following description, the alphabetic suffix at the end of a reference numeral attached to similar components may be omitted, and the numeral alone may refer to those components collectively.
- the image processing device 220 acquires a moving image by decoding the moving image data acquired from the imaging device 210.
- The image processing apparatus 220 detects, from each of the plurality of captured images included in the acquired moving image, a plurality of feature regions having features of different types, such as an area in which a person 270 is imaged and an area in which a moving body 280 such as a vehicle is imaged.
- The image processing apparatus 220 compresses the image of each feature region with an intensity according to the type of its feature, and compresses the image of the regions other than the feature regions with an intensity stronger than the compression intensity used for the feature region images.
- the image processing device 220 generates feature area information including information for specifying the feature area detected from the captured image.
- The feature area information is, for example, text data including the position of the feature area, the size of the feature area, the number of feature areas, and identification information identifying the captured image in which the feature area was detected, or data obtained by processing such text data.
- the image processing apparatus 220 attaches the generated feature area information to the compressed moving image data, and transmits it to the image processing apparatus 250 through the communication network 240.
- the image processing apparatus 250 receives the compressed moving image data associated with the feature area information from the image processing apparatus 220.
- the image processing apparatus 250 stores the compressed moving image data in the image DB 255 in association with the feature area information associated with the compressed moving image data.
- the image DB 255 may store the compressed moving image data in a non-volatile storage medium such as a hard disk.
- the image DB 255 stores the compressed captured image.
- In response to a request from the display device 260, the image processing device 250 reads the compressed moving image data and the feature region information from the image DB 255, decompresses the read compressed moving image data using the attached feature region information, generates a moving image for display, and transmits it to the display device 260 through the communication network 240.
- the display device 260 has a user interface through which image search conditions and the like can be input.
- the display device 260 can transmit various requests to the image processing device 250 and displays a display moving image received from the image processing device 250.
- Based on the position of the feature regions, the size of the feature regions, the number of feature regions, and the like included in the feature region information, the image processing apparatus 250 can also identify captured images, and their feature regions, that satisfy various search conditions. Then, the image processing device 250 may decode the identified captured images and provide them to the display device 260, thereby causing an image that matches the requested search condition to be displayed.
- the image processing apparatus 250 may generate the display moving image by decompressing the compressed moving image data acquired from the image processing device 220 using the corresponding feature area information, and store the generated moving image data in the image DB 255. . At this time, the image processing apparatus 250 may store the moving image for display in the image DB 255 in association with the feature area information. According to such an aspect, the image processing device 250 can read a display moving image (expanded) from the image DB 255 in response to a request from the display device 260 and transmit it to the display device 260 together with the feature region information.
- Note that the compressed moving image data may instead be expanded in the display device 260 to generate the display image. That is, the display device 260 may receive the feature area information and the compressed moving image data from the image processing device 250 or the image processing device 220. In such an aspect, when decoding and displaying the received compressed moving image data, the display device 260 may provisionally enlarge the feature regions in the decoded captured image by simple enlargement and display them.
- the display device 260 may determine the image quality of each feature region according to the processing capacity of the display device 260, and improve the image quality of the feature region with the determined image quality.
- the display device 260 may replace the image of the feature region in the captured image displayed by the display device 260 with the image of the feature region with high image quality and cause the display device 260 to display the image.
- the super-resolution means using the tensor projection of the present invention can be used as the processing means for improving the image quality when performing this replacement display. That is, the image processing apparatus to which the present invention is applied can be mounted in the display device 260.
- According to the image processing system 200 of this example, since the information indicating the feature regions is stored in association with the moving image, a group of captured images that meets a predetermined condition can be searched for and found quickly in the moving image. Further, since only the captured image group that meets the predetermined condition needs to be decoded, a partial moving image that meets the condition can be displayed promptly in response to a reproduction instruction.
- the recording medium 290 shown in FIG. 13 stores programs for the image processing device 220, the image processing device 250, and the display device 260.
- the program stored in the recording medium 290 is provided to an electronic information processing apparatus such as a computer that functions as the image processing apparatus 220, the image processing apparatus 250, and the display apparatus 260 according to the present embodiment.
- the CPU included in the computer operates according to the contents of the program and controls each unit of the computer.
- the program executed by the CPU causes the computer to function as the image processing device 220, the image processing device 250, the display device 260, and the like described with reference to FIG. 13 and subsequent drawings.
- Examples of the recording medium 290 include, in addition to a CD-ROM, an optical recording medium such as a DVD or PD, a magneto-optical recording medium such as an MO or MD, a magnetic recording medium such as a tape medium or a hard disk device, a semiconductor memory, and a magnetic memory.
- a storage device such as a hard disk or a RAM provided in a server system connected to a dedicated communication network or the Internet can function as the recording medium 290.
- FIG. 14 shows an example of a block configuration of the image processing apparatus 220.
- the image processing apparatus 220 includes an image acquisition unit 222, a feature region specifying unit 226, an external information acquisition unit 228, a compression control unit 230, a compression unit 232, an association processing unit 234, and an output unit 236.
- the image acquisition unit 222 includes a compressed moving image acquisition unit 223 and a compressed moving image expansion unit 224.
- the compressed moving image acquisition unit 223 acquires encoded moving image data generated by the imaging device 210 (see FIG. 13).
- the compressed moving image expansion unit 224 generates a plurality of captured images included in the moving image by expanding the moving image data acquired by the compressed moving image acquisition unit 223.
- That is, the compressed moving image expansion unit 224 decodes the encoded moving image data acquired by the compressed moving image acquisition unit 223 and extracts the plurality of captured images included in the moving image.
- the captured image included in the moving image may be a frame image or a field image.
- the plurality of captured images obtained by the compressed moving image decompression unit 224 are supplied to the feature region specifying unit 226 and the compression unit 232.
- the feature region specifying unit 226 detects a feature region from a moving image including a plurality of captured images. Specifically, the feature region specifying unit 226 detects a feature region from each of the plurality of captured images.
- the feature region specifying unit 226 detects, as a feature region, an image region whose image content changes in a moving image. Specifically, the feature region specifying unit 226 may detect an image region including a moving object as a feature region. The feature region specifying unit 226 can detect a plurality of feature regions having different types of features from each of the plurality of captured images.
- the type of feature may be a type classified using the type of object as an index, such as a person and a moving object. Further, the type of the object may be determined based on the degree of coincidence of the shape of the object or the color of the object. As described above, the feature region specifying unit 226 may detect a plurality of feature regions having different types of included objects from a plurality of captured images.
- The feature region specifying unit 226 may extract, from each of the plurality of captured images, an object that matches a predetermined shape pattern with a matching degree equal to or higher than a predetermined matching degree, and detect the regions in the captured images that include the extracted objects as feature regions having the same feature type.
- A plurality of shape patterns may be defined for each type of feature. As one example of a shape pattern, a shape pattern of a human face can be given; different face patterns may be defined for each of a plurality of persons.
- the feature region specifying unit 226 can detect different regions each including a different person as different feature regions.
- The feature region specifying unit 226 can also detect, as a feature region, a region including a part of a human body, such as a person's head or a person's hand, or at least a part of a living body other than a human body.
- The feature region specifying unit 226 may also detect, as a feature region, a region in which money, a card such as a cash card, a vehicle, or a license plate of a vehicle is captured.
- In addition to pattern matching by template matching or the like, the feature region specifying unit 226 can also detect a feature region based on the result of machine learning (for example, AdaBoost) as described in Japanese Patent Application Laid-Open No. 2007-188419. For example, the features that distinguish an image feature amount extracted from an image of a predetermined subject from an image feature amount extracted from an image of a subject other than the predetermined subject are learned in advance. Then, the feature region specifying unit 226 may detect, as a feature region, a region from which an image feature amount having features matching the learned features is extracted.
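- As a concrete illustration of this learned detection approach, the following sketch uses scikit-learn's AdaBoostClassifier as a stand-in for the machine learning step described above. The toy feature amount, window size, and all function names are illustrative assumptions, not details taken from this disclosure or from JP 2007-188419.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

def extract_feature(patch):
    # Toy image feature amount: mean, standard deviation, gradient energy.
    gy, gx = np.gradient(patch.astype(float))
    return np.array([patch.mean(), patch.std(), (gx ** 2 + gy ** 2).mean()])

def train_detector(subject_patches, other_patches):
    # Learn the features separating the predetermined subject from other subjects.
    X = np.array([extract_feature(p) for p in subject_patches + other_patches])
    y = np.array([1] * len(subject_patches) + [0] * len(other_patches))
    clf = AdaBoostClassifier(n_estimators=50)
    clf.fit(X, y)
    return clf

def detect_feature_regions(image, clf, win=32, stride=16):
    # Report windows whose extracted feature amount matches the learned
    # features as feature regions (x, y, w, h).
    regions = []
    for ty in range(0, image.shape[0] - win + 1, stride):
        for tx in range(0, image.shape[1] - win + 1, stride):
            f = extract_feature(image[ty:ty + win, tx:tx + win])
            if clf.predict(f[None, :])[0] == 1:
                regions.append((tx, ty, win, win))
    return regions
```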
- The feature region can be detected by various methods, not limited to the above-described examples 1 and 2, and the feature region specifying unit 226 detects, by an appropriate method, a plurality of feature regions from the plurality of captured images included in each of a plurality of moving images. Then, the feature region specifying unit 226 supplies information indicating the detected feature regions to the compression control unit 230.
- the information indicating the feature area may include coordinate information of the feature area indicating the position of the feature area, type information indicating the type of the feature area, and information for identifying the moving image in which the feature area is detected.
- the compression control unit 230 controls the compression processing of the moving image by the compression unit 232 based on the information indicating the feature region acquired from the feature region specifying unit 226.
- the compression unit 232 compresses the captured image with different intensities between the feature region in the captured image and the region other than the feature region in the captured image, under the control of the compression control unit 230.
- Specifically, the compression unit 232 compresses the captured image by reducing the resolution of the region other than the feature region in the captured image included in the moving image below that of the feature region.
- the compression unit 232 compresses each image area in the captured image with an intensity corresponding to the importance of the image area.
- the compression unit 232 may compress the images of the plurality of feature regions in the captured image with an intensity corresponding to the feature type of each feature region. For example, the compression unit 232 may reduce the resolution of the images of the plurality of feature regions in the captured image to a resolution determined according to the feature type of the feature region.
- the association processing unit 234 associates information specifying the feature area detected from the captured image with the captured image. Specifically, the association processing unit 234 associates information for specifying the feature area detected from the captured image with a compressed moving image including the captured image as a moving image constituent image. Then, the output unit 236 outputs the compressed moving image data associated with the information for specifying the feature region by the association processing unit 234 to the image processing apparatus 250.
- the external information acquisition unit 228 acquires data used by the feature region specifying unit 226 for processing for specifying the feature region from the outside of the image processing apparatus 220.
- the feature region specifying unit 226 specifies the feature region using the data acquired by the external information acquisition unit 228.
- The data acquired by the external information acquisition unit 228 will be described in relation to the parameter storage unit 650 shown in FIG. 15.
- FIG. 15 shows an example of a block configuration of the feature area specifying unit 226.
- the feature region specifying unit 226 includes a first feature region specifying unit 610, a second feature region specifying unit 620, a region estimating unit 630, a high image quality region determining unit 640, a parameter storage unit 650, and an image generating unit 660.
- the second feature region specifying unit 620 includes a partial region determination unit 622 and a feature region determination unit 624.
- the first feature region specifying unit 610 acquires a captured image that is a moving image constituent image included in the moving image from the image acquisition unit 222, and specifies a feature region from the acquired captured image.
- The first feature region specifying unit 610 may specify the feature region from the captured image by detecting the feature region using the detection methods exemplified in the above-described examples 1 and 2 of the feature region detection method.
- The image generation unit 660 generates, from the captured image, a high-quality image in which the image quality is increased for regions that are more likely to be feature regions, among the regions not identified by the first feature region identification unit 610 as feature regions (corresponding to "first feature regions").
- As the means for generating the high-quality image in the image generation unit 660, the super-resolution image processing means using tensor projection according to the present invention can be used.
- the second feature region specifying unit 620 searches for a feature region (corresponding to a “second feature region”) from the high-quality image generated by the image generation unit 660.
- the feature regions specified by the first feature region specifying unit 610 and the second feature region specifying unit 620 are all supplied to the compression control unit 230 as the feature regions specified by the feature region specifying unit 226.
- the second feature region specifying unit 620 may search for the feature region in more detail than the first feature region specifying unit 610 based on the high-quality image obtained from the image generating unit 660.
- the second feature region specifying unit 620 may be mounted with a detector that detects the feature region with higher accuracy than the detection accuracy with which the first feature region specifying unit 610 specifies the feature region. That is, a detector capable of detecting with higher accuracy than the detection accuracy of a detector mounted as the first feature region specifying unit 610 may be mounted as the second feature region specifying unit 620.
- Alternatively, the second feature region specifying unit 620 may search for the feature region in more detail than the first feature region specifying unit 610 from the same input image (an image not subjected to the image quality enhancement processing) that was input to the first feature region specifying unit 610.
- The image generation unit 660 may generate, from the captured image, a high-quality image obtained by preferentially increasing the image quality of regions that are more likely to be specified as feature regions, among the regions not specified as feature regions by the first feature region specifying unit 610. The image generation unit 660 may also generate the high-quality image by image processing on the captured image.
- After the first feature region specifying unit 610 has specified the feature regions, the image generation unit 660 may generate, from the captured image, a high-quality image in which regions that are more likely to be specified as feature regions, among the regions not specified as feature regions by the first feature region specifying unit 610, are given higher image quality.
- Here, the “region that is not specified as the feature region by the first feature region specifying unit 610” may be a region that was not specified as a feature region at the stage of specification by the first feature region specifying unit 610. In this case, the second feature region specifying unit 620 searches for the feature region again.
- Alternatively, the “region not specified as the feature region by the first feature region specifying unit 610” may be a region that is predicted not to be specified by the first feature region specifying unit 610, before the first feature region specifying unit 610 performs the specification. For example, when the first feature region specifying unit 610 detects regions that meet a predetermined condition as feature regions, the “region not specified as the feature region by the first feature region specifying unit 610” may be a region that does not meet that condition.
- In this case, the image generation unit 660 may generate the high-quality image at a stage where the first feature region specifying unit 610 has not yet specified the feature regions.
- In this drawing, the first feature region specifying unit 610 and the second feature region specifying unit 620 are shown as different functional blocks, but it should be understood that they can be implemented as a single functional element.
- The first feature region specifying unit 610 and the second feature region specifying unit 620 can share at least some hardware elements, such as an electric circuit for detecting a feature region, and software elements, such as software for detecting a feature region.
- The image generation unit 660 may generate an image having higher image quality than the image targeted by the first feature region specifying unit 610 for the feature region specifying processing, and provide it to the second feature region specifying unit 620.
- For example, when the first feature region specifying unit 610 specifies feature regions by applying predetermined image processing to the input image, the image generation unit 660 may generate an image having higher image quality than the image obtained by that image processing and provide it to the second feature region specifying unit 620.
- The high-quality image generated by the image generation unit 660 may be any image having higher image quality than the image used by the first feature region specifying unit 610 for the feature region specifying processing.
- The image generation unit 660 generates, from the input image, a high-quality image in which each region not specified as a feature region by the first feature region specifying unit 610 is converted to an image quality according to the possibility of that region being specified as a feature region.
- The image generation unit 660 may generate a high-quality image whose image quality is enhanced with an accuracy according to the possibility of the region being identified as a feature region.
- The region estimation unit 630 estimates a region to be specified as a feature region in the captured image. For example, when the feature region specifying unit 226 should specify the region of a moving object in the moving image as the feature region, the region estimation unit 630 estimates the region where the moving object exists in the moving image. For example, the region estimation unit 630 estimates the position where the moving object exists based on the position of the moving object extracted from one or more other captured images that are moving image constituent images included in the same moving image, the timings at which those other captured images were captured, and the like. Then, the region estimation unit 630 may estimate a region of a predetermined size including the estimated position as the region where the moving object exists in the moving image.
- The first feature region specifying unit 610 specifies the region of the moving object as the feature region from within the regions estimated by the region estimation unit 630 in the captured image. Then, the image generation unit 660 may generate a high-quality image in which, among the regions estimated by the region estimation unit 630, the regions where the first feature region specifying unit 610 did not specify a moving object region are improved in image quality.
- The partial region determination unit 622 determines whether or not the images of one or more partial regions existing at predetermined positions in a specific image region meet predetermined conditions. Then, the feature region determination unit 624 determines whether the specific image region is a feature region based on the determination results of the partial region determination unit 622. For example, when determining whether or not a specific image region is a feature region, the partial region determination unit 622 determines, for each of a plurality of different partial regions on the specific image region, whether or not it meets a predetermined condition. Then, the feature region determination unit 624 determines that the specific image region is a feature region when the number of partial regions for which a negative determination result was obtained is smaller than a predetermined value.
- When the second feature region specifying unit 620 performs the above determination with respect to one or more partial regions existing at predetermined positions in the specific image region, the image generation unit 660 may improve the image quality of only those one or more partial regions when generating a high-quality image in which the image quality of the specific image region is improved. As a result, the image quality is raised only in the regions effective for the feature region detection processing, so the amount of calculation required for the feature region re-detection processing can be reduced.
- The high image quality region determination unit 640 determines the regions whose image quality the image generation unit 660 is to increase. Specifically, the lower the possibility that a region is specified as a feature region, the wider the region that the high image quality region determination unit 640 determines as a target for image quality improvement by the image generation unit 660. The image generation unit 660 generates a high-quality image in which the regions determined by the high image quality region determination unit 640 are given higher image quality. As a result, the possibility that a moving object can be extracted by the re-search can be increased, and the probability of missing a feature region in the feature region specifying unit 226 can be reduced.
- The parameter storage unit 650 stores image processing parameters used to improve the image quality of images, in association with feature amounts extracted from those images. The image generation unit 660 then improves the image quality of a target region using the image processing parameter stored in the parameter storage unit 650 in association with the feature amount that best matches the feature amount extracted from that target region, thereby generating the high-quality image.
- The parameter storage unit 650 may store image processing parameters, each calculated by learning using a plurality of images from which similar feature amounts were extracted as teacher images, in association with a feature amount representative of those similar feature amounts.
- Each image processing parameter may be, for example, image data having spatial frequency components in a higher frequency region, to be added to the image data whose quality is to be improved.
- Other examples of image processing parameters include vectors, matrices, tensors, n-dimensional mixed normal distributions, n-dimensional mixed multinomial distributions, and the like that convert input data, such as the pixel values of a plurality of pixels or a plurality of feature amount components, into data representing a high-quality image. Here, n is an integer of 1 or more. The image processing parameters will be described later in relation to the operation of the image processing apparatus 250.
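- The following is a minimal sketch of how such a parameter store might be queried: parameters are registered keyed by a representative feature amount, and the parameter whose key is most similar to the feature amount extracted from the target region is returned. The class name, the Euclidean distance metric, and the dictionary-valued parameters are assumptions for illustration.

```python
import numpy as np

class ParameterStore:
    def __init__(self):
        self.entries = []  # (representative feature amount, image processing parameter)

    def register(self, feature_amount, parameter):
        # e.g. a parameter learned from teacher images sharing this feature amount
        self.entries.append((np.asarray(feature_amount, float), parameter))

    def lookup(self, feature_amount):
        # Return the parameter stored for the most similar feature amount.
        f = np.asarray(feature_amount, float)
        dists = [np.linalg.norm(f - key) for key, _ in self.entries]
        return self.entries[int(np.argmin(dists))][1]

store = ParameterStore()
store.register([0.2, 1.5], {"type": "matrix", "data": np.eye(3)})
store.register([0.9, 0.1], {"type": "matrix", "data": 2 * np.eye(3)})
param = store.lookup([0.25, 1.4])  # matches the first entry
```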
- The external information acquisition unit 228 shown in FIG. 14 acquires at least one of the image processing parameters and the feature amounts stored in the parameter storage unit 650 (see FIG. 15) from the outside.
- the parameter storage unit 650 stores at least one of the image processing parameter and the feature amount acquired by the external information acquisition unit 228.
- FIG. 16 shows an example of the feature region specifying process in the feature region specifying unit 226. Here, processing for specifying a feature region in the captured image 700 will be described.
- As shown in FIG. 16, the first feature region specifying unit 610 calculates the degree of conformity to a predetermined condition for a plurality of image regions in the captured image 700. Then, the first feature region specifying unit 610 specifies the regions 710-1 and 710-2, whose degree of conformity to the predetermined condition is greater than a first threshold, as feature regions.
- The high image quality region determination unit 640 (see FIG. 15) selects the regions 710-3 and 710-4, whose degree of conformity to the predetermined condition in the captured image is greater than a second threshold that is equal to or less than the first threshold (see FIG. 16).
- Then, the high image quality region determination unit 640 determines the region 710-5, which includes the region 710-3 and has a size according to the degree of conformity of the image of the region 710-3 to the above condition, as a target region for image quality improvement by the image generation unit 660.
- Likewise, the high image quality region determination unit 640 determines the region 710-6, which includes the region 710-4 and has a size according to the degree of conformity of the image of the region 710-4 to the above condition, as a target region for image quality improvement by the image generation unit 660. In this example, the high image quality region determination unit 640 expands the region 710-4 with a larger enlargement ratio, and determines the resulting region 710-6 as a target region for image quality improvement by the image generation unit 660 (see FIG. 15).
- In this way, the high image quality region determination unit 640 expands a region whose degree of conformity to the condition is greater than the predetermined second threshold with an enlargement ratio according to the degree of conformity, and determines the expanded region as a target region for image quality improvement by the image generation unit 660.
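- The two-threshold selection and conformity-dependent enlargement described above might be sketched as follows. The linear enlargement-ratio formula is an assumption consistent with the example of the regions 710-4 and 710-6; the actual ratio function is not specified here.

```python
def select_regions(candidates, t1, t2):
    # candidates: list of ((x, y, w, h), fitness); t1 > t2 are the thresholds.
    feature_regions, enhance_targets = [], []
    for (x, y, w, h), fit in candidates:
        if fit > t1:
            feature_regions.append((x, y, w, h))      # specified directly
        elif fit > t2:
            # The lower the fitness, the larger the enlargement ratio.
            r = 1.0 + (t1 - fit) / (t1 - t2)           # in (1.0, 2.0]
            nw, nh = int(w * r), int(h * r)
            nx, ny = x - (nw - w) // 2, y - (nh - h) // 2
            enhance_targets.append((nx, ny, nw, nh))   # to be enhanced and re-searched
    return feature_regions, enhance_targets
```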
- the second feature region specifying unit 620 searches for the feature region from the images of the region 710-5 and the region 710-6 with high image quality (see FIG. 16).
- The second feature region specifying unit 620 may search the images of the regions 710-5 and 710-6, whose image quality has been improved, for regions that meet the above condition, through the same processing as the first feature region specifying unit 610.
- the second feature region specifying unit 620 determines that the region 722 meets the above condition in the image 720 of the region 710-5 with high image quality.
- In this case, the feature region specifying unit 226 specifies, as feature regions, the region 710-7 corresponding to the region 722 on the image 720, in addition to the regions 710-1 and 710-2 specified by the first feature region specifying unit 610.
- In this way, the image generation unit 660 generates, from the captured image, a high-quality image in which regions having a higher degree of conformity to the predetermined condition, among the regions not specified as feature regions by the first feature region specifying unit 610, are given higher image quality.
- Specifically, the image generation unit 660 generates a high-quality image in which the image quality is increased for regions that are not specified as feature regions by the first feature region specifying unit 610 and whose degree of conformity to the above condition is greater than the predetermined second threshold. As a result, the possibility that a feature region is extracted from a region that is highly likely to be a feature region can be increased, and the probability of missing a feature region can be reduced.
- The regions other than the regions specified as feature regions by the first feature region specifying unit 610 and the target regions for image quality improvement are determined to be non-feature regions.
- The value of the first threshold may be set, based on the results of feature region specification by the first feature region specifying unit 610 and the second feature region specifying unit 620, on prior test results, or on subsequent test results, so that the probability that a region that is not a feature region is specified as a feature region is smaller than a predetermined value. This reduces the possibility that non-feature regions are included among the regions identified as feature regions by the first feature region specifying unit 610.
- A degree of conformity close to the first threshold may also be calculated for a non-feature region, but by setting the first threshold as described above, the possibility that such a region is erroneously detected as a feature region can be reduced.
- Similarly, the value of the second threshold may be set, based on the results of feature region specification by the first feature region specifying unit 610 and the second feature region specifying unit 620, on prior test results, or on subsequent test results, so that the degree of conformity calculated for a feature region is above the second threshold. This reduces the possibility that feature regions are included among the regions for which a degree of conformity equal to or less than the second threshold is calculated.
- A degree of conformity close to the second threshold may also be calculated for a feature region, but by setting the second threshold as described above, the possibility that such a region is treated as a non-feature region can be reduced.
- With the first and second thresholds set in this way, a feature region may be included among the regions for which a degree of conformity greater than the second threshold and equal to or less than the first threshold is calculated.
- For such regions, since the second feature region specifying unit 620 searches for the feature region after the image quality has been improved, the feature regions and the non-feature regions can be appropriately separated, and both the probability of failing to detect a feature region and the probability of detecting a non-feature region as a feature region can be reduced.
- In this way, the feature region specifying unit 226 can provide a feature region detector having both high sensitivity and high specificity.
- In addition to determining whether or not to perform the image quality enhancement processing based on the relationship between the degree of conformity and the thresholds as described above, the image generation unit 660 may generate a high-quality image in which at least a part of the image regions of the input image is enhanced with an image quality accuracy according to its degree of conformity to the above condition.
- The image quality enhancement accuracy may be determined by a continuous or discontinuous function of the degree of conformity.
- FIG. 17 shows another example of the feature region specifying process in the feature region specifying unit 226.
- This drawing shows an example of the processing of the feature region specifying unit 226 when the region of a moving object is to be specified from a moving image as the feature region.
- As shown in FIG. 17, assume that the first feature region specifying unit 610 or the second feature region specifying unit 620 (see FIG. 15) has specified the region 810-1 and the region 810-2 as feature regions in the captured image 800-1 and the captured image 800-2, respectively. Objects in which the same subject is captured exist in the region 810-1 and the region 810-2.
- In this case, the region estimation unit 630 (see FIG. 15) determines the region 810-3, where an object of the same subject should exist in the captured image 800-3, based on the positions of the regions 810-1 and 810-2 on their respective images and the timings at which the captured images 800-1, 800-2, and 800-3 were captured (FIG. 17).
- For example, the region estimation unit 630 calculates the speed of the moving object on the image from the positions of the regions 810-1 and 810-2 and the timings at which the captured images 800-1 and 800-2 were captured, and determines the region 810-3, where the object should exist, based on the calculated speed, the position of the region 810-2, and the time difference between the timing at which the captured image 800-2 was captured and the timing at which the captured image 800-3 was captured.
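- A minimal sketch of this estimation, assuming simple linear motion on the image plane; the function name and the center-point representation of the regions are illustrative:

```python
def estimate_region(pos1, t1, pos2, t2, t3, size):
    # pos1, pos2: (x, y) centers of regions 810-1 and 810-2; t1..t3: capture times.
    vx = (pos2[0] - pos1[0]) / (t2 - t1)   # speed of the object on the image
    vy = (pos2[1] - pos1[1]) / (t2 - t1)
    cx = pos2[0] + vx * (t3 - t2)          # extrapolate to the time of image 800-3
    cy = pos2[1] + vy * (t3 - t2)
    w, h = size                            # predetermined size of region 810-3
    return (int(cx - w / 2), int(cy - h / 2), w, h)
```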
- the first feature area specifying unit 610 searches the area 810-3 for a moving object (FIG. 17).
- the image generation unit 660 generates a high-quality image 820-4 in which the area 810-3 is improved in image quality (FIG. 17).
- The second feature region specifying unit 620 searches the high-quality image 820-4 for the moving object. This increases the possibility that the object is extracted from a region where the moving object is likely to exist, and reduces the probability that the moving object is missed.
- The image generation unit 660 may generate a high-quality image 820-4 in which the central area in the area 810-3 has a higher image quality.
- FIG. 18 shows an example of a feature region determination process by the second feature region specifying unit 620 described in FIG.
- As shown in FIG. 18, the second feature region specifying unit 620 extracts feature amounts from the partial regions 910-1 to 910-4, which have a predetermined positional relationship within the image region 900. At this time, the second feature region specifying unit 620 extracts, from each partial region 910, a feature amount of a predetermined type according to the position of that partial region 910 within the image region 900.
- the second feature area specifying unit 620 calculates, for each partial area 910, the degree of suitability of the feature amount extracted from the image of the partial area 910 with respect to a predetermined condition.
- the second feature region specifying unit 620 determines whether or not the image region 900 is a feature region based on the degree of matching calculated for each partial region 910.
- For example, the second feature region specifying unit 620 may determine that the image region 900 is a feature region when the weighted sum of the degrees of conformity is greater than a predetermined value. The second feature region specifying unit 620 may also determine that the image region 900 is a feature region when the number of partial regions 910 for which a degree of conformity greater than a predetermined value was calculated is greater than a predetermined value.
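- The two decision rules just described might look as follows; the fitness function, the weights, and the thresholds are placeholders:

```python
import numpy as np

def is_feature_region(image_region, offsets, fitness_fn, weights,
                      sum_threshold, fit_threshold=None, count_threshold=None):
    # offsets: predetermined (y, x, h, w) positions of the partial regions 910.
    fits = np.array([fitness_fn(image_region[y:y + h, x:x + w])
                     for (y, x, h, w) in offsets])
    if count_threshold is not None:
        # Rule 2: enough partial regions exceed the per-region threshold.
        return int((fits > fit_threshold).sum()) > count_threshold
    # Rule 1: the weighted sum of the fitness values exceeds a threshold.
    return float(np.dot(weights, fits)) > sum_threshold
```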
- The processing from the feature amount extraction to the calculation of the degree of conformity described above can be implemented by an image filter. The processing can also be implemented as a weak classifier. The positions of the partial regions 910 may be determined according to the type of object to be extracted as the feature region. For example, when a region including an object of a human face is to be detected as a feature region, the partial regions 910 may be placed at positions where the discriminating power for the object of a human face is higher than a predetermined value. High discriminating power means that the probability that the discrimination result is true for an object of a human face is high, and the probability that the discrimination result is false for an object other than a human face is high.
- The image generation unit 660 may improve the image quality of only the partial regions 910, without improving the image quality of the regions other than the partial regions 910.
- Then, the second feature region specifying unit 620 extracts the feature amounts from the image whose quality has been improved and determines whether or not the image region 900 is a feature region. Thereby, the detection probability of the feature region can be increased while limiting the image regions whose quality is to be improved, so feature regions can be detected at high speed and with high probability.
- The feature region determination processing in the second feature region specifying unit 620 has been described here, but the first feature region specifying unit 610 may also determine whether a region is a feature region by the same processing.
- The processing in the first feature region specifying unit 610 and the second feature region specifying unit 620 can be implemented by a plurality of weak classifiers. A case where the processing is implemented using N weak classifiers in total is described below as an example.
- The first feature region specifying unit 610 determines whether or not a region is a feature region using Nf of the N weak classifiers. The degree of conformity is calculated based on the discrimination results and, as described above, a region whose degree of conformity is greater than the first threshold is determined to be a feature region, and a region whose degree of conformity is equal to or less than the second threshold is determined to be a non-feature region.
- the image generation unit 660 increases the image quality of an area where the fitness is less than or equal to the first threshold and greater than the second threshold.
- For the high-quality image, the second feature region specifying unit 620 determines whether a region is a feature region using the Nf weak classifiers used by the first feature region specifying unit 610 and Nb weak classifiers other than those Nf weak classifiers. For example, whether or not a region is a feature region may be determined based on the degree of conformity calculated from the discrimination results of the Nf + Nb weak classifiers.
- A feature region may also be specified by processing different from the above. For example, for a region for which a degree of conformity greater than a third threshold is calculated, the second feature region specifying unit 620 may determine whether or not it is a feature region using the Nf + Nb weak classifiers, without the image generation unit 660 improving the image quality.
- On the other hand, for a region for which a degree of conformity equal to or less than the third threshold is calculated, the image generation unit 660 may improve the image quality, and the second feature region specifying unit 620 may then determine whether the region is a feature region using the Nf + Nb weak classifiers.
- The number Nb of weak classifiers used in the processing of the second feature region specifying unit 620 may be adjusted according to the degree of conformity. For example, the smaller the degree of conformity, the more weak classifiers the second feature region specifying unit 620 may use to determine whether the region is a feature region.
- In other words, the lower the degree of conformity, the more finely the second feature region specifying unit 620 may search the image whose quality has been changed for the feature region.
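- Putting the Nf/Nb arrangement together, a sketch of the two-stage decision might read as follows. The weak classifiers are assumed to be callables returning 0 or 1, enhance() stands in for the image quality improvement by the image generation unit 660, and the rule for growing the number of classifiers as the fitness falls is one possible choice.

```python
def classify_region(region, weak_clfs, t1, t2, enhance, nf):
    # Stage 1: fitness from the first Nf weak classifiers.
    fitness = sum(c(region) for c in weak_clfs[:nf]) / nf
    if fitness > t1:
        return True                      # feature region
    if fitness <= t2:
        return False                     # non-feature region
    # Undecided band: enhance the image, then judge with up to Nf + Nb
    # classifiers, using more of them the lower the fitness is.
    frac = (t1 - fitness) / (t1 - t2)
    n = nf + int(frac * (len(weak_clfs) - nf))
    hq = enhance(region)
    return sum(c(hq) for c in weak_clfs[:n]) / n > t1
```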
- As a weak classifier configuration in at least one of the first feature region specifying unit 610 and the second feature region specifying unit 620, a configuration of weak classifiers based on AdaBoost can be exemplified.
- the first feature region specifying unit 610 and the second feature region specifying unit 620 may detect feature regions from low-resolution image groups each configured by multi-resolution representation.
- The image generation unit 660 may generate the low-resolution image group by performing multi-resolution conversion with higher accuracy than the multi-resolution conversion performed by the first feature region specifying unit 610. As the multi-resolution conversion with higher accuracy, reduction processing by the bicubic method can be exemplified.
- The second feature region specifying unit 620 may also generate the low-resolution image group from the input image using image processing parameters obtained by learning with original images and images of the target resolution. For the learning, it is preferable to use images of the target resolution with less aliasing noise. For example, images obtained by imaging devices having different numbers of imaging elements can be used for the learning.
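- A minimal sketch of building such a low-resolution image group by bicubic reduction, using Pillow; the scale factor and the number of levels are illustrative:

```python
from PIL import Image

def bicubic_pyramid(img, levels=4, factor=0.5):
    # img: PIL.Image; returns [original, 1/2, 1/4, ...] bicubic reductions.
    pyramid = [img]
    for _ in range(levels - 1):
        w, h = pyramid[-1].size
        nw, nh = max(1, int(w * factor)), max(1, int(h * factor))
        pyramid.append(pyramid[-1].resize((nw, nh), Image.BICUBIC))
    return pyramid
```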
- The image processing method using tensor projection according to the present invention can be applied as the image quality enhancement processing described above. That is, when the image generation unit 660 generates a high-quality image by improving the image quality of a region that is more likely to be identified as a feature region, the super-resolution processing technique using tensor projection may be used.
- The image quality enhancement processing is not limited to resolution enhancement; multi-gradation processing that increases the number of gradations and multi-color processing that increases the number of colors can also be exemplified.
- The image processing method using tensor projection according to the present invention can be applied to these kinds of processing as well.
- When the captured image whose quality is to be improved is a moving image constituent image (a frame image or a field image), the image quality, for example the resolution, the number of colors, the number of gradations, noise reduction, and the reduction of artifacts such as block noise, may be improved using the pixel values of other captured images.
- For example, the image quality may be improved by using the shift in the imaging position of a moving object caused by differences in imaging timing. That is, the image generation unit 660 may generate the high-quality image using a captured image that is a moving image constituent image included in the moving image together with other moving image constituent images included in the moving image.
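- As one hedged illustration of using other constituent images, the sketch below estimates the inter-frame shift by phase correlation and averages the aligned frames; this is a common stand-in for multi-frame enhancement, not the specific method of the publications cited below.

```python
import numpy as np

def estimate_shift(ref, frame):
    # Integer-pixel shift of `frame` relative to `ref` by phase correlation.
    F = np.fft.fft2(ref) * np.conj(np.fft.fft2(frame))
    corr = np.fft.ifft2(F / (np.abs(F) + 1e-9)).real
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    if dy > ref.shape[0] // 2: dy -= ref.shape[0]   # wrap to signed shifts
    if dx > ref.shape[1] // 2: dx -= ref.shape[1]
    return dy, dx

def fuse_frames(ref, others):
    # Align each frame to the reference and average the stack (noise reduction).
    stack = [ref.astype(float)]
    for f in others:
        dy, dx = estimate_shift(ref, f)
        stack.append(np.roll(np.roll(f.astype(float), dy, axis=0), dx, axis=1))
    return np.mean(stack, axis=0)
```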
- As such image processing using a plurality of moving image constituent images, the methods described in JP 2008-167949 A, JP 2008-167950 A, JP 2008-167948 A, and JP 2008-229161 A can be exemplified.
- the image generation unit 660 can reduce noise using the result of prior learning using an image with a larger amount of noise and an image with a smaller amount of noise.
- examples of the higher-accuracy sharpening process include a process using a filter having a larger filter size and a process of sharpening in more directions.
- FIG. 19 illustrates an example of a block configuration of the compression unit 232 illustrated in FIG.
- The compression unit 232 includes an image dividing unit 242, a plurality of fixed value converting units 244a-c (hereinafter sometimes collectively referred to as the fixed value converting unit 244), and a plurality of compression processing units 246a-d (hereinafter sometimes collectively referred to as the compression processing unit 246).
- The image dividing unit 242 acquires a plurality of captured images from the image acquisition unit 222. Then, the image dividing unit 242 divides each of the plurality of captured images into feature regions and a background region other than the feature regions. Specifically, the image dividing unit 242 divides each of the plurality of captured images into each of a plurality of feature regions and a background region other than those feature regions. The compression processing unit 246 then compresses the feature region images, which are images of the feature regions, and the background region images, which are images of the background region, with different strengths. Specifically, the compression processing unit 246 compresses feature region moving images, each including a plurality of feature region images, and a background region moving image including a plurality of background region images, with different strengths.
- the image dividing unit 242 generates a feature region moving image for each of a plurality of feature types by dividing a plurality of captured images.
- The fixed value converting unit 244 sets, for each of the feature region images included in the plurality of feature region moving images generated for each feature type, the pixel values of the regions other than the feature region of the corresponding feature type to fixed values.
- For example, the fixed value converting unit 244 sets the pixel values of the regions other than the feature regions to a predetermined pixel value. Then, the compression processing units 246a-c compress the plurality of feature region moving images, for each feature type, in MPEG or another encoding format.
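- A minimal sketch of this fixed value conversion, assuming grayscale or color frames as numpy arrays and feature regions given as (x, y, w, h) rectangles:

```python
import numpy as np

def fix_background(frame, feature_regions, fixed_value=128):
    # Replace pixel values outside the feature regions with a fixed value so
    # that predictive encoding sees almost no inter-frame change there.
    mask = np.zeros(frame.shape[:2], dtype=bool)
    for (x, y, w, h) in feature_regions:
        mask[y:y + h, x:x + w] = True
    out = np.full_like(frame, fixed_value)
    out[mask] = frame[mask]
    return out
```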
- The fixed value converting units 244a-c perform the fixed value conversion on the feature region moving image of the first feature type, the feature region moving image of the second feature type, and the feature region moving image of the third feature type, respectively.
- The compression processing units 246a-c then compress the feature region moving image of the first feature type, the feature region moving image of the second feature type, and the feature region moving image of the third feature type, respectively, after the fixed value conversion by the fixed value converting units 244a-c.
- the compression processing units 246a-c compress the feature region moving image with a predetermined strength according to the feature type.
- the compression processing unit 246 may convert the feature area moving image into a moving image having a different resolution determined in advance according to the feature type of the feature area, and compress the converted feature area moving image.
- the compression processing unit 246 may compress the feature region moving image with different quantization parameters determined in advance according to the feature type.
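- The per-feature-type strengths might be held in a simple table like the following; the feature types, scale factors, and quantization parameters shown are illustrative values only, not values taken from this disclosure:

```python
# Predetermined compression settings per feature type (illustrative values).
COMPRESSION_BY_TYPE = {
    "face":       {"scale": 1.00, "qp": 18},   # gentlest compression
    "person":     {"scale": 0.75, "qp": 24},
    "vehicle":    {"scale": 0.50, "qp": 28},
    "background": {"scale": 0.25, "qp": 36},   # strongest compression
}

def settings_for(feature_type):
    return COMPRESSION_BY_TYPE.get(feature_type, COMPRESSION_BY_TYPE["background"])
```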
- the compression processing unit 246d compresses the background area moving image.
- the compression processing unit 246d may compress the background area moving image with a strength higher than the compression strength by any of the compression processing units 246a-c.
- the feature area moving image and the background area moving image compressed by the compression processing unit 246 are supplied to the association processing unit 234 (see FIG. 14).
- Since the regions other than the feature regions have been set to fixed values by the fixed value converting unit 244, when the compression processing unit 246 performs predictive encoding by MPEG encoding or the like, the amount of difference between the image and the predicted image in the regions other than the feature regions can be significantly reduced. For this reason, the compression unit 232 can compress the feature region moving images at higher compression rates.
- In the above description, each of the plurality of compression processing units 246 included in the compression unit 232 compresses the images of the respective feature regions and the image of the background region. In another form, the compression unit 232 may include a single compression processing unit 246, which may compress the images of the plurality of feature regions and the image of the background region with different strengths. For example, the images of the plurality of feature regions and the image of the background region may be sequentially supplied to the single compression processing unit 246 in a time-division manner, and the single compression processing unit 246 may sequentially compress them with different strengths.
- The single compression processing unit 246 may also compress the images of the plurality of feature regions and the image of the background region with different strengths by quantizing the image information of the plurality of feature regions and the image information of the background region with respective different quantization coefficients. Alternatively, images obtained by converting the images of the plurality of feature regions and the image of the background region into images of different image qualities may be supplied to the single compression processing unit 246, which then compresses the image of each feature region and the image of the background region.
- In the forms in which the single compression processing unit 246 quantizes each region with a different quantization coefficient, or compresses images converted into a different image quality for each region, the single compression processing unit 246 may compress one entire image, or may compress each of the images divided by the image dividing unit 242 as described with reference to FIG. 19. When the single compression processing unit 246 compresses one entire image, the dividing processing by the image dividing unit 242 and the fixed value conversion processing by the fixed value converting unit 244 need not be performed, so the compression unit 232 need not include the image dividing unit 242 and the fixed value converting unit 244.
- FIG. 20 shows another example of the block configuration of the compression unit 232 described in FIG.
- the compression unit 232 in the present configuration compresses a plurality of captured images by a spatial scalable encoding process according to the type of feature.
- The compression unit 232 in FIG. 20 includes an image quality conversion unit 510, a difference processing unit 520, and an encoding unit 530.
- the difference processing unit 520 includes a plurality of inter-layer difference processing units 522a-d (hereinafter collectively referred to as inter-layer difference processing units 522).
- The encoding unit 530 includes a plurality of encoders 532a-d (hereinafter collectively referred to as the encoders 532).
- the image quality conversion unit 510 acquires a plurality of captured images from the image acquisition unit 222. In addition, the image quality conversion unit 510 acquires information specifying the feature region detected by the feature region specifying unit 226 and information specifying the type of feature of the feature region. Then, the image quality conversion unit 510 duplicates the captured image, and generates captured images of the number of types of features in the feature area. Then, the image quality conversion unit 510 converts the generated captured image into an image having a resolution corresponding to the type of feature.
- Specifically, the image quality conversion unit 510 generates a captured image converted to a resolution corresponding to the background region (hereinafter referred to as a low resolution image), a captured image converted to a first resolution corresponding to the first feature type (hereinafter referred to as a first resolution image), a captured image converted to a second resolution corresponding to the second feature type (hereinafter referred to as a second resolution image), and a captured image converted to a third resolution corresponding to the third feature type (hereinafter referred to as a third resolution image).
- Here, the first resolution image has a higher resolution than the low resolution image, the second resolution image has a higher resolution than the first resolution image, and the third resolution image has a higher resolution than the second resolution image.
- The image quality conversion unit 510 supplies the low resolution image, the first resolution image, the second resolution image, and the third resolution image to the inter-layer difference processing unit 522d, the inter-layer difference processing unit 522a, the inter-layer difference processing unit 522b, and the inter-layer difference processing unit 522c, respectively.
- the image quality conversion unit 510 supplies a moving image to each of the inter-layer difference processing units 522 by performing the above-described image quality conversion processing on each of the plurality of captured images.
- the image quality conversion unit 510 may convert the frame rate of the moving image supplied to each of the inter-layer difference processing unit 522 in accordance with the feature type of the feature region.
- the image quality conversion unit 510 may supply, to the inter-layer difference processing unit 522d, a moving image having a lower frame rate than the moving image supplied to the inter-layer difference processing unit 522a.
- Similarly, the image quality conversion unit 510 may supply a moving image having a lower frame rate than the moving image supplied to the inter-layer difference processing unit 522b to the inter-layer difference processing unit 522a, and may supply a moving image having a lower frame rate than the moving image supplied to the inter-layer difference processing unit 522c to the inter-layer difference processing unit 522b.
- the image quality conversion unit 510 may convert the frame rate of the moving image supplied to the inter-layer difference processing unit 522 by thinning out the captured image according to the feature type of the feature region.
- The inter-layer difference processing unit 522d and the encoder 532d predictively encode the background region moving image including the plurality of low resolution images. Specifically, the inter-layer difference processing unit 522d generates a difference image from a predicted image generated from other low resolution images. Then, the encoder 532d quantizes the transform coefficients obtained by converting the difference image into spatial frequency components, and encodes the quantized transform coefficients by entropy coding or the like. Note that such predictive encoding processing may be performed for each partial region of the low resolution image.
- the inter-layer difference processing unit 522a predictively encodes the first feature region moving image including the plurality of first resolution images supplied from the image quality conversion unit 510.
- the inter-layer difference processing unit 522b and the inter-layer difference processing unit 522c each predictively encode a second feature area moving image including a plurality of second resolution images and a third feature area moving image including a plurality of third resolution images. To do.
- specific operations of the inter-layer difference processing unit 522a and the encoder 532a will be described.
- Specifically, the inter-layer difference processing unit 522a decodes the low resolution image encoded by the encoder 532d, and enlarges the decoded image to an image of the same resolution as the first resolution. Then, the inter-layer difference processing unit 522a generates a difference image between the first resolution image and the enlarged image. At this time, the inter-layer difference processing unit 522a sets the difference values in the background region to zero. Then, the encoder 532a encodes the difference image in the same manner as the encoder 532d. Note that the encoding processing by the inter-layer difference processing unit 522a and the encoder 532a may be performed for each partial region of the first resolution image.
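- The inter-layer difference step might be sketched as follows, with nearest-neighbor repetition standing in for a real upsampler and the transform, quantization, and entropy-coding stages omitted; the array shapes and the boolean feature mask are assumptions:

```python
import numpy as np

def interlayer_difference(hi_res, decoded_lo, feature_mask, factor=2):
    # Enlarge the decoded low-resolution image to the higher resolution.
    enlarged = np.repeat(np.repeat(decoded_lo, factor, axis=0), factor, axis=1)
    enlarged = enlarged[:hi_res.shape[0], :hi_res.shape[1]]
    diff = hi_res.astype(np.int16) - enlarged.astype(np.int16)
    diff[~feature_mask] = 0   # difference values in the background set to zero
    return diff               # would go on to transform, quantization, entropy coding
```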
- When encoding the first resolution image, the inter-layer difference processing unit 522a compares the code amount predicted when the difference image from the low resolution image is encoded with the code amount predicted when the difference image from a predicted image generated from other first resolution images is encoded. When the latter code amount is smaller, the inter-layer difference processing unit 522a generates the difference image from the predicted image generated from the other first resolution images. When it is expected that the code amount will be smaller if the image is encoded without taking a difference from either the low resolution image or the predicted image, the inter-layer difference processing unit 522a need not take any difference.
- Note that the inter-layer difference processing unit 522a need not set the difference values in the background region to zero. In that case, the encoder 532a may set the encoded data for the difference information in regions other than the feature region to zero. For example, the encoder 532a may set the transform coefficients after conversion to frequency components to zero.
- The motion vector information used when the inter-layer difference processing unit 522d performs predictive encoding is supplied to the inter-layer difference processing unit 522a.
- the inter-layer difference processing unit 522a may calculate a motion vector for a predicted image using the motion vector information supplied from the inter-layer difference processing unit 522d.
- The operations of the inter-layer difference processing unit 522b and the encoder 532b are substantially the same as those of the inter-layer difference processing unit 522a and the encoder 532a, except that the second resolution image is encoded and that, when the second resolution image is encoded, a difference from the first resolution image after encoding by the encoder 532a may be taken; a description thereof is therefore omitted. Similarly, the operations of the inter-layer difference processing unit 522c and the encoder 532c are substantially the same as those of the inter-layer difference processing unit 522a and the encoder 532a, except that the third resolution image is encoded and that, when the third resolution image is encoded, a difference from the second resolution image after encoding by the encoder 532b may be taken; a description thereof is therefore also omitted.
- As described above, the image quality conversion unit 510 generates, from each of the plurality of captured images, a low-quality image and a feature region image having higher image quality than the low-quality image at least in the feature region. The difference processing unit 520 generates a feature region difference image indicating the difference between the image of the feature region in the feature region image and the image of the feature region in the low-quality image. Then, the encoding unit 530 encodes the feature region difference images and the low-quality image, respectively.
- Specifically, the image quality conversion unit 510 generates low-quality images with reduced resolution from the plurality of captured images, and the difference processing unit 520 generates a feature region difference image between the image of the feature region in the feature region image and an image obtained by enlarging the image of the feature region in the low-quality image.
- The difference processing unit 520 also generates a feature region difference image in which the difference between the feature region image and the enlarged image in the feature region is converted into spatial frequency components, and the data amount of the spatial frequency components is reduced in regions other than the feature region.
- In this way, the compression unit 232 performs hierarchical encoding by encoding the image differences between a plurality of layers having different resolutions.
- As is clear from this, a part of the compression method used by the compression unit 232 of this configuration includes a compression scheme conforming to H.264/SVC.
- When the image processing apparatus 250 decompresses such a hierarchized compressed moving image, it decodes the moving image data of each layer and, for the regions encoded by the inter-layer difference, adds the decoded difference to the captured image decoded in the layer from which the difference was taken; a captured image having the original resolution can thereby be generated.
- FIG. 21 shows an example of a block configuration of the image processing apparatus 250 shown in FIG.
- the image processing apparatus 250 includes a compressed image acquisition unit 301, an association analysis unit 302, an expansion control unit 310, an expansion unit 320, an external information acquisition unit 380, and an image processing unit 330.
- the decompression unit 320 includes a plurality of decoders 322a-d (hereinafter collectively referred to as decoders 322).
- The compressed image acquisition unit 301 acquires the compressed moving image compressed by the image processing apparatus 220. Specifically, the compressed image acquisition unit 301 acquires a compressed moving image including a plurality of feature region moving images and a background region moving image. More specifically, the compressed image acquisition unit 301 acquires a compressed moving image with feature region information attached.
- the association analysis unit 302 separates the compressed video into a plurality of feature area videos, background area videos, and feature area information, and supplies the plurality of feature area videos and background area videos to the decompression unit 320.
- the association analysis unit 302 analyzes the feature region information and supplies the feature region position and the feature type to the extension control unit 310 and the image processing unit 330.
- The expansion control unit 310 controls the expansion processing by the expansion unit 320 in accordance with the position of the feature region and the feature type acquired from the association analysis unit 302. For example, the expansion control unit 310 causes the expansion unit 320 to expand each region of the moving image indicated by the compressed moving image according to the compression method that the compression unit 232 used for that region, which depends on the position of the feature region and the feature type.
- The decoders 322 each decode one of the plurality of encoded feature region moving images and the encoded background region moving image. Specifically, the decoder 322a, the decoder 322b, the decoder 322c, and the decoder 322d decode the first feature region moving image, the second feature region moving image, the third feature region moving image, and the background region moving image, respectively.
- The image processing unit 330 combines the plurality of feature region moving images and the background region moving image expanded by the expansion unit 320 to generate one moving image. Specifically, the image processing unit 330 generates one display moving image by combining, with the captured images included in the background region moving image, the images of the feature regions in the captured images included in the plurality of feature region moving images. The image processing unit 330 may generate a display moving image in which the feature regions have higher image quality than the background region. For this image quality enhancing conversion processing, the super-resolution image processing means using the tensor projection of the present invention can be used.
- the image processing unit 330 outputs the characteristic area information and the display moving image acquired from the association analysis unit 302 to the display device 260 or the image DB 255 (see FIG. 13).
- The image DB 255 may record the position of the feature region indicated by the feature region information, the feature type of the feature region, and the number of feature regions, in association with information identifying the captured images included in the display moving image, on a nonvolatile recording medium such as a hard disk.
- the external information acquisition unit 380 acquires data used for image processing in the image processing unit 330 from the outside of the image processing apparatus 250.
- The image processing unit 330 performs image processing using the data acquired by the external information acquisition unit 380. The data acquired by the external information acquisition unit 380 will be described later.
- FIG. 22 illustrates an example of a block configuration of the image processing unit 330 included in the image processing apparatus 250 described with reference to FIG.
- The image processing unit 330 includes a parameter storage unit 1010, an attribute specifying unit 1020, a specific object region detecting unit 1030, a parameter selection unit 1040, a weight determination unit 1050, a parameter generation unit 1060, and an image generation unit 1070.
- the parameter storage unit 1010 stores a plurality of image processing parameters for increasing the image quality of the subject images of the respective attributes in association with the plurality of attributes of the subject images.
- the attribute specifying unit 1020 specifies the attribute of the subject image included in the input image.
- the input image may be a frame image obtained by the decompressing unit 320.
- the parameter selection unit 1040 preferentially selects, from the plurality of image processing parameters stored in the parameter storage unit 1010, those stored in association with attributes that match the attribute specified by the attribute specifying unit 1020.
- the image generation unit 1070 generates a high quality image obtained by improving the image quality of the subject image included in the input image using the plurality of image processing parameters selected by the parameter selection unit 1040 together. For this conversion process for improving image quality, the super-resolution image processing means using the tensor projection of the present invention is used.
- examples of the attribute include the state of the subject, such as the orientation of the subject. That is, the parameter storage unit 1010 stores a plurality of image processing parameters in association with a plurality of attributes indicating the state of the subject captured as a subject image. The attribute specifying unit 1020 specifies the state of the subject captured as the subject image included in the input image from the subject image.
- the state of the subject can be exemplified by the orientation of the subject when the image is taken.
- the direction of the subject may be, for example, the direction of a human face as an example of the subject.
- the parameter storage unit 1010 stores a plurality of image processing parameters in association with a plurality of attributes indicating the orientation of the subject captured as a subject image.
- the attribute specifying unit 1020 specifies the orientation of the subject captured as the subject image included in the input image from the subject image.
- the attribute may be the type of subject.
- examples of the type of subject include the sex of the person captured as the subject, the age of the person, the facial expression of the person, the gesture of the person, the posture of the person, the race of the person, and the wearing items worn by the person.
- the parameter storage unit 1010 may store a plurality of image processing parameters in association with a plurality of attributes including at least one of these various attributes.
- the attribute specifying unit 1020 specifies the corresponding attribute of the person imaged as the subject image included in the input image from the subject image.
- the weight determination unit 1050 determines weights for the plurality of image processing parameters used when improving the image quality of the subject image included in the input image. Then, based on the weights determined by the weight determination unit 1050, the image generation unit 1070 generates a high-quality image by improving the input image using the plurality of image processing parameters selected by the parameter selection unit 1040. Note that the weight determination unit 1050 may assign a larger weight to an image processing parameter associated with an attribute that matches the specified attribute more closely.
- the parameter generation unit 1060 generates a composite parameter obtained by combining a plurality of image processing parameters selected by the parameter selection unit 1040. Then, the image generation unit 1070 generates a high-quality image by increasing the image quality of the subject image included in the input image using the composite parameter generated by the parameter generation unit 1060.
- the image processing unit 330 may vary the strength of the image quality enhancement across the image.
- the parameter storage unit 1010 stores specific parameters, which are image processing parameters used to improve the image quality of images of a specific object, and non-specific parameters, which are image processing parameters used to improve the image quality of images for which no object is specified.
- the non-specific parameter may be a general-purpose image processing parameter that has a certain effect of improving the image quality regardless of the object.
- the specific object area detection unit 1030 detects a specific object area that is an area of the specific object from the input image.
- the specific object may be a subject object to be detected as a feature region.
- the weight determination unit 1050 determines the weights of the specific parameter and the non-specific parameter when the image quality of the input image in which the specific object area is detected is improved.
- the weight determination unit 1050 determines weights such that the weight for the specific parameter is larger than that for the non-specific parameter for the image of the specific object area in the input image. As a result, the image quality of the specific object to be detected as a feature region can be improved. In addition, the weight determination unit 1050 determines weights such that the weight for the non-specific parameter is larger than that for the specific parameter for the image of the non-specific object area, which is an area other than the specific object area. As a result, it is possible to prevent regions other than the specific object from being processed with image processing parameters dedicated to the specific object.
- the image generation unit 1070 generates a high quality image obtained by improving the quality of the input image using both the specific parameter and the non-specific parameter based on the weight determined by the weight determination unit 1050.
- the parameter storage unit 1010 stores specific parameters calculated by learning using a plurality of images of a specific object as learning images (also referred to as "training images"), and non-specific parameters calculated by learning using a plurality of images that are not images of the specific object as learning images. Thereby, specific parameters specialized for the specific object can be calculated, while general-purpose non-specific parameters applicable to various objects can also be calculated.
- Note that the pre-learning may use the edge information of the learning images rather than their luminance information itself. By using edge information in which information in the low spatial frequency region is reduced, it is possible to realize image quality enhancement that is robust against illumination fluctuations, in particular low-frequency illumination changes.
- the parameter generation unit 1060 may generate a composite parameter by combining the non-specific parameter and the specific parameter with the weight determined by the weight determination unit 1050.
- the image generation unit 1070 may generate a high quality image by improving the quality of the input image using the synthesis parameter generated by the parameter generation unit 1060.
- the image generation unit 1070 may also improve the image quality of the subject image included in the input image using each of a plurality of different combinations of predetermined image processing parameters. In that case, the image generation unit 1070 may select, based on comparison with the input image, at least one image from the plurality of resulting images and output the selected image as the high-quality image. For example, the image generation unit 1070 may preferentially select, as the high-quality image, the image whose content is most similar to the input image.
- the parameter selection unit 1040 may select different combinations of a plurality of image processing parameters based on the subject attributes specified from the input image.
- the image generation unit 1070 may improve the image quality of the subject image included in the input image using each of the selected combinations, select at least one image from the resulting images based on comparison with the input image, and output the selected image as the high-quality image.
- As described above, even if the parameter storage unit 1010 stores only a limited number of image processing parameters, the image processing apparatus 250 can improve the image quality of subject images having various attributes. Examples of image quality enhancement include resolution enhancement, multi-gradation, multi-color, noise reduction, artifact reduction, blur reduction, sharpening, and frame rate increase.
- the parameter storage unit 1010 can store these various image processing parameters for high image quality processing.
- the external information acquisition unit 380 illustrated in FIG. 21 acquires the image processing parameters stored in the parameter storage unit 1010 (see FIG. 22) from the outside.
- the parameter storage unit 1010 stores the image processing parameters acquired by the external information acquisition unit 380.
- the external information acquisition unit 380 acquires at least one of a specific parameter and a non-specific parameter from the outside.
- the parameter storage unit 1010 stores at least one of the specific parameter and the non-specific parameter acquired by the external information acquisition unit 380.
- FIG. 23 shows an example of parameters stored in the parameter storage unit 1010 in a table format.
- the parameter storage unit 1010 stores specific parameters A0, A1,... That are image processing parameters for a human face in association with the face orientation.
- the specific parameters A0 and A1 are calculated in advance by pre-learning using a corresponding face orientation image as a learning image.
- the calculation process of the specific parameter A by pre-learning will be described, taking as an example a resolution enhancement process in which the pixel value of a target pixel is computed by weighted addition of the pixel values of its peripheral pixels. Here, the pixel value y of the target pixel is calculated as y = Σ wi·xi, where Σ indicates addition over the peripheral pixels i, xi is the pixel value of peripheral pixel i, and wi is the weighting coefficient for xi. The specific parameter A is the set of weighting coefficients wi calculated by the pre-learning; for example, the weighting coefficients wi can be calculated from pairs of low-quality and high-quality learning images by an arithmetic process such as the least squares method. By performing this specific parameter calculation process on face images of each of a plurality of face orientations, the specific parameter A corresponding to each face orientation can be calculated.
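- To make this pre-learning concrete, the following is a minimal sketch of estimating the weighting coefficients wi by the least squares method from pairs of peripheral-pixel vectors and high-quality target pixels; the patch size, array shapes, and synthetic data are illustrative assumptions, not the patent's implementation.

```python
import numpy as np

def learn_specific_parameter(low_patches, high_pixels):
    """Estimate weighting coefficients w by least squares.

    low_patches : (N, K) array; each row holds the K peripheral pixel
                  values x_i taken from a low-quality learning image.
    high_pixels : (N,) array; the corresponding target pixel values y
                  from the high-quality learning image.
    Solves min_w ||low_patches @ w - high_pixels||^2.
    """
    w, *_ = np.linalg.lstsq(low_patches, high_pixels, rcond=None)
    return w  # this vector plays the role of a specific parameter A

def apply_specific_parameter(patch, w):
    """Reconstruct one high-quality pixel as y = sum_i w_i * x_i."""
    return patch @ w

# Toy usage: 3x3 neighbourhoods (K = 9) from 500 training examples.
rng = np.random.default_rng(0)
X = rng.random((500, 9))
true_w = rng.random(9)
y = X @ true_w                      # synthetic "high quality" targets
A = learn_specific_parameter(X, y)
print(apply_specific_parameter(X[0], A), y[0])
```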
- the parameter storage unit 1010 stores a non-specific parameter B for an object that is not a human face.
- the non-specific parameter B is calculated in advance by pre-learning using images of various subjects as learning images.
- the non-specific parameter B can be calculated by a pre-learning process similar to the specific parameter A.
- the non-specific parameter B can be calculated by using an image other than a person as a learning image instead of a face image.
- FIG. 24 shows an example of weighting specific parameters. Assume that areas 1210 and 1220 inside thick lines in the image 1200 are detected as feature areas.
- the weight determination unit 1050 determines the weight coefficient of the specific parameter as 100% and the weight coefficient of the non-specific parameter as 0% in the region 1210 inside the feature region. Further, in the region 1220 near the non-feature region outside the region 1210 in the feature region (inside the thick line frame), the weighting factor of the specific parameter is determined to be 80% and the weighting factor of the non-specific parameter is determined to be 20%.
- Further, in the region 1230 outside the feature region but near it, the weighting factor of the specific parameter is determined to be 50% and the weighting factor of the non-specific parameter to be 50%. In the region outside the feature region and farther away from it, the weighting factor of the specific parameter is determined to be 0% and the weighting factor of the non-specific parameter to be 100%.
- the weight determination unit 1050 determines a weight that gives a higher weight to the specific parameter for the image in the area inside the specific object area in the input image.
- the weight determination unit 1050 determines a weight that gives a higher weight to the specific parameter as it is closer to the specific object area, with respect to the image of the non-specific object area that is an area other than the specific object area.
- In this way, the weight determination unit 1050 decreases the weighting factor of the specific parameter stepwise from the center of the feature region outward toward the non-feature region. Instead of decreasing the weighting factor stepwise, the weight determination unit 1050 may decrease it continuously in accordance with the distance from the center of the feature region or the distance from the peripheral region of the feature region. For example, the weight determination unit 1050 may determine a weighting factor whose value decreases with the distance x according to a function such as 1/x, 1/x², or e⁻ˣ.
- the weight determination unit 1050 may control the weighting coefficient according to the detection reliability as the feature region. Specifically, the weight determination unit 1050 determines a weight that gives a higher weight to a specific parameter for an image of a specific object region having a higher detection reliability as the specific object region.
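- As a rough illustration of such distance-dependent weighting, the sketch below blends a specifically enhanced image and a non-specifically enhanced image with a specific-parameter weight that decays with distance according to the functions named above (1/x, 1/x², e⁻ˣ); the clipping and the per-pixel blending scheme are assumptions.

```python
import numpy as np

def specific_weight(distance, falloff="exp"):
    """Weight of the specific parameter as a function of distance x from
    the feature region (x = 0 inside the region).  Mirrors the text's
    example functions 1/x, 1/x^2 and e^-x; clipping is an assumption."""
    x = np.maximum(distance, 1e-6)          # avoid division by zero
    if falloff == "inv":
        w = 1.0 / x
    elif falloff == "inv_sq":
        w = 1.0 / x**2
    else:
        w = np.exp(-x)
    return np.clip(w, 0.0, 1.0)

def blend(img_specific, img_nonspecific, dist_map, falloff="exp"):
    """Per-pixel blend: the specific result dominates near the feature
    region and the non-specific result dominates far from it."""
    w = specific_weight(dist_map, falloff)[..., None]
    return w * img_specific + (1.0 - w) * img_nonspecific

# Toy usage with two 4x4 "enhanced" images and a distance map.
rng = np.random.default_rng(1)
a, b = rng.random((4, 4, 3)), rng.random((4, 4, 3))
dist = np.fromfunction(lambda i, j: i.astype(float), (4, 4))
print(blend(a, b, dist).shape)
```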
- According to the image processing unit 330, even in an area that is not detected as a feature region, image quality enhancement reflecting the effect of the specific parameter for the specific object is performed; it can therefore, in some cases, be easier to determine from the enhanced image whether or not the specific object exists there.
- the specific parameter may be an image processing parameter obtained by combining a plurality of image processing parameters described with reference to FIG.
- For example, when the face orientation specified from the image lies between the orientations associated with the specific parameters A0 and A1, the weight determination unit 1050 determines the weighting factor for the specific parameter A0 as 25% and the weighting factor for the specific parameter A1 as 75%.
- the parameter generation unit 1060 generates a composite parameter obtained by combining the specific parameter A0 and the specific parameter A1 with weighting factors of 25% and 75%, respectively.
- the image generation unit 1070 improves the image quality using an image processing parameter obtained by weighting the composite parameter generated by the parameter generation unit 1060 and the non-specific parameter in the ratio illustrated in FIG. 24. When the image processing parameters can be represented by weighting coefficients, the parameter generation unit 1060 may calculate the composite parameter by weighted addition of the weighting coefficients of the respective image processing parameters, using the weights determined by the weight determination unit 1050.
- Examples of image processing parameters that can be combined include spatial frequency components in the spatial frequency domain and pixel data itself (for example, image data of high frequency components). In addition, when the image processing parameters are expressed as vectors, matrices, tensors, n-dimensional mixed normal distributions, or n-dimensional mixed multinomial distributions, the parameter generation unit 1060 may generate the composite parameter by weighted addition or multiplication of these. Here, n is an integer of 1 or more.
- For example, the sum of the feature vector for the 0° orientation multiplied by a coefficient of 0.25 and the feature vector for the 20° orientation multiplied by a coefficient of 0.75 yields a feature vector corresponding to a 15° orientation.
- the parameter generation unit 1060 can calculate a composite parameter from the specific parameter and the non-specific parameter.
- the parameter generation unit 1060 can also calculate a composite parameter from a plurality of different specific parameters.
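- The following sketch shows this weighted synthesis for parameters expressed as vectors, reproducing the 0°/20° example above (0.25·0° + 0.75·20° = 15°); the normalization step is an assumption.

```python
import numpy as np

def synthesize_parameter(params, weights):
    """Weighted addition of image processing parameters expressed as
    vectors (the same idea extends to matrices or tensors)."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()   # normalise; an assumption
    return sum(w * p for w, p in zip(weights, params))

# Orientation example from the text: 0.25 * 0 deg + 0.75 * 20 deg = 15 deg.
A0 = np.array([0.0])    # stand-in parameter learned for 0 deg faces
A1 = np.array([20.0])   # stand-in parameter learned for 20 deg faces
A15 = synthesize_parameter([A0, A1], [0.25, 0.75])
print(A15)  # -> [15.]
```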
- When generating a high-quality image using the specific parameter and the non-specific parameter, the image generation unit 1070 may generate the high-quality image by adding, with the weighting coefficients determined by the weight determination unit 1050, the image information obtained by image processing using the specific parameter and the image information obtained by image processing using the non-specific parameter.
- the image generation unit 1070 may generate a high-quality image by performing image processing using non-specific parameters on image information obtained by performing image processing using specific parameters. Similar processing can be applied to high image quality processing using a plurality of specific parameters.
- Examples of the image data here include pixel values themselves, feature quantity vectors in a feature quantity space, matrices, n-dimensional mixed normal distributions, n-dimensional mixed multinomial distributions, and the like. For quantities that cannot be expressed as scalars, for example, performing vector interpolation in the feature vector space can reduce the blur caused by synthesis.
- As described above, a plurality of image processing parameters used for improving the image quality of a feature region may be selected by the parameter selection unit 1040 based on the orientation of the person's face specified from the image in the feature region. Then, the image generation unit 1070 generates one high-quality image using the plurality of image processing parameters selected by the parameter selection unit 1040.
- Alternatively, the image generation unit 1070 may generate a plurality of images in which the quality of the feature region is improved, using each of a plurality of combinations of the stored image processing parameters. Then, the image generation unit 1070 may output, as the high-quality image of the feature region, the image that is most similar to the image in the feature region among the obtained plurality of images.
- For example, the image generation unit 1070 generates an image in which the image of the feature region is improved in image quality using a composite parameter of the specific parameter A0 corresponding to the 0° orientation and the specific parameter A1 corresponding to the 20° orientation.
- the image generation unit 1070 further generates one or more images obtained by improving the image quality of the image of the feature region using the synthesis parameter of the specific parameter of one or more other combinations.
- the image generation unit 1070 compares each of the generated plurality of images with the image in the feature region, and calculates the degree of coincidence of the image contents.
- the image generation unit 1070 determines, as a high-quality image, an image that has the highest matching score among the plurality of generated images.
- Alternatively, the image generation unit 1070 may improve the image quality of the image in the feature region using each of a plurality of composite parameters based on a plurality of predetermined sets of specific parameters.
- the parameter selecting unit 1040 may select a plurality of sets of predetermined specific parameters without the attribute specifying unit 1020 specifying the face orientation.
- the parameter selection unit 1040 may select a plurality of specific parameter sets based on the face orientation of the person specified from the image in the feature area.
- the parameter selection unit 1040 stores information for specifying a plurality of sets of specific parameters and information for specifying the orientation of the person's face in association with each other, and the person's face specified from the image in the feature region A plurality of sets of specific parameters stored in association with each other may be selected. Then, a plurality of images in which the quality of the image in the feature area is improved may be generated by improving the quality of the image in the feature area by each of a plurality of synthesis parameters based on the selected plurality of sets.
- the image generation unit 1070 may also improve the image quality of the image in the feature region using each of a plurality of specific parameters. Then, the image generation unit 1070 may output, as the high-quality image of the feature region, the image most similar to the image in the feature region among the obtained plurality of images. Also in this case, the parameter selection unit 1040 may select a plurality of predetermined specific parameters without the attribute specifying unit 1020 performing the process of specifying the face orientation, or may select a plurality of specific parameters based on the orientation of the person's face specified from the image in the feature region.
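- A hedged sketch of this generate-and-select strategy: each candidate parameter set enhances the input, and the result judged most similar to the input is kept. The similarity measure (L2 distance) and the toy "enhancement" are assumptions; the text only requires comparing image contents.

```python
import numpy as np

def best_candidate(input_image, parameter_sets, enhance):
    """Enhance the input with each parameter set and keep the result
    whose content is most similar to the input (smallest L2 distance
    here; the patent only speaks of a 'degree of coincidence')."""
    candidates = [enhance(input_image, p) for p in parameter_sets]
    scores = [np.linalg.norm(c - input_image) for c in candidates]
    return candidates[int(np.argmin(scores))]

# Toy usage: "enhancement" is a per-parameter-set gain, for illustration.
rng = np.random.default_rng(2)
img = rng.random((8, 8))
sets = [0.9, 1.0, 1.1]
out = best_candidate(img, sets, lambda im, g: g * im)
print(np.allclose(out, img))  # the gain-1.0 candidate wins
```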
- an image processing parameter (specific parameter) for improving the image quality of a face image with a specific face direction can be calculated from a learning image with a specific face direction.
- the image processing parameters corresponding to each of the plurality of face orientations can be calculated by calculating the image processing parameters for each of the other face orientations in the same manner.
- the parameter storage unit 1010 stores the calculated image processing parameters in advance in association with the corresponding face orientations.
- the image processing parameters for improving the image quality of a face image may be parameters for improving the image quality of the entire face, or may be parameters for improving the image quality of at least some of the objects included in the face image, such as an eye image, a mouth image, a nose image, or an ear image.
- the face orientation is an example of the orientation of the subject, and for the orientation of other subjects, a plurality of image processing parameters respectively corresponding to the orientations of the plurality of subjects can be calculated in the same manner as the face orientation.
- the orientation of the human body can be exemplified as the orientation of the subject, and more specifically, the orientation of the body part, the orientation of the hand, etc. can be exemplified as the orientation of the human body.
- a plurality of image processing parameters for improving the image quality of a subject image obtained by capturing subjects in a plurality of directions can be calculated in the same manner as a face image.
- the direction of the subject is an example of the state of the subject, and the state of the subject can be further classified according to the facial expression of the person.
- the plurality of image processing parameters stored in the parameter storage unit 1010 improve the image quality of each face image having a different specific expression.
- the plurality of image processing parameters stored in the parameter storage unit 1010 improve the image quality of, for example, the face of a person in each emotional state and the face of a person in a tense state.
- the state of the subject can be classified according to the gesture of the person.
- the plurality of image processing parameters stored in the parameter storage unit 1010 improve the image quality of images of persons performing different specific gestures.
- the plurality of image processing parameters stored in the parameter storage unit 1010 improve the image quality of, for example, an image of a person running, an image of a person walking quickly, an image of a person about to start running, and an image of a person searching for something.
- the state of the subject can be classified according to the posture of the person.
- the plurality of image processing parameters stored in the parameter storage unit 1010 improve the image quality of images of persons in different specific postures.
- the plurality of image processing parameters stored in the parameter storage unit 1010 improve the image quality of, for example, an image of a person with a hunched back, an image of a person with a hand in a pocket, an image of a person with arms folded, and an image of a person whose face orientation does not match the body orientation.
- the state of the subject can be classified according to the person's wear.
- the plurality of image processing parameters stored in the parameter storage unit 1010 improve the image quality of the images of persons wearing different specific wearing items.
- the plurality of image processing parameters stored in the parameter storage unit 1010 improve the image quality of, for example, an image of a person wearing glasses, an image of a person wearing sunglasses, an image of a person wearing a mask, and an image of a person wearing a hat.
- a subject is classified into a plurality of attributes corresponding to a plurality of states of the subject.
- the subject can be classified into a plurality of attributes according to the type of the subject.
- As an example of the type of subject, the race of a person can be exemplified. Examples include races classified regionally, such as Asian or European, and races classified anthropologically.
- the plurality of image processing parameters stored in the parameter storage unit 1010 improve the image quality of images of persons classified into the corresponding races.
- the type of subject can be classified by the gender of the person, such as male or female.
- the plurality of image processing parameters stored in the parameter storage unit 1010 improve the image quality of images of persons of the corresponding gender, such as images of males or images of females.
- the types of subjects can be classified according to the age group of the person.
- the plurality of image processing parameters stored in the parameter storage unit 1010 improve the image quality of images of people of the corresponding age group, such as images of teenagers and images of people in their twenties.
- the attribute of the subject image is defined by the type of subject exemplified above, the plurality of states of the subject, or a combination thereof.
- the parameter storage unit 1010 stores in advance image processing parameters for improving the image quality of subject images belonging to each attribute in association with each specified attribute.
- the image processing parameters stored by the parameter storage unit 1010 can be calculated by a method similar to the method for calculating the image processing parameters for each face orientation. For example, when an attribute is defined by a facial expression, an image processing parameter for improving the image quality of the laughing face is calculated by pre-learning a plurality of images obtained by capturing the laughing face as learning images. be able to.
- a plurality of image processing parameters for improving the image quality of each facial image of each facial expression can be calculated by pre-learning images of other facial expressions such as an angry facial image in the same manner.
- Image processing parameters can be calculated in the same manner for each attribute defined by gesture, posture, wear, race, gender, age, and the like.
- the attribute specifying unit 1020 can specify the attribute of a subject image by applying a classifier calculated in advance by boosting such as AdaBoost to the subject image. For example, a plurality of face images capturing faces in a specific orientation are used as teacher images, and weak classifiers are integrated by boosting processing to generate a classifier. Whether or not a subject image is a face image of that specific orientation can be determined according to the correct/incorrect identification result obtained when the subject image is applied to the generated classifier; for example, when a positive identification result is obtained, it can be determined that the input subject image is a face image of the specific orientation.
- A classifier may be generated in this way for each of a plurality of face orientations. The attribute specifying unit 1020 can apply the plurality of classifiers to the subject image and specify the face orientation based on the correct/incorrect identification results obtained from the respective classifiers.
- one or more other attributes defined by facial expressions, gender, etc. can also be specified by applying a classifier generated for each attribute by boosting processing.
- the attribute specifying unit 1020 can specify an attribute by applying, to a subject image, a discriminator learned for each attribute by various methods such as a linear discriminating method and a mixed Gaussian model in addition to learning by boosting.
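- For illustration, the following sketch builds one AdaBoost classifier per face orientation with scikit-learn and picks the orientation whose classifier returns the most confident positive score; the pixel-vector features, the orientation set, and the confidence rule are assumptions rather than the patent's recipe.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

# One binary (one-vs-rest) boosted classifier per face orientation:
# positives are images of that orientation, negatives are the rest.
rng = np.random.default_rng(3)
orientations = [0, 20, 40]
X = rng.random((300, 64))                    # 300 face-image feature vectors
labels = rng.choice(orientations, size=300)  # synthetic ground truth

classifiers = {}
for o in orientations:
    clf = AdaBoostClassifier(n_estimators=25, random_state=0)
    clf.fit(X, (labels == o).astype(int))    # boosting of weak learners
    classifiers[o] = clf

def specify_orientation(x):
    """Apply every orientation classifier and return the orientation
    whose classifier is most confident about a positive result."""
    scores = {o: clf.decision_function(x[None])[0]
              for o, clf in classifiers.items()}
    return max(scores, key=scores.get)

print(specify_orientation(X[0]))
```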
- FIG. 25 shows an example of a block configuration of the display device 260 in FIG.
- the display device 260 includes an image acquisition unit 1300, a first image processing unit 1310, a feature region specifying unit 1320, a parameter determination unit 1330, a display control unit 1340, a second image processing unit 1350, and an external information acquisition unit 1380.
- the image acquisition unit 1300 acquires an input image.
- the input image here may be a frame image included in the moving image received from the image processing device 250.
- the first image processing unit 1310 generates a predetermined-quality image by improving the image quality of the input image using predetermined image processing parameters. For example, when increasing the resolution, the first image processing unit 1310 generates the predetermined-quality image using an image processing method whose required calculation amount is smaller than a predetermined value, such as simple interpolation enlargement.
- the display control unit 1340 causes the display unit 1390 to display the predetermined-quality image generated by the first image processing unit 1310. In this way, the display unit 1390 displays the predetermined-quality image.
- the feature area specifying unit 1320 specifies a plurality of feature areas in the input image.
- the feature area specifying unit 1320 may specify a plurality of feature areas in the input image in a state where the display unit 1390 displays a predetermined image quality image.
- the image processing device 250 may attach information identifying the feature regions to the moving image as accompanying information and transmit it to the display device 260.
- the feature region specifying unit 1320 may specify a plurality of feature regions by extracting information specifying the feature region from the accompanying information of the moving image acquired by the image acquisition unit 1300.
- the parameter determination unit 1330 determines, for each of the plurality of feature regions, an image processing parameter for further improving the image quality of the image of that feature region. For example, the parameter determination unit 1330 determines, for each of the plurality of feature regions, image processing parameters for improving the image quality of the images of the feature regions with different intensities. Here, "improving the image quality with different intensities" may mean improving the image quality with different amounts of calculation, with different amounts of calculation per unit area, or with image quality enhancement methods having different required amounts of calculation.
- the second image processing unit 1350 uses the image processing parameters determined by the parameter determination unit 1330 to generate a plurality of high-quality feature region images obtained by improving the image quality of the images of the plurality of feature regions.
- the display control unit 1340 causes the plurality of high-quality feature region images to be displayed in the corresponding feature regions of the predetermined-quality image displayed by the display unit 1390. In this way, at the stage where a high-quality image is generated, the display control unit 1340 displays it in place of the predetermined-quality image already displayed on the display unit 1390. Since the predetermined-quality image is generated and displayed quickly, the user can observe a monitoring image of a certain quality without substantial delay.
- the parameter determination unit 1330 may determine an image processing parameter for each of the plurality of feature regions based on the importance of each image of the plurality of feature regions. Information indicating the importance may be attached to the accompanying information. The importance may be determined in advance according to the type of subject in the feature area. The importance for each type of subject may be set by a user who observes the display unit 1390. The parameter determination unit 1330 determines an image processing parameter for improving the image quality of a feature region having a higher importance level with a higher intensity. Therefore, the user can observe an image in which the important feature region has a higher quality.
- the parameter determination unit 1330 determines an image processing parameter for each of the plurality of feature regions based on the type of feature of each image of the plurality of feature regions. Further, the parameter determination unit 1330 may determine image processing parameters for each of the plurality of feature areas based on the types of subjects imaged in the plurality of feature areas. Thus, the parameter determination unit 1330 may determine the image processing parameter directly according to the type of subject.
- the parameter determination unit 1330 determines the image processing parameter based on the required processing amount required to improve the image quality of each of the plurality of feature regions in the second image processing unit 1350. Specifically, the parameter determination unit 1330 determines an image processing parameter for increasing the image quality with higher strength when the required processing amount is smaller.
- the parameter determination unit 1330 may determine an image processing parameter for increasing the resolution with higher intensity when the areas of the plurality of feature regions are smaller. Then, the second image processing unit 1350 uses the image processing parameters determined by the parameter determination unit 1330 to generate a plurality of high quality feature region images obtained by increasing the resolution of the images of the plurality of feature regions. In addition, the parameter determination unit 1330 may determine an image processing parameter for increasing the image quality with higher intensity when the number of pixels in the plurality of feature regions is smaller.
- the parameter determination unit 1330 determines the image processing parameter based on the processable capacity that is the processing amount allowed by the second image processing unit 1350. Specifically, the parameter determination unit 1330 may determine an image processing parameter for improving the image quality with higher strength when the processable capacity is smaller.
- In this way, the degree of image quality enhancement can be controlled in accordance with the amount of calculation that the second image processing unit 1350 can process. For this reason, it may be possible to prevent the image quality enhancement processing from overloading the display unit 1390 side and delaying the display of the image, while if there is a margin in the calculation capacity, a high-quality image is generated quickly and can be observed.
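- As a toy illustration of how these cues might combine, the sketch below maps importance, required processing amount, region area, and processable capacity to an enhancement strength; the scoring formula and thresholds are entirely assumptions.

```python
def determine_strength(importance, required_ops, capacity, area):
    """Toy scoring rule combining the cues named in the text: higher
    importance, lower required processing amount, smaller area and
    larger spare capacity all push toward stronger enhancement.
    The weighting of the cues is an assumption."""
    score = importance * capacity / (required_ops * max(area, 1))
    if score > 1.0:
        return "high"
    if score > 0.1:
        return "medium"
    return "low"

# Toy usage: a small, important face region on a lightly loaded display.
print(determine_strength(importance=3.0, required_ops=2.0,
                         capacity=4.0, area=16))
```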
- the parameter determination unit 1330 determines an image processing parameter for increasing the resolution of each image of the plurality of feature regions for each of the plurality of feature regions.
- the second image processing unit 1350 uses the image processing parameters determined by the parameter determination unit 1330 to generate a plurality of high-quality feature region images obtained by increasing the resolution of the images of the plurality of feature regions.
- increasing the resolution with high intensity includes increasing the resolution with high accuracy and generating a high-quality image having a larger number of pixels.
- high image quality processing examples include high resolution, multi-gradation, multi-color processing, low noise, low artifacts, reduced blur, and sharpness.
- the parameter determination unit 1330 determines image processing parameters for various image quality enhancements for each of the plurality of feature regions, and the second image processing unit 1350 Using the image processing parameters determined by the parameter determination unit 1330, it is possible to generate a plurality of high quality feature region images obtained by improving the image quality of the images of the plurality of feature regions.
- the image acquisition unit 1300 may acquire a plurality of moving image constituent images included in the moving image as input images.
- the parameter determination unit 1330 determines an image processing parameter for increasing the frame rate of each of the plurality of feature regions for each of the plurality of feature regions.
- the second image processing unit 1350 may generate a plurality of high-quality feature region images with a high frame rate using the image processing parameters determined by the parameter determination unit 1330.
- the parameter determination unit 1330 determines the image processing parameter based on the frame rate of the moving image. Specifically, the parameter determination unit 1330 may determine an image processing parameter for improving the image quality with higher strength when the frame rate of the moving image is lower.
- the second image processing unit 1350 may generate a high-quality moving image by improving the image quality of each input image using the determined image processing parameters. Note that the image quality improvement by the second image processing unit 1350 is also the same as the image quality improvement by the image processing device 250.
- Here, image quality improvement may include the concepts of increasing the resolution, the number of colors, and the number of gradations, as well as noise reduction, reduction of artifacts such as block noise and mosquito noise, blur reduction, and sharpening; the second image processing unit 1350 can generate a high-quality image by these kinds of processing.
- the display device 260 can determine the strength of image quality improvement according to the amount of image data to be improved in image quality and the amount of computation that can be assigned to the image quality improvement processing. According to the display device 260, it is possible to quickly provide an image with a certain quality to the user, and it is possible to prevent the display of the image subjected to the high image quality processing from being extremely delayed. For this reason, the display device 260 can prevent an overload due to the high image quality processing, and can smoothly reproduce the moving image provided from the image processing device 250.
- the external information acquisition unit 1380 acquires a determination condition for determining an image processing parameter for each feature region from the outside of the display device 260.
- the parameter determination unit 1330 determines an image processing parameter for each of the plurality of feature regions based on the determination condition acquired by the external information acquisition unit 1380. Examples of the determination condition include conditions using parameters such as the importance of the feature region, the type of feature of the feature region, the required processing amount, the area of the feature region, the number of pixels in the feature region, the processable capacity, and the like.
- FIG. 26 shows an example of an image display area 1400.
- the display area 1400 is an area where an input image is displayed by the display unit 1390.
- Assume that three feature regions are specified from the input image, and that the images of these feature regions are displayed in the feature region area 1410, the feature region area 1420, and the feature region area 1430 within the display area 1400.
- the display control unit 1340 causes the display area 1400 of the display unit 1390 to display the acquired input image as it is.
- the second image processing unit 1350 performs, on the image of each feature region, predetermined resolution enhancement processing whose required calculation amount is smaller than a predetermined value, such as simple interpolation enlargement, and generates a predetermined-quality image of the image of each feature region (first resolution enhancement stage).
- In this first resolution enhancement stage, the second image processing unit 1350 performs the resolution enhancement with a predetermined intensity, irrespective of the amount of image data such as the number of pixels and the frame rate of the feature regions, the importance of the feature regions, the type of subject, and the allowable calculation amount in the second image processing unit 1350. Note that the amount of calculation required to perform resolution enhancement processing with the predetermined intensity over the entire area of the input image may be allocated to the second image processing unit 1350 at all times.
- the display control unit 1340 causes the predetermined-quality image 1412, the predetermined-quality image 1422, and the predetermined-quality image 1432 to be displayed in the corresponding feature region area 1410, feature region area 1420, and feature region area 1430, respectively.
- After that, the second image processing unit 1350 performs resolution enhancement processing at the intensity determined for each feature region by the parameter determination unit 1330, and generates a high-quality image of the image of each feature region (second resolution enhancement stage). In this second resolution enhancement stage, the intensity of the resolution enhancement is the intensity determined by the parameter determination unit 1330 based on the amount of image data such as the number of pixels and the frame rate of the feature regions, the importance of the feature regions, the type of subject, and the allowable calculation amount in the second image processing unit 1350.
- the display control unit 1340 causes the high-quality image 1414, the high-quality image 1424, and the high-quality image 1434 to be displayed in the corresponding feature region area 1410, feature region area 1420, and feature region area 1430, respectively.
- As described above, since the second image processing unit 1350 performs resolution enhancement at an intensity corresponding to the current load and the amount of calculation required for the enhancement, a high-quality image can be provided to the user as quickly as possible.
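- The two-stage behaviour can be sketched as follows: a cheap, fixed-strength enlargement is shown first, then each region is re-rendered at its determined strength and swapped in. Nearest-neighbour enlargement and the gain-based second stage stand in for the patent's simple interpolation and tensor-projection super-resolution.

```python
import numpy as np

def quick_enhance(region):
    """First stage: cheap, fixed-strength enlargement (here a 2x
    nearest-neighbour repeat stands in for simple interpolation)."""
    return region.repeat(2, axis=0).repeat(2, axis=1)

def strong_enhance(region, strength):
    """Second stage stand-in: the real system would apply the
    tensor-projection super-resolution at the determined strength."""
    return np.clip(quick_enhance(region) * strength, 0.0, 1.0)

def display_pipeline(regions, strengths, show):
    # Stage 1: show every feature region quickly at a fixed quality.
    for k, r in regions.items():
        show(k, quick_enhance(r))
    # Stage 2: replace each preview once its high-quality version is ready.
    for k, r in regions.items():
        show(k, strong_enhance(r, strengths[k]))

rng = np.random.default_rng(4)
regions = {1410: rng.random((4, 4)), 1420: rng.random((4, 4))}
display_pipeline(regions, {1410: 1.0, 1420: 1.2},
                 lambda k, img: print(k, img.shape))
```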
- FIG. 27 shows an example of an image processing system 201 according to another embodiment.
- the configuration of the image processing system 201 in this embodiment is the same as that of the image processing system 200, except that the imaging apparatuses 210a-d include image processing units 804a-d (hereinafter collectively referred to as image processing units 804).
- the image processing unit 804 has components other than the image acquisition unit 222 among the components included in the image processing apparatus 220 described in FIG.
- the functions and operations of the components included in the image processing unit 804 may be substantially the same as those of the corresponding components of the image processing apparatus 220, except that whereas the components of the image processing apparatus 220 process the moving image obtained by the decompression processing of the compressed moving image decompression unit 224, the components of the image processing unit 804 process the moving image captured by the imaging unit 212. Also in the image processing system 201 having such a configuration, substantially the same effects as those described for the image processing system 200 in relation to FIGS. 13 to 26 can be obtained.
- the image processing unit 804 may acquire, from the imaging unit 212, a moving image including a plurality of captured images expressed in the RAW format, and compress the plurality of captured images expressed in the RAW format included in the acquired moving image while they remain in the RAW format.
- the image processing unit 804 may detect one or more feature regions from a plurality of captured images expressed in the RAW format.
- the image processing unit 804 may compress a moving image including a plurality of captured images in the compressed RAW format. Note that the image processing unit 804 can compress the moving image by the compression method described as the operation of the image processing apparatus 220 in relation to FIGS. 13 to 18.
- the image processing apparatus 250 can acquire a plurality of captured images expressed in the RAW format by expanding the moving image acquired from the image processing unit 804.
- the image processing apparatus 250 enlarges each of the plurality of captured images expressed in the RAW format acquired by the expansion for each area, and performs a synchronization process for each area.
- the image processing apparatus 250 may perform synchronization processing with higher accuracy in the feature region than in the region other than the feature region.
- the image processing device 250 may perform super-resolution processing on the image of the feature region in the captured image obtained by the synchronization processing.
- As the super-resolution processing in the image processing apparatus 250, the super-resolution means using the tensor projection according to the present invention can be applied.
- the image processing apparatus 250 may perform super-resolution processing for each object included in the feature area. For example, when the feature region includes a human face image, the image processing apparatus 250 performs super-resolution processing for each face part (for example, eyes, nose, mouth, etc.) as an example of the object. In this case, the image processing apparatus 250 stores learning data such as a model as described in JP-A-2006-350498 for each face part (for example, eyes, nose, mouth). Then, the image processing device 250 may perform super-resolution processing on the image of each face part using the learning data selected for each face part included in the feature region.
- Learning data such as a model may be stored for each combination of a plurality of facial expressions, a plurality of face directions, and a plurality of illumination conditions.
- the facial expressions include the face in each emotional state and a straight face, and the face directions include front, up, down, right, left, and rear.
- Illumination conditions include conditions for illumination intensity and illumination direction.
- the image processing apparatus 250 may perform super-resolution processing on the face image using learning data corresponding to a combination of facial expression, face direction, and illumination conditions.
- the facial expression and face direction can be specified based on the image content of the face image included in the feature area.
- the facial expression can be specified from the shape of the mouth and / or eyes, and the direction of the face can be specified from the positional relationship of the eyes, mouth, nose, and ears.
- the illumination intensity and direction of the face can be specified based on the image content of the face image, such as the position and size of the shadow.
- the facial expression, face direction, and illumination condition may be specified by the image processing unit 804, and the specified facial expression, face direction, and illumination condition may be transmitted from the output unit 236 in association with the image.
- the image processing apparatus 250 may perform super-resolution processing using learning data corresponding to facial expressions, face directions, and illumination conditions received from the output unit 236.
- a model for each part of the face can be used in addition to a model representing the entire face.
- gender and / or racial face models can be used.
- the model is not limited to a person, but a model can be stored for each type of object to be monitored, such as a vehicle or a ship.
- the image processing apparatus 250 can reconstruct the image of the feature region using the local preserving projection (LPP).
- As the image reconstruction method in the image processing apparatus 250 and the learning method for that image reconstruction, locality preserving methods other than the local preservation projection (LPP), such as locally linear embedding (LLE), can also be used.
- the learning data may include low frequency components and high frequency components of the object image, extracted respectively from a large number of sample images of the object. The low frequency components of the object images may be clustered into a plurality of clusters for each of a plurality of object types, for example by the K-means method, and a representative low frequency component (for example, a centroid value) may be determined for each cluster.
- the image processing device 250 extracts a low frequency component from the image of the object included in the feature region of the captured image. Then, from among the clusters of low frequency components extracted from sample images of objects of that type, the image processing apparatus 250 identifies the cluster whose representative low frequency component matches the extracted low frequency component. Then, the image processing apparatus 250 identifies the cluster of high frequency components associated with the low frequency components included in the identified cluster. In this way, the image processing apparatus 250 can identify a cluster of high frequency components correlated with the low frequency component extracted from the object included in the captured image.
- the image processing device 250 may then convert the image of the object into a higher-quality image using a high frequency component representative of the identified cluster of high frequency components. For example, the image processing apparatus 250 may add, to the image of each object, the high frequency component selected for that object, with a weight corresponding to the distance from the center of the object to the processing target position on the face.
- Note that the representative high frequency component may be generated by closed-loop learning. Since the image processing apparatus 250 thus selects and uses, for each object, the desired learning data from among the learning data generated by learning for each object, the image of the object may in some cases be enhanced with higher accuracy.
- the image processing apparatus 250 can also improve the image quality of the input image by using the stored low frequency component and high frequency component without performing clustering by the k-means method or the like.
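- A minimal sketch of the clustered low/high frequency pairing, assuming K-means over low-frequency feature vectors and per-cluster mean high-frequency components as the representatives (the patent leaves the representative choice open, e.g. closed-loop learning).

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(5)
low_freq = rng.random((200, 16))    # low-frequency features of samples
high_freq = rng.random((200, 16))   # matching high-frequency features

# Cluster the low-frequency components (K-means, as the text suggests)
# and keep, per cluster, a representative high-frequency component --
# here the mean of the cluster members, which is an assumption.
km = KMeans(n_clusters=8, n_init=10, random_state=0).fit(low_freq)
rep_high = np.stack([high_freq[km.labels_ == c].mean(axis=0)
                     for c in range(8)])

def enhance_component(low_component):
    """Find the cluster whose representative matches the input
    low-frequency component and return the paired representative
    high-frequency component to add back to the object image."""
    c = km.predict(low_component[None])[0]
    return rep_high[c]

print(enhance_component(low_freq[0]).shape)
```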
- Note that the image processing apparatus 250 may store, as learning data, low resolution edge components, which are edge components extracted from the patches of low resolution learning images, and high resolution edge components, which are edge components extracted from the patches of high resolution learning images. These edge components may be stored as vectors in an eigenspace such as an LPP eigenspace.
- the image processing apparatus 250 extracts edge components for each patch from the enlarged image obtained by enlarging the input image by a predetermined method such as bicubic. For each patch in the input image, the image processing device 250 calculates a norm between the extracted edge component and the stored edge component on an eigenspace such as LPP. The image processing apparatus 250 selects, from the stored patches, a plurality of patches for which a norm smaller than a predetermined value is calculated. Then, the image processing apparatus 250 sets a Markov random field of the extracted edge component and the high-resolution edge component of the selected plurality of patches for the patch of interest and its surrounding patches.
- the image processing apparatus 250 solves the energy minimization problem of the Markov random field model set for each patch of interest using loopy belief propagation (LBP) or the like, thereby selecting, for each patch of interest, the high resolution edge component to be added to the image of that patch from among the stored high resolution edge components.
- the image processing apparatus 250 generates a high-quality image by adding each high-resolution edge component selected for each patch to the image component of each patch of the enlarged image.
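- The patch-matching step can be sketched as below: for each input patch's edge component, stored patches within a norm threshold in the eigenspace are shortlisted. The full method would then resolve the choice with a Markov random field solved by loopy belief propagation; taking the single nearest candidate here is a simplification, and the dimensions and threshold are assumptions.

```python
import numpy as np

rng = np.random.default_rng(6)
low_edges = rng.random((100, 10))   # stored low-res edge vectors (eigenspace)
high_edges = rng.random((100, 10))  # paired high-res edge vectors

def candidate_high_edge(patch_edge, threshold=1.2):
    """Shortlist stored patches whose eigenspace norm to the input edge
    component is below a threshold, as described in the text, then pick
    the nearest one (the MRF/LBP step is omitted for brevity)."""
    norms = np.linalg.norm(low_edges - patch_edge, axis=1)
    idx = np.where(norms < threshold)[0]
    if idx.size == 0:                      # fall back to the nearest patch
        idx = np.array([np.argmin(norms)])
    best = idx[np.argmin(norms[idx])]
    return high_edges[best]

patch = rng.random(10)
print(candidate_high_edge(patch).shape)    # high-res edge to add back
```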
- In addition, the image processing apparatus 250 can improve the image quality of an input image using a Gaussian mixture model with a plurality of classes. For example, the image vectors of the patches in the low resolution learning images and the image vectors of the corresponding patches in the high resolution learning images are used as learning data. Using cluster vectors obtained from the image vectors of the patches in the low resolution learning images, the mean and variance of the density distribution corresponding to each class of the Gaussian mixture model, and the weight of each class, are calculated by the EM algorithm or the like. The image processing apparatus 250 stores these means, variances, and weights as learning data.
- When improving the image quality of an input image, the image processing apparatus 250 generates a high-quality image using the image vector of each patch in the input image, the cluster vector obtained from that image vector, and the means, variances, and weights stored as learning data.
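- A brief sketch of the learning side using scikit-learn's GaussianMixture, whose EM fit yields exactly the stored quantities named above (means, variances, class weights); how the class responsibilities are used at enhancement time is an assumption.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(7)
cluster_vectors = rng.random((500, 12))   # from low-res learning patches

# EM fitting of a multi-class Gaussian mixture: the learned means,
# (diagonal) variances and class weights are the quantities the text
# says are stored as learning data.
gmm = GaussianMixture(n_components=5, covariance_type="diag",
                      random_state=0).fit(cluster_vectors)

means, variances, weights = gmm.means_, gmm.covariances_, gmm.weights_
print(means.shape, variances.shape, weights.shape)

# At enhancement time, class responsibilities for an input patch vector
# could weight class-specific reconstructions (an assumption about how
# the stored quantities are combined).
resp = gmm.predict_proba(cluster_vectors[:1])
print(resp.round(3))
```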
- Note that the image processing apparatus 250 can also generate a high-quality image from the input image alone by using contour information extracted from the input image. For example, when increasing the resolution of a specific image region near a contour extracted from the input image, the image processing apparatus 250 can generate a high-quality image of the specific image region by arranging, in the specific image region, the pixel values of pixels included in other regions along the contour. For example, the image processing apparatus 250 may determine at which position in the specific image region the pixel value of a pixel included in another region should be arranged, based on the positional relationship between the position of that pixel and the position of the contour, and increase the resolution of the specific image region by arranging the pixel value at the determined position.
- the image processing apparatus 250 may perform the high resolution processing using the contour information limited to the vicinity of the edge region including the edge in the input image.
- the image area other than the edge area may be increased in resolution by a filter method or the like.
- the image processing apparatus 250 may increase the resolution of a flat region from which an edge amount equal to or less than a predetermined amount is extracted using a filter method.
- Note that the image processing apparatus 250 may further increase the resolution by modifying the image whose resolution was increased by the filter method so that conditions generated from the input image are satisfied.
- the parameter storage unit 1010 can store the parameters used for the image quality enhancement processing by the image processing apparatus 250, for example, the high frequency component data corresponding to low frequency components, the filters for increasing the resolution of flat regions, and the learning data related to the Gaussian mixture model.
- As the image quality enhancement processing of the feature region, processing using the tensor projection based on the local preservation projection (LPP) according to the present invention can be applied.
- Face images with different resolutions, persons, and patch positions are used as learning images for calculating the fourth-order tensors whose learning targets are resolution, patch positions, individuals, and pixels.
- eigenvectors in the eigenspace are calculated for the resolution, patch position, person, and pixel value, respectively.
- the fourth-order tensor based on the product of the calculated eigenvectors is used when generating a medium-resolution face image from the face image included in the input image.
- the eigenvectors can be calculated by learning using an eigenvalue decomposition method, locality preserving projection (LPP), or the like. Note that the high-resolution patches used to recover high-frequency components from a medium-resolution face image are obtained from the high-resolution learning images.
- the image processing apparatus 250 stores the obtained tensor and high resolution patch.
- the image processing apparatus 250 converts the face image patch by patch using the stored fourth-order tensor, thereby obtaining the patches that form a medium-resolution face image. The image processing apparatus 250 then sets up a Markov random field between the medium-resolution patches and the stored high-resolution patches. By solving the energy minimization problem over all patches of the Markov random field model using iterated conditional modes (ICM, a sequential improvement method) or the like, a high-resolution face image in which the high-frequency components have been recovered can be obtained.
- the output image of the adding unit 160 (or the combining unit 166) in FIG. 6 can serve as this medium-resolution face image.
- the "medium resolution" image is then input to the energy minimization problem of the Markov random field model and solved to obtain the "high resolution" output image.
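A minimal sketch of the ICM energy minimization over the patch-grid Markov random field might look as follows. The cost structure (a unary data term per stored candidate patch and a pairwise neighbor-compatibility term) and all names are illustrative assumptions.

```python
import numpy as np

def icm_select_patches(unary, pairwise, neighbors, n_iters=10):
    """Iterated conditional modes (ICM) over a patch-grid MRF.

    unary[i][k]  : data cost of assigning candidate k to patch site i
                   (e.g. distance between the medium-resolution patch and
                   the stored candidate's medium-frequency version).
    pairwise(i, k, j, l): compatibility cost between candidate k at site i
                   and candidate l at neighboring site j.
    neighbors[i] : list of sites adjacent to site i.
    """
    n_sites = len(unary)
    labels = [int(np.argmin(u)) for u in unary]  # greedy initialization
    for _ in range(n_iters):
        changed = False
        for i in range(n_sites):
            costs = []
            for k in range(len(unary[i])):
                c = unary[i][k] + sum(pairwise(i, k, j, labels[j])
                                      for j in neighbors[i])
                costs.append(c)
            best = int(np.argmin(costs))
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:
            break  # a local minimum of the MRF energy has been reached
    return labels
```

ICM updates one site at a time to its locally cheapest label, so each sweep monotonically decreases the total energy until a local minimum is reached.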
- the image processing apparatus 250 may perform a process of generating a low-resolution face image from the face image included in the input image as a pre-process for obtaining a medium-resolution patch.
- the image processing apparatus 250 obtains a medium-resolution patch by converting the low-resolution face image obtained by the preprocessing with the above-described fourth-order tensor.
- the pre-processing can include a process of converting a face image included in the input image using a fifth-order tensor obtained with respect to the face direction, lighting level, facial expression, person, and pixels.
- as learning images for this fifth-order tensor, face images that differ in face orientation, illumination level, facial expression, and person can be used.
- the pre-processing includes a registration process of the face image included in the input image.
- the face image may be aligned by affine transformation.
- the affine transformation parameters are optimized to match the positions of the face image after affine transformation and the learning face image.
- it is desirable to perform the alignment process so that the learning face images are aligned with each other.
- eigenvectors are calculated by locality preserving projection (LPP) from each of the low-resolution images and the high-resolution images serving as learning images.
- the low-resolution images and the high-resolution images are associated, as network weights, by a radial basis function.
- a residual image between the medium-resolution image obtained by inputting the low-resolution learning image and that low-resolution image, and a residual image between the high-resolution learning image and the medium-resolution image, are calculated.
- the image processing apparatus 250 stores, for each patch, the residual image between the medium-resolution image and the low-resolution image and the residual image between the high-resolution image and the medium-resolution image.
- when improving the quality of an input image, the image processing apparatus 250 generates a medium-resolution image from the input image using the eigenvectors obtained by locality preserving projection (LPP) in the learning stage and the radial basis function.
- the image processing apparatus 250 then calculates a residual image between this medium-resolution image and the input face image. Based on that residual image, a residual image between the corresponding high-resolution image and medium-resolution image is selected for each patch from the stored residual images by locally linear embedding (LLE), nearest-neighbor search, or the like. The image processing apparatus 250 then adds a smoothed version of the selected residual image between the high-resolution image and the medium-resolution image to the medium-resolution image generated from the input image, thereby generating a high-quality image.
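The restoration-stage residual compensation just described could be sketched as follows, using a plain nearest-neighbor search in place of the LLE-based selection and Gaussian smoothing of the chosen residual; the array layouts and names are assumptions for illustration.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def compensate_residual(mid_patches, input_patches,
                        stored_mid_low_residuals, stored_high_mid_residuals,
                        smooth_sigma=1.0):
    """For each patch, pick the stored high/medium residual whose paired
    medium/low residual is closest to the observed one, smooth it, and add
    it to the medium-resolution patch.

    mid_patches, input_patches     : lists of (h, w) arrays of equal size
                                     (the input patch is assumed enlarged
                                     to the medium-resolution patch size).
    stored_*_residuals             : (N, h, w) arrays of learned residuals.
    """
    out = []
    for mid, low in zip(mid_patches, input_patches):
        observed = (mid - low).ravel()
        keys = stored_mid_low_residuals.reshape(len(stored_mid_low_residuals), -1)
        # Nearest neighbor among the stored medium/low residuals.
        idx = int(np.argmin(((keys - observed) ** 2).sum(axis=1)))
        # Smooth the paired high/medium residual before adding it back.
        residual = gaussian_filter(stored_high_mid_residuals[idx], smooth_sigma)
        out.append(mid + residual)
    return out
```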
- the image processing unit 804 may calculate the weighting coefficient from the image of the object included in the feature area in the compression process for compressing the image of the feature area in the plurality of captured images acquired from the imaging unit 212. In other words, the image processing unit 804 can compress the image of the object included in the feature area by representing the principal component vector and the weighting coefficient. Then, the image processing unit 804 may transmit the principal component vector and the weighting coefficient to the image processing device 250.
- the image processing apparatus 250 can reconstruct an image of an object included in the feature region using the principal component vector and the weighting coefficient acquired from the image processing unit 804.
- needless to say, the image processing unit 804 can compress the image of an object included in a feature region using a model that represents the object with various feature parameters, in addition to a model based on principal component analysis, as described in JP-A-2006-350498.
- the image processing apparatus 250 or the display device 260 can apply the above-described super-resolution processing to the image of the feature region as the image quality enhancement processing.
- the compression unit 232 can also further compress the captured image by representing the image with principal component vectors and weighting coefficients, as in the image processing apparatus 220 described above.
- the present invention can be applied to high image quality processing and encoding for a document scanned by a scanner device such as a copying machine.
- the image quality enhancement processing such as the super-resolution processing described above can be applied as the resolution enhancement processing for those regions.
- the feature region detection processing and compression processing described above can be applied to the detection and encoding of the feature regions.
- the above-described feature region detection processing, high image quality processing, and compression processing can be applied to detection of a body part, high image quality, and encoding.
- <Modification 1> In the image processing systems 200 and 201 described above, an example in which a plurality of imaging devices 210a-d are provided has been described, but the number of imaging devices 210 is not particularly limited and may be one. The number of display devices 260 is also not particularly limited and may be one.
- <Modification 2> In the image processing systems 200 and 201 described above, the feature region is specified from the captured images (frame images or field images) in moving image data, but the present invention is not limited to moving image data and can also be applied to still image data.
- <Modification 3> In the image processing systems 200 and 201 described above, a configuration capable of detecting a plurality of feature regions from one captured image has been described, but the number of feature regions is not particularly limited; there may be one feature region per captured image.
- <Modification 4> The means for acquiring the learning image group is not limited to a mode in which pairs of high-quality and low-quality images are prepared in advance; an image pair may also be obtained by providing only a high-quality image and generating the low-quality image from it.
- for example, a mode is also possible in which the image processing apparatus is equipped with processing means for reducing image quality (image-quality-reduction processing means), and a learning image pair is acquired by inputting a high-quality learning image and reducing its quality within the apparatus.
- the learning images are not limited to those provided from a database prepared in advance; the learning content can also be updated based on images actually captured by the imaging device 210 during operation of the system, or on images (partial images) cut out from them.
- <Modification 5> In the above-described embodiment, an example in which image data is learned and image conversion for higher image quality is performed has been described. However, the present invention is not limited to image quality improvement processing, and can be applied to other image conversions such as image recognition. Further, the data to be processed is not limited to images, and the invention can be applied similarly to various kinds of data other than images. That is, the configurations described as the image processing device, the image processing unit, and the image processing system can be expanded as a data processing device, a data processing unit, and a data processing system.
- the similarity to a specific person (for example, "Mr. A") can be determined from the positional relationship between the learning data and newly input data in the intermediate eigenspace (here, the individual-difference eigenspace).
- there are various possible conditions for the input face image, such as facing front, facing left, or facing right; whatever the orientation of the input, the coefficient vectors gather at one point on the intermediate eigenspace (for example, the individual-difference eigenspace) via the orientation modality. By using this property, a new effect is obtained: one or more input conditions can be handled accurately with a single criterion.
- <Application example to speech recognition> As an example of handling data other than images, an application to speech recognition will be described. Audio data is targeted instead of image data, the same processing as the processing up to the intermediate eigenspace of the image quality enhancement processing described with FIGS. 2, 3, 6, etc. is performed, and speech can be recognized using the positional relationship of the coefficient vectors in the intermediate eigenspace. For the positional relationship, the distance, direction, and the like may be obtained as in the "coefficient vector correction processing unit 140"; the closer the distance and direction of the obtained input data are to the learning data, the higher the possibility that it is the determination target.
- the audio sampling-rate modality (low resolution, high resolution) of the audio data corresponds to the pixel modality (low resolution, high resolution) described for the image data.
- the signal-to-noise ratio (S/N) and the positions of the sound source and the microphone (sensor) can also be handled as modalities.
- since the determination is made on a common eigenspace for speech recognition (corresponding to the "intermediate eigenspace"), even in the case of a plurality of sampling rates and quantization bit depths, recognition and handling can be performed in common based on a single determination criterion. There is therefore the effect that the determination criterion need not be adjusted for each case.
- by applying the tensor projection while suppressing the low-frequency components of the input, the effects of disturbances and noise contained in the low-frequency components can be removed, and robustness against such low-frequency disturbances and noise can be improved.
- <Application example to language processing> As another example of handling data other than images, an application to language processing will be described. Language data (speech data or text data) is targeted instead of image data, the same processing as the processing up to the intermediate eigenspace of the image quality enhancement processing described with FIGS. 2, 3, 6, etc. is performed, and language processing can be performed using the positional relationship of the coefficient vectors in the intermediate eigenspace. For the positional relationship, the distance, direction, and the like may be obtained as in the "coefficient vector correction processing unit 140"; the closer the distance and direction of the obtained input data are to the learning data, the higher the possibility that it is the determination target.
- the language modality (Japanese, English) corresponds to the pixel modality (low resolution, high resolution) described for the image data.
- regions (dialects), uses (formal (news), informal), eras (Heian, Edo, modern), and generations (high school students, seniors) can also be treated as modalities.
- <Application example to biological information processing> Biological information includes, for example, the waveforms, periods, and amplitudes of heartbeat, pulse, blood pressure, respiration, and perspiration.
- when biological information data is targeted, the same processing as the processing up to the intermediate eigenspace of the image quality enhancement processing described with FIGS. 2, 3, 6, etc. is performed, and the biological information can be processed using the positional relationship of the coefficient vectors in the intermediate eigenspace.
- for the positional relationship, the distance, direction, and the like may be obtained as in the "coefficient vector correction processing unit 140"; the closer the distance and direction of the obtained input data are to the learning data, the higher the possibility that it is the determination target.
- the sampling-rate modality (low resolution, high resolution) of the biological data corresponds to the pixel modality (low resolution, high resolution) described for the image data.
- the signal-to-noise ratio (S/N) and the positions of the signal source and the sensor can also be handled as modalities.
- since the determination is made on a common eigenspace for biological information processing (corresponding to the "intermediate eigenspace"), even in the case of a plurality of sampling rates and quantization bit depths, recognition and handling can be performed in common based on a single determination criterion. There is therefore the effect that the determination criterion need not be adjusted for each case.
- by applying the tensor projection while suppressing the low-frequency components of the input, the effects of disturbances and noise contained in the low-frequency components can be removed, and robustness against such low-frequency disturbances and noise can be improved.
- <Application example to natural/physical information processing> Natural/physical information includes, for example, weather, climate, and the waveforms, periods, and amplitudes of earthquakes.
- when natural/physical information data is targeted, the same processing as the processing up to the intermediate eigenspace of the image quality enhancement processing described with FIGS. 2, 3, 6, etc. is performed, and the natural/physical information can be processed using the positional relationship of the coefficient vectors in the intermediate eigenspace.
- for the positional relationship, the distance, direction, and the like may be obtained as in the "coefficient vector correction processing unit 140"; the closer the distance and direction of the obtained input data are to the learning data, the higher the possibility that it is the determination target.
- the sampling-rate modality (low resolution, high resolution) of the data corresponds to the pixel modality (low resolution, high resolution) described for the image data.
- the signal-to-noise ratio (S/N) and the positions of the signal source and the sensor can also be handled as modalities.
- even in the case of a plurality of sampling rates and quantization bit depths, recognition and handling can be performed in common with a single kind of determination criterion. There is therefore the effect that the determination criterion need not be adjusted for each case.
- by applying the tensor projection while suppressing the low-frequency components of the input, the effects of disturbances and noise contained in the low-frequency components can be removed, and robustness against such low-frequency disturbances and noise can be improved.
- DESCRIPTION OF SYMBOLS 100 ... Image processing apparatus, 102 ... Low-resolution enlargement processing unit, 104 ... High-pass filter, 108 ... LPP projection tensor generation unit
Abstract
Description
First, the principle of the projective transformation will be explained. As a preparatory stage for the processing that restores a high-quality image from a low-quality input image, face image data for a plurality of persons is learned in advance to obtain a function that defines the conversion relationship. Such processing is called the learning step. The process of obtaining a high-quality output image from an arbitrary (low-quality) input image using the conversion function obtained in this learning step is called the restoration step.
First, as the learning image set, a group of learning images is prepared in which low-resolution and high-resolution face images for a plurality of persons (for example, 60 persons) are paired. The learning image set used here employs, as the low-resolution learning images, images whose quality has been reduced by decreasing the information under a certain condition, for example by thinning out pixels from the high-resolution learning images at a fixed ratio. By learning in advance the pair-wise correspondence between a low-resolution learning image generated by this information reduction and the corresponding original high-resolution learning image (an image of the same content for the same person), the conversion function (the tensor defining the projection) is generated.
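As a minimal illustration of this information reduction, the following sketch (grayscale numpy arrays; the function name and decimation factor are illustrative) thins out pixels at a fixed ratio to form a learning pair:

```python
import numpy as np

def make_learning_pair(high_res, factor=4):
    """Create a low-resolution learning image from a high-resolution one by
    thinning out pixels at a fixed ratio, as described above.

    high_res: 2-D numpy array (grayscale image).
    Returns the (low, high) pair used for learning the projection tensor.
    """
    low_res = high_res[::factor, ::factor].copy()  # keep every factor-th pixel
    return low_res, high_res
```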
The modalities are not limited to the example of Table 1; further multi-modality extension is also possible. For example, various modalities can be added (see Table 2), such as 10 patterns in which the face direction is changed in 10 steps over the range "facing right - front - facing left", 4 patterns of facial expression (normal, smiling, angry, shouting), and 5 patterns in which the illumination direction is changed in 5 steps of 45 degrees over the range "directly right - front - directly left".
FIG. 1 is a conceptual diagram of tensor projection. For convenience of illustration, a three-dimensional space is used for the explanation here, but it can be extended to an arbitrary finite number of dimensions (N dimensions). The tensor projection enables movement from a certain real space R to an eigenspace (also called a "feature space") A, and also enables movement (projection) among a plurality of eigenspaces A, B, and C.
H = U_pixels · G_H · V

On the other hand, the low-resolution pixel vector L in the pixel real space is similarly given by the following expression.

L = U_pixels · G_L · V

Therefore, in order to go from a low-resolution image in the pixel real space (the low-resolution pixel vector L) through the pixel eigenspace to the individual-difference eigenspace, and then back through the pixel eigenspace to the pixel real space to obtain a high-resolution image in the pixel real space, the conversion can be performed by the following projection.

H = U_pixels · G_H · V = U_pixels · G_H · (U_pixels · G_L)^(−1) · L
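The chain of projections in the last expression can be checked numerically with the following sketch, which uses random matrices of illustrative sizes and a pseudo-inverse for (U_pixels · G_L)^(−1), since that product is generally not square:

```python
import numpy as np

# Illustrative dimensions: D-dimensional pixel space, r-dimensional eigenspaces.
D, r = 64, 16
rng = np.random.default_rng(0)
U = rng.standard_normal((D, r))        # eigen projection matrix U_pixels
G_L = rng.standard_normal((r, r))      # sub-kernel tensor for the L setting
G_H = rng.standard_normal((r, r))      # sub-kernel tensor for the H setting
L = rng.standard_normal(D)             # low-resolution pixel vector

# V = (U_pixels * G_L)^+ L : coefficient vector in the individual-difference
# eigenspace, obtained via the pseudo-inverse because U @ G_L is not square.
V = np.linalg.pinv(U @ G_L) @ L

# H = U_pixels * G_H * V : reprojection to the high-resolution pixel space.
H = U @ G_H @ V
print(H.shape)  # (64,)
```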
In this embodiment, the projection function (U_pixels) is obtained from a learning image set consisting of pairs of low-resolution and high-resolution images by using locality preserving projection (LPP), and based on this, the projection functions G_L and G_H are obtained so that the L image point and the H image point of the same person substantially coincide in the individual-difference space.
The computation procedure of the LPP projection is outlined as follows.
For example, the generalized eigenvalue problem is transformed into a standard eigenvalue problem and solved, using [1] Cholesky decomposition or [2] computation of an inverse matrix.
FIG. 3A is a block chart showing the outline of the processing in the embodiment of the present invention. As illustrated, the processing according to this embodiment can be roughly divided into a learning step and a restoration step.
FIG. 4 shows an example in which the change within a modality (here, individual differences) on the LPP eigenspace has a nearly linear character. For example, when the learning images of four persons, Mr. A, Mr. B, Mr. C, and Mr. D, are converted by LPP, the change from Mr. A to Mr. B in FIG. 4 (the change in individual difference) becomes nearly linear, changing roughly smoothly (continuously) on the individual-difference eigenspace while the local structure is preserved.
A more practical embodiment including the processing procedure explained with FIG. 3A will be described below.
The low-resolution enlargement processing unit 102 performs processing for enlarging the input low-resolution image to a predetermined size. The enlargement method is not particularly limited, and various methods such as bicubic, B-spline, bilinear, and nearest neighbor can be used.
The high-pass filter 104 applies a filter that suppresses the low-frequency band to the input image. An unsharp mask, Laplacian, gradient, or the like can be used as the filter. Since much of the influence of illumination variation in a face image exists in the low-frequency band, suppressing the low band with this high-pass filter 104 removes the influence of illumination variation and improves robustness against it.
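A minimal sketch of such a low-frequency-suppressing filter, implemented here in the unsharp-mask style (subtracting a Gaussian-blurred copy) and assuming grayscale numpy arrays:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def suppress_low_frequencies(image, sigma=3.0):
    """Unsharp-mask style high-pass filter: subtract a blurred copy so that
    low-frequency content (where most illumination variation lives) is
    suppressed before the tensor projection.
    """
    low = gaussian_filter(image.astype(np.float64), sigma)
    return image - low  # high-frequency residual
```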
The patch division unit 106 divides the input image into a grid of squares like a shogi board. In both the learning step and the restoration step, signal processing is performed patch by patch. Performing the processing per patch limits the processing target to a local part of the image so that the projection target can be handled in low dimensions, which makes the processing robust against changes in image quality and individual differences. Therefore, a configuration provided with patch division means is a preferable mode for implementing the present invention.
The LPP projection tensor generation unit 108 applies locality preserving projection (LPP) to the input learning image set (the group of pairs of low-resolution and high-resolution images) that has undergone the above preprocessing of low-resolution enlargement, high-pass filtering, and patch division, and generates an LPP projection tensor.
On the premise that the diagonal matrix D and the Laplacian matrix L have been obtained by the LPP algorithm, the orthogonal LPP projection matrix W_OLPP = {u_1, ..., u_r} is obtained by the following procedure. Note that the number of dimensions r is a number less than or equal to the original number of dimensions n.

M^(k) = {I − (X D X^T)^(−1) A^(k−1) [B^(k−1)]^(−1) [A^(k−1)]^T} (X D X^T)^(−1) (X L X^T)

where

A^(k−1) = {u_1, ..., u_(k−1)},
B^(k−1) = [A^(k−1)]^T (X D X^T)^(−1) A^(k−1).
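Under the assumption that X, D, and L are given as in the text (and that X D X^T is invertible), the iteration can be sketched as follows; taking the eigenvector of the smallest eigenvalue at each step is the usual LPP criterion, and the helper names are illustrative:

```python
import numpy as np

def orthogonal_lpp(X, D, L, r):
    """Compute the orthogonal LPP basis {u_1, ..., u_r} by the iteration above.

    X: (n_features, n_samples) data matrix; D, L: the LPP diagonal and
    Laplacian matrices (n_samples x n_samples); r <= n_features.
    """
    XDXt_inv = np.linalg.inv(X @ D @ X.T)
    XLXt = X @ L @ X.T

    def smallest_eigvec(M):
        w, v = np.linalg.eig(M)                       # M is not symmetric
        return np.real(v[:, np.argmin(np.real(w))])

    us = [smallest_eigvec(XDXt_inv @ XLXt)]           # u_1
    for _ in range(1, r):
        A = np.stack(us, axis=1)                      # A^(k-1)
        B_inv = np.linalg.inv(A.T @ XDXt_inv @ A)     # [B^(k-1)]^(-1)
        M = (np.eye(X.shape[0]) - XDXt_inv @ A @ B_inv @ A.T) @ XDXt_inv @ XLXt
        us.append(smallest_eigvec(M))                 # u_k
    return np.stack(us, axis=1)
```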
In contrast to the LPP described above, the principle of principal component analysis (PCA) is the maximization of global variance, and its main purpose is to reduce the linear dimensionality while preserving the global distribution. PCA has the characteristics of preserving global geometry and projecting simply by a linear transformation alone, and it yields an orthogonal basis.
As described above, in this embodiment, the learning images are narrowed down in order to select appropriate samples when determining the projection function. The number of learning image pairs finally used (here, the number of sample persons) is called the "representative learning number", and information on this representative learning number is acquired from outside.
The learning-set representative value processing unit 112 performs processing for obtaining a group of individual-difference eigenspace coefficient vectors from the preprocessed input learning image set (at least one of the low-resolution images and the high-resolution images). For the input learning image set, this processing is the same as that of the first LPP_HOSVD projection processing unit 130 in the restoration step, that is, the processing up to the L-pixel → eigenspace projection (the processing denoted by reference numeral 132) and the [L-pixel → individual difference] eigenspace projection (the processing denoted by reference numeral 134), and it yields the coefficient vectors of the individual-difference eigenspace.
The reprojection tensor generation unit 114 performs the same processing as the LPP projection tensor generation unit 108 on the N representative learning image sets obtained by the learning-set representative value processing unit 112, and regenerates the LPP eigen projection matrix and the LPP projection kernel tensor. In this way, based on the representative learning image sets, the LPP eigen projection matrix (U_pixels) 115 and the LPP projection kernel tensor (G) 116 used in the restoration step described later are obtained.
The setting value acquisition unit 120 is means for acquiring, from outside, information on the patch position to be processed and information designating the L and H settings, and providing that information to the "first sub-kernel tensor generation unit 122", the "second sub-kernel tensor generation unit 124", the "L-pixel → eigenspace projection unit 132", and the "eigenspace → H-pixel projection unit 154".
The first sub-kernel tensor generation unit 122 generates the sub-kernel tensor G_L for low resolution from the LPP projection kernel tensor 116 output by the reprojection tensor generation unit 114, given the patch position and the L-setting condition output from the setting value acquisition unit 120. Note that this may also be performed in the learning step; instead of storing the LPP projection kernel tensor 116, or in combination with storing it, the sub-kernel tensor G_L may be generated and stored in the learning step. Such a mode requires memory for storing the sub-kernel tensor, but has the advantage that the processing time of the restoration step can be shortened.
The "L-pixel → eigenspace projection unit 132" in the first LPP_HOSVD projection processing unit 130 obtains the LPP eigen projection matrix 115 (U_pixels) based on the patch position given by the setting value acquisition unit 120, and performs, on the image output from the patch division unit 106, the U_pixels^(−1) projection onto the pixel eigenspace explained with FIG. 2(a) → (b). Here, U_pixels^(−1) denotes the inverse matrix of U_pixels.
The [L-pixel → individual difference] eigenspace projection unit 134 following the "L-pixel → eigenspace projection unit 132" in FIG. 6 obtains the corresponding projection tensor G_L from the first sub-kernel tensor generation unit 122, performs the G_L^(−1) projection onto the individual-difference eigenspace explained with FIG. 2(b) → (c) on the output of the "L-pixel → eigenspace projection unit 132", and obtains the individual-difference eigenspace coefficient vector.
The coefficient vector correction processing unit 140 uses the group of individual-difference eigenspace coefficient vectors, one per patch, obtained by the [L-pixel → individual difference] eigenspace projection unit 134 in FIG. 6 to generate a group of corrected coefficient vectors to be given to the [individual difference → H-pixel] eigenspace projection unit 152 of the second LPP_HOSVD projection processing unit 150.
The pixel vector of a patch in which an occluding object exists becomes a point located away from the region where the pixel vectors of the other, unoccluded patches gather in the individual-difference eigenspace. In such a case, the pixel vector of the patch with the occluding object can be corrected into a vector without occlusion (a corrected coefficient vector).
By using a representative value such as the mean, median, maximum, or minimum of the group of coefficient vectors of the patches of the same person in the individual-difference eigenspace as the values of the corrected coefficient vector group, noise in the individual-difference eigenspace coefficient vector group (the influence of partial occluders such as glasses, masks, and doors) is removed.
Noise may be further removed by using, as the values of the corrected coefficient vector group, the mean, median, maximum, minimum, or the like computed over the individual-difference eigenspace coefficient vectors within, for example, the range of the variance σ or the range 2σ, centered on a representative value such as the mean, median, maximum, or minimum in the histogram of the coefficient vector group of the patches of the same person in the individual-difference eigenspace.
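One hedged sketch of this representative-value correction is shown below, replacing coefficient vectors that lie farther than a chosen multiple of the spread from the median; the median and the 2σ-style threshold are just two of the representative values and ranges mentioned above.

```python
import numpy as np

def correct_coefficient_vectors(coeff_vectors, n_sigma=2.0):
    """Replace outlier patch coefficient vectors (e.g. patches hidden by
    glasses or a mask) with a representative value.

    coeff_vectors: (n_patches, dim) coefficient vectors in the
    individual-difference eigenspace.
    """
    median = np.median(coeff_vectors, axis=0)          # representative value
    dists = np.linalg.norm(coeff_vectors - median, axis=1)
    sigma = dists.std()
    corrected = coeff_vectors.copy()
    corrected[dists > n_sigma * sigma] = median        # suspected occlusions
    return corrected
```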
When a region in which an occluding object exists is detected, a mode in which that region is converted with a dedicated tensor is also possible.
Since the relative positions of glasses (upper horizontal strip) and masks (lower center) within a face can be roughly known in advance, the individual-difference eigenspace coefficient vector group of the patches of the corresponding region is compared with the representative value of the individual-difference eigenspace coefficient vector group of the patches of the whole face (or of the face region excluding the occlusion candidate regions); if they are similar (if the distance is small), it is detected that the probability of no occlusion is high. Conversely, if the distance between them is large, it is detected that the probability that an occluding object exists is high.
In "Example A-2-1", occlusion was detected by focusing on the distance from the representative value, but it can also be detected from the spread of the distribution of the coefficient vector group. That is, as another implementation of Example A-2-1, a mode is also possible in which it is detected that the probability of occlusion is high if the distribution of the individual-difference eigenspace coefficient vector group of the patches corresponding to an occlusion candidate region is spread out. When the distribution of the occlusion candidate region is wider than the corresponding distribution over the whole face, the probability of occlusion may be regarded as high.
As another implementation, there is also a mode in which the distribution shape of the individual-difference eigenspace coefficient vector group of correct answers (images not included in the learning set) is obtained in advance. In this case, if the individual-difference eigenspace coefficient vector group is similar to the prior distribution shape, it is detected that the probability of no occlusion is high.
(Example A-3-1):
A mode is also possible in which the same detection as in "Example A-2-1" is performed, and the occluded region is restored by another conversion method such as bicubic interpolation or the "general-purpose super-resolution processing unit 164" (see FIG. 6).
(Example A-4-1):
Taking advantage of the fact that the pixel vectors of the patch group obtained by dividing the face image of the same person are highly correlated in the individual-difference eigenspace, the corrected coefficient vector group for the whole face may be obtained from the individual-difference eigenspace coefficient vector group of only some of the patches within the face (for example, the eye, nose, and mouth regions).
For example, a representative value such as the mean, median, maximum, or minimum of the individual-difference eigenspace coefficient vector group of a part of the face is used as the values of the corrected coefficient vector group for the whole face.
Instead of "Example A-4-1-1", the distribution of the individual-difference eigenspace coefficient vector group may be obtained for a plurality of patches in the central part of the face, and the corrected coefficient vector group outside the central part then obtained from that distribution by extrapolation. For example, the distribution of the coefficient vector group is obtained for the 3×3 = 9 patches in the central part of the face, and the coefficient vectors at positions outside these 9 patches are obtained from this distribution by extrapolation.
Alternatively, the distribution of the individual-difference eigenspace coefficient vector group is obtained only for patches thinned out in the horizontal and vertical directions within the face, and the corrected coefficient vector group of the patches for which no individual-difference eigenspace coefficient vector has been obtained is then determined by interpolating that distribution. For example, the distribution of the coefficient vector group is obtained only for the even-numbered patch positions, and the remaining odd-numbered patches are obtained by interpolation.
A low-pass filter (for example, an averaging filter) may further be applied to the corrected coefficient vector group of the patch to be processed and its surrounding patches. Such a mode has the effect of spatially smoothing the obtained corrected coefficient vector group and removing noise components. A maximum, minimum, or median filter may be applied instead of the averaging filter.
The second sub-kernel tensor generation unit 124 generates the sub-kernel tensor G_H from the LPP projection kernel tensor 116, given the patch position output by the setting value acquisition unit 120 and the H-setting condition.
The [individual difference → H-pixel] eigenspace projection unit 152 obtains G_H from the second sub-kernel tensor generation unit 124, and performs the G_H projection explained with FIG. 2(c) → (d) on the corrected coefficient vector output by the coefficient vector correction processing unit 140.
The eigenspace → H-pixel projection unit 154 obtains the LPP eigen projection matrix U_pixels based on the patch position from the setting value acquisition unit 120, performs the U_pixels projection explained with FIG. 2(d) → (e) on the coefficient vector output by the [individual difference → H-pixel] eigenspace projection unit 152, and obtains a high-resolution image.
The adding unit 160 outputs the sum of the input from the eigenspace → H-pixel projection unit 154 (the restored high-frequency component information) and the input from the low-resolution enlargement processing unit 102 (the original enlarged low-resolution image). The adding unit 160 also adds and integrates all the patches to generate one face image (a high-resolution image). The configuration may also be such that the restored high-frequency component information is added after predetermined filtering is applied to the original enlarged low-resolution image.
The general-purpose super-resolution processing unit 164 super-resolves and enlarges the input low-resolution image to the same size as the output.
[Formula 5]

x = Σ_i (A_i · z + B_i) · w_i(y − μ_i, π_i)

where z is the low-resolution image, x is the high-resolution image, and A_i, B_i, μ_i, and π_i are each determined at learning time; the probability w_i serving as the weight is obtained dynamically at restoration time from the dimension vector y of the differences between the unknown pixel and its surroundings.
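A sketch of evaluating [Formula 5] for one unknown pixel is shown below; the isotropic-Gaussian form of the weight w_i is an illustrative assumption, since the text only states that w_i is obtained dynamically from y.

```python
import numpy as np

def gmm_superresolve(z, y, A, B, mu, pi, sigma=1.0):
    """Evaluate x = sum_i (A_i . z + B_i) * w_i(y - mu_i, pi_i) for one pixel.

    z   : local low-resolution patch vector around the unknown pixel, (dim,).
    y   : feature (difference) vector used to compute the class weights.
    A, B: per-class linear filters learned offline, shapes (n_classes, dim)
          and (n_classes,); mu, pi: class means and prior probabilities.
    """
    d2 = ((y - mu) ** 2).sum(axis=1)         # squared distance to each class
    w = pi * np.exp(-0.5 * d2 / sigma**2)    # unnormalized class posteriors
    w = w / w.sum()                          # probabilities sum to one
    preds = A @ z + B                        # (n_classes,) linear predictions
    return float(preds @ w)                  # probability-weighted mixture
```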
The weight calculation unit 162 is means for obtaining the weight w1 used by the combining unit 166 so as to increase or decrease the adoption rate of the general-purpose super-resolution method of the general-purpose super-resolution processing unit 164 according to the degree of deviation from the input conditions. The weight w1 is determined so that the adoption rate of the general-purpose super-resolution method is lowered if the degree of deviation from the input conditions is low, and raised as the degree of deviation becomes higher.
The tensor projection super-resolution means described above (reference numerals 100A and 100B in FIG. 6) has the characteristic that the farther the individual-difference eigenspace coefficient vector is from the coefficient vectors of the learning set in the individual-difference eigenspace, the worse the restorability (characteristic [1]).
w1 is made larger as the direction of the individual-difference eigenspace coefficient vector is more similar to that of the coefficient vectors of the learning set.
The tensor projection super-resolution means described above (reference numerals 100A and 100B in FIG. 4) also has the characteristic that, in the individual-difference eigenspace, the more the distribution of the individual-difference eigenspace coefficient vectors, with the patches as samples, is spread out (scattered), the worse the restoration performance (characteristic [2]).
In the distribution over the patch samples of "Example B-2-1", w1 is made smaller for patch samples that are fewer in number (or farther from the representative value). That is, the weight is changed according to the frequency on the histogram. In this case, there is the effect that the weight can be controlled for each patch.
In the distribution over the patch samples of "Example B-2-1", the weight may also be made larger as the shape of the distribution is more similar. For example, the weight is changed according to whether the distribution shape of Mr. A's distribution grasped in the learning step and that of the input image (an unknown image) are similar.
The following configuration can be adopted in common for each of "Example B-1-1", "Example B-1-2", "Example B-2-1", "Example B-2-2", and "Example B-3" described above. For example, in "Example B-1-1" or "Example B-1-2", consider further, for each representative individual-difference vector serving as a learning sample, a correct-answer validity index for each patch of each individual (for example, within Mr. A's face). As this index, the distance of each patch from the representative value of the distribution over the patch samples is used; the farther a patch is from the representative value, the less suitable it is treated as a correct answer. Specifically, a weight w_p having characteristics similar to those of FIG. 11, such as β2/x, β2/x², or exp(−β2·x), may be obtained, and w1' = w1 · w_p may be given to the combining unit 166.
In common for each of "Example B-1-1", "Example B-1-2", "Example B-2-1", "Example B-2-2", and "Example B-3" described above, the mean, median, maximum, minimum, or the like may be used as the representative value.
In common for each of "Example B-1-1", "Example B-1-2", "Example B-2-1", "Example B-2-2", and "Example B-3" described above, the variance, standard deviation, or the like may be used as the spread (scatter) of the distribution.
w1 is made larger as the distance between a representative value of the learning set, such as its centroid or a peripheral boundary point, and the individual-difference eigenspace coefficient vector is smaller, or as their directions are more similar. Such a mode reduces the number of targets for which the distance and direction are computed, enabling faster processing.
For the computation of the "distance" in each of the examples described above, the Euclidean distance, Mahalanobis distance, KL divergence, or the like may be used.
For the computation of the "direction" in each of the examples described above, the vector angle, inner product, outer product, or the like may be used.
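For concreteness, minimal versions of these distance and direction measures (Euclidean, Mahalanobis, and an inner-product-based direction similarity; the KL divergence is omitted) might read:

```python
import numpy as np

def euclidean(a, b):
    return float(np.linalg.norm(a - b))

def mahalanobis(a, b, cov_inv):
    """Mahalanobis distance given the inverse covariance of the learning set."""
    d = a - b
    return float(np.sqrt(d @ cov_inv @ d))

def cosine_direction(a, b):
    """'Direction' similarity via the inner product of unit vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```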
At the time of the "learning step" explained with FIG. 4, the relationship between the distance, direction, representative value, distribution spread, and distribution shape on the one hand and the restoration error on the other is defined in advance as a correct/incorrect answer set. The restoration error is the difference between the image restored with the projection function obtained from the learning image set and the correct image, and is expressed, for example, by the mean square error with respect to the correct/incorrect images or by the PSNR (peak signal-to-noise ratio).
The relationship between at least one of "distance, direction, representative value, distribution spread, distribution shape" and the "restoration error" is obtained in advance, for example as a "distance versus restoration error" characteristic. A characteristic with a confidence probability proportional to the frequency may also be used.
From the "distance, direction, representative value, distribution spread, distribution shape" obtained in the "restoration step" explained with FIG. 6, the closest "distance, direction, representative value, distribution spread, distribution shape" from the "learning step" is selected, and the corresponding "restoration error" is obtained.

[Formula 6]

weight w1 = b0 + b1 × (restoration error)

Instead of the linear function shown in [Formula 6], a nonlinear function may be defined to obtain the weight.
For the function defining, in "Example B-Common-7" above, the correlation between at least one of "distance, direction, representative value, distribution spread, distribution shape" of the correct/incorrect answer set in the individual-difference eigenspace and the "weight", the coefficients b0 and b1 of [Formula 6] may be obtained by (regularized) least squares, multiple regression analysis, SVM (regression), AdaBoost (regression), nonparametric Bayes, maximum likelihood estimation, the EM algorithm, variational Bayes, Markov chain Monte Carlo, or the like.
Furthermore, in each of the above examples ("Example B-1-1" to "Example B-Common-8"), a low-pass (averaging) filter may additionally be applied to the weights of the patch to be processed and its surrounding patches. This mode has the effect of spatially smoothing the obtained weights and removing noise. A maximum, minimum, or median filter may also be applied.
The combining unit 166 in FIG. 6 combines, or selects between, the image given by the adding unit 160 (input image 1) and the image given by the general-purpose super-resolution processing unit 164 (input image 2) according to the following weights obtained by the weight calculation unit 162.

output high-resolution image = Σ_i (w_i · I_i) = w1 · I1 + w2 · I2

where w1 denotes the weight for the output I1 of the adding unit 160, and w2 = 1 − w1 denotes the weight for the output I2 of the general-purpose super-resolution processing unit 164.
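Combining [Formula 6] with this weighted sum gives, as a hedged sketch, the following; the coefficients b0 and b1 below are illustrative placeholders for the learned regression values (with b1 negative so that a larger estimated restoration error lowers w1 and shifts the output toward the general-purpose result):

```python
import numpy as np

def synthesize(i1, i2, restoration_error, b0=1.0, b1=-0.8):
    """Blend the tensor-projection output i1 with the general-purpose
    super-resolution output i2 according to [Formula 6] and the weighted
    sum above.
    """
    w1 = float(np.clip(b0 + b1 * restoration_error, 0.0, 1.0))
    return w1 * i1 + (1.0 - w1) * i2  # w2 = 1 - w1
```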
FIG. 12 is a block diagram showing another embodiment. In FIG. 12, elements identical or similar to the configuration of FIG. 7 are given the same reference numerals, and their description is omitted.
FIGS. 6 and 12 show configurations in which the learning step and the restoration step can be carried out by one image processing apparatus, but it is also possible to configure the image processing apparatus that carries out the learning step and the image processing apparatus that carries out the restoration step as separate apparatuses. In that case, it is desirable that the image processing apparatus responsible for the restoration step be configured to be able to acquire separately created projection-relationship information (the eigen projection matrix and the projection tensors) from outside. As such information acquisition means, a media interface compatible with optical discs and other removable storage media, or a communication interface, can be applied.
In the above embodiment, LPP was exemplified as the projection utilizing local relationships, but instead of LPP, various manifold learning methods can also be applied, such as locally linear embedding (LLE), linear tangent-space alignment (LTSA), Isomap, Laplacian Eigenmaps (LE), and Neighborhood Preserving Embedding (NPE).
In the embodiment explained with FIG. 6, to simplify the explanation, conditions were set with the patch and resolution modalities as known elements among the four types of modality explained in Table 1, and the projection route from the pixel real space via the pixel eigenspace and the individual-difference eigenspace was designed focusing on the "pixel value" and "individual difference" modalities; however, the design of the projection route in implementing the present invention is not limited to this example. Depending on the modality variations, various eigenspaces can be selected as the eigenspace passed through in the projection route.
The conversion-source image input to the restoration step may be an image region partially cut out (extracted) from a certain image at a stage prior to entering the processing procedure explained with FIG. 6 or FIG. 12. For example, processing for extracting a person's face portion from an original image is performed, and the extracted face image region can be handled as the input image data of the restoration step.
Since the invention can be applied to various "targets", "modalities", and "image processing" by changing the learning image set as follows, the scope of application of the present invention is not limited to the above embodiments.
FIG. 13 shows an example of an image processing system 200 according to an embodiment of the present invention. The image processing system 200 described below can function, as one example, as a monitoring system.
FIG. 14 shows an example of the block configuration of the image processing apparatus 220. The image processing apparatus 220 includes an image acquisition unit 222, a feature region specifying unit 226, an external information acquisition unit 228, a compression control unit 230, a compression unit 232, an association processing unit 234, and an output unit 236. The image acquisition unit 222 has a compressed moving image acquisition unit 223 and a compressed moving image expansion unit 224.
For example, the feature region specifying unit 226 may extract, from each of a plurality of captured images, an object that matches a predetermined shape pattern with a degree of matching equal to or higher than a predetermined value, and detect the regions in the captured images containing the extracted objects as feature regions of the same feature type. A plurality of shape patterns may be defined for each feature type. One example of a shape pattern is the shape pattern of a person's face. Different face patterns may be defined for each of a plurality of persons. This allows the feature region specifying unit 226 to detect different regions containing different persons as different feature regions.
In addition to pattern matching such as template matching, the feature region specifying unit 226 can also detect feature regions based on learning results obtained by machine learning (for example, AdaBoost) as described in JP-A-2007-188419. For example, the features of the image feature quantities extracted from images of a predetermined subject are learned using image feature quantities extracted from images of the predetermined subject and image feature quantities extracted from images of subjects other than the predetermined subject. The feature region specifying unit 226 may then detect, as a feature region, a region from which image feature quantities having features matching the learned features are extracted.
FIG. 15 shows an example of the block configuration of the feature region specifying unit 226. The feature region specifying unit 226 has a first feature region specifying unit 610, a second feature region specifying unit 620, a region estimation unit 630, a high-image-quality region determination unit 640, a parameter storage unit 650, and an image generation unit 660. The second feature region specifying unit 620 includes a partial region judgment unit 622 and a feature region judgment unit 624.
FIG. 19 shows an example of the block configuration of the compression unit 232 shown in FIG. 14. The compression unit 232 has an image division unit 242, a plurality of fixed-value conversion units 244a-c (hereinafter sometimes collectively referred to as the fixed-value conversion units 244), and a plurality of compression processing units 246a-d (hereinafter sometimes collectively referred to as the compression processing units 246).
FIG. 20 shows another example of the block configuration of the compression unit 232 shown in FIG. 14. The compression unit 232 in this configuration compresses a plurality of captured images by spatially scalable encoding processing according to the feature type.
FIG. 21 shows an example of the block configuration of the image processing apparatus 250 shown in FIG. 13. As shown in FIG. 21, the image processing apparatus 250 includes a compressed image acquisition unit 301, an association analysis unit 302, an expansion control unit 310, an expansion unit 320, an external information acquisition unit 380, and an image processing unit 330. The expansion unit 320 has a plurality of decoders 322a-d (hereinafter collectively referred to as the decoders 322).
FIG. 22 shows an example of the block configuration of the image processing unit 330 of the image processing apparatus 250 explained with FIG. 21. As shown in FIG. 22, the image processing unit 330 includes a parameter storage unit 1010, an attribute specifying unit 1020, a specific object region detection unit 1030, a parameter selection unit 1040, a weight determination unit 1050, a parameter generation unit 1060, and an image generation unit 1070.
FIG. 25 shows an example of the block configuration of the display device 260 in FIG. 13. As shown in FIG. 25, the display device 260 has an image acquisition unit 1300, a first image processing unit 1310, a feature region specifying unit 1320, a parameter determination unit 1330, a display control unit 1340, a second image processing unit 1350, an external information acquisition unit 1380, and a display unit 1390.
FIG. 27 shows an example of an image processing system 201 according to another embodiment. The configuration of the image processing system 201 in this embodiment is the same as that of the image processing system 200 explained with FIG. 13, except that the imaging devices 210a-d each have an image processing unit 804a-d (hereinafter collectively referred to as the image processing units 804).
<Modification 1> In the image processing systems 200 and 201 described above, an example provided with a plurality of imaging devices 210a-d was described, but the number of imaging devices 210 is not particularly limited and may be one. The number of display devices 260 is also not particularly limited and may be one.
<Modification 2> In the image processing systems 200 and 201 described above, the feature region was specified from captured images (frame images or field images) in moving image data, but the invention is not limited to moving image data and can also be applied to still image data.
<Modification 3> In the image processing systems 200 and 201 described above, a configuration capable of detecting a plurality of feature regions from one captured image was described, but the number of feature regions is not particularly limited; there may be one feature region per captured image.
<Modification 4> The means for acquiring the learning image group is not limited to a mode in which pairs of high-quality and low-quality images are prepared in advance; an image pair may also be obtained by providing only a high-quality image and generating a low-quality image from it. For example, a mode is also possible in which the image processing apparatus is provided with processing means for performing quality-reduction processing (image-quality-reduction processing means), and a learning image pair is acquired by inputting a high-quality learning image and reducing its quality within the apparatus.
<Modification 5> In the embodiments described above, an example in which image data is learned and image conversion for higher image quality is performed was explained, but the present invention is not limited to image quality improvement processing and can also be applied to other image conversions such as image recognition. The data to be processed is also not limited to images, and the invention can be applied similarly to various kinds of data other than images. That is, the configurations described as the image processing apparatus, image processing means, and image processing system can be extended as a data processing apparatus, data processing means, and data processing system.
As an application example other than image quality improvement processing, an application to personal authentication technology based on image recognition will be described. In this case, processing similar to the processing up to the intermediate eigenspace of the image quality improvement processing explained with FIGS. 2, 3, 6, etc. is performed, and personal authentication can be performed using the positional relationship of the coefficient vectors in the intermediate eigenspace. For the positional relationship, the distance, direction, and the like may be obtained by the methods of the "coefficient vector correction processing unit 140". That is, the closer the distance and direction of the obtained input data are to the learning data, the higher the possibility that it is the determination target.
<Application example to speech recognition> As an example of handling data other than images, an application to speech recognition will be described. Audio data is targeted instead of image data, processing similar to the processing up to the intermediate eigenspace of the image quality improvement processing explained with FIGS. 2, 3, 6, etc. is performed, and speech can be recognized using the positional relationship of the coefficient vectors in the intermediate eigenspace. For the positional relationship, the distance, direction, and the like may be obtained by the methods of the "coefficient vector correction processing unit 140". That is, the closer the distance and direction of the obtained input data are to the learning data, the higher the possibility that it is the determination target.
<Application example to language processing> As another example of handling data other than images, an application to language processing will be described. Language data (which may be speech data or text data) is targeted instead of image data, processing similar to the processing up to the intermediate eigenspace of the image quality improvement processing explained with FIGS. 2, 3, 6, etc. is performed, and language processing can be performed using the positional relationship of the coefficient vectors in the intermediate eigenspace. For the positional relationship, the distance, direction, and the like may be obtained by the methods of the "coefficient vector correction processing unit 140". That is, the closer the distance and direction of the obtained input data are to the learning data, the higher the possibility that it is the determination target.
<Application example to biological information processing> As another example of handling data other than images, an application to biological information processing will be described. Biological information includes, for example, the waveforms, periods, and amplitudes of heartbeat, pulse, blood pressure, respiration, and perspiration. Biological information data is targeted instead of image data, processing similar to the processing up to the intermediate eigenspace of the image quality improvement processing explained with FIGS. 2, 3, 6, etc. is performed, and the biological information can be processed using the positional relationship of the coefficient vectors in the intermediate eigenspace. For the positional relationship, the distance, direction, and the like may be obtained by the methods of the "coefficient vector correction processing unit 140". That is, the closer the distance and direction of the obtained input data are to the learning data, the higher the possibility that it is the determination target.
<Application example to natural/physical information processing> As another example of handling data other than images, an application to natural/physical information processing will be described. Natural/physical information includes, for example, weather, climate, and the waveforms, periods, and amplitudes of earthquakes. Natural/physical information data is targeted instead of image data, processing similar to the processing up to the intermediate eigenspace of the image quality improvement processing explained with FIGS. 2, 3, 6, etc. is performed, and the natural/physical information can be processed using the positional relationship of the coefficient vectors in the intermediate eigenspace. For the positional relationship, the distance, direction, and the like may be obtained by the methods of the "coefficient vector correction processing unit 140". That is, the closer the distance and direction of the obtained input data are to the learning data, the higher the possibility that it is the determination target.
Claims (32)
- An image processing apparatus comprising:
information acquisition means for acquiring an eigen projection matrix generated by a projection operation from a learning image group including at least one of an image pair in which the high-frequency components of a first-quality image and a second-quality image of mutually different image qualities are paired and an image pair in which the high-frequency components and medium-frequency components of the first-quality image and the second-quality image are paired, and a projection kernel tensor generated from the learning image group and the eigen projection matrix;
first sub-kernel tensor generation means for generating, from the acquired projection kernel tensor, a first sub-kernel tensor corresponding to a condition specified by a first setting;
second sub-kernel tensor generation means for generating, from the acquired projection kernel tensor, a second sub-kernel tensor corresponding to a condition specified by a second setting;
filter means for generating a low-frequency-component-suppressed image in which the high-frequency components, or the high-frequency components and medium-frequency components, of an input image to be processed are extracted;
first sub-tensor projection means for projecting the low-frequency-component-suppressed image by a first projection operation using the eigen projection matrix and the first sub-kernel tensor, to calculate a coefficient vector in an intermediate eigenspace;
second sub-tensor projection means for projecting the calculated coefficient vector by a second projection operation using the second sub-kernel tensor and the eigen projection matrix, to generate a projection image from the low-frequency-component-suppressed image;
image conversion means for generating a converted image whose image quality differs from that of the input image; and
addition means for adding the projection image and the converted image.
- An image processing apparatus comprising:
information acquisition means for acquiring an eigen projection matrix generated by a projection operation from a learning image group including at least one of an image pair in which the high-frequency components of a first-quality image and a second-quality image of mutually different image qualities are paired and an image pair in which the high-frequency components and medium-frequency components of the first-quality image and the second-quality image are paired, a first sub-kernel tensor corresponding to a condition specified by a first setting and generated using a projection kernel tensor generated from the learning image group and the projection matrix, and a second sub-kernel tensor corresponding to a condition specified by a second setting and generated using the projection kernel tensor;
filter means for generating a low-frequency-component-suppressed image in which the high-frequency components, or the high-frequency components and medium-frequency components, of an input image to be processed are extracted;
first sub-tensor projection means for projecting the low-frequency-component-suppressed image by a first projection operation using the eigen projection matrix and the first sub-kernel tensor, to calculate a coefficient vector in an intermediate eigenspace;
second sub-tensor projection means for projecting the calculated coefficient vector by a second projection operation using the second sub-kernel tensor and the eigen projection matrix, to generate a projection image from the low-frequency-component-suppressed image;
image conversion means for generating a converted image whose image quality differs from that of the input image; and
addition means for adding the projection image and the converted image.
- The image processing apparatus according to claim 1 or 2, wherein
the information acquisition means acquires an eigen projection matrix generated by a projection operation from a learning image group including an image pair in which the high-frequency components of the first-quality image and the second-quality image are paired, and a projection kernel tensor generated from the learning image group and the eigen projection matrix,
the filter means generates a high-frequency component image in which the high-frequency components of the input image are extracted, and
the first sub-tensor projection means, which projects the low-frequency-component-suppressed image by the first projection operation using the eigen projection matrix and the first sub-kernel tensor to calculate the coefficient vector in the intermediate eigenspace, and the second sub-tensor projection means generate a projection image of high-frequency components from the high-frequency component image, thereby generating image information of a high-frequency region exceeding the frequency region expressed in the input image.
- An image processing apparatus comprising:
eigen projection matrix generation means for generating an eigen projection matrix by a projection operation from a learning image group including at least one of an image pair in which the high-frequency components of a first-quality image and a second-quality image of mutually different image qualities are paired and an image pair in which the high-frequency components and medium-frequency components of the first-quality image and the second-quality image are paired;
projection kernel tensor generation means for generating a projection kernel tensor defining the correspondence between the high-frequency components, or the high-frequency components and medium-frequency components, of the first-quality image and an intermediate eigenspace, and the correspondence between the high-frequency components, or the high-frequency components and medium-frequency components, of the second-quality image and the intermediate eigenspace;
first sub-kernel tensor acquisition means for generating, from the generated projection kernel tensor, a first sub-kernel tensor corresponding to a condition specified by a first setting;
second sub-kernel tensor acquisition means for generating, from the generated projection kernel tensor, a second sub-kernel tensor corresponding to a condition specified by a second setting;
filter means for generating a low-frequency-component-suppressed image in which the high-frequency components, or the high-frequency components and medium-frequency components, of an input image to be processed are extracted;
first sub-tensor projection means for projecting the low-frequency-component-suppressed image by a first projection operation using the eigen projection matrix and the first sub-kernel tensor, to calculate a coefficient vector in the intermediate eigenspace;
second sub-tensor projection means for projecting the calculated coefficient vector by a second projection operation using the second sub-kernel tensor and the eigen projection matrix, to generate a projection image from the low-frequency-component-suppressed image;
image conversion means for generating a converted image whose image quality differs from that of the input image; and
addition means for adding the projection image and the converted image.
- The image processing apparatus according to claim 4, wherein
the eigen projection matrix generation means generates the eigen projection matrix by a projection operation from a learning image group including an image pair in which the high-frequency components of the first-quality image and the second-quality image are paired,
the projection kernel tensor generation means generates the projection kernel tensor from the learning image group and the eigen projection matrix,
the filter means generates a high-frequency component image in which the high-frequency components of the input image are extracted, and
the first sub-tensor projection means and the second sub-tensor projection means generate a projection image of high-frequency components from the high-frequency component image, thereby generating image information of a high-frequency region exceeding the frequency region expressed in the input image.
- The image processing apparatus according to any one of claims 1 to 5, wherein the high-frequency components and medium-frequency components of the first-quality image are extracted by applying to the first-quality image the same processing as that of the filter means, and the high-frequency components and medium-frequency components of the second-quality image are extracted by applying to the second-quality image the same processing as that of the filter means.
- The image processing apparatus according to any one of claims 1 to 6, further comprising weight coefficient determination means for determining weight coefficients for weighting the projection image and the converted image added by the addition means.
- The image processing apparatus according to any one of claims 1 to 7, wherein the filter means performs processing for extracting components at or above a frequency based on the Nyquist frequency of the input image.
- The image processing apparatus according to any one of claims 1 to 8, wherein the first-quality image is the relatively low-quality image of the image pair, the second-quality image is the relatively high-quality image of the image pair, and the changed-quality image is an image of higher quality than the input image.
- The image processing apparatus according to any one of claims 1 to 9, wherein the first setting specifies a projection relationship for projecting the first-quality image onto the intermediate eigenspace, and the second setting specifies a projection relationship for projecting the second-quality image onto the intermediate eigenspace.
- The image processing apparatus according to any one of claims 1 to 10, wherein the projection operation is any one of locality preserving projection (LPP), locally linear embedding (LLE), and linear tangent-space alignment (LTSA).
- The image processing apparatus according to any one of claims 1 to 11, wherein the learning image group includes the image pairs of human faces, and the intermediate eigenspace is an individual-difference eigenspace.
- The image processing apparatus according to any one of claims 1 to 12, further comprising:
first feature region specifying means for specifying a first feature region within an input image;
compression processing means for compressing the image portion of the first feature region of the input image at a first compression strength while compressing the image portions other than these feature regions at a second compression strength higher than the first compression strength; and
image quality change processing means for changing the image quality of at least the first feature region by projecting it with the first sub-tensor projection means and the second sub-tensor projection means.
- The image processing apparatus according to any one of claims 1 to 13, wherein the projection operation includes a projection operation utilizing a local relationship.
- An image processing method comprising:
an information acquisition step of acquiring an eigen projection matrix generated by a projection operation from a learning image group including at least one of an image pair in which the high-frequency components of a first-quality image and a second-quality image of mutually different image qualities are paired and an image pair in which the high-frequency components and medium-frequency components of the first-quality image and the second-quality image are paired, and a projection kernel tensor generated from the learning image group and the eigen projection matrix;
a first sub-kernel tensor generation step of generating, from the acquired projection kernel tensor, a first sub-kernel tensor corresponding to a condition specified by a first setting;
a second sub-kernel tensor generation step of generating, from the acquired projection kernel tensor, a second sub-kernel tensor corresponding to a condition specified by a second setting;
a filtering step of generating a low-frequency-component-suppressed image in which the high-frequency components, or the high-frequency components and medium-frequency components, of an input image to be processed are extracted;
a first sub-tensor projection step of projecting the low-frequency-component-suppressed image by a first projection operation using the eigen projection matrix and the first sub-kernel tensor, to calculate a coefficient vector in an intermediate eigenspace;
a second sub-tensor projection step of projecting the calculated coefficient vector by a second projection operation using the second sub-kernel tensor and the eigen projection matrix, to generate a projection image from the low-frequency-component-suppressed image;
an image conversion step of generating a converted image whose image quality differs from that of the input image; and
an addition step of adding the projection image and the converted image.
- An image processing method comprising:
an information acquisition step of acquiring an eigen projection matrix generated by a projection operation from a learning image group including at least one of an image pair in which the high-frequency components of a first-quality image and a second-quality image of mutually different image qualities are paired and an image pair in which the high-frequency components and medium-frequency components of the first-quality image and the second-quality image are paired, a first sub-kernel tensor corresponding to a condition specified by a first setting and generated using a projection kernel tensor generated from the learning image group and the projection matrix, and a second sub-kernel tensor corresponding to a condition specified by a second setting and generated using the projection kernel tensor;
a filtering step of generating a low-frequency-component-suppressed image in which the high-frequency components, or the high-frequency components and medium-frequency components, of an input image to be processed are extracted;
a first sub-tensor projection step of projecting the low-frequency-component-suppressed image by a first projection operation using the eigen projection matrix and the first sub-kernel tensor, to calculate a coefficient vector in an intermediate eigenspace;
a second sub-tensor projection step of projecting the calculated coefficient vector by a second projection operation using the second sub-kernel tensor and the eigen projection matrix, to generate a projection image from the low-frequency-component-suppressed image;
an image conversion step of generating a converted image whose image quality differs from that of the input image; and
an addition step of adding the projection image and the converted image.
- An image processing method comprising:
an eigen projection matrix generation step of generating an eigen projection matrix by a projection operation from a learning image group including at least one of an image pair in which the high-frequency components of a first-quality image and a second-quality image of mutually different image qualities are paired and an image pair in which the high-frequency components and medium-frequency components of the first-quality image and the second-quality image are paired;
a projection kernel tensor generation step of generating a projection kernel tensor defining the correspondence between the high-frequency components of the first-quality image and an intermediate eigenspace and the correspondence between the high-frequency components of the second-quality image and the intermediate eigenspace;
a first sub-kernel tensor acquisition step of generating, from the generated projection kernel tensor, a first sub-kernel tensor corresponding to a condition specified by a first setting;
a second sub-kernel tensor acquisition step of generating, from the generated projection kernel tensor, a second sub-kernel tensor corresponding to a condition specified by a second setting;
a filtering step of generating a low-frequency-component-suppressed image in which the high-frequency components, or the high-frequency components and medium-frequency components, of an input image to be processed are extracted;
a first sub-tensor projection step of projecting the low-frequency-component-suppressed image by a first projection operation using the eigen projection matrix and the first sub-kernel tensor, to calculate a coefficient vector in the intermediate eigenspace;
a second sub-tensor projection step of projecting the calculated coefficient vector by a second projection operation using the second sub-kernel tensor and the eigen projection matrix, to generate a projection image from the low-frequency-component-suppressed image;
an image conversion step of generating a converted image whose image quality differs from that of the input image; and
an addition step of adding the projection image and the converted image.
- The image processing method according to any one of claims 15 to 17, wherein the projection operation includes a projection operation utilizing a local relationship.
- A program for causing a computer to function as:
information acquisition means for acquiring an eigen projection matrix generated by a projection operation from a learning image group including at least one of an image pair in which the high-frequency components of a first-quality image and a second-quality image of mutually different image qualities are paired and an image pair in which the high-frequency components and medium-frequency components of the first-quality image and the second-quality image are paired, and a projection kernel tensor generated from the learning image group and the eigen projection matrix;
first sub-kernel tensor generation means for generating, from the acquired projection kernel tensor, a first sub-kernel tensor corresponding to a condition specified by a first setting;
second sub-kernel tensor generation means for generating, from the acquired projection kernel tensor, a second sub-kernel tensor corresponding to a condition specified by a second setting;
filter means for generating a low-frequency-component-suppressed image in which the high-frequency components, or the high-frequency components and medium-frequency components, of an input image to be processed are extracted;
first sub-tensor projection means for projecting the low-frequency-component-suppressed image by a first projection operation using the eigen projection matrix and the first sub-kernel tensor, to calculate a coefficient vector in an intermediate eigenspace;
second sub-tensor projection means for projecting the calculated coefficient vector by a second projection operation using the second sub-kernel tensor and the eigen projection matrix, to generate a projection image from the low-frequency-component-suppressed image;
image conversion means for generating a converted image whose image quality differs from that of the input image; and
addition means for adding the projection image and the converted image.
- A program for causing a computer to function as:
information acquisition means for acquiring an eigen projection matrix generated by a projection operation from a learning image group including at least one of an image pair in which the high-frequency components of a first-quality image and a second-quality image of mutually different image qualities are paired and an image pair in which the high-frequency components and medium-frequency components of the first-quality image and the second-quality image are paired, a first sub-kernel tensor corresponding to a condition specified by a first setting and generated using a projection kernel tensor generated from the learning image group and the projection matrix, and a second sub-kernel tensor corresponding to a condition specified by a second setting and generated using the projection kernel tensor;
filter means for generating a low-frequency-component-suppressed image in which the high-frequency components, or the high-frequency components and medium-frequency components, of an input image to be processed are extracted;
first sub-tensor projection means for projecting the low-frequency-component-suppressed image by a first projection operation using the eigen projection matrix and the first sub-kernel tensor, to calculate a coefficient vector in an intermediate eigenspace;
second sub-tensor projection means for projecting the calculated coefficient vector by a second projection operation using the second sub-kernel tensor and the eigen projection matrix, to generate a projection image from the low-frequency-component-suppressed image;
image conversion means for generating a converted image whose image quality differs from that of the input image; and
addition means for adding the projection image and the converted image.
- A program for causing a computer to function as:
eigen projection matrix generation means for generating an eigen projection matrix by a projection operation from a learning image group including at least one of an image pair in which the high-frequency components of a first-quality image and a second-quality image of mutually different image qualities are paired and an image pair in which the high-frequency components and medium-frequency components of the first-quality image and the second-quality image are paired;
projection kernel tensor generation means for generating a projection kernel tensor defining the correspondence between the high-frequency components, or the high-frequency components and medium-frequency components, of the first-quality image and an intermediate eigenspace, and the correspondence between the high-frequency components, or the high-frequency components and medium-frequency components, of the second-quality image and the intermediate eigenspace;
first sub-kernel tensor acquisition means for generating, from the generated projection kernel tensor, a first sub-kernel tensor corresponding to a condition specified by a first setting;
second sub-kernel tensor acquisition means for generating, from the generated projection kernel tensor, a second sub-kernel tensor corresponding to a condition specified by a second setting;
filter means for generating a low-frequency-component-suppressed image in which the high-frequency components, or the high-frequency components and medium-frequency components, of an input image to be processed are extracted;
first sub-tensor projection means for projecting the low-frequency-component-suppressed image by a first projection operation using the eigen projection matrix and the first sub-kernel tensor, to calculate a coefficient vector in the intermediate eigenspace;
second sub-tensor projection means for projecting the calculated coefficient vector by a second projection operation using the second sub-kernel tensor and the eigen projection matrix, to generate a projection image from the low-frequency-component-suppressed image;
image conversion means for generating a converted image whose image quality differs from that of the input image; and
addition means for adding the projection image and the converted image.
- The program according to any one of claims 19 to 21, wherein the projection operation includes a projection operation utilizing a local relationship.
- A data processing apparatus comprising:
information acquisition means for acquiring an eigen projection matrix generated by a projection operation from a learning data group including a data pair in which at least the medium-frequency components or high-frequency components of data of a first condition and data of a second condition, the conditions differing from each other, are paired, and a first sub-kernel tensor created, as corresponding to a condition specified by a first setting, from a projection kernel tensor that is generated from the learning data group and the eigen projection matrix and that defines the correspondence between the data of the first condition and an intermediate eigenspace and the correspondence between the data of the second condition and the intermediate eigenspace;
filter means for generating low-frequency-component-suppressed input data in which the high-frequency components, or the high-frequency components and medium-frequency components, of input data to be processed are extracted; and
first sub-tensor projection means for projecting the low-frequency-component-suppressed input data by a first projection operation using the eigen projection matrix and the first sub-kernel tensor acquired from the information acquisition means, to calculate a coefficient vector in the intermediate eigenspace.
- A data processing apparatus comprising:
information acquisition means for acquiring an eigen projection matrix generated by a projection operation from a learning data group including a data pair in which at least the medium-frequency components or high-frequency components of data of a first condition and data of a second condition, the conditions differing from each other, are paired, and a first sub-kernel tensor created, as corresponding to a condition specified by a first setting, from a projection kernel tensor that is generated from the learning data group and the eigen projection matrix and that defines the correspondence between the data of the first condition and an intermediate eigenspace and the correspondence between the data of the second condition and the intermediate eigenspace;
filter means for generating low-frequency-component-suppressed input data in which the high-frequency components, or the high-frequency components and medium-frequency components, of input data to be processed are extracted; and
first sub-tensor projection means for projecting the low-frequency-component-suppressed input data by a first projection operation using the eigen projection matrix and the first sub-kernel tensor acquired from the information acquisition means, to calculate a coefficient vector in the intermediate eigenspace.
- The data processing apparatus according to claim 23 or 24, wherein the projection operation includes a projection operation utilizing a local relationship.
- A data processing method comprising:
an information acquisition step of acquiring an eigen projection matrix generated by a projection operation from a learning data group including a data pair in which at least the medium-frequency components or high-frequency components of data of a first condition and data of a second condition, the conditions differing from each other, are paired, and a first sub-kernel tensor created, as corresponding to a condition specified by a first setting, from a projection kernel tensor that is generated from the learning data group and the eigen projection matrix and that defines the correspondence between the data of the first condition and an intermediate eigenspace and the correspondence between the data of the second condition and the intermediate eigenspace;
a filtering step of generating low-frequency-component-suppressed input data in which the high-frequency components, or the high-frequency components and medium-frequency components, of input data to be processed are extracted; and
a first sub-tensor projection step of projecting the low-frequency-component-suppressed input data by a first projection operation using the eigen projection matrix and the first sub-kernel tensor acquired in the information acquisition step, to calculate a coefficient vector in the intermediate eigenspace.
- A data processing method comprising:
an information acquisition step of acquiring an eigen projection matrix generated by a projection operation from a learning data group including a data pair in which at least the medium-frequency components or high-frequency components of data of a first condition and data of a second condition, the conditions differing from each other, are paired, and a first sub-kernel tensor created, as corresponding to a condition specified by a first setting, from a projection kernel tensor that is generated from the learning data group and the eigen projection matrix and that defines the correspondence between the data of the first condition and an intermediate eigenspace and the correspondence between the data of the second condition and the intermediate eigenspace;
a filtering step of generating low-frequency-component-suppressed input data in which the high-frequency components, or the high-frequency components and medium-frequency components, of input data to be processed are extracted; and
a first sub-tensor projection step of projecting the low-frequency-component-suppressed input data by a first projection operation using the eigen projection matrix and the first sub-kernel tensor acquired in the information acquisition step, to calculate a coefficient vector in the intermediate eigenspace.
- The data processing method according to claim 26 or 27, wherein the projection operation includes a projection operation utilizing a local relationship.
- A program for causing a computer to function as:
information acquisition means for acquiring an eigen projection matrix generated by a projection operation from a learning data group including a data pair in which at least the medium-frequency components or high-frequency components of data of a first condition and data of a second condition, the conditions differing from each other, are paired, and a first sub-kernel tensor created, as corresponding to a condition specified by a first setting, from a projection kernel tensor that is generated from the learning data group and the eigen projection matrix and that defines the correspondence between the data of the first condition and an intermediate eigenspace and the correspondence between the data of the second condition and the intermediate eigenspace;
filter means for generating low-frequency-component-suppressed input data in which the high-frequency components, or the high-frequency components and medium-frequency components, of input data to be processed are extracted; and
first sub-tensor projection means for projecting the low-frequency-component-suppressed input data by a first projection operation using the eigen projection matrix and the first sub-kernel tensor acquired from the information acquisition means, to calculate a coefficient vector in the intermediate eigenspace.
- A program for causing a computer to function as:
information acquisition means for acquiring an eigen projection matrix generated by a projection operation from a learning data group including a data pair in which at least the medium-frequency components or high-frequency components of data of a first condition and data of a second condition, the conditions differing from each other, are paired, and a first sub-kernel tensor created, as corresponding to a condition specified by a first setting, from a projection kernel tensor that is generated from the learning data group and the eigen projection matrix and that defines the correspondence between the data of the first condition and an intermediate eigenspace and the correspondence between the data of the second condition and the intermediate eigenspace;
filter means for generating low-frequency-component-suppressed input data in which the high-frequency components, or the high-frequency components and medium-frequency components, of input data to be processed are extracted; and
first sub-tensor projection means for projecting the low-frequency-component-suppressed input data by a first projection operation using the eigen projection matrix and the first sub-kernel tensor acquired from the information acquisition means, to calculate a coefficient vector in the intermediate eigenspace.
- The program according to claim 29 or 30, wherein the projection operation includes a projection operation utilizing a local relationship.
- A recording medium on which the program according to any one of claims 19 to 22 and 29 to 31 is recorded.
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201080034113.3A CN102473279B (zh) | 2009-07-31 | 2010-07-26 | 图像处理装置和方法、数据处理装置和方法 |
US13/388,036 US8565518B2 (en) | 2009-07-31 | 2010-07-26 | Image processing device and method, data processing device and method, program, and recording medium |
EP10804356.3A EP2461289A4 (en) | 2009-07-31 | 2010-07-26 | IMAGE PROCESSING DEVICE AND METHOD, DATA PROCESSING DEVICE AND METHOD, PROGRAM, AND RECORDING MEDIUM |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2009-179842 | 2009-07-31 | ||
JP2009179842A JP5506274B2 (ja) | 2009-07-31 | 2009-07-31 | 画像処理装置及び方法、データ処理装置及び方法、並びにプログラム |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2011013610A1 true WO2011013610A1 (ja) | 2011-02-03 |
Family
ID=43529265
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2010/062510 WO2011013610A1 (ja) | 2009-07-31 | 2010-07-26 | 画像処理装置及び方法、データ処理装置及び方法、並びにプログラム及び記録媒体 |
Country Status (5)
Country | Link |
---|---|
US (1) | US8565518B2 (ja) |
EP (1) | EP2461289A4 (ja) |
JP (1) | JP5506274B2 (ja) |
CN (1) | CN102473279B (ja) |
WO (1) | WO2011013610A1 (ja) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2011035658A (ja) * | 2009-07-31 | 2011-02-17 | Fujifilm Corp | 画像処理装置及び方法、データ処理装置及び方法、並びにプログラム |
JP2012231367A (ja) * | 2011-04-27 | 2012-11-22 | Fujifilm Corp | 画像圧縮装置、画像伸長装置、方法、及びプログラム |
CN111736712A (zh) * | 2020-06-24 | 2020-10-02 | 北京百度网讯科技有限公司 | 输入信息的预测方法、系统、服务器及电子设备 |
CN111881858A (zh) * | 2020-07-31 | 2020-11-03 | 中南大学 | 一种微震信号多尺度去噪方法、装置及可读存储介质 |
CN113904764A (zh) * | 2021-09-18 | 2022-01-07 | 大连大学 | 基于多尺度压缩感知和马尔科夫模型的图像加密方法 |
Families Citing this family (25)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP5476955B2 (ja) * | 2009-12-04 | 2014-04-23 | ソニー株式会社 | 画像処理装置および画像処理方法、並びにプログラム |
KR20120137413A (ko) * | 2010-03-12 | 2012-12-20 | 국립대학법인 나고야공업대학 | 화상 처리 장치, 화상 처리 프로그램을 기록한 컴퓨터 판독가능 기록 매체, 및 화상을 생성하는 방법 |
US9053562B1 (en) | 2010-06-24 | 2015-06-09 | Gregory S. Rabin | Two dimensional to three dimensional moving image converter |
JP5751184B2 (ja) * | 2012-01-31 | 2015-07-22 | Nkワークス株式会社 | 画像処理プログラム、画像処理装置および画像処理方法 |
WO2015042873A1 (en) * | 2013-09-27 | 2015-04-02 | Google Inc. | Decomposition techniques for multi-dimensional data |
US9159123B2 (en) * | 2014-01-24 | 2015-10-13 | Adobe Systems Incorporated | Image prior as a shared basis mixture model |
US9384402B1 (en) * | 2014-04-10 | 2016-07-05 | Google Inc. | Image and video compression for remote vehicle assistance |
JP5847228B2 (ja) * | 2014-04-16 | 2016-01-20 | オリンパス株式会社 | 画像処理装置、画像処理方法及び画像処理プログラム |
US10225575B2 (en) * | 2014-09-03 | 2019-03-05 | Nec Corporation | Image reconstruction in which unknown patch is replaced by selected patch |
CN106297768B (zh) * | 2015-05-11 | 2020-01-17 | 苏州大学 | 一种语音识别方法 |
US9589323B1 (en) * | 2015-08-14 | 2017-03-07 | Sharp Laboratories Of America, Inc. | Super resolution image enhancement technique |
US10180782B2 (en) * | 2015-08-20 | 2019-01-15 | Intel Corporation | Fast image object detector |
US10402696B2 (en) * | 2016-01-04 | 2019-09-03 | Texas Instruments Incorporated | Scene obstruction detection using high pass filters |
DE112016007498B4 (de) * | 2016-12-06 | 2020-11-26 | Mitsubishi Electric Corporation | Untersuchungseinrichtung und untersuchungsverfahren |
KR102351083B1 (ko) * | 2017-08-30 | 2022-01-13 | 삼성전자주식회사 | 디스플레이 장치 및 그 영상 처리 방법 |
CN109035143B (zh) * | 2018-07-17 | 2020-09-08 | 华中科技大学 | 一种基于贝塞尔光片成像的三维超分辨方法 |
US10325371B1 (en) * | 2019-01-22 | 2019-06-18 | StradVision, Inc. | Method and device for segmenting image to be used for surveillance using weighted convolution filters for respective grid cells by converting modes according to classes of areas to satisfy level 4 of autonomous vehicle, and testing method and testing device using the same |
US10339424B1 (en) * | 2019-01-22 | 2019-07-02 | StradVision, Inc. | Method and device of neural network operations using a grid generator for converting modes according to classes of areas to satisfy level 4 of autonomous vehicles |
US10373317B1 (en) * | 2019-01-22 | 2019-08-06 | StradVision, Inc. | Learning method and learning device for attention-driven image segmentation by using at least one adaptive loss weight map to be used for updating HD maps required to satisfy level 4 of autonomous vehicles and testing method and testing device using the same |
US10402977B1 (en) * | 2019-01-25 | 2019-09-03 | StradVision, Inc. | Learning method and learning device for improving segmentation performance in road obstacle detection required to satisfy level 4 and level 5 of autonomous vehicles using laplacian pyramid network and testing method and testing device using the same |
US10410352B1 (en) * | 2019-01-25 | 2019-09-10 | StradVision, Inc. | Learning method and learning device for improving segmentation performance to be used for detecting events including pedestrian event, vehicle event, falling event and fallen event using edge loss and test method and test device using the same |
CN110136106B (zh) * | 2019-05-06 | 2022-12-27 | 腾讯医疗健康(深圳)有限公司 | 医疗内窥镜图像的识别方法、系统、设备和内窥镜影像系统 |
CN112950463A (zh) * | 2019-12-11 | 2021-06-11 | 香港理工大学深圳研究院 | 一种图像超分辨率方法、图像超分辨率装置及终端设备 |
US11412133B1 (en) * | 2020-06-26 | 2022-08-09 | Amazon Technologies, Inc. | Autonomously motile device with computer vision |
CN113239835B (zh) * | 2021-05-20 | 2022-07-15 | 中国科学技术大学 | 模型感知的手势迁移方法 |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2001222702A (ja) * | 2000-02-07 | 2001-08-17 | Sony Corp | 画像処理装置および画像処理方法、並びに記録媒体 |
JP2002170112A (ja) * | 2000-12-04 | 2002-06-14 | Minolta Co Ltd | 解像度変換プログラムを記録したコンピュータ読取可能な記録媒体、解像度変換装置および解像度変換方法 |
JP2006350498A (ja) | 2005-06-14 | 2006-12-28 | Fujifilm Holdings Corp | 画像処理装置および方法並びにプログラム |
JP2007188419A (ja) | 2006-01-16 | 2007-07-26 | Fujifilm Corp | 顔検出方法および装置並びにプログラム |
JP2008084213A (ja) * | 2006-09-28 | 2008-04-10 | Sony Corp | 画像処理装置、撮像装置、画像処理方法およびプログラム |
JP2008167950A (ja) | 2007-01-12 | 2008-07-24 | Fujifilm Corp | 放射線画像処理方法および装置ならびにプログラム |
JP2008167949A (ja) | 2007-01-12 | 2008-07-24 | Fujifilm Corp | 放射線画像処理方法および装置ならびにプログラム |
JP2008167948A (ja) | 2007-01-12 | 2008-07-24 | Fujifilm Corp | 放射線画像処理方法および装置ならびにプログラム |
JP2008192031A (ja) * | 2007-02-07 | 2008-08-21 | Nec Corp | 圧縮方法、圧縮装置、圧縮データ復元方法、圧縮データ復元装置、可視化方法および可視化装置 |
JP2008229161A (ja) | 2007-03-22 | 2008-10-02 | Fujifilm Corp | 画像成分分離装置、方法、およびプログラム、ならびに、正常画像生成装置、方法、およびプログラム |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7280985B2 (en) * | 2001-12-06 | 2007-10-09 | New York University | Logic arrangement, data structure, system and method for multilinear representation of multimodal data ensembles for synthesis, recognition and compression |
US7822693B2 (en) * | 2001-12-06 | 2010-10-26 | New York University | Logic arrangement, data structure, system and method for multilinear representation of multimodal data ensembles for synthesis, recognition and compression |
US7379925B2 (en) * | 2003-07-25 | 2008-05-27 | New York University | Logic arrangement, data structure, system and method for multilinear representation of multimodal data ensembles for synthesis, rotation and compression |
US20100067772A1 (en) | 2007-01-12 | 2010-03-18 | Fujifilm Corporation | Radiation image processing method, apparatus and program |
-
2009
- 2009-07-31 JP JP2009179842A patent/JP5506274B2/ja not_active Expired - Fee Related
-
2010
- 2010-07-26 WO PCT/JP2010/062510 patent/WO2011013610A1/ja active Application Filing
- 2010-07-26 US US13/388,036 patent/US8565518B2/en not_active Expired - Fee Related
- 2010-07-26 CN CN201080034113.3A patent/CN102473279B/zh not_active Expired - Fee Related
- 2010-07-26 EP EP10804356.3A patent/EP2461289A4/en not_active Withdrawn
Patent Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2001222702A (ja) * | 2000-02-07 | 2001-08-17 | Sony Corp | 画像処理装置および画像処理方法、並びに記録媒体 |
JP2002170112A (ja) * | 2000-12-04 | 2002-06-14 | Minolta Co Ltd | 解像度変換プログラムを記録したコンピュータ読取可能な記録媒体、解像度変換装置および解像度変換方法 |
JP2006350498A (ja) | 2005-06-14 | 2006-12-28 | Fujifilm Holdings Corp | 画像処理装置および方法並びにプログラム |
JP2007188419A (ja) | 2006-01-16 | 2007-07-26 | Fujifilm Corp | 顔検出方法および装置並びにプログラム |
JP2008084213A (ja) * | 2006-09-28 | 2008-04-10 | Sony Corp | 画像処理装置、撮像装置、画像処理方法およびプログラム |
JP2008167950A (ja) | 2007-01-12 | 2008-07-24 | Fujifilm Corp | 放射線画像処理方法および装置ならびにプログラム |
JP2008167949A (ja) | 2007-01-12 | 2008-07-24 | Fujifilm Corp | 放射線画像処理方法および装置ならびにプログラム |
JP2008167948A (ja) | 2007-01-12 | 2008-07-24 | Fujifilm Corp | 放射線画像処理方法および装置ならびにプログラム |
JP2008192031A (ja) * | 2007-02-07 | 2008-08-21 | Nec Corp | 圧縮方法、圧縮装置、圧縮データ復元方法、圧縮データ復元装置、可視化方法および可視化装置 |
JP2008229161A (ja) | 2007-03-22 | 2008-10-02 | Fujifilm Corp | 画像成分分離装置、方法、およびプログラム、ならびに、正常画像生成装置、方法、およびプログラム |
Non-Patent Citations (5)
Title |
---|
ATKINS, C.B.; BOUMAN, C.A.; ALLEBACH, J.P.: "Optimal image scaling using pixel classification", IEEE, IMAGE PROCESSING, 2001. PROCEEDINGS. 2001 INTERNATIONAL CONFERENCE, vol. 3, 2001, pages 864 - 867, XP010563487, DOI: doi:10.1109/ICIP.2001.958257 |
JIA KUI; GONG SHAOGANG: "Generalized Face Super-Resolution", IEEE TRANSACTIONS OF IMAGE PROCESSING, vol. 17, no. 6, June 2008 (2008-06-01), pages 873 - 886 |
K. JIA ET AL.: "Generalized Face Super-Resolution", IEEE TRANSACTIONS ON IMAGE PROCESSING, vol. 17, no. 6, 30 June 2008 (2008-06-30), pages 873 - 886, XP011208277 * |
See also references of EP2461289A4 * |
ZHUANG YUETING; ZHANG JIAN; WU FEI: "Hallucinating faces: LPH super-resolution and neighbor reconstruction for residue compensation", PATTERN RECOGN, vol. 40, no. 11, 2007, pages 3178 - 3194 |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2011035658A (ja) * | 2009-07-31 | 2011-02-17 | Fujifilm Corp | 画像処理装置及び方法、データ処理装置及び方法、並びにプログラム |
JP2012231367A (ja) * | 2011-04-27 | 2012-11-22 | Fujifilm Corp | 画像圧縮装置、画像伸長装置、方法、及びプログラム |
US8805105B2 (en) | 2011-04-27 | 2014-08-12 | Fujifilm Corporation | Image compression apparatus, image expansion apparatus, and methods and programs thereof |
CN111736712A (zh) * | 2020-06-24 | 2020-10-02 | 北京百度网讯科技有限公司 | 输入信息的预测方法、系统、服务器及电子设备 |
CN111736712B (zh) * | 2020-06-24 | 2023-08-18 | 北京百度网讯科技有限公司 | 输入信息的预测方法、系统、服务器及电子设备 |
CN111881858A (zh) * | 2020-07-31 | 2020-11-03 | 中南大学 | 一种微震信号多尺度去噪方法、装置及可读存储介质 |
CN111881858B (zh) * | 2020-07-31 | 2024-02-13 | 中南大学 | 一种微震信号多尺度去噪方法、装置及可读存储介质 |
CN113904764A (zh) * | 2021-09-18 | 2022-01-07 | 大连大学 | 基于多尺度压缩感知和马尔科夫模型的图像加密方法 |
Also Published As
Publication number | Publication date |
---|---|
EP2461289A4 (en) | 2013-05-15 |
US20120134579A1 (en) | 2012-05-31 |
CN102473279B (zh) | 2014-07-23 |
EP2461289A1 (en) | 2012-06-06 |
JP2011034345A (ja) | 2011-02-17 |
US8565518B2 (en) | 2013-10-22 |
CN102473279A (zh) | 2012-05-23 |
JP5506274B2 (ja) | 2014-05-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP5506274B2 (ja) | 画像処理装置及び方法、データ処理装置及び方法、並びにプログラム | |
JP5506273B2 (ja) | 画像処理装置及び方法、データ処理装置及び方法、並びにプログラム | |
JP5178662B2 (ja) | 画像処理装置及び方法、データ処理装置及び方法、並びにプログラム | |
JP5161845B2 (ja) | 画像処理装置及び方法、データ処理装置及び方法、並びにプログラム | |
JP5506272B2 (ja) | 画像処理装置及び方法、データ処理装置及び方法、並びにプログラム | |
JP5366855B2 (ja) | 画像処理方法及び装置並びにプログラム | |
JP5684488B2 (ja) | 画像処理装置、画像処理方法およびプログラム | |
JP5335713B2 (ja) | 画像処理方法及び装置並びにプログラム | |
JP2010272109A (ja) | 画像処理装置、画像処理方法およびプログラム | |
JP5193931B2 (ja) | 画像処理装置、画像処理方法およびプログラム | |
JP5352332B2 (ja) | 画像処理装置、画像処理方法およびプログラム | |
CN108182429B (zh) | 基于对称性的人脸图像特征提取的方法及装置 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
WWE | Wipo information: entry into national phase |
Ref document number: 201080034113.3 Country of ref document: CN |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 10804356 Country of ref document: EP Kind code of ref document: A1 |
|
REEP | Request for entry into the european phase |
Ref document number: 2010804356 Country of ref document: EP |
|
WWE | Wipo information: entry into national phase |
Ref document number: 13388036 Country of ref document: US Ref document number: 2010804356 Country of ref document: EP |
|
NENP | Non-entry into the national phase |
Ref country code: DE |