WO2021229984A1 - Image processing device, image processing method, and program - Google Patents

Image processing device, image processing method, and program

Info

Publication number
WO2021229984A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
skin
unit
analysis unit
noise
Application number
PCT/JP2021/015389
Other languages
French (fr)
Japanese (ja)
Inventor
Shinichiro Gomi (信一郎 五味)
Teppei Kurita (哲平 栗田)
Original Assignee
Sony Group Corporation (ソニーグループ株式会社)
Application filed by Sony Group Corporation (ソニーグループ株式会社)
Publication of WO2021229984A1

Classifications

    • A: HUMAN NECESSITIES
        • A61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
            • A61B: DIAGNOSIS; SURGERY; IDENTIFICATION
                • A61B 5/00: Measuring for diagnostic purposes; Identification of persons
                    • A61B 5/103: Detecting, measuring or recording devices for testing the shape, pattern, colour, size or movement of the body or parts thereof, for diagnostic purposes
                        • A61B 5/107: Measuring physical dimensions, e.g. size of the entire body or parts thereof
    • G: PHYSICS
        • G01: MEASURING; TESTING
            • G01N: INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
                • G01N 21/00: Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
                    • G01N 21/17: Systems in which incident light is modified in accordance with the properties of the material investigated
                        • G01N 21/21: Polarisation-affecting properties
                        • G01N 21/25: Colour; Spectral properties, i.e. comparison of effect of material on the light at two or more different wavelengths or wavelength bands
                            • G01N 21/27: Colour; Spectral properties using photo-electric detection; circuits for computing concentration
                        • G01N 21/55: Specular reflectivity
                            • G01N 21/57: Measuring gloss
        • G06: COMPUTING; CALCULATING OR COUNTING
            • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
                • G06T 5/60
                • G06T 5/70
                • G06T 2207/00: Indexing scheme for image analysis or image enhancement
                    • G06T 2207/20: Special algorithmic details
                        • G06T 2207/20084: Artificial neural networks [ANN]

Definitions

  • This disclosure relates to an image processing device, an image processing method, and a program. More specifically, it relates to an image processing apparatus, an image processing method, and a program for performing analysis processing of human skin.
  • Photographing the surface of human skin with a close-up camera, observing and diagnosing the skin condition from the captured image, and quantifying skin conditions such as texture and pores by analyzing the captured image are widely practiced; based on the results, skin care is carried out from the viewpoint of health and beauty.
  • However, the accuracy of analyzing the smoothness and uneven shape of the skin surface varies with the degree of shading produced by the illumination at the time of shooting and by the surface unevenness itself. Furthermore, spots and hair on the skin surface are easily misrecognized as shadows, and such misrecognition can introduce errors into the analysis result.
  • Patent Document 1 (Japanese Unexamined Patent Publication No. 2015-187849) discloses a configuration in which a body hair (eyelash) region and a skin region are separated by binarizing the difference between a gray image and an edge image.
  • Patent Document 2 (Japanese Unexamined Patent Publication No. 2013-188341) discloses a configuration in which a hair region and a skin region are separated by the color components of an image.
  • The present disclosure has been made in view of the above problems. For example, a polarized image of human skin is captured and analyzed to separate specular reflection from internal scattering, thereby excluding the effects of moles and spots. A spectral image is then captured and analyzed to detect regions containing a large amount of melanin pigment, such as hair and spots, which are excluded from the analysis target, so that only the shadow component caused by the unevenness of the skin surface is selectively acquired. The selected data is analyzed to generate analysis data on the texture and unevenness of the skin surface. It is an object of the present disclosure to provide an image processing apparatus, an image processing method, and a program that realize highly accurate analysis processing of human skin through these processes.
  • The first aspect of the present disclosure is an image processing apparatus having: an image acquisition unit that acquires skin images; an image analysis unit that analyzes the images acquired by the image acquisition unit; and a three-dimensional shape analysis unit that analyzes the three-dimensional shape of the skin using the analysis results of the image analysis unit.
  • The image acquisition unit acquires a plurality of polarized images of light of different wavelengths, and the image analysis unit analyzes the polarized images to generate a noise-removed skin image from which noise has been removed.
  • The three-dimensional shape analysis unit analyzes the three-dimensional shape of the skin using the noise-removed skin image.
  • The second aspect of the present disclosure is an image processing method executed in an image processing device, having:
  • an image acquisition process in which the image acquisition unit acquires skin images;
  • an image analysis process in which the image analysis unit analyzes the images acquired by the image acquisition unit; and
  • a three-dimensional shape analysis process in which the three-dimensional shape analysis unit analyzes the three-dimensional shape of the skin using the analysis results of the image analysis unit.
  • In the image acquisition process, the image acquisition unit acquires a plurality of polarized images of light of different wavelengths; in the image analysis process, the image analysis unit analyzes the polarized images to generate a noise-removed skin image from which noise has been removed; and in the three-dimensional shape analysis process, the three-dimensional shape analysis unit analyzes the three-dimensional shape of the skin using the noise-removed skin image.
  • The third aspect of the present disclosure is a program that causes an image processing device to execute image processing, the program causing:
  • the image acquisition unit to execute an image acquisition process that acquires skin images;
  • the image analysis unit to execute an image analysis process that analyzes the images acquired by the image acquisition unit; and
  • the three-dimensional shape analysis unit to execute a three-dimensional shape analysis process that analyzes the three-dimensional shape of the skin using the analysis results of the image analysis unit.
  • In the image acquisition process, a plurality of polarized images of light of different wavelengths is acquired; in the image analysis process, the polarized images are analyzed to generate a noise-removed skin image from which noise has been removed; and in the three-dimensional shape analysis process, the three-dimensional shape of the skin is analyzed using the noise-removed skin image.
  • The program of the present disclosure is, for example, a program that can be provided by a storage medium or communication medium that supplies the program in a computer-readable format to an information processing device or computer system capable of executing various program codes.
  • Note that in this specification, a system is a logical set of a plurality of devices; the devices of each configuration are not limited to being in the same housing.
  • According to the configuration of one embodiment of the present disclosure, a noise-removed skin image that accurately reflects the unevenness of the skin, with noise such as hair and spots on the user's face removed, is generated, and highly accurate analysis of the three-dimensional shape of the skin is realized.
  • Specifically, for example, the configuration has an image acquisition unit that acquires an image of skin such as the face, an image analysis unit that analyzes the skin image acquired by the image acquisition unit, and a three-dimensional shape analysis unit that analyzes the three-dimensional shape of the skin using the analysis results of the image analysis unit.
  • The image acquisition unit acquires a plurality of polarized images of light of different wavelengths, and the image analysis unit analyzes the polarized images to generate a specular reflection component image of the skin surface and a melanin pigment concentration index value image; using these images, a noise-removed skin image in which noise such as body hair and spots has been removed is generated.
  • The three-dimensional shape analysis unit analyzes the three-dimensional shape of the skin with high accuracy using this noise-removed skin image. With this configuration, a noise-removed skin image that accurately reflects the unevenness of the skin, with noise such as hair and spots on the user's face removed, can be generated, and highly accurate analysis of the three-dimensional shape of the skin is realized. It should be noted that the effects described in the present specification are merely exemplary, not limiting, and there may be additional effects.
  • The image processing device of the present disclosure photographs the skin of a person's face with a close-up camera, analyzes the image captured by this camera, and generates and displays a highly accurate analysis result.
  • the outline of the processing executed by the image processing apparatus of the present disclosure is as follows.
  • First, a polarization sensor camera is used to capture a polarized image of the skin of a person's face, and the captured polarized image is analyzed to separate specular reflection from internal scattering, thereby excluding the effects of moles and spots.
  • Next, a spectral image is captured and analyzed to detect regions containing a large amount of melanin pigment, for example hair and spot regions, which are excluded from the analysis target, so that only the shadow component caused by the unevenness of the skin surface is selectively acquired. This selected data is analyzed to generate analysis data on the texture and unevenness of the skin surface.
  • Through these processes, the present disclosure analyzes the shape of the skin surface, that is, the three-dimensional (3D) shape of the skin, without being affected by body hair or spots, and based on the analysis results generates highly accurate analysis data on wrinkles, texture, and the like of human skin and provides it to the user.
  • FIGS. 1 and 2 are diagrams showing examples of the UI (user interface) displayed on the display unit of the image processing apparatus of the present disclosure.
  • FIG. 1 is an example of the initial screen displayed to the user. As shown in FIG. 1, the initial screen includes the following display data:
  • (a) User operation guide image
  • (b) Camera-captured skin image
  • (c) Shooting start icon
  • (a) The user operation guide image is an explanatory image explaining the operation to be performed by the user.
  • the example shown in the figure is an example explaining that the camera is placed on the cheek to take a picture.
  • (b) The camera-captured skin image is an actual photographed image taken by the camera that the user has placed against his or her cheek.
  • (c) The shooting start icon is an icon corresponding to a switch (shutter); touching the icon causes the camera to shoot.
  • When the user touches the shooting start icon on this initial screen, the image processing device captures a skin image of the user's face.
  • Next, the image processing device starts the analysis process of the captured image.
  • When the analysis process is completed, the image processing device generates an analysis result and displays it on the display unit.
  • FIG. 2 is a diagram showing an example of display data of analysis results.
  • the example shown in FIG. 2 is a display example of the texture analysis result of the user's skin.
  • There are various types of analysis data; the example shown in FIG. 2 is one of them.
  • The texture evaluation values of three places and a comprehensive evaluation value are displayed, based on captured images of three skin areas: the user's forehead, cheek, and chin.
  • the user's skin image, the analysis result image corresponding to the skin image, and the like are also displayed.
  • the analysis data is not limited to the data shown in FIG. 2, and there are various data.
  • FIG. 3 is a diagram showing a configuration example of the image processing apparatus of the present disclosure.
  • the image processing apparatus 100 of the present disclosure includes an image acquisition unit (camera) 110, an image analysis unit 120, a three-dimensional (3D) shape analysis unit 130, and a display unit 140.
  • the image acquisition unit (camera) 110 is, for example, a close-up camera that photographs the skin of a person's face, and has a polarized image acquisition unit 111 that supports a plurality of colors.
  • The image analysis unit 120 includes a polarization signal analysis unit 121, a pigment signal analysis unit 122, and a signal discrimination unit 123.
  • the three-dimensional (3D) shape analysis unit 130 includes a normal information estimation unit 131, a distance information conversion unit 132, and a distance information analysis unit 133.
  • the display unit 140 includes a measurement information display unit 141, a signal information display unit 142, a three-dimensional shape display unit 143, and a measurement status display unit 144.
  • the image acquisition unit (camera) 110 acquires image data for analysis in the image analysis unit 120 in the subsequent stage.
  • The multi-color compatible polarized image acquisition unit 111 of the image acquisition unit 110 acquires polarized images of a plurality of colors, specifically, for example, white light, red light, and near-infrared (NIR) light.
  • the image analysis unit 120 inputs the measurement result of the image acquisition unit 110 and performs signal analysis.
  • The polarization signal analysis unit 121 of the image analysis unit 120 uses the polarized images acquired by the multi-color compatible polarized image acquisition unit 111 of the image acquisition unit 110 to separate the polarization component signal into a specular reflection light component and other components (internal scattered light, etc.).
  • The pigment signal analysis unit 122 of the image analysis unit 120 analyzes the polarized images corresponding to red (R) light and near-infrared (NIR) light acquired by the multi-color compatible polarized image acquisition unit 111 of the image acquisition unit 110, and analyzes pigment signals that cause disturbance in regions other than human skin.
  • The signal discrimination unit 123 of the image analysis unit 120 receives the analysis results of the polarization signal analysis unit 121 and the pigment signal analysis unit 122, and generates an image signal that reflects the uneven shape of the skin surface with the influence of disturbances such as hair and spots removed.
  • the three-dimensional (3D) shape analysis unit 130 analyzes the three-dimensional (3D) shape of the skin included in the image captured by the camera using the signal output from the image analysis unit 120.
  • the normal information estimation unit 131 of the three-dimensional (3D) shape analysis unit 130 estimates the normal information of the skin surface.
  • the normal is a line orthogonal to the surface of the object. In the process of the present disclosure, it corresponds to a line orthogonal to the skin surface.
  • the distance information conversion unit 132 of the three-dimensional (3D) shape analysis unit 130 converts the normal information on the skin surface estimated by the normal information estimation unit 131 into distance information indicating the uneven shape of the skin surface.
  • The distance information analysis unit 133 of the three-dimensional (3D) shape analysis unit 130 uses the distance information generated by the distance information conversion unit 132 to calculate and analyze index values that serve as evaluation indexes for the texture of the skin, such as the roughness coefficient of the skin surface.
  • the display unit 140 displays the data acquired and analyzed by each of the image acquisition unit (camera) 110, the image analysis unit 120, and the three-dimensional (3D) shape analysis unit 130.
  • the measurement information display unit 141 of the display unit 140 displays the information acquired or measured by the image acquisition unit 110.
  • the signal information display unit 142 of the display unit 140 displays the information analyzed by the image analysis unit 120.
  • the three-dimensional shape display unit 143 of the display unit 140 displays the three-dimensional shape information of the human skin analyzed by the three-dimensional (3D) shape analysis unit 130.
  • the measurement status display unit 144 of the display unit 140 displays information on the progress of processing being executed by the image acquisition unit 110 to the three-dimensional (3D) shape analysis unit 130.
  • As described above, the multi-color compatible polarized image acquisition unit 111 of the image acquisition unit (camera) 110 acquires polarized images of a plurality of colors, specifically, for example, white light, red light, and near-infrared (NIR) light.
  • FIG. 4 is a diagram showing a configuration example of the image acquisition unit (camera) 110.
  • the image acquisition unit (camera) 110 has an image pickup unit 210 and an illumination unit 220 around the image pickup unit.
  • the illumination unit 220 around the image pickup unit 210 is composed of the following three types of illumination.
  • (a) Illumination A 221: white LEDs with a polarizing filter (of a fixed polarization direction) placed in front
  • (b) Illumination B 222: red LEDs
  • (c) Illumination C 223: near-infrared (NIR) LEDs
  • The illumination A 221 is composed of LEDs that output light in the visible region of about 400 to 700 nm. The illumination B 222 is composed of LEDs that output light in the red (R) region of about 660 nm. The illumination C 223 is composed of LEDs that output light in the near-infrared (NIR) region of about 880 nm.
  • the image acquisition unit (camera) 110 sequentially turns on these three types of lights A to C for the same skin area, and acquires three images taken in three different lighting environments.
  • The image pickup unit 210 is composed of a polarization sensor camera from which the infrared (IR) cut filter attached to many general cameras has been removed.
  • each pixel constituting the image pickup device of the image pickup unit 210 is provided with a polarizing element that functions as an optical filter that allows only light polarized in a specific direction to pass through.
  • a photoelectric conversion element that receives light that has passed through the polarizing element is provided below the polarizing element.
  • the hatching shown in each pixel of the image pickup device shown in the lower right of FIG. 5 indicates the polarization direction.
  • The polarization directions of the four pixels a231, b232, c233, and d234 shown in the lower right of FIG. 5 are set as follows.
  • the polarization direction of the pixel a231 is the horizontal direction, and the pixel a receives only the horizontal polarization. That is, the pixel a231 is a 0-degree polarized pixel.
  • the polarization direction of the pixel b232 is the lower left diagonal direction, and the pixel b receives only the polarized light in the lower left diagonal direction. That is, the pixel b232 is a 45-degree polarized pixel.
  • The polarization direction of the pixel c233 is the vertical direction, and the pixel c receives only vertically polarized light. That is, the pixel c233 is a 90-degree polarized pixel.
  • the polarization direction of the pixel d234 is the upper left diagonal direction, and the pixel d receives only the upper left oblique polarized light. That is, the pixel d234 is a 135 degree polarized pixel.
  • FIG. 6 is a diagram showing a cross-sectional configuration of an image pickup device of the image pickup unit 210.
  • The cross section of the image sensor has a laminated structure in which the layers shown in FIG. 6 are arranged from the top (the surface of the image sensor) to the bottom (the inside of the image sensor). That is, the image pickup unit 210 has a laminated structure including each of the layers (1) to (3).
  • Light input to the image sensor during image capture passes through the silicon lens and the polarizing element, and is received by the photoelectric conversion element.
  • In this way, the image pickup unit 210 has:
  • (a) a plurality of polarizing elements that pass polarized light of a plurality of different polarization directions; and
  • (b) photoelectric conversion elements, set corresponding to each of the plurality of polarizing elements, that receive incident light via each polarizing element and acquire a polarized image.
  • The photoelectric conversion element of each pixel receives light of only one specific polarization direction. Therefore, each specific polarized image is captured by only one pixel out of every four pixels of the image sensor.
  • The process of generating a polarized image for all pixels from the polarized image of only some pixels (demosaic processing) is executed by the polarization signal analysis unit 121 of the image analysis unit 120 in the subsequent stage. This process (demosaic processing) will be described later.
  • the image analysis unit 120 inputs the measurement result of the image acquisition unit 110 and performs signal analysis.
  • The polarization signal analysis unit 121 of the image analysis unit 120 uses the polarized images acquired by the multi-color compatible polarized image acquisition unit 111 of the image acquisition unit 110 to separate the polarization component signal into a specular reflection light component and other components (internal scattered light, etc.).
  • The pigment signal analysis unit 122 of the image analysis unit 120 analyzes the polarized images corresponding to red (R) light and near-infrared (NIR) light acquired by the multi-color compatible polarized image acquisition unit 111 of the image acquisition unit 110, and analyzes pigment signals that cause disturbance in regions other than human skin.
  • The signal discrimination unit 123 of the image analysis unit 120 receives the analysis results of the polarization signal analysis unit 121 and the pigment signal analysis unit 122, and generates an image signal that reflects the uneven shape of the skin surface with the influence of disturbances such as hair and spots removed.
  • The polarization signal analysis unit 121 uses the polarized images acquired by the multi-color compatible polarized image acquisition unit 111 of the image acquisition unit 110 to separate the polarization component signal into a specular reflection light component and other components (internal scattered light, etc.).
  • the polarization signal analysis unit 121 has a demosaic unit and a polarization model estimation unit.
  • The demosaic unit of the polarization signal analysis unit 121 executes demosaic processing that generates all four types of polarized images (0-degree polarized image, 45-degree polarized image, 90-degree polarized image, 135-degree polarized image) from the polarized images acquired by the multi-color compatible polarized image acquisition unit 111 of the image acquisition unit 110, in which, as described above, each polarization direction is received by only one of every four pixels of the image pickup element.
  • The polarization model estimation unit estimates a polarization model of the pixel values by image analysis processing using the four types of polarized images (0-degree, 45-degree, 90-degree, and 135-degree polarized images) generated by the demosaic unit.
  • The polarized images acquired by the multi-color compatible polarized image acquisition unit 111 of the image acquisition unit 110 are captured as four types of polarized images (a 0-degree polarized image, a 45-degree polarized image, a 90-degree polarized image, and a 135-degree polarized image) that differ pixel by pixel within each 4-pixel unit of the image pickup element.
  • That is, each polarized image (0-degree, 45-degree, 90-degree, or 135-degree) is captured by only one of every four pixels of the image sensor of the image pickup unit; the remaining three of the four pixels capture the other polarized images.
  • Therefore, the demosaic unit executes pixel value interpolation processing using the pixel values of a specific polarized image captured at one pixel out of every four, and thereby executes demosaic processing that sets the pixel values of that specific polarized image for all pixels.
  • The demosaic process is a pixel value interpolation process in which the pixel values of certain pixels are used to estimate and set the pixel values of pixels for which no value is set; there are various methods.
  • the example shown in FIG. 7 is a diagram illustrating bilinear interpolation, which is a typical example of pixel value interpolation processing.
  • In FIG. 7, the pixel value of the 90-degree polarized image is set at only one of every four pixels of the image pickup device of the image pickup unit 210.
  • Pixel values of the 90-degree polarized image are set at the pixels a, b, c, and d shown in FIG. 7.
  • Among the four pixels at the upper left of FIG. 7, no 90-degree pixel value is set at the pixels P, Q, and R other than the pixel a.
  • The pixel values of the 90-degree polarized image at the pixels P, Q, and R are therefore estimated and set.
  • The pixel values of the 90-degree polarized image at the P, Q, and R pixels can be calculated (estimated) by the bilinear pixel value interpolation algorithm according to the following formulas:
  P = (a + b) / 2
  Q = (a + c) / 2
  R = (a + b + c + d) / 4
  • the pixel value of the pixel for which the pixel value is not set can be calculated (estimated) by using the pixel value of the surrounding pixel.
  • The same calculation is applied to all pixels of the image pickup unit to obtain, for every pixel, the pixel values of the four types of polarized images (0-degree, 45-degree, 90-degree, and 135-degree polarized images).
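  • As a reference, the following is a minimal sketch of this bilinear demosaic step in Python with NumPy. The function name and the 2×2 mosaic layout passed in `offsets` are illustrative assumptions, not taken from this document; only the interpolation weights follow the formulas above.

```python
import numpy as np

def demosaic_polarization(mosaic, offsets):
    """Bilinear demosaic of a polarization mosaic image.

    mosaic  : (H, W) raw image; each 2x2 cell holds one pixel per
              polarization angle (0, 45, 90, 135 degrees).
    offsets : dict mapping angle -> (row, col) offset inside the 2x2
              cell (sensor-specific; an assumed layout, see below).
    Returns a dict mapping angle -> full-resolution (H, W) image.
    """
    h, w = mosaic.shape
    # Tent kernel: reproduces P=(a+b)/2, Q=(a+c)/2, R=(a+b+c+d)/4
    # when convolved with a grid that has samples every 2 pixels.
    kernel = np.array([[0.25, 0.5, 0.25],
                       [0.5,  1.0, 0.5],
                       [0.25, 0.5, 0.25]])
    full = {}
    for angle, (dy, dx) in offsets.items():
        sparse = np.zeros((h, w))
        sparse[dy::2, dx::2] = mosaic[dy::2, dx::2]
        padded = np.pad(sparse, 1)  # zero borders: simplified edge handling
        out = np.zeros((h, w))
        for ky in range(3):
            for kx in range(3):
                out += kernel[ky, kx] * padded[ky:ky + h, kx:kx + w]
        full[angle] = out
    return full

# Example with an assumed layout: 0/45 degrees on the top row,
# 135/90 degrees on the bottom row of each 2x2 cell.
# images = demosaic_polarization(raw, {0: (0, 0), 45: (0, 1),
#                                      135: (1, 0), 90: (1, 1)})
```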
  • The four types of polarized images (0-degree, 45-degree, 90-degree, and 135-degree polarized images) generated by this demosaic processing are input to the polarization model estimation unit, which is the processing unit following the demosaic unit within the polarization signal analysis unit 121.
  • The polarization model estimation unit estimates a polarization model of the pixel values by image analysis processing using the four types of polarized images (0-degree, 45-degree, 90-degree, and 135-degree polarized images) generated by the demosaic unit.
  • The graph shown in FIG. 8 plots the polarization angle (θ) on the horizontal axis and the luminance I(θ) on the vertical axis, and represents a polarization model. It is known that the brightness of one point of a polarized image taken by a camera changes with the polarization angle as shown in the graph of FIG. 8.
  • the polarization model graph shown in FIG. 8 shows the same luminance change every time the polarization angle changes by 180 degrees. That is, it is known to exhibit a luminance change having a polarization angle period of 180 degrees.
  • the highest brightness within the brightness change range is Imax, and the lowest brightness is Imin.
  • the specular reflection component reflected on the surface of the subject such as the surface of the skin is Is.
  • the curve of the graph shown in FIG. 8 can be obtained by luminance analysis of the captured image of the camera 250 in the configuration shown in FIG. 9, for example.
  • The subject (OB) 251 is photographed using the camera (CM) 250 in the configuration shown in FIG. 9.
  • the camera (CM) 250 captures a polarized image by capturing an image via the polarizing plate (PL) 252 in front of the camera (CM) 250.
  • the brightness of the subject (OB) 251 changes according to the rotation of the polarizing plate (PL) 252.
  • the highest brightness when the polarizing plate (PL) 252 is rotated is Imax, and the lowest brightness is Imin.
  • Let the rotation angle of the polarizing plate be the polarization angle θ.
  • When the polarizing plate (PL) 252 is rotated 180 degrees, it returns to the original polarization state; thus the period is 180 degrees.
  • the polarization angle ⁇ when the maximum luminance Imax is observed is defined as the azimuth angle ⁇ . With such a definition, the luminance I ( ⁇ ) observed when the polarizing plate (PL) 252 is rotated becomes a graph as shown in FIG.
  • The luminance I(θ) at the polarization angle θ is defined by the following equation using four parameters: the maximum luminance value Imax, the minimum luminance value Imin, the polarization angle θ, and the polarization angle that gives the maximum luminance value Imax, that is, the azimuth angle φ:

  I(θ) = (Imax + Imin) / 2 + ((Imax − Imin) / 2) · cos(2(θ − φ))
  • Of these, the unknown parameters are the following three: the maximum luminance value Imax, the minimum luminance value Imin, and the polarization angle that gives the maximum luminance value Imax, that is, the azimuth angle φ.
  • The known parameters are the luminances I(0°), I(45°), I(90°), and I(135°), and the polarization angles θ at which these luminances were acquired.
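  • The three unknowns have a closed-form solution given four measurements at equally spaced polarizer angles; since the patent's intermediate equations (such as (Equation 5)) are not reproduced in this excerpt, the following is a standard reconstruction of that solution from the model above:

$$
\begin{aligned}
S_0 &= I(0^\circ) + I(90^\circ) = I(45^\circ) + I(135^\circ) = I_{max} + I_{min},\\
S_1 &= I(0^\circ) - I(90^\circ), \qquad S_2 = I(45^\circ) - I(135^\circ),\\
I_{max} &= \tfrac{1}{2}\bigl(S_0 + \sqrt{S_1^2 + S_2^2}\bigr), \qquad
I_{min} = \tfrac{1}{2}\bigl(S_0 - \sqrt{S_1^2 + S_2^2}\bigr),\\
\phi &= \tfrac{1}{2}\,\operatorname{atan2}(S_2, S_1).
\end{aligned}
$$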
  • The specular reflection component Is reflected at the surface of the subject (skin surface) can then be calculated by the formula: Is = Imax − Imin.
  • the polarization model estimation unit of the polarization signal analysis unit 121 calculates the specular reflection component Is reflected on the subject surface (skin surface) by these processes.
  • The polarization model estimation unit of the polarization signal analysis unit 121 further obtains the maximum luminance value Imax and the minimum luminance value Imin calculated by the above (Equation 5), and calculates the specular reflection component Is = Imax − Imin reflected at the subject surface (skin surface).
  • In this way, the polarization model estimation unit of the polarization signal analysis unit 121 executes the specular reflection component extraction process by image analysis using the four types of all-pixel polarized images (0-degree, 45-degree, 90-degree, and 135-degree polarized images) generated by the demosaic unit.
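  • A minimal sketch of this specular component extraction in Python/NumPy follows, assuming the four demosaicked full-resolution polarized images from the previous step; the function name is illustrative, not from this document.

```python
import numpy as np

def specular_component(i0, i45, i90, i135):
    """Per-pixel specular reflection component Is = Imax - Imin,
    estimated from polarized images at 0/45/90/135 degrees."""
    s0 = i0 + i90                  # = Imax + Imin
    s1 = i0 - i90                  # = (Imax - Imin) * cos(2*phi)
    s2 = i45 - i135                # = (Imax - Imin) * sin(2*phi)
    amp = np.sqrt(s1**2 + s2**2)   # = Imax - Imin
    imax = 0.5 * (s0 + amp)
    imin = 0.5 * (s0 - amp)
    azimuth = 0.5 * np.arctan2(s2, s1)  # angle phi of maximum luminance
    return imax - imin, azimuth
```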
  • Next, the pigment signal analysis unit 122 of the image analysis unit 120 analyzes the polarized images corresponding to red (R) light and near-infrared (NIR) light acquired by the multi-color compatible polarized image acquisition unit 111 of the image acquisition unit 110, and analyzes the pigment signals that cause disturbance in regions other than human skin.
  • The pigment signal analysis unit 122 calculates the four-direction polarization component images (I(r0°), I(r45°), I(r90°), I(r135°)) from the image captured when the illumination B 222 in the illumination unit 220 of the image acquisition unit (camera) 110, described above with reference to FIG. 4, that is, the red LED, is lit, and calculates the red polarized-image pixel value average (I(r)) for each pixel according to (Equation 21):

  I(r) = (I(r0°) + I(r45°) + I(r90°) + I(r135°)) / 4 ... (Equation 21)
  • Similarly, for each corresponding pixel of the near-infrared (NIR) four-direction polarization component images (I(nir0°), I(nir45°), I(nir90°), I(nir135°)), the near-infrared polarized-image pixel value average (I(nir)) is calculated according to the following (Equation 22):

  I(nir) = (I(nir0°) + I(nir45°) + I(nir90°) + I(nir135°)) / 4 ... (Equation 22)
  • The pigment signal analysis unit 122 then uses the red polarized-image pixel value average (I(r)) of each pixel calculated according to the above (Equation 21) and the near-infrared (NIR) polarized-image pixel value average (I(nir)) of each pixel calculated according to the above (Equation 22) to calculate a melanin pigment concentration index value (MI: Melanin Index) for each pixel according to the following (Equation 23):

  MI = α(log I(nir) − log I(r)) + β ... (Equation 23)

  • Here, α and β are predetermined constants.
  • The melanin pigment concentration index value (MI: Melanin Index) shows a high value in regions such as body hair and spots.
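  • For illustration, (Equation 21) through (Equation 23) amount to the following per-pixel computation; the values of α and β are placeholders here, since this excerpt does not give them.

```python
import numpy as np

def melanin_index(r_pols, nir_pols, alpha=1.0, beta=0.0, eps=1e-6):
    """Melanin pigment concentration index value (Equation 23).

    r_pols, nir_pols : lists of the four polarized images
        [I(0), I(45), I(90), I(135)] under red and NIR illumination.
    alpha, beta : predetermined constants (placeholder values here).
    """
    i_r = sum(r_pols) / 4.0      # (Equation 21): red average image
    i_nir = sum(nir_pols) / 4.0  # (Equation 22): NIR average image
    # (Equation 23): MI is high in high-melanin regions (hair, spots).
    return alpha * (np.log(i_nir + eps) - np.log(i_r + eps)) + beta
```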
  • FIG. 10 shows a specific example.
  • FIG. 10 shows each of the following images.
  • In the (b) melanin pigment concentration index value (MI: Melanin Index) output image, regions with a high melanin pigment concentration are set to a pixel value different from that of the other skin regions (regions with a low melanin pigment concentration), for example a dark red pixel value.
  • The melanin pigment concentration index value (MI: Melanin Index) output image is an image in which pixel values are set according to the melanin pigment concentration, and the pixel value output mode can be set in various ways.
  • For example, it can be output as a luminance image, and images with various settings can be generated, such as an image whose luminance value is higher (whiter) where the melanin pigment concentration is higher, or one whose luminance value is lower (blacker) where the melanin pigment concentration is higher.
  • The pigment signal analysis unit 122 generates such a melanin pigment concentration index value (MI: Melanin Index) output image.
  • Next, the signal discrimination unit 123 of the image analysis unit 120 receives the analysis results of the polarization signal analysis unit 121 and the pigment signal analysis unit 122, and generates an image signal that reflects the uneven shape of the skin surface with the influence of disturbances such as hair and spots removed.
  • The signal discrimination unit 123 uses the specular reflection component signal obtained by the polarization signal analysis unit 121 and the melanin pigment concentration index value (MI: Melanin Index) obtained by the pigment signal analysis unit 122 to selectively extract the shadow component caused by the minute irregularities of the skin surface.
  • FIG. 11 shows each of the following images:
  • (a) Camera image
  • (b) Specular reflection component image (after brightness adjustment)
  • the "(b) specular component image (after brightness adjustment)" is generated by extracting only the specular reflection component generated by analyzing the polarized image described with reference to FIG. 8 above. It is an image that was made. That is, it is a specular reflection component image acquired by the polarization image analysis process executed by the polarization signal analysis unit 121 of the image analysis unit 120.
  • In the (a) camera image, the pixel value is low (low brightness) in image areas such as body hair, spots, and moles, and likewise in shadow areas such as skin grooves and wrinkles.
  • the "(b) specular component image” obtained by the analysis processing of the polarized image is an image in which only the shadow on the surface and the hair on the surface are reflected in the pixel value. The effect of spots, moles, etc. from the depths of the skin to the vicinity of the surface is hardly reflected in the pixel values.
  • On the other hand, the melanin pigment concentration index value output image generated by the pigment signal analysis unit 122 of the image analysis unit 120 is an image whose pixel values distinguish hair and spots/moles, which have a high melanin pigment concentration, from the other skin areas.
  • For example, the pigment signal analysis unit 122 of the image analysis unit 120 can output a melanin pigment concentration index value output image in which hair and spots/moles with a high melanin pigment concentration are set to higher pixel values (higher brightness) than the other skin regions. Conversely, it can also output a melanin pigment concentration index value output image in which the pixel values of hair and spots/moles with a high melanin pigment concentration are set lower (lower brightness) than those of the other skin areas.
  • The signal discrimination unit 123 of the image analysis unit 120 uses the following three types of images to generate an image reflecting the uneven shape of the skin surface from which noise such as disturbances caused by body hair and spots has been removed, that is, a noise-removed skin image:
  • (a) Camera image acquired by the image acquisition unit (camera) 110
  • (b) Specular reflection component image generated by the polarized image analysis processing executed by the polarization signal analysis unit 121 of the image analysis unit 120
  • (c) Melanin pigment concentration index value output image generated by the pigment signal analysis unit 122 of the image analysis unit 120
  • First, the signal discrimination unit 123 synthesizes the (b) specular reflection component image and the (c) melanin pigment concentration index value output image shown in FIG. 12 to generate the (d) composite image.
  • The (d) composite image is an image in which pixel regions with a low specular reflection component and a high melanin pigment concentration index value are output as low pixel values (low luminance); such regions are hereinafter referred to as dark portions.
  • Next, the signal discrimination unit 123 generates the (e) noise-removed skin image from the (b) specular reflection component image and the (d) composite image shown in FIG. 12.
  • The (e) noise-removed skin image is an image reflecting the uneven shape of the skin surface from which noise such as disturbances caused by body hair and spots has been removed.
  • For the portions of the (d) composite image other than the dark portions (hereinafter referred to as bright portions), the luminance values of the corresponding pixels of the (b) specular reflection component image are used as they are, without interpolation processing.
  • For the dark portions of the image, that is, the pixel regions with a low specular reflection component and a high melanin pigment concentration index value, the pixel values are set by interpolation using the pixel values of bright portions in the vicinity of each pixel, in other words, the corresponding pixel values of the (b) specular reflection component image.
  • In this way, the image analysis unit 120 receives the analysis results of the polarization signal analysis unit 121 and the pigment signal analysis unit 122, and generates an image signal reflecting the uneven shape of the skin surface from which the influence of disturbances such as body hair and spots has been removed.
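  • A minimal sketch of this signal discrimination step follows, assuming the specular component image and melanin index image computed above. The thresholds and the row-wise nearest-bright-pixel fill are illustrative assumptions; this excerpt does not specify the interpolation method.

```python
import numpy as np

def noise_removed_skin_image(specular, mi, spec_thresh, mi_thresh):
    """Generate the (e) noise-removed skin image.

    Dark portions = low specular component AND high melanin index
    (hair, spots, moles). Bright-portion pixels keep the specular
    luminance as-is; dark-portion pixels are filled from the nearest
    bright pixel in the same row (a simple stand-in interpolation).
    """
    dark = (specular < spec_thresh) & (mi > mi_thresh)
    out = specular.astype(np.float64).copy()
    for y in range(out.shape[0]):
        bright_cols = np.flatnonzero(~dark[y])
        if bright_cols.size == 0:
            continue  # the whole row is dark; leave it unchanged
        dark_cols = np.flatnonzero(dark[y])
        idx = np.clip(np.searchsorted(bright_cols, dark_cols),
                      0, bright_cols.size - 1)
        left = bright_cols[np.maximum(idx - 1, 0)]
        right = bright_cols[idx]
        nearest = np.where(dark_cols - left <= right - dark_cols,
                           left, right)
        out[y, dark_cols] = out[y, nearest]
    return out
```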
  • the three-dimensional (3D) shape analysis unit 130 analyzes the three-dimensional (3D) shape of the skin included in the image captured by the camera using the signal output from the image analysis unit 120.
  • That is, the three-dimensional (3D) shape of the skin included in the image captured by the camera is analyzed using the "(e) noise-removed skin image", the image signal reflecting the uneven shape of the skin surface from which the influence of disturbances such as hair and spots, described with reference to FIG. 12, has been removed.
  • the normal information estimation unit 131 of the three-dimensional (3D) shape analysis unit 130 estimates the normal information of the skin surface.
  • the normal is a line orthogonal to the object surface, that is, the skin surface.
  • the distance information conversion unit 132 of the three-dimensional (3D) shape analysis unit 130 converts the normal information on the skin surface estimated by the normal information estimation unit 131 into distance information indicating the uneven shape of the skin surface.
  • The distance information analysis unit 133 of the three-dimensional (3D) shape analysis unit 130 uses the distance information generated by the distance information conversion unit 132 to calculate and analyze index values that serve as evaluation indexes for the texture of the skin, such as the roughness coefficient of the skin surface.
  • Specifically, the normal information estimation unit 131 of the three-dimensional (3D) shape analysis unit 130 inputs the noise-removed skin image generated by the image analysis unit 120, that is, the "(e) noise-removed skin image" described with reference to FIG. 12, an image signal reflecting the uneven shape of the skin surface from which the influence of disturbances such as hair and spots has been removed, to the learner 301.
  • The learner 301 is, for example, a learner using a CNN (Convolutional Neural Network) or the like; its input is the "(e) noise-removed skin image", and its output is the pixel-by-pixel normal information of that input image.
  • The pixel-by-pixel normal information includes, for example, the following parameters:
  p: x-direction component value (nx) of the calculated normal
  q: y-direction component value (ny) of the calculated normal
  The x-direction and y-direction correspond to the x and y axes of the coordinate system shown in FIG. 9 described above.
  • That is, the normal information estimation unit 131 inputs the "(e) noise-removed skin image", the signal output from the image analysis unit 120, to the learner (CNN) 301, and outputs the normal information for each pixel.
  • The learner (CNN) 301 is generated by a learning process executed in advance using various image data. At the time of learning, a large number of pairs are prepared, each consisting of an image of actual skin or a replica and the corresponding unevenness information, measured separately by a 3D scanning device and converted into normal information (GT (Ground Truth) data), and the network weights are learned using a least-squares-error (L2) loss function. A specific example of this learning process will be described later.
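  • For illustration, the following is a minimal PyTorch-style sketch of one such training step, assuming a small CNN that maps a single-channel skin image to a two-channel (nx, ny) normal map and an L2 loss against scanned ground-truth normals. The architecture and all names are illustrative assumptions, not taken from this document.

```python
import torch
import torch.nn as nn

# Illustrative CNN: 1-channel skin image -> 2-channel (nx, ny) normal map.
model = nn.Sequential(
    nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 2, kernel_size=3, padding=1),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
l2_loss = nn.MSELoss()  # least-squares-error (L2) loss

def train_step(skin_batch, gt_normal_batch):
    """skin_batch: (B, 1, H, W) noise-removed skin images;
    gt_normal_batch: (B, 2, H, W) ground-truth (nx, ny) normals
    converted from 3D-scanner measurements."""
    optimizer.zero_grad()
    pred = model(skin_batch)
    loss = l2_loss(pred, gt_normal_batch)
    loss.backward()
    optimizer.step()
    return loss.item()
```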
  • the normal information estimation unit 131 of the three-dimensional (3D) shape analysis unit 130 estimates the normal information on the skin surface using the learner 301 shown in FIG.
  • the normal is a line orthogonal to the object surface, that is, the skin surface.
  • the distance information conversion unit 132 of the three-dimensional (3D) shape analysis unit 130 converts the normal information on the skin surface estimated by the normal information estimation unit 131 into distance information indicating the uneven shape of the skin surface.
  • The distance information is obtained from the per-pixel normal information using the distance calculation formula shown in (Equation 31).
  • the above (Equation 31) does not calculate the absolute distance between the camera and the subject.
  • the distance information (Z) calculated by the above (Equation 31) corresponds to the distance (shape) calculated by providing a certain reference point and integrating the gradient field from the reference point.
  • the distance (Z) is calculated so that the gradient field and the derivative of the shape match. In order to know the absolute distance from the camera to the subject, it is necessary to separately acquire the distance to the reference point.
  • In this way, the distance information conversion unit 132 of the three-dimensional (3D) shape analysis unit 130 converts the normal information on the skin surface estimated by the normal information estimation unit 131 into distance information indicating the uneven shape of the skin surface.
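  • As a reference, a minimal sketch of this normal-to-distance conversion using the Frankot-Chellappa algorithm (named later in this document as a usable form of (Equation 31)) in Python/NumPy follows; consistent with the note above, the depth is recovered only up to an additive constant, not as an absolute distance.

```python
import numpy as np

def frankot_chellappa(p, q):
    """Integrate a gradient field (p, q) into a depth map Z, up to an
    additive constant, by enforcing integrability in the Fourier domain."""
    h, w = p.shape
    u = np.fft.fftfreq(w) * 2 * np.pi  # spatial frequency along x
    v = np.fft.fftfreq(h) * 2 * np.pi  # spatial frequency along y
    uu, vv = np.meshgrid(u, v)
    fp, fq = np.fft.fft2(p), np.fft.fft2(q)
    denom = uu**2 + vv**2
    denom[0, 0] = 1.0                  # avoid division by zero at DC
    fz = (-1j * uu * fp - 1j * vv * fq) / denom
    fz[0, 0] = 0.0                     # DC term: depth offset is arbitrary
    return np.real(np.fft.ifft2(fz))
```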
  • the distance information analysis unit 133 analyzes the distance information calculated by the distance information conversion unit 132. For example, using the distance information generated by the distance information conversion unit 132, index values such as the roughness coefficient of the skin surface, which are evaluation indexes for the texture of the skin, are calculated and analyzed.
  • the depth map (distance image) shown in FIG. 15 is a map generated based on the distance information calculated by the distance information conversion unit 132. That is, it is a depth map (distance image) in which pixel values are set according to the distance in pixel units of the skin image taken by the image acquisition unit (camera) 110.
  • the distance information analysis unit 133 analyzes, for example, the distance information (profile) of the portion indicated by the line AB in the central portion from this depth map.
  • The graph on the right side of FIG. 15 is an example of the distance (depth) analysis data generated by the distance information analysis unit 133, and shows the change in distance (depth) of each pixel along the line AB in the depth map (distance image).
  • The distance information analysis unit 133 further uses the distance (depth) analysis data showing the change in distance (depth) of each pixel shown in FIG. 15 to calculate skin roughness index values such as the "average roughness" and the "maximum height" of the skin. A specific example will be described with reference to FIG. 16.
  • FIG. 16 shows a calculation example of the “average roughness” of the skin and the “maximum height” of the skin, which are the skin roughness index values calculated by the distance information analysis unit 133.
  • As shown in the figure, the average roughness (Za) of the skin is calculated by the following (Equation 32), the mean absolute deviation of the distance values from the average distance (Zave) in the calculation area:

  Za = (1/N) Σ |Zn − Zave| ... (Equation 32)

  • Here, each parameter is as follows.
  N: number of pixels in the calculation area
  Zn: distance value of the pixel n in the calculation area
  • The maximum height of the skin is calculated from the following parameters.
  Zp: difference between the maximum distance and the average distance (Zave) in the calculation area
  Zv: difference between the average distance (Zave) and the minimum distance in the calculation area
  • The maximum height is the sum of these two values, that is, the peak-to-valley range within the calculation area.
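  • These index values reduce to simple statistics over the depth map; a sketch in Python/NumPy follows (the helper name and the label Zv for the valley depth are illustrative).

```python
import numpy as np

def skin_roughness(depth):
    """Skin roughness index values over a calculation area (2D depth map)."""
    z_ave = depth.mean()
    za = np.abs(depth - z_ave).mean()  # average roughness (Equation 32)
    zp = depth.max() - z_ave           # peak height above the average
    zv = z_ave - depth.min()           # valley depth below the average
    max_height = zp + zv               # peak-to-valley range
    return za, max_height
```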
  • In this way, the three-dimensional (3D) shape analysis unit 130 analyzes the three-dimensional (3D) shape of the skin included in the image captured by the camera using the signal output from the image analysis unit 120, that is, the "(e) noise-removed skin image", the image signal reflecting the uneven shape of the skin surface from which the influence of disturbances such as hair and spots, described with reference to FIG. 12, has been removed.
  • the display unit 140 displays the data acquired and analyzed by each of the image acquisition unit 110, the image analysis unit 120, and the three-dimensional (3D) shape analysis unit 130.
  • the measurement information display unit 141 of the display unit 140 displays the information acquired or measured by the image acquisition unit 110.
  • the signal information display unit 142 of the display unit 140 displays the information analyzed by the image analysis unit 120.
  • the three-dimensional shape display unit 143 of the display unit 140 displays the three-dimensional shape information of the human skin analyzed by the three-dimensional (3D) shape analysis unit 130.
  • the measurement status display unit 144 of the display unit 140 displays information on the progress of processing being executed by the image acquisition unit 110 to the three-dimensional (3D) shape analysis unit 130.
  • An example of data displayed by the display unit 140 will be described with reference to FIGS. 17 to 19.
  • The example of display data shown in FIG. 17 displays the following image data:
  (a) Camera image
  (b) Depth map (distance image)
  (c) Three-dimensional (3D) image
  • the image captured by the camera is an image acquired from the image acquisition unit (camera) 110.
  • The (b) depth map (distance image) and the (c) three-dimensional (3D) image are images generated by the three-dimensional (3D) shape analysis unit 130. By looking at these images, the user can accurately grasp the shape and unevenness of his or her own skin.
  • The example of display data shown in FIG. 18 displays the following data:
  (a) Camera image
  (b) Depth map (distance image)
  (c) Distance (depth) analysis data
  • the image captured by the camera is an image acquired from the image acquisition unit (camera) 110.
  • the depth map (distance image) and (c) the distance (depth) analysis data are images generated by the three-dimensional (3D) shape analysis unit 130.
  • The example of display data shown in FIG. 19 displays the following image data:
  (a) Camera image
  (b) Melanin pigment concentration index value output image
  • the image captured by the camera is an image acquired from the image acquisition unit (camera) 110.
  • the melanin pigment concentration index value output image is an image generated by the image analysis unit 120.
  • FIG. 20 is a diagram showing a flowchart illustrating a sequence of processes executed by the image processing apparatus 100 of the present disclosure.
  • The processes according to the flowcharts shown in FIG. 20 and subsequent figures can be executed according to a program stored in the storage unit of the image processing apparatus 100.
  • For example, they can be executed as program execution processing by a processor such as a CPU having a program execution function.
  • Steps S101 to S106 The processes of steps S101 to S106 are processes executed by the image acquisition unit (camera) 110.
  • the image acquisition unit (camera) 110 turns on the polarized white LED of the illumination unit in step S101, and captures a skin image in step S102.
  • As described above, the image pickup element of the image pickup unit uses 2 × 2 = 4 pixels as one unit, and each of these four pixels is configured to pass only light of a different polarization direction.
  • As a result, four types of polarized images (a 0-degree polarized image, a 45-degree polarized image, a 90-degree polarized image, and a 135-degree polarized image) are captured.
  • the image acquisition unit (camera) 110 turns on the red (R) LED of the lighting unit in step S103, and takes a skin image in step S104.
  • the image acquisition unit (camera) 110 turns on the near-infrared (NIR) LED of the illumination unit in step S105, and takes a skin image in step S106.
  • Steps S107 to S109 The processes of steps S107 to S109 are processes executed by the image analysis unit 120.
  • the image analysis unit 120 executes the polarization signal analysis process in step S107.
  • This process is executed by the polarization signal analysis unit 121 of the image analysis unit 120.
  • The polarization signal analysis unit 121 of the image analysis unit 120 uses the polarized images acquired by the multi-color compatible polarized image acquisition unit 111 of the image acquisition unit 110 to separate the polarization component signal into a specular reflection light component and other components (internal scattered light, etc.).
  • This process is the process described above with reference to FIGS. 7 and 8, and includes demosaic process and polarization model estimation process.
  • That is, pixel value interpolation processing using the pixel values of a specific polarized image captured at one pixel out of every four is executed, and demosaic processing that sets the pixel values of that specific polarized image for all pixels is executed.
  • Step S108 Next, the image analysis unit 120 executes the pigment signal analysis process in step S108. This process is executed by the pigment signal analysis unit 122 of the image analysis unit 120.
  • The pigment signal analysis unit 122 analyzes the polarized images corresponding to red (R) light and near-infrared (NIR) light acquired by the multi-color compatible polarized image acquisition unit 111 of the image acquisition unit 110, and analyzes the pigment signals that cause disturbance in regions other than human skin.
  • Specifically, the pigment signal analysis unit 122 calculates the four-direction polarization component images (I(r0°), I(r45°), I(r90°), I(r135°)) from the image captured when the illumination B 222 in the illumination unit 220 of the image acquisition unit (camera) 110, that is, the red LED, is lit, and the four-direction polarization component images (I(nir0°), I(nir45°), I(nir90°), I(nir135°)) from the image captured when the illumination C 223, that is, the near-infrared (NIR) LED, is lit.
  • It then calculates the average image pixel values, for example I(nir) = (I(nir0°) + I(nir45°) + I(nir90°) + I(nir135°)) / 4 and likewise I(r), and from these the melanin pigment concentration index value (MI: Melanin Index) according to the above (Equation 23).
  • the melanin pigment concentration index value shows a high value in a region such as hair or a spot.
  • In the melanin pigment concentration index value (MI: Melanin Index) output image, pixels with a high index value are set to a pixel value distinct from that of the other regions (for example, a dark red pixel value).
  • The pigment signal analysis unit 122 generates such a melanin pigment concentration index value (MI: Melanin Index) output image.
  • Step S109 the image analysis unit 120 executes the signal discrimination process in step S109. This process is executed by the signal discrimination unit 123 of the image analysis unit 120.
  • The signal discrimination unit 123 uses the specular reflection component signal obtained by the polarization signal analysis unit 121 and the melanin pigment concentration index value (MI: Melanin Index) obtained by the pigment signal analysis unit 122 to selectively extract the shadow component caused by the minute irregularities of the skin surface, thereby generating a noise-removed skin image.
  • the "(e) noise-removing skin image" described above with reference to FIG. 12 is generated.
  • Specifically, the signal discrimination unit 123 synthesizes the (b) specular reflection component image and the (c) melanin pigment concentration index value output image shown in FIG. 12 to generate the (d) composite image.
  • The (d) composite image is an image in which pixel regions with a low specular reflection component and a high melanin pigment concentration index value are output as low pixel values (low brightness).
  • the signal discrimination unit 123 generates (e) a noise-removed skin image by using (b) a specular reflection component image and (d) a composite image shown in FIG. 12.
  • the noise-removed skin image is an image that reflects the uneven shape of the skin surface from which noise such as disturbance such as body hair and spots is removed.
  • the portion of the mirror reflection component image having a particularly high brightness value is affected by sweat, cosmetics (lame), etc.
  • the low pixels of the composite image are also obtained in these pixel regions. It may be output as a value (low brightness), and a process of generating (e) a noise-removed skin image from (d) the composite image and (b) the mirror reflection component image thus generated may be performed.
  • the image analysis unit 120 inputs the analysis results of the polarization signal analysis unit 121 and the dye signal analysis unit 122, and reflects the uneven shape of the skin surface from which the influence of disturbance such as body hair and spots is removed. Generates the image signal.
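A minimal sketch of this discrimination step follows. The exact combination rule and thresholds are not spelled out in this excerpt, so the masking logic and parameter names below are assumptions for illustration: disturbance pixels (low specular response, high MI) and glare pixels (very bright specular response from sweat or glitter) are suppressed, and the surviving pixels keep the shading caused by the fine unevenness of the skin surface.

```python
import numpy as np

def noise_removed_skin_image(specular, mi, mi_thresh, spec_low, spec_high):
    """Combine the specular reflection component image and the melanin
    index image into a composite mask, then gate the specular image.

    specular: specular reflection component image (e.g. Imax - Imin)
    mi:       melanin pigment concentration index image
    """
    disturbance = (specular < spec_low) & (mi > mi_thresh)   # hair, spots
    glare = specular > spec_high                             # sweat, glitter
    composite = np.where(disturbance | glare, 0.0, 1.0)      # (d): low = noise
    return specular * composite                              # (e) noise-removed
```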
Step S110: The processes of steps S110 to S112 are executed by the three-dimensional (3D) shape analysis unit 130. In step S110, the normal estimation process is executed. This process is executed by the normal information estimation unit 131 of the three-dimensional (3D) shape analysis unit 130, which estimates the normal information of the skin surface. A normal is a line orthogonal to the object surface, in this case the skin surface. As shown in FIG. 13, the normal information estimation unit 131 inputs the noise-removed skin image generated by the image analysis unit 120, that is, the "(e) noise-removed skin image", which is an image signal reflecting the uneven shape of the skin surface from which the influence of disturbances such as body hair and spots has been removed, into the learning device 301, and acquires, as the output of the learning device 301, the normal information for each pixel of the "(e) noise-removed skin image". The learning device 301 is, for example, a learning device using a CNN (Convolutional Neural Network) or the like. In this way, the normal information estimation unit 131 of the three-dimensional (3D) shape analysis unit 130 estimates the normal information of the skin surface using the learning device 301 shown in FIG. 13.
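The architecture of learning device 301 is not specified in this excerpt; the following PyTorch sketch is an illustrative stand-in, assuming a small fully convolutional network that maps the single-channel noise-removed skin image to a per-pixel unit normal. The class and function names are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NormalNet(nn.Module):
    """Illustrative stand-in for learning device 301: maps the noise-removed
    skin image (1 channel) to a per-pixel unit normal (3 channels)."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1),
        )

    def forward(self, x):
        return F.normalize(self.body(x), dim=1)  # enforce unit-length normals

def estimate_normals(model, image):
    """image: (H, W) numpy array -> (H, W, 3) normal map."""
    x = torch.from_numpy(image).float()[None, None]  # (1, 1, H, W)
    with torch.no_grad():
        n = model(x)[0]                              # (3, H, W)
    return n.permute(1, 2, 0).numpy()
```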
Step S111: Next, in step S111, the distance conversion process is executed. This process is executed by the distance information conversion unit 132 of the three-dimensional (3D) shape analysis unit 130. The distance information conversion unit 132 converts the normal information of the skin surface estimated by the normal information estimation unit 131 into distance information indicating the uneven shape of the skin surface. This is the process described above with reference to FIG. 14. As the distance calculation formula for obtaining the distance information, for example, the Frankot-Chellappa algorithm shown in (Equation 31) described above can be used.
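As a reference, here is a compact sketch of the Frankot-Chellappa integration in Python. It assumes gradients p = dz/dx and q = dz/dy derived from a normal map n = (nx, ny, nz) as p = -nx / nz and q = -ny / nz, and recovers, in the Fourier domain, the integrable surface whose gradients best match (p, q) in the least-squares sense.

```python
import numpy as np

def frankot_chellappa(p, q):
    """Integrate surface gradients (p, q) into a depth map, up to an
    additive offset, using the Frankot-Chellappa least-squares solution."""
    h, w = p.shape
    wx = np.fft.fftfreq(w) * 2.0 * np.pi         # angular frequency grids
    wy = np.fft.fftfreq(h) * 2.0 * np.pi
    u, v = np.meshgrid(wx, wy)
    P, Q = np.fft.fft2(p), np.fft.fft2(q)
    denom = u**2 + v**2
    denom[0, 0] = 1.0                            # avoid division by zero at DC
    Z = (-1j * u * P - 1j * v * Q) / denom
    Z[0, 0] = 0.0                                # depth recovered up to an offset
    return np.real(np.fft.ifft2(Z))
```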
Step S112: Next, in step S112, the distance analysis process is executed. This process is executed by the distance information analysis unit 133 of the three-dimensional (3D) shape analysis unit 130. The distance information analysis unit 133 analyzes the distance information calculated by the distance information conversion unit 132. For example, using the distance information generated by the distance information conversion unit 132, it calculates and analyzes index values such as the roughness coefficient of the skin surface, which serve as evaluation indexes for the texture of the skin. For example, the distance information analysis unit 133 analyzes, from the depth map, the distance information (profile) of the portion indicated by the line A-B in its central portion. The graph on the right side of FIG. 15 is an example of the distance (depth) analysis data generated by the distance information analysis unit 133; it shows the change in distance (depth) of each pixel along the line A-B in the depth map (distance image). Using the distance (depth) analysis data shown in FIG. 15, the distance information analysis unit 133 calculates skin roughness index values such as the "average roughness (Sa)" and "maximum height" of the skin, as described with reference to FIG. 16. In this way, in step S112 the three-dimensional (3D) shape analysis unit 130 analyzes the three-dimensional (3D) shape of the skin included in the image captured by the camera, using the signal output from the image analysis unit 120, that is, the "(e) noise-removed skin image" reflecting the uneven shape of the skin surface from which the influence of disturbances such as body hair and spots described with reference to FIG. 12 has been removed.
Step S113: Finally, in step S113, the analysis result is displayed on the display unit. This process is executed by the display unit 140. The display unit 140 displays the data acquired and analyzed by each of the image acquisition unit 110, the image analysis unit 120, and the three-dimensional (3D) shape analysis unit 130.
The display data shown in FIG. 17 is an example of displaying the following image data:
 (a) camera-captured image,
 (b) depth map (distance image), and
 (c) three-dimensional (3D) image.
The display data shown in FIG. 18 is an example of displaying the following data:
 (a) camera-captured image,
 (b) depth map (distance image), and
 (c) distance (depth) analysis data.
The display data shown in FIG. 19 is an example of displaying the following image data:
 (a) camera-captured image, and
 (b) melanin pigment concentration index value output image.
By viewing these displays, the user can accurately grasp the condition of his or her own skin, for example, the shape and unevenness of the skin, the condition of spots, and the like.
[5. Example of the learning process for generating the learning device used to calculate pixel-by-pixel normal information]
Next, an example of the learning process for generating the learning device used to calculate the pixel-by-pixel normal information will be described. As described above with reference to FIG. 13, the normal information estimation unit 131 of the three-dimensional (3D) shape analysis unit 130 inputs the noise-removed skin image generated by the image analysis unit 120, that is, the "(e) noise-removed skin image" reflecting the uneven shape of the skin surface from which the influence of disturbances such as body hair and spots has been removed, into the learning device 301, and acquires as its output the normal information for each pixel of the "(e) noise-removed skin image". The learning device 301 is, for example, a learning device using a CNN (Convolutional Neural Network) or the like, and is generated by a learning process executed in advance using various image data. In the following, GT denotes the Ground Truth, that is, the true value used for learning, and L2 denotes the least squares error.
FIG. 21 is a diagram illustrating an example of the generation of a learning device (CNN) 401, that is, an example of the machine learning process. A sample image 411 is input to the learning device (CNN) 401, whose output is the pixel-by-pixel normal information 412. The degree of similarity between the pixel-by-pixel normal information 412, which is the output obtained when the sample image 411 is input to the learning device 401, and the normal information 413, which is the true value (Ground Truth) for learning, is calculated as an L2 (least squares error) loss. The weights of the learning device (CNN) 401 are then updated by backpropagating the calculated loss. By repeating this process, the learning device (CNN) 401 is generated.
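A minimal sketch of one such training step follows, assuming the illustrative NormalNet model shown earlier; the optimizer choice and learning rate are assumptions, not values given in this excerpt.

```python
import torch
import torch.nn.functional as F

def train_step(model, optimizer, sample_image, gt_normals):
    """One step of the learning process of FIG. 21 (sketch): forward the
    sample image, compare the predicted per-pixel normals against the
    Ground Truth normals with an L2 (least squares) loss, and
    backpropagate to update the learner's weights.

    sample_image: (B, 1, H, W) tensor; gt_normals: (B, 3, H, W) tensor.
    """
    optimizer.zero_grad()
    pred = model(sample_image)            # (B, 3, H, W) unit normals
    loss = F.mse_loss(pred, gt_normals)   # L2 loss vs. Ground Truth
    loss.backward()                       # backpropagate the loss
    optimizer.step()                      # update the CNN weights
    return loss.item()

# Usage sketch (NormalNet as in the earlier example):
# model = NormalNet()
# optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
```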
In the example described here, the learning device is generated using a CNN, but the present disclosure is not limited to this. For example, the learning device may be generated using various other methods such as an RNN (Recurrent Neural Network). Likewise, in the example described here the weights of the learning device are updated by backpropagating the calculated loss, but the update method is not limited to this; the weights of the learning device may be updated using an arbitrary learning method such as stochastic gradient descent.
FIG. 22 is a flowchart illustrating the processing sequence of the learning process for generating the learning device. The processes of steps S201 to S209 of the flow shown in FIG. 22 are the same as the processes of steps S101 to S109 of the flow described above with reference to FIG. 20. The image generated in step S209 is used as a sample image for the learning process, and is applied to the learning process executed in step S210. By performing the learning process according to this sequence, the learning device is generated; that is, the learning device (CNN) 301 used by the normal information estimation unit 131 of the three-dimensional (3D) shape analysis unit 130 described above with reference to FIG. 13 can be generated. The generated learning device (CNN) 301 accepts as input the noise-removed skin image generated by the image analysis unit 120, that is, the "(e) noise-removed skin image" reflecting the uneven shape of the skin surface from which the influence of disturbances such as body hair and spots described with reference to FIG. 12 has been removed, and outputs the pixel-by-pixel normal information of that image.
[6. Other configuration examples of the image acquisition unit (camera)]
The image acquisition unit (camera) 110 can also have configurations other than that shown in FIG. 4. FIG. 23 shows a configuration example of the image acquisition unit (camera) that is different from the configuration shown in FIG. 4. The image acquisition unit (camera) 500 shown in FIG. 23 also has an image pickup unit 510 and an illumination unit 520 around the image pickup unit. The illumination unit 520 around the image pickup unit 510 is composed of the following four types of illumination:
 (a) Illumination A = illumination A521, a white LED with a polarizing filter installed on its front surface in the direction parallel to the polarizing filter set in the image pickup unit 510,
 (b) Illumination B = illumination B522, a white LED with a polarizing filter installed on its front surface in the direction orthogonal to the polarizing filter set in the image pickup unit 510,
 (c) Illumination C = illumination C523, composed of a red LED, and
 (d) Illumination D = illumination D524, composed of a near-infrared (NIR) LED.
Illuminations A and B are composed of LEDs that output light in the visible region of about 400 to 700 nm. Illumination C is composed of an LED that outputs light in the red (R) region of about 660 nm. Illumination D is composed of an LED that outputs light in the near-infrared (NIR) region of about 880 nm.
The image acquisition unit (camera) 500 sequentially turns on these four types of illumination A to D for the same skin area, and acquires four images taken under four different illumination environments. The image pickup unit 510 is configured as a camera with a polarizing filter mounted on its front surface, from which the infrared (IR) cut filter attached to many general cameras has been removed so that near-infrared light can be imaged. The image sensor of the image pickup unit 510 is similar to that of a normal camera, with a polarizing filter installed in front of it.
The processing when this image acquisition unit (camera) 500 is used differs from the processing when the image acquisition unit (camera) 110 described with reference to FIG. 4 is used in the following points. The illuminations are turned on sequentially: the white LED with the parallel-direction filter, the white LED with the orthogonal-direction filter, the red LED, and the near-infrared (NIR) LED. The specular reflection component Is is then obtained as
 Is = Imax - Imin
where the maximum luminance value Imax is taken from the pixels of the image captured when the polarization directions of the white LED illumination and the camera are parallel, and the minimum luminance value Imin is taken from the image captured when the polarization directions of the white LED illumination and the camera are orthogonal.
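In code form, this separation is a per-pixel subtraction of the two white-LED captures; a minimal sketch (function and parameter names are illustrative):

```python
import numpy as np

def specular_component(img_parallel, img_orthogonal):
    """Specular separation for the FIG. 23 camera (sketch).

    With the polarizing filters of illumination and camera parallel, the
    image contains both the specular reflection and the internal
    scattering (Imax); with them orthogonal, the specular reflection is
    blocked (Imin). Hence Is = Imax - Imin.
    """
    imax = img_parallel.astype(float)
    imin = img_orthogonal.astype(float)
    return np.maximum(imax - imin, 0.0)  # clip sensor noise below zero
```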
FIG. 24 shows yet another configuration example of the image acquisition unit (camera), different from the configuration shown in FIG. 4. The image acquisition unit (camera) 600 shown in FIG. 24 also has an image pickup unit 610 and an illumination unit 620 around the image pickup unit. The illumination unit 620 around the image pickup unit 610 is composed of the following three types of illumination:
 (a) Illumination A = illumination A621, a white LED with a polarizing filter installed on its front surface in the direction parallel to the polarizing filter set in the image pickup unit 610,
 (b) Illumination B = illumination B622, a white LED with a polarizing filter installed on its front surface in the direction orthogonal to the polarizing filter set in the image pickup unit 610, and
 (c) Illumination C = illumination C623, composed of a white LED.
Illuminations A, B, and C are all composed of LEDs that output light in the visible region of about 400 to 700 nm. The image acquisition unit (camera) 600 sequentially turns on these three types of illumination A to C for the same skin area, and acquires three images taken under three different illumination environments. The image pickup unit 610 is configured as a camera with a polarizing filter mounted on its front surface, from which the infrared (IR) cut filter attached to many general cameras has been removed. The image sensor of the image pickup unit 610 is similar to that of a normal camera, with a polarizing filter installed in front of it; further, a color filter 611 is mounted on the front surface of the polarizing filter. The color filter 611 has a configuration in which the following three types of filters are arranged:
 (a) a red (R) filter that selectively transmits light with a wavelength near 660 nm,
 (b) a near-infrared (NIR) filter that selectively transmits light with a wavelength near 880 nm, and
 (c) a visible light (Vis) filter that selectively transmits light with wavelengths of about 400 to 700 nm.
The processing when this image acquisition unit (camera) 600 is used differs from the processing when the image acquisition unit (camera) 110 described with reference to FIG. 4 is used in the following points. The color filter 611 installed in front of the image pickup unit 610 is moved sequentially to change the wavelength band of the light incident on the image pickup unit 610, thereby obtaining a visible-light component polarized image, a red component polarized image, and a near-infrared (NIR) component polarized image. Using these images, the polarization signal analysis processing in the polarization signal analysis unit 121, the dye signal analysis processing in the dye signal analysis unit 122, and the signal discrimination processing in the signal discrimination unit 123 are executed.
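The capture sequence for this configuration can be sketched as follows. Every name here (Camera, FilterWheel, set_position, capture) is a hypothetical interface for illustration, not a real driver API; only the three-band sequence itself comes from the description above.

```python
# Hypothetical capture driver for the FIG. 24 camera (illustrative only).
FILTERS = ("vis", "red", "nir")   # Vis ~400-700 nm, R ~660 nm, NIR ~880 nm

def capture_polarized_set(camera, wheel):
    """Move the color filter in front of the image pickup unit step by step
    and capture one polarized image per wavelength band."""
    images = {}
    for name in FILTERS:
        wheel.set_position(name)          # place the matching filter segment
        images[name] = camera.capture()   # polarized image for this band
    return images
```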
[7. Hardware configuration example of the image processing device]
FIG. 25 is a diagram showing a hardware configuration example of the image processing device. Each component of the hardware configuration shown in FIG. 25 will be described. The CPU (Central Processing Unit) 701 functions as a data processing unit that executes various processes according to a program stored in the ROM (Read Only Memory) 702 or the storage unit 708; for example, it executes the processing according to the sequences described in the above embodiments. The RAM (Random Access Memory) 703 stores the programs executed by the CPU 701 and their data. The CPU 701, ROM 702, and RAM 703 are connected to each other by a bus 704. The CPU 701 is connected to the input/output interface 705 via the bus 704. Connected to the input/output interface 705 are an input unit 706 consisting of the camera as well as various operation units, switches, and the like, and an output unit 707 including a display serving as the display unit, a speaker, and the like. The CPU 701 receives captured camera images and operation information input from the input unit 706, executes various processes, and outputs the processing results to, for example, the output unit 707. The storage unit 708 connected to the input/output interface 705 is composed of, for example, a hard disk, and stores the programs executed by the CPU 701 and various data. The communication unit 709 functions as a transmission/reception unit for data communication via a network such as the Internet or a local area network, and communicates with external devices. The drive 710 connected to the input/output interface 705 drives removable media 711 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory such as a memory card, and records or reads data.
[8. Summary of the configuration of the present disclosure]
The technology disclosed in the present specification can have the following configurations.
(1) An image processing device including:
 an image acquisition unit that acquires a skin image;
 an image analysis unit that analyzes the image acquired by the image acquisition unit; and
 a three-dimensional shape analysis unit that analyzes the three-dimensional shape of the skin using the analysis results of the image analysis unit,
 in which the image acquisition unit acquires a plurality of polarized images of light of different wavelengths,
 the image analysis unit analyzes the polarized images to generate a noise-removed skin image from which noise has been removed, and
 the three-dimensional shape analysis unit analyzes the three-dimensional shape of the skin using the noise-removed skin image.
(2) The image processing device according to (1), in which the image analysis unit analyzes the polarized images to generate a specular reflection component image of the skin surface and a melanin pigment concentration index value image, and generates the noise-removed skin image using the generated specular reflection component image and melanin pigment concentration index value image.
(4) The image processing device according to any one of (1) to (3), in which the image acquisition unit has an illumination unit that selectively outputs light of different wavelengths.
(5) The image processing device according to any one of (1) to (4), in which the image acquisition unit has an illumination unit that selectively outputs three types of light of different wavelengths, namely white light, red light, and near-infrared light, and acquires polarized images corresponding to the three types of light of different wavelengths.
(6) The image processing device according to any one of (1) to (5), in which the image acquisition unit has a configuration for capturing a plurality of different polarized images on a pixel-by-pixel basis.
(7) The image processing device according to (6), in which the image analysis unit performs demosaic processing of the plurality of different polarized images captured on a pixel-by-pixel basis.
(8) The image processing device according to any one of (1) to (7), in which the image analysis unit generates a plurality of different polarized images using the image input from the image acquisition unit, and generates a specular reflection component image of the skin surface based on the generated plurality of polarized images and the correspondence data between the polarization angle and the brightness.
(9) The image processing device according to any one of (1) to (8), in which the image analysis unit generates a plurality of different polarized images using the image input from the image acquisition unit, and, based on the generated plurality of polarized images and a polarization model that is the correspondence data between the polarization angle and the brightness, separates the polarization component signal into the specular reflection light component and the other component signals to generate the specular reflection component image of the skin surface.
(11) The image processing device according to any one of (1) to (10), in which the image analysis unit generates a melanin pigment concentration index value image using an image captured under red light illumination and an image captured under near-infrared light illumination, input from the image acquisition unit.
(12) The image processing device according to any one of (1) to (11), in which the image analysis unit generates a composite image of the specular reflection component image of the skin surface generated by analyzing the polarized images and the melanin pigment concentration index value image, and generates the noise-removed skin image from the generated composite image and the specular reflection component image.
(13) The image processing device according to any one of (1) to (12), in which the three-dimensional shape analysis unit has:
 a normal information estimation unit that estimates the normal information of the skin surface;
 a distance information conversion unit that converts the normal information of the skin surface estimated by the normal information estimation unit into distance information indicating the uneven shape of the skin surface; and
 a distance information analysis unit that calculates an evaluation index value based on an evaluation of the uneven shape of the skin surface using the distance information generated by the distance information conversion unit.
(14) The image processing device according to (13), in which the distance information analysis unit calculates at least one of the average roughness and the maximum height of the skin using a depth map showing the unevenness of the skin.
(15) The image processing device according to (13) or (14), in which the normal information estimation unit inputs the noise-removed skin image generated by the image analysis unit into a learning device, and acquires the normal information of the skin surface as the output of the learning device.
(16) The image processing device according to any one of (1) to (15), further including a display unit that displays at least one of the analysis results of the image analysis unit or the analysis results of the three-dimensional shape analysis unit.
(17) The image processing device according to (16), in which the display unit displays at least one of a three-dimensional image of the skin surface, a depth map showing the unevenness of the skin surface, or a melanin pigment concentration index value image.
(18) An image processing method executed in an image processing device, the method including:
 an image acquisition process in which an image acquisition unit acquires a skin image;
 an image analysis process in which an image analysis unit analyzes the image acquired by the image acquisition unit; and
 a three-dimensional shape analysis process in which a three-dimensional shape analysis unit analyzes the three-dimensional shape of the skin using the analysis results of the image analysis unit,
 in which the image acquisition unit acquires a plurality of polarized images of light of different wavelengths,
 the image analysis unit analyzes the polarized images to generate a noise-removed skin image from which noise has been removed, and
 the three-dimensional shape analysis unit analyzes the three-dimensional shape of the skin using the noise-removed skin image.
(19) A program for causing an image processing device to execute image processing, the program causing:
 an image acquisition unit to execute an image acquisition process of acquiring a skin image;
 an image analysis unit to execute an image analysis process of analyzing the image acquired by the image acquisition unit; and
 a three-dimensional shape analysis unit to execute a three-dimensional shape analysis process of analyzing the three-dimensional shape of the skin using the analysis results of the image analysis unit,
 in which, in the image acquisition process, a plurality of polarized images of light of different wavelengths are acquired,
 in the image analysis process, the polarized images are analyzed to generate a noise-removed skin image from which noise has been removed, and
 in the three-dimensional shape analysis process, the three-dimensional shape of the skin is analyzed using the noise-removed skin image.
The series of processes described in the specification can be executed by hardware, software, or a combined configuration of both. When the processes are executed by software, the program can be pre-recorded on a recording medium, or it can be received via a network such as a LAN (Local Area Network) or the Internet and installed on a recording medium such as a built-in hard disk. The various processes described in the specification are not only executed in chronological order according to the description, but may also be executed in parallel or individually according to the processing capacity of the device executing them or as required. In the present specification, a system is a logical set configuration of a plurality of devices, and the devices of each configuration are not limited to those in the same housing.
As described above, according to the configuration of one embodiment of the present disclosure, a noise-removed skin image that accurately reflects the unevenness of the skin, with noise such as body hair and spots on the user's face removed, is generated, and a configuration that enables highly accurate analysis of the three-dimensional shape of the skin is realized. Specifically, the device has, for example, an image acquisition unit that acquires an image of skin such as the face, an image analysis unit that analyzes the skin image acquired by the image acquisition unit, and a three-dimensional shape analysis unit that analyzes the three-dimensional shape of the skin using the analysis results of the image analysis unit. The image acquisition unit acquires a plurality of polarized images of light of different wavelengths; the image analysis unit analyzes the polarized images to generate a specular reflection component image of the skin surface and a melanin pigment concentration index value image, and uses these generated images to generate a noise-removed skin image from which noise such as body hair and spots has been removed. The three-dimensional shape analysis unit then analyzes the three-dimensional shape of the skin with high accuracy using this noise-removed skin image. With this configuration, it is possible to generate a noise-removed skin image that reflects the unevenness of the skin with high accuracy, with noise such as body hair and spots on the user's face removed, and to analyze the three-dimensional shape of the skin with high accuracy.
100 Image processing device
110 Image acquisition unit (camera)
111 Multi-color compatible polarized image acquisition unit
120 Image analysis unit
121 Polarization signal analysis unit
122 Dye signal analysis unit
123 Signal discrimination unit
130 Three-dimensional (3D) shape analysis unit
131 Normal information estimation unit
132 Distance information conversion unit
133 Distance information analysis unit
140 Display unit
141 Measurement information display unit
142 Signal information display unit
143 Three-dimensional shape display unit
144 Measurement status display unit
210 Imaging unit
220 Illumination unit
221 to 223 Illuminations A to C
301 Learning device
401 Learning device
500 Image acquisition unit (camera)
510 Imaging unit
520 Illumination unit
600 Image acquisition unit (camera)
610 Imaging unit
620 Illumination unit
701 CPU
702 ROM
703 RAM
704 Bus
705 Input/output interface
706 Input unit
707 Output unit
708 Storage unit
709 Communication unit
710 Drive
711 Removable media

Abstract

The objective of the present invention is to realize a configuration that generates a noise-removed skin image reflecting the irregularities of the skin with a high degree of accuracy, with noise such as body hair and blemishes on the user's face removed, and that enables highly accurate analysis of the three-dimensional shape of the skin. This image processing device includes: an image acquisition unit that acquires an image of the skin of the face or the like; an image analysis unit that analyzes the skin image acquired by the image acquisition unit; and a three-dimensional shape analysis unit that analyzes the three-dimensional shape of the skin using the analysis results obtained by the image analysis unit. The image acquisition unit acquires a plurality of polarized images of light with different wavelengths; the image analysis unit generates a specular reflection component image of the skin surface and a melanin pigment concentration index value image, and uses the generated images to generate a noise-removed skin image from which noise such as body hair and blemishes has been removed. The three-dimensional shape analysis unit uses the noise-removed skin image to analyze the three-dimensional shape of the skin with high accuracy.

Description

Image processing device, image processing method, and program
The present disclosure relates to an image processing device, an image processing method, and a program. More specifically, it relates to an image processing device, an image processing method, and a program that execute analysis processing of human skin.
It is common practice to photograph the surface of human skin with a close-up camera, observe and diagnose the skin condition based on the captured image, quantify the condition of the skin, such as its texture and pores, by analyzing the captured image, and provide care from the viewpoint of health and beauty based on the results.
For example, when analyzing the texture and pores of the skin surface, it is necessary to analyze the smoothness and uneven shape of the skin surface with high accuracy. The accuracy of such analysis varies depending on the illumination at the time of capturing the camera image and the degree of shading caused by the surface unevenness. Furthermore, there are spots and body hair on the skin surface that are easily misrecognized as shadows, and such misrecognition may cause errors in the analysis results.
Therefore, in order to realize highly accurate analysis processing, it is necessary to separate the shadows caused by the actual uneven shape of the skin from the shadow components caused by other factors such as spots and body hair.
As prior art disclosing methods for solving this problem, there are, for example, Patent Document 1 (JP 2015-187849 A) and Patent Document 2 (JP 2013-188341 A).
Patent Document 1 (JP 2015-187849 A) discloses a configuration in which a body hair (eyelash) region and a skin region are separated by performing binarization processing on the difference between a gray image and an edge image. Patent Document 2 (JP 2013-188341 A) discloses a configuration in which a body hair region and a skin region are separated using the color components of an image.
However, these methods suffer from problems such as errors in which shadows due to the skin grooves of the skin surface are misrecognized as body hair, and the inability to distinguish shadows due to skin grooves in spot regions.
[Patent Document 1] JP 2015-187849 A
[Patent Document 2] JP 2013-188341 A
The present disclosure has been made in view of the above problems, for example. A polarized image of human skin is captured, and the captured polarized image is analyzed to separate specular reflection from internal scattering and the like, thereby excluding the influence of moles, spots, and so on.
Thereafter, spectral images are captured, and by analyzing the spectral images, regions containing a large amount of melanin pigment, such as body hair and spots, are detected and excluded from the analysis target, so that only the shadow components caused by the unevenness of the skin surface are selectively acquired; this selected data is analyzed to generate analysis data of the texture and unevenness of the skin surface.
An object of the present disclosure is to provide an image processing device, an image processing method, and a program that realize highly accurate analysis processing of human skin through these processes.
The first aspect of the present disclosure is an image processing device including:
 an image acquisition unit that acquires a skin image;
 an image analysis unit that analyzes the image acquired by the image acquisition unit; and
 a three-dimensional shape analysis unit that analyzes the three-dimensional shape of the skin using the analysis results of the image analysis unit,
 in which the image acquisition unit acquires a plurality of polarized images of light of different wavelengths,
 the image analysis unit analyzes the polarized images to generate a noise-removed skin image from which noise has been removed, and
 the three-dimensional shape analysis unit analyzes the three-dimensional shape of the skin using the noise-removed skin image.
Further, the second aspect of the present disclosure is an image processing method executed in an image processing device, the method including:
 an image acquisition process in which an image acquisition unit acquires a skin image;
 an image analysis process in which an image analysis unit analyzes the image acquired by the image acquisition unit; and
 a three-dimensional shape analysis process in which a three-dimensional shape analysis unit analyzes the three-dimensional shape of the skin using the analysis results of the image analysis unit,
 in which the image acquisition unit acquires a plurality of polarized images of light of different wavelengths,
 the image analysis unit analyzes the polarized images to generate a noise-removed skin image from which noise has been removed, and
 the three-dimensional shape analysis unit analyzes the three-dimensional shape of the skin using the noise-removed skin image.
Further, the third aspect of the present disclosure is a program for causing an image processing device to execute image processing, the program causing:
 an image acquisition unit to execute an image acquisition process of acquiring a skin image;
 an image analysis unit to execute an image analysis process of analyzing the image acquired by the image acquisition unit; and
 a three-dimensional shape analysis unit to execute a three-dimensional shape analysis process of analyzing the three-dimensional shape of the skin using the analysis results of the image analysis unit,
 in which, in the image acquisition process, a plurality of polarized images of light of different wavelengths are acquired,
 in the image analysis process, the polarized images are analyzed to generate a noise-removed skin image from which noise has been removed, and
 in the three-dimensional shape analysis process, the three-dimensional shape of the skin is analyzed using the noise-removed skin image.
The program of the present disclosure is, for example, a program that can be provided by a storage medium or a communication medium that provides it in a computer-readable format to an information processing device or a computer system capable of executing various program codes. By providing such a program in a computer-readable format, processing according to the program is realized on the information processing device or the computer system.
Still other objects, features, and advantages of the present disclosure will become apparent from the more detailed description based on the embodiments of the present disclosure and the accompanying drawings described later. In the present specification, a system is a logical set configuration of a plurality of devices, and the devices of each configuration are not limited to those in the same housing.
According to the configuration of one embodiment of the present disclosure, a noise-removed skin image that reflects the unevenness of the skin with high accuracy, with noise such as body hair and spots on the user's face removed, is generated, and a configuration that enables highly accurate analysis of the three-dimensional shape of the skin is realized.
Specifically, the device has, for example, an image acquisition unit that acquires an image of skin such as the face, an image analysis unit that analyzes the skin image acquired by the image acquisition unit, and a three-dimensional shape analysis unit that analyzes the three-dimensional shape of the skin using the analysis results of the image analysis unit. The image acquisition unit acquires a plurality of polarized images of light of different wavelengths; the image analysis unit analyzes the polarized images to generate a specular reflection component image of the skin surface and a melanin pigment concentration index value image, and uses these generated images to generate a noise-removed skin image from which noise such as body hair and spots has been removed. The three-dimensional shape analysis unit analyzes the three-dimensional shape of the skin with high accuracy using this noise-removed skin image.
With this configuration, it is possible to generate a noise-removed skin image that reflects the unevenness of the skin with high accuracy, with noise such as body hair and spots on the user's face removed, and to analyze the three-dimensional shape of the skin with high accuracy.
Note that the effects described in the present specification are merely examples and are not limiting, and there may be additional effects.
The drawings are, in order: diagrams explaining the processing executed by the image processing device of the present disclosure (FIGS. 1 and 2); a diagram explaining a configuration example of the image processing device (FIG. 3); diagrams explaining configuration examples of the image acquisition unit of the image processing device (FIGS. 4 to 6); a diagram explaining the demosaic processing executed by the image analysis unit (FIG. 7); diagrams explaining the polarization signal analysis processing executed by the image analysis unit (FIGS. 8 and 9); a diagram explaining the dye signal analysis processing executed by the image analysis unit (FIG. 10); diagrams explaining the signal discrimination processing executed by the image analysis unit (FIGS. 11 and 12); a diagram explaining the normal information calculation processing executed by the three-dimensional shape analysis unit (FIG. 13); a diagram explaining the distance conversion processing executed by the three-dimensional shape analysis unit (FIG. 14); diagrams explaining the distance analysis processing executed by the three-dimensional shape analysis unit (FIGS. 15 and 16); diagrams explaining examples of the analysis data display processing executed by the display unit (FIGS. 17 to 19); a flowchart explaining the sequence of processing executed by the image processing device (FIG. 20); a diagram explaining the learning processing executed by the image processing device (FIG. 21); a flowchart explaining a processing sequence executed by the image processing device (FIG. 22); diagrams explaining configuration examples of the image acquisition unit of the image processing device (FIGS. 23 and 24); and a diagram explaining a hardware configuration example of the image processing device (FIG. 25).
Hereinafter, the details of the image processing device, the image processing method, and the program of the present disclosure will be described with reference to the drawings. The description will be given according to the following items.
 1. Outline of the processing executed by the image processing device of the present disclosure
 2. Configuration example of the image processing device of the present disclosure
 3. Configuration of each component of the image processing device and details of the processing executed
 3-(1). Configuration and processing of the image acquisition unit
 3-(2). Configuration and processing of the image analysis unit
 3-(3). Configuration and processing of the three-dimensional (3D) shape analysis unit
 3-(4). Configuration and processing of the display unit
 4. Sequence of the processing executed by the image processing device
 5. Example of the learning process for generating the learning device used to calculate pixel-by-pixel normal information
 6. Other configuration examples of the image acquisition unit (camera)
 7. Hardware configuration example of the image processing device
 8. Summary of the configuration of the present disclosure
[1. Outline of the processing executed by the image processing device of the present disclosure]
First, an outline of the processing executed by the image processing device of the present disclosure will be described with reference to FIG. 1 and the subsequent figures.
The image processing device of the present disclosure captures, for example, the skin of a person's face with a close-up camera, analyzes the captured image, and generates and displays a highly accurate analysis result.
The outline of the processing executed by the image processing device of the present disclosure is as follows. For example, using a polarization sensor camera, a polarized image of the skin of a person's face is captured, and the captured polarized image is analyzed to separate specular reflection from internal scattering and the like, thereby excluding the influence of moles, spots, and so on.
Thereafter, spectral images are captured, and by analyzing the spectral images, regions containing a large amount of melanin pigment, for example body hair and spot regions, are detected and excluded from the analysis target, so that only the shadow components caused by the unevenness of the skin surface are selectively acquired; this selected data is analyzed to generate analysis data of the texture and unevenness of the skin surface.
Through these processes, the present disclosure analyzes the shape of the skin surface, that is, the three-dimensional (3D) shape of the skin, without being affected by body hair and spots, and based on the analysis results, generates highly accurate analysis data of human skin, such as wrinkles and texture, and provides it to the user.
FIGS. 1 and 2 are diagrams showing examples of the UI (user interface) displayed on the display unit of the image processing device of the present disclosure.
FIG. 1 is an example of the initial screen displayed to the user. As shown in FIG. 1, the initial screen includes the following display data:
 (a) a user operation guide image,
 (b) a camera-captured skin image, and
 (c) a shooting start icon.
(a) The user operation guide image is an explanatory image for explaining the operation to be performed by the user. The example shown in the figure explains that the camera should be placed against the cheek for shooting.
(b) The camera-captured skin image is the actual image being captured by the camera placed against the user's cheek.
(c) The shooting start icon is an icon corresponding to a switch (shutter) that, when touched by the user, causes the camera to take a picture.
When the user touches the shooting start icon in accordance with this initial screen, a skin image of the user's face is captured. When the image is captured, the image processing device starts the analysis processing of the captured image, and when the analysis processing is completed, it generates the analysis result and displays it on the display unit.
FIG. 2 is a diagram showing an example of the display data of the analysis result. The example shown in FIG. 2 is a display example of the texture analysis result of the user's skin. Note that there are various types of analysis data, and the example shown in FIG. 2 is one of them.
In the example shown in FIG. 2, the texture evaluation values of three locations and a comprehensive evaluation value are displayed, based on the captured images of three skin areas of the user: the forehead, cheek, and chin. In addition, the user's skin image, images of analysis results corresponding to the skin image, and the like are also displayed. As described above, the analysis data is not limited to the data shown in FIG. 2, and there are various other types of data.
[2. Configuration example of the image processing device of the present disclosure]
Next, a configuration example of the image processing device of the present disclosure will be described.
FIG. 3 is a diagram showing a configuration example of the image processing device of the present disclosure. As shown in FIG. 3, the image processing device 100 of the present disclosure has an image acquisition unit (camera) 110, an image analysis unit 120, a three-dimensional (3D) shape analysis unit 130, and a display unit 140.
The image acquisition unit (camera) 110 is, for example, a close-up camera that photographs the skin of a person's face, and has a multi-color compatible polarized image acquisition unit 111. The image analysis unit 120 has a polarization signal analysis unit 121, a dye signal analysis unit 122, and a signal discrimination unit 123. The three-dimensional (3D) shape analysis unit 130 has a normal information estimation unit 131, a distance information conversion unit 132, and a distance information analysis unit 133. The display unit 140 has a measurement information display unit 141, a signal information display unit 142, a three-dimensional shape display unit 143, and a measurement status display unit 144.
First, an outline of the processing executed by these components will be described. The details of the processing executed by each component will be described in sequence later.
The image acquisition unit (camera) 110 captures an image of the measurement target, for example, the skin of the user's face, and acquires image data to be analyzed by the image analysis unit 120 in the subsequent stage. The multi-color compatible polarized image acquisition unit 111 of the image acquisition unit 110 performs processing to acquire polarized images of a plurality of colors, specifically, for example, white light, red light, and near-infrared (NIR) light.
The image analysis unit 120 receives the measurement results of the image acquisition unit 110 and performs signal analysis. The polarization signal analysis unit 121 of the image analysis unit 120 uses the polarized images acquired by the multi-color compatible polarized image acquisition unit 111 of the image acquisition unit 110 to separate the polarization component signal into the specular reflection light component and the other components (internal scattering light and the like). The dye signal analysis unit 122 of the image analysis unit 120 analyzes the red (R) light and near-infrared (NIR) light polarized images acquired by the multi-color compatible polarized image acquisition unit 111, and performs processing to analyze dye signals that act as disturbances other than the human skin itself. The signal discrimination unit 123 of the image analysis unit 120 receives the analysis results of the polarization signal analysis unit 121 and the dye signal analysis unit 122, and generates an image signal reflecting the uneven shape of the skin surface from which the influence of disturbances such as body hair and spots has been removed.
The three-dimensional (3D) shape analysis unit 130 analyzes the three-dimensional (3D) shape of the skin included in the camera-captured image using the signal output from the image analysis unit 120. The normal information estimation unit 131 of the three-dimensional (3D) shape analysis unit 130 estimates the normal information of the skin surface; a normal is a line orthogonal to the object surface, which in the processing of the present disclosure corresponds to a line orthogonal to the skin surface. The distance information conversion unit 132 converts the normal information of the skin surface estimated by the normal information estimation unit 131 into distance information indicating the uneven shape of the skin surface. The distance information analysis unit 133 uses the distance information generated by the distance information conversion unit 132 to calculate and analyze index values such as the roughness coefficient of the skin surface, which serve as evaluation indexes for the texture of the skin.
 表示部140は、画像取得部(カメラ)110、画像解析部120、3次元(3D)形状解析部130の各々において取得、解析されたデータを表示する。
 表示部140の測定情報表示部141は、画像取得部110が取得、または測定した情報を表示する。
 表示部140の信号情報表示部142は、画像解析部120が解析した情報を表示する。
The display unit 140 displays the data acquired and analyzed by each of the image acquisition unit (camera) 110, the image analysis unit 120, and the three-dimensional (3D) shape analysis unit 130.
The measurement information display unit 141 of the display unit 140 displays the information acquired or measured by the image acquisition unit 110.
The signal information display unit 142 of the display unit 140 displays the information analyzed by the image analysis unit 120.
 表示部140の3次元形状表示部143は、3次元(3D)形状解析部130が解析した人の肌の3次元形状情報を表示する。
 表示部140の測定状況表示部144は、画像取得部110~3次元(3D)形状解析部130において実行中の処理の進行度情報等を表示する。
The three-dimensional shape display unit 143 of the display unit 140 displays the three-dimensional shape information of the human skin analyzed by the three-dimensional (3D) shape analysis unit 130.
The measurement status display unit 144 of the display unit 140 displays information on the progress of processing being executed by the image acquisition unit 110 to the three-dimensional (3D) shape analysis unit 130.
  [3.画像処理装置の各構成要素の構成、実行する処理の詳細について]
 次に、本開示の画像処理装置100の各構成要素の構成、実行する処理の詳細について説明する。
[3. About the configuration of each component of the image processing device and the details of the processing to be executed]
Next, the configuration of each component of the image processing apparatus 100 of the present disclosure and the details of the processing to be executed will be described.
 The configuration and processing of each of the following components will be described in order.
 (1) Details of the configuration and processing of the image acquisition unit
 (2) Details of the configuration and processing of the image analysis unit
 (3) Details of the configuration and processing of the three-dimensional (3D) shape analysis unit
 (4) Details of the configuration and processing of the display unit
  (3-(1). Details of the configuration and processing of the image acquisition unit)
 First, the details of the configuration and processing of the image acquisition unit (camera) 110 will be described.
 As described above, the image acquisition unit (camera) 110 captures an image of the measurement target, for example the skin of the user's face, with its close-up camera.
 The multi-color polarized image acquisition unit 111 of the image acquisition unit (camera) 110 performs processing to acquire polarized images under light of a plurality of colors, specifically, for example, white light, red light, and near-infrared (NIR) light.
 FIG. 4 is a diagram showing a configuration example of the image acquisition unit (camera) 110.
 As shown in FIG. 4, the image acquisition unit (camera) 110 has an imaging unit 210 and an illumination unit 220 around the imaging unit.
 As shown in the figure, the illumination unit 220 around the imaging unit 210 is composed of the following three types of illumination:
 (a) Illumination A = illumination A221, a white LED with a polarizing filter of a certain direction placed in front of it,
 (b) Illumination B = illumination B222, composed of red LEDs,
 (c) Illumination C = illumination C223, composed of near-infrared (NIR) LEDs.
 The illumination A221 is composed of LEDs that output light in the visible wavelength region of about 400 to 700 nm.
 The illumination B222 is composed of LEDs that output light in the red (R) wavelength region of about 660 nm.
 The illumination C223 is composed of LEDs that output light in the near-infrared (NIR) wavelength region of about 880 nm.
 The image acquisition unit (camera) 110 sequentially turns on these three types of illumination A to C for the same skin area and acquires three images captured under three different illumination environments.
 The imaging unit 210 is composed of a polarization sensor camera. Note that the infrared (IR) cut filter attached to many general cameras has been removed from it.
 The detailed configuration of the imaging unit 210 will be described with reference to FIGS. 5 and 6.
 As shown in FIG. 5, each pixel constituting the image sensor of the imaging unit 210 is provided with a polarizer that functions as an optical filter passing only light polarized in a specific direction. A photoelectric conversion element that receives the light passing through the polarizer is provided below the polarizer.
 The polarizers set for the pixels constituting the image sensor are configured such that, with 2 × 2 = 4 pixels as one unit, each of these four pixels passes only light of a different polarization direction.
 The hatching shown on each pixel of the image sensor in the lower right of FIG. 5 indicates the polarization direction.
 For example, the polarization directions of the four pixels a231, b232, c233, and d234 shown in the lower right of FIG. 5 are set as follows.
 The polarization direction of the pixel a231 is horizontal, and the pixel a receives only horizontally polarized light. That is, the pixel a231 is a 0-degree polarization pixel.
 The polarization direction of the pixel b232 is the lower-left diagonal direction, and the pixel b receives only light polarized in the lower-left diagonal direction. That is, the pixel b232 is a 45-degree polarization pixel.
 The polarization direction of the pixel c233 is vertical, and the pixel c receives only vertically polarized light. That is, the pixel c233 is a 90-degree polarization pixel.
 The polarization direction of the pixel d234 is the upper-left diagonal direction, and the pixel d receives only light polarized in the upper-left diagonal direction. That is, the pixel d234 is a 135-degree polarization pixel.
 In the example shown in FIG. 5, the image sensor is configured so that each 2 × 2 = 4-pixel unit passes light of the respective different polarization directions, and this four-pixel unit is set repeatedly to constitute all the pixels of the imaging unit 210.
 FIG. 6 is a diagram showing a cross-sectional configuration of the image sensor of the imaging unit 210.
 As shown in the enlarged cross-sectional view at the lower right of FIG. 6, the cross section of the image sensor has a laminated structure in which the following layers are arranged from the top (image sensor surface) to the bottom (inside the image sensor):
 (1) silicon lens,
 (2) polarizer,
 (3) photoelectric conversion element.
 The imaging unit 210 has a laminated structure comprising each of these layers (1) to (3).
 Light entering the image sensor during image capture passes through the silicon lens and the polarizer and is received by the photoelectric conversion element.
 As shown in FIG. 6, the imaging unit 210 has:
 (a) a plurality of polarizers that pass polarized light of a plurality of different polarization directions, and
 (b) photoelectric conversion elements set in correspondence with the respective polarizers, which receive the incident light passing through each polarizer to acquire polarized images.
 The photoelectric conversion element of each pixel receives light of only one specific polarization direction.
 Therefore, each specific polarization direction is received by only one of every four pixels of the image sensor.
 The process of generating polarized images covering all pixels from these partial-pixel polarized images (demosaic processing) is executed by the polarization signal analysis unit 121 of the image analysis unit 120 in the subsequent stage.
 This process (demosaic processing) will be described later.
 As described with reference to FIGS. 4 to 6, the image acquisition unit (camera) 110 uses the following three types of light of different wavelengths, that is,
 (a) Illumination A = illumination A221, a white LED with a polarizing filter of a certain direction placed in front of it,
 (b) Illumination B = illumination B222, composed of red LEDs,
 (c) Illumination C = illumination C223, composed of near-infrared (NIR) LEDs,
 and captures four types of polarized images (0 degrees, 45 degrees, 90 degrees, and 135 degrees) under each of these three types of illumination. The captured images are input to the image analysis unit 120 in the subsequent stage.
  (3-(2). Details of the configuration and processing of the image analysis unit)
 Next, the details of the configuration and processing of the image analysis unit 120 will be described.
 As described above, the image analysis unit 120 receives the measurement results of the image acquisition unit 110 and performs signal analysis.
 The polarization signal analysis unit 121 of the image analysis unit 120 uses the polarized images acquired by the multi-color polarized image acquisition unit 111 of the image acquisition unit 110 to separate the polarization component signal into a specular reflection light component and the other components (internally scattered light and the like).
 The dye signal analysis unit 122 of the image analysis unit 120 analyzes the red (R) light and near-infrared (NIR) light polarized images acquired by the multi-color polarized image acquisition unit 111 of the image acquisition unit 110, and analyzes dye signals that act as disturbances other than the human skin itself.
 The signal discrimination unit 123 of the image analysis unit 120 receives the analysis results of the polarization signal analysis unit 121 and the dye signal analysis unit 122 and generates an image signal that reflects the uneven shape of the skin surface with the influence of disturbances such as body hair and blemishes removed.
 First, the processing executed by the polarization signal analysis unit 121 of the image analysis unit 120 will be described.
 The polarization signal analysis unit 121 uses the polarized images acquired by the multi-color polarized image acquisition unit 111 of the image acquisition unit 110 to separate the polarization component signal into a specular reflection light component and the other components (internally scattered light and the like).
 The polarization signal analysis unit 121 has a demosaic unit and a polarization model estimation unit.
 The demosaic unit of the polarization signal analysis unit 121 takes the polarized images acquired by the multi-color polarized image acquisition unit 111 of the image acquisition unit 110, that is, as described above, the four types of polarized images (0-degree, 45-degree, 90-degree, and 135-degree polarized images) each received by only one of every four pixels of the image sensor, and executes processing (demosaic processing) that generates, for each of the four types, a polarized image covering all pixels.
 The polarization model estimation unit executes image analysis processing using the four types of all-pixel polarized images (0-degree, 45-degree, 90-degree, and 135-degree polarized images) generated by the demosaic unit to acquire, from the light components contained in the pixel values, only the specular reflection component light reflected at the skin surface, that is, specular reflection component extraction processing that removes the components other than the specular reflection light component (internally scattered light and the like).
 First, with reference to FIG. 7, the demosaic processing executed by the demosaic unit of the polarization signal analysis unit 121 will be described.
 As described above with reference to FIG. 5, the polarized images acquired by the multi-color polarized image acquisition unit 111 of the image acquisition unit 110 are captured as four types of polarized images (0-degree, 45-degree, 90-degree, and 135-degree polarized images), with each pixel of a four-pixel unit of the image sensor capturing a different one.
 Therefore, each polarized image (0-degree, 45-degree, 90-degree, or 135-degree polarized image) is captured by only one of every four pixels of the image sensor of the imaging unit. The remaining three of the four pixels capture the other polarized images.
 The demosaic unit executes pixel value interpolation processing using the pixel values of a specific polarized image captured at one pixel out of every four, that is, demosaic processing that sets the pixel values of the specific polarized image for all pixels.
 A specific example of demosaic processing will be described with reference to FIG. 7.
 Demosaic processing is pixel value interpolation processing that uses the pixel values of certain pixels to estimate and set the pixel values of pixels for which no pixel value is set, and various methods exist.
 The example shown in FIG. 7 illustrates bilinear interpolation, a typical example of pixel value interpolation processing.
 For example, in the example shown in FIG. 7, the pixel values of the 90-degree polarized image are set at only one of every four pixels of the image sensor of the imaging unit 210. The pixel values of the 90-degree polarized image are set at the pixels a, b, c, and d shown in FIG. 7.
 For example, among the four pixels at the upper left of FIG. 7, no pixel value of the 90-degree polarized image is set at the pixels P, Q, and R other than the pixel a.
 Under this setting, the pixel values of the 90-degree polarized image at the pixels P, Q, and R are estimated and set.
 As shown in FIG. 7, following the pixel value interpolation algorithm of bilinear interpolation, the pixel values of the 90-degree polarized image at the pixels P, Q, and R can be calculated (estimated) according to the following formulas:
 P = (a + b) / 2
 Q = (a + c) / 2
 R = (a + b + c + d) / 4
 In this way, the pixel value of a pixel for which no pixel value is set can be calculated (estimated) using the pixel values of the surrounding pixels.
 The same calculation processing is performed for all pixels of the imaging unit, and the pixel values of the four types of polarized images (0-degree, 45-degree, 90-degree, and 135-degree polarized images) are calculated for all pixels.
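 As an illustration of this step, the following is a minimal NumPy sketch of the bilinear demosaic for one polarization channel. The function name and the assumption that the 90-degree samples occupy the even-row, even-column positions of the 2 × 2 polarizer mosaic (pixel a in FIG. 7) are hypothetical, chosen only to mirror the figure; the disclosure does not prescribe an implementation.

    import numpy as np

    def demosaic_90deg(mosaic):
        # Bilinear demosaic of the 90-degree polarization channel.
        # Hypothetical layout: the 90-degree samples sit at even rows
        # and even columns of the 2x2 mosaic (pixel 'a' in FIG. 7).
        h, w = mosaic.shape
        a = mosaic[0::2, 0::2].astype(np.float64)  # known 90-degree samples
        # pad the sample grid so edge pixels reuse their nearest neighbor
        ap = np.pad(a, ((0, 1), (0, 1)), mode="edge")
        out = np.empty((h, w), dtype=np.float64)
        out[0::2, 0::2] = a                                  # pixel a itself
        out[0::2, 1::2] = (ap[:-1, :-1] + ap[:-1, 1:]) / 2   # P = (a + b) / 2
        out[1::2, 0::2] = (ap[:-1, :-1] + ap[1:, :-1]) / 2   # Q = (a + c) / 2
        out[1::2, 1::2] = (ap[:-1, :-1] + ap[:-1, 1:]
                           + ap[1:, :-1] + ap[1:, 1:]) / 4   # R = (a+b+c+d) / 4
        return out

 The same routine, shifted to the appropriate starting offset, can be applied to the 0-degree, 45-degree, and 135-degree channels.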
 The four types of polarized images (0-degree, 45-degree, 90-degree, and 135-degree polarized images) generated by this demosaic processing are input to the polarization model estimation unit, the processing unit following the demosaic unit in the polarization signal analysis unit 121.
 The polarization model estimation unit executes image analysis processing using the four types of all-pixel polarized images (0-degree, 45-degree, 90-degree, and 135-degree polarized images) generated by the demosaic unit to acquire, from the light components contained in the pixel values, only the specular reflection component light reflected at the skin surface, that is, specular reflection component extraction processing that removes the components other than the specular reflection light component (internally scattered light and the like).
 With reference to FIG. 8, the processing executed by the polarization model estimation unit, that is, the processing of acquiring only the specular reflection component light reflected at the skin surface, will be described.
 The graph shown in FIG. 8 plots the polarization angle (α) on the horizontal axis and the luminance I(α) on the vertical axis, and represents the polarization model. It is known that the luminance at a given point of a polarized image captured by the camera changes with the polarization angle as shown in the graph of FIG. 8.
 The polarization model graph shown in FIG. 8 exhibits the same luminance change every time the polarization angle changes by 180 degrees; that is, it is known to exhibit a luminance change with a polarization angle period of 180 degrees.
 Here, the highest luminance within the luminance change range is denoted Imax and the lowest luminance Imin.
 The polarization angle α = ψ at which the maximum luminance Imax is observed is defined as the azimuth angle.
 For example, let Is be the specular reflection component reflected at a subject surface such as the skin surface.
 The specular reflection component Is reflected at the subject surface is the difference between the maximum luminance value Imax and the minimum luminance value Imin of the polarization model, that is, it can be calculated by:
 Is = Imax - Imin
 The curve of the graph shown in FIG. 8 can be obtained by luminance analysis of images captured by the camera 250 in a configuration such as that shown in FIG. 9.
 The subject (OB) 251 is photographed using the camera (CM) 250 shown in FIG. 9.
 However, the camera (CM) 250 captures a polarized image by capturing the image through the polarizing plate (PL) 252 placed in front of the camera (CM) 250.
 It is known that in the polarized image generated by the camera (CM) 250, the luminance of the subject (OB) 251 changes according to the rotation of the polarizing plate (PL) 252. Here, the highest luminance obtained when rotating the polarizing plate (PL) 252 is denoted Imax and the lowest Imin. Further, as shown in the figure, with the x-axis and y-axis of the two-dimensional coordinates taken in the plane of the polarizing plate (PL) 252, the angle on the xy plane with respect to the x-axis when the polarizing plate (PL) 252 is rotated is taken as the polarization angle α. The polarizing plate (PL) 252 returns to its original polarization state when rotated by 180 degrees, and thus has a period of 180 degrees. In the case of a diffuse reflection model, the polarization angle α at which the maximum luminance Imax is observed is taken as the azimuth angle ψ. With these definitions, the luminance I(α) observed when rotating the polarizing plate (PL) 252 follows the graph shown in FIG. 8.
 The luminance I(α) at the polarization angle α is defined by the following equation using four parameters:
 the maximum luminance value Imax,
 the minimum luminance value Imin,
 the polarization angle α, and
 the polarization angle α at which the maximum luminance value Imax is obtained, that is, the azimuth angle ψ.
 I(α) = (Imax + Imin) / 2 + ((Imax - Imin) / 2)·cos(2α - 2ψ) ... (Equation 1)
 In the graph shown in FIG. 8,
 (a) the luminance I(0°) at a polarization angle of 0 degrees,
 (b) the luminance I(45°) at a polarization angle of 45 degrees,
 (c) the luminance I(90°) at a polarization angle of 90 degrees, and
 (d) the luminance I(135°) at a polarization angle of 135 degrees
 are luminance value data that can be acquired from the demosaic images generated by the demosaic unit.
 The polarization angles α at which the luminances I(0°), I(45°), I(90°), and I(135°) are acquired are 0 degrees, 45 degrees, 90 degrees, and 135 degrees, respectively.
 That is, in the above (Equation 1), the unknown parameters are the following three:
 the maximum luminance value Imax,
 the minimum luminance value Imin, and
 the polarization angle α at which the maximum luminance value Imax is obtained, that is, the azimuth angle ψ.
 On the other hand, the known parameters are the luminances I(0°), I(45°), I(90°), and I(135°) and the polarization angles α at which these luminances are acquired. By solving the above (Equation 1) using these known parameters, the three unknowns, that is,
 the maximum luminance value Imax,
 the minimum luminance value Imin, and
 the polarization angle α at which the maximum luminance value Imax is obtained, that is, the azimuth angle ψ,
 can be calculated.
 Further, from the maximum luminance value Imax and the minimum luminance value Imin, the specular reflection component Is reflected at the subject surface (skin surface) can be calculated by:
 Is = Imax - Imin
 Through these processes, the polarization model estimation unit of the polarization signal analysis unit 121 calculates the specular reflection component Is reflected at the subject surface (skin surface).
 A specific processing example executed by the polarization model estimation unit of the polarization signal analysis unit 121 will be described.
 Expanding cos(2α - 2ψ) = cos2α·cos2ψ + sin2α·sin2ψ and using the known data, that is, the luminances I(0°), I(45°), I(90°), and I(135°) and the polarization angles α (0 degrees, 45 degrees, 90 degrees, and 135 degrees) at which these luminances are acquired, the above (Equation 1) can be written as the matrix equation shown below as (Equation 2), composed of a "known matrix", a vector of "unknown parameter expressions", and the "captured data".
  [ 1   1   0 ]   [ (Imax + Imin) / 2           ]   [ I(0°)   ]
  [ 1   0   1 ] · [ ((Imax - Imin) / 2)·cos(2ψ) ] = [ I(45°)  ]  ... (Equation 2)
  [ 1  -1   0 ]   [ ((Imax - Imin) / 2)·sin(2ψ) ]   [ I(90°)  ]
  [ 1   0  -1 ]                                     [ I(135°) ]
 Further, when a vector x is defined whose elements x1, x2, and x3 are the above combinations of the unknown parameters Imax, Imin, and ψ, that is,
  x = [ x1 ]   with x1 = (Imax + Imin) / 2,
      [ x2 ]        x2 = ((Imax - Imin) / 2)·cos(2ψ),
      [ x3 ]        x3 = ((Imax - Imin) / 2)·sin(2ψ),
 the above (Equation 2) can be expressed as
 Ax = b ... (Equation 3)
 where A and b consist of known parameters.
 Further, from the above (Equation 3), the following (Equation 4) is derived:
 x = A⁻¹b ... (Equation 4)
 Since A has four rows and three columns, A⁻¹ is computed here as a pseudoinverse, giving the least-squares solution.
 By solving the above (Equation 4), the three unknowns, that is,
 the maximum luminance value Imax,
 the minimum luminance value Imin, and
 the polarization angle α at which the maximum luminance value Imax is obtained, that is, the azimuth angle ψ,
 can be calculated. Each parameter can be calculated by the following (Equation 5).
  Imax = x1 + √(x2² + x3²)
  Imin = x1 - √(x2² + x3²)
  ψ = (1/2)·arctan(x3 / x2)
  ... (Equation 5)
 From the maximum luminance value Imax and the minimum luminance value Imin calculated by the above (Equation 5), the polarization model estimation unit of the polarization signal analysis unit 121 further calculates the specular reflection component Is reflected at the subject surface (skin surface) by:
 Is = Imax - Imin
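 As a concrete illustration of this fit, the following is a minimal NumPy sketch of the per-pixel estimation, under the linearization of (Equation 2) described above; the function name and the sample luminances are hypothetical.

    import numpy as np

    def fit_polarization_model(i0, i45, i90, i135):
        # Estimate Imax, Imin, azimuth psi and the specular component Is
        # from the four demosaiced polarization luminances of one pixel.
        # Known matrix A of (Equation 2): rows are [1, cos 2a, sin 2a]
        A = np.array([[1.0,  1.0,  0.0],    # alpha = 0 degrees
                      [1.0,  0.0,  1.0],    # alpha = 45 degrees
                      [1.0, -1.0,  0.0],    # alpha = 90 degrees
                      [1.0,  0.0, -1.0]])   # alpha = 135 degrees
        b = np.array([i0, i45, i90, i135], dtype=np.float64)
        # Least-squares solution x = A⁻¹b of (Equation 4)
        x1, x2, x3 = np.linalg.lstsq(A, b, rcond=None)[0]
        amp = np.hypot(x2, x3)               # (Imax - Imin) / 2
        imax = x1 + amp                      # (Equation 5)
        imin = x1 - amp
        psi = 0.5 * np.arctan2(x3, x2)       # azimuth angle
        return imax, imin, psi, imax - imin  # Is = Imax - Imin

    # Example: a pixel whose luminance peaks near 45 degrees
    print(fit_polarization_model(100.0, 140.0, 100.0, 60.0))
    # -> Imax = 140, Imin = 60, psi = pi/4, Is = 80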
 In this way, the polarization model estimation unit of the polarization signal analysis unit 121 executes image analysis processing using the four types of all-pixel polarized images (0-degree, 45-degree, 90-degree, and 135-degree polarized images) generated by the demosaic unit to acquire, from the light components contained in the pixel values, only the specular reflection component light reflected at the skin surface, that is, specular reflection component extraction processing that removes the components other than the specular reflection light component (internally scattered light and the like).
 Next, the details of the processing executed by the dye signal analysis unit 122 of the image analysis unit 120 will be described.
 As described above, the dye signal analysis unit 122 of the image analysis unit 120 analyzes the red (R) light and near-infrared (NIR) light polarized images acquired by the multi-color polarized image acquisition unit 111 of the image acquisition unit 110, and analyzes dye signals that act as disturbances other than the human skin itself.
 The dye signal analysis unit 122 first processes the image captured while the illumination B222 in the illumination unit 220 of the image acquisition unit (camera) 110 described above with reference to FIG. 4, that is, the red LED, is lit. For each corresponding pixel of the four-direction polarization component images (I(r0°), I(r45°), I(r90°), I(r135°)) calculated from that image, it calculates the red polarized image pixel value average (I(r)) according to the following (Equation 21):
 I(r) = (I(r0°) + I(r45°) + I(r90°) + I(r135°)) / 4 ... (Equation 21)
 Further, for the image captured while the illumination C223 in the illumination unit 220 of the image acquisition unit (camera) 110 described above with reference to FIG. 4, that is, the near-infrared (NIR) LED, is lit, the dye signal analysis unit 122 calculates, for each corresponding pixel of the four-direction polarization component images (I(nir0°), I(nir45°), I(nir90°), I(nir135°)) calculated from that image, the near-infrared (NIR) polarized image pixel value average (I(nir)) according to the following (Equation 22):
 I(nir) = (I(nir0°) + I(nir45°) + I(nir90°) + I(nir135°)) / 4 ... (Equation 22)
 Further, the dye signal analysis unit 122 uses the red polarized image pixel value average (I(r)) of each pixel calculated according to the above (Equation 21) and the near-infrared (NIR) polarized image pixel value average (I(nir)) of each pixel calculated according to the above (Equation 22) to calculate a melanin pigment concentration index value (MI: Melanin Index) according to the following (Equation 23):
 MI = α(log I(nir) - log I(r)) + β ... (Equation 23)
 In the above (Equation 23), α and β are predetermined constants.
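 The per-pixel computation of (Equations 21 to 23) can be sketched as follows in NumPy; the function name, the default constants, and the epsilon guarding the logarithm are hypothetical placeholders, not values given by the disclosure.

    import numpy as np

    def melanin_index(r_pols, nir_pols, alpha=1.0, beta=0.0, eps=1e-6):
        # r_pols, nir_pols: lists of the four demosaiced polarization
        # images (0, 45, 90, 135 degrees) captured under the red LED
        # and the NIR LED, respectively.
        i_r = np.mean(np.stack(r_pols), axis=0)      # (Equation 21)
        i_nir = np.mean(np.stack(nir_pols), axis=0)  # (Equation 22)
        # eps guards the logarithm against zero-valued pixels
        return alpha * (np.log(i_nir + eps) - np.log(i_r + eps)) + beta  # (Eq. 23)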
 The melanin pigment concentration index value (MI: Melanin Index) shows a high value in regions such as body hair and blemishes.
 FIG. 10 shows a specific example.
 FIG. 10 shows the following images:
 (a) a camera-captured image, and
 (b) a melanin pigment concentration index value (MI: Melanin Index) output image.
 Regions of high melanin pigment concentration in the (a) camera-captured image, such as 'blemish' regions and body hair regions, are set in the (b) melanin pigment concentration index value (MI: Melanin Index) output image to pixel values (for example, dark red pixel values) different from those of the other skin regions (regions of low melanin pigment concentration).
 The melanin pigment concentration index value (MI: Melanin Index) output image is an image in which pixel values are set according to the melanin pigment concentration, and various pixel value output modes are possible.
 For example, it can also be output as a luminance image, and images with various settings can be generated, such as an image in which a higher melanin pigment concentration gives a higher luminance value (white) or an image in which a higher melanin pigment concentration gives a lower luminance value (black).
 The dye signal analysis unit 122 generates such a melanin pigment concentration index value (MI: Melanin Index) output image.
 Next, the details of the processing executed by the signal discrimination unit 123 of the image analysis unit 120 will be described.
 As described above, the signal discrimination unit 123 of the image analysis unit 120 receives the analysis results of the polarization signal analysis unit 121 and the dye signal analysis unit 122 and generates an image signal that reflects the uneven shape of the skin surface with the influence of disturbances such as body hair and blemishes removed.
 The signal discrimination unit 123 uses the specular reflection component signal obtained by the polarization signal analysis unit 121 and the melanin pigment concentration index value (MI: Melanin Index) obtained by the dye signal analysis unit 122 to execute selective extraction processing of the shadow components caused by the minute unevenness of the skin surface.
 The details of the processing executed by the signal discrimination unit 123 of the image analysis unit 120 will be described with reference to FIGS. 11 and 12.
 FIG. 11 shows the following images:
 (a) a camera-captured image, and
 (b) a specular reflection (specular) component image (after brightness adjustment).
 The '(b) specular reflection (specular) component image (after brightness adjustment)' is an image generated by extracting only the specular reflection component obtained by analyzing the polarized images as described above with reference to FIG. 8.
 That is, it is the specular reflection component image acquired by the polarized image analysis processing executed by the polarization signal analysis unit 121 of the image analysis unit 120.
 Although it is difficult to see in the images shown in FIG. 11, in the '(a) camera-captured image', image regions such as body hair, blemishes, and moles have low pixel values (low luminance), and shadows such as skin grooves and wrinkles likewise have low pixel values (low luminance).
 On the other hand, the '(b) specular reflection (specular) component image' obtained by the analysis processing of the polarized images is an image in which only the shadows of the surface and the body hair on the surface are reflected in the pixel values. The influence of blemishes, moles, and the like lying from deep within the skin to near the surface is hardly reflected in the pixel values.
 Further, as described above with reference to FIG. 10, the melanin pigment concentration index value output image generated by the dye signal analysis unit 122 of the image analysis unit 120 is an image that outputs pixel values distinguishing body hair, blemish, and mole portions of high melanin pigment concentration from the other skin regions.
 As described above, the dye signal analysis unit 122 of the image analysis unit 120 can output, for example, a melanin pigment concentration index value output image in which body hair, blemish, and mole portions of high melanin pigment concentration are set to higher pixel values (higher luminance) than the other skin regions.
 Conversely, it can also output a melanin pigment concentration index value output image in which body hair, blemish, and mole portions of high melanin pigment concentration are set to lower pixel values (lower luminance) than the other skin regions.
 The signal discrimination unit 123 of the image analysis unit 120 uses the following three types of images to generate an image reflecting the uneven shape of the skin surface with noise such as disturbances from body hair, blemishes, and the like removed, that is, a noise-removed skin image:
 (a) the camera-captured image acquired by the image acquisition unit (camera) 110,
 (b) the specular reflection component image generated by the polarized image analysis processing executed by the polarization signal analysis unit 121 of the image analysis unit 120, and
 (c) the melanin pigment concentration index value output image generated by the dye signal analysis processing executed by the dye signal analysis unit 122 of the image analysis unit 120.
 The processing sequence executed by the signal discrimination unit 123 of the image analysis unit 120 will be described with reference to FIG. 12.
 The signal discrimination unit 123 first synthesizes the (b) specular reflection component image and the (c) melanin pigment concentration index value output image shown in FIG. 12 to generate the (d) composite image.
 The (d) composite image is an image in which pixel regions whose specular reflection component is low and whose melanin pigment concentration index value is high are output as low pixel values (low luminance) (hereinafter referred to as dark portions).
 Next, the signal discrimination unit 123 generates the (e) noise-removed skin image from the (b) specular reflection component image and the (d) composite image shown in FIG. 12.
 The (e) noise-removed skin image is an image reflecting the uneven shape of the skin surface with noise such as disturbances from body hair, blemishes, and the like removed.
 In generating the (e) noise-removed skin image, for example, for the portions of the (d) composite image other than the dark portions (hereinafter referred to as bright portions), the luminance values of the corresponding (b) specular reflection component image are used. For a pixel in a dark portion of the (d) image, that is, a pixel region whose specular reflection component is low and whose melanin pigment concentration index value is high, if there are bright portions in the vicinity of the pixel, the pixel is interpolated with the pixel values of those bright portions, in other words with the corresponding luminance values of (b). On the other hand, if there is no bright portion in the vicinity of the pixel, no interpolation processing is performed and the pixel value of that pixel is used as it is.
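 A minimal sketch of this composition step is shown below, assuming simple thresholding to form the dark portions and a mean over nearby bright pixels for the interpolation; the function name, the thresholds, and the neighborhood radius are hypothetical, as the disclosure does not specify them.

    import numpy as np

    def noise_removed_skin_image(specular, mi, spec_thresh, mi_thresh, radius=3):
        # specular: (b) specular reflection component image
        # mi:       (c) melanin pigment concentration index value image
        # (d) dark portions: low specular component and high melanin index
        dark = (specular < spec_thresh) & (mi > mi_thresh)
        out = specular.astype(np.float64).copy()
        h, w = out.shape
        for y, x in zip(*np.nonzero(dark)):
            y0, y1 = max(0, y - radius), min(h, y + radius + 1)
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            bright = ~dark[y0:y1, x0:x1]
            if bright.any():
                # interpolate with the bright-portion luminances of (b)
                out[y, x] = out[y0:y1, x0:x1][bright].mean()
            # otherwise keep the pixel value of (b) as it is
        return out  # (e) noise-removed skin image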
 In this way, the image analysis unit 120 receives the analysis results of the polarization signal analysis unit 121 and the dye signal analysis unit 122 and generates an image signal that reflects the uneven shape of the skin surface with the influence of disturbances such as body hair and blemishes removed.
  (3-(3). Details of the configuration and processing of the three-dimensional (3D) shape analysis unit)
 Next, the details of the configuration and processing of the three-dimensional (3D) shape analysis unit 130 will be described.
 As described above, the three-dimensional (3D) shape analysis unit 130 analyzes the three-dimensional (3D) shape of the skin included in the camera-captured image using the signal output from the image analysis unit 120.
 That is, it analyzes the three-dimensional (3D) shape of the skin included in the camera-captured image using the '(e) noise-removed skin image', the image signal described with reference to FIG. 12 that reflects the uneven shape of the skin surface with the influence of disturbances such as body hair and blemishes removed.
 The normal information estimation unit 131 of the three-dimensional (3D) shape analysis unit 130 estimates normal information of the skin surface. A normal is a line orthogonal to the object surface, that is, the skin surface.
 The distance information conversion unit 132 of the three-dimensional (3D) shape analysis unit 130 converts the normal information of the skin surface estimated by the normal information estimation unit 131 into distance information indicating the uneven shape of the skin surface.
 The distance information analysis unit 133 of the three-dimensional (3D) shape analysis unit 130 uses the distance information generated by the distance information conversion unit 132 to calculate and analyze index values that serve as evaluation indices for skin texture, such as the roughness coefficient of the skin surface.
 First, with reference to FIG. 13, the normal information estimation processing of the skin surface executed by the normal information estimation unit 131 of the three-dimensional (3D) shape analysis unit 130 will be described.
 The normal information estimation unit 131 of the three-dimensional (3D) shape analysis unit 130 inputs the noise-removed skin image generated by the image analysis unit 120, that is, the '(e) noise-removed skin image' described with reference to FIG. 12, the image signal reflecting the uneven shape of the skin surface with the influence of disturbances such as body hair and blemishes removed, to the learner 301.
 The learner 301 is, for example, a learner using a CNN (Convolutional Neural Network) or the like. The input of the learner 301 is the '(e) noise-removed skin image', and the output is the per-pixel normal information of the input '(e) noise-removed skin image'.
 The per-pixel normal information includes, for example, the following parameters:
 p: the x-direction component value of the calculated normal (nx)
 q: the y-direction component value of the calculated normal (ny)
 The above x and y directions correspond to the x and y directions of the coordinate system shown in FIG. 9 described above.
 In this way, the normal information estimation unit 131 inputs the '(e) noise-removed skin image', the signal output from the image analysis unit 120, to the learner (CNN) 301 and outputs normal information for each pixel.
 The learner (CNN) 301 is generated in advance by learning processing executed using various image data. At learning time, a large number of pairs of images of actual skin or replicas and GT (Ground Truth) data, obtained by converting unevenness information separately measured with a 3D scanning device into normal information, are prepared, and the network weights are trained using a least-squares-error (L2) loss function.
 A specific example of this learning processing will be described later.
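 By way of illustration only, the following PyTorch sketch shows one possible shape of such a normal-estimation network and its L2 training step; the class name, layer sizes, and channel counts are assumptions and are not the architecture of the disclosure.

    import torch
    import torch.nn as nn

    class NormalEstimator(nn.Module):
        # Minimal fully convolutional network: noise-removed skin image
        # (1 channel) -> per-pixel normal components (p = nx, q = ny).
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
                nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
                nn.Conv2d(32, 2, 3, padding=1),  # 2 outputs: p and q
            )

        def forward(self, x):
            return self.net(x)

    model = NormalEstimator()
    loss_fn = nn.MSELoss()  # least-squares-error (L2) loss against GT normals
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

    # One training step on a dummy (image, GT normal map) batch
    image = torch.randn(8, 1, 128, 128)       # noise-removed skin images
    gt_normals = torch.randn(8, 2, 128, 128)  # GT (p, q) from a 3D scan
    loss = loss_fn(model(image), gt_normals)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()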
 In this way, the normal information estimation unit 131 of the three-dimensional (3D) shape analysis unit 130 estimates the normal information of the skin surface using the learner 301 shown in FIG. 13. A normal is a line orthogonal to the object surface, that is, the skin surface.
 Next, the processing executed by the distance information conversion unit 132 of the three-dimensional (3D) shape analysis unit 130 will be described.
 The distance information conversion unit 132 of the three-dimensional (3D) shape analysis unit 130 converts the normal information of the skin surface estimated by the normal information estimation unit 131 into distance information indicating the uneven shape of the skin surface.
 The processing executed by the distance information conversion unit 132 will be described with reference to FIG. 14.
 The distance information conversion unit 132 calculates the distance information (Z) of each pixel using the per-pixel normal information (p = nx, q = ny) output from the normal information estimation unit 131.
 As the distance calculation formula for obtaining the distance information from the normal information of the pixels, for example, the Frankot-Chellappa algorithm shown in the following (Equation 31) can be used.
 Z = F⁻¹[ -j·(εx·F[p] + εy·F[q]) / (εx² + εy²) ] ... (Equation 31)
 Each parameter in the above (Equation 31) is as follows:
 F: Fourier transform
 εx: spatial frequency (x)
 εy: spatial frequency (y)
 p: x-direction component value of the normal (nx)
 q: y-direction component value of the normal (ny)
 The above (Equation 31) does not calculate the absolute distance between the camera and the subject. The distance information (Z) calculated by the above (Equation 31) corresponds to a distance (shape) obtained by setting a certain reference point and integrating the gradient field from there; the distance (Z) is calculated so that the gradient field and the derivative of the shape coincide.
 To know the absolute distance from the camera to the subject, the distance to the reference point must be acquired separately.
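 The FFT-based evaluation of (Equation 31) can be sketched as follows in NumPy; the function name is hypothetical, and the handling of the zero-frequency term reflects the fact, noted above, that Z is recovered only up to an unknown offset.

    import numpy as np

    def frankot_chellappa(p, q):
        # Integrate a per-pixel gradient field (p = nx, q = ny) into a
        # relative depth map Z, following (Equation 31).
        h, w = p.shape
        wx = np.fft.fftfreq(w) * 2.0 * np.pi  # spatial frequencies eps_x
        wy = np.fft.fftfreq(h) * 2.0 * np.pi  # spatial frequencies eps_y
        ex, ey = np.meshgrid(wx, wy)
        denom = ex**2 + ey**2
        denom[0, 0] = 1.0                     # avoid division by zero at DC
        Fz = -1j * (ex * np.fft.fft2(p) + ey * np.fft.fft2(q)) / denom
        Fz[0, 0] = 0.0                        # zero-mean depth (unknown offset)
        return np.real(np.fft.ifft2(Fz))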
 In this way, the distance information conversion unit 132 of the three-dimensional (3D) shape analysis unit 130 converts the normal information of the skin surface estimated by the normal information estimation unit 131 into distance information indicating the uneven shape of the skin surface.
 Next, the processing executed by the distance information analysis unit 133 of the three-dimensional (3D) shape analysis unit 130 will be described.
 The distance information analysis unit 133 analyzes the distance information calculated by the distance information conversion unit 132. For example, using the distance information generated by the distance information conversion unit 132, it calculates and analyzes index values such as the roughness coefficient of the skin surface that serve as evaluation indices for skin texture.
 The processing executed by the distance information analysis unit 133 will be described with reference to FIGS. 15 and 16.
 The depth map (distance image) shown in FIG. 15 is a map generated based on the distance information calculated by the distance information conversion unit 132. That is, it is a depth map (distance image) in which pixel values are set according to the distance for each pixel of the skin image captured by the image acquisition unit (camera) 110.
 The distance information analysis unit 133 analyzes, for example, the distance information (profile) of the portion indicated by the line AB in the center of this depth map.
 The graph on the right side of FIG. 15 is an example of the distance (depth) analysis data generated by the distance information analysis unit 133, and shows the change in the distance (depth) of each pixel included in the line AB of the depth map (distance image).
 A larger change in the distance of the pixels included in the line AB means larger unevenness of the skin. Conversely, a smaller change in the distance of the pixels included in the line AB means smaller unevenness and smoother skin.
 The distance information analysis unit 133 further uses the distance (depth) analysis data showing the change in the distance (depth) of each pixel shown in FIG. 15 to calculate skin roughness index values such as the 'average roughness' and the 'maximum height' of the skin.
 A specific example will be described with reference to FIG. 16.
 FIG. 16 shows calculation examples of the 'average roughness' of the skin and the 'maximum height' of the skin, which are the skin roughness index values calculated by the distance information analysis unit 133.
 As shown in the figure, the average roughness (Za) of the skin is calculated by the following (Equation 32).
 Za = (1/N)·Σ(n = 1 to N) |Zn - Zave| ... (Equation 32)
 In the above (Equation 32), each parameter is as follows:
 N: the number of pixels in the calculation area
 Zn: the distance value of the pixel n in the calculation area
 Zave: the average distance over the calculation area
 Further, the maximum height (Zz) of the skin is calculated by the following (Equation 33):
 Zz = Zp + Zv ... (Equation 33)
 In the above (Equation 33), each parameter is as follows:
 Zp: the difference between the maximum distance in the calculation area and the average distance (Zave)
 Zv: the difference between the minimum distance in the calculation area and the average distance (Zave)
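 Both indices follow directly from the depth map; a minimal NumPy sketch over one calculation area is given below, with a hypothetical function name.

    import numpy as np

    def skin_roughness(depth_region):
        # Average roughness Za (Equation 32) and maximum height Zz
        # (Equation 33) over one calculation area of the depth map.
        z = np.asarray(depth_region, dtype=np.float64).ravel()
        z_ave = z.mean()
        za = np.abs(z - z_ave).mean()  # Za = (1/N) * sum |Zn - Zave|
        zp = z.max() - z_ave           # peak height above the mean
        zv = z_ave - z.min()           # valley depth below the mean
        return za, zp + zv             # Zz = Zp + Zv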
 In this way, the three-dimensional (3D) shape analysis unit 130 analyzes the three-dimensional (3D) shape of the skin included in the camera-captured image using the signal output from the image analysis unit 120.
 That is, it analyzes the three-dimensional (3D) shape of the skin included in the camera-captured image using the '(e) noise-removed skin image', the image signal described with reference to FIG. 12 that reflects the uneven shape of the skin surface with the influence of disturbances such as body hair and blemishes removed.
  (3-(4). Details of the configuration and processing of the display unit)
 Next, the details of the configuration and processing of the display unit 140 will be described.
 As described above, the display unit 140 displays the data acquired and analyzed by each of the image acquisition unit 110, the image analysis unit 120, and the three-dimensional (3D) shape analysis unit 130.
 The measurement information display unit 141 of the display unit 140 displays the information acquired or measured by the image acquisition unit 110.
 The signal information display unit 142 of the display unit 140 displays the information analyzed by the image analysis unit 120.
 The three-dimensional shape display unit 143 of the display unit 140 displays the three-dimensional shape information of the human skin analyzed by the three-dimensional (3D) shape analysis unit 130.
 The measurement status display unit 144 of the display unit 140 displays information such as the progress of the processing being executed in the image acquisition unit 110 through the three-dimensional (3D) shape analysis unit 130.
Examples of the data displayed by the display unit 140 will be described with reference to FIGS. 17 to 19.
The example of display data shown in FIG. 17 displays the following image data:
(a) camera-captured image
(b) depth map (distance image)
(c) three-dimensional (3D) image
(a) The camera-captured image is an image acquired from the image acquisition unit (camera) 110.
(b) The depth map (distance image) and (c) the three-dimensional (3D) image are images generated by the 3D shape analysis unit 130.
By viewing these images, the user can accurately judge the shape and unevenness of his or her own skin.
The example of display data shown in FIG. 18 displays the following data:
(a) camera-captured image
(b) depth map (distance image)
(c) distance (depth) analysis data
(a) The camera-captured image is an image acquired from the image acquisition unit (camera) 110.
(b) The depth map (distance image) and (c) the distance (depth) analysis data are generated by the 3D shape analysis unit 130.
By viewing these images and graphs, the user can accurately judge the shape and unevenness of his or her own skin.
The example of display data shown in FIG. 19 displays the following image data:
(a) camera-captured image
(b) melanin pigment concentration index value output image
(a) The camera-captured image is an image acquired from the image acquisition unit (camera) 110.
(b) The melanin pigment concentration index value output image is an image generated by the image analysis unit 120.
By viewing these images, the user can accurately judge the condition of his or her own skin, for example the condition of spots.
[4. Sequence of processing executed by the image processing device]
Next, the sequence of processing executed by the image processing apparatus 100 of the present disclosure will be described.
FIG. 20 is a flowchart illustrating the sequence of processing executed by the image processing apparatus 100 of the present disclosure.
The processing according to the flowcharts shown in FIG. 20 and subsequent figures can be executed according to a program stored in the storage unit of the image processing apparatus 100, for example as program execution processing by a processor such as a CPU having a program execution function.
The processing of each step of the flow will be described below in order.
(Steps S101 to S106)
The processes of steps S101 to S106 are executed by the image acquisition unit (camera) 110.
First, in step S101, the image acquisition unit (camera) 110 turns on the polarized white LED of the illumination unit, and in step S102 it captures a skin image.
As described above with reference to FIGS. 5 and 6, the image pickup unit of the image acquisition unit (camera) 110 is configured so that, with 2 × 2 = 4 pixels as one unit, each of the four pixels passes only light of a different polarization direction.
By capturing an image with such a camera, four types of polarized images (a 0-degree polarized image, a 45-degree polarized image, a 90-degree polarized image, and a 135-degree polarized image) are captured, one for each of the four pixel types of the image sensor.
Next, in step S103, the image acquisition unit (camera) 110 turns on the red (R) LED of the illumination unit, and in step S104 it captures a skin image.
Next, in step S105, the image acquisition unit (camera) 110 turns on the near-infrared (NIR) LED of the illumination unit, and in step S106 it captures a skin image.
All of these captured images are input to the image analysis unit 120.
(Step S107)
The processes of steps S107 to S109 are executed by the image analysis unit 120.
First, in step S107, the image analysis unit 120 executes polarization signal analysis processing. This processing is executed by the polarization signal analysis unit 121 of the image analysis unit 120.
In step S107, the polarization signal analysis unit 121 uses the polarized images acquired by the multi-color polarized image acquisition unit 111 of the image acquisition unit 110 to separate the polarization component signal into the specular reflection component and the other components (internally scattered light and the like).
This is the processing described above with reference to FIGS. 7 and 8, and includes demosaic processing and polarization model estimation processing.
First, as described with reference to FIG. 7, demosaic processing is executed: pixel value interpolation using the pixel values of each polarized image, captured at one of the four pixels, sets the pixel values of that polarized image for all pixels; a sketch of this interpolation is shown below.
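A minimal Python sketch of such a demosaic step follows; the NumPy mosaic array, the particular 2 × 2 sample layout, and bilinear filling of the missing samples are assumptions for illustration, not the disclosed implementation.

```python
import numpy as np
from scipy.ndimage import convolve

def demosaic_polarization(raw: np.ndarray):
    """Split a 2x2 polarization mosaic (0/45/90/135 degrees) into four
    full-resolution planes by bilinear interpolation of missing pixels."""
    offsets = {0: (0, 0), 45: (0, 1), 90: (1, 1), 135: (1, 0)}  # assumed layout
    kernel = np.array([[0.25, 0.5, 0.25],
                       [0.5,  1.0, 0.5],
                       [0.25, 0.5, 0.25]])   # bilinear interpolation weights
    planes = {}
    for angle, (dy, dx) in offsets.items():
        sparse = np.zeros(raw.shape, dtype=np.float64)
        sparse[dy::2, dx::2] = raw[dy::2, dx::2]   # keep this angle's samples
        planes[angle] = convolve(sparse, kernel, mode="mirror")
    return planes
```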
Next, using the graph described with reference to FIG. 8, in which the polarization angle (α) is plotted on the horizontal axis and the luminance I(α) on the vertical axis, the specular reflection component Is reflected at the subject surface (skin surface) is calculated.
That is, Is is calculated as the difference between the maximum luminance value Imax and the minimum luminance value Imin of the polarization model described with reference to FIG. 8:
Is = Imax - Imin
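The amplitude of the fitted model can be obtained in closed form. For a sinusoidal polarization model I(α) = A + B·cos(2α - 2φ) sampled at 0°, 45°, 90°, and 135°, a standard identity gives Imax = A + B and Imin = A - B directly; the Python sketch below assumes that model and NumPy arrays, and is an illustration rather than the disclosed implementation.

```python
import numpy as np

def specular_component(i0, i45, i90, i135):
    """Fit I(alpha) = A + B*cos(2*alpha - 2*phi) to the four polarized
    planes and return Is = Imax - Imin per pixel.
    Closed form: A = (I0 + I90)/2, B = 0.5*sqrt((I0-I90)^2 + (I45-I135)^2)."""
    a = 0.5 * (i0 + i90)                        # mean luminance level A
    b = 0.5 * np.hypot(i0 - i90, i45 - i135)    # modulation amplitude B
    i_max, i_min = a + b, a - b
    return i_max - i_min                        # Is = Imax - Imin = 2*B
```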
(Step S108)
Next, in step S108, the image analysis unit 120 executes color signal analysis processing. This processing is executed by the dye signal analysis unit 122 of the image analysis unit 120.
The dye signal analysis unit 122 analyzes the polarized images corresponding to red (R) light and near-infrared (NIR) light acquired by the multi-color polarized image acquisition unit 111 of the image acquisition unit 110, and analyzes dye signals that constitute disturbances other than human skin.
For each corresponding pixel of the four-direction polarization component images (I(r0°), I(r45°), I(r90°), I(r135°)) calculated from the image captured while illumination B 222 in the illumination unit 220 of the image acquisition unit (camera) 110, that is, the red LED, is lit, the dye signal analysis unit 122 calculates the red polarized image pixel value average (I(r)) according to the following equation:
I(r) = (I(r0°) + I(r45°) + I(r90°) + I(r135°)) / 4
Further, for each corresponding pixel of the four-direction polarization component images (I(nir0°), I(nir45°), I(nir90°), I(nir135°)) calculated from the image captured while illumination C 223, that is, the near-infrared (NIR) LED, is lit, the near-infrared (NIR) polarized image pixel value average (I(nir)) is calculated according to the following (Equation 22):
I(nir) = (I(nir0°) + I(nir45°) + I(nir90°) + I(nir135°)) / 4 ... (Equation 22)
Further, using the red polarized image pixel value average (I(r)) and the near-infrared (NIR) polarized image pixel value average (I(nir)) calculated for each pixel according to the above equations, the melanin pigment concentration index value (MI: Melanin Index) is calculated according to the following equation:
MI = α(log I(nir) - log I(r)) + β ... (Equation 23)
In the above (Equation 23), α and β are predetermined constants.
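A minimal Python sketch of (Equation 23) follows, assuming NumPy arrays for the four polarization planes of each illumination; the placeholder constants and the small epsilon guarding log(0) are illustrative assumptions, since the disclosure states only that α and β are predefined.

```python
import numpy as np

ALPHA, BETA = 1.0, 0.0   # placeholders; the actual alpha/beta are predefined constants

def melanin_index(r_planes, nir_planes, eps=1e-6):
    """MI = alpha*(log I(nir) - log I(r)) + beta per pixel (Equation 23),
    where I(r) and I(nir) average the four polarization planes."""
    i_r = sum(r_planes) / 4.0      # I(r): mean of the four red planes
    i_nir = sum(nir_planes) / 4.0  # I(nir): mean of the four NIR planes
    return ALPHA * (np.log(i_nir + eps) - np.log(i_r + eps)) + BETA
```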
As described above with reference to FIG. 10, the melanin pigment concentration index value (MI) shows high values in regions such as body hair and spots.
As can be understood from FIG. 10, the "spot" regions in (a) the camera-captured image are set to high pixel values of a specific color (for example, dark red pixel values) in (b) the melanin pigment concentration index value (MI) output image.
The dye signal analysis unit 122 generates such a melanin pigment concentration index value (MI) output image.
(Step S109)
Next, in step S109, the image analysis unit 120 executes signal discrimination processing. This processing is executed by the signal discrimination unit 123 of the image analysis unit 120.
The signal discrimination unit 123 uses the specular reflection component signal obtained by the polarization signal analysis unit 121 and the melanin pigment concentration index value (MI) obtained by the dye signal analysis unit 122 to selectively extract the shadow component caused by the minute unevenness of the skin surface, generating a noise-removed skin image.
That is, it generates the "(e) noise-removed skin image" described above with reference to FIG. 12.
First, the signal discrimination unit 123 synthesizes (b) the specular reflection component image and (c) the melanin pigment concentration index value output image shown in FIG. 12 to generate (d) a composite image.
(d) The composite image is an image in which pixel regions having a low specular reflection component and a high melanin pigment concentration index value are output as low pixel values (low luminance).
Next, the signal discrimination unit 123 generates (e) the noise-removed skin image using (b) the specular reflection component image and (d) the composite image shown in FIG. 12.
(e) The noise-removed skin image reflects the uneven shape of the skin surface with noise such as disturbances caused by body hair and spots removed.
Since portions of (b) the specular reflection component image with particularly high luminance values are presumed to be influenced by sweat, cosmetics (glitter), and the like, these pixel regions may also be output as low pixel values (low luminance) in (d) the composite image, and (e) the noise-removed skin image may then be generated from the (d) composite image generated in this way and (b) the specular reflection component image.
In this way, the image analysis unit 120 receives the analysis results of the polarization signal analysis unit 121 and the dye signal analysis unit 122 and generates an image signal that reflects the uneven shape of the skin surface with the influence of disturbances such as body hair and spots removed; one plausible form of this compositing is sketched below.
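The disclosure does not give an explicit compositing formula, so the following Python sketch shows only one plausible reading of the masking-and-filling idea described above; the thresholds, the median-based filling, and all names are assumptions for illustration.

```python
import numpy as np
from scipy.ndimage import median_filter

def noise_removed_skin_image(specular, mi, mi_thresh, spec_low, spec_high):
    """Illustrative sketch: suppress pixels whose specular component is low
    while the melanin index is high (hair/spot candidates), and optionally
    pixels with very high specular values (sweat/glitter), then fill them."""
    mask = np.ones_like(specular, dtype=np.float64)
    mask[(specular < spec_low) & (mi > mi_thresh)] = 0.0  # disturbance regions
    mask[specular > spec_high] = 0.0                      # optional glare regions
    composite = specular * mask     # (d)-style composite: noise regions go dark
    # (e)-style noise-removed image: replace suppressed pixels with a local
    # median of the specular image so only the skin's micro-shading remains
    filled = median_filter(specular, size=5)
    noise_removed = np.where(mask > 0, specular, filled)
    return composite, noise_removed
```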
(Step S110)
The processes of steps S110 to S112 are executed by the three-dimensional (3D) shape analysis unit 130.
First, in step S110, normal estimation processing is executed. This processing is executed by the normal information estimation unit 131 of the 3D shape analysis unit 130, which estimates the normal information of the skin surface. A normal is a line orthogonal to the object surface, that is, the skin surface.
As described above with reference to FIG. 13, the normal information estimation unit 131 inputs the noise-removed skin image generated by the image analysis unit 120, that is, the "(e) noise-removed skin image" reflecting the uneven shape of the skin surface with the influence of disturbances such as body hair and spots removed, described with reference to FIG. 12, into the learner 301, and obtains per-pixel normal information of the "(e) noise-removed skin image" as the output of the learner 301.
The learner 301 is, for example, a learner using a CNN (Convolutional Neural Network) or the like.
In this way, the normal information estimation unit 131 of the 3D shape analysis unit 130 estimates the normal information of the skin surface using the learner 301 shown in FIG. 13.
(Step S111)
Next, in step S111, distance conversion processing is executed. This processing is executed by the distance information conversion unit 132 of the 3D shape analysis unit 130.
The distance information conversion unit 132 converts the normal information of the skin surface estimated by the normal information estimation unit 131 into distance information indicating the uneven shape of the skin surface. This is the processing described above with reference to FIG. 14.
The distance information conversion unit 132 calculates the distance information (Z) of each pixel using the per-pixel normal information (p = nx, q = ny) output from the normal information estimation unit 131.
As the distance calculation formula for obtaining the distance information, for example, the Frankot-Chellappa algorithm shown in (Equation 31) described above can be used; a sketch of this integration follows.
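A minimal Python sketch of a Frankot-Chellappa-style integration is shown below, assuming a gradient field (p, q) derived from the normal map and periodic boundary handling via the FFT; this illustrates the cited algorithm rather than reproducing (Equation 31) verbatim.

```python
import numpy as np

def frankot_chellappa(p, q):
    """Integrate a gradient field (p ~ dZ/dx, q ~ dZ/dy, derived from the
    per-pixel normals) into a depth map Z in the Fourier domain."""
    h, w = p.shape
    wx = np.fft.fftfreq(w) * 2.0 * np.pi    # horizontal spatial frequencies
    wy = np.fft.fftfreq(h) * 2.0 * np.pi    # vertical spatial frequencies
    u, v = np.meshgrid(wx, wy)
    fp, fq = np.fft.fft2(p), np.fft.fft2(q)
    denom = u ** 2 + v ** 2
    denom[0, 0] = 1.0                       # avoid division by zero at DC
    fz = (-1j * u * fp - 1j * v * fq) / denom
    fz[0, 0] = 0.0                          # depth is recovered up to a constant
    return np.real(np.fft.ifft2(fz))
```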
(Step S112)
Next, in step S112, distance analysis processing is executed. This processing is executed by the distance information analysis unit 133 of the 3D shape analysis unit 130.
The distance information analysis unit 133 analyzes the distance information calculated by the distance information conversion unit 132. For example, using the distance information generated by the distance information conversion unit 132, it calculates and analyzes index values, such as the roughness coefficient of the skin surface, that serve as evaluation indexes for skin texture and the like.
This is the processing described above with reference to FIGS. 15 and 16.
For example, as described with reference to FIG. 15, the distance information analysis unit 133 analyzes the distance information (profile) of the portion indicated by the line AB in the center of the depth map.
The graph on the right side of FIG. 15 is an example of the distance (depth) analysis data generated by the distance information analysis unit 133, and shows the change in the distance (depth) of each pixel included in the line AB in the depth map (distance image).
The larger the variation in the distance of each pixel included in the line AB, the greater the unevenness of the skin; the smaller the variation, the smaller the unevenness and the smoother the skin.
The distance information analysis unit 133 further uses the distance (depth) analysis data shown in FIG. 15 to calculate skin roughness index values such as the "average roughness" and "maximum height" of the skin, as described with reference to FIG. 16.
In this way, in step S112, the 3D shape analysis unit 130 analyzes the three-dimensional (3D) shape of the skin included in the camera-captured image using the signal output from the image analysis unit 120, that is, using the "(e) noise-removed skin image", the image signal reflecting the uneven shape of the skin surface with the influence of disturbances such as body hair and spots removed, described with reference to FIG. 12.
(Step S113)
Finally, in step S113, the analysis results are displayed on the display unit. This processing is executed by the display unit 140.
As described above, the display unit 140 displays the data acquired and analyzed by each of the image acquisition unit 110, the image analysis unit 120, and the 3D shape analysis unit 130.
Specifically, for example, the various analysis data described above with reference to FIGS. 17 to 19 are displayed.
The example of display data shown in FIG. 17 displays the following image data:
(a) camera-captured image
(b) depth map (distance image)
(c) three-dimensional (3D) image
The example of display data shown in FIG. 18 displays the following data:
(a) camera-captured image
(b) depth map (distance image)
(c) distance (depth) analysis data
The example of display data shown in FIG. 19 displays the following image data:
(a) camera-captured image
(b) melanin pigment concentration index value output image
In this way, the display unit 140 displays the data acquired and analyzed by each of the image acquisition unit 110, the image analysis unit 120, and the 3D shape analysis unit 130.
By viewing these display data, the user can accurately judge the condition of his or her own skin, for example the shape of the skin, its unevenness, and the condition of spots.
[5. Example of the learning processing for generating the learner used to calculate per-pixel normal information]
Next, an example of the learning processing for generating the learner used to calculate per-pixel normal information will be described.
As described above with reference to FIG. 13, the normal information estimation unit 131 of the 3D shape analysis unit 130 inputs the "(e) noise-removed skin image" generated by the image analysis unit 120, the image signal reflecting the uneven shape of the skin surface with the influence of disturbances such as body hair and spots removed, into the learner 301 and obtains per-pixel normal information of that image as the output of the learner 301.
The learner 301 is, for example, a learner using a CNN (Convolutional Neural Network) or the like.
The learner (CNN) 301 is generated in advance by learning processing executed using various image data. At the time of learning, a large number of pairs are prepared, each consisting of an image of actual skin or a replica and normal information (GT (Ground Truth) data) converted from unevenness information measured separately by a 3D scanning device, and the network weights are learned using a least squares error (L2) loss function.
A specific example of this learning processing will be described.
FIG. 21 is a diagram illustrating an example of generating the learner (CNN) 401, that is, an example of machine learning processing.
In the example shown in FIG. 21, a sample image 411 is input to the learner (CNN) 401. The output of the learner (CNN) 401 is per-pixel normal information 412.
The similarity between the per-pixel normal information 412, which is the output when the sample image 411 is input to the learner 401, and the normal information 413, which is the ground truth of the learning, is calculated. For example, the least squares error (L2) calculation unit 402 calculates the least squares error (L2) between the per-pixel normal information 412 and the ground-truth normal information 413, and learning processing is performed with L2 as the loss.
For example, the weights of the learner (CNN) 401 are updated by backpropagating the calculated loss. In this way, the learner (CNN) 401 is generated. A minimal sketch of such a training step follows.
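As a hedged illustration of this training loop, the following PyTorch sketch pairs a stand-in CNN with an L2 loss and backpropagation; the network architecture, optimizer, and all names are assumptions for illustration and are not the disclosed learner.

```python
import torch
import torch.nn as nn

# Illustrative stand-in for the learner (CNN) 401: a 1-channel noise-removed
# skin image in, a 3-channel per-pixel normal map out.
model = nn.Sequential(
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 3, 3, padding=1),
)
loss_fn = nn.MSELoss()                         # least squares (L2) loss
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

def train_step(sample_image, gt_normals):
    """One update: predict per-pixel normals, compute the L2 loss against
    the 3D-scan ground-truth normals, backpropagate, update the weights."""
    optimizer.zero_grad()
    pred = model(sample_image)                 # per-pixel normal information 412
    loss = loss_fn(pred, gt_normals)           # compare with GT normals 413
    loss.backward()                            # backpropagate the loss
    optimizer.step()
    return loss.item()
```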
Although a learner is generated here using a CNN as an example of machine learning, the present invention is not limited to this. The learner may be generated using various machine learning methods other than a CNN, for example an RNN (Recurrent Neural Network). Further, in the example described above the weights of the learner are updated by backpropagating the calculated loss, but the method is not limited to this; the weights may be updated using any learning method, for example stochastic gradient descent.
FIG. 22 is a flowchart illustrating the processing sequence of the learning processing for generating the learner.
The processes of steps S201 to S209 of the flow shown in FIG. 22 are the same as the processes of steps S101 to S109 of the flow described above with reference to FIG. 20, except that the image generated in step S209 is a sample image for the learning processing.
This sample image is applied to the learning processing executed in step S210.
By performing the learning processing according to this sequence, a learner is generated. That is, the learner (CNN) 301 used by the normal information estimation unit 131 of the 3D shape analysis unit 130, described above with reference to FIG. 13, can be generated.
The learner (CNN) 301 receives as input the noise-removed skin image generated by the image analysis unit 120, that is, the "(e) noise-removed skin image" reflecting the uneven shape of the skin surface with the influence of disturbances such as body hair and spots removed, described with reference to FIG. 12, and outputs per-pixel normal information of the "(e) noise-removed skin image".
[6. Other configuration examples of the image acquisition unit (camera)]
Next, other configuration examples of the image acquisition unit (camera) 110, a component of the image processing apparatus 100, will be described.
One configuration example of the image acquisition unit (camera) 110 has been described above with reference to FIG. 4; configurations other than that shown in FIG. 4 are also possible.
FIG. 23 shows a configuration example of the image acquisition unit (camera) 110 that differs from the configuration shown in FIG. 4. The image acquisition unit (camera) 500 shown in FIG. 23 also has an image pickup unit 510 and an illumination unit 520 around the image pickup unit.
As shown in the figure, the illumination unit 520 around the image pickup unit 510 is composed of the following four types of illumination:
(a) Illumination A 521: a white LED with a polarizing filter, parallel to the polarizing filter set on the image pickup unit 510, installed on its front surface,
(b) Illumination B 522: a white LED with a polarizing filter, orthogonal to the polarizing filter set on the image pickup unit 510, installed on its front surface,
(c) Illumination C 523: illumination composed of a red LED,
(d) Illumination D 524: illumination composed of a near-infrared (NIR) LED.
Illuminations A and B are composed of LEDs that output light in the visible region of about 400 to 700 nm. Illumination C is composed of an LED that outputs light in the red (R) region of about 660 nm. Illumination D is composed of an LED that outputs light in the near-infrared (NIR) region of about 880 nm.
For the same skin area, the image acquisition unit (camera) 500 sequentially turns on these four types of illumination A to D and acquires four images captured under four different illumination environments.
The image pickup unit 510 is a camera with a polarizing filter mounted on its front surface; the infrared (IR) cut filter found in many general cameras has been removed. The image sensor of the image pickup unit 510 is similar to that of an ordinary camera, with the polarizing filter installed in front of it.
The processing when this image acquisition unit (camera) 500 is used differs from the case of the image acquisition unit (camera) 110 described with reference to FIG. 4 in the following points.
In the image capturing processing, the white LED (parallel-direction filter), the white LED (orthogonal-direction filter), the red LED, and the near-infrared (NIR) LED are turned on in sequence.
Further, in the calculation of the specular reflection component Is in the polarization signal analysis unit 121, that is, in calculating the specular reflection component Is reflected at the subject surface as the difference between the maximum luminance value Imax and the minimum luminance value Imin,
Is = Imax - Imin
the following pixel values are used for the maximum luminance value Imax and the minimum luminance value Imin.
The maximum luminance value Imax uses the pixels of the image captured when the polarization directions of the white LED and the camera are parallel. The minimum luminance value Imin uses the image captured when the polarization directions of the white LED and the camera are orthogonal, as sketched below.
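A minimal sketch of this two-shot variant, assuming NumPy arrays for the two captured images; the names are illustrative only.

```python
import numpy as np

def specular_from_two_shots(img_parallel: np.ndarray, img_orthogonal: np.ndarray):
    """Is = Imax - Imin: the parallel-polarizer shot supplies Imax and the
    crossed-polarizer shot supplies Imin, per pixel."""
    return img_parallel.astype(np.float64) - img_orthogonal.astype(np.float64)
```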
Further, FIG. 24 shows another configuration example of the image acquisition unit (camera) 110 that differs from the configuration shown in FIG. 4. The image acquisition unit (camera) 600 shown in FIG. 24 also has an image pickup unit 610 and an illumination unit 620 around the image pickup unit.
As shown in the figure, the illumination unit 620 around the image pickup unit 610 is composed of the following three types of illumination:
(a) Illumination A 621: a white LED with a polarizing filter, parallel to the polarizing filter set on the image pickup unit 610, installed on its front surface,
(b) Illumination B 622: a white LED with a polarizing filter, orthogonal to the polarizing filter set on the image pickup unit 610, installed on its front surface,
(c) Illumination C 623: illumination composed of a white LED.
Illuminations A, B, and C are all composed of LEDs that output light in the visible region of about 400 to 700 nm.
For the same skin area, the image acquisition unit (camera) 600 sequentially turns on these three types of illumination A to C and acquires three images captured under three different illumination environments.
The image pickup unit 610 is a camera with a polarizing filter mounted on its front surface; the infrared (IR) cut filter found in many general cameras has been removed. The image sensor of the image pickup unit 610 is similar to that of an ordinary camera, with the polarizing filter installed in front of it. Further, a color filter 611 is mounted in front of the polarizing filter.
As shown in the figure, the color filter 611 has a configuration in which the following three types of filters are arranged:
a red (R) filter that selectively transmits light with wavelengths near 660 nm,
a near-infrared (NIR) filter that selectively transmits light with wavelengths near 880 nm,
a visible light (Vis) filter that selectively transmits light with wavelengths of about 400 to 700 nm.
The processing when this image acquisition unit (camera) 600 is used differs from the case of the image acquisition unit (camera) 110 described with reference to FIG. 4 in the following points.
In the image capturing processing, the color filter 611 installed in front of the image pickup unit 610 is moved in sequence to change the wavelength band of the light incident on the image pickup unit 610, thereby acquiring a visible light component polarized image, a red component polarized image, and a near-infrared (NIR) component polarized image in sequence.
Using these three different color component polarized images, the image analysis unit 120 executes the polarization signal analysis processing in the polarization signal analysis unit 121, the dye signal analysis processing in the dye signal analysis unit 122, and the signal discrimination processing in the signal discrimination unit 123.
[7. Hardware configuration example of the image processing device]
Next, a hardware configuration example of the image processing apparatus 100 of the present disclosure will be described.
FIG. 25 is a diagram showing a hardware configuration example of the image processing apparatus. Each component of the hardware configuration shown in FIG. 25 will be described.
The CPU (Central Processing Unit) 701 functions as a data processing unit that executes various processes according to a program stored in the ROM (Read Only Memory) 702 or the storage unit 708, for example processing according to the sequences described in the above embodiments.
The RAM (Random Access Memory) 703 stores programs executed by the CPU 701 and data. The CPU 701, ROM 702, and RAM 703 are connected to one another by a bus 704.
The CPU 701 is connected to an input/output interface 705 via the bus 704. Connected to the input/output interface 705 are an input unit 706, consisting of a camera as well as various operation units, switches, and the like, and an output unit 707, consisting of a display, speakers, and the like serving as display means.
The CPU 701 receives camera-captured images, operation information, and the like input from the input unit 706, executes various processes, and outputs the processing results to, for example, the output unit 707.
The storage unit 708 connected to the input/output interface 705 consists of, for example, a hard disk and stores programs executed by the CPU 701 and various data. The communication unit 709 functions as a transmitting/receiving unit for data communication via a network such as the Internet or a local area network, and communicates with external devices.
The drive 710 connected to the input/output interface 705 drives removable media 711 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory such as a memory card, and records or reads data.
[8. Summary of the configuration of the present disclosure]
The embodiments of the present disclosure have been described above in detail with reference to specific examples. However, it is self-evident that those skilled in the art can modify or substitute the embodiments without departing from the gist of the present disclosure. That is, the present invention has been disclosed in the form of examples and should not be interpreted restrictively. To judge the gist of the present disclosure, the claims should be taken into consideration.
The technology disclosed in this specification can have the following configurations.
(1) An image processing device comprising:
an image acquisition unit that acquires a skin image;
an image analysis unit that analyzes the image acquired by the image acquisition unit; and
a three-dimensional shape analysis unit that analyzes the three-dimensional shape of the skin using the analysis results of the image analysis unit,
wherein the image acquisition unit acquires a plurality of polarized images of light of different wavelengths,
the image analysis unit analyzes the polarized images to generate a noise-removed skin image from which noise has been removed, and
the three-dimensional shape analysis unit analyzes the three-dimensional shape of the skin using the noise-removed skin image.
(2) The image processing device according to (1), wherein the image analysis unit analyzes the polarized images to generate a specular reflection component image of the skin surface and a melanin pigment concentration index value image, and generates the noise-removed skin image using the generated specular reflection component image and melanin pigment concentration index value image.
(3) The image processing device according to (1) or (2), wherein the noise is at least one of body hair, spots, and moles.
(4) The image processing device according to any one of (1) to (3), wherein the image acquisition unit has an illumination unit that selectively outputs light of different wavelengths.
(5) The image processing device according to any one of (1) to (4), wherein the image acquisition unit has an illumination unit that selectively outputs three types of light of different wavelengths, namely white light, red light, and near-infrared light, and acquires polarized images corresponding to these three types of light.
(6) The image processing device according to any one of (1) to (5), wherein the image acquisition unit is configured to capture a plurality of different polarized images on a pixel-by-pixel basis.
(7) The image processing device according to (6), wherein the image analysis unit executes demosaic processing of the plurality of different polarized images captured on a pixel-by-pixel basis.
(8) The image processing device according to any one of (1) to (7), wherein the image analysis unit generates a plurality of different polarized images using the images input from the image acquisition unit, and generates a specular reflection component image of the skin surface based on the generated polarized images and data on the correspondence between polarization angle and luminance.
(9) The image processing device according to any one of (1) to (8), wherein the image analysis unit generates a plurality of different polarized images using the images input from the image acquisition unit, and, based on the generated polarized images and a polarization model representing the correspondence between polarization angle and luminance, separates the polarization component signal into a specular reflection component and other component signals to generate a specular reflection component image of the skin surface.
(10) The image processing device according to (9), wherein the image analysis unit calculates the specular reflection component Is as the difference between the maximum luminance value Imax and the minimum luminance value Imin in the polarization model:
Is = Imax - Imin
(11) The image processing device according to any one of (1) to (10), wherein the image analysis unit generates a melanin pigment concentration index value image using an image captured under red light illumination and an image captured under near-infrared light illumination, both input from the image acquisition unit.
(12) The image processing device according to any one of (1) to (11), wherein the image analysis unit generates a composite image of the specular reflection component image of the skin surface generated by analyzing the polarized images and the melanin pigment concentration index value image, and generates the noise-removed skin image from the generated composite image and the specular reflection component image.
(13) The image processing device according to any one of (1) to (12), wherein the three-dimensional shape analysis unit has:
a normal information estimation unit that estimates normal information of the skin surface;
a distance information conversion unit that converts the normal information of the skin surface estimated by the normal information estimation unit into distance information indicating the uneven shape of the skin surface; and
a distance information analysis unit that calculates evaluation index values based on evaluation of the uneven shape of the skin surface, using the distance information generated by the distance information conversion unit.
(14) The image processing device according to (13), wherein the distance information analysis unit calculates at least one of the average roughness and the maximum height of the skin using a depth map showing the unevenness of the skin.
(15) The image processing device according to (13) or (14), wherein the normal information estimation unit inputs the noise-removed skin image generated by the image analysis unit into a learner and acquires normal information of the skin surface as the output of the learner.
(16) The image processing device according to any one of (1) to (15), further comprising a display unit that displays at least one of the analysis results of the image analysis unit and the analysis results of the three-dimensional shape analysis unit.
(17) The image processing device according to (16), wherein the display unit displays at least one of a three-dimensional image of the skin surface, a depth map showing the unevenness of the skin surface, and a melanin pigment concentration index value image.
(18) An image processing method executed in an image processing device, wherein:
an image acquisition unit executes image acquisition processing to acquire a skin image;
an image analysis unit executes image analysis processing to analyze the image acquired by the image acquisition unit; and
a three-dimensional shape analysis unit executes three-dimensional shape analysis processing to analyze the three-dimensional shape of the skin using the analysis results of the image analysis unit,
the image acquisition unit acquiring a plurality of polarized images of light of different wavelengths,
the image analysis unit analyzing the polarized images to generate a noise-removed skin image from which noise has been removed, and
the three-dimensional shape analysis unit analyzing the three-dimensional shape of the skin using the noise-removed skin image.
(19) A program that causes an image processing device to execute image processing, the program causing:
an image acquisition unit to execute image acquisition processing to acquire a skin image;
an image analysis unit to execute image analysis processing to analyze the image acquired by the image acquisition unit; and
a three-dimensional shape analysis unit to execute three-dimensional shape analysis processing to analyze the three-dimensional shape of the skin using the analysis results of the image analysis unit,
wherein, in the image acquisition processing, a plurality of polarized images of light of different wavelengths are acquired,
in the image analysis processing, the polarized images are analyzed to generate a noise-removed skin image from which noise has been removed, and
in the three-dimensional shape analysis processing, the three-dimensional shape of the skin is analyzed using the noise-removed skin image.
The series of processes described in the specification can be executed by hardware, by software, or by a combined configuration of both. When processing is executed by software, a program recording the processing sequence can be installed in memory in a computer built into dedicated hardware and executed, or the program can be installed and executed on a general-purpose computer capable of executing various kinds of processing. For example, the program can be recorded in advance on a recording medium. Besides being installed on a computer from a recording medium, the program can be received via a network such as a LAN (Local Area Network) or the Internet and installed on a recording medium such as a built-in hard disk.
The various processes described in the specification are not only executed in time series according to the description, but may also be executed in parallel or individually according to the processing capability of the device executing the processes or as needed. In this specification, a system is a logical set configuration of a plurality of devices, and the devices of each configuration are not limited to being in the same housing.
As described above, according to the configuration of one embodiment of the present disclosure, a configuration is realized that generates a noise-removed skin image accurately reflecting the unevenness of the skin with noise such as facial hair and spots removed, making it possible to analyze the three-dimensional shape of the skin with high accuracy.
Specifically, for example, the configuration has an image acquisition unit that acquires an image of skin such as the face, an image analysis unit that analyzes the skin image acquired by the image acquisition unit, and a three-dimensional shape analysis unit that analyzes the three-dimensional shape of the skin using the analysis results of the image analysis unit. The image acquisition unit acquires a plurality of polarized images of light of different wavelengths; the image analysis unit analyzes the polarized images to generate a specular reflection component image of the skin surface and a melanin pigment concentration index value image, and uses these images to generate a noise-removed skin image with noise such as body hair and spots removed. The three-dimensional shape analysis unit uses this noise-removed skin image to analyze the three-dimensional shape of the skin with high accuracy.
With this configuration, a noise-removed skin image that accurately reflects the unevenness of the skin with noise such as facial hair and spots removed can be generated, and the three-dimensional shape of the skin can be analyzed with high accuracy.
100 image processing device
110 image acquisition unit (camera)
111 multi-color polarized image acquisition unit
120 image analysis unit
121 polarization signal analysis unit
122 dye signal analysis unit
123 signal discrimination unit
130 three-dimensional (3D) shape analysis unit
131 normal information estimation unit
132 distance information conversion unit
133 distance information analysis unit
140 display unit
141 measurement information display unit
142 signal information display unit
143 three-dimensional shape display unit
144 measurement status display unit
210 image pickup unit
220 illumination unit
221-223 illuminations A to C
301 learner
401 learner
500 image acquisition unit (camera)
510 image pickup unit
520 illumination unit
600 image acquisition unit (camera)
610 image pickup unit
620 illumination unit
701 CPU
702 ROM
703 RAM
704 bus
705 input/output interface
706 input unit
707 output unit
708 storage unit
709 communication unit
710 drive
711 removable media

Claims (19)

  1.  An image processing device comprising:
     an image acquisition unit that acquires a skin image;
     an image analysis unit that analyzes the image acquired by the image acquisition unit; and
     a three-dimensional shape analysis unit that analyzes a three-dimensional shape of the skin using an analysis result of the image analysis unit,
     wherein the image acquisition unit acquires a plurality of polarized images under light of different wavelengths,
     the image analysis unit analyzes the polarized images to generate a noise-removed skin image from which noise has been removed, and
     the three-dimensional shape analysis unit analyzes the three-dimensional shape of the skin using the noise-removed skin image.
  2.  The image processing device according to claim 1, wherein the image analysis unit analyzes the polarized images to generate a specular reflection component image of the skin surface and a melanin pigment concentration index value image, and generates the noise-removed skin image using the generated specular reflection component image and melanin pigment concentration index value image.
  3.  The image processing device according to claim 1, wherein the noise is at least one of body hair, spots, and moles.
  4.  The image processing device according to claim 1, wherein the image acquisition unit has an illumination unit that selectively outputs light of different wavelengths.
  5.  The image processing device according to claim 1, wherein the image acquisition unit has an illumination unit that selectively outputs three types of light of different wavelengths, namely white light, red light, and near-infrared light, and acquires polarized images corresponding to each of these three wavelengths.
  6.  The image processing device according to claim 1, wherein the image acquisition unit is configured to capture a plurality of different polarized images on a pixel-by-pixel basis.
  7.  The image processing device according to claim 6, wherein the image analysis unit executes demosaic processing of the plurality of different polarized images captured on a pixel-by-pixel basis.
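 As one concrete reading of claims 6 and 7, the sketch below assumes a 2x2 on-sensor polarizer mosaic (0, 45, 90, 135 degrees), as found on common polarization image sensors; the mosaic layout and the channel-splitting style of demosaicing are assumptions, not taken from the disclosure.

    import numpy as np

    def demosaic_polarization(raw):
        """Split a 2x2 polarizer-mosaic frame into four quarter-resolution
        images, one per polarizer angle (assumed layout: 0/45 on even rows,
        135/90 on odd rows). A full demosaic would interpolate each channel
        back to sensor resolution, e.g. bilinearly."""
        i0   = raw[0::2, 0::2]
        i45  = raw[0::2, 1::2]
        i135 = raw[1::2, 0::2]
        i90  = raw[1::2, 1::2]
        return i0, i45, i90, i135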
  8.  The image processing device according to claim 1, wherein the image analysis unit generates a plurality of different polarized images using the image input from the image acquisition unit, and generates a specular reflection component image of the skin surface based on the generated polarized images and correspondence data between polarization angle and luminance.
  9.  The image processing device according to claim 1, wherein the image analysis unit generates a plurality of different polarized images using the image input from the image acquisition unit, and, based on the generated polarized images and a polarization model that is correspondence data between polarization angle and luminance, separates the polarization component signal into a specular reflection light component and other component signals to generate a specular reflection component image of the skin surface.
  10.  The image processing device according to claim 9, wherein the image analysis unit calculates the specular reflection light component Is as the difference between the maximum luminance value Imax and the minimum luminance value Imin in the polarization model:
      Is = Imax - Imin
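 A worked sketch of the claims 9 and 10 computation under the usual sinusoidal polarization model, I(theta) = (Imax + Imin)/2 + ((Imax - Imin)/2) * cos 2(theta - phi), assuming four polarizer angles of 0, 45, 90, and 135 degrees; the angle set is an assumption, since the disclosure specifies only the model, not the sampling.

    import numpy as np

    def specular_component(i0, i45, i90, i135):
        """Per-pixel specular component Is = Imax - Imin (claim 10) from
        the sinusoidal polarization model, via the standard linear-Stokes
        estimates over four polarizer angles 0/45/90/135 degrees."""
        i0, i45, i90, i135 = (np.asarray(x, dtype=float)
                              for x in (i0, i45, i90, i135))
        s0 = 0.5 * (i0 + i45 + i90 + i135)  # mean of the model over theta
        s1 = i0 - i90                        # cos(2*theta) coefficient
        s2 = i45 - i135                      # sin(2*theta) coefficient
        amp = np.hypot(s1, s2)               # equals Imax - Imin
        i_max = 0.5 * (s0 + amp)
        i_min = 0.5 * (s0 - amp)
        return i_max - i_min                 # the specular component Is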
  11.  The image processing device according to claim 1, wherein the image analysis unit generates a melanin pigment concentration index value image using a captured image under red light illumination and a captured image under near-infrared light illumination input from the image acquisition unit.
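 One common way to realize such an index, offered only as a hedged sketch: melanin absorbs red light more strongly than near-infrared light, so an absorbance difference between the two bands rises with pigment density. The formula's exact form and the coefficients below are illustrative placeholders, not values from the disclosure.

    import numpy as np

    def melanin_index_image(red, nir, k=1.0, bias=0.0, eps=1e-6):
        """Absorbance-difference melanin index from red and near-infrared
        reflectance images; higher values suggest denser pigment (body
        hair, spots, moles). k and bias are illustrative constants."""
        red = np.clip(np.asarray(red, dtype=float), eps, None)
        nir = np.clip(np.asarray(nir, dtype=float), eps, None)
        return k * (np.log10(1.0 / red) - np.log10(1.0 / nir)) + bias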
  12.  The image processing device according to claim 1, wherein the image analysis unit generates a composite image of the specular reflection component image of the skin surface generated by analyzing the polarized images and the melanin pigment concentration index value image, and generates the noise-removed skin image from the generated composite image and the specular reflection component image.
  13.  The image processing device according to claim 1, wherein the three-dimensional shape analysis unit has:
      a normal information estimation unit that estimates normal information of the skin surface;
      a distance information conversion unit that converts the normal information of the skin surface estimated by the normal information estimation unit into distance information indicating the uneven shape of the skin surface; and
      a distance information analysis unit that calculates an evaluation index value based on evaluation of the uneven shape of the skin surface using the distance information generated by the distance information conversion unit.
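 The normal-to-distance conversion of claim 13 can be realized by integrating the gradient field implied by the normals; below is a sketch using Frankot-Chellappa Fourier-domain integration, one standard choice that the disclosure does not mandate.

    import numpy as np

    def normals_to_depth(normals):
        """Integrate a unit-normal map of shape (H, W, 3) into a relative
        depth map: solve for the surface whose gradients best match
        (p, q) in the least-squares sense (Frankot-Chellappa)."""
        nz = np.clip(normals[..., 2], 1e-6, None)
        p = -normals[..., 0] / nz                  # dz/dx
        q = -normals[..., 1] / nz                  # dz/dy
        h, w = p.shape
        u, v = np.meshgrid(np.fft.fftfreq(w) * 2 * np.pi,
                           np.fft.fftfreq(h) * 2 * np.pi)
        denom = u ** 2 + v ** 2
        denom[0, 0] = 1.0                          # avoid 0/0 at the DC term
        z_hat = (-1j * u * np.fft.fft2(p)
                 - 1j * v * np.fft.fft2(q)) / denom
        z_hat[0, 0] = 0.0                          # absolute offset is unknown
        return np.real(np.fft.ifft2(z_hat))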
  14.  The image processing device according to claim 13, wherein the distance information analysis unit calculates at least one of the average roughness and the maximum height of the skin using a depth map indicating the unevenness of the skin.
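 A sketch of how the claim 14 quantities might be computed from such a depth map, assuming definitions modeled on standard surface-texture parameters (arithmetic mean roughness and peak-to-valley height); the disclosure names the quantities but not their formulas.

    import numpy as np

    def skin_roughness(depth):
        """Average roughness (mean absolute deviation from the mean
        plane) and maximum height (peak-to-valley range) of a depth map."""
        z = depth - depth.mean()
        average_roughness = np.abs(z).mean()
        maximum_height = z.max() - z.min()
        return average_roughness, maximum_height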
  15.  The image processing device according to claim 13, wherein the normal information estimation unit inputs the noise-removed skin image generated by the image analysis unit to a learner, and acquires normal information of the skin surface as the output of the learner.
  16.  The image processing device according to claim 1, further comprising a display unit that displays at least one of the analysis result of the image analysis unit and the analysis result of the three-dimensional shape analysis unit.
  17.  The image processing device according to claim 16, wherein the display unit displays at least one of a three-dimensional image of the skin surface, a depth map indicating the unevenness of the skin surface, and a melanin pigment concentration index value image.
  18.  An image processing method executed in an image processing device, comprising:
      an image acquisition process in which an image acquisition unit acquires a skin image;
      an image analysis process in which an image analysis unit analyzes the image acquired by the image acquisition unit; and
      a three-dimensional shape analysis process in which a three-dimensional shape analysis unit analyzes a three-dimensional shape of the skin using an analysis result of the image analysis unit,
      wherein the image acquisition unit acquires a plurality of polarized images under light of different wavelengths,
      the image analysis unit analyzes the polarized images to generate a noise-removed skin image from which noise has been removed, and
      the three-dimensional shape analysis unit analyzes the three-dimensional shape of the skin using the noise-removed skin image.
  19.  A program that causes an image processing device to execute image processing, the program causing:
      an image acquisition unit to execute an image acquisition process of acquiring a skin image;
      an image analysis unit to execute an image analysis process of analyzing the image acquired by the image acquisition unit; and
      a three-dimensional shape analysis unit to execute a three-dimensional shape analysis process of analyzing a three-dimensional shape of the skin using an analysis result of the image analysis unit,
      wherein the image acquisition process acquires a plurality of polarized images under light of different wavelengths,
      the image analysis process analyzes the polarized images to generate a noise-removed skin image from which noise has been removed, and
      the three-dimensional shape analysis process analyzes the three-dimensional shape of the skin using the noise-removed skin image.
PCT/JP2021/015389 2020-05-15 2021-04-14 Image processing device, image processing method, and program WO2021229984A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2020-086294 2020-05-15
JP2020086294 2020-05-15

Publications (1)

Publication Number Publication Date
WO2021229984A1 true WO2021229984A1 (en) 2021-11-18

Family

ID=78525719

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2021/015389 WO2021229984A1 (en) 2020-05-15 2021-04-14 Image processing device, image processing method, and program

Country Status (2)

Country Link
TW (1) TW202147258A (en)
WO (1) WO2021229984A1 (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2000193527A (en) * 1998-12-25 2000-07-14 Kao Corp Method for forming three-dimensional simulation image
WO2003096272A1 (en) * 2002-05-09 2003-11-20 Sony Corporation Method of detecting biological pattern, biological pattern detector, method of biological certificate and biological certificate apparatus
WO2013099628A1 (en) * 2011-12-27 2013-07-04 ソニー株式会社 Image processing device, image processing system, image processing method, and program
WO2013114891A1 (en) * 2012-02-03 2013-08-08 パナソニック株式会社 Imaging device and imaging system

Also Published As

Publication number Publication date
TW202147258A (en) 2021-12-16

Similar Documents

Publication Publication Date Title
JP6039109B2 (en) Coloring inspection apparatus and coloring inspection method
JP4762369B2 (en) Image processing device
CN102973242B (en) Image processing equipment, image processing method, image processing system, program and recording medium
USRE47921E1 (en) Reflectance imaging and analysis for evaluating tissue pigmentation
EP3000386B1 (en) Skin function evaluation device and skin evaluation method
Cao et al. Sparse photometric 3D face reconstruction guided by morphable models
CN109152535B (en) Skin diagnosis device and skin diagnosis method
JP6799155B2 (en) Information processing device, information processing system, and subject information identification method
EP3382645B1 (en) Method for generation of a 3d model based on structure from motion and photometric stereo of 2d sparse images
EP2084499A2 (en) Apparatus and method for analyzing skin using l*a*b* colorspace
Hernández et al. Overcoming shadows in 3-source photometric stereo
CN110740679A (en) Skin radiance measurement for quantitative estimation of skin radiance
Vogiatzis et al. Self-calibrated, multi-spectral photometric stereo for 3D face capture
TW202103484A (en) System and method for creation of topical agents with improved image capture
JP5565765B2 (en) Derivation method of melanoma discrimination index
JP2014087464A (en) Skin evaluation method and skin evaluation device
Vazquez-Corral et al. Color constancy algorithms: Psychophysical evaluation on a new dataset
Murai et al. Surface normals and shape from water
JP2006061170A (en) Discrimination method for skin
JP2008167853A (en) Test sheet, object diagnostic apparatus and method, and program
WO2021229984A1 (en) Image processing device, image processing method, and program
JPWO2019123625A1 (en) Information processing device and surface roughness acquisition method
JP6527765B2 (en) Wrinkle state analyzer and method
Powell et al. A methodology for extracting objective color from images
Zhou et al. Combinatorial photometric stereo and its application in 3D modeling of melanoma

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21803712

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21803712

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP