WO2024034630A1 - Image processing device, model generation device, skin state measurement system, image processing method, model generation method, determining method, and teacher data creation method - Google Patents

Image processing device, model generation device, skin state measurement system, image processing method, model generation method, determining method, and teacher data creation method

Info

Publication number
WO2024034630A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
captured image
learning
subject
correct
Prior art date
Application number
PCT/JP2023/029036
Other languages
French (fr)
Japanese (ja)
Inventor
草太 渡部
Original Assignee
ライオン株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ライオン株式会社
Publication of WO2024034630A1

Classifications

    • A - HUMAN NECESSITIES
    • A61 - MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B - DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B 10/00 - Other methods or instruments for diagnosis, e.g. instruments for taking a cell sample, for biopsy, for vaccination diagnosis; Sex determination; Ovulation-period determination; Throat striking implements
    • A - HUMAN NECESSITIES
    • A61 - MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B - DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B 5/00 - Measuring for diagnostic purposes; Identification of persons
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis

Definitions

  • The present invention relates to an image processing device, a model generation device, a skin condition measurement system, an image processing method, a model generation method, a determination method, and a teacher data creation method.
  • Conventionally, the presence of Propionibacterium acnes and the condition of the skin have been investigated by detecting porphyrins on the skin, which are produced as metabolites by Propionibacterium acnes, the bacterium that causes acne.
  • As a method for detecting porphyrins on the skin, for example, the porphyrin detection method described in Patent Document 1 is known.
  • In this method, the skin surface is irradiated with weak ultraviolet light including visible light, and the skin surface is imaged by a CCD color camera equipped with a CCD (charge-coupled device) cooling mechanism.
  • A color image signal of the skin surface output from the CCD color camera is displayed as a still image on a television monitor, and the still image is observed.
  • The present invention has been made in consideration of these circumstances, and its purpose is to provide an image processing device, a model generation device, a skin condition measurement system, an image processing method, and a model generation method that can detect porphyrins on the skin of a subject without irradiating the subject with ultraviolet rays.
  • One aspect of the present invention is an image processing device comprising: a subject captured image acquisition unit that acquires a subject captured image in which the skin of a subject irradiated with visible light is captured; and an output image acquisition unit that uses a trained model to acquire, from the subject captured image, an output image that is an image of the subject's skin in which fluorescence caused by porphyrins has occurred.
  • The trained model is a model machine-learned, using a learning captured image in which the skin of a person irradiated with visible light is captured and a correct captured image in which the skin of a person in which fluorescence caused by porphyrins has occurred due to irradiation with ultraviolet rays is captured, to generate an image of human skin in which fluorescence caused by porphyrins has occurred from a captured image of human skin irradiated with visible light.
  • The machine learning of the trained model uses a compressed learning captured image obtained by compressing a part of the learning captured image and a compressed correct captured image obtained by compressing a part of the correct captured image.
  • One aspect of the present invention is the image processing device described above, wherein the skin is facial skin, and the portion is around the forehead, nose, or mouth.
  • One aspect of the present invention is the image processing apparatus described above, in which the machine learning of the learned model uses an emphasized correct captured image in which a red component is emphasized with respect to the correct captured image.
  • One aspect of the present invention is the image processing device described above, in which the correct captured image is an RGB image composed of an R image, a G image, and a B image, and the emphasized correct captured image is an RGB image generated by adding, to the RGB image, a difference image obtained by subtracting the G image from the R image.
  • One aspect of the present invention is the image processing device described above, in which the learned model is composed of a layer of an encoder portion and a layer of a decoder portion.
  • One aspect of the present invention is the image processing device described above, in which the learned model has a U-Net structure in which the layers of the encoder portion and the layers of the decoder portion have a symmetrical structure and are connected by skip connections.
  • One aspect of the present invention is a model generation device comprising: a learning captured image acquisition unit that acquires a learning captured image in which the skin of a person irradiated with visible light is captured; a correct captured image acquisition unit that acquires a correct captured image in which the skin of a person in which fluorescence caused by porphyrins has occurred due to irradiation with ultraviolet rays is captured; and a model generation unit that, using the learning captured image and the correct captured image, performs machine learning on a model to generate an image of human skin in which fluorescence caused by porphyrins has occurred from a captured image of human skin irradiated with visible light, and generates a learned model.
  • One aspect of the present invention is the model generation device described above, in which the model generation unit generates a compressed learning captured image obtained by compressing a portion of the learning captured image and a compressed correct captured image obtained by compressing a portion of the correct captured image, and uses the compressed learning captured image and the compressed correct captured image for machine learning of the model.
  • One aspect of the present invention is the model generating device described above, wherein the skin is facial skin, and the portion is around the forehead, nose, or mouth.
  • One aspect of the present invention is the model generation device described above, in which the model generation unit generates an emphasized correct captured image in which a red component is emphasized with respect to the correct captured image, and uses the emphasized correct captured image for machine learning of the model.
  • One aspect of the present invention is the model generation device described above, in which the correct captured image is an RGB image composed of an R image, a G image, and a B image, and the emphasized correct captured image is an RGB image generated by adding, to the RGB image, a difference image obtained by subtracting the G image from the R image.
  • One aspect of the present invention is the model generation device described above, in which the model is composed of a layer of an encoder portion and a layer of a decoder portion.
  • One aspect of the present invention is the model generation device described above, wherein the model has a U-Net structure in which the layers of the encoder portion and the layers of the decoder portion have a symmetrical structure and are connected by skip connections.
  • One aspect of the present invention is a skin condition measurement system that includes the above-described image processing device and the above-described model generation device.
  • One aspect of the present invention is an image processing method including: a step of acquiring a subject captured image in which the skin of a subject irradiated with visible light is captured; and a step of using a trained model to acquire, from the subject captured image, an output image that is an image of the subject's skin in which fluorescence caused by porphyrins has occurred.
  • The trained model is a model machine-learned, using a learning captured image in which the skin of a person irradiated with visible light is captured and a correct captured image in which the skin of a person in which fluorescence caused by porphyrins has occurred due to irradiation with ultraviolet rays is captured, to generate an image of human skin in which fluorescence caused by porphyrins has occurred from a captured image of human skin irradiated with visible light.
  • One aspect of the present invention is the image processing method described above, in which the machine learning of the trained model uses a compressed learning captured image obtained by compressing a part of the learning captured image and a compressed correct captured image obtained by compressing a part of the correct captured image.
  • One aspect of the present invention is the image processing method described above, wherein the skin is facial skin, and the portion is around the forehead, nose, or mouth.
  • One aspect of the present invention is the image processing method described above, in which the machine learning of the learned model uses an emphasized correct captured image in which a red component is emphasized with respect to the correct captured image.
  • One aspect of the present invention is the image processing method described above, in which the correct captured image is an RGB image composed of an R image, a G image, and a B image, and the emphasized correct captured image is an RGB image generated by adding, to the RGB image, a difference image obtained by subtracting the G image from the R image.
  • One aspect of the present invention is the image processing method described above, in which the learned model is composed of a layer of an encoder portion and a layer of a decoder portion.
  • One aspect of the present invention is the image processing method described above, wherein the learned model has a U-Net structure in which the layers of the encoder portion and the layers of the decoder portion have a symmetrical structure and are connected by skip connections.
  • One aspect of the present invention is a model generation method including: a step of acquiring a learning captured image in which the skin of a person irradiated with visible light is captured; a step of acquiring a correct captured image in which the skin of a person in which fluorescence caused by porphyrins has occurred due to irradiation with ultraviolet rays is captured; and a model generation step of, using the learning captured image and the correct captured image, performing machine learning on a model to generate an image of human skin in which fluorescence caused by porphyrins has occurred from a captured image of human skin irradiated with visible light, and generating a learned model.
  • One aspect of the present invention is the model generation method described above, in which the model generation step generates a compressed learning captured image obtained by compressing a portion of the learning captured image and a compressed correct captured image obtained by compressing a portion of the correct captured image, and uses the compressed learning captured image and the compressed correct captured image for machine learning of the model.
  • One aspect of the present invention is the model generation method described above, wherein the skin is facial skin, and the part is around the forehead, nose, or mouth.
  • One aspect of the present invention is the model generation method described above, in which the model generation step generates an emphasized correct captured image in which a red component is emphasized with respect to the correct captured image, and uses the emphasized correct captured image for machine learning of the model.
  • One aspect of the present invention is the model generation method described above, in which the correct captured image is an RGB image composed of an R image, a G image, and a B image, and the emphasized correct captured image is an RGB image generated by adding, to the RGB image, a difference image obtained by subtracting the G image from the R image.
  • One aspect of the present invention is the model generation method described above, in which the model is composed of an encoder portion layer and a decoder portion layer.
  • One aspect of the present invention is the model generation method described above, wherein the model has a U-Net structure in which the layers of the encoder portion and the layers of the decoder portion have a symmetrical structure and are connected by skip connections.
  • One aspect of the present invention is the image processing method described above, in which at least one projection reference point is set in the step of acquiring the learning captured image and the correct captured image.
  • One aspect of the present invention is a determination method for determining susceptibility to acne by comparing the output image acquired by the image processing device described above with a standard image, which is an image of the skin of a typical person who does not have acne.
  • One aspect of the present invention is a determination method for evaluating the effect of skin care before and after a subject performs skin care, including: a first acquisition step of acquiring the output image acquired by the image processing device described above based on a subject captured image taken before the subject performs skin care; and a second acquisition step of acquiring the output image acquired by the image processing device based on a subject captured image taken after the subject performs skin care, wherein the output image acquired in the first acquisition step and the output image acquired in the second acquisition step are compared to determine the degree of improvement due to skin care.
  • One aspect of the present invention is a teacher data creation method including: a learning captured image acquisition step of acquiring a learning captured image, which is a captured image of human skin irradiated with visible light in which a portion having a reference point is captured; a correct captured image acquisition step of acquiring a correct captured image, which is an image of the same subject as the learning captured image, captured by an imaging device different from the imaging device that captured the learning captured image, and in which fluorescence caused by porphyrins has occurred due to irradiation with ultraviolet rays; a first extraction step of extracting a partial region based on the reference point from the learning captured image; a second extraction step of extracting a partial region based on the reference point from the correct captured image; and a step of creating teacher data from the extracted images.
  • One aspect of the present invention is the teacher data creation method described above, further including an image processing step of generating an image matched to the learning captured image by performing image processing on the correct captured image, wherein the second extraction step extracts a partial region based on the reference point from the correct captured image that has been subjected to image processing in the image processing step.
  • FIG. 1 is a block diagram illustrating a configuration example of a skin condition measurement system according to an embodiment.
  • FIG. 2 is a configuration diagram showing a schematic configuration example of a model according to an embodiment.
  • FIG. 3 is a flowchart illustrating an example of a procedure of a model generation method according to an embodiment.
  • FIG. 4 is a diagram illustrating an example of cut-out portions of a learning captured image and a correct captured image according to an embodiment.
  • FIG. 5 is an example of an image for explaining preprocessing part 2 of a correct captured image according to an embodiment.
  • FIG. 6 is a flowchart illustrating an example of a procedure of an image processing method according to an embodiment.
  • FIG. 7 is an image diagram for explaining the difference in imaging between a learning captured image and a correct captured image according to an embodiment.
  • FIG. 8 is an image diagram of reference points used for creating teacher data according to an embodiment.
  • FIG. 9 is a diagram illustrating an example of preprocessing of a correct captured image and cutting out of a learning captured image and a correct captured image according to an embodiment.
  • FIG. 1 is a block diagram showing a configuration example of a skin condition measuring system according to an embodiment.
  • the skin condition measurement system shown in FIG. 1 includes an image processing device 10 and a model generation device 20.
  • the image processing device 10 and the model generation device 20 may exchange data online through communication, or may input and output data offline.
  • Subject U is a person who uses the skin condition measurement system and has his skin condition measured.
  • the subject U uses the camera of the smartphone 30 to capture an image of the skin of the area that he/she wants to have measured, and sends the captured image (subject captured image) A to the image processing device 10 using the smartphone 30.
  • For example, the subject U takes an image of his or her entire face, or of the area around the forehead, nose, and mouth where acne is a concern, using the camera of the smartphone 30, and sends the subject captured image A of the captured face to the image processing device 10 using the smartphone 30.
  • the subject image A is a captured image of the subject U's skin irradiated with visible light.
  • The light source that emits visible light includes wavelengths from 400 to 770 nm. Since it is not necessary to irradiate ultraviolet rays, the ultraviolet (UV) intensity of the light source is preferably as weak as possible in order to reduce damage to the skin of the subject U; for example, it is preferably below the detection limit of a UV measuring device.
  • Examples of light sources that emit visible light include natural light such as sunlight and moonlight, and artificial light such as incandescent light bulbs, fluorescent light bulbs, xenon lamps, and LED light bulbs.
  • Although the light source may contain ultraviolet rays, artificial light with low UV intensity is preferable, and among artificial lights, xenon lamps and LED bulbs are more preferable.
  • the subject U uses a smartphone 30 that is equipped with a camera (imaging device), but the present invention is not limited to this.
  • a mobile terminal device such as a tablet computer (tablet PC) equipped with a camera may be used.
  • a captured image captured by a digital camera may be transmitted to the image processing device 10 by a communication terminal such as a smartphone.
  • As a color correction method, for example, a color chart may be captured together with the subject U when photographing.
  • As a position correction method, for example, when photographing the subject U, an ArUco marker, a landmark sticker, or the like is captured in the image and used as a reference point, and projective transformation is performed using image editing software, as sketched below.
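  • As an illustration of the projective-transformation-based position correction just described, the following sketch (hypothetical; not part of the patent text) warps one face image onto another using four manually specified reference points with OpenCV. The point coordinates and file names are assumptions.

```python
# Hypothetical sketch: projective (homography) alignment of two face images
# using four reference points (e.g. landmark stickers or ArUco marker corners).
# Coordinates and file names are illustrative assumptions.
import cv2
import numpy as np

src = cv2.imread("captured_uv.jpg")       # image to be warped
dst = cv2.imread("captured_visible.jpg")  # reference image used as the target

# Pixel coordinates of the same 4 reference points in each image, same order.
src_pts = np.float32([[512, 430], [1480, 410], [1500, 1650], [530, 1680]])
dst_pts = np.float32([[498, 455], [1465, 440], [1490, 1688], [515, 1702]])

# Estimate the projective transformation and warp the source image so that
# its reference points land on the corresponding points of the target image.
H = cv2.getPerspectiveTransform(src_pts, dst_pts)
aligned = cv2.warpPerspective(src, H, (dst.shape[1], dst.shape[0]))

cv2.imwrite("captured_uv_aligned.jpg", aligned)
```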
  • Note that the image processing device 10 described later may be incorporated and integrated into the mobile terminal device. In this case, it is unnecessary to transmit the subject captured image A of the captured face to the image processing device 10 using the smartphone 30, and it is also unnecessary to transmit the output image B, which will be described later, from the image processing device 10 to the smartphone 30 of the subject U.
  • the image processing device 10 includes a model storage section 11 , a subject captured image acquisition section 12 , and an output image acquisition section 13 .
  • Each function of the image processing device 10 is realized by the image processing device 10 including computer hardware such as a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), and a memory, and by the CPU executing a computer program stored in the memory.
  • the image processing device 10 may be configured using a general-purpose computer device, or may be configured as a dedicated hardware device.
  • the image processing device 10 may be configured using a server computer connected to a communication network such as the Internet.
  • each function of the image processing device 10 may be realized by cloud computing.
  • the image processing device 10 may be realized by a single computer, or the functions of the image processing device 10 may be realized by distributing the functions to a plurality of computers.
  • the image processing apparatus 10 may be configured to open a website using, for example, a WWW system.
  • the model storage unit 11 stores the learned model MDa.
  • the learned model MDa is provided from the model generation device 20.
  • The learned model MDa is a model machine-learned, using a learning captured image in which the skin of a person irradiated with visible light is captured and a correct captured image in which the skin of a person in which fluorescence caused by porphyrins has occurred due to ultraviolet irradiation is captured, to generate an image of human skin in which fluorescence caused by porphyrins has occurred from a captured image of human skin irradiated with visible light.
  • the subject captured image acquisition unit 12 acquires the subject captured image A.
  • the subject captured image A is a captured image of the subject U's skin irradiated with visible light.
  • the subject captured image acquisition unit 12 receives the subject captured image A transmitted from the smartphone 30 of the subject U via the communication line.
  • the subject captured image acquisition unit 12 may compress the subject captured image A to a predetermined compression size and convert it into a subject compressed captured image A'. For example, when the subject captured image size is "3456 x 5184" pixels, it is compressed to "256 x 256" pixels.
  • Alternatively, a portion with an image size of "768 x 768" pixels may be cut out from the subject captured image A of "3456 x 5184" pixels, and the image of the cut-out portion of "768 x 768" pixels may be compressed to "256 x 256" pixels.
  • The portions to be cut out may cover the subject's entire face at equal intervals, and it is also preferable to cut out the area around the forehead, nose, or mouth, where porphyrin is likely to be present.
  • the "256x256" pixel image is the subject compressed captured image A'. Note that the image size values such as "3456 x 5184" pixels, "768 x 768" pixels, and "256 x 256" pixels described above are examples of image sizes, and are not limited thereto.
  • The output image acquisition unit 13 uses the learned model MDa to acquire, from the subject captured image A or the subject compressed captured image A', an output image B or an output image B', which is an image of the subject's skin in which fluorescence due to porphyrin has occurred.
  • the trained model MDa outputs an output image B when the captured image A of the subject is input.
  • Output image B is an image inferred by trained model MDa from subject image A of subject U, and is an image of subject U's skin in which fluorescence due to porphyrin has occurred.
  • the output image B is an image of the face of the subject U in which fluorescence caused by porphyrin is inferred from the subject captured image A of the face of the subject U by the learned model MDa.
  • the learned model MDa outputs an output image B' when the compressed captured image A' of the subject is input.
  • the output image B' is an image inferred by the learned model MDa from the subject compressed captured image A' of the subject U, and is an image of the subject U's skin in which fluorescence caused by porphyrin has occurred.
  • the output image B' is an image of the subject U's face in which fluorescence due to porphyrin has occurred, which is inferred by the trained model MDa from the subject compressed captured image A' of the subject's U's face. Since the output image B' is a compressed image, the output image acquisition unit 13 expands the output image B' to the same size as the captured image A of the subject and restores it to the output image B.
  • When a plurality of cut-out portions are used, the output image acquisition unit 13 restores each output image B' to an output image B and further joins the restored output images to obtain an output image B of the same size as the subject captured image A.
  • the output image acquisition unit 13 transmits the output image B to the smartphone 30 of the subject U via the communication line.
  • the subject U receives the output image B transmitted from the image processing device 10 with the smartphone 30 via the communication line.
  • the subject U can visually recognize the state of porphyrin on his or her skin using the output image B.
  • For example, the subject U transmits a subject captured image A of his or her own face, where acne is a concern, to the image processing device 10 using the smartphone 30, and in response receives an output image B from the image processing device 10 on the smartphone 30. By displaying the output image B on the display screen of the smartphone 30, the subject U can visually recognize the state of porphyrins on his or her own face where acne is a concern.
  • the model generation device 20 includes a learning captured image acquisition section 21, a correct captured image acquisition section 22, a model generation section 23, and a model output section 24.
  • Each function of the model generation device 20 is realized by the model generation device 20 including computer hardware such as a CPU, GPU, and memory, and the CPU executing a computer program stored in the memory.
  • the model generation device 20 may be configured using a general-purpose computer device, or may be configured as a dedicated hardware device.
  • the model generation device 20 may be configured using a server computer connected to a communication network such as the Internet.
  • each function of the model generation device 20 may be realized by cloud computing.
  • the model generation device 20 may be realized by a single computer, or the functions of the model generation device 20 may be realized by distributing it to a plurality of computers.
  • the learning captured image acquisition unit 21 acquires the learning captured image C.
  • the correct captured image acquisition unit 22 acquires the correct captured image D.
  • The learning captured image C and the correct captured image D are acquired from a person (selected subject) selected for acquiring learning data.
  • the learning captured image C is a captured image of the selected subject's skin irradiated with visible light.
  • the learning captured image C is a captured image in which the face of the selected subject is irradiated with visible light.
  • the correct captured image D is an image of the selected subject's skin in which fluorescence due to porphyrin has occurred due to irradiation with ultraviolet rays.
  • the correct captured image D is an image of the selected subject's face in which fluorescence due to porphyrin is generated due to irradiation with ultraviolet rays.
  • a light source that emits ultraviolet light includes a wavelength of 100 to 400 nm.
  • Examples of the light source that irradiates ultraviolet rays include a mercury lamp, an ultraviolet LED lamp, and a black light.
  • the learning captured image C and the correct captured image D have the same angle of view and position.
  • An example of an imaging device for acquiring the learning captured image C and the correct captured image D having the same angle of view and position is "VISIA (registered trademark), manufactured by Canfield Scientific.”
  • a smartphone may be used instead of the VISIA as the imaging device for acquiring the captured learning image C.
  • In that case, it is recommended to specify at least one reference point, preferably 4 to 8, at the same locations within the captured images.
  • The model generation unit 23 uses the learning captured image C and the correct captured image D to perform machine learning on the model MD so that it generates an image of human skin in which fluorescence due to porphyrin has occurred from a captured image of human skin irradiated with visible light, and generates a learned model MDa.
  • the trained model MDa outputs an output image B or an output image B' when the subject captured image A or the subject compressed captured image A' is input.
  • the model output unit 24 outputs the trained model MDa generated by the model generation unit 23. Examples of methods for outputting the trained model MDa include writing to a computer-readable recording medium and transmitting data via communication.
  • the learned model MDa output by the model output unit 24 is provided to the image processing device 10.
  • the model storage unit 11 stores the provided trained model MDa.
  • image processing device 10 and the model generation device 20 may be realized using the same information processing device, or may be realized using different information processing devices.
  • FIG. 2 is a model configuration diagram showing a schematic configuration example of the model MD according to the present embodiment.
  • the model MD shown in FIG. 2 has a U-Net structure in which the encoder 110 layer and the decoder 120 layer have a symmetrical structure, and each layer is connected to each other by a skip connection 130.
  • the encoder 110 encodes the input image Pin in stages according to each layer.
  • the decoder 120 decodes the encoding result of the encoder 110 in stages according to each layer.
  • the decoding result of the decoder 120 is output from the model MD as an output image Pout.
  • Each layer of the encoder 110 and each corresponding layer of the decoder 120 are connected to each other by a skip connection 130.
  • model MD may be a model that does not include the skip connection 130 in FIG. 2.
  • The model MD may also be, for example, a model based on a GAN (Generative Adversarial Networks).
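  • The encoder/decoder structure with skip connections described above can be sketched as follows (PyTorch, hypothetical; the number of layers and channel widths are assumptions and do not reflect the actual model MD).

```python
# Minimal U-Net-style encoder/decoder with skip connections (hypothetical
# sketch; layer counts and channel widths are assumptions, not the actual MD).
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
    )

class TinyUNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc1 = conv_block(3, 32)                  # encoder layers
        self.enc2 = conv_block(32, 64)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = conv_block(64, 128)
        self.up2 = nn.ConvTranspose2d(128, 64, 2, stride=2)
        self.dec2 = conv_block(128, 64)                # 64 (upsampled) + 64 (skip)
        self.up1 = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec1 = conv_block(64, 32)                 # 32 (upsampled) + 32 (skip)
        self.out = nn.Conv2d(32, 3, 1)                 # RGB output image Pout

    def forward(self, x):                              # x: input image Pin, (B, 3, 256, 256)
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        b = self.bottleneck(self.pool(e2))
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))   # skip connection
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))  # skip connection
        return torch.sigmoid(self.out(d1))
```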
  • FIG. 3 is a flowchart illustrating an example of the procedure (model learning step S100) of the model generation method according to the present embodiment.
  • The model learning step S100 is a step executed by the model generation device 20, in which machine learning is performed on the model MD using the learning captured image C and the correct captured image D to generate a learned model MDa.
  • the learning captured image acquisition unit 21 acquires learning captured images C for a plurality of selected subjects.
  • the correct captured image acquisition unit 22 acquires the correct captured images D for the plurality of selected subjects.
  • the learning captured image C and the correct captured image D are RGB images of the selected subject's face captured in color.
  • the RGB image is composed of a red component image (R image), a green component image (G image), and a blue component image (B image).
  • the model generation device 20 associates and holds the learning captured image C and the correct captured image D of the same selected subject. In the learning captured image C and the correct captured image D of the same selected subject, the positions of the faces appearing in the captured images are aligned.
  • To align the positions of the faces, position correction is performed.
  • Examples of the position correction method include the methods described above; this will be explained in detail below.
  • For example, projective transformation is performed to align an image captured by a smartphone under visible light irradiation with an image captured under ultraviolet irradiation in which fluorescence caused by porphyrins has occurred.
  • The specific procedure is to specify at least one reference point, preferably 4 to 8, at the same locations in the two images, and then realize alignment by matching the images using these reference points.
  • Examples of the reference points specified above include a mole, a pseudo-mole drawn with a marker, and an attached sticker.
  • For example, a sticker with a very small hole made in advance is pasted on the cheek area surrounding these areas.
  • The pixel value correlation coefficient when a mark was made by making a hole in the sticker with a needle (0.5 mm thick) was "1.23", while the pixel value correlation coefficient when a mark was made by making a hole in the sticker with an ultra-fine needle (0.4 mm thick) was "3.20".
  • Step S102 The model generation unit 23 performs preprocessing on the learning captured image C and the correct captured image D.
  • preprocessing for the learning captured image C and the correct captured image D will be explained.
  • Preprocessing part 1: In preprocessing part 1, the model generation unit 23 generates, as images used for machine learning of the model MD, a compressed learning captured image obtained by compressing a portion of the learning captured image C and a compressed correct captured image obtained by compressing a portion of the correct captured image D.
  • FIG. 4 shows portions (cutout portions) cut out from each of the learning captured image C and the correct captured image D.
  • portions of the face are cut out from around the forehead, nose, and mouth. This is because the areas around the forehead, nose, and mouth are areas where acne is particularly likely to occur, that is, areas where acne-causing bacteria are likely to occur and a large amount of porphyrin is likely to be detected.
  • The cutout portion may be cut out from at least the forehead, nose, or mouth area of the face. Further, the cutout portion may be cut out from other parts of the face (for example, the cheeks) other than the forehead, nose, and area around the mouth, and the learning captured image C and the correct captured image D may each be cut out at equal intervals over the entire face of the selected subject.
  • the model generation unit 23 compresses the cut-out portion of the image to a predetermined compression size. For example, if the image size of the original learning captured image C and correct captured image D is "3456 x 5184" pixels, the image size of the cutout part is set to “768 x 768" pixels, and the "768 x 768" pixels are The image of the cutout part is compressed to "256 x 256" pixels. The "256x256" pixel image obtained by compressing the image of the "768x768" pixel cutout portion cut out from the original learning captured image C is the compressed learning captured image.
  • the "256x256" pixel image obtained by compressing the image of the "768x768" pixel cutout portion cut out from the original correct captured image D is the compressed correct captured image.
  • For the image size of the cutout portions of the learning captured image C and the correct captured image D, "768 x 768" pixels is preferred in order to minimize blurring of the output image caused by differences in the position of the face due to movement of the same selected subject between the learning captured image C and the correct captured image D.
  • the image size values such as "3456 x 5184" pixels, "768 x 768" pixels, and "256 x 256" pixels described above are examples of image sizes, and are not limited thereto.
  • Preprocessing part 2: In preprocessing part 2, the model generation unit 23 generates an emphasized correct captured image in which the red component of the correct captured image D is emphasized.
  • preprocessing part 2 is performed on the compressed correct captured image generated in preprocessing part 1 described above.
  • The emphasized correct captured image is an RGB image generated by adding, to the RGB image of the correct captured image D, a difference image obtained by subtracting the G image from the R image among the R image, G image, and B image constituting the RGB image.
  • FIG. 5 is an example of an image for explaining the second preprocessing of the correct captured image D according to the present embodiment.
  • FIG. 5(1) is a compressed correct image (RGB image) generated in preprocessing part 1.
  • FIGS. 5(2), (3), and (4) are R, G, and B images that constitute the compressed correct captured image (RGB image).
  • The model generation unit 23 adds the difference image (R image minus G image) to the RGB image in FIG. 5(1) to generate the RGB image (emphasized correct captured image) shown in FIG. 5.
  • In the emphasized correct captured image, the red component is more emphasized than in the original RGB image (compressed correct captured image) in FIG. 5(1).
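  • One plausible reading of preprocessing part 2 is sketched below (hypothetical): the (R - G) difference image is computed and added back to the RGB image. The text leaves open whether the difference is added to every channel or only to the R channel, so the example adds it to all channels as one interpretation.

```python
# Hypothetical sketch of preprocessing part 2 (red-component emphasis): add
# the (R - G) difference image back to the RGB image. Adding the difference
# to every channel is one interpretation; adding it only to the R channel
# would be another.
import cv2
import numpy as np

bgr = cv2.imread("compressed_correct_captured_image.png").astype(np.float32)
rgb = bgr[:, :, ::-1]                  # OpenCV loads BGR, convert to RGB

r, g = rgb[:, :, 0], rgb[:, :, 1]
diff = np.clip(r - g, 0, None)         # difference image (R image minus G image)

emphasized = np.clip(rgb + diff[:, :, None], 0, 255).astype(np.uint8)
cv2.imwrite("emphasized_correct_captured_image.png", emphasized[:, :, ::-1])
```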
  • the model generation unit 23 uses the compressed learning captured image and the emphasized correct captured image for machine learning of the model MD.
  • In this embodiment, the compressed learning captured image and the emphasized correct captured image are used for machine learning of the model MD; however, preprocessing part 2 may be omitted, and the compressed learning captured image and the compressed correct captured image may be used for machine learning of the model MD.
  • Step S103 The model generation unit 23 inputs the compressed learning captured image C' to the model MD as the input image Pin (see FIG. 2).
  • When the compressed learning captured image (input image Pin) is input, the model MD generates and outputs an output image Pout (see FIG. 2).
  • Step S104 The model generation unit 23 compares the output image Pout output from the model MD with the emphasized correct captured image D', and performs feedback control on the model MD so that the difference value of the comparison result becomes smaller.
  • The emphasized correct captured image D' is the emphasized correct captured image generated from the correct captured image D of the same selected subject, corresponding to the compressed learning captured image C' used as the input image Pin.
  • As the difference value between the output image Pout and the emphasized correct captured image D', for example, the following MSE (mean squared error) is used: $\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}(x_i - y_i)^2$.
  • Here, N is the number of pixels, i is a pixel number, $x_i$ is the i-th pixel value of the emphasized correct captured image D', and $y_i$ is the i-th pixel value of the output image Pout.
  • The model generation unit 23 repeats the machine learning of the model MD using the learning captured images C (compressed learning captured images C') and the correct captured images D (emphasized correct captured images D') acquired from a plurality of selected subjects until a predetermined termination condition is satisfied. The predetermined termination condition may be a predetermined number of iterations, or may be that the difference value between the output image Pout and the emphasized correct captured image D' becomes less than a predetermined value.
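  • The feedback control of steps S103 to S104 can be pictured as an ordinary supervised training loop that minimizes the MSE between the output image Pout and the emphasized correct captured image D'. The following PyTorch sketch is hypothetical; the dummy data, optimizer settings, and iteration count are assumptions, and TinyUNet refers to the earlier sketch.

```python
# Hypothetical training-loop sketch for steps S103-S104: minimize the MSE
# between the model output Pout and the emphasized correct captured image D'.
# The dummy tensors, optimizer settings, and epoch count are assumptions.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Dummy pairs standing in for (compressed learning image C', emphasized correct image D').
c_prime = torch.rand(8, 3, 256, 256)
d_prime = torch.rand(8, 3, 256, 256)
train_loader = DataLoader(TensorDataset(c_prime, d_prime), batch_size=4)

model = TinyUNet()                        # stand-in for the model MD (see sketch above)
criterion = nn.MSELoss()                  # difference value (MSE)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

for epoch in range(100):                  # predetermined number of iterations
    for pin, target in train_loader:      # input image Pin and target D'
        pout = model(pin)                 # output image Pout
        loss = criterion(pout, target)    # compare Pout with D'
        optimizer.zero_grad()
        loss.backward()                   # feedback so the difference becomes smaller
        optimizer.step()
```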
  • the model generation unit 23 passes the model MD for which machine learning has been completed to the model output unit 24 as a learned model MDa.
  • the model output unit 24 outputs the learned model MDa.
  • It was confirmed through experiments that performing the above-described preprocessing part 1 improves the reproducibility of the locations of fluorescence caused by porphyrin in the output image Pout generated by the learned model MDa. Specifically, with 9 selected subjects, a portion with an image size of "768 x 768" pixels was cut out from the subject captured image A of "3456 x 5184" pixels and compressed to "256 x 256" pixels to obtain the subject compressed captured image A'. When preprocessing part 1 was not performed, the pixel value correlation coefficient between the output image and the correct captured image was "0.22".
  • In contrast, when the "768 x 768" pixel portions were cut out from the "3456 x 5184" pixel learning captured image C and correct captured image D and compressed to create the "256 x 256" pixel compressed learning captured image and compressed correct captured image (that is, when preprocessing part 1 was performed), the pixel value correlation coefficient between the output image and the correct captured image was "0.35".
  • It was also confirmed through experiments that performing the above-described preprocessing part 2 further improves the reproducibility of the locations of fluorescence caused by porphyrin in the output image Pout generated by the trained model MDa. Specifically, using the subject compressed captured image A' obtained by cutting out a "768 x 768" pixel portion from the "3456 x 5184" pixel subject captured image A and compressing it to "256 x 256" pixels, the pixel value correlation coefficient between the output image and the correct captured image was "0.35" when preprocessing part 1 was performed but preprocessing part 2 was not, whereas it was "0.61" when both preprocessing part 1 and preprocessing part 2 were performed.
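  • The pixel value correlation coefficient reported in these experiments can be understood as a correlation computed over corresponding pixels of the output image and the correct captured image; a minimal sketch (hypothetical, file names assumed) is shown below.

```python
# Hypothetical sketch: Pearson correlation coefficient between corresponding
# pixel values of an output image and a correct captured image.
import cv2
import numpy as np

out = cv2.imread("output_image_B.png").astype(np.float32).ravel()
ref = cv2.imread("correct_captured_image_D.png").astype(np.float32).ravel()

corr = np.corrcoef(out, ref)[0, 1]
print(f"pixel value correlation coefficient: {corr:.2f}")
```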
  • FIG. 6 is a flowchart illustrating an example of the procedure (skin condition measurement step S200) of the image processing method according to the present embodiment.
  • the skin condition measurement step S200 is a step executed by the image processing device 10, and outputs from the subject captured image A of the subject U or the subject compressed captured image A' converted from the subject captured image A using the learned model MDa. This is the stage of generating image B.
  • the model storage unit 11 of the image processing device 10 stores the trained model MDa generated by the model generation device 20 in the above-described model learning step S100.
  • the subject U images the part of the skin (for example, the face) that the subject U wants to have measured using, for example, the camera of the smartphone 30, and transmits the captured subject image A to the image processing device 10 using the smartphone 30.
  • the subject captured image acquisition unit 12 receives the subject captured image A transmitted from the smartphone 30 of the subject U via a communication line or the like.
  • the subject captured image acquisition unit 12 may compress the subject captured image A to a predetermined compression size and convert it into a subject compressed captured image A'.
  • the image may be captured so that color correction and position correction can be performed. Specific methods for color correction and position correction include the methods described above.
  • Step S202 The output image acquisition unit 13 inputs the subject captured image A or the subject compressed captured image A' to the learned model MDa as the input image Pin (see FIG. 2).
  • the trained model MDa receives the subject captured image A or the subject compressed captured image A' (input image Pin), it generates and outputs an output image Pout (see FIG. 2).
  • the output image acquisition unit 13 acquires the output image Pout output from the trained model MDa as an output image B or an output image B'.
  • the output image B or the output image B' is an image of the skin of the subject U in which fluorescence due to porphyrin has occurred, which is inferred by the learned model MDa from the subject image A or the subject compressed image A' of the subject U. .
  • For example, the output image B or the output image B' is an image of the face of the subject U in which fluorescence caused by porphyrin has occurred, which is inferred by the learned model MDa from the subject captured image A or the subject compressed captured image A' of the subject U's face.
  • the output image acquisition unit 13 expands the output image B' to the same size as the captured image A of the subject and restores it to the output image B.
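  • Steps S202 and onward can be pictured as follows (hypothetical sketch): the subject compressed captured image A' is passed through the learned model MDa, and the resulting output image B' is expanded back to the size of the subject captured image A. The weight file, file names, and TinyUNet stand-in are assumptions.

```python
# Hypothetical inference sketch: feed the subject compressed captured image A'
# to the learned model MDa and expand the output B' back to the size of the
# subject captured image A. File names and the TinyUNet stand-in are assumptions.
import cv2
import numpy as np
import torch

model = TinyUNet()                                   # stand-in for the learned model MDa
model.load_state_dict(torch.load("mda.pt"))          # assumed weight file
model.eval()

a = cv2.imread("subject_captured_image_A.jpg")       # original size, e.g. 3456 x 5184 pixels
a_prime = cv2.resize(a, (256, 256), interpolation=cv2.INTER_AREA)

x = torch.from_numpy(a_prime[:, :, ::-1].copy()).permute(2, 0, 1).float().unsqueeze(0) / 255.0
with torch.no_grad():
    b_prime = model(x)[0].permute(1, 2, 0).numpy()   # output image B' (256 x 256)

# Expand B' to the same size as A to restore the output image B.
b = cv2.resize((b_prime * 255).astype(np.uint8), (a.shape[1], a.shape[0]))
cv2.imwrite("output_image_B.png", b[:, :, ::-1])
```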
  • the output image acquisition unit 13 transmits the output image B to the smartphone 30 of the subject U via the communication line.
  • Subject U receives output image B transmitted from image processing device 10 with smartphone 30 via a communication line.
  • the subject U can visually recognize the state of porphyrin on his or her skin (for example, face) using the output image B.
  • When the image processing device 10 is integrated into the mobile terminal device, transmitting the subject captured image A of the captured face to the image processing device 10 using the smartphone 30 and transmitting the output image B from the image processing device 10 to the smartphone 30 of the subject U are unnecessary.
  • the learning captured image C is a captured image of the selected subject's skin irradiated with visible light, and therefore can be easily captured using the smartphone 30.
  • On the other hand, the correct captured image D cannot be easily captured using the smartphone 30 because it is an image of the selected subject's face in which fluorescence due to porphyrin has occurred due to irradiation with ultraviolet rays. Therefore, it is conceivable that the learning captured image C is an image captured using the smartphone 30, that the correct captured image D is an image captured using a device such as "VISIA (registered trademark), manufactured by Canfield Scientific," and that these images are aligned and used as training data.
  • FIG. 7 is an image diagram for explaining the difference in imaging between the learning captured image and the correct captured image according to one embodiment.
  • An image acquisition method in a modified example of the teacher data creation method will be described with reference to the same figure.
  • FIG. 7(A) shows an example of a method for acquiring the correct captured image D, in which imaging is performed using a device such as "VISIA (registered trademark), manufactured by Canfield Scientific." As shown in the figure, the subject's face is fixed on the device, so camera shake and the like do not occur. If the learning captured image C is also captured at the same time, the position of the subject's face does not change, so alignment is not required (and even if not captured simultaneously, the amount of misalignment will be small). Furthermore, as shown in FIG. 7(A), when imaging is performed using a device such as "VISIA (registered trademark), manufactured by Canfield Scientific," an imaging device with higher image quality than the smartphone 30 is used.
  • FIG. 7(B) is an example of a method for acquiring the learning captured image C, and is an example where the smartphone 30 is used to capture the image.
  • the subject's face is not fixed and camera shake may occur.
  • the image quality of the imaging device itself is lower than that of devices such as “VISIA (registered trademark), manufactured by Canfield Scientific.”
  • For this reason, highly accurate alignment is performed in this embodiment. Note that even when using the smartphone 30, it is possible to fix the face and capture an image while suppressing the occurrence of camera shake.
  • imaging may be performed in a situation similar to that in which the image is captured in the inference stage.
  • the situation that is imaged in the inference stage may be one in which the user stretches out his hand and takes an image of himself (so-called self-portrait) using the in-camera of the smartphone 30.
  • FIG. 8 is an image diagram of reference points used for creating teacher data according to one embodiment.
  • Four reference points are attached to the subject's face.
  • the reference point may be something like a sticker.
  • A mark may be made on the sticker using a marker or the like, or a hole may be made in the sticker for position identification.
  • FIG. 9 is a diagram illustrating an example of preprocessing the correct captured image and cutting out the learning captured image and the correct captured image according to an embodiment.
  • an example of preprocessing of the correct captured image D and cutting out of the learning captured image C and the correct captured image D according to a modified example of the teacher data creation method will be described.
  • FIG. 9(A) is an example of the correct captured image D. Since the correct captured image D is captured with the face fixed, the image is shown as viewed from the front.
  • FIG. 9(B) is an example of a captured image C for learning. The learning captured image C is captured using the smartphone 30 without fixing the face, so it may not be viewed from the front. In this embodiment, these images are aligned.
  • preprocessing is performed on the correct captured image D.
  • a conventional technique such as projective transformation is used to transform the correct captured image D in accordance with the learning captured image C.
  • the preprocessing may be performed by image processing.
  • FIG. 9C shows the result of preprocessing the correct captured image D.
  • the learning captured image C is not converted in accordance with the correct captured image D, but the correct captured image D is converted in accordance with the learning captured image C. This is an attempt to perform learning using images similar to those used in the inference stage by performing transformations that match the images used in the inference stage.
  • teacher data is created by extracting (cutting out) a partial region of the image from each of the image shown in FIG. 9(B) and the image shown in FIG. 9(C).
  • Reference points as shown in FIG. 8 are used to extract the image. Note that the reference point does not need to be intentionally added for the purpose of creating the teacher data, and may be a characteristic point of the human body, such as a mole or the corner of the mouth, for example.
  • In this modified example, a learning captured image C is acquired, which is a captured image of human skin irradiated with visible light in which a portion having a reference point is captured.
  • In addition, a correct captured image D is acquired, which is an image of the same subject as the learning captured image C, captured by an imaging device different from the imaging device that captured the learning captured image C, and in which fluorescence due to porphyrin has occurred due to irradiation with ultraviolet rays.
  • the teacher data is created by storing images extracted from each of the learning captured image C and the correct captured image D in association with each other.
  • In the modified example of the teacher data creation method, the correct captured image D (rather than the learning captured image C) is subjected to image processing to generate an image matched to the learning captured image C.
  • This image processing step corresponds to the preprocessing described above.
  • In the second extraction step, a partial region based on the reference point is extracted from the correct captured image D that has been subjected to image processing in the image processing step, as sketched below.
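  • The extraction steps of this modified teacher data creation method can be pictured with the hypothetical sketch below: after the correct captured image D has been warped to match the learning captured image C, the same region around a reference point is cut out of both images and stored as one training pair. Coordinates and file names are assumptions.

```python
# Hypothetical sketch of the first/second extraction steps: cut out the same
# region around a reference point from the learning captured image C and the
# already-aligned correct captured image D, and store the pair as teacher data.
import cv2

learning_c = cv2.imread("learning_captured_image_C.jpg")
correct_d_aligned = cv2.imread("correct_captured_image_D_aligned.jpg")  # after the image processing step

ref_x, ref_y = 1520, 2310       # assumed pixel position of one reference point
half = 384                      # half of the 768 x 768 pixel cut-out size

patch_c = learning_c[ref_y - half:ref_y + half, ref_x - half:ref_x + half]
patch_d = correct_d_aligned[ref_y - half:ref_y + half, ref_x - half:ref_x + half]

# Store the associated pair (learning patch, correct patch) as one teacher-data sample.
cv2.imwrite("teacher_pair_0001_input.png", patch_c)
cv2.imwrite("teacher_pair_0001_target.png", patch_d)
```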
  • It is preferable that the visible light used when capturing the learning captured image C and the visible light used when capturing the subject captured image A have the same wavelength and intensity.
  • It was confirmed through experiments that when blue light (e.g., a wavelength of 380 to 550 nm) is used as the visible light, the reproducibility of the locations of fluorescence caused by porphyrin in the output image Pout generated by the trained model MDa improves compared to when white light is used.
  • Specifically, using the subject compressed captured image A' obtained by cutting out a "768 x 768" pixel portion from the "3456 x 5184" pixel subject captured image A and compressing it to "256 x 256" pixels, the pixel value correlation coefficient between the output image and the correct captured image was "0.61" when white light (400 to 770 nm) was used, whereas the pixel value correlation coefficient was "0.65" when blue light (380 to 550 nm) was used.
  • Examples of ways to use blue light include covering artificial light sources such as incandescent light bulbs, fluorescent lights, and LED light bulbs with a blue film, irradiating with LED light bulbs that emit only blue light, or displaying the entire screen of a smartphone or monitor in blue and using it as the light source.
  • As described above, the image processing device 10 according to the present embodiment uses the trained model MDa, which has been machine-learned using a learning captured image C in which the skin of a person irradiated with visible light is captured and a correct captured image D in which the skin of a person in which fluorescence caused by porphyrins has occurred due to irradiation with ultraviolet rays is captured, to generate an image of human skin in which fluorescence due to porphyrin has occurred from a captured image of human skin irradiated with visible light.
  • Using this trained model MDa, the image processing device 10 acquires, from the subject captured image A in which the skin of the subject U irradiated with visible light is captured, an output image B that is an image of the subject's skin in which fluorescence due to porphyrin has occurred. This makes it possible to detect porphyrins on the skin of the subject U without irradiating the subject U with ultraviolet rays.
  • The subject captured image A may be converted into a subject compressed captured image A', which is obtained by compressing a part of the subject captured image A. This improves the reproducibility of the fluorescent locations caused by porphyrin in the output image B generated by the learned model MDa.
  • the part of the face may be around the forehead, nose, or mouth.
  • An output image B' which is an image of the human skin in which fluorescence due to porphyrin has occurred, is obtained from the subject compressed captured image A'.
  • the output image B' is a compressed image, and is expanded to the same size as the captured image A of the subject and restored to the output image B, thereby obtaining the output image B.
  • a compressed learning captured image obtained by compressing a portion of the learning captured image C and a compressed correct captured image obtained by compressing a portion of the correct captured image D may be used. This improves the reproducibility of the fluorescent location caused by porphyrin in the output image B generated by the trained model MDa.
  • the part of the face may be around the forehead, nose, or mouth.
  • an emphasized correct captured image in which the red component is emphasized with respect to the correct captured image D may be used. This improves the reproducibility of the fluorescent location caused by porphyrin in the output image B generated by the learned model MDa.
  • The correct captured image D is an RGB image composed of an R image, a G image, and a B image, and the emphasized correct captured image may be an RGB image generated by adding, to the RGB image, a difference image obtained by subtracting the G image from the R image.
  • the trained model MDa may be composed of a layer of an encoder portion and a layer of a decoder portion.
  • the learned model MDa may have a U-Net structure in which the encoder portion layer and the decoder portion layer have a symmetrical structure and are connected by a skip connection.
  • The model generation device 20 according to the present embodiment uses a learning captured image C in which the skin of a person irradiated with visible light is captured and a correct captured image D in which the skin of a person in which fluorescence caused by porphyrins has occurred due to irradiation with ultraviolet rays is captured to perform machine learning on the model MD so that it generates an image of human skin in which fluorescence due to porphyrin has occurred from a captured image of human skin irradiated with visible light, and generates a learned model MDa. By using this learned model MDa, it is possible to detect porphyrins on the skin of the subject U without irradiating the subject U with ultraviolet rays.
  • The model generation device 20 may generate a compressed learning captured image in which a portion of the learning captured image C is compressed and a compressed correct captured image in which a portion of the correct captured image D is compressed, and use the compressed learning captured image and the compressed correct captured image for machine learning of the model MD. This improves the reproducibility of the fluorescent locations caused by porphyrin in the output image B generated by the trained model MDa.
  • the part of the face may be around the forehead, nose, or mouth.
  • the model generation device 20 may generate an emphasized correct captured image in which the red component is emphasized with respect to the correct captured image D, and use the emphasized correct captured image for machine learning of the model MD. This improves the reproducibility of the fluorescent location caused by porphyrin in the output image B generated by the trained model MDa.
  • the correct captured image D is an RGB image composed of an R image, a G image, and a B image.
  • the emphasized correct captured image may be an RGB image generated by adding, to the RGB image, a difference image obtained by subtracting the G image from the R image.
  • model MD may be composed of an encoder portion layer and a decoder portion layer. Further, the model MD may have a U-Net structure in which the encoder portion layer and the decoder portion layer have a symmetrical structure and are connected by a skip connection.
  • by comparing the output image acquired by the image processing device 10 with a standard image, the susceptibility to acne can be determined.
  • the standard image may be an image of a typical person without acne.
  • a predetermined suitable image may be prepared depending on, for example, the age, gender, nationality, etc. of the person to be determined (subject U).
  • the standard image does not have to be an actually captured image, but may be an image generated by image processing.
  • in a first acquisition step, an output image output by the image processing device 10 is acquired based on the subject captured image A taken before the subject performs skin care.
  • in a second acquisition step, an output image output by the image processing device 10 is acquired based on the subject captured image A taken after the subject has performed skin care. The output image acquired in the first acquisition step and the output image acquired in the second acquisition step are then compared to determine the effect of the skin care. The determination may be made by comparing the output images before and after skin care with each other, or by comparing each of them with a standard image.
  • a computer program for realizing the functions of each device described above may be recorded on a computer-readable recording medium, and the program recorded on the recording medium may be read into a computer system and executed.
  • the "computer system” here may include hardware such as an OS and peripheral devices.
  • the term “computer system” includes the homepage providing environment (or display environment) if a WWW system is used.
  • “computer-readable recording media” refers to portable media such as flexible disks, magneto-optical disks, ROMs, writable non-volatile memories such as flash memory, and DVDs (Digital Versatile Discs), as well as storage devices such as hard disks built into computer systems.
  • “computer-readable recording medium” also includes media that hold the program for a certain period of time, such as volatile memory (for example, DRAM (Dynamic Random Access Memory)).
  • the program may be transmitted from a computer system storing the program in a storage device or the like to another computer system via a transmission medium or by a transmission wave in a transmission medium.
  • the "transmission medium” that transmits the program refers to a medium that has a function of transmitting information, such as a network (communication network) such as the Internet or a communication line (communication line) such as a telephone line.
  • the above-mentioned program may be for realizing a part of the above-mentioned functions.
  • it may be a so-called difference file (difference program) that can realize the above-described functions in combination with a program already recorded in the computer system.

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Surgery (AREA)
  • General Health & Medical Sciences (AREA)
  • Veterinary Medicine (AREA)
  • Biomedical Technology (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Medical Informatics (AREA)
  • Molecular Biology (AREA)
  • Physics & Mathematics (AREA)
  • Animal Behavior & Ethology (AREA)
  • Pathology (AREA)
  • Public Health (AREA)
  • Biophysics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)
  • Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)

Abstract

This image processing device is provided with: a model storage unit for storing a trained model subjected to machine learning, using a learning captured image obtained by capturing an image of the skin of a person irradiated with visible light and a correct-answer captured image obtained by capturing an image of the skin of a person in which fluorescence caused by porphyrin is produced due to being irradiated with UV light, so as to generate an image of the skin of a person in which fluorescence caused by porphyrin is produced from a captured image obtained by capturing an image of the skin of a person irradiated with visible light; and an output image acquisition unit for acquiring, using the trained model, an output image, which is an image of the skin of a person in which fluorescence caused by porphyrin is produced, from a subject captured image obtained by capturing an image of the skin of a subject irradiated with visible light.

Description

Image processing device, model generation device, skin condition measurement system, image processing method, model generation method, determination method, and training data creation method
The present invention relates to an image processing device, a model generation device, a skin condition measurement system, an image processing method, a model generation method, a determination method, and a training data creation method.
This application claims priority to Japanese Patent Application No. 2022-127815, filed in Japan on August 10, 2022, the contents of which are incorporated herein by reference.
Conventionally, the presence of Propionibacterium acnes (the bacterium that causes acne) and the condition of the skin have been investigated by detecting porphyrins on the skin, which P. acnes produces as metabolites. As a method for detecting porphyrins on the skin, for example, the porphyrin detection method described in Patent Document 1 is known. In the porphyrin detection method described in Patent Document 1, the skin surface is irradiated with weak ultraviolet light including visible light, the skin surface is imaged by a CCD color camera equipped with a CCD (charge coupled device) cooling mechanism, the color image signal of the skin surface output from the CCD color camera is displayed as a still image on a television monitor, and the still image is observed.
Patent Document 1: Japanese Patent Application Publication No. H5-103771
However, because the porphyrin detection method described in Patent Document 1 requires irradiating the subject's skin surface with ultraviolet rays, damage to the skin caused by the ultraviolet irradiation is a problem, and a method capable of detecting porphyrins without ultraviolet irradiation has been desired.
The present invention has been made in view of these circumstances, and its object is to provide an image processing device, a model generation device, a skin condition measurement system, an image processing method, and a model generation method capable of detecting porphyrins on the skin of a subject without irradiating the subject with ultraviolet rays.
One aspect of the present invention is an image processing device comprising: a model storage unit that stores a trained model that has undergone machine learning, using a learning captured image in which the skin of a person irradiated with visible light is captured and a correct captured image in which the skin of a person in which fluorescence caused by porphyrins has occurred due to ultraviolet irradiation is captured, so as to generate an image of human skin in which fluorescence caused by porphyrins has occurred from a captured image of human skin irradiated with visible light; a subject captured image acquisition unit that acquires a subject captured image in which the skin of a subject irradiated with visible light is captured; and an output image acquisition unit that acquires, from the subject captured image and using the trained model, an output image that is an image of human skin in which fluorescence caused by porphyrins has occurred.
One aspect of the present invention is the above image processing device, wherein the machine learning of the trained model uses a compressed learning captured image obtained by compressing a part of the learning captured image and a compressed correct captured image obtained by compressing a part of the correct captured image.
One aspect of the present invention is the above image processing device, wherein the skin is facial skin and the part is around the forehead, nose, or mouth.
One aspect of the present invention is the above image processing device, wherein the machine learning of the trained model uses an emphasized correct captured image in which the red component of the correct captured image is emphasized.
One aspect of the present invention is the above image processing device, wherein the correct captured image is an RGB image composed of an R image, a G image, and a B image, and the emphasized correct captured image is an RGB image generated by adding, to the RGB image, a difference image obtained by subtracting the G image from the R image.
One aspect of the present invention is the above image processing device, wherein the trained model is composed of a layer of an encoder portion and a layer of a decoder portion.
One aspect of the present invention is the above image processing device, wherein the trained model has a U-Net structure in which the layer of the encoder portion and the layer of the decoder portion form a symmetrical structure and are connected by a skip connection.
One aspect of the present invention is a model generation device comprising: a learning captured image acquisition unit that acquires a learning captured image in which the skin of a person irradiated with visible light is captured; a correct captured image acquisition unit that acquires a correct captured image in which the skin of a person in which fluorescence caused by porphyrins has occurred due to ultraviolet irradiation is captured; and a model generation unit that performs machine learning of a model, using the learning captured image and the correct captured image, so that the model generates an image of human skin in which fluorescence caused by porphyrins has occurred from a captured image of human skin irradiated with visible light, and thereby generates a trained model.
One aspect of the present invention is the above model generation device, wherein the model generation unit generates a compressed learning captured image obtained by compressing a part of the learning captured image and a compressed correct captured image obtained by compressing a part of the correct captured image, and uses the compressed learning captured image and the compressed correct captured image for the machine learning of the model.
One aspect of the present invention is the above model generation device, wherein the skin is facial skin and the part is around the forehead, nose, or mouth.
One aspect of the present invention is the above model generation device, wherein the model generation unit generates an emphasized correct captured image in which the red component of the correct captured image is emphasized, and uses the emphasized correct captured image for the machine learning of the model.
One aspect of the present invention is the above model generation device, wherein the correct captured image is an RGB image composed of an R image, a G image, and a B image, and the emphasized correct captured image is an RGB image generated by adding, to the RGB image, a difference image obtained by subtracting the G image from the R image.
One aspect of the present invention is the above model generation device, wherein the model is composed of a layer of an encoder portion and a layer of a decoder portion.
One aspect of the present invention is the above model generation device, wherein the model has a U-Net structure in which the layer of the encoder portion and the layer of the decoder portion form a symmetrical structure and are connected by a skip connection.
One aspect of the present invention is a skin condition measurement system comprising the above image processing device and the above model generation device.
One aspect of the present invention is an image processing method including: a step of acquiring a subject captured image in which the skin of a subject irradiated with visible light is captured; and a step of acquiring, from the subject captured image and using a trained model, an output image that is an image of human skin in which fluorescence caused by porphyrins has occurred, wherein the trained model is a model that has undergone machine learning, using a learning captured image in which the skin of a person irradiated with visible light is captured and a correct captured image in which the skin of a person in which fluorescence caused by porphyrins has occurred due to ultraviolet irradiation is captured, so as to generate an image of human skin in which fluorescence caused by porphyrins has occurred from a captured image of human skin irradiated with visible light.
One aspect of the present invention is the above image processing method, wherein the machine learning of the trained model uses a compressed learning captured image obtained by compressing a part of the learning captured image and a compressed correct captured image obtained by compressing a part of the correct captured image.
One aspect of the present invention is the above image processing method, wherein the skin is facial skin and the part is around the forehead, nose, or mouth.
One aspect of the present invention is the above image processing method, wherein the machine learning of the trained model uses an emphasized correct captured image in which the red component of the correct captured image is emphasized.
One aspect of the present invention is the above image processing method, wherein the correct captured image is an RGB image composed of an R image, a G image, and a B image, and the emphasized correct captured image is an RGB image generated by adding, to the RGB image, a difference image obtained by subtracting the G image from the R image.
One aspect of the present invention is the above image processing method, wherein the trained model is composed of a layer of an encoder portion and a layer of a decoder portion.
One aspect of the present invention is the above image processing method, wherein the trained model has a U-Net structure in which the layer of the encoder portion and the layer of the decoder portion form a symmetrical structure and are connected by a skip connection.
One aspect of the present invention is a model generation method including: a step of acquiring a learning captured image in which the skin of a person irradiated with visible light is captured; a step of acquiring a correct captured image in which the skin of a person in which fluorescence caused by porphyrins has occurred due to ultraviolet irradiation is captured; and a model generation step of performing machine learning of a model, using the learning captured image and the correct captured image, so that the model generates an image of human skin in which fluorescence caused by porphyrins has occurred from a captured image of human skin irradiated with visible light, and thereby generating a trained model.
One aspect of the present invention is the above model generation method, wherein the model generation step generates a compressed learning captured image obtained by compressing a part of the learning captured image and a compressed correct captured image obtained by compressing a part of the correct captured image, and uses the compressed learning captured image and the compressed correct captured image for the machine learning of the model.
One aspect of the present invention is the above model generation method, wherein the skin is facial skin and the part is around the forehead, nose, or mouth.
One aspect of the present invention is the above model generation method, wherein the model generation step generates an emphasized correct captured image in which the red component of the correct captured image is emphasized, and uses the emphasized correct captured image for the machine learning of the model.
One aspect of the present invention is the above model generation method, wherein the correct captured image is an RGB image composed of an R image, a G image, and a B image, and the emphasized correct captured image is an RGB image generated by adding, to the RGB image, a difference image obtained by subtracting the G image from the R image.
One aspect of the present invention is the above model generation method, wherein the model is composed of a layer of an encoder portion and a layer of a decoder portion.
One aspect of the present invention is the above model generation method, wherein the model has a U-Net structure in which the layer of the encoder portion and the layer of the decoder portion form a symmetrical structure and are connected by a skip connection.
One aspect of the present invention is the above image processing method, wherein at least one projection reference point is set in the step of acquiring the learning captured image and the correct captured image.
One aspect of the present invention is a determination method for determining the susceptibility to acne by comparing the output image acquired by the above image processing device with a standard image, which is an image of the skin of a typical person without acne.
One aspect of the present invention is an evaluation method for evaluating the effect of skin care before and after a subject performs skin care, the method including: a first acquisition step of acquiring the output image acquired by the above image processing device based on the subject captured image taken before the subject performs skin care; a second acquisition step of acquiring the output image acquired by the above image processing device based on the subject captured image taken after the subject has performed skin care; and a determination step of comparing the output image acquired in the first acquisition step with the output image acquired in the second acquisition step to determine the degree of improvement due to the skin care.
One aspect of the present invention is a training data creation method including: a learning captured image acquisition step of acquiring a learning captured image, which is an image in which the skin of a person irradiated with visible light is captured and in which a portion having a reference point is captured; a correct captured image acquisition step of acquiring a correct captured image, which is an image of the same subject as the learning captured image captured by an imaging device different from the imaging device that captured the learning captured image, and which is an image in which the skin of a person in which fluorescence caused by porphyrins has occurred due to ultraviolet irradiation is captured; a first extraction step of extracting a partial region based on the reference point from the learning captured image; a second extraction step of extracting a partial region based on the reference point from the correct captured image; and a training data creation step of creating training data by storing the images extracted from the learning captured image and the correct captured image in association with each other.
One aspect of the present invention is the above training data creation method, further including an image processing step of generating an image matched to the learning captured image by performing image processing on the correct captured image, wherein the second extraction step extracts a partial region based on the reference point from the correct captured image that has been processed in the image processing step.
According to the present invention, porphyrins on the skin of a subject can be detected without irradiating the subject with ultraviolet rays.
FIG. 1 is a block diagram showing a configuration example of a skin condition measurement system according to an embodiment.
FIG. 2 is a configuration diagram showing a schematic configuration example of a model according to an embodiment.
FIG. 3 is a flowchart showing an example of the procedure of a model generation method according to an embodiment.
FIG. 4 is a diagram showing an example of cut-out portions of a learning captured image and a correct captured image according to an embodiment.
FIG. 5 is an example of images for explaining preprocessing 2 of a correct captured image according to an embodiment.
FIG. 6 is a flowchart showing an example of the procedure of an image processing method according to an embodiment.
FIG. 7 is a conceptual diagram for explaining the difference in how a learning captured image and a correct captured image are captured according to an embodiment.
FIG. 8 is a conceptual diagram of reference points used for creating training data according to an embodiment.
FIG. 9 is a diagram showing an example of preprocessing of a correct captured image and of cutting out a learning captured image and a correct captured image according to an embodiment.
Embodiments of the present invention will be described below with reference to the drawings.
FIG. 1 is a block diagram showing a configuration example of a skin condition measurement system according to an embodiment. The skin condition measurement system shown in FIG. 1 includes an image processing device 10 and a model generation device 20. The image processing device 10 and the model generation device 20 may exchange data online through communication, or may input and output data offline.
The subject U is a person who uses the skin condition measurement system and whose skin condition is to be measured. The subject U captures an image of the skin of the area to be measured with the camera of the smartphone 30, and transmits the captured image (subject captured image) A from the smartphone 30 to the image processing device 10. For example, the subject U captures an image of his or her entire face, or of the forehead, nose, and mouth area where acne is a concern, with the camera of the smartphone 30, and transmits the subject captured image A of the face from the smartphone 30 to the image processing device 10.
The subject captured image A is a captured image of the skin of the subject U irradiated with visible light. When the subject captured image A is captured, it is sufficient that visible light is irradiated onto the skin of the subject U; there is no need to irradiate the skin of the subject U with ultraviolet rays. The light source that irradiates visible light includes wavelengths of 400 to 770 nm. Since ultraviolet irradiation is not required, the ultraviolet (UV) intensity of the light source is preferably as weak as possible in order to reduce damage to the skin of the subject U, for example at or below the detection limit of a UV meter. Examples of light sources that irradiate visible light include natural light such as sunlight and moonlight, and artificial light such as incandescent bulbs, fluorescent lamps, xenon lamps, and LED bulbs. The light source may contain ultraviolet rays, but artificial light with low UV intensity is preferable, and among artificial light sources, xenon lamps and LED bulbs are more preferable.
Note that in the present embodiment the subject U uses a smartphone 30 equipped with a camera (imaging device), but the present invention is not limited to this. For example, a mobile terminal device such as a tablet computer (tablet PC) equipped with a camera may be used. Alternatively, a captured image taken with a digital camera may be transmitted to the image processing device 10 by a communication terminal such as a smartphone. When capturing with the camera of the smartphone 30, porphyrin detection accuracy can be further improved by capturing the image so that color correction and position correction can be performed. One example of a color correction method is to include a color chart in the frame when imaging the subject U, as sketched below. One example of a position correction method is to include an ArUco marker, a landmark sticker, or the like in the frame as a reference point when imaging the subject U, and to apply a projective transformation with image editing software. Further, when a mobile terminal device such as a camera-equipped smartphone 30 or tablet PC is used, the image processing device 10 described later may be incorporated into and integrated with the mobile terminal device. In this case, it is unnecessary to transmit the subject captured image A of the face from the smartphone 30 to the image processing device 10, and it is also unnecessary to transmit the output image B described later from the image processing device 10 to the smartphone 30 of the subject U.
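As an illustration of the color correction mentioned above, the following is a minimal sketch assuming OpenCV and NumPy and assuming that a neutral (gray or white) patch of the color chart is visible in the frame; the patch coordinates, target gray level, and file names are illustrative assumptions and not part of this disclosure.

```python
# Minimal sketch of one possible color correction using a neutral patch of a
# color chart included in the frame. Patch location, target level, and file
# names are hypothetical.
import cv2
import numpy as np

def correct_colors(image_bgr, patch_box, target_level=200.0):
    """Scale each channel so that the neutral chart patch reaches target_level."""
    x, y, w, h = patch_box                                   # neutral patch location
    patch = image_bgr[y:y + h, x:x + w].astype(np.float32)
    gains = target_level / (patch.reshape(-1, 3).mean(axis=0) + 1e-6)
    corrected = image_bgr.astype(np.float32) * gains
    return np.clip(corrected, 0, 255).astype(np.uint8)

if __name__ == "__main__":
    img = cv2.imread("subject_visible_light.jpg")            # hypothetical file name
    corrected = correct_colors(img, patch_box=(100, 100, 40, 40))
    cv2.imwrite("subject_color_corrected.jpg", corrected)
```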
[Image processing device]
The image processing device 10 includes a model storage unit 11, a subject captured image acquisition unit 12, and an output image acquisition unit 13.
Each function of the image processing device 10 is realized by the image processing device 10 including computer hardware such as a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), and memory, and by the CPU executing a computer program stored in the memory. Note that the image processing device 10 may be configured using a general-purpose computer device, or may be configured as a dedicated hardware device. For example, the image processing device 10 may be configured using a server computer connected to a communication network such as the Internet. Each function of the image processing device 10 may also be realized by cloud computing. Further, the image processing device 10 may be realized by a single computer, or the functions of the image processing device 10 may be realized by being distributed over a plurality of computers. The image processing device 10 may also be configured to run a website using, for example, a WWW system.
The model storage unit 11 stores the trained model MDa. The trained model MDa is provided from the model generation device 20. The trained model MDa is a model that has undergone machine learning, using a learning captured image in which the skin of a person irradiated with visible light is captured and a correct captured image in which the skin of a person in which fluorescence caused by porphyrins has occurred due to ultraviolet irradiation is captured, so as to generate an image of human skin in which fluorescence caused by porphyrins has occurred from a captured image of human skin irradiated with visible light.
The subject captured image acquisition unit 12 acquires the subject captured image A. The subject captured image A is a captured image of the skin of the subject U irradiated with visible light. The subject captured image acquisition unit 12 receives the subject captured image A transmitted from the smartphone 30 of the subject U via a communication line.
Note that the subject captured image acquisition unit 12 may compress the subject captured image A to a predetermined compression size and convert it into a subject compressed captured image A'. For example, when the subject captured image is "3456 x 5184" pixels, it is compressed to "256 x 256" pixels.
In addition, in order to minimize blurring of the output image caused by differences in face position due to the subject U moving during image acquisition, a "768 x 768" pixel region may be cut out from the "3456 x 5184" pixel subject captured image A, and the image of that "768 x 768" pixel cut-out portion may be compressed to "256 x 256" pixels, as sketched below. As the portions to be cut out, the captured image of the subject's entire face may be cut out at equal intervals, and it is also preferable to cut out the forehead, nose, or mouth area, where porphyrins are likely to be present. The resulting "256 x 256" pixel image is the subject compressed captured image A'.
Note that the image size values given above, such as "3456 x 5184" pixels, "768 x 768" pixels, and "256 x 256" pixels, are examples of image sizes and are not limiting.
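As an illustration of the cropping and compression described above, the following is a minimal sketch assuming OpenCV; the crop origin and file names are illustrative assumptions, and in practice the cut-out regions (forehead, nose, mouth area, or equally spaced tiles of the whole face) would be chosen from the captured face.

```python
# Minimal sketch: cut a 768x768 region out of a large subject captured image A
# and compress it to 256x256 to obtain the subject compressed captured image A'.
import cv2

CROP = 768          # size of the cut-out portion (pixels)
TARGET = 256        # compression size used as the model input (pixels)

def crop_and_compress(image, top, left):
    """Cut out a CROP x CROP patch at (top, left) and resize it to TARGET x TARGET."""
    patch = image[top:top + CROP, left:left + CROP]
    return cv2.resize(patch, (TARGET, TARGET), interpolation=cv2.INTER_AREA)

if __name__ == "__main__":
    subject_image_a = cv2.imread("subject_captured_image_A.jpg")   # e.g. 3456 x 5184
    # Hypothetical origin of a forehead region; the actual regions would be
    # selected from the face in the captured image.
    compressed_a_prime = crop_and_compress(subject_image_a, top=600, left=1400)
    cv2.imwrite("subject_compressed_A_prime.png", compressed_a_prime)
```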
The output image acquisition unit 13 acquires, from the subject captured image A or the subject compressed captured image A' and using the trained model MDa, an output image B or an output image B', which is an image of human skin in which fluorescence caused by porphyrins has occurred.
When the subject captured image A is input, the trained model MDa outputs the output image B. The output image B is an image inferred by the trained model MDa from the subject captured image A of the subject U, and is an image of the skin of the subject U in which fluorescence caused by porphyrins has occurred. For example, the output image B is an image of the face of the subject U in which fluorescence caused by porphyrins has occurred, inferred by the trained model MDa from the subject captured image A of the face of the subject U.
When the subject compressed captured image A' is input, the trained model MDa outputs the output image B'. The output image B' is an image inferred by the trained model MDa from the subject compressed captured image A' of the subject U, and is an image of the skin of the subject U in which fluorescence caused by porphyrins has occurred. For example, the output image B' is an image of the face of the subject U in which fluorescence caused by porphyrins has occurred, inferred by the trained model MDa from the subject compressed captured image A' of the face of the subject U. Since the output image B' is a compressed image, the output image acquisition unit 13 expands the output image B' to the same size as the subject captured image A and restores it to the output image B. When the captured image of the subject's entire face has been cut out at equal intervals and each portion compressed, the output image acquisition unit 13 restores each output image B' to an output image B and then joins the output images B together to obtain an output image B of the same size as the subject captured image A, as sketched below.
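The following is a minimal sketch of the restoration step just described, assuming OpenCV and NumPy, non-overlapping tiles, and known tile positions; the tile origins, sizes, and the canvas shape are illustrative assumptions.

```python
# Minimal sketch: expand each compressed output image B' back to the cut-out size
# and paste it at its original position to reassemble a full-size output image B.
import cv2
import numpy as np

CROP = 768  # size each 256x256 output tile is expanded back to

def assemble_output(tiles, origins, full_shape):
    """tiles: list of 256x256 BGR outputs B'; origins: list of (top, left) positions."""
    canvas = np.zeros(full_shape, dtype=np.uint8)
    for tile, (top, left) in zip(tiles, origins):
        restored = cv2.resize(tile, (CROP, CROP), interpolation=cv2.INTER_CUBIC)
        canvas[top:top + CROP, left:left + CROP] = restored
    return canvas

# Usage example with two hypothetical tiles of the face:
# output_b = assemble_output([b_prime_forehead, b_prime_nose],
#                            [(600, 1400), (1800, 1400)],
#                            full_shape=(5184, 3456, 3))
```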
The output image acquisition unit 13 transmits the output image B to the smartphone 30 of the subject U via the communication line.
The subject U receives the output image B transmitted from the image processing device 10 with the smartphone 30 via the communication line. By displaying the received output image B on the display screen of the smartphone 30, the subject U can visually check the state of porphyrins on his or her own skin from the output image B. For example, the subject U transmits a subject captured image A of his or her own face, where acne is a concern, from the smartphone 30 to the image processing device 10, and in response receives the output image B from the image processing device 10 on the smartphone 30; by displaying it on the display screen of the smartphone 30, the subject U can visually check from the output image B the state of porphyrins on the face where acne is a concern.
[Model generation device]
The model generation device 20 includes a learning captured image acquisition unit 21, a correct captured image acquisition unit 22, a model generation unit 23, and a model output unit 24.
Each function of the model generation device 20 is realized by the model generation device 20 including computer hardware such as a CPU, a GPU, and memory, and by the CPU executing a computer program stored in the memory. Note that the model generation device 20 may be configured using a general-purpose computer device, or may be configured as a dedicated hardware device. For example, the model generation device 20 may be configured using a server computer connected to a communication network such as the Internet. Each function of the model generation device 20 may also be realized by cloud computing. Further, the model generation device 20 may be realized by a single computer, or the functions of the model generation device 20 may be realized by being distributed over a plurality of computers.
The learning captured image acquisition unit 21 acquires a learning captured image C. The correct captured image acquisition unit 22 acquires a correct captured image D. The learning captured image C and the correct captured image D are acquired from persons selected for acquiring learning data (selected subjects). The learning captured image C is a captured image of the skin of a selected subject irradiated with visible light. For example, the learning captured image C is a captured image of the face of a selected subject irradiated with visible light. The correct captured image D is an image of the skin of a selected subject in which fluorescence caused by porphyrins has occurred due to ultraviolet irradiation. For example, the correct captured image D is an image of the face of a selected subject in which fluorescence caused by porphyrins has occurred due to ultraviolet irradiation.
When the correct captured image D is captured, the skin of the selected subject is irradiated with ultraviolet rays, so it is preferable to obtain the selected subject's consent regarding the ultraviolet irradiation and its effects. The light source that irradiates ultraviolet rays includes wavelengths of 100 to 400 nm. Examples of light sources that irradiate ultraviolet rays include mercury lamps, ultraviolet LED lamps, and black lights. Further, the angle of view and position of the learning captured image C and the correct captured image D are preferably identical. An example of an imaging device for acquiring a learning captured image C and a correct captured image D with the same angle of view and position is VISIA (registered trademark, Canfield Scientific). A smartphone may also be used instead of VISIA as the imaging device for acquiring the learning captured image C. In this case, in order to acquire a learning captured image C and a correct captured image D with the same angle of view and position, it is advisable to specify at least one reference point, preferably 4 to 8 reference points, at the same locations within the captured frame.
The model generation unit 23 performs machine learning of the model MD, using the learning captured image C and the correct captured image D, so that the model generates an image of human skin in which fluorescence caused by porphyrins has occurred from a captured image of human skin irradiated with visible light, and thereby generates the trained model MDa. When the subject captured image A or the subject compressed captured image A' is input, the trained model MDa outputs the output image B or the output image B'.
The model output unit 24 outputs the trained model MDa generated by the model generation unit 23. Examples of methods of outputting the trained model MDa include writing it to a computer-readable recording medium and transmitting the data by communication. The trained model MDa output by the model output unit 24 is provided to the image processing device 10. In the image processing device 10, the model storage unit 11 stores the provided trained model MDa.
Note that the image processing device 10 and the model generation device 20 may be realized using the same information processing device, or may be realized using different information processing devices.
FIG. 2 is a model configuration diagram showing a schematic configuration example of the model MD according to the present embodiment. The model MD shown in FIG. 2 has a U-Net structure in which the layers of the encoder 110 and the layers of the decoder 120 form a symmetrical structure, and the corresponding layers are connected by skip connections 130. The encoder 110 encodes the input image Pin in stages, layer by layer. The decoder 120 decodes the encoding result of the encoder 110 in stages, layer by layer. The decoding result of the decoder 120 is output from the model MD as the output image Pout. Each layer of the encoder 110 and the corresponding layer of the decoder 120 are connected by a skip connection 130.
Note that in the present embodiment a model with the U-Net structure illustrated in FIG. 2 is used as the model MD, but the model is not limited to this. For example, the model MD may be a model without the skip connections 130 shown in FIG. 2. GANs (Generative Adversarial Networks) may also be applied in the present embodiment. A minimal sketch of such a U-Net-style model follows.
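The following is a minimal sketch of a U-Net-style encoder-decoder with skip connections, assuming PyTorch and 256 x 256 RGB inputs and outputs; the number of stages and channel widths are illustrative assumptions and not the configuration disclosed here.

```python
# Minimal sketch of a U-Net-style model: symmetrical encoder/decoder layers
# connected by skip connections. Layer counts and channel widths are hypothetical.
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    """Two 3x3 convolutions with ReLU, used at every encoder/decoder stage."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
    )

class MiniUNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc1 = conv_block(3, 32)
        self.enc2 = conv_block(32, 64)
        self.bottleneck = conv_block(64, 128)
        self.pool = nn.MaxPool2d(2)
        self.up2 = nn.ConvTranspose2d(128, 64, 2, stride=2)
        self.dec2 = conv_block(128, 64)              # 64 (skip) + 64 (upsampled)
        self.up1 = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec1 = conv_block(64, 32)               # 32 (skip) + 32 (upsampled)
        self.out = nn.Conv2d(32, 3, 1)               # 3-channel output image Pout

    def forward(self, x):
        e1 = self.enc1(x)                             # skip connection source 1
        e2 = self.enc2(self.pool(e1))                 # skip connection source 2
        b = self.bottleneck(self.pool(e2))
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return torch.sigmoid(self.out(d1))            # pixel values in [0, 1]

# Example: a 256x256 input image Pin produces a 256x256 output image Pout.
# model = MiniUNet(); pout = model(torch.rand(1, 3, 256, 256))
```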
Next, the model generation method and the image processing method according to the present embodiment will be described.
[Model generation method]
The model generation method according to the present embodiment will be described with reference to FIG. 3. FIG. 3 is a flowchart showing an example of the procedure of the model generation method according to the present embodiment (model learning stage S100).
The model learning stage S100 is a stage executed by the model generation device 20, in which machine learning of the model MD is performed using the learning captured image C and the correct captured image D to generate the trained model MDa.
(Step S101) The learning captured image acquisition unit 21 acquires learning captured images C for a plurality of selected subjects. The correct captured image acquisition unit 22 acquires correct captured images D for the same plurality of selected subjects. The learning captured image C and the correct captured image D are RGB images in which the face of a selected subject is captured in color. An RGB image is composed of a red-component image (R image), a green-component image (G image), and a blue-component image (B image). The model generation device 20 holds the learning captured image C and the correct captured image D of the same selected subject in association with each other. In the learning captured image C and the correct captured image D of the same selected subject, the positions of the faces appearing in the captured images are aligned.
Note that when the learning captured image C is acquired with a smartphone camera, position correction is performed to align the face positions. The position correction method described above can be used; it is detailed below.
As a position correction method, for example, an image captured with a smartphone under visible-light irradiation and an image captured while fluorescence caused by porphyrins was occurring due to ultraviolet irradiation are aligned by projective transformation. As a specific procedure, at least one projection reference point, preferably 4 to 8 points, is specified at the same locations in the two images, and alignment is achieved by matching one set of reference points to the other. Examples of the projection reference points include moles, pseudo-moles drawn with a marker pen, and attached stickers. A smaller reference point is preferable from the viewpoint of making the pixel-level alignment more accurate. As a specific example, it is preferable to attach, in advance, stickers with extremely small holes to the cheeks, which surround the forehead, nose, and mouth area where acne is most likely to occur, so that these areas can be aligned; after imaging with the smartphone and VISIA, alignment is performed by projective transformation using the holes in the stickers as reference points, as sketched below. For example, when the pixel value correlation coefficient between the generated image and the correct image obtained by visual alignment is taken as 1, the pixel value correlation coefficient was "1.10" when a mark was made on the sticker with a marker pen, "1.23" when a mark was made by piercing the sticker with a needle (0.5 mm thick), and "3.20" when a mark was made by piercing the sticker with an ultra-fine needle (0.4 mm thick).
(Step S102) The model generation unit 23 performs preprocessing on the learning captured image C and the correct captured image D. The preprocessing of the learning captured image C and the correct captured image D is described below.
(前処理その1)
 前処理その1では、モデル生成部23は、モデルMDの機械学習に用いる画像として、学習用撮像画像Cの一部分を圧縮した圧縮学習用撮像画像と、正解撮像画像Dの一部分を圧縮した圧縮正解撮像画像とを生成する。図4には、学習用撮像画像C及び正解撮像画像Dのそれぞれから切り出される部分(切り出し部分)が示される。図4の例では、顔のうち額、鼻及び口周辺から切り出し部分が切り出される。これは、額、鼻及び口周辺は、特にニキビができやすい箇所、つまり、ニキビの原因となるアクネ菌が発生しやすくポルフィリンが多く検出されやすい箇所であるからである。なお、顔のうち、少なくとも額、鼻又は口周辺のいずれかから切り出し部分が切り出されてもよい。また、顔のうち、額、鼻及び口周辺の部分以外の他の部分(例えば頬など)から切り出し部分が切り出されてもよく、学習用撮像画像C及び正解撮像画像Dが選任被験者の顔全体の撮像画像である場合は、それぞれの画像を等間隔に切り出してもよい。
(Pretreatment part 1)
In preprocessing part 1, the model generation unit 23 uses a compressed learning captured image obtained by compressing a portion of the captured image C for learning and a compressed correct solution obtained by compressing a portion of the correct captured image D as images used for machine learning of the model MD. A captured image is generated. FIG. 4 shows portions (cutout portions) cut out from each of the learning captured image C and the correct captured image D. In the example of FIG. 4, portions of the face are cut out from around the forehead, nose, and mouth. This is because the areas around the forehead, nose, and mouth are areas where acne is particularly likely to occur, that is, areas where acne-causing bacteria are likely to occur and a large amount of porphyrin is likely to be detected. Note that the cutout portion may be cut out from at least the forehead, nose, or mouth area of the face. Further, the cutout portion may be cut out from other parts of the face (for example, the cheeks) other than the forehead, nose, and the area around the mouth, so that the learning captured image C and the correct captured image D are the entire face of the selected subject. , each image may be cut out at equal intervals.
 The model generation unit 23 compresses the image of each cutout portion to a predetermined compression size. For example, when the image size of the original learning captured image C and correct captured image D is "3456×5184" pixels, the image size of the cutout portion is set to "768×768" pixels, and the "768×768"-pixel cutout image is compressed to "256×256" pixels. The "256×256"-pixel image obtained by compressing the "768×768"-pixel cutout taken from the original learning captured image C is the compressed learning captured image. The "256×256"-pixel image obtained by compressing the "768×768"-pixel cutout taken from the original correct captured image D is the compressed correct captured image. A cutout size of "768×768" pixels is preferable in order to minimize blurring of the output image caused by differences in the position of the face due to movement of the same selected subject between the capture of the learning captured image C and the capture of the correct captured image D.
 Note that the image size values given above, such as "3456×5184" pixels, "768×768" pixels, and "256×256" pixels, are merely examples of image sizes and are not limiting.
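 For reference, the following is a minimal sketch of this crop-and-compress step, assuming OpenCV-style NumPy images; the function name crop_and_compress and the cutout coordinates are hypothetical and only illustrate the sizes described above, not the exact regions of FIG. 4.

```python
# A minimal sketch of preprocessing 1 (crop a 768x768 region, compress to 256x256),
# assuming 8-bit images loaded with OpenCV; coordinates are placeholders.
import cv2

CUTOUT_SIZE = 768      # cutout size in pixels (768 x 768)
COMPRESSED_SIZE = 256  # compressed size in pixels (256 x 256)

def crop_and_compress(image, top_left):
    """Cut a 768x768 region out of the full image and compress it to 256x256."""
    y, x = top_left
    cutout = image[y:y + CUTOUT_SIZE, x:x + CUTOUT_SIZE]
    return cv2.resize(cutout, (COMPRESSED_SIZE, COMPRESSED_SIZE),
                      interpolation=cv2.INTER_AREA)

# Example use with the 3456x5184-pixel originals; the coordinates below are
# placeholders standing in for the forehead, nose, and mouth regions.
# learning_c = cv2.imread("learning_C.png")               # visible-light image
# correct_d = cv2.imread("correct_D.png")                 # fluorescence image
# compressed_c = crop_and_compress(learning_c, (600, 1400))
# compressed_d = crop_and_compress(correct_d, (600, 1400))
```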
(Preprocessing 2)
 In preprocessing 2, the model generation unit 23 generates an emphasized correct captured image in which the red component of the correct captured image D is emphasized. In the present embodiment, preprocessing 2 is applied to the compressed correct captured image generated in preprocessing 1 described above. In a preferred form of preprocessing 2, the emphasized correct captured image is an RGB image generated by adding, to the RGB image of the correct captured image D, a difference image obtained by subtracting the G image from the R image among the R image, G image, and B image that constitute that RGB image.
 FIG. 5 shows example images for explaining preprocessing 2 of the correct captured image D according to the present embodiment. FIG. 5(1) is the compressed correct captured image (RGB image) generated in preprocessing 1. FIGS. 5(2), (3), and (4) are the R image, G image, and B image that constitute the compressed correct captured image (RGB image). The model generation unit 23 adds the difference image of FIG. 5(5) to the RGB image of FIG. 5(1) to generate the RGB image (emphasized correct captured image) of FIG. 5(6). In the RGB image (emphasized correct captured image) of FIG. 5(6), the red component is emphasized compared with the original RGB image (compressed correct captured image) of FIG. 5(1).
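 The following is a minimal sketch of one possible reading of preprocessing 2, assuming 8-bit RGB NumPy arrays; here the (R − G) difference image is added to every channel of the RGB image with clipping, but adding it only to the R channel would be an equally plausible implementation, and the function name emphasize_red is hypothetical.

```python
# A minimal sketch of preprocessing 2 (red-component emphasis), assuming
# 8-bit RGB images stored as numpy arrays with channel order R, G, B.
import numpy as np

def emphasize_red(rgb_image):
    """Generate an emphasized correct captured image from an RGB image."""
    img = rgb_image.astype(np.int16)          # widen to avoid uint8 overflow
    r, g = img[..., 0], img[..., 1]
    diff = np.clip(r - g, 0, 255)             # difference image (R - G)
    emphasized = img + diff[..., None]        # add the difference to the RGB image
    return np.clip(emphasized, 0, 255).astype(np.uint8)
```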
 This concludes the description of the preprocessing applied to the learning captured image C and the correct captured image D.
 In the present embodiment, the model generation unit 23 uses the compressed learning captured image and the emphasized correct captured image for the machine learning of the model MD. Alternatively, preprocessing 2 may be omitted, and the compressed learning captured image and the compressed correct captured image may be used for the machine learning of the model MD.
(Step S103) The model generation unit 23 inputs the compressed learning captured image C′ to the model MD as the input image Pin (see FIG. 2). When the compressed learning captured image (input image Pin) is input, the model MD generates and outputs an output image Pout (see FIG. 2).
(Step S104) The model generation unit 23 compares the output image Pout output from the model MD with the emphasized correct captured image D′ and performs feedback control on the model MD so that the difference value resulting from the comparison becomes smaller. The emphasized correct captured image D′ is the emphasized correct captured image generated from the correct captured image D of the same selected subject, corresponding to the compressed learning captured image C′ used as the input image Pin.
 In the present embodiment, for example, the MSE (mean squared error) given by the following expression is used as the difference value between the output image Pout and the emphasized correct captured image D′.
$$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left( x_i - y_i \right)^{2}$$
 Here, N is the number of pixels, i is the pixel index, x_i is the i-th pixel value of the emphasized correct captured image D′, and y_i is the i-th pixel value of the output image Pout.
 The model generation unit 23 repeats the machine learning of the model MD, using the learning captured images C (compressed learning captured images C′) and the correct captured images D (emphasized correct captured images D′) acquired from a plurality of selected subjects, until a predetermined termination condition is satisfied. The predetermined termination condition may be a predetermined number of iterations, or may be that the difference value between the output image Pout and the emphasized correct captured image D′ falls below a predetermined value.
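 A minimal PyTorch-style sketch of steps S103 and S104 is shown below, assuming the model maps a compressed learning captured image C′ to an output image Pout and that (C′, D′) tensor pairs from the selected subjects are available; the optimizer, learning rate, and thresholds are illustrative assumptions, not values from the embodiment.

```python
# A minimal sketch of the training feedback loop with an MSE difference value.
import torch

def train(model, pairs, max_iters=10000, target_mse=1e-3, lr=1e-4):
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    mse = torch.nn.MSELoss()                  # (1/N) * sum((x_i - y_i)^2)
    for step in range(max_iters):
        for c_prime, d_prime in pairs:        # input Pin and target D'
            pout = model(c_prime)             # output image Pout
            loss = mse(pout, d_prime)         # difference value
            optimizer.zero_grad()
            loss.backward()                   # feedback control on the model
            optimizer.step()
        if loss.item() < target_mse:          # termination condition
            break
    return model
```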
 The model generation unit 23 passes the model MD for which machine learning has been completed to the model output unit 24 as the learned model MDa. The model output unit 24 outputs the learned model MDa.
 It was confirmed experimentally that using the compressed learning captured images and compressed correct captured images generated by preprocessing 1 described above for machine learning of the model MD improves how faithfully the locations of fluorescence caused by porphyrins are reproduced in the output image Pout generated by the learned model MDa. Specifically, with nine selected subjects, and with the subject captured images prepared by cutting a "768×768"-pixel region out of the "3456×5184"-pixel subject captured image A and compressing that cutout to "256×256" pixels to obtain the subject captured image A′, the pixel-value correlation coefficient between the output image and the correct captured image was "0.22" when preprocessing 1 was not performed, whereas it was "0.35" when preprocessing 1 was performed, that is, when "768×768"-pixel regions were cut out of the "3456×5184"-pixel learning captured image C and correct captured image D and compressed into "256×256"-pixel compressed learning captured images and compressed correct captured images.
 It was also confirmed experimentally that using the emphasized correct captured images generated by preprocessing 2 described above for machine learning of the model MD improves how faithfully the locations of fluorescence caused by porphyrins are reproduced in the output image Pout generated by the learned model MDa. Specifically, with nine selected subjects and with the subject captured image A′ prepared as described above, the pixel-value correlation coefficient between the output image and the correct captured image was "0.35" when preprocessing 1 was performed but preprocessing 2 was not, whereas it was "0.61" when both preprocessing 1 and preprocessing 2 were performed.
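 As a reference, the pixel-value correlation coefficient quoted above can be understood as a Pearson correlation over all pixel values; a minimal sketch, assuming 8-bit image arrays and a hypothetical function name, is shown below.

```python
# A minimal sketch of the pixel-value correlation coefficient between an
# output image and the corresponding correct captured image.
import numpy as np

def pixel_value_correlation(output_image, correct_image):
    x = output_image.astype(np.float64).ravel()
    y = correct_image.astype(np.float64).ravel()
    return np.corrcoef(x, y)[0, 1]   # Pearson correlation coefficient
```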
[Image processing method]
 The image processing method according to the present embodiment is described with reference to FIG. 6. FIG. 6 is a flowchart showing an example of the procedure of the image processing method (skin condition measurement step S200) according to the present embodiment.
 The skin condition measurement step S200 is a step executed by the image processing device 10, in which the output image B is generated, using the learned model MDa, from the subject captured image A of the subject U or from the subject compressed captured image A′ converted from the subject captured image A. The model storage unit 11 of the image processing device 10 stores the learned model MDa generated by the model generation device 20 in the model learning step S100 described above.
(Step S201) The subject U images the skin of the area to be measured (for example, the face) with, for example, the camera of the smartphone 30, and transmits the captured subject captured image A from the smartphone 30 to the image processing device 10. The subject captured image acquisition unit 12 receives the subject captured image A transmitted from the smartphone 30 of the subject U via a communication line or the like. Here, the subject captured image acquisition unit 12 may compress the subject captured image A to a predetermined compression size and convert it into the subject compressed captured image A′.
 Note that when the image is captured with the camera of the smartphone 30, it may be captured so that color correction and position correction can be performed. Specific methods of color correction and position correction include the methods described above.
(Step S202) The output image acquisition unit 13 inputs the subject captured image A or the subject compressed captured image A′ to the learned model MDa as the input image Pin (see FIG. 2). When the subject captured image A or the subject compressed captured image A′ (input image Pin) is input, the learned model MDa generates and outputs an output image Pout (see FIG. 2). The output image acquisition unit 13 acquires the output image Pout output from the learned model MDa as the output image B or the output image B′. The output image B or the output image B′ is an image of the skin of the subject U in which fluorescence caused by porphyrins has occurred, inferred by the learned model MDa from the subject captured image A or the subject compressed captured image A′ of the subject U. For example, the output image B or the output image B′ is an image of the face of the subject U in which fluorescence caused by porphyrins has occurred, inferred by the learned model MDa from the subject captured image A or the subject compressed captured image A′ of the face of the subject U. Since the output image B′ is a compressed image, the output image acquisition unit 13 expands the output image B′ to the same size as the subject captured image A and restores it to the output image B.
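 A minimal sketch of this inference step is shown below, assuming a PyTorch learned model MDa, 8-bit RGB input, and a simple 0–1 normalization; the normalization, channel layout, and function name are assumptions for illustration only.

```python
# A minimal sketch of step S202: compress A to A', infer Pout (B'), restore to B.
import cv2
import numpy as np
import torch

def infer(model, subject_image_a, size=256):
    h, w = subject_image_a.shape[:2]
    a_prime = cv2.resize(subject_image_a, (size, size))        # subject compressed image A'
    pin = torch.from_numpy(a_prime).permute(2, 0, 1).float() / 255.0
    with torch.no_grad():
        pout = model(pin.unsqueeze(0))[0]                       # output image Pout (B')
    b_prime = (pout.permute(1, 2, 0).numpy() * 255).clip(0, 255).astype(np.uint8)
    return cv2.resize(b_prime, (w, h))                          # restore to output image B
```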
 The output image acquisition unit 13 transmits the output image B to the smartphone 30 of the subject U via the communication line. The subject U receives the output image B transmitted from the image processing device 10 on the smartphone 30 via the communication line. By displaying the received output image B on the display screen of the smartphone 30, the subject U can visually check the state of porphyrins on his or her skin (for example, the face) from the output image B. Note that when a portable terminal device such as a smartphone 30 or a tablet PC that has a camera and incorporates the image processing device 10 is used, it is unnecessary to transmit the captured subject captured image A of the face from the smartphone 30 to the image processing device 10 and to transmit the output image B from the image processing device 10 to the smartphone 30 of the subject U.
[Modification of the teacher data creation method]
 Next, a modification of the teacher data creation method is described with reference to FIGS. 7 to 9. First, the problem that this modification is intended to solve is explained. In the learning stage, for example, "VISIA (registered trademark), Canfield Scientific" is used as the imaging device for acquiring the learning captured image C and the correct captured image D. By using such a device, the angle of view and position of the learning captured image C and the correct captured image D become the same, and alignment is not required (or the amount of alignment is slight). In the inference stage, however, the subject U is assumed to capture the subject captured image A using the smartphone 30. In such a case, the number of pixels of the learning images and the number of pixels of the inference image differ greatly, so that sufficient inference may not be obtained.
 Therefore, it is conceivable to use images captured with the smartphone 30 in the learning stage as well, as in the inference stage. The learning captured image C is a captured image of the selected subject's skin irradiated with visible light and can therefore easily be captured with the smartphone 30. On the other hand, the correct captured image D is an image of the selected subject's face in which fluorescence caused by porphyrins has occurred due to irradiation with ultraviolet rays, and therefore cannot easily be captured with the smartphone 30. Accordingly, it is conceivable to use an image captured with the smartphone 30 as the learning captured image C, to use an image captured with a device such as "VISIA (registered trademark), Canfield Scientific" as the correct captured image D, and to align these images to create the teacher data.
 FIG. 7 is a conceptual diagram for explaining the difference between how the learning captured image and the correct captured image are captured according to one embodiment. The image acquisition methods in the modification of the teacher data creation method are described with reference to this figure. FIG. 7(A) shows an example of the acquisition method for the correct captured image D, in which imaging is performed with a device such as "VISIA (registered trademark), Canfield Scientific". As shown in the figure, the subject's face is fixed to the device, so camera shake and the like do not occur. If the learning captured image C is captured at the same time, the position of the subject's face does not change, so alignment is not required (and even if the images are not captured simultaneously, the amount of alignment is slight). Furthermore, as shown in FIG. 7(A), when imaging is performed with a device such as "VISIA (registered trademark), Canfield Scientific", the imaging device itself also produces higher image quality than the smartphone 30.
 FIG. 7(B) shows an example of the acquisition method for the learning captured image C, in which imaging is performed with the smartphone 30. As shown in the figure, the subject's face is not fixed, and camera shake and the like occur. The image quality of the imaging device itself is also lower than that of a device such as "VISIA (registered trademark), Canfield Scientific". In order to use the correct captured image D and the learning captured image C captured in these different ways as teacher data, highly accurate alignment is performed in the present embodiment. Note that even when the smartphone 30 is used, it is possible to fix the face and suppress camera shake during imaging. In the present embodiment, however, teacher data is deliberately created from images similar to those of the inference stage, so imaging may be performed under the conditions expected at the inference stage. The conditions expected at the inference stage may be, for example, the subject holding the smartphone 30 at arm's length and imaging himself or herself with its front camera (a so-called selfie).
 FIG. 8 is a conceptual diagram of the reference points used for creating the teacher data according to one embodiment. Four reference points are attached to the subject's face. Each reference point may be something like a sticker. To further improve accuracy, a mark may be drawn on the sticker with a marker or the like, or a hole for position identification may be opened in the sticker.
 FIG. 9 is a diagram showing an example of the preprocessing of the correct captured image and the cutting out of the learning captured image and the correct captured image according to one embodiment. An example of the preprocessing of the correct captured image D and the cutting out of the learning captured image C and the correct captured image D according to the modification of the teacher data creation method is described with reference to this figure.
 First, FIG. 9(A) is an example of the correct captured image D. Since the correct captured image D is captured with the face fixed, a front-view image is shown. FIG. 9(B) is an example of the learning captured image C. Since the learning captured image C is captured with the smartphone 30 without fixing the face, it may not be a front view. In the present embodiment, these images are aligned.
 Before the alignment, preprocessing is performed on the correct captured image D. In this preprocessing, a conventional technique such as projective transformation is used to transform the correct captured image D to match the learning captured image C. This preprocessing may be performed by image processing. FIG. 9(C) shows the result of the preprocessing performed on the correct captured image D. Here, in the present embodiment, the learning captured image C is not transformed to match the correct captured image D; rather, the correct captured image D is transformed to match the learning captured image C. By transforming the image to match the image of the inference stage, learning is performed on images similar to those of the inference stage.
 Next, teacher data is created by extracting (cutting out) a partial region from each of the image shown in FIG. 9(B) and the image shown in FIG. 9(C). Reference points such as those shown in FIG. 8 are used for this extraction. Note that the reference points need not be points attached specifically for creating the teacher data; they may be characteristic points of the human body, such as a mole or the corners of the mouth.
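 A minimal sketch of this alignment and extraction is shown below, assuming the pixel coordinates of the four reference points have already been located in both images (for example, by detecting the stickers); OpenCV's projective transformation is used to warp the correct captured image D onto the learning captured image C, and a region around a reference-point-based center is then cut out of both images. The function name and parameters are assumptions for illustration.

```python
# A minimal sketch of reference-point-based alignment (projective transform)
# and extraction of one matched patch pair for the teacher data.
import cv2
import numpy as np

def align_and_extract(learning_c, correct_d, pts_c, pts_d, center, size=768):
    # pts_c, pts_d: 4x2 arrays of matching reference points in C and D
    h_mat = cv2.getPerspectiveTransform(np.float32(pts_d), np.float32(pts_c))
    d_aligned = cv2.warpPerspective(correct_d, h_mat,
                                    (learning_c.shape[1], learning_c.shape[0]))
    cx, cy = center                               # reference-point-based center
    half = size // 2
    patch_c = learning_c[cy - half:cy + half, cx - half:cx + half]
    patch_d = d_aligned[cy - half:cy + half, cx - half:cx + half]
    return patch_c, patch_d                       # one teacher-data pair
```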
 According to the modification of the teacher data creation method described above, the learning captured image acquisition step acquires the learning captured image C, which is an image of human skin irradiated with visible light in which a portion having the reference points is captured, and the correct captured image acquisition step acquires the correct captured image D, which is an image in which the same subject as in the learning captured image C is captured by an imaging device different from the imaging device that captured the learning captured image C, and in which human skin exhibiting fluorescence caused by porphyrins due to irradiation with ultraviolet rays is captured. The first extraction step extracts a partial region of the learning captured image C based on the reference points, and the second extraction step extracts a partial region of the correct captured image D based on the reference points. The teacher data creation step creates the teacher data by storing the images extracted from the learning captured image C and the correct captured image D in association with each other. By creating the teacher data in this way, teacher data similar to the images of the inference stage can be created. That is, according to the present embodiment, learning can be performed accurately, and inference can be performed accurately.
 In addition, according to the modification of the teacher data creation method described above, an image processing step is further included in which the correct captured image D (rather than the learning captured image C) is processed to generate an image matched to the learning captured image C. This image processing step is the preprocessing described above. In the second extraction step, the partial region based on the reference points is extracted from the correct captured image D that has been processed in the image processing step. By creating the teacher data in this way, teacher data similar to the images of the inference stage can be created. That is, according to the present embodiment, learning can be performed accurately, and inference can be performed accurately.
 Note that in the image processing step, in addition to processing the correct captured image D, image processing may also be performed on the learning captured image C.
 This concludes the description of the model generation method and the image processing method according to the present embodiment.
 Note that the visible light used when capturing the learning captured image C and the subject captured image A is preferably light of the same wavelength and intensity. Blue light (for example, a wavelength of 380 to 550 nm) may be used as the visible light for capturing the learning captured image C and the subject captured image A. It was confirmed experimentally that using blue light when capturing the learning captured image C and the subject captured image A improves how faithfully the locations of fluorescence caused by porphyrins are reproduced in the output image Pout generated by the learned model MDa, compared with using white light. Specifically, with nine selected subjects, with the subject captured image A′ prepared by cutting a "768×768"-pixel region out of the "3456×5184"-pixel subject captured image A and compressing that cutout to "256×256" pixels, and with both preprocessing 1 and preprocessing 2 described above performed, the pixel-value correlation coefficient between the output image and the correct captured image was "0.61" when white light (400 to 770 nm) was used, whereas it was "0.65" when blue light (380 to 550 nm) was used. Examples of embodiments using blue light include covering an artificial light source such as an incandescent bulb, fluorescent lamp, or LED bulb with a blue film, irradiation with an LED bulb that emits only blue light, and displaying blue over the entire screen of a smartphone or monitor and using it for illumination.
 According to the embodiment described above, the image processing device 10 uses the learned model MDa, on which machine learning has been performed using the learning captured image C in which human skin irradiated with visible light is captured and the correct captured image D in which human skin exhibiting fluorescence caused by porphyrins due to irradiation with ultraviolet rays is captured, so as to generate, from a captured image of human skin irradiated with visible light, an image of human skin in which fluorescence caused by porphyrins has occurred, to acquire, from the subject captured image A in which the skin of the subject U irradiated with visible light is captured, the output image B, which is an image of human skin in which fluorescence caused by porphyrins has occurred. This provides the effect that porphyrins on the skin of the subject U can be detected without irradiating the subject U with ultraviolet rays. The subject captured image A may be converted into and used as the subject compressed captured image A′, in which a portion of the subject captured image A is compressed. This improves how faithfully the locations of fluorescence caused by porphyrins are reproduced in the output image B generated by the learned model MDa. The portion may be around the forehead, nose, or mouth of the face. From the subject compressed captured image A′, the output image B′, which is an image of human skin in which fluorescence caused by porphyrins has occurred, is acquired. Since the output image B′ is a compressed image, it is expanded to the same size as the subject captured image A and restored to the output image B, whereby the output image B is acquired.
 Furthermore, the machine learning of the learned model MDa may use a compressed learning captured image obtained by compressing a portion of the learning captured image C and a compressed correct captured image obtained by compressing a portion of the correct captured image D. This improves how faithfully the locations of fluorescence caused by porphyrins are reproduced in the output image B generated by the learned model MDa. The portion may be around the forehead, nose, or mouth of the face.
 Furthermore, the machine learning of the learned model MDa may use an emphasized correct captured image in which the red component of the correct captured image D is emphasized. This improves how faithfully the locations of fluorescence caused by porphyrins are reproduced in the output image B generated by the learned model MDa. The correct captured image D may be an RGB image composed of an R image, a G image, and a B image, and the emphasized correct captured image may be an RGB image generated by adding, to the RGB image, a difference image obtained by subtracting the G image from the R image.
 The learned model MDa may be composed of encoder layers and decoder layers. The learned model MDa may have a U-Net structure in which the encoder layers and the decoder layers form a symmetric structure and are connected by skip connections.
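 As a reference, the following is a minimal sketch of an encoder-decoder with a skip connection (a U-Net-like structure); the depth and channel counts are illustrative assumptions and not the configuration actually used in the embodiment.

```python
# A minimal sketch of a symmetric encoder-decoder with a skip connection.
import torch
import torch.nn as nn

def block(c_in, c_out):
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU())

class TinyUNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc1, self.enc2 = block(3, 32), block(32, 64)
        self.pool = nn.MaxPool2d(2)
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        self.dec1 = block(64 + 32, 32)          # skip connection concatenates enc1
        self.out = nn.Conv2d(32, 3, 1)

    def forward(self, x):
        e1 = self.enc1(x)                                        # encoder layer
        e2 = self.enc2(self.pool(e1))                            # bottleneck
        d1 = self.dec1(torch.cat([self.up(e2), e1], dim=1))      # decoder + skip
        return torch.sigmoid(self.out(d1))                       # output image in [0, 1]
```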
 According to the embodiment described above, the model generation device 20 performs machine learning of the model MD, using the learning captured image C in which human skin irradiated with visible light is captured and the correct captured image D in which human skin exhibiting fluorescence caused by porphyrins due to irradiation with ultraviolet rays is captured, so as to generate, from a captured image of human skin irradiated with visible light, an image of human skin in which fluorescence caused by porphyrins has occurred, and thereby generates the learned model MDa. Using this learned model MDa provides the effect that porphyrins on the skin of the subject U can be detected without irradiating the subject U with ultraviolet rays.
 The model generation device 20 may also generate a compressed learning captured image obtained by compressing a portion of the learning captured image C and a compressed correct captured image obtained by compressing a portion of the correct captured image D, and use the compressed learning captured image and the compressed correct captured image for the machine learning of the model MD. This improves how faithfully the locations of fluorescence caused by porphyrins are reproduced in the output image B generated by the learned model MDa. The portion may be around the forehead, nose, or mouth of the face.
 The model generation device 20 may also generate an emphasized correct captured image in which the red component of the correct captured image D is emphasized and use the emphasized correct captured image for the machine learning of the model MD. This improves how faithfully the locations of fluorescence caused by porphyrins are reproduced in the output image B generated by the learned model MDa. The correct captured image D may be an RGB image composed of an R image, a G image, and a B image, and the emphasized correct captured image may be an RGB image generated by adding, to the RGB image, a difference image obtained by subtracting the G image from the R image.
 The model MD may be composed of encoder layers and decoder layers. The model MD may also have a U-Net structure in which the encoder layers and the decoder layers form a symmetric structure and are connected by skip connections.
 The image processing device 10 described above can also be used to determine susceptibility to acne. For example, susceptibility to acne can be determined by comparing the output image acquired by the image processing device 10 with a standard image. The standard image may be an image of a typical person in whom acne has not occurred. As the standard image, a suitable predetermined image may be prepared according to, for example, the age, gender, or nationality of the person to be assessed (the subject U). The standard image need not be an actually captured image and may be an image generated by image processing.
 The image processing device 10 described above can also be used to evaluate the effect of skin care before and after the subject performs skin care. In this case, as a first acquisition step, an output image output by the image processing device 10 is acquired based on the subject captured image A before the subject performs skin care. As a second acquisition step, an output image output by the image processing device 10 is acquired based on the subject captured image A after the subject performs skin care. The output image acquired in the first acquisition step and the output image acquired in the second acquisition step are then compared to determine the effect of the skin care. This determination may be made by comparing the output images before and after the skin care, or by comparing each of the output images before and after the skin care with a standard image.
 A computer program for realizing the functions of each of the devices described above may be recorded on a computer-readable recording medium, and the program recorded on the recording medium may be read into a computer system and executed. The "computer system" here may include an OS and hardware such as peripheral devices.
 The "computer system" also includes a homepage providing environment (or display environment) when a WWW system is used.
 The "computer-readable recording medium" refers to a writable nonvolatile memory such as a flexible disk, a magneto-optical disk, a ROM, or a flash memory, a portable medium such as a DVD (Digital Versatile Disc), or a storage device such as a hard disk built into the computer system.
 The "computer-readable recording medium" further includes media that hold the program for a certain period of time, such as a volatile memory (for example, DRAM (Dynamic Random Access Memory)) inside a computer system serving as a server or client when the program is transmitted via a network such as the Internet or a communication line such as a telephone line.
 The program may be transmitted from a computer system storing the program in a storage device or the like to another computer system via a transmission medium or by transmission waves in a transmission medium. Here, the "transmission medium" that transmits the program refers to a medium having a function of transmitting information, such as a network (communication network) such as the Internet or a communication line (communication channel) such as a telephone line.
 The program may realize only some of the functions described above. Furthermore, it may be a so-called differential file (differential program) that realizes the functions described above in combination with a program already recorded in the computer system.
 Although embodiments of the present invention have been described above in detail with reference to the drawings, the specific configuration is not limited to these embodiments, and design changes and the like within a scope not departing from the gist of the present invention are also included.
10…image processing device, 11…model storage unit, 12…subject captured image acquisition unit, 13…output image acquisition unit, 21…learning captured image acquisition unit, 22…correct captured image acquisition unit, 23…model generation unit, 24…model output unit, MD…model, MDa…learned model, 30…smartphone, U…subject

Claims (14)

  1.  An image processing device comprising:
     a model storage unit that stores a learned model on which machine learning has been performed, using a learning captured image in which human skin irradiated with visible light is captured and a correct captured image in which human skin exhibiting fluorescence caused by porphyrins due to irradiation with ultraviolet rays is captured, so as to generate, from a captured image of human skin irradiated with visible light, an image of human skin in which fluorescence caused by porphyrins has occurred;
     a subject captured image acquisition unit that acquires a subject captured image in which the skin of a subject irradiated with visible light is captured; and
     an output image acquisition unit that acquires, from the subject captured image, using the learned model, an output image that is an image of human skin in which fluorescence caused by porphyrins has occurred.
  2.  The image processing device according to claim 1, wherein the machine learning of the learned model uses a compressed learning captured image obtained by compressing a portion of the learning captured image and a compressed correct captured image obtained by compressing a portion of the correct captured image.
  3.  The image processing device according to claim 2, wherein the skin is facial skin, and the portion is around the forehead, nose, or mouth.
  4.  The image processing device according to claim 1, wherein the machine learning of the learned model uses an emphasized correct captured image in which a red component of the correct captured image is emphasized.
  5.  The image processing device according to claim 4, wherein the correct captured image is an RGB image composed of an R image, a G image, and a B image, and the emphasized correct captured image is an RGB image generated by adding, to the RGB image, a difference image obtained by subtracting the G image from the R image.
  6.  The image processing device according to any one of claims 1 to 5, wherein the learned model is composed of encoder layers and decoder layers.
  7.  A model generation device comprising:
     a learning captured image acquisition unit that acquires a learning captured image in which human skin irradiated with visible light is captured;
     a correct captured image acquisition unit that acquires a correct captured image in which human skin exhibiting fluorescence caused by porphyrins due to irradiation with ultraviolet rays is captured; and
     a model generation unit that performs machine learning of a model, using the learning captured image and the correct captured image, so as to generate, from a captured image of human skin irradiated with visible light, an image of human skin in which fluorescence caused by porphyrins has occurred, and generates a learned model.
  8.  A skin condition measurement system comprising the image processing device according to any one of claims 1 to 5 and the model generation device according to claim 7.
  9.  An image processing method comprising:
     a step of acquiring a subject captured image in which the skin of a subject irradiated with visible light is captured; and
     a step of acquiring, from the subject captured image, using a learned model, an output image that is an image of human skin in which fluorescence caused by porphyrins has occurred,
     wherein the learned model is a model on which machine learning has been performed, using a learning captured image in which human skin irradiated with visible light is captured and a correct captured image in which human skin exhibiting fluorescence caused by porphyrins due to irradiation with ultraviolet rays is captured, so as to generate, from a captured image of human skin irradiated with visible light, an image of human skin in which fluorescence caused by porphyrins has occurred.
  10.  The image processing method according to claim 9, wherein, in a step of acquiring the learning captured image and the correct captured image, at least one reference point to be captured in the images is placed.
  11.  A model generation method comprising:
     a step of acquiring a learning captured image in which human skin irradiated with visible light is captured;
     a step of acquiring a correct captured image in which human skin exhibiting fluorescence caused by porphyrins due to irradiation with ultraviolet rays is captured; and
     a step of performing machine learning of a model, using the learning captured image and the correct captured image, so as to generate, from a captured image of human skin irradiated with visible light, an image of human skin in which fluorescence caused by porphyrins has occurred, and generating a learned model.
  12.  A determination method for determining susceptibility to acne by comparing the output image acquired by the image processing device according to any one of claims 1 to 5 with a standard image that is an image of the skin of a typical person in whom acne has not occurred.
  13.  A determination method for evaluating the effect of skin care before and after a subject performs skin care, the method comprising:
     a first acquisition step of acquiring the output image acquired by the image processing device according to any one of claims 1 to 5 based on the subject captured image before the subject performs skin care;
     a second acquisition step of acquiring the output image acquired by the image processing device according to any one of claims 1 to 5 based on the subject captured image after the subject performs skin care; and
     comparing the output image acquired in the first acquisition step with the output image acquired in the second acquisition step to determine the degree of improvement due to the skin care.
  14.  A teacher data creation method comprising:
     a learning captured image acquisition step of acquiring a learning captured image that is an image in which human skin irradiated with visible light is captured and in which a portion having a reference point is captured;
     a correct captured image acquisition step of acquiring a correct captured image that is an image in which the same subject as in the learning captured image is captured by an imaging device different from the imaging device that captured the learning captured image, and in which human skin exhibiting fluorescence caused by porphyrins due to irradiation with ultraviolet rays is captured;
     a first extraction step of extracting a partial region of the learning captured image based on the reference point;
     a second extraction step of extracting a partial region of the correct captured image based on the reference point; and
     a teacher data creation step of creating teacher data by storing the images extracted from the learning captured image and the correct captured image in association with each other.
PCT/JP2023/029036 2022-08-10 2023-08-09 Image processing device, model generation device, skin state measurement system, image processing method, model generation method, determining method, and teacher data creation method WO2024034630A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2022127815 2022-08-10
JP2022-127815 2022-08-10

Publications (1)

Publication Number Publication Date
WO2024034630A1 true WO2024034630A1 (en) 2024-02-15

Family

ID=89851614

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2023/029036 WO2024034630A1 (en) 2022-08-10 2023-08-09 Image processing device, model generation device, skin state measurement system, image processing method, model generation method, determining method, and teacher data creation method

Country Status (1)

Country Link
WO (1) WO2024034630A1 (en)


Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH05103771A (en) * 1991-10-17 1993-04-27 Kao Corp Detection method for porphyrin on skin
WO2006118060A1 (en) * 2005-04-28 2006-11-09 Shiseido Company, Ltd. Skin state analyzing method, skin state analyzing device, and recording medium on which skin state analyzing program is recorded
JP2009000494A (en) * 2007-05-23 2009-01-08 Noritsu Koki Co Ltd Porphyrin detection method, porphyrin display method, and porphyrin detector
JP2018196426A (en) * 2017-05-23 2018-12-13 花王株式会社 Pore detection method and pore detection device
CN110390631A (en) * 2019-07-11 2019-10-29 上海媚测信息科技有限公司 Generate method, system, network and the storage medium of UV spectrum picture
JP2021125056A (en) * 2020-02-07 2021-08-30 カシオ計算機株式会社 Identification device, identification equipment learning method, identification method, and program
WO2022149110A1 (en) * 2021-01-11 2022-07-14 Baracoda Daily Healthtech Systems and methods for skin analysis

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WATANABE SOTA; HASEGAWA MAKOTO: "Visualization of Cutibacterium acnes with visible light using deep learning", SPIE, 1000 20TH ST. BELLINGHAM WA 98225-6705 USA, vol. 12592, 25 March 2023 (2023-03-25), 1000 20th St. Bellingham WA 98225-6705 USA, pages 1259214 - 1259214-6, XP060175088, ISSN: 0277-786X, ISBN: 978-1-5106-6308-4, DOI: 10.1117/12.2666674 *

Similar Documents

Publication Publication Date Title
CN110197229B (en) Training method and device of image processing model and storage medium
JP4373828B2 (en) Specific area detection method, specific area detection apparatus, and program
Krupinski et al. American Telemedicine Association’s practice guidelines for teledermatology
US8655068B1 (en) Color correction system
WO2019072190A1 (en) Image processing method, electronic apparatus, and computer readable storage medium
CN109587556B (en) Video processing method, video playing method, device, equipment and storage medium
US9277148B2 (en) Maximizing perceptual quality and naturalness of captured images
JP2004357277A (en) Digital image processing method
JP2023056056A (en) Data generation method, learning method and estimation method
JP2005092759A (en) Image processing device and method, red-eye detection method, and program
EP4024352A3 (en) Method and apparatus for face liveness detection, and storage medium
US12014829B2 (en) Image processing and presentation techniques for enhanced proctoring sessions
CN116188296A (en) Image optimization method and device, equipment, medium and product thereof
WO2024034630A1 (en) Image processing device, model generation device, skin state measurement system, image processing method, model generation method, determining method, and teacher data creation method
JP6898150B2 (en) Pore detection method and pore detection device
Banterle et al. A psychophysical evaluation of inverse tone mapping techniques
US11790475B2 (en) Light-field messaging to embed a hidden message into a carrier
Akyüz et al. An evaluation of image reproduction algorithms for high contrast scenes on large and small screen display devices
US10973412B1 (en) System for producing consistent medical image data that is verifiably correct
WO2021126268A1 (en) Neural networks to provide images to recognition engines
WO2021125268A1 (en) Control device, control method, and program
CN114945079A (en) Video recording and invigilating method for online pen test, electronic equipment and storage medium
US8233693B1 (en) Automatic print and negative verification methods and apparatus
WO2021171444A1 (en) Teaching data generation device, teaching data generation method, recording device, and recording method
Payne et al. “OpenVPCal”: An Open Source In-Camera Visual Effects Calibration Framework

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23852592

Country of ref document: EP

Kind code of ref document: A1