WO2024034630A1 - Image processing device, model generation device, skin state measurement system, image processing method, model generation method, determining method, and teacher data creation method - Google Patents

Image processing device, model generation device, skin state measurement system, image processing method, model generation method, determining method, and teacher data creation method

Info

Publication number
WO2024034630A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
captured image
learning
subject
correct
Prior art date
Application number
PCT/JP2023/029036
Other languages
French (fr)
Japanese (ja)
Inventor
草太 渡部
Original Assignee
ライオン株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ライオン株式会社
Publication of WO2024034630A1

Classifications

    • A - HUMAN NECESSITIES
    • A61 - MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B - DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B 10/00 - Other methods or instruments for diagnosis, e.g. instruments for taking a cell sample, for biopsy, for vaccination diagnosis; Sex determination; Ovulation-period determination; Throat striking implements
    • A - HUMAN NECESSITIES
    • A61 - MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B - DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B 5/00 - Measuring for diagnostic purposes; Identification of persons
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis

Definitions

  • The present invention relates to an image processing device, a model generation device, a skin condition measurement system, an image processing method, a model generation method, a determination method, and a teacher data creation method.
  • Conventionally, the presence of Propionibacterium acnes and the condition of the skin have been investigated by detecting porphyrins on the skin, which are produced as metabolites by Propionibacterium acnes, the bacterium that causes acne.
  • As a method for detecting porphyrins on the skin, for example, the porphyrin detection method described in Patent Document 1 is known.
  • In this method, the skin surface is irradiated with weak ultraviolet light including visible light, and the skin surface is imaged by a CCD color camera equipped with a CCD (charge-coupled device) cooling mechanism.
  • A color image signal of the skin surface output from the CCD color camera is displayed as a still image on a television monitor, and the still image is observed.
  • The present invention has been made in consideration of these circumstances, and its purpose is to provide an image processing device, a model generation device, a skin condition measurement system, an image processing method, and a model generation method that can detect porphyrins on the skin of a subject without irradiating the subject with ultraviolet rays.
  • One aspect of the present invention is an image processing device comprising: a subject captured image acquisition unit that acquires a subject captured image in which the skin of a subject irradiated with visible light is captured; and an output image acquisition unit that uses a trained model to acquire, from the subject captured image, an output image that is an image of the subject's skin in which fluorescence caused by porphyrins has occurred.
  • The trained model is a model machine-learned, using a learning captured image in which the skin of a person irradiated with visible light is captured and a correct captured image in which the skin of a person in which fluorescence caused by porphyrins has occurred due to irradiation with ultraviolet rays is captured, to generate an image of human skin in which fluorescence caused by porphyrins has occurred from a captured image of human skin irradiated with visible light.
  • The machine learning of the trained model uses a compressed learning captured image obtained by compressing a part of the learning captured image and a compressed correct captured image obtained by compressing a part of the correct captured image.
  • One aspect of the present invention is the image processing device described above, wherein the skin is facial skin, and the portion is around the forehead, nose, or mouth.
  • One aspect of the present invention is the image processing apparatus described above, in which the machine learning of the learned model uses an emphasized correct captured image in which a red component is emphasized with respect to the correct captured image.
  • One aspect of the present invention is the image processing device described above, in which the correct captured image is an RGB image composed of an R image, a G image, and a B image, and the emphasized correct captured image is an RGB image generated by adding, to the RGB image, a difference image obtained by subtracting the G image from the R image.
  • One aspect of the present invention is the image processing device described above, in which the learned model is composed of a layer of an encoder portion and a layer of a decoder portion.
  • One aspect of the present invention is the image processing device described above, in which the learned model has a U-Net structure in which the layers of the encoder portion and the layers of the decoder portion have a symmetrical structure and are connected by skip connections.
  • One aspect of the present invention is a model generation device comprising: a learning captured image acquisition unit that acquires a learning captured image in which the skin of a person irradiated with visible light is captured; a correct captured image acquisition unit that acquires a correct captured image in which the skin of a person in which fluorescence caused by porphyrins has occurred due to irradiation with ultraviolet rays is captured; and a model generation unit that, using the learning captured image and the correct captured image, performs machine learning on a model to generate an image of human skin in which fluorescence caused by porphyrins has occurred from a captured image of human skin irradiated with visible light, and generates a learned model.
  • One aspect of the present invention is the model generation device described above, in which the model generation unit generates a compressed learning captured image obtained by compressing a portion of the learning captured image and a compressed correct captured image obtained by compressing a portion of the correct captured image, and uses the compressed learning captured image and the compressed correct captured image for machine learning of the model.
  • One aspect of the present invention is the model generating device described above, wherein the skin is facial skin, and the portion is around the forehead, nose, or mouth.
  • One aspect of the present invention is the model generation device described above, in which the model generation unit generates an emphasized correct captured image in which a red component is emphasized with respect to the correct captured image, and uses the emphasized correct captured image for machine learning of the model.
  • One aspect of the present invention is the model generation device described above, in which the correct captured image is an RGB image composed of an R image, a G image, and a B image, and the emphasized correct captured image is an RGB image generated by adding, to the RGB image, a difference image obtained by subtracting the G image from the R image.
  • One aspect of the present invention is the model generation device described above, in which the model is composed of a layer of an encoder portion and a layer of a decoder portion.
  • One aspect of the present invention is the model generation device described above, wherein the model has a U-Net structure in which the layers of the encoder portion and the layers of the decoder portion have a symmetrical structure and are connected by skip connections.
  • One aspect of the present invention is a skin condition measurement system that includes the above-described image processing device and the above-described model generation device.
  • One aspect of the present invention is an image processing method including: a step of acquiring a subject captured image in which the skin of a subject irradiated with visible light is captured; and a step of using a trained model to acquire, from the subject captured image, an output image that is an image of the subject's skin in which fluorescence caused by porphyrins has occurred.
  • The trained model is a model machine-learned, using a learning captured image in which the skin of a person irradiated with visible light is captured and a correct captured image in which the skin of a person in which fluorescence caused by porphyrins has occurred due to irradiation with ultraviolet rays is captured, to generate an image of human skin in which fluorescence caused by porphyrins has occurred from a captured image of human skin irradiated with visible light.
  • One aspect of the present invention is the image processing method described above, in which the machine learning of the trained model uses a compressed learning captured image obtained by compressing a part of the learning captured image and a compressed correct captured image obtained by compressing a part of the correct captured image.
  • One aspect of the present invention is the image processing method described above, wherein the skin is facial skin, and the portion is around the forehead, nose, or mouth.
  • One aspect of the present invention is the image processing method described above, in which the machine learning of the learned model uses an emphasized correct captured image in which a red component is emphasized with respect to the correct captured image.
  • One aspect of the present invention is the image processing method described above, in which the correct captured image is an RGB image composed of an R image, a G image, and a B image, and the emphasized correct captured image is an RGB image generated by adding, to the RGB image, a difference image obtained by subtracting the G image from the R image.
  • One aspect of the present invention is the image processing method described above, in which the learned model is composed of a layer of an encoder portion and a layer of a decoder portion.
  • One aspect of the present invention is the image processing method described above, wherein the learned model has a U-Net structure in which the layers of the encoder portion and the layers of the decoder portion have a symmetrical structure and are connected by skip connections.
  • One aspect of the present invention is a model generation method including: a step of acquiring a learning captured image in which the skin of a person irradiated with visible light is captured; a step of acquiring a correct captured image in which the skin of a person in which fluorescence caused by porphyrins has occurred due to irradiation with ultraviolet rays is captured; and a model generation step of, using the learning captured image and the correct captured image, performing machine learning on a model to generate an image of human skin in which fluorescence caused by porphyrins has occurred from a captured image of human skin irradiated with visible light, and generating a learned model.
  • One aspect of the present invention is the model generation method described above, in which the model generation step generates a compressed learning captured image obtained by compressing a portion of the learning captured image and a compressed correct captured image obtained by compressing a portion of the correct captured image, and uses the compressed learning captured image and the compressed correct captured image for machine learning of the model.
  • One aspect of the present invention is the model generation method described above, wherein the skin is facial skin, and the part is around the forehead, nose, or mouth.
  • One aspect of the present invention is the model generation method described above, in which the model generation step generates an emphasized correct captured image in which a red component is emphasized with respect to the correct captured image, and uses the emphasized correct captured image for machine learning of the model.
  • One aspect of the present invention is the model generation method described above, in which the correct captured image is an RGB image composed of an R image, a G image, and a B image, and the emphasized correct captured image is an RGB image generated by adding, to the RGB image, a difference image obtained by subtracting the G image from the R image.
  • One aspect of the present invention is the model generation method described above, in which the model is composed of an encoder portion layer and a decoder portion layer.
  • One aspect of the present invention is the model generation method described above, wherein the model has a U-Net structure in which the layers of the encoder portion and the layers of the decoder portion have a symmetrical structure and are connected by skip connections.
  • One aspect of the present invention is the image processing method described above, in which at least one projection reference point is set in the step of acquiring the learning captured image and the correct captured image.
  • One aspect of the present invention is a determination method for determining susceptibility to acne by comparing the output image acquired by the image processing device described above with a standard image, which is an image of the skin of a typical person who does not have acne.
  • One aspect of the present invention is a determination method for evaluating the effect of skin care before and after a subject performs skin care, including: a first acquisition step of acquiring the output image acquired by the image processing device described above based on a subject captured image taken before the subject performs skin care; and a second acquisition step of acquiring the output image acquired by the image processing device based on a subject captured image taken after the subject performs skin care, wherein the output image acquired in the first acquisition step and the output image acquired in the second acquisition step are compared to determine the degree of improvement due to skin care.
  • One aspect of the present invention is a teacher data creation method including: a learning captured image acquisition step of acquiring a learning captured image, which is a captured image of human skin irradiated with visible light in which a portion having a reference point is captured; a correct captured image acquisition step of acquiring a correct captured image, which is an image of the same subject as the learning captured image, captured by an imaging device different from the imaging device that captured the learning captured image, and in which fluorescence caused by porphyrins has occurred due to irradiation with ultraviolet rays; a first extraction step of extracting a partial region based on the reference point from the learning captured image; a second extraction step of extracting a partial region based on the reference point from the correct captured image; and a step of creating teacher data from the extracted images.
  • One aspect of the present invention is the teacher data creation method described above, further including an image processing step of generating an image matched to the learning captured image by performing image processing on the correct captured image, wherein the second extraction step extracts a partial region based on the reference point from the correct captured image that has been subjected to image processing in the image processing step.
  • FIG. 1 is a block diagram illustrating a configuration example of a skin condition measurement system according to an embodiment.
  • FIG. 2 is a configuration diagram showing a schematic configuration example of a model according to an embodiment.
  • FIG. 3 is a flowchart illustrating an example of a procedure of a model generation method according to an embodiment.
  • FIG. 4 is a diagram illustrating an example of cut-out portions of a learning captured image and a correct captured image according to an embodiment.
  • FIG. 5 is an example of an image for explaining preprocessing part 2 of a correct captured image according to an embodiment.
  • FIG. 6 is a flowchart illustrating an example of a procedure of an image processing method according to an embodiment.
  • FIG. 7 is an image diagram for explaining the difference in imaging between a learning captured image and a correct captured image according to an embodiment.
  • FIG. 8 is an image diagram of reference points used for creating teacher data according to an embodiment.
  • FIG. 9 is a diagram illustrating an example of preprocessing of a correct captured image and cutting out of a learning captured image and a correct captured image according to an embodiment.
  • FIG. 1 is a block diagram showing a configuration example of a skin condition measuring system according to an embodiment.
  • the skin condition measurement system shown in FIG. 1 includes an image processing device 10 and a model generation device 20.
  • the image processing device 10 and the model generation device 20 may exchange data online through communication, or may input and output data offline.
  • Subject U is a person who uses the skin condition measurement system and has his skin condition measured.
  • the subject U uses the camera of the smartphone 30 to capture an image of the skin of the area that he/she wants to have measured, and sends the captured image (subject captured image) A to the image processing device 10 using the smartphone 30.
  • For example, the subject U takes an image of his or her entire face, or of the area around the forehead, nose, and mouth where acne is a concern, using the camera of the smartphone 30, and sends the subject captured image A of the captured face to the image processing device 10 using the smartphone 30.
  • the subject image A is a captured image of the subject U's skin irradiated with visible light.
  • The light source that emits visible light includes wavelengths from 400 to 770 nm. Since it is not necessary to irradiate ultraviolet rays, the ultraviolet (UV) intensity of the light source is preferably as weak as possible in order to reduce damage to the skin of the subject U; for example, it is preferably below the detection limit of a UV measuring device.
  • Examples of light sources that emit visible light include natural light such as sunlight and moonlight, and artificial light such as incandescent light bulbs, fluorescent light bulbs, xenon lamps, and LED light bulbs.
  • Although the light source may contain ultraviolet rays, artificial light with low UV intensity is preferable, and among artificial lights, xenon lamps and LED bulbs are more preferable.
  • the subject U uses a smartphone 30 that is equipped with a camera (imaging device), but the present invention is not limited to this.
  • a mobile terminal device such as a tablet computer (tablet PC) equipped with a camera may be used.
  • a captured image captured by a digital camera may be transmitted to the image processing device 10 by a communication terminal such as a smartphone.
  • As a color correction method, for example, a color chart may be captured together with the subject U when photographing.
  • As a position correction method, for example, when photographing the subject U, an ArUco marker, a landmark sticker, or the like is captured in the image and used as a reference point, and projective transformation is performed using image editing software, as sketched below.
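  • As an illustration of the projective-transformation-based position correction just described, the following sketch (hypothetical; not part of the patent text) warps one face image onto another using four manually specified reference points with OpenCV. The point coordinates and file names are assumptions.

```python
# Hypothetical sketch: projective (homography) alignment of two face images
# using four reference points (e.g. landmark stickers or ArUco marker corners).
# Coordinates and file names are illustrative assumptions.
import cv2
import numpy as np

src = cv2.imread("captured_uv.jpg")       # image to be warped
dst = cv2.imread("captured_visible.jpg")  # reference image used as the target

# Pixel coordinates of the same 4 reference points in each image, same order.
src_pts = np.float32([[512, 430], [1480, 410], [1500, 1650], [530, 1680]])
dst_pts = np.float32([[498, 455], [1465, 440], [1490, 1688], [515, 1702]])

# Estimate the projective transformation and warp the source image so that
# its reference points land on the corresponding points of the target image.
H = cv2.getPerspectiveTransform(src_pts, dst_pts)
aligned = cv2.warpPerspective(src, H, (dst.shape[1], dst.shape[0]))

cv2.imwrite("captured_uv_aligned.jpg", aligned)
```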
  • Note that the image processing device 10 described later may be incorporated and integrated into the mobile terminal device. In this case, it is unnecessary to transmit the subject captured image A of the captured face to the image processing device 10 using the smartphone 30, and it is also unnecessary to transmit the output image B, which will be described later, from the image processing device 10 to the smartphone 30 of the subject U.
  • the image processing device 10 includes a model storage section 11 , a subject captured image acquisition section 12 , and an output image acquisition section 13 .
  • Each function of the image processing device 10 is realized by the image processing device 10 including computer hardware such as a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), and a memory, and by the CPU executing a computer program stored in the memory.
  • the image processing device 10 may be configured using a general-purpose computer device, or may be configured as a dedicated hardware device.
  • the image processing device 10 may be configured using a server computer connected to a communication network such as the Internet.
  • each function of the image processing device 10 may be realized by cloud computing.
  • the image processing device 10 may be realized by a single computer, or the functions of the image processing device 10 may be realized by distributing the functions to a plurality of computers.
  • the image processing apparatus 10 may be configured to open a website using, for example, a WWW system.
  • the model storage unit 11 stores the learned model MDa.
  • the learned model MDa is provided from the model generation device 20.
  • The learned model MDa is a model machine-learned, using a learning captured image in which the skin of a person irradiated with visible light is captured and a correct captured image in which the skin of a person in which fluorescence caused by porphyrins has occurred due to ultraviolet irradiation is captured, to generate an image of human skin in which fluorescence caused by porphyrins has occurred from a captured image of human skin irradiated with visible light.
  • the subject captured image acquisition unit 12 acquires the subject captured image A.
  • the subject captured image A is a captured image of the subject U's skin irradiated with visible light.
  • the subject captured image acquisition unit 12 receives the subject captured image A transmitted from the smartphone 30 of the subject U via the communication line.
  • the subject captured image acquisition unit 12 may compress the subject captured image A to a predetermined compression size and convert it into a subject compressed captured image A'. For example, when the subject captured image size is "3456 x 5184" pixels, it is compressed to "256 x 256" pixels.
  • Alternatively, a portion with an image size of "768 x 768" pixels may be cut out from the subject captured image A of "3456 x 5184" pixels, and the image of the cut-out portion of "768 x 768" pixels may be compressed to "256 x 256" pixels.
  • The portions to be cut out may cover the subject's entire face at equal intervals, and it is also preferable to cut out the area around the forehead, nose, or mouth, where porphyrin is likely to be present.
  • the "256x256" pixel image is the subject compressed captured image A'. Note that the image size values such as "3456 x 5184" pixels, "768 x 768" pixels, and "256 x 256" pixels described above are examples of image sizes, and are not limited thereto.
  • The output image acquisition unit 13 uses the learned model MDa to acquire, from the subject captured image A or the subject compressed captured image A', an output image B or an output image B', which is an image of the subject's skin in which fluorescence due to porphyrin has occurred.
  • the trained model MDa outputs an output image B when the captured image A of the subject is input.
  • Output image B is an image inferred by trained model MDa from subject image A of subject U, and is an image of subject U's skin in which fluorescence due to porphyrin has occurred.
  • the output image B is an image of the face of the subject U in which fluorescence caused by porphyrin is inferred from the subject captured image A of the face of the subject U by the learned model MDa.
  • the learned model MDa outputs an output image B' when the compressed captured image A' of the subject is input.
  • the output image B' is an image inferred by the learned model MDa from the subject compressed captured image A' of the subject U, and is an image of the subject U's skin in which fluorescence caused by porphyrin has occurred.
  • the output image B' is an image of the subject U's face in which fluorescence due to porphyrin has occurred, which is inferred by the trained model MDa from the subject compressed captured image A' of the subject's U's face. Since the output image B' is a compressed image, the output image acquisition unit 13 expands the output image B' to the same size as the captured image A of the subject and restores it to the output image B.
  • When a plurality of cut-out portions are used, the output image acquisition unit 13 restores each output image B' to an output image B and further joins the restored output images to obtain an output image B of the same size as the subject captured image A.
  • the output image acquisition unit 13 transmits the output image B to the smartphone 30 of the subject U via the communication line.
  • the subject U receives the output image B transmitted from the image processing device 10 with the smartphone 30 via the communication line.
  • the subject U can visually recognize the state of porphyrin on his or her skin using the output image B.
  • For example, the subject U transmits a subject captured image A of his or her own face, where acne is a concern, to the image processing device 10 using the smartphone 30, and in response receives an output image B from the image processing device 10 on the smartphone 30. By displaying the output image B on the display screen of the smartphone 30, the subject U can visually recognize the state of porphyrins on his or her own face where acne is a concern.
  • the model generation device 20 includes a learning captured image acquisition section 21, a correct captured image acquisition section 22, a model generation section 23, and a model output section 24.
  • Each function of the model generation device 20 is realized by the model generation device 20 including computer hardware such as a CPU, GPU, and memory, and the CPU executing a computer program stored in the memory.
  • the model generation device 20 may be configured using a general-purpose computer device, or may be configured as a dedicated hardware device.
  • the model generation device 20 may be configured using a server computer connected to a communication network such as the Internet.
  • each function of the model generation device 20 may be realized by cloud computing.
  • the model generation device 20 may be realized by a single computer, or the functions of the model generation device 20 may be realized by distributing it to a plurality of computers.
  • the learning captured image acquisition unit 21 acquires the learning captured image C.
  • the correct captured image acquisition unit 22 acquires the correct captured image D.
  • The learning captured image C and the correct captured image D are acquired from a person (selected subject) selected for acquiring learning data.
  • the learning captured image C is a captured image of the selected subject's skin irradiated with visible light.
  • the learning captured image C is a captured image in which the face of the selected subject is irradiated with visible light.
  • the correct captured image D is an image of the selected subject's skin in which fluorescence due to porphyrin has occurred due to irradiation with ultraviolet rays.
  • the correct captured image D is an image of the selected subject's face in which fluorescence due to porphyrin is generated due to irradiation with ultraviolet rays.
  • a light source that emits ultraviolet light includes a wavelength of 100 to 400 nm.
  • Examples of the light source that irradiates ultraviolet rays include a mercury lamp, an ultraviolet LED lamp, and a black light.
  • the learning captured image C and the correct captured image D have the same angle of view and position.
  • An example of an imaging device for acquiring the learning captured image C and the correct captured image D having the same angle of view and position is "VISIA (registered trademark), manufactured by Canfield Scientific.”
  • a smartphone may be used instead of the VISIA as the imaging device for acquiring the captured learning image C.
  • In that case, it is recommended to specify at least one reference point, preferably 4 to 8, at the same locations within the captured images.
  • The model generation unit 23 uses the learning captured image C and the correct captured image D to perform machine learning on the model MD so that it generates an image of human skin in which fluorescence due to porphyrin has occurred from a captured image of human skin irradiated with visible light, and generates a learned model MDa.
  • the trained model MDa outputs an output image B or an output image B' when the subject captured image A or the subject compressed captured image A' is input.
  • the model output unit 24 outputs the trained model MDa generated by the model generation unit 23. Examples of methods for outputting the trained model MDa include writing to a computer-readable recording medium and transmitting data via communication.
  • the learned model MDa output by the model output unit 24 is provided to the image processing device 10.
  • the model storage unit 11 stores the provided trained model MDa.
  • image processing device 10 and the model generation device 20 may be realized using the same information processing device, or may be realized using different information processing devices.
  • FIG. 2 is a model configuration diagram showing a schematic configuration example of the model MD according to the present embodiment.
  • the model MD shown in FIG. 2 has a U-Net structure in which the encoder 110 layer and the decoder 120 layer have a symmetrical structure, and each layer is connected to each other by a skip connection 130.
  • the encoder 110 encodes the input image Pin in stages according to each layer.
  • the decoder 120 decodes the encoding result of the encoder 110 in stages according to each layer.
  • the decoding result of the decoder 120 is output from the model MD as an output image Pout.
  • Each layer of the encoder 110 and each corresponding layer of the decoder 120 are connected to each other by a skip connection 130.
  • model MD may be a model that does not include the skip connection 130 in FIG. 2.
  • The model MD may also be, for example, a model based on a GAN (Generative Adversarial Networks).
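  • The encoder/decoder structure with skip connections described above can be sketched as follows (PyTorch, hypothetical; the number of layers and channel widths are assumptions and do not reflect the actual model MD).

```python
# Minimal U-Net-style encoder/decoder with skip connections (hypothetical
# sketch; layer counts and channel widths are assumptions, not the actual MD).
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
    )

class TinyUNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc1 = conv_block(3, 32)                  # encoder layers
        self.enc2 = conv_block(32, 64)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = conv_block(64, 128)
        self.up2 = nn.ConvTranspose2d(128, 64, 2, stride=2)
        self.dec2 = conv_block(128, 64)                # 64 (upsampled) + 64 (skip)
        self.up1 = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec1 = conv_block(64, 32)                 # 32 (upsampled) + 32 (skip)
        self.out = nn.Conv2d(32, 3, 1)                 # RGB output image Pout

    def forward(self, x):                              # x: input image Pin, (B, 3, 256, 256)
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        b = self.bottleneck(self.pool(e2))
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))   # skip connection
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))  # skip connection
        return torch.sigmoid(self.out(d1))
```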
  • FIG. 3 is a flowchart illustrating an example of the procedure (model learning step S100) of the model generation method according to the present embodiment.
  • The model learning step S100 is a step executed by the model generation device 20, in which machine learning is performed on the model MD using the learning captured image C and the correct captured image D to generate a learned model MDa.
  • the learning captured image acquisition unit 21 acquires learning captured images C for a plurality of selected subjects.
  • the correct captured image acquisition unit 22 acquires the correct captured images D for the plurality of selected subjects.
  • the learning captured image C and the correct captured image D are RGB images of the selected subject's face captured in color.
  • the RGB image is composed of a red component image (R image), a green component image (G image), and a blue component image (B image).
  • the model generation device 20 associates and holds the learning captured image C and the correct captured image D of the same selected subject. In the learning captured image C and the correct captured image D of the same selected subject, the positions of the faces appearing in the captured images are aligned.
  • To align the positions of the faces, position correction is performed.
  • Examples of the position correction method include the methods described above; this will be explained in detail below.
  • For example, projective transformation is performed to align an image captured by a smartphone under visible light irradiation with an image captured under ultraviolet irradiation in which fluorescence caused by porphyrins has occurred.
  • The specific procedure is to specify at least one reference point, preferably 4 to 8, at the same locations in the two images, and then realize alignment by matching the images using these reference points.
  • Examples of the reference points specified above include a mole, a pseudo-mole drawn with a marker, and an attached sticker.
  • For example, a sticker with a very small hole made in advance is pasted on the cheek area surrounding these areas.
  • The pixel value correlation coefficient when a mark was made by making a hole in the sticker with a needle (0.5 mm thick) was "1.23", while the pixel value correlation coefficient when a mark was made by making a hole in the sticker with an ultra-fine needle (0.4 mm thick) was "3.20".
  • Step S102 The model generation unit 23 performs preprocessing on the learning captured image C and the correct captured image D.
  • preprocessing for the learning captured image C and the correct captured image D will be explained.
  • Preprocessing part 1: In preprocessing part 1, the model generation unit 23 generates, as images used for machine learning of the model MD, a compressed learning captured image obtained by compressing a portion of the learning captured image C and a compressed correct captured image obtained by compressing a portion of the correct captured image D.
  • FIG. 4 shows portions (cutout portions) cut out from each of the learning captured image C and the correct captured image D.
  • portions of the face are cut out from around the forehead, nose, and mouth. This is because the areas around the forehead, nose, and mouth are areas where acne is particularly likely to occur, that is, areas where acne-causing bacteria are likely to occur and a large amount of porphyrin is likely to be detected.
  • The cutout portion may be cut out from at least the forehead, nose, or mouth area of the face. Further, the cutout portion may be cut out from other parts of the face (for example, the cheeks) other than the forehead, nose, and area around the mouth, and the learning captured image C and the correct captured image D may each be cut out at equal intervals over the entire face of the selected subject.
  • the model generation unit 23 compresses the cut-out portion of the image to a predetermined compression size. For example, if the image size of the original learning captured image C and correct captured image D is "3456 x 5184" pixels, the image size of the cutout part is set to “768 x 768" pixels, and the "768 x 768" pixels are The image of the cutout part is compressed to "256 x 256" pixels. The "256x256" pixel image obtained by compressing the image of the "768x768" pixel cutout portion cut out from the original learning captured image C is the compressed learning captured image.
  • the "256x256" pixel image obtained by compressing the image of the "768x768" pixel cutout portion cut out from the original correct captured image D is the compressed correct captured image.
  • For the image size of the cutout portions of the learning captured image C and the correct captured image D, "768 x 768" pixels is preferred in order to minimize blurring of the output image caused by differences in the position of the face due to movement of the same selected subject between the learning captured image C and the correct captured image D.
  • the image size values such as "3456 x 5184" pixels, "768 x 768" pixels, and "256 x 256" pixels described above are examples of image sizes, and are not limited thereto.
  • Preprocessing part 2: In preprocessing part 2, the model generation unit 23 generates an emphasized correct captured image in which the red component of the correct captured image D is emphasized.
  • preprocessing part 2 is performed on the compressed correct captured image generated in preprocessing part 1 described above.
  • The emphasized correct captured image is an RGB image generated by adding, to the RGB image of the correct captured image D, a difference image obtained by subtracting the G image from the R image among the R image, G image, and B image constituting the RGB image.
  • FIG. 5 is an example of an image for explaining the second preprocessing of the correct captured image D according to the present embodiment.
  • FIG. 5(1) is a compressed correct image (RGB image) generated in preprocessing part 1.
  • FIGS. 5(2), (3), and (4) are R, G, and B images that constitute the compressed correct captured image (RGB image).
  • The model generation unit 23 adds the difference image (R image minus G image) to the RGB image in FIG. 5(1) to generate the RGB image (emphasized correct captured image) shown in FIG. 5.
  • In the emphasized correct captured image, the red component is more emphasized than in the original RGB image (compressed correct captured image) in FIG. 5(1).
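  • One plausible reading of preprocessing part 2 is sketched below (hypothetical): the (R - G) difference image is computed and added back to the RGB image. The text leaves open whether the difference is added to every channel or only to the R channel, so the example adds it to all channels as one interpretation.

```python
# Hypothetical sketch of preprocessing part 2 (red-component emphasis): add
# the (R - G) difference image back to the RGB image. Adding the difference
# to every channel is one interpretation; adding it only to the R channel
# would be another.
import cv2
import numpy as np

bgr = cv2.imread("compressed_correct_captured_image.png").astype(np.float32)
rgb = bgr[:, :, ::-1]                  # OpenCV loads BGR, convert to RGB

r, g = rgb[:, :, 0], rgb[:, :, 1]
diff = np.clip(r - g, 0, None)         # difference image (R image minus G image)

emphasized = np.clip(rgb + diff[:, :, None], 0, 255).astype(np.uint8)
cv2.imwrite("emphasized_correct_captured_image.png", emphasized[:, :, ::-1])
```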
  • the model generation unit 23 uses the compressed learning captured image and the emphasized correct captured image for machine learning of the model MD.
  • In this embodiment, the compressed learning captured image and the emphasized correct captured image are used for machine learning of the model MD; however, preprocessing part 2 may be omitted, and the compressed learning captured image and the compressed correct captured image may be used for machine learning of the model MD.
  • Step S103 The model generation unit 23 inputs the compressed learning captured image C' to the model MD as the input image Pin (see FIG. 2).
  • When the compressed learning captured image (input image Pin) is input, the model MD generates and outputs an output image Pout (see FIG. 2).
  • Step S104 The model generation unit 23 compares the output image Pout output from the model MD with the emphasized correct captured image D', and performs feedback control on the model MD so that the difference value of the comparison result becomes smaller.
  • The emphasized correct captured image D' is the emphasized correct captured image generated from the correct captured image D of the same selected subject, corresponding to the compressed learning captured image C' used as the input image Pin.
  • As the difference value between the output image Pout and the emphasized correct captured image D', for example, the following MSE (mean squared error) is used: $\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}(x_i - y_i)^2$.
  • Here, N is the number of pixels, i is a pixel number, $x_i$ is the i-th pixel value of the emphasized correct captured image D', and $y_i$ is the i-th pixel value of the output image Pout.
  • The model generation unit 23 repeats the machine learning of the model MD using the learning captured images C (compressed learning captured images C') and the correct captured images D (emphasized correct captured images D') acquired from a plurality of selected subjects until a predetermined termination condition is satisfied. The predetermined termination condition may be a predetermined number of iterations, or may be that the difference value between the output image Pout and the emphasized correct captured image D' becomes less than a predetermined value.
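  • The feedback control of steps S103 to S104 can be pictured as an ordinary supervised training loop that minimizes the MSE between the output image Pout and the emphasized correct captured image D'. The following PyTorch sketch is hypothetical; the dummy data, optimizer settings, and iteration count are assumptions, and TinyUNet refers to the earlier sketch.

```python
# Hypothetical training-loop sketch for steps S103-S104: minimize the MSE
# between the model output Pout and the emphasized correct captured image D'.
# The dummy tensors, optimizer settings, and epoch count are assumptions.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Dummy pairs standing in for (compressed learning image C', emphasized correct image D').
c_prime = torch.rand(8, 3, 256, 256)
d_prime = torch.rand(8, 3, 256, 256)
train_loader = DataLoader(TensorDataset(c_prime, d_prime), batch_size=4)

model = TinyUNet()                        # stand-in for the model MD (see sketch above)
criterion = nn.MSELoss()                  # difference value (MSE)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

for epoch in range(100):                  # predetermined number of iterations
    for pin, target in train_loader:      # input image Pin and target D'
        pout = model(pin)                 # output image Pout
        loss = criterion(pout, target)    # compare Pout with D'
        optimizer.zero_grad()
        loss.backward()                   # feedback so the difference becomes smaller
        optimizer.step()
```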
  • the model generation unit 23 passes the model MD for which machine learning has been completed to the model output unit 24 as a learned model MDa.
  • the model output unit 24 outputs the learned model MDa.
  • It was confirmed through experiments that performing the above-described preprocessing part 1 improves the reproducibility of the locations of fluorescence caused by porphyrin in the output image Pout generated by the learned model MDa. Specifically, with 9 selected subjects, a portion with an image size of "768 x 768" pixels was cut out from the subject captured image A of "3456 x 5184" pixels and compressed to "256 x 256" pixels to obtain the subject compressed captured image A'. When preprocessing part 1 was not performed, the pixel value correlation coefficient between the output image and the correct captured image was "0.22".
  • In contrast, when the "768 x 768" pixel portions were cut out from the "3456 x 5184" pixel learning captured image C and correct captured image D and compressed to create the "256 x 256" pixel compressed learning captured image and compressed correct captured image (that is, when preprocessing part 1 was performed), the pixel value correlation coefficient between the output image and the correct captured image was "0.35".
  • It was also confirmed through experiments that performing the above-described preprocessing part 2 further improves the reproducibility of the locations of fluorescence caused by porphyrin in the output image Pout generated by the trained model MDa. Specifically, using the subject compressed captured image A' obtained by cutting out a "768 x 768" pixel portion from the "3456 x 5184" pixel subject captured image A and compressing it to "256 x 256" pixels, the pixel value correlation coefficient between the output image and the correct captured image was "0.35" when preprocessing part 1 was performed but preprocessing part 2 was not, whereas it was "0.61" when both preprocessing part 1 and preprocessing part 2 were performed.
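  • The pixel value correlation coefficient reported in these experiments can be understood as a correlation computed over corresponding pixels of the output image and the correct captured image; a minimal sketch (hypothetical, file names assumed) is shown below.

```python
# Hypothetical sketch: Pearson correlation coefficient between corresponding
# pixel values of an output image and a correct captured image.
import cv2
import numpy as np

out = cv2.imread("output_image_B.png").astype(np.float32).ravel()
ref = cv2.imread("correct_captured_image_D.png").astype(np.float32).ravel()

corr = np.corrcoef(out, ref)[0, 1]
print(f"pixel value correlation coefficient: {corr:.2f}")
```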
  • FIG. 6 is a flowchart illustrating an example of the procedure (skin condition measurement step S200) of the image processing method according to the present embodiment.
  • the skin condition measurement step S200 is a step executed by the image processing device 10, and outputs from the subject captured image A of the subject U or the subject compressed captured image A' converted from the subject captured image A using the learned model MDa. This is the stage of generating image B.
  • the model storage unit 11 of the image processing device 10 stores the trained model MDa generated by the model generation device 20 in the above-described model learning step S100.
  • the subject U images the part of the skin (for example, the face) that the subject U wants to have measured using, for example, the camera of the smartphone 30, and transmits the captured subject image A to the image processing device 10 using the smartphone 30.
  • the subject captured image acquisition unit 12 receives the subject captured image A transmitted from the smartphone 30 of the subject U via a communication line or the like.
  • the subject captured image acquisition unit 12 may compress the subject captured image A to a predetermined compression size and convert it into a subject compressed captured image A'.
  • the image may be captured so that color correction and position correction can be performed. Specific methods for color correction and position correction include the methods described above.
  • Step S202 The output image acquisition unit 13 inputs the subject captured image A or the subject compressed captured image A' to the learned model MDa as the input image Pin (see FIG. 2).
  • the trained model MDa receives the subject captured image A or the subject compressed captured image A' (input image Pin), it generates and outputs an output image Pout (see FIG. 2).
  • the output image acquisition unit 13 acquires the output image Pout output from the trained model MDa as an output image B or an output image B'.
  • the output image B or the output image B' is an image of the skin of the subject U in which fluorescence due to porphyrin has occurred, which is inferred by the learned model MDa from the subject image A or the subject compressed image A' of the subject U. .
  • For example, the output image B or the output image B' is an image of the face of the subject U in which fluorescence caused by porphyrin has occurred, which is inferred by the learned model MDa from the subject captured image A or the subject compressed captured image A' of the subject U's face.
  • the output image acquisition unit 13 expands the output image B' to the same size as the captured image A of the subject and restores it to the output image B.
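  • Steps S202 and onward can be pictured as follows (hypothetical sketch): the subject compressed captured image A' is passed through the learned model MDa, and the resulting output image B' is expanded back to the size of the subject captured image A. The weight file, file names, and TinyUNet stand-in are assumptions.

```python
# Hypothetical inference sketch: feed the subject compressed captured image A'
# to the learned model MDa and expand the output B' back to the size of the
# subject captured image A. File names and the TinyUNet stand-in are assumptions.
import cv2
import numpy as np
import torch

model = TinyUNet()                                   # stand-in for the learned model MDa
model.load_state_dict(torch.load("mda.pt"))          # assumed weight file
model.eval()

a = cv2.imread("subject_captured_image_A.jpg")       # original size, e.g. 3456 x 5184 pixels
a_prime = cv2.resize(a, (256, 256), interpolation=cv2.INTER_AREA)

x = torch.from_numpy(a_prime[:, :, ::-1].copy()).permute(2, 0, 1).float().unsqueeze(0) / 255.0
with torch.no_grad():
    b_prime = model(x)[0].permute(1, 2, 0).numpy()   # output image B' (256 x 256)

# Expand B' to the same size as A to restore the output image B.
b = cv2.resize((b_prime * 255).astype(np.uint8), (a.shape[1], a.shape[0]))
cv2.imwrite("output_image_B.png", b[:, :, ::-1])
```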
  • the output image acquisition unit 13 transmits the output image B to the smartphone 30 of the subject U via the communication line.
  • Subject U receives output image B transmitted from image processing device 10 with smartphone 30 via a communication line.
  • the subject U can visually recognize the state of porphyrin on his or her skin (for example, face) using the output image B.
  • When the image processing device 10 is integrated into the mobile terminal device, transmitting the subject captured image A of the captured face to the image processing device 10 using the smartphone 30 and transmitting the output image B from the image processing device 10 to the smartphone 30 of the subject U are unnecessary.
  • the learning captured image C is a captured image of the selected subject's skin irradiated with visible light, and therefore can be easily captured using the smartphone 30.
  • On the other hand, the correct captured image D cannot be easily captured using the smartphone 30 because it is an image of the selected subject's face in which fluorescence due to porphyrin has occurred due to irradiation with ultraviolet rays. Therefore, it is conceivable that the learning captured image C is an image captured using the smartphone 30, that the correct captured image D is an image captured using a device such as "VISIA (registered trademark), manufactured by Canfield Scientific," and that these images are aligned and used as training data.
  • FIG. 7 is an image diagram for explaining the difference in imaging between the learning captured image and the correct captured image according to one embodiment.
  • An image acquisition method in a modified example of the teacher data creation method will be described with reference to the same figure.
  • FIG. 7(A) shows an example of a method for acquiring the correct captured image D, in which imaging is performed using a device such as "VISIA (registered trademark), manufactured by Canfield Scientific." As shown in the figure, the subject's face is fixed on the device, so camera shake and the like do not occur. If the learning captured image C is also captured at the same time, the position of the subject's face does not change, so alignment is not required (and even if not captured simultaneously, the amount of misalignment will be small). Furthermore, as shown in FIG. 7(A), when imaging is performed using a device such as "VISIA (registered trademark), manufactured by Canfield Scientific," an imaging device with higher image quality than the smartphone 30 is used.
  • FIG. 7(B) is an example of a method for acquiring the learning captured image C, and is an example where the smartphone 30 is used to capture the image.
  • the subject's face is not fixed and camera shake may occur.
  • the image quality of the imaging device itself is lower than that of devices such as “VISIA (registered trademark), manufactured by Canfield Scientific.”
  • For this reason, highly accurate alignment is performed in this embodiment. Note that even when using the smartphone 30, it is possible to fix the face and capture an image while suppressing the occurrence of camera shake.
  • imaging may be performed in a situation similar to that in which the image is captured in the inference stage.
  • the situation that is imaged in the inference stage may be one in which the user stretches out his hand and takes an image of himself (so-called self-portrait) using the in-camera of the smartphone 30.
  • FIG. 8 is an image diagram of reference points used for creating teacher data according to one embodiment.
  • Four reference points are attached to the subject's face.
  • the reference point may be something like a sticker.
  • A mark may be made on the sticker using a marker or the like, or a hole may be made in the sticker for position identification.
  • FIG. 9 is a diagram illustrating an example of preprocessing the correct captured image and cutting out the learning captured image and the correct captured image according to an embodiment.
  • an example of preprocessing of the correct captured image D and cutting out of the learning captured image C and the correct captured image D according to a modified example of the teacher data creation method will be described.
  • FIG. 9(A) is an example of the correct captured image D. Since the correct captured image D is captured with the face fixed, the image is shown as viewed from the front.
  • FIG. 9(B) is an example of a captured image C for learning. The learning captured image C is captured using the smartphone 30 without fixing the face, so it may not be viewed from the front. In this embodiment, these images are aligned.
  • preprocessing is performed on the correct captured image D.
  • a conventional technique such as projective transformation is used to transform the correct captured image D in accordance with the learning captured image C.
  • the preprocessing may be performed by image processing.
  • FIG. 9C shows the result of preprocessing the correct captured image D.
  • the learning captured image C is not converted in accordance with the correct captured image D, but the correct captured image D is converted in accordance with the learning captured image C. This is an attempt to perform learning using images similar to those used in the inference stage by performing transformations that match the images used in the inference stage.
  • teacher data is created by extracting (cutting out) a partial region of the image from each of the image shown in FIG. 9(B) and the image shown in FIG. 9(C).
  • Reference points as shown in FIG. 8 are used to extract the image. Note that the reference point does not need to be intentionally added for the purpose of creating the teacher data, and may be a characteristic point of the human body, such as a mole or the corner of the mouth, for example.
  • In this modified example, a learning captured image C is acquired, which is a captured image of human skin irradiated with visible light in which a portion having a reference point is captured.
  • In addition, a correct captured image D is acquired, which is an image of the same subject as the learning captured image C, captured by an imaging device different from the imaging device that captured the learning captured image C, and in which fluorescence due to porphyrin has occurred due to irradiation with ultraviolet rays.
  • the teacher data is created by storing images extracted from each of the learning captured image C and the correct captured image D in association with each other.
  • In the modified example of the teacher data creation method, the correct captured image D (rather than the learning captured image C) is subjected to image processing to generate an image matched to the learning captured image C.
  • This image processing step corresponds to the preprocessing described above.
  • In the second extraction step, a partial region based on the reference point is extracted from the correct captured image D that has been subjected to image processing in the image processing step, as sketched below.
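  • The extraction steps of this modified teacher data creation method can be pictured with the hypothetical sketch below: after the correct captured image D has been warped to match the learning captured image C, the same region around a reference point is cut out of both images and stored as one training pair. Coordinates and file names are assumptions.

```python
# Hypothetical sketch of the first/second extraction steps: cut out the same
# region around a reference point from the learning captured image C and the
# already-aligned correct captured image D, and store the pair as teacher data.
import cv2

learning_c = cv2.imread("learning_captured_image_C.jpg")
correct_d_aligned = cv2.imread("correct_captured_image_D_aligned.jpg")  # after the image processing step

ref_x, ref_y = 1520, 2310       # assumed pixel position of one reference point
half = 384                      # half of the 768 x 768 pixel cut-out size

patch_c = learning_c[ref_y - half:ref_y + half, ref_x - half:ref_x + half]
patch_d = correct_d_aligned[ref_y - half:ref_y + half, ref_x - half:ref_x + half]

# Store the associated pair (learning patch, correct patch) as one teacher-data sample.
cv2.imwrite("teacher_pair_0001_input.png", patch_c)
cv2.imwrite("teacher_pair_0001_target.png", patch_d)
```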
  • It is preferable that the visible light used when capturing the learning captured image C and the visible light used when capturing the subject captured image A have the same wavelength and intensity.
  • It was confirmed through experiments that when blue light (e.g., a wavelength of 380 to 550 nm) is used as the visible light, the reproducibility of the locations of fluorescence caused by porphyrin in the output image Pout generated by the trained model MDa improves compared to when white light is used.
  • Specifically, using the subject compressed captured image A' obtained by cutting out a "768 x 768" pixel portion from the "3456 x 5184" pixel subject captured image A and compressing it to "256 x 256" pixels, the pixel value correlation coefficient between the output image and the correct captured image was "0.61" when white light (400 to 770 nm) was used, whereas the pixel value correlation coefficient was "0.65" when blue light (380 to 550 nm) was used.
  • Examples of ways to use blue light include covering artificial light sources such as incandescent light bulbs, fluorescent lights, and LED light bulbs with a blue film, irradiating with LED light bulbs that emit only blue light, or displaying the entire screen of a smartphone or monitor in blue and using it as the light source.
  • As described above, the image processing device 10 according to the present embodiment uses the trained model MDa, which has been machine-learned using a learning captured image C in which the skin of a person irradiated with visible light is captured and a correct captured image D in which the skin of a person in which fluorescence caused by porphyrins has occurred due to irradiation with ultraviolet rays is captured, to generate an image of human skin in which fluorescence due to porphyrin has occurred from a captured image of human skin irradiated with visible light.
  • Using this trained model MDa, the image processing device 10 acquires, from the subject captured image A in which the skin of the subject U irradiated with visible light is captured, an output image B that is an image of the subject's skin in which fluorescence due to porphyrin has occurred. This makes it possible to detect porphyrins on the skin of the subject U without irradiating the subject U with ultraviolet rays.
  • The subject captured image A may be converted into a subject compressed captured image A', which is obtained by compressing a part of the subject captured image A. This improves the reproducibility of the fluorescent locations caused by porphyrin in the output image B generated by the learned model MDa.
  • the part of the face may be around the forehead, nose, or mouth.
  • An output image B' which is an image of the human skin in which fluorescence due to porphyrin has occurred, is obtained from the subject compressed captured image A'.
  • the output image B' is a compressed image, and is expanded to the same size as the captured image A of the subject and restored to the output image B, thereby obtaining the output image B.
  • a compressed learning captured image obtained by compressing a portion of the learning captured image C and a compressed correct captured image obtained by compressing a portion of the correct captured image D may be used. This improves the reproducibility of the fluorescent location caused by porphyrin in the output image B generated by the trained model MDa.
  • the part of the face may be around the forehead, nose, or mouth.
  • an emphasized correct captured image in which the red component is emphasized with respect to the correct captured image D may be used. This improves the reproducibility of the fluorescent location caused by porphyrin in the output image B generated by the learned model MDa.
  • The correct captured image D is an RGB image composed of an R image, a G image, and a B image, and the emphasized correct captured image may be an RGB image generated by adding, to the RGB image, a difference image obtained by subtracting the G image from the R image.
  • the trained model MDa may be composed of a layer of an encoder portion and a layer of a decoder portion.
  • the learned model MDa may have a U-Net structure in which the encoder portion layer and the decoder portion layer have a symmetrical structure and are connected by a skip connection.
  • The model generation device 20 according to the present embodiment uses a learning captured image C in which the skin of a person irradiated with visible light is captured and a correct captured image D in which the skin of a person in which fluorescence caused by porphyrins has occurred due to irradiation with ultraviolet rays is captured to perform machine learning on the model MD so that it generates an image of human skin in which fluorescence due to porphyrin has occurred from a captured image of human skin irradiated with visible light, and generates a learned model MDa. By using this learned model MDa, it is possible to detect porphyrins on the skin of the subject U without irradiating the subject U with ultraviolet rays.
  • The model generation device 20 may generate a compressed learning captured image in which a portion of the learning captured image C is compressed and a compressed correct captured image in which a portion of the correct captured image D is compressed, and use the compressed learning captured image and the compressed correct captured image for machine learning of the model MD. This improves the reproducibility of the fluorescent locations caused by porphyrin in the output image B generated by the trained model MDa.
  • the part of the face may be around the forehead, nose, or mouth.
  • the model generation device 20 may generate an emphasized correct captured image in which the red component is emphasized with respect to the correct captured image D, and use the emphasized correct captured image for machine learning of the model MD. This improves the reproducibility of the fluorescent location caused by porphyrin in the output image B generated by the trained model MDa.
  • the correct captured image D is an RGB image composed of an R image, a G image, and a B image.
  • the emphasized correct captured image may be an RGB image generated by adding, to the RGB image, a difference image obtained by subtracting the G image from the R image.
  • model MD may be composed of an encoder portion layer and a decoder portion layer. Further, the model MD may have a U-Net structure in which the encoder portion layer and the decoder portion layer have a symmetrical structure and are connected by a skip connection.
  • by comparing the output image acquired by the image processing device 10 with a standard image, the susceptibility to acne can be determined.
  • the standard image may be an image of a typical person without acne.
  • a predetermined suitable image may be prepared depending on, for example, the age, gender, nationality, etc. of the person to be determined (subject U).
  • the standard image does not have to be an actually captured image, but may be an image generated by image processing.
  • in a first acquisition step, an output image output by the image processing device 10 is acquired based on the subject captured image A taken before the subject performs skin care.
  • in a second acquisition step, an output image output by the image processing device 10 is acquired based on the subject captured image A taken after the subject has performed skin care. The output image acquired in the first acquisition step and the output image acquired in the second acquisition step are then compared to determine the effect of the skin care. The determination may be made by comparing the output images before and after skin care with each other, or by comparing each of them with a standard image.
  • a computer program for realizing the functions of each device described above may be recorded on a computer-readable recording medium, and the program recorded on the recording medium may be read into a computer system and executed.
  • the "computer system” here may include hardware such as an OS and peripheral devices.
  • the term “computer system” includes the homepage providing environment (or display environment) if a WWW system is used.
  • “computer-readable recording media” refers to portable media such as flexible disks, magneto-optical disks, ROMs, writable non-volatile memories such as flash memory, and DVDs (Digital Versatile Discs), as well as storage devices such as hard disks built into computer systems.
  • “computer-readable recording medium” also includes media that hold the program for a certain period of time, such as volatile memory (for example, DRAM (Dynamic Random Access Memory)).
  • the program may be transmitted from a computer system storing the program in a storage device or the like to another computer system via a transmission medium or by a transmission wave in a transmission medium.
  • the "transmission medium” that transmits the program refers to a medium that has a function of transmitting information, such as a network (communication network) such as the Internet or a communication line (communication line) such as a telephone line.
  • the above-mentioned program may be for realizing a part of the above-mentioned functions.
  • it may be a so-called difference file (difference program) that can realize the above-described functions in combination with a program already recorded in the computer system.

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Surgery (AREA)
  • General Health & Medical Sciences (AREA)
  • Veterinary Medicine (AREA)
  • Biomedical Technology (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Medical Informatics (AREA)
  • Molecular Biology (AREA)
  • Physics & Mathematics (AREA)
  • Animal Behavior & Ethology (AREA)
  • Pathology (AREA)
  • Public Health (AREA)
  • Biophysics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)
  • Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)

Abstract

This image processing device is provided with: a model storage unit for storing a trained model subjected to machine learning, using a learning captured image obtained by capturing an image of the skin of a person irradiated with visible light and a correct-answer captured image obtained by capturing an image of the skin of a person in which fluorescence caused by porphyrin is produced due to being irradiated with UV light, so as to generate an image of the skin of a person in which fluorescence caused by porphyrin is produced from a captured image obtained by capturing an image of the skin of a person irradiated with visible light; and an output image acquisition unit for acquiring, using the trained model, an output image, which is an image of the skin of a person in which fluorescence caused by porphyrin is produced, from a subject captured image obtained by capturing an image of the skin of a subject irradiated with visible light.

Description

Image processing device, model generation device, skin condition measurement system, image processing method, model generation method, determination method, and training data creation method
The present invention relates to an image processing device, a model generation device, a skin condition measurement system, an image processing method, a model generation method, a determination method, and a training data creation method.
This application claims priority to Japanese Patent Application No. 2022-127815, filed in Japan on August 10, 2022, the contents of which are incorporated herein by reference.
Conventionally, the presence of Propionibacterium acnes (the bacterium that causes acne) and the condition of the skin have been investigated by detecting porphyrins on the skin, which P. acnes produces as metabolites. As a method for detecting porphyrins on the skin, for example, the porphyrin detection method described in Patent Document 1 is known. In the porphyrin detection method described in Patent Document 1, the skin surface is irradiated with weak ultraviolet light including visible light, the skin surface is imaged by a CCD color camera equipped with a CCD (charge coupled device) cooling mechanism, the color image signal of the skin surface output from the CCD color camera is displayed as a still image on a television monitor, and the still image is observed.
Patent Document 1: Japanese Patent Application Publication No. H5-103771
However, because the porphyrin detection method described in Patent Document 1 requires irradiating the subject's skin surface with ultraviolet rays, damage to the skin caused by the ultraviolet irradiation is a problem, and a method capable of detecting porphyrins without ultraviolet irradiation has been desired.
The present invention has been made in view of these circumstances, and its object is to provide an image processing device, a model generation device, a skin condition measurement system, an image processing method, and a model generation method capable of detecting porphyrins on the skin of a subject without irradiating the subject with ultraviolet rays.
One aspect of the present invention is an image processing device comprising: a model storage unit that stores a trained model that has undergone machine learning, using a learning captured image in which the skin of a person irradiated with visible light is captured and a correct captured image in which the skin of a person in which fluorescence caused by porphyrins has occurred due to ultraviolet irradiation is captured, so as to generate an image of human skin in which fluorescence caused by porphyrins has occurred from a captured image of human skin irradiated with visible light; a subject captured image acquisition unit that acquires a subject captured image in which the skin of a subject irradiated with visible light is captured; and an output image acquisition unit that acquires, from the subject captured image and using the trained model, an output image that is an image of human skin in which fluorescence caused by porphyrins has occurred.
One aspect of the present invention is the above image processing device, wherein the machine learning of the trained model uses a compressed learning captured image obtained by compressing a part of the learning captured image and a compressed correct captured image obtained by compressing a part of the correct captured image.
One aspect of the present invention is the above image processing device, wherein the skin is facial skin and the part is around the forehead, nose, or mouth.
One aspect of the present invention is the above image processing device, wherein the machine learning of the trained model uses an emphasized correct captured image in which the red component of the correct captured image is emphasized.
One aspect of the present invention is the above image processing device, wherein the correct captured image is an RGB image composed of an R image, a G image, and a B image, and the emphasized correct captured image is an RGB image generated by adding, to the RGB image, a difference image obtained by subtracting the G image from the R image.
One aspect of the present invention is the above image processing device, wherein the trained model is composed of a layer of an encoder portion and a layer of a decoder portion.
One aspect of the present invention is the above image processing device, wherein the trained model has a U-Net structure in which the layer of the encoder portion and the layer of the decoder portion form a symmetrical structure and are connected by a skip connection.
One aspect of the present invention is a model generation device comprising: a learning captured image acquisition unit that acquires a learning captured image in which the skin of a person irradiated with visible light is captured; a correct captured image acquisition unit that acquires a correct captured image in which the skin of a person in which fluorescence caused by porphyrins has occurred due to ultraviolet irradiation is captured; and a model generation unit that performs machine learning of a model, using the learning captured image and the correct captured image, so that the model generates an image of human skin in which fluorescence caused by porphyrins has occurred from a captured image of human skin irradiated with visible light, and thereby generates a trained model.
One aspect of the present invention is the above model generation device, wherein the model generation unit generates a compressed learning captured image obtained by compressing a part of the learning captured image and a compressed correct captured image obtained by compressing a part of the correct captured image, and uses the compressed learning captured image and the compressed correct captured image for the machine learning of the model.
One aspect of the present invention is the above model generation device, wherein the skin is facial skin and the part is around the forehead, nose, or mouth.
One aspect of the present invention is the above model generation device, wherein the model generation unit generates an emphasized correct captured image in which the red component of the correct captured image is emphasized, and uses the emphasized correct captured image for the machine learning of the model.
One aspect of the present invention is the above model generation device, wherein the correct captured image is an RGB image composed of an R image, a G image, and a B image, and the emphasized correct captured image is an RGB image generated by adding, to the RGB image, a difference image obtained by subtracting the G image from the R image.
One aspect of the present invention is the above model generation device, wherein the model is composed of a layer of an encoder portion and a layer of a decoder portion.
One aspect of the present invention is the above model generation device, wherein the model has a U-Net structure in which the layer of the encoder portion and the layer of the decoder portion form a symmetrical structure and are connected by a skip connection.
One aspect of the present invention is a skin condition measurement system comprising the above image processing device and the above model generation device.
One aspect of the present invention is an image processing method including: a step of acquiring a subject captured image in which the skin of a subject irradiated with visible light is captured; and a step of acquiring, from the subject captured image and using a trained model, an output image that is an image of human skin in which fluorescence caused by porphyrins has occurred, wherein the trained model is a model that has undergone machine learning, using a learning captured image in which the skin of a person irradiated with visible light is captured and a correct captured image in which the skin of a person in which fluorescence caused by porphyrins has occurred due to ultraviolet irradiation is captured, so as to generate an image of human skin in which fluorescence caused by porphyrins has occurred from a captured image of human skin irradiated with visible light.
One aspect of the present invention is the above image processing method, wherein the machine learning of the trained model uses a compressed learning captured image obtained by compressing a part of the learning captured image and a compressed correct captured image obtained by compressing a part of the correct captured image.
One aspect of the present invention is the above image processing method, wherein the skin is facial skin and the part is around the forehead, nose, or mouth.
One aspect of the present invention is the above image processing method, wherein the machine learning of the trained model uses an emphasized correct captured image in which the red component of the correct captured image is emphasized.
One aspect of the present invention is the above image processing method, wherein the correct captured image is an RGB image composed of an R image, a G image, and a B image, and the emphasized correct captured image is an RGB image generated by adding, to the RGB image, a difference image obtained by subtracting the G image from the R image.
One aspect of the present invention is the above image processing method, wherein the trained model is composed of a layer of an encoder portion and a layer of a decoder portion.
One aspect of the present invention is the above image processing method, wherein the trained model has a U-Net structure in which the layer of the encoder portion and the layer of the decoder portion form a symmetrical structure and are connected by a skip connection.
One aspect of the present invention is a model generation method including: a step of acquiring a learning captured image in which the skin of a person irradiated with visible light is captured; a step of acquiring a correct captured image in which the skin of a person in which fluorescence caused by porphyrins has occurred due to ultraviolet irradiation is captured; and a model generation step of performing machine learning of a model, using the learning captured image and the correct captured image, so that the model generates an image of human skin in which fluorescence caused by porphyrins has occurred from a captured image of human skin irradiated with visible light, and thereby generating a trained model.
One aspect of the present invention is the above model generation method, wherein the model generation step generates a compressed learning captured image obtained by compressing a part of the learning captured image and a compressed correct captured image obtained by compressing a part of the correct captured image, and uses the compressed learning captured image and the compressed correct captured image for the machine learning of the model.
One aspect of the present invention is the above model generation method, wherein the skin is facial skin and the part is around the forehead, nose, or mouth.
One aspect of the present invention is the above model generation method, wherein the model generation step generates an emphasized correct captured image in which the red component of the correct captured image is emphasized, and uses the emphasized correct captured image for the machine learning of the model.
One aspect of the present invention is the above model generation method, wherein the correct captured image is an RGB image composed of an R image, a G image, and a B image, and the emphasized correct captured image is an RGB image generated by adding, to the RGB image, a difference image obtained by subtracting the G image from the R image.
One aspect of the present invention is the above model generation method, wherein the model is composed of a layer of an encoder portion and a layer of a decoder portion.
One aspect of the present invention is the above model generation method, wherein the model has a U-Net structure in which the layer of the encoder portion and the layer of the decoder portion form a symmetrical structure and are connected by a skip connection.
One aspect of the present invention is the above image processing method, wherein at least one projection reference point is set in the step of acquiring the learning captured image and the correct captured image.
One aspect of the present invention is a determination method for determining the susceptibility to acne by comparing the output image acquired by the above image processing device with a standard image, which is an image of the skin of a typical person without acne.
One aspect of the present invention is an evaluation method for evaluating the effect of skin care before and after a subject performs skin care, the method including: a first acquisition step of acquiring the output image acquired by the above image processing device based on the subject captured image taken before the subject performs skin care; a second acquisition step of acquiring the output image acquired by the above image processing device based on the subject captured image taken after the subject has performed skin care; and a determination step of comparing the output image acquired in the first acquisition step with the output image acquired in the second acquisition step to determine the degree of improvement due to the skin care.
One aspect of the present invention is a training data creation method including: a learning captured image acquisition step of acquiring a learning captured image, which is an image in which the skin of a person irradiated with visible light is captured and in which a portion having a reference point is captured; a correct captured image acquisition step of acquiring a correct captured image, which is an image of the same subject as the learning captured image captured by an imaging device different from the imaging device that captured the learning captured image, and which is an image in which the skin of a person in which fluorescence caused by porphyrins has occurred due to ultraviolet irradiation is captured; a first extraction step of extracting a partial region based on the reference point from the learning captured image; a second extraction step of extracting a partial region based on the reference point from the correct captured image; and a training data creation step of creating training data by storing the images extracted from the learning captured image and the correct captured image in association with each other.
One aspect of the present invention is the above training data creation method, further including an image processing step of generating an image matched to the learning captured image by performing image processing on the correct captured image, wherein the second extraction step extracts a partial region based on the reference point from the correct captured image that has been processed in the image processing step.
According to the present invention, porphyrins on the skin of a subject can be detected without irradiating the subject with ultraviolet rays.
FIG. 1 is a block diagram showing a configuration example of a skin condition measurement system according to an embodiment.
FIG. 2 is a configuration diagram showing a schematic configuration example of a model according to an embodiment.
FIG. 3 is a flowchart showing an example of the procedure of a model generation method according to an embodiment.
FIG. 4 is a diagram showing an example of cut-out portions of a learning captured image and a correct captured image according to an embodiment.
FIG. 5 is an example of images for explaining preprocessing 2 of a correct captured image according to an embodiment.
FIG. 6 is a flowchart showing an example of the procedure of an image processing method according to an embodiment.
FIG. 7 is a conceptual diagram for explaining the difference in how a learning captured image and a correct captured image are captured according to an embodiment.
FIG. 8 is a conceptual diagram of reference points used for creating training data according to an embodiment.
FIG. 9 is a diagram showing an example of preprocessing of a correct captured image and of cutting out a learning captured image and a correct captured image according to an embodiment.
Embodiments of the present invention will be described below with reference to the drawings.
FIG. 1 is a block diagram showing a configuration example of a skin condition measurement system according to an embodiment. The skin condition measurement system shown in FIG. 1 includes an image processing device 10 and a model generation device 20. The image processing device 10 and the model generation device 20 may exchange data online through communication, or may input and output data offline.
The subject U is a person who uses the skin condition measurement system and whose skin condition is to be measured. The subject U captures an image of the skin of the area to be measured with the camera of the smartphone 30, and transmits the captured image (subject captured image) A from the smartphone 30 to the image processing device 10. For example, the subject U captures an image of his or her entire face, or of the forehead, nose, and mouth area where acne is a concern, with the camera of the smartphone 30, and transmits the subject captured image A of the face from the smartphone 30 to the image processing device 10.
The subject captured image A is a captured image of the skin of the subject U irradiated with visible light. When the subject captured image A is captured, it is sufficient that visible light is irradiated onto the skin of the subject U; there is no need to irradiate the skin of the subject U with ultraviolet rays. The light source that irradiates visible light includes wavelengths of 400 to 770 nm. Since ultraviolet irradiation is not required, the ultraviolet (UV) intensity of the light source is preferably as weak as possible in order to reduce damage to the skin of the subject U, for example at or below the detection limit of a UV meter. Examples of light sources that irradiate visible light include natural light such as sunlight and moonlight, and artificial light such as incandescent bulbs, fluorescent lamps, xenon lamps, and LED bulbs. The light source may contain ultraviolet rays, but artificial light with low UV intensity is preferable, and among artificial light sources, xenon lamps and LED bulbs are more preferable.
Note that in the present embodiment the subject U uses a smartphone 30 equipped with a camera (imaging device), but the present invention is not limited to this. For example, a mobile terminal device such as a tablet computer (tablet PC) equipped with a camera may be used. Alternatively, a captured image taken with a digital camera may be transmitted to the image processing device 10 by a communication terminal such as a smartphone. When capturing with the camera of the smartphone 30, porphyrin detection accuracy can be further improved by capturing the image so that color correction and position correction can be performed. One example of a color correction method is to include a color chart in the frame when imaging the subject U, as sketched below. One example of a position correction method is to include an ArUco marker, a landmark sticker, or the like in the frame as a reference point when imaging the subject U, and to apply a projective transformation with image editing software. Further, when a mobile terminal device such as a camera-equipped smartphone 30 or tablet PC is used, the image processing device 10 described later may be incorporated into and integrated with the mobile terminal device. In this case, it is unnecessary to transmit the subject captured image A of the face from the smartphone 30 to the image processing device 10, and it is also unnecessary to transmit the output image B described later from the image processing device 10 to the smartphone 30 of the subject U.
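As an illustration of the color correction mentioned above, the following is a minimal sketch assuming OpenCV and NumPy and assuming that a neutral (gray or white) patch of the color chart is visible in the frame; the patch coordinates, target gray level, and file names are illustrative assumptions and not part of this disclosure.

```python
# Minimal sketch of one possible color correction using a neutral patch of a
# color chart included in the frame. Patch location, target level, and file
# names are hypothetical.
import cv2
import numpy as np

def correct_colors(image_bgr, patch_box, target_level=200.0):
    """Scale each channel so that the neutral chart patch reaches target_level."""
    x, y, w, h = patch_box                                   # neutral patch location
    patch = image_bgr[y:y + h, x:x + w].astype(np.float32)
    gains = target_level / (patch.reshape(-1, 3).mean(axis=0) + 1e-6)
    corrected = image_bgr.astype(np.float32) * gains
    return np.clip(corrected, 0, 255).astype(np.uint8)

if __name__ == "__main__":
    img = cv2.imread("subject_visible_light.jpg")            # hypothetical file name
    corrected = correct_colors(img, patch_box=(100, 100, 40, 40))
    cv2.imwrite("subject_color_corrected.jpg", corrected)
```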
[Image processing device]
The image processing device 10 includes a model storage unit 11, a subject captured image acquisition unit 12, and an output image acquisition unit 13.
Each function of the image processing device 10 is realized by the image processing device 10 including computer hardware such as a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), and memory, and by the CPU executing a computer program stored in the memory. Note that the image processing device 10 may be configured using a general-purpose computer device, or may be configured as a dedicated hardware device. For example, the image processing device 10 may be configured using a server computer connected to a communication network such as the Internet. Each function of the image processing device 10 may also be realized by cloud computing. Further, the image processing device 10 may be realized by a single computer, or the functions of the image processing device 10 may be realized by being distributed over a plurality of computers. The image processing device 10 may also be configured to run a website using, for example, a WWW system.
The model storage unit 11 stores the trained model MDa. The trained model MDa is provided from the model generation device 20. The trained model MDa is a model that has undergone machine learning, using a learning captured image in which the skin of a person irradiated with visible light is captured and a correct captured image in which the skin of a person in which fluorescence caused by porphyrins has occurred due to ultraviolet irradiation is captured, so as to generate an image of human skin in which fluorescence caused by porphyrins has occurred from a captured image of human skin irradiated with visible light.
The subject captured image acquisition unit 12 acquires the subject captured image A. The subject captured image A is a captured image of the skin of the subject U irradiated with visible light. The subject captured image acquisition unit 12 receives the subject captured image A transmitted from the smartphone 30 of the subject U via a communication line.
Note that the subject captured image acquisition unit 12 may compress the subject captured image A to a predetermined compression size and convert it into a subject compressed captured image A'. For example, when the subject captured image is "3456 x 5184" pixels, it is compressed to "256 x 256" pixels.
In addition, in order to minimize blurring of the output image caused by differences in face position due to the subject U moving during image acquisition, a "768 x 768" pixel region may be cut out from the "3456 x 5184" pixel subject captured image A, and the image of that "768 x 768" pixel cut-out portion may be compressed to "256 x 256" pixels, as sketched below. As the portions to be cut out, the captured image of the subject's entire face may be cut out at equal intervals, and it is also preferable to cut out the forehead, nose, or mouth area, where porphyrins are likely to be present. The resulting "256 x 256" pixel image is the subject compressed captured image A'.
Note that the image size values given above, such as "3456 x 5184" pixels, "768 x 768" pixels, and "256 x 256" pixels, are examples of image sizes and are not limiting.
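As an illustration of the cropping and compression described above, the following is a minimal sketch assuming OpenCV; the crop origin and file names are illustrative assumptions, and in practice the cut-out regions (forehead, nose, mouth area, or equally spaced tiles of the whole face) would be chosen from the captured face.

```python
# Minimal sketch: cut a 768x768 region out of a large subject captured image A
# and compress it to 256x256 to obtain the subject compressed captured image A'.
import cv2

CROP = 768          # size of the cut-out portion (pixels)
TARGET = 256        # compression size used as the model input (pixels)

def crop_and_compress(image, top, left):
    """Cut out a CROP x CROP patch at (top, left) and resize it to TARGET x TARGET."""
    patch = image[top:top + CROP, left:left + CROP]
    return cv2.resize(patch, (TARGET, TARGET), interpolation=cv2.INTER_AREA)

if __name__ == "__main__":
    subject_image_a = cv2.imread("subject_captured_image_A.jpg")   # e.g. 3456 x 5184
    # Hypothetical origin of a forehead region; the actual regions would be
    # selected from the face in the captured image.
    compressed_a_prime = crop_and_compress(subject_image_a, top=600, left=1400)
    cv2.imwrite("subject_compressed_A_prime.png", compressed_a_prime)
```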
The output image acquisition unit 13 acquires, from the subject captured image A or the subject compressed captured image A' and using the trained model MDa, an output image B or an output image B', which is an image of human skin in which fluorescence caused by porphyrins has occurred.
When the subject captured image A is input, the trained model MDa outputs the output image B. The output image B is an image inferred by the trained model MDa from the subject captured image A of the subject U, and is an image of the skin of the subject U in which fluorescence caused by porphyrins has occurred. For example, the output image B is an image of the face of the subject U in which fluorescence caused by porphyrins has occurred, inferred by the trained model MDa from the subject captured image A of the face of the subject U.
When the subject compressed captured image A' is input, the trained model MDa outputs the output image B'. The output image B' is an image inferred by the trained model MDa from the subject compressed captured image A' of the subject U, and is an image of the skin of the subject U in which fluorescence caused by porphyrins has occurred. For example, the output image B' is an image of the face of the subject U in which fluorescence caused by porphyrins has occurred, inferred by the trained model MDa from the subject compressed captured image A' of the face of the subject U. Since the output image B' is a compressed image, the output image acquisition unit 13 expands the output image B' to the same size as the subject captured image A and restores it to the output image B. When the captured image of the subject's entire face has been cut out at equal intervals and each portion compressed, the output image acquisition unit 13 restores each output image B' to an output image B and then joins the output images B together to obtain an output image B of the same size as the subject captured image A, as sketched below.
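The following is a minimal sketch of the restoration step just described, assuming OpenCV and NumPy, non-overlapping tiles, and known tile positions; the tile origins, sizes, and the canvas shape are illustrative assumptions.

```python
# Minimal sketch: expand each compressed output image B' back to the cut-out size
# and paste it at its original position to reassemble a full-size output image B.
import cv2
import numpy as np

CROP = 768  # size each 256x256 output tile is expanded back to

def assemble_output(tiles, origins, full_shape):
    """tiles: list of 256x256 BGR outputs B'; origins: list of (top, left) positions."""
    canvas = np.zeros(full_shape, dtype=np.uint8)
    for tile, (top, left) in zip(tiles, origins):
        restored = cv2.resize(tile, (CROP, CROP), interpolation=cv2.INTER_CUBIC)
        canvas[top:top + CROP, left:left + CROP] = restored
    return canvas

# Usage example with two hypothetical tiles of the face:
# output_b = assemble_output([b_prime_forehead, b_prime_nose],
#                            [(600, 1400), (1800, 1400)],
#                            full_shape=(5184, 3456, 3))
```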
The output image acquisition unit 13 transmits the output image B to the smartphone 30 of the subject U via the communication line.
The subject U receives the output image B transmitted from the image processing device 10 with the smartphone 30 via the communication line. By displaying the received output image B on the display screen of the smartphone 30, the subject U can visually check the state of porphyrins on his or her own skin from the output image B. For example, the subject U transmits a subject captured image A of his or her own face, where acne is a concern, from the smartphone 30 to the image processing device 10, and in response receives the output image B from the image processing device 10 on the smartphone 30; by displaying it on the display screen of the smartphone 30, the subject U can visually check from the output image B the state of porphyrins on the face where acne is a concern.
[Model generation device]
The model generation device 20 includes a learning captured image acquisition unit 21, a correct captured image acquisition unit 22, a model generation unit 23, and a model output unit 24.
Each function of the model generation device 20 is realized by the model generation device 20 including computer hardware such as a CPU, a GPU, and memory, and by the CPU executing a computer program stored in the memory. Note that the model generation device 20 may be configured using a general-purpose computer device, or may be configured as a dedicated hardware device. For example, the model generation device 20 may be configured using a server computer connected to a communication network such as the Internet. Each function of the model generation device 20 may also be realized by cloud computing. Further, the model generation device 20 may be realized by a single computer, or the functions of the model generation device 20 may be realized by being distributed over a plurality of computers.
The learning captured image acquisition unit 21 acquires a learning captured image C. The correct captured image acquisition unit 22 acquires a correct captured image D. The learning captured image C and the correct captured image D are acquired from persons selected for acquiring learning data (selected subjects). The learning captured image C is a captured image of the skin of a selected subject irradiated with visible light. For example, the learning captured image C is a captured image of the face of a selected subject irradiated with visible light. The correct captured image D is an image of the skin of a selected subject in which fluorescence caused by porphyrins has occurred due to ultraviolet irradiation. For example, the correct captured image D is an image of the face of a selected subject in which fluorescence caused by porphyrins has occurred due to ultraviolet irradiation.
When the correct captured image D is captured, the skin of the selected subject is irradiated with ultraviolet rays, so it is preferable to obtain the selected subject's consent regarding the ultraviolet irradiation and its effects. The light source that irradiates ultraviolet rays includes wavelengths of 100 to 400 nm. Examples of light sources that irradiate ultraviolet rays include mercury lamps, ultraviolet LED lamps, and black lights. Further, the angle of view and position of the learning captured image C and the correct captured image D are preferably identical. An example of an imaging device for acquiring a learning captured image C and a correct captured image D with the same angle of view and position is VISIA (registered trademark, Canfield Scientific). A smartphone may also be used instead of VISIA as the imaging device for acquiring the learning captured image C. In this case, in order to acquire a learning captured image C and a correct captured image D with the same angle of view and position, it is advisable to specify at least one reference point, preferably 4 to 8 reference points, at the same locations within the captured frame.
The model generation unit 23 performs machine learning of the model MD, using the learning captured image C and the correct captured image D, so that the model generates an image of human skin in which fluorescence caused by porphyrins has occurred from a captured image of human skin irradiated with visible light, and thereby generates the trained model MDa. When the subject captured image A or the subject compressed captured image A' is input, the trained model MDa outputs the output image B or the output image B'.
The model output unit 24 outputs the trained model MDa generated by the model generation unit 23. Examples of methods of outputting the trained model MDa include writing it to a computer-readable recording medium and transmitting the data by communication. The trained model MDa output by the model output unit 24 is provided to the image processing device 10. In the image processing device 10, the model storage unit 11 stores the provided trained model MDa.
Note that the image processing device 10 and the model generation device 20 may be realized using the same information processing device, or may be realized using different information processing devices.
FIG. 2 is a model configuration diagram showing a schematic configuration example of the model MD according to the present embodiment. The model MD shown in FIG. 2 has a U-Net structure in which the layers of the encoder 110 and the layers of the decoder 120 form a symmetrical structure, and the corresponding layers are connected by skip connections 130. The encoder 110 encodes the input image Pin in stages, layer by layer. The decoder 120 decodes the encoding result of the encoder 110 in stages, layer by layer. The decoding result of the decoder 120 is output from the model MD as the output image Pout. Each layer of the encoder 110 and the corresponding layer of the decoder 120 are connected by a skip connection 130.
Note that in the present embodiment a model with the U-Net structure illustrated in FIG. 2 is used as the model MD, but the model is not limited to this. For example, the model MD may be a model without the skip connections 130 shown in FIG. 2. GANs (Generative Adversarial Networks) may also be applied in the present embodiment. A minimal sketch of such a U-Net-style model follows.
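The following is a minimal sketch of a U-Net-style encoder-decoder with skip connections, assuming PyTorch and 256 x 256 RGB inputs and outputs; the number of stages and channel widths are illustrative assumptions and not the configuration disclosed here.

```python
# Minimal sketch of a U-Net-style model: symmetrical encoder/decoder layers
# connected by skip connections. Layer counts and channel widths are hypothetical.
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    """Two 3x3 convolutions with ReLU, used at every encoder/decoder stage."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
    )

class MiniUNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc1 = conv_block(3, 32)
        self.enc2 = conv_block(32, 64)
        self.bottleneck = conv_block(64, 128)
        self.pool = nn.MaxPool2d(2)
        self.up2 = nn.ConvTranspose2d(128, 64, 2, stride=2)
        self.dec2 = conv_block(128, 64)              # 64 (skip) + 64 (upsampled)
        self.up1 = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec1 = conv_block(64, 32)               # 32 (skip) + 32 (upsampled)
        self.out = nn.Conv2d(32, 3, 1)               # 3-channel output image Pout

    def forward(self, x):
        e1 = self.enc1(x)                             # skip connection source 1
        e2 = self.enc2(self.pool(e1))                 # skip connection source 2
        b = self.bottleneck(self.pool(e2))
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return torch.sigmoid(self.out(d1))            # pixel values in [0, 1]

# Example: a 256x256 input image Pin produces a 256x256 output image Pout.
# model = MiniUNet(); pout = model(torch.rand(1, 3, 256, 256))
```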
Next, the model generation method and the image processing method according to the present embodiment will be described.
[Model generation method]
The model generation method according to the present embodiment will be described with reference to FIG. 3. FIG. 3 is a flowchart showing an example of the procedure of the model generation method according to the present embodiment (model learning stage S100).
The model learning stage S100 is a stage executed by the model generation device 20, in which machine learning of the model MD is performed using the learning captured image C and the correct captured image D to generate the trained model MDa.
(Step S101) The learning captured image acquisition unit 21 acquires learning captured images C for a plurality of selected subjects. The correct captured image acquisition unit 22 acquires correct captured images D for the same plurality of selected subjects. The learning captured image C and the correct captured image D are RGB images in which the face of a selected subject is captured in color. An RGB image is composed of a red-component image (R image), a green-component image (G image), and a blue-component image (B image). The model generation device 20 holds the learning captured image C and the correct captured image D of the same selected subject in association with each other. In the learning captured image C and the correct captured image D of the same selected subject, the positions of the faces appearing in the captured images are aligned.
Note that when the learning captured image C is acquired with a smartphone camera, position correction is performed to align the face positions. The position correction method described above can be used; it is detailed below.
As a position correction method, for example, an image captured with a smartphone under visible-light irradiation and an image captured while fluorescence caused by porphyrins was occurring due to ultraviolet irradiation are aligned by projective transformation. As a specific procedure, at least one projection reference point, preferably 4 to 8 points, is specified at the same locations in the two images, and alignment is achieved by matching one set of reference points to the other. Examples of the projection reference points include moles, pseudo-moles drawn with a marker pen, and attached stickers. A smaller reference point is preferable from the viewpoint of making the pixel-level alignment more accurate. As a specific example, it is preferable to attach, in advance, stickers with extremely small holes to the cheeks, which surround the forehead, nose, and mouth area where acne is most likely to occur, so that these areas can be aligned; after imaging with the smartphone and VISIA, alignment is performed by projective transformation using the holes in the stickers as reference points, as sketched below. For example, when the pixel value correlation coefficient between the generated image and the correct image obtained by visual alignment is taken as 1, the pixel value correlation coefficient was "1.10" when a mark was made on the sticker with a marker pen, "1.23" when a mark was made by piercing the sticker with a needle (0.5 mm thick), and "3.20" when a mark was made by piercing the sticker with an ultra-fine needle (0.4 mm thick).
(Step S102) The model generation unit 23 performs preprocessing on the learning captured image C and the correct captured image D. The preprocessing of the learning captured image C and the correct captured image D is described below.
(前処理その1)
 前処理その1では、モデル生成部23は、モデルMDの機械学習に用いる画像として、学習用撮像画像Cの一部分を圧縮した圧縮学習用撮像画像と、正解撮像画像Dの一部分を圧縮した圧縮正解撮像画像とを生成する。図4には、学習用撮像画像C及び正解撮像画像Dのそれぞれから切り出される部分(切り出し部分)が示される。図4の例では、顔のうち額、鼻及び口周辺から切り出し部分が切り出される。これは、額、鼻及び口周辺は、特にニキビができやすい箇所、つまり、ニキビの原因となるアクネ菌が発生しやすくポルフィリンが多く検出されやすい箇所であるからである。なお、顔のうち、少なくとも額、鼻又は口周辺のいずれかから切り出し部分が切り出されてもよい。また、顔のうち、額、鼻及び口周辺の部分以外の他の部分(例えば頬など)から切り出し部分が切り出されてもよく、学習用撮像画像C及び正解撮像画像Dが選任被験者の顔全体の撮像画像である場合は、それぞれの画像を等間隔に切り出してもよい。
(Pretreatment part 1)
In preprocessing part 1, the model generation unit 23 uses a compressed learning captured image obtained by compressing a portion of the captured image C for learning and a compressed correct solution obtained by compressing a portion of the correct captured image D as images used for machine learning of the model MD. A captured image is generated. FIG. 4 shows portions (cutout portions) cut out from each of the learning captured image C and the correct captured image D. In the example of FIG. 4, portions of the face are cut out from around the forehead, nose, and mouth. This is because the areas around the forehead, nose, and mouth are areas where acne is particularly likely to occur, that is, areas where acne-causing bacteria are likely to occur and a large amount of porphyrin is likely to be detected. Note that the cutout portion may be cut out from at least the forehead, nose, or mouth area of the face. Further, the cutout portion may be cut out from other parts of the face (for example, the cheeks) other than the forehead, nose, and the area around the mouth, so that the learning captured image C and the correct captured image D are the entire face of the selected subject. , each image may be cut out at equal intervals.
 The model generation unit 23 compresses the image of each cutout portion to a predetermined compression size. For example, when the image size of the original learning captured image C and correct captured image D is "3456×5184" pixels, the image size of the cutout portion is set to "768×768" pixels, and the "768×768"-pixel cutout image is compressed to "256×256" pixels. The "256×256"-pixel image obtained by compressing the "768×768"-pixel cutout taken from the original learning captured image C is the compressed learning captured image. The "256×256"-pixel image obtained by compressing the "768×768"-pixel cutout taken from the original correct captured image D is the compressed correct captured image. A cutout size of "768×768" pixels is preferable in order to minimize blurring of the output image caused by differences in the position of the face due to movement of the same selected subject between the capture of the learning captured image C and the capture of the correct captured image D.
 Note that the image size values given above, such as "3456×5184" pixels, "768×768" pixels, and "256×256" pixels, are merely examples of image sizes and are not limiting.
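 For reference, the following is a minimal sketch of this crop-and-compress step, assuming OpenCV-style NumPy images; the function name crop_and_compress and the cutout coordinates are hypothetical and only illustrate the sizes described above, not the exact regions of FIG. 4.

```python
# A minimal sketch of preprocessing 1 (crop a 768x768 region, compress to 256x256),
# assuming 8-bit images loaded with OpenCV; coordinates are placeholders.
import cv2

CUTOUT_SIZE = 768      # cutout size in pixels (768 x 768)
COMPRESSED_SIZE = 256  # compressed size in pixels (256 x 256)

def crop_and_compress(image, top_left):
    """Cut a 768x768 region out of the full image and compress it to 256x256."""
    y, x = top_left
    cutout = image[y:y + CUTOUT_SIZE, x:x + CUTOUT_SIZE]
    return cv2.resize(cutout, (COMPRESSED_SIZE, COMPRESSED_SIZE),
                      interpolation=cv2.INTER_AREA)

# Example use with the 3456x5184-pixel originals; the coordinates below are
# placeholders standing in for the forehead, nose, and mouth regions.
# learning_c = cv2.imread("learning_C.png")               # visible-light image
# correct_d = cv2.imread("correct_D.png")                 # fluorescence image
# compressed_c = crop_and_compress(learning_c, (600, 1400))
# compressed_d = crop_and_compress(correct_d, (600, 1400))
```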
(Preprocessing 2)
 In preprocessing 2, the model generation unit 23 generates an emphasized correct captured image in which the red component of the correct captured image D is emphasized. In the present embodiment, preprocessing 2 is applied to the compressed correct captured image generated in preprocessing 1 described above. In a preferred form of preprocessing 2, the emphasized correct captured image is an RGB image generated by adding, to the RGB image of the correct captured image D, a difference image obtained by subtracting the G image from the R image among the R image, G image, and B image that constitute that RGB image.
 FIG. 5 shows example images for explaining preprocessing 2 of the correct captured image D according to the present embodiment. FIG. 5(1) is the compressed correct captured image (RGB image) generated in preprocessing 1. FIGS. 5(2), (3), and (4) are the R image, G image, and B image that constitute the compressed correct captured image (RGB image). The model generation unit 23 adds the difference image of FIG. 5(5) to the RGB image of FIG. 5(1) to generate the RGB image (emphasized correct captured image) of FIG. 5(6). In the RGB image (emphasized correct captured image) of FIG. 5(6), the red component is emphasized compared with the original RGB image (compressed correct captured image) of FIG. 5(1).
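 The following is a minimal sketch of one possible reading of preprocessing 2, assuming 8-bit RGB NumPy arrays; here the (R − G) difference image is added to every channel of the RGB image with clipping, but adding it only to the R channel would be an equally plausible implementation, and the function name emphasize_red is hypothetical.

```python
# A minimal sketch of preprocessing 2 (red-component emphasis), assuming
# 8-bit RGB images stored as numpy arrays with channel order R, G, B.
import numpy as np

def emphasize_red(rgb_image):
    """Generate an emphasized correct captured image from an RGB image."""
    img = rgb_image.astype(np.int16)          # widen to avoid uint8 overflow
    r, g = img[..., 0], img[..., 1]
    diff = np.clip(r - g, 0, 255)             # difference image (R - G)
    emphasized = img + diff[..., None]        # add the difference to the RGB image
    return np.clip(emphasized, 0, 255).astype(np.uint8)
```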
 This concludes the description of the preprocessing applied to the learning captured image C and the correct captured image D.
 In the present embodiment, the model generation unit 23 uses the compressed learning captured image and the emphasized correct captured image for the machine learning of the model MD. Alternatively, preprocessing 2 may be omitted, and the compressed learning captured image and the compressed correct captured image may be used for the machine learning of the model MD.
(Step S103) The model generation unit 23 inputs the compressed learning captured image C′ to the model MD as the input image Pin (see FIG. 2). When the compressed learning captured image (input image Pin) is input, the model MD generates and outputs an output image Pout (see FIG. 2).
(Step S104) The model generation unit 23 compares the output image Pout output from the model MD with the emphasized correct captured image D′ and performs feedback control on the model MD so that the difference value resulting from the comparison becomes smaller. The emphasized correct captured image D′ is the emphasized correct captured image generated from the correct captured image D of the same selected subject, corresponding to the compressed learning captured image C′ used as the input image Pin.
 In the present embodiment, for example, the MSE (mean squared error) given by the following expression is used as the difference value between the output image Pout and the emphasized correct captured image D′.
$$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left( x_i - y_i \right)^{2}$$
 Here, N is the number of pixels, i is the pixel index, x_i is the i-th pixel value of the emphasized correct captured image D′, and y_i is the i-th pixel value of the output image Pout.
 The model generation unit 23 repeats the machine learning of the model MD, using the learning captured images C (compressed learning captured images C′) and the correct captured images D (emphasized correct captured images D′) acquired from a plurality of selected subjects, until a predetermined termination condition is satisfied. The predetermined termination condition may be a predetermined number of iterations, or may be that the difference value between the output image Pout and the emphasized correct captured image D′ falls below a predetermined value.
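 A minimal PyTorch-style sketch of steps S103 and S104 is shown below, assuming the model maps a compressed learning captured image C′ to an output image Pout and that (C′, D′) tensor pairs from the selected subjects are available; the optimizer, learning rate, and thresholds are illustrative assumptions, not values from the embodiment.

```python
# A minimal sketch of the training feedback loop with an MSE difference value.
import torch

def train(model, pairs, max_iters=10000, target_mse=1e-3, lr=1e-4):
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    mse = torch.nn.MSELoss()                  # (1/N) * sum((x_i - y_i)^2)
    for step in range(max_iters):
        for c_prime, d_prime in pairs:        # input Pin and target D'
            pout = model(c_prime)             # output image Pout
            loss = mse(pout, d_prime)         # difference value
            optimizer.zero_grad()
            loss.backward()                   # feedback control on the model
            optimizer.step()
        if loss.item() < target_mse:          # termination condition
            break
    return model
```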
 The model generation unit 23 passes the model MD for which machine learning has been completed to the model output unit 24 as the learned model MDa. The model output unit 24 outputs the learned model MDa.
 It was confirmed experimentally that using the compressed learning captured images and compressed correct captured images generated by preprocessing 1 described above for machine learning of the model MD improves how faithfully the locations of fluorescence caused by porphyrins are reproduced in the output image Pout generated by the learned model MDa. Specifically, with nine selected subjects, and with the subject captured images prepared by cutting a "768×768"-pixel region out of the "3456×5184"-pixel subject captured image A and compressing that cutout to "256×256" pixels to obtain the subject captured image A′, the pixel-value correlation coefficient between the output image and the correct captured image was "0.22" when preprocessing 1 was not performed, whereas it was "0.35" when preprocessing 1 was performed, that is, when "768×768"-pixel regions were cut out of the "3456×5184"-pixel learning captured image C and correct captured image D and compressed into "256×256"-pixel compressed learning captured images and compressed correct captured images.
 It was also confirmed experimentally that using the emphasized correct captured images generated by preprocessing 2 described above for machine learning of the model MD improves how faithfully the locations of fluorescence caused by porphyrins are reproduced in the output image Pout generated by the learned model MDa. Specifically, with nine selected subjects and with the subject captured image A′ prepared as described above, the pixel-value correlation coefficient between the output image and the correct captured image was "0.35" when preprocessing 1 was performed but preprocessing 2 was not, whereas it was "0.61" when both preprocessing 1 and preprocessing 2 were performed.
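 As a reference, the pixel-value correlation coefficient quoted above can be understood as a Pearson correlation over all pixel values; a minimal sketch, assuming 8-bit image arrays and a hypothetical function name, is shown below.

```python
# A minimal sketch of the pixel-value correlation coefficient between an
# output image and the corresponding correct captured image.
import numpy as np

def pixel_value_correlation(output_image, correct_image):
    x = output_image.astype(np.float64).ravel()
    y = correct_image.astype(np.float64).ravel()
    return np.corrcoef(x, y)[0, 1]   # Pearson correlation coefficient
```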
[Image processing method]
 The image processing method according to the present embodiment is described with reference to FIG. 6. FIG. 6 is a flowchart showing an example of the procedure of the image processing method (skin condition measurement step S200) according to the present embodiment.
 The skin condition measurement step S200 is a step executed by the image processing device 10, in which the output image B is generated, using the learned model MDa, from the subject captured image A of the subject U or from the subject compressed captured image A′ converted from the subject captured image A. The model storage unit 11 of the image processing device 10 stores the learned model MDa generated by the model generation device 20 in the model learning step S100 described above.
(Step S201) The subject U images the skin of the area to be measured (for example, the face) with, for example, the camera of the smartphone 30, and transmits the captured subject captured image A from the smartphone 30 to the image processing device 10. The subject captured image acquisition unit 12 receives the subject captured image A transmitted from the smartphone 30 of the subject U via a communication line or the like. Here, the subject captured image acquisition unit 12 may compress the subject captured image A to a predetermined compression size and convert it into the subject compressed captured image A′.
 Note that when the image is captured with the camera of the smartphone 30, it may be captured so that color correction and position correction can be performed. Specific methods of color correction and position correction include the methods described above.
(Step S202) The output image acquisition unit 13 inputs the subject captured image A or the subject compressed captured image A′ to the learned model MDa as the input image Pin (see FIG. 2). When the subject captured image A or the subject compressed captured image A′ (input image Pin) is input, the learned model MDa generates and outputs an output image Pout (see FIG. 2). The output image acquisition unit 13 acquires the output image Pout output from the learned model MDa as the output image B or the output image B′. The output image B or the output image B′ is an image of the skin of the subject U in which fluorescence caused by porphyrins has occurred, inferred by the learned model MDa from the subject captured image A or the subject compressed captured image A′ of the subject U. For example, the output image B or the output image B′ is an image of the face of the subject U in which fluorescence caused by porphyrins has occurred, inferred by the learned model MDa from the subject captured image A or the subject compressed captured image A′ of the face of the subject U. Since the output image B′ is a compressed image, the output image acquisition unit 13 expands the output image B′ to the same size as the subject captured image A and restores it to the output image B.
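 A minimal sketch of this inference step is shown below, assuming a PyTorch learned model MDa, 8-bit RGB input, and a simple 0–1 normalization; the normalization, channel layout, and function name are assumptions for illustration only.

```python
# A minimal sketch of step S202: compress A to A', infer Pout (B'), restore to B.
import cv2
import numpy as np
import torch

def infer(model, subject_image_a, size=256):
    h, w = subject_image_a.shape[:2]
    a_prime = cv2.resize(subject_image_a, (size, size))        # subject compressed image A'
    pin = torch.from_numpy(a_prime).permute(2, 0, 1).float() / 255.0
    with torch.no_grad():
        pout = model(pin.unsqueeze(0))[0]                       # output image Pout (B')
    b_prime = (pout.permute(1, 2, 0).numpy() * 255).clip(0, 255).astype(np.uint8)
    return cv2.resize(b_prime, (w, h))                          # restore to output image B
```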
 The output image acquisition unit 13 transmits the output image B to the smartphone 30 of the subject U via the communication line. The subject U receives the output image B transmitted from the image processing device 10 on the smartphone 30 via the communication line. By displaying the received output image B on the display screen of the smartphone 30, the subject U can visually check the state of porphyrins on his or her skin (for example, the face) from the output image B. Note that when a portable terminal device such as a smartphone 30 or a tablet PC that has a camera and incorporates the image processing device 10 is used, it is unnecessary to transmit the captured subject captured image A of the face from the smartphone 30 to the image processing device 10 and to transmit the output image B from the image processing device 10 to the smartphone 30 of the subject U.
[Modification of the teacher data creation method]
 Next, a modification of the teacher data creation method is described with reference to FIGS. 7 to 9. First, the problem that this modification is intended to solve is explained. In the learning stage, for example, "VISIA (registered trademark), Canfield Scientific" is used as the imaging device for acquiring the learning captured image C and the correct captured image D. By using such a device, the angle of view and position of the learning captured image C and the correct captured image D become the same, and alignment is not required (or the amount of alignment is slight). In the inference stage, however, the subject U is assumed to capture the subject captured image A using the smartphone 30. In such a case, the number of pixels of the learning images and the number of pixels of the inference image differ greatly, so that sufficient inference may not be obtained.
 Therefore, it is conceivable to use images captured with the smartphone 30 in the learning stage as well, as in the inference stage. The learning captured image C is a captured image of the selected subject's skin irradiated with visible light and can therefore easily be captured with the smartphone 30. On the other hand, the correct captured image D is an image of the selected subject's face in which fluorescence caused by porphyrins has occurred due to irradiation with ultraviolet rays, and therefore cannot easily be captured with the smartphone 30. Accordingly, it is conceivable to use an image captured with the smartphone 30 as the learning captured image C, to use an image captured with a device such as "VISIA (registered trademark), Canfield Scientific" as the correct captured image D, and to align these images to create the teacher data.
 FIG. 7 is a conceptual diagram for explaining the difference between how the learning captured image and the correct captured image are captured according to one embodiment. The image acquisition methods in the modification of the teacher data creation method are described with reference to this figure. FIG. 7(A) shows an example of the acquisition method for the correct captured image D, in which imaging is performed with a device such as "VISIA (registered trademark), Canfield Scientific". As shown in the figure, the subject's face is fixed to the device, so camera shake and the like do not occur. If the learning captured image C is captured at the same time, the position of the subject's face does not change, so alignment is not required (and even if the images are not captured simultaneously, the amount of alignment is slight). Furthermore, as shown in FIG. 7(A), when imaging is performed with a device such as "VISIA (registered trademark), Canfield Scientific", the imaging device itself also produces higher image quality than the smartphone 30.
 FIG. 7(B) shows an example of the acquisition method for the learning captured image C, in which imaging is performed with the smartphone 30. As shown in the figure, the subject's face is not fixed, and camera shake and the like occur. The image quality of the imaging device itself is also lower than that of a device such as "VISIA (registered trademark), Canfield Scientific". In order to use the correct captured image D and the learning captured image C captured in these different ways as teacher data, highly accurate alignment is performed in the present embodiment. Note that even when the smartphone 30 is used, it is possible to fix the face and suppress camera shake during imaging. In the present embodiment, however, teacher data is deliberately created from images similar to those of the inference stage, so imaging may be performed under the conditions expected at the inference stage. The conditions expected at the inference stage may be, for example, the subject holding the smartphone 30 at arm's length and imaging himself or herself with its front camera (a so-called selfie).
 FIG. 8 is a conceptual diagram of the reference points used for creating the teacher data according to one embodiment. Four reference points are attached to the subject's face. Each reference point may be something like a sticker. To further improve accuracy, a mark may be drawn on the sticker with a marker or the like, or a hole for position identification may be opened in the sticker.
 FIG. 9 is a diagram showing an example of the preprocessing of the correct captured image and the cutting out of the learning captured image and the correct captured image according to one embodiment. An example of the preprocessing of the correct captured image D and the cutting out of the learning captured image C and the correct captured image D according to the modification of the teacher data creation method is described with reference to this figure.
 First, FIG. 9(A) is an example of the correct captured image D. Since the correct captured image D is captured with the face fixed, a front-view image is shown. FIG. 9(B) is an example of the learning captured image C. Since the learning captured image C is captured with the smartphone 30 without fixing the face, it may not be a front view. In the present embodiment, these images are aligned.
 Before the alignment, preprocessing is performed on the correct captured image D. In this preprocessing, a conventional technique such as projective transformation is used to transform the correct captured image D to match the learning captured image C. This preprocessing may be performed by image processing. FIG. 9(C) shows the result of the preprocessing performed on the correct captured image D. Here, in the present embodiment, the learning captured image C is not transformed to match the correct captured image D; rather, the correct captured image D is transformed to match the learning captured image C. By transforming the image to match the image of the inference stage, learning is performed on images similar to those of the inference stage.
 Next, teacher data is created by extracting (cutting out) a partial region from each of the image shown in FIG. 9(B) and the image shown in FIG. 9(C). Reference points such as those shown in FIG. 8 are used for this extraction. Note that the reference points need not be points attached specifically for creating the teacher data; they may be characteristic points of the human body, such as a mole or the corners of the mouth.
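 A minimal sketch of this alignment and extraction is shown below, assuming the pixel coordinates of the four reference points have already been located in both images (for example, by detecting the stickers); OpenCV's projective transformation is used to warp the correct captured image D onto the learning captured image C, and a region around a reference-point-based center is then cut out of both images. The function name and parameters are assumptions for illustration.

```python
# A minimal sketch of reference-point-based alignment (projective transform)
# and extraction of one matched patch pair for the teacher data.
import cv2
import numpy as np

def align_and_extract(learning_c, correct_d, pts_c, pts_d, center, size=768):
    # pts_c, pts_d: 4x2 arrays of matching reference points in C and D
    h_mat = cv2.getPerspectiveTransform(np.float32(pts_d), np.float32(pts_c))
    d_aligned = cv2.warpPerspective(correct_d, h_mat,
                                    (learning_c.shape[1], learning_c.shape[0]))
    cx, cy = center                               # reference-point-based center
    half = size // 2
    patch_c = learning_c[cy - half:cy + half, cx - half:cx + half]
    patch_d = d_aligned[cy - half:cy + half, cx - half:cx + half]
    return patch_c, patch_d                       # one teacher-data pair
```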
 According to the modification of the teacher data creation method described above, the learning captured image acquisition step acquires the learning captured image C, which is an image of human skin irradiated with visible light in which a portion having the reference points is captured, and the correct captured image acquisition step acquires the correct captured image D, which is an image in which the same subject as in the learning captured image C is captured by an imaging device different from the imaging device that captured the learning captured image C, and in which human skin exhibiting fluorescence caused by porphyrins due to irradiation with ultraviolet rays is captured. The first extraction step extracts a partial region of the learning captured image C based on the reference points, and the second extraction step extracts a partial region of the correct captured image D based on the reference points. The teacher data creation step creates the teacher data by storing the images extracted from the learning captured image C and the correct captured image D in association with each other. By creating the teacher data in this way, teacher data similar to the images of the inference stage can be created. That is, according to the present embodiment, learning can be performed accurately, and inference can be performed accurately.
 In addition, according to the modification of the teacher data creation method described above, an image processing step is further included in which the correct captured image D (rather than the learning captured image C) is processed to generate an image matched to the learning captured image C. This image processing step is the preprocessing described above. In the second extraction step, the partial region based on the reference points is extracted from the correct captured image D that has been processed in the image processing step. By creating the teacher data in this way, teacher data similar to the images of the inference stage can be created. That is, according to the present embodiment, learning can be performed accurately, and inference can be performed accurately.
 Note that in the image processing step, in addition to processing the correct captured image D, image processing may also be performed on the learning captured image C.
 This concludes the description of the model generation method and the image processing method according to the present embodiment.
 Note that the visible light used when capturing the learning captured image C and the subject captured image A is preferably light of the same wavelength and intensity. Blue light (for example, a wavelength of 380 to 550 nm) may be used as the visible light for capturing the learning captured image C and the subject captured image A. It was confirmed experimentally that using blue light when capturing the learning captured image C and the subject captured image A improves how faithfully the locations of fluorescence caused by porphyrins are reproduced in the output image Pout generated by the learned model MDa, compared with using white light. Specifically, with nine selected subjects, with the subject captured image A′ prepared by cutting a "768×768"-pixel region out of the "3456×5184"-pixel subject captured image A and compressing that cutout to "256×256" pixels, and with both preprocessing 1 and preprocessing 2 described above performed, the pixel-value correlation coefficient between the output image and the correct captured image was "0.61" when white light (400 to 770 nm) was used, whereas it was "0.65" when blue light (380 to 550 nm) was used. Examples of embodiments using blue light include covering an artificial light source such as an incandescent bulb, fluorescent lamp, or LED bulb with a blue film, irradiation with an LED bulb that emits only blue light, and displaying blue over the entire screen of a smartphone or monitor and using it for illumination.
 According to the embodiment described above, the image processing device 10 uses the learned model MDa, on which machine learning has been performed using the learning captured image C in which human skin irradiated with visible light is captured and the correct captured image D in which human skin exhibiting fluorescence caused by porphyrins due to irradiation with ultraviolet rays is captured, so as to generate, from a captured image of human skin irradiated with visible light, an image of human skin in which fluorescence caused by porphyrins has occurred, to acquire, from the subject captured image A in which the skin of the subject U irradiated with visible light is captured, the output image B, which is an image of human skin in which fluorescence caused by porphyrins has occurred. This provides the effect that porphyrins on the skin of the subject U can be detected without irradiating the subject U with ultraviolet rays. The subject captured image A may be converted into and used as the subject compressed captured image A′, in which a portion of the subject captured image A is compressed. This improves how faithfully the locations of fluorescence caused by porphyrins are reproduced in the output image B generated by the learned model MDa. The portion may be around the forehead, nose, or mouth of the face. From the subject compressed captured image A′, the output image B′, which is an image of human skin in which fluorescence caused by porphyrins has occurred, is acquired. Since the output image B′ is a compressed image, it is expanded to the same size as the subject captured image A and restored to the output image B, whereby the output image B is acquired.
 Furthermore, the machine learning of the learned model MDa may use a compressed learning captured image obtained by compressing a portion of the learning captured image C and a compressed correct captured image obtained by compressing a portion of the correct captured image D. This improves how faithfully the locations of fluorescence caused by porphyrins are reproduced in the output image B generated by the learned model MDa. The portion may be around the forehead, nose, or mouth of the face.
 Furthermore, the machine learning of the learned model MDa may use an emphasized correct captured image in which the red component of the correct captured image D is emphasized. This improves how faithfully the locations of fluorescence caused by porphyrins are reproduced in the output image B generated by the learned model MDa. The correct captured image D may be an RGB image composed of an R image, a G image, and a B image, and the emphasized correct captured image may be an RGB image generated by adding, to the RGB image, a difference image obtained by subtracting the G image from the R image.
 The learned model MDa may be composed of encoder layers and decoder layers. The learned model MDa may have a U-Net structure in which the encoder layers and the decoder layers form a symmetric structure and are connected by skip connections.
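 As a reference, the following is a minimal sketch of an encoder-decoder with a skip connection (a U-Net-like structure); the depth and channel counts are illustrative assumptions and not the configuration actually used in the embodiment.

```python
# A minimal sketch of a symmetric encoder-decoder with a skip connection.
import torch
import torch.nn as nn

def block(c_in, c_out):
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU())

class TinyUNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc1, self.enc2 = block(3, 32), block(32, 64)
        self.pool = nn.MaxPool2d(2)
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        self.dec1 = block(64 + 32, 32)          # skip connection concatenates enc1
        self.out = nn.Conv2d(32, 3, 1)

    def forward(self, x):
        e1 = self.enc1(x)                                        # encoder layer
        e2 = self.enc2(self.pool(e1))                            # bottleneck
        d1 = self.dec1(torch.cat([self.up(e2), e1], dim=1))      # decoder + skip
        return torch.sigmoid(self.out(d1))                       # output image in [0, 1]
```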
 According to the embodiment described above, the model generation device 20 performs machine learning of the model MD, using the learning captured image C in which human skin irradiated with visible light is captured and the correct captured image D in which human skin exhibiting fluorescence caused by porphyrins due to irradiation with ultraviolet rays is captured, so as to generate, from a captured image of human skin irradiated with visible light, an image of human skin in which fluorescence caused by porphyrins has occurred, and thereby generates the learned model MDa. Using this learned model MDa provides the effect that porphyrins on the skin of the subject U can be detected without irradiating the subject U with ultraviolet rays.
 The model generation device 20 may also generate a compressed learning captured image obtained by compressing a portion of the learning captured image C and a compressed correct captured image obtained by compressing a portion of the correct captured image D, and use the compressed learning captured image and the compressed correct captured image for the machine learning of the model MD. This improves how faithfully the locations of fluorescence caused by porphyrins are reproduced in the output image B generated by the learned model MDa. The portion may be around the forehead, nose, or mouth of the face.
 The model generation device 20 may also generate an emphasized correct captured image in which the red component of the correct captured image D is emphasized and use the emphasized correct captured image for the machine learning of the model MD. This improves how faithfully the locations of fluorescence caused by porphyrins are reproduced in the output image B generated by the learned model MDa. The correct captured image D may be an RGB image composed of an R image, a G image, and a B image, and the emphasized correct captured image may be an RGB image generated by adding, to the RGB image, a difference image obtained by subtracting the G image from the R image.
 The model MD may be composed of encoder layers and decoder layers. The model MD may also have a U-Net structure in which the encoder layers and the decoder layers form a symmetric structure and are connected by skip connections.
 The image processing device 10 described above can also be used to determine susceptibility to acne. For example, susceptibility to acne can be determined by comparing the output image acquired by the image processing device 10 with a standard image. The standard image may be an image of a typical person in whom acne has not occurred. As the standard image, a suitable predetermined image may be prepared according to, for example, the age, gender, or nationality of the person to be assessed (the subject U). The standard image need not be an actually captured image and may be an image generated by image processing.
 The image processing device 10 described above can also be used to evaluate the effect of skin care before and after the subject performs skin care. In this case, as a first acquisition step, an output image output by the image processing device 10 is acquired based on the subject captured image A before the subject performs skin care. As a second acquisition step, an output image output by the image processing device 10 is acquired based on the subject captured image A after the subject performs skin care. The output image acquired in the first acquisition step and the output image acquired in the second acquisition step are then compared to determine the effect of the skin care. This determination may be made by comparing the output images before and after the skin care, or by comparing each of the output images before and after the skin care with a standard image.
 A computer program for realizing the functions of each of the devices described above may be recorded on a computer-readable recording medium, and the program recorded on the recording medium may be read into a computer system and executed. The "computer system" here may include an OS and hardware such as peripheral devices.
 The "computer system" also includes a homepage providing environment (or display environment) when a WWW system is used.
 The "computer-readable recording medium" refers to a writable nonvolatile memory such as a flexible disk, a magneto-optical disk, a ROM, or a flash memory, a portable medium such as a DVD (Digital Versatile Disc), or a storage device such as a hard disk built into the computer system.
 The "computer-readable recording medium" further includes media that hold the program for a certain period of time, such as a volatile memory (for example, DRAM (Dynamic Random Access Memory)) inside a computer system serving as a server or client when the program is transmitted via a network such as the Internet or a communication line such as a telephone line.
 The program may be transmitted from a computer system storing the program in a storage device or the like to another computer system via a transmission medium or by transmission waves in a transmission medium. Here, the "transmission medium" that transmits the program refers to a medium having a function of transmitting information, such as a network (communication network) such as the Internet or a communication line (communication channel) such as a telephone line.
 The program may realize only some of the functions described above. Furthermore, it may be a so-called differential file (differential program) that realizes the functions described above in combination with a program already recorded in the computer system.
 Although embodiments of the present invention have been described above in detail with reference to the drawings, the specific configuration is not limited to these embodiments, and design changes and the like within a scope not departing from the gist of the present invention are also included.
10…image processing device, 11…model storage unit, 12…subject captured image acquisition unit, 13…output image acquisition unit, 21…learning captured image acquisition unit, 22…correct captured image acquisition unit, 23…model generation unit, 24…model output unit, MD…model, MDa…learned model, 30…smartphone, U…subject

Claims (14)

  1.  An image processing device comprising:
     a model storage unit that stores a learned model on which machine learning has been performed, using a learning captured image in which human skin irradiated with visible light is captured and a correct captured image in which human skin exhibiting fluorescence caused by porphyrins due to irradiation with ultraviolet rays is captured, so as to generate, from a captured image of human skin irradiated with visible light, an image of human skin in which fluorescence caused by porphyrins has occurred;
     a subject captured image acquisition unit that acquires a subject captured image in which the skin of a subject irradiated with visible light is captured; and
     an output image acquisition unit that acquires, from the subject captured image, using the learned model, an output image that is an image of human skin in which fluorescence caused by porphyrins has occurred.
  2.  The image processing device according to claim 1, wherein the machine learning of the learned model uses a compressed learning captured image obtained by compressing a portion of the learning captured image and a compressed correct captured image obtained by compressing a portion of the correct captured image.
  3.  The image processing device according to claim 2, wherein the skin is facial skin, and the portion is around the forehead, nose, or mouth.
  4.  The image processing device according to claim 1, wherein the machine learning of the learned model uses an emphasized correct captured image in which a red component of the correct captured image is emphasized.
  5.  The image processing device according to claim 4, wherein the correct captured image is an RGB image composed of an R image, a G image, and a B image, and the emphasized correct captured image is an RGB image generated by adding, to the RGB image, a difference image obtained by subtracting the G image from the R image.
  6.  The image processing device according to any one of claims 1 to 5, wherein the learned model is composed of encoder layers and decoder layers.
  7.  A model generation device comprising:
     a learning captured image acquisition unit that acquires a learning captured image in which human skin irradiated with visible light is captured;
     a correct captured image acquisition unit that acquires a correct captured image in which human skin exhibiting fluorescence caused by porphyrins due to irradiation with ultraviolet rays is captured; and
     a model generation unit that performs machine learning of a model, using the learning captured image and the correct captured image, so as to generate, from a captured image of human skin irradiated with visible light, an image of human skin in which fluorescence caused by porphyrins has occurred, and generates a learned model.
  8.  A skin condition measurement system comprising the image processing device according to any one of claims 1 to 5 and the model generation device according to claim 7.
  9.  An image processing method comprising:
     a step of acquiring a subject captured image in which the skin of a subject irradiated with visible light is captured; and
     a step of acquiring, from the subject captured image, using a learned model, an output image that is an image of human skin in which fluorescence caused by porphyrins has occurred,
     wherein the learned model is a model on which machine learning has been performed, using a learning captured image in which human skin irradiated with visible light is captured and a correct captured image in which human skin exhibiting fluorescence caused by porphyrins due to irradiation with ultraviolet rays is captured, so as to generate, from a captured image of human skin irradiated with visible light, an image of human skin in which fluorescence caused by porphyrins has occurred.
  10.  The image processing method according to claim 9, wherein, in a step of acquiring the learning captured image and the correct captured image, at least one reference point to be captured in the images is placed.
  11.  A model generation method comprising:
     a step of acquiring a learning captured image in which human skin irradiated with visible light is captured;
     a step of acquiring a correct captured image in which human skin exhibiting fluorescence caused by porphyrins due to irradiation with ultraviolet rays is captured; and
     a step of performing machine learning of a model, using the learning captured image and the correct captured image, so as to generate, from a captured image of human skin irradiated with visible light, an image of human skin in which fluorescence caused by porphyrins has occurred, and generating a learned model.
  12.  A determination method for determining susceptibility to acne by comparing the output image acquired by the image processing device according to any one of claims 1 to 5 with a standard image that is an image of the skin of a typical person in whom acne has not occurred.
  13.  A determination method for evaluating the effect of skin care before and after a subject performs skin care, the method comprising:
     a first acquisition step of acquiring the output image acquired by the image processing device according to any one of claims 1 to 5 based on the subject captured image before the subject performs skin care;
     a second acquisition step of acquiring the output image acquired by the image processing device according to any one of claims 1 to 5 based on the subject captured image after the subject performs skin care; and
     comparing the output image acquired in the first acquisition step with the output image acquired in the second acquisition step to determine the degree of improvement due to the skin care.
  14.  A teacher data creation method comprising:
     a learning captured image acquisition step of acquiring a learning captured image that is an image in which human skin irradiated with visible light is captured and in which a portion having a reference point is captured;
     a correct captured image acquisition step of acquiring a correct captured image that is an image in which the same subject as in the learning captured image is captured by an imaging device different from the imaging device that captured the learning captured image, and in which human skin exhibiting fluorescence caused by porphyrins due to irradiation with ultraviolet rays is captured;
     a first extraction step of extracting a partial region of the learning captured image based on the reference point;
     a second extraction step of extracting a partial region of the correct captured image based on the reference point; and
     a teacher data creation step of creating teacher data by storing the images extracted from the learning captured image and the correct captured image in association with each other.
PCT/JP2023/029036 2022-08-10 2023-08-09 Image processing device, model generation device, skin state measurement system, image processing method, model generation method, determining method, and teacher data creation method WO2024034630A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2022127815 2022-08-10
JP2022-127815 2022-08-10

Publications (1)

Publication Number Publication Date
WO2024034630A1 true WO2024034630A1 (en) 2024-02-15

Family

ID=89851614

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2023/029036 WO2024034630A1 (en) 2022-08-10 2023-08-09 Image processing device, model generation device, skin state measurement system, image processing method, model generation method, determining method, and teacher data creation method

Country Status (1)

Country Link
WO (1) WO2024034630A1 (en)


Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH05103771A (en) * 1991-10-17 1993-04-27 Kao Corp Detection method for porphyrin on skin
WO2006118060A1 (en) * 2005-04-28 2006-11-09 Shiseido Company, Ltd. Skin state analyzing method, skin state analyzing device, and recording medium on which skin state analyzing program is recorded
JP2009000494A (en) * 2007-05-23 2009-01-08 Noritsu Koki Co Ltd Porphyrin detection method, porphyrin display method, and porphyrin detector
JP2018196426A (en) * 2017-05-23 2018-12-13 花王株式会社 Pore detection method and pore detection device
CN110390631A (en) * 2019-07-11 2019-10-29 上海媚测信息科技有限公司 Generate method, system, network and the storage medium of UV spectrum picture
JP2021125056A (en) * 2020-02-07 2021-08-30 カシオ計算機株式会社 Identification device, identification equipment learning method, identification method, and program
WO2022149110A1 (en) * 2021-01-11 2022-07-14 Baracoda Daily Healthtech Systems and methods for skin analysis

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WATANABE SOTA; HASEGAWA MAKOTO: "Visualization of Cutibacterium acnes with visible light using deep learning", SPIE, 1000 20TH ST. BELLINGHAM WA 98225-6705 USA, vol. 12592, 25 March 2023 (2023-03-25), 1000 20th St. Bellingham WA 98225-6705 USA, pages 1259214 - 1259214-6, XP060175088, ISSN: 0277-786X, ISBN: 978-1-5106-6308-4, DOI: 10.1117/12.2666674 *

Similar Documents

Publication Publication Date Title
CN110197229B (en) Training method and device of image processing model and storage medium
JP4373828B2 (en) Specific area detection method, specific area detection apparatus, and program
Krupinski et al. American Telemedicine Association’s practice guidelines for teledermatology
US8655068B1 (en) Color correction system
WO2019072190A1 (en) Image processing method, electronic apparatus, and computer readable storage medium
CN109587556B (en) Video processing method, video playing method, device, equipment and storage medium
US9277148B2 (en) Maximizing perceptual quality and naturalness of captured images
JP2004357277A (en) Digital image processing method
JP2023056056A (en) Data generation method, learning method and estimation method
JP2005092759A (en) Image processing device and method, red-eye detection method, and program
EP4024352A3 (en) Method and apparatus for face liveness detection, and storage medium
US12014829B2 (en) Image processing and presentation techniques for enhanced proctoring sessions
CN116188296A (en) Image optimization method and device, equipment, medium and product thereof
WO2024034630A1 (en) Image processing device, model generation device, skin state measurement system, image processing method, model generation method, determining method, and teacher data creation method
JP6898150B2 (en) Pore detection method and pore detection device
Banterle et al. A psychophysical evaluation of inverse tone mapping techniques
US11790475B2 (en) Light-field messaging to embed a hidden message into a carrier
Akyüz et al. An evaluation of image reproduction algorithms for high contrast scenes on large and small screen display devices
US10973412B1 (en) System for producing consistent medical image data that is verifiably correct
WO2021126268A1 (en) Neural networks to provide images to recognition engines
WO2021125268A1 (en) Control device, control method, and program
CN114945079A (en) Video recording and invigilating method for online pen test, electronic equipment and storage medium
US8233693B1 (en) Automatic print and negative verification methods and apparatus
WO2021171444A1 (en) Teaching data generation device, teaching data generation method, recording device, and recording method
Payne et al. “OpenVPCal”: An Open Source In-Camera Visual Effects Calibration Framework

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23852592

Country of ref document: EP

Kind code of ref document: A1