WO2024105829A1 - Image processing device and image processing program - Google Patents

Image processing device and image processing program

Info

Publication number
WO2024105829A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
reflection intensity
intensity distribution
captured image
component
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2022/042627
Other languages
English (en)
French (fr)
Japanese (ja)
Inventor
将吾 佐藤
泰洋 八尾
大我 吉田
潤 島村
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NTT Inc
Original Assignee
Nippon Telegraph and Telephone Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nippon Telegraph and Telephone Corp
Priority to JP2024558581A (JPWO2024105829A1)
Priority to PCT/JP2022/042627 (WO2024105829A1)
Publication of WO2024105829A1
Anticipated expiration
Ceased (current)

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00 Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/17 Systems in which incident light is modified in accordance with the properties of the material investigated
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S17/00 Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
    • G01S17/88 Lidar systems specially adapted for specific applications
    • G01S17/89 Lidar systems specially adapted for specific applications for mapping or imaging

Definitions

  • the disclosed technology relates to an image processing device and an image processing program.
  • Non-Patent Document 1 discloses a method that uses a VAE (Variational Autoencoder)-based unsupervised learning model as an intrinsic image decomposition technique.
  • Non-Patent Document 2 discloses a method that uses a supervised learning model as an intrinsic image decomposition technique.
  • Because the technique of Non-Patent Document 1 relies on unsupervised learning, it cannot learn to distinguish between shadows cast by objects and patterns formed on an object's surface, and its accuracy in estimating reflectance components in particular is insufficient. With the technique of Non-Patent Document 2, it is difficult to prepare training data: the reflectance components of an object are hard to observe directly, and generating training data through simulation increases cost and reduces accuracy.
  • the disclosed technology has been developed in consideration of the above points, and aims to provide an image processing device and an image processing program that can accurately estimate at least one of the reflectance components and the shading components in an image.
  • a first aspect of the present disclosure is an image processing device that includes an acquisition unit that acquires a captured image obtained by capturing an image of an object and a reflection intensity distribution on the surface of the object, and an estimation unit that estimates at least one of a reflectance component in the captured image that is independent of illumination light and a shading component in the captured image that is dependent on illumination light, based on the captured image and the reflection intensity distribution.
  • the second aspect of the present disclosure is an image processing program for causing a computer to function as the image processing device of the first aspect.
  • the disclosed technology makes it possible to accurately estimate at least one of the reflectance and shading components in an image.
  • FIG. 1 is a block diagram showing an example of a hardware configuration of the image processing device.
  • FIG. 2 is a block diagram showing a configuration of an image processing device according to a first embodiment.
  • FIG. 3 is a diagram illustrating input and output of the intrinsic image decomposition model of the first embodiment.
  • FIG. 4 is a diagram illustrating an example of the configuration of an intrinsic image decomposition model according to the first embodiment.
  • FIG. 5 is a diagram illustrating an example of the configuration of an intrinsic image decomposition model according to the first embodiment.
  • FIG. 6 is a flowchart showing a flow of image processing according to the first embodiment.
  • FIG. 7 is a diagram illustrating input and output of the intrinsic image decomposition model of the second embodiment.
  • FIG. 8 is a diagram illustrating an example of the configuration of an intrinsic image decomposition model according to the second embodiment.
  • FIG. 9 is a flowchart showing a flow of image processing according to the second embodiment.
  • FIG. 10 is a block diagram showing a configuration of an image processing device according to a third embodiment.
  • FIG. 11 is a diagram illustrating input and output of the intrinsic image decomposition model of the third embodiment.
  • FIG. 12 is a flowchart showing a flow of image processing according to the third embodiment.
  • FIG. 13 is a block diagram showing a configuration of an image processing device according to a fourth embodiment.
  • FIG. 14 is a diagram illustrating input and output of the intrinsic image decomposition model of the fourth embodiment.
  • Each embodiment relates to an intrinsic image decomposition technique that, for an image obtained by photographing an object, estimates a reflectance component (albedo) that is independent of the illumination light and a shading component that depends on the illumination light (i.e., decomposes the captured image into a reflectance component and a shading component).
  • The reflectance components and shading components in the captured image are estimated with high accuracy by using the reflection intensity measured by a sensor capable of measuring the actual reflection intensity, such as a LiDAR (Light Detection and Ranging) sensor.
  • A LiDAR sensor shines laser light onto an object and measures the distance to the object based on the time taken to receive the reflected light or on the phase difference between the emitted light and the received light.
  • a LiDAR sensor can measure the three-dimensional shape (three-dimensional coordinates) of an object by arranging multiple laser light emitters in the vertical direction and having each emitter scan (rotate) horizontally.
  • the LiDAR sensor can also measure the reflection intensity at each position on the object's surface using the light intensity ratio between the emitted light and the received light. The reflection intensity measured by this LiDAR sensor is not affected by ambient light, as it is the light intensity ratio between the emitted light and the received light.
  • reflection intensity refers to the actual reflection intensity (e.g., a value corresponding to the light intensity ratio between the emitted light and the received light of the laser light) measured by a sensor capable of measuring the actual reflection intensity, such as a LiDAR sensor.
  • reflection intensity distribution refers to the distribution of the actual reflection intensity at each position on the object surface, measured by a sensor capable of measuring the actual reflection intensity, such as a LiDAR sensor. Details of each embodiment are described below.
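  • As a rough illustration of the two LiDAR measurements described above, the following sketch computes a range from a round-trip time and a reflection intensity as a received-to-emitted power ratio. The function names and the plain power-ratio definition are assumptions made for this example and are not taken from the disclosure.

```python
C = 299_792_458.0  # speed of light in m/s


def tof_distance(round_trip_s: float) -> float:
    """Range from time of flight: the pulse travels to the object and back."""
    return C * round_trip_s / 2.0


def reflection_intensity(emitted_power: float, received_power: float) -> float:
    """Hypothetical intensity value: ratio of received to emitted laser power."""
    if emitted_power <= 0:
        raise ValueError("emitted_power must be positive")
    return received_power / emitted_power


distance_m = tof_distance(2e-7)              # a 200 ns round trip
intensity = reflection_intensity(2.0, 0.5)   # one quarter of the power returns
```

  • Because the value is a ratio of the sensor's own emitted and received light, it does not change with ambient illumination, which is the property the embodiments rely on.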
  • FIG. 1 is a block diagram showing the hardware configuration of an image processing device 10 according to the present embodiment.
  • the image processing device 10 includes a CPU (Central Processing Unit) 11, a ROM (Read Only Memory) 12, a RAM (Random Access Memory) 13, a storage 14, an input unit 15, a display unit 16, and a communication I/F (InterFace) 17.
  • the CPU 11 is a central processing unit that executes various programs and controls each part. That is, the CPU 11 reads the programs from the ROM 12 or storage 14, and executes the programs using the RAM 13 as a working area. The CPU 11 controls each of the above components and performs various calculation processes according to the programs stored in the ROM 12 or storage 14.
  • The ROM 12 or storage 14 stores an image processing program for estimating reflectance components and shading components from a captured image.
  • the image processing program may be a single program, or may be a group of programs consisting of multiple programs or modules.
  • ROM 12 stores various programs and data.
  • RAM 13 temporarily stores programs or data as a working area.
  • Storage 14 is composed of a storage device such as an HDD (Hard Disk Drive) or SSD (Solid State Drive), and stores various programs including an operating system, and various data.
  • the input unit 15 includes, for example, a pointing device such as a mouse, and a keyboard, and is used to perform various input operations.
  • the display unit 16 is, for example, a liquid crystal display, and displays various types of information.
  • the display unit 16 may also function as the input unit 15 by employing a touch panel system.
  • The communication I/F 17 is an interface for communicating with other devices including, for example, a camera and a LiDAR sensor. For this communication, a wired communication standard such as Ethernet (registered trademark) or FDDI, or a wireless communication standard such as 4G, 5G, or Wi-Fi (registered trademark), is used.
  • FIG. 2 is a block diagram showing an example of the functional configuration of the image processing device 10.
  • the image processing device 10 has, as its functional configuration, an acquisition unit 20, an estimation unit 22, and a learning unit 24.
  • Each functional configuration is realized by the CPU 11 reading out an image processing program stored in the ROM 12 or storage 14, expanding it in the RAM 13, and executing it.
  • the acquisition unit 20 acquires a captured image I obtained by photographing an object, and a reflection intensity distribution L on the surface of the object.
  • the captured image I is captured by, for example, a digital camera.
  • the reflection intensity distribution L is measured by a LiDAR sensor as described above. Note that the captured image I and the reflection intensity distribution L are each obtained by photographing and measuring the same object.
  • the estimation unit 22 estimates at least one of a reflectance component R that is independent of the illumination light in the captured image I and a shading component S that is dependent on the illumination light in the captured image I, based on the captured image I and the reflection intensity distribution L acquired by the acquisition unit 20. That is, the estimation unit 22 estimates at least one of the reflectance component R and the shading component S for each pixel of the captured image I.
  • The estimation unit 22 estimates at least one of the reflectance component R and the shading component S using the intrinsic image decomposition model 30.
  • FIG. 3 shows the input and output of the intrinsic image decomposition model 30 according to this embodiment.
  • The intrinsic image decomposition model 30 is a model trained in advance to receive the captured image I and the reflection intensity distribution L as input and to output the reflectance component R and the shading component S.
  • The estimation unit 22 obtains the reflectance component R and the shading component S of the captured image I by inputting the captured image I and the reflection intensity distribution L acquired by the acquisition unit 20 into the intrinsic image decomposition model 30.
  • The intrinsic image decomposition model 30 includes an encoder $E^c_R$ that encodes the reflectance component R into a domain-independent latent space, an encoder $E^p_R$ that encodes the reflectance component R into a domain-dependent latent space, and a decoder $D_R$ that decodes the outputs of these two encoders into the reflectance component R.
  • The intrinsic image decomposition model 30 also includes an encoder $E^c_I$ that encodes the combination of the captured image I and the reflection intensity distribution L into a domain-independent latent space, an encoder $E^p_I$ that encodes the same combination into a domain-dependent latent space, and a decoder $D_I$ that decodes the outputs of $E^c_I$ and $E^p_I$ into the captured image I and the reflection intensity distribution L.
  • The intrinsic image decomposition model 30 further includes an encoder $E^c_S$ that encodes the shading component S into a domain-independent latent space, an encoder $E^p_S$ that encodes the shading component S into a domain-dependent latent space, and a decoder $D_S$ that decodes the outputs of these two encoders into the shading component S.
  • The estimation unit 22 inputs the combined captured image I and reflection intensity distribution L to the encoders $E^c_I$ and $E^p_I$, and inputs the outputs of the encoders $E^c_I$ and $E^p_R$ to the decoder $D_R$ to obtain an estimation result R' of the reflectance component R of the captured image I.
  • Here, the output of the encoder $E^p_R$ is obtained by mapping the output of the encoder $E^p_I$ to the domain of the reflectance component R.
  • The estimation unit 22 inputs the outputs of the encoders $E^c_I$ and $E^p_S$ to the decoder $D_S$ to obtain an estimation result S' of the shading component S of the captured image I.
  • Here, the output of the encoder $E^p_S$ is obtained by mapping the output of the encoder $E^p_I$ to the domain of the shading component S.
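  • The inference path described above can be sketched as follows. This is only a data-flow illustration under stated assumptions: the random per-pixel linear maps stand in for trained networks, and the names `linear`, `to_p_R`, `to_p_S`, and `estimate`, the latent width, and the channel-wise concatenation used for "combining" are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
H, W, Z = 4, 6, 8  # toy image size and latent width (illustrative values)


def linear(in_dim, out_dim):
    """Toy per-pixel linear layer standing in for a trained network."""
    w = rng.normal(scale=0.1, size=(in_dim, out_dim))
    return lambda x: x @ w


# Encoders for the fused input (3 RGB channels + 1 intensity channel).
E_c_I = linear(4, Z)   # domain-independent code
E_p_I = linear(4, Z)   # domain-dependent code
# Mappings of the input's domain-dependent code into the R and S domains.
to_p_R = linear(Z, Z)
to_p_S = linear(Z, Z)
# Decoders consume [domain-independent code, mapped domain-dependent code].
D_R = linear(2 * Z, 3)  # reflectance estimate R'
D_S = linear(2 * Z, 1)  # shading estimate S'


def estimate(I, L):
    x = np.concatenate([I, L[..., None]], axis=-1)  # combine before encoding
    c, p = E_c_I(x), E_p_I(x)
    R_est = D_R(np.concatenate([c, to_p_R(p)], axis=-1))
    S_est = D_S(np.concatenate([c, to_p_S(p)], axis=-1))[..., 0]
    return R_est, S_est


I = rng.random((H, W, 3))
L = rng.random((H, W))
R_est, S_est = estimate(I, L)
```

  • The point of the sketch is the wiring: both decoders share the domain-independent code of the input, while each receives its own mapped version of the domain-dependent code.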
  • The intrinsic image decomposition model 30 is preferably trained by unsupervised learning using a training image, a training reflection intensity distribution, a training reflectance image made up of reflectance components, and a training shading image made up of shading components.
  • The training image, the training reflectance image, and the training shading image may be independent of each other.
  • That is, the source images underlying the training reflectance image and the training shading image need not be the same image, and neither needs to match the training image.
  • The training image and the training reflection intensity distribution are obtained by photographing and measuring the same object.
  • As the intrinsic image decomposition model 30, for example, a VAE-based unsupervised learning model as described in Non-Patent Document 1 can be applied, but the model is not limited to this.
  • The intrinsic image decomposition model 30 may be based on a DNN (Deep Neural Network).
  • Alternatively, supervised learning may be performed using multiple sets of training data, each combining a training captured image, a training reflection intensity distribution, a training reflectance image, and a training shading image.
  • Each combination pairs a training captured image and a training reflection intensity distribution, obtained by photographing and measuring the same object, with the training reflectance image and training shading image that are the ground truth for that pair.
  • The intrinsic image decomposition model 30 has been described as combining the captured image I and the reflection intensity distribution L before encoding, but it is not limited to this.
  • The captured image I and the reflection intensity distribution L may instead be combined after encoding.
  • The intrinsic image decomposition model 30 includes an encoder $E^c_R$, an encoder $E^p_R$, and a decoder $D_R$. Furthermore, the intrinsic image decomposition model 30 includes an encoder $E^c_I$ that encodes the captured image I into a domain-independent latent space, an encoder $E^p_I$ that encodes the captured image I into a domain-dependent latent space, an encoder $E^c_L$ that encodes the reflection intensity distribution L into a domain-independent latent space, and an encoder $E^p_L$ that encodes the reflection intensity distribution L into a domain-dependent latent space.
  • The intrinsic image decomposition model 30 includes a decoder $D_I$ that decodes the combined output of the encoders $E^c_I$ and $E^c_L$ and the combined output of the encoders $E^p_I$ and $E^p_L$ into the captured image I and the reflection intensity distribution L. Furthermore, the intrinsic image decomposition model 30 includes an encoder $E^c_S$, an encoder $E^p_S$, and a decoder $D_S$.
  • The estimation unit 22 inputs the captured image I to the encoders $E^c_I$ and $E^p_I$, inputs the reflection intensity distribution L to the encoders $E^c_L$ and $E^p_L$, and inputs the outputs of the encoders $E^c_I$, $E^c_L$, and $E^p_R$ to the decoder $D_R$ to obtain an estimation result R' of the reflectance component R of the captured image I.
  • Here, the output of the encoder $E^p_R$ is obtained by mapping the output of the encoder $E^p_I$ to the domain of the reflectance component R.
  • The estimation unit 22 inputs the outputs of the encoders $E^c_I$, $E^c_L$, and $E^p_S$ to the decoder $D_S$ to obtain an estimation result S' of the shading component S of the captured image I.
  • Here, the output of the encoder $E^p_S$ is obtained by mapping the output of the encoder $E^p_I$ to the domain of the shading component S.
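  • The variant that combines the two inputs after encoding can be sketched in the same toy style. As before, the random linear maps are stand-ins for trained networks, and `linear`, `to_p_R`, `to_p_S`, and `estimate_late_fusion` are hypothetical names. The encoder for the domain-dependent code of L is omitted because, in the description above, it feeds only the reconstruction decoder for I and L.

```python
import numpy as np

rng = np.random.default_rng(1)
Z = 8  # latent width (illustrative)


def linear(in_dim, out_dim):
    """Toy per-pixel linear layer standing in for a trained network."""
    w = rng.normal(scale=0.1, size=(in_dim, out_dim))
    return lambda x: x @ w


E_c_I, E_p_I = linear(3, Z), linear(3, Z)   # encoders for the captured image I
E_c_L = linear(1, Z)                        # domain-independent encoder for L
to_p_R, to_p_S = linear(Z, Z), linear(Z, Z)
D_R, D_S = linear(3 * Z, 3), linear(3 * Z, 1)


def estimate_late_fusion(I, L):
    c_I = E_c_I(I)               # image code
    c_L = E_c_L(L[..., None])    # intensity code, encoded separately
    p_I = E_p_I(I)
    # The codes are combined only after encoding, at the decoder input.
    R_est = D_R(np.concatenate([c_I, c_L, to_p_R(p_I)], axis=-1))
    S_est = D_S(np.concatenate([c_I, c_L, to_p_S(p_I)], axis=-1))[..., 0]
    return R_est, S_est


R_est, S_est = estimate_late_fusion(rng.random((5, 7, 3)), rng.random((5, 7)))
```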
  • The estimation unit 22 may control the display unit 16 to display, as an image, at least one of the reflectance component R and the shading component S of the captured image I obtained as described above.
  • The learning unit 24 trains the intrinsic image decomposition model 30 using a loss function that includes at least one of two errors: (1) the error between the estimated reflectance component R masked according to the reflection intensity distribution L and the reflection intensity distribution L masked in the same way, and (2) the error between the estimated shading component S masked according to the reflection intensity distribution L and the ratio between the captured image I and the reflection intensity distribution L, masked in the same way.
  • $T_{r1} = \lambda_r \cdot g(M \odot h(R),\ M \odot (L \cdot \alpha + \beta))$
  • $\lambda_r$ is a weight related to the estimation error of the reflectance component R.
  • $g(a, b)$ is a function measuring the error between a and b, such as the mean squared error (MSE), root mean squared error (RMSE), or mean absolute error (MAE).
  • $h(a)$ is a function that converts a to grayscale.
  • $\alpha$ and $\beta$ are parameters for aligning the scale and bias of the reflection intensity distribution L obtained by a LiDAR sensor or the like with those of the reflectance component R, and may be updated during the learning process.
  • $M$ is a mask for the reflection intensity: M is 1 when a reflection intensity corresponding to the pixel exists, and 0 when none exists.
  • The resolution of a LiDAR sensor is lower than that of a camera, so a reflection intensity cannot always be projected onto every pixel of the captured image I; the mask accounts for this.
  • $M \odot h(R)$ is the result of masking the grayscaled reflectance component R according to the reflection intensity distribution L.
  • $M \odot (L \cdot \alpha + \beta)$ is the result of masking the reflection intensity distribution L, corrected by the parameters $\alpha$ and $\beta$, according to the reflection intensity distribution L.
  • The learning unit 24 may add the following term $T_{s1}$ relating to the shading component S to the loss function.
  • $T_{s1} = \lambda_s \cdot g(M \odot S,\ M \odot (I_v / (L \cdot \gamma + \delta)))$
  • $\lambda_s$ is a weight related to the estimation error of the shading component S.
  • $I_v$ is the pixel value (brightness value) of the captured image I.
  • $\gamma$ and $\delta$ are parameters for aligning the scale and bias of the reflection intensity distribution L obtained by a LiDAR sensor or the like with those of the pixel value $I_v$ of the captured image I, and may be updated during the learning process.
  • $M \odot S$ is the result of masking the shading component S according to the reflection intensity distribution L.
  • $M \odot (I_v / (L \cdot \gamma + \delta))$ is the result of masking the ratio between the pixel value $I_v$ of the captured image I and the reflection intensity distribution L, corrected by the parameters $\gamma$ and $\delta$, according to the reflection intensity distribution L.
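  • A minimal NumPy sketch of the two masked loss terms follows. It makes several assumptions that the text leaves open: g is taken to be the MSE, h averages the color channels, and the scale and bias parameters are given the placeholder names `alpha`, `beta` (reflectance term) and `gamma`, `delta` (shading term). The function name `loss_terms` is hypothetical.

```python
import numpy as np


def h(a):
    """Grayscale by averaging the channel axis (one possible choice of h)."""
    return a.mean(axis=-1)


def g(a, b):
    """Error function g(a, b); the MSE is assumed here."""
    return float(np.mean((a - b) ** 2))


def loss_terms(R, S, Iv, L, M, alpha=1.0, beta=0.0, gamma=1.0, delta=0.0,
               lam_r=1.0, lam_s=1.0):
    L_r = L * alpha + beta   # L aligned to the reflectance scale
    L_s = L * gamma + delta  # L aligned to the pixel-value scale
    # Divide only where a LiDAR return exists; other pixels stay at 0.
    ratio = np.divide(Iv, L_s, out=np.zeros_like(Iv, dtype=float), where=M > 0)
    Tr1 = lam_r * g(M * h(R), M * L_r)
    Ts1 = lam_s * g(M * S, M * ratio)
    return Tr1, Ts1


# Toy 2x2 example: LiDAR returns exist only on the diagonal.
M = np.array([[1.0, 0.0], [0.0, 1.0]])
L = np.array([[0.5, 0.0], [0.0, 0.25]])
R = np.repeat(L[..., None], 3, axis=-1)          # reflectance that matches L
Iv = np.array([[0.25, 0.3], [0.1, 0.125]])
S = np.array([[0.5, 0.7], [0.2, 0.5]])           # equals Iv / L on the diagonal
Tr1, Ts1 = loss_terms(R, S, Iv, L, M)
```

  • In this toy case both terms vanish because R and S agree with the LiDAR-derived targets wherever the mask is 1; pixels without a return contribute nothing regardless of their values.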
  • a RAW image is image data that is output directly from an imaging element such as a CCD (Charge Coupled Device) or CMOS (Complementary Metal Oxide Semiconductor) equipped in the photographing device.
  • The learning unit 24 may add to the loss function at least one of a term $T_{r2}$ relating to the reflectance component R and a term $T_{s2}$ relating to the shading component S, both of which take into account the various image processing performed in the imaging device.
  • $T_{r2} = \lambda_r \cdot g(M \odot h(f(R)),\ M \odot (L \cdot \alpha + \beta))$
  • $T_{s2} = \lambda_s \cdot g(M \odot S,\ M \odot f(I_v / (L \cdot \gamma + \delta)))$
  • $f(a)$ is a function that undoes the various image processing performed in the imaging device, such as the inverse transformation of gamma correction. That is, $M \odot h(f(R))$ is the result of masking the restored and grayscaled reflectance component R according to the reflection intensity distribution L. $M \odot (L \cdot \alpha + \beta)$ is the result of masking the reflection intensity distribution L, corrected with the parameters $\alpha$ and $\beta$, according to the reflection intensity distribution L. $M \odot S$ is the result of masking the shading component S according to the reflection intensity distribution L. $M \odot f(I_v / (L \cdot \gamma + \delta))$ is the result of masking the restored ratio between the pixel value $I_v$ of the captured image I and the reflection intensity distribution L, corrected with the parameters $\gamma$ and $\delta$, according to the reflection intensity distribution L.
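  • The processing-aware terms can be sketched in the same way. Here f is assumed to be a plain power-law inverse of gamma correction with exponent 2.2, g is assumed to be the MSE, and h averages the color channels; the names `raw_aware_terms` and `gamma_exp` and the parameter names `alpha`, `beta`, `gamma`, `delta` are placeholders for this example.

```python
import numpy as np


def f(a, gamma_exp=2.2):
    """Assumed restoration function: undo a display gamma of 1/2.2."""
    return np.clip(a, 0.0, None) ** gamma_exp


def mse(a, b):
    return float(np.mean((a - b) ** 2))


def raw_aware_terms(R, S, Iv, L, M, alpha=1.0, beta=0.0, gamma=1.0, delta=0.0,
                    lam_r=1.0, lam_s=1.0):
    # Ratio of pixel value to aligned intensity, only where a return exists.
    ratio = np.divide(Iv, L * gamma + delta,
                      out=np.zeros_like(Iv, dtype=float), where=M > 0)
    Tr2 = lam_r * mse(M * f(R).mean(axis=-1), M * (L * alpha + beta))
    Ts2 = lam_s * mse(M * S, M * f(ratio))
    return Tr2, Ts2


# One-pixel example constructed so that both terms vanish.
M = np.ones((1, 1))
L = np.array([[0.25]])
R = np.full((1, 1, 3), 0.25 ** (1 / 2.2))   # gamma-encoded reflectance
Iv = np.array([[0.125]])
S = np.array([[0.5 ** 2.2]])                # equals f(Iv / L)
Tr2, Ts2 = raw_aware_terms(R, S, Iv, L, M)
```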
  • The learning unit 24 may also perform self-supervised learning on the intrinsic image decomposition model 30.
  • FIG. 6 is a flowchart showing the flow of image processing by the image processing device 10.
  • the image processing shown in FIG. 6 is performed by the CPU 11 reading out an image processing program from the ROM 12 or storage 14, expanding it into the RAM 13, and executing it.
  • In step S10, the CPU 11, functioning as the acquisition unit 20, acquires a captured image obtained by photographing an object.
  • In step S12, the CPU 11, functioning as the acquisition unit 20, acquires the reflection intensity distribution on the surface of the object.
  • In step S14, the CPU 11, functioning as the estimation unit 22, estimates at least one of the reflectance component and the shading component in the captured image, based on the captured image acquired in step S10 and the reflection intensity distribution acquired in step S12.
  • The image processing then ends.
  • the acquisition unit 20 acquires a captured image obtained by capturing an image of an object and a reflection intensity distribution on the surface of the object. Furthermore, the estimation unit 22 estimates at least one of a reflectance component in the captured image that is independent of the illumination light and a shading component in the captured image that is dependent on the illumination light, based on the captured image and the reflection intensity distribution. According to the image processing device according to this embodiment, at least one of the reflectance component and the shading component in the image can be accurately estimated by using the reflection intensity.
  • the same functions and configurations as those of the image processing device 10 according to the first embodiment are denoted by the same reference numerals, and detailed description thereof will be omitted.
  • the image processing device 10 according to the second embodiment has the same hardware configuration (see FIG. 1) and functional configuration (see FIG. 2) as those of the first embodiment.
  • the acquisition unit 20 acquires a captured image I obtained by capturing an image of an object, and a reflection intensity distribution L on the surface of the object.
  • the acquisition unit 20 also acquires a depth distribution D on the surface of the object.
  • the depth distribution D can be obtained from the three-dimensional coordinates measured by the LiDAR sensor as described above.
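  • One way to turn LiDAR points into per-pixel maps is sketched below. It assumes the points are already expressed in the camera coordinate frame, uses a pinhole camera with hypothetical intrinsics `fx`, `fy`, `cx`, `cy`, and takes the z coordinate as depth; none of these details are specified in the text. The function name `project_lidar` is also hypothetical. Besides the depth distribution D, the sketch fills the reflection intensity distribution L and a mask M that marks pixels with a LiDAR return.

```python
import numpy as np


def project_lidar(points_xyz, intensity, fx, fy, cx, cy, height, width):
    """Project LiDAR points into the image; keep the nearest point per pixel."""
    L = np.zeros((height, width))
    D = np.zeros((height, width))
    M = np.zeros((height, width))
    for (x, y, z), value in zip(points_xyz, intensity):
        if z <= 0:
            continue  # point is behind the camera
        u = int(round(fx * x / z + cx))  # column index
        v = int(round(fy * y / z + cy))  # row index
        if not (0 <= u < width and 0 <= v < height):
            continue  # point falls outside the image
        if M[v, u] == 0 or z < D[v, u]:
            L[v, u], D[v, u], M[v, u] = value, z, 1.0
    return L, D, M


points = np.array([[0.0, 0.0, 2.0],    # far point on the optical axis
                   [0.0, 0.0, 1.0],    # nearer point on the same pixel
                   [1.0, 0.0, 1.0],    # lands one pixel to the right
                   [0.0, 0.0, -1.0]])  # behind the camera, ignored
values = np.array([0.1, 0.9, 0.5, 0.7])
L, D, M = project_lidar(points, values, fx=1.0, fy=1.0, cx=1.0, cy=1.0,
                        height=3, width=3)
```

  • Pixels that receive no point keep a mask value of 0, which is how the sparsity of the LiDAR measurements relative to the camera resolution shows up in the data.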
  • The estimation unit 22 estimates at least one of the reflectance component R and the shading component S in the captured image I based on the captured image I, the reflection intensity distribution L, and the depth distribution D acquired by the acquisition unit 20. Specifically, the estimation unit 22 estimates at least one of the reflectance component R and the shading component S using the intrinsic image decomposition model 30 shown in FIG. 7.
  • FIG. 7 shows the input and output of the intrinsic image decomposition model 30 according to this embodiment.
  • The intrinsic image decomposition model 30 according to this embodiment is a model trained in advance to receive the depth distribution D in addition to the captured image I and the reflection intensity distribution L as input, and to output the reflectance component R and the shading component S.
  • The estimation unit 22 obtains the reflectance component R and the shading component S of the captured image I by inputting the captured image I, the reflection intensity distribution L, and the depth distribution D acquired by the acquisition unit 20 into the intrinsic image decomposition model 30.
  • The intrinsic image decomposition model 30 includes an encoder $E^c_R$ that encodes the reflectance component R into a domain-independent latent space, an encoder $E^p_R$ that encodes the reflectance component R into a domain-dependent latent space, and a decoder $D_R$ that decodes the outputs of these two encoders into the reflectance component R.
  • The intrinsic image decomposition model 30 also includes an encoder $E^c_I$ that encodes the combination of the captured image I, the reflection intensity distribution L, and the depth distribution D into a domain-independent latent space, an encoder $E^p_I$ that encodes the same combination into a domain-dependent latent space, and a decoder $D_I$ that decodes the outputs of $E^c_I$ and $E^p_I$ into the captured image I, the reflection intensity distribution L, and the depth distribution D.
  • The intrinsic image decomposition model 30 further includes an encoder $E^c_S$ that encodes the shading component S into a domain-independent latent space, an encoder $E^p_S$ that encodes the shading component S into a domain-dependent latent space, and a decoder $D_S$ that decodes the outputs of these two encoders into the shading component S.
  • The estimation unit 22 inputs the combined captured image I, reflection intensity distribution L, and depth distribution D to the encoders $E^c_I$ and $E^p_I$, and inputs the outputs of the encoders $E^c_I$ and $E^p_R$ to the decoder $D_R$ to obtain an estimation result R' of the reflectance component R of the captured image I.
  • Here, the output of the encoder $E^p_R$ is obtained by mapping the output of the encoder $E^p_I$ to the domain of the reflectance component R.
  • The estimation unit 22 inputs the outputs of the encoders $E^c_I$ and $E^p_S$ to the decoder $D_S$ to obtain an estimation result S' of the shading component S of the captured image I.
  • Here, the output of the encoder $E^p_S$ is obtained by mapping the output of the encoder $E^p_I$ to the domain of the shading component S.
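  • The text does not specify how the three inputs are combined. A simple possibility, assumed here purely for illustration, is channel-wise concatenation into a five-channel array; `fuse_inputs` is a hypothetical helper name.

```python
import numpy as np


def fuse_inputs(I, L, D):
    """Stack RGB image, reflection intensity, and depth into an (H, W, 5) array."""
    if I.shape[:2] != L.shape or L.shape != D.shape:
        raise ValueError("I, L and D must share the same height and width")
    return np.concatenate([I, L[..., None], D[..., None]], axis=-1)


x = fuse_inputs(np.zeros((2, 3, 3)), np.ones((2, 3)), np.full((2, 3), 2.0))
```

  • With this layout, channels 0 to 2 hold the image, channel 3 the reflection intensity, and channel 4 the depth, so the same encoders can consume all three measurements at once.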
  • FIG. 9 is a flowchart showing the flow of image processing by the image processing device 10.
  • the image processing shown in FIG. 9 is performed by the CPU 11 reading out an image processing program from the ROM 12 or storage 14, expanding it into the RAM 13, and executing it.
  • In step S20, the CPU 11, functioning as the acquisition unit 20, acquires a captured image obtained by photographing an object.
  • In step S22, the CPU 11, functioning as the acquisition unit 20, acquires a reflection intensity distribution on the surface of the object.
  • In step S24, the CPU 11, functioning as the acquisition unit 20, acquires a depth distribution on the surface of the object.
  • In step S26, the CPU 11, functioning as the estimation unit 22, estimates at least one of the reflectance component and the shading component in the captured image acquired in step S20, based on the captured image, reflection intensity distribution, and depth distribution acquired in steps S20 to S24.
  • The image processing then ends.
  • In the image processing device 10 according to this embodiment, the acquisition unit 20 further acquires the depth distribution on the surface of the object.
  • The estimation unit 22 estimates at least one of the reflectance component and the shading component based on the captured image, the reflection intensity distribution, and the depth distribution. According to the image processing device according to this embodiment, by also taking the depth distribution into account, the shading component in the image can be estimated with high accuracy.
  • the same functions and configurations as those of the image processing device 10 according to the first and second embodiments are denoted by the same reference numerals, and detailed description thereof will be omitted.
  • the image processing device 10 according to the third embodiment has the same hardware configuration (see FIG. 1) as that of the first embodiment.
  • a RAW image is image data that is output directly from an image sensor such as a CCD or CMOS that is equipped in the photographing device.
  • If the captured image I, which has undergone various image processing applied to the RAW image, is used as is to estimate the reflectance component R and the shading component S, the accuracy of the estimation may decrease.
  • The image processing device 10 therefore restores the captured image I to a RAW image, that is, to its state before image processing such as gamma correction, and estimates at least one of the reflectance component R and the shading component S based on the restored RAW image, thereby improving the accuracy of the estimation.
  • FIG. 10 is a block diagram showing an example of the functional configuration of the image processing device 10.
  • the image processing device 10 of this embodiment includes, as its functional configuration, an acquisition unit 20, an estimation unit 22, a learning unit 24, and a restoration unit 26.
  • Each functional configuration is realized by the CPU 11 reading out an image processing program stored in the ROM 12 or storage 14, expanding it in the RAM 13, and executing it.
  • the acquisition unit 20 acquires a captured image I obtained by photographing an object, and a reflection intensity distribution L on the surface of the object.
  • the restoration unit 26 restores the captured image I acquired by the acquisition unit 20 to a RAW image Ir.
  • as the method of restoring the RAW image Ir, a known method can be applied as appropriate.
  • the restoration unit 26 may restore the captured image I to a RAW image Ir using an inverse conversion function of various image processing (e.g., gamma correction) performed on a RAW image output from an image sensor in a photographing device for the captured image I.
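As one illustration of such an inverse conversion, the sketch below undoes a simple power-law gamma correction. It assumes the photographing device applied out = in^(1/gamma) with gamma = 2.2 to values normalized to [0, 1]; the function name `restore_raw`, the gamma value, and the 8-bit range are assumptions made for illustration and are not taken from the disclosure. A real camera pipeline applies further steps (white balance, demosaicing, tone mapping) that this sketch does not invert.

```python
def restore_raw(image, gamma=2.2, max_value=255.0):
    """Approximate a linear RAW-like image Ir from a gamma-corrected image I.

    image: 2D list of pixel values in [0, max_value].
    gamma: assumed gamma; the forward correction is taken to be
           out = in ** (1 / gamma) on values normalized to [0, 1].
    """
    restored = []
    for row in image:
        # Normalize, then apply the inverse of the assumed gamma curve.
        restored.append([(v / max_value) ** gamma for v in row])
    return restored


# Hypothetical 2x2 captured image I with 8-bit brightness values.
captured = [[0, 128], [186, 255]]
raw_like = restore_raw(captured)
```

Plain nested lists keep the sketch dependency-free; an array library would normally be used for full-size images.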
  • the estimation unit 22 estimates at least one of the reflectance component R and the shading component S based on the RAW image Ir restored by the restoration unit 26 and the reflection intensity distribution L, instead of the captured image I. Specifically, the estimation unit 22 estimates at least one of the reflectance component R and the shading component S using the intrinsic image decomposition model 30 shown in FIG. 11.
  • FIG. 11 shows the input and output of the intrinsic image decomposition model 30 according to this embodiment.
  • the intrinsic image decomposition model 30 according to this embodiment is a trained model that has been trained in advance to receive, as input, the RAW image Ir restored from the captured image I by the restoration unit 26 and the reflection intensity distribution L, and to output the reflectance component R and the shading component S.
  • the estimation unit 22 obtains the reflectance component R and the shading component S of the RAW image Ir by inputting the RAW image Ir restored by the restoration unit 26 and the reflection intensity distribution L to the intrinsic image decomposition model 30.
  • FIG. 12 is a flowchart showing the flow of image processing by the image processing device 10.
  • the image processing shown in FIG. 12 is performed by the CPU 11 reading out an image processing program from the ROM 12 or storage 14, expanding it into the RAM 13, and executing it.
  • In step S30, the CPU 11, functioning as the acquisition unit 20, acquires a captured image obtained by photographing an object.
  • In step S32, the CPU 11, functioning as the acquisition unit 20, acquires a reflection intensity distribution on the surface of the object.
  • In step S34, the CPU 11, functioning as the restoration unit 26, restores the captured image acquired in step S30 to a RAW image.
  • In step S36, the CPU 11, functioning as the estimation unit 22, estimates at least one of the reflectance component and the shadow component in the RAW image based on the reflection intensity distribution acquired in step S32 and the RAW image restored in step S34.
  • the restoration unit 26 restores the captured image to a RAW image. Furthermore, the estimation unit 22 estimates at least one of the reflectance component and the shading component based on the RAW image restored by the restoration unit 26 and the reflection intensity distribution, instead of the captured image. According to the image processing device according to this embodiment, by using the RAW image, it is possible to more accurately estimate at least one of the reflectance component and the shading component in the image.
  • the same functions and configurations as those of the image processing device 10 according to the first to third embodiments are denoted by the same reference numerals, and detailed description thereof will be omitted.
  • the image processing device 10 according to the fourth embodiment has the same hardware configuration (see FIG. 1) as that of the first embodiment.
  • FIG. 13 is a block diagram showing an example of the functional configuration of the image processing device 10.
  • the image processing device 10 of this embodiment includes, as its functional configuration, an acquisition unit 20, an estimation unit 22, a learning unit 24, and a correction unit 28.
  • Each functional configuration is realized by the CPU 11 reading out an image processing program stored in the ROM 12 or storage 14, expanding it in the RAM 13, and executing it.
  • the acquisition unit 20 acquires a captured image I obtained by photographing an object, and a reflection intensity distribution L on the surface of the object.
  • the correction unit 28 corrects the captured image I based on the reflection intensity distribution L acquired by the acquisition unit 20, and generates a corrected image Ic masked in accordance with the reflection intensity distribution L. Specifically, the correction unit 28 generates the corrected image Ic by replacing the pixel value (brightness value) Iv of each pixel of the captured image I with the following pixel value Icv.
  • Icv = M × (L × α + β) / Iv
  • α and β are parameters for aligning the scale and bias of the reflection intensity distribution L obtained by a LiDAR sensor or the like with the pixel value Iv of the captured image I.
  • M is a mask for the reflection intensity: M is 1 when there is a reflection intensity corresponding to the pixel, and 0 when there is none.
  • the resolution of a LiDAR sensor is lower than that of a camera, so a reflection intensity cannot always be projected onto every pixel of the captured image I; the mask is provided for this reason.
  • M × (L × α + β) / Iv is the ratio between the reflection intensity distribution L, corrected by the parameters α and β, and the pixel value Iv of the captured image I, masked according to the reflection intensity distribution L.
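The per-pixel correction described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the function name `corrected_image`, the use of `None` to mark pixels without a LiDAR return, the argument names `alpha` and `beta` for the scale and bias parameters, and the `eps` guard against division by zero are all assumptions.

```python
def corrected_image(captured, intensity, alpha, beta, eps=1e-6):
    """Compute Icv = M * (L * alpha + beta) / Iv for every pixel.

    captured:  2D list of pixel values Iv of the captured image I.
    intensity: 2D list of projected reflection intensities L, with None
               where no LiDAR return maps to the pixel (assumed encoding).
    eps:       assumed guard against division by zero for black pixels.
    """
    corrected = []
    for iv_row, l_row in zip(captured, intensity):
        out_row = []
        for iv, lv in zip(iv_row, l_row):
            if lv is None:
                out_row.append(0.0)  # M = 0: no reflection intensity here
            else:
                out_row.append((lv * alpha + beta) / max(iv, eps))  # M = 1
        corrected.append(out_row)
    return corrected


# Hypothetical data: only two of the four pixels have a LiDAR return.
image = [[0.5, 0.25], [0.8, 0.4]]
lidar = [[0.2, None], [None, 0.1]]
Ic = corrected_image(image, lidar, alpha=2.0, beta=0.1)
```

Because half of the pixels in this toy input have no return, half of `Ic` is zero, which illustrates how sparse the corrected image can become.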
  • the estimation unit 22 estimates at least one of the reflectance component R and the shading component S based on the captured image I and the corrected image Ic generated by the correction unit 28. Specifically, the estimation unit 22 estimates at least one of the reflectance component R and the shading component S using the intrinsic image decomposition model 30 shown in FIG. 14.
  • FIG. 14 shows the input and output of the intrinsic image decomposition model 30 according to this embodiment.
  • the intrinsic image decomposition model 30 according to this embodiment is a trained model that has been trained in advance to receive the corrected image Ic generated by the correction unit 28 and the captured image I as inputs, and to output the reflectance component R and the shading component S.
  • the corrected image Ic may end up being a sparse image. Therefore, it is preferable for the intrinsic image decomposition model 30 according to this embodiment to receive both the corrected image Ic and the captured image I as input.
  • the estimation unit 22 inputs the captured image I and the corrected image Ic to the intrinsic image decomposition model 30 to obtain the reflectance component R and the shading component S of the captured image I.
  • FIG. 15 shows a detailed configuration example of the intrinsic image decomposition model 30 according to this embodiment.
  • the intrinsic image decomposition model 30 includes an encoder E^c_R, an encoder E^p_R, and a decoder D_R.
  • the intrinsic image decomposition model 30 includes an encoder E^c_I that encodes the captured image I into a domain-independent latent space, an encoder E^p_I that encodes the captured image I into a domain-dependent latent space, an encoder E^c_L that encodes the corrected image Ic into a domain-independent latent space, and an encoder E^p_L that encodes the corrected image Ic into a domain-dependent latent space.
  • the intrinsic image decomposition model 30 includes a decoder D_L that decodes the combined output of the encoders E^c_I and E^c_L and the combined output of the encoders E^p_I and E^p_L into the captured image I and the corrected image Ic. Furthermore, the intrinsic image decomposition model 30 includes an encoder E^c_S, an encoder E^p_S, and a decoder D_S.
  • the estimation unit 22 inputs the captured image I to the encoders E^c_I and E^p_I, inputs the corrected image Ic to the encoders E^c_L and E^p_L, and inputs the outputs of the encoders E^c_I, E^c_L, and E^p_R to the decoder D_R to obtain an estimation result R' of the reflectance component R of the captured image I.
  • the output of the encoder E^p_R is obtained by mapping the output of the encoder E^p_I to the domain of the reflectance component R.
  • the estimation unit 22 inputs the outputs of the encoders E^c_I, E^c_L, and E^p_S to the decoder D_S to obtain an estimation result S' of the shadow component S of the captured image I.
  • the output of the encoder E^p_S is obtained by mapping the output of the encoder E^p_I to the domain of the shadow component S.
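The data flow at estimation time can be summarized by the wiring sketch below. Every callable is a toy placeholder rather than a trained network, and the names `estimate`, `enc`, `dec`, `to_R`, and `to_S` are hypothetical. List concatenation stands in for whatever feature fusion the model actually uses, which the text does not specify. The domain-dependent encoder for the corrected image is omitted because, per the description, its output feeds only the reconstruction decoder.

```python
def estimate(image, corrected, enc, dec, to_R, to_S):
    """Wire placeholder encoders and decoders to obtain (R', S').

    enc:  dict of callables: "cI" and "pI" encode the captured image into
          domain-independent and domain-dependent codes; "cL" encodes the
          corrected image into a domain-independent code.
    dec:  dict of callables "R" and "S" taking (content, prior).
    to_R, to_S: map the domain-dependent code of the captured image to the
          domains of the reflectance and shadow components.
    """
    c_i = enc["cI"](image)
    p_i = enc["pI"](image)
    c_l = enc["cL"](corrected)
    content = c_i + c_l  # concatenation as a stand-in for feature fusion
    r_est = dec["R"](content, to_R(p_i))  # estimation result R'
    s_est = dec["S"](content, to_S(p_i))  # estimation result S'
    return r_est, s_est


# Toy stand-ins that only make the routing visible.
enc = {
    "cI": lambda img: [sum(img)],
    "pI": lambda img: [max(img)],
    "cL": lambda img: [sum(img)],
}
dec = {
    "R": lambda content, prior: ("R", content, prior),
    "S": lambda content, prior: ("S", content, prior),
}
r_est, s_est = estimate([1, 2], [3, 4], enc, dec,
                        to_R=lambda p: [v * 2 for v in p],
                        to_S=lambda p: [v + 1 for v in p])
```

Both decoders receive the same domain-independent content; only the mapped domain-dependent code differs between the reflectance and shadow branches.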
  • the learning unit 24 trains the intrinsic image decomposition model 30 using a loss function that includes at least one of the error between the result of masking the estimated reflectance component R according to the reflection intensity distribution L and the result of masking the reflection intensity distribution L according to the reflection intensity distribution L, and the error between the result of masking the estimated shadow component S according to the reflection intensity distribution L and the result of masking the ratio between the corrected image Ic and the reflection intensity distribution L according to the reflection intensity distribution L.
  • the learning unit 24 may add at least one of a term Tr3 relating to the reflectance component R below and a term Ts3 relating to the shadow component S below as a term of the loss function.
  • Tr3 = λr × g(M × R, M × (L × α + β))
  • Ts3 = λs × g(M × S, M × Icv / (L × α + β))
  • M × R is the result of masking the reflectance component R according to the reflection intensity distribution L.
  • M × (L × α + β) is the result of masking the reflection intensity distribution L, corrected with the parameters α and β, according to the reflection intensity distribution L.
  • M × S is the result of masking the shadow component S according to the reflection intensity distribution L.
  • M × Icv / (L × α + β) is the result of masking the ratio between the pixel value Icv of the corrected image Ic and the reflection intensity distribution L, corrected with the parameters α and β, according to the reflection intensity distribution L.
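A minimal sketch of how the two loss terms could be evaluated on flattened per-pixel lists is given below. The error function g is not defined in this passage, so mean squared error is used here purely as an assumption; the leading coefficient of each term is treated as a scalar weight (`lam_r`, `lam_s`), and all function and argument names are hypothetical.

```python
def masked(values, mask):
    """Element-wise M * values for flat lists of equal length."""
    return [m * v for m, v in zip(mask, values)]


def mse(a, b):
    """Assumed error function g: mean squared error between two lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def loss_terms(R, S, Icv, L, M, alpha, beta, lam_r=1.0, lam_s=1.0):
    """Return (Tr3, Ts3) for flattened per-pixel lists.

    Pixels with M = 0 contribute zero to both terms; their L entries are
    placeholders and only need to keep L * alpha + beta non-zero.
    """
    scaled = [lv * alpha + beta for lv in L]          # L * alpha + beta
    ratio = [icv / s for icv, s in zip(Icv, scaled)]  # Icv / (L * alpha + beta)
    tr3 = lam_r * mse(masked(R, M), masked(scaled, M))
    ts3 = lam_s * mse(masked(S, M), masked(ratio, M))
    return tr3, ts3


# Hypothetical two-pixel example: only the first pixel has a LiDAR return.
tr3, ts3 = loss_terms(R=[0.5, 0.9], S=[1.0, 0.3], Icv=[1.0, 0.0],
                      L=[0.2, 0.0], M=[1, 0], alpha=2.0, beta=0.1)
```

In this example the estimated reflectance matches the scaled intensity at the only masked-in pixel, so the reflectance term vanishes, while the shadow term penalizes the gap between S and the masked ratio.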
  • the learning unit 24 may also perform self-supervised learning on the intrinsic image decomposition model 30.
  • FIG. 16 is a flowchart showing the flow of image processing by the image processing device 10.
  • the image processing shown in FIG. 16 is performed by the CPU 11 reading out an image processing program from the ROM 12 or storage 14, expanding it into the RAM 13, and executing it.
  • In step S40, the CPU 11, functioning as the acquisition unit 20, acquires a captured image obtained by photographing an object.
  • In step S42, the CPU 11, functioning as the acquisition unit 20, acquires a reflection intensity distribution on the surface of the object.
  • In step S44, the CPU 11, functioning as the correction unit 28, corrects the captured image acquired in step S40 based on the reflection intensity distribution L acquired in step S42, to generate a corrected image.
  • In step S46, the CPU 11, functioning as the estimation unit 22, estimates at least one of the reflectance component and the shadow component in the captured image based on the captured image acquired in step S40 and the corrected image generated in step S44.
  • this image processing ends.
  • the correction unit 28 corrects the captured image based on the reflection intensity distribution. Furthermore, the estimation unit 22 estimates at least one of the reflectance component and the shading component based on the captured image corrected by the correction unit 28. According to the image processing device according to this embodiment, by using a corrected image that reflects the reflection intensity distribution, it is possible to more accurately estimate at least one of the reflectance component and the shading component in the image.
  • the present disclosure may also be implemented in a form that combines the above-described embodiments.
  • the third embodiment and the fourth embodiment may be combined, and the correction unit 28 may correct the RAW image restored by the restoration unit 26.
  • the various processes that the CPU executes by reading software (programs) in each of the above embodiments may be executed by various processors other than the CPU.
  • processors in this case include PLDs (Programmable Logic Devices) such as FPGAs (Field-Programmable Gate Arrays) whose circuit configuration can be changed after manufacture, and dedicated electrical circuits such as ASICs (Application Specific Integrated Circuits), which are processors with circuit configurations designed specifically to execute specific processes.
  • various processes may be executed by one of these various processors, or by a combination of two or more processors of the same or different types (for example, multiple FPGAs, or a combination of a CPU and an FPGA, etc.).
  • the hardware structure of these various processors is, more specifically, an electrical circuit that combines circuit elements such as semiconductor elements.
  • the image processing program is described as being pre-stored (installed) in the storage 14, but this is not limiting.
  • the program may be provided in a form stored in a non-transitory storage medium such as a CD-ROM (Compact Disk Read Only Memory), a DVD-ROM (Digital Versatile Disk Read Only Memory), or a USB (Universal Serial Bus) memory.
  • the program may also be downloaded from an external device via a network.
  • the technology of the present disclosure can also be achieved by appropriately combining the above-described embodiments.
  • the above-described description and illustrations are a detailed explanation of the parts related to the technology of the present disclosure, and are merely one example of the technology of the present disclosure.
  • the above description of the configuration, functions, actions, and effects is an example of the configuration, functions, actions, and effects of the parts related to the technology of the present disclosure. Therefore, unnecessary parts may be deleted, new elements may be added, or replacements may be made to the above-described description and illustrations, within the scope of the gist of the technology of the present disclosure.
  • An image processing device comprising: a memory; and at least one processor coupled to the memory, wherein the processor is configured to: acquire a captured image of an object and a reflection intensity distribution on a surface of the object; and estimate at least one of a reflectance component independent of illumination light in the captured image and a shadow component dependent on illumination light in the captured image, based on the captured image and the reflection intensity distribution.
  • A non-transitory storage medium storing a program executable by a computer to perform image processing, the image processing including: acquiring a captured image of an object and a reflection intensity distribution on a surface of the object; and estimating at least one of a reflectance component independent of illumination light in the captured image and a shadow component dependent on illumination light in the captured image, based on the captured image and the reflection intensity distribution.
  • 10 Image processing device
  • 11 CPU
  • 12 ROM
  • 13 RAM
  • 14 Storage
  • 15 Input unit
  • 16 Display unit
  • 17 Communication I/F
  • 19 Bus
  • 20 Acquisition unit
  • 22 Estimation unit
  • 24 Learning unit
  • 26 Restoration unit
  • 28 Correction unit
  • 30 Intrinsic image decomposition model

PCT/JP2022/042627 2022-11-16 2022-11-16 Image processing device and image processing program Ceased WO2024105829A1 (ja)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2024558581A JPWO2024105829A1 2022-11-16 2022-11-16
PCT/JP2022/042627 WO2024105829A1 (ja) 2022-11-16 2022-11-16 Image processing device and image processing program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2022/042627 WO2024105829A1 (ja) 2022-11-16 2022-11-16 Image processing device and image processing program

Publications (1)

Publication Number Publication Date
WO2024105829A1 (ja) 2024-05-23

Family

ID=91084078

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2022/042627 Ceased WO2024105829A1 (ja) 2022-11-16 2022-11-16 Image processing device and image processing program

Country Status (2)

Country Link
JP (1) JPWO2024105829A1
WO (1) WO2024105829A1

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2026013808A1 (ja) * 2024-07-10 2026-01-15 NTT Inc. Image processing device and image processing program
WO2026013807A1 (ja) * 2024-07-10 2026-01-15 NTT Inc. Image processing device and image processing program

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012001949A1 (ja) * 2010-06-30 2012-01-05 NEC Corporation Color image processing method, color image processing device, and color image processing program
US20180253869A1 (en) * 2017-03-02 2018-09-06 Adobe Systems Incorporated Editing digital images utilizing a neural network with an in-network rendering layer
JP2018535402A (ja) * 2016-01-12 2018-11-29 Mitsubishi Electric Corporation System and method for fusing outputs of sensors having different resolutions
JP2021081791A (ja) * 2019-11-14 2021-05-27 Canon Inc. Image processing device, image processing method, and program

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
DAI PENG, LI ZHUWEN, ZHANG YINDA, LIU SHUAICHENG, ZENG BING: "PBR-Net: Imitating Physically Based Rendering Using Deep Neural Network", IEEE TRANSACTIONS ON IMAGE PROCESSING, IEEE, USA, vol. 29, 1 January 2020 (2020-01-01), USA, pages 5980 - 5992, XP093014501, ISSN: 1057-7149, DOI: 10.1109/TIP.2020.2987169 *

Also Published As

Publication number Publication date
JPWO2024105829A1 2024-05-23

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22965805

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2024558581

Country of ref document: JP

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22965805

Country of ref document: EP

Kind code of ref document: A1