WO2023026543A1 - Information processing device, information processing method, and program - Google Patents

Information processing device, information processing method, and program

Info

Publication number
WO2023026543A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
information
unit
information processing
component
Prior art date
Application number
PCT/JP2022/011543
Other languages
French (fr)
Japanese (ja)
Inventor
久之 館野
Original Assignee
ソニーグループ株式会社 (Sony Group Corporation)
Priority date
Filing date
Publication date
Application filed by ソニーグループ株式会社 (Sony Group Corporation)
Publication of WO2023026543A1

Classifications

    • GPHYSICS
    • G03PHOTOGRAPHY; CINEMATOGRAPHY; ANALOGOUS TECHNIQUES USING WAVES OTHER THAN OPTICAL WAVES; ELECTROGRAPHY; HOLOGRAPHY
    • G03BAPPARATUS OR ARRANGEMENTS FOR TAKING PHOTOGRAPHS OR FOR PROJECTING OR VIEWING THEM; APPARATUS OR ARRANGEMENTS EMPLOYING ANALOGOUS TECHNIQUES USING WAVES OTHER THAN OPTICAL WAVES; ACCESSORIES THEREFOR
    • G03B15/00Special procedures for taking photographs; Apparatus therefor
    • G03B15/02Illuminating scene
    • G03B15/03Combinations of cameras with lighting apparatus; Flash units
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00General purpose image data processing

Definitions

  • the present disclosure relates to an information processing device, an information processing method, and a program.
  • the present disclosure proposes an information processing device, an information processing method, and a program that enable high-quality video shooting.
  • an information processing apparatus according to the present disclosure includes an acquisition unit that acquires an IR image, which is a captured image obtained by irradiating an object with infrared light and which includes a visible light component and an IR component; an extraction unit that extracts IR component information from the IR image; and an image processing unit that performs image processing related to luminance or brightness of the captured image of the target based on the IR component information.
  • FIG. 1 is a diagram showing how visible light is used for moving image shooting. FIG. 2 is a diagram showing how a moving image is captured using an IR light. FIG. 3 is a diagram showing an overview of image processing according to an embodiment of the present disclosure.
  • FIG. 4 is a diagram showing a configuration example of an information processing device according to an embodiment of the present disclosure. FIG. 5 is a diagram showing an example of an infrared illumination unit. FIG. 6 is a diagram showing the frequency characteristics of a filter included in an imaging unit.
  • FIG. 7 is a diagram for explaining basic method 1. FIG. 8 is a flowchart showing image output processing for realizing basic method 1. FIG. 9 is a diagram for explaining basic method 2. FIG. 10 is a flowchart showing image output processing for realizing basic method 2.
  • FIGS. 11 and 12 are diagrams for explaining the advanced method. FIG. 13 is a flowchart showing image output processing for realizing the advanced method. FIG. 14 is a flowchart showing the estimation process.
  • FIG. 15 is a diagram showing an example of a photographing studio of the live-action volumetric photographing system of this embodiment. FIG. 16 is a diagram showing a processing example of the information processing device 10 in the live-action volumetric imaging system. FIG. 17 is a diagram showing a state in which a plurality of visible light lights and a plurality of IR lights are arranged omnidirectionally.
  • FIG. 1 is a diagram showing how visible light is used for moving image shooting.
  • in the example of FIG. 1, a ring-shaped visible light LED (Light Emitting Diode) light is used for illumination.
  • the impression of the image changes depending on how the lighting is applied, so it is difficult to decide how to apply the lighting.
  • when the user wears glasses, there is a problem that the light is reflected on the glasses when the user is lit from the front.
  • FIG. 2 is a diagram showing how moving images are captured using an IR light. Then, the information processing apparatus of the present embodiment performs image processing on the captured image based on the infrared light irradiation information as if the subject were irradiated with visible light. This realizes simple relighting that is effective for close-up scenes.
  • FIG. 3 is a diagram showing an overview of image processing according to this embodiment.
  • An information processing apparatus obtains an IR image obtained by irradiating an object (eg, a user and surrounding objects) with infrared light.
  • An IR image is a captured image containing a visible light component and an IR component obtained by irradiating an object with infrared light.
  • the information processing device performs image processing related to luminance or brightness on the captured image of the target based on the information of the IR component extracted from the IR image.
  • the user can shoot moving images with stable lighting without painful or complicated lighting adjustments.
  • for infrared light irradiation, it is desirable to use an IR ring light in which infrared light emitting elements are arranged in a ring around the lens.
  • a polarizing filter may be used to prevent reflection of light on the glasses.
  • the information processing device 10 is a computer used by the user for video shooting.
  • the information processing device 10 is typically a personal computer, but is not limited to a personal computer.
  • the information processing device 10 may be a mobile terminal such as a mobile phone, a smart device (smartphone or tablet), a PDA (Personal Digital Assistant), or a notebook PC.
  • the information processing device 10 may be a wearable device such as a smart watch.
  • the information processing apparatus 10 may also be an xR device such as an AR (Augmented Reality) device, a VR (Virtual Reality) device, or an MR (Mixed Reality) device.
  • the xR device may be a glasses-type device such as AR glasses or MR glasses, or a head-mounted device such as a VR head-mounted display.
  • the information processing device 10 may also be a portable IoT (Internet of Things) device.
  • the information processing apparatus 10 may be a motorcycle, a mobile relay vehicle, or the like equipped with a communication device such as an FPU (Field Pickup Unit).
  • the information processing device 10 may be a server device such as a PC server, a midrange server, or a mainframe server.
  • in short, the information processing apparatus 10 can employ any form of computer.
  • FIG. 4 is a diagram showing a configuration example of the information processing device 10 according to the embodiment of the present disclosure.
  • the information processing apparatus 10 includes a communication section 11 , a storage section 12 , a control section 13 , an output section 14 , an infrared illumination section 15 , a synchronization signal generation section 16 and an imaging section 17 .
  • the configuration shown in FIG. 4 is a functional configuration, and the hardware configuration may differ from this. Also, the functions of the information processing apparatus 10 may be distributed and implemented in a plurality of physically separated configurations.
  • the communication unit 11 is a communication interface for communicating with other devices.
  • the communication unit 11 is a LAN (Local Area Network) interface such as a NIC (Network Interface Card).
  • the communication unit 11 may be a device connection interface such as USB (Universal Serial Bus).
  • the communication unit 11 may be a wired interface or a wireless interface.
  • the communication unit 11 communicates with an external device under the control of the control unit 13 .
  • the storage unit 12 is a data readable/writable storage device such as a DRAM (Dynamic Random Access Memory), an SRAM (Static Random Access Memory), a flash memory, a hard disk, or the like.
  • the storage unit 12 functions as storage means of the information processing device 10 .
  • the storage unit 12 functions as a frame buffer for moving images captured by the imaging unit 17 .
  • the control unit 13 is a controller that controls each unit of the information processing device 10 .
  • the control unit 13 is implemented by a processor such as a CPU (Central Processing Unit), MPU (Micro Processing Unit), GPU (Graphics Processing Unit), or the like.
  • the control unit 13 is implemented by the processor executing various programs stored in the storage device inside the information processing apparatus 10 using a RAM (Random Access Memory) or the like as a work area.
  • the control unit 13 may be realized by an integrated circuit such as ASIC (Application Specific Integrated Circuit) or FPGA (Field Programmable Gate Array).
  • the control unit 13 includes an acquisition unit 131 , an extraction unit 132 , an image processing unit 133 , an output control unit 134 , a learning unit 135 and an estimation unit 136 .
  • Each block (obtaining unit 131 to estimating unit 136) constituting the control unit 13 is a functional block indicating the function of the control unit 13.
  • These functional blocks may be software blocks or hardware blocks.
  • each of the functional blocks described above may be one software module realized by software (including microprograms), or may be one circuit block on a semiconductor chip (die). Of course, each functional block may be one processor or one integrated circuit.
  • the control unit 13 may be configured in functional units different from the functional blocks described above; the configuration method of the functional blocks is arbitrary.
  • some or all of the blocks (acquisition unit 131 to estimation unit 136) that make up the control unit 13 may be operated by another device. The operation of each block constituting the control unit 13 will be described later.
  • the output unit 14 is a device that performs various outputs such as sound, light, vibration, and images to the outside.
  • the output unit 14 performs various outputs to the user under the control of the control unit 13 .
  • the output unit 14 includes a display device (display unit) that displays various types of information.
  • the display device is, for example, a liquid crystal display or an organic EL display.
  • the output unit 14 may be a touch panel display device. In this case, the output section 14 also functions as an input section.
  • the infrared illumination unit 15 is an IR light (IR illumination light source) that outputs invisible infrared light.
  • the upper limit of the wavelength of light that can be perceived by the human eye is about 760-830 nm, and IR illumination sources on the market mainly use wavelengths such as 850 nm or 940 nm. Therefore, the infrared illuminator 15 is typically an IR light that outputs infrared light with a wavelength of 850 nm or 940 nm. However, the infrared illuminator 15 is not limited to an IR light that outputs infrared light with a wavelength of 850 nm or 940 nm.
  • the infrared illuminator 15 may be capable of outputting infrared light of other wavelengths.
  • FIG. 5 is a diagram showing an example of the infrared illuminator 15. It is desirable that the infrared illuminator 15 be a ring light in order to capture the face clearly.
  • the infrared illumination unit 15 is an IR light in which IR light emitting elements are arranged in a ring shape around a lens.
  • the synchronizing signal generating unit 16 is a synchronizing signal generator that generates a synchronizing signal for synchronizing the blinking period of the infrared illumination unit 15 and the frame period of the video (moving image) captured by the imaging unit 17 .
  • the synchronizing signal generator 16 outputs a synchronizing signal under the control of the control unit 13 .
  • the imaging unit 17 is a conversion unit that converts an optical image into an electrical signal.
  • the imaging unit 17 includes, for example, an image sensor and a signal processing circuit that processes analog pixel signals output from the image sensor, and converts light entering from the lens into digital data (image data).
  • An image captured by the imaging unit 17 is not limited to a video (moving image), and may be a still image. Note that the imaging unit can be rephrased as a camera.
  • the imaging unit 17 of this embodiment is a camera (hereinafter also referred to as an IR camera) that can simultaneously acquire visible light and infrared light (IR light).
  • An IR camera can be realized by removing the IR cut filter normally included in commercially available cameras.
  • FIG. 6 is a diagram showing the frequency characteristics of a filter included in the imaging unit 17. In the example of FIG. 6, the imaging unit 17 is configured to detect infrared light with a wavelength of 850 nm. However, if the infrared illumination unit 15 is a light source that outputs infrared light with a wavelength of 940 nm, the imaging unit 17 may be configured to detect infrared light with a wavelength of 940 nm.
  • FIG. 7 is a diagram for explaining basic method 1.
  • An outline of basic method 1 will be described below with reference to FIG. 7.
  • the information processing device 10 operates the infrared illumination unit 15 and the imaging unit 17 according to the user's operation. At this time, the information processing apparatus 10 blinks the infrared illuminator 15 while synchronizing with the image (moving image) captured by the imaging unit 17 . For example, the information processing device 10 blinks the infrared illumination unit 15 while synchronizing with the frame period of the video (moving image) captured by the imaging unit 17 . Thereby, the information processing apparatus 10 can acquire a visible light image and an IR image in a time division manner in synchronization with the blinking cycle of infrared light.
  • the information processing apparatus 10 can acquire the image of the frame when the infrared light is not irradiated as the visible light image and the image of the frame when the infrared light is irradiated as the IR image.
  • the frame when the IR light is OFF is the visible light image
  • the frame when the IR light is ON is the IR image.
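To make this time-division acquisition concrete, the following is a minimal sketch (not taken from the patent): it assumes the IR light toggles on every frame, that the first frame is captured with the IR light OFF, and simply sorts the frames into the two groups.

```python
import cv2

def split_frames(video_path, ir_on_first_frame=False):
    """Classify frames of a capture whose IR light toggles every frame.

    Assumes frame 0 was captured with the IR light OFF unless
    ir_on_first_frame is True. Returns visible-light frames (IR OFF)
    and IR frames (IR ON, visible light + IR components mixed).
    """
    cap = cv2.VideoCapture(video_path)
    visible_frames, ir_frames = [], []
    index = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        ir_on = (index % 2 == 0) == ir_on_first_frame
        (ir_frames if ir_on else visible_frames).append(frame)
        index += 1
    cap.release()
    return visible_frames, ir_frames
```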
  • an IR image is a captured image containing a visible light component and an IR component obtained by irradiating an object (the user and surrounding objects in the example of FIG. 7) with infrared light.
  • the information processing device 10 extracts IR component information (hereinafter referred to as IR component information) from the IR image.
  • IR component information indicates from which direction the infrared light from the infrared illumination unit 15 hits the object.
  • the information processing device 10 may acquire the difference between the visible light image and the IR image as IR component information.
  • the information processing apparatus 10 obtains, as the IR component information, the difference between two consecutive frames of images (a visible light image and an IR image) starting from a frame at the timing when infrared light is not irradiated (IR light OFF frame).
  • by taking this difference, the information processing device 10 can remove from the IR component information the IR component of light that is always present, such as the light of a room light (for example, a fluorescent lamp).
  • the information processing device 10 performs image processing related to luminance or brightness on the captured image based on the IR component information. For example, based on the IR component information, the information processing device 10 performs image processing on the next frame image (visible light image) following the two continuous frames (visible light image and IR image) used to extract the IR component information. In the example of FIG. 7, the information processing apparatus 10 rewrites the luminance (L) information in the HSL color space of the visible light image based on the IR component information. More specifically, the information processing device 10 converts the visible light image from RGB to HSL, and maps the intensity of the IR component to the luminance (L) of the visible light image in the HSL color space.
  • the mapping may be a complete replacement or blending with the original luminance.
  • the color space used in the visible light image (input image) is RGB, but the color space used in the visible light image (input image) is not limited to RGB.
  • the color space used in the visible light image (input image) may be a color space other than RGB, such as YUV.
  • the YUV color space is a color space that expresses colors with luminance (Y) and color difference components (U, V).
  • the color space used for the visible light image (input image) can be appropriately changed according to the color space of the image output by the camera. Note that if the color space used in the visible light image (input image) is a color space having an axis capable of mapping IR components such as brightness and lightness, this color space conversion step can be omitted.
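As a minimal sketch of the extraction and mapping described above (assuming 8-bit BGR frames as produced by OpenCV, and using OpenCV's HLS representation in place of HSL; the function names and the blend parameter are illustrative, not taken from the patent):

```python
import cv2
import numpy as np

def extract_ir_component(visible_bgr, ir_bgr):
    """IR component information: per-pixel difference between the IR-lit
    frame and the paired visible-light frame."""
    vis = cv2.cvtColor(visible_bgr, cv2.COLOR_BGR2GRAY).astype(np.float32)
    ir = cv2.cvtColor(ir_bgr, cv2.COLOR_BGR2GRAY).astype(np.float32)
    return np.clip(ir - vis, 0, 255)

def relight_with_ir(visible_bgr, ir_component, blend=1.0):
    """Map the IR component intensity onto the L channel of the visible frame.

    blend=1.0 replaces the luminance entirely; smaller values blend with
    the original luminance, as the description allows.
    """
    hls = cv2.cvtColor(visible_bgr, cv2.COLOR_BGR2HLS).astype(np.float32)
    hls[..., 1] = (1.0 - blend) * hls[..., 1] + blend * ir_component
    hls = np.clip(hls, 0, 255).astype(np.uint8)
    return cv2.cvtColor(hls, cv2.COLOR_HLS2BGR)
```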
  • the information processing device 10 may blur the edge of the IR component.
  • the information processing apparatus 10 performs edge blurring processing on the difference image serving as the IR component information, and performs image processing on the next frame image of the two continuous frames based on the edge-blurred difference image. Thereby, the information processing apparatus 10 can generate an image with little discomfort even in a scene with motion.
  • the information processing apparatus 10 may correct the IR component information based on motion prediction between frames, and perform image processing on the next frame image of two continuous frames based on the corrected IR component information.
  • the information processing device 10 may acquire the optical flow between adjacent frames of the visible light image, transform the IR component, and map it. This also makes it possible to generate an image with little sense of incongruity.
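Both mitigations can be sketched roughly as follows, assuming OpenCV's Gaussian blur and Farnebäck optical flow are acceptable stand-ins for whatever filtering and motion prediction the implementation actually uses; inputs are grayscale uint8 frames and the float difference image from the previous sketch.

```python
import cv2
import numpy as np

def blur_ir_edges(ir_component, ksize=15):
    """Soften edges of the difference image so that small subject motion
    between the paired frames does not produce hard halos."""
    return cv2.GaussianBlur(ir_component, (ksize, ksize), 0)

def warp_ir_to_target(ir_component, paired_visible_gray, target_visible_gray):
    """Move the IR component along the motion estimated between the paired
    visible frame and the target visible frame before mapping it."""
    # Backward flow: for each pixel of the target frame, where it came from
    # in the paired frame, so the IR component can be sampled there.
    flow = cv2.calcOpticalFlowFarneback(
        target_visible_gray, paired_visible_gray, None,
        0.5, 3, 15, 3, 5, 1.2, 0)
    h, w = ir_component.shape[:2]
    grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
    map_x = (grid_x + flow[..., 0]).astype(np.float32)
    map_y = (grid_y + flow[..., 1]).astype(np.float32)
    return cv2.remap(ir_component, map_x, map_y, cv2.INTER_LINEAR)
```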
  • the information processing device 10 converts the image from HSL to RGB and outputs it to the output unit 14 .
  • the information processing device 10 rewrites the luminance (L) information of the captured image in the HSL color space based on the IR component information.
  • the information processing apparatus 10 may rewrite the brightness (V) information of the captured image in the HSV color space based on the IR component information.
  • the information processing device 10 may rewrite the luminance (Y) information of the captured image in the YCoCg color space based on the IR component information.
  • the information processing device 10 may rewrite the values in the RGB color space based on the IR component information.
  • the color space used by the information processing apparatus 10 for image processing is not limited to the color space described above.
  • the color space of the final output image is not limited to RGB depending on the application, and may be YUV, for example.
  • FIG. 8 is a flowchart showing image output processing for realizing basic method 1.
  • the following processing is executed by the control unit 13 of the information processing device 10 .
  • the control unit 13 starts image output processing when the user starts imaging (for example, a video conference).
  • the control unit 13 activates the imaging unit 17 (step S101).
  • the imaging unit 17 is an IR camera that can simultaneously acquire visible light and infrared light (IR light).
  • the control unit 13 blinks the infrared illumination unit 15 while synchronizing with the frame cycle of the video (moving image) captured by the imaging unit 17 (step S102).
  • the infrared illuminator 15 is an IR light that outputs invisible infrared light.
  • the acquisition unit 131 of the information processing device 10 acquires the image captured by the imaging unit 17 . Since the infrared light is blinking in synchronization with the frame period of the video, the acquisition unit 131 alternately acquires the visible light image and the IR image (step S103).
  • the IR image is a captured image including not only the IR component under the influence of the infrared light emitted by the infrared illuminator 15 but also the visible light component.
  • the extraction unit 132 of the information processing device 10 extracts IR component information from the IR image (step S104). Specifically, the extraction unit 132 acquires the difference between the visible light image and the IR image as IR component information. In basic method 1, the extraction unit 132 acquires, as IR component information, the difference between two consecutive frames of images (a visible light image and an IR image) starting from the frame at which the IR light is turned off.
  • the image processing unit 133 of the information processing device 10 performs image processing related to luminance or brightness on the captured image based on the IR component information (step S105). For example, based on the IR component information, the information processing device 10 performs image processing on the next frame image (visible light image) following the two continuous frames (visible light image and IR image) used to extract the IR component information. For example, the image processing unit 133 rewrites the luminance information of the visible light image based on the IR component information.
  • the output control unit 134 of the information processing device 10 outputs the captured image subjected to the image processing to the output unit 14 (step S106).
  • control unit 13 of the information processing device 10 determines whether or not the shooting has ended (step S107). If the shooting has not ended (step S107: No), the control unit 13 returns the process to step S103. If the shooting has ended (step S107: Yes), the control unit 13 stops the operations of the imaging unit 17 and the infrared illumination unit 15 (step S108). When the operations of the imaging unit 17 and the infrared illumination unit 15 are stopped, the control unit 13 ends the image output processing.
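Putting steps S101 to S108 together, a simplified loop for basic method 1 might look like the sketch below. `camera`, `ir_light`, and `display` are hypothetical interfaces, the synchronization is idealized as strict frame alternation, and the helper functions are the ones sketched earlier.

```python
def image_output_basic1(camera, ir_light, display, stop_requested):
    camera.start()                                     # S101
    ir_light.blink_synchronized(camera.frame_sync())   # S102
    prev_visible, pending_ir_info = None, None
    for frame, ir_was_on in camera.frames():           # S103
        if ir_was_on:
            if prev_visible is not None:
                # S104: difference of the pair [visible, IR]
                pending_ir_info = blur_ir_edges(
                    extract_ir_component(prev_visible, frame))
        else:
            if pending_ir_info is not None:
                # S105/S106: relight the next visible frame and output it
                display.show(relight_with_ir(frame, pending_ir_info))
            prev_visible = frame
        if stop_requested():                            # S107
            break
    camera.stop()                                       # S108
    ir_light.off()
```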
  • in this way, the information processing apparatus 10 performs image processing based on the irradiation information of invisible infrared light (that is, the IR component information) so that the object (for example, the user) appears as if it were lit by visible light. As a result, the user can shoot moving images with stable lighting without being dazzled.
  • Basic method 2 In basic method 1, the information processing apparatus 10 performs image processing on the next frame image (visible light image) of two continuous frames (visible light image and IR image). However, in the case of a scene with motion, this method may result in an unnatural image after image processing. Therefore, in basic method 2, the frame used for generating the difference image is used as the frame to be subjected to image processing, so that even in a scene with motion, the image does not look unnatural.
  • FIG. 9 is a diagram for explaining basic method 2. An outline of basic method 2 will be described below with reference to FIG. 9.
  • the information processing device 10 operates the infrared illumination unit 15 and the imaging unit 17 according to the user's operation. At this time, the information processing apparatus 10 blinks the infrared illuminator 15 while synchronizing with the image (moving image) captured by the imaging unit 17 . For example, the information processing device 10 blinks the infrared illumination unit 15 while synchronizing with the frame period of the video (moving image) captured by the imaging unit 17 . Thereby, the information processing apparatus 10 can acquire a visible light image and an IR image in a time division manner in synchronization with the blinking cycle of infrared light.
  • the information processing device 10 extracts IR component information from the IR image. At this time, the information processing device 10 acquires the difference between the visible light image and the IR image as IR component information.
  • the information processing apparatus 10 obtains, as the IR component information, the difference between two consecutive frames of images (an IR image and a visible light image) starting from the frame at the timing when the infrared light is irradiated (IR light ON frame).
  • the information processing device 10 performs image processing related to luminance or brightness on the captured image based on the IR component information. For example, based on the IR component information, the information processing device 10 performs image processing on the last frame image (visible light image) of the two consecutive frames (IR image and visible light image) used to extract the IR component information. In the example of FIG. 9, the information processing apparatus 10 rewrites the luminance (L) information in the HSL color space of the visible light image based on the IR component information.
  • the HSL color space is a color space that expresses colors with three components of hue (Hue), saturation (Saturation), and brightness (Lightness).
  • the information processing device 10 converts the image from HSL to RGB and outputs it to the output unit 14 .
  • the information processing device 10 rewrites the information of the luminance (L) in the HSL color space of the captured image based on the IR component information.
  • the information processing apparatus 10 may rewrite the brightness (V) information of the captured image in the HSV color space based on the IR component information.
  • the HSV color space is a color space that expresses colors with three components of hue (Hue), saturation (Saturation/Chroma), and brightness (Value/Brightness).
  • the information processing device 10 may rewrite the luminance (Y) information of the captured image in the YCoCg color space based on the IR component information.
  • the YCoCg color space is a color space that expresses colors by luminance (Y) and color difference components (Co (darkness of orange) and Cg (darkness of green)).
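For reference, the RGB-to-YCoCg transform is simple enough to write out directly (this is the standard definition, not something specified in the patent); rewriting Y with the IR component would then proceed analogously to the HSL case.

```python
import numpy as np

def rgb_to_ycocg(rgb):
    """Standard RGB -> YCoCg transform for float arrays."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.25 * r + 0.5 * g + 0.25 * b
    co = 0.5 * r - 0.5 * b
    cg = -0.25 * r + 0.5 * g - 0.25 * b
    return np.stack([y, co, cg], axis=-1)

def ycocg_to_rgb(ycocg):
    """Exact inverse of rgb_to_ycocg."""
    y, co, cg = ycocg[..., 0], ycocg[..., 1], ycocg[..., 2]
    tmp = y - cg
    return np.stack([tmp + co, y + cg, tmp - co], axis=-1)
```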
  • the information processing device 10 may rewrite the values in the RGB color space based on the IR component information.
  • the color space used by the information processing apparatus 10 for image processing is not limited to the color space described above.
  • FIG. 10 is a flowchart showing image output processing for realizing basic method 2.
  • The following processing is executed by the control unit 13 of the information processing device 10.
  • the control unit 13 starts image output processing when the user starts imaging (for example, a video conference).
  • control unit 13 activates the imaging unit 17 (step S201). Then, the control unit 13 blinks the infrared illumination unit 15 while synchronizing with the frame period of the video (moving image) captured by the imaging unit 17 (step S202).
  • the acquisition unit 131 of the information processing device 10 acquires the image captured by the imaging unit 17 . Since the infrared light is blinking in synchronization with the frame cycle of the video, the acquisition unit 131 alternately acquires the visible light image and the IR image (step S203).
  • the extraction unit 132 of the information processing device 10 extracts IR component information from the IR image (step S204). Specifically, the extraction unit 132 acquires the difference between the visible light image and the IR image as IR component information. In basic method 2, the extraction unit 132 acquires, as IR component information, the difference between two consecutive frames of images (an IR image and a visible light image) starting from the frame at which the IR light is ON.
  • the image processing unit 133 of the information processing device 10 performs image processing related to luminance or brightness of the captured image based on the IR component information (step S205). For example, based on the IR component information, the information processing device 10 performs image processing on the last frame image (visible light image) of the two consecutive frames (IR image and visible light image) used to extract the IR component information. conduct. For example, the image processing unit 133 rewrites the luminance information of the visible light image based on the IR component information.
  • the output control unit 134 of the information processing device 10 outputs the captured image subjected to the image processing to the output unit 14 (step S206).
  • control unit 13 of the information processing device 10 determines whether or not the shooting has ended (step S207). If the shooting has not ended (step S207: No), the control unit 13 returns the process to step S203. If the shooting has ended (step S207: Yes), the control unit 13 stops the operations of the imaging unit 17 and the infrared illumination unit 15 (step S208). When the operations of the imaging unit 17 and the infrared illumination unit 15 are stopped, the control unit 13 ends the image output processing.
  • the information processing apparatus 10 performs image processing on one of the frames used to generate the IR component information (difference image).
  • since the frame targeted for image processing is one of the frames used to generate the difference, the time lag between the IR component information and the target image is small. Therefore, the user can obtain an image with less discomfort.
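The difference between the two pairings can be summarized in a small sketch (a hypothetical helper, assuming parallel lists of frames and IR on/off flags): basic method 1 pairs [visible, IR] and relights the following visible frame, while basic method 2 pairs [IR, visible] and relights the visible frame of the pair itself.

```python
def pair_frames(frames, ir_flags, method):
    """Yield ((visible_frame, ir_frame), target_frame) for the chosen method."""
    n = len(frames)
    for i in range(n - 1):
        if method == 1 and not ir_flags[i] and ir_flags[i + 1] and i + 2 < n:
            # pair [visible(i), IR(i+1)], relight the next visible frame (i+2)
            yield (frames[i], frames[i + 1]), frames[i + 2]
        elif method == 2 and ir_flags[i] and not ir_flags[i + 1]:
            # pair [IR(i), visible(i+1)], relight the visible frame of the pair
            yield (frames[i + 1], frames[i]), frames[i + 1]
```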
  • the information processing device 10 performs image processing on the captured image based on the IR component information.
  • the information processing apparatus 10 may generate a learning model by learning based on the image before image processing and the image after image processing, and may use the generated learning model to estimate the image after image processing from a captured image. Accordingly, the information processing apparatus 10 can acquire an image as if the user were illuminated, without irradiating the user with infrared light.
  • FIG. 11 is a diagram showing processing up to completion of learning of the learning model, and FIG. 12 is a diagram showing processing after completion of learning of the learning model. An outline of the advanced method will be described below with reference to FIGS. 11 and 12.
  • the information processing apparatus 10 operates the infrared illumination section 15 and the imaging section 17 according to the user's operation. At this time, the information processing apparatus 10 blinks the infrared illuminator 15 while synchronizing with the image (moving image) captured by the imaging unit 17. Thereby, the information processing apparatus 10 can acquire a visible light image and an IR image in a time division manner in synchronization with the blinking cycle of infrared light. Then, the information processing device 10 extracts IR component information from the IR image. Then, the information processing apparatus 10 performs image processing regarding luminance or brightness on the captured image based on the IR component information. Then, the information processing device 10 outputs the image after image processing to the output unit 14.
  • the information processing device 10 learns a learning model based on the image before image processing and the image after image processing.
  • a learning model is, for example, a model for learning the relationship between an image before image processing and an image after image processing.
  • the information processing apparatus 10 learns the learning model so as to minimize the difference between the image before image processing and the image after image processing.
  • a learning model is, for example, a machine learning model such as a neural network model.
  • a neural network model is composed of layers called an input layer containing a plurality of nodes, an intermediate layer (or hidden layer), and an output layer, and each node is connected via edges. Each layer has a function called activation function, and each edge is weighted.
  • a learning model has one or more intermediate layers (or hidden layers). When the learning model is a neural network model, learning the learning model means, for example, setting the number of intermediate layers (or hidden layers), the number of nodes in each layer, or the weight of each edge.
  • the neural network model may be a model based on deep learning.
  • the neural network model may be a model called DNN (Deep Neural Network).
  • the neural network model may be a model called a CNN (Convolution Neural Network), RNN (Recurrent Neural Network), or LSTM (Long Short-Term Memory).
  • learning models are not limited to neural network models.
  • the learning model may be a model based on reinforcement learning. In reinforcement learning, actions (settings) that maximize value are learned through trial and error.
  • the learning model may be a logistic regression model.
  • the learning model may consist of multiple models.
  • a learning model may consist of multiple neural network models. More specifically, the learning model may consist of multiple neural network models selected from, for example, CNN, RNN, and LSTM. When a learning model is composed of multiple neural network models, these multiple neural network models may be in a dependent relationship or in a parallel relationship.
  • the information processing device 10 stores, in the storage unit 12, character strings, numerical values, and the like that indicate the model structure and connection coefficients as information that constitutes the learning model.
  • the learning model may be, for example, a model that uses pairs of an image before image processing (a captured image such as a visible light image) and the corresponding image after image processing as learning data, and that has learned to output an image after image processing (hereinafter referred to as an estimated image) when an image before image processing (for example, a captured image such as a visible light image) is input.
  • the learning model includes an input layer to which a captured image is input, an output layer that outputs an estimated image, a first element belonging to any layer from the input layer to the output layer other than the output layer, and a second element whose value is calculated based on the first element and the weight of the first element.
  • an operation is performed based on the first element and the weight of the first element (that is, the connection coefficient), so that the estimated image is output from the output layer according to the captured image input to the input layer.
  • the learning model is realized by a neural network with one or more hidden layers, such as DNN.
  • the first element included in the learning model corresponds to any node of the input layer or intermediate layer.
  • the second element corresponds to the next node, which is a node to which the value is transmitted from the node corresponding to the first element.
  • the weight of the first element corresponds to the connection coefficient, which is the weight considered for the value transmitted from the node corresponding to the first element to the node corresponding to the second element.
  • the first element included in the learning model corresponds to input data (xi) such as x1 and x2.
  • the weight of the first element corresponds to the coefficient ai corresponding to xi.
  • the regression model can be viewed as a simple perceptron with an input layer and an output layer.
  • the first element can be regarded as a node of the input layer
  • the second element can be regarded as a node of the output layer.
  • the information processing device 10 uses a model having an arbitrary structure, such as a neural network or a regression model, to calculate information to be output.
  • the learning model is set with coefficients so that an estimated image is output when a captured image (for example, a visible light image before image processing) is input.
  • the information processing apparatus 10 sets the coefficient based on the degree of similarity between the image after image processing and the value obtained by inputting the captured image (visible light image before image processing) into the learning model.
  • the information processing apparatus 10 uses such a learning model to generate an estimated image from the captured image.
  • as an example of the learning model, a model that outputs an estimated image when a captured image is input has been described.
  • the learning model according to the embodiment may be a model that is generated based on results obtained by repeatedly inputting and outputting data to the learning model.
  • the learning model may be a model that constitutes part of a GAN (Generative Adversarial Network).
  • the learning device that learns the learning model may be the information processing device 10, or may be another information processing device.
  • the information processing apparatus 10 learns a learning model.
  • the information processing apparatus 10 learns the learning model and stores the learned learning model in the storage unit 12. More specifically, the information processing apparatus 10 sets the connection coefficients of the learning model so that the learning model outputs an estimated image when a captured image is input to the learning model.
  • the information processing apparatus 10 inputs a captured image to a node in the input layer of the learning model, propagates the data to the output layer of the learning model by following each intermediate layer, and outputs an estimated image. Then, the information processing apparatus 10 corrects the connection coefficients of the learning model based on the difference between the estimated image actually output by the learning model and the actual image after image processing. For example, the information processing apparatus 10 may correct the connection coefficients using a technique such as back propagation. At this time, the information processing apparatus 10 may correct the connection coefficients based on the cosine similarity between a vector representing the actual image after image processing and a vector representing the value actually output by the learning model.
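A minimal PyTorch-style sketch of that training step, under the assumption that the before/after images are available as float tensors of shape (N, 3, H, W); the network architecture, loss, and hyperparameters are placeholders, since the patent does not fix them.

```python
import torch
import torch.nn as nn

# Illustrative relighting estimator: a small fully convolutional network.
model = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 3, 3, padding=1),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.L1Loss()  # difference between estimated and processed image

def training_step(captured, processed):
    """One update: captured = image before image processing,
    processed = the same frame after IR-based relighting (the target)."""
    optimizer.zero_grad()
    estimated = model(captured)           # forward pass through the layers
    loss = loss_fn(estimated, processed)  # difference to be minimized
    loss.backward()                       # back propagation
    optimizer.step()                      # update the connection coefficients
    return loss.item()
```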
  • the information processing device 10 may learn the learning model using any learning algorithm.
  • the information processing device 10 may learn a learning model using learning algorithms such as neural networks, support vector machines, clustering, and reinforcement learning.
  • the information processing apparatus 10 starts generating an estimated image.
  • the information processing apparatus 10 uses the generated learning model to estimate an image after image processing of the captured image from the newly acquired captured image.
  • the information processing apparatus 10 switches the image output to the output unit 14 from the image generated by the image processing to the image estimated using the learning model (hereinafter referred to as the estimated image).
  • the information processing device 10 may stop outputting infrared light from the infrared illumination unit 15 at the timing when the image output to the output unit 14 is switched from the image generated by the image processing to the estimated image.
  • half of the frames captured by the information processing apparatus 10 are visible light images before the learning of the learning model is completed, but all the frames are visible light images after the learning of the learning model is completed.
  • the information processing apparatus 10 then generates an estimated image of the visible light image using the learning model, and outputs the generated estimated image to the output unit 14 .
  • as a result, after learning is completed, the information processing apparatus 10 can output video to the output unit 14 at double the frame rate compared to before the completion of learning.
  • FIG. 13 is a flow chart showing image output processing for realizing the advanced method.
  • the following processing is executed by the control unit 13 of the information processing device 10 .
  • the control unit 13 starts image output processing when the user starts imaging (for example, a video conference).
  • control unit 13 activates the imaging unit 17 (step S301). Then, the control unit 13 blinks the infrared illumination unit 15 while synchronizing with the frame period of the video (moving image) captured by the imaging unit 17 (step S302).
  • the acquisition unit 131 of the information processing device 10 acquires the image captured by the imaging unit 17 . Since the infrared light is blinking in synchronization with the frame cycle of the video, the acquisition unit 131 alternately acquires the visible light image and the IR image (step S303).
  • the extraction unit 132 of the information processing device 10 extracts IR component information from the IR image (step S304). Specifically, the extraction unit 132 acquires the difference between the visible light image and the IR image as IR component information. Then, the image processing unit 133 of the information processing device 10 performs image processing related to luminance or lightness of the captured image based on the IR component information (step S305). Then, the output control unit 134 of the information processing device 10 outputs the processed image to the output unit 14 (step S306).
  • the learning unit 135 of the information processing device 10 performs learning of the learning model based on the image before image processing and the image after image processing (step S307).
  • the control unit 13 determines whether or not the shooting has ended (step S308). If the shooting has ended (step S308: Yes), the control unit 13 advances the process to step S311. If the shooting has not ended (step S308: No), the control unit 13 determines whether learning of the learning model has been completed (step S309). If learning has not been completed (step S309: No), the control unit 13 returns the process to step S303. If the learning has been completed (step S309: Yes), the control unit 13 starts the estimation process (step S310).
  • FIG. 14 is a flow chart showing the estimation process.
  • the control unit 13 of the information processing device 10 stops the operation of the infrared illumination unit 15 (step S401). Then, the acquisition unit 131 of the information processing device 10 acquires the captured image (that is, the visible light image) (step S402). Then, the estimating unit 136 of the information processing apparatus 10 inputs the captured image to the learning model, thereby estimating the image after the image processing of the captured image (step S403). Then, the output control unit 134 of the information processing device 10 outputs the estimated image to the output unit 14 (step S404).
  • control unit 13 of the information processing device 10 determines whether or not the shooting has ended (step S405). If the shooting has not ended (step S405: No), the control unit 13 returns the process to step S402. If the shooting has ended (step S405: Yes), the control unit 13 returns the processing to the flow of FIG. 13 and stops the operations of the imaging unit 17 and the infrared illumination unit 15 (step S311). When the operations of the imaging unit 17 and the infrared illumination unit 15 are stopped, the control unit 13 ends the image output processing.
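Continuing the sketch above, the estimation phase (steps S401 to S405) could look like this; `camera`, `ir_light`, and `display` are the same hypothetical interfaces as before, and `to_tensor`/`to_image` are assumed conversion helpers between camera frames and model tensors.

```python
import torch

def image_output_estimation(camera, ir_light, display, stop_requested, model):
    """Estimation phase: the IR light stays off and every frame is a
    visible-light image fed through the learned model."""
    ir_light.off()                               # S401
    model.eval()
    with torch.no_grad():
        for frame, _ in camera.frames():         # S402
            estimated = model(to_tensor(frame))  # S403: estimate processed image
            display.show(to_image(estimated))    # S404
            if stop_requested():                 # S405
                break
```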
  • the information processing apparatus 10 can thus acquire an image as if the user were illuminated, without irradiating the user with infrared light. Also, before learning is completed, the IR image is not output to the output unit 14 and the frame rate is halved; after learning is completed, every captured frame can be output, so this drop in frame rate can be avoided.
  • the method of this embodiment can also be applied to live-action volumetric photography.
  • live-action volumetric capture is a technique that acquires three-dimensional information of a subject (for example, a person) in a studio or the like and converts it directly into 3DCG.
  • the information processing apparatus 10 surrounds and photographs a subject with multiple cameras. Then, the information processing apparatus 10 converts the subject into three-dimensional data from the image data to generate content. Then, the information processing apparatus 10 renders the content from a free viewpoint based on the user's operation.
  • the information processing apparatus 10 shoots a subject mainly in a studio with a plurality of fixed lighting fixtures on the ceiling in order to realize volumetric photography. At this time, if the subject is uniformly illuminated with bright lighting, the quality of the texture and modeling improves, but the unevenness is reduced, resulting in an unnatural, CG-like image. On the other hand, if the subject is shot with biased lighting, the shadows and unevenness increase, but the texture and modeling quality deteriorate. Furthermore, even if part of the subject is not illuminated due to the shape of the subject, or the contrast with the green screen is low, it is difficult to add additional lighting.
  • therefore, in this embodiment, a visible light camera capable of also capturing infrared light and an IR light whose lighting, extinguishing, and irradiation direction can be individually controlled are additionally arranged in a conventional live-action volumetric imaging system.
  • the information processing apparatus 10 performs image processing (for example, correction or enhancement of shadows) on the visible light image based on the IR component information.
  • FIG. 15 is a diagram showing an example of a photography studio of the live-action volumetric photography system of this embodiment.
  • a plurality of visible light lights 20 and a plurality of IR lights (the infrared illumination units 15 shown in FIG. 15) are arranged in the photography studio.
  • a plurality of IR cameras 30 are arranged in the photography studio.
  • the IR camera 30 is a camera that can simultaneously acquire visible light and infrared light.
  • the configuration of the IR camera 30 is similar to that of the imaging section 17 .
  • FIG. 16 is a diagram showing a processing example of the information processing device 10 in the live-action volumetric imaging system.
  • the information processing device 10 acquires a multi-viewpoint image composed of visible light images from a plurality of directions of a subject and an IR image of the subject from a plurality of directions.
  • a multi-viewpoint image is an image for generating a 3D model of a subject.
  • the information processing device 10 corrects the multi-viewpoint image based on IR component information (shadow information) extracted from the IR image.
  • IR component information can be used as auxiliary information for foreground-background separation.
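As one illustrative reading of how the IR component could assist foreground-background separation (an assumption, since the text does not give the algorithm): the subject, being much closer to the IR lights than the green screen, returns noticeably more infrared, so a thresholded IR component gives a rough foreground hint that can supplement chroma keying.

```python
import cv2
import numpy as np

def foreground_hint_from_ir(ir_component, threshold=30):
    """Rough foreground mask from the IR component (illustrative only)."""
    mask = (ir_component > threshold).astype(np.uint8) * 255
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (7, 7))
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)   # remove speckle
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)  # fill small holes
    return mask
```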
  • the imaging environment of the object may be a combination of a high-speed imaging camera capable of simultaneously acquiring visible light and infrared light, and visible light and IR light arranged omnidirectionally.
  • FIG. 17 is a diagram showing a state in which a plurality of visible light lights 20 and a plurality of IR lights (infrared illuminators 15) are omnidirectionally arranged.
  • the information processing device 10 simultaneously acquires shading/reflectance (albedo) from an arbitrary light source position while shooting with visible light. Since the imaging frame ratio of IR image:visible light image is not limited to 1:1, shading/albedo from multiple light source positions can be acquired simultaneously depending on camera performance. Since it uses infrared light, it does not affect visible light image capturing. If only the shading in the IR monochrome image is acquired, the frame rate can be increased to the limit independently of the visible light camera. Further, the information processing apparatus 10 can add photo-realistic shadows later based on the albedo.
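One possible reading of the shading/albedo step (again an assumption, not spelled out in the text): treat the IR monochrome image as the shading term for the known IR light position and divide it out of the visible image to obtain a rough albedo, onto which new shadows can later be applied.

```python
import numpy as np

def rough_albedo(visible_gray, ir_shading, eps=1e-3):
    """Very rough intrinsic decomposition: albedo ~ image / shading.

    visible_gray and ir_shading are float arrays in [0, 1]; the IR image
    acts as the shading term for the IR light source position. This is an
    illustrative reading, not the patent's algorithm.
    """
    shading = np.clip(ir_shading, eps, 1.0)
    return np.clip(visible_gray / shading, 0.0, 1.0)
```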
  • the user may produce new video content by synthesizing the 3D model of the subject generated in this embodiment with 3D data managed by another server. Further, for example, when there is background data acquired by an imaging device such as Lidar, the user can combine the 3D model of the subject generated in the present embodiment with the background data to create content in which the subject appears to be at the location indicated by the background data.
  • the video content may be 3D video content, or may be 2D video content converted to 2D.
  • the 3D model of the subject generated in the present embodiment includes, for example, a 3D model generated by a 3D model generation unit and a 3D model reconstructed by a rendering unit.
  • for example, the information processing device 10 can arrange the 3D model of the subject (for example, a performer) generated in the present embodiment in a virtual space where users communicate as avatars. In this case, the user, as an avatar, can view the photographed subject in the virtual space.
  • a remote user can view the 3D model of the subject.
  • the information processing apparatus 10 can transmit the 3D model of the subject in real time, so that the subject and the remote user can communicate in real time.
  • the subject is a teacher and the user is a student, or that the subject is a doctor and the user is a patient.
  • the information processing apparatus 10 can also generate a free-viewpoint video of sports or the like based on the 3D models of a plurality of subjects generated in the present embodiment. Also, an individual can distribute himself/herself, which is a 3D model generated in this embodiment, to a distribution platform. As such, the content of the embodiments described herein can be applied to a variety of technologies and services.
  • the information processing apparatus 10 of this embodiment may be implemented by a dedicated computer system or may be implemented by a general-purpose computer system.
  • a communication program for executing the above operations is distributed by storing it in a computer-readable recording medium such as an optical disk, semiconductor memory, magnetic tape, or flexible disk.
  • the control device is configured by installing the program in a computer and executing the above-described processing.
  • the control device may be a device (for example, a personal computer) external to the information processing device 10 .
  • the control device may be a device inside the information processing device 10 (for example, the control unit 13).
  • the above communication program may be stored in a disk device provided in a server device on a network such as the Internet, so that it can be downloaded to a computer.
  • the functions described above may be realized through cooperation between an OS (Operating System) and application software.
  • the parts other than the OS may be stored in a medium and distributed, or the parts other than the OS may be stored in a server device so that they can be downloaded to a computer.
  • each component of each device illustrated is functionally conceptual and does not necessarily need to be physically configured as illustrated.
  • the specific form of distribution and integration of each device is not limited to the illustrated one, and all or part of the devices can be functionally or physically distributed or integrated in arbitrary units according to various loads and usage conditions. Note that this distribution and integration may be performed dynamically.
  • each step of one flowchart may be executed by one device, or may be executed by a plurality of devices.
  • the plurality of processes may be executed by one device, or may be shared by a plurality of devices.
  • a plurality of processes included in one step can also be executed as processes of a plurality of steps.
  • the processing described as multiple steps can also be collectively executed as one step.
  • a program executed by a computer may be configured such that the processing of its steps is executed in chronological order according to the order described in this specification, executed in parallel, or executed individually at necessary timings such as when a call is made. That is, as long as there is no contradiction, the processing of each step may be executed in an order different from the order described above. Furthermore, the processing of the steps describing this program may be executed in parallel with the processing of other programs, or may be executed in combination with the processing of other programs.
  • the present embodiment can be implemented as any configuration constituting a device or system, for example, a processor as a system LSI (Large Scale Integration), a module using a plurality of processors, a unit using a plurality of modules, a set in which other functions are further added to a unit (that is, a configuration of a part of a device), and the like.
  • the system means a set of a plurality of components (devices, modules (parts), etc.), and it does not matter whether all the components are in the same housing. Therefore, a plurality of devices housed in separate housings and connected via a network, and a single device housing a plurality of modules in one housing, are both systems.
  • this embodiment can take a configuration of cloud computing in which one function is shared by a plurality of devices via a network and processed jointly.
  • as described above, the information processing apparatus 10 extracts IR component information from an IR image obtained by irradiating an object (for example, a user and surrounding objects) with infrared light, and performs image processing related to luminance or brightness on the captured image of the target based on the extracted IR component information. Since infrared light is invisible to the human eye, the user can obtain an image that appears to be illuminated with visible light without being dazzled.
  • the present technology can also take the following configuration.
  • An information processing device comprising: an acquisition unit that acquires an IR image that is a captured image obtained by irradiating an object with infrared light and that includes a visible light component and an IR component; an extraction unit that extracts IR component information from the IR image; and an image processing unit that performs image processing related to luminance or brightness of the captured image of the target based on the information of the IR component.
  • the acquisition unit acquires a visible light image of the target in addition to the IR image, and the extraction unit acquires a difference between the visible light image and the IR image as the IR component information.
  • the information processing device according to (1) above.
  • the infrared light is blinking
  • the acquisition unit acquires the visible light image and the IR image in a time division manner in synchronization with the blinking cycle of the infrared light.
  • the information processing device according to (2) above.
  • the infrared light blinks in synchronization with the frame period of the video,
  • the acquisition unit acquires the image of the frame at the timing when the infrared light is not irradiated as the visible light image, and acquires the image of the frame at the timing when the infrared light is irradiated as the IR image.
  • the information processing apparatus according to (2) or (3) above.
  • the extracting unit acquires, as the IR component information, a difference between images of two consecutive frames starting from a frame at which the infrared light is not irradiated,
  • the image processing unit performs image processing relating to luminance or brightness on the image of the next frame of the two consecutive frames based on the IR component information.
  • the information processing device according to (4) above.
  • the IR component information is a difference image of the two consecutive frames;
  • the image processing unit performs a process of blurring the edges of the difference image, and performs image processing relating to luminance or brightness on the image of the next frame of the two consecutive frames based on the difference image with the blurred edges.
  • the information processing device according to (5) above.
  • the image processing unit corrects the IR component information based on inter-frame motion prediction, and performs image processing relating to luminance or brightness on the image of the next frame of the two consecutive frames based on the corrected IR component information. The information processing device according to (5) above.
  • the extracting unit acquires, as the IR component information, a difference between images of two consecutive frames starting from a frame at which the infrared light is irradiated, and the image processing unit performs image processing relating to luminance or brightness on the image of the last frame of the two consecutive frames based on the IR component information.
  • the information processing device according to (4) above.
  • the image processing unit rewrites luminance information in the HSL color space of the captured image based on the IR component information.
  • the information processing apparatus according to any one of (1) to (8) above.
  • the image processing unit rewrites lightness information in the HSV color space of the captured image based on the IR component information.
  • the information processing apparatus according to any one of (1) to (8) above.
  • An output control unit that outputs an image generated by the image processing to an output unit,
  • the information processing apparatus according to any one of (1) to (10) above.
  • the information processing device controls the infrared irradiation unit so that the infrared light blinks in synchronization with the video frame cycle while the image generated by the image processing is output to the output unit, and while the infrared light is blinking, the acquisition unit acquires the image of the frame at the timing when the infrared light is not irradiated as the visible light image, and acquires the image of the frame at the timing when the infrared light is irradiated as the IR image.
  • the output control unit controls the infrared irradiation unit so as to stop the output of the infrared light at the timing when the image output to the output unit is switched from the image generated by the image processing to the estimated image; after the output of the infrared light is stopped, the acquisition unit acquires the images of all frames as the visible light image, and the estimating unit uses the learning model to estimate an image corresponding to the visible light image after the image processing.
  • the information processing device according to (12) above.
  • the acquisition unit acquires multi-viewpoint images, which are images for generating a 3D model of a subject and which are composed of visible light images of the subject from a plurality of directions, and IR images of the subject from a plurality of directions, and the image processing unit corrects the multi-viewpoint images based on the IR component information extracted from the IR images.
  • the information processing device according to (1) above.
  • the computer is caused to function as: an acquisition unit that acquires an IR image, which is a captured image obtained by irradiating a target with infrared light and which includes a visible light component and an IR component; an extraction unit that extracts IR component information from the IR image; and an image processing unit that performs image processing relating to luminance or brightness on the captured image of the target based on the IR component information;

Abstract

An information processing device according to the present invention comprises: an acquisition unit that acquires an infrared (IR) image, which is a captured image obtained by irradiating a subject with infrared light, including a visible light component and an IR component; an extraction unit that extracts information of the IR component from the IR image; and an image processing unit that, on the basis of the information on the IR component, subjects the captured image of the subject to image processing pertaining to luminance or brightness.

Description

情報処理装置、情報処理方法、及びプログラム Information processing device, information processing method, and program
 本開示は、情報処理装置、情報処理方法、及びプログラムに関する。 The present disclosure relates to an information processing device, an information processing method, and a program.
 個人でカメラを使用する機会が増加している。例えば、近年では、リモートワーク、テレビ会議、テレビ電話の需要が増え、顔を接写する機会が増えている。また、近年では、動画共有サイトへの投稿動画制作等、個人での動画制作の需要が増えている。 Opportunities for individuals to use cameras are increasing. For example, in recent years, the demand for remote work, video conferences, and video calls has increased, and opportunities to take close-up shots of faces have increased. Also, in recent years, there has been an increase in demand for personal video production, such as production of videos to be posted on video sharing sites.
特開2016-6627号公報JP 2016-6627 A
 動画撮影のクオリティを上げるためには、照明を使うことが望ましい。しかしながら、照明を使った場合、ユーザに苦痛を与えることがある。例えば、照明を使った場合、ユーザは、カメラに向かって話す間、まぶしい照明を我慢しなければならない。  In order to improve the quality of video recording, it is desirable to use lighting. However, the use of lighting may cause discomfort to the user. For example, with lighting, the user must put up with the glare while speaking to the camera.
 そこで、本開示では、高いクオリティの動画撮影を可能にする情報処理装置、情報処理方法、及びプログラムを提案する。 Therefore, the present disclosure proposes an information processing device, an information processing method, and a program that enable high-quality video shooting.
 なお、上記課題又は目的は、本明細書に開示される複数の実施形態が解決し得、又は達成し得る複数の課題又は目的の1つに過ぎない。 It should be noted that the above problem or object is only one of the multiple problems or objects that can be solved or achieved by the multiple embodiments disclosed herein.
 上記の課題を解決するために、本開示に係る一形態の情報処理装置は、対象に赤外光を照射して得られる撮像画像であって可視光成分とIR成分とを含むIR画像を取得する取得部と、前記IR画像からIR成分の情報を抽出する抽出部と、前記IR成分の情報に基づいて前記対象の撮像画像への輝度又は明度に関する画像処理を行う画像処理部と、を備える。 In order to solve the above problems, an information processing apparatus according to one embodiment of the present disclosure acquires an IR image that is a captured image obtained by irradiating an object with infrared light and that includes a visible light component and an IR component. an acquisition unit that extracts IR component information from the IR image; and an image processing unit that performs image processing related to brightness or brightness of the captured image of the target based on the IR component information. .
動画撮影に可視光ライトを使った様子を示す図である。FIG. 11 is a diagram showing a state in which visible light is used for moving image shooting; IRライトを使った動画撮影の様子を示す図である。FIG. 10 is a diagram showing how a moving image is captured using an IR light; 本実施形態の画像処理の概要を示す図である。It is a figure which shows the outline|summary of the image processing of this embodiment. 本開示の実施形態に係るサーバの構成例を示す図である。1 is a diagram illustrating a configuration example of a server according to an embodiment of the present disclosure; FIG. 赤外線照明部の一例を示す図である。It is a figure which shows an example of an infrared illumination part. 撮像部が備えるフィルタの周波数特性を示す図である。It is a figure which shows the frequency characteristic of the filter with which an imaging part is provided. 基本的な手法1を説明するための図である。FIG. 4 is a diagram for explaining basic method 1; 基本的な手法1を実現するための画像出力処理を示すフローチャートである。4 is a flowchart showing image output processing for realizing basic method 1; 基本的な手法2を説明するための図である。FIG. 10 is a diagram for explaining basic technique 2; 基本的な手法2を実現するための画像出力処理を示すフローチャートである。9 is a flowchart showing image output processing for realizing basic method 2; 発展的な手法を説明するための図である。It is a figure for demonstrating an expansive method. 発展的な手法を説明するための図である。It is a figure for demonstrating an expansive method. 発展的な手法を実現するための画像出力処理を示すフローチャートである。10 is a flowchart showing image output processing for realizing an advanced method; 推測処理を示すフローチャートである。It is a flowchart which shows an estimation process. 本実施形態の実写ボリュメトリック撮影システムの撮影スタジオの一例を示す図である。1 is a diagram showing an example of a photographing studio of the live-action volumetric photographing system of this embodiment; FIG. 実写ボリュメトリック撮影システムにおける情報処理装置10の処理例を示す図である。FIG. 3 is a diagram showing a processing example of the information processing device 10 in the live-action volumetric imaging system; 複数の可視光ライト及び複数のIRライトが全天球配置された様子を示す図である。FIG. 3 is a diagram showing a state in which a plurality of visible light lights and a plurality of IR lights are arranged omnidirectionally;
 以下に、本開示の実施形態について図面に基づいて詳細に説明する。なお、以下の各実施形態において、同一の部位には同一の符号を付することにより重複する説明を省略する。 Below, embodiments of the present disclosure will be described in detail based on the drawings. In addition, in each of the following embodiments, the same parts are denoted by the same reference numerals, thereby omitting redundant explanations.
 以下に説明される1又は複数の実施形態(実施例、変形例を含む)は、各々が独立に実施されることが可能である。一方で、以下に説明される複数の実施形態は少なくとも一部が他の実施形態の少なくとも一部と適宜組み合わせて実施されてもよい。これら複数の実施形態は、互いに異なる新規な特徴を含み得る。したがって、これら複数の実施形態は、互いに異なる目的又は課題を解決することに寄与し得、互いに異なる効果を奏し得る。 Each of one or more embodiments (including examples and modifications) described below can be implemented independently. On the other hand, at least some of the embodiments described below may be implemented in combination with at least some of the other embodiments as appropriate. These multiple embodiments may include novel features that differ from each other. Therefore, these multiple embodiments can contribute to solving different purposes or problems, and can produce different effects.
 また、以下に示す項目順序に従って本開示を説明する。
  1.概要 
  2.情報処理装置の構成
  3.情報処理装置の動作
   3-1.基本的な手法1
   3-2.基本的な手法2
   3-3.発展的な手法
  4.実写ボリュメトリックへの応用
   4-1.課題
   4-2.実施例
   4-3.他の例
  5.変形例
   5-1.製品やサービスへ応用
   5-2.その他の変形例
  6.むすび
Also, the present disclosure will be described according to the order of items shown below.
1. Overview
2. Configuration of information processing apparatus
3. Operation of information processing apparatus
 3-1. Basic method 1
 3-2. Basic method 2
 3-3. Advanced method
4. Application to live-action volumetric capture
 4-1. Issues
 4-2. Example
 4-3. Other examples
5. Modifications
 5-1. Application to products and services
 5-2. Other modifications
6. Conclusion
<<1.概要>>
 近年、個人でカメラを使用する機会が増加している。特に、近年では、リモートワーク、テレビ会議、テレビ電話等を使用する機会が増え、ユーザを近接撮影する機会が増加している。
<<1. Overview>>
In recent years, there have been increasing opportunities for individuals to use cameras. In particular, in recent years, opportunities to use remote work, video conferences, video phones, etc. have increased, and opportunities to take close-up photographs of users have increased.
 動画撮影のクオリティを上げるためには、照明を使うことが望ましい。図1は、動画撮影に可視光ライトを使った様子を示す図である。図1の例では、照明にリング状の可視光ライト(図1の例ではLED(Light Emitting Diode)ライト)を使用している。照明の当て方ひとつで映像の印象が変わるので、照明の当て方は難しい。できるだけカメラと同方向から均等にユーザに光を当てるのが望ましいが、そのような照明の設置は困難である。また、ユーザが眼鏡をかけていた場合、ユーザに真正面から光を当てると、眼鏡に光が反射する等の問題もある。  In order to improve the quality of video recording, it is desirable to use lighting. FIG. 1 is a diagram showing how a visible light is used for moving image shooting. In the example of FIG. 1, a ring-shaped visible light (LED (Light Emitting Diode) light in the example of FIG. 1) is used for illumination. The impression of the image changes depending on how the lighting is applied, so it is difficult to decide how to apply the lighting. Although it is desirable to illuminate the user as evenly as possible from the same direction as the camera, it is difficult to install such lighting. In addition, when the user wears glasses, there is a problem that the light is reflected on the glasses when the user is exposed to the light from the front.
 照明等の機材をそろえたとしても、ユーザは、カメラに向かって話す間、眩しい照明を我慢しないといけない。例えば、ユーザは、気軽にテレビ電話したいだけなのに、機材に投資しセットアップに苦労したうえ、我慢して顔に照明を当て続けなければならない。これはユーザにとって極めて苦痛である。 Even with equipment such as lighting, users have to put up with dazzling lighting while talking to the camera. For example, even though the user just wants to make a video call casually, he has to invest in the equipment, have a hard time setting it up, and have to endure and keep lighting his face. This is extremely painful for the user.
 そこで、本実施形態では、可視光及び赤外光を同時取得できるカメラと、不可視の赤外光を出力するIR(Infrared)ライトを組み合わせる。図2は、IRライトを使った動画撮影の様子を示す図である。そして、本実施形態の情報処理装置は、赤外光の照射の情報に基づいてあたかも可視光が被写体に照射されているかのように撮像画像に画像処理を施す。これにより、近接撮影シーンに有効な簡易的なリライティング(Relighting)を実現する。 Therefore, in this embodiment, a camera capable of simultaneously acquiring visible light and infrared light is combined with an IR (Infrared) light that outputs invisible infrared light. FIG. 2 is a diagram showing how moving images are captured using an IR light. Then, the information processing apparatus of the present embodiment performs image processing on the captured image based on the infrared light irradiation information as if the subject were irradiated with visible light. This realizes simple relighting that is effective for close-up scenes.
 図3は、本実施形態の画像処理の概要を示す図である。情報処理装置は、対象(例えば、ユーザ、及びその周囲の物)に赤外光を照射して得られるIR画像を取得する。IR画像は、対象に赤外光を照射して得られる、可視光成分とIR成分とを含む撮像画像である。そして、情報処理装置は、IR画像から抽出したIR成分の情報に基づいて対象の撮像画像への輝度又は明度に関する画像処理を行う。これにより、ユーザは、苦痛や複雑な照明の調整を伴わずに安定したライティングで動画撮影ができる。 FIG. 3 is a diagram showing an overview of image processing according to this embodiment. An information processing apparatus obtains an IR image obtained by irradiating an object (eg, a user and surrounding objects) with infrared light. An IR image is a captured image containing a visible light component and an IR component obtained by irradiating an object with infrared light. Then, the information processing device performs image processing related to brightness or brightness on the captured image of the target based on the information of the IR component extracted from the IR image. As a result, the user can shoot moving images with stable lighting without painful or complicated lighting adjustments.
 なお、赤外光の照射には、レンズ周りに赤外光の発光素子をリング状に配置したIRリングライトを使用するのが望ましい。また、眼鏡への光の映り込みには、偏光フィルタで対処してもよい。 For infrared light irradiation, it is desirable to use an IR ring light in which infrared light emitting elements are arranged in a ring around the lens. In addition, a polarizing filter may be used to prevent reflection of light on the glasses.
 以上、本実施形態の概要を述べたが、以下、本実施形態の情報処理装置10を詳細に説明する。 The outline of the present embodiment has been described above, and the information processing apparatus 10 of the present embodiment will be described in detail below.
<<2.情報処理装置の構成>>
 まず、情報処理装置10の構成を説明する。
<<2. Configuration of Information Processing Device >>
First, the configuration of the information processing device 10 will be described.
 情報処理装置10は、ユーザが動画撮影に使用するコンピュータである。情報処理装置10は、典型的にはパーソナルコンピュータであるが、パーソナルコンピュータに限られない。例えば、情報処理装置10は、携帯電話、スマートデバイス(スマートフォン、又はタブレット)、PDA(Personal Digital Assistant)、ノートPC等のモバイル端末であってもよい。また、情報処理装置10は、スマートウォッチ等のウェアラブルデバイスであってもよい。 The information processing device 10 is a computer used by the user for video shooting. The information processing device 10 is typically a personal computer, but is not limited to a personal computer. For example, the information processing device 10 may be a mobile terminal such as a mobile phone, a smart device (smartphone or tablet), a PDA (Personal Digital Assistant), or a notebook PC. Also, the information processing device 10 may be a wearable device such as a smart watch.
 また、情報処理装置10は、AR(Augmented Reality)デバイス、VR(Virtual Reality)デバイス、MR(Mixed Reality)デバイス等のxRデバイスであってもよい。このとき、xRデバイスは、ARグラス、MRグラス等のメガネ型デバイスであってもよいし、VRヘッドマウントディスプレイ等のヘッドマウント型デバイスであってもよい。 The information processing apparatus 10 may also be an xR device such as an AR (Augmented Reality) device, a VR (Virtual Reality) device, or an MR (Mixed Reality) device. At this time, the xR device may be a glasses-type device such as AR glasses or MR glasses, or a head-mounted device such as a VR head-mounted display.
 また、情報処理装置10は、持ち運び可能なIoT(Internet of Things)デバイスであってもよい。また、情報処理装置10は、FPU(Field Pickup Unit)等の通信機器が搭載されたバイクや移動中継車等であってもよい。また、情報処理装置10は、PCサーバ、ミッドレンジサーバ、メインフレームサーバ等のサーバ装置であってもよい。その他、情報処理装置10には、あらゆる形態のコンピュータを採用可能である。 The information processing device 10 may also be a portable IoT (Internet of Things) device. Also, the information processing apparatus 10 may be a motorcycle, a mobile relay vehicle, or the like equipped with a communication device such as an FPU (Field Pickup Unit). Further, the information processing device 10 may be a server device such as a PC server, a midrange server, or a mainframe server. In addition, the information processing apparatus 10 can employ any form of computer.
 図4は、本開示の実施形態に係る情報処理装置10の構成例を示す図である。情報処理装置10は、通信部11と、記憶部12と、制御部13と、出力部14と、赤外線照明部15と、同期信号発生部16と、撮像部17と、を備える。なお、図4に示した構成は機能的な構成であり、ハードウェア構成はこれとは異なっていてもよい。また、情報処理装置10の機能は、複数の物理的に分離された構成に分散して実装されてもよい。 FIG. 4 is a diagram showing a configuration example of the information processing device 10 according to the embodiment of the present disclosure. The information processing apparatus 10 includes a communication section 11 , a storage section 12 , a control section 13 , an output section 14 , an infrared illumination section 15 , a synchronization signal generation section 16 and an imaging section 17 . Note that the configuration shown in FIG. 4 is a functional configuration, and the hardware configuration may differ from this. Also, the functions of the information processing apparatus 10 may be distributed and implemented in a plurality of physically separated configurations.
 通信部11は、他の装置と通信するための通信インタフェースである。例えば、通信部11は、NIC(Network Interface Card)等のLAN(Local Area Network)インタフェースである。また、通信部11は、USB(Universal Serial Bus)等の機器接続インタフェースであってもよい。通信部11は、有線インタフェースであってもよいし、無線インタフェースであってもよい。通信部11は、制御部13の制御に従って外部の装置と通信する。 The communication unit 11 is a communication interface for communicating with other devices. For example, the communication unit 11 is a LAN (Local Area Network) interface such as a NIC (Network Interface Card). Also, the communication unit 11 may be a device connection interface such as USB (Universal Serial Bus). The communication unit 11 may be a wired interface or a wireless interface. The communication unit 11 communicates with an external device under the control of the control unit 13 .
 記憶部12は、DRAM(Dynamic Random Access Memory)、SRAM(Static Random Access Memory)、フラッシュメモリ、ハードディスク等のデータ読み書き可能な記憶装置である。記憶部12は、情報処理装置10の記憶手段として機能する。例えば、記憶部12は、撮像部17で撮像された動画のフレームバッファとして機能する。 The storage unit 12 is a data readable/writable storage device such as a DRAM (Dynamic Random Access Memory), an SRAM (Static Random Access Memory), a flash memory, a hard disk, or the like. The storage unit 12 functions as storage means of the information processing device 10 . For example, the storage unit 12 functions as a frame buffer for moving images captured by the imaging unit 17 .
 制御部13は、情報処理装置10の各部を制御するコントローラ(controller)である。制御部13は、例えば、CPU(Central Processing Unit)、MPU(Micro Processing Unit)、GPU(Graphics Processing Unit)等のプロセッサにより実現される。例えば、制御部13は、情報処理装置10内部の記憶装置に記憶されている各種プログラムを、プロセッサがRAM(Random Access Memory)等を作業領域として実行することにより実現される。なお、制御部13は、ASIC(Application Specific Integrated Circuit)やFPGA(Field Programmable Gate Array)等の集積回路により実現されてもよい。CPU、MPU、GPU、ASIC、及びFPGAは何れもコントローラとみなすことができる。 The control unit 13 is a controller that controls each unit of the information processing device 10 . The control unit 13 is implemented by a processor such as a CPU (Central Processing Unit), MPU (Micro Processing Unit), GPU (Graphics Processing Unit), or the like. For example, the control unit 13 is implemented by the processor executing various programs stored in the storage device inside the information processing apparatus 10 using a RAM (Random Access Memory) or the like as a work area. The control unit 13 may be realized by an integrated circuit such as ASIC (Application Specific Integrated Circuit) or FPGA (Field Programmable Gate Array). CPUs, MPUs, GPUs, ASICs, and FPGAs can all be considered controllers.
 制御部13は、取得部131と、抽出部132と、画像処理部133と、出力制御部134と、学習部135と、推測部136と、を備える。制御部13を構成する各ブロック(取得部131~推測部136)はそれぞれ制御部13の機能を示す機能ブロックである。これら機能ブロックはソフトウェアブロックであってもよいし、ハードウェアブロックであってもよい。例えば、上述の機能ブロックが、それぞれ、ソフトウェア(マイクロプログラムを含む。)で実現される1つのソフトウェアモジュールであってもよいし、半導体チップ(ダイ)上の1つの回路ブロックであってもよい。勿論、各機能ブロックがそれぞれ1つのプロセッサ又は1つの集積回路であってもよい。制御部13は上述の機能ブロックとは異なる機能単位で構成されていてもよい。機能ブロックの構成方法は任意である。 The control unit 13 includes an acquisition unit 131 , an extraction unit 132 , an image processing unit 133 , an output control unit 134 , a learning unit 135 and an estimation unit 136 . Each block (obtaining unit 131 to estimating unit 136) constituting the control unit 13 is a functional block indicating the function of the control unit 13. FIG. These functional blocks may be software blocks or hardware blocks. For example, each of the functional blocks described above may be one software module realized by software (including microprograms), or may be one circuit block on a semiconductor chip (die). Of course, each functional block may be one processor or one integrated circuit. The control unit 13 may be configured in functional units different from the functional blocks described above. The configuration method of the functional blocks is arbitrary.
 なお、制御部13は上述の機能ブロックとは異なる機能単位で構成されていてもよい。また、制御部13を構成する各ブロック(取得部131~推測部136)の一部又は全部の動作を、他の装置が行ってもよい。制御部13を構成する各ブロックの動作は後述する。 It should be noted that the control unit 13 may be configured in functional units different from the functional blocks described above. Also, some or all of the blocks (acquisition unit 131 to estimation unit 136) that make up the control unit 13 may be operated by another device. The operation of each block constituting the control unit 13 will be described later.
 出力部14は、音、光、振動、画像等、外部に各種出力を行う装置である。出力部14は、制御部13の制御に従って、ユーザに各種出力を行う。なお、出力部14は、各種情報を表示する表示装置(表示部)を備える。表示装置は、例えば、液晶ディスプレイ、又は、有機ELディスプレイである。なお、出力部14は、タッチパネル式の表示装置であってもよい。この場合、出力部14は、入力部としても機能する。 The output unit 14 is a device that performs various outputs such as sound, light, vibration, and images to the outside. The output unit 14 performs various outputs to the user under the control of the control unit 13 . Note that the output unit 14 includes a display device (display unit) that displays various types of information. The display device is, for example, a liquid crystal display or an organic EL display. Note that the output unit 14 may be a touch panel display device. In this case, the output section 14 also functions as an input section.
 赤外線照明部15は、不可視の赤外光を出力するIRライト(IR照明光源)である。人間の目で捉えることができる光の波長の上限は760-830nmである。850nm又は940nmといった波長が市場のIR照明光源ではメジャーである。そのため、赤外線照明部15は、典型的には、850nm又は940nmの波長の赤外光を出力するIRライトである。しかしながら、赤外線照明部15は、850nm又は940nmの波長の赤外光を出力するIRライトに限られない。赤外線照明部15は、他の波長の赤外光を出力可能であってもよい。図5は、赤外線照明部15の一例を示す図である。顔をきれいに映すには赤外線照明部15はリングライトであることが望ましい。図5の例では、赤外線照明部15はIR発光素子がレンズ周りにリング状に配置されたIRライトとなっている。 The infrared illumination unit 15 is an IR light (IR illumination light source) that outputs invisible infrared light. The upper limit of the wavelength of light that can be perceived by the human eye is 760-830 nm. Wavelengths such as 850 nm or 940 nm are the major IR illumination sources on the market. Therefore, the infrared illuminator 15 is typically an IR light that outputs infrared light with a wavelength of 850 nm or 940 nm. However, the infrared illuminator 15 is not limited to an IR light that outputs infrared light with a wavelength of 850 nm or 940 nm. The infrared illuminator 15 may be capable of outputting infrared light of other wavelengths. FIG. 5 is a diagram showing an example of the infrared illuminator 15. As shown in FIG. It is desirable that the infrared illuminator 15 be a ring light in order to clearly project the face. In the example of FIG. 5, the infrared illumination unit 15 is an IR light in which IR light emitting elements are arranged in a ring shape around a lens.
 同期信号発生部16は、赤外線照明部15の点滅周期と、撮像部17が撮像する映像(動画)のフレーム周期と、を同期させるための同期信号を生成する同期信号発生器である。同期信号発生部16は、制御部13の制御に従って、同期信号を出力する。 The synchronizing signal generating unit 16 is a synchronizing signal generator that generates a synchronizing signal for synchronizing the blinking period of the infrared illumination unit 15 and the frame period of the video (moving image) captured by the imaging unit 17 . The synchronizing signal generator 16 outputs a synchronizing signal under the control of the control unit 13 .
 撮像部17は、光像を電気信号に変換する変換部である。撮像部17は、例えば、イメージセンサと、イメージセンサから出力されたアナログの画素信号の処理を行う信号処理回路等を備え、レンズから入ってきた光をデジタルデータ(画像データ)に変換する。なお、撮像部17が撮像する画像は、映像(動画)に限られず、静止画であってもよい。なお、撮像部は、カメラと言い換えることができる。 The imaging unit 17 is a conversion unit that converts an optical image into an electrical signal. The imaging unit 17 includes, for example, an image sensor and a signal processing circuit that processes analog pixel signals output from the image sensor, and converts light entering from the lens into digital data (image data). An image captured by the imaging unit 17 is not limited to a video (moving image), and may be a still image. Note that the imaging unit can be rephrased as a camera.
 本実施形態の撮像部17は、可視光と赤外光(IR光)を同時に取得できるカメラ(以下、IRカメラともいう。)である。IRカメラは、通常市販のカメラに入っているIRカットフィルタを除去すれば実現できる。しかしながら、IRライト以外の波長の赤外線のセンサへの影響(=ノイズ)を排除するために、撮像部17は、図6に示すような特性を有するIRカットフィルタ+バンドパスフィルタを備えるのが望ましい。図6は、撮像部17が備えるフィルタの周波数特性を示す図である。なお、図6の例では、撮像部17は、850nmの波長の赤外光を検出するよう構成されている。しかしながら、赤外線照明部15が940nmの波長の赤外光を出力する光源なのであれば、撮像部17は、940nmの波長の赤外光を検出するよう構成されていてもよい。 The imaging unit 17 of this embodiment is a camera (hereinafter also referred to as an IR camera) that can simultaneously acquire visible light and infrared light (IR light). An IR camera can be realized by removing the IR cut filter normally included in commercially available cameras. However, in order to eliminate the influence (=noise) of infrared rays of wavelengths other than IR light on the sensor, it is desirable that the imaging unit 17 be provided with an IR cut filter and a bandpass filter having characteristics as shown in FIG. . FIG. 6 is a diagram showing frequency characteristics of a filter included in the imaging unit 17. As shown in FIG. In the example of FIG. 6, the imaging unit 17 is configured to detect infrared light with a wavelength of 850 nm. However, if the infrared illumination unit 15 is a light source that outputs infrared light with a wavelength of 940 nm, the imaging unit 17 may be configured to detect infrared light with a wavelength of 940 nm.
<<3.情報処理装置の動作>>
 以上、情報処理装置10の構成を説明したが、次に、このような構成を有する情報処理装置10の動作を説明する。
<<3. Operation of Information Processing Apparatus >>
The configuration of the information processing apparatus 10 has been described above. Next, the operation of the information processing apparatus 10 having such a configuration will be described.
<3-1.基本的な手法1>
 図7は、基本的な手法1を説明するための図である。以下、図7を参照しながら基本的な手法1の概要を説明する。
<3-1. Basic method 1>
FIG. 7 is a diagram for explaining basic method 1. An outline of basic method 1 will be described below with reference to FIG. 7.
 情報処理装置10は、ユーザの操作に従って、赤外線照明部15と撮像部17とを動作させる。このとき、情報処理装置10は、撮像部17が撮像する映像(動画)に同期させながら赤外線照明部15を点滅させる。例えば、情報処理装置10は、撮像部17が撮像する映像(動画)のフレーム周期に同期させながら赤外線照明部15を点滅させる。これにより、情報処理装置10は、赤外光の点滅周期に同期して時分割で可視光画像とIR画像とを取得できる。より具体的には、情報処理装置10は、赤外光が照射されていないタイミングのフレームの画像を可視光画像、赤外光が照射されたタイミングのフレームの画像をIR画像、として取得できる。図7の例では、IRライトOFFのときのフレームが可視光画像であり、IRライトONのときのフレームがIR画像である。ここで、IR画像は、対象(図7の例ではユーザ、及びその周囲の物)に赤外光を照射して得られる、可視光成分とIR成分とを含む撮像画像である。 The information processing device 10 operates the infrared illumination unit 15 and the imaging unit 17 according to the user's operation. At this time, the information processing apparatus 10 blinks the infrared illuminator 15 in synchronization with the video (moving image) captured by the imaging unit 17. For example, the information processing device 10 blinks the infrared illumination unit 15 in synchronization with the frame period of the video (moving image) captured by the imaging unit 17. Thereby, the information processing apparatus 10 can acquire a visible light image and an IR image in a time-division manner in synchronization with the blinking cycle of the infrared light. More specifically, the information processing apparatus 10 can acquire the image of the frame when the infrared light is not irradiated as the visible light image and the image of the frame when the infrared light is irradiated as the IR image. In the example of FIG. 7, the frame when the IR light is OFF is the visible light image, and the frame when the IR light is ON is the IR image. Here, an IR image is a captured image containing a visible light component and an IR component, obtained by irradiating a target (the user and surrounding objects in the example of FIG. 7) with infrared light.
 そして、情報処理装置10は、IR画像からIR成分の情報(以下、IR成分情報という。)を抽出する。このIR成分情報は、赤外線照明部15の赤外光がどの方向から対象に当たっているかを示す。このとき、情報処理装置10は、可視光画像とIR画像との差分をIR成分情報として取得してもよい。図7の例では、情報処理装置10は、赤外光が照射されていないタイミングのフレーム(IRライトOFFのフレーム)から始まる連続する2フレームの画像(可視光画像とIR画像)の差分をIR成分情報として取得している。可視光画像とIR画像との差分をIR成分情報とすることで、情報処理装置10は、室内灯(例えば、蛍光灯)の光等、常時存在している光のIR成分をIR成分情報から除去し、純粋に、赤外線照明部15の点灯による対象への赤外光の影響の情報をIR成分情報として抽出できる。 Then, the information processing device 10 extracts information on the IR component (hereinafter referred to as IR component information) from the IR image. This IR component information indicates from which direction the infrared light from the infrared illumination unit 15 hits the target. At this time, the information processing device 10 may acquire the difference between the visible light image and the IR image as the IR component information. In the example of FIG. 7, the information processing apparatus 10 acquires, as the IR component information, the difference between two consecutive frames of images (a visible light image and an IR image) starting from a frame at a timing when the infrared light is not irradiated (an IR light OFF frame). By using the difference between the visible light image and the IR image as the IR component information, the information processing device 10 can remove, from the IR component information, the IR component of light that is always present, such as the light of a room light (for example, a fluorescent lamp), and can purely extract, as the IR component information, information on the influence of the infrared light cast on the target by the lighting of the infrared illumination unit 15.
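By way of a non-limiting illustration, the extraction described above could be sketched as follows, assuming an OpenCV/NumPy environment and 8-bit BGR frames delivered by the imaging unit 17; the function names and the choice of library are assumptions of this sketch, not part of the disclosure.

```python
import cv2
import numpy as np

def extract_ir_component(visible_bgr, ir_on_bgr):
    """Estimate the IR component information as the difference between an
    IR-ON frame and the preceding IR-OFF (visible light) frame.

    Returns a single-channel float image in [0, 1] indicating how strongly
    the infrared light hits each pixel; light that is always present
    (e.g. room lighting) is cancelled by the subtraction.
    """
    vis_gray = cv2.cvtColor(visible_bgr, cv2.COLOR_BGR2GRAY).astype(np.float32)
    ir_gray = cv2.cvtColor(ir_on_bgr, cv2.COLOR_BGR2GRAY).astype(np.float32)
    diff = np.clip(ir_gray - vis_gray, 0.0, 255.0)
    return diff / 255.0
```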
 そして、情報処理装置10は、IR成分情報に基づいて撮像画像への輝度又は明度に関する画像処理を行う。例えば、情報処理装置10は、IR成分情報に基づいて、IR成分情報の抽出に使用した連続する2フレーム(可視光画像とIR画像)の次フレームの画像(可視光画像)に画像処理を行う。図7の例では、情報処理装置10は、IR成分情報に基づいて、可視光画像のHSL色空間での輝度(L)の情報を書き換えている。より具体的には、情報処理装置10は、可視光画像をRGBからHSLに変換し、可視光画像のHSL色空間での輝度(L)にIR成分の強度をマッピングしている。ここでマッピングは、完全な置き換えであってもよいし、オリジナルの輝度とのブレンディングであってもよい。なお、上述の例では、可視光画像(入力画像)で使用される色空間がRGBの例を示しているが、可視光画像(入力画像)で使用される色空間RGBに限られない。例えば、可視光画像(入力画像)で使用される色空間は、YUV等のRGB以外の色空間であってもよい。YUV色空間は、色を輝度(Y)と色差成分(U、V)で表現する色空間である。その他、可視光画像(入力画像)で使用される色空間は、カメラが出力する画像の色空間に合わせて適宜変更可能である。なお、可視光画像(入力画像)で使用される色空間が、輝度や明度などIR成分をマッピング可能な軸を持つ色空間であれば、この色空間の変換ステップは省略可能である。 Then, the information processing device 10 performs image processing related to brightness or brightness on the captured image based on the IR component information. For example, based on the IR component information, the information processing device 10 performs image processing on the next frame image (visible light image) of the two continuous frames (visible light image and IR image) used to extract the IR component information. . In the example of FIG. 7, the information processing apparatus 10 rewrites the luminance (L) information in the HSL color space of the visible light image based on the IR component information. More specifically, the information processing device 10 converts the visible light image from RGB to HSL, and maps the intensity of the IR component to the luminance (L) of the visible light image in the HSL color space. Here the mapping may be a complete replacement or blending with the original luminance. In the above example, the color space used in the visible light image (input image) is RGB, but the color space used in the visible light image (input image) is not limited to RGB. For example, the color space used in the visible light image (input image) may be a color space other than RGB, such as YUV. The YUV color space is a color space that expresses colors with luminance (Y) and color difference components (U, V). In addition, the color space used for the visible light image (input image) can be appropriately changed according to the color space of the image output by the camera. Note that if the color space used in the visible light image (input image) is a color space having an axis capable of mapping IR components such as brightness and lightness, this color space conversion step can be omitted.
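A rough sketch of mapping the IR component onto the lightness channel, continuing the assumptions above (OpenCV stores HLS channels in H, L, S order; the blend ratio alpha is an illustrative parameter and is not specified in the present disclosure):

```python
def relight_with_ir(visible_bgr, ir_component, alpha=0.6):
    """Blend the IR component into the lightness (L) channel of a visible
    light frame, approximating the 'as if lit by visible light' effect.

    alpha = 1.0 corresponds to completely replacing the lightness with the
    IR component; smaller values blend it with the original lightness.
    """
    hls = cv2.cvtColor(visible_bgr, cv2.COLOR_BGR2HLS).astype(np.float32)
    h, l, s = cv2.split(hls)
    ir_l = ir_component * 255.0
    l = np.clip((1.0 - alpha) * l + alpha * ir_l, 0.0, 255.0)
    relit = cv2.merge([h, l, s]).astype(np.uint8)
    return cv2.cvtColor(relit, cv2.COLOR_HLS2BGR)
```

Rewriting the V channel of an HSV image or the Y channel of a YCoCg image, as mentioned above, would follow the same pattern.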
 なお、動きのあるシーンの場合、単純なマッピングではフレーム間にずれが生じる。この場合、情報処理装置10は、IR成分のエッジにぼかし処理(blur)を施してもよい。例えば、情報処理装置10は、IR成分情報とした差分画像に対してエッジをぼかす処理を行い、エッジをぼかした差分画像に基づいて、連続する2フレームの次フレームの画像への画像処理を行う。これにより、情報処理装置10は、動きのあるシーンであっても、違和感の少ない画像を生成できる。 In the case of a scene with motion, a simple mapping produces a misalignment between frames. In this case, the information processing device 10 may blur the edges of the IR component. For example, the information processing apparatus 10 performs edge-blurring processing on the difference image used as the IR component information, and performs image processing on the image of the frame following the two consecutive frames based on the edge-blurred difference image. Thereby, the information processing apparatus 10 can generate an image with little sense of incongruity even in a scene with motion.
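The edge-softening step could be as simple as a Gaussian blur of the difference image; the kernel size below is only an example value.

```python
def soften_ir_edges(ir_component, kernel_size=31):
    """Blur the IR difference image so that small misalignments between the
    IR frame and the frame being relit do not produce hard seams."""
    return cv2.GaussianBlur(ir_component, (kernel_size, kernel_size), 0)
```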
 また、情報処理装置10は、フレーム間の動き予測に基づいてIR成分情報を補正し、補正したIR成分情報に基づいて、連続する2フレームの次フレームの画像への画像処理を行ってもよい。例えば、情報処理装置10は、可視光画像の隣接フレーム間のオプティカルフロー(Optical Flow)を取得し、IR成分を変形してマッピングしてもよい。これによっても、違和感の少ない画像を生成できる。 Further, the information processing apparatus 10 may correct the IR component information based on motion prediction between frames, and perform image processing on the image of the frame following the two consecutive frames based on the corrected IR component information. For example, the information processing device 10 may acquire the optical flow between adjacent frames of the visible light image, deform the IR component accordingly, and then map it. This also makes it possible to generate an image with little sense of incongruity.
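One possible realization of the motion-compensated variant is to warp the IR component along the dense optical flow between the two visible light frames before mapping it; Farneback flow is used below purely as an example, as the disclosure does not prescribe a particular motion-prediction algorithm.

```python
def warp_ir_by_flow(ir_component, measured_visible_bgr, target_visible_bgr):
    """Warp the IR component, measured against one visible light frame, so
    that it aligns with the visible light frame that will actually be relit."""
    measured_gray = cv2.cvtColor(measured_visible_bgr, cv2.COLOR_BGR2GRAY)
    target_gray = cv2.cvtColor(target_visible_bgr, cv2.COLOR_BGR2GRAY)

    # Flow from the target frame back to the measured frame, so that remap()
    # can pull each pixel's IR value from the right place.
    flow = cv2.calcOpticalFlowFarneback(target_gray, measured_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    h, w = ir_component.shape[:2]
    grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
    map_x = (grid_x + flow[..., 0]).astype(np.float32)
    map_y = (grid_y + flow[..., 1]).astype(np.float32)
    return cv2.remap(ir_component, map_x, map_y, cv2.INTER_LINEAR)
```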
 そして、情報処理装置10は、画像をHSLからRGBに変換し、出力部14に出力する。 Then, the information processing device 10 converts the image from HSL to RGB and outputs it to the output unit 14 .
 なお、図7の例では、情報処理装置10は、IR成分情報に基づいて、撮像画像のHSL色空間での輝度(L)の情報を書き換えた。しかしながら、情報処理装置10は、IR成分情報に基づいて、撮像画像のHSV色空間での明度(V)の情報を書き換えるようにしてもよい。情報処理装置10は、IR成分情報に基づいて、撮像画像のYCoCg色空間での輝度(Y)の情報を書き換えるようにしてもよい。勿論、情報処理装置10は、IR成分情報に基づいて、RGB色空間での値を書き換えてもよい。情報処理装置10が画像処理に使用する色空間は、上記した色空間に限られない。最終的な出力映像の色空間は、用途に応じてRGBに限られず、例えば、YUVなどもありうる。 In the example of FIG. 7, the information processing device 10 rewrites the luminance (L) information of the captured image in the HSL color space based on the IR component information. However, the information processing apparatus 10 may rewrite the brightness (V) information of the captured image in the HSV color space based on the IR component information. The information processing device 10 may rewrite the luminance (Y) information of the captured image in the YCoCg color space based on the IR component information. Of course, the information processing device 10 may rewrite the values in the RGB color space based on the IR component information. The color space used by the information processing apparatus 10 for image processing is not limited to the color space described above. The color space of the final output image is not limited to RGB depending on the application, and may be YUV, for example.
 以上、基本的な手法1の概要を説明したが、以下、基本的な手法1を実現するための画像出力処理について説明する。図8は、基本的な手法1を実現するための画像出力処理を示すフローチャートである。以下の処理は、情報処理装置10の制御部13が実行する。制御部13は、ユーザが撮像(例えば、テレビ会議)を開始すると、画像出力処理を開始する。 The outline of the basic method 1 has been described above, and the image output processing for realizing the basic method 1 will be described below. FIG. 8 is a flowchart showing image output processing for realizing basic method 1. FIG. The following processing is executed by the control unit 13 of the information processing device 10 . The control unit 13 starts image output processing when the user starts imaging (for example, a video conference).
 まず、制御部13は、撮像部17を起動する(ステップS101)。上述したように、撮像部17は、可視光と赤外光(IR光)を同時に取得できるIRカメラである。そして、制御部13は、撮像部17が撮像する映像(動画)のフレーム周期に同期させながら赤外線照明部15を点滅させる(ステップS102)。上述したように、赤外線照明部15は、不可視の赤外光を出力するIRライトである。 First, the control unit 13 activates the imaging unit 17 (step S101). As described above, the imaging unit 17 is an IR camera that can simultaneously acquire visible light and infrared light (IR light). Then, the control unit 13 blinks the infrared illumination unit 15 while synchronizing with the frame cycle of the video (moving image) captured by the imaging unit 17 (step S102). As described above, the infrared illuminator 15 is an IR light that outputs invisible infrared light.
 続いて、情報処理装置10の取得部131は、撮像部17が撮像した画像を取得する。赤外線光が映像のフレーム周期に同期しながら点滅しているので、取得部131は、可視光画像とIR画像とを交互に取得することになる(ステップS103)。ここで、IR画像は、赤外線照明部15が照射する赤外光の影響によるIR成分のみならず、可視光成分をも含む撮像画像である。 Subsequently, the acquisition unit 131 of the information processing device 10 acquires the image captured by the imaging unit 17 . Since the infrared light is blinking in synchronization with the frame period of the video, the acquisition unit 131 alternately acquires the visible light image and the IR image (step S103). Here, the IR image is a captured image including not only the IR component under the influence of the infrared light emitted by the infrared illuminator 15 but also the visible light component.
 情報処理装置10の抽出部132は、IR画像からIR成分情報を抽出する(ステップS104)。具体的には、抽出部132は、可視光画像とIR画像との差分をIR成分情報として取得する。基本的な手法1では、抽出部132は、IRライトOFFのタイミングのフレームから始まる連続する2フレームの画像(可視光画像とIR画像)の差分をIR成分情報として取得する。 The extraction unit 132 of the information processing device 10 extracts IR component information from the IR image (step S104). Specifically, the extraction unit 132 acquires the difference between the visible light image and the IR image as IR component information. In basic method 1, the extraction unit 132 acquires, as IR component information, the difference between two consecutive frames of images (a visible light image and an IR image) starting from the frame at which the IR light is turned off.
 続いて、情報処理装置10の画像処理部133は、IR成分情報に基づいて撮像画像への輝度又は明度に関する画像処理を行う(ステップS105)。例えば、情報処理装置10は、IR成分情報に基づいて、IR成分情報の抽出に使用した連続する2フレーム(可視光画像とIR画像)の次フレームの画像(可視光画像)に画像処理を行う。例えば、画像処理部133は、IR成分情報に基づいて、可視光画像の輝度の情報を書き換える。 Subsequently, the image processing unit 133 of the information processing device 10 performs image processing relating to luminance or brightness on the captured image based on the IR component information (step S105). For example, based on the IR component information, the information processing device 10 performs image processing on the image (visible light image) of the frame following the two consecutive frames (visible light image and IR image) used to extract the IR component information. For example, the image processing unit 133 rewrites the luminance information of the visible light image based on the IR component information.
 そして、情報処理装置10の出力制御部134は、画像処理を行った撮像画像を出力部14に出力する(ステップS106)。 Then, the output control unit 134 of the information processing device 10 outputs the captured image subjected to the image processing to the output unit 14 (step S106).
 その後、情報処理装置10の制御部13は、撮影が終了したか判別する(ステップS107)。撮影が終了していない場合(ステップS107:No)、制御部13は、ステップS103に処理を戻す。撮影が終了している場合(ステップS107:Yes)、制御部13は、撮像部17と赤外線照明部15の動作を停止する(ステップS108)。撮像部17と赤外線照明部15の動作が停止したら、制御部13は、画像出力処理を終了する。 After that, the control unit 13 of the information processing device 10 determines whether or not the shooting has ended (step S107). If the shooting has not ended (step S107: No), the control unit 13 returns the process to step S103. If the shooting has ended (step S107: Yes), the control unit 13 stops the operations of the imaging unit 17 and the infrared illumination unit 15 (step S108). When the operations of the imaging unit 17 and the infrared illumination unit 15 are stopped, the control unit 13 ends the image output processing.
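Putting the steps of FIG. 8 together, the per-frame loop of basic method 1 might look like the following sketch. The camera, ir_light, and display objects are hypothetical abstractions of the imaging unit 17, the infrared illumination unit 15, and the output unit 14; only the ordering of operations reflects the flowchart above.

```python
def run_basic_method_1(camera, ir_light, display):
    """Simplified loop corresponding to steps S101-S108 of FIG. 8."""
    camera.start()                                       # S101: start imaging
    ir_light.blink_synced_to(camera.frame_sync)          # S102: blink IR per frame
    visible, ir_component = None, None
    try:
        while camera.is_capturing():                     # S107: loop until shooting ends
            frame, ir_was_on = camera.read()             # S103: alternate visible / IR frames
            if ir_was_on:
                if visible is not None:
                    # S104: IR component = difference of the (visible, IR) pair
                    ir_component = extract_ir_component(visible, frame)
            else:
                if ir_component is not None:
                    # S105: relight the next visible frame with the IR component
                    relit = relight_with_ir(frame, soften_ir_edges(ir_component))
                else:
                    relit = frame
                display.show(relit)                       # S106: output the processed image
                visible = frame
    finally:
        ir_light.off()                                    # S108: stop IR light and camera
        camera.stop()
```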
 本手法によれば、情報処理装置10は、不可視の赤外光の照射情報(すなわち、IR成分情報)に基づいて、あたかも可視光が対象(例えば、ユーザ)に当たっているかのように見えるよう画像処理を行っている。これにより、ユーザは、まぶしい思いをすることなく、安定したライティングで動画撮影ができる。 According to this method, the information processing apparatus 10 performs image processing based on the irradiation information of invisible infrared light (that is, the IR component information) so that it appears as if visible light were hitting the target (for example, the user). As a result, the user can shoot moving images with stable lighting without being dazzled.
<3-2.基本的な手法2>
 基本的な手法1では、情報処理装置10は、連続する2フレーム(可視光画像とIR画像)の次フレームの画像(可視光画像)に画像処理を行った。しかし、動きのあるシーンの場合、この方法では、画像処理後の画像が不自然な画像になる恐れがある。そこで、基本的な手法2では、画像処理を行うフレームを差分画像の生成に使用したフレームとすることで、動きのあるシーンでも不自然な画像にならないようにする。
<3-2. Basic method 2>
In basic method 1, the information processing apparatus 10 performs image processing on the next frame image (visible light image) of two continuous frames (visible light image and IR image). However, in the case of a scene with motion, this method may result in an unnatural image after image processing. Therefore, in basic method 2, the frame used for generating the difference image is used as the frame to be subjected to image processing, so that even in a scene with motion, the image does not look unnatural.
 図9は、基本的な手法2を説明するための図である。以下、図9を参照しながら基本的な手法2の概要を説明する。 FIG. 9 is a diagram for explaining basic method 2. The outline of basic method 2 will be described below with reference to FIG. 9.
 情報処理装置10は、ユーザの操作に従って、赤外線照明部15と撮像部17とを動作させる。このとき、情報処理装置10は、撮像部17が撮像する映像(動画)に同期させながら赤外線照明部15を点滅させる。例えば、情報処理装置10は、撮像部17が撮像する映像(動画)のフレーム周期に同期させながら赤外線照明部15を点滅させる。これにより、情報処理装置10は、赤外光の点滅周期に同期して時分割で可視光画像とIR画像とを取得できる。 The information processing device 10 operates the infrared illumination unit 15 and the imaging unit 17 according to the user's operation. At this time, the information processing apparatus 10 blinks the infrared illuminator 15 while synchronizing with the image (moving image) captured by the imaging unit 17 . For example, the information processing device 10 blinks the infrared illumination unit 15 while synchronizing with the frame period of the video (moving image) captured by the imaging unit 17 . Thereby, the information processing apparatus 10 can acquire a visible light image and an IR image in a time division manner in synchronization with the blinking cycle of infrared light.
 そして、情報処理装置10は、IR画像からIR成分の情報を抽出する。このとき、情報処理装置10は、可視光画像とIR画像との差分をIR成分情報として取得する。図9の例では、情報処理装置10は、赤外光が照射されたタイミングのフレーム(IRライトONのフレーム)から始まる連続する2フレームの画像(可視光画像とIR画像)の差分をIR成分情報として取得している。 Then, the information processing device 10 extracts the IR component information from the IR image. At this time, the information processing device 10 acquires the difference between the visible light image and the IR image as the IR component information. In the example of FIG. 9, the information processing apparatus 10 acquires, as the IR component information, the difference between two consecutive frames of images (a visible light image and an IR image) starting from the frame at the timing when the infrared light is irradiated (an IR light ON frame).
 そして、情報処理装置10は、IR成分情報に基づいて撮像画像への輝度又は明度に関する画像処理を行う。例えば、情報処理装置10は、IR成分情報に基づいて、IR成分情報の抽出に使用した連続する2フレーム(IR画像と可視光画像)の最後のフレームの画像(可視光画像)に画像処理を行う。図9の例では、情報処理装置10は、IR成分情報に基づいて、可視光画像のHSL色空間での輝度(L)の情報を書き換えている。ここでHSL色空間とは、色を色相(Hue)、彩度(Saturation)、輝度(Lightness)の3成分で表現する色空間である。 Then, the information processing device 10 performs image processing relating to luminance or brightness on the captured image based on the IR component information. For example, based on the IR component information, the information processing device 10 performs image processing on the image (visible light image) of the last frame of the two consecutive frames (IR image and visible light image) used to extract the IR component information. In the example of FIG. 9, the information processing apparatus 10 rewrites the luminance (L) information in the HSL color space of the visible light image based on the IR component information. Here, the HSL color space is a color space that expresses a color with three components: hue (Hue), saturation (Saturation), and lightness (Lightness).
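In terms of the helpers sketched for basic method 1, the only change in basic method 2 is which frames are paired and which frame is relit; roughly:

```python
def relight_method_2(ir_on_frame, following_visible_frame):
    """Basic method 2: the difference is taken over the pair starting at the
    IR-ON frame, and the relit frame is the visible frame of that same pair,
    so the IR component and the relit image are only one frame apart."""
    ir_component = extract_ir_component(following_visible_frame, ir_on_frame)
    return relight_with_ir(following_visible_frame, ir_component)
```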
 そして、情報処理装置10は、画像をHSLからRGBに変換し、出力部14に出力する。 Then, the information processing device 10 converts the image from HSL to RGB and outputs it to the output unit 14 .
 なお、図9の例では、情報処理装置10は、IR成分情報に基づいて、撮像画像のHSL色空間での輝度(L)の情報を書き換えた。しかしながら、情報処理装置10は、IR成分情報に基づいて、撮像画像のHSV色空間での明度(V)の情報を書き換えるようにしてもよい。ここで、HSV色空間とは、色を色相(Hue)、彩度(Saturation/Chroma)、明度(Value/Brightness)の3成分で表現する色空間である。情報処理装置10は、IR成分情報に基づいて、撮像画像のYCoCg色空間での輝度(Y)の情報を書き換えるようにしてもよい。YCoCg色空間は、色を輝度(Y)と色差成分(Co(オレンジの濃さ)、Cg(緑の濃さ))で表現する色空間である。勿論、情報処理装置10は、IR成分情報に基づいて、RGB色空間での値を書き換えてもよい。情報処理装置10が画像処理に使用する色空間は、上記した色空間に限られない。 Note that in the example of FIG. 9, the information processing device 10 rewrites the information of the luminance (L) in the HSL color space of the captured image based on the IR component information. However, the information processing apparatus 10 may rewrite the brightness (V) information of the captured image in the HSV color space based on the IR component information. Here, the HSV color space is a color space that expresses colors with three components of hue (Hue), saturation (Saturation/Chroma), and brightness (Value/Brightness). The information processing device 10 may rewrite the luminance (Y) information of the captured image in the YCoCg color space based on the IR component information. The YCoCg color space is a color space that expresses colors by luminance (Y) and color difference components (Co (darkness of orange) and Cg (darkness of green)). Of course, the information processing device 10 may rewrite the values in the RGB color space based on the IR component information. The color space used by the information processing apparatus 10 for image processing is not limited to the color space described above.
 以上、基本的な手法2の概要を説明したが、以下、基本的な手法2を実現するための画像出力処理について説明する。図10は、基本的な手法2を実現するための画像出力処理を示すフローチャートである。以下の処理は、情報処理装置10の制御部13が実行する。制御部13は、ユーザが撮像(例えば、テレビ会議)を開始すると、画像出力処理を開始する。 The outline of the basic method 2 has been described above, and the image output processing for realizing the basic method 2 will be described below. FIG. 10 is a flowchart showing image output processing for realizing basic method 2. FIG. The following processing is executed by the control unit 13 of the information processing device 10 . The control unit 13 starts image output processing when the user starts imaging (for example, a video conference).
 まず、制御部13は、撮像部17を起動する(ステップS201)。そして、制御部13は、撮像部17が撮像する映像(動画)のフレーム周期に同期させながら赤外線照明部15を点滅させる(ステップS202)。 First, the control unit 13 activates the imaging unit 17 (step S201). Then, the control unit 13 blinks the infrared illumination unit 15 while synchronizing with the frame period of the video (moving image) captured by the imaging unit 17 (step S202).
 続いて、情報処理装置10の取得部131は、撮像部17が撮像した画像を取得する。赤外光が映像のフレーム周期に同期しながら点滅しているので、取得部131は、可視光画像とIR画像とを交互に取得することになる(ステップS203)。 Subsequently, the acquisition unit 131 of the information processing device 10 acquires the image captured by the imaging unit 17 . Since the infrared light is blinking in synchronization with the frame cycle of the video, the acquisition unit 131 alternately acquires the visible light image and the IR image (step S203).
 情報処理装置10の抽出部132は、IR画像からIR成分情報を抽出する(ステップS204)。具体的には、抽出部132は、可視光画像とIR画像との差分をIR成分情報として取得する。基本的な手法2では、抽出部132は、IRライトONのタイミングのフレームから始まる連続する2フレームの画像(IR画像と可視光画像)の差分をIR成分情報として取得する。 The extraction unit 132 of the information processing device 10 extracts IR component information from the IR image (step S204). Specifically, the extraction unit 132 acquires the difference between the visible light image and the IR image as IR component information. In basic method 2, the extraction unit 132 acquires, as IR component information, the difference between two consecutive frames of images (an IR image and a visible light image) starting from the frame at which the IR light is ON.
 続いて、情報処理装置10の画像処理部133は、IR成分情報に基づいて撮像画像への輝度又は明度に関する画像処理を行う(ステップS205)。例えば、情報処理装置10は、IR成分情報に基づいて、IR成分情報の抽出に使用した連続する2フレーム(IR画像と可視光画像と)の最後フレームの画像(可視光画像)に画像処理を行う。例えば、画像処理部133は、IR成分情報に基づいて、可視光画像の輝度の情報を書き換える。 Subsequently, the image processing unit 133 of the information processing device 10 performs image processing related to luminance or brightness of the captured image based on the IR component information (step S205). For example, based on the IR component information, the information processing device 10 performs image processing on the last frame image (visible light image) of the two consecutive frames (IR image and visible light image) used to extract the IR component information. conduct. For example, the image processing unit 133 rewrites the luminance information of the visible light image based on the IR component information.
 そして、情報処理装置10の出力制御部134は、画像処理を行った撮像画像を出力部14に出力する(ステップS206)。 Then, the output control unit 134 of the information processing device 10 outputs the captured image subjected to the image processing to the output unit 14 (step S206).
 その後、情報処理装置10の制御部13は、撮影が終了したか判別する(ステップS207)。撮影が終了していない場合(ステップS207:No)、制御部13は、ステップS203に処理を戻す。撮影が終了している場合(ステップS207:Yes)、制御部13は、撮像部17と赤外線照明部15の動作を停止する(ステップS208)。撮像部17と赤外線照明部15の動作が停止したら、制御部13は、画像出力処理を終了する。 After that, the control unit 13 of the information processing device 10 determines whether or not the shooting has ended (step S207). If the shooting has not ended (step S207: No), the control unit 13 returns the process to step S203. If the shooting has ended (step S207: Yes), the control unit 13 stops the operations of the imaging unit 17 and the infrared illumination unit 15 (step S208). When the operations of the imaging unit 17 and the infrared illumination unit 15 are stopped, the control unit 13 ends the image output processing.
 本手法によれば、情報処理装置10はIR成分情報(差分画像)の生成に使用したフレームの一方のフレームに対して画像処理を行っているので、動きがあるシーンでもIR成分情報と画像処理対象の画像との時間のずれが小さい。そのため、ユーザはより違和感の少ない映像を得ることができる。 According to this method, the information processing apparatus 10 performs image processing on one of the two frames used to generate the IR component information (difference image), so even in a scene with motion, the time lag between the IR component information and the image to be processed is small. Therefore, the user can obtain a video with even less sense of incongruity.
<3-3.発展的な手法>
 基本的な手法1、2では、情報処理装置10は、IR成分情報に基づいて撮像画像に画像処理を行った。しかし、情報処理装置10は、画像処理前の画像と画像処理後の画像とに基づく学習により学習モデルを生成し、生成した学習モデルを使用して、撮像画像から画像処理後の画像を推測してもよい。これにより、情報処理装置10は、ユーザに赤外光を照射しなくても、あたかもユーザがライティングされているかのような画像を取得できる。
<3-3. Advanced method>
In basic methods 1 and 2, the information processing device 10 performs image processing on the captured image based on the IR component information. However, the information processing apparatus 10 may generate a learning model by learning based on the images before and after the image processing, and use the generated learning model to estimate the post-processing image from the captured image. Accordingly, the information processing apparatus 10 can acquire an image as if the user were illuminated, without irradiating the user with infrared light.
 図11及び図12は、発展的な手法を説明するための図である。図11が学習モデルの学習完了までの処理を示す図であり、図12が学習モデルの学習完了後の処理を示す図である。以下、図11及び図12を参照しながら発展的な手法の概要を説明する。  Figures 11 and 12 are diagrams for explaining the advanced method. FIG. 11 is a diagram showing processing up to completion of learning of the learning model, and FIG. 12 is a diagram showing processing after completion of learning of the learning model. An outline of the advanced method will be described below with reference to FIGS. 11 and 12. FIG.
 まず、図11を参照しながら、学習モデルの学習完了までの処理を説明する。情報処理装置10は、ユーザの操作に従って、赤外線照明部15と撮像部17とを動作させる。このとき、情報処理装置10は、撮像部17が撮像する映像(動画)に同期させながら赤外線照明部15を点滅させる。これにより、情報処理装置10は、赤外光の点滅周期に同期して時分割で可視光画像とIR画像とを取得できる。そして、情報処理装置10は、IR画像からIR成分の情報を抽出する。そして、情報処理装置10は、IR成分情報に基づいて撮像画像への輝度又は明度に関する画像処理を行う。そして、情報処理装置10は、画像処理後の画像を出力部14に出力する。 First, referring to FIG. 11, the processing up to the completion of learning of the learning model will be described. The information processing apparatus 10 operates the infrared illumination section 15 and the imaging section 17 according to the user's operation. At this time, the information processing apparatus 10 blinks the infrared illuminator 15 while synchronizing with the image (moving image) captured by the imaging unit 17 . Thereby, the information processing apparatus 10 can acquire a visible light image and an IR image in a time division manner in synchronization with the blinking cycle of infrared light. Then, the information processing device 10 extracts IR component information from the IR image. Then, the information processing apparatus 10 performs image processing regarding brightness or brightness on the captured image based on the IR component information. Then, the information processing device 10 outputs the image after image processing to the output unit 14 .
 情報処理装置10は、画像処理に並行して、画像処理前の画像と画像処理後の画像とに基づき学習モデルの学習を行う。学習モデルは、例えば、画像処理前の画像と画像処理後の画像との関係を学習するためのモデルである。情報処理装置10は、画像処理前の画像と画像処理後の画像との差分が最小化するよう学習モデルの学習を行う。 In parallel with image processing, the information processing device 10 learns a learning model based on the image before image processing and the image after image processing. A learning model is, for example, a model for learning the relationship between an image before image processing and an image after image processing. The information processing apparatus 10 learns the learning model so as to minimize the difference between the image before image processing and the image after image processing.
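As one possible realization of this learning, the pairs of pre-processing and post-processing frames collected while the IR light is active could be used to train a small image-to-image network that minimizes the pixel-wise difference between its output and the post-processing image. The following PyTorch sketch is illustrative only; the network architecture, loss, and hyperparameters are assumptions and are not specified in the present disclosure.

```python
import torch
import torch.nn as nn

class RelightNet(nn.Module):
    """Small convolutional network mapping a visible light frame to an
    estimate of its relit (post-image-processing) counterpart."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.body(x)

def train_step(model, optimizer, visible_batch, relit_batch):
    """One update: bring the model's output for the pre-processing image
    closer to the image produced by the IR-based image processing."""
    optimizer.zero_grad()
    loss = nn.functional.l1_loss(model(visible_batch), relit_batch)
    loss.backward()
    optimizer.step()
    return loss.item()
```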
 学習モデルは、例えば、ニューラルネットワークモデル等の機械学習モデルである。ニューラルネットワークモデルは、複数のノードを含む入力層、中間層(又は、隠れ層)、出力層と呼ばれる層から構成され、各ノードはエッジを介して接続される。各層は、活性化関数と呼ばれる関数を持ち、各エッジは重み付けされる。学習モデルは、1又は複数の中間層(又は、隠れ層)を有する。学習モデルをニューラルネットワークモデルとする場合、学習モデルの学習とは、例えば、中間層(又は、隠れ層)の層数、各層のノード数、又は各エッジの重み等を設定することを意味する。 A learning model is, for example, a machine learning model such as a neural network model. A neural network model is composed of layers called an input layer containing a plurality of nodes, an intermediate layer (or hidden layer), and an output layer, and each node is connected via edges. Each layer has a function called activation function, and each edge is weighted. A learning model has one or more intermediate layers (or hidden layers). When the learning model is a neural network model, learning the learning model means, for example, setting the number of intermediate layers (or hidden layers), the number of nodes in each layer, or the weight of each edge.
 ここで、ニューラルネットワークモデルは、ディープラーニングによるモデルであってもよい。この場合、ニューラルネットワークモデルは、DNN(Deep Neural Network)と呼ばれる形態のモデルであってもよい。また、ニューラルネットワークモデルは、CNN(Convolution Neural Network)、RNN(Recurrent Neural Network)、又はLSTM(Long Short-Term Memory)と呼ばれる形態のモデルであってもよい。勿論、ニューラルネットワークモデルはこれらの形態のモデルに限定されない。 Here, the neural network model may be a model based on deep learning. In this case, the neural network model may be a model called DNN (Deep Neural Network). Also, the neural network model may be a model called a CNN (Convolution Neural Network), RNN (Recurrent Neural Network), or LSTM (Long Short-Term Memory). Of course, neural network models are not limited to these forms of models.
 また、学習モデルは、ニューラルネットワークモデルに限定されない。例えば、学習モデルは、強化学習によるモデルであってもよい。強化学習では、試行錯誤を通じて価値が最大化するような行動(設定)が学習される。その他、学習モデルは、ロジスティック回帰モデルであってもよい。 Also, learning models are not limited to neural network models. For example, the learning model may be a model based on reinforcement learning. In reinforcement learning, actions (settings) that maximize value are learned through trial and error. Alternatively, the learning model may be a logistic regression model.
 なお、学習モデルは、複数のモデルで構成されていてもよい。例えば、学習モデルは、複数のニューラルネットワークモデルから構成されていてもよい。より具体的には、学習モデルは、例えば、CNN、RNN、及び、LSTMの中から選択される複数のニューラルネットワークモデルから構成されていてもよい。学習モデルが複数のニューラルネットワークモデルから構成される場合、これら複数のニューラルネットワークモデルは、従属関係にあってもよいし、並列関係にあってもよい。 The learning model may consist of multiple models. For example, a learning model may consist of multiple neural network models. More specifically, the learning model may consist of multiple neural network models selected from, for example, CNN, RNN, and LSTM. When a learning model is composed of multiple neural network models, these multiple neural network models may be in a dependent relationship or in a parallel relationship.
 情報処理装置10は、学習モデルを構成する情報として、モデルの構造や接続係数を示す文字列や数値等を記憶部12に記憶する。 The information processing device 10 stores, in the storage unit 12, character strings, numerical values, and the like that indicate the model structure and connection coefficients as information that constitutes the learning model.
 The learning model may be a model that has been trained, using pairs of an image before image processing (a captured image such as a visible light image) and the corresponding image after image processing as learning data, to output the image presumed to result from the image processing (hereinafter referred to as an estimated image) when an image before image processing (for example, a captured image such as a visible light image; hereinafter referred to as a captured image) is input. In this case, the learning model includes an input layer to which the captured image is input, an output layer from which the estimated image is output, a first element that belongs to any layer from the input layer to the output layer other than the output layer, and a second element whose value is calculated based on the first element and a weight of the first element. The learning model may be a model for causing a computer to function so that, by treating each element belonging to each layer other than the output layer as the first element and performing, on the information input to the input layer, an operation based on the first element and the weight of the first element (that is, the connection coefficient), the estimated image corresponding to the captured image input to the input layer is output from the output layer.
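 For illustration only, the following is a minimal sketch of such an image-to-image learning model with an input layer, intermediate (hidden) layers, and an output layer. The use of PyTorch and the specific layer and channel sizes are assumptions made for this example and are not part of the disclosure.

```python
# Minimal sketch (assumptions: PyTorch, toy layer sizes) of a learning model that
# maps a captured image (before image processing) to an estimated image
# (after image processing).
import torch
import torch.nn as nn

class RelightModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),   # input layer (RGB captured image)
            nn.ReLU(),
            nn.Conv2d(16, 16, kernel_size=3, padding=1),  # intermediate (hidden) layer
            nn.ReLU(),
            nn.Conv2d(16, 3, kernel_size=3, padding=1),   # output layer (RGB estimated image)
            nn.Sigmoid(),                                  # keep pixel values in [0, 1]
        )

    def forward(self, captured: torch.Tensor) -> torch.Tensor:
        # captured: (N, 3, H, W) tensor in [0, 1]; returns an estimated image of the same shape.
        return self.net(captured)
```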
 ここで、学習モデルが、DNN等、1つまたは複数の中間層を有するニューラルネットワークで実現されるとする。この場合、学習モデルが含む第1要素は、入力層または中間層が有するいずれかのノードに対応する。また、第2要素は、第1要素と対応するノードから値が伝達されるノードである次段のノードに対応する。また、第1要素の重みは、第1要素と対応するノードから第2要素と対応するノードに伝達される値に対して考慮される重みである接続係数に対応する。 Here, it is assumed that the learning model is realized by a neural network with one or more hidden layers, such as DNN. In this case, the first element included in the learning model corresponds to any node of the input layer or intermediate layer. Also, the second element corresponds to the next node, which is a node to which the value is transmitted from the node corresponding to the first element. Also, the weight of the first element corresponds to the connection coefficient, which is the weight considered for the value transmitted from the node corresponding to the first element to the node corresponding to the second element.
 また、学習モデルが「y=a1*x1+a2*x2+・・・+ai*xi」で示す回帰モデルで実現されるとする。この場合、学習モデルが含む第1要素は、x1やx2等といった入力データ(xi)に対応する。また、第1要素の重みは、xiに対応する係数aiに対応する。ここで、回帰モデルは、入力層と出力層とを有する単純パーセプトロンと見做すことができる。各モデルを単純パーセプトロンと見做した場合、第1要素は、入力層が有するいずれかのノードに対応し、第2要素は、出力層が有するノードと見做すことができる。 Also, assume that the learning model is realized by a regression model indicated by "y=a1*x1+a2*x2+...+ai*xi". In this case, the first element included in the learning model corresponds to input data (xi) such as x1 and x2. Also, the weight of the first element corresponds to the coefficient ai corresponding to xi. Here, the regression model can be viewed as a simple perceptron with an input layer and an output layer. When each model is regarded as a simple perceptron, the first element can be regarded as a node of the input layer, and the second element can be regarded as a node of the output layer.
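 Read as a simple perceptron, the regression model above can be evaluated directly, as in the following minimal sketch; the coefficient and input values are placeholders chosen only for illustration.

```python
# Minimal sketch of the regression model y = a1*x1 + a2*x2 + ... + ai*xi,
# viewed as a simple perceptron with input nodes (xi) and one output node (y).
import numpy as np

a = np.array([0.4, 0.3, 0.3])   # coefficients ai (weights of the first elements)
x = np.array([0.2, 0.8, 0.5])   # input data x1, x2, x3 (first elements)
y = float(np.dot(a, x))         # value of the output node (second element)
print(y)                        # approximately 0.47
```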
 情報処理装置10は、ニューラルネットワークや回帰モデル等、任意の構造を有するモデルを用いて、出力する情報の算出を行う。具体的には、学習モデルは、撮像画像(例えば、画像処理前の可視光画像)が入力された場合に、推測画像を出力するように係数が設定される。例えば、情報処理装置10は、画像処理後の画像と、撮像画像(画像処理前の可視光画像)を学習モデルに入力して得られる値と、の類似度に基づいて係数を設定する。情報処理装置10は、このような学習モデルを用いて、撮像画像から推測画像を生成する。 The information processing device 10 uses a model having an arbitrary structure, such as a neural network or a regression model, to calculate information to be output. Specifically, the learning model is set with coefficients so that an estimated image is output when a captured image (for example, a visible light image before image processing) is input. For example, the information processing apparatus 10 sets the coefficient based on the degree of similarity between the image after image processing and the value obtained by inputting the captured image (visible light image before image processing) into the learning model. The information processing apparatus 10 uses such a learning model to generate an estimated image from the captured image.
 なお、上述の例では、学習モデルの一例として、撮像画像が入力された場合に、推測画像を出力するモデルを示した。しかし、実施形態に係る学習モデルは、学習モデルにデータの入出力を繰り返すことで得られる結果に基づいて生成されるモデルであってもよい。 In the above example, as an example of a learning model, a model that outputs an estimated image when a captured image is input is shown. However, the learning model according to the embodiment may be a model that is generated based on results obtained by repeatedly inputting and outputting data to the learning model.
 また、情報処理装置10がGAN(Generative Adversarial Networks)を用いた学習或いは出力情報の生成を行う場合、学習モデルは、GANの一部を構成するモデルであってもよい。 Also, when the information processing apparatus 10 performs learning or generation of output information using GAN (Generative Adversarial Networks), the learning model may be a model that constitutes part of the GAN.
 The learning device that performs the learning of the learning model may be the information processing device 10, or may be another information processing device. For example, assume that the information processing apparatus 10 performs the learning of the learning model. In this case, the information processing apparatus 10 trains the learning model and stores the trained learning model in the storage unit 12. More specifically, the information processing apparatus 10 sets the connection coefficients of the learning model so that the learning model outputs an estimated image when a captured image is input to the learning model.
 例えば、情報処理装置10は、学習モデルが有する入力層のノードに撮像画像を入力し、各中間層を辿って学習モデルの出力層までデータを伝播させることで、推測画像を出力させる。そして、情報処理装置10は、学習モデルが実際に出力した推測画像と、実際の画像処理後の画像との差に基づいて、学習モデルの接続係数を修正する。例えば、情報処理装置10は、バックプロパゲーション等の手法を用いて、接続係数の修正を行ってもよい。このとき、情報処理装置10は、第1の実測データを示すベクトルと、学習モデルが実際に出力した値を示すベクトルとのコサイン類似度に基づいて、接続係数の修正を行ってもよい。 For example, the information processing apparatus 10 inputs a captured image to a node in the input layer of the learning model, propagates the data to the output layer of the learning model by following each intermediate layer, and outputs an estimated image. Then, the information processing apparatus 10 corrects the connection coefficients of the learning model based on the difference between the estimated image actually output by the learning model and the actual image after image processing. For example, the information processing apparatus 10 may correct the connection coefficients using a technique such as back propagation. At this time, the information processing apparatus 10 may correct the connection coefficient based on the cosine similarity between the vector representing the first measured data and the vector representing the value actually output by the learning model.
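 For illustration, the following is a minimal sketch of one learning step along these lines, in which the connection coefficients are corrected by backpropagation so that the estimated image approaches the actual image after image processing. The use of PyTorch, the mean squared error loss, the SGD optimizer, and the tiny stand-in network are assumptions made for this example, not part of the disclosure.

```python
# Minimal sketch (assumptions: PyTorch, MSE loss, SGD) of one training step of the
# learning model. The difference between the estimated image and the actual image
# after image processing drives the correction of the connection coefficients.
import torch
import torch.nn as nn

model = nn.Sequential(                      # stand-in for the learning model
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 3, 3, padding=1), nn.Sigmoid(),
)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

def training_step(captured: torch.Tensor, processed: torch.Tensor) -> float:
    # captured: image before image processing, processed: actual image after image processing.
    estimated = model(captured)             # forward pass through input, hidden, and output layers
    loss = loss_fn(estimated, processed)    # difference between estimated and actual result
    optimizer.zero_grad()
    loss.backward()                         # backpropagation
    optimizer.step()                        # correct the connection coefficients
    return loss.item()

# Example with dummy (N, C, H, W) tensors:
print(training_step(torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64)))
```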
 なお、情報処理装置10は、いかなる学習アルゴリズムを用いて学習モデルを学習してもよい。例えば、情報処理装置10は、ニューラルネットワーク、サポートベクターマシン(support vector machine)、クラスタリング、強化学習等の学習アルゴリズムを用いて、学習モデルを学習してもよい。 Note that the information processing device 10 may learn the learning model using any learning algorithm. For example, the information processing device 10 may learn a learning model using learning algorithms such as neural networks, support vector machines, clustering, and reinforcement learning.
 次に、図12を参照しながら、学習モデルの学習完了後の処理を説明する。学習モデルの学習が完了したら(例えば、撮影開始又は学習開始から所定期間が経過したら)、情報処理装置10は、推測画像の生成を開始する。このとき、情報処理装置10は、生成された学習モデルを使って、新たに取得した撮像画像からこの撮像画像の画像処理後の画像を推測する。 Next, referring to FIG. 12, the processing after the learning of the learning model is completed will be described. When the learning of the learning model is completed (for example, when a predetermined period of time has elapsed since the start of shooting or the start of learning), the information processing apparatus 10 starts generating an estimated image. At this time, the information processing apparatus 10 uses the generated learning model to estimate an image after image processing of the captured image from the newly acquired captured image.
 そして、情報処理装置10は、出力部14に出力される画像を、画像処理により生成された画像から、学習モデルを使って推測された画像(以下、推測画像という。)に切り替える。 Then, the information processing apparatus 10 switches the image output to the output unit 14 from the image generated by the image processing to the image estimated using the learning model (hereinafter referred to as the estimated image).
 なお、情報処理装置10は、出力部14に出力される画像が画像処理により生成された画像から推測画像に切り替わるタイミングで、赤外線照明部15の赤外光の出力を停止してもよい。これにより、学習モデルの学習完了前は、情報処理装置10が撮影するフレームの半分が可視光画像であったものが、学習モデルの学習完了後は、全てのフレームが可視光画像になる。そして、情報処理装置10は、学習モデルを使って可視光画像の推測画像を生成し、生成した推測画像を出力部14に出力する。これにより、情報処理装置10は、出力部14に出力される映像のフレームレートを学習完了前の2倍とすることができる。 The information processing device 10 may stop outputting infrared light from the infrared illumination unit 15 at the timing when the image output to the output unit 14 is switched from the image generated by the image processing to the estimated image. As a result, half of the frames captured by the information processing apparatus 10 are visible light images before the learning of the learning model is completed, but all the frames are visible light images after the learning of the learning model is completed. The information processing apparatus 10 then generates an estimated image of the visible light image using the learning model, and outputs the generated estimated image to the output unit 14 . As a result, the information processing apparatus 10 can double the frame rate of the video output to the output unit 14 before the completion of learning.
 以上、発展的な手法の概要を説明したが、以下、発展的な手法を実現するための画像出力処理について説明する。図13は、発展的な手法を実現するための画像出力処理を示すフローチャートである。以下の処理は、情報処理装置10の制御部13が実行する。制御部13は、ユーザが撮像(例えば、テレビ会議)を開始すると、画像出力処理を開始する。 The outline of the advanced method has been explained above, and the image output processing for realizing the advanced method will be explained below. FIG. 13 is a flow chart showing image output processing for realizing the advanced method. The following processing is executed by the control unit 13 of the information processing device 10 . The control unit 13 starts image output processing when the user starts imaging (for example, a video conference).
 まず、制御部13は、撮像部17を起動する(ステップS301)。そして、制御部13は、撮像部17が撮像する映像(動画)のフレーム周期に同期させながら赤外線照明部15を点滅させる(ステップS302)。 First, the control unit 13 activates the imaging unit 17 (step S301). Then, the control unit 13 blinks the infrared illumination unit 15 while synchronizing with the frame period of the video (moving image) captured by the imaging unit 17 (step S302).
 続いて、情報処理装置10の取得部131は、撮像部17が撮像した画像を取得する。赤外光が映像のフレーム周期に同期しながら点滅しているので、取得部131は、可視光画像とIR画像とを交互に取得することになる(ステップS303)。 Subsequently, the acquisition unit 131 of the information processing device 10 acquires the image captured by the imaging unit 17 . Since the infrared light is blinking in synchronization with the frame cycle of the video, the acquisition unit 131 alternately acquires the visible light image and the IR image (step S303).
 情報処理装置10の抽出部132は、IR画像からIR成分情報を抽出する(ステップS304)。具体的には、抽出部132は、可視光画像とIR画像との差分をIR成分情報として取得する。そして、情報処理装置10の画像処理部133は、IR成分情報に基づいて撮像画像への輝度又は明度に関する画像処理を行う(ステップS305)。そして、情報処理装置10の出力制御部134は、画像処理後の画像を出力部14に出力する(ステップS306)。 The extraction unit 132 of the information processing device 10 extracts IR component information from the IR image (step S304). Specifically, the extraction unit 132 acquires the difference between the visible light image and the IR image as IR component information. Then, the image processing unit 133 of the information processing device 10 performs image processing related to luminance or lightness of the captured image based on the IR component information (step S305). Then, the output control unit 134 of the information processing device 10 outputs the processed image to the output unit 14 (step S306).
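 For illustration, the following is a minimal sketch of steps S304 and S305 for one pair of frames. The use of OpenCV, the additive gain, and the choice of the value (V) channel of the HSV color space are assumptions made for this example; the disclosure also mentions rewriting luminance information in the HSL color space as an alternative.

```python
# Minimal sketch (assumptions: OpenCV/NumPy, 8-bit BGR frames, simple additive gain)
# of extracting the IR component as the frame difference (step S304) and brightening
# the visible light frame accordingly (step S305).
import cv2
import numpy as np

def relight(visible_bgr: np.ndarray, ir_bgr: np.ndarray, gain: float = 0.5) -> np.ndarray:
    # Step S304: IR component information = difference between the IR frame and the visible frame.
    ir_component = cv2.absdiff(ir_bgr, visible_bgr)
    ir_strength = cv2.cvtColor(ir_component, cv2.COLOR_BGR2GRAY).astype(np.float32)

    # Step S305: rewrite the brightness (V channel) of the captured visible light image.
    hsv = cv2.cvtColor(visible_bgr, cv2.COLOR_BGR2HSV).astype(np.float32)
    hsv[:, :, 2] = np.clip(hsv[:, :, 2] + gain * ir_strength, 0, 255)
    return cv2.cvtColor(hsv.astype(np.uint8), cv2.COLOR_HSV2BGR)
```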
 続いて、情報処理装置10の学習部135は、画像処理前の画像と画像処理後の画像とに基づいて学習モデルの学習を実行する(ステップS307)。 Subsequently, the learning unit 135 of the information processing device 10 performs learning of the learning model based on the image before image processing and the image after image processing (step S307).
 その後、情報処理装置10の制御部13は、撮影が終了したか判別する(ステップS308)。撮影が終了している場合(ステップS308:Yes)、制御部13は、ステップS311に処理を進める。撮影が終了していない場合(ステップS308:No)、制御部13は、学習モデルの学習が完了しているか判別する(ステップS309)。学習が完了していない場合(ステップS309:No)、制御部13は、ステップS303に処理を戻す。学習が完了している場合(ステップS309:Yes)、制御部13は、推測処理を開始する(ステップS310)。図14は、推測処理を示すフローチャートである。 After that, the control unit 13 of the information processing device 10 determines whether or not the shooting has ended (step S308). If the shooting has ended (step S308: Yes), the control unit 13 advances the process to step S311. If the shooting has not ended (step S308: No), the control unit 13 determines whether learning of the learning model has been completed (step S309). If learning has not been completed (step S309: No), the control unit 13 returns the process to step S303. If the learning has been completed (step S309: Yes), the control unit 13 starts the estimation process (step S310). FIG. 14 is a flow chart showing the estimation process.
 まず、情報処理装置10の制御部13は、赤外線照明部15の動作を停止する(ステップS401)。そして、情報処理装置10の取得部131は、撮像画像(すなわち、可視光画像)を取得する(ステップS402)。そして、情報処理装置10の推測部136は、学習モデルに撮像画像を入力することで、当該撮像画像の画像処理後の画像を推測する(ステップS403)。そして、情報処理装置10の出力制御部134は、推測画像を出力部14に出力する(ステップS404)。 First, the control unit 13 of the information processing device 10 stops the operation of the infrared illumination unit 15 (step S401). Then, the acquisition unit 131 of the information processing device 10 acquires the captured image (that is, the visible light image) (step S402). Then, the estimating unit 136 of the information processing apparatus 10 inputs the captured image to the learning model, thereby estimating the image after the image processing of the captured image (step S403). Then, the output control unit 134 of the information processing device 10 outputs the estimated image to the output unit 14 (step S404).
 その後、情報処理装置10の制御部13は、撮影が終了したか判別する(ステップS405)。撮影が終了していない場合(ステップS405:No)、制御部13は、ステップS402に処理を戻す。撮影が終了している場合(ステップS405:Yes)、制御部13は、図13のフローに処理を戻し、撮像部17と赤外線照明部15の動作を停止する(ステップS311)。撮像部17と赤外線照明部15の動作が停止したら、制御部13は、画像出力処理を終了する。 After that, the control unit 13 of the information processing device 10 determines whether or not the shooting has ended (step S405). If the shooting has not ended (step S405: No), the control unit 13 returns the process to step S402. If the shooting has ended (step S405: Yes), the control unit 13 returns the processing to the flow of FIG. 13 and stops the operations of the imaging unit 17 and the infrared illumination unit 15 (step S311). When the operations of the imaging unit 17 and the infrared illumination unit 15 are stopped, the control unit 13 ends the image output processing.
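 For illustration, the following is a minimal sketch of the estimation processing of steps S401 to S405 after the learning is completed. The helper callables for device control and frame I/O are hypothetical placeholders, and PyTorch is assumed for the trained learning model; none of these names come from the disclosure.

```python
# Minimal sketch of the estimation loop (steps S401-S405). stop_ir_light(),
# capture_frame(), shooting_finished() and output() are hypothetical placeholders
# for the device I/O; `model` is the trained learning model (a torch.nn.Module).
import torch

def estimation_loop(model, stop_ir_light, capture_frame, shooting_finished, output):
    stop_ir_light()                          # S401: stop the infrared illumination
    model.eval()
    with torch.no_grad():
        while not shooting_finished():       # S405: repeat until shooting ends
            captured = capture_frame()       # S402: acquire a visible light image (tensor)
            estimated = model(captured)      # S403: estimate the image after image processing
            output(estimated)                # S404: output the estimated image to the output unit
```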
 According to this method, after the learning of the learning model is completed, the information processing apparatus 10 can acquire an image that looks as if the user were lit, without irradiating the user with infrared light. In addition, before the learning is completed, the IR images are not output to the output unit 14, so the frame rate is halved; after the learning is completed, however, all frames are visible light images, so the full frame rate can be realized.
<<4.実写ボリュメトリックへの応用>>
 本実施形態の手法は、実写ボリュメトリックへ応用可能である。ここで、実写ボリュメトリックとは、スタジオ等で被写体(例えば、人)の立体的な情報を取得し、それをそのまま3DCG化する技術である。
<<4. Application to live-action volumetric >>
The method of this embodiment can be applied to live-action volumetric capture. Here, live-action volumetric capture is a technique for acquiring three-dimensional information of a subject (for example, a person) in a studio or the like and turning it directly into 3DCG.
<4-1.課題>
 実写ボリュメトリック撮影システムでは、情報処理装置10は、多台数のカメラで被写体を取り囲んで撮影する。そして、情報処理装置10は、画像データから被写体を3次元データ化してコンテンツを生成する。そして、情報処理装置10は、ユーザの操作に基づいて、そのコンテンツを自由な視点でレンダリングする。
<4-1. Issue>
In the live-action volumetric imaging system, the information processing apparatus 10 surrounds and photographs a subject with multiple cameras. Then, the information processing apparatus 10 converts the subject into three-dimensional data from the image data to generate content. Then, the information processing apparatus 10 renders the content from a free viewpoint based on the user's operation.
 At present, in order to realize live-action volumetric capture, the information processing apparatus 10 shoots the subject mainly in a studio in which a plurality of lights are fixed to the ceiling. In this case, if the subject is illuminated uniformly and brightly, the quality of the texture and modeling improves, but the sense of unevenness is reduced and the image looks unnaturally CG-like. On the other hand, if the subject is shot under biased lighting, shadows and the sense of unevenness increase, but the quality of the texture and modeling deteriorates. Furthermore, even when, because of the subject's shape, some part of it receives no illumination and the contrast with the green screen is low, it is difficult to add extra lighting.
<4-2.実施例>
 そこで、本実施例では、従来の実写ボリュメトリック撮影システムに、赤外線を撮影可能な可視光カメラと、個別に点灯、消灯、及び照射方向を制御可能なIRライトを追加配置する。そして、情報処理装置10は、IR成分情報に基づいて可視光画像に画像処理(例えば、陰影の修正もしくは強調)を行う。これにより、情報処理装置10は、テクスチャやモデリングの品質を維持しつつも、凹凸感のある画像を生成できる。
<4-2. Example>
Therefore, in this embodiment, a visible light camera capable of photographing infrared rays and an IR light capable of individually controlling lighting, extinguishing, and irradiation direction are additionally arranged in a conventional live-action volumetric imaging system. Then, the information processing apparatus 10 performs image processing (for example, correction or enhancement of shadows) on the visible light image based on the IR component information. As a result, the information processing apparatus 10 can generate an image with unevenness while maintaining texture and modeling quality.
 図15は、本実施形態の実写ボリュメトリック撮影システムの撮影スタジオの一例を示す図である。撮影スタジオには、複数の可視光ライト20に加えて、複数のIRライト(図15に示す赤外線照明部15)が配置されている。そして、撮影スタジオには、複数のIRカメラ30が配置されている。IRカメラ30は、可視光と赤外光を同時に取得できるカメラである。IRカメラ30の構成は、撮像部17の構成と同様である。 FIG. 15 is a diagram showing an example of a photography studio of the live-action volumetric photography system of this embodiment. In addition to a plurality of visible light lights 20, a plurality of IR lights (infrared illuminator 15 shown in FIG. 15) are arranged in the photography studio. A plurality of IR cameras 30 are arranged in the photography studio. The IR camera 30 is a camera that can simultaneously acquire visible light and infrared light. The configuration of the IR camera 30 is similar to that of the imaging section 17 .
 図16は、実写ボリュメトリック撮影システムにおける情報処理装置10の処理例を示す図である。情報処理装置10は、被写体の複数の方向からの可視光画像で構成される多視点画像と、被写体の複数の方向からのIR画像と、を取得する。ここで、多視点画像は、被写体の3Dモデルを生成するための画像である。情報処理装置10は、IR画像から抽出されるIR成分情報(陰影情報)に基づいて多視点画像を補正する。IR成分情報は、前景背景分離の補助情報として利用可能である。具体的な補正の方法(画像処理の方法)は、上述の基本的な手法1、2、及び発展的な手法で示した方法と同様の方法であってもよい。 FIG. 16 is a diagram showing a processing example of the information processing device 10 in the live-action volumetric imaging system. The information processing device 10 acquires a multi-viewpoint image composed of visible light images from a plurality of directions of a subject and an IR image of the subject from a plurality of directions. Here, a multi-viewpoint image is an image for generating a 3D model of a subject. The information processing device 10 corrects the multi-viewpoint image based on IR component information (shadow information) extracted from the IR image. IR component information can be used as auxiliary information for foreground-background separation. A specific correction method (image processing method) may be a method similar to the methods shown in the basic methods 1 and 2 and the advanced method described above.
 これにより、可視光ライトのみの撮影ではできなかった、特定の方向からの照明条件への動的な変更(リライティング)が可能になる。また、被写体にIRライトを当てることによって、可視光での撮影に影響を及ぼさずに前景と背景の分離を強調し精度を向上することが可能になる。 This makes it possible to dynamically change the lighting conditions from a specific direction (relighting), which was not possible when shooting with only visible light. Also, by illuminating the subject with IR light, it is possible to enhance the separation of the foreground and background and improve the accuracy without affecting the imaging with visible light.
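 For illustration, the following is a minimal sketch of one way the IR component (shadow) information could be used both to correct each viewpoint image and as auxiliary information for foreground-background separation. The additive correction, the gain, and the threshold are assumptions made for this example.

```python
# Minimal sketch (assumptions: NumPy, 8-bit images, per-view grayscale IR components,
# a simple additive correction and threshold) of correcting multi-viewpoint images
# with IR shading information and deriving foreground candidate masks.
import numpy as np

def correct_views(visible_views, ir_components, gain: float = 0.4, fg_threshold: int = 30):
    corrected, masks = [], []
    for visible, ir in zip(visible_views, ir_components):
        shading = ir.astype(np.float32)                       # (H, W) IR component per view
        # Correct (brighten) the viewpoint image according to the IR shading information.
        lit = visible.astype(np.float32) + gain * shading[..., None]
        corrected.append(np.clip(lit, 0, 255).astype(np.uint8))
        # Pixels reached by the IR light are treated as foreground candidates.
        masks.append((shading > fg_threshold).astype(np.uint8))
    return corrected, masks
```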
<4-3.他の例>
 なお、被写体の撮影環境は、可視光と赤外光を同時に取得できる高速撮影カメラと、全天球配置された可視光ライト及びIRライトと、の組み合わせであってもよい。図17は、複数の可視光ライト20及び複数のIRライト(赤外線照明部15)が全天球配置された様子を示す図である。
<4-3. Other examples>
The imaging environment of the object may be a combination of a high-speed imaging camera capable of simultaneously acquiring visible light and infrared light, and visible light and IR light arranged omnidirectionally. FIG. 17 is a diagram showing a state in which a plurality of visible light lights 20 and a plurality of IR lights (infrared illuminators 15) are omnidirectionally arranged.
 そして、情報処理装置10は、可視光で撮影しながら同時に任意の光源位置からのシェーディング/反射率(アルベド)を取得する。IR画像:可視光画像の撮影フレーム比率は1:1に限らないので、カメラ性能次第で複数光源位置からのシェーディング/アルベドを同時取得できる。赤外光を使用するので可視光画像の撮影に影響を及ぼさない。IRモノクロ画像でのシェーディングを取得するだけなら、可視光カメラとは独立してフレームレートを限界まで高速にできる。また、情報処理装置10は、アルベドを基にして、後からPhoto-realisticな影を追加できる。 Then, the information processing device 10 simultaneously acquires shading/reflectance (albedo) from an arbitrary light source position while shooting with visible light. Since the imaging frame ratio of IR image:visible light image is not limited to 1:1, shading/albedo from multiple light source positions can be acquired simultaneously depending on camera performance. Since it uses infrared light, it does not affect visible light image capturing. If only the shading in the IR monochrome image is acquired, the frame rate can be increased to the limit independently of the visible light camera. Further, the information processing apparatus 10 can add photo-realistic shadows later based on the albedo.
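 For illustration, the following is a minimal sketch of recovering albedo once the shading from a known IR light source position is available, assuming a simple Lambertian-style relationship in which the observed IR intensity is approximately albedo multiplied by shading; this relationship is an assumption made for this example and is not stated in the disclosure.

```python
# Minimal sketch (assumption: observed IR intensity ~ albedo * shading) of recovering
# a per-pixel albedo map; eps avoids division by zero in unlit regions.
import numpy as np

def estimate_albedo(ir_intensity: np.ndarray, shading: np.ndarray, eps: float = 1e-3) -> np.ndarray:
    albedo = ir_intensity.astype(np.float32) / (shading.astype(np.float32) + eps)
    return np.clip(albedo, 0.0, 1.0)
```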
<<5.変形例>>
 上述の実施形態は一例を示したものであり、種々の変更及び応用が可能である。
<<5. Modification>>
The above-described embodiment is an example, and various modifications and applications are possible.
<5-1.製品やサービスへ応用>
 例えば、本開示に係る技術は、様々な製品やサービスへ応用することができる。
<5-1. Application to products and services>
For example, the technology according to the present disclosure can be applied to various products and services.
 (1)コンテンツの制作
 例えば、ユーザは、本実施の形態で生成された被写体の3Dモデルと他のサーバで管理されている3Dデータを合成して新たな映像コンテンツを制作してもよい。また、例えば、Lidarなどの撮像装置で取得した背景データが存在している場合、本実施の形態で生成された被写体の3Dモデルと背景データを組合せることで、ユーザは、被写体が背景データで示す場所にあたかもいるようなコンテンツを制作することもできる。尚、映像コンテンツは3次元の映像コンテンツであってもよいし、2次元に変換された2次元の映像コンテンツでもよい。尚、本実施の形態で生成された被写体の3Dモデルは、例えば、3Dモデル生成部で生成された3Dモデルやレンダリング部で再構築した3Dモデルなどがある。
(1) Production of content For example, the user may produce new video content by combining the 3D model of the subject generated in this embodiment with 3D data managed by another server. Also, for example, when background data acquired by an imaging device such as Lidar exists, by combining the 3D model of the subject generated in this embodiment with that background data, the user can create content in which the subject appears as if it were at the location indicated by the background data. The video content may be three-dimensional video content, or may be two-dimensional video content converted into two dimensions. Note that the 3D model of the subject generated in this embodiment includes, for example, a 3D model generated by the 3D model generation unit and a 3D model reconstructed by the rendering unit.
 (2)仮想空間での体験
 例えば、情報処理装置10は、ユーザがアバタとなってコミュニケーションする場である仮想空間の中で、本実施の形態で生成された被写体(例えば、演者)を配置することができる。この場合、ユーザは、アバタとなって仮想空間で実写の被写体を視聴することが可能となる。
(2) Experience in virtual space For example, the information processing device 10 can place the subject (for example, a performer) generated in this embodiment in a virtual space in which users communicate as avatars. In this case, the user, as an avatar, can view the live-action subject in the virtual space.
 (3)遠隔地とのコミュニケーションへの応用
 例えば、画像処理部133で生成された被写体の3Dモデルを通信部11から遠隔地に送信することにより、遠隔地にある再生装置を通じて遠隔地のユーザが被写体の3Dモデルを視聴することができる。例えば、情報処理装置10がこの被写体の3Dモデルをリアルタイムに伝送することにより被写体と遠隔地のユーザとがリアルタイムにコミュニケーションすることができる。例えば、被写体が先生であり、ユーザが生徒である場合や、被写体が医者であり、ユーザが患者である場合が想定できる。
(3) Application to communication with a remote location For example, by transmitting the 3D model of the subject generated by the image processing unit 133 from the communication unit 11 to a remote location, a user at the remote location can view the 3D model of the subject through a playback device at that location. For example, by having the information processing apparatus 10 transmit this 3D model of the subject in real time, the subject and the remote user can communicate in real time. For example, a case where the subject is a teacher and the user is a student, or a case where the subject is a doctor and the user is a patient, can be assumed.
 (4)その他
 例えば、情報処理装置10は、本実施の形態で生成された複数の被写体の3Dモデルに基づいてスポーツなどの自由視点映像を生成することもできる。また、個人が本実施の形態で生成された3Dモデルである自分を配信プラットフォームに配信することもできる。このように、本明細書に記載の実施形態における内容は種々の技術やサービスに応用することができる。
(4) Others For example, the information processing apparatus 10 can also generate a free-viewpoint video of sports or the like based on the 3D models of a plurality of subjects generated in this embodiment. Also, an individual can distribute his or her own 3D model generated in this embodiment to a distribution platform. As described above, the content of the embodiments described in this specification can be applied to a variety of technologies and services.
<5-2.その他の変形例>
 本実施形態の情報処理装置10は、専用のコンピュータシステムにより実現してもよいし、汎用のコンピュータシステムによって実現してもよい。
<5-2. Other modified examples>
The information processing apparatus 10 of this embodiment may be implemented by a dedicated computer system or may be implemented by a general-purpose computer system.
 例えば、上述の動作を実行するための通信プログラムを、光ディスク、半導体メモリ、磁気テープ、フレキシブルディスク等のコンピュータ読み取り可能な記録媒体に格納して配布する。そして、例えば、該プログラムをコンピュータにインストールし、上述の処理を実行することによって制御装置を構成する。このとき、制御装置は、情報処理装置10の外部の装置(例えば、パーソナルコンピュータ)であってもよい。また、制御装置は、情報処理装置10の内部の装置(例えば、制御部13)であってもよい。 For example, a communication program for executing the above operations is distributed by storing it in a computer-readable recording medium such as an optical disk, semiconductor memory, magnetic tape, or flexible disk. Then, for example, the control device is configured by installing the program in a computer and executing the above-described processing. At this time, the control device may be a device (for example, a personal computer) external to the information processing device 10 . Also, the control device may be a device inside the information processing device 10 (for example, the control unit 13).
 また、上記通信プログラムをインターネット等のネットワーク上のサーバ装置が備えるディスク装置に格納しておき、コンピュータにダウンロード等できるようにしてもよい。また、上述の機能を、OS(Operating System)とアプリケーションソフトとの協働により実現してもよい。この場合には、OS以外の部分を媒体に格納して配布してもよいし、OS以外の部分をサーバ装置に格納しておき、コンピュータにダウンロード等できるようにしてもよい。 Also, the above communication program may be stored in a disk device provided in a server device on a network such as the Internet, so that it can be downloaded to a computer. Also, the functions described above may be realized through cooperation between an OS (Operating System) and application software. In this case, the parts other than the OS may be stored in a medium and distributed, or the parts other than the OS may be stored in a server device so that they can be downloaded to a computer.
 Further, among the processes described in the above embodiments, all or part of the processes described as being performed automatically can also be performed manually, and all or part of the processes described as being performed manually can also be performed automatically by known methods. In addition, the processing procedures, specific names, and information including various data and parameters shown in the above documents and drawings can be changed arbitrarily unless otherwise specified. For example, the various information shown in each drawing is not limited to the illustrated information.
 Also, each component of each device illustrated is functionally conceptual and does not necessarily need to be physically configured as illustrated. In other words, the specific form of distribution and integration of each device is not limited to the illustrated one, and all or part of the devices can be functionally or physically distributed and integrated in arbitrary units according to various loads, usage conditions, and the like. Note that this configuration by distribution and integration may be performed dynamically.
 また、上述の実施形態は、処理内容を矛盾させない領域で適宜組み合わせることが可能である。また、上述の実施形態のフローチャートに示された各ステップは、適宜順序を変更することが可能である。また、例えば、1つのフローチャートの各ステップを、1つの装置が実行するようにしてもよいし、複数の装置が分担して実行するようにしてもよい。さらに、1つのステップに複数の処理が含まれる場合、その複数の処理を、1つの装置が実行するようにしてもよいし、複数の装置が分担して実行するようにしてもよい。換言するに、1つのステップに含まれる複数の処理を、複数のステップの処理として実行することもできる。逆に、複数のステップとして説明した処理を1つのステップとしてまとめて実行することもできる。 In addition, the above-described embodiments can be appropriately combined in areas where the processing contents are not inconsistent. Also, the order of the steps shown in the flowcharts of the above-described embodiments can be changed as appropriate. Further, for example, each step of one flowchart may be executed by one device, or may be executed by a plurality of devices. Furthermore, when one step includes a plurality of processes, the plurality of processes may be executed by one device, or may be shared by a plurality of devices. In other words, a plurality of processes included in one step can also be executed as processes of a plurality of steps. Conversely, the processing described as multiple steps can also be collectively executed as one step.
 Further, for example, the program executed by the computer may be configured so that the processing of the steps describing the program is executed in time series in the order described in this specification, or is executed in parallel, or is executed individually at necessary timings such as when a call is made. That is, as long as there is no contradiction, the processing of each step may be executed in an order different from the order described above. Furthermore, the processing of the steps describing this program may be executed in parallel with the processing of another program, or may be executed in combination with the processing of another program.
 また、例えば、本実施形態は、装置またはシステムを構成するあらゆる構成、例えば、システムLSI(Large Scale Integration)等としてのプロセッサ、複数のプロセッサ等を用いるモジュール、複数のモジュール等を用いるユニット、ユニットにさらにその他の機能を付加したセット等(すなわち、装置の一部の構成)として実施することもできる。 Also, for example, the present embodiment can be applied to any configuration that constitutes a device or system, such as a processor as a system LSI (Large Scale Integration), a module using a plurality of processors, a unit using a plurality of modules, etc. Furthermore, it can also be implemented as a set or the like (that is, a configuration of a part of the device) to which other functions are added.
 なお、本実施形態において、システムとは、複数の構成要素(装置、モジュール(部品)等)の集合を意味し、全ての構成要素が同一筐体中にあるか否かは問わない。したがって、別個の筐体に収納され、ネットワークを介して接続されている複数の装置、及び、1つの筐体の中に複数のモジュールが収納されている1つの装置は、いずれも、システムである。 In addition, in this embodiment, the system means a set of a plurality of components (devices, modules (parts), etc.), and it does not matter whether all the components are in the same housing. Therefore, a plurality of devices housed in separate housings and connected via a network, and a single device housing a plurality of modules in one housing, are both systems. .
 また、例えば、本技術に関する複数の技術は、矛盾が生じない限り、それぞれ独立に単体で実施することができる。もちろん、任意の複数の本技術を併用して実施することもできる。例えば、いずれかの実施の形態において説明した本技術の一部または全部を、他の実施の形態において説明した本技術の一部または全部と組み合わせて実施することもできる。また、上述した任意の本技術の一部または全部を、上述していない他の技術と併用して実施することもできる。 Also, for example, multiple technologies related to this technology can be implemented independently as long as there is no contradiction. Of course, it is also possible to use any number of the present techniques in combination. For example, part or all of the present technology described in any embodiment can be combined with part or all of the present technology described in other embodiments. Also, part or all of any of the techniques described above may be implemented in conjunction with other techniques not described above.
 また、例えば、本実施形態は、1つの機能を、ネットワークを介して複数の装置で分担、共同して処理するクラウドコンピューティングの構成をとることができる。 Also, for example, this embodiment can take a configuration of cloud computing in which one function is shared by a plurality of devices via a network and processed jointly.
<<6.むすび>>
 以上説明したように、本実施形態によれば、情報処理装置10は、対象(例えば、ユーザ及びその周囲の物)に赤外光を照射して得られるIR画像からIR成分情報を抽出し、抽出したIR成分情報に基づいて対象の撮像画像への輝度又は明度に関する画像処理を行う。赤外光は人の目には見えない。ユーザはまぶしい思いをすることなく、あたかも可視光ライトでライティングされているかのような画像を得ることができる。
<<6. Conclusion>>
As described above, according to the present embodiment, the information processing apparatus 10 extracts IR component information from an IR image obtained by irradiating a target (for example, a user and surrounding objects) with infrared light, and performs image processing related to brightness or brightness on the captured image of the target based on the extracted IR component information. Infrared light is invisible to the human eye. The user can therefore obtain an image as if lit by a visible light lamp, without being dazzled.
 The embodiments of the present disclosure have been described above, but the technical scope of the present disclosure is not limited to those embodiments as they are, and various modifications are possible without departing from the gist of the present disclosure. In addition, components of different embodiments and modifications may be combined as appropriate.
 また、本明細書に記載された各実施形態における効果はあくまで例示であって限定されるものでは無く、他の効果があってもよい。 Also, the effects of each embodiment described in this specification are merely examples and are not limited, and other effects may be provided.
 なお、本技術は以下のような構成も取ることができる。
(1)
 対象に赤外光を照射して得られる撮像画像であって可視光成分とIR成分とを含むIR画像を取得する取得部と、
 前記IR画像からIR成分の情報を抽出する抽出部と、
 前記IR成分の情報に基づいて前記対象の撮像画像への輝度又は明度に関する画像処理を行う画像処理部と、
 を備える情報処理装置。
(2)
 前記取得部は、前記IR画像に加えて前記対象の可視光画像を取得し、
 前記抽出部は、前記可視光画像と前記IR画像との差分を前記IR成分の情報として取得する、
 前記(1)に記載の情報処理装置。
(3)
 前記赤外光は点滅しており、
 前記取得部は、前記赤外光の点滅周期に同期して時分割で前記可視光画像とIR画像とを取得する、
 前記(2)に記載の情報処理装置。
(4)
 前記赤外光は映像のフレーム周期に同期して点滅しており、
 前記取得部は、前記赤外光が照射されていないタイミングのフレームの画像を前記可視光画像、前記赤外光が照射されたタイミングのフレームの画像を前記IR画像、として取得する、
 前記(2)又は(3)に記載の情報処理装置。
(5)
 前記抽出部は、前記赤外光が照射されていないタイミングのフレームから始まる連続する2フレームの画像の差分を前記IR成分の情報として取得し、
 前記画像処理部は、前記IR成分の情報に基づいて、前記連続する2フレームの次フレームの画像への輝度又は明度に関する画像処理を行う、
 前記(4)に記載の情報処理装置。
(6)
 前記IR成分の情報は、前記連続する2フレームの差分画像であり、
 前記画像処理部は、前記差分画像のエッジをぼかす処理を行い、エッジをぼかした前記差分画像に基づいて、前記連続する2フレームの次フレームの画像への輝度又は明度に関する画像処理を行う、
 前記(5)に記載の情報処理装置。
(7)
 前記画像処理部は、フレーム間の動き予測に基づいて前記IR成分の情報を補正し、補正した前記IR成分の情報に基づいて、前記連続する2フレームの次フレームの画像への輝度又は明度に関する画像処理を行う、
 前記(5)に記載の情報処理装置。
(8)
 前記抽出部は、前記赤外光が照射されたタイミングのフレームから始まる連続する2フレームの画像の差分を前記IR成分の情報として取得し、
 前記画像処理部は、前記IR成分の情報に基づいて、前記連続する2フレームの最後フレームの画像への輝度又は明度に関する画像処理を行う、
 前記(4)に記載の情報処理装置。
(9)
 前記画像処理部は、前記IR成分の情報に基づいて、前記撮像画像のHSL色空間での輝度の情報を書き換える、
 前記(1)~(8)のいずれかに記載の情報処理装置。
(10)
 前記画像処理部は、前記IR成分の情報に基づいて、前記撮像画像のHSV色空間での明度の情報を書き換える、
 前記(1)~(8)のいずれかに記載の情報処理装置。
(11)
 前記画像処理により生成された画像を出力部に出力する出力制御部、をさらに備える、
 前記(1)~(10)のいずれかに記載の情報処理装置。
(12)
 前記画像処理の前の画像と前記画像処理の後の画像とに基づき学習モデルの学習を実行する学習部と、
 前記学習モデルを使って、新たに取得した撮像画像の前記画像処理の後の画像を推測する推測部と、を備え、
 前記出力制御部は、所定期間の後、前記出力部に出力される画像を、前記画像処理により生成された画像から、前記学習モデルを使って推測された推測画像に切り替える、
 前記(11)に記載の情報処理装置。
(13)
 前記出力制御部は、前記画像処理により生成された画像が前記出力部に出力される間は、前記赤外光が映像のフレーム周期に同期して点滅するよう赤外線照射部を制御し、
 前記取得部は、前記赤外光が点滅している間は、前記赤外光が照射されていないタイミングのフレームの画像を可視光画像、前記赤外光が照射されたタイミングのフレームの画像を前記IR画像、として取得し、
 前記出力制御部は、前記出力部に出力される画像が前記画像処理により生成された画像から前記推測画像に切り替わるタイミングで、前記赤外光の出力が停止するよう前記赤外線照射部を制御し、
 前記取得部は、前記赤外光の出力が停止した後は、全てのフレームの画像を前記可視光画像として取得し、
 前記推測部は、前記学習モデルを使って前記可視光画像の前記画像処理の後の画像を推測する、
 前記(12)に記載の情報処理装置。
(14)
 前記取得部は、被写体の3Dモデルを生成するための画像であって前記被写体の複数の方向からの可視光画像で構成される多視点画像と、前記被写体の複数の方向からのIR画像と、を取得し、
 前記画像処理部は、前記IR画像から抽出される前記IR成分の情報に基づいて前記多視点画像を補正する、
 前記(1)に記載の情報処理装置。
(15)
 対象に赤外光を照射して得られる撮像画像であって可視光成分とIR成分とを含むIR画像を取得し、
 前記IR画像からIR成分の情報を抽出し、
 前記IR成分の情報に基づいて前記対象の撮像画像への輝度又は明度に関する画像処理を行う、
 情報処理方法。
(16)
 コンピュータを、
 対象に赤外光を照射して得られる撮像画像であって可視光成分とIR成分とを含むIR画像を取得する取得部、
 前記IR画像からIR成分の情報を抽出する抽出部、
 前記IR成分の情報に基づいて前記対象の撮像画像への輝度又は明度に関する画像処理を行う画像処理部、
 として機能させるためのプログラム。
Note that the present technology can also take the following configuration.
(1)
an acquisition unit that acquires an IR image that is a captured image obtained by irradiating an object with infrared light and that includes a visible light component and an IR component;
an extraction unit that extracts IR component information from the IR image;
an image processing unit that performs image processing related to brightness or brightness of the captured image of the target based on the information of the IR component;
Information processing device.
(2)
The acquisition unit acquires a visible light image of the target in addition to the IR image,
The extraction unit acquires a difference between the visible light image and the IR image as information on the IR component.
The information processing device according to (1) above.
(3)
the infrared light is blinking,
The acquisition unit acquires the visible light image and the IR image in a time division manner in synchronization with the blinking cycle of the infrared light.
The information processing device according to (2) above.
(4)
The infrared light blinks in synchronization with the frame period of the video,
The acquisition unit acquires the image of the frame at the timing when the infrared light is not irradiated as the visible light image, and acquires the image of the frame at the timing when the infrared light is irradiated as the IR image.
The information processing apparatus according to (2) or (3) above.
(5)
The extracting unit acquires, as the IR component information, a difference between images of two consecutive frames starting from a frame at which the infrared light is not irradiated,
The image processing unit performs image processing related to brightness or brightness of an image of the next frame of the two consecutive frames based on the information of the IR component.
The information processing device according to (4) above.
(6)
the IR component information is a difference image of the two consecutive frames;
The image processing unit performs a process of blurring the edges of the difference image, and performs image processing related to brightness or brightness of the image of the next frame of the consecutive two frames based on the difference image with the edges blurred.
The information processing device according to (5) above.
(7)
The image processing unit corrects the IR component information based on inter-frame motion prediction, and performs image processing related to brightness or brightness on the image of the next frame of the two consecutive frames based on the corrected information of the IR component.
The information processing device according to (5) above.
(8)
The extracting unit acquires, as the IR component information, a difference between images of two consecutive frames starting from a frame at which the infrared light is irradiated,
The image processing unit performs image processing related to brightness or brightness of the image of the last frame of the two consecutive frames based on the information of the IR component.
The information processing device according to (4) above.
(9)
The image processing unit rewrites luminance information in the HSL color space of the captured image based on the IR component information.
The information processing apparatus according to any one of (1) to (8) above.
(10)
The image processing unit rewrites lightness information in the HSV color space of the captured image based on the IR component information.
The information processing apparatus according to any one of (1) to (8) above.
(11)
An output control unit that outputs an image generated by the image processing to an output unit,
The information processing apparatus according to any one of (1) to (10) above.
(12)
a learning unit that performs learning of a learning model based on the image before the image processing and the image after the image processing;
an estimating unit that estimates an image after the image processing of the newly acquired captured image using the learning model;
After a predetermined period of time, the output control unit switches the image output to the output unit from the image generated by the image processing to the estimated image estimated using the learning model.
The information processing device according to (11) above.
(13)
The output control unit controls the infrared irradiation unit so that the infrared light blinks in synchronization with a video frame cycle while the image generated by the image processing is output to the output unit,
While the infrared light is blinking, the acquisition unit acquires an image of a frame at a timing when the infrared light is not irradiated as a visible light image, and acquires an image of a frame at a timing when the infrared light is irradiated as the IR image,
The output control unit controls the infrared irradiation unit to stop outputting the infrared light at a timing when the image output to the output unit is switched from the image generated by the image processing to the estimated image,
After the output of the infrared light is stopped, the acquisition unit acquires images of all frames as the visible light image,
The estimating unit uses the learning model to estimate an image after the image processing of the visible light image.
The information processing device according to (12) above.
(14)
The acquisition unit acquires a multi-viewpoint image, which is an image for generating a 3D model of a subject and which is composed of visible light images of the subject from a plurality of directions, and IR images of the subject from a plurality of directions, and
The image processing unit corrects the multi-viewpoint image based on the information of the IR component extracted from the IR image.
The information processing device according to (1) above.
(15)
Acquiring an IR image that is a captured image obtained by irradiating an object with infrared light and that includes a visible light component and an IR component,
extracting IR component information from the IR image;
performing image processing on the brightness or brightness of the captured image of the target based on the IR component information;
Information processing methods.
(16)
the computer,
an acquisition unit that acquires an IR image that is a captured image obtained by irradiating an object with infrared light and that includes a visible light component and an IR component;
an extraction unit that extracts IR component information from the IR image;
an image processing unit that performs image processing related to brightness or brightness of the captured image of the target based on the information of the IR component;
A program to function as
 REFERENCE SIGNS LIST
 10 information processing device
 11 communication unit
 12 storage unit
 13 control unit
 14 output unit
 15 infrared illumination unit
 16 synchronization signal generation unit
 17 imaging unit
 20 visible light light
 30 IR camera
 131 acquisition unit
 132 extraction unit
 133 image processing unit
 134 output control unit
 135 learning unit
 136 estimating unit

Claims (16)

  1.  対象に赤外光を照射して得られる撮像画像であって可視光成分とIR成分とを含むIR画像を取得する取得部と、
     前記IR画像からIR成分の情報を抽出する抽出部と、
     前記IR成分の情報に基づいて前記対象の撮像画像への輝度又は明度に関する画像処理を行う画像処理部と、
     を備える情報処理装置。
    an acquisition unit that acquires an IR image that is a captured image obtained by irradiating an object with infrared light and that includes a visible light component and an IR component;
    an extraction unit that extracts IR component information from the IR image;
    an image processing unit that performs image processing related to brightness or brightness of the captured image of the target based on the information of the IR component;
    Information processing device.
  2.  前記取得部は、前記IR画像に加えて前記対象の可視光画像を取得し、
     前記抽出部は、前記可視光画像と前記IR画像との差分を前記IR成分の情報として取得する、
     請求項1に記載の情報処理装置。
    The acquisition unit acquires a visible light image of the target in addition to the IR image,
    The extraction unit acquires a difference between the visible light image and the IR image as information on the IR component.
    The information processing device according to claim 1 .
  3.  前記赤外光は点滅しており、
     前記取得部は、前記赤外光の点滅周期に同期して時分割で前記可視光画像とIR画像とを取得する、
     請求項2に記載の情報処理装置。
    the infrared light is blinking,
    The acquisition unit acquires the visible light image and the IR image in a time division manner in synchronization with the blinking cycle of the infrared light.
    The information processing apparatus according to claim 2.
  4.  前記赤外光は映像のフレーム周期に同期して点滅しており、
     前記取得部は、前記赤外光が照射されていないタイミングのフレームの画像を前記可視光画像、前記赤外光が照射されたタイミングのフレームの画像を前記IR画像、として取得する、
     請求項2に記載の情報処理装置。
    The infrared light blinks in synchronization with the frame period of the video,
    The acquisition unit acquires the image of the frame at the timing when the infrared light is not irradiated as the visible light image, and acquires the image of the frame at the timing when the infrared light is irradiated as the IR image.
    The information processing apparatus according to claim 2.
  5.  前記抽出部は、前記赤外光が照射されていないタイミングのフレームから始まる連続する2フレームの画像の差分を前記IR成分の情報として取得し、
     前記画像処理部は、前記IR成分の情報に基づいて、前記連続する2フレームの次フレームの画像への輝度又は明度に関する画像処理を行う、
     請求項4に記載の情報処理装置。
    The extracting unit acquires, as the IR component information, a difference between images of two consecutive frames starting from a frame at which the infrared light is not irradiated,
    The image processing unit performs image processing related to brightness or brightness of an image of the next frame of the two consecutive frames based on the information of the IR component.
    The information processing apparatus according to claim 4.
  6.  前記IR成分の情報は、前記連続する2フレームの差分画像であり、
     前記画像処理部は、前記差分画像のエッジをぼかす処理を行い、エッジをぼかした前記差分画像に基づいて、前記連続する2フレームの次フレームの画像への輝度又は明度に関する画像処理を行う、
     請求項5に記載の情報処理装置。
    the IR component information is a difference image of the two consecutive frames;
    The image processing unit performs a process of blurring the edges of the difference image, and performs image processing related to brightness or brightness of the image of the next frame of the consecutive two frames based on the difference image with the edges blurred.
    The information processing device according to claim 5 .
  7.  前記画像処理部は、フレーム間の動き予測に基づいて前記IR成分の情報を補正し、補正した前記IR成分の情報に基づいて、前記連続する2フレームの次フレームの画像への輝度又は明度に関する画像処理を行う、
     請求項5に記載の情報処理装置。
    The image processing unit corrects the IR component information based on inter-frame motion prediction, and performs image processing related to brightness or brightness on the image of the next frame of the two consecutive frames based on the corrected information of the IR component.
    The information processing device according to claim 5 .
  8.  前記抽出部は、前記赤外光が照射されたタイミングのフレームから始まる連続する2フレームの画像の差分を前記IR成分の情報として取得し、
     前記画像処理部は、前記IR成分の情報に基づいて、前記連続する2フレームの最後フレームの画像への輝度又は明度に関する画像処理を行う、
     請求項4に記載の情報処理装置。
    The extracting unit acquires, as the IR component information, a difference between images of two consecutive frames starting from a frame at which the infrared light is irradiated,
    The image processing unit performs image processing related to brightness or brightness of the image of the last frame of the two consecutive frames based on the information of the IR component.
    The information processing apparatus according to claim 4.
  9.  前記画像処理部は、前記IR成分の情報に基づいて、前記撮像画像のHSL色空間での輝度の情報を書き換える、
     請求項1に記載の情報処理装置。
    The image processing unit rewrites luminance information in the HSL color space of the captured image based on the IR component information.
    The information processing device according to claim 1 .
  10.  前記画像処理部は、前記IR成分の情報に基づいて、前記撮像画像のHSV色空間での明度の情報を書き換える、
     請求項1に記載の情報処理装置。
    The image processing unit rewrites lightness information in the HSV color space of the captured image based on the IR component information.
    The information processing device according to claim 1 .
  11.  前記画像処理により生成された画像を出力部に出力する出力制御部、をさらに備える、
     請求項1に記載の情報処理装置。
    An output control unit that outputs an image generated by the image processing to an output unit,
    The information processing device according to claim 1 .
  12.  前記画像処理の前の画像と前記画像処理の後の画像とに基づき学習モデルの学習を実行する学習部と、
     前記学習モデルを使って、新たに取得した撮像画像の前記画像処理の後の画像を推測する推測部と、を備え、
     前記出力制御部は、所定期間の後、前記出力部に出力される画像を、前記画像処理により生成された画像から、前記学習モデルを使って推測された推測画像に切り替える、
     請求項11に記載の情報処理装置。
    a learning unit that performs learning of a learning model based on the image before the image processing and the image after the image processing;
    an estimating unit that estimates an image after the image processing of the newly acquired captured image using the learning model;
    After a predetermined period of time, the output control unit switches the image output to the output unit from the image generated by the image processing to the estimated image estimated using the learning model.
    The information processing device according to claim 11 .
  13.  前記出力制御部は、前記画像処理により生成された画像が前記出力部に出力される間は、前記赤外光が映像のフレーム周期に同期して点滅するよう赤外線照射部を制御し、
     前記取得部は、前記赤外光が点滅している間は、前記赤外光が照射されていないタイミングのフレームの画像を可視光画像、前記赤外光が照射されたタイミングのフレームの画像を前記IR画像、として取得し、
     前記出力制御部は、前記出力部に出力される画像が前記画像処理により生成された画像から前記推測画像に切り替わるタイミングで、前記赤外光の出力が停止するよう前記赤外線照射部を制御し、
     前記取得部は、前記赤外光の出力が停止した後は、全てのフレームの画像を前記可視光画像として取得し、
     前記推測部は、前記学習モデルを使って前記可視光画像の前記画像処理の後の画像を推測する、
     請求項12に記載の情報処理装置。
    The output control unit controls the infrared irradiation unit so that the infrared light blinks in synchronization with a video frame cycle while the image generated by the image processing is output to the output unit,
    While the infrared light is blinking, the acquisition unit acquires an image of a frame at a timing when the infrared light is not irradiated as a visible light image, and acquires an image of a frame at a timing when the infrared light is irradiated as the IR image,
    The output control unit controls the infrared irradiation unit to stop outputting the infrared light at a timing when the image output to the output unit is switched from the image generated by the image processing to the estimated image,
    After the output of the infrared light is stopped, the acquisition unit acquires images of all frames as the visible light image,
    The estimating unit uses the learning model to estimate an image after the image processing of the visible light image.
    The information processing apparatus according to claim 12.
  14.  前記取得部は、被写体の3Dモデルを生成するための画像であって前記被写体の複数の方向からの可視光画像で構成される多視点画像と、前記被写体の複数の方向からのIR画像と、を取得し、
     前記画像処理部は、前記IR画像から抽出される前記IR成分の情報に基づいて前記多視点画像を補正する、
     請求項1に記載の情報処理装置。
    The acquisition unit acquires a multi-viewpoint image, which is an image for generating a 3D model of a subject and which is composed of visible light images of the subject from a plurality of directions, and IR images of the subject from a plurality of directions, and
    The image processing unit corrects the multi-viewpoint image based on the information of the IR component extracted from the IR image.
    The information processing device according to claim 1 .
  15.  対象に赤外光を照射して得られる撮像画像であって可視光成分とIR成分とを含むIR画像を取得し、
     前記IR画像からIR成分の情報を抽出し、
     前記IR成分の情報に基づいて前記対象の撮像画像への輝度又は明度に関する画像処理を行う、
     情報処理方法。
    Acquiring an IR image that is a captured image obtained by irradiating an object with infrared light and that includes a visible light component and an IR component,
    extracting IR component information from the IR image;
    performing image processing on the brightness or brightness of the captured image of the target based on the IR component information;
    Information processing methods.
  16.  コンピュータを、
     対象に赤外光を照射して得られる撮像画像であって可視光成分とIR成分とを含むIR画像を取得する取得部、
     前記IR画像からIR成分の情報を抽出する抽出部、
     前記IR成分の情報に基づいて前記対象の撮像画像への輝度又は明度に関する画像処理を行う画像処理部、
     として機能させるためのプログラム。
    the computer,
    an acquisition unit that acquires an IR image that is a captured image obtained by irradiating an object with infrared light and that includes a visible light component and an IR component;
    an extraction unit that extracts IR component information from the IR image;
    an image processing unit that performs image processing related to brightness or brightness of the captured image of the target based on the information of the IR component;
    A program to function as
PCT/JP2022/011543 2021-08-27 2022-03-15 Information processing device, information processing method, and program WO2023026543A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2021139250 2021-08-27
JP2021-139250 2021-08-27

Publications (1)

Publication Number Publication Date
WO2023026543A1 true WO2023026543A1 (en) 2023-03-02

Family

ID=85322654

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2022/011543 WO2023026543A1 (en) 2021-08-27 2022-03-15 Information processing device, information processing method, and program

Country Status (1)

Country Link
WO (1) WO2023026543A1 (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH06342254A (en) * 1993-06-01 1994-12-13 Mitsubishi Electric Corp Monitoring device
JP2004312301A (en) * 2003-04-04 2004-11-04 Sumitomo Electric Ind Ltd Method, system and device for displaying image
JP2009017223A (en) * 2007-07-04 2009-01-22 Sony Corp Imaging device, image processing device, and their image processing method and program
JP2012028837A (en) * 2010-07-20 2012-02-09 Fujitsu Semiconductor Ltd Digest value generation device and digest value generation program
JP2017033448A (en) * 2015-08-05 2017-02-09 大日本印刷株式会社 Image processing unit, program and image processing method
WO2017090462A1 (en) * 2015-11-27 2017-06-01 ソニー株式会社 Information processing device, information processing method, and program
WO2021106370A1 (en) * 2019-11-28 2021-06-03 ソニーセミコンダクタソリューションズ株式会社 Solid-state image sensor, image-capturing system, and control method for solid-state image sensor

Similar Documents

Publication Publication Date Title
CN108369457B (en) Reality mixer for mixed reality
US20180158246A1 (en) Method and system of providing user facial displays in virtual or augmented reality for face occluding head mounted displays
US11580652B2 (en) Object detection using multiple three dimensional scans
WO2017176349A1 (en) Automatic cinemagraph
US20220028157A1 (en) 3d conversations in an artificial reality environment
KR20190041586A (en) Electronic device composing a plurality of images and method
US11941729B2 (en) Image processing apparatus, method for controlling image processing apparatus, and storage medium
CN107743637A (en) Method and apparatus for handling peripheral images
CN112272296B (en) Video illumination using depth and virtual light
CN109427089B (en) Mixed reality object presentation based on ambient lighting conditions
WO2023026543A1 (en) Information processing device, information processing method, and program
WO2020215263A1 (en) Image processing method and device
JP2007102478A (en) Image processor, image processing method, and semiconductor integrated circuit
US11418723B1 (en) Increasing dynamic range of a virtual production display
US11818325B2 (en) Blended mode three dimensional display systems and methods
JP2023099443A (en) Ar processing method and device
US20230056459A1 (en) Image processing device, method of generating 3d model, learning method, and program
WO2022011621A1 (en) Face illumination image generation apparatus and method
CN111612915A (en) Rendering objects to match camera noise
US11967014B2 (en) 3D conversations in an artificial reality environment
CN116245741B (en) Image processing method and related device
JP7304484B2 (en) Portrait Relighting Using Infrared Light
US11823343B1 (en) Method and device for modifying content according to various simulation characteristics
US20240107113A1 (en) Parameter Selection for Media Playback
US20230342487A1 (en) Systems and methods of image processing for privacy management

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22860841

Country of ref document: EP

Kind code of ref document: A1