CN114842063A - Depth map optimization method, apparatus, device, and storage medium

Depth map optimization method, apparatus, device, and storage medium

Info

Publication number
CN114842063A
Authority
CN
China
Prior art keywords
depth
map
depth map
segmentation
rgb
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110131227.XA
Other languages
Chinese (zh)
Inventor
程载熙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN202110131227.XA priority Critical patent/CN114842063A/en
Publication of CN114842063A publication Critical patent/CN114842063A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/50 - Depth or shape recovery
    • G06T7/55 - Depth or shape recovery from multiple images
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 - Image enhancement or restoration
    • G06T5/50 - Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/10 - Segmentation; Edge detection
    • G06T7/11 - Region-based segmentation
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/10 - Image acquisition modality
    • G06T2207/10024 - Color image
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/10 - Image acquisition modality
    • G06T2207/10028 - Range image; Depth image; 3D point clouds
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20212 - Image combination
    • G06T2207/20221 - Image fusion; Image merging

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

This application provides a depth map optimization method, apparatus, device, and storage medium, relating to the field of image processing. The method includes: acquiring a depth map and an RGB map corresponding to a real scene; segmenting the RGB map to obtain a segmentation map corresponding to the RGB map; determining the effective depth in the depth map according to the segmentation map; and, according to the effective depth in the depth map and the segmentation map, obtaining by fusion an optimized depth map corresponding to the real scene. With this method, problems such as holes, noise, and environmental interference in the depth map collected by a depth camera can be mitigated, and an optimized depth map of better quality and higher resolution can be output with low power consumption and low latency.

Description

Depth map optimization method, apparatus, device, and storage medium
Technical Field
The embodiments of this application relate to the field of image processing, and in particular to a depth map optimization method, apparatus, device, and storage medium.
Background
Augmented reality (AR) technology is a technology that fuses virtual information with a real scene; its core is this fusion of virtual information and the real scene. The fusion process involves virtual-real occlusion processing between the virtual information and the real scene. When performing virtual-real occlusion processing, the depth information of the real scene and the depth information of the virtual information need to be acquired, and the occlusion effect between them is then realized according to this depth information. Since the virtual information is generated by computation and simulation, its depth information is known; thus, one of the keys to virtual-real occlusion processing is how to acquire the depth information of the real scene.
At present, the depth information of a real scene is generally acquired as follows: a depth camera (also called a depth sensor) collects a depth map corresponding to the real scene, and the depth map contains the depth information of the real scene.
However, the depth map corresponding to the real scene acquired by the depth camera has poor quality and low resolution.
Disclosure of Invention
The embodiments of this application provide a depth map optimization method, apparatus, device, and storage medium, which can mitigate problems such as holes, noise, and environmental interference in the depth map collected by a depth camera, and output an optimized depth map of better quality and higher resolution with low power consumption and low latency.
In a first aspect, an embodiment of the present application provides a depth map optimization method. The method includes: acquiring a depth map and an RGB map corresponding to a real scene; segmenting the RGB map to obtain a segmentation map corresponding to the RGB map; determining the effective depth in the depth map according to the segmentation map; and, according to the effective depth in the depth map and the segmentation map, obtaining by fusion an optimized depth map corresponding to the real scene.
With this depth map optimization method, the RGB map collected by the color camera and the depth map collected by the depth camera are preprocessed, the preprocessed RGB map is segmented to obtain a segmentation map corresponding to the RGB map, and the depth map is then fused with the segmentation map, so that an optimized depth map of better quality and higher resolution is obtained. Moreover, obtaining the optimized depth map in this way achieves low latency and low power consumption.
For example, a depth map collected by a depth camera suffers from problems such as holes, noise, and environmental interference; after the depth map is optimized by this method, the quality of the optimized depth map is better.
As another example, the resolution of a typical TOF depth camera is 240 × 180; when the application scene requires a depth range of 5 meters (or more), the TOF depth camera needs strong exposure, and high exposure at high resolution necessarily results in high power consumption. With this depth map optimization method, the resolution of the depth map can be improved, so low power consumption can be achieved by reducing the resolution (e.g., 48 × 30) and the frame rate (e.g., 10 fps) of the depth map collected by the TOF depth camera. In addition, because the method does not require a dedicated super-resolution algorithm for the depth map, the processing is lightweight and concise, and lower latency can be achieved.
Optionally, determining the effective depth in the depth map according to the segmentation map includes: traversing the depth map pixel by pixel and obtaining the information of the first object instance corresponding to each pixel in the depth map, to obtain, for each first object instance in the depth map, the sum of effective depths and the number of effective-depth pixels. Obtaining by fusion the optimized depth map corresponding to the real scene according to the effective depth in the depth map and the segmentation map includes: determining the average depth of each first object instance according to the sum of effective depths and the number of effective-depth pixels corresponding to that first object instance in the depth map; and upsampling the segmentation map to a first resolution and, for each first object instance, filling in the average depth of that instance, to obtain the optimized depth map corresponding to the real scene.
Optionally, the segmentation map is a portrait segmentation map, and the first object instance is a portrait instance.
The embodiments of the present application are equally applicable to optimizing the depth of other objects (e.g., animals, trees, etc.) in the depth map. In other words, the embodiment of the present application can be extended and applied to optimize the depth map in more scenes, and is not limited to the portrait depth.
Optionally, before determining the effective depth in the depth map according to the segmentation map, the method further includes: the segmentation map is scaled to the same size as the depth map.
Optionally, before the segmenting the RGB map, the method further includes: the RGB map and the depth map are aligned and normalized.
Optionally, segmenting the RGB map to obtain a segmentation map corresponding to the RGB map includes: performing inference on the RGB map with a trained deep neural network model to obtain a mask map corresponding to the RGB map; and performing connected-region segmentation on the mask map to obtain the segmentation map.
Optionally, the depth map is a depth map acquired by a depth camera, or a monocular estimated depth map; wherein, the depth camera includes: any one of a structured light depth camera, a time-of-flight depth camera, and a binocular depth camera.
That is, the depth map acquired by the depth camera can be optimized by the depth map optimization method, and the depth map of monocular estimation can also be optimized.
In a second aspect, an embodiment of the present application provides a depth map optimization apparatus, which may be used to implement the method described in the first aspect. The functions of the device can be realized by hardware, and can also be realized by executing corresponding software by hardware. The hardware or software includes one or more modules or units corresponding to the above functions, for example, an acquisition module, a division module, a multi-information fusion module, and the like.
The acquisition module is configured to acquire a depth map and an RGB map corresponding to a real scene; the segmentation module is configured to segment the RGB map to obtain a segmentation map corresponding to the RGB map; and the multi-information fusion module is configured to determine the effective depth in the depth map according to the segmentation map and, according to the effective depth in the depth map and the segmentation map, obtain by fusion an optimized depth map corresponding to the real scene.
Optionally, the multi-information fusion module is specifically configured to traverse the depth map pixel by pixel, obtain information of a first object instance corresponding to each pixel in the depth map, and obtain a total effective depth and a number of pixels of the effective depth corresponding to each first object instance in the depth map; determining the average depth of each first object instance according to the effective depth sum corresponding to each first object instance in the depth map and the number of pixels of the effective depth; and upsampling the segmentation map to a first resolution, and filling the average depth of the first object examples aiming at each first object example to obtain an optimized depth map corresponding to the real scene.
Optionally, the segmentation map is a portrait segmentation map, and the first object instance is a portrait instance.
Optionally, the multi-information fusion module is further configured to scale the segmentation map to the same size as the depth map before determining the effective depth in the depth map according to the segmentation map.
Optionally, the apparatus further comprises: and the preprocessing module is used for aligning and normalizing the RGB image and the depth image.
Optionally, the segmentation module is specifically configured to use the trained deep neural network model to perform inference on the RGB map to obtain a mask map corresponding to the RGB map; and carrying out connected region segmentation on the mask map to obtain a segmentation map.
Optionally, the depth map is a depth map acquired by a depth camera, or a monocular estimated depth map; wherein, the depth camera includes: any one of a structured light depth camera, a time-of-flight depth camera, and a binocular depth camera.
In a third aspect, an embodiment of the present application provides an electronic device, including: a processor, a memory for storing processor-executable instructions; the processor is configured to execute the instructions such that the electronic device implements the depth map optimization method as described in the first aspect.
The electronic device may be a mobile terminal such as a mobile phone, a tablet computer, a wearable device, a vehicle-mounted device, an AR/VR device, a notebook computer, an ultra-mobile personal computer, a netbook, or a personal digital assistant.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium having stored thereon computer program instructions; the computer program instructions, when executed by the electronic device, cause the electronic device to implement the depth map optimization method as described in the first aspect.
In a fifth aspect, an embodiment of the present application further provides a computer program product, which includes computer-readable code; when the computer-readable code is run in an electronic device, the electronic device is caused to implement the depth map optimization method described in the first aspect.
For the beneficial effects of the second to fifth aspects, reference may be made to the description of the first aspect; details are not repeated here.
It should be appreciated that the description of technical features, solutions, benefits, or similar language in this application does not imply that all of the features and advantages may be realized in any single embodiment. Rather, it is to be understood that the description of a feature or advantage is intended to include the specific features, aspects or advantages in at least one embodiment. Therefore, the descriptions of technical features, technical solutions or advantages in the present specification do not necessarily refer to the same embodiment. Furthermore, the technical features, technical solutions and advantages described in the present embodiments may also be combined in any suitable manner. One skilled in the relevant art will recognize that an embodiment may be practiced without one or more of the specific features, aspects, or advantages of a particular embodiment. In other instances, additional features and advantages may be recognized in certain embodiments that may not be present in all embodiments.
Drawings
FIG. 1 illustrates an interface diagram of an AR application;
fig. 2 shows a schematic structural diagram of a terminal device provided in an embodiment of the present application;
fig. 3 is a flowchart illustrating a depth map optimization method provided in an embodiment of the present application;
FIG. 4 is a logic diagram illustrating a depth map optimization method provided by an embodiment of the present application;
fig. 5 is a schematic diagram illustrating an effect of the depth map optimization method provided by the embodiment of the present application;
fig. 6 is a schematic structural diagram illustrating a depth map optimization apparatus provided in an embodiment of the present application;
fig. 7 shows another schematic structural diagram of the depth map optimization device provided in the embodiment of the present application.
Detailed Description
Augmented reality (AR) is a technology that fuses virtual information with a real scene. It makes extensive use of techniques such as multimedia, three-dimensional modeling, real-time tracking and registration, intelligent interaction, and sensing, so that computer-generated virtual information such as text, images, three-dimensional models, music, and videos can be applied to the real scene after simulation; the two kinds of information complement each other, thereby 'augmenting' the real scene.
Taking an AR application installed on a mobile phone as an example, fig. 1 shows an interface schematic diagram of an AR application. As shown in fig. 1, in an AR application running on a mobile phone, an image of a real scene captured by a camera may be fused with virtual information obtained by virtual synthesis, and displayed on a screen of the mobile phone. The virtual information may include virtual words, pictures, videos, and the like.
It can be seen that the main principle of AR technology implementation is to fuse virtual information with a real scene. In the process of fusing the virtual information and the real scene, the virtual and real occlusion processing between the virtual information and the real scene is involved. For example, a certain spatial position relationship may exist between the virtual information and the real scene, and from the user perspective or from the display effect, the spatial position relationship is also an occlusion relationship between the virtual information and the real scene, such as: when the virtual information is a virtual object, the occlusion relationship between the virtual object and the object in the real scene may be that the virtual object occludes the object in the real scene, or that the object in the real scene occludes the virtual object. Therefore, in the process of fusing the virtual information and the real scene, the corresponding virtual and real occlusion processing needs to be performed in combination with the occlusion relationship between the virtual information and the real scene.
When performing virtual-real occlusion processing, the depth information of the real scene and the depth information of the virtual information need to be acquired, and the occlusion effect between the real scene and the virtual information is then realized according to this depth information. Since the virtual information is generated by computation and simulation, its depth information is known; thus, one of the keys to virtual-real occlusion processing is how to acquire the depth information of the real scene.
At present, the method for acquiring depth information of a real scene generally includes: a depth camera (or called a depth sensor) acquires a depth map corresponding to a real scene, and the depth map contains depth information of the real scene. Depth cameras may include both active depth cameras and passive depth cameras.
Active depth cameras mainly include structured light depth cameras and time-of-flight (TOF) depth cameras. The basic principle of a structured light depth camera is that light with certain structural features is projected onto the photographed object (i.e., an object in the real scene) by a near-infrared laser and then collected by a dedicated infrared camera. Because different regions of the photographed object lie at different depths, the structured light acquires different phase information, and an arithmetic unit converts this change of structure into depth information, thereby obtaining a depth map corresponding to the real scene. The basic principle of a TOF depth camera is that modulated light pulses are emitted by an infrared transmitter; the light pulses are reflected by objects in the real scene and received by a receiver, and the distance between the TOF depth camera and the objects in the real scene is calculated from the round-trip time of the light pulses, thereby obtaining a depth map corresponding to the real scene.
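As a brief illustration (the standard time-of-flight relation, not specific to this application), the distance to an object follows from the measured round-trip time as

$d = \frac{c \cdot \Delta t}{2}$

where $c$ is the speed of light and $\Delta t$ is the round-trip time of the light pulse.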
Passive depth cameras mainly include binocular depth cameras. The basic principle of a binocular depth camera is that, based on the parallax principle, two images of an object in the real environment are captured from different positions using imaging devices, feature matching is then performed between the two images, and the depth map corresponding to the real scene is obtained by computing the positional offset (disparity) between corresponding feature points of the two images.
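As a brief illustration (the standard stereo triangulation relation, not specific to this application), the depth of a matched feature point follows from its disparity as

$Z = \frac{f \cdot B}{d}$

where $f$ is the focal length, $B$ is the baseline between the two imaging positions, and $d$ is the disparity, i.e., the positional offset between the corresponding feature points.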
However, the quality of the depth map corresponding to the real scene collected by the depth camera is poor. For example, a depth map collected by a structured light depth camera is easily disturbed by ambient light (such as sunlight), so its quality is poor. A depth map collected by a TOF depth camera suffers from noise such as multi-path interference (MPI), crosstalk, and scattering, and because the camera emits active light, it is difficult to compute depth for low-reflectivity objects in the real scene, such as black hair. A binocular depth camera depends on the quality of the captured images, and it is difficult to compute depth in overexposed, dark, or textureless regions of the image.
In addition, the resolution of the depth map corresponding to the real scene collected by the depth camera is also low. For example, the resolution of a depth map collected by a common depth camera is 240 × 180, whereas during virtual-real occlusion processing the resolution of the color map (i.e., the RGB map) corresponding to the real scene is generally higher, e.g., 1920 × 1080; therefore, super-resolution processing of the depth map is needed.
In view of the poor quality and low resolution of the depth map corresponding to a real scene collected by a depth camera, an embodiment of this application provides a depth map optimization method. The method includes: acquiring an RGB map collected by a color camera and a depth map collected by a depth camera; preprocessing the RGB map and the depth map; segmenting the preprocessed RGB map to obtain a segmentation map corresponding to the RGB map; and fusing the depth map with the segmentation map corresponding to the RGB map to obtain an optimized depth map. In this way, the depth map collected by the depth camera can be optimized, and an optimized depth map of better quality and higher resolution is obtained.
It can be understood that the RGB map collected by the color camera and the depth map collected by the depth camera are the RGB map and the depth map corresponding to the real scene.
Illustratively, the method can be applied to a terminal device. The terminal device may be a mobile phone, a tablet computer, a wearable device, an in-vehicle device, an Augmented Reality (AR)/Virtual Reality (VR) device, a notebook computer, an ultra-mobile personal computer (UMPC), a netbook, a Personal Digital Assistant (PDA), and other mobile terminals, and the specific type of the terminal device is not limited in the embodiment of the present application.
The following specifically describes an embodiment of the present application with reference to the drawings, taking a terminal device as a mobile phone as an example.
In the description of the present application, "at least one" means one or more, "a plurality" means two or more. The words "first", "second", etc. are used merely to distinguish one element from another, and are not intended to particularly limit one feature. "and/or" is used to describe the association relationship of the associated objects, meaning that three relationships may exist. For example, a and/or B, may represent: a exists alone, A and B exist simultaneously, and B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship.
Taking a terminal device as a mobile phone as an example, fig. 2 shows a schematic structural diagram of the terminal device provided in the embodiment of the present application. As shown in fig. 2, the mobile phone may include a processor 210, an external memory interface 220, an internal memory 221, a Universal Serial Bus (USB) interface 230, a charging management module 240, a power management module 241, a battery 242, an antenna 1, an antenna 2, a mobile communication module 250, a wireless communication module 260, an audio module 270, a speaker 270A, a receiver 270B, a microphone 270C, an earphone interface 270D, a sensor module 280, keys 290, a motor 291, an indicator 292, a camera 293, a display 294, a Subscriber Identity Module (SIM) card interface 295, and the like.
Processor 210 may include one or more processing units, such as: the processor 210 may include an Application Processor (AP), a modem processor, a Graphics Processing Unit (GPU), an Image Signal Processor (ISP), a controller, a memory, a video codec, a Digital Signal Processor (DSP), a baseband processor, and/or a neural-Network Processing Unit (NPU), etc. The different processing units may be separate devices or may be integrated into one or more processors.
The controller can be a nerve center and a command center of the mobile phone. The controller can generate an operation control signal according to the instruction operation code and the timing signal to complete the control of instruction fetching and instruction execution.
A memory may also be provided in processor 210 for storing instructions and data. In some embodiments, the memory in the processor 210 is a cache memory. The memory may hold instructions or data that have just been used or recycled by processor 210. If the processor 210 needs to use the instruction or data again, it can be called directly from the memory. Avoiding repeated accesses reduces the latency of the processor 210, thereby increasing the efficiency of the system.
In some embodiments, processor 210 may include one or more interfaces. The interface may include an integrated circuit (I2C) interface, an integrated circuit built-in audio (I2S) interface, a Pulse Code Modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a Mobile Industry Processor Interface (MIPI), a general-purpose input/output (GPIO) interface, a SIM interface, and/or a USB interface, etc.
The external memory interface 220 may be used to connect an external memory card, such as a Micro SD card, to extend the storage capability of the mobile phone. The external memory card communicates with the processor 210 through the external memory interface 220 to implement a data storage function. For example, files such as music, video, etc. are saved in an external memory card.
Internal memory 221 may be used to store computer-executable program code, including instructions. The processor 210 executes various functional applications of the cellular phone and data processing by executing instructions stored in the internal memory 221. The internal memory 221 may include a program storage area and a data storage area. The storage program area may store an operating system, an application program (such as a sound playing function, an image playing function, etc.) required by at least one function, and the like. The data storage area can store data (such as image data, phone book and the like) created in the use process of the mobile phone. In addition, the internal memory 221 may include a high-speed random access memory, and may further include a nonvolatile memory, such as at least one magnetic disk storage device, a flash memory device, a universal flash memory (UFS), and the like.
The charge management module 240 is configured to receive a charging input from a charger. The charging management module 240 can also supply power to the mobile phone through the power management module 241 while charging the battery 242. The power management module 241 is used to connect the battery 242, the charging management module 240, and the processor 210. The power management module 241 may also receive input from the battery 242 to power the mobile phone.
The wireless communication function of the mobile phone can be realized by the antenna 1, the antenna 2, the mobile communication module 250, the wireless communication module 260, the modem processor, the baseband processor, and the like. The antennas 1 and 2 are used for transmitting and receiving electromagnetic wave signals. Each antenna in the handset may be used to cover a single or multiple communication bands. Different antennas can also be multiplexed to improve the utilization of the antennas. For example: the antenna 1 may be multiplexed as a diversity antenna of a wireless local area network. In other embodiments, the antenna may be used in conjunction with a tuning switch.
The mobile phone can implement an audio function through the audio module 270, the speaker 270A, the receiver 270B, the microphone 270C, the earphone interface 270D, the application processor, and the like. Such as music playing, recording, etc.
The sensor module 280 may include a pressure sensor 280A, a gyroscope sensor 280B, an air pressure sensor 280C, a magnetic sensor 280D, an acceleration sensor 280E, a distance sensor 280F, a proximity light sensor 280G, a fingerprint sensor 280H, a temperature sensor 280J, a touch sensor 280K, an ambient light sensor 280L, a bone conduction sensor 280M, and the like.
The display screen 294 is used to display images, videos, and the like. The display screen 294 includes a display panel. The display panel may be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Mini-LED, a Micro-LED, a Micro-OLED, a quantum dot light-emitting diode (QLED), or the like. In some embodiments, the mobile phone may include 1 or N display screens 294, where N is a positive integer greater than 1. For example, the display screen 294 may be used to display a photographing interface, a photo playing interface, and the like.
The mobile phone implements the display function through the GPU, the display screen 294, and the application processor. The GPU is a microprocessor for image processing, and is connected to the display screen 294 and an application processor. The GPU is used to perform mathematical and geometric calculations for graphics rendering. Processor 210 may include one or more GPUs that execute program instructions to generate or alter display information.
It is to be understood that the structure shown in fig. 2 is not to be construed as a specific limitation of the cellular phone. In some embodiments, the handset may also include more or fewer components than shown in fig. 2, or some components may be combined, some components may be separated, or a different arrangement of components may be used, etc. Alternatively still, some of the components shown in FIG. 2 may be implemented in hardware, software, or a combination of software and hardware.
In addition, when the terminal device is a mobile terminal such as another tablet computer, a wearable device, an in-vehicle device, an AR/VR device, a notebook computer, a UMPC, a netbook, a PDA, etc., the specific structure of the other terminal device may also refer to fig. 2. Illustratively, other terminal devices may have additional or fewer components than the structure shown in fig. 2, and are not described in detail here.
Fig. 3 shows a flowchart of a depth map optimization method provided in an embodiment of the present application. As shown in fig. 3, the depth map optimization method provided in the embodiment of the present application may include S301 to S304.
S301, acquiring an RGB (red, green and blue) image acquired by a color camera and a depth image acquired by a depth camera.
The depth camera may be any one of a structured light (structured light) depth camera, a time of flight (TOF) depth camera, and a binocular depth camera, or may be another type of depth map camera, which is not limited herein.
S302, preprocessing the RGB image and the depth image.
For example, the fields of view (FOV) of the RGB map and the depth map may be aligned, and preprocessing such as normalization may be performed, to obtain the preprocessed RGB map and depth map.
And S303, segmenting the preprocessed RGB image to obtain a segmentation image.
For example, the preprocessed RGB map may be input into a deep convolutional neural network (DCNN) model for portrait segmentation inference; the DCNN model performs portrait mask inference on the preprocessed RGB map and outputs a mask map. Then, connected-region segmentation may be performed on the mask map output by the DCNN model to obtain a portrait segmentation map corresponding to the RGB map. That is, the segmentation map obtained in S303 may be a portrait segmentation map.
Optionally, the DCNN model is an offline-trained neural network model. In other implementations, when the preprocessed RGB map is segmented, other network models for segmentation inference may be used; the model is not limited to a DCNN model for portrait segmentation inference, and no limitation is imposed here.
And S304, fusing the segmentation map and the preprocessed depth map to obtain an optimized depth map.
For example, taking the segmentation map obtained in S303 as a portrait segmentation map, the step of fusing the segmentation map and the preprocessed depth map may include: traversing the preprocessed depth map pixel by pixel and obtaining the portrait instance information corresponding to each pixel (in the embodiments of the present application, a portrait instance or another object instance may be referred to as a first object instance); if the pixel belongs to the background (not a portrait), it is ignored; if it belongs to a portrait, its depth is recorded as an effective depth. The sum of effective depths and the number of effective-depth pixels corresponding to each portrait instance in the preprocessed depth map are counted, and the average depth of each portrait instance is calculated from them. Then, the portrait segmentation map is upsampled to a first resolution and the average depth is filled in for each portrait instance, so that a portrait depth map of better quality and higher resolution is obtained; this portrait depth map is the optimized depth map.
The depth map optimization method will be described in more detail below by taking the segmentation map as the portrait segmentation map as an example.
Referring to fig. 4 and 5, fig. 4 shows a logic schematic diagram of a depth map optimization method provided in the embodiment of the present application, and fig. 5 shows an effect schematic diagram of the depth map optimization method provided in the embodiment of the present application.
As shown in fig. 4, in the embodiment of the present application, the process of preprocessing the RGB map collected by the color camera and the depth map collected by the depth camera may be completed by a preprocessing module. After the RGB image collected by the color camera and the depth image collected by the depth camera are obtained, (the mobile phone) can input the RGB image and the depth image into the preprocessing module, and the preprocessing module can preprocess the RGB image and the depth image.
Assuming that the RGB map is Rgb@1440×1080 (i.e., its resolution is 1440 × 1080) and the depth map is DepthLR@48×30 (i.e., its resolution is 48 × 30), the preprocessing module may perform the following steps on the RGB map. Here, Rgb@1440×1080 corresponds to (a) in fig. 5, and DepthLR@48×30 corresponds to (b) in fig. 5.
1) Rgb@1440×1080 is scaled to 288 × 288, yielding RgbSmall@288×288.
2) RgbSmall@288×288 is rotated by the rotation angle R so that the person in the image stays level, yielding RgbSmallR@288×288.
3) The three channels of RgbSmallR@288×288 are normalized with the normalization coefficients (R1, R2, G1, G2, B1, B2): R1 is subtracted from the R channel and the result is divided by R2, G1 is subtracted from the G channel and divided by G2, and B1 is subtracted from the B channel and divided by B2, yielding RgbSmallRN@288×288.
4) RgbSmallRN@288×288 is channel-converted to obtain RgbSmallRNC@288×288.
Specifically, since the data layout of RgbSmallRN@288×288 is HWC (H denotes height, W denotes width, and C denotes channel) while the input data layout of the DCNN model is CHW, channel conversion needs to be performed on RgbSmallRN@288×288 to obtain RgbSmallRNC@288×288 in the CHW layout.
After the foregoing preprocessing of Rgb@1440×1080, the resulting RgbSmallRNC@288×288 corresponds to (c) in fig. 5; a sketch of these RGB preprocessing steps is given below.
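For illustration only, the following is a minimal sketch of RGB preprocessing steps 1) to 4) using OpenCV and NumPy. The rotation angle and normalization coefficients are placeholders that in practice come from the camera parameters, and the input is assumed to be in RGB channel order.

```python
import cv2
import numpy as np

def preprocess_rgb(rgb, angle_r, coeffs):
    """Sketch only. rgb: HxWx3 uint8 image in RGB channel order (assumption);
    angle_r: rotation angle R; coeffs: (R1, R2, G1, G2, B1, B2)."""
    # 1) scale to the segmentation network input size, 288 x 288
    small = cv2.resize(rgb, (288, 288))

    # 2) rotate by the angle R so that the person stays level
    m = cv2.getRotationMatrix2D((144, 144), angle_r, 1.0)
    small_r = cv2.warpAffine(small, m, (288, 288))

    # 3) per-channel normalization: subtract the offset, divide by the scale
    r1, r2, g1, g2, b1, b2 = coeffs
    norm = small_r.astype(np.float32)
    norm[..., 0] = (norm[..., 0] - r1) / r2
    norm[..., 1] = (norm[..., 1] - g1) / g2
    norm[..., 2] = (norm[..., 2] - b1) / b2

    # 4) channel conversion: HWC -> CHW, the layout expected by the DCNN model
    return np.transpose(norm, (2, 0, 1))
```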
The step of the pre-processing module pre-processing the depth map may be as follows.
1) DepthLR@48×30 is cropped to obtain DepthLRCrop@40×30.
Specifically, due to the data format of DepthLR, although its resolution is 48 × 30, only 40 × 30 of it is actually valid, so the rightmost 8 × 30 region needs to be cropped off first to obtain DepthLRCrop@40×30.
2) Data parsing is performed on DepthLRCrop@40×30 to obtain DepthLRCropT@40×30.
Specifically, the data obtained in step 1) is generally stored as a ushort type, with each value occupying 16 bits while the real depth occupies only the lower 13 bits, so the data needs to be parsed to obtain DepthLRCropT@40×30 holding the 13-bit real depth values.
3) DepthLRCropT@40×30 is aligned to the preprocessed RGB map based on the Rgb-D alignment factor (scale) to yield DepthAlign@54×40. The scale, which may also be referred to as a scaling factor, may be 1.348 here.
Specifically, the resolution of the aligned depth map calculated from the Rgb-D alignment coefficient should be 54 × 40, so DepthLRCropT@40×30 needs to be padded into DepthAlign@54×40.
After the foregoing preprocessing of DepthLR@48×30, the resulting DepthAlign@54×40 corresponds to (d) in fig. 5; a sketch of these depth preprocessing steps is given below.
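For illustration only, a minimal NumPy sketch of depth preprocessing steps 1) to 3). The 13-bit mask follows the description above, while the exact placement of the cropped map inside the 54 × 40 buffer is an assumption; in practice it comes from the Rgb-D calibration.

```python
import numpy as np

def preprocess_depth(depth_lr):
    """Sketch only. depth_lr: 30x48 uint16 array (DepthLR@48x30, H x W)."""
    # 1) crop: only 40 x 30 of the 48 x 30 map is valid, so drop the
    #    rightmost 8 columns to get DepthLRCrop@40x30
    crop = depth_lr[:, :40]

    # 2) parse: values are stored as 16-bit ushort but only the lower
    #    13 bits hold the real depth (DepthLRCropT@40x30)
    depth = (crop & 0x1FFF).astype(np.float32)

    # 3) align: the Rgb-D alignment factor (1.348 in the example) gives an
    #    aligned resolution of 54 x 40, so pad the 40 x 30 map into a
    #    54 x 40 buffer (DepthAlign@54x40); the placement offset is assumed
    aligned = np.zeros((40, 54), dtype=np.float32)
    aligned[:depth.shape[0], :depth.shape[1]] = depth
    return aligned
```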
It can be understood that, in the foregoing preprocessing of the depth map and the RGB map, parameters such as the rotation angle R, the Rgb-D alignment coefficient, and the normalization coefficients may be determined according to the acquired depth map and RGB map, for example, according to the camera parameters used by the color camera and the depth camera when capturing the RGB map and the depth map; details are not described here.
After the RGB image and the depth image are respectively preprocessed, the preprocessed RGB image may be segmented to obtain a portrait segmentation image corresponding to the RGB image. The process of segmenting the preprocessed RGB image can be completed through a DCNN segmentation module.
The DCNN segmentation module may include DCNN_PeopleSeg (a deep convolutional neural network model for portrait segmentation inference that has been trained offline) and a connected-region segmentation submodule. The preprocessed RGB map, e.g., RgbSmallRNC@288×288, may be input to DCNN_PeopleSeg for inference, yielding MaskOut@288×288 output by DCNN_PeopleSeg. Then, the connected-region segmentation submodule may perform connected-region segmentation on MaskOut@288×288 to obtain Mask@288×288 in ushort format, where the background label in Mask@288×288 is 0 and the portrait instance labels start from 1. For example, Mask@288×288 may be as shown in fig. 5 (e), containing two portrait instances, so that the first portrait instance (e.g., the left one) has label 1 and the second portrait instance (e.g., the right one) has label 2. A sketch of this step is given below.
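For illustration only, a minimal sketch of the connected-region segmentation step using SciPy, assuming the network output MaskOut is a probability map; the 0.5 threshold is an assumption, and the inference call to DCNN_PeopleSeg itself is framework-specific and omitted.

```python
import numpy as np
from scipy import ndimage

def connected_region_segmentation(mask_out, threshold=0.5):
    """Sketch only. mask_out: 288x288 portrait probability map, e.g. the
    MaskOut@288x288 produced by the segmentation network."""
    # binarize the network output into portrait (1) vs. background (0)
    binary = (mask_out > threshold).astype(np.uint8)

    # connected-region segmentation: the background keeps label 0 and each
    # connected portrait region is labelled 1, 2, ... in scan order
    labels, num_instances = ndimage.label(binary)

    # return the labels in a ushort-like format, matching Mask@288x288 above
    return labels.astype(np.uint16), num_instances
```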
After obtaining the portrait segmentation map corresponding to the RGB map, the portrait segmentation map and the preprocessed depth map may be fused to obtain the optimized depth map. The process of fusing the portrait segmentation map and the preprocessed depth map can be realized by a multi-information fusion module.
Taking the portrait segmentation map Mask@288×288 and the preprocessed depth map DepthAlign@54×40 as an example, the procedure by which the multi-information fusion module fuses the portrait segmentation map and the preprocessed depth map may be as follows.
1) Mask@288×288 is scaled to the size of DepthAlign@54×40, yielding MaskLR@54×40.
2) MaskLR@54×40 and DepthAlign@54×40 are traversed simultaneously to obtain, for each label i (i = 0, 1, 2, ...), the number num_i of effective depth values and the sum value_i of those depth values. Taking label 2 (i.e., the second portrait instance) as an example: each pixel (x, y) in MaskLR@54×40 and DepthAlign@54×40 is visited, where x and y are the pixel coordinates; when MaskLR(x, y) equals 2 and DepthAlign(x, y) is greater than 0, num_2 is incremented by 1 and DepthAlign(x, y) is added to value_2. Traversing the whole map in this way yields num_i and value_i for all labels.
3) The average depth value of each label, i.e., value_i divided by num_i, is calculated to obtain avg_i. The avg_i of each label i is the average depth value corresponding to that label.
4) DepthT@288×288 is initialized, and all pixels (x, y) in Mask@288×288 are traversed; when Mask(x, y) equals i, DepthT(x, y) is set to avg_i. After all pixels have been traversed, DepthT@288×288 with real depth values is obtained.
5) DepthT@288×288 is scaled to the first resolution and output; for example, when the first resolution is 640 × 480, the output is DepthHR@640×480, which corresponds to (f) in fig. 5. A sketch of these fusion steps is given below.
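For illustration only, a minimal sketch of fusion steps 1) to 5) with OpenCV and NumPy. The nearest-neighbour interpolation modes are assumptions, and background pixels (label 0) are skipped as described in S304.

```python
import cv2
import numpy as np

def fuse(mask, depth_align, first_resolution=(640, 480)):
    """Sketch only. mask: 288x288 instance labels (0 = background);
    depth_align: 40x54 preprocessed depth map (DepthAlign@54x40);
    first_resolution: output size as (width, height)."""
    # 1) scale the segmentation map to the size of the depth map (MaskLR@54x40)
    h, w = depth_align.shape
    mask_lr = cv2.resize(mask, (w, h), interpolation=cv2.INTER_NEAREST)

    # 2)-3) accumulate effective depths per label and compute avg_i = value_i / num_i,
    #       skipping the background (label 0)
    avg = {}
    for i in np.unique(mask_lr):
        if i == 0:
            continue
        valid = (mask_lr == i) & (depth_align > 0)   # effective depths only
        if valid.any():
            avg[int(i)] = float(depth_align[valid].sum() / valid.sum())

    # 4) fill every pixel of each instance in the full-size mask with its average depth
    depth_t = np.zeros(mask.shape, dtype=np.float32)
    for i, a in avg.items():
        depth_t[mask == i] = a

    # 5) scale DepthT@288x288 to the first resolution (e.g. DepthHR@640x480)
    return cv2.resize(depth_t, first_resolution, interpolation=cv2.INTER_NEAREST)
```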
Compared with the depth map DepthLR@48×30 collected by the depth camera, in the depth map DepthHR@640×480 obtained through the above processing the human body regions are filled with complete depth values, the depth accuracy of the portrait instances is higher, and the quality is better; in addition, the resolution of DepthHR@640×480 is also higher. Optimization of the depth map collected by the depth camera is thereby achieved.
Optionally, although the foregoing embodiments of the present application have been described with reference to optimizing the portrait depth in the depth map, the embodiments of the present application are also applicable to optimizing the depth of other objects (such as animals, trees, etc.) in the depth map. In other words, the embodiment of the present application can be extended and applied to optimize the depth map in more scenes, and is not limited to the portrait depth. For example, the aforementioned DCNN model may be trained as a network model for other object segmentation reasoning to optimize depth maps in more scenarios.
In addition, it should be noted that the depth map optimization method provided in the embodiment of the present application may be applied to not only the virtual-real fusion scene in the AR technology mentioned in the foregoing embodiment, but also more depth map-based application scenes such as large aperture blurring and collision detection, which is not limited herein.
Optionally, in some other embodiments of the present application, the depth map acquired by the depth camera in the foregoing S301 may also be replaced by a depth map acquired in another manner, for example, the depth map may be a monocular estimation depth map, and the present application is also not limited herein.
Furthermore, in some mobile application scenarios (such as mobile phones, tablet computers, and wearable devices), virtual-real occlusion processing must be performed in real time; for example, for the RGB map of each frame of the real scene shot by a mobile phone, a corresponding depth map needs to be collected by the depth camera for virtual-real occlusion processing, so this capture hardware consumes considerable power on the phone. With the depth map optimization method provided by the embodiment of the application, the power consumption of the mobile phone can be reduced to a certain extent, achieving virtual-real occlusion processing with low power consumption and low latency.
Take a mobile phone that acquires the depth map of a real scene through a TOF depth camera (or TOF depth sensor) as an example. Generally, the resolution of the TOF depth camera is 240 × 180, and when the application scene requires a depth range of 5 meters (or more), the TOF depth camera needs strong exposure, and high exposure at high resolution necessarily results in high power consumption. The depth map optimization method provided by the embodiment of the application can improve the resolution of the depth map, so low power consumption can be achieved by reducing the resolution (e.g., 48 × 30) and the frame rate (e.g., 10 fps) of the depth map collected by the TOF depth camera. In addition, because the method does not require a dedicated super-resolution algorithm for the depth map, the processing is lightweight and concise, and lower latency can be achieved.
To sum up, with the depth map optimization method provided by the embodiment of the present application, the RGB map collected by the color camera and the depth map collected by the depth camera are preprocessed, the preprocessed RGB map is segmented to obtain a segmentation map corresponding to the RGB map, and the depth map is then fused with the segmentation map, so that an optimized depth map of better quality and higher resolution is obtained. Moreover, obtaining the optimized depth map in this way achieves low latency and low power consumption.
That is, the depth map optimization method provided by the embodiment of the application can mitigate problems such as holes, noise, and environmental interference in the depth map collected by the depth camera, and output an optimized depth map of better quality and higher resolution with low power consumption and low latency.
It should be understood that the depth map optimization method provided by the embodiment of the present application can also be applied to other non-mobile terminals, such as: the server, the computer and the like can also achieve the same beneficial effects as those described in the foregoing embodiments, and are not described herein again.
Corresponding to the depth map optimization method described in the foregoing embodiment, the embodiment of the present application further provides a depth map optimization apparatus, which can be applied to a terminal device. The functions of the device can be realized by hardware, and can also be realized by executing corresponding software by hardware. The hardware or software includes one or more modules or units corresponding to the above-described functions. For example, fig. 6 shows a schematic structural diagram of a depth map optimization apparatus provided in an embodiment of the present application. As shown in fig. 6, the depth map optimizing apparatus may include: an acquisition module 601, a segmentation module 602, a multi-information fusion module 603, and the like.
The obtaining module 601 is configured to obtain a depth map and an RGB map corresponding to a real scene. The segmentation module 602 is configured to segment the RGB map to obtain a segmentation map corresponding to the RGB map. And the multi-information fusion module 603 is configured to determine an effective depth in the depth map according to the segmentation map, and obtain an optimized depth map corresponding to the real scene through fusion according to the effective depth in the depth map and the segmentation map.
Optionally, the multi-information fusion module 603 is specifically configured to traverse the depth map pixel by pixel, obtain information of a first object instance corresponding to each pixel in the depth map, and obtain a total effective depth and a number of pixels of the effective depth corresponding to each first object instance in the depth map; determining the average depth of each first object instance according to the effective depth sum corresponding to each first object instance in the depth map and the number of pixels of the effective depth; and upsampling the segmentation map to a first resolution, and filling the average depth of the first object examples aiming at each first object example to obtain an optimized depth map corresponding to the real scene.
Optionally, the segmentation map is a portrait segmentation map, and the first object instance is a portrait instance.
Optionally, the multi-information fusion module 603 is further configured to scale the segmentation map to the same size as the depth map before determining the effective depth in the depth map according to the segmentation map.
Optionally, fig. 7 shows another schematic structural diagram of the depth map optimizing apparatus provided in the embodiment of the present application. As shown in fig. 7, the depth map optimizing apparatus further includes: a pre-processing module 604 for aligning and normalizing the RGB map and the depth map.
Optionally, the segmentation module 602 is specifically configured to use the trained deep neural network model to perform inference on the RGB map, so as to obtain a mask map corresponding to the RGB map; and carrying out connected region segmentation on the mask map to obtain a segmentation map.
For example, the segmentation module 602 may be the aforementioned DCNN segmentation module.
Optionally, the depth map is a depth map acquired by a depth camera, or a monocular estimated depth map; wherein, the depth camera includes: any one of a structured light depth camera, a time-of-flight depth camera, and a binocular depth camera.
It should be understood that the division of units or modules (hereinafter referred to as units) in the above apparatus is only a division of logical functions, and may be wholly or partially integrated into one physical entity or physically separated in actual implementation. And the units in the device can be realized in the form of software called by the processing element; or may be implemented entirely in hardware; part of the units can also be realized in the form of software called by a processing element, and part of the units can be realized in the form of hardware.
For example, each unit may be a processing element separately set up, or may be implemented by being integrated into a chip of the apparatus, or may be stored in a memory in the form of a program, and a function of the unit may be called and executed by a processing element of the apparatus. In addition, all or part of the units can be integrated together or can be independently realized. The processing element described herein, which may also be referred to as a processor, may be an integrated circuit having signal processing capabilities. In the implementation process, the steps of the method or the units above may be implemented by integrated logic circuits of hardware in a processor element or in a form called by software through the processor element.
In one example, the units in the above apparatus may be one or more integrated circuits configured to implement the above method, such as: one or more Application Specific Integrated Circuits (ASICs), or one or more Digital Signal Processors (DSPs), or one or more Field Programmable Gate Arrays (FPGAs), or a combination of at least two of these integrated circuit forms.
As another example, when a unit in a device may be implemented in the form of a processing element scheduler, the processing element may be a general purpose processor, such as a CPU or other processor capable of invoking programs. As another example, these units may be integrated together and implemented in the form of a system-on-a-chip (SOC).
In one implementation, the means for implementing the respective corresponding steps of the above method by the above apparatus may be implemented in the form of a processing element scheduler. For example, the apparatus may include a processing element and a memory element, the processing element calling a program stored by the memory element to perform the method described in the above method embodiments. The memory elements may be memory elements on the same chip as the processing elements, i.e. on-chip memory elements.
In another implementation, the program for performing the above method may be in a memory element on a different chip than the processing element, i.e. an off-chip memory element. At this time, the processing element calls or loads a program from the off-chip storage element onto the on-chip storage element to call and execute the method described in the above method embodiment.
For example, the embodiments of the present application may also provide an apparatus, such as: an electronic device may include: a processor, a memory for storing instructions executable by the processor. The processor is configured to execute the above instructions, so that the electronic device implements the depth map optimization method according to the foregoing embodiments. The memory may be located within the electronic device or external to the electronic device. And the processor includes one or more.
In yet another implementation, the unit of the apparatus for implementing the steps of the above method may be configured as one or more processing elements, where the processing elements may be integrated circuits, for example: one or more ASICs, or one or more DSPs, or one or more FPGAs, or a combination of these types of integrated circuits. These integrated circuits may be integrated together to form a chip.
For example, the embodiment of the present application also provides a chip, and the chip can be applied to the electronic device. The chip includes one or more interface circuits and one or more processors; the interface circuit and the processor are interconnected through a line; the processor receives and executes computer instructions from the memory of the electronic device through the interface circuitry to implement the depth map optimization as described in the previous embodiments.
Embodiments of the present application further provide a computer program product, which includes computer readable code, when the computer readable code is executed in an electronic device, the electronic device is enabled to implement the depth map optimization as described in the foregoing embodiments.
Through the above description of the embodiments, it is clear to those skilled in the art that, for convenience and simplicity of description, the foregoing division of the functional modules is merely used as an example, and in practical applications, the above function distribution may be completed by different functional modules according to needs, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the above described functions.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described device embodiments are merely illustrative, and for example, the division of the modules or units is only one logical functional division, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another device, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may be one physical unit or a plurality of physical units, that is, may be located in one place, or may be distributed in a plurality of different places. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a readable storage medium. Based on such understanding, the technical solutions of the embodiments of the present application may be embodied in the form of a software product, such as a program. The software product is stored in a program product, such as a computer-readable storage medium, and includes several instructions for causing a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to perform all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disk.
For example, the embodiments of the present application may also provide a computer-readable storage medium having computer program instructions stored thereon. When the computer program instructions are executed by an electronic device, the electronic device is enabled to implement the depth map optimization method described in the foregoing embodiments.
The above description is only an embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions within the technical scope of the present disclosure should be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A method for depth map optimization, the method comprising:
acquiring a depth map and an RGB map corresponding to a real scene;
segmenting the RGB map to obtain a segmentation map corresponding to the RGB map;
determining effective depth in the depth map according to the segmentation map;
and fusing, according to the effective depth in the depth map and the segmentation map, to obtain an optimized depth map corresponding to the real scene.
2. The method of claim 1, wherein determining the effective depth in the depth map from the segmentation map comprises:
traversing the depth map pixel by pixel, acquiring information of a first object instance corresponding to each pixel in the depth map, and obtaining the effective depth sum corresponding to each first object instance in the depth map and the number of pixels of the effective depth;
the obtaining of the optimized depth map corresponding to the real scene through fusion according to the effective depth in the depth map and the segmentation map includes:
determining the average depth of each first object instance according to the effective depth sum corresponding to each first object instance in the depth map and the number of pixels of the effective depth;
and upsampling the segmentation map to a first resolution, and filling the average depth of the first object instance for each first object instance to obtain an optimized depth map corresponding to the real scene.
3. The method of claim 2, wherein the segmentation map is a portrait segmentation map and the first object instance is a portrait instance.
4. The method of any of claims 1-3, wherein prior to determining the effective depth in the depth map from the segmentation map, the method further comprises:
scaling the segmentation map to the same size as the depth map.
5. The method according to any of claims 1-4, wherein prior to said segmenting said RGB map, said method further comprises:
and aligning and normalizing the RGB map and the depth map.
6. The method according to any one of claims 1 to 5, wherein the segmenting the RGB map to obtain the segmentation map corresponding to the RGB map comprises:
performing inference on the RGB map by using a trained deep neural network model to obtain a mask map corresponding to the RGB map;
and performing connected region segmentation on the mask map to obtain the segmentation map.
7. The method according to any of claims 1-6, wherein the depth map is a depth map acquired by a depth camera or a monocular estimated depth map;
wherein the depth camera comprises: any one of a structured light depth camera, a time-of-flight depth camera, and a binocular depth camera.
8. A depth map optimization apparatus, characterized in that the apparatus comprises:
the acquisition module is used for acquiring a depth map and an RGB map corresponding to a real scene;
the segmentation module is used for segmenting the RGB map to obtain a segmentation map corresponding to the RGB map;
and the multi-information fusion module is used for determining the effective depth in the depth map according to the segmentation map, and fusing to obtain the optimized depth map corresponding to the real scene according to the effective depth in the depth map and the segmentation map.
9. An electronic device, comprising: a processor, a memory for storing the processor-executable instructions;
the processor is configured to, when executing the instructions, cause the electronic device to implement the method of any of claims 1-7.
10. A computer readable storage medium having stored thereon computer program instructions, characterized in that
the computer program instructions, when executed by an electronic device, cause the electronic device to implement the method of any of claims 1-7.
CN202110131227.XA 2021-01-30 2021-01-30 Depth map optimization method, device, equipment and storage medium Pending CN114842063A (en)
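As a purely illustrative aid, the following minimal Python/NumPy sketch walks through the flow recited in claims 1, 2, 4 and 6 above. It assumes that an "effective" depth is a non-zero sensor reading, that the binary mask inferred by the trained deep neural network is supplied by the caller, that connected-region segmentation is done with scipy.ndimage.label as a stand-in, and that pixels outside every object instance are left at depth 0; all function and variable names are hypothetical, and this is a sketch rather than the patented implementation.

```python
import numpy as np
from scipy import ndimage


def nearest_resize(labels: np.ndarray, out_h: int, out_w: int) -> np.ndarray:
    """Nearest-neighbour resampling that preserves integer instance labels."""
    in_h, in_w = labels.shape
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w
    return labels[rows[:, None], cols[None, :]]


def optimize_depth_map(depth_map: np.ndarray, mask: np.ndarray,
                       first_resolution: tuple) -> np.ndarray:
    """Sketch of claims 1/2/4/6: mask -> instance segmentation ->
    per-instance average of effective depths -> filled, upsampled depth map."""
    # Claim 6: connected-region segmentation of the mask map; each connected
    # region becomes one first object instance (labels 1..num_instances).
    segmentation_map, num_instances = ndimage.label(mask > 0)

    # Claim 4: scale the segmentation map to the same size as the depth map.
    seg_at_depth = nearest_resize(segmentation_map,
                                  depth_map.shape[0], depth_map.shape[1])

    # Claim 2: per-instance sum of effective depths and count of effective
    # pixels (a vectorised form of the pixel-by-pixel traversal; a depth of 0
    # is treated as a hole, i.e. not effective -- an assumption made here).
    effective = depth_map > 0
    depth_sum = np.bincount(seg_at_depth[effective],
                            weights=depth_map[effective].astype(np.float64),
                            minlength=num_instances + 1)
    pixel_count = np.bincount(seg_at_depth[effective],
                              minlength=num_instances + 1)

    # Average depth of each first object instance; the background stays at 0.
    average_depth = np.zeros(num_instances + 1, dtype=np.float32)
    covered = pixel_count > 0
    average_depth[covered] = depth_sum[covered] / pixel_count[covered]
    average_depth[0] = 0.0

    # Claim 2 (fusion): upsample the segmentation map to the first resolution
    # and fill every pixel of each instance with that instance's average depth.
    out_h, out_w = first_resolution
    seg_at_first_res = nearest_resize(segmentation_map, out_h, out_w)
    return average_depth[seg_at_first_res]
```

A caller might invoke it as optimize_depth_map(depth, mask, rgb.shape[:2]) to obtain the optimized depth map at the RGB resolution; the claims leave the treatment of background pixels open, so the zero-fill used here is only one possible choice.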

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110131227.XA CN114842063A (en) 2021-01-30 2021-01-30 Depth map optimization method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110131227.XA CN114842063A (en) 2021-01-30 2021-01-30 Depth map optimization method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114842063A (en) 2022-08-02

Family

ID=82560867

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110131227.XA Pending CN114842063A (en) 2021-01-30 2021-01-30 Depth map optimization method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114842063A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination