CN113325437A - Image generation method and device - Google Patents

Image generation method and device

Info

Publication number
CN113325437A
Authority
CN
China
Prior art keywords
pixel point
ith pixel
light
amplitude
map
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010132844.7A
Other languages
Chinese (zh)
Inventor
唐梦研
罗洪鹍
周开城
王艳秋
陈伟杰
杨兵
冯晓刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN202010132844.7A priority Critical patent/CN113325437A/en
Publication of CN113325437A publication Critical patent/CN113325437A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S17/00 Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
    • G01S17/88 Lidar systems specially adapted for specific applications
    • G01S17/89 Lidar systems specially adapted for specific applications for mapping or imaging
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S17/00 Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
    • G01S17/02 Systems using the reflection of electromagnetic waves other than radio waves
    • G01S17/06 Systems determining position data of a target
    • G01S17/08 Systems determining position data of a target for measuring distance only
    • G01S17/10 Systems determining position data of a target for measuring distance only using transmission of interrupted, pulse-modulated waves
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S7/00 Details of systems according to groups G01S13/00, G01S15/00, G01S17/00
    • G01S7/48 Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S17/00
    • G01S7/483 Details of pulse systems
    • G01S7/484 Transmitters
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/50 Depth or shape recovery
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/50 Depth or shape recovery
    • G06T7/521 Depth or shape recovery from laser ranging, e.g. using interferometry; from the projection of structured light

Abstract

The application discloses an image generation method and device, relates to the field of image processing, and solves the problem of how to acquire high-precision depth information under the influence of the multipath effect. The image generation method is applied to a terminal device. The terminal device includes a first projector and a second projector. The method includes the following steps: the terminal device receives a control instruction; in response to the control instruction, it controls the first projector to emit first emitted light toward a target object and the second projector to emit second emitted light toward the target object; it receives first reflected light, the first reflected light including light of the first emitted light reflected by the target object; it receives second reflected light, the second reflected light including light of the second emitted light reflected by the target object; it generates a first amplitude map from the first reflected light and a second amplitude map from the second reflected light; and it obtains a target depth map from the first amplitude map and the second amplitude map, where the target depth map represents the depth information of the target object.

Description

Image generation method and device
Technical Field
The present application relates to the field of image processing, and in particular, to an image generation method and apparatus.
Background
At present, in fields such as machine vision, unmanned driving, augmented reality (AR)/virtual reality (VR) interaction, and three-dimensional (3D) imaging, a depth camera may be used to collect depth information and then identify an object or build a model from that information. In general, a time-of-flight (TOF) camera may be employed to acquire depth information. A TOF camera is a camera that acquires depth information using TOF techniques. As shown in FIG. 1, a TOF camera includes a light source 101, a TOF sensor (e.g., a photodetector) 102, and a processor 103. The light source 101 continuously emits infrared light pulses (e.g., pulses with a wavelength of 940 nm) toward the target object, the sensor 102 receives the infrared light pulses reflected by the target object, and the processor 103 calculates the distance to the target object from the round-trip flight time of the infrared light pulses, thereby generating depth information.
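For illustration only, and not as part of the claimed solution, the following minimal Python sketch shows the ranging relation just described: the distance to the target object is half the round-trip flight time multiplied by the speed of light. The function and variable names are illustrative assumptions.

```python
# Minimal illustration of the TOF ranging principle described above; names are
# illustrative and not part of the claimed solution.

C = 299_792_458.0  # speed of light in m/s


def distance_from_round_trip(round_trip_time_s: float) -> float:
    """Distance to the target object from the round-trip flight time of the light pulse."""
    return C * round_trip_time_s / 2.0


# Usage: a round-trip time of 10 ns corresponds to a target about 1.5 m away.
print(distance_from_round_trip(10e-9))
```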
However, the calculated depth information deviates from the actual depth information because of multipath effects during measurement. For example, if the target object is a wall corner, the light may be reflected multiple times. As shown in FIG. 2, the light source 101 continuously emits emitted light 3 toward the corner, and the sensor 102 receives not only reflected light 4, which is reflected once by the corner, but also reflected light 5, reflected light 6, and reflected light 7, which are reflected multiple times within the scene. Since the light received by the sensor 102 includes the singly reflected light, the multiply reflected light, and the ambient light, the depth information determined from the received light deviates from the actual depth information. FIG. 3 compares the corner depth measured by a TOF camera with the actual corner depth: under the influence of the multipath effect, the 90-degree corner appears to extend rearward and its edge is blunted. Therefore, how to obtain high-precision depth information under the influence of the multipath effect is a problem that urgently needs to be solved.
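The deviation can be visualized with a toy numerical example (all values invented for illustration): the signal measured at one pixel is the vector sum of the singly reflected light, the multiply reflected light, and the ambient light, so the measured phase, and hence the computed depth, is biased.

```python
import numpy as np

# Toy numbers, invented for illustration: the vector measured at one pixel is the
# sum of the singly reflected light, the multiply reflected light, and the
# ambient light, so the measured phase (and therefore the depth) is biased.
direct = 1.00 * np.exp(1j * 0.80)      # once-reflected component, true phase 0.80 rad
multipath = 0.35 * np.exp(1j * 1.60)   # components reflected several times (longer path)
ambient = 0.10                         # ambient light adds a roughly constant offset

measured = direct + multipath + ambient
print(np.angle(direct), np.angle(measured))  # measured phase is pulled away from 0.80 rad
```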
Disclosure of Invention
The image generation method and device provided by this application solve the problem of how to acquire high-precision depth information under the influence of the multipath effect.
To this end, the following technical solutions are adopted:
In a first aspect, the present application provides a TOF depth sensing module including a first projector, a second projector, and a photosensitive unit. The first projector is configured to emit first emitted light toward a target object; the second projector is configured to emit second emitted light toward the target object, the form of the second emitted light being different from that of the first emitted light; the photosensitive unit is configured to receive first reflected light, the first reflected light including light of the first emitted light reflected by the target object; the photosensitive unit is further configured to receive second reflected light, the second reflected light including light of the second emitted light reflected by the target object; and the photosensitive unit is further configured to generate a first amplitude map from the first reflected light and a second amplitude map from the second reflected light.
At the same light intensity, the sparser the emitted light, the less the depth information is affected by the multipath effect; the denser the emitted light, the more the depth information is affected by it. Therefore, in the embodiments of this application, two kinds of projectors are provided in the TOF depth sensing module. By emitting two different kinds of non-planar light toward the target object, the multipath effect is weakened at its source and the accuracy of the target depth map is improved.
Wherein the first projector includes a light emitting diode and a light blocking grating, and the second projector includes a vertical cavity surface emitting laser, a collimating lens, and a diffraction grating.
In one possible design, the first emitted light and the second emitted light are both non-planar light. The projection of the first emitted light is different from the projection of the second emitted light. The number of bright areas projected by the first emitted light is equal to the number of bright areas projected by the second emitted light; alternatively, the number of bright areas projected by the first emitted light is greater than the number of bright areas projected by the second emitted light. The bright areas projected by the first emitted light are complementary to the bright areas projected by the second emitted light.
In another possible design, the first emitted light is non-planar light and the second emitted light is planar (flood) light.
In a second aspect, the present application provides an image generation method applicable to a terminal device including a first projector and a second projector. The method includes the following steps: receiving a control instruction; in response to the control instruction, controlling the first projector to emit first emitted light toward a target object and the second projector to emit second emitted light toward the target object, where the form of the second emitted light is different from that of the first emitted light; receiving first reflected light, the first reflected light including light of the first emitted light reflected by the target object; receiving second reflected light, the second reflected light including light of the second emitted light reflected by the target object; generating a first amplitude map from the first reflected light and a second amplitude map from the second reflected light; and obtaining a target depth map from the first amplitude map and the second amplitude map, where the target depth map represents the depth information of the target object.
Wherein the first projector includes a light emitting diode and a light blocking grating, and the second projector includes a vertical cavity surface emitting laser, a collimating lens, and a diffraction grating.
In one possible design, the first emitted light and the second emitted light are both non-planar light. The projection of the first emitted light is different from the projection of the second emitted light. The number of bright areas projected by the first emitted light is equal to the number of bright areas projected by the second emitted light; alternatively, the number of bright areas projected by the first emitted light is greater than the number of bright areas projected by the second emitted light. The bright areas projected by the first emitted light are complementary to the bright areas projected by the second emitted light.
In another possible design, the first emitted light is non-planar light and the second emitted light is planar (flood) light.
With reference to the second aspect, in certain implementations of the second aspect, generating a first amplitude map from the first reflected light and generating a second amplitude map from the second reflected light includes: determining first amplitude information of the ith pixel point from the first reflected light of the ith pixel point to obtain the first amplitude map, where the first amplitude information of the ith pixel point includes a first amplitude and a first phase difference of the first reflected light of the ith pixel point, i is an integer, i ∈ [1, N], and the first amplitude map represents the first amplitude information of N pixel points; and determining second amplitude information of the ith pixel point from the second reflected light of the ith pixel point to obtain the second amplitude map, where the second amplitude information of the ith pixel point includes a second amplitude and a second phase difference of the second reflected light of the ith pixel point, and the second amplitude map represents the second amplitude information of the N pixel points.
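As an illustrative sketch only: the text above does not prescribe how the amplitude and phase difference are obtained, but with a common four-phase sampling scheme (an assumption here, not the claimed method) the per-pixel amplitude information could be computed as follows.

```python
import numpy as np


def amplitude_map(c0, c1, c2, c3):
    """Per-pixel amplitude and phase difference from four correlation samples
    taken at 0, 90, 180 and 270 degrees (an assumed sampling scheme).

    c0..c3 are H x W arrays; the returned pair of H x W arrays together plays
    the role of the amplitude map of N = H * W pixel points described above."""
    in_phase = c0 - c2
    quadrature = c3 - c1
    amplitude = 0.5 * np.sqrt(in_phase ** 2 + quadrature ** 2)
    phase = np.arctan2(quadrature, in_phase) % (2.0 * np.pi)  # phase difference in [0, 2*pi)
    return amplitude, phase
```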
With reference to the second aspect, in some implementations of the second aspect, obtaining the target depth map from the first amplitude map and the second amplitude map includes: correcting the first amplitude map to obtain a first corrected amplitude map, and correcting the second amplitude map to obtain a second corrected amplitude map; converting the first corrected amplitude map into a first depth map, and converting the second corrected amplitude map into a second depth map; and fusing the first depth map and the second depth map to obtain the target depth map.
With reference to the second aspect, in some implementations of the second aspect, correcting the first amplitude map to obtain a first corrected amplitude map and correcting the second amplitude map to obtain a second corrected amplitude map includes: correcting the first amplitude information of the ith pixel point according to first error amplitude information of the ith pixel point, and determining first corrected amplitude information of the ith pixel point to obtain the first corrected amplitude map, where the first error amplitude information is the amplitude information corresponding to multiply reflected first light and ambient light; and correcting the second amplitude information of the ith pixel point according to second error amplitude information of the ith pixel point, and determining second corrected amplitude information of the ith pixel point to obtain the second corrected amplitude map, where the second error amplitude information is the amplitude information corresponding to multiply reflected second light and ambient light.
In the embodiments of this application, the amplitude information of a pixel point is corrected using the error amplitude information of that pixel point, and a high-precision depth value of the pixel point can be obtained from the corrected phase difference of the pixel point. The errors caused by the multipath effect and by ambient light at the pixel point are therefore effectively reduced, and the accuracy of the target depth map is improved.
In one possible design, correcting the first amplitude information of the ith pixel point according to the first error amplitude information of the ith pixel point includes: correcting the first amplitude information of the ith pixel point according to the first error amplitude information within a neighborhood window of the ith pixel point, where the neighborhood window is centered on the ith pixel point. Correcting the second amplitude information of the ith pixel point according to the second error amplitude information of the ith pixel point includes: correcting the second amplitude information of the ith pixel point according to the second error amplitude information within the neighborhood window of the ith pixel point.
Specifically, correcting the first amplitude information of the ith pixel point according to the first error amplitude information within the neighborhood window of the ith pixel point includes: determining K pixel points according to the first amplitudes of the pixel points in the neighborhood window, where the K pixel points are the pixel points with the K smallest first amplitudes in the neighborhood window and K is an integer; determining the first error amplitude information of the ith pixel point from the first amplitude information of the K pixel points; and determining the first corrected amplitude information of the ith pixel point from the first amplitude information of the ith pixel point and the first error amplitude information of the ith pixel point. Correcting the second amplitude information of the ith pixel point according to the second error amplitude information within the neighborhood window of the ith pixel point includes: determining K pixel points according to the second amplitudes of the pixel points in the neighborhood window, where the K pixel points are the pixel points with the K smallest second amplitudes in the neighborhood window; determining the second error amplitude information of the ith pixel point from the second amplitude information of the K pixel points; and determining the second corrected amplitude information of the ith pixel point from the second amplitude information of the ith pixel point and the second error amplitude information of the ith pixel point.
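A minimal sketch of the neighborhood-window correction just described, under two assumptions that the text above does not fix: the amplitude information of a pixel point is represented as a vector (a complex number), and the error amplitude information is taken as the mean of the K lowest-amplitude vectors in the window and subtracted. All names and the averaging rule are illustrative.

```python
import numpy as np


def correct_pixel(amp, phase, window_amp, window_phase, k):
    """Correct one pixel's amplitude information using the K pixel points with
    the smallest amplitudes in its neighborhood window.

    amp, phase:                amplitude information of the i-th pixel point
    window_amp, window_phase:  flattened amplitude information of the window
    The error vector is assumed to be the mean of the K selected vectors and is
    subtracted; the exact combination rule is not fixed by the text above."""
    vec = amp * np.exp(1j * phase)
    window_vec = window_amp * np.exp(1j * window_phase)

    # The K lowest-amplitude pixel points: their signal is taken as being
    # dominated by multiply reflected light and ambient light.
    idx = np.argsort(window_amp)[:k]
    error_vec = window_vec[idx].mean()

    corrected = vec - error_vec
    return np.abs(corrected), np.angle(corrected) % (2.0 * np.pi)
```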
With reference to the second aspect, in some implementations of the second aspect, converting the first corrected amplitude map into a first depth map and converting the second corrected amplitude map into a second depth map includes: determining a first depth value of the ith pixel point from the first corrected phase difference contained in the first corrected amplitude information of the ith pixel point to obtain the first depth map, where the first depth map represents first depth information of the target object; and determining a second depth value of the ith pixel point from the second corrected phase difference contained in the second corrected amplitude information of the ith pixel point to obtain the second depth map, where the second depth map represents second depth information of the target object.
Specifically, determining the first depth value of the ith pixel point from the first corrected phase difference contained in the first corrected amplitude information of the ith pixel point includes: determining the first depth value of the ith pixel point from the first corrected phase difference of the ith pixel point, the speed of light, and the frequency of the first emitted light. Determining the second depth value of the ith pixel point from the second corrected phase difference contained in the second corrected amplitude information of the ith pixel point includes: determining the second depth value of the ith pixel point from the second corrected phase difference of the ith pixel point, the speed of light, and the frequency of the second emitted light.
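For illustration, the standard indirect-TOF relation between a phase difference, the speed of light c, and the modulation frequency f is d = c * phi / (4 * pi * f), valid within one unambiguous range; a minimal sketch (names are illustrative):

```python
import math

C = 299_792_458.0  # speed of light in m/s


def depth_from_phase(corrected_phase_rad: float, modulation_freq_hz: float) -> float:
    """Depth value of a pixel point from its corrected phase difference, the speed
    of light, and the (assumed) modulation frequency of the emitted light, using
    the standard indirect-TOF relation d = c * phi / (4 * pi * f)."""
    return C * corrected_phase_rad / (4.0 * math.pi * modulation_freq_hz)
```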
With reference to the second aspect, in some implementations of the second aspect, fusing the first depth map and the second depth map to obtain the target depth map includes: and determining a target depth value of the ith pixel point in the target depth map according to the first depth value of the ith pixel point in the first depth map and the second depth value of the ith pixel point in the second depth map so as to obtain the target depth map.
Specifically, determining the target depth value of the ith pixel point in the target depth map according to the first depth value of the ith pixel point in the first depth map and the second depth value of the ith pixel point in the second depth map includes: determining a first confidence coefficient according to a first depth value of an ith pixel point in the first depth map; determining a second confidence coefficient according to a second depth value of the ith pixel point in the second depth map; and comparing the first confidence coefficient and the second confidence coefficient of the ith pixel point, and determining the depth value of the ith pixel point corresponding to the maximum confidence coefficient as the target depth value of the ith pixel point in the target depth map.
Because the confidence and the accuracy of the depth values are high in the bright-spot areas of the speckle-array light, and the union of the bright areas of the two light sources can essentially cover the entire field of view, the fused depth map is highly complete and its depth information is highly accurate. The completeness of the target depth map is therefore effectively improved: the larger the union of the bright areas of the first emitted light and the second emitted light, the higher the completeness of the target depth map.
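A minimal sketch of the per-pixel fusion step, assuming the confidences are already available as per-pixel arrays (how they are derived from the depth values is not detailed above):

```python
import numpy as np


def fuse_depth_maps(depth1, conf1, depth2, conf2):
    """Per-pixel fusion of the first and second depth maps: for each pixel point,
    the depth value with the larger confidence becomes the target depth value.
    The confidences are taken here as given H x W arrays."""
    return np.where(conf1 >= conf2, depth1, depth2)
```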
In a third aspect, the present application provides an image processing apparatus applied to a terminal device, where the terminal device includes a first projector and a second projector. The image processing apparatus includes: a receiving unit configured to receive a control instruction; a response unit configured to, in response to the control instruction, control the first projector to emit first emitted light toward a target object and control the second projector to emit second emitted light toward the target object, where the form of the second emitted light is different from that of the first emitted light; a photosensitive unit configured to receive first reflected light, the first reflected light including light of the first emitted light reflected by the target object, and further configured to receive second reflected light, the second reflected light including light of the second emitted light reflected by the target object; a generating unit configured to generate a first amplitude map from the first reflected light and a second amplitude map from the second reflected light; and a conversion unit configured to obtain a target depth map from the first amplitude map and the second amplitude map, where the target depth map represents the depth information of the target object.
Wherein the first projector includes a light emitting diode and a light blocking grating, and the second projector includes a vertical cavity surface emitting laser, a collimating lens, and a diffraction grating.
In one possible design, the first emitted light and the second emitted light are both non-planar light. The projection of the first emitted light is different from the projection of the second emitted light. The number of bright areas projected by the first emitted light is equal to the number of bright areas projected by the second emitted light; alternatively, the number of bright areas projected by the first emitted light is greater than the number of bright areas projected by the second emitted light. The bright areas projected by the first emitted light are complementary to the bright areas projected by the second emitted light.
In another possible design, the first emitted light is non-planar light and the second emitted light is planar (flood) light.
With reference to the third aspect, in some implementations of the third aspect, the generating unit is specifically configured to: determine first amplitude information of the ith pixel point from the first reflected light of the ith pixel point to obtain the first amplitude map, where the first amplitude information of the ith pixel point includes a first amplitude and a first phase difference of the first reflected light of the ith pixel point, i is an integer, i ∈ [1, N], and the first amplitude map represents the first amplitude information of N pixel points; and determine second amplitude information of the ith pixel point from the second reflected light of the ith pixel point to obtain the second amplitude map, where the second amplitude information of the ith pixel point includes a second amplitude and a second phase difference of the second reflected light of the ith pixel point, and the second amplitude map represents the second amplitude information of the N pixel points.
With reference to the third aspect, in some implementations of the third aspect, the conversion unit is specifically configured to: correct the first amplitude map to obtain a first corrected amplitude map, and correct the second amplitude map to obtain a second corrected amplitude map; convert the first corrected amplitude map into a first depth map, and convert the second corrected amplitude map into a second depth map; and fuse the first depth map and the second depth map to obtain the target depth map.
With reference to the third aspect, in some implementations of the third aspect, the conversion unit is specifically configured to: correct the first amplitude information of the ith pixel point according to first error amplitude information of the ith pixel point, and determine first corrected amplitude information of the ith pixel point to obtain the first corrected amplitude map, where the first error amplitude information is the amplitude information corresponding to multiply reflected first light and ambient light; and correct the second amplitude information of the ith pixel point according to second error amplitude information of the ith pixel point, and determine second corrected amplitude information of the ith pixel point to obtain the second corrected amplitude map, where the second error amplitude information is the amplitude information corresponding to multiply reflected second light and ambient light.
In one possible design, the conversion unit is specifically configured to: correcting the first amplitude information of the ith pixel point according to the first error amplitude information in a neighborhood window of the ith pixel point, wherein the neighborhood window takes the ith pixel point as the center; and correcting the second amplitude information of the ith pixel point according to the second error amplitude information in the neighborhood window of the ith pixel point.
Specifically, the conversion unit is specifically configured to: determine K pixel points according to the first amplitudes of the pixel points in the neighborhood window, where the K pixel points are the pixel points with the K smallest first amplitudes in the neighborhood window and K is an integer; determine the first error amplitude information of the ith pixel point from the first amplitude information of the K pixel points; determine the first corrected amplitude information of the ith pixel point from the first amplitude information of the ith pixel point and the first error amplitude information of the ith pixel point; determine K pixel points according to the second amplitudes of the pixel points in the neighborhood window, where the K pixel points are the pixel points with the K smallest second amplitudes in the neighborhood window; determine the second error amplitude information of the ith pixel point from the second amplitude information of the K pixel points; and determine the second corrected amplitude information of the ith pixel point from the second amplitude information of the ith pixel point and the second error amplitude information of the ith pixel point.
With reference to the third aspect, in some implementations of the third aspect, the conversion unit is specifically configured to: determining a first depth value of the ith pixel point according to a first corrected phase difference contained in the first corrected amplitude information of the ith pixel point to obtain a first depth map, wherein the first depth map is used for representing first depth information of a target object; and determining a second depth value of the ith pixel point according to a second corrected phase difference contained in the second corrected amplitude information of the ith pixel point to obtain a second depth map, wherein the second depth map is used for representing second depth information of the target object.
Specifically, the conversion unit is specifically configured to: determine the first depth value of the ith pixel point from the first corrected phase difference of the ith pixel point, the speed of light, and the frequency of the first emitted light; and determine the second depth value of the ith pixel point from the second corrected phase difference of the ith pixel point, the speed of light, and the frequency of the second emitted light.
With reference to the third aspect, in some implementations of the third aspect, the conversion unit is specifically configured to: and determining a target depth value of the ith pixel point in the target depth map according to the first depth value of the ith pixel point in the first depth map and the second depth value of the ith pixel point in the second depth map so as to obtain the target depth map.
Specifically, the conversion unit is specifically configured to: determining a first confidence coefficient according to a first depth value of an ith pixel point in the first depth map; determining a second confidence coefficient according to a second depth value of the ith pixel point in the second depth map; and comparing the first confidence coefficient and the second confidence coefficient of the ith pixel point, and determining the depth value of the ith pixel point corresponding to the maximum confidence coefficient as the target depth value of the ith pixel point in the target depth map.
In a fourth aspect, the present application provides a terminal device, including: at least one processor, a memory, a sensor, a first projector, and a second projector, where the first projector and the second projector are configured to emit light of different forms, the sensor is configured to image based on the emitted light of the different forms, the memory is configured to store a computer program and instructions, and the processor is configured to invoke the computer program and instructions to cooperate with the sensor, the first projector, and the second projector in performing the image generation method described in the second aspect.
Drawings
Fig. 1 is a schematic structural diagram of a TOF camera according to an embodiment of the present disclosure;
FIG. 2 is a schematic diagram illustrating the multipath effect provided by an embodiment of the present application;
FIG. 3 is a graph illustrating the result of multipath effect provided by an embodiment of the present application;
fig. 4 is a schematic structural diagram of a terminal device according to an embodiment of the present application;
fig. 5 is a block diagram of a software structure of a terminal device according to an embodiment of the present application;
FIG. 6 is a flowchart of an image generation method according to an embodiment of the present application;
FIG. 7 is a schematic view of an embodiment of the present application;
fig. 8 is a schematic structural diagram of a light source according to an embodiment of the present application;
fig. 9 is a schematic structural diagram of a light source according to an embodiment of the present application;
FIG. 10 is a schematic diagram of a vector representing magnitude information according to an embodiment of the present application;
FIG. 11 is a diagram illustrating a neighborhood window according to an embodiment of the present application;
FIG. 12 is a flowchart of an image generation method according to an embodiment of the present application;
FIG. 13 is a diagram illustrating a neighborhood window in an amplitude map according to an embodiment of the present application;
FIG. 14 is a diagram illustrating a vector representation of error magnitude information according to an embodiment of the present application;
FIG. 15 is a schematic diagram of a neighborhood window sliding in a magnitude graph according to an embodiment of the present application;
fig. 16 is a schematic structural diagram of a TOF depth sensing module according to an embodiment of the present disclosure;
FIG. 17 is a flowchart of an image generation method according to an embodiment of the present application;
FIG. 18 is a schematic diagram of the first and second light emitters according to an embodiment of the present disclosure;
FIG. 19 is a schematic diagram of the first and second light emitters according to an embodiment of the present disclosure;
FIG. 20 is a flowchart of an image generation method provided in an embodiment of the present application;
fig. 21 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present application.
Detailed Description
The terms "first," "second," and "third," etc. in the description and claims of this application and the above-described drawings are used for distinguishing between different objects and not for limiting a particular order.
In the embodiments of the present application, words such as "exemplary" or "for example" are used to mean serving as an example, instance, or illustration. Any embodiment or design described herein as "exemplary" or "for example" is not necessarily to be construed as preferred or advantageous over other embodiments or designs. Rather, the use of the words "exemplary" or "for example" is intended to present related concepts in a concrete manner.
For clarity and conciseness of the following descriptions of the various embodiments, a brief introduction to the related art is first given:
the depth information may refer to the distance of the target object from the acquisition device (e.g., sensor), i.e., the depth of the scene. The depth information may be represented by a depth map.
In recent years, terminal devices have been equipped with various sensors to enrich their functions and provide a better user experience. For example, a TOF sensor in a terminal device may be used to acquire depth information and perform 3D imaging with that information, so as to implement a 3D face unlocking function. For another example, a camera of the terminal device may use a TOF sensor to assist focusing, and use depth information to determine the depth of field during photographing, so as to implement background blurring and the like.
The terminal device may be a smartphone, a tablet, a wearable device, an AR/VR device, etc. This application does not limit the specific form of the terminal device. A wearable device, also called a wearable smart device, is a general term for devices that can be worn daily and are designed and developed by applying wearable technology, such as glasses, gloves, watches, clothing, and shoes. A wearable device is a portable device that is worn directly on the body or integrated into the user's clothing or accessories. A wearable device is not merely a hardware device; it realizes powerful functions through software support, data interaction, and cloud interaction. In a broad sense, wearable smart devices include full-featured, large-sized devices that can implement all or part of their functions without relying on a smartphone, such as smart watches or smart glasses, as well as devices that focus on only a certain type of application function and need to be used together with other devices such as smartphones, for example various smart bands for physical sign monitoring, smart jewelry, and the like.
Next, as shown in fig. 4, a schematic structural diagram of a terminal device provided in an embodiment of the present application is shown. The terminal device 400 may include a processor 410, an external memory interface 420, an internal memory 421, a Universal Serial Bus (USB) interface 430, a charging management module 440, a power management module 441, a battery 442, an antenna 1, an antenna 2, a mobile communication module 450, a wireless communication module 460, an audio module 470, a speaker 470A, a receiver 470B, a microphone 470C, an earphone interface 470D, a sensor module 480, a key 490, a motor 491, an indicator 492, a camera 493, a display 494, a Subscriber Identification Module (SIM) card interface 495, and the like. The sensor module 480 may include a pressure sensor 480A, a gyroscope sensor 480B, an air pressure sensor 480C, a magnetic sensor 480D, an acceleration sensor 480E, a distance sensor 480F, a proximity light sensor 480G, a fingerprint sensor 480H, a temperature sensor 480J, a touch sensor 480K, an ambient light sensor 480L, a bone conduction sensor 480M, and the like.
It is to be understood that the illustrated structure of the present embodiment does not constitute a specific limitation to the terminal device 400. In other embodiments, terminal device 400 may include more or fewer components than shown, or combine certain components, or split certain components, or a different arrangement of components. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
Processor 410 may include one or more processing units, such as: the processor 410 may include an Application Processor (AP), a modem processor, a Graphics Processing Unit (GPU), an Image Signal Processor (ISP), a controller, a video codec, a Digital Signal Processor (DSP), a baseband processor, and/or a neural-Network Processing Unit (NPU), among others. The different processing units may be separate devices or may be integrated into one or more processors. For example, in the present application, the processor 410 may determine amplitude information of a pixel point in the amplitude map according to the reflected light, correct the amplitude information of the pixel point according to error amplitude information of the pixel point, and determine a depth value of the pixel point according to a corrected phase difference included in the corrected amplitude information of the pixel point, so as to obtain the target depth map.
The controller may be, among other things, the neural center and the command center of the terminal device 400. The controller can generate an operation control signal according to the instruction operation code and the timing signal to complete the control of instruction fetching and instruction execution.
A memory may also be provided in processor 410 for storing instructions and data. In some embodiments, the memory in the processor 410 is a cache memory. The memory may hold instructions or data that have just been used or recycled by the processor 410. If the processor 410 needs to use the instruction or data again, it can be called directly from the memory. Avoiding repeated accesses reduces the latency of the processor 410, thereby increasing the efficiency of the system.
In some embodiments, processor 410 may include one or more interfaces. The interface may include an integrated circuit (I2C) interface, an integrated circuit built-in audio (I2S) interface, a Pulse Code Modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a Mobile Industry Processor Interface (MIPI), a general-purpose input/output (GPIO) interface, a Subscriber Identity Module (SIM) interface, and/or a Universal Serial Bus (USB) interface, etc.
The MIPI interface may be used to connect the processor 410 with peripheral devices such as the display screen 494 and the camera 493. The MIPI interface includes a Camera Serial Interface (CSI), a Display Serial Interface (DSI), and the like. In some embodiments, processor 410 and camera 493 communicate via a CSI interface to implement the capture function of terminal device 400. The processor 410 and the display screen 494 communicate through the DSI interface to implement the display function of the terminal device 400.
The GPIO interface may be configured by software. The GPIO interface may be configured as a control signal and may also be configured as a data signal. In some embodiments, a GPIO interface may be used to connect processor 410 with camera 493, display screen 494, wireless communication module 460, audio module 470, sensor module 480, and the like. The GPIO interface may also be configured as an I2C interface, an I2S interface, a UART interface, a MIPI interface, and the like.
It should be understood that the interface connection relationship between the modules illustrated in the present embodiment is only an exemplary illustration, and does not constitute a limitation on the structure of the terminal device 400. In other embodiments of the present application, the terminal device 400 may also adopt different interface connection manners or a combination of multiple interface connection manners in the above embodiments.
The terminal device 400 implements a display function through the GPU, the display screen 494, and the application processor, etc. The GPU is an image processing microprocessor connected to a display screen 494 and an application processor. The GPU is used to perform mathematical and geometric calculations for graphics rendering. Processor 410 may include one or more GPUs that execute program instructions to generate or alter display information.
The display screen 494 is used to display images, videos, and the like. The display screen 494 includes a display panel. The display panel may be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Mini-LED, a Micro-LED, a Micro-OLED, a quantum dot light-emitting diode (QLED), or the like. In some embodiments, the terminal device 400 may include 1 or N display screens 494, N being a positive integer greater than 1.
A series of Graphical User Interfaces (GUIs) can be displayed on the display screen 494 of the terminal device 400, and these GUIs are the main screens of the terminal device 400. Generally, the size of the display screen 494 of the terminal device 400 is fixed, and only limited controls can be displayed in the display screen 494 of the terminal device 400. A control is a GUI element, which is a software component contained in an application program and controls all data processed by the application program and interactive operations related to the data, and a user can interact with the control through direct manipulation (direct manipulation) to read or edit information related to the application program. Generally, a control may include a visual interface element such as an icon, button, menu, tab, text box, dialog box, status bar, navigation bar, Widget, and the like. For example, in the present embodiment, the display screen 494 may display an image with blurred background.
The terminal device 400 may implement a shooting function through the ISP, the camera 493, the video codec, the GPU, the display screen 494, the application processor, and the like.
The ISP is used to process the data fed back by the camera 493. For example, when a photo is taken, the shutter is opened, light is transmitted to the camera photosensitive element through the lens, the optical signal is converted into an electrical signal, and the camera photosensitive element transmits the electrical signal to the ISP for processing and converting into an image visible to naked eyes. The ISP can also carry out algorithm optimization on the noise, brightness and skin color of the image. The ISP can also optimize parameters such as exposure, color temperature and the like of a shooting scene. In some embodiments, the ISP may be provided in camera 493.
The camera 493 is used to capture still images or video. The object generates an optical image through the lens and projects the optical image to the photosensitive element. The photosensitive element may be a Charge Coupled Device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor. The light sensing element converts the optical signal into an electrical signal, which is then passed to the ISP where it is converted into a digital image signal. And the ISP outputs the digital image signal to the DSP for processing. The DSP converts the digital image signal into image signal in standard RGB, YUV and other formats. In some embodiments, the terminal device 400 may include 1 or N cameras 493, where N is a positive integer greater than 1.
The NPU is a neural-network (NN) computing processor that processes input information quickly by using a biological neural network structure, for example, by using a transfer mode between neurons of a human brain, and can also learn by itself continuously. The NPU can implement applications such as intelligent recognition of the terminal device 400, for example: image recognition, face recognition, speech recognition, text understanding, and the like.
The internal memory 421 may be used to store computer-executable program code, including instructions. The processor 410 executes various functional applications of the terminal device 400 and data processing by executing instructions stored in the internal memory 421. For example, in this embodiment, the processor 410 may modify the magnitude information by executing instructions stored in the internal memory 421. The internal memory 421 may include a program storage area and a data storage area. The storage program area may store an operating system, an application program (such as a sound playing function, an image playing function, etc.) required by at least one function, and the like. The storage data area may store data (such as audio data, a phonebook, etc.) created during use of the terminal device 400, and the like. In addition, the internal memory 421 may include a high-speed random access memory, and may further include a nonvolatile memory, such as at least one magnetic disk storage device, a flash memory device, a universal flash memory (UFS), and the like. The processor 410 executes various functional applications of the terminal device 400 and data processing by executing instructions stored in the internal memory 421 and/or instructions stored in a memory provided in the processor.
A distance sensor 480F for measuring distance. The terminal device 400 may measure the distance by infrared or laser. In some embodiments, shooting a scene, the terminal device 400 may utilize the distance sensor 480F to range for fast focus. For example, the distance sensor 480F may be a TOF sensor.
The proximity light sensor 480G may include, for example, a light emitting diode (LED) and a light detector, such as a photodiode. The light emitting diode may be an infrared light emitting diode. The terminal device 400 emits infrared light outward through the light emitting diode and uses the photodiode to detect infrared light reflected from a nearby object. When sufficient reflected light is detected, it can be determined that there is an object near the terminal device 400; when insufficient reflected light is detected, the terminal device 400 can determine that there is no object nearby. The terminal device 400 can use the proximity light sensor 480G to detect that the user is holding the terminal device 400 close to the ear for a call, so as to automatically turn off the screen and save power. The proximity light sensor 480G may also be used in holster mode and pocket mode to automatically unlock and lock the screen.
The ambient light sensor 480L is used to sense the ambient light level. The terminal device 400 may adaptively adjust the brightness of the display screen 494 based on the perceived ambient light level. The ambient light sensor 480L may also be used to automatically adjust the white balance when taking a picture. The ambient light sensor 480L may also cooperate with the proximity light sensor 480G to detect whether the terminal device 400 is in a pocket to prevent accidental touches.
In addition, an operating system runs on the above components, for example, the iOS operating system developed by Apple, the open-source Android operating system developed by Google, or the Windows operating system developed by Microsoft. Applications can be installed and run on the operating system.
The operating system of the terminal device 400 may employ a layered architecture, an event-driven architecture, a micro-kernel architecture, a micro-service architecture, or a cloud architecture. The embodiment of the present application exemplifies a software structure of the terminal device 400 by taking an Android system with a layered architecture as an example.
Fig. 5 is a block diagram of a software configuration of the terminal device 400 according to the embodiment of the present application.
The layered architecture divides the software into several layers, each layer having a clear role and division of labor. The layers communicate with each other through a software interface. In some embodiments, the Android system is divided into four layers, an application layer, an application framework layer, an Android runtime (Android runtime) and system library, and a kernel layer from top to bottom.
The application layer may include a series of application packages. As shown in fig. 5, the application package may include applications such as camera, gallery, calendar, phone call, map, navigation, WLAN, bluetooth, music, video, short message, etc. For example, when taking a picture, a camera application may access a camera interface management service provided by the application framework layer.
The application framework layer provides an Application Programming Interface (API) and a programming framework for the application program of the application layer. The application framework layer includes a number of predefined functions. As shown in FIG. 5, the application framework layer may include a window manager, content provider, view system, phone manager, resource manager, notification manager, camera manager, and the like. For example, in the embodiment of the present application, when taking a picture, the application framework layer may provide an API related to a camera function for the application layer, and provide a camera interface management service for the application layer, so as to implement the picture taking function.
The window manager is used to manage window programs. The window manager can obtain the size of the display screen, determine whether there is a status bar, lock the screen, take screenshots, and the like.
The content provider is used to store and retrieve data and make it accessible to applications. The data may include video, images, audio, calls made and received, browsing history and bookmarks, phone books, etc.
The view system includes visual controls such as controls to display text, controls to display pictures, and the like. The view system may be used to build applications. The display interface may be composed of one or more views. For example, the display interface including the short message notification icon may include a view for displaying text and a view for displaying pictures.
The phone manager is used to provide the communication function of the terminal device 400. Such as management of call status (including on, off, etc.).
The resource manager provides various resources for the application, such as localized strings, icons, pictures, layout files, video files, and the like.
The notification manager enables an application to display notification information in the status bar and can be used to convey notification-type messages, which can disappear automatically after a short stay without user interaction. For example, the notification manager is used to notify that a download is complete, to give message reminders, and so on. The notification manager may also present notifications in the top status bar of the system in the form of a chart or scroll-bar text, such as notifications from applications running in the background, or present notifications on the screen in the form of a dialog window. For example, text information is prompted in the status bar, a prompt tone is played, the terminal device vibrates, or an indicator light blinks.
The Android Runtime comprises a core library and a virtual machine. The Android runtime is responsible for scheduling and managing an Android system.
The core library comprises two parts: one part is the function libraries that the Java language needs to call, and the other part is the Android core library.
The application layer and the application framework layer run in the virtual machine. The virtual machine executes the Java files of the application layer and the application framework layer as binary files. The virtual machine is used to perform functions such as object lifecycle management, stack management, thread management, security and exception management, and garbage collection.
The system library may include a plurality of functional modules. For example: surface managers (surface managers), Media Libraries (Media Libraries), three-dimensional graphics processing Libraries (e.g., OpenGL ES), 2D graphics engines (e.g., SGL), and the like.
The surface manager is used to manage the display subsystem and provide fusion of 2D and 3D layers for multiple applications.
The media library supports playback and recording of a variety of commonly used audio and video formats, as well as still image files and the like. The media library may support a variety of audio and video encoding formats, such as MPEG-4, H.264, MP3, AAC, AMR, JPG, and PNG.
The three-dimensional graphic processing library is used for realizing three-dimensional graphic drawing, image rendering, synthesis, layer processing and the like.
The 2D graphics engine is a drawing engine for 2D drawing.
The kernel layer is a layer between hardware and software. The kernel layer contains at least a display driver, a camera driver, an audio driver, and a sensor driver.
Although the Android system is taken as an example for description in the embodiments of the present application, the basic principle is also applicable to terminal devices based on iOS, Windows, or other operating systems.
The following describes exemplary work flows of the software and hardware of the terminal device 400 in conjunction with a photographing scene.
The touch sensor 480K receives the touch operation and reports the touch operation to the processor 410, so that the processor starts the camera application in response to the touch operation and displays a user interface of the camera application on the display screen 494. For example, after receiving the touch operation on the camera icon, the touch sensor 480K reports the touch operation on the camera icon to the processor 410, so that the processor 410 starts a camera application corresponding to the camera icon in response to the touch operation, and displays a user interface of the camera on the display screen 494. In addition, in this embodiment of the application, the terminal may be enabled to start the camera in other ways, and a user interface of the camera is displayed on the display screen 494. For example, when a certain user interface is displayed after the terminal is blank, locked, or unlocked, the terminal may start the camera in response to a voice instruction of the user or a shortcut operation, and display the user interface of the camera on the display 494.
It is understood that, as shown in fig. 1, when the user touches the photo button (virtual button or physical button), the light source 101 emits an infrared light pulse, and the infrared light pulse encounters the target object and is reflected (e.g., diffused). The reflected light may be received by the TOF sensor 102. The TOF sensor 102 may obtain the round-trip time of flight of the infrared light pulse by detecting. The terminal device can calculate the distance to the target object according to the round-trip flight time obtained by the TOF sensor 102, so as to generate depth information and assist the terminal device in realizing corresponding functions.
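As a quick illustration of the distance calculation from the round-trip flight time, the following Python sketch applies the relation d = c · t / 2; the function name and units are illustrative and not taken from the embodiment.

```python
def distance_from_round_trip_time(t_round_trip_s: float) -> float:
    """Distance to the target object from the round-trip flight time of the light pulse."""
    SPEED_OF_LIGHT = 2.99792458e8  # m/s, propagation speed of light in vacuum
    # The pulse travels to the target object and back, so the one-way distance is half the path.
    return SPEED_OF_LIGHT * t_round_trip_s / 2.0

# Example: a round-trip flight time of 10 ns corresponds to a distance of about 1.5 m.
print(distance_from_round_trip_time(10e-9))
```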
In order to solve the problem of how to acquire high-precision depth information under the influence of a multipath effect, the embodiment of the application provides an image generation method. The method comprises the steps that the terminal equipment receives a control instruction and emits emitting light to a target object in response to the control instruction. After receiving the reflected light, the terminal device determines the amplitude information of the ith pixel point according to the reflected light of the ith pixel point, corrects the amplitude information of the ith pixel point according to the error amplitude information of the ith pixel point, and determines the depth value of the ith pixel point according to the corrected phase difference contained in the corrected amplitude information of the ith pixel point to obtain a target depth map, wherein the target depth map contains the depth information of the target object. The amplitude information of the ith pixel point comprises the amplitude of the reflected light of the ith pixel point and the phase difference of the reflected light. The error amplitude information is amplitude information corresponding to the reflected light and/or the ambient light after multiple reflections. i is an integer, i belongs to [1, N ], and N represents the number of pixel points contained in the amplitude diagram. Therefore, the amplitude information of the pixel point is corrected by utilizing the error amplitude information of the pixel point, and the high-precision depth value of the pixel point can be obtained according to the corrected phase difference of the pixel point, so that the multipath effect of the pixel point and the error caused by the ambient light can be effectively reduced, and the accuracy of the target depth map is improved.
Embodiments of the present application will be described in detail below with reference to the accompanying drawings.
Fig. 6 is a flowchart of an image generation method according to an embodiment of the present application. The terminal device may include a projector and a TOF sensor. The projector may also be referred to as a light source. As shown in fig. 6, the method may include:
s601, the terminal equipment receives a control instruction.
The control instruction may refer to an instruction related to an operation of the terminal device by the user. For example, when a user needs to take a picture, the user touches a camera icon, the terminal device responds to the touch operation, starts a camera application corresponding to the camera icon, displays a user interface of the camera, and after the user touches a shooting button (a virtual button or a physical button), the terminal device receives a control instruction.
Optionally, the control instruction may also be an instruction in other forms such as voice, gesture, and the like, which is not limited in this application.
And S602, the terminal equipment responds to the control instruction and emits the emitting light to the target object.
The terminal device controls the projector to emit the emission light toward the target object in response to the control instruction. Under the same light intensity, the sparser the emitted light, the less the depth information is affected by the multipath effect; the denser the emitted light, the more the depth information is affected by the multipath effect. Thus, multipath effects can be mitigated at the source by emitting non-surface light toward the target object. Understandably, surface light may be light emitted by a surface light source and is uniformly irradiated light, whereas non-surface light is light with alternating bright and dark parts. Herein, an image formed by projecting non-surface light onto a plane includes bright areas and dark areas, for example the bright and dark areas shown in (a) of fig. 7. Projection may refer to the image formed by the emitted light being projected onto a plane. The image formed by projecting surface light onto a plane does not include dark areas. In some embodiments, the degree of sparseness of the emitted light may be represented by the number of bright areas included in the image formed by projecting the non-surface light onto a plane: if the image includes fewer bright areas, the emitted light is sparser; if it includes more bright areas, the emitted light is denser. Alternatively, non-surface light may be referred to as sparse light. The sparse light may be high-density sparse light or low-density sparse light.
For example, as can be seen from fig. 7 (c) and 7 (d), the number of bright regions shown in fig. 7 (c) is smaller than the number of bright regions shown in fig. 7 (d), and it may be indicated that the emitted light corresponding to the image shown in fig. 7 (d) is denser than the emitted light corresponding to the image shown in fig. 7 (c), or it may be indicated that the emitted light corresponding to the image shown in fig. 7 (c) is less dense than the emitted light corresponding to the image shown in fig. 7 (d).
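The embodiment does not prescribe how the number of bright areas is counted; the following Python sketch, which assumes a thresholded projection image and uses connected-component labeling from SciPy, is one possible way to compare the sparseness of two emitted-light patterns.

```python
import numpy as np
from scipy import ndimage

def count_bright_areas(projection: np.ndarray, threshold: float = 0.5) -> int:
    """Count the connected bright areas in an image formed by projecting the emitted light onto a plane."""
    bright_mask = projection > threshold  # a bright area is where the intensity exceeds the threshold
    _, num_bright_areas = ndimage.label(bright_mask)
    return num_bright_areas

# Two illustrative speckle grids: the pattern with fewer bright areas is the sparser emitted light.
sparse_pattern = np.zeros((64, 64)); sparse_pattern[8::16, 8::16] = 1.0
dense_pattern = np.zeros((64, 64)); dense_pattern[4::8, 4::8] = 1.0
print(count_bright_areas(sparse_pattern) < count_bright_areas(dense_pattern))  # True
```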
Alternatively, for different forms of non-surface light, the image formed by projecting the non-surface light onto a plane may include different distributions of bright and dark areas. The form of the non-surface light may be a speckle array or a specific encoded pattern. For example, (a) of fig. 7 is a schematic diagram of the light spots of a randomly distributed speckle array; the light spots may be referred to as bright areas. As another example, (b) of fig. 7 is a schematic diagram of the light spots of a regularly arranged (e.g., regular geometric) speckle array, and (c) of fig. 7 and (d) of fig. 7 are schematic diagrams of the light spots of speckle arrays with encoded patterns. (e) of fig. 7 is a schematic diagram of the light spots of surface light.
Different forms of non-surface light can be emitted by arranging different optical components at the light source. For example, as shown in fig. 8, the optical components of the light source may include a Light Emitting Diode (LED) and a light-shielding grid (mask). The form of the emitted light can be controlled by designing the distribution of the light-transmitting parts and the light-shielding parts on the light-shielding grid. As another example, as shown in fig. 9, the optical components of the light source may include a Vertical Cavity Surface Emitting Laser (VCSEL), a collimating lens, and a diffraction grating. The form of the emitted light can be controlled by designing the positions of the light-emitting points on the VCSEL and the diffraction orders of the diffraction grating. Both optical assemblies can project light in a speckle array, a regular pattern, or an encoded pattern. The LED with the light-shielding grid is relatively easy to design and simple in hardware process, but the light-shielding part reduces the utilization rate of the light source, resulting in higher power consumption and higher heat-dissipation requirements. The VCSEL with the diffraction grating has a higher utilization rate of the light source and lower power consumption, but may cause partial distortion. Therefore, the optical component may be selected according to the application scenario of the system, which is not limited in the present application.
Herein, the emitted light emitted by the light source may refer to a fixed frequency near-infrared light signal.
And S603, receiving the reflected light by the terminal equipment.
In some embodiments, the terminal device is equipped with a TOF sensor, through which reflected light can be received. The reflected light includes reflected light once reflected by the target object, reflected light after multiple reflections, and ambient light. Understandably, the frequency of the reflected light is the same as the frequency of the emitted light.
After the terminal device receives the reflected light, the amplitude map can be determined according to the reflected light. The amplitude map is a graph representing the shape of a target object using amplitude information. The amplitude information may include the amplitude and phase difference of the reflected light. In some embodiments, the magnitude information for each pixel point may be determined to obtain a magnitude map. Next, taking the ith pixel point as an example, a specific method for determining the amplitude information of the pixel point is described, as set forth in S604. The ith pixel point can be any pixel point, i is an integer, i belongs to [1, N ], and N represents the number of pixel points contained in the amplitude diagram.
S604, the terminal equipment determines the amplitude information of the ith pixel point according to the reflected light of the ith pixel point.
In general, the properties of light can be represented by both amplitude and phase. The amplitude information of the ith pixel point comprises the amplitude of the reflected light of the ith pixel point and the phase difference of the reflected light of the ith pixel point. In some embodiments, the amplitude information may be represented by a vector. Illustratively, as shown in fig. 10, point O represents the start of the vector and point A represents the end of the vector; the vector $\vec{OA}$ expresses the amplitude information of the ith pixel point, that is, the length of $\vec{OA}$ represents the amplitude and the angle of $\vec{OA}$ represents the phase difference. Alternatively, the terminal device may measure the phase of the reflected light relative to the phase of the emitted light multiple times in a 4-phase (4-quad) manner. The phase of the emitted light is shifted by 90 degrees for each measurement, resulting in four measurements, i.e., C0, C90, C180, and C270. The phase difference satisfies the following formula (1), and the amplitude satisfies the following formula (2):

$$\varphi = \arctan\frac{C270 - C90}{C0 - C180} \quad (1)$$

$$A = \sqrt{(C270 - C90)^2 + (C0 - C180)^2} \quad (2)$$

where C270 − C90 represents the component of the vector $\vec{OA}$ on the y-axis, C0 − C180 represents the component of the vector $\vec{OA}$ on the x-axis, A represents the length of the vector $\vec{OA}$, and φ represents the angle of the vector $\vec{OA}$.
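A minimal Python sketch of formulas (1) and (2) as reconstructed above, treating the four 4-phase measurements C0, C90, C180, and C270 as the components of the vector; the function name is illustrative.

```python
import math

def amplitude_info_from_4_phase(c0: float, c90: float, c180: float, c270: float):
    """Amplitude and phase difference of the reflected light from the four 4-phase measurements."""
    x = c0 - c180    # component of the vector on the x-axis
    y = c270 - c90   # component of the vector on the y-axis
    amplitude = math.hypot(x, y)          # formula (2): length of the vector
    phase_difference = math.atan2(y, x)   # formula (1): angle of the vector (atan2 handles all quadrants)
    return amplitude, phase_difference
```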
Alternatively, in order to accurately acquire reflected light at different phases in a noisy environment, a lock-in amplifier may be used to reduce errors caused by frequency disturbances.
The terminal device may determine the amplitude information of each pixel point in the amplitude map by using the formula (1) and the formula (2) according to the method to obtain the amplitude map. The amplitude map represents the amplitude information of the N pixel points.
After the amplitude map is obtained, each pixel point in the amplitude map can be corrected to obtain a corrected amplitude map. Next, a specific method for correcting the amplitude information of the pixel point is described by taking the ith pixel point as an example, as set forth in S605.
S605, the terminal equipment corrects the amplitude information of the ith pixel point according to the error amplitude information of the ith pixel point to obtain the corrected amplitude information of the ith pixel point.
Since the emitted light is non-surface light with alternate light and shade, after the emitted light irradiates the target object, a bright area and a dark area exist on the target object. By bright areas may be meant areas that are illuminated by emitted light. By dark areas may be meant areas illuminated by ambient light and reflected light having undergone multiple reflections, i.e. areas not illuminated by emitted light. Further, the reflected light received by the terminal device includes reflected light reflected by a bright area and reflected light reflected by a dark area. It will be appreciated that the reflected light reflected by the bright areas includes the reflected light first reflected by the target object. The reflected light reflected by the dark region includes at least one of reflected light via multiple reflections and ambient light. In some embodiments, the reflected light reflected by the dark region received in the neighborhood window may be approximately regarded as reflected light and ambient light which are received by all pixel points in the neighborhood window after multiple reflections, and then the amplitude information of the reflected light of the dark region may be determined as the error amplitude information. It can be understood that the error amplitude information is amplitude information corresponding to the reflected light and the ambient light after multiple reflections. Or the error amplitude information is amplitude information corresponding to reflected light after multiple reflections. Or, the error amplitude information is amplitude information corresponding to the ambient light. For the ith pixel point in the amplitude map, the amplitude information of the ith pixel point can be corrected according to the error amplitude information of the ith pixel point.
For example, for the ith pixel point, the amplitude information of the ith pixel point may be corrected according to the error amplitude information in the neighborhood window of the ith pixel point. The ith pixel point is a pixel point in the amplitude map, and i is an integer greater than or equal to 1. The neighborhood window may be an m × m region centered on the ith pixel point. Optionally, the neighborhood window may include m × m pixel points, and the ith pixel point is the center of the m × m region. Understandably, the m × m pixel points include the ith pixel point. For example, as shown in fig. 11, when m is 7, the neighborhood window is a 7 × 7 region centered on the ith pixel point, and the neighborhood window includes 49 pixel points.
Optionally, as shown in fig. 12, correcting the amplitude information of the ith pixel point according to the error amplitude information in the neighborhood window of the ith pixel point may include the following detailed steps.
S6051, the terminal equipment determines K pixel points according to the amplitude of the pixel points in the neighborhood window.
The amplitude of a pixel point is proportional to its confidence: the larger the amplitude of the pixel point, the higher the confidence of the pixel point, and the smaller the influence of ambient light or of reflected light that has undergone multiple reflections on the pixel point; conversely, the smaller the amplitude of the pixel point, the lower the confidence, indicating that the pixel point is influenced by ambient light or by reflected light that has undergone multiple reflections. Therefore, the terminal device may sort the amplitudes of all the pixel points in the neighborhood window from large to small and select the K pixel points with the smallest amplitudes as error reference points, where K is an integer. Optionally, the value of K is 3, that is, the terminal device selects the 3 pixel points with the smallest amplitudes in the neighborhood window as error reference points.
For example, as shown in fig. 13 (a), a neighborhood window in the magnitude graph is schematically illustrated, and as shown in fig. 13 (b), a pixel point included in the neighborhood window in the magnitude graph is schematically illustrated. Where M1, M2, and M3 may refer to error reference points.
S6052, the terminal equipment determines error amplitude information of the ith pixel point according to the amplitude information of the K pixel points.
In some embodiments, the average value of the amplitude information of the K pixels may be used as the error amplitude information of the ith pixel. Since the magnitude information can be represented by vectors, the error magnitude information can be determined by the mean of K vectors. Understandably, the average value of the amplitude information of the K pixels may refer to an average multipath effect in a neighborhood window and an error value caused by ambient light.
S6053, the terminal equipment determines the corrected amplitude information of the ith pixel point according to the amplitude information of the ith pixel point and the error amplitude information of the ith pixel point.
In some embodiments, the corrected amplitude information of the ith pixel point can be obtained by subtracting the error amplitude information of the ith pixel point from the amplitude information of the ith pixel point. Understandably, the corrected amplitude information can be obtained by subtracting two vectors, that is, the error amplitude information vector of the ith pixel point is subtracted from the amplitude information vector of the ith pixel point to obtain the corrected amplitude information of the ith pixel point. As shown in fig. 14, denoting the amplitude information vector of the ith pixel point as $\vec{A}_i$, the error amplitude information vector of the ith pixel point as $\vec{E}_i$, and the corrected amplitude information vector of the ith pixel point as $\vec{A}_i'$, the correction satisfies

$$\vec{A}_i' = \vec{A}_i - \vec{E}_i$$
Optionally, starting from the upper left corner of the amplitude map, the neighborhood window is slid across the amplitude map, and the amplitude information of the pixel points in the amplitude map is corrected to obtain the corrected amplitude map; for the specific correction process, reference may be made to the descriptions of S6051 to S6053, which are not repeated here. For example, fig. 15 is a schematic diagram of a neighborhood window sliding in an amplitude map according to an embodiment of the present application. As can be seen from fig. 15, the neighborhood window slides across the amplitude map from left to right and from top to bottom, moving by one pixel point at a time, and the amplitude information of the pixel points in the amplitude map is corrected to obtain the corrected amplitude map.
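The correction of S6051 to S6053 with the sliding neighborhood window can be sketched as follows in Python. The amplitude information of each pixel point is stored as a complex number whose real part is the x-component and whose imaginary part is the y-component of the vector; this representation and the clipping of the window at the image borders are implementation assumptions, not requirements of the embodiment.

```python
import numpy as np

def correct_amplitude_map(amp_map: np.ndarray, m: int = 7, k: int = 3) -> np.ndarray:
    """Correct a complex-valued amplitude map pixel by pixel (S6051 to S6053).

    amp_map[r, c] is the amplitude information of one pixel point stored as a complex
    vector: its magnitude is the amplitude, its angle is the phase difference.
    """
    half = m // 2
    rows, cols = amp_map.shape
    corrected = np.empty_like(amp_map)
    for r in range(rows):
        for c in range(cols):
            # m x m neighborhood window centered on the current pixel point (clipped at the borders).
            window = amp_map[max(0, r - half):r + half + 1,
                             max(0, c - half):c + half + 1].ravel()
            # S6051: the k pixel points with the smallest amplitudes are the error reference points.
            reference_points = window[np.argsort(np.abs(window))[:k]]
            # S6052: the error amplitude information is the mean of the reference vectors.
            error_vector = reference_points.mean()
            # S6053: subtract the error vector from the pixel point's amplitude vector.
            corrected[r, c] = amp_map[r, c] - error_vector
    return corrected
```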
After the modified amplitude map is obtained, the target depth value of each pixel point in the modified amplitude map may be determined, so as to obtain a target depth map. Next, a specific method for determining the target depth value of the pixel point is described by taking the ith pixel point as an example, as set forth in S606.
And S606, the terminal equipment determines the target depth value of the ith pixel point according to the corrected phase difference contained in the corrected amplitude information of the ith pixel point to obtain a target depth map.
In some embodiments, the terminal device determines the target depth value of the ith pixel point according to the corrected phase difference of the ith pixel point, the speed of light, and the frequency of the emitted light. The target depth value of the ith pixel point satisfies the following formula (3):

$$d = \frac{c \cdot \varphi'}{4 \pi f} \quad (3)$$

where d represents the target depth value, i.e., the distance of the TOF camera from the target object; c represents the speed of light (generally, the speed of light refers to the speed at which electromagnetic waves, including light waves, propagate in a vacuum, c = 2.99792458 × 10^8 m/s); φ' represents the corrected phase difference; and f represents the frequency of the emitted light. Optionally, the target depth value of each pixel point in the corrected amplitude map is determined according to the explanation of S606, so as to obtain a target depth map, where the target depth map includes the depth information of the target object.
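A short Python sketch of formula (3) as reconstructed above; the 4πf denominator reflects that the corrected phase difference accumulates over the round trip of the modulated light.

```python
import math

SPEED_OF_LIGHT = 2.99792458e8  # m/s

def depth_from_corrected_phase(corrected_phase_rad: float, emitted_light_freq_hz: float) -> float:
    """Formula (3): target depth value from the corrected phase difference and the emitted-light frequency."""
    return SPEED_OF_LIGHT * corrected_phase_rad / (4.0 * math.pi * emitted_light_freq_hz)

# Example: with 20 MHz emitted light, a corrected phase difference of pi/2 rad
# corresponds to a target depth value of roughly 1.87 m.
print(depth_from_corrected_phase(math.pi / 2, 20e6))
```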
Alternatively, the emitted light may be a surface light in this context. The embodiment of the present application does not limit the type of the emitted light.
According to the image generation method provided by the embodiment of the application, before the depth value is calculated according to the phase difference of the pixel point, the amplitude information of the pixel point is corrected by using the error amplitude information of the pixel point, and then the high-precision depth value of the pixel point is obtained according to the corrected phase difference of the pixel point, so that the multi-path effect of the pixel point and the error caused by ambient light are effectively reduced, and the accuracy of the target depth map is improved.
If the emitted light is non-surface light, the reflected light of the dark area received by the TOF sensor is both from ambient light and reflected light which reaches the dark area after being reflected for multiple times in the external environment due to multipath effect, the amplitude information of the reflected light of the dark area is error amplitude information, and the depth value obtained according to the error amplitude information is not really representing the distance from the target object to the TOF camera, so that the depth information of the dark area is removed from the depth map, and a depth map hole can be caused. In order to solve the problem of depth map holes, the TOF camera can control the light source to emit the emitted light of the two speckle arrays in a time-sharing manner, wherein the union of the bright areas of the emitted light of the two speckle arrays covers the whole field of view of the target object as much as possible. And then the TOF sensor collects the reflected light corresponding to the emitted light of the two speckle arrays respectively, and the depth information of the corresponding pixel point is determined according to the reflected light corresponding to the emitted light of the two speckle arrays respectively. For example, as shown in fig. 16, an embodiment of the present application provides a TOF depth sensing module 1600. The TOF depth sensing module 1600 includes a first projector 1601, a second projector 1602, and a photosensitive unit (e.g., a photodetector) 1603. Optionally, the first projector comprises a light emitting diode and a light blocking grid. The second projector includes a vertical cavity surface emitting laser, a collimating lens, and a diffraction grating. The first projector 1601 is for emitting a first emission light to a target object. The second projector 1602 is configured to emit a second emission light to the target object, the second emission light having a different morphology than the first emission light. The photosensitive unit 1603 is configured to receive the first reflected light and the second reflected light, and generate a first amplitude map according to the first reflected light and a second amplitude map according to the second reflected light. The first reflected light includes reflected light of the target object reflected by the first emitted light. The second reflected light includes reflected light of the target object reflected by the second emitted light. The TOF depth sensing module 1600 may further include a processing unit configured to obtain a target depth map according to the first amplitude map and the second amplitude map, where the target depth map is used to represent depth information of the target object.
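The time-shared operation of the TOF depth sensing module 1600 can be pictured with the following Python sketch; the projector and sensor interfaces (emit(), read_amplitude_map()) are hypothetical placeholders, not an API defined by this embodiment.

```python
class TOFDepthSensingModule:
    """Sketch of module 1600: two projectors with different emission forms and one photosensitive unit."""

    def __init__(self, first_projector, second_projector, photosensitive_unit):
        self.first_projector = first_projector          # e.g. LED with light-shielding grid
        self.second_projector = second_projector        # e.g. VCSEL with collimating lens and diffraction grating
        self.photosensitive_unit = photosensitive_unit  # e.g. TOF sensor

    def capture_amplitude_maps(self):
        # Time-shared emission: the union of the bright areas of the two emitted lights
        # covers the field of view of the target object as completely as possible.
        self.first_projector.emit()
        first_amplitude_map = self.photosensitive_unit.read_amplitude_map()
        self.second_projector.emit()
        second_amplitude_map = self.photosensitive_unit.read_amplitude_map()
        return first_amplitude_map, second_amplitude_map
```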
In one possible design, the first emitted light and the second emitted light are both non-surface light. For example, the first emitted light is high-density sparse light and the second emitted light is low-density sparse light. As another example, the first emitted light is low-density sparse light and the second emitted light is high-density sparse light.
In another possible design, the first emitted light is non-surface light and the second emitted light is surface light. The first emitted light is high-density sparse light or low-density sparse light.
Fig. 17 is a flowchart of an image generation method according to an embodiment of the present application. The terminal device may include a first projector and a second projector. As shown in fig. 17, the method may include:
s1701, the terminal device receives a control instruction.
S1702, the terminal device responds to the control instruction and controls the first projector to emit the first emitting light to the target object.
S1703, the terminal device controls the second projector to emit the second emission light to the target object.
The first emitted light and the second emitted light may both be non-planar light. Wherein the second emitted light has a morphology different from the morphology of the first emitted light. The detailed explanation about the first emission light and the second emission light may refer to the explanation of the emission light in S602, without limitation.
In some embodiments, in order for the first and second emissions to cover the entire field of view of the target object, the union of the bright regions of the first and second emissions is as large as possible. For example, as shown in fig. 18, the first emitted light and the second emitted light are each a spot of a randomly distributed speckle array. The light spot shown in (a) in fig. 18 is a light spot of the first emission light, and the light spot shown in (b) in fig. 18 is a light spot of the second emission light. The light spots may be referred to as bright areas. The bright areas of the first emission light projection are as complementary as possible to the bright areas of the second emission light projection. As another example, as shown in fig. 19, the first emitted light and the second emitted light are each light spots of a regularly arranged speckle array. The light spot shown in (a) of fig. 19 is a light spot of the first emission light, the light spot shown in (b) of fig. 19 is a light spot of the second emission light, and the bright area of the projection of the first emission light is complementary to the bright area of the projection of the second emission light. Alternatively, the number of complementary bright areas is not limited, and the number of bright areas projected by the first emitted light may be equal to the number of bright areas projected by the second emitted light. Alternatively, the number of bright areas projected by the first emitted light is not equal to the number of bright areas projected by the second emitted light.
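For the regularly arranged case, the complementarity of the two bright areas can be illustrated with a small Python sketch in which the spots of the two emitted lights occupy complementary cells so that their union covers the whole field of view; the block size and resolution are arbitrary choices.

```python
import numpy as np

height, width = 64, 64
rows, cols = np.mgrid[0:height, 0:width]

# First emitted light: bright spots on the "even" cells of an 8 x 8 checkerboard.
first_bright_area = ((rows // 8 + cols // 8) % 2 == 0)
# Second emitted light: bright spots on the complementary cells.
second_bright_area = ~first_bright_area

# The union of the two bright areas covers the entire field of view.
print((first_bright_area | second_bright_area).all())  # True
```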
S1704, the terminal device receives the first reflected light.
And S1705, the terminal equipment receives the second reflected light.
The terminal device may receive the first reflected light and the second reflected light through the light sensing unit. The light sensing unit may be a TOF sensor. The first reflected light includes light reflected back after the first emitted light is irradiated to the target object, reflected light reflected multiple times, and ambient light. The second reflected light includes light reflected back after the second emitted light is irradiated to the target object, reflected light of multiple reflections, and ambient light.
And S1706, the terminal device generates a first amplitude map according to the first reflected light.
And S1707, the terminal device generates a second amplitude map according to the second reflected light.
The terminal device can determine the first amplitude information of the ith pixel point according to the first reflected light of the ith pixel point, and therefore the first amplitude image is obtained by determining the first amplitude information of each pixel point. The first amplitude map represents first amplitude information of the N pixel points. The first amplitude information of the ith pixel point comprises a first amplitude and a first phase difference of first reflected light of the ith pixel point.
The terminal device can determine second amplitude information of the ith pixel point according to the second reflected light of the ith pixel point, and therefore a second amplitude map is obtained by determining the second amplitude information of each pixel point. The second amplitude map represents second amplitude information of the N pixel points. The second amplitude information of the ith pixel point comprises a second amplitude and a second phase difference of second reflected light of the ith pixel point.
The method for determining the first amplitude information of the ith pixel point in the first amplitude map and the method for determining the second amplitude information of the ith pixel point in the second amplitude map by the terminal device may refer to the explanation of determining the amplitude information of the ith pixel point according to the reflected light of the ith pixel point in S604, and is not limited.
After determining the first amplitude map and the second amplitude map, the terminal device may obtain a target depth map according to the first amplitude map and the second amplitude map, where the target depth map is used to represent depth information of a target object. Specifically, as described in S1708 to S1712 below.
S1708, the terminal device corrects the first amplitude map to obtain a first corrected amplitude map.
The terminal device may correct the first amplitude information of the ith pixel point according to the error amplitude information of the ith pixel point in the first amplitude map, so as to obtain first corrected amplitude information of the ith pixel point. Optionally, after the first amplitude information of each pixel point in the first amplitude map is corrected, a first corrected amplitude map is obtained, where the first corrected amplitude map includes the first corrected amplitude information of each pixel point.
And S1709, the terminal equipment corrects the second amplitude map to obtain a second corrected amplitude map.
The terminal device may correct the second amplitude information of the ith pixel point according to the error amplitude information of the ith pixel point in the second amplitude map, so as to obtain second corrected amplitude information of the ith pixel point. Optionally, after the second amplitude information of each pixel point in the second amplitude map is corrected, a second corrected amplitude map is obtained, where the second corrected amplitude map includes the second corrected amplitude information of each pixel point.
The method for correcting the first amplitude information and the second amplitude information of the ith pixel point by the terminal device may refer to the method of correcting the amplitude information of the ith pixel point according to the error amplitude information of the ith pixel point in S605, so as to obtain the explanation of the corrected amplitude map, which is not limited.
And S1710, the terminal device converts the first corrected amplitude map into a first depth map.
The terminal device may determine the first depth value of the ith pixel point according to the corrected phase difference included in the first corrected amplitude information of the ith pixel point, so as to obtain a first depth map. The first depth map includes a first depth value for each pixel point.
And S1711, the terminal equipment converts the second corrected amplitude map into a second depth map.
The terminal device may determine a second depth value of the ith pixel point according to the corrected phase difference included in the second corrected amplitude information of the ith pixel point, so as to obtain a second depth map. The second depth map includes a second depth value for each pixel point.
The method for determining the first depth value and the second depth value of the ith pixel point by the terminal device may refer to the explanation of determining the depth value of the ith pixel point according to the corrected phase difference included in the corrected amplitude information of the ith pixel point in S606, which is not limited.
It should be noted that the order of the steps of the image generation method provided by the embodiments of the present application may be appropriately adjusted. For example, the order of emitting the first emitted light and the second emitted light can be interchanged, i.e., the second emitted light may be emitted first and then the first emitted light. As another example, the order of receiving the first reflected light and the second reflected light may be interchanged, i.e., the second reflected light may be received first and then the first reflected light. Any method that can be easily conceived by those skilled in the art within the technical scope of the present disclosure is covered by the protection scope of the present disclosure, and details are not repeated here.
And S1712, the terminal device fuses the first depth map and the second depth map to obtain a target depth map.
The terminal device may determine a target depth value of an ith pixel point in the target depth map according to a first depth value of the ith pixel point in the first depth map and a second depth value of the ith pixel point in the second depth map, so as to obtain the target depth map.
Because the number of the pixel points in the first depth map is the same as that of the pixel points in the second depth map, the target depth value of the pixel points can be selected from the first depth map and the second depth map for the same pixel point. In statistics, the Confidence interval (Confidence interval) of a probability sample is an interval estimate for some overall parameter of this sample. The higher the confidence, the more reliable the sample. In some embodiments, the target depth value of the pixel point may be selected from the first depth map and the second depth map according to a confidence of the depth value of the pixel point. Specifically, the terminal device determines a first confidence according to a first depth value of an ith pixel point in the first depth map, determines a second confidence according to a second depth value of the ith pixel point in the second depth map, compares the first confidence and the second confidence of the ith pixel point, and determines the depth value of the ith pixel point corresponding to the maximum confidence as a target depth value of the ith pixel point in the target depth map to obtain the target depth map.
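The per-pixel selection of S1712 can be sketched in Python as follows; how the first and second confidences are obtained is left open here, and they are simply passed in as arrays (for example, they might be derived from the corrected amplitudes, which is an assumption rather than something fixed by the embodiment).

```python
import numpy as np

def fuse_depth_maps(first_depth: np.ndarray, second_depth: np.ndarray,
                    first_confidence: np.ndarray, second_confidence: np.ndarray) -> np.ndarray:
    """For each pixel point, keep the depth value whose confidence is the larger of the two (S1712)."""
    return np.where(first_confidence >= second_confidence, first_depth, second_depth)
```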
Optionally, the pixel point in the first depth map may also be used as an anchor point, and the depth information of the pixel point in the first depth map is used as a reference depth value to correct the depth information of the pixel point in the second depth map, so as to obtain the target depth map.
Optionally, the depth information of the pixel point in the first depth map may also be used as reference information, and the hole in the second depth map is eliminated by using a hole filling mode and a super-resolution processing mode, so as to obtain the target depth map.
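A minimal Python sketch of the hole-filling alternative, assuming hole pixels of the second depth map are marked with a depth value of 0 and are replaced by the corresponding reference values from the first depth map; the embodiment does not fix the marking convention or the filling algorithm.

```python
import numpy as np

def fill_holes_with_reference(second_depth: np.ndarray, first_depth: np.ndarray) -> np.ndarray:
    """Fill hole pixels (depth value 0) of the second depth map using the first depth map as reference."""
    filled = second_depth.copy()
    hole_mask = (filled == 0)
    filled[hole_mask] = first_depth[hole_mask]
    return filled
```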
Because the confidence degree and the depth value accuracy degree corresponding to the bright spot area of the light of the speckle array are high, and the union of the bright zone parts of the two light sources can basically cover the whole view field, the integrity degree of the fused depth map is high, and the accuracy degree of the depth information is also high. Therefore, the integrity of the target depth map is effectively improved. The larger the union of the bright areas of the first and second emitted light, the higher the completeness of the target depth map.
In other embodiments, one of the emitted lights may be arranged to be a surface light. Fig. 20 is a flowchart of an image generation method according to an embodiment of the present application. The terminal device may include a first projector and a second projector. As shown in fig. 20, the method may include:
s2001, the terminal device receives the control command.
And S2002, the terminal equipment responds to the control instruction and controls the first projector to emit the first emitting light to the target object.
And S2003, the terminal device controls the second projector to emit second emission light to the target object.
Wherein the first emitting light may be a non-surface light and the second emitting light may be a surface light. The detailed explanation about the first emission light and the second emission light may refer to the explanation of the emission light in S602, without limitation.
And S2004, the terminal equipment receives the first reflected light.
And S2005, the terminal equipment receives the second reflected light.
The first reflected light includes light reflected back after the first emitted light is irradiated to the target object, reflected light reflected multiple times, and ambient light. The second reflected light includes light reflected back after the second emitted light is irradiated to the target object, reflected light of multiple reflections, and ambient light.
And S2006, the terminal device generates a first amplitude map according to the first reflected light.
And S2007, the terminal device generates a second amplitude map according to the second reflected light.
The method for determining the first amplitude information of the ith pixel point in the first amplitude map and the method for determining the second amplitude information of the ith pixel point in the second amplitude map by the terminal device may refer to the explanation of determining the amplitude information of the ith pixel point according to the reflected light of the ith pixel point in S604, and is not limited.
And S2008, the terminal device corrects the first amplitude diagram to obtain a first corrected amplitude diagram.
The terminal device may correct the first amplitude information of the ith pixel point according to the error amplitude information of the ith pixel point in the first amplitude map, so as to obtain first corrected amplitude information of the ith pixel point. The first modified amplitude map includes first modified amplitude information for each pixel.
The method for correcting the first amplitude information of the ith pixel by the terminal device may refer to S605 to correct the amplitude information of the ith pixel according to the error amplitude information of the ith pixel, so as to obtain the explanation of the corrected amplitude map, which is not limited.
And S2009, the terminal device converts the second amplitude map into a second depth map.
The terminal device may determine a second depth value of the ith pixel point according to the phase difference included in the second amplitude information of the ith pixel point, so as to obtain a second depth map. The second depth map includes a second depth value for each pixel point.
The terminal equipment does not need to correct second amplitude information of the pixel points in the second amplitude image, and determines a second depth value of the ith pixel point according to the phase difference contained in the second amplitude information so as to obtain the second depth image.
And S2010, the terminal equipment converts the first corrected amplitude map into a first depth map.
The terminal device may determine the first depth value of the ith pixel point according to the corrected phase difference included in the first corrected amplitude information of the ith pixel point, so as to obtain a first depth map. The first depth map includes a first depth value for each pixel point.
The method for determining the first depth value of the ith pixel point by the terminal device may refer to the explanation of determining the depth value of the ith pixel point according to the corrected phase difference included in the corrected amplitude information of the ith pixel point in S606, which is not limited.
And S2011, the terminal device fuses the first depth map and the second depth map to obtain a target depth map.
The terminal device may determine a target depth value of an ith pixel point in the target depth map according to a first depth value of the ith pixel point in the first depth map and a second depth value of the ith pixel point in the second depth map, so as to obtain the target depth map.
Because the confidence degree and the depth value accuracy degree corresponding to the bright spot area of the light of the speckle array are high, and the union of the bright zone parts of the two light sources can basically cover the whole view field, the integrity degree of the fused depth map is high, and the accuracy degree of the depth information is also high. Therefore, the integrity of the target depth map is effectively improved. The larger the union of the bright areas of the first and second emitted light, the higher the completeness of the target depth map.
Optionally, the depth information of the cavity portion in the target depth map may be supplemented in a super-resolution processing manner or a hole filling manner, so as to obtain higher integrity.
Optionally, the terminal device may further perform optimization processing on the first depth map and the second depth map, and determine the target depth map according to the first depth map and the second depth map, so as to improve accuracy of the target depth map. For example, the first depth map and the second depth map may be optimized using a despatch algorithm or a post-processing algorithm.
It is to be understood that, in order to implement the functions in the above embodiments, the network device and the terminal device include hardware structures and/or software modules for performing the respective functions. Those of skill in the art will readily appreciate that the various illustrative elements and method steps described in connection with the embodiments disclosed herein may be implemented as hardware or combinations of hardware and computer software. Whether a function is performed as hardware or computer software driven hardware depends on the particular application scenario and design constraints imposed on the solution.
Fig. 21 is a schematic structural diagram of a possible image processing apparatus according to an embodiment of the present application. The image processing devices can be used for realizing the functions of the terminal equipment in the method embodiment, so that the beneficial effects of the method embodiment can be realized. In the embodiment of the present application, the image processing apparatus may be the terminal device 400 shown in fig. 4, and may also be a module (e.g., a chip) applied to the terminal device.
As shown in fig. 21, the image processing apparatus 2100 includes a receiving unit 2101, a response unit 2102, a photosensitive unit 2103, a generating unit 2104, and a converting unit 2105. The image processing apparatus 2100 may be applied to the terminal device 400 as shown in fig. 4, and the terminal device 400 may include a first projector and a second projector. The first projector is used to emit the emitted light 1. The second projector is for emitting the emitted light 2. The response unit 2102 may control the first projector to emit the first emission light and the second projector to emit the second emission light. The light sensing unit 2103 may comprise TOF sensors for receiving the reflected light 1 and the reflected light 2. The generation unit 2104 is used for determining the amplitude map 1 using the reflected light 1 and for determining the amplitude map 2 using the reflected light 2. The conversion unit 2105 is configured to correct the amplitude map 1 and the amplitude map 2, convert the corrected amplitude map 1 into a depth map 1, convert the corrected amplitude map 2 into a depth map 2, and fuse the depth map 1 and the depth map 2 to obtain a target depth map. Illustratively, the image processing apparatus 2100 is configured to implement the functions of the terminal device in the method embodiments illustrated in fig. 6, 12, 17 or 20 described above.
In some embodiments, the response unit 2102 may control the first projector to emit the emission light 1 or the second projector to emit the emission light 2. The generation unit 2104 is used for optimizing the depth map 1 or the depth map 2. For example, when the image processing apparatus 2100 is used to implement the functions of the terminal device in the method embodiment shown in fig. 6, the receiving unit 2101 is used to execute S601; the response unit 2102 is configured to execute S602; the photosensitive unit 2103 is used to execute S603; the generation unit 2104 is used to perform S604. The conversion unit 2105 is used to execute S605 and S606.
When the image processing apparatus 2100 is used to implement the functions of the terminal device in the method embodiment shown in fig. 12, the receiving unit 2101 is used to execute S601; the response unit 2102 is configured to execute S602; the photosensitive unit 2103 is used to execute S603; the generation unit 2104 is used to perform S604. The conversion unit 2105 is configured to execute S6051 to S6053, and S606.
In other embodiments, if the response unit 2102 controls the first projector to emit the emission light 1 to the target object and controls the second projector to emit the emission light 2 to the target object, the conversion unit 2105 is configured to modify the amplitude map 1 and the amplitude map 2, convert the modified amplitude map 1 into the depth map 1, convert the modified amplitude map 2 into the depth map 2, and fuse the depth map 1 and the depth map 2 to obtain the target depth map.
When the image processing apparatus 2100 is used to implement the functions of the terminal device in the method embodiment shown in fig. 17, the receiving unit 2101 is used to execute S1701; the response unit 2102 is configured to execute S1702 and S1703; the photosensitive unit 2103 is used to perform S1704 and S1705; the generation unit 2104 is used to perform S1706 and S1707; the conversion unit 2105 is configured to execute S1708 and S1712.
When the image processing apparatus 2100 is used to realize the functions of the terminal device in the method embodiment shown in fig. 20, the receiving unit 2101 is used to execute S2001; the response unit 2102 is configured to execute S2002 and S2003; the photosensitive unit 2103 is used to execute S2004 and S2005; the generation unit 2104 is used to perform S2006 and S2007; the conversion unit 2105 is configured to execute S2008 and S2011.
It is understood that the Processor in the embodiments of the present Application may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, a transistor logic device, a hardware component, or any combination thereof. The general purpose processor may be a microprocessor, but may be any conventional processor.
The method steps in the embodiments of the present application may be implemented by hardware, or may be implemented by software instructions executed by a processor. The software instructions may be composed of corresponding software modules, which may be stored in Random Access Memory (RAM), flash memory, Read-Only Memory (ROM), Programmable ROM (PROM), Erasable PROM (EPROM), Electrically EPROM (EEPROM), registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. Of course, the storage medium may also be integral to the processor. The processor and the storage medium may reside in an ASIC. In addition, the ASIC may reside in a network device or a terminal device. Of course, the processor and the storage medium may reside as discrete components in a network device or a terminal device.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer programs or instructions. When the computer program or instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present application are performed in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, a network appliance, a user device, or other programmable apparatus. The computer program or instructions may be stored in a computer readable storage medium or transmitted from one computer readable storage medium to another computer readable storage medium, for example, the computer program or instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire or wirelessly. The computer-readable storage medium can be any available medium that can be accessed by a computer or a data storage device such as a server, data center, etc. that integrates one or more available media. The usable medium may be a magnetic medium, such as a floppy disk, a hard disk, a magnetic tape; or optical media such as Digital Video Disks (DVDs); it may also be a semiconductor medium, such as a Solid State Drive (SSD).
In the embodiments of the present application, unless otherwise specified or conflicting with respect to logic, the terms and/or descriptions in different embodiments have consistency and may be mutually cited, and technical features in different embodiments may be combined to form a new embodiment according to their inherent logic relationship.
In the present application, "at least one" means one or more, "a plurality" means two or more. "and/or" describes the association relationship of the associated objects, meaning that there may be three relationships, e.g., a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone, wherein A and B can be singular or plural. In the description of the text of the present application, the character "/" generally indicates that the former and latter associated objects are in an "or" relationship; in the formula of the present application, the character "/" indicates that the preceding and following related objects are in a relationship of "division".
It is to be understood that the various numerical references referred to in the embodiments of the present application are merely for descriptive convenience and are not intended to limit the scope of the embodiments of the present application. The sequence numbers of the above-mentioned processes do not mean the execution sequence, and the execution sequence of the processes should be determined by their functions and inherent logic.

Claims (40)

1. A TOF depth sensing module is characterized by comprising a first projector, a second projector and a photosensitive unit,
the first projector is used for emitting first emission light to a target object;
the second projector is used for emitting second emission light to the target object, and the form of the second emission light is different from that of the first emission light;
the photosensitive unit is used for receiving first reflected light, and the first reflected light comprises reflected light of the target object reflected by the first emitted light;
the photosensitive unit is further used for receiving second reflected light, and the second reflected light comprises reflected light of the target object reflected by the second emitted light;
the photosensitive unit is further used for generating a first amplitude map according to the first reflected light and generating a second amplitude map according to the second reflected light.
2. The TOF depth sensing module of claim 1, wherein the first projector comprises a light emitting diode and a light blocking grid and the second projector comprises a vertical cavity surface emitting laser, a collimating lens, and a diffraction grating.
3. The TOF depth sensing module of claim 1 or 2, wherein the first and second emitted light are both non-planar light.
4. The TOF depth sensing module of claim 3, wherein a projection of the first emitted light is different from a projection of the second emitted light.
5. The TOF depth sensing module of claim 4, wherein a number of bright regions of the first emitted light projection is equal to a number of bright regions of the second emitted light projection; alternatively, the number of bright areas projected by the first emitted light is greater than the number of bright areas projected by the second emitted light.
6. The TOF depth sensing module of claim 4 or 5, wherein the bright regions of the first emitted light projection are complementary to the bright regions of the second emitted light projection.
7. The TOF depth sensing module of claim 1 or 2, wherein the first emitted light is non-surface light and the second emitted light is surface light.
8. An image generation method is characterized in that the method is applied to a terminal device; the terminal device comprises a first projector and a second projector; the method comprises the following steps:
receiving a control instruction;
controlling the first projector to emit first emission light toward a target object and controlling the second projector to emit second emission light toward the target object in response to a control instruction; wherein the second emitted light has a morphology different from the morphology of the first emitted light;
receiving first reflected light, the first reflected light comprising reflected light of the target object reflected by the first emitted light;
receiving second reflected light, the second reflected light comprising reflected light of the target object reflected off of the second emitted light;
generating a first amplitude map according to the first reflected light and generating a second amplitude map according to the second reflected light;
and obtaining a target depth map according to the first amplitude map and the second amplitude map, wherein the target depth map is used for representing the depth information of the target object.
9. The method of claim 8, wherein the first projector includes a light emitting diode and a light blocking grid, and the second projector includes a vertical cavity surface emitting laser, a collimating lens, and a diffraction grating.
10. The method of claim 8 or 9, wherein the first and second emissions are both non-planar.
11. The method of claim 10, wherein the projection of the first emitted light is different from the projection of the second emitted light.
12. The method of claim 11, wherein the number of bright areas of the first emission light projection is equal to the number of bright areas of the second emission light projection; alternatively, the number of bright areas projected by the first emitted light is greater than the number of bright areas projected by the second emitted light.
13. The method of claim 11 or 12, wherein the bright areas of the first emission light projection are complementary to the bright areas of the second emission light projection.
14. The method of claim 8 or 9, wherein the first emitted light is non-surface light and the second emitted light is surface light.
15. The method of any of claims 8-14, wherein generating a first amplitude map from the first reflected light and a second amplitude map from the second reflected light comprises:
determining first amplitude information of an ith pixel point according to first reflected light of the ith pixel point to obtain a first amplitude map, wherein the first amplitude information of the ith pixel point comprises a first amplitude and a first phase difference of the first reflected light of the ith pixel point, i is an integer and belongs to [1, N ], and the first amplitude map represents the first amplitude information of N pixel points;
and determining second amplitude information of the ith pixel point according to the second reflected light of the ith pixel point to obtain the second amplitude map, wherein the second amplitude information of the ith pixel point comprises a second amplitude and a second phase difference of the second reflected light of the ith pixel point, and the second amplitude map represents the second amplitude information of the N pixel points.
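Claim 15 records a first amplitude and a first phase difference per pixel point but does not say how they are obtained from the reflected light. A common indirect-TOF choice is four phase-shifted correlation samples per pixel; the Python sketch below assumes that scheme, and the names (q0, q90, q180, q270, amplitude_and_phase) are illustrative rather than taken from the patent.

import numpy as np

def amplitude_and_phase(q0, q90, q180, q270):
    """Per-pixel amplitude information from four phase-shifted correlation
    samples (an assumed indirect-TOF sampling scheme, not stated in the claims).
    Each q* is an HxW array; the N = H*W pixels correspond to the ith pixel
    point for i in [1, N]."""
    i_comp = q0.astype(np.float64) - q180.astype(np.float64)   # in-phase component
    q_comp = q90.astype(np.float64) - q270.astype(np.float64)  # quadrature component
    amplitude = 0.5 * np.sqrt(i_comp ** 2 + q_comp ** 2)        # first/second amplitude
    phase_diff = np.mod(np.arctan2(q_comp, i_comp), 2 * np.pi)  # first/second phase difference
    return amplitude, phase_diff

Applied once to the samples captured under the first emitted light and once to those captured under the second emitted light, this routine yields the first amplitude map and the second amplitude map of claim 15.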
16. The method according to any of claims 8-15, wherein the obtaining a target depth map according to the first amplitude map and the second amplitude map comprises:
correcting the first amplitude map to obtain a first corrected amplitude map, and correcting the second amplitude map to obtain a second corrected amplitude map;
converting the first corrected amplitude map into a first depth map, and converting the second corrected amplitude map into a second depth map;
and fusing the first depth map and the second depth map to obtain the target depth map.
17. The method of claim 16, wherein the correcting the first amplitude map to obtain a first corrected amplitude map and the correcting the second amplitude map to obtain a second corrected amplitude map comprises:
correcting the first amplitude information of the ith pixel point according to first error amplitude information of the ith pixel point, and determining first corrected amplitude information of the ith pixel point to obtain the first corrected amplitude map, wherein the first error amplitude information is amplitude information corresponding to multiply reflected first reflected light and to ambient light;
and correcting the second amplitude information of the ith pixel point according to second error amplitude information of the ith pixel point, and determining second corrected amplitude information of the ith pixel point to obtain the second corrected amplitude map, wherein the second error amplitude information is amplitude information corresponding to multiply reflected second reflected light and to ambient light.
18. The method of claim 17, wherein the correcting the first amplitude information of the ith pixel point according to the first error amplitude information of the ith pixel point comprises:
correcting the first amplitude information of the ith pixel point according to the first error amplitude information in a neighborhood window of the ith pixel point, wherein the neighborhood window is centered on the ith pixel point;
the correcting the second amplitude information of the ith pixel point according to the second error amplitude information of the ith pixel point comprises:
and correcting the second amplitude information of the ith pixel point according to the second error amplitude information in the neighborhood window of the ith pixel point.
19. The method of claim 18, wherein the correcting the first amplitude information of the ith pixel point according to the first error amplitude information in the neighborhood window of the ith pixel point comprises:
determining K pixel points according to the first amplitudes of the pixel points in the neighborhood window, wherein the K pixel points are the K pixel points whose first amplitudes rank last when the first amplitudes of the pixel points in the neighborhood window are sorted from large to small, that is, the K pixel points with the smallest first amplitudes, and K is an integer;
determining first error amplitude information of the ith pixel point according to the first amplitude information of the K pixel points;
determining first corrected amplitude information of the ith pixel point according to the first amplitude information of the ith pixel point and the first error amplitude information of the ith pixel point;
the correcting the second amplitude information of the ith pixel point according to the second error amplitude information in the neighborhood window of the ith pixel point comprises:
determining K pixel points according to the second amplitudes of the pixel points in the neighborhood window, wherein the K pixel points are the K pixel points whose second amplitudes rank last when the second amplitudes of the pixel points in the neighborhood window are sorted from large to small, that is, the K pixel points with the smallest second amplitudes;
determining second error amplitude information of the ith pixel point according to the second amplitude information of the K pixel points;
and determining second corrected amplitude information of the ith pixel point according to the second amplitude information of the ith pixel point and the second error amplitude information of the ith pixel point.
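One way to read claims 18 and 19: within the window centered on the ith pixel point, the K weakest pixels stand in for the error amplitude information (multiply reflected light plus ambient light), and the correction removes that estimate from the pixel's own amplitude information. Treating each pixel as the complex value amplitude·exp(j·phase) and subtracting the mean of the K weakest pixels is only an assumed realization; the window size and K below are illustrative.

import numpy as np

def correct_amplitude_info(amp, phase, ci, cj, win=7, k=5):
    """Hedged sketch of the neighborhood-window correction of claim 19.

    amp, phase : HxW amplitude and phase-difference maps (first or second).
    ci, cj     : row/column of the ith pixel point (center of the window).
    win, k     : assumed window size and number of lowest-amplitude pixels."""
    h, w = amp.shape
    r = win // 2
    ys, ye = max(0, ci - r), min(h, ci + r + 1)
    xs, xe = max(0, cj - r), min(w, cj + r + 1)

    a = amp[ys:ye, xs:xe].ravel()
    p = phase[ys:ye, xs:xe].ravel()

    # K pixel points ranking last when amplitudes are sorted from large to small
    weakest = np.argsort(a)[:k]
    error = np.mean(a[weakest] * np.exp(1j * p[weakest]))   # error amplitude information

    corrected = amp[ci, cj] * np.exp(1j * phase[ci, cj]) - error
    return np.abs(corrected), np.mod(np.angle(corrected), 2 * np.pi)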
20. The method of any of claims 16-19, wherein the converting the first corrected amplitude map into a first depth map and the converting the second corrected amplitude map into a second depth map comprises:
determining a first depth value of an ith pixel point according to a first corrected phase difference contained in first corrected amplitude information of the ith pixel point to obtain a first depth map, wherein the first depth map is used for representing first depth information of the target object;
and determining a second depth value of the ith pixel point according to a second corrected phase difference contained in the second corrected amplitude information of the ith pixel point to obtain a second depth map, wherein the second depth map is used for representing second depth information of the target object.
21. The method of claim 20, wherein the determining a first depth value of the ith pixel point according to the first corrected phase difference contained in the first corrected amplitude information of the ith pixel point comprises:
determining the first depth value of the ith pixel point according to the first corrected phase difference of the ith pixel point, the speed of light, and the frequency of the first emitted light;
and the determining a second depth value of the ith pixel point according to the second corrected phase difference contained in the second corrected amplitude information of the ith pixel point comprises:
determining the second depth value of the ith pixel point according to the second corrected phase difference of the ith pixel point, the speed of light, and the frequency of the second emitted light.
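Claim 21 names the inputs (corrected phase difference, speed of light, frequency of the emitted light) without giving the formula; the standard indirect-TOF relation d = c·Δφ / (4π·f) is the usual way to combine them, sketched below as an assumption rather than a quotation from the patent.

import numpy as np

SPEED_OF_LIGHT = 299_792_458.0  # m/s

def depth_from_phase(corrected_phase_diff, emitted_light_freq_hz):
    """Assumed depth relation d = c * delta_phi / (4 * pi * f); applied per
    pixel to the first (or second) corrected phase difference to obtain the
    first (or second) depth value of claim 21."""
    return SPEED_OF_LIGHT * corrected_phase_diff / (4.0 * np.pi * emitted_light_freq_hz)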
22. The method of any one of claims 16-21, wherein the fusing the first depth map and the second depth map to obtain the target depth map comprises:
and determining a target depth value of the ith pixel point in the target depth map according to the first depth value of the ith pixel point in the first depth map and the second depth value of the ith pixel point in the second depth map so as to obtain the target depth map.
23. The method of claim 22, wherein determining the target depth value for the ith pixel point in the target depth map according to the first depth value for the ith pixel point in the first depth map and the second depth value for the ith pixel point in the second depth map comprises:
determining a first confidence coefficient according to a first depth value of an ith pixel point in the first depth map;
determining a second confidence coefficient according to a second depth value of an ith pixel point in the second depth map;
and comparing the first confidence coefficient and the second confidence coefficient of the ith pixel point, and determining the depth value corresponding to the larger confidence coefficient as the target depth value of the ith pixel point in the target depth map.
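Claim 23 keeps, per pixel, the depth value whose confidence coefficient is larger; how each confidence coefficient is derived from the depth value is left open, so the sketch below takes the two confidence maps as precomputed inputs and implements only the comparison and selection step.

import numpy as np

def fuse_depth_maps(depth1, depth2, conf1, conf2):
    """Per-pixel fusion of claim 23: the target depth value is the depth value
    whose confidence coefficient is larger (ties resolved toward the first
    depth map, an arbitrary choice)."""
    return np.where(conf1 >= conf2, depth1, depth2)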
24. An image processing apparatus, applied to a terminal device, wherein the terminal device comprises a first projector and a second projector, and the image processing apparatus comprises:
a receiving unit, configured to receive a control instruction;
a response unit, configured to: in response to the control instruction, control the first projector to emit first emitted light toward a target object and control the second projector to emit second emitted light toward the target object, wherein a form of the second emitted light is different from a form of the first emitted light;
a photosensitive unit, configured to receive first reflected light, wherein the first reflected light comprises light of the first emitted light reflected by the target object;
the photosensitive unit is further configured to receive second reflected light, wherein the second reflected light comprises light of the second emitted light reflected by the target object;
a generating unit, configured to generate a first amplitude map according to the first reflected light and a second amplitude map according to the second reflected light;
and a conversion unit, configured to obtain a target depth map according to the first amplitude map and the second amplitude map, wherein the target depth map is used for representing the depth information of the target object.
25. The apparatus of claim 24, wherein the first projector comprises a light emitting diode and a light blocking grid, and the second projector comprises a vertical cavity surface emitting laser, a collimating lens, and a diffraction grating.
26. The apparatus of claim 24 or 25, wherein the first emitted light and the second emitted light are both non-surface light.
27. The apparatus of claim 26, wherein a projection of the first emitted light is different from a projection of the second emitted light.
28. The apparatus of claim 27, wherein the number of bright regions in the projection of the first emitted light is equal to the number of bright regions in the projection of the second emitted light; or the number of bright regions in the projection of the first emitted light is greater than the number of bright regions in the projection of the second emitted light.
29. The apparatus of claim 27 or 28, wherein the bright regions in the projection of the first emitted light are complementary to the bright regions in the projection of the second emitted light.
30. The apparatus of claim 24 or 25, wherein the first emitted light is non-surface light and the second emitted light is surface light.
31. The apparatus according to any one of claims 24 to 30, wherein the generating unit is specifically configured to:
determining first amplitude information of an ith pixel point according to first reflected light of the ith pixel point to obtain a first amplitude map, wherein the first amplitude information of the ith pixel point comprises a first amplitude and a first phase difference of the first reflected light of the ith pixel point, i is an integer and belongs to [1, N ], and the first amplitude map represents the first amplitude information of N pixel points;
and determining second amplitude information of the ith pixel point according to the second reflected light of the ith pixel point to obtain the second amplitude map, wherein the second amplitude information of the ith pixel point comprises a second amplitude and a second phase difference of the second reflected light of the ith pixel point, and the second amplitude map represents the second amplitude information of the N pixel points.
32. The apparatus according to any of claims 24-31, wherein the conversion unit is specifically configured to:
correcting the first amplitude map to obtain a first corrected amplitude map, and correcting the second amplitude map to obtain a second corrected amplitude map;
converting the first corrected amplitude map into a first depth map, and converting the second corrected amplitude map into a second depth map;
and fusing the first depth map and the second depth map to obtain the target depth map.
33. The apparatus according to claim 32, wherein the conversion unit is specifically configured to:
correcting the first amplitude information of the ith pixel point according to first error amplitude information of the ith pixel point, and determining first corrected amplitude information of the ith pixel point to obtain a first corrected amplitude map, wherein the first error amplitude information is amplitude information corresponding to multiply reflected first reflected light and to ambient light;
and correcting the second amplitude information of the ith pixel point according to second error amplitude information of the ith pixel point, and determining second corrected amplitude information of the ith pixel point to obtain a second corrected amplitude map, wherein the second error amplitude information is amplitude information corresponding to multiply reflected second reflected light and to ambient light.
34. The apparatus according to claim 33, wherein the conversion unit is specifically configured to:
correcting the first amplitude information of the ith pixel point according to the first error amplitude information in a neighborhood window of the ith pixel point, wherein the neighborhood window is centered on the ith pixel point;
and correcting the second amplitude information of the ith pixel point according to the second error amplitude information in the neighborhood window of the ith pixel point.
35. The apparatus according to claim 34, wherein the conversion unit is specifically configured to:
determining K pixel points according to the first amplitudes of the pixel points in the neighborhood window, wherein the K pixel points are the K pixel points whose first amplitudes rank last when the first amplitudes of the pixel points in the neighborhood window are sorted from large to small, that is, the K pixel points with the smallest first amplitudes, and K is an integer;
determining first error amplitude information of the ith pixel point according to the first amplitude information of the K pixel points;
determining first corrected amplitude information of the ith pixel point according to the first amplitude information of the ith pixel point and the first error amplitude information of the ith pixel point;
determining K pixel points according to the second amplitudes of the pixel points in the neighborhood window, wherein the K pixel points are the K pixel points whose second amplitudes rank last when the second amplitudes of the pixel points in the neighborhood window are sorted from large to small, that is, the K pixel points with the smallest second amplitudes;
determining second error amplitude information of the ith pixel point according to the second amplitude information of the K pixel points;
and determining second corrected amplitude information of the ith pixel point according to the second amplitude information of the ith pixel point and the second error amplitude information of the ith pixel point.
36. The apparatus according to any of claims 32-35, wherein the conversion unit is specifically configured to:
determining a first depth value of an ith pixel point according to a first corrected phase difference contained in first corrected amplitude information of the ith pixel point to obtain a first depth map, wherein the first depth map is used for representing first depth information of the target object;
and determining a second depth value of the ith pixel point according to a second corrected phase difference contained in the second corrected amplitude information of the ith pixel point to obtain a second depth map, wherein the second depth map is used for representing second depth information of the target object.
37. The apparatus according to claim 36, wherein the conversion unit is specifically configured to:
determining a first depth value of the ith pixel point according to the first corrected phase difference of the ith pixel point, the speed of light, and the frequency of the first emitted light;
and the determining a second depth value of the ith pixel point according to the second corrected phase difference contained in the second corrected amplitude information of the ith pixel point comprises:
determining the second depth value of the ith pixel point according to the second corrected phase difference of the ith pixel point, the speed of light, and the frequency of the second emitted light.
38. The apparatus according to any of claims 32-37, wherein the conversion unit is specifically configured to:
and determining a target depth value of the ith pixel point in the target depth map according to the first depth value of the ith pixel point in the first depth map and the second depth value of the ith pixel point in the second depth map so as to obtain the target depth map.
39. The apparatus according to claim 38, wherein the conversion unit is specifically configured to:
determining a first confidence coefficient according to a first depth value of an ith pixel point in the first depth map;
determining a second confidence coefficient according to a second depth value of an ith pixel point in the second depth map;
and comparing the first confidence coefficient and the second confidence coefficient of the ith pixel point, and determining the depth value corresponding to the larger confidence coefficient as the target depth value of the ith pixel point in the target depth map.
40. A terminal device, comprising: at least one processor, a memory, a sensor, a first projector, and a second projector, wherein the first projector and the second projector are configured to emit emitted light of different forms, the sensor is configured to perform imaging based on the emitted light of different forms, the memory is configured to store a computer program and instructions, and the processor is configured to invoke the computer program and instructions to cooperate with the sensor, the first projector, and the second projector in performing the image generation method of any one of claims 8-23.
CN202010132844.7A 2020-02-29 2020-02-29 Image generation method and device Pending CN113325437A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010132844.7A CN113325437A (en) 2020-02-29 2020-02-29 Image generation method and device

Publications (1)

Publication Number Publication Date
CN113325437A true CN113325437A (en) 2021-08-31

Family

ID=77412898

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010132844.7A Pending CN113325437A (en) 2020-02-29 2020-02-29 Image generation method and device

Country Status (1)

Country Link
CN (1) CN113325437A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1998054593A1 (en) * 1997-05-30 1998-12-03 British Broadcasting Corporation Position determination
CN105556337A (en) * 2013-05-24 2016-05-04 微软技术许可有限责任公司 Indirect reflection suppression in depth imaging
KR20160090464A (en) * 2015-01-21 2016-08-01 주식회사 히타치엘지 데이터 스토리지 코리아 Method for generating depth map in TOF camera
CN107370951A (en) * 2017-08-09 2017-11-21 广东欧珀移动通信有限公司 Image processing system and method
CN110494763A (en) * 2017-04-06 2019-11-22 微软技术许可有限责任公司 Time-of-flight camera

Similar Documents

Publication Publication Date Title
US10827126B2 (en) Electronic device for providing property information of external light source for interest object
CN110597512B (en) Method for displaying user interface and electronic equipment
US10009534B2 (en) Device and method for detecting focus of electronic device
US10410407B2 (en) Method for processing image and electronic device thereof
CN112262563A (en) Image processing method and electronic device
US20230351570A1 (en) Image processing method and apparatus
CN113596242A (en) Sensor adjusting method and device and electronic equipment
CN114782296B (en) Image fusion method, device and storage medium
WO2022057384A1 (en) Photographing method and device
WO2021180095A1 (en) Method and apparatus for obtaining pose
CN113723397B (en) Screen capturing method and electronic equipment
CN116128571B (en) Advertisement exposure analysis method and related device
CN114371985A (en) Automated testing method, electronic device, and storage medium
CN115661912B (en) Image processing method, model training method, electronic device, and readable storage medium
CN115032640B (en) Gesture recognition method and terminal equipment
CN114079726A (en) Shooting method and equipment
US20230014272A1 (en) Image processing method and apparatus
CN114125148B (en) Control method of electronic equipment operation mode, electronic equipment and readable storage medium
CN113325437A (en) Image generation method and device
CN115964231A (en) Load model-based assessment method and device
CN112860261A (en) Static code checking method and device, computer equipment and readable storage medium
CN114245011B (en) Image processing method, user interface and electronic equipment
CN114816311B (en) Screen movement method and device
WO2022022381A1 (en) Method and apparatus for generating graffiti patterns, electronic device, and storage medium
US20240137438A1 (en) Information display method and electronic device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination