WO2021087819A1

WO2021087819A1 - Information processing method, terminal device and storage medium

Info

Publication number: WO2021087819A1
Application number: PCT/CN2019/116055
Authority: WO
Inventors: 贾玉虎
Original assignee: Oppo广东移动通信有限公司
Priority date: 2019-11-06
Filing date: 2019-11-06
Publication date: 2021-05-14
Also published as: CN114391259A; CN114391259B

Abstract

Disclosed is an information processing method, comprising: in the case that depth information of a target object is obtained by means of a depth information sensor, obtaining original depth information corresponding to the depth information, the original depth information representing an acquisition state of the depth information acquired by the depth information sensor or information other than the acquired depth information; obtaining video image data of the target object by means of an image sensor; and merging and encoding the original depth information and the video image data to obtain a video image code stream, and outputting the video image code stream. Also disclosed are another information processing method, a terminal device and a storage medium.

Description

Information processing method, terminal equipment and storage medium

Technical field

The present invention relates to computer technology, in particular to an information processing method, terminal equipment and storage medium.

Background technique

In today's society, more and more terminals are equipped with camera devices, so that users can take pictures or videos anytime and anywhere. In practical applications, the encoding end uses depth information sensors, such as Time of Flight (TOF) cameras and binocular cameras, to obtain the depth information of the target object, and the decoding end uses that depth information to restore a depth image of the target object. However, the depth image only provides the depth information of the target object, and cannot improve the image quality of the video image of the target object.

Summary of the invention

The embodiment of the present invention provides an information processing method, terminal device and storage medium, which can improve the image quality of the video image of the target object.

In the first aspect, an embodiment of the present invention provides an information processing method, including:

When depth information of a target object is acquired through a depth information sensor, acquiring original depth information corresponding to the depth information, where the original depth information represents the acquisition state in which the depth information sensor collects the depth information, or information other than the collected depth information;

Acquiring video image data of the target object through an image sensor;

The original depth information and the video image data are combined and encoded to obtain a video image code stream, and the video image code stream is output.

In the second aspect, an embodiment of the present invention provides an information processing method, including:

Receiving a video image code stream, where the video image code stream is obtained by combining and encoding original depth information and video image data, the original depth information is acquired when the depth information of the target object is acquired through the depth information sensor, the video image data is video image data of the target object acquired by an image sensor, and the original depth information represents the acquisition state in which the depth information sensor collects the depth information, or information other than the collected depth information;

Decoding the video image code stream to obtain the original depth information and the video image corresponding to the video image data;

Image processing is performed on the original depth information and the video image to obtain a target video image.

In a third aspect, an embodiment of the present invention provides a terminal device, including:

The first acquiring unit is configured to acquire original depth information corresponding to the depth information when the depth information of the target object is acquired through the depth information sensing unit, where the original depth information represents the acquisition state in which the depth information sensing unit collects the depth information, or information other than the collected depth information;

The second acquiring unit is configured to acquire the video image data of the target object through an image sensing unit;

An encoding unit, configured to merge and encode the original depth information and the video image data to obtain a video image code stream;

The output unit is configured to output the video image code stream.

In a fourth aspect, an embodiment of the present invention provides a terminal device, including:

The receiving unit is configured to receive a video image code stream, where the video image code stream is obtained by combining and encoding original depth information and video image data, the original depth information is acquired when the depth information of the target object is acquired through the depth information sensing unit, and the video image data is video image data of the target object acquired by the image sensing unit; the original depth information represents the acquisition state in which the depth information sensing unit collects the depth information, or information other than the collected depth information;

A decoding unit configured to decode the video image code stream to obtain the original depth information and the video image corresponding to the video image data;

The image processing unit is configured to perform image processing on the original depth information and the video image to obtain a target video image.

In a fifth aspect, an embodiment of the present invention provides a terminal device, including a processor and a memory configured to store a computer program that can run on the processor, wherein the processor is configured to execute, when running the computer program, the steps of the above-mentioned information processing method performed by the terminal device.

In a sixth aspect, an embodiment of the present invention provides a storage medium that stores an executable program, and when the executable program is executed by a processor, the above-mentioned information processing method executed by the terminal device is implemented.

The information processing method provided by the embodiment of the present invention includes the following. At the encoding end, when the depth information of the target object is acquired through the depth information sensor, the original depth information corresponding to the depth information is acquired; the video image data of the target object is acquired through the image sensor; and the original depth information and the video image data are merged and encoded to obtain a video image code stream, which is then output. At the decoding end, the video image code stream is received; the video image code stream is decoded to obtain the original depth information and the video image corresponding to the video image data; and image processing is performed on the original depth information and the video image to obtain the target video image. In this way, the original depth information obtained by the depth information sensor is written directly into the video image code stream at the encoding end and parsed at the decoding end, and the parsed original depth information is applied to the video image obtained from the image data collected by the image sensor to obtain the target video image. This improves the quality of the video image and brings users a more realistic image and video experience.

Description of the drawings

FIG. 1A is a schematic diagram of an optional structure of an information processing system according to an embodiment of the present invention;

FIG. 1B is a schematic diagram of an optional structure of an encoding end according to an embodiment of the present invention;

FIG. 1C is a schematic diagram of an optional structure of a decoding end according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of an optional processing flow of an information processing method according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of an optional processing flow of an information processing method according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of an optional processing flow of an information processing method according to an embodiment of the present invention;

FIG. 5 is a schematic diagram of an optional processing flow of an information processing method according to an embodiment of the present invention;

FIG. 6 is a schematic diagram of an optional processing flow of an information processing method according to an embodiment of the present invention;

FIG. 7 is a schematic diagram of an optional processing flow of an information processing method according to an embodiment of the present invention;

FIG. 8A is a schematic diagram of an optional framework of an information processing system according to an embodiment of the present invention;

FIG. 8B is a schematic diagram of an optional framework of an information processing system according to an embodiment of the present invention;

FIG. 9A is a schematic diagram of an optional framework of a decoding end according to an embodiment of the present invention;

FIG. 9B is a schematic diagram of an optional framework of a decoding end according to an embodiment of the present invention;

FIG. 9C is a schematic diagram of an optional framework of a decoding end according to an embodiment of the present invention;

FIG. 9D is a schematic diagram of an optional framework of a decoding end according to an embodiment of the present invention;

FIG. 10 is a schematic diagram of sampling original depth information according to an embodiment of the present invention;

FIG. 11 is a schematic diagram of an optional structure of a terminal device according to an embodiment of the present invention;

FIG. 12 is a schematic diagram of an optional structure of a terminal device according to an embodiment of the present invention;

FIG. 13 is a schematic diagram of an optional structure of an electronic device provided by an embodiment of the present invention.

Detailed description

In order to understand the features and technical content of the embodiments of the present invention in more detail, the implementation of the embodiments of the present invention will be described in detail below with reference to the accompanying drawings. The attached drawings are for reference and explanation purposes only, and are not used to limit the embodiments of the present invention.

Before describing in detail the information processing method provided by the embodiment of the present invention, depth images are introduced first.

A depth image, also called a range image, is an image that uses the distance (depth) from the image sensor to each point in the scene as the pixel value, and it directly reflects the geometric shape of the visible surface of the target object. A depth image can be converted into point cloud data through coordinate conversion, and regularly organized point cloud data that carries the necessary information can also be converted back into depth image data.
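As a rough illustration of the coordinate conversion mentioned above, the following Python sketch back-projects a depth image into point cloud data. The pinhole camera model and the intrinsic parameters fx, fy, cx and cy are assumptions made for this example; the description does not specify a particular camera model.

```python
def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth image (rows of depth values) into 3D points.

    Assumes a pinhole camera with focal lengths (fx, fy) and principal
    point (cx, cy); these intrinsics are hypothetical example values.
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue  # skip pixels that carry no valid depth
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


depth_image = [[1.0, 1.0],
               [2.0, 0.0]]
cloud = depth_to_point_cloud(depth_image, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
```

The inverse direction (point cloud to depth image) would project each point back with the same intrinsics, which is why the point cloud must carry "the necessary information" about the sensor.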

Here, the encoder end performs video encoding on the depth image captured by the depth information sensor to obtain the encoded depth image information, and the decoder end can only restore the depth image according to the encoded depth image information. However, the amount of information received by the depth information sensor far exceeds that of the depth image. These massive amounts of information are discarded as redundancy after the depth image is generated. Therefore, in the above solution, other functions of the redundant information, such as image enhancement at the decoding end, are not considered.

Based on the foregoing problems, the embodiments of the present invention provide an information processing method. The information processing method of the embodiments of the present invention can be applied to an information processing system.

Exemplarily, the information processing system 100 applied in the embodiment of the present invention may be as shown in FIG. 1A. The information processing system 100 may include an encoding terminal 101 and a decoding terminal 102. The encoding terminal 101 is used to collect video image data and original depth information, and encode the video image data and original depth information to form a video image code stream. The decoding terminal 102 is used to decode the video image code stream to obtain video image data and original depth information, and perform image processing on the video image data and original depth information to obtain a target video image.

The encoding terminal 101 and the decoding terminal 102 may include desktop computers, mobile computing devices, notebook (e.g., laptop) computers, tablet computers, set-top boxes, smart phones and other handheld devices, televisions, cameras, display devices, digital media players, video game consoles, on-board computers, or the like.

As shown in FIG. 1A, the decoding end 102 can receive the encoded video image stream from the encoding end 101 via the link 103. The link 103 may include one or more media and/or devices capable of moving the video image stream from the encoding end 101 to the decoding end 102.

In an example, the link 103 may include one or more communication media that enable the encoding end 101 to directly send the encoded video data to the decoding end 102 in real time. In this example, the encoding end 101 can modulate the video image code stream according to a communication standard (for example, a wireless communication protocol), and can send the modulated video image code stream to the decoding end 102.

In an example, the link 103 may include a storage medium storing a video image code stream formed by the encoding terminal 101. In this example, the decoding end 102 can access the storage medium through disk access or card access. The storage medium may include a variety of locally accessible data storage media, such as Blu-ray discs, DVDs, CD-ROMs, flash memory, or other suitable digital storage media for storing video image code streams.

In another example, the link 103 may include a file server or another intermediate storage device that stores the video image code stream formed by the encoding terminal 101. In this example, the decoding end 102 can access the video image code stream stored in the file server or other intermediate storage device through streaming or downloading. The file server may be any type of server capable of storing video image code streams and sending them to the decoding end 102. Examples of file servers include web servers (for example, for websites), file transfer protocol servers, network attached storage devices, and local disk drives.

The decoding end 102 can access the video image code stream via a standard data connection (for example, an Internet connection). Examples of data connection types include wireless links (for example, Wi-Fi connections), wired connections (for example, DSL, cable modem, etc.), or a combination of both, suitable for accessing the video image code stream stored on the file server.

As shown in FIG. 1B, the encoding terminal 101 includes a depth information sensor 1011, an image sensor 1012, and a video image encoder 1013. The depth information sensor 1011 is used to obtain original depth information, the image sensor 1012 is used to obtain video image data, and the video image encoder 1013 is used to encode the original depth information and video image data to form a video image code stream.

As shown in FIG. 1C, the decoding terminal 102 includes a video image decoder 1021 and an image processor 1022. The video image decoder 1021 is used to decode a video image stream to obtain the original depth information and the video image corresponding to the video image data. The image processor 1022 is used to process the original depth information and the video image to obtain the target video image. Here, the original depth information acts on the video image, and high-quality video images with high definition and low noise can be obtained.

In an example, as shown in FIG. 1C, the decoding end 102 further includes: a depth image generator 1023, configured to generate a depth image based on the original depth information.

An optional processing procedure of the information processing method provided by the embodiment of the present invention is applied to the encoding end, as shown in FIG. 2, and includes the following steps:

S201: In a case where the depth information of the target object is acquired by the depth information sensor, the original depth information corresponding to the depth information is acquired.

The original depth information represents a collection state of the depth information collected by the depth information sensor or information other than the collected depth information.

The depth information sensor is a sensor that can collect the depth information of the target object. In an example, the depth information sensor is a TOF module that uses a TOF ranging method. In an example, the depth information sensor is a binocular camera.

In the embodiment of the present invention, when the depth information sensor collects depth information, the encoding end obtains original depth information through the depth information sensor, and the original depth information includes at least one of the following: charge information, phase information, and attribute parameters of the depth information sensor. The charge information and the phase information are information other than the depth information collected by the depth information sensor, and the attribute parameters of the depth information sensor represent the state in which the depth information sensor collects the depth information.

Taking the original depth information as charge information as an example, the charge information at a time point can be embodied as a charge image. Here, the optical signal received when the depth information sensor collects the depth information is acquired, and the optical signal is converted into an electrical signal through photoelectric conversion, and the electrical signal is quantized to generate a charge image.

Taking the original depth information as phase information as an example, the phase information at a time point can be embodied as a phase image.
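To give a sense of why phase information is useful to a receiver, the sketch below applies the standard continuous-wave TOF relation d = c·φ / (4π·f) to a small phase image. The continuous-wave operating principle and the 20 MHz modulation frequency are assumptions made for illustration only; the description does not state how the sensor relates phase to depth.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def phase_to_depth(phase_rad, modulation_hz):
    """Distance for one phase shift, assuming continuous-wave TOF ranging."""
    return SPEED_OF_LIGHT * phase_rad / (4.0 * math.pi * modulation_hz)


def phase_image_to_depth(phase_image, modulation_hz):
    """Apply phase_to_depth to every pixel of a phase image."""
    return [[phase_to_depth(p, modulation_hz) for p in row] for row in phase_image]


MOD_HZ = 20e6  # hypothetical modulation frequency
depth_image = phase_image_to_depth([[0.0, math.pi]], MOD_HZ)
max_range = phase_to_depth(2.0 * math.pi, MOD_HZ)  # unambiguous range, c / (2 f)
```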

Taking the original depth information as the attribute parameters of the depth information sensor as an example, the original depth information may include: temperature, pose and other attribute parameters.

S202: Obtain video image data of the target object through an image sensor.

The encoding terminal obtains the video image data of the target object through the image sensor in the image preview or video shooting scene, where the video image data includes at least one image frame.

In the embodiment of the present invention, the original depth information corresponds one-to-one to the image frames. In an example, different charge images or phase images correspond to different image frames.

S203: Perform merge encoding on the original depth information and the video image data to obtain a video image code stream, and output the video image code stream.

The encoding end uses the video image encoder to merge and encode the original depth information and the video image data. The video image encoder outputs the video image code stream, and the encoding end outputs this video image code stream to the decoding end, so that the decoding end performs image processing on the video image corresponding to the video image data based on the original depth information.

Optionally, the video image encoder adopts a video image encoding and decoding protocol to encode video image frames or original depth information to obtain the video image code stream; the video image encoding and decoding protocol can be H.264, H.265, H.266, VP9, AV1, etc.

Optionally, the original depth information and the video image data are encoded using a video image encoding and decoding protocol. In this case, the data carried by the video image code stream does not include the depth information.

In the embodiment of the present invention, when the data carried by the video image code stream does not include depth information, the encoding end may obtain only the original depth information of the depth information sensor when the depth information is collected, without obtaining the depth information collected by the depth information sensor, or may discard the collected depth information.

Optionally, the original depth information and the video image data are encoded using the video image encoding and decoding protocol, and the depth information collected by the depth information sensor is also encoded using the video image encoding and decoding protocol. In this case, the data carried by the video image code stream includes: original depth information, depth information and video image data.

In the embodiment of the present invention, the processing of the depth information collected by the depth information sensor is not limited in any way.

Optionally, the video image encoder adopts an industry standard or a specific standard of a specific organization to encode video image frames or original depth information to obtain a video image code stream.

The encoding end may input all the original depth information into the video image encoder to encode all the original depth information, or only input part of the original depth information into the video image encoder to encode part of the original depth information. Optionally, part of the original depth information is original depth information corresponding to the specified image frame. Optionally, part of the original depth information is the original depth information corresponding to the specified image position.

Taking the case where part of the original depth information is the original depth information corresponding to the designated image frame as an example, combining and encoding the original depth information and the video image data to obtain a video image code stream includes: combining and encoding the original depth information corresponding to the designated image frame, among the image frames corresponding to the video image data, with the video image data to obtain a video image code stream.

Optionally, the designated image frame is one of the image frames corresponding to the video image data. Optionally, the designated image frame includes a plurality of image frames among the image frames corresponding to the video image data.

The embodiment of the present invention does not impose any restriction on the number of designated image frames.

The encoding end only merges and encodes the original depth information corresponding to the designated image frame and the video image data, and does not encode the original depth information corresponding to non-designated image frames, that is, image frames other than the designated image frame among the image frames corresponding to the video image data.
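The selection of original depth information by designated image frame can be sketched as follows. The dictionary keyed by frame index and the function name are hypothetical; the description does not prescribe how frames and original depth information are indexed.

```python
def select_raw_depth_for_frames(raw_depth_by_frame, designated_frames):
    """Keep original depth information only for the designated frame indices.

    raw_depth_by_frame maps a frame index to the original depth information
    (for example a charge image or phase image) captured with that frame.
    """
    return {
        index: raw_depth_by_frame[index]
        for index in sorted(designated_frames)
        if index in raw_depth_by_frame
    }


raw_depth_by_frame = {0: b"charge-0", 1: b"charge-1", 2: b"charge-2", 3: b"charge-3"}
to_encode = select_raw_depth_for_frames(raw_depth_by_frame, {0, 2})
```

Only the entries in `to_encode` would be passed to the video image encoder together with the video image data; the entries for frames 1 and 3 are left unencoded.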

Taking the case where part of the original depth information is the original depth information corresponding to the designated image position as an example, combining and encoding the original depth information and the video image data to obtain a video image code stream includes: combining and encoding the original depth information corresponding to the designated image position with the video image data to obtain a video image code stream.

Optionally, the designated image position is the position of a designated point within the image acquisition range. Optionally, the designated image position is the position of a designated area within the image acquisition range. The embodiment of the present invention does not limit the extent or the location of the designated image position in any way.

The encoding end only merges and encodes the original depth information corresponding to the designated image position and the video image data, and does not encode the original depth information corresponding to non-designated image positions, that is, positions other than the designated image position in the image frame.
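When the designated image position is an area, the selection amounts to cropping the original depth information to that area. The following sketch assumes a rectangular area and a row-major phase image; both are illustrative choices rather than requirements of the embodiment.

```python
def crop_region(raw_image, top, left, height, width):
    """Return the part of a raw (charge or phase) image inside a rectangle."""
    return [row[left:left + width] for row in raw_image[top:top + height]]


# A 3x4 example phase image whose values encode (row, column) for readability.
phase_image = [[r * 10 + c for c in range(4)] for r in range(3)]
roi = crop_region(phase_image, top=1, left=1, height=2, width=2)
```

A designated point is the special case `height=1, width=1`.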

In the embodiment of the present invention, the encoding method for the encoding end to merge and encode the original depth information and the video image data includes one of the following:

Encoding method 1: Perform mixed encoding on the original depth information and the video image data according to the correlation between the original depth information and the video image data to obtain a video image code stream;

Encoding method 2: Separately encode the original depth information and the video image data to obtain a video image code stream including a first code stream and a second code stream, wherein the first code stream is a code stream obtained after encoding the original depth information, and the second code stream is a code stream obtained after encoding the video image data.
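Encoding method 2 can be pictured as two independently encoded payloads carried in one stream. The sketch below is only an illustration: zlib stands in for both encoders (a real system would use a video codec such as those named above), and the 8-byte length header is a hypothetical container layout that is not defined by the embodiment.

```python
import struct
import zlib


def encode_separately(raw_depth: bytes, video: bytes) -> bytes:
    """Build a stream made of a first code stream and a second code stream."""
    first = zlib.compress(raw_depth)   # stand-in for encoding original depth information
    second = zlib.compress(video)      # stand-in for encoding video image data
    header = struct.pack(">II", len(first), len(second))  # hypothetical layout
    return header + first + second


def decode_separately(stream: bytes):
    """Split the stream and decode each code stream independently."""
    first_len, second_len = struct.unpack_from(">II", stream, 0)
    first = stream[8:8 + first_len]
    second = stream[8 + first_len:8 + first_len + second_len]
    return zlib.decompress(first), zlib.decompress(second)


stream = encode_separately(b"phase-image-bytes", b"frame-bytes")
raw_depth, video = decode_separately(stream)
```

Because the two payloads are delimited explicitly, a receiver that has no use for the original depth information could skip the first code stream and decode only the second.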

In the first encoding method, the encoding and decoding protocols used for encoding the original depth information and the video image data are the same.

Optionally, in the first encoding method, the encoding information in the video image code stream is mixed encoding information obtained by jointly encoding the original depth information and the video image data. The video image encoder can use the spatial correlation or temporal correlation between the original depth information and the video image data to jointly encode the original depth information and the video image data.

Optionally, in the first encoding method, the first encoding information corresponding to the original depth information is written in a designated position of the second encoding information corresponding to the video image data. Optionally, the designated position may be an image information header, a sequence information header, an additional parameter set, or any other position.

Optionally, in the first encoding method, the spatial correlation or temporal correlation between the original depth information and the video image data is used to encode the original depth information to obtain the first encoding information, the video image data is encoded to obtain the second encoding information, and the first encoding information is written into the designated position of the second encoding information to obtain a video image code stream.

In the second encoding method, the encoding and decoding protocol used for the original depth information is independent of the encoding and decoding protocol used for encoding the video image data. Optionally, the encoding and decoding protocol used for the original depth information is the same as the encoding and decoding protocol used for encoding the video image data. Optionally, the encoding and decoding protocol used for the original depth information is different from the encoding and decoding protocol used for encoding the video image data.

In an embodiment, as shown in FIG. 3, before S203, the method includes:

S204A: Preprocess the original depth information.

In S203, the original depth information and the video image data are combined and coded, which can be executed as S203A: the preprocessed original depth information and the video image data are combined and coded to obtain a video image code stream.

In the embodiment of the present invention, the preprocessing can be one or more of processing methods such as phase calibration, filtering, denoising and signal amplification, or other processing methods. The specific preprocessing can be determined according to actual conditions, and the embodiment of the present invention does not limit this.
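As one possible instance of the denoising mentioned above, the sketch below applies a 3x3 median filter to a charge image. The choice of a median filter and the window size are assumptions for illustration; the embodiment leaves the specific preprocessing open.

```python
from statistics import median_low


def median_filter_3x3(image):
    """Replace each pixel with the (low) median of its 3x3 neighbourhood.

    Windows are clipped at the image border, so edge pixels use fewer samples.
    """
    height, width = len(image), len(image[0])
    filtered = []
    for r in range(height):
        row = []
        for c in range(width):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - 1), min(height, r + 2))
                for cc in range(max(0, c - 1), min(width, c + 2))
            ]
            row.append(median_low(window))
        filtered.append(row)
    return filtered


noisy_charge_image = [[10, 10, 10],
                      [10, 255, 10],
                      [10, 10, 10]]
clean_charge_image = median_filter_3x3(noisy_charge_image)
```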

Optionally, the encoding end preprocesses the original depth information through the depth information sensor.

In an embodiment, as shown in FIG. 4, before S203, the method includes:

S204B: Perform redundancy elimination processing on the original depth information to eliminate redundant information in the original depth information.

In S203, the original depth information and the video image data are combined and coded, which can be executed as S203B: the original depth information and the video image data that have undergone redundancy elimination processing are combined and coded to obtain a video image code stream.

The encoding end can eliminate redundant information in the original depth information by performing redundancy elimination processing on the original depth information, thereby compressing the information amount of the original depth information, and reducing the size of the video data stream.

In the embodiment of the present invention, performing redundancy elimination processing on the original depth information includes at least one of the following:

Performing redundancy elimination processing on the original depth information based on phase correlation;

Performing redundancy elimination processing on the original depth information based on spatial correlation;

Performing redundancy elimination processing on the original depth information based on time correlation;

Performing redundancy elimination processing on the original depth information based on the specified depth;

Performing redundancy elimination processing on the original depth information based on frequency domain correlation;

Perform redundancy elimination processing on the coded bits of the original depth information based on the correlation between the coded binary data.

Optionally, the original depth information is converted into the frequency domain, and the original depth information converted into the frequency domain is subjected to redundancy elimination processing based on the frequency domain correlation.
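Frequency-domain redundancy elimination can be sketched with a discrete cosine transform: a smooth row of original depth information concentrates its energy in a few coefficients, and the rest can be dropped. The use of a DCT and the rule of keeping the largest-magnitude coefficients are assumptions made for this example; the embodiment does not name a particular transform or selection rule.

```python
import math


def dct(signal):
    """Unnormalized DCT-II of a 1-D sequence."""
    n = len(signal)
    return [
        sum(signal[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        for k in range(n)
    ]


def idct(coeffs):
    """Inverse of dct() above (a scaled DCT-III)."""
    n = len(coeffs)
    return [
        (coeffs[0] / 2
         + sum(coeffs[k] * math.cos(math.pi * (i + 0.5) * k / n) for k in range(1, n)))
        * 2 / n
        for i in range(n)
    ]


def keep_largest(coeffs, count):
    """Zero every coefficient except the `count` largest in magnitude."""
    order = sorted(range(len(coeffs)), key=lambda k: abs(coeffs[k]), reverse=True)
    keep = set(order[:count])
    return [c if k in keep else 0.0 for k, c in enumerate(coeffs)]


row = [100.0, 101.0, 102.0, 103.0, 104.0, 105.0, 106.0, 107.0]
compact = keep_largest(dct(row), 3)   # only three coefficients need encoding
approx = idct(compact)                # approximate reconstruction at the decoder
```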

Optionally, the specified depth is the range of scene-sensitive depths for the scene in which the target object is located. When redundancy elimination processing is performed on the original depth information based on the specified depth, the original depth information corresponding to depths outside the range of scene-sensitive depths is eliminated as redundancy.
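Elimination based on a specified depth can be sketched as a simple range filter. The sample layout (a depth value paired with its original depth information) and the 0.5 m to 3.0 m range are hypothetical; the embodiment does not define how the range is chosen.

```python
def drop_outside_depth_range(samples, near, far):
    """Keep only samples whose depth lies inside [near, far].

    Each sample is a (depth_in_metres, original_depth_information) pair.
    """
    return [(depth, raw) for depth, raw in samples if near <= depth <= far]


samples = [(0.2, "raw-a"), (1.0, "raw-b"), (2.5, "raw-c"), (8.0, "raw-d")]
kept = drop_outside_depth_range(samples, near=0.5, far=3.0)
```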

Optionally, perform entropy coding on the original depth information, and perform redundancy elimination processing on the coded bits of the entropy coding result of the original depth information based on the correlation between the coded binary data.

Taking redundancy elimination processing of the original depth information based on spatial correlation as an example, the original depth information at the encoding end corresponds to at least one viewpoint. Interval viewpoints are determined from the at least one viewpoint, and the original depth information corresponding to the interval viewpoints is used as interval original depth information. The original depth information other than the interval original depth information is eliminated as redundancy, and the interval original depth information and the video image data are combined and encoded to obtain the video image code stream.

Taking redundancy elimination processing of the original depth information based on time correlation as an example, the encoding end obtains the original depth information within a certain period of time, samples the obtained original depth information at a sampling interval, and retains the sampled original depth information. The original depth information other than the sampled original depth information is eliminated as redundancy, and the sampled original depth information and the video image data are combined and encoded to obtain the video image code stream.
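The time-based sampling described above can be sketched as keeping every N-th item of the original depth information. The list representation and the interval of 3 are illustrative assumptions.

```python
def sample_by_interval(raw_depth_sequence, interval):
    """Keep every `interval`-th item; the rest is treated as redundancy."""
    if interval < 1:
        raise ValueError("interval must be at least 1")
    return raw_depth_sequence[::interval]


sequence = [f"raw-{t}" for t in range(7)]  # original depth information over time
kept = sample_by_interval(sequence, 3)
```

The same pattern applies to the spatial case if the sequence is indexed by viewpoint instead of by time.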

An optional processing flow of the information processing method provided by the embodiment of the present invention, which is applied to the decoding end, as shown in FIG. 5, includes the following steps:

S501: Receive a video image code stream.

The decoding end receives the video image code stream sent by the encoding end through the link. The video image code stream is obtained by combining and encoding original depth information and video image data; the original depth information is obtained when the depth information of the target object is acquired by the depth information sensor, and the video image data is data of the target object acquired by the image sensor. The original depth information represents a collection state of the depth information collected by the depth information sensor, or information other than the collected depth information.

S502: Decode the video image code stream to obtain the original depth information and a video image corresponding to the video image data.

Here, the video image code stream is decoded by a video image decoder to obtain the original depth information and the video image corresponding to the video image data.

The decoding end sends the received video image code stream to the video image decoder, and the video image decoder decodes the video image code stream.

Optionally, the video image decoder and the video image encoder on the encoding end support the same video image codec protocol.

Optionally, when the video image encoder performs mixed encoding on the original depth information and the video image data, the video image decoder performs hybrid decoding on the video image code stream to obtain the original depth information and the video image corresponding to the video image data.

Optionally, in the case that the video image encoder encodes the original depth information and the video image data independently, the video image decoder independently decodes the first code stream and the second code stream in the video image code stream: the first code stream is decoded to obtain the original depth information, and the second code stream is decoded to obtain the video image corresponding to the video image data. Here, the video image corresponding to the video image data may also be referred to as the original video image. The original video image obtained by decoding the video image code stream may include one or more frames.

S503: Perform image processing on the original depth information and the video image to obtain a target video image.

Image processing is performed on the original depth information and the video image by an image processor to obtain a target video image.

After the decoding end obtains the original depth information and the video image through decoding, the image processor applies the original depth information to the video image and performs image processing on the video image to obtain the target video image. The image quality of the target video image is higher than that of the original video image.

Optionally, the decoding end may perform redundancy recovery on the decoded original depth information based on phase correlation, spatial correlation, time correlation, the specified depth, frequency domain correlation, and the correlation between encoded binary data, to obtain the original depth information after redundancy recovery. Image processing is then performed on the video image based on the original depth information after redundancy recovery to obtain the target video image.

Taking redundancy recovery of the decoded original depth information based on spatial correlation as an example, the decoding end performs independent decoding or hybrid decoding on the video image code stream to obtain the original depth information of the interval viewpoints and the video image of at least one viewpoint; interpolates the original depth information of the interval viewpoints to obtain the original depth information of the viewpoints other than the interval viewpoints; and uses the original depth information of the interval viewpoints and of the other viewpoints to perform image processing on the video image to obtain the target video image.

Taking redundancy recovery of the decoded original depth information based on time correlation as an example, the decoding end performs independent decoding or hybrid decoding on the video image code stream to obtain the sampled original depth information, restores the original depth information lying between adjacent samples from the temporally adjacent sampled original depth information, and uses the decoded original depth information together with the restored original depth information to perform image processing on the video image to obtain the target video image.

Optionally, the video image decoder and the image processor are independent of each other. Optionally, the image processor is integrated in the video image decoder.

In an example, taking the original depth information as charge information as an example, performing image processing on the original depth information and the video image to obtain a target video image includes: performing denoising processing or white balance adjustment on the video image according to the charge information to obtain the target video image.

In an example, taking the original depth information as phase information as an example, performing image processing on the original depth information and the video image to obtain a target video image includes: deblurring the video image according to the phase information to obtain the target video image.

The image processor in the decoding end analyzes each phase information to obtain the analysis result, and uses the analysis result to deblur the corresponding video frame to obtain the target video image.

In an example, in a High Dynamic Range (HDR) video, each frame of HDR image is obtained by fusing a long exposure image and a short exposure image. At the current moment, for the same scene, the image sensor is controlled to shoot a long exposure image and a short exposure image, and the depth information sensor is controlled to shoot a phase image, which is used as the original depth information. The phase image and the long exposure image are mixed-encoded or independently encoded, and the phase image and the short exposure image are mixed-encoded or independently encoded, to obtain the video image code stream, which is output to the decoding end. The decoding end decodes the long exposure image, the short exposure image, and the phase image from the video image code stream, and then uses the phase image to deblur the long exposure image and the short exposure image respectively. The deblurred long exposure image and the deblurred short exposure image are fused to obtain a clearer HDR image.

As shown in Figure 6, after S502, it also includes:

S504: Restore the original depth information to obtain a depth image.

Optionally, the original depth information is restored by a depth image generator to obtain the depth image.

It should be noted that, in the embodiment of the present invention, FIG. 6 shows S504 after S503 only as an example of the order of obtaining the target video image and obtaining the depth image. In practical applications, S504 and S503 may be executed in any order.

Optionally, the depth image generator and the video image decoder are independent of each other. Optionally, the depth image generator is integrated in the video image decoder.

In an example, the video image decoder, the depth image generator, and the image processor are independent of each other. At this time, the video image code stream is input to the video image decoder, and the video image decoder outputs the original depth information and the video image; the original depth information and the video image are input to the image processor, and the original depth information is input to the depth image generator; the image processor outputs the target video image, and the depth image generator outputs the depth image.

In an example, the depth image generator and the image processor are integrated in the video image decoder. At this time, the video image code stream is input to the video image decoder, and the video image decoder outputs the target video image and the depth image.

In an example, the depth image generator is integrated in the video image decoder, and the image processor and the video image decoder are independent of each other. At this time, the video image code stream is input to the video image decoder, the video image decoder outputs the original depth information, the video image, and the depth image, the original depth information and the video image are input to the image processor, and the image processor outputs the target video image.

In an example, the image processor is integrated in the video image decoder, and the depth image generator and the video image decoder are independent of each other. At this time, the video image code stream is input to the video image decoder, the video image decoder outputs the original depth information and the target video image, the original depth information is input to the depth image generator, and the depth image generator outputs the depth image.

The embodiment of the present invention also provides an information processing method, which is applied to an information processing system including an encoding end and a decoding end, as shown in FIG. 7, including:

S701: The encoding end acquires original depth information corresponding to the depth information when the depth information sensor collects the depth information of the target object.

The original depth information represents a collection state of the depth information collected by the depth information sensor or information other than the collected depth information;

S702: The encoding end obtains the video image data of the target object through the image sensor;

S703: The encoding end merges and encodes the original depth information and the video image data to obtain a video image code stream, and outputs the video image code stream.

S704: The decoding end receives the video image code stream.

S705: The decoding end decodes the video image code stream to obtain the original depth information and a video image corresponding to the video image data.

S706: The decoding end performs image processing on the original depth information and the video image to obtain a target video image.

In the embodiment of the present invention, the decoding end receives a video image code stream that includes the encoding information of the original depth information and the encoding information of the video image data. The decoding end can therefore decode the original depth information and the video image from the video image code stream. It can then use the original depth information not only to recover the depth image, but also to perform optimization processing such as denoising, white balance adjustment, and deblurring on the video image. This improves the information utilization rate, and the optimized target video image has higher image quality than the original video image.

In the following, the information processing method provided by the embodiment of the present invention will be illustrated by using an example of a scenario.

The framework of the information system of the present invention is shown in Fig. 8A and Fig. 8B. The video image encoder 1013 merges and encodes the original depth information 801 obtained by the depth information sensor 1011 and the video image data 802 collected by the image sensor 1012 to form a video image code stream 803. After the video image decoder 1021 obtains the video image code stream 803, it parses the video image code stream 803 to obtain the original depth information 804 and the video image 805. The depth image generator 1023 restores the original depth information 804 to obtain the depth image 806, and the image processor 1022 uses the original depth information 804 to process the video image 805 obtained by the video image decoder 1021 to obtain the target video image 807. The depth image generator 1023, the image processor 1022, and the video image decoder 1021 can be independent of each other, as shown in FIG. 8A; the depth image generator 1023 and the image processor 1022 can also be components of the video image decoder 1021, as shown in FIG. 8B.

The original depth information output by the depth information sensor can be the initial data information obtained by the depth information sensor, that is, original depth information without preprocessing, or the intermediate data information obtained after the initial data information is preprocessed, that is, preprocessed original depth information. When the output information is the initial data information, it can be an electrical signal after photoelectric conversion, such as charge information or phase information. When the output information is intermediate data information, it can be intermediate image data from which a depth image can be generated, obtained by processing the initial data signal through phase calibration or other methods.

The video image encoder encodes the input original depth information to form a video image code stream. Among them, the encoding methods include:

Encoding method 1. Use the correlation between the video image data and the original depth information to perform mixed encoding on the two;

Encoding method 2. Independently encode the video image data and the original depth information respectively.

In the coding method 1, the coding information of the original depth information is in the information header, the sequence information header, the additional parameter set, or other arbitrary positions of the coding information of the video image data.

In the coding method 2, the original depth information itself is separately coded by using other correlations such as the spatial correlation or temporal correlation of the original depth information.

In the video image encoder, the original depth information corresponding to each video image can be encoded, or only the original depth information corresponding to a specified image or a specified image position can be encoded, while the original depth information corresponding to non-specified images or non-specified image positions is not encoded.

For the image processor, when generating depth of field while taking pictures or previewing scenes, the original depth information can be applied directly to the video image to form a target video image with depth of field, instead of superimposing the depth image on the video image to produce a target video image with depth of field.

In the process of encoding the original depth information at the encoding end, in order to compress the amount of data, correlations including but not limited to the following can be used to eliminate redundancy:

1. If the original depth information includes the phase information of multiple video images, use the correlation between the phases to eliminate phase data redundancy; if the original depth information is other data, use the spatial correlation between these data and other correlations to eliminate the data redundancy;

2. Use the time correlation of the original depth information to eliminate data redundancy;

3. Use the specified depth to eliminate scene-based data redundancy;

4. Convert the original depth information into the frequency domain, and use frequency domain correlation to eliminate data redundancy in the frequency domain;

5. Use the correlation between the encoded binary data to eliminate the bit redundancy of the encoding; among them, the encoding here can be entropy encoding.

In the embodiment of the present invention, in the video image code stream containing the original depth information formed by the video image encoder, the original depth information and the video image data can be decoded independently; that is, the video image code stream is decoupled or independent. As a result, video image decoders that use various standard video image codec protocols can extract only the video image from the video image code stream without extracting the original depth information, or extract only the original depth information without extracting the video image.

As shown in Figures 9A to 9D, the video image decoder, the depth image generator, and the image processor cooperate with each other to decode the video image code stream in accordance with the standard video image codec protocol and to generate the processed image and the original depth information; the standard video image codec protocol can be a private standard customized by the manufacturer or an industry standard. The video image decoder, the depth image generator, and the image processor can be composed in the following ways:

Composition 1. As shown in Figure 9A, the video image decoder 1021, the depth image generator 1023, and the image processor 1022 are independent of each other. After the video image decoder 1021 parses the video image code stream 803 to obtain the video image 805 and the original depth information 804, the original depth information 804 is sent to the depth image generator 1023 to generate the depth image 806, and the video image 805 and the original depth information 804 are sent to the image processor 1022 to generate the processed target video image 807;

Composition 2. As shown in Figure 9B, the depth image generator 1023 and the image processor 1022 are embedded in the video image decoder 1021, and the video image code stream 803 is processed inside the video image decoder 1021 to directly output the depth image 806 and the processed target video image 807;

Composition 3. As shown in Figure 9C, the depth image generator 1023 is embedded in the video image decoder 1021. The video image code stream 803 is processed inside the video image decoder 1021, and then the depth image 806 and the video image 805 are output; the video image 805 and the original depth information 804 are sent to the image processor 1022, which outputs the processed target video image 807;

Composition 4. As shown in Figure 9D, the image processor 1022 is embedded in the video image decoder 1021. The video image code stream 803 is first processed inside the video image decoder 1021, which outputs the original depth information 804 and the processed target video image 807; the original depth information 804 is sent to the depth image generator 1023, which outputs the depth image 806.

In the information processing method provided by the embodiment of the present invention, at the encoding end, the original depth information obtained by the depth information sensor is encoded to form a video image code stream for transmission; at the decoding end, the video image code stream can be used not only to recover the depth image, but also to process the original video image by analyzing the original depth information, so as to obtain a target video image with higher image quality.

In one example, the original depth information is phase information. The depth image can be recovered from multiple phase images sampled at different time points. When the original video image is blurred due to motion, the multiple phase images carry additional information from different time points, so the blurred original video image can be restored through motion estimation based on the phase information to obtain a clearer target video image.

In another example, the depth information sensor is a TOF architecture or module, and the original depth information is charge information. The charge information can be used not only to generate the depth image, but also to judge the noise and the external visible light of the shooting scene, so that the original video image can be denoised and white-balanced according to the charge information to obtain a video image with better image quality, giving users a more attractive and realistic image and video experience.

In the embodiment of the present invention, the method for acquiring the original depth information includes but is not limited to the following methods:

Method 1

Using the continuous modulation TOF method, under two different transmission signal frequencies and by controlling the integration time, a total of 8 groups of optical signals with different phases are sampled by the TOF sensor. The 8 groups of optical signals are photoelectrically converted to obtain 8 groups of charge signals, which are then quantized to 10 bits to generate 8 original charge images. The encoding end encodes these 8 original charge images, together with attribute parameters of the TOF sensor such as temperature, as the original depth information; alternatively, the 8 original charge images are preprocessed to generate two pieces of process depth data and one piece of background data, and the two pieces of process depth data and one piece of background data are encoded as the original depth information.
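To make the relationship between charge signals and depth concrete, the following is a minimal sketch of the commonly used four-phase estimate for continuous-wave TOF. The embodiment does not give a formula, so this rests on assumptions: four charge samples per modulation frequency at phase offsets of 0, 90, 180, and 270 degrees, and one particular sign convention. The function names and the full-scale parameter are hypothetical.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def phase_from_charges(q0: float, q90: float, q180: float, q270: float) -> float:
    """Four-phase estimate of the phase shift, wrapped to [0, 2*pi).

    Assumes the charge model q_k = B + A*cos(phase + offset_k); other sensors
    may use a different sign convention.
    """
    return math.atan2(q270 - q90, q0 - q180) % (2 * math.pi)


def depth_from_phase(phase: float, modulation_hz: float) -> float:
    """Distance in metres for a round-trip phase shift at one modulation frequency."""
    return SPEED_OF_LIGHT * phase / (4 * math.pi * modulation_hz)


def quantize_10bit(value: float, full_scale: float) -> int:
    """Map a charge in [0, full_scale] to a 10-bit code in 0..1023."""
    clipped = min(max(value, 0.0), full_scale)
    return round(clipped / full_scale * 1023)
```

With two modulation frequencies, the same estimate would be applied to each group of four charge images; how the two results are combined (for example, to extend the unambiguous range) is not specified in the text and is left out of this sketch.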

Method 2

Using the principle of binocular imaging, parallax and other information is calculated from two video images captured by a binocular camera according to the poses of the two video images, and the parallax information and the camera parameters are encoded as the original depth information.
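As an illustration of why parallax plus camera parameters is sufficient to recover depth at the decoding end, the sketch below applies the standard rectified-stereo relation depth = focal length × baseline / disparity. The embodiment does not state this formula; the function name and parameter units are assumptions.

```python
def depth_from_disparity(disparity_px: float, focal_length_px: float, baseline_m: float) -> float:
    """Depth in metres for one pixel of a rectified stereo pair.

    disparity_px    horizontal shift of the pixel between the two views, in pixels
    focal_length_px focal length expressed in pixels (a camera parameter)
    baseline_m      distance between the two camera centres, in metres
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_length_px * baseline_m / disparity_px
```

For example, a disparity of 20 pixels with a focal length of 1000 pixels and a 0.05 m baseline corresponds to a depth of about 2.5 m.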

In the embodiment of the present invention, taking 3D High Efficiency Video Coding (3D HEVC) as an example of the codec protocol, when the original depth information is encoded, as a possible implementation, each viewpoint and its corresponding original depth information are encoded. As another possible implementation, the original depth information can be encoded based on viewpoints: because original depth information such as phase images or charge images of different viewpoints at the same moment is strongly correlated, this correlation can be used to reduce the amount of data in the transmitted video image code stream. In one example, for three-view video encoding, only the original depth information of the left and right viewpoints needs to be retained in the video image code stream at the encoding end. At the decoding end, the original depth information of the left and right viewpoints is interpolated to obtain the original depth information of the intermediate viewpoint.
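The three-view example can be sketched as follows. This is a deliberately simplified illustration that assumes the intermediate viewpoint is estimated by a per-sample weighted blend of the left and right viewpoints; a practical view interpolation would also account for the shift between viewpoints. The function name and the `weight` parameter are hypothetical.

```python
from typing import List

Frame = List[List[float]]


def interpolate_middle_view(left: Frame, right: Frame, weight: float = 0.5) -> Frame:
    """Estimate the intermediate viewpoint's original depth data.

    weight = 0.0 reproduces the left view, 1.0 reproduces the right view.
    """
    if len(left) != len(right) or any(len(a) != len(b) for a, b in zip(left, right)):
        raise ValueError("left and right views must have the same shape")
    return [
        [(1 - weight) * a + weight * b for a, b in zip(row_l, row_r)]
        for row_l, row_r in zip(left, right)
    ]
```

Only `left` and `right` would travel in the code stream; the intermediate view is rebuilt at the decoding end.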

In the embodiment of the present invention, taking redundancy elimination of the original depth information based on time correlation as an example, as a possible implementation, not all of the original depth information needs to be encoded. Instead, the original depth information collected by the depth information sensor is sampled with a fixed step size, and only the sampled signals are encoded by the video image encoder. After the decoding end recovers the sampled signals, the original depth information that was not sampled is restored by interpolation or other methods.

In an example, as shown in Fig. 10, the original depth information includes signals numbered signal 1, signal 2, signal 3, signal 4...signal N. The original depth information is sampled with a fixed step size of 3, and the sampled original depth information includes signal 1, signal 4, signal 7...signal N. The sampled original depth information is encoded and decoded, and each non-sampled signal is restored from its adjacent signals; for example, signal 1 and signal 4 are interpolated to restore signal 2, signal 2 and signal 4 are interpolated to restore signal 3, and so on.
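The sampling and restoration in this example can be sketched as below. The sketch assumes each signal is a single number and restores every missing signal by linear interpolation between its two neighbouring sampled signals, which is one possible interpolation and not necessarily the one used in Fig. 10; the function names are hypothetical.

```python
from typing import List, Sequence


def sample_fixed_step(signals: Sequence[float], step: int) -> List[float]:
    """Keep every `step`-th signal, starting with the first one."""
    return list(signals[::step])


def restore_by_interpolation(sampled: Sequence[float], step: int, total: int) -> List[float]:
    """Rebuild `total` signals from the sampled ones by linear interpolation."""
    restored = []
    last = len(sampled) - 1
    for i in range(total):
        left = min(i // step, last)        # sampled signal at or before position i
        right = min(left + 1, last)        # next sampled signal, if there is one
        frac = (i % step) / step if right != left else 0.0
        restored.append(sampled[left] * (1 - frac) + sampled[right] * frac)
    return restored
```

With a step size of 3, signals 1, 4, and 7 are kept, and signals 2, 3, 5, and 6 are rebuilt at the decoding end; positions after the last sampled signal simply repeat that signal.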

In the embodiment of the present invention, in an AR scene, as a possible implementation, the original depth information corresponding to the entire depth image does not need to be encoded; only the original depth information of part of the picture needs to be encoded, so as to realize the coding and transmission of specified local original depth information.
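A minimal sketch of selecting such a local region before encoding is given below. It assumes the original depth information is a two-dimensional array of samples and that the specified image position is a rectangle; the function name and the returned dictionary layout are hypothetical.

```python
from typing import Dict, List


def select_region(raw: List[List[int]], top: int, left: int, height: int, width: int) -> Dict:
    """Return the specified region together with the position needed to place it back."""
    if top < 0 or left < 0 or top + height > len(raw) or left + width > len(raw[0]):
        raise ValueError("region lies outside the original depth frame")
    samples = [row[left:left + width] for row in raw[top:top + height]]
    return {"top": top, "left": left, "samples": samples}
```

Only the returned samples and their position would then be passed to the encoder, so the rest of the frame adds nothing to the code stream.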

In order to implement the foregoing information processing method, an embodiment of the present invention further provides a terminal device. The composition structure of the terminal device is as shown in FIG. 11, the terminal device 1100 includes:

The first acquiring unit 1101 is configured to acquire original depth information corresponding to the depth information in the case of acquiring the depth information of the target object through the depth information sensing unit, where the original depth information represents the collection state of the depth information collected by the depth information sensing unit, or information other than the collected depth information;

The second acquiring unit 1102 is configured to acquire video image data of the target object through an image sensing unit;

The encoding unit 1103 is configured to merge and encode the original depth information and the video image data to obtain a video image code stream;

The output unit 1104 is configured to output the video image code stream.

In the embodiment of the present invention, the encoding unit 1103 is further configured to:

The original depth information corresponding to the specified image frame in the image frame corresponding to the video image data and the video image data are combined and encoded to obtain the video image code stream.

The original depth information corresponding to the designated image position and the video image data are combined and encoded to obtain the video image code stream.

According to the correlation between the original depth information and the video image data, the original depth information and the video image data are mixed-encoded to obtain the video image code stream.

Encoding the original depth information to obtain first encoding information;

Writing the first encoding information into a designated location of the video image data;

The video image data written in the first encoding information is encoded to obtain the video image code stream.

Encoding the original depth information to obtain first encoding information;

Encoding the video image data to obtain second encoding information;

Combining the first coding information and the second coding information to obtain the video image code stream.

In the embodiment of the present invention, the terminal device further includes:

The preprocessing unit is configured as:

Before combining and encoding the original depth information and the video image data to obtain a video image code stream, preprocessing the original depth information.

Elimination unit, configured as:

Before the original depth information and the video image data are combined and encoded to obtain a video image code stream, redundancy elimination processing is performed on the original depth information to eliminate redundant information in the original depth information.

In the embodiment of the present invention, the elimination unit is further configured as at least one of the following:

In the embodiment of the present invention, the original depth information includes at least one of the following: charge information, phase information, and attribute parameters of the depth information sensing unit.

An embodiment of the present invention also provides a terminal device, including a processor and a memory configured to store a computer program that can run on the processor, wherein the processor is configured to execute, when running the computer program, the steps of the information processing method performed by the above-mentioned terminal device.

It should be noted that the depth information sensing unit, the image sensing unit, and the video image encoding unit in the embodiment of the present invention may be a depth information sensor, an image sensor, and a video image encoder, respectively.

In order to implement the foregoing information processing method, an embodiment of the present invention also provides a terminal device. The composition structure of the terminal device is as shown in FIG. 12, the terminal device 1200 includes:

The receiving unit 1201 is configured to receive a video image code stream, where the video image code stream is obtained by combining and encoding original depth information and video image data, the original depth information is obtained when the depth information of the target object is acquired through a depth information sensing unit, and the video image data is data of the target object acquired by the image sensing unit; the original depth information represents the collection state of the depth information collected by the depth information sensing unit, or information other than the collected depth information;

The decoding unit 1202 is configured to decode the video image code stream to obtain the original depth information and the video image corresponding to the video image data;

The processing unit 1203 is configured to perform image processing on the original depth information and the video image to obtain a target video image.

In the embodiment of the present invention, the decoding unit 1202 is further configured to decode the video image code stream through the video image decoding unit to obtain the original depth information and the video image corresponding to the video image data;

The processing unit 1203 is further configured to perform image processing on the original depth information and the video image through the video image decoding unit to obtain a target video image.

In the embodiment of the present invention, the video image decoding unit and the image processing unit are independent of each other, or the image processing unit is integrated in the video image decoding unit.

In the embodiment of the present invention, the processing unit 1203 is further configured to:

When the original depth information is charge information, denoising processing or white balance adjustment is performed on the video image according to the charge information to obtain the target video image.

When the original depth information is phase information, the video image is deblurred according to the phase information to obtain the target video image.

The generating unit is configured to restore the original depth information to obtain a depth image.

In the embodiment of the present invention, the generating unit is further configured to restore the original depth information through the depth image generating unit to obtain the depth image.

It should be noted that the video image decoding unit, image processing unit, and depth image generating unit in the embodiment of the present invention may be a video image decoder, an image processor, and a depth image generator, respectively.

FIG. 13 is a schematic diagram of the hardware composition structure of an electronic device (terminal device) according to an embodiment of the present invention. The electronic device 1300 includes: at least one processor 1301, a memory 1302, and at least one network interface 1304. The various components in the electronic device 1300 are coupled together through the bus system 1305. It can be understood that the bus system 1305 is used to implement connection and communication between these components. In addition to the data bus, the bus system 1305 also includes a power bus, a control bus, and a status signal bus. However, for the sake of clarity, the various buses are all marked as the bus system 1305 in FIG. 13.

It can be understood that the memory 1302 may be a volatile memory or a non-volatile memory, and may also include both volatile and non-volatile memory. The non-volatile memory may be a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), a ferromagnetic random access memory (FRAM), a flash memory, a magnetic surface memory, an optical disc, or a compact disc read-only memory (CD-ROM); the magnetic surface memory may be disk storage or tape storage. The volatile memory may be a random access memory (RAM), which is used as an external cache. By way of example and not limitation, many forms of RAM are available, such as static random access memory (SRAM), synchronous static random access memory (SSRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDRSDRAM), enhanced synchronous dynamic random access memory (ESDRAM), SyncLink dynamic random access memory (SLDRAM), and direct Rambus random access memory (DRRAM). The memory 1302 described in the embodiment of the present invention is intended to include, but is not limited to, these and any other suitable types of memory.

The memory 1302 in the embodiment of the present invention is used to store various types of data to support the operation of the electronic device 1300. Examples of these data include: any computer program used to operate on the electronic device 1300, such as an application program 13021. The program for implementing the method of the embodiment of the present invention may be included in the application program 13021.

The method disclosed in the foregoing embodiment of the present invention may be applied to the processor 1301 or implemented by the processor 1301. The processor 1301 may be an integrated circuit chip with signal processing capabilities. In the implementation process, the steps of the foregoing method may be completed by an integrated logic circuit of hardware in the processor 1301 or instructions in the form of software. The aforementioned processor 1301 may be a general-purpose processor, a digital signal processor (DSP, Digital Signal Processor), or other programmable logic devices, discrete gates or transistor logic devices, discrete hardware components, and the like. The processor 1301 may implement or execute various methods, steps, and logical block diagrams disclosed in the embodiments of the present invention. The general-purpose processor may be a microprocessor or any conventional processor or the like. The steps of the method disclosed in the embodiments of the present invention may be directly embodied as being executed and completed by a hardware decoding processor, or executed by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium, and the storage medium is located in the memory 1302. The processor 1301 reads the information in the memory 1302 and completes the steps of the foregoing method in combination with its hardware.

In an exemplary embodiment, the electronic device 1300 may be implemented by one or more application specific integrated circuits (ASIC), DSPs, programmable logic devices (PLD), complex programmable logic devices (CPLD), FPGAs, general-purpose processors, controllers, MCUs, MPUs, or other electronic components, to execute the foregoing method.

The embodiment of the present invention also provides a storage medium for storing computer programs.

Optionally, the storage medium can be applied to the terminal device in the embodiment of the present invention, and the computer program causes the computer to execute the corresponding process in each method of the embodiment of the present invention. For brevity, details are not described herein again.

The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present invention. It should be understood that each process and/or block in the flowchart and/or block diagram, and the combination of processes and/or blocks in the flowchart and/or block diagram, can be implemented by computer program instructions. These computer program instructions can be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable data processing equipment to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing equipment produce a device configured to implement the functions specified in one or more processes in the flowchart and/or one or more blocks in the block diagram.

These computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing equipment to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device, and the instruction device implements the functions specified in one or more processes in the flowchart and/or one or more blocks in the block diagram.

These computer program instructions can also be loaded onto a computer or other programmable data processing equipment, so that a series of operation steps are executed on the computer or other programmable equipment to produce computer-implemented processing, and the instructions executed on the computer or other programmable equipment thereby provide steps configured to implement the functions specified in one or more processes in the flowchart and/or one or more blocks in the block diagram.

The above are only the preferred embodiments of the present invention and are not intended to limit the scope of protection of the present invention. Any modification, equivalent replacement and improvement made within the spirit and principle of the present invention shall be included within the protection scope of the present invention.

Claims

An information processing method, the method comprising:

When depth information of a target object is acquired through a depth information sensor, acquiring original depth information corresponding to the depth information, wherein the original depth information represents the acquisition state of the depth information collected by the depth information sensor, or information other than the collected depth information;

Acquiring video image data of the target object through an image sensor;

The original depth information and the video image data are combined and encoded to obtain a video image code stream, and the video image code stream is output.
The method according to claim 1, wherein said combining and encoding said original depth information and said video image data to obtain a video image code stream comprises:

The video image data is combined and encoded with the original depth information corresponding to a designated image frame among the image frames corresponding to the video image data, to obtain the video image code stream.
The method according to claim 1, wherein said combining and encoding said original depth information and said video image data to obtain a video image code stream comprises:

The original depth information corresponding to the designated image position and the video image data are combined and encoded to obtain the video image code stream.
The method according to any one of claims 1 to 3, wherein the combining and encoding the original depth information and the video image data to obtain a video image code stream comprises:

According to the correlation between the original depth information and the video image data, the original depth information and the video image data are mixed-encoded to obtain the video image code stream.
The method according to claim 4, wherein said combining and encoding said original depth information and said video image data to obtain a video image code stream, further comprising:

The first encoding information corresponding to the original depth information is written into the designated position of the second encoding information corresponding to the video image data.
The method according to any one of claims 1 to 3, wherein the combining and encoding the original depth information and the video image data comprises:

The original depth information and the video image data are separately encoded to obtain a video image code stream including a first code stream and a second code stream, wherein the first code stream is a code stream obtained after encoding the original depth information, and the second code stream is a code stream obtained after encoding the video image data.
The method according to any one of claims 1 to 6, wherein, before combining and encoding the original depth information and the video image data to obtain a video image code stream, the method further comprises:

Preprocessing the original depth information;

The combining and encoding the original depth information and the video image data to obtain a video image code stream includes:

The preprocessed original depth information and the video image data are combined and encoded to obtain a video image code stream.
The method according to any one of claims 1 to 7, wherein, before combining and encoding the original depth information and the video image data to obtain a video image code stream, the method further comprises:

Performing redundancy elimination processing on the original depth information to eliminate redundant information in the original depth information.
The method according to claim 8, wherein said performing redundancy elimination processing on said original depth information comprises at least one of the following:

Eliminating the redundancy of the original depth information based on phase correlation;

Performing redundancy elimination processing on the original depth information based on spatial correlation;

Performing redundancy elimination processing on the original depth information based on time correlation;

Performing redundancy elimination processing on the original depth information based on the specified depth;

Performing redundancy elimination processing on the original depth information based on frequency domain correlation;

Performing redundancy elimination processing on the coded bits of the original depth information based on the correlation between the coded binary data.
The method according to any one of claims 1 to 9, wherein the original depth information includes at least one of the following: charge information, phase information, and attribute parameters of the depth information sensor.
An information processing method, the method comprising:

Receiving a video image code stream, wherein the video image code stream is obtained by combining and encoding original depth information and video image data, the original depth information is acquired when depth information of a target object is acquired through a depth information sensor, and the video image data is acquired of the target object through an image sensor; the original depth information represents the acquisition state of the depth information collected by the depth information sensor, or information other than the collected depth information;

Decoding the video image code stream to obtain the original depth information and the video image corresponding to the video image data;

Image processing is performed on the original depth information and the video image to obtain a target video image.
The method of claim 11, wherein:

Decoding the video image code stream by a video image decoder to obtain the original depth information and the video image corresponding to the video image data;

Image processing is performed on the original depth information and the video image by an image processor to obtain a target video image.
The method according to claim 12, wherein the video image decoder and the image processor are independent of each other, or the image processor is integrated in the video image decoder.
The method according to any one of claims 11 to 13, wherein the original depth information includes at least one of the following: charge information, phase information, and attribute parameters of the depth information sensor.
The method according to claim 14, wherein, when the original depth information is charge information, the performing image processing on the original depth information and the video image to obtain a target video image comprises:

Performing denoising processing or white balance adjustment on the video image according to the charge information to obtain the target video image.
The method according to claim 14, wherein, when the original depth information is phase information, the performing image processing on the original depth information and the video image to obtain a target video image comprises:

Performing deblurring processing on the video image according to the phase information to obtain the target video image.
The method according to any one of claims 11 to 16, wherein the method further comprises:

The original depth information is restored to obtain a depth image.
The method according to claim 17, wherein the original depth information is restored by a depth image generator to obtain the depth image.
A terminal device, the terminal device includes:

The first acquiring unit is configured to acquire original depth information corresponding to the depth information when the depth information of the target object is acquired through the depth information sensing unit, wherein the original depth information represents the acquisition state of the depth information collected by the depth information sensing unit, or information other than the collected depth information;

The second acquiring unit is configured to acquire the video image data of the target object through an image sensing unit;

An encoding unit configured to merge and encode the original depth information and the video image data to obtain a video image code stream;

The output unit is configured to output the video image code stream.
The terminal device according to claim 19, wherein the encoding unit is further configured to:

According to the correlation between the original depth information and the video image data, the original depth information and the video image data are mixed-encoded to obtain the video image code stream.
The terminal device according to claim 19 or 20, wherein the encoding unit is further configured to:

Encoding the original depth information to obtain first encoding information;

Encoding the video image data to obtain second encoding information;

Combining the first coding information and the second coding information to obtain the video image code stream.
The terminal device according to any one of claims 19 to 21, wherein the terminal device further comprises:

The preprocessing unit is configured as:

Before combining and encoding the original depth information and the video image data to obtain a video image code stream, preprocessing the original depth information.
The terminal device according to any one of claims 19 to 22, wherein the terminal device further comprises:

Elimination unit, configured as:

Before the original depth information and the video image data are combined and encoded to obtain a video image code stream, redundancy elimination processing is performed on the original depth information to eliminate redundant information in the original depth information.
A terminal device, the terminal device includes:

The receiving unit is configured to receive a video image code stream, wherein the video image code stream is obtained by combining and encoding original depth information and video image data, the original depth information is acquired when the depth information of the target object is acquired through the depth information sensing unit, and the video image data is acquired of the target object through the image sensing unit; the original depth information represents the acquisition state of the depth information acquired by the depth information sensing unit, or information other than the acquired depth information;

A decoding unit configured to decode the video image code stream to obtain the original depth information and the video image corresponding to the video image data;

The processing unit is configured to perform image processing on the original depth information and the video image to obtain a target video image.
The terminal device according to claim 24, wherein the terminal device further comprises:

The generating unit is configured to restore the original depth information to obtain a depth image.
A terminal device, comprising a processor and a memory configured to store a computer program that can run on the processor, wherein the processor is configured to execute, when running the computer program, the steps of the information processing method described in any one of claims 1 to 10, or the steps of the information processing method described in any one of claims 11 to 18.
A storage medium that stores an executable program that, when executed by a processor, implements the information processing method described in any one of claims 1 to 10, or implements the information processing method described in any one of claims 11 to 18.