CN114972469A - Method and apparatus for generating a depth map, electronic device, and readable storage medium


Info

Publication number: CN114972469A
Application number: CN202210492828.8A
Authority: CN (China)
Legal status: Pending
Other languages: Chinese (zh)
Prior art keywords: pixel block, source, reference pixel, image, target
Inventors: 陈曲, 叶晓青, 孙昊
Assignee (current and original): Beijing Baidu Netcom Science and Technology Co Ltd
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd; priority to CN202210492828.8A

Classifications

    • G06T 7/55 — Image analysis; depth or shape recovery from multiple images
    • G06T 5/50 — Image enhancement or restoration by the use of more than one image, e.g. averaging, subtraction
    • G06T 7/70 — Image analysis; determining position or orientation of objects or cameras
    • G06V 10/758 — Pattern matching; organisation of the matching processes involving statistics of pixels or of feature values, e.g. histogram matching
    • G06V 10/761 — Pattern matching; proximity, similarity or dissimilarity measures
    • G06T 2207/10 — Indexing scheme for image analysis or image enhancement; image acquisition modality
    • G06T 2207/20221 — Indexing scheme for image analysis or image enhancement; image fusion, image merging

Abstract

The disclosure provides a method and an apparatus for generating a depth map, an electronic device, and a readable storage medium, relating to artificial intelligence fields such as image processing, computer vision, and deep learning, and applicable to scenes such as 3D vision and virtual/augmented reality. The method comprises: acquiring a reference image and a source image, wherein the reference image and the source image correspond to different viewing angles of the same scene; matching a reference pixel block in the reference image with candidate source pixel blocks in the source image, adjusting the size of the reference pixel block during matching, and determining the target source pixel block corresponding to the reference pixel block in the source image; and generating a depth map of the reference image from the reference pixel block and its corresponding target source pixel block. By adjusting the block size during matching, the method can improve the accuracy of pixel-block matching and hence the accuracy of the generated depth map.

Description

Method and apparatus for generating a depth map, electronic device, and readable storage medium
Technical Field
The present disclosure relates to the field of artificial intelligence, in particular to image processing, computer vision, and deep learning, and can be applied to scenes such as 3D vision and virtual/augmented reality. A method, an apparatus, an electronic device, and a readable storage medium for generating a depth map are provided.
Background
As computer vision develops, traditional techniques based on two-dimensional color image processing can no longer meet the demand for applying computer vision to the three-dimensional physical world. Depth maps, which directly reflect the distance of objects in a scene, are therefore used increasingly widely.
Disclosure of Invention
According to a first aspect of the present disclosure, there is provided a method of generating a depth map, comprising: acquiring a reference image and a source image, wherein the reference image and the source image correspond to different viewing angles of the same scene; matching a reference pixel block in the reference image with candidate source pixel blocks in the source image, adjusting the size of the reference pixel block during matching, and determining the target source pixel block corresponding to the reference pixel block in the source image; and generating a depth map of the reference image according to the reference pixel block and its corresponding target source pixel block.
According to a second aspect of the present disclosure, there is provided an apparatus for generating a depth map, comprising: an acquisition unit configured to acquire a reference image and a source image, wherein the reference image and the source image correspond to different viewing angles of the same scene; a matching unit configured to match a reference pixel block in the reference image with candidate source pixel blocks in the source image, adjust the size of the reference pixel block during matching, and determine the target source pixel block corresponding to the reference pixel block in the source image; and a first generation unit configured to generate a depth map of the reference image according to the reference pixel block and its corresponding target source pixel block.
According to a third aspect of the present disclosure, there is provided an electronic device comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method as described above.
According to a fourth aspect of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform the method described above.
According to a fifth aspect of the present disclosure, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the method as described above.
In the technical solution of the present disclosure, pixel-block matching is performed while the size of the reference pixel block is adjusted during the matching process, which can improve the speed and efficiency of pixel-block matching and, in turn, the speed and efficiency of depth map generation.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a schematic diagram according to a first embodiment of the present disclosure;
FIG. 2 is a schematic diagram according to a second embodiment of the present disclosure;
FIG. 3 is a schematic illustration according to a third embodiment of the present disclosure;
FIG. 4 is a schematic diagram according to a fourth embodiment of the present disclosure;
fig. 5 is a block diagram of an electronic device for implementing a method of generating a depth map of an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Fig. 1 is a schematic diagram according to a first embodiment of the present disclosure. As shown in fig. 1, the method for generating a depth map in this embodiment specifically includes the following steps:
s101, acquiring a reference image and a source image, wherein the reference image and the source image correspond to different visual angles in the same scene;
s102, matching a reference pixel block in the reference image with a candidate source pixel block in the source image, adjusting the size of the reference pixel block in the matching process, and determining a target source pixel block corresponding to the reference pixel block in the source image;
and S103, generating a depth map of the reference image according to the reference pixel block and the target source pixel block corresponding to the reference pixel block.
In the method for generating a depth map of this embodiment, after the reference image and the source image corresponding to different viewing angles of the same scene are acquired, the reference pixel block in the reference image is matched with candidate source pixel blocks in the source image, the target source pixel block corresponding to the reference pixel block is determined in the source image, and the depth map of the reference image is generated from the reference pixel block and its corresponding target source pixel block.

When executing S101 to acquire the reference image and the source image, this embodiment may first acquire a plurality of images captured by a camera from different viewing angles of the same scene, and then select one of them as the reference image and another as the source image.

For example, if the images of scene 1 captured by the camera from different viewing angles are image 1 and image 2, this embodiment may use image 1 as the reference image and image 2 as the source image to generate the depth map of image 1, and then use image 2 as the reference image and image 1 as the source image to generate the depth map of image 2, as sketched below.
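A minimal illustration of this role swap. The generate_depth_map function is a hypothetical stand-in for the method of this embodiment, passed in by the caller; only the reference/source pairing logic is shown.

    # Illustrative only: generate_depth_map is a hypothetical stand-in for the
    # method of this disclosure, supplied by the caller.
    def depth_maps_for_pair(image_1, image_2, generate_depth_map):
        """Each captured view serves once as reference, with the other as source."""
        return {
            "image_1": generate_depth_map(reference=image_1, source=image_2),
            "image_2": generate_depth_map(reference=image_2, source=image_1),
        }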
After the reference image and the source image are acquired in S101, S102 is executed: the reference pixel block in the reference image is matched with candidate source pixel blocks in the source image, and the size of the reference pixel block is adjusted during matching, so as to determine the target source pixel block corresponding to the reference pixel block in the source image.

In this embodiment, the pixels contained in the reference image are reference pixels and the pixels contained in the source image are source pixels; a pixel block corresponding to a reference pixel in the reference image is a reference pixel block, and a pixel block corresponding to a source pixel in the source image is a source pixel block, each pixel block corresponding to a different pixel of its image.

Specifically, when executing S102 to match a reference pixel block in the reference image with source pixel blocks in the source image, this embodiment may adopt the following optional implementation: divide the reference image into a plurality of reference pixel blocks and the source image into a plurality of source pixel blocks according to a preset pixel block size, which may be n × n (that is, each pixel block contains n × n pixels, where n is a positive integer greater than or equal to 1); select candidate source pixel blocks corresponding to the reference pixel block from the plurality of source pixel blocks; and match the reference pixel block against its candidate source pixel blocks.
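Since each pixel block corresponds to one pixel of its image, one plausible reading of this division is an n × n window centred on each pixel. The sketch below follows that reading with grayscale numpy arrays; the data layout and border handling are assumptions made for illustration, not details fixed by the disclosure.

    import numpy as np

    def extract_block(image: np.ndarray, y: int, x: int, half: int) -> np.ndarray:
        """Return the (2*half+1) x (2*half+1) block centred on pixel (y, x),
        clipped at the image border."""
        h, w = image.shape
        y0, y1 = max(0, y - half), min(h, y + half + 1)
        x0, x1 = max(0, x - half), min(w, x + half + 1)
        return image[y0:y1, x0:x1]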
It should be understood that a plurality of candidate source pixel blocks are usually selected for a reference pixel block; since one matching pass can only match the reference pixel block against one candidate source pixel block, multiple matching passes are performed when executing S102.

When executing S102 to select candidate source pixel blocks corresponding to the reference pixel block from the plurality of source pixel blocks, this embodiment may adopt the following optional implementation: determine the target reference pixel corresponding to the reference pixel block in the reference image; and select the source pixel block corresponding to the determined target reference pixel from the source image as a candidate source pixel block corresponding to the reference pixel block.

That is, candidate source pixel blocks are determined from the target reference pixel corresponding to the reference pixel block; since a correspondence is maintained between reference pixels and source pixel blocks, this improves both the accuracy and the speed of candidate selection.

When executing S102 to determine the target reference pixel corresponding to the reference pixel block, this embodiment may take the pixel located at a preset position in the reference pixel block (e.g., the middle position or a position adjacent to it) as the target reference pixel of that block.

After dividing the images into pixel blocks in S102, this embodiment may pre-establish a correspondence between each reference pixel and a source pixel block in the source image; from this correspondence, the source pixel block corresponding to the target reference pixel can be determined in the source image.

When executing S102 to establish the correspondence between reference pixels and source pixel blocks, this embodiment may randomly select one source pixel block from the source image for each reference pixel in the reference image, and record the correspondence between that reference pixel and the selected source pixel block.
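A minimal sketch of this random initialisation, identifying each source pixel block by the coordinates of its centre pixel; this representation is an assumption made for illustration.

    import numpy as np

    def init_correspondence(ref_shape, src_shape, rng=None):
        """Assign every reference pixel a randomly chosen source-block centre (y, x)."""
        rng = rng if rng is not None else np.random.default_rng()
        h, w = ref_shape
        src_h, src_w = src_shape
        ys = rng.integers(0, src_h, size=(h, w))
        xs = rng.integers(0, src_w, size=(h, w))
        return np.stack([ys, xs], axis=-1)  # shape (h, w, 2)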
Multiple iterations may be required when determining the target source pixel block corresponding to a reference pixel block, and each iteration can be regarded as updating the source pixel block corresponding to each reference pixel. Therefore, when executing S102 to select the source pixel block corresponding to the determined target reference pixel as a candidate source pixel block, this embodiment may also determine the source pixel block corresponding to the target reference pixel in the current iteration from the target source pixel block determined for the reference pixel block in the previous iteration.

Executing S102, the reference pixel block is matched against its candidate source pixel blocks to obtain a matching degree between the blocks; the matching degree may be the similarity between the reference pixel block and a candidate source pixel block, or the matching cost between them. The similarity or matching cost can be computed with existing techniques, which are not described again here.
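The disclosure leaves the measure itself to existing techniques; the sum of absolute differences (SAD) below is only one common matching cost, shown here to make the later steps concrete. Blocks are assumed to share a shape.

    import numpy as np

    def matching_cost(ref_block: np.ndarray, src_block: np.ndarray) -> float:
        """Sum of absolute differences: lower cost means a better match."""
        return float(np.abs(ref_block.astype(np.float64)
                            - src_block.astype(np.float64)).sum())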
When adjusting the size of the reference pixel block during matching in S102, this embodiment may adopt the following optional implementation: obtain the matching degree of the reference pixel block and the candidate source pixel block in the current matching pass in the horizontal direction (x direction) and in the vertical direction (y direction), the matching degree being the similarity or matching cost in each direction; take the direction whose matching degree meets a preset requirement as the target direction; and adjust the length of the reference pixel block in the target direction, so that the next matching pass is performed with the adjusted reference pixel block.

That is, when matching pixel blocks, this embodiment adjusts the size of the reference pixel block according to the per-direction matching degrees obtained in each matching pass, so that the adjusted block covers more or fewer pixels of the reference image in the corresponding direction. The block size is thus adapted during matching, which improves the matching accuracy between pixel blocks.

When executing S102 to take the direction whose matching degree meets the preset requirement as the target direction, either the direction with the higher matching degree (e.g., greater similarity or smaller matching cost) or the direction with the lower matching degree (e.g., smaller similarity or greater matching cost) may be used.

When executing S102 to adjust the size of the reference pixel block in the target direction, the adjustment depends on which preset requirement was used: if the target direction is the direction with the higher matching degree, the length of the reference pixel block in the target direction may be increased; if it is the direction with the lower matching degree, that length may be decreased.

When adjusting the length of the reference pixel block in the target direction in S102, the length in the non-target direction may also be adjusted at the same time: for example, the length in the non-target direction is decreased while the length in the target direction is increased, or increased while the length in the target direction is decreased.

It can be understood that, when executing S102 to adjust the length of the reference pixel block in the target or non-target direction, this embodiment may increase or decrease the length by a preset distance value; the way the length is adjusted is not limited in this embodiment.

In addition, when executing S102 to adjust the length of the reference pixel block in the target direction, this embodiment may adopt the following optional implementation: adjust the length only when it is determined that the adjusted length of the reference pixel block in the target direction lies within a preset length range.

That is, this embodiment constrains the size of the reference pixel block: the size is adjusted only when the adjusted length falls within the preset length range, which prevents the block from being adjusted too large or too small and thus improves the accuracy of the adjustment. The sketch below ties these rules together.
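A sketch of the adjustment rule under stated assumptions: the per-direction matching degree is taken as the SAD cost restricted to the block's centre row (horizontal) or centre column (vertical), the "preset requirement" picks the direction with the worse (higher) cost, that length is shrunk by a fixed step, and the result is kept only if it stays within the preset length range. None of these specific choices is mandated by the disclosure.

    import numpy as np

    MIN_LEN, MAX_LEN, STEP = 3, 31, 2  # illustrative preset length range and step

    def adjust_block_size(ref_block, src_block, block_h, block_w):
        """Shrink the block along the worse-matching direction, with clamping."""
        diff = np.abs(ref_block.astype(np.float64) - src_block.astype(np.float64))
        cost_x = diff[diff.shape[0] // 2, :].sum()    # horizontal: centre row
        cost_y = diff[:, diff.shape[1] // 2].sum()    # vertical: centre column
        if cost_x > cost_y:                           # x matches worse: shrink width
            if MIN_LEN <= block_w - STEP <= MAX_LEN:  # only adjust inside the range
                block_w -= STEP
        else:                                         # y matches worse: shrink height
            if MIN_LEN <= block_h - STEP <= MAX_LEN:
                block_h -= STEP
        return block_h, block_w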
When executing S102 to determine the target source pixel block corresponding to the reference pixel block in the source image, this embodiment may adopt the following optional implementation: take the candidate source pixel block with the highest matching degree to the reference pixel block during matching as the first source pixel block; select second source pixel blocks corresponding to the first source pixel block in the source image, for example taking, according to a preset search range, at least one source pixel block around the first source pixel block as a second source pixel block; and, if a selected second source pixel block is determined to match the reference pixel block better, take that second source pixel block as the target source pixel block, otherwise take the first source pixel block as the target source pixel block.

That is, this embodiment combines the two matching stages to determine the target source pixel block corresponding to the reference pixel block, which further improves the speed and accuracy of the determination.
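A sketch of this two-stage selection, reusing the hypothetical extract_block and matching_cost helpers above; the neighbourhood radius `search` stands in for the preset search range, and first_yx is assumed to lie at least `half` pixels inside the image.

    def refine_target(ref_block, src_img, first_yx, half, search=1):
        """Keep the first source block unless a neighbouring block matches better."""
        best_yx = first_yx
        best_cost = matching_cost(ref_block,
                                  extract_block(src_img, *first_yx, half))
        for dy in range(-search, search + 1):
            for dx in range(-search, search + 1):
                y, x = first_yx[0] + dy, first_yx[1] + dx
                if not (0 <= y < src_img.shape[0] and 0 <= x < src_img.shape[1]):
                    continue
                cand = extract_block(src_img, y, x, half)
                if cand.shape != ref_block.shape:  # clipped at the image border
                    continue
                cost = matching_cost(ref_block, cand)
                if cost < best_cost:               # lower cost = higher match degree
                    best_yx, best_cost = (y, x), cost
        return best_yx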
It can be understood that multiple iterations may be performed based on S102, each iteration consisting of multiple matching passes between reference pixel blocks and candidate source pixel blocks. Specifically, the correspondence between reference pixels and source pixel blocks is updated according to the target source pixel blocks determined for the reference pixel blocks in the current iteration, so that new candidate source pixel blocks are obtained and matched against the reference pixel blocks in the next iteration.
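A compact driver for this iteration, again built on the hypothetical helpers above; the fixed iteration count and the pixel-by-pixel update order are assumptions, as the disclosure only requires that the correspondence be updated between iterations.

    def iterate(ref_img, src_img, corr, half, n_iters=3):
        """Repeatedly re-match every reference block and update its correspondence."""
        h, w = ref_img.shape
        for _ in range(n_iters):
            for y in range(half, h - half):
                for x in range(half, w - half):
                    ref_block = extract_block(ref_img, y, x, half)
                    corr[y, x] = refine_target(
                        ref_block, src_img,
                        (int(corr[y, x, 0]), int(corr[y, x, 1])), half)
        return corr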
After the target source pixel block corresponding to the reference pixel block is determined in the source image in S102, S103 is executed to generate the depth map of the reference image from the reference pixel block and its corresponding target source pixel block.

Specifically, when executing S103 to generate the depth map of the reference image from the reference pixel block and its corresponding target source pixel block, this embodiment may adopt the following optional implementation: acquire first position information of the source pixel corresponding to the target source pixel block in the source image; acquire second position information of the reference pixel corresponding to the reference pixel block in the reference image; obtain the offset of the reference pixel from the first and second position information; and generate the depth map of the reference image based on the offsets of the reference pixels.

Since the reference image and the source image acquired in S101 correspond to different viewing angles of the same scene, once the offsets of the reference pixels are obtained in S103, the depth map of the reference image can be generated by combining them with the pose relationship between the reference image and the source image; the depth value of each pixel in the generated depth map represents the distance between that pixel and the camera.
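The conversion from offsets to depth depends on the pose relationship between the two views; the sketch below shows only the standard rectified-stereo special case (depth = focal length × baseline / disparity), as a concrete illustration rather than the general multi-view geometry the description allows.

    import numpy as np

    def depth_from_offset(offset_x: np.ndarray, focal_px: float,
                          baseline_m: float) -> np.ndarray:
        """offset_x: per-pixel horizontal offset (disparity) in pixels."""
        d = np.abs(offset_x).astype(np.float64)
        depth = np.full_like(d, np.inf)            # zero offset: infinitely far
        np.divide(focal_px * baseline_m, d, out=depth, where=d > 0)
        return depth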
After generating the depth map of the reference image in S103, this embodiment may further: acquire the depth maps of a plurality of reference images; fuse the depth information of the same pixel across the different depth maps to obtain a fused depth for each pixel; and, after removing pixels whose fused depth exceeds a preset depth threshold, generate a three-dimensional point cloud from the depth information of the remaining pixels.

That is, a three-dimensional point cloud of the scene is generated from the depth maps of multiple reference images; since adjusting the size of the reference pixel block improves the accuracy of the generated depth maps, the accuracy of the generated point cloud improves accordingly.
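A sketch of this fusion-and-reconstruction step. Averaging as the fusion rule and the pinhole intrinsics (fx, fy, cx, cy) are assumptions made for illustration; the disclosure only states that depth values for the same pixel are fused and thresholded.

    import numpy as np

    def fuse_and_backproject(depth_maps, fx, fy, cx, cy, max_depth):
        """Fuse per-pixel depths, drop too-deep pixels, back-project the rest."""
        fused = np.mean(np.stack(depth_maps), axis=0)  # per-pixel fusion (mean)
        ys, xs = np.nonzero(fused <= max_depth)        # keep pixels under threshold
        z = fused[ys, xs]
        x = (xs - cx) * z / fx                         # pinhole back-projection
        y = (ys - cy) * z / fy
        return np.stack([x, y, z], axis=-1)            # (N, 3) point cloud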
Fig. 2 is a schematic diagram according to a second embodiment of the present disclosure. The left side in fig. 2 is the input reference image and the right side in fig. 2 is the input source image.
Fig. 3 is a schematic diagram according to a third embodiment of the present disclosure. The image in fig. 3 is a depth map of the reference image in fig. 2.
Fig. 4 is a schematic diagram according to a fourth embodiment of the present disclosure. As shown in fig. 4, the apparatus 400 for generating a depth map according to this embodiment includes:
the acquisition unit 401 is configured to acquire a reference image and a source image, wherein the reference image and the source image correspond to different viewing angles of the same scene;
the matching unit 402 is configured to match a reference pixel block in the reference image with a candidate source pixel block in the source image, adjust the size of the reference pixel block in the matching process, and determine a target source pixel block corresponding to the reference pixel block in the source image;
the first generating unit 403 is configured to generate a depth map of the reference image according to the reference pixel block and the target source pixel block corresponding to the reference pixel block.
When acquiring the reference image and the source image, the acquisition unit 401 may first acquire a plurality of images captured by a camera from different viewing angles of the same scene, and then select one of them as the reference image and another as the source image.
After the acquisition unit 401 acquires the reference image and the source image, the matching unit 402 matches a reference pixel block in the reference image with candidate source pixel blocks in the source image, adjusts the size of the reference pixel block during matching, and determines the target source pixel block corresponding to the reference pixel block in the source image.
As above, the pixels contained in the reference image are reference pixels and those in the source image are source pixels; a pixel block corresponding to a reference pixel is a reference pixel block, and a pixel block corresponding to a source pixel is a source pixel block, each pixel block corresponding to a different pixel of its image.

Specifically, when matching a reference pixel block in the reference image with source pixel blocks in the source image, the matching unit 402 may adopt the following optional implementation: divide the reference image into a plurality of reference pixel blocks and the source image into a plurality of source pixel blocks according to a preset pixel block size; select candidate source pixel blocks corresponding to the reference pixel block from the plurality of source pixel blocks; and match the reference pixel block against its candidate source pixel blocks.

It should be understood that there are usually a plurality of candidate source pixel blocks for a reference pixel block; since one matching pass can only match the reference pixel block against one candidate source pixel block, the matching unit 402 performs multiple matching passes.

When selecting candidate source pixel blocks corresponding to the reference pixel block from the plurality of source pixel blocks, the matching unit 402 may: determine the target reference pixel corresponding to the reference pixel block in the reference image; and select the source pixel block corresponding to the determined target reference pixel from the source image as a candidate source pixel block.

That is, candidate source pixel blocks are determined from the target reference pixel corresponding to the reference pixel block; since a correspondence is maintained between reference pixels and source pixel blocks, this improves both the accuracy and the speed of candidate selection.

When determining the target reference pixel corresponding to the reference pixel block, the matching unit 402 may take the pixel located at a preset position in the reference pixel block (e.g., the middle position or a position adjacent to it) as the target reference pixel.

After dividing the images into pixel blocks, the matching unit 402 may pre-establish a correspondence between each reference pixel and a source pixel block in the source image; from this correspondence, the source pixel block corresponding to the target reference pixel can be determined in the source image.

When establishing this correspondence, the matching unit 402 may randomly select one source pixel block from the source image for each reference pixel in the reference image, and record the correspondence between that reference pixel and the selected source pixel block.

Multiple iterations may be required when determining the target source pixel block corresponding to a reference pixel block, each iteration updating the source pixel block corresponding to each reference pixel; therefore, when selecting the source pixel block corresponding to the determined target reference pixel as a candidate, the matching unit 402 may also determine the source pixel block corresponding to the target reference pixel in the current iteration from the target source pixel block determined in the previous iteration.

The matching unit 402 matches the reference pixel block against its candidate source pixel blocks to obtain a matching degree between the blocks; the matching degree may be the similarity between the reference pixel block and a candidate source pixel block, or the matching cost between them.
When adjusting the size of the reference pixel block during matching, the matching unit 402 may adopt the following optional implementation: obtain the matching degree of the reference pixel block and the candidate source pixel block in the current matching pass in the horizontal direction (x direction) and in the vertical direction (y direction); take the direction whose matching degree meets a preset requirement as the target direction; and adjust the length of the reference pixel block in the target direction, so that the next matching pass is performed with the adjusted reference pixel block.

That is, when matching pixel blocks, the size of the reference pixel block is adjusted according to the per-direction matching degrees obtained in each matching pass, so that the adjusted block covers more or fewer pixels of the reference image in the corresponding direction; the block size is thus adapted during matching, which improves the matching accuracy between pixel blocks.

When taking the direction whose matching degree meets the preset requirement as the target direction, the matching unit 402 may use either the direction with the higher matching degree (e.g., greater similarity or smaller matching cost) or the direction with the lower matching degree (e.g., smaller similarity or greater matching cost).

When adjusting the size of the reference pixel block in the target direction, the adjustment depends on which preset requirement was used: if the target direction is the direction with the higher matching degree, the matching unit 402 may increase the length of the reference pixel block in the target direction; if it is the direction with the lower matching degree, the matching unit 402 may decrease that length.

The matching unit 402 may also adjust the length of the reference pixel block in the non-target direction at the same time, for example decreasing the length in the non-target direction while increasing the length in the target direction, or increasing the length in the non-target direction while decreasing the length in the target direction.

It can be understood that, when adjusting the length of the reference pixel block in the target or non-target direction, the matching unit 402 may increase or decrease the length by a preset distance value; the way the length is adjusted is not limited in this embodiment.

In addition, when adjusting the length of the reference pixel block in the target direction, the matching unit 402 may adjust the length only when it is determined that the adjusted length of the reference pixel block in the target direction lies within a preset length range.

That is, the size of the reference pixel block is constrained: it is adjusted only when the adjusted length falls within the preset length range, which prevents the block from being adjusted too large or too small and thus improves the accuracy of the adjustment.

When determining the target source pixel block corresponding to the reference pixel block in the source image, the matching unit 402 may: take the candidate source pixel block with the highest matching degree to the reference pixel block during matching as the first source pixel block; select second source pixel blocks corresponding to the first source pixel block in the source image, for example taking, according to a preset search range, at least one source pixel block around the first source pixel block as a second source pixel block; and, if a selected second source pixel block is determined to match the reference pixel block better, take that second source pixel block as the target source pixel block, otherwise take the first source pixel block as the target source pixel block.

That is, the two matching stages are combined to determine the target source pixel block corresponding to the reference pixel block, which further improves the accuracy of the determination.

It can be understood that the matching unit 402 may perform multiple iterations, each consisting of multiple matching passes between reference pixel blocks and candidate source pixel blocks; specifically, the correspondence between reference pixels and source pixel blocks is updated according to the target source pixel blocks determined in the current iteration, so that new candidate source pixel blocks are obtained and matched against the reference pixel blocks in the next iteration.
In the present embodiment, after the matching unit 402 determines the target source pixel block corresponding to the reference pixel block in the source image, the first generation unit 403 generates the depth map of the reference image according to the reference pixel block and the target source pixel block corresponding to the reference pixel block.
When generating the depth map of the reference image from the reference pixel block and its corresponding target source pixel block, the first generation unit 403 may adopt the following optional implementation: acquire first position information of the source pixel corresponding to the target source pixel block in the source image; acquire second position information of the reference pixel corresponding to the reference pixel block in the reference image; obtain the offset of the reference pixel from the first and second position information; and generate the depth map of the reference image based on the offsets of the reference pixels.

Since the reference image and the source image acquired by the acquisition unit 401 correspond to different viewing angles of the same scene, once the offsets of the reference pixels are obtained, the first generation unit 403 can generate the depth map of the reference image by combining them with the pose relationship between the reference image and the source image; the depth value of each pixel in the generated depth map represents the distance between that pixel and the camera.

The apparatus 400 for generating a depth map of this embodiment may further include a second generation unit 404 configured to: acquire the depth maps of a plurality of reference images; fuse the depth information of the same pixel across the different depth maps to obtain a fused depth for each pixel; and, after removing pixels whose fused depth exceeds a preset depth threshold, generate a three-dimensional point cloud from the depth information of the remaining pixels.
In the technical solution of the present disclosure, the acquisition, storage, and application of users' personal information comply with the relevant laws and regulations and do not violate public order and good morals.
The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.
Fig. 5 is a block diagram of an electronic device for implementing the method of generating a depth map according to an embodiment of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be examples only and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 5, the device 500 includes a computing unit 501, which can perform various appropriate actions and processes according to a computer program stored in a read-only memory (ROM) 502 or loaded from a storage unit 508 into a random access memory (RAM) 503. The RAM 503 can also store the various programs and data required for the operation of the device 500. The computing unit 501, the ROM 502, and the RAM 503 are connected to one another via a bus 504. An input/output (I/O) interface 505 is also connected to the bus 504.

A number of components in the device 500 are connected to the I/O interface 505, including: an input unit 506 such as a keyboard or a mouse; an output unit 507 such as various types of displays and speakers; a storage unit 508 such as a magnetic disk or an optical disk; and a communication unit 509 such as a network card, a modem, or a wireless communication transceiver. The communication unit 509 allows the device 500 to exchange information/data with other devices over a computer network such as the Internet and/or various telecommunication networks.

The computing unit 501 may be any of various general-purpose and/or special-purpose processing components with processing and computing capabilities. Examples of the computing unit 501 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various dedicated artificial intelligence (AI) computing chips, various computing units running machine learning model algorithms, a digital signal processor (DSP), and any suitable processor, controller, or microcontroller. The computing unit 501 performs the methods and processes described above, such as the method of generating a depth map. For example, in some embodiments, the method of generating a depth map may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 508.

In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 500 via the ROM 502 and/or the communication unit 509. When the computer program is loaded into the RAM 503 and executed by the computing unit 501, one or more steps of the method of generating a depth map described above may be performed. Alternatively, in other embodiments, the computing unit 501 may be configured to perform the method of generating a depth map by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described here can be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), system on a chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable depth map generating apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which the user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback), and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and a server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server (also called a cloud computing server or cloud host), a host product in the cloud computing service system that remedies the defects of difficult management and weak service expansibility in traditional physical hosts and VPS ("Virtual Private Server") services. The server may also be a server of a distributed system, or a server combined with a blockchain.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel or sequentially or in different orders, and are not limited herein as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims (19)

1. A method of generating a depth map, comprising:
acquiring a reference image and a source image, wherein the reference image and the source image correspond to different viewing angles of the same scene;
matching a reference pixel block in the reference image with a candidate source pixel block in the source image, adjusting the size of the reference pixel block in the matching process, and determining a target source pixel block corresponding to the reference pixel block in the source image;
and generating a depth map of the reference image according to the reference pixel block and the target source pixel block corresponding to the reference pixel block.
2. The method of claim 1, wherein said matching a reference pixel block in the reference image with a candidate source pixel block in the source image comprises:
dividing the reference image into a plurality of reference pixel blocks according to the size of a preset pixel block, and dividing the source image into a plurality of source pixel blocks;
selecting a candidate source pixel block corresponding to the reference pixel block from a plurality of source pixel blocks;
and matching the reference pixel block and the candidate source pixel block corresponding to the reference pixel block.
3. The method of claim 2, wherein said selecting a candidate source pixel block from a plurality of source pixel blocks that corresponds to the reference pixel block comprises:
determining a target reference pixel in the reference image corresponding to the reference pixel block;
and selecting a source pixel block corresponding to the target reference pixel from the source image as a candidate source pixel block corresponding to the reference pixel block.
4. The method according to any of claims 1-3, wherein said resizing the reference pixel block in the matching process comprises:
obtaining the matching degree of the reference pixel block and the candidate source pixel block in the matching process in the horizontal direction and the vertical direction;
taking the direction with the matching degree meeting the preset requirement as a target direction;
adjusting a length of the reference pixel block in the target direction.
5. The method of claim 4, wherein the adjusting the length of the reference pixel block in the target direction comprises:
and in the case that the length of the reference pixel block in the target direction after adjustment is determined to be in a preset length range, adjusting the length of the reference pixel block in the target direction.
6. The method according to any one of claims 1-5, wherein said determining a target source pixel block in the source image corresponding to the reference pixel block comprises:
taking the candidate source pixel block with the highest matching degree with the reference pixel block in the matching process as a first source pixel block;
selecting a second source pixel block corresponding to the first source pixel block in the source image;
in the event that it is determined that the reference pixel block has a higher degree of match with the second source pixel block, treating the second source pixel block as the target source pixel block, otherwise treating the first source pixel block as the target source pixel block.
7. The method of any one of claims 1-6, wherein the generating a depth map for the reference image from the reference pixel block and its corresponding target source pixel block comprises:
acquiring first position information of source pixels corresponding to the target source pixel block in the source image, and acquiring second position information of reference pixels corresponding to the reference pixel block in the reference image;
obtaining the offset of the reference pixel according to the first position information and the second position information;
and generating a depth map of the reference image based on the offset of the reference pixel.
8. The method of any of claims 1-7, further comprising,
after generating the depth map of the reference image, acquiring depth maps of a plurality of reference images;
fusing depth information of the same pixels in different depth maps to obtain a depth fusion result of each pixel;
and after removing pixels whose depth fusion result exceeds a preset depth threshold, generating a three-dimensional point cloud according to the depth information of the remaining pixels.
9. An apparatus for generating a depth map, comprising:
an acquisition unit configured to acquire a reference image and a source image, wherein the reference image and the source image correspond to different viewing angles of the same scene;
a matching unit configured to match a reference pixel block in the reference image with a candidate source pixel block in the source image, adjust the size of the reference pixel block in the matching process, and determine a target source pixel block corresponding to the reference pixel block in the source image;
and a first generation unit configured to generate the depth map of the reference image according to the reference pixel block and the target source pixel block corresponding to the reference pixel block.
10. The apparatus according to claim 9, wherein the matching unit, when matching a reference pixel block in the reference image with a candidate source pixel block in the source image, specifically performs:
dividing the reference image into a plurality of reference pixel blocks according to the size of a preset pixel block, and dividing the source image into a plurality of source pixel blocks;
selecting a candidate source pixel block corresponding to the reference pixel block from a plurality of source pixel blocks;
and matching the reference pixel block and the candidate source pixel block corresponding to the reference pixel block.
11. The apparatus according to claim 10, wherein the matching unit, when selecting the candidate source pixel block corresponding to the reference pixel block from the plurality of source pixel blocks, specifically performs:
determining a target reference pixel in the reference image corresponding to the reference pixel block;
and selecting a source pixel block corresponding to the target reference pixel from the source image as a candidate source pixel block corresponding to the reference pixel block.
12. The apparatus according to any one of claims 9-11, wherein, when adjusting the size of the reference pixel block during matching, the matching unit is specifically configured to:
acquire the matching degrees of the reference pixel block and the candidate source pixel block in the current matching process in the horizontal direction and the vertical direction;
take the direction whose matching degree meets a preset requirement as a target direction;
and adjust a length of the reference pixel block in the target direction.
13. The apparatus according to claim 12, wherein, when adjusting the length of the reference pixel block in the target direction, the matching unit is specifically configured to:
adjust the length of the reference pixel block in the target direction in a case where it is determined that the adjusted length falls within a preset length range.
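
Claims 12 and 13 together describe a directional, range-clamped size adjustment. The sketch below is one reading: the threshold, the step size, the choice to grow rather than shrink, and the length range are all illustrative parameters that the claims leave open.

```python
def adjust_block_size(ref_h, ref_w, score_h, score_v,
                      threshold, step=2, min_len=4, max_len=32):
    """Claims 12-13 sketch: a direction whose matching degree meets the
    preset requirement becomes a target direction, and the block's length
    in that direction is adjusted only if the adjusted length stays
    within [min_len, max_len] (claim 13)."""
    new_h, new_w = ref_h, ref_w
    if score_h >= threshold and min_len <= ref_w + step <= max_len:
        new_w = ref_w + step   # horizontal direction is a target direction
    if score_v >= threshold and min_len <= ref_h + step <= max_len:
        new_h = ref_h + step   # vertical direction is a target direction
    return new_h, new_w
```
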
14. The apparatus according to any one of claims 9-13, wherein, when determining the target source pixel block in the source image corresponding to the reference pixel block, the matching unit is specifically configured to:
take, as a first source pixel block, the candidate source pixel block having the highest matching degree with the reference pixel block during matching;
select, in the source image, a second source pixel block corresponding to the first source pixel block;
and in a case where it is determined that the reference pixel block has a higher matching degree with the second source pixel block, take the second source pixel block as the target source pixel block; otherwise, take the first source pixel block as the target source pixel block.
15. The apparatus according to any one of claims 9-14, wherein, when generating the depth map of the reference image according to the reference pixel block and its corresponding target source pixel block, the first generating unit is specifically configured to:
acquire first position information of the source pixels corresponding to the target source pixel block in the source image, and acquire second position information of the reference pixels corresponding to the reference pixel block in the reference image;
obtain offsets of the reference pixels according to the first position information and the second position information;
and generate the depth map of the reference image based on the offsets of the reference pixels.
16. The apparatus according to any one of claims 9-15, further comprising a second generating unit configured to:
acquire depth maps of a plurality of reference images after the first generating unit generates the depth map of the reference image;
fuse depth information of the same pixel in different depth maps to obtain a depth fusion result for each pixel;
and after removing pixels whose depth fusion result exceeds a preset depth threshold, generate a three-dimensional point cloud from the depth information of the remaining pixels.
17. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-8.
18. A non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform the method of any one of claims 1-8.
19. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1-8.
CN202210492828.8A 2022-05-07 2022-05-07 Method and device for generating depth map, electronic equipment and readable storage medium Pending CN114972469A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210492828.8A CN114972469A (en) 2022-05-07 2022-05-07 Method and device for generating depth map, electronic equipment and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210492828.8A CN114972469A (en) 2022-05-07 2022-05-07 Method and device for generating depth map, electronic equipment and readable storage medium

Publications (1)

Publication Number Publication Date
CN114972469A true CN114972469A (en) 2022-08-30

Family

ID=82981643

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210492828.8A Pending CN114972469A (en) 2022-05-07 2022-05-07 Method and device for generating depth map, electronic equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN114972469A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination