CN110415351B

CN110415351B - Method, device and system for constructing three-dimensional grid based on single image

Info

Publication number: CN110415351B
Application number: CN201910543756.3A
Authority: CN
Inventors: 杨骏锋; 范浩强
Original assignee: Beijing Megvii Technology Co Ltd
Current assignee: Beijing Megvii Technology Co Ltd
Priority date: 2019-06-21
Filing date: 2019-06-21
Publication date: 2023-10-10
Anticipated expiration: 2039-06-21
Also published as: CN110415351A

Abstract

The invention provides a method, a device and a system for constructing a three-dimensional grid based on a single image. The method comprises the following steps: determining a position where occlusion exists in a depth image corresponding to an input image; and disconnecting the patch connection relationship between the foreground and the background at the position where the occlusion exists, and constructing a grid for the background part at that position. Because the original patch connection relationship at the position where occlusion exists in the image is disconnected and a grid is constructed for the occluded part, the obvious stretching effect can be effectively removed and the visual effect is improved.

Description

Method, device and system for constructing three-dimensional grid based on single image

Technical Field

The present invention relates to the field of image processing technologies, and in particular, to a method, an apparatus, a system, and a storage medium for constructing a three-dimensional grid based on a single image.

Background

A three-dimensional (3D) album converts a single image into a three-dimensional image. For example, a three-dimensional scene is reconstructed on a mobile phone, and the images that would be obtained by shooting the scene from different positions (for example, positions shifted up or down, left or right, or forward or backward relative to the original shooting position) and at different angles are simulated. The technology can make photos more expressive and can also be used in the virtual reality industry.

In the prior art, when a three-dimensional grid (mesh) is constructed from a single image, the occlusion relationship at the boundaries between different objects is not considered, so the rendered image shows an obvious stretching effect at those boundaries and the visual effect is poor. Alternatively, a double-layer grid is simply constructed, in which case triangular patches may be missing at positions where several boundaries of different depths meet; in addition, this approach generates many useless triangular patches, so the grid it builds is large.

Disclosure of Invention

The present invention has been made in order to solve at least one of the above problems. The invention provides a scheme for constructing a three-dimensional grid based on a single image, which disconnects the original patch connection relationship at the position where occlusion exists in the image and constructs a grid for the occluded part, so that the obvious stretching effect can be effectively removed and the visual effect is improved. The scheme is briefly described below, and more details are given in the detailed description with reference to the drawings.

According to an aspect of the present invention, there is provided a method of constructing a three-dimensional mesh based on a single image, the method comprising: determining a position where shielding exists in a depth image corresponding to an input image; and disconnecting the patch connection relationship between the foreground and the background at the position with the occlusion, and constructing a grid for the background part at the position with the occlusion.

In one embodiment of the invention, said constructing a grid for the background portion at the position where occlusion exists comprises: when building a patch for the background portion at the position where occlusion exists, creating a new layer of mesh for building the patch whenever the position where the patch needs to be built is already occupied.

In one embodiment of the invention, the method further comprises: before building a patch for the background portion at the position where occlusion exists, determining the maximum number of patches that need to be built for the occluded portion.

In one embodiment of the present invention, the determining a position where occlusion exists in a depth image corresponding to an input image includes: for each pixel point in the depth image, calculating the depth difference between the pixel point and each adjacent pixel point; and if the depth difference between the pixel point and an adjacent pixel point is larger than a preset threshold value, determining that the pixel point is occluded in the direction toward that adjacent pixel point.

In one embodiment of the present invention, the adjacent pixel points of the pixel point include the adjacent pixel points in the four directions above, below, to the left of and to the right of the pixel point.

In one embodiment of the present invention, the disconnecting the patch connection relationship between the foreground and the background at the position where the occlusion exists, and constructing the grid for the background portion at the position where the occlusion exists, includes: disconnecting the patch connection relationship between the pixel point and said adjacent pixel point, and expanding vertices from the pixel point in the direction of said adjacent pixel point to construct a grid, until no occlusion exists in that direction.

In one embodiment of the present invention, expanding vertices in the direction from the pixel point to said adjacent pixel point includes: each time a vertex is expanded in the direction, determining whether the vertex can be connected with a pixel point adjacent to the vertex, or whether the maximum expansion boundary of the direction has been reached; and if either condition holds, ending the expansion in that direction. This is iterated until the expansion has ended in every occluded direction of every pixel point.

In an embodiment of the present invention, expanding vertices in the direction from the pixel point to said adjacent pixel point further includes: before expanding a vertex in the direction, determining whether the position of the vertex to be expanded is already occupied by another point, and if so, creating a new layer of mesh for the vertex expansion.

In one embodiment of the invention, the maximum extension boundary of the direction is determined based on an imaging model of a camera capturing the input image, internal parameters of the camera, and preset parameters for simulating the magnitude of the positional change of the camera.

According to another aspect of the present invention, there is provided an apparatus for constructing a three-dimensional mesh based on a single image, the apparatus comprising: a computing module for determining the position where occlusion exists in the depth image corresponding to the input image; and a construction module for disconnecting the patch connection relationship between the foreground and the background at the position where the occlusion exists, and constructing a grid for the background part at the position where the occlusion exists.

According to a further aspect of the present invention there is provided a system for constructing a three-dimensional grid based on a single image, the system comprising a storage device and a processor, the storage device having stored thereon a computer program for execution by the processor, the computer program when executed by the processor performing the method of constructing a three-dimensional grid based on a single image as described in any one of the preceding claims.

According to a further aspect of the present invention, there is provided a storage medium having stored thereon a computer program which, when run, performs the method of constructing a three-dimensional grid based on a single image as described in any one of the above.

According to a further aspect of the present invention, there is provided a computer program for executing the method of constructing a three-dimensional grid based on a single image as defined in any one of the above, when executed by a computer or a processor, the computer program further being adapted to implement the modules of the apparatus of constructing a three-dimensional grid based on a single image as defined in any one of the above.

According to the method, the device and the system for constructing the three-dimensional grid based on the single image, the original patch connection relationship at the position where occlusion exists in the image is disconnected, and a grid is constructed for the occluded part, so that the obvious stretching effect can be effectively removed and the visual effect is improved.

Drawings

The above and other objects, features and advantages of the present invention will become more apparent from the following more particular description of embodiments of the present invention, as illustrated in the accompanying drawings. The accompanying drawings are included to provide a further understanding of embodiments of the invention and are incorporated in and constitute a part of this specification. Together with the embodiments of the invention, they serve to explain the invention and do not constitute a limitation of the invention. In the drawings, like reference numerals generally refer to like parts or steps.

FIG. 1 shows a schematic block diagram of an example electronic device for implementing a method, apparatus, and system for building a three-dimensional grid based on a single image in accordance with an embodiment of the invention;

FIG. 2 shows a schematic flow chart of a method of constructing a three-dimensional grid based on a single image in accordance with an embodiment of the invention;

FIG. 3 shows a schematic block diagram of an apparatus for constructing a three-dimensional grid based on a single image in accordance with an embodiment of the invention; and

FIG. 4 shows a schematic block diagram of a system for constructing a three-dimensional grid based on a single image in accordance with an embodiment of the invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, exemplary embodiments according to the present invention will be described in detail with reference to the accompanying drawings. It should be apparent that the described embodiments are only some embodiments of the present invention and not all embodiments of the present invention, and it should be understood that the present invention is not limited by the example embodiments described herein. Based on the embodiments of the invention described in the present application, all other embodiments that a person skilled in the art would have without inventive effort shall fall within the scope of the invention.

First, an example electronic device 100 for implementing the method, apparatus, and system for constructing a three-dimensional grid based on a single image of an embodiment of the present invention is described with reference to fig. 1.

As shown in fig. 1, electronic device 100 includes one or more processors 102, one or more storage devices 104, an input device 106, an output device 108, and an image capture device 110, which are interconnected by a bus system 112 and/or other forms of connection mechanisms (not shown). It should be noted that the components and structures of the electronic device 100 shown in fig. 1 are exemplary only and not limiting, as the electronic device may have other components and structures as desired.

The processor 102 may be a Central Processing Unit (CPU) or other form of processing unit having data processing and/or instruction execution capabilities, and may control other components in the electronic device 100 to perform desired functions.

The storage device 104 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, Random Access Memory (RAM) and/or cache memory, and the like. The non-volatile memory may include, for example, Read-Only Memory (ROM), hard disk, flash memory, and the like. One or more computer program instructions may be stored on the computer-readable storage medium and executed by the processor 102 to perform the functions of constructing a three-dimensional grid based on a single image and/or other desired functions in the embodiments of the invention described below (as implemented by the processor). Various applications and various data, such as data used and/or generated by the applications, may also be stored in the computer-readable storage medium.

The input device 106 may be a device used by a user to input instructions and may include one or more of a keyboard, mouse, microphone, touch screen, and the like.

The output device 108 may output various information (e.g., images or sounds) to the outside (e.g., a user), and may include one or more of a display, a speaker, and the like.

The image capture device 110 may capture images (e.g., photographs, videos, etc.) desired by a user and store the captured images in the storage device 104 for use by other components. It should be understood that the image capturing apparatus 110 is merely an example, and the electronic device 100 may not include the image capturing apparatus 110. In this case, an image may be acquired using a component having image acquisition capability, and the acquired image may be transmitted to the electronic device 100.

For example, the electronic device for implementing the method, apparatus and system for constructing a three-dimensional grid based on a single image according to embodiments of the present invention may be implemented as a terminal such as a smartphone, a tablet computer, or a vehicle-mounted terminal.

Next, a method 200 of constructing a three-dimensional mesh based on a single image according to an embodiment of the present invention will be described with reference to fig. 2. As shown in fig. 2, a method 200 of constructing a three-dimensional grid based on a single image may include the steps of:

In step S210, a position where occlusion exists in a depth image corresponding to an input image is determined.

In an embodiment of the present invention, the input image in step S210 may be a single RGB image acquired by an image acquisition device (e.g., a camera) in real time, or may be a single image from any source. The method constructs a three-dimensional grid based on the single image and takes into account the occlusion relationship at the boundaries of different objects in the image, so that the obvious stretching effect at those boundaries can be effectively removed. In an embodiment of the invention, the occlusion relationship at the boundaries of different objects in the image can be determined based on the depth image corresponding to the input image.

Specifically, the operation of determining a position where occlusion exists in the depth image corresponding to the input image may further include: for each pixel point in the depth image, calculating the depth difference between the pixel point and each adjacent pixel point; and if the depth difference between the pixel point and an adjacent pixel point is larger than a preset threshold value, determining that the pixel point is occluded in the direction toward that adjacent pixel point. In one example, the adjacent pixel points of a pixel point may include those in the four directions up, down, left and right, i.e., the pixel points directly adjacent to it (which may be referred to as its first-order neighboring pixels). In another example, the adjacent pixel points may also include those in other directions, such as the diagonal directions (which may be referred to as its second-order neighboring pixels). In other examples, the adjacent pixel points may include only some of the above; for example, a pixel point at an image boundary may have adjacent pixel points in only two directions.

Next, the case where the adjacent pixel points are those in the four directions up, down, left and right is described as an example. For any pixel point A in the depth image, let the adjacent pixel points above, below, to the left and to the right be pixel point B, pixel point C, pixel point D and pixel point E, respectively. The depth differences between pixel point A and each of pixel points B, C, D and E can be calculated. Assume that the depth differences between A and B and between A and D are less than or equal to the preset threshold, while the depth differences between A and C and between A and E are greater than the preset threshold. This indicates that pixel point A is occluded in the direction toward pixel point C (i.e., downward from A) and in the direction toward pixel point E (i.e., rightward from A). That is, pixel point A belongs to the background, pixel points C and E belong to the foreground, and part of the object where pixel point A is located is occluded by the objects where pixel points C and E are located. In this way, whether each pixel point is occluded in the direction of each adjacent pixel point can be calculated, which determines the positions where occlusion exists in the image.

In addition, a mask can be constructed according to whether each pixel point is occluded in the direction of each adjacent pixel point, with each entry of the mask assigned according to whether occlusion exists in the corresponding direction. Following the example above: there is no occlusion from pixel point A toward pixel point B, so the entry for the direction of B in the mask of pixel point A may be set to 0; there is occlusion from A toward C, so the entry for the direction of C may be set to 1; there is no occlusion from A toward D, so the entry for the direction of D may be set to 0; and there is occlusion from A toward E, so the entry for the direction of E may be set to 1. Based on this mask, the operations described later complete the construction of the final mesh once every entry in the mask of every pixel point in the image has become 0.
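The detection and mask construction described above can be sketched as follows. This is only a minimal illustration: the function name `occlusion_mask`, the dictionary layout of the mask and the use of a signed depth difference (so that only the background side of a depth edge is flagged, as in the example with pixel point A) are assumptions of this sketch, not details stated in the embodiment.

```python
# Direction names and offsets (row, column) are assumptions of this sketch.
DIRECTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def occlusion_mask(depth, threshold):
    """Return mask[i][j][name] = 1 if pixel (i, j) is occluded toward that neighbor.

    Assumption: a larger depth value means farther from the camera, and the
    signed difference depth[pixel] - depth[neighbor] is compared with the
    threshold, so only the background pixel of a depth edge is flagged.
    """
    rows, cols = len(depth), len(depth[0])
    mask = [[{} for _ in range(cols)] for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            for name, (di, dj) in DIRECTIONS.items():
                ni, nj = i + di, j + dj
                # Pixels on the image boundary simply have fewer neighbors.
                if 0 <= ni < rows and 0 <= nj < cols:
                    diff = depth[i][j] - depth[ni][nj]
                    mask[i][j][name] = 1 if diff > threshold else 0
    return mask


# Pixel A at (1, 1) is background; C (below) and E (right) are foreground.
depth = [
    [5.0, 5.0, 5.0],
    [5.0, 5.0, 1.0],
    [5.0, 1.0, 1.0],
]
mask = occlusion_mask(depth, threshold=0.5)
print(mask[1][1])
```

With this toy depth map, the mask of pixel A has entries of 1 for the downward and rightward directions and 0 for the other two, mirroring the B, C, D, E example.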

With continued reference now to fig. 2, at step S220, the patch connection relationship between foreground and background is broken at the occlusion-present location, and a mesh is constructed for the background portion at the occlusion-present location.

In an embodiment of the present invention, for the position where occlusion exists in the depth image determined in step S210, the original patch connection relationship at that position may be disconnected. Here, the original patch connection relationship is the patch connection relationship at that position in a mesh constructed without considering the occlusion relationship in the image. Because the method considers the occlusion relationship in the image, disconnects the original patch connection relationship between the foreground and the background at the position where occlusion exists, and constructs a grid for the background portion at that position to fill in the occluded part, the stretching effect can be effectively removed and the visual effect is improved.

Specifically, the operations of disconnecting the patch connection relationship between the foreground and the background at the position where occlusion exists, and constructing the mesh for the background portion at that position, may include: disconnecting the patch connection relationship between the pixel point and an adjacent pixel point (here, an adjacent pixel point for which an occlusion relationship exists between the object where the pixel point is located and the object where the adjacent pixel point is located), and expanding vertices from the pixel point in the direction of that adjacent pixel point to construct the grid, until the pixel point is no longer occluded in that direction. This is described below with reference to the previous example.

As described above, for any pixel point A in the depth image, the adjacent pixel points above, below, to the left and to the right are pixel point B, pixel point C, pixel point D and pixel point E, respectively. If the depth difference between A and C and the depth difference between A and E are greater than the preset threshold, then: the patch connection relationship between pixel point A and pixel point C is disconnected, and vertices are expanded from A in the direction of C (i.e., downward from A) to construct patches and thus the grid, until there is no longer occlusion in the direction from A to C; and the patch connection relationship between pixel point A and pixel point E is disconnected, and vertices are expanded from A in the direction of E (i.e., rightward from A) to construct patches and thus the grid, until there is no longer occlusion in the direction from A to E.
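The disconnection step can be illustrated with a small triangulation sketch. It assumes a regular grid in which every 2x2 block of pixels normally yields two triangular patches, and it simply leaves out any block that straddles an occlusion edge; the function name `build_triangles` and the vertex numbering are hypothetical choices for illustration, not the embodiment's data structures.

```python
def build_triangles(depth, threshold):
    """Triangulate the pixel grid, leaving out blocks that straddle an occlusion.

    Vertices are numbered i * cols + j. A 2x2 block of pixels is skipped when
    any pair of its horizontally or vertically adjacent corners differs in
    depth by more than the threshold, which breaks the foreground/background
    patch connection at that place.
    """
    rows, cols = len(depth), len(depth[0])

    def connected(a, b):
        return abs(depth[a[0]][a[1]] - depth[b[0]][b[1]]) <= threshold

    def index(p):
        return p[0] * cols + p[1]

    triangles = []
    for i in range(rows - 1):
        for j in range(cols - 1):
            tl, tr = (i, j), (i, j + 1)
            bl, br = (i + 1, j), (i + 1, j + 1)
            edges = [(tl, tr), (bl, br), (tl, bl), (tr, br)]
            if all(connected(a, b) for a, b in edges):
                triangles.append((index(tl), index(bl), index(tr)))
                triangles.append((index(tr), index(bl), index(br)))
    return triangles


# The right-hand column is foreground, so the block touching it is dropped.
triangles = build_triangles([[5.0, 5.0, 1.0], [5.0, 5.0, 1.0]], threshold=0.5)
print(triangles)
```

The gap left by the skipped block is what the subsequent vertex expansion fills with background patches.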

Further, in the embodiment of the present invention, when a patch is constructed for the background portion at the position where occlusion exists, a new layer of mesh is created for constructing the patch whenever the position where the patch needs to be constructed is already occupied. A multi-layer mesh is thereby constructed, which solves the problem of missing patches that arises when only a single-layer or double-layer mesh is constructed. Further, in an embodiment of the present invention, before a patch is constructed for the background portion at the position where occlusion exists, the maximum number of patches that need to be constructed for the occluded portion may be determined. When patches are built in a given direction, this maximum can then serve as a stop condition for that direction, which effectively reduces the generation of useless patches, greatly reduces the total number of patches and improves construction efficiency. This is described below with reference to the previous example.

Following the above example, vertices are expanded from pixel point A in the direction of pixel point C (i.e., downward from A) to construct patches and thus the mesh, until there is no longer occlusion in that direction. Each time a vertex is expanded in this direction, it is determined whether the vertex can be connected with a pixel point adjacent to the vertex, or whether the maximum expansion boundary of the direction (i.e., the direction from A to C) has been reached. If either condition holds, the expansion in this direction ends. At this time, the mask corresponding to pixel point A may be updated so that the entry indicating the direction from A to C is set to 0, and expansion then continues in a direction whose entry in the mask of pixel point A is still 1. In the above example, this is the direction from pixel point A to pixel point E. Similarly, vertices are expanded from A in the direction of E (i.e., rightward from A) to construct patches and thus the mesh, until there is no longer occlusion in that direction. Each time a vertex is expanded in this direction, it is determined whether the vertex can be connected with a pixel point adjacent to the vertex, or whether the maximum expansion boundary of the direction (i.e., the direction from A to E) has been reached; if either condition holds, the expansion in this direction ends. At that point, expansion has ended in every occluded direction of pixel point A. The same procedure is applied until expansion has ended in every occluded direction of every pixel point.
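The expansion along a single occluded direction, with its two stop conditions, might look like the sketch below. Several points are assumptions made only for illustration: each new vertex inherits the depth of the starting pixel, "can be connected" is tested against the original pixel one further step along the direction, and the function name `expand_direction` and its parameters are hypothetical.

```python
def expand_direction(depth, start, step, threshold, max_steps):
    """Extend background vertices from `start` along `step` and return them.

    Stops when a new vertex can connect to existing background (depth
    difference within `threshold`) or when `max_steps`, standing in for the
    maximum expansion boundary, is reached. Each returned vertex is
    (row, column, depth), with the depth copied from the start pixel.
    """
    rows, cols = len(depth), len(depth[0])
    z = depth[start[0]][start[1]]
    i, j = start
    vertices = []
    for _ in range(max_steps):
        i, j = i + step[0], j + step[1]
        if not (0 <= i < rows and 0 <= j < cols):
            break  # ran off the image
        vertices.append((i, j, z))
        ni, nj = i + step[0], j + step[1]
        in_bounds = 0 <= ni < rows and 0 <= nj < cols
        if in_bounds and abs(z - depth[ni][nj]) <= threshold:
            break  # the new vertex can be connected; this direction is done
    return vertices


# Background (5.0) hidden behind two foreground pixels (1.0), then visible again.
row = [[5.0, 1.0, 1.0, 5.0, 5.0]]
print(expand_direction(row, start=(0, 0), step=(0, 1), threshold=0.5, max_steps=10))
```

In this toy row the expansion adds two hidden background vertices and stops as soon as the next pixel is background again; with `max_steps=1` it would stop after one vertex regardless.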

In the above example, each time a vertex is expanded from pixel point A in the direction of pixel point C (or in the direction of pixel point E), it is determined whether the vertex can be connected with a pixel point adjacent to the vertex. This can be decided using the preset threshold described in step S210. As before, the depth difference between the vertex and each of its adjacent pixel points may be calculated. When the depth difference between the vertex and an adjacent pixel point is less than or equal to the preset threshold, the vertex can be connected with that adjacent pixel point. Otherwise, when the depth difference is greater than the preset threshold, the vertex cannot be connected with that adjacent pixel point; the expansion is then not complete, and vertices still need to be expanded in this direction.

In the above example, the maximum expansion boundary in the direction from pixel point A to pixel point C is the maximum number of vertices that need to be expanded in that direction. This maximum may be determined based on an imaging model of the camera that captured the input image, the internal parameters of the camera, and preset parameters for simulating the amplitude of the position change of the camera. Illustratively, the following formula may be employed to calculate the maximum number of vertices that need to be expanded in the direction from pixel point A to pixel point C:

Here, z₁ is the depth value of pixel point A, z₂ is the depth value of pixel point C, (i, j) is the position coordinate of pixel point A, f_x, f_y, c_x and c_y are the internal parameters of the camera that captured the input image, and dx, dy and dz are preset parameters for simulating the amplitude of the position change of the camera, i.e., the maximum amplitude by which the extrinsic parameters are expected to change relative to the original camera. Since dz is negligible relative to z (typically centimeters versus meters), the above equation can be simplified as:

The maximum number of vertices that need to be expanded in the direction from pixel point A to pixel point C can be calculated by this formula, which reduces the generation of useless vertices and useless patches and improves efficiency. Similarly, the maximum expansion boundary in the direction from pixel point A to pixel point E is the maximum number of vertices that need to be expanded in that direction, and it can be calculated by the same formula simply by replacing z₂ with the depth value of pixel point E.
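As a rough illustration of how such a bound can arise, the sketch below uses a generic pinhole-camera parallax estimate. It is not the formula of the embodiment: it assumes a pure sideways camera translation with dz neglected, in which case a point at depth z moves by about f * d / z pixels, and the background strip revealed at a depth edge is the difference between the foreground and background displacements. The function name and parameter names are hypothetical.

```python
import math


def max_expansion_steps(z_background, z_foreground, focal_px, shift):
    """Upper bound, in pixels, on the background strip revealed by a camera shift.

    Assumed model: pinhole camera, translation `shift` parallel to the image
    plane (dx for left/right, dy for up/down), focal length `focal_px` in
    pixels (f_x or f_y), and the depth shift dz neglected.
    """
    parallax = focal_px * abs(shift) * abs(1.0 / z_foreground - 1.0 / z_background)
    return math.ceil(parallax)


# Background at 4 m, foreground at 1 m, 1000 px focal length, 6.25 cm shift.
print(max_expansion_steps(4.0, 1.0, 1000.0, 0.0625))
```

Under these assumptions the bound grows with the camera shift and with the gap between the inverse depths, and it is zero when the two pixels lie at the same depth.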

Further, in an embodiment of the present invention, the operation of expanding vertices in the direction from the pixel point to an adjacent pixel point further includes: before expanding a vertex in the direction, determining whether the position of the vertex to be expanded is already occupied by another point, and if so, creating a new layer of mesh for the vertex expansion. Still referring to the above example, each time a vertex is expanded from pixel point A in the direction of pixel point C (or in the direction of pixel point E), it may be determined whether the position of the expanded vertex is already occupied by another point (for example, another pixel point may already have expanded a vertex to that position, or a pixel point may originally exist at that position), and if so, a new layer of mesh is created for the vertex expansion. After the new layer is created and the vertex is expanded, it is still necessary, as described above, to determine whether the vertex can be connected with a pixel point adjacent to the vertex or whether the maximum expansion boundary of the direction has been reached. If either condition holds, the expansion in that direction ends. This is iterated until expansion has ended in every occluded direction of every pixel point.
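The "create a new layer when the position is occupied" rule can be sketched with a simple per-position stack. The class `LayeredGrid` and its method names are hypothetical and only illustrate the bookkeeping; the embodiment does not specify how layers are stored.

```python
class LayeredGrid:
    """Maps a grid position to the vertices stacked there, one per layer."""

    def __init__(self):
        # (row, column) -> list of depth values; list index = layer number.
        self.layers = {}

    def add_vertex(self, position, depth_value):
        """Place a vertex, opening a new layer if the position is occupied.

        Returns the index of the layer that received the vertex (0 is the
        first layer at that position).
        """
        stack = self.layers.setdefault(position, [])
        stack.append(depth_value)
        return len(stack) - 1

    def layer_count(self):
        """Number of layers needed so far (the deepest stack)."""
        return max((len(s) for s in self.layers.values()), default=0)


grid = LayeredGrid()
first = grid.add_vertex((2, 3), 1.0)   # original foreground pixel
second = grid.add_vertex((2, 3), 5.0)  # expanded background vertex, same position
third = grid.add_vertex((2, 3), 7.0)   # a second background surface behind it
print(first, second, third, grid.layer_count())
```

Layers are created only where positions actually collide, which is how a multi-layer structure can cover places where several depth boundaries overlap without allocating a full extra layer everywhere.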

In general, the method for constructing the three-dimensional grid based on the single image provided by the embodiment of the invention proceeds as follows. First, it is determined whether each pixel point in the depth image corresponding to the input image is occluded in the direction of each adjacent pixel point, which yields the mask of pixel points whose vertices need to be expanded. Then, the maximum number of vertices to be expanded in each occluded direction of each pixel point is determined based on an imaging model of the camera that captured the input image, the internal parameters of the camera, and preset parameters for simulating the amplitude of the position change of the camera. Finally, based on the mask, the expansion directions and the maximum expansion boundary of each direction, one vertex is expanded in the corresponding direction for each point in the mask; if the target position is occupied, a new layer is created for the expanded vertex. It is then calculated whether the expanded vertex can be connected with the four points around its position (judged by the depth-difference threshold), and the mask is updated (an entry is set to 0 when the expanded vertex is connected with the four surrounding points or the maximum expansion boundary is reached, and remains 1 otherwise), together with the expansion directions and the corresponding maximum boundaries. This is iterated until the mask is all 0, which completes the construction of the final grid.

Based on the above description, the method for constructing a three-dimensional grid based on a single image according to the embodiment of the invention disconnects the original patch connection relationship at positions in the image where occlusion exists and constructs a mesh for the occluded part, so that the obvious stretching effect can be effectively removed and the visual effect improved. In addition, the method constructs a multi-layer mesh, which effectively solves the problem that a double-layer mesh leaves triangular patches missing at boundaries between several different depths. Furthermore, when constructing the mesh, the method obtains the size of the part that is occluded in a given direction based on the camera imaging model and the internal and external parameters of the camera, and thereby the maximum number of triangular patches that need to be constructed in that direction for the mesh layer, which effectively reduces the number of triangular patches and improves the effective utilization rate of the mesh.

The method of constructing a three-dimensional mesh based on a single image according to an embodiment of the present invention is exemplarily described above. Illustratively, the method of constructing a three-dimensional grid based on a single image according to an embodiment of the present invention may be implemented in a device, apparatus or system having a memory and a processor.

In addition, the method for constructing the three-dimensional grid based on the single image can be conveniently deployed on mobile equipment such as a smart phone, a tablet personal computer and a personal computer. Alternatively, the method for constructing the three-dimensional grid based on the single image according to the embodiment of the invention can be deployed at the server side (or cloud end). Alternatively, the method for constructing the three-dimensional grid based on the single image according to the embodiment of the invention can be distributed and deployed at the server (or cloud) and the personal terminal.

An apparatus for constructing a three-dimensional mesh based on a single image according to another aspect of the present invention is described below with reference to fig. 3. Fig. 3 shows a schematic block diagram of an apparatus 300 for constructing a three-dimensional grid based on a single image according to an embodiment of the invention.

As shown in fig. 3, an apparatus 300 for constructing a three-dimensional grid based on a single image according to an embodiment of the present invention includes a calculation module 310 and a construction module 320. The computing module 310 is configured to determine a position where an occlusion exists in a depth image corresponding to the input image. The construction module 320 is configured to break the patch connection relationship between the foreground and the background at the location where the occlusion exists, and construct a mesh for the background portion at the location where the occlusion exists. The various modules may perform the various steps/functions of the method of constructing a three-dimensional grid based on a single image described above in connection with fig. 2, respectively. Only the main functions of each module of the apparatus 300 for constructing a three-dimensional mesh based on a single image will be described below, and the details already described above will be omitted.

In embodiments of the present invention, the input image may be a single RGB image acquired in real-time by an image acquisition device (e.g., a camera) or may be a single image from any source. In an embodiment of the present invention, the computing module 310 may determine occlusion relationships at boundaries of different objects in an image based on depth images corresponding to an input image.

Specifically, the operation of the calculation module 310 to determine a location in the depth image corresponding to the input image where an occlusion exists may further include: for each pixel point in the depth image, calculating the depth difference between the pixel point and each adjacent pixel point; and if the depth difference between the pixel point and any adjacent pixel point is larger than a preset threshold, determining that the pixel point is occluded in the direction of that adjacent pixel point. In one example, the adjacent pixel points of a pixel point may include its adjacent pixel points in the four directions up, down, left and right, that is, the pixel points one pixel away from it (which may be referred to as one-neighborhood pixel points). In another example, the adjacent pixel points may include adjacent pixel points in other directions, such as the oblique (diagonal) directions, that is, pixel points two pixels away from it (referred to as two-neighborhood pixel points). In other examples, the adjacent pixel points may include only some of the above; for example, a pixel point on the image boundary may have adjacent pixel points in only two directions.

The following description takes adjacent pixel points in the four directions up, down, left and right as an example. For any pixel point A in the depth image, let its adjacent pixel points in the up, down, left and right directions be pixel point B, pixel point C, pixel point D and pixel point E, respectively. The calculation module 310 may calculate the depth difference between pixel point A and each of pixel points B, C, D and E. Assume that the depth differences between A and B and between A and D are smaller than or equal to the preset threshold, while the depth differences between A and C and between A and E are larger than the preset threshold. This indicates that pixel point A is occluded in the direction of pixel point C (i.e., downward from A) and in the direction of pixel point E (i.e., rightward from A). That is, pixel point A belongs to the background, pixel points C and E belong to the foreground, and part of the object on which pixel point A is located is occluded by the objects on which pixel points C and E are located. In this way, the calculation module 310 may calculate whether each pixel point is occluded in the direction of each of its adjacent pixel points, i.e., determine the positions in the image where occlusion exists.
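The per-direction test described above can be sketched as follows. This is an illustrative sketch only: the function name `occluded_directions` and the list-of-lists depth layout are hypothetical, and the depth difference is assumed to be signed (pixel minus neighbor), so that only the farther, background pixel is flagged, consistent with the example in which A is background and C and E are foreground.

```python
# (row, col) offsets for the four neighbors of the example:
# B = up, C = down, D = left, E = right.
DIRECTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def occluded_directions(depth, threshold):
    """Return {(row, col): [direction, ...]} for every pixel that is
    occluded in at least one direction."""
    rows, cols = len(depth), len(depth[0])
    result = {}
    for r in range(rows):
        for c in range(cols):
            hits = []
            for name, (dr, dc) in DIRECTIONS.items():
                nr, nc = r + dr, c + dc
                # Pixels on the image border simply have fewer neighbors.
                if 0 <= nr < rows and 0 <= nc < cols:
                    # Assumption: signed difference, so only the farther
                    # (background) pixel is marked as occluded.
                    if depth[r][c] - depth[nr][nc] > threshold:
                        hits.append(name)
            if hits:
                result[(r, c)] = hits
    return result
```

For a 3x3 depth image whose lower-right corner is a near object, the center pixel plays the role of pixel A and is reported as occluded downward and rightward.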

In addition, the calculation module 310 may construct a mask for each pixel point, with one flag for the direction of each adjacent pixel point, and assign each flag a value according to whether occlusion exists in that direction. Following the example above: since pixel point A is not occluded in the direction of pixel point B, the flag for the direction of B in the mask of A may be set to 0; since A is occluded in the direction of C, the flag for the direction of C may be set to 1; since A is not occluded in the direction of D, the flag for the direction of D may be set to 0; and since A is occluded in the direction of E, the flag for the direction of E may be set to 1. Based on these flags, the operation of the construction module 320 described below completes the construction of the final mesh once every flag in the mask of every pixel point in the image has been set to 0.

In an embodiment of the present invention, for the location where the computing module 310 determines that there is an occlusion in the resulting depth image, the constructing module 320 may break the original patch connection relationship at the location where there is an occlusion in the depth image. Here, the original patch connection relationship may be a patch connection relationship at that position in the mesh constructed without considering the occlusion relationship in the image. Because the invention considers the shielding relation in the image, breaks the original connection relation between the foreground and the background at the position where the shielding exists, and constructs the grid aiming at the background part at the position where the shielding exists to supplement the shielded part, the stretching effect can be effectively removed, and the visual effect is improved.

Specifically, the operation of the construction module 320 to break the patch connection relationship between foreground and background at the location where occlusion exists and construct a mesh for the background portion at that location may include: disconnecting the patch connection relationship between the pixel point and the adjacent pixel point in question (that is, an adjacent pixel point whose object has an occlusion relationship with the object on which the pixel point is located), and expanding vertices at the pixel point in the direction of that adjacent pixel point to construct a mesh, until the pixel point is no longer occluded in that direction. This is described below with reference to the previous example.

As described above, for any pixel point A in the depth image, the adjacent pixel points in the up, down, left and right directions are pixel points B, C, D and E. Assuming that the calculation module 310 determines by calculation that the depth difference between A and C and the depth difference between A and E are both greater than the preset threshold, the construction module 320 may: disconnect the patch connection relationship between pixel point A and pixel point C, and expand vertices from A in the direction of C (i.e., downward from A) to construct patches and thereby the mesh, until A is no longer occluded in the direction of C; and disconnect the patch connection relationship between A and E, and expand vertices from A in the direction of E (i.e., rightward from A) to construct patches and thereby the mesh, until A is no longer occluded in the direction of E.
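The disconnection step can be illustrated with a small triangulation routine that omits any triangle crossing a depth discontinuity. This is a sketch under assumptions: the function name `build_faces`, the row-major vertex numbering and the two-triangles-per-quad layout are hypothetical choices, and vertex expansion for the background is not shown here.

```python
def build_faces(depth, threshold):
    """Triangulate a depth grid, dropping every triangle that has an edge
    whose endpoints differ in depth by more than `threshold`."""
    rows, cols = len(depth), len(depth[0])

    def vertex_id(p):
        # Row-major index of the vertex at pixel p = (row, col).
        return p[0] * cols + p[1]

    def linked(p, q):
        # Two pixels may share a patch edge only across a small depth step.
        return abs(depth[p[0]][p[1]] - depth[q[0]][q[1]]) <= threshold

    faces = []
    for r in range(rows - 1):
        for c in range(cols - 1):
            tl, tr = (r, c), (r, c + 1)
            bl, br = (r + 1, c), (r + 1, c + 1)
            # Each pixel quad is split into two triangles.
            for p, q, s in ((tl, bl, tr), (tr, bl, br)):
                if linked(p, q) and linked(q, s) and linked(p, s):
                    faces.append((vertex_id(p), vertex_id(q), vertex_id(s)))
    return faces
```

In a 2x2 image where only the lower-right pixel is foreground, the triangle touching that pixel is dropped, so no patch stretches between foreground and background.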

Further, in an embodiment of the present invention, when the construction module 320 constructs patches for the background portion at a position where occlusion exists, a new mesh layer is created for constructing the patch whenever the position where the patch needs to be constructed is already occupied. A multi-layer mesh is thereby constructed, which solves the problem of missing patches that arises when only a single-layer or double-layer mesh is constructed. Further, in an embodiment of the present invention, before constructing patches for the background portion at the position where occlusion exists, the construction module 320 may determine the maximum number of patches that need to be constructed for the occluded portion. When patches are constructed in a given direction, this maximum can then serve as a stop condition for that direction, which effectively reduces the generation of useless patches, greatly reduces the total number of patches, and improves construction efficiency. This is described below with reference to the previous example.

Following the example above, the construction module 320 expands vertices from pixel point A in the direction of pixel point C (i.e., downward from A) to construct patches and thereby the mesh, until A is no longer occluded in the direction of C. Each time a vertex is expanded in the direction from A to C, the construction module 320 determines whether the vertex can be connected to a pixel point adjacent to the vertex, or whether the maximum expansion boundary of that direction has been reached. If either condition holds, expansion in that direction ends. At this point the mask of pixel point A may be updated so that the flag for the direction from A to C is set to 0, and vertices are then expanded in the directions whose flags in the mask of A are still 1. In the above example, the construction module 320 next expands vertices from A in the direction of pixel point E (i.e., rightward from A) in the same manner, until A is no longer occluded in the direction of E. Each time a vertex is expanded in the direction from A to E, the construction module 320 again determines whether the vertex can be connected to a pixel point adjacent to the vertex, or whether the maximum expansion boundary of that direction has been reached; if either condition holds, expansion in that direction ends. Expansion has then ended in all occluded directions of pixel point A. The same process is repeated until expansion has ended in every occluded direction of every pixel point.

In the above example, each time pixel point A expands a vertex in the direction of pixel point C (or in the direction of pixel point E), it is determined whether the vertex can be connected to a pixel point adjacent to the vertex; the preset threshold used by the calculation module 310 may be reused for this determination. As above, the construction module 320 may calculate the depth difference between the vertex and its adjacent pixel points. When the depth difference between the vertex and an adjacent pixel point is less than or equal to the preset threshold, the vertex can be connected to that adjacent pixel point. Conversely, when the depth difference is greater than the preset threshold, the vertex cannot be connected to that adjacent pixel point; expansion is then not yet complete, and vertices must continue to be expanded in this direction.
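Expansion in a single direction, with its two stop conditions, might look like the sketch below. Several points are assumptions for illustration and are not stated in the description: the function name `expand_direction`, the rule that each new vertex keeps the depth of pixel A, and the simplification that connectability is tested only against the next original pixel ahead of the new vertex rather than all surrounding points.

```python
def expand_direction(depth, start, step, threshold, max_steps):
    """Expand vertices from `start` along `step` = (d_row, d_col).

    Returns the created vertices as (row, col, depth) tuples. Expansion
    stops when a new vertex can connect to the next original pixel ahead
    of it, when `max_steps` vertices exist, or at the image border.
    """
    rows, cols = len(depth), len(depth[0])
    r, c = start
    z = depth[r][c]          # assumption: new vertices keep A's depth
    created = []
    while len(created) < max_steps:       # maximum expansion boundary
        r, c = r + step[0], c + step[1]
        if not (0 <= r < rows and 0 <= c < cols):
            break                         # ran off the image
        created.append((r, c, z))
        ahead_r, ahead_c = r + step[0], c + step[1]
        if 0 <= ahead_r < rows and 0 <= ahead_c < cols:
            if abs(z - depth[ahead_r][ahead_c]) <= threshold:
                break                     # connectable: direction finished
    return created
```

On a single row with depths 5, 1, 1, 5, expanding rightward from the first pixel creates two background vertices under the foreground pixels and then stops, because the second one can connect to the background pixel that follows.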

In the above example, the maximum expansion boundary in the direction from the pixel point a to the pixel point C is the maximum value of the number of vertices required for expansion in the direction from the pixel point a to the pixel point C. Wherein the maximum value may be determined by the calculation module 310 based on an imaging model of a camera capturing the input image, internal parameters of the camera, and preset parameters for simulating the magnitude of the positional change of the camera. Illustratively, the calculation module 310 may calculate the maximum value of the number of vertices required for extending in the direction of the pixel point a toward the pixel point C using the following formula:

where z₁ is the depth value of pixel point A, z₂ is the depth value of pixel point C, (i, j) is the position coordinate of pixel point A, f_x, f_y, c_x and c_y are internal parameters of the camera that captured the input image, and dx, dy and dz are preset parameters for simulating the amplitude of the change in camera position, that is, the maximum expected change of the camera's external parameters relative to the original camera position. Since dz is negligible relative to z (typically centimeters versus meters), the above formula can be simplified as:
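The formulas of the embodiment are not reproduced in this text, and the sketch below does not attempt to reconstruct them. As an independent illustration of how such a bound can arise, it uses the standard pinhole parallax relation for a purely sideways camera translation, under which a point at depth z shifts by focal·shift/z pixels; the function name `max_expansion_steps` and its parameters are hypothetical.

```python
import math


def max_expansion_steps(z_background, z_foreground, focal_px, shift):
    """Rough upper bound on the number of background pixels revealed next
    to an occluding edge when a pinhole camera translates sideways by
    `shift` (same unit as the depths). NOT the embodiment's formula:
    a standard parallax estimate that ignores motion along the optical axis.
    """
    if z_foreground >= z_background:
        return 0  # a farther surface hides nothing of a nearer one
    # Foreground and background shift by focal*shift/z pixels each;
    # the gap that opens between them is the difference of the two shifts.
    gap_px = focal_px * abs(shift) * (1.0 / z_foreground - 1.0 / z_background)
    return math.ceil(gap_px)
```

For example, with a focal length of 500 pixels, a 0.05 m shift, foreground at 1 m and background at 2 m, the gap is 12.5 pixels, so at most 13 vertices would be needed in that direction under these assumptions.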

Further, in an embodiment of the present invention, the operation of the construction module 320 of expanding vertices in the direction from the pixel point to any one of the adjacent pixel points further includes: before expanding a vertex in that direction, determining whether the position of the vertex to be expanded is already occupied by another point, and if so, creating a new mesh layer for the vertex expansion. Still referring to the example above, each time pixel point A expands one vertex in the direction of pixel point C (or in the direction of pixel point E), the construction module 320 can determine whether the position of the expanded vertex is already occupied by another point (for example, another pixel point may already have expanded a vertex to that position, or a pixel point may originally exist at that position), and if so, create a new mesh layer for the vertex expansion. After the new mesh layer is created and the vertex is expanded, the construction module 320 still needs to determine, as described above, whether the vertex can be connected to a pixel point adjacent to the vertex or whether the maximum expansion boundary of the direction has been reached. If the vertex can be connected to an adjacent pixel point, or the maximum expansion boundary of the direction has been reached, expansion in that direction ends. This is iterated until expansion has ended in every occluded direction of every pixel point.
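The occupancy check and layer creation can be sketched with a small container class. The class name `LayeredMesh`, the dictionary-per-layer storage and the choice to reuse the first layer in which the position is free are assumptions for illustration; the description only states that a new layer is created when the position is already occupied.

```python
class LayeredMesh:
    """Vertices stored per layer; layer 0 holds the original pixels."""

    def __init__(self, depth):
        self.layers = [{(r, c): z
                        for r, row in enumerate(depth)
                        for c, z in enumerate(row)}]

    def add_vertex(self, pos, z):
        """Place a vertex at `pos` in the first layer where that position
        is free, creating a new layer if every existing layer is occupied
        there. Returns the index of the layer used."""
        for index, layer in enumerate(self.layers):
            if pos not in layer:
                layer[pos] = z
                return index
        self.layers.append({pos: z})      # position occupied everywhere
        return len(self.layers) - 1
```

Expanding a background vertex onto a position already held by a foreground pixel lands in layer 1, and a second expansion onto the same position creates layer 2.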

In general, the device for constructing a three-dimensional grid based on a single image according to the embodiment of the invention operates as follows. First, it determines whether each pixel point in the depth image corresponding to the input image is occluded in the direction of each adjacent pixel point, which yields a mask of the pixel points whose vertices need to be expanded. Next, it determines the maximum number of vertices to be expanded in each occluded direction of each pixel point based on an imaging model of the camera that captured the input image, the internal parameters of the camera, and preset parameters for simulating the amplitude of the change in camera position. Finally, based on the mask, the expansion directions, and the maximum expansion boundary corresponding to each expansion direction, it expands one vertex in the corresponding direction for each point in the mask; if the position is already occupied, a new layer is created for the expanded vertex. It then calculates whether the expanded vertex can be connected to the four points surrounding its position (judged by the depth-difference threshold), and updates the mask, the expansion directions and the corresponding maximum expansion boundaries (a mask value is set to 0 when the expanded vertex is connected to the four surrounding points or the maximum expansion boundary is reached, and remains 1 otherwise). This is iterated until the mask is all 0, which completes the construction of the final mesh.

Based on the above description, the device for constructing a three-dimensional grid based on a single image according to the embodiment of the invention disconnects the original patch connection relationship at positions in the image where occlusion exists and constructs a mesh for the occluded part, so that the obvious stretching effect can be effectively removed and the visual effect improved. In addition, the device constructs a multi-layer mesh, which effectively solves the problem that a double-layer mesh leaves triangular patches missing at boundaries between several different depths. Furthermore, when constructing the mesh, the device obtains the size of the part that is occluded in a given direction based on the camera imaging model and the internal and external parameters of the camera, and thereby the maximum number of triangular patches that need to be constructed in that direction for the mesh layer, which effectively reduces the number of triangular patches and improves the effective utilization rate of the mesh.

Fig. 4 shows a schematic block diagram of a system 400 for constructing a three-dimensional grid based on a single image, according to an embodiment of the invention. The system 400 for constructing a three-dimensional grid based on a single image includes a storage device 410 and a processor 420.

Wherein the storage means 410 stores a program for implementing the respective steps in the method of constructing a three-dimensional grid based on a single image according to an embodiment of the present invention. The processor 420 is configured to run a program stored in the storage device 410 to perform respective steps of a method of constructing a three-dimensional grid based on a single image according to an embodiment of the present invention, and to implement respective modules in an apparatus of constructing a three-dimensional grid based on a single image according to an embodiment of the present invention. Furthermore, the system 400 for constructing a three-dimensional grid based on a single image may further comprise an image acquisition device (not shown in fig. 4) which may be used for acquiring said input image. Of course, the image acquisition device is not required, and the system 400 for constructing a three-dimensional grid based on a single image may acquire the input image from other external image acquisition devices.

In one embodiment of the invention, the program, when executed by the processor 420, causes the system 400 for constructing a three-dimensional grid based on a single image to perform the steps of: determining a position where shielding exists in a depth image corresponding to an input image; and disconnecting the patch connection relationship between the foreground and the background at the position with the occlusion, and constructing a grid for the background part at the position with the occlusion.

In one embodiment of the invention, constructing a mesh for the background portion at the location where occlusion exists, which the program causes the system 400 for constructing a three-dimensional grid based on a single image to perform when executed by the processor 420, includes: when constructing a patch for the background portion at the location where occlusion exists, creating a new mesh layer for constructing the patch whenever the position where the patch needs to be constructed is already occupied.

In one embodiment of the invention, the program, when executed by the processor 420, further causes the system 400 for constructing a three-dimensional grid based on a single image to perform the following step: before constructing patches for the background portion at the location where occlusion exists, determining the maximum number of patches that need to be constructed for the occluded portion.

In one embodiment of the present invention, determining the location where occlusion exists in the depth image corresponding to the input image, which the program causes the system 400 for constructing a three-dimensional grid based on a single image to perform when executed by the processor 420, includes: for each pixel point in the depth image, calculating the depth difference between the pixel point and each adjacent pixel point; and if the depth difference between the pixel point and any adjacent pixel point is larger than a preset threshold, determining that the pixel point is occluded in the direction of that adjacent pixel point.

In one embodiment of the invention, breaking the patch connection relationship between foreground and background at the location where occlusion exists and constructing a mesh for the background portion at that location, which the program causes the system 400 for constructing a three-dimensional grid based on a single image to perform when executed by the processor 420, includes: disconnecting the patch connection relationship between the pixel point and the adjacent pixel point, and expanding vertices at the pixel point in the direction of that adjacent pixel point to construct a mesh, until the pixel point is no longer occluded in that direction.

In one embodiment of the present invention, expanding vertices at the pixel point in the direction of the adjacent pixel point, which the program causes the system 400 for constructing a three-dimensional grid based on a single image to perform when executed by the processor 420, includes: each time a vertex is expanded in that direction, determining whether the vertex can be connected to a pixel point adjacent to the vertex or whether the maximum expansion boundary of the direction has been reached; and if the vertex can be connected to an adjacent pixel point or the maximum expansion boundary of the direction has been reached, ending expansion in that direction, and iterating in this way until expansion has ended in every occluded direction of every pixel point.

In one embodiment of the present invention, expanding vertices at the pixel point in the direction of the adjacent pixel point, which the program causes the system 400 for constructing a three-dimensional grid based on a single image to perform when executed by the processor 420, further includes: before expanding a vertex in that direction, determining whether the position of the vertex to be expanded is already occupied by another point, and if so, creating a new mesh layer for the vertex expansion.

Furthermore, according to an embodiment of the present invention, there is also provided a storage medium on which program instructions are stored, which program instructions, when being executed by a computer or a processor, are adapted to carry out the respective steps of the method of constructing a three-dimensional grid based on a single image of an embodiment of the present invention, and to carry out the respective modules in the apparatus of constructing a three-dimensional grid based on a single image according to an embodiment of the present invention. The storage medium may include, for example, a memory card of a smart phone, a memory component of a tablet computer, a hard disk of a personal computer, read-only memory (ROM), erasable programmable read-only memory (EPROM), portable compact disc read-only memory (CD-ROM), USB memory, or any combination of the foregoing storage media. The computer-readable storage medium may be any combination of one or more computer-readable storage media.

In an embodiment, the computer program instructions may implement the respective functional modules of the apparatus for constructing a three-dimensional grid based on a single image according to an embodiment of the present invention, and/or may perform the method for constructing a three-dimensional grid based on a single image according to an embodiment of the present invention.

In one embodiment of the invention, the computer program instructions, when executed by a computer or processor, cause the computer or processor to perform the steps of: determining a position where shielding exists in a depth image corresponding to an input image; and disconnecting the patch connection relationship between the foreground and the background at the position with the occlusion, and constructing a grid for the background part at the position with the occlusion.

In one embodiment of the invention, constructing a mesh for the background portion at the location where occlusion exists, which the computer program instructions cause the computer or processor to perform when executed, includes: when constructing a patch for the background portion at the location where occlusion exists, creating a new mesh layer for constructing the patch whenever the position where the patch needs to be constructed is already occupied.

In one embodiment of the invention, the computer program instructions, when executed by a computer or processor, further cause the computer or processor to perform the following step: before constructing patches for the background portion at the location where occlusion exists, determining the maximum number of patches that need to be constructed for the occluded portion.

In one embodiment of the invention, determining the location where occlusion exists in the depth image corresponding to the input image, which the computer program instructions cause the computer or processor to perform when executed, includes: for each pixel point in the depth image, calculating the depth difference between the pixel point and each adjacent pixel point; and if the depth difference between the pixel point and any adjacent pixel point is larger than a preset threshold, determining that the pixel point is occluded in the direction of that adjacent pixel point.

In one embodiment of the invention, breaking the patch connection relationship between foreground and background at the location where occlusion exists and constructing a mesh for the background portion at that location, which the computer program instructions cause the computer or processor to perform when executed, includes: disconnecting the patch connection relationship between the pixel point and the adjacent pixel point, and expanding vertices at the pixel point in the direction of that adjacent pixel point to construct a mesh, until the pixel point is no longer occluded in that direction.

In one embodiment of the invention, expanding vertices at the pixel point in the direction of the adjacent pixel point, which the computer program instructions cause the computer or processor to perform when executed, includes: each time a vertex is expanded in that direction, determining whether the vertex can be connected to a pixel point adjacent to the vertex or whether the maximum expansion boundary of the direction has been reached; and if the vertex can be connected to an adjacent pixel point or the maximum expansion boundary of the direction has been reached, ending expansion in that direction, and iterating in this way until expansion has ended in every occluded direction of every pixel point.

In one embodiment of the present invention, expanding vertices at the pixel point in the direction of the adjacent pixel point, which the computer program instructions cause the computer or processor to perform when executed, further includes: before expanding a vertex in that direction, determining whether the position of the vertex to be expanded is already occupied by another point, and if so, creating a new mesh layer for the vertex expansion.

The modules in the apparatus for constructing a three-dimensional grid based on a single image according to the embodiment of the present invention may be implemented by a processor of an electronic device for constructing a three-dimensional grid based on a single image according to the embodiment of the present invention running computer program instructions stored in a memory, or may be implemented when computer instructions stored in a computer readable storage medium of a computer program product according to the embodiment of the present invention are run by a computer.

Furthermore, according to an embodiment of the present invention, there is also provided a computer program, which may be stored on a cloud or local storage medium. The computer program, when executed by a computer or processor, carries out the respective steps of the method of constructing a three-dimensional grid based on a single image according to an embodiment of the invention and implements the respective modules of the apparatus for constructing a three-dimensional grid based on a single image according to an embodiment of the invention.

Based on the above description, the method, device and system for constructing a three-dimensional grid based on a single image according to the embodiments of the invention disconnect the original patch connection relationship at positions in the image where occlusion exists and construct a grid for the occluded part, so that the obvious stretching effect can be effectively removed and the visual effect is improved. In addition, they construct a multi-layer grid, which effectively solves the problem that a double-layer grid leaves triangular patches missing at boundaries where several different depths meet. Furthermore, when constructing the grid, the size of the part that is occluded in a given direction is obtained from the camera imaging model and the internal and external parameters of the camera, which gives the maximum number of triangular patches that the grid layer needs to construct in that direction, so that the number of triangular patches is effectively reduced and the effective utilization of the grid is improved.
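The text does not give a formula for the size of the occluded part. As one plausible illustration only, under a pinhole camera model a lateral camera shift moves a point at depth Z by (focal length × shift / Z) pixels, so the width of background revealed behind a foreground edge is the difference between the foreground and background shifts. The function name and all parameters below are assumptions introduced for this sketch.

```python
import math

def max_expansion_steps(focal_px: float, camera_shift: float,
                        foreground_depth: float,
                        background_depth: float) -> int:
    """Upper bound, in pixels, on background revealed by a lateral shift.

    Assumes a pinhole model: a point at depth Z moves by
    focal_px * camera_shift / Z pixels when the camera translates sideways.
    """
    if background_depth <= foreground_depth:
        return 0  # nothing is hidden behind a surface that is not nearer
    revealed = focal_px * camera_shift * (
        1.0 / foreground_depth - 1.0 / background_depth)
    return math.ceil(revealed)

# Hypothetical values: 800 px focal length, 0.125 units of simulated shift.
steps = max_expansion_steps(800.0, 0.125, foreground_depth=2.0,
                            background_depth=4.0)
```

A bound of this kind caps how many vertices need to be expanded in a direction, which is how the number of triangular patches is kept small.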

Although the illustrative embodiments have been described herein with reference to the accompanying drawings, it is to be understood that the above illustrative embodiments are merely illustrative and are not intended to limit the scope of the present application thereto. Various changes and modifications may be made therein by one of ordinary skill in the art without departing from the scope and spirit of the application. All such changes and modifications are intended to be included within the scope of the present application as set forth in the appended claims.

Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.

In the several embodiments provided by the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described device embodiments are merely illustrative, e.g., the division of the elements is merely a logical functional division, and there may be additional divisions when actually implemented, e.g., multiple elements or components may be combined or integrated into another device, or some features may be omitted or not performed.

In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.

Similarly, it should be appreciated that in order to streamline the invention and aid in understanding one or more of the various inventive aspects, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof in the description of exemplary embodiments of the invention. However, the method of the present invention should not be construed as reflecting the following intent: i.e., the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.

It will be understood by those skilled in the art that all of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or units of any method or apparatus so disclosed, may be combined in any combination, except combinations where the features are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings), may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.

Furthermore, those skilled in the art will appreciate that while some embodiments described herein include some features but not others included in other embodiments, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the claims, any of the claimed embodiments may be used in any combination.

Various component embodiments of the invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will appreciate that some or all of the functions of some of the modules according to embodiments of the present invention may be implemented in practice using a microprocessor or Digital Signal Processor (DSP). The present invention can also be implemented as an apparatus program (e.g., a computer program and a computer program product) for performing a portion or all of the methods described herein. Such a program embodying the present invention may be stored on a computer readable medium, or may have the form of one or more signals. Such signals may be downloaded from an internet website, provided on a carrier signal, or provided in any other form.

It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etc. does not denote any order. These words may be interpreted as names.

The foregoing description is merely illustrative of specific embodiments of the present invention, and the scope of the present invention is not limited thereto. Any variation or substitution that a person skilled in the art can readily conceive within the technical scope disclosed by the present invention falls within the scope of the present invention. The protection scope of the invention is subject to the protection scope of the claims.

Claims

1. A method of constructing a three-dimensional grid based on a single image, the method comprising:

determining a position where occlusion exists in a depth image corresponding to an input image;

disconnecting a patch connection relationship between a foreground and a background at the position where the occlusion exists, wherein the patch connection relationship is a patch connection relationship at the position where the occlusion exists in a grid constructed without considering the occlusion relationship in the depth image; and

constructing a grid for the background portion at the position where the occlusion exists, to supplement the occluded portion.

2. The method of claim 1, wherein the constructing a grid for the background portion at the location where the occlusion exists comprises:

when building a patch for the background portion at the position where the occlusion exists, creating a new grid layer for building the patch whenever the position at which the patch needs to be built is already occupied.

3. The method according to claim 2, wherein the method further comprises:

before building patches for the background portion at the position where the occlusion exists, determining the maximum number of patches that need to be built for the occluded portion.

4. The method according to any one of claims 1-3, wherein determining the position where occlusion exists in the depth image corresponding to the input image comprises:

for each pixel point in the depth image, calculating the depth difference between the pixel point and each of its adjacent pixel points; and

if the depth difference between the pixel point and any adjacent pixel point is greater than a preset threshold, determining that occlusion exists at the pixel point in the direction toward that adjacent pixel point.

5. The method of claim 4, wherein the adjacent pixel points of the pixel point comprise the pixel points adjacent to the pixel point in the four directions of up, down, left, and right.

6. The method of claim 4, wherein disconnecting the patch connection relationship between the foreground and the background at the position where the occlusion exists and constructing a grid for the background portion at the position where the occlusion exists comprises:

disconnecting the patch connection between the pixel point and the adjacent pixel point, and expanding vertices in the direction from the pixel point toward that adjacent pixel point to construct a grid, until no occlusion remains in that direction.

7. The method of claim 6, wherein expanding vertices in the direction from the pixel point toward the adjacent pixel point comprises:

each time a vertex is expanded in the direction, determining whether the vertex can be connected with a pixel point adjacent to the vertex, or whether the maximum expansion boundary of the direction has been reached; and

if it is determined that the vertex can be connected with a pixel point adjacent to the vertex or that the maximum expansion boundary of the direction has been reached, ending the expansion in that direction, and iterating until every direction with occlusion at every pixel point has been expanded.

8. The method of claim 7, wherein expanding vertices in the direction from the pixel point toward the adjacent pixel point further comprises:

before expanding a vertex in the direction, determining whether the position of the vertex to be expanded is already occupied by another point, and if so, creating a new grid layer for the vertex expansion.

9. The method of claim 7, wherein the maximum expansion boundary of the direction is determined based on an imaging model of a camera capturing the input image, internal parameters of the camera, and preset parameters for simulating a magnitude of a change in position of the camera.

10. An apparatus for constructing a three-dimensional grid based on a single image, the apparatus comprising:

a computing module, configured to determine a position where occlusion exists in a depth image corresponding to an input image; and

a construction module, configured to disconnect a patch connection relationship between a foreground and a background at the position where the occlusion exists, wherein the patch connection relationship is a patch connection relationship at the position where the occlusion exists in a grid constructed without considering the occlusion relationship in the depth image, and to construct a grid for the background portion at the position where the occlusion exists, to supplement the occluded portion.

11. A system for constructing a three-dimensional grid based on a single image, characterized in that the system comprises a storage means and a processor, the storage means having stored thereon a computer program to be run by the processor, which computer program, when run by the processor, performs the method for constructing a three-dimensional grid based on a single image as claimed in any one of claims 1-9.

12. A storage medium having stored thereon a computer program which, when run, performs the method of constructing a three-dimensional grid based on a single image as claimed in any one of claims 1-9.