CN112785492A - Image processing method, image processing device, electronic equipment and storage medium


Info

Publication number: CN112785492A
Application number: CN202110077973.5A
Authority: CN (China)
Legal status: Pending
Prior art keywords: target, target image, dimensional, information, image
Other languages: Chinese (zh)
Inventors: 邓瑞峰, 林天威, 李甫, 张赫男
Assignee (current and original): Beijing Baidu Netcom Science and Technology Co Ltd
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd

Classifications

    • G PHYSICS > G06 COMPUTING; CALCULATING OR COUNTING > G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00 Geometric image transformations in the plane of the image > G06T 3/08 Projecting images onto non-planar surfaces, e.g. geodetic screens
    • G06T 5/00 Image enhancement or restoration > G06T 5/90 Dynamic range modification of images or parts thereof
    • G06T 7/00 Image analysis > G06T 7/10 Segmentation; Edge detection > G06T 7/11 Region-based segmentation
    • G06T 7/00 Image analysis > G06T 7/50 Depth or shape recovery
    • G06T 7/00 Image analysis > G06T 7/90 Determination of colour characteristics
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement > G06T 2207/10 Image acquisition modality > G06T 2207/10028 Range image; Depth image; 3D point clouds

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses an image processing method, an image processing device, electronic equipment and a storage medium, relating to fields of artificial intelligence such as computer vision technology and deep learning. The scheme is as follows: performing segmentation processing on a target image to distinguish at least the region where a target body is located from the other regions, obtaining a segmentation mask; obtaining, based on the target image and the segmentation mask, an occluded region corresponding to the region where the target body is located, and estimating color repair information and depth repair information for at least part of the occluded region; constructing, in a three-dimensional coordinate system, target three-dimensional point cloud data for the target image based on related information of the target image, the color repair information and the depth repair information; and generating, based on the target three-dimensional point cloud data, video data that matches the target image and shows at least a three-dimensional effect of the target body. The scheme increases processing speed, reduces the amount of information to be processed, lowers model complexity, and improves the efficiency of generating 3D images.

Description

Image processing method, image processing device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of image processing, and more particularly to fields of artificial intelligence such as computer vision technology and deep learning.
Background
In the prior art, the process of generating a three-dimensional (3D) image from a two-dimensional (2D) image is complex and involves many technical steps; the whole processing pipeline is cumbersome and time-consuming, which reduces the efficiency of 3D image generation.
Disclosure of Invention
The present disclosure provides a method, an apparatus, an electronic device, and a storage medium for image processing.
According to a first aspect of the present disclosure, there is provided an image processing method including:
carrying out segmentation processing on the target image, and at least distinguishing the region where the target body is located in the target image from other regions to obtain a segmentation mask;
obtaining a shielding area corresponding to the area of the target body based on the target image and the segmentation mask, and estimating color repair information and depth repair information of at least partial areas in the shielding area;
constructing target three-dimensional point cloud data aiming at the target image in a three-dimensional coordinate system based on the related information, the color restoration information and the depth restoration information of the target image;
and generating video data which is matched with the target image and at least shows the three-dimensional effect of the target body on the basis of the target three-dimensional point cloud data.
According to a second aspect of the present disclosure, there is provided an image processing apparatus comprising:
the segmentation processing module is used for carrying out segmentation processing on the target image and at least distinguishing the region where the target body is located in the target image from other regions to obtain a segmentation mask;
the restoration module is used for obtaining a sheltered area corresponding to the area where the target body is located based on the target image and the segmentation mask, and estimating color restoration information and depth restoration information of at least part of areas in the sheltered area;
the point cloud construction module is used for constructing target three-dimensional point cloud data aiming at the target image in a three-dimensional coordinate system based on the related information, the color restoration information and the depth restoration information of the target image;
and the video generation module is used for generating video data which is matched with the target image and at least shows the three-dimensional effect of the target body on the basis of the target three-dimensional point cloud data.
According to a third aspect of the present disclosure, there is provided an electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a method according to any one of the embodiments of the present disclosure.
According to a fourth aspect of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform a method in any of the embodiments of the present disclosure.
According to a fifth aspect of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the method in any of the embodiments of the present disclosure.
According to the embodiment of the disclosure, the processing speed is improved, the amount of processed information is reduced, the complexity of the model is reduced, and the generation efficiency of the 3D image is improved.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a schematic flow chart illustrating an image processing method according to an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of a target image in an embodiment of the present disclosure;
FIG. 3 is a segmentation mask of the target image shown in FIG. 2;
FIG. 4 is an image after color restoration according to an embodiment of the present disclosure;
FIG. 5 is a depth map of the target image shown in FIG. 2;
FIG. 6 is a depth map after depth repair in an embodiment of the present disclosure;
FIG. 7 is a diagram illustrating the effect of the point cloud model corresponding to the target image of FIG. 2 in one embodiment;
fig. 8A, 8B, and 8C respectively show two-dimensional images of three frames in a video generated from video data;
FIG. 9 illustrates an arrangement of pixel points in an image;
FIG. 10 is a block diagram of an image processing apparatus according to an embodiment of the disclosure;
FIG. 11 is a block diagram of a repair module according to an embodiment of the present disclosure;
FIG. 12 is a block diagram of a repair module according to an embodiment of the present disclosure;
FIG. 13 is a block diagram of a point cloud construction module according to an embodiment of the disclosure;
FIG. 14 is a block diagram of a video generation module according to an embodiment of the present disclosure;
FIG. 15 is a block diagram of an image processing apparatus according to an embodiment of the disclosure;
fig. 16 is a block diagram of an electronic device to implement the image processing method of the embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
In the related art, to convert an input two-dimensional (2D) image into a video with a three-dimensional (3D) effect, depth estimation, detection of boundaries with abrupt depth changes, construction of a 3D model of the scene, and boundary-by-boundary image restoration and depth restoration of the occluded parts of the 3D model may be adopted to finally obtain a 3D image presented as a video.
In the related art, converting a 2D image into a 3D image involves a complex process with many technical steps. For example, in order to perform image restoration and depth restoration on the occluded area, boundaries with abrupt depth changes need to be detected, which requires much pre-processing and bilateral median filtering and is seriously time-consuming; moreover, after the boundaries with abrupt depth changes are obtained, image and depth restoration must be performed boundary by boundary, and this one-by-one iterative optimization is complex and time-consuming, increases the time complexity of the algorithm, and reduces the efficiency of 3D image generation.
In order to improve the efficiency of generating a 3D image from a 2D image, embodiments of the present disclosure provide an image processing method, and details of the technical solutions of the present disclosure will be described below.
Fig. 1 is a schematic flowchart of an image processing method in an embodiment of the present disclosure. As shown in fig. 1, the image processing method provided by the embodiment of the present disclosure may include:
step S101, carrying out segmentation processing on a target image, and at least distinguishing an area where a target body is located in the target image from other areas to obtain a segmentation mask;
step S102, obtaining a shielding area corresponding to the area where the target body is located based on the target image and the segmentation mask, and estimating color repair information and depth repair information of at least part of areas in the shielding area;
s103, constructing target three-dimensional point cloud data aiming at the target image in a three-dimensional coordinate system based on the related information, the color repairing information and the depth repairing information of the target image;
and step S104, generating video data which is matched with the target image and at least shows the three-dimensional effect of the target body on the basis of the target three-dimensional point cloud data.
The image processing method of the embodiment of the disclosure includes the steps of segmenting a target image to obtain a segmentation mask, estimating color repair information and depth repair information of at least part of regions in a shielded region based on the target image and the segmentation mask, constructing target three-dimensional point cloud data aiming at the target image in a three-dimensional coordinate system based on relevant information, the color repair information and the depth repair information of the target image, and generating video data which is matched with the target image and at least shows the three-dimensional effect of a target body based on the target three-dimensional point cloud data.
Compared with related-art techniques for generating a 3D image from a 2D image, which detect boundaries of abrupt depth change, construct a 3D model of the scene, and perform image restoration and depth restoration on the occluded parts of the 3D model boundary by boundary, the image processing method of the embodiments of the present disclosure increases the processing speed, reduces the amount of information to be processed, lowers the model complexity, and improves the efficiency of 3D image generation.
In one embodiment, the target image may be captured by the user or selected by the user from an existing two-dimensional picture.
The target image includes a target body, and the target body may include at least one of a human body, an animal, and the like. When the target body is a human body, the target image may be segmented using a portrait segmentation technique to distinguish at least the region where the human body is located from the other regions of the target image.
Illustratively, the target body may include a human body and elements associated with the human body. For example, if the target image includes a human body holding a pet, the target body may include both the human body and the pet it holds. The elements associated with the human body may be elements connected to the limbs of the human body.
In the present disclosure, after the occluded region corresponding to the region where the target body is located is obtained based on the target image and the segmentation mask, color repair information and depth repair information may be estimated for only part of the occluded region; it is sufficient that video data matching the target image and showing at least the three-dimensional effect of the target body can be generated from the estimated color repair information and depth repair information. Illustratively, to facilitate generation of the video data, color repair information and depth repair information may be estimated for the entire occluded region.
It is understood that "at least a partial region of the occluded region" is not specifically limited; any implementation that performs color repair and depth repair on regions within the occluded region and obtains the color repair information and the depth repair information falls within the protection scope of the present disclosure.
The related information of the target image may include two-dimensional coordinate information of each pixel point in the target image in a two-dimensional coordinate system corresponding to the target image, and may include color information, such as a color value, of each pixel point. Illustratively, the related information of the target image may include depth information of each pixel point.
Fig. 2 is a schematic diagram of a target image according to an embodiment of the disclosure, and fig. 3 is the segmentation mask of the target image shown in fig. 2. As shown in fig. 2, the target body in the target image includes a human body; the target image shown in fig. 2 may be subjected to portrait segmentation using a portrait segmentation model to obtain the segmentation mask shown in fig. 3. The segmentation mask shown in fig. 3 includes a region 301 where the target body is located and the other region 302 besides the target body. In the segmentation mask shown in fig. 3, the region 301 where the target body is located is white and the other region 302 is black, thereby distinguishing the region where the target body is located from the other regions. Illustratively, the portrait segmentation model may employ the DeepLab-v2 model.
For example, the segmentation mask may be a binary image, and in the segmentation mask, the color value of the region where the target is located is 1, and the color values of other regions are 0, so that the color of the region where the target is located is displayed as white, and the color of other regions is displayed as black. The segmentation mask distinguished by black and white colors can more obviously distinguish the region where the target body is located from other regions, and the binary image type segmentation mask can improve the efficiency of color restoration and depth restoration when color restoration and depth restoration are carried out based on the segmentation mask.
In the present disclosure, the segmentation mask is not limited to a binary image; the segmentation mask may also be a grayscale image, as long as it can distinguish at least the region where the target body is located from the other regions.
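As an illustrative sketch only, and not as a limitation of the embodiments, the binarization described above may be expressed as follows; `seg_model` is a hypothetical callable standing in for a portrait segmentation network such as DeepLab-v2:

```python
import numpy as np

def build_segmentation_mask(target_image: np.ndarray, seg_model) -> np.ndarray:
    """Binarize the output of a portrait-segmentation model.

    seg_model is a hypothetical callable returning an H x W foreground
    probability map in [0, 1] for the H x W x 3 input image.
    """
    prob = seg_model(target_image)           # per-pixel foreground probability
    mask = (prob > 0.5).astype(np.uint8)     # 1 = region where the target body is
    return mask                              # 0 = other regions
```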
In the present disclosure, the target three-dimensional point cloud data may include a plurality of points in a three-dimensional space, each point having corresponding data information, such as three-dimensional coordinates, color values, and the like of the point. The data information of each point in the target three-dimensional point cloud data can represent the related information of each first pixel point in the target image and the related information of each second pixel point in the color restoration image corresponding to the color restoration information, and the target three-dimensional point cloud data can be a three-dimensional expression corresponding to the target image and the color restoration image corresponding to the color restoration information, so that the video data which is matched with the target image and at least shows the three-dimensional effect of the target body can be generated based on the target three-dimensional point cloud data.
In an embodiment, the obtaining a blocked area corresponding to an area where the target object is located based on the target image and the segmentation mask, and estimating color repair information of at least a part of the blocked area may include: determining a background area in the target image based on the target image and the segmentation mask, wherein the background area is an area except for an occlusion area in the target image; determining color information of a background area in the target image, and estimating color repair information of at least partial area in the occlusion area based on the color information of the background area.
In the segmentation mask, the region where the target body is located in the target image is distinguished from the other regions; therefore, the occluded region in the target image, i.e. the region blocked by the target body, can be determined based on the target image and the segmentation mask, and further the background region in the target image, i.e. the region other than the occluded region, can be determined.
The background region in the target image may include a plurality of pixel points, each having corresponding color information, so the color information of the background region in the target image can be determined. It can be understood that the background region and the occluded region together form the scene in which the target body is located; therefore, color estimation may be used to estimate the color repair information of at least part of the occluded region based on the color information of the background region.
In one embodiment, the color repairing model may be used to perform color repairing on at least a part of the occlusion region to obtain color repairing information. Illustratively, the target image and the segmentation mask may be input to an image inpainting model, which may output color inpainting information for at least a portion of the regions in the occluded region. Illustratively, the image inpainting model may be a neural network model.
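As a non-limiting illustration, a classical inpainting routine can stand in for the neural image-inpainting model mentioned above; the sketch below uses OpenCV's diffusion-based inpainting, which likewise estimates the occluded colors from the surrounding background region:

```python
import cv2
import numpy as np

def estimate_color_repair(target_image: np.ndarray, seg_mask: np.ndarray) -> np.ndarray:
    """Estimate color repair information for the occluded region.

    target_image: H x W x 3 uint8 image; seg_mask: H x W, non-zero inside the
    region where the target body is located (i.e. the occluded region).
    Returns a color-repaired background image (cf. FIG. 4).
    """
    occluded = ((seg_mask > 0).astype(np.uint8)) * 255   # region to repair
    # Classical inpainting as a stand-in for the image-inpainting model.
    return cv2.inpaint(target_image, occluded, 5, cv2.INPAINT_TELEA)
```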
According to the technical scheme, the color restoration information of at least part of the shielding area is estimated at one time based on the target image and the segmentation mask, and compared with edge-by-edge iterative optimization, the restoration efficiency is greatly improved, and further the 3D image generation efficiency is improved.
Fig. 4 is an image after color repair in an embodiment of the present disclosure. After the color repair information of at least part of the occluded region is estimated, a color-repaired image can be obtained, as shown in fig. 4. The color-repaired image shown in fig. 4 includes the image of the background region and the image of at least part of the occluded region, and fig. 4 can be seen as the complete image located behind the human body in fig. 2.
It is to be understood that the size of the color-repaired image (as shown in fig. 4) and the size of the target image (as shown in fig. 2) are the same, and the related information of the background area in the color-repaired image and the related information of the background area in the target image may be the same, for example, the position and color value of each pixel of the background area in the color-repaired image and the position and color value of each pixel of the background area in the target image may be the same. The image after color restoration comprises the relevant information of the shielding area, and the target image comprises the relevant information of the target body area.
In an embodiment, the obtaining a blocked area corresponding to an area where the target object is located based on the target image and the segmentation mask, and estimating depth repair information of at least a part of the blocked area may include: determining a background area in the target image based on the target image and the segmentation mask, wherein the background area is an area except for an occlusion area in the target image; and determining the depth information of the background area in the target image, and estimating the depth repair information of at least part of area in the occlusion area based on the depth information of the background area.
In the segmentation mask, the region where the target body in the target image is located is distinguished from other regions, so that a shielding region in the target image can be determined based on the target image and the segmentation mask, the shielding region is a region shielded by the target body in the target image, and further a background region in the target image can be determined, and the background region is a region except the shielding region in the target image.
Depth estimation can be performed on the target image to obtain the depth information of the target image, where the depth information of the target image includes the depth value of each pixel point in the background region and the depth value of each pixel point in the region where the target body is located. Therefore, the depth information of the background region can be determined from the depth information of the target image; the MiDaS algorithm may be adopted for the depth estimation of the target image. The depth repair information of at least part of the occluded region can then be estimated based on the depth information of the background region.
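The two estimation steps can be sketched as follows; this is only an illustration, with `depth_model` standing in for a monocular depth estimator such as MiDaS and `depth_inpaint_model` for a hypothetical depth-inpainting network:

```python
import numpy as np

def estimate_depth_repair(target_image, seg_mask, depth_model, depth_inpaint_model):
    """Estimate scene depth, then repair it inside the occluded region.

    depth_model and depth_inpaint_model are hypothetical callables; the former
    returns an H x W depth map (cf. FIG. 5), the latter fills the masked depth
    values from the surrounding background depth (cf. FIG. 6).
    """
    depth = depth_model(target_image)              # depth of the whole target image
    occluded = seg_mask > 0                        # region hidden by the target body
    background_depth = np.where(occluded, 0.0, depth)
    repaired_depth = depth_inpaint_model(background_depth, occluded)
    return depth, repaired_depth
```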
Fig. 5 is a depth map of the target image shown in fig. 2. Depth estimation may be performed on the target image shown in fig. 2 to obtain the depth information of the target image, and further the depth map shown in fig. 5. From the depth information of the target image, the depth information of the background region in the target image can be obtained.
Fig. 6 is a depth map after depth repair in an embodiment of the present disclosure. After the depth repair information of at least part of the occluded region is estimated, a depth-repaired map can be obtained, as shown in fig. 6. The depth map shown in fig. 6 includes the depth of the background region as well as the depth of at least part of the occluded region, and fig. 6 can be seen as the depth map corresponding to the complete image located behind the human body in fig. 2.
It is understood that the size of the depth map after depth repair (as shown in fig. 6) and the depth map of the target image (as shown in fig. 5) are the same, and the related information of the background region in the depth map after depth repair and the related information of the background region in the depth map of the target image may be the same, for example, the depth value of each pixel of the background region in the depth map after depth repair and the depth value of each pixel of the background region in the depth map of the target image may be the same. The depth map of the depth repaired includes the depth value of the pixel of the shielded area, and the depth map of the target image includes the depth value of the pixel of the target area.
In one embodiment, a depth repair model may be used to perform depth repair on at least part of the occluded region to obtain the depth repair information. Illustratively, the depth map of the target image and the segmentation mask may be input into the depth repair model to obtain the depth repair information. The depth repair model may be a neural network model.
According to the technical scheme, based on the target image and the segmentation mask, the depth repairing information of at least part of the shielding area is estimated at one time, and compared with edge-by-edge iterative optimization, the repairing efficiency is greatly improved, and further the 3D image generation efficiency is improved.
In one embodiment, constructing three-dimensional point cloud data for a target image in a three-dimensional coordinate system based on related information of the target image, color inpainting information, and depth inpainting information may include: obtaining the depth value of each first pixel point in the target image based on the depth information of the target image; obtaining the depth value of each second pixel point in the shielding area based on the depth repairing information; mapping each first pixel point in the target image and each second pixel point in the shielding area to a three-dimensional coordinate system to construct three-dimensional point cloud data; the two-dimensional coordinates of the first pixel point or the second pixel point under the two-dimensional coordinate system corresponding to the target image are used as X, Y coordinates of a corresponding point in the three-dimensional point cloud data, and the depth value of the first pixel point or the depth value of the second pixel point is used as a Z coordinate of the corresponding point in the three-dimensional point cloud data; and determining the color value of the first pixel point as the color value of the corresponding point in the three-dimensional point cloud data based on the target image, and determining the color value of the second pixel point as the color value of the corresponding point in the three-dimensional point cloud data based on the color restoration information to obtain the target three-dimensional point cloud data.
For example, as shown in fig. 2 (including the background region and the region where the target body is located), each first pixel point in the target image has a two-dimensional coordinate in the two-dimensional coordinate system corresponding to the target image. The occluded region is the region blocked by the human body in the target image; therefore, the two-dimensional coordinates of each second pixel point in the occluded region are the two-dimensional coordinates of that pixel point in the color-repaired image shown in fig. 4, under the same two-dimensional coordinate system.
It will be appreciated that the color-repaired image shown in fig. 4 is the same size as the target image shown in fig. 2 and has the same coordinate system.
According to the depth information of the target image, the depth value of each first pixel point in the position of the first pixel point in the target image can be obtained, and according to the depth restoration information, the depth value of each second pixel point in the shielding area in the position of the second pixel point can be obtained.
Construction of the three-dimensional point cloud data needs to be based on the two-dimensional coordinates and the depth value of each pixel point, so the related information of the target image, the depth information of the target image, the color repair information and the depth repair information are all needed. The three-dimensional point cloud data may be constructed by a three-dimensional point cloud construction model: the target image (for example, as shown in fig. 2), the depth map of the target image (for example, as shown in fig. 5), the color-repaired image (for example, as shown in fig. 4) and the depth-repaired map (for example, as shown in fig. 6) may be input into the three-dimensional point cloud construction model as a quadruple to construct the three-dimensional point cloud data. The three-dimensional point cloud construction model may be a neural network model.
When constructing the three-dimensional point cloud data, each first pixel point and each second pixel point may be mapped into the three-dimensional coordinate system: the two-dimensional coordinates of a first pixel point in the two-dimensional coordinate system corresponding to the target image are taken as the X and Y coordinates of the corresponding point in the three-dimensional point cloud data, the two-dimensional coordinates of a second pixel point in that same coordinate system are likewise taken as the X and Y coordinates of its corresponding point, and the depth value of the first pixel point or of the second pixel point is taken as the Z coordinate of the corresponding point, so that the three-dimensional point cloud data corresponding to each first pixel point and each second pixel point are obtained in the three-dimensional coordinate system. The related information of each point in the three-dimensional point cloud data includes the three-dimensional coordinates of that point.
The color value of each first pixel point can be determined based on the target image, the color value of each second pixel point can be determined based on the color restoration information, the color value of the first pixel point or the color value of the second pixel point can be mapped to the corresponding point in the three-dimensional point cloud data, color values are given to all points in the three-dimensional point cloud data, and therefore the target three-dimensional point cloud data is obtained. The related information of each point in the target three-dimensional point cloud data can comprise three-dimensional coordinates and color values of each point.
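As a rough sketch of the mapping described above (the function and argument names are assumptions for illustration, not the claimed implementation), the quadruple of target image, depth map, color-repaired image and depth-repaired map can be turned into a colored point set as follows:

```python
import numpy as np

def build_target_point_cloud(target_image, depth_map, repaired_image,
                             repaired_depth, seg_mask):
    """Map first pixel points (all pixels of the target image) and second pixel
    points (pixels of the occluded region, taken from the color-repaired image
    and the depth-repaired map) into one colored 3-D point set: X, Y come from
    the 2-D pixel coordinates and Z from the depth value.
    """
    h, w = seg_mask.shape
    ys, xs = np.mgrid[0:h, 0:w]

    # First pixel points: every pixel of the target image.
    pts1 = np.stack([xs.ravel(), ys.ravel(), depth_map.ravel()], axis=1)
    col1 = target_image.reshape(-1, 3)

    # Second pixel points: pixels of the occluded region only.
    occ = (seg_mask > 0).ravel()
    pts2 = np.stack([xs.ravel()[occ], ys.ravel()[occ],
                     repaired_depth.ravel()[occ]], axis=1)
    col2 = repaired_image.reshape(-1, 3)[occ]

    points = np.concatenate([pts1, pts2], axis=0).astype(np.float32)
    colors = np.concatenate([col1, col2], axis=0)
    return points, colors   # three-dimensional coordinates and color values
```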
In the embodiment of the present disclosure, when the target three-dimensional point cloud data is constructed, the target image, the depth map of the target image, the color-repaired image and the depth-repaired map can be input into the three-dimensional point cloud construction model as a quadruple to construct the three-dimensional point cloud data and finally obtain the target three-dimensional point cloud data. Replacing the layered depth image (LDI) model of the related art with this quadruple reduces the amount of information to be processed, reduces storage overhead, lowers the model complexity, and further improves the efficiency of 3D image generation.
In one embodiment, the generating video data showing at least a three-dimensional effect of the target volume matching the target image based on the target three-dimensional point cloud data may include: determining an image acquisition track; determining a multi-frame two-dimensional image matched with the image acquisition track based on the target three-dimensional point cloud data; based on the plurality of frames of two-dimensional images, video data showing at least a three-dimensional effect of the target volume for the target image is generated.
A virtual camera can be set in a terminal that adopts the image processing method of the embodiments of the present disclosure, and the orientation, position and other poses of the camera can be determined by setting virtual camera parameters, such as the extrinsic parameters of the camera. One set of virtual camera parameters can define one image acquisition point, and a plurality of successive sets of virtual camera parameters can be set, so that an image acquisition trajectory can be determined.
One frame of two-dimensional image of the target three-dimensional point cloud data can be acquired at one position on the image acquisition track, so that a plurality of frames of two-dimensional images matched with the image acquisition track can be determined based on the target three-dimensional point cloud data.
For example, two-dimensional image acquisition may be performed frame by frame on the image acquisition track, so that the determined multiple frames of two-dimensional images are continuous multiple frames of two-dimensional images matching the image acquisition track, the generated video data is also continuous video data, and after the images are rendered, a video composed of the continuous multiple frames of two-dimensional images is generated, and no image missing occurs in the video.
The plurality of frames of two-dimensional images may be displayed in succession, thereby generating video data showing at least the three-dimensional effect of the target body for the target image, and a video showing the three-dimensional effect of the target body can be generated from this video data. In the video, as the target body moves, the scene located behind the target body can be viewed.
This way of generating the video matches the user's viewing habits and gives the user an immersive feeling.
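Purely as an illustration of how frames could be acquired along such a trajectory (a practical renderer would rasterize the meshed point cloud model described later; the pinhole projection below is an assumption, not the claimed method):

```python
import numpy as np

def render_frames(points, colors, poses, K, h, w):
    """Acquire one 2-D frame per virtual camera pose on the image acquisition
    trajectory by projecting the colored point cloud with a pinhole model.

    points: N x 3, colors: N x 3 uint8, poses: list of 4 x 4 world-to-camera
    extrinsic matrices, K: 3 x 3 intrinsic matrix of the virtual camera.
    """
    homo = np.concatenate([points, np.ones((len(points), 1))], axis=1)
    frames = []
    for T in poses:
        cam = (T @ homo.T).T[:, :3]
        front = cam[:, 2] > 1e-6                  # keep points in front of the camera
        cam, col = cam[front], colors[front]
        uv = (K @ cam.T).T
        uv = (uv[:, :2] / uv[:, 2:3]).astype(int)
        frame = np.zeros((h, w, 3), dtype=np.uint8)
        ok = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
        frame[uv[ok, 1], uv[ok, 0]] = col[ok]     # naive splat, no z-buffering
        frames.append(frame)
    return frames
```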
In one embodiment, the image processing method may further include: constructing a patch based on a plurality of adjacent points in the target three-dimensional point cloud data; and estimating related information of at least one derived point within the patch based on related information of the points associated with the patch, wherein the related information of a derived point includes at least one of the three-dimensional coordinates and the color value of the derived point. When generating, based on the target three-dimensional point cloud data, video data that matches the target image and shows at least the three-dimensional effect of the target body, the target three-dimensional point cloud data may include the related information of the derived points.
It can be understood that gaps exist among the points in the three-dimensional point cloud data, and when a frame of two-dimensional image of the target three-dimensional point cloud model is directly acquired, the acquired image may have gaps, resulting in an incomplete image. The number of points in the target three-dimensional point cloud data may be enriched by constructing a patch based on adjacent points in the target three-dimensional point cloud data and estimating information about at least one derived point within the patch based on information about points associated with the patch, the derived point having at least one of a three-dimensional coordinate and a color value. Therefore, when a frame of two-dimensional image of the target three-dimensional point cloud model is acquired, the derived points can make up for the vacancy in the image, so that the acquired image is fuller, the image is prevented from being lost, and the display effect of the 3D image (video) is improved.
Illustratively, an interpolation method can be adopted to estimate the three-dimensional coordinates and the color values of the derived points according to the three-dimensional coordinates and the color values of the points corresponding to the pixel points.
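One possible instance of such interpolation, shown only as a sketch, is barycentric interpolation inside a triangular patch formed by three adjacent cloud points:

```python
import numpy as np

def derive_points(p0, p1, p2, c0, c1, c2, n=3):
    """Interpolate derived points inside a triangular patch.

    p0, p1, p2 are the three-dimensional coordinates of the patch corners and
    c0, c1, c2 their color values; n controls how densely the patch is filled.
    """
    p0, p1, p2, c0, c1, c2 = map(np.asarray, (p0, p1, p2, c0, c1, c2))
    pts, cols = [], []
    for i in range(1, n):
        for j in range(1, n - i):
            a, b = i / n, j / n
            w0, w1, w2 = 1.0 - a - b, a, b        # barycentric weights
            pts.append(w0 * p0 + w1 * p1 + w2 * p2)
            cols.append(w0 * c0 + w1 * c1 + w2 * c2)
    return np.array(pts), np.array(cols)
```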
In one embodiment, the image processing method may further include: constructing a patch based on a plurality of adjacent points in the target three-dimensional point cloud data; estimating related information of at least one derived point within the patch based on related information of the points associated with the patch, wherein the related information of a derived point includes at least one of the three-dimensional coordinates and the color value of the derived point; and generating a point cloud model based on the related information of each point in the target three-dimensional point cloud data and the related information of the derived points. In this case, determining the multi-frame two-dimensional images matched with the image acquisition trajectory based on the target three-dimensional point cloud data includes: determining the multi-frame two-dimensional images matched with the image acquisition trajectory based on the point cloud model.
After the related information of the at least one derived point in the patch is estimated, a point cloud model may be generated based on the related information (three-dimensional coordinates and/or color values) of each point in the target three-dimensional point cloud data and the related information (three-dimensional coordinates and/or color values) of the derived points, and the point cloud model can exhibit a three-dimensional model effect similar to that of a three-dimensional mesh.
FIG. 7 is a diagram illustrating the effect of the point cloud model corresponding to the target image shown in FIG. 2 in one embodiment. As shown in fig. 7, the point cloud model can present a three-dimensional model effect, and the point cloud model may include a scene model and a target body model, where the scene model includes a model generated based on the background region and a model generated based on the occluded region, and the target body model is a model generated for the region where the target body is located.
For example, determining a plurality of frames of two-dimensional images matched with the image acquisition track based on the target three-dimensional point cloud data may include: and determining a multi-frame two-dimensional image matched with the image acquisition track based on the point cloud model. Based on the plurality of frames of two-dimensional images, video data showing at least a three-dimensional effect of the target volume for the target image is generated. Fig. 8A, 8B, and 8C respectively show two-dimensional images of three frames in a video generated from video data, as shown in fig. 8A, 8B, and 8C, it can be seen that the images in the video finally generated based on the point cloud model are fuller and no color loss occurs.
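For illustration only, the successive frames can be encoded into a playable video with a standard writer (the file name and frame rate below are arbitrary examples):

```python
import cv2

def write_video(frames, path="3d_effect.mp4", fps=30):
    """Encode successive 2-D frames (such as those of FIG. 8A to 8C) into a
    continuous video file."""
    h, w = frames[0].shape[:2]
    writer = cv2.VideoWriter(path, cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))
    for frame in frames:
        writer.write(frame)               # H x W x 3 uint8 frames
    writer.release()
```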
Since the color of each point on the meshed point cloud model is fuller, the multi-frame two-dimensional images matched with the image acquisition trajectory can be acquired based on the point cloud model, which further improves the display effect of each frame of image.
In one embodiment, constructing a patch based on adjacent points in target three-dimensional point cloud data includes: and determining adjacent multiple points in the target three-dimensional point cloud data according to the position relation between each first pixel point in the target image and each second pixel point in the shielding area, and connecting the determined adjacent multiple points to construct a patch.
For example, the position relationship may include an adjacent position relationship, that is, for the adjacent position relationship between each first pixel point in the target image and each second pixel point in the occlusion region, adjacent points in the target three-dimensional point cloud data are determined, and the determined adjacent points are connected to construct a patch.
It is to be understood that, in the implementation, the positional relationship is not limited to the adjacent positional relationship, and other positional relationships, such as a positional relationship maintained at a certain interval, may be set as necessary.
It can be understood that the pixel point of the shielding region and the pixel point of the background region both belong to the pixel point behind the object, and the shielding region and the background region can be regarded as the scene where the object is located, as shown in fig. 4. That is to say, the pixel point of the shielding region and the pixel point of the background region belong to the pixel point of the scene, and the pixel point of the scene and the pixel point of the target respectively belong to two levels. Therefore, the pixel point of the shielding region and the pixel point of the background region at the same level can have an adjacent position relationship. However, the target and the scene are located at different levels, the pixel point of the target and the pixel point of the background region may not have an adjacent position relationship, and the pixel point of the target and the pixel point of the shielding region may not have an adjacent position relationship. For example, the scene image shown in fig. 4 may be used to determine the position relationship between the pixel points of the occlusion region and the pixel points of the background region. Thus, the obtained point cloud model may include two models, one representing the target and the other representing the scene behind the target.
It is understood that in some specific scenes, the pixel points of the object and the pixel points of the background area may have an adjacent position relationship, for example, the object makes contact with the scene. In specific implementation, the position relationship between the pixel point of the target body and the pixel point of the background region can be determined according to actual conditions.
In the embodiment of the disclosure, for the position relationship between each first pixel point in the target image and each second pixel point in the shielding region, adjacent multiple points in the target three-dimensional point cloud data are determined, and the determined adjacent multiple points are connected to construct a patch, so that the position relationship between the pixel points in the target image and the color restoration image can be corresponded to each point in the target three-dimensional point cloud data in a patch constructing manner, thereby ensuring that the pixel points in the image at the same level are connected, avoiding the connection between the pixel points at different levels, enabling the generated point cloud model to better embody the real condition of the target image, and further improving the sense of reality of the generated 3D image.
In one embodiment, constructing a patch based on adjacent points in the target three-dimensional point cloud data may include: aiming at the target image and three pixel points which are in triangular adjacency in the shielding area, establishing a triangular adjacency relation of the three pixel points which are in triangular adjacency; and constructing a triangular patch among three corresponding points in the target three-dimensional point cloud data based on the triangular adjacency relation.
Fig. 9 shows an arrangement of pixel points in an image. The image shown in fig. 9 includes 9 pixel points arranged in three rows and three columns, labeled pixel point 1, pixel point 2, ..., pixel point 9. In fig. 9, for example, the adjacent pixel point 1, pixel point 2 and pixel point 4 are in a triangular positional relationship; the three points corresponding to pixel point 1, pixel point 2 and pixel point 4 are then determined in the target three-dimensional point cloud data, and these three points are connected to form a patch.
In an embodiment, three points which are in triangular adjacency in the target three-dimensional point cloud data can be determined according to the triangular adjacency relation between each first pixel point in the background area of the target image and each second pixel point in the shielding area, and the determined three points which are in triangular adjacency are connected to construct a triangular patch.
Regarding the triangular adjacency relations, as shown in fig. 9, a pixel point located at the upper-left, lower-left, upper-right or lower-right corner belongs to two triangular adjacency relations; for example, pixel point 1 at the upper-left corner may be in the triangular adjacency relation of pixel points 1, 2 and 4, and in the triangular adjacency relation of pixel points 1, 4 and 5. A pixel point located on an edge but not at a corner belongs to four triangular adjacency relations; for example, pixel point 2 may be in the triangular adjacency relations of pixel points 1, 2 and 4; of pixel points 1, 2 and 5; of pixel points 2, 3 and 5; and so on. A pixel point located in the middle area belongs to eight triangular adjacency relations; for example, pixel point 5 may be in the triangular adjacency relations of pixel points 1, 2 and 5; of pixel points 2, 4 and 5; of pixel points 4, 5 and 8; of pixel points 5, 8 and 9; of pixel points 5, 6 and 8; of pixel points 2, 3 and 5; and so on.
After the triangular adjacent relation of each pixel point is determined, three points corresponding to the triangular adjacent relation in the target three-dimensional point cloud data can be determined based on the triangular adjacent relation, and the determined three points which are in triangular adjacent connection are connected to construct a triangular patch.
The embodiment of the present disclosure is not limited to the positional relationship of the triangular neighbors, and may be a positional relationship of the four-corner neighbors, a positional relationship of the five-corner neighbors, and the like, and the specific positional relationship may be set according to actual needs, as long as an appropriate patch can be constructed, the technical effects of the embodiment of the present disclosure can be achieved.
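A sketch of one possible triangular-patch construction over the pixel grid, with the same-level constraint discussed above (the `level_map` input and the per-cell triangulation are assumptions for illustration, not the claimed scheme):

```python
import numpy as np

def build_triangle_patches(level_map):
    """Enumerate triangular adjacency relations over an h x w pixel grid
    (cf. FIG. 9) as index triples into the flattened pixel array; each triple
    becomes one triangular patch between the corresponding cloud points.
    A triangle is kept only if its three pixel points lie on the same level
    (e.g. 0 = scene, 1 = target body), so that points of different levels
    are never connected.
    """
    h, w = level_map.shape
    idx = np.arange(h * w).reshape(h, w)
    patches = []
    for r in range(h - 1):
        for c in range(w - 1):
            # two triangles per 2 x 2 cell, e.g. (1, 2, 4) and (2, 4, 5)
            # for the upper-left cell of FIG. 9
            for tri in (((r, c), (r, c + 1), (r + 1, c)),
                        ((r, c + 1), (r + 1, c), (r + 1, c + 1))):
                if len({int(level_map[p]) for p in tri}) == 1:
                    patches.append(tuple(int(idx[p]) for p in tri))
    return patches
```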
Fig. 10 is a block diagram of an image processing apparatus according to an embodiment of the present disclosure. As shown in fig. 10, the image processing apparatus includes:
a segmentation processing module 1001, configured to perform segmentation processing on the target image, and at least distinguish a region where a target body in the target image is located from other regions to obtain a segmentation mask;
a repairing module 1002, configured to obtain a shielded area corresponding to an area where a target is located based on a target image and a segmentation mask, and estimate color repairing information and depth repairing information of at least part of the shielded area;
the point cloud construction module 1003 is used for constructing target three-dimensional point cloud data aiming at the target image in a three-dimensional coordinate system based on the related information, the color restoration information and the depth restoration information of the target image;
and the video generating module 1004 is used for generating video data which is matched with the target image and at least shows the three-dimensional effect of the target body on the basis of the target three-dimensional point cloud data.
Fig. 11 is a block diagram of a repair module according to an embodiment of the disclosure, and in an implementation, as shown in fig. 11, the repair module includes: a background region determining submodule 1101 configured to determine a background region in the target image based on the target image and the segmentation mask, where the background region is a region other than the occlusion region in the target image; the color repairing sub-module 1102 is configured to determine color information of a background area in the target image, and estimate color repairing information of at least a partial area in the occlusion area based on the color information of the background area.
Fig. 12 is a block diagram of a repair module according to an embodiment of the disclosure, and in an implementation, as shown in fig. 12, the repair module includes: a background region determining submodule 1201, configured to determine a background region in the target image based on the target image and the segmentation mask, where the background region is a region other than the occlusion region in the target image; and the depth repairing sub-module 1202 is configured to determine depth information of a background region in the target image, and estimate depth repairing information of at least part of the shielded region based on the depth information of the background region.
Fig. 13 is a block diagram illustrating a structure of a point cloud constructing module according to an embodiment of the disclosure, and in an embodiment, as shown in fig. 13, the point cloud constructing module includes: the first depth obtaining sub-module 1301 is configured to obtain a depth value of each first pixel in the target image based on the depth information of the target image; the second depth obtaining sub-module 1302 is configured to obtain a depth value of each second pixel point in the occlusion region based on the depth restoration information; the mapping submodule 1303 is used for mapping each first pixel point in the target image and each second pixel point in the shielding area to a three-dimensional coordinate system to construct three-dimensional point cloud data; the two-dimensional coordinates of the first pixel point or the second pixel point under the two-dimensional coordinate system corresponding to the target image are used as X, Y coordinates of a corresponding point in the three-dimensional point cloud data, and the depth value of the first pixel point or the depth value of the second pixel point is used as a Z coordinate of the corresponding point in the three-dimensional point cloud data; and the color assignment submodule 1304 is configured to determine, based on the target image, a color value of the first pixel point as a color value of a corresponding point in the three-dimensional point cloud data, and determine, based on the color restoration information, a color value of the second pixel point as a color value of a corresponding point in the three-dimensional point cloud data, so as to obtain the target three-dimensional point cloud data.
Fig. 14 is a block diagram of a video generation module in an embodiment of the present disclosure, and in an implementation, as shown in fig. 14, the video generation module includes: a trajectory determination submodule 1401 for determining an image acquisition trajectory; a frame image determining submodule 1402, configured to determine, based on the target three-dimensional point cloud data, a multi-frame two-dimensional image matched with the image acquisition trajectory; the video generation sub-module 1403 is configured to generate, based on the multiple frames of two-dimensional images, video data showing at least a three-dimensional effect of the target volume for the target image.
Fig. 15 is a block diagram of an image processing apparatus according to an embodiment of the disclosure, and in an implementation, as shown in fig. 15, the image processing apparatus further includes: a patch constructing module 1501, configured to construct a patch based on a plurality of adjacent points in the target three-dimensional point cloud data; a derived point estimation module 1502, configured to estimate related information of at least one derived point within a patch based on related information of the points associated with the patch, wherein the related information of a derived point includes at least one of the three-dimensional coordinates and the color value of the derived point; a point cloud model generation module 1503, configured to generate a point cloud model based on the related information of each point in the target three-dimensional point cloud data and the related information of the derived points; and a frame image determining submodule 1504, configured to determine a multi-frame two-dimensional image matched with the image acquisition trajectory based on the point cloud model.
The frame image determining sub-module 1504 may be the same as or similar to the frame image determining sub-module 1402 in fig. 14.
In one embodiment, the patch construction module is specifically configured to: and determining adjacent multiple points in the target three-dimensional point cloud data according to the position relation between each first pixel point in the target image and each second pixel point in the shielding area, and connecting the determined adjacent multiple points to construct a patch.
The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.
FIG. 16 shows a schematic block diagram of an example electronic device 1600 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 16, the device 1600 includes a computing unit 1601, which may perform various appropriate actions and processes in accordance with a computer program stored in a Read Only Memory (ROM) 1602 or a computer program loaded from a storage unit 1608 into a Random Access Memory (RAM) 1603. In the RAM 1603, various programs and data required for the operation of the device 1600 can also be stored. The computing unit 1601, the ROM 1602 and the RAM 1603 are connected to each other via a bus 1604. An input/output (I/O) interface 1605 is also connected to the bus 1604.
A plurality of components in the device 1600 are connected to the I/O interface 1605, including: an input unit 1606, such as a keyboard, a mouse, and the like; an output unit 1607, such as various types of displays, speakers, and the like; a storage unit 1608, such as a magnetic disk, an optical disk, and the like; and a communication unit 1609, such as a network card, a modem, a wireless communication transceiver, and the like. The communication unit 1609 allows the device 1600 to exchange information/data with other devices via a computer network, such as the Internet, and/or various telecommunication networks.
The computing unit 1601 may be a variety of general purpose and/or special purpose processing components with processing and computing capabilities. Some examples of the computing unit 1601 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The computing unit 1601 performs the various methods and processes described above, such as the image processing method. For example, in some embodiments, the image processing method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 1608. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 1600 via the ROM 1602 and/or the communication unit 1609. When the computer program is loaded into the RAM 1603 and executed by the computing unit 1601, one or more steps of the image processing method described above may be performed. Alternatively, in other embodiments, the computing unit 1601 may be configured to perform the image processing method by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chips (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: being implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor, and which can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowcharts and/or block diagrams to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel or sequentially or in different orders, and are not limited herein as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims (17)

1. An image processing method comprising:
carrying out segmentation processing on a target image, and at least distinguishing a region where a target body is located in the target image from other regions to obtain a segmentation mask;
obtaining a shielding area corresponding to the area of the target body based on the target image and the segmentation mask, and estimating color repair information and depth repair information of at least part of the shielding area;
constructing target three-dimensional point cloud data for the target image in a three-dimensional coordinate system based on the relevant information of the target image, the color repair information and the depth repair information;
and generating video data which is matched with the target image and at least shows the three-dimensional effect of the target body based on the target three-dimensional point cloud data.
2. The method according to claim 1, wherein the obtaining a shielding area corresponding to the area of the target body based on the target image and the segmentation mask, and estimating color repair information of at least part of the shielding area comprises:
determining a background area in the target image based on the target image and the segmentation mask, wherein the background area is an area of the target image other than the shielding area;
determining color information of the background area in the target image, and estimating the color repair information of at least part of the shielding area based on the color information of the background area.
3. The method according to claim 1, wherein the obtaining a shielding area corresponding to the area of the target body based on the target image and the segmentation mask, and estimating depth repair information of at least part of the shielding area comprises:
determining a background area in the target image based on the target image and the segmentation mask, wherein the background area is an area of the target image other than the shielding area;
determining depth information of the background area in the target image, and estimating the depth repair information of at least part of the shielding area based on the depth information of the background area.
4. The method of claim 1, wherein the constructing target three-dimensional point cloud data for the target image in a three-dimensional coordinate system based on the relevant information of the target image, the color repair information and the depth repair information comprises:
obtaining the depth value of each first pixel point in the target image based on the depth information of the target image;
obtaining the depth value of each second pixel point in the shielding area based on the depth repair information;
mapping each first pixel point in the target image and each second pixel point in the shielding area to the three-dimensional coordinate system to construct three-dimensional point cloud data; the two-dimensional coordinates of the first pixel point or the second pixel point under the two-dimensional coordinate system corresponding to the target image are used as X, Y coordinates of a corresponding point in the three-dimensional point cloud data, and the depth value of the first pixel point or the depth value of the second pixel point is used as a Z coordinate of the corresponding point in the three-dimensional point cloud data;
and determining the color value of the first pixel point based on the target image to be used as the color value of the corresponding point in the three-dimensional point cloud data, and determining the color value of the second pixel point based on the color repair information to be used as the color value of the corresponding point in the three-dimensional point cloud data to obtain the target three-dimensional point cloud data.
5. The method of claim 1 or 4, wherein the generating video data which is matched with the target image and at least shows the three-dimensional effect of the target body based on the target three-dimensional point cloud data comprises:
determining an image acquisition track;
determining a multi-frame two-dimensional image matched with the image acquisition track based on the target three-dimensional point cloud data;
and generating, based on the plurality of frames of two-dimensional images, video data which at least shows the three-dimensional effect of the target body for the target image.
6. The method of claim 5, further comprising:
constructing a patch based on a plurality of adjacent points in the target three-dimensional point cloud data;
estimating information about at least one derived point within the patch based on information about points associated with the patch; wherein the information related to the derived point comprises at least one of a three-dimensional coordinate and a color value of the derived point;
generating a point cloud model based on the relevant information of each point in the target three-dimensional point cloud data and the relevant information of the derived points;
wherein the determining a multi-frame two-dimensional image matched with the image acquisition track based on the target three-dimensional point cloud data comprises: determining a multi-frame two-dimensional image matched with the image acquisition track based on the point cloud model.
7. The method of claim 6, wherein the constructing a patch based on a plurality of adjacent points in the target three-dimensional point cloud data comprises:
determining a plurality of adjacent points in the target three-dimensional point cloud data according to the positional relationship between each first pixel point in the target image and each second pixel point in the shielding area, and connecting the determined adjacent points to construct the patch.
8. An image processing apparatus comprising:
the segmentation processing module is used for carrying out segmentation processing on the target image and at least distinguishing the region where the target body is located in the target image from other regions to obtain a segmentation mask;
the repair module is used for obtaining a shielding area corresponding to the area where the target body is located based on the target image and the segmentation mask, and estimating color repair information and depth repair information of at least part of the shielding area;
the point cloud construction module is used for constructing target three-dimensional point cloud data for the target image in a three-dimensional coordinate system based on the related information of the target image, the color repair information and the depth repair information;
and the video generation module is used for generating video data which is matched with the target image and at least shows the three-dimensional effect of the target body on the basis of the target three-dimensional point cloud data.
9. The apparatus of claim 8, wherein the repair module comprises:
a background area determining sub-module, configured to determine a background area in the target image based on the target image and the segmentation mask, wherein the background area is an area of the target image other than the shielding area;
and the color repair sub-module is used for determining the color information of the background area in the target image and estimating the color repair information of at least part of the shielding area based on the color information of the background area.
10. The apparatus of claim 8, wherein the repair module comprises:
a background area determining sub-module, configured to determine a background area in the target image based on the target image and the segmentation mask, wherein the background area is an area of the target image other than the shielding area;
and the depth repair sub-module is used for determining the depth information of the background area in the target image and estimating the depth repair information of at least part of the shielding area based on the depth information of the background area.
11. The apparatus of claim 8, wherein the point cloud construction module comprises:
the first depth obtaining submodule is used for obtaining the depth value of each first pixel point in the target image based on the depth information of the target image;
the second depth obtaining submodule is used for obtaining the depth value of each second pixel point in the shielding area based on the depth repair information;
the mapping sub-module is used for mapping each first pixel point in the target image and each second pixel point in the shielding area to the three-dimensional coordinate system to construct three-dimensional point cloud data; the two-dimensional coordinates of the first pixel point or the second pixel point under the two-dimensional coordinate system corresponding to the target image are used as X, Y coordinates of a corresponding point in the three-dimensional point cloud data, and the depth value of the first pixel point or the depth value of the second pixel point is used as a Z coordinate of the corresponding point in the three-dimensional point cloud data;
and the color assignment submodule is used for determining the color value of the first pixel point based on the target image to be used as the color value of the corresponding point in the three-dimensional point cloud data, and determining the color value of the second pixel point based on the color repair information to be used as the color value of the corresponding point in the three-dimensional point cloud data so as to obtain the target three-dimensional point cloud data.
12. The apparatus of claim 8 or 11, wherein the video generation module comprises:
the track determining submodule is used for determining an image acquisition track;
the frame image determining submodule is used for determining a multi-frame two-dimensional image matched with the image acquisition track based on the target three-dimensional point cloud data;
and the video generation submodule is used for generating video data which at least shows the three-dimensional effect of the target body for the target image based on the multi-frame two-dimensional image.
13. The apparatus of claim 12, further comprising:
a patch constructing module for constructing a patch based on a plurality of adjacent points in the target three-dimensional point cloud data;
a derived point estimation module for estimating information about at least one derived point within the patch based on information about points associated with the patch; wherein the information related to the derived point comprises at least one of a three-dimensional coordinate and a color value of the derived point;
the point cloud model generation module is used for generating a point cloud model based on the related information of each point in the target three-dimensional point cloud data and the related information of the derived points;
and the frame image determining submodule is used for determining a multi-frame two-dimensional image matched with the image acquisition track based on the point cloud model.
14. The apparatus of claim 13, wherein the patch constructing module is configured to:
determine a plurality of adjacent points in the target three-dimensional point cloud data according to the positional relationship between each first pixel point in the target image and each second pixel point in the shielding area, and connect the determined adjacent points to construct the patch.
15. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-7.
16. A non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform the method of any one of claims 1-7.
17. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1-7.
CN202110077973.5A 2021-01-20 2021-01-20 Image processing method, image processing device, electronic equipment and storage medium Pending CN112785492A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110077973.5A CN112785492A (en) 2021-01-20 2021-01-20 Image processing method, image processing device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110077973.5A CN112785492A (en) 2021-01-20 2021-01-20 Image processing method, image processing device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN112785492A true CN112785492A (en) 2021-05-11

Family

ID=75758095

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110077973.5A Pending CN112785492A (en) 2021-01-20 2021-01-20 Image processing method, image processing device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112785492A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113570702A (en) * 2021-07-14 2021-10-29 Oppo广东移动通信有限公司 3D photo generation method and device, terminal and readable storage medium
CN113793255A (en) * 2021-09-09 2021-12-14 百度在线网络技术(北京)有限公司 Method, apparatus, device, storage medium and program product for image processing
CN116843807A (en) * 2023-06-30 2023-10-03 北京百度网讯科技有限公司 Virtual image generation method, virtual image model training method, virtual image generation device, virtual image model training device and electronic equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108830272A (en) * 2018-08-03 2018-11-16 中国农业大学 Potato image collecting device and bud eye based on RGB-D camera identify and position method
EP3621036A1 (en) * 2018-09-07 2020-03-11 Baidu Online Network Technology (Beijing) Co., Ltd. Method and apparatus for generating three-dimensional data, device, and storage medium
CN111815666A (en) * 2020-08-10 2020-10-23 Oppo广东移动通信有限公司 Image processing method and device, computer readable storage medium and electronic device
CN112102340A (en) * 2020-09-25 2020-12-18 Oppo广东移动通信有限公司 Image processing method, image processing device, electronic equipment and computer readable storage medium

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108830272A (en) * 2018-08-03 2018-11-16 中国农业大学 Potato image collecting device and bud eye based on RGB-D camera identify and position method
EP3621036A1 (en) * 2018-09-07 2020-03-11 Baidu Online Network Technology (Beijing) Co., Ltd. Method and apparatus for generating three-dimensional data, device, and storage medium
CN111815666A (en) * 2020-08-10 2020-10-23 Oppo广东移动通信有限公司 Image processing method and device, computer readable storage medium and electronic device
CN112102340A (en) * 2020-09-25 2020-12-18 Oppo广东移动通信有限公司 Image processing method, image processing device, electronic equipment and computer readable storage medium

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113570702A (en) * 2021-07-14 2021-10-29 Oppo广东移动通信有限公司 3D photo generation method and device, terminal and readable storage medium
CN113793255A (en) * 2021-09-09 2021-12-14 百度在线网络技术(北京)有限公司 Method, apparatus, device, storage medium and program product for image processing
WO2023035841A1 (en) * 2021-09-09 2023-03-16 百度在线网络技术(北京)有限公司 Method and apparatus for image processing, and device, storage medium and program product
CN116843807A (en) * 2023-06-30 2023-10-03 北京百度网讯科技有限公司 Virtual image generation method, virtual image model training method, virtual image generation device, virtual image model training device and electronic equipment
CN116843807B (en) * 2023-06-30 2024-09-03 北京百度网讯科技有限公司 Virtual image generation method, virtual image model training method, virtual image generation device, virtual image model training device and electronic equipment

Similar Documents

Publication Publication Date Title
CN112785674B (en) Texture map generation method, rendering device, equipment and storage medium
CN112785492A (en) Image processing method, image processing device, electronic equipment and storage medium
CN109660783B (en) Virtual reality parallax correction
CN108694719B (en) Image output method and device
CN111753762B (en) Method, device, equipment and storage medium for identifying key identification in video
CN112634343A (en) Training method of image depth estimation model and processing method of image depth information
CN112560684B (en) Lane line detection method, lane line detection device, electronic equipment, storage medium and vehicle
CN110443230A (en) Face fusion method, apparatus and electronic equipment
CN111583381B (en) Game resource map rendering method and device and electronic equipment
CN114511662A (en) Method and device for rendering image, electronic equipment and storage medium
CN112270745B (en) Image generation method, device, equipment and storage medium
US20230245339A1 (en) Method for Adjusting Three-Dimensional Pose, Electronic Device and Storage Medium
CN113657396B (en) Training method, translation display method, device, electronic equipment and storage medium
CN115375823B (en) Three-dimensional virtual clothing generation method, device, equipment and storage medium
CN109214996A (en) A kind of image processing method and device
CN115222879B (en) Model face reduction processing method and device, electronic equipment and storage medium
CN114140320B (en) Image migration method and training method and device of image migration model
CN111768467A (en) Image filling method, device, equipment and storage medium
CN114674826A (en) Visual detection method and detection system based on cloth
CN113658035A (en) Face transformation method, device, equipment, storage medium and product
US11195322B2 (en) Image processing apparatus, system that generates virtual viewpoint video image, control method of image processing apparatus and storage medium
CN110717384B (en) Video interactive behavior recognition method and device
CN114723809A (en) Method and device for estimating object posture and electronic equipment
CN115375847B (en) Material recovery method, three-dimensional model generation method and model training method
CN116563172A (en) VR globalization online education interaction optimization enhancement method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination