WO2021170127A1 - Method and apparatus for three-dimensional reconstruction of half-length portrait - Google Patents

Method and apparatus for three-dimensional reconstruction of half-length portrait

Info

Publication number
WO2021170127A1
Authority
WO
WIPO (PCT)
Prior art keywords
texture
image
bust
map
expansion
Prior art date
Application number
PCT/CN2021/078324
Other languages
French (fr)
Chinese (zh)
Inventor
陈国文
胡守刚
赵磊
吕培
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority date: 2020-02-29
Filing date
Publication date
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Publication of WO2021170127A1 publication Critical patent/WO2021170127A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/50Depth or shape recovery
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/003D [Three Dimensional] image rendering
    • G06T15/04Texture mapping
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/20Finite element generation, e.g. wire-frame surface description, tesselation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/70Denoising; Smoothing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2200/00Indexing scheme for image data processing or generation, in general
    • G06T2200/08Indexing scheme for image data processing or generation, in general involving all processing steps from image acquisition to 3D model generation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10004Still image; Photographic image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30196Human being; Person

Definitions

  • the embodiments of the present application relate to the field of image processing technology, and in particular to a method and device for three-dimensional reconstruction of a bust.
  • three-dimensional bust reconstruction technology has a wide range of applications in three-dimensional printing, entertainment, and remote augmented reality (AR) calls.
  • Traditional three-dimensional bust reconstruction technology can adopt a monocular system or a multi-view (multi-camera) system.
  • When a monocular system is used, an infrared depth camera usually scans around the person, and the scanned person needs to remain still throughout the scanning process. As a result, scanning the person takes a long time, the reconstruction computation time is long, and the scan may turn out unsatisfactory or fail.
  • The multi-view scanning system is an acquisition system based on multiple viewing angles. Although it is capable of real-time reconstruction, it is expensive, the equipment is bulky and complex, and it is inconvenient to operate.
  • the present application provides a three-dimensional reconstruction method of a bust.
  • The three-dimensional reconstruction method of the bust can be implemented by an electronic device, for example, by a processor or a processing unit in the electronic device.
  • The method may include: obtaining an image to be processed that includes a bust of the target person, the bust including a frontal face; obtaining a first texture expansion map based on the image to be processed, where the first texture expansion map is used to characterize the front texture of the bust and at least two organs (such as the nose and ears) among the facial features of the bust are located at preset positions in the first texture expansion map; then, supplementing the back texture of the bust in the first texture expansion map according to the front texture of the bust in the first texture expansion map to obtain a second texture expansion map, where the second texture expansion map is used to characterize the surface texture of the bust; and finally, obtaining a three-dimensional model of the bust according to the second texture expansion map.
  • the second texture expansion map is used to characterize the surface texture of the bust, that is, the second texture expansion map describes the omnidirectional texture of the character surface of the bust.
  • the surface texture includes the front texture of the bust and the back texture of the bust.
  • the front of a bust includes a human face, front neck, or front shoulders.
  • the back of the bust can include the back of the head, the back of the neck, or the back of the shoulders.
  • obtaining the first texture expansion map according to the image to be processed can be implemented in the following ways:
  • the obtaining the first texture expansion map according to the positions of at least two organs on the head in the head semantic mask and the three-dimensional mesh model may be implemented in the following manner:
  • the third texture expansion map is adjusted according to the positions of the at least two organs in the three-dimensional mesh model to obtain the first texture expansion map.
  • the texture of the bust can be expanded in a circle to obtain the first texture expansion map.
  • Circumferential expansion is non-Atlas (atlas) texture expansion.
  • Non-atlas expansion helps to improve the continuity of textures and reduce the gaps between textures.
  • inpainting can be avoided, and the position of the semantic block on the texture map can be relatively fixed, which is convenient for machine learning.
  • the at least two organs include ears; the method further includes:
  • Obtaining the three-dimensional model of the bust based on the processed second texture expansion map including:
  • The texture of the ear region on the fused three-dimensional mesh model is fused into the ear region located at the preset position of the fourth texture expansion map to obtain a fused fourth texture expansion map, and a three-dimensional model of the bust is obtained according to the fused fourth texture expansion map.
  • the above design realizes high-precision local texture and geometry through mesh optimization and texture replacement of the ear part, and optimizes texture completion and mesh reconstruction details.
  • smoothing the texture stitching area in the second texture expansion image can be achieved in the following manner:
  • The back texture is estimated based on the front image; the estimated back texture contains no gaps, and it is then weighted together with the back texture obtained from the second texture expansion map to optimize the smoothing of the back gap.
  • weighted fusion processing is performed on the second texture expansion image and the back texture image of the bust to obtain the fourth texture expansion image, which may be implemented in the following manner:
  • The setting rule is:
  • I_3(i,j) = α·I_1(i,j) + (1−α)·I_2(map_1(i), map_2(j));
  • α = I_alpha(i,j);
  • where I_1 denotes the second texture expansion map, I_2 denotes the back image of the bust, I_3 denotes the fourth texture expansion map, and I_alpha denotes the weight map; map_1 denotes the mapping function that maps pixels of the back image of the bust onto the second texture expansion map in the X-axis direction, and map_2 denotes the corresponding mapping function in the Y-axis direction; i denotes the pixel coordinate in the X-axis direction, and j denotes the pixel coordinate in the Y-axis direction.
  • Before obtaining the first texture expansion map according to the positions of at least two organs on the head in the head semantic mask and the three-dimensional mesh model, the method further includes:
  • the three-dimensional mesh model is processed to fill holes to make the three-dimensional mesh model more complete and improve the accuracy of the reconstruction of the three-dimensional model.
  • Performing grid homogenization processing can prevent the obtained 3D mesh model from being too dense or too sparse, which will affect the accuracy of the 3D model reconstruction.
  • Performing mesh smoothing can remove inaccurate meshes in the 3D mesh model, that is, noise points, thereby improving the accuracy and smoothness of the 3D model reconstruction.
  • the present application provides a three-dimensional reconstruction device for a bust, including:
  • An acquiring unit configured to acquire an image to be processed, the image to be processed includes a bust of a target person, and the bust includes a frontal face;
  • a reconstruction unit, configured to obtain a first texture expansion map according to the image to be processed, where the first texture expansion map is used to characterize the front texture of the bust and at least two organs among the facial features of the bust in the first texture expansion map are located at preset positions; to supplement the back texture of the bust in the first texture expansion map according to the front texture of the bust in the first texture expansion map to obtain a second texture expansion map, where the second texture expansion map is used to characterize the surface texture of the bust; and to obtain the three-dimensional model of the bust according to the second texture expansion map.
  • a first texture expansion map is obtained, and the at least two organs in the first texture expansion map are located at preset positions.
  • the third texture expansion map is adjusted according to the positions of the at least two organs in the three-dimensional mesh model to obtain the first texture expansion map.
  • When obtaining the three-dimensional model of the bust based on the second texture expansion map, the reconstruction unit is specifically configured to:
  • The at least two organs include ears; the reconstruction unit is further configured to fuse the pre-configured ear model into the ear region in the three-dimensional mesh model to obtain a fused three-dimensional mesh model.
  • When obtaining the three-dimensional model of the bust based on the processed second texture expansion map, the reconstruction unit is specifically configured to:
  • fuse the texture of the ear region on the fused three-dimensional mesh model into the ear region located at the preset position of the fourth texture expansion map to obtain a fused fourth texture expansion map, and obtain the three-dimensional model of the bust according to the fused fourth texture expansion map.
  • When smoothing the texture seam area in the second texture expansion map, the reconstruction unit is specifically configured to:
  • When performing weighted fusion processing on the second texture expansion map and the back texture image of the bust to obtain the fourth texture expansion map, the reconstruction unit is specifically configured to:
  • The setting rule is:
  • I_3(i,j) = α·I_1(i,j) + (1−α)·I_2(map_1(i), map_2(j));
  • α = I_alpha(i,j);
  • where I_1 denotes the second texture expansion map, I_2 denotes the back image of the bust, I_3 denotes the fourth texture expansion map, and I_alpha denotes the weight map; map_1 denotes the mapping function that maps pixels of the back image of the bust onto the second texture expansion map in the X-axis direction, and map_2 denotes the corresponding mapping function in the Y-axis direction; i denotes the pixel coordinate in the X-axis direction, and j denotes the pixel coordinate in the Y-axis direction.
  • the electronic device further includes a camera; the camera is used to collect images to be processed.
  • the above-mentioned processor is used to control the camera to collect images.
  • An embodiment of the present application provides a computer program product which, when run on an electronic device, causes the electronic device or processor to execute the method of the first aspect and any of its possible designs.
  • Coupled in the embodiments of the present application means that two components are directly or indirectly combined with each other.
  • Figure 1 is a schematic diagram of an electronic device in an embodiment of the application
  • FIG. 2 is a schematic flowchart of a method for three-dimensional reconstruction of a bust in an embodiment of this application;
  • Fig. 3 is a schematic diagram of a bust in an embodiment of the application.
  • FIG. 4 is a schematic flowchart of a method for texture expansion in an embodiment of the application.
  • FIG. 5 is a schematic diagram of three-dimensional reconstruction of a bust in an embodiment of the application.
  • Fig. 6 is a schematic diagram of the central axis in an embodiment of the application.
  • FIG. 7 is a schematic diagram illustrating the expansion of the circumference in an embodiment of the application.
  • Fig. 8 is a schematic diagram of determining the positions of the nose and ears in an embodiment of the application.
  • FIG. 9 is a schematic diagram of smoothing the gap in an embodiment of the application.
  • FIG. 10 is a schematic diagram of a smoothing method used for smoothing the gap in an embodiment of the application.
  • FIG. 14 is a schematic diagram of a device 1400 in an embodiment of the application.
  • the neural processor includes, but is not limited to, a neural network processing unit, such as a deep neural network processing unit or a convolutional neural network processing unit.
  • the neural processor can use the neural network model to perform training, calculation, or processing.
  • the neural network model includes, but is not limited to, a deep neural network model or a convolutional neural network model.
  • the above digital signal processor, image processing unit or central processing unit can also use the neural network model to perform training, calculation or processing.
  • the NPU included in the processor 110 is a neural-network (NN) computing processor.
  • The input information can be processed quickly, and the NPU can also continuously perform self-learning.
  • the three-dimensional reconstruction of the bust of the electronic device 100 can be achieved through the NPU.
  • the memory 120 may be used to store computer executable program code, where the executable program code includes instructions.
  • the processor 110 executes various functional applications and data processing of the electronic device by running instructions stored in the memory 120.
  • the memory 120 may include a program storage area and a data storage area. Among them, the storage program area can store an operating system, driver software, or at least one application program required by a function (such as a sound playback function, an image playback function, etc.).
  • the data storage area can store data (such as audio data, phone book, etc.) created during the use of the electronic device 100.
  • The memory 120 may include at least one of volatile memory or non-volatile memory, such as read-only memory (ROM), random access memory (RAM), dynamic random access memory (DRAM), embedded multimedia card (eMMC), universal flash storage (UFS), hard disk, or magnetic disk.
  • the character "/" generally indicates that the associated objects before and after are in an "or” relationship.
  • "The following at least one item (a)” or similar expressions refers to any combination of these items, including any combination of a single item (a) or a plurality of items (a).
  • For example, at least one of a, b, or c can mean: a; b; c; a and b; a and c; b and c; or a, b, and c, where a, b, and c can each be singular or plural.
  • FIG. 2 is a schematic flowchart of a method for three-dimensional reconstruction of a bust according to an embodiment of the present application.
  • The three-dimensional reconstruction method of the bust can be implemented by the electronic device shown in FIG. 1, for example, by a processor or a processing unit in the electronic device.
  • the three-dimensional reconstruction method of the bust mainly includes S201-S204.
  • the bust may include a frontal face, neck, shoulders, etc., for example, see FIG. 3.
  • At least two organs in the five sense organs of the bust in the first texture development image are located at preset positions.
  • the facial features include eyebrows, eyes, ears, nose, and mouth.
  • The at least two organs being located at preset positions may mean that, in the texture expansion maps obtained for different images to be processed, the positions of the at least two organs are the same.
  • For example, if the two organs are the nose and an ear, the position of the ear is the same in the texture expansion maps of different images to be processed, and the position of the nose is the same in the texture expansion maps of different images to be processed.
  • Alternatively, the at least two organs being located at preset positions may mean that, in the texture expansion maps obtained for different images to be processed, the relative positions of the at least two organs are fixed.
  • For example, the distance between the two ears and the distance between the nose and the ears can be the same in the texture expansion maps of different images to be processed.
  • S203 Supplement the back texture of the bust in the first texture expansion image according to the front texture of the bust in the first texture expansion image to obtain a second texture expansion image.
  • the second texture expansion map is used to characterize the surface texture of the bust.
  • the second texture expansion map is used to characterize the surface texture of the bust, that is, the second texture expansion map describes the omnidirectional texture of the character surface of the bust.
  • the surface texture may be the texture of the upper body surface of the three-dimensional target person. That is, the surface texture includes a texture that surrounds the surface of the upper body along the axis perpendicular to the ground plane.
  • the surface texture includes the front texture of the bust and the back texture of the bust.
  • the front of a bust includes a human face, front neck, or front shoulders.
  • the back of the bust can include the back of the head, the back of the neck, or the back of the shoulders.
  • S202 obtaining the first texture expansion image according to the image to be processed may be implemented by the following S401-S404.
  • The first neural network model may be used to remove the background in the image to be processed to obtain a frontal image of the bust.
  • the first neural network model is used to segment the foreground and background of the image, and output the foreground image.
  • the foreground image is the front image of the bust.
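The text only requires that "a first neural network model" separates foreground from background; as a minimal sketch under that assumption, an off-the-shelf portrait matting tool such as rembg can stand in for that model (the file names are illustrative):

```python
from PIL import Image
from rembg import remove  # off-the-shelf background-removal model, used here only as a stand-in

# Load the image to be processed and keep only the foreground (the frontal bust).
image_to_process = Image.open("portrait.jpg")
front_image = remove(image_to_process)   # RGBA image with the background made transparent
front_image.save("front_image.png")
```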
  • S402 Perform semantic segmentation on the front image to obtain a head semantic mask of the front image.
  • Semantic segmentation is the grouping/segmentation of pixels according to the different semantic meanings expressed in the image.
  • The semantic segmentation of the frontal image may use a fully convolutional network (FCN), such as a U-net, SegNet, DeepLab, RefineNet, or PSPNet network.
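The text does not name one specific segmentation network, only an FCN-style model. A hedged sketch of producing the head semantic mask with a pre-trained face-parsing network follows; the model file and the exact class set are assumptions, not part of the original disclosure:

```python
import numpy as np
import torch

# Assumed: a TorchScript face-parsing network (e.g. a U-net/BiSeNet-style FCN) that
# outputs per-pixel class logits for labels such as skin, nose, ears, and hair.
parsing_model = torch.jit.load("face_parsing.pt").eval()

def head_semantic_mask(front_image_rgb: np.ndarray) -> np.ndarray:
    """Return a label map with one semantic class per pixel of the frontal image."""
    x = torch.from_numpy(front_image_rgb).permute(2, 0, 1).float() / 255.0
    with torch.no_grad():
        logits = parsing_model(x.unsqueeze(0))        # shape (1, num_classes, H, W)
    return logits.argmax(dim=1).squeeze(0).cpu().numpy()
```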
  • S403 Obtain a three-dimensional mesh model of the bust according to the semantic mask of the head and the frontal image.
  • the head semantic mask and the frontal image may be input to the second neural network model, and the second neural network model is used for human body reconstruction.
  • The second neural network model outputs a three-dimensional (3D) truncated signed distance function (TSDF) volume. The surface mesh of the TSDF volume is then extracted to obtain the three-dimensional mesh model of the bust.
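The excerpt extracts the surface mesh from the TSDF volume but does not name the extraction algorithm; marching cubes is the usual choice and is shown below as a sketch (the voxel size is an assumed parameter of the volume, not stated in the text):

```python
import numpy as np
from skimage import measure
import trimesh

def tsdf_to_mesh(tsdf: np.ndarray, voxel_size: float = 0.005) -> trimesh.Trimesh:
    """tsdf: (D, H, W) float volume of truncated signed distances from the second network."""
    # The reconstructed surface is the zero level set of the TSDF.
    verts, faces, normals, _ = measure.marching_cubes(tsdf, level=0.0)
    return trimesh.Trimesh(vertices=verts * voxel_size, faces=faces,
                           vertex_normals=normals)
```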
  • At least one of the following processing may be performed on the three-dimensional mesh model: hole filling processing, mesh uniformization processing, or mesh smoothing processing.
  • For hole filling, a triangular mesh hole filling method based on a radial basis function (RBF) or a hole filling algorithm based on the Poisson equation may be used.
  • grid uniformization algorithms such as point clustering, edge folding, and vertex addition and deletion can be used.
  • mesh smoothing methods based on Poisson's equation or discrete Laplace equation can be used.
  • After hole filling processing, the 3D mesh model is more complete and the accuracy of the 3D model reconstruction is improved.
  • Performing grid homogenization processing can prevent the obtained 3D mesh model from being too dense or too sparse, which will affect the accuracy of the 3D model reconstruction.
  • Performing mesh smoothing can remove inaccurate meshes in the 3D mesh model, that is, noise points, thereby improving the accuracy and smoothness of the 3D model reconstruction.
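As a rough illustration of this optional post-processing, the trimesh library offers simple hole filling and discrete Laplacian smoothing; the RBF/Poisson-based hole filling and the specific uniformization algorithms named above would replace these calls in a faithful implementation:

```python
import trimesh

def clean_mesh(mesh: trimesh.Trimesh) -> trimesh.Trimesh:
    # Hole filling: trimesh only closes simple holes; an RBF- or Poisson-based
    # filler, as named in the text, would be substituted for larger holes.
    trimesh.repair.fill_holes(mesh)
    trimesh.repair.fix_normals(mesh)
    # Mesh smoothing via a discrete Laplacian filter, one of the options named above.
    trimesh.smoothing.filter_laplacian(mesh, lamb=0.5, iterations=5)
    return mesh
```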
  • In S404, the first texture expansion map of the three-dimensional mesh model can be obtained according to the positions of at least two organs on the head in the head semantic mask, so that the at least two organs in the first texture expansion map are located at the preset positions.
  • the texture of the bust may be expanded in a circle to obtain the first texture expansion map.
  • Circumferential expansion is non-Atlas (atlas) texture expansion.
  • Non-atlas expansion helps to improve the continuity of textures and reduce the gaps between textures.
  • inpainting can be avoided, and the position of the semantic block on the texture map can be relatively fixed, which is convenient for machine learning.
  • the front image of the bust shown in (a) in FIG. 5 is obtained after A1 processing.
  • Semantic segmentation is performed on the frontal image of the bust to obtain the head semantic mask; the head semantic mask obtained after A2 processing is shown in Figure 5(b).
  • the head semantic mask and frontal image can be input into the second neural network model.
  • the output TSDF volume is shown in Figure 5 (c).
  • the surface mesh of the TSDF volume is extracted to obtain the three-dimensional mesh model of the bust.
  • the first texture expansion diagram is obtained as shown in (e) of FIG. 5.
  • a second texture expansion map as shown in (f) in FIG. 5 is obtained, that is, a texture map after texture completion.
  • the texture map after texture completion and the surface mesh of the three-dimensional mesh model are combined to obtain the final three-dimensional model of the textured bust, as shown in Figure 5 (g).
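Putting the Figure 5 walkthrough together, the overall flow can be sketched as follows; every function name is a placeholder for the corresponding step described above, not an actual API:

```python
def reconstruct_bust(image_to_process):
    front = remove_background(image_to_process)         # Fig. 5(a): frontal image of the bust
    head_mask = head_semantic_mask(front)               # Fig. 5(b): head semantic mask
    tsdf = predict_tsdf(head_mask, front)               # Fig. 5(c): TSDF volume from the second network
    mesh = extract_surface_mesh(tsdf)                   # Fig. 5(d): three-dimensional mesh model
    tex_first = unwrap_texture(mesh, head_mask, front)  # Fig. 5(e): first texture expansion map
    tex_second = complete_back_texture(tex_first)       # Fig. 5(f): texture map after completion
    return apply_texture(mesh, tex_second)              # Fig. 5(g): final textured 3D model
```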
  • S404 obtains the first texture expansion map of the three-dimensional mesh model according to the positions of at least two organs on the head in the head semantic mask, which can be implemented in the following manner:
  • A1 performing texture expansion on the texture corresponding to the three-dimensional mesh model based on the central axis to obtain a third texture expansion image, the central axis being the connecting line from the top of the head to the bottom of the head in the three-dimensional mesh model.
  • the top of the head can be the highest point on the top of the head.
  • The bottom of the head can be the geometric center of the lowest surface of the three-dimensional mesh model.
  • The third texture expansion map can be obtained, based on the central axis, according to the spatial angle of a point on the mesh relative to the central axis or according to the circumference of the curved surface at that point.
  • the first rule can be satisfied by the circumferential expansion according to the angle.
  • the first rule can be:
  • (x,y,z) is the spatial coordinates of a point in a three-dimensional grid model
  • [u,v] represents the pixel coordinates of the point with spatial coordinates (x,y,z) in the texture expansion map
  • The pixel coordinates take the lower left corner of the image as the origin.
  • W is the width of the expanded texture image
  • H is the height of the expanded texture image
  • cx and cz represent the coordinates of the central axis on the plane of equal Y value.
  • The third texture expansion map describes the mesh texture of the three-dimensional mesh model, that is, it is a mesh texture expansion map without pixel values.
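The "first rule" itself is not reproduced in this excerpt, so the following is only an assumed angle-based circumferential mapping that is consistent with the variable descriptions above (W, H, cx, cz, and a lower-left pixel origin), not the patent's exact formula:

```python
import numpy as np

def unwrap_uv(points: np.ndarray, cx: float, cz: float, W: int, H: int) -> np.ndarray:
    """points: (N, 3) array of mesh vertex coordinates (x, y, z).
    Returns (N, 2) pixel coordinates [u, v] in the third texture expansion map."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    theta = np.arctan2(x - cx, z - cz)                    # angle about the central axis
    u = (theta + np.pi) / (2.0 * np.pi) * W               # horizontal pixel coordinate
    v = (y - y.min()) / (y.max() - y.min() + 1e-9) * H    # vertical coordinate, bottom at v = 0
    return np.stack([u, v], axis=1)
```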
  • A2. Determine the positions of at least two organs in the three-dimensional mesh model according to the positions of at least two organs on the head in the head semantic mask.
  • the head semantic mask includes the locations of at least two organs, that is, the coordinates of at least two organs can be determined according to the head semantic mask. Take the two organs, the ear and the nose, for example.
  • the three-dimensional mesh model is obtained according to the semantic mask of the head. Therefore, there is a mapping relationship between the ears and nose in the semantic mask of the head and the ears and noses in the three-dimensional mesh model. Therefore, the positions of the ears and the nose in the three-dimensional mesh model can be determined according to the positions of the ears and the nose in the semantic mask of the head.
  • A3 Adjust the third texture expansion map according to the positions of at least two organs in the three-dimensional mesh model to obtain the first texture expansion map.
  • Figure 8 (a) shows the ear and nose positions in the semantic mask of the head.
  • Figure 8(b) shows the positions of ears and nose in the three-dimensional mesh model.
  • One way is to adjust the positions of the ears and/or nose in the third texture expansion map according to the positions of the ears and nose in the 3D mesh model; that is, the positions of the ears and/or nose in the third texture expansion map are determined according to the positions of the ears and nose in the 3D mesh model, and the ear and/or nose positions are adjusted so that the adjusted ears and nose are located at the preset positions of the third texture expansion map. The first texture expansion map is then obtained according to the adjusted third texture expansion map and the frontal image, and the adjusted ears and nose are located at the preset positions of the first texture expansion map.
  • the distortion mapping process may be performed on the third expanded texture image, so that the ears and the nose in the first expanded texture image are located at preset positions.
  • Another way is to obtain a color-filled third texture expansion map based on the third texture expansion map and the front image.
  • Then the positions of the ears and the nose in the three-dimensional mesh model are determined, and the positions of the ears and/or the nose in the color-filled third texture expansion map are adjusted to obtain the first texture expansion map.
  • the ears and the nose are located at the preset positions of the first texture development image.
  • There is a mapping relationship between the coordinates of each pixel in the third texture expansion map and the coordinates of each point in the three-dimensional mesh model, and there is a mapping relationship between the coordinates of each point in the three-dimensional mesh model and the coordinates of the pixels in the frontal image. Consequently, there is a mapping relationship between the coordinates of each pixel in the third texture expansion map and the coordinates of the pixels in the frontal image. Therefore, the pixel values of the pixels in the frontal image can be mapped onto the third texture expansion map (or onto the adjusted third texture expansion map).
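As an illustration of that chained mapping, here is a rough sketch of colouring the (adjusted) third texture expansion map from the frontal image; uv and img_xy are assumed per-vertex arrays holding, respectively, each vertex's texture coordinates and its projection into the frontal image:

```python
import numpy as np

def fill_texture(front_image, uv, img_xy, tex_shape):
    """front_image: (H, W, 3) frontal image; uv/img_xy: (N, 2) per-vertex coordinates;
    tex_shape: shape of the texture expansion map to be filled."""
    tex = np.zeros(tex_shape, dtype=front_image.dtype)
    u = np.clip(uv[:, 0].astype(int), 0, tex_shape[1] - 1)
    v = np.clip(uv[:, 1].astype(int), 0, tex_shape[0] - 1)
    px = np.clip(img_xy[:, 0].astype(int), 0, front_image.shape[1] - 1)
    py = np.clip(img_xy[:, 1].astype(int), 0, front_image.shape[0] - 1)
    tex[v, u] = front_image[py, px]   # only vertices visible in the frontal image receive colour
    return tex
```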
  • Current half-body texture expansion generally expands the front and the back separately, that is, into a visible front part and an invisible back part. This results in poor texture reconstruction of the invisible parts of the hair and ears and a poor geometric mesh reconstruction of the ears, so it is difficult to meet user needs.
  • The adjusted ears and nose are located at the preset positions of the first texture expansion map. This provides convenient conditions for the subsequent optimization of the ear part and can improve the accuracy of that optimization.
  • the optimization of the ear part will be described later, which will not be repeated here.
  • the hair may completely cover the ear area.
  • the subsequent optimization of the ear part may not be performed.
  • the texture expansion map can be obtained based on the spatial angle or surface circumference of a certain place based on the central axis, and A2 and A3 can no longer be executed.
  • the color-filled third texture expansion map can be obtained directly according to the third texture expansion map and the front image, that is, the filled third texture expansion map is used as the first texture expansion map.
  • a certain cross section of the head can be considered as a circle.
  • the black dots between - ⁇ and ⁇ in (a) in FIG. 7 can be considered as the points on the cross-section of the expansion line used when the texture is expanded.
  • the expansion line is on the back as an example.
  • Figure 9 is a schematic diagram of seam-like gaps in the stitching area of the three-dimensional model; the result after the gaps are smoothed is shown on the right in Figure 9.
  • the gap smoothing operation is to perform fusion processing on the texture stitching area in the second texture expansion map.
  • the second texture expansion image and the back texture image of the bust may be subjected to weighted fusion processing according to the set rules to obtain the fourth texture expansion image;
  • The setting rule can be:
  • I_3(i,j) = α·I_1(i,j) + (1−α)·I_2(map_1(i), map_2(j));
  • α = I_alpha(i,j);
  • where I_1 denotes the second texture expansion map, I_2 denotes the back image of the bust, I_3 denotes the fourth texture expansion map, and I_alpha denotes the weight map; map_1 denotes the mapping function that maps pixels of the back image of the bust onto the second texture expansion map in the X-axis direction, and map_2 denotes the corresponding mapping function in the Y-axis direction; i denotes the pixel coordinate in the X-axis direction, and j denotes the pixel coordinate in the Y-axis direction.
  • the back texture image of the bust is determined according to the front image of the bust as shown in FIG. 10(a), and the weight of each pixel can be shown in FIG. 10(b).
  • The second texture expansion map is shown in Figure 10(c).
  • The second texture expansion map (Figure 10(c)) and the back texture image of the bust (Figure 10(a)) are fused according to the weights in Figure 10(b) to obtain Figure 10(d), namely the fourth texture expansion map.
  • the fusion operation is indicated by the plus sign "+" in Figure 10.
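A minimal sketch of the weighted fusion rule above, written directly from the formula; map1 and map2 are assumed to be precomputed integer lookup tables and alpha the per-pixel weight map of Figure 10(b):

```python
import numpy as np

def fuse_back_texture(I1, I2, alpha, map1, map2):
    """I1: second texture expansion map; I2: estimated back image of the bust;
    alpha: per-pixel weight map (same height/width as I1);
    map1/map2: lookup tables mapping I1 coordinates onto I2 coordinates."""
    I3 = np.zeros(I1.shape, dtype=np.float32)
    for i in range(I1.shape[0]):
        for j in range(I1.shape[1]):
            a = alpha[i, j]
            # I3(i,j) = alpha * I1(i,j) + (1 - alpha) * I2(map1(i), map2(j))
            I3[i, j] = a * I1[i, j] + (1.0 - a) * I2[map1[i], map2[j]]
    return I3
```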
  • the process of optimizing the ear part will be described in detail below.
  • The face of the bust includes the ears, that is, the ears are not covered by the hair. Whether the ears are included can be determined based on the head semantic mask.
  • When optimizing the ear part, the following method can be used:
  • the fusion method used when fusing the texture of the ear region on the fused three-dimensional network model to the ear region located at the preset position of the fourth texture expansion map may be a fusion algorithm based on the image Laplacian gradient.
  • FIG. 11 is a schematic diagram of a three-dimensional mesh model.
  • the ear area in the determined three-dimensional mesh model can be seen in (b) of FIG. 11.
  • the geometric dimensions of the ear can be determined according to the size of the three-dimensional mesh model.
  • the size of the three-dimensional mesh model can be preset by the user or a default size can be adopted.
  • The ear model is fitted to the ear area in the 3D mesh model, as shown in Figure 11(c), and fusion processing is then performed on the 3D mesh model fitted with the ear model; the resulting fused 3D mesh model is shown in Figure 11(d). The texture of the ear region on the fused three-dimensional mesh model is then obtained and fused into the ear region at the preset position in the fourth texture expansion map.
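OpenCV's seamlessClone performs Poisson blending, which is one concrete realisation of a fusion algorithm "based on the image Laplacian gradient" as mentioned above; a hedged sketch of pasting the rendered ear texture into the preset ear region of the fourth texture expansion map:

```python
import cv2
import numpy as np

def fuse_ear_texture(texture_map, ear_patch, ear_mask, ear_center_xy):
    """texture_map: fourth texture expansion map, (H, W, 3) uint8;
    ear_patch: ear texture rendered from the fused mesh, same size as ear_mask;
    ear_mask: uint8 mask, 255 inside the ear region;
    ear_center_xy: (x, y) centre of the preset ear position in the texture map."""
    # Poisson (Laplacian-gradient) blending of the ear patch into the texture map.
    return cv2.seamlessClone(ear_patch, texture_map, ear_mask,
                             ear_center_xy, cv2.NORMAL_CLONE)
```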
  • the solution provided in the embodiment of this application is applied to a virtual three-dimensional dialogue.
  • The local terminal device receives the user's trigger and starts a virtual 3D video call.
  • The video stream is obtained through the terminal's camera.
  • The 3D reconstruction method for a bust based on a single-frame image proposed in this application can be used to reconstruct the 3D model.
  • The terminal drives the three-dimensional model by acquiring the user's expression in each frame of the video stream and sends it to the peer terminal, and the peer terminal displays the local user's expression as simulated by the three-dimensional model.
  • creating a three-dimensional model can be performed by a computing cloud.
  • the electronic device sends a single frame image to the computing cloud, and the computing cloud creates a three-dimensional model, and then sends the created three-dimensional model to the terminal.
  • the embodiments of the present application also provide an apparatus 1400.
  • The apparatus 1400 may specifically include functional modules in an electronic device (for example, components in the processor 110 in FIG. 1 or software modules executed by it), or the apparatus 1400 may be a chip or a chip system, or the apparatus 1400 may be a module in an electronic device, or the like.
  • the apparatus may include an obtaining unit 1401 and a reconstruction unit 1402.
  • the obtaining unit 1401 and the reconstruction unit 1402 respectively execute different steps of the method shown in the embodiment corresponding to FIG. 2 and FIG. 4.
  • the obtaining unit 1401 may be used to obtain the image to be processed in S201
  • the reconstruction unit 1402 may be used to perform the process of S202-S204.
  • the specific implementation is as described above and will not be repeated here.
  • the aforementioned acquisition unit 1401 or reconstruction unit 1402 can be implemented by software, hardware, or a combination of software and hardware.
  • The hardware can be a CPU, microprocessor, DSP, micro control unit (MCU), artificial intelligence processor, application-specific integrated circuit (ASIC), field programmable gate array (FPGA), dedicated digital circuit, hardware accelerator, or any one or any combination of non-integrated discrete devices, which can run the necessary software or perform the above method flow without relying on software, and may be located inside the processor 110 described above with reference to Figure 1.
  • If the module is implemented in software, the software exists in the form of computer program instructions and is stored in a memory, such as the memory 120 in FIG. 1.
  • The processor may include, but is not limited to, at least one of the following computing devices that run software: a CPU, microprocessor, DSP, microcontroller, or artificial intelligence processor. Each computing device may include one or more cores used to execute software instructions for calculation or processing.
  • The processor can be a single semiconductor chip, or it can be integrated with other circuits to form a semiconductor chip. For example, it can form a system on chip (SoC) with other circuits (such as codec circuits, hardware acceleration circuits, or various bus and interface circuits).
  • the processor may further include necessary hardware accelerators, such as FPGAs, PLDs (programmable logic devices), or logic circuits that implement dedicated logic operations.
  • this application can be provided as methods, systems, or computer program products. Therefore, this application may adopt the form of a complete hardware embodiment, a complete software embodiment, or an embodiment combining software and hardware. Moreover, this application may adopt the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program codes.
  • These computer program instructions can also be stored in a computer-readable memory that can guide a computer or other programmable data processing equipment to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including the instruction device.
  • the device implements the functions specified in one process or multiple processes in the flowchart and/or one block or multiple blocks in the block diagram.
  • These computer program instructions can also be loaded on a computer or other programmable data processing equipment, so that a series of operation steps are executed on the computer or other programmable equipment to produce computer-implemented processing, so as to execute on the computer or other programmable equipment.
  • the instructions provide steps for implementing the functions specified in one process or multiple processes in the flowchart and/or one block or multiple blocks in the block diagram.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Processing (AREA)
  • Processing Or Creating Images (AREA)
  • Image Generation (AREA)

Abstract

The present application provides a method and apparatus for three-dimensional reconstruction of a half-length portrait, for solving the problems of high complexity and long reconstruction duration. The method comprises: obtaining an image comprising a frontal human face of a half-length portrait of a target person, unfolding a texture by using a head semantic mask of the frontal human face so that at least two organs in the unfolded texture are located at preset positions, and then complementing a back texture according to the unfolded texture map comprising the front. A three-dimensional mesh model can further be constructed according to the frontal human face, a preconfigured ear model is used to replace the ears in the three-dimensional mesh model, and a texture of the ear area is obtained from the replaced three-dimensional mesh model and fused into the complemented back texture, so as to obtain the textured three-dimensional model. A professional modeling device is not required, and the complexity is low. Because the frontal and back textures of a human body have a certain correlation, the back texture is complemented according to the frontal texture, so that the constructed three-dimensional model fits the person better and a better effect is achieved.

Description

Method and device for three-dimensional reconstruction of a bust
Cross-reference to related applications
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on February 29, 2020, with application number 202010132592.8 and entitled "Method and device for three-dimensional reconstruction of a bust", the entire content of which is incorporated herein by reference.
Technical field
The embodiments of the present application relate to the field of image processing technology, and in particular to a method and device for three-dimensional reconstruction of a bust.
Background
At present, three-dimensional bust reconstruction technology is widely used in fields such as three-dimensional printing, entertainment, and remote augmented reality (AR) calls. Traditional three-dimensional bust reconstruction technology can adopt a monocular system or a multi-view system. When a monocular system is used, an infrared depth camera usually scans around the person, and the scanned person needs to remain still throughout the scanning process. As a result, scanning the person takes a long time, the reconstruction computation time is long, and the scan may turn out unsatisfactory or fail. The multi-view scanning system is an acquisition system based on multiple viewing angles. Although it is capable of real-time reconstruction, it is expensive, the equipment is bulky and complex, and it is inconvenient to operate.
Summary of the invention
The embodiments of the present application provide a method and device for three-dimensional reconstruction of a bust to solve the problems of long reconstruction time and high complexity.
The specific technical solutions provided by the embodiments of this application are as follows:
In a first aspect, the present application provides a three-dimensional reconstruction method for a bust. The method can be implemented by an electronic device, for example, by a processor or a processing unit in the electronic device. The method may include: obtaining an image to be processed that includes a bust of a target person, the bust including a frontal face; obtaining a first texture expansion map based on the image to be processed, where the first texture expansion map is used to characterize the front texture of the bust and at least two organs (such as the nose and ears) among the facial features of the bust are located at preset positions in the first texture expansion map; then, supplementing the back texture of the bust in the first texture expansion map according to the front texture of the bust in the first texture expansion map to obtain a second texture expansion map, where the second texture expansion map is used to characterize the surface texture of the bust; and finally, obtaining a three-dimensional model of the bust according to the second texture expansion map. The reconstruction method provided in this application does not require professional modeling equipment and has low complexity. Since the front and back textures of the human body are correlated to a certain degree, the back texture is supplemented according to the front texture, so that the constructed three-dimensional model fits the person better and achieves a better result.
The second texture expansion map is used to characterize the surface texture of the bust, that is, it describes the all-around texture of the surface of the person in the bust. The surface texture includes the front texture of the bust and the back texture of the bust. For example, the front of the bust includes the face, the front of the neck, or the front of the shoulders, and the back of the bust can include the back of the head, the back of the neck, or the back of the shoulders.
In a possible design, obtaining the first texture expansion map according to the image to be processed can be implemented in the following way:
removing the background in the image to be processed to obtain a frontal image of the bust;
performing semantic segmentation on the frontal image to obtain a head semantic mask of the frontal image;
obtaining a three-dimensional mesh model of the bust according to the head semantic mask and the frontal image;
obtaining the first texture expansion map according to the positions of at least two organs on the head in the head semantic mask and the three-dimensional mesh model, where the at least two organs in the first texture expansion map are located at preset positions.
Through the above design, the texture expansion map is obtained according to the head semantic mask, that is, expanding the texture according to semantic pixel coordinates can improve the expansion accuracy.
In a possible design, obtaining the first texture expansion map according to the positions of at least two organs on the head in the head semantic mask and the three-dimensional mesh model may be implemented in the following manner:
performing texture expansion on the texture corresponding to the three-dimensional mesh model based on a central axis to obtain a third texture expansion map, where the central axis is the line connecting the top of the head to the bottom of the head in the three-dimensional mesh model;
determining the positions of the at least two organs in the three-dimensional mesh model according to the positions of the at least two organs on the head in the head semantic mask;
adjusting the third texture expansion map according to the positions of the at least two organs in the three-dimensional mesh model to obtain the first texture expansion map.
In the above design, after the three-dimensional mesh model of the bust is obtained, the texture of the bust can be expanded circumferentially, combined with the front texture of the bust, to obtain the first texture expansion map. Circumferential expansion is a non-atlas texture expansion. Non-atlas expansion helps to improve the continuity of textures and reduce the gaps between textures. In addition, blockwise inpainting can be avoided, and the positions of the semantic blocks on the texture map are relatively fixed, which is convenient for machine learning.
In a possible design, obtaining the three-dimensional model of the bust based on the second texture expansion map includes:
smoothing the texture seam area in the second texture expansion map to obtain a fourth texture expansion map, and obtaining the three-dimensional model of the bust based on the fourth texture expansion map, where the texture seam area is determined according to the expansion line used when the texture of the three-dimensional mesh model is expanded.
In the above design, smoothing the seam area can erase the gaps in the seam area and improve the modeling details.
In a possible design, the at least two organs include ears, and the method further includes:
fusing a pre-configured ear model into the ear region in the three-dimensional mesh model to obtain a fused three-dimensional mesh model;
obtaining the three-dimensional model of the bust based on the processed second texture expansion map includes:
fusing the texture of the ear region on the fused three-dimensional mesh model into the ear region located at the preset position of the fourth texture expansion map to obtain a fused fourth texture expansion map, and obtaining the three-dimensional model of the bust according to the fused fourth texture expansion map.
The above design achieves high-precision local texture and geometry through mesh optimization and texture replacement of the ear part, and improves the details of texture completion and mesh reconstruction.
In a possible design, smoothing the texture seam area in the second texture expansion map can be achieved in the following manner:
determining a back texture image of the bust according to the frontal image of the bust;
performing weighted fusion processing on the second texture expansion map and the back texture image of the bust to obtain the fourth texture expansion map.
In the above design, since the front and back textures of the human body are correlated to a certain degree, the back texture is estimated from the frontal image; the estimated back texture contains no gaps, and it is then weighted together with the back texture obtained from the second texture expansion map to optimize the smoothing of the back gap.
In a possible design, performing weighted fusion processing on the second texture expansion map and the back texture image of the bust to obtain the fourth texture expansion map may be implemented in the following manner:
performing weighted fusion processing on the second texture expansion map and the back image of the bust according to a set rule to obtain the fourth texture expansion map;
the set rule is:
I_3(i,j) = α·I_1(i,j) + (1−α)·I_2(map_1(i), map_2(j));
α = I_alpha(i,j);
where I_1 denotes the second texture expansion map, I_2 denotes the back image of the bust, I_3 denotes the fourth texture expansion map, and I_alpha denotes the weight map; map_1 denotes the mapping function that maps pixels of the back image of the bust onto the second texture expansion map in the X-axis direction, and map_2 denotes the corresponding mapping function in the Y-axis direction; i denotes the pixel coordinate in the X-axis direction, and j denotes the pixel coordinate in the Y-axis direction.
In a possible design, before obtaining the first texture expansion map according to the positions of at least two organs on the head in the head semantic mask and the three-dimensional mesh model, the method further includes:
performing at least one of the following on the three-dimensional mesh model:
hole filling processing, mesh uniformization processing, or mesh smoothing processing.
In the above design, after hole filling processing the three-dimensional mesh model is more complete, which improves the accuracy of the three-dimensional model reconstruction. Mesh uniformization processing can prevent the obtained three-dimensional mesh model from being too dense or too sparse, which would affect the accuracy of the three-dimensional model reconstruction. Mesh smoothing can remove inaccurate meshes, that is, noise points, in the three-dimensional mesh model, thereby improving the accuracy and smoothness of the three-dimensional model reconstruction.
In a second aspect, the present application provides a three-dimensional reconstruction device for a bust, including:
an acquiring unit, configured to acquire an image to be processed, where the image to be processed includes a bust of a target person and the bust includes a frontal face;
a reconstruction unit, configured to obtain a first texture expansion map according to the image to be processed, where the first texture expansion map is used to characterize the front texture of the bust and at least two organs among the facial features of the bust in the first texture expansion map are located at preset positions; to supplement the back texture of the bust in the first texture expansion map according to the front texture of the bust in the first texture expansion map to obtain a second texture expansion map, where the second texture expansion map is used to characterize the surface texture of the bust; and to obtain the three-dimensional model of the bust according to the second texture expansion map.
In a possible design, when obtaining the first texture expansion map according to the image to be processed, the reconstruction unit is specifically configured to:
remove the background in the image to be processed to obtain a frontal image of the bust;
perform semantic segmentation on the frontal image to obtain a head semantic mask of the frontal image;
obtain a three-dimensional mesh model of the bust according to the head semantic mask and the frontal image;
obtain the first texture expansion map according to the positions of at least two organs on the head in the head semantic mask and the three-dimensional mesh model, where the at least two organs in the first texture expansion map are located at preset positions.
在一种可能的设计中,所述重建单元,在根据头部语义掩膜中头部上至少两个器官所在位置以及所述三维网格模型获得第一纹理展开图时,具体用于:In a possible design, the reconstruction unit is specifically configured to: when obtaining the first texture expansion map according to the positions of at least two organs on the head in the head semantic mask and the three-dimensional mesh model:
基于中心轴对所述三维网格模型对应的纹理执行纹理展开得到第三纹理展开图,所述中心轴为所述三维网格模型中的头部顶端至头部底端的连接线;Performing texture expansion on the texture corresponding to the three-dimensional mesh model based on a central axis to obtain a third texture expansion diagram, where the central axis is the connecting line from the top end of the head to the bottom end of the head in the three-dimensional mesh model;
根据所述头部语义掩膜中头部上至少两个器官所在位置确定所述三维网格模型中所述至少两个器官的所在位置;Determining the locations of the at least two organs in the three-dimensional mesh model according to the locations of the at least two organs on the head in the head semantic mask;
根据所述三维网格模型中所述至少两个器官的所在位置对第三纹理展开图进行调整得到所述第一纹理展开图。The third texture expansion map is adjusted according to the positions of the at least two organs in the three-dimensional mesh model to obtain the first texture expansion map.
在一种可能的设计中,所述重建单元,在基于所述第二纹理展开图得到所述半身像的三维模型时,具体用于:In a possible design, the reconstruction unit is specifically configured to: when obtaining the three-dimensional model of the bust based on the second texture expansion map:
Smooth a texture seam region in the second texture expansion map to obtain a fourth texture expansion map, and obtain the three-dimensional model of the bust based on the fourth texture expansion map, where the texture seam region is determined according to the unfolding line used when texture expansion is performed on the three-dimensional mesh model.
In a possible design, the at least two organs include an ear; the reconstruction unit is further configured to fuse a preconfigured ear model into an ear region of the three-dimensional mesh model to obtain a fused three-dimensional mesh model.
所述重建单元,在基于处理后的第二纹理展开图得到所述半身像的三维模型时,具体用于:The reconstruction unit is specifically configured to: when obtaining the three-dimensional model of the bust based on the processed second texture expansion map:
Fuse the texture of the ear region on the fused three-dimensional mesh model into the ear region located at a preset position of the fourth texture expansion map to obtain a fused fourth texture expansion map, and obtain the three-dimensional model of the bust according to the fused fourth texture expansion map.
在一种可能的设计中,所述重建单元,在对所述第二纹理展开图中纹理缝合线区域进行平滑处理时,具体用于:In a possible design, the reconstruction unit is specifically configured to: when smoothing the texture stitch line region in the second texture expansion map:
根据所述半身像的正面图像确定所述半身像的背面纹理图像;Determining the back texture image of the bust according to the front image of the bust;
将所述第二纹理展开图以及所述半身像的背面纹理图像进行加权融合处理得到所述第四纹理展开图。Perform weighted fusion processing on the second texture expansion image and the back texture image of the bust to obtain the fourth texture expansion image.
在一种可能的设计中,所述重建单元,在将所述第二纹理展开图以及所述半身像的背面纹理图像进行加权融合处理得到所述第四纹理展开图时,具体用于:In a possible design, when the reconstruction unit performs weighted fusion processing on the second texture expansion image and the back texture image of the bust to obtain the fourth texture expansion image, it is specifically configured to:
将所述第二纹理展开图以及所述半身像的背面图像按照设定规则进行加权融合处理得到所述第四纹理展开图;Performing weighted fusion processing on the second texture expansion image and the back image of the bust according to a set rule to obtain the fourth texture expansion image;
所述设定规则为:The setting rules are:
I3(i, j) = α·I1(i, j) + (1 - α)·I2(map1(i), map2(j));
α = Ialpha(i, j);
Here, I1 denotes the second texture expansion map, I2 denotes the back image of the bust, I3 denotes the fourth texture expansion map, Ialpha denotes the weight, map1 denotes the mapping function that maps pixels of the back image of the bust onto the second texture expansion map in the X-axis direction, map2 denotes the mapping function that maps pixels of the back image of the bust onto the second texture expansion map in the Y-axis direction, i denotes the coordinate of a pixel in the X-axis direction, and j denotes the coordinate of a pixel in the Y-axis direction.
In a possible design, before obtaining the first texture expansion map according to the positions of the at least two organs on the head in the head semantic mask and the three-dimensional mesh model, the reconstruction unit is further configured to perform at least one of the following on the three-dimensional mesh model: hole filling, mesh uniformization, or mesh smoothing.
According to a third aspect, an embodiment of this application provides an electronic device, including a processor and a memory, where the processor is coupled to the memory; the memory is configured to store program instructions; and the processor is configured to read the program instructions stored in the memory, to implement the method in the first aspect or any possible design thereof. Optionally, the processor includes an ISP configured to perform the process of acquiring the image to be processed.
在一种可能的设计中,所述电子设备还包括摄像头;所述摄像头用于采集待处理图像。上述处理器用于控制所述摄像头采集图像。In a possible design, the electronic device further includes a camera; the camera is used to collect images to be processed. The above-mentioned processor is used to control the camera to collect images.
According to a fourth aspect, an embodiment of this application provides a computer storage medium, where the computer storage medium stores program instructions, and when the program instructions are run on an electronic device, the electronic device or a processor is enabled to perform the method in the first aspect or any possible design thereof.
第五方面,本申请实施例提供的一种计算机程序产品,当计算机程序产品在电子设备上运行时,使得电子设备或处理器执行第一方面及其任一可能的设计的方法。In the fifth aspect, a computer program product provided by an embodiment of the present application, when the computer program product runs on an electronic device, causes the electronic device or processor to execute the first aspect and any possible design method thereof.
第六方面,本申请实施例提供的一种芯片,所述芯片与电子设备中的存储器耦合,执行第一方面及其任一可能的设计的方法。可选地,该芯片包括ISP,用于执行获取待处理图像的过程。The sixth aspect is a chip provided by an embodiment of the present application, which is coupled with a memory in an electronic device, and executes the first aspect and any possible design method thereof. Optionally, the chip includes an ISP for performing the process of acquiring the image to be processed.
另外,第二方面至第六方面所带来的技术效果可参见上述第一方面的描述,此处不再赘述。In addition, the technical effects brought by the second aspect to the sixth aspect can be referred to the description of the above-mentioned first aspect, which will not be repeated here.
需要说明的是,本申请实施例中“耦合”是指两个部件彼此直接或间接地结合。It should be noted that “coupled” in the embodiments of the present application means that two components are directly or indirectly combined with each other.
附图说明Description of the drawings
图1为本申请实施例中的电子设备示意图;Figure 1 is a schematic diagram of an electronic device in an embodiment of the application;
图2为本申请实施例中半身像的三维重建方法流程示意图;FIG. 2 is a schematic flowchart of a method for three-dimensional reconstruction of a bust in an embodiment of this application;
图3为本申请实施例中半身像示意图;Fig. 3 is a schematic diagram of a bust in an embodiment of the application;
图4为本申请实施例中纹理展开的方法流程示意图;FIG. 4 is a schematic flowchart of a method for texture expansion in an embodiment of the application;
图5为本申请实施例中半身像的三维重建示意图;FIG. 5 is a schematic diagram of three-dimensional reconstruction of a bust in an embodiment of the application;
图6为本申请实施例中中心轴示意图;Fig. 6 is a schematic diagram of the central axis in an embodiment of the application;
图7为本申请实施例中圆周展开说明示意图;FIG. 7 is a schematic diagram illustrating the expansion of the circumference in an embodiment of the application;
图8为本申请实施例中鼻子和耳朵位置确定示意图;Fig. 8 is a schematic diagram of determining the positions of the nose and ears in an embodiment of the application;
图9为本申请实施例中缝隙抹平示意图;FIG. 9 is a schematic diagram of smoothing the gap in an embodiment of the application;
图10为本申请实施例中缝隙抹平采用的平滑方式示意图;FIG. 10 is a schematic diagram of a smoothing method used for smoothing the gap in an embodiment of the application;
图11为本申请实施例中耳朵缝合示意图;Figure 11 is a schematic diagram of ear stitching in an embodiment of the application;
图12为本申请实施例中经过重建的三维模型示意图;FIG. 12 is a schematic diagram of a reconstructed three-dimensional model in an embodiment of this application;
图13为本申请实施例中三维虚拟视频通话场景示意图;FIG. 13 is a schematic diagram of a three-dimensional virtual video call scene in an embodiment of the application;
图14为本申请实施例中装置1400示意图。FIG. 14 is a schematic diagram of a device 1400 in an embodiment of the application.
Detailed Description of Embodiments
The three-dimensional reconstruction solution for a bust in this application may be applied in application scenarios such as three-dimensional printing, AR calls, and virtual reality (VR) models. The three-dimensional reconstruction method for a bust provided in this application may be applied to an electronic device. The electronic device may be a personal computer, a server computer, a client, a handheld or laptop device, a microprocessor-based system device, an embedded system device, a set-top box, a programmable consumer electronics product, a network personal computer, a minicomputer, a mainframe, a server in a distributed cloud computing environment including any of the foregoing systems, or a portable terminal device with functions such as a personal digital assistant and/or image processing, for example, a mobile phone, a tablet computer, a wearable device with a wireless communication function (such as a smart watch), or an in-vehicle device.
As shown in FIG. 1, the electronic device in this embodiment of this application may include a processor 110 and a memory 120. The processor 110 may include one or more processing units. For example, the processor 110 may include one or more of a central processing unit (CPU), a graphics processing unit (GPU), an image signal processor (ISP), a digital signal processor (DSP), or a neural processing unit (NPU). The different processing units may be independent components, or may be integrated in one or more chips or circuit boards. The digital signal processor is configured to process digital signals; in addition to digital image signals, it may also process other digital signals. The neural processor includes but is not limited to a neural network processing unit, such as a deep neural network processing unit or a convolutional neural network processing unit. The neural processor may perform training, computation, or processing by using a neural network model. The neural network model includes but is not limited to a deep neural network model or a convolutional neural network model. The foregoing digital signal processor, graphics processing unit, or central processing unit may also perform training, computation, or processing by using the neural network model.
在一些实施例中,处理器110中还可以设置存储器,用于临时存储指令和数据。示例的,处理器110中的存储器可以为高速缓冲存储器(cache)。该存储器可以保存处理器110刚用过或循环使用的指令或数据。如果处理器110需要再次使用该指令或数据,可从所述存储器中直接调用。避免了重复存取,减少了处理器110的等待时间,因而提高了系统的效率。In some embodiments, a memory may also be provided in the processor 110 to temporarily store instructions and data. For example, the memory in the processor 110 may be a cache memory (cache). The memory can store instructions or data that have just been used or recycled by the processor 110. If the processor 110 needs to use the instruction or data again, it can be directly called from the memory. Repeated accesses are avoided, the waiting time of the processor 110 is reduced, and the efficiency of the system is improved.
在另一些实施例中,处理器110还可以包括一个或多个接口。例如,接口可以为通用串行总线(universal serial bus,USB)接口。又例如,接口还可以为集成电路(inter-integrated circuit,I2C)接口、集成电路内置音频(inter-integrated circuit sound,I2S)接口、脉冲编码调制(pulse code modulation,PCM)接口、通用异步收发传输器(universal asynchronous receiver/transmitter,UART)接口、移动产业处理器接口(mobile industry processor interface,MIPI)、通用输入输出(general-purpose input/output,GPIO)接口等。可以理解的是,本申请实施例可以通过接口连接电子设备的包括处理器在内的不同模块,从而使得电子设备能够实现不同的功能。需要说明的是,本申请实施例对电子设备100中接口的连接方式不作限定。In other embodiments, the processor 110 may further include one or more interfaces. For example, the interface may be a universal serial bus (USB) interface. For another example, the interface can also be an integrated circuit (inter-integrated circuit, I2C) interface, an integrated circuit built-in audio (inter-integrated circuit sound, I2S) interface, a pulse code modulation (pulse code modulation, PCM) interface, and a universal asynchronous transmission/reception transmission. UART (universal asynchronous receiver/transmitter) interface, mobile industry processor interface (MIPI), general-purpose input/output (GPIO) interface, etc. It is understandable that, in the embodiments of the present application, different modules of the electronic device, including the processor, can be connected through interfaces, so that the electronic device can implement different functions. It should be noted that the embodiment of the present application does not limit the connection mode of the interface in the electronic device 100.
在一个示例中,处理器110中包括的NPU为神经网络(neural-network,NN)计算处理器,通过借鉴生物神经网络结构,例如借鉴人脑神经元之间传递模式,对输入信息快速处理,还可以不断的自学习。通过NPU可以实现电子设备100的半身像的三维重建。In one example, the NPU included in the processor 110 is a neural-network (NN) computing processor. By drawing on the structure of a biological neural network, for example, drawing on the transfer mode between human brain neurons, the input information can be quickly processed. You can also continue to learn by yourself. The three-dimensional reconstruction of the bust of the electronic device 100 can be achieved through the NPU.
在另一个示例中,处理器110中包括NPU和其它处理器,通过NPU和其它处理器可以实现电子设备100的半身像的三维重建。其它处理器例如可以是CPU、图像处理器(GPU),图像信号处理器(image signal processor,ISP)、数字信号处理器(digital signal processor,DSP)中的一个或者多个。In another example, the processor 110 includes an NPU and other processors, and the three-dimensional reconstruction of the bust of the electronic device 100 can be realized through the NPU and other processors. The other processor may be, for example, one or more of a CPU, an image processor (GPU), an image signal processor (ISP), and a digital signal processor (digital signal processor, DSP).
存储器120可以用于存储计算机可执行程序代码,所述可执行程序代码包括指令。处理器110通过运行存储在存储器120的指令,从而执行电子设备的各种功能应用以及数据处理。存储器120可以包括存储程序区和存储数据区。其中,存储程序区可存储操作系统、驱动软件或至少一个功能所需的应用程序(比如声音播放功能、图像播放功能等)等。存储数据区可存储电子设备100使用过程中所创建的数据(比如音频数据、电话本等)等。存储器120可以包括掉电易失性存储器或非掉电易失性存储器中的至少一个,例如只读存储器(read only memory,ROM)、随机访问存储器(random access memory,RAM)、动态随机访问存储器(dynamic random access memory,DRAM)、嵌入式多媒体存储卡(embedded multi media card,eMMC)、通用闪存存储(universal flash storage,UFS)、硬盘或磁盘 等。The memory 120 may be used to store computer executable program code, where the executable program code includes instructions. The processor 110 executes various functional applications and data processing of the electronic device by running instructions stored in the memory 120. The memory 120 may include a program storage area and a data storage area. Among them, the storage program area can store an operating system, driver software, or at least one application program required by a function (such as a sound playback function, an image playback function, etc.). The data storage area can store data (such as audio data, phone book, etc.) created during the use of the electronic device 100. The memory 120 may include at least one of a power-down volatile memory or a non-power-down volatile memory, such as read only memory (ROM), random access memory (RAM), and dynamic random access memory. (dynamic random access memory, DRAM), embedded multimedia card (eMMC), universal flash storage (UFS), hard disk or magnetic disk, etc.
在一种可能的实现方式中,电子设备中还可以包括图像采集器130,用于采集图像。图像采集器130可以包括摄像头,也可进一步包括之前提到的ISP。ISP用于处理摄像头收集的图像数据。例如,拍照时,打开快门,光线通过摄像头中的镜头被传递到摄像头中的感光元件上,光信号转换为电信号,摄像头感光元件将所述电信号传递给ISP,ISP将相关数据处理并转化为肉眼可见的图像。ISP还可以对图像的噪点、亮度、色度进行算法优化。ISP还可以针对拍摄场景控制图像的曝光、或色温等参数优化。在一些实施例中,ISP可以设置在摄像头中。更为常见地,ISP可作为处理器的一部分而存在,与其他各类处理单元,例如CPU、GPU或DSP等集成在一个或多个芯片上。In a possible implementation manner, the electronic device may further include an image collector 130 for collecting images. The image collector 130 may include a camera, or may further include the aforementioned ISP. ISP is used to process the image data collected by the camera. For example, when taking a picture, the shutter is opened, the light is transmitted to the photosensitive element in the camera through the lens in the camera, the light signal is converted into an electrical signal, and the photosensitive element of the camera transmits the electrical signal to the ISP, and the ISP processes and converts the relevant data. It is an image visible to the naked eye. ISP can also optimize the image noise, brightness, and chroma algorithm. ISP can also control image exposure or color temperature optimization for the shooting scene. In some embodiments, the ISP can be set in the camera. More commonly, an ISP can exist as a part of a processor, integrated with other types of processing units, such as a CPU, GPU, or DSP, on one or more chips.
摄像头用于捕获静态图像或动态的视频。物体通过镜头生成光学图像投射到感光元件。感光元件可以是电荷耦合器件(charge coupled device,CCD)或互补金属氧化物半导体(complementary metal-oxide-semiconductor,CMOS)光电晶体管。感光元件把光信号转换成电信号,之后将电信号传递给ISP转换成数字图像信号。ISP将数字图像信号输出到处理器110中,由其他的处理单元做加工处理。在一些实施例中,电子设备可以包括1个或多个摄像头。The camera is used to capture still images or dynamic video. The object generates an optical image through the lens and is projected to the photosensitive element. The photosensitive element may be a charge coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor. The photosensitive element converts the optical signal into an electrical signal, and then transfers the electrical signal to the ISP to convert it into a digital image signal. The ISP outputs the digital image signal to the processor 110, which is processed by other processing units. In some embodiments, the electronic device may include one or more cameras.
In a possible implementation, the electronic device may further include a display 140, and the display 140 is configured to display an image, a video, or the like. In some embodiments, the electronic device may include one or more displays. The display includes but is not limited to a touchscreen. In this embodiment of this application, the display may be used to display the reconstructed three-dimensional model of the bust of a person.
The term "at least one" in this application means one or more, that is, includes one, two, three, or more; "at least two" means two or more, that is, includes two, three, or more. "A plurality of" means two or more, that is, includes two, three, or more. In addition, it should be understood that in the descriptions of this application, terms such as "first" and "second" are merely used for distinguishing purposes and shall not be understood as indicating or implying relative importance or an order. "And/or" describes an association relationship between associated objects and indicates that three relationships may exist. For example, A and/or B may indicate the following cases: only A exists, both A and B exist, and only B exists, where A and B may be singular or plural. The character "/" generally indicates an "or" relationship between the associated objects. "At least one of the following items" or a similar expression means any combination of these items, including a single item or any combination of a plurality of items. For example, at least one of a, b, or c may indicate a, b, c, a-b, a-c, b-c, or a-b-c, where a, b, and c may be singular or plural.
Refer to FIG. 2, which is a schematic flowchart of a three-dimensional reconstruction method for a bust according to an embodiment of this application. The three-dimensional reconstruction method for a bust may be implemented by the electronic device shown in FIG. 1, for example, by one or more processing units in the electronic device. As shown in FIG. 2, the three-dimensional reconstruction method for a bust mainly includes S201 to S204.
S201,获取待处理图像,待处理图像包括目标人物的半身像,所述半身像包括正面人脸。S201: Acquire an image to be processed, where the image to be processed includes a bust of a target person, and the bust includes a frontal face.
S201的获取动作可以理解为接收操作或处理操作。例如,NPU可以在S201中接收其他设备,例如ISP发送的所述待处理图像。又例如,S201可以由ISP来执行,通过对摄像头收集的图像信号或数据进行处理以生成图像。该处理包括但不限于各类色彩校准、像素校准、白平衡或缩放等处理。本申请实施例中,待处理图像中可以包括半身像的正面人脸。The acquiring action of S201 can be understood as a receiving operation or a processing operation. For example, the NPU may receive the to-be-processed image sent by other devices, such as the ISP, in S201. For another example, S201 may be executed by an ISP, which generates an image by processing image signals or data collected by a camera. This processing includes, but is not limited to, various types of color calibration, pixel calibration, white balance, or scaling. In the embodiment of the present application, the image to be processed may include a frontal face of a bust.
示例性的,半身像中可以包括正面人脸、颈部、肩部等,比如,参见图3所示。Exemplarily, the bust may include a frontal face, neck, shoulders, etc., for example, see FIG. 3.
S202,根据待处理图像得到第一纹理展开图,第一纹理展开图用于表征半身像的正面纹理。S202: Obtain a first texture expansion map according to the image to be processed, where the first texture expansion map is used to represent the front texture of the bust.
示例性地,第一纹理展开图中半身像的五官中的至少两个器官位于预设位置处。五官包括眉毛、眼睛、耳朵、鼻子、嘴。至少两个器官,比如耳朵和鼻子、或者眼睛和鼻子。至少两个器官位于预设位置处,可以是针对不同待处理图像得到的纹理展开图中,所述至少两个器官的位置相同。比如,两个器官中包括鼻子和耳朵,耳朵在不同的待处理图像的纹理展开图中的位置相同,鼻子在不同的待处理器图像的纹理展开图中的位置相同。或者至少两个器官位于预设位置处可以是针对不同待处理图像得到的纹理展开图中,所述至少两个器官的相对位置固定。以两个耳朵和鼻子为例,不同的待处理图像的纹理展开图中,两个耳朵的距离,以及鼻子与耳朵的距离可以是相同的。Exemplarily, at least two organs in the five sense organs of the bust in the first texture development image are located at preset positions. The facial features include eyebrows, eyes, ears, nose, and mouth. At least two organs, such as ears and nose, or eyes and nose. The at least two organs are located at preset positions, which may be texture expansion images obtained for different images to be processed, and the positions of the at least two organs are the same. For example, two organs include a nose and an ear, the position of the ear in the texture expansion map of different images to be processed is the same, and the position of the nose in the texture expansion map of different images to be processed is the same. Or the at least two organs located at the preset positions may be texture expansion images obtained for different images to be processed, and the relative positions of the at least two organs are fixed. Taking two ears and a nose as an example, the distance between the two ears and the distance between the nose and the ears can be the same in the texture expansion of different images to be processed.
S203,根据第一纹理展开图中半身像的正面纹理对第一纹理展开图中半身像的背面纹理进行补充得到第二纹理展开图。所述第二纹理展开图用于表征所述半身像的表面纹理。S203: Supplement the back texture of the bust in the first texture expansion image according to the front texture of the bust in the first texture expansion image to obtain a second texture expansion image. The second texture expansion map is used to characterize the surface texture of the bust.
所述第二纹理展开图用于表征所述半身像的表面纹理,也就是第二纹理展开图描述半身像的人物表面的全方位的纹理。表面纹理可以为三维的目标人物的上半身表面的纹理。也就是表面纹理包括按照垂直地平面的方向为轴,环绕上半身表面一周的纹理。The second texture expansion map is used to characterize the surface texture of the bust, that is, the second texture expansion map describes the omnidirectional texture of the character surface of the bust. The surface texture may be the texture of the upper body surface of the three-dimensional target person. That is, the surface texture includes a texture that surrounds the surface of the upper body along the axis perpendicular to the ground plane.
表面纹理包括半身像的正面纹理,还包括半身像的背面纹理。比如,半身像的正面包括人脸,前面颈部或者前面肩部等。半身像的背面可以包括后脑、背面颈部或者背面肩部等。The surface texture includes the front texture of the bust and the back texture of the bust. For example, the front of a bust includes a human face, front neck, or front shoulders. The back of the bust can include the back of the head, the back of the neck, or the back of the shoulders.
S204,根据第二纹理展开图得到半身像的三维模型。S204: Obtain a three-dimensional model of the bust according to the second texture expansion map.
在一种可能的实施方式中,参见图4所示,S202根据待处理图像得到第一纹理展开图可以通过如下S401-S404来实现。In a possible implementation manner, referring to FIG. 4, S202 obtaining the first texture expansion image according to the image to be processed may be implemented by the following S401-S404.
S401,去除待处理图像中的背景得到半身像的正面图像。S401: Remove the background in the image to be processed to obtain a front image of the bust.
示例性地,可以采用第一神经网络模型来去除处理图像中的背景得到半身像的正面图像。第一神经网络模型用于分割图像的前景和背景,输出前景图像。在本申请实施中,前景图像即为半身像的正面图像。Exemplarily, the first neural network model may be used to remove the background in the processed image to obtain a frontal image of the bust. The first neural network model is used to segment the foreground and background of the image, and output the foreground image. In the implementation of this application, the foreground image is the front image of the bust.
S402,对正面图像进行语义分割得到正面图像的头部语义掩膜。S402: Perform semantic segmentation on the front image to obtain a head semantic mask of the front image.
语义分割是将像素按照图像中表达语义含义的不同进行分组(Grouping)/分割(Segmentation)。Semantic segmentation is the grouping/segmentation of pixels according to the different semantic meanings expressed in the image.
示例性地,对正面图像进行语义分割可以采用全卷积网络(Fully convolution networks,FCN),如U-net网络、SegNet网络、DeepLab、RefineNet或者PSPNet等方式。Exemplarily, the semantic segmentation of the front image may use a full convolution network (Fully Convolution Networks, FCN), such as U-net network, SegNet network, DeepLab, RefineNet, or PSPNet.
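As a non-limiting sketch of S401-S402, the following Python snippet shows how a foreground network and a face-parsing network could be combined to produce the frontal image and the head semantic mask. The networks `foreground_net` and `parsing_net`, their output shapes, and the label set are assumptions made for illustration; the embodiments only require segmentation networks of the kinds listed above (for example, a U-net or another FCN).

```python
import numpy as np
import torch

def head_semantic_mask(image_rgb: np.ndarray, foreground_net, parsing_net):
    """Return (frontal image with background removed, per-pixel head label map).

    Assumptions: `foreground_net` outputs a (1, 1, H, W) foreground logit map and
    `parsing_net` outputs (1, C, H, W) logits over head labels such as
    0=background, 1=skin, 2=hair, 3=left ear, 4=right ear, 5=nose, ...
    """
    x = torch.from_numpy(image_rgb).permute(2, 0, 1).float().unsqueeze(0) / 255.0

    with torch.no_grad():
        fg_prob = torch.sigmoid(foreground_net(x))   # foreground probability
        labels = parsing_net(x).argmax(dim=1)        # (1, H, W) semantic labels

    fg_mask = (fg_prob[0, 0] > 0.5).numpy()
    frontal = image_rgb * fg_mask[..., None]         # background removed (S401)
    head_mask = labels[0].numpy() * fg_mask          # head semantic mask (S402)
    return frontal, head_mask
```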
S403,根据头部语义掩膜以及正面图像获得半身像的三维网格模型。S403: Obtain a three-dimensional mesh model of the bust according to the semantic mask of the head and the frontal image.
示例性地,头部语义掩膜和正面图像可以输入第二神经网络模型,第二神经网络模型用于人体重建。第二神经网络模型输出三维(3D)的截断有符号距离函数(truncated signed distance function,TSDF)体。然后提取TSDF体的表面mesh(网格)得到半身像的三维网格模型。Exemplarily, the head semantic mask and the frontal image may be input to the second neural network model, and the second neural network model is used for human body reconstruction. The second neural network model outputs a three-dimensional (3D) truncated signed distance function (truncated signed distance function, TSDF) volume. Then extract the surface mesh (mesh) of the TSDF body to obtain a three-dimensional mesh model of the bust.
例如,提取TSDF体表面mesh时,可以采用匹配立方体(marching cube)算法或其他三维等值面提取算法等。For example, when extracting the surface mesh of a TSDF body, a marching cube algorithm or other three-dimensional isosurface extraction algorithms can be used.
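By way of illustration, the zero isosurface of the TSDF volume can be extracted with a marching-cubes implementation such as the one in scikit-image; the voxel size and volume origin below are assumed parameters rather than values prescribed by the embodiments.

```python
import numpy as np
from skimage import measure

def tsdf_to_mesh(tsdf: np.ndarray, voxel_size: float, origin: np.ndarray):
    """Extract the zero level set of a TSDF volume as a triangle mesh.

    `tsdf` is the (D, H, W) truncated signed distance volume predicted by the
    reconstruction network; `origin` is the world position of voxel (0, 0, 0).
    """
    # The bust surface is the zero isosurface of the signed distance field.
    verts, faces, normals, _ = measure.marching_cubes(tsdf, level=0.0)
    verts = verts * voxel_size + origin   # voxel indices -> world coordinates
    return verts, faces, normals
```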
S404,根据三维网格模型获得第一纹理展开图。S404: Obtain a first texture expansion map according to the three-dimensional mesh model.
在一种可能的实施方式中,在根据所述三维网格模型获得第一纹理展开图之前,可以对三维网格模型执行如下至少一项处理:补洞处理、网格均一化处理或者网格平滑处理。In a possible implementation manner, before obtaining the first texture expansion map according to the three-dimensional mesh model, at least one of the following processing may be performed on the three-dimensional mesh model: hole filling processing, mesh uniformization processing, or mesh processing. Smoothing.
示例性的,对三维网格模型进行补洞处理时,可以采用基于径向基函数(radial basis function,RBF)的三角网格补洞方法,或者基于泊松方程的补洞算法。在对三维网格模型 进行网格均一化处理时,可以采用点聚类方法和边折叠以及顶点增删法等网格均一化算法。在对三维网格模型进行网格平滑处理时,可以采用基于泊松方程或离散拉普拉斯方程等网格平滑方法。Exemplarily, when performing hole filling processing on a three-dimensional mesh model, a triangular mesh hole filling method based on a radial basis function (radial basis function, RBF) or a hole filling algorithm based on the Poisson equation may be used. When performing grid uniformization on a three-dimensional grid model, grid uniformization algorithms such as point clustering, edge folding, and vertex addition and deletion can be used. When performing mesh smoothing on a three-dimensional mesh model, mesh smoothing methods based on Poisson's equation or discrete Laplace equation can be used.
对三维网格模型进行补洞处理后,使得三维网格模型更加完整,提高三维模型重建的准确度。执行网格均一化处理,可以防止获得的三维网格模型中网格过于密集或者过于稀疏,而影响三维模型重建的准确度。执行网格平滑处理,能够去除三维网格模型中的不准确的网格,即噪点,进而提高三维模型重建的准确度和光顺性。After the 3D mesh model is filled with holes, the 3D mesh model is made more complete and the accuracy of the 3D model reconstruction is improved. Performing grid homogenization processing can prevent the obtained 3D mesh model from being too dense or too sparse, which will affect the accuracy of the 3D model reconstruction. Performing mesh smoothing can remove inaccurate meshes in the 3D mesh model, that is, noise points, thereby improving the accuracy and smoothness of the 3D model reconstruction.
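As an illustrative sketch of the mesh smoothing step, the following uniform-weight Laplacian smoothing operates directly on the vertex and face arrays of the three-dimensional mesh model; the damping factor and iteration count are assumptions, and hole filling and uniformization would be separate passes (for example, the RBF- or Poisson-based methods mentioned above).

```python
import numpy as np

def laplacian_smooth(verts: np.ndarray, faces: np.ndarray,
                     lam: float = 0.5, iterations: int = 10) -> np.ndarray:
    """Uniform-weight Laplacian smoothing: each vertex moves toward the average
    of its 1-ring neighbours, which suppresses high-frequency noise in the
    reconstructed bust mesh."""
    n = len(verts)
    # Build vertex adjacency from the triangle faces.
    neighbors = [set() for _ in range(n)]
    for a, b, c in faces:
        neighbors[a].update((b, c))
        neighbors[b].update((a, c))
        neighbors[c].update((a, b))

    v = verts.astype(np.float64).copy()
    for _ in range(iterations):
        centroid = np.array([v[list(nb)].mean(axis=0) if nb else v[i]
                             for i, nb in enumerate(neighbors)])
        v += lam * (centroid - v)   # move each vertex toward its neighbourhood mean
    return v
```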
示例性地,根据头部语义确定半身像中包括耳朵时,在获得三维网格模型的第一纹理展开图时,可以根据头部语义掩膜中头部上至少两个器官所在位置获得三维网格模型的第一纹理展开图,使得第一纹理展开图中至少两个器官位于预设位置处。Exemplarily, when it is determined that the bust includes ears according to the semantics of the head, when the first texture expansion map of the three-dimensional mesh model is obtained, the three-dimensional mesh can be obtained according to the positions of at least two organs on the head in the semantic mask of the head. The first texture expansion map of the grid model makes at least two organs in the first texture expansion map to be located at the preset positions.
示例性地,在得到半身三维网格模型后,结合半身像的正面纹理,可以对半身像的纹理进行圆周展开得到第一纹理展开图。圆周展开是非Atlas(地图集)的纹理展开。非地图集展开有利于提高纹理连续性,减少纹理之间的缝隙。另外,还可以避免分块修复(inpainting),同时实现语义区块在纹理图上的位置相对固定,便于机器学习。Exemplarily, after obtaining the three-dimensional mesh model of the bust, combined with the front texture of the bust, the texture of the bust may be expanded in a circle to obtain the first texture expansion map. Circumferential expansion is non-Atlas (atlas) texture expansion. Non-atlas expansion helps to improve the continuity of textures and reduce the gaps between textures. In addition, inpainting can be avoided, and the position of the semantic block on the texture map can be relatively fixed, which is convenient for machine learning.
作为一种示例,以图3所示的半身像为例,经过A1处理后得到图5中的(a)所示的半身像正面图像。半身像正面图像进行语义分割得到头部语义掩膜,即经过A2处理后得到的头部语义掩膜如图5中的(b)所示。头部语义掩膜和正面图像可以输入第二神经网络模型输出的TSDF体如图5中的(c)所示,提取TSDF体的表面mesh(网格)得到半身像的三维网格模型如图5中的(d)所示。然后,经过圆周展开后得到第一纹理展开图如图5中的(e)所示。进一步,经过S203处理后,得到如图5中的(f)所示的第二纹理展开图,即经过纹理补全后的纹理图。最后,进过纹理补全后的纹理图和三维网格模型的表面mesh组合起来得到最终的带纹理的半身像的三维模型,如图5中的(g)所示。As an example, taking the bust shown in FIG. 3 as an example, the front image of the bust shown in (a) in FIG. 5 is obtained after A1 processing. Semantic segmentation of the front image of the bust to obtain the semantic head mask, that is, the semantic mask of the head obtained after A2 processing is shown in Figure 5(b). The head semantic mask and frontal image can be input into the second neural network model. The output TSDF volume is shown in Figure 5 (c). The surface mesh of the TSDF volume is extracted to obtain the three-dimensional mesh model of the bust. As shown in (d) in 5. Then, after the circumferential expansion, the first texture expansion diagram is obtained as shown in (e) of FIG. 5. Further, after S203 processing, a second texture expansion map as shown in (f) in FIG. 5 is obtained, that is, a texture map after texture completion. Finally, the texture map after texture completion and the surface mesh of the three-dimensional mesh model are combined to obtain the final three-dimensional model of the textured bust, as shown in Figure 5 (g).
在一种可能的实施方式中,S404根据头部语义掩膜中头部上至少两个器官所在位置获得三维网格模型的第一纹理展开图,可以通过如下方式实现:In a possible implementation manner, S404 obtains the first texture expansion map of the three-dimensional mesh model according to the positions of at least two organs on the head in the head semantic mask, which can be implemented in the following manner:
A1,基于中心轴对三维网格模型对应的纹理执行纹理展开得到第三纹理展开图,中心轴为三维网格模型中的头部顶端至头部底端的连接线。例如,头部顶端,可以是头顶最高点。头部底端,可以是三维网络模型的最低面的几何中心。参见图6所示,为三维网格模型的中心轴示意图。A1, performing texture expansion on the texture corresponding to the three-dimensional mesh model based on the central axis to obtain a third texture expansion image, the central axis being the connecting line from the top of the head to the bottom of the head in the three-dimensional mesh model. For example, the top of the head can be the highest point on the top of the head. The bottom of the head can be the geometric center of the lowest surface of the three-dimensional network model. Refer to Figure 6, which is a schematic diagram of the central axis of the three-dimensional mesh model.
示例性地,可以基于中心轴按照Mesh上某处相对于中心轴的空间角度或曲面周长来获得第三纹理展开图。比如,按照角度的圆周展开可以满足第一规则。第一规则可以为:Exemplarily, the third texture development map can be obtained based on the central axis according to the spatial angle or the circumference of the curved surface of a certain place on the Mesh relative to the central axis. For example, the first rule can be satisfied by the circumferential expansion according to the angle. The first rule can be:
[Formula image PCTCN2021078324-appb-000001: the first rule, mapping a point (x, y, z) of the three-dimensional mesh model to pixel coordinates [u, v] in the texture expansion map]
其中(x,y,z)是一个三维网格模型中一个点的空间坐标,[u,v]表示空间坐标为(x,y,z)的点在纹理展开图中的像素坐标,像素坐标以图像左下角为原点。Where (x,y,z) is the spatial coordinates of a point in a three-dimensional grid model, [u,v] represents the pixel coordinates of the point with spatial coordinates (x,y,z) in the texture expansion map, pixel coordinates Take the lower left corner of the image as the origin.
其中W是纹理展开图的宽度,H为纹理展开图的高度,cx与cz代表等Y值切面上的轴心坐标。Where W is the width of the expanded texture image, H is the height of the expanded texture image, and cx and cz represent the axis coordinates on the plane of equal Y value.
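The exact constants of the first rule are given by the formula image above; the following sketch shows one plausible angle-based circumferential unwrap that is consistent with the symbol definitions (W, H, cx, cz), with the v coordinate simply taken from the normalized height y, which is an assumption made here for illustration.

```python
import numpy as np

def unwrap_uv(verts: np.ndarray, cx: float, cz: float, W: int, H: int) -> np.ndarray:
    """Cylindrical (circumferential) unwrap about the vertical central axis.

    u is derived from the angle of (x - cx, z - cz) around the axis, v from the
    height y; this is a sketch of one plausible form of the first rule, not a
    reproduction of the patent's formula image.
    """
    x, y, z = verts[:, 0], verts[:, 1], verts[:, 2]
    theta = np.arctan2(z - cz, x - cx)               # angle in (-pi, pi]
    u = (theta + np.pi) / (2.0 * np.pi) * W          # angle -> column index
    y_min, y_max = y.min(), y.max()
    v = (y - y_min) / (y_max - y_min + 1e-9) * H     # height -> row index (origin at bottom-left)
    return np.stack([u, v], axis=1)
```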
应理解的是,第三纹理展开图描述的是三维网格模型的网格纹理,或者说是不带有像素点的像素值的网格纹理展开图。It should be understood that the third texture expansion map describes the grid texture of the three-dimensional grid model, or a grid texture expansion map without pixel values.
A2,根据头部语义掩膜中头部上至少两个器官所在位置确定三维网格模型中至少两个器官的所在位置。A2. Determine the positions of at least two organs in the three-dimensional mesh model according to the positions of at least two organs on the head in the head semantic mask.
头部语义掩膜中包括至少两个器官的所在位置,即根据头部语义掩膜可以确定至少两 个器官的坐标。以两个器官为耳朵和鼻子为例。另外,应理解的是,三维网格模型是根据头部语义掩膜来获得的,因此,头部语义掩膜中的耳朵和鼻子与三维网格模型中的耳朵和鼻子一一存在映射关系,因此可以根据头部语义掩膜中的耳朵和鼻子的位置来确定耳朵和鼻子分别在三维网格模型中位置。The head semantic mask includes the locations of at least two organs, that is, the coordinates of at least two organs can be determined according to the head semantic mask. Take the two organs, the ear and the nose, for example. In addition, it should be understood that the three-dimensional mesh model is obtained according to the semantic mask of the head. Therefore, there is a mapping relationship between the ears and nose in the semantic mask of the head and the ears and noses in the three-dimensional mesh model. Therefore, the positions of the ears and the nose in the three-dimensional mesh model can be determined according to the positions of the ears and the nose in the semantic mask of the head.
A3,根据三维网格模型中至少两个器官的所在位置对第三纹理展开图进行调整得到第一纹理展开图。A3: Adjust the third texture expansion map according to the positions of at least two organs in the three-dimensional mesh model to obtain the first texture expansion map.
例如,参见图8所示。图8中的(a)所示为头部语义掩膜中的耳朵和鼻子位置。图8中的(b)所示为三维网格模型中的耳朵和鼻子位置。For example, see Figure 8. Figure 8 (a) shows the ear and nose positions in the semantic mask of the head. Figure 8(b) shows the positions of ears and nose in the three-dimensional mesh model.
In one manner, the positions of the ears and/or the nose in the third texture expansion map may be adjusted according to the positions of the ears and the nose in the three-dimensional mesh model. That is, the positions of the ears and/or the nose in the third texture expansion map are determined according to the positions of the ears and the nose in the three-dimensional mesh model, and the positions of the ears and/or the nose are adjusted so that the adjusted ears and nose are located at the preset positions of the third texture expansion map; the first texture expansion map is then obtained according to the adjusted third texture expansion map and the frontal image, with the adjusted ears and nose located at the preset positions of the first texture expansion map. For example, a distortion (warp) mapping may be applied to the third texture expansion map so that the ears and the nose in the obtained first texture expansion map are located at the preset positions.
另一种方式是,根据第三纹理展开图以及正面图像获得颜色填充的第三纹理展开图。根据三维网络模型中耳朵和鼻子位置确定颜色填充的第三纹理展开图中耳朵和鼻子位置,并调整耳朵和/或鼻子位置得到第一纹理展开图。耳朵和鼻子位于第一纹理展开图的预设位置处。Another way is to obtain a color-filled third texture expansion map based on the third texture expansion map and the front image. According to the positions of the ears and the nose in the three-dimensional network model, the positions of the ears and the nose in the third texture expansion map filled with color are determined, and the positions of the ears and/or the nose are adjusted to obtain the first texture expansion map. The ears and the nose are located at the preset positions of the first texture development image.
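As a sketch of the distortion (warp) mapping used in either manner, the unwrap angles can be remapped piecewise-linearly so that detected landmark angles (for example, the nose and the two ears) land at the preset angles. The landmark choice and target angles below are illustrative assumptions, not values prescribed by the embodiments.

```python
import numpy as np

def warp_angles(theta: np.ndarray,
                src_angles: np.ndarray, dst_angles: np.ndarray) -> np.ndarray:
    """Piecewise-linear remap of unwrap angles so that landmark angles measured in
    the third texture expansion map (src_angles) move to preset target angles
    (dst_angles). Including -pi and +pi in both lists keeps the warp defined over
    the whole circle."""
    order = np.argsort(src_angles)
    return np.interp(theta, src_angles[order], dst_angles[order])

# Hypothetical usage: recentre the nose (measured at 0.4 rad) to angle 0 while
# keeping the seam at +/- pi fixed.
# theta_new = warp_angles(theta,
#                         np.array([-np.pi, 0.4, np.pi]),
#                         np.array([-np.pi, 0.0, np.pi]))
```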
第三纹理展开图中每个像素点的坐标与三维网格模型中每个点的坐标存在映射关系,如第一规则。由于三维网格模型时基于正面图像获得的,因此,三维网格模型中每个点的坐标与正面图像中像素点的坐标存在映射关系。进一步的,第三纹理展开图中每个像素点的坐标与正面图像中像素点的坐标存在映射关系。因此正面图像中像素点的像素值可以映射到第三纹理展开图上(或者映射到调整后的第三纹理展开图上)。There is a mapping relationship between the coordinates of each pixel in the third texture expansion image and the coordinates of each point in the three-dimensional grid model, as in the first rule. Since the three-dimensional grid model is obtained based on the front image, there is a mapping relationship between the coordinates of each point in the three-dimensional grid model and the coordinates of the pixel points in the front image. Further, there is a mapping relationship between the coordinates of each pixel in the third texture expansion image and the coordinates of the pixel in the front image. Therefore, the pixel values of the pixels in the front image can be mapped to the third texture expansion map (or to the adjusted third texture expansion map).
At present, half-body texture unwrapping generally unwraps the front and the back separately, that is, into a visible front part and an invisible back part, which leads to poor texture reconstruction of the invisible hair and ear regions and a poor geometric mesh of the ears, making it difficult to meet user requirements. In the foregoing solution, a complete texture expansion image is obtained by performing semantically aligned texture unwrapping on the mesh model, which improves texture completion of the invisible part; in addition, through mesh optimization and texture replacement of the ear region, high-precision local texture and geometry are achieved, producing better texture completion and mesh detail reconstruction.
需要说明的是,将经过调整后的耳朵和鼻子位于第一纹理展开图的预设位置处,这个过程,为后续对耳朵部分的优化提供的便利条件,并且可以提高后续耳朵的优化的准确度。后续对耳朵部分的优化进行说明,此处不再赘述。It should be noted that the adjusted ears and nose are located at the preset positions of the first texture expansion map. This process provides convenient conditions for the subsequent optimization of the ear part, and can improve the accuracy of the subsequent ear optimization . The optimization of the ear part will be described later, which will not be repeated here.
作为一种可能的实施方式,对于头发较长的半身像来说,可能头发完全盖住耳朵区域,在该情况下,可以不再执行后续的耳朵部分的优化。基于此,在对三维网格模型对应纹理执行纹理展开时,可以基于中心轴按照某处的空间角度或曲面周长来获得纹理展开图,可以不再执行A2和A3。在该场景中,可以直接根据第三纹理展开图以及正面图像获得颜色填充的第三纹理展开图,即填充的第三纹理展开图作为第一纹理展开图。As a possible implementation manner, for a bust with long hair, the hair may completely cover the ear area. In this case, the subsequent optimization of the ear part may not be performed. Based on this, when performing texture expansion on the corresponding texture of the three-dimensional mesh model, the texture expansion map can be obtained based on the spatial angle or surface circumference of a certain place based on the central axis, and A2 and A3 can no longer be executed. In this scene, the color-filled third texture expansion map can be obtained directly according to the third texture expansion map and the front image, that is, the filled third texture expansion map is used as the first texture expansion map.
作为一种示例,参见图7中的(a)所示,头部某个截面可以认为是一个圆形。图7中的(a)中-π和π之间的黑点可以认为是进行纹理展开时采用的展开线在截面上的点。以经过扭曲处理后需要鼻子位于0π位置处,两个耳朵分别位于-π和π的位置处为例,即两个 耳朵以嘴为中心对称。从图7中的(a)中可以看出,经过扭曲处理之前,鼻子与两个耳朵是非对称的。参见图7中的(b)所示,经过A3的处理后,第一纹理展开图中,鼻子位于0π位置处。为了更直观展示耳朵的所在位置,可以参见图7中的(c)所示,为经过S203处理后的第二纹理展开图的示意图。图7中的(c)中,鼻子位于0π位置处,两个耳朵分别位于-π和π的位置处。As an example, referring to (a) in Figure 7, a certain cross section of the head can be considered as a circle. The black dots between -π and π in (a) in FIG. 7 can be considered as the points on the cross-section of the expansion line used when the texture is expanded. Take the nose at the position of 0π and the two ears at the positions of -π and π respectively after the twisting process, that is, the two ears are symmetrical with the mouth as the center. It can be seen from Figure 7(a) that the nose and the two ears are asymmetrical before being twisted. As shown in (b) in Figure 7, after A3 processing, in the first texture development image, the nose is located at the 0π position. In order to show the location of the ears more intuitively, see (c) in FIG. 7, which is a schematic diagram of the second texture expansion map after S203 processing. In (c) in Figure 7, the nose is located at the position of 0π, and the two ears are located at the positions of -π and π, respectively.
After S203, in which the back texture of the bust in the first texture expansion map is supplemented according to the front texture of the bust in the first texture expansion map to obtain the second texture expansion map, if the three-dimensional model of the bust is obtained according to the second texture expansion map without processing the seam region at the back, a gap may appear in the seam region of the obtained three-dimensional model. As an example, when the three-dimensional model of the bust is obtained based on the second texture expansion map, the texture seam region in the second texture expansion map may first be smoothed to obtain a fourth texture expansion map, and the three-dimensional model of the bust is obtained based on the fourth texture expansion map, where the texture seam region is determined according to the unfolding line used when texture expansion is performed on the three-dimensional mesh model.
作为一种示例,以对三维网格模型进行纹理展开时,展开线在背部为例。参见图9中左侧图所示,为三维模型缝合线区域存在的类似缝隙的示意图。缝隙抹平后如图9中右侧图所示。缝隙抹平操作,即对第二纹理展开图中纹理缝合线区域进行融合处理。As an example, when the three-dimensional mesh model is texture-expanded, the expansion line is on the back as an example. Refer to the left image in Figure 9, which is a schematic diagram of similar gaps in the suture area of the three-dimensional model. After the gap is smoothed, it is shown on the right in Figure 9. The gap smoothing operation is to perform fusion processing on the texture stitching area in the second texture expansion map.
When fusion processing is performed on the texture seam region in the second texture expansion map, the back texture image of the bust may first be determined according to the frontal image of the bust; then weighted fusion processing is performed on the second texture expansion map and the back texture image of the bust to obtain the fourth texture expansion map.
示例性地,可以将第二纹理展开图以及半身像的背面纹理图像按照设定规则进行加权融合处理得到第四纹理展开图;Exemplarily, the second texture expansion image and the back texture image of the bust may be subjected to weighted fusion processing according to the set rules to obtain the fourth texture expansion image;
设定规则可以为:The setting rules can be:
I3(i, j) = α·I1(i, j) + (1 - α)·I2(map1(i), map2(j));
α = Ialpha(i, j);
Here, I1 denotes the second texture expansion map, I2 denotes the back image of the bust, I3 denotes the fourth texture expansion map, Ialpha denotes the weight, map1 denotes the mapping function that maps pixels of the back image of the bust onto the second texture expansion map in the X-axis direction, map2 denotes the mapping function that maps pixels of the back image of the bust onto the second texture expansion map in the Y-axis direction, i denotes the coordinate of a pixel in the X-axis direction, and j denotes the coordinate of a pixel in the Y-axis direction.
例如,参见图10所示,根据半身像的正面图像确定半身像的背面纹理图像的如图10中的(a)所示,各个像素点的权重可以通过图10中的(b)所示。第二纹理展开图通过10中(c)所示。将第二纹理展开图(图10中的(c))以及半身像的背面纹理图像(图10中的(a))按照图10中的(b)融合得到图10中的(d),即第四纹理展开图。融合操作在图10中通过加号“+”来表示。For example, referring to FIG. 10, the back texture image of the bust is determined according to the front image of the bust as shown in FIG. 10(a), and the weight of each pixel can be shown in FIG. 10(b). The second texture expansion diagram is shown in 10 (c). The second texture expansion image ((c) in Fig. 10) and the back texture image of the bust ((a) in Fig. 10) are merged according to (b) in Fig. 10 to obtain (d) in Fig. 10, namely The fourth texture expanded view. The fusion operation is indicated by the plus sign "+" in Figure 10.
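A minimal NumPy sketch of the weighted fusion under the set rule is given below; the lookup tables map1 and map2 are assumed to be precomputed integer index arrays, and their construction from the back texture image is not shown.

```python
import numpy as np

def fuse_back_texture(I1: np.ndarray, I2: np.ndarray, alpha: np.ndarray,
                      map1: np.ndarray, map2: np.ndarray) -> np.ndarray:
    """Weighted fusion following the set rule
        I3(i, j) = alpha(i, j) * I1(i, j) + (1 - alpha(i, j)) * I2(map1(i), map2(j)).

    I1: second texture expansion map (H, W[, 3]); I2: back texture image;
    alpha: per-pixel weight map (H, W); map1/map2: integer lookup tables for the
    x and y coordinates respectively.
    """
    # i runs over x (columns) and j over y (rows); images are stored as (rows=j, cols=i).
    jj, ii = np.meshgrid(np.arange(I1.shape[0]), np.arange(I1.shape[1]), indexing="ij")
    warped_back = I2[map2[jj], map1[ii]]
    a = alpha[..., None] if I1.ndim == 3 and alpha.ndim == 2 else alpha
    return a * I1 + (1.0 - a) * warped_back
```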
下面对耳朵部分的优化的过程进行详细描述。半身像的人脸包括耳朵,即耳朵未被头发遮挡。可以根据头部语义掩膜来判断是否包括耳朵。在对耳朵部分进行优化时,可以采用如下方式:The process of optimizing the ear part will be described in detail below. The face of the bust includes the ears, that is, the ears are not covered by the hair. You can determine whether to include ears based on the semantic mask of the head. When optimizing the ear part, the following methods can be used:
B1,先将预配置的耳朵模型融合到三维网格模型中的耳朵区域得到融合后的三维网格模型。B1, first fusion the pre-configured ear model to the ear region in the 3D mesh model to obtain the fused 3D mesh model.
示例性地,耳朵融合到三维网格模型的耳朵区域可以采用拉普拉斯(Laplacian)网格融合方法。Exemplarily, the ear fusion to the ear region of the three-dimensional mesh model may adopt a Laplacian mesh fusion method.
B2: Fuse the texture of the ear region on the fused three-dimensional mesh model into the ear region located at the preset position of the fourth texture expansion map to obtain a fused fourth texture expansion map; finally, the three-dimensional model of the bust is obtained according to the fused fourth texture expansion map.
示例性地,将融合后的三维网络模型上耳朵区域的纹理融合到位于第四纹理展开图预设位置处的耳朵区域时采用的融合方法,可以采用基于图像拉普拉斯梯度的融合算法。Exemplarily, the fusion method used when fusing the texture of the ear region on the fused three-dimensional network model to the ear region located at the preset position of the fourth texture expansion map may be a fusion algorithm based on the image Laplacian gradient.
参见图11中的(a)所示,为三维网格模型的示意图。确定的三维网格模型中的耳朵区域可以参见图11中的(b)所示。耳朵的几何尺度可以根据三维网格模型的大小来确定。三维网格模型的大小可以由用户预先设置或者采用默认的尺寸。耳朵模型贴合到三维网格模型中的耳朵区域,如图11中的(c)所示,然后对贴合耳朵模型的三维网格模型进行融合处理,得到的融合后的三维网格模型可以参见图11中的(d)所示。然后获取融合后的三维网络模型上耳朵区域的纹理,并将耳朵区域的纹理融合到第四纹理展开图中预设位置处的耳朵区域上。See (a) in Figure 11, which is a schematic diagram of a three-dimensional mesh model. The ear area in the determined three-dimensional mesh model can be seen in (b) of FIG. 11. The geometric dimensions of the ear can be determined according to the size of the three-dimensional mesh model. The size of the three-dimensional mesh model can be preset by the user or a default size can be adopted. Fit the ear model to the ear area in the 3D mesh model, as shown in (c) in Figure 11, and then perform the fusion processing on the 3D mesh model fitted to the ear model, and the resulting fused 3D mesh model can be See (d) in Figure 11. Then, the texture of the ear region on the fused three-dimensional network model is obtained, and the texture of the ear region is fused to the ear region at the preset position in the fourth texture expansion map.
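As one possible realization of the image-Laplacian-gradient fusion of the ear texture, OpenCV's Poisson blending can be used; seamlessClone is a stand-in chosen for illustration, and the ear patch, mask, and preset ear center are assumed inputs rather than outputs of a prescribed procedure.

```python
import cv2
import numpy as np

def blend_ear_texture(texture_map: np.ndarray, ear_patch: np.ndarray,
                      ear_mask: np.ndarray, center_xy: tuple) -> np.ndarray:
    """Gradient-domain (Poisson) blending of the rendered ear texture into the
    ear region of the fourth texture expansion map.

    texture_map / ear_patch: 8-bit 3-channel images; ear_mask: 8-bit mask with
    255 inside the ear region; center_xy: preset ear position (x, y) in the
    expansion map.
    """
    return cv2.seamlessClone(ear_patch, texture_map, ear_mask, center_xy,
                             cv2.NORMAL_CLONE)
```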
As an example, if the acquired image to be processed is the image shown in FIG. 3, the reconstructed three-dimensional model obtained by using the solution provided in the embodiments of this application is shown in FIG. 12. Part (a) of FIG. 12 is a schematic front view of the three-dimensional model, and part (b) of FIG. 12 is a schematic back view of the three-dimensional model.
例如,将本申请实施例提供的方案应用到虚拟三维对话。在本端终端设备(简称终端)接收到用户的触发,启动虚拟三维视频通话时,通过终端的摄像头获取视频流,可以按照本专利提出的基于单帧图像的半身像的三维重建方法来重建三维模型。终端通过获取的视频流的每一帧的用户的表情来驱动三维模型,并发送给对端终端,对端终端显示该三维模型所模拟的本端终端用户的表情。作为一种示例,参见图13所示,创建三维模型可以计算云来执行,由电子设备将单帧图像发送给计算云,由计算云来创建三维模型,然后将创建的三维模型发送给终端。For example, the solution provided in the embodiment of this application is applied to a virtual three-dimensional dialogue. When the local terminal device (referred to as the terminal) receives the user’s trigger and starts a virtual 3D video call, the video stream is obtained through the terminal’s camera. The 3D reconstruction method based on the bust of the single frame image proposed in this patent can be used to reconstruct the 3D Model. The terminal drives the three-dimensional model by acquiring the user's expression in each frame of the video stream, and sends it to the opposite terminal, and the opposite terminal displays the local terminal user's expression simulated by the three-dimensional model. As an example, referring to FIG. 13, creating a three-dimensional model can be performed by a computing cloud. The electronic device sends a single frame image to the computing cloud, and the computing cloud creates a three-dimensional model, and then sends the created three-dimensional model to the terminal.
为了更好地说明以上实施例,本申请实施例还提供了一种装置1400,参见图14所示,该装置1400具体可以是包括电子设备中的功能模块(比如图1中的处理器110的部件或其执行的软件模块),或者装置1400可以是芯片或者芯片系统,或者装置1400可以是电子设备中一个模块等。示意性的,该装置可以包括获取单元1401和重建单元1402。获取单元1401以及重建单元1402分别执行图2和图4对应的实施例所示的方法的不同步骤。比如获取单元1401可以用于执行S201中获取待处理图像,重建单元1402用于执行S202-S204的过程,具体实现如前文所述,此处不再赘述。In order to better explain the above embodiments, the embodiments of the present application also provide an apparatus 1400. As shown in FIG. 14, the apparatus 1400 may specifically include functional modules in an electronic device (for example, the processor 110 in FIG. 1 Components or software modules executed by them), or the apparatus 1400 may be a chip or a chip system, or the apparatus 1400 may be a module in an electronic device, or the like. Illustratively, the apparatus may include an obtaining unit 1401 and a reconstruction unit 1402. The obtaining unit 1401 and the reconstruction unit 1402 respectively execute different steps of the method shown in the embodiment corresponding to FIG. 2 and FIG. 4. For example, the obtaining unit 1401 may be used to obtain the image to be processed in S201, and the reconstruction unit 1402 may be used to perform the process of S202-S204. The specific implementation is as described above and will not be repeated here.
因此,可以认为之前提到的获取单元1401或重建单元1402可以由软件、硬件或软件与硬件结合来实现。当该模块以硬件实现的时候,该硬件可以是CPU、微处理器、DSP、微控制单元(MCU)、人工智能处理器、专用集成电路(ASIC)、现场可编程门阵列(field programmable gate array,FPGA)、专用数字电路、硬件加速器或非集成的分立器件中的任一个或任一组合,其可以运行必要的软件或不依赖于软件以执行以上方法流程,并位于之前图1的所述处理器110内。当以该模块以软件实现的时候,所述软件以计算机程序指令的方式存在,并被存储在存储器,例如图1的存储器120中,处理器,例如图1中的处理器110可以用于执行所述程序指令以实现以上方法流程。所述处理器可以包括但不限于以下至少一种:CPU、微处理器、DSP、微控制器、或人工智能处理器等各类运行软件的计算设备,每种计算设备可包括一个或多个用于执行软件指令以进行运算或处理的核。该处理器可以是个单独的半导体芯片,也可以跟其他电路一起集成为一个半导体芯片,例如,可以跟其他电路(如编解码电路、硬件加速电路或各种总线和接口电路)构成一个SoC(片 上系统),或者也可以作为一个ASIC的内置处理器集成在所述ASIC当中,该集成了处理器的ASIC可以单独封装或者也可以跟其他电路封装在一起。该处理器除了包括用于执行软件指令以进行运算或处理的核外,还可进一步包括必要的硬件加速器,如FPGA、PLD(可编程逻辑器件)、或者实现专用逻辑运算的逻辑电路。Therefore, it can be considered that the aforementioned acquisition unit 1401 or reconstruction unit 1402 can be implemented by software, hardware, or a combination of software and hardware. When the module is implemented in hardware, the hardware can be CPU, microprocessor, DSP, micro control unit (MCU), artificial intelligence processor, application specific integrated circuit (ASIC), field programmable gate array (field programmable gate array) , FPGA), dedicated digital circuits, hardware accelerators, or any one or any combination of non-integrated discrete devices, which can run the necessary software or do not rely on software to perform the above method flow, and are located in the previous description of Figure 1 Inside the processor 110. When the module is implemented in software, the software exists in the form of computer program instructions and is stored in a memory, such as the memory 120 in FIG. 1, and a processor, such as the processor 110 in FIG. 1, can be used to execute The program instructions are used to realize the above method flow. The processor may include, but is not limited to, at least one of the following: CPU, microprocessor, DSP, microcontroller, or artificial intelligence processor and other computing devices that run software. Each computing device may include one or more A core used to execute software instructions for calculation or processing. The processor can be a single semiconductor chip, or it can be integrated with other circuits to form a semiconductor chip. For example, it can form an SoC (on-chip) with other circuits (such as codec circuits, hardware acceleration circuits, or various bus and interface circuits). System), or it can be integrated into the ASIC as a built-in processor of an ASIC, and the ASIC integrated with the processor can be packaged separately or together with other circuits. In addition to the core used to execute software instructions to perform operations or processing, the processor may further include necessary hardware accelerators, such as FPGAs, PLDs (programmable logic devices), or logic circuits that implement dedicated logic operations.
Those skilled in the art should understand that the embodiments of this application may be provided as a method, a system, or a computer program product. Therefore, this application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, this application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) that contain computer-usable program code.
This application is described with reference to the flowcharts and/or block diagrams of the method, device (system), and computer program product according to the embodiments of this application. It should be understood that each process and/or block in the flowcharts and/or block diagrams, and combinations of processes and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for implementing the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or another programmable data processing device to operate in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus, and the instruction apparatus implements the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or another programmable data processing device, so that a series of operation steps are performed on the computer or other programmable device to produce computer-implemented processing, and the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.
Obviously, those skilled in the art can make various changes and modifications to the embodiments of this application without departing from the scope of the embodiments of this application. If these modifications and variations of the embodiments of this application fall within the scope of the claims of this application and their equivalent technologies, this application is also intended to cover these modifications and variations.

Claims (20)

  1. A method for three-dimensional reconstruction of a bust, characterized by comprising:
    obtaining a to-be-processed image, wherein the to-be-processed image comprises a bust of a target person, and the bust comprises a frontal face;
    obtaining a first texture expansion map according to the to-be-processed image, wherein the first texture expansion map is used to represent a front texture of the bust;
    wherein at least two organs of the facial features of the bust in the first texture expansion map are located at preset positions;
    supplementing a back texture of the bust in the first texture expansion map according to the front texture of the bust in the first texture expansion map to obtain a second texture expansion map, wherein the second texture expansion map is used to represent a surface texture of the bust; and
    obtaining a three-dimensional model of the bust according to the second texture expansion map.
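Purely as a reading aid, the following sketch mirrors the three steps of claim 1 in Python; the helper bodies are placeholders (the mirrored back texture is only a stand-in), and the function names are assumptions rather than the claimed implementation.

```python
import numpy as np

def unwrap_front_texture(image: np.ndarray) -> np.ndarray:
    """Placeholder for the first texture expansion map: the front texture of the
    bust, with at least two facial organs placed at preset positions."""
    return image.astype(np.float32)

def complete_back_texture(front_tex: np.ndarray) -> np.ndarray:
    """Placeholder for the second texture expansion map: the back texture is
    filled in from the front texture (here simply mirrored as a stand-in)."""
    return np.concatenate([front_tex, front_tex[:, ::-1]], axis=1)

def reconstruct_bust(image: np.ndarray) -> np.ndarray:
    front_tex = unwrap_front_texture(image)      # first texture expansion map
    full_tex = complete_back_texture(front_tex)  # second texture expansion map
    return full_tex                              # would then be applied to the 3D model
```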
  2. The method according to claim 1, characterized in that obtaining the first texture expansion map according to the to-be-processed image comprises:
    removing the background from the to-be-processed image to obtain a frontal image of the bust;
    performing semantic segmentation on the frontal image to obtain a head semantic mask of the frontal image;
    obtaining a three-dimensional mesh model of the bust according to the head semantic mask and the frontal image; and
    obtaining the first texture expansion map according to the positions of at least two organs on the head in the head semantic mask and the three-dimensional mesh model, wherein the at least two organs in the first texture expansion map are located at the preset positions.
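As a rough illustration of the first two steps of this claim, the snippet below removes the background with a person mask and splits a segmentation label map into per-organ masks; both the mask and the label ids are assumed to come from an external segmentation model that is not shown here.

```python
import numpy as np

def remove_background(image: np.ndarray, person_mask: np.ndarray) -> np.ndarray:
    """Keep only the bust: zero out pixels where the (H, W) person mask is 0."""
    return image * person_mask[..., None]

def head_semantic_mask(label_map: np.ndarray, organ_ids: dict) -> dict:
    """Split a head segmentation label map into per-organ binary masks.
    The label ids (e.g. {"left_eye": 4, "nose": 10}) are illustrative only."""
    return {name: (label_map == idx).astype(np.uint8) for name, idx in organ_ids.items()}
```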
  3. The method according to claim 2, characterized in that obtaining the first texture expansion map according to the positions of the at least two organs on the head in the head semantic mask and the three-dimensional mesh model comprises:
    performing texture expansion on the texture corresponding to the three-dimensional mesh model based on a central axis to obtain a third texture expansion map, wherein the central axis is the line connecting the top of the head to the bottom of the head in the three-dimensional mesh model;
    determining the positions of the at least two organs in the three-dimensional mesh model according to the positions of the at least two organs on the head in the head semantic mask; and
    adjusting the third texture expansion map according to the positions of the at least two organs in the three-dimensional mesh model to obtain the first texture expansion map.
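To make the central-axis expansion concrete, the sketch below computes cylindrical (u, v) texture coordinates around a vertical axis; the choice of the y axis as vertical and the helper name cylindrical_uv are assumptions, not details taken from this application.

```python
import numpy as np

def cylindrical_uv(vertices: np.ndarray) -> np.ndarray:
    """Unwrap mesh vertices around a vertical central axis (head top to head
    bottom) into (u, v) texture coordinates, in the spirit of the third texture
    expansion map. Assumes an (N, 3) array with y as the vertical axis."""
    centered = vertices - vertices.mean(axis=0)
    x, y, z = centered[:, 0], centered[:, 1], centered[:, 2]
    u = (np.arctan2(z, x) + np.pi) / (2.0 * np.pi)            # angle around the axis -> [0, 1]
    v = (y - y.min()) / max(float(y.max() - y.min()), 1e-8)   # height along the axis -> [0, 1]
    return np.stack([u, v], axis=1)
```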
  4. The method according to claim 2 or 3, characterized in that obtaining the three-dimensional model of the bust based on the second texture expansion map comprises:
    smoothing a texture seam region in the second texture expansion map to obtain a fourth texture expansion map, and obtaining the three-dimensional model of the bust based on the fourth texture expansion map, wherein the texture seam region is determined according to the expansion line used when performing texture expansion on the three-dimensional mesh model.
  5. The method according to claim 4, characterized in that the at least two organs include an ear, and the method further comprises:
    fusing a preconfigured ear model into an ear region of the three-dimensional mesh model to obtain a fused three-dimensional mesh model;
    wherein obtaining the three-dimensional model of the bust based on the processed second texture expansion map comprises:
    fusing the texture of the ear region on the fused three-dimensional mesh model into the ear region located at a preset position of the fourth texture expansion map to obtain a fused fourth texture expansion map, and obtaining the three-dimensional model of the bust according to the fused fourth texture expansion map.
  6. The method according to claim 4 or 5, characterized in that smoothing the texture seam region in the second texture expansion map comprises:
    determining a back texture image of the bust according to the frontal image of the bust; and
    performing weighted fusion on the second texture expansion map and the back texture image of the bust to obtain the fourth texture expansion map.
  7. The method according to claim 6, characterized in that performing weighted fusion on the second texture expansion map and the back texture image of the bust to obtain the fourth texture expansion map comprises:
    performing weighted fusion on the second texture expansion map and the back image of the bust according to a set rule to obtain the fourth texture expansion map;
    wherein the set rule is:
    I_3(i, j) = α·I_1(i, j) + (1 − α)·I_2(map_1(i), map_2(j));
    α = I_alpha(i, j);
    where I_1 denotes the second texture expansion map, I_2 denotes the back image of the bust, I_3 denotes the fourth texture expansion map, I_alpha denotes the weight, map_1 denotes the mapping function that maps pixels of the back image of the bust onto the second texture expansion map in the X-axis direction, map_2 denotes the mapping function that maps pixels of the back image of the bust onto the second texture expansion map in the Y-axis direction, i denotes the coordinate value of a pixel in the X-axis direction, and j denotes the coordinate value of a pixel in the Y-axis direction.
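The set rule above is an ordinary per-pixel alpha blend; the sketch below implements it with NumPy under the assumption that map_1 and map_2 are available as precomputed integer index arrays and that I_alpha is a per-pixel weight map in [0, 1]. Array shapes and axis conventions are illustrative assumptions.

```python
import numpy as np

def fuse_textures(I1: np.ndarray, I2: np.ndarray, I_alpha: np.ndarray,
                  map1: np.ndarray, map2: np.ndarray) -> np.ndarray:
    """I_3(i,j) = alpha * I_1(i,j) + (1 - alpha) * I_2(map_1(i), map_2(j)),
    with alpha = I_alpha(i,j).
    I1: second texture expansion map (H, W, 3); I2: back image of the bust;
    I_alpha: per-pixel weights (H, W); map1 (length H) and map2 (length W):
    integer index arrays mapping I1's grid into I2."""
    warped = I2[map1[:, None], map2[None, :]]   # I2 resampled onto I1's pixel grid
    alpha = I_alpha[..., None]                  # broadcast the weight over channels
    return alpha * I1 + (1.0 - alpha) * warped
```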
  8. The method according to any one of claims 2 to 7, characterized in that, before obtaining the first texture expansion map according to the positions of the at least two organs on the head in the head semantic mask and the three-dimensional mesh model, the method further comprises:
    performing at least one of the following processes on the three-dimensional mesh model:
    hole filling, mesh uniformization, or mesh smoothing.
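The claim leaves the clean-up algorithms open; as one example of the mesh smoothing mentioned, the sketch below applies simple Laplacian smoothing (each vertex is pulled toward the average of its neighbours), chosen here only for illustration.

```python
import numpy as np

def laplacian_smooth(vertices: np.ndarray, faces, iterations: int = 5, lam: float = 0.5) -> np.ndarray:
    """Simple Laplacian mesh smoothing. vertices: (N, 3); faces: iterable of
    (a, b, c) vertex indices; lam controls the step toward the neighbourhood mean."""
    n = len(vertices)
    neighbours = [set() for _ in range(n)]
    for a, b, c in faces:
        neighbours[a].update((b, c))
        neighbours[b].update((a, c))
        neighbours[c].update((a, b))
    v = np.asarray(vertices, dtype=float).copy()
    for _ in range(iterations):
        avg = np.array([v[list(nb)].mean(axis=0) if nb else v[k]
                        for k, nb in enumerate(neighbours)])
        v += lam * (avg - v)                    # move each vertex toward its neighbourhood centroid
    return v
```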
  9. An apparatus for three-dimensional reconstruction of a bust, characterized by comprising:
    an obtaining unit, configured to obtain a to-be-processed image, wherein the to-be-processed image comprises a bust of a target person, and the bust comprises a frontal face; and
    a reconstruction unit, configured to: obtain a first texture expansion map according to the to-be-processed image, wherein the first texture expansion map is used to represent a front texture of the bust, and at least two organs of the facial features of the bust in the first texture expansion map are located at preset positions; supplement a back texture of the bust in the first texture expansion map according to the front texture of the bust in the first texture expansion map to obtain a second texture expansion map, wherein the second texture expansion map is used to represent a surface texture of the bust; and obtain a three-dimensional model of the bust according to the second texture expansion map.
  10. The apparatus according to claim 9, characterized in that, when obtaining the first texture expansion map according to the to-be-processed image, the reconstruction unit is specifically configured to:
    remove the background from the to-be-processed image to obtain a frontal image of the bust;
    perform semantic segmentation on the frontal image to obtain a head semantic mask of the frontal image;
    obtain a three-dimensional mesh model of the bust according to the head semantic mask and the frontal image; and
    obtain the first texture expansion map according to the positions of at least two organs on the head in the head semantic mask and the three-dimensional mesh model, wherein the at least two organs in the first texture expansion map are located at the preset positions.
  11. The apparatus according to claim 10, characterized in that, when obtaining the first texture expansion map according to the positions of the at least two organs on the head in the head semantic mask and the three-dimensional mesh model, the reconstruction unit is specifically configured to:
    perform texture expansion on the texture corresponding to the three-dimensional mesh model based on a central axis to obtain a third texture expansion map, wherein the central axis is the line connecting the top of the head to the bottom of the head in the three-dimensional mesh model;
    determine the positions of the at least two organs in the three-dimensional mesh model according to the positions of the at least two organs on the head in the head semantic mask; and
    adjust the third texture expansion map according to the positions of the at least two organs in the three-dimensional mesh model to obtain the first texture expansion map.
  12. The apparatus according to claim 10 or 11, characterized in that, when obtaining the three-dimensional model of the bust based on the second texture expansion map, the reconstruction unit is specifically configured to:
    smooth a texture seam region in the second texture expansion map to obtain a fourth texture expansion map, and obtain the three-dimensional model of the bust based on the fourth texture expansion map, wherein the texture seam region is determined according to the expansion line used when performing texture expansion on the three-dimensional mesh model.
  13. The apparatus according to claim 12, characterized in that the at least two organs include an ear; the reconstruction unit is further configured to fuse a preconfigured ear model into an ear region of the three-dimensional mesh model to obtain a fused three-dimensional mesh model;
    and when obtaining the three-dimensional model of the bust based on the processed second texture expansion map, the reconstruction unit is specifically configured to:
    fuse the texture of the ear region on the fused three-dimensional mesh model into the ear region located at a preset position of the fourth texture expansion map to obtain a fused fourth texture expansion map, and obtain the three-dimensional model of the bust according to the fused fourth texture expansion map.
  14. The apparatus according to claim 12 or 13, characterized in that, when smoothing the texture seam region in the second texture expansion map, the reconstruction unit is specifically configured to:
    determine a back texture image of the bust according to the frontal image of the bust; and
    perform weighted fusion on the second texture expansion map and the back texture image of the bust to obtain the fourth texture expansion map.
  15. The apparatus according to claim 14, characterized in that, when performing weighted fusion on the second texture expansion map and the back texture image of the bust to obtain the fourth texture expansion map, the reconstruction unit is specifically configured to:
    perform weighted fusion on the second texture expansion map and the back image of the bust according to a set rule to obtain the fourth texture expansion map;
    wherein the set rule is:
    I_3(i, j) = α·I_1(i, j) + (1 − α)·I_2(map_1(i), map_2(j));
    α = I_alpha(i, j);
    where I_1 denotes the second texture expansion map, I_2 denotes the back image of the bust, I_3 denotes the fourth texture expansion map, I_alpha denotes the weight, map_1 denotes the mapping function that maps pixels of the back image of the bust onto the second texture expansion map in the X-axis direction, map_2 denotes the mapping function that maps pixels of the back image of the bust onto the second texture expansion map in the Y-axis direction, i denotes the coordinate value of a pixel in the X-axis direction, and j denotes the coordinate value of a pixel in the Y-axis direction.
  16. The apparatus according to any one of claims 10 to 15, characterized in that, before obtaining the first texture expansion map according to the positions of the at least two organs on the head in the head semantic mask and the three-dimensional mesh model, the reconstruction unit is further configured to:
    perform at least one of the following processes on the three-dimensional mesh model:
    hole filling, mesh uniformization, or mesh smoothing.
  17. An electronic device, characterized by comprising a processor and a memory, wherein the processor is coupled to the memory;
    the memory is configured to store program instructions; and
    the processor is configured to read the program instructions stored in the memory to implement the method according to any one of claims 1 to 8.
  18. A computer-readable storage medium, characterized in that the computer-readable storage medium stores program instructions which, when run on an electronic device or a processor, cause the electronic device to execute the method according to any one of claims 1 to 8.
  19. A computer program product, characterized in that, when the computer program product runs on an electronic device, the electronic device or a processor is caused to execute the method according to any one of claims 1 to 8.
  20. A chip, characterized in that the chip is coupled to a memory in an electronic device, so that the electronic device executes the method according to any one of claims 1 to 8.

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010132592.8A CN113327277A (en) 2020-02-29 2020-02-29 Three-dimensional reconstruction method and device for half-body image
CN202010132592.8 2020-02-29

Publications (1)

Publication Number Publication Date
WO2021170127A1

Family

ID=77412988

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/078324 WO2021170127A1 (en) 2020-02-29 2021-02-27 Method and apparatus for three-dimensional reconstruction of half-length portrait

Country Status (2)

Country Link
CN (1) CN113327277A (en)
WO (1) WO2021170127A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115601490A (en) * 2022-11-29 2023-01-13 思看科技(杭州)股份有限公司(Cn) Texture image pre-replacement method and device based on texture mapping and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106780713A (en) * 2016-11-11 2017-05-31 吴怀宇 A kind of three-dimensional face modeling method and system based on single width photo
US20180315222A1 (en) * 2017-05-01 2018-11-01 Lockheed Martin Corporation Real-time image undistortion for incremental 3d reconstruction
CN110197462A (en) * 2019-04-16 2019-09-03 浙江理工大学 A kind of facial image beautifies in real time and texture synthesis method
CN110782507A (en) * 2019-10-11 2020-02-11 创新工场(北京)企业管理股份有限公司 Texture mapping generation method and system based on face mesh model and electronic equipment

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100327541B1 (en) * 2000-08-10 2002-03-08 김재성, 이두원 3D facial modeling system and modeling method
CN101383055B (en) * 2008-09-18 2010-09-29 北京中星微电子有限公司 Three-dimensional human face constructing method and system
CN102663820B (en) * 2012-04-28 2014-10-22 清华大学 Three-dimensional head model reconstruction method
CN107452049B (en) * 2016-05-30 2020-09-15 腾讯科技(深圳)有限公司 Three-dimensional head modeling method and device


Also Published As

Publication number Publication date
CN113327277A (en) 2021-08-31

Similar Documents

Publication Publication Date Title
CN111415422B (en) Virtual object adjustment method and device, storage medium and augmented reality equipment
EP3992919B1 (en) Three-dimensional facial model generation method and apparatus, device, and medium
CN111640175A (en) Object modeling movement method, device and equipment
CN108986016B (en) Image beautifying method and device and electronic equipment
CN113628327B (en) Head three-dimensional reconstruction method and device
WO2019196745A1 (en) Face modelling method and related product
WO2021244172A1 (en) Image processing method and image synthesis method, image processing apparatus and image synthesis apparatus, and storage medium
WO2021078179A1 (en) Image display method and device
CN112927362A (en) Map reconstruction method and device, computer readable medium and electronic device
US11640687B2 (en) Volumetric capture and mesh-tracking based machine learning 4D face/body deformation training
CN112348937A (en) Face image processing method and electronic equipment
CN114219878A (en) Animation generation method and device for virtual character, storage medium and terminal
WO2023066120A1 (en) Image processing method and apparatus, electronic device, and storage medium
CN109961496A (en) Expression driving method and expression driving device
CN111951368A (en) Point cloud, voxel and multi-view fusion deep learning method
CN112581518A (en) Eyeball registration method, device, server and medium based on three-dimensional cartoon model
WO2021170127A1 (en) Method and apparatus for three-dimensional reconstruction of half-length portrait
US10650488B2 (en) Apparatus, method, and computer program code for producing composite image
CN115908120A (en) Image processing method and electronic device
TWM630947U (en) Stereoscopic image playback apparatus
WO2024051289A1 (en) Image background replacement method and related device
CN111369651A (en) Three-dimensional expression animation generation method and system
WO2023040754A1 (en) Image light supplement method and electronic device
US12001746B2 (en) Electronic apparatus, and method for displaying image on display device
WO2021254107A1 (en) Electronic apparatus, and method for displaying image on display device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21760368

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21760368

Country of ref document: EP

Kind code of ref document: A1