WO2021170127A1 - Method and apparatus for three-dimensional reconstruction of a half-length portrait (bust) - Google Patents
- Publication number: WO2021170127A1 (application PCT/CN2021/078324)
- Authority: WIPO (PCT)
Classifications
- G06T7/50 — Depth or shape recovery
- G06N3/04 — Neural networks; architecture, e.g. interconnection topology
- G06N3/08 — Neural networks; learning methods
- G06T15/04 — Texture mapping
- G06T17/20 — Finite element generation, e.g. wire-frame surface description, tesselation
- G06T5/70 — Denoising; smoothing
- G06T2200/08 — Processing steps from image acquisition to 3D model generation
- G06T2207/10004 — Still image; photographic image
- G06T2207/20081 — Training; learning
- G06T2207/20084 — Artificial neural networks [ANN]
- G06T2207/30196 — Human being; person
Definitions
- the embodiments of the present application relate to the field of image processing technology, and in particular to a method and device for three-dimensional reconstruction of a bust.
- three-dimensional bust reconstruction technology has a wide range of applications in three-dimensional printing, entertainment, and remote augmented reality (AR) calls.
- the traditional three-dimensional bust image reconstruction technology can adopt a monocular system or a multi-camera (multi-view) system.
- in a monocular system, an infrared depth camera is usually used to scan a person in a full circle, and the scanned person needs to remain still during the scanning process. This method therefore takes a long time to scan the person, the reconstruction calculation time is long, and the scan may be unsatisfactory or fail.
- the multi-camera scanning system acquires images from multiple viewing angles. Although it is capable of real-time reconstruction, it is expensive, bulky, and complex, and it is inconvenient to operate.
- the present application provides a three-dimensional reconstruction method of a bust.
- the three-dimensional reconstruction method of the bust can be implemented by an electronic device, for example by the electronic device itself or by one of its processing units.
- the method may include: obtaining an image to be processed that includes a bust of a target person, the bust including a frontal face; obtaining a first texture expansion map from the image to be processed, where the first texture expansion map represents the frontal texture of the bust and at least two of the facial organs of the bust (such as the nose and ears) are located at preset positions in the first texture expansion map; supplementing the back texture of the bust in the first texture expansion map according to the frontal texture of the bust in the first texture expansion map to obtain a second texture expansion map, which represents the surface texture of the bust; and finally obtaining a three-dimensional model of the bust from the second texture expansion map.
- the second texture expansion map is used to characterize the surface texture of the bust; that is, the second texture expansion map describes the all-around texture of the person's surface in the bust.
- the surface texture includes the front texture of the bust and the back texture of the bust.
- the front of a bust includes a human face, front neck, or front shoulders.
- the back of the bust can include the back of the head, the back of the neck, or the back of the shoulders.
- obtaining the first texture expansion map according to the image to be processed can be implemented in the following ways:
- the obtaining the first texture expansion map according to the positions of at least two organs on the head in the head semantic mask and the three-dimensional mesh model may be implemented in the following manner:
- the third texture expansion map is adjusted according to the positions of the at least two organs in the three-dimensional mesh model to obtain the first texture expansion map.
- the texture of the bust can be expanded in a circle to obtain the first texture expansion map.
- Circumferential expansion is a non-atlas texture expansion. Non-atlas expansion helps to improve texture continuity and reduce the gaps between texture patches.
- In addition, inpainting can be avoided, and the positions of semantic blocks on the texture map remain relatively fixed, which is convenient for machine learning.
- the at least two organs include ears; the method further includes:
- Obtaining the three-dimensional model of the bust based on the processed second texture expansion map including:
- the texture of the ear region on the fused three-dimensional mesh model is fused into the ear region located at the preset position of the fourth texture expansion map to obtain a fused fourth texture expansion map, and the three-dimensional model of the bust is obtained based on the fused fourth texture expansion map.
- the above design realizes high-precision local texture and geometry through mesh optimization and texture replacement of the ear part, and optimizes texture completion and mesh reconstruction details.
- smoothing the texture stitching area in the second texture expansion image can be achieved in the following manner:
- the back texture is estimated from the front image.
- the estimated back texture is free of gaps; it is then blended by weighting with the back texture obtained from the second texture expansion map, which improves the smoothing of the back gap.
- weighted fusion processing is performed on the second texture expansion image and the back texture image of the bust to obtain the fourth texture expansion image, which may be implemented in the following manner:
- the setting rules are:
- I3(i,j) = α·I1(i,j) + (1-α)·I2(map1(i), map2(j));
- I1 represents the second texture expansion image;
- I2 represents the back image of the bust;
- I3 represents the fourth texture expansion image;
- α represents the weight;
- map1 represents the mapping function that, in the X-axis direction, maps the pixels of the back image of the bust onto the second texture expansion map;
- map2 represents the corresponding mapping function in the Y-axis direction;
- i represents the coordinate value of a pixel in the X-axis direction;
- j represents the coordinate value of a pixel in the Y-axis direction.
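The set rule above can be sketched in a few lines of NumPy. The interpretation below is an assumption made for illustration only: map1 and map2 are treated as per-axis index functions, α as a scalar weight, and the function name `fuse_back_texture` is hypothetical, not taken from the patent.

```python
import numpy as np

# Sketch of the assumed interpretation of the set rule:
#   I3(i,j) = alpha * I1(i,j) + (1 - alpha) * I2(map1(i), map2(j))
# I1: second texture expansion map; I2: back image of the bust;
# map1/map2: per-axis index maps from texture-map coordinates into I2.

def fuse_back_texture(I1, I2, alpha, map1, map2):
    """Blend the back image I2 into the texture map I1 with weight alpha."""
    H, W = I1.shape[:2]
    rows = map1(np.arange(H))          # map row indices into I2
    cols = map2(np.arange(W))          # map column indices into I2
    I2_resampled = I2[np.ix_(rows, cols)]
    return alpha * I1 + (1.0 - alpha) * I2_resampled
```

With identity maps and α = 0.5 this reduces to a plain 50/50 blend of the two images.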
- before obtaining the first texture expansion map according to the positions of at least two organs on the head in the head semantic mask and the three-dimensional mesh model, the method further includes:
- the three-dimensional mesh model is processed to fill holes to make the three-dimensional mesh model more complete and improve the accuracy of the reconstruction of the three-dimensional model.
- Performing mesh uniformization processing prevents the obtained 3D mesh model from being too dense or too sparse, either of which would reduce the accuracy of the 3D model reconstruction.
- Performing mesh smoothing can remove inaccurate meshes in the 3D mesh model, that is, noise points, thereby improving the accuracy and smoothness of the 3D model reconstruction.
- the present application provides a three-dimensional reconstruction device for a bust, including:
- An acquiring unit configured to acquire an image to be processed, the image to be processed includes a bust of a target person, and the bust includes a frontal face;
- the reconstruction unit is configured to: obtain a first texture expansion map from the image to be processed, where the first texture expansion map represents the frontal texture of the bust and at least two of the facial organs of the bust are located at preset positions in the first texture expansion map; supplement the back texture of the bust in the first texture expansion map according to the frontal texture of the bust in the first texture expansion map to obtain a second texture expansion map,
- where the second texture expansion map represents the surface texture of the bust; and obtain the three-dimensional model of the bust from the second texture expansion map.
- a first texture expansion map is obtained, and the at least two organs in the first texture expansion map are located at preset positions.
- the third texture expansion map is adjusted according to the positions of the at least two organs in the three-dimensional mesh model to obtain the first texture expansion map.
- the reconstruction unit is specifically configured to: when obtaining the three-dimensional model of the bust based on the second texture expansion map:
- the at least two organs include ears; the reconstruction unit is further configured to fuse a pre-configured ear model into the ear region of the three-dimensional mesh model to obtain a fused three-dimensional mesh model.
- the reconstruction unit is specifically configured to: when obtaining the three-dimensional model of the bust based on the processed second texture expansion map:
- the texture of the ear region on the fused three-dimensional mesh model is fused into the ear region located at the preset position of the fourth texture expansion map to obtain a fused fourth texture expansion map, and the three-dimensional model of the bust is obtained based on the fused fourth texture expansion map.
- the reconstruction unit is specifically configured to: when smoothing the texture stitch line region in the second texture expansion map:
- when the reconstruction unit performs weighted fusion processing on the second texture expansion image and the back texture image of the bust to obtain the fourth texture expansion image, it is specifically configured to:
- the setting rule is:
- I3(i,j) = α·I1(i,j) + (1-α)·I2(map1(i), map2(j));
- I1 represents the second texture expansion image;
- I2 represents the back image of the bust;
- I3 represents the fourth texture expansion image;
- α represents the weight;
- map1 represents the mapping function that, in the X-axis direction, maps the pixels of the back image of the bust onto the second texture expansion map;
- map2 represents the corresponding mapping function in the Y-axis direction;
- i represents the coordinate value of a pixel in the X-axis direction;
- j represents the coordinate value of a pixel in the Y-axis direction.
- the electronic device further includes a camera; the camera is used to collect images to be processed.
- the above-mentioned processor is used to control the camera to collect images.
- an embodiment of the present application provides a computer program product which, when run on an electronic device, causes the electronic device or its processor to execute the method of the first aspect and any of its possible designs.
- "Coupled", as used in the embodiments of the present application, means that two components are combined with each other directly or indirectly.
- FIG. 1 is a schematic diagram of an electronic device in an embodiment of this application;
- FIG. 2 is a schematic flowchart of a method for three-dimensional reconstruction of a bust in an embodiment of this application;
- FIG. 3 is a schematic diagram of a bust in an embodiment of this application;
- FIG. 4 is a schematic flowchart of a method for texture expansion in an embodiment of this application;
- FIG. 5 is a schematic diagram of three-dimensional reconstruction of a bust in an embodiment of this application;
- FIG. 6 is a schematic diagram of the central axis in an embodiment of this application;
- FIG. 7 is a schematic diagram illustrating circumferential expansion in an embodiment of this application;
- FIG. 8 is a schematic diagram of determining the positions of the nose and ears in an embodiment of this application;
- FIG. 9 is a schematic diagram of smoothing the gap in an embodiment of this application;
- FIG. 10 is a schematic diagram of a smoothing method used for smoothing the gap in an embodiment of this application;
- FIG. 14 is a schematic diagram of a device 1400 in an embodiment of this application.
- the neural processor includes, but is not limited to, a neural network processing unit, such as a deep neural network processing unit or a convolutional neural network processing unit.
- the neural processor can use the neural network model to perform training, calculation, or processing.
- the neural network model includes, but is not limited to, a deep neural network model or a convolutional neural network model.
- the above digital signal processor, image processing unit or central processing unit can also use the neural network model to perform training, calculation or processing.
- the NPU included in the processor 110 is a neural-network (NN) computing processor.
- it can process input information quickly, and it can also learn continuously on its own.
- the three-dimensional reconstruction of the bust of the electronic device 100 can be achieved through the NPU.
- the memory 120 may be used to store computer executable program code, where the executable program code includes instructions.
- the processor 110 executes various functional applications and data processing of the electronic device by running instructions stored in the memory 120.
- the memory 120 may include a program storage area and a data storage area. The program storage area can store an operating system, driver software, or at least one application required by a function (such as a sound playback function or an image playback function).
- the data storage area can store data (such as audio data, phone book, etc.) created during the use of the electronic device 100.
- the memory 120 may include at least one of volatile memory or non-volatile memory, such as read-only memory (ROM), random access memory (RAM), dynamic random access memory (DRAM), embedded multimedia card (eMMC), universal flash storage (UFS), a hard disk, or a magnetic disk.
- the character "/" generally indicates that the associated objects before and after are in an "or” relationship.
- "at least one of the following items" or similar expressions refer to any combination of these items, including a single item or any combination of multiple items.
- for example, at least one of a, b, or c can mean: a, b, c, a and b, a and c, b and c, or a, b and c, where a, b, and c can each be single or multiple.
- FIG. 2 is a schematic flowchart of a method for three-dimensional reconstruction of a bust according to an embodiment of the present application.
- the three-dimensional reconstruction method of the bust can be implemented by the electronic device shown in FIG. 1, for example by the electronic device itself or by one of its processing units.
- the three-dimensional reconstruction method of the bust mainly includes S201-S204.
- the bust may include a frontal face, neck, shoulders, etc., for example, see FIG. 3.
- At least two organs in the five sense organs of the bust in the first texture development image are located at preset positions.
- the facial features include eyebrows, eyes, ears, nose, and mouth.
- the at least two organs being located at preset positions may mean that, in the texture expansion maps obtained for different images to be processed, the positions of the at least two organs are the same.
- for example, if the two organs are the nose and an ear, the ear is at the same position in the texture expansion maps of different images to be processed, and so is the nose.
- alternatively, it may mean that, in the texture expansion maps obtained for different images to be processed, the relative positions of the at least two organs are fixed.
- for example, the distance between the two ears and the distance between the nose and the ears can be the same in the texture expansion maps of different images to be processed.
- S203 Supplement the back texture of the bust in the first texture expansion image according to the front texture of the bust in the first texture expansion image to obtain a second texture expansion image.
- the second texture expansion map is used to characterize the surface texture of the bust.
- the second texture expansion map is used to characterize the surface texture of the bust; that is, the second texture expansion map describes the all-around texture of the person's surface in the bust.
- the surface texture may be the texture of the upper body surface of the three-dimensional target person. That is, the surface texture includes a texture that surrounds the surface of the upper body along the axis perpendicular to the ground plane.
- the surface texture includes the front texture of the bust and the back texture of the bust.
- the front of a bust includes a human face, front neck, or front shoulders.
- the back of the bust can include the back of the head, the back of the neck, or the back of the shoulders.
- S202 obtaining the first texture expansion image according to the image to be processed may be implemented by the following S401-S404.
- the first neural network model may be used to remove the background from the image to be processed to obtain a frontal image of the bust.
- the first neural network model is used to segment the foreground and background of the image, and output the foreground image.
- the foreground image is the front image of the bust.
- S402 Perform semantic segmentation on the front image to obtain a head semantic mask of the front image.
- Semantic segmentation is the grouping/segmentation of pixels according to the different semantic meanings expressed in the image.
- the semantic segmentation of the front image may use a fully convolutional network (FCN), such as a U-Net, SegNet, DeepLab, RefineNet, or PSPNet network.
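A trained segmentation network is not reproduced here. The sketch below only illustrates the step that follows the network: given the network's per-pixel foreground probability map, threshold it and mask out the background to isolate the frontal bust image. The helper name `apply_foreground_mask` and the 0.5 threshold are assumptions for illustration.

```python
import numpy as np

# Assumes a segmentation network has produced a per-pixel foreground
# probability map `prob_map` for an H x W x 3 image; thresholding and
# masking then yield the frontal bust image with background removed.

def apply_foreground_mask(image, prob_map, threshold=0.5):
    """Zero out background pixels given a foreground probability map."""
    mask = (prob_map >= threshold).astype(image.dtype)
    return image * mask[..., None]  # broadcast mask over color channels
```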
- S403 Obtain a three-dimensional mesh model of the bust according to the semantic mask of the head and the frontal image.
- the head semantic mask and the frontal image may be input to the second neural network model, and the second neural network model is used for human body reconstruction.
- the second neural network model outputs a three-dimensional (3D) truncated signed distance function (TSDF) volume. The surface mesh of the TSDF volume is then extracted to obtain a three-dimensional mesh model of the bust.
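The reconstruction network itself is not reproduced here. To make the volumetric representation concrete, the NumPy-only sketch below builds a TSDF volume for a simple sphere: each voxel stores the signed distance to the surface, clipped to a truncation band. A surface mesh could then be extracted from the zero level set (e.g. with a marching-cubes implementation). The function name `sphere_tsdf` and its parameters are assumptions for illustration.

```python
import numpy as np

# Illustrative TSDF volume for a sphere centered in a unit cube:
# negative inside the surface, positive outside, truncated to [-1, 1].

def sphere_tsdf(resolution=32, radius=0.4, trunc=0.1):
    axis = np.linspace(-0.5, 0.5, resolution)
    x, y, z = np.meshgrid(axis, axis, axis, indexing="ij")
    signed_dist = np.sqrt(x**2 + y**2 + z**2) - radius  # < 0 inside
    return np.clip(signed_dist, -trunc, trunc) / trunc  # truncate and scale
```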
- At least one of the following processing may be performed on the three-dimensional mesh model: hole filling processing, mesh uniformization processing, or mesh smoothing processing.
- a triangular mesh hole-filling method based on radial basis functions (RBF) or a hole-filling algorithm based on the Poisson equation may be used.
- mesh uniformization algorithms such as point clustering, edge collapse, and vertex insertion/deletion can be used.
- mesh smoothing methods based on Poisson's equation or discrete Laplace equation can be used.
- the 3D mesh model is made more complete and the accuracy of the 3D model reconstruction is improved.
- Performing mesh uniformization processing prevents the obtained 3D mesh model from being too dense or too sparse, either of which would reduce the accuracy of the 3D model reconstruction.
- Performing mesh smoothing can remove inaccurate meshes in the 3D mesh model, that is, noise points, thereby improving the accuracy and smoothness of the 3D model reconstruction.
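One simple realisation of the mesh-smoothing step is a uniform (combinatorial) Laplacian smoothing pass: each vertex moves toward the average of its adjacent vertices, which damps noise points in the reconstructed mesh. This is a sketch of one option; the text also allows Poisson-based smoothing, and the function name and damping factor below are assumptions.

```python
import numpy as np

# One iteration of uniform Laplacian smoothing on a triangle mesh.
# vertices: (N, 3) float array; faces: (M, 3) vertex-index array.

def laplacian_smooth(vertices, faces, lam=0.5):
    n = len(vertices)
    # Collect the unique undirected edges from the triangle list.
    edges = set()
    for tri in faces:
        for a, b in ((tri[0], tri[1]), (tri[1], tri[2]), (tri[2], tri[0])):
            edges.add((min(a, b), max(a, b)))
    neighbor_sum = np.zeros_like(vertices, dtype=float)
    degree = np.zeros(n)
    for a, b in edges:
        neighbor_sum[a] += vertices[b]
        neighbor_sum[b] += vertices[a]
        degree[a] += 1
        degree[b] += 1
    # Move each vertex a fraction lam toward its neighbourhood average.
    avg = neighbor_sum / np.maximum(degree, 1)[:, None]
    return vertices + lam * (avg - vertices)
```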
- the first texture expansion map of the three-dimensional mesh model can be obtained according to the positions of at least two organs on the head in the head semantic mask, so that the at least two organs in the first texture expansion map are located at the preset positions.
- the texture of the bust may be expanded in a circle to obtain the first texture expansion map.
- Circumferential expansion is a non-atlas texture expansion. Non-atlas expansion helps to improve texture continuity and reduce the gaps between texture patches.
- In addition, inpainting can be avoided, and the positions of semantic blocks on the texture map remain relatively fixed, which is convenient for machine learning.
- the front image of the bust shown in (a) in FIG. 5 is obtained after A1 processing.
- Semantic segmentation is performed on the front image of the bust to obtain the head semantic mask; the head semantic mask obtained after A2 processing is shown in FIG. 5(b).
- the head semantic mask and frontal image can be input into the second neural network model.
- the output TSDF volume is shown in Figure 5 (c).
- the surface mesh of the TSDF volume is extracted to obtain the three-dimensional mesh model of the bust.
- the first texture expansion diagram is obtained as shown in (e) of FIG. 5.
- a second texture expansion map as shown in (f) in FIG. 5 is obtained, that is, a texture map after texture completion.
- the texture map after texture completion and the surface mesh of the three-dimensional mesh model are combined to obtain the final three-dimensional model of the textured bust, as shown in Figure 5 (g).
- S404 obtains the first texture expansion map of the three-dimensional mesh model according to the positions of at least two organs on the head in the head semantic mask, which can be implemented in the following manner:
- A1 performing texture expansion on the texture corresponding to the three-dimensional mesh model based on the central axis to obtain a third texture expansion image, the central axis being the connecting line from the top of the head to the bottom of the head in the three-dimensional mesh model.
- the top of the head can be the highest point on the top of the head.
- the bottom of the head can be the geometric center of the lowest surface of the three-dimensional mesh model.
- the third texture expansion map can be obtained according to the spatial angle, or the surface circumference, of each point on the mesh relative to the central axis.
- circumferential expansion according to the angle can satisfy the first rule.
- the first rule can be:
- where (x, y, z) are the spatial coordinates of a point in the three-dimensional mesh model;
- [u, v] represents the pixel coordinates, in the texture expansion map, of the point with spatial coordinates (x, y, z);
- the pixel coordinates take the lower-left corner of the image as the origin;
- W is the width of the expanded texture image;
- H is the height of the expanded texture image;
- cx and cz represent the coordinates of the central axis on a plane of constant Y value.
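The text does not state the first rule's formula in full, so the sketch below is a hedged reconstruction of an angle-based circumferential expansion consistent with the symbols defined above: each mesh point (x, y, z) is unwrapped around the central axis (cx, cz) into pixel coordinates [u, v] on a W x H texture image. The normalisation (u spanning one full turn, v spanning the model's height between y_min and y_max) is an assumption, as is the function name.

```python
import numpy as np

# Assumed angle-based circumferential unwrap about the central axis.

def circumferential_uv(points, cx, cz, y_min, y_max, W, H):
    """Map (N, 3) mesh points to (N, 2) [u, v] texture pixel coordinates."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    theta = np.arctan2(z - cz, x - cx)        # angle about the central axis
    u = (theta + np.pi) / (2 * np.pi) * W     # [-pi, pi] -> [0, W]
    v = (y - y_min) / (y_max - y_min) * H     # model height -> [0, H]
    return np.stack([u, v], axis=1)
```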
- the third texture expansion map describes the mesh texture of the three-dimensional mesh model; that is, it is a mesh texture expansion map that does not yet carry pixel values.
- A2. Determine the positions of at least two organs in the three-dimensional mesh model according to the positions of at least two organs on the head in the head semantic mask.
- the head semantic mask includes the locations of at least two organs, that is, the coordinates of at least two organs can be determined according to the head semantic mask. Take the two organs, the ear and the nose, for example.
- the three-dimensional mesh model is obtained according to the semantic mask of the head. Therefore, there is a mapping relationship between the ears and nose in the semantic mask of the head and the ears and noses in the three-dimensional mesh model. Therefore, the positions of the ears and the nose in the three-dimensional mesh model can be determined according to the positions of the ears and the nose in the semantic mask of the head.
- A3 Adjust the third texture expansion map according to the positions of at least two organs in the three-dimensional mesh model to obtain the first texture expansion map.
- Figure 8 (a) shows the ear and nose positions in the semantic mask of the head.
- Figure 8(b) shows the positions of ears and nose in the three-dimensional mesh model.
- One way is to adjust the positions of the ears and/or nose in the third texture expansion map according to the positions of the ears and nose in the 3D mesh model: the positions of the ears and/or nose in the third texture expansion map are determined from their positions in the 3D mesh model and are then adjusted so that the adjusted ears and nose are located at the preset positions of the third texture expansion map; the first texture expansion map is then obtained from the adjusted third texture expansion map and the front image,
- so that the adjusted ears and nose are located at the preset positions of the first texture expansion map.
- the distortion mapping process may be performed on the third expanded texture image, so that the ears and the nose in the first expanded texture image are located at preset positions.
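The distortion mapping is not specified in detail, so the sketch below shows one simple assumed realisation: a piecewise-linear remap along the horizontal (u) axis of the texture map that moves detected landmark columns (e.g. ears, nose) to preset columns, stretching the intermediate columns accordingly. The function name `warp_columns` is hypothetical.

```python
import numpy as np

# Assumed 1-D column warp: pixels at src_cols end up at dst_cols;
# columns in between are resampled piecewise-linearly.

def warp_columns(texture, src_cols, dst_cols):
    """Remap texture columns so src_cols land at dst_cols."""
    H, W = texture.shape[:2]
    dst = np.arange(W)
    # For every output column, find the source column to sample from.
    sample = np.interp(dst, [0] + list(dst_cols) + [W - 1],
                            [0] + list(src_cols) + [W - 1])
    return texture[:, np.round(sample).astype(int)]
```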
- Another way is to first obtain a color-filled third texture expansion map based on the third texture expansion map and the front image;
- the positions of the ears and the nose in the three-dimensional mesh model are then determined, and the positions of the ears and/or the nose are adjusted to obtain the first texture expansion map,
- in which the ears and the nose are located at the preset positions.
- there is a mapping relationship between the coordinates of each pixel in the third texture expansion map and the coordinates of each point in the three-dimensional mesh model, and there is a mapping relationship between the coordinates of each point in the three-dimensional mesh model and the coordinates of the pixels in the front image. Consequently, there is a mapping relationship between the coordinates of each pixel in the third texture expansion map and the coordinates of the pixels in the front image, so the pixel values of the front image can be mapped onto the third texture expansion map (or onto the adjusted third texture expansion map).
- In current half-body texture expansion, the front and back are generally expanded separately, that is, into a visible front part and an invisible back part. As a result, the texture reconstruction of invisible parts such as the hair and ears is poor, the geometric mesh reconstruction of the ears is also poor, and it is therefore difficult to meet user needs.
- The adjusted ears and nose are located at the preset positions of the first texture expansion map. This provides convenient conditions for the subsequent optimization of the ear region and can improve its accuracy.
- The optimization of the ear region is described later and is not repeated here.
- the hair may completely cover the ear area.
- the subsequent optimization of the ear part may not be performed.
- Alternatively, the texture expansion map can be obtained based on the spatial angle about the central axis, or on the surface circumference at a given location, in which case A2 and A3 need not be executed.
- In that case, the color-filled third texture expansion map can be obtained directly from the third texture expansion map and the front image; that is, the filled third texture expansion map is used as the first texture expansion map.
- A given cross-section of the head can be approximated as a circle.
- Taking the case where the expansion line is on the back as an example, the black dots between -π and π in (a) of FIG. 7 can be regarded as points on the cross-section traversed by the expansion line when the texture is expanded.
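Under the circular cross-section approximation, the texture coordinate along the expansion direction can be taken as the angle about the central axis, with the seam (expansion line) at the back. A small sketch with invented coordinates:

```python
import math

# Sketch of unwrapping a circular head cross-section into a texture row,
# assuming (per the text) the expansion line is at the back, so the angle
# about the central axis runs over (-pi, pi]. Coordinates are toy values.

def unwrap_u(x, z, cx, cz):
    """Map a cross-section point to an angle about the central axis (cx, cz);
    u = 0 faces the front, u = ±pi is the back seam."""
    return math.atan2(x - cx, z - cz)

# Points on a unit circle around an axis at the origin.
print(unwrap_u(0.0, 1.0, 0.0, 0.0))   # 0.0        (directly in front)
print(unwrap_u(1.0, 0.0, 0.0, 0.0))   # pi/2       (left side)
print(unwrap_u(0.0, -1.0, 0.0, 0.0))  # pi         (the back seam)
```

Placing the seam at the back keeps the discontinuity in the least visible region, which is consistent with taking the expansion line on the back.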
- Figure 9 is a schematic diagram of seam-like gaps in the stitching area of the three-dimensional model; the result after gap smoothing is shown on the right of Figure 9.
- The gap smoothing operation performs fusion processing on the texture stitching area in the second texture expansion map.
- the second texture expansion image and the back texture image of the bust may be subjected to weighted fusion processing according to the set rules to obtain the fourth texture expansion image;
- the setting rules can be:
- I3(i, j) = α·I1(i, j) + (1 − α)·I2(map1(i), map2(j));
- I1 represents the second texture expansion image;
- I2 represents the back texture image of the bust;
- I3 represents the fourth texture expansion image;
- α represents the weight;
- map1 represents the mapping function in the X-axis direction that maps pixel points in the back image of the bust to the second texture expansion map;
- map2 represents the corresponding mapping function in the Y-axis direction;
- i represents the coordinate value of the pixel point in the X-axis direction;
- j represents the coordinate value of the pixel point in the Y-axis direction.
- the back texture image of the bust is determined according to the front image of the bust as shown in FIG. 10(a), and the weight of each pixel can be shown in FIG. 10(b).
- the second texture expansion image is shown in (c) of FIG. 10.
- The second texture expansion image ((c) in FIG. 10) and the back texture image of the bust ((a) in FIG. 10) are fused according to the weights in (b) of FIG. 10 to obtain (d) in FIG. 10, namely the fourth texture expansion map.
- the fusion operation is indicated by the plus sign "+" in Figure 10.
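The set rule above can be sketched directly. The images, per-pixel weights, and the identity stand-ins for map1/map2 below are illustrative only:

```python
# Toy sketch of the weighted fusion rule
#   I3(i, j) = α·I1(i, j) + (1 − α)·I2(map1(i), map2(j)),
# with images as nested lists indexed [row j][column i] and a per-pixel
# weight, as suggested by the weight map in (b) of FIG. 10.

def fuse(I1, I2, alpha, map1, map2):
    """Weighted fusion of the second texture expansion image I1 with the
    remapped back texture image I2, producing I3."""
    h, w = len(I1), len(I1[0])
    return [[alpha[j][i] * I1[j][i]
             + (1 - alpha[j][i]) * I2[map2(j)][map1(i)]
             for i in range(w)] for j in range(h)]

I1 = [[100, 100], [100, 100]]      # second texture expansion image
I2 = [[0, 0], [0, 0]]              # back texture image of the bust
alpha = [[1.0, 0.5], [0.5, 0.0]]   # per-pixel weights (toy values)
identity = lambda k: k             # stand-ins for map1 / map2

I3 = fuse(I1, I2, alpha, identity, identity)
print(I3)  # [[100.0, 50.0], [50.0, 0.0]]
```

In practice map1 and map2 resample the back image onto the expansion map's coordinate grid, and α falls off smoothly across the stitching area so the seam is blended rather than hard-edged.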
- the process of optimizing the ear part will be described in detail below.
- The front face of the bust includes the ears, that is, the ears are not covered by hair; whether the ears are included can be determined from the semantic mask of the head.
- When optimizing the ear region, the following method can be used:
- the fusion method used when fusing the texture of the ear region on the fused three-dimensional mesh model to the ear region located at the preset position of the fourth texture expansion map may be a fusion algorithm based on the image Laplacian gradient.
- FIG. 11 is a schematic diagram of a three-dimensional mesh model.
- the ear area in the determined three-dimensional mesh model can be seen in (b) of FIG. 11.
- the geometric dimensions of the ear can be determined according to the size of the three-dimensional mesh model.
- the size of the three-dimensional mesh model can be preset by the user or a default size can be adopted.
- The ear model is fitted to the ear area in the 3D mesh model, as shown in (c) of Figure 11; fusion processing is then performed on the 3D mesh model with the fitted ear model, and the resulting fused 3D mesh model is shown in (d) of Figure 11. The texture of the ear region on the fused three-dimensional mesh model is then obtained and fused into the ear region at the preset position in the fourth texture expansion map.
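The Laplacian-gradient-based fusion mentioned above is, in essence, gradient-domain blending: interior values of the pasted region are solved so their gradients match the source patch while the boundary comes from the target. A minimal 1-D sketch (a toy stand-in for the 2-D image case; the signals and blend region are invented):

```python
# Minimal 1-D sketch of gradient-domain ("Laplacian") fusion. Inside the
# blend region the output matches the source's Laplacian; at the region
# boundary it takes the target's values, so the patch blends in seamlessly.

def poisson_blend_1d(target, source, lo, hi, iters=2000):
    """Blend source[lo:hi] into target, using target[lo-1] and target[hi]
    as boundary conditions and matching source gradients inside."""
    out = list(target)
    for _ in range(iters):               # simple iterative Poisson solve
        for k in range(lo, hi):
            left = out[k - 1] if k - 1 >= lo else target[lo - 1]
            right = out[k + 1] if k + 1 < hi else target[hi]
            lap = 2 * source[k] - source[k - 1] - source[k + 1]
            out[k] = (left + right + lap) / 2
    return out

target = [0, 0, 0, 0, 0, 0]              # e.g. the texture expansion map
source = [k * k for k in range(6)]       # e.g. an "ear patch" with curvature
blended = poisson_blend_1d(target, source, 2, 4)
print([round(v, 3) for v in blended])    # [0, 0, -2.0, -2.0, 0, 0]
```

The blended interior keeps the source's curvature (its Laplacian) while its absolute level is pulled to the target's boundary, which is what makes gradient-based fusion look seamless; 2-D image implementations solve the same equation over a pixel mask.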
- the solution provided in the embodiment of this application is applied to a virtual three-dimensional dialogue.
- the local terminal device receives the user’s trigger and starts a virtual 3D video call
- the video stream is obtained through the terminal’s camera.
- the 3D reconstruction method for a bust based on a single frame image proposed in this application can be used to reconstruct the 3D model.
- the terminal drives the three-dimensional model by acquiring the user's expression in each frame of the video stream, and sends it to the opposite terminal, and the opposite terminal displays the local terminal user's expression simulated by the three-dimensional model.
- creating a three-dimensional model can be performed by a computing cloud.
- the electronic device sends a single frame image to the computing cloud, and the computing cloud creates a three-dimensional model, and then sends the created three-dimensional model to the terminal.
- the embodiments of the present application also provide an apparatus 1400.
- the apparatus 1400 may specifically include functional modules in an electronic device (for example, components in the processor 110 in FIG. 1 or software modules executed by it), or the apparatus 1400 may be a chip or a chip system, or the apparatus 1400 may be a module in an electronic device, or the like.
- the apparatus may include an obtaining unit 1401 and a reconstruction unit 1402.
- the obtaining unit 1401 and the reconstruction unit 1402 respectively execute different steps of the method shown in the embodiment corresponding to FIG. 2 and FIG. 4.
- the obtaining unit 1401 may be used to obtain the image to be processed in S201
- the reconstruction unit 1402 may be used to perform the process of S202-S204.
- the specific implementation is as described above and will not be repeated here.
- the aforementioned acquisition unit 1401 or reconstruction unit 1402 can be implemented by software, hardware, or a combination of software and hardware.
- The hardware can be a CPU, microprocessor, DSP, micro control unit (MCU), artificial intelligence processor, application-specific integrated circuit (ASIC), field-programmable gate array (FPGA), dedicated digital circuit, hardware accelerator, or any one or any combination of non-integrated discrete devices; it can run the necessary software, or perform the above method flow without relying on software, and it is located inside the processor 110 described above with reference to FIG. 1.
- If the module is implemented in software, the software exists in the form of computer program instructions and is stored in a memory, such as the memory 120 in FIG.
- The processor may include, but is not limited to, at least one of the following: a CPU, microprocessor, DSP, microcontroller, artificial intelligence processor, or other computing device that runs software. Each computing device may include one or more cores used to execute software instructions for calculation or processing.
- The processor can be a standalone semiconductor chip, or it can be integrated with other circuits to form a semiconductor chip. For example, it can form an SoC (system on a chip) with other circuits (such as codec circuits, hardware acceleration circuits, or various bus and interface circuits).
- the processor may further include necessary hardware accelerators, such as FPGAs, PLDs (programmable logic devices), or logic circuits that implement dedicated logic operations.
- this application can be provided as methods, systems, or computer program products. Therefore, this application may adopt the form of a complete hardware embodiment, a complete software embodiment, or an embodiment combining software and hardware. Moreover, this application may adopt the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program codes.
- These computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing equipment to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device.
- The instruction device implements the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.
- These computer program instructions can also be loaded onto a computer or other programmable data processing equipment, so that a series of operation steps are executed on the computer or other programmable equipment to produce computer-implemented processing.
- The instructions executed on the computer or other programmable equipment thus provide steps for implementing the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Computing Systems (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Biomedical Technology (AREA)
- General Engineering & Computer Science (AREA)
- Biophysics (AREA)
- Mathematical Physics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Computer Graphics (AREA)
- Geometry (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Image Processing (AREA)
- Image Generation (AREA)
- Processing Or Creating Images (AREA)
Abstract
The present invention relates to a method and apparatus for the three-dimensional reconstruction of a bust, to solve the problems of high complexity and long reconstruction time. The method comprises: obtaining an image containing the front face of a target person's bust; unfolding a texture using a semantic head mask of the front face, so that at least two organs in the unfolded texture are located at preset positions; and then completing a back texture according to the unfolded texture map containing the front. A three-dimensional mesh model can further be constructed from the front face; a preconfigured ear model is used to replace the ears in the three-dimensional mesh model; the texture of an ear area is obtained from the replaced three-dimensional mesh model and is fused with the completed back texture, so as to obtain the three-dimensional mesh model. No professional modeling equipment is required, and the complexity is low. Since the front and back textures of a human body have a certain correlation, completing the back texture according to the front texture makes the constructed three-dimensional model fit the person better, achieving a better effect.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010132592.8 | 2020-02-29 | ||
CN202010132592.8A CN113327277A (zh) | 2020-02-29 | 2020-02-29 | 一种半身像的三维重建方法及装置 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021170127A1 true WO2021170127A1 (fr) | 2021-09-02 |
Family
ID=77412988
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/078324 WO2021170127A1 (fr) | 2020-02-29 | 2021-02-27 | Procédé et appareil de reconstruction tridimensionnelle d'un portrait de demi-longueur |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN113327277A (fr) |
WO (1) | WO2021170127A1 (fr) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115601490A (zh) * | 2022-11-29 | 2023-01-13 | 思看科技(杭州)股份有限公司(Cn) | 基于纹理映射的纹理图像前置置换方法、装置和存储介质 |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106780713A (zh) * | 2016-11-11 | 2017-05-31 | 吴怀宇 | 一种基于单幅照片的三维人脸建模方法及系统 |
US20180315222A1 (en) * | 2017-05-01 | 2018-11-01 | Lockheed Martin Corporation | Real-time image undistortion for incremental 3d reconstruction |
CN110197462A (zh) * | 2019-04-16 | 2019-09-03 | 浙江理工大学 | 一种人脸图像实时美化与纹理合成方法 |
CN110782507A (zh) * | 2019-10-11 | 2020-02-11 | 创新工场(北京)企业管理股份有限公司 | 一种基于人脸网格模型的纹理贴图生成方法、系统及电子设备 |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR100327541B1 (ko) * | 2000-08-10 | 2002-03-08 | 김재성, 이두원 | 3차원 얼굴 모델링 시스템 및 모델링 방법 |
CN101383055B (zh) * | 2008-09-18 | 2010-09-29 | 北京中星微电子有限公司 | 一种三维人脸模型的构造方法和系统 |
CN102663820B (zh) * | 2012-04-28 | 2014-10-22 | 清华大学 | 三维头部模型重建方法 |
CN107452049B (zh) * | 2016-05-30 | 2020-09-15 | 腾讯科技(深圳)有限公司 | 一种三维头部建模方法及装置 |
-
2020
- 2020-02-29 CN CN202010132592.8A patent/CN113327277A/zh active Pending
-
2021
- 2021-02-27 WO PCT/CN2021/078324 patent/WO2021170127A1/fr active Application Filing
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106780713A (zh) * | 2016-11-11 | 2017-05-31 | 吴怀宇 | 一种基于单幅照片的三维人脸建模方法及系统 |
US20180315222A1 (en) * | 2017-05-01 | 2018-11-01 | Lockheed Martin Corporation | Real-time image undistortion for incremental 3d reconstruction |
CN110197462A (zh) * | 2019-04-16 | 2019-09-03 | 浙江理工大学 | 一种人脸图像实时美化与纹理合成方法 |
CN110782507A (zh) * | 2019-10-11 | 2020-02-11 | 创新工场(北京)企业管理股份有限公司 | 一种基于人脸网格模型的纹理贴图生成方法、系统及电子设备 |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115601490A (zh) * | 2022-11-29 | 2023-01-13 | 思看科技(杭州)股份有限公司(Cn) | 基于纹理映射的纹理图像前置置换方法、装置和存储介质 |
Also Published As
Publication number | Publication date |
---|---|
CN113327277A (zh) | 2021-08-31 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2021208648A1 (fr) | Procédé et appareil d'ajustement d'objet virtuel, support de stockage et dispositif de réalité augmentée | |
CN111243093B (zh) | 三维人脸网格的生成方法、装置、设备及存储介质 | |
EP3992919B1 (fr) | Procédé et appareil de génération de modèle facial tridimensionnel, dispositif et support | |
CN111640175A (zh) | 一种物体建模运动方法、装置与设备 | |
CN108986016B (zh) | 图像美化方法、装置及电子设备 | |
WO2019196745A1 (fr) | Procédé de modélisation de visage et produit associé | |
WO2021244172A1 (fr) | Procédé de traitement d'image et procédé de synthèse d'image, appareil de traitement d'image et appareil de synthèse d'image, et support de stockage | |
WO2021078179A1 (fr) | Procédé et dispositif d'affichage d'image | |
WO2021027585A1 (fr) | Procédé de traitement d'images de visages humains et dispositif électronique | |
CN112927362A (zh) | 地图重建方法及装置、计算机可读介质和电子设备 | |
CN114219878A (zh) | 虚拟角色的动画生成方法及装置、存储介质、终端 | |
CN113628327A (zh) | 一种头部三维重建方法及设备 | |
US11640687B2 (en) | Volumetric capture and mesh-tracking based machine learning 4D face/body deformation training | |
CN109961496A (zh) | 表情驱动方法及表情驱动装置 | |
WO2023066120A1 (fr) | Procédé et appareil de traitement d'image, dispositif électronique et support de stockage | |
CN112581518A (zh) | 基于三维卡通模型的眼球配准方法、装置、服务器和介质 | |
CN111951368A (zh) | 一种点云、体素和多视图融合的深度学习方法 | |
US10650488B2 (en) | Apparatus, method, and computer program code for producing composite image | |
WO2021170127A1 (fr) | Procédé et appareil de reconstruction tridimensionnelle d'un portrait de demi-longueur | |
CN115908120A (zh) | 图像处理方法和电子设备 | |
US12001746B2 (en) | Electronic apparatus, and method for displaying image on display device | |
TWM630947U (zh) | 立體影像播放裝置 | |
TW202332263A (zh) | 立體影像播放裝置及其立體影像產生方法 | |
WO2024051289A1 (fr) | Procédé de remplacement d'arrière-plan d'image et dispositif associé | |
CN111369651A (zh) | 三维表情动画生成方法及系统 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 21760368 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 21760368 Country of ref document: EP Kind code of ref document: A1 |