CN114359452A - Three-dimensional model texture synthesis method based on semantic image translation

Three-dimensional model texture synthesis method based on semantic image translation

Info

Publication number
CN114359452A
CN114359452A (application CN202111514168.0A)
Authority
CN
China
Prior art keywords: image, information, semantic, triangle, rendering
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111514168.0A
Other languages
Chinese (zh)
Other versions
CN114359452B (en)
Inventor
阮系标
宋海川
马利庄
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
East China Normal University
Original Assignee
East China Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by East China Normal University filed Critical East China Normal University
Priority to CN202111514168.0A priority Critical patent/CN114359452B/en
Publication of CN114359452A publication Critical patent/CN114359452A/en
Application granted granted Critical
Publication of CN114359452B publication Critical patent/CN114359452B/en
Status: Active

Landscapes

  • Image Generation (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The invention discloses a three-dimensional model texture synthesis method based on semantic image translation. The method first rasterizes a pure geometric model to render its contours from six views. A user then fills semantic labels into the six rendered images through human-computer interaction, with different colors representing different semantic labels; the six label-filled images are fed as input to an adversarial neural network, which synthesizes six target images in a consistent style. Finally, all synthesized target images are mapped onto one texture map, yielding a complete, usable texture image. The method addresses the time-consuming drawing involved in producing three-dimensional model textures: it reduces the drawing of texture details to the filling of semantic labels and uses an adversarial neural network to synthesize texture details in a fixed style, lowering the workload of texture production.

Description

Three-dimensional model texture synthesis method based on semantic image translation
Technical Field
The invention relates to the technical field of image synthesis and texture mapping, and in particular to a three-dimensional model texture synthesis method based on semantic image translation.
Background
Texture mapping of a three-dimensional model is carried out by artists, who must uv-unwrap a pure geometric model to flatten it into a two-dimensional plane and then paint materials and other details on that plane. This drawing workflow is tedious. With the development of the game industry, new games appear endlessly, and characters and scene objects all require different model maps; the speed at which artists can produce game model maps struggles to keep up with such a fast-growing industry.
Previous research on generating model maps uses existing image resources from the internet to transfer textures from 2D images to 3D models. However, these methods either require multi-view images of the same object as source data or require high similarity between the 2D image and the 3D model, and such images are hard to obtain, so these methods are difficult to apply to practical texture generation.
Disclosure of Invention
The invention aims to provide a three-dimensional model texture synthesis method based on semantic image translation, which synthesizes target images from filled semantic label images with an adversarial neural network, controls the style of the synthesized images with a style transfer method, and maps the multiple synthesized images into one texture map using the barycentric coordinates of triangles.
The specific technical scheme for realizing the purpose of the invention is as follows:
a three-dimensional model texture synthesis method based on semantic image translation utilizes rasterization to render geometric model outline, semantic labels are filled in the geometric model outline to be added into a countermeasure neural network to realize texture synthesis, and the method comprises the following specific steps:
Step 1: rendering contours by rasterization and retaining triangle information
1.1) importing a pure geometric model into a renderer and selecting a rotation matrix and a model matrix to adjust the model, rendering the contours of the model's front, top, left, right, bottom and back views; pixels on and inside the rendered contours are set to 255 and all other pixels to 0;
1.2) besides the six rendered maps obtained, saving for each rendered map the information of the triangular patches of the geometric model that it renders: a corresponding file is generated for each rendered map, recording the coordinates of each pixel in the rendered map, the vertex coordinates of the geometric-model triangle to which the pixel belongs, and the pixel's barycentric coordinates within that triangle (a sketch of this computation follows step 1.3);
1.3) if the geometric model carries uv coordinate information, retaining that information and generating from it an initial texture map of size 1024×1024 for subsequent coloring; if the geometric model has no uv coordinate information, computing the triangle areas from the triangle vertex information, assigning uv coordinates to each triangle vertex accordingly, and generating an initial texture map of size 1024×1024 for subsequent coloring;
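The per-pixel barycentric coordinates recorded in step 1.2 can be computed in screen space once a pixel is known to lie inside a projected triangle. A minimal sketch (assuming NumPy and already-projected 2D vertices; the function name is illustrative, not part of the patent):

```python
import numpy as np

def barycentric(p, a, b, c):
    """Barycentric coordinates (u, v, w) of 2D point p in triangle (a, b, c).

    u + v + w == 1, and all three lie in [0, 1] exactly when p is inside
    the triangle; (u, v, w) is what step 1.2 stores per rendered pixel.
    """
    v0, v1, v2 = b - a, c - a, p - a
    d00, d01, d11 = v0 @ v0, v0 @ v1, v1 @ v1
    d20, d21 = v2 @ v0, v2 @ v1
    denom = d00 * d11 - d01 * d01
    v = (d11 * d20 - d01 * d21) / denom
    w = (d00 * d21 - d01 * d20) / denom
    return 1.0 - v - w, v, w

# One record per covered pixel: (pixel x, pixel y, triangle id, u, v, w).
a, b, c = np.array([10.0, 10.0]), np.array([50.0, 12.0]), np.array([30.0, 48.0])
u, v, w = barycentric(np.array([30.0, 20.0]), a, b, c)
```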
Step 2: semantic labeling of the rendered views through user interaction
2.1) filling the six rendered views obtained in step 1: painting only the region inside the model contour in each rendered view with different colors, and assigning semantic information to the region painted in each color;
2.2) after filling, a calibration program checks the painting in the overlapping regions of the six views; if a semantic conflict is found, the user is prompted to modify it, and if there is no semantic conflict, the painted rendered views are output;
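One way the consistency check of step 2.2 could work, given the per-pixel triangle ids saved in step 1.2: a triangle visible in several views must carry the same label color everywhere. A sketch (the data layout and function name are assumptions):

```python
from collections import defaultdict

def find_semantic_conflicts(views):
    """views: list of (label_img, tri_img) pairs, one per rendered view.

    label_img is an (H, W, 3) color-label image, tri_img an (H, W) array
    of triangle ids with -1 for background. Returns the ids of triangles
    that received more than one label across the six views.
    """
    labels = defaultdict(set)
    for label_img, tri_img in views:
        h, w = tri_img.shape
        for y in range(h):
            for x in range(w):
                t = tri_img[y, x]
                if t >= 0:
                    labels[t].add(tuple(label_img[y, x]))
    return [t for t, colors in labels.items() if len(colors) > 1]
```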
Step 3: synthesizing target images from the semantically labeled rendered views through an adversarial neural network
3.1) inputting the rendered views with semantic information into an adversarial neural network, which propagates the semantic information through each of its layers and synthesizes target images matching the semantic distribution;
3.2) the adversarial neural network learns the image style with an encoder: before the semantic distribution image is input, an image taken from the internet is designated at random as the style reference image for style transfer, so that a texture image with a style similar to the reference image is synthesized; the encoder reduces the image to 1024 dimensions with a 3-layer convolutional network, then outputs a 512-dimensional latent code as style feature information through 8 fully connected layers, and this code is concatenated with the semantic features at the input layers of the two generators of the adversarial neural network (a sketch of such an encoder follows step 3.3);
3.3) the adversarial neural network synthesizes a high-resolution image with two generators and four discriminators: the low-resolution generator outputs a coarse synthesized image with a U-Net structure; the high-resolution generator extracts semantic image features with a three-layer convolutional structure, combines the output feature information with the coarse-image feature information output by the low-resolution generator, and feeds the result into six convolutional layers, which serve as the final output layers and synthesize the high-resolution image, with an adaptive instance normalization layer added after every two of the first four of these layers; the four discriminator networks adopt a Patch-GAN structure and judge the image at four scales: the original image and its 1/2, 1/4 and 1/8 downsamplings;
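A sketch of a style encoder matching step 3.2 (three convolutional layers down to a 1024-dimensional feature, then eight fully connected layers yielding the 512-dimensional style code); the channel widths, kernel sizes and pooling step are assumptions, since the patent does not fix them:

```python
import torch
import torch.nn as nn

class StyleEncoder(nn.Module):
    """3 conv layers -> 1024-d feature -> 8 FC layers -> 512-d style code."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(128, 256, 4, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        self.pool = nn.AdaptiveAvgPool2d(2)    # 256 channels x 2 x 2 = 1024 dims
        layers, dims = [], [1024] + [512] * 8  # 8 fully connected layers
        for i in range(8):
            layers += [nn.Linear(dims[i], dims[i + 1]), nn.ReLU(inplace=True)]
        self.fc = nn.Sequential(*layers[:-1])  # no activation on the final code

    def forward(self, img):
        x = self.pool(self.conv(img)).flatten(1)  # (B, 1024)
        return self.fc(x)                         # (B, 512) style code
```

The resulting code is what step 3.2 concatenates with the semantic features at the generator input layers.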
Step 4: mapping multiple synthesized images into one texture map
4.1) with the target images from step 3.1 as source data, coloring the initial texture map generated in step 1.3, using as reference the triangle vertex information and barycentric coordinate information corresponding to each rendered view from step 1.1: a pixel value is read from the source data, the corresponding pixel in the initial texture map is located from the triangle the pixel belongs to and its barycentric coordinates, and the read pixel value is assigned to that pixel of the initial texture map;
4.2) for pixels of the initial texture map that are colored multiple times with contradictory values, first applying a majority voting scheme: if 3 or more colorings agree, that pixel value is adopted; if the repeatedly written pixel values do not agree, a neighborhood blending method is applied, taking a weighted blend of the 8 pixels surrounding the pixel as the final pixel value.
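A sketch of the conflict resolution in step 4.2, assuming each texel has accumulated the candidate colors written by the (up to six) views; the thresholds follow the text, while the equal neighbor weights and border handling are simplifications:

```python
import numpy as np

def resolve_texel(candidates, texture, x, y):
    """candidates: list of RGB tuples written to texel (x, y) by different views.

    Majority vote first: if 3 or more views agree on a color, keep it;
    otherwise blend the 8 surrounding texels (borders ignored for brevity).
    """
    values, counts = np.unique(np.array(candidates), axis=0, return_counts=True)
    if counts.max() >= 3:
        return values[counts.argmax()]
    neighbors = [texture[y + dy, x + dx]
                 for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                 if (dx, dy) != (0, 0)]
    return np.mean(neighbors, axis=0)
```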
In step 1.2, besides the six rendered maps obtained, the renderer stores for each rendered map the triangle patch information of the geometric model that it renders, namely the triangle vertex coordinates and triangle index information. Because only the triangles visible in each rendered map are rendered, the number of triangles per rendered map is not necessarily equal across maps and is less than the total number of triangle patches in the geometric model.
In step 1.3, for a geometric model without uv coordinate information, the triangle areas are computed from the triangle vertex information, uv coordinates are assigned to each triangle vertex, and an initial texture map is generated for subsequent coloring. The area is computed from the triangle vertex coordinates by the area formula; the initial texture map is first divided into spaces of equal size according to the total number of triangle patches, and the size of each space is then adjusted according to the triangle's area, with larger areas receiving larger spaces.
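A sketch of this area-proportional allocation (triangle area from vertex coordinates via the cross-product formula; the shelf-style packing is an illustrative assumption, since the patent does not fix a packing scheme):

```python
import numpy as np

def triangle_area(a, b, c):
    """Area of a 3D triangle: half the norm of the edge cross product."""
    return 0.5 * np.linalg.norm(np.cross(b - a, c - a))

def allocate_uv_space(triangles, tex_size=1024):
    """Assign each triangle a square cell whose area is proportional to its
    surface area; returns one (u0, v0, side) cell per triangle."""
    areas = np.array([triangle_area(*t) for t in triangles])
    sides = np.sqrt(areas / areas.sum()) * tex_size  # larger area, larger cell
    cells, u, v, row_h = [], 0.0, 0.0, 0.0
    for s in sides:
        if u + s > tex_size:      # current shelf full: start a new one
            u, v, row_h = 0.0, v + row_h, 0.0
        cells.append((u, v, s))
        u, row_h = u + s, max(row_h, s)
    return cells
```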
In step 3.2, to synthesize a texture image with a style similar to the reference image, the output style code is concatenated with the semantic features at the input layer of the generator and is also fed into the adaptive instance normalization layers of the high-resolution generator, realizing the transfer of the reference image's style details.
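The adaptive instance normalization layers mentioned here modulate feature statistics with the style code. A sketch (the linear projection from the 512-dimensional code to per-channel scale and shift is an assumption):

```python
import torch
import torch.nn as nn

class AdaIN(nn.Module):
    """Adaptive instance normalization: normalize the feature map, then apply
    a per-channel scale and shift predicted from the style code."""

    def __init__(self, channels, style_dim=512):
        super().__init__()
        self.norm = nn.InstanceNorm2d(channels, affine=False)
        self.affine = nn.Linear(style_dim, channels * 2)

    def forward(self, x, style):
        gamma, beta = self.affine(style).chunk(2, dim=1)  # (B, C) each
        gamma = gamma.unsqueeze(-1).unsqueeze(-1)         # (B, C, 1, 1)
        beta = beta.unsqueeze(-1).unsqueeze(-1)
        return (1 + gamma) * self.norm(x) + beta
```

The four Patch-GAN discriminators of step 3.3 would then see the image at full, 1/2, 1/4 and 1/8 resolution, e.g. torch.nn.functional.avg_pool2d(img, 2 ** k) for k in 0..3.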
The three-dimensional model texture synthesis method based on semantic image translation simplifies the complex texture production process: it renders the geometric model's contours with rasterization, reduces the drawing work to filling semantic labels so as to lessen manual labor, uses an adversarial neural network to generate texture images with high resolution and well-preserved detail, and can perform style transfer according to an input reference image to specify the final style of the synthesized image.
Drawings
FIG. 1 is a flow chart of model multi-view contour rendering;
FIG. 2 is a flow chart of texture generation by the adversarial neural network;
FIG. 3 is a flow chart of an embodiment of the invention.
Detailed Description
For the purpose of facilitating an understanding of the present invention, the following detailed description is given with reference to the accompanying drawings and examples.
Examples
Referring to fig. 1, in step 1 of the present invention a geometric model is imported, multi-view rendered images are obtained, and the barycentric coordinates of the triangles associated with the rendered pixels of each image are retained. If the triangular patches of the geometric model carry original uv coordinate information, the space occupied by each triangle on the initial texture map is allocated according to that information. If the geometric model has no uv coordinate information, the pixel space of the initial texture map is allocated according to the triangles' areas and total count, so that each triangle occupies an appropriate proportion of the initial texture map.
S100: import the model into a rasterization renderer, read the vertex coordinates and triangle patch information of the geometric model, and read the uv coordinates if they exist;
S110-S120: for each geometric model, set a model matrix to rotate, translate and scale the model until rendered images of the front, top, left, right, bottom and back views are obtained;
S130: each rendered pixel in each view is associated with a triangular patch of the geometric model, so the barycentric coordinates of the rendered pixels and the triangle information they belong to are computed and saved as the mapping between the rendered views and the initial texture map;
S140: if the geometric model has uv information, texture space is allocated to each triangle patch according to the uv coordinates when the 1024×1024 initial texture map is generated; if the geometric model has no uv information, the initial texture map space is allocated according to the areas of the triangular patches.
Referring to fig. 2, step 3 of the present invention inputs the rendered views carrying the semantic information from the user interaction into the adversarial neural network, which synthesizes target images conforming to the semantic distribution.
S200: the user fills different colors on a rendered view, each color representing a different semantic label, and the rendered view is fed as input data into the adversarial neural network;
S210-S220: the user inputs a sample image, and the trained encoder outputs a style code so that the final synthesized image resembles the style of the sample image;
S230-S240: image features are extracted from the user's semantic distribution image through convolutional layers and fused with the style code output by the encoder as the data for the next convolution operation (see the sketch after S250-S260);
S250-S260: the fused data is processed through further convolution kernels, and the final synthesized image is output at the resolution specified by the user.
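The fusion in S230-S240 can be as simple as broadcasting the style code over the spatial grid and concatenating it with the semantic feature maps; a sketch (channel layout assumed):

```python
import torch

def fuse_style(sem_feat, style_code):
    """sem_feat: (B, C, H, W) semantic features; style_code: (B, 512).

    Broadcast the code over the spatial grid and concatenate along the
    channel axis, producing the input of the next convolution block.
    """
    b, _, h, w = sem_feat.shape
    style_map = style_code.view(b, -1, 1, 1).expand(-1, -1, h, w)
    return torch.cat([sem_feat, style_map], dim=1)  # (B, C + 512, H, W)
```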
Referring to fig. 3, step 1 of the present invention renders the multiple views: the model is input to the renderer, the model contours under multiple viewing angles are obtained, and the related information is retained in preparation for the subsequent texture mapping.
S100-S120: apply matrix operations to the model, performing affine transformations of the model vertices, and render to obtain the model contours under multiple viewing angles, i.e., the multiple views;
S130: save the triangle information to which each pixel of each rendered map belongs, and retain the pixel's barycentric coordinate values relative to that triangle;
S300: provide brushes for the semantic labels represented by different colors, provide a drawing interface, pass the rendered contour maps to the interface for display, and perform the filling operation;
S310: the filled multi-view rendered contour maps serve as input data; after the user's operations, each rendered map carries a semantic distribution similar to a semantic segmentation map.
Referring to fig. 3, in step 3 of the present invention the adversarial neural network synthesizes the texture; when selecting the input data, a sample image must also be input as the exemplar for style transfer.
S210: select a sample image from the internet as the input to the style encoder;
S220: the trained encoder outputs a style code, which is fused with the features of the semantic distribution image;
S230-S260: extract feature information from the semantic distribution image, fuse it with the style code extracted from the sample image, and feed the result into the adversarial neural network, which synthesizes the target image while preserving the semantic features;
S320: since each view yields a synthesized target image, the images obtained from the six views must be mapped onto the initial texture map. A single triangle may appear in several images, causing coloring conflicts when mapped back, so a conflict resolution strategy is needed: a majority voting scheme combined with a neighborhood blending scheme. Majority voting is applied first, and if 3 or more colorings agree, that pixel value is adopted; otherwise, the neighborhood blending method averages the pixel values of the eight pixels surrounding the conflicting pixel;
S330: after the coloring conflicts are handled, the pixel values of all target images are mapped into the initial texture map, which thereby becomes a texture map usable in any game engine.
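The mapping in S330 sends each colored pixel of a synthesized view back to a texel through its triangle's uv coordinates and the stored barycentrics. A sketch (array layouts are assumptions; accumulating colors plus a hit count leaves room for the conflict handling of S320):

```python
import numpy as np

def splat_view_to_texture(target_img, tri_ids, barys, tri_uvs, texture, counts):
    """target_img: (H, W, 3) synthesized view; tri_ids: (H, W) triangle id per
    pixel (-1 = background); barys: (H, W, 3) barycentric coordinates;
    tri_uvs: (T, 3, 2) uv coordinates of each triangle's vertices in [0, 1];
    texture: (S, S, 3) float accumulator; counts: (S, S) hit counter.
    """
    h, w = tri_ids.shape
    size = texture.shape[0]
    for y in range(h):
        for x in range(w):
            t = tri_ids[y, x]
            if t < 0:
                continue
            uv = barys[y, x] @ tri_uvs[t]                   # interpolated (u, v)
            tu, tv = (np.clip(uv, 0.0, 1.0) * (size - 1)).astype(int)
            texture[tv, tu] += target_img[y, x]
            counts[tv, tu] += 1
```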

Claims (4)

1. A three-dimensional model texture synthesis method based on semantic image translation, characterized by comprising the following specific steps:
Step 1: rendering contours by rasterization and retaining triangle information
1.1) importing a pure geometric model into a renderer and selecting a rotation matrix and a model matrix to adjust the model, rendering the contours of the model's front, top, left, right, bottom and back views; pixels on and inside the rendered contours are set to 255 and all other pixels to 0;
1.2) besides the six rendered maps obtained, saving for each rendered map the information of the triangular patches of the geometric model that it renders: a corresponding file is generated for each rendered map, recording the coordinates of each pixel in the rendered map, the vertex coordinates of the geometric-model triangle to which the pixel belongs, and the pixel's barycentric coordinates within that triangle;
1.3) if the geometric model carries uv coordinate information, retaining that information and generating from it an initial texture map of size 1024×1024 for subsequent coloring; if the geometric model has no uv coordinate information, computing the triangle areas from the triangle vertex information, assigning uv coordinates to each triangle vertex accordingly, and generating an initial texture map of size 1024×1024 for subsequent coloring;
Step 2: semantic labeling of the rendered views through user interaction
2.1) filling the six rendered views obtained in step 1: painting only the region inside the model contour in each rendered view with different colors, and assigning semantic information to the region painted in each color;
2.2) after filling, running a calibration program that checks the painting in the overlapping regions of the six views; if a semantic conflict is found, prompting for modification, and if there is no semantic conflict, outputting the painted rendered views;
Step 3: synthesizing target images from the semantically labeled rendered views through an adversarial neural network
3.1) inputting the rendered views with semantic information into an adversarial neural network, which propagates the semantic information through each of its layers and synthesizes target images matching the semantic distribution;
3.2) the adversarial neural network learns the image style with an encoder: before the semantic distribution image is input, an image taken from the internet is designated at random as the style reference image for style transfer, so that a texture image with a style similar to the reference image is synthesized; the encoder reduces the image to 1024 dimensions with a 3-layer convolutional network, then outputs a 512-dimensional latent code as style feature information through 8 fully connected layers, and this code is concatenated with the semantic features at the input layers of the two generators of the adversarial neural network;
3.3) the adversarial neural network synthesizes a high-resolution image with two generators and four discriminators: the low-resolution generator outputs a coarse synthesized image with a U-Net structure; the high-resolution generator extracts semantic image features with a three-layer convolutional structure, combines the output feature information with the coarse-image feature information output by the low-resolution generator, and feeds the result into six convolutional layers, which serve as the final output layers and synthesize the high-resolution image, with an adaptive instance normalization layer added after every two of the first four of these layers; the four discriminator networks adopt a Patch-GAN structure and judge the image at four scales: the original image and its 1/2, 1/4 and 1/8 downsamplings;
Step 4: mapping multiple synthesized images into one texture map
4.1) with the target images from step 3.1 as source data, coloring the initial texture map generated in step 1.3, using as reference the triangle vertex information and barycentric coordinate information corresponding to each rendered view from step 1.1: reading a pixel value from the source data, locating the corresponding pixel in the initial texture map from the triangle the pixel belongs to and its barycentric coordinates, and assigning the read pixel value to that pixel of the initial texture map;
4.2) for pixels of the initial texture map colored multiple times with contradictory values, first applying a majority voting scheme: if 3 or more colorings agree, adopting that pixel value; if the repeatedly written pixel values do not agree, applying a neighborhood blending method that takes a weighted blend of the 8 pixels surrounding the pixel as the final pixel value.
2. The method according to claim 1, characterized in that in step 1.2, besides the six rendered maps obtained, the renderer stores for each rendered map the triangle patch information of the geometric model that it renders, namely the triangle vertex coordinates and triangle index information; because only the triangles visible in each rendered map are rendered, the number of triangles per rendered map is not necessarily equal across maps and is less than the total number of triangle patches in the geometric model.
3. The three-dimensional model texture synthesis method based on semantic image translation according to claim 1, characterized in that in step 1.3, for a geometric model without uv coordinate information, the triangle areas are computed from the triangle vertex information, uv coordinates are assigned to each triangle vertex, and an initial texture map is generated for subsequent coloring; the approximate area is computed from the triangle vertex coordinate information by the area formula, the initial texture map is first divided into spaces of equal size according to the total number of triangle patches, and the size of each space is then adjusted according to the triangle's area, with larger areas receiving larger spaces.
4. The method according to claim 1, characterized in that in step 3.2, to synthesize a texture image with a style similar to the reference image, the output style code is concatenated with the semantic features at the input layer of the generator and is input into the adaptive instance normalization layers of the high-resolution generator, realizing the transfer of the reference image's style details.
CN202111514168.0A, filed 2021-12-13 (priority 2021-12-13): Three-dimensional model texture synthesis method based on semantic image translation; granted as CN114359452B (Active).

Priority Applications (1)

Application Number: CN202111514168.0A; Priority Date: 2021-12-13; Filing Date: 2021-12-13; Title: Three-dimensional model texture synthesis method based on semantic image translation


Publications (2)

Publication Number Publication Date
CN114359452A (en) 2022-04-15
CN114359452B CN114359452B (en) 2024-08-16

Family

ID=81100114

Family Applications (1)

Application Number: CN202111514168.0A (Active, granted as CN114359452B); Priority Date: 2021-12-13; Filing Date: 2021-12-13; Title: Three-dimensional model texture synthesis method based on semantic image translation

Country Status (1)

Country Link
CN (1) CN114359452B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109712223A * 2017-10-26 2019-05-03 Peking University: Three-dimensional model automatic coloring method based on texture synthesis
CN111192201A * 2020-04-08 2020-05-22 Tencent Technology (Shenzhen) Co., Ltd.: Method and device for generating a face image and training its generation model, and electronic equipment
US20210150197A1 * 2019-11-15 2021-05-20 Ariel AI Ltd: Image generation using surface-based neural synthesis
US11024060B1 * 2020-03-09 2021-06-01 Adobe Inc.: Generating neutral-pose transformations of self-portrait images


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Phillip Isola et al.: "Image-to-Image Translation with Conditional Adversarial Networks", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pages 1125-1134 *
徐晓刚, 鲍虎军, 马利庄: "Multi-exemplar texture synthesis method based on the correlation principle" (基于相关性原理的多样图纹理合成方法), Progress in Natural Science (自然科学进展), no. 06, 25 June 2002, pages 107-110 *

Also Published As

Publication number Publication date
CN114359452B (en) 2024-08-16

Similar Documents

Publication Publication Date Title
Kato et al. Neural 3d mesh renderer
KR100415474B1 (en) Computer graphics system for creating and enhancing texture maps
CN100492412C (en) Voxel data generation method in volumetric three-dimensional display
US8134556B2 (en) Method and apparatus for real-time 3D viewer with ray trace on demand
Paulin et al. Review and analysis of synthetic dataset generation methods and techniques for application in computer vision
US7528831B2 (en) Generation of texture maps for use in 3D computer graphics
JPH06231275A (en) Picture simulation method
Ganovelli et al. Introduction to computer graphics: A practical learning approach
JP2023553507A (en) System and method for obtaining high quality rendered display of synthetic data display of custom specification products
WO2017123163A1 (en) Improvements in or relating to the generation of three dimensional geometries of an object
CN107784622A (en) Graphic system and graphics processor
CN104517313B (en) The method of ambient light masking based on screen space
EP1922700B1 (en) 2d/3d combined rendering
US5793372A (en) Methods and apparatus for rapidly rendering photo-realistic surfaces on 3-dimensional wire frames automatically using user defined points
CN108230430A (en) The processing method and processing device of cloud layer shade figure
KR100942026B1 (en) Makeup system and method for virtual 3D face based on multiple sensation interface
CN113144613A (en) Model-based volume cloud generation method
CN113223146A (en) Data labeling method and device based on three-dimensional simulation scene and storage medium
US20180005432A1 (en) Shading Using Multiple Texture Maps
CN114359452B (en) Three-dimensional model texture synthesis method based on semantic image translation
Eisemann et al. Stylized vector art from 3d models with region support
Baer et al. Hardware-accelerated Stippling of Surfaces derived from Medical Volume Data.
Buerger et al. Sample-based surface coloring
WO2022133569A1 (en) Methods and system for reconstructing textured meshes from point cloud data
US11321899B1 (en) 3D animation of 2D images

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant